Take three layers of information:
Layer 1: Information
This layer contains data with UNIQUE
natural indexes and a surrogate key that is easily transferrable.
Table Surnames:
+-----------------------------+--------------+
| ID (Auto Increment, PK) | Surname |
+-----------------------------+--------------+
| 1 | Smith |
| 2 | Edwards |
| 3 | Brown |
+-----------------------------+--------------+
Table FirstNames
+-----------------------------+--------------+
| ID (Auto Increment, PK) | FirstName |
+-----------------------------+--------------+
| 1 | John |
| 2 | Bob |
| 3 | Mary |
| 4 | Kate |
+-----------------------------+--------------+
Natural Keys
Alternatively, the two tables above can be without ID
and utilize Surname and FirstName as Natural Primary Keys, as explained by Mike Sherrill. In this instance, assume the layer below references varchar
rather than int
.
Layer 2: People
In this layer a composite index is used. This value can be UNIQUE
or PRIMARY
, depending on whether a surrogate key is utilized as the Primary Key.
+-----------------+--------------+
| FirstName | LastName |
+-----------------+--------------+
| 1 | 2 |
| 1 | 3 |
| 2 | 3 |
| 3 | 1 |
| 4 | 2 |
| ... | ... |
+-----------------+--------------+
Layer 3: Parents
In this layer, relationships between people are explored through a ParentsOf
table.
ParentsOf
+-----------------+-----------------+
| Person | PersonParent |
+-----------------+-----------------+
OR
+-----------------+-----------------+-----------------+-----------------+
| PersonFirstName | PersonSurname | ParentFirstName | ParentSurname |
+-----------------+-----------------+-----------------+-----------------+
The Question
Assuming that referential integrity is VERY important to me at its very core, and I will have FOREIGN KEYS
on these indexes so that I keep the database responsible for monitoring its own integrity on this front, and that, if I were to use an ORM, it would be one like Doctrine which has native support for Compound Primary Keys...
Please help me to understand:
The list of trade-offs that take place with utilizing surrogate keys vs. natural keys on the 1st Layer.
The list of trade-offs that take place with utilizing compound keys vs. surrogate keys on the 2nd Layer which can be transferred over to the 3rd Layer.
I am not interested in hearing which is better, because I understand that there are significant disagreements among professionals on this topic and it would be sparking a religious war. Instead, I am asking, very simply and as objectively as is humanly possible, what trade-offs will you be taking by passing surrogate keys to each Layer vs maintaining Primary keys (natural/composite, or surrogate/composite). Anyone will be able to find someone saying NEVER or ALWAYS use surrogate keys on SO and other websites. Instead, a reasoned analyses of trade-offs is what I will most appreciate in your answers.
EDIT: It has been pointed out that a surname example is a poor example for a use of 6NF. For the sake of keeping the question intact, I am going to leave it be. If you are having trouble imagining the use case for this, a better one might be a list of "Grocery Items". AKA:
+-----------------------------+--------------+
| ID (Auto Increment, PK) | Grocery |
+-----------------------------+--------------+
| 1 | Sponges |
| 2 | Tomato Soup |
| 3 | Ice Cream |
| 4 | Lemons |
| 5 | Strawberries |
| 6 | Whipped Cream|
+-----------------------------+--------------+
+-----------------------------+--------------+
| ID (Auto Increment, PK) | Brand |
+-----------------------------+--------------+
| 1 | Bright |
| 2 | Ben & Jerry's|
| 3 | Store Brand |
| 4 | Campbell's |
| 5 | Cool Whip |
+-----------------------------+--------------+
Natural Composite Key Example:
+-----------------------------+--------------+
| Grocery | Brand |
+-----------------------------+--------------+
| Sponges | Bright |
| Ice Cream | Ben & Jerry's|
| Ice Cream | Store Brand |
| Tomato Soup | Campbell's |
| Tomato Soup | Store Brand |
| Lemons | Store Brand |
| Whipped Cream | Cool Whip |
+-----------------------------+--------------+
Recommended Pairings
+-----------------+-----------------+-----------------+-----------------+
| Grocery1 | Brand1 | Grocery2 | Brand2 |
+-----------------+-----------------+-----------------+-----------------+
To reiterate, this is also just an example. This is not how I would recommend proceeding, but it should help to illustrate my question.
There ARE shortfalls to this method. I'll reiterate that this question was to request walking through the benefits and drawbacks of each method below, not to highlight one as better than another. I believe most people were able to look past the questionable nature of this specific example to answer the core question. This edit is for those that cannot.
There are some very good answers below and if you are curious about which direction to go, please read them.
END EDIT
Thank you!
See Question&Answers more detail:
os