Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-26 : 08:14:46
|
| Hi,Lets say I have an entity, e.g. a product, which is spread across three tables: Products, ProductAttributes and AttributeOptions.Products========productID int pkname varchar(50)ProductAttributes=================ProductAttributeID int pkproductID int fkname varchar(50)AttributeOptions================AttributeOptions int pkProductAttributeID int fkname varchar(50)Lets say I wanted to edit all of this information for one product within a single web page. I would need to retrieve the data which spanned the three tables.It would make things easier if the data could be down loaded from each table seperately. So I would need to do three different selects.The question is, when retrieving a product's AttributeOptions, is it a fair design to add to the table a foreign key which references the parent product?I appreciate that this would be redundant data but it would make the query simpler.Cheers,XF. |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-26 : 16:52:39
|
So,which is better...Script 1:USE tempdbGODECLARE @ProductID intSET @ProductID = 2CREATE TABLE Products( productID int primary key identity, name varchar(50) )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod2' )CREATE TABLE Attributes( attributeID int primary key identity, productID int, name varchar(50))INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Color' )INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Size' )INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Length' )INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Pack' )CREATE TABLE Options( OptionID int primary key identity, attributeID int, name varchar(50))INSERT INTO Options ( attributeID, name ) VALUES ( 1, 'Red' )INSERT INTO Options ( attributeID, name ) VALUES ( 1, 'Gree' )INSERT INTO Options ( attributeID, name ) VALUES ( 2, 'Big' )INSERT INTO Options ( attributeID, name ) VALUES ( 2, 'Small' )INSERT INTO Options ( attributeID, name ) VALUES ( 3, '10m' )INSERT INTO Options ( attributeID, name ) VALUES ( 3, '20m' )INSERT INTO Options ( attributeID, name ) VALUES ( 4, 'Economy' )INSERT INTO Options ( attributeID, name ) VALUES ( 4, 'Bonus' )SELECT * FROM Products WHERE productID = @ProductIDSELECT * FROM Attributes WHERE productID = @ProductIDSELECT o.* FROM Options o INNER JOIN Attributes a ON a.attributeID = o.attributeID WHERE productID = @ProductIDDROP TABLE ProductsDROP TABLE AttributesDROP TABLE Options Script 2:USE tempdbGODECLARE @ProductID intSET @ProductID = 2CREATE TABLE Products( productID int primary key identity, name varchar(50) )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod2' )CREATE TABLE Attributes( attributeID int primary key identity, productID int, name varchar(50))INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Color' )INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Size' )INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Length' )INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Pack' )CREATE TABLE Options( OptionID int primary key identity, attributeID int, productID int, name varchar(50))INSERT INTO Options ( attributeID, productID, name ) VALUES ( 1, 1, 'Red' )INSERT INTO Options ( attributeID, productID, name ) VALUES ( 1, 1, 'Gree' )INSERT INTO Options ( attributeID, productID, name ) VALUES ( 2, 1, 'Big' )INSERT INTO Options ( attributeID, productID, name ) VALUES ( 2, 1, 'Small' )INSERT INTO Options ( attributeID, productID, name ) VALUES ( 3, 2, '10m' )INSERT INTO Options ( attributeID, productID, name ) VALUES ( 3, 2, '20m' )INSERT INTO Options ( attributeID, productID, name ) VALUES ( 4, 2, 'Economy' )INSERT INTO Options ( attributeID, productID, name ) VALUES ( 4, 2, 'Bonus' )SELECT * FROM Products WHERE productID = @ProductIDSELECT * FROM Attributes WHERE productID = @ProductIDSELECT * FROM Options WHERE productID = @ProductIDDROP TABLE ProductsDROP TABLE AttributesDROP TABLE Options The different is in the Options table and the queries used to retrieve from it.Is the latter relationally offensive? |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-10-26 : 17:13:55
|
When in doubt NORMALIZEWhen not in doubt NORMALIZEWhat's with the identity columns by the way ???CREATE TABLE Products( productID int primary key identity, name varchar(50) )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod1' )SELECT productID ,name FROM Products rockmoose/* Chaos is the nature of things...Order is a lesser state of chaos */ |
 |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-27 : 03:56:25
|
| An identity column means that SQL Server will automatically generate an integer primary key. |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-10-27 : 04:07:21
|
And in your table it is permissible to have many products with the same name ?I was questioning the value of only having a generated IDENTITY as the PK of your table,you should define at least one UNIQUE CONSTRAINT on the table.Unless this is ok:CREATE TABLE Products( productID int primary key identity, name varchar(50) )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod1' )SELECT productID ,name FROM Products rockmoose |
 |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-28 : 16:19:12
|
Yeah, I've left some stuff out.But now returning to the original idea. What if I extend the tables further and cross the product options to create product combinations. e.g. so I can buy a 'Red - Big' prod1.Like this...USE tempdbGODECLARE @ProductID intSET @ProductID = 2CREATE TABLE Products( productID int primary key identity, name varchar(50) )INSERT INTO Products ( name ) VALUES ( 'Prod1' )INSERT INTO Products ( name ) VALUES ( 'Prod2' )CREATE TABLE Attributes( attributeID int primary key identity, productID int, name varchar(50))INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Color' )INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Size' )INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Length' )INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Pack' )CREATE TABLE Options( OptionID int primary key identity, attributeID int, name varchar(50))INSERT INTO Options ( attributeID, name ) VALUES ( 1, 'Red' )INSERT INTO Options ( attributeID, name ) VALUES ( 1, 'Green' )INSERT INTO Options ( attributeID, name ) VALUES ( 2, 'Big' )INSERT INTO Options ( attributeID, name ) VALUES ( 2, 'Small' )INSERT INTO Options ( attributeID, name ) VALUES ( 3, '10m' )INSERT INTO Options ( attributeID, name ) VALUES ( 3, '20m' )INSERT INTO Options ( attributeID, name ) VALUES ( 4, 'Economy' )INSERT INTO Options ( attributeID, name ) VALUES ( 4, 'Bonus' )CREATE TABLE Combinations( ComboID int primary key identity, code varchar(50))INSERT INTO Combinations ( code ) VALUES ( 'A41' )INSERT INTO Combinations ( code ) VALUES ( 'AX1' )INSERT INTO Combinations ( code ) VALUES ( 'Aq1' )INSERT INTO Combinations ( code ) VALUES ( 'Aj1' )INSERT INTO Combinations ( code ) VALUES ( 'A41' )INSERT INTO Combinations ( code ) VALUES ( 'A761')INSERT INTO Combinations ( code ) VALUES ( 'A31' )INSERT INTO Combinations ( code ) VALUES ( 'Ahu1')CREATE TABLE OptionsCombinations( OptionID int, ComboID int)INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 1, 3 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 1, 4 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 2, 3 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 2, 4 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 5, 7 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 5, 8 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 6, 7 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 6, 8 )SELECT * FROM Products WHERE productID = @ProductIDSELECT * FROM Attributes WHERE productID = @ProductIDSELECT o.* FROM Options o INNER JOIN Attributes a ON a.attributeID = o.attributeID WHERE productID = @ProductIDSELECT DISTINCT c.* FROM Combinations c INNER JOIN OptionsCombinations oc ON c.ComboID = oc.ComboID INNER JOIN Options o ON oc.OptionID = o.OptionID INNER JOIN Attributes a ON a.attributeID = o.attributeID INNER JOIN Products p ON a.productID = p.productIDWHERE p.productID = @ProductIDDROP TABLE OptionsCombinationsDROP TABLE ProductsDROP TABLE AttributesDROP TABLE OptionsDROP TABLE Combinations And I want to return all combinations for a given product.The query needed to do this ( unless you know a better one ) contains 4 joins and a DISTINCT.Now, if you needed to return these results every time a product was viewed, would you seriously do this 4 join query and not throw in an extra foreign key and do a simple SELECT?I know you said always normalize. Still the same huh? My doubt stems only from the perceived potential gain in performance.Cheers,XF. |
 |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-28 : 16:41:39
|
oops, that combo data should be..INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 1, 1 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 3, 1 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 1, 2 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 4, 2 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 2, 3 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 3, 3 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 2, 4 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 4, 4 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 5, 5 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 7, 5 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 5, 6 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 8, 6 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 6, 7 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 7, 7 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 6, 8 )INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 8, 8 ) |
 |
|
|
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2004-10-28 : 17:56:13
|
X-Factor,Rockmoose has questioned if you need a articial key at all. The artificial key is causing the "excessive" joins. If you stuck to natural keys the query can be be constructed without a join at all and is fully normalised (as far as I tell). ie:(Without FK's)CREATE TABLE Products ( Productname varchar(50) primary key)CREATE TABLE Attributes (Attributename varchar(50), Productname varchar(50), PRIMARY KEY (Attributename, Productname))CREATE TABLE Options (OptionName varchar(50),Attributename varchar(50),Productname varchar(50), primary key(OptionName, Attributename, Productname))CREATE TABLE Combinations(Combination varchar(50) primary key )CREATE TABLE OptionsCombinations(OptionName varchar(50),Attributename varchar(50),Productname varchar(50),Combination varchar(50), primary key(OptionName, Attributename, Productname, Combination)) Then the query becomes..SELECT DISTINCT Combination FROM OptionsCombinations cwhere productname = 'prod2' DavidM"Always pre-heat the oven" |
 |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-30 : 10:32:44
|
| Is it common to use non-artificial keys like this in a real world application?I don't think I would. |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2004-10-30 : 10:37:52
|
| It isn't common, but it should be. The decision to use an artifical key or not is part of the design process, and shouldn't be taken lightly. Here's a REALLY good link about it:http://www.datamodel.org/DataModelKeys.html |
 |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-30 : 12:33:12
|
| If one does put a foreign key referencing a product in the Combinations table, which normal form does that contradict? |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-10-30 : 14:18:24
|
It violates 2nd Normal form:Definition of 2nd Normal Form:2nd Normal Form (2NF) Def: A table is in 2NF if it is in 1NF and if all non-key attributes are dependent on all of the key.By putting the ProductID in the Combinations table you make the database wide open to update anomalies.Suppose You update the ProductID of an Attribute and you don't update the ProductID of all the Options dependent on that Attribute ... you're in big trouble.Suppose You update the ProductID of a Option AND you don't update the ProductID of all other Options dependent on the updated Option's AttributeID AND you don't update the ProductID of the updated Options Attribute... well you're in big trouble again.To achieve 2nd Normal Form you have to Eliminate Redundant Data.To DENORMALIZE the database by adding redundancies to the left and to the rightjust to make it easier to JOIN the tables, that is stupid rockmoose |
 |
|
|
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2004-10-30 : 16:11:38
|
quote: Is it common to use non-artificial keys like this in a real world application?I don't think I would.
Why not X-Factor? Does it give you a feeling that it doesn't "look quite right"? There seems to be a lots of repeated data? What about the updating key issue?It was a bit excessive. Particularly the ProductName as the key. Even if we use an artifical key for that table, the resulting query is trivial.The EAV (Entity, Attribute, Value) schema can be a very difficult system to maintain/use. I have seen many instances of it implemented and the one thing that all the bad ones have in common was poor key choice, followed by poor domain/type constraints.DavidM"Always pre-heat the oven" |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-10-30 : 17:09:52
|
Guys,There must have been some discussions on Generated vs. Natural Keys on this forum before.Does anyone have any good links to that ?Rob's link was quite good, nice website [url]www.datamodel.org[/url]...The urge to spawn identities and guids all over the place when designing a databasewith no thought to any domain constraints is becoming a cause of major irritation to me.I mean, why even bother to learn what a PK or Normalization is?, the DBMS takes care of that for me Lazy moose that doesn't use the search functionrockmoose |
 |
|
|
Kristen
Test
22859 Posts |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-31 : 09:52:32
|
My reading of the page at the DataModel site leads me to believe that the artificial keys are the way to go.quote: A design that favors single column surrogate primary keys will have very efficient joins between tables, as most of them will be single column joins. Tables also will be much narrower than their natural key cousins, meaning much less dead weight. However, getting from a deeply nested great, great, great grandchild table to the ultimate parent table may take some serious navigation. Read that as complex queries. Relational database engines are optimized for these kinds of queries, and the overall reduction in table dead weight tends to offset the complexity of the queries.
And then believe it or not... quote: If using single column primary keys, remember that deeply nested tables may need a direct foreign key link to a great grandparent table. This can provide a shortcut up the chain if the application and data access tendencies require it.
|
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-10-31 : 10:23:56
|
X-Factor,FYI this was my response to the author of that article:( I was not going to post this, but after your last post I decide to put it in the public domain... )quote: Hi,I have had the pleasure to read your article "About Surrogate and Natural Keys"And found it very good and informative.( I have also read some of the other articles, and I really appreciate your work, very good !, thank you for this website :-) )( A link to the article was provided here: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=41743 )I have a couple of things to mention:1.I don't agree with the following statement:"If using single column primary keys, remember that deeply nested tables may need a direct foreign key link to a great grandparent table. This can provide a shortcut up the chain if the application and data access tendencies require it."The reason is that this IMO is bad practice and denormalizes the database.Heck it violates 2nd Normal Form !You gain: Shortcut in the join chain to the parentYou lose: Data integrity. ( you can put the database in an inconsistent state by updating the direct link / or by forgetting to update the direct foreign link when the parent table(s) are updated )For the sake of providing the most correct information possible to anyone reading the article I suggest that some kind of warning / or explanation of the consequences also be given at this point.2.Next thing I would like to mention is the following:The article IMO is somewhat biased towards recommending surrogate keys vs. natural keys.Perhaps You could mention that the Relational Model and the Implementations of SQL databases sometimes are at odds.The use of Surrogate Keys are sometimes a necessity ( for performance, practical reasons, limitations on key length, readability etc.. )not due to the fact that Surrogate Keys are "better" than Natural Keys in most cases.But they happen to perform better in the physical implementation of the database.e.g.In the Logical model Natural Keys are suitable and modelledIn the Physical Model the Natural Keys are just not suitable due to the nature of the DBMS, and Surrogate Keys have to be introduced, in the physical implementation.
Lead the blind lead the blind ...Edit: Not to suggest that the author of the article is blind or not knowledgeable, i am sure of the contrary.rockmoose |
 |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-31 : 10:54:37
|
| I'm looking forward to hearing the reply rockmoose. |
 |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2004-10-31 : 12:19:37
|
quote: Why not X-Factor? Does it give you a feeling that it doesn't "look quite right"? There seems to be a lots of repeated data? What about the updating key issue?
I worked on an application framework which managed multiple entities. A request to manage an entity would be made to the framework. The framework didn't know anyhing about the entity except its primary key. The framework would load the appropriate form for managing the entity and pass it the primary key.Because all entities had an integer primary key this worked. But if entities had had keys which spanned columns and were of varying types then I think it would have made the application framework undesirably complicated. |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2004-10-31 : 12:48:32
|
quote: The framework didn't know anyhing about the entity except its primary key
This is a good sign that application and data layers are sufficiently encapsulated and independent of one another. Therefore changing the key(s) of the table(s) involved shouldn't be all that difficult. If it is difficult then the app and DB layers could use a little more work.quote: But if entities had had keys which spanned columns and were of varying types then I think it would have made the application framework undesirably complicated.
This tends to contradict the previous statement. If you're using stored procedures, views, or other database-centric means to retrieve and modify data, your application would not care at all and would require minimal, or zero, modification to accommodate the change. I've written several apps that dealt with tables with multi-column keys, everything was done with sprocs that passed 2-3 parameters at most. It's certainly not difficult to manage from the application standpoint.I agree with rockmoose that the article leans more towards surrogate keys and doesn't describe the benefits of natural keys as well. I think surrogate keys have their place (yeah, so shoot me) but your situation doesn't seem to be one of those places. Your entities are pretty simple and well-defined and would not benefit from a surrogate key. To me, that's crucial: if a surrogate key provides NO BENEFIT, then it is NOT a good choice.quote: An identity column means that SQL Server will automatically generate an integer primary key.
I hope you didn't mean that, it's a common misconception. Identity and primary keys are completely different things, and plenty of people have identity columns AND duplicate rows because they didn't add a primary key or unique constraint on that column, or on the combination of columns that are supposed to be unique. Lose the identity and you lose that problem right away. |
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2004-10-31 : 14:02:34
|
quote: Originally posted by X-FactorI worked on an application framework which managed multiple entities. A request to manage an entity would be made to the framework. The framework didn't know anyhing about the entity except its primary key. The framework would load the appropriate form for managing the entity and pass it the primary key.Because all entities had an integer primary key this worked. But if entities had had keys which spanned columns and were of varying types then I think it would have made the application framework undesirably complicated.
Did the framework require all the PK columns be named "ID" as well I think I would have found the framework quite limiting.It's hard not to be sarcastic here, but passing more than 1 paramater, IMO that is not "overly complicated"...rockmoose |
 |
|
|
Next Page
|
|
|
|
|