Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Extra Foreign key

Author  Topic 

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-26 : 08:14:46
Hi,

Lets say I have an entity, e.g. a product, which is spread across three tables: Products, ProductAttributes and AttributeOptions.

Products
========
productID int pk
name varchar(50)

ProductAttributes
=================
ProductAttributeID int pk
productID int fk
name varchar(50)

AttributeOptions
================
AttributeOptions int pk
ProductAttributeID int fk
name varchar(50)

Lets say I wanted to edit all of this information for one product within a single web page. I would need to retrieve the data which spanned the three tables.

It would make things easier if the data could be down loaded from each table seperately. So I would need to do three different selects.

The question is, when retrieving a product's AttributeOptions, is it a fair design to add to the table a foreign key which references the parent product?

I appreciate that this would be redundant data but it would make the query simpler.

Cheers,

XF.

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-26 : 16:52:39
So,which is better...

Script 1:

USE tempdb
GO

DECLARE @ProductID int
SET @ProductID = 2

CREATE TABLE Products
(
productID int primary key identity,
name varchar(50)
)

INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod2' )

CREATE TABLE Attributes
(
attributeID int primary key identity,
productID int,
name varchar(50)
)

INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Color' )
INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Size' )
INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Length' )
INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Pack' )

CREATE TABLE Options
(
OptionID int primary key identity,
attributeID int,
name varchar(50)
)

INSERT INTO Options ( attributeID, name ) VALUES ( 1, 'Red' )
INSERT INTO Options ( attributeID, name ) VALUES ( 1, 'Gree' )
INSERT INTO Options ( attributeID, name ) VALUES ( 2, 'Big' )
INSERT INTO Options ( attributeID, name ) VALUES ( 2, 'Small' )
INSERT INTO Options ( attributeID, name ) VALUES ( 3, '10m' )
INSERT INTO Options ( attributeID, name ) VALUES ( 3, '20m' )
INSERT INTO Options ( attributeID, name ) VALUES ( 4, 'Economy' )
INSERT INTO Options ( attributeID, name ) VALUES ( 4, 'Bonus' )


SELECT * FROM Products WHERE productID = @ProductID
SELECT * FROM Attributes WHERE productID = @ProductID
SELECT o.* FROM Options o INNER JOIN Attributes a ON a.attributeID = o.attributeID WHERE productID = @ProductID

DROP TABLE Products
DROP TABLE Attributes
DROP TABLE Options

Script 2:




USE tempdb
GO

DECLARE @ProductID int
SET @ProductID = 2

CREATE TABLE Products
(
productID int primary key identity,
name varchar(50)
)

INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod2' )

CREATE TABLE Attributes
(
attributeID int primary key identity,
productID int,
name varchar(50)
)

INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Color' )
INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Size' )
INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Length' )
INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Pack' )

CREATE TABLE Options
(
OptionID int primary key identity,
attributeID int,
productID int,
name varchar(50)
)

INSERT INTO Options ( attributeID, productID, name ) VALUES ( 1, 1, 'Red' )
INSERT INTO Options ( attributeID, productID, name ) VALUES ( 1, 1, 'Gree' )
INSERT INTO Options ( attributeID, productID, name ) VALUES ( 2, 1, 'Big' )
INSERT INTO Options ( attributeID, productID, name ) VALUES ( 2, 1, 'Small' )
INSERT INTO Options ( attributeID, productID, name ) VALUES ( 3, 2, '10m' )
INSERT INTO Options ( attributeID, productID, name ) VALUES ( 3, 2, '20m' )
INSERT INTO Options ( attributeID, productID, name ) VALUES ( 4, 2, 'Economy' )
INSERT INTO Options ( attributeID, productID, name ) VALUES ( 4, 2, 'Bonus' )


SELECT * FROM Products WHERE productID = @ProductID
SELECT * FROM Attributes WHERE productID = @ProductID
SELECT * FROM Options WHERE productID = @ProductID

DROP TABLE Products
DROP TABLE Attributes
DROP TABLE Options

The different is in the Options table and the queries used to retrieve from it.

Is the latter relationally offensive?
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-10-26 : 17:13:55
When in doubt NORMALIZE
When not in doubt NORMALIZE

What's with the identity columns by the way ???
CREATE TABLE Products
(
productID int primary key identity,
name varchar(50)
)

INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod1' )

SELECT productID ,name FROM Products



rockmoose
/* Chaos is the nature of things...Order is a lesser state of chaos */
Go to Top of Page

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-27 : 03:56:25
An identity column means that SQL Server will automatically generate an integer primary key.
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-10-27 : 04:07:21

And in your table it is permissible to have many products with the same name ?
I was questioning the value of only having a generated IDENTITY as the PK of your table,
you should define at least one UNIQUE CONSTRAINT on the table.

Unless this is ok:

CREATE TABLE Products
(
productID int primary key identity,
name varchar(50)
)

INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod1' )

SELECT productID ,name FROM Products



rockmoose
Go to Top of Page

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-28 : 16:19:12
Yeah, I've left some stuff out.

But now returning to the original idea. What if I extend the tables further and cross the product options to create product combinations. e.g. so I can buy a 'Red - Big' prod1.

Like this...


USE tempdb
GO

DECLARE @ProductID int
SET @ProductID = 2

CREATE TABLE Products
(
productID int primary key identity,
name varchar(50)
)

INSERT INTO Products ( name ) VALUES ( 'Prod1' )
INSERT INTO Products ( name ) VALUES ( 'Prod2' )

CREATE TABLE Attributes
(
attributeID int primary key identity,
productID int,
name varchar(50)
)

INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Color' )
INSERT INTO Attributes ( productID, name ) VALUES ( 1, 'Size' )
INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Length' )
INSERT INTO Attributes ( productID, name ) VALUES ( 2, 'Pack' )

CREATE TABLE Options
(
OptionID int primary key identity,
attributeID int,
name varchar(50)
)

INSERT INTO Options ( attributeID, name ) VALUES ( 1, 'Red' )
INSERT INTO Options ( attributeID, name ) VALUES ( 1, 'Green' )
INSERT INTO Options ( attributeID, name ) VALUES ( 2, 'Big' )
INSERT INTO Options ( attributeID, name ) VALUES ( 2, 'Small' )
INSERT INTO Options ( attributeID, name ) VALUES ( 3, '10m' )
INSERT INTO Options ( attributeID, name ) VALUES ( 3, '20m' )
INSERT INTO Options ( attributeID, name ) VALUES ( 4, 'Economy' )
INSERT INTO Options ( attributeID, name ) VALUES ( 4, 'Bonus' )

CREATE TABLE Combinations
(
ComboID int primary key identity,
code varchar(50)
)

INSERT INTO Combinations ( code ) VALUES ( 'A41' )
INSERT INTO Combinations ( code ) VALUES ( 'AX1' )
INSERT INTO Combinations ( code ) VALUES ( 'Aq1' )
INSERT INTO Combinations ( code ) VALUES ( 'Aj1' )
INSERT INTO Combinations ( code ) VALUES ( 'A41' )
INSERT INTO Combinations ( code ) VALUES ( 'A761')
INSERT INTO Combinations ( code ) VALUES ( 'A31' )
INSERT INTO Combinations ( code ) VALUES ( 'Ahu1')

CREATE TABLE OptionsCombinations
(
OptionID int,
ComboID int
)

INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 1, 3 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 1, 4 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 2, 3 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 2, 4 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 5, 7 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 5, 8 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 6, 7 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 6, 8 )



SELECT * FROM Products WHERE productID = @ProductID
SELECT * FROM Attributes WHERE productID = @ProductID
SELECT o.* FROM Options o INNER JOIN Attributes a ON a.attributeID = o.attributeID WHERE productID = @ProductID

SELECT DISTINCT c.* FROM Combinations c
INNER JOIN OptionsCombinations oc ON c.ComboID = oc.ComboID
INNER JOIN Options o ON oc.OptionID = o.OptionID
INNER JOIN Attributes a ON a.attributeID = o.attributeID
INNER JOIN Products p ON a.productID = p.productID

WHERE p.productID = @ProductID

DROP TABLE OptionsCombinations
DROP TABLE Products
DROP TABLE Attributes
DROP TABLE Options
DROP TABLE Combinations


And I want to return all combinations for a given product.

The query needed to do this ( unless you know a better one ) contains 4 joins and a DISTINCT.

Now, if you needed to return these results every time a product was viewed, would you seriously do this 4 join query and not throw in an extra foreign key and do a simple SELECT?

I know you said always normalize. Still the same huh? My doubt stems only from the perceived potential gain in performance.

Cheers,

XF.
Go to Top of Page

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-28 : 16:41:39
oops, that combo data should be..
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 1, 1 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 3, 1 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 1, 2 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 4, 2 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 2, 3 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 3, 3 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 2, 4 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 4, 4 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 5, 5 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 7, 5 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 5, 6 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 8, 6 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 6, 7 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 7, 7 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 6, 8 )
INSERT INTO OptionsCombinations ( OptionID, ComboID ) VALUES ( 8, 8 )
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2004-10-28 : 17:56:13
X-Factor,

Rockmoose has questioned if you need a articial key at all. The artificial key is causing the "excessive" joins. If you stuck to natural keys the query can be be constructed without a join at all and is fully normalised (as far as I tell).

ie:(Without FK's)

CREATE TABLE Products ( Productname varchar(50) primary key)

CREATE TABLE Attributes (Attributename varchar(50),
Productname varchar(50), PRIMARY KEY (Attributename, Productname))

CREATE TABLE Options (OptionName varchar(50),
Attributename varchar(50),
Productname varchar(50), primary key(OptionName, Attributename, Productname))

CREATE TABLE Combinations(Combination varchar(50) primary key )

CREATE TABLE OptionsCombinations(OptionName varchar(50),
Attributename varchar(50),
Productname varchar(50),
Combination varchar(50), primary key(OptionName, Attributename, Productname, Combination))


Then the query becomes..


SELECT DISTINCT Combination FROM OptionsCombinations c
where productname = 'prod2'


DavidM

"Always pre-heat the oven"
Go to Top of Page

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-30 : 10:32:44
Is it common to use non-artificial keys like this in a real world application?

I don't think I would.
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2004-10-30 : 10:37:52
It isn't common, but it should be. The decision to use an artifical key or not is part of the design process, and shouldn't be taken lightly. Here's a REALLY good link about it:

http://www.datamodel.org/DataModelKeys.html
Go to Top of Page

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-30 : 12:33:12
If one does put a foreign key referencing a product in the Combinations table, which normal form does that contradict?
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-10-30 : 14:18:24
It violates 2nd Normal form:

Definition of 2nd Normal Form:
2nd Normal Form (2NF)
Def: A table is in 2NF if it is in 1NF and if all non-key attributes are dependent on all of the key.

By putting the ProductID in the Combinations table you make the database wide open to update anomalies.
Suppose You update the ProductID of an Attribute and you don't update the ProductID of all the Options dependent on that Attribute ... you're in big trouble.
Suppose You update the ProductID of a Option AND you don't update the ProductID of all other Options dependent on the updated Option's AttributeID AND you don't update the ProductID of the updated Options Attribute... well you're in big trouble again.

To achieve 2nd Normal Form you have to Eliminate Redundant Data.
To DENORMALIZE the database by adding redundancies to the left and to the right
just to make it easier to JOIN the tables, that is stupid

rockmoose
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2004-10-30 : 16:11:38
quote:

Is it common to use non-artificial keys like this in a real world application?
I don't think I would.



Why not X-Factor? Does it give you a feeling that it doesn't "look quite right"? There seems to be a lots of repeated data? What about the updating key issue?

It was a bit excessive. Particularly the ProductName as the key. Even if we use an artifical key for that table, the resulting query is trivial.

The EAV (Entity, Attribute, Value) schema can be a very difficult system to maintain/use. I have seen many instances of it implemented and the one thing that all the bad ones have in common was poor key choice, followed by poor domain/type constraints.


DavidM

"Always pre-heat the oven"
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-10-30 : 17:09:52
Guys,

There must have been some discussions on Generated vs. Natural Keys on this forum before.
Does anyone have any good links to that ?
Rob's link was quite good, nice website [url]www.datamodel.org[/url]...

The urge to spawn identities and guids all over the place when designing a database
with no thought to any domain constraints is becoming a cause of major irritation to me.
I mean, why even bother to learn what a PK or Normalization is?, the DBMS takes care of that for me

Lazy moose that doesn't use the search function

rockmoose
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2004-10-31 : 01:52:14
Like this you mean?

http://www.sqlteam.com/item.asp?ItemID=2599

:-)

Kristen
Go to Top of Page

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-31 : 09:52:32
My reading of the page at the DataModel site leads me to believe that the artificial keys are the way to go.

quote:

A design that favors single column surrogate primary keys will have very efficient joins between tables, as most of them will be single column joins. Tables also will be much narrower than their natural key cousins, meaning much less dead weight. However, getting from a deeply nested great, great, great grandchild table to the ultimate parent table may take some serious navigation. Read that as complex queries. Relational database engines are optimized for these kinds of queries, and the overall reduction in table dead weight tends to offset the complexity of the queries.


And then believe it or not...

quote:
If using single column primary keys, remember that deeply nested tables may need a direct foreign key link to a great grandparent table. This can provide a shortcut up the chain if the application and data access tendencies require it.



Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-10-31 : 10:23:56
X-Factor,
FYI this was my response to the author of that article:
( I was not going to post this, but after your last post I decide to put it in the public domain... )

quote:
Hi,

I have had the pleasure to read your article "About Surrogate and Natural Keys"
And found it very good and informative.
( I have also read some of the other articles, and I really appreciate your work, very good !, thank you for this website :-) )

( A link to the article was provided here: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=41743 )


I have a couple of things to mention:

1.
I don't agree with the following statement:

"If using single column primary keys, remember that deeply nested tables may need a direct foreign key link to a great grandparent table. This can provide a shortcut up the chain if the application and data access tendencies require it."

The reason is that this IMO is bad practice and denormalizes the database.
Heck it violates 2nd Normal Form !
You gain: Shortcut in the join chain to the parent
You lose: Data integrity. ( you can put the database in an inconsistent state by updating the direct link / or by forgetting to update the direct foreign link when the parent table(s) are updated )

For the sake of providing the most correct information possible to anyone reading the article I suggest that some kind of warning / or explanation of the consequences also be given at this point.


2.
Next thing I would like to mention is the following:

The article IMO is somewhat biased towards recommending surrogate keys vs. natural keys.
Perhaps You could mention that the Relational Model and the Implementations of SQL databases sometimes are at odds.

The use of Surrogate Keys are sometimes a necessity ( for performance, practical reasons, limitations on key length, readability etc.. )
not due to the fact that Surrogate Keys are "better" than Natural Keys in most cases.
But they happen to perform better in the physical implementation of the database.

e.g.
In the Logical model Natural Keys are suitable and modelled
In the Physical Model the Natural Keys are just not suitable due to the nature of the DBMS, and Surrogate Keys have to be introduced, in the physical implementation.


Lead the blind lead the blind ...
Edit: Not to suggest that the author of the article is blind or not knowledgeable, i am sure of the contrary.

rockmoose
Go to Top of Page

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-31 : 10:54:37
I'm looking forward to hearing the reply rockmoose.
Go to Top of Page

X-Factor
Constraint Violating Yak Guru

392 Posts

Posted - 2004-10-31 : 12:19:37
quote:

Why not X-Factor? Does it give you a feeling that it doesn't "look quite right"? There seems to be a lots of repeated data? What about the updating key issue?


I worked on an application framework which managed multiple entities. A request to manage an entity would be made to the framework. The framework didn't know anyhing about the entity except its primary key. The framework would load the appropriate form for managing the entity and pass it the primary key.

Because all entities had an integer primary key this worked. But if entities had had keys which spanned columns and were of varying types then I think it would have made the application framework undesirably complicated.
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2004-10-31 : 12:48:32
quote:
The framework didn't know anyhing about the entity except its primary key
This is a good sign that application and data layers are sufficiently encapsulated and independent of one another. Therefore changing the key(s) of the table(s) involved shouldn't be all that difficult. If it is difficult then the app and DB layers could use a little more work.
quote:
But if entities had had keys which spanned columns and were of varying types then I think it would have made the application framework undesirably complicated.
This tends to contradict the previous statement. If you're using stored procedures, views, or other database-centric means to retrieve and modify data, your application would not care at all and would require minimal, or zero, modification to accommodate the change. I've written several apps that dealt with tables with multi-column keys, everything was done with sprocs that passed 2-3 parameters at most. It's certainly not difficult to manage from the application standpoint.

I agree with rockmoose that the article leans more towards surrogate keys and doesn't describe the benefits of natural keys as well. I think surrogate keys have their place (yeah, so shoot me) but your situation doesn't seem to be one of those places. Your entities are pretty simple and well-defined and would not benefit from a surrogate key. To me, that's crucial: if a surrogate key provides NO BENEFIT, then it is NOT a good choice.
quote:
An identity column means that SQL Server will automatically generate an integer primary key.
I hope you didn't mean that, it's a common misconception. Identity and primary keys are completely different things, and plenty of people have identity columns AND duplicate rows because they didn't add a primary key or unique constraint on that column, or on the combination of columns that are supposed to be unique. Lose the identity and you lose that problem right away.
Go to Top of Page

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-10-31 : 14:02:34
quote:
Originally posted by X-Factor
I worked on an application framework which managed multiple entities. A request to manage an entity would be made to the framework. The framework didn't know anyhing about the entity except its primary key. The framework would load the appropriate form for managing the entity and pass it the primary key.

Because all entities had an integer primary key this worked. But if entities had had keys which spanned columns and were of varying types then I think it would have made the application framework undesirably complicated.



Did the framework require all the PK columns be named "ID" as well
I think I would have found the framework quite limiting.
It's hard not to be sarcastic here, but passing more than 1 paramater, IMO that is not "overly complicated"...

rockmoose
Go to Top of Page
    Next Page

- Advertisement -