Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Database Design and Application Architecture
 Primary Key (Int vs Char)

Author  Topic 

jnghh
Starting Member

10 Posts

Posted - 2007-04-13 : 23:01:37
I seem to recall many debates and articles about primary keys, but for the life of me, I can't seem to figure out what to do in what is probably be a very straight forward situation (below). Must be all the late nights lately impairing my common sense.

So, I've basically got a set of data where I want to extract a status field from the table design. I'll end up with two tables:

MainTable(Field1,Field2, ... ,FieldA ,StatusID)
StatusTable(StatusID,Field1,Field2) [shouldn't be that big a table, actually]

It's a silly thing but... are there any significant reasons why I should choose int for my StatusID instead of char?

I seem to be under the impression that using arbitrary int numbers as the identifiers will yield better performance on joins; but from past experience, using arbitrary int numbers can be a bother to interpret the data. :P I've seen tables where they were so full of numeric codes, that I had to do multiple joins to the respective code tables to yield what each code meant (or what that particular row meant as a whole).

Of course, I don't mean to use an actual full description for my primary keys, but would it be so bad to use something like "N" (to indicate "normal")?

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-04-14 : 00:08:29
Read this thread for a sometimes heated discussion of the pros and cons of both approaches.
Article: Identity and Primary Keys
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=6136


As for me, I always use single column numeric primary keys, usually identity columns. In my experience, I have usually regreted it when I used a code for the primary key, so I just don't do it anymore.

I would design the Status table with an alternate key (unique constraint) Status Code to hold the 'Normal' code, and have a number primary key. Normally, I would use an int, but if the number of possible status codes is small, and the MainTable is high volume, I might use a tinyint or smallint to save space.

For codes themselves, I usually use a varchar(20) to allow it to be long enough to be meaningful. I also usually have a short description and a long description column to ensure there is enough info to completely describe it, and make these alternate keys also, so that if you use them on reports, they will be unique.

If there is a need to replicate the data to other databases, I would also add a rowguid column to the Status table, and make that an alternate key.



CODO ERGO SUM
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-04-14 : 03:38:24
I tend towards the Alphabetic codes. But it has all the inherent problems that MVJ alludes to.

I do this so that a quick

SELECT TOP 100 * FROM MainTable

gives me "meaningful" values in the Status Column. (This also applies to seeing the raw data somewhere else - an import file for example).

With an INT I would have to do

SELECT TOP 100 S.StatusName, M.*
FROM MainTable AS M
LEFT OUTER JOIN StatusTable AS S
ON S.StatusID = M.StatusID

which is a load more bother.

BUT the only person benefiting from this is ME. The punter will only ever see the data after all the JOINs etc are done, and its better for performance that the join is on TINYINT rather than VARCHAR(10)

Add a "Sequence" column to StatusTable so that you can rearrange the order in the future (assuming the Status table is for some sort of logical work-flow style progression, rather than an arbitrary set) - e.g. when you want to [make a logical] insert between "LIVE" and "DEAD" !!. This can be useful, for example, for making sure that status changes are only "forwards" and for presenting then in a sensible human-ordered list such as a SELECT list.

Kristen
Go to Top of Page

jnghh
Starting Member

10 Posts

Posted - 2007-04-15 : 22:05:01
I'll take note of the article and weed through as and when I have the time. :) But thank you both! Your thoughts were exactly what I needed to clear my head; it's awfully nice to have a sounding board. :) I also appreciate your additional suggestions; very very helpful--I think I may just incorporate some of 'em. Thank you again.

For the curious: I think I'll continue to use number primary keys. As Kristen points out, really, the only person who's benefiting is ME -- and as nice as that is, the benefits (for my project) isn't strong enough of a justification ;)
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-04-15 : 22:17:43
quote:
Originally posted by jnghh

I'll take note of the article and weed through as and when I have the time. :) But thank you both! Your thoughts were exactly what I needed to clear my head; it's awfully nice to have a sounding board. :) I also appreciate your additional suggestions; very very helpful--I think I may just incorporate some of 'em. Thank you again.

For the curious: I think I'll continue to use number primary keys. As Kristen points out, really, the only person who's benefiting is ME -- and as nice as that is, the benefits (for my project) isn't strong enough of a justification ;)


If you use a codes design, the benefit to you many be short lived.

When the business decides that the code should change from 'N' for Normal to 'Typical', you can have a lot of work to make that change, instead of updating one column in one row of the Status table.



CODO ERGO SUM
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2007-04-15 : 22:59:59
As Micahel said, the main factor is change. If the value being used as a foreign key changes, then issues rise. How much casading pain can you manage?

This is the main reason to use a database generated key. Immutable.

But what a database generated key does cause is data abstraction.
It is so simple to change a value at the Parent level that people do it without much thought, and because the data consequences are not felt until a JOIN takes place, the results can surprise users.

DavidM

Production is just another testing cycle
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2007-04-18 : 02:03:24
A numeric value will probably be faster than character but that's probably not an issue.

>> gives me "meaningful" values in the Status Column.
Actually that could be considered a reason for using numeric values.
Depends on whether you want the descriptions to be updateable.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-04-19 : 03:55:50
"When the business decides that the code should change from 'N' for Normal to 'Typical'"

nah, the rule here is "We'll talk about it for 5 minutes and then do it Kristen's way " and no-one is going to swing a change from "Normal" to "Typical" past me until I'm old and senile!

Kristen
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2007-04-19 : 11:25:07
oye...the surrogate key discussion...

I don't use them, except when I have to

http://weblogs.sqlteam.com/brettk/archive/2004/06/09/1530.aspx



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam



Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-04-19 : 19:01:37
quote:
Originally posted by X002548

oye...the surrogate key discussion...



Where's quazibubble?



CODO ERGO SUM
Go to Top of Page
   

- Advertisement -