Which way to go - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

Which way to go

Author

Topic

SamC
White Water Yakist

3467 Posts

Posted - 2002-07-28 : 16:07:45

My "Users" table contains an evolvoing set of columns. Name, Department, Email and so on.

The lastest additional Column is [EmployeeType]. Values will be company specific, you can imagine the values like: Employee, Non Employee, Contractor, Civil and so on.

The EmployeeType column is informational, and queries on the type of employee will be executed.

Is it better to implement the Employee Type Column as

(1) Nvarchar (30) containing the text of the employee type - recalling input to this field will be through drop-down lists from a web interface. Not too likely a spelling error would create an invalid value.

Or (2) do I go the "longer way" and make "Employee Type" numeric - a pointer to a table containing the "Name" of the particular Employee Type index?

This seems to me to be a 6 of one half-dozen of the other decision. The first solution being quicker to implement, somewhat simpler (1 less table). But I'm left wondering if I'll bump my head on it's limitations somewhere in the future.

There is "ONE" performance hit I can forsee for the first solution (1). To create a drop-down list, I need to query the user table to get the values with:

SELECT DISTINCT EmployeeType from Users ORDER BY EmployeeType

The overhead to do this SELECT has got to be much larger than a corresponding query for solution (2) simply because the table size in (1) is tens of thousands, while (2) would generally not have more than 10 rows total. BUT - This query isn't executed frequently. It's an Administrative function, so I believe the performance issue fades to nothing.

Comments appreciated.

Thanks,

SamC

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-07-28 : 16:27:19

Do both. Store the EmployeeType column as character (don't use nvarchar unless you absolutely know you'll have foreign characters in there), AND keep a table of the employee types:

CREATE TABLE EmployeeTypes (EmployeeType varchar(15) NOT NULL PRIMARY KEY)

Then do this:

ALTER TABLE Users ADD EmployeeType varchar(15) CONSTRAINT FK_EmployeeTypes REFERENCES EmployeeTypes (EmployeeType)

Now you can query EmployeeTypes to get the values for the drop down box. AND, if you need to, you can query Users:

SELECT * FROM Users WHERE EmployeeType='Contractor'

Without needing to join it to EmployeeTypes. AND you get to maintain referential integrity between the two.

SamC
White Water Yakist

3467 Posts

Posted - 2002-07-28 : 16:37:01

Pretty cool. I hadn't thought of this one. Maybe that's why I keep coming back to this site.

The solution suggests putting a trigger on Users as a possible method to keep the two tables in step. Someone inserts a new EmployeeType in Users, the trigger automatically updates EmployeeTypes.

But before going there, it depens on what model makes the most sense for adding new values.

How does the constraint affect an Update on Users when the EmployeeType was invalid? Does the Update pull an error?

SamC

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-07-28 : 16:50:24

Yes, if you attempt to insert a User with an EmployeeType that is NOT in the EmployeeTypes table the foreign key would throw an error. You *can* use an INSTEAD OF trigger to perform the insert into EmployeeTypes first, then continue with the rest of the insert, but I'd recommend that EmployeeTypes be administered separately.

If you let people freely type in a new employee type when they input a new user, you'll end up with "Contractor", "Contracter", "Contactor", "Umployee", etc., you get the idea. And if you have a trigger that automatically populates these "new" types, well, good luck figuring out which ones are the TRULY valid ones! :)

I think it would be best to populate all of the possible EmployeeTypes beforehand. It would probably take you just as long to write the trigger as it would to sit down and compile a list of all possible employee types. Plus it makes you sit down and think through the problem(s), which is always a good thing to do. You can still add new types, but create a process separate from inserting a new user. Then, you don't need a trigger on any of the tables, and you'll get faster updates and inserts.

SamC
White Water Yakist

3467 Posts

Posted - 2002-07-28 : 17:01:15

Thanks.

Now the afterburner.

After declaring all my string variables as nvarchar for other columns instead of character (already) - could you tell me a little more about the downside of what I've done?

Then I'll have to weigh the downside against going back in there and putting things right.

SamC

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-07-28 : 17:27:20

There's no downside really. Nchar/nvarchar use 2 bytes per character instead of 1 byte like char/varchar. The 2 bytes are used to store international characters, like Hebrew, Arabic, Japanese and such. If you know that you're only gonna store English and maybe some common European accented characters, you really don't need nvarchar and you can store more rows per page of data using regular varchar.

If it's too much of a hassle then leave it the way it is now. Just keep it in mind for the future.

SamC
White Water Yakist

3467 Posts

Posted - 2002-07-28 : 19:48:42

So,, varchar is half the size of nvarchar

Will this make a performance difference when searching for a username among 50 thousand records? Or is the prinicipal performance hit disk space?

----------

Just for fun, I hopped in Enterprise manager - design view, (a discarded database) and changed the Column Email from nvarchar to varchar. The warning issues was "may" cause loss of data. No data appears to be lost in the conversion, so I'm guessing that was a general warning issued against converting column datatypes in general.

SamC

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-07-28 : 20:32:23

It's hard to say, because collation settings and case-sensitivity will affect it as well. But since they take up half the space, twice as many fit in the same space, so twice as many get read for a each disk read, twice as many fit in RAM, etc.

Since Windows is a 32 bit operating system, registers are accessed in 32 bit (4 byte) chunks, and I think (not 100% sure) that each character will be converted to a 32 bit word for the CPU to process it. This conversion is probably about the same for 1 byte and 2 bytes characters, so I don't think it'll be as large a CPU hit.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources