
 When it comes to users, what is normal?


Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2002-06-18 : 09:26:04
I'd like to run some design questions past the crew . . .

create table userbase (
logon varchar(16) not null
constraint pk_userbase primary key clustered,
password varchar(12) not null )

 
This table is normalized. A superkey is defined (logon), there are no partial dependencies (because it's a simple key), and there are no transitive dependencies, as logon is the sole determinant for the password column. Lastly, there are no multivalued dependencies to consider...


create table userbase (
logon varchar(16) not null
constraint pk_userbase primary key clustered,
password varchar(12) not null,
firstname varchar(20) not null,
lastname varchar(20) not null,
dob smalldatetime )

 
We are still good so far, right? So this brings me to my first question. What do you call this?

create table userbase (
logon varchar(16) not null
constraint pk_userbase primary key clustered,
firstname varchar(20) not null,
lastname varchar(20) not null,
dob smalldatetime )

create table authentication (
logon varchar(16) not null
constraint pk_authentication primary key
constraint fk_authentication_logon references userbase(logon),
password varchar(12) not null )

 
So what do you call that? It seems to pass the rules of normalization, but my gut says, 'Why is this two tables?'. Should it be like this?

Which leads to my second question: what happens when you add some business rules to authentication? Let's say the user is allowed to provide an incorrect password at logon time 3 times and then they are locked out. Is this a valid reason for creating a userbase and an authentication table?


create table userbase (
logon varchar(16) not null
constraint pk_userbase primary key clustered,
password varchar(12) not null,
failedlogincount tinyint not null
constraint dft_userbase_failedlogincount default (0),
firstname varchar(20) not null,
lastname varchar(20) not null,
dob smalldatetime )

 
With the above table, would failedlogincount be a transitive dependency? I don't think so, as the count is by logon; however, it is the password column that determines the failure, so it seems password might be a candidate for a determinant.

It's very infrequent that a dba has the luxury of time to think about these things; more often than not, we just throw tables together using gut feeling and don't think about words like 'determinant' and 'transitive dependency'. I am given both the luxury of time and a clean slate on this new database, and I'd like to be very purposeful in my design.

Thanks for your help.

<O>

M.E.
Aged Yak Warrior

539 Posts

Posted - 2002-06-18 : 14:37:27
quote:
create table userbase (
logon varchar(16) not null
constraint pk_userbase primary key clustered,
firstname varchar(20) not null,
lastname varchar(20) not null,
dob smalldatetime )

create table authentication (
logon varchar(16) not null
constraint pk_authentication primary key
constraint fk_authentication_logon references userbase(logon),
password varchar(12) not null )



The only reason I'd ever see a need for this is if more than one person uses a single login/password (husband/wife/kid all log in under the same name/password but have different entries in the first/last name). That's kind of a strange reason... Your first option is much nicer/cleaner.


Second question... I like to create 2 tables for login tracking anyway.

quote:

create table userbase (
logon varchar(16) not null
constraint pk_userbase primary key clustered,
password varchar(12) not null,
failedlogincount tinyint not null
constraint dft_userbase_failedlogincount default (0),
firstname varchar(20) not null,
lastname varchar(20) not null,
dob smalldatetime )



I'd get rid of the failed login count on this table and make a second one...

create table userlogon (
logon varchar(16) not null,
logon_date datetime not null,
logon_status varchar(20) not null,
constraint pk_userlogon primary key clustered (logon, logon_date) )

Then from here a simple query like:

select count(47) as 'Failed Attempts', logon
from userlogon
where logon_status = 'Failed'
and logon_date between dateadd(dd, -1, getdate()) and getdate()
group by logon

(sorry if syntax is wrong... I don't have query analyzer here to check)

Not only can you say 3 failures and you're out, but you can also limit them to 3 failures per day (that way the next day you don't need to reset the account)... The only problem with this is you assume the logon that they are using is correct. My database got filled up with a lot of entries as people misspelt their logons.


-----------------------
Take my advice, I dare ya

Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2002-06-18 : 16:37:44
Interesting. So what do you do if the business rule is "3 failures in 15 minutes results in a temporary lock (15 minutes to sit in the corner and think about what you've done, young man!). 4 temporary locks in 24 hours results in an administrative lock (manual intervention)"?

Your FailedAttempts table would have a hard time capturing this (not impossible)....

Here's a third question . . . when faced with a varchar(16) natural key on a table that will hold hundreds of thousands of records, would you choose a surrogate key (some meaningless identity)? Tell me more than "integers will sort and join faster" . . . gimme some concrete numbers or experiences.

<O>

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-06-18 : 17:14:15
I'm sure I'll sound like Benedict Arnold, but uh, yeah, a 4-byte non-null integer value will definitely sort faster than a 0-16 byte nullable character value. Not to mention that your average login will be more than 4 characters (anything less than 8 is not too secure). You *might* see equivalent performance if your SQL Server is installed as case-sensitive, but you'll introduce a million other problems that will more or less cancel that out.

If your performance is that crucial, and you CHOOSE to use a smaller surrogate key over a natural key, that's perfectly fine. Just so long as you've considered the natural key and not just knee-jerked an identity column in there.

I can think of ways to make your administrative lockout features work with just about any design. The data is only necessary to support the authentication PROCESS, but the PROCESS must do the validation of the rules. A couple of IF statements to check each rule in turn, from highest priority to lowest, and updating the user table accordingly for any violations, piece of cake.

I'm not sure if you were talking about CONSTRAINTS, though. You can't construct a CONSTRAINT in SQL to properly define those rules because they don't affect how you retrieve the data; nor can constraints define how many rows are valid in comparison to one another. If a single row is valid, it gets inserted, regardless of any other valid rows.

Basically you're trying to enforce a quota, and there's no real way to do that in SQL; you need a procedure of some kind (a trigger could do it). Therefore, any table design that holds the proper data can be used.
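
A sketch of those IF statements in a stored procedure (the adminlocked and templockeduntil columns, and the procedure name, are assumptions for illustration, not anything from the DDL above):

```sql
-- Sketch only: assumes userbase has been extended with adminlocked (bit),
-- templockeduntil (datetime, nullable) and failedlogincount columns.
create procedure validatelogon
    @logon varchar(16),
    @password varchar(12)
as
-- highest-priority rule first: administrative lock
if exists (select 1 from userbase
           where logon = @logon and adminlocked = 1)
begin
    raiserror('Account locked; see an administrator.', 16, 1)
    return
end

-- next: is a temporary lock still in effect?
if exists (select 1 from userbase
           where logon = @logon and templockeduntil > getdate())
begin
    raiserror('Account temporarily locked; try again later.', 16, 1)
    return
end

-- finally, check the password itself, updating the counter on failure
if not exists (select 1 from userbase
               where logon = @logon and password = @password)
begin
    update userbase
    set failedlogincount = failedlogincount + 1
    where logon = @logon
    raiserror('Invalid logon or password.', 16, 1)
    return
end

-- success: reset the counter
update userbase set failedlogincount = 0 where logon = @logon
```

The point being: the rules live in the procedure, checked in priority order, so the tables only have to hold the supporting data.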


Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2002-06-18 : 18:11:25
Rob, so far I'm on board with everything you've said. I'd love to hear some thoughts on the 1-to-1 relationship between userbase and authentication (or really any 1-to-1 relationship). I'm not talking about multivalues (4nf/5nf), but rather just splitting a determinant's data into two tables because it 'feels right'.

At my current client site, I see some of this. A particular object may have 50 or 60 attributes, but they may be split into several tables. . . is there a name for that?

<O>

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2002-06-18 : 18:12:07
Page47,

My 2 cents

The "FailedLoginCount" is (excuse the bad terminology) "temporally" dependent. That is, its dependency is really based on the time and the logon (according to business rules). All other attributes are dependent on "the key, the whole key and nothing but the key".

So to answer the questions

1) Redundant
2) A user MUST have a logon, right? A user is also defined by that logon. The password is dependent on the logon. A user (logon) is validated by the password. So if we think of the overall role of this entity as the validation of a logon, then there really is no "extra" dependency for the password. Think of it another way: you cannot validate a password WITHOUT the logon, because it is possible for different logons to have the same password. If the password has another dependency, so must the logon in this context.

If you are interested in keeping a history of failed logons, then a separate table is an absolute must!

3) Natural Key, Natural Key, Natural Key!!!!!@!!!!

DavidM

"SQL-3 is an abomination.."

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2002-06-18 : 18:21:54
I think 1:1 relationships are the most underused in DB design.

I will do just about anything to avoid nulls and it is usually for this purpose alone that I go for 1:1.

The classic example is the StartTime, EndTime attributes.
Default values for the datetime data type, well... suck! I try to avoid nulls, so I split away this single attribute and the key into a separate table...

A view can fix the "front-end" problem. The likelihood of EndDate having an index is minuscule in the null design, but in the 1:1 design I get to use the key and the really cool EXISTS keyword!
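
A sketch of that split (table and column names here are made up purely for illustration):

```sql
-- 1:1 split instead of a nullable endtime column:
-- a row exists in jobend only once the job has actually ended
create table job (
    jobid int not null
        constraint pk_job primary key clustered,
    starttime datetime not null )

create table jobend (
    jobid int not null
        constraint pk_jobend primary key clustered
        constraint fk_jobend_job references job(jobid),
    endtime datetime not null )
go

-- the view hides the split from the front end
create view job_v as
select j.jobid, j.starttime, je.endtime
from job j
left join jobend je on je.jobid = j.jobid
```

(The view still surfaces a null endtime for open rows, of course; the nulls are just kept out of the base tables.)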

DavidM

"SQL-3 is an abomination.."

graz
Chief SQLTeam Crack Dealer

4149 Posts

Posted - 2002-06-18 : 19:05:04
Great Question! DDL and everything. And all the questions are clear and concise in their requests for information and at the same time invite additional comments! So can you write me a FAQ on how to ask questions?

As to your questions ... keeping in mind that I AM a lazy DBA.

First, I really dislike having two tables that have the same primary key. Especially if you have to have records in both tables to be valid. A design like that is just begging to have one of the two tables have a problem. Now if you always need records in one table but the second one is used only 10% of the time that's a little different scenario. If tables are split out due to the 8K limit that's also a different situation.

Second, no. Failed logon (login?) count and password are both attributes. The fact that password determines a logon failure is a process issue and not a data model or data storage issue. It could as easily be password and two datetime fields that define allowed login times. Also, I think a FirstFailedLogin (datetime) field and a FailedLoginCount (smallint) field could handle most scenarios. And the rule base could easily be table driven. You care about how much time has passed between the first failed login and the most recent, and how many failed logins occurred. Any correct login will reset these fields.
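
As a sketch (the @logon variable, the column types, and the 15-minute window here are just assumptions to illustrate):

```sql
-- assumes userbase carries firstfailedlogin (datetime, nullable)
-- and failedlogincount (smallint, default 0)
declare @logon varchar(16)
set @logon = 'somelogon'

-- on a failed attempt: start a new window or extend the current one
update userbase
set failedlogincount = case
        when firstfailedlogin is null
          or datediff(mi, firstfailedlogin, getdate()) > 15
        then 1
        else failedlogincount + 1 end,
    firstfailedlogin = case
        when firstfailedlogin is null
          or datediff(mi, firstfailedlogin, getdate()) > 15
        then getdate()
        else firstfailedlogin end
where logon = @logon

-- on a correct login: reset both fields
update userbase
set failedlogincount = 0, firstfailedlogin = null
where logon = @logon
```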

Third, remember to consider that integers will sort and join faster than varchars! Hey, it is a valid point to consider. Actually a bigger reason is that people may want to change a username. That reason is very specific to the fact that you picked a table of usernames for this question. Invoice numbers would have been a different answer. If you have a non-natural primary key it makes it easy to change usernames. When Rob wants the cool hacker handle r0bv01k I can easily update just one table, which I think is one of the goals of a relational system. I'm also very fond of single-field primary keys. That leads to less typing for RSI-damaged wrists and elbows. Plus it lets me use EXISTS a lot.

Fourth, when are you going to write an article on transitive dependencies, determinates, multivalued dependencies and other SQL-related words I can barely spell?

===============================================
Creating tomorrow's legacy systems today.
One crisis at a time.

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-06-18 : 19:09:09
Another 2¢:

If you're suggesting that:
create table userbase (
logon varchar(16) not null
constraint pk_userbase primary key clustered,
firstname varchar(20) not null,
lastname varchar(20) not null,
dob smalldatetime )

create table authentication (
logon varchar(16) not null
constraint pk_authentication primary key
constraint fk_authentication_logon references userbase(logon),
password varchar(12) not null )
...is better than this:
create table userbase (
logon varchar(16) not null
constraint pk_userbase primary key clustered,
firstname varchar(20) not null,
lastname varchar(20) not null,
dob smalldatetime,
password varchar(12) not null )

You're gonna have a hard time convincing me. I see two tables with a redundant (not just related) column and redundant constraints, vs. one table with no redundant or superfluous columns or constraints. Other than reasons of security, or ridiculously esoteric performance enhancement (debatable, and better accomplished using other methods), I don't see any benefit to a two-table 1:1 setup. Sorry.

Also, I'm perfectly comfortable with nulls for StartDate and EndDate on a single row instead of a Start attribute/value and End attribute/value on separate rows. Sorry again David, but if I need to compare dates I refuse to believe it's BETTER to perform any kind of summarization where the two dates to be compared are on 2 separate rows, or if I have to test an entire set for the NON-EXISTENCE of a row. And unless every column is individually indexed I'm gonna end up with a number of important queries that will scan the table. Nothing I do on the front end is gonna improve that.

I know you won't agree, but there really is such a thing as too much normalization. I'm all for logical purity in table design AS LONG AS it does not compromise the physical function of the database.


byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2002-06-18 : 19:35:08
That's cool Rob!

Different strokes, different folks!

My point is the query: "Show me the PK of those with no end date."
With a null design:

Select PK from Table1 where EndDate is null

1:1 design:

Select T1.PK from Table1 as T1
where not exists (Select 1 from Table2 as T2 where T2.PK = T1.PK)


Again, my main issue is with nulls; I just have this really big aversion to them... although the current db I am working on has them (nulls). Without them I'm looking at about an extra 10 tables... because the client is paying (i.e. wants the earth but will only pay for 1 acre), I do the right thing by them and just allow nulls...


DavidM

"SQL-3 is an abomination.."

EDIT: for code readability. STOOOOOOPID Snitz bugs!

Edited by - robvolk on 06/18/2002 19:42:29

Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2002-06-18 : 19:52:33
Thank you all for your input. I purposefully worded my original post because I figured I could stimulate an intelligent discussion. My sneaky plan worked.....

I think the surrogate (synthetic/contrived/whatever) vs. natural debate could go on forever. Given some factors outside of this conversation (namely politics and choosing my battles), I think I'll go with the surrogate (int identity) and bastardize that which is pure and relational and good...

quote:
If you're suggesting that: .....is better than this: .... You're gonna have a hard time convincing me.


I agree with you, Rob. The multi-table approach is something I see a lot at the particular client site, and I am trying to get them to change things. I was hoping there would be some solid, logical, absolutely relational reasoning I could use to convince them. The whole room shuts up and agrees with you when you say, "No sir, you've got a determinant that is not a candidate key there, so you are not in Boyce-Codd Normal Form. However, my design is." No one (read: the programmers who do 70% of the database design that happens at so many unfortunate shops) wants to admit they don't understand what you just said. I might as well hold up a poster of Nat Portman to win over the herd of programmers; it works almost as well. However, I find that saying "Two tables with the same key? Why, that's....stupid." just invites the bastards to gang up on you like a pack of hungry dogs. Between Rob's and Graz's comments, I think I can put something a bit more intelligent together than "that's stupid"...

quote:
Fourth, when are you going to write an article on transitive dependencies, determinates, multivalued dependencies and other SQL-related words I can barely spell?


Come on Graz, I AM every bit as lazy...and it's just too easy to point out how redundant such an article would be. Not to mention (as noted above) this is the secret weapon for shutting up the programmers.

I've got a thing against programmers these days ... hum

<O>

M.E.
Aged Yak Warrior

539 Posts

Posted - 2002-06-19 : 10:40:55
quote:
Here's a third question . . . when faced with a varchar(16) natural key on a table that will hold hundreds of thousands of records, would you choose a surrogate key (some meaningless identity)? Tell me more than "integers will sort and join faster" . . . gimme some concrete numbers or experiences.




All the way back up at the top there... You just hit on what the project I'm on is running into. 3 databases (2 flat files, IDMS, and one Oracle) are being combined into one. They each use different keys than the next, so right now the major discussion is finding natural keys so the data from all 3 DBs can be combined. Well, now the databases coming in to me have 3 columns that together make each record unique (only 2 on some tables, fortunately). The big debate is whether or not going through and adding a 4-byte integer to each table and then remapping all the relations through this new integer is worth it. Right now the project head is leaning toward natural keys... So I guess I'm about to find out how many issues we'll run into using varchar(20) columns as keys.

Another (I think we're at point 5?) question I wanna add... Has anyone any experience with using datetime fields as natural keys on a table? Any hang-ups there? The way this one IDMS is set up, the only natural key that tables can be successfully joined on is this datetime (I'm not quite sure how... but I guess the input datetime is what IDMS was using to link records, right down to the millisecond). I'm not exactly sure how well, if at all, this will work coming into SQL Server... Someone said leave the datetime as the key, but I'm not even sure if it would work.

so 5th question : Natural Keys on Datetime fields : good or baaaaad?



-----------------------
Take my advice, I dare ya

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-06-19 : 10:56:53
Datetime keys won't be unique in SQL Server for multi-row inserts:

INSERT INTO myTable SELECT TOP 100 * FROM otherTable

All 100 rows that are inserted will have the same datetime value; even if it takes more than a millisecond to do it, the default value will be calculated at the start of the process and used throughout. This holds true even for 1000's of rows inserted in a single operation. However if you're importing existing data with unique datetimes, and you are absolutely sure that you'll never insert two or more rows at the exact same moment, it might be possible.
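
A quick way to see it for yourself (table names made up):

```sql
create table datetest (
    id int not null,
    created datetime not null
        constraint dft_datetest_created default (getdate()) )

-- multi-row insert: the default is evaluated once for the whole statement
insert into datetest (id)
select top 100 id from othertable

-- this should come back as a single row, no matter how long the insert took,
-- because every inserted row shares the identical 'created' value
select created, count(*) as rows_sharing_value
from datetest
group by created
```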

Stick with using the natural keys unless their performance is *totally unacceptable*...meaning, don't just say "integers will be faster" and use them instead. TEST the natural keys and see if they perform acceptably.


Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2002-06-19 : 11:05:08
The other time I almost always choose a natural key is for simple look-up tables. Like, say, PhoneNumberType. If I know the business rules won't change, I'll put a check constraint on my PhoneNumber table such that type in ('work','home','beepa'). But if I think other types may be added later, I'll create a lookup table using the Type (natural key) as the primary key, then have Phone.Type reference PhoneNumberType(type) ... this way, I can guarantee integrity and grab home phones without the join...
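
For concreteness, the lookup-table flavor might look like this (the phone table's columns are made up; the check-constraint version would just swap the foreign key for type in ('work','home','beepa')):

```sql
create table phonenumbertype (
    type varchar(10) not null
        constraint pk_phonenumbertype primary key clustered )

insert into phonenumbertype (type) values ('work')
insert into phonenumbertype (type) values ('home')
insert into phonenumbertype (type) values ('beepa')

create table phone (
    logon varchar(16) not null
        constraint fk_phone_userbase references userbase(logon),
    type varchar(10) not null
        constraint fk_phone_type references phonenumbertype(type),
    number varchar(20) not null,
    constraint pk_phone primary key clustered (logon, type) )

-- the natural key means no join to grab home phones:
-- select number from phone where type = 'home'
```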

What do you think of that?

<O>

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-06-19 : 11:13:37
Oh yeah, I do that a lot. I've sometimes done 3-row lookup tables with the expectation it might go as high as...5 rows. Which it never did. But yeah, in that case the overhead is really minor, and it gives you the opportunity to expand it greatly without changing the design. Especially if the rows are small and there aren't a lot of them, they can fit comfortably on one data page, or two, and will most likely stay in cache (and if they don't, it's no big deal to PINTABLE a 2-page table).

What's that newfangled computing term, "scalability"? That would give you the most scalability bang for the buck.


M.E.
Aged Yak Warrior

539 Posts

Posted - 2002-06-19 : 11:32:45
That's kinda what I thought as far as inserts with dates would go... There are bulk inserts, so I guess it's not at all possible. I'll have to find a new key and move away from the datetimes I guess.


And for that last bit there, Page, it's already in operation here. Several well types reference a table that has 1 column:

type
o
n
m
l
d

There's the entire table... Seems like something similar to what you're saying there... Never looked at their coding for it, though, but I assume something similar. There's something like 22 tables like this (all named R_tablename). Seems like a great idea to me.


Here's one for you. I have this table that has three different text files loaded to it using DTS. Every night the table is truncated and repopulated from the text files. This was fine for now; however, this table is now being updated/inserted to from SQL as well. So completely truncating it is now out of the question.

The question is: instead of completely truncating the table every night, how would you go about 'transactional' replication of this database (this is what they like to call it)? Pretty much, every night this text file will be loaded to the database. It needs to be compared to the previous version of the text file, and any changes made to the text file need to be applied to the table in question.

Bleh, hard to word it, hopefully you get the point... I don't want actual coded answers here... more theory.

-----------------------
Take my advice, I dare ya

colinm
Yak Posting Veteran

62 Posts

Posted - 2002-06-19 : 11:52:30
Instead of using lots of separate lookup tables, does anybody combine them all together in one table?

select * from master.dbo.spt_values

Looks like Microsoft does. Anybody thoughts on whether this is a good/bad idea?



Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2002-06-19 : 11:57:34
quote:

Instead of using lots of separate lookup tables, does anybody combine them all together in one table?

select * from master.dbo.spt_values

Looks like Microsoft does. Anybody thoughts on whether this is a good/bad idea?




Well, my first thought here is you lose the ability to strongly type your values, which is not something I am usually willing to give up. Secondly, if your lookup entities have some attributes, you quickly get into transitive dependency world and lose 3nf.

<O>