index - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

index

Author

Topic

fmardani
Constraint Violating Yak Guru

433 Posts

Posted - 2005-06-22 : 09:00:58

In the staff table there are several columns including ID, firstname and lastname, ...
The queries in the stored procedures are mostly to do with passing firstname and/or lastname to search
Is it best to create a clustered index in the staff table for firstname, lastname ? or shall I create a non-clustered index? if so should I have one non-clustered for each of the two fields or one key for both?
Thanks

Merkin
Funky Drop Bear Fearing SQL Dude!

4970 Posts

Posted - 2005-06-22 : 09:23:54

Imagine your data like a huge list of names you want to publish as a phone book. Obviously you do want to have to search through a huge random list of names to find the one you want, so this is why we use indexes. But you know this....

A clustered index physically reorders your data, a standard phone book is ordered like you suggest : lastname, firstname.

This is great if you need to find all the "Smith" names, or if you need John Smith. But won't help you if you need to find anyone named John. You will have to scan all the rows to find people called John. This is what a Clustered Index Scan is.

A non clustered index is like the index in the back of a book, if you look up "John" and you have a non-clustered index on first name it will tell you "there is one page 7, one on page 13 and one on page 273".

So, if you will be searching on EITHER First name OR Last Name, I'd probably go for a separate Non Clustered index on each of those columns, because a combined index won't help you find only one of those criteria.

Make sense ?

Damian
"A foolish consistency is the hobgoblin of little minds." - Emerson

nieurig
Starting Member

8 Posts

Posted - 2005-06-22 : 09:29:19

Hello,

1. Clustered Index
I think you can't build a Clustered Index on firstname and lastname
because your data will not fit to this unique key.
Otherwise this would a good opertunity instead of having the ID
as a clustered key.

2. NonClustered Index
I have read that 2 Indexes are better because of the benefit
for query which runs on one of this Index-Values. The best way
was testing it using the Query Analyse (Strg+K to see the
Execution plan)

Good luck

Niels

mwjdavidson
Aged Yak Warrior

735 Posts

Posted - 2005-06-22 : 10:23:15

I've rarely read a better explanation - Nice one Damian!

Mark

Merkin
Funky Drop Bear Fearing SQL Dude!

4970 Posts

Posted - 2005-06-22 : 10:57:22

Thanks Mark!

Damian
"A foolish consistency is the hobgoblin of little minds." - Emerson

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-06-22 : 19:42:59

>>I think you can't build a Clustered Index on firstname and lastname
>>because your data will not fit to this unique key.

Clusterd Indexes and Keys are NOT the same thing.
A key (in SQL Server) requires an index, but an index does not require a key.

Here's a tip...
Indexing is HARD.. there are so many factors involved.. transaction rates, transaction types, hardware, predicate use...

DavidM

A front-end is something that tries to violate a back-end.

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-06-22 : 20:51:57

quote:
Originally posted by byrmol

Clusterd Indexes and Keys are NOT the same thing.
A key (in SQL Server) requires an index, but an index does not require a key.

Can you explain how an index (clustered or non-clustered) works if you don't have a key?

A key only exists to provide the sort order (and hence inherent search predicate) of an index. An index cannot exist unless it has a key (sort order) defined for it.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-06-22 : 20:57:57

quote:
Originally posted by Merkin

So, if you will be searching on EITHER First name OR Last Name, I'd probably go for a separate Non Clustered index on each of those columns, because a combined index won't help you find only one of those criteria.

Unless one of the queries is _much_ more common than the other, in which case it may be more efficient to make one of the indexes clustered to avoid the extra IO of having to search the non-clustered index and then do another IO to get the row from the heap. The downside is that queries that go though the non-clustered index will have to then seek into the clustered index too, rather than doing a single page IO into a heap. Again, this tradeoff works if one query is way more common than the other.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-06-22 : 21:10:20

>>Can you explain how an index (clustered or non-clustered) works if you don't have a key?

You've been spending too much time at the disk level... A good thing for the lead on the storage engine... Love your work by the way!

I was speaking about the difference between a KEY (Relational model) vs SQL Server's implmentation of them.. I am pretty sure I can create an index on a non-key column in sql server...

Paul, are you aware of the TransRelationalModel (TM)? Seems right up your alley. [url]http://www.dbdebunk.com/page/page/1548800.htm[/url]

EDIT: Spelling (what a shock!) and question

DavidM

A front-end is something that tries to violate a back-end.

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-06-22 : 21:54:13

quote:
Originally posted by byrmol

>>Can you explain how an index (clustered or non-clustered) works if you don't have a key?

You've been spending too much time at the disk level... A good thing for the lead on the storage engine... Love your work by the way!

I was speaking about the difference between a KEY (Relational model) vs SQL Server's implmentation of them.. I am pretty sure I can create an index on a non-key column in sql server...

Paul, are you aware of the TransRelationalModel (TM)? Seems right up your alley. [url]http://www.dbdebunk.com/page/page/1548800.htm[/url]

EDIT: Spelling (what a shock!) and question

DavidM

A front-end is something that tries to violate a back-end.

Of course, you're absolutely right. In pure relational terms a key is just defined and does not have to be physically manifested in any way. Even in SQL Server , a foreign-key relationship is not manifested. Its easy to become blinkered by implementation details :-)

No - haven't seen that, I'll check it out.

You can't create an index on a non-key column - the columns that are indexed form its key, by definition. In SQL Server 2005, you can include non-key columns in a non-clustered index so you can fully cover queries and avoid the situation I described in my other reply to this thread.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-06-22 : 22:16:10

>>You can't create an index on a non-key column - the columns that are indexed form its key, by >>definition

You are speaking internally again arn't you? So SQL Server creates an internal "row index key" if there is none defined declaratively?

Given this example of a bag...


create table TestIndex(NonKey INT NOT NULL)
GO
CREATE NONCLUSTERED INDEX IDX_NONKEY ON TestIndex (NonKey)
--CREATE CLUSTERED INDEX IDX_NONKEY ON TestIndex (NonKey)
GO
DECLARE @Count INT
SET @Count = 0
WHILE @Count < 1000
BEGIN
	INSERT TestIndex(NonKey)
	VALUES (@Count)
	SET @Count = @Count + 1
END
--Duplicate value 
INSERT TestIndex(NonKey)
	VALUES (1)
GO
Select * 
from TestIndex
WHERE NonKey = 1

Select * 
from TestIndex
WHERE NonKey BETWEEN 5 and 8

--DROP table TestIndex

So what is happening internally here?

DavidM

A front-end is something that tries to violate a back-end.

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-06-23 : 12:31:08

quote:
Originally posted by byrmol

>>You can't create an index on a non-key column - the columns that are indexed form its key, by >>definition

You are speaking internally again arn't you? So SQL Server creates an internal "row index key" if there is none defined declaratively?

Given this example of a bag...

create table TestIndex(NonKey INT NOT NULL)
GO
CREATE NONCLUSTERED INDEX IDX_NONKEY ON TestIndex (NonKey)
--CREATE CLUSTERED INDEX IDX_NONKEY ON TestIndex (NonKey)
GO
DECLARE @Count INT
SET @Count = 0
WHILE @Count < 1000
BEGIN
INSERT TestIndex(NonKey)
VALUES (@Count)
SET @Count = @Count + 1
END
--Duplicate value
INSERT TestIndex(NonKey)
VALUES (1)
GO
Select *
from TestIndex
WHERE NonKey = 1

Select *
from TestIndex
WHERE NonKey BETWEEN 5 and 8

--DROP table TestIndex

So what is happening internally here?

DavidM

A front-end is something that tries to violate a back-end.

Yes, you've implicitly defined an index key, as opposed to a relational key of the table. You have to make sure you qualify the term 'key'. This was my initial confusion in the thread - I assumed you were talking about an index key.

Internally, the nonclustered index is sorted by the index key 'nonKey'. Each row will have the nonKey value and the row-identifier of the data row in the heap or clustered index.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2005-06-24 : 19:16:44

>> I assumed you were talking about an index key
Whoa,
I always thought a key was a key when talking about databases.
But then I am not programming the database, just using it...
Interesting.

rockmoose

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-06-26 : 06:25:04

Thanks for the explanation Paul..

I (and rockmoose) am what Codd refers to as a "user". A key to me, is a relational key signifying identity. How that happens underneath is mostly none of my concern as long as the logical key holds! The idea that the logical is abstracted from the physical is a central tenant of RM and I believe that SQL Server goes closer than most to this goal in regard to file/index management. As I said earlier, great job!

Do CHECK constraints get an index key? Because I can create a logical key without an index by using UDF's and CHECK constraints...

ie


create function keycount(@Key INT)
returns int
as
begin
declare @return int
select @return = COUNT(*) from test where derivedkey = @key
RETURN @return
end
go
create table test(derivedkey int not null CHECK (dbo.keycount(derivedkey) = 1))
GO
insert test (derivedkey)
	SELECT 1
	UNION ALL 
	SELECT 2
go
insert test (derivedkey)
	SELECT 3
	UNION ALL 
	SELECT 2

I'm just trying to reinforce the notion that relational identity is distinct from any physical implemetation..

And I think I speak for everyone here in saying that we appreciate the interaction with the SQL Server dev team members and would love to see more of it!

Thanks again for the insight.

DavidM

A front-end is something that tries to violate a back-end.

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-06-26 : 09:12:31

quote:
Originally posted by byrmol

Do CHECK constraints get an index key? Because I can create a logical key without an index by using UDF's and CHECK constraints...

No they don't.

quote:

And I think I speak for everyone here in saying that we appreciate the interaction with the SQL Server dev team members and would love to see more of it!

Cool - its difficult to persuade more people to get on the newsgroups and forum sites like this but we're working on it.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2005-06-26 : 17:40:08

Yes, it's great to see You here Paul!
You guys are always welcome...

Is there a physical difference in creating a UNIQUE CONTRAINT or a UNIQUE INDEX ?
Are they basically the same thing, or are they treated differently by SQL Server?
When I look in sysindexes table UC / UIX look the same apart from the status.

CREATE TABLE t(c INT NOT NULL)

ALTER TABLE t ADD CONSTRAINT UC_x_c UNIQUE(c) WITH FILLFACTOR = 80 	-- UNIQUE CONSTRAINT
CREATE UNIQUE INDEX UIX_x_c ON t(c) WITH FILLFACTOR = 80 		-- UNIQUE INDEX

EXEC sp_helpindex t
EXEC sp_helpconstraint t
SELECT CONSTRAINT_NAME FROM INFORMATION_SCHEMA.TABLe_CONSTRAINTS WHERE TABLE_NAME = 't'
--SELECT * FROM sysindexes WHERE id = OBJECT_ID('t')

DROP TABLE t

/*
index_name                                          index_description                                   index_keys                                          
--------------------------------------------------- --------------------------------------------------- ----------
UC_x_c                                              nonclustered, unique, unique key located on FG_LAB1 c
UIX_x_c                                             nonclustered, unique located on FG_LAB1             c

constraint_type                                     constraint_name
--------------------------------------------------- ---------------
UNIQUE (non-clustered)                              UC_x_c

CONSTRAINT_NAME
---------------
UC_x_c
*/

Sometimes I need to retrieve all the relational keys of a table, i.e. all PK's + UIX's + UC's defined on a table.
I look for UIX's in the sysindexes system table, PK's and UC's can be found in INFORMATION_SCHEMA.
I would have liked to be able to access this metadata in one place!

rockmoose

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-06-27 : 13:12:18

quote:
Originally posted by rockmoose

Is there a physical difference in creating a UNIQUE CONTRAINT or a UNIQUE INDEX ?
Are they basically the same thing, or are they treated differently by SQL Server?

They're the same thing under the covers although I think the error returned when uniqueness is violated may be different.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2005-06-27 : 13:54:21

Thanks, nice to know that they are the same under the covers.

Found a quirk in BOL / SQL server, a 3604 (IGNORE_DUP_KEY) gets higher severity than PK violation,
contrary to what BOL says !?

Different errors:

CREATE TABLE t(c INT NOT NULL)

--ALTER TABLE t ADD CONSTRAINT PK_t_c PRIMARY KEY(c)			-- PRIMARY KEY
/*
Server: Msg 2627, Level 14, State 1, Line 1
Violation of PRIMARY KEY constraint 'PK_t_c'. Cannot insert duplicate key in object 't'.
The statement has been terminated.*/

--ALTER TABLE t ADD CONSTRAINT UC_t_c UNIQUE(c) WITH FILLFACTOR = 80 	-- UNIQUE CONSTRAINT
/*
Server: Msg 2627, Level 14, State 2, Line 1
Violation of UNIQUE KEY constraint 'UC_t_c'. Cannot insert duplicate key in object 't'.
The statement has been terminated.*/

--CREATE UNIQUE INDEX UIX_t_c ON t(c)			 		-- UNIQUE INDEX
/*
Server: Msg 2601, Level 14, State 3, Line 1
Cannot insert duplicate key row in object 't' with unique index 'UIX_t_c'.
The statement has been terminated.*/

CREATE UNIQUE INDEX UIX_t_c ON t(c) WITH IGNORE_DUP_KEY 		-- UNIQUE INDEX
/*
Server: Msg 3604, Level 16, State 1, Line 27
Duplicate key was ignored.*/
--BOL on 3604 : Severity Level 10 !!!???
--Funny this get's caught before a PK violation


INSERT T SELECT 1
INSERT T SELECT 1
GO

DROP TABLE T
GO

rockmoose

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-06-27 : 14:27:05

quote:

Originally posted by rockmoose

Thanks, nice to know that they are the same under the covers.

Found a quirk in BOL / SQL server, a 3604 (IGNORE_DUP_KEY) gets higher severity than PK violation,
contrary to what BOL says !?

Different errors:

CREATE TABLE t(c INT NOT NULL)

--ALTER TABLE t ADD CONSTRAINT PK_t_c PRIMARY KEY(c)			-- PRIMARY KEY
/*
Server: Msg 2627, Level 14, State 1, Line 1
Violation of PRIMARY KEY constraint 'PK_t_c'. Cannot insert duplicate key in object 't'.
The statement has been terminated.*/

--ALTER TABLE t ADD CONSTRAINT UC_t_c UNIQUE(c) WITH FILLFACTOR = 80 	-- UNIQUE CONSTRAINT
/*
Server: Msg 2627, Level 14, State 2, Line 1
Violation of UNIQUE KEY constraint 'UC_t_c'. Cannot insert duplicate key in object 't'.
The statement has been terminated.*/

--CREATE UNIQUE INDEX UIX_t_c ON t(c)			 		-- UNIQUE INDEX
/*
Server: Msg 2601, Level 14, State 3, Line 1
Cannot insert duplicate key row in object 't' with unique index 'UIX_t_c'.
The statement has been terminated.*/

CREATE UNIQUE INDEX UIX_t_c ON t(c) WITH IGNORE_DUP_KEY 		-- UNIQUE INDEX
/*
Server: Msg 3604, Level 16, State 1, Line 27
Duplicate key was ignored.*/
--BOL on 3604 : Severity Level 10 !!!???
--Funny this get's caught before a PK violation


INSERT T SELECT 1
INSERT T SELECT 1
GO

DROP TABLE T
GO

rockmoose

BOL is correct - its an informational message.

There's a problem with that message in the translation table that OLEDB uses and so it comes out as sev 16. This is fixed in SQL Server 2005.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2005-06-27 : 14:39:28

quote:
BOL is correct - its an informational message.

There's a problem with that message in the translation table that OLEDB uses and so it comes out as sev 16. This is fixed in SQL Server 2005.

Thank You for the information.

rockmoose

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources