| Author |
Topic |
|
ts_abbott@hotmail.com
Starting Member
36 Posts |
Posted - 2004-12-03 : 09:41:46
|
| I've got a table which holds all post codes / address info for the uk (2.2 million records) and i was wondering what the best way to index it would be.this is the table definition.i need to query on PCD (postcode). ie. SELECT * FROM Address WHERE PCD='XX12 0YY'.CREATE TABLE [dbo].[Address] ( [POB] [varchar] (255) NULL , [DST] [varchar] (255) NULL , [STR] [varchar] (255) NULL , [DDL] [varchar] (255) NULL , [DLO] [varchar] (255) NULL , [TWN] [varchar] (255) NULL , [PCD] [varchar] (255) NULL , [CTA] [varchar] (255) NULL , [CTP] [varchar] (255) NULL , [CTT] [varchar] (255) NULL , [SCD] [varchar] (255) NULL , [CAT] [varchar] (255) NULL , [NDP] [varchar] (255) NULL , [GRE] [varchar] (255) NULL , [GRN] [varchar] (255) NULL ) ON [PRIMARY]any help would be much apprecited. currently a simple select count(*) from address takes about 20 seconds. |
|
|
ts_abbott@hotmail.com
Starting Member
36 Posts |
Posted - 2004-12-03 : 09:42:48
|
| i forgot to mention that PCD (post code) isn't unique as there could be as many as 3 other records with the same post code. |
 |
|
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2004-12-03 : 10:46:43
|
| You index it ....according to what delivers the best performance for your application.There is no hard and fast rule to this...and the decision may need to be revisited over time, depending on the evolving workload....You MUST read up on this topic in BOL and SQLTeam.com...it's a core principle for a SQL DBA/Developer to understand the cost/benefits of indices.However in this instance "SELECT * FROM Address WHERE PCD='XX12 0YY'"a non-unique index on the PCD column should be of benefit. |
 |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2004-12-03 : 10:52:35
|
| What? You got this table from Access? First I'd worry about sizing the columns correctly and normalizing the data. An index with varchar(255) would not be a good idea.Brett8-) |
 |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2004-12-03 : 10:55:32
|
Here... Do this and tell us what it returnsSELECT MAX(LEN([POB])) , MAX(LEN([DST])) , MAX(LEN([STR])) , MAX(LEN([DDL])) , MAX(LEN([DLO])) , MAX(LEN([TWN])) , MAX(LEN([PCD])) , MAX(LEN([CTA])) , MAX(LEN([CTP])) , MAX(LEN([CTT])) , MAX(LEN([SCD])) , MAX(LEN([CAT])) , MAX(LEN([NDP])) , MAX(LEN([GRE])) , MAX(LEN([GRN])) FROM ADDRESS Brett8-) |
 |
|
|
ts_abbott@hotmail.com
Starting Member
36 Posts |
Posted - 2004-12-06 : 09:47:46
|
| hi,thanks for all the replies. i've reduced each fields size and added an index to the pcd field and that has made a big difference to query speed.the original data wasn't in access it was in a csv file but for some reason the sql server import process defaults all fields to varchar(255). even when i manually changed it to the correct size it still set them to 255 so I ended up creating the table first and then importing which has worked. |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-12-06 : 10:25:34
|
| make sure you have a primary key defined, also. please!- Jeff |
 |
|
|
ts_abbott@hotmail.com
Starting Member
36 Posts |
Posted - 2004-12-06 : 11:49:30
|
quote: Originally posted by jsmith8858 make sure you have a primary key defined, also. please!- Jeff
why would i need a primary key for this table? |
 |
|
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2004-12-06 : 11:54:04
|
| 99.999% of tables need a primary key....to ensure adherence to proper data modeling requirements....only stuff that doesn't, falls in to the categories of audit-trail or 1-record system-control tables.... |
 |
|
|
ts_abbott@hotmail.com
Starting Member
36 Posts |
Posted - 2004-12-06 : 12:16:48
|
| the data i've got is used purely to lookup an address from it's postcode. it's like a big lookup table so i can't see any benefit in having a primary key as i'll never be linking to it or select/updating/deleteing based on it. |
 |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2004-12-06 : 13:33:49
|
What number do you get from thisSELECT SUM(OCCURS)/COUNT(*)* 1.0000000 AS Density FROM ( SELECT PCD, COUNT(*) AS OCCURS FROM ADDRESS ) AS XXX Brett8-) |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2004-12-06 : 13:43:47
|
quote: the data i've got is used purely to lookup an address from it's postcode. it's like a big lookup table so i can't see any benefit in having a primary key
Preventing duplicates for one thing, however...quote: as i'll never be linking to it or select/updating/deleteing based on it
Doesn't sound like you're using it at all, why import it then? |
 |
|
|
|