Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
AskSQLTeam
Ask SQLTeam Question
0 Posts
Posted - 2002-03-01 : 09:12:56
Nick writes "Hi. I am planning a database design that will have 6.5 million rows and 40 columns. 35 columns are [smalldatetime] fields; the other 5 are data like ID, State, ZIP, and Gender. What I need help on is:

1) How can I achieve any kind of selection (max of 20 columns simultaneously) in less than 30 seconds? Is it possible? (This is going to be a web application with users running dynamic queries in real time.)

2) What is the best way to partition the database? (Horizontal or vertical?)

3) What kind of hardware do you recommend for this? (RAM, CPU, HD)

The first thing I thought of was to convert all the datetime fields into Gregorian dates, so that I can store those fields as integers and index them.

This is a tough one. I appreciate your time and effort. Any help would be very useful to me. Thank you again. Nick."
graz
Chief SQLTeam Crack Dealer
4149 Posts
Posted - 2002-03-01 : 09:12:56
You should be able to get subsecond response time with a decent table design, good indexes and decent hardware. I have a table with 16 million rows that I can pull a single uncached record in < 1.5 seconds. And that's on a 2 x PIII-550, 2GB RAM, SCSI RAID array. That's a two year old box.
I wouldn't partition the table -- that's way too complex.
I'd leave the datetime values alone. SQL Server can index them just fine, and it will sort them properly for you. Integer values take 4 bytes; smalldatetime also takes 4 bytes.
I am curious what the 35 smalldatetime columns are for, though.
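To illustrate graz's point that date columns can be indexed and range-scanned directly, here is a minimal sketch. It uses SQLite as a stand-in for SQL Server, with made-up table and column names; the principle (no integer conversion needed before indexing or sorting dates) is the same.

```python
import sqlite3

# Hypothetical example: a date column indexed and queried as-is.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (customer_id INTEGER, signup_date TEXT)")
conn.execute("CREATE INDEX ix_signup_date ON signups (signup_date)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [(1, "2002-03-01"), (2, "2001-11-15"), (3, "2002-01-20")],
)
# An indexed range scan over the raw date column -- no integer encoding needed.
rows = conn.execute(
    "SELECT customer_id FROM signups "
    "WHERE signup_date BETWEEN '2002-01-01' AND '2002-12-31' "
    "ORDER BY signup_date"
).fetchall()
print(rows)  # [(3,), (1,)]
```

The optimizer can seek and sort on the date index exactly as it would on an integer index, which is why converting dates to integers buys nothing here.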
Arnold Fribble
Yak-finder General
1961 Posts
Posted - 2002-03-01 : 11:26:08
Can you tell us what sort of queries you'll be running? It's pretty difficult to answer this question knowing the structure of a table but not how it will be used.
ironhard
Starting Member
1 Post
Posted - 2002-03-01 : 13:03:34
Thank you very much for your response. This table is a customer database. Every customer has signed up for different offers on different dates. There are 35 offers and a date for each, giving us 35 smalldatetime fields.

I was thinking of first converting all the dates to smallint, which is 2 bytes, for two reasons:

1) In the original database we only have a YYYY-MM format. There is no day field, so I can just use an integer to represent the number of months since the year 1900.

2) Some of my queries will involve just a single COUNT(*) on a combination of date fields at the same time across the 6.5 million rows, and that part needs to be as quick as possible. Example: 2002-01 (that is, January 2002) will be converted to a smallint of 1213. The difference is 101 years and 1 month, giving us 1212 + 1 = 1213 months. An index on all those fields will speed up the final count.

Also, I was thinking of splitting the table into smaller tables according to state. That would give us 52 state tables. My queries are going to be dynamic, meaning that I choose the table to be searched on the fly from an ASP script.

I am planning to buy a machine with at least 2 GB of RAM, a dual P3 1.2 GHz, and 5-6 Ultra 160, 15K RPM drives in RAID 5.

Again, thank you for your time. Wish me good luck... :)

Nick.

quote:

You should be able to get subsecond response time with a decent table design, good indexes and decent hardware. I have a table with 16 million rows that I can pull a single uncached record from in < 1.5 seconds. And that's on a 2 x PIII-550, 2GB RAM, SCSI RAID array. That's a two-year-old box.

I wouldn't partition the table -- that's way too complex.

I'd leave the datetime values alone. SQL Server can index them just fine, and it will sort them properly for you. Integer values take 4 bytes; smalldatetime also takes 4 bytes.

I am curious what the 35 smalldatetime columns are for, though.
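Nick's proposed smallint encoding can be sketched as a pair of conversion functions. This follows the arithmetic given in the post (2002-01 maps to 1213), which effectively numbers months starting from January 1901 as month 1; the function names are invented for illustration.

```python
# Sketch of the poster's proposed smallint encoding for YYYY-MM values.
# Per the post's arithmetic (2002-01 -> 1213), January 1901 is month 1,
# so the formula is (year - 1901) * 12 + month.
def encode_month(year: int, month: int) -> int:
    return (year - 1901) * 12 + month

def decode_month(code: int) -> tuple:
    years, months = divmod(code - 1, 12)
    return (1901 + years, months + 1)

print(encode_month(2002, 1))  # 1213, as in the post
print(decode_month(1213))     # (2002, 1)
```

The values fit comfortably in a smallint (a 2-byte range covers over 2,700 years of months), though as graz notes above, SQL Server can index the date columns directly without any such conversion.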
MichaelP
Jedi Yak
2489 Posts
Posted - 2002-03-01 : 14:47:06
If you put a clustered index on that "integer date field," that should speed things up pretty well.

If you have data separated into different states, you could create a table for each state, put constraints on the tables to only allow records for that state to enter that table, and put an indexed view over the whole thing. I suggest taking graz's advice and keeping it simple; 6.5 million records in one table is not too bad.

Another question that will definitely need to be answered is: what happens if there is a 36th offer for the customers? Do all customers have all 35 offers?

Michael
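The per-state-table idea MichaelP describes can be sketched as follows. SQLite stands in for SQL Server here (in SQL Server this pattern with CHECK constraints and a UNION ALL view is called a partitioned view); table and column names are made up for the example, and only two states are shown.

```python
import sqlite3

# One table per state, each with a CHECK constraint restricting it to
# that state's rows, unioned behind a single view for querying.
conn = sqlite3.connect(":memory:")
for state in ("NY", "CA"):
    conn.execute(
        f"CREATE TABLE customers_{state} ("
        f"  customer_id INTEGER,"
        f"  state TEXT CHECK (state = '{state}'),"
        f"  zip_code TEXT)"
    )
conn.execute(
    "CREATE VIEW customers AS "
    "SELECT * FROM customers_NY UNION ALL SELECT * FROM customers_CA"
)
conn.execute("INSERT INTO customers_NY VALUES (1, 'NY', '10001')")
conn.execute("INSERT INTO customers_CA VALUES (2, 'CA', '94105')")
# Queries go against the view; the CHECK constraints keep each table honest.
total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(total)  # 2
```

The CHECK constraints are what let the query planner skip tables that cannot contain matching rows, which is the payoff of this design over plain per-state tables.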
aiken
Aged Yak Warrior
525 Posts
Posted - 2002-03-02 : 12:30:00
Traditional database design would be a users table, an offers table, and a join table to link the two. That way your users table doesn't have 35 datetime fields, users who haven't signed up for a particular offer don't have wasted space in their row, and you can add additional offers without changing table schemas. You'll also get higher performance from the join-table approach, since to look up any offer date for any customer you only need one index, versus 35.

Think of it like this:

CREATE TABLE users (
  i int IDENTITY,
  user_name varchar(250) NOT NULL,
  zip_code int
)

CREATE TABLE offers (
  i int IDENTITY,
  offer_name varchar(25)
)

CREATE TABLE user_offers (
  i_users int NOT NULL,
  i_offers int NOT NULL,
  signup_date smalldatetime
)

Cheers

-b
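Here is a runnable sketch of the normalized design aiken describes, with SQLite standing in for SQL Server. Table and column names follow the post's CREATE TABLE statements; the sample data and the index are invented to show the payoff, namely that one index on the join table serves date lookups for every offer.

```python
import sqlite3

# Build the three-table design from the post: users, offers, and a
# user_offers join table carrying the signup date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (i INTEGER PRIMARY KEY, user_name TEXT NOT NULL, zip_code INTEGER);
CREATE TABLE offers (i INTEGER PRIMARY KEY, offer_name TEXT);
CREATE TABLE user_offers (i_users INTEGER NOT NULL, i_offers INTEGER NOT NULL, signup_date TEXT);
CREATE INDEX ix_user_offers ON user_offers (i_offers, signup_date);
INSERT INTO users VALUES (1, 'Alice', 10001), (2, 'Bob', 94105);
INSERT INTO offers VALUES (1, 'Newsletter'), (2, 'Coupons');
INSERT INTO user_offers VALUES (1, 1, '2002-01-15'), (2, 1, '2002-02-03'), (1, 2, '2002-02-20');
""")
# One index answers "who signed up for offer X in a date range" for any
# offer -- instead of 35 separate per-column indexes on a wide table.
n = conn.execute(
    "SELECT COUNT(*) FROM user_offers "
    "WHERE i_offers = 1 AND signup_date >= '2002-01-01'"
).fetchone()[0]
print(n)  # 2
```

Adding a 36th offer is just one new row in the offers table, which answers MichaelP's question above without any schema change.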