| Author |
Topic |
|
snelson
Starting Member
10 Posts |
Posted - 2004-11-18 : 11:03:05
|
| I have a table with 3 million rows (company names with descriptions). The descriptions are full-text indexed and the company names are also indexed. We are finding that any search is relatively slow (about 4-8 seconds). Can anyone suggest a design or architecture that could improve our query results? The query uses the company name and a freetext search on the description.

PS: The tables have already been horizontally partitioned. This is the row count that we have to deal with.

Is it possible to make a couple of servers work together on a SINGLE QUERY and then just add together the result sets?

If someone could point us in the right direction it would be appreciated.

Thanks very much,
Steve Nelson |
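For reference, the query shape described above probably looks something like this (the table and column names here are made up for illustration, not the actual schema):

```sql
-- Hypothetical sketch of the query shape described in the post:
-- an indexed equality/prefix match on the name plus a full-text
-- FREETEXT predicate on the description.
SELECT c.RecordId, c.CompanyName, c.State
FROM dbo.Companies AS c
WHERE c.CompanyName LIKE 'Acme%'           -- uses the index on CompanyName
  AND FREETEXT(c.Description, 'software'); -- full-text search on the description
```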
|
|
MichaelP
Jedi Yak
2489 Posts |
Posted - 2004-11-18 : 12:54:40
|
| "Is it possible to make a couple of servers work together on a SINGLE QUERY and then just add together the result sets?"

Yes, it's called Distributed Partitioned Views. From what I read, it's not super easy to implement.

Describe your hardware setup. I suspect that throwing RAM and disk at this problem is the way to go.

Michael
<Yoda>Use the Search page you must. Find the answer you will.</Yoda> |
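A minimal sketch of what a Distributed Partitioned View looks like (the linked server, database, table names, and state ranges below are all made up for illustration):

```sql
-- Sketch only: assumes two linked servers (SERVER1, SERVER2) hosting
-- identical member tables, each with a CHECK constraint on State so the
-- optimizer can route a query to just one member.
CREATE VIEW dbo.CompaniesAll
AS
SELECT RecordId, CompanyName, State
FROM SERVER1.SalesDB.dbo.Companies_West   -- CHECK (State IN ('CA', 'OR', 'WA'))
UNION ALL
SELECT RecordId, CompanyName, State
FROM SERVER2.SalesDB.dbo.Companies_East;  -- CHECK (State NOT IN ('CA', 'OR', 'WA'))
```

A query against the view that filters on State should then only touch the server holding that partition.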
 |
|
|
snelson
Starting Member
10 Posts |
Posted - 2004-11-18 : 14:55:34
|
| We are using two dual 2.8 GHz Xeon servers with 2 GB of RAM on board, with 1.5 GB of that RAM dedicated to SQL. We are getting results in around 4 seconds, but we need to get it down to about 0.3-0.5 seconds.

What you suggested sounds very interesting. I haven't heard of it, though. I will do some research.

Any other suggestions in terms of hardware?

Thanks so much. |
 |
|
|
DustinMichaels
Constraint Violating Yak Guru
464 Posts |
Posted - 2004-11-18 : 14:59:51
|
| Have you tried using the CONTAINS predicate instead of the FREETEXT predicate? I'm not an expert on full-text search, but I would think the CONTAINS predicate would be faster.

Dustin Michaels |
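The difference in a nutshell (hypothetical table and column names again):

```sql
-- FREETEXT matches on meaning: it stems the words and matches variations,
-- so it generally does more work per query.
SELECT RecordId
FROM dbo.Companies
WHERE FREETEXT(Description, 'software company');

-- CONTAINS takes an explicit boolean predicate over exact words or
-- prefixes, which is usually cheaper than a FREETEXT match.
SELECT RecordId
FROM dbo.Companies
WHERE CONTAINS(Description, '"software" AND "company"');
```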
 |
|
|
MichaelP
Jedi Yak
2489 Posts |
Posted - 2004-11-18 : 15:01:00
|
| What about your disk configuration? How many disks, what type of RAID setup, what SQL stuff is on which disk, etc.?

With SQL Server, you usually focus on RAM, disk, CPU or disk, RAM, CPU, in that order, when looking at hardware upgrades. If your database is smaller, go for RAM first. If your database is really large (greater than the amount of RAM the server can hold), then add disks.

I bet you need more disks.

Michael
<Yoda>Use the Search page you must. Find the answer you will.</Yoda> |
 |
|
|
snelson
Starting Member
10 Posts |
Posted - 2004-11-18 : 15:07:45
|
| When you say we need more disks, do you mean putting the page file or FTS files on separate disks/controllers? What is your view on SATA drives vs. SCSI drives? |
 |
|
|
MichaelP
Jedi Yak
2489 Posts |
Posted - 2004-11-18 : 15:53:42
|
| Well, the generally accepted disk setup for a reasonably large SQL database is this:

For a read-heavy database:
(2) disks in a RAID 1 configuration for your SQL TX logs (LDF)
(3-5) disks in a RAID 5 configuration for your SQL data (MDF)

For a write-heavy or read/write-balanced database:
(2) disks in a RAID 1 configuration for your SQL TX logs (LDF)
(4-6) disks in a RAID 1/0 configuration for your SQL data (MDF)

If this isn't enough, then you can start doing things like adding another set of disks for your indexes.

As far as SATA vs. SCSI, that's a tough call. Personally, I'd prefer SCSI because those drives are designed to be running all the time and have high performance. There are not too many SATA drives that are designed to be actively working most of their run time. If you can afford it, go SCSI or preferably Fibre Channel. We use Fibre Channel external storage boxes for our SQL Servers with disk setups like the ones above. You are talking about $30k+ or so for the storage subsystem, not counting the servers you need to connect to them.

SATA is good for storing files that don't need constant access. Unless you use the Western Digital Raptors, and then I think it's a different story. I use them at home in a RAID 0 config, and boy are they fast. For SQL Server, though, I'd avoid SATA.

Question: how large is your database in GB? With only 2 GB of RAM total, you might want to bump up to 4 GB and use the /3GB switch.

Michael
<Yoda>Use the Search page you must. Find the answer you will.</Yoda> |
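For what it's worth, the /3GB switch is a boot.ini entry on Windows 2000/2003; the ARC path and OS description below are just an example, yours will differ:

```
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /fastdetect /3GB
```

It shifts the user/kernel split so a process like sqlservr.exe can address up to 3 GB instead of 2 GB.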
 |
|
|
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2004-11-18 : 16:16:39
|
| 1) Split out the FT column into its own table.
2) Ensure that the FT table has its own filegroup and files.
3) As a bonus, the filegroup and files should be placed on their own disk.
4) In the FT column, break 1NF by placing the key in with the description, i.e. "Microsoft||Really big software company that needs to do a lot of work on FTI".

DavidM
"Always pre-heat the oven" |
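Steps 1-3 above might look something like this (database name, file path, and column sizes are assumptions for the sketch):

```sql
-- Give the full-text table its own filegroup and file, ideally on its
-- own spindle (the E:\ path here is hypothetical).
ALTER DATABASE SalesDB ADD FILEGROUP FTFileGroup;
ALTER DATABASE SalesDB
ADD FILE (NAME = FTData, FILENAME = 'E:\SQLData\FTData.ndf')
TO FILEGROUP FTFileGroup;

-- The split-out description column lives in its own table on that filegroup:
CREATE TABLE dbo.CompanyDescriptions
(
    RecordId    INT            NOT NULL PRIMARY KEY,
    Description NVARCHAR(3000) NOT NULL
) ON FTFileGroup;
```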
 |
|
|
snelson
Starting Member
10 Posts |
Posted - 2004-11-18 : 16:19:45
|
| Hey Michael,

You have been super helpful. Our database is about 30 GB. The 3 million row table I specified is just the table for California. I am going to have to take a closer look at hardware.

Is there a program out there that will get the tables ready for Distributed Partitioned Views? |
 |
|
|
MichaelP
Jedi Yak
2489 Posts |
Posted - 2004-11-18 : 16:29:50
|
| I don't know of a program to handle the DPVs for you automatically.

For a 30 GB database and only 2 GB of RAM, I think that you definitely need to add RAM to start (4 GB total at LEAST), and then take David's suggestions about the FT filegroup and files and my suggestions about the disk subsystem. Considering your setup, 4 seconds is not bad at all.

Michael
<Yoda>Use the Search page you must. Find the answer you will.</Yoda> |
 |
|
|
snelson
Starting Member
10 Posts |
Posted - 2004-11-18 : 23:38:52
|
| Thanks for your advice. We have decided to probably buy an 8-processor Xeon server with 16 GB of RAM.

My question is: with this type of setup, would it be better to use DPVs and cluster a few dual Xeons together, or would it be better to buy the 8-processor box? What we have are a ton of tables. Here is an example of one:

RecordId | Company Name | State
Primary key. Full-text indexed. Constraint State='CA'.

We have users specifying the company name and a state, so our query only uses the company name.

How could we divide this table up for DPV, when the only values we get from the user are the company name and state? Would I just separate the table in half, or what is the best way to do this if I want to just split it in two?

Thanks very much for all your help. |
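Since each table is already pinned to one state, one hedged way to "split it in two" is to range-partition on the company name, so both the view and the CHECK constraints line up with the value users actually supply (table names, column sizes, and the 'N' split point below are all made up):

```sql
-- Sketch: split the CA table on a CompanyName range; the CHECK
-- constraints let the optimizer touch only one member table when a
-- query filters on CompanyName.
CREATE TABLE dbo.Companies_CA_AM
(
    RecordId    INT          NOT NULL PRIMARY KEY,
    CompanyName VARCHAR(200) NOT NULL CHECK (CompanyName < 'N'),
    State       CHAR(2)      NOT NULL CHECK (State = 'CA')
);

CREATE TABLE dbo.Companies_CA_NZ
(
    RecordId    INT          NOT NULL PRIMARY KEY,
    CompanyName VARCHAR(200) NOT NULL CHECK (CompanyName >= 'N'),
    State       CHAR(2)      NOT NULL CHECK (State = 'CA')
);

CREATE VIEW dbo.Companies_CA
AS
SELECT RecordId, CompanyName, State FROM dbo.Companies_CA_AM
UNION ALL
SELECT RecordId, CompanyName, State FROM dbo.Companies_CA_NZ;
```

To make it a *distributed* partitioned view, the two member tables would sit on different linked servers and the view would reference them by four-part names.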
 |
|
|
MichaelP
Jedi Yak
2489 Posts |
Posted - 2004-11-19 : 02:49:11
|
| In my experience, and from everything I've been told, it's usually best to get the bigger box with SQL Server. DPVs sound cool and have some scalability advantages over not using them, but they're not super easy to work with. Horizontal partitioning is as far as I've gotten with "high performance" SQL Server stuff. Even with some fairly slow hardware, I'm able to do intense things with my 25+ million row partitioned view in just a few seconds on my single physical SQL Server.

Be sure to keep in mind what I mentioned before about where to spend money for SQL Server. The 16 GB of RAM is a GREAT idea and will make a huge difference in your queries. If you can afford it, DO IT! Instead of the 8-way Xeon box, you might want to get an 8-way-capable box but only put two or four processors in it to start. You can always add them down the road, and that leaves you extra money to spend on more / faster disks. Pump money into your disk subsystem! After RAM, it's the best bang for the buck. Usually the sales folks for disk subsystems will have a technical guy who can help you configure the subsystem so that it's as fast as possible for what you are trying to do.

Something else you could try: post some CREATE TABLE and INSERT INTO statements on the board, and we can try to help you optimize your existing code to make it perform better with your existing hardware.

That should be enough things for you to think about for a while. I'm off to play some Half-Life 2!

Michael
<Yoda>Use the Search page you must. Find the answer you will.</Yoda> |
 |
|
|
|