Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Other SQL Server 2008 Topics
 Performance of Search Query

Author  Topic 

prana_besh
Starting Member

6 Posts

Posted - 2011-06-07 : 04:02:14
Hi All ,

We are tring to build a simple search table (in SQL Server 2008) for one of the clients. The Seach table will have the following structure:

Rule ID: VARCHAR(15)
Search Sring: VARCHAR(255)
TimeStamp: DTATETIME

Number of rows inserted in the table: approx 10 million per day, same is the number of searches will be performed.
No. of concurrent users: ~1000
Search Criteria: SELECT COUNT(…) FROM <TABLE_NAME> WHERE RULE_ID=<..> AND Search_String=<SEARCH_STRING> AND TimeStamp BETWEEN <date time start>, <date time end>

Tha database size may grow @300-400MB/day, purging will be done to keep last 90 day's data.

Anyone please suggest what level of optimization (indexing etc. if so where) would be necessary to ensure search operation is quicker.

Key Facts:

1. There can be 5 to 25 simimar searches can be initiated for a single transaction
2. The front-end application (web based) will initiate a search and wait for a response. Hence performance is the key.
3. The front-end will also start a insert procedure after a search is coomplete. For each search criteria there will be one insert to the table.


Many Thanks,
Pranabesh

prana_besh
Starting Member

6 Posts

Posted - 2011-06-09 : 13:34:39
Hi ...

Advice please?
Go to Top of Page

prana_besh
Starting Member

6 Posts

Posted - 2011-06-16 : 03:49:55
Still waiting for expert's advice, please....
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-06-16 : 05:53:23
First of all you'll probably need to partition this table. That means you'll need Enterprise Edition of SQL server if you don't already have it. Then I'd suggest you create one partition for each day, that would make maintenance fairly "easy", at least compared to keeping it all in the same table.

Then indexing...hm. Some questions:
- Are there more columns than these 3?
- Are all search queries identical?
- Are all search operations an equal to operation (col = @col) or do you have like operations as well (col LIKE @col)?
- Are inserts sequential?

Can you give some specific examples of the data in this table and what the search queries actually look like?

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
Go to Top of Page

prana_besh
Starting Member

6 Posts

Posted - 2011-06-17 : 12:23:04
quote:
Originally posted by Lumbago

First of all you'll probably need to partition this table. That means you'll need Enterprise Edition of SQL server if you don't already have it. Then I'd suggest you create one partition for each day, that would make maintenance fairly "easy", at least compared to keeping it all in the same table.

Then indexing...hm. Some questions:
- Are there more columns than these 3?
- Are all search queries identical?
- Are all search operations an equal to operation (col = @col) or do you have like operations as well (col LIKE @col)?
- Are inserts sequential?

Can you give some specific examples of the data in this table and what the search queries actually look like?

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/



Thanks Lumbago.

We do have Enterprise edition, so partioning can be done. Thanks for the suggestion.

1. I am pretty sure there will not be more than 3 columns as it fits the requirement.
2. Search queries are all identical except the varition of parameters (e.g. time)
3. The matching will be always equal to operation (i.e col = @col), no LIKE.
4. Are inserts sequential? - not sure what do you mean. The plan is - the Front-end will perform a search and fire a insert query asynchronously. so it is not necessary to maintain sequence. The front-end will not wait until insert is done.

Example data:

'FRD_RSK_0000001', 'Mr. Holand Math|30/06/1966|SW1 3CM, 90 EXP, LONDON, UK', '12/06/2011 10:20:21'

Search can be

SELECT COUNT(*) FROM <TABLE_NAME> WHERE RULE_ID='FRD_RSK_0000001' AND Search_String='Mr. Holand Math|30/06/1966|SW1 3CM, 90 EXP, LONDON, UK' AND TimeStamp BETWEEN '13/06/2011', '13/03/2011'
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-06-20 : 04:06:53
Hm, what about this RuleID? Does it have to have this format? Is it possible to split it in 2 or 3 columns so that the 'FRD' and 'RSK' would be separated from the number? Or is the 'FRD_RSK' static and can be removed altogether? The problem is that if you index this column as the clustered primary key (or a nonclustered regular index for that matter) the index will be highly fragmented, and inserts will be slow and cause a lot of page splitting. If however you could separate the number at least to a different column and inserts were sequential (like an identity-column or something) it would be a different matter.

These are difficult requirements...I'd maybe consider creating one single clustered index on all columns (a covering index) putting the most selective columns in the beginning of the index.

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
Go to Top of Page

prana_besh
Starting Member

6 Posts

Posted - 2011-06-20 : 04:36:20
quote:
Originally posted by Lumbago

Hm, what about this RuleID? Does it have to have this format? Is it possible to split it in 2 or 3 columns so that the 'FRD' and 'RSK' would be separated from the number? Or is the 'FRD_RSK' static and can be removed altogether? The problem is that if you index this column as the clustered primary key (or a nonclustered regular index for that matter) the index will be highly fragmented, and inserts will be slow and cause a lot of page splitting. If however you could separate the number at least to a different column and inserts were sequential (like an identity-column or something) it would be a different matter.

These are difficult requirements...I'd maybe consider creating one single clustered index on all columns (a covering index) putting the most selective columns in the beginning of the index.

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/b




The format of RuleID is not decided, but if this helps we can suggest to make the ruleid numeric instead of varchar. However, ruleid is not the unique key, multiple rows can have same ruleids.

I am more worried about the search than inserts (as they are asynchronous) but you have already indicated inserts may cause page splitting.

Will the single index covering all columns make the search faster?
Go to Top of Page

prana_besh
Starting Member

6 Posts

Posted - 2011-06-23 : 08:37:55
Instead of 'Search Sring: VARCHAR(255)' if we use 'Search Sring: VABINARY(50)' and store the md5 hash of the actual string (instead of the string itself) will it make the search faster?

md5 hash is actually a hexadecimal string of some fixed length e.g. 20 char or so. So this column can then have a fixed length data instead of varchar.

The application can send the md5 hash key while doing the search.
Go to Top of Page
   

- Advertisement -