Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Search engine advice from the guru's

Author  Topic 

kelleyb
Yak Posting Veteran

61 Posts

Posted - 2004-03-16 : 20:15:53
I have a 32,000-record table of businesses that includes most of the usual fields (business name, address, phone, city, etc). My objective is to return the most relative information to the user as possible when querying the business name. However, I want to know how you GURU's go about returning relative results for a simple text search. I'm too novice to write really complicated procedures with a percentage of relevance so I crafted my own rules. Here are two of the 15 or so statements that I use in my procedure.

----- 90% (BizName, @Querystring + space + ANYTHING)
INSERT INTO #SR(<column names removed for brevity>, Score, Pattern)
SELECT <column names removed for brevity>,
'90' AS Score,
'-->' + @QueryString + ' ' + '%' + '<--' AS Pattern
FROM DIRECTORY
WHERE BizName LIKE @QueryString + ' ' + '%'
AND
DIRECTORY.RowID NOT IN (SELECT BizRowID FROM #SR)

----- 60% (BizName, ANYTHING + space + @Querystring + space + ANYTHING)
INSERT INTO #SR(<column names removed for brevity>, Score, Pattern)
SELECT <column names removed for brevity>,
'60' AS Score,
'-->' + '%' + ' ' + @QueryString + ' ' + '%' + '<--' AS Pattern
FROM DIRECTORY
WHERE BizName LIKE '%' + ' ' + @QueryString + ' ' + '%'
AND
DIRECTORY.RowID NOT IN (SELECT BizRowID FROM #SR)
I have something like 15 different "INSERT INTO #temp_table SELECT ... " statements like this. When the statements match against a record in my business table ("DIRECTORY"), I get results where the patterns that matched look something like this:(matched pattern is between the arrows)

Pattern BizName
-------------------------------- --------------------------------
-->ed %<-- Ed Dale Properties
-->ed[^a-zA-Z0-9][a-zA-Z0-9]%<-- Ed's Auto Body Inc
-->% ed<-- Catholic Department of Ed
-->% ed %<-- Buttner, Ed CPA
-->ed%<-- Eddie's Catering & Social Hall
-->% ed%<-- A G Edwards & Sons Inc
-->%ed%<-- AAA Lederman Bail Bonds

At the end of my procedure I select the temp table, order by my predefined score, and spit it out the door. Where I see some trouble is 1)I'm only querying against a single column (two additional relevant columns in this table), 2)the patterns I have identified as "relevant" seem to me to be only the tip of the iceberg, 3)worried about performance if the procedure gets too "table scan happy".

Those of you who have authored "wicked" search engines, could you give me a few pointers or nuggets of wisdom as I move forward?

Thanks

-Brian

kelleyb
Yak Posting Veteran

61 Posts

Posted - 2004-03-18 : 16:59:27
None of the guru's have responded to this post. Did I miss something? Seriously. Are my questions so simple that I should have found answers to them in BOL? I read the article about how Graz sorts and ranks search results, but I don't think it will work for my particular application.

-Brian
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2004-03-18 : 18:23:44
Have you looked at FTI, Brian?

DavidM

"An Ugg Boot is a generic Australian term that has been in use for nearly 100 hundred years. Now some coporate wanker has trademarked it.. "
Go to Top of Page
   

- Advertisement -