Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Compare and rank by similarities in a column

Author  Topic 

djchrome
Starting Member

19 Posts

Posted - 2002-12-27 : 14:14:39
Is there any way to compare all the data in a single column in one table across all rows and then order the results by the percentage of matching words?

For an example, I have a table that holds the description of a project as a text field. I would like to be able to query all the rows in that table (for the description column) and if more than 1 row has the same text then it gets numeric value, the more and more similarities a row has with other rows, the greater that value.

I imagine this is similar to what a search engine does, but I would like to do it without a keyword. Just rank the rows by similiaries so I can see how many projects I have that are alike.

Thanks my fellow nurds!

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2002-12-27 : 14:53:11
if "Table" is your table, and "Field" is the field to check out, this will return the # of records that match each field:

SELECT Field, COUNT(*) as NumberOfMatches
FROM
Table
GROUP BY Field

So, if you want to update a field called "NumberOfMatches", then you have:

UPDATE Table
SET NumberOfMatches = (SELECT COUNT(*) FROM Table A WHERE A.Field =Table.Field)

If "Field" is a TEXT field, you may have to convert it to a large VARCHAR() first in the above SQL.

Also: the matches returned in the above query must be exact, character for character matches (though not case sensitive). if that requirement doesn't work for you, you will need a MUCH more complicated solution and some sort of hashing or matching algorithm.


- Jeff
Go to Top of Page

MichaelP
Jedi Yak

2489 Posts

Posted - 2002-12-27 : 15:10:41
Going back to what Jeff said, if his solution doesn't work for you, you might need to break each description into separate words similar to this:

http://www.sqlteam.com/item.asp?ItemID=5857

Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>
Go to Top of Page
   

- Advertisement -