Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums › SQL Server 2000 Forums › Transact-SQL (2000)
 Cursor VS While


Lopaka
Starting Member

48 Posts

Posted - 2006-03-27 : 16:11:36
Problem:
I have a table containing 500K records, “and growing”. Table (a) has a field containing several words that I would like to compare against keywords in table (b). Table (b) contains either a single word or multiple words separated by semicolons. Table (b) will continue to grow as long as table (a) grows; currently table (b) has about 500 records.

Question:
I am currently reading about ‘Cursors’… Before I head down this path, would it be better to loop over the select statements and compare each row against table (b), or would it be better to use a cursor to loop over the select?

EXAMPLE:

Table A:
FIELD1
AAA CCC EEE

Table B:
FIELD1, FIELD2
1, AAA
2, BBB
3, CCC
4, DDD
5, EEE
6, FFF

Output:
@VARIABLE = 135
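[Editor's note: one set-based way to build that output, without a loop or a cursor, is SQL Server's variable-concatenation trick. A minimal sketch, assuming tables and columns shaped exactly like the example above:]

```sql
-- builds the concatenated IDs of the keywords found in TableA.FIELD1
DECLARE @VARIABLE varchar(8000)
SET @VARIABLE = ''

SELECT @VARIABLE = @VARIABLE + CONVERT(varchar(10), b.FIELD1)
FROM TableA a
INNER JOIN TableB b
    -- pad the title with spaces so '% AAA %' also matches at the ends
    ON ' ' + a.FIELD1 + ' ' LIKE '% ' + b.FIELD2 + ' %'
ORDER BY b.FIELD1

-- for the sample rows this would typically yield '135'; note that the
-- concatenation order in SELECT @var = @var + ... is not guaranteed
-- by SQL Server, so treat this as a sketch rather than a contract
```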

AGAIN, remember that the number of records will continue to grow...

Thank you


Robert R. Barnes

jhermiz

3564 Posts

Posted - 2006-03-27 : 16:14:10
You could do it all without a loop, using the LIKE predicate.
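[Editor's note: a minimal sketch of that idea, using the hypothetical TableA/TableB layout from the example in the question:]

```sql
-- set-based match: every (title, keyword) pair in one pass,
-- no loop and no cursor
SELECT a.FIELD1 AS Title,
       b.FIELD1 AS KeywordId,
       b.FIELD2 AS Keyword
FROM TableA a
INNER JOIN TableB b
    -- space-pad the title so the pattern matches whole words
    -- at the start and end as well as in the middle
    ON ' ' + a.FIELD1 + ' ' LIKE '% ' + b.FIELD2 + ' %'
```

One caveat: a LIKE pattern with a leading wildcard cannot use an index, so this still scans, but it lets the engine do the work in a single statement.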




Keeping the web experience alive -- [url]http://www.web-impulse.com[/url]

RS Blog -- [url]http://weblogs.sqlteam.com/jhermiz[/url]

Lopaka
Starting Member

48 Posts

Posted - 2006-03-27 : 16:18:25
Due to the number of records and the size of the field in table A, I was wondering which would be faster, a loop or a cursor? I currently have a basic loop; I inherited it from a SAS programmer... :( A completely different story...

Robert R. Barnes

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2006-03-27 : 17:53:10
quote:
Originally posted by Lopaka

Problem:
I have a table containing 500K records, “and growing”. Table (a) has a field containing several words that I would like to compare against keywords in table (b). Table (b) contains either a single word or multiple words separated by semicolons.

So both table (a) and table (b) may store several words in a single field? This is like a double-barreled shotgun blast to any concept of normalization.

How's your résumé looking? Up to date?

Whatever. Any LIKE() solution you come up with is going to be faster than a cursor, but maybe not by much. Throw any concept of indexes out the window.

You REALLY need to fix this schema, or you are going to be creating code to cover its flaws indefinitely.

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2006-03-27 : 17:54:47
To answer your question: a cursor IS a loop, so you will not get any appreciable performance improvement using one (I predict). It's possible the database engine may be somewhat optimized to handle cursors over programmatic loops.
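[Editor's note: for illustration, the two row-by-row patterns being compared look like this; a sketch only, with hypothetical table and column names:]

```sql
-- cursor form: the engine hands you one keyword row at a time
DECLARE @Word varchar(255)
DECLARE kw CURSOR LOCAL FAST_FORWARD FOR
    SELECT FIELD2 FROM TableB
OPEN kw
FETCH NEXT FROM kw INTO @Word
WHILE @@FETCH_STATUS = 0
BEGIN
    -- ... per-keyword work goes here ...
    FETCH NEXT FROM kw INTO @Word
END
CLOSE kw
DEALLOCATE kw

-- WHILE-loop form: identical row-at-a-time work, driven by the key
DECLARE @Id int
SELECT @Id = MIN(FIELD1) FROM TableB
WHILE @Id IS NOT NULL
BEGIN
    SELECT @Word = FIELD2 FROM TableB WHERE FIELD1 = @Id
    -- ... per-keyword work goes here ...
    SELECT @Id = MIN(FIELD1) FROM TableB WHERE FIELD1 > @Id
END
```

Either way each keyword is processed one row at a time, which is why neither buys much over the other.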

Lopaka
Starting Member

48 Posts

Posted - 2006-03-31 : 11:55:22
I will try to explain in better detail...
TableA.ColA
Contains "Senior Systems Analyst" or "Certified Public Accountant". It now contains 1.2 million distinct titles.
TableB.ColA
Contains a list of single words, such as "Senior", "Systems", "Analyst", "Certified", "Public", "Accountant", etc... There are 25K distinct words...
My algorithm would compare each word in TableA with the list of words in TableB, then figure out that the first title belongs to IT and the second title belongs to the Financial department.

Robert R. Barnes

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-03-31 : 14:19:03
Is there a reason a simple join won't work? Why do you need a loop *or* a cursor?

Assuming that you have a column in your "Words" table that identifies a category, like "I.T." for example, a simple LIKE join does the trick:

select Titles.Title, min(Words.Category) as Category
from Titles
inner join Words
    -- pad the title with spaces so the [^a-z] "word boundary" test
    -- also works at the start and end of the title
    on ' ' + Titles.Title + ' ' like '%[^a-z]' + Words.Word + '[^a-z]%'
group by Titles.Title

That returns one category per title from the matching words; MIN() just picks the one with the lowest alphabetical sort. You can expand on this to find the best match by using COUNT(*) and other techniques.
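[Editor's note: the COUNT(*) idea might be sketched like this, assuming the same hypothetical Titles/Words tables; picking the single winning row per title would take a correlated subquery or a second query in SQL Server 2000:]

```sql
-- score each (title, category) pair by how many of its keywords
-- matched; the first row per title is then the best candidate
select Titles.Title, Words.Category, count(*) as MatchedWords
from Titles
inner join Words
    on ' ' + Titles.Title + ' ' like '%[^a-z]' + Words.Word + '[^a-z]%'
group by Titles.Title, Words.Category
order by Titles.Title, count(*) desc
```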

Efficiency should not matter -- something like this should not be run "on the fly". You should categorize your titles once and store the values in your table, and then do it only for new titles going forward.

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-03-31 : 14:23:04
quote:
Originally posted by Lopaka

I will try to explain in better detail...
TableA.ColA
Contains "Senior Systems Analyst" or "Certified Public Accountant". It now contains 1.2 million distinct titles.
TableB.ColA
Contains a list of single words, such as "Senior", "Systems", "Analyst", "Certified", "Public", "Accountant", etc... There are 25K distinct words...
My algorithm would compare each word in TableA with the list of words in TableB, then figure out that the first title belongs to IT and the second title belongs to the Financial department.

Robert R. Barnes



I just read this more carefully... you did realize that words like Senior, Analyst, and Certified are extremely common in all kinds of titles, right?

Once again, it would be best if you gave us actual specifics about what you are doing, with real table and column names and a little representative sample data, so that we can see what your goal is. Your question is way too broad, and at this point alarms are going off indicating to me that what you are trying to do is probably not a good idea.
   
