Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Set Based Solution?

Author  Topic 

cDc
Starting Member

30 Posts

Posted - 2004-07-07 : 05:32:10
Hi
wondering if someone can work some magic for this?

My requirement is to select a distinct set if id's based on a comma seperated input such as "the,matrix" whereby the keywords are narrowed down like the sql below but to allow for as many keywords as neccessary. I got this working using temp tables but i am not hapy with performance.



select distinct a.itemid FROM
(
select itemid from keyworditem
inner join keywords on keywords.id = keyworditem.keywordid
and keyword = 'matrix'
) a
inner join
(
select itemid from keyworditem
inner join keywords on keywords.id = keyworditem.keywordid
and keyword = 'the'
) b ON b.itemid = a.itemid

CREATE TABLE [dbo].[keywords] (
[id] [int] IDENTITY (1, 1) NOT NULL ,
[keyword] [varchar] (50) COLLATE Latin1_General_CI_AS NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[keyworditem] (
[itemid] [int] NOT NULL ,
[keywordid] [int] NOT NULL
) ON [PRIMARY]
GO



thanks!

ditch
Master Smack Fu Yak Hacker

1466 Posts

Posted - 2004-07-07 : 05:37:24

select itemid from keyworditem
inner join keywords on keywords.id = keyworditem.keywordid
and keyword like '%matrix%'
and keyword like '%the%'

Duane.
Go to Top of Page

cDc
Starting Member

30 Posts

Posted - 2004-07-07 : 05:50:50
that wont work as i need to dynamically do this based on the number of keywords passed to a stored procedure. e.g. "the,matrix,revolutions"

i was thinking of using dynamic sql and just tacking on the inner join each time... but im thinking there must be an easier way...

cheers.
Go to Top of Page

cDc
Starting Member

30 Posts

Posted - 2004-07-07 : 06:02:54
Here is the dynamic sql version.. seems to work OK!


CREATE PROCEDURE [dbo].[TestSearch2]
@searchterms varchar(200)
AS
set nocount on

set @searchterms = @searchterms + ','

declare @count int
set @count = 1

declare @dynsql varchar(5000)

while patindex('%,%' , @searchterms) <> 0
begin

declare @separator char(1)
declare @separator_position int
declare @array_value varchar(1000)

select @separator_position = patindex('%,%' , @searchterms)
select @array_value = left(@searchterms, @separator_position - 1)
select @searchterms = stuff(@searchterms, 1, @separator_position, '')

IF @count = 1
begin
set @dynsql = 'select distinct ['+cast(@count as varchar(2))+'].itemid FROM (select itemid from keyworditem inner join keywords on keywords.id = keyworditem.keywordid and keyword = '''+@array_value+''') ['+cast(@count as varchar(2))+']'
end
ELSE
begin
set @dynsql = @dynsql + 'inner join (select itemid from keyworditem inner join keywords on keywords.id = keyworditem.keywordid and keyword = '''+@array_value+''') ['+cast(@count as varchar(2))+'] ON ['+cast(@count as varchar(2))+'].itemid = ['+cast(@count-1 as varchar(2))+'].itemid '


end

set @count = @count + 1

end

set nocount off
exec (@dynsql)
GO

Go to Top of Page

ditch
Master Smack Fu Yak Hacker

1466 Posts

Posted - 2004-07-07 : 06:03:56
How about this:
DECLARE @Keyword VARCHAR(4000)
DECLARE @SQL varchar(4000)
DECLARE @SearchString VARCHAR(4000)

SET @Keyword = 'the,matrix,revolutions'

SET @SQL = 'select itemid from keyworditem
inner join keywords on keywords.id = keyworditem.keywordid '

SET @SearchString = REPLACE(@Keyword, ',' , '''% and keyword like ''%')
SET @SearchString = ' and keyword like ''%' + @SearchString + '%'''


exec(@SQL + @searchstring)

Duane.
Go to Top of Page

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2004-07-07 : 06:35:07
cDc, I'm not entirely clear what you are aiming at there.

If you get 3 entries:
The,Matrix
Matrix,Revolutions
The,Matrix,Revolutions

Are you looking to have something like in a table.

keyword frequency
The 2
Matrix 3
Revolutions 2

as an example? And then a search on "the" returns the frequency 2? or should it return links to "entries" 1 and 3, since they had the keyword ?

D@MN - you pasted while I was looking away :-) Let me rethink this ...

CiaO

*##* *##* *##* *##*

Chaos, Disorder and Panic ... my work is done here!
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2004-07-07 : 07:09:22
Have alook at the following link:

http://www.sqlteam.com/item.asp?ItemID=2652
Go to Top of Page

cDc
Starting Member

30 Posts

Posted - 2004-07-07 : 13:01:19
Thanks folks
i think this is way more complicated than it sounds
my original dynamic SQL was flawed. (if you search on a word that doesnt exists in the keywords table then it borks)

the desired output is like this

searchstring = 'The'
should return A,B
searchstring = 'The,Matrix'
should return A,B
searchstring = 'The,Matrix,Reloaded'
should return B
searchstring = 'The,Matrix,Blah'
should return A,B

hope that makes more sense!

I have got it working by using a fairly awful temp table approach, I create one temp table on the first keyword, and then create a second temp table for the second keyword then i iterate round the keywords in the searchstring each time narrowing the results.

it would be ok but im dealing with somewhere in the region of 500,000 keywords and approx 10million rows in my keyworditem table


insert into items (itemid,description) values ('A','The Matrix Revolutions')
insert into items (itemid,description) values ('B','The Matrix Reloaded')

insert into keywords (1,'the')
insert into keywords (2,'matrix')
insert into keywords (3,'revolutions')
insert into keywords (4,'reloaded')

insert into keyworditem (itemid,keywordid) values ('A',1)
insert into keyworditem (itemid,keywordid) values ('A',2)
insert into keyworditem (itemid,keywordid) values ('A',3)
insert into keyworditem (itemid,keywordid) values ('B',1)
insert into keyworditem (itemid,keywordid) values ('B',2)
insert into keyworditem (itemid,keywordid) values ('B',4)




THANKS!
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2004-07-07 : 13:12:35
You need to deal with ranking, as the whole set of these things will be a series of ORs...not ANDs....

Take a look at:

http://weblogs.sqlteam.com/brettk/archive/2004/05/05/1312.aspx



Brett

8-)
Go to Top of Page

kselvia
Aged Yak Warrior

526 Posts

Posted - 2004-07-07 : 20:20:09
Not sure I understand why searchstring = 'The,Matrix,Blah' would return A,B. Would this work?

declare @searchstring varchar(255)
declare @sql varchar(5000)
set @searchstring = 'The,Matrix,Reloaded'
set @sql = 'select distinct itemid from keyworditem
inner join keywords on keywords.id = keyworditem.keywordid
and keyword in (''' + replace (@searchstring,',',''',''') + ''')'
select @sql
exec (@sql)


Sql= select distinct itemid from keyworditem
inner join keywords on keywords.id = keyworditem.keywordid
and keyword in ('The','Matrix','Reloaded')

Results:
itemid
-----------
1
2
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2004-07-07 : 21:46:05
[code]
--create tally table
set nocount on
create table dbo.numbers ( n int primary key)
declare @n int ; set @n = 1
while @n <=255 --max length of input parameter
begin
insert into numbers
select @n
set @n = @n + 1
end
GO


declare @searchstring varchar(30)
set @searchstring = 'The,Matrix'

SELECT ki.itemid,
CASE WHEN count(ki.itemid) - d2.ttl_item_ct = 0 THEN 'Complete Match' ELSE 'Partial Match' END as Match
FROM keywords k
JOIN keyworditem ki on ki.keywordid=k.keywordid
JOIN
(
SELECT DISTINCT NullIf(SubString(',' + @searchstring + ',' , n , CharIndex(',' , ',' + @searchstring + ',' , n) - n) , '') AS description
FROM numbers
WHERE n <= Len(',' + @searchstring + ',') AND SubString(',' + @searchstring + ',' , n - 1, 1) = ','
) d on d.description = k.description
JOIN
(
SELECT itemid as item_ct,count(itemid) as ttl_item_ct
FROM keyworditem
GROUP BY itemid
) d2 on d2.item_ct = ki.itemid
GROUP BY ki.itemid,d2.ttl_item_ct

[/code]
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-07-08 : 09:43:38
How about something like:

Use a split function, if you don't have one:
quote:

CREATE FUNCTION dbo.Split
(
@RowData nvarchar(2000),
@SplitOn nvarchar(5)
)
RETURNS @RtnValue table
(

Id int identity(1,1),
Data nvarchar(100)
)
AS
BEGIN

Declare @Cnt int
Set @Cnt = 1

While (Charindex(@SplitOn,@RowData)>0)
Begin
Insert Into @RtnValue (data)
Select
Data = ltrim(rtrim(Substring(@RowData,1,Charindex(@SplitOn,@RowData)-1)))

Set @RowData = Substring(@RowData,Charindex(@SplitOn,@RowData)+1,len(@RowData))
Set @Cnt = @Cnt + 1
End

Insert Into @RtnValue (data)
Select Data = ltrim(rtrim(@RowData))

Return

END
Go



then try the following
quote:

Declare @items table (itemId nvarchar(10), description nvarchar(100))
insert into @items (itemid,description) values ('A','The Matrix Revolutions')
insert into @items (itemid,description) values ('B','The Matrix Reloaded')

Declare @keywords table (keyWordId int, keyWord nvarchar(100))
insert into @keywords values (1,'the')
insert into @keywords values (2,'matrix')
insert into @keywords values (3,'revolutions')
insert into @keywords values (4,'reloaded')

Declare @keywordItem table (itemId nvarchar(10), keyWordId int)
insert into @keyworditem (itemid,keywordid) values ('A',1)
insert into @keyworditem (itemid,keywordid) values ('A',2)
insert into @keyworditem (itemid,keywordid) values ('A',3)
insert into @keyworditem (itemid,keywordid) values ('B',1)
insert into @keyworditem (itemid,keywordid) values ('B',2)
insert into @keyworditem (itemid,keywordid) values ('B',4)


Declare @searchStr nvarchar(1000)

Set @searchStr = 'the,matrix'

Declare @searchWords table (id int, searchWord nvarchar(100))
Insert Into @SearchWords Select * From dbo.split(@searchStr,',')

Select D.ItemId, D.Description
From @keyWords as A
Inner Join @SearchWords as B
On A.KeyWord = B.searchWord
Inner Join @keyWordItem as C
On A.keyWordId = C.keyWordId
Inner Join @items as D
On C.itemId = D.itemId
Group By D.ItemId, D.Description
Having count(*) = (Select count(*) From @SearchWords)




Corey
Go to Top of Page

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2004-07-08 : 14:46:06
Use this split function... its faster and it doesn't require a numbers or 'tally' table (although I think it would be faster with one).


CREATE FUNCTION dbo.udfSplit (@param varchar(8000),@delim as varchar(3))
RETURNS TABLE
AS
RETURN
(
SELECT SUBSTRING(@delim + @param + @delim, Number + len(@delim),
CHARINDEX(@delim, @delim + @param + @delim, Number + LEN(@delim))
- Number - LEN(@Delim)) AS [item]
FROM (
SELECT n1.n+n2.n+n3.n+n4.n+1 Number
FROM
(
SELECT 1 as n
UNION ALL
SELECT 2 as n
UNION ALL
SELECT 3 as n
UNION ALL
SELECT 4 as n
UNION ALL
SELECT 5 as n
UNION ALL
SELECT 6 as n
UNION ALL
SELECT 7 as n
UNION ALL
SELECT 8 as n
UNION ALL
SELECT 9 as n
UNION ALL
SELECT 0 as n
) n1
CROSS JOIN
(
SELECT 0 as n
UNION ALL
SELECT 10 as n
UNION ALL
SELECT 20 as n
UNION ALL
SELECT 30 as n
UNION ALL
SELECT 40 as n
UNION ALL
SELECT 50 as n
UNION ALL
SELECT 60 as n
UNION ALL
SELECT 70 as n
UNION ALL
SELECT 80 as n
UNION ALL
SELECT 90 as n
) n2
CROSS JOIN
(
SELECT 0 as n
UNION ALL
SELECT 100 as n
UNION ALL
SELECT 200 as n
UNION ALL
SELECT 300 as n
UNION ALL
SELECT 400 as n
UNION ALL
SELECT 500 as n
UNION ALL
SELECT 600 as n
UNION ALL
SELECT 700 as n
UNION ALL
SELECT 800 as n
UNION ALL
SELECT 900 as n
) n3
CROSS JOIN
(
SELECT 0 as n
UNION ALL
SELECT 1000 as n
UNION ALL
SELECT 2000 as n
UNION ALL
SELECT 3000 as n
UNION ALL
SELECT 4000 as n
UNION ALL
SELECT 5000 as n
UNION ALL
SELECT 6000 as n
UNION ALL
SELECT 7000 as n
) n4
) n
WHERE SUBSTRING(@delim + @param + @delim, number, LEN(@delim)) = @delim
AND Number <= LEN(@delim + @param + @delim) - LEN(@delim)
)
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-07-08 : 16:05:07
Wouldn't the splitting of the search criteria is the smalles part of the process??


Corey
Go to Top of Page

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2004-07-08 : 16:53:43
A loop isn't all that efficient. I was simply attempting to improve the solution's efficiency.

I believe you will also wish to modify this line:

Declare @searchWords table (id int, searchWord nvarchar(100))
Insert Into @SearchWords Select * From dbo.split(@searchStr,',')

and instead use this one:

Declare @searchWords table (searchWord nvarchar(100))
Insert Into @SearchWords Select s.item From dbo.udfsplit(@searchStr,',') s
inner join @keywords k on s.item = k.Keyword

The reason for the additional inner join is to satisfy the criteria listed above about non-matching keywords. Using Seventhnight's code will not return results for 'the,matrix,blah' because the having clause will return 3 instead of 2.
Go to Top of Page

cDc
Starting Member

30 Posts

Posted - 2004-07-10 : 11:05:53
Thanks a lot for the input - I will spend plenty of time going through your solutions. For the record i had come up with this but have not had time to do detailed testing yet!

Cheers


CREATE PROCEDURE [dbo].[TestSearch2]
@searchterms varchar(200)
AS
set nocount on
create table #test (description varchar(30))

set @searchterms = @searchterms + ','

declare @count int
set @count = 0

while patindex('%,%' , @searchterms) <> 0
begin

declare @separator char(1)
declare @separator_position int
declare @array_value varchar(1000)

select @separator_position = patindex('%,%' , @searchterms)
select @array_value = left(@searchterms, @separator_position - 1)
select @searchterms = stuff(@searchterms, 1, @separator_position, '')
IF EXISTS (select * from keywords where keyword = @array_value)
begin
insert into #test (description) values (@array_value)
set @count = @count + 1
end
end

set nocount off

select itemid from keywords s
inner join #test t on s.keyword = t.description
inner join keyworditem i on i.keywordid = s.id
group by itemid
having count(*) >= @count

drop table #test

Go to Top of Page
   

- Advertisement -