Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 Other Forums
 Other Topics
 Random Selection based on weights

Author  Topic 

yakoo
Constraint Violating Yak Guru

312 Posts

Posted - 2003-01-08 : 10:52:27
Does anyone know of a good solid way to randomly select a record from a table base on weightings?


Table: MyTable
FIELD1 WEIGHT
A 10
B 50
C 15
D 20
E 5



Say I select from this table 100 times, is there any way that we can be anywhere close to having
10 A selections, 50 B selections, 15 C selections, and so forth.

I know there will most likely be no exact selections due to randomness but is there any way to come close?


Got SQL?

Edited by - yakoo on 01/08/2003 10:53:25

Edited by - yakoo on 01/08/2003 10:54:07

nr
SQLTeam MVY

12543 Posts

Posted - 2003-01-08 : 11:08:57
something like

declare @i int, @j int
select @i = sum(Weight) from tbl
select @j = (@i * rand()) + 1
select @j
select min(Field1) from tbl t1 where @j <= (select sum(t2.Weight) from tbl t2 where t2.Field1 <= t1.Field1)


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

Edited by - nr on 01/08/2003 11:10:35
Go to Top of Page

yakoo
Constraint Violating Yak Guru

312 Posts

Posted - 2003-01-08 : 11:21:13
I can't really rely on the values of FIELD1. The letters A-E were just used as an example, since my actual table contains more fields. I was really interested in a way of just randomly selecting a record from a table based on weights given to each record, without using a cursor.

I have come close but the solution uses a cursor.


DECLARE @randomNumber int, @currValue int, @temp = 0
SELECT @temp = 0, @currValue = 0
SELECT @temp = SUM(WEIGHT) FROM MyTable
SELECT @randomNumber = (@temp * rand()) + 1

DECLARE @field1 char, @weight int
DECLARE randmom_cursor CURSOR FOR
SELECT FIELD1, WEIGHT FROM MyTable
OPEN random_cursor
FETCH NEXT FROM random_cursor INTO @field1, @weight
WHILE @@FETCH_STATUS = 0 BEGIN
@currValue = @currValue + @weight
IF (@currValue > @randomNumber)
BREAK
FETCH NEXT FROM random_cursor INTO @field1, @weight
END
CLOSE random_cursor
DEALLOCATE random_cursor

SELECT @field1



Got SQL?
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-01-08 : 11:24:12
What I gave you just needs Field1 to be unique - doesn't matter what the values are. To get the whole record

select * from tbl where Field1 =
(select min(Field1) from tbl t1 where @j <= (select sum(t2.Weight) from tbl t2 where t2.Field1 <= t1.Field1))

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2003-01-08 : 12:02:50
What about the following the idea of creating a tally table for each weighting...ie the tallytable would contain

A 01
A 02
A 03
A 04
A 05
A 06
A 07
A 08
A 09
A 10
B 01
B 02
B 03
B 04
B 05
B 06
B 07
B 08
B 09
B 10, ETC

C 01
C 02
C 03
C 04
C 05
C 06
C 07
C 08
C 09
C 10
C 11
C 12
C 13
C 14
C 15
D 01
D 02
D 03
D 04
D 05
D 06
D 07
D 08
D 09
D 10
D 11
D 12
D 13
D 14
D 15
D 16
D 17
D 18
D 19
D 20
E 01
E 02
E 03
E 04
E 05, etc


and joining this table to your data table....with a join that will match each one of the above to a random row in the main data table....


that way you should be able to skip the cursor.....not sure it would work....just some latteral thinking...

Go to Top of Page

graz
Chief SQLTeam Crack Dealer

4149 Posts

Posted - 2003-01-08 : 12:08:25
The approach that AndrewMurphy described is exactly how I display banners on SQLTeam.com. I have one table with a record for each banner that includes a weight. A trigger on that table populates a "selection" table. A weight of 1 creates one entry in the selection table, a weight of 2 creates two records in the selection table, etc. Then I use a simple SELECT TOP1 ... ORDER BY NEWID() to select a single record. The trigger isn't very effecient but it only runs a couple of times per month. The actual selection of a record is pretty quick.

===============================================
Creating tomorrow's legacy systems today.
One crisis at a time.
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-01-08 : 12:13:18
That's what I first thought of - but all it's doing is just allocating a record number for each weighting entry - so why not just calculate which entry the random number would fall in.
Saves creating an extra table and populating with records.


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

yakoo
Constraint Violating Yak Guru

312 Posts

Posted - 2003-01-08 : 12:41:47
How can TOP and NEWID guarantee randomness?


SELECT TOP 1 * FROM MyTable mt
INNER JOIN Selection s
ON (mt.ID = s.mt_ID)
ORDER BY NEWID()


How would performance differ for a million records (even 10 million records) for the cursor
attempt and the one graz has implemented on sqlteam.com?

Also is there something going on with the display of messages using 'preview' and in actual postings. It seems that I have to 'break' messages out so that they show up completely.

Got SQL?
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-01-08 : 12:46:30
order by newid() will benerate a guid for each row and should be random and order by that.

try this.
I think on some OS's newid() doesn't generate a random value though

select *
from (select i = 1 union select 2 union select 3) as a
order by newid()

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

graz
Chief SQLTeam Crack Dealer

4149 Posts

Posted - 2003-01-08 : 12:52:49
I don't think this solution would work well on a million plus records.

Here's the article I wrote on NEWID (http://www.sqlteam.com/item.asp?ItemID=8747). We've got a couple of articles on randomness at http://www.sqlteam.com/FilterTopics.asp?TopicID=135 I know one them discusses using weights.

===============================================
Creating tomorrow's legacy systems today.
One crisis at a time.
Go to Top of Page
   

- Advertisement -