| Author |
Topic |
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2004-02-25 : 22:20:41
|
| I just read somewhere that a good way to generate a random recordset would be to select like this:SELECT * FROM table ORDER BY NEWID() First of all I can't understand why this works at all since there needn't be a uniqueindentifyer-field in the select and order by-items usually have to be a part of the tables you are selecting from. Secondly I'd really like to understand what the heck is going on when I do something like this and how it impacts my select. Does anybody know?--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
|
|
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2004-02-25 : 22:30:03
|
Lumbago,It will order the entire set.. so its pretty slow.. Must it does give random results...It is one of those functions that is generated per row like identity...As opposed to RAND()Try it in a select firstSelect *, NEWID() from Table DavidM"SQL-3 is an abomination.." |
 |
|
|
derrickleggett
Pointy Haired Yak DBA
4184 Posts |
Posted - 2004-02-25 : 22:41:01
|
| I believe if you do that, you'll still want to have order by 2 at the end, or something like that.select t1.col1, t1.col2, newid() as rand_id from table t1 order by rand_idMeanOldDBAderrickleggett@hotmail.comWhen life gives you a lemon, fire the DBA. |
 |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2004-02-25 : 22:43:28
|
| So what you're saying is that if I do a SELECT TOP 10 * FROM table ORDER BY NEWID() on a table with say 100 000 records it will first generate the 16-bit guid for all of them and then select the top 10 of them ordered by the guid alphabetically?? If that's the case then damn...sorry for the foul language but I guess we would have to find another method then ;) I belive I read some lace else that each row in a table is actually not referenced internally by the primary key but by some kind of number that uniquely identifies the row within the database or something, would this be something to use? "ORDER BY SUBSTRING(InternalRowID, 6, 15)" or something :D--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
 |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2004-02-25 : 22:45:24
|
| Hmm...this wouldn't be very random when I come to think of it...bummer :(--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
 |
|
|
derrickleggett
Pointy Haired Yak DBA
4184 Posts |
Posted - 2004-02-25 : 22:49:10
|
| That's not what I was saying. Per BOL: If a SELECT statement that includes TOP also has an ORDER BY clause, the rows to be returned are selected from the ordered result set. The entire result set is built in the specified order and the top n rows in the ordered result set are returned.It will order the entire set by the NEWID(), then select the top 10. Order by NEWID() I believe is what you want.MeanOldDBAderrickleggett@hotmail.comWhen life gives you a lemon, fire the DBA. |
 |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2004-02-25 : 22:56:26
|
| Yup, thanx derrick...I was actually referring to byrmols post but you managed to sneak it in while I was writing, but that doesn't really matter. I got the point ;) Just out of curiosity; are there any other functions I can order by in the same manner as newid()? Still find the syntax a bit weird...--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
 |
|
|
derrickleggett
Pointy Haired Yak DBA
4184 Posts |
Posted - 2004-02-25 : 23:05:17
|
| Not really. There are some things you can do with functions if you want to randomize based on algorithms. NEWID() is the easiest to use though.MeanOldDBAderrickleggett@hotmail.comWhen life gives you a lemon, fire the DBA. |
 |
|
|
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2004-02-25 : 23:22:51
|
| My SELECT example was to show that NEWID is generated per row, not per result set as RAND is...SELECT *, NEWID(), RAND() FROM TABLEDavidM"SQL-3 is an abomination.." |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2004-02-25 : 23:23:35
|
quote: I belive I read some lace else that each row in a table is actually not referenced internally by the primary key but by some kind of number that uniquely identifies the row within the database or something, would this be something to use? "ORDER BY SUBSTRING(InternalRowID, 6, 15)" or something
This is only true of a table with no clustered index. When creating a non-clustered index on such a table, the row identifier (RID) is made up of the file id, page id, and slot id of the row on the data page. If the table does have a clustered index, then all non-clustered indexes reference the clustering key's value. In the case of duplicate key values, SQL Server adds a suffix to each key to uniquely identify them. Books Online and the Inside SQL Server books describe this in greater detail.In any event, the internal RID is not accessible through SQL functions, and because data is stored in pages, update/insert/delete operations frequently cause these values to be changed, and can force all of the indexes to be rebuilt unnecessarily. That's one of the reasons why it's always a good idea to have a clustered index on each of your tables; the index would only be altered if part of the clustering key was updated. |
 |
|
|
|