Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 ORDER BY NEWID()

Author  Topic 

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2004-02-25 : 22:20:41
I just read somewhere that a good way to generate a random recordset would be to select like this:

SELECT * FROM table ORDER BY NEWID()

First of all I can't understand why this works at all since there needn't be a uniqueindentifyer-field in the select and order by-items usually have to be a part of the tables you are selecting from. Secondly I'd really like to understand what the heck is going on when I do something like this and how it impacts my select. Does anybody know?

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2004-02-25 : 22:30:03
Lumbago,

It will order the entire set.. so its pretty slow.. Must it does give random results...

It is one of those functions that is generated per row like identity...As opposed to RAND()

Try it in a select first

Select *, NEWID() from Table




DavidM

"SQL-3 is an abomination.."
Go to Top of Page

derrickleggett
Pointy Haired Yak DBA

4184 Posts

Posted - 2004-02-25 : 22:41:01
I believe if you do that, you'll still want to have order by 2 at the end, or something like that.

select t1.col1, t1.col2, newid() as rand_id from table t1 order by rand_id


MeanOldDBA
derrickleggett@hotmail.com

When life gives you a lemon, fire the DBA.
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2004-02-25 : 22:43:28
So what you're saying is that if I do a SELECT TOP 10 * FROM table ORDER BY NEWID() on a table with say 100 000 records it will first generate the 16-bit guid for all of them and then select the top 10 of them ordered by the guid alphabetically?? If that's the case then damn...sorry for the foul language but I guess we would have to find another method then ;)

I belive I read some lace else that each row in a table is actually not referenced internally by the primary key but by some kind of number that uniquely identifies the row within the database or something, would this be something to use? "ORDER BY SUBSTRING(InternalRowID, 6, 15)" or something :D

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2004-02-25 : 22:45:24
Hmm...this wouldn't be very random when I come to think of it...bummer :(

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

derrickleggett
Pointy Haired Yak DBA

4184 Posts

Posted - 2004-02-25 : 22:49:10
That's not what I was saying. Per BOL:

If a SELECT statement that includes TOP also has an ORDER BY clause, the rows to be returned are selected from the ordered result set. The entire result set is built in the specified order and the top n rows in the ordered result set are returned.

It will order the entire set by the NEWID(), then select the top 10. Order by NEWID() I believe is what you want.


MeanOldDBA
derrickleggett@hotmail.com

When life gives you a lemon, fire the DBA.
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2004-02-25 : 22:56:26
Yup, thanx derrick...I was actually referring to byrmols post but you managed to sneak it in while I was writing, but that doesn't really matter. I got the point ;) Just out of curiosity; are there any other functions I can order by in the same manner as newid()? Still find the syntax a bit weird...

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

derrickleggett
Pointy Haired Yak DBA

4184 Posts

Posted - 2004-02-25 : 23:05:17
Not really. There are some things you can do with functions if you want to randomize based on algorithms. NEWID() is the easiest to use though.

MeanOldDBA
derrickleggett@hotmail.com

When life gives you a lemon, fire the DBA.
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2004-02-25 : 23:22:51
My SELECT example was to show that NEWID is generated per row, not per result set as RAND is...

SELECT *, NEWID(), RAND() FROM TABLE

DavidM

"SQL-3 is an abomination.."
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2004-02-25 : 23:23:35
quote:
I belive I read some lace else that each row in a table is actually not referenced internally by the primary key but by some kind of number that uniquely identifies the row within the database or something, would this be something to use? "ORDER BY SUBSTRING(InternalRowID, 6, 15)" or something
This is only true of a table with no clustered index. When creating a non-clustered index on such a table, the row identifier (RID) is made up of the file id, page id, and slot id of the row on the data page. If the table does have a clustered index, then all non-clustered indexes reference the clustering key's value. In the case of duplicate key values, SQL Server adds a suffix to each key to uniquely identify them. Books Online and the Inside SQL Server books describe this in greater detail.

In any event, the internal RID is not accessible through SQL functions, and because data is stored in pages, update/insert/delete operations frequently cause these values to be changed, and can force all of the indexes to be rebuilt unnecessarily. That's one of the reasons why it's always a good idea to have a clustered index on each of your tables; the index would only be altered if part of the clustering key was updated.
Go to Top of Page
   

- Advertisement -