What are the performance characteristics of Guids? - SQL Server Forums

Home | Weblogs | Forums

Site Sponsored By: SQLDSC - SQL Server Desired State Configuration

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

What are the performance characteristics of Guids?

Author

Topic

Lavos
Posting Yak Master

200 Posts

Posted - 2004-04-13 : 21:37:14

Hola,

While the issue is probably going to be dead soon, a small firestorm was set off at my work about using Guid's as a uniqueifier in our database/biz objects.

All arguments about uniqueifiers denormalizing the logical data schema aside, what exactly would I be able to expect to be different when using uniqueidentifier over, say, a bigint identity column?

Are there any major pitfalls (besides allowing update anomalies) that we would have to watch out for?

For instance, if the index on the Guid column is a clustered index, the guid is going to be added to every single index on the table. (Similar problems would occur if the clustered index is on a string column I believe.)

Is the join'ing of two tables on a guid column going to be significantly slower over a bigint?

What about doing straight database lookups on a column?

I think the major advantage was that we wouldn't have to worry about matching up newly inserted identity columns back on the client since the client can reliably make those guids up itself, and in theory help in our legacy integration (How it was going to help in that, I have no idea,) and help if we have to merge our database with another association's.

----------------------
"O Theos mou! Echo ten labrida en te mou kephale!"

"Where theres a will, theres a kludge." - Ken Henderson

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2004-04-13 : 22:09:33

GUID's and clustered indexes don't mix very well, simply because of the sorting required.. splitting is common.. scan density drops like a stone...

Bigint is 8 bytes, GUID 16 bytes so JOIN's "should" be slower with GUID's..As for significantly slower, I don't know.. define significant?!

Lookups are OK....

Maintainance is probably the biggest pain... space, re-indexing etc.. seem to raise there ugly heads more often when GUID's are used....

DavidM

"If you are not my family or friend, then I will discriminate against you"

Lavos
Posting Yak Master

200 Posts

Posted - 2004-04-13 : 23:44:52

Who knows what significant means.

I would assume that it would mean, in a real world environment, there is faster growth in the ratio of costs vs the ratio of key sizes. It's kind of hard to tell if proportional growth by the same standard would count. Since I'm intoxicated, let me give a quick formula since I know I'm not speaking clearly:

((Cost of Query using Guid)/(Cost of Query using Other Key))
/
((Size of Guid)/(Size of Other Key))

If that ratio is greater than one, then it's probably significant. Less than one, most likely not. Around one, then it would probably depend on how big the denominator was.

Our problem is that noone at my place or our collab partners have real experience with using Guids in production.

This means we really don't know what to expect if we went down that route. As it is, we've had some extremely vocal support for using surrogates, and so we'll be using them regardless.

Since I tend to take Fowler's approach to surrogates, as opposed to Celko, I'm fairly ambivalent and middle of the road. I'm starting to slip towards using natural keys for our current development, but it's not my call.

----------------------
"O Theos mou! Echo ten labrida en te mou kephale!"

"Where theres a will, theres a kludge." - Ken Henderson

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2004-04-14 : 06:48:27

I'm not really experienced on this and I only use GUID's to store in a cookie and verify logins on my website. However, the fact that a GUID requires 16 bytes and a bigint "only" requires 8 I would belive it's fair to say that bigint is preferable. If you really need to use bigint as your identity as opposed to int in your tables, converting them all to GUID's will double the size of all identities and for what...? It almost feels like people think using GUIDs are cool because they are globally unique or something. I have in my short experience never found any good use for it except in my example above, but then again I'm only a "semirookie".

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"

X002548
Not Just a Number

15586 Posts

Posted - 2004-04-14 : 09:25:46

First, Never user used GUID
Second, used NewId() for randomness
third, If you're talking about surogates, local to a database, why are you worrying about a Global Unique Identifier
fourth, the cookie thing makes sense...is really unique to the World?
fifth, I'd worry more about performance...

This was a fun thread...

http://www.dbforums.com/showthread.php?threadid=973338&perpage=15&pagenumber=4

Oh, and MOO

Can we have a poll of how many people use GUID in Production?

Brett

8-)

Lavos
Posting Yak Master

200 Posts

Posted - 2004-04-14 : 20:06:42

Well, it's all very easy to say not to use one, but we have to justify using it or not using it.

GUID's, if generated by the correct algorithmn are pretty much guaranteed unique across the world. Part of the algorithm that produces them is based on the unique MAC address of your ethernet adapter, which helps ensure that if all other conditions are the same, our computers produce different GUIDs.

Our physical database spans DB2 and SQL Server, and possibly a couple of other systems. The plan right now, is to have one source of record for different types of information, and use services to aggregate the data from other sources.

There are small technicalities to deal with with identity columns if you ever have to merge your data with someone else's, but I think that could be handled when it's time to handle it :)

The one major advantage, is that you can generate the Guid client side , which makes it easier to relate parent-children records when sending updates to the server. With identities, you have to carefully update the client with the proper ID's once the server generates them.

Of course, if I could turn back time to before I was hired, I would've pushed for a non-relational client side representation (AKA no DataSets, and using XPathDocuments ;) to avoid some of the messy object-relational headaches like parent->children identity worries.

----------------------
"O Theos mou! Echo ten labrida en te mou kephale!"

"Where theres a will, theres a kludge." - Ken Henderson

derrickleggett
Pointy Haired Yak DBA

4184 Posts

Posted - 2004-04-14 : 20:43:57

Our physical database spans DB2 and SQL Server, and possibly a couple of other systems. The plan right now, is to have one source of record for different types of information, and use services to aggregate the data from other sources.

I would be extremely interested in how you implement this across the different systems. :) If you would care to share it at the time.

MeanOldDBA
derrickleggett@hotmail.com

When life gives you a lemon, fire the DBA.

Lavos
Posting Yak Master

200 Posts

Posted - 2004-04-14 : 23:33:18

Sorry, it's mainly boring stuff. I mispoke, and should have said our physical database Layer, which might make a difference in how it's imagined. :)

Unfortunately, I'm pretty much completely client side, as is most of the group I'm with, and our collab partners (who own the mainframe and all the super important data) are doing the server side, with the Data Access and Database stuff.

Mainly, it comes down to using web services, or remotable objects, that call other components that know where the source of record is that they have to pull from to assemble the information it's responsible for.

For instance, one of our components might be responsible for customer data, and it knows how to split that data up into chunks to send to the right system for updates, and how to query for chunks from each system to assemble into one DataSet for the service to merge into whatever else it might be grabbing before returning to the client.

Right now, they are using objects to wrap DB2 or Sql Server stored procs to make the calls, so the assembling components knows what chunks of data go to which stored proc object, while in theory the stored proc object could point to anything. (There is another object to wrap connection info that can be passed to either type of proc object for user identification and connection info.)

I don't know how well that model is going to work, but it looks good enough on paper.

Now that I think about it, I have used Guids before. It was to form session ID's in the database for audit trails, but haven't used them since. Of couse, that company folded before that got into production, so it doesn't count.

----------------------
"O Theos mou! Echo ten labrida en te mou kephale!"

"Where theres a will, theres a kludge." - Ken Henderson

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources