Article: Efficiently Reuse Gaps in an Identity Column

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

Site Related Forums

Article Discussion

Article: Efficiently Reuse Gaps in an Identity Column

Author

Topic

AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2010-02-09 : 08:50:18

This article will demonstrate an efficient way to reuse gaps in an identity column. Please note that this is something you normally shouldn't be bothered about in a well-designed database or application. However, there are circumstances where you are forced to do this.

Read Efficiently Reuse Gaps in an Identity Column

mjda
Starting Member

6 Posts

Posted - 2010-02-09 : 11:43:58

Nice article. I usually avoid running out of identity range by doing something like:
CREATE TABLE dbo.Yaks(YakID bigint identity(-9223372036854775808,1),YakName char(20))
:)

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-02-09 : 17:28:04

Yes, and using 8 bytes of storage instead of 4.
For 200 million records, that is a saving of 763MB in data storage, and another 763mb index storage.
763 mb storage space is 100 000 less pages to store, seek and scan.

N 56°04'39.26"
E 12°55'05.63"

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2010-02-09 : 19:48:50

quote:
Originally posted by mjda

Nice article. I usually avoid running out of identity range by doing something like:
CREATE TABLE dbo.Yaks(YakID bigint identity(-9223372036854775808,1),YakName char(20))
:)

Is starting a bigint identity with a negative number seed really necessary in an practical application?

Using just the positive numbers, a table with only a bigint identify column and no other columns would use 67,108,864 TB of storage before you ran out of numbers. And it would take over 290 years to load that table at a rate of 1 billion rows per second.

CODO ERGO SUM

ya3mro
Starting Member

37 Posts

Posted - 2010-05-18 : 02:06:59

very excellent solution to such a usual problem

How I Came To Islam?
http://english.islamway.com

Celko
Starting Member

23 Posts

Posted - 2010-07-29 : 12:16:37

Since I don't use IDENTITY or other metadata in my DDL, this is not a problem for me javascript:insertsmilie('

') .

However, when we wanted to find gaps in a sequence for re-use, the old trick was:

CREATE VIEW Vacancies(vacant_seq)
AS
SELECT vacant_seq
FROM (SELECT seq +1
FROM Foobar
WHERE seq +1
NOT IN (SELECT seq FROM Foobar)
AND seq+1 <= <<upper limit of seq>>
UNION
SELECT seq -1
FROM Foobar
WHERE seq -1
NOT IN (SELECT seq FROM Foobar)
AND seq-1 >= 1;

It does not get all of the open slots, but it was good enough when you needed a quick answer for a few openings in a relatively small table. The other trick was

SELECT (<<upper limit of seq>> - COUNT(seq)) AS vacancy_cnt
FROM Foobar;

Today, we would probably use ROW_NUMBER() to update the seq and put all of the vacancies on the high end of the table:

UPDATE Foobar
SET seq = ROW_NUMBER() OVER (ORDER BY seq ASC);

Author of
SQL for Smarties
Thinking in Sets
SQL Puzzles and Answers
Analytics and OLAP in SQL
SQL Programming Style
Standards in SQL
Data & Databases
Trees & Hierarchies in SQL

X002548
Not Just a Number

15586 Posts

Posted - 2010-07-29 : 12:30:35

WELCOME JOE!!!!!!!!!!

Have a beer on me...Love your books

Stop by the Corracl and say hi!

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=148000

Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-07-30 : 00:54:30

The idea is to update ONLY the records who needs to be updated.
Why update all 100 million records when there are only 10 or 20 records to update?

N 56°04'39.26"
E 12°55'05.63"

Celko
Starting Member

23 Posts

Posted - 2010-07-30 : 18:04:34

quote:
Originally posted by Peso

The idea is to update ONLY the records who needs to be updated.
Why update all 100 million records [sic: rows] when there are only 10 or 20 records [sic: rows] to update?

Because in the old days, we never had big tables

I am not sure how SQL server will treat this, but you could also write Standard SQL to save some effort:

UPDATE Foobar
SET seq = ROW_NUMBER() OVER (ORDER BY seq ASC)
WHERE seq <> ROW_NUMBER() OVER (ORDER BY seq ASC);

Author of
SQL for Smarties
Thinking in Sets
SQL Puzzles and Answers
Analytics and OLAP in SQL
SQL Programming Style
Standards in SQL
Data & Databases
Trees & Hierarchies in SQL

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-07-30 : 18:44:26

Create a table with 9 rows, numbered 1 to 9. Now remove row 2.


Seq    ROW_NUMBER
1      1
3      2
4      3
5      4
6      5
7      6
8      7
9      8

It still means you are updating 89% of the table.
With my suggestion you only update 1 record (Seq 9 becomes Seq 2).

N 56°04'39.26"
E 12°55'05.63"

Jeff Moden
Aged Yak Warrior

652 Posts

Posted - 2010-11-26 : 22:24:32

Peter. This isn't an attack but I have to ask, why would you really want to reuse IDENTITY values or their equivalent? Why even bother having such a column if it means so little that you can reuse and reassign the numbers? Like I said, not an attack... just an insane amount of curiosity.

--Jeff Moden

"Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER! "
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row"

Jeff Moden
Aged Yak Warrior

652 Posts

Posted - 2010-11-26 : 22:31:35

quote:
Originally posted by Celko

quote:
Originally posted by Peso

The idea is to update ONLY the records who needs to be updated.
Why update all 100 million records [sic: rows] when there are only 10 or 20 records [sic: rows] to update?

Because in the old days, we never had big tables

I am not sure how SQL server will treat this, but you could also write Standard SQL to save some effort:

UPDATE Foobar
SET seq = ROW_NUMBER() OVER (ORDER BY seq ASC)
WHERE seq <> ROW_NUMBER() OVER (ORDER BY seq ASC);

Now that looks like fun! I've gotta try that. I agree with Peter that will update many rows unncessarily if the goal is only reuse of numbers but your code will also guarantee no gaps.

I also have to ask you, especially since your so anti-mag-tape-technology... Why do this? Isn't the reuse of such identifier columns against what an RDBMS is supposed to be all about?

--Jeff Moden

"Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER! "
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row"

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-11-27 : 16:20:54

Yes it is, but the question was brought up from another MVP which had a table for which they were quickly running out of numbers.
Changing to BIGINT was not an option. Neither was changing the seed to -2147483648 due to the clustered index on the identity column.

N 56°04'39.26"
E 12°55'05.63"

Jeff Moden
Aged Yak Warrior

652 Posts

Posted - 2010-11-27 : 18:29:49

In that case, why move data at all? Why not just look for the gaps an insert new rows there. That can also be done quite efficiently and you can use the Tally table to quickly expand the endpoints of gaps to a range of insert-values.

--Jeff Moden

"Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER! "
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row"

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources