Author |
Topic |
AskSQLTeam
Ask SQLTeam Question
0 Posts |
Posted - 2010-02-09 : 08:50:18
|
This article will demonstrate an efficient way to reuse gaps in an identity column. Please note that this is something you normally shouldn't be bothered about in a well-designed database or application. However, there are circumstances where you are forced to do this.Read Efficiently Reuse Gaps in an Identity Column |
|
mjda
Starting Member
6 Posts |
Posted - 2010-02-09 : 11:43:58
|
Nice article. I usually avoid running out of identity range by doing something like:CREATE TABLE dbo.Yaks(YakID bigint identity(-9223372036854775808,1),YakName char(20)):) |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-02-09 : 17:28:04
|
Yes, and using 8 bytes of storage instead of 4.For 200 million records, that is a saving of 763MB in data storage, and another 763mb index storage. 763 mb storage space is 100 000 less pages to store, seek and scan. N 56°04'39.26"E 12°55'05.63" |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2010-02-09 : 19:48:50
|
quote: Originally posted by mjda Nice article. I usually avoid running out of identity range by doing something like:CREATE TABLE dbo.Yaks(YakID bigint identity(-9223372036854775808,1),YakName char(20)):)
Is starting a bigint identity with a negative number seed really necessary in an practical application?Using just the positive numbers, a table with only a bigint identify column and no other columns would use 67,108,864 TB of storage before you ran out of numbers. And it would take over 290 years to load that table at a rate of 1 billion rows per second.CODO ERGO SUM |
|
|
ya3mro
Starting Member
37 Posts |
|
Celko
Starting Member
23 Posts |
Posted - 2010-07-29 : 12:16:37
|
Since I don't use IDENTITY or other metadata in my DDL, this is not a problem for me javascript:insertsmilie('') . However, when we wanted to find gaps in a sequence for re-use, the old trick was:CREATE VIEW Vacancies(vacant_seq)AS SELECT vacant_seq FROM (SELECT seq +1 FROM Foobar WHERE seq +1 NOT IN (SELECT seq FROM Foobar) AND seq+1 <= <<upper limit of seq>> UNION SELECT seq -1 FROM Foobar WHERE seq -1 NOT IN (SELECT seq FROM Foobar) AND seq-1 >= 1;It does not get all of the open slots, but it was good enough when you needed a quick answer for a few openings in a relatively small table. The other trick was SELECT (<<upper limit of seq>> - COUNT(seq)) AS vacancy_cnt FROM Foobar; Today, we would probably use ROW_NUMBER() to update the seq and put all of the vacancies on the high end of the table:UPDATE Foobar SET seq = ROW_NUMBER() OVER (ORDER BY seq ASC);Author ofSQL for Smarties Thinking in SetsSQL Puzzles and AnswersAnalytics and OLAP in SQLSQL Programming Style Standards in SQLData & DatabasesTrees & Hierarchies in SQL |
|
|
X002548
Not Just a Number
15586 Posts |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-07-30 : 00:54:30
|
The idea is to update ONLY the records who needs to be updated.Why update all 100 million records when there are only 10 or 20 records to update? N 56°04'39.26"E 12°55'05.63" |
|
|
Celko
Starting Member
23 Posts |
Posted - 2010-07-30 : 18:04:34
|
quote: Originally posted by Peso The idea is to update ONLY the records who needs to be updated.Why update all 100 million records [sic: rows] when there are only 10 or 20 records [sic: rows] to update?
Because in the old days, we never had big tables I am not sure how SQL server will treat this, but you could also write Standard SQL to save some effort:UPDATE Foobar SET seq = ROW_NUMBER() OVER (ORDER BY seq ASC) WHERE seq <> ROW_NUMBER() OVER (ORDER BY seq ASC);Author ofSQL for Smarties Thinking in SetsSQL Puzzles and AnswersAnalytics and OLAP in SQLSQL Programming Style Standards in SQLData & DatabasesTrees & Hierarchies in SQL |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-07-30 : 18:44:26
|
Create a table with 9 rows, numbered 1 to 9. Now remove row 2.Seq ROW_NUMBER1 13 24 35 46 57 68 79 8 It still means you are updating 89% of the table.With my suggestion you only update 1 record (Seq 9 becomes Seq 2). N 56°04'39.26"E 12°55'05.63" |
|
|
Jeff Moden
Aged Yak Warrior
652 Posts |
Posted - 2010-11-26 : 22:24:32
|
Peter. This isn't an attack but I have to ask, why would you really want to reuse IDENTITY values or their equivalent? Why even bother having such a column if it means so little that you can reuse and reassign the numbers? Like I said, not an attack... just an insane amount of curiosity.--Jeff Moden "Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER! ""RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row" |
|
|
Jeff Moden
Aged Yak Warrior
652 Posts |
Posted - 2010-11-26 : 22:31:35
|
quote: Originally posted by Celko
quote: Originally posted by Peso The idea is to update ONLY the records who needs to be updated.Why update all 100 million records [sic: rows] when there are only 10 or 20 records [sic: rows] to update?
Because in the old days, we never had big tables I am not sure how SQL server will treat this, but you could also write Standard SQL to save some effort:UPDATE Foobar SET seq = ROW_NUMBER() OVER (ORDER BY seq ASC) WHERE seq <> ROW_NUMBER() OVER (ORDER BY seq ASC);
Now that looks like fun! I've gotta try that. I agree with Peter that will update many rows unncessarily if the goal is only reuse of numbers but your code will also guarantee no gaps.I also have to ask you, especially since your so anti-mag-tape-technology... Why do this? Isn't the reuse of such identifier columns against what an RDBMS is supposed to be all about?--Jeff Moden "Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER! ""RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row" |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-11-27 : 16:20:54
|
Yes it is, but the question was brought up from another MVP which had a table for which they were quickly running out of numbers. Changing to BIGINT was not an option. Neither was changing the seed to -2147483648 due to the clustered index on the identity column. N 56°04'39.26"E 12°55'05.63" |
|
|
Jeff Moden
Aged Yak Warrior
652 Posts |
Posted - 2010-11-27 : 18:29:49
|
In that case, why move data at all? Why not just look for the gaps an insert new rows there. That can also be done quite efficiently and you can use the Tally table to quickly expand the endpoints of gaps to a range of insert-values.--Jeff Moden "Your lack of planning DOES constitute an emergency on my part... SO PLAN BETTER! ""RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row" |
|
|
|