Rafferty Uy
Starting Member
23 Posts |
Posted - 2004-12-14 : 01:46:35
Hi there,

Is there a faster way to delete 1 million rows? The stored proc I'm currently using is just a simple DELETE FROM statement, but deleting a million rows takes a lot of time. I'm considering other options on how to make this faster. One suggestion I had was to use indexing. What are your thoughts? And do you have other suggestions? :)

Thanks so much!

Rafferty
jen
Master Smack Fu Yak Hacker
4110 Posts |
Posted - 2004-12-14 : 02:51:03
Rename the old table, create a new table with the same schema and indexes, then drop the old table. Remember to apply the same permissions on the new table.

--------------------
keeping it simple...
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-12-14 : 03:40:32
You can also do TRUNCATE TABLE. Read about it in BOL.

Go with the flow & have fun! Else fight the flow
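For reference, a minimal example of the suggestion above (the table name is a placeholder); note that TRUNCATE only fits the empty-the-whole-table case:

```sql
-- Removes ALL rows; minimally logged, so much faster than DELETE,
-- but it cannot take a WHERE clause and fails if foreign keys
-- reference the table.
TRUNCATE TABLE MyTable
```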
MichaelP
Jedi Yak
2489 Posts |
Posted - 2004-12-14 : 06:50:59
Do you have your database recovery model set to Simple? I believe that helps speed up deletes as well.

Do you need to delete every row from the table, or just 1 million out of several million? For deleting all of the rows, TRUNCATE TABLE is the way to go, like Spirit1 said.

Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>
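A quick way to check and change the recovery model (the database name is a placeholder; syntax is SQL Server 2000 era):

```sql
-- Check the current recovery model:
SELECT DATABASEPROPERTYEX('MyDatabase', 'Recovery')

-- Switch to Simple recovery (less log growth, but no point-in-time restore):
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE
```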
AjarnMark
SQL Slashing Gunting Master
3246 Posts |
Posted - 2004-12-14 : 12:30:36
Adding an index as you're thinking may or may not help. Assuming you are not emptying the entire table, an index may speed up the filtering for the records that are to be deleted; however, the index itself must be maintained, and that will add time to the whole process. You'd have to do some performance testing to know for sure whether the index would speed things up or slow them down. And of course the index also affects the performance of any other data modifications from the rest of your system.

-----------------------------------------
Professional IT Solutions from Infoneering
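A sketch of the indexing idea (table and column names are hypothetical; the actual filter column would come from the DELETE's WHERE clause):

```sql
-- An index on the filter column lets SQL Server locate the rows to
-- delete without scanning the whole table, at the cost of extra
-- maintenance work on every insert, update, and delete.
CREATE INDEX IX_MyTable_Modified_Date ON MyTable (Modified_Date)
```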
Rafferty Uy
Starting Member
23 Posts |
Posted - 2004-12-14 : 19:41:07
I'll be deleting a million records out of several million, so truncate won't work. How do I know if the database recovery model is set to Simple?

By the way, I don't think renaming the old table and creating a new one with the data I need is good, because it will require extra space, and I heard there are some update issues that need to be handled with the stored procedures of that table.

I had another suggestion of using BEGIN TRAN and COMMIT TRAN to delete rows in chunks:

```sql
SET ROWCOUNT 1000

DECLARE @Nrows int
SET @Nrows = 1

WHILE @Nrows > 0
BEGIN
    BEGIN TRAN MyDeletes
        DELETE TableName WHERE <Your Condition Here>
        SET @Nrows = @@ROWCOUNT
    COMMIT TRAN MyDeletes
END

SET ROWCOUNT 0  -- reset so later statements aren't limited
```

Do you think this'll work?

-Rafferty
derrickleggett
Pointy Haired Yak DBA
4184 Posts |
Posted - 2004-12-14 : 19:49:26
The overall process probably won't be any faster with that. It will just be less disruptive. Are users selecting from this recordset while you're trying to delete from it? Why don't you post the delete statement?

MeanOldDBA
derrickleggett@hotmail.com
When life gives you a lemon, fire the DBA.
jen
Master Smack Fu Yak Hacker
4110 Posts |
Posted - 2004-12-14 : 21:38:06
If the table is in production and you want to delete, you can:

1. Use DTS/bcp to export the needed data; once the data has been transferred, rename the tables, or
2. Use Simple recovery and then do the delete in chunks.

Either way, I expect you'll be saving the deleted items through backups, and you'll need to use extra space. You'll also be able to release that extra space once you're done with what you're doing.

--------------------
keeping it simple...
Rafferty Uy
Starting Member
23 Posts |
Posted - 2004-12-15 : 03:24:06
My delete statement is just a simple:

```sql
DELETE FROM MyTable WHERE Modified_Date = @ModifiedDate
```

But this takes a while, as we're expecting about a million records to be deleted each time. Deletes only take place about once a month, though, so I think the effect on the users will be very minimal. (I won't be able to confirm this until Friday.)

Will I have to recompile the stored procedures using sp_recompile when I rename -> recreate -> delete the old table? And I'll need to do some sp_updatestats and DBCC UPDATEUSAGE (0) WITH COUNT_ROWS, right? I wonder if this will still be faster...

-Rafferty
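The maintenance steps mentioned above might look like this (the table name is a placeholder):

```sql
-- Mark stored procedures that reference the table for recompilation
-- the next time they run:
EXEC sp_recompile 'MyTable'

-- Refresh statistics for all tables in the current database:
EXEC sp_updatestats

-- Correct row and page counts in the current database:
DBCC UPDATEUSAGE (0) WITH COUNT_ROWS
```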
jen
Master Smack Fu Yak Hacker
4110 Posts |
Posted - 2004-12-15 : 03:27:19
We're doing some archiving (backup then delete) and there's no problem; I just schedule this when activity is low.

HTH

--------------------
keeping it simple...
Rafferty Uy
Starting Member
23 Posts |
Posted - 2004-12-15 : 04:09:52
Ahh! That's great! So the plan is to do something like:

```sql
-- 1. Copy the rows to keep into a new table
INSERT INTO TempTable ( Attrib1, Attrib2, AttribOthers )
SELECT Attrib1, Attrib2, AttribOthers
FROM MyTable
WHERE Modified_Date <> @ModifiedDate

-- 2. Drop the original
DROP TABLE MyTable

-- 3. ALTER TABLE TempTable ...
```

Uh... how do you rename a table?

-Rafferty
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-12-15 : 04:21:07
```sql
EXEC sp_rename 'MyTable', 'MyTableNew'
```

Go with the flow & have fun! Else fight the flow
Rafferty Uy
Starting Member
23 Posts |
Posted - 2004-12-15 : 18:51:26
Thanks. So is this plan good?

```sql
-- 1. Copy the rows to keep into a new table
INSERT INTO TempTable ( Attrib1, Attrib2, AttribOthers )
SELECT Attrib1, Attrib2, AttribOthers
FROM MyTable
WHERE Modified_Date <> @ModifiedDate

-- 2. Drop the original
DROP TABLE MyTable

-- 3. Rename the new table into place
EXEC sp_rename 'TempTable', 'MyTable'
```

I'm particularly concerned with the INSERTs that have to be done... This is only good assuming there are more rows to be deleted than rows to be retained, right? So it's possible this will be even slower than the original DELETE statement.

-Rafferty
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2004-12-15 : 19:19:50
Take a look at these:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=42710
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=41634

You should consider the bcp route. Done properly it will be faster and take fewer resources than any other method. To summarize:

1. bcp out the data you wish to KEEP, using a query. Use native format if possible (-n switch).
2. Drop non-clustered indexes on the table. Keep the clustered index if one exists.
3. Truncate the table.
4. bcp the data back in, using the TABLOCK hint, and also ORDER if the data is clustered.
5. Add the non-clustered indexes back.

Make sure that your database is set to Bulk-Logged recovery and/or set the Select Into/Bulkcopy setting to ON.
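A rough sketch of those steps (server, database, index, and file names are all placeholders; the bcp commands run from the OS command prompt, the rest in Query Analyzer):

```sql
-- 1. bcp out the rows to KEEP, in native format (OS command prompt):
--    bcp "SELECT * FROM MyDB.dbo.MyTable WHERE Modified_Date <> '20041130'"
--        queryout keep.dat -n -S MyServer -T

-- 2. Drop non-clustered indexes (keep the clustered index, if any):
DROP INDEX MyTable.IX_MyTable_Modified_Date

-- 3. Empty the table:
TRUNCATE TABLE MyTable

-- 4. bcp the kept rows back in with TABLOCK (OS command prompt):
--    bcp MyDB.dbo.MyTable in keep.dat -n -S MyServer -T -h "TABLOCK"

-- 5. Recreate the non-clustered indexes:
CREATE INDEX IX_MyTable_Modified_Date ON MyTable (Modified_Date)
```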
Rafferty Uy
Starting Member
23 Posts |
Posted - 2004-12-16 : 21:51:48
Hello there, sorry for the late reply; I had to wait and ask for more details on this task I'm doing. Here's what I need:

I have one table and one view. The table needs to be able to provide data to users 100% of the time (therefore deleting or truncating everything won't work). New data is inserted into the table MONTHLY. When new data arrives, what happens is that we insert the new data, SWITCH the VIEW (to show the new data only), and then delete the old data.

Currently, the best option I have is still indexing. I'm not knowledgeable about bcp; can I use bcp and still have the table provide data 100% of the time? Or maybe there are other methods?

Thanks so much!

-Rafferty
derrickleggett
Pointy Haired Yak DBA
4184 Posts |
Posted - 2004-12-16 : 22:01:40
Are you using Enterprise Edition? If you are, this would be a perfect use for a partitioned view.

MeanOldDBA
derrickleggett@hotmail.com
When life gives you a lemon, fire the DBA.
Rafferty Uy
Starting Member
23 Posts |
Posted - 2004-12-16 : 22:59:26
Yes, we are! Tell me more about this partitioned view...

-Rafferty
derrickleggett
Pointy Haired Yak DBA
4184 Posts |
Posted - 2004-12-16 : 23:27:01
You can create an indexed view over a set of tables. Create two data tables with an "index" column and the rest of the data, then a third table with just the "index" column and one record. Create an indexed view looking at the FULL JOIN of the tables. You can INSERT the new records with an "index" one number higher each time. After the records are inserted, change the number in the table with just the "index" column, and delete ALL records from the old table. This allows the delete process to remove every record in a table instead of part of a table. And you don't have to worry about interfering with the users while the delete/insert is occurring. Just a thought. :)

MeanOldDBA
derrickleggett@hotmail.com
When life gives you a lemon, fire the DBA.
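One reading of the scheme described above, sketched with hypothetical names (here as a plain view over a control table, rather than a formal Enterprise Edition partitioned view):

```sql
-- Two identical data tables plus a one-row control table holding
-- the "index" (generation) number currently visible to users:
CREATE TABLE DataA (Gen int NOT NULL, Attrib1 int, Attrib2 varchar(50))
CREATE TABLE DataB (Gen int NOT NULL, Attrib1 int, Attrib2 varchar(50))
CREATE TABLE CurrentGen (Gen int NOT NULL)
GO

-- The view exposes only rows whose generation matches the control row:
CREATE VIEW MyData AS
SELECT d.Attrib1, d.Attrib2 FROM DataA d JOIN CurrentGen c ON d.Gen = c.Gen
UNION ALL
SELECT d.Attrib1, d.Attrib2 FROM DataB d JOIN CurrentGen c ON d.Gen = c.Gen
GO

-- Monthly load: insert the new month into the idle table with the next
-- generation number, then flip the control row; readers switch instantly,
-- and the old table can be emptied afterwards without blocking anyone.
UPDATE CurrentGen SET Gen = Gen + 1
```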
Kristen
Test
22859 Posts |
Posted - 2004-12-17 : 02:37:23
"New data is inserted into the table MONTHLY"

So no data is added at other times?

You might be able to migrate data to a temporary table, append the new data, and then "rename tables" as an atomic transaction, such that data availability is uninterrupted. But partitioned views may be more attractive!

Something like:

```sql
CREATE TABLE MyTempTable ...

INSERT INTO MyTempTable
SELECT *
FROM MyMainTable
WHERE MyDate > @SomeCutoffDate

INSERT INTO MyTempTable
SELECT *
FROM MyStagingTableOfNewData

BEGIN TRANSACTION
DROP TABLE MyMainTable
EXEC sp_rename 'MyTempTable', 'MyMainTable'
COMMIT
```

Kristen
Rafferty Uy
Starting Member
23 Posts |
Posted - 2004-12-19 : 20:12:28
Yes, no data is added at other times. :) Partitioned views do look attractive, and so does dropping and renaming tables, or simply truncating the table for that matter. But I forgot to mention that there are foreign keys on the table, which makes things much more difficult.

In this case, is indexing the best option I have? Or is it easy (and efficient) to update the foreign keys if I truncate the table?

-Rafferty
jen
Master Smack Fu Yak Hacker
4110 Posts |
Posted - 2004-12-20 : 04:52:54
You can issue a cascade delete to handle the constraint.

--------------------
keeping it simple...
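A minimal sketch of that, assuming hypothetical child table and key names; SQL Server 2000 supports cascading actions declared on the foreign key itself:

```sql
-- Recreate the foreign key with ON DELETE CASCADE so that deleting
-- parent rows automatically removes the dependent child rows:
ALTER TABLE ChildTable DROP CONSTRAINT FK_ChildTable_MyTable

ALTER TABLE ChildTable
    ADD CONSTRAINT FK_ChildTable_MyTable
    FOREIGN KEY (MyTableID) REFERENCES MyTable (MyTableID)
    ON DELETE CASCADE
```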