Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
shishirkhandekar
Starting Member
23 Posts |
Posted - 2011-01-30 : 18:06:04
|
Just today we were testing a failover of our databases using the DoubleTake software. Once the primary server was switched off and the secondary brought up, one of our databases did not come online. The state_desc for the database was "Recovery Pending". When I tried to restore from a database backup the Doubletake had replicated, the header record showed something interesting.The backup name column (Name column in the RESTORE gui) was *** INCOMPLETE *** and all the other columns were empty.My feeling is that the backup file did not, for some reason, replicate correctly. Is there something more to this than what I am thinking?Also we received a Severity 24 alert that said:SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:32; actual 0:0). It occurred during a read of page (1:32) in database ID 9 at offset 0x00000000040000 in file <filename.mdf>. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.How do you recover from such scenarios or how do you fix them?Thanks for any insight.Shishir Khandekar |
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
shishirkhandekar
Starting Member
23 Posts |
Posted - 2011-01-30 : 20:11:17
|
Tara,Yes, DBCC CHECKDB ran without any errors on the source database. On the target, I could not do anything as it was in recovery pending and the only thing that was possible was to put it in EMERGENCY mode and then running DBCC. But since this was a failover test and we had some indications of incomplete replications, we did not try to recover.Shishir |
|
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-01-31 : 00:33:07
|
How you recover from scenarios like you describe is typically restore from a good backup. If the replica is damaged and the source isn't then either something went wrong in replication or there's something wrong with the IO subsystem on the secondary.--Gail ShawSQL Server MVP |
|
|
shishirkhandekar
Starting Member
23 Posts |
Posted - 2011-01-31 : 09:16:37
|
Thank you, Gail. We are still trying to figure out what went wrong and the suspects, as per the vendor support, are disk fragmentation and/or bad I/O driver causing memory leaks and therefore insufficient resources.So in short what I understand is that irrespective of how we reached this state, in case of "incorrect page ID" error, the only way out is:1) EMERGENCY MODE DBCC with REPAIR_ALLOW_DATA_LOSS 2) Recover from a good backup.If anyone has any more insight, it would be much appreciated.Thank youShishir Khandekar |
|
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-01-31 : 09:53:45
|
Your options should be the other way around. Restore from backup is by far the recommended. Emergency mode repair is a last resort.Disk fragmentation wouldn't cause this. Bad IO driver could, though probably not through memory leak, through incorrectly writing data to the disk or changing it later. SAN controller, SAN cache, HBA drivers, any IO filter drives, actual disk hardware, etc.--Gail ShawSQL Server MVP |
|
|
|
|
|
|
|