Author |
Topic |
Kristen
Test
22859 Posts |
Posted - 2011-09-27 : 06:11:05
|
Application Event Log recorded an Error event:Attempt to fetch logical page (1:1634317) in database 8 failed. It belongs to allocation unit 19422104110694400 not to 72057594049134592.I presume "database 8" is the same asSELECT nameFROM master.sys.databasesWHERE database_id = 8 and if so thenDBCC CHECKDB WITH NO_INFOMSGS, EXTENDED_LOGICAL_CHECKS, DATA_PURITY shows no errors.Can't see any earlier errors in the event log. Bit reluctant to just ignore it though.We've been having the devils own time with the poxy Oracle drivers on this box (pulling data from a linked Oracle server). I can't seem to find any files under the \oracle root folder, or any file on a local drive named ora*.*, that has changed since the error was recorded in the error logs.When I try to run the scheduled task to query the Oracle server manually I'm frequently (but not all the time) gettingServer: Msg 7399, Level 16, State 1, Line 1The OLE DB provider "OraOLEDB.Oracle" for linked server "MY_ORACLE_NAME" reported an error. The provider reported an unexpected catastrophic failure.Server: Msg 7323, Level 16, State 1, Line 1An error occurred while submitting the query text to OLE DB provider "OraOLEDB.Oracle" for linked server "MY_ORACLE_NAME".That may be totally unrelated though. Its always been an issue (although I have no idea how often, if it was running 25% of the time I don't suppose anyone would notice, the process is fail-safe and resumes from the last successful attempt) |
|
Kristen
Test
22859 Posts |
Posted - 2011-09-27 : 06:23:02
|
I've run DBCC CHECKDB WITH NO_INFOMSGS, EXTENDED_LOGICAL_CHECKS, DATA_PURITYon all DBs (except TEMPDB) and no errors reportedThere are Exceptions logged for each scheduled task run during the night (but not for the ones I have run manually where I got an error)previously there were exceptions reported (coinciding with some of, but not all, the scheduled task runs) when the Oracle drive was playing up (I think that just resulted in a reboot, rather than installing newer version of Oracle drivers) |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-09-27 : 06:56:12
|
Let me guess, there's been an index rebuild/reorganise since that error?Nothing to do with oracle drivers, that's database corruption. Check hardware.--Gail ShawSQL Server MVP |
 |
|
Kristen
Test
22859 Posts |
Posted - 2011-09-27 : 07:16:30
|
I'll try to refrain from advising "Just rebuild your indexes, you'll be fine" next time I see a data corruption message here!Thanks for that, off to investigate.Hardware fault may be causing the Oracle drivers to be wobbly too, perhaps. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2011-09-27 : 08:25:48
|
No sign of index rebuild There is no scheduled task.MODIFY date on the index has not changedLatest Stats date has 3 statistics more recent that the error logged in Event Viewer, but there is no "wholesale" set of stats dates that would suggest a scheduled task had run.Could the duff page-fetch have been statistics (which have since been updated)?The IT guys are looking into the health of the server meantime. |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-09-27 : 08:58:25
|
No. Not statistics (they aren't data). Something happened to deallocate that page between the time that the message appeared and the time you did checkDB.Index rebuilds are NOT a fix for corruption. Since they deallocate pages, they can make a minor corruption disappear. Emphasis 'minor'.--Gail ShawSQL Server MVP |
 |
|
Kristen
Test
22859 Posts |
Posted - 2011-09-27 : 10:55:42
|
"Index rebuilds are NOT a fix for corruption"Hehehe ... yeah, I was kidding The tables in this database get quite a lot of updates - the database is used for staging. We don't bother with UPSERT we just do:DELETE TFROM Target AS T JOIN #Source AS S ON S.ID = T.ID thenINSERT INTO Target(Col1, ...)SELECT Col1, ...FROM #Source Could that have deallocated the broken page?Or should it have given me a Torn Page error? (I'll just go check that Torn Page IS turned on for this database) |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-09-27 : 12:00:09
|
It may well have. If a delete emptied a page then the page could have been deallocated.No, the error you got is not a torn page error. Why do you have torn page on anyway? Checksum is the default and recommended setting unless you're still on SQL 2000 (which, from the error message you're not)--Gail ShawSQL Server MVP |
 |
|
Kristen
Test
22859 Posts |
Posted - 2011-09-27 : 12:13:35
|
Sorry, mental blockage, I was thinking 2000-speak. We do indeed have CHECKSUM turned on for that database; its SQL 2008 (R1).Would SQL not have raised error when deallocating that page (when it found that the checksum was wrong)? Or is the error in Event Viewer the result of a CHECKSUM error raised by the page deallocation? (I was expecting my app. to be given a fatal error - although its possible it was and it hasn't manifested in any useful way through the APP) |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-09-27 : 13:10:58
|
But that's not an incorrect checksum error...It was a very fatal error though. It's severity 21 which means that the connection would have been terminated as a result.I can't go back in time and watch your app to see what happened. 'Disappearing corruptions' are a result of the page that's in error being deallocated or some other operation that modifies the affected structures.--Gail ShawSQL Server MVP |
 |
|
Kristen
Test
22859 Posts |
Posted - 2011-09-27 : 13:22:30
|
"But that's not an incorrect checksum error..."Sorry, I should have realised that.So if I understand it correctly SQL has written a page with a pointer to an incorrect page. There was physical checksum error on that page, so a checksum error was not raised. Fair enough. But nonetheless that page was written pointing to an incorrect location ?So the bits were NOT twiddled between SQL and the Disk - e.g. by a faulty disk controller - (otherwise a checksum error would have arisen?)So under what sort of hardware fault does this sort of scenario occur?I'm wondering whether the exception errors that the Oracle debacle triggered may have mucked things up enough to induce that?Apparently SQL is running on a virtual machine, the physical machine is home to all sorts of stuff. The IT people think that if there was a hardware error it would be likely to manifest itself all over the various virtual machine instances.(Its a machine owned by a client which we use for a staging database to gather data from Oracle and pass it on to our SQL box, so I don't have much knowledge of the client's infrastructure; but I am keen to get them to sort out a hardware fault if I can pin it down enough for them to do so) |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-09-27 : 14:28:00
|
quote: Originally posted by Kristen So if I understand it correctly SQL has written a page with a pointer to an incorrect page. There was physical checksum error on that page, so a checksum error was not raised. Fair enough. But nonetheless that page was written pointing to an incorrect location ?
No, it's an allocation unit mess up. Page owned by incorrect table/index, not an incorrect page number. Not sure if it's page header wrong or allocation structure wrong or both.--Gail ShawSQL Server MVP |
 |
|
Kristen
Test
22859 Posts |
Posted - 2011-09-27 : 14:38:57
|
Could it be something other than hardware fault do you think? (leaving aside remote possibility that I am the first person to stumble over a new bug in SQL Server) |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-09-27 : 14:43:33
|
I chatted with Paul. He said "Transient in-memory corruption from intermittently faulty memory chip"That won't cause a page checksum error, cause the checksum is calculated at the point the page goes to disk.--Gail ShawSQL Server MVP |
 |
|
Kristen
Test
22859 Posts |
Posted - 2011-09-27 : 19:52:42
|
Thanks very much Gail. I'll let you know what the IT folk find. The behaviour of the Oracle Driver is a bit strange, I've posted the details here:http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=165989 |
 |
|
|