Error handling in long-running batch jobs

By Guest Authors on 29 January 2001 | Tags: Application Design , DMO

Andy Pope, who cleverly uses the name andypope in the forums, posed a question on handling errors in long running batch jobs. Basically Andy wanted a way to store job log entries in a SQL Server log table that didn't get rolled back if the enclosing transaction failed. I really wasn't much help giving him an answer so the good Mr. Pope went and figured it own on his own. And boy do I like his solution! He posted his solution in the forums but I thought it deserved a little more explanation and attention.

You can find the original question and the discussion that followed here. Andy's solution involved using SQL Server's Distributed Management Objects (SQL-DMO) to create another connection to SQL Server from inside the transaction. Books Online describes SQL-DMO as a dual interface COM, in-process server implemented as a dynamic-link library (DLL). When creating a SQL-DMO application, you can use any OLE Automation controller or COM client development platform using C or C++. Andy was clever enough to figure out you could also use the sp_OA series of stored procedures to connect to SQL-DMO.

We'll start with a log table that looks like this:

CREATE TABLE [dbo].[BatchLog] (
[EntryID] [int] IDENTITY (1, 1) NOT NULL ,
[JobID] [int] NULL ,
[Message] [varchar] (255) NULL )

and a stored procedure that looks like this. (At the bottom of this article is a link to a SQL Script with all this code.)

CREATE PROCEDURE  spi_BatchLog (@JobID int, @Message varchar(255) ) AS


INSERT BatchLog (JobID, Message)
VALUES (@JobID, @Message)


Pretty basic stuff so far. Now what we want to do is going to look something like this (written in my weak excuse for SQL Server psuedo-code):

  1. Log a message
  2. Begin Tran
  3. Execute Task A
  4. Log a message
  5. Rollback Task A while keeping the messages
  6. Log a rollback message

I'm going to show you a couple of highlights from the script. Please download the full script and run it to see how it functions. First we log a message about starting the job, begin a transaction and put in a transaction to roll back (steps 1-3 above).

EXEC spi_BatchLog @JobID, 'Starting Batch'
BEGIN TRAN   -- This is never committed, it's just a test
-- Insert another message - this one will be rolled back
-- (This is just to give is something to rollback in our test proc)
--  This is the Execute Task A from above
EXEC spi_BatchLog @JobID, 'Roll Me Back'  -- Task A

The next step is connect to SQL-DMO. We do that using the sp_OACreate stored procedure:

EXEC @Error = sp_OACreate 'SQLDMO.SQLServer', @oSQLServer OUT

The variable @oSQLServer is our pointer for future references to this object. In the script you'll see a couple of housekeeping statements at this point. The next major piece of code in the script calls the stored procedure to record a log entry. Basically we are calling the stored procedure through the SQL-DMO connection which is outside the scope of our declared transaction (step 4 above).

SET @SQL = 'EXECUTE spi_BatchLog ' + 
  CONVERT(CHAR(12),@JobID) + ', ' + '''Message from Inside Transaction'''

-- Execute the SQL (this will NOT be rolled back)
-- We could use ExecuteImmediate here, but it doesn't return a result
--  and we found while debugging that you really NEED that
SET @Length = LEN(@SQL)
SET @ErrorMsg = '' -- Must initialize the @ErrorMsg string
EXEC @Error = sp_OAMethod @oDatabase,  'ExecuteWithResultsAndMessages', 
  @oQueryResults OUTPUT, @Length = @Length, 
  @Command = @SQL , @Messages = @ErrorMsg OUTPUT
IF @Error <> 0
 GOTO OA_Error

That series of statements called the spi_BatchLog stored procedure outside the scope of the transaction. You'll see that entry in the log file even thought there is a ROLLBACK TRAN issued next.

SET @Error = 50001
ROLLBACK TRAN  --step 5 above

-- Get the message description
 @ErrorMsg = 'Critical Error: ' + convert(varchar(10), @Error)

-- Write the error to the log (step 6 above)
EXEC spi_BatchLog @JobID, @ErrorMsg

That completes the script. As a last step we'll SELECT back the contents of the BatchLog table and see this:

EntryID     JobID       Message
----------- ----------- ---------------------------------- 
1           10          Starting Batch
3           10          Message from Inside Transaction
4           10          Critical Error: 50001

(3 row(s) affected)

Notice that our "Message from Inside Transaction" entry is present. Also notice that our "Roll Me Back" entry is not in the table. It would have been EntryID 2 which is missing from our result set.

Thanks Andy for this great solution. I'm actually working on including it in a 10 million row update that I'm writing write now. You can download the complete script here. If you have any further discussion, please post them in the original thread.

Related Articles

Application Locks (or Mutexes) in SQL Server 2005 (7 January 2008)

What I Wish Developers Knew About SQL Server (Presentation) (11 October 2007)

Multiple Active Result Sets (MARS) – Transactions and Debugging (17 June 2007)

Multiple Active Result Sets (MARS) (3 April 2007)

How SQL Server 2005 Enables Service-Oriented Database Architectures (8 September 2006)

Presentation: What I Wish Developers Knew About SQL Server (17 November 2005)

GeoCoding with SQL Server and Google (8 August 2005)

How to Asynchronously Execute a DTS package from ASP or ASP.NET (27 March 2005)

Other Recent Forum Posts

Oracle Database Client for Microsoft Tools (11h)

Summarize the proportions stored in the table (1d)

SQL SSMS 20 Grid Not Updating? (1d)

Problem With Joining 4 Tables With 1 Being A Reference Table (1d)

Reporting Using User Selection Dropdowns (2d)

Sum Nested Join Help (2d)

How many records for group and how many records for group members? (3d)

Group by issues (6d)

- Advertisement -