Regular Expressions in T-SQL

By Guest Authors on 24 June 2003 | Tags: Queries


This article comes to us from Cory Koski. Cory writes "I recently had the problem of trying to search for a regular expression in a database field. There is no version of SQL Server that supports regular expressions natively, but I discovered a way to add all sorts of regular expression goodness to your T-SQL applications. In order to use regular expressions easily, we can use a User Defined Function to help us out and abstract the work for us."

For this solution we need SQL Server 2000 or higher. Also we need to make sure we have the VBScript.RegExp library on our computer. This should come with most Windows 2000 servers, in the Windows Scripting package. If you are using this on an older version of Windows, you will probably have to download the latest version of Windows Scripting for your OS.

The UDF

Here is the UDF that I wrote to search for a regular pattern expression in a source string:

CREATE FUNCTION dbo.find_regular_expression
	(
		@source varchar(5000),
		@regexp varchar(1000),
		@ignorecase bit = 0
	)
RETURNS bit
AS
	BEGIN
		DECLARE @hr integer
		DECLARE @objRegExp integer
		DECLARE @objMatches integer
		DECLARE @objMatch integer
		DECLARE @count integer
		DECLARE @results bit
		
		EXEC @hr = sp_OACreate 'VBScript.RegExp', @objRegExp OUTPUT
		IF @hr <> 0 BEGIN
			SET @results = 0
			RETURN @results
		END
		EXEC @hr = sp_OASetProperty @objRegExp, 'Pattern', @regexp
		IF @hr <> 0 BEGIN
			SET @results = 0
			RETURN @results
		END
		EXEC @hr = sp_OASetProperty @objRegExp, 'Global', false
		IF @hr <> 0 BEGIN
			SET @results = 0
			RETURN @results
		END
		EXEC @hr = sp_OASetProperty @objRegExp, 'IgnoreCase', @ignorecase
		IF @hr <> 0 BEGIN
			SET @results = 0
			RETURN @results
		END	
		EXEC @hr = sp_OAMethod @objRegExp, 'Test', @results OUTPUT, @source
		IF @hr <> 0 BEGIN
			SET @results = 0
			RETURN @results
		END
		EXEC @hr = sp_OADestroy @objRegExp
		IF @hr <> 0 BEGIN
			SET @results = 0
			RETURN @results
		END
	RETURN @results
	END

Save this UDF into your database, and ensure that the permissions are set so it can be executed. Of course you will also need to ensure that people will have the permissions to execute the sp_OAxxxxx family of extended stored procedures for this to work.

This particular function has been used with no wrinkles and it seems to be a very snappy performer, even with the use of the COM object.

Example

One way to use regular expressions is to test for special characters. Instead of searching for all the special characters that exist, we’ll look for only matches of normal characters, like letters and spaces. Let’s see this in action:

DECLARE @intLength AS INTEGER
DECLARE @vchRegularExpression AS VARCHAR(50)
DECLARE @vchSourceString as VARCHAR(50)
DECLARE @vchSourceString2 as VARCHAR(50)
DECLARE @bitHasNoSpecialCharacters as BIT

-- Initialize variables
SET @vchSourceString = 'Test one This is a test!!'
SET @vchSourceString2 = 'Test two This is a test'

-- Our regular expression should read as:
-- [a-zA-Z ]{}
-- eg. [a-zA-Z ]{10}  ...  For a string of 10 characters

-- Get the length of the string
SET @intLength = LEN(@vchSourceString)

-- Set the completed regular expression
SET @vchRegularExpression = '[a-zA-Z ]{' + 
CAST(@intLength as varchar) + '}'

-- get whether or not there are any special characters
SET @bitHasNoSpecialCharacters = dbo.find_regular_expression(
@vchSourceString, @vchRegularExpression,0)

PRINT @vchSourceString
IF @bitHasNoSpecialCharacters = 1 BEGIN
	PRINT 'No special characters.'
END ELSE BEGIN
	PRINT 'Special characters found.'
END

PRINT '---'

-- Get the length of the string
SET @intLength = LEN(@vchSourceString2)

-- Set the completed regular expression
SET @vchRegularExpression = '[a-zA-Z ]{' + 
CAST(@intLength as varchar) + '}'

-- get whether or not there are any special characters
SET @bitHasNoSpecialCharacters = dbo.find_regular_expression(
@vchSourceString2, @vchRegularExpression,0)

PRINT @vchSourceString2
IF @bitHasNoSpecialCharacters = 1 BEGIN
	PRINT 'No special characters.'
END ELSE BEGIN
	PRINT 'Special characters found.'
END

GO

The results for this example would be:

Test one This is a test!!
Special characters found.
---
Test two This is a test
No special characters.

Conclusion

As you can see, this is a very simple technique to get a very powerful result in certain situations. You as a T-SQL developer can take and extend this technique to other methods in the regular expression library VBScript.RegExp.


Cory Koski is a full-time web developer living in Toronto, Ontario, for an international travel insurance B2B company since June 2001. Born and raised in Thunder Bay, Ontario, he has been working as a web developer building ASP-based solutions since 1997. In the last 4 years, Cory has worked extensively with SQL Server, and now currently builds solutions with ASP.NET and SQL Server 2000. He is also a part-time consultant and is also an academic speaker on the topic of web development practices.


Related Articles

Using Dynamic SQL in Stored Procedures (7 March 2011)

Joining to the Next Sequential Row (2 April 2008)

Writing Outer Joins in T-SQL (11 February 2008)

Aggregating Correlated Sub-Queries (23 October 2007)

How to Use GROUP BY with Distinct Aggregates and Derived tables (31 July 2007)

How to Use GROUP BY in SQL Server (30 July 2007)

Returning Complex Data from User-Defined Functions with CROSS APPLY (11 June 2007)

Returning a week number for any given date and starting fiscal month (2 May 2007)

Other Recent Forum Posts

Compare alpha results to INT after get values from a string (1d)

Query performance Call Center data (2d)

Looking for on Premises tool to read data from SQL logs and populate data warehouse (3d)

Possible SQL 2014 to 2016 Issue - Some Application Functions Slow/Failing (3d)

Working with multiple WHERE statements (3d)

Create a new field value that shows a difference in value from 2 fields (4d)

Hierarchy wise Sales Targets (5d)

Get the MaxDate in results (7d)

- Advertisement -