Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Delete non alpha numeric characters from field

Author  Topic 

Tigger
Yak Posting Veteran

85 Posts

Posted - 2002-01-21 : 23:56:51
I need to remove all characters which are not 0-9 or A-Z or [space]
from 6 fields on a table of 1 million records.

I was hoping to write a function to do this (I'm using SQL2000) but I can't see how to do it without

a) looping through each field one character at a time
OR
b) using lots of nested replace statements

Any ideas? My brain just will not work today

fisherman_jake
Slave to the Almighty Yak

159 Posts

Posted - 2002-01-22 : 01:42:34
Tigger, mate, this is a hhuge job you have taken.. I have a couple of ideas, but the most sensible one I could think of is:

 

Create Procedure usp_DeleteCharFromTable
@CharToDelete char(1)
As
Declare @SQLStr varchar(300)
-- You can alter the code so you can pass in the TableName
-- and look for the columns in the Information Schema column view
-- see BOL - Information Schema,
-- and build your dynamic sql string accordingly, to make
-- the procedure more generic.

Set @SQLStr = ' Update TableName '
Set @SQLStr = @SQLStr + ' Set Col1 = Replace(Col1, ''' + @CharToDelete + ''', ''''), '
...
Set @SQLStr = @SQLStr + ' Col6 = Replace(Col6, ''' + @CharToDelete + ''', '')'
Set @SQLStr = @SQLStr + ' Where Col1 like ''%' + @CharToDelete + '%'''
Set @SQLStr = @SQLStr + ' OR Col2 like ''%' + @CharToDelete + '%'''
...
Set @SQLStr = @SQLStr + ' OR Col6 like ''%' + @CharToDelete + '%'''
Exec (@SQLStr)

return (0)

-- Call from another SP
Create Procedure usp_DeleteInvalidChar
As
Exec usp_DeleteCharFromTable '@'
Exec usp_DeleteCharFromTable '#'
Exec usp_DeleteCharFromTable '!'
...
return (0)



So It's not all nested loops, but then again it's a lot of copy and paste!! hehehehe.. Good luck!! toast]

==================================================
World War III is imminent, you know what that means... No Bag limits!!!
Master Fisherman
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2002-01-22 : 06:17:38
Can't say I'm keen on Jake's solution: that's 32 updates of a 1M row table at the least!
While this function might not be very fast, it should cut things down to one pass through the table.

CREATE FUNCTION RemoveChars (@Str varchar(8000), @DelPat varchar(8000))
RETURNS varchar(8000)
AS
BEGIN
DECLARE @Res varchar(8000), @i int
SET @DelPat = '%[' + @DelPat + ']%'
SET @Res = ''
SET @i = PATINDEX(@DelPat, @Str)
WHILE @i <> 0
BEGIN
SET @Res = @Res + SUBSTRING(@Str, 1, @i - 1)
SET @Str = SUBSTRING(@Str, @i + 1, 8000)
SET @i = PATINDEX(@DelPat, @Str)
END
SET @Res = @Res + @Str
RETURN @Res
END


UPDATE TableName SET
Col1 = RemoveChars(Col1, '^ A-Z0-9'),
Col2 = RemoveChars(Col2, '^ A-Z0-9'),
Col3 = RemoveChars(Col3, '^ A-Z0-9'),
Col4 = RemoveChars(Col4, '^ A-Z0-9'),
Col5 = RemoveChars(Col5, '^ A-Z0-9'),
Col6 = RemoveChars(Col6, '^ A-Z0-9')



Go to Top of Page

Tigger
Yak Posting Veteran

85 Posts

Posted - 2002-01-22 : 19:02:33
Thanks for the help guys.

I used Arnold's solution in the end - the whole job ran in just over 6 minutes so cheers ...



It's great to have somewhere that you can ask a question just before finishing work and come in the next day to a solution !

Go to Top of Page

Tigger
Yak Posting Veteran

85 Posts

Posted - 2002-01-22 : 20:27:27
Quick question...

How can I change
Col1 = RemoveChars(Col1, '^ A-Z0-9')

to include single quote as a valid character?

I've tried putting [] or "" or '' around it but they don't work.

It's probably sooo simple...

Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-01-22 : 20:34:17
Just use two sequential single quotes: ''aaa'

To specify a single quote, and nothing else, you need to use: ''''

Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2002-01-23 : 04:40:40
Sorry, I should have noted how the @DelPat parameter of RemoveChars works. I started off writing a function that would replace entire patterns as found by PATINDEX. Then I realized that there's no way of finding the length of the match so it wouldn't have worked. Instead, @DelPat is the character range part of a PATINDEX / LIKE pattern (within the square brackets), so the initial ^ is special, saying keep the following characters. So the answer to the question is RemoveChars(Col1, '^'' A-Z0-9')


Go to Top of Page
   

- Advertisement -