Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-11-30 : 21:49:13
|
I think I saw somewhere that there is a setting that controls how spaces are evaluated in T-SQL. In my SS2000 DB, they are evaluated as blank and this is probably the default. This is driving me nuts. The following code illustrates the point. To me '' is NOT ' ' and SS says it is. Also, SS says that a space has a length of 0. Please point me to the relevant reading. DECLARE @Space nvarchar(10)SET @Space = ' 'PRINT 'Len() of a space: '''+CAST(Len(@Space) AS varchar(20))+''' (ouch)'IF @Space = ' ' PRINT 'Space = space: True (as expected)'ELSE PRINT 'Space = space: False'GODECLARE @Space nvarchar(10)SET @Space = ''PRINT 'Len() of a blank: '''+CAST(Len(@Space) AS varchar(20))+''' (as expected)'IF @Space = ' ' PRINT 'Blank = space: True (ouch)'ELSE PRINT 'Blank = space: False'GO |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2005-11-30 : 23:34:00
|
When SQL Server compares two strings, it makes a comparison from left to right and treats the shorter expression as though it was padded with spaces to equal the length of the longer string. SQL Server does this with =, but with LIKE all characters are compared. If you convert the strings to varbinary, the = works the way you were expecting.LEN Returns the number of characters, rather than the number of bytes, of the given string expression, excluding trailing blanks.This code illustrates all of this:declare @Space nvarchar(10)declare @Space2 nvarchar(10)set @Space = ''set @Space2 = ' 'select [len] = len(@Space2),[datalength]=datalength(@Space2)if @Space = @Space2 print '@Space = @Space2'else print '@Space <> @Space2'if convert(varbinary(20),@Space) = convert(varbinary(20),@Space2) print 'Binary @Space = Binary @Space2'else print 'Binary @Space <> Binary @Space2'if @Space like @Space2 print '@Space Like @Space2'else print '@Space Not Like @Space2'len datalength ----------- ----------- 0 2(1 row(s) affected)@Space = @Space2Binary @Space <> Binary @Space2@Space Not Like @Space2 CODO ERGO SUM |
|
|
druer
Constraint Violating Yak Guru
314 Posts |
Posted - 2005-12-01 : 10:02:41
|
Per books on line: LEN Returns the number of characters, rather than the number of bytes, of the given string expression, excluding trailing blanks. DATALENGTH Returns the number of bytes used to represent any expression.I would imply that behind the scenes the LEN function does an RTRIM before it takes the length. And I'm not sure why they would automatically do that since it is exactly the opposite of what I would think a TSQL developer would want, but when you you return a result set and it makes no sense to have all of the trailing blanks coming across the network you have to trim them yourself. C'est la vie. |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-12-01 : 10:35:03
|
RTRIM = LEFT(MyColumn, LEN(MyColumn))Q.E.D. Kristen |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2005-12-01 : 10:35:09
|
quote: but with LIKE all characters are compared
It's not symmetrical: as BOM explains, LIKE treats trailing spaces in the pattern (RHS) as significant, but not in the match_expression (LHS).declare @Space nvarchar(10)declare @Space2 nvarchar(10)set @Space = ''set @Space2 = ' 'if @Space like @Space2 print '@Space Like @Space2'else print '@Space Not Like @Space2'if @Space2 like @Space print '@Space2 Like @Space'else print '@Space2 Not Like @Space'@Space Not Like @Space2@Space2 Like @Space |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-12-01 : 10:40:11
|
quote: Originally posted by Michael Valentine Jones When SQL Server compares two strings, it makes a comparison from left to right and treats the shorter expression as though it was padded with spaces to equal the length of the longer string.
All much clearer now - thank you! It sounds like LIKE is more like an equality in other programming languages and = is interpreted. So far I learned that '' = ' ' and NULL <> NULL. To make it even more interesting, there are sentences like this one in Books Online: "Unicode LIKE is compatible with the SQL-92 standard. ASCII LIKE is compatible with earlier versions of SQL Server." Makes me want to do all the work in the middle tier.The question remains: how do I find a correct number of characters in the case of one or more spaces? datalength() and len() of a binary value both return the number of bytes, not characters. Do I have to divide bytes by 2 for all unicode types? This must be a huge stumbling block for most novice SQL programmers.By the way, how can I retrieve the datatype of a table column? Is there a way to retrive the datatype of a variable? |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-12-01 : 10:47:30
|
quote: Originally posted by Kristen RTRIM = LEFT(MyColumn, LEN(MyColumn))
These behaviors (space equals blank, which is not a null, which doesn't equal null) make me wonder how people build robust queries, let alone more generic solutions. |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-12-01 : 10:52:26
|
quote: Originally posted by Arnold FribbleIt's not symmetrical: as BOM explains, LIKE treats trailing spaces in the pattern (RHS) as significant, but not in the match_expression (LHS).
I am getting symmetrical results in SS2000:@Space Not Like @Space2@Space2 Not Like @Space |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-12-01 : 11:00:33
|
"make me wonder how people build robust queries"They learn from experience, and then become consultants and charge Big Bucks "'' = ' ' and NULL <> NULL"That's too funny!FWIW we don't [in the main] store trailing spaces in our databases (and no CHAR columns, only VARCHAR)Kristen |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2005-12-01 : 11:04:50
|
quote: Originally posted by Marioi
quote: Originally posted by Kristen RTRIM = LEFT(MyColumn, LEN(MyColumn))
These behaviors (space equals blank, which is not a null, which doesn't equal null) make me wonder how people build robust queries, let alone more generic solutions.
The main thing is: why is a space important in your data? Typically, it is helpful for the database to ignore the case of the characters and/or trailing spaces in comparisons and joins. I suspect that the "trailing spaces don't matter" idea was implemented due to the existance of the CHAR() datatype, in which all strings are a fixed width and padded with spaces as needed.How about an example of some of the data you are working with, and why it is important that trailing spaces and such are recognized? If your data has some values of "XYZ" and some of "XYZ " in key columns in your tables and these need to be recognized as being different, I might suggest that maybe that is not the best database design. Also -- there *may* be a collation available that you can use that takes spaces into account during compares, but I am not 100% sure about this (check BOL). |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2005-12-01 : 11:07:00
|
Oops, should have tested that, shouldn't I?I rashly assumed that becauseSELECT CASE WHEN '' LIKE ' ' THEN 1 ELSE 0 ENDSELECT CASE WHEN ' ' LIKE '' THEN 1 ELSE 0 ENDgives you 0 and 1 respectively, it would work for the example.Hmm... now why is that different...Erk! It's the difference between nvarchar and varchar.SELECT CASE WHEN N'' LIKE N' ' THEN 1 ELSE 0 ENDSELECT CASE WHEN N' ' LIKE N'' THEN 1 ELSE 0 ENDreturns 0 and 0.Which is also documented in BOL |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-12-01 : 11:31:15
|
quote: Originally posted by jsmith8858The main thing is: why is a space important in your data?
I am writing sprocs and UDFs in which spaces are significant. An example is a UDF that splits a string based on a delimiter (I found several examples through a FAQ thread on this forum). Since the delimiter can be a space I have to deal with it. Or a UDF that counts words. |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-12-01 : 11:43:25
|
quote: Originally posted by Arnold Fribble Oops, should have tested that, shouldn't I?
I played with nchar and char and there the results are predictable: a blank string and a space are treated as 10 spaces or 20 bytes.But with nvarchar, why is len of varbinary returning 2 (the line in code isPrint 'len of @Space2 as varbinary: '''+CAST(Len(CAST(@Space2 AS varbinary(20))) AS varchar(20))+'''' and the line in results islen of @Space2 as varbinary: '2' (I figured it out - see my reply to this message.)The complete code is:declare @Space nchar(10)declare @Space2 nchar(10)set @Space = ''set @Space2 = ' 'Print 'EXAMPLE WITH nchar:'Print 'len of @Space2: '''+CAST(len(@Space2) AS varchar(20))+''''Print 'datalength of @space2: '''+CAST(datalength(@Space2) AS varchar(20))+''''Print '-------------------'Print 'len of @Space2 as varbinary: '''+CAST(Len(CAST(@Space2 AS varbinary(20))) AS varchar(20))+''''Print 'datalength of @Space2 as varbinary: '''+CAST(datalength(CAST(@Space2 AS varbinary(20))) AS varchar(20))+''''Print '-------------------'if @Space = @Space2 print '@Space = @Space2'else print '@Space <> @Space2'if CAST(@Space AS varbinary(20)) = CAST(@Space2 AS varbinary(20)) print 'Binary @Space = Binary @Space2'else print 'Binary @Space <> Binary @Space2'if @Space like @Space2 print '@Space Like @Space2'else print '@Space Not Like @Space2'PRINT '-------------------'-- Testing whether LIKE is symmetrical.if @Space like @Space2 print '@Space Like @Space2'else print '@Space Not Like @Space2'if @Space2 like @Space print '@Space2 Like @Space'else print '@Space2 Not Like @Space'GO I get:EXAMPLE WITH nchar:len of @Space2: '0'datalength of @space2: '20'-------------------len of @Space2 as varbinary: '20'datalength of @Space2 as varbinary: '20'-------------------@Space = @Space2Binary @Space = Binary @Space2@Space Like @Space2-------------------@Space Like @Space2@Space2 Like @Space EXAMPLE WITH nvarchar:len of @Space2: '0'datalength of @space2: '2'-------------------len of @Space2 as varbinary: '2'datalength of @Space2 as varbinary: '2'-------------------@Space = @Space2Binary @Space <> Binary @Space2@Space Not Like @Space2-------------------@Space Not Like @Space2@Space2 Not Like @Space |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-12-01 : 12:03:47
|
quote: Originally posted by MarioiBut with nvarchar, why is len of varbinary returning 2
I got it - because of the padding SS does in string comparisons of unequal lengths. This is a problem because I can't anticipate which side of the comparison the string will be shorter.My conclusions are:- Use the same type of string variable throughout (nvarchar in my case)- If you have to use char() and nchar(), RTrim it before you blink- Instead of = always use LIKE- Instead of Len() always use Datalength() and divide the number by 2 (what happens if the number happens to be odd - that will never happen, right?)Anything else?Was it George Orwell who wrote 'All spaces are created equal, but some are more equal than others?' |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-12-01 : 12:44:58
|
"I am writing sprocs and UDFs in which spaces are significant"Where trailing spaces are significant??Kristen |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-12-01 : 13:46:02
|
quote: Originally posted by KristenWhere trailing spaces are significant??
All spaces are significant because I am trying to programmatically manipulate strings supplied by any consumer, not only retrieved from SS data. The problem is I don't know when I will receive a string of one or more spaces, or characters followed by one or more spaces. Whatever methodology I develop, I want to be sure that spaces (Char(32) will be treated just like any other character. |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2005-12-01 : 15:41:57
|
A specific example would be helpful. I can't think of a scenerio in which the way SQL handles spaces would cause difficulty; I know I've written dozens of UDFs/procs along these lines and I've never had a problem or needed to do anythign specific to deal with spaces.How about an example UDF -- give us some sample input, a description of what the UDF needs to do, and then the desired output. I suspect you may be making things more complicated than needed. |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-12-01 : 18:30:41
|
quote: Originally posted by jsmith8858 A specific example would be helpful.
I started really looking into this problem when I wanted to support space as the delimiter in this UDF, but I also got stung earlier when dealing with our own metadata (don't remember the details). I wrote this before I learned about datalength() and using LIKE instead of = so now I think I have a solution, albeit not as elegant as I would prefer.To test, change the second parameter in the SELECT code on the very bottom to any number of space characters.IF EXISTS (SELECT * FROM sysobjects WHERE name = N'fnSplit') DROP FUNCTION fnSplitGO------------------ fnSplit ----------------------CREATE FUNCTION dbo.fnSplit (@RowData nvarchar(4000), @SplitOn nvarchar(5))-- AUTHOR: MI 11/30/05-- Adapted from Seventhnight's posting on http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=50648.-- The function preserved starting and trailing spaces.-- IMPORTANT: Current does NOT support spaces as delimiters. -- PURPOSE: -- Split a uniformly delimited string into a table.-- PARAMETERS:-- @RowData: the string to split.-- @SplitOn: the delimiter.-- RETURNS:-- Table with 2 columns: counter and Data.-- EXAMPLE 1:-- SELECT * FROM dbo.fnSplit -- ('Test|string||that needs to|be||split', -- '|')-- GO-- EXAMPLE 2:-- SELECT * FROM dbo.fnSplit -- (' Test\\\ string\\\that needs to \\\\\\ be \\\ split \\\\\\', -- '\\\')-- GORETURNS @RtnValue table (Id int identity(1,1), Data nvarchar(4000)) AS BEGIN -- Ensure that the SplitOn parameter has at least 1 character. If not, -- we would loop infinitely -- Len returns 0 when only spaces are in @SplitOn. Spaces are not -- not supported in @SplitOn because they cause an infinite loop -- further down. IF Len(@SplitOn) < 1 BEGIN-- RAISERROR ('Please provide a valid character in the %d parameter.',-- 16, 1, '@SplitOn') Return END ELSE BEGIN Declare @Cnt int Declare @sValue nvarchar(4000) Set @Cnt = 1 While (Charindex(@SplitOn,@RowData)>0) Begin (Select @sValue = Substring(@RowData,1,Charindex(@SplitOn,@RowData)-1)) -- removed LTrim(RTrim()) Insert Into @RtnValue (data) Select Data = @sValue Set @RowData = Substring(@RowData,Charindex(@SplitOn,@RowData)+Len(@SplitOn),len(@RowData)) Set @Cnt = @Cnt + 1 End Select @sValue = @RowData -- removed LTrim(RTrim()) Insert Into @RtnValue (data) Select Data = @sValue END ReturnENDGO-- =============================================-- Example to execute function-- =============================================SELECT * FROM dbo.fnSplit (' Test\\\ string\\\that needs to \\\\\\ be \\\ split \\\\\\', '\\\')GO |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2005-12-01 : 18:48:52
|
The problem here is a logic one: If you want to support spaces as a delimiter, but also "preserve" all spaces, then you have conflicting things you are trying to do.How should "This is a string" be delimited into words, if you want to use a space as a delimiter, but also "preserve" all spaces?1) "This ", "is ", "a", "string"or2) "This", "", "is", "", "a","string"or3) "This", " is", " a","string"or4) "This", " is ", "a", "string"etc ?What about "hello there" (3 spaces) ?Should it be1) "hello"," ","there" 2) "hello ","there"3) "hello "," there"4) "hello"," there" 5) "hello"," "," "," ","there" 6) "hello","","","","there"etc ?You can't preserve spaces and also allow a space character to be a delimiter! By definition, the delimiter character is removed from the string, isn't it? What happens in your example of delimiting by "///" if you say you wish to preserve all instances of "///" ?Your problem isn't a T-SQL problem, it's a problem with what you want to do. Your first step before dealing with *any* T-SQL code is clearly defining your specs with logic that makes sense. |
|
|
Marioi
Posting Yak Master
132 Posts |
Posted - 2005-12-01 : 19:12:15
|
quote: Originally posted by jsmith8858What about "hello there" (3 spaces) ?
I don't want to preserve spaces, I want to properly interpret them. Your example above is the same as 'hello,,,there' if the comma was a separator so the result is:"hello"+""+""+"there"And if the delimiter were ' ' (2 spaces), then the result would be:"hello"+" there" |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2005-12-01 : 19:32:41
|
OK, now we are getting somewher. Remember this from earlier in the thread:quote: LEN Returns the number of characters, rather than the number of bytes, of the given string expression, excluding trailing blanks. DATALENGTH Returns the number of bytes used to represent any expression.
And let's get back to the problem: your fn doesn't work for spaces. Yet, twice in your function you use the LEN() function around the delimiter variable, where you'd like to allow a space!quote: CREATE FUNCTION dbo.fnSplit (@RowData nvarchar(4000), @SplitOn nvarchar(5))-- AUTHOR: MI 11/30/05-- Adapted from Seventhnight's posting on http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=50648.-- The function preserved starting and trailing spaces.-- IMPORTANT: Current does NOT support spaces as delimiters. -- PURPOSE: -- Split a uniformly delimited string into a table.-- PARAMETERS:-- @RowData: the string to split.-- @SplitOn: the delimiter.-- RETURNS:-- Table with 2 columns: counter and Data.-- EXAMPLE 1:-- SELECT * FROM dbo.fnSplit -- ('Test|string||that needs to|be||split', -- '|')-- GO-- EXAMPLE 2:-- SELECT * FROM dbo.fnSplit -- (' Test\\\ string\\\that needs to \\\\\\ be \\\ split \\\\\\', -- '\\\')-- GORETURNS @RtnValue table (Id int identity(1,1), Data nvarchar(4000)) AS BEGIN -- Ensure that the SplitOn parameter has at least 1 character. If not, -- we would loop infinitely -- Len returns 0 when only spaces are in @SplitOn. Spaces are not -- not supported in @SplitOn because they cause an infinite loop -- further down. IF Len(@SplitOn) < 1 BEGIN-- RAISERROR ('Please provide a valid character in the %d parameter.',-- 16, 1, '@SplitOn') Return END ELSE BEGIN Declare @Cnt int Declare @sValue nvarchar(4000) Set @Cnt = 1 While (Charindex(@SplitOn,@RowData)>0) Begin (Select @sValue = Substring(@RowData,1,Charindex(@SplitOn,@RowData)-1)) -- removed LTrim(RTrim()) Insert Into @RtnValue (data) Select Data = @sValue Set @RowData = Substring(@RowData,Charindex(@SplitOn,@RowData)+Len(@SplitOn),len(@RowData)) Set @Cnt = @Cnt + 1 End Select @sValue = @RowData -- removed LTrim(RTrim()) Insert Into @RtnValue (data) Select Data = @sValue END ReturnEND
As mentioned, to allow for spaces, you need to use DATALENGTH. And you also need to remember that unicode uses 2 bytes per character, not 1, and DATALENGTH returns the # of bytes in the string. So what happens if you replace the two instances ofLEN(@SplitOn)withDATALENGTH(@SplitOn)/2? |
|
|
Next Page
|
|
|
|
|