R
Constraint Violating Yak Guru
328 Posts
Posted - 2012-01-31 : 08:36:07
I have stored procedures where I pass large CSV strings as parameter values and parse them into a table variable. Currently I use the function below, but can anyone advise whether a faster method is available?

CREATE FUNCTION [dbo].[myApp_CreateIdTable]
(
    @Ids nvarchar(max),
    @separator char(1)
)
RETURNS @tbl_Ids TABLE ( IdVal int NOT NULL )
AS
BEGIN
    -- check @Ids string ends with separator character
    IF NOT RIGHT(@Ids, 1) = @separator
    BEGIN
        SET @Ids = @Ids + @separator
    END

    DECLARE @separator_position INT
    DECLARE @array_value nvarchar(10)

    WHILE PATINDEX('%' + @separator + '%', @Ids) <> 0
    BEGIN
        -- feed array of integers into the table variable
        SELECT @separator_position = PATINDEX('%' + @separator + '%', @Ids)
        SELECT @array_value = LEFT(@Ids, @separator_position - 1)
        INSERT INTO @tbl_Ids (IdVal) VALUES (@array_value)
        SELECT @Ids = STUFF(@Ids, 1, @separator_position, '')
    END

    RETURN
END
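A quick usage sketch of the function above, with illustrative ID values:

    SELECT IdVal
    FROM dbo.myApp_CreateIdTable('5,6,7897,564', ',')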
webfred
Master Smack Fu Yak Hacker
8781 Posts
Posted - 2012-01-31 : 08:51:24
What is your definition of a "large CSV string"?

Hmmm... maybe the optimization should happen in the use of the function and not in changing the function itself...?

No, you're never too old to Yak'n'Roll if you're too young to die.
khtan
In (Som, Ni, Yak)
17689 Posts
Posted - 2012-01-31 : 08:53:21
Refer to http://www.sqlservercentral.com/articles/Tally+Table/72993/

KH
[spoiler]Time is always against us[/spoiler]
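For readers who don't follow the link, a minimal sketch of the tally-table split the article describes, assuming a dbo.Tally table of sequential integers already exists (the article shows how to build one properly); the variable and values are just examples:

    -- assumes dbo.Tally(N) holds 1, 2, 3, ... up to at least LEN(@Ids)
    DECLARE @Ids varchar(max)
    SET @Ids = '5,6,7897,564'

    SELECT CAST(SUBSTRING(@Ids, N, CHARINDEX(',', @Ids + ',', N) - N) AS int) AS IdVal
    FROM dbo.Tally
    WHERE N <= LEN(@Ids)                       -- stay inside the string
      AND SUBSTRING(',' + @Ids, N, 1) = ','    -- position N starts a new element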
R
Constraint Violating Yak Guru
328 Posts
Posted - 2012-01-31 : 10:06:32
Hi

My CSV strings vary in size: maybe 50 characters at the lower end, right up to 50,000 characters, in both cases with variable lengths between commas (the values are all numeric ID values, e.g. '5,6,7897,564,487874,54,125,4').

I tried DelimitedSplit8K but it stops working on larger strings. Is there a way to configure it without adding more SELECT 1 UNION ALL commands?
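For what it's worth, the usual way to grow that kind of inline row source without stacking more UNION ALLs is to cross join a small set with itself — a rough sketch (not Jeff's exact code):

    WITH E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
           UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), -- 10 rows
         E4(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d),  -- 10^4 = 10,000 rows
         E8(N) AS (SELECT 1 FROM E4 a, E4 b),              -- 10^8 rows, generated only as needed
         Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E8)
    SELECT TOP (50000) N FROM Tally   -- one number per character position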
webfred
Master Smack Fu Yak Hacker
8781 Posts
Posted - 2012-01-31 : 10:43:08
I can't believe my eyes... If you are able to pass 50,000 characters as a CSV parameter to a stored procedure, then you should be able to go another way. Maybe you can fill a table with those values instead of passing them as a parameter? A sketch follows.

No, you're never too old to Yak'n'Roll if you're too young to die.
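A hedged sketch of the staging-table idea; the table, column, and parameter names here are all invented for illustration:

    -- a staging table the application fills before calling the report proc
    CREATE TABLE dbo.ReportUserSelection
    (
        SessionId uniqueidentifier NOT NULL,
        UserId    int              NOT NULL,
        PRIMARY KEY (SessionId, UserId)
    )

    -- inside the procedure, @SessionId is a parameter and dbo.Users stands in for the real users table
    SELECT u.*
    FROM dbo.Users AS u
    INNER JOIN dbo.ReportUserSelection AS s
            ON s.UserId = u.UserId
    WHERE s.SessionId = @SessionId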
R
Constraint Violating Yak Guru
328 Posts
Posted - 2012-01-31 : 10:53:58
Okay, here's what happens. In an ASP.NET application I have a control that allows a user to select other users to report against. Some of our customers have only, say, 100 users, while others have 5,000.

The chosen users are collected in ASP.NET as a CSV list of database ID values and passed to SQL to use in the report. So if the user selects 100 users, SQL gets a fairly short CSV string. If they select all the users, the string might well be 50,000 characters long, depending on the length of each user's ID value.

If you know of a better way to do this, I am extremely open to suggestions.
robvolk
Most Valuable Yak
15732 Posts
Posted - 2012-01-31 : 11:23:16
If it's a matter of selecting all of 5000 IDs, and never 4000 or 3000, then pass 'ALL' as the parameter value and modify your function to return everything. I agree that Jeff Moden's technique is the best way to go, short of using a CLR UDF to parse the string.

Another method you may want to investigate is using fixed-length values instead of comma-separated. You can store int values in 4 bytes of binary and pass the entire set as a varbinary(max). The length of the value / 4 is the number of ints passed. Here's an example:

    -- create and populate sample table
    CREATE TABLE #a(i INT NOT NULL, alpha AS CHAR(64+i));
    WITH n(n) AS (SELECT 1 UNION ALL SELECT n+1 FROM n WHERE n<100)
    INSERT #a SELECT n FROM n

    -- set up variable (parameter) and append ID values as binary(4)
    DECLARE @param VARBINARY(MAX)=0x;
    SET @param=@param + CAST(4 AS BINARY(4))
    SET @param=@param + CAST(23 AS BINARY(4))
    SET @param=@param + CAST(19 AS BINARY(4))
    SET @param=@param + CAST(47 AS BINARY(4))
    SET @param=@param + CAST(87 AS BINARY(4))

    -- parse fixed-length binary values and convert to int, then join to table
    ;WITH n(n) AS (SELECT 1 UNION ALL SELECT n+4 FROM n WHERE n<DATALENGTH(@param)-4),
    i(i) AS (SELECT CAST(SUBSTRING(@param,n,4) AS INT) FROM n)
    SELECT a.* FROM i INNER JOIN #a a ON i.i=a.i

The technique is practically identical to Jeff's article and avoids having to search for commas. It's also more space efficient, particularly if you were using GUIDs instead of ints (which was the reason I started using this technique).
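The 'ALL' short-circuit suggested above might look something like this inside the OP's function; dbo.Users and UserId are stand-ins for the real table and column:

    IF @Ids = 'ALL'
    BEGIN
        INSERT INTO @tbl_Ids (IdVal)
        SELECT UserId FROM dbo.Users   -- return every ID instead of parsing
        RETURN
    END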
R
Constraint Violating Yak Guru
328 Posts
Posted - 2012-01-31 : 11:59:54
Hi Rob

Thanks for that. I was having problems getting Jeff Moden's technique to work with lengthy strings. His example seems limited to varchar(8000), and my T-SQL isn't hot enough to expand on it (other than the guess I made above)...

Your method looks interesting, but surely getting the application to convert all the records to a byte array would outweigh the speed gains from the function you provided, right?

I will certainly modify my procedure to use the 'ALL' technique, but unfortunately I then end up heading down a catch-all query path, which I really wanted to avoid (as 'ALL' would require a CustomerID lookup instead of a UserID lookup, and I don't fancy duplicating this query for all combinations of parameter values...). Unless of course there's a better way...
robvolk
Most Valuable Yak
15732 Posts
Posted - 2012-01-31 : 16:48:46
quote: Your method looks interesting, but surely getting the application to convert all the records to a byte array would outweigh the speed gains from the function you provided, right?

Not really. CLR code to convert int to binary will be very fast and compiled. You'll take a bigger performance hit by keeping this in a SQL UDF, especially a multi-statement table function, and especially if you're JOINing or CROSS APPLYing to it: it has to be interpreted by the query processor and won't optimize well. A single-statement TVF or stored procedure can be optimized.

You can avoid most or all of this by using a SQL CLR function to split strings. There's a great analysis of various methods here:
http://sqlblog.com/blogs/aaron_bertrand/archive/2010/07/07/splitting-a-list-of-integers-another-roundup.aspx

And Adam's CLR version here:
http://sqlblog.com/blogs/adam_machanic/archive/2009/04/26/faster-more-scalable-sqlclr-string-splitting.aspx

quote: I will certainly modify my procedure to use the 'ALL' technique, but unfortunately I then end up heading down a catch-all query path, which I really wanted to avoid (as 'ALL' would require a CustomerID lookup instead of a UserID lookup, and I don't fancy duplicating this query for all combinations of parameter values...). Unless of course there's a better way...

If you've got a lot of parameters, dynamic SQL may be the answer. Check Gail's post:
http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/
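In the spirit of Gail's post, a rough sketch of the dynamic-SQL shape: build only the predicates that were actually supplied, then run it parameterised. Column and parameter names are invented; the split function is the OP's:

    DECLARE @sql nvarchar(max)
    SET @sql = N'SELECT * FROM dbo.Users WHERE 1 = 1'

    IF @CustomerId IS NOT NULL
        SET @sql = @sql + N' AND CustomerId = @CustomerId'
    IF @UserIds IS NOT NULL
        SET @sql = @sql + N' AND UserId IN (SELECT IdVal FROM dbo.myApp_CreateIdTable(@UserIds, '',''))'

    EXEC sp_executesql @sql,
         N'@CustomerId int, @UserIds nvarchar(max)',
         @CustomerId, @UserIds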
R
Constraint Violating Yak Guru
328 Posts
Posted - 2012-02-01 : 06:21:56
Hi Rob

Thank you very much for those links. I had read Gail's post a few times before, but the first two links are priceless.

I've never created a CLR function before. From my Googling this morning, it seems the compiled .NET code must be referenced inside a SQL stored procedure in order for it to be run. Could it also be linked inside a SQL function as well?
robvolk
Most Valuable Yak
15732 Posts
Posted - 2012-02-01 : 06:51:05
Sure, once the assembly is registered and the function/procedure/data type created, CLR objects work like any other database object.
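For anyone registering their first one, the T-SQL side looks roughly like this; the assembly, class, and method names are placeholders for whatever your .NET project exposes, and the path is illustrative:

    EXEC sp_configure 'clr enabled', 1   -- once per instance
    RECONFIGURE
    GO
    CREATE ASSEMBLY StringHelpers
    FROM 'C:\assemblies\StringHelpers.dll'
    GO
    CREATE FUNCTION dbo.SplitString (@input nvarchar(max), @delimiter nchar(1))
    RETURNS TABLE (item nvarchar(4000))
    AS EXTERNAL NAME StringHelpers.[UserDefinedFunctions].SplitString
    GO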
Kristen
Test
22859 Posts
Posted - 2012-02-01 : 12:34:51
quote: Originally posted by R
His example seems limited to varchar(8000) and my T-SQL isn't hot enough to expand on it (other than the guess I made above)...

You should, probably!, be able to just change varchar(8000) to varchar(max) to get handling for unlimited-size strings.

There is skilful discussion of the various methods here:
http://www.sommarskog.se/arrays-in-sql-2005.html

and also of table-valued parameters, which you may very well be able to use easily from ASP.NET (in essence passing a ready-formed database "table" instead of a CSV "list"); see the sketch after this post.

I have several SPLIT functions that I use: different ones depending on whether the function is returning INTs or strings, and whether the input string is < 8,000 or > 8,000 characters. The performance is vastly different, and I use different algorithms for small vs. large string splitting. Make sure you time yours on both small and large string tests; we didn't realise how disastrously our original code performed on large strings until we fell over the problem by accident!
R
Constraint Violating Yak Guru
328 Posts
Posted - 2012-02-01 : 12:52:57
Hi Kristen

Thanks for your valued input. Unfortunately our live machine is SQL 2005, so I don't think table-valued parameters work on that version. Safe to say I'll be giving them my full attention when we upgrade, as they would make life so much easier.

I've been researching the CLR methods in depth today. Although I've never actually used or created one, it does seem that these far outperform any other internal SQL function or approach. So... I'm currently figuring out how to get my hacked .NET CLR function to be available in SSMS... :-)