Author |
Topic |
dextrous
Starting Member
15 Posts |
Posted - 2010-02-03 : 04:01:29
|
Hello,I'm struggling with a very large table in my application. There's one column with unique values (around 24 million rows) but there are around 120 other columns in the same database. Depending upon the client's input 2 or 3 or multiple values are compared (its a zipcode profile column) and it outputs the highest value. When a client gives around a million of these, we are facing performance issues. At first I thought it might be large number of rows in the db, so I created views (each range has around only 500,000 records), so around 48 views. But the problem, I think, is also with the large number of columns in the table.Is there an efficient way to handle large # of columns in a db like there's the views concept of large # of rows? |
|
harsh_athalye
Master Smack Fu Yak Hacker
5581 Posts |
Posted - 2010-02-03 : 04:07:32
|
Do you need all the columns as output always? Is the table normalized? 120 seems to be quite a large number of columnsHarsh Athalyehttp://www.letsgeek.net/ |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-03 : 04:11:59
|
Need to see a Query Plan for a "typical" query that is "too slow" please |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2010-02-03 : 07:24:13
|
Are you on 2008? If so you can look in to sparse columns...if you have a lot of NULLs in these 120 columns they could prove really handy. Sparse columns are stored as xml, hence if there is no value for one of the 120 columns nothing will be stored. If this is not suitable, you could look at both vertical and horizontal partitioning...- LumbagoIf the facts don't fit the theory, change the facts. Albert Einstein |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-03 : 07:50:06
|
@Lumbago: I don't think the presence of 120 columns is making performance bad (particularly if the values in them are sparse!I reckon it will be an optimisation issue, but if that doesn't work then you are right about Partitioning.24M rows is not really a "huge" table, I'd class it as "big".If most of those 120 columns are TEXT/VARCHAR(MAX), and never part of the WHERE clause, I might want to shift them out of the main table into a 1:1 sibbling-table .... |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2010-02-03 : 08:18:35
|
Well...I agree that the sparse columns might not be what dextrous is looking for after thinking a little more about it. It's more of a space saver than a performance too so I guess partitioning is the keyword to remember.@dextrous: maybe you can post a part of your table and the queries you usually run...maybe there is room for optimizations?- LumbagoIf the facts don't fit the theory, change the facts. Albert Einstein |
|
|
dextrous
Starting Member
15 Posts |
Posted - 2010-02-04 : 02:12:38
|
ALTERProcedure [dbo].[STR_SP_GetCultureByZcode] (@Zipcode INT, @Querystr Varchar(200), @TblName Varchar(100), @existsZip INT OUTPUT )ASBeginSETNOCOUNT ON; DECLARE @Result VARCHAR(8000), @CondStmt1 nvarchar(200) --check Whether record exist in table with this zipcode Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = ' + Convert(varchar(30),@Zipcode) EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT If(@existsZip > 0) -- If Exists get the data Begin Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM (SELECT * FROM ' + @TblName + ' WHERE ZipData =' + Convert(varchar(30),@Zipcode) + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC' End else -- If not exist take only 5 digits from zipcode and pass as input to query Begin if(Exists(Select ZipData from CountryOrigin_View0 Where Left(ZipData,5) = @Zipcode)) Begin Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM (SELECT * FROM CountryOrigin_View0 WHERE ZipData =' + Convert(varchar(30),Left(@Zipcode,5)) + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC' End End Exec(@Result) EndGOSETANSI_NULLS OFF GOSETQUOTED_IDENTIFIER OFF GO |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-04 : 03:01:46
|
[code]Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = @Zipcode' + Convert(varchar(30),@Zipcode) EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT, @Zipcode INT', @existsZip OUTPUT, @Zipcode[/code]this will improve the chances of the query plan being cachedChange:[code]if(Exists(Select ZipData from CountryOrigin_View0 Where Left(ZipData,5) = @Zipcode))[/code]to first reduce @Zipcode to 5 characters - e.g.[code]SELECT @Zipcode = CONVERT(int, CONVERT(varchar(5), @Zipcode))[/code]then do a range test on ZipData (you will need an index on ZipData column) and assuming I have understood ZipData usage correctly and my maths is reasonable [code]if(Exists(Select * from CountryOrigin_View0 Where ZipData >= @Zipcode * 10 AND ZipData < (@Zipcode + 1) * 10[/code]Then, because you modified @Zipcode to 5 digits already, you can use the same query for both routes:[code]Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM ([blue]SELECT * FROM[blue] ' + @TblName + ' WHERE ZipData = @Zipcode ) AS t1UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC' [/code]and then use sp_ExecuteSQL instead of EXEC :[code]EXEC sp_executesql @Result, N'@Zipcode INT', @Zipcode[/code]again, this will improve the chances of the query plan being cachedCan you replace SELECT * with[code]SELECT ' + @Querystr + ' FROM[code]thus avoiding pulling all 120 columns? (There may be some extra columns you need in addition to the ones in @Querystr)I also question whether your logic is right for the second IFYou do IF EXISTS based on LEFT(ZipData, 5), and then your actual UNPIVOT WHERE clause is ZipData = LEFT(@ZipCode, 5) - which is a very different thing! |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2010-02-04 : 03:19:08
|
And I don't really see the need for the derived table in the "SET @Result"-query. Unless I'm missing something here the following will be far more efficient:Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM CountryOrigin_View0 WHERE ZipData =' + Convert(varchar(30),Left(@Zipcode,5)) + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ') ORDER BY theValue DESC'- LumbagoIf the facts don't fit the theory, change the facts. Albert Einstein |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-04 : 03:27:49
|
I've never used UNPIVOT (I'm normal!!!) so have no idea what works in that regard. Useful to know Lumbago, thanks. |
|
|
satish.gorijala
Posting Yak Master
182 Posts |
Posted - 2010-02-04 : 03:55:18
|
Hi kristen,I didn't understand your statement. check the bold one below.I am getting error when i try to execute this statmentSet @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = @Zipcode' + Convert(varchar(30),@Zipcode)EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT, @Zipcode INT', @existsZip OUTPUT, @ZipcodeExec STR_SP_GetCultureByZcode_New 85260,'INDI,CHIN,JPAN','CountryOrigin_View0',1Msg 137, Level 15, State 2, Line 1Must declare the scalar variable "@Zipcode85260".developer :) |
|
|
dextrous
Starting Member
15 Posts |
Posted - 2010-02-04 : 04:02:17
|
satish works with me...so he will also have questions :)thanks for your help, kristen & lumbago |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-04 : 09:56:51
|
Did you delete the bit in Blue?Put a PRINT statement in the SProc so you can see what is in @CondStmt1 and @Result before it is executed (comment out the EXEC / SP_ExecuteSQL temporarily if the error is preventing you seeing the PRINT debug message) |
|
|
satish.gorijala
Posting Yak Master
182 Posts |
Posted - 2010-02-07 : 04:58:46
|
Hi kristen, i didn't delte the blue one from your suggestion. bcz i am getting some syntax error. I have refined stored proc with some of ur suggestions. please give suggestions from below stored proc what can i do more to optimize this thing and get results faster.ALTER Procedure [dbo].[STR_SP_GetCultureByZcode] ( @Zipcode INT, @Querystr Varchar(200), @TblName Varchar(100), @existsZip INT OUTPUT ) ASBegin SET NOCOUNT ON; DECLARE @Result VARCHAR(8000), @CondStmt1 nvarchar(200), @Zip5 varchar(50), @Zcode Varchar(50)Set @Zcode = Convert(varchar(30),@Zipcode)Set @Zip5 = Convert(varchar(30),Left(@Zipcode,5)) --check Whether record exist in table with this zipcode Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = ' + @ZcodeEXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT If(@existsZip > 0) -- If Exists get the dataBegin Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM ' + @TblName + ' WHERE ZipData =' + @Zcode + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC' End else -- If not exist take only 5 digits from zipcode and pass as input to query Begin Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM CountryOrigin_View0 WHERE ZipData =' + @Zip5 + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC' End Exec(@Result)EndGO quote: Originally posted by Kristen Did you delete the bit in Blue?Put a PRINT statement in the SProc so you can see what is in @CondStmt1 and @Result before it is executed (comment out the EXEC / SP_ExecuteSQL temporarily if the error is preventing you seeing the PRINT debug message)
developer :) |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-02-08 : 02:39:27
|
"i didn't delte the blue one from your suggestion. bcz i am getting some syntax "Yeah, well you will do if you don't delete the piece marked strikethrough.I marked it up so you would be able to see the changes I made and from that understand the changes. ... |
|
|
|