heze
Posting Yak Master
192 Posts
Posted - 2006-03-20 : 23:46:48
Hi, I'm developing a completely unnormalized table which is in fact very useful for my bosses. The table is intended to track a group of people, identified by an ID number, through time, with the different characteristics listed per row. For example:

    ID   Characteristic   Year1   Year2   Year3   Year4   ...   YearN
    22   HoursWorked      3000    3500    3888    ...

In reality I only produce the first three columns manually; the rest is taken care of by means of temporary tables and inner joins. Of course, although very comprehensive and useful for our purposes, it gets cumbersome and unnatural when aggregating horizontally. For example, in the previous table: how many hours has 22 worked from Year1 to Year4? Here is where I need your comments. How can I make this as easy as possible, so that I don't have to write Year1+Year2+...+YearN every time I want to aggregate? I am looking at the INFORMATION_SCHEMA views (or sp_columns) to build a user-defined function with WHILE loops, and maybe cursors, so that the end user can simply perform something like MySUM(Range(n...m)), where n and m are the first and last columns to aggregate. Is there any suggestion? Thank you
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts
Posted - 2006-03-21 : 00:22:11
quote: Originally posted by heze
...I'm developing a completely unnormalized table... Is there any suggestion?...

I would suggest keeping your data in a normalized table and writing your query against that.

CODO ERGO SUM
heze
Posting Yak Master
192 Posts
Posted - 2006-03-21 : 00:41:04
Hi, the reason for this structure is that we have been able to perform some queries, such as finding attrition and retention rates, in a much simpler way using this scenario than using a normalized one. Additionally, the number of characteristics to monitor per year per person is now on the order of 40 and will continue increasing, and the process for producing the rest of the columns is automatic, so duplicated queries are not a problem. Furthermore, since my job is to provide precise numbers, I need a source in which I can quickly cross-check the results of my queries against the normalized tables, as well as the trends, to see if my reports make sense. My bosses, who are not programmers, want to be able to do the same. The department produces numbers, so we cannot afford building applications for everything.
Srinika
Master Smack Fu Yak Hacker
1378 Posts
Posted - 2006-03-21 : 06:35:52
If you insist on your un-normalized design, you might create dynamic SQL, e.g.:

    declare @s varchar(8000), @s1 varchar(8000)
    Set @s = 'Select ID, '

Using a loop (if the number of fields varies) or manually (if the number of fields is fixed), the field names are taken into a string, concatenated with the necessary text, to make the list as follows:

    -- @s1 => 'isnull(year1,0) + isnull(year2,0) + isnull(year3,0) + .... isnull(yearn,0) '
    Set @s = @s + @s1 + ' From ....'
    Exec (@s)
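For reference, the approach above can be sketched end to end by building the sum expression from the catalog views instead of typing each column; this is only a sketch, and the table name (YearlyTracking) and the Year% column pattern are hypothetical placeholders:

```sql
-- Sketch only: YearlyTracking and the Year% column pattern are hypothetical.
DECLARE @s VARCHAR(8000), @s1 VARCHAR(8000)

-- Build 'isnull([Year1],0) + isnull([Year2],0) + ...' from INFORMATION_SCHEMA.
SELECT @s1 = ''
SELECT @s1 = @s1 + 'isnull([' + COLUMN_NAME + '],0) + '
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'YearlyTracking'
  AND COLUMN_NAME LIKE 'Year%'
ORDER BY ORDINAL_POSITION

-- Trim the trailing ' + ' before assembling the final statement.
SET @s1 = LEFT(@s1, DATALENGTH(@s1) - 3)
SET @s = 'SELECT ID, ' + @s1 + ' AS Total FROM YearlyTracking'
EXEC (@s)
```

One caveat: the SELECT @var = @var + ... concatenation idiom with ORDER BY is widely used on SQL Server 2000, but its ordering behavior is not guaranteed by the documentation.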
heze
Posting Yak Master
192 Posts
Posted - 2006-03-21 : 08:36:58
Thanks Srinika! It worked just fine. I used:

    declare cr_ColumnNames cursor for
        select COLUMN_NAME
        from INFORMATION_SCHEMA.COLUMNS
        where TABLE_NAME = @tableName
          and ORDINAL_POSITION between @startRange and @endRange
        order by ORDINAL_POSITION

    open cr_ColumnNames
    FETCH NEXT FROM cr_ColumnNames into @currentColumn
    select @Str1 = 'cast([' + @currentColumn + '] as int)'
    FETCH NEXT FROM cr_ColumnNames into @currentColumn
    WHILE @@FETCH_STATUS = 0
    begin
        select @Str1 = @Str1 + '+' + 'cast([' + @currentColumn + '] as int)'
        FETCH NEXT FROM cr_ColumnNames into @currentColumn
    end
    select @Str2 = 'select ' + @Str1 + ' from MT_TrackingElement_15'
    print @Str2
    exec (@Str2)
    CLOSE cr_ColumnNames
    DEALLOCATE cr_ColumnNames
heze
Posting Yak Master
192 Posts
Posted - 2006-03-21 : 08:42:25
In your experience, how do you know when you are overusing EXEC()? In my un-normalized design it has been very useful, but everything has a price, and I would like some input. Neither performance nor security is an issue for me; my job is to produce numbers. Thank you
madhivanan
Premature Yak Congratulator
22864 Posts
Posted - 2006-03-21 : 08:43:54
If you use dynamic SQL, then the user has to have SELECT permission on the specified table, rather than just having EXECUTE permission on the stored procedure.

Madhivanan

Failing to plan is Planning to fail
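That permission difference can be sketched as follows; the user and object names here (ReportUser, usp_YearTotals, YearlyTracking) are hypothetical:

```sql
-- Static SQL inside a procedure: EXECUTE permission alone is enough,
-- because ownership chaining covers the table access.
GRANT EXECUTE ON dbo.usp_YearTotals TO ReportUser

-- Dynamic SQL via EXEC() breaks the ownership chain, so the caller
-- also needs direct SELECT permission on the underlying table.
GRANT SELECT ON dbo.YearlyTracking TO ReportUser
```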
jsmith8858
Dr. Cross Join
7423 Posts
Posted - 2006-03-21 : 09:37:51
heze -- you do understand that if you had a normalized design, you'd be able to "just get the numbers" even faster, with shorter and clearer code, without a cursor or dynamic SQL, right? The fact that your bosses eventually want to see your data spread out in columns has *nothing* to do with how you should actually be storing the data in your database.
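To illustrate the point, here is a sketch assuming a hypothetical normalized table TrackingNormalized with columns ID, Characteristic, Yr, and Value (one row per person, characteristic, and year). The "hours worked from Year1 to Year4" question becomes a plain aggregate, with no cursor and no dynamic SQL:

```sql
-- Hypothetical normalized layout: one row per ID/Characteristic/Yr.
SELECT ID, SUM(Value) AS TotalHours
FROM TrackingNormalized
WHERE Characteristic = 'HoursWorked'
  AND Yr BETWEEN 1 AND 4
GROUP BY ID
```

Changing the range means editing two constants, not regenerating a column list.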
Srinika
Master Smack Fu Yak Hacker
1378 Posts
Posted - 2006-03-21 : 10:10:59
heze,

[Still, I'm with the others that you'd be better off if you normalize your table.]

You can avoid the cursor and produce the column list as follows:

    DECLARE @sCols VARCHAR(8000)
    SELECT @sCols = ''
    SELECT @sCols = @sCols + Column_name + ','
    from INFORMATION_SCHEMA.Columns
    where table_name = 'UrTbl'
    set @sCols = left(@sCols, Datalength(@sCols) - 1)
heze
Posting Yak Master
192 Posts
Posted - 2006-03-21 : 10:36:44
Thanks for the comments. I have a normalized database; my un-normalized table adds nothing new data-wise. The only problem is that, for tracking purposes, I have to create many views and queries, and maintaining and keeping track of them is a nightmare, especially because I query databases with different schemas on different servers. The databases are not mutually exclusive, i.e. there can be one column in three different server.databases with apparently the same data; however, it becomes tricky because in one database it can be a snapshot while in another it is a live table. Furthermore, snapshots can occur at different points in time. It is hard to keep track of the entire schema without forgetting the details, and I always have to search the field dictionaries. Thus, I found it useful to aggregate all the data in one single table.

I don't know if any of you have an alternative, but when you want to track many characteristics (40 so far) of a particular individual through time in a single table, one has to make a decision. Since the number of characteristics to track (records per individual in my table) increases at a much higher rate than the time frames (i.e. columns in my table), I found it more natural to implement the mentioned solution. Of course one can argue: why don't I create a set of normalized tables instead of one? The answer is that they already exist, and we are back to the original problem. Of course, I write to the community to search for alternatives.

Srinika, thanks, I will revisit my code. Thank you
heze
Posting Yak Master
192 Posts
Posted - 2006-03-26 : 22:13:40
Hi again. Concerned about my design, I did some research, and among others here is what I found. I guess sometimes "best DB practices" are not in sync with real-life requirements. The following is a research paper from IBM; of relevance to this comment is the vertical schema, which is extremely similar to my schema. Of course the queries become complex, but the attributes, as mentioned before, increase much faster than the columns, which represent terms, i.e. we have 2 per year.

http://www.research.ibm.com/people/m/minwang/publications/edbt02.pdf

Comments? Thanks
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts
Posted - 2006-03-26 : 23:36:54
That paper doesn't seem at all relevant to what you have described. It is basically a discussion of a specialized query-processing layer that is optimized for an Entity/Attribute type schema, and not at all a real-world implementation of a practical product.

You are basically talking about simple de-normalized tables in which you have taken data that would normally go into separate rows in a summary table and spread it into individual columns by time period.

I can tell you from experience that your approach is a bad idea, and that if you want to summarize by time period, you should just have a summary row for each time period. Your approach suffers from several problems: you cannot easily roll up multiple time periods; the table usually takes more storage than necessary, because of sparseness; and you have to continually modify the schema to add new time periods.

It is much easier to transform a table with a row per time period into a cross-tab view than it is to go the other way.

However, you don't seem willing to listen to the advice here. The problem you posted, of having to sum multiple columns, is a direct result of your design, and is one of its drawbacks. You seem intent on convincing us (or yourself) that it is a good design, and seem to be asking for some magic to make this problem go away.

CODO ERGO SUM
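The row-per-period to cross-tab transformation mentioned above can be sketched with CASE expressions (which work on SQL Server 2000, without the later PIVOT operator); the table and column names here (TrackingNormalized: ID, Characteristic, Yr, Value) are hypothetical:

```sql
-- One row per time period in storage; columns per period only in the view.
SELECT
    ID,
    Characteristic,
    SUM(CASE WHEN Yr = 1 THEN Value ELSE 0 END) AS Year1,
    SUM(CASE WHEN Yr = 2 THEN Value ELSE 0 END) AS Year2,
    SUM(CASE WHEN Yr = 3 THEN Value ELSE 0 END) AS Year3,
    SUM(CASE WHEN Yr = 4 THEN Value ELSE 0 END) AS Year4
FROM TrackingNormalized
GROUP BY ID, Characteristic
```

Rolling up multiple periods against the row-per-period table is then a single SUM with a WHERE clause on Yr, rather than a generated list of column additions.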