Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Dynamic data elements collected during data collection

Author  Topic 

AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2005-09-21 : 07:22:36
pradeep writes "For a data collection application, which is a better model.
1. Vertical data model. (ID, AttributeName, AttributeValue)
2. Custom columns (ID, custom1, custom2....custom50)

As the columns collected may change year over year, which of the above better takes care of this dynamicness."

robvolk
Most Valuable Yak

15732 Posts

Posted - 2005-09-21 : 07:23:04
#1. Don't even think about trying the 2nd model, it's a waste of time.
Go to Top of Page

mittal_pradeep
Starting Member

5 Posts

Posted - 2005-09-21 : 11:56:07
1. But the data collection and reporting is in horizontal format. If collected data is edited vertically, won't there be a extra steps of converting horizontally obtained data to vertical and then vertical to horizontal reports. In custom column model data always remains horizontal. Won;t performance not be a issue in taking care of two extra steps.
2. Won;t the concurrency be a issue, considering the fact that a logical single horizontal row is edited as say 10 rows. Two people might be changing same Primary key's different attributes at the same time.

Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2005-09-21 : 20:17:50
1. Here's an answer (read all the links posted in this thread):

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=55371

2. Concurrency is an issue with either structure. The lowest level lock that SQL Server can utilize is a row lock, so if someone is editing one column, the whole row is locked anyway. If fact, the multi-row approach will provide you greater potential concurrency, since only one attribute would be locked at a time.

Also, SQL Server is not like Access where you hold locks while browsing data...in fact, any application that does so is badly designed, IMHO. Data access should be as simple and quick as possible. Locks are managed automatically by SQL Server, and there is rarely any reason to interfere with its lock manager.
Go to Top of Page

mittal_pradeep
Starting Member

5 Posts

Posted - 2005-09-26 : 15:07:41
1. From the link i got a way to change vertical data to horizontal. But I will have to write a complex way to convert horizontally submitted data (in excel) to vertical format also. If I right a case statement I need to know the data elements collected. OR I have to right dynamic sql. Making such large dynamic sql's does not looks like a good approach.
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2005-09-26 : 20:22:18
Here's a solution to un-pivot your Excel data to an attribute-value format:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=3914

If you want to store your data as if it were an Excel sheet, then don't put it into SQL Server at all, leave it in Excel. The only reason to go to SQL Server is because you're (obviously) having problems with the current Excel format, so it makes no sense to replicate that format.
Go to Top of Page

mittal_pradeep
Starting Member

5 Posts

Posted - 2005-09-28 : 14:52:32
that's really really great. Thanks a lot.

I see two more problems with vertical format.

1. When we present the vertically stored data in horizontal format for manual data cleansing. Do we fire indiviual update statement for each attribute. i.e.

verticalTable = (pk, attribName, attribValue)

update verticalTable set attribValue='val1' where pk='pk1' and attribName='atrib1'
update verticalTable set attribValue='val2' where pk='pk2' and attribName='atrib2'
update verticalTable set attribValue='val3' where pk='pk3' and attribName='atrib3'
update verticalTable set attribValue='val4' where pk='pk4' and attribName='atrib4'
update verticalTable set attribValue='val5' where pk='pk5' and attribName='atrib5'



2. Also If i have to maintain history/ audit, I will have history also in vertical format with historyid as a extra column in the primary key. i.e.

historytable = (pk, attribName,historid, attribValue)
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2005-09-29 : 07:49:09
1. That's one possible solution, some other solutions that don't require multiple updates are described here:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=19846

Read all the links in that thread. I can't say if any of them are viable options for you.

My suggestion would be to clean the data up BEFORE you put it into the database. Again, if it's a spreadsheet, do the work on it while it's a spreadsheet. Changing the data structure in the database will necessarily change the way you work with that data; presenting it as a horizontal format for data entry/manipulation would be a waste of time IMHO. As I said earlier, if this horizontal structure is so mandatory then it really doesn't make sense to put it into SQL Server at all.

2. Adding a historyid column does not make sense to me, I would suggest having a ChangedDate column and maybe also a UserName column to indicate when it was changed and who changed it. The historyid is useless since it won't be used outside the table. If you must have a primary key, put it on PK, attribName, and ChangedDate.

I'll be honest and say I don't think log/history/audit tables need a primary key. That's not saying PK's aren't important, only that in this specific case it really doesn't bring any value. The history table is not going to be joined to anything else in a query, and the nature of its data doesn't require uniqueness. I have a number of audit tables at my current job and none of them have PK's, there's no suffering caused by their absence.
Go to Top of Page

mittal_pradeep
Starting Member

5 Posts

Posted - 2005-09-29 : 16:01:40
I get the data in horizontal structure.
Data cleansing is done in horizontal structure.
Data Reporting is done in horizontal structure.

Only reason for choosing vertical structure is maintenance. Every year data elements collected change by 10 percent or so. Now if i want the same table to store data for ten years, I need to maintain table with number of columns much larger than actual no. of data elements collected in a particular year.

That is why i was thinking of custom column approach, where by, For every year i maintain a map between custom column and actual data elements. And my table is table with 50 dummy or custom columns.

Go to Top of Page
   

- Advertisement -