Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Kristen
Test
22859 Posts |
Posted - 2015-03-26 : 23:33:50
|
I have to pull some data from another application. That has a Parent table and a Child table of Event Dates.Some of the Events have multiple dates, for a specific Event Type, that's fine ... I'd use a child-table for that of course.But most of the dates are distinct - only [zero or] one date allowed for a particular Event Type. Personally I would have created Columns in the Parent table for those.I will create a VIEW with multiple OUTER JOINs to the Child Table so that I can use that in any general purpose queries, but I also have to build a Record Card application - it will display a single record at a time, and will be the majority-use by users.Is there any merit in having a (more efficient??) single-row query which doesn't have all the OUTER JOINs?Or even warehousing the data (overnight) to flatten the Event table's distinct values into columns in the Parent table?Here's some code in case it helps illustrate the scenarioCREATE TABLE TEMP1 -- Parent Table( T1_ID int, T1_NonDate1 int, T1_NonDate2 varchar(10) PRIMARY KEY ( T1_ID ))CREATE TABLE TEMP2 -- Child Table of Event Types and Dates( T2_ID int, T2_T1_ID int, T2_Type varchar(4), T2_Date datetime, PRIMARY KEY ( T2_ID ))INSERT INTO TEMP1SELECT 1, 1234, 'FOO' UNION ALLSELECT 2, 5678, 'BAR'INSERT INTO TEMP2SELECT 1001, 1, 'AAA', '20000101' UNION ALL -- AAA is distinctSELECT 1002, 1, 'BBB', '20000102' UNION ALL -- BBB is distinctSELECT 1003, 1, 'CCC', '20000201' UNION ALL -- CCC can have multiple rows for each Parent recordSELECT 1004, 1, 'CCC', '20000202' UNION ALLSELECT 1005, 1, 'CCC', '20000203' UNION ALL--SELECT 1006, 2, 'AAA', '20010101' UNION ALLSELECT 1008, 2, 'CCC', '20010201' UNION ALLSELECT 1009, 2, 'CCC', '20010202'GO-- Sample QuerySELECT T1_ID, T1_NonDate1, T1_NonDate2, [AAA]=A.T2_Date, [BBB]=B.T2_DateFROM TEMP1 AS T LEFT OUTER JOIN TEMP2 AS A ON A.T2_T1_ID = T.T1_ID AND A.T2_Type = 'AAA' LEFT OUTER JOIN TEMP2 AS B ON B.T2_T1_ID = T.T1_ID AND B.T2_Type = 'BBB'GO-- Method 1 - Create ViewCREATE VIEW TEMP_VASSELECT [V_T1] = T1_ID, [V_AAA]=A.T2_Date, [V_BBB]=B.T2_Date FROM TEMP1 AS T LEFT OUTER JOIN TEMP2 AS A ON A.T2_T1_ID = T.T1_ID AND A.T2_Type = 'AAA' LEFT OUTER JOIN TEMP2 AS B ON B.T2_T1_ID = T.T1_ID AND B.T2_Type = 'BBB'GO-- Method 1 - Retrieve Data from ViewSELECT T1_ID, T1_NonDate1, T1_NonDate2, V_AAA, V_BBBFROM TEMP1 JOIN TEMP_V ON V_T1 = T1_ID-- Method 2 - Select data for a single rowDECLARE @AAA datetime, @BBB datetime, @MyID intSELECT @MyID = 1 -- The record I want - e.g. SProc parameter / Value provided by User-- Retrive data from associated tableSELECT @AAA = CASE WHEN T2_Type = 'AAA' THEN T2_Date ELSE @AAA END, @BBB = CASE WHEN T2_Type = 'BBB' THEN T2_Date ELSE @BBB ENDFROM TEMP2WHERE T2_T1_ID = @MyID AND T2_Type IN ('AAA', 'BBB')-- Select data from appropriate rowSELECT T1_ID, T1_NonDate1, T1_NonDate2, [AAA]=@AAA, [BBB]=@BBBFROM TEMP1WHERE T1_ID = @MyID-- Drop Tables/ViewDROP VIEW TEMP_VGODROP TABLE TEMP1GODROP TABLE TEMP2GO |
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2015-03-27 : 17:27:55
|
So you'll never select data for T2_Type = 'CCC'? I would rather be the man who bought the Brooklyn Bridge than the one who sold it. -Will Rogers |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-03-28 : 11:09:29
|
Yeah. There are multiple occurrences of some TYPEs of data, and other TYPEs are guaranteed to only have one entry. In this instance I'm interested in getting all the one-only types and display them in a "record card".The TYPEs that have multiple entries, like the "CCC" one in my example, will have to be displayed in a separate TAB / List / something-like-that.Personally I would have only used this sort of "child table" for TYPEs that could have multiple rows in the child table ... |
|
|
huangchen
Starting Member
37 Posts |
Posted - 2015-04-02 : 05:51:51
|
unspammed |
|
|
ScottPletcher
Aged Yak Warrior
550 Posts |
Posted - 2015-04-02 : 12:07:49
|
I wouldn't try to selectively denormalize the parents that currently can have only one child. Personally I like the more flexible, and in my view more complete, design as it is vs. making that change. |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-04-08 : 04:36:28
|
Interesting. Can you clarify why?Just to be clear the Child Table has multiple rows, but only one row for specific "attributes", so in order to display a Record Card type page I will have to do multiple joins to the Child Table to get maybe a dozen values.Perhaps I could PIVOT the child table to get multiple columns that I could then JOIN to? (Never had the need to try that before, no idea if it would work)This is the sort of output I'm getting from SET STATISTICS IO ON - I can see performance being dire for this!Table 'Worktable'. Scan count 0, logical reads 0Table 'CHILD1a'. Scan count 0, logical reads 10Table 'CHILD1'. Scan count 7, logical reads 21Table 'Worktable'. Scan count 0, logical reads 0Table 'CHILD2'. Scan count 5, logical reads 10Table 'CHILD3'. Scan count 10, logical reads 30Table 'PARENT_TABLE'. Scan count 0, logical reads 3 |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-04-08 : 04:45:04
|
In case it helps to try to describe a real-world example consider an order. It has Date Order Placed, date Credit Check completed, various shipment dates, and Date Order Closed.User would like to see the dates for Order Placed, Credit Checked and Order Closed on a "record card" style display, and a separate grid/tab for the (part) shipment dates |
|
|
ScottPletcher
Aged Yak Warrior
550 Posts |
Posted - 2015-04-08 : 10:30:06
|
If it's possible to have multiple values, you really can't use columns in the parent table, you need a child table.You're quite correct to be concerned about potential performance issues, however. The key to performance is to cluster the child table on the parent key first, then, optionally, the child key(s). Never cluster the child table on, say, an identity column first. |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-04-08 : 10:41:00
|
The issue is that the developers have put every date relevant to the Parent Table into the Child Table. Many of these only have one instance (which is enforced by rules in the application, so can't have more than one). Hence I wanted to flatten the ones that can only occur once.I started off thinking I could query the OLTP database itself, using a View, so users would be viewing current data. Performance is not good enough, so I'm going to have to warehouse the data.We build warehouse tables overnight using a script that only UpSerts rows that have changed ... but if any date in the child table has changed that constitutes a change, so I'm basically having to compare the "view" against the whole of my flat-table ... and it runs like a dog. I'm in the process of trying my one-pass multi-column update, I'll report back if I find that that is faster. |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-04-08 : 10:50:11
|
I've checked the source databases clustered indexes and, thankfully!, the CHILD table shares the same leading key fields of the PARENT table. Thanks for pointing that out. |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-04-08 : 11:07:33
|
Some figures, so far:Using a VIEW that JOINs the child table multiple times to find the unique date columns:Select all records from remote database into a temporary table. 7,911 records, INSERT INTO ... SELECT elapsed time = 506 seconds!Select just the columns available in the Parent table (into #TEMP table) = < 1 second, then UPDATE the temporary table against the Child Date table (a single pass, but columns updated multiple times) 7,911 rows processed = 1 secondSeems like that is going to be a much more efficient way of "pulling" the data. Sadly none of my existing code than mechanically generates the scripts for this includes such wizardry! |
|
|
|
|
|
|
|