Author |
Topic |
zchase
Starting Member
8 Posts |
Posted - 2015-04-23 : 09:28:42
|
Hi,I am new to SQL but trying to do join a few tables to get result showing showing one row per unique record.Tables include:-1. REQ2. RFQ3. PO4. DOCUMENT (contains LAST_DOCUMENT_STATUS, DOCUMENT_ID, DOCUMENT_NUMBER, for example, REQ_CANCELLED, REQ_ID, REQ_NO)5. DOCUMENT_STATUS (contains status of document, REQ_CREATE)6. DOCUMENT_TRAIL (contains link between documents, PARENT_DOCUMENT, CURRENT_DOCUMENT, for example, REQ_ID (PARENT_DOCUMENT), RFQ_ID (CURRENT_DOCUMENT)7. PO_REVISION (contains PO REVISION, when link with DOCUMENT, PO_REV_NO)Currently when i tried to join all the TABLES, i get multiple lines against REQ_NO.I realised the multiple lines generated due to the following:-One to many relationships:A. RFQ - 1 or more POB. PO - 1 or more PO_REVISONI was thinking how to MAX the records in PO to show only the last PO_REVISION. It seems that DOCUMENT_TRAIL will contain 1 base document PO and 1 or more PO_REVISION.Would appreciate some assistance to teach me on some techniques.Regards, |
|
Kristen
Test
22859 Posts |
Posted - 2015-04-23 : 12:29:06
|
Something like this perhaps?SELECT Col1, Col2, ...FROM(SELECT Col1, Col2, ..., ROW_NUMBER OVER ( PARTITION BY R.SomeID ORDER BY R.SomeID, R.PO_REV_NO DESC ) AS T_RowNumberFROM PO JOIN PO_REVISION AS R ON R.SomeID = PO.SomeIDWHERE ...) AS XWHERE T_RowNumber = 1 -- Include only the first row from PO_REVISIONORDER BY Col1, ... There might be smarter ways of doing this depending on:What version of SQL are you using? |
|
|
zchase
Starting Member
8 Posts |
Posted - 2015-05-01 : 04:48:58
|
Microsoft SQL Server version 2008 R2.What is the smarter ways of doing this. |
|
|
gbritton
Master Smack Fu Yak Hacker
2780 Posts |
Posted - 2015-05-01 : 08:39:42
|
In 2012, you could use the FIRSTVALUE function, but Kristen's solution is good for 2008R2 |
|
|
zchase
Starting Member
8 Posts |
Posted - 2015-05-01 : 22:39:24
|
i was able to get the results with the following script:-- Removed the PO revision but still retain multiple POs reference to each REQ/RFQ.SELECT doc1.document_number AS REQ_NUM, CONVERT(INT, doc2.document_number) AS RFQ_NUM, CONVERT(INT, doc3.document_number) AS PO_NUM, CONVERT(INT, doc4.document_number) AS PO_REV FROM requisition req LEFT JOIN document doc1 ON doc1.document_id = req.document_id LEFT JOIN document_trail dt1 ON dt1.document_id = doc1.document_id LEFT JOIN document_trail dt2 ON dt2.parent_document_id = doc1.document_id AND dt2.document_type_id = 2 LEFT JOIN document doc2 ON doc2.document_id = dt2.document_id LEFT JOIN rfq rf ON rf.document_id = doc2.document_id LEFT JOIN (SELECT * FROM (SELECT dt.document_id, dt.parent_document_id, Row_number() OVER ( partition BY dt.parent_document_id, dt.document_id ORDER BY dt.document_id ASC ) AS T_RowNumber FROM document_trail dt INNER JOIN po p ON p.document_id = dt.document_id WHERE document_type_id = 3) AS temp WHERE temp.t_rownumber = 1) dt3 ON ( dt3.parent_document_id = doc2.document_id OR dt3.parent_document_id = doc1.document_id ) LEFT JOIN document doc3 ON doc3.document_id = dt3.document_id LEFT JOIN po p ON p.document_id = doc3.document_id LEFT JOIN po_revision pov ON pov.document_revision_id = p.document_id AND pov.document_revision_id = p.current_revision_id LEFT JOIN document doc4 ON doc4.document_id = p.current_revision_id LEFT JOIN document_ship dship ON dship.document_id = doc1.document_id LEFT JOIN ship s ON s.ship_id = dship.ship_idWHERE s.hidden = 0 ORDER BY doc1.document_number |
|
|
zchase
Starting Member
8 Posts |
Posted - 2015-05-02 : 02:45:28
|
i was able to get most of the results correct but i still have some records without PO_NUM. |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-05-02 : 03:15:39
|
I haven't checked it carefully but I don't think you can do: Row_number() OVER ( partition BY dt.parent_document_id, dt.document_id ORDER BY dt.document_id ASC but rather you must include the PARTITION columns in the ORDER BY, and then any additional "tie break" column(s) Row_number() OVER ( partition BY dt.parent_document_id, dt.document_id ORDER BY dt.parent_document_id, dt.document_id, dt.document_id ASC Please put[CODE]...[/CODE]tags around your code to preserve the formatting.By the by you can simplifyJOIN( SELECT * FROM ( SELECT ROW_NUMBER() OVER (...) AS T_RowNumber, .... FROM ... ) AS TEMP WHERE TEMP.T_RowNumber = 1) AS DT3ON ( dt3.parent_document_id = doc2.document_id OR dt3.parent_document_id = doc1.document_id ) toJOIN( SELECT ROW_NUMBER() OVER (...) AS T_RowNumber, .... FROM ...) AS DT3ON ( dt3.parent_document_id = doc2.document_id OR dt3.parent_document_id = doc1.document_id ) AND dt3.T_RowNumber = 1 |
|
|
zchase
Starting Member
8 Posts |
Posted - 2015-05-02 : 06:58:25
|
I suspect it may be due to data issue but unable to verify it.[CODE]SELECT doc1.document_number AS REQ_NUM ,CONVERT(INT, doc2.document_number) AS RFQ_NUM ,CONVERT(INT, doc3.document_number) AS PO_NUM ,CONVERT(INT, doc4.document_number) AS PO_REVFROM requisition req LEFT JOIN document doc1 ON doc1.document_id = req.document_id LEFT JOIN document_trail dt1 ON dt1.document_id = doc1.document_id LEFT JOIN document_trail dt2 ON dt2.parent_document_id = doc1.document_id AND dt2.document_type_id = 2 LEFT JOIN document doc2 ON doc2.document_id = dt2.document_id LEFT JOIN rfq rf ON rf.document_id = doc2.document_id LEFT JOIN (SELECT * FROM (SELECT dt.document_id ,dt.parent_document_id ,Row_number() OVER ( partition BY dt.parent_document_id ,dt.document_id ORDER BY dt.parent_document_id, dt.document_id DESC) AS T_RowNumber FROM document_trail dt INNER JOIN po p ON p.document_id = dt.document_id WHERE dt.document_type_id = 3) AS temp ) AS dt3 ON ( dt3.parent_document_id = doc2.document_id OR dt3.parent_document_id = doc1.document_id ) AND dt3.t_rownumber = 1 LEFT JOIN document doc3 ON doc3.document_id = dt3.document_id LEFT JOIN po p ON p.document_id = doc3.document_id LEFT JOIN po_revision pov ON pov.document_revision_id = p.document_id AND pov.document_revision_id = p.current_revision_id LEFT JOIN document doc4 ON doc4.document_id = p.current_revision_id LEFT JOIN document_ship dship ON dship.document_id = doc1.document_id LEFT JOIN ship s ON s.ship_id = dship.ship_idWHERE s.hidden = 0ORDER BY doc1.document_number[/CODE] |
|
|
zchase
Starting Member
8 Posts |
Posted - 2015-05-02 : 07:04:45
|
sample data i got from the above script.Basically everything in order except for first line.I got about 300+ REQ_NUM without PO_NUM.[CODE]REQ_NUM RFQ_NUM PO_NUM PO_REVETASEN0000017 2007008 NULL NULLKENADK0000590 NULL 1700122 0KENAEN0000114 2004704 5007456 0KENAEN0000114 2004704 5007457 2KENAEN0000114 2004704 5007458 1KENAEN0000114 2004704 5007459 1KENAEN0000403 7101818 8102483 0KENAEN0000403 7101818 8102484 0KENAEN0000588 7104470 8104098 0[/CODE] |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-05-02 : 13:55:54
|
Note that I said "and then any additional "tie break" column(s)" but you have left those out. It may not influence whether you get PO_NUM, but without it you will potentially get different results each time you run it. I gave you a full example, you've only copied part of it. It is important that you understand the code, otherwise you are not going to be able to make it work correctly, safely.Same with moving "dt3.t_rownumber = 1" into the JOIN's "ON" clause - you've left part of the original code behind. It won't make any difference to the outcome but it suggests that you either don't understand the code and/or you are happy with sloppy code. I'm not!I can't see how we can solve the rows without PO_NUM issue without some sample data which illustrates the problem. Given the number of tables you have I doubt that will be easy ... |
|
|
zchase
Starting Member
8 Posts |
Posted - 2015-05-03 : 06:31:11
|
Is there a way i can provide sample date for your to check my SQL script?Perhaps maybe you can use this link to generate the data and test it.http://sqlfiddle.com/#!6/d5203/1 |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-05-03 : 08:57:12
|
SQL Fiddle is fine (personally I'd prefer that the Tables where #TEMP otherwise I have to create / use a temporary database etc. etc. so probably fewer people will bother to help you if you use permanent tables)REQ_NUM = ETASEN0000017has no corresponding row(s) in [PO]. There is one corresponding row in DOCUMENT_trail (DOCUMENT_ID=7400034180) which also has document_type_id=3, but no row in [PO] for DOCUMENT_ID=7400034180. Thus PO_NUM displays NULL for that row.Note that DOCUMENT_trail aliased as "dt1" table is unused, so could be removed from the query |
|
|
zchase
Starting Member
8 Posts |
Posted - 2015-05-03 : 10:11:44
|
i am pretty new at this since i just learn SQL in like a week and running through the data model and data dictionary which i am totally not familiar with.I had to work with it since no one else in the IT department wants to help.Even the company who developed the application wants to charge me for the SQL script.For your information, i am suspecting it could be data issue but i can't rule out the incompetency of me not knowing the best way to write a script.Nevertheless, i will submit my results to the company for more investigation work.Thx for your help on the matter since i came on this forum to seek assistance. |
|
|
Kristen
Test
22859 Posts |
Posted - 2015-05-03 : 13:20:23
|
quote: Originally posted by zchase i am suspecting it could be data issue
I suppose you might need to ORDER the sub-query DT3 so that any row in [PO] that DOES have a corresponding [document_id] in [DOCUMENT] table was ordered first. It might be that the first ROW_NUMBER() ... OVER() that you are getting, ordered by:dt.parent_document_id, dt.document_id, dt.document_id DESChappens NOT to have a corresponding row in [DOCUMENT] table (i.e. but there are other joined rows that DO have a corresponding [DOCUMENT] row). (However, maybe you will tell me that THAT should never happen, and all rows should have a [DOCUMENT] record, and thus that in itself would be a "Data Issue"? )What I would suggest you do is to query for missing data - i.e. data that SHOULD be there but is missing.I had a quick look at your query but I don't have enough time to manipulate it into an "exceptions" report.Basically what you are looking for is:SELECT req.DOCUMENT_IDFROM (Your Query)WHERE NOT EXISTS (a row with doc3.document_id IS NOT NULL)In this example that would be ETASEN0000017Then feed that list of IDs back into the query and display ALL rows (not just the ROW_NUMBER() OVER() #1 first row) - so that you can see if there IS anything in doc3 that SHOULD have been included in your report.If you find anything you need to improve your query (to include it) ...... if you find NOTHING then I agree that it looks like a Data Issue - and your "Exception Report" can then be used to find any/all the other instances where the data is goofy. |
|
|
|