Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
gaucho
Starting Member
2 Posts |
Posted - 2002-03-08 : 06:57:07
|
| hi,To what extent does the From clause affect an SQL query??? I mean i hv a query as follows:SELECT DISTINCT DOCUMENT.document_code,DOCUMENT.version,DOCUMENT.name_doc, DOCUMENT.layout_template,DOC_VERSION.reserved_by_cod,STATE.state FROM DOC_PRIVILEGES,DOCUMENT,PARTICIPANTS_ML,TEAM_WORK, DOC_VERSION,STATE,STATE_DOC,CLASS,LAYOUT,ORGANIZATION_DOC,ORG_DOC,ORIGIN,DOCUMENT_TYPE,KEYWORDS WHERE DOCUMENT.document_code ='NEW123DOC' and DOCUMENT.version is not null and DOCUMENT.name_doc is not nullNow, exceuting this even in the Query Analyzer hangs or stays in busy mode... I remove from of the table names and the query runs fine..I want to know how the tables affect the query and whats the remedy for it??Another problem is that even though there exists only 1 record with the code name 'NEW123DOC' , the records appears multiple times.. how can i avoid this??Thnx a lot in adv,sands |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2002-03-08 : 07:21:48
|
| I think you're probably misunderstanding how joins work in SQL.SQL requires you to give a join condition for the tables in the query -- it won't automatically join using the names of the columns in the tables, or on the foreign key relationships. If you think of a join as a filtered cartesian product (that's not how it gets implemented most of the time, but the result is the same), then the effect of not giving a join condition is no filtering (i.e. the same as CROSS JOIN). For example, if you say:SELECT table1.col1, table2.col2 FROM table1, table2and there are 100 rows in table1 and 50 in table2, you will get 100*50 = 5000 rows in the result.In your case, your result is from DOCUMENT, DOC_VERSION and STATE, so you will have as many rows as the filtered set from DOCUMENT (1) * all from DOC_VERSION * all from STATE.In addition, SQL Server will have to check that each of the other tables contains at least one row -- old versions of SQL Server may even have to cross-join those tables too and then reaggregate the identical rows. [see note 1]If you take out the DISTINCT, and get an estimated query plan, you'll see huge numbers for the number of estimated rows -- those numbers is likely to be pretty accurate for a query with simple WHERE conditions like this one.So at the least, you're going to need join conditions between DOCUMENT, DOC_VERSION and STATE. If the other tables are irrelevant to the result, don't include them in the query![note 1] I didn't test this well enough: even on SQL Server 2000, it doesn't optimize this well. Hardly surprising, really.Edited by - Arnold Fribble on 03/08/2002 07:42:15 |
 |
|
|
Lou
Yak Posting Veteran
59 Posts |
Posted - 2002-03-09 : 16:01:03
|
| Building on what Arnold said. The old sql syntax specified joins like this:select ..from tableA,tableBwhere tableA.id = tableB.idThis for all practical purposes, meant that sql will create a cartesian product of tableA on TableB, then it will filter the output using the where condition. Not very efficient.The new sql standard syntax for joins is:select ..from tableA JOIN tableBON tableA.id=tableB.idBehind the scenes, sql is optimizing the JOIN such that it does not create a cartesian product and is much faster.My little mnemonic for sql syntax/rule of order is this. 1. SQL builds the FROM clause (including JOINS) and filters using WHERE clause2. SQL then groups by the GROUP BY clause and filters using the HAVING clause3. SQL then builds the SELECT clause and filters for duplicate rows if you have the DISTINCT operatorHTH |
 |
|
|
|
|
|
|
|