Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 from clause---help!

Author  Topic 

gaucho
Starting Member

2 Posts

Posted - 2002-03-08 : 06:57:07
hi,

To what extent does the From clause affect an SQL query??? I mean i hv a query as follows:

SELECT DISTINCT DOCUMENT.document_code,DOCUMENT.version,DOCUMENT.name_doc,
DOCUMENT.layout_template,DOC_VERSION.reserved_by_cod,STATE.state FROM
DOC_PRIVILEGES,DOCUMENT,PARTICIPANTS_ML,TEAM_WORK, DOC_VERSION,STATE,STATE_DOC,
CLASS,LAYOUT,ORGANIZATION_DOC,ORG_DOC,ORIGIN,DOCUMENT_TYPE,KEYWORDS WHERE
DOCUMENT.document_code ='NEW123DOC' and DOCUMENT.version is not null and DOCUMENT.name_doc is not null

Now, exceuting this even in the Query Analyzer hangs or stays in busy mode... I remove from of the table names and the query runs fine..

I want to know how the tables affect the query and whats the remedy for it??

Another problem is that even though there exists only 1 record with the code name 'NEW123DOC' , the records appears multiple times.. how can i avoid this??

Thnx a lot in adv,

sands

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2002-03-08 : 07:21:48
I think you're probably misunderstanding how joins work in SQL.

SQL requires you to give a join condition for the tables in the query -- it won't automatically join using the names of the columns in the tables, or on the foreign key relationships. If you think of a join as a filtered cartesian product (that's not how it gets implemented most of the time, but the result is the same), then the effect of not giving a join condition is no filtering (i.e. the same as CROSS JOIN). For example, if you say:

SELECT table1.col1, table2.col2 FROM table1, table2

and there are 100 rows in table1 and 50 in table2, you will get 100*50 = 5000 rows in the result.

In your case, your result is from DOCUMENT, DOC_VERSION and STATE, so you will have as many rows as the filtered set from DOCUMENT (1) * all from DOC_VERSION * all from STATE.
In addition, SQL Server will have to check that each of the other tables contains at least one row -- old versions of SQL Server may even have to cross-join those tables too and then reaggregate the identical rows. [see note 1]

If you take out the DISTINCT, and get an estimated query plan, you'll see huge numbers for the number of estimated rows -- those numbers is likely to be pretty accurate for a query with simple WHERE conditions like this one.

So at the least, you're going to need join conditions between DOCUMENT, DOC_VERSION and STATE. If the other tables are irrelevant to the result, don't include them in the query!

[note 1] I didn't test this well enough: even on SQL Server 2000, it doesn't optimize this well. Hardly surprising, really.


Edited by - Arnold Fribble on 03/08/2002 07:42:15
Go to Top of Page

Lou
Yak Posting Veteran

59 Posts

Posted - 2002-03-09 : 16:01:03
Building on what Arnold said. The old sql syntax specified joins like this:

select ..
from tableA,tableB
where tableA.id = tableB.id

This for all practical purposes, meant that sql will create a cartesian product of tableA on TableB, then it will filter the output using the where condition. Not very efficient.

The new sql standard syntax for joins is:

select ..
from tableA JOIN tableB
ON tableA.id=tableB.id

Behind the scenes, sql is optimizing the JOIN such that it does not create a cartesian product and is much faster.

My little mnemonic for sql syntax/rule of order is this.
1. SQL builds the FROM clause (including JOINS) and filters using WHERE clause
2. SQL then groups by the GROUP BY clause and filters using the HAVING clause
3. SQL then builds the SELECT clause and filters for duplicate rows if you have the DISTINCT operator

HTH

Go to Top of Page
   

- Advertisement -