Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
Tim
Starting Member
392 Posts |
Posted - 2002-12-11 : 01:26:28
|
| when selecting all records and joining multiple tables my query's execution plan normally shows index or table scan for the first table and then index seeks for the others, with nested loops for each join.when I add a group by clause I get "Bookmark Lookups" in the execution plan that add to the cost."is it normal doc? or am i dying?"what are bookmark lookups and can they be got rid of? |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-12-11 : 09:00:51
|
Ummmmmm, don't use GROUP BY? From Books Online:quote: Bookmark LookupThe Bookmark Lookup logical and physical operator uses a bookmark (row ID or clustering key) to look up the corresponding row in the table or clustered index. The Argument column contains the bookmark label used to look up the row in the table or clustered index. The Argument column also contains the name of the table or clustered index in which the row is looked up. If the WITH PREFETCH clause appears in the Argument column, then the query processor has determined that it is optimal to use asynchronous prefetching (read-ahead) when looking up bookmarks in the table or clustered index.
You can add your own JOIN hints to the query and see if they improve performance.If you're grouping on a joined column, try putting another table's column in the GROUP BY instead, it might use that table's indexes instead. |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2002-12-11 : 09:24:03
|
| Chances are you are doing many more joins than you need to, as well.Consider the following tables:Customers: *CustomerID , NameOrders: *CustomerID, *OrderID, Date, Amount(*indicates key fields for each table)Suppose you want a result that gives you customer name and their total order amount.So you say:SELECT Name, SUM(Amount) as TotalAmountFROM Customers C INNER JOIN Orders OON C.CustomerID = O.OrderIDGROUP BY C.NamePretty easy, right? Except -- there is a more efficient way to do the above query. In the above, you are joining every line in the "Orders" table to the "Customer" table -- if you have 10,000 orders and 10 Customers, you are performing 10,000 joins to the Customer table.Consider this:SELECT C.Name, O.TotalAmountFROM Customers CINNER JOIN (SELECT CustomerID, SUM(Amount) as TotalAmount FROM Orders GROUP BY CustomerID) OON C.CustomerID = O.CustomerIDThe above query performs easily rolls up the Orders table from 10,000 into 10 records and THEN peforms only 10 joins, because now it doesn't have to do a JOIN on every record. Now, SQL may optimise the first query to perform the same as the second, I'm not 100% sure, but I do know that in general (not just with SQL server but with SQL in general) the second query is the most efficient way to write the above.I only point this out because a) the first query is what 99% of people write and b) it sounds like you may have something similiar, because you mentioned you had lots of joins and then just added GROUP BY's and noticed the performance suffered.So, my point is (finally!) :Just adding GROUP BY's without re-working the a query that includes many joins probably will always degrade performance because you are now forcing SQL to sort and GROUP in ways in which the optimizer wasn't doing it before. Before, it concentrated on efficient joins. But now, it has to deal with GROUPING as well.- Jeff |
 |
|
|
Tim
Starting Member
392 Posts |
Posted - 2002-12-11 : 18:25:35
|
| Jeff, I like your idea while the derived table has a very small number of records. but...The derived rows still have to be joined to the first table, and the derived table supplies no index for this.So at some point there must be a limit where you lose the efficiency gained in the "non join rollup" to the process of "join the rolled up records".Do I have it right?If so, then I have to make a call about which technique to use, based on how many rows I forsee being the result of the rollup. |
 |
|
|
simondeutsch
Aged Yak Warrior
547 Posts |
Posted - 2002-12-11 : 21:24:05
|
| Tim, I just tried both types of queries,and the derived table one actually did much better.Customers table(100,000 rows), Orders table (450,000) rowsand the first query using ordinary joins had 61.44 of the cost, query takes 1.36 mins to complete, the derived table one had 38.56, took 17 seconds. There was no WHERE clause in either query, and there were approx. 45,000 rows returned.Sarah Berger MCSDOooopsss.... I just ran DBCC DropCleanBuffers and reran query 2. Took 52 secs.Edited by - simondeutsch on 12/11/2002 21:26:31 |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-12-11 : 21:27:05
|
Ahhhh, but did you DBCC DROPCLEANBUFFERS between the two runs? Also do a DBCC FREEPROCCCACHE too.Never mind... Edited by - robvolk on 12/11/2002 21:28:02 |
 |
|
|
Tim
Starting Member
392 Posts |
Posted - 2002-12-11 : 22:08:42
|
sarah thanks for that... it helps ... and saves me buggerising around doing it myself ! Just to be sure I understand, your orders table must have 45,000 distinct customers in it and 55,000 of your customers don't have orders. right?(I would be bothered if I were the boss of that virtual business ! )Maybe when I finally receive my copy of the guru's guide next week it will have the solution! I will just open the TOC and look up "Tims problem #22307" and flip straight to the answer. (note the # references SQL team TOPIC_ID for convenience, as all good books should )I have BIG expectations of this book after all the hype generated by "the pimp"Edited by - tim on 12/11/2002 22:16:20 |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2002-12-11 : 22:11:37
|
| whew! it sounds like i had at least SOME idea of what I was talking about!- Jeff |
 |
|
|
Tim
Starting Member
392 Posts |
Posted - 2002-12-11 : 22:18:53
|
always a relief eh jeff ! |
 |
|
|
simondeutsch
Aged Yak Warrior
547 Posts |
Posted - 2002-12-12 : 11:52:10
|
| Hi, Tim!After Rob's pimping, I got the book too. You will not be disappointed, even if you don't find "Tim's Problem #22307" in the index.And never mind about that data: It's junk. Not a production db. Otherwise, marketing would be all be under mob contracts.Sarah Berger MCSD |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2002-12-13 : 12:14:10
|
I'm not convinced.DROP TABLE OrdersDROP TABLE CustomersCREATE TABLE Customers (CustomerID int PRIMARY KEY, "Name" varchar(50) NOT NULL)CREATE TABLE Orders (CustomerID int NOT NULL REFERENCES Customers(CustomerID), OrderID int NOT NULL, "Date" datetime NOT NULL, Amount money NOT NULL, PRIMARY KEY (CustomerID, OrderID))INSERT INTO CustomersSELECT n, CAST(NEWID() AS varchar(50))FROM NumbersWHERE n < 10INSERT INTO OrdersSELECT N1.n, N2.n, DATEADD(s, 86400*CAST(N2.n AS float) + 43200*RAND(CAST(NEWID() AS binary(4))), '20000101'), RAND(CAST(NEWID() AS binary(4)))*1000.0FROM Numbers AS N1, Numbers AS N2WHERE N1.n < 10 AND N2.n < 1000 The test queries:-- ASELECT Name, SUM(Amount) as TotalAmountFROM Customers CINNER JOIN Orders OON C.CustomerID = O.CustomerID-- I presume Jeff didn't really mean C.CustomerID = O.OrderIDGROUP BY C.Name-- BSELECT C.Name, O.TotalAmountFROM Customers CINNER JOIN (SELECT CustomerID, SUM(Amount) as TotalAmount FROM Orders GROUP BY CustomerID) OON C.CustomerID = O.CustomerID The first query has the same execution plan as the second, except that it then sorts on Customers."Name" and stream aggs to finishes the SUM. But that's to be expected. Unless there's a UNIQUE constraint on Customers."Name" these are different queries.What's less good is that adding in the UNIQUE constraint makes no difference. SQL Server 2000 can't figure out that because both CustomerID and "Name" are candidate keys for Customers, there's no need to do a second aggregation.What if we'd grouped on CustomerID?-- CSELECT Name, SUM(Amount) as TotalAmountFROM Customers CINNER JOIN Orders OON C.CustomerID = O.CustomerIDGROUP BY C.CustomerID, C.Name Now this is equivalent to the second query (B) above, irrespective of the UNIQUE constraint. And it has the same execution plan.Edited by - Arnold Fribble on 12/13/2002 12:20:50 |
 |
|
|
|
|
|
|
|