Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Consolidating Duplicate Rows

Author  Topic 

Jeepaholic
Starting Member

36 Posts

Posted - 2003-09-09 : 19:20:52
Hi there. Done some searching, but I haven't quite nailed this one yet. I've got a shopping cart table where I need to consolidate (sum) the quantities of duplicate product entries.

For example...

Table Columns:
USER
PRODUCT
QUANTITY

Table Rows:
Al, 1, 5
Al, 2, 3
Bob, 3, 1
Joe, 3, 2
Bob, 1, 1
Al, 2, 1
Bob, 3, 1
Al, 2, 2

So...I need to do the following:
1) Search for duplicates of USER/PRODUCT
2) Sum the quantity of all matches for each pair combination
3) Insert the new row with the summed quantity
4) Delete the old rows

What's the best way to do this? Of course, I'm open to changes in my structure (#1-#4) above. Looking for the best solution, however it's done.

The final result-set should look like this:
Al, 1, 5
Al, 2, 5 <-- Sum of 3, 1, and 2
Bob, 3, 2 <-- Sum of 1 and 1
Joe, 3, 2
Bob, 1, 1

I've contemplated some methods, so I'll post thoughts here:
1) Select and group, based on USER/PRODUCT
2) Store in temp table
3) Cursor through the temp table, creating a sum of all values for each pair combination.
4) Delete all duplicate records.
5) Add new record with summed total.

Any better ways? Thanks for any help!

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2003-09-09 : 19:35:19
Why bother deleting the duplicate records? Just query the data:

SELECT [USER], PRODUCT, SUM(QUANTITY)
FROM Table1
GROUP BY [USER], PRODUCT

Tara
Go to Top of Page

Jeepaholic
Starting Member

36 Posts

Posted - 2003-09-09 : 20:48:06
Really, it's an efficiency thing. This data is queried fairly regularly. Therefore, a more complicated query (with the SUM and GROUP BY) would hurt performance more than a very basic query (in the long run).

This consolidation needs to happen after two shopping carts are merged (i.e. When a guest user with stuff in their cart logs into their personal account with stuff in that cart as well). The guest cart is updated to be a part of the user cart, and would then be consolidated. This consolidation query would be run much less frequently.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2003-09-10 : 11:46:22
What you have proposed in your initial thread will not be more efficient than the query that I have given.

Tara
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-09-10 : 12:05:57
SQL is designed to take detailed tables like what you mention and return results in any manner you like, in most cases summarized to some extent. That is why SELECT and GROUP BY exist.

Changing the data in your table will cause problems if anyone is concurrently in the table, plus you have no history of the original transactions. Does this table have a primary key? You will NEVER want to the see the original history of the original shopping carts before they are merged?

YOu can also always create a summary/reporting table that you can update now and then that contains only summarized transactions if you like.

How many rows in these tables are we talking about?

- Jeff
Go to Top of Page

Jeepaholic
Starting Member

36 Posts

Posted - 2003-09-10 : 21:46:43
Hi folks. I appreciate the answers...

I understand that SQL is optimized to handle statements such as SUM and GROUP BY. However, that doesn't mean they will perform as well as a base SELECT statement without these extra commands. This query (not the consolidation one) will be performed on nearly every page load, for thousands of users. So, it's very important that each query is optimized to it's fullest...which is why I'd rather have the data consolidated during an infrequent query so that the frequent queries are simple and quick.

Jeff, to answer some of your direct questions:

* I do not see the need to have the original history of the original carts before merging. Please shed some light on why I would?

* The transaction would occur all in one Stored Procedure. The carts are merged, and duplicates are consolidated. During this, isn't the table locked anyway?

* Having a separate table that's updated occasionally doesn't do any good, unless I'm missing your point. I'm not concerned about reporting, I'm concered about data shown directly to the end user. This is a shopping cart system that needs accuracy on the fly.

* The number of rows in the table are limited only by the number of thousands of users who happen to have items still in their shopping carts (not bought yet). The table will be fairly large, but pruned of stale data on occasion.

Thanks for your thoughts,
Al
Go to Top of Page
   

- Advertisement -