Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 How would you approach this...

Author  Topic 

cDc
Starting Member

30 Posts

Posted - 2003-01-15 : 03:41:21
I have a table of customer data this contains usual stuff like email address,password,username and has a unique ID. The problem I need to address is that one person may have many records (i.e. multiple accounts) and each account may have associated data in other tables (such as a login history table - that sort of thing).

What I would like to do is prune this customer data and merge accounts into one where multiple records refer to one person - at first I was thinking of writing some sort of application and get someone to manually go through and change data where neccessary however we are talking +200,000 rows and that does not seem too practical. I am wondering if there is a way to "find" a particular ID across all user tables that refer to my customer table (field name should be the same) and in one hit update all records?

any advice appreciated!
ta!!!

Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2003-01-15 : 07:21:19
You've got two problems. A design issue and a data conversion issue. Which do you want help with? If its the latter, you need to supply your before and after DDL and some sample data.

Jay White
{0}
Go to Top of Page

Crespo24
Village Idiot

144 Posts

Posted - 2003-01-15 : 11:08:14
quote:

I have a table of customer data this contains usual stuff like email address,password,username and has a unique ID. The problem I need to address is that one person may have many records (i.e. multiple accounts) and each account may have associated data in other tables (such as a login history table - that sort of thing).

What I would like to do is prune this customer data and merge accounts into one where multiple records refer to one person - at first I was thinking of writing some sort of application and get someone to manually go through and change data where neccessary however we are talking +200,000 rows and that does not seem too practical. I am wondering if there is a way to "find" a particular ID across all user tables that refer to my customer table (field name should be the same) and in one hit update all records?

any advice appreciated!
ta!!!





OK.. I am not sure I understand your problem completely but here is some advice.
If I was you, I would not have multiple records in the person table.
You should keep your unique members in a separate table and have their account history in a separate table.
I am not sure what your data looks like so I can not help you any further, but like Page47 said, a DDL would be useful here.

Regards.


Go to Top of Page

cDc
Starting Member

30 Posts

Posted - 2003-01-15 : 14:15:41
sorry this better explains what I need to do, more of a data question than design really
currently one customer may have many records (50 or more!!! - dont ask how it has got like this my task is to fix it :-) )

customerid,email,username,password
1,'user@sqlteam.com','username1','password1'
38,'user@sqlteam.com','username2','password2'
6087,'user@sqlteam.com','username3','password3'
45621,'user@sqlteam.com','username4','password4'
100658,'user@sqlteam.com','username5','password5'

I need to clean the data up to look like

100658,'user@sqlteam.com','username5','password5'
& delete (or maybe archive) other records

so for the customer user@sqlteam.com my History data may look something like this

id,customerid,action
1,1,'updated address on: 19-12-2002'
2,38,'updated password: 4-12-2002'
3,6087,'updated password on: 13-12-2002'
4,45621,'updated password: 10-12-2002'
5,100658,'updated username on: 11-12-2002'

after "fixing" should look like

1,100658,'updated address on: 19-12-2002'
2,100658,'updated password: 4-12-2002'
3,100658,'updated password on: 13-12-2002'
4,100658,'updated password: 10-12-2002'
5,100658,'updated username on: 11-12-2002'


thanks

Go to Top of Page

setbasedisthetruepath
Used SQL Salesman

992 Posts

Posted - 2003-01-15 : 14:32:08
Close ... and this is more of a personal gripe than anything else ...

DDL is NOT someone's particular interpretation of table structure. It is the actual CREATE TABLE statements. Not a comma separated list. Not a pipe-delimited list. Not one column per line. Nothing else, nothing different.

Why do we ask for this? BECAUSE WE HOPE TO RUN YOUR DDL AGAINST OUR SQL SERVERS TO RECREATE YOUR ENVIRONMENT SO WE CAN HELP YOU.

Sample data is the same way. Why not just provide real, bona fide DML? Why make everyone retype it? Why provide for the possibility that someone will misunderstand, or introduce a typo?? Why???

Jonathan
{0}

Edited by - setbasedisthetruepath on 01/15/2003 14:32:57
Go to Top of Page

Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2003-01-15 : 14:34:21
quote:
I am wondering if there is a way to "find" a particular ID across all user tables that refer to my customer table (field name should be the same) and in one hit update all records?
Yeah, an UPDATE statement.


update users
set customerid = maxcustomerid
from users u
inner join (
select
email,
max(customerid) as maxcustomerid
from users
group by email ) u2
on u.email = u2.email

 
Something along those lines.

Now this is a great one for the surrogate key lovers out there. When you can't seem to normalize a table, throw a surrogate key at it. When somebody screws that up, just throw another surrogate key on top of that!!



Jay White
{0}
Go to Top of Page

cDc
Starting Member

30 Posts

Posted - 2003-01-15 : 14:56:19
I can appreciate why the actual structure is more useful however my actual data is a lot more complex and I was trying to simplfy the problem plus the field names reveal information i do not wish to post of a public forum plus my experience of this forum and indeed sql server is extremely limited and I am not clear how to find a DDL and DML but maybe i will try the search feature once i check my abbreviations for dummies handbook!. so sorry about that.

page47: thanks that makes a good start!

Go to Top of Page

Crespo24
Village Idiot

144 Posts

Posted - 2003-01-16 : 05:10:55
Hi cDc,

What you are trying to achieve is quite easy I think.

Firstly, like Page47 suggested.. clean up your person table by keeping ONE record for each person... I don't know how you want to do it, but do you want to keep the one with the higest ID?

I have a lovely little script which might be useful for you (as in clean up your table with duplicate person details)...if you want it please let me know and I will post it here.

Regards.


Go to Top of Page

afterburn
Starting Member

28 Posts

Posted - 2003-01-16 : 21:00:02
He is going to have serveral issues. What He needs to do is create a hash table of all the customer ids and actions with any unique data he needs


CREATE TABLE [dbo].[tblCustomerActionXRef] (
[ActionDate] [datetime] NOT NULL ,
[customerID] [int] (36) NOT NULL ,
[ActionID] int not null
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[tblCustomerActionXRef] WITH NOCHECK ADD
CONSTRAINT [DF_tblEmails_ActionDate] DEFAULT (getdate()) FOR [ActionDate]
GO


This will allow you to create another table with action name for friendly name and a query like this

update tblUsers set email = lower(email)
Go
insert into tblCustomerActionXRef(CustomerID,ActionID)
select UserID,0 from tblUsers inner join (Select Email from tblUsers) as M where M.Email = tblUsers.Email

You may need a bit more of a query but this is the jist

here is the delete


delete from tblUsers where UserID< (Select Max(UserID)
from tblUsers M where M.Email = tblUsers.Email)



Edited some items
Edited by - afterburn on 01/16/2003 21:00:49

Edited by - afterburn on 01/16/2003 21:01:57

Script timed out while posting

Edited by - afterburn on 01/16/2003 21:16:59
Go to Top of Page
   

- Advertisement -