Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
ajaygoel
Starting Member
2 Posts |
Posted - 2006-10-20 : 03:25:15
|
| Hello,I'm in charge of an email application, and we currently store a user's email data in 4 tables:EmailGroupsEmailGroupFieldsEmailsEmailGroupFieldValuesTogether, these four tables allow a user to create an "Email Group". They pick a name for their "Group" (record inserted into "EmailGroups" table), add fields to the Group (like FirstName/LastName/Company)...for each field, a record gets inserted into "EmailGroupFields". Then, when they upload their data into the Email Group, the Emails and EmailGroupFieldValues tables gets filled up. "Emails" contains just the e-mail addresses while "EmailGroupFieldValues " contains the data for FirstName/LastName etc. or whatever fields the user has created.When a user wants to retrieve the data for his Email Group, and this happens a lot throughout our application, we generate a monstrous query customized for the user's Email Group and then execute the query. The query builds a join for each "field" in the user's Email Group. For an Email Group with 10 fields, this is a sample query that would be generated and then executed:select [e1].emailaddress as [emailaddress], [v1Newsletter].EmailGroupFieldValue_data as [Newsletter], [v1PR].EmailGroupFieldValue_data as [PR], [v1FirstName].EmailGroupFieldValue_data as [FirstName], [v1LastName].EmailGroupFieldValue_data as [LastName], [v1Company].EmailGroupFieldValue_data as [Company], [v1DOB].EmailGroupFieldValue_data as [DOB], [v1LastPurchase].EmailGroupFieldValue_data as [LastPurchase], [v1LastVisited].EmailGroupFieldValue_data as [LastVisited], [v1Address].EmailGroupFieldValue_data as [Address], [v1City].EmailGroupFieldValue_data as [City], [v1State].EmailGroupFieldValue_data as [State], [e1].submission_date as [submission_date]from [Emails] as [e1] left outer join EmailGroupFieldValues as [v1Newsletter] on [e1].email_id = [v1Newsletter].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1PR] on [e1].email_id = [v1PR].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1FirstName] on [e1].email_id = [v1FirstName].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1LastName] on [e1].email_id = [v1LastName].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1Company] on [e1].email_id = [v1Company].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1DOB] on [e1].email_id = [v1DOB].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1LastPurchase] on [e1].email_id = [v1LastPurchase].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1LastVisited] on [e1].email_id = [v1LastVisited].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1Address] on [e1].email_id = [v1Address].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1City] on [e1].email_id = [v1City].EmailGroupFieldValue_emailid left outer join EmailGroupFieldValues as [v1State] on [e1].email_id = [v1State].EmailGroupFieldValue_emailidwhere [e1].EmailGroup_id = 8703099 and [v1Newsletter].EmailGroupFieldValue_fieldid = 1764596 and [v1PR].EmailGroupFieldValue_fieldid = 1764608 and [v1FirstName].EmailGroupFieldValue_fieldid = 1764536 and [v1LastName].EmailGroupFieldValue_fieldid = 1764548 and [v1Company].EmailGroupFieldValue_fieldid = 1764560 and [v1DOB].EmailGroupFieldValue_fieldid = 1764572 and [v1LastPurchase].EmailGroupFieldValue_fieldid = 1764584 and [v1LastVisited].EmailGroupFieldValue_fieldid = 1764620 and [v1Address].EmailGroupFieldValue_fieldid = 1764632 and [v1City].EmailGroupFieldValue_fieldid = 1764644 and [v1State].EmailGroupFieldValue_fieldid = 1764656This is a cumbersome and CPU/memory intensive query, and if a user wants 50+ fields in his Email Group, this query results in a memory error. Even when the Group has say, 2 members in it. The other problem is that our Emails table now has 500 million rows in it. When a customer wants to delete an "Email Group" or just delete all the data in an existing "Email Group", these operations take too long.I was wondering if it might solve a lot of my problems if I switched to a different model where we create a separate physical table in the database for each "Email Group". There are about 30,000 E-mail Groups in our app right now -- all the data is currently stored in those four tables refernced above.Should I convert this data to 30,000 different tables? I'm thinking that the following problems would be solved then:1. When user deletes all Group members, can just truncate table, and it will go very fast.2. When user deletes Group, just drop table, and it will go very fast.3. Can expand the max number of fields allowed in a Group to 255.4. Getting back the data in a "Email Group" will be as simple and fast as doing a "select * from table"Are there any downsides to this approach? Is having 30,000 tables going to kill the I/O on my disk system?Switching to this data model is going to take a significant effort on our part, so I wanted to seek the advice of some experts before moving forward with it!Thanks in advance,Ajay Goel |
|
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2006-10-20 : 06:24:13
|
| What indices are on these tables?Going down the route of 30000 tables is a logistical madness.What code are you using to update/deelte groups?There must be a solution that performs better. |
 |
|
|
ajaygoel
Starting Member
2 Posts |
Posted - 2006-10-20 : 06:47:11
|
There are lots of indices on the tables, and I've had a few DBAs look them over, and the indices I believe are all correct and doing the best job they can.I have a careful plan layed out of how to transfer the data to the individual tables, so I'm not worried about the logistical madness.The code to delete Groups and Group members are simple DELETE statements.quote: Originally posted by AndrewMurphy What indices are on these tables?Going down the route of 30000 tables is a logistical madness.What code are you using to update/deelte groups?There must be a solution that performs better.
|
 |
|
|
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2006-10-20 : 07:35:29
|
quote: I'm in charge of an email application......and I've had a few DBAs look them over...
So if you didn't build the application and you are not the DBA, what are you?I'll give you 3 letters to ponder... EAVDavidMProduction is just another testing cycle |
 |
|
|
mr_mist
Grunnio
1870 Posts |
Posted - 2006-10-20 : 07:51:13
|
| Wow. I hope you don't plan on using Enterprise Manager to look at your database with 30000 tables in it. You're opening yourself up to a world of pain when it comes to joining these tables, with dynamic-sql-arama. Also, you'd need (I expect) to have a lookup table, which would be subject to massive contention.-------Moo. :) |
 |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-10-20 : 08:39:48
|
| You should look at a common data model that could use a simple design for the email. Your current EAV design is a really bad idea. The 30,000 tables would be a whole lot worse.Can't you just analyze the data you already have to find what is really necessary for an email application? I think if you really put some effort into modeling a relational structure, you could come up with a much simpler solution than either your current bad design or your proposed new bad design.CODO ERGO SUM |
 |
|
|
LoztInSpace
Aged Yak Warrior
940 Posts |
Posted - 2006-10-21 : 02:08:55
|
| This is a database designed by a java programmer! You're fucked. As MVJ says - start again. |
 |
|
|
|
|
|
|
|