| Author |
Topic |
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2006-11-29 : 10:36:12
|
| Hi, I'm working on a messaging feature for a website. The site will have tens of thousands of users who will potentially be sending messages each day. This will amount to millions of messages.

So at which point is this going to overload SQL Server? There shouldn't be too much searching on the table. Mainly just: get all messages for a given user.

Cheers, XF. |
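For context, a minimal sketch of the kind of table being described - all names and types here are assumptions, not from the thread, and it assumes SQL Server 2005. The point is that the dominant query ("all messages for a given user") should be a cheap index seek even at millions of rows:

```sql
-- Hypothetical messages table; names/types are assumptions.
CREATE TABLE dbo.Messages
(
    MessageID   INT IDENTITY(1,1) NOT NULL,
    SenderID    INT            NOT NULL,
    RecipientID INT            NOT NULL,
    SentAt      DATETIME       NOT NULL DEFAULT GETDATE(),
    Subject     NVARCHAR(200)  NOT NULL,
    Body        NVARCHAR(MAX)  NOT NULL,
    CONSTRAINT PK_Messages PRIMARY KEY (MessageID)
);

-- "Get all messages for a given user" seeks on this index rather than
-- scanning the table, regardless of total row count.
CREATE INDEX IX_Messages_Recipient
    ON dbo.Messages (RecipientID, SentAt DESC);
```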
|
|
Kristen
Test
22859 Posts |
Posted - 2006-11-29 : 12:27:54
|
| Millions and tens-of-millions of rows is fine, but you have moved into the realms of needing a properly managed, designed, and optimised system for it to run effectively.

I've today detected and optimised an SProc that was taking 10 seconds to run and being called 2,000 times a day. It's now running in <25ms - programming error in the first place.

I've also got to work today on an SProc that is taking 500ms to run. Not too bad, you may think, but it's being called 200,000 times a day for a cumulative total of some 100,000 elapsed seconds a day, and clearly any improvement there will make an impact overall.

So ... the bigger the database, the more you will need people with the skills to do those types of jobs.

Kristen |
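A sketch of how such procedures can be hunted down, assuming SQL Server 2005 (its DMVs report cumulative timings; on SQL 2000 you would use Profiler traces instead). The query below is illustrative, not from the thread:

```sql
-- Find statements with the largest cumulative elapsed time since the plan
-- cache was last cleared. A 500ms proc called 200,000 times a day can
-- easily outrank a 10-second proc called a handful of times.
SELECT TOP 10
    st.text                                            AS statement_text,
    qs.execution_count,
    qs.total_elapsed_time / 1000                       AS total_elapsed_ms,
    qs.total_elapsed_time / qs.execution_count / 1000  AS avg_elapsed_ms
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;
```

Ordering by cumulative rather than average time is the design point here: it surfaces the procedures whose optimisation actually moves the daily total.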
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-29 : 16:30:16
|
| Great work! From just about 6 hours per day to under 10 minutes.

Peter Larsson
Helsingborg, Sweden |
 |
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2006-11-29 : 17:03:23
|
You'll need some sort of purging/archiving strategy though. Tables that grow without bound will kill performance eventually. You can't grow by 10m rows a day and expect sproc calls to stay snappy for long.

SqlSpec: a fast and comprehensive data dictionary generator for SQL Server 2000/2005, Analysis Server 2005, Access 97/2000/XP/2003 http://www.elsasoft.org
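One common shape for such a purging strategy - copy old rows to an archive table and delete them in small batches, so each transaction stays short and the log doesn't balloon. A rough sketch; the table and column names are assumptions carried over from nothing in the thread:

```sql
-- Hypothetical batched archive-and-purge; names are assumptions.
DECLARE @cutoff DATETIME, @moved INT;
SET @cutoff = DATEADD(YEAR, -1, GETDATE());
SET @moved = 1;

WHILE @moved > 0
BEGIN
    BEGIN TRANSACTION;

    -- Copy the next batch of old rows into the archive table.
    INSERT INTO dbo.MessagesArchive (MessageID, SenderID, RecipientID, SentAt, Subject, Body)
    SELECT TOP 5000 MessageID, SenderID, RecipientID, SentAt, Subject, Body
    FROM dbo.Messages
    WHERE SentAt < @cutoff
    ORDER BY MessageID;

    -- Remove from the live table whatever has been archived.
    DELETE dbo.Messages
    FROM dbo.Messages AS m
    JOIN dbo.MessagesArchive AS a ON a.MessageID = m.MessageID;

    SET @moved = @@ROWCOUNT;
    COMMIT TRANSACTION;
END;
```

Batch size is a tuning knob: big enough to finish the purge in reasonable time, small enough that the delete doesn't escalate to a long-held table lock.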
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-11-29 : 17:06:12
|
And the messages have to be stored in normalized tables with optimal datatypes for efficient retrieval and inserts.

> "There shouldn't be too much searching on the table"

If there is any searching going on, you need appropriate indexes.

> "The site will have..."

Promote it here - where is it?

rockmoose
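To make the indexing point concrete: every search predicate the site actually runs needs a supporting index, or it degrades to a full scan as the table grows. A hypothetical example (names assumed, not from the thread):

```sql
-- If messages are also looked up by sender, that predicate needs its own
-- index; without it, the query below scans every row in the table.
CREATE INDEX IX_Messages_Sender
    ON dbo.Messages (SenderID, SentAt DESC);

-- Now a seek instead of a scan:
-- SELECT MessageID, Subject, SentAt
-- FROM dbo.Messages
-- WHERE SenderID = @UserID
-- ORDER BY SentAt DESC;
```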
 |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-29 : 17:14:46
|
Maybe horizontally partition the table? Put the first 1,000 users' messages in the first table, the next 1,000 in the second, etc. Then you can have a mapping table that tells you which users are where.

It should improve the performance, as your tables are small and the overhead of a mapping-table lookup is minor.

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp
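A minimal sketch of the mapping-table scheme being described; every name here is an assumption for illustration:

```sql
-- One row per user, recording which partition table holds their messages.
CREATE TABLE dbo.UserMessageTable
(
    UserID    INT     NOT NULL PRIMARY KEY,
    TableName SYSNAME NOT NULL   -- e.g. 'Messages_01' ... 'Messages_10'
);

-- The application (or dynamic SQL) resolves the mapping first, then
-- queries only that user's small partition table:
-- SELECT TableName FROM dbo.UserMessageTable WHERE UserID = @UserID;
-- ... then: SELECT * FROM Messages_07 WHERE RecipientID = @UserID;
```

The trade-off is that every query now takes two hops (or dynamic SQL), which is the complexity the later posts weigh against the smaller per-table row counts.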
 |
|
|
rockmoose
SQL Natt Alfen
3279 Posts |
Posted - 2006-11-29 : 17:33:47
|
| Did you hear about partitioned views? What's this mapping table?

The #1 benefit of horizontal partitioning is, IMO, for archiving/backup strategies; the performance gain is not always noticeable.

rockmoose |
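For readers who haven't met them: a partitioned view replaces the hand-rolled mapping table with CHECK constraints that the optimizer uses to touch only the relevant member table. A hypothetical SQL 2000-style sketch (names and ranges are assumptions):

```sql
-- Member tables, each constrained to a disjoint range of users.
CREATE TABLE dbo.Messages_1
(
    MessageID   INT NOT NULL,
    RecipientID INT NOT NULL CHECK (RecipientID BETWEEN 1 AND 8000),
    SentAt      DATETIME NOT NULL,
    Body        NVARCHAR(4000) NOT NULL,
    CONSTRAINT PK_Messages_1 PRIMARY KEY (RecipientID, MessageID)
);

CREATE TABLE dbo.Messages_2
(
    MessageID   INT NOT NULL,
    RecipientID INT NOT NULL CHECK (RecipientID BETWEEN 8001 AND 16000),
    SentAt      DATETIME NOT NULL,
    Body        NVARCHAR(4000) NOT NULL,
    CONSTRAINT PK_Messages_2 PRIMARY KEY (RecipientID, MessageID)
);
GO

-- The view unions the members; a query with WHERE RecipientID = @UserID
-- can be pruned by the optimizer to a single member table.
CREATE VIEW dbo.AllMessages
AS
SELECT MessageID, RecipientID, SentAt, Body FROM dbo.Messages_1
UNION ALL
SELECT MessageID, RecipientID, SentAt, Body FROM dbo.Messages_2;
```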
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2006-11-30 : 01:47:17
|
| Re: partitioned views. This relates to SQL 2000, but may help somewhat:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=55210&SearchTerms=Requirements%20for%20Horizontal%20Partitions |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2006-11-30 : 02:34:35
|
| "From just about 6 hours per day, to under 10 minutes."

Actually, checking the stats today, it's fallen from 10.247 seconds average execution to 77ms. I'm pleased with that!

Kristen |
 |
|
|
X-Factor
Constraint Violating Yak Guru
392 Posts |
Posted - 2006-11-30 : 05:10:38
|
quote: "The site will have..." Promote it here - where is it?

It's a system for students from the county of Nottinghamshire in England to manage their qualifications.

I think I'm going to use a single table for now. I've been given a value of 16,000 users, so I figure they can each send one mail a day for about 1.7 years until 10 million rows is reached. In the meantime I think it's better to get the site live. |
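The capacity arithmetic above checks out - 16,000 users sending one message each per day reach 10 million rows in roughly 1.7 years:

```sql
-- 16,000 rows/day * 365 days = 5,840,000 rows/year.
SELECT 10000000.0 / (16000 * 365) AS years_to_ten_million;  -- ~1.71
```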
 |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-30 : 05:16:27
|
I wasn't talking about partitioned views. I might be wrong, but on a table with tens of millions of rows and an equal distribution of users, partitioned views won't be any faster.

I'd try having, let's say, 10 tables, each holding 1,600 users and on the order of a million rows each. Then you'd have a mapping table (or you can do this in the app) that tells you in which table each user is.

Maybe a little complex, but querying would be a lot faster. My opinion only.

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2006-11-30 : 05:23:44
|
| "I might be wrong but on a table with tens-of-millions of rows with equal distribution of users partitioned views won't be any faster."

I think it probably would be, if the User were part of the PK - SQL would query the appropriate underlying table, using the clustered PK index to get to the data for that user.

But then in a single table it would be doing the same trick - so apart from possibly having one more level in the index when the table gets huge, I don't see it being any slower.

However, if a user has lots of messages, it's possible that the statistics will cause the query planner to use a table scan ... and then partitioned views will help (provided SQL Server spots that it can use only one of the tables ... and in SQL 2000 at least I don't reckon that is a racing certainty!).

If you are going the 10-tables route (which I think is a good idea) then you probably want to choose the table based on modulo 10 of the user's ID. Otherwise your first 100 users will all be in the same table with no performance gain ... and no shakedown testing of the system either!

Kristen |
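The modulo-10 assignment suggested above can be sketched in a couple of lines; the table-naming convention is an assumption for illustration:

```sql
-- Consecutive user IDs land in different tables, so load spreads evenly
-- from the very first users onward (unlike range-based assignment).
DECLARE @UserID INT, @TableNo INT;
SET @UserID = 12345;
SET @TableNo = @UserID % 10;          -- 12345 % 10 = 5 -> e.g. dbo.Messages_5
SELECT @TableNo AS table_number;
```

Note the contrast with range-based assignment: with ranges, the first 1,600 users all hit one table; with modulo, all ten tables see traffic from day one, which is also what gives the shakedown testing Kristen mentions.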
 |
|
|
|