Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Database Design and Application Architecture
 Database Design Stategy for 20 Billions Tuples

Author  Topic 

seenhzj
Starting Member

8 Posts

Posted - 2012-01-17 : 00:39:55
I am going to create a very large database with 20 billions of records
in one single table, which has only 4 int column. The data storage will be about 2 TB. I have to use a MS SqlServer 2008 R2 in a server with limited computing power (one disk array only; 48G memory). I would like to get some ideas about the physical design with optimal database files, filegroups and table partition allocations. My purpose is to do multidimensional summaries and analytical reporting. Could you point me to the related resources? Thank you very much.

------
Edited: the four dimensions are: TimeId, LocationId, UserId, ActId. They are all int. The combination of these four are unique, while each of them is not unique. We are interested to discover the knowledge pattern from each dimension, and want to generate hypothesis for further analysis.

Sachin.Nand

2937 Posts

Posted - 2012-01-17 : 04:03:42
What is your disk and memory configuration ?

After Monday and Tuesday even the calendar says W T F ....
Go to Top of Page

jackv
Master Smack Fu Yak Hacker

2179 Posts

Posted - 2012-01-17 : 08:29:01
Will the data be updated regularly ? or will it be a "on-off" INSERT ?
What sort of queries? Will there be a Primary Key?
Also , as already asked - size of memory .

Jack Vamvas
--------------------
http://www.sqlserver-dba.com
Go to Top of Page

seenhzj
Starting Member

8 Posts

Posted - 2012-01-17 : 11:28:47
I have 4 physical disk and 48 G memory.
quote:
Originally posted by Sachin.Nand

What is your disk and memory configuration ?



Learning to master T-SQL
Go to Top of Page

seenhzj
Starting Member

8 Posts

Posted - 2012-01-17 : 11:31:19
It will be kind of a "on-off" insert but we have to insert the data several times.
There are no primary keys in the table
The main query we will perform is aggregation query at the moment, we want to get some statistics based on the data.

quote:
Originally posted by jackv

Will the data be updated regularly ? or will it be a "on-off" INSERT ?
What sort of queries? Will there be a Primary Key?
Also , as already asked - size of memory .

Jack Vamvas
--------------------
http://www.sqlserver-dba.com



Learning to master T-SQL
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2012-01-17 : 12:17:03
You have no primary key?

Do you have a unique clustered index?

Do you have a non unique clustered index

Do you have ANY indexes at all?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx


Want to help yourself?

http://msdn.microsoft.com/en-us/library/ms130214.aspx

http://weblogs.sqlteam.com/brettk/

http://brettkaiser.blogspot.com/


Go to Top of Page

seenhzj
Starting Member

8 Posts

Posted - 2012-01-17 : 17:10:38
Hello Brett,
Basically what we are expecting are raw data. The combination of the four column should be considered unique. We are thinking strategies to create indexes and partition the table before we load the data
quote:
Originally posted by X002548

You have no primary key?

Do you have a unique clustered index?

Do you have a non unique clustered index

Do you have ANY indexes at all?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx


Want to help yourself?

http://msdn.microsoft.com/en-us/library/ms130214.aspx

http://weblogs.sqlteam.com/brettk/

http://brettkaiser.blogspot.com/






Learning to master T-SQL
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2012-01-17 : 18:42:02
why are you doing this? what kind of queries do you plan to run? Are you sure that a relational database is what you need?

20 billion 4 dimensional vectorspoints.....

I'd be interested to hear what you are going to do with them.

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-01-18 : 05:01:31
How does the queries for this table look like?



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2012-01-18 : 16:19:05
What is a supple tuple?

Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx


Want to help yourself?

http://msdn.microsoft.com/en-us/library/ms130214.aspx

http://weblogs.sqlteam.com/brettk/

http://brettkaiser.blogspot.com/


Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2012-01-18 : 17:20:50
quote:

What is a supple tuple?



is it a (linked)list of values where the head is connected to the tail?

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

seenhzj
Starting Member

8 Posts

Posted - 2012-01-18 : 21:27:23
We would like to get some quick statistics from each dimension, and also prepare for future filtering...
quote:
Originally posted by Transact Charlie

why are you doing this? what kind of queries do you plan to run? Are you sure that a relational database is what you need?

20 billion 4 dimensional vectorspoints.....

I'd be interested to hear what you are going to do with them.

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION




Learning to master T-SQL
Go to Top of Page

seenhzj
Starting Member

8 Posts

Posted - 2012-01-18 : 21:30:33
The original post has been changed.

Learning to master T-SQL
Go to Top of Page

steve.caplin77
Starting Member

3 Posts

Posted - 2012-01-27 : 14:14:49
unspammed
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2012-01-27 : 17:45:44
If I'm not mistaken...a tuple is a row

Where is ALL tis data coming from?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx


Want to help yourself?

http://msdn.microsoft.com/en-us/library/ms130214.aspx

http://weblogs.sqlteam.com/brettk/

http://brettkaiser.blogspot.com/


Go to Top of Page

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2012-01-28 : 01:52:34
unless your import file is already ordered the same as your clustered index, you should build indexes only AFTER the data is loaded. here i am assuming you will use bcp/bulk insert to load it. what you cluster on depends heavily on what queries you will run. but you haven't provided that so can't be much help there.

http://sqlcat.com/ has some good resources for large deployments on sql server.

good luck.


elsasoft.org
Go to Top of Page

jackv
Master Smack Fu Yak Hacker

2179 Posts

Posted - 2012-01-28 : 04:14:23
Think of different strategies according to intended activity. For example , if you're doing a large UPDATE statement , you may want to disable Non Clustered Indexes and Rebuild Indexes afterewards.
Having already defragmented the indexes.
It depends heavily on your intentions

Jack Vamvas
--------------------
http://www.sqlserver-dba.com
Go to Top of Page
   

- Advertisement -