Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 sql design issue

Author  Topic 

mike123
Master Smack Fu Yak Hacker

1462 Posts

Posted - 2005-02-16 : 15:04:43
I am attempting to develop a web application that has to be very scalable in terms of storage space. JPG images are stored, relative to rows in a specific table in the database.

I have to be able to store millions of JPG's. With my estimation of approximately 100gb of HD space for the images per 1 million rows, I am going to have to go across multiple drives. I think also for management issues it is a good idea to go across many directories per drive.

Do you think this is a DB design issue? If so, what would be a good solution?

I was going to store each file name with an INT counter value since security is not a concern. I have seen other situations where the file names are as such:

1r3jo9.jpg
1r3jo8.jpg
1r3jo7.jpg

I like the shortness of this, could anyone tell me what this filename stucture is?

Thanks again for any help!
mike123

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2005-02-16 : 23:03:17
For a database this size, you need to do a good job of identifying how much total data you will have. Will all of the JPG's be stored in single or multiple tables?

Once you identify how much data you are going to store, you will have a better idea how to design the storage. It sounds like you will be above 100 GB or maybe above a terrabyte.

When databases get this large, you need to plan the storage carefully. You need enough disk for the database, enough for at least three database backups, and enough room to restore the database to a new database if you have to recover data. A good rule of thumb is to have enough disk for at least five times the size of your full sized database.

It sounds like you will need one or more large disk arrays, or SAN storage for this. If you have to have multiple disk arrays, you can distribute the the data by creating file groups and putting multiple files on different disk arrays within each filegroup.

quote:
Originally posted by mike123

I am attempting to develop a web application that has to be very scalable in terms of storage space. JPG images are stored, relative to rows in a specific table in the database.

I have to be able to store millions of JPG's. With my estimation of approximately 100gb of HD space for the images per 1 million rows, I am going to have to go across multiple drives. I think also for management issues it is a good idea to go across many directories per drive.

Do you think this is a DB design issue? If so, what would be a good solution?

I was going to store each file name with an INT counter value since security is not a concern. I have seen other situations where the file names are as such:

1r3jo9.jpg
1r3jo8.jpg
1r3jo7.jpg

I like the shortness of this, could anyone tell me what this filename stucture is?

Thanks again for any help!
mike123



Codo Ergo Sum
Go to Top of Page

mike123
Master Smack Fu Yak Hacker

1462 Posts

Posted - 2005-02-16 : 23:27:25
Hi Michael,

Thanks for the response. I think I should have been a bit clearer in my wording. I have seperated the DB server and the Web server.

The DB server is not storing images in the actual table, they are being stored on the webserver. I'm trying to figure out the best way to relate the data between the drive / directory structure on the webserver(s) and the database server.

Thanks,
Mike123
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-02-17 : 01:24:29
Mike123,

Plan your filegroups and files well. Consider creating a filegroup(s) to hold the image data and placing multiple files (say 10 files of 10Gb etc...) in that group. It will enable you to restore non-image data without the hassle of the 100Gb of Image stuff...



DavidM

A front-end is something that tries to violate a back-end.
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2005-02-17 : 03:57:50
Can't say I'm 100% in on what you are really asking but I'd probably design a system with several webservers like this:
CREATE TABLE dbo.image_paths (
PathID int IDENTITY(1, 1) NOT NULL,
LocalPath varchar(500) NOT NULL,
BaseURL varchar(500) NOT NULL
)

CREATE TABLE dbo.images (
ImageID int IDENTITY(1, 1) NOT NULL,
PathID int NOT NULL,
Filaname varchar(500) NOT NULL
)
This is rather simple but will let you have images on an unlimited numbers of drives/servers. What the filenames are is not very important in my opinion but I have seen websites that have each filename as a GUID. Could be a good idea when using multiple servers and it would save you the hassle of enforcing a certain naming pattern....

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

mike123
Master Smack Fu Yak Hacker

1462 Posts

Posted - 2005-02-17 : 15:15:02
hi lumbargo,

thanks for the suggestion, this is what i am looking for .. i just over complicated it for some reason in fear of not preparing enough ...

I also like to use GUID's however in the past i've had users experience problems when attempting to cut/ paste / email GUID's .. they seem to get broken / seperated in emails ..

I really like this structure that i've seen because of its shortness, but im not sure what it is ... is it a format or just some hack solution???

1r3jo9.jpg
1r3jo8.jpg
1r3jo7.jpg

any idea?

Thanks very much again!
mike123

Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2005-02-18 : 03:05:26
Making a proper naming scheme can hardly be a problem, but enforcing it could potentially be. What you could do which would be extremely simple is to number them sequentially with an IDENTITY-field. 1.jpg, 2.jpg, 3.jpg and so on...but crossing the million mark will maybe be in breach of your shortness-requirement. What you could do is name them aaaaa.jpg, aaaab.jpg, aaaac.jpg, and so on...using 5 letters like this will give you 11.8 mill combinations (26 letters from a-z isn't it???). This will keep filenames short, be a simple structure, and shouldn't be too hard to enforce.

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

mike123
Master Smack Fu Yak Hacker

1462 Posts

Posted - 2005-02-18 : 04:15:26
Hi Lumbargo,

thanks again for the reply ..

how would I determine how to sequentially move from

1r3jo8
to
1r3jo9

etc? I assume there is no datatype for this, would I just figure out some logic to count up myself? Perhaps put this in the SQL logic?

Thanks again!
mike123
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2005-02-18 : 04:32:24
If you make the letters in my previous example into a sort of "extended HEX-format" meaning a, b...y, z, 0, 1...9 it will fit in to the format you got there. Reducing the lengt by one letter to 5 will give you 60 million combinations...

EDIT: To enforce it you would probably have to make a UDF which splits the string into each character and increase them individually in some way. Have never done anything like that so I really don't have a clue...

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

mike123
Master Smack Fu Yak Hacker

1462 Posts

Posted - 2005-02-18 : 14:29:39
lumbargo,

i think that clears everything up for me, thanks again!
mike123
Go to Top of Page
   

- Advertisement -