| Author |
Topic |
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2005-02-16 : 15:04:43
|
| I am attempting to develop a web application that has to be very scalable in terms of storage space. JPG images are stored, relative to rows in a specific table in the database.I have to be able to store millions of JPG's. With my estimation of approximately 100gb of HD space for the images per 1 million rows, I am going to have to go across multiple drives. I think also for management issues it is a good idea to go across many directories per drive.Do you think this is a DB design issue? If so, what would be a good solution?I was going to store each file name with an INT counter value since security is not a concern. I have seen other situations where the file names are as such:1r3jo9.jpg1r3jo8.jpg1r3jo7.jpgI like the shortness of this, could anyone tell me what this filename stucture is?Thanks again for any help!mike123 |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2005-02-16 : 23:03:17
|
For a database this size, you need to do a good job of identifying how much total data you will have. Will all of the JPG's be stored in single or multiple tables?Once you identify how much data you are going to store, you will have a better idea how to design the storage. It sounds like you will be above 100 GB or maybe above a terrabyte.When databases get this large, you need to plan the storage carefully. You need enough disk for the database, enough for at least three database backups, and enough room to restore the database to a new database if you have to recover data. A good rule of thumb is to have enough disk for at least five times the size of your full sized database.It sounds like you will need one or more large disk arrays, or SAN storage for this. If you have to have multiple disk arrays, you can distribute the the data by creating file groups and putting multiple files on different disk arrays within each filegroup.quote: Originally posted by mike123 I am attempting to develop a web application that has to be very scalable in terms of storage space. JPG images are stored, relative to rows in a specific table in the database.I have to be able to store millions of JPG's. With my estimation of approximately 100gb of HD space for the images per 1 million rows, I am going to have to go across multiple drives. I think also for management issues it is a good idea to go across many directories per drive.Do you think this is a DB design issue? If so, what would be a good solution?I was going to store each file name with an INT counter value since security is not a concern. I have seen other situations where the file names are as such:1r3jo9.jpg1r3jo8.jpg1r3jo7.jpgI like the shortness of this, could anyone tell me what this filename stucture is?Thanks again for any help!mike123
Codo Ergo Sum |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2005-02-16 : 23:27:25
|
| Hi Michael,Thanks for the response. I think I should have been a bit clearer in my wording. I have seperated the DB server and the Web server.The DB server is not storing images in the actual table, they are being stored on the webserver. I'm trying to figure out the best way to relate the data between the drive / directory structure on the webserver(s) and the database server.Thanks,Mike123 |
 |
|
|
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2005-02-17 : 01:24:29
|
| Mike123,Plan your filegroups and files well. Consider creating a filegroup(s) to hold the image data and placing multiple files (say 10 files of 10Gb etc...) in that group. It will enable you to restore non-image data without the hassle of the 100Gb of Image stuff...DavidMA front-end is something that tries to violate a back-end. |
 |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2005-02-17 : 03:57:50
|
Can't say I'm 100% in on what you are really asking but I'd probably design a system with several webservers like this:CREATE TABLE dbo.image_paths ( PathID int IDENTITY(1, 1) NOT NULL, LocalPath varchar(500) NOT NULL, BaseURL varchar(500) NOT NULL)CREATE TABLE dbo.images ( ImageID int IDENTITY(1, 1) NOT NULL, PathID int NOT NULL, Filaname varchar(500) NOT NULL) This is rather simple but will let you have images on an unlimited numbers of drives/servers. What the filenames are is not very important in my opinion but I have seen websites that have each filename as a GUID. Could be a good idea when using multiple servers and it would save you the hassle of enforcing a certain naming pattern....--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2005-02-17 : 15:15:02
|
| hi lumbargo,thanks for the suggestion, this is what i am looking for .. i just over complicated it for some reason in fear of not preparing enough ... I also like to use GUID's however in the past i've had users experience problems when attempting to cut/ paste / email GUID's .. they seem to get broken / seperated in emails .. I really like this structure that i've seen because of its shortness, but im not sure what it is ... is it a format or just some hack solution???1r3jo9.jpg1r3jo8.jpg1r3jo7.jpgany idea?Thanks very much again!mike123 |
 |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2005-02-18 : 03:05:26
|
| Making a proper naming scheme can hardly be a problem, but enforcing it could potentially be. What you could do which would be extremely simple is to number them sequentially with an IDENTITY-field. 1.jpg, 2.jpg, 3.jpg and so on...but crossing the million mark will maybe be in breach of your shortness-requirement. What you could do is name them aaaaa.jpg, aaaab.jpg, aaaac.jpg, and so on...using 5 letters like this will give you 11.8 mill combinations (26 letters from a-z isn't it???). This will keep filenames short, be a simple structure, and shouldn't be too hard to enforce.--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2005-02-18 : 04:15:26
|
| Hi Lumbargo,thanks again for the reply .. how would I determine how to sequentially move from1r3jo8to 1r3jo9etc? I assume there is no datatype for this, would I just figure out some logic to count up myself? Perhaps put this in the SQL logic? Thanks again!mike123 |
 |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2005-02-18 : 04:32:24
|
| If you make the letters in my previous example into a sort of "extended HEX-format" meaning a, b...y, z, 0, 1...9 it will fit in to the format you got there. Reducing the lengt by one letter to 5 will give you 60 million combinations...EDIT: To enforce it you would probably have to make a UDF which splits the string into each character and increase them individually in some way. Have never done anything like that so I really don't have a clue...--Lumbago"Real programmers don't document, if it was hard to write it should be hard to understand" |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2005-02-18 : 14:29:39
|
| lumbargo,i think that clears everything up for me, thanks again!mike123 |
 |
|
|
|