Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
bjornh
Yak Posting Veteran
87 Posts |
Posted - 2002-12-02 : 09:58:35
|
Hi,I've build this query to convert a table with normal IP-addresses (213.122.0.12) to an long/int value. (-1107296256) so the total table size will be smaller.It looks like this:declare @ip varchar(20)declare @long bigintdeclare @blok1 bigintdeclare @blok2 bigintdeclare @blok3 bigintdeclare @blok4 bigintdeclare @temp varchar(20)declare @counter intset @counter = 1While @counter <= (select count(*) from ripencc_20021101)BEGINset @ip = (Select col004 from ripencc_20021101 where id = @counter)set @blok1 = str(left(@ip, PATINDEX('%.%', @ip)))set @temp = right(@ip, len(@ip) - PATINDEX('%.%', @ip))set @blok2 = str(left(@temp, PATINDEX('%.%', @temp)))set @temp = right(@temp, len(@temp) - PATINDEX('%.%', @temp))set @blok3 = str(left(@temp, PATINDEX('%.%', @temp)))set @blok4 = right(@temp, len(@temp) - PATINDEX('%.%', @temp))set @long = (@blok1 * 16777216)set @long = @long + (@blok2 * 65536)set @long = @long + (@blok3 * 256)set @long = @long + @blok4set @long = @long - 2147483648update ripencc_20021101 set col004 = @long where id = @counterset @counter = @counter + 1endNow I know this takes a lot resources, but I'am not that formiliar with SQL so I would like to know how I can eliminate that loop.Thanks.Bjorn |
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-12-02 : 10:13:13
|
Well, you're gonna have a ton of problems making this fit into an int (if it can be done at all, I don't think it can) and the overhead of converting to and from will kill your performance. Not to mention that you'll have no ability to query for a particular subnet or IP range if you convert to an int.If you really can't store it as a varchar (c'mon, storage space isn't THAT expensive), you are far better off splitting the IP into 4 tinyint columns:CREATE TABLE IPAddresses (octet1 tinyint NOT NULL, octet2 tinyint NOT NULL,octet3 tinyint NOT NULL,octet4 tinyint NOT NULL )Then you can do this:INSERT INTO IPAddresses (octet1, octet2, octect3, octet4)SELECT Convert(tinyint, PARSENAME(col004,4)),Convert(tinyint, PARSENAME(col004,3)),Convert(tinyint, PARSENAME(col004,2)),Convert(tinyint, PARSENAME(col004,1))from ripencc_20021101That will convert all of the IP's into the 4 tinyint columns in one shot. You can reconstruct them using the following:SELECT CAST(octet1 as varchar) + '.' + CAST(octet2 as varchar) + '.' + CAST(octet3 as varchar) + '.' + CAST(octet4 as varchar) AS IPAddressFROM IPAddressesEdited by - robvolk on 12/02/2002 10:17:31 |
 |
|
bjornh
Yak Posting Veteran
87 Posts |
Posted - 2002-12-02 : 10:42:32
|
ow, I'll keep that in mind, but than comes my second problem.I need to uses stats files from arin.net (ftp://ftp.arin.net/pub/stats/) and other registrars to put into my table, but these are in this format:arin|US|ipv4|24.30.224.0|8192|0|allocatedarin|US|ipv4|24.31.0.0|8192|0|allocatedarin|US|ipv4|24.31.32.0|57344|0|allocatedas you can see, It brings the start address, but it does not brings the end address, it only brings the number of ip's. so, I'll still need to convert the ip address into long, and as you sugested, I maybe can convert it back into varchar.maybe this cleares it up.ps. And it can fit into an int, if tested it with ip address 0.0.0.0 (minimum) and 255.255.255.255 (maximum) :xEdited by - bjornH on 12/02/2002 10:46:02 |
 |
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2002-12-02 : 10:47:57
|
bigint not int.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
bjornh
Yak Posting Veteran
87 Posts |
Posted - 2002-12-02 : 10:52:12
|
nope, int 255.255.255.255 brings back 2147483647 and what is the maximum of a int? Integer (whole number) data from -2^31 (-2,147,483,648) through 2^31 - 1 (2,147,483,647). Storage size is 4 bytes. The SQL-92 synonym for int is integer. |
 |
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2002-12-02 : 11:06:56
|
Oops missed the subtraction at the endYou can store the bit map in an int - 4 bytes * 255 per byte (i.e. tinyint) which I guess is what this is doing.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
bjornh
Yak Posting Veteran
87 Posts |
Posted - 2002-12-02 : 11:11:22
|
jup, but i'll still could need some optimalisation... :D |
 |
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2002-12-02 : 11:20:50
|
Your biggest problem, despite how you choose to store the data, is looping through the table the way you are to calc the int.Encapsulate your logic into a user-defined function. Call it, for example, ConvertToInt(), and then you can run:UPDATE ripencc_20021101 set col004 = ConverToInt(columToConvert);This will update the entire table (or you can use a WHERE to choose only certain rows) at once, without looping through the table the way you are currently doing it.See books on-line for info on user-defined functions.- Jeff |
 |
|
bjornh
Yak Posting Veteran
87 Posts |
Posted - 2002-12-02 : 11:26:09
|
aah yes, i thought of that, but i couldn't find much about it, so i thought that it didn't excisted in SQL, but Apparently it does. I'll continue my quest for user-defined functions... thank you.Bjorn |
 |
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2002-12-02 : 11:32:12
|
well you can do what Rob says probablyupdate ripencc_20021101 set col004 =Convert(bigint, PARSENAME(col004,4)) * 16777216+ Convert(bigint, PARSENAME(col004,3)) * 65536+ Convert(bigint, PARSENAME(col004,2)) * 256+ Convert(bigint, PARSENAME(col004,1)) - 2147483648 from ripencc_20021101 ==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2002-12-02 : 11:35:21
|
It only exists in SQL 2000 -- if you are running 7 or below, it doesn't have the user-defined functions.By the way, I also think 4 tiny int fields would be maybe a nicer way to store this data, so you could parse it easily. And it requires the same amount of space - 4 bytes.By putting it all into 1 integer, as you are trying to do, any querying you do of the data will have to run a reverse version of your ConvertToInt() function, which won't be able to use indexes or anything and will be very slow.Good luck!- JeffEdited by - jsmith8858 on 12/02/2002 11:36:44 |
 |
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2002-12-02 : 11:39:07
|
I didn't notice the use of PARSENAME -- very clever! a nice side effect of IP address and database objects both using the "." as a delimiter!That would be the way to go. very nice.- Jeff |
 |
|
bjornh
Yak Posting Veteran
87 Posts |
Posted - 2002-12-02 : 11:54:59
|
I've used the user defined function and it works, very well i can say. btw, I think i still can use my Int ip-addresses because i use ASP to convert them back, (and when the normal user is on the website, it also uses ASP to convert into Int, i only created this script for the mass-input from Arin.net) so this doesn't affect indexes and preformance, or does it? |
 |
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-12-02 : 12:02:59
|
Sure it will affect performance. Every IP address that is accessed in the query needs to be converted from an int back to an IP. The overhead can add up if you are returning thousands of rows. Doing it as an ASP function will only slow it down further. Even if you aren't returning a lot of rows, it will take much longer to process because of the conversion.Using a SQL Server UDF isn't much better. The important parts of an IP address--the octets--are completely lost in an int value. You'd have to do some fancy bit-flipping or bitwise operations to parse out a subnet. It certainly isn't any more efficient than leaving them in a varchar column; you can still parse things out using PARSENAME, but it won't work on an int. And you'll get no help from an index on an int column.It really isn't worth converting to an int, you'll only get lesser performance. Either keep it as varchar, or parse it into 4 tinyint columns.Edited by - robvolk on 12/02/2002 12:04:47 |
 |
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2002-12-02 : 12:04:51
|
Depends what you want to do with it.If you just want to return the IP address or find a specific IP address then fine (except that you are leaving an odd database structure which will probably cause a maintenance overhead in the future).If you want to search on part of an IP address then you have big problems.I wouldn't code the convertion in ASP - leave it in the database (as udf's or just in the SPs) then it is available to anything that wants to access the data - you won't have to re-code it for every application that needs the data.It will also be available when you want to look at the data in query analyser.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
bjornh
Yak Posting Veteran
87 Posts |
Posted - 2002-12-02 : 13:27:25
|
ow, i've forgot something that's quite important I think.The ip-addresses are in a range. So i have 123.233.43.0 to 123.233.45 and with this range comes a id for a country (let's say Holland, Holland is a nice country ). And a visitor that has the ip 123.233.44.234, the ip is in the range, so the it will give back Holland.now aint it easier to use a query like this:select countryID from iprange where @visitorID between ipfrom AND iptoWhen I use my way, the Int way, this doesn't give any probs, but with a varchar way this isn't possible (or am i wrong).so, is it still bad to use my way? (I think this is my last counter... :D, if you come back good, i'll do it your way and shut my big mouth.. ) |
 |
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-12-02 : 13:54:38
|
Yes, you can do it with a varchar:SELECT * FROM myTableWHERE PARSENAME(IP, 4) = '123'AND PARSENAME(IP, 3) = '233'AND CAST(PARSENAME(IP, 2) AS int) BETWEEN 43 AND 44Now, if you parse the IP addresses into 4 columns:SELECT * FROM myTableWHERE octet1 = 123AND octet2 = 233AND octet3 BETWEEN 43 AND 44The second version, however, allows you to use indexes for each octet, and it can be much faster because the parsing is done ahead of time. Using PARSENAME allows you to keep it as a varchar, but you can't index on a subnet.Depending on the values you have in your data, if you create a clustered index on the proper column(s), you can get lightning fast response even with millions of rows. |
 |
|
bjornh
Yak Posting Veteran
87 Posts |
Posted - 2002-12-02 : 14:12:46
|
ok, your the guru :), i'll check it out morning, thanks for all the help |
 |
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2002-12-02 : 16:20:02
|
quote: arin|US|ipv4|24.30.224.0|8192|0|allocatedarin|US|ipv4|24.31.0.0|8192|0|allocatedarin|US|ipv4|24.31.32.0|57344|0|allocated
Stupid question, but can that third field contain 'ipv6'? Pretty difficult to fit 128 bits in an int! |
 |
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2002-12-03 : 05:43:17
|
quote: If you want to search on part of an IP address then you have big problems.
As you say you have a lot of IP addresses then splitting up the octets into different fields gives more indexing options - and if you store IP addresses this is usually valuable.You can store as a varchar and still index but this probably won't be as flexible as the separate fields.That's why you rarely see IP addresses stored in the sort of format you were suggesting.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2002-12-03 : 09:04:42
|
I'd be interested to see how you "4 tinyint columns" folk would write a query that found which subnet (if any) a given address was in. I'd use binary(4) myself.To load the table:DROP TABLE IP_StagingCREATE TABLE IP_Staging ( auth char(4), country char(2), type varchar(255), ipstart varchar(255), netsize int, somedate varchar(255), status varchar(255))BULK INSERT IP_StagingFROM 'r:\temp\arin.20021201.txt'WITH ( FIELDTERMINATOR = '|', FIRSTROW = 2)DROP TABLE IP_SubnetsCREATE TABLE IP_Subnets ( ipstart binary(4) PRIMARY KEY, netmask binary(4) NOT NULL, country char(2) NOT NULL)INSERT INTO IP_Subnets (ipstart, netmask, country)SELECT CAST(CAST(PARSENAME(ipstart, 4) AS tinyint) AS binary(1)) + CAST(CAST(PARSENAME(ipstart, 3) AS tinyint) AS binary(1)) + CAST(CAST(PARSENAME(ipstart, 2) AS tinyint) AS binary(1)) + CAST(CAST(PARSENAME(ipstart, 1) AS tinyint) AS binary(1)) AS ipstart, CAST(netsize - 1 AS binary(4)) AS netmask, countryFROM IP_StagingWHERE type = 'ipv4' AND ipstart <> '*' To find the subnet:DECLARE @ip varchar(16), @ipbin binary(4)SET @ip = '66.129.68.65' // www.sqlteam.com, of courseSET @ipbin = CAST(CAST(PARSENAME(@ip, 4) AS tinyint) AS binary(1)) + CAST(CAST(PARSENAME(@ip, 3) AS tinyint) AS binary(1)) + CAST(CAST(PARSENAME(@ip, 2) AS tinyint) AS binary(1)) + CAST(CAST(PARSENAME(@ip, 1) AS tinyint) AS binary(1))SELECT @ipbin, ipstart, netmask, country, CAST(CAST(SUBSTRING(ipstart, 1, 1) AS tinyint) AS varchar(3)) + '.' + CAST(CAST(SUBSTRING(ipstart, 2, 1) AS tinyint) AS varchar(3)) + '.' + CAST(CAST(SUBSTRING(ipstart, 3, 1) AS tinyint) AS varchar(3)) + '.' + CAST(CAST(SUBSTRING(ipstart, 4, 1) AS tinyint) AS varchar(3))FROM IP_SubnetsWHERE ipstart = ( SELECT MAX(ipstart) FROM IP_Subnets WHERE @ipbin >= ipstart ) AND @ipbin < CAST(CAST(ipstart AS int) | CAST(netmask AS int) AS binary(4)) -- * * how silly! The bitmap operators don't work on binary datatypes.Er, my terminology is a bit wrong: "netmask" column is really the complement of the netmask.Edited by - Arnold Fribble on 12/03/2002 09:08:17 |
 |
|
Next Page
|
|
|
|
|