Using the PARSENAME function to split delimited dataBy Damian Maclennen on 10 November 2003 | Tags: SELECT , Functions PARSENAME is an SQL Server function used for working with linked servers and replication. It is also a handy addition to your TSQL toolkit when writing queries involving delimited data. This article covers the standard usage of PARSENAME as well as some handy tricks from a real world scenario. SQL Server uses a four part object naming syntax to represent database objects. This syntax covers objects within the current database, within a different database on the same server as well as across multiple servers. In this syntax, the full path to an object is represented in the form of Servername.Databasename.Ownername.Objectname. As an example of the four part naming convention, in a linked server environment you can data with syntax similar to : SELECT *FROM HeadOfficeSQL1.Northwind.dbo.Authors The PARSENAME function is designed to allow you to easily parse and return individual segments from this convention. It's syntax is : PARSENAME('object_name', object_piece)Where object_piece represents which segment from the four parts you wish to retrieve. The numbering works from right to left so the object name is piece 1, owner is 2, database name is 3 and server name is 4. An example : Declare @ObjectName nVarChar(1000)Set @ObjectName = 'HeadOfficeSQL1.Northwind.dbo.Authors' SELECT PARSENAME(@ObjectName, 4) as Server, PARSENAME(@ObjectName, 3) as DB, PARSENAME(@ObjectName, 2) as Owner, PARSENAME(@ObjectName, 1) as Object Will return : Server DB Owner Object -------------------------------------------------------------------- HeadOfficeSQL1 Northwind dbo Authors (1 row(s) affected) As I mentioned in the introduction, PARSENAME can also be "re-purposed" to become an incredibly handy tool in your coding arsenal when it comes to having to split delimited data. This is because the function has the ability to return a value from a specified position in a "Dot" delimited string. It does have a limitation that it will not work with more than four delimited values within a string. There are many articles on the site about working with delimited data, particularly CSV strings, which may be more appropriate for longer values. Before I go on, I have to acknowledge Rob Volk who highlighted these techniques in the forums. However, in the interests of providing information in a timely fashion, I decided to write the article myself. IP AddressesA common question in the SQLTeam forums is something along the lines of "How to I break up IP address data to group by subnet ?" Because IP addresses have a similar four part scheme, the parsename function is a perfect fit here. Create Table WebsiteLog( HitDate DateTime, IPAddress varchar(15), RequestedPage varchar(200) ) GO INSERT INTO WebsiteLog (HitDate, IPAddress, RequestedPage) VALUES ('2003-11-10 11:00:00.000', '127.0.0.1', 'Default.aspx') INSERT INTO WebsiteLog (HitDate, IPAddress, RequestedPage) VALUES ('2003-11-10 11:05:00.000', '192.168.0.140', 'About.aspx') INSERT INTO WebsiteLog (HitDate, IPAddress, RequestedPage) VALUES ('2003-11-10 11:06:00.000', '192.168.0.145', 'News.aspx') GO SELECT PARSENAME(IPAddress, 4) as Part1, PARSENAME(IPAddress, 3) as Part2, PARSENAME(IPAddress, 2) as Part3, Count(*) as CountOfHitsFromSubnet FROM WebsiteLog GROUP BY PARSENAME(IPAddress, 4), PARSENAME(IPAddress, 3), PARSENAME(IPAddress, 2) This can of course be extended to filter data in a similar fashion : SELECT * FROM WebsiteLog WHERE PARSENAME(IPAddress, 4) = '192' AND PARSENAME(IPAddress, 3) = '168' AND PARSENAME(IPAddress, 2) = '0' Data with a different delimiterNot all data you might want to parse will be delimited by a "dot". Commas and Pipes ( | ) are common too. In these cases, it is often possible to do a replace to the data to swap delimiters. I am sure a lot of people have come up against migrating multi value data from an old system or someone's bright idea of an Excel sheet. In these cases, if you can replace the commas with the "." symbol it takes all the hard work out of parsing 3 phone numbers out of the one column. A real world example.Last week I was working migrating some data from several different locations into a new SQL Server database. The new table contained start and finish datetime columns, as did most of the data I was working with. However, there was one set of data that expressed the data in the form of a start datetime and a char field representing the duration in Days, Hours, Minutes and Seconds like "DD:HH:MM:SS". The data looked something like this : Create Table OldData (recordID int, startDate datetime, duration char(11) ) GO INSERT INTO OldData (recordID, startDate, duration) VALUES (1, '2002-11-10 12:00:00.000', '4:15:03:24') INSERT INTO OldData (recordID, startDate, duration) VALUES (2, '2002-11-10 12:00:00.000', '12:1:23:2') GO SELECT * FROM OldData The table that I needed to migrate this to was more along the lines of this one : Create Table NewData ( recordID int, startDate datetime, endDate datetime ) GO By replacing the colons ":" with "." in the duration column of the old table I was able to parse out the individual parts and add them to my start date to get the end date. The end result is a little bit long winded, but did the job perfectly. SELECT recordID, startDate, DateAdd(s, Cast(ParseName(replace(duration, ':', '.'), 1) as int), DateAdd(n, Cast(ParseName(replace(duration, ':', '.'), 2) as int), DateAdd(hh, Cast(ParseName(replace(duration, ':', '.'), 3) as int), DateAdd(d, Cast(ParseName(replace(duration, ':', '.'), 4) as int), startDate) ) ) ) as endDate FROM OldData ConclusionAlthough these examples are nothing like the intended use of the PARSENAME function, sometimes it fits the bill perfectly. If you do a lot of adhoc data migration work (which is something I find myself doing) you end up with a handful of tricks for getting stubborn data to behave. PARSENAME is a handy one too keep in the arsenal for occasions like the one above.
|
- Advertisement - |