Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Maleks
Starting Member
10 Posts |
Posted - 2008-10-28 : 06:26:27
|
Hey there,I've recently come across a bit of a problem using stored XML in SQL for an application. We needed a way to store differing amounts of form fields so settled on XML.The XML follows the form:<details> <detail mapping="reference" label="Invoice Number"> <value>5675678</value> </detail> <detail label="Other number"> <value /> </detail> <detail mapping="name" label="Name"> <value>Test Testor</value> </detail> <detail mapping="address" label="Address"> <value>12345 Test St</value> </detail> <detail mapping="postcode" label="Postcode"> <value>987 123</value> </detail> <detail label="Telephone Number"> <value>9876789</value> </detail> <detail mapping="email" label="Email"> <value>test@test.com</value> </detail></details>The problem naturally arises with very slow query times when cherry picking values from the XML with .value and also using these in where clauses. The typical .value statement looks something like:Details.value('(/details/detail[@mapping="email"]/value)[1]', 'varchar(255)') I've done a bunch of research and found and implemented the following:- Typing the XML (i.e using a schema). Resulted in speed increase- XML index on primary and secondary on path. Resulted in speed increase- Creating computed columns (based on functions returning values from the XML) and indexing these columns. Obviously querying these columns instead of using .value . Resulted in a dramatic decrease in speed- Reducing calls to .value evaluation by including a sub query in the where clause and then refering to its results. Resulted in no noticable shift in speedSome things which may increase speed but can't do:- Using less generic node structure. So instead of <detail mapping="email"> it would be <email>- Shifting XPath clauses [] to the end of the evaluationsThis has the potential to be quite a useful discussion so please chime in and help out if you can.Thanks,Alex |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-10-28 : 07:58:50
|
XML queries can be faster when using CROSS APPLY.Try thisDECLARE @xml XMLSET @xml = '<details> <detail mapping="reference" label="Invoice Number"> <value>5675678</value> </detail> <detail label="Other number"> <value /> </detail> <detail mapping="name" label="Name"> <value>Test Testor</value> </detail> <detail mapping="address" label="Address"> <value>12345 Test St</value> </detail> <detail mapping="postcode" label="Postcode"> <value>987 123</value> </detail> <detail label="Telephone Number"> <value>9876789</value> </detail> <detail mapping="email" label="Email"> <value>test@test.com</value> </detail> </details>'SELECT d.a.value('@mapping', 'nvarchar(max)') AS Mapping, d.a.value('@label', 'nvarchar(max)') AS Label, v.e.value('.', 'nvarchar(max)') AS ValueFROM @xml.nodes('/details/detail') AS d(a)CROSS APPLY d.a.nodes('value') AS v(e)Mapping Label Value-------- ---------------- -------------reference Invoice Number 5675678NULL Other number name Name Test Testoraddress Address 12345 Test Stpostcode Postcode 987 123NULL Telephone Number 9876789email Email test@test.com E 12°55'05.63"N 56°04'39.26" |
|
|
Maleks
Starting Member
10 Posts |
Posted - 2008-10-28 : 08:45:18
|
Cross apply looks interesting. Thanks.I realise I should probably expand on the table and query details.The table looks like:Payment----------------Id INT,Date SMALLDATETIME,User_Id INT,Amount DECIMAL(18,2),Details XMLThe query as is looks like:SELECT Payment.Id, [Date], User.Email, [User_Id], Details.value('(/details/detail[@mapping="email"]/value)[1]', 'varchar(255)') AS Details_Email, Details.value('(/details/detail[@mapping="name"]/value)[1]', 'varchar(255)') AS Details_Name, Details.value('(/details/detail[@mapping="reference"]/value)[1]', 'varchar(255)') AS Details_Reference, AmountFROM Payment LEFT OUTER JOIN User ON Payment.User_Id = User.IdWHERE [Date] >= '01/01/1900' AND [Date] < '01/02/2079' AND Payment.Deleted = 0 AND ( Payment.Details.value('(/details/detail[@mapping="email"]/value)[1]', 'varchar(255)') LIKE '%est%' OR Payment.Details.value('(/details/detail[@mapping="name"]/value)[1]', 'varchar(255)') LIKE '%est%' OR Payment.Details.value('(/details/detail[@mapping="reference"]/value)[1]', 'varchar(255)') LIKE '%est%' ) ORDER BY Details.value('(/details/detail[@mapping="name"]/value)[1]', 'varchar(255)') DESCYour help is much appreciated |
|
|
LoztInSpace
Aged Yak Warrior
940 Posts |
Posted - 2008-10-29 : 07:59:22
|
Generally these generic solutions are always a bit crap though. It's not really how relational databases are designed to work.Did you consider storing the values in named, indexed columns instead of XML? If you have a static structure then this is the way to go. If not then maybe try having a bunch of different string/int/date columns and put your XML values in there. You can then use your XML schema/meta data to work out which columns refer to which tags in your XML and form a query accordingly. |
|
|
Maleks
Starting Member
10 Posts |
Posted - 2008-10-29 : 08:18:25
|
Yeah it's definitely not an ideal solution. We used it because of the age old problem that the data collected from implementation to implementation may well vary.The added hitch is that this system is used across multiple applications and multiple sites so major restructuring in the DB and business layer will be one heck of a headache.The solution we have settled on:Creating computed columns for key bits of data based on functions which return a specific value from the XML using something like:Details.value('(/details/detail[@mapping="email"]/value)[1]', 'varchar(255)') These computed columns are then indexed and persisted. They are never updated.Does anyone know of any major drawbacks to this method? |
|
|
|
|
|
|
|