Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2008-11-27 : 06:09:12
|
Hi Guys,
I'm playing with recursive CTE's (not for any production reason yet -- just to get my head round them). I've written this little block of code to try and find the shortest routes between two points in a bus / train / whatever network but I'm sure it can be improved. I can't seem to think of a way to stop the recursive CTE from chasing triangle loops except by placing a hard limit on recursions ( in this case a max of 10 hops)
Here's the code: DECLARE @stations TABLE ( [stationId] INT , [name] VARCHAR(255) , [Address] VARCHAR(3000) , [postCode] VARCHAR(15) )
DECLARE @links TABLE ( [from] INT , [to] INT )
INSERT @stations SELECT 1, 'Glasgow', '', '' UNION SELECT 2, 'Edinburgh', '', '' UNION SELECT 3, 'York', '', '' UNION SELECT 4, 'London', '', '' UNION SELECT 5, 'Aberdeen', '', ''
INSERT @links SELECT 1, 2 -- Glasgow to Edinburgh UNION SELECT 1, 5 -- Glasgow to Aberdeen UNION SELECT 1, 3 -- Glasgow to York (could causes a triangle trip)
UNION SELECT 2, 1 -- Edinburgh to Glasgow UNION SELECT 2, 3 -- Edingurgh to York
UNION SELECT 3, 1 -- York to Glasgow (could causes a triangle trip) UNION SELECT 3, 2 -- York to Edinburgh UNION SELECT 3, 4 -- York to London
UNION SELECT 4, 3 -- London to York
UNION SELECT 5, 1 -- Aberdeen to Glasgow
;WITH routes ([from], [to], [hops]) AS ( SELECT l.[from] , l.[to] , 0 AS [hops] FROM @links l UNION ALL SELECT r.[from] , l2.[to] , [hops] + 1 FROM @links l2 JOIN routes r ON r.[to] = l2.[from] AND l2.[to] <> r.[from] WHERE [hops] < 10 -- This is jsut to stop the recursion before it gets out of hand ) SELECT s.[name] AS [Departure] , s2.[name] AS [Arrival] , MIN(r.[hops]) AS [Links] FROM @stations s JOIN routes r ON r.[from] = s.[stationId] JOIN @stations s2 ON s2.[stationID] = r.[to] WHERE s.[name] = 'London' GROUP BY s.[name] , s2.[name] OPTION (MAXRECURSION 256)
Which produces the following results: Departure | Arrival | Links ==========+===========+========= London | Aberdeen | 2 London | Edinburgh | 1 London | Glasgow | 1 London | York | 0
I'm not looking for better ways to handle this problem. I'm looking for suggestions about how to improve the CTE. Any and all replies welcome.
Regards,
Charlie =============================================================== Msg 3903, Level 16, State 1, Line 1736 The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2008-11-27 : 06:57:45
|
Thanks Peso,
There's a lot of posts in that link! I'll work my way through them.
It looks like you are doing a recursive loop inside a function. While I certainly find that kind of thinking easier to understand from functional programming - I'm not sure how applicable that can be to a recursive CTE approach. You can't declare variables inside a CTE can you?
Charlie =============================================================== Msg 3903, Level 16, State 1, Line 1736 The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-11-27 : 07:47:39
|
Maybe you want something similar to this?-- Prepare station sample data DECLARE @Stations TABLE ( stationID INT, name VARCHAR(255) )
INSERT @Stations SELECT 1, 'Glasgow' UNION ALL SELECT 2, 'Edinburgh' UNION ALL SELECT 3, 'York' UNION ALL SELECT 4, 'London' UNION ALL SELECT 5, 'Aberdeen' UNION ALL SELECT 6, 'Bjuv'
-- Prepare link sample data DECLARE @Links TABLE ( fromID INT, toID INT )
INSERT @Links SELECT 1, 2 UNION ALL SELECT 1, 5 UNION ALL SELECT 1, 3 UNION ALL SELECT 2, 1 UNION ALL SELECT 2, 3 UNION ALL SELECT 3, 1 UNION ALL SELECT 3, 2 UNION ALL SELECT 3, 4 UNION ALL SELECT 4, 3 UNION ALL SELECT 6, 4 UNION ALL SELECT 5, 1
-- Stage all station combinations DECLARE @Stage TABLE ( fromID INT, fromName VARCHAR(255), toID INT, toName VARCHAR(255), hops INT, hopPath VARCHAR(MAX) )
INSERT @Stage ( fromID, fromName, toID, toName ) SELECT s1.stationID, s1.name, s2.stationID, s2.name FROM @Stations AS s1 INNER JOIN @Stations AS s2 ON s2.stationID <> s1.stationID
-- Stage link permutations DECLARE @Lyx TABLE ( hops INT, fromID INT, toID INT, hopPath VARCHAR(MAX) )
INSERT @Lyx ( hops, fromID, toID, hopPath ) SELECT 0, l.fromID, l.toID, d.name + '-' + e.name FROM @Links AS l INNER JOIN @Stations AS d ON d.stationID = l.fromID INNER JOIN @Stations AS e ON e.stationID = l.toID
-- Start processing links DECLARE @hops INT
SET @hops = -1
WHILE @@ROWCOUNT > 0 BEGIN SET @hops = @hops + 1
UPDATE s SET s.hops = l.hops, s.hopPath = l.hopPath FROM @Stage AS s INNER JOIN @Lyx AS l ON l.fromID = s.fromID AND l.toID = s.toID AND l.hops = @hops WHERE s.hops IS NULL AND l.hops = @hops
INSERT @Lyx ( hops, fromID, toID, hopPath ) SELECT DISTINCT q.Hops + 1, q.fromID, w.toID, q.hopPath + '-' + s.name FROM @Lyx AS q INNER JOIN @Lyx AS w ON w.fromID = q.toID INNER JOIN @Stations AS s ON s.stationID = w.toID WHERE q.hops = @hops AND q.fromID <> w.toID AND NOT EXISTS (SELECT * FROM @Lyx AS e WHERE e.fromID = q.fromID AND e.toID = w.toID) END
-- Show the final output SELECT fromName, toName, hops, hopPath FROM @Stage ORDER BY fromName, toName
Final output from above sample data is
fromName toName hops hopPath --------- --------- ---- --------------------------------- Aberdeen Bjuv NULL NULL Aberdeen Edinburgh 1 Aberdeen-Glasgow-Edinburgh Aberdeen Glasgow 0 Aberdeen-Glasgow Aberdeen London 2 Aberdeen-Glasgow-Edinburgh-London Aberdeen York 1 Aberdeen-Glasgow-York Bjuv Aberdeen 2 Bjuv-London-York-Aberdeen Bjuv Edinburgh 2 Bjuv-London-York-Edinburgh Bjuv Glasgow 2 Bjuv-London-York-Glasgow Bjuv London 0 Bjuv-London Bjuv York 1 Bjuv-London-York Edinburgh Aberdeen 1 Edinburgh-Glasgow-Aberdeen Edinburgh Bjuv NULL NULL Edinburgh Glasgow 0 Edinburgh-Glasgow Edinburgh London 1 Edinburgh-York-London Edinburgh York 0 Edinburgh-York Glasgow Aberdeen 0 Glasgow-Aberdeen Glasgow Bjuv NULL NULL Glasgow Edinburgh 0 Glasgow-Edinburgh Glasgow London 1 Glasgow-York-London Glasgow York 0 Glasgow-York London Aberdeen 2 London-York-Edinburgh-Aberdeen London Bjuv NULL NULL London Edinburgh 1 London-York-Edinburgh London Glasgow 1 London-York-Glasgow London York 0 London-York York Aberdeen 1 York-Glasgow-Aberdeen York Bjuv NULL NULL York Edinburgh 0 York-Edinburgh York Glasgow 0 York-Glasgow York London 0 York-London
E 12°55'05.63" N 56°04'39.26" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2008-11-27 : 08:21:47
|
Hi Peso,
I feel bad about this: you've obviously put quite some effort to trying to help me and generated some very impressive results. Hopefully they'll be useful to someone else.
But the point of my question wasn't to solve the links problem -- that was just some example that I thought of that could be handled recursively. It was *just* to try and learn more about recursive CTE's.
I don't want you to think I'm not grateful for the time you've expended. I'm a big admirer of your posts here and I've learned a lot from them. You are also very selfless with your time and energy and I admire that a lot.
I think I'd like to cancel this post. I don't want to waste any more of your or anyone else's time.
Sorry for the misunderstanding.
Charlie =============================================================== Msg 3903, Level 16, State 1, Line 1736 The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-11-27 : 08:28:09
|
Not that much time. 15 minutes maybe. I changed an existing algorithm to suit this need.
There is no elegant way to resolve "circular reference" in a CTE, because you can only reference the cte once in the recursive part.
E 12°55'05.63" N 56°04'39.26" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2008-11-27 : 08:39:35
|
That's what I was finding out with some of my tests which had a "AND NOT EXISTS ()" clause. I was wondering if there was some more efficent was (still in the CTE) that I could logically trap repeat trips / triangle paths. The best I could come up with was a hard cap to the recursion level and then choose the minimum steps to each location. Obviously that generates a lot more data than is actually required.
So a recursive CTE is really only useful to navigate a tree rather than that expanding it to a node net like this?
Charlie =============================================================== Msg 3903, Level 16, State 1, Line 1736 The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-11-27 : 08:54:01
|
In short, yes. There may be another solution, and that is a recursive csv string!
Setting the maximum recursion level is deceiving. What if there is a path longer than the maximum recursion level?
E 12°55'05.63" N 56°04'39.26" |
 |
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2008-11-27 : 09:06:21
|
Yes -- I thought of that. I read that the maximum recursion level for a CTE was 32767 levels which I though would probably be enough for any conceivable trip.
However, I think with a sufficiently large dataset the CTE as posted (with a hard limit of 32767 say) would probably either:
a) consume all available system resources and die / slow the dbserver to a crawl. b) not finish by next week
-------------------------
A purely recursive function is limiited to 32 recursion levels? So for a good efficient answer to the task of plotting the journeys would have to be performed by a hybrid function as you posted.
-------------------------
What's your "There may be another solution, and that is a recursive csv string!" idea?
Charlie =============================================================== Msg 3903, Level 16, State 1, Line 1736 The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-11-27 : 09:17:56
|
I ahve solved it... Give me some more minutes...
E 12°55'05.63" N 56°04'39.26" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-11-27 : 09:25:09
|
It seems the problem wasn't that hard after all  This algorithm will only present the possible ways, not all combinations.DECLARE @Stations TABLE ( stationID INT, name VARCHAR(255) )
INSERT @Stations SELECT 1, 'Glasgow' UNION ALL SELECT 2, 'Edinburgh' UNION ALL SELECT 3, 'York' UNION ALL SELECT 4, 'London' UNION ALL SELECT 5, 'Aberdeen' UNION ALL SELECT 6, 'Bjuv'
DECLARE @Links TABLE ( fromID INT, toID INT )
INSERT @Links SELECT 1, 2 UNION ALL SELECT 1, 5 UNION ALL SELECT 1, 3 UNION ALL SELECT 2, 1 UNION ALL SELECT 2, 3 UNION ALL SELECT 3, 1 UNION ALL SELECT 3, 2 UNION ALL SELECT 3, 4 UNION ALL SELECT 4, 3 UNION ALL SELECT 6, 4 UNION ALL SELECT 5, 1
;WITH paths(stationIDs, pathLength, hopPath, fromID, fromName, toName) AS ( SELECT CAST(stationID AS VARCHAR(MAX)), CAST(-1 AS INT), CAST(name AS VARCHAR(MAX)), stationID, CAST(name AS VARCHAR(MAX)), CAST(NULL AS VARCHAR(MAX)) FROM @Stations
UNION ALL
SELECT p.stationIDs + '/' + CAST(l.toID AS VARCHAR(MAX)), pathLength + 1, p.hopPath + '-' + s.name, l.toID, p.fromName, CAST(s.name AS VARCHAR(MAX)) FROM paths AS p INNER JOIN @Links AS l ON l.fromID = p.fromID INNER JOIN @Stations AS s ON s.stationID = l.toID WHERE '/' + stationIDs + '/' NOT LIKE '%/' + CAST(l.toID AS VARCHAR(MAX)) + '/%' )
SELECT fromName, toName, pathLength AS hops, hopPath FROM paths WHERE pathLength >= 0 ORDER BY fromName, toName, pathLength OPTION (MAXRECURSION 0)
E 12°55'05.63" N 56°04'39.26" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-11-27 : 09:36:19
|
To get only the shortest path between two stations, try this. Replace last SELECT statement with the SELECT statement below.SELECT fromName, toName, hops, hopPath FROM ( SELECT fromName, toName, pathLength AS hops, hopPath, ROW_NUMBER() OVER (PARTITION BY fromName, toName ORDER BY pathLength) AS recID FROM paths WHERE pathLength >= 0 ) AS d WHERE recID = 1 ORDER BY fromName, toName OPTION (MAXRECURSION 0)
E 12°55'05.63" N 56°04'39.26" |
 |
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2008-11-27 : 10:51:34
|
Peso: That's both simple and a bit brilliant.
I see what you mean about the delimited string now.
Charlie =============================================================== Msg 3903, Level 16, State 1, Line 1736 The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2008-11-27 : 11:06:45
|
Thanks for the discussion Peso, It's been very useful.
Also, thanks to Bjuv and a bit of recursive wikipediaing I now know a bit about "Skåne län". A part of the work I had never heard of.
Many thanks (once again). Hope the weather is nice for you. (cold rain here in Edinburgh)
Charlie =============================================================== Msg 3903, Level 16, State 1, Line 1736 The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2008-11-27 : 12:02:12
|
Here's modified code with weighting. It was quite easy to add weights to the links: DECLARE @Stations TABLE ( [stationID] INT , [name] VARCHAR(255) )
INSERT @Stations SELECT 1, 'Glasgow' UNION ALL SELECT 2, 'Edinburgh' UNION ALL SELECT 3, 'York' UNION ALL SELECT 4, 'London' UNION ALL SELECT 5, 'Aberdeen' UNION ALL SELECT 6, 'Bjuv' UNION ALL SELECT 7, 'Båstad'
DECLARE @Links TABLE ( [fromID] INT , [toID] INT , [weighting] INT )
INSERT @Links SELECT 1, 2, 1 UNION ALL -- Glasgow, Edinburgh SELECT 1, 5, 2 UNION ALL -- Glasgow, Aberdeen SELECT 1, 3, 4 UNION ALL -- Glasgow, York SELECT 2, 1, 1 UNION ALL -- Edinburgh, Glasgow SELECT 2, 3, 2 UNION ALL -- Edinburgh, York SELECT 2, 7, 13 UNION ALL -- Edinburgh, Båstad SELECT 3, 2, 2 UNION ALL -- York, Edinburgh SELECT 3, 4, 3 UNION ALL -- York, London SELECT 4, 3, 3 UNION ALL -- London, York SELECT 4, 6, 6 UNION ALL -- London, Bjuv SELECT 5, 1, 2 UNION ALL -- Aberdeen, glasgow SELECT 6, 4, 6 UNION ALL -- Bjuv, London SELECT 6, 7, 4 UNION ALL -- Bjuv, Båstad SELECT 7, 2, 13 UNION ALL -- Båstad, Edinburgh SELECT 7, 6, 4 -- Båstad, Bjuv
/* -- Node Map (* = Go to)
A * 2 / * 1 13 G *---* E *--* Bd \ * * 4 \ / 2 | * * | Y | 4 * | 3 | | * 6 * L *------* Bj
*/ ;WITH paths([stationIDs], [pathLength], [weight], [hopPath], [fromID], [fromName], [toName]) AS ( SELECT CAST([stationID] AS VARCHAR(MAX)) , CAST(0 AS INT) , CAST(0 AS INT) , CAST([name] AS VARCHAR(MAX)) , [stationID] , CAST([name] AS VARCHAR(MAX)) , CAST(NULL AS VARCHAR(MAX)) FROM @Stations UNION ALL SELECT p.[stationIDs] + '/' + CAST(l.[toID] AS VARCHAR(MAX)) , [pathLength] + 1 , [weight] + l.[weighting] , p.[hopPath] + ' -> ' + s.[name] , l.[toID] , p.[fromName] , CAST(s.[name] AS VARCHAR(MAX)) FROM paths p INNER JOIN @Links l ON l.[fromID] = p.[fromID] INNER JOIN @Stations s ON s.[stationID] = l.[toID] WHERE '/' + p.[stationIDs] + '/' NOT LIKE '%/' + CAST(l.[toID] AS VARCHAR(MAX)) + '/%' ) SELECT d.[fromName] , d.[toName] , d.[hops] , d.[hopPath] , d.[weight] FROM ( SELECT [fromName] , [toName] , [pathLength] AS hops , [hopPath] , [weight] , ROW_NUMBER() OVER (PARTITION BY fromName, toName ORDER BY [weight]) AS recID FROM paths WHERE pathLength >= 1 ) d WHERE [recID] = 1 ORDER BY [fromName] , [toName] OPTION (MAXRECURSION 0) |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-11-27 : 13:33:33
|
I trust weight is cost of fare?
If A-C costs 40 pounds, but A-B-C costs 10+23 = 33, is that an alternative? Maybe you should add duration as another dimension?
E 12°55'05.63" N 56°04'39.26" |
 |
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2008-11-28 : 04:51:52
|
Well I meant any kind of weighting really -- whether that be cost or distance or duration or difficulty of terrain. Doesn't really matter. Adding more columns to model these things seems trivially easy.
Thanks again, I really enjoyed tinkering with this.
Charlie.
Charlie =============================================================== Msg 3903, Level 16, State 1, Line 1736 The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
|
|
|
|
|