Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Other SQL Server 2008 Topics
 Social Network Problem

Author  Topic 

MinerOnline
Starting Member

2 Posts

Posted - 2011-12-31 : 04:59:23
My problem is related to a complex query on a social graph.

I have a user table with columns: id, name, photo, and location

Then I have a table which represents the bidirectional edges in a social graph, so it has columns userId1 and userId2

I need an efficient query which takes parameters for: userId, max # generations/hops away, and location. As output I need id, name, photo and generations away for all users connected to userId within X degrees of separation. If possible it would be great to have start and limit included as parameters.

Example:

id name location
1 mike USA
2 jon Germany
3 sara Sweeden
4 james Sweeden

id1 id2
1 2
3 1
4 2

So passing in (1 as userId, 3, Sweeden) would return
3 sara photo 1
4 james photo 2

This is very similar to the problem discussed at http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=73079 (and there were a few other similar ones I saw), but I need all users within X generations returned, with their data, and I need to be able to dynamically adjust the X generations (although X <= 3). SwePeso you seem to be all over these type of problems, I hope you find this thread!

Sorry if this is hard to follow, please ask if further clarification is needed, its my first post.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-01-16 : 05:16:39
[code]DECLARE @Sample TABLE
(
ID INT,
Name VARCHAR(20),
Location VARCHAR(20)
)

INSERT @Sample
VALUES (1, 'Mike', 'USA'),
(2, 'Jon', 'Germany'),
(3, 'Sara', 'Sweden'),
(4, 'James', 'Sweden')

DECLARE @Social TABLE
(
ID1 INT,
ID2 INT
)

INSERT @Social
VALUES (1, 2),
(3, 1),
(4, 2)

DECLARE @UserID INT = 1,
@Levels INT = 3,
@Filter VARCHAR(20) = 'Sweden'

;WITH cteSource(ID, UserPath, Iterations)
AS (
SELECT ID,
CAST('\' + CAST(ID AS VARCHAR(12)) + '\' AS VARCHAR(MAX)) AS UserPath,
CAST(@Levels AS INT)
FROM (
SELECT ID1 AS ID
FROM @Social
WHERE ID2 = @UserID

UNION

SELECT ID2
FROM @Social
WHERE ID1 = @UserID
) AS d

UNION ALL

SELECT CASE
WHEN src.ID = soc.ID1 THEN soc.ID2
ELSE soc.ID1
END AS ID,
src.UserPath + CAST(CASE WHEN src.ID = soc.ID1 THEN soc.ID2 ELSE soc.ID1 END AS VARCHAR(MAX)) + '\',
src.Iterations - 1
FROM @Social AS soc
INNER JOIN cteSource AS src ON src.ID IN (soc.ID1, soc.ID2)
WHERE src.Iterations > 0
AND src.UserPath NOT LIKE '%\' + CAST(CASE WHEN src.ID = soc.ID1 THEN soc.ID2 ELSE soc.ID1 END AS VARCHAR(MAX)) + '\%'
)
SELECT DISTINCT s.ID,
s.Name,
s.Location
FROM cteSource AS c
INNER JOIN @Sample AS s ON s.ID = c.ID
WHERE s.Location = @Filter[/code]


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

MinerOnline
Starting Member

2 Posts

Posted - 2012-01-16 : 14:39:05
Thank you so much SwePeso, works great!
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2012-01-17 : 10:19:58
Very nice, Peso! When OP posted it, I stared at this for a good half hour, but the construct with the "not like" clause that avoids the infinite loops was eluding me, so I gave up.
...
AND src.UserPath NOT LIKE '%\' + CAST(CASE WHEN src.ID = soc.ID1 THEN soc.ID2 ELSE soc.ID1 END AS VARCHAR(MAX)) + '\%'
...
Edit: Make that 2 hours, if you count the time I was pulling out my own hair in frustration.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-01-17 : 11:19:02
You have a lot hair than me then

Thanks.

//Peter



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2012-01-17 : 12:01:47
this technique is really cool. This is where peso had the original idea:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=115290

it is a little slow on bigger data sets (because the LIKE predicate can't use an index I suspect).

The quickest way I've found is to use a reducing set based while loop. (each iteration of the loop does a whole level at a time) in the loop you can use a NOT EXISTS predicate to eliminate any nodes that you've found before.

Here is some example code. NONFUNCTIONAL as none of the table exist but you should be able to see how it works.


IF OBJECT_ID('tempDb..#hopDestinations') IS NOT NULL DROP TABLE #hopDestinations
CREATE TABLE #hopDestinations (
[AirportID] CHAR(3)
, [AirlineID] CHAR(2)
, [Hops] SMALLINT
, [Path] VARCHAR(4000)
)
CREATE NONCLUSTERED INDEX IX_Foo ON #hopDestinations ([AirportID],[AirlineID])

-- Initial Population
INSERT #hopDestinations
SELECT
s.[ToAirportID]
, s.[AirlineID]
, 0
, s.[FromCityID] + ' (' + s.[FromAirportID] + ') -> ' + s.[ToCityID] + ' (' + s.[ToAirportID] + ')'
FROM
dbo.tbSimpleRoutes AS s
JOIN dbo.tbCities AS c ON c.[CityID] = s.[FromCityID]
WHERE
c.[CityIATACode] = 'EDI'
UNION ALL SELECT
a.[AirportID]
, ''
, -1
, c.[CityID] + ' (' + a.[AirportID] + ')'
FROM
dbo.tbAirports AS a
JOIN dbo.tbCities AS c ON c.[CityID] = a.[AirportCityID]
WHERE
c.[CityIATACode] = 'EDI'

DECLARE @iterations INT = 3
DECLARE @counter INT = 1
DECLARE @rows INT SET @rows = 1

WHILE @rows > 0 AND @counter <= @iterations
BEGIN
INSERT #hopDestinations
SELECT
s.[ToAirportID]
, s.[AirlineID]
, @counter AS [Hops]
, h.[Path] + ' -> ' + s.[ToCityID] + ' (' + s.[ToAirportID] + ')'
FROM
dbo.tbSimpleRoutes AS s

JOIN #hopDestinations AS h ON
s.[AirlineID] = h.[AirlineID]
AND s.[FromAirportID] = h.[AirportID]
WHERE
h.[Hops] = @counter -1
AND NOT EXISTS (
SELECT 1
FROM #hopDestinations AS h2
WHERE
h2.[AirportID] = s.[ToAirportID]
)
SET @rows = @@ROWCOUNT

SET @counter += 1

END

SELECT
c.[CityName] AS [City]
, a.[AirportName] AS [Airport]
, al.[AirlineName] AS [Airline]
, h.[Hops] + 1 AS [Flights Involved]
, h.[Path] AS [Route Taken]
FROM
#hopDestinations AS h
JOIN dbo.tbAirports AS a ON a.[AirportiD] = h.[AirportID]
JOIN dbo.tbCities AS c ON c.[CityID] = a.[AirportCityID]
JOIN dbo.tbAirlines AS al ON al.[AirlineID] = h.[AirlineID]
ORDER BY
h.[Hops] ASC
, c.[CityID]
, a.[AirportID]
, h.[AirlineID]
, h.[Path] ASC

SELECT
c.[CityName] AS [City]
, a.[AirportName] AS [Airport]
, MIN(h.[Hops]) + 1 AS [Minimum Flights]
FROM
#hopDestinations AS h
JOIN dbo.tbAirports AS a ON a.[AirportiD] = h.[AirportID]
JOIN dbo.tbCities AS c ON c.[CityID] = a.[AirportCityID]
GROUP BY
c.[CityName]
, a.[AirportName]


Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2012-01-18 : 07:02:34
If what they say about everyone in the world being only seven degrees removed from you, then the loop shouldn't run more than seven iterations! So that may be the way to go.

But, from a purely intellectual perspective, the CTE solution looks so elegant. Don't spread this around, but I post on this forum not to help anyone, but for my own entertainment and intellectual curiosity, so CTE looks nicer to me
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2012-01-18 : 08:47:06
it is very slick. Very 'nice'.

just not that fast

And -- I also feel pretty much the same way.

I'm happy if what I posted actually helped someone else but I'm *ecstatic* if I learn something while doing it

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page
   

- Advertisement -