Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Help with developing the correct query for this situation.

Author  Topic 

AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2002-11-11 : 10:36:02
Ed writes "I am trying to query a table called identities.
There can only be 1 type of identity in a given set of 3 columns, but they must remain in their own column (ie, cannot combine ident1,2 and 3 into one column). There is also an info column that has some data.

***************************************
**** table identities *****************
ident1 * ident2 * ident3 *** info *
***************************************
a 12
a 14
b 11
b 9
b 2
c 4
c 9
d 8
e 10
d 12
d 13


I want to query on the occurances of each identity and generate a result such as this:


***********************************************
**** table identities *************************
*ident1 * ident2 * ident3 * info * total *
***********************************************
a 12 2
a 14 2
b 11 3
b 9 3
b 2 3
c 4 2
c 9 2
d 8 1
e 10 1
d 12 2
d 13 2


The total column is the number of occurances of each identity in the table for the appropriate column.
I would like to generate this table in one query without creating a new table in thedatabase as an intermediate step.
I also must be able to order the results based on the total
(although I am not showing that in this resulting query).
I have looked into using COUNT and GROUP BY, but I have been unsuccessful. Help appreciated!"

mr_mist
Grunnio

1870 Posts

Posted - 2002-11-11 : 10:43:56
This would be some kind of select joined to lots of

select sum (case when identa = 'a' or identb = 'a' or identc='a' then 1 else 0 end) as sumofa

The fact that you can have any ident type in either of the rows doesn't make it particularly easy. Do you know how many different ident types there are (ie is it just a,b,c,d and e?)

Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2002-11-11 : 12:15:38
Something like this?

select tbl.ident1, tbl.ident2, tbl.ident3, total = coalesce(a.num, b.num, c.num)
tbl ,
left outer join (select ident1, num = count(*) from tbl where iden1 <> '' group by ident1) as a
on tbl.ident1 = a.ident1
left outer join (select ident2, num = count(*) from tbl where iden2 <> '' group by ident2) as b
on tbl.ident2 = b.ident2
left outer join (select ident3, num = count(*) from tbl where iden3 <> '' group by ident3) as c
on tbl.iden3 = c.ident3
order by total



==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

LarsG
Constraint Violating Yak Guru

284 Posts

Posted - 2002-11-11 : 12:31:20
select ident1,ident2,ident3,info,
(select count(*) from identities
where ident1 = i.ident1
or ident2 = i.ident2
or ident3 = i.ident3) as total
from identities i


assuming that if ident1 has a value then ident2 and ident3 is null et.c. If this is not the case the where clause can easily be fixed,

Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2002-11-13 : 13:31:58
Or, avoid the subqueries in the SELECT field list like this: (I find avoiding the subqueries can increase performance quite a bit as tables get large; below, the subquery A is only evaluated once instead of for every record in the table.)

SELECT Tbl.*, A.Total
FROM Tbl
INNER JOIN
(SELECT Indent1, Indent2, Indent3, Count(*) as Total
FROM Tbl
GROUP BY Ident1, Indent2, Indent3 ) A
ON
Tbl.Indent1 = A.Indent1 OR
Tbl.Indent2 = A.Indent2 OR
Tbl.Indent3 = A.Indent3

Again, assuming that Indent1 is not null -> indent2 is null and indent3 is null, etc.

Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2002-11-13 : 15:35:55
Interesting. Using the following setup, I make Lars' solution about 2x faster than jsmith's. Both are pretty bad, though: it's the OR join causing problems.

CREATE TABLE identities (
ident1 varchar(10),
ident2 varchar(10),
ident3 varchar(10),
info int NOT NULL
)

INSERT INTO identities (ident1, ident2, ident3, info)
SELECT
CASE WHEN pos = 0 THEN ident END,
CASE WHEN pos = 1 THEN ident END,
CASE WHEN pos = 2 THEN ident END,
info
FROM (
SELECT CHAR(FLOOR(RAND(CAST(NEWID() AS binary(4)))*26) + 65) AS ident,
FLOOR(RAND(CAST(NEWID() AS binary(4)))*3) AS pos,
FLOOR(RAND(CAST(NEWID() AS binary(4)))*100) AS info
FROM Numbers
WHERE n BETWEEN 0 AND 999999
GROUP BY n -- This *is* necessary!
) a

 
Nigel's solution (with the <> '' bits changed to IS NOT NULL) is much faster. This is a tad faster still and saves a couple of scans:

SELECT I.ident1, I.ident2, I.ident3, I.info, C.ct
FROM identities AS I
INNER JOIN (
SELECT COALESCE(ident1, ident2, ident3) AS ident,
CASE
WHEN ident1 IS NOT NULL THEN 1
WHEN ident2 IS NOT NULL THEN 2
ELSE 3
END AS pos,
COUNT(*) AS ct
FROM identities
GROUP BY ident1, ident2, ident3
) C
ON COALESCE(I.ident1, I.ident2, I.ident3) = ident
AND
CASE
WHEN ident1 IS NOT NULL THEN 1
WHEN ident2 IS NOT NULL THEN 2
ELSE 3
END = pos



Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2002-11-13 : 17:00:06
abosulutely -- the OR's kill the queries. If the data is structured so there aren't NULL values, but another value like an empty string '' or something similiar, then the OR's can become ANDs and I think at that point the shorter solutions should work fine.


Go to Top of Page
   

- Advertisement -