Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Piemur1
Starting Member
10 Posts |
Posted - 2014-08-05 : 14:19:54
|
Hello, I have a little problem getting a query I'm trying to build working. I have a large database of user records and am trying to get a count of how many email addresses of potential duplicate records there are, broken down by the first letter of the email address. This is the query I am using to get the list of email addresses: SELECT DISTINCT A.Email_AddressFROM User_x AS AINNER JOIN User_x AS BON A.Email_Address = B.Email_Address AND A.Login_Name != B.Login_NameWHERE A.Last_Name LIKE B.Last_NameORDER BY A.Email_AddressAs you can see, I'm using an inner join to find any records that have the exact same email address, but is a different login name. Now, I want to build a query that will give me the counts of all duplicate email addresses broken down by first letter. So I built this query:SELECT LEFT(A.Email_Address, 1), COUNT(LEFT(A.Email_Address, 1))FROM User_x AS AINNER JOIN User_x AS BON A.Email_Address = B.Email_Address AND A.Login_Name != B.Login_NameWHERE A.Last_Name LIKE B.Last_NameGROUP BY LEFT(A.Email_Address, 1)ORDER BY LEFT(A.Email_Address, 1)Except this query is flawed in that it is giving me counts of every record with those email addresses (as in every record that has a duplicate would give a count of at LEAST 2 per email address) and would give me an inaccurate number. I had to remove the DISTINCT clause, because it would either give me a syntax error or would give me counts of exactly 1 for every letter (dependant on where the DISTINCT clause was placed). This query is something that I would LIKE to build, but am unable to.SELECT LEFT(DISTINCT A.Email_Address, 1), COUNT(LEFT(DISTINCT A.Email_Address, 1))FROM User_x AS AINNER JOIN User_x AS BON A.Email_Address = B.Email_Address AND A.Login_Name != B.Login_NameWHERE A.Last_Name LIKE B.Last_NameGROUP BY LEFT(DISTINCT A.Email_Address, 1)ORDER BY LEFT(DISTINCT A.Email_Address, 1)Also please note this is a production database, so I will not be building any new tables. |
|
Piemur1
Starting Member
10 Posts |
Posted - 2014-08-05 : 14:24:58
|
I am trying to get a results list something like this:A 6B 7C 11D 657E 4F 2G 3H 1I 2J 1007K 26L 21M 16N 4P 2R 19S 870T 342W 6Y 2(I had to cheat and use excel to actually get these results) |
|
|
ScottPletcher
Aged Yak Warrior
550 Posts |
Posted - 2014-08-05 : 14:27:19
|
I believe you can replace your first query with the query below. Then use the second query to get the totals you want, adjusting it if/as needed.SELECT Email_Address, Last_Name, COUNT(DISTINCT Login_Name) AS Total_LoginsFROM User_x GROUP BY Email_Address, Last_NameHAVING COUNT(DISTINCT Login_Name) > 1ORDER BY Email_AddressSELECT LEFT(Email_Address, 1) AS Email_First_Letter, SUM(Total_Logins)AS Total_LoginsFROM ( SELECT Email_Address, Last_Name, COUNT(DISTINCT Login_Name) AS Total_Logins FROM User_x GROUP BY Email_Address, Last_Name HAVING COUNT(DISTINCT Login_Name) > 1) AS subquery1GROUP BY LEFT(Email_Address, 1)ORDER BY Email_First_Letter |
|
|
Piemur1
Starting Member
10 Posts |
Posted - 2014-08-05 : 14:38:21
|
That works a bit better, but is still giving me counts of all the users with duplicate emails. I want to get just the total number of emails by itself, so I had to adjust your query a bit and am getting some record counts where the last name are not the same. There are some records that have the same email address but for different people (such as org boxes) and I do not want to include those in the count.Thank you for your assistance, and I had not really considered pulling the query in a nest like that. |
|
|
|
|
|
|
|