Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Return Single Row for Join

Author  Topic 

jp2code
Posting Yak Master

175 Posts

Posted - 2008-11-10 : 15:47:03
Background: We have 2 tables that I am pulling information from.

The Machine_Data is filled by our vendor's machines whenever parts are made: It inserts the string OP_ID, which consists of the Employee's FirstName and LastName, as read from our EmployeeData table (i.e. "John Doe"). The information in EmployeeData is entered by HR and sometimes contains unnecessary trailing spaces (the original source code is gone, and so the application has to be rewritten to fix the problem), so the OP_ID as entered by our vendor's machine could contain 2 spaces between the FirstName and LastName.

Next: When an employee gets terminated and rehired at a later date, they are entered again in the EmployeeData table. So, there are employees with more than one entry in our EmployeeData table. Each EmployeeData record is unique by a Primary Key integer called [Count].

With this background, I have this query that I need to modify:
SELECT
CASE WHEN (ED.[LastName] IS NULL)
THEN MD.[OP_ID]
ELSE ED.[FirstName]+' '+ED.[LastName]+' ('+ED.[NUM]+')' END
AS Operator, MD.[Item_Date]
FROM Machine_Data MD
LEFT JOIN EmployeeData ED
ON Replace(MD.[OP_ID], ' ', ' ')=RTrim(ED.[FirstName])+' '+RTrim(ED.[LastName])
WHERE
(MD.[Item_Date] BETWEEN @DateStart AND @DateEnd)
Where:
  • ED.FirstName, ED.LastName, ED.NUM, and MD.OP_ID are all stored as string values; ED.[Count] is an integer.
  • The CASE statement exists because some records are Machine Diagnostics entries, which do not correspond to any records in our EmployeeData table.
  • The Replace() function is there if HR ever updates and removes the trailing spaces from the employee's name and the vendor's OP_ID data is not changed.
The Problem: I am getting duplicate records for employees that have more than one employee ED.NUM. I know that to get the current employee and prevent duplicates, I need to select the employee that has the highest ED.[Count], but I don't know the best way to modify this query.

Could someone give me a suggestion? This query is currently in production, and I don't trust my limited SQL knowledge.

Regards,
~Joe


Avoid Sears Home Improvement

hanbingl
Aged Yak Warrior

652 Posts

Posted - 2008-11-10 : 16:59:41
[CODE]
;WITH EmployeeData_TEMP as (
select *, ROW_NUMBER() OVER (PARTITION BY FIRSTNAME, LASTNAME, NUM ORDER BY [COUNT] DESC) AS RN
from EmployeeData
)
SELECT
CASE WHEN (ED.[LastName] IS NULL)
THEN MD.[OP_ID]
ELSE ED.[FirstName]+' '+ED.[LastName]+' ('+ED.[NUM]+')' END
AS Operator, MD.[Item_Date]
FROM Machine_Data MD
LEFT JOIN EmployeeData_TEMP ED
ON Replace(MD.[OP_ID], ' ', ' ')=RTrim(ED.[FirstName])+' '+RTrim(ED.[LastName])
AND ED.RN = 1
WHERE
(MD.[Item_Date] BETWEEN @DateStart AND @DateEnd)[/CODE]
Go to Top of Page

hanbingl
Aged Yak Warrior

652 Posts

Posted - 2008-11-10 : 17:14:17
Oops, SQL 2000:

SELECT
ED.COUNT,
CASE WHEN (ED.[LastName] IS NULL)
THEN MD.[OP_ID]
ELSE ED.[FirstName]+' '+ED.[LastName]+' ('+ED.[NUM]+')' END
AS Operator, MD.[Item_Date]
FROM
@Machine_Data MD
LEFT JOIN (
select distinct (select top 1 [COUNT] FROM EmployeeData
where Firstname = e.Firstname
and ISNULL(lastname,'') = ISNULL(E.lastname,'')
and num = e.num order by count desc) as [count], Firstname, NULLIF(lastname,'') AS LASTNAME, num
from EmployeeData e
) ED
ON Replace(MD.[OP_ID], ' ', ' ')=RTrim(ED.[FirstName])+' '+RTrim(ED.[LastName])
WHERE
(MD.[Item_Date] BETWEEN '1995-01-13 00:00:00.000' AND '2000-05-25 00:00:00.000')
Go to Top of Page

jp2code
Posting Yak Master

175 Posts

Posted - 2008-11-10 : 18:02:27
Wow. I'm studying this still.


Avoid Sears Home Improvement
Go to Top of Page
   

- Advertisement -