Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Is there a Set-Based Method?

Author  Topic 

SMerrill
Posting Yak Master

206 Posts

Posted - 2004-05-11 : 13:18:55
SQL Team:
Suppose there is an interactive Telecom system that records agent choices in each call. These events are recorded into an event log, which is imported into a table.
The EventLog table is defined in the script below. Also, see the EntryTypes list.
The rule is that a call always begins with an Answer entry, and terminates with an End entry, but may have any number of CallData or Error entries in between.
(The CallID column tolerates NULLs because call IDs are computed in the next pass.)

I wish to normalize this log file using a set-based method, and stay away from a cursor if possible.

There is another related table named Calls, and related to that is the third Agent table.

Now for that next pass: My thought is that
SELECT ID FROM EventLog WHERE EntryType=1
will give me all the Answer entries. I wish to insert one new Calls record for every Answer entry, and refer to each new Calls.CallID number with a respective New EventLog.CallID reference.

Here is a script to get started.


CREATE TABLE Agents (AgentID INT IDENTITY (1, 1) NOT NULL
PRIMARY KEY CLUSTERED, AgentName VARCHAR (30) NOT NULL)
CREATE TABLE Calls (CallID INT IDENTITY (1, 1) NOT NULL
PRIMARY KEY CLUSTERED, AgentID INT NULL)
CREATE TABLE EntryTypes (EntryTypeID INT IDENTITY (1, 1) NOT NULL
PRIMARY KEY CLUSTERED, EntryTypeName CHAR (10) NULL )
CREATE TABLE EventLog (ID INT IDENTITY (1, 1) NOT NULL
PRIMARY KEY CLUSTERED, EntryTypeID INT NOT NULL ,Entry CHAR (10) NOT NULL ,CallID INT NULL )

ALTER TABLE Calls ADD
CONSTRAINT FK_Calls_Agents FOREIGN KEY (AgentID)
REFERENCES Agents (AgentID)
ALTER TABLE EventLog ADD
CONSTRAINT FK_EventLog_Calls FOREIGN KEY (CallID) REFERENCES Calls (CallID),
CONSTRAINT FK_EventLog_EntryTypes FOREIGN KEY (EntryTypeID)
REFERENCES EntryTypes (EntryTypeID)

INSERT INTO [Agents] ([AgentName]) VALUES('Joe')
INSERT INTO [Agents] ([AgentName]) VALUES('Tom')
INSERT INTO [Agents] ([AgentName]) VALUES('Kris')
INSERT INTO [Agents] ([AgentName]) VALUES('Shaun')

SET IDENTITY_INSERT [dbo].[EntryTypes] ON
INSERT INTO [EntryTypes] ([EntryTypeID], [EntryTypeName]) VALUES(1,'Answer')
INSERT INTO [EntryTypes] ([EntryTypeID], [EntryTypeName]) VALUES(2,'CallData')
INSERT INTO [EntryTypes] ([EntryTypeID], [EntryTypeName]) VALUES(3,'Error')
INSERT INTO [EntryTypes] ([EntryTypeID], [EntryTypeName]) VALUES(4,'End')
SET IDENTITY_INSERT [dbo].[EntryTypes] OFF

SET IDENTITY_INSERT [dbo].[EventLog] ON
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(1,1,'ANS',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(3,2,'AGENT,1',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(4,2,'YES',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(5,3,'ERR,3',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(6,4,'END',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(7,1,'ANS',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(8,2,'AGENT,4',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(9,2,'NO',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(10,4,'END',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(11,1,'ANS',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(12,2,'AGENT,3',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(13,3,'ERR,1',NULL)
INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(14,4,'END',NULL)
SET IDENTITY_INSERT [dbo].[EventLog] OFF

Thanks for considering this problem. I know there must be a set-based method. There is a huge amount of data (Thousands of entries each day, cumulative forever.)

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2004-05-11 : 13:41:39
Question: Why not just add AgentID to EventLog?
Go to Top of Page

SMerrill
Posting Yak Master

206 Posts

Posted - 2004-05-11 : 13:49:58
Because my primary focus is to count the calls made by each agent. (I have simplified the actual model greatly.) Without identifying each separate call, I'm stuck. It would be easy to iterate through the log with a cursor, but I prefer elegance.
Of course, the next logical step for me is to fill in the CallID column in the remaining rows, then finally fill in the Calls.AgentID column based upon the number following the AGENT entries.
Besides, if I added AgentID to EventLog, I still wouldn't be able to fill it with data.
Thanks for asking.
My desired result for this step is as follows:
SELECT * FROM EventLog
ID,EntryTypeID,Entry,CallID
1,1,'ANS' ,1
3,2,'AGENT,1',NULL
4,2,'YES' ,NULL
5,3,'ERR,3' ,NULL
6,4,'END' ,NULL
7,1,'ANS' ,2
8,2,'AGENT,4',NULL
9,2,'NO' ,NULL
10,4,'END' ,NULL
11,1,'ANS' ,3
12,2,'AGENT,3',NULL
13,3,'ERR,1' ,NULL
14,4,'END' ,NULL

SELECT * From Calls
CallID,AgentID
1,NULL
2,NULL
3,NULL

Even though I did not post this follow-up question yet, the step after that will contain:
ID,EntryTypeID,Entry,CallID
1,1,'ANS' ,1
3,2,'AGENT,1',1
4,2,'YES' ,1
5,3,'ERR,3' ,1
6,4,'END' ,1
7,1,'ANS' ,2
8,2,'AGENT,4',2
9,2,'NO' ,2
10,4,'END' ,2
11,1,'ANS' ,3
12,2,'AGENT,3',3
13,3,'ERR,1' ,3
14,4,'END' ,3


I hope you see my train of thought clearer now.
Go to Top of Page

SMerrill
Posting Yak Master

206 Posts

Posted - 2004-05-11 : 14:33:29
Then finally, I will need:
SELECT * FROM Calls
CallID, AgentID
1,1
2,4
3,3
This sure sounds like a job for a cursor to me. (Perish the thought!)
I'm confident I can do all but the first step myself.
So, prevail and conquer, Yak Warriors!

~ Shaun Merrill
Seattle, WA
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2004-05-11 : 15:37:46
[code]declare @callid int
set @callid = 1
if exists (select 1 from calls)
select @callid = max(callid)
from calls

select @callid callid,identity(int,0,1) ct,e.ID-1 id,SUBSTRING(e.Entry,CHARINDEX(',',e.Entry,1)+1,LEN(e.Entry)) as AgentID
into #t
from EventLog e
where EntryTypeID=2
and entry like 'AGENT%'
order by e.ID

select identity(int,0,1) ct,id+0 e_id
into #e
from eventlog
where entrytypeid = 4
order by id

insert calls
select agentid
from #t

update eventlog
set callid = d.callid
from
(
select callid+t.ct callid,id,agentid,e_id
from #t t
join #e e on e.ct = t.ct
) d, eventlog e
where e.id between d.id and d.e_id

select * from calls
select * from eventlog

drop table #t,#e
[/code]
Go to Top of Page

SMerrill
Posting Yak Master

206 Posts

Posted - 2004-05-11 : 16:47:33
EHORN is incredible! Bravo!
Thanks, EHorn, you got me on the right track, and taught me how to do this kind of work. I had no idea you could create temp tables with identity columns in them!

The first assumption your code makes is that the Agent entry always follows the Answer entry, which is not always true, so I will need to test on Answer alone.

Secondly, I was getting an error when I ran it, and if I understand what's going on here, your inner select is getting the wrong set of CallID numbers, and works better if it is
select C.CallID,id, t.agentid, e_id
from #t t
join #e e on e.ct = t.ct
join Calls C on C.AgentID=T.AgentID

but this makes the assumption that the Agent didn't call more than once, which is not a correct rule. So I'm still on the drawing board, but you have certainly taught me a big lesson today! I'm probably not the only one learning new tricks on this wonderful forum . . .

Thanks again,


~ Shaun Merrill
Seattle, WA
Go to Top of Page

SMerrill
Posting Yak Master

206 Posts

Posted - 2004-05-11 : 17:37:13
Here is the final result without those assumptions, for the education of all readers.
This answers the generic question, "how do you normalize multi-line records" very nicely.

--Reset the test environment:
update eventlog set CallID = NULL WHERE CallID is not null
DELETE FROM Calls
DBCC CHECKIDENT (Calls, RESEED, 5)
DROP TABLE EndLoc,AnsLoc,AgentLoc
GO

-- Calculate the maximum CallID in Calls:
declare @callid int
insert into Calls (AgentID) VALUES (1)
set @callid = 0
if exists (select 1 from calls)
select @callID = max(callid) from calls
PRINT 'Call ID='+ cast (@CallID as Varchar)

-- Calculate the Answer tags:
select @CallID CallID,
identity(int,1,1) ct,
e.[ID]+0 AnsLoc
into AnsLoc
from EventLog e
where EntryTypeID = 1

-- Calculate the End tags:
select identity(int,1,1) ct,
id+0 EndLoc
into EndLoc
from EventLog
where EntryTypeID = 4

-- Calculate the Agent tags:
select identity(int,1,1) ct,
id+0 AgentLoc,
SUBSTRING(E.ENTRY,CHARINDEX(',',E.ENTRY,1)+1,LEN(E.ENTRY)) AgentID
into AgentLoc
from EventLog E
where (EntryTypeID = 2 and Entry Like 'AGENT%')

--Populate the Agent numbers from each new call
insert Calls
select AgentID
from AgentLoc

--Populate the CallID foreign key:
update eventlog
set CallID = d.CallID
from
(
SELECT AnsLoc.ct + AnsLoc.CallID AS CallID,
AnsLoc.AnsLoc,
EndLoc.EndLoc
FROM EndLoc
INNER JOIN AnsLoc ON EndLoc.ct = AnsLoc.ct
INNER JOIN AgentLoc ON AnsLoc.ct = AgentLoc.ct
) d, EventLog e
where e.id between d.AnsLoc and d.EndLoc

select * from Calls
select * from EventLog

Thanks again, SQL Team!


~ Shaun Merrill
Seattle, WA
Go to Top of Page

SMerrill
Posting Yak Master

206 Posts

Posted - 2004-05-14 : 12:58:59
Well, this is getting more hairy all the time. What if there are some Answer entries missing, or some End entries missing? I now want to say, "match each answer tag with an end tag that is positionally greater than itself, but not greater than the very next answer tag."
Think, struggle, think . . .

~ Shaun Merrill
Seattle, WA
Go to Top of Page

SMerrill
Posting Yak Master

206 Posts

Posted - 2004-05-17 : 01:48:11
How do I get Next Record and Current record on the same row?
I'm faced with set-based thoughts like "Give me the position of the tag which is the minimum position in the set which is positionally greater than this current record." Does anyone have an answer to this question?"

~ Shaun Merrill
Seattle, WA
Go to Top of Page
   

- Advertisement -