| Author |
Topic |
|
SMerrill
Posting Yak Master
206 Posts |
Posted - 2004-05-11 : 13:18:55
|
SQL Team:Suppose there is an interactive Telecom system that records agent choices in each call. These events are recorded into an event log, which is imported into a table.The EventLog table is defined in the script below. Also, see the EntryTypes list.The rule is that a call always begins with an Answer entry, and terminates with an End entry, but may have any number of CallData or Error entries in between.(The CallID column tolerates NULLs because call IDs are computed in the next pass.)I wish to normalize this log file using a set-based method, and stay away from a cursor if possible.There is another related table named Calls, and related to that is the third Agent table.Now for that next pass: My thought is that SELECT ID FROM EventLog WHERE EntryType=1 will give me all the Answer entries. I wish to insert one new Calls record for every Answer entry, and refer to each new Calls.CallID number with a respective New EventLog.CallID reference.Here is a script to get started. CREATE TABLE Agents (AgentID INT IDENTITY (1, 1) NOT NULL PRIMARY KEY CLUSTERED, AgentName VARCHAR (30) NOT NULL) CREATE TABLE Calls (CallID INT IDENTITY (1, 1) NOT NULL PRIMARY KEY CLUSTERED, AgentID INT NULL) CREATE TABLE EntryTypes (EntryTypeID INT IDENTITY (1, 1) NOT NULL PRIMARY KEY CLUSTERED, EntryTypeName CHAR (10) NULL ) CREATE TABLE EventLog (ID INT IDENTITY (1, 1) NOT NULL PRIMARY KEY CLUSTERED, EntryTypeID INT NOT NULL ,Entry CHAR (10) NOT NULL ,CallID INT NULL ) ALTER TABLE Calls ADDCONSTRAINT FK_Calls_Agents FOREIGN KEY (AgentID) REFERENCES Agents (AgentID) ALTER TABLE EventLog ADD CONSTRAINT FK_EventLog_Calls FOREIGN KEY (CallID) REFERENCES Calls (CallID), CONSTRAINT FK_EventLog_EntryTypes FOREIGN KEY (EntryTypeID) REFERENCES EntryTypes (EntryTypeID) INSERT INTO [Agents] ([AgentName]) VALUES('Joe')INSERT INTO [Agents] ([AgentName]) VALUES('Tom')INSERT INTO [Agents] ([AgentName]) VALUES('Kris')INSERT INTO [Agents] ([AgentName]) VALUES('Shaun')SET IDENTITY_INSERT [dbo].[EntryTypes] ONINSERT INTO [EntryTypes] ([EntryTypeID], [EntryTypeName]) VALUES(1,'Answer')INSERT INTO [EntryTypes] ([EntryTypeID], [EntryTypeName]) VALUES(2,'CallData')INSERT INTO [EntryTypes] ([EntryTypeID], [EntryTypeName]) VALUES(3,'Error')INSERT INTO [EntryTypes] ([EntryTypeID], [EntryTypeName]) VALUES(4,'End')SET IDENTITY_INSERT [dbo].[EntryTypes] OFFSET IDENTITY_INSERT [dbo].[EventLog] ONINSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(1,1,'ANS',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(3,2,'AGENT,1',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(4,2,'YES',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(5,3,'ERR,3',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(6,4,'END',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(7,1,'ANS',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(8,2,'AGENT,4',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(9,2,'NO',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(10,4,'END',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(11,1,'ANS',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(12,2,'AGENT,3',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(13,3,'ERR,1',NULL)INSERT INTO [EventLog] ([ID],[EntryTypeID],[Entry],[CallID]) VALUES(14,4,'END',NULL)SET IDENTITY_INSERT [dbo].[EventLog] OFF
Thanks for considering this problem. I know there must be a set-based method. There is a huge amount of data (Thousands of entries each day, cumulative forever.) |
|
|
ehorn
Master Smack Fu Yak Hacker
1632 Posts |
Posted - 2004-05-11 : 13:41:39
|
| Question: Why not just add AgentID to EventLog? |
 |
|
|
SMerrill
Posting Yak Master
206 Posts |
Posted - 2004-05-11 : 13:49:58
|
Because my primary focus is to count the calls made by each agent. (I have simplified the actual model greatly.) Without identifying each separate call, I'm stuck. It would be easy to iterate through the log with a cursor, but I prefer elegance.Of course, the next logical step for me is to fill in the CallID column in the remaining rows, then finally fill in the Calls.AgentID column based upon the number following the AGENT entries.Besides, if I added AgentID to EventLog, I still wouldn't be able to fill it with data. Thanks for asking.My desired result for this step is as follows:SELECT * FROM EventLogID,EntryTypeID,Entry,CallID 1,1,'ANS' ,1 3,2,'AGENT,1',NULL 4,2,'YES' ,NULL 5,3,'ERR,3' ,NULL 6,4,'END' ,NULL 7,1,'ANS' ,2 8,2,'AGENT,4',NULL 9,2,'NO' ,NULL10,4,'END' ,NULL11,1,'ANS' ,312,2,'AGENT,3',NULL13,3,'ERR,1' ,NULL14,4,'END' ,NULLSELECT * From CallsCallID,AgentID1,NULL2,NULL3,NULL Even though I did not post this follow-up question yet, the step after that will contain:ID,EntryTypeID,Entry,CallID 1,1,'ANS' ,1 3,2,'AGENT,1',1 4,2,'YES' ,1 5,3,'ERR,3' ,1 6,4,'END' ,1 7,1,'ANS' ,2 8,2,'AGENT,4',2 9,2,'NO' ,210,4,'END' ,211,1,'ANS' ,312,2,'AGENT,3',313,3,'ERR,1' ,314,4,'END' ,3 I hope you see my train of thought clearer now. |
 |
|
|
SMerrill
Posting Yak Master
206 Posts |
Posted - 2004-05-11 : 14:33:29
|
Then finally, I will need:SELECT * FROM CallsCallID, AgentID1,12,43,3 This sure sounds like a job for a cursor to me. (Perish the thought!)I'm confident I can do all but the first step myself.So, prevail and conquer, Yak Warriors!~ Shaun MerrillSeattle, WA |
 |
|
|
ehorn
Master Smack Fu Yak Hacker
1632 Posts |
Posted - 2004-05-11 : 15:37:46
|
| [code]declare @callid intset @callid = 1if exists (select 1 from calls)select @callid = max(callid)from callsselect @callid callid,identity(int,0,1) ct,e.ID-1 id,SUBSTRING(e.Entry,CHARINDEX(',',e.Entry,1)+1,LEN(e.Entry)) as AgentIDinto #tfrom EventLog ewhere EntryTypeID=2and entry like 'AGENT%'order by e.IDselect identity(int,0,1) ct,id+0 e_idinto #efrom eventlogwhere entrytypeid = 4order by idinsert callsselect agentidfrom #tupdate eventlogset callid = d.callidfrom( select callid+t.ct callid,id,agentid,e_id from #t t join #e e on e.ct = t.ct) d, eventlog ewhere e.id between d.id and d.e_idselect * from callsselect * from eventlogdrop table #t,#e[/code] |
 |
|
|
SMerrill
Posting Yak Master
206 Posts |
Posted - 2004-05-11 : 16:47:33
|
EHORN is incredible! Bravo!Thanks, EHorn, you got me on the right track, and taught me how to do this kind of work. I had no idea you could create temp tables with identity columns in them!The first assumption your code makes is that the Agent entry always follows the Answer entry, which is not always true, so I will need to test on Answer alone.Secondly, I was getting an error when I ran it, and if I understand what's going on here, your inner select is getting the wrong set of CallID numbers, and works better if it isselect C.CallID,id, t.agentid, e_id from #t t join #e e on e.ct = t.ct join Calls C on C.AgentID=T.AgentID but this makes the assumption that the Agent didn't call more than once, which is not a correct rule. So I'm still on the drawing board, but you have certainly taught me a big lesson today! I'm probably not the only one learning new tricks on this wonderful forum . . . Thanks again,~ Shaun MerrillSeattle, WA |
 |
|
|
SMerrill
Posting Yak Master
206 Posts |
Posted - 2004-05-11 : 17:37:13
|
Here is the final result without those assumptions, for the education of all readers.This answers the generic question, "how do you normalize multi-line records" very nicely.--Reset the test environment:update eventlog set CallID = NULL WHERE CallID is not nullDELETE FROM CallsDBCC CHECKIDENT (Calls, RESEED, 5)DROP TABLE EndLoc,AnsLoc,AgentLocGO-- Calculate the maximum CallID in Calls:declare @callid intinsert into Calls (AgentID) VALUES (1)set @callid = 0if exists (select 1 from calls) select @callID = max(callid) from callsPRINT 'Call ID='+ cast (@CallID as Varchar)-- Calculate the Answer tags:select @CallID CallID, identity(int,1,1) ct, e.[ID]+0 AnsLocinto AnsLocfrom EventLog ewhere EntryTypeID = 1-- Calculate the End tags:select identity(int,1,1) ct, id+0 EndLocinto EndLocfrom EventLogwhere EntryTypeID = 4-- Calculate the Agent tags:select identity(int,1,1) ct, id+0 AgentLoc, SUBSTRING(E.ENTRY,CHARINDEX(',',E.ENTRY,1)+1,LEN(E.ENTRY)) AgentIDinto AgentLocfrom EventLog Ewhere (EntryTypeID = 2 and Entry Like 'AGENT%')--Populate the Agent numbers from each new callinsert Callsselect AgentIDfrom AgentLoc--Populate the CallID foreign key:update eventlogset CallID = d.CallIDfrom( SELECT AnsLoc.ct + AnsLoc.CallID AS CallID, AnsLoc.AnsLoc, EndLoc.EndLoc FROM EndLoc INNER JOIN AnsLoc ON EndLoc.ct = AnsLoc.ct INNER JOIN AgentLoc ON AnsLoc.ct = AgentLoc.ct) d, EventLog ewhere e.id between d.AnsLoc and d.EndLocselect * from Callsselect * from EventLog Thanks again, SQL Team!~ Shaun MerrillSeattle, WA |
 |
|
|
SMerrill
Posting Yak Master
206 Posts |
Posted - 2004-05-14 : 12:58:59
|
| Well, this is getting more hairy all the time. What if there are some Answer entries missing, or some End entries missing? I now want to say, "match each answer tag with an end tag that is positionally greater than itself, but not greater than the very next answer tag."Think, struggle, think . . .~ Shaun MerrillSeattle, WA |
 |
|
|
SMerrill
Posting Yak Master
206 Posts |
Posted - 2004-05-17 : 01:48:11
|
| How do I get Next Record and Current record on the same row?I'm faced with set-based thoughts like "Give me the position of the tag which is the minimum position in the set which is positionally greater than this current record." Does anyone have an answer to this question?"~ Shaun MerrillSeattle, WA |
 |
|
|
|
|
|