Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SSIS and Import/Export (2005)
 Monthly Excel Import - New to SSIS

Author  Topic 

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-21 : 08:54:39
I am creating an employee directory. We need to update employee records once a month from an excel file.

You can download a sample excel file and the sql database diagram here:

[url]http://www.ronnieo.net/files/empdirfiles.zip[/url]

My database is normalized, but the excel data, of course, is not.
All of this is presenting some interesting challenges, and I don't really know where to start.

Any suggestions?

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-08-21 : 10:22:39
Use Excel source connection to excel file and OLEDB destination connection to a staging table. then update your main table by comparing to staging table. You can also change connection to excel file dynamically using expressions for connection string. How will your excel file name be? Will it be containing month name or datename?
Go to Top of Page

JeriHatTrick
Starting Member

11 Posts

Posted - 2008-08-21 : 11:45:05
Please keep us informed of any more challenges that you encounter with this process. Thanks.
Go to Top of Page

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-21 : 13:41:30
- Thanks! If you don't mind holding my hand a little farther, what is a staging table. Is it just a table that looks just like my spreadsheet? How would I go about comparing the main table with the staging table?

- My excel file doesn't have a set name yet.

Thanks a lot,
Ronnie
Go to Top of Page

JeriHatTrick
Starting Member

11 Posts

Posted - 2008-08-21 : 15:37:15
In this case your staging table could be your spreadsheet. My hunch is that your department and division tables would be either static or populated from some other source. (I doubt this spreadsheet would be authoritative document on your organizations structure, although I could be wrong). So that integrity constraint is taken care of.

From there you could delete everyone from the db table that is no longer on the list (staging table) This would be done using the Execute SQL Task. Then you could use a Lookup task to go through your staging table and either update or insert records. There's plenty of documentation on the Lookup task online so I won't go into much detail here but basically you compare the values coming through the SSIS pipeline w/ values in your staging table. If the employee ID is found then the record already exists and you would update the record. If the employee ID is not found then it is a new record and an insertion is needed.

These are basic guidelines and there are plenty of ways to get it done. This would be one of the easiest. You could probably also get it done w/o the use of a staging table. Again, let us know if you have further questions.
Go to Top of Page

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-22 : 09:00:50
Actually, the spreadsheet is authoritative on the organization structure. The spreadsheet is a direct output of the system that keeps track of who we have and what department/division they are in. And its not uncommon for our employees to be moved into new departments, departments renamed, new departments created, etc. So, I have to check these things as well when doing the monthly import. Any advice on that process is well appreciated.

Again, thanks very much for your help. This is good stuff.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-08-22 : 09:51:53
quote:
Originally posted by ronnieoverby

Actually, the spreadsheet is authoritative on the organization structure. The spreadsheet is a direct output of the system that keeps track of who we have and what department/division they are in. And its not uncommon for our employees to be moved into new departments, departments renamed, new departments created, etc. So, I have to check these things as well when doing the monthly import. Any advice on that process is well appreciated.

Again, thanks very much for your help. This is good stuff.


does that mean the column information or structure of excel file can change each time?
Go to Top of Page

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-22 : 09:59:14
If I understand you, no. The sample spreadsheet from my initial post is what we will get each month.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-08-22 : 10:16:07
quote:
Originally posted by ronnieoverby

If I understand you, no. The sample spreadsheet from my initial post is what we will get each month.


Ok. in that case not much of a problem
The checks you wanted to perform can be done by comparing the data from excel with existing data on your table. You have a task called SCD task for this available in SSIS. you can specify the key(values that uniquely define your each row in excel) and it will do comparison based on that. Also you've the option to preserve history just in case you want each change spotted to be put as a new record rather than updating already existing one.
Go to Top of Page

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-22 : 14:22:47
Sorry, but I am lost. I don't understand what I need to do next. Do I work on making sure the division names are in sync first and work down to the employee records. Do I do it the reverse way? I can't just delete a division or department record if it isn't found in my spreadsheet, because I will have employee records that are related to them. Will this require multiple passes through the table? Is SSIS well-suited for this problem or should I do most of it with stored procedures?

Thanks,
Ronnie
Go to Top of Page

JeriHatTrick
Starting Member

11 Posts

Posted - 2008-08-22 : 14:38:41
The slowly changing dimension (SCD) mentioned earlier by Visakh16 may be a bit more work than what is required. It actually is not much work but getting a simple working SSIS package running would be 1st priority. Maybe make that a phase 2 change. It's a good idea though.

So is this spreadsheet going to contain all employees or just ones that need updating? If it contains all employees then all divisions and departments will be represented in it, right? So the first thing you would want to do in that case is delete all employees that are no longer in list, then delete all departments that no longer exist, then all divisions. That way you don't delete a primary key that is needed for a foreign key constraint. After that then you would use the reverse order to insert or update the divisions table, department table, then employees table.
Go to Top of Page

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-22 : 14:44:05
Should I just do all of those things with execute sql tasks?
Go to Top of Page

JeriHatTrick
Starting Member

11 Posts

Posted - 2008-08-22 : 15:12:32
You can try and use the Lookup task I mentioned earlier. That would be in the data flow. After you initially populate your staging table with the contents of your spreadsheet you can look up table keys. If an employee is not found in the spreadsheet then delete them (ole db command task). Do the same for departments and divisions.

Then in another data flow use the staging table and a lookup task to update or insert records in each of the db tables. This would use the ole db command and ole db destination for the update and insert respectively.

Hope I'm making myself clear. It's difficult to get concise descriptions over a forum but I'm doing my best.
Go to Top of Page

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-24 : 11:11:26
I couldn't ask for better help. Thanks very much. Please bear with me, though.

Ok So I in my Control flow I have first a execute sql task that clears my staging table.
Then I have a data flow task. The data flow task populates the staging table from the spreadsheet.
THen I have another data flow. This is where I want to use a lookup to delete stale employees. Which table should be input and which should be reference. I'm not sure how to use this lookup transformation task. How do I make it delete employees that aren't found in the staging table?

Again, I am sorry that I'm so slow with this. Thanks for the quick responses! I really appreciate it.
Go to Top of Page

JeriHatTrick
Starting Member

11 Posts

Posted - 2008-08-24 : 18:20:13
The lookup for the delete is actually more work than is necessary. My bad. You could just use an Execute SQL Task does something like this:

DELETE FROM dbo.Employee
WHERE ID NOT IN (SELECT employee_id FROM dbo.Staging)

You could do something similar for the other two tables.

As for the insert/update you would use the staging table as your source and reference the employee table. If the id is not found then insert the new record. If it is found then you could potentially need an update. I say potentially because the employee may exist in the spreadsheet without any changes to the record. After you match the employee's id you could have a subsequent lookup that checks the last name to see if it chaged, then the address, etc. This is the problem of the slowly changing dimension (SCD) that the other responder mentioned earlier. There are a few way to handle that situation. Figure out how you/your organization wants to handle that then we can proceed from there.
Go to Top of Page

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-25 : 07:26:17
I built a stored procedure with the query you gave:

DELETE FROM dbo.Employee
WHERE ID NOT IN (SELECT employee_id FROM dbo.Staging)

That did the trick for the stale employees. But now trying to think about how to do the same for the departments table, I can't come up with it.

Thanks
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-08-25 : 07:40:12
you will be always having Department information linked to employee?
Go to Top of Page

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-25 : 09:19:24
Each employee record has a foreign key field called DepartmentID.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-08-25 : 09:50:16
quote:
Originally posted by ronnieoverby

Each employee record has a foreign key field called DepartmentID.


then cant you compare and delete those to whom no employees are linked? Using query like before?
Go to Top of Page

ronnieoverby
Starting Member

36 Posts

Posted - 2008-08-25 : 09:57:16
What if a department is renamed? Or if a department is disbanded and its employees are regrouped into new departments? The employees still exist in there table, but they need a new department ID before I can delete or rename the old department.

Thanks
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-08-25 : 10:08:44
quote:
Originally posted by ronnieoverby

What if a department is renamed? Or if a department is disbanded and its employees are regrouped into new departments? The employees still exist in there table, but they need a new department ID before I can delete or rename the old department.

Thanks


In such cases wont staging contain employee linked to recent department?
Go to Top of Page
    Next Page

- Advertisement -