Implementing Table InterfacesBy Jeff Smith on 19 May 2008 | Tags: Database Design , Table Design Last time, we discussed Table inheritance, which allowed us to easily reduce redundancies in our table design by creating "base" or "super" tables that contain columns and relations that "sub-tables" automatically inherit. That generally works well, but what if you just want to have several entities share a relation, but no common attributes? That is, the entities are not really the same type, and a base class wouldn't make a lot of sense. For example, suppose you are modeling Employees and Offices, and both entities can have multiple phone numbers that you'd like to store. Is there a simple way to create a data model for that without the need for redundant tables and code? Table "Interfaces"Continuing to use OOP terminology, I like to think of this situation as modeling "Table Interfaces". Unlike Table Inheritance, we don't want to create a base entity to be inherited by multiple entities, we just want some entities to "implement" the same interface. So, in our example, both Employees and Offices would implement the "PhoneNumbers" interface. This concept allows us to pick and choose which interfaces an entity has, whereas using inheritance means that all sub-tables always have the same columns/relations. We can add add/remove interfaces from different entities without worrying about affecting other entities that have the same base. This is not as straight-forward in SQL, and there really is no direct, easy, standard way to handle this situation, unlike with Table Inheritance. Here's just one possible idea for you to consider if you need to implement this concept. First, let's start with two tables, Employees and Offices: create table Employees ( EmpID int primary key, EmpName varchar(100) ) create table Offices ( OfficeID int primary key, OfficeName varchar(100) ) We wish for our data model to allow both an Employee and an Office to have multiple phone numbers. Of course, to make our data meaningful and not contain just a random list of phone numbers, we will create a table of Phone Number Types: create table PhoneNumberTypes ( PhoneTypeCode varchar(10) primary key, Description varchar(100), Sort int not null default 0 ) So far, so good. Now, here's where we have some options and we really need to think a little. Let's start by examining a common attempt to create a re-usable and generic "PhoneNumbers" table that I feel doesn't work so well. A Common ApproachMany database architects will start by creating a table of phone numbers, and then they'd create two more tables: one to relate PhoneNumbers to Offices, and another to relate PhoneNumbers to Employees: create table PhoneNumbers ( PhoneID int identity primary key, AreaCode char(3), Exchange char(3), Number char(4), Extension varchar(10) ) create table EmployeePhoneNumbers ( EmpID int references Employees(EmpID) on delete cascade, PhoneID int refernces PhoneNumbers(PhoneID), PhoneType varchar(10) references PhoneNumberTypes(PhoneTypeCode), primary key (EmpID, PhoneType) ) create table OfficePhoneNumbers ( OfficeID int references Offices(OfficeID) on delete cascade, PhoneID int references PhoneNumbers(PhoneID), PhoneTypeCode varchar(10) references PhoneNumberTypes(PhoneTypeCode) primary key (OfficeID, PhoneType) ) However, there are several problems with this situation.
So, I would recommend avoiding this design as it complicates the situation without much benefit. If you're going to do this, you might has well just store the Phone Number data itself in the EmployeePhoneNumbers and OfficePhoneNumbers tables and get rid of the PhoneNumbers table completely. A More Generic, Flexible AlternativeInstead, consider this approach: create table PhoneNumbers ( EmpID int references Employees(EmpID) on delete cascade, OfficeID int references Offices(OfficeID) on delete cascade, PhoneTypeCode varchar(10) references PhoneNumberTypes(PhoneTypeCode) not null AreaCode char(3), Exchange char(3), Number char(4), Extension varchar(10), check (case when EmpId is null then 0 else 1 end + case when OfficeID is null then 0 else 1 end = 1), unique constraint PhoneNumbers_UC (EmpID, OfficeID, PhoneTypeCode) ) Let's a take minute to dissect that. We have created a table of Phone Numbers, with columns that allow us to relate the phone number to an Employee and/or an Office, but neither is required. We also have a required PhoneTypeCode column, and we have the standard phone number attributes in there as well. We could easily add a notes column, or description, or last updated/modified columns as well. Notice the check constraint: that expression is simply ensuring that only EmpID or OfficeID is not null, but not both. There is no "exclusive or" boolean operator in T-SQL, so this is one easy way to handle it using CASE expressions that return 0 if a column is null or 1 if it has a non-null value. If we wrote: check (EmpID is not null OR OfficeID is not null) Then the table would allow an entry in both the EmpID and OfficeID columns, which we do not want. Each row in this table will relate to only an office or an employee. Plus, we can easily add more columns to this check constraint as necessary (more on that later). Next, instead of a standard primary key, we have a unique constraint on the columns EmpID/OfficeID/PhoneTypeCode. This ensures that only one row per EmpID/OfficeID/PhoneTypeCode is allowed. A primary key constraint cannot allow NULL values in any of the primary key columns, so we cannot use that type of constraint. If you recall, either EmpID or OfficeID is always null, but never both. Since PhoneTypeCode is never null, this constraint effectly ensures that there are never two entries for a single EmpID/PhoneTypeCode or an OfficeID/PhoneTypeCode. With that, we are done. We can now write a single stored procedure that allows us add a PhoneNumber to either an Office or an Employee like this: create procedure PhoneNumberAdd @OfficeID int = null, @EmpID int = null, @PhoneTypeCode varchar(10), @AreaCode char(3), @Exchange char(3), @Number char(4), @Extension varchar(10) as insert into PhoneNumbers (OfficeID, EmpID, PhoneTypeCode, AreaCode, Exchange, Number, Extension) values (@OfficeID, @EmpID, @PhoneTypeCode, @AreaCode, @Exchange, @Number, @Extension) This procedure effectively requires that you pass either an EmpID or an OfficeID, since if both are null, or neither is not null, the INSERT will fail. Writing a procedure that returns the phone number for an entity that "implements" the PhoneNumber interface is a slightly more complicated, but still quite simple. We must write our criteria expression to check for the possibility that either an @EmpID or an @OfficeID is passed in, since both conditions are possible. Thus, that results in something like this: create procedure PhoneNumberSelect @OfficeID int = null, @EmpID int = null as select p.OfficeID, p.EmpID, p.PhoneTypeCode, pt.Description as PhoneTypeDescription, p.AreaCode, p.Exchange, p.Number, p.Extension from PhoneNumbers p inner join PhoneTypes pt on p.PhoneTypeCode = pt.PhoneTypeCode where EmpID = @EmpID or OfficeID = @OfficeID order by pt.Sort Again, effectively this requires that you pass a value for either the @OfficeID or the @EmpID parameter, but not both, and the matching data will be returned. We don't have to worry about issues where NULL != NULL because we never want to return a row where both a parameter value and a column value is NULL -- we only want to return rows where both are not null, and therefore the comparison will return TRUE. We could of course also put a @PhoneTypeID parameter in there as well if you only wanted to return a single, specific phone number of the specified type. Update and Delete will work in much the same way: create procedure PhoneNumberUpdate @OfficeID int = null, @EmpID int = null, @PhoneTypeCode varchar(10), @AreaCode char(3), @Exchange char(3), @Number char(4), @Extension varchar(10), as update PhoneNumbers set AreaCode=@areacode, Exchange=@exchange, Number=@number, Extension=@extension where (PhoneTypeCode = @PhoneTypeCode) and (EmpID = @EmpID or OfficeID = @OfficeID) create procedure PhoneNumberDelete @OfficeID int = null, @EmpID int = null, @PhoneTypeCode varchar(10) as delete from PhoneNumbers where (PhoneTypeCode = @PhoneTypeCode) and (EmpID = @EmpID or OfficeID = @OfficeID) Calling our code is very easy. From within T-SQL or from a client, we just set the parameter that corresponds to the entity we are referencing: exec PhoneNumberDelete @OfficeID=2, @PhoneTypeCode='HOME' exec PhoneNumberDelete @EmpID=45, @PhoneTypeCode='WORK' ... etc ... Implementing the InterfaceWith all this in place, we now can use one set of stored procedures to Add/Select/Update/Delete phone numbers. The only variable is the parameter that we pass in to the stored procedure, which indicates the entity that we are referencing. That's all fine and dandy, but the question is: how would we set up a new entity to implement the "Phone Number" interface? For example, suppose that we now decide that we need to track "Contacts", and Contacts will have multiple phone numbers. (Please note that we could argue that Employees and Contacts should simply inherit from the same base class, and thus an interface is not necessary, but let's ignore that for the sake of this example.) Per usual, we would start by creating a table of Contacts: create table Contacts ( ContactID int primary key, ContactName varchar(100) ) Now, it is time to "implement" the interface. Where do we begin? Well, it is actually done rather backwards. Normally, you alter the class you have created so that it implements a particular interface, but in this case, we need to alter the interface so that it works with the class. So, we actually don't do anything to the Contacts table -- we alter the PhoneNumbers table and the corresponding PhoneNumber stored procedures. This is a bit of a drawback of this approach, and a big reason why implementing an interface in a relational database is not as clean as implementing inheritance. So, the steps we must take are:
Having done that, we now need to edit each of the stored procedures to include contactID as an optional parameter, and to include contactID throughout the SELECT and WHERE clauses. It sounds like a lot of work, but it's really not too bad and can be done quite quickly. The "template" we've created for this interface is very easy to maintain. Despite the maintenance required, the overall process is pretty nice because:
The "Phone Entity" ApproachAnother option is to create one table with foreign key references back to the tables that "implement" your interface, and with an identity primary key on that table, and then put all of the actual data in another table. In other words, you might have one table that just defines "PhoneEntities", with links back to the parents, and then we put the actual address data in a table with a primary key of (PhoneEntityId/PhoneTypeCode). For example, you might have a PhoneEntities table like this: create table PhoneEntities ( PhoneEntityID int identity primary key, EmpID int references Employees(EmpID) on delete cascade, OfficeID int references Offices(OfficeID) on delete cascade, check (case when EmpId is null then 1 else 0 end + case when OfficeID is null then 0 else 1 end = 0), unique constraint PhoneEntities_UC (EmpID, OfficeID) ) Notice that this table just stores a reference back to either an Office or an Employee, and our unique constraint ensures that there is only one value in this table for either. It is a bit simpler because we aren't worry about PhoneTypes here, or actual phone numbers. That data is stored in a more traditional PhoneNumbers table, with a foreign key reference to the PhoneNumberEntity table: create table PhoneNumbers ( PhoneEntityID int references PhoneEntities(PhoneEntityID) on delete cascade, PhoneTypeCode varchar(10) references PhoneTypes(PhoneTypeCode), AreaCode char(3), Exchange char(3), Number char(4), Extension varchar(10), primary key (PhoneEntityID, PhoneTypeCode) ) That allows the PhoneNumbers table to be a more "normal" looking, and a bit simpler as well. This does complicate the stored procedures a little, since first you need to determine if a particular entity already has an entry in the PhoneEntities table, and if not then you need to add one in. For example, with that schema, here's a PhoneNumberAdd stored procedure: create procedure PhoneNumberAdd @OfficeID int = null, @EmpID int = null, @PhoneTypeCode varchar(10), @AreaCode char(3), @Exchange char(3), @Number char(4), @Extension varchar(10) as declare @EntityID int set @EntityID = (select PhoneEntityID from PhoneEntities where Office=@OfficeID or EmpID = @EmpID) if (@EntityID is null) begin insert into PhoneEntities (EmpID, OfficeID) values (@EmpID, @OfficeID) set @EntityID = scope_identity() end insert into PhoneNumbers (EntityID, PhoneTypeCode, AreaCode, Exchange, Number, Extension) values (@EntityID, @PhoneTypeCode, @AreaCode, @Exchange, @Number, @Extension) This approach may or may not work better, depending on your specific situation. As you can see, the logic in the stored procedures can potentially get a little complicated since now we have two tables to deal with. But, this does makes the PhoneNumbers table itself much simpler, so you may get better performance with this approach. ConclusionI'll be completely honest here: this isn't a really clean and simple technique to use, it is really one of those subjective gray areas where there does not appear to be a standard method or best practice. I would love to hear feedback about alternate suggestions; I presented only a few possible ideas, and I am sure there are alternatives that may be superior in different situations. I have found that these work quite well, and again they are nice because we have full referential integrity, cascading deletes, and there are no triggers required. However, I also recognize not all techniques work in every situation, so the ideas presented here may not be applicable in many cases. Overall these methods can work quite well depending on what you are modeling. With these techniques in place, if you decide to change the columns used to store phone numbers, you do it all in one place and your done. Or, if you decide to add "UpdatedAt" and "UpdatedBy" audit columns to track changes in the phone numbers, again you just need to do it all in one place. Every entity you have that "implements" the phone number interface will instantly get all those changes. And, most importantly, we will never have orphan phone numbers due to full support for cascading deletes, we don't have to worry about any data integrity issues since it all checks out, and no triggers were necessary for any of this. I look forward to hearing ideas and suggestions in the comments.
|
- Advertisement - |