graz
Chief SQLTeam Crack Dealer
4149 Posts |
Posted - 2006-06-01 : 14:48:07
We're having the "Common Lookup Table" discussion at one of my clients. They are considering storing many of the lookup codes and their descriptions in a common table. Codes that are more complex will still get their own tables.

I was curious whether people have any practical experience dealing with this type of situation. What worked? What didn't? I'm really looking for real-world situations, not a "data modeling" answer.

-Bill
===============================================
Creating tomorrow's legacy systems today.
One crisis at a time.
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-06-01 : 15:30:57
I don't really see the advantage to it. It makes the queries against the common table more complex, and it makes DRI all but impossible without a lot more work. If you do set up DRI with the codes table, you could easily end up with a hundred other tables referencing the codes table, and wondering why it takes two hours to delete a single row from it.

Also, you can get bad query plans when you have a lot of data mixed together, and the optimizer just can't see the best way to find the data for that particular query. You may end up having to extract into a temp table anyway.

It's one of those things that sounds easy up front: "Only one table to deal with!" What happens is that you end up doing more work every time you write a query, and you have to have detailed knowledge of the contents of the table. You save a little work up front and do a hundred times more work later.

Another issue is that you may think you only need the code and description now, but find you need additional attributes later. If it is used in a hundred different procs, you really have a job changing all your code.

We had a database where this was done. It took us years to clean up the mess, and I still have nightmares when I think of the queries. Notice how it complicates even this simple query with a single join. So where does it actually save some work? Nowhere that I've ever seen.

    -- Normal design query
    select a.EMPLOYEE_NAME, b.JOB_DESCRIPTION
    from T_EMPLOYEE a
    join T_JOB b
      on a.JOB_ID = b.JOB_ID

    -- Codes table design query
    select a.EMPLOYEE_NAME, JOB_DESCRIPTION = b.DESCRIPTION
    from T_EMPLOYEE a
    join T_CODE b
      on a.JOB_CODE_TYPE_ID = b.CODE_TYPE_ID
     and a.JOB_CODE = b.CODE

In short, don't do it. If you are looking for advantages, well, it generates a lot of income for consultants.

CODO ERGO SUM
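The DRI point is easier to see as DDL. A hedged sketch of the two designs being compared (table names follow the queries in this thread; column types are assumptions):

```sql
-- Normal design: a real FK guarantees every JOB_ID exists in T_JOB.
CREATE TABLE T_JOB (
    JOB_ID          int          NOT NULL PRIMARY KEY,
    JOB_DESCRIPTION varchar(100) NOT NULL
);

CREATE TABLE T_EMPLOYEE (
    EMPLOYEE_ID   int          NOT NULL PRIMARY KEY,
    EMPLOYEE_NAME varchar(100) NOT NULL,
    JOB_ID        int          NOT NULL
        REFERENCES T_JOB (JOB_ID)
);

-- Codes table design: the primary key must include the code type, so a
-- plain single-column FK cannot restrict JOB_CODE to rows of the JOB type.
-- Every referencing table must either drag the type column along in a
-- composite FK, or give up declarative integrity entirely.
CREATE TABLE T_CODE (
    CODE_TYPE_ID int          NOT NULL,
    CODE         int          NOT NULL,
    DESCRIPTION  varchar(100) NOT NULL,
    PRIMARY KEY (CODE_TYPE_ID, CODE)
);
```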
Kristen
Test
22859 Posts |
Posted - 2006-06-02 : 06:13:00
We use a lookup table for things that don't need any "additional complex stuff", but I agree with MVJ's points.

Our coding is done through a module that controls most of this type of thing, so an attribute on a column (i.e. stored in our own "metadata" tables) identifies that the column has a lookup, and that in turn causes our form rendering modules to provide picklists, validate the data, and so on. The metadata defines whether the generic lookup table is used, or a specific table, or indeed a set of SProcs for anything more exotic (for example, a lookup table of Catalogues, where the end user should only see Enabled Catalogues but back office screens may also need to display Disabled Catalogues in the picklists).

Apart from "unusual" scenarios, all the rows in the lookup table are cached in the application, so our forms module doesn't touch the DB for select lists, validation, etc. That would be a bit more of a pain to do if we had separate tables. Separate tables would also require a bit more effort for maintenance screens, and for rollout of DDL changes at each release, but I'd be hard pressed to make a case for it on that basis! I do expect it is a bit quicker at the DEV stage, though: add a new table, or some columns to an existing table, tick a box [in our metadata maintenance stuff] to indicate that the column uses our lookup table, and that's it, job done. All forms will now utilise it.

In a reporting query we have to make a JOIN to the lookup table for a description [of the code]; it doesn't make any difference to DEV time whether it's a generic table or a specific table. (Our column names are unique in the application, and we use the column name as the "Attribute Type", so our code would be:

    select a.EMPLOYEE_NAME, JOB_DESCRIPTION = b.DESCRIPTION
    from T_EMPLOYEE a
    join T_CODE b
      on b.CODE_TYPE_ID = a.JOB_CODE_TYPE_ID
     and b.ColumnName = 'JOB_CODE'

)

I don't have strong feelings on the subject. We typically have only a few hundred rows in the lookup table (for the whole application), so I doubt query efficiency becomes an issue.

Kristen
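Kristen's actual metadata tables aren't shown in the thread, but the idea could be sketched like this (all names hypothetical):

```sql
-- Hypothetical metadata row per column: tells the form renderer whether
-- a column has a lookup, and whether that lookup uses the generic table,
-- a specific table, or a set of SProcs for the exotic cases.
CREATE TABLE T_COLUMN_METADATA (
    ColumnName   varchar(128) NOT NULL PRIMARY KEY,  -- unique app-wide
    LookupKind   char(1)      NOT NULL,  -- 'G' generic, 'S' specific, 'P' SProc
    LookupSource varchar(128) NULL       -- specific table or SProc name, if any
);
```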
X002548
Not Just a Number
15586 Posts |
Posted - 2006-06-02 : 12:20:32
Bill,

That's how PeopleSoft does it. The Oracle version:

    CREATE TABLE ENR_XLATTABLE (
        FIELDNAME     VARCHAR2(18) NOT NULL,
        FIELDVALUE    VARCHAR2(4)  NOT NULL,
        EFF_STATUS    CHAR(1)      NULL,
        XLATLONGNAME  VARCHAR2(30) NULL,
        XLATSHORTNAME VARCHAR2(10) NULL
    ) TABLESPACE USER_DATA;

Brett
8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam
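For reference, a lookup against an XLAT-style table reads much like the earlier T_CODE examples. This query is illustrative rather than actual PeopleSoft code; T_EMPLOYEE, its columns, and the 'A' status value are assumptions:

```sql
SELECT e.EMPLOYEE_NAME,
       x.XLATLONGNAME AS JOB_DESCRIPTION
FROM T_EMPLOYEE e
JOIN ENR_XLATTABLE x
  ON  x.FIELDNAME  = 'JOB_CODE'    -- which field's translate values to use
  AND x.FIELDVALUE = e.JOB_CODE
WHERE x.EFF_STATUS = 'A';          -- assuming 'A' marks active rows
```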
cmdr_skywalker
Posting Yak Master
159 Posts |
Posted - 2006-06-02 : 14:35:20
It depends on what is "common" to your data. Do you mean a table with a code, code type, description, and reference type?

Part of the problem lies in how people name things. For instance, an employee table can have an ID column, and other tables referencing it call the column EmpID, EmployeeID, EMP_ID, etc. That's why a name is destiny: once you have a name, stick with it.

The issue with a lookup is not really the additional join, but whether you have a business object of that type. I create a reference table because there is such a thing as AgencyCodes in our client's business domain. A lookup can be implemented in several ways, not just joins; e.g. a stored procedure that is passed the column name and table name and returns a recordset. That is how I usually implement lookups. It is much easier to code, and we don't have to worry about another consultant understanding an archaic database design, because the assumptions are visible in the database model and are not implied in places where you don't really know where they are.

Yes, in some instances I can combine several reference/lookup tables into a conformed dimension (a DW term) or a single lookup/reference table in the relational database when the business, technical, and other factors call for it. But that should not be shunned just because it's an additional join. 3rd Normal forms are optimized for transactions, not analytical reporting or data mining. If you are now moving to analytical reporting and not just daily operational reports, consider the DW.

May the Almighty God bless us all!
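The stored-procedure style of lookup described above could be sketched like this in T-SQL (a hypothetical sketch; names are invented, and QUOTENAME is used so the dynamic SQL cannot be hijacked by odd identifiers):

```sql
-- Hypothetical generic lookup procedure: pass a table name and the code
-- and description column names, get a recordset back for picklists etc.
CREATE PROCEDURE dbo.GetLookupValues
    @TableName  sysname,
    @CodeColumn sysname,
    @DescColumn sysname
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @sql nvarchar(max);
    SET @sql = N'SELECT ' + QUOTENAME(@CodeColumn)
             + N', '      + QUOTENAME(@DescColumn)
             + N' FROM '  + QUOTENAME(@TableName)
             + N' ORDER BY ' + QUOTENAME(@DescColumn);

    EXEC sys.sp_executesql @sql;
END
```

The caller never needs to know whether the entity lives in its own table or a shared one; only the procedure does.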
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2006-06-02 : 19:39:49
Just to point out some hazards: I work with a database that has 58, count 'em, 58 code tables (out of 426) consisting of 2 or 3 columns: Code, Name, and sometimes Description. A good number of them are not used; they were intended to support features that never materialized. We also have something like 20-25 "link" tables consisting of 2 columns. And every time we need to add a feature similar to one we already have, we create a new table for it.

I'm not against separate code tables per se, but like anything else, moderation is key and excess just destroys the entire purpose. We are considering a new design, and if we implement it, I will literally hold the designer's and developer's feet to the fire if they suggest a new or separate code table for something. I think it's worth the time to rethink a DB design with a few hard limits in place, just to get the creative juices flowing and to make people really think about what they're doing.
byrmol
Shed Building SQL Farmer
1591 Posts |
Posted - 2006-06-02 : 20:01:54
>> ...3rd Normal forms are optimized for transactions...

That's a definite "Quote of the week"...

Bill, the notion that we can't bring a data model (aka theory) counter-point to this debate sucks badly. So anyway...

The CLT method requires less DDL to implement. The CLT method requires more complex code for application use and data analysis. Users don't care about how much "crap is at the back"; they want fast, correct answers, and the CLT does not make that goal easy to achieve.

EDIT: I originally typed CTP (Compulsory Third Party insurance, for car re-registration in NSW, Australia) instead of CLT. I've got 2 days to fix the car for inspection and I hate working upside down...

DavidM

Intelligent Design is NOT science.
A front-end is something that tries to violate a back-end.
Kristen
Test
22859 Posts |
Posted - 2006-06-03 : 04:51:20
"The CLT method requires more complex code for application use and data analysis."

Do you reckon? Isn't it just the difference between:

    SELECT E.EMPLOYEE_NAME, JT.JOBTYPE_DESCRIPTION
    FROM T_EMPLOYEE AS E
    JOIN T_JOBTYPE AS JT
      ON JT.JOBTYPE_ID = E.EMPLOYEE_JOBTYPE_ID

and

    SELECT E.EMPLOYEE_NAME, L.LOOKUP_DESCRIPTION
    FROM T_EMPLOYEE AS E
    JOIN T_LOOKUP AS L
      ON L.LOOKUP_CODE = E.EMPLOYEE_JOBTYPE_CODE
     AND L.LOOKUP_COLUMN = 'EMPLOYEE_JOBTYPE_CODE'

Edit: The application (or a trigger etc.) also has to deal with the absence of an FK, as Arnold pointed out below. Our application takes care of this based on metadata, so I'm prone to forgetting the need for this particular piece of validation - sorry about that!

Kristen
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2006-06-03 : 04:56:23
FK!
cmdr_skywalker
Posting Yak Master
159 Posts |
Posted - 2006-06-03 : 06:20:07
"I work with a database that has 58, count 'em, 58 code tables (out of 426) consisting of 2 or 3 columns: Code, Name, and sometimes Description."

You haven't seen the inside of a Medicaid Management Information System (MMIS) or a SACWIS/healthcare information system. You can add short description and long description to your column list :). But again, as the business evolves, so does the nature of the underlying data model. I try to make models resilient, but I also have to consider other factors such as implementation, performance impact, reusability, conformance across application contexts/model domains/enterprise domains, etc. And most of the time, to maintain 3rd Normal form, you create a new table to represent a new business object.

I agree with you: somebody should be in the driver's seat and hold people accountable for data model changes. And those people must understand not only the business impact but also the peripheral impact of the proposed solution (application, enterprise, etc.). I try to train people not to develop a kludge: to use their heads, be proactive, and not just follow the whim of trying to get it over with. It takes some conscious effort to be more responsible about the system.

In reality, whew, it's hard work, because you have to understand the business and the system, and that just takes too long and costs too much (if you have 500 or more tables across the enterprise and each table has 100+ columns; and not just that, you also have to deal with politics, object permissions, etc.). Most people would say: "Heck, forget that analysis or caring about the system. I have one application to worry about. Just add the new table and be done with it! I am not gonna spend extra unpaid hours for that!" And so the path of the one is not easy. It is important that everyone in the project, or at least the key players, knows and understands the substantial responsibility of maintaining a sound and up-to-date data model and DBMS :).

Give yourself a pat!

May the Almighty God bless us all!
Kristen
Test
22859 Posts |
Posted - 2006-06-03 : 06:25:55
So there you go, Graz... clear as mud! I can lend you my coin to toss if you like.

Kristen
cmdr_skywalker
Posting Yak Master
159 Posts |
Posted - 2006-06-03 : 07:45:54
Going back to the original statement:

"We're having the "Common Lookup Table" discussion at one of my clients. They are considering storing many of the lookup codes and their description in a common table. Codes that are more complex will still get their own tables."

Given the discussion above, the answer lies in your own assessment of the total benefit (business/technical/other criteria) of implementing the "common table". There is no panacea for every situation, but I would not deviate from creating individual reference tables unless it is called for. That means if you have 100 unique business objects, you most likely end up with 100 unique tables to represent those business objects, unless you have to adjust because of certain requirements :).

May the Almighty God bless us all!
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2006-06-03 : 11:57:54
I haven't seen anything in this thread that explains any actual benefit to a "Common Lookup Table." The best I have seen for it is basically, "In some situations, it doesn't suck too bad." Given the numerous drawbacks, I think the case against it is clear.

CODO ERGO SUM
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2006-06-04 : 10:14:31
My take:

First off, I feel that many (not all) lookup tables greatly benefit from storing more columns than just (ID, Description) -- see example #4 here: http://weblogs.sqlteam.com/jeffs/archive/2006/02/10/9002.aspx . If your lookups have any *meaning* within your application or your reports, or if they will ever be explicitly used to filter or check conditions in your code, then (IMHO) adding attributes to those lookup values and using those attributes results in clearer, shorter code that is very easy to maintain and doesn't rely on specific values being stored in your lookup tables. For these types of lookups, I would *not* use a common table, but rather create very specific tables for each lookup with those attributes.

Now, for lookup tables that have no true meaning within your code (i.e., strictly there for users to label, filter, or sort items and so on), a common lookup table might not be a bad idea. You can still have nice FK constraints to a common lookup table. Ideally, this would be accomplished by adding a computed column for each lookup in each table to indicate its "lookup code", and then using a multi-column FK constraint on (LookupColumnID, LookupCode) to your common lookup table. However -- last I checked, you cannot use a computed column in an FK constraint. So an alternate solution is to define each "lookup code" column as:

    DEFAULT 'YourCode' CHECK (LookupCode = 'YourCode')

and then use that column in your FK constraints. Of course, this does mean more storage is used, since the code is stored in every row of your table, and tables with multiple lookup columns need multiple lookup-code columns (which is why computed columns would be preferable if they worked).

Each join to the lookup values is then done using two columns, of course -- and without the need to hard-code "lookup codes" in your SQL, as some of the previous examples in this thread have shown, which you definitely want to avoid.

If you need some DDL and an example of this technique, let me know (I have no doubt that what I've written is probably not too clear). I feel that if you don't do something like this, and you end up needing to forgo RI or use triggers to enforce it, then I would *not* use a common lookup table.

EDIT: fixed some typos.
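A sketch of the DDL the technique above calls for (table and column names are invented for illustration, not taken from the post):

```sql
-- Common lookup table keyed by (type, id).
CREATE TABLE dbo.Lookup (
    LookupCode  char(10)     NOT NULL,  -- which "virtual table" the row belongs to
    LookupID    int          NOT NULL,
    Description varchar(100) NOT NULL,
    PRIMARY KEY (LookupCode, LookupID)
);

CREATE TABLE dbo.Employee (
    EmployeeID    int          NOT NULL PRIMARY KEY,
    EmployeeName  varchar(100) NOT NULL,
    -- Pinned to a single lookup type via DEFAULT + CHECK, so the composite
    -- FK below can only ever reference rows of the 'JOB' type.
    JobLookupCode char(10)     NOT NULL
        DEFAULT 'JOB' CHECK (JobLookupCode = 'JOB'),
    JobID         int          NOT NULL,
    FOREIGN KEY (JobLookupCode, JobID)
        REFERENCES dbo.Lookup (LookupCode, LookupID)
);

-- The join then uses both columns, with no hard-coded lookup-code literal:
SELECT e.EmployeeName, l.Description
FROM dbo.Employee e
JOIN dbo.Lookup l
  ON  l.LookupCode = e.JobLookupCode
  AND l.LookupID   = e.JobID;
```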
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2006-06-04 : 10:33:54
On the occasions when I have used CLTs (or OTLTs: "one-true-lookup-tables"), I've created a table function that accepts the lookup value type as a parameter and returns the subset of records as an independent recordset. From then on, the function is referenced in secondary SQL code, not the table. So the only increased complexity in coding is supplying the lookup type when you call it.

I've also used GUIDs as primary keys on the CLT, which allows FKs to be created, but then GUIDs are a whole other hot topic, eh?

I WILL say that I experienced a significant performance impact using this method on an import process that had to wade through large volumes of staging data. The recordsets returned from the UDF were effectively unindexed. Fortunately, it was a weekly batch process and speed was not the big issue. For single lookups, the method did not noticeably impact performance, but I would think twice about it if you frequently need to process large volumes of data.
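Such a table function might look something like this (a hypothetical sketch with invented names). Note that a multi-statement TVF returns an effectively unindexed table variable, which matches the performance caveat above; an inline TVF like this one lets the optimizer see through to the base table's indexes:

```sql
-- Hypothetical inline table-valued function over a common lookup table.
CREATE FUNCTION dbo.fn_Lookup (@LookupType varchar(30))
RETURNS TABLE
AS
RETURN (
    SELECT LookupCode, Description
    FROM dbo.CommonLookup
    WHERE LookupType = @LookupType
);

-- Secondary code references the function, not the table, so the lookup
-- type appears only at the call site:
SELECT e.EMPLOYEE_NAME, j.Description
FROM T_EMPLOYEE e
JOIN dbo.fn_Lookup('JobType') j
  ON j.LookupCode = e.JOB_CODE;
```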
graz
Chief SQLTeam Crack Dealer
4149 Posts |
Posted - 2006-06-04 : 11:22:25
I thought I'd bring a small measure of closure to this debate and tell you what we actually did. We had our first meeting discussing this the same day I started this thread, so I didn't have all these great comments. We already have over 1,000 tables, and I have no idea how many of them are lookup tables (I'm a consultant there and still learning the model).

The main driver behind this was developer productivity. They wanted to be able to write something like:

    listBoxItems = GetCodeValues("States")

so they could populate a listbox with a single statement (please ignore my fake syntax).

We're going at this in two different ways. The first approach is to create some naming standards and apply them to new lookup tables. We're writing a small component that will return the lookup values in a common format when passed a table name. It will only work for new tables, but that's a start. We're also investigating some code generation utilities (CodeSmith) to generate strongly typed classes and stored procedures for each code table. I personally think that's a better long-term solution.

In my mind the argument came down to whether or not we wanted foreign key referential integrity. If we didn't, then I didn't mind the OTLT (I really like that acronym!). If we do want DRI, then we need individual lookup tables.

===============================================
Creating tomorrow's legacy systems today.
One crisis at a time.
nr
SQLTeam MVY
12543 Posts |
Posted - 2006-06-04 : 12:06:47
Why not have a stored proc? It gets passed the parameters and returns the result. The parameters can be entity and value for a single entry, or just the entity to return all values to populate drop-down lists (for this I usually have code that developers access: they pass the object and entity and it does the population).

Then it doesn't matter where the values are kept -- you just change the SP if you change the structure. Maybe have a table which indicates how each entity is stored; again, if you change the structure, you just change the SP.

For queries that need to join to the values, you could use a view or function if you want to encapsulate the process, but it's probably better to have names which are easily searchable; then if you change the structure you can search and replace all accesses for that entity.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Kristen
Test
22859 Posts |
Posted - 2006-06-04 : 13:38:51
"It will only work for new tables but that's a start"

I'm sure you haven't missed the opportunity, so it's a question rather than a suggestion!!

Why not put a VIEW on the old lookup table(s) so that they conform to the new naming convention, and thus get the benefit of the "component that will return the lookup values in a common format when passed a table name"?

Kristen
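That suggestion could be as simple as an adapter view per legacy table. A sketch, assuming (purely for illustration) that the new convention is an LU_ prefix with Code/Description columns; the legacy names here are invented:

```sql
-- Hypothetical adapter view: exposes a legacy lookup table under the
-- new naming convention so the generic component can consume it.
CREATE VIEW dbo.LU_States
AS
SELECT ST_CODE AS Code,          -- legacy column names are invented
       ST_NAME AS Description
FROM dbo.tblStateCodes;          -- legacy table name is invented
```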