11 April 2016
Oftentimes in our applications we will have such things as a “lookup table”. I am defining a lookup table as a list of relatively fixed or static choices such as status codes, states or provinces, and so on. In this post I will share how I manage lookup tables using Entity Framework.
Let’s work with a lookup table of status codes. Often, in an application, there is some logic that is related to the status of some entity, such that doing the same action will produce different results depending on the entity’s status. For example, in a banking system, the loan process might be different between regular or VIP customers. In a retail store, the checkout process might be different between members and non-members. In each case, a status check is part of the workflow, and the workflow branches depending on the result of the status check.
Now, suppose that statuses are stored in a lookup table in a database. How can the status be queried reliably and accurately?
Often, what I see done is this: in the database, an additional column is created called “code” or similar. The sole purpose of this column is to serve as an identifier in the application: the known “codes” are stored in the application in some data structure such as a dictionary or in a class of constants. The codes doesn’t appear on any screen, and the data structure holding the codes are only used during checking.
For example, the database values would look like this:
Id Code Text
1 NON Non-Member
2 GLD Gold Member
3 SVR Silver Member
And then there would be a class of constants that look like this:
public class StatusCode
{
public const string NonMember = "NON";
public const string GoldMember = "GLD";
public const string SilverMember = "SVR";
}
Which would be used like this:
switch (entity.Status.Code)
{
case StatusCode.NonMember:
// logic for non-members
case StatusCode.GoldMember:
// logic for gold members
// .. other cases ..
}
This works, but there are a few downsides / points for improvements that I see:
So what’s the better way?
Yes, you read that right: just use the primary key as the identifier. Before I list the “why”, let me post some sample code that uses this approach.
The database values would now look like this:
Id Text
1 Non-Member
2 Gold Member
3 Silver Member
The class of constants would now look like this:
public class StatusId
{
public const int NonMember = 1;
public const int GoldMember = 2;
public const int SilverMember = 3;
}
Which would then be used like this:
switch (entity.StatusId)
{
case StatusId.NonMember:
// logic for non-members
case StatusId.GoldMember:
// logic for gold members
// .. other cases ..
}
Now I can list the advantages that I see, which are answers to the disadvantages I listed above:
I know what you’re thinking now:
Back in the day, lookup tables are populated with SQL insert commands. If lookup tables have an identity primary key (and they usually do), there was no reliable way to determine beforehand what the generated primary keys would be - you would have to look at those after the insert script ran. Therefore, using codes made sense, as that was something known before the actual insert and is something that could be synchronized with the code.
But when using Entity Framework, particularly the Code First workflow with migrations, insert scripts are no longer necessary. The implementation of migrations also lend itself well to the approach I described of using the primary keys as identifiers.
A very useful method that can be used in migrations is the AddOrUpdate
method. The AddOrUpdate
method takes a list of entities as a parameter. For each entity in the list, it adds it to the database if it’s not already there, and updates it otherwise. The method also takes in an expression as a parameter, and uses this expression to determine if an entity already exists or not.
How does this fit in with the lookup tables we are discussing?
Well, remember the constants class above, where we had all the ids? Here is how we can use that in migrations:
context.Statuses.AddOrUpdate(s => s.Id,
new Status { Id = StatusId.NonMember, Text = "Non-Member" },
new Status { Id = StatusId.GoldMember, Text = "Gold Member" },
new Status { Id = StatusId.SilverMember, Text = "Silver Member" });
Here we are seeding statuses and using the Id as the identifier expression. Notice that we are using the StatusId class when constructing the objects - the same class that we use when checking the statuses. This gives us a strong guarantee that our identifiers really accurately identify whichever row they’re supposed to be able to identify.
This is an example of embracing the Code First workflow - now we not only using code first for the schema but for the data as well.
In my opinion, this trumps the approach with the “code” columns I described above.
In this post I described a solution to lookup table management using Entity Framework Code First and migrations. The solution involves using primary keys directly instead of introducing an arbitrary column (such as a “code” column). This approach ties in nicely with the Code First migrations workflow and provides a very strong means of identification.