In a nutshell, this post is going to be about using natural and surrogate keys in Entity Framework.
Definitions and a Simple Example
A surrogate key is a key (the type doesn’t really matter) with no business meaning that is used to uniquely identify a record in a table. A natural key, on the other hand, DOES have a business meaning. Let’s look at an example.
public class Country
{
public int CountryId { get; set; }
public string Abbreviation { get; set; }
public string Name { get; set; }
}
In the above example, Country.CountryId is a primary key (obviously) that has no business meaning. Its only purpose is to identify a country record. Hence, CountryId is a surrogate key.
Now let’s look at the same entity rewritten to use the natural key:
[Index(nameof(Name), IsUnique = true)]
public class Country
{
public int CountryId { get; set; }
public string Abbreviation { get; set; }
public string Name { get; set; }
}
Wait, something is not right. Why CountryId is still there? For the sake of the demonstration, imagine that it’s there because Country entity already exists in the system and you don’t want to refactor the rest of the system to remove it right now (often it’s not an option). But then if the Country entity already exists in the system in it’s original form, why would we even consider introducing a natural key?
To answer that question, let’s introduce a NEW Guest entity:
public class Guest
{
public int GuestId { get; private set; }
// TODO: need to reference country
}
Later we need to reference a Country. As a quick win we could add a CountryId property to Guest entity (using a surrogate key reference). We’ve probably all configured Entity Framework to work this way, so I will skip the configuration code for this. Instead, let us follow up on this and add a new requirement by saying that the corresponding UI needs to list the guest information including country of origin (e.g. for the invoicing purposes). That means our backend code would look somewhat like this:
dbContext.Guests.Include(g => Country);
or we could use projection with AutoMapper:
dbContext.Guests.ProjectTo<GuestDto>()
It doesn’t matter which option you choose. The underlying T-SQL query generated by Entity Framework will still include the JOIN Country statement. But that is only the case when we are using surrogate keys! Let us redesign the Guest entity to directly reference Country via a unique, natural key:
public class Guest
{
// ...
public string CountryName { get; private set; }
// ...
}
No additional JOIN statement is not needed for the query above. The country name is simply part of the Guest entity. Before we run it we need to configure EF to handle this new relationship accordingly:
// EntityTypeBuilder<Guest> entity
entity
.HasOne<Country>()
.WithMany()
.HasForeignKey(g => g.CountryName)
.HasPrincipalKey(c => c.Name)
// EntityTypeBuilder<Country> entity
entity
.HasIndex(c => c.Name)
.IsUnique();
Does Using Natural Keys Affect Performance?
In general, no. I do not expect that the performance will ever be an issue, but it can happen in certain circumstances. You should approach this choice the same way when choosing an ORM over raw SQL or relational database over NoSQL.
When to Absolutely Avoid Using Natural Keys?
You have to be aware that natural keys are ‘owned’ by a business. Should it ever require renaming them, you’re gonna go bald in an instant from the stress (records in the parent AND the child tables will have to be updated). Remember that there is NEVER a guarantee that the choices you make will prove bulletproof over time. Even the Country example above isn’t immune to changes say as Turkey recently changing its name to Türkiye.