Stop Using Static Data in Tests
At some point in the software development lifecycle you’re going to find yourself writing tests (we’re gonna gloss over the TDD debate) and these tests almost always need test data of some sort. Most developers that I’ve worked with tend to go for the tried and true method of just hard coding the data that they need either right in the test code itself or in a separate data file that gets included or read in during the test.
Now technically this is fine, right? You can give the exact data needed for each test, and it’s versioned so you know when that data changes. Sure, that’s all true. However, there are a number of downsides to this method.
Namely:
- More work to write tests
- Noisier tests
- Hard to manage changes to models
- Could be missing edge cases
- May not match reality
Sample model⌗
Before we get into those reasons let’s first define an example model we can use to measure the problems of static data definitions against.
interface User {
readonly id: string;
readonly username: string;
readonly firstName: string;
readonly lastName: string;
readonly createdOn: Date;
readonly lastLoggedIn: null | Date;
readonly deactivatedOn: null | Date;
}
This is an overly simplified model of a user of some system. There’s a lot that could be added here and it’s very Western-focused in its design, but it should get across most of the points we want to hit.
🗈 Also worth noting that I’m using TypeScript for the code samples in this post; however, the ideas apply to any language you might be working in. TypeScript was just convenient for writing things in this case.
What do I mean by static data?⌗
Static test data largely falls into two categories from what I’ve seen:
- Data embedded directly in code
- Data written in files that are included/read in tests
Data embedded directly in code⌗
describe("MyService", (): void => {
it("should do the thing", (): void => {
const sampleNewUser: User = {
id: "0123-4567-89AB-CDEF",
username: "mctest",
firstName: "Mc",
lastName: "test",
createdOn: new Date(),
lastLoggedIn: null,
deactivatedOn: null,
};
const service = new MyService();
service.doTheThing(sampleNewUser);
// ... the rest of the test
});
});
In the sample code above you can see a Mocha test which uses our model with some service under test. We had to write out each of the values that make up the test data by hand.
It’s possible to make these more reusable by pulling the test data out of the test and importing it where you need it (sketched below), but the result ends up being the same.
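Pulled out into a shared module, that might look something like this (the file layout and import path here are just for illustration):

// test/fixtures/users.ts -- a hypothetical shared fixture module
import { User } from "../models/user";

export const sampleNewUser: User = {
  id: "0123-4567-89AB-CDEF",
  username: "mctest",
  firstName: "Mc",
  lastName: "test",
  createdOn: new Date(),
  lastLoggedIn: null,
  deactivatedOn: null,
};

Every test that imports sampleNewUser now shares exactly the same hard-coded values, which is why the downsides below still apply.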
Data written in files that are included/read in tests⌗
{
"id": "0123-4567-89AB-CDEF",
"username": "mctest",
"firstName": "Mc",
"lastName": "test",
"createdOn": "2023-01-01T00:00:00Z",
"lastLoggedIn": null,
"deactivatedOn": null,
}
Here we have some file (maybe sampleUser.json) containing the same data, which tests would be responsible for reading and using.
This is a weaker option than embedding the data in code since you lose a lot of the type checking and auto-complete features. It’s also harder to find usages of the model in these kinds of files unless the model has a very distinctive field name in it.
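To make that weakness concrete, a test consuming that file might look something like this (the file path and the type assertion are illustrative):

import { readFileSync } from "fs";

describe("MyService", (): void => {
  it("should do the thing", (): void => {
    // The compiler can't verify the file actually matches User, so we have to
    // assert the type and hope the file stays in sync. Note that createdOn
    // also comes back as a string rather than a Date.
    const sampleUser = JSON.parse(
      readFileSync("./test/sampleUser.json", "utf-8"),
    ) as User;

    const service = new MyService();
    service.doTheThing(sampleUser);
    // ... the rest of the test
  });
});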
Problems⌗
1. More work to write tests⌗
When you have to define static data in your tests, writing a new test is simply more work. Instead of focusing on the actual work of testing and verifying the different happy/unhappy paths of what you’re testing, you also need to spend a bunch of time and energy writing static data, most of which probably isn’t all that important to what you’re trying to do.
I mean, look at the example of embedding data in tests above. There are nine lines of code that exist only to define a single model (and a small model at that). These kinds of tests get really messy the more data you need to run them properly. On top of that, in my experience tests often reuse exactly the same poorly constructed sample data over and over again, which just adds to the amount of noise a reader needs to sift through to get to what matters.
Sure, you can somewhat alleviate this problem by taking any commonly reused data and importing it where you need it, but that just leads to someone creating tons of samples to fit every use case, which isn’t any better.
2. Noisier tests⌗
As mentioned in the section above, the more code your tests contain that exists only to do setup work, the harder they are to reason about. This is compounded when it’s not always clear which parts of your static data actually matter to the test.
const user: User = {
id: "0123-4567-89AB-CDEF",
username: "mctest",
firstName: "Mc",
lastName: "test",
createdOn: new Date(),
lastLoggedIn: null,
deactivatedOn: null,
};
Is it important that this sample user hasn’t logged in yet after having their account created? Is it important that they aren’t deactivated? The bigger and more complicated the model, the harder it is to understand which parts of your sample data actually matter to your specific test.
3. Hard to manage changes to models⌗
At some point your data model is almost certainly going to change, and that means any test data will probably have to change as well. Depending on which language you’re using and which of its features, you might have a hard time updating every place that you’ve statically defined your data.
Let’s look at an extreme example:
describe("MyService", (): void => {
it("should do the thing", (): void => {
const sampleNewUser = {
id: "0123-4567-89AB-CDEF",
username: "mctest",
firstName: "Mc",
lastName: "test",
createdOn: new Date(),
lastLoggedIn: null,
deactivatedOn: null,
};
const service = new MyService();
service.doTheThing(sampleNewUser);
// ... the rest of the test
});
});
In this example the sampleNewUser value doesn’t have any type annotations, which means it’s going to be significantly harder to find this reference when User changes.
Even if the test were well written and included the type annotation on the variable, you would still need to either write a new codemod (assuming you’re using those) or manually update every reference to that type, which is tedious at best and error prone at worst.
4. Could be missing edge cases⌗
In my experience, developers almost always either copy/paste from other tests to create sample data or they write data using very primitive rules for picking which values to use. In most cases the test data ends up looking basically identical throughout the code. This could result in the code not being fully tested since you’re not exercising different forms of the data.
This is part of the reason that property-based testing tools like Hypothesis exist.
This kind of problem becomes even more of an issue when you start dealing with users who come from a different culture than you’re used to. As a simple example, our User type assumes that the user actually has both a first and a last name, which is not true everywhere.
5. May not match reality⌗
By hard coding data you’re at best replicating the kind of real-world data you’re seeing at that moment, but those patterns may change over time, meaning the usefulness of the data will drift as well.
More likely, though, the developer is picking lazy or cheeky values (if I have to see one more reference to the Matrix…) which have no bearing on the data the code is actually working with.
A solution⌗
So what’s the solution then? Dynamically generating the data on each test run.
This will require some investment up front for each data model that you’re going to be using in tests but after that it easily pays for itself.
For each data model you’re going to define a function which returns a fully formed instance of that model, populated with “random” data. “Random” is in quotes because you’ll probably be using a random number generator (or text generator) to create the data, but you’re still going to have to respect some kind of business logic.
In the case of our User type it doesn’t make sense for the lastLoggedIn or deactivatedOn properties to be set to values that come before createdOn. You’ll need to expend some effort codifying these rules into the generator, but you only do it once, instead of doing it manually for every test as before.
Ideally the generator should also take a parameter that lets the caller customize the generated data, but this isn’t required.
Example⌗
Let’s look at a simple example:
export function generateUser(userOpts: Partial<User> = {}): User {
const generator = randomChoice(
generateNewUser,
generateStandardUser,
generateDeactivatedUser,
);
return generator(userOpts);
}
export function generateNewUser(userOpts: Partial<User> = {}): User {
return generateBaseUser({
...userOpts,
lastLoggedIn: null,
deactivatedOn: null,
});
}
export function generateStandardUser(userOpts: Partial<User> = {}): User {
  return generateBaseUser({
    ...userOpts,
    deactivatedOn: null,
  });
}
export function generateDeactivatedUser(userOpts: Partial<User> = {}): User {
return generateBaseUser(userOpts);
}
function generateBaseUser(userOpts: Partial<User> = {}): User {
const defaultFirstName = generateFirstName();
const defaultLastName = generateLastName();
  // `days` is assumed to be a helper constant: the number of milliseconds in a day.
  const createdOn = randomDateAfter(new Date(Date.now() - 7 * days));
const lastLoggedIn = randomDateAfter(createdOn);
return {
id: uuid(),
username: `${defaultFirstName.toLowerCase()}.${defaultLastName.toLowerCase()}`,
firstName: defaultFirstName,
lastName: defaultLastName,
createdOn,
lastLoggedIn,
deactivatedOn: randomDateAfter(lastLoggedIn),
...userOpts,
};
}
Here we defined four different data generators for our User model:
- generateNewUser - Creates a new user that hasn’t signed in
- generateStandardUser - Creates a user that has logged in and is active
- generateDeactivatedUser - Creates a user that has been deactivated
- generateUser - Randomly generates one of the three user types
We do assume that there are a few helper functions available to make this easier. They’re not defined here mostly because they’re easy to write by hand or to find in a library; rough sketches follow the list below.
- randomChoice - Picks a random value from what was supplied
- generateFirstName - Creates a random first name
- generateLastName - Creates a random last name
- randomDateAfter - Picks a random datetime after the given datetime
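If you’d rather write them by hand than pull in a library, minimal versions might look something like this (the name lists are obviously placeholders, and uuid would typically come from the uuid package):

const days = 24 * 60 * 60 * 1000; // milliseconds in one day

function randomChoice<T>(...choices: T[]): T {
  return choices[Math.floor(Math.random() * choices.length)];
}

function generateFirstName(): string {
  return randomChoice("Ada", "Grace", "Alan", "Barbara", "Edsger");
}

function generateLastName(): string {
  return randomChoice("Lovelace", "Hopper", "Turing", "Liskov", "Dijkstra");
}

function randomDateAfter(after: Date): Date {
  // Pick a datetime up to 30 days after the given datetime.
  return new Date(after.getTime() + Math.random() * 30 * days);
}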
Now you could write a test like so:
describe("MyService", (): void => {
it("should do the thing", (): void => {
const sampleNewUser = generateNewUser();
const service = new MyService();
service.doTheThing(sampleNewUser);
// ... the rest of the test
});
});
There’s only a single line required to generate a unique user. If you need multiple users, you can just call the function again to get another unique one.
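And because each generator takes a Partial<User>, a test can pin down just the fields it actually cares about while everything else stays random (MyService and its behaviour are still made up here):

it("should handle duplicate usernames", (): void => {
  // Only the usernames matter to this test; every other field is random.
  const existingUser = generateStandardUser({ username: "taken-name" });
  const newUser = generateNewUser({ username: "taken-name" });

  const service = new MyService();
  service.doTheThing(existingUser);
  service.doTheThing(newUser);
  // ... assert whatever the duplicate-username behaviour should be
});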
Updating the generators⌗
One of the benefits of this approach is that you can just update the data generators and, in most cases, every test that uses them picks up the changes with no extra effort.
Let’s look at how that might work. Say we wanted to make our user generator more culturally diverse by leaving the last name empty some of the time.
function generateName(): [string, string] {
  const percentage = randomFloat();
  return [
    generateFirstName(),
    percentage > 0.1 ? generateLastName() : "",
  ];
}
Now 10% of the users that are generated will have no last name set, and by default (unless a test specifically sets the name) all the tests will verify that having no last name is OK.
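For completeness, generateBaseUser would then use the new helper; the destructuring and the username fallback below are my own assumptions about how you might wire it in:

function generateBaseUser(userOpts: Partial<User> = {}): User {
  const [defaultFirstName, defaultLastName] = generateName();
  const createdOn = randomDateAfter(new Date(Date.now() - 7 * days));
  const lastLoggedIn = randomDateAfter(createdOn);
  return {
    id: uuid(),
    // Fall back to just the first name when no last name was generated.
    username: defaultLastName
      ? `${defaultFirstName.toLowerCase()}.${defaultLastName.toLowerCase()}`
      : defaultFirstName.toLowerCase(),
    firstName: defaultFirstName,
    lastName: defaultLastName,
    createdOn,
    lastLoggedIn,
    deactivatedOn: randomDateAfter(lastLoggedIn),
    ...userOpts,
  };
}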
I remember that when I first started changing some of the repositories at work to use this style of testing, we immediately found a bunch of tests that couldn’t actually handle many edge cases; we had just gotten lucky up to that point.
Potential issues⌗
As with everything, there are some potential downsides you might encounter with this approach to testing. Thankfully, they all have ways to be overcome, which I’ll also try to cover.
Flakiness⌗
It’s certainly possible to encounter instances where a test fails and then succeeds when rerun, or vice versa. However, in my opinion this is actually a feature. When this happens, the test is telling you either that you haven’t properly defined the constraints on the type of data you expect to be working with, OR that the model has changed and your test isn’t accounting for some form of the data.
To fix this problem you either correct the constraints on the data that’s generated or update your test to work with the new data model.
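As a made-up example, a test that silently assumed every user was still active might flake whenever generateUser happened to produce a deactivated one; the fix is to say what you actually mean:

// Flaky: generateUser() sometimes returns a deactivated user,
// which this test was never written to handle.
const user = generateUser();

// Fixed: this test only applies to active users, so generate one explicitly
// (or constrain the relevant field directly with an override).
const activeUser = generateStandardUser();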
Reproducibility⌗
This is related to the flakiness problem. Say you’re running a test in continuous integration (CI) and get a failure that goes away after rerunning; how would you reproduce that locally to fix the problem? We already said that the data is generated randomly on each run, meaning every run is unique.
The solution to this problem is pretty straightforward but could make some existing data generators less usable. The basic idea is to seed the random number generator (RNG) from some known value. This seed could be picked arbitrarily under normal conditions (maybe the current Unix timestamp or something), but when you need to reproduce a test run you could provide the seed as an environment variable, meaning you get the same test values during that execution.
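A minimal sketch of that idea, assuming a Node environment, a TEST_SEED environment variable, and a small seedable PRNG (mulberry32 here) that the data generators use instead of Math.random:

// A tiny seeded PRNG (mulberry32) so test data can be reproduced from a seed.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return (): number => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Use TEST_SEED when reproducing a failure; otherwise pick a seed and log it
// so a failing CI run can be replayed locally.
const seed = process.env.TEST_SEED ? Number(process.env.TEST_SEED) : Date.now();
console.log(`Test data seed: ${seed}`);

export const randomFloat = mulberry32(seed);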
The main wrinkle here is that you probably can’t use an existing test data generator unless it lets you provide an RNG instance that you’ve already configured. The other issue is that if you’re randomizing the order of your tests (you absolutely should be), you’ll also need to make sure you can reproduce the order of the tests when rerunning.