I'm working on a website that functions as an index of tradespeople in Ireland. It allows users to register a trade profile detailing their services, which other users can then review. I have this functionality pretty much built out, but until recently I had very little test data. That led me on a journey into the world of database seeding: the generation of test data that mimics real-world data. Here's how I got on.
The goal
I wanted to generate test data that was similar to what would be seen in a real world scenario. A bunch of lorem ipsum wouldn't have been good enough. This is where the Faker module came into play.
Faker allows for the generation of realistic data. For example, generating a user's details with Faker looks something like the following.
Faker faker = new Faker();
String firstName = faker.name().firstName(); // John
String lastName = faker.name().lastName(); // Wayne
String email = faker.internet().emailAddress(); // rooster.cogburn@gmail.com
Faker generates data for many scenarios, 79 domains as of writing, and facilitated my need for realistic looking text.
But I didn't just want to generate the text, I wanted to store it in the database. This is where things got complicated.
The Spring pattern that I usually follow for data loading is to create a DataLoader component class that implements the CommandLineRunner interface. I then add some logic in the run method to populate my dev database. Usually this means generating one-off objects in a big loop and persisting them to the database. Long story short, there was no particular method to the madness. That was until recently, when I learned of Laravel's approach to seeding via factory classes.
Laravel is a PHP framework with some neat ideas around eloquent code, which basically means developer experience. The gist of Laravel's approach to seeding is to use factories with overwritable blueprints to generate entity objects, which is a good idea. The Laravel implementation, however, seemed a little overcomplicated to me, with many helper methods that could only be implemented due to the looseness of PHP.
My goal was to implement the same functionality in Java but with a much more limited API. I wanted an abstract base class, BlueprintFactory<T>, which could be extended to allow the creation of objects based off a blueprint. I also wanted to limit the caller API to two methods: with and create. The create method creates the object, and the with method supplies custom setters, which is useful when dealing with related entities. With this goal in mind, I set to work.
Usage
Before implementing something like this, I like to start off by thinking about how the consumer code should look. Starting with how I want to use it gives me a better idea of how to implement it. My initial thought was to have an abstract BlueprintFactory<T> class that would force the extending class to implement an abstract T blueprint(); method. In the case of a UserFactory class, that might look like the following.
@Component
@AllArgsConstructor
public class UserFactory extends BlueprintFactory<AppUser> {
    private final Faker faker;
    private final PasswordEncoder passwordEncoder;

    // Called once per created object, so every user gets freshly generated data.
    @Override
    protected AppUser blueprint() {
        var user = new AppUser();
        user.setEmail(faker.internet().emailAddress());
        user.setPassword(passwordEncoder.encode("password"));
        return user;
    }
}
The above is clean and makes its intent clear. It's also nice and concise, as the parent class implements the with and create methods. The blueprint method is called each time an object is created, which means that each object has its data generated by Faker.
I'm happy with this approach to creating the factories, so next I look at using the factories in a seeder class. The seeder is the class that stores the objects created by the factories in the database. So I write out how I want my UserSeeder to look. In my case the AppUser entity also has a one-to-one relationship with a UserProfile entity, so I'll need to create and store that along with the AppUser. It's a good idea to give each entity its own factory, so I create the following UserProfileFactory.
@Component
@AllArgsConstructor
public class UserProfileFactory extends BlueprintFactory<UserProfile> {
    private final Faker faker;

    @Override
    protected UserProfile blueprint() {
        UserProfile userProfile = new UserProfile();
        userProfile.setFirstName(faker.name().firstName());
        userProfile.setLastName(faker.name().lastName());
        return userProfile;
    }
}
Note that I don't create the UserProfile inside the UserFactory blueprint. This is better handled inside the seeder, where the related entity can be passed during object creation via the with method.
With the UserFactory and UserProfileFactory set up, I can now go about writing the seeder. The UserSeeder will create the AppUser and UserProfile entities using the factories we created. It will also handle the relationship assignments. For example, we will want to set the AppUser that owns the UserProfile when it is created, using the with method. The with method returns a new instance of the factory, with the custom setter in place, to allow for composability. The UserSeeder should look as follows.
@Order(1)
@Component
@Profile("dev")
@RequiredArgsConstructor
public class UserSeeder implements CommandLineRunner {
    private final UserFactory userFactory;
    private final AppUserRepository appUserRepository;
    private final UserProfileFactory userProfileFactory;
    private final UserProfileRepository userProfileRepository;

    @Override
    @Transactional
    public void run(String... args) {
        if (appUserRepository.count() > 0) {
            return;
        }
        var savedUsers = appUserRepository.saveAll(userFactory.create(50));
        var userProfiles = new ArrayList<UserProfile>();
        for (var user : savedUsers) {
            userProfiles.add(userProfileFactory.with(() -> user, UserProfile::setUser).create());
        }
        userProfileRepository.saveAll(userProfiles);
    }
}
In this class we create 50 AppUser entities by passing 50 to the create method. I then assign a UserProfile to each of those. All of these are persisted to the database.
An area of note is the following for loop.
for (var user : savedUsers) {
    userProfiles.add(
        userProfileFactory.with(() -> user, UserProfile::setUser).create()
    );
}
Here we loop through the already saved users, creating a UserProfile for each one. The interesting part is how the user is passed to the user profile. The with method adds a custom setter, meaning that UserProfile::setUser will be called with the result of () -> user. This allows that AppUser instance to be set on the newly created UserProfile.
Some other things to note here: I add an @Order annotation to specify that I want this seeder to run first, as well as the @Profile annotation to say I want it to run only in dev. I also check whether there are any users in the database before running the seeder. This ensures the database is only seeded when empty.
You may want to create a test user in your seeder. You can do this by passing a variant to the create method, which overwrites the specified fields.
AppUser testUser = appUserRepository.save(userFactory.create(Map.of("email", "test@test.com")));
The above creates a test user with the email "test@test.com". The factory reads the keys of the map and overwrites the matching value on the blueprint.
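To make the map-based overwrite a little more concrete, here is a minimal sketch of the idea using plain reflection. It is not how the library does it (that goes through an object mapper); the Overrides helper and the stripped-down AppUser below are hypothetical stand-ins that exist only for this illustration.

```java
import java.lang.reflect.Field;
import java.util.Map;

// Hypothetical, simplified stand-in for the AppUser entity.
// The default values play the role of data produced by the blueprint.
class AppUser {
    private String email = "generated@example.com";
    private String password = "encoded-password";

    String getEmail() { return email; }
    String getPassword() { return password; }
}

// Hypothetical helper: copies each map entry onto the field with the same name.
final class Overrides {
    private Overrides() {}

    static <T> T apply(T target, Map<String, ?> overrides) {
        for (Map.Entry<String, ?> entry : overrides.entrySet()) {
            try {
                // Only looks at fields declared directly on the target's class.
                Field field = target.getClass().getDeclaredField(entry.getKey());
                field.setAccessible(true);
                field.set(target, entry.getValue());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot override field: " + entry.getKey(), e);
            }
        }
        return target;
    }
}
```

Calling Overrides.apply(new AppUser(), Map.of("email", "test@test.com")) replaces the email and leaves every other field as the blueprint set it.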
Implementation
The implementation comes down to three main areas.
BlueprintFactory<T>: the extensible base
with: allowing composable factory instances
create: construction of objects
There is a lot to go through in each, so I will keep it brief.
BlueprintFactory<T>
This is an abstract class that forces the extender to implement an abstract T blueprint(); method. This method gets called each time an object is created. The result then has any custom setters applied, before being passed to an object mapper which applies any overwrites specified in the create method call.
So, for clarity: if a custom setter is added, it will be applied before the variant/overwrite is applied via the create method.
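The ordering can be sketched roughly as follows. This is a trimmed-down illustration and not the library's source: SketchFactory, addSetter, Note and NoteFactory are hypothetical names, the setter list is mutated in place for brevity, and the overwrite step is modelled as a plain function instead of an object mapper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

// Hypothetical, trimmed-down base class showing only the order of operations.
abstract class SketchFactory<T> {
    private final List<Consumer<T>> customSetters = new ArrayList<>();

    protected abstract T blueprint();

    // Simplification: mutates this factory instead of returning a new one.
    void addSetter(Consumer<T> setter) {
        customSetters.add(setter);
    }

    T create(UnaryOperator<T> overrides) {
        T instance = blueprint();                                 // 1. fresh object from the blueprint
        customSetters.forEach(setter -> setter.accept(instance)); // 2. custom setters
        return overrides.apply(instance);                         // 3. overwrites are applied last
    }
}

// Hypothetical entity and factory used to observe the ordering.
class Note {
    String title;
    String author;
}

class NoteFactory extends SketchFactory<Note> {
    @Override
    protected Note blueprint() {
        Note note = new Note();
        note.title = "from blueprint";
        note.author = "from blueprint";
        return note;
    }
}
```

If a custom setter writes both fields and the overwrite only touches title, the created Note ends up with the overwrite's title and the setter's author, because each step runs on the result of the previous one.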
with
This method exists to handle the setting of related entities. As I stated earlier, it's good practice not to create any related entities inside your factory. The setting of related entities should be handled solely by the with method.
Let's take the example of a User entity that can own many Post entities. In our seeder we may want to create the User entity first, then apply that User to a bunch of Post entities when they are created. The with method can specify that the Post::setUser setter always sets the supplied User, like so:
var myUserPosts = postFactory.with(() -> myUser, Post::setUser).create(50);
The above will create 50 posts belonging to myUser. As the with method returns a new instance of the factory with the custom setter applied, it can also be made composable like so:
var myUserFactory = postFactory.with(() -> myUser, Post::setUser);
var myUserPosts = myUserFactory.create(50);
This is achieved by an internal constructor, protected BlueprintFactory(List<Consumer<T>> customSetters), through which a copy of the factory's custom setters is passed to a new instance of said factory. That new instance is then returned. This allows for the construction of composable factories.
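A minimal sketch of the copy-on-with idea, under some loud assumptions: ComposableFactory is a hypothetical concrete class that takes its blueprint as a Supplier, which sidesteps how the real abstract class creates a new instance of its subclass. The User and Post classes are bare-bones stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch, not the library's BlueprintFactory.
final class ComposableFactory<T> {
    private final Supplier<T> blueprint;
    private final List<Consumer<T>> customSetters;

    ComposableFactory(Supplier<T> blueprint) {
        this(blueprint, List.of());
    }

    // Internal constructor: receives the setters for the new instance.
    private ComposableFactory(Supplier<T> blueprint, List<Consumer<T>> customSetters) {
        this.blueprint = blueprint;
        this.customSetters = List.copyOf(customSetters);
    }

    // Returns a NEW factory; the factory it was called on keeps its own setters.
    <R> ComposableFactory<T> with(Supplier<R> value, BiConsumer<T, R> setter) {
        List<Consumer<T>> copy = new ArrayList<>(customSetters);
        copy.add(target -> setter.accept(target, value.get()));
        return new ComposableFactory<>(blueprint, copy);
    }

    T create() {
        T instance = blueprint.get();
        customSetters.forEach(setter -> setter.accept(instance));
        return instance;
    }

    List<T> create(int count) {
        List<T> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(create());
        }
        return result;
    }
}

// Hypothetical minimal entities for the example.
class User {}

class Post {
    private User user;

    void setUser(User user) { this.user = user; }
    User getUser() { return user; }
}
```

Because with never mutates the factory it is called on, a base postFactory can be shared freely while each derived factory carries its own relation.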
In order to support both a single entity and a list of entities, method overloads are used to match the type of the supplied argument.
with(Supplier<R>, BiConsumer<T, R>): Injects a single related object
with(Function<Integer, List<R>>, int, BiConsumer<T, List<R>>): Injects a list of related objects
with(Function<V, R>, V, BiConsumer<T, R>): Injects a value based on a variant
I made a big effort with these overloads to keep the public API as simple as possible while still allowing for flexibility.
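One way to picture the three overloads is that each one boils down to a Consumer<T> that is stored on the factory. The sketch below assumes exactly that; OverloadedFactory, withSetter and Article are hypothetical names, and the bodies are my guess at the behaviour rather than the library's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch of the three with overloads.
final class OverloadedFactory<T> {
    private final Supplier<T> blueprint;
    private final List<Consumer<T>> setters;

    OverloadedFactory(Supplier<T> blueprint) {
        this(blueprint, List.of());
    }

    private OverloadedFactory(Supplier<T> blueprint, List<Consumer<T>> setters) {
        this.blueprint = blueprint;
        this.setters = List.copyOf(setters);
    }

    // Single related object: the supplier is evaluated for every created instance.
    <R> OverloadedFactory<T> with(Supplier<R> value, BiConsumer<T, R> setter) {
        return withSetter(target -> setter.accept(target, value.get()));
    }

    // List of related objects: the function receives the requested count.
    <R> OverloadedFactory<T> with(Function<Integer, List<R>> values, int count,
                                  BiConsumer<T, List<R>> setter) {
        return withSetter(target -> setter.accept(target, values.apply(count)));
    }

    // Value derived from a variant: the function receives the variant.
    <V, R> OverloadedFactory<T> with(Function<V, R> value, V variant, BiConsumer<T, R> setter) {
        return withSetter(target -> setter.accept(target, value.apply(variant)));
    }

    // Every overload ends up here as a plain Consumer<T> on a new factory.
    private OverloadedFactory<T> withSetter(Consumer<T> setter) {
        List<Consumer<T>> copy = new ArrayList<>(setters);
        copy.add(setter);
        return new OverloadedFactory<>(blueprint, copy);
    }

    T create() {
        T instance = blueprint.get();
        setters.forEach(setter -> setter.accept(instance));
        return instance;
    }
}

// Hypothetical entity with one field per overload.
class Article {
    private String author;
    private List<String> tags;
    private String status;

    void setAuthor(String author) { this.author = author; }
    void setTags(List<String> tags) { this.tags = tags; }
    void setStatus(String status) { this.status = status; }
    String getAuthor() { return author; }
    List<String> getTags() { return tags; }
    String getStatus() { return status; }
}
```

The caller only ever sees with; which overload runs is decided by the argument types, so passing an int count selects the list version while any other second argument selects the variant version.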
create
The create method is self-explanatory. It creates the new object based off the object returned by the blueprint method. It calls blueprint, then applies any custom setters to the return value, before passing the modified object to an object mapper which merges any overwrites that have been passed to the create method call.
Similar to the with method, there are a number of overloads to allow for different usages of create.
create(): Builds a single instance using the blueprint and custom setters
create(Map<String, ?> variation): Applies field overrides using a map
create(T variation): Merges fields from an existing instance
create(int count): Batch creation of instances from the same blueprint
create(VariantList<T>): Batch creation using pre-defined entity variants
create(VariantMapList): Batch creation using field maps (variant by map)
I was forced to create some custom types here to avoid ambiguity with the type checker. This was caused by allowing both a List<Map<String, ?>> and a List<T> to be passed to create, to allow the creation of multiple variants. This is called a Sequence in the Laravel terminology. To get the type checker to allow this, I had to create some wrapper classes, VariantMapList and VariantList<T>, to distinguish between the types. This creates some overhead, but until I can find a better solution I will put up with it to keep the public API concise.
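For anyone who hasn't run into this before: in Java, overloads taking List<Map<String, ?>> and List<T> share the same erasure, create(List), so they cannot be declared side by side. Below is a minimal sketch of how wrapper types get around that. The class bodies are my own assumption for illustration, CreateOverloads is a hypothetical holder class, and the methods return a label instead of building entities.

```java
import java.util.List;
import java.util.Map;

// Hypothetical wrapper around a list of entity variants.
final class VariantList<T> {
    final List<T> items;

    VariantList(List<T> items) {
        this.items = List.copyOf(items);
    }
}

// Hypothetical wrapper around a list of field maps.
final class VariantMapList {
    final List<Map<String, ?>> items;

    VariantMapList(List<? extends Map<String, ?>> items) {
        this.items = List.copyOf(items);
    }
}

final class CreateOverloads {
    private CreateOverloads() {}

    // Declaring both create(List<T>) and create(List<Map<String, ?>>) here would
    // fail to compile with a name clash, since both erase to create(List).
    // The wrappers are distinct classes, so these two overloads can coexist.

    static <T> String create(VariantList<T> variants) {
        return "entities:" + variants.items.size();
    }

    static String create(VariantMapList variants) {
        return "maps:" + variants.items.size();
    }
}
```

The cost is one extra constructor call at the call site, for example create(new VariantMapList(List.of(Map.of("email", "a@test.com")))), which is the overhead mentioned above.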
The module
BlueprintFactory has been uploaded to Maven Central. You can read the docs and learn how to use it in your own projects at BrianDouglasIE/BlueprintFactory.
Feel free to create a GitHub issue with any suggested improvements.