In this tutorial we learn how to use mocks when unit testing, why integration testing is sometimes just not enough, and how to save some trees. 🌳
Prerequisites
I assume that you are already familiar with some basic concepts of unit testing. I’ll be using Java, JUnit and Mockito, but don’t worry if you’re not familiar with them: the purpose of this tutorial is to learn to use mocking. You’ll pick up platform-specific things on the fly.
Here is a link to the repository where I have put the full source code for this article. You can download it and look at it while reading, or look at it afterwards.
Let’s go!
Our project
I borrowed this idea from a friend. Her son, 5 years old at the time, would ask her every day: "Mommy, mommy, how many days did I live today?"
The mommy would calculate and tell him: "Sweetie, you are 1868 days old today".
Next day, the son would ask again.
So she wrote him an iOS app that he could open every day and see for himself.
I thought this could be a nice idea to base this tutorial on :).
Of course, we are not going to write a full iOS app today. Our project is going to be a simple Java console app. We are going to have a small database with several registered users. When the user requests their age in days, we are going to tell them.
I am going to omit a lot of functionality in our app: there is going to be no user registration process, our database is a small .csv file, and there is no fancy user interface, just a print to the console. I made this example as simple as possible to illustrate the mocking technique.
Here is the repository again, where all the source code is located.
Our main logic is located in this file, UserAgeCalculator.java. It contains the method calculateUserAge that is going to be the centre of our attention today.
    // Calculates the user age in days.
    public int calculateUserAge(int userId, Date dateTo) {
        User user = usersDatabase.getUserById(userId);
        if (user == null) throw new IllegalArgumentException("User doesn't exist");
        double daysDateTo = dateTo.getTime() / MILLISECONDS_IN_DAY;
        double daysUserBirthDate = user.getBirthDate().getTime() / MILLISECONDS_IN_DAY;
        return (int)(daysDateTo - daysUserBirthDate);
    }
This code is pretty straightforward. First, we fetch the user from our database based on the userId. If the user wasn't found, we throw an exception, because we cannot calculate the age of someone we don't know.
After that, we calculate the difference in days between the provided date and the user birth date.
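The same kind of day difference can also be computed on calendar dates instead of milliseconds. The sketch below is not the repository's code: it assumes the dates are available as java.time.LocalDate values rather than java.util.Date, and AgeInDays is a made-up helper name used only for illustration.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration; not taken from the tutorial's repository.
class AgeInDays {
    // Whole calendar days from birthDate to dateTo.
    // The result is negative when birthDate lies after dateTo.
    static long between(LocalDate birthDate, LocalDate dateTo) {
        return ChronoUnit.DAYS.between(birthDate, dateTo);
    }

    public static void main(String[] args) {
        LocalDate from = LocalDate.parse("2019-01-01");
        LocalDate to = LocalDate.parse("2019-01-31");
        System.out.println(between(from, to)); // prints 30
    }
}
```

Working with calendar dates keeps time-of-day and time-zone offsets out of the subtraction, which can be convenient when only whole days matter.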
If we take a look at our database, this is what it looks like:
    1,"Emma","1983-12-03"
    2,"Janneke","1992-04-25"
    3,"Jaap","1996-02-19"
    4,"Sophie","1978-07-30"
    5,"Pieter","2019-01-01"
We have several users registered, all with the user id, their name, and their birth date.
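To make the row format concrete, here is a minimal sketch of how one such line could be turned into an object. Everything in it is an assumption for illustration: CsvUser and its parse method are hypothetical names, the real User class and file reader in the repository may look different, and the parser assumes exactly three fields with no commas inside the quoted values.

```java
import java.time.LocalDate;

// Hypothetical representation of one row; the tutorial's real User class may differ.
final class CsvUser {
    final int id;
    final String name;
    final LocalDate birthDate;

    CsvUser(int id, String name, LocalDate birthDate) {
        this.id = id;
        this.name = name;
        this.birthDate = birthDate;
    }

    // Parses a row such as: 1,"Emma","1983-12-03"
    // Assumes exactly three fields and no commas inside the quoted values.
    static CsvUser parse(String line) {
        String[] fields = line.split(",");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + line);
        }
        int id = Integer.parseInt(fields[0].trim());
        String name = unquote(fields[1]);
        LocalDate birthDate = LocalDate.parse(unquote(fields[2]));
        return new CsvUser(id, name, birthDate);
    }

    // Removes one pair of surrounding double quotes, if present.
    private static String unquote(String field) {
        String trimmed = field.trim();
        if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
            return trimmed.substring(1, trimmed.length() - 1);
        }
        return trimmed;
    }

    public static void main(String[] args) {
        CsvUser emma = CsvUser.parse("1,\"Emma\",\"1983-12-03\"");
        System.out.println(emma.id + " " + emma.name + " " + emma.birthDate);
    }
}
```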
And now, to our first test.
First test
Here is our first test.
    @Test
    public void userEmmaAge() {
        UsersDatabase usersDatabase = new UsersDatabase();
        UserAgeCalculator calculator = new UserAgeCalculator(usersDatabase);
        Date dateTo = DateHelpers.parseDate("2019-01-01");
        int resultAge = calculator.calculateUserAge(1, dateTo);
        assertEquals(12813, resultAge);
    }
We have picked the user with id=1, who is called Emma, and have calculated the difference in days between the 1st of January 2019 and Emma's birthday, which is the 3rd of December 1983.
Emma is 12813 days old.
Integration test?
What we have just written is called an integration test.
We are not only testing our calculateUserAge method. We are testing several pieces of code together. In other words, our calculateUserAge depends on other code in order to work, and we have kept this dependency in our tests.
What does it depend on?
Actually, it depends on the implementation of UsersDatabase. The calculateUserAge method calls UsersDatabase.getUserById, which opens the database, reads the user by id, and returns this user to the caller.
So, in short, our unit test has a dependency on the database.
Why is this bad?
This is not necessarily bad.
Integration tests are a great tool to ensure that your whole app is working well together. When you have several pieces of code, covered well with unit tests, but you haven't tested them working together, there is a chance that you missed a bug. There might be a problem with the pieces working together, and you might miss it if you only test them individually.
If we were working on a real app here, we could decide to have a bunch of unit tests for checking the calculation logic and a couple of integration tests to ensure that the whole app works well.
Integration tests are usually more expensive to run. When your test accesses a database, pushes some data into a queue, calls another service via network and processes the response data, this takes time. An integration test will run slower than a unit test.
Along with that, you might need some protection measures to ensure your integration test doesn't mess with the real production data.
Why is using the production data bad?
Again, it is not necessarily bad in itself. However, there are some complexities this adds.
Tests writing in the database
One problem occurs if your test writes some data to a database. If you run this test, there will be new entries in the database. You will have to take measures to ensure you don't run this on the production database.
Tests reading from the database
It is a bit easier with reading the data. When reading the data, you don't mess with it, you just read it, right? Your production stays safe, right?
While this might be true, there are some problems with reading still.
Missing cases
We can list a couple of problems with integration tests that will not let us test our app fully and extensively.
Data stability
One problem is that production data is sometimes not stable. In our age-calculating app, new users are registered every day, and some users leave the app as well.
What happens if Emma deletes her account?
You can simulate that by deleting Emma from our little database and running the test.
Yes, our test will start failing. Since the user with id=1 will not be found, the calculateUserAge method will throw an exception, and this is definitely not what we expect.
Data richness
The production data for our new app is small. Many cases are not present in our database (just yet).
If we think about cases, like we discussed in this article, we might find that we want to test much more than just our 5 users offer.
For example, right now we don't have a user who is going to be born in the future. But it is possible that an expecting parent might register in our app and count the days before the baby arrives, according to the doctors' predictions. I would like to test this case just to make sure that when this happens in production, our app is prepared and displays the correct number: i.e., -10 days.
Another case would be when we want to calculate the age of a user not present in the database. Remember, our calculateUserAge method throws an exception in this case.
For this, the test needs to request a user id that is not present in the database. We have 5 users right now, with ids from 1 to 5, so how about we use 42 for a non-existing user?
    @Test
    public void nonExistingUser() {
        UsersDatabase usersDatabase = new UsersDatabase();
        UserAgeCalculator calculator = new UserAgeCalculator(usersDatabase);
        Date dateTo = DateHelpers.parseDate("2019-09-12");
        assertThrows(IllegalArgumentException.class, () -> {
            int resultAge = calculator.calculateUserAge(42, dateTo);
        });
    }
Of course, if our app is popular, we'll reach 42 users in no time. This test will start to fail because it won't be throwing an exception anymore.
Environment stability
When the connection to our database fails, we might want to display a message to the user: "Sorry, something went wrong, please try again in a moment".
Naturally, we might want to test that.
However, it might just so happen that our connection is rather stable, and the database doesn't go down while you're running your tests.
This means that a test expecting the database to fail will most likely never pass.
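One way around this is to hand the code a stand-in database that fails on purpose. The sketch below only illustrates the idea and makes several assumptions: BirthDateLookup and AgeScreen are hypothetical names that do not exist in the repository, and I assume that a lost connection surfaces as an IllegalStateException.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class BrokenDatabaseDemo {
    public static void main(String[] args) {
        // A stand-in database that fails on every call; no real outage required.
        BirthDateLookup alwaysDown = userId -> {
            throw new IllegalStateException("connection lost");
        };
        AgeScreen screen = new AgeScreen(alwaysDown);
        // Prints the friendly error message instead of crashing.
        System.out.println(screen.render(1, LocalDate.parse("2019-09-12")));
    }
}

// Hypothetical lookup interface standing in for the tutorial's UsersDatabase.
interface BirthDateLookup {
    LocalDate birthDateOf(int userId);
}

// Hypothetical front end that turns a database failure into a friendly message.
class AgeScreen {
    static final String ERROR_MESSAGE =
            "Sorry, something went wrong, please try again in a moment";

    private final BirthDateLookup lookup;

    AgeScreen(BirthDateLookup lookup) {
        this.lookup = lookup;
    }

    String render(int userId, LocalDate today) {
        try {
            LocalDate birthDate = lookup.birthDateOf(userId);
            long days = ChronoUnit.DAYS.between(birthDate, today);
            return "You are " + days + " days old today";
        } catch (IllegalStateException e) {
            // Assumption: the lookup signals a lost connection with IllegalStateException.
            return ERROR_MESSAGE;
        }
    }
}
```

Because the failure is produced by the stand-in, a test for the error message passes every time, no matter how healthy the real database is.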
Imagine a different app that compiles a report and sends it to a printer. You wouldn't want to waste paper every time you want to run the tests, would you? In cases like this one, "cutting off" the real environment helps to simplify the app and concentrate on its logic, and also save some trees. 🌳
Mocking
In order to overcome these problems, we will turn this test into a true unit test. We will cut off the dependency on the database and will replace it with our own "database", a fake one, the one that we will make individually for each test.
The code will not know the database is fake and will work as it should. But we will! And this makes us powerful.
Let us see how it's done.
Take a look at this new test suite.
    @Test
    public void userBabyAge() {
        UsersDatabase usersDatabase = Mockito.mock(UsersDatabase.class);
        int userId = 42;
        User user = new User(userId, "Baby", DateHelpers.parseDate("2019-09-01"));
        Mockito.when(usersDatabase.getUserById(userId)).thenReturn(user);
        Date dateTo = DateHelpers.parseDate("2019-09-12");
        UserAgeCalculator calculator = new UserAgeCalculator(usersDatabase);
        int resultAge = calculator.calculateUserAge(userId, dateTo);
        assertEquals(11, resultAge);
    }
Look at how we create the UsersDatabase instance. Here is what we did before:
UsersDatabase usersDatabase = new UsersDatabase();
And here's what we are doing now:
UsersDatabase usersDatabase = Mockito.mock(UsersDatabase.class);
We have just created a pretend database, a completely fake one, which will behave as we wish, and will only exist for our test.
After that, we create a user:
User user = new User(userId, "Baby", DateHelpers.parseDate("2019-09-01"));
We have just created a user which does not exist in our database. A completely pretend user, not real.
Then, we make our pretend database work with our fake user:
Mockito.when(usersDatabase.getUserById(userId)).thenReturn(user);
After that, we create our calculator and calculate the age of this very user with userId = 42.
See what we're doing here? Only this imaginary user "exists" in our pretend "database", and we made them only for this test. The test doesn't access the real UsersDatabase now. We have all the power!
This technique is called mocking, and this fake database is called a mock. By adding this mock and replacing the real database with it, we have made our test a pure unit test. This is really powerful because now we can test all scenarios that we couldn't before.
How about a non-existing user? How about a user from the future? Everything is possible now.
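Both scenarios can be sketched without any library, which also shows roughly what a mock does for us. This is a hand-written fake, not Mockito, and all names in it (UserStore, SimpleAgeCalculator, FakeUserStore) are simplified, hypothetical stand-ins for the tutorial's classes. I also assume calendar dates instead of java.util.Date to keep it short.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;

class FakeStoreDemo {
    public static void main(String[] args) {
        LocalDate today = LocalDate.parse("2019-09-12");
        // The fake knows exactly one user: a baby expected in ten days.
        SimpleAgeCalculator calculator =
                new SimpleAgeCalculator(new FakeUserStore().with(7, "2019-09-22"));

        System.out.println(calculator.ageInDays(7, today)); // prints -10

        try {
            // Id 42 is missing for sure, because we never added it to the fake.
            calculator.ageInDays(42, today);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints User doesn't exist
        }
    }
}

// Hypothetical, simplified stand-in for the tutorial's UsersDatabase.
interface UserStore {
    LocalDate findBirthDate(int userId); // returns null when the user is unknown
}

// Hypothetical, simplified stand-in for the tutorial's UserAgeCalculator.
class SimpleAgeCalculator {
    private final UserStore store;

    SimpleAgeCalculator(UserStore store) {
        this.store = store;
    }

    long ageInDays(int userId, LocalDate dateTo) {
        LocalDate birthDate = store.findBirthDate(userId);
        if (birthDate == null) {
            throw new IllegalArgumentException("User doesn't exist");
        }
        return ChronoUnit.DAYS.between(birthDate, dateTo);
    }
}

// A hand-written fake: it only knows the users a test puts into it.
class FakeUserStore implements UserStore {
    private final Map<Integer, LocalDate> birthDates = new HashMap<>();

    FakeUserStore with(int userId, String birthDate) {
        birthDates.put(userId, LocalDate.parse(birthDate));
        return this;
    }

    @Override
    public LocalDate findBirthDate(int userId) {
        return birthDates.get(userId);
    }
}
```

A mocking framework saves us from writing classes like FakeUserStore by hand, but the principle is the same: the test decides what the "database" contains.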
What are other things mocks are useful for?
Imagine that, instead of returning the user's age as a return value, our method printed it on paper.
This would be very difficult to test. Of course, you don't want to waste paper, but the larger problem is that your test would have to find the printer, pull the paper out of it, photograph the text, apply some cool text recognition techniques to see what is being printed, and compare the recognised text with the expected value.
This sounds pretty cool, but it is very difficult to implement. Is there an easier way?
Yes, mocking helps with this as well. If you mock the printer instance, you can ensure that Printer.print() was called exactly once with specific arguments.
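Here is a minimal sketch of that idea using a hand-written recording double instead of a mocking framework. The tutorial only names Printer.print(), so the Printer interface, the AgeReport class and the message format are my assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class RecordingPrinterDemo {
    public static void main(String[] args) {
        RecordingPrinter printer = new RecordingPrinter();
        new AgeReport(printer).printAge("Baby", 11);

        // Exactly one call was made, and it carried the expected text.
        System.out.println(printer.printed.size());  // prints 1
        System.out.println(printer.printed.get(0));  // prints Baby is 11 days old
    }
}

// Hypothetical printer abstraction; the tutorial only names Printer.print().
interface Printer {
    void print(String text);
}

// A hand-written recording double: it stores every call instead of printing.
class RecordingPrinter implements Printer {
    final List<String> printed = new ArrayList<>();

    @Override
    public void print(String text) {
        printed.add(text);
    }
}

// Hypothetical code under test that sends its result to a printer.
class AgeReport {
    private final Printer printer;

    AgeReport(Printer printer) {
        this.printer = printer;
    }

    void printAge(String name, long days) {
        printer.print(name + " is " + days + " days old");
    }
}
```

With Mockito, this kind of check is typically written with Mockito.verify and Mockito.times(1) on a mocked Printer, so no recording class is needed. No paper is harmed either way.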
Mocking frameworks have a number of useful possibilities. You can check how many times a method was called. You can see what arguments were passed each time and then compare only the ones that are needed. You can mock several consecutive calls to one method and make them return different values. You can make every second call throw an exception. I am not sure if this would be useful, but hey, it might be. Take a look at the documentation of your favourite mocking framework to see the full list.
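As it turns out, "every second call throws" can be useful, for example to test retry logic. The sketch below writes such a stand-in by hand so that it runs without any framework. EverySecondCallFails and RetryOnce are hypothetical names, and the retry-once behaviour is an assumption made up for this example.

```java
import java.util.function.Supplier;

class FlakySourceDemo {
    public static void main(String[] args) {
        Supplier<String> flaky = new EverySecondCallFails();
        // Call 1 succeeds right away.
        System.out.println(RetryOnce.fetch(flaky)); // prints result 1
        // Call 2 fails, so the retry makes call 3, which succeeds.
        System.out.println(RetryOnce.fetch(flaky)); // prints result 3
    }
}

// Hypothetical stand-in for a flaky dependency: every second call fails.
class EverySecondCallFails implements Supplier<String> {
    private int calls = 0;

    @Override
    public String get() {
        calls++;
        if (calls % 2 == 0) {
            throw new IllegalStateException("call " + calls + " failed");
        }
        return "result " + calls;
    }
}

// Hypothetical code under test: retries exactly once when the source fails.
class RetryOnce {
    static String fetch(Supplier<String> source) {
        try {
            return source.get();
        } catch (IllegalStateException firstFailure) {
            return source.get();
        }
    }
}
```

In Mockito, consecutive behaviour like this is usually set up by chaining calls such as thenReturn and thenThrow after Mockito.when, instead of counting calls yourself.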
Summary
We have briefly discussed the difference between unit tests and integration tests. Sometimes unit tests are the better fit: they can cover cases that the real data doesn't offer, and they are usually easier and cheaper to write, maintain and run than integration tests.
We have learned to apply mocking in order to isolate our code from external dependencies and write pure unit tests.
Mocking lets us explore more scenarios that were not possible when using "real" data, and our tests are more stable because we control the environment now.