State vs Behaviour Verification

I stumbled upon an excellent article by Martin Fowler in which he describes the difference between stubs and mocks, and then the difference between classic and mockist TDD. Be warned: the article is a few years old, but it is still highly recommended reading.

Test Objects

In every unit test we usually focus the testing on a single object (the SUT, or System Under Test). This object, however, usually requires some collaborator objects (secondary objects) to perform its operations, and the correctness of these collaborators may not be the target of the unit test.

Because of this, and because certain objects are awkward to work with (such as a data access layer that would require a real database), Test Doubles are used instead of the real objects.

Here are the definitions of Test Doubles (the doubles that replace the collaborators of the object being tested) as Fowler presents them, based on Gerard Meszaros's terms:

Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in memory database is a good example).
Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what's programmed in for the test. Stubs may also record information about calls, such as an email gateway stub that remembers the messages it 'sent', or maybe only how many messages it 'sent'.
Mocks are what we are talking about here: objects pre-programmed with expectations which form a specification of the calls they are expected to receive.
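To make the two ends of this list concrete, here is a minimal hand-written sketch of a dummy and a mock, borrowing the email gateway idea from the stub definition above. The IEmailGateway interface and all class and method names are hypothetical, invented only for this illustration; a mocking library would normally generate the mock for you.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical collaborator interface, used only for this illustration.
public interface IEmailGateway
{
    void Send(string recipient, string body);
}

// Dummy: exists only to fill a parameter slot.
// Throwing makes any accidental use obvious (a design choice, not a requirement).
public class DummyEmailGateway : IEmailGateway
{
    public void Send(string recipient, string body)
    {
        throw new InvalidOperationException("A dummy should never be called.");
    }
}

// Hand-rolled mock: expectations are programmed in up front,
// and Verify() compares them with the calls actually received.
public class EmailGatewayMock : IEmailGateway
{
    private readonly List<string> expectedRecipients = new List<string>();
    private readonly List<string> actualRecipients = new List<string>();

    public void ExpectSendTo(string recipient)
    {
        expectedRecipients.Add(recipient);
    }

    public void Send(string recipient, string body)
    {
        actualRecipients.Add(recipient);
    }

    public void Verify()
    {
        bool sameCalls = expectedRecipients.Count == actualRecipients.Count;
        for (int i = 0; sameCalls && i < expectedRecipients.Count; i++)
        {
            sameCalls = expectedRecipients[i] == actualRecipients[i];
        }
        if (!sameCalls)
        {
            throw new InvalidOperationException(
                "Expected sends to [" + string.Join(", ", expectedRecipients) +
                "] but received [" + string.Join(", ", actualRecipients) + "].");
        }
    }
}
```

A test would call ExpectSendTo(...) before exercising the SUT and Verify() afterwards, so the assertion is about the calls received rather than about any resulting state.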

Verification

We are interested in the difference between using mocks and using any other type of collaborator, whether it is a fake, a stub or even a real object. The main difference, as described by Fowler, is that mocks use behaviour verification, while the others use state verification.

In state verification you have the object under test perform a certain operation, after being supplied with all necessary collaborators. When it ends, you examine the state of the object and/or the collaborators, and verify it is the expected one.

In behaviour verification, on the other hand, you specify exactly which methods are to be invoked on the collaborators by the SUT, thus verifying not that the ending state is correct, but that the sequence of steps performed was correct.

Example

Let’s take a look at an example. Suppose you have three objects: a Calculator, a Repository and a Requestor. The first one will be the SUT, performing certain operations for which it takes certain data from the Repository and reports results or actions to the Requestor. These last two are therefore the collaborator objects.

A concrete example could be a UserMailNotifier service, which reads data from users in the database using the UserRepository and sends mails to those who fall under a certain condition using the MailService. The first one would be the calculator, the object being tested with the logic you want to test. The other two are the collaborators.

Note that there might be many more than just two collaborators; here we are just presenting an abstract example in which we divide them based on frequent interactions between components.

Let’s take a look at the interfaces of our example, clearly oversimplified:

interface IUserMailNotifier
{
    // Sends a mail to all premium users telling them something
    void MailPremiumUsers();
}

interface IUserRepository
{
    // Returns all users in the repository
    IEnumerable<User> GetUsers();

    // Returns all premium users in the repository
    IEnumerable<User> GetPremiumUsers();
}

interface IMailService
{
    // Sends the mail to target user
    void SendMail(User to, object mail);
}

So we have our implementation of UserMailNotifier, which we want to test, that invokes the GetUsers() method, filters the premium ones, and mails them:

class UserMailNotifier : IUserMailNotifier
{
    private IUserRepository userRepository;
    private IMailService mailService;

    public UserMailNotifier(IUserRepository userRepository, IMailService mailService)
    {
        this.mailService = mailService;
        this.userRepository = userRepository;
    }

    public void MailPremiumUsers()
    {
        foreach (User user in userRepository.GetUsers())
        {
            if (user.IsPremium)
            {
                this.mailService.SendMail(user, Mail.Premium);
            }
        }
    }
}

And yes, suppose we haven’t noticed that there is a GetPremiumUsers in IUserRepository, so we are getting all users and filtering client side.

Testing the Example

We will first test our UserMailNotifier in the classic way, without using any mocks. We will create a fake UserRepository, that will use a List as an in-memory data repository, and a stub MailService, that will store the recipients in a list to be checked later.

class UserRepositoryFake : IUserRepository
{
    public List<User> Users = new List<User>();

    public IEnumerable<User> GetUsers()
    {
        return Users;
    }

    public IEnumerable<User> GetPremiumUsers()
    {
        return Users.Where(user => user.IsPremium);
    }
}

class MailServiceStub : IMailService
{
    public List<User> Recipients = new List<User>();

    public void SendMail(User to, object mail)
    {
        Recipients.Add(to);
    }
}

So our test would look like this:

[TestMethod]
public void TestState()
{
    // Setup
    var mailService = new MailServiceStub();
    var userRepository = new UserRepositoryFake();

    // Create fake data
    userRepository.Users.Add(new User { Name = "Joe", IsPremium = true  });
    userRepository.Users.Add(new User { Name = "Jim", IsPremium = false });

    // Create object to test
    var userMailNotifier = new UserMailNotifier(userRepository, mailService);

    // Execute!
    userMailNotifier.MailPremiumUsers();

    // Verify state
    Assert.AreEqual(1, mailService.Recipients.Count);
    Assert.AreEqual("Joe", mailService.Recipients[0].Name);
}

Let’s leave out the fact that most of the test should be factored out to a SetUp method. What this test is doing is initializing all necessary collaborators, using test doubles when the original ones are too awkward or complicated to use, and using them to instantiate the SUT.

The verification is done by examining the mails sent after the execution flow has finished. In other words: you set up the environment for your SUT to execute, invoke it, and analyze the state of the environment after execution.

How would we accomplish this using mocks? First of all, the fake and stub classes are not necessary, since we will be expressing everything as expectations in the mocks. If we merge the setup with the test itself, we end up with something like this:

[TestMethod]
public void TestBehaviour()
{
    // Setup
    var mailService = new Mock<IMailService>();
    var userRepository = new Mock<IUserRepository>();

    // Create expectations
    // (Setup is the current Moq name for what early releases called Expect)
    userRepository
        .Setup(r => r.GetUsers())
        .Returns(() => new User[] {
            new User { Name = "Joe", IsPremium = true },
            new User { Name = "Jim", IsPremium = false }});

    // Create object to test
    var userMailNotifier = new UserMailNotifier(userRepository.Object, mailService.Object);

    // Execute!
    userMailNotifier.MailPremiumUsers();

    // Verify
    mailService.Verify(m => m.SendMail(It.Is<User>(u => u.Name == "Joe"), It.IsAny<Object>()));
}

Here we are testing that the UserMailNotifier made a certain call to our mock object, instructing it to send a mail. Both tests pass, and apparently they are both testing the correctness of our class. However, there is a slight difference between the two…

Changing the SUT’s logic

Now we realize that the UserMailNotifier would be much more efficient if it actually used the GetPremiumUsers() method, which does the filtering server side for us. So we change the implementation of our method:

public void MailPremiumUsers()
{
    foreach (User user in userRepository.GetPremiumUsers())
    {
        this.mailService.SendMail(user, Mail.Premium);
    }
}

Note that this implementation is also correct. However, as you may have guessed by now, dear reader, the first test will succeed whereas the mock-based test will fail: the mock was only told how to answer GetUsers(), so the new call to GetPremiumUsers() never produces Joe and the expected SendMail call never happens.

We have not only changed our SUT’s internal logic, we have changed the way it interacts with its collaborators. And it is precisely that interaction which we are testing by using mocks. In state verification, we test the what, but in behaviour verification, we test the what via the how.

Coupling implementation to tests

The reason for the problem we have just encountered is that behaviour-based tests are tightly coupled to the interaction between collaborators, because that is the very thing they are testing. So if we change the interaction, we have to change the tests.
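As a sketch of what that change looks like here, the mock-based test could be rewritten to program GetPremiumUsers() instead of GetUsers(). This assumes the Moq library (Setup, Returns, Verify, It and Times); the User and Mail types are not shown in this post, so the minimal definitions below are assumptions, and the check is written as a plain static method rather than a [TestMethod] to keep the block free of a test framework.

```csharp
using System.Collections.Generic;
using Moq;

// Assumed minimal shapes for types the post uses but does not define.
public class User
{
    public string Name { get; set; }
    public bool IsPremium { get; set; }
}

public static class Mail
{
    public static readonly object Premium = new object();
}

public interface IUserRepository
{
    IEnumerable<User> GetUsers();
    IEnumerable<User> GetPremiumUsers();
}

public interface IMailService
{
    void SendMail(User to, object mail);
}

// The revised SUT, which lets the repository do the filtering.
public class UserMailNotifier
{
    private readonly IUserRepository userRepository;
    private readonly IMailService mailService;

    public UserMailNotifier(IUserRepository userRepository, IMailService mailService)
    {
        this.userRepository = userRepository;
        this.mailService = mailService;
    }

    public void MailPremiumUsers()
    {
        foreach (User user in userRepository.GetPremiumUsers())
        {
            mailService.SendMail(user, Mail.Premium);
        }
    }
}

public static class RevisedBehaviourTest
{
    public static void Run()
    {
        var mailService = new Mock<IMailService>();
        var userRepository = new Mock<IUserRepository>();

        // The canned answer now has to be attached to the method the SUT really calls.
        userRepository
            .Setup(r => r.GetPremiumUsers())
            .Returns(new[] { new User { Name = "Joe", IsPremium = true } });

        var userMailNotifier = new UserMailNotifier(userRepository.Object, mailService.Object);
        userMailNotifier.MailPremiumUsers();

        // Verify throws a MockException if the expected call was not made exactly once.
        mailService.Verify(
            m => m.SendMail(It.Is<User>(u => u.Name == "Joe"), It.IsAny<object>()),
            Times.Once());
    }
}
```

The assertion itself did not change, only the setup did, which is exactly the coupling described above: the test has to know which repository method the SUT calls.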

State-based tests are more black-boxed. They don’t actually care how the SUT achieves its result, as long as it is correct. This makes them more resistant to changes and less coupled to design.

Nevertheless, coupling testing with collaboration is not necessarily a bad thing. Sometimes what you are looking for is that a sequence of events or invocations is fulfilled by a certain object, or any other behaviour-like testing that is painfully difficult to check with states.
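For instance, checking the order of invocations can be sketched without any mocking library by letting hand-written doubles append to one shared call log. Everything below is an illustrative assumption: the Recording* class names, the log format, and the minimal User and Mail definitions are not part of the original example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed minimal shapes for types the post uses but does not define.
public class User
{
    public string Name { get; set; }
    public bool IsPremium { get; set; }
}

public static class Mail
{
    public static readonly object Premium = new object();
}

public interface IUserRepository
{
    IEnumerable<User> GetUsers();
    IEnumerable<User> GetPremiumUsers();
}

public interface IMailService
{
    void SendMail(User to, object mail);
}

// Repository double that records every call in a shared log.
public class RecordingUserRepository : IUserRepository
{
    private readonly List<string> log;
    private readonly List<User> users;

    public RecordingUserRepository(List<string> log, List<User> users)
    {
        this.log = log;
        this.users = users;
    }

    public IEnumerable<User> GetUsers()
    {
        log.Add("GetUsers");
        return users;
    }

    public IEnumerable<User> GetPremiumUsers()
    {
        log.Add("GetPremiumUsers");
        return users.Where(u => u.IsPremium).ToList();
    }
}

// Mail service double that records each send in the same log.
public class RecordingMailService : IMailService
{
    private readonly List<string> log;

    public RecordingMailService(List<string> log)
    {
        this.log = log;
    }

    public void SendMail(User to, object mail)
    {
        log.Add("SendMail:" + to.Name);
    }
}

public class UserMailNotifier
{
    private readonly IUserRepository userRepository;
    private readonly IMailService mailService;

    public UserMailNotifier(IUserRepository userRepository, IMailService mailService)
    {
        this.userRepository = userRepository;
        this.mailService = mailService;
    }

    public void MailPremiumUsers()
    {
        foreach (User user in userRepository.GetPremiumUsers())
        {
            mailService.SendMail(user, Mail.Premium);
        }
    }
}

public static class CallOrderExample
{
    // Runs the SUT and returns the ordered log of calls made to its collaborators.
    public static List<string> Run()
    {
        var log = new List<string>();
        var users = new List<User>
        {
            new User { Name = "Joe", IsPremium = true },
            new User { Name = "Jim", IsPremium = false },
            new User { Name = "Ann", IsPremium = true }
        };

        var notifier = new UserMailNotifier(
            new RecordingUserRepository(log, users),
            new RecordingMailService(log));
        notifier.MailPremiumUsers();
        return log;
    }
}
```

A test can then compare the returned log with the expected sequence, for example with SequenceEqual. The doubles here are plain recording classes, yet the check is about the interaction rather than the final state.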

Another good reason for being pro-coupling is TDD. Here we presented the interfaces first, then the implementation, and then the tests. An actual TDD practitioner would swap the last two steps, defining the expected behaviour of the SUT inside the tests. Being forced to define expectations, the mockist TDDer works out the collaborations before actually implementing the class, dealing with interactions earlier.

Martin Fowler deals with both sides of TDD in a rich section of his article.

As a side note, in most of this post I have drawn one-to-one relations between state verification and stubs, and between behaviour verification and mocks. It is possible to implement both kinds of verification using both types of test doubles. Stubs may keep a list storing all invocations to be verified later, and mocks may define relatively lax expectations and leave the test to perform state verification on the SUT (if possible).

Finally…

To sum up, when deciding whether to use mocks or stubs, you must bear in mind that the decision will be coupled to what you want to test, not only how to test it. Remember this when writing your next fixture.