The Testing Pyramid

If you are a good developer/tester, you probably ask yourself these questions 🙂

  • How many tests of each type should I have?
  • Do I need to write more unit tests?
  • Do I need to have fewer e2e tests?

Luckily, we have the test pyramid below, which can be used as a rough guideline for how many tests of each type you should have. It is fairly self-explanatory: have a lot of unit tests and very few e2e tests. Why, you ask?

[Image: test pyramid]

Unit tests are cheap in terms of power, CPU usage, processes and so on; e2e tests are the opposite. Unit tests give you fast feedback and tell you exactly where an issue is when one occurs. With an e2e test this is much more difficult: because an e2e test is most likely exercising all the clients and services (or at least two or three of them), you have to dig further into the code to figure out what is going on.

The test pyramid also highlights a testing strategy. This is called ‘Bottom Up’, i.e.

  • Test the domain
  • Test closer to the code
  • Integrate early
  • Use mocks or stubs
  • Visualise test coverage


So if all is well, your changes have been approved and you can release (yay!). But is it really yay? What if something goes wrong and you need to be able to fix it quickly? This is where monitoring comes in.

At the moment we are using Splunk, and it is a really great log aggregator. You can take all your log files and get some meaningful information about what your users are doing. How many transactions are succeeding or failing? What cards are customers using? When are the peak times? And in the case of errors, a graph can show you exactly which service is returning the error. What is great about seeing your logs return useful information live is that they can also tell you when your logging is not good enough. And so, you can go and add better logging 🙂

A note about Splunk: its search mechanism is all based on filters and field extractors. For example, let’s say you want to see transaction amounts against card type. You have to extract both of these fields from the logs, then run a search query based on those field extractors.
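
The idea behind field extraction can be sketched outside Splunk in a few lines of Python. This is not Splunk's search language or API; the log format, the field names `card_type` and `amount`, the sample lines and the `amounts_by_card_type` helper are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical log lines using key=value pairs (an assumed format).
SAMPLE_LOGS = [
    "INFO payment status=success card_type=visa amount=12.50",
    "INFO payment status=success card_type=amex amount=40.00",
    "ERROR payment status=failed card_type=visa amount=7.25",
    "INFO healthcheck ok",
]

# One "field extractor" per field we want to search on.
CARD_TYPE = re.compile(r"card_type=(\w+)")
AMOUNT = re.compile(r"amount=(\d+(?:\.\d+)?)")


def amounts_by_card_type(lines):
    """Sum the transaction amounts per card type, skipping lines without both fields."""
    totals = defaultdict(float)
    for line in lines:
        card = CARD_TYPE.search(line)
        amount = AMOUNT.search(line)
        if card and amount:
            totals[card.group(1)] += float(amount.group(1))
    return dict(totals)


print(amounts_by_card_type(SAMPLE_LOGS))
```

Note that the health check line is silently skipped: if a field is not in the log line, no extractor can find it.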

The key to using these kinds of tools successfully is to have meaningful logs in the first place. You have to have done that work. This is a place where you want to know instantly whether everything is okay or not.
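
As a rough sketch of what "meaningful logs" could look like, the snippet below writes each payment as one line of key=value fields using Python's standard `logging` module. The logger name `payments`, the field names and the helpers `format_event` and `log_payment` are assumptions for illustration, not anything prescribed by a particular log aggregator.

```python
import logging

logger = logging.getLogger("payments")  # hypothetical logger name


def format_event(event, **fields):
    """Render an event name followed by its fields as sorted key=value pairs."""
    pairs = " ".join(f"{key}={value}" for key, value in sorted(fields.items()))
    return f"{event} {pairs}".strip()


def log_payment(status, card_type, amount):
    """Log one payment with every field a later search is likely to need."""
    message = format_event(
        "payment", status=status, card_type=card_type, amount=f"{amount:.2f}"
    )
    # Failed payments are logged at ERROR so they stand out in a search.
    level = logging.INFO if status == "success" else logging.ERROR
    logger.log(level, message)
    return message


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s"
    )
    log_payment("success", "visa", 12.5)
    log_payment("failed", "amex", 40)
```

Keeping the field names consistent across services is what makes them easy to extract and search on later.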


Unit Tests

What is a unit test?

A unit test takes a very small piece of testable code and determines whether it behaves as expected. The size of the unit under test is not strictly defined; however, unit tests are typically written at the class level or around a small group of related classes. The smaller the unit under test, the easier it is to express its behaviour in a unit test, since the branch complexity of the unit is lower.

If it is difficult to write a unit test, this can be a sign that the module should be broken down into independent, more coherent pieces and tested individually.

Unit testing is a powerful design tool in terms of code and implementation, especially when combined with TDD.

What can/should you unit test?

  1. Test the behaviour of modules by observing changes in their state. This treats the unit as a black box tested entirely through its interface.
  2. Look at the interactions and collaborations between an object and its dependencies. The dependencies themselves are replaced by mocks, stubs etc.
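
The two styles above can be sketched with Python's standard `unittest` and `unittest.mock` modules. The `Basket` class and its `payment_gateway` dependency are hypothetical and exist only to give the tests something to exercise.

```python
import unittest
from unittest.mock import Mock


class Basket:
    """Hypothetical unit under test."""

    def __init__(self, payment_gateway):
        self._gateway = payment_gateway
        self.items = []
        self.paid = False

    def add(self, name, price):
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)

    def checkout(self):
        # Delegates the actual charge to the dependency.
        self.paid = self._gateway.charge(self.total())
        return self.paid


class BasketTest(unittest.TestCase):
    def test_state_total_reflects_added_items(self):
        # Style 1: black box, observe state through the public interface.
        basket = Basket(payment_gateway=Mock())
        basket.add("book", 10)
        basket.add("pen", 2)
        self.assertEqual(basket.total(), 12)

    def test_interaction_checkout_charges_gateway_once(self):
        # Style 2: replace the dependency with a mock and verify the interaction.
        gateway = Mock()
        gateway.charge.return_value = True
        basket = Basket(gateway)
        basket.add("book", 10)
        self.assertTrue(basket.checkout())
        gateway.charge.assert_called_once_with(10)
```

Saved to a file, these can be run with `python -m unittest <file>`.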

Purpose of unit tests

  • Constrain the behaviour of the unit
  • Provide fast feedback


Failing tests in the pipeline

You know the situation: there are failing tests in the pipeline, you ask someone about it, and the response you get is ‘Oh, these tests are failing because so-and-so service is not running, so it’s fine; these tests can fail’. Sound familiar? I really dislike this response, for three reasons:

  1. Why did we write these tests if we are going to be fine with them failing?
  2. Surely if they fail, this should be a flag that something is wrong (for example, an unreliable, flaky service)?
  3. If these tests do not give valuable feedback and are useless, just get rid of them. A failing test build should mean that there is no release. If you release with a failing test build and something goes wrong in production then what are you going to do? The tests highlighted that something was wrong and we chose to risk it.

If tests need certain services running to pass, have those services running. If a service stops running for random reasons then how can you be confident that it won’t be the same on production?
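
One way to make a missing service impossible to shrug off is a pre-flight check that stops the run with a clear message before any test executes. The sketch below is only an illustration: the service names, hosts and ports, and the helpers `service_is_up` and `require_services`, are hypothetical.

```python
import socket


def service_is_up(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def require_services(services, is_up=service_is_up):
    """Raise if any required service is down, naming each one.

    `services` maps a service name to a (host, port) pair. `is_up` can be
    swapped out so the check itself is testable without a network.
    """
    down = [
        name for name, (host, port) in services.items() if not is_up(host, port)
    ]
    if down:
        raise RuntimeError(
            "required services not running: " + ", ".join(sorted(down))
        )
```

Called at the start of the test run, for example as `require_services({"auth": ("localhost", 8081)})`, this turns "the tests fail because a service is down" into a single explicit error about the environment.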

If tests need certain data to pass, have that data. That data would be in live, right?
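
One way to "have that data" is for each test to create what it depends on during setup, so it never relies on whatever happens to be lying around in the environment. A minimal sketch using Python's built-in `sqlite3` and `unittest` modules follows; the `cards` table and the `find_card_type` function are hypothetical.

```python
import sqlite3
import unittest


def find_card_type(conn, customer_id):
    """Hypothetical code under test: look up a customer's card type."""
    row = conn.execute(
        "SELECT card_type FROM cards WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


class CardLookupTest(unittest.TestCase):
    def setUp(self):
        # Seed exactly the data the tests need, in a fresh in-memory database.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE cards (customer_id INTEGER PRIMARY KEY, card_type TEXT)"
        )
        self.conn.execute("INSERT INTO cards VALUES (1, 'visa')")

    def tearDown(self):
        self.conn.close()

    def test_known_customer_has_card_type(self):
        self.assertEqual(find_card_type(self.conn, 1), "visa")

    def test_unknown_customer_returns_none(self):
        self.assertIsNone(find_card_type(self.conn, 99))
```

The same principle applies to a shared test environment: the data the tests rely on should be loaded deliberately, not assumed.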

A test environment should simulate the live environment as closely as possible. We can test in it all we want, but the environment configuration should be almost identical.

Automated tests should always have a meaning and purpose, otherwise there is just no point having them.