Introducing Evidence Sets

[Updated to reflect change in feature scope following user feedback]

We’re pleased to announce a new feature that we’re working on: Evidence Sets. An evidence set lets a user group several test run sets into a single viewable unit that can be used to provide evidence (hence the name) of testing.

Evidence sets can be used in multiple different ways, including:

  • Allowing for failing tests to be re-run if desired without having to re-run everything.
  • Grouping multiple pieces of testing together to have a single reference for test results and for end user sign-off.

Main features

The main features of evidence sets are:

  • Ability to collate multiple test run sets together, including:
    • optional prefixes to create custom hierarchies
    • subsets of test runs as desired
    • from multiple different products
  • Ability to have multiple test runs contribute to a single test (e.g. to handle re-runs). When several contributing test runs are specified, the state of the test is decided by one of several options: best result, worst result, first result, last result or not allowed.
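
To make the options for multiple contributing test runs concrete, here is a minimal Python sketch of how such a rule could behave. It is an illustration only, not the product's implementation: the names `Outcome`, `Policy` and `resolve` are hypothetical, and it assumes just two outcomes (passed and failed), that results arrive in chronological order, and that "not allowed" means more than one contributing run is rejected.

```python
from enum import Enum


class Outcome(Enum):
    # Assumption: only two outcomes, with PASSED ranked better than FAILED.
    FAILED = 0
    PASSED = 1


class Policy(Enum):
    # Hypothetical names mirroring the options listed above.
    BEST = "best result"
    WORST = "worst result"
    FIRST = "first result"
    LAST = "last result"
    NOT_ALLOWED = "not allowed"


def resolve(outcomes, policy):
    """Decide the state of one test from its contributing runs.

    `outcomes` is assumed to be in chronological order (oldest first).
    """
    if not outcomes:
        raise ValueError("no contributing test runs")
    if len(outcomes) == 1:
        return outcomes[0]
    if policy is Policy.NOT_ALLOWED:
        raise ValueError("multiple contributing test runs are not allowed")
    if policy is Policy.FIRST:
        return outcomes[0]
    if policy is Policy.LAST:
        return outcomes[-1]
    if policy is Policy.BEST:
        return max(outcomes, key=lambda o: o.value)
    # Only WORST remains.
    return min(outcomes, key=lambda o: o.value)


# A test that failed and then passed on a re-run.
runs = [Outcome.FAILED, Outcome.PASSED]
for policy in (Policy.BEST, Policy.WORST, Policy.FIRST, Policy.LAST):
    print(policy.value, "->", resolve(runs, policy).name)
```

With a failure followed by a passing re-run, "best result" and "last result" report the test as passed, while "worst result" and "first result" keep it failed, so the choice of option determines whether a re-run can turn a test green.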


Internally, we use the evidence set functionality to allow us to have a coherent overview of the state of the application prior to release. For each release, we want to run:

  • integration tests for the API layer
    • For a fresh install
    • For each DB upgrade path
  • integration tests for the DB update functionality
  • integration tests for the fresh install functionality

Additionally, we would like to be able to show the results of the UI testing.

Example – API Integration Tests

The API integration tests are designed to check that a given instance of the API performs as expected. They cover everything from uploading results to checking that the security model works as expected. Because we want to ensure that the functionality is correct regardless of whether it’s a fresh install or an upgraded install, we want to run the same set of integration tests against as many combinations as possible. Running these tests is highly automated (one just needs to specify the target server and the appropriate admin user to start), so they are trivially easy to run and can generate a large number of result sets to analyse.

By using the evidence sets functionality, we can collate all of these result sets into a single display unit so that it’s very easy to get an overview of the state of the release candidate. We do this by using the ‘prefix’ functionality, which makes it very clear where any problem lies, e.g.

  • api
    • clean
    • upgrades
      • v1
      • v2
      • v3
      • etc.

And then the usual test hierarchy applies underneath each node.
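
As a rough illustration of how prefixes could place each result set under its own node, the Python sketch below merges prefixed test paths into one nested tree. It is a sketch under stated assumptions rather than the product's behaviour: the function `build_tree`, the `/` separator and the test name `upload/accepts_results` are all hypothetical.

```python
def build_tree(run_sets):
    """Merge (prefix, test_paths) pairs into one nested dict.

    Assumption: prefixes and test paths are '/'-separated strings.
    """
    tree = {}
    for prefix, test_paths in run_sets:
        for path in test_paths:
            node = tree
            # Drop empty parts so an empty prefix is also accepted.
            parts = [p for p in f"{prefix}/{path}".split("/") if p]
            for part in parts:
                node = node.setdefault(part, {})
    return tree


# The same (hypothetical) test appears once per install variant.
run_sets = [
    ("api/clean", ["upload/accepts_results"]),
    ("api/upgrades/v1", ["upload/accepts_results"]),
    ("api/upgrades/v2", ["upload/accepts_results"]),
]
tree = build_tree(run_sets)
print(sorted(tree["api"]))              # ['clean', 'upgrades']
print(sorted(tree["api"]["upgrades"]))  # ['v1', 'v2']
```

Because every result set keeps its own prefix, the same test can appear under several nodes without the results colliding, which is what makes a failure traceable to a specific install or upgrade path.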

Note that as we wouldn’t release anything which is non-green, we don’t need to leverage the sign-off functionality in evidence sets.

In addition to the functionality above, we then add the installer / upgrade test results to the same evidence set (under appropriate prefixes) so we can demonstrate to the people signing off the release that everything is good.


We’re putting the final touches to the functionality and hope to complete the work in the next week or so. We’ll then make it available to all of our clients in the usual fashion.

In the meantime, if you have any questions or suggestions, please do get in touch with us.
