How we test

As the tool itself is designed to encourage people to improve the testing of their product, it behooves us to make sure that the tools we provide are properly tested themselves. By using Conical to test itself, we also follow the same workflows that we recommend to others, so if there are any issues with any of them, we’re doubly incentivised to fix them.

In general, we use unit tests wherever possible and fall back on integration testing only where unit tests aren’t practical.

Our unit tests use Xunit, Moq and FluentAssertions. For integration tests where we want to report differences (as opposed to throwing on differences, as we do in the unit tests), we use the object flattener and comparison frameworks that we released previously.
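To give a flavour of the unit testing style, here’s a minimal sketch using those three libraries. The `IProductStore` / `ProductService` types are made up for this example rather than being part of the Conical codebase.

```csharp
using FluentAssertions;
using Moq;
using Xunit;

// Hypothetical service + dependency, purely for illustration.
public interface IProductStore
{
    string GetDescription(string productName);
}

public class ProductService
{
    private readonly IProductStore _store;
    public ProductService(IProductStore store) => _store = store;

    public string DescribeProduct(string productName) =>
        _store.GetDescription(productName) ?? "<no description>";
}

public class ProductServiceTests
{
    [Fact]
    public void DescribeProduct_ReturnsPlaceholder_WhenStoreHasNoDescription()
    {
        // Arrange - mock the dependency with Moq
        var store = new Mock<IProductStore>();
        store.Setup(s => s.GetDescription("dogfood-api")).Returns((string)null);

        var service = new ProductService(store.Object);

        // Act
        var description = service.DescribeProduct("dogfood-api");

        // Assert - FluentAssertions gives readable failure messages
        description.Should().Be("<no description>");
    }
}
```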

Architecture Reminder

The tool itself is broken down into a series of layers:

  • DB layer
  • Main API (.NET Core)
  • Main UI (Blazor WebAssembly)

The client access libraries, final installation tool and comparison tools are outside the scope of this article but it can be assumed that similar approaches are taken for those components.

Each layer is tested in various ways before we move to the next layer. This way, it’s simple to track where any changes in behaviour come from.

Conical configuration

As part of the dogfooding process, we use Conical to help us test Conical. We do this by using Conical to store the results of the various integration tests that we run. When a test fails, we typically publish a swathe of information (via logs or additional files) to assist in understanding what happened.

For our use-cases, we have 3 products defined:

  • dogfood-api
  • dogfood-ui
  • dogfood-deployment

Each of these uses a single test run type (‘Integration Test’ FWIW) exposing logs and additional files.

To do the actual uploading, we have a dedicated user account solely for this purpose and use a restricted access token.

DB Schema Testing

It’s quite difficult to test the DB layer / schema as it [the underlying DB as opposed to the DB installation tool] is not an independent component. So to ensure that the DB schema itself is tested, we tend to rely on integration tests at the API layer to confirm that the API + DB combination behaves as expected.

However, as the Conical Docker image is capable of installing and updating the underlying DB schema, we have encapsulated that functionality into its own component which can be tested directly. The DB installation component is capable of performing two operations (sketched below):

  1. Installing a fresh instance of the schema (including creating new users)
  2. Upgrading an existing schema to a newer schema
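A minimal sketch of what such a component’s surface might look like; the interface and its members are hypothetical, as the real installation tool’s API isn’t shown here.

```csharp
using System.Threading.Tasks;

// Hypothetical shape of the encapsulated DB installation component; the real
// tool's API will differ - this only illustrates the two operations above.
public interface IDbSchemaInstaller
{
    // Install a fresh copy of the schema (including creating new users).
    Task InstallAsync(string connectionString);

    // Upgrade an existing schema in place to the current version.
    Task UpgradeAsync(string connectionString);
}
```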

To test these, we have a PowerShell / Linux script which does the following:

  • Create a new Docker network
  • Spin up a Docker container with a new instance of SQL Server

And then, depending on the scenario:

New installations

  • Create a new, empty schema
  • Run the installation tool against that schema (success criterion: no errors)

Existing installations

  • Import a previous version of the DB schema into a new schema on that DB instance
  • Run the upgrade tool against that schema (success criterion: no errors)

These tests can be run by a developer locally from the command line and are also run by our TeamCity infrastructure on every DB-related check-in. Note that these check-ins normally consist of adding new entries to the DB creation script and its associated upgrade components.

Once the tests have passed, the built binaries are uploaded to our internal NuGet server for consumption later on (by API testing and final release packaging). They can also be used to upgrade the dogfooding instance.

Note that this testing doesn’t verify that the API will function correctly (that is covered by the subsequent API testing); it only confirms that the installation / upgrade process will succeed.

API Testing

The API is a .NET Core web API, internally structured using a services architecture: the publicly exposed controllers are typically very thin shims over the underlying functionality and contain little logic. Wherever possible, these individual services are unit tested; where it hasn’t been practical to structure the controllers in this way, the controllers themselves are unit tested.
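To illustrate the layering, here’s a hedged sketch of the ‘thin controller over a service’ pattern; the controller, service and DTO names are invented for this example and aren’t the actual Conical types.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Hypothetical service interface and DTO - the real Conical types will differ.
public interface ITestRunService
{
    Task<TestRunDto> GetTestRunAsync(int id);
}

public class TestRunDto
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Status { get; set; } = "";
}

[ApiController]
[Route("api/[controller]")]
public class TestRunsController : ControllerBase
{
    private readonly ITestRunService _service;
    public TestRunsController(ITestRunService service) => _service = service;

    // The controller is just a shim: the interesting logic (and hence most of
    // the unit testing) lives in the ITestRunService implementation.
    [HttpGet("{id:int}")]
    public async Task<ActionResult<TestRunDto>> Get(int id)
    {
        var run = await _service.GetTestRunAsync(id);
        if (run is null)
            return NotFound();
        return Ok(run);
    }
}
```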

What these unit tests cannot do, however, is exercise the interaction with an actual live DB or check how the system behaves over a series of actions, so we have a second layer of tests to handle those checks. To that end, we have an integration test project which performs a series of operations mimicking what our user base might try to do, for example (one such check is sketched below):

  • Push CSV data of various forms and check that it correctly round trips
  • Create a new user
  • Create or edit products
  • Upload results, mark them for deletion, then restore them
  • Upload results, mark them for deletion and empty the recycle bin
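As an example of the first bullet, a round-trip check might look roughly like this. `IConicalClient` here is a hand-written stand-in for the client generated from the Swagger file, so the method names are purely illustrative.

```csharp
using System.Threading.Tasks;
using FluentAssertions;

// Stand-in for the swagger-generated client; the real method names and
// signatures will differ.
public interface IConicalClient
{
    Task<int> UploadCsvResultsAsync(string product, string csvContent);
    Task<string> DownloadCsvResultsAsync(int testRunId);
}

public class CsvRoundTripCheck
{
    private readonly IConicalClient _client;
    public CsvRoundTripCheck(IConicalClient client) => _client = client;

    // Push CSV data up, pull it back down and confirm nothing was lost or
    // mangled in the process.
    public async Task AssertRoundTripsAsync(string csv)
    {
        var runId = await _client.UploadCsvResultsAsync("dogfood-api", csv);
        var downloaded = await _client.DownloadCsvResultsAsync(runId);
        downloaded.Should().Be(csv);
    }
}
```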

The results of these tests are uploaded to our dogfooding instance of the tool so we can view them and drill into any failures. Because we do not expect there to be any valid reasons for failure (e.g. numeric tolerances), we treat any failure here as grounds to fail the build.

These tests are usually run against a fresh instance (see below), but they can also be run against a locally running instance of the API (i.e. one running in a Visual Studio debugger session) if necessary. We try to avoid doing so, however, as the tests are quite liberal about creating additional products in the tool, which can leave your local instance looking a little messy.

Note that the code that the integration tests use to connect to the API is generated automatically by a tool from the swagger file. Typically we would expect to see only additions to the generated code, but where changes occur, they’re flagged up as differences here to remind the developer to be careful.

The approach we take is as follows (and again, these steps can be run with a single script, as before):

  • Create a Docker network
  • Spin up a SQL Server Docker container and attach it to said network
  • Wait for SQL Server to be ready (depending on the underlying machine running the tests, this can take between 2 and 30 seconds)
  • Run the DB creation tool to create a new schema (this also creates the admin user with a pre-defined username/password/token combination for simplicity). This is done within a Docker container attached to the same network for simplicity’s sake.
  • Build a Docker image for the API and launch a container from it on the same network
  • Wait for the API to start
  • Run the integration tests against the new candidate API container (again, in its own container on the same network)
  • Upload these results to our dogfooding instance of the tool

Note that as these tests can be run locally by a developer, we expect them to be run prior to committing the code. When run locally, they also publish data to the dogfooding instance, so the analysis process in the case of differences is identical.

Once the tests have been run, one can go to the dogfooding instance and browse to see which tests passed and which failed (the CI system also goes red on failure). When a test has failed, it will typically have uploaded all of the necessary information as to why it failed (logs, additional files etc.) so that a developer can see what happened.

Note that these integration tests can also be run against upgraded schemas to ensure that all different usage patterns are covered. The steps taken for these are very similar to the ‘new instance’ approach, except that a DB backup is imported into the SQL server instance and then upgraded.

The aim here is to be confident that the tool will behave as expected for a programmatic client (as opposed to the UI).

UI Testing

The UI is a Blazor WebAssembly based app and is structured as a series of pages, reusable controls and services. All of the pages and controls follow a model / view approach, which makes them simpler to unit test.

We test the models and the views independently, with bUnit used to enable the view testing.
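As a flavour of the view-side testing, here’s a minimal bUnit sketch; `StatusBadge` is a made-up control used purely for illustration rather than one of the actual Conical components.

```csharp
using Bunit;
using FluentAssertions;
using Microsoft.AspNetCore.Components;
using Microsoft.AspNetCore.Components.Rendering;
using Xunit;

// Made-up component: renders its Status parameter inside a styled span.
public class StatusBadge : ComponentBase
{
    [Parameter] public string Status { get; set; } = "";

    protected override void BuildRenderTree(RenderTreeBuilder builder)
    {
        builder.OpenElement(0, "span");
        builder.AddAttribute(1, "class", $"badge-{Status.ToLowerInvariant()}");
        builder.AddContent(2, Status);
        builder.CloseElement();
    }
}

public class StatusBadgeTests : TestContext
{
    [Fact]
    public void RendersPassedStatus_WithExpectedCssClass()
    {
        // Render the component with its parameter set, then inspect the markup.
        var cut = RenderComponent<StatusBadge>(parameters => parameters
            .Add(p => p.Status, "Passed"));

        cut.Find("span.badge-passed").TextContent.Should().Be("Passed");
    }
}
```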

From here, we then have confidence that the tool will perform most actions according to expectations. However, that still leaves a gap: ensuring that the website actually looks and behaves appropriately, i.e. that there hasn’t been a mistake in defining the CSS, wiring up the correct links etc.

We then run a set of integration tests using Selenium. These tests are run against a range of browsers (currently all on Linux for ease of automation on our Linux build machine, but we would like to extend that to cover mobile devices as well) and an evidence set is created for all of these runs.

These integration tests typically cover areas which are hard to cover by unit tests, e.g. we have tests to check that if a user has special characters in a name, then they’re correctly encoded and the links work as expected.
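A hedged sketch of what one of these Selenium checks might look like; the URL, product name and selector are illustrative only and not taken from the real test suite.

```csharp
using System;
using FluentAssertions;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using Xunit;

// Illustrative only: the URL, product name and selector are invented.
public class ProductNameEncodingTests : IDisposable
{
    private readonly IWebDriver _driver = new ChromeDriver();

    [Fact]
    public void ProductLink_WithSpecialCharacters_NavigatesCorrectly()
    {
        _driver.Navigate().GoToUrl("http://localhost:5000/products");

        // Follow the link for a product whose name needs URL encoding and
        // check that we end up on the right page.
        _driver.FindElement(By.LinkText("dogfood & friends")).Click();

        _driver.Title.Should().Contain("dogfood & friends");
    }

    public void Dispose() => _driver.Quit();
}
```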

At this point, we know that all of the automated tests have passed (or, in the case of a failure, where it failed so that we can investigate whether or not it’s something to be concerned about). Note that a failure in a unit test will fail the build, whereas a failure in an integration test just requires a manual check (sometimes browsers are temperamental and produce false negatives).

From here, there’s then a final set of manual testing. This is currently a checklist in Excel with a list of steps to take, e.g. for the results XML control, we check that we can see the XML, transform it and download it. We typically don’t perform all of these tests for each release unless we’ve worked on that area / might have affected it. They can typically be performed in about 20-30 minutes, but that’s still a significant hit on developers’ time. The tester will then use our Excel add-in to upload the test results (for audit trail purposes) to the dogfooding instance. This also has the effect of dogfooding the Excel integration.

Although this latter point might seem to go against our own exhortation to test everything, we mention it because it highlights the reality of software development: testing budgets (time or money) are limited and people have to make choices as to where best to spend them. Most of these specific tests could be automated, but doing so would take a reasonable length of time and we don’t think it’s the highest-priority task right now.

TLDR – We’ve adopted the approach of automating as much as possible using the full range of testing tools available to us (and will continue to do so), but where the cost of automation is currently greater than the cost of manual testing, we’re willing to make that choice.

Docker image

Although internally, we treat the API and UI as distinct projects, this is neither how an end user would think of them nor how we ship them publicly. To that end, we have a CI process which combines all of the various dependencies together to generate the actual Docker image that we upload to the public repository.

In order to ensure that this docker image behaves in the expected manner, we run multiple levels of tests:

  • Installer tests
    • Check that a completely fresh instance can be installed
    • Check that a new server works against an existing, valid DB instance
      • No DB upgrade needed
      • DB upgrade needed
    • Check that invalid states are handled
    • Check that valid states just start
  • Re-running API tests

Again, these results are uploaded to our dogfooding instance of Conical (dogfood-deployment). As with the API example, failures here are deemed to fail the whole build.

These results are all uploaded with an appropriate tag so that they can be identified as having come from a particular build pipeline, e.g. ci-123.

Once all of the tests have run, we use the command line evidence set creator tool to create an evidence set containing all of the testing evidence from that pipeline, no matter how it was created (API tests, DB tests etc.). We use this to double-check that all of the testing has actually been run and that there hasn’t been a CI configuration problem etc.

After the automated tests have been run and the candidate image is ready, we can launch a CI job which spins up a completely fresh instance of Conical from the candidate Docker image and populates it (using the same population tool as we use for the demo instance). This allows us to perform some last-minute sanity checks before we upload to the public Docker repository.

Future plans

Our future plans to improve the testing include:

  • Automating more of the UI tests
  • Running the UI tests against additional browsers / devices
  • Replacing the Excel checklist with an in-tool ‘test book’
  • Using the dashboard functionality to provide a summary page for a release (we currently use an evidence set)