We recently had a client with a live, actively used API who wanted to improve both the API and its testing. The API contained a series of endpoints which the client thought returned a set of rows / objects with a single row per set of unique keys (‘Contract ID’ and ‘Month’). In reality, the API could return multiple rows for a supposedly unique set of keys.
We were expecting the ‘Contract ID’-‘Month’ tuple to be unique; however, the API had other ideas:
Expected:

    ContractID: 1, Month: 2025-06, Payment: 375
    ContractID: 1, Month: 2025-07, Payment: 375
    ContractID: 2, Month: 2025-06, Payment: 57

Actual:

    ContractID: 1, Month: 2025-06, Payment: 375
    ContractID: 1, Month: 2025-07, Payment: 375
    ContractID: 2, Month: 2025-06, Payment: 35
    ContractID: 2, Month: 2025-06, Payment: 22

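To make the problem concrete, here is a minimal sketch of how the offending keys could be spotted with plain LINQ. The `PaymentRow` record and the `DuplicateKeyFinder` class are hypothetical names made up for this illustration; the real data model is generated from the API and will look different.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for illustration only; not the client's real model.
public record PaymentRow(int ContractId, string Month, decimal Payment);

public static class DuplicateKeyFinder
{
    // Returns every (ContractId, Month) key that appears on more than one row.
    public static IReadOnlyList<(int ContractId, string Month)> FindNonUniqueKeys(
        IEnumerable<PaymentRow> rows) =>
        rows.GroupBy(r => (r.ContractId, r.Month))
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
}
```

Run against the "Actual" rows above, this would flag only the key for contract 2 in 2025-06.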
Obviously, the long-term desired outcome was to update the API so that it behaved as expected. However, their front end had been coded to tolerate this duplication, and they had more pressing needs for their product than being architecturally pure.
We wanted to put in a series of comparative tests. These run 2 versions of the API (differing in software, configuration or anything else) and compare their outputs. Unlike classic integration tests, they are intended less as a pure pass-fail check and more as a way to let us know what the impact of releasing the new version will be.
One option here would have been to ignore this whole endpoint during the testing process until the API behaved as expected. This was swiftly ruled out, as the endpoint relied on some of the most complex logic in the platform (heavily SQL based, so unit tests were somewhat scarce) and we were rewriting it for them.
This left us with a few options:
- Do a summation in the test code – i.e. grouping all of the rows together and then testing the resulting summed rows.
- Do the usual collections comparison functionality where we could and then compare the “non-unique unique” row sets.
Option #1 was ruled out as there was no guarantee that the summation would be correct, especially if properties were added to the returned data model in future: anything the summation code didn’t know about would be silently dropped, running the risk of false negatives. Note that because we use a code generation tool to generate the code-level data model, the generated models get updated fairly regularly anyway, so the model itself is unlikely to get out of sync with the actual API being tested.
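The false-negative risk can be sketched as follows. Everything here is hypothetical: `PaymentRow`, the later-added `Fee` property and the `SummationSketch` class are invented for the illustration and are not part of the client's API.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; Fee stands in for a property added after the test was written.
public record PaymentRow(int ContractId, string Month, decimal Payment, decimal Fee);

public static class SummationSketch
{
    // Hand-written aggregation that only knows about Payment and was never updated for Fee.
    public static Dictionary<(int ContractId, string Month), decimal> SumPayments(
        IEnumerable<PaymentRow> rows) =>
        rows.GroupBy(r => (r.ContractId, r.Month))
            .ToDictionary(g => g.Key, g => g.Sum(r => r.Payment));

    // Compares the two sides on summed Payment only; differences in Fee go unnoticed.
    public static bool SummedRowsMatch(
        IEnumerable<PaymentRow> expected, IEnumerable<PaymentRow> actual)
    {
        var expectedTotals = SumPayments(expected);
        var actualTotals = SumPayments(actual);
        return expectedTotals.Count == actualTotals.Count
            && expectedTotals.All(kv =>
                actualTotals.TryGetValue(kv.Key, out var total) && total == kv.Value);
    }
}
```

With one expected row of (2, 2025-06, Payment 57, Fee 0) and actual rows of (2, 2025-06, 35, 0) and (2, 2025-06, 22, 5), `SummedRowsMatch` reports a match even though the fees differ, which is exactly the kind of missed change we wanted to avoid.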
This left option #2. For this, we used the standard BorsukSoftware.Testing.Comparison.Extensions.Collections (nuget) functionality. The return type here contains:
- Matching keys
- Additional keys
- Missing keys
- Non-matching keys
- Incomparable keys
For the incomparable keys, we get a set of:
- the keys which were expected to be unique, but weren’t (in this example, ‘Month’ and ‘Contract ID’)
- the expected rows which matched these keys
- the actual rows which matched these keys
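As a rough mental model of those buckets, the sketch below classifies keys using plain LINQ. It is not the library's implementation or API: `PaymentRow`, `KeyBucket` and `BucketSketch` are hypothetical names, and the bucket definitions are our assumed reading of the categories listed above (additional = only in actual, missing = only in expected, incomparable = more than one row for the key on either side).

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for illustration only.
public record PaymentRow(int ContractId, string Month, decimal Payment);

public enum KeyBucket { Matching, NonMatching, Additional, Missing, Incomparable }

public static class BucketSketch
{
    public static Dictionary<(int ContractId, string Month), KeyBucket> Classify(
        IEnumerable<PaymentRow> expected, IEnumerable<PaymentRow> actual)
    {
        // A lookup returns an empty sequence for keys it doesn't contain.
        var expectedByKey = expected.ToLookup(r => (r.ContractId, r.Month));
        var actualByKey = actual.ToLookup(r => (r.ContractId, r.Month));
        var result = new Dictionary<(int ContractId, string Month), KeyBucket>();

        foreach (var key in expectedByKey.Select(g => g.Key).Union(actualByKey.Select(g => g.Key)))
        {
            var expectedCount = expectedByKey[key].Count();
            var actualCount = actualByKey[key].Count();
            result[key] =
                expectedCount > 1 || actualCount > 1 ? KeyBucket.Incomparable :
                expectedCount == 0 ? KeyBucket.Additional :
                actualCount == 0 ? KeyBucket.Missing :
                // Records compare by value, so this checks every property.
                expectedByKey[key].Single() == actualByKey[key].Single()
                    ? KeyBucket.Matching
                    : KeyBucket.NonMatching;
        }
        return result;
    }
}
```

For the example data at the top of the post, the two contract 1 keys would land in the matching bucket and the contract 2 key would be incomparable.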
From here, we then needed to come up with a way to compare these collections. Because we weren’t interested in the returned order, the simplest thing to do here was to:
- pick an ordering method (payment in our case)
- flatten down the rows using the array plugin
- compare these flattened values
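The three steps above can be sketched without the flattening and comparison libraries at all. This is a simplified stand-in, not what the array plugin does internally: `PaymentRow` and `IndexedComparisonSketch` are hypothetical, only `Payment` is compared, and the `[index].Payment` labels are made up for readability.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for illustration only.
public record PaymentRow(int ContractId, string Month, decimal Payment);

public static class IndexedComparisonSketch
{
    // Orders both sides by Payment, then compares them position by position.
    // Returns one description per difference; an empty list means the sets match.
    public static List<string> CompareByIndex(
        IEnumerable<PaymentRow> expected, IEnumerable<PaymentRow> actual)
    {
        var expectedSorted = expected.OrderBy(r => r.Payment).ToList();
        var actualSorted = actual.OrderBy(r => r.Payment).ToList();
        var differences = new List<string>();

        if (expectedSorted.Count != actualSorted.Count)
            differences.Add($"Count: expected {expectedSorted.Count}, actual {actualSorted.Count}");

        // Only the overlapping positions can be compared value by value.
        for (var i = 0; i < Math.Min(expectedSorted.Count, actualSorted.Count); i++)
        {
            if (expectedSorted[i].Payment != actualSorted[i].Payment)
                differences.Add(
                    $"[{i}].Payment: expected {expectedSorted[i].Payment}, actual {actualSorted[i].Payment}");
        }
        return differences;
    }
}
```

Because both sides are sorted first, the order in which the API happens to return the duplicate rows no longer matters; only a genuine change in the values or in the number of rows shows up as a difference.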
The upside of this approach:
- We were aware of the impact of our changes on this very important endpoint
- We didn’t have a permanent false positive in our tests. Permanent false positives lead developers to simply ignore the given test, so they’d miss an actual unexpected change in this space.
- It was quick to deliver
The downside of this approach:
- It’s a sticky plaster; we still didn’t have a pure API
- When the API is fixed so that the number of rows returned drops, we’ll see very noisy results for that test run. Note that when the API is fixed, the test could also be updated during the development process to do the summation, thus proving that the totals haven’t changed. After that confirmation, the test code could be updated (most likely in a subsequent PR) to remove the summation code, for the reasons mentioned above, so that everything is clean.
The upsides outweighed the downsides and the long term fix was added to the backlog.
We did this via a helper function:

    // Compares the row sets for keys which were expected to be unique but weren't.
    // Each set is ordered with sortingFunc, flattened and then compared by index.
    // Requires: using System; using System.Collections.Generic; using System.Linq;
    public static (IReadOnlyCollection<IReadOnlyDictionary<string, object>> matching, IReadOnlyCollection<(IReadOnlyDictionary<string, object> Keys, IReadOnlyList<KeyValuePair<string, BorsukSoftware.Testing.Comparison.ComparisonResults>> Differences)> multipleRowSetsDifferences)
        CompareIncomparableItems<T>(
            IComparativeTestContext context,
            BorsukSoftware.ObjectFlattener.ObjectFlattener objectFlattener,
            BorsukSoftware.Testing.Comparison.ObjectComparer objectComparer,
            BorsukSoftware.Testing.Comparison.Extensions.Collections.ObjectSetComparerStandard.ComparisonResults<T> comparisonResults,
            Func<IEnumerable<T>, IOrderedEnumerable<T>> sortingFunc)
    {
        var multipleRowSetsMatching = new List<IReadOnlyDictionary<string, object>>();
        var multipleRowSetsDifferences = new List<(IReadOnlyDictionary<string, object> Keys, IReadOnlyList<KeyValuePair<string, BorsukSoftware.Testing.Comparison.ComparisonResults>> Differences)>();

        if (comparisonResults.IncomparableKeys.Count > 0)
        {
            context.LogMessage("");
            context.LogMessage(" => Comparing non-unique collections by index");

            foreach (var grouping in comparisonResults.IncomparableKeys)
            {
                // Either side may be missing for a given key, so fall back to an empty set
                var expectedRows = grouping.Value.ExpectedObjects ?? Array.Empty<T>();
                var actualRows = grouping.Value.ActualObjects ?? Array.Empty<T>();

                // Sort both sides the same way, flatten them and compare the flattened values
                var differences = objectComparer.CompareValues(
                        objectFlattener.FlattenObject(null, sortingFunc(expectedRows)),
                        objectFlattener.FlattenObject(null, sortingFunc(actualRows)))
                    .ToList();

                if (differences.Count == 0)
                    multipleRowSetsMatching.Add(grouping.Key);
                else
                    multipleRowSetsDifferences.Add((grouping.Key, differences));
            }

            context.LogMessage("Summary:");
            context.LogMessage($" matching - {multipleRowSetsMatching.Count}");
            context.LogMessage($" differences - {multipleRowSetsDifferences.Count}");
        }

        return (multipleRowSetsMatching, multipleRowSetsDifferences);
    }

The client was happy and the devs were happy as they could see the impact of the fairly chunky changes that they were making.
As usual, any questions, please ask.
Happy Testing!