Tasks? Whither the Test?


On Friday, via Twitter, @michaelbolton asked @rbcs about the unit of measurement for test cases. To this, @rbcs replied:







A test is a task? Sounds reasonable.

But wait, a test is surely two tasks? An action and an observation? Or is it three? An action, an observation and an evaluation?

But wait! What if the test is not completely trivial to set up? What if it takes many separate tasks to configure a system in order to conduct a single test? Perhaps a test consists of many, many tasks?

Then again, Rex’s tweet suggests he is referring to tasks in a project management context. Please imagine a Gantt chart. I can’t say that I’ve ever seen project planning down to the individual test case – it is more normally the case that tests are wrapped up into a higher-order task on the plan. So perhaps a test is but a fraction of a task and not a whole one?

Also, in a project management sense, a task might be of any size, from a few hours to many days' effort and duration.

So, it would appear that a test could be an indeterminate number of tasks of indeterminate size.

Now /that/ seems like a sound basis for a unit of measurement.

It gets worse.

Ever performed a test that revealed something interesting and unexpected? Where a single test spawned many follow-up tests aimed at isolating and qualifying the impact of your discovery? Tests beget tests.

Ever experienced the opposite? Where a test turned out to be irrelevant or simply not viable? Tasks may have been performed, but you are left having performed no test at all? Just as tests are spawned, so they can disappear.

Imagine that you have employed a contractor to build you a house made of bricks. Imagine that the bricks vary in size from Lego proportions to that of boulders. Imagine that, when laid, some bricks spontaneously vanish, whilst others materialize miraculously in place. The contractor, of course, reports his progress by telling you “42.3% of bricks have been laid”. I’d be inclined not to trust the contractor.

Of course, bricks don’t behave that way: they are real, concrete phenomena. Tests are not. Tests are constructs, abstractions.

Whither the Test?

But what does this mean? What constitutes a test case? This can be particularly tricky to answer.

Let’s take the example of a project that I participated in last year. My team were testing an ETL solution, and were focused on testing the rules by which data, extracted from a variety of source systems, was transformed in order to load it to a single target system. Testing was performed by conditioning real data (to cover the various conditions evident in the transformation rules), predicting the results of transformation for every cell (table/column/row intersection) within the source data set, and reconciling ETL results against our predictions.

So, what is a “test case” in this example?

The tools we created for this purpose were capable of performing in excess of ten million checks per minute. Over the course of a particular test run, we were performing perhaps a billion checks. Were we executing a billion test cases?

Now, those checks were performed at a field level. In most cases, the transformation logic was tied to an individual row of data, with combinations of field values within the record contributing to the outcome of each transformation. In this way, each row might be seen as representing a particular combination of conditions. We were testing with a few million rows of data. Were we executing a few million test cases?

Of course, many of these checks were seemingly redundant. The underlying transformation rules represented in the order of two thousand different outcomes, and a given data load might result in many, many instances of each outcome. So were we only executing two thousand unique test cases?

Each test run was orchestrated over the course of about a week. Typically, each run was conducted with a new set of test data. Conditioning data took considerable time, as did analyzing results and potential anomalies. If we conceive of our tools as being scientific instruments and the ETL implementation, in combination with any given set of data, as the subject of our enquiries, then perhaps we should consider a test run to be a single experiment, a single test. Were we performing only one test, albeit a complex one, each time?

Any of these, from one to a billion, might be an appropriate answer depending on how you choose to define a test case. For our purposes, with an eye to coverage of conditions and outcomes, we chose to count this as being two thousand test cases. There was nothing inherently “correct” about this; it was simply a decision that we made on the basis that defining a test case at this level seemed useful.
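To make the ambiguity concrete, here is a toy sketch in Python. It is not the tooling we actually built: the three rows, the `outcome` labels and the `count_test_cases` function are all invented for illustration. It reconciles predicted cells against actual cells, then counts the very same run at four different granularities.

```python
from collections import Counter

# Hypothetical miniature of one reconciliation run. Each row records the
# transformation outcome it exercises, the predicted cells and the cells
# the ETL actually produced. All names and values are made up.
rows = [
    {"outcome": "R1",
     "predicted": {"name": "ANN", "age": 30},
     "actual": {"name": "ANN", "age": 30}},
    {"outcome": "R1",
     "predicted": {"name": "BOB", "age": 41},
     "actual": {"name": "BOB", "age": 14}},
    {"outcome": "R2",
     "predicted": {"name": "CY", "age": None},
     "actual": {"name": "CY", "age": None}},
]


def count_test_cases(rows):
    """Count one run at several granularities, plus mismatched cells."""
    # One check per predicted cell (column within a row).
    cell_checks = sum(len(r["predicted"]) for r in rows)
    # A cell mismatches when the actual value differs from the prediction.
    mismatches = sum(
        1
        for r in rows
        for column, expected in r["predicted"].items()
        if r["actual"].get(column) != expected
    )
    # Distinct transformation outcomes exercised by the data.
    outcomes = Counter(r["outcome"] for r in rows)
    return {
        "per_cell": cell_checks,
        "per_row": len(rows),
        "per_outcome": len(outcomes),
        "per_run": 1,
        "mismatched_cells": mismatches,
    }


counts = count_test_cases(rows)
print(counts)
```

The same data yields four different "test case" counts. None of them is wrong; each simply answers a different question about the run.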

Test cases are how you choose to define them.