Tag Archives: Inference

Forwards & Backwards

A little while ago, I wrote about “testing backwards”. In her comments, Savita pointed out that when testing backwards, I was actually doing exploratory testing. This is true; though it might be more accurate to say that I was using types of reasoning that tend to be more common in exploratory testing than in “testing forwards”.

When we test forwards we tend to rely heavily on deduction, i.e. predicting a result based on a rule. There are other forms of reasoning: induction and abduction. Each has an important part to play in testing. When you test, you probably use all of these subconsciously: I find it useful to remain aware of these different modes of thinking, and consciously switch gears when I’m getting stuck. This post expands on these types of reasoning, and describes some of their uses in testing.

Note: whilst deduction, induction and abduction have a long history, this terminology can be terribly confusing: even now, more than twenty years after studying the subject at university, it still leaves me struggling at times. The chief difficulty is with the term abduction, which is used in different ways across disciplines and by different people (sound familiar?). Even Charles Sanders Peirce, who coined the term “abduction”, changed his usage during his career. Lesson 29 of Lessons Learned in Software Testing (Kaner, Bach and Pettichord) provides an excellent description of abductive inference, without which I would probably still be scratching my head.

Deduction: Reasoning to a Result

With deduction, we predict results based on a rule and a set of initial conditions. This type of reasoning is commonly associated with specification-based testing: model the software based on its requirements, derive rules from the model that define how the software should behave, feed the software a set of conditions (inputs) for each rule, and check that the results match those predicted by the rule.

For example, when testing account authentication you might test that the account becomes locked after a specified number of failed attempts. Imagine your specification states that accounts become locked after a third unsuccessful login attempt: based on a set of conditions (three failed attempts) and a rule (three failed attempts → account locked), you predict a result (account locked). This leads you to design and execute a test that sets up the necessary conditions and allows you to observe whether the rule has been implemented correctly.
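The deductive test above can be sketched in code. The `Account` class here is a hypothetical stand-in for the real system under test, written only so the test has something to run against:

```python
# A minimal sketch of the deductive lockout test. The Account class is
# a toy stand-in for the system under test; all names are hypothetical.

class Account:
    """Toy model: locks after three failed login attempts."""
    def __init__(self, password):
        self.password = password
        self.failed_attempts = 0
        self.locked = False

    def login(self, password):
        if self.locked:
            return False
        if password != self.password:
            self.failed_attempts += 1
            if self.failed_attempts == 3:
                self.locked = True
            return False
        self.failed_attempts = 0
        return True

def test_account_locks_after_three_failures():
    account = Account(password="s3cret")
    for _ in range(3):                     # conditions: three failed attempts
        account.login("wrong-password")
    assert account.locked                  # predicted result: account locked
    assert not account.login("s3cret")     # even the right password is refused

test_account_locks_after_three_failures()
```

The test encodes the rule (three failed attempts → account locked) as a prediction, then checks the software against it.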

We don’t just use deduction when we test from specifications; we use this type of reasoning in many kinds of test design. For example, you might understand that software often fails when subjected to large inputs and apply this to your testing: based on a condition (large input) and a rule (large input → failure), you predict a result (software failure). This leads you to test whether the software does indeed fail with a large input. In this example, you are still applying a model, but the model is not one of how the software should work (e.g. drawn from specifications) but of how it could fail. In this way, many different models can be used to predict results and design tests.
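A failure-model test of this kind might look like the sketch below. The function under test (`format_username`) is invented purely for illustration; the point is the shape of the test, which exists to refute (or confirm) the prediction that a large input breaks it:

```python
# Sketch of a failure-model test: "large input → failure". The function
# under test is hypothetical; substitute the real one from your system.

def format_username(name: str) -> str:
    # A naive implementation that implicitly assumes short names.
    return name.strip().lower()

def test_large_input_does_not_crash():
    huge_name = "A" * 1_000_000            # condition: a very large input
    result = format_username(huge_name)    # the failure model predicts trouble here
    assert result == "a" * 1_000_000       # if we get here, the prediction was refuted

test_large_input_does_not_crash()
```

Here the prediction happens to be refuted, which is itself useful information: the failure model told you where to look, and the test told you the software survived.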

Induction: Reasoning to a Rule

When we practice induction, we attempt to determine a rule based on our observations. We often do this when we use software with the goal of learning about it, i.e. when we use the evidence of our tests to infer the rules upon which it is based. This is primarily the form of reasoning that I was talking about in “Testing Backwards”:

Let’s return to the account locking example: perhaps you haven’t got a specification at all, but you’ve noticed that after three failed login attempts the application tells you that your account is now locked. From the conditions (three failed attempts) and results (account locked) you speculate that the software implements a rule (three failed attempts → account locked).

A note of caution: whilst useful, induction does not guarantee that the rule you infer will be correct. Of all the possible conditions and results you might observe, how do you know that you have observed the critical ones? Consider the above example again: what if the account appears to have become locked for some reason other than three failed login attempts? Perhaps your account was locked by another tester performing a different test? Perhaps the rule you inferred is in fact valid, but only applies to particular types of account? When you use induction to infer a rule, you are making a conjecture that can only be verified, or disproven, through further testing.
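The caution above can be made concrete. In this hypothetical sketch, the rule inferred from one account type looks solid until further testing on another account type disproves it in its general form (all classes and names here are invented for illustration):

```python
# Induction sketch: infer a rule from observations, then test the
# conjecture on a different case. Account types are hypothetical.

class TieredAccount:
    """Toy system: only standard accounts lock after three failures."""
    def __init__(self, account_type="standard"):
        self.account_type = account_type
        self.failed = 0
        self.locked = False

    def fail_login(self):
        self.failed += 1
        if self.account_type == "standard" and self.failed >= 3:
            self.locked = True

# Observation: a standard account locks after three failed attempts,
# so we conjecture the rule: three failed attempts → account locked.
standard = TieredAccount("standard")
for _ in range(3):
    standard.fail_login()
assert standard.locked

# Further testing: the same conditions on a service account...
service = TieredAccount("service")
for _ in range(3):
    service.fail_login()
assert not service.locked   # ...disprove the rule in its general form
```

The conjecture wasn’t wrong so much as incomplete: it only held for a particular type of account, which only further testing could reveal.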

Abduction: Reasoning to the Best Explanation

Abduction is the process of gathering information, identifying possible explanations (rules) based on that evidence, and seeking to verify or disprove each explanation until you arrive at one which best explains the available data.

Back to the account locking example… let’s say that you determined from the specification that accounts are locked after three unsuccessful login attempts. However, when you test, you find that the account remains unlocked even after three failed attempts. Is this a bug? What could have happened here?

Based on your input, results, and some previous testing experience, you might apply a common failure model and speculate that the account locking logic has a boundary bug. Perhaps rather than implementing “lock account if failed attempts = 3” this has been implemented as “lock account if failed attempts > 3”? This might lead you to attempt another failed login, to see if that locks the account.
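The conjectured off-by-one, and the probe that would confirm it, can be sketched as follows (the implementation is invented to match the hypothesis, not taken from any real system):

```python
# Sketch of the conjectured boundary bug: the developer wrote "> 3"
# where the spec intends the third failure to lock. Hypothetical code.

class BuggyAccount:
    def __init__(self):
        self.failed = 0
        self.locked = False

    def fail_login(self):
        self.failed += 1
        if self.failed > 3:       # spec intended: >= 3 (lock on the third failure)
            self.locked = True

account = BuggyAccount()
for _ in range(3):
    account.fail_login()
assert not account.locked         # three failures, still unlocked: the symptom

account.fail_login()              # abductive probe: try a fourth failed attempt
assert account.locked             # locks on the fourth: boundary bug confirmed
```

If the fourth attempt locks the account, the boundary-bug explanation fits the evidence; if not, you move on to the next candidate explanation.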

Perhaps this behavior is configurable? Perhaps a configuration file needs adjusting to set the threshold to three, or to even activate the feature. This might lead you to investigate what the specification has to say about configuration, or to nose around some configuration files.

You construct a number of possible explanations and investigate each in turn: reading the specs, poking around in config, attempting further tests, or talking to the developers, until you arrive at a reasonable explanation, or conclude that the only reasonable explanation is that you have found a bug.


This final example illustrates the need to use more than simple deduction in testing, whether you are taking an exploratory approach or not.

In the above case, if you rely purely on deduction, you stop at the unexpected behavior: “it didn’t do what the spec says, it’s a bug”. Yet there are many other possible explanations for this behavior, some entirely benign. If you log a bug at this point, you may well be raising a false positive, damaging your credibility and wasting valuable developer time. In short, your job isn’t done until you have an explanation or can no longer justify spending any more time on this particular line of enquiry.

Classical models of testing, which overemphasize the use of deduction and simple comparisons between expected and actual results, lobotomize our testers and constrain the real value that they can bring to their projects.

Testing Backwards

One of my favourite projects started off by testing backwards.

The project in question involved taking software used by one customer and customizing it for use by another. First, we would define which of the existing features would be preserved, which removed, and which modified. Unfortunately, none of the original development team was available, nor were there any existing models, requirements or design documents. Our starting point: the source code and a little domain knowledge. This was hardly a basis for a meaningful conversation with the customer: we needed to reverse engineer the software before we could start to change it.

Testing proved to be a big part of the solution to this problem. As strange as it might seem, this project didn’t just end with testing, it started with testing.

When you test forwards you use a model. This might be a set of requirements, it might be a design, or it might be your expectations based on experience or conversations with stakeholders.  This model allows you to make predictions as to how the software will behave under certain conditions. You then execute a test with those conditions, and verify that it behaved as predicted.

In contrast, testing backwards is concerned with deriving such a model. You investigate how the software behaves under a range of conditions, gradually building an understanding of why it behaves the way it does. This is reverse engineering, determining rules from an existing system.

You might be forgiven for assuming that testing backwards is only concerned with determining how the software works rather than assessing it and finding bugs; after all, you need some kind of model of how it should behave in order to determine whether it fails to do so. This is not the case: the model of the software’s behaviour is not the only model in play. When you test, you bring many models to bear:

  • Models that describe generally undesirable behaviour, for example: unmanaged exceptions shouldn’t bubble up to the UI as user-unfriendly stack traces.
  • Models based on general expectations, for example: calculations should comply with mathematical rules, and things that are summed should add up.
  • Models based on domain experience, for example: an order should not be processed if payment is refused.
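The “things that are summed should add up” model in the second bullet lends itself to a mechanical check. This sketch assumes a hypothetical invoice structure; the real shape of your data will differ:

```python
# Sketch of a general-expectations check: an invoice's total should
# equal the sum of its line items. The invoice structure is hypothetical.

invoice = {
    "lines": [
        {"item": "widget", "amount": 19.99},
        {"item": "gadget", "amount": 5.01},
    ],
    "total": 25.00,
}

expected = round(sum(line["amount"] for line in invoice["lines"]), 2)
assert invoice["total"] == expected   # summed things should add up
```

No specification is needed to run this check: the model comes from general expectations about arithmetic, not from any document about this particular software.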

When I first started on this project, I imagined that by testing backwards I was actually doing something unusual, but it slowly dawned on me that I had been doing this on every project I’d ever tested on:

  • Every time that I had started a new project and played with the software to figure out what it did, I’d been testing backwards.
  • Every time I’d refined tests to account for implementation details not apparent from the specification, I’d been testing backwards.
  • Every time I’d found a bug and prodded and poked so as to better understand what the software was doing, I’d been testing backwards.

I was struck by the power of testing backwards: by seeking to understand what the software did rather than simply measuring its conformance with expected results, we are better able to learn about the software. By developing the skills required to test backwards, we are better able to investigate possible issues. By freeing ourselves of the restrictions of a single model, a blinkered view that conformance to requirements alone equates to quality, we are better able to evaluate software in terms of value.

Would testing backwards serve your mission?