Tester Interdependence

 “Send three and four pence, we’re going to a dance.” – apocryphal message received from the front lines during WWI.

Tester independence is commonly cited as an essential principle in software testing. It serves two purposes:

  • Avoid conflicts of interest. We all have our own interests. In business, that can equate to a lot of money. If I have software developed for me by a vendor, perhaps I don’t want to rely on them telling me that it works just fine. Perhaps I want it independently tested. 
  • Avoid author bias. When you develop something, you are down in the detail. It can be difficult to see the wood for the trees. You can stare blindly at a problem that is flashing neon red right in front of you. Development is hard, and everyone can benefit from an extra set of eyes.

The former requires some degree of management separation, so that the focus of testing is not biased against the interests of the customer and so that the findings of testing have an appropriate forum. The latter can be served simply by having someone other than the developer do the testing. Therefore, any given context will require a different degree of independence.

This principle is often applied with little regard for context. For example, when joining a team as the new test manager, I set about meeting all the testers, trying to form a picture of the team. One conversation stood out:

Me: “So how regularly do you speak to the developers?”

Tester: “Er, well, hmm, we’re not actually allowed to talk to the developers…”

This was tester independence run riot. Our context was supporting in-house development projects, yet someone had taken the independence mantra to the extreme and set up the team as a black box:

  • Inputs: requirements, executable code
  • Outputs: test reports, bug reports.

This kind of setup might be appropriate to contract development or accepting third-party software, but for this team it was just plain nuts. Why? Because there was no conflict of interest; there was a common interest: build working software. The degree of independence practiced was overkill.

What’s the problem? Surely if independence is good, then more independence is better? Like anything, tester independence involves tradeoffs. In many industries, a common approach to avoiding conflicts of interest is to erect “Chinese walls”: information barriers between parties. Instead of flowing freely, information is directed through formalized channels. This creates the opportunity for error and ambiguity, and removes the informal channels that might allow for clarification.

Rather than slavish adherence to the principle of tester independence, I like to use a balancing principle: tester interdependence. This principle is simple: Projects need the information testers can provide. To provide the right information, testers need information from the project. 

Why is this important? As testers, we trade information; it’s our bread and butter. Starve us of information and we’ll test the wrong things, in the wrong ways, at the wrong times: the information we provide will be at best irrelevant, at worst damaging.

This topic came up last year whilst discussing Agile with a colleague. He put forward the view that the “principle” of tester independence is violated when the developer and tester work too closely together: that this eliminates tester objectivity and thereby reduces the tester’s value in combating author bias. This argument is based on the belief that a tester is objective. But testers are subjects, not objects. When did our job titles bestow us with the superhuman power of objectivity? I prefer to acknowledge that all the stakeholders on a project are subjective, have their own perceptions as to what is important, and their own understanding as to what is being implemented.

We can model these different perspectives in a Venn diagram. Let’s keep it simple and consider only three stakeholders: a user, a developer and a tester.

[Venn diagram: the overlapping, but differing, perspectives of the user, the developer and the tester]

Whilst the three parties agree on some things (the intersection of all three sets), there are significant differences in perspective. This gives rise to the possibility of bugs, false positives, missing features, unneeded features and irrelevant testing. Practicing excessive tester independence helps to perpetuate this situation.

In contrast, if we apply the principle of tester interdependence (and by extension the interdependence of all stakeholders), then we apply practices that narrow the gap. Reviews are a good example: I’m not talking about sterile document review and sign-off here, but interactive conversations aimed at resolving misunderstanding and ambiguity. The result might look something like this:

[Venn diagram: the same three perspectives after such practices, with far greater overlap]

Through applying the principle of interdependence we can converge the perspectives of project stakeholders and, by doing so, reduce the opportunity for error.

By all means give thought to how independent you need your testers to be, but balance that with interdependence instead of simply locking them away. Chinese walls breed Chinese whispers: “Send reinforcements, we’re going to advance”.

Forwards & Backwards

A little while ago, I wrote about “testing backwards”. In her comments, Savita pointed out that when testing backwards, I was actually doing exploratory testing. This is true; though it might be more accurate to say that I was using types of reasoning that tend to be more common in exploratory testing than in “testing forwards”.

When we test forwards, we tend to rely heavily on deduction, i.e. predicting a result based on a rule. There are other forms of reasoning: induction and abduction. Each has an important part to play in testing. When you test, you probably use all of these subconsciously: I find it useful to remain aware of these different modes of thinking, and to consciously switch gears when I’m getting stuck. This post expands on these types of reasoning, and describes some of their uses in testing.

Note: whilst deduction, induction and abduction have a long history, this terminology can be terribly confusing: even now, more than twenty years after studying the subject at university, it still leaves me struggling at times. The chief difficulty is with the term abduction, which is used in different ways across disciplines and by different people (sound familiar?). Even Charles Sanders Peirce, the inventor of the term “abduction”, changed his usage during his career. Lesson 29 of Lessons Learned in Software Testing (Kaner, Bach and Pettichord) provides an excellent description of abductive inference, without which I would probably still be scratching my head.

Deduction: Reasoning to a Result

With deduction, we predict results based on a rule and a set of initial conditions. This type of reasoning is commonly associated with specification-based testing: model the software based on its requirements, derive rules from the model that define how the software should behave, feed the software a set of conditions (inputs) for each rule, and check that results match those predicted by the rule.

For example, when testing account authentication you might test that the account becomes locked after a specified number of failed attempts. Imagine your specification states that accounts will become locked after a third unsuccessful login attempt: based on a set of conditions (three failed attempts) and a rule (three failed attempts → account locked), you predict a result (account locked). This leads you to design and execute a test that sets up the necessary conditions and allows you to observe whether the rule has been implemented correctly.
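As a rough sketch of how such a deduction-based check might read in code (the AuthService class below is a toy stand-in invented purely for illustration, not any real authentication API):

    # A toy, self-contained stand-in for the system under test. In practice you
    # would drive your real authentication interface instead.
    class AuthService:
        LOCK_THRESHOLD = 3

        def __init__(self):
            self.failed_attempts = {}
            self.locked = set()

        def login(self, user, password):
            if user in self.locked:
                return "locked"
            if password != "correct-horse":  # pretend credential check
                count = self.failed_attempts.get(user, 0) + 1
                self.failed_attempts[user] = count
                if count >= self.LOCK_THRESHOLD:
                    self.locked.add(user)
                    return "locked"
                return "failed"
            return "ok"


    def test_account_locks_after_three_failed_attempts():
        # Conditions: three failed attempts. Rule: three failures -> locked.
        # Deduction predicts the result: even a correct password is now rejected.
        auth = AuthService()
        for _ in range(3):
            auth.login("alice", "wrong-password")
        assert auth.login("alice", "correct-horse") == "locked"

The test simply sets up the predicted conditions and checks the predicted result; everything interesting happened earlier, in the reasoning from the rule.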

We don’t just use deduction when testing from specifications; we use this type of reasoning in many types of test design. For example, you might understand that software often fails when subjected to large inputs and apply this to your testing: based on a condition (large input) and a rule (large input → failure), you predict a result (software failure). This leads you to test whether the software does indeed fail with a large input. In this example, you are still applying a model, but the model is not one of how the software should work (e.g. drawn from specifications) but of how it could fail. In this way, many different models can be used to predict results and design tests.
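A sketch of the same idea driven by a failure model rather than a specification; handle_input is hypothetical, standing in for whatever part of the system accepts user data:

    # Failure-model test sketch: the rule "large input -> failure" predicts that a
    # very large input may break the system, so we feed it one and watch what happens.
    def handle_input(text: str) -> str:
        # Toy stand-in for the code under test.
        if len(text) > 10_000:
            raise ValueError("input too long")  # a controlled rejection
        return text.strip()


    def test_large_input_fails_gracefully():
        huge_input = "x" * 1_000_000
        try:
            handle_input(huge_input)
        except ValueError:
            pass  # an explicit, controlled rejection is acceptable behaviour
        # Any other exception (MemoryError, a crash, etc.) propagates and fails the test.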

Induction: Reasoning to a Rule

When we practice induction, we attempt to determine a rule based on our observations. We often do this when we use software with the goal of learning about it, i.e. when we use the evidence of our tests to infer the rules upon which it is based. This is primarily the form of reasoning that I was talking about in “Testing Backwards”.

Let’s return to the account locking example: perhaps you haven’t got a specification at all, but you’ve noticed that after three failed login attempts the application tells you that your account is now locked. From the conditions (three failed attempts) and results (account locked) you speculate that the software implements a rule (three failed attempts → account locked).
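One way to picture such an inductive probe in code; attempt_login is a hypothetical hook onto the system under test, simulated here with a three-strike rule:

    # Exploratory probe: keep failing logins, note when the account reports itself
    # locked, then infer (by induction) the lockout threshold.
    _failures = 0

    def attempt_login(user: str, password: str) -> str:
        """Simulated system under test: locks after three failed attempts."""
        global _failures
        _failures += 1
        return "locked" if _failures >= 3 else "failed"


    def probe_lockout_threshold(max_attempts: int = 10):
        """Return the attempt number at which the account locked, or None."""
        for attempt in range(1, max_attempts + 1):
            if attempt_login("alice", "wrong-password") == "locked":
                return attempt
        return None


    if __name__ == "__main__":
        observed = probe_lockout_threshold()
        # The inferred rule is only a conjecture: other factors (another tester,
        # account type, configuration) might explain the observation.
        print(f"Account locked after {observed} failed attempts")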

A note of caution: whilst useful, induction does not guarantee that the rule you infer will be correct. Of all the possible conditions and results you might observe, how do you know that you have observed the critical ones? Consider the above example again: what if the account appears to have become locked for some reason other than three failed login attempts? Perhaps your account was locked by another tester performing a different test? Perhaps the rule you inferred is in fact valid, but only applies to particular types of account? When you use induction to infer a rule, you are making a conjecture that can only be verified, or disproven, through further testing.

Abduction: Reasoning to the Best Explanation

Abduction is the process of gathering information, identifying possible explanations (rules) based on that evidence, and seeking to verify or disprove each explanation until you arrive at one which best explains the available data.

Back to the account locking example…let’s say that you determined from the specification that accounts are locked after three unsuccessful login attempts. However, on testing you determine that the account remains unlocked even after three failed attempts. Is this a bug? What could have happened here?

Based on your input, results, and some previous testing experience, you might apply a common failure model and speculate that the account locking logic has a boundary bug. Perhaps rather than implementing “lock account if failed attempts = 3” this has been implemented as “lock account if failed attempts > 3”? This might lead you to test another failed login, to see if that locks the account.
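A sketch of a follow-up test for that off-by-one hypothesis; again, attempt_login is a hypothetical hook, simulated here with the suspected buggy behaviour:

    # Abductive follow-up: if the lockout was coded as "> 3" instead of ">= 3",
    # the third failure will not lock the account but the fourth will.
    _failures = 0

    def attempt_login(user: str, password: str) -> str:
        """Simulated buggy system: locks only after MORE than three failures."""
        global _failures
        _failures += 1
        return "locked" if _failures > 3 else "failed"


    def test_fourth_failure_exposes_off_by_one():
        results = [attempt_login("alice", "wrong-password") for _ in range(4)]
        # The spec says the third failure should lock the account; if only the
        # fourth does, the off-by-one explanation gains weight.
        assert results[2] != "locked"
        assert results[3] == "locked"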

Perhaps this behavior is configurable? Perhaps a configuration file needs adjusting to set the threshold to three, or even to activate the feature. This might lead you to investigate what the specification has to say about configuration, or to nose around some configuration files.

You construct a number of possible explanations and investigate each in turn – by reading the specs, poking around in config, attempting further tests, or talking to the developers – until you arrive at a reasonable explanation, or conclude that the only reasonable explanation is that you have found a bug.

Conclusions

This final example illustrates the need to use more than simple deduction in testing, whether using an exploratory approach or otherwise.

In the above case, if you rely purely on deduction, you stop at the unexpected behavior: “it didn’t do what the spec says, it’s a bug”. Yet there are many other possible explanations for this behavior, some entirely benign. If you log a bug at this point, you may well be raising a false positive, damaging your credibility and wasting valuable developer time. In short, your job isn’t done until you have an explanation or can no longer justify spending any more time on this particular line of enquiry.

Classical models of testing, which overemphasize the use of deduction and simple comparisons between expected and actual results, lobotomize our testers and constrain the real value that they can bring to their projects.

Uncertainty is Good for You

“Keeping our minds open to new explanations requires tolerating uncertainty, which, ironically, is precisely the mental vexation we try to relieve by thinking.” – Thomas Szasz, in the foreword to Levy’s Tools of Critical Thinking.

I’ve written previously about the role of testing in reducing uncertainty in software projects. You might be forgiven for thinking that uncertainty is an evil that we must drive out.

In fact, the opposite is true: uncertainty is our friend, and there is little place for certainty in a tester’s work:

  • Consider a tester who is certain that there is one correct way to test. How well do you think that tester will adapt to a changing mission or to different project constraints?
  • Consider a tester who is certain that a specification is complete and correct. How do you rate his or her chances of identifying unfulfilled needs or specified, but undesired, behaviours?
  • Consider a tester who is certain that a given test will “pass”. How motivated do you think the tester would be to run that test? How attentively do you think the tester will observe software behavior during the execution of that test?

In each of these examples, certainty is poison. In contrast:

  • Where uncertainty abounds, where there is little agreement between users, programmers, BAs, PMs and other project stakeholders, there is ample opportunity for confusion, errors and bugs. Like nature, testers abhor a vacuum: where there is an absence of certainty, there is fertile territory for our craft. Uncertainty can act as a flashing neon sign that reads “TEST ME”.
  • When we don’t understand something, we seek to do so. When we feel uncertainty about software, about how it might react, what it might do, we are experiencing the prelude to discovery, the motivation to ask “what if?”. Uncertainty is the powerhouse of testing.
  • Uncertainty drives us to question not only the software under test, but our oracles, our practices and our very mission: without such questioning, our habits and assumptions needlessly constrain us. Uncertainty is the antidote to testing chauvinism.

When we test, we seek to reduce uncertainty. Paradoxically, we must embrace uncertainty in order to do so.