
Confidence and Uncertainty

Tester 1: “Are you certain?”

Tester 2: “I’m almost certain.”

Tester 1: “Then you’re uncertain?”

Tester 2: “No, I, uh… I’m fairly certain.”

Tester 1: “So you’re not certain?”

Tester 2: “Dammit, yes, I’m certain!”

What is the opposite of “certain”? You might think the answer is “uncertain”, but the English language is a tricky beast.

Perhaps adding a little definition would help. There are two forms of certainty:

  • Epistemic certainty, which relates to your state of knowledge: whether you hold a belief for which there are no possible grounds for doubt.
  • Psychological certainty, which relates to your state of mind. Call this your degree of confidence.

Epistemic certainty is an absolute¹ and not a matter of degree, whereas uncertainty is scalar. They cannot be opposites. Imagine a continuum ranging from absolute uncertainty to absolute (epistemic) certainty. Any point on that scale represents a degree of uncertainty. No degree of uncertainty can sensibly be said to be the opposite of epistemic certainty, any more than any rational number can sensibly be said to be the opposite of infinity. The relationship is the same: on this continuum, epistemic certainty is an unobtainable construct, in much the same way that one cannot count to infinity.

What does this have to do with testing? I’ve written several times on this blog about uncertainty (here, here and here). I’ve also written a little about confidence. Having just read Duncan Nisbet’s post on Michael Bolton’s Let’s Test keynote, I think it’s time to link the two.

When we first approach an item of software, we start from a position of little knowledge. This brings with it a great deal of uncertainty. We read, we ask questions, we test. We build better models in our minds and test those models, constantly refining and enriching them. Our role as testers is to do just this: to learn, and to articulate what we have learned. A natural result of this growth of knowledge is the reduction of uncertainty. This does not mean that we increase epistemic certainty, or that we get closer to it. Moving from ten to one million does not bring one any closer to infinity; otherwise infinity would not, by definition, be infinite.

Is our role as testers to reduce uncertainty? Ultimately, yes, I believe that it is, in the epistemic sense at least. What is the value of any item of information that does not reduce uncertainty? If we provide information that has no effect on uncertainty, then we have most likely not provided any new information at all². We might add value by providing information that increases uncertainty, by identifying an unknown that was not previously known to be unknown³ or that was previously thought to be better known than it is. However, in this sense we are not changing the balance of epistemic uncertainty, but have strayed into the realm of psychological certainty.

Psychological certainty, in contrast to epistemic certainty, is scalar in nature: one can suspect, one can be fairly confident or one can be utterly convinced. In the psychological sense, certain and uncertain are indeed opposites, and an increase in one reduces the other. So when Michael says “A key part of our service is to reduce unwarranted and potentially damaging certainty about the product”, I believe he is talking about psychological certainty⁴, and I’d be inclined to agree. How do we do so? By doing what we do: investigating, uncovering and revealing information that runs counter to the unwarranted certainty; in other words, by reducing epistemic uncertainty.

In testing, the danger arises when we blur the distinction between epistemic and psychological certainty. “99% of the tests pass”: does this provide a case to increase our confidence? No. “We’ve found and fixed 1000 bugs”? No. A warrant might justify a belief, but we should be wary of seeing ourselves as providers of warrants that increase psychological certainty. We should certainly not engage in managing confidence. You may be told that one of the purposes of testing is to build confidence and that your practices need to be supportive. If you agree, then you are agreeing to a scam. The most we can do is create an environment in which the confidence of our customers will live or die based on relevant information being made available to the right people when they need it. Their confidence is their business.

Notes:

  • 1 You might ask if I’m certain about this: my answer is no. It is entirely possible that one day some bright spark will solve the problems that have been plaguing philosophers for thousands of years. I therefore have reason to doubt this belief, and so I am not certain, in the epistemic sense. I might concede to being certain in the psychological sense, but that’s my problem.
  • 2 Think repetitive regression testing.
  • 3 A Rumsfeld.
  • 4 It makes as much sense to talk about reducing epistemic certainty as it does to talk about – you guessed it – reducing infinity.

Uncertainty is Good for You

“Keeping our minds open to new explanations requires tolerating uncertainty, which, ironically, is precisely the mental vexation we try to relieve by thinking.” Thomas Szasz, in the foreword to Levy’s Tools of Critical Thinking.

I’ve written previously about the role of testing in reducing uncertainty in software projects. You might be forgiven for thinking that uncertainty is an evil that we must drive out.

In fact, the opposite is true: uncertainty is our friend, and there is little place for certainty in a tester’s work:

  • Consider a tester who is certain that there is one correct way to test. How well do you think that tester will adapt to a changing mission or to different project constraints?
  • Consider a tester who is certain that a specification is complete and correct. How do you rate his or her chances of identifying unfulfilled needs or specified, but undesired, behaviours?
  • Consider a tester who is certain that a given test will “pass”. How motivated do you think the tester would be to run that test? How attentively do you think the tester would observe software behaviour during the execution of that test?

In each of these examples, certainty is poison. In contrast:

  • Where uncertainty abounds, where there is little agreement between users, programmers, BAs, PMs and other project stakeholders, there is ample opportunity for confusion, errors and bugs. Like nature, testers abhor a vacuum: where there is an absence of certainty, there is fertile territory for our craft. Uncertainty can act as a flashing neon sign that reads “TEST ME”.
  • When we don’t understand something, we seek to do so. When we feel uncertainty about software, about how it might react, what it might do, we are experiencing the prelude to discovery, the motivation to ask “what if?”. Uncertainty is the powerhouse of testing.
  • Uncertainty drives us to question not only the software under test, but our oracles, our practices and our very mission: without such questioning, our habits and assumptions needlessly constrain us. Uncertainty is the antidote to testing chauvinism.

When we test, we seek to reduce uncertainty. Paradoxically, we must embrace uncertainty in order to do so.

Uncertainty Revisited

I’ve written previously about the role of the tester in reducing uncertainty on software development projects: about how we model and observe, building our knowledge and providing information.

It might be tempting to imagine the tester as a perfect observer, standing aloof, measuring and judging. Sadly, such an image is delusional: uncertainty permeates everything we do, not just the software we seek to understand. We don’t stand outside the room looking in.

What are the sources of uncertainty in testing?

1) Much uncertainty is inherent in the testing challenge: the impossibility of complete testing guarantees that we can never have full knowledge of the software we test, nor is it conceivable that any set of test techniques will ever predict with certainty where all the bugs will be found.

2) We are subject to model uncertainty. As testers we construct models of how the software should work, and how it could fail. These models are invariably flawed:

  • Consider oracles, our models of how the software should behave: every oracle is heuristic, that is to say useful but imperfect. If that were not the case, if we had complete, true oracles, then these oracles would be indistinguishable from the desired state of the software under test: why would we need that software? Further, quality is subjective, relating to the needs and values of people: as such there can be no absolute and objective oracle.¹
  • Consider bug hypotheses, our models that describe potential failures: if these were not flawed, then we could perfectly predict each bug without running a single test.
  • Consider tests themselves: each test is a model that describes how the software will behave under certain conditions. Unfortunately, the range of conditions is so vast that it is easy to miss conditions that prove to be critical. The range of resulting behaviours presents a similar challenge.²
  • Even models that describe testing itself are flawed, some more than others.

3) Our observations are subject to measurement uncertainty: our interactions with the software influence how it behaves. This is not limited to our selection of conditions, nor even to Heisenbugs and the probe effect of resource monitors: the very rate, frequency and sequence of our actions can drive different behaviours in software (consider race conditions and resource leaks).

4) We are subject to human error. Testers are humans too: our perceptions are limited, we can only reliably focus on so many things at once, and we have unavoidable psychological biases that will influence our choice of tests, the behaviours we observe, and how we interpret our observations.
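
The claim in point 3, that the sequence of our actions can drive different behaviours, can be made concrete with a small sketch. It is purely illustrative and assumes a hypothetical shared counter that two hypothetical clients, A and B, each increment with a separate read step and write step. No real system or threading is involved; the code simply replays every valid ordering of those four steps.

```python
from itertools import permutations

# Hypothetical steps: each client reads the shared counter, then writes back value + 1.
STEPS = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]


def is_valid(schedule):
    """A client must read before it writes; any other ordering is allowed."""
    return all(
        schedule.index((client, "read")) < schedule.index((client, "write"))
        for client in ("A", "B")
    )


def run(schedule):
    """Replay one interleaving and return the final value of the counter."""
    counter = 0
    last_read = {}  # the value each client saw when it last read
    for client, step in schedule:
        if step == "read":
            last_read[client] = counter
        else:  # "write": store the (possibly stale) value the client read, plus one
            counter = last_read[client] + 1
    return counter


# Group every valid interleaving by the final counter value it produces.
outcomes = {}
for schedule in permutations(STEPS):
    if is_valid(schedule):
        outcomes.setdefault(run(schedule), []).append(schedule)

for final_value, schedules in sorted(outcomes.items()):
    print(final_value, len(schedules))
```

Of the six valid orderings, only the two in which one client finishes before the other starts leave the counter at 2; the other four lose an update and leave it at 1. A tester who always performs the steps in the same order would only ever observe one of these behaviours.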

Much uncertainty is epistemic, that is to say that it can be reduced.  If we are to reduce the uncertainty associated with software, we would be wise to understand the role that uncertainty plays in our own work, and seek ways in which we can reduce that too.

Notes:

  • 1 Michael Bolton discusses this rather eloquently in Oracles.
  • 2 Doug Hoffman provides an interesting and detailed discussion of these issues in Why Tests Don’t Pass.