
The Validity of the Testing Environment

From art experts recognizing a statue as a fake (Gladwell, 2005) to nurses recognizing obscure signs of infection in infants (Klein, 1998), the human mind is capable of performing amazing feats of intuition. As magical as it might appear, intuition is little more than recognition of a situation (Simon, 1992). As expertise develops, the decision maker is able to recognize an increasing array of contexts and match an appropriate response from a repertoire of possible actions. Intuition is also subject to bias: systematic failures of judgment that the decision maker can walk into blindly. Intuition, it seems, is not accompanied by any subjective signal that marks it as valid, and confidence is no reliable indicator of its validity (Kahneman & Klein, 2009).

Daniel Kahneman and Gary Klein point out that professional environments differ in how conducive they are to the development of intuitive expertise (Kahneman & Klein, 2009). Whilst firefighters and nurses (amongst others) can develop such expertise, stock-pickers and political pundits do not seem able to do so. In the latter case, we would be better served by placing our trust in algorithms and other rational methods than by relying on the advice of experts. What about testers? Can we develop intuitive expertise? Or would we be better off relying on process, decision criteria and other mechanisms? Given the controversy over best practices, and the importance that context-driven testers place on judgment and skill, this would seem to be a fundamental question.

Kahneman and Klein identify two factors that are necessary conditions for the development of expertise: that the environment provides valid contextual cues, and that it provides adequate opportunities to learn through timely and relevant feedback. So how do the environments that we test within stack up against these criteria?

Firstly, let’s look at contextual cues. Does testing provide the kind of cues that facilitate learning? Given that every context is different, that we are not repeatedly producing the same software, one might be forgiven for believing that any such cues provide only weak statistical signals. A comparison might be helpful. Consider chess: both chess and testing are intractable problems. It is widely acknowledged that it is impossible to test everything; similarly, chess has on the order of 10¹²⁰ possible games. Whilst this is infeasible to brute force, grandmasters are able to recognize between fifty and a hundred thousand patterns and select a strong move within a matter of seconds. In short, chess is a paragon of expertise: it provides a valid environment despite the fact that individual cues might be presented only infrequently. We should not mistake complexity for invalidity: the validity of an environment is determined not solely by the frequency with which individual cues occur, but also by the relevance of those cues. For the tester who is willing to listen, there is an abundance of such cues: the values of customers and stakeholders, the interplay between project goals and constraints, the makeup of the technical solution. I’ll discuss contextual cues again in the near future, both at CAST 2012 and in an accompanying post.

Expertise is not experience. Without the opportunity to learn, all the experience in the world will not lead to expertise. Let’s turn to that opportunity. The learning opportunities present in any given environment are determined by the quality and speed of the feedback it provides. In testing, this varies. In some cases we are able to gain feedback readily, such as our stakeholders’ reactions to our approach, our strategies and the bugs that we find. In other cases it can be difficult to get rapid and relevant feedback, for example on the bugs that we miss. Sometimes these stay missed for a long time, whilst in some contexts we get feedback of the wrong kind and risk learning the wrong lessons. For example, if feedback takes the form of a witch hunt, an attempt to allocate blame, just what does that teach testers? Even where this is avoided, we often see an emphasis on processes and methods and how they might be improved, rather than a focus on what individual testers might learn from their mistakes. Perhaps an environment in which human judgment has been surrendered to process is one in which the conditions for the development of expertise have been removed. Not only are factory approaches blinkered and dehumanizing, but they might well rob testers of their means of escape: an opportunity to develop expertise.

There are, however, some intriguing possibilities. Could we train ourselves to become better observers, so as to be more receptive to the feedback that our environments supply? Is it possible to reconfigure our environments so that they provide better feedback? Dedicated testers can improve their chances of developing expertise by creating and nurturing feedback loops.

Perhaps asking whether the testing environment is conducive to developing expertise is too simplistic a question. Kahneman and Klein identify some professions as fractionated: expertise is displayed in some activities but not in others. Given the uneven nature of feedback, it may well be that testing is one such profession: there may be some activities in which it is more appropriate to draw on algorithmic methods than on expertise. Of course, the trick is recognizing when to do so, recognizing the limits of our expertise. And that requires judgment. As James Bach tweeted at the weekend: “In any matter that requires sapience, even algorithmic methods can only be applied heuristically.”

 

References

  • Gladwell, M. (2005). Blink: The Power of Thinking Without Thinking.
  • Kahneman, D. & Klein, G. (2009). Conditions for Intuitive Expertise: A Failure to Disagree.
  • Klein, G. (1998). Sources of Power: How People Make Decisions.
  • Simon, H. (1992). What is an Explanation of Behavior?

Counting Experience

Once upon a time, I believed that only testers should test, and testing experience counted for everything.  This was an easy trap to fall into: after all, I had been around the block. I’d learned lots about testing approaches, different strategies, methodologies and techniques. Looking back at my earlier efforts, and many mistakes, it was easy to think “if only I knew then what I know now!” and ascribe any improvement to experience.

Dumb. Arrogant. Mistake.

What changed my mind?

Let me briefly describe the project: we were customizing an enterprise solution owned by Client A for use by Client B. Due to the complexity of the product, the lack of any models or documentation, and the significance of the changes, this would be no simple matter. In addition, time scales were tight and we were iterating rapidly. I chose a testing approach that relied heavily on exploratory testing, with automation support for those areas that were considered high regression risk or eminently suitable for data driving.
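To illustrate what I mean by data driving: the same check is executed over many rows of inputs and expected results, so coverage of data variations grows by adding rows rather than writing more scripts. Below is a minimal sketch using pytest; the discount_for function and the example rows are hypothetical stand-ins, not the project’s actual automation.

    # Minimal, hypothetical sketch of a data-driven check using pytest.
    # discount_for and CASES stand in for the real system under test and its data.
    import pytest

    def discount_for(customer_type: str, order_total: float) -> float:
        """Toy implementation standing in for the system under test."""
        if customer_type == "trade" and order_total >= 1000:
            return 0.10
        if customer_type == "retail" and order_total >= 500:
            return 0.05
        return 0.0

    # Each row is one test case; the framework runs the same check once per row.
    CASES = [
        ("trade", 1000.00, 0.10),
        ("trade", 999.99, 0.00),
        ("retail", 500.00, 0.05),
        ("retail", 499.99, 0.00),
        ("unknown", 5000.00, 0.00),
    ]

    @pytest.mark.parametrize("customer_type,order_total,expected", CASES)
    def test_discount_for(customer_type, order_total, expected):
        assert discount_for(customer_type, order_total) == expected

Extending coverage then becomes a matter of adding rows, which is what made data driving attractive for the stable, well-understood areas of the product.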

The exploratory testers were to be the vanguard: first to test new code, first to test fixes, first to investigate and qualify bugs reported by others. This was to be the most critical testing role on the project. For this I was given two subject matter experts seconded from Client A: testers whose sole testing experience was a little bit of UAT.

Now, there’s a common misconception that I often hear about exploratory testing: “you need a lot of testing experience to do that”. Perhaps I could have listened to this conventional wisdom. Perhaps I could have succumbed to my prejudices. After meeting the testers in question, I played a hunch and chose not to.

We got started with some basics: discussing the impossibility of complete testing, the oracle problem, bug reporting expectations and the overall approach to testing for the project. Then we got testing. To demonstrate the mechanics of session-based test management (SBTM), I led the first couple of test sessions, after which I dropped back to chartering and reviewing session reports. Within a few weeks I had pretty much left them to charter themselves, with a daily huddle to discuss and prioritize things to test.

In the early days I monitored progress intensively:

  • When I heard them debate whether a particular behaviour was a bug, I’d interrupt with a brief breakout session to discuss the relationship between quality and value, and the variety of oracles they might consider.
  • When I heard them struggling with how to test a particular feature, I’d introduce them to a few test design techniques and heuristics that might be relevant.
  • I’d review their bug reports and provide feedback on the follow-up testing they should consider.

Pretty soon they were a wonder to behold. They quickly assimilated every idea and concept thrown at them. The bugs they identified demonstrated significant insight into the product, its purpose, and the needs of its eventual users. It was fascinating to listen to animated discussions along these lines:

Tester 1: Is this a bug?

Tester 2: Hmm, I don’t think so. It’s consistent with both the requirements and the previous version.

Tester 1: But what if [blah, blah, blah]…doesn’t that really kill the value of this other feature?

Tester 2: Got you. That could be a problem with the requirements; we need to talk to the BA.

This pair had rapidly become the most impressive testing duo I’d ever had the pleasure of working with. How had this happened? They brought with them a blend of aptitudes and experiences that far outweighed their relative lack of testing experience:

  • They had open minds, a willingness to learn, and no preconceived notions as to “the one true way” to test.
  • They exhibited an insatiable curiosity: a desire to understand what the software was doing and why.
  • Having lived with the previous version of the product, they were dedicated to delivering quality to other users like them.
  • Their experience as users meant that they had well-refined internal oracles that gave them insight into what would diminish user value.

Their experience counted for a lot, but not their testing experience: any testing know-how they needed, they were able to learn along the way.

I’m not claiming that testing experience counts for nothing: I’ve also worked in situations where testers needed significant experience in testing, and in specific types of testing, or risk being eaten alive by a sophisticated client. Testing experience is only one piece in a complicated puzzle that also includes domain experience, technology experience, attitude and aptitude. Different situations will demand a different blend.

None of these factors are simple. Consider testing experience: this is not a straightforward, indivisible commodity. Testing is a diverse field, and highly context-dependent. What works for you in your context might not work for me in mine. Often the recruitment of testers boils down to a simple count of years in testing. A tester can spend decades testing in one context, but, when moved, suffer from “that’s not how you test” syndrome. Such an individual is ill-equipped to learn, or even consider, the approaches and techniques that are relevant to a new situation. Even a testing virgin could be a better choice, if accompanied by an open mind. Diversity of experience, and a willingness to learn and adapt, count for far more than years. Counting experience is for fools.