
Principles Not Rules, part 3


As much as I have an aversion to mission statements, born of years working in organisations where everybody had to have one in order to satisfy some standard or other, my team and I agreed on the following “purpose” for our testing:

“To enable informed decision making by discovering and sharing timely and relevant information about the value of solutions, or threats to that value.”

This is our purpose, this is our cause. We test to discover things, things that are useful, things that help our stakeholders make better decisions.

Yes, it’s generic. But it’s a starting point. It’s helping us to trigger new ways of thinking through testing problems.

Perhaps more importantly, it gives us a useful lens through which to challenge ourselves. When someone suggests that we act in a particular way, we can look to our purpose and ask “does this do anything to help us achieve our purpose, does it take us in the right direction?”

Steve Jobs had something interesting to say about this: “People think focus means saying yes to the thing you’ve got to focus on. But that’s not what it means at all. It means saying no to the hundred other good ideas that there are… innovation is saying no to a thousand things”. In something like testing, which can never be complete, in which every decision is a trade-off, that kind of focus is critical: we need to be able to say “no” to reasonable-sounding suggestions and to sillier ones alike. In terms of examining such things, and explaining why we choose to say no, our purpose is a wonderful tool.

This kind of thinking is also making its way down to individual projects. I’ve noticed cases where testers have started to think in terms of the major “exam questions” that they need to answer, and the standard of evidence they need for their stakeholders and regulators. I’ve started hearing testers talking to other team members about what their projects are trying to achieve, and what they might need to know to help them. Test strategy is starting to look less like a bunch of logistics and more like a mandate to go discover. Gradually, setting and refining information objectives for testing seems to be becoming part of the way we work. Not everywhere, but hopefully enough to catch.

I will add a note of caution here. When setting information goals for a project, it is easy to think in confirmatory and binary terms: “does the product do X?”, “can we load the data?”. But some of the most interesting questions to ask are neither confirmatory nor binary: “what happens when?”, “how long does this take?”, “how many users can it handle before it goes BOOM?”. We should avoid closing ourselves off to such questions. Here’s a tip: if you reframe your testing objectives as questions, try to make sure they are not all closed questions. Open questions are important.


In addition to our purpose, my team and I agreed a set of eight principles: context, discovery, integration, accountability, transparency, information value, lean and learning.

I’m not going to go into the detail of these here. I suspect much of the value lies less in the concepts, or the particular wording, and more in how we got there: after spending many long hours working these out, sweating the semantics, they’re highly personal to those testing with me. The process of developing and agreeing them as a group was insanely valuable: forget the shallow glossary of terms the other guys peddle, this is a real common language. We understand each other when one of us says “transparency”, or “lean”, because we’ve invested time getting to the bottom of how those labels matter to us, and we continue to invest time in sharing those meanings with those we work with.

Principles can be values, and they can be heuristics that help guide our thinking. They are not prescriptive or detailed, indeed, they can often be open to interpretation, or contradict one another (such tensions may even be a useful indicator that you’re pitching principles at the right level). This means that the user is forced to THINK when applying them, and this encourages the use of judgment.

The alternative is rules: simple mechanistic formulae or explicit instructions. Fill out this template, use this technique, check this box. Principles are different. Principles are pivotal in empowering the tester. What we’re trying to do is regulate the testing system rather than simply control it, and principles have a pedigree in regulation.

In 2007 the UK’s Financial Services Authority published a treatise on “Principles Based Regulation” that described a trend away from rules in the regulation of the financial services industry in the UK. In this, they described their rationale:

  • Large sets of detailed rules are a significant burden on industry
  • No set of rules is able to address changing circumstances. Indeed, rules can delay or even prevent innovation. They tend to be retrospective, i.e. they solve yesterday’s problems rather than today’s or tomorrow’s
  • Detailed rules can divert attention towards adhering to the letter rather than the purpose of regulations, i.e. they encourage rule following behaviour, compliance at the expense of doing what’s right.

The FSA aren’t alone. You see this in a number of domains: regulation, law, accounting and audit. In a 2006 paper called “Principles not rules: a question of judgment”*, ICAS, the Institute of Chartered Accountants of Scotland, aired views similar to those of the FSA:

  • When using rules, one’s objectives can become lost in a quest for compliance
  • In contrast to principles, rules discourage the use of judgment and deskill professionals.

So, do you want testers burdened by a large body of rules (dare I say it, a standard?) describing how they should behave, deskilled and reduced to simply taking and obeying orders? Or do you want skilled testers who can think for themselves and apply professional judgment to choose, adapt or even innovate testing practices? If the latter, then I suggest you want to be thinking in terms of principles rather than rules.

Before moving on, it is worth mentioning that the SEC (2003) point out that principles-based regulation does not equal principles-only regulation, and indeed the FSA saw rules and principles as coexisting: the trick to a robust regulatory framework is in finding the right balance. In our framework, we place great emphasis on principles, but do maintain a handful of rules: for example, concerning the use of production data in testing. There are laws after all.

Perhaps one of the more interesting aspects of my role this year has been finding this balance for some of the firm’s largest programmes and change initiatives. In each case we started out with long wish lists of rules, driven by a desire for consistency, yet when we considered the legitimate variation between projects we ended up agreeing principles instead, supported by a bare minimum of rules. To avoid disempowering people, a light touch is required.


We believe that testing should be organised at the level at which delivery is performed, because the people closest to a context are those best suited to make the right decisions about what practices are needed. As a result, we do not specify any particular practices in this layer of our framework: we avoid dictating testing practices to projects, we do not push standards, we do not have a testing process document, we do not have templates. Teams, who own their own testing, are free to create, adopt or adapt such things, based on their own needs.

That is not to say that it’s a free for all. I have set a clear expectation that delivery teams are accountable for the quality of their own testing and must remain transparent in what they do. This is critical: empowerment can only thrive when there is trust.

And this is the elephant in the room. Many large enterprises are built on a foundation of mistrust: we manage projects through command and control because there is no trust; we demand suppliers comply with standards because there is no trust; we maintain elaborate sets of rules because there is no trust. To change things, we need trust; and trust is dependent on accountability and transparency.

We acknowledge that testing is a service, and that we are accountable to our stakeholders for the quality of our testing. We define the quality of our testing in terms of information value: whether the information we provide is useful, i.e. timely, relevant and consumable. We recognise that information is of no value if not shared, so we must provide transparency into what we discover and, in order to warrant those findings, into the extent, progress and limitations of our testing.

The alert amongst you may have noticed that these are three of our principles: accountability, transparency and information value. The prerequisites for trust and empowerment are firmly rooted in our framework.


Figuring out what kinds of information people need is hard. Evaluating software is hard. Sharing what you find in a way that is accessible to stakeholders is hard.

Nothing about testing is easy. It is hard enough without constraining ourselves unnecessarily with inflexible rules! But we also need to acknowledge that, when you’ve been living under a regime of command and control for a while, it can be hard to empower yourself.

This is where our testing community comes in. To break the rules-based culture, we need to create an environment where people are comfortable sharing ideas and challenging one another. We need an environment where people can ask for help and support one another. If our people are going to gain confidence and grow into the role of empowered testers, then we need to make sure that there is a support network for them. You’d be foolish to learn a trapeze act without a safety net, and I need to make sure that those testing in my corner of the organization have one. It’s early days, and this is an area where I intend to make significant investment in the coming year.

Final Words

This is proving to be a fascinating journey. It’s a journey of respect, respecting people enough to give them a chance to rise to the challenge of becoming excellent testers, freeing them from the tradition of command and control that has so constrained their work. The empowerment paradox suggests that we cannot directly empower others, but by removing these obstacles, perhaps we can create the conditions for them to empower themselves.

*My thanks to James Christie for discovering and sharing this.

Principles Not Rules, part 2

My Current Challenge

I head testing within the treasury function of a bank.

Like many of our peer organizations, like many large enterprises, we have a history of having commoditized, juniorized and offshored much of our testing. Successive changes to location strategy have left our testers scattered across multiple locations, often geographically separated from their projects. In most cases testing has historically been performed by “independent” testing teams, poorly integrated into the delivery effort, managed via command and control by a handful of test managers in the centre. This model has proved expensive, slow, and not terribly insightful. It has done little to prevent a number of projects “going dark” with regard to the quality of the product being delivered – an event often followed by project failure. In some cases this model is barely – if at all – better than a placebo.

In contrast, what we want is testing that informs us about quality and helps keep projects transparent. We want effective testing that is worth what we pay for it: i.e. that is “cost effective” (the clue is in the second word!). We want testing that is integrated with delivery and that is supportive of our firm’s transition to agile.

In short, we have a big gap between reality and expectation. If this gap weren’t challenging enough, we operate in a highly regulated industry: we have a requirement to demonstrate to our regulators that we have an effective control environment. Certain programmes of work are under intensive regulatory scrutiny and this demands a high level of control and transparency.

This is my challenge: how to enable good testing – by empowering testers – yet still fulfil a seemingly contradictory need for control? This is where our “principles based” testing framework – a way of thinking about, organizing and governing our testing – comes in. It has its seeds in Simon Sinek’s golden circle.

If you haven’t seen Sinek’s TED talk – Google it, it’s worth a watch. His main argument is that most of us don’t have a purpose, cause or belief that’s worth a damn, and that when most of us communicate, we’re all about WHAT we’re doing, or HOW, but rarely WHY. In contrast, he argues, successful organisations have a clear WHY – and they start from that in everything they do.

This got me to thinking: when’s the last time I heard (outside of a conference) any testers talking about WHY they were testing? When’s the last time I saw a test strategy that gave even the slightest indication of the mission, goals or objectives of testing? So I started asking people. Why do you test? What value do you bring? I got a lot of generic and dubious answers: “improve quality”, “mitigate risk”, “provide assurance”, “because audit said we need to test”*. Unless you’ve been under a rock for the last couple of decades, you’ll know that there’s a lot of disagreement with these kinds of statements.

It occurred to me that, if so few testers have a clear sense of why they are testing, then much of their testing is in fact purposeless; and without any sense of purpose, it is easy to wind up doing a lot of things that add no value whatsoever.

I decided to start using the circles to help me address that, and to start tackling the empowerment paradox. Unfortunately, early attempts failed dismally: a lot of people got hung up on what’s a what, what’s a how and what’s a why. Much confusion! So I changed the model. Instead of Sinek’s why, how and what, I swapped in “purpose”, “principles” and “practices”. To round things off, I added “people”, giving us a model that looks like this:

[Figure: the purpose, principles, practices and people model]

In the final post in this series, I’ll explore this model in greater depth.

*”Because audit said so” is a phrase guaranteed to drive me Gordon Ramsay. I have no problem with the auditors themselves, but rather with the use of the word “audit” as an attempt to shut down arguments, or to excuse shoddy practices. Suffice it to say that this tactic rarely works on me.

Principles Not Rules, part 1

[This week I presented at EuroSTAR 2015. My subject? How testing can be well governed without recourse to standards, and how an emphasis on principles, rather than rules, empowers the tester, freeing them to perform better testing than is likely to be achieved under a command and control regime. This series of posts is drawn from my presentation notes.]

The Empowerment Paradox

Testing, as commonly practiced, has lost its way. But I’m jumping ahead. Let me explain.

For much of my working life, I have been a consultant. One of the benefits that this affords is the opportunity to meet lots of people. And I enjoy speaking to people about their testing: how they approach it, why they think they’re doing it, what they feel they get out of it.

On one notable occasion, whilst presenting to a PMI forum – a group of project and programme managers – I played a game of word association and asked “What’s the first word that springs to mind when I say ‘Testing'”. “Stinks” was the overwhelming response.

And to be brutally honest with you: “stinks” is the safe for work version.

This can be a hard message for a tester to hear. We often bemoan the fact that many of our colleagues don’t “get” testing, or that our stakeholders don’t seem to understand the value of what we do. Unfortunately, in my experience, when the customer of a service doesn’t see the value in it, this often means that there IS NO VALUE in it.

Now, don’t get me wrong. I’m not saying that my experience of testing has been universally stinky. I’m a context driven tester, and many of my best experiences of testing were on small projects where we were very much context driven: we sought to understand what our projects needed to know and designed testing to respond to those needs. And it worked.

Unfortunately, something seems to happen at scale. When projects are grouped and we seek “consistency”, or when we work within large organizations that promote some form of standardization, we start to take decisions away from those people most firmly rooted in project context. We take the decisions away from those people most likely to make good decisions about how to test.

I’m no exception! One of my first attempts at scaling CDT was to write a handbook that mandated practices often associated with context driven testing. The results were horrible. I had made a mistake that I see people making time and again: mistaking CDT for a bag of practices. It isn’t. It’s a bag of ANY practices, and more than that: it’s a philosophy that empowers individuals to make their own choices about testing.

Unfortunately, empowerment is hard. The very structures that put one person in a position to “empower” another will often undermine that attempt at empowerment. This paradox brings me to my current challenge…

Party like it’s 1979

We are rolling back the clock so as to prevent

you from finding better ways to test software.

Through this work we will make you value:


Management control over individual accountability

Documentation over finding out about software

Policing the lifecycle over collaboration with the team

Detailed test planning over exploration and discovery


That is, we can’t even begin to imagine

how you might have come to value the things on the right.

-The International Organization for Standardization

Tasks? Whither the Test?


On Friday, via Twitter, @michaelbolton asked @rbcs about the unit of measurement for test cases. To this, @rbcs replied:

[Embedded tweet from @rbcs: a test is a task]
A test is a task? Sounds reasonable.

But wait, a test is surely two tasks? An action and an observation? Or is it three? An action, an observation and an evaluation?

But wait! What if the test is not completely trivial to set up? What if it takes many separate tasks to configure a system in order to conduct a single test? Perhaps a test consists of many, many tasks?

Then again, Rex’s tweet suggests he is referring to tasks in a project management context. Please imagine a Gantt chart. I can’t say that I’ve ever seen project planning down to the individual test case – it is more normally the case that tests are wrapped up into a higher-order task on the plan. So perhaps a test is but a fraction of a task and not a whole one?

Also, in a project management sense, a task might be of any size, from a few hours to many days of effort and duration.

So, it would appear that a test could be an indeterminate number of tasks of indeterminate size.

Now /that/ seems like a sound basis for a unit of measurement.

It gets worse.

Ever performed a test that revealed something interesting and unexpected? Where a single test spawned many follow-up tests aimed at isolating and qualifying the impact of your discovery? Tests beget tests.

Ever experienced the opposite? Where a test turned out to be irrelevant or simply not viable? Tasks may have been performed, but you are left having performed no test at all. Just as tests are spawned, so they can disappear.

Imagine that you have employed a contractor to build you a house made of bricks. Imagine that the bricks vary in size from Lego proportions to that of boulders. Imagine that, when laid, some bricks spontaneously vanish, whilst others materialize miraculously in place. The contractor, of course, reports his progress by telling you “42.3% of bricks have been laid”. I’d be inclined not to trust the contractor.

Of course, bricks don’t behave that way: they are real, concrete phenomena. Tests are not. Tests are constructs, abstractions.

Whither the Test?

But what does this mean? What constitutes a test case? This can be particularly tricky to answer.

Let’s take the example of a project that I participated in last year. My team were testing an ETL solution, focused on testing the rules by which data, extracted from a variety of source systems, was transformed in order to load it into a single target system. Testing was performed by conditioning real data (to cover various conditions evident in the transformation rules), predicting the results of transformation for every cell (table/column/row intersection) within the source data set, and reconciling ETL results against our predictions.
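
To make the mechanics a little more concrete, here is a minimal sketch of that kind of cell-level reconciliation. It is purely illustrative: the reconcile function, the column names and the in-memory rows are assumptions of mine, whereas the real tooling worked against database tables and performed these comparisons at vastly greater volume.

```python
# Illustrative sketch only: the real tooling ran against database tables at
# far greater volume. Function and column names here are hypothetical.

def reconcile(predicted_rows, actual_rows, key_column):
    """Compare predicted and actual rows cell by cell.

    predicted_rows / actual_rows: lists of dicts (column name -> value),
    matched on key_column. Returns (checks_performed, mismatches), where each
    mismatch is a (key, column, expected, actual) tuple.
    """
    actual_by_key = {row[key_column]: row for row in actual_rows}
    checks, mismatches = 0, []
    for predicted in predicted_rows:
        key = predicted[key_column]
        actual = actual_by_key.get(key)
        if actual is None:
            mismatches.append((key, None, "row expected", "row missing"))
            continue
        for column, expected in predicted.items():
            if column == key_column:
                continue
            checks += 1                       # one check per cell
            if actual.get(column) != expected:
                mismatches.append((key, column, expected, actual.get(column)))
    return checks, mismatches


# Example: two predicted rows reconciled against the loaded target data.
predicted = [{"id": 1, "ccy": "GBP", "amount": 100.0},
             {"id": 2, "ccy": "USD", "amount": 250.0}]
actual    = [{"id": 1, "ccy": "GBP", "amount": 100.0},
             {"id": 2, "ccy": "USD", "amount": 251.0}]

checks, mismatches = reconcile(predicted, actual, key_column="id")
print(f"{checks} checks, {len(mismatches)} mismatches")  # 4 checks, 1 mismatch
```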

So, what is a “test case” in this example?

The tools we created for this purpose were capable of performing in excess of ten million checks per minute. Over the course of a particular test run, we were performing perhaps a billion checks. Were we executing a billion test cases?

Now, those checks were performed at a field level. In most cases, the transformation logic was tied to an individual row of data, with combinations of field values within the record contributing to the outcome of each transformation. In this way, each row might be seen as representing a particular combination of conditions. We were testing with a few million rows of data. Were we executing a few million test cases?

Of course, many of these checks were seemingly redundant. The underlying transformation rules represented in the order of two thousand different outcomes, and a given data load might result in many, many instances of each outcome. So were we only executing two thousand unique test cases?

Each test run was orchestrated over the course of about a week. Typically, each run was conducted with a new set of test data. Conditioning data took considerable time, as did analyzing results and potential anomalies. If we conceive of our tools as being scientific instruments and the ETL implementation, in combination with any given set of data, the subject of our enquiries, then perhaps we should consider a test run to be a single experiment, a single test. Were we performing only one test, albeit a complex one, each time?

Any of these, from one to a billion, might be an appropriate answer depending on how you choose to define a test case. For our purposes, with an eye to coverage of conditions and outcomes, we chose to count this as being two thousand test cases. There was nothing inherently “correct” about this; it was simply a decision that we made on the basis that defining a test case at this level seemed useful.
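
As an illustration of that counting choice, here is a hypothetical sketch of rolling field-level check results up to the transformation outcomes they exercise. The rule_for mapping and the result tuples are assumptions of mine; in the real project the mapping came from the documented transformation rules.

```python
# Hypothetical sketch: collapse field-level checks into coverage of distinct
# transformation outcomes. rule_for() and the tuples are illustrative only.

from collections import Counter

def outcome_coverage(check_results, rule_for):
    """check_results: iterable of (table, column, row_key) tuples, one per check.
    rule_for: maps (table, column) to the transformation outcome it exercises.
    Returns a Counter of checks per outcome; its length is the number of
    distinct outcomes covered."""
    coverage = Counter()
    for table, column, row_key in check_results:
        coverage[rule_for(table, column)] += 1
    return coverage

# Example: three checks, two of which exercise the same outcome.
results = [("trades", "ccy", 1), ("trades", "ccy", 2), ("trades", "amount", 1)]
coverage = outcome_coverage(results, rule_for=lambda table, column: (table, column))
print(len(coverage), "distinct outcomes covered")  # -> 2
```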

Test cases are how you choose to define them.

ET: Why We Do It, an article by Petter Mattson

What follows is an article by my colleague Petter Mattson.

Petter and I recently made each other’s acquaintance after our organizations, Logica and CGI, merged. An experienced test manager and an advocate for exploratory testing, Petter wrote this article for internal publication within Logica. Unfortunately its contents were sufficiently divergent from the official testing methodology that it was never published. Many of the points in this piece resonated with me, and I was determined that it see the light of day.

I’d like to thank Petter, and his management at CGI in Sweden, for allowing me to publish it on Exploring Uncertainty.


Click here for Petter’s article


Dear Paul


I’d like to thank you for your kind words regarding my recent post. I agree with your assertion that there are a number of factors at work that will influence whether a tester will notice more than a machine, and I’d love to know more about your case study.

I suspect that we are closely aligned when it comes to machine checking. One of the main benefits of making the checking/testing distinction is that it serves to highlight what is lost when one emphasizes checking at the expense of testing, or when one substitutes mechanized checks for human ones. I happened to glance at my dog-eared copy of the Test Heuristics Cheat Sheet today, and one item leapt out at me: “The narrower the view, the wider the ignorance”. Human checks have a narrower view than testing, and mechanized checks are narrower still. We need to acknowledge these tradeoffs, and manage them accordingly.

I think we need to be careful about the meanings that we give to the word “check”. You say the usage that you have observed the most is when “talking about unmotivated, disinterested manual testers with little domain knowledge” or when “talking about machine checking”. Checking, in and of itself, is not a bad thing: rather, checks are essential tools. Further, checking, or more accurately the testing activity that necessarily surrounds checking, is neither unskilled nor unintelligent. Not all checks are created equal: the invention, implementation and interpretation of some checks can require great skill. It is, in my opinion, an error to conflate checking with the work of “unmotivated, disinterested manual testers with little domain knowledge”. Testers who are making heavy use of checking are not necessarily neglecting their testing.

More generally, I worry about the tendency – conscious or otherwise – to use terms such as “checking testers” (not your words) as a pejorative and to connect checking with bad testing. I would agree that the inflexible, unthinking use of checks is bad. And I agree that many instances of bad testing are check-heavy and thought-light. But rather than labeling those who act in this way as “bad testers”, and stopping at the label, perhaps we should go deeper in our analysis. We do, after all, belong to a community that prides itself on doing just that. I like that, in your post, you do so by exploring some traits that might influence the degree to which testers will go beyond checking.

There are a multitude of reasons why testers might stop short of testing, and it seems to me that many of them are systemic. Here are a few to consider, inspired in part by Ben Kelly’s series The Testing Dead (though far less stylish). The list is neither exhaustive, nor are its categories mutually exclusive:

  • The bad. Some people may actually be bad testers. It happens; I’ve met a few.
  • The uninformed. The testers who don’t know any better than to iterate through checks. It’s how they were raised as testers. Checking is all they’ve been taught: how to design checks, how to monitor the progress of checks, how to manage any mismatches that the checks might identify.
  • The oppressed. The testers who are incentivized solely on the progress of checks, or who are punished if they fail to hit their daily checking quotas. Testing is trivial after all: any idiot can do it. If you can’t hit your test case target you must be lazy.
  • The disenfranchised. Ah, the independent test group! The somebody-else’s-problem group! Lock them in the lab, or better yet in a lab thousands of miles away, where they can’t bother the developers. If fed on a diet of low-bandwidth artifacts and divorced from the life and culture of the project, is it any wonder then that their testing emphasizes the explicit and that their capacity to connect observations to value is compromised?
  • The demotivated. The testers who don’t care about their work. Perhaps they are simply uninformed nine-to-fivers, perhaps not. Perhaps they know that things can be different, cared once, but have given up: that’s one way to deaden the pain of hope unrealized. Many of the oppressed and disenfranchised might find themselves in this group one day.

Do you notice something? In many cases we can help! Perhaps we can encourage the bad to seek alternate careers. Perhaps we can help the uninformed by showing them a different way (and as you are an RST instructor, I know you are doing just that!). Perhaps we can even free the oppressed and the disenfranchised by influencing the customers of testing, the decision makers who insist on practices that run counter to their own best interests. That might take care of some of the demotivated too.

I like to think there is hope. Don’t you?

Kind regards,