Tag Archives: Regression

Rethinking Regression, Part 5: Your Mission, Should You Choose to Accept It

Wouldn’t it be great if projects took a “sensible” approach to mitigating regression risks? If they applied plenty of prevention, used automated unit-level checks for confirmatory testing, and left the testers to do what they do best: find bugs?

This is not the reality on many projects. Nor is it even appropriate on every project. Not every change is significant enough to warrant reviews. Not every change will necessitate refactoring code. Static analysis tools can be noisy and take time to tune: not every project will run for long enough to justify that investment. Not every project will be delivering code with a shelf-life that warrants automated unit-level checks. Some projects may have significant difficulties with their configuration management systems that will take time to resolve.

“Sensible” therefore takes in a whole range of factors that a tester may not consider or even be aware of. Ultimately, it will not be the tester who determines what mitigation strategies are appropriate for the project: that is the province of the project manager.

What does this mean for the tester? In a word: mission.

It is often helpful to agree a clear testing mission with the relevant stakeholders. Doing so helps to avoid the unpleasant surprises (“You’re doing A? I thought you were doing B!”) that can result from misaligned expectations, and helps to keep the testing effort pulling in the same direction as the project.

The regression testing mission will be driven by a range of contextual factors that might include the scope, scale and nature of the changes being implemented, the stage within the project life-cycle, project constraints, and the other mitigation strategies that the project is employing. For example:

  • Project A is implementing a wide range of mitigation strategies, including configuration management and unit-level change detection. The project manager and testers agree that the testing mission should be biased towards finding bugs with only light confirmation being performed at the system level (as change detection is largely provided at the unit level).
  • Project B has effective configuration management, but no automated unit-level regression checks. The project manager and testers agree that the testing mission should strike a balance between conducting confirmation around those areas that are changing, and testing for bugs.
  • Project C has little regression mitigation: configuration management has proved highly unreliable, and there are no automated unit-level regression checks. Based on the nature of the changes and the stage in the project, the project manager and testers agree that the testing mission should focus on broad confirmation of the software, with some time allocated to testing for bugs.

Explicitly discussing the regression testing mission can provide the tester with an opportunity to ensure that the relevant project stakeholders are aware of the limitations of black box regression testing. However, if a project manager understands that black box regression testing is not the most cost-effective means of providing change detection and is seriously limited in its ability to find bugs – but decides to rely on it to mitigate regression risks – then that is his or her decision to make. In such a position, all that a tester can reasonably do is recognize that they are selling tobacco and provide a health warning so as to set expectations.

In summary, the regression problem is not a single problem; it is a range of different risks that are most effectively mitigated with a variety of different strategies. By educating their stakeholders about the limitations and tradeoffs involved with black box regression testing, testers can help them to make better risk mitigation decisions. Ultimately contextual factors will drive decisions as to which strategies are appropriate on any given project, and the regression testing mission needs to be defined accordingly.


Rethinking Regression, Part 4: What’s Wrong with Regression Testing?

In her blog post Recession Testing is the new Regression Testing, Anne-Marie Charrett expresses dissatisfaction with the way in which regression testing is often practiced.  This kind of frustration is common amongst testers, and a good indicator that there is an underlying problem.

In relation to managing regression risks, what’s wrong with regression testing?

To start, let’s be clear what I mean by regression testing: I’m specifically referring to the black box regression testing commonly practiced by testers. It often goes something like this:

  • Design and run tests for new functionality.
  • Save these tests into a regression pack for future use (one common encoding is sketched after this list).
  • Maybe find a bug: get the fix, write a test to check the fix, save that into the regression pack.
  • Build changes: run some or all of the same tests again.
  • Automate some or all of the regression pack.
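
For concreteness, here is a minimal sketch of how such a pack is often encoded (assuming pytest; the marker name and the function under test are hypothetical, not taken from any real project):

    import pytest

    def welcome_discount(is_new: bool) -> float:
        # Hypothetical function under test.
        return 0.10 if is_new else 0.0

    @pytest.mark.regression        # tagged into the pack for future builds
    def test_new_customer_gets_welcome_discount():
        assert welcome_discount(is_new=True) == 0.10

    @pytest.mark.regression        # a retest saved after a hypothetical bug fix
    def test_bug_987_existing_customer_gets_no_discount():
        assert welcome_discount(is_new=False) == 0.0

Running pytest -m regression then re-executes the whole pack against each new build.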

Now we’ll consider how this approach stacks up in terms of the types of mitigation strategies described in the previous post.

First, prevention. Whilst test design might offer some mitigation of regression risks – designing tests requires careful examination of the test basis, which can surface problems before any code is run – the above approach is one of repetitive execution rather than design. It has no preventive power in relation to regression risks.

Second, confirmation. The principal focus of this approach is confirmation: checking that tests continue to give the same results on this build as they did on the last. This is where the usefulness of such tests lies. However, as Paul Gerrard points out in his response to Charrett’s post, this kind of testing is effectively “about demonstrating functional equivalence”. Black box testing is far less suited to demonstrating equivalence than unit testing:

  • At the black box level, the tester observes behaviour resulting from many functions acting together; this can mask the behaviour of individual functions.
  • In contrast, functions are defined at the unit level: it is at this level that the impact of changes is most obvious, and the behaviour of individual functions most easily isolated and evaluated (see the sketch after this list).
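
To illustrate, here is a minimal unit-level check (a sketch; the function and values are hypothetical): it pins the behaviour of a single function in isolation, so a change that breaks functional equivalence fails right here, at the point of change, with nothing to mask it.

    def price_with_tax(net: float, rate: float = 0.20) -> float:
        return round(net * (1 + rate), 2)

    def test_price_with_tax_behaviour_is_unchanged():
        # These expected values pin the function's current behaviour; any
        # change that alters it is detected at the unit, not downstream.
        assert price_with_tax(100.00) == 120.00
        assert price_with_tax(10.00, rate=0.05) == 10.50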

Finally, disconfirmation.  This approach has significant issues when it comes to finding bugs:

  • Cost effectiveness is poor when it comes to finding bugs. Tests that have already been run have generally revealed any bugs that they are likely to find. In addition, regression testing is costly and time consuming – regression packs grow over time, and become a burden in terms of the effort to execute and maintain them. I’ve done informal surveys on a few projects: the results generally indicated that 70-90% of testing effort went into running and maintaining regression packs, yet this yielded only around 20% of the bugs. These results are not dissimilar to those reported by Brian Marick and others in “How Many Bugs Do Regression Tests Find?”
  • Worse, regression testing comes with a significant opportunity cost. Exhaustive testing is impossible, which means that every testing endeavor is an exercise in sampling. The more time spent on regression, the less time available for testing other things, i.e. the smaller our sample. This style of regression testing robs the test effort of coverage and leaves bugs undiscovered.
  • Re-running bug retests borders on the absurd. Think of the barn door analogy: once the horse has bolted, been returned and locked back in, what’s the likelihood of the horse escaping again if no one goes near the barn?  What’s the likelihood that a bug will be reintroduced if the corresponding code is not changed?  Zero.  Habitually re-running retests, in the absence of a change to the code that caused the original bug, makes no sense.
  • Old tests are blind to new problems. We design tests in two ways: we design confirmatory tests to check things seem to be as they should be, and we design disconfirmatory tests that are intended to find problems.  The latter are based on our ideas of what might fail.  When code is changed an entirely different set of things might fail than when the code was first introduced.  We need new tests to find new problems, not old tests and luck.
  • Old tests keep missing the same bugs.  Each test may find some bugs and miss others.  Constant repetition of the same tests will consistently miss the same bugs.  In Software Testing Techniques, Boris Beizer described this as the “pesticide paradox”.
  • Automation may not help. Automation can be expensive to develop and maintain, on some accounts up to 10 times more expensive than the same manual tests (Kaner, Bach and Pettichord: Lessons Learned in Software Testing). And the results? Unfortunately, automation is fundamentally stupid. Whilst a human tester might notice failures that are only tangentially related to the test they are performing, automation cannot: it only takes note of those things that it has been programmed to look for (sketched below). In one of the teams surveyed above, manual regression packs had been substantially replaced with automated GUI regression – automation which found no bugs. The cost effectiveness of such automation, in relation to finding bugs, can be substantially worse than that of manual regression.
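
For illustration, here is a sketch of a typical automated GUI check (assuming Selenium; the URL and element ID are hypothetical). Its attention is exactly as narrow as its assertions:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get("http://localhost:8080/checkout")           # hypothetical page
        total = driver.find_element(By.ID, "order-total").text
        # Only the asserted value is examined: a garbled layout or an error
        # banner elsewhere on the page would pass this check unseen.
        assert total == "120.00", f"unexpected total: {total}"
    finally:
        driver.quit()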

Given the above, it is clear that the use of black box regression testing is seriously flawed when it comes to mitigating regression risks.  Charrett is right: it’s time for “a different paradigm”.

How might things be different?  I’ll discuss the selection of regression risk mitigation strategies, and how context influences selection, in the next post in this series.

 


Rethinking Regression, Part 3: Mitigation Strategies

In the previous post in this series, I introduced regression risk and the idea that this is not a single problem but a set of different ones:

  • New Bug
  • Old Bug
  • Zombie Bug
  • Bad Build

For each of these risks, there are a variety of mitigation strategies that fall into three categories:

  • Prevention: Stop the risk from occurring in the first place.
  • Confirmation: Check that things work as they did before, i.e. that the risk hasn’t occurred (Michael Bolton would call this checking).
  • Disconfirmation: Seek to disprove that things work, i.e. prove that bugs have been introduced.

Let’s take each risk in turn, and consider what kinds of mitigation a project might implement.

New Bug:

  • Prevention: We might review specifications of the change in the hope of preventing new bugs from being introduced.
  • Confirmation: We might run old tests in order to build confidence that the new build behaves in the same way that the last one did.
  • Disconfirmation: We might actively seek new bugs by looking for failures.

Old Bug:

  • Prevention: We might refactor the code to remove the added complexity – and messiness – that inevitably accompanies change piled on top of change.
  • Confirmation: We might rerun tests that we created specifically to retest old bugs, in order to demonstrate that they have not crept back in (a sketch of such a retest follows this list).
  • Disconfirmation: We might actively look for new ways that old bugs could manifest themselves.
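
Here is a minimal sketch of such a retest (the bug, its ID and the function are all hypothetical): the test is pinned to a specific fix, so rerunning it demonstrates that the fix is still in place.

    def apply_discount(total: float, rate: float) -> float:
        # Fixed implementation: the discount is applied exactly once.
        return round(total * (1 - rate), 2)

    def test_bug_1234_discount_applied_only_once():
        # Before the fix, 100.00 at 10% came back as 81.00 (discounted twice).
        assert apply_discount(100.00, 0.10) == 90.00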

Zombie Bug:

  • Prevention: We might use reviews and static analysis to check that there is no dead code, code for which the conditions of execution can never be met.
  • Confirmation: Similar to new bugs, we might run old tests to check that the software’s behaviour remains the same.
  • Disconfirmation: Similar to new bugs, we might actively seek bugs by looking for failures.

Bad Build:

  • Prevention: We might implement software configuration management in order to prevent the reintroduction of old, buggy code.
  • Confirmation: We might implement smoke tests to give us some reassurance that a build is sound (a sketch follows this list), or – as for old bugs – rerun old bug retests.
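
Here is a minimal smoke-test sketch (the URL and paths are hypothetical; it assumes the build under test is deployed behind an HTTP interface): a few broad, shallow probes that tell us quickly whether the build is sound enough to be worth testing further.

    import urllib.request

    BASE_URL = "http://localhost:8080"   # assumption: where the build is deployed

    def smoke_check(path: str) -> None:
        with urllib.request.urlopen(BASE_URL + path, timeout=5) as resp:
            assert resp.status == 200, f"{path} returned {resp.status}"

    if __name__ == "__main__":
        for path in ("/", "/login", "/orders"):   # hypothetical key pages
            smoke_check(path)
        print("Smoke test passed: build accepted for further testing.")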

Here’s a brief summary:

    Risk         Prevention                    Confirmation                          Disconfirmation
    New Bug      Specification reviews         Rerun old tests                       Actively test for new bugs
    Old Bug      Refactoring                   Rerun old bug retests                 Seek new ways old bugs could manifest
    Zombie Bug   Reviews and static analysis   Rerun old tests                       Actively test for new bugs
    Bad Build    Configuration management      Smoke tests; rerun old bug retests    (none)

This list is far from exhaustive, and I’m not yet making any claims as to the relative merits of these various approaches. What should be immediately clear, though, is that regression testing is far from the only solution to the regression problem.

But what’s wrong with regression testing?  That’s where we’ll turn our attention next.

 


Rethinking Regression, Part 2: What is Regression Anyway?

So what is regression?

The simple answer is that when software changes there is a risk that the change has unexpected consequences, that the quality of the software “regresses”. This is a broad definition that can be broken down into the following risks:

New Bug: The change breaks something that worked previously, introducing a new bug.

Old Bug: The change reverses a previous fix such that an old bug is reintroduced. Ever been up late with a developer doing bug driven development? In some cases, it can go a bit like this:

DEVELOPER: Fixed it!
TESTER: OK, Case A now works, but hang on…Case B just failed.
DEVELOPER: Er, OK, [type type type], try this.
TESTER: Great, B works!…but A is failing again.
DEVELOPER: @#$%
etc…

This type of thing can happen when the application logic is complex, and it turns out that whilst a pair of requirements make sense from a user perspective, they are mutually exclusive for the current implementation – so we flip back and forth between bugs until we realize that a new design is needed.
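
Here is how that flip-flop can look in miniature (a sketch; the function and requirements are hypothetical): two tests that are mutually exclusive for a single implementation, so each “fix” breaks the other.

    import pytest

    def parse_quantity(value: str) -> int:
        # Current implementation satisfies Case A: blank input defaults to 1.
        if value == "":
            return 1
        return int(value)

    def test_case_a_blank_defaults_to_one():
        # Requirement A: a blank quantity means "one item".
        assert parse_quantity("") == 1

    def test_case_b_blank_is_rejected():
        # Requirement B: a blank quantity is invalid input.
        with pytest.raises(ValueError):
            parse_quantity("")

“Fixing” B (raising ValueError on blank input) immediately fails A, and vice versa; the flip-flop continues until someone realizes the design itself must change.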

Zombie Bug: The change unblocks a bug that was previously hidden. Sometimes bugs cannot be reached by testers using black box testing: this can be due to other bugs, or because the bugs are hiding out in dead code. From time to time, changes reanimate the code, bringing the bugs back from the dead.
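
A minimal sketch of a zombie bug hiding in dead code (everything here is hypothetical):

    PREMIUM_ENABLED = False   # feature flag: the branch below is dead code

    def shipping_cost(weight_kg: float, premium: bool) -> float:
        if PREMIUM_ENABLED and premium:
            # Buggy: heavy parcels get a negative cost, but while the flag is
            # off this branch is unreachable and no black box test can see it.
            return 10.0 - weight_kg
        return 2.5 * weight_kg

Flipping PREMIUM_ENABLED to True reanimates the branch, and the old bug surfaces as an apparent regression in what looks like an unrelated change.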

Bad Build: Version control fails such that old buggy code is reintroduced. The previous post in this series started with an example of this.

Aha! So that’s why we do regression testing! The clue’s in the name right? To mitigate regression risks?

Not so fast. There are some important subtleties at work here. Before jumping straight to the answer that regression testing is the answer to all our regression risk problems, perhaps we should spend some time thinking about exactly what kind of mitigation strategies these risks require.

Further reading:
Kaner and Bach discuss some of these risks here


Rethinking Regression, Part 1: Hard Lessons

During one of my first test management gigs, I had an unpleasant surprise.

The testing cycle in question was retesting a bunch of bug fixes, and doing regression testing of the affected modules. No other modules were affected by the changes, even indirectly.

Other than a few minor bugs, the tests passed with flying colours, and we happily pushed the build up to UAT.

About half an hour later, an irate project manager arrived at my desk: the acceptance testers had discovered a number of major problems in other parts of the application, problems that sounded hauntingly familiar.

After another hour or so of testing we came to a frightening conclusion: version control issues had caused this build to wipe out over a month’s worth of fixes in modules that were pretty much done.

The PM’s response: “Why didn’t you test that? You’re meant to be doing regression testing.”

I learned an important lesson that day: always do a full regression.

Unfortunately, that was entirely the wrong lesson.

Regression testing, attitudes to regression testing, and common regression testing practices cause some serious issues for testers and the projects they serve. This series of posts will explore the topic further.

 
