How to Actually Reduce Software Defects

As an IT management consultant, the most frequent question I hear is some variant of “how can we get our defect count down?”

Developers may want this as a matter of professional pride, but it’s the managers and project managers who truly burn to improve on this metric. Their reasoning goes something like this: our software does thousands of undesirable things in production, and we’d like to get that down to hundreds.

Almost invariably, they’re looking for a percentage reduction, presumably because there is some sort of performance incentive based on the defect count metric. Managers want strategies for reducing defects by some percentage, in the same way that the president of the United States might challenge his cabinet to trim 2% of the unemployment percentage in the coming years. The trouble is, though, that this attitude toward defects is actually part of the problem.

The Right Attitude toward Defects

The president sets a goal of reducing unemployment, but not of eliminating it. Why is that? Well, because having nobody in the country unemployed is simply impossible outside of a planned economy – people will quit and take time off between jobs or get laid off and have to spend time searching for new ones. Some unemployment is inevitable.

Management, particularly in traditional, ‘waterfall’ shops, tends to view defects in the same light. We clearly can’t avoid defects, but if we worked really hard, we could reduce them by half. This attitude is a core part of the problem.

It’s often met with initial skepticism, but what I tell clients is that they should shoot for having no escaped defects (defects that make it to production, as opposed to ones that are caught by the team during testing). In other words, don’t shoot for a 20% or 50% reduction – shoot for not having defects.

It’s not that shooting for 100% will stretch teams further than shooting for 20% or 50%. There’s no psychological gimmickry to it. Instead, it’s about ceasing to view defects as “just part of writing software.” Defects are not inevitable, and coming to view them as preventable mistakes rather than facts of life is important because it leads to a reaction of “oh, wow, a defect – that’s bad, let’s figure out how that happened and fix it” instead of a reaction of “yeah, defects, what are you going to do?”

When teams realize and accept this, they turn an important corner on the road to defect reduction.

What Won’t Help

Once the mission is properly set to one of defect elimination, it’s important to understand what either won’t help at all or what will help only superficially. And this set includes a lot of the familiar levers that dev managers like to pull.

First, and probably most critical to understand, is that the core cause of defects is NOT developers failing to try hard enough or to take care. In other words, it’s not as though a developer is sitting at his desk and thinking, “I could make this code I’m writing defect free, but, meh, I don’t feel like it because I want to go home.”

It is precisely for this reason that exhortations for developers to work harder or to be more careful won’t work. They already are, assuming they aren’t overworked or unhappy with their jobs, and if those things are true, asking for more won’t work anyway.

And, speaking of overworked, increasing workload in a push to become defect free will backfire. When people are forced to work long hours, the work becomes grueling and boring. “Grueling and boring” is a breeding ground for mistakes – not a fix for them. Resist the urge to make large, effort-intensive quality pushes. That solution should seem too easy, and, in fact, it is.

Finally, resist any impulse to forgo the carrot in favor of the stick and threaten developers or teams with consequences for defects. This is a desperate gambit, and, simply put, it never works. If developers’ jobs depend on not introducing defects, they will find a way to succeed in not introducing defects, even if it means not shipping software, cutting scope, or transferring to other teams/projects. The road to quality isn’t lined by fear.

Understand Superficial Solutions

Once managers understand that eliminating defects is possible and that draconian measures will be counterproductive, the next danger is a tendency to seize on the superficial. Unlike the ideas in the last section, these won’t be actively detrimental, but the realized gains will be limited.

The first thing that everyone seems to seize on is mandating unit test coverage, since this forces the developers to write automated tests, which catch issues. The trouble here is that high coverage doesn’t actually mean that the tests are effective, nor does it cover all possible defect scenarios. Hiring or logging additional QA hours will be of limited efficacy for similar reasons.

Another thing folks seem to love is the “bug bash” concept, wherein the team takes a break from delivering features and does their best to break the software and then repair the breaks. While this certainly helps in the short term, it doesn’t actually change anything about the development or testing process, so gains will be limited.

And finally, coding standards to be enforced at code review certainly don’t hurt anything, but they are also not a game changer. To the chagrin of managers everywhere, “here are all of the mistakes one could make, so don’t make them” doesn’t arise from the past experience of the tenured developers on the team.

Change the Game

So what does it take to put a serious dent into defect counts and to fundamentally alter the organization’s views about defects? The answers here are more philosophical.

The first consideration is to get integration to be continuous and to make deployments to test and production environments trivial. Defects hide and fester in the speculative world between written code and the environment in which it will eventually be run. If, on the other hand, developers see the effects their code will have on production immediately, the defect count will plummet.

Part and parcel with this tight feedback loop strategy is to have an automated regression and problem detection suite. Notice that I’m not talking about test coverage or even unit tests, but about a broader concept. Your suite will include these things, but it might also include smoke/performance tests or tests to see if resources are starved. The idea is to have automated detection for things that could go wrong: regressions, integration mistakes, performance issues, etc. These will allow you to discover defects before your customers do.
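As a rough illustration of what such a suite might look like at its very smallest, here is a Python sketch. All of the names are assumptions made up for the example: build_report stands in for real application code, the one-second budget is arbitrary, and run_suite is a toy runner rather than any particular tool. A real suite would run against a deployed test environment on every integration.

```python
import time


def build_report(orders):
    """Hypothetical piece of application code that the suite watches."""
    return {"count": len(orders), "total": sum(orders)}


def check_smoke():
    # Cheapest check of all: does the code run on empty input?
    return build_report([]) == {"count": 0, "total": 0}


def check_regression():
    # Pins previously correct behavior so a later change cannot silently break it.
    return build_report([10, 20, 30]) == {"count": 3, "total": 60}


def check_performance(budget_seconds=1.0):
    # Fails if a typical workload exceeds an assumed time budget.
    start = time.perf_counter()
    build_report(list(range(100_000)))
    return time.perf_counter() - start < budget_seconds


def run_suite(checks):
    """Run every check and collect the names of the ones that failed."""
    failures = []
    for check in checks:
        try:
            ok = check()
        except Exception:
            ok = False  # a crash counts as a detected problem
        if not ok:
            failures.append(check.__name__)
    return failures


failures = run_suite([check_smoke, check_regression, check_performance])
print("failures:", failures)
```

The design choice worth noticing is that the runner treats a crash the same way as a failed check. The goal is detection of anything that could go wrong, not just verification of expected values.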

And, finally, on the code side, you need to reduce or eliminate error prone practices and parts of the code. Is there a file that’s constantly being merged and could lead to errors? Do your developers copy, paste, and tweak? Are there config files that require a lot of careful, confusing attention to detail? Does your team have an established code review process, or is it something that is still happening ad-hoc? Recognize these mistake-inviters for what they are and eliminate them.
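Config files that demand careful attention to detail are one mistake-inviter that can often be defused with a small amount of code. The Python sketch below assumes a hypothetical application with three invented settings (db_host, db_port, timeout_seconds) and checks a config at startup, so a typo surfaces immediately instead of as a production defect.

```python
# Hypothetical required settings and their expected types.
REQUIRED = {"db_host": str, "db_port": int, "timeout_seconds": int}


def validate_config(config):
    """Return a list of human-readable problems; an empty list means the config is usable."""
    problems = []
    for key, expected_type in REQUIRED.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif not isinstance(config[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    for key in config:
        if key not in REQUIRED:
            problems.append(f"unknown key: {key}")  # usually a typo
    return problems


good = {"db_host": "localhost", "db_port": 5432, "timeout_seconds": 30}
typo = {"db_host": "localhost", "db_prot": "5432", "timeout_seconds": 30}

print(validate_config(good))
print(validate_config(typo))
```

The misspelled db_prot key is reported twice over: once as a missing db_port and once as an unknown key. That is exactly the kind of error a tired human reviewer tends to miss.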

But here’s the thing – I can’t possibly enumerate all of the tools in your arsenal. These are some of my most tried and true strategies, but you’ll have to figure out what works for you. The key is to recognize that defects are not inevitable and go from there.


Comments

  1. The title of this article is “How to Actually Reduce Software Defects” but should really be “Things you should not do when trying to reduce software defects”. The only actual suggestions to reduce software defects were very general and not very applicable to everyday software development. The “what not to do” sections were interesting, though.

  2. “The first thing that everyone seems to seize on is mandating unit test coverage, since this forces the developers to write automated tests, which catch issues. The trouble here is that high coverage doesn’t actually mean that the tests are effective, nor does it cover all possible defect scenarios.”

    A couple of things:

    1.) No, unit tests won’t cover *all* possible defect scenarios, but they’ll definitely cover *some*, maybe even *most* in some contexts. Would you seriously counsel a team that doesn’t currently enforce unit test coverage to not start doing so simply because it wouldn’t prevent every possible defect?

    2.) No, unit test coverage doesn’t mean the tests are any good, which is why no responsible person would ever say unit test coverage is the *only* solution to defect prevention. But that doesn’t mean it isn’t a vital part of it. Code reviews can, in fact, surface bad tests that game the test coverage system and push teams to replace bad tests with good ones. Would you seriously not suggest a team start adding unit test coverage simply because it’s *possible* that it could be done poorly?

    I hope not to project this onto you too much, especially if you have good answers to the above questions, but I get really annoyed at the recent fad of bashing unit test coverage. I see it as a lot of strawmen being burned without addressing the actual value that unit test coverage adds to a product.

    I do agree, however, that good CI practices and continuous delivery into deployed test environments are arguably *more* vital to defect prevention than unit test coverage (although I question the value of CI a little bit in the absence of unit test coverage).

    I also appreciate that you strongly argue against the “bugs happen, so what?” attitude, which I also detest.

  3. Totally agree.

    I think the defect-free engineers, who believe defect-free software is possible, have vastly lower defect rates than the typical engineer, who believes bugs are a natural part of programming. The defect-free engineers also have markedly higher productivity.

  4. As a tester, my analysis and recommendation is this: continuity in software testing is key to maintaining a high standard of performance. Financial software is more vulnerable and exposed to threats, so it’s important for its developers to sustain performance by applying automation and security testing techniques.

  5. That was a wonderful site, thanks for sharing the post. Wish you the best.

  6. Jolene Hart, CSQA says:

    I was disappointed with this essay because of the focus on what DEVELOPERS can do, and the lack of attention to the role of both QA and QC. QA – Quality Assurance: preventing defects, e.g., via feedback and other methods; QC – Quality Control: testing.

    As a former senior developer and current senior QA/QC consultant, I agree that most of your points are valid, just that your recommendations are too limiting. And there was no mention of attention to the intended BUSINESS needs of the organization/software.

    Also Continuous Integration (CI) of crappy code will not work. Ultimately the integration process will fail, and it will never be the solution for software defects due to not understanding/achieving the intended business/customer needs.

  7. Jolene Hart, CSQA says:

    One more point – a compliment. Thank you for mentioning the effectiveness of software code reviews. You might be interested in the article I published in 1978 (and presented to an international software conference) on this topic.

  8. Testing is always vital and effective for improving the performance standards of software, so it’s necessary for the development manager to implement testing mechanisms like automation testing, security testing, and performance testing. Security testing is especially vital for financial software, as it can more easily become the victim of bugs and vulnerabilities. Sustained testing will be really helpful in actually reducing defects efficiently.
