Tech Life At the intersection of improbable and unthinkable

The QA Mindset

My first job in technology was a QA internship. The summer between my freshman and sophomore years, I tested the first release of Paradox for Windows at Borland.

As an intern, I started by following someone else’s QA test plan – dutifully checking each test off the list. After a few weeks, I knew my particular area inside and out. A new build would show up, which I’d install via 3.5-inch floppies, and in ten minutes of usage, I’d have a sense – is this a good or bad build?

In QA, there is a distinct moment. It comes once you’re deeply familiar with your product or product area; it comes when you’re lost in your testing, and it comes in an instant. You find a problem, and because of your strong context about your product, you definitely know: Something is seriously wrong here.

Where’s QA?

At the current gig, there’s no QA department. QA as an independent function does not exist and that’s a first for me.

Every company I’ve ever worked at, for the better part of two decades, whether consumer or enterprise software, has had a well-staffed QA function. I’m told this absence is not unique in the Valley. I’m told that neither Facebook nor Twitter has a QA entity.

Similarly experienced co-workers keep asking, “Shouldn’t we have a QA function?” and while my instinctive response is an emphatic “Yes”, as a member of a rapidly-evolving profession, I need to be open to the idea that software development evolves in ways that may seem counter-intuitive to me. The fact that the team and the company have been successful sans QA is essential and interesting data.

My thesis regarding the necessity of QA has always been: checks and balances. A healthy development team was one that had engineers doing their best to build a great product. These engineers were paired with a team of QA engineers who were doing their best to prove the product wasn’t great. These conflicting goals result in what I consider to be an essential healthy tension between engineering and QA.

Yes, there is often professional conflict between the teams. Yes, I often had to gather conflicting parties together and explain that, “You are both doing your job. No, engineers are not deliberately creating bugs. No, QA is not hating on the product. Yes, we actually have the same goal: rigorously confirming whether or not the product is great.”

The absence of this healthy tension concerns me, but more worrisome is the absence of the practices a thriving QA function builds and maintains. These practices are still with me and are essential to defining and maintaining a high quality product.

They are:

Do I understand the issue?

It’s natural to feel rage when software doesn’t work, especially when you paid your hard-earned money to purchase it. Rage is counterproductive to the QA mindset. In fact, it hinders it. The QA reaction to defects is curiosity: Whoa, whoa. Wait, what happened there…? And what follows is a series of interrelated questions that build on each other.

Can I reproduce it? Does it happen every time? What was I doing right before the situation occurred? If there is a crash, does the crash log offer any clues? Based on my knowledge of the product and the code, do I have a hypothesis as to why this might be happening?

In the last decade, software companies have made the process of capturing crashes stunningly easy thanks to the Internet. When your application or operating system crashes, you often receive a dialog delicately apologizing for the crash and asking if you want to submit a crash report. Usually you are asked if you want to add any additional information, and usually you don’t do this, which leads us to…

Can I effectively report the issue?

In this world of auto-submitted crash reports, we the customers have little incentive to provide any information beyond the crash report because we’re mad. Our software crashed, our game was interrupted, our document was lost, value was not delivered. The QA mindset dictates that “Any additional information, however seemingly trivial, might aid in the resolution of the issue.”

You probably do not take the time to add any additional information when these crashes occur, or if you do, it’s full of rage: “JUST TRYING TO GET WORK DONE HERE PEOPLE.”

Some bugs are slippery. They exist at the intersection of improbable and unthinkable. This is why these bugs are discovered in the wild. Humans do strange shit to software that we could never predict in the controlled setting of our carefully constructed software development environments.

Slippery bugs are a mystery, and there is initially no telling what contributing factors are relevant. In my time in QA, for issues that were hard to reproduce, we prided ourselves on documenting everything, however irrelevant, that might have led to the crash. Totally clean OS – just installed. Wired connection, wireless disabled. No virus software. All files are local.

You’re not going to do this because you don’t perceive that you have skin in the game. You’re correctly assuming that part of what you were paying for is quality, and you likely haven’t been in QA. You likely haven’t received that mail in the middle of the night from the development team that has been chasing that slippery bug for the past two weeks, asking, “Can you try this patch? We think we fixed your bug.”

And they did. Because you took the time to think before you submitted your report, which leads us to our last part of the QA mindset…

Do I perform these actions unfailingly?

The last QA dictate is the most important and the one you are least likely to perform. The last dictate is: “In the face of a problem, do you act to correct it? Unfailingly?”

There’s a bias towards system and application crashes in the observations above because these crashes are the defects that give us the most rage. And while identifying and fixing these crashes is an obvious high priority, there’s a whole other class of defects of equal priority that are less obvious.

My favorite internal application at Apple was a product called Radar. It was a Cocoa application that served as our bug tracking system, and if you wanted to know what was going on regarding a specific application at Apple, you went to Radar.

For many groups at Apple, Radar was religion. An issue regarding a product was not considered to exist until it was in Radar. If someone asked me, “Have you seen this bug in your product?” my immediate response was, “Is this in Radar?” “No.” “We’re not talking until you’ve filed a Radar.” Case closed. For now.

The unfailing rules were:

  • If you see something wrong in the product – however big or small – report it as best you can.
  • It is good form to take the time to report the issue as descriptively and thoroughly as possible, but it is more important to report the issue.
  • It is also good form to check if the issue has been reported by someone else, but it is more important to report the issue.
  • When the issue is reported as fixed, take the time to confirm it as such, because more often than not, it’s not and/or it created another related issue.
  • Failure to follow these rules will be met with an immediate reminder of the aforementioned rules.

For the teams that unfailingly followed these rules, Radar became a powerful tool because it became a credible source of truth regarding the product. The answer to the question, “How is the product doing?” wasn’t an annoyingly vague, “I’m feeling good about it.” The answer was, “We’re fixing critical issues at a rate of 1 issue per engineer per day. We’ve got 14 engineers, we’ve got 308 issues, which means if no issues arrive, we’ve got 22 days of work. Except our arrival rate is 7 a day and it’s increasing. Also, next time you want to know this data, here’s the query you run. Stop wasting my time.”

You are sensing rage in my answer because I’ve spent a career surrounded by well-intentioned humans who believed that it was QA’s job to file bugs, and the fact is that quality is a feature, so like it or not, everyone is in the QA department.
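
The arithmetic behind that status answer is simple enough to sketch. What follows is an illustration only, not the Radar query: the function name and its parameters are hypothetical, and it assumes fix and arrival rates stay constant. Since the arrival rate in the example is increasing, the second figure should be read as optimistic.

```python
import math


def days_to_clear(open_issues, engineers,
                  fixes_per_engineer_per_day=1.0, arrivals_per_day=0.0):
    """Estimate days to clear a backlog at constant rates (hypothetical helper).

    Returns math.inf when new issues arrive at least as fast as they are fixed.
    """
    net_fixes_per_day = engineers * fixes_per_engineer_per_day - arrivals_per_day
    if net_fixes_per_day <= 0:
        return math.inf  # the backlog never shrinks
    return open_issues / net_fixes_per_day


# Numbers from the status answer: 14 engineers, 308 issues, 1 fix per engineer per day.
print(days_to_clear(308, 14))                      # 22.0 days if nothing new arrives
print(days_to_clear(308, 14, arrivals_per_day=7))  # 44.0 days at a steady 7 arrivals a day
```

Seven new issues a day cancels out half of the team's daily fixes, which is why the estimate doubles from 22 days to 44.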

QA is a Mindset

In a pre-Internet world, one of the key reasons for a well-defined quality assurance team was the cost of distribution. When you released software, it required producing a pretty shrink-wrapped box full of disks and documentation. This was expensive to build and ship. More importantly, the infrequent yearly release of this shiny box was the sole yearly opportunity to get your software in front of your customers. It could be upwards of a year before you had a chance to right your buggy wrongs.

Thankfully, blissfully, this is no longer the world we live in. At the current gig, we’re releasing the website a couple of times a day. We’re releasing apps at a slower pace, but when I say slower pace, I’m talking days… not months… never years. Perhaps our collective ability not only to rapidly detect but also to fix issues within our products has made us less dependent on an independent QA function?

My concern is that the absence of QA is the absence of a champion for aspects of software development that everyone agrees are important, but often no one is willing to own: unit tests, automation, test plans, bug tracking, and quality metrics. The results of this work give QA a unique perspective. Traditionally, QA engineers are known as the folks who break things, who find bugs, but QA’s role is far more important. It’s not just that QA can discover what is wrong; they intimately understand what is right, and they unfailingly strive to push the product in that direction.

I believe these are humans you want in the building.