Guerrilla QA for Tiny Teams

How to test well even when you have no idea what you’re doing

Our rigorous testing protocols, conscientiously applied to Pop-Tarts (Vicky Wasik/Serious Eats)


Last year, we started on a complete overhaul of Serious Eats that would better reflect our editorial direction and help readers find what they were looking for, with stories and recipes organized into four main areas. By the beginning of July we felt like the project was in pretty good shape for QA.

With a product team of two, a designer (me) and a developer, we have to carefully consider how much bang for the buck any process will bring us. And admit it: no one likes to do QA. But it doesn’t have to be painful! It can be cost-effective and quick, and it can provide additional benefits if you plan ahead a little. Here’s how we approached it at Serious Eats.

Define Parameters

Before we could do any testing, we needed to decide what to test, and how. We began by asking a series of questions.

How are we testing? We decided to go the manual testing route with internal users. Automated testing was out of the question, as we didn’t have time to comb through auto-generated screenshots or to learn how to write tests for the volume of what needed to be tested. Humans can actually be quite efficient once you’ve explained the task, and having humans submit bugs narrowed the field considerably, letting us focus on fixing the problem areas.

Who can help us test? Getting people outside of the Product team (but internal to the organization) to help with testing was key to the process. QA should be understood as a team effort and involve “non-technical” people like editors and salespeople. Even if they’ve been involved with the design process from the beginning, they haven’t been staring at it for days on end, so they’ll be able to spot things that you missed. Communicate that it’s an opportunity to get familiar with new styles and features, to ask questions and request improvements, and to see what the new design will feel like for readers on different devices.

Which platforms and browsers are we testing on? The number and variety of browsers and platforms to test seemed overwhelming at first, but once we considered how to get the best results from our resources, the focus became clear. We looked at our traffic trends over the last year and picked a cutoff point: a browser had to represent at least 1% of overall traffic to fall into the “must test” bucket; the rest we covered as time allowed. Time not spent testing Amazon Silk on first-generation Kindle Fires meant that we could spend more time improving the site for our most popular browser/OS combo. Depending on your resources and traffic, you may decide to move that cutoff point.
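To make the cutoff rule concrete, here’s a minimal sketch in Python. Everything in it is hypothetical: the browser names, session counts, and the `must_test` helper are invented for illustration, and a real version would read from your analytics export rather than a hand-typed dictionary.

```python
def must_test(sessions_by_browser, cutoff=0.01):
    """Split browsers into must-test and as-time-allows buckets
    based on their share of overall traffic."""
    total = sum(sessions_by_browser.values())
    must, optional = [], []
    # Walk browsers from most to least traffic so the must-test
    # list comes out in priority order.
    for browser, sessions in sorted(
        sessions_by_browser.items(), key=lambda kv: kv[1], reverse=True
    ):
        bucket = must if sessions / total >= cutoff else optional
        bucket.append(browser)
    return must, optional

# Made-up numbers for illustration only.
sessions_by_browser = {
    "Chrome / Android": 420_000,
    "Safari / iOS": 390_000,
    "Chrome / macOS": 150_000,
    "Firefox / Windows": 30_000,
    "Silk / Kindle Fire": 4_000,  # falls below the 1% line
}
must, optional = must_test(sessions_by_browser)
```

With these made-up numbers, Silk on a Kindle Fire lands in the as-time-allows bucket, while everything at or above 1% of traffic gets assigned a tester.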

Which content are we testing? We devised a series of rules to prioritize the subset of content we needed to test, as it had to be representative of the breadth and depth of the site. These rules will probably be a bit different for every organization, but our ruleset ended up like this:

  1. Top 50 URLs by traffic over the past six months
  2. Page types (Story, Recipe, Landing, Tag/Category, Slideshow, Search, Utility, etc.)
  3. Content that takes advantage of new features (new data fields, “cooking mode”)
  4. Content that uses certain components (video embeds, data tables, tables of contents)
  5. Content that relies on legacy styles and classes (smaller images, stuff that doesn’t follow current formatting standards)
  6. Performing tasks (printing, sharing, user registration)


After we decided on our approach, we needed to lay the groundwork for the testing to come, focusing on two main things: documentation of our plan, and clear decisions about how we were going to communicate.


We made a handbook, written in plain language, that laid out our expectations and the methods and parameters for testing.

Google Sheets is also your friend; it was a handy way to organize the URLs we were testing. Use a URL shortener so testers can get to the sheet easily on their devices. Here’s the documentation we included in a single sheet:

  1. Browser/OS matrix: Assignment of browsers and platforms to individuals for testing.
  2. List of URLs to test: based on the ruleset above. We pointed out any special formatting elements that testers should be aware of so that they could hit a comprehensive set.
  3. Verbal Description of Components: This was probably the most time-consuming part of the documentation to write, as each component’s appearance across screen widths had to be accounted for. I don’t think it’s always necessary, as long as you’re clear that people should submit bugs whenever something “looks funny” to them. They might not pick up on, say, the nuances of the gutter width in proportion to the main content well, but they sure as hell will tell you if something is hard to read, ads are misplaced, or images are stretched. Chances are, readers will experience that same trouble.
  4. Actions to Perform: tasks to carry out, like printing or registering, with a description of the expected outcome for each

Here’s what our browser/OS matrix and description of components actually looked like:

Browser assignment

Who’s doing what

Description of Visual Components

Descriptions help testers get a sense of what “normal” should look like


We settled on two ways of collecting bugs:

  • Slack: We set up a dedicated #qa public channel for people to ask questions and to notify the Product team that they had completed an assigned browser.
  • Forms: We created a simple bug submission form in Formstack and took advantage of BrowserStack’s built-in bug tracker. You could use Google Forms, SurveyMonkey, whatever floats your boat. We required a URL, OS/device, and description, with an optional screenshot; just be sure that you’re collecting enough information to reproduce the bug without having to go back to the submitter.
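The “enough information to reproduce” rule can be checked at submission time. This is a hypothetical sketch: the field names are invented for illustration, not anything Formstack-specific.

```python
# Required and optional fields for a bug report, per the rule above.
REQUIRED = ("url", "os_device", "description")
OPTIONAL = ("screenshot",)

def missing_fields(report):
    """Return required fields that are absent or blank;
    an empty list means the report is reproducible enough to file."""
    return [f for f in REQUIRED if not report.get(f)]

# Made-up reports for illustration only.
complete = {
    "url": "/recipes/best-chili",
    "os_device": "iOS 9 / iPhone 6",
    "description": "Print button overlaps the ingredient list.",
}
incomplete = {"url": "/recipes/best-chili"}
```

A form tool can enforce the same thing with required-field settings; the point is simply that a report missing the OS/device or a description sends you back to the submitter.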


Device libraries are nice to have, but not always possible! Alternatives we’ve used:

  • Personal Devices: We were fortunate that people in our office own a range of devices beyond the latest iPhone. Ask around to see if you can borrow them, or assign testing to the individuals who own those devices.
  • Services: Sauce Labs or BrowserStack give you access to a wide range of devices, browsers, OSs, and developer tools at a reasonable monthly price. The downside is that they’re a bit slower and can be more difficult to debug. BrowserStack has a handy feature where you can run it through localhost to help with bug fixing.
  • Virtual Machines: In the past we’ve set up virtual machines, but they take time to build and maintain, so we switched entirely to BrowserStack instead. You may find that you want more control, in which case VMs could be the way to go.

Example of BrowserStack screenshot bug capture

BrowserStack in action

Test & Fix It

Before rolling out the process to the entire team, I had a colleague do a dry run to test the documentation for any holes and to time how long it would take. After some tweaks, we let the team loose and gave them a week and a half to complete their assignments on their own schedules. Once you get it rolling, testing runs itself.

How you split up triage duties will vary by who’s on your team and what the aim of QA is. It made the most sense for me to triage, since we were mostly focused on QAing the visual design. I could tell if a similar problem had already been submitted and tack the new report onto an existing ticket for more information, or make a new ticket. I could also tell pretty quickly whether something was a CSS bug or should be investigated further by the developer.

We had to be very pragmatic with bug prioritization. We went through the matrix of priorities and also asked, for each bug:

  • Is it a launch blocker/critical?
  • How long do you think it will take to fix?
  • What can we live with?
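Those three questions boil down to a sort order. As a hypothetical sketch (the fields, IDs, and hour estimates are all invented), a backlog could be queued launch blockers first, then quick wins, with the bugs we can live with sinking to the bottom:

```python
def triage_order(bugs):
    """Sort bugs: launch blockers first, then things we can't live
    with, cheapest fixes first within each group."""
    return sorted(
        bugs,
        key=lambda b: (not b["blocker"], b["can_live_with"], b["est_hours"]),
    )

# Made-up bug backlog for illustration only.
bugs = [
    {"id": 1, "blocker": False, "can_live_with": True,  "est_hours": 1},
    {"id": 2, "blocker": True,  "can_live_with": False, "est_hours": 8},
    {"id": 3, "blocker": False, "can_live_with": False, "est_hours": 2},
]
queue = [b["id"] for b in triage_order(bugs)]
```

Here the blocker jumps the line despite its eight-hour estimate, and the cosmetic issue we can live with waits until after launch.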

Release It

Even though you’ve fixed the mission-critical bugs and launched, you’re not quite done yet! Your audience can continue to QA for you: we circulated a modified version of the internal bug submission form on social media. The feedback isn’t always helpful, but readers will tell you if something is broken, and a pile of reports for the same error helps with prioritization. For instance, we received a fair number of bug reports about printing, so we prioritized those fixes soon after launch.

A small team doesn’t have to skimp on QA; it just takes a bit of planning and collaboration. The process I’ve described evolved out of our available resources, so feel free to modify it for your particular situation. Asking everyone to participate helped set expectations for the release and gave us all a better understanding of the design process and the reader experience.



