Testing in Production

Written by: ophir

How to Accelerate Development Speed Without Compromising on Deployment Safety.

Development Processes

If you’re a mobile developer, you’re probably used to a workflow that looks something like this (putting aside the specific software development methodology you’re using):

Yes, I know that’s a “bit” simplified, but the overall idea is:

  1. Write Code

  2. Test It

  3. Release if it passes testing

  4. Fix it if it doesn’t pass testing

For mobile development, you also need to deal with versions, and you will want to give testers the latest version before it’s live in the App Store or Google Play. This means there is another step: distributing a non-live (i.e., beta) version of your app directly to users through TestFlight or an alternative tool:

How often you actually release new versions depends on quite a few factors, such as dev team size and process, QA team size and process, and the cost of having a bug in production. If you have a bug in the live production version of your app, it immediately impacts all of your users, and the only way to fix it is to release a new version with a bug fix, which looks something like this:

Speed vs Safety

One of the biggest problems with the above process is the inherent tradeoff between speed and safety. On one end of the spectrum, safety is king. Software quality is paramount, and a bug in production could be catastrophic (e.g., in a banking app). On the other end of the spectrum, speed is king. A production bug is a bummer, but not the end of the world, and it’s all about getting the latest and greatest features to your users as quickly as possible.

But there is another way ...

Testing in Production

The concept is quite simple. When developing a new feature or functionality, instead of the feature being on for everyone who downloads the latest version of your app, you install a mechanism, which can be controlled remotely, that decides who gets the feature and who doesn’t. For example, you could turn on a feature just for specific people:

  • QA testers

  • Internal employees

  • A randomly chosen 2% of all users

  • Users from specific countries
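
The targeting options listed above could be sketched roughly as follows. This is a minimal illustration, not the API of any particular flagging service: the `AppUser` and `RolloutRule` shapes, the `bucketFor` and `isFeatureOn` functions, and the example values are all assumptions made for this sketch. The percentage rollout hashes the user ID into a stable bucket, so a given user gets the same answer on every launch.

```typescript
// Hypothetical types and names, for illustration only.
interface AppUser {
  id: string;
  email: string;
  country: string; // e.g. "NZ"
}

interface RolloutRule {
  testerIds: string[];    // QA testers, listed explicitly
  internalDomain: string; // internal employees, matched by email domain
  countries: string[];    // users from specific countries
  percentage: number;     // 0 to 100: share of all other users
}

// Map a user to a stable bucket in [0, 100), so the same user gets the
// same answer every time the app runs. The hash is a simple illustrative
// string hash, not a recommendation for production use.
function bucketFor(flagName: string, userId: string): number {
  const key = flagName + ":" + userId;
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0; // wrap to 32 bits
  }
  return hash % 100;
}

function isFeatureOn(flagName: string, rule: RolloutRule, user: AppUser): boolean {
  const suffix = "@" + rule.internalDomain;
  if (rule.testerIds.indexOf(user.id) !== -1) return true;
  if (user.email.slice(-suffix.length) === suffix) return true;
  if (rule.countries.indexOf(user.country) !== -1) return true;
  return bucketFor(flagName, user.id) < rule.percentage;
}

const exampleRule: RolloutRule = {
  testerIds: ["qa-1"],
  internalDomain: "example.com",
  countries: ["NZ"],
  percentage: 2,
};
const qaTester: AppUser = { id: "qa-1", email: "qa@mail.test", country: "US" };
console.log(isFeatureOn("newCheckout", exampleRule, qaTester)); // true: listed as a QA tester
```

Including the flag name in the hash key means a user who lands in the 2% for one feature is not automatically in the 2% for every other feature.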

From a development perspective, the developer writing the code puts the new feature behind a feature flag (also commonly called a feature toggle). Then, when the app runs, if the flag is on, the new code is executed, and if the flag is off, it isn’t executed. Something like this:

if (Flags.newFeatureName.enabled) {
  // code that runs when the new feature is enabled
} else {
  // code that runs when the new feature is not enabled
}

To control the flags, you use a feature deployment solution (also known as a feature flagging service) that lets you define target groups that determine who gets the feature and who doesn’t. These target groups can be defined by both generic device attributes, such as device type or OS version, and custom attributes, such as userType (i.e., free or paying) or the number of friends for a social app. Once a new feature has been tested in production with a small set of users and looks good, you can push it to everyone simply by changing the criteria for who gets the feature. This is what it looks like:
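
In code, attribute-based target groups might be evaluated along these lines. The rule format here (`Condition`, `FlagConfig`, the `eq` and `gte` operators, and the attribute names) is hypothetical and chosen only for this sketch; an actual flagging service will define its own. The point it tries to show is that the targeting criteria are data delivered to the app, so widening a rollout does not require a new app version.

```typescript
// Hypothetical rule format, for illustration; real services define their own.
type Attributes = { [key: string]: string | number };

interface Condition {
  attribute: string;        // e.g. "osVersion" or the custom "userType"
  op: "eq" | "gte";
  value: string | number;
}

interface FlagConfig {
  enabledForAll: boolean;   // flip to true to push the feature to everyone
  targetGroup: Condition[]; // all conditions must match
}

function conditionMatches(cond: Condition, attrs: Attributes): boolean {
  const actual = attrs[cond.attribute];
  if (actual === undefined) return false; // unknown attribute: no match
  if (cond.op === "eq") return actual === cond.value;
  return typeof actual === "number" && typeof cond.value === "number" && actual >= cond.value;
}

function evaluateFlag(config: FlagConfig, attrs: Attributes): boolean {
  if (config.enabledForAll) return true;
  if (config.targetGroup.length === 0) return false; // no target group: off
  return config.targetGroup.every((cond) => conditionMatches(cond, attrs));
}

// The configuration would arrive from the flagging service as data.
const limitedRollout: FlagConfig = JSON.parse(
  '{"enabledForAll": false, "targetGroup": [' +
  '{"attribute": "userType", "op": "eq", "value": "paying"},' +
  '{"attribute": "osVersion", "op": "gte", "value": 15}]}'
);
const payingUser: Attributes = { deviceType: "phone", osVersion: 16, userType: "paying" };
const freeUser: Attributes = { deviceType: "phone", osVersion: 16, userType: "free" };

console.log(evaluateFlag(limitedRollout, payingUser)); // true
console.log(evaluateFlag(limitedRollout, freeUser));   // false

// Widening the rollout is a configuration change; the app code is untouched.
const fullRollout: FlagConfig = { enabledForAll: true, targetGroup: [] };
console.log(evaluateFlag(fullRollout, freeUser));      // true
```

Treating an empty target group as "off" is a deliberate choice in this sketch: a flag with no criteria stays dark until someone explicitly targets it.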

Truth be told, testing in production is nothing new. Tech giants like Amazon, Facebook, and Google have been testing in production for many years. Even not-so-giant companies are testing in production. The benefits of being able to deploy new features to live users in a controlled fashion, measure the impact, and revert those changes if needed are just too big to ignore.

Additional Benefits

Another reason mobile dev teams are using feature flags is that they accelerate the development process itself. If you’ve ever had to deal with a merge conflict, you know that long-lived feature branches can cause delays when merging back to the main branch. On the other hand, adding new feature code to the main branch before it’s finished can cause deployment delays if the code isn’t ready in time. By putting new code behind a feature flag, developers can safely add unfinished code to the main branch and not worry about it holding up a build.

Why isn’t everyone testing in production?

The main reason not all companies test in production is usually a matter of resources. First of all, you need a technical solution to handle the feature flagging control mechanism, so that only specific users get the new features you want to test. Until recently, there were no third-party services that provided feature flags as a service. If you talk to any of the big tech companies (and we have), they all created their own in-house solution because, at the time, there was no build-vs-buy option. The only option was to build it themselves. For smaller dev shops, allocating resources to build such a tool simply isn’t cost-effective.

There is also some development overhead (and technical debt) when putting new features behind feature flags. While this type of technical debt is both prudent and deliberate, it only makes sense if the payoff is greater than the effort. On one side of the scale, you have the developer’s time, and on the other side, you have the potential gain of releasing faster and avoiding issues in live apps. For individual developers, this is a “nice to have, but not worth it” capability. For tech giants, this is an obvious “need to have and very much worth it” capability. From the conversations we’ve had with dev teams, both small and large, for teams of 3 people or more the benefits usually outweigh the costs, and putting new features behind feature flags is high on their to-do list (if they aren’t doing it already).

Lastly, you need to be able to measure the impact of a new feature, specifically by comparing users who have the feature turned on with users who have it turned off. Luckily, there are plenty of mobile APM and analytics solutions, such as New Relic, Flurry, or Google Analytics, and integrating with either third-party or in-house analytics is something your feature flagging tool should provide.
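
As a rough illustration of the on-versus-off comparison, the sketch below computes a crash rate for each cohort. The `SessionRecord` shape and the `compareCohorts` function are assumptions made for this example; they do not correspond to the data model of New Relic, Flurry, Google Analytics, or any other tool.

```typescript
// Hypothetical record shape; real analytics tools have their own schemas.
interface SessionRecord {
  flagOn: boolean;  // whether the feature was on for this session
  crashed: boolean; // whether the session ended in a crash
}

interface CohortStats {
  sessions: number;
  crashRate: number; // crashes divided by sessions, or 0 for an empty cohort
}

function cohortStats(subset: SessionRecord[]): CohortStats {
  const crashes = subset.filter((r) => r.crashed).length;
  return {
    sessions: subset.length,
    crashRate: subset.length === 0 ? 0 : crashes / subset.length,
  };
}

// Split sessions by flag state and compute the same metric for each side.
function compareCohorts(records: SessionRecord[]): { on: CohortStats; off: CohortStats } {
  return {
    on: cohortStats(records.filter((r) => r.flagOn)),
    off: cohortStats(records.filter((r) => !r.flagOn)),
  };
}

const sampleSessions: SessionRecord[] = [
  { flagOn: true, crashed: false },
  { flagOn: true, crashed: false },
  { flagOn: true, crashed: false },
  { flagOn: true, crashed: true },
  { flagOn: false, crashed: false },
  { flagOn: false, crashed: false },
];
const comparison = compareCohorts(sampleSessions);
console.log(comparison.on.crashRate);  // 0.25
console.log(comparison.off.crashRate); // 0
```

A raw difference like this is only a starting point. With small cohorts, such as an early 2% rollout, you would want enough sessions on each side before drawing conclusions.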

Summary

Testing in production using a feature toggle solution has tremendous benefits and relatively few downsides. If you’re part of a team, putting new features behind flags has the added benefit of faster development. Tech giants like Amazon, Facebook, and Google are already doing it, and it’s time you did as well.
