🧙The Dark Art of A/B Testing Part 2: Gamifying product development

In Part 1 we talked about why A/B tests are so great. They give you a way to see the provable differences your changes cause. If you properly use this information it gives you greater access to the underlying truth about what happened.

That means greater compounded learning about your user base over time. Greater learning rates about your user base lead to more “aha” moments. Those eventually lead to a greater frequency of high impact changes. Ultimately, this all results in more user love and more $$$.

In Part 2 I want to talk about:

  • the paradigm shift A/B testing can enable for product development on a team, division, and company level,
  • how A/B testing allows you to gamify product development and what it looks like if you do,
  • why it’s so hard to compete with companies like Facebook and Google who’ve had this approach ingrained in their company DNA since birth.

Wait a minute, if it’s so great, why isn’t everybody already doing it?

I’ve spent a great deal of time agonizing over this question. I think the simple truth is A/B testing is still crossing the chasm of being understood for what it enables. It’s not easy to jump into. If you are just starting out, there’s a lot that can go wrong. 

Nascent Ecosystem

The ecosystem is not mature enough to do you many favors. In a multidisciplinary team already short on time, unless you get everybody fully bought in, it’ll only create more problems instead of the promised compounded learning and financial growth. Assuming you don’t have a spare couple hundred million bucks lying around like Apple or Microsoft to hire a team of experts to build your own system, that leaves you in a precarious situation.

Reputation Problem

A/B testing also has a reputation problem. It’s seen as a technical detail for data nerds. It’s seen as a mysterious process that some people praise as a holy grail. It’s believed to be a simplistic technique for seeing what different colors of a button do. It’s scoffed at as a cop out for people who don’t want to do real user research. It’s criticized as a technique used by incrementalists who lack vision. I’ve even encountered people who straight up don’t believe the math works.

Taking Flight

But at the end of the day, it is just a technique that enables you to prove differences as we examined in Part 1. That simple little technique added to your product development repertoire at the right level of abstraction is so incredibly powerful it slowly causes a paradigm shift (and you can speed it along). Having been there before, let me share with you exactly what that looks like so you can judge for yourself.

Gamifying product development

At my startup Asgard Analytics our mission is to bring the types of tools that corporate behemoths like Netflix and Amazon have to everyone, with 1 line of code. With them, you can gamify product development at your own company or just for yourself. Our current thesis is that, to make the transition seamless, you need:

  1. Auto collection of absolutely all of the data that you have. 
    1. clicks, scrolls, UI that’s viewed, survey results, chat logs, sales transcripts, email opens, and so on
  2. Automatic A/B testing for all the changes you roll out.
  3. Auto generated metrics for every single thing that you have data about.
  4. Auto generated stories composed of your metrics to explain how different parts of your user base are affected by your changes.

Designers, product managers, engineers, sales people, and so on can understand these stories with ease and incorporate them into their workflow. You can even aggregate all the ongoing explorations monthly, quarterly, and yearly to provide different granularities of “situation reports” accessible to anyone on the frontlines all the way up to the CEO.

As you can see, the focus is to get as close to “auto” in each of the four areas as possible. That’s because the less time something takes, the more tractable the use cases following those steps become. Those following steps tend to be:

  1. If you like building big features, you’ll find some of them do 100x better than others.
  2. Because it’s annoying to spend 3 months building one feature that gets 1x the impact when another one gets 100x, you’ll start to think of smaller ways to de-risk ideas first.
    1. If smaller changes work, you’ll have the ammunition to invest more and more time into an idea.
    2. If an idea fails, you’ll go truth seeking.
      1. You’ll look for more inputs from:
        1. design sprints, user research, domain research, surveys, ideas from teammates, ideas from customers, ideas from sales, ideas from support, ideas from marketing, ideas from analysts.
      2. Alternatively, you’ll have sufficient evidence to abandon the idea.
  3. Any gap between “what you think happened” and “what you know happened” will disappear. Your results are provable after all. If you have teammates, you’ll find everyone becomes more engaged since you’ll be working off the same context: how your user base truly reacted to your changes. Not someone’s gut feelings.
  4. Any unhealthy dynamics around “subjective” behavior will disappear. Instead, everyone will be anxious to get feedback from the user base about the next thing they try.
  5. The inherent gamification feeling of improving the product will cause people to seek gratification faster. To do so, multidisciplinary teams naturally begin to form a healthy bond:
    1. Designers will work even closer with engineers trying to create the best minimal UX needed to test an idea.
    2. Engineers will begin to focus less on how elegant their code is but more on sustainability for a faster rate of change.
    3. PMs will zoom out from UX implementation to steering exploration in the direction that broader user research, strategic, and financial goals of the company encourage.

When I think of A/B testing it’s exactly this gamification of product development that comes to mind. It is absolutely thrilling. If you can achieve the right setup and get your team bought in, not only are there tremendous financial effects, there are also incredible ramifications on the overall “fun” you have as a team.

But beyond the technical shift needed, which I will cover some of in Part 3 of the series, you will also need to make a mindset and cultural shift. Much like a rocket launch, you either escape the atmosphere or your rocket falls back down to earth, crashes, and burns. To keep that from happening, it’s important to go about it step by step, objection by objection.

Breaking down barriers

We can break down the non-technical transformation into common objections. Here are the most common ones I’ve had to overcome, ordered from most derailing to least derailing:

  1. It won’t add value.
  2. We won’t learn anything new.
  3. It will make things worse.
  4. It’ll ruin our vision.
  5. It’s too incremental.
  6. It’ll make us go slow.
  7. It’s too hard to set up.
  8. It won’t work here.

Phew! That’s a lot of objections to overcome.

Whether you have these concerns yourself or your teammates do, we’ll go through them one by one here. The difficulty level of this quest is closer to “The Fellowship of the Ring” than the fun yearly grind of “Harry Potter”. Let’s get going then.

Objection #1: It won’t add value

This one makes total sense. The reasoning goes: we make $Z amount of money a year now and never did A/B testing before. Why do it now? While it’s true most companies in existence today never had to rely on A/B testing, the tide is turning. We’ve seen the ascension of product-led divisions at Facebook, Netflix, and Google that rely heavily on some version of a product development cycle involving A/B testing. There’s a threefold argument for why you don’t want to lag behind in the compounding growth game.

Competition

In Part 1 we had you in the red universe competing with yourself in the blue universe. In that thought experiment, what red you did had no effect on what blue you did. In the real world, your competitors will exploit your weaknesses and syphon off your user base. 

Product development is an arms race. If someone out there can “change the world” (improve their product) at 2X the rate you can, it’s only a matter of time before their product becomes significantly better than yours. If you “don’t have a competitor” there is a lot of historical evidence that shows you want to penetrate 80% of your market with a highly satisfying solution as fast as possible if you want to secure your position. Competition is not always obvious either. You never know who’s stalking you in the shadows. For example, YouTube syphoned MySpace’s entire user base through embedded videos before MySpace noticed. 

Commoditization curve

Humans constantly grow disillusioned with things that at first bring them great joy. Psychologists call this “the hedonic treadmill”. Your product is valued most when it is sparkly and new. You want to fight the commoditization curve where your product is no longer even a “meh” in your customers’ minds. Time is your biggest enemy. A pace fast enough to constantly give your customers that same “new value” feeling is critical. To do that you want to minimize the time spent building out things you think people want and find the things they do want more often.

Financial maximization

Lastly, there’s the very practical reason that if you want to maximize wealth for yourself or for your shareholders, you want to accelerate your iteration speed to its maximum. Products spread fastest through word of mouth and that will only happen if you are constantly blowing your customers’ minds. To do so you need more hits and fewer misses, which requires a deeper understanding of what happens with every change.

Objection #2: We won’t learn anything new

This one is a problem with the maturity of the current ecosystem. There are not enough good resources, tools, and human translators to bridge the gap for non-data/non-technical folks yet. A/B tests have a reputation for testing trivial things like button colors, or for only letting you give a thumbs up or a thumbs down on one metric at a time, like conversion.

As we said in Part 1, A/B testing is just a tool to prove differences about changes. The complexity of what differences you measure can vary dramatically. If you measure simple things, you will learn simple things. If you still release giant features as A/B tests, you might have trouble attributing the pieces responsible for the differences you see.

It is entirely possible that you won’t learn anything new. That hinges on whether or not you have some version of the 4 parts I listed above. While we’re trying to automate that all away at Asgard, nothing is stopping you from putting together ways to perform those things more manually to get to your own version of it. I’ll cover more details in Part 3 of this series as well.

Objection #3: It will make things worse

People are creatures of habit. If something has brought us success in the past, we become conditioned to believe it will bring us success in the future. Albert Einstein, a world renowned genius, cranked out five groundbreaking papers at the age of 26. New technology leveraging his discoveries, like GPS, is still being created today. After 26, he was presented with strong contradictory evidence that his method of thinking was saturated. He refused to relent. Unfortunately, the same stubbornness that brought him tremendous success stalled his career until he died at the age of 76.

In order to convince someone something new is for the best you need to:

  1. Understand what they value today and why.
    1. What brought tremendous success for them in the past?
    2. What do they value about that approach?
  2. Convince them of a small start so they feel safe to revert back to their old ways. 
    1. “We’ll try it 5 times and see how we like it.”
    2. “After 1 month we’ll reassess.”

By meeting them where they are today, you can “ladder” up from the things that are good about their approach and appeal to logic on the things that are bad. For example:

  1. Good
    1. Them: “I really like how if the team sets our mind to it we can build so many things so quickly until we finally find a solution customers are satisfied with.”
      1. By “laddering” you agree and say that’s an incredible part of product development. 
      2. And wouldn’t it be even better if, on top of finding the solution, we could also find it faster by assessing our existing product and the change we made in great detail? Then it would take us fewer attempts per change, which would allow us to get even more done.
  2. Bad
    1. Them: “It gives me a rush to know I understand our customers so well I can sometimes just come up with ideas out of thin air and when we implement them they work.”
      1. Hmm, that sounds fun for you but doesn’t sound fun for the rest of the team when they feel like they just executed on something meaningless the rest of the time.
      2. Wouldn’t it be more fair to draw more of the ideas out of our customers directly so that if our ideas fail, we only have our interpretation to blame for why we failed?

Objection #4: It’ll ruin our vision

A vision is “how will the world change after we succeed”. For example, “We will become an interplanetary species.” No execution details should affect your vision. Product development innovation will only influence your strategy and execution. You’ll be able to accomplish your mission and enable your vision by iterating and learning so much faster. Executing on your strategy should only get easier.

Objection #5: It’s too incremental

A/B testing is incremental. Not A/B testing is also incremental. Again, from Part 1, A/B testing helps you prove a difference is due to your change. If your change is a 3 year magnum opus, you can still roll it out as an A/B test. Now, you might really, really not want to see that it’s a failure if that’s what the results tell you. You’ll also no longer have a “story” to understand the change but an entire “novel” that can be hard to get through.

So what should you do if you want to have a big marketing release? No problem, just A/B test multiple features with 5% segments in parallel or serially. Fix the ones that make things worse. Then combine them into one final A/B test to see if all together they are an improvement. Afterwards, write a big PR piece and release it with certainty to 100% of your user base. Everyone will love it since you already know they will.
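As a rough illustration of running several features on 5% segments in parallel, here is a minimal Python sketch. Everything in it is an assumption made for the example: the feature names, the `big_release` salt, and the choice to keep the segments non-overlapping so no user sees two experimental features at once.

```python
import hashlib

# Hypothetical feature names; each one gets its own 5% slice of users.
FEATURES = ["feature_1", "feature_2", "feature_3"]
SEGMENT_PERCENT = 5


def slot(user_id: int, salt: str = "big_release") -> int:
    """Map a user ID to a stable slot from 0 to 99."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def feature_for(user_id: int):
    """Return the one feature this user tests, or None if they see no change."""
    index = slot(user_id) // SEGMENT_PERCENT
    return FEATURES[index] if index < len(FEATURES) else None
```

Because the slot comes from a hash of the ID, a user lands in the same segment on every visit. Users who get `None` keep the existing product and serve as the comparison group.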

Objection #6: It’ll make us go slow

A/B testing does take between two weeks and a month to determine the worth of a change. That doesn’t stop you from trying many different things in parallel though. At Microsoft, one team I worked with went from releasing 1 change a year without A/B tests to 47 changes the next year with A/B tests.

As I mentioned above, because it’s addictive to release 100x better changes, people stop trying to only build big risk blockbusters. There’s also less risk that you’ll upset customers because you only try it with a small percentage of them at a time. By evaluating whether or not something was good or bad, you can shut it down quicker. The gamification effects greatly outweigh the necessary time to determine with certainty the results of an A/B test.

Objection #7: It’s too hard to set up

To get started with A/B testing you need 5 things:

  1. An ID for each person. (John: 1, Philipp: 2, Devin: 3, Geoff: 4)
  2. Randomization to put 50% of people in bucket A and 50% in bucket B.
  3. An on switch to show bucket B the new change.
  4. An uploaded “row” for each event you’d like to measure the change in.
  5. A calculator.

To get to the point where you can actually gamify product development will take a lot more, but this is all you need to start. I’ll focus more on the nuts and bolts of how things work in Part 3.
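The five pieces above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a production setup: the function names, the `new_change` experiment label, the hash-based randomization, and the choice of a two-proportion z-test as the “calculator” are all mine for illustration.

```python
import hashlib
import math


def assign_bucket(user_id: int, experiment: str = "new_change") -> str:
    """Items 1 and 2: turn an ID into a stable, roughly 50/50 bucket."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "B" if int(digest, 16) % 2 else "A"


def show_new_change(user_id: int) -> bool:
    """Item 3: the on switch. Only bucket B sees the new change."""
    return assign_bucket(user_id) == "B"


def make_row(user_id: int, converted: bool) -> dict:
    """Item 4: one 'row' per user for the event you care about."""
    return {"user_id": user_id, "bucket": assign_bucket(user_id), "converted": converted}


def summarize(rows) -> dict:
    """Count [conversions, users] for each bucket."""
    counts = {"A": [0, 0], "B": [0, 0]}
    for row in rows:
        counts[row["bucket"]][0] += int(row["converted"])
        counts[row["bucket"]][1] += 1
    return counts


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Item 5: the calculator. Returns (z, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no detectable difference
    z = (p_b - p_a) / se
    return z, math.erfc(abs(z) / math.sqrt(2))
```

For example, `two_proportion_z_test(100, 1000, 130, 1000)` compares a 10% conversion rate in bucket A with 13% in bucket B and gives a p-value of about 0.035. Hashing the ID instead of flipping a coin on each visit means the same person always lands in the same bucket.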

Objection #8: It won’t work here

If you are purely non-digital with no analytics, it will not work. If you have fewer than 100 visitors a month to your website, it will also not work. Otherwise, it will work. The more visitors, users, and customers you have, the more certainty you will have. But in my opinion 20% uncertainty is still better than the 99% uncertainty of not doing it.
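One way to make “more visitors, more certainty” concrete is the margin of error around a measured rate. The sketch below assumes the event is a simple yes/no conversion, uses the normal approximation with a 95% interval, and picks a 10% conversion rate purely as an example. The function name is hypothetical.

```python
import math


def margin_of_error(rate: float, visitors: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a conversion rate (normal approximation)."""
    return z * math.sqrt(rate * (1 - rate) / visitors)


for visitors in (100, 1_000, 10_000, 100_000):
    moe = margin_of_error(0.10, visitors)
    print(f"{visitors:>7} visitors: 10% +/- {moe * 100:.1f} points")
```

With 100 visitors the margin is about 6 percentage points, which swamps most realistic improvements. With 10,000 it shrinks to about 0.6 points, because the margin falls with the square root of the number of visitors.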

Recap

In Part 1, we took a stab at understanding A/B testing intuitively. In Part 2, we talked about how A/B testing allows your company to gamify product development and what that looks like in practice. In Part 3, I will attempt to explain how A/B testing works for real in the simplest, most accessible manner and expand on the bare-bones setup from Objection #7.

If you want to jury-rig such a setup at your company, Part 3 will be for you. Alternatively, feel free to reach out to work with us at Asgard Analytics.


If you want to be notified when Part 3 comes out, I post on Twitter, LinkedIn, and the email list below.

Subscribe to get notified of new posts by email
