Optimization Summer Games - Get Started with A/B Testing
Results are always greener on the other side
Lessons learned from failed or inconclusive experiments
Lev Tatarov
Strategy Consultant
@LTatarov
You are not going to get wins all the time!
* N = 90k, May 2014 - July 2016, >=10k visitors, wins = significant uplift on 1 or more goals
Inconclusive results
No wins
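The footnote defines a win as a significant uplift on at least one goal. As a rough illustration of what "significant" can mean, here is a minimal sketch of a two-sided two-proportion z-test. This is an assumption for illustration only: the slides do not say which statistical method produced their numbers, and the visitor and conversion counts below are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (relative_uplift, two_sided_p_value) for variation B vs. original A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis "A and B convert equally".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    uplift = (p_b - p_a) / p_a
    return uplift, p_value

# Hypothetical counts: 10,000 visitors per variation.
uplift, p_value = two_proportion_z_test(500, 10_000, 520, 10_000)
is_win = uplift > 0 and p_value < 0.05  # 0.05 is an assumed threshold
print(f"uplift={uplift:.1%}, p={p_value:.3f}, win={is_win}")
```

With these made-up counts the variation converts slightly better, yet the difference is not significant, so under this definition the experiment would count as inconclusive rather than as a win.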
We need to get better at learning from losing and inconclusive experiments!
Hypothesis: If we add press mentions at the bottom of the homepage, we will generate more clicks on the CTA because it will create trust in the brand
Blacklane
Result: No significant difference
A
B
Conclusion: Visitors are not driven to convert by press mentions
Next steps: ...???
A great hypothesis begins with the problem, not the solution
Problem Solution Result
Meaningful hypotheses drive focus
[Diagram: a company goal over time, with each problem followed by several candidate solutions]
Insight #1
Strong hypotheses enable learning from failures
Start with a meaningful problem definition
Hypothesis: Because we have unused real-estate above the fold on the homepage, if we add press mentions, we will increase booking CTA conversion
Blacklane
Result: No significant difference
A
B
Conclusion: Visitors are not driven to convert by press mentions
Next steps: What else can we use this real-estate for?
Hypothesis: Because we have unused real-estate above the fold on the homepage, if we add USPs, we will increase booking CTA conversion
Blacklane - next solution
Result: Increased conversion on CTA!
B
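The two Blacklane hypotheses share one problem and differ only in the solution. A minimal sketch of that "Because [problem], if [solution], [result]" template follows; the `Hypothesis` class and its method names are hypothetical, introduced only to show how a failed solution can be swapped out while the problem stays fixed.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Hypothesis:
    problem: str          # what is wrong or unused today
    solution: str         # the change being tested
    expected_result: str  # the metric we expect to move

    def statement(self) -> str:
        return f"Because {self.problem}, if {self.solution}, {self.expected_result}"

    def next_solution(self, solution: str) -> "Hypothesis":
        # Keep the problem and the expected result, try a different solution.
        return replace(self, solution=solution)

first = Hypothesis(
    problem="we have unused real-estate above the fold on the homepage",
    solution="we add press mentions",
    expected_result="we will increase booking CTA conversion",
)
second = first.next_solution("we add USPs")

print(first.statement())
print(second.statement())
```

Because the problem is stated explicitly, an inconclusive first test still leaves an obvious next step: the same problem with another solution.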
Hypothesis: Because videos are more engaging and informative, if we use them instead of photos on the product page, conversion will increase
Chrome Industries
Result: +0.2% in conversion
A
B
Smoke testing
• Use a facade to test whether there is general interest in a certain functionality
• Saves the effort of full implementation
• Allows gradual testing of functionalities
Insight #2
Use smoke testing to measure demand for complex functionality before it is built
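One way to picture a smoke test is a facade that looks like the real feature but only records interest. The sketch below is an assumption about how this could be wired up, not the setup used in the talk; the `SmokeTest` class, its method names, and the view and click counts are all hypothetical.

```python
from collections import Counter

class SmokeTest:
    """Facade for a feature that is not built yet: it only counts interest."""

    def __init__(self, feature: str):
        self.feature = feature
        self.counts = Counter()

    def record_view(self) -> None:
        # A visitor saw the entry point (e.g. a "Watch video" button).
        self.counts["views"] += 1

    def record_click(self) -> str:
        # A visitor tried to use the feature; show a placeholder instead.
        self.counts["clicks"] += 1
        return f"{self.feature} is coming soon"

    def interest_rate(self) -> float:
        views = self.counts["views"]
        return self.counts["clicks"] / views if views else 0.0

# Hypothetical traffic: 200 visitors see the facade, 30 click it.
test = SmokeTest("Product videos")
for visitor in range(200):
    test.record_view()
    if visitor < 30:
        test.record_click()

print(f"interest rate: {test.interest_rate():.0%}")
```

The interest rate then informs whether the full implementation is worth its cost before any of it is built.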
Hypothesis: Because videos are more engaging and informative, if we use videos instead of photos on the product page, conversion will increase
Chrome Industries
Result: +0.2% in conversion
A
B
Conclusion: The difference between variations is not big enough to justify the production costs of videos for all products
Hypothesis: Because of users’ reading habits (F shape), if we move the videos link to the left side of the menu, it will be more noticeable and will drive more visitors to the videos page
IGN
Result: -92.3% in clicks
A
B
Insight #3
A significant drop in an important metric might mean you found something your users care about or are sensitive to
Why? Visitors believed that the section was deleted, didn’t bother looking for it, or went to find the videos elsewhere (YouTube)
Who? After segmenting the results, it was clear that the change affected mostly returning visitors
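Segmenting means computing the same metric separately per visitor group and per variation. The sketch below shows the idea on made-up data; the event format, the function name, and all counts are assumptions for illustration and are not IGN's actual results.

```python
from collections import defaultdict

def click_rate_by_segment(events):
    """events: iterable of (variation, segment, clicked) tuples.

    Returns {(variation, segment): click_rate}.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [clicks, visitors]
    for variation, segment, clicked in events:
        entry = totals[(variation, segment)]
        entry[0] += int(clicked)
        entry[1] += 1
    return {key: clicks / visitors for key, (clicks, visitors) in totals.items()}

def make_events(variation, segment, clicks, visitors):
    # Helper that builds hypothetical visitors, the first `clicks` of whom clicked.
    return [(variation, segment, i < clicks) for i in range(visitors)]

# Hypothetical data: new visitors behave the same, returning visitors do not.
events = (
    make_events("A", "new", 2, 10)
    + make_events("B", "new", 2, 10)
    + make_events("A", "returning", 5, 10)
    + make_events("B", "returning", 0, 10)
)

rates = click_rate_by_segment(events)
for key in sorted(rates):
    print(key, f"{rates[key]:.0%}")
```

In this made-up example the overall drop comes entirely from returning visitors, which is the kind of pattern a segmented view makes visible.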
* https://blog.optimizely.com/2014/10/30/the-problem-with-ab-testing-success-stories/
Hypothesis: Because the current layout was not clean and clear enough, if we give the user an obvious next step and remove distractions, cart check-out rates will increase
Rubylane
Result: Inconclusive
A
B
What happened?
Insight #4
If results are very unexpected, take some time to validate your test design
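One common way to validate a test design is to check whether traffic was actually split as intended (a sample ratio mismatch check). The slides do not say what was checked in the Rubylane test, so the sketch below is only an assumed example of such a validation step, with hypothetical visitor counts and an assumed 50/50 split.

```python
import math

def sample_ratio_p_value(n_a, n_b, expected_share_a=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) on the traffic split.

    A very small p-value suggests visitors were not allocated as designed.
    """
    total = n_a + n_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = (n_a - expected_a) ** 2 / expected_a + (n_b - expected_b) ** 2 / expected_b
    # For 1 degree of freedom, the chi-square tail equals a two-sided normal tail.
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts for a test designed as a 50/50 split.
p_ok = sample_ratio_p_value(10_000, 10_100)
p_suspicious = sample_ratio_p_value(10_000, 11_000)

print(f"balanced split:   p={p_ok:.3f}")
print(f"unbalanced split: p={p_suspicious:.2e}")
```

A very low p-value here points at the test setup (targeting, redirects, tracking) rather than at visitor behavior, so the results should not be interpreted until the setup is fixed.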
Hypothesis: Because the current layout was not clean and clear enough, if we give the user an obvious next step and remove distractions, cart check-out rates will increase
Rubylane - round 2!
Result: +5% cart check-out
A
B
Recap
● Strong hypotheses enable learning from failures: start with a meaningful problem definition
● Small or inconclusive impact might mean that you are not testing something your users care about
● Use smoke testing to measure demand for complex functionality before it is built
● A significant drop in an important metric might mean you found something your users care about or are sensitive to
● If results are very unexpected, take some time to validate your test design
Thanks for listening!
Lev Tatarov
Strategy Consultant
@LTatarov
Questions?