building an octopus
How to build an Octopus - Introduction to our new A/B testing and personalization service
August 13, 2015
Soroosh Sarabadani and Till Riffert
What is A/B testing?
Variant A: blue
Variant B: orange
50% get variant A
50% get variant B
11% conversion
23% conversion
● A/B testing measures the impact of a variant compared to another variant
● The general principle is also used for more than two variants and e.g. on article level
● Breaking down results by population properties (age, gender, order history) forms the basis for personalization
Baseline
Δ = +12 percentage points
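The 50/50 split and the delta above can be sketched in a few lines. This is only an illustration, not how Octopus assigns variants: the hashing scheme, the function names and the visitor counts are assumptions chosen to reproduce the 11% and 23% on the slide.

```python
import hashlib


def assign_variant(entity_id: str, test_name: str, variants=("A", "B")) -> str:
    """Deterministically map an entity to one of the variants with equal weight."""
    digest = hashlib.sha256(f"{test_name}:{entity_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who converted."""
    return conversions / visitors


def delta_percentage_points(baseline_rate: float, variant_rate: float) -> float:
    """Difference between variant and baseline, in percentage points."""
    return (variant_rate - baseline_rate) * 100


# Hypothetical counts chosen to reproduce the 11% and 23% on the slide.
rate_a = conversion_rate(110, 1000)
rate_b = conversion_rate(230, 1000)
delta = delta_percentage_points(rate_a, rate_b)
print(f"Variant A: {rate_a:.0%}, Variant B: {rate_b:.0%}, delta: {delta:+.0f} percentage points")
```

Hashing the entity id keeps the assignment stable, so the same customer sees the same variant on every visit.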
Why do I need a central solution for A/B testing?
Scenario: two teams independently perform testing and steering
First team:
● What is the impact of a 10€ voucher?
○ Give a voucher to half of the test customers
○ Do not give a voucher to the other half
○ Measure impact, e.g. profit up by 12€ per customer
Second team:
● What is the impact of a 20€ voucher on top customers?
○ Interferes with the voucher test by the first team
○ Top customers in the control group of the first team get a voucher now
○ Impact measured in testing is wrong
○ Profitable treatment is not pursued due to interference
○ Centralizing all tests and steerings solves this issue!
Octopus call (decision):
In: steering point id, environment, entity id
Out: variant, correlation id
Feedback:
In: correlation id, steering point id, used variant, reason, environment
Out: -
Registration:
In: name, entity type, variants, weights, environment
Out: steering point id
Octopus is a service for including A/B testing and personalization capabilities into your features
Feature (built by engineer)
Octopus (A/B testing & steering)
Steering Point
steering_point = "ecbf6791-102b-4ce6-b777-2b472dbfb550" //provided at registration
app_domain_id = 1
customer_id = "98ba7d91ed2df8eb0385fe7c1b093498"
GET /variants/$app_domain_id?steering_point_id=$steering_point&customer=$customer_id
result: {
"offered_variant":$offered_variant,
"correlation_id": $correlation_id
}
PUT /feedbacks/$correlation_id
payload: {
"selected_variants": $selected_variant,
"reason": "Not applicable" //only if offered_variant not used
}
Incorporating Octopus in your code takes only a GET and a PUT
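A minimal client-side sketch of the GET and the PUT, using only the Python standard library. The paths, query parameters and payload fields follow the slide. The host name `octopus.example.org`, the function names and the JSON content type are assumptions. The requests are only built here, not sent.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://octopus.example.org"  # hypothetical host, not the real service


def build_decision_request(app_domain_id: int, steering_point_id: str, customer_id: str) -> Request:
    """GET request asking which variant to offer to a customer."""
    query = urlencode({"steering_point_id": steering_point_id, "customer": customer_id})
    return Request(f"{BASE_URL}/variants/{app_domain_id}?{query}", method="GET")


def build_feedback_request(correlation_id: str, selected_variant: str,
                           reason: Optional[str] = None) -> Request:
    """PUT request reporting which variant was actually used."""
    payload = {"selected_variants": selected_variant}
    if reason is not None:
        # Only needed if the offered variant was not used.
        payload["reason"] = reason
    return Request(
        f"{BASE_URL}/feedbacks/{correlation_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="PUT",
    )


decision = build_decision_request(
    1, "ecbf6791-102b-4ce6-b777-2b472dbfb550", "98ba7d91ed2df8eb0385fe7c1b093498"
)
feedback = build_feedback_request("some-correlation-id", "orange")
print(decision.get_method(), decision.full_url)
print(feedback.get_method(), feedback.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen` and read `offered_variant` and `correlation_id` from the JSON response of the GET.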
Octopus
● ... offers you live quality control functionalities
○ Automatic stopping when a variant significantly underperforms
○ Automatic notification of sudden deviations from previous behaviour
● ... has the overview of all tests and personalizations
○ Examine all tests and steerings in one place
● ... automatically detects and resolves interference
○ Checks all running tests and personalizations for interference
○ Automatically applies the layer system to resolve interference
● ... provides state-of-the-art analysis
○ Uses the analysis framework built by the testing centre
○ Incorporates dashboards for exploring your data
Using Octopus gives you access to convenient features, simplifying monitoring and analysis
The Octopus MVP is built and ready to use
Octopus status: August 13th
Current version:
● Most important APIs ready to use (registration, decision, feedback)
● GUI for registration and management of all of a team's steering points
● Integrated forced assignment of variant (black-/white-lists)
● QA integration so unit/integration tests for steering points can be built
● Several steering points can be requested at once using batch mode
Next steps (end of Q3):
● Integration of within-variant monitoring (anomaly detection)
● Integration with existing analysis module
Sounds awesome! Where do I find this Octopus?
https://admin-r02.octopus.zalan.do/steering
Octopus architecture
[Architecture diagram. Components: AB Differentiation, Domain Resolver, Decision tree evaluator, Decision tree builder, AB-Steerage, Data Fetcher, Decision tree repository, Analysis, Analysis repository, AB repository, REST APIs. Data sources: aggregated data, shop data, customer, etc. Clients: customers.]
Development and Production environment
[Deployment diagram. Production environment: Admin Nodes, Service Nodes, Monitoring Nodes, Cassandra Nodes. Development environment: Version 0.1 and Version 0.2, each with an Admin Node, Service Node, Monitoring Node and Cassandra Node. Four Auto Scaling Groups are shown.]
● Any particular set of objects that can be split into multiple groups for A/B testing or personalization is called an entity type
● The available data is specific to the entity type
● A specific member of an entity type is called an entity
● Related entity types form entity domains
Article domain (e.g. pricing tests): Model SKU, Config SKU, Simple SKU
Customer domain (e.g. label-in-parcel test): Customer, Order, Shipment, Session
What are entity-types and how are they related?
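One way to picture the relation between entities, entity types and entity domains is a small lookup structure. The names below are taken from the slide. The classes and the `domain_of` helper are hypothetical and do not describe how Octopus stores this internally.

```python
from dataclasses import dataclass

# Entity domains and their entity types, as listed on the slide.
ENTITY_DOMAINS = {
    "article": ["Model SKU", "Config SKU", "Simple SKU"],
    "customer": ["Customer", "Order", "Shipment", "Session"],
}


@dataclass(frozen=True)
class Entity:
    """A specific member of an entity type, e.g. one particular order."""
    entity_type: str
    entity_id: str


def domain_of(entity_type: str) -> str:
    """Return the entity domain that an entity type belongs to."""
    for domain, entity_types in ENTITY_DOMAINS.items():
        if entity_type in entity_types:
            return domain
    raise KeyError(f"unknown entity type: {entity_type}")


order = Entity(entity_type="Order", entity_id="order-4711")  # hypothetical id
print(order.entity_type, "belongs to the", domain_of(order.entity_type), "domain")
```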
Octopus offers you live quality control functionalities (under development, scheduled for end of Q3)
Between-variant monitoring
● Automatically switch off variants which significantly underperform
● Able to detect errors in the implementation of a variant
Within-variant monitoring
● Alarms for significant deviation from previous behaviour
● Monitors KPIs to give hints on unexpected events taking place
[Figure: Impact of variants over time. KPI plotted against time for variants A, B, C and D, with the annotations "Switch off variant D", "Switch off variant C" and "Unexpected behaviour".]
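The slides do not say which statistical test the between-variant monitoring uses. As one possible sketch, a one-sided two-proportion z-test can flag a variant whose conversion rate is significantly below the control. The function names, the significance level and the counts are assumptions.

```python
import math


def underperformance_p_value(control_conv: int, control_n: int,
                             variant_conv: int, variant_n: int) -> float:
    """One-sided p-value for 'variant converts worse than control' (two-proportion z-test)."""
    p_control = control_conv / control_n
    p_variant = variant_conv / variant_n
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of underperformance
    z = (p_variant - p_control) / std_err
    # Standard normal CDF at z: a small value means the variant is clearly worse.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def should_switch_off(control_conv: int, control_n: int,
                      variant_conv: int, variant_n: int, alpha: float = 0.01) -> bool:
    """True if the variant significantly underperforms the control."""
    return underperformance_p_value(control_conv, control_n, variant_conv, variant_n) < alpha


# Hypothetical counts: 11% conversion in control, 5% and 10.5% in two variants.
print(should_switch_off(110, 1000, 50, 1000))
print(should_switch_off(110, 1000, 105, 1000))
```

Checking repeatedly while data comes in raises the false-alarm rate of a fixed-threshold test, so a live monitor would need a stricter threshold or a sequential testing procedure.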
A layer system resolves and prevents interference (under development, integration scheduled for Q4)
[Figure: two interfering steering points in separate layers. Layer A: checkout text color (black text (default), orange text). Layer B: checkout background color (orange background, white background (default)).]
Idea behind the layer system:
● Each steering point starts out in a new layer
● Interfering steering points are merged into one layer
● Each entity in a layer is assigned to one variant in one of the steering points and receives the default variant in all others
[Figure: combined layer containing both steering points. Checkout button: black checkout text (default), orange checkout text. Checkout background: orange background, white background (default).]
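The three rules of the layer system can be sketched as follows. The layer system was still under development, so this is an assumption about how it could work, not the Octopus implementation: layers are merged with a union-find, and a hash of the entity id picks the one steering point in which the entity takes part. All names are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Sequence, Set, Tuple


def build_layers(steering_points: Iterable[str],
                 interferences: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Each steering point starts in its own layer; interfering ones are merged."""
    parent = {sp: sp for sp in steering_points}

    def find(sp: str) -> str:
        while parent[sp] != sp:
            parent[sp] = parent[parent[sp]]  # path compression
            sp = parent[sp]
        return sp

    for a, b in interferences:
        parent[find(a)] = find(b)

    layers: Dict[str, Set[str]] = {}
    for sp in parent:
        layers.setdefault(find(sp), set()).add(sp)
    return list(layers.values())


def assign_in_layer(entity_id: str, layer: Set[str],
                    variants: Dict[str, Sequence[str]],
                    defaults: Dict[str, str]) -> Dict[str, str]:
    """Pick one steering point for the entity; all others get their default variant."""
    ordered = sorted(layer)
    digest = int(hashlib.sha256(entity_id.encode("utf-8")).hexdigest(), 16)
    active = ordered[digest % len(ordered)]
    assignment = {sp: defaults[sp] for sp in ordered}
    choices = variants[active]
    assignment[active] = choices[(digest // len(ordered)) % len(choices)]
    return assignment


variants = {"text_color": ["black", "orange"], "background_color": ["white", "orange"]}
defaults = {"text_color": "black", "background_color": "white"}
layers = build_layers(variants, [("text_color", "background_color")])
print(layers)
print(assign_in_layer("customer-1", layers[0], variants, defaults))
```

With this scheme no entity ever sees orange text on an orange background, because at most one steering point per layer deviates from its default.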
Analysis dashboard
[Figure: subgroup analysis, number of orders by customer age]
[Figure: analysis over time, number of orders]
● Delta KPI: difference between treatment & control
● Analysis over time: development of the difference over time
● Subgroup analysis: difference broken down by properties
Octopus will include the analysis capabilities offered by the framework developed by the testing center @ DI
● Basic analysis capabilities will be migrated to Octopus by end of Q3
● Migration of more complex analysis/scoring capabilities scheduled for Q4
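A minimal sketch of the delta KPI and the subgroup breakdown, assuming the KPI is a mean per entity. The record layout, field names and numbers are hypothetical and unrelated to the analysis framework mentioned above.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Optional


def delta_kpi(records: List[dict], kpi: str = "orders",
              group_by: Optional[str] = None) -> Dict[str, float]:
    """Mean KPI of treatment minus mean KPI of control, overall or per subgroup."""
    buckets = defaultdict(lambda: {"control": [], "treatment": []})
    for record in records:
        key = record[group_by] if group_by else "all"
        buckets[key][record["group"]].append(record[kpi])
    # Note: a subgroup without both control and treatment records raises StatisticsError.
    return {key: mean(g["treatment"]) - mean(g["control"]) for key, g in buckets.items()}


# Hypothetical test results, one record per customer.
records = [
    {"group": "control", "age_band": "18-29", "orders": 1},
    {"group": "treatment", "age_band": "18-29", "orders": 3},
    {"group": "control", "age_band": "30-49", "orders": 2},
    {"group": "treatment", "age_band": "30-49", "orders": 2},
]
print(delta_kpi(records))
print(delta_kpi(records, group_by="age_band"))
```

Analysis over time works the same way with a date field as `group_by`.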
Personalizing customer experience based on A/B testing results (part of analysis framework, integration to Octopus in Q4)
[Figure: decision tree over customer properties. Splits: Female / Male, Top customer / Non-top customer, >1 month / <1 month to last order. Leaf predictions: +5%, -2%, +1%, +2% orders (vs default).]
● Decision trees are built from test results for each variant
● Each leaf has a prediction on the impact of a treatment
● Traversing the tree for an entity leads to a prediction of the impact
[Example: Female, Non-Top Customer, ... -> Use default variant]
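Traversing such a tree can be sketched as below. The tree shape is an assumption: the slide only shows the split properties and the leaf values, so the arrangement here is one illustrative possibility, and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Leaf:
    predicted_uplift: float  # predicted impact on orders vs default, in percent


@dataclass
class Node:
    test: Callable[[dict], bool]      # question asked about the entity
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def predict(tree: Union[Node, Leaf], entity: dict) -> float:
    """Walk from the root to a leaf and return its predicted uplift."""
    node = tree
    while isinstance(node, Node):
        node = node.if_true if node.test(entity) else node.if_false
    return node.predicted_uplift


def choose_variant(trees: Dict[str, Union[Node, Leaf]], entity: dict,
                   default: str = "default") -> str:
    """Pick the variant with the highest positive predicted uplift, else the default."""
    best, best_uplift = default, 0.0
    for variant, tree in trees.items():
        uplift = predict(tree, entity)
        if uplift > best_uplift:
            best, best_uplift = variant, uplift
    return best


# Assumed arrangement of the splits and leaf values shown on the slide.
orange_tree = Node(
    test=lambda e: e["gender"] == "female",
    if_true=Node(lambda e: e["top_customer"], Leaf(5.0), Leaf(-2.0)),
    if_false=Node(lambda e: e["months_since_last_order"] > 1, Leaf(1.0), Leaf(2.0)),
)
customer = {"gender": "female", "top_customer": False, "months_since_last_order": 3}
print(choose_variant({"orange": orange_tree}, customer))
```

For the female non-top customer the predicted impact is negative, so the default variant is used, as in the example on the slide.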
Steering ratio (#steerings / #entities)
Controlling the trade-off between A/B testing and steering (research on topic scheduled for Q4)
● Usually a steering point starts with a steering ratio of 0%
● As soon as enough data is available the steering ratio can be adjusted
● Goal: option to automatically and optimally adjust the steering ratio
[Figure: entities used for steering (0% to 100%) plotted over time, covering the testing phase and the steering phase (continuous learning).]
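The steering ratio can be read as the probability that an entity is steered rather than tested. A minimal sketch under that assumption, with hypothetical names and a hash-based split so that decisions stay stable per entity:

```python
import hashlib
from typing import Callable, Sequence, Tuple


def _unit_interval(key: str) -> float:
    """Map a string to a stable pseudo-random number in [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def decide(entity_id: str, steering_point: str, variants: Sequence[str],
           steering_ratio: float, steer: Callable[[str], str]) -> Tuple[str, str]:
    """Return (variant, mode): steer a share of entities, A/B test the rest."""
    if not 0.0 <= steering_ratio <= 1.0:
        raise ValueError("steering_ratio must be between 0 and 1")
    if _unit_interval(f"{steering_point}:mode:{entity_id}") < steering_ratio:
        return steer(entity_id), "steering"
    # Testing: uniform assignment, independent of the steer/test split above.
    index = int(_unit_interval(f"{steering_point}:variant:{entity_id}") * len(variants))
    return variants[index], "testing"


def always_orange(entity_id: str) -> str:
    """Stand-in for a personalization model that predicts the best variant."""
    return "orange"


print(decide("customer-1", "checkout-color", ["blue", "orange"], 0.0, always_orange))
print(decide("customer-1", "checkout-color", ["blue", "orange"], 1.0, always_orange))
```

A ratio of 0% is pure testing and 100% is pure steering. Keeping the ratio below 100% leaves some entities in the test, which is what allows the continuous learning mentioned on the slide.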
● Octopus is a centralized service for A/B testing and personalization
● Including Octopus in your code requires only a GET and a PUT
● The MVP is ready to use and offers:
○ Registration/configuration of your own steering points
○ Requesting a decision on which variant to display
○ QA capabilities to ensure your code is working
● A lot of cool stuff is being built to make your life easier (~ end of Q3)
○ Integration of the analysis framework
○ Octopus integration of within-variant monitoring solution
○ Prototype and Octopus integration of between-variant monitoring algorithm
○ Prototype of interference detection and automatic layer management
tl;dl - Summary
● Use Octopus and register your own steering point via its admin page: https://admin-r02.octopus.zalan.do/steering
● Visit the Octopus tech wiki page for further information on Octopus: https://techwiki.zalando.net/pages/viewpage.action?pageId=24823105
● Join the A/B testing guild for Octopus updates & A/B testing info: https://groups.google.com/a/zalando.de/forum/#!forum/ab-testing
● Questions, ideas, feedback? Contact Octopus via HipChat or e-mail: HipChat: #octopus, email: [email protected]
You want to know even more about Octopus?
● Team Octopus is hiring:
○ We are looking for engineers and business developers
○ If you know someone, please contact us
We are hiring!