Andreas Reiffen – SMX London slide deck
TRANSCRIPT
#SMX #13B @AndreasReiffen
From creative ideas to testing procedures:
How to test (& perfect) nearly everything
About me … and crealytics & camato
• Data-driven online advertising strategist
• Online retail expert
• Entrepreneur
• Over €3 billion in customer revenues last year
• SaaS product for Google Shopping & Search
• 130 true experts in their field
• Offices in Germany & UK, new office in NYC
ALL aspects of testing? At least some, I hope!
• 2 types of testing to take performance to the next level: testing is more than finding the perfect ad copy.
• 5 common pitfalls: depending on the setup and the analysis, tests can tell very different stories.
• 3 methods & tools to use for successful testing.
Which methods to use
These are our recommended methods:
1. Drafts and Experiments
2. Scheduled A/B tests
3. Before/after tests
4. Further tools for testing
1. Drafts & Experiments is the most versatile testing tool, for almost everything

Structural — tests that change the structure within a campaign:
• Ads
• Landing pages
• Match types

Bidding — tests that influence bidding of some sort:
• Bids
• Modifiers
  – Device
  – Ad schedule
  – Geo-targeting
• Strategies
  – eCPC
  – Target CPA

Features — changes within features added to a campaign:
• RLSA
• Ad extensions
• Sitelinks
• Etc.

Drafts and Experiments allow you to test almost anything within a campaign. Unfortunately, this feature is currently not available for Shopping campaigns.
Set up a draft campaign to collaborate or begin a new test.
Choose the % of traffic for testing and set a timeframe.
A/B test landing pages with Drafts & Experiments for conversion rate

Setup: Create an experiment, change only the landing pages.
Analysis: Keep track of top-line performance using the automatic scorecard displayed in the experiment campaign. Nonetheless, always take a deep dive into performance after finishing the experiment to rule out any irregularities.
Result: Test not successful — the original landing pages led to a higher conversion rate.
2. Manually scheduled A/B tests still have some use cases

Search terms — tests where the query composition is important:
• Match type changes
• Negative changes

Cross campaign — tests that have to be run across different campaigns:
• Quality Score development in new accounts/campaigns

Shopping — any of the tests you can use D&E for in text ads:
• Structure
• Bidding
• Features

Use for whatever can't be achieved through D&E. Use this scheduling to avoid cannibalization while still being independent from seasonality.
Scheduled A/B tests use campaign settings to share hours fairly between A and B.

Copy and paste the existing campaign and upload a two-hour ad schedule for both campaigns so they run alternately.

Setup: Duplicate the campaign & set a schedule to run against the original campaign.
Analysis: Compare traffic & Quality Score levels.
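The copy-and-schedule step above can be sketched as a small script that emits alternating two-hour blocks for both campaigns. The row format below is a made-up placeholder, not AdWords' actual bulk-upload format; adapt it to your tool.

```python
# Sketch: generate alternating two-hour ad schedules for campaigns A and B,
# so each campaign is live in every other two-hour block of every day.

def alternating_schedule(campaign, offset):
    """offset 0 -> hours 0-2, 4-6, ...; offset 1 -> hours 2-4, 6-8, ..."""
    rows = []
    for day in ["Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday"]:
        for start in range(offset * 2, 24, 4):
            # (campaign, day, start, end) — hypothetical row layout
            rows.append((campaign, day, f"{start:02d}:00", f"{start + 2:02d}:00"))
    return rows

schedule_a = alternating_schedule("Campaign A", 0)  # 00-02, 04-06, ...
schedule_b = alternating_schedule("Campaign B", 1)  # 02-04, 06-08, ...
print(schedule_a[0])    # ('Campaign A', 'Monday', '00:00', '02:00')
print(len(schedule_a))  # 7 days x 6 blocks = 42 rows
```

Because the two offsets never overlap, both campaigns see all weekdays and all daytime/nighttime hours, which keeps the split independent from intra-day and intra-week seasonality.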
Example A/B scheduling: how fast do Quality Scores pick up after a campaign transition?

[Chart: Quality Score and clicks, original vs. new campaign. The new campaign starts below the original (QS -8%, clicks -32% in days 1-4) but converges quickly (QS +4%, clicks -3% in days 5-30).]

Quality Scores pick up within a few days. Traffic picks up simultaneously.
3. Before/after tests are versatile and used for feed components. A control group is important.

Feed changes — changes in the feed:
• Test new titles
• Test new images

Product changes — tests that affect the product portfolio itself:
• Price changes

Use for things that cannot easily be changed otherwise. Make sure to have a control group that indicates seasonal or budget changes.
A before/after test measures changes in the relation between test and control.

[Diagram: test and control lines across the Before, During, and After phases.]
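That "relation between test and control" can be computed as a simple ratio of ratios, so that effects hitting both groups (seasonality, budget shifts) cancel out. A minimal sketch with hypothetical indexed numbers:

```python
def relative_change(test_before, test_after, control_before, control_after):
    """Net effect of a change, measured relative to the control group.

    Seasonality or budget shifts move both groups, so dividing the
    test group's change by the control group's change cancels them out.
    """
    test_ratio = test_after / test_before           # raw change in the test group
    control_ratio = control_after / control_before  # change from seasonality etc.
    return test_ratio / control_ratio - 1           # net effect of the change

# Hypothetical indexed values (before = 100): test drops to 80
# while the control drops to 95 for seasonal reasons.
effect = relative_change(100, 80, 100, 95)
print(f"net effect: {effect:+.0%}")  # net effect: -16%
```

Raw before/after on the test group alone would report -20%; netting out the control's -5% seasonal drift isolates the roughly -16% attributable to the change itself.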
Before/after example: Google rewards cheaper product prices with more impressions

Impressions (indexed, before = 100):
• Test products: 100 → 33 (-67%)
• Account level: 100 → 93 (-7%)

Clicks (indexed, before = 100):
• Test products: 100 → 62 (-38%)
• Account level: 100 → 100 (0%)

Setup: Increased prices from lowest to highest among competitors.
Analysis: Compare traffic before/after, using account traffic as the baseline.
Result: Price changes not only affect CTR but also have a massive impact on impression levels.

[Chart: daily clicks plotted against own price over time (+5% annotation).]
4. Google Merchant Center experiments are a great idea, but they lack attention from Google.

Google is testing feed optimisations directly in the Merchant Center interface. Tests include phase 1 and phase 2 in comparison against a baseline. Not very well documented, since still in beta.
Merchant Center experiments cover product titles and images for A/B testing.

Product titles A vs B: alternative values are proposed from an additional column in the feed.
Shortcoming: the products to include in test & control are randomized, not the impressions or users! Google might discontinue the feature.
Online A/B tools are a great help to find out whether tests have a significant outcome.

Trials and successes can be, for example:
• trials = clicks, successes = conversions
• trials = impressions, successes = clicks
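A minimal stdlib-only sketch of the kind of check such online tools perform — a two-proportion z-test on trials and successes. The traffic numbers are made up:

```python
from math import sqrt, erf

def ab_significance(trials_a, successes_a, trials_b, successes_b):
    """Two-proportion z-test: is B's success rate significantly different from A's?"""
    p_a = successes_a / trials_a
    p_b = successes_b / trials_b
    # Pooled proportion under the null hypothesis (no difference)
    p = (successes_a + successes_b) / (trials_a + trials_b)
    se = sqrt(p * (1 - p) * (1 / trials_a + 1 / trials_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical: impressions as trials, clicks as successes
z, p = ab_significance(10000, 200, 10000, 260)
print(f"z = {z:.2f}, p = {p:.4f}")  # significant if p < 0.05
```

Note that the same 2.0% vs. 2.6% rates on only 1,000 trials each would not reach significance — which is exactly the "not enough traffic" pitfall discussed below.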
Optimizing current accounts & performance
Two types of testing: optimizing PPC / understanding Google

Testing ads — optimise parameters within the Google sandbox to get better Google KPIs: necessary not to fall behind.
Testing Google — understand what the black box does to inform & improve strategy: move first and gain an advantage.

[Chart: budget vs. revenue, Ad A vs. Ad B.]
There are two different types of objectives:
1. Optimizing existing Google performance
2. Reverse engineering
Hypothesis: splitting shopping queries "generics vs designers" can save cost at the same revenue.

• Campaign A — high intent (brand + specific product, e.g. "nike mercurial superfly"): high bid, $1.00
• Campaign B — low intent (generic + product type, e.g. "soccer shoes"): low bid, $0.50
Google is forced to adopt the query split by campaign priority and negatives:

Campaign                | Priority | Negatives
Generics                | high     | designer names, product names
Designers               | medium   | product names
Designer + product name | low      | n/a
Test design: Rotating A/B test, split vs. non-split — duplicate products, split queries in the "test" setup, increase the share of designer queries via higher bids, rotate by scheduling. (Today you could do this with Drafts & Experiments.)

[Charts: the test's query mix shifts from ~20% to ~35% designer queries across phases 1-3 (generics ~80% → ~70%), while the control stays at the original, unsplit 100%. Revenue, test vs. control: 100% → 128% → 137%. Cost, test vs. control: roughly flat (~96-103%).]

Hypothesis holds: queries with higher conversion probability get more exposure, overcompensating the higher CPCs.
Conclusions for testing Google hacks:
• A/B testing complex campaign setups is possible.
• Keep results comparable: you should either keep cost stable or revenue stable.
• Don't measure the uplift of the "test" campaign itself, only the change in relation to "control", to eliminate seasonality.
Hypothesis: bidding on products is like "broad match" — higher bids = a larger share of less-converting traffic.

[Chart: impressions vs. max CPC, split into specific and generic queries.]
Test design: Increase bids on brands by 200%. (Today you could do this with Drafts & Experiments.)

Chi Chi London, before/after (k impressions):

[Chart: impressions and CPC at bid = 0.50 vs. bid = 1.50 for "designer only" ([chi chi london]), "designer + categories" ([chi chi dress]) and "generic terms" ([party dresses]). Generic impressions grow the most at the higher bid, and CPCs increase substantially across all three segments.]

Hypothesis holds: traffic quality gets weaker, as with broad match.
Surprising: you pay more for the same traffic! Overbidding on Shopping is dangerous.
Conclusions from reverse-engineering tests:
• Pure before/after tests need multiple sibling tests to validate: we tested several brands with the same results.
• Look beyond your hypothesis for additional learnings: the same traffic at a higher CPC was surprising.
• Always segment out: queries, device, top vs. other, search partners, audience vs. non-audience.
Common pitfalls
Common challenges we have encountered:
1. Statistical significance
2. Don't aggregate
3. Think outside the box
4. Know your surroundings
5. Look out for cannibalization
1. Only end testing when statistical significance is reached

Tip: Use the tools mentioned above to evaluate whether the data is significant.
Done wrong: eCPC test run for two weeks only; result: eCPC does not work.
Done right: Consider that Google's algorithm needs time to learn, and two weeks gave too little traffic for statistical relevance. With more data, the result is that eCPC does indeed work:
• Impressions: CPC 1,032,007 vs. eCPC 1,010,246 (-2%)
• Conversions: CPC 2,800 vs. eCPC 2,930 (+5%)
2. Don't analyse totals — measure changes on the actual changed elements

Done wrong: Only analyzed top-line data; result: title changes hurt performance.
Done right: The total decrease was caused by one term only; on average, impressions increased by 116%. Result: title changes work well.

[Charts: indexed impressions before/after (before = 100, after 138/146 for account and test), and a per-term before/after breakdown.]
3. Don't limit yourself to the original question — there are more insights to win

Done wrong: eCPC works, but some interesting insights slipped our attention!
Done right: Analyzing further, we noticed that eCPC helped manage tablet performance (before Google reintroduced them). This opened up a new way of optimizing device performance.

• Impressions: CPC 1,032,007 vs. eCPC 1,010,246 (-2%)
• Conversions: CPC 2,800 vs. eCPC 2,930 (+5%)
• Tablet CPO: -10% (lower CPCs, higher CR, traffic shift towards desktop)
4. Be aware of your surroundings! What else could influence the test results?

Done wrong: Image changes sometimes work, sometimes they don't; result inconclusive.
Done right: Looking at the test environment shows: if the competition's images are mixed, there's no change; if the competition's images are uniform, there's an improvement. Result: you have to stand out.

• Test A: CTR 100.0% → 102.6% (+2.6%, not significant)
• Test B: CTR 100.0% → 127.0% (+27.0%, significant)
* CTR test vs. control, with the original image set to 100%
5. Look out for cannibalization — what else could the extra traffic be taking from?

Done wrong: Measuring query clicks on one single product after increasing its bid (0.50 → 1.50, +200%) shows a nominal increase of +1,581% over the baseline.
Done right: However, the product diverted queries from other products; after subtracting the cannibalised impressions, the actual increase is much lower, +114%.
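The correction can be sketched as follows. The absolute impression counts are hypothetical, chosen only to reproduce the deck's +1,581% nominal and +114% actual figures:

```python
# Sketch of the cannibalization correction: subtract the impressions the
# boosted product merely took from sibling products before computing uplift.

def uplift(baseline, observed, cannibalised=0):
    """Percentage increase over baseline, net of cannibalised volume."""
    return (observed - cannibalised) / baseline - 1

baseline = 100       # product impressions before the bid increase (hypothetical)
observed = 1681      # impressions after the bid increase (hypothetical)
cannibalised = 1467  # impressions lost by sibling products (hypothetical)

print(f"nominal: {uplift(baseline, observed):+.0%}")                # +1581%
print(f"actual:  {uplift(baseline, observed, cannibalised):+.0%}")  # +114%
```

The point of the sketch: measuring only the boosted product's own query clicks ignores the drop on sibling products, so the sibling products' loss must be netted out before claiming an increment.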
Takeaways
• Knack for numbers: you have to like playing with numbers and thinking analytically.
• More than just numbers: data miners and scientists are not everything; you need to understand the bigger picture.
• Experience: for elaborate testing you need to be a PPC pro with experience.
• Loads of data: you need access to the data warehouse yourself, or know someone who does.
LEARN MORE: UPCOMING @SMX EVENTS
THANK YOU! SEE YOU AT THE NEXT #SMX