Andreas Reiffen – SMX London slide deck
TRANSCRIPT
#SMX #13B @AndreasReiffen
From creative ideas to testing procedures:
How to test (& perfect) nearly everything
About me … and crealytics & camato
• Data-driven online advertising strategist
• Online retail expert
• Entrepreneur
• Over €3 billion in customer revenues last year
• SaaS product for Google Shopping & Search
• 130 true experts in their field
• Offices in Germany & UK, new office in NYC
ALL aspects of testing? At least some, I hope!
• 2 types of testing to take performance to the next level: testing is more than finding the perfect ad copy.
• 5 common pitfalls: depending on the setup and the analysis, tests can tell very different stories.
• 3 methods & tools to use for successful testing.
Which methods to use
These are our recommended methods:
1. Drafts and Experiments
2. Scheduled A/B tests
3. Before/after tests
4. Further tools for testing
1. Drafts & Experiments is the most versatile testing tool, for almost everything

Structural — tests that change the structure within a campaign:
• Ads
• Landing pages
• Match types

Bidding — tests that influence bidding of some sort:
• Bids
• Modifiers
  – Device
  – Ad schedule
  – Geo-targeting
• Strategies
  – eCPC
  – Target CPA

Features — changes within features added to a campaign:
• RLSA
• Ad extensions
• Sitelinks
• Etc.

Drafts and Experiments allow you to test almost anything within a campaign. Unfortunately, this feature is currently not available for Shopping campaigns.
Set up a draft campaign to collaborate or begin a new test.
Choose the % of traffic for testing and set a timeframe.
A/B test landing pages with Drafts & Experiments for conversion rate

Setup: Create an experiment, change only the landing pages.
Analysis: Keep track of top-line performance using the automatic scorecard displayed in the experiment campaign. Nonetheless, always take a deep dive into performance after finishing the experiment to rule out any irregularities.
Result: Test not successful — the original landing pages led to a higher conversion rate.
2. Manually scheduled A/B tests still have some use cases

Search terms — tests where the query composition is important:
• Match type changes
• Negative changes

Cross campaign — tests that have to be run across different campaigns:
• Quality Score development in new accounts/campaigns

Shopping — any of the tests you can use D&E for in text ads:
• Structure
• Bidding
• Features

Use for whatever can't be achieved through D&E. Use this scheduling to avoid cannibalization while still being independent from seasonality.
Scheduled A/B tests use campaign settings to share hours fairly between A and B.

Copy and paste the existing campaign and upload a two-hour ad schedule for both campaigns so they run alternately.

Setup: Duplicate the campaign & set a schedule to run against the original campaign.
Analysis: Compare traffic & Quality Score levels.
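The copy-and-schedule step above can be sketched as a small script that emits alternating two-hour blocks for both campaigns. The row format below is a made-up placeholder, not AdWords' actual bulk-upload format; adapt it to your tool.

```python
# Sketch: generate alternating two-hour ad schedules for campaigns A and B,
# so each campaign is live in every other two-hour block of every day.

def alternating_schedule(campaign, offset):
    """offset 0 -> hours 0-2, 4-6, ...; offset 1 -> hours 2-4, 6-8, ..."""
    rows = []
    for day in ["Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday"]:
        for start in range(offset * 2, 24, 4):
            # (campaign, day, start, end) — hypothetical row layout
            rows.append((campaign, day, f"{start:02d}:00", f"{start + 2:02d}:00"))
    return rows

schedule_a = alternating_schedule("Campaign A", 0)  # 00-02, 04-06, ...
schedule_b = alternating_schedule("Campaign B", 1)  # 02-04, 06-08, ...
print(schedule_a[0])    # ('Campaign A', 'Monday', '00:00', '02:00')
print(len(schedule_a))  # 7 days x 6 blocks = 42 rows
```

Because the two offsets never overlap, both campaigns see all weekdays and all daytime/nighttime hours, which keeps the split independent from intra-day and intra-week seasonality.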
Example A/B scheduling: how fast do Quality Scores pick up after a campaign transition?

[Chart: Quality Score and clicks, original vs. new campaign. The new campaign starts below the original (QS -8%, clicks -32% in days 1-4) but converges quickly (QS +4%, clicks -3% in days 5-30).]

Quality Scores pick up within a few days. Traffic picks up simultaneously.
3. Before/after tests are versatile and used for feed components. A control group is important.

Feed changes — changes in the feed:
• Test new titles
• Test new images

Product changes — tests that affect the product portfolio itself:
• Price changes

Use for things that cannot easily be changed otherwise. Make sure to have a control group that indicates seasonal or budget changes.
A before/after test measures changes in the relation between test and control.

[Diagram: test and control lines across the Before, During, and After phases.]
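That "relation between test and control" can be computed as a simple ratio of ratios, so that effects hitting both groups (seasonality, budget shifts) cancel out. A minimal sketch with hypothetical indexed numbers:

```python
def relative_change(test_before, test_after, control_before, control_after):
    """Net effect of a change, measured relative to the control group.

    Seasonality or budget shifts move both groups, so dividing the
    test group's change by the control group's change cancels them out.
    """
    test_ratio = test_after / test_before           # raw change in the test group
    control_ratio = control_after / control_before  # change from seasonality etc.
    return test_ratio / control_ratio - 1           # net effect of the change

# Hypothetical indexed values (before = 100): test drops to 80
# while the control drops to 95 for seasonal reasons.
effect = relative_change(100, 80, 100, 95)
print(f"net effect: {effect:+.0%}")  # net effect: -16%
```

Raw before/after on the test group alone would report -20%; netting out the control's -5% seasonal drift isolates the roughly -16% attributable to the change itself.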
Before/after example: Google rewards cheaper product prices with more impressions

Impressions (indexed, before = 100):
• Test products: 100 → 33 (-67%)
• Account level: 100 → 93 (-7%)

Clicks (indexed, before = 100):
• Test products: 100 → 62 (-38%)
• Account level: 100 → 100 (0%)

Setup: Increased prices from lowest to highest among competitors.
Analysis: Compare traffic before/after, using account traffic as the baseline.
Result: Price changes not only affect CTR but also have a massive impact on impression levels.

[Chart: daily clicks plotted against own price over time (+5% annotation).]
4. Google Merchant Center experiments are a great idea, but they lack attention from Google.

Google is testing feed optimisations directly in the Merchant Center interface. Tests include phase 1 and phase 2 in comparison against a baseline. Not very well documented, since still in beta.
Merchant Center experiments cover product titles and images for A/B testing.

Product titles A vs B: alternative values are proposed from an additional column in the feed.
Shortcoming: the products to include in test & control are randomized, not the impressions or users! Google might discontinue the feature.
Online A/B tools are a great help to find out whether tests have a significant outcome.

Trials and successes can be, for example:
• trials = clicks, successes = conversions
• trials = impressions, successes = clicks
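A minimal stdlib-only sketch of the kind of check such online tools perform — a two-proportion z-test on trials and successes. The traffic numbers are made up:

```python
from math import sqrt, erf

def ab_significance(trials_a, successes_a, trials_b, successes_b):
    """Two-proportion z-test: is B's success rate significantly different from A's?"""
    p_a = successes_a / trials_a
    p_b = successes_b / trials_b
    # Pooled proportion under the null hypothesis (no difference)
    p = (successes_a + successes_b) / (trials_a + trials_b)
    se = sqrt(p * (1 - p) * (1 / trials_a + 1 / trials_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical: impressions as trials, clicks as successes
z, p = ab_significance(10000, 200, 10000, 260)
print(f"z = {z:.2f}, p = {p:.4f}")  # significant if p < 0.05
```

Note that the same 2.0% vs. 2.6% rates on only 1,000 trials each would not reach significance — which is exactly the "not enough traffic" pitfall discussed below.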
Optimizing current accounts & performance
Two types of testing: optimizing PPC / understanding Google

Testing ads — optimise parameters within the Google sandbox to get better Google KPIs: necessary not to fall behind.
Testing Google — understand what the black box does to inform & improve strategy: move first and gain an advantage.

[Chart: budget vs. revenue, Ad A vs. Ad B.]
There are two different types of objectives:
1. Optimizing existing Google performance
2. Reverse engineering
Hypothesis: splitting shopping queries "generics vs designers" can save cost at the same revenue.

• Campaign A — high intent (brand + specific product, e.g. "nike mercurial superfly"): high bid, $1.00
• Campaign B — low intent (generic + product type, e.g. "soccer shoes"): low bid, $0.50
Google is forced to adopt the query split by campaign priority and negatives:

Campaign                | Priority | Negatives
Generics                | high     | designer names, product names
Designers               | medium   | product names
Designer + product name | low      | n/a
Test design: Rotating A/B test, split vs. non-split — duplicate products, split queries in the "test" setup, increase the share of designer queries via higher bids, rotate by scheduling. (Today you could do this with Drafts & Experiments.)

[Charts: the test's query mix shifts from ~20% to ~35% designer queries across phases 1-3 (generics ~80% → ~70%), while the control stays at the original, unsplit 100%. Revenue, test vs. control: 100% → 128% → 137%. Cost, test vs. control: roughly flat (~96-103%).]

Hypothesis holds: queries with higher conversion probability get more exposure, overcompensating the higher CPCs.
Conclusions for testing Google hacks:
• A/B testing complex campaign setups is possible.
• Keep results comparable: you should either keep cost stable or revenue stable.
• Don't measure the uplift of the "test" campaign itself, only the change in relation to "control", to eliminate seasonality.
Hypothesis: bidding on products is like "broad match" — higher bids = a larger share of less-converting traffic.

[Chart: impressions vs. max CPC, split into specific and generic queries.]
Test design: Increase bids on brands by 200%. (Today you could do this with Drafts & Experiments.)

Chi Chi London, before/after (k impressions):

[Chart: impressions and CPC at bid = 0.50 vs. bid = 1.50 for "designer only" ([chi chi london]), "designer + categories" ([chi chi dress]) and "generic terms" ([party dresses]). Generic impressions grow the most at the higher bid, and CPCs increase substantially across all three segments.]

Hypothesis holds: traffic quality gets weaker, as with broad match.
Surprising: you pay more for the same traffic! Overbidding on Shopping is dangerous.
Conclusions from reverse-engineering tests:
• Pure before/after tests need multiple sibling tests to validate: we tested several brands with the same results.
• Look beyond your hypothesis for additional learnings: the same traffic at a higher CPC was surprising.
• Always segment out: queries, device, top vs. other, search partners, audience vs. non-audience.
Common pitfalls
Common challenges we have encountered:
1. Statistical significance
2. Don't aggregate
3. Think outside the box
4. Know your surroundings
5. Look out for cannibalization
1. Only end testing when statistical significance is reached

Tip: Use the tools mentioned above to evaluate whether the data is significant.
Done wrong: eCPC test run for two weeks only; result: eCPC does not work.
Done right: Consider that Google's algorithm needs time to learn, and two weeks gave too little traffic for statistical relevance. With more data, the result is that eCPC does indeed work:
• Impressions: CPC 1,032,007 vs. eCPC 1,010,246 (-2%)
• Conversions: CPC 2,800 vs. eCPC 2,930 (+5%)
2. Don't analyse totals — measure changes on the actual changed elements

Done wrong: Only analyzed top-line data; result: title changes hurt performance.
Done right: The total decrease was caused by one term only; on average, impressions increased by 116%. Result: title changes work well.

[Charts: indexed impressions before/after (before = 100, after 138/146 for account and test), and a per-term before/after breakdown.]
3. Don't limit yourself to the original question — there are more insights to win

Done wrong: eCPC works, but some interesting insights slipped our attention!
Done right: Analyzing further, we noticed that eCPC helped manage tablet performance (before Google reintroduced them). This opened up a new way of optimizing device performance.

• Impressions: CPC 1,032,007 vs. eCPC 1,010,246 (-2%)
• Conversions: CPC 2,800 vs. eCPC 2,930 (+5%)
• Tablet CPO: -10% (lower CPCs, higher CR, traffic shift towards desktop)
4. Be aware of your surroundings! What else could influence the test results?

Done wrong: Image changes sometimes work, sometimes they don't; result inconclusive.
Done right: Looking at the test environment shows: if the competition's images are mixed, there's no change; if the competition's images are uniform, there's an improvement. Result: you have to stand out.

• Test A: CTR 100.0% → 102.6% (+2.6%, not significant)
• Test B: CTR 100.0% → 127.0% (+27.0%, significant)
* CTR test vs. control, with the original image set to 100%
5. Look out for cannibalization — what else could the extra traffic be taking from?

Done wrong: Measuring query clicks on one single product after increasing its bid (0.50 → 1.50, +200%) shows a nominal increase of +1,581% over the baseline.
Done right: However, the product diverted queries from other products; after subtracting the cannibalised impressions, the actual increase is much lower, +114%.
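The correction can be sketched as follows. The absolute impression counts are hypothetical, chosen only to reproduce the deck's +1,581% nominal and +114% actual figures:

```python
# Sketch of the cannibalization correction: subtract the impressions the
# boosted product merely took from sibling products before computing uplift.

def uplift(baseline, observed, cannibalised=0):
    """Percentage increase over baseline, net of cannibalised volume."""
    return (observed - cannibalised) / baseline - 1

baseline = 100       # product impressions before the bid increase (hypothetical)
observed = 1681      # impressions after the bid increase (hypothetical)
cannibalised = 1467  # impressions lost by sibling products (hypothetical)

print(f"nominal: {uplift(baseline, observed):+.0%}")                # +1581%
print(f"actual:  {uplift(baseline, observed, cannibalised):+.0%}")  # +114%
```

The point of the sketch: measuring only the boosted product's own query clicks ignores the drop on sibling products, so the sibling products' loss must be netted out before claiming an increment.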
Takeaways
• Knack for numbers: you have to like playing with numbers and thinking analytically.
• More than just numbers: data miners and scientists are not everything; you need to understand the bigger picture.
• Experience: for elaborate testing you need to be a PPC pro with experience.
• Loads of data: you need access to the data warehouse yourself, or know someone who does.
LEARN MORE: UPCOMING @SMX EVENTS
THANK YOU! SEE YOU AT THE NEXT #SMX