velocity 2016 speaking session - using machine learning to determine drivers of bounce and...
TRANSCRIPT
![Page 1: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/1.jpg)
Using machine learning to determine drivers of bounce and conversion
Velocity 2016 Santa Clara
![Page 2: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/2.jpg)
Pat Meenan (@patmeenan)
Tammy Everts (@tameverts)
![Page 3: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/3.jpg)
What we did (and why we did it)
![Page 4: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/4.jpg)
Get the code
https://github.com/WPO-Foundation/beacon-ml
![Page 5: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/5.jpg)
Deep learning
weights
![Page 6: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/6.jpg)
Random forest
Lots of random decision trees
![Page 7: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/7.jpg)
Vectorizing the data
• Everything needs to be numeric
• Strings converted to several inputs as yes/no (1/0)
• e.g. Device manufacturer
• “Apple” would be a discrete input
• Watch out for input explosion (UA String)
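The vectorizing step can be sketched as follows. This is a minimal, standard-library illustration rather than the code from the talk: the helper name `one_hot_columns`, the `max_categories` cap, and the shared "other" column are assumptions, added to show one way to keep a high-cardinality string field from exploding the number of inputs.

```python
from collections import Counter

def one_hot_columns(values, max_categories=3):
    """Turn a list of strings into 0/1 columns, one per frequent category.

    Rare values share a single "other" column so that a high-cardinality
    field (such as a raw user-agent string) cannot explode the input count.
    """
    counts = Counter(values)
    # Keep only the most frequent categories.
    kept = [name for name, _ in counts.most_common(max_categories)]
    columns = kept + ["other"]
    rows = []
    for value in values:
        key = value if value in kept else "other"
        rows.append([1 if key == column else 0 for column in columns])
    return columns, rows

# Hypothetical device manufacturers, for illustration only.
manufacturers = ["Apple", "Samsung", "Apple", "LG", "Huawei", "Apple", "Samsung"]
columns, rows = one_hot_columns(manufacturers, max_categories=2)
```

Each row ends up with exactly one 1, so "Apple" becomes its own yes/no input while the long tail of rare manufacturers collapses into a single column.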
![Page 8: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/8.jpg)
Balancing the data
• 3% conversion rate
• 97% accurate by always guessing no
• Subsample the data for 50/50 mix
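A small sketch of the balancing idea, assuming labels are 1 for a converting session and 0 otherwise. The function name `balance_by_subsampling` and the toy data are hypothetical. It first shows why accuracy is misleading on the raw data, then subsamples the majority class down to the size of the minority class.

```python
import random

def balance_by_subsampling(rows, labels, seed=0):
    """Return a 50/50 mix by randomly subsampling the larger class."""
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(positives), len(negatives))
    chosen = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(chosen)  # avoid all positives coming first
    return [rows[i] for i in chosen], [labels[i] for i in chosen]

# Toy data with a 3% conversion rate (illustrative, not from the talk's dataset).
labels = [1] * 3 + [0] * 97
rows = [[i] for i in range(100)]

# A model that always answers "no" is already 97% accurate here.
baseline_accuracy = labels.count(0) / len(labels)

x_bal, y_bal = balance_by_subsampling(rows, labels)
```

After balancing, always guessing "no" is only right half the time, so the model has to learn something from the inputs to beat the baseline.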
![Page 9: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/9.jpg)
Validation data
• Train on 80% of the data
• Validate on 20% to prevent overfitting
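The 80/20 split can be sketched with the standard library alone. The helper name `split_train_validation` and the fixed seed are assumptions; the variable names `x_train`, `y_train`, `x_val`, and `y_val` match the ones used in the training slides.

```python
import random

def split_train_validation(rows, labels, validation_fraction=0.2, seed=0):
    """Shuffle once, then hold out a fraction of the rows for validation."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_val = round(len(indices) * validation_fraction)
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    x_train = [rows[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_val = [rows[i] for i in val_idx]
    y_val = [labels[i] for i in val_idx]
    return x_train, y_train, x_val, y_val

# Toy data, for illustration only.
rows = [[i] for i in range(10)]
labels = [i % 2 for i in range(10)]
x_train, y_train, x_val, y_val = split_train_validation(rows, labels)
```

Because the model never trains on the held-out rows, a validation score that is much worse than the training score is a sign of overfitting.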
![Page 10: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/10.jpg)
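Smoothing the data
ML works best on normally distributed data

    from sklearn.preprocessing import StandardScaler

    # Fit the scaler on the training set only, then reuse it for validation
    scaler = StandardScaler()
    x_train = scaler.fit_transform(x_train)
    x_val = scaler.transform(x_val)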
![Page 11: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/11.jpg)
Input/output relationships
• SSL highly correlated with conversions
• Long sessions highly correlated with not bouncing
• Remove correlated features from training
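One way to find such features is to measure how strongly each input tracks the output and drop the ones that nearly mirror it. The sketch below is an assumption about how this could be done, not the method used in the talk: it uses Pearson correlation, a hypothetical cutoff of 0.9, and made-up values for two features.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def drop_leaky_features(columns, labels, threshold=0.9):
    """Keep only features whose |correlation| with the label is below threshold."""
    return {
        name: values
        for name, values in columns.items()
        if abs(pearson(values, labels)) < threshold
    }

# Made-up sessions: 1 = converted, 0 = did not convert.
converted = [1, 1, 0, 0, 0, 1]
features = {
    "ssl": [1, 1, 0, 0, 0, 1],  # mirrors the label exactly
    "dom_ready_ms": [900, 2100, 1500, 800, 2500, 1200],
}
kept = drop_leaky_features(features, converted)
```

Here `ssl` is dropped because it is effectively a restatement of the outcome, while `dom_ready_ms` stays in the training set.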
![Page 12: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/12.jpg)
Training deep learning
    from keras.models import Sequential

    model = Sequential()
    model.add(...)
    model.compile(optimizer='adagrad',
                  loss='binary_crossentropy',
                  metrics=["accuracy"])
    # nb_epoch is the Keras 1.x argument name (renamed to epochs in Keras 2)
    model.fit(x_train, y_train,
              nb_epoch=EPOCH_COUNT,
              batch_size=32,
              validation_data=(x_val, y_val),
              verbose=2,
              shuffle=True)
![Page 13: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/13.jpg)
Training random forest
    from sklearn.ensemble import RandomForestClassifier

    clf = RandomForestClassifier(n_estimators=FOREST_SIZE,
                                 criterion='gini',
                                 max_depth=None,
                                 min_samples_split=2,
                                 min_samples_leaf=1,
                                 min_weight_fraction_leaf=0.0,
                                 max_features='auto',
                                 max_leaf_nodes=None,
                                 bootstrap=True,
                                 oob_score=False,
                                 n_jobs=12,
                                 random_state=None,
                                 verbose=2,
                                 warm_start=False,
                                 class_weight=None)
    clf.fit(x_train, y_train)
![Page 14: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/14.jpg)
Feature importances

    clf.feature_importances_
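`clf.feature_importances_` is an array with one score per input column, in column order, so it has to be paired with the feature names before it is readable. A minimal sketch of that ranking step follows. The helper `rank_features` is hypothetical, and the names and scores below are made up for illustration; in practice the scores would come from the fitted classifier.

```python
def rank_features(names, importances):
    """Pair each feature name with its score and rank from most to least important."""
    pairs = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return [(rank, name, score) for rank, (name, score) in enumerate(pairs, start=1)]

# Illustrative values only; real scores would be list(clf.feature_importances_).
feature_names = ["timers_loaded", "timers_domready", "dns_lookup"]
importances = [0.3, 0.5, 0.2]

ranking = rank_features(feature_names, importances)
for rank, name, score in ranking:
    print(f"{rank:>2}  {name:<16} {score:.2f}")
```

A rank computed this way is the kind of number reported as "importance (out of 93)" when there are 93 input features.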
![Page 15: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/15.jpg)
What we learned
![Page 16: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/16.jpg)
What’s in our beacon?
• Top-level – domain, timestamp, SSL
• Session – start time, length (in pages), total load time
• User agent – browser, OS, mobile ISP
• Geo – country, city, organization, ISP, network speed
• Bandwidth
• Timers – base, custom, user-defined
• Custom metrics
• HTTP headers
• Etc.
![Page 17: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/17.jpg)
Conversion rate
![Page 18: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/18.jpg)
Conversion rate
![Page 19: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/19.jpg)
Bounce rate
![Page 20: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/20.jpg)
Bounce rate
![Page 21: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/21.jpg)
Finding 1
Number of scripts was a predictor… but not in the way we expected
![Page 22: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/22.jpg)
Number of scripts per page (median)
![Page 23: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/23.jpg)
Finding 2
When entire sessions were more complex, they converted less
![Page 24: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/24.jpg)
Finding 3
Sessions that converted had 38% fewer images than sessions that didn’t
![Page 25: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/25.jpg)
Number of images per page (median)
![Page 26: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/26.jpg)
Finding 4
DOM ready was the greatest indicator of bounce rate
![Page 27: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/27.jpg)
DOM ready (median)
![Page 28: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/28.jpg)
Finding 5
Full load time was the second greatest indicator of bounce rate
![Page 29: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/29.jpg)
timers_loaded (median)
![Page 30: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/30.jpg)
Finding 6
Mobile-related measurements weren’t meaningful predictors of conversions
![Page 31: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/31.jpg)
Conversions
![Page 32: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/32.jpg)
Finding 7
Some conventional metrics were (almost) meaningless, too
![Page 33: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/33.jpg)
Feature Importance (out of 93)
DNS lookup 79
Start render 69
![Page 34: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/34.jpg)
Takeaways
![Page 35: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/35.jpg)
1. YMMV
2. Do this with your own data
3. Gather your RUM data
4. Run the machine learning against it
![Page 36: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion](https://reader035.vdocuments.net/reader035/viewer/2022070514/587ec9af1a28abf37b8b6b83/html5/thumbnails/36.jpg)
Thanks!