buy prediction in e-commerce with deep learningubicomp/... · peter romov & evgeny sokolov -...

26
Buy Prediction in E-Commerce with Deep Learning Tasmin Herrmann, Master Computer Science, Department of Computer Science, HAW Hamburg

Upload: others

Post on 29-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Buy Prediction in E-Commerce

with Deep Learning

Tasmin Herrmann, Master Computer Science, Department of Computer Science, HAW Hamburg

Page 2: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Amazon ships before you order

2

Page 3: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Outline

1. Introduction & Motivation

2. Literature Review

3. Research Question

4. Methodological Design

5. Summary

3

Page 4: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Machine Learning in E-Commerce

4

Customer in E-Commerce

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 5: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Machine Learning Applications for E-Commerce

● Anticipatory Shipping

● Product Recommendation

● Search○ search ranking○ query expansion

● Dynamic Pricing

● Churn Prediction

5

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 6: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Motivation for Anticipatory Shipping

6

Page 7: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

7

What buys the user?

When buys the user?

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 8: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Data Set

8

● YOOCHOOSE GmbH ○ Personalization Solutions like Recommendation engines

● RecSys Challenge 2015○ ACM RecSys Conference

● 26 Mio Clicks & 1 Mio Buys○ Session ID○ Timestamp○ Item ID○ Price○ Quantity○ Category

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 9: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Literature Review

● Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with

categorical features

● Eduard Weigandt - Auf Data-Mining basierende Personalisierung im E-Commerce mit

implizitem Feedback

9

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 10: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Peter Romov & Evgeny Sokolov

● 1. Which users are going to buy something?

2. What items will users buy?

● 2 staged classification with Gradient Boosting

● Feature extraction: 400 features

● Threshold optimization

● Results:○ mean Jaccard measure of 0.765

10

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 11: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Eduard Weigandt

● motivated implicit feedback data for recommendation systems

● predict user behaviour with one model○ Is the user buying in this session?○ What is he buying?

● Gradient Boosting with XGBoost○ accuracy 87%

● Random Forest with Bagging○ accuracy 88,53%

11

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 12: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Research Question

● How does an machine learning algorithm for Anticipatory Shipping look like?

● Is implicit feedback data suitable for building a recommendation systems?

● Are deep learning models suitable for buy prediction?

● Is a deep learning model a better buy predictor than an ensemble model?

12

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 13: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Knowledge Discovery in Databases

13

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 14: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Selection

14

● Data in HDFS

● Balance Data for product buy prediction

● Sample Data

● Select Rows/Columns

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 15: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Dataset - Clicks

Session ID Timestamp Item ID Category

1 2014-04-07T10:51:09.277Z

214536502 0

1 2014-04-07T10:54:09.868Z

214536500 0

2 2014-04-07T13:56:37.614Z

214662742 0

15

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 16: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Dataset - Buys

Session ID Timestamp Item ID Price Quantity

420374 2014-04-06T18:44:58.314Z

214537888 12462 1

420374 2014-04-06T18:44:58.325Z

214537850 10471 1

281626 2014-04-06T09:40:13.032Z

214535653 1883 1

16

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 17: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Preprocessing

● Removal of noise or outliers

● Collecting necessary information to model or account for noise

● Strategies for handling missing data fields

17

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 18: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Transformation

● Feature Extraction○ Features by Session○ Features by Item in Session

● Dimensionality Reduction

18

Features by Session

Session time in seconds

Average time between two clicks

Day of the week

Month of the year

Features by Item in Session

Whether the item appears more than once in the

session

Whether the item was clicked first in the session

Number of appearances in the session

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 19: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Data Mining

● Deep learning○ Feedforward Networks○ Convolutional Neural Networks○ Recurrent Neural Networks

● build prediction models

with TensorFlow in Python

● other libraries: Pandas, Numpy, Scikit-Learn

19

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 20: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Interpretation/Evaluation

Confusion matrix & Jaccard Index

Compare to other results:

● Romov & Sokolov: ○ mean Jaccard measure of 0.765

● Weigandt: ○ Random Forest with Bagging: accuracy 88,53%○ Gradient Boosting with XGBoost: accuracy 87%

20

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 21: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Knowledge Discovery in Databases

21

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 22: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Motivation for Anticipatory Shipping

22

Page 23: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Risks of work

● Data Set is not enough to answer the questions

● Quality of the data set

● Deep learning model is not better than Ensemble Learning Models

23

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 24: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Summary

What?

● Analyse:○ Are deep learning models suitable for buy prediction?

○ Is a deep learning model a better buy predictor than an ensemble model?

How?

● Knowledge Discovery in Databases with Python Libraries like TensorFlow

24

Introduction & Motivation Literature Review Research Question Methodological Design Summary

Page 25: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Sources I

Papers:● Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with

categorical features

● Eduard Weigandt - Auf Data-Mining basierende Personalisierung im E-Commerce mit

implizitem Feedback

25

Page 26: Buy Prediction in E-Commerce with Deep Learningubicomp/... · Peter Romov & Evgeny Sokolov - RecSys Challenge 2015: ensemble learning with categorical features Eduard Weigandt - Auf

Sources II

Pictures:

● Customer in E-Commerce:

https://techblog.commercetools.com/top-5-machine-learning-applications-for-e-commerc

e-268eb1c89607 (03.05.2018)

● Motivation for Anticipatory Shipping: https://blog.kinaxis.com/2016/04/17617/

(03.05.2018)

● Knowledge Discovery in Databases:

http://www2.cs.uregina.ca/~dbd/cs831/notes/kdd/1_kdd.html (03.05.2018)

26