the continuous cold-start problem in e-commerce recommender systems

34
Booking.com The Continuous Cold Start Problem in e-Commerce Recommender Systems RecSys Vienna 2015-09-20 Lucas Bernardi, Melanie Mueller , Amsterdam Jaap Kamps, Julia Kiseleva Amsterdam & Eindhoven University

Upload: melanie-ji-mueller

Post on 11-Jan-2017

1.275 views

Category:

Data & Analytics


0 download

TRANSCRIPT

Booking.com

The Continuous Cold Start Problem in e-Commerce Recommender Systems

RecSys Vienna 2015-09-20

Lucas Bernardi, Melanie Mueller , Amsterdam Jaap Kamps, Julia Kiseleva Amsterdam & Eindhoven University

Booking.com

Cold Start in Travel

Booking.com

Cold Start in Travel

Booking.com

Travel recommendations

Destinations related to Vienna:

• User-to-user collaborative filtering:

• Content-based item-to-item recommendation:

Customers who viewed Hotel Sacher Wien also viewed:

Booking.com

• Characteristics of the Continuous Cold Start Problem

Outline

• Addressing the Continuous Cold Start Problem

Booking.com

Recommender Systems

5 1

?

? ?

2

5

4

3

• Task: predict rating of new item• Classical cold start: too many ?

???new user

?

?

?

?

new item

Booking.com

• Cold start = not enough information (yet)

Classical Cold Start

→ Bridge period until warmed up

Other information: popularity, content...

Standard recommender

time →

perfo

rman

ce

Booking.com

• Classical cold-start / sparsity new / rare users

User Continuous Cold Start

• Volatility

• Personas

• Identity

Booking.com

• New users: - Millions unique users/day, growing - Most not logged-in, no cookie

Cold Users

1 51 101 151 201 251 301 351

Day

Activity L

evel

Figure 2: Continuously cold users at Booking.com. Activity levels of two randomly chosen users of Book-ing.com over time. The top user exhibits only rare activity throughout a year, and the bottom user has twodi↵erent personas, making a leisure and a business booking, without much activity inbetween.

Figure 3: Continuously cold items at Booking.com. Thousands of new accommodations are added to Book-ing.com every month (left). The user ratings of a hotel can change continuously (right).

items. Hybrid approaches also ignore the issue of multiplepersonas.

Although, to our knowledge, the continuous cold-startproblem as defined in this work has not been directly ad-dressed in the literature, several approaches are promising.

Tang et al. [19] propose a context-aware recommendersystem, implemented as a contextual multi-armed banditsproblem. Although the authors report extensive o✏ine eval-uation (log based and simulation based) with acceptableCTR, no comparison is made from a cold-start problemstandpoint.

Sun et al. [18] explicitly attack the user volatility prob-lem. They propose a dynamic extension of matrix factoriza-tion where the user latent space is modeled by a state spacemodel fitted by a Kalman filter. Generative data present-ing user preference transitions is used for evaluation. Im-provements of RMSE when compared to timeSVD [10] arereported. Consistent results are reported in [5], after o✏ine

evaluation using real data.Tavakol and Brefeld [20] propose a topic driven recom-

mender system. At the user session level, the user intentis modeled as a topic distribution over all the possible itemattributes. As the user interacts with the system, the userintent is predicted and recommendations are computed usingthe corresponding topic distribution. The topic predictionis solved by factored Markov decision processes. Evaluationon an e-commerce data set shows improvements when com-pared to collaborative filtering methods in terms of averagerank.

4. DISCUSSIONIn this manuscript, we have described how CoCoS, the

continuous cold-start problem, is a common issue for e-com-merce applications. Industrial recommender systems do notonly have to deal with ‘cold’ (new or rare) users and items,but also with known users or items that repeatedly ‘cool

• Sparse users: holidays are rare events!

Booking.com

• Classical cold-start / sparsity new / rare users

User Continuous Cold Start

• Volatility user interest changes over time

• Personas

• Identity

Booking.com

Backpacker hostel

User Volatility

Family hotel

Wellness hotel

Booking.com

• Classical cold-start / sparsity new / rare users

User Continuous Cold Start

• Volatility user interest changes over time

• Personas different interest at close-by points in time

• Identity

Booking.com

• Leisure versus business:

User Personas

1 11 21 31 41 51 61 71 81 91Day

Activty

Level

Leisure

booking

Business

booking

Figure 2: Continuously cold users at Booking.com. Activity levels of two randomly chosen users of Book-ing.com over time. The top user exhibits only rare activity throughout a year, and the bottom user has twodi↵erent personas, making a leisure and a business booking, without much activity inbetween.

Figure 3: Continuously cold items at Booking.com. Thousands of new accommodations are added to Book-ing.com every month (left). The user ratings of a hotel can change continuously (right).

items. Hybrid approaches also ignore the issue of multiplepersonas.

Although, to our knowledge, the continuous cold-startproblem as defined in this work has not been directly ad-dressed in the literature, several approaches are promising.

Tang et al. [19] propose a context-aware recommendersystem, implemented as a contextual multi-armed banditsproblem. Although the authors report extensive o✏ine eval-uation (log based and simulation based) with acceptableCTR, no comparison is made from a cold-start problemstandpoint.

Sun et al. [18] explicitly attack the user volatility prob-lem. They propose a dynamic extension of matrix factoriza-tion where the user latent space is modeled by a state spacemodel fitted by a Kalman filter. Generative data present-ing user preference transitions is used for evaluation. Im-provements of RMSE when compared to timeSVD [10] arereported. Consistent results are reported in [5], after o✏ine

evaluation using real data.Tavakol and Brefeld [20] propose a topic driven recom-

mender system. At the user session level, the user intentis modeled as a topic distribution over all the possible itemattributes. As the user interacts with the system, the userintent is predicted and recommendations are computed usingthe corresponding topic distribution. The topic predictionis solved by factored Markov decision processes. Evaluationon an e-commerce data set shows improvements when com-pared to collaborative filtering methods in terms of averagerank.

4. DISCUSSIONIn this manuscript, we have described how CoCoS, the

continuous cold-start problem, is a common issue for e-com-merce applications. Industrial recommender systems do notonly have to deal with ‘cold’ (new or rare) users and items,but also with known users or items that repeatedly ‘cool

• Browsing on a sunny versus rainy day

• Weekend city trip versus long holiday

Booking.com

• Classical cold-start / sparsity new / rare users

User Continuous Cold Start

• Volatility user interest changes over time

• Personas different interest at close-by points in time

• Identity failure to match data from same user

Booking.com

User Identity Problem

• Different devices

• Not logged in

Booking.com

• Classical cold-start / sparsity new / rare items

Item Continuous Cold Start

• Volatility item properties/values change over time

• Personas item appeals to different types of users

• Identity failure to match data from same item

Booking.com

• Characteristics of the Continuous Cold Start Problem

Outline

• Addressing the Continuous Cold Start Problem

Booking.com

Addressing Continuous Cold Start

- Ask user

• Continuous cold start = information continuously missing

→ Need more/other information:

Booking.com

Ask the User

Booking.com

- Implicit ratings: buys, clicks, views...

Addressing Continuous Cold Start

- Ask user

• Continuous cold start = information continuously missing

→ Need more/other information:→ can add friction

Booking.com

Implicit Ratings

Customers who viewed Hotel Sacher Wien also viewed:

Booking.com

- Implicit ratings: buys, clicks, views...

Addressing Continuous Cold Start

- Ask user

• Continuous cold start = information continuously missing

→ Need more/other information:

- Popularity

→ can add friction

→ weak signal

Booking.com

Destination Finder

Booking.com

- Implicit ratings: buys, clicks, views...

Addressing Continuous Cold Start

- Content

- Ask user

→ surprisingly hard to beat! → not personalized

- item descriptions - user profiles - context

• Continuous cold start = information continuously missing

→ Need more/other information:

- Popularity

→ can add friction

→ weak signal

Booking.com

Destination Finder

Booking.com

Destination Finder

Booking.com

Endorsements

• From users who stayed at hotel in destination • Free text endorsements since 2013 • 2014: used NLP techniques to extract 256 tags

• 13 000 unique endorsements • 60 000 destinations

Booking.com

Endorsements

Booking.com

Endorsements

Booking.com

Destination Finder Cold Start• Upper funnel: almost no information about user (very cold)

• Construct MIXED profile:Features: - user: operating system, browser - context: weekday - item rating: endorsement <Ubuntu, Firefox, Tuesday, opera> <0,1,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0...>

• Train recommender for each profile• New user + query: find closest profile

• Clustering → mixed content profiles

Booking.com

A/B testing

Recommender Users Click-Through-Rate

No mixed profiles 13,306 18.5 ± 0.4%

With mixed profiles 13,562 22.2 ± 0.4%

Improved by 20%

Booking.com

- Implicit ratings: buys, clicks, views...

Addressing Continuous Cold Start

- Content

- Ask user

→ surprisingly hard to beat! → not personalized

- item descriptions - user profiles - context

• Continuous cold start = information continuously missing

→ Need more/other information:

- Popularity

→ can add friction

→ weak signal

→ mix ‘em up!

Not only bridge warm-up period, but deal with continuous cold start

Booking.com

• Classical cold-start new & rare users / items

Summary: Continuous Cold Start

• Volatility users/items change over time

• Personas different interest at close-by times

• Identity failure to match data from same user/item

Booking.com

Timely Personalized Recommendation: Don’t do lunch cold start