the continuous cold-start problem in e-commerce recommender systems
TRANSCRIPT
Booking.com
The Continuous Cold Start Problem in e-Commerce Recommender Systems
RecSys Vienna 2015-09-20
Lucas Bernardi, Melanie Mueller , Amsterdam Jaap Kamps, Julia Kiseleva Amsterdam & Eindhoven University
Booking.com
Travel recommendations
Destinations related to Vienna:
• User-to-user collaborative filtering:
• Content-based item-to-item recommendation:
Customers who viewed Hotel Sacher Wien also viewed:
Booking.com
• Characteristics of the Continuous Cold Start Problem
Outline
• Addressing the Continuous Cold Start Problem
Booking.com
Recommender Systems
5 1
?
? ?
2
5
4
3
• Task: predict rating of new item• Classical cold start: too many ?
???new user
?
?
?
?
new item
Booking.com
• Cold start = not enough information (yet)
Classical Cold Start
→ Bridge period until warmed up
Other information: popularity, content...
Standard recommender
time →
perfo
rman
ce
Booking.com
• Classical cold-start / sparsity new / rare users
User Continuous Cold Start
• Volatility
• Personas
• Identity
Booking.com
• New users: - Millions unique users/day, growing - Most not logged-in, no cookie
Cold Users
1 51 101 151 201 251 301 351
Day
Activity L
evel
Figure 2: Continuously cold users at Booking.com. Activity levels of two randomly chosen users of Book-ing.com over time. The top user exhibits only rare activity throughout a year, and the bottom user has twodi↵erent personas, making a leisure and a business booking, without much activity inbetween.
Figure 3: Continuously cold items at Booking.com. Thousands of new accommodations are added to Book-ing.com every month (left). The user ratings of a hotel can change continuously (right).
items. Hybrid approaches also ignore the issue of multiplepersonas.
Although, to our knowledge, the continuous cold-startproblem as defined in this work has not been directly ad-dressed in the literature, several approaches are promising.
Tang et al. [19] propose a context-aware recommendersystem, implemented as a contextual multi-armed banditsproblem. Although the authors report extensive o✏ine eval-uation (log based and simulation based) with acceptableCTR, no comparison is made from a cold-start problemstandpoint.
Sun et al. [18] explicitly attack the user volatility prob-lem. They propose a dynamic extension of matrix factoriza-tion where the user latent space is modeled by a state spacemodel fitted by a Kalman filter. Generative data present-ing user preference transitions is used for evaluation. Im-provements of RMSE when compared to timeSVD [10] arereported. Consistent results are reported in [5], after o✏ine
evaluation using real data.Tavakol and Brefeld [20] propose a topic driven recom-
mender system. At the user session level, the user intentis modeled as a topic distribution over all the possible itemattributes. As the user interacts with the system, the userintent is predicted and recommendations are computed usingthe corresponding topic distribution. The topic predictionis solved by factored Markov decision processes. Evaluationon an e-commerce data set shows improvements when com-pared to collaborative filtering methods in terms of averagerank.
4. DISCUSSIONIn this manuscript, we have described how CoCoS, the
continuous cold-start problem, is a common issue for e-com-merce applications. Industrial recommender systems do notonly have to deal with ‘cold’ (new or rare) users and items,but also with known users or items that repeatedly ‘cool
• Sparse users: holidays are rare events!
Booking.com
• Classical cold-start / sparsity new / rare users
User Continuous Cold Start
• Volatility user interest changes over time
• Personas
• Identity
Booking.com
• Classical cold-start / sparsity new / rare users
User Continuous Cold Start
• Volatility user interest changes over time
• Personas different interest at close-by points in time
• Identity
Booking.com
• Leisure versus business:
User Personas
1 11 21 31 41 51 61 71 81 91Day
Activty
Level
Leisure
booking
Business
booking
Figure 2: Continuously cold users at Booking.com. Activity levels of two randomly chosen users of Book-ing.com over time. The top user exhibits only rare activity throughout a year, and the bottom user has twodi↵erent personas, making a leisure and a business booking, without much activity inbetween.
Figure 3: Continuously cold items at Booking.com. Thousands of new accommodations are added to Book-ing.com every month (left). The user ratings of a hotel can change continuously (right).
items. Hybrid approaches also ignore the issue of multiplepersonas.
Although, to our knowledge, the continuous cold-startproblem as defined in this work has not been directly ad-dressed in the literature, several approaches are promising.
Tang et al. [19] propose a context-aware recommendersystem, implemented as a contextual multi-armed banditsproblem. Although the authors report extensive o✏ine eval-uation (log based and simulation based) with acceptableCTR, no comparison is made from a cold-start problemstandpoint.
Sun et al. [18] explicitly attack the user volatility prob-lem. They propose a dynamic extension of matrix factoriza-tion where the user latent space is modeled by a state spacemodel fitted by a Kalman filter. Generative data present-ing user preference transitions is used for evaluation. Im-provements of RMSE when compared to timeSVD [10] arereported. Consistent results are reported in [5], after o✏ine
evaluation using real data.Tavakol and Brefeld [20] propose a topic driven recom-
mender system. At the user session level, the user intentis modeled as a topic distribution over all the possible itemattributes. As the user interacts with the system, the userintent is predicted and recommendations are computed usingthe corresponding topic distribution. The topic predictionis solved by factored Markov decision processes. Evaluationon an e-commerce data set shows improvements when com-pared to collaborative filtering methods in terms of averagerank.
4. DISCUSSIONIn this manuscript, we have described how CoCoS, the
continuous cold-start problem, is a common issue for e-com-merce applications. Industrial recommender systems do notonly have to deal with ‘cold’ (new or rare) users and items,but also with known users or items that repeatedly ‘cool
• Browsing on a sunny versus rainy day
• Weekend city trip versus long holiday
Booking.com
• Classical cold-start / sparsity new / rare users
User Continuous Cold Start
• Volatility user interest changes over time
• Personas different interest at close-by points in time
• Identity failure to match data from same user
Booking.com
• Classical cold-start / sparsity new / rare items
Item Continuous Cold Start
• Volatility item properties/values change over time
• Personas item appeals to different types of users
• Identity failure to match data from same item
Booking.com
• Characteristics of the Continuous Cold Start Problem
Outline
• Addressing the Continuous Cold Start Problem
Booking.com
Addressing Continuous Cold Start
- Ask user
• Continuous cold start = information continuously missing
→ Need more/other information:
Booking.com
- Implicit ratings: buys, clicks, views...
Addressing Continuous Cold Start
- Ask user
• Continuous cold start = information continuously missing
→ Need more/other information:→ can add friction
Booking.com
- Implicit ratings: buys, clicks, views...
Addressing Continuous Cold Start
- Ask user
• Continuous cold start = information continuously missing
→ Need more/other information:
- Popularity
→ can add friction
→ weak signal
Booking.com
- Implicit ratings: buys, clicks, views...
Addressing Continuous Cold Start
- Content
- Ask user
→ surprisingly hard to beat! → not personalized
- item descriptions - user profiles - context
• Continuous cold start = information continuously missing
→ Need more/other information:
- Popularity
→ can add friction
→ weak signal
Booking.com
Endorsements
• From users who stayed at hotel in destination • Free text endorsements since 2013 • 2014: used NLP techniques to extract 256 tags
• 13 000 unique endorsements • 60 000 destinations
Booking.com
Destination Finder Cold Start• Upper funnel: almost no information about user (very cold)
• Construct MIXED profile:Features: - user: operating system, browser - context: weekday - item rating: endorsement <Ubuntu, Firefox, Tuesday, opera> <0,1,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0...>
• Train recommender for each profile• New user + query: find closest profile
• Clustering → mixed content profiles
Booking.com
A/B testing
Recommender Users Click-Through-Rate
No mixed profiles 13,306 18.5 ± 0.4%
With mixed profiles 13,562 22.2 ± 0.4%
Improved by 20%
Booking.com
- Implicit ratings: buys, clicks, views...
Addressing Continuous Cold Start
- Content
- Ask user
→ surprisingly hard to beat! → not personalized
- item descriptions - user profiles - context
• Continuous cold start = information continuously missing
→ Need more/other information:
- Popularity
→ can add friction
→ weak signal
→ mix ‘em up!
Not only bridge warm-up period, but deal with continuous cold start
Booking.com
• Classical cold-start new & rare users / items
Summary: Continuous Cold Start
• Volatility users/items change over time
• Personas different interest at close-by times
• Identity failure to match data from same user/item