Music recommendations at Spotify
Erik Bernhardsson, Spotify
![Page 2: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/2.jpg)
Spotify
- Launched in 2009
- Available in 17 countries
- 20M active users, 5M paying subscribers
- Peak at 5k tracks/s, 1M logged in users
- 20M tracks
![Page 3: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/3.jpg)
Some applications
![Page 4: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/4.jpg)
Recommendation stuff at Spotify
- Related artists:
![Page 5: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/5.jpg)
Recommendation stuff at Spotify, cont…
![Page 6: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/6.jpg)
More!
![Page 7: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/7.jpg)
How can we find music?
![Page 8: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/8.jpg)
Recommendations
- Manual classification
- Feature extraction
- Social media analysis, web scraping, metadata based
- Collaborative filtering
![Page 9: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/9.jpg)
Pandora & Music Genome Project
- Classifies tracks in terms of 400 attributes
- Each track takes 20-30 minutes to classify
- A distance function finds similar tracks

- “Subtle use of strings”
- “Epic buildup”
- “Acid Jazz roots”
- “Beats made for dancing”
- “Trippy soundscapes”
- “Great trombone solo”
- …
![Page 10: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/10.jpg)
Scraping the web is another approach
![Page 11: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/11.jpg)
Feature extraction
![Page 12: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/12.jpg)
Collaborative filtering
Idea:
- If two movies x, y get similar ratings then they are probably similar
- If a lot of users all listen to tracks x, y, z, then those tracks are probably similar
![Page 13: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/13.jpg)
Collaborative filtering
![Page 14: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/14.jpg)
Get data
![Page 15: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/15.jpg)
… lots of data
![Page 16: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/16.jpg)
Aggregate data
Throw away temporal information and just look at the number of times each user played each track
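
A minimal sketch of this aggregation step, assuming a play log of `(user, track, timestamp)` tuples. The log layout, the names and the sample rows are hypothetical and only illustrate dropping the timestamp and keeping a count per user/track pair.

```python
from collections import Counter

# Hypothetical play log: (user, track, timestamp). The layout is an
# assumption for illustration, not the actual log format.
plays = [
    ("alice", "track_x", 1000),
    ("alice", "track_x", 1060),
    ("alice", "track_y", 1100),
    ("bob", "track_x", 2000),
    ("bob", "track_z", 2050),
    ("bob", "track_z", 2100),
    ("bob", "track_z", 2500),
]


def aggregate(log):
    """Drop the timestamps and count plays per (user, track) pair."""
    return Counter((user, track) for user, track, _timestamp in log)


counts = aggregate(plays)
for (user, track), n in sorted(counts.items()):
    print(user, track, n)
```

Each `(user, track)` key corresponds to one non-zero cell of the user-by-track matrix; every pair that never appears in the log is an implicit zero.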
![Page 17: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/17.jpg)
OK, so now we have a big matrix
![Page 18: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/18.jpg)
… very big matrix
Throw out all the temporal data:
![Page 19: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/19.jpg)
Supervised collaborative filtering is pretty much matrix completion
![Page 20: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/20.jpg)
Supervised learning: Matrix completion
![Page 21: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/21.jpg)
Supervised: evaluating rec quality
![Page 22: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/22.jpg)
Unsupervised learning
- Trying to estimate the density
- i.e. predict probability of future events
![Page 23: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/23.jpg)
Try to predict the future given the past
![Page 24: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/24.jpg)
How can we find similar items?
![Page 25: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/25.jpg)
We can calculate a correlation coefficient as an item similarity
- Use something like Pearson, Jaccard, …
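
The two measures named above can be sketched in a few lines. This assumes each item is described either by the set of users who played it (Jaccard) or by a vector of per-user play counts (Pearson); the sample data is made up for illustration.

```python
import math


def jaccard(users_a, users_b):
    """Jaccard similarity of the sets of users who played each item."""
    a, b = set(users_a), set(users_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pearson(xs, ys):
    """Pearson correlation of two equal-length play-count vectors."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)


# Made-up example: listeners per track, and play counts for four users.
listeners_x = {"alice", "bob", "carol"}
listeners_y = {"bob", "carol", "dave"}
counts_x = [3, 0, 2, 5]
counts_y = [6, 0, 4, 10]

sim_jaccard = jaccard(listeners_x, listeners_y)
sim_pearson = pearson(counts_x, counts_y)
```

Jaccard ignores how often a track was played, while Pearson uses the counts, so the two can rank the same pair of items quite differently.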
![Page 26: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/26.jpg)
Amazon did this for “customers who bought this also bought”
- US patent 7113917
![Page 27: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/27.jpg)
Parallelization is hard though
![Page 28: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/28.jpg)
Can speed this up using various LSH tricks
- Twitter: Dimension Independent Similarity Computation (DISCO)
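
As one illustration of an LSH trick, here is a sketch of random-hyperplane hashing for cosine similarity. This is not the DISCO algorithm; it is a different, commonly used LSH family, and the function names and parameters below are hypothetical. Items whose vectors point in similar directions tend to get the same bit signature, so only items in the same bucket need to be compared exactly.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        sum(h * v for h, v in zip(plane, vec)) >= 0 for plane in hyperplanes
    )


def bucket_items(items, hyperplanes):
    """Group item names by signature; candidates share a bucket."""
    buckets = {}
    for name, vec in items.items():
        buckets.setdefault(signature(vec, hyperplanes), []).append(name)
    return buckets


planes = make_hyperplanes(num_bits=8, dim=3)
v = [1.0, 2.0, -0.5]
doubled = [2.0 * x for x in v]   # same direction, different length
flipped = [-x for x in v]        # opposite direction

buckets = bucket_items({"v": v, "doubled": doubled, "flipped": flipped}, planes)
```

More bits give smaller, purer buckets at the cost of missing some true neighbours; in practice several independent hash tables are often combined to recover them.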
![Page 29: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/29.jpg)
Are there other approaches?
![Page 30: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/30.jpg)
Natural Language Processing has a lot of similar problems
…matrix factorization is one idea
![Page 31: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/31.jpg)
Matrix factorization
![Page 32: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/32.jpg)
Matrix factorization
- Want to get user vectors and item vectors
- Assume f latent factors (dimensions) for each user/item
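
To make the shapes concrete, here is a small sketch that produces one f-dimensional vector per user and per item. It uses a truncated SVD from NumPy as a stand-in factorization, which is an assumption for illustration and not necessarily a method from the talk; the count matrix is made up.

```python
import numpy as np

# Made-up play counts: rows are users, columns are items.
counts = np.array(
    [
        [5, 3, 0, 0],
        [4, 2, 0, 0],
        [0, 0, 4, 5],
        [0, 0, 3, 4],
    ],
    dtype=float,
)

f = 2  # number of latent factors

# Truncated SVD as a stand-in: keep the f largest singular values.
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
user_vectors = U[:, :f] * np.sqrt(s[:f])      # shape (n_users, f)
item_vectors = Vt[:f, :].T * np.sqrt(s[:f])   # shape (n_items, f)

# The product of the two factor matrices approximates the original matrix.
approx = user_vectors @ item_vectors.T
error = np.linalg.norm(counts - approx)
```

Whatever algorithm produces them, the output has the same form: a `(n_users, f)` matrix and a `(n_items, f)` matrix whose product approximates the observed data.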
![Page 33: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/33.jpg)
Probabilistic Latent Semantic Analysis (PLSA)
- Hofmann, 1999
- Also called PLSI
![Page 34: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/34.jpg)
PLSA, cont.
+ a bunch of constraints:
![Page 35: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/35.jpg)
PLSA, cont.
Optimization problem: maximize log-likelihood
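
For reference, one standard way to write PLSA for play data is sketched below. The notation is an assumption and may differ from the slides: $c_{ui}$ is the number of times user $u$ played item $i$, and $z$ ranges over the $f$ latent factors.

$$
P(i \mid u) = \sum_{z=1}^{f} P(i \mid z)\, P(z \mid u),
\qquad \sum_{i} P(i \mid z) = 1,
\qquad \sum_{z} P(z \mid u) = 1
$$

$$
\log L = \sum_{u,i} c_{ui} \, \log P(i \mid u)
$$

The constraints only state that each factor is a probability distribution over items and each user is a probability distribution over factors; the optimization maximizes $\log L$ subject to them.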
![Page 36: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/36.jpg)
PLSA, cont.
![Page 37: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/37.jpg)
![Page 38: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/38.jpg)
![Page 39: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/39.jpg)
![Page 40: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/40.jpg)
![Page 41: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/41.jpg)
“Collaborative Filtering for Implicit Feedback Datasets”
- Hu, Koren, Volinsky (2008)
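
A compact sketch of the alternating least squares scheme from that paper, written with dense NumPy arrays for readability. It follows the paper's definitions (binary preference $p_{ui}$, confidence $c_{ui} = 1 + \alpha r_{ui}$, L2 regularization $\lambda$); the function name, the default values and the toy matrix are assumptions, and this dense version would not scale to a real catalogue.

```python
import numpy as np


def als_implicit(R, f=2, alpha=40.0, lam=0.1, iters=15, seed=0):
    """Implicit-feedback ALS in the style of Hu, Koren and Volinsky (2008).

    R is an (n_users, n_items) array of non-negative play counts.
    Returns user factors X (n_users, f) and item factors Y (n_items, f).
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    X = rng.normal(scale=0.1, size=(n_users, f))
    Y = rng.normal(scale=0.1, size=(n_items, f))
    P = (R > 0).astype(float)  # preference: 1 if played at all
    C = 1.0 + alpha * R        # confidence grows with play count

    def solve_side(fixed, conf, pref):
        # Per row: solve (F^T C F + lam * I) x = F^T C p with F held fixed.
        out = np.empty((conf.shape[0], f))
        for k in range(conf.shape[0]):
            weighted = fixed * conf[k][:, None]
            A = fixed.T @ weighted + lam * np.eye(f)
            b = weighted.T @ pref[k]
            out[k] = np.linalg.solve(A, b)
        return out

    for _ in range(iters):
        X = solve_side(Y, C, P)      # update users with items fixed
        Y = solve_side(X, C.T, P.T)  # update items with users fixed
    return X, Y


# Made-up play counts: two groups of users with disjoint tastes.
R = np.array(
    [
        [5, 3, 0, 0],
        [4, 2, 0, 0],
        [0, 0, 4, 5],
        [0, 0, 3, 4],
    ],
    dtype=float,
)
X, Y = als_implicit(R)
scores = X @ Y.T
```

Each half-step is an exact least-squares solve, which is why no learning rate appears. The paper additionally precomputes $Y^T Y$ so that the per-user work depends only on the items that user actually played.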
![Page 42: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/42.jpg)
“Collaborative Filtering for Implicit Feedback Datasets”, cont.
![Page 43: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/43.jpg)
Here is another method we use
![Page 44: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/44.jpg)
What happens each iteration
- Assign all latent vectors small random values
- Perform gradient ascent to optimize log-likelihood
![Page 45: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/45.jpg)
Calculate derivative and do gradient ascent
- Assign all latent vectors small random values
- Perform gradient ascent to optimize log-likelihood
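
The two steps can be sketched as follows. The transcript does not include the model's likelihood, so this example assumes a simple Bernoulli model in which the probability that a user played a track is a sigmoid of the dot product of their vectors. That model, the function names and the hyperparameters are assumptions chosen only to show the initialize-then-ascend loop.

```python
import numpy as np


def log_likelihood(P, U, V):
    """Bernoulli log-likelihood of the binary matrix P under sigmoid(U V^T)."""
    S = U @ V.T
    # log(sigmoid(s)) = -log(1 + exp(-s)); log(1 - sigmoid(s)) = -log(1 + exp(s))
    return float(
        np.sum(P * -np.logaddexp(0.0, -S) + (1.0 - P) * -np.logaddexp(0.0, S))
    )


def fit(P, f=2, lr=0.05, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: assign all latent vectors small random values.
    U = rng.normal(scale=0.01, size=(P.shape[0], f))
    V = rng.normal(scale=0.01, size=(P.shape[1], f))
    history = [log_likelihood(P, U, V)]
    # Step 2: gradient ascent on the log-likelihood.
    for _ in range(iters):
        residual = P - 1.0 / (1.0 + np.exp(-(U @ V.T)))  # dL/dS
        grad_U = residual @ V
        grad_V = residual.T @ U
        U, V = U + lr * grad_U, V + lr * grad_V
        history.append(log_likelihood(P, U, V))
    return U, V, history


# Made-up binary "played" matrix: rows are users, columns are tracks.
P = np.array(
    [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
    ],
    dtype=float,
)
U, V, history = fit(P)
```

The small random initialization matters: starting from exactly zero would make both gradients zero, so the vectors would never move.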
![Page 46: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/46.jpg)
2D iteration example
![Page 47: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/47.jpg)
Vectors are pretty nice because things are now super fast
- User-item score is a dot product:
- Item-item similarity score is a cosine similarity:
- Both cases have trivial complexity in the number of factors f:
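
Both scores can be sketched in plain Python. Each is a single pass over the f components of the vectors, so the cost is linear in f; the example vectors are made up.

```python
import math


def dot(a, b):
    """User-item score: dot product of a user vector and an item vector."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Item-item similarity: cosine of the angle between two item vectors."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


# Made-up vectors with f = 2 latent factors.
user = [1.0, 2.0]
track_a = [2.0, 1.0]
track_b = [4.0, 2.0]  # same direction as track_a, twice the length

user_item_score = dot(user, track_a)
item_item_similarity = cosine(track_a, track_b)
```

Cosine similarity ignores vector length, so `track_a` and `track_b` count as maximally similar even though their dot products with the user differ.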
![Page 48: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/48.jpg)
Example: item similarity as a cosine of vectors
![Page 49: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/49.jpg)
Two dimensional example for tracks
![Page 50: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/50.jpg)
We can rank all tracks by the user’s vector
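
A minimal sketch of that ranking, assuming track vectors are held in a dictionary keyed by track name. The function name `rank_tracks` and the sample vectors are hypothetical.

```python
def rank_tracks(user_vec, track_vecs, top_k=None):
    """Return track names sorted by descending dot-product score."""

    def score(vec):
        return sum(u * t for u, t in zip(user_vec, vec))

    ranked = sorted(track_vecs, key=lambda name: score(track_vecs[name]), reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Made-up two-dimensional vectors.
tracks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
user = [0.9, 0.2]

ranking = rank_tracks(user, tracks)
best = rank_tracks(user, tracks, top_k=1)
```

Scoring every track this way is linear in the catalogue size, which is fine for a sketch but is the part that needs care at the scale of millions of tracks.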
![Page 51: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/51.jpg)
So how do we implement this?
![Page 52: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/52.jpg)
Hadoop at Spotify
![Page 53: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/53.jpg)
One iteration of a matrix factorization algorithm
“Google News personalization: scalable online collaborative filtering”
![Page 54: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/54.jpg)
![Page 55: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/55.jpg)
So now we have solved the problem of recommendations, right?
![Page 56: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/56.jpg)
Actually what we really want is to apply it to other domains
![Page 57: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/57.jpg)
Radio
- Artist radio: find related tracks
- Optimize ensemble model based on skip/thumbs data
![Page 58: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/58.jpg)
Learning from feedback is actually pretty hard
![Page 59: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/59.jpg)
A/B testing
![Page 60: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/60.jpg)
More applications!!!
![Page 61: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/61.jpg)
![Page 62: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/62.jpg)
![Page 63: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/63.jpg)
Last but not least: we’re hiring!
![Page 64: Music r ecommendations at Spotify](https://reader038.vdocuments.net/reader038/viewer/2022102912/568168aa550346895ddf4b8f/html5/thumbnails/64.jpg)
Thank you