©Sham Kakade 2016
Machine Learning for Big Data CSE547/STAT548, University of Washington
Sham Kakade
May 12, 2016
Collaborative Filtering / Matrix Completion
Alternating Least Squares
Case Study 4: Collaborative Filtering
Collaborative Filtering
• Goal: Find movies of interest to a user based on movies watched by the user and others
• Methods: matrix factorization
City of God
Wild Strawberries
The Celebration
La Dolce Vita
Women on the Verge of a Nervous Breakdown
What do I recommend???
Netflix Prize
• Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest accuracy
• 17770 total movies
• 480189 total users
• Over 8 billion total (user, movie) pairs — so the vast majority of ratings are missing
• How to fill in the blanks?
Figures from Ben Recht
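As a quick sanity check on the slide's numbers (a small sketch; the figures are taken from the bullets above):

```python
# Netflix Prize dimensions from the slide above.
n_movies = 17_770
n_users = 480_189
n_ratings = 100_000_000

# Total (user, movie) pairs the ratings matrix could hold.
total_cells = n_movies * n_users
print(total_cells)        # 8,532,958,530: the "over 8 billion" figure

# Fraction of the matrix actually observed.
density = n_ratings / total_cells
print(f"{density:.2%}")   # roughly 1.17% of entries are known
```

So the matrix is about 99% empty, which is exactly why completion is needed.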
Matrix Completion Problem
• How do we fill in the missing data?
X = ratings matrix: rows index users, columns index movies
X_uv known for observed (black) cells
X_uv unknown for missing (white) cells
Interpreting Low-Rank Matrix Completion (aka Matrix Factorization)
X = L R'
where row L_u of L is user u's k-dimensional factor and row R_v of R is movie v's k-dimensional factor, so X_uv = L_u · R_v
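A minimal numpy sketch of the low-rank factorization, with toy dimensions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, k = 6, 5, 2      # toy sizes; k is the latent rank

L = rng.normal(size=(n_users, k))   # one k-dim factor L_u per user (row)
R = rng.normal(size=(n_movies, k))  # one k-dim factor R_v per movie (row)

X = L @ R.T                         # X[u, v] = L_u . R_v
print(X.shape)                      # (6, 5)
print(np.linalg.matrix_rank(X))     # at most k = 2
```

Even though X has 30 entries, it is described by only (6 + 5) × 2 = 22 numbers, and the gap grows rapidly at Netflix scale.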
Identifiability of Factors
• If r_uv is described by L_u · R_v, what happens if we redefine the "topics" as L̃ = L A and R̃ = R A^{-T}, for any invertible k × k matrix A?
• Then L̃ R̃' = L A A^{-1} R' = L R', so every prediction r_uv is unchanged: the individual factors are not identifiable — only their product X = L R' is.
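The factors in X = L R' are only determined up to an invertible linear transform A; a quick numerical check of this invariance (an illustrative sketch with random factors):

```python
import numpy as np

rng = np.random.default_rng(1)
k = 3
L = rng.normal(size=(4, k))
R = rng.normal(size=(5, k))
A = rng.normal(size=(k, k))      # any invertible k x k matrix

L2 = L @ A                       # redefined user factors
R2 = R @ np.linalg.inv(A).T      # redefined movie factors

# The products agree, so (L, R) cannot be recovered uniquely from X.
print(np.allclose(L @ R.T, L2 @ R2.T))   # True
```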
Matrix Completion via Rank Minimization
• Given observed values: r_uv for (u, v) in the observed set Ω
• Find matrix: X̂ of minimum rank
• Such that: X̂_uv = r_uv for all (u, v) ∈ Ω
• But… rank minimization is NP-hard
• Introduce bias: fix a rank k and search over X̂ = L R'
• Two issues: the problem is still non-convex, and forcing exact agreement with noisy ratings overfits
Approximate Matrix Completion
• Minimize squared error over the observed set Ω of (u, v) pairs:
  min_{L,R} Σ_{(u,v)∈Ω} (L_u · R_v − r_uv)²
  – (Other loss functions are possible)
• Choose rank k: controls the number of latent "topics" learned per user and per movie
• Optimization problem, with ℓ2 regularization to keep the factors well-behaved:
  min_{L,R} Σ_{(u,v)∈Ω} (L_u · R_v − r_uv)² + λ (‖L‖_F² + ‖R‖_F²)
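The regularized objective can be written down directly. A sketch, where the name `als_objective` and the `obs` list of observed (u, v, rating) triples are illustrative choices, not from the slides:

```python
import numpy as np

def als_objective(L, R, obs, lam):
    """Sum of squared errors on observed ratings plus l2 penalties.

    obs: list of (u, v, r_uv) triples for the observed entries only.
    """
    err = sum((L[u] @ R[v] - r) ** 2 for u, v, r in obs)
    return err + lam * (np.sum(L**2) + np.sum(R**2))

L = np.array([[1.0, 0.0], [0.0, 1.0]])
R = np.array([[1.0, 0.0], [0.0, 2.0]])
obs = [(0, 0, 1.0), (1, 1, 1.0)]
print(als_objective(L, R, obs, lam=0.0))   # (1-1)^2 + (2-1)^2 = 1.0
```

Note the sum runs only over observed entries; the missing cells contribute nothing.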
Coordinate Descent for Matrix Factorization
• Fix movie factors R, optimize for user factors L (then alternate)
• First observation: with R fixed, the objective decouples into an independent problem for each user u, since L_u appears only in the terms for movies that user u rated
Minimizing Over User Factors
• For each user u:
  min_{L_u} Σ_{v∈Ω_u} (L_u · R_v − r_uv)² + λ ‖L_u‖²   (Ω_u = set of movies rated by user u)
• In matrix form: stack the rated movies' factors as the rows of R_u and the ratings as the vector r_u, and minimize ‖R_u L_u − r_u‖² + λ ‖L_u‖²
• Second observation: this is ridge regression; solve by
  L_u = (R_u' R_u + λ I)^{-1} R_u' r_u
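The closed-form per-user ridge solve, as a numpy sketch (the function name `solve_user` is mine, not from the slides):

```python
import numpy as np

def solve_user(R, rated_movies, ratings, lam):
    """Closed-form ridge update for one user's factor L_u.

    R: (n_movies, k) movie factors, held fixed.
    rated_movies: indices of the movies this user rated.
    ratings: the corresponding ratings r_uv.
    """
    R_u = R[rated_movies]                 # stack the rated movies' factors
    k = R.shape[1]
    # L_u = (R_u' R_u + lam I)^{-1} R_u' r_u
    return np.linalg.solve(R_u.T @ R_u + lam * np.eye(k),
                           R_u.T @ np.asarray(ratings))

R = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
L_u = solve_user(R, [0, 1], [2.0, 3.0], lam=0.0)
print(L_u)   # [2. 3.]: reproduces the ratings exactly when unregularized
```

Each user's solve touches only that user's ratings, so all the solves can run in parallel.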
Coordinate Descent for Matrix Factorization: Alternating Least-Squares
• Fix movie factors, optimize for user factors – independent least-squares problems, one per user
• Fix user factors, optimize for movie factors – independent least-squares problems, one per movie
• System may be underdetermined: a user with fewer than k ratings makes R_u' R_u singular; regularization (λ > 0) keeps every solve well-posed
• Converges to a local optimum (a stationary point of the non-convex objective), not necessarily the global minimum
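Putting the two alternating steps together: a minimal dense sketch of ALS (my own toy implementation, not the course's code; a real system would parallelize the independent per-user and per-movie solves):

```python
import numpy as np

def als(obs, n_users, n_movies, k=2, lam=0.1, iters=20, seed=0):
    """Alternating least squares on observed (u, v, r) triples."""
    rng = np.random.default_rng(seed)
    L = rng.normal(scale=0.1, size=(n_users, k))
    R = rng.normal(scale=0.1, size=(n_movies, k))
    by_user = {u: [(v, r) for uu, v, r in obs if uu == u] for u in range(n_users)}
    by_movie = {v: [(u, r) for u, vv, r in obs if vv == v] for v in range(n_movies)}

    def ridge(F, pairs):                 # F: fixed factors; pairs: (index, rating)
        idx = [i for i, _ in pairs]
        y = np.array([r for _, r in pairs])
        F_s = F[idx]
        return np.linalg.solve(F_s.T @ F_s + lam * np.eye(k), F_s.T @ y)

    for _ in range(iters):
        for u, pairs in by_user.items():     # fix R, solve each user independently
            if pairs:
                L[u] = ridge(R, pairs)
        for v, pairs in by_movie.items():    # fix L, solve each movie independently
            if pairs:
                R[v] = ridge(L, pairs)
    return L, R

# Toy check: ratings generated from a rank-1 structure are fit closely.
obs = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 2.0), (1, 1, 4.0)]
L, R = als(obs, n_users=2, n_movies=2, k=1, lam=1e-6, iters=50)
print(np.round(L @ R.T, 2))   # approx [[1, 2], [2, 4]]
```

Each pass only ever solves small k × k systems, which is what makes ALS attractive at Netflix scale.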
Effect of Regularization
[Figure: the factorization X = L R', illustrating the effect of the regularization strength]
What you need to know…
• Matrix completion problem for collaborative filtering
• Completion alone is under-determined -> impose a low-rank approximation
• Rank minimization is NP-hard
• Minimize least-squares prediction error on the known values for a given rank of the matrix
– Must use regularization
• Coordinate descent algorithm = “Alternating Least Squares”