collaborative filtering matrix completion alternating least squares€¦ · netflix prize • given...

14
©Sham Kakade 2016 1 Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade May 12, 2016 Collaborative Filtering Matrix Completion Alternating Least Squares Case Study 4: Collaborative Filtering

Upload: others

Post on 03-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

©Sham Kakade 2016 1

Machine Learning for Big Data CSE547/STAT548, University of Washington

Sham Kakade

May 12, 2016

Collaborative FilteringMatrix Completion

Alternating Least Squares

Case Study 4: Collaborative Filtering

Page 2: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Collaborative Filtering

• Goal: Find movies of interest to a user based on movies watched by the user and others

• Methods: matrix factorization

©Sham Kakade 2016 2

Page 3: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

City of God

Wild Strawberries

The Celebration

La Dolce Vita

Women on the Verge of aNervous Breakdown

What do I recommend???

©Sham Kakade 2016 3

Page 4: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Netflix Prize

• Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest accuracy

• 17770 total movies

• 480189 total users

• Over 8 billion total ratings

• How to fill in the blanks?

©Sham Kakade 2016 4Figures from Ben Recht

Page 5: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Matrix Completion Problem

• Filling missing data?

©Sham Kakade 2016 5

Xij known for black cells

Xij unknown for white cells

Rows index moviesColumns index users

X =Rows index users

Columns index movies

Page 6: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Interpreting Low-Rank Matrix Completion (aka Matrix Factorization)

©Sham Kakade 2016 6

X LR’

=

Page 7: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Identifiability of Factors

• If ruv is described by Lu , Rv what happens if we redefine the “topics” as

• Then,

©Sham Kakade 2016 7

X LR’

=

Page 8: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Matrix Completion via Rank Minimization

• Given observed values:

• Find matrix

• Such that:

• But…

• Introduce bias:

• Two issues:

©Sham Kakade 2016 8

Page 9: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Approximate Matrix Completion

• Minimize squared error:– (Other loss functions are possible)

• Choose rank k:

• Optimization problem:

©Sham Kakade 2016 9

Page 10: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Coordinate Descent for Matrix Factorization

• Fix movie factors, optimize for user factors

• First observation:

©Sham Kakade 2016 10

Page 11: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Minimizing Over User Factors

• For each user u:

• In matrix form:

• Second observation: Solve by

©Sham Kakade 2016 11

Page 12: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Coordinate Descent for Matrix Factorization: Alternating Least-Squares

• Fix movie factors, optimize for user factors– Independent least-squares over users

• Fix user factors, optimize for movie factors– Independent least-squares over movies

• System may be underdetermined:

• Converges to

©Sham Kakade 2016 12

Page 13: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

Effect of Regularization

©Sham Kakade 2016 13

X LR’

=

Page 14: Collaborative Filtering Matrix Completion Alternating Least Squares€¦ · Netflix Prize • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings to highest

What you need to know…

• Matrix completion problem for collaborative filtering

• Over-determined -> low-rank approximation

• Rank minimization is NP-hard

• Minimize least-squares prediction for known values for given rank of matrix

– Must use regularization

• Coordinate descent algorithm = “Alternating Least Squares”

©Sham Kakade 2016 14