

Machine Learning for Big Data CSE547/STAT548, University of Washington

Sham Kakade

May 12, 2016

Collaborative Filtering

Matrix Completion

Alternating Least Squares

Case Study 4: Collaborative Filtering

Collaborative Filtering

• Goal: Find movies of interest to a user based on movies watched by the user and others

• Methods: matrix factorization


City of God

Wild Strawberries

The Celebration

La Dolce Vita

Women on the Verge of a Nervous Breakdown

What do I recommend???


Netflix Prize

• Given 100 million ratings on a scale of 1 to 5, predict 3 million held-out ratings as accurately as possible

• 17,770 total movies

• 480,189 total users

• Over 8.5 billion total (movie, user) pairs (17,770 × 480,189), of which only about 1% carry a rating

• How to fill in the blanks?

Figures from Ben Recht

Matrix Completion Problem

• Filling in missing data?


[Figure: the partially observed ratings matrix X. X_ij is known for the black cells and unknown for the white cells; rows index users and columns index movies (equivalently, the transposed view has rows indexing movies and columns indexing users).]
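As a concrete (hypothetical) data layout matching the figure, the observed entries can be held as sparse (user, movie, rating) triples; the dense array below is only for illustration and would never be materialized at Netflix scale. All names and the toy sizes are mine:

```python
# A minimal sketch (toy data): the ratings matrix is ~99% unobserved, so it is
# stored sparsely as (user, movie, rating) triples rather than as a dense array.
import numpy as np

n_users, n_movies = 5, 4                    # toy sizes; Netflix: 480189 x 17770
observed = [                                # (user index, movie index, rating)
    (0, 1, 5.0), (0, 3, 3.0),
    (1, 0, 4.0), (2, 2, 2.0),
    (3, 1, 1.0), (4, 0, 5.0),
]

# Dense view only for illustration: NaN marks the unknown (white) cells.
X = np.full((n_users, n_movies), np.nan)
for u, v, r in observed:
    X[u, v] = r
```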

Interpreting Low-Rank Matrix Completion (aka Matrix Factorization)


X = L R': each row L_u of L holds user u's weights on k latent "topics", each row R_v of R holds movie v's loadings on those topics, and the predicted rating is X_uv ≈ L_u · R_v.
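As a concrete reading of the factorization, here is a minimal NumPy sketch (toy sizes; all variable names are mine, not the course's):

```python
# A minimal sketch of the low-rank interpretation X ≈ L R': row L[u] is user u's
# factor vector, row R[v] is movie v's factor vector, and a predicted rating is
# the inner product of the two.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, k = 5, 4, 2
L = rng.normal(size=(n_users, k))    # user factors, one row per user
R = rng.normal(size=(n_movies, k))   # movie factors, one row per movie

X_hat = L @ R.T                      # full predicted rating matrix (rank <= k)
r_23 = L[2] @ R[3]                   # predicted rating of user 2 for movie 3
assert np.isclose(X_hat[2, 3], r_23)
```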

Identifiability of Factors

• If r_uv is described by L_u · R_v, what happens if we redefine the "topics", e.g. replace L by L Q and R by R Q^{-T} for any invertible k × k matrix Q (or simply rescale L_u by c and R_v by 1/c)?

• Then every prediction is unchanged: (L Q)_u · (R Q^{-T})_v = L_u · R_v, so the individual factors are not identifiable; only the product L R' is determined by the data (see the check below).


[Figure: X = L R']
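A small numeric check of this non-identifiability on toy factors (the sizes and the particular Q are arbitrary choices of mine):

```python
# Replacing L by L Q and R by R Q^{-T}, for any invertible k x k matrix Q,
# leaves the product L R' unchanged, so the individual factors are not unique.
import numpy as np

rng = np.random.default_rng(1)
L = rng.normal(size=(5, 2))                # user factors (n x k)
R = rng.normal(size=(4, 2))                # movie factors (m x k)
Q = np.array([[2.0, 1.0], [0.0, 3.0]])     # any invertible k x k matrix

L_tilde = L @ Q
R_tilde = R @ np.linalg.inv(Q).T
assert np.allclose(L_tilde @ R_tilde.T, L @ R.T)
```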

Matrix Completion via Rank Minimization

• Given observed values: the ratings r_uv for the observed (user, movie) pairs (u, v) in Ω

• Find matrix: a completed matrix X

• Such that: X_uv = r_uv for every observed pair (u, v) in Ω

• But… the unobserved entries are completely unconstrained, so the completion is hopelessly underdetermined

• Introduce bias: prefer the simplest explanation, i.e. minimize rank(X) subject to matching the observed entries

• Two issues: rank minimization is NP-hard, and forcing exact equality on noisy ratings overfits; both motivate the approximate formulation on the next slide
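Written out in the notation above, the (intractable) exact program is roughly:

```latex
% Exact rank-minimization completion (Omega = set of observed (u,v) pairs).
\begin{aligned}
  \min_{X}\quad           & \operatorname{rank}(X) \\
  \text{subject to}\quad  & X_{uv} = r_{uv} \quad \text{for all } (u,v) \in \Omega
\end{aligned}
```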


Approximate Matrix Completion

• Minimize squared error over the observed entries: Σ over observed (u, v) of (L_u · R_v - r_uv)^2
– (Other loss functions are possible)

• Choose rank k: L has one length-k row per user and R has one length-k row per movie, with k much smaller than the number of users or movies

• Optimization problem: min over L, R of Σ over observed (u, v) of (L_u · R_v - r_uv)^2 (a sketch of this objective follows below)
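A minimal sketch of this objective as code (the function name and the optional penalty lam are mine; the penalty anticipates the regularization discussed later):

```python
# Squared error summed over the observed (u, v) pairs, plus an optional
# L2 penalty on the factors.
import numpy as np

def squared_error_objective(L, R, observed, lam=0.0):
    """observed: iterable of (user u, movie v, rating r_uv) triples."""
    err = sum((L[u] @ R[v] - r) ** 2 for u, v, r in observed)
    return err + lam * (np.sum(L ** 2) + np.sum(R ** 2))
```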


Coordinate Descent for Matrix Factorization

• Fix movie factors, optimize for user factors

• First observation: with the movie factors R held fixed, the objective decomposes into an independent least-squares problem for each user u, since each term (L_u · R_v - r_uv)^2 involves only that user's factor L_u (a small check follows below)
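A small numeric check of this decoupling on toy data (all names are mine):

```python
# With R fixed, the total squared error is a sum of independent per-user pieces,
# because each term touches only that user's factor L[u].
import numpy as np

rng = np.random.default_rng(2)
L = rng.normal(size=(3, 2))
R = rng.normal(size=(4, 2))
observed = [(0, 1, 4.0), (0, 2, 3.0), (1, 0, 5.0), (2, 3, 1.0)]

total = sum((L[u] @ R[v] - r) ** 2 for u, v, r in observed)
per_user = [sum((L[uu] @ R[v] - r) ** 2 for uu, v, r in observed if uu == u)
            for u in range(3)]
assert np.isclose(total, sum(per_user))
```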


Minimizing Over User Factors

• For each user u: minimize Σ over the movies v rated by u of (L_u · R_v - r_uv)^2

• In matrix form: minimize ||R_u L_u - r_u||^2, where R_u stacks the factor rows R_v of the movies u has rated and r_u stacks the corresponding ratings

• Second observation: solve by ordinary least squares, i.e. the normal equations (R_u' R_u) L_u = R_u' r_u (a sketch follows below)
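A minimal sketch of the per-user solve (function and argument names are mine; the small ridge term anticipates the regularization discussed on the next slide):

```python
# Closed-form least-squares update for one user's factor from the movies they
# rated. A small ridge term lam keeps the k x k system nonsingular when the
# user has rated fewer than k movies.
import numpy as np

def solve_user_factor(R_u, r_u, lam=1e-6):
    k = R_u.shape[1]
    A = R_u.T @ R_u + lam * np.eye(k)   # k x k normal-equations matrix
    b = R_u.T @ r_u
    return np.linalg.solve(A, b)        # L_u = (R_u'R_u + lam I)^{-1} R_u' r_u
```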


Coordinate Descent for Matrix Factorization: Alternating Least-Squares

• Fix movie factors, optimize for user factors
– Independent least-squares over users

• Fix user factors, optimize for movie factors
– Independent least-squares over movies

• System may be underdetermined: a user (or movie) with fewer than k ratings gives a singular normal-equations matrix, so add an L2 penalty λ(||L||^2 + ||R||^2), i.e. ridge-regularize each solve

• Converges to: a local minimum (a stationary point) of the objective, not necessarily the global optimum (a compact sketch of the full algorithm follows below)
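Putting the pieces together, a compact ALS sketch on toy-style data (all names, defaults, and the dense bookkeeping are mine, not the course code; a real implementation at Netflix scale would use sparse data structures and parallel per-user/per-movie solves):

```python
# Alternating least squares: alternate closed-form regularized least-squares
# updates for the user factors L and movie factors R, using only the observed
# (user, movie, rating) triples.
import numpy as np

def als(observed, n_users, n_movies, k=10, lam=0.1, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    L = rng.normal(scale=0.1, size=(n_users, k))
    R = rng.normal(scale=0.1, size=(n_movies, k))
    by_user = [[(v, r) for uu, v, r in observed if uu == u] for u in range(n_users)]
    by_movie = [[(u, r) for u, vv, r in observed if vv == v] for v in range(n_movies)]

    def ls_update(fixed, ratings):
        # Regularized normal equations for one factor row:
        # (sum_j fixed[j] fixed[j]' + lam I) x = sum_j r_j fixed[j]
        A = lam * np.eye(k)
        b = np.zeros(k)
        for j, r in ratings:
            A += np.outer(fixed[j], fixed[j])
            b += r * fixed[j]
        return np.linalg.solve(A, b)

    for _ in range(n_iters):
        for u in range(n_users):        # fix R, update each user factor independently
            if by_user[u]:
                L[u] = ls_update(R, by_user[u])
        for v in range(n_movies):       # fix L, update each movie factor independently
            if by_movie[v]:
                R[v] = ls_update(L, by_movie[v])
    return L, R

# Toy usage: three users, four movies, a handful of ratings.
ratings = [(0, 1, 4.0), (0, 2, 3.0), (1, 0, 5.0), (1, 2, 4.0), (2, 3, 1.0)]
L, R = als(ratings, n_users=3, n_movies=4, k=2)
print(L @ R.T)   # predicted ratings for every (user, movie) pair
```

Each inner solve is the same regularized normal-equations update as in the previous sketch, applied alternately to the user factors and the movie factors.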


Effect of Regularization


[Figure: X = L R']

What you need to know…

• Matrix completion problem for collaborative filtering

• Over-determined: with a small rank k the observed ratings cannot all be matched exactly, so settle for a low-rank approximation (least-squares fit)

• Rank minimization is NP-hard

• Minimize the least-squares prediction error on the known values for a given matrix rank

– Must use regularization

• Coordinate descent algorithm = “Alternating Least Squares”

