beer recommender system

Post on 16-Apr-2017


  • Beer Recommender Systems

    Hsiang-Hsuan Hung (HHH), Hsiang.Hung2015@gmail.com

    https://github.com/HsiangHung/BI-analysis/Recommender

    http://www.hsianghung.tech

  • I like Budweiser!!!

    5 13

    Beer Recommendation

    ?

  • I like Budweiser!!!

    5 13

    1 3.5 5

    I don't like Budweiser

    Beer Recommendation

    ?

    ?

  • Supervised learning, regression problems.

    Central concept: Similarity

    1.) item-item recommendation (Amazon)

    2.) user-item recommendation

    Collaborative filtering (Netflix, Spotify)

    Personalized Recommendation

  • Recommender Pipeline

    Figure labels: e-commerce data, rating data, web server, browse, reco, Hybrid recommender

  • Recommendation Engine Dashboard

    item-item reco, user-item reco

    When Mr. Simpsons (id=7) is browsing beer-144:

  • Collaborative Filtering (CF)

    Rating table (DB: BeHoppy): rows are users (ids 7, 33, 34, 42, 45, 47, 48), columns are beers (ids 5 to 12). Many entries are missing.

    Beer Vectors (one component per user, ? = no rating):

    b8 = (1, ?, 5, ?, 1, ?, 4)

    b9 = (5, ?, 1, ?, 5, 4, ?)

    b10 = (5, 1, 2, ?, 5, 4, 1)

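  • Beer Vector Space and Cosine Similarity

    Figure: beer vectors b1, b2, b3, b4 drawn in the space of user ratings.

    $b \in \mathbb{R}^{\text{num of users}}$

    $s_{A,B} = \dfrac{b_A \cdot b_B}{|b_A|\,|b_B|}$

    One beer is more similar to a second beer than to a third when the angle between the first pair of vectors is smaller.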
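As a minimal sketch of the cosine similarity above, the snippet below compares the three beer vectors from the rating-table slide. Treating a missing rating as 0 is an assumption made here for illustration; the slides do not say how blanks are handled.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no ratings at all: define the similarity as 0
    return dot / (norm_a * norm_b)


# Components follow users (7, 33, 34, 42, 45, 47, 48).
# Assumption: 0 stands in for "no rating".
b8 = [1, 0, 5, 0, 1, 0, 4]
b9 = [5, 0, 1, 0, 5, 4, 0]
b10 = [5, 1, 2, 0, 5, 4, 1]

s_9_10 = cosine_similarity(b9, b10)
s_8_10 = cosine_similarity(b8, b10)
print(f"s(b9, b10) = {s_9_10:.3f}")
print(f"s(b8, b10) = {s_8_10:.3f}")
```

Under this zero-fill assumption, beer 10 comes out closer to beer 9 than to beer 8, so beer 9 would be the stronger item-item candidate for someone browsing beer 10.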

  • Recommendation Engine Dashboard

    item-item reco, user-item reco

    When Mr. Simpsons (id=7) is browsing beer-144:

  • Neighborhood Models

    Model A: item-based, $b \in \mathbb{R}^m$

    $r_{u,2} = 3$, $r_{u,3} = 4.5$, $r_{u,4} = 1$, $r_{u,5} = 4$, $R_{u,1} = ?$

    Model B: user-based, $u \in \mathbb{R}^n$

    $r_{H,x} = 1$, $r_{M,x} = 4.5$, $r_{T,x} = 5$, $R_{S,x} = ?$

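  • Model C: Latent-Factor Model (Easily Scale Up)

    Koren, Bell and Volinsky, Computer (2009)

    Rating matrix (m users × n beers) ≈ users' preferences (m × f) × beer features (f × n)

    Figure: example rating rows for users with factor vectors $(u_S), (u_H), (u_K), \ldots$ (user ids 7, 33, 34, 42, 45, 47, 48) and beer factor vectors $b_1, b_2, b_3, \ldots$ (beer ids 5 to 11).

    Prediction: $r_{u,i} \simeq u_u^T b_i$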

  • User-Beer Vector Space

    Figure: user vector $u_S$ and beer vector $b_1$ in the same latent space.

    $u, b \in \mathbb{R}^f$

    $R_{S,1} = u_S^T b_1$

    f: # of latent factors
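To make the dot-product prediction concrete, here is a small sketch that scores a few beers for one user and ranks them. The factor values, the choice f = 3, and the names `predict_rating`, `u_S`, and `beers` are all hypothetical; in practice the factors would come from a fitted model.

```python
def predict_rating(user_factors, beer_factors):
    """Predicted rating R = u^T b for one user and one beer."""
    return sum(u * b for u, b in zip(user_factors, beer_factors))


# Hypothetical latent factors with f = 3 (illustration only, not fitted values).
u_S = [1.2, 0.5, -0.3]
beers = {
    "b1": [2.0, 1.0, 0.5],
    "b2": [0.5, 2.0, 1.0],
    "b3": [3.0, 0.0, -1.0],
}

predictions = {name: predict_rating(u_S, b) for name, b in beers.items()}

# Recommend in order of decreasing predicted rating.
ranked = sorted(predictions, key=predictions.get, reverse=True)
print(predictions)
print(ranked)
```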

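  • Models Performance

    $\mathrm{MAE} = \dfrac{1}{N} \sum_{u,i} |R_{u,i} - r_{u,i}|$

    Here $R_{u,i}$ is the predicted rating, $r_{u,i}$ the observed rating, and N the number of rated (u, i) pairs.

    k=10-20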
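The MAE metric can be sketched in a few lines. The predicted and observed ratings below are made-up numbers for illustration, not results from the talk.

```python
def mean_absolute_error(predicted, actual):
    """MAE = (1/N) * sum of |R_ui - r_ui| over the N evaluated (user, beer) pairs."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical held-out ratings and model predictions.
predicted = [4.1, 3.0, 1.5, 4.8]
actual = [4.5, 3.0, 1.0, 4.0]

mae = mean_absolute_error(predicted, actual)
print(f"MAE = {mae:.3f}")
```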

  • Challenges

    Cold Start (need more ratings).

    Integrate implicit data: e-commerce data.

    Explicit rating: $r_{u,b} = 1 \sim 5$

    Implicit purchase frequency: $r_{u,b} \in I$ (Hu, Koren and Volinsky, 2008)

    Define confidence for each customer.

  • My Background

    Physics PhD@UCSD (2011)

    Computational Physicist@UT Austin and UIUC (2012-2015)

    Computational physics and materials science

    Data Engineering Fellow@Insight (2016)

    Data Scientist/Engineer@Anheuser-Busch (2016)

    Thank you!

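  • Model A: Item-based Neighborhood

    Sarwar, Karypis, Konstan, and Riedl (2001)

    $b \in \mathbb{R}^m$

    $r_{u,2} = 3$, $r_{u,3} = 4.5$, $r_{u,4} = 1$, $r_{u,5} = 4$, $R_{u,1} = ?$

    kNN + weight (the weight is the similarity $s_{ij}$):

    $R_{u,i} = \dfrac{\sum_{j \in S_k} s_{ij}\, r_{u,j}}{\sum_{j \in S_k} |s_{ij}|}$

    $R_{u,1} = \dfrac{0.8 \times 4.5 + 0.7 \times 4 + 0.2 \times 3}{0.8 + 0.7 + 0.2} \approx 4.12$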
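The weighted kNN prediction on this slide can be sketched as follows. The similarities 0.8, 0.7, and 0.2 and their pairing with the ratings 4.5, 4, and 3 are read off the slide's worked sum. The similarity 0.1 for the beer rated 1 is a hypothetical value added here so that k = 3 has something to exclude, and the function name is illustrative.

```python
def weighted_knn_prediction(neighbors, k):
    """Similarity-weighted average over the k most similar rated items.

    neighbors: list of (similarity s_ij, rating r_uj) pairs for items
    the user has already rated.
    """
    top_k = sorted(neighbors, key=lambda pair: abs(pair[0]), reverse=True)[:k]
    denom = sum(abs(s) for s, _ in top_k)
    if denom == 0:
        return None  # no usable neighbors
    return sum(s * r for s, r in top_k) / denom


neighbors = [
    (0.8, 4.5),  # beer 3
    (0.1, 1.0),  # beer 4 (similarity is a hypothetical value)
    (0.2, 3.0),  # beer 2
    (0.7, 4.0),  # beer 5
]

R_u1 = weighted_knn_prediction(neighbors, k=3)
print(f"R_u,1 = {R_u1:.2f}")
```

Dividing by the sum of absolute similarities keeps the prediction on the same scale as the ratings even if some similarities are negative.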

  • Ratings as Features of Users Vectors

    Rating table: rows are users (ids 7, 33, 34, 42, 45, 47, 48), columns are beers (ids 5 to 12).

    User vectors (one component per beer, ? = no rating):

    u7 = (5, 4, ?, 1, 5, 5, 4, ?)

    u34 = (1, 1, ?, 5, 1, 2, 2, ?)

    u45 = (4, 5, ?, 1, 5, 5, 3, 1)

    $s_{45,7} > s_{45,34}$
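The inequality between the two user similarities can be checked numerically. This sketch restricts the cosine to beers that both users have rated, which is one common convention and an assumption here; the slides do not state which variant was used.

```python
import math


def cosine_on_corated(u, v):
    """Cosine similarity using only positions where both users have a rating."""
    pairs = [(x, y) for x, y in zip(u, v) if x is not None and y is not None]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_u = math.sqrt(sum(x * x for x, _ in pairs))
    norm_v = math.sqrt(sum(y * y for _, y in pairs))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Components follow beers 5..12; None marks a missing rating.
u7 = [5, 4, None, 1, 5, 5, 4, None]
u34 = [1, 1, None, 5, 1, 2, 2, None]
u45 = [4, 5, None, 1, 5, 5, 3, 1]

s_45_7 = cosine_on_corated(u45, u7)
s_45_34 = cosine_on_corated(u45, u34)
print(f"s(45, 7)  = {s_45_7:.3f}")
print(f"s(45, 34) = {s_45_34:.3f}")
```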

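  • Model B: User-based Neighborhood

    Herlocker, Konstan, Borchers and Riedl (1999)

    $u \in \mathbb{R}^n$

    Find similar users:

    $r_{H,x} = 1$, $r_{M,x} = 4.5$, $r_{T,x} = 5$, $R_{S,x} = ?$

    Weighted prediction (the weight is the similarity $s_{uv}$):

    $R_{u,x} = \dfrac{\sum_{v \in S_k} s_{uv}\, r_{v,x}}{\sum_{v \in S_k} |s_{uv}|}$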

  • What is the ALS?

    Rating matrix $R = U B^T$

    The rows of U are the user vectors $(u_S), (u_H), (u_K), \ldots$ and the rows of B are the beer vectors $b_1, b_2, b_3, \ldots$

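  • More Detail: Normal Equations

    Hu, Koren and Volinsky, 2008

    Figure: rating table (user ids 7, 33, 34, 42, 45, 47, 48; beer ids 5 to 11) with the row $r_S$, the column $r_i$, and the entry $r_{S,i}$ highlighted.

    With B fixed, the vector of user S is

    $$u_S = \begin{pmatrix} u_{S,1} \\ u_{S,2} \\ \vdots \\ u_{S,f} \end{pmatrix} = \left(B^T B + \lambda I\right)^{-1} B^T \begin{pmatrix} r_{S,1} \\ r_{S,2} \\ \vdots \\ r_{S,n} \end{pmatrix} = \left(B^T B + \lambda I\right)^{-1} B^T r_S$$

    With U fixed, the vector of beer i is

    $$b_i = \begin{pmatrix} b_{i,1} \\ b_{i,2} \\ \vdots \\ b_{i,f} \end{pmatrix} = \left(U^T U + \lambda I\right)^{-1} U^T \begin{pmatrix} r_{1,i} \\ r_{2,i} \\ \vdots \\ r_{m,i} \end{pmatrix} = \left(U^T U + \lambda I\right)^{-1} U^T r_i$$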
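A single user update from the normal equations can be sketched with NumPy. The beer factors, the rating row, the regularization strength `lam`, and the function name are hypothetical values chosen for illustration. The sketch also assumes a dense rating row; with missing ratings one would restrict B and the rating row to the beers that user has rated.

```python
import numpy as np


def solve_user_factors(B, r_S, lam):
    """Solve (B^T B + lam * I) u_S = B^T r_S for one user's factor vector."""
    f = B.shape[1]
    A = B.T @ B + lam * np.eye(f)
    # A linear solve is cheaper and more stable than forming the inverse.
    return np.linalg.solve(A, B.T @ r_S)


rng = np.random.default_rng(0)
n_beers, f = 7, 2                       # hypothetical sizes
B = rng.normal(size=(n_beers, f))       # hypothetical beer factors, one row per beer
r_S = np.array([5, 4, 1, 5, 5, 4, 3], dtype=float)  # hypothetical rating row
lam = 0.1

u_S = solve_user_factors(B, r_S, lam)

# The solution should satisfy the normal equations up to rounding error.
residual = (B.T @ B + lam * np.eye(f)) @ u_S - B.T @ r_S
print(u_S, residual)
```

The beer update has the same form with the roles of U and B swapped.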

  • Implementing ALS in Spark

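  • Alternating least squares (ALS)

    Solving Matrix-Factorization LR

    $$\min_{u,b,\beta} \sum_{(u,i)\ \text{if}\ r_{u,i} \neq 0} \left(r_{u,i} - u_u^T b_i - \beta_{u,i}\right)^2 + \lambda \left( \sum_u |u_u|^2 + \sum_i |b_i|^2 \right)$$

    The second term is the regularization.

    ALS: at each step, fix one variable and solve the minimization: fix u, solve b; fix b, solve u; fix u, solve b; ...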
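The alternating scheme can be sketched in NumPy as below. This is a toy version under stated assumptions, not the Spark implementation from the talk: the bias term is omitted, 0 marks a missing rating, and the rating matrix, `f`, `lam`, and the iteration count are hypothetical.

```python
import numpy as np


def objective(R, mask, U, B, lam):
    """Squared error on observed ratings plus L2 regularization."""
    err = mask * (R - U @ B.T)
    return np.sum(err ** 2) + lam * (np.sum(U ** 2) + np.sum(B ** 2))


def als(R, f=2, lam=0.1, n_iters=10, seed=0):
    mask = (R != 0).astype(float)       # assumption: 0 means "not rated"
    m, n = R.shape
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(m, f))
    B = rng.normal(scale=0.1, size=(n, f))
    history = [objective(R, mask, U, B, lam)]
    for _ in range(n_iters):
        # Fix B: each user is a small ridge regression over the beers they rated.
        for u in range(m):
            rated = mask[u] > 0
            Bu = B[rated]
            U[u] = np.linalg.solve(Bu.T @ Bu + lam * np.eye(f), Bu.T @ R[u, rated])
        # Fix U: each beer is a ridge regression over the users who rated it.
        for i in range(n):
            rated = mask[:, i] > 0
            Ui = U[rated]
            B[i] = np.linalg.solve(Ui.T @ Ui + lam * np.eye(f), Ui.T @ R[rated, i])
        history.append(objective(R, mask, U, B, lam))
    return U, B, history


# Hypothetical 4 users x 6 beers rating matrix.
R = np.array([
    [5, 4, 0, 1, 5, 5],
    [1, 1, 0, 5, 1, 2],
    [4, 5, 0, 1, 5, 0],
    [0, 1, 4, 5, 0, 2],
], dtype=float)

U, B, history = als(R)
print([round(h, 3) for h in history])
```

Each half-step solves its subproblem exactly, so the recorded objective should never increase from one iteration to the next.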

  • Grid Search Using Cross-Validation

    $|u|^2 + |b|^2$

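  • CF Using Implicit Data

    Logistic regression + confidence weight

    Hu, Koren and Volinsky, 2008

    $$\min_{u,b,\beta} \sum_{(u,i)} c_{u,i} \left(p_{u,i} - u_u^T b_i - \beta_{u,i}\right)^2 + \lambda \left( \sum_u |u_u|^2 + \sum_i |b_i|^2 \right)$$

    Labeled terms: confidence ($c_{u,i}$), user-item interaction, bias ($\beta_{u,i}$), regularization.

    $p_{u,i} = 0/1$

    $c_{u,i} = 1 + \alpha\, r_{u,i}$

    $c_{u,i} = 1 + \alpha \log\left(1 + r_{u,i}/\epsilon\right)$

    $c_{u,i} = 1 + \alpha \log\left(1 + r_{u,i}/\epsilon\right) + \gamma\, r_{u,i}$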
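The binary preference and the first two confidence choices follow Hu, Koren and Volinsky (2008) and can be sketched directly. The values of `alpha` and `eps` below are hypothetical, and the purchase counts are made up for illustration.

```python
import math


def preference(r):
    """p_ui: 1 if the user interacted with the beer at all, else 0."""
    return 1 if r > 0 else 0


def linear_confidence(r, alpha):
    """c_ui = 1 + alpha * r_ui"""
    return 1 + alpha * r


def log_confidence(r, alpha, eps):
    """c_ui = 1 + alpha * log(1 + r_ui / eps)"""
    return 1 + alpha * math.log(1 + r / eps)


alpha, eps = 2.0, 1.0                  # hypothetical hyperparameters
purchase_counts = [0, 1, 3, 10]        # hypothetical purchase frequencies r_ui

rows = [
    (r, preference(r), linear_confidence(r, alpha), log_confidence(r, alpha, eps))
    for r in purchase_counts
]
for r, p, c_lin, c_log in rows:
    print(f"r={r:2d}  p={p}  c_linear={c_lin:.2f}  c_log={c_log:.2f}")
```

Both forms give confidence 1 to unobserved pairs; the logarithmic form grows more slowly, so a few very frequent purchases dominate the loss less.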

  • Implicit Data CF Performance

    Metric: percentile-ranking

    $$\mathrm{rank} = \frac{\sum_{u,i} r_{u,i}\, \mathrm{rank}_{u,i}}{\sum_{u,i} r_{u,i}}$$

    Random: rank = 50%

    Baseline: rank $\approx$ 29%

    CF (f=20): rank $\approx$ 16%
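The percentile-ranking metric can be sketched as below for a single user. The convention that the top recommendation has percentile 0 and the last has percentile 1 follows Hu, Koren and Volinsky (2008); the scores, the held-out purchase counts, and the helper names are hypothetical.

```python
def percentile_ranks(scores):
    """Map each item to its percentile position (0 = top, 1 = bottom)
    in the list sorted by descending predicted score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 0.0}
    return {item: pos / (n - 1) for pos, item in enumerate(ordered)}


def expected_percentile_rank(observations):
    """rank = sum(r_ui * rank_ui) / sum(r_ui) over (r_ui, rank_ui) pairs."""
    total = sum(r for r, _ in observations)
    if total == 0:
        raise ValueError("no held-out interactions to evaluate")
    return sum(r * rank for r, rank in observations) / total


# Hypothetical predicted scores for one user and held-out purchase counts.
scores = {"a": 0.9, "b": 0.7, "c": 0.5, "d": 0.3, "e": 0.1}
held_out = {"a": 3, "c": 1}

ranks = percentile_ranks(scores)
observations = [(r, ranks[item]) for item, r in held_out.items()]
rank_bar = expected_percentile_rank(observations)
print(f"expected percentile rank = {rank_bar:.1%}")
```

Lower is better: a model that places the held-out purchases near the top of each list scores close to 0%, while random ordering gives about 50%.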