Machine Learning for Recommender Systems in the Job Market

hamburg.ai, May 2017, Fabian Abel

Upload: fabian-abel

Post on 22-Jan-2018


TRANSCRIPT

Page 1: Machine Learning for Recommender Systems in the Job Market

Machine Learning for Recommender Systems in the Job Market

hamburg.ai, May 2017

Fabian Abel

Page 2: Machine Learning for Recommender Systems in the Job Market

Challenge

Given a user, the goal is to recommend job postings…

1. that the user may be interested in and

2. for which the user is an appropriate candidate.

[Diagram labels: user; job postings ("Scala Dev (m/w)", "Scala Engineer", "Scala Dev, Hamburg"); job recommender; companies; recruiter. Figures shown: 19M, 750k-1M.]

Page 3: Machine Learning for Recommender Systems in the Job Market

Goals / Triangle of contradiction

User (example: Scala Dev, Hamburg):
• Relevant recos
• No spam

Companies:
• Relevant candidates
• High reach

Third corner (label not preserved in the transcript):
• Happy customers
• High revenue (e.g. many clicks on paid content)

Page 4: Machine Learning for Recommender Systems in the Job Market

Job recommendations

Page 5: Machine Learning for Recommender Systems in the Job Market

[Screenshots: mobile, email]

Page 6: Machine Learning for Recommender Systems in the Job Market

Job recommendations

Page 7: Machine Learning for Recommender Systems in the Job Market

Job recommendations

Page 8: Machine Learning for Recommender Systems in the Job Market


Page 9: Machine Learning for Recommender Systems in the Job Market

Job Recommender REST Service

    GET /rest/recommendations/jobs/user/42

    // response:
    {
      "total": 20,
      "collection": [
        {"item_id": 7263, "score": 0.87, "reason": [..], ..},
        {"item_id": 6526, "score": 0.81, "reason": [..], ..},
        ...
      ]
    }
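A client consuming this response might model it roughly as follows. This is only a sketch: the case classes and `topItems` are hypothetical names, JSON parsing is left out, and the fields elided on the slide ("..") are not modelled.

```scala
// Hypothetical client-side model of the response shown on the slide.
// Only the fields visible there are modelled; elided fields are omitted.
final case class Recommendation(itemId: Long, score: Double)
final case class RecommendationResponse(total: Int, collection: Seq[Recommendation])

// Return the ids of the k highest-scoring items, best first.
def topItems(response: RecommendationResponse, k: Int): Seq[Long] =
  response.collection.sortBy(r => -r.score).take(k).map(_.itemId)

// The two entries visible on the slide, deliberately given out of order.
val exampleResponse = RecommendationResponse(
  total = 20,
  collection = Seq(Recommendation(6526L, 0.81), Recommendation(7263L, 0.87))
)
```

Sorting on the client is an assumption for illustration; the slide does not say whether the service already returns the collection ordered by score.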

Page 10: Machine Learning for Recommender Systems in the Job Market

Infrastructure for recommenders

[Architecture diagram. Components: XING Sources / XING services; MySQL; NoSQL; live updates; Batch processing; batch updates; Search indices; Recommender REST service; XING Products; Deployment Infrastructure.]

Page 11: Machine Learning for Recommender Systems in the Job Market


Page 12: Machine Learning for Recommender Systems in the Job Market

Key properties of a job posting

• Title
• Company
• Employment type and career level
• Full-text description

Page 13: Machine Learning for Recommender Systems in the Job Market

Key sources for understanding user demands

• Profile: e.g. Fabian Abel, Data Scientist; Haves / Interests: web science, big data, hadoop (skills & co.)
• Social Network: explicit and implicit connections
• Interactions: clicks, bookmarks, ratings, shown (labels in the figure: data, web, social media, big data, kununu)
• Interactions of similar users (labels in the figure: hadoop, scala)

Page 14: Machine Learning for Recommender Systems in the Job Market

Relevance Estimation

Sources (same figure as on the previous slide): profile, social network, interactions, interactions of similar users.

Feature groups:
• Content-based features
• Collaborative features
• Social features
• Usage behavior features

These feed the core RecSys engines (regression model).

Logistic Regression:

$$P(\text{relevant} \mid x) = \frac{1}{1 + e^{-\left(b_0 + \sum_{i=1}^{n} b_i x_i\right)}}$$

where x is the feature vector and b_i is the impact of feature x_i.
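The logistic regression formula on this slide can be written as a small function. This is a minimal sketch: the function name and the example weights are made up for illustration and are not the production model.

```scala
// P(relevant | x) = 1 / (1 + exp(-(b0 + sum_i b_i * x_i)))
// b0: intercept, b: one weight per feature, x: feature vector.
def relevanceProbability(b0: Double, b: Seq[Double], x: Seq[Double]): Double = {
  require(b.length == x.length, "one weight per feature")
  val z = b0 + b.zip(x).map { case (bi, xi) => bi * xi }.sum
  1.0 / (1.0 + math.exp(-z))
}

// Hypothetical weights and features, for illustration only.
val exampleScore = relevanceProbability(b0 = -1.0, b = Seq(2.0, 0.5), x = Seq(1.0, 0.0))
```

A positive weight b_i pushes the probability up as feature x_i grows, a negative weight pushes it down, which is what "impact of feature x_i" refers to.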

Page 15: Machine Learning for Recommender Systems in the Job Market

Relevance Estimation + Additional Filters

Feature groups: content-based, collaborative, social and usage behavior features.

Core RecSys engines (regression model), followed by Filtering & Diversification:
• Location-based filtering
• Frequently Shown filtering
• Monetary-based diversification
• Career Level filtering
• ...

Example output scores: 0.92, 0.8, 0.76

4 core sub-recommender engines and 19 filters that together analyze and exploit around 200 features (relevance criteria)
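One way to picture the filtering stage is as a chain of functions applied to the scored candidates. The sketch below is an assumption about the shape of such a chain, not the actual XING implementation: all types, field names and the three example filter rules are hypothetical.

```scala
// Hypothetical domain types; the real system uses far richer data.
final case class Job(id: Long, city: String, careerLevel: Int)
final case class User(city: String, careerLevel: Int, frequentlyShown: Set[Long])
final case class Scored(job: Job, score: Double)

// A filter takes the user and the remaining candidates and returns the survivors.
type RecoFilter = (User, Seq[Scored]) => Seq[Scored]

// Illustrative rules only: same city, career level within one step, not shown too often.
val locationFilter: RecoFilter =
  (u, recos) => recos.filter(_.job.city == u.city)
val careerLevelFilter: RecoFilter =
  (u, recos) => recos.filter(r => math.abs(r.job.careerLevel - u.careerLevel) <= 1)
val frequentlyShownFilter: RecoFilter =
  (u, recos) => recos.filterNot(r => u.frequentlyShown.contains(r.job.id))

// Apply the filters one after another, then return the survivors best first.
def applyFilters(user: User, recos: Seq[Scored], filters: Seq[RecoFilter]): Seq[Scored] =
  filters.foldLeft(recos)((remaining, f) => f(user, remaining)).sortBy(r => -r.score)
```

Keeping each filter as an independent function makes it cheap to add or remove one, which matters when there are 19 of them.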

Page 16: Machine Learning for Recommender Systems in the Job Market

Collaborative filtering
Theory: User-based and Item-based CF

User-Item-Rating Matrix

          Java D.  SAP Co  Data En  Data Sc  BI Dev
Anna      3        -       4        -        2
Julia     2        -       5        4        1
Tim       4        3       -        5        1
John      -        4       5        4        -

User-based CF:
Compare users based on their ratings (e.g. cosine sim.)
Use the n most similar users to predict a rating on an item

Item-based CF:
Compare items based on their ratings (e.g. cosine sim.)
Use the n most similar items to predict a rating from a user

(simple weighted average)
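User-based CF on the rating matrix from this slide can be sketched as below. Two simplifying assumptions are made: missing ratings count as 0 in the cosine similarity, and the prediction is a similarity-weighted average over the n most similar users who rated the item. The function names are illustrative.

```scala
// Ratings from the slide; a missing rating ("-") is simply absent from the map.
val ratings: Map[String, Map[String, Double]] = Map(
  "Anna"  -> Map("Java D." -> 3.0, "Data En" -> 4.0, "BI Dev" -> 2.0),
  "Julia" -> Map("Java D." -> 2.0, "Data En" -> 5.0, "Data Sc" -> 4.0, "BI Dev" -> 1.0),
  "Tim"   -> Map("Java D." -> 4.0, "SAP Co" -> 3.0, "Data Sc" -> 5.0, "BI Dev" -> 1.0),
  "John"  -> Map("SAP Co" -> 4.0, "Data En" -> 5.0, "Data Sc" -> 4.0)
)

// Cosine similarity between two users; missing ratings are treated as 0 (assumption).
def cosineSim(a: Map[String, Double], b: Map[String, Double]): Double = {
  val dot = a.keySet.intersect(b.keySet).toSeq.map(k => a(k) * b(k)).sum
  val norm = math.sqrt(a.values.map(v => v * v).sum) * math.sqrt(b.values.map(v => v * v).sum)
  if (norm == 0.0) 0.0 else dot / norm
}

// Predict a rating from the n most similar users who rated the item.
def predict(user: String, item: String, n: Int): Option[Double] = {
  val neighbours = ratings.toSeq
    .filter { case (other, rs) => other != user && rs.contains(item) }
    .map { case (_, rs) => (cosineSim(ratings(user), rs), rs(item)) }
    .sortBy { case (sim, _) => -sim }
    .take(n)
  val weightSum = neighbours.map(_._1).sum
  if (weightSum == 0.0) None
  else Some(neighbours.map { case (sim, r) => sim * r }.sum / weightSum)
}
```

Item-based CF follows the same pattern with the matrix transposed: compare item columns instead of user rows.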

Page 17: Machine Learning for Recommender Systems in the Job Market

Collaborative filtering
Reality: Ultra sparse User-Item Matrix and primarily implicit feedback

          Java D.  SAP Co  Data En  Data Sc  BI Dev
Anna      -        -       1        -        -
Julia     -        -       -        -        -
Tim       -        -       -        -        -
John      1        -       -        -        -

High level of sparsity: classical collaborative filtering (or matrix factorization) does not work

Page 18: Machine Learning for Recommender Systems in the Job Market

Collaborative filtering
Reality: Ultra sparse User-Item Matrix and primarily implicit feedback

Rows as they appear in the transcript (on the slide, user rows are overlaid by cluster rows):

          Java D.  SAP Co  Data En  Data Sc  BI Dev
Anna      -        -       1        -        -
Data Sci  -        -       32       18       -
Tim       524      3       1        -        -
John      -        -       2        4        -

Cluster labels on the slide: Data Scientists, Skilled in Java, BI Dev

Pseudo CF:
Cluster users based on...
• jobrole
• skills
• field of study
Recommend items that similar users (= clusters) interacted with

New item problem remains...
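The pseudo CF idea can be sketched as follows: aggregate interactions per cluster rather than per user, then recommend what the user's cluster interacted with most. How clusters are built is not specified on the slide, so `clusterOf` is a hypothetical lookup (for example derived from job role, skills or field of study), and all other names are illustrative too.

```scala
final case class Interaction(userId: String, jobId: Long)

// Count interactions per (cluster, job). clusterOf is a hypothetical user-to-cluster lookup.
def clusterCounts(
    interactions: Seq[Interaction],
    clusterOf: String => String
): Map[String, Map[Long, Int]] =
  interactions
    .groupBy(i => clusterOf(i.userId))
    .map { case (cluster, is) =>
      cluster -> is.groupBy(_.jobId).map { case (job, xs) => job -> xs.size }
    }

// Recommend the k jobs the user's cluster interacted with most (ties broken by job id).
def recommend(
    userId: String,
    counts: Map[String, Map[Long, Int]],
    clusterOf: String => String,
    k: Int
): Seq[Long] =
  counts.getOrElse(clusterOf(userId), Map.empty[Long, Int]).toSeq
    .sortBy { case (job, n) => (-n, job) }
    .take(k)
    .map(_._1)
```

A user without any interactions of their own still gets recommendations through the cluster, which addresses sparsity on the user side. A brand-new job posting has no interactions in any cluster, which is the "new item problem" noted on the slide.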

Page 19: Machine Learning for Recommender Systems in the Job Market

Content-based filtering
Example: semantic search

Raw profile:
Fabian Abel, Data Mining Expert
Haves / Interests: ML, j2ee, Hadoop
Education: Computer Sci.

Ontology-based profile:
• [jobrole] Data Scientist (Synonyms: Data Mining Expert, Data Mining Specialist, …)
• [skills] JEE (Synonyms: J2EE, Java Enterprise, …)
• [skills] Hadoop (Synonyms: Apache Hadoop, …)
• [skills] Machine Learning (Synonyms: Maschinelles Lernen, …)
• [field of studies] Computer Science (Synonyms: Informatik, Comp. Sci., CS, …)

Ontology IDs shown in the figure (assignment to entities not recoverable): 6940, 263, 370, 162, 473

The ontology-based profile is turned into a query against a TFxIDF index.
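Mapping raw profile terms to ontology concepts could look roughly like the sketch below. The synonym table contains only the examples visible on the slide; treating "ML" as a synonym of Machine Learning is an assumption based on the raw profile, and `normalize` is a hypothetical helper name.

```scala
// Toy synonym table built from the slide's examples; a real ontology is far larger.
// "ML" is an assumed synonym (it appears in the raw profile, not in the synonym list).
val synonyms: Map[String, Seq[String]] = Map(
  "Data Scientist"   -> Seq("Data Mining Expert", "Data Mining Specialist"),
  "JEE"              -> Seq("J2EE", "Java Enterprise"),
  "Computer Science" -> Seq("Informatik", "Comp. Sci.", "CS"),
  "Hadoop"           -> Seq("Apache Hadoop"),
  "Machine Learning" -> Seq("Maschinelles Lernen", "ML")
)

// Lookup from lower-cased surface form to canonical concept label.
val toConcept: Map[String, String] =
  synonyms.flatMap { case (concept, alts) =>
    (concept +: alts).map(s => s.toLowerCase -> concept)
  }

// Map raw terms to concepts, dropping unknown terms and duplicates.
def normalize(rawTerms: Seq[String]): Seq[String] =
  rawTerms.flatMap(t => toConcept.get(t.trim.toLowerCase)).distinct
```

Querying with canonical concepts instead of raw strings lets a profile that says "Data Mining Expert" match postings that ask for a "Data Scientist".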

Page 20: Machine Learning for Recommender Systems in the Job Market

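Content-based filtering
Example: more-like-this component

Anna: bookmarked, rated and applied-to job postings B = (1, 2, 3)

q = trans(1 2 3)

Recommending similar items: running q against the TFxIDF index yields the candidates R = (7, 8, 9)

Topic model vector representations: B' (for B) and R' (for R)

Re-rank by similarity of topic model vectors:

    // rPrime, bPrime: topic model vectors of the candidates R and of Anna's items B
    def cosineSim(a: Seq[Double], b: Seq[Double]): Double = {
      val dot = a.zip(b).map { case (x, y) => x * y }.sum
      val norm = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
      if (norm == 0.0) 0.0 else dot / norm
    }
    // score each candidate by its average similarity to Anna's items, best first
    val reRanked = rPrime.map { r =>
      val x = bPrime.map(b => cosineSim(r, b))
      r -> x.sum / x.size
    }.sortBy(-_._2)

Re-ranked order shown on the slide: 8, 7, 9

Re-ranking variants:
- LSI
- Word2Vec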
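The candidate list on this slide comes from a TFxIDF query. As a rough illustration of that weighting, here is one common TF-IDF variant with add-one smoothing. The exact variant used by the search index is not stated in the talk, so this particular formula is an assumption.

```scala
// TF-IDF weights for one tokenised document against a corpus of tokenised documents.
// Variant used here (assumption): weight = tf * ln((N + 1) / (df + 1)).
def tfIdf(doc: Seq[String], corpus: Seq[Seq[String]]): Map[String, Double] = {
  val n = corpus.size.toDouble
  doc.groupBy(identity).map { case (term, occurrences) =>
    // document frequency: number of corpus documents containing the term
    val df = corpus.count(_.contains(term)).toDouble
    term -> occurrences.size * math.log((n + 1.0) / (df + 1.0))
  }
}

// Tiny hypothetical corpus of job titles.
val titleCorpus = Seq(
  Seq("scala", "developer"),
  Seq("java", "developer"),
  Seq("data", "scientist")
)
val scalaDevWeights = tfIdf(Seq("scala", "developer"), titleCorpus)
```

Rare terms such as "scala" get a higher weight than common ones such as "developer", so matches on distinctive terms dominate the retrieval score.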

Page 21: Machine Learning for Recommender Systems in the Job Market

Content-based filtering
Example: more-like-this component

[Chart: CTR for TFxIDF, LSI-based re-ranking and Word2Vec-based re-ranking. Uplifts shown, in the order they appear on the slide: +3.2%, +3.1%]

Page 22: Machine Learning for Recommender Systems in the Job Market

Challenges

Issues that we have to fight with…

Page 23: Machine Learning for Recommender Systems in the Job Market

Profiles vs. People's wishes for their future

Profile describes a user's past/current position(s), not future wishes

Page 24: Machine Learning for Recommender Systems in the Job Market

Recruiter-John

What John writes… → And what he means…
• International Sales Manager → Call Center Agent (10 EUR per hour)
• Sales Manager → Sales Manager for B2B customers (80K EUR per year)
• Data Scientist skilled in Hadoop, Scala, Elasticsearch, … with PhD in … → Data Analyst (skilled in SAS or Excel)

Page 25: Machine Learning for Recommender Systems in the Job Market

Paul, the Candidate

What Paul says he is… → And what he means…
• CEO → Network Engineer (currently unemployed)
• Data Scientist with 100+ skills → BI Engineer (skilled in old-school ETL)
• Sales Manager → Shopman (in a kiosk)

Page 26: Machine Learning for Recommender Systems in the Job Market

Understanding the meaning of things that recruiters write in job postings and users write in their profiles is not trivial…

Page 27: Machine Learning for Recommender Systems in the Job Market

People freak out if we recommend something wrong!

Try to eliminate freakommendations (outliers)

Page 28: Machine Learning for Recommender Systems in the Job Market

Outlier Filtering

Pipeline: Core RecSys engines, then Filtering & Diversification (Location Filter, Career level Filter, Outlier filter, ...), yielding scored recos (0.92, 0.8, 0.76, ...)

1. Estimate how a user would rate the item… (training: 750k explicit ratings)

    predictRating(user, item)
      = predict(toFeatureVec(user, item))
      = r // rating between 1 and 5

2. Filter:

    if (r > threshold) keep
    else drop
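The two steps of the outlier filter can be sketched as a single function. The feature extraction and the rating model are passed in as parameters because their real implementations are not shown in the talk; the toy versions below are stand-ins for illustration only.

```scala
// Keep only the jobs whose predicted rating (scale 1 to 5) exceeds the threshold.
// toFeatureVec and predict are supplied by the caller; the real ones are not public.
def filterOutliers[U, J](user: U, jobs: Seq[J], threshold: Double)(
    toFeatureVec: (U, J) => Array[Double],
    predict: Array[Double] => Double
): Seq[J] =
  jobs.filter(job => predict(toFeatureVec(user, job)) > threshold)

// Toy stand-ins: a single feature (does the job title contain the user's role?)
// and a "model" that maps it to a rating of 5 (match) or 1 (no match).
val toyFeatureVec: (String, String) => Array[Double] =
  (role, title) => Array(if (title.contains(role)) 1.0 else 0.0)
val toyPredict: Array[Double] => Double =
  features => 1.0 + 4.0 * features(0)

val keptJobs =
  filterOutliers("Scala", Seq("Scala Dev", "Call Center Agent"), 2.5)(toyFeatureVec, toyPredict)
```

Separating feature extraction from prediction mirrors the `predict(toFeatureVec(...))` composition on the slide and lets the model be swapped without touching the filter.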

Page 29: Machine Learning for Recommender Systems in the Job Market

Outlier Filtering
The "filter onion": trade-off between killing bad recos and keeping good ones

[Chart: percentage of filtered user-job posting pairs by rating threshold; series: good recos, bad recos; x-axis: threshold]

Example: with a threshold of 2.5 we kill 86% of the bad and 18% of the good recos
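The trade-off on this slide can be measured on labelled data by computing, for a given threshold, the share of good and of bad recos that would be dropped. This is a sketch with made-up toy data; `Labelled` and `dropRates` are hypothetical names, and the numbers are not the ones behind the chart.

```scala
// A reco with its predicted rating and a ground-truth label (good or bad).
final case class Labelled(predictedRating: Double, isGood: Boolean)

// Returns (share of good recos dropped, share of bad recos dropped).
// A reco is dropped when its predicted rating is not above the threshold.
def dropRates(data: Seq[Labelled], threshold: Double): (Double, Double) = {
  def rate(xs: Seq[Labelled]): Double =
    if (xs.isEmpty) 0.0
    else xs.count(_.predictedRating <= threshold).toDouble / xs.size
  val (good, bad) = data.partition(_.isGood)
  (rate(good), rate(bad))
}

// Toy data for illustration only.
val toyData = Seq(
  Labelled(4.5, isGood = true), Labelled(3.9, isGood = true),
  Labelled(3.0, isGood = true), Labelled(2.0, isGood = true),
  Labelled(1.0, isGood = false), Labelled(1.5, isGood = false),
  Labelled(2.2, isGood = false), Labelled(3.1, isGood = false)
)
```

Evaluating `dropRates` over a range of thresholds yields the two curves of the chart; raising the threshold removes more bad recos at the price of also losing more good ones.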

Page 30: Machine Learning for Recommender Systems in the Job Market

Outlier Filtering
Example features (137 features in total)

• xgboost-based model
• Example features (137 features in total):
• Matching & weighting: jobrole, skills, discipline, industry, ...
• Distance: home location / job seeker location
• Transitions: job role → job role, field of study → job role
• ...
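Two of the feature families listed here, matching and distance, could be computed roughly as below. This is a sketch of the general idea only: the actual 137 features and their definitions are not given in the talk, so the Jaccard overlap, the haversine distance and all names are assumptions.

```scala
final case class GeoPoint(lat: Double, lon: Double)

// Great-circle distance in km (haversine formula, mean Earth radius 6371 km).
def distanceKm(a: GeoPoint, b: GeoPoint): Double = {
  val dLat = math.toRadians(b.lat - a.lat)
  val dLon = math.toRadians(b.lon - a.lon)
  val h = math.pow(math.sin(dLat / 2), 2) +
    math.cos(math.toRadians(a.lat)) * math.cos(math.toRadians(b.lat)) *
      math.pow(math.sin(dLon / 2), 2)
  2 * 6371.0 * math.asin(math.sqrt(h))
}

// Matching feature: Jaccard overlap between user skills and skills in the posting.
def skillOverlap(user: Set[String], job: Set[String]): Double =
  if (user.isEmpty && job.isEmpty) 0.0
  else user.intersect(job).size.toDouble / user.union(job).size

// Hypothetical two-element feature vector: (skill overlap, distance in km).
def toFeatureVec(
    userSkills: Set[String], jobSkills: Set[String],
    home: GeoPoint, jobLocation: GeoPoint
): Array[Double] =
  Array(skillOverlap(userSkills, jobSkills), distanceKm(home, jobLocation))
```

Vectors like this are what a gradient-boosted tree model such as xgboost would be trained on, with the explicit ratings as targets.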

Page 31: Machine Learning for Recommender Systems in the Job Market

Outlier Filtering
Some A/B test results: user success

Filtering vs. no filtering:
• users with recos: -10.9%
• users who clicked on recos: +7.4%

Fewer people get recommendations, but more users click!

Stricter filtering pays off!

Page 32: Machine Learning for Recommender Systems in the Job Market

ACM RecSys Challenge http://recsyschallenge.com

Task: push recommendations (new items, paid vs. non-paid, premium vs. basic users)

Started beginning of March (ca. 240 teams so far), ends in June

Offline & online evaluation

Still possible to sign up for the offline evaluation…

Page 33: Machine Learning for Recommender Systems in the Job Market

Thank you http://2017.recsyschallenge.com

@fabianabel