Learning to Rank
Stefan Kühn
Join me on XING
data2day Heidelberg - September 28th, 2017
Stefan Kühn (XING) Ranking 28.09.2017 1 / 30
Contents
1 Rankings and Humans
2 Ranking and Machine Learning
3 Formalizing Ranking Problems
4 Rankings and Recommender Systems
1 Rankings and Humans
Rankings in Everyday Life
- TODO Lists
- Prioritized Backlogs
- Top X songs/movies/...
- You get the idea...
Rankings in History
It all started with
Rankings Nowadays
German States by Employee Happiness (according to Kununu)
Rankings, Heuristics, Decisions
- Rankings are about comparisons
- Rankings are about decision-making
- Some heuristics are about both
Recognition Heuristic
If one of two objects is recognized and the other is not, then infer that the recognized object has the higher value with respect to the criterion.
Proposed by Gigerenzer and Goldstein, built upon the great works of Kahneman and Tversky
2 Ranking and Machine Learning
Learning
Is Ranking a Machine Learning Problem?
Machine Learning Concepts
Supervised - Learning from Labels
Figure out how to generate correct labels using the given data
- Classification
- Regression
Unsupervised - Learning from Data
Identify hidden/inherent structure using the given data
- Clustering
- Dimensionality Reduction / Manifold Learning
- Outlier Detection
Supervised versus Unsupervised
Learning to Rank
Figure out how to generate a good ranking using the given data
What about Learning to Rank = Machine-Learned Ranking or MLR?
1 Supervised because ranks are like labels?
2 Unsupervised because ranks are typically based on implicit feedback, i.e. latent/hidden/inherent structure?
3 Mixed/intermediate/something else?
4 Ill-posed question?
Could you please rank these options according to whatever you think is appropriate?
And by the way, how did you do it?
Example: XING Stream
How to order News?
- By time?
- By content/topic?
- By popularity?
- By clicking probability?
Every choice changes the problem to solve, while the form of the result is always the same: a ranked list of items. Every choice represents a different distance measure / objective function to minimize.
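A minimal sketch of this point, assuming a hypothetical NewsItem record whose fields and values are invented for illustration (they are not taken from the XING Stream): the same three items are ranked under four objectives, and only the sort key changes.

```python
from dataclasses import dataclass


@dataclass
class NewsItem:
    # All fields are illustrative assumptions, not real XING data.
    title: str
    age_hours: float    # time since publication
    topic_match: float  # similarity to the reader's interests, in [0, 1]
    likes: int          # popularity signal
    click_prob: float   # estimated probability of a click


ITEMS = [
    NewsItem("A", age_hours=1.0, topic_match=0.2, likes=10, click_prob=0.05),
    NewsItem("B", age_hours=5.0, topic_match=0.9, likes=40, click_prob=0.10),
    NewsItem("C", age_hours=9.0, topic_match=0.5, likes=900, click_prob=0.30),
]

# One scoring function per objective; higher score means better rank.
OBJECTIVES = {
    "time": lambda item: -item.age_hours,  # newest first
    "topic": lambda item: item.topic_match,
    "popularity": lambda item: item.likes,
    "click_probability": lambda item: item.click_prob,
}


def rank(items, objective):
    """Return the item titles ordered by the chosen objective, best first."""
    key = OBJECTIVES[objective]
    return [item.title for item in sorted(items, key=key, reverse=True)]


RANKINGS = {name: rank(ITEMS, name) for name in OBJECTIVES}
for name, ranking in RANKINGS.items():
    print(name, ranking)
```

Each objective returns a ranked list of the same items, but the lists differ, which is why the choice of objective defines the learning problem.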
3 Formalizing Ranking Problems
Ranking - Problem Formulation
Items x ∈ X
Ordered Labels or Ranks 1 > 2 > ... > k > ...
Ranking rule f that does the following:
- Input: Unordered subset {x, y, z, ...} ⊆ X
- Output: Ordered list, e.g. y > z > x > ...
Example: Text search
- Items: Set of Documents
- Ranking rule f: Similarity measure for documents and search terms
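The text search example can be sketched as follows. The similarity measure used here (share of query terms found in the document) is a deliberately crude assumption chosen for brevity; real systems use measures such as tf-idf or BM25. The function names are hypothetical.

```python
def similarity(document: str, query: str) -> float:
    """Toy similarity: fraction of query terms that occur in the document."""
    doc_terms = set(document.lower().split())
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    return len(doc_terms & query_terms) / len(query_terms)


def rank_documents(documents, query):
    """Ranking rule f: unordered documents in, ordered list out (best first)."""
    return sorted(documents, key=lambda d: similarity(d, query), reverse=True)


DOCS = ["learning to rank with pairs", "cooking pasta at home", "rank aggregation"]
RESULT = rank_documents(DOCS, "learning to rank")
print(RESULT)
```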
Ranking and Level of Measurement
Supervised Learning Problems
- Classification - Nominal Scale - Class Labels
- Ranking - Ordinal Scale - Ranks
- Regression - Interval Scale - Real Values
Ranking is the task of predicting labels on an ordinal scale.
Informally: Learn ordering from labeled training data - typically ordered lists of items - and try to predict ordering for new sets of items.
What is special about this?
Ordering is context-dependent. One additional item (or one item less) can change all other ranks. This is clearly different from regression and classification.
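A small illustration of context dependence, with made-up scores: the scores of x, y and z stay fixed, yet adding a single item w shifts every one of their ranks. A regression target or class label for x would not change in the same situation.

```python
def ranks(scores):
    """Map each item to its rank (1 = best) given a dict of item -> score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {item: position for position, item in enumerate(ordered, start=1)}


scores = {"x": 0.2, "y": 0.9, "z": 0.5}  # illustrative values
before = ranks(scores)
after = ranks({**scores, "w": 1.0})      # one additional item

print(before)
print(after)
```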
Ranking in Information Retrieval
CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=518546
Ranking - Pointwise
Approach Characteristics
- Input: Single items
- Evaluation: Scoring function evaluated for each point/item
- Optimization: Loss function derived from individual scores
Reduces Ranking Problem to either
- Regression
- Classification
- Ordinal Regression
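A sketch of the regression variant under strong simplifying assumptions: each item has a single numeric feature, relevance labels are graded numbers, and the scoring function is a straight line fitted by ordinary least squares. The data and names are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    return a, mean_y - a * mean_x


# Training data: one feature value and one relevance label per item.
# The loss is a sum over individual items; list positions play no role.
train_features = [0.1, 0.4, 0.5, 0.9]
train_labels = [0.0, 1.0, 1.0, 2.0]
a, b = fit_line(train_features, train_labels)


def score(feature):
    """Pointwise scoring function: evaluated for each item independently."""
    return a * feature + b


def rank_items(items):
    """items: dict of name -> feature. Returns names ordered by score, best first."""
    return sorted(items, key=lambda name: score(items[name]), reverse=True)


new_items = {"p": 0.3, "q": 0.8, "r": 0.05}
ranking = rank_items(new_items)
print(ranking)
```

The ranking is obtained only at the end, by sorting the individual scores.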
Ranking - Pointwise
Image taken from Tie-Yan Liu @ WWW 2009 Tutorial on Learning to Rank
http://wwwconference.org/www2009/pdf/T7A-LEARNING TO RANK TUTORIAL.pdf
Ranking - Pointwise
Problems with the Pointwise Approach
- Length of item lists can differ significantly
  Example: There are more websites related to the search term Online (approx. 10 billion) than to Offline (approx. 666 million)
- Position of items on the list is not taken into account
  Example: Incorrect ordering of the top 10 results has a far bigger impact on users than errors/inversions below position 123456789, but a pointwise loss weights both equally
Consequence
Longer lists will dominate the optimization, while actually the shorter lists are more important for humans/customers.
Advantages
If all individual scores are known, all possible rankings are determined.
Ranking - Pairwise
Approach Characteristics
- Input: Pairs of Items
- Evaluation: Preference function evaluated for each pair - binary classification
- Optimization: Pairwise Classification Loss derived from all pairings, weighted majority voting
Reduces Ranking Problem to
- Binary (or pairwise) Classification
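A sketch of the pairwise reduction, not one of the rankers named in these slides: a plain perceptron is trained on feature differences of item pairs, and the final order is obtained by simple (unweighted) majority voting over all pairwise comparisons. The data, the perceptron and the unweighted vote are all assumptions made to keep the example short.

```python
from itertools import combinations


def make_pairs(items):
    """items: list of (features, relevance). Returns (difference, label) pairs,
    where label is +1 if the first item is preferred and -1 otherwise."""
    pairs = []
    for (fa, ra), (fb, rb) in combinations(items, 2):
        if ra == rb:
            continue  # no preference, so no training signal
        diff = [x - y for x, y in zip(fa, fb)]
        pairs.append((diff, 1 if ra > rb else -1))
    return pairs


def train_perceptron(pairs, n_features, epochs=50):
    """Binary classification on difference vectors (classic perceptron updates)."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for diff, label in pairs:
            margin = label * sum(wi * di for wi, di in zip(w, diff))
            if margin <= 0:  # misclassified pair: move w towards it
                w = [wi + label * di for wi, di in zip(w, diff)]
    return w


def prefers(w, fa, fb):
    """Preference function: True if the item with features fa should rank above fb."""
    return sum(wi * (x - y) for wi, x, y in zip(w, fa, fb)) > 0


def rank_by_votes(w, items):
    """Order items by the number of pairwise comparisons they win."""
    wins = {name: 0 for name in items}
    for a, b in combinations(items, 2):
        winner = a if prefers(w, items[a], items[b]) else b
        wins[winner] += 1
    return sorted(items, key=wins.get, reverse=True)


TRAIN = [([1.0, 0.2], 2), ([0.6, 0.9], 1), ([0.1, 0.5], 0)]  # illustrative data
PAIRS = make_pairs(TRAIN)
W = train_perceptron(PAIRS, n_features=2)
RANKING = rank_by_votes(W, {"u": [0.8, 0.1], "v": [0.3, 0.3], "t": [0.5, 0.2]})
print(W, RANKING)
```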
Ranking - Pairwise
Image taken from Tie-Yan Liu @ WWW 2009 Tutorial on Learning to Rank
http://wwwconference.org/www2009/pdf/T7A-LEARNING TO RANK TUTORIAL.pdf
Ranking - Pairwise
Problems with the Pairwise Approach
- Length of item lists can differ significantly
- Number of pairs depends quadratically on the length of the list
- Even bigger imbalance w.r.t. list length
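The quadratic growth is easy to check: a list of n items yields n(n-1)/2 unordered pairs. The list lengths below are arbitrary examples.

```python
def n_pairs(n_items: int) -> int:
    """Number of unordered item pairs in a list of n_items."""
    return n_items * (n_items - 1) // 2


for n in (10, 100, 1000):
    print(n, n_pairs(n))
```

A list that is 100 times longer (1000 versus 10 items) contributes 11100 times as many pairs, so the imbalance is larger than in the pointwise case, where it would contribute only 100 times as many terms.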
Advantages
Comparing pairs of elements is a much more natural approach to Ranking than Regression or Classification.
Ranking - Listwise
Approach Characteristics
- Input: Set of Items
- Evaluation: Some Evaluation Metric
- Optimization:
  - Either: Directly optimize the Evaluation Metric
  - Or: Loss function defined for permutations of the given input
Reduces Ranking Problem to either
- Direct Optimization of Evaluation Metric
- Listwise Loss Optimization (Distance between lists is non-trivial)
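One possible listwise loss, sketched here in the style of the ListNet top-one probability loss (ListNet is not mentioned in these slides, so this is a swapped-in example): both the relevance labels and the predicted scores of a whole list are turned into probability distributions via softmax, and the loss is the cross-entropy between the two.

```python
import math


def softmax(values):
    """Turn a list of scores into a probability distribution over its items."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(predicted_scores, relevance_labels):
    """Cross-entropy between the top-one distributions of labels and predictions.
    The loss is computed for the list as a whole, not per item or per pair."""
    target = softmax(relevance_labels)
    predicted = softmax(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


labels = [2.0, 1.0, 0.0]                          # illustrative relevance grades
good = listwise_loss([2.0, 1.0, 0.0], labels)     # scores agree with the labels
bad = listwise_loss([0.0, 1.0, 2.0], labels)      # scores reverse the order
print(good, bad)
```

Unlike the evaluation metrics themselves, this surrogate is smooth in the predicted scores, which is the usual motivation for the second option above.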
Ranking - Listwise
Image taken from Tie-Yan Liu @ WWW 2009 Tutorial on Learning to Rank
http://wwwconference.org/www2009/pdf/T7A-LEARNING TO RANK TUTORIAL.pdf
Ranking - Listwise
Problems with the Listwise Approach
- Huge complexity issue
- Direct Optimization: Non-smooth functions
- Often only incomplete knowledge about ground truth for lists (only tiny subset available for learning)
Advantages
Positions on lists are visible to the algorithms.
Important Contributions
Natural Language Processing
- tf-idf
- Okapi BM25
- Link to Information Theory
Interesting Nonlinear Evaluation Metrics
- P@k = Precision restricted to the best k items
- MAP = Mean Average Precision
- Discounted Cumulative Gain = DCG
Interesting Non-Standard Objective Functions
- (N)DCG as optimization objective
- non-continuous and non-smooth
Interesting Rankers
- Pointwise: Subset Ranking; McRank; PRanking (Ordinal Regression)
- Pairwise: RankNet; FRank; RankBoost; Ranking SVM
- Listwise: SoftRank; SoftNDCG; SVM-MAP; Structural SVM; AdaRank
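The evaluation metrics above can be sketched in a few lines. Two assumptions are made: DCG uses the linear-gain form gain / log2(position + 1) (an exponential-gain variant also exists), and average precision assumes that every relevant item appears somewhere in the ranked list. The last part sweeps one score to show why (N)DCG is non-continuous as an objective.

```python
import math


def precision_at_k(relevant_flags, k):
    """P@k: share of relevant items among the first k positions."""
    return sum(relevant_flags[:k]) / k


def average_precision(relevant_flags):
    """Mean of P@k over the positions k that hold a relevant item."""
    hits = [k for k, flag in enumerate(relevant_flags, start=1) if flag]
    if not hits:
        return 0.0
    return sum(precision_at_k(relevant_flags, k) for k in hits) / len(hits)


def mean_average_precision(lists_of_flags):
    """MAP: average precision, averaged over several ranked lists (queries)."""
    return sum(average_precision(f) for f in lists_of_flags) / len(lists_of_flags)


def dcg(gains):
    """Discounted cumulative gain of a ranked list of relevance gains."""
    return sum(g / math.log2(pos + 1) for pos, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def ndcg_of_scores(scores, gains):
    """NDCG of the ordering induced by sorting items by predicted score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ndcg([gains[i] for i in order])


# Sweep the score of the irrelevant third item past the other two scores.
GAINS = [3, 1, 0]
SWEEP = [ndcg_of_scores([2.0, 1.0, s], GAINS) for s in (0.0, 0.5, 0.9, 1.1, 1.9, 2.1)]
print(SWEEP)
```

The sweep is constant as long as the ordering does not change and jumps when two items swap places, so the gradient with respect to the scores is zero almost everywhere.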
4 Rankings and Recommender Systems
Example: Personalized Ad Recommendations
Standard Approaches
- Contextual Bandits
- Policies based on classifiers for each ad
- Collaborative Filtering
- Based on Latent Features, e.g. when using Matrix Factorization
Main Problem
- Extreme sparsity of positive feedback
Example: Personalized Ad Recommendations
New Approaches
- Still Contextual Bandits
- Policies based on rankers instead of classifiers
Recent Paper by Chaudhuri et al.
- Personalized Advertisement Recommendation: A Ranking Approach to Address the Ubiquitous Click Sparsity Problem
- Works best in the case of extreme sparsity
Thank you!