Efficiency/Effectiveness Trade-offs in Learning to Rank
Tutorial @ ICTIR 2017, http://learningtorank.isti.cnr.it/
Claudio Lucchese, Ca’ Foscari University of Venice
Venice, Italy
Franco Maria Nardini, HPC Lab, ISTI-CNR
Pisa, Italy
The Ranking Problem
Ranking is at the core of several IR Tasks:
• Document Ranking in Web Search
• Ads Ranking in Web Advertising
• Query suggestion & completion
• Product Recommendation
• Song Recommendation
• …
Lucchese C., Nardini F.M. Efficiency/Effectiveness Trade-offs in Learning to Rank
The Ranking Problem
Definition: Given a query q and a set of objects/documents D, rank D so as to maximize users’ satisfaction Q.
Goal #1: Effectiveness
• Maximize Q!
• But how to measure Q?
Goal #2: Efficiency
• Make sure the ranking process is feasible and not too expensive
• In Bing... “every 100 msec improves revenue by 0.6%. Every millisecond counts.” [KDF+13]
Agenda
1. Introduction to Learning to Rank (LtR)
   • Background, algorithms, sources of cost in LtR, multi-stage ranking
2. Dealing with the Efficiency/Effectiveness trade-off
   • Feature Selection, Enhanced Learning, Approximate Scoring, Fast Scoring
3. Hands-on I
   • Software, data and publicly available tools
   • Traversing Regression Forests, state-of-the-art (SoA) tools and analysis
4. Hands-on II
   • Training models, Pruning strategies, Efficient scoring
At the end of the day you’ll be able to train a high-quality ranking model, and to exploit SoA tools and techniques to reduce its computational cost by up to 18x!
Document Representations and Ranking
Document Representations
A document is a multi-set of words
A document may have fields, it can be split into zones, it can be enriched with external text data (e.g., anchors)
Additional information may be useful, such as In-Links, Out-Links, PageRank, # clicks, social links, etc.
Hundreds of signals in public LtR Datasets
Ranking Functions
• Term-weighting [SJ72]
• Vector Space Model [SB88]
• BM25 [JWR00], BM25f [RZT04]
• Language Modeling [PC98]
• Linear Combination of features [MC07]
How to combine hundreds of signals?
Ranking as a Supervised Learning Task
[Figure: Training Instance (query q; documents d1, d2, d3, …, di with labels y1, y2, y3, …, yi) → Machine Learning Algo (Neural Net, SVM, Decision Tree) + Loss Function → Ranking Model]
Query/Document Representation
Useful signals
• Link Analysis [H+00]
• Term proximity [RS03]
• Query classification [BSD10]
• Query intent mining [JLN16, LOP+13]
• Finding entities in documents [MW08] and in queries [BOM15]
• Document recency [DZK+10]
• Distributed representations of words and their compositionality [MSC+13]
• Convolutional neural networks [SHG+14]
• …
Relevance Labels Generation
• Explicit Feedback
  • Thousands of Search Quality Raters
  • Absolute vs. Relative Judgments [CBCD08]
• Implicit Feedback
  • Clicks/query chains [JGP+05, Joa02, RJ05]
  • De-biasing/click models [JSS17]
• Minimize annotation cost
  • Active Learning [LCZ+10]
  • Deep versus Shallow labelling [YR09]
[Figure: a (query q, document d) pair is assigned a relevance label y]
Evaluation Measures for Ranking
Many are in the form:
$$Q@k = \sum_{r=1}^{k} \mathrm{Gain}(d_r) \cdot \mathrm{Discount}(r)$$
• (N)DCG [JK00]: $\mathrm{Gain}(d) = 2^{y} - 1$, $\mathrm{Discount}(r) = \frac{1}{\log(r+1)}$
• RBP [MZ08]: $\mathrm{Gain}(d) = I(y)$, $\mathrm{Discount}(r) = (1-p)\,p^{r-1}$
• ERR [CMZG09]: $\mathrm{Gain}(d_i) = R_i \prod_{j=1}^{i-1} (1 - R_j)$ with $R_i = (2^{y_i} - 1)/2^{y_{\max}}$, $\mathrm{Discount}(r) = 1/r$
Do they match user satisfaction?
• ERR correlates better with user satisfaction (clicks and editorials) [CMZG09]
• Results Interleaving to compare two rankings [CJRY12]
  • “major revisions of the web search rankers [Bing] ... The differences between these rankers involve changes of over half a percentage point, in absolute terms, of NDCG”
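The Gain/Discount form above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the base-2 logarithm in the DCG discount, the default `p = 0.8` for RBP, and the default `y_max = 4` are assumptions, not values given in the slides.

```python
import math

def dcg_at_k(labels, k):
    """DCG@k with Gain = 2^y - 1 and Discount = 1 / log2(r + 1) (base 2 is an assumption)."""
    return sum((2 ** y - 1) / math.log2(r + 1)
               for r, y in enumerate(labels[:k], start=1))

def ndcg_at_k(labels, k):
    """DCG@k normalized by the DCG@k of the ideal (label-sorted) ranking."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0

def rbp(labels, p=0.8):
    """RBP with Gain = I(y) (1 if y > 0) and Discount = (1 - p) * p^(r - 1)."""
    return sum((1 - p) * p ** (r - 1)
               for r, y in enumerate(labels, start=1) if y > 0)

def err_at_k(labels, k, y_max=4):
    """ERR@k with R_i = (2^y_i - 1) / 2^y_max, Gain = R_i * prod_{j<i}(1 - R_j), Discount = 1/r."""
    score, not_yet_satisfied = 0.0, 1.0
    for r, y in enumerate(labels[:k], start=1):
        R = (2 ** y - 1) / 2 ** y_max
        score += not_yet_satisfied * R / r
        not_yet_satisfied *= 1 - R
    return score

# Labels of the documents in ranked order (hypothetical example).
ranking = [3, 2, 0, 1]
print(ndcg_at_k(ranking, 4), rbp(ranking), err_at_k(ranking, 4))
```

Each function takes the relevance labels in the order the ranker returned them, so comparing two rankers only requires re-ordering the same label list.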
Is it an easy or difficult task?
Gradient descent cannot be applied directly
Rank-based measures (NDCG, ERR, MAP, …) depend on the sorted order of the documents
• the gradient is either 0 (sorted order did not change) or undefined (discontinuity)
Solution: we need a proxy Loss function
• it should be differentiable
• and it should behave similarly to the original cost function
[Figure: NDCG@k as a function of the score of document di (model parameters), with the scores of d0, d1, d2, d3 marked on the x-axis]
[Figure: Proxy Quality Function as a function of the score of document di (model parameters), with the scores of d0, d1, d2, d3 marked on the x-axis]
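A small numeric experiment can make the "gradient is 0 or undefined" point concrete. The sketch below sweeps the score of one document while the others stay fixed and records NDCG; all labels and scores are hypothetical, and the base-2 logarithm is an assumption.

```python
import math

def ndcg(scores, labels):
    """NDCG of the ranking obtained by sorting documents by decreasing score."""
    def dcg(ls):
        return sum((2 ** y - 1) / math.log2(r + 1) for r, y in enumerate(ls, start=1))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return dcg([labels[i] for i in order]) / dcg(sorted(labels, reverse=True))

labels = [0, 2, 1, 3]            # hypothetical labels of d0, d1, d2, d3
fixed_scores = [0.1, 0.4, 0.6]   # hypothetical scores of d0, d1, d2

# Sweep the score of d3 and record the resulting NDCG.
curve = []
for step in range(11):
    s = 0.05 + step / 10
    curve.append((s, ndcg(fixed_scores + [s], labels)))

for s, q in curve:
    print(f"score(d3) = {s:.2f}  NDCG = {q:.4f}")
```

NDCG only changes when the score of d3 crosses the score of another document, so the curve is a staircase: flat between crossings and discontinuous at them.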
Point-Wise Algorithms
Each document is considered independently from the others
• No information about other candidates for the same query is used at training time
A different cost function is optimized
• Several approaches: Regression, Multi-Class Classification, Ordinal Regression, … [Liu11]
Among regression-based approaches: Gradient Boosting Regression Trees [Fri01]
• Mean Squared Error is minimized
[Figure: Training Instance (di, yi) → Training Algo: GBRT, Loss Function: MSE]
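Gradient Boosting Regression Trees
Iterative algorithm:
$$F(d) = \sum_i f_i(d)$$
where each $f_i$ is a weak learner.
Each $f_i$ is regarded as a step in the best optimization direction, i.e., a steepest descent step:
$$f_i(d) = -\rho_i \, g_i(d), \qquad -g_i(d) = -\left[ \frac{\partial L(y, f(d))}{\partial f(d)} \right]_{f = \sum_{j<i} f_j}$$
where $-g_i(d)$ is the negative gradient and $\rho_i$ is found by line search.
Given $L = \mathrm{MSE}/2$, the negative gradient is the pseudo-response $y - f(d)$:
$$-\frac{\partial \left[ \frac{1}{2} \mathrm{MSE}(y, f(d)) \right]}{\partial f(d)} = -\frac{\partial \left[ \frac{1}{2} \sum (y - f(d))^2 \right]}{\partial f(d)} = y - f(d)$$
The gradient $g_i$ is approximated by a Regression Tree $t_i$
[Figure: predicted document score vs. document d; trees t1, t2, t3 are fit in sequence to the residual (e.g., y − f1(d)), giving f1(d), f2(d), f3(d) and the error y − F(d)]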
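The iteration can be sketched in plain Python. This is a simplified illustration, not the tutorial's implementation: each weak learner is a depth-1 regression tree (a stump) over a single feature, and a fixed `shrinkage` factor stands in for the line search on the step size; `fit_stump`, `fit_gbrt`, `shrinkage`, and the toy data are all assumptions.

```python
def fit_stump(xs, residuals):
    """Weak learner: depth-1 regression tree (one threshold, two leaf means), least squares."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds; needs >= 2 distinct x values
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def fit_gbrt(xs, ys, n_trees=50, shrinkage=0.5):
    """Fit F(d) = sum_i f_i(d), each tree trained on the pseudo-response y - F(d)."""
    trees = []
    preds = [0.0] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of MSE/2
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + shrinkage * tree(x) for p, x in zip(preds, xs)]
    return lambda x: sum(shrinkage * t(x) for t in trees)

def mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data: one feature value per document and its relevance label (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 0.0, 1.0, 1.0, 3.0, 4.0]
model = fit_gbrt(xs, ys)
print(mse(model, xs, ys))
```

Because every tree is fit to what the current ensemble still gets wrong, the training error shrinks as trees are added, which is also why scoring cost grows linearly with the number of trees.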
Pair-wise Algorithms: RankNet [BSR+05]
Documents are considered in pairs
The estimated probability that $d_i$ is better than $d_j$ is:
$$P_{ij} = \frac{e^{o_{ij}}}{1 + e^{o_{ij}}}, \qquad o_{ij} = F(d_i) - F(d_j)$$
Let $Q_{ij}$ be the true probability; the Cross Entropy Loss is:
$$C_{ij} = -Q_{ij} \log P_{ij} - (1 - Q_{ij}) \log(1 - P_{ij})$$
We consider only pairs where $d_i$ is better than $d_j$, i.e., $y_i > y_j$ (so that $Q_{ij} = 1$):
$$C_{ij} = \log(1 + e^{-o_{ij}})$$
• If $o_{ij} \to +\infty$ (i.e., correctly ordered), $C_{ij} \to 0$
• If $o_{ij} \to -\infty$ (i.e., mis-ordered), $C_{ij} \to +\infty$
This is differentiable: it is used to train a Neural Network with back-propagation.
Other approaches: Ranking-SVM [Joa02], RankBoost [FISS03], …
[Figure: Training Instance (di, dj) with yi > yj → Training Algo: ANN, Loss: Cross Entropy]
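The pairwise probability, cost, and derivative can be checked numerically with the sketch below. The function names are illustrative, and the numerically stable formulations are an implementation choice rather than something stated in the slides.

```python
import math

def ranknet_prob(s_i, s_j):
    """P_ij = e^o / (1 + e^o) with o = s_i - s_j, computed without overflow."""
    o = s_i - s_j
    if o >= 0:
        return 1.0 / (1.0 + math.exp(-o))
    e = math.exp(o)
    return e / (1.0 + e)

def ranknet_cost(s_i, s_j):
    """C_ij = log(1 + e^(-o)) for a pair where d_i should rank above d_j (Q_ij = 1)."""
    o = s_i - s_j
    # Stable form of log(1 + e^(-o)): max(-o, 0) + log(1 + e^(-|o|)).
    return max(-o, 0.0) + math.log1p(math.exp(-abs(o)))

def ranknet_cost_grad(s_i, s_j):
    """dC_ij / do_ij = -1 / (1 + e^o) = -(1 - P_ij) = -P_ji."""
    return -ranknet_prob(s_j, s_i)

for o in (-5.0, 0.0, 5.0):
    print(o, ranknet_prob(o, 0.0), ranknet_cost(o, 0.0), ranknet_cost_grad(o, 0.0))
```

The gradient magnitude is close to 1 for badly mis-ordered pairs and close to 0 for well-separated ones, so training effort concentrates on the pairs that are still wrong.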
Pair-wise Algorithms
• RankNet performs better than other pairwise algorithms
• The RankNet cost is not nicely correlated with NDCG quality
See [CQL+07].
List-wise Algorithms: LambdaMART [Bur10]
Recall: GBRT requires a gradient $g_i$ for every $d_i$
First: estimate the gradient by comparing $d_i$ to $d_j$, with $y_i > y_j$:
$$\lambda_{ij} = \frac{1}{1 + e^{o_{ij}}} \, |\Delta \mathrm{NDCG}| = -\lambda_{ji}$$
where $\frac{1}{1 + e^{o_{ij}}}$ is the derivative of the negative RankNet cost and $|\Delta \mathrm{NDCG}|$ is the change in quality after swapping $d_i$ with $d_j$.
• If $o_{ij} \to +\infty$ (i.e., correctly ordered), $\lambda_{ij} \to 0$
• If $o_{ij} \to -\infty$ (i.e., mis-ordered), $\lambda_{ij} \to |\Delta \mathrm{NDCG}|$
Then: estimate the gradient by comparing $d_i$ to every other $d_j$ for $q$:
$$g_i = \lambda_i = \sum_{y_i > y_j} \lambda_{ij} - \sum_{y_i < y_j} \lambda_{ij}$$
Top documents are more relevant!
[Figure: Training Instance: query q with documents d1, d2, d3, …, dj, …, d|q|; each (di, λi) → Training Algo: GBRT, Loss Function: MSE]
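The per-document lambdas for a single query can be sketched as follows. This is an illustrative reading of the formulas, under these assumptions: for every pair with y_i > y_j the value λ_ij is added to d_i and subtracted from d_j; |ΔNDCG| is computed from the DCG gain 2^y − 1 and a base-2 log discount; the names `lambdas`, `gain`, and `discount` are hypothetical.

```python
import math

def gain(y):
    return 2 ** y - 1

def discount(rank):
    return 1.0 / math.log2(rank + 1)

def lambdas(scores, labels):
    """Return lambda_i for every document of one query."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    rank = {doc: r for r, doc in enumerate(order, start=1)}
    ideal_dcg = sum(gain(y) * discount(r)
                    for r, y in enumerate(sorted(labels, reverse=True), start=1))
    lam = [0.0] * n
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:
                o_ij = scores[i] - scores[j]
                # |Delta NDCG| obtained by swapping the ranks of d_i and d_j.
                delta = abs((gain(labels[i]) - gain(labels[j]))
                            * (discount(rank[i]) - discount(rank[j]))) / ideal_dcg
                l_ij = delta / (1.0 + math.exp(o_ij))
                lam[i] += l_ij   # push the more relevant document up
                lam[j] -= l_ij   # push the less relevant document down
    return lam

# Hypothetical query: the most relevant document currently has the lowest score.
print(lambdas([0.0, 1.0, 2.0], [2, 0, 1]))
```

Because |ΔNDCG| is larger for swaps that involve top positions, mistakes near the top of the list produce larger lambdas than mistakes further down.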
List-wise Algorithms: some results
• NDCG@10 on public LtR Datasets

Algorithm     MSN10K   Y!S1     Y!S2     Istella-S
Ranking SVM   0.4012   0.7238   0.7306   N/A
GBRT          0.4602   0.7555   0.7620   0.7313
LambdaMART    0.4618   0.7529   0.7531   0.7537

Other approaches: ListNet/ListMLE [CQL+07], ApproximateRank [QLL10], SVMAP [YFRJ07], RankGP [YLKY07], others...
Learning to Rank Algorithms
• New approaches to optimize IR measures:
  • DirectRank [XLL+08], LambdaMART [Bur10], BLMart [GCL11], SSLambdaMART [SY11], CoList [GY14], LogisticRank [YHT+16], …
  • See [Liu11] [TBH15].
• Deep Learning to improve query-document matching:
  • Conv. DNN [SM15], DSSM [HHG+13], Dual-Embedding [MNCC16], Local and Distributed repr. [MDC17], Weak Supervision [DZS+17], Neural Click Model [BMdRS16], …
• On-line learning:
  • Multi-armed bandits [RKJ08], Dueling bandits [YJ09], K-armed dueling bandits [YBKJ12], online learning [HSWdR13] [HWdR13], …
[Figure from [Liu11]]
In this tutorial we focus on GBRTs
Ads Click Prediction: GBDT as a feature extractor, then LogReg [HPJ+14]
Ads Click Prediction: refine/boost NN output [LDG+17]
Product Ranking: 100 GBDTs with pairwise ranking [SCP16]
Document Ranking: GBDT named LogisticRank [YHT+16]
Ranking, forecasting & recommendations: Oblivious GBRT
In this tutorial we focus on GBRTs
• Successful in several Data Challenges:
  • Winner of the Yahoo! LtR Challenge: combination of 12 ranking models, 8 of which were LambdaMART models, each having up to 3,000 trees [CC11]
  • According to the 2015 statistics, GBRTs were adopted by the majority of the winning solutions among the Kaggle competitions, even more than the popular deep networks, and all the top-10 teams qualified in the KDDCup 2015 used GBRT-based algorithms [CG16]
• New interesting open-source implementations:
  • XGBoost, LightGBM by Microsoft, CatBoost by Yandex
• Pluggable within Apache Lucene/Solr
Single-Stage Ranking
Requires applying the learnt model to every matching document, and generating the required features for each of them. Not feasible!
We have at least 3 efficiency vs. effectiveness trade-offs.
[Figure: Query → RANKER → Results]
Single-Stage Ranking
① Feature Computation Trade-off
• Computationally expensive & highly discriminative features vs. computationally cheap & slightly discriminative features
[Figure: Query → RANKER → Results]
Two-Stage Ranking
Expensive features are computed only for the top-K candidate documents passing the first stage.
How to choose K?
② Number of Matching Candidates Trade-off:
• a large set of candidates is expensive and produces high-quality results vs. a small set of candidates is cheap and produces low-quality results
  • 1000 documents [DBC13] (Gov2, ClueWeb09-B collections)
  • 1500-2000 documents [MSO13] (ClueWeb09-B)
  • “hundreds of thousands” (over “hundreds of machines”) [YHT+16]
[Figure: Query → STAGE 1: Matching/Recall-oriented Ranking → Query + top-K docs → STAGE 2: Precision-oriented Ranking → Results]
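A two-stage pipeline can be sketched as follows. Everything here is illustrative: `two_stage_rank`, the toy `cheap` and `expensive` scorers, and the tiny collection are assumptions chosen only to show that the expensive scorer runs on K documents instead of the whole collection.

```python
import heapq

def two_stage_rank(query, docs, cheap_score, expensive_score, k):
    """Stage 1 keeps the top-k docs by a cheap score; stage 2 re-ranks only those."""
    candidates = heapq.nlargest(k, docs, key=lambda d: cheap_score(query, d))
    return sorted(candidates, key=lambda d: expensive_score(query, d), reverse=True)

calls = {"expensive": 0}  # counts how often the expensive scorer is invoked

def cheap(query, doc):
    # Hypothetical cheap signal: number of distinct query terms in the document.
    return len(set(query.split()) & set(doc.split()))

def expensive(query, doc):
    # Hypothetical stand-in for an expensive model: length-normalized term frequency.
    calls["expensive"] += 1
    words = doc.split()
    return sum(words.count(t) for t in query.split()) / len(words)

docs = [
    "learning to rank",
    "how to cook",
    "cooking pasta at home",
    "efficient learning to rank models",
    "a recipe for bread",
]
top = two_stage_rank("learning to rank", docs, cheap, expensive, k=3)
print(top, calls["expensive"])
```

Raising `k` lets more relevant documents survive the first stage at the price of more expensive-scorer calls, which is the trade-off the slide describes.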
Multi-Stage Ranking
• 3 stages [YHT+16]: Contextual features are considered in the 3rd stage
  • Contextual => about the current result set
  • Rank based on specific features, Mean, Variance, Standardized features (see also [LNO+15]), Topic model similarity
  • First two stages are executed at each serving node
• N stages [CGBC17]: Which model in each stage? Which features? How many documents?
  • About 200 configurations tested
  • Best results with N=3 stages, 2500 and 700 docs between stages
• A proper methodology/algorithm for choosing the best configuration is still missing.
[Figure: Query → STAGE 1: Matching/Recall-oriented Ranking → STAGE 2: Precision-oriented Ranking → Query + Top 30 → STAGE 3: Contextual Ranking → Results]
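Contextual features of the kind listed above (rank, mean, variance, standardized value) depend on the whole current result set, which is why they can only be computed once the candidates are known. A minimal sketch for a single base feature follows; the function name, the output layout, and the use of the population variance are assumptions, not details from [YHT+16] or [LNO+15].

```python
import statistics

def contextual_features(values):
    """values: one base feature's value for each document in the current result set."""
    mean = statistics.mean(values)
    variance = statistics.pvariance(values)  # population variance over the result set
    std = variance ** 0.5
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    rank = [0] * len(values)
    for r, i in enumerate(order, start=1):
        rank[i] = r
    return [
        {
            "rank": rank[i],
            "mean": mean,
            "variance": variance,
            "standardized": (v - mean) / std if std > 0 else 0.0,
        }
        for i, v in enumerate(values)
    ]

# Hypothetical values of one base feature (e.g., a BM25 score) for three candidates.
features = contextual_features([1.0, 3.0, 2.0])
for f in features:
    print(f)
```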
Multi-Stage Ranking
③ Model Complexity Trade-off:
• Complex & slow high-quality models vs. simple & fast low-quality models:
  • Complex, such as: Random Forest, GBRT, Initialized GBRT, LambdaMART
  • Simple, such as: Coordinate Ascent, Ridge Regression, SVM-Rank, RankBoost
  • In-between, such as: Oblivious LambdaMART, ListNet
[Figure: Query → STAGE i-1: Cheap Ranker → STAGE i: Accurate Ranker → STAGE i+1: Very Accurate Ranker → Results]
Model Complexity Trade-off
• Comparison on varying training parameters [CLN+16]:
  • # trees, # leaves, learning rate, etc.
• Complex models achieve significantly higher quality
• Best model depends on time budget
• Today is about the Model Complexity Trade-off!
Next…
Efficiency/Effectiveness trade-offs in:
• Feature Selection
• Enhanced Learning Algorithms
• Approximate Scoring
• Fast Scoring
References
[BMdRS16] Alexey Borisov, Ilya Markov, Maarten de Rijke, and Pavel Serdyukov. A neural click model for web search. In Proceedings of the 25th International Conference on World Wide Web, pages 531–541. International World Wide Web Conferences Steering Committee, 2016.
[BOM15] Roi Blanco, Giuseppe Ottaviano, and Edgar Meij. Fast and space-efficient entity linking for queries. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, pages 179–188. ACM, 2015.
[BSD10] Paul N Bennett, Krysta Svore, and Susan T Dumais. Classification-enhanced ranking. In Proceedings of the 19th international conference on World wide web, pages 111–120. ACM, 2010.
[BSR+05] Chris Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender. Learning to rank using gradient descent. In Proceedings of the 22nd international conference on Machine learning, pages 89–96. ACM, 2005.
[Bur10] Christopher J.C. Burges. From ranknet to lambdarank to lambdamart: An overview. Technical Report MSR-TR-2010-82, June 2010.
[CBCD08] Ben Carterette, Paul Bennett, David Chickering, and Susan Dumais. Here or there: Preference Judgments for Relevance. Advances in Information Retrieval, pages 16–27, 2008.
[CC11] Olivier Chapelle and Yi Chang. Yahoo! learning to rank challenge overview. In Proceedings of the Learning to Rank Challenge, pages 1–24, 2011.
[CCL11] Olivier Chapelle, Yi Chang, and T-Y Liu. Future directions in learning to rank. In Proceedings of the Learning to Rank Challenge, pages 91–100, 2011.
[CG16] Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, pages 785–794, New York, NY, USA, 2016. ACM.
[CGBC17] Ruey-Cheng Chen, Luke Gallagher, Roi Blanco, and J. Shane Culpepper. Efficient cost-aware cascade ranking in multi-stage retrieval. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '17, pages 445–454, New York, NY, USA, 2017. ACM.
[CJRY12] Olivier Chapelle, Thorsten Joachims, Filip Radlinski, and Yisong Yue. Large-scale validation and analysis of interleaved search evaluation. ACM Transactions on Information Systems (TOIS), 30(1):6, 2012.
[CLN+16] Gabriele Capannini, Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, and Nicola Tonellotto. Quality versus efficiency in document scoring with learning-to-rank models. Inf. Process. Manage., 52(6):1161–1177, November 2016.
[CMZG09] Olivier Chapelle, Donald Metzler, Ya Zhang, and Pierre Grinspan. Expected reciprocal rank for graded relevance. In Proceedings of the 18th ACM conference on Information and knowledge management, pages 621–630. ACM, 2009.
[CQL+07] Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li. Learning to rank: from pairwise approach to listwise approach. In Proceedings of the 24th international conference on Machine learning, pages 129–136. ACM, 2007.
[DBC13] Van Dang, Michael Bendersky, and W Bruce Croft. Two-stage learning to rank for information retrieval. In Advances in Information Retrieval, pages 423–434. Springer, 2013.
[DZK+10] Anlei Dong, Ruiqiang Zhang, Pranam Kolari, Jing Bai, Fernando Diaz, Yi Chang, Zhaohui Zheng, and Hongyuan Zha. Time is of the essence: improving recency ranking using twitter data. In Proceedings of the 19th international conference on World wide web, pages 331–340. ACM, 2010.
[DZS+17] Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps, and W. Bruce Croft. Neural ranking models with weak supervision. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, August 7-11, 2017, pages 65–74. ACM, 2017.
[FISS03] Yoav Freund, Raj Iyer, Robert E Schapire, and Yoram Singer. An efficient boosting algorithm for combining preferences. Journal of machine learning research, 4(Nov):933–969, 2003.
[Fri01] Jerome H Friedman. Greedy function approximation: a gradient boosting machine. Annals of statistics, pages 1189–1232, 2001.
[GCL11] Yasser Ganjisaffar, Rich Caruana, and Cristina Videira Lopes. Bagging gradient-boosted trees for high precision, low variance ranking models. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, pages 85–94. ACM, 2011.
[GY14] Wei Gao and Pei Yang. Democracy is good for ranking: Towards multi-view rank learning and adaptation in web search. In Proceedings of the 7th ACM international conference on Web search and data mining, pages 63–72. ACM, 2014.
[H+00] Monika Rauch Henzinger et al. Link analysis in web information retrieval. IEEE Data Eng. Bull., 23(3):3–8, 2000.
[HHG+13] Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. Learning deep structured semantic models for web search using clickthrough data. In Proceedings of the 22nd ACM international conference on Conference on information & knowledge management, pages 2333–2338. ACM, 2013.
[HPJ+14] Xinran He, Junfeng Pan, Ou Jin, Tianbing Xu, Bo Liu, Tao Xu, Yanxin Shi, Antoine Atallah, Ralf Herbrich, Stuart Bowers, et al. Practical lessons from predicting clicks on ads at facebook. In Proceedings of the Eighth International Workshop on Data Mining for Online Advertising, pages 1–9. ACM, 2014.
[HSWdR13] Katja Hofmann, Anne Schuth, Shimon Whiteson, and Maarten de Rijke. Reusing historical interaction data for faster online learning to rank for IR. In Proceedings of the sixth ACM international conference on Web search and data mining, pages 183–192. ACM, 2013.
[HWdR13] Katja Hofmann, Shimon Whiteson, and Maarten de Rijke. Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval. Information Retrieval, 16(1):63–90, 2013.
[JGP+05] Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. Accurately interpreting clickthrough data as implicit feedback. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 154–161. ACM, 2005.
[JK00] Kalervo Järvelin and Jaana Kekäläinen. IR evaluation methods for retrieving highly relevant documents. In Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, pages 41–48. ACM, 2000.
[JLN16] Di Jiang, Kenneth Wai-Ting Leung, and Wilfred Ng. Query intent mining with multiple dimensions of web search data. World Wide Web, 19(3):475–497, 2016.
[Joa02] Thorsten Joachims. Optimizing search engines using clickthrough data. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 133–142. ACM, 2002.
[JSS17] Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. Unbiased learning-to-rank with biased feedback. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. ACM, 2017.
[JWR00] K Sparck Jones, Steve Walker, and Stephen E. Robertson. A probabilistic model of information retrieval: development and comparative experiments: Part 2. Information processing & management, 36(6):809–840, 2000.
[KDF+13] Ron Kohavi, Alex Deng, Brian Frasca, Toby Walker, Ya Xu, and Nils Pohlmann. Online controlled experiments at large scale. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1168–1176. ACM, 2013.
[LCZ+10] Bo Long, Olivier Chapelle, Ya Zhang, Yi Chang, Zhaohui Zheng, and Belle Tseng. Active learning for ranking through expected loss optimization. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 267–274. ACM, 2010.
[LDG+17] Xiaoliang Ling, Weiwei Deng, Chen Gu, Hucheng Zhou, Cui Li, and Feng Sun. Model ensemble for click prediction in bing search ads. In Proceedings of the 26th International Conference on World Wide Web Companion, pages 689–698. International World Wide Web Conferences Steering Committee, 2017.
[Liu11] Tie-Yan Liu. Learning to rank for information retrieval. Springer, 2011.
[LNO+15] Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, and Nicola Tonellotto. Speeding up document ranking with rank-based features. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 895–898. ACM, 2015.
[LOP+13] Claudio Lucchese, Salvatore Orlando, Raffaele Perego, Fabrizio Silvestri, and Gabriele Tolomei. Discovering tasks from search engine query logs. ACM Transactions on Information Systems (TOIS), 31(3):14, 2013.
[MC07] Donald Metzler and W Bruce Croft. Linear feature-based models for information retrieval. Information Retrieval, 10(3):257–274, 2007.
[MDC17] Bhaskar Mitra, Fernando Diaz, and Nick Craswell. Learning to match using local and distributed representations of text for web search. In Proceedings of the 26th International Conference on World Wide Web, pages 1291–1299. International World Wide Web Conferences Steering Committee, 2017.
[MNCC16] Bhaskar Mitra, Eric Nalisnick, Nick Craswell, and Rich Caruana. A dual embedding space model for document ranking. arXiv preprint arXiv:1602.01137, 2016.
[MSC+13] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pages 3111–3119, 2013.
[MSO13] Craig Macdonald, Rodrygo LT Santos, and Iadh Ounis. The whens and hows of learning to rank for web search. Information Retrieval, 16(5):584–628, 2013.
[MW08] David Milne and Ian H Witten. Learning to link with wikipedia. In Proceedings of the 17th ACM conference on Information and knowledge management, pages 509–518. ACM, 2008.
[MZ08] Alistair Moffat and Justin Zobel. Rank-biased precision for measurement of retrieval effectiveness. ACM Transactions on Information Systems (TOIS), 27(1):2, 2008.
[PC98] Jay M Ponte and W Bruce Croft. A language modeling approach to information retrieval. In Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pages 275–281. ACM, 1998.
[QLL10] Tao Qin, Tie-Yan Liu, and Hang Li. A general approximation framework for direct optimization of information retrieval measures. Information retrieval, 13(4):375–397, 2010.
[RJ05] Filip Radlinski and Thorsten Joachims. Query chains: learning to rank from implicit feedback. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 239–248. ACM, 2005.
[RKJ08] Filip Radlinski, Robert Kleinberg, and Thorsten Joachims. Learning diverse rankings with multi-armed bandits. In Proceedings of the 25th international conference on Machine learning, pages 784–791. ACM, 2008.
[RS03] Yves Rasolofo and Jacques Savoy. Term proximity scoring for keyword-based retrieval systems. Advances in information retrieval, pages 79–79, 2003.
[RZT04] Stephen Robertson, Hugo Zaragoza, and Michael Taylor. Simple bm25 extension to multiple weighted fields. In Proceedings of the thirteenth ACM international conference on Information and knowledge management, pages 42–49. ACM, 2004.
[SB88] Gerard Salton and Christopher Buckley. Term-weighting approaches in automatic text retrieval. Information processing & management, 24(5):513–523, 1988.
[SCP16] Daria Sorokina and Erick Cantu-Paz. Amazon search: The joy of ranking products. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, pages 459–460. ACM, 2016.
[SHG+14] Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Gregoire Mesnil. Learning semantic representations using convolutional neural networks for web search. In Proceedings of the 23rd International Conference on World Wide Web, pages 373–374. ACM, 2014.
[SJ72] Karen Sparck Jones. A statistical interpretation of term specificity and its application in retrieval. Journal of documentation, 28(1):11–21, 1972.
[SM15] Aliaksei Severyn and Alessandro Moschitti. Learning to rank short text pairs with convolutional deep neural networks. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 373–382. ACM, 2015.
[SY11] Martin Szummer and Emine Yilmaz. Semi-supervised learning to rank with preference regularization. In Proceedings of the 20th ACM international conference on Information and knowledge management, pages 269–278. ACM, 2011.
[TBH15] Niek Tax, Sander Bockting, and Djoerd Hiemstra. A cross-benchmark comparison of 87 learning to rank methods. Information processing & management, 51(6):757–772, 2015.
[XLL+08] Jun Xu, Tie-Yan Liu, Min Lu, Hang Li, and Wei-Ying Ma. Directly optimizing evaluation measures in learning to rank. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 107–114. ACM, 2008.
[YBKJ12] Yisong Yue, Josef Broder, Robert Kleinberg, and Thorsten Joachims. The k-armed dueling bandits problem. Journal of Computer and System Sciences, 78(5):1538–1556, 2012.
[YFRJ07] Yisong Yue, Thomas Finley, Filip Radlinski, and Thorsten Joachims. A support vector method for optimizing average precision. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 271–278. ACM, 2007.
[YHT+16] Dawei Yin, Yuening Hu, Jiliang Tang, Tim Daly, Mianwei Zhou, Hua Ouyang, Jianhui Chen, Changsung Kang, Hongbo Deng, Chikashi Nobata, et al. Ranking relevance in yahoo search. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 323–332. ACM, 2016.
[YJ09] Yisong Yue and Thorsten Joachims. Interactively optimizing information retrieval systems as a dueling bandits problem. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1201–1208. ACM, 2009.
[YLKY07] Jen-Yuan Yeh, Jung-Yi Lin, Hao-Ren Ke, and Wei-Pang Yang. Learning to rank for information retrieval using genetic programming. In Proceedings of SIGIR 2007 Workshop on Learning to Rank for Information Retrieval (LR4IR 2007), 2007.
[YR09] Emine Yilmaz and Stephen Robertson. Deep versus shallow judgments in learning to rank. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 662–663. ACM, 2009.