practical advice for building machine learning applications · 2019-12-05 · machine learning...

69
Machine Learning Practical Advice for Building Machine Learning Applications 1 Based on lectures and papers by Andrew Ng, Pedro Domingos, Tom Mitchell and others

Upload: others

Post on 05-Jun-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

MachineLearning

PracticalAdvicefor

BuildingMachineLearningApplications

1BasedonlecturesandpapersbyAndrewNg,PedroDomingos,TomMitchellandothers

Page 2: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

MLandtheworld

• Diagnosticsofyourlearningalgorithm

• Erroranalysis

• InjectingmachinelearningintoYourFavoriteTask

• Makingmachinelearningmatter

2

MakingMLworkintheworld

MostlyexperientialadviceAlsobasedonwhatotherpeoplehavesaidSeereadingsonclasswebsite

Page 3: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

MLandtheworld

• Diagnosticsofyourlearningalgorithm

• Erroranalysis

• InjectingmachinelearningintoYourFavoriteTask

• Makingmachinelearningmatter

3

Page 4: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Debuggingmachinelearning

SupposeyoutrainanSVMoralogisticregressionclassifierforspamdetection

Youobviously followbestpracticesforfindinghyper-parameters(suchascross-validation)

Yourclassifierisonly75%accurate

Whatcanyoudotoimproveit?

4

(assumingthattherearenobugsinthecode)

Page 5: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Differentwaystoimproveyourmodel

Moretrainingdata

Features1. Usemorefeatures2. Usefewerfeatures3. Useotherfeatures

Bettertraining1. Runformoreiterations2. Useadifferentalgorithm3. Useadifferentclassifier4. Playwithregularization

5

Page 6: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Differentwaystoimproveyourmodel

Moretrainingdata

Features1. Usemorefeatures2. Usefewerfeatures3. Useotherfeatures

Bettertraining1. Runformoreiterations2. Useadifferentalgorithm3. Useadifferentclassifier4. Playwithregularization

6

Tedious!Andpronetoerrors,dependenceonluck

Letustrytomakethisprocessmoremethodical

Page 7: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

First,diagnostics

Somepossibleproblems:

1. Over-fitting(highvariance)

2. Under-fitting(highbias)

3. Yourlearningdoesnotconverge

4. Areyoumeasuringtherightthing?

7

Easiertofixaproblemifyouknowwhereitis

Page 8: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Detectingoverorunderfitting

Over-fitting:Thetrainingaccuracyismuchhigherthanthetestaccuracy– Themodelexplainsthetrainingsetverywell,butpoorgeneralization

Under-fitting:Bothaccuraciesareunacceptablylow– Themodelcannotrepresenttheconceptwellenough

8

Page 9: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Detectinghighvariance usinglearningcurves

9

Sizeoftrainingdata

Error

Trainingerror

Page 10: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Detectinghighvariance usinglearningcurves

10

Sizeoftrainingdata

Error

Trainingerror

Generalizationerror/testerror

Page 11: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Detectinghighvariance usinglearningcurves

11

Testerrorkeepsdecreasingastrainingsetincreases)moredatawillhelp

Sizeoftrainingdata

Error

Largegapbetweentrainandtesterror

Trainingerror

Generalizationerror/testerror

Typicallyseenformorecomplexmodels

Page 12: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Detectinghighbiasusinglearningcurves

12

Sizeoftrainingset

Error Trainingerror

Generalizationerror/testerror

Bothtrainandtesterrorareunacceptable(Butthemodelseemstoconverge)

Typicallyseenformoresimplemodels

Page 13: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Differentwaystoimproveyourmodel

Moretrainingdata

Features1. Usemorefeatures2. Usefewerfeatures3. Useotherfeatures

Bettertraining1. Runformoreiterations2. Useadifferentalgorithm3. Useadifferentclassifier4. Playwithregularization

13

Page 14: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Differentwaystoimproveyourmodel

Moretrainingdata

Features1. Usemorefeatures2. Usefewerfeatures3. Useotherfeatures

Bettertraining1. Runformoreiterations2. Useadifferentalgorithm3. Useadifferentclassifier4. Playwithregularization

14

Helpswithover-fitting

Helpswithunder-fittingHelpswithover-fittingCouldhelpwithover-fittingandunder-fitting

Couldhelpwithover-fittingandunder-fitting

Page 15: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Diagnostics

Somepossibleproblems:

ü Over-fitting(highvariance)

ü Under-fitting(highbias)

3. Yourlearningdoesnotconverge

4. Areyoumeasuringtherightthing?

15

Easiertofixaproblemifyouknowwhereitis

Page 16: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Doesyourlearningalgorithmconverge?

Iflearningisframedasanoptimizationproblem,tracktheobjective

16

Iterations

ObjectiveNotyetconvergedhere Convergedhere

Page 17: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Doesyourlearningalgorithmconverge?

Iflearningisframedasanoptimizationproblem,tracktheobjective

17

Iterations

ObjectiveNotyetconvergedhere Howabouthere?

Notalwayseasytodecide

Page 18: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Doesyourlearningalgorithmconverge?

Iflearningisframedasanoptimizationproblem,tracktheobjective

18

Iterations

Objective

Somethingiswrong

Page 19: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Doesyourlearningalgorithmconverge?

Iflearningisframedasanoptimizationproblem,tracktheobjective

19

Iterations

Objective

Helpstodebug

Somethingiswrong

Ifwearedoinggradientdescentonaconvexfunctiontheobjectivecan’tincrease(Caveat:ForSGD,theobjectivewillslightlyincreaseoccasionally,butnotbymuch)

Page 20: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Differentwaystoimproveyourmodel

Moretrainingdata

Features1. Usemorefeatures2. Usefewerfeatures3. Useotherfeatures

Bettertraining1. Runformoreiterations2. Useadifferentalgorithm3. Useadifferentclassifier4. Playwithregularization

20

Helpswithoverfitting

Helpswithunder-fittingHelpswithover-fittingCouldhelpwithover-fittingandunder-fitting

Couldhelpwithover-fittingandunder-fitting

Page 21: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Differentwaystoimproveyourmodel

Moretrainingdata

Features1. Usemorefeatures2. Usefewerfeatures3. Useotherfeatures

Bettertraining1. Runformoreiterations2. Useadifferentalgorithm3. Useadifferentclassifier4. Playwithregularization

21

Helpswithoverfitting

Helpswithunder-fittingHelpswithover-fittingCouldhelpwithover-fittingandunder-fitting

Couldhelpwithover-fittingandunder-fitting

Tracktheobjectiveforconvergence

Page 22: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Diagnostics

Somepossibleproblems:

ü Over-fitting(highvariance)

ü Under-fitting(highbias)

ü Yourlearningdoesnotconverge

4. Areyoumeasuringtherightthing?

22

Easiertofixaproblemifyouknowwhereitis

Page 23: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Whattomeasure

• Accuracyofpredictionisthemostcommonmeasurement

• Butifyourdatasetisunbalanced,accuracymaybemisleading– 1000positiveexamples,1negativeexample– Aclassifierthatalwayspredictspositivewillget99.9%accuracy.Hasit

reallylearnedanything?

• Unbalancedlabelsàmeasurelabelspecificprecision,recallandF-measure– Precisionforalabel:Amongexamplesthatarepredictedwithlabel,what

fractionarecorrect– Recallforalabel:Amongtheexampleswithgivengroundtruthlabel,

whatfractionarecorrect– F-measure:Harmonicmeanofprecisionandrecall

23

Page 24: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

MLandtheworld

• Diagnosticsofyourlearningalgorithm

• Erroranalysis

• InjectingmachinelearningintoYourFavoriteTask

• Makingmachinelearningmatter

24

Page 25: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

MachineLearninginthisclass

25

MLcode

Page 26: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

MachineLearningincontext

26Figurefrom[Sculley,etalNIPS2015]

Page 27: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

ErrorAnalysis

Generallymachinelearningplaysasmall(butimportant)roleinalargerapplication• Pre-processing• Featureextraction(possiblybyotherMLbasedmethods)• Datatransformations

Howmuchdoeachofthesecontributetotheerror?

Erroranalysistriestoexplainwhyasystemisnotperformingperfectly

27

Page 28: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Example:Atypicaltextprocessingpipeline

28

Page 29: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Example:Atypicaltextprocessingpipeline

29

Text

Page 30: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Example:Atypicaltextprocessingpipeline

30

Text

Words

Page 31: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Example:Atypicaltextprocessingpipeline

31

Text

Words

Parts-of-speech

Page 32: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Example:Atypicaltextprocessingpipeline

32

Text

Words

Parts-of-speech

Parsetrees

Page 33: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Example:Atypicaltextprocessingpipeline

33

Text

Words

Parts-of-speech

Parsetrees

AML-basedapplication

Page 34: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Example:Atypicaltextprocessingpipeline

34

Text

Words

Parts-of-speech

Parsetrees

AML-basedapplication

EachofthesecouldbeMLdriven

Ordeterministic

Butstillerrorprone

Page 35: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Example:Atypicaltextprocessingpipeline

35

Text

Words

Parts-of-speech

Parsetrees

AML-basedapplication

EachofthesecouldbeMLdriven

Ordeterministic

Butstillerrorprone

Howmuchdoeachofthesecontributetotheerrorofthefinalapplication?

Page 36: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Trackingerrorsinacomplexsystem

Pluginthegroundtruthfortheintermediatecomponentsandseehowmuchtheaccuracyofthefinalsystemchanges

36

System Accuracy

End-to-endpredicted 55%

Withgroundtruth words 60%

+groundtruthparts-of-speech 84%

+groundtruthparse trees 89%

+ground truthfinaloutput 100%

Page 37: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Trackingerrorsinacomplexsystem

Pluginthegroundtruthfortheintermediatecomponentsandseehowmuchtheaccuracyofthefinalsystemchanges

37

Errorinthepart-of-speechcomponenthurtsthemost

System Accuracy

End-to-endpredicted 55%

Withgroundtruth words 60%

+groundtruthparts-of-speech 84%

+groundtruthparse trees 89%

+ground truthfinaloutput 100%

Page 38: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Ablativestudy

Explainingdifferencebetweentheperformancebetweenastrongmodelandamuchweakerone(abaseline)

UsuallyseenwithfeaturesSupposewehaveacollectionoffeaturesandoursystemdoeswell,butwedon’tknowwhichfeaturesaregivingustheperformance

Evaluatesimplersystemsthatprogressivelyusefewerandfewerfeaturestoseewhichfeaturesgivethehighestboost

38

Itisnotenoughtohaveaclassifierthatworks;itisusefultoknowwhyitworks.

Helpsinterpretpredictions,diagnoseerrorsandcanprovideanaudittrail

Page 39: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

MLandtheworld

• Diagnosticsofyourlearningalgorithm

• Erroranalysis

• InjectingmachinelearningintoYourFavoriteTask

• Makingmachinelearningmatter

39

Page 40: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Classifyingfish

SayyouwanttobuildaclassifierthatidentifieswhetherarealphysicalfishissalmonortunaHowdoyougoaboutthis?

40

Page 41: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Classifyingfish

SayyouwanttobuildaclassifierthatidentifieswhetherarealphysicalfishissalmonortunaHowdoyougoaboutthis?

41

Theslowapproach

1. Carefullyidentifyfeatures,getthebestdata,thesoftwarearchitecture,maybedesignanewlearningalgorithm

2. Implementitandhopeitworks

Advantage:Perhapsabetterapproach,maybeevenanewlearningalgorithm.Research.

Page 42: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Classifyingfish

SayyouwanttobuildaclassifierthatidentifieswhetherarealphysicalfishissalmonortunaHowdoyougoaboutthis?

42

Theslowapproach

1. Carefullyidentifyfeatures,getthebestdata,thesoftwarearchitecture,maybedesignanewlearningalgorithm

2. Implementitandhopeitworks

Advantage:Perhapsabetterapproach,maybeevenanewlearningalgorithm.Research.

Thehacker’sapproach

1. Firstimplementsomething

2. Usediagnosticstoiterativelymakeitbetter

Advantage:Fasterrelease,willhaveasolutionforyourproblemquicker

Page 43: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Classifyingfish

SayyouwanttobuildaclassifierthatidentifieswhetherarealphysicalfishissalmonortunaHowdoyougoaboutthis?

43

Theslowapproach

1. Carefullyidentifyfeatures,getthebestdata,thesoftwarearchitecture,maybedesignanewlearningalgorithm

2. Implementitandhopeitworks

Advantage:Perhapsabetterapproach,maybeevenanewlearningalgorithm.Research.

Thehacker’sapproach

1. Firstimplementsomething

2. Usediagnosticstoiterativelymakeitbetter

Advantage:Fasterrelease,willhaveasolutionforyourproblemquicker

Bewaryofprematureoptimization

Beequallywaryofprematurelycommittingtoabadpath

Page 44: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Whattowatchoutfor

• Doyouhavetherightevaluationmetric?– Anddoesyourlossfunctionreflectit?

• Bewareofcontamination:Ensurethatyourtrainingdataisnotcontaminatedwiththetestset– Learning=generalizationtonewexamples– Donotseeyourtestseteither.Youmayinadvertentlycontaminatethemodel

– Bewareofcontaminatingyourfeatureswiththelabel!– (Besuspiciousofperfectpredictors)

44

Page 45: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Whattowatchoutfor

• Beawareofbiasvs.variancetradeoff(orover-fittingvs.under-fitting)

• Beawarethatintuitionsmaynotworkinhighdimensions– Noproofbypicture– Curseofdimensionality

• Atheoreticalguaranteemayonlybetheoretical– Maymakeinvalidassumptions(eg:ifthedataisseparable)– Mayonlybelegitimatewithinfinitedata(eg:estimatingprobabilities)– Experimentsonrealdataareequallyimportant

45

Page 46: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Bigdataisnotenough

Butmoredataisalwaysbetter– Cleanerdataisevenbetter

Rememberthatlearningisimpossiblewithoutsomebiasthatsimplifiesthesearch– Otherwise,nogeneralization

Learningrequiresknowledgetoguidethelearner– Machinelearningisnot amagicwand

46

Page 47: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Whatknowledge?

• Whichmodelistherightoneforthistask?– Linearmodels,decisiontrees,deepneuralnetworks,etc

• Whichlearningalgorithm?– Doesthedataviolateanycrucialassumptionsthatwereusedto

definethelearningalgorithmorthemodel?– Doesthatmatter?

• Featureengineeringiscrucial

• Implicitly,theseareallclaimsaboutthenatureoftheproblem

47

Page 48: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Miscellaneousadvice

• Learnsimplermodelsfirst– Ifnothing,atleasttheyformabaselinethatyoucanimproveupon

• Ensemblesseemtoworkbetter

• Thinkaboutwhetheryourproblemislearnableatall– Learning=generalization

48

Page 49: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

MLandtheworld

• Diagnosticsofyourlearningalgorithm

• Erroranalysis

• InjectingmachinelearningintoYourFavoriteTask

• Makingmachinelearningmatter

49

Page 50: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Makingmachinelearningmatter

ChallengestothegreaterMLcommunity1. AlawpassedorlegaldecisionmadethatreliesontheresultofanMLanalysis

2. $100MsavedthroughimproveddecisionmakingprovidedbyanMLsystem

3. AconflictbetweennationsavertedthroughhighqualitytranslationprovidedbyanMLsystem

4. A50%reductionincybersecurity break-insthroughMLdefenses

5. AhumanlifesavedthroughadiagnosisorinterventionrecommendedbyanMLsystem

6. Improvementof10%inonecountry’sHumanDevelopmentIndexattributabletoanMLsystem

50

Page 51: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

MLandsystembuilding

51

SeveralrecentpapersabouthowMLfitsinthecontextoflargesoftwaresystems

Page 52: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Data-drivendecisionmaking:Increasinglyprevalent

Somebroaderconcernsaboutalgorithmicdecision-makingemerge:• Doclassifiersexhibitbiases?

• Howdoweensurethatsuchsystemsaretransparentintheirdecision-making?

• Whencanyoubelieveyourmodel?Shouldyouleavedecisionmakingtoit?

• Whatifstatisticalmodelsareusedinethicallydubiousways?

52

Algorithmsarenolongerjustaboutshowingproof-of-conceptlearning

Theserefertoauxiliarycriteriathatneednotdirectlytiedtothelossthatweminimize

Page 53: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Biasedclassifiers

Whatifclassifiersareusedtodecide…– … howlongsomeoneshouldbesentencedforacrime?– … orwhethersomeone’sloanapplicationshouldbeapproved?– … orwhethersomeoneshouldbefired?

Questions:• Howcanweensurethattheclassifiersdonotexhibitillegalor

unethicalbiases?– i.e.theclassifiersarefair?

• Howdowedesignalgorithmsandevaluationframeworkswhichavoiddiscrimination?

• Whatifthedataitself(eitherknowinglyorunknowingly)isunfair?

53

Allthesearerealexamples

Page 54: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Therighttoexplanations

Imagineyouhaveajobandaclassifierdecidestofireyou.– Maybebecauseitmadeanerroronthisinstance(ie.you!)

Questions:• Howdowedevelopstatisticalmethodsthatnotonly

makeaprediction,butarealsotransparentintheirdecisionmakingprocess?

• Howdowedevelopalgorithmsthatcanexplaintheirdecisions?

54

Page 55: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Arecurrentlegalsystemsrobust?

Perhapsthereisroomtorewrite/updatelawstoaccountformachinelearning

• Whatifevidenceisfakedbyasamplingfromagenerativemodel?

• WhoistoblameifaML-basedautomaticcarisinvolvedinanaccident?

• WhatifanML-basedautonomousweapondecidestokillsomeonewithoutanyhumanintervention?

55

Page 56: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Fairness,AccountabilityandTransparency

• Weneedtoensurethat– OuralgorithmsareFair– OuralgorithmscanbeheldAccountable– OuralgorithmsexhibitTransparency

• Thesearealldifficulttoformalize– Butstillimportant

• Anditisimportanttokeeptheseconcernswhenwe– Defineatask– Collectdata– Defineevaluations– Designfeatures,models– … reallyateverystepalongtheway

56

http://www.fatml.orghttps://fatconference.org

Page 57: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Aretrospectivelookatthecourse

57

Page 58: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Learning=generalization

“AcomputerprogramissaidtolearnfromexperienceEwithrespecttosomeclassoftasksT andperformancemeasureP,ifitsperformanceattasksinT,asmeasuredbyP,improveswithexperienceE.”

TomMitchell(1999)

58

Page 59: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Wesawdifferent“models”

Or:whatkindofafunctionshouldalearnerlearn

– Linearclassifiers

– Decisiontrees

– Non-linearclassifiers,featuretransformations,neuralnetworks

– Ensemblesofclassifiers

59

Page 60: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Differentlearningprotocols

• Supervisedlearning– Ateachersuppliesacollectionofexampleswithlabels– Thelearner hastolearntolabelnewexamplesusingthisdata

• Wedidnotsee– Unsupervisedlearning

• Noteacher,learnerhasonlyunlabeledexamples• Datamining

– Semi-supervisedlearning• Learnerhasaccesstobothlabeledandunlabeledexamples

60

Page 61: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Learningalgorithms

• Onlinealgorithms:Learnercanaccessonlyonelabeledatatime– Perceptron

• Batchalgorithms:Learnercanaccesstotheentiredataset– NaïveBayes– Supportvectormachines,logisticregression– Decisiontreesandnearestneighbors– Boosting– Neuralnetworks

61

Page 62: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Representingdata

Whatisthebestwaytorepresentdataforaparticulartask?

• Features

• Dimensionalityreduction(wedidn’tcoverthis,butdolookatthematerialifyouareinterested)

62

Page 63: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Thetheoryofmachinelearning

Mathematicallydefininglearning

– Onlinelearning

– ProbablyApproximatelyCorrect(PAC)Learning

– Bayesianlearning

63

Page 64: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Representation,optimization,evaluation

64Tablefrom[Domingos,2012]

Page 65: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Machinelearningistoo easy!

• Remarkablydiversecollectionofideas

• Yet,inpracticemanyoftheseapproachesworkroughlyequallywell– Eg:SVMvslogisticregressionvsaveragedperceptron

65

Page 66: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Whatwedidnotsee

MachinelearningisalargeandgrowingareaofscientificstudyWedidnotcover• Kernelmethods• Unsupervisedlearning,clustering• HiddenMarkovmodels• Multiclasssupportvectormachines• Topicmodels• Structuredmodels• ….

66

Butwesawthefoundationsofhowtothinkaboutmachinelearning

Page 67: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Whatwedidnotsee

MachinelearningisalargeandgrowingareaofscientificstudyWedidnotcover• Kernelmethods• Unsupervisedlearning,clustering• HiddenMarkovmodels• Multiclasssupportvectormachines• Topicmodels• Structuredmodels• ….

67

Butwesawthefoundationsofhowtothinkaboutmachinelearning

Severalclassesthatcanfollow(orarerelatedto)thiscourse:• DataMining• Clustering• TheoryofMachineLearning• Variousapplications(NLP,vision,…)• Datavisualization• Specializations:Deeplearning,StructuredPrediction,etc

Page 68: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Thiscourse

Focusontheunderlyingconceptsandalgorithmicideasinthefieldofmachinelearning

Not about• Usingaspecificmachinelearningtool• Anysinglelearningparadigm

68

Page 69: Practical Advice for Building Machine Learning Applications · 2019-12-05 · Machine Learning Practical Advice for Building Machine Learning Applications Based on lectures and papers

Whatwesaw

1. Abroadtheoreticalandpracticalunderstandingofmachinelearningparadigmsandalgorithms

2. Abilitytoimplementlearningalgorithms

3. Identifywheremachinelearningcanbeappliedandmakethemostappropriatedecisions(aboutalgorithms,models,supervision,etc)

69