How to train confident neural networks
alinlab.kaist.ac.kr/resource/iclr_2018_confident...

Post on 28-Nov-2020

Algorithmic Intelligence Lab

Outline

• Introduction
  • Predictive uncertainty of deep neural networks
  • Summary of contributions

• How to train confident neural networks
  • Training Confidence-Calibrated Classifiers for Detecting Out-of-Distribution Samples [Lee’18a]

• Applications
  • Hierarchical novelty detection [Lee’18b]

• Conclusion
  • Future work

[Lee’18a] Lee, K., Lee, H., Lee, K. and Shin, J. Training Confidence-Calibrated Classifiers for Detecting Out-of-Distribution Samples. In ICLR, 2018.
[Lee’18b] Lee, K., Lee, K., Min, K., Zhang, Y., Shin, J. and Lee, H. Hierarchical Novelty Detection for Visual Object Recognition. In CVPR, 2018.

Introduction: Predictive uncertainty of deep neural networks (DNNs)

• Supervised learning (e.g., regression and classification)
  • Objective: finding an unknown target distribution, i.e., P(Y|X)

• Recent advances in deep learning have dramatically improved accuracy on several supervised learning tasks
  • Object detection [Girshick’15]
  • Speech recognition [Amodei’16]
  • Image classification [He’16]
  • Audio recognition [Hershey’17]

[Figure: mapping from input space to output space]

[Amodei’16] Amodei, D., Ananthanarayanan, S., Anubhai, R., Bai, J., Battenberg, E., Case, C., Casper, J., Catanzaro, B., Cheng, Q., Chen, G. and Chen, J. Deep Speech 2: End-to-end speech recognition in English and Mandarin. In ICML, 2016.
[He’16] He, K., Zhang, X., Ren, S. and Sun, J. Deep residual learning for image recognition. In CVPR, 2016.
[Hershey’17] Hershey, S., Chaudhuri, S., Ellis, D. P., Gemmeke, J. F., Jansen, A., Moore, R. C., Plakal, M., Platt, D., Saurous, R. A., Seybold, B. and Slaney, M. CNN architectures for large-scale audio classification. In ICASSP, 2017.
[Girshick’15] Girshick, R. Fast R-CNN. In ICCV, pp. 1440–1448, 2015.

Introduction: Predictive uncertainty of deep neural networks (DNNs)

• Uncertainty of the predictive distribution is important in DNN applications
• What is predictive uncertainty?
  • As an example, consider a classification task
  • It represents the confidence of a prediction!
  • For example, it can be measured as follows:
    • Entropy of the predictive distribution [Lakshminarayanan’17]
    • Maximum value of the predictive distribution [Hendrycks’17]

[Figure: two classification examples with class labels and scores: Persian cat, tiger cat, 0.12, 0.18; Persian cat, dog, 0.99]

[Lakshminarayanan’17] Lakshminarayanan, B., Pritzel, A. and Blundell, C. Simple and scalable predictive uncertainty estimation using deep ensembles. In NIPS, 2017.
[Hendrycks’17] Hendrycks, D. and Gimpel, K. A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017.
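The two uncertainty measures named on this slide can be sketched in a few lines. This is a minimal illustration in plain Python, not code from the talk; the function names and the example probability vectors are assumptions chosen for demonstration.

```python
import math


def softmax(logits):
    """Turn raw classifier outputs into a predictive distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Entropy of the predictive distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def max_probability(probs):
    """Maximum value of the predictive distribution (higher = more confident)."""
    return max(probs)


# Hypothetical predictive distributions over three classes.
confident = [0.99, 0.005, 0.005]
uncertain = [1 / 3, 1 / 3, 1 / 3]

print(entropy(confident), max_probability(confident))
print(entropy(uncertain), max_probability(uncertain))
```

The two measures move in opposite directions: a peaked distribution has low entropy and a high maximum, while the uniform distribution has the largest possible entropy, log K for K classes, and the smallest possible maximum, 1/K.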

Introduction: Predictive uncertainty of deep neural networks (DNNs)

• Predictive uncertainty is related to many machine learning problems:
  • Novelty detection [Hendrycks’17]
  • Adversarial detection [Song’18]
  • Ensemble learning [Lee’17]

• Predictive uncertainty is also indispensable when deploying DNNs in real-world systems [Dario’16]
  • Autonomous driving
  • Secure authentication systems

[Dario’16] Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J. and Mané, D. Concrete problems in AI safety. arXiv preprint arXiv:1606.06565, 2016.
[Hendrycks’17] Hendrycks, D. and Gimpel, K. A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017.
[Guo’17] Guo, C., Pleiss, G., Sun, Y. and Weinberger, K. Q. On calibration of modern neural networks. In ICML, 2017.
[Goodfellow’14] Goodfellow, I. J., Shlens, J. and Szegedy, C. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
[Srivastava’14] Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I. and Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. JMLR, 2014.

Introduction: Predictive uncertainty of deep neural networks (DNNs)

• However, DNNs do not capture their predictive uncertainty
  • E.g., DNNs trained to classify MNIST images often produce a high confidence of 91% even for random noise [Hendrycks’17]

[Figure labels: Unknown image, Cat, Dog, Train, 99%]

• The challenge is to improve the quality of the predictive uncertainty!

• Main topic of this presentation
  • How to train confident neural networks?
    • Training confidence-calibrated classifiers for detecting out-of-distribution samples [Lee’18a]
  • Applications
    • Confident multiple choice learning [Lee’17]
    • Hierarchical novelty detection [Lee’18b]

[Hendrycks’17] Hendrycks, D. and Gimpel, K. A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017.
[Lee’18a] Lee, K., Lee, H., Lee, K. and Shin, J. Training Confidence-Calibrated Classifiers for Detecting Out-of-Distribution Samples. In ICLR, 2018.
[Lee’17] Lee, K., Hwang, C., Park, K. and Shin, J. Confident Multiple Choice Learning. In ICML, 2017.
[Lee’18b] Lee, K., Lee, K., Min, K., Zhang, Y., Shin, J. and Lee, H. Hierarchical Novelty Detection for Visual Object Recognition. In CVPR, 2018.

How to Train Confident Neural Networks?

• Related problem
  • Detecting out-of-distribution samples [Hendrycks’17, Liang’18]
    • Detect whether a test sample is from the in-distribution (i.e., the training distribution of the classifier) or from out-of-distribution

• E.g., image classification
  • Assume a classifier is trained on handwritten digits (denoted as in-distribution)
  • Detecting out-of-distribution samples

[Figure: data and predictive distribution for in-distribution and out-of-distribution inputs]

• The performance of the detector reflects the confidence of the predictive distribution!

[Hendrycks’17] Hendrycks, D. and Gimpel, K. A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017.
[Liang’18] Liang, S., Li, Y. and Srikant, R. Principled detection of out-of-distribution examples in neural networks. In ICLR, 2018.

Related Work

• Threshold-based detector [Guo’17, Hendrycks’17, Liang’18]
  [Input] → [Classifier] → score
  If score > 𝜖: in-distribution
  Else: out-of-distribution

• How to define the score?
  • Baseline detector [Hendrycks’17]
    • Confidence score = maximum value of the predictive distribution
  • Temperature scaling [Guo’17]
    • Confidence score = maximum value of the scaled predictive distribution

• Limitations
  • The performance of prior work depends heavily on how the classifier is trained

[Hendrycks’17] Hendrycks, D. and Gimpel, K. A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017.
[Guo’17] Guo, C., Pleiss, G., Sun, Y. and Weinberger, K. Q. On calibration of modern neural networks. In ICML, 2017.
[Liang’18] Liang, S., Li, Y. and Srikant, R. Principled detection of out-of-distribution examples in neural networks. In ICLR, 2018.
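A threshold-based detector of this kind can be sketched as follows. This is a minimal illustration, assuming the classifier exposes raw logits and that temperature scaling means dividing the logits by a constant T before the softmax; the function names, the threshold value, and the example logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (T = 1 gives the baseline)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_score(logits, temperature=1.0):
    """Maximum value of the (optionally temperature-scaled) predictive distribution."""
    return max(softmax(logits, temperature))


def detect(logits, epsilon, temperature=1.0):
    """Label a sample as in-distribution when its score exceeds the threshold."""
    score = confidence_score(logits, temperature)
    return "in-distribution" if score > epsilon else "out-of-distribution"


# Hypothetical logits: one peaked output, one nearly flat output.
print(detect([10.0, 0.0, 0.0], epsilon=0.9))  # in-distribution
print(detect([0.1, 0.0, 0.0], epsilon=0.9))   # out-of-distribution
```

The detector itself has no trainable parts: both the score and the threshold sit on top of whatever the classifier outputs, which is why its quality depends on how the classifier was trained.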

Our Contributions

• One can consider
  • Bayesian neural networks [Yingzhen’17]
  • Ensemble of classifiers [Balaji’17]
  • Training or running inference with those models is computationally expensive

• Our contribution
  • Confidence loss for training more plausible simple DNNs
  • GAN for generating out-of-distribution samples
  • Joint training method of classifier and GAN

• Experimental results
  • Our method drastically improves the detection performance
  • E.g., VGGNet trained by our method improves TPR compared to the baseline: 14.0% → 39.1% and 46.3% → 98.9% on CIFAR-10 and SVHN

[Yingzhen’17] Li, Y. and Gal, Y. Dropout inference in Bayesian neural networks with alpha-divergences. In ICML, 2017.
[Balaji’17] Lakshminarayanan, B., Pritzel, A. and Blundell, C. Simple and scalable predictive uncertainty estimation using deep ensembles. In NIPS, 2017.

Contribution 1: Confidence Loss

• Confidence loss
  • Minimize the KL divergence on data from out-of-distribution

[Equation labels: data from in-dist (data distribution); data from out-of-dist (uniform distribution, “zero confidence”)]

• Interpretation
  • Assigning higher maximum prediction values to in-distribution samples than to out-of-distribution ones

• Effects of confidence loss
  • Fraction of the maximum prediction value from simple CNNs (2 Conv + 3 FC)
  • In-distribution: SVHN
  • The KL divergence term is optimized using CIFAR-10 training data

[Figure: maximum prediction values on data from in-dist and data from out-of-dist; datasets: CIFAR-10, SVHN, TinyImageNet, LSUN]
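The equation on this slide did not survive extraction. The sketch below assumes the form given for the confidence loss in [Lee’18a]: the usual cross-entropy on in-distribution samples plus β times the KL divergence from the uniform distribution to the predictive distribution on out-of-distribution samples. The function names, the weight `beta`, and the toy batches are illustrative assumptions, and the predictive probabilities are assumed to be strictly positive.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label (in-distribution term)."""
    return -math.log(probs[label])


def kl_uniform_to_pred(probs):
    """KL(U || P): zero only when the prediction P is exactly uniform."""
    k = len(probs)
    return sum((1.0 / k) * math.log((1.0 / k) / p) for p in probs)


def confidence_loss(in_batch, out_batch, beta=1.0):
    """Mean cross-entropy on in-dist data + beta * mean KL term on out-of-dist data.

    in_batch:  list of (predictive distribution, true label) pairs
    out_batch: list of predictive distributions on out-of-distribution inputs
    """
    ce = sum(cross_entropy(p, y) for p, y in in_batch) / len(in_batch)
    kl = sum(kl_uniform_to_pred(p) for p in out_batch) / len(out_batch)
    return ce + beta * kl


# Hypothetical predictions: the second out-of-distribution sample is overconfident.
in_batch = [([0.9, 0.05, 0.05], 0)]
out_batch = [[1 / 3, 1 / 3, 1 / 3], [0.98, 0.01, 0.01]]
print(confidence_loss(in_batch, out_batch, beta=1.0))
```

Only the overconfident out-of-distribution prediction contributes to the KL term, so minimizing the loss pushes such predictions toward the uniform ("zero confidence") distribution while leaving in-distribution predictions sharp.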

Contribution 2. GAN for Generating Out-of-Distribution Samples

• Main issues of confidence loss
  • How to optimize the KL divergence loss?
    • The number of out-of-distribution samples needed to cover the entire space might be almost infinite

• Our intuition
  • Samples close to the in-distribution could be more effective in improving the detection performance

[Figure legend: intervals [0, 0.2), [0.2, 0.8), [0.8, 1)]

Contribution 2. GAN for Generating Out-of-Distribution Samples

• New GAN objective
  • Term (a) forces the generator to generate low-density samples
    • (Approximately) minimizing the log negative likelihood of the in-distribution
  • Term (b) corresponds to the original GAN loss
    • Generating out-of-distribution samples close to the in-distribution

[Equation label: classifier trained on in-distribution]

• Experimental results on a toy example and MNIST
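The objective itself is missing from the transcript. The sketch below assumes the form in [Lee’18a]: term (a) is β times the KL divergence from the uniform distribution to the classifier's prediction on generated samples, and term (b) is the standard GAN value, log D(x) on real samples plus log(1 − D(G(z))) on generated ones. All names and numbers are illustrative assumptions.

```python
import math


def kl_uniform_to_pred(probs):
    """KL(U || P) for one predictive distribution with strictly positive entries."""
    k = len(probs)
    return sum((1.0 / k) * math.log((1.0 / k) / p) for p in probs)


def mean(values):
    return sum(values) / len(values)


def gan_objective_terms(clf_probs_on_fake, d_real, d_fake, beta=1.0):
    """Return (term_a, term_b) of the assumed objective.

    clf_probs_on_fake: classifier predictions on generated samples
    d_real, d_fake:    discriminator outputs in (0, 1) on real / generated samples
    """
    # (a) small when the classifier is unsure about generated samples,
    #     i.e. when they fall in low-density regions of the in-distribution.
    term_a = beta * mean([kl_uniform_to_pred(p) for p in clf_probs_on_fake])
    # (b) original GAN value: the discriminator maximizes it, the generator minimizes it.
    term_b = mean([math.log(d) for d in d_real]) + mean([math.log(1.0 - d) for d in d_fake])
    return term_a, term_b


# Hypothetical batch: the classifier is confident on one generated sample.
term_a, term_b = gan_objective_terms(
    clf_probs_on_fake=[[0.5, 0.5], [0.9, 0.1]],
    d_real=[0.8, 0.7],
    d_fake=[0.4, 0.3],
)
print(term_a, term_b)
```

Under this reading the two terms pull in different directions: (b) keeps generated samples near the in-distribution so the discriminator cannot reject them, while (a) keeps them away from regions where the classifier is confident.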

Contribution 3. Joint Confidence Loss

• We suggest training the proposed GAN using a confident classifier
  • The converse is also possible

• We propose a joint confidence loss
  • Classifier’s confidence loss: (c) + (d)
  • GAN loss: (d) + (e)

• Alternating algorithm for optimizing the joint confidence loss
  • Step 1. Update the GAN
  • Step 2. Update the classifier
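The equation that the labels (c), (d) and (e) refer to is missing from the transcript. A reconstruction, assuming the form given in [Lee’18a] with classifier parameters θ, generator G, discriminator D, uniform distribution U over labels, and penalty weight β, would read:

```latex
\min_{G}\,\max_{D}\,\min_{\theta}\;
\underbrace{\mathbb{E}_{P_{\mathrm{in}}(\hat{x},\hat{y})}
  \bigl[-\log P_{\theta}(y=\hat{y}\mid\hat{x})\bigr]}_{(c)}
\;+\;\beta\,
\underbrace{\mathbb{E}_{P_{G}(x)}
  \bigl[\mathrm{KL}\bigl(\mathcal{U}(y)\,\|\,P_{\theta}(y\mid x)\bigr)\bigr]}_{(d)}
\;+\;
\underbrace{\mathbb{E}_{P_{\mathrm{in}}(\hat{x})}\bigl[\log D(\hat{x})\bigr]
  +\mathbb{E}_{P_{G}(x)}\bigl[\log\bigl(1-D(x)\bigr)\bigr]}_{(e)}
```

Read this way, the grouping on the slide follows directly: with G and D fixed, the classifier only sees (c) + (d), which is the confidence loss with generated samples standing in for out-of-distribution data; with θ fixed, the GAN only sees (d) + (e). The shared term (d) is what couples the two alternating updates.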

Experimental Results: dataset & model

• Model: VGGNet [Christian’15] with 13 layers
• In-distribution: CIFAR-10 or SVHN
  • CIFAR-10 [Krizhevsky’09]: 32×32 RGB, 10 classes, 50,000 training images, 10,000 test images
  • SVHN [Netzer’11]: 32×32 RGB, 10 classes, 73,257 training images, 26,032 test images
• Out-of-distribution: (resized) TinyImageNet and LSUN
  • TinyImageNet: 32×32 RGB, 200 classes, 10,000 test images
  • LSUN: 32×32 RGB, 10 classes, 10,000 test images

[Krizhevsky’09] Krizhevsky, A. and Hinton, G. Learning multiple layers of features from tiny images. Master’s thesis, Department of Computer Science, University of Toronto, 2009.
[Netzer’11] Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B. and Ng, A. Y. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
[Christian’15] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V. and Rabinovich, A. Going deeper with convolutions. In CVPR, 2015.

Experimental Results: Metrics

• TP = true positive, FN = false negative, TN = true negative, FP = false positive
• [Metrics]
  • FPR at 95% TPR
    • FPR = FP / (FP + TN), TPR = TP / (TP + FN)
  • AUROC (Area Under the Receiver Operating Characteristic curve)
    • ROC curve = relationship between TPR and FPR
  • Detection error
    • Minimum misclassification probability over all thresholds
  • AUPR (Area Under the Precision-Recall curve)
    • PR curve = relationship between precision = TP / (TP + FP) and recall = TP / (TP + FN)
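Three of these metrics can be computed from raw confidence scores as sketched below. The sketch treats in-distribution samples as positives, sweeps the threshold over every observed score, and assumes equal priors for the detection error; these conventions, the function names, and the toy scores are assumptions, not details given in the slides. AUPR is omitted for brevity.

```python
def roc_points(scores_in, scores_out):
    """(FPR, TPR) pairs over all thresholds, ordered by non-decreasing FPR.

    A sample is called in-distribution (positive) when its score >= threshold.
    """
    thresholds = sorted(set(scores_in) | set(scores_out), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing is accepted
    for t in thresholds:
        tp = sum(s >= t for s in scores_in)   # in-dist samples kept
        fp = sum(s >= t for s in scores_out)  # out-of-dist samples wrongly kept
        points.append((fp / len(scores_out), tp / len(scores_in)))
    return points


def fpr_at_tpr(points, target_tpr=0.95):
    """Smallest FPR among operating points whose TPR reaches the target."""
    return min(fpr for fpr, tpr in points if tpr >= target_tpr)


def auroc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def detection_error(points):
    """Minimum of 0.5 * (1 - TPR) + 0.5 * FPR over all thresholds (equal priors)."""
    return min(0.5 * (1.0 - tpr) + 0.5 * fpr for fpr, tpr in points)


# Hypothetical confidence scores for in- and out-of-distribution test samples.
pts = roc_points(scores_in=[0.9, 0.4], scores_out=[0.6, 0.1])
print(fpr_at_tpr(pts), auroc(pts), detection_error(pts))
```

In this toy case three of the four (in, out) score pairs are ordered correctly, so the AUROC is 0.75, matching its reading as the probability that an in-distribution sample scores higher than an out-of-distribution one.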

Experimental Results

• Measure the detection performance of threshold-based detectors
  • Confidence loss with some explicit out-of-distribution dataset

• A classifier trained by our method drastically improves the detection performance across all out-of-distribution datasets
  • Realistic images such as TinyImageNet (aqua line) and LSUN (green line) are more useful than synthetic datasets (orange line) for improving the detection performance

Experimental Results

• Joint confidence loss
  • Confidence loss with the original GAN (orange bar) is often useful for improving the detection performance
  • Joint confidence loss (blue bar) still outperforms all baselines in all cases

Experimental Results

• Comparison with ODIN [Liang’18]

Experimental Results

• Interpretability of the trained classifier
  • A classifier trained with the cross-entropy loss shows sharp gradient maps for samples from both in- and out-of-distribution
  • Classifiers trained with the confidence losses show sharp gradient maps only on samples from the in-distribution

Hierarchical Novelty Detection

• Objective
  • 1. Find the closest known (super-)category in the taxonomy
  • 2. Find a fine-grained classification for novel categories (i.e., out-of-distribution samples)

Figure 1. An illustration of our hierarchical novelty detection task

Two Main Approaches

• Top-down method (TD)
  • p(child) = Σ_super p(child | super) p(super)
  • Inference
  • Definition of confidence:
  • Objective

[Figure label: novel class]
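The marginalization in the top-down formula can be sketched on a toy taxonomy. This only illustrates the sum p(child) = Σ_super p(child | super) p(super); the category names and probabilities are hypothetical, and the confidence definition and training objective from the slide are not reproduced because their equations are missing from the transcript.

```python
def child_probabilities(p_super, p_child_given_super):
    """Marginalize over super classes: p(child) = sum_super p(child|super) * p(super)."""
    p_child = {}
    for sup, children in p_child_given_super.items():
        for child, cond in children.items():
            # A child reachable from several super classes accumulates all paths.
            p_child[child] = p_child.get(child, 0.0) + cond * p_super[sup]
    return p_child


# Hypothetical two-level taxonomy.
p_super = {"animal": 0.8, "vehicle": 0.2}
p_child_given_super = {
    "animal": {"cat": 0.7, "dog": 0.3},
    "vehicle": {"car": 0.9, "bus": 0.1},
}

for name, prob in child_probabilities(p_super, p_child_given_super).items():
    print(name, round(prob, 3))
```

Because each conditional distribution sums to one, the resulting child probabilities also sum to one, so the same step can be applied level by level down the taxonomy.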

Experimental Results on ImageNet Dataset

• ImageNet dataset
  • 22K classes
  • Taxonomy
    • 396 super classes of 1K known leaf classes
    • The rest of the 21K classes can be used as novel classes
  • Example

• Hierarchical novelty detection performance
  • Baseline: DARTS [Deng’12]
  • Our methods achieve higher novel class accuracy than DARTS at the same known class accuracy in most regions

[Deng’12] Deng, J., Krause, J., Berg, A. C. and Fei-Fei, L. Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition. In CVPR, pp. 3450–3457, 2012.

Conclusion

• We propose a new method for training confident deep neural networks
  • It produces the uniform distribution when the input is not from the target distribution

• We show that it can be applied to many machine learning problems:
  • Detecting out-of-distribution samples [Lee’18a]
  • Ensemble learning using deep neural networks [Lee’17]
  • Hierarchical novelty detection [Lee’18b]

• We believe that our new approach brings a refreshing angle for developing confident deep networks in many related applications:
  • Network calibration
  • Adversarial example detection
  • Bayesian probabilistic models
  • Semi-supervised learning

[Lee’18a] Lee, K., Lee, H., Lee, K. and Shin, J. Training Confidence-Calibrated Classifiers for Detecting Out-of-Distribution Samples. In ICLR, 2018.
[Lee’17] Lee, K., Hwang, C., Park, K. and Shin, J. Confident Multiple Choice Learning. In ICML, 2017.
[Lee’18b] Lee, K., Lee, K., Min, K., Zhang, Y., Shin, J. and Lee, H. Hierarchical Novelty Detection for Visual Object Recognition. In CVPR, 2018.