Efficiency vs Robustness in High-Dimensional Statistics
Ilias Diakonikolas (USC)


Can we develop learning algorithms that are robust to a constant fraction of corruptions in the data?

CONTEXT

Contamination Model: Let ℱ be a family of high-dimensional distributions. We say that a set of N samples is ε-corrupted from ℱ if it is generated as follows:
• N samples are drawn from an unknown F ∈ ℱ.
• An omniscient adversary inspects these samples and changes an ε-fraction of them arbitrarily.
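The contamination model can be sketched in a few lines of numpy. This is an illustrative simulation, not code from the talk: the function name and the choice of replacement point (a far cluster at μ + 10) are my own; a real adversary may place the ε-fraction anywhere.

```python
import numpy as np

def eps_corrupted_samples(N, d, eps, mu, rng):
    """Draw N samples from N(mu, I_d), then let a simulated 'adversary'
    replace an eps-fraction of them with arbitrary points."""
    X = rng.standard_normal((N, d)) + mu        # clean samples from N(mu, I)
    k = int(eps * N)                            # number of points to corrupt
    idx = rng.choice(N, size=k, replace=False)  # adversary inspects and picks points
    X[idx] = mu + 10.0                          # ...and replaces them arbitrarily
    return X

rng = np.random.default_rng(0)
X = eps_corrupted_samples(N=1000, d=5, eps=0.1, mu=np.zeros(5), rng=rng)
```

This produces exactly the input the robust algorithms below receive: the learner sees only X, with no labels marking which points were touched.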

SURVEY OF TWO RECENT WORKS:
• Robust Estimators in High Dimensions without the Computational Intractability (Diakonikolas, Kamath, Kane, Li, Moitra, Stewart, FOCS 2016)
• Agnostic Estimation of Mean and Covariance (Lai, Rao, Vempala, FOCS 2016)

PARAMETER ESTIMATION
Given samples from an unknown distribution, e.g., a 1-D Gaussian N(μ, σ²), how do we accurately estimate its parameters?

empirical mean: μ̂ = (1/N) Σᵢ Xᵢ        empirical variance: σ̂² = (1/N) Σᵢ (Xᵢ − μ̂)²

The maximum likelihood estimator is asymptotically efficient (1910-1920). [R. A. Fisher]

What about errors in the model itself? (1960) [J. W. Tukey]

ROBUST STATISTICS
What estimators behave well in a neighborhood around the model?

ROBUST PARAMETER ESTIMATION
Given ε-corrupted samples from a 1-D Gaussian, observed model = (1 − ε) · (ideal model) + ε · (noise), can we accurately estimate its parameters?

Do the empirical mean and empirical variance work? No! A single corrupted sample can arbitrarily corrupt the estimates.

But the median and the median absolute deviation do work.
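A minimal numeric check of this slide, assuming standard 1-D Gaussian data and a single planted outlier (the constant 0.6745 ≈ Φ⁻¹(3/4) converts the MAD of a Gaussian into an estimate of σ):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(10_000)   # clean 1-D Gaussian samples, mu = 0, sigma = 1
x[0] = 1e9                        # a single adversarially corrupted sample

# The empirical mean (and hence the variance) is destroyed by one outlier...
assert abs(x.mean()) > 1e4

# ...but the median and the median absolute deviation barely move.
med = np.median(x)
mad = np.median(np.abs(x - med))  # MAD; mad / 0.6745 estimates sigma
assert abs(med) < 0.1
assert abs(mad / 0.6745 - 1.0) < 0.1
```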

Fact [Folklore]: Given ε-corrupted samples from a 1-D Gaussian N(μ, σ²), with high constant probability we have that |μ̂ − μ| = O(ε) · σ, where μ̂ is the empirical median (and the median absolute deviation estimates σ up to O(ε) · σ as well).

What about robust estimation in high dimensions?

PRIOR WORK (1960-2016)
• Vast literature on robust estimators in the statistics community.
• Sample Complexity vs Robustness in High Dimensions: well understood (e.g., no tradeoff for robust learning of a Gaussian).
• Computational Efficiency vs Robustness? All known estimators are either hard to compute or lose polynomial factors in the dimension.

OUTLINE
Part I: Introduction
• Case Study: Robust Mean Estimation
• New Algorithmic Results
Part II: Agnostically Learning a Gaussian
• Comparison between two Approaches
• Recursive Dimension-Reduction [LRV'16]
• Filtering Technique [DKKLMS'16]
• More Recent Developments
Part III: Summary and Conclusions


Basic Problem: Given an ε-corrupted set of samples from a d-dimensional unknown-mean Gaussian N(μ, I), efficiently compute a parameter μ̂ such that ‖μ̂ − μ‖₂ is small.

PREVIOUS APPROACHES FOR ROBUST ESTIMATION

Unknown Mean       | Error Guarantee | Running Time
Tukey Median       | O(ε)            | NP-Hard
Geometric Median   | O(ε √d)         | polynomial
Tournament         | O(ε)            | exponential in d
Pruning            | O(ε √d)         | polynomial


MAIN RESULT FOR THIS TALK

Theorem: There are algorithms with the following behavior: Given ε > 0 and a set of ε-corrupted samples from a d-dimensional Gaussian N(μ, I), the algorithms run in polynomial time and find a parameter μ̂ that satisfies:
• [LRV'16]: ‖μ̂ − μ‖₂ = O(ε √log d) in the additive adversary model.
• [DK+'16]: ‖μ̂ − μ‖₂ = O(ε √log(1/ε)) in the strong adversary model.

FURTHER ALGORITHMIC RESULTS

[DKKLMS'16] Efficient robust learning algorithms with dimension-independent error guarantees:
• Mean and Covariance Estimation under Bounded Moment Assumptions
• Mixtures of Product Distributions / Spherical Gaussians
• Parameter Estimation in Graphical Models [D-Kane-Stewart'16]

[LRV'16] Mild dimension-dependent error:
• Mean and Covariance Estimation under Bounded Moment Assumptions
• Independent Component Analysis, SVD


COMPARISON OF TWO APPROACHES

Commonalities:
• Spectral Algorithms: Look at the spectrum of the empirical covariance to robustly estimate the mean.
• Certificate for Robustness of the Empirical Estimator: the spectral norm of the empirical covariance is small.

Exploiting the certificate:
• [LRV'16]: Find a "good" large subspace.
• [DK+'16]: Check the condition on the entire space. If violated, filter outliers.

CERTIFICATE FOR ROBUSTNESS OF EMPIRICAL ESTIMATOR
Detect when the empirical estimator may be compromised.
(Figure: uncorrupted vs. corrupted points; the corruptions create a direction of large (>1) variance.)

Key Lemma: If X₁, X₂, …, X_N is an ε-corrupted set of samples from N(μ, I) and the top eigenvalue of the empirical covariance Σ̂ is close to 1, then for the empirical mean μ̂ with high probability we have:
• [LRV'16]: ‖μ̂ − μ‖₂ = O(ε √log d) in the additive adversary model.
• [DK+'16]: ‖μ̂ − μ‖₂ = O(ε √log(1/ε)) in the strong adversary model.

Take-away: An adversary needs to mess up the second moment in order to corrupt the first moment.
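The take-away can be verified numerically. A sketch under illustrative parameters of my choosing (shift Δ = 20, ε = 0.1): shifting an ε-fraction of points moves the empirical mean by ≈ εΔ, but the variance in that direction inflates by ≈ ε(1 − ε)Δ², so the spectral certificate fires.

```python
import numpy as np

rng = np.random.default_rng(2)
N, d, eps = 5000, 50, 0.1
X = rng.standard_normal((N, d))              # clean samples from N(0, I)

# Adversary shifts an eps-fraction of points by Delta along e_1,
# moving the empirical mean by roughly eps * Delta in that direction.
Delta = 20.0
X[: int(eps * N), 0] += Delta

mu_hat = X.mean(axis=0)
Sigma_hat = np.cov(X, rowvar=False)
top_eig = np.linalg.eigvalsh(Sigma_hat)[-1]  # largest eigenvalue (eigvalsh is ascending)

# The mean is corrupted by ~ eps * Delta = 2 ...
assert mu_hat[0] > 1.0
# ... but the certificate detects it: the top eigenvalue of the empirical
# covariance is far above 1 (it grows like 1 + eps * (1 - eps) * Delta**2).
assert top_eig > 1.0 + eps * (1 - eps) * Delta**2 / 2
```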


RECURSIVE APPROACH [LRV'16]

Two-Step Procedure:
Step #1: Find a large subspace where the empirical mean works.
Step #2: Recurse on the complement. (If the dimension is 1, use the empirical median.)
Combine the results.

Can reduce the dimension by a factor of 2 in each recursive step.

FINDING A GOOD SUBSPACE (I)
• Good subspace G: one where the empirical mean works. By the Key Lemma, a sufficient condition is: the projection of the empirical covariance on G has no large eigenvalues.
• Also want G to be "high-dimensional".

How do we find such a subspace?

FINDING A GOOD SUBSPACE (II)

Good Subspace Lemma: Let X₁, X₂, …, X_N be an additively ε-corrupted set of samples from N(μ, I). After weak outlier removal, the empirical covariance Σ̂ has small eigenvalues on a subspace of dimension at least d/2.

Corollary: Let W be the span of the eigenvectors corresponding to the bottom d/2 eigenvalues of Σ̂. Then W is a good subspace.

RECURSIVE DIMENSION-REDUCTION ALGORITHM

The algorithm works as follows:
• Remove gross outliers (e.g., by pruning).
• Let W, V be the spans of the bottom d/2 and top d/2 eigenvalues of Σ̂, respectively.
• Use the empirical mean on W.
• Recurse on V. (If the dimension is 1, use the median.)

O(log d) levels of recursion give a final error of O(ε √log d).
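The recursion above can be sketched in numpy. This is a skeleton, not the full [LRV'16] algorithm: the weak outlier removal step is omitted, and the function name is my own; it only shows the split into the bottom-d/2 eigenspace (where the empirical mean is used) and the top eigenspace (where we recurse down to the 1-D median base case).

```python
import numpy as np

def recursive_mean(X):
    """Sketch of recursive dimension-reduction for mean estimation.
    X: (N, d) array of samples. Returns a d-dimensional mean estimate."""
    N, d = X.shape
    if d == 1:
        return np.median(X, axis=0)           # 1-D base case: the median
    Sigma = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigenvalues in ascending order
    W = eigvecs[:, : d // 2]                  # bottom eigenspace: empirical mean is safe here
    V = eigvecs[:, d // 2 :]                  # top eigenspace: recurse on it
    mu_W = (X @ W).mean(axis=0)               # empirical mean on W
    mu_V = recursive_mean(X @ V)              # recursive estimate on V (dimension halved)
    return W @ mu_W + V @ mu_V                # combine back in the original basis
```

Since the dimension halves at every level, the recursion has O(log d) levels, matching the error accounting on the slide.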


FILTERING APPROACH [DKKLMS'16]

Two-Step Procedure:
Step #1: Detect if the empirical estimator may be compromised.
Step #2: If it is, filter out outliers.
Iterate on the new dataset.

A general recipe that works in fairly general settings. We will show how it works for the unknown-mean case.

FILTERING ALGORITHM

Either output the empirical mean, or remove many outliers.

Filtering Approach: Suppose that the empirical covariance has a direction of large variance. Let v be the direction of maximum variance. [Klivans-Long-Servedio'09]
• Project all the points on the direction of v.
• Find a threshold T such that Pr_{x ∼_u S} [ |v · x − median(v · x)| > T ] ≥ 3 e^{−T²/2}.
• Throw away all points x such that |v · x − median(v · x)| > T.
• Iterate on the new dataset. Eventually the empirical mean works.

We filter out more corrupted than good points. After a number of iterations, we have removed all corrupted points.
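The filter loop admits a compact numpy sketch. This is my illustration, not the paper's implementation: it assumes identity covariance, the spectral-certificate threshold of 2 and the function names are arbitrary choices, and the Gaussian tail bound 3e^{−T²/2} is taken from the slide.

```python
import numpy as np

def filter_step(X, T_max=10.0):
    """One filtering iteration: either certify and return the empirical
    mean, or remove the tail beyond a threshold T that violates the
    Gaussian-like bound Pr[|v.x - median| > T] <= 3 * exp(-T^2 / 2).
    Returns (mean_or_None, filtered_X)."""
    Sigma = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    if eigvals[-1] <= 2.0:                    # spectral certificate holds (illustrative threshold)
        return X.mean(axis=0), X
    v = eigvecs[:, -1]                        # direction of maximum variance
    dev = np.abs(X @ v - np.median(X @ v))    # deviations from the projected median
    for T in np.arange(0.5, T_max, 0.5):      # search for a violated threshold T
        if np.mean(dev > T) > 3 * np.exp(-T**2 / 2):
            return None, X[dev <= T]          # filter: throw away the tail beyond T
    return X.mean(axis=0), X                  # no violation found; fall back to the mean

def filtered_mean(X, max_iter=50):
    """Iterate filter_step until the certificate holds."""
    for _ in range(max_iter):
        mu, X = filter_step(X)
        if mu is not None:
            return mu
    return X.mean(axis=0)
```

On data corrupted as in the earlier sketches (an ε-fraction shifted far along one axis), the first iteration removes the planted cluster and the second iteration's certificate holds, so the returned mean is close to the true one.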

GENERALITY OF FILTERING APPROACH

• The focus of the initial version was on specific distribution families (e.g., Gaussian, discrete product distributions). Error guarantee: ‖μ − μ̂‖₂ = O(ε √log(1/ε)).
• The filter approach works under weaker concentration assumptions with appropriate (tight) guarantees. E.g.:
  - Under a 2nd moment assumption: O(√ε)
  - Under a 4th moment assumption: O(ε^{3/4})
  - Under a sub-gaussian assumption: O(ε √log(1/ε))
• Sample complexity is near-optimal for all these cases.


SUBSEQUENT WORK: ROBUST MEAN ESTIMATION

Summary so far: The filtering algorithm robustly estimates the mean of N(μ, I) up to error O(ε √log(1/ε)) in the strong adversary model.

Question: Can we efficiently achieve error O(ε)?

[D-Kane-Stewart'16]: Algorithm achieving error O(ε) in the strong adversary model, with runtime super-polynomial in 1/ε.

[D-Kane-Stewart'16]: No Statistical Query (SQ) algorithm can do better. Specifically, error o(ε √log(1/ε)) requires super-polynomial time in 1/ε.

[DKKLMS'17]: Polynomial-time algorithm with O(ε) error in the additive adversarial model.

SYNTHETIC EXPERIMENTS: UNKNOWN MEAN
A comparison of error rates on synthetic data (unknown mean). (Plot not reproduced in the transcript.)


OPEN PROBLEMS

Subsequent Work:
[D-Kane-Stewart'16] Known-Structure Bayes Nets
[Li / Du-Balakrishnan-Singh'17] Sparse models (e.g., sparse PCA)
[Charikar-Steinhardt-Valiant'17] List-decodable learning
[D-Kane-Stewart'17] Robust Classification

• Pick your favorite high-dimensional learning problem for which a (non-robust) efficient algorithm is known.
• Make it robust!

SUMMARY AND CONCLUSIONS

• First Computationally Efficient Robust Estimators with Dimension-Independent Error Guarantees.
• General Recipe for Various High-Dimensional Problems.
• Practical applications in Exploratory Data Analysis [DKKLMS, ICML'17].

Thanks! Any questions?