computational-statistical tradeoffs in robust estimation · 2020-01-03 ·...

Computational-Statistical Tradeoffs in Robust Estimation Ilias Diakonikolas (USC) (based on joint work with D. Kane and A. Stewart)


Post on 22-May-2020


Page 1: Computational-Statistical Tradeoffs in Robust Estimation · 2020-01-03 · Computational-Statistical Tradeoffs in Robust Estimation Ilias Diakonikolas (USC) ... Gaussian elimination

Computational-Statistical Tradeoffs in Robust Estimation

Ilias Diakonikolas (USC)

(based on joint work with D. Kane and A. Stewart)

Page 2:

ROBUST HIGH-DIMENSIONAL ESTIMATION

Can we develop learning/estimation algorithms that are robust to a constant fraction of corruptions in the data?

Contamination Model: Let $\mathcal{F}$ be a family of high-dimensional distributions. We say that a distribution $F'$ is $\epsilon$-corrupted with respect to $\mathcal{F}$ if there exists $F \in \mathcal{F}$ such that $d_{TV}(F', F) \le \epsilon$.
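A minimal sketch of the contamination model, not taken from the talk. It assumes the simplest special case, in which an $\epsilon$ fraction of samples from $N(\mu, I)$ is replaced by a single fixed outlier point; a general $\epsilon$-corruption may be far more adversarial. The function name `sample_corrupted_gaussian` and all parameter values are hypothetical.

```python
import numpy as np

def sample_corrupted_gaussian(n, d, mu, eps, outlier, rng):
    """Draw n points from N(mu, I), then overwrite an eps fraction with `outlier`."""
    x = rng.standard_normal((n, d)) + mu
    n_bad = int(eps * n)
    x[:n_bad] = outlier  # crude corruption: every bad point sits at the same location
    return x

rng = np.random.default_rng(0)
d, n, eps = 50, 20_000, 0.1           # illustrative values, not from the slides
mu = np.zeros(d)
outlier = np.full(d, 3.0)

x = sample_corrupted_gaussian(n, d, mu, eps, outlier, rng)

# Error of two naive estimators of the mean, measured in Euclidean norm.
mean_err = np.linalg.norm(x.mean(axis=0) - mu)
median_err = np.linalg.norm(np.median(x, axis=0) - mu)
print(f"empirical mean error:    {mean_err:.3f}")
print(f"coordinate median error: {median_err:.3f}")
```

In this toy setup both errors grow with the dimension $d$, which is the phenomenon that motivates the search for better robust estimators.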

Page 3:

THE UNSUPERVISED LEARNING PROBLEM

Unknown $\theta^*$ → samples → $\hat{\theta}$

• Input: sample generated by model with unknown $\theta^*$
• Goal: estimate parameters $\hat{\theta}$ so that $\hat{\theta} \approx \theta^*$

Question 1: Is there an efficient learning algorithm?

Main performance criteria:
• Sample size
• Running time
• Robustness

Question 2: Are there tradeoffs between these criteria?

Page 4:

ROBUSTLY LEARNING A GAUSSIAN – PRIOR WORK

Basic Problem: Given an $\epsilon$-corrupted version of an unknown $d$-dimensional unknown-mean Gaussian […], efficiently compute a hypothesis distribution such that […].

• Extensively studied in robust statistics since the 1960s. Until recently, known efficient estimators get error […].

• Recent Algorithmic Progress:
-- [Lai-Rao-Vempala’16]
-- [D-Kamath-Kane-Li-Moitra-Stewart’16]

Page 5:

ROBUSTLY LEARNING A GAUSSIAN

Basic Problem: Given an $\epsilon$-corrupted version of an unknown $d$-dimensional unknown-mean Gaussian […], efficiently compute a hypothesis distribution such that […].

[…] error is the information-theoretically best possible.

Page 6:

ROBUST LEARNING – OPEN QUESTION

Summary of Prior Work: There is a […] time algorithm for robustly learning […] within error […].

Open Question: Is there a […] time algorithm for robustly learning […] within error […]? How about […]?

Page 7:

OUTLINE

Part I: Introduction
• Unsupervised Learning in High Dimension
• Statistical Query (SQ) Learning Model
• Our Results

Part II: Computational SQ Lower Bounds
• Generic SQ Lower Bound Technique
• Two Applications: Learning GMMs, Robustly Learning a Gaussian

Part III: Extensions

Part IV: Summary and Conclusions

Page 8:

STATISTICAL QUERIES [KEARNS’93]

$x_1, x_2, \ldots, x_m \sim D$ over $X$

Page 9:

STATISTICAL QUERIES [KEARNS’93]

The SQ algorithm sends queries $\phi_1, \phi_2, \ldots, \phi_q$ to the $\mathrm{STAT}_D(\tau)$ oracle for $D$ and receives answers $v_1, v_2, \ldots, v_q$.

$\phi_1 : X \to [-1, 1]$

$|v_1 - \mathbf{E}_{x \sim D}[\phi_1(x)]| \le \tau$

$\tau$ is the tolerance of the query; $\tau = 1/\sqrt{m}$.

Problem $P \in \mathrm{SQCompl}(q, m)$: if there exists an SQ algorithm that solves $P$ using $q$ queries to $\mathrm{STAT}_D(\tau = 1/\sqrt{m})$.
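The query/answer protocol above can be simulated in a few lines. This is only a sketch under stated assumptions: the class name `StatOracle` is hypothetical, the expectation is approximated by Monte Carlo sampling, and the perturbation is random, whereas the real $\mathrm{STAT}_D(\tau)$ oracle may answer adversarially anywhere within tolerance $\tau$.

```python
import numpy as np

class StatOracle:
    """Simulated STAT_D(tau): returns E_{x~D}[phi(x)] up to roughly tau additive error."""

    def __init__(self, sampler, tau, rng, n_mc=100_000):
        self.sampler = sampler      # callable (n, rng) -> n samples from D
        self.tau = tau              # query tolerance
        self.rng = rng
        self.n_mc = n_mc            # Monte Carlo sample size (an implementation choice)
        self.num_queries = 0

    def query(self, phi):
        """Answer one statistical query phi: X -> [-1, 1]."""
        self.num_queries += 1
        x = self.sampler(self.n_mc, self.rng)
        vals = np.clip(phi(x), -1.0, 1.0)   # enforce the bounded range of a query
        # Random perturbation inside [-tau, tau]; a worst-case oracle could pick any value.
        noise = self.rng.uniform(-self.tau, self.tau)
        return float(vals.mean() + noise)

rng = np.random.default_rng(0)
oracle = StatOracle(lambda n, rng: rng.standard_normal(n), tau=0.05, rng=rng)

# Query phi(x) = sign(x) against D = N(0, 1); the true expectation is 0.
v = oracle.query(np.sign)
print(v, oracle.num_queries)
```

An SQ algorithm only ever sees values like `v`, never the individual samples, which is what makes unconditional lower bounds tractable in this model.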

Page 10:

POWER OF SQ ALGORITHMS (?)

Restricted Model: Hope to prove unconditional computational lower bounds.

Powerful Model: Wide range of algorithmic techniques in ML are implementable using SQs*:

• PAC Learning: AC0, decision trees, linear separators, boosting.

• Unsupervised Learning: stochastic convex optimization, moment-based methods, k-means clustering, EM, … [Feldman-Grigorescu-Reyzin-Vempala-Xiao/JACM’17]

Only known exception: Gaussian elimination over finite fields (e.g., learning parities).

For all problems in this talk, strongest known algorithms are SQ.

Page 11:

METHODOLOGY FOR SQ LOWER BOUNDS

Statistical Query Dimension:

• Fixed-distribution PAC Learning [Blum-Furst-Jackson-Kearns-Mansour-Rudich’95; …]

• General Statistical Problems [Feldman-Grigorescu-Reyzin-Vempala-Xiao’13, …, Feldman’16]

Pairwise correlation between $D_1$ and $D_2$ with respect to $D$: […]

Fact: Suffices to construct a large set of distributions that are nearly uncorrelated.
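The slide's own formula for the pairwise correlation did not survive in this transcript. For orientation, the definition commonly used in the SQ-dimension literature cited on this slide is the following; treating it as the definition intended here is an assumption.

```latex
\chi_D(D_1, D_2) \;:=\; \int_{\mathbb{R}^d} \frac{D_1(x)\, D_2(x)}{D(x)}\, dx \;-\; 1,
\qquad
\chi_D(D_1, D_1) \;=\; \chi^2(D_1, D).
```

Under this reading, "nearly uncorrelated" means that $|\chi_D(D_i, D_j)|$ is small for every pair of distinct distributions in the set, while the diagonal terms are the chi-squared divergences from $D$.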

Page 12:

OUTLINE

Part I: Introduction
• Unsupervised Learning in High Dimension
• Statistical Query (SQ) Learning Model
• Our Results

Part II: Computational SQ Lower Bounds
• Generic SQ Lower Bound Technique
• Two Applications: Learning GMMs, Robustly Learning a Gaussian

Part III: Summary and Conclusions

Page 13:

STATISTICAL QUERY LOWER BOUND FOR ROBUSTLY LEARNING A GAUSSIAN

Theorem: Suppose […]. Any SQ algorithm that learns an $\epsilon$-corrupted Gaussian within statistical distance error […] requires either:
• SQ queries of accuracy […], or
• At least […] many SQ queries.

Take-away: Any asymptotic improvement in error guarantee over prior work requires super-polynomial time.

Page 14:

GENERAL LOWER BOUND CONSTRUCTION

General Technique for SQ Lower Bounds: Leads to Tight Lower Bounds for a range of High-dimensional Estimation Tasks

Concrete Applications of our Technique:

• Robustly Learning the Mean and Covariance

• Learning Gaussian Mixture Models (GMMs)

• Statistical-Computational Tradeoffs

• Robustly Testing a Gaussian

Page 15:

APPLICATIONS: CONCRETE SQ LOWER BOUNDS

Unified technique yielding a range of applications

Learning Problem | Upper Bound | SQ Lower Bound
Robust Gaussian Mean Estimation | Error: […] [DKKLMS’16] | Runtime Lower Bound: […] for factor M improvement in error.
Robust Gaussian Covariance Estimation | Error: […] [DKKLMS’16] |
Learning k-GMMs (without noise) | Runtime: […] [MV’10, BS’10] | Runtime Lower Bound: […]
Robust k-Sparse Mean Estimation | Sample size: […] [Li’17, DBS’17] | If sample size is […], runtime lower bound: […]
Robust Covariance Estimation in Spectral Norm | Sample size: […] [DKKLMS’16] | If sample size is […], runtime lower bound: […]

Page 16:

GAUSSIAN MIXTURE MODEL (GMM)

• GMM: Distribution on $\mathbb{R}^d$ with probability density function $F(x) = \sum_{i=1}^{k} w_i \, N(\mu_i, \Sigma_i)(x)$

• Extensively studied in statistics and TCS

Karl Pearson (1894)

Page 18:

LEARNING GMMS - PRIOR WORK (I)

Two Related Learning Problems

Parameter Estimation: Recover model parameters.

• Separation Assumptions: Clustering-based Techniques [Dasgupta’99, Dasgupta-Schulman’00, Arora-Kannan’01, Vempala-Wang’02, Achlioptas-McSherry’05, Brubaker-Vempala’08]

Sample Complexity: […]
(Best Known) Runtime: […]

• No Separation: Moment Method [Kalai-Moitra-Valiant’10, Moitra-Valiant’10, Belkin-Sinha’10, Hardt-Price’15]

Sample Complexity: […]
(Best Known) Runtime: […]

Page 19:

SEPARATION ASSUMPTIONS

• Clustering is possible only when the components have very little overlap.

• Formally, we want the total variation distance between components to be close to 1.

• Algorithms for learning spherical GMMs work under this assumption.

• For non-spherical GMMs, known algorithms require stronger assumptions.

Page 20:

LEARNING GMMS - PRIOR WORK (II)

Density Estimation: Recover underlying distribution (within statistical distance).

[Feldman-O’Donnell-Servedio’05, Moitra-Valiant’10, Suresh-Orlitsky-Acharya-Jafarpour’14, Hardt-Price’15, Li-Schmidt’15]

Sample Complexity: […]

(Best Known) Runtime: […]

Fact: For separated GMMs, density estimation and parameter estimation are equivalent.

Page 21:

LEARNING GMMS – OPEN QUESTION

Summary: The sample complexity of density estimation for k-GMMs is […]. The sample complexity of parameter estimation for separated k-GMMs is […].

Question: Is there a […] time learning algorithm?

Page 22:

STATISTICAL QUERY LOWER BOUND FOR LEARNING GMMS

Theorem: Suppose that […]. Any SQ algorithm that learns separated k-GMMs over $\mathbb{R}^d$ to constant error requires either:
• SQ queries of accuracy […], or
• At least […] many SQ queries.

Take-away: Computational complexity of learning GMMs is inherently exponential in number of components.

Page 23:

OUTLINE

Part I: Introduction
• Unsupervised Learning in High Dimension
• Statistical Query (SQ) Learning Model
• Our Results

Part II: Computational SQ Lower Bounds
• Generic SQ Lower Bound Technique
• Two Applications: Learning GMMs, Robustly Learning a Gaussian

Part III: Summary and Conclusions

Page 24:

GENERAL RECIPE FOR (SQ) LOWER BOUNDS

Our generic technique for proving SQ Lower Bounds:

• Step #1: Construct distribution […] that is standard Gaussian in all directions except $v$.

• Step #2: Construct the univariate projection in the $v$ direction so that it matches the first m moments of $N(0,1)$.

• Step #3: Consider the family of instances […].

Non-Gaussian Component Analysis [Blanchard et al. 2006]

Page 25:

HIDDEN DIRECTION DISTRIBUTION

Definition: For a unit vector v and a univariate distribution with density A, consider the high-dimensional distribution […]

Example: […]
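A minimal sampling sketch for a hidden-direction distribution, read here as "distributed like A along v and standard Gaussian on the orthogonal complement of v", which is how the recipe slide describes it. The function name `sample_hidden_direction` is hypothetical, and the choice of A (a fair ±1 coin) is purely illustrative and not the A used in the talk.

```python
import numpy as np

def sample_hidden_direction(n, v, sample_a, rng):
    """Sample n points: law A along unit vector v, N(0, I) orthogonally to v."""
    v = v / np.linalg.norm(v)
    z = rng.standard_normal((n, v.shape[0]))
    a = sample_a(n, rng)
    # Strip the Gaussian component along v, then put a draw from A in its place.
    return z - np.outer(z @ v, v) + np.outer(a, v)

rng = np.random.default_rng(1)
d = 20
v = rng.standard_normal(d)
v /= np.linalg.norm(v)                      # the hidden direction

def sample_a(n, rng):
    """Illustrative univariate A: uniform on {-1, +1}."""
    return rng.choice([-1.0, 1.0], size=n)

x = sample_hidden_direction(50_000, v, sample_a, rng)

proj = x @ v                                # projection onto the hidden direction
u = rng.standard_normal(d)
u -= (u @ v) * v                            # make u orthogonal to v
u /= np.linalg.norm(u)
orth = x @ u                                # projection onto an orthogonal direction
print(np.unique(np.round(proj, 6)), orth.var())
```

Along `v` the samples take only the values ±1, while along any orthogonal direction `u` they look like $N(0,1)$; a learner that does not know `v` has to find it.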

Page 26:

GENERIC SQ LOWER BOUND

Definition: For a unit vector v and a univariate distribution with density A, consider the high-dimensional distribution […]

Proposition: Suppose that:
• A matches the first m moments of $N(0,1)$
• We have […] as long as v, v’ are nearly orthogonal.

Then any SQ algorithm that learns an unknown […] within error […] requires either queries of accuracy […] or […] many queries.
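The proposition is only useful if there are many directions that are pairwise nearly orthogonal. A quick numerical illustration of why this is plausible in high dimension: independent random unit vectors have small pairwise inner products, on the order of $1/\sqrt{d}$. This is a sketch with illustrative sizes; the helper `random_unit_vectors` and the values of `d` and `n` are assumptions, not from the slides.

```python
import numpy as np

def random_unit_vectors(n, d, rng):
    """Return n independent uniformly random unit vectors in R^d (rows)."""
    g = rng.standard_normal((n, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

rng = np.random.default_rng(2)
d, n = 1000, 200                      # illustrative sizes
V = random_unit_vectors(n, d, rng)

gram = V @ V.T                        # gram[i, j] = <v_i, v_j>
off_diag = np.abs(gram[~np.eye(n, dtype=bool)])
max_inner = float(off_diag.max())     # largest |<v_i, v_j>| over distinct pairs
print(f"max |<v, v'>| over {n * (n - 1) // 2} pairs: {max_inner:.3f}")
```

The number of such vectors can be taken very large before the maximum inner product becomes appreciable, which is what lets the lower bound count many nearly uncorrelated instances.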

Page 27:

WHY IS FINDING A HIDDEN DIRECTION HARD?

Observation: Low-Degree Moments do not help.

• A matches the first m moments of $N(0,1)$
• The first m moments of […] are identical to those of […]
• Degree-(m+1) moment tensor has […] entries.

Claim: Random projections do not help.

• To distinguish between […] and […], would need exponentially many random projections.

Page 28:

ONE-DIMENSIONAL PROJECTIONS ARE ALMOST GAUSSIAN

Key Lemma: Let Q be the distribution of […], where […]. Then, we have that: […]

Page 29:

PROOF OF KEY LEMMA (I)

Page 31:

PROOF OF KEY LEMMA (II)

[…] where […] is the operator over […]

Gaussian Noise (Ornstein-Uhlenbeck) Operator

Page 32:

EIGENFUNCTIONS OF ORNSTEIN-UHLENBECK OPERATOR

Linear Operator […] acting on functions […]

Fact (Mehler’66): […]

• […] denotes the degree-i Hermite polynomial.
• Note that […] are orthonormal with respect to the inner product […]
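The eigenfunction fact can be checked numerically. The sketch below assumes the standard form of the Gaussian noise operator, $(U_\rho f)(x) = \mathbf{E}_{z \sim N(0,1)}[f(\rho x + \sqrt{1-\rho^2}\, z)]$, and uses probabilists' Hermite polynomials $He_i$; the slide's own normalization is not recoverable from the transcript, but rescaling a polynomial does not change whether it is an eigenfunction. The names `ou_operator` and `hermite` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def ou_operator(f, rho, x, n_quad=20):
    """(U_rho f)(x) = E_{z~N(0,1)} f(rho*x + sqrt(1-rho^2)*z), via Gauss-Hermite quadrature."""
    z, w = hermegauss(n_quad)          # nodes/weights for the weight exp(-z^2/2)
    w = w / w.sum()                    # normalize so the weights form a probability vector
    pts = rho * x[:, None] + np.sqrt(1.0 - rho**2) * z[None, :]
    return f(pts) @ w

def hermite(i):
    """Probabilists' Hermite polynomial He_i as a callable."""
    coeffs = np.zeros(i + 1)
    coeffs[i] = 1.0
    return lambda t: hermeval(t, coeffs)

rho, i = 0.6, 3
x = np.linspace(-2.0, 2.0, 9)

lhs = ou_operator(hermite(i), rho, x)  # U_rho applied to He_i
rhs = rho**i * hermite(i)(x)           # rho^i * He_i
print(np.max(np.abs(lhs - rhs)))
```

The two sides agree up to floating-point error, reflecting that $He_i$ is an eigenfunction of $U_\rho$ with eigenvalue $\rho^i$: higher-degree components are damped geometrically.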

Page 33:

GENERIC SQ LOWER BOUND

Definition: For a unit vector v and a univariate distribution with density A, consider the high-dimensional distribution […]

Proposition: Suppose that:
• A matches the first m moments of $N(0,1)$
• We have […] as long as v, v’ are nearly orthogonal.

Then any SQ algorithm that learns an unknown […] within error […] requires either queries of accuracy […] or […] many queries.

Page 34:

OUTLINE

Part I: Introduction
• Unsupervised Learning in High Dimension
• Statistical Query (SQ) Learning Model
• Our Results

Part II: Computational SQ Lower Bounds
• Generic SQ Lower Bound Technique
• Application: Learning GMMs

Part III: Summary and Conclusions

Page 35:

APPLICATION: SQ LOWER BOUND FOR GMMS (I)

Want to show:

Theorem: Any SQ algorithm that learns separated k-GMMs over $\mathbb{R}^d$ to constant error requires either SQ queries of accuracy […] or at least […] many SQ queries.

by using our generic proposition:

Proposition: Suppose that:
• A matches the first m moments of $N(0,1)$
• We have […] as long as v, v’ are nearly orthogonal.

Then any SQ algorithm that learns an unknown […] within error […] requires either queries of accuracy […] or […] many queries.

Page 36:

APPLICATION: SQ LOWER BOUND FOR GMMS (II)

Proposition: Suppose that:
• A matches the first m moments of $N(0,1)$
• We have […] as long as v, v’ are nearly orthogonal.

Then any SQ algorithm that learns an unknown […] within error […] requires either queries of accuracy […] or […] many queries.

Lemma: There exists a univariate distribution A that is a k-GMM with components $A_i$ such that:
• A agrees with $N(0,1)$ on the first 2k-1 moments.
• Each pair of components are separated.
• Whenever v and v’ are nearly orthogonal, […]
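To see that matching 2k-1 moments with only k "pieces" is possible at all, here is an illustrative check using Gauss-Hermite quadrature: the k quadrature nodes, weighted by the normalized quadrature weights, form a discrete distribution whose first 2k-1 moments equal those of $N(0,1)$. This is only a discrete skeleton offered as an assumption-laden illustration; it is not claimed to be the construction used in the talk, whose components are Gaussians.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gaussian_moment(j):
    """j-th moment of N(0,1): 0 for odd j, (j-1)!! for even j."""
    if j % 2 == 1:
        return 0.0
    return float(np.prod(np.arange(j - 1, 0, -2)))

k = 4                                   # number of support points (illustrative)
nodes, weights = hermegauss(k)          # Gauss-Hermite rule for the weight exp(-x^2/2)
probs = weights / weights.sum()         # normalize weights into probabilities

# Moments 1..2k of the k-point distribution.
moments = [float(np.sum(probs * nodes**j)) for j in range(1, 2 * k + 1)]

for j, m in enumerate(moments, start=1):
    print(j, round(m, 6), gaussian_moment(j))
```

Moments 1 through 2k-1 coincide with the Gaussian ones, and the first mismatch appears at degree 2k, which is consistent with the "first 2k-1 moments" count in the lemma.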

Page 37:

APPLICATION: SQ LOWER BOUND FOR GMMS (III)

Lemma: There exists a univariate distribution A that is a k-GMM with components $A_i$ such that:
• A agrees with $N(0,1)$ on the first 2k-1 moments.
• Each pair of components are separated.
• Whenever v and v’ are nearly orthogonal, […]

Page 38:

APPLICATION: SQ LOWER BOUND FOR GMMS (III)

High-Dimensional Distributions look like “parallel pancakes”: […]

Efficiently learnable for k=2. [Brubaker-Vempala’08]

Page 39:

OUTLINE

Part I: Introduction
• Unsupervised Learning in High Dimension
• Statistical Query (SQ) Learning Model
• Our Results

Part II: Computational SQ Lower Bounds
• Generic SQ Lower Bound Technique
• Two Applications: Learning GMMs, Robustly Learning a Gaussian

Part III: Summary and Conclusions

Page 40:

SUMMARY AND FUTURE DIRECTIONS

• General Technique to Prove SQ Lower Bounds
• Robustness can make high-dimensional estimation harder computationally and information-theoretically.

Future Directions:

• Further Applications of our Framework
List-Decodable Mean Estimation [D-Kane-Stewart’18]
Discrete Product Distributions [D-Kane-Stewart’18]
Robust Regression [D-Kong-Stewart’18]
Adversarial Examples [Bubeck-Price-Razenshteyn’18]

• Alternative Evidence of Computational Hardness?

Thanks! Any Questions?