computational-statistical tradeoffs in robust estimation · 2020-01-03 ·...
TRANSCRIPT
Computational-Statistical Tradeoffs in Robust Estimation
Ilias Diakonikolas (USC)
(based on joint work with D. Kane and A. Stewart)
Can we develop learning/estimation algorithms that are robust to a constant fraction of corruptions in the data?
ROBUST HIGH-DIMENSIONAL ESTIMATION
Contamination Model: Let F be a family of high-dimensional distributions. We say that a distribution P is ε-corrupted with respect to F if there exists F ∈ F such that d_TV(P, F) ≤ ε.
THE UNSUPERVISED LEARNING PROBLEM
• Input: a sample generated by the model with unknown parameters θ*.
• Goal: estimate the parameters, i.e., output θ̂ such that θ̂ ≈ θ*.
Main performance criteria:
• Sample size
• Running time
• Robustness
Question 1: Is there an efficient learning algorithm?
Question 2: Are there tradeoffs between these criteria?
ROBUSTLY LEARNING A GAUSSIAN – PRIOR WORK
Basic Problem: Given an ε-corrupted version of an unknown d-dimensional unknown-mean Gaussian N(μ, I), efficiently compute a hypothesis distribution N(μ̂, I) such that d_TV(N(μ̂, I), N(μ, I)) is small.
• Extensively studied in robust statistics since the 1960s. Until recently, known efficient estimators could only guarantee error O(ε·√d).
• Recent Algorithmic Progress:
  -- error O(ε·√(log d)) [Lai-Rao-Vempala'16]
  -- error O(ε·√(log(1/ε))) [D-Kamath-Kane-Li-Moitra-Stewart'16]
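The gap between naive and robust estimators is easy to see even in one dimension. A minimal sketch (the parameters eps, M, n below are hypothetical, chosen for illustration; this is not any of the algorithms above):

```python
import random
import statistics

# Toy 1-D illustration: placing an eps-fraction of adversarial outliers
# at distance M drags the empirical mean by roughly eps * M, while the
# median moves by only O(eps).
rng = random.Random(0)
eps, M, n = 0.1, 100.0, 10000
clean = [rng.gauss(0.0, 1.0) for _ in range(int((1 - eps) * n))]
corrupted = clean + [M] * (n - len(clean))
mean_est = sum(corrupted) / n               # dragged toward eps * M = 10
median_est = statistics.median(corrupted)   # stays near the true mean 0
```

In high dimensions, applying the median coordinate-wise incurs an extra dimension-dependent factor, which is exactly the barrier the recent algorithms overcome.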
ROBUSTLY LEARNING A GAUSSIAN
Basic Problem: Given an ε-corrupted version of an unknown d-dimensional unknown-mean Gaussian N(μ, I), efficiently compute a hypothesis distribution N(μ̂, I) such that d_TV(N(μ̂, I), N(μ, I)) ≤ O(ε·√(log(1/ε))).
Θ(ε) error is the information-theoretically best possible.
ROBUST LEARNING – OPEN QUESTION
Summary of Prior Work: There is a poly(d/ε)-time algorithm for robustly learning N(μ, I) within error O(ε·√(log(1/ε))).
Open Question: Is there a poly(d/ε)-time algorithm for robustly learning N(μ, I) within error O(ε)? How about o(ε·√(log(1/ε)))?
OUTLINE
Part I: Introduction
• Unsupervised Learning in High Dimension
• Statistical Query (SQ) Learning Model
• Our Results
Part II: Computational SQ Lower Bounds
• Generic SQ Lower Bound Technique
• Two Applications: Learning GMMs, Robustly Learning a Gaussian
Part III: Extensions
Part IV: Summary and Conclusions
STATISTICAL QUERIES [KEARNS'93]
Classical model: the algorithm sees samples x₁, x₂, …, x_m ∼ D over X.
SQ model: an SQ algorithm instead interacts with a STAT_D(τ) oracle. It asks queries φᵢ: X → [−1, 1] and receives answers vᵢ satisfying
  |vᵢ − E_{x∼D}[φᵢ(x)]| ≤ τ.
Here τ is the tolerance of the query; τ = 1/√m corresponds to m samples.
Problem P ∈ SQCompl(q, m): if there exists an SQ algorithm that solves P using q queries to STAT_D(τ = 1/√m).
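The STAT oracle is easy to simulate with samples, which is why τ = 1/√m is the natural tolerance. A minimal sketch (the function names are illustrative, not from the talk):

```python
import math
import random

def make_stat_oracle(sampler, m, rng=random.Random(0)):
    """STAT_D(tau) oracle with tolerance tau = 1/sqrt(m): answers a
    query phi: X -> [-1, 1] with the empirical mean of phi over m
    samples, which is within tau of E_{x~D}[phi(x)] with high
    probability by Hoeffding's inequality."""
    tau = 1.0 / math.sqrt(m)
    def oracle(phi):
        est = sum(phi(sampler(rng)) for _ in range(m)) / m
        return est, tau
    return oracle

# Demo: D = N(0, 1); query phi = tanh (odd function, so E[phi(x)] = 0).
oracle = make_stat_oracle(lambda rng: rng.gauss(0.0, 1.0), m=10000)
value, tau = oracle(math.tanh)   # value is within ~tau = 0.01 of 0
```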
POWER OF SQ ALGORITHMS (?)
Restricted Model: Hope to prove unconditional computational lower bounds.
Powerful Model: A wide range of algorithmic techniques in ML are implementable using SQs*:
• PAC Learning: AC0, decision trees, linear separators, boosting.
• Unsupervised Learning: stochastic convex optimization, moment-based methods, k-means clustering, EM, … [Feldman-Grigorescu-Reyzin-Vempala-Xiao/JACM'17]
*Only known exception: Gaussian elimination over finite fields (e.g., learning parities).
For all problems in this talk, the strongest known algorithms are SQ.
METHODOLOGY FOR SQ LOWER BOUNDS
Statistical Query Dimension:
• Fixed-distribution PAC Learning [Blum-Furst-Jackson-Kearns-Mansour-Rudich'95; …]
• General Statistical Problems [Feldman-Grigorescu-Reyzin-Vempala-Xiao'13, …, Feldman'16]
Pairwise correlation between D₁ and D₂ with respect to D:
  χ_D(D₁, D₂) := ∫_X D₁(x)·D₂(x)/D(x) dx − 1.
Fact: It suffices to construct a large set of distributions that are nearly uncorrelated.
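The pairwise correlation χ_D(D₁, D₂) = ∫ D₁D₂/D − 1 can be checked numerically for simple instances. A sketch using the trapezoid rule (for D₁ = N(μ₁, 1), D₂ = N(μ₂, 1), D = N(0, 1) a direct calculation gives χ_D(D₁, D₂) = exp(μ₁·μ₂) − 1, which the quadrature reproduces):

```python
import math

def normal_pdf(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def pairwise_correlation(p1, p2, p, lo=-12.0, hi=12.0, n=20001):
    """chi_D(D1, D2) = integral of D1(x) * D2(x) / D(x) dx - 1,
    computed by trapezoid quadrature on [lo, hi]."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * p1(x) * p2(x) / p(x)
    return total * h - 1.0

# Example: mu1 = 0.5, mu2 = -0.3, so chi = exp(-0.15) - 1.
chi = pairwise_correlation(lambda x: normal_pdf(x, 0.5),
                           lambda x: normal_pdf(x, -0.3),
                           lambda x: normal_pdf(x, 0.0))
```

Note how χ decays to 0 as μ₁·μ₂ → 0: nearly "orthogonal" perturbations are nearly uncorrelated, which is exactly what the lower-bound machinery exploits.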
STATISTICAL QUERY LOWER BOUND FOR ROBUSTLY LEARNING A GAUSSIAN
Theorem (under a mild condition on ε and d): Any SQ algorithm that learns an ε-corrupted Gaussian N(μ, I) within statistical distance error o(ε·√(log(1/ε))) requires either:
• SQ queries of accuracy d^{−Ω(log(1/ε))}, or
• at least 2^{d^{Ω(1)}} many SQ queries.
Take-away: Any asymptotic improvement in the error guarantee over prior work requires super-polynomial time.
GENERAL LOWER BOUND CONSTRUCTION
General Technique for SQ Lower Bounds: leads to tight lower bounds for a range of high-dimensional estimation tasks.
Concrete Applications of our Technique:
• Robustly Learning the Mean and Covariance
• Learning Gaussian Mixture Models (GMMs)
• Statistical-Computational Tradeoffs
• Robustly Testing a Gaussian
APPLICATIONS: CONCRETE SQ LOWER BOUNDS
Unified technique yielding a range of applications.

Learning Problem | Upper Bound | SQ Lower Bound
Robust Gaussian Mean Estimation | Error: O(ε·√(log(1/ε))) [DKKLMS'16] | Runtime lower bound: d^{Ω(log M)} for a factor-M improvement in error.
Robust Gaussian Covariance Estimation | Error: O(ε·log(1/ε)) [DKKLMS'16] | Runtime lower bound: d^{Ω(log M)} for a factor-M improvement in error.
Learning k-GMMs (without noise) | Runtime: poly(d), exponential in k [MV'10, BS'10] | Runtime lower bound: d^{Ω(k)}.
Robust k-Sparse Mean Estimation | Sample size: O(k²·log d / ε²) [Li'17, DBS'17] | If the sample size is o(k²), runtime lower bound is super-polynomial.
Robust Covariance Estimation in Spectral Norm | Sample size: Õ(d²) [DKKLMS'16] | If the sample size is o(d²), runtime lower bound is super-polynomial.
GAUSSIAN MIXTURE MODEL (GMM)
• GMM: Distribution on R^d with probability density function F(x) = Σᵢ wᵢ·N(μᵢ, Σᵢ)(x), where the weights satisfy wᵢ ≥ 0 and Σᵢ wᵢ = 1.
• Extensively studied in statistics and TCS.
Karl Pearson (1894)
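Sampling from a GMM makes the model concrete: pick a component with probability wᵢ, then draw from that Gaussian. A minimal 1-D sketch (the talk's GMMs live in R^d; the mixture parameters below are arbitrary illustrative values):

```python
import random

def sample_gmm(weights, mus, sigmas, rng):
    """One draw from the mixture F(x) = sum_i w_i * N(mu_i, sigma_i^2)(x):
    choose component i with probability w_i, then sample that Gaussian."""
    u, acc = rng.random(), 0.0
    for w, mu, sigma in zip(weights, mus, sigmas):
        acc += w
        if u <= acc:
            return rng.gauss(mu, sigma)
    return rng.gauss(mus[-1], sigmas[-1])   # guard against rounding

rng = random.Random(0)
xs = [sample_gmm([0.3, 0.7], [-4.0, 2.0], [1.0, 1.0], rng)
      for _ in range(10000)]
mean = sum(xs) / len(xs)   # mixture mean is 0.3*(-4) + 0.7*2 = 0.2
```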
LEARNING GMMS – PRIOR WORK (I)
Two Related Learning Problems.
Parameter Estimation: Recover the model parameters.
• Separation Assumptions: Clustering-based techniques [Dasgupta'99, Dasgupta-Schulman'00, Arora-Kannan'01, Vempala-Wang'02, Achlioptas-McSherry'05, Brubaker-Vempala'08]
  Sample Complexity and (Best Known) Runtime: polynomial in d and k.
• No Separation: Moment method [Kalai-Moitra-Valiant'10, Moitra-Valiant'10, Belkin-Sinha'10, Hardt-Price'15]
  Sample Complexity and (Best Known) Runtime: polynomial in d, exponential in k.
SEPARATION ASSUMPTIONS
• Clustering is possible only when the components have very little overlap.
• Formally, we want the total variation distance between components to be close to 1.
• Algorithms for learning spherical GMMs work under this assumption.
• For non-spherical GMMs, known algorithms require stronger assumptions.
LEARNING GMMS – PRIOR WORK (II)
Density Estimation: Recover the underlying distribution (within statistical distance ε).
[Feldman-O'Donnell-Servedio'05, Moitra-Valiant'10, Suresh-Orlitsky-Acharya-Jafarpour'14, Hardt-Price'15, Li-Schmidt'15]
Sample Complexity: polynomial in d, k, and 1/ε.
(Best Known) Runtime: exponential in k.
Fact: For separated GMMs, density estimation and parameter estimation are equivalent.
LEARNING GMMS – OPEN QUESTION
Summary: The sample complexity of density estimation for k-GMMs is polynomial in d, k, and 1/ε, and so is the sample complexity of parameter estimation for separated k-GMMs.
Question: Is there a poly(d, k, 1/ε)-time learning algorithm?
STATISTICAL QUERY LOWER BOUND FOR LEARNING GMMS
Theorem (under a mild condition relating k and d): Any SQ algorithm that learns separated k-GMMs over R^d to constant error requires either:
• SQ queries of accuracy d^{−Ω(k)}, or
• at least 2^{d^{Ω(1)}} many SQ queries.
Take-away: The computational complexity of learning GMMs is inherently exponential in the number of components.
GENERAL RECIPE FOR (SQ) LOWER BOUNDS
Our generic technique for proving SQ lower bounds:
• Step #1: Construct a distribution P_v that is standard Gaussian in all directions except a hidden direction v.
• Step #2: Construct the univariate projection A in the direction v so that it matches the first m moments of N(0, 1).
• Step #3: Consider the family of instances {P_v : v a unit vector}.
Non-Gaussian Component Analysis [Blanchard et al. 2006]
HIDDEN DIRECTION DISTRIBUTION
Definition: For a unit vector v and a univariate distribution with density A, consider the high-dimensional distribution
  P_{A,v}(x) = A(⟨v, x⟩) · φ_⊥(x − ⟨v, x⟩·v),
where φ_⊥ is the density of a standard Gaussian on the orthogonal complement of v. That is, P_{A,v} is distributed according to A along the direction v and is standard Gaussian in every direction orthogonal to v.
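Sampling from P_{A,v} is straightforward, which makes the construction easy to experiment with. A sketch (sample_A and the example parameters are hypothetical; the talk's A is a carefully chosen moment-matching distribution):

```python
import random

def sample_hidden_direction(v, sample_A, rng):
    """Draw one sample from P_{A,v}: the projection onto the unit
    vector v has law A, and the distribution is standard Gaussian
    on the orthogonal complement of v."""
    z = [rng.gauss(0.0, 1.0) for _ in range(len(v))]   # N(0, I_d)
    proj = sum(vi * zi for vi, zi in zip(v, z))        # <v, z>
    s = sample_A(rng)                                  # coordinate along v
    # Replace the v-component of z with a draw from A.
    return [zi - proj * vi + s * vi for vi, zi in zip(v, z)]

# Example: A = N(3, 1) hidden in direction v = e_1 of R^5.
rng = random.Random(1)
v = [1.0, 0.0, 0.0, 0.0, 0.0]
xs = [sample_hidden_direction(v, lambda r: r.gauss(3.0, 1.0), rng)
      for _ in range(4000)]
mean_along_v = sum(x[0] for x in xs) / len(xs)   # near 3 (the law A)
mean_orth = sum(x[1] for x in xs) / len(xs)      # near 0 (standard Gaussian)
```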
GENERIC SQ LOWER BOUND
Recall: for a unit vector v and a univariate distribution with density A, P_{A,v} denotes the high-dimensional distribution that follows A along v and is standard Gaussian on the orthogonal complement.
Proposition: Suppose that:
• A matches the first m moments of N(0, 1), and
• the pairwise correlation χ_{N(0,I)}(P_{A,v}, P_{A,v'}) is small as long as v, v' are nearly orthogonal.
Then any SQ algorithm that learns an unknown P_{A,v} within small error requires either queries of accuracy d^{−Ω(m)} or 2^{d^{Ω(1)}} many queries.
WHY IS FINDING A HIDDEN DIRECTION HARD?
Observation: Low-degree moments do not help.
• A matches the first m moments of N(0, 1).
• Hence the first m moments of P_{A,v} are identical to those of the standard Gaussian N(0, I).
• The degree-(m+1) moment tensor has d^{m+1} entries.
Claim: Random projections do not help.
• To distinguish between P_{A,v} and N(0, I), one would need exponentially many random projections.
ONE-DIMENSIONAL PROJECTIONS ARE ALMOST GAUSSIAN
Key Lemma: Let Q be the distribution of ⟨v', X⟩, where X ∼ P_{A,v}. Then we have that
  χ²(Q, N(0, 1)) ≤ ⟨v, v'⟩^{2(m+1)} · χ²(A, N(0, 1)),
so Q is nearly standard Gaussian whenever v and v' are nearly orthogonal.
PROOF OF KEY LEMMA
Write t = ⟨v, v'⟩. Since ⟨v', X⟩ = t·s + √(1 − t²)·z with s ∼ A and z ∼ N(0, 1) independent, we get Q = U_t A, where U_t is the Gaussian noise (Ornstein-Uhlenbeck) operator over R:
  (U_t f)(x) = E_{z∼N(0,1)}[ f(t·x + √(1 − t²)·z) ].
EIGENFUNCTIONS OF THE ORNSTEIN-UHLENBECK OPERATOR
Linear operator U_t acting on functions f: R → R.
Fact (Mehler 1866): U_t H_i = tⁱ · H_i.
• H_i denotes the degree-i Hermite polynomial.
• Note that the (normalized) H_i are orthonormal with respect to the inner product ⟨f, g⟩ = E_{x∼N(0,1)}[f(x)·g(x)].
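The Mehler eigenrelation U_t H_i = tⁱH_i can be verified numerically. A sketch using probabilists' Hermite polynomials and trapezoid quadrature against the Gaussian density (function names are illustrative):

```python
import math

def he(i, x):
    """Probabilists' Hermite polynomial He_i(x), via the recurrence
    He_{n+1}(x) = x * He_n(x) - n * He_{n-1}(x)."""
    h0, h1 = 1.0, x
    if i == 0:
        return h0
    for n in range(1, i):
        h0, h1 = h1, x * h1 - n * h0
    return h1

def ou_apply(f, t, x, n=4001, lo=-10.0, hi=10.0):
    """(U_t f)(x) = E_{z~N(0,1)}[ f(t*x + sqrt(1-t^2)*z) ],
    computed by trapezoid quadrature in z."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for j in range(n):
        z = lo + j * h
        w = 0.5 if j in (0, n - 1) else 1.0
        phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
        total += w * phi * f(t * x + math.sqrt(1 - t * t) * z)
    return total * h

# Mehler: U_t He_i = t^i * He_i.  Check i = 3 at x = 1.7, t = 0.4.
t, x, i = 0.4, 1.7, 3
lhs = ou_apply(lambda y: he(i, y), t, x)
rhs = t ** i * he(i, x)
```

Since the Hermite coefficient of degree i is damped by tⁱ, every coefficient of A beyond degree m shrinks by at least t^{m+1} under U_t, which is exactly where the ⟨v, v'⟩^{2(m+1)} factor in the key lemma comes from.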
APPLICATION: SQ LOWER BOUND FOR GMMS (I)
Theorem: Any SQ algorithm that learns separated k-GMMs over R^d to constant error requires either SQ queries of accuracy d^{−Ω(k)} or at least 2^{d^{Ω(1)}} many SQ queries.
Want to show: the theorem follows by applying our generic proposition with m = 2k − 1.

Proposition: Suppose that:
• A matches the first m moments of N(0, 1), and
• the pairwise correlation χ_{N(0,I)}(P_{A,v}, P_{A,v'}) is small whenever v, v' are nearly orthogonal.
Then any SQ algorithm that learns an unknown P_{A,v} within small error requires either queries of accuracy d^{−Ω(m)} or 2^{d^{Ω(1)}} many queries.

APPLICATION: SQ LOWER BOUND FOR GMMS (II)
Lemma: There exists a univariate distribution A that is a k-GMM with components A_i such that:
• A agrees with N(0, 1) on the first 2k − 1 moments.
• Each pair of components is separated.
• χ_{N(0,I)}(P_{A,v}, P_{A,v'}) is small whenever v and v' are nearly orthogonal.
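A sketch of the moment-matching idea for k = 2 (the symmetric mixture below is an illustrative choice, not the talk's exact construction): the mixture 0.5·N(−μ, s²) + 0.5·N(μ, s²) with μ² + s² = 1 matches the first 2k − 1 = 3 moments of N(0, 1), since odd moments vanish by symmetry and the variance is μ² + s².

```python
import math

def gauss_moment(i):
    """E[Z^i] for Z ~ N(0, 1): zero for odd i, (i-1)!! for even i."""
    return 0.0 if i % 2 else float(math.prod(range(1, i, 2)))

def mixture_moment(j, mu, s):
    """j-th raw moment of the 2-GMM 0.5*N(-mu, s^2) + 0.5*N(mu, s^2),
    via the binomial expansion of E[(m + s*Z)^j]."""
    total = 0.0
    for m in (-mu, mu):
        total += 0.5 * sum(math.comb(j, i) * m ** (j - i) * s ** i
                           * gauss_moment(i) for i in range(j + 1))
    return total

mu = 0.8
s = math.sqrt(1.0 - mu ** 2)
# First three moments match N(0, 1): 0, 1, 0.  The 4th moment differs
# from the Gaussian value 3, so matching 2k - 1 moments is tight here.
m1, m2, m3, m4 = (mixture_moment(j, mu, s) for j in (1, 2, 3, 4))
```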
APPLICATION: SQ LOWER BOUND FOR GMMS (III)
The resulting high-dimensional distributions look like "parallel pancakes": thin Gaussian slabs stacked along the hidden direction v, and standard Gaussian in all orthogonal directions.
Efficiently learnable for k = 2. [Brubaker-Vempala'08]
SUMMARY AND FUTURE DIRECTIONS
• General technique to prove SQ lower bounds.
• Robustness can make high-dimensional estimation harder, both computationally and information-theoretically.
Future Directions:
• Further applications of our framework:
  - List-Decodable Mean Estimation [D-Kane-Stewart'18]
  - Discrete Product Distributions [D-Kane-Stewart'18]
  - Robust Regression [D-Kong-Stewart'18]
  - Adversarial Examples [Bubeck-Price-Razenshteyn'18]
• Alternative evidence of computational hardness?
Thanks! Any Questions?