thoughts on the theory of statistics nancy reid ssc 2010 theory of statistics

50
Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

Post on 20-Dec-2015

218 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

Thoughts on the theory of statisticsNancy Reid

Theory of statistics

Page 2: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Statistics in demand

“Statistical science is undergoing unprecedented growth in both opportunity and activity”

High energy physicsArt history Reality miningBioinformaticsComplex surveysClimate and environmentSSC 2010 …

Theory of statistics

Page 3: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Statistical Thinking

Dramatic increase in resources now available

Theory of statistics

Page 4: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Statistical Thinking 1

If a statistic was the answer, what was the question? What are we counting?

Common pitfalls means, medians and outliers

How sure are we? statistical significance and confidence

Percentages and risk relative and absolute change

Theory of statistics

Page 5: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Statistical theory for 20xx

What should we be teaching?If a statistic was the answer, what was the question?

Design of experiments and surveysCommon pitfalls

Summary statistics: sufficiency etc.How sure are we?

InferencePercentages and risk

Interpretation

Theory of statistics

Page 6: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Models and likelihood

Modelling is difficult and importantWe can get a lot from the likelihood functionNot only point estimatorsNot only (not at all!!) most powerful testsInferential quantities (pivots)Inferential distributions (asymptotics)A natural starting point, even for very complex models

Theory of statistics

Page 7: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Likelihood is everywhere! 2

Theory of statistics

Page 8: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

Outline

SSC 2010

1. Higher order asymptoticslikelihood as pivotal

2. Bayesian and non-Bayesian inference3. Partial, quasi, composite likelihood4. Where are we headed?

Theory of statistics

Page 9: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

P-value functions from likelihood

Likelihood as pivotal

Page 10: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

P-value functions from likelihood

Likelihood as pivotal

0.975

0.025

Page 11: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Can be nearly exact

Likelihood as pivotal

Likelihood rootMaximum likelihood estimateScore function All approximately distributed as

Much better :

can be

Page 12: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Can be nearly exact

Likelihood as pivotal

Likelihood rootMaximum likelihood estimateScore function

Page 13: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Can be nearly exact

Likelihood as pivotal

Page 14: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Can be nearly exact

Likelihood as pivotal

Page 15: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Can be nearly exact

Likelihood as pivotal

Page 16: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Can be nearly exact

Likelihood as pivotal

Page 17: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Can be nearly exact 3

Likelihood as pivotal

Page 18: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Using higher order approximations

Likelihood as pivotal

Excellent approximations for ‘easy’ cases Exponential families, non-normal linear regression

More work to construct for ‘moderate’ cases Autoregressive models, fixed and random effects, discrete responses

Fairly delicate for ‘difficult’ cases Complex structural models with several sources of variation

Best results for scalar parameter of interest But we may need inference for vector parameters

Page 19: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Where does this come from?

Likelihood as pivotal

4Amari, 1982, Biometrika; Efron, 1975, Annals

Page 20: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Where does this come from? 5,6,7

Likelihood as pivotal

Differential geometry of statistical modelsTheory of exponential familiesEdgeworth and saddlepoint approximationsKey idea:A smooth parametric model can be approximated by a tangent exponential family modelRequires differentiating log-likelihood function on the sample spacePermits extensions to more complex models

Page 21: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Where does this come from? 8

Likelihood as pivotal

Page 22: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Generalizations

Likelihood as pivotal

To discrete dataWhere differentiating the log-likelihood on the sample

space is more difficultSolution: use expected value of score statistic insteadRelative error instead ofStill better than the normal approximation

Page 23: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Generalizations 9

Likelihood as pivotal

Page 24: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Generalizations 10

Likelihood as pivotal

To vector parameters of interestBut our solutions require a single parameter Solution: use length of the vector, conditioned on the

direction

Page 25: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Generalizations 11

Likelihood as pivotal

Extending the role of the exponential familyBy generalizing differentiation on the sample spaceIdea: differentiate the expected log-likelihood

Instead of the log-likelihoodLeads to a new version of approximating exponential

familyCan be used with pseudo-likelihoods

Page 26: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

What can we learn? 12

Bayesian/nonBayesian

Higher order approximation requiresDifferentiating the log-likelihood function on the sample spaceBayesian inference will be differentAsymptotic expansion highlights the discrepancyBayesian posteriors are in general not calibratedCannot always be corrected by choice of the priorWe can study this by comparing Bayesian and

nonBayesian approximations

Page 27: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Example: inference for ED50 13

Bayesian/nonBayesian

Logistic regression with a single covariateOn the logistic scaleUse flat priors for Parameter of interest isEmpirical coverage of Bayesian posterior intervals:

0.90, 0.88, 0.89, 0.90Empirical coverage of intervals using

0.95, 0.95, 0.95, 0.95

Page 28: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Flat priors are not a good idea! 14

Bayesian/nonBayesian

Page 29: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Flat priors are not a good idea!

Bayesian/nonBayesian

Page 30: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Flat priors are not a good idea!

Bayesian/nonBayesian

Bayesian p-value – Frequentist p-value

Page 31: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

More complex models

Partial, quasi, composite likelihood

Likelihood inference has desirable propertiesSufficiency, asymptotic efficiencyGood approximations to needed distributionsDerived naturally from parametric modelsCan be difficult to construct, especially in complex modelsMany natural extensions: partial likelihood for censored

data, quasi-likelihood for generalized estimating equations, composite likelihood for dependent data

Page 32: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Complex models 14

Partial, quasi, composite likelihood

Example: longitudinal study of migraine sufferersLatent variable Observed variable

E.g. no headache, mild, moderate, intense … Covariates: age, education, painkillers, weather, … random effects between and within subjectsSerial correlation

Page 33: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Likelihood for longitudinal discrete data

Partial, quasi, composite likelihood

Likelihood function

Hard to computeMakes strong assumptionsProposal: use bivariate marginal densities instead of full multivariate normal densitiesGiving a mis-specified model

Page 34: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Composite likelihood

Partial, quasi, composite likelihood

Composite likelihood function

More generally

Sets index marginal or conditional (or …) distributions

Inference based on theory of estimating equations

Page 35: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

A simple example 16

Partial, quasi, composite likelihood

Pairwise likelihood estimator of fully efficientIf , loss of efficiency depends on dimensionSmall for dimension less than, say, 10Falls apart if for fixed sample size

Relevant for time series, genetics applications

Page 36: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Composite likelihood estimator

Partial, quasi, composite likelihood

Godambe information

Page 37: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Recent Applications 17

Partial, quasi, composite likelihood

Longitudinal data, binary and continuous: random effects models

Survival analysis: frailty models, copulas Multi-type responses: discrete and continuous;

markers and event timesFinance: time-varying covariance models Genetics/bioinformatics: CCL for vonMises distribution:

protein folding; gene mapping; linkage disequilibriumSpatial data: geostatistics, spatial point processes

Page 38: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

… and more

Partial, quasi, composite likelihood

Image analysis Rasch modelBradley-Terry model State space modelsPopulation dynamics …

Page 39: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

What can we learn?

Partial, quasi, composite likelihood

Page 40: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

What do we need to know?

Partial, quasi, composite likelihood

Why are composite likelihood estimators efficient?How much information should we use?Are the parameters guaranteed to be identifiable?Are we sure the components are consistent with a

‘true’ model?Can we make progress if not?How do joint densities get constructed?What properties do these constructions have?Is composite likelihood robust?

Page 41: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Why is this important?

Partial, quasi, composite likelihood

Composite likelihood ideas generated from applicationsLikelihood methods seem too complicatedA range of application areas all use the same/similar

ideasAbstraction provided by theory allows us to step back

from the particular applicationGet some understanding about when the methods

might not workAs well as when they are expected to work well

Page 42: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

The role of theory

Where are we headed?

Abstracts the main ideasSimplifies the detailsIsolates particular featuresIn the best scenario, gives new insight into what

underlies our intuitionExample: curvature and Bayesian inferenceExample: composite likelihood Example: false discovery rates

Page 43: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

False discovery rates 18

Where are we headed?

Problem of multiple comparisons Simultaneous statistical inference – R.G. Miller, 1966

Bonferroni correction too strongBenjamini and Hochberg, 1995Introduce False Discovery Rate

An improvement (huge!) on “Type I and Type II error”Then comes data, in this case from astrophysicsGenovese & Wasserman collaborating with Miller and

Nichol

Page 44: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

False discovery rates 19

Where are we headed?

Page 45: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Speculation 20

Where are we headed?

Composite likelihood as a smootherCalibration of posterior inferenceExtension of higher order asymptotics to composite

likelihoodExponential families and empirical likelihoodSemi-parametric and non-parametric models

connected to higher order asymptoticsEffective dimension reduction for inferenceEnsemble methods in machine learning

Page 46: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Speculation 21

Where are we headed?

“in statistics the problems always evolve relative to the development of new data structures and new computational tools” … NSF report

“Statistics is driven by data” … Don McLeish“Our discipline needs collaborations” … Hugh ChipmanHow do we create opportunities? How do we establish an independent identity?In the face of bureaucratic pressures to merge?Keep emphasizing what we do best!!

Page 47: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

SSC 2010

Speculation

Where are we headed?

Engle Variation, modelling, data, theory, data, theory

Tibshirani Cross-validation; forensic statistics

Netflix Grand Prize Recommender systems: machine learning, psychology,

statistics!Tufte

“Visual Display of Quantitative Information” -- 1983

Page 49: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

Thank you!!

Theory of statistics

Page 50: Thoughts on the theory of statistics Nancy Reid SSC 2010 Theory of statistics

End Notes

Theory of statistics

1. “Making Sense of Statistics” Accessed on May 5, 2010. http://www.senseaboutscience.org.uk/2. Midlife Crisis: National Post, January 30, 2008.3. Alessandra Brazzale, Anthony Davison and Reid (2007). Applied Asymptotics. Cambridge

University Press.4. Amari (1982). Biometrika.5. Fraser, Reid, Jianrong Wu. (1999). Biometrika.6. Reid (2003). Annals Statistics7. Fraser (1990). J. Multivariate Anal. 8. Figure drawn by Alessandra Brazzale. From Reid (2003).9. Davison, Fraser, Reid (2006). JRSS B.10. Davison, Fraser, Reid, Nicola Sartori (2010). in progress11. Reid and Fraser (2010). Biometrika12. Fraser, Reid, Elisabetta Marras, Grace Yun-Yi (2010). JRSSB13. Reid and Ye Sun (2009). Communications in Statistics14. J. Heinrich (2003). Phystat Proceedings15. C. Varin, C. Czado (2010). Biostatistics.16. D.Cox, Reid (2004). Biometrika.17. CL references in C.Varin, D.Firth, Reid (2010). Submitted for publication.18. Account of FDR and astronomy taken from Lindsay et al (2004). NSF Report on the Future of

Statistics19. Miller et al. (2001). Science.20. Photo: http://epiac1216.wordpress.com/2008/09/23/origins-of-the-phrase-pie-in-the-sky/ 21. Photo: http://www.bankofcanada.ca/en/banknotes/legislation/images/023361-lg.jpg