fundamentals of statistical exponential families
TRANSCRIPT
Institute of Mathematical StatisticsLECTURE NOTES-MONOGRAPH SERIES
Fundamentals of StatisticalExponential Familieswith Applications in Statistical Decision Theory
Lawrence D. BrownCornell University
Institute of Mathematical Statistics
LECTURE NOTES-MONOGRAPH SERIESShanti S. Gupta, Series Editor
Volume 9
Fundamentals of StatisticalExponential Familieswith Applications in Statistical Decision Theory
Lawrence D. BrownCornell University
Institute of Mathematical StatisticsHayward, California
Institute of Mathematical Statistics
Lecture Notes-Monograph Series
Series Editor, Shanti S. Gupta, Purdue University
The production of the IMS Lecture Notes-Monograph Seriesis managed by the IMS Business Office: Nicholas P. Jewell,Treasurer, and Jose L. Gonzalez, Business Manager.
Library of Congress Catalog Card Number: 87-80020
International Standard Book Number 0-940600-10-2
Copyright© 1986 Institute of Mathematical Statistics
All rights reserved
Printed in the United States of America
PREFACE
I first met exponential families as a beginning graduate student. The
previous summer I had written a short research report under the direction of
Richard Bellman at the RAND Corporation. That report was about a dynamic
programming problem concerning sequential observation of binomial variables.
Jack Kiefer read that report. He conjectured that the properties of the
binomial distribution used there were properties shared by all "Koopman-Darmois"
distributions. (This is a name sometimes used for exponential families, in
honor of the authors of two of the pioneering papers on the topic. See
Koopman (1936), and Darmois (1935), and also Pitman (1936).)
Jack suggested that I recast the paper into the Koopman-Darmois setting.
That suggestion had two objectives. One was the hope that viewing the problem
from this general perspective would lead to a clearer understanding of its
structure and perhaps a simpler and better proof. The other objective was the
hope of generalizing the result from the binomial to other classes of
distributions, for example the Poisson and the gamma. (The resulting manu-
script appeared as Brown (1965).)
These two objectives of clearer understanding and of possible generaliza-
tion in statistical applications are the motivation for this monograph. Many
if not most of the successful mathematical formulations of statistical
questions involve specific exponential families of distributions such as the
normal, the exponential and gamma, the beta, the binomial and the multinomial,
the geometric and the negative binomial, and the Poisson among others. It is
often informative and advantageous to view these mathematical formulations
vi PREFACE
from the perspective of general exponential families.
These notes provide a systematic treatment of the analytic and probabilis-
tic properties of exponential families. This treatment is constructed with a
variety of statistical applications in mind. This basic theory appears in
Chapters 1-3, 5, 6 and the first part of Chapter 7 (through Section 7.11).
Chapter 4, the latter part of Chapter 7, and many of the examples and exercises
elsewhere in the text develop selected statistical applications of the basic
theory.
Almost all the specific statistical applications presented here are
within the area of statistical decision theory. However, as suggested above
the scope of application of exponential families is much wider yet. They are,
for further example, a valuable tool in asymptotic statistical theory. The
presentation of the basic theory here was designed to be also suitable for
applications in this area. Exercises 2.19.1, 5.15.1-5.15.4 and 7.5.1-7.5.5
provide further background for some of these applications. Efron (1975) gives
an elegant example of what can be done in this area.
Some earlier treatments of the general topic have proved helpful to me
and have influenced my presentation, both consciously and unconsciously. The
most important of these is Barndorff-Nielsen (1978). The latter half of that
book treats many of the same topics as the current monograph, although they
are arranged differently and presented from a different point-of-view.
Lehmann (1959) contains an early definitive treatment of some fundamental
results such as Theorems 1.13, 2.2, 2.7 and 2.12. Rockafellar (1970) treats
in great detail the duality theory which appears in Chapters 5 and 6. I found
Johansen (1979) also to be useful, particularly in the preparation of Chapter
1.
The first version of this monograph was prepared during a year's leave at
the Technion, Haifa, and the second was prepared during a temporary appointment
at the Hebrew University, Jerusalem. I wish to express my gratitude to both
those institutions and especially to my colleagues in both departments for
their hospitality, interest, and encouragement. I also want to acknowledge
PREFACE vii
the support from the National Science Foundation which I received throughout
the preparation of this manuscript.
I am grateful to all the colleagues and students who have heard me lecture
on the contents or have read versions of this monograph. Nearly all have
made measurable, positive contributions. Among these I want to specially
thank Richard Ellis, Jiunn Hwang, Iain Johnstone, John Marden, and Yossi Rinott
who have particularly influenced specific portions of the text, Jim Berger who
made numerous valuable suggestions, and above all Roger Farrell who carefully
read and critically and constructively commented on the entire manuscript. The
draft version of the index was prepared by Fu-Hsieng Hsieh.
Finally, I want to thank the editor of this series, Shanti Gupta, for his
gentle but persistent encouragement which made an important contribution to
the completion of this monograph.
TABLE OF CONTENTS
CHAPTER 1. BASIC PROPERTIES 1
Standard Exponential Families 1
Marginal Distributions 8
Reduction to a Minimal Family 13
Random Samples 16
Convexity Property 19
Conditional Distributions 21
Exercises 26
CHAPTER 2. ANALYTIC PROPERTIES 32
Differentiability and Moments 32
Formulas for Moments 34
Analyticity 38
Completeness 42
Mutual Independence 44
Continuity Theorem 48
Total Positivity 53
Partial Order Properties 57
Exercises 60
CHAPTER 3. PARAMETRIZATIONS 70
Steep Families 70
Mean Value Parametrization 73
Mixed Parametrization 78
IX
x TABLE OF CONTENTS
Differentiate Subfamilies 81
Exercises 85
CHAPTER 4. APPLICATIONS 90
Information Inequality 90
Unbiased Estimates of the Risk 99
Generalized Bayes Estimators of Canonical Parameters 106
Generalized Bayes Estimators of Expectation Parameters;
Conjugate Priors 112
Exercises 124
CHAPTER 5. MAXIMUM LIKELIHOOD ESTIMATION 144
Full Families 148
Non-Full Families 152
Convex Parameter Space 153
Fundamental Equation 160
Exercises 167
CHAPTER 6. THE DUAL TO THE MAXIMUM LIKELIHOOD ESTIMATOR 174
Convex Duality 178
Minimum Entropy Parameter 184
Aggregate Exponential Families 191
Exercises 203
CHAPTER 7. TAIL PROBABILITIES 208
Fixed Parameter (Via Chebyshev's Inequality) 208
Fixed Parameter (Via Kullback-Leibler Information) 212
Fixed Reference Set 214
Complete Class Theorems for Tests (Separated Hypotheses) 220
Complete Class Theorems for Tests (Contiguous Hypotheses) 232
Exercises 239
APPENDIX TO CHAPTER 4. POINTWISE LIMITS OF BAYES PROCEDURES 254
REFERENCES 269
INDEX 280