STA 517 – Introduction: Distribution and Inference
Discrete data
Basic data are discretely measured responses such as counts, proportions, nominal variables, ordinal variables, discrete variables with a few values, and continuous variables grouped into a small number of categories.
We illustrate the theoretical results with data examples. We will use the SAS package for this class.
Theory
Multivariate analysis of discrete data, together with the theory underlying such analysis.
Topics:
- Basic principles of statistical methods
- Analysis of Poisson counts
- Cross-classified tables of counts (contingency tables)
- Success/failure records
Problems
Describe and understand the structure of a discrete multivariate distribution.
A sort of “generalization” of regression, with a distinction between response and explanatory variables, where the response is discrete.
- Predictors can be all discrete, or a mixture of discrete and continuous variables
- Log-linear model
- Logistic regression
Topics
1. Introduction: Distributions and Inference for Categorical Data
2. Describing Contingency Tables
3. Inference for Contingency Tables
4. Introduction to Generalized Linear Models
5. Logistic Regression
6. Building and Applying Logistic Regression Models
7. Logit Models for Multinomial Responses
Chapter 1 Example
Chapter 1 - Outline
1.1 Categorical Response Data
1.2 Distributions for Categorical Data
1.3 Statistical Inference for Categorical Data
1.4 Statistical Inference for Binomial Parameters
1.5 Statistical Inference for Multinomial Parameters
1.1 CATEGORICAL RESPONSE DATA
A categorical variable has a measurement scale consisting of a set of categories.
- political philosophy: liberal, moderate, or conservative
- brands of a product: brand A, brand B, and brand C
A categorical variable can be a response variable or an explanatory (independent) variable.
We consider primarily categorical response data in this course.
1.1.1 Response–Explanatory Variable Distinction
Most statistical analyses distinguish between response (or dependent) variables and explanatory (or independent) variables.
For instance, regression models: selling price of a house = f(square footage, location)
In this book we focus on methods for categorical response variables.
As in ordinary regression, explanatory variables can be of any type.
1.1.2 Nominal–Ordinal Scale Distinction
Nominal: variables having categories without a natural ordering
- religious affiliation: Catholic, Protestant, Jewish, Muslim, other
- mode of transportation: automobile, bicycle, bus, subway, walk
- favorite type of music: classical, country, folk, jazz, rock
- choice of residence: apartment, condominium, house, other
For nominal variables, the order of listing the categories is irrelevant. The statistical analysis does not depend on that ordering.
Nominal or Ordinal
Ordinal: variables having ordered categories
- automobile: subcompact, compact, midsize, large
- social class: upper, middle, lower
- political philosophy: liberal, moderate, conservative
- patient condition: good, fair, serious, critical
Ordinal variables have ordered categories, but distances between categories are unknown.
Although a person categorized as moderate is more liberal than a person categorized as conservative, no numerical value describes how much more liberal that person is. Methods for ordinal variables utilize the category ordering.
Interval variable
An interval variable is one that does have numerical distances between any two values.
- blood pressure level
- functional life length of television set
- length of prison term
- annual income
An interval variable is sometimes called a ratio variable if ratios of values are also valid. A ratio variable has a clear definition of 0:
- height
- weight
- enzyme activity
Categories are not as clear-cut as they sound
What kind of variable is color?
- In a psychological study of perception, different colors would be regarded as nominal.
- In a physics study, color is quantified by wavelength, so color would be considered a ratio variable.
What about counts?
- If your dependent variable is the number of cells in a certain volume, what kind of variable is that? It has all the properties of a ratio variable, except that it must be an integer. Is that a ratio variable or not?
These questions just point out that the classification scheme appears to be more comprehensive than it is.
A variable’s measurement scale determines which statistical methods are appropriate.
In the measurement hierarchy, interval variables are highest, ordinal variables are next, and nominal variables are lowest.
Statistical methods for variables of one type can also be used with variables at higher levels but not at lower levels.
For instance, statistical methods for nominal variables can be used with ordinal variables by ignoring the ordering of categories.
Methods for ordinal variables cannot, however, be used with nominal variables, since their categories have no meaningful ordering.
It is usually best to apply methods appropriate for the actual scale.
1.1.3 Continuous–Discrete Variable Distinction
Variables are classified as continuous or discrete according to the number of values they can take.
Actual measurement of all variables occurs in a discrete manner, due to precision limitations in measuring instruments.
The continuous/discrete classification, in practice, distinguishes between variables that take lots of values and variables that take few values.
Statisticians often treat discrete interval variables having a large number of values, such as test scores, as continuous.
This class: discretely measured responses can be
- binary (two categories)
- nominal variables (unordered)
- ordinal variables (ordered)
- discrete interval variables having relatively few values
- continuous variables grouped into a small number of categories
1.1.4 Quantitative–Qualitative Variable Distinction
Nominal variables are qualitative: distinct categories differ in quality, not in quantity.
Interval variables are quantitative: distinct levels have differing amounts of the characteristic of interest.
The position of ordinal variables in the quantitative or qualitative classification is fuzzy.
Analysts often utilize the quantitative nature of ordinal variables by assigning numerical scores to categories or assuming an underlying continuous distribution.
This requires good judgment and guidance from researchers who use the scale, but it provides benefits in the variety of methods available for data analysis.
Summary
Continuous variable Ratio Interval Discrete
Categorical Binary Ordinal Nominal
Calculation:

| OK to compute... | Nominal | Ordinal | Interval | Ratio |
|---|---|---|---|---|
| frequency distribution | Yes | Yes | Yes | Yes |
| median and percentiles | No | Yes | Yes | Yes |
| add or subtract | No | No | Yes | Yes |
| mean, standard deviation, standard error of the mean | No | No | Yes | Yes |
| ratio, or coefficient of variation | No | No | No | Yes |
Example 1: Grades can be measured as
- pass/fail
- A, B, C, D, F
- 3.2, 4.1, 5.0, 2.1, …
Example 2
o Did you get the flu? (Yes or No) – a binary nominal categorical variable
o What was the severity of your flu? (low, medium, or high) – an ordinal categorical variable
Context is important. The context of the study and corresponding questions are important in specifying what kind of variable we will analyze.
1.2 DISTRIBUTIONS FOR CATEGORICAL DATA
Inferential data analyses require assumptions about the random mechanism that generated the data.
For a continuous variable: normal distribution
For a categorical variable:
- Binomial
- Hypergeometric
- Multinomial
- Poisson
Overview of probability and inference
The basic problem we study in probability: Given a data generating process, what are the properties of the outcomes?
The basic problem of statistical inference: Given the outcomes (data), what can we say about the process that generated the data?
Diagram: probability leads from the data generating process to the observed data; inference leads from the observed data back to the data generating process.
Random variable
A random variable is the outcome of an experiment (i.e. a random process) expressed as a number.
We use capital letters near the end of the alphabet (X, Y , Z, etc.) to denote random variables.
Just like variables, probability distributions can be classified as discrete or continuous.
Continuous Probability Distributions
If a random variable is a continuous variable, its probability distribution is called a continuous probability distribution.
A continuous probability distribution differs from a discrete probability distribution in several ways.
The probability that a continuous random variable will assume a particular value is zero.
As a result, a continuous probability distribution cannot be expressed in tabular form.
Instead, an equation or formula is used to describe a continuous probability distribution.
Normal
Most often, the equation used to describe a continuous probability distribution is called a probability density function. Sometimes, it is referred to as a density function, or a PDF.
Normal $N(\mu, \sigma^2)$ PDF:
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}$$
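As a rough illustration of the density above, the following Python sketch evaluates it directly from the formula. The function name `normal_pdf` and the grid used for the numerical check are illustrative choices, not part of the course material.

```python
import math


def normal_pdf(x: float, mu: float, sigma2: float) -> float:
    """Density of N(mu, sigma2) at x; sigma2 is the variance, not the sd."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


# Height of the standard normal density at its mean, which equals 1 / sqrt(2*pi).
peak = normal_pdf(0.0, 0.0, 1.0)

# Crude Riemann sum over [-8, 8] to check that the density integrates to about 1.
step = 0.001
area = sum(normal_pdf(-8.0 + i * step, 0.0, 1.0) for i in range(16001)) * step

print(peak, area)
```

Note that the probability of any single value is still zero; only areas under the curve are probabilities.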
Chi-square distribution, PDF
Discrete random variables
A discrete random variable is one which may take on only a countable number of distinct values, such as 0, 1, 2, 3, 4, ...
Discrete random variables are usually (but not necessarily) counts. If a random variable can take only a finite number of distinct values, then it must be discrete.
Examples:
- the number of children in a family
- the Friday night attendance at a cinema
- the number of patients in a doctor's surgery
- the number of defective light bulbs in a box of ten
discrete random variable
The probability distribution of a discrete random variable is a list of probabilities associated with each of its possible values. It is also sometimes called the probability function or the probability mass function.
Suppose a random variable $X$ may take $k$ different values, with the probability that $X = x_i$ defined to be $P(X = x_i) = p_i$. The probabilities $p_i$ must satisfy the following:
- $0 \le p_i \le 1$ for each $i$
- $p_1 + p_2 + \cdots + p_k = 1$
Example: Suppose a variable $X$ can take the values 1, 2, 3, or 4. The probabilities associated with each outcome are described by the following table:

| Outcome | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Probability | 0.1 | 0.3 | 0.4 | 0.2 |

The probability that $X$ is equal to 2 or 3 is the sum of the two probabilities: $P(X = 2 \text{ or } X = 3) = P(X = 2) + P(X = 3) = 0.3 + 0.4 = 0.7$.
Similarly, the probability that $X$ is greater than 1 is equal to $1 - P(X = 1) = 1 - 0.1 = 0.9$, by the complement rule.
This distribution may also be described by a probability histogram.
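The two calculations in this example can be reproduced with a few lines of Python. This is only a sketch: the dictionary name `pmf` is an illustrative choice, and the probabilities are the ones from the example's table.

```python
# Probability mass function from the example: outcome -> probability.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

# A valid pmf has probabilities in [0, 1] that sum to 1.
assert all(0.0 <= p <= 1.0 for p in pmf.values())
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# P(X = 2 or X = 3): the events are disjoint, so the probabilities add.
p_2_or_3 = pmf[2] + pmf[3]

# P(X > 1) by the complement rule.
p_gt_1 = 1.0 - pmf[1]

print(round(p_2_or_3, 10), round(p_gt_1, 10))
```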
Properties
$$E(X) = \sum_x x\, f(x), \qquad \mathrm{var}(X) = \sum_x \big(x - E(X)\big)^2 f(x)$$
If the distribution depends on unknown parameters $\theta$, we write it as $f(x; \theta)$ or $f(x \mid \theta)$.
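A small sketch of the mean and variance sums, applied to the four-outcome table from the earlier example (outcomes 1 to 4 with probabilities 0.1, 0.3, 0.4, 0.2). The variable names are illustrative.

```python
# pmf from the earlier four-outcome example: outcome -> probability.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

# E(X) is the probability-weighted sum of the outcomes.
mean = sum(x * p for x, p in pmf.items())

# var(X) is the probability-weighted sum of squared deviations from E(X).
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(round(mean, 10), round(variance, 10))
```

For this table the sums work out to $E(X) = 2.7$ and $\mathrm{var}(X) = 0.81$.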
1.2.0 Bernoulli Distribution
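The Bernoulli distribution is a discrete probability distribution that takes value 1 with success probability $\pi$ and value 0 with failure probability $1 - \pi$. So if $X$ is a random variable with this distribution, we have:
$$P(X = 1) = \pi, \qquad P(X = 0) = 1 - \pi$$
or write it as
$$f(x; \pi) = \pi^x (1 - \pi)^{1 - x}, \qquad x = 0, 1$$
Then
$$E(X) = \pi, \qquad \mathrm{var}(X) = \pi(1 - \pi)$$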
1.2.1 Binomial Distribution
Many applications refer to a fixed number n of binary observations.
Let $y_1, y_2, \ldots, y_n$ denote responses for $n$ independent and identical trials (Bernoulli trials).
Identical trials means that the probability of success is the same for each trial.
Independent trials means that the $Y_i$ are independent random variables.
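The total number of successes
$$Y = \sum_{i=1}^{n} Y_i$$
has the binomial distribution with index $n$ and parameter $\pi$, denoted by bin$(n, \pi)$.
The probability mass function is
$$p(y) = \binom{n}{y} \pi^y (1 - \pi)^{n - y}, \qquad y = 0, 1, 2, \ldots, n$$
where
$$\binom{n}{y} = \frac{n!}{y!\,(n - y)!}$$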
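The binomial probability mass function can be evaluated with Python's standard library, as in the sketch below. The helper name `binomial_pmf` and the choice of $n = 25$, $\pi = 0.25$ are illustrative assumptions.

```python
import math


def binomial_pmf(y: int, n: int, pi: float) -> float:
    """P(Y = y) for Y ~ bin(n, pi): C(n, y) * pi^y * (1 - pi)^(n - y)."""
    return math.comb(n, y) * pi**y * (1.0 - pi) ** (n - y)


n, pi = 25, 0.25
probs = [binomial_pmf(y, n, pi) for y in range(n + 1)]

# The probabilities over y = 0, ..., n should sum to 1.
total = sum(probs)

# Most probable number of successes for this n and pi.
mode = max(range(n + 1), key=lambda y: probs[y])

print(total, mode)
```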
Figure: binomial pdf bin$(25, \pi)$ for $\pi = 0.10$, $\pi = 0.25$, and $\pi = 0.50$.
Moments
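Because $Y_i = 1$ or $0$, $Y_i = Y_i^2$, so
$$E(Y_i) = E(Y_i^2) = 1 \times \pi + 0 \times (1 - \pi) = \pi$$
and therefore $\mathrm{var}(Y_i) = E(Y_i^2) - [E(Y_i)]^2 = \pi(1 - \pi)$. For the total $Y = \sum_i Y_i$,
$$\mu = E(Y) = n\pi, \qquad \sigma^2 = \mathrm{var}(Y) = n\pi(1 - \pi)$$
Skewness:
$$\frac{E(Y - \mu)^3}{\sigma^3} = \frac{1 - 2\pi}{\sqrt{n\pi(1 - \pi)}}$$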
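As a numerical sanity check, the sketch below computes the mean, variance, and skewness of a binomial distribution directly from its probability mass function and compares them with the standard closed forms $n\pi$, $n\pi(1-\pi)$, and $(1-2\pi)/\sqrt{n\pi(1-\pi)}$. The values $n = 25$ and $\pi = 0.25$ are illustrative.

```python
import math


def binomial_pmf(y: int, n: int, pi: float) -> float:
    """P(Y = y) for Y ~ bin(n, pi)."""
    return math.comb(n, y) * pi**y * (1.0 - pi) ** (n - y)


n, pi = 25, 0.25
support = range(n + 1)

# Moments computed by summing over the support.
mean = sum(y * binomial_pmf(y, n, pi) for y in support)
variance = sum((y - mean) ** 2 * binomial_pmf(y, n, pi) for y in support)
third_central = sum((y - mean) ** 3 * binomial_pmf(y, n, pi) for y in support)
skewness = third_central / variance**1.5

# Closed-form skewness for comparison.
closed_form_skewness = (1.0 - 2.0 * pi) / math.sqrt(n * pi * (1.0 - pi))

print(mean, variance, skewness, closed_form_skewness)
```

The skewness is positive here because $\pi < 0.5$, and it shrinks toward zero as $n$ grows.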
The distribution converges to normality as n increases
Figure: Binomial(25, 0.25) probabilities overlaid with the approximating normal density.
Figure: Binomial(5, 0.25) probabilities overlaid with the Normal(1.25, 0.96825) density.
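The comparison in these two figures can be sketched numerically. The code below, under the assumption that the approximating normal has mean $n\pi$ and variance $n\pi(1-\pi)$, measures the largest gap between the binomial probabilities and the normal density at the integers for $n = 5$ and $n = 25$. The function names are illustrative.

```python
import math


def binomial_pmf(y: int, n: int, pi: float) -> float:
    """P(Y = y) for Y ~ bin(n, pi)."""
    return math.comb(n, y) * pi**y * (1.0 - pi) ** (n - y)


def normal_pdf(x: float, mu: float, sigma2: float) -> float:
    """Density of N(mu, sigma2) at x; sigma2 is the variance."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def max_abs_gap(n: int, pi: float) -> float:
    """Largest |binomial pmf - normal pdf| over y = 0, ..., n."""
    mu = n * pi
    sigma2 = n * pi * (1.0 - pi)
    return max(abs(binomial_pmf(y, n, pi) - normal_pdf(y, mu, sigma2)) for y in range(n + 1))


gap_5 = max_abs_gap(5, 0.25)
gap_25 = max_abs_gap(25, 0.25)

print(gap_5, gap_25)
```

The gap is noticeably smaller for $n = 25$ than for $n = 5$, in line with the convergence to normality as $n$ increases.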
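1.2.2 Multinomial Distribution
Multiple possible outcomes
Suppose that each of $n$ independent, identical trials can have outcome in any of $c$ categories. Let
$$y_{ij} = \begin{cases} 1 & \text{if trial } i \text{ has outcome in category } j \\ 0 & \text{otherwise} \end{cases}$$
Then $y_i = (y_{i1}, y_{i2}, \ldots, y_{ic})$ represents a multinomial trial, with $\sum_j y_{ij} = 1$.
Let $n_j = \sum_i y_{ij}$ denote the number of trials having outcome in category $j$.
The counts $(n_1, n_2, \ldots, n_c)$ have the multinomial distribution.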
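Probability mass function, where $\pi_j$ denotes the probability of an outcome in category $j$ and $\sum_j n_j = n$:
$$p(n_1, n_2, \ldots, n_c) = \frac{n!}{n_1!\, n_2! \cdots n_c!}\, \pi_1^{n_1} \pi_2^{n_2} \cdots \pi_c^{n_c}$$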
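A minimal sketch of the multinomial probability mass function, $n!/(n_1! \cdots n_c!) \prod_j \pi_j^{n_j}$, follows. The helper name `multinomial_pmf` and the category probabilities (0.5, 0.3, 0.2) are hypothetical values chosen only for illustration.

```python
import math


def multinomial_pmf(counts: tuple, probs: tuple) -> float:
    """Probability of the category counts given the category probabilities."""
    n = sum(counts)
    # Multinomial coefficient n! / (n_1! * ... * n_c!), computed exactly with integers.
    coefficient = math.factorial(n)
    for count in counts:
        coefficient //= math.factorial(count)
    value = float(coefficient)
    for count, p in zip(counts, probs):
        value *= p**count
    return value


probs = (0.5, 0.3, 0.2)  # hypothetical category probabilities

# Three trials, one outcome in each category: 3!/(1!1!1!) * 0.5 * 0.3 * 0.2.
p_one_each = multinomial_pmf((1, 1, 1), probs)

# Summing over every possible count vector with n = 3 should give 1.
all_counts = [(a, b, 3 - a - b) for a in range(4) for b in range(4 - a)]
total = sum(multinomial_pmf(c, probs) for c in all_counts)

print(p_one_each, total)
```

With $c = 2$ categories the same function reduces to the binomial probability mass function.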
1.2.3 Poisson Distribution
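Some count data do not result from a fixed number of trials.
- Example: $y$ = number of deaths due to automobile accidents on motorways in Italy, with $y = 0, 1, 2, \ldots$
Poisson probability mass function (Poisson 1837):
$$p(y) = \frac{e^{-\mu} \mu^y}{y!}, \qquad y = 0, 1, 2, \ldots$$
It satisfies
$$E(Y) = \mathrm{var}(Y) = \mu$$
Skewness:
$$\frac{E(Y - \mu)^3}{\sigma^3} = \frac{1}{\sqrt{\mu}}$$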
Figure: Poisson probability mass functions for $\mu = 5$, $\mu = 10$, and $\mu = 15$.
Poisson Distribution
The Poisson distribution is
- used for counts of events that occur randomly over time or space, when outcomes in disjoint periods or regions are independent;
- an approximation for the binomial when $n$ is large and $\pi$ is small, with $\mu = n\pi$.
For example, suppose $n$ = 50 million people drive in Italy and the death rate per week is $\pi = 0.000002$. The number of deaths is bin$(n, \pi)$, or approximately Poisson with $\mu = n\pi = 100$.
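The quality of this approximation can be checked numerically. The sketch below works on the log scale to avoid overflow with $n$ = 50 million; the helper names and the range $y = 0, \ldots, 200$ over which the two distributions are compared are illustrative choices.

```python
import math


def poisson_pmf(y: int, mu: float) -> float:
    """exp(-mu) * mu^y / y!, computed on the log scale."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def binomial_pmf(y: int, n: int, pi: float) -> float:
    """C(n, y) * pi^y * (1 - pi)^(n - y), computed on the log scale."""
    # log C(n, y) = sum over i < y of [log(n - i) - log(i + 1)].
    log_coefficient = sum(math.log(n - i) - math.log(i + 1) for i in range(y))
    return math.exp(log_coefficient + y * math.log(pi) + (n - y) * math.log1p(-pi))


n = 50_000_000   # drivers, from the example
pi = 0.000002    # weekly death rate, from the example
mu = n * pi      # Poisson mean, about 100

# Largest absolute difference between the two pmfs over y = 0, ..., 200.
max_abs_diff = max(abs(binomial_pmf(y, n, pi) - poisson_pmf(y, mu)) for y in range(201))

print(mu, max_abs_diff)
```

For these values the two probability mass functions agree to many decimal places.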
1.2.4 Overdispersion
A key feature of the Poisson distribution is that its variance equals its mean.
- Sample counts vary more when their mean is higher.
Overdispersion: count observations often exhibit variability exceeding that predicted by the binomial or Poisson.
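One way overdispersion can arise is when the mean itself varies across the units being counted. The sketch below is an assumed illustration, not taken from the slides: half of the units are taken to follow Poisson(5) and half Poisson(15), and the resulting mixture is compared with a single Poisson(10), which has the same mean.

```python
import math


def poisson_pmf(y: int, mu: float) -> float:
    """exp(-mu) * mu^y / y!, computed on the log scale."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def mean_and_variance(pmf, upper: int = 200):
    """Mean and variance of a pmf on 0, 1, ..., upper - 1 (tail beyond is negligible here)."""
    mean = sum(y * pmf(y) for y in range(upper))
    variance = sum((y - mean) ** 2 * pmf(y) for y in range(upper))
    return mean, variance


def mixture_pmf(y: int) -> float:
    """Hypothetical population: half Poisson(5), half Poisson(15)."""
    return 0.5 * poisson_pmf(y, 5.0) + 0.5 * poisson_pmf(y, 15.0)


poisson_mean, poisson_variance = mean_and_variance(lambda y: poisson_pmf(y, 10.0))
mixture_mean, mixture_variance = mean_and_variance(mixture_pmf)

print(poisson_mean, poisson_variance)
print(mixture_mean, mixture_variance)
```

Both distributions have mean 10, but the mixture's variance is 35 rather than 10, so counts from it would look overdispersed relative to a Poisson model.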
1.2.5 Connection between Poisson and Multinomial Distributions
Example: in Italy this next week, let
- $y_1$ = number of people who die in automobile accidents
- $y_2$ = number who die in airplane accidents
- $y_3$ = number who die in railway accidents
$(Y_1, Y_2, Y_3)$ ~ independent Poisson with means $(\mu_1, \mu_2, \mu_3)$.
The total $n = \sum_i Y_i$ ~ Poisson$(\sum_i \mu_i)$.
Here $n$ is a random variable rather than fixed.
If $n$ is given, $(Y_1, Y_2, Y_3)$ are no longer independent Poisson. Why?
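The conditional distribution of the counts given that $\sum_i Y_i = n$ is
$$P\Big(Y_1 = n_1, \ldots, Y_c = n_c \,\Big|\, \sum_j Y_j = n\Big) = \frac{\prod_i e^{-\mu_i} \mu_i^{n_i} / n_i!}{e^{-\sum_j \mu_j} \big(\sum_j \mu_j\big)^n / n!} = \frac{n!}{\prod_i n_i!} \prod_i \pi_i^{n_i}$$
where we let
$$\pi_i = \frac{\mu_i}{\sum_j \mu_j}$$
That is, given $n$, the counts ~ multinomial distribution with index $n$ and probabilities $\{\pi_i\}$.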
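This connection can be verified numerically for a particular case. In the sketch below the Poisson means (2, 3, 5) and the observed counts (1, 2, 3) are hypothetical values; the code divides the joint probability of independent Poisson counts by the Poisson probability of their total and compares the result with the multinomial probability using $\pi_i = \mu_i / \sum_j \mu_j$.

```python
import math


def poisson_pmf(y: int, mu: float) -> float:
    """exp(-mu) * mu^y / y!."""
    return math.exp(-mu) * mu**y / math.factorial(y)


def multinomial_pmf(counts: tuple, probs: tuple) -> float:
    """n! / (n_1! ... n_c!) * prod(pi_i ^ n_i)."""
    coefficient = math.factorial(sum(counts))
    for count in counts:
        coefficient //= math.factorial(count)
    value = float(coefficient)
    for count, p in zip(counts, probs):
        value *= p**count
    return value


mus = (2.0, 3.0, 5.0)   # hypothetical Poisson means
counts = (1, 2, 3)      # hypothetical observed counts
n = sum(counts)
total_mu = sum(mus)

# Joint probability of the independent Poisson counts.
joint = math.prod(poisson_pmf(y, mu) for y, mu in zip(counts, mus))

# Conditional probability given the total: joint / P(total = n).
conditional = joint / poisson_pmf(n, total_mu)

# Multinomial probability with pi_i = mu_i / sum of the means.
probs = tuple(mu / total_mu for mu in mus)
multinomial = multinomial_pmf(counts, probs)

print(conditional, multinomial)
```

The two numbers coincide (up to floating-point rounding), as the algebra predicts.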
Multinomial distribution vs. Poisson distribution
- Many categorical data analyses assume a multinomial distribution.
- Such analyses usually have the same parameter estimates as those of analyses assuming a Poisson distribution, because of the similarity in the likelihood functions.
1.3 STATISTICAL INFERENCE FOR CATEGORICAL DATA (general)
Once you choose the distribution of the categorical variable, you need to estimate the parameters of that distribution.
We first review the general method:
- Point estimate
- Confidence interval
Then:
- Section 1.4: MLE for binomial
- Section 1.5: MLE for multinomial
Likelihood
Likelihood is a tool for summarizing the data's evidence about unknown parameters. Let us denote the unknown parameter(s) of a distribution generically by $\theta$.
If we observe a random variable $X = x$ from distribution $f(x \mid \theta)$, then the likelihood associated with $x$, $l(\theta \mid x)$, is simply the distribution $f(x \mid \theta)$ regarded as a function of $\theta$ with $x$ fixed.
For example, if we observe $x$ from bin$(n, \pi)$, the likelihood function is
$$l(\pi \mid x) = \binom{n}{x}\, \pi^{x} (1-\pi)^{n-x}.$$
Likelihood
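The formula for the likelihood looks algebraically similar to $f(x \mid \theta)$, but the distinction should be clear!
The distribution function is defined over the support of the discrete variable $x$ with $\theta$ given, whereas the likelihood is defined over the continuous parameter space for $\theta$.
Consequently, a graph of the likelihood usually looks different from a graph of the probability distribution.
In most cases, we work with the loglikelihood
$$L(\theta \mid x) = \log l(\theta \mid x).$$
For the binomial example,
$$L(\pi \mid x) = \log l(\pi \mid x) = \log\binom{n}{x} + x \log \pi + (n-x)\log(1-\pi),$$
or, dropping the additive constant that does not involve $\pi$,
$$L(\pi \mid x) = x \log \pi + (n-x)\log(1-\pi).$$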
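A small sketch may help separate the two readings of the same binomial formula. It sums the formula over $x$ with $\pi$ fixed (a probability distribution) and then scans it over $\pi$ with $x$ fixed (a likelihood). The values $n = 5$, $\pi = 0.3$, $x = 1$ and the grid resolution are arbitrary choices for illustration.

```python
from math import comb, log

def binom_likelihood(pi, x, n):
    """l(pi | x): the bin(n, pi) probability of x, read as a function of pi."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

def binom_loglik(pi, x, n):
    """L(pi | x) = log l(pi | x), including the constant log C(n, x)."""
    return log(comb(n, x)) + x * log(pi) + (n - x) * log(1 - pi)

n, pi = 5, 0.3

# As a distribution: fix pi and sum over the support x = 0, ..., n.
total_over_x = sum(binom_likelihood(pi, x, n) for x in range(n + 1))

# As a likelihood: fix x and vary pi over a grid inside (0, 1).
x = 1
grid = [k / 1000 for k in range(1, 1000)]
area_over_pi = sum(binom_likelihood(p, x, n) for p in grid) / 1000
pi_best = max(grid, key=lambda p: binom_loglik(p, x, n))

print(total_over_x)  # probabilities over x sum to 1
print(area_over_pi)  # the area under the likelihood curve is not 1
print(pi_best)       # grid point with the highest loglikelihood
```

The likelihood does not need to integrate to 1 over $\pi$, which is one reason its graph looks different from the graph of the distribution.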
Loglikelihood function for bin$(5, \pi)$ when we observe $x = 0$, $x = 1$, and $x = 2$
[Figure: the loglikelihood $L(\pi \mid x) = x \log \pi + (n-x)\log(1-\pi)$ plotted against $\pi$ on $(0, 1)$. Three panels show bin$(5, \pi)$ with $x = 0$, $x = 1$, and $x = 2$; a fourth panel shows bin$(842+982, \pi)$ with $x = 842$ (yes).]
Likelihood
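In many problems of interest, we will derive our loglikelihood from a sample rather than from a single observation. If we observe an independent sample $x_1, x_2, \dots, x_n$ from a distribution $f(x \mid \theta)$, then the overall likelihood is the product of the individual likelihoods:
$$l(\theta \mid x_1, \dots, x_n) = \prod_{i=1}^{n} f(x_i \mid \theta) = \prod_{i=1}^{n} l(\theta \mid x_i),$$
and the loglikelihood is
$$L(\theta \mid x_1, \dots, x_n) = \log \prod_{i=1}^{n} f(x_i \mid \theta) = \sum_{i=1}^{n} \log f(x_i \mid \theta) = \sum_{i=1}^{n} L(\theta \mid x_i).$$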
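The product-to-sum step can be illustrated with a short sketch. It assumes an independent Poisson sample purely for illustration; the counts and the trial value of the mean are hypothetical.

```python
from math import exp, factorial, log, prod

def poisson_pmf(x, mu):
    """f(x | mu) for a Poisson count."""
    return exp(-mu) * mu**x / factorial(x)

def sample_likelihood(mu, xs):
    """Overall likelihood: product of the individual likelihoods."""
    return prod(poisson_pmf(x, mu) for x in xs)

def sample_loglik(mu, xs):
    """Overall loglikelihood: sum of the individual loglikelihoods."""
    return sum(log(poisson_pmf(x, mu)) for x in xs)

xs = [2, 0, 3, 1, 4]  # hypothetical independent counts
mu = 1.7              # arbitrary trial value of the parameter

print(log(sample_likelihood(mu, xs)), sample_loglik(mu, xs))
```

Working on the log scale also avoids the numerical underflow that a product of many small probabilities can cause.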
Log likelihood
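In regular problems, as the total sample size $n$ grows, the loglikelihood function does two things: (a) it becomes more sharply peaked around its maximum, and (b) its shape becomes nearly quadratic.
The loglikelihood for a normal-mean problem is exactly quadratic.
That is, if we observe $y_1, \dots, y_n$ from a normal population with mean $\mu$ and known variance $\sigma^2$, the loglikelihood is, up to an additive constant not involving $\mu$,
$$L(\mu) = -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \mu)^2 = -\frac{n}{2\sigma^2} (\bar{y} - \mu)^2 + \text{const},$$
or in multi-dimension, with known covariance matrix $\Sigma$,
$$L(\mu) = -\frac{n}{2} (\bar{y} - \mu)^{T} \Sigma^{-1} (\bar{y} - \mu) + \text{const}.$$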
MLE (maximum likelihood estimation)
The ML estimate for $\theta$ is the maximizer of the likelihood $l(\theta)$ or, equivalently, the maximizer of the loglikelihood $L(\theta)$. This is the parameter value under which the observed data have the highest probability of occurrence.
In regular problems, the ML estimate can be found by setting to zero the first derivative(s) of $L(\theta)$ with respect to $\theta$.
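As a rough illustration of "set the first derivative to zero", the sketch below solves the binomial likelihood equation by bisection and compares the root with the sample proportion. It assumes the standard binomial derivative $y/\pi - (n-y)/(1-\pi)$ and $0 < y < n$; the counts 842 out of 842 + 982 are the ones quoted on the loglikelihood figure slide. Bisection is just one convenient root finder, not a method prescribed by the course.

```python
def binom_score(pi, y, n):
    """First derivative of the binomial loglikelihood with respect to pi."""
    return y / pi - (n - y) / (1 - pi)

def solve_likelihood_equation(y, n, tol=1e-12):
    """Bisection root of the score; assumes 0 < y < n so the root is interior."""
    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The score is decreasing in pi: positive left of the MLE, negative right.
        if binom_score(mid, y, n) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

y, n = 842, 842 + 982
pi_hat = solve_likelihood_equation(y, n)
print(pi_hat, y / n)  # numerical root and the sample proportion
```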
Transformations of parameters
If $l(\theta)$ is a likelihood and $\phi = g(\theta)$ is a one-to-one function of the parameter with back-transformation $\theta = g^{-1}(\phi)$, then we can express the likelihood in terms of $\phi$ as $l(g^{-1}(\phi))$.
Transformations may help us to improve the shape of the loglikelihood.
If the parameter space for $\theta$ has boundaries, we may want to choose a transformation that maps it onto the entire real line.
For example, consider the binomial loglikelihood,
$$L(\pi \mid y) = y \log \pi + (n-y)\log(1-\pi).$$
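Binomial loglikelihood
If we apply the logit transformation
$$\beta = \log\frac{\pi}{1-\pi},$$
whose back-transformation is
$$\pi = \frac{e^{\beta}}{1+e^{\beta}},$$
the loglikelihood in terms of $\beta$ is
$$L(\beta) = y\beta - n \log(1 + e^{\beta}).$$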
If we observe $y = 1$ from a binomial with $n = 5$, the loglikelihood in terms of $\beta$ looks like this.
[Figure: the loglikelihood $L(\beta)$ plotted against $\beta$.]
Transformations do not affect the location of the maximum-likelihood (ML) estimate.
If $l(\theta)$ is maximized at $\hat\theta$, then the likelihood expressed in terms of $\phi$, $l(g^{-1}(\phi))$, is maximized at $\hat\phi = g(\hat\theta)$.
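The invariance of the ML estimate can be seen in a quick sketch for the binomial with the logit transformation, using $y = 1$ and $n = 5$ as on the earlier slide. The logit-scale loglikelihood $y\beta - n\log(1+e^{\beta})$ is the standard form obtained by substituting the back-transformation; the grid search and its step size are arbitrary illustration choices.

```python
from math import exp, log, log1p

def loglik_pi(pi, y, n):
    """Binomial loglikelihood on the probability scale (constant dropped)."""
    return y * log(pi) + (n - y) * log(1 - pi)

def loglik_beta(beta, y, n):
    """The same loglikelihood written in terms of beta = logit(pi)."""
    return y * beta - n * log1p(exp(beta))

def logit(pi):
    return log(pi / (1 - pi))

def expit(beta):
    return exp(beta) / (1 + exp(beta))

y, n = 1, 5
pi_hat = y / n            # maximizer on the pi scale
beta_hat = logit(pi_hat)  # invariance: maximizer on the beta scale

# Crude grid search over beta to locate the maximum directly.
betas = [k / 1000 for k in range(-6000, 6001)]
beta_grid_max = max(betas, key=lambda b: loglik_beta(b, y, n))

print(beta_hat, beta_grid_max)
```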
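Score function
A first derivative of the loglikelihood $L(\theta)$ with respect to $\theta$ is called a score function, or simply a score.
In a one-parameter problem, the score function from an independent sample $y_1, \dots, y_n$ is
$$L'(\theta) = \sum_{i=1}^{n} u_i(\theta),$$
where
$$u_i(\theta) = \frac{\partial}{\partial\theta} \log f(y_i \mid \theta)$$
is the score contribution for $y_i$.
The ML estimate is usually the solution of the likelihood equation $L'(\theta) = 0$.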
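Mean of the score function
A well-known property of the score is that it has mean zero.
The score is an expression that involves both the parameter $\theta$ and the data $Y$. Because it involves $Y$, we can take its expectation with respect to the data distribution $f(y \mid \theta)$. The expected score is no longer a function of $Y$, but it is still a function of $\theta$. If we evaluate this expected score at the "true value" of $\theta$ (that is, at the same value of $\theta$ assumed when we took the expectation), we get zero:
$$E[L'(\theta)] = \int L'(\theta)\, f(y \mid \theta)\, dy = 0.$$
If certain differentiability conditions are met, the integral may be rewritten as
$$\int \frac{\partial f(y \mid \theta)/\partial\theta}{f(y \mid \theta)}\, f(y \mid \theta)\, dy = \frac{\partial}{\partial\theta} \int f(y \mid \theta)\, dy = \frac{\partial}{\partial\theta}\, 1 = 0.$$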
For example, in the case of the binomial proportion, we have
$$E[L'(\pi)] = E\left[\frac{Y}{\pi} - \frac{n-Y}{1-\pi}\right] = \frac{E(Y) - n\pi}{\pi(1-\pi)},$$
which is zero because $E(Y) = n\pi$.
If we apply a one-to-one transformation to the parameter, $\phi = g(\theta)$, then the score function with respect to the new parameter $\phi$ also has mean zero.
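The mean-zero property can be checked exactly for a small binomial by summing score times probability over all outcomes. The sketch assumes the standard binomial score $y/\pi - (n-y)/(1-\pi)$; the values $n = 10$ and $\pi = 0.3$ are arbitrary.

```python
from math import comb

def binom_pmf(y, n, pi):
    return comb(n, y) * pi**y * (1 - pi)**(n - y)

def binom_score(pi, y, n):
    """Derivative of the binomial loglikelihood with respect to pi."""
    return y / pi - (n - y) / (1 - pi)

def expected_score(pi_eval, pi_true, n):
    """E[score evaluated at pi_eval] when the data come from bin(n, pi_true)."""
    return sum(binom_score(pi_eval, y, n) * binom_pmf(y, n, pi_true)
               for y in range(n + 1))

at_truth = expected_score(0.3, 0.3, 10)   # evaluated at the true value
off_truth = expected_score(0.4, 0.3, 10)  # evaluated at a different value

print(at_truth, off_truth)
```

Evaluated away from the true value, the expected score is no longer zero, which is what makes the score equation informative about $\pi$.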
Estimating functions
This property of the score function (that it has an expectation of zero when evaluated at the true parameter $\theta$) is a key to the modern theory of statistical estimation.
In the original theory of likelihood-based estimation, as developed by R.A. Fisher and others, the ML estimate $\hat\theta$ is viewed as the value of the parameter that, under the parametric model, makes the observed data most likely.
More recently, statisticians have begun to view $\hat\theta$ as the solution to the score equation(s). That is, we now often view an ML estimate as the solution to $L'(\theta) = 0$.
Estimating equations
The score is not the only function with the mean-zero property: any function of the data and the parameters having mean zero at the true $\theta$ can be used in the same way. Functions having the mean-zero property are called estimating functions.
The equations obtained by setting the estimating functions to zero are called estimating equations.
In the case of the binomial proportion, for example, $Y - n\pi$ is a mean-zero estimating function, and so is $\pi^{-1}(Y - n\pi)$.
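Information and variance estimation
The variance of the score is known as the Fisher information. In the case of a single parameter, the Fisher information is
$$i(\theta) = \mathrm{Var}[L'(\theta)] = E\{[L'(\theta)]^2\},$$
where the second equality holds because the score has mean zero.
If $\theta$ has $k$ parameters, the Fisher information is the $k \times k$ covariance matrix of the scores,
$$i(\theta) = E[L'(\theta)\, L'(\theta)^{T}].$$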
Like the score function, the Fisher information is also a function of $\theta$, so we can evaluate it at any given value of $\theta$.
Notice that $i(\theta)$ as we defined it is the expected square of a sum, which in many problems can be messy.
To actually compute the Fisher information, we usually make use of the well-known identity
$$i(\theta) = -E[L''(\theta)].$$
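For the binomial, both sides of this identity can be computed exactly by summing over the outcomes. The sketch assumes the standard binomial score and second derivative and compares them with the closed form $n/[\pi(1-\pi)]$; $n = 10$ and $\pi = 0.3$ are arbitrary.

```python
from math import comb

def binom_pmf(y, n, pi):
    return comb(n, y) * pi**y * (1 - pi)**(n - y)

def score(pi, y, n):
    """First derivative of the binomial loglikelihood."""
    return y / pi - (n - y) / (1 - pi)

def second_derivative(pi, y, n):
    """Second derivative of the binomial loglikelihood."""
    return -y / pi**2 - (n - y) / (1 - pi)**2

n, pi = 10, 0.3
outcomes = range(n + 1)

# Variance of the score (its mean is zero, so this is E[score^2]).
var_score = sum(score(pi, y, n)**2 * binom_pmf(y, n, pi) for y in outcomes)
# Minus the expected second derivative.
neg_expected_hessian = -sum(second_derivative(pi, y, n) * binom_pmf(y, n, pi)
                            for y in outcomes)
closed_form = n / (pi * (1 - pi))

print(var_score, neg_expected_hessian, closed_form)
```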
In the multiparameter case, $L''(\theta)$ is the $k \times k$ matrix of second derivatives whose $(l, m)$th element is
$$\frac{\partial^2 L(\theta)}{\partial\theta_l\, \partial\theta_m}.$$
Why do we care about the Fisher information?
It provides us with a way (several ways, actually) of assessing the uncertainty in the ML estimate.
It is well known that, in regular problems, $\hat\theta$ is approximately normally distributed about the true $\theta$ with variance given by the reciprocal (or, in the multiparameter case, the matrix inverse) of the Fisher information.
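Two common ways to approximate the variance of $\hat\theta$
The first way is to plug $\hat\theta$ into $i(\theta)$ and invert,
$$\widehat{\mathrm{Var}}(\hat\theta) = [i(\hat\theta)]^{-1};$$
the quantity being inverted is commonly called the "expected information."
The second way is to invert (minus one times) the actual second derivative of the loglikelihood at $\theta = \hat\theta$,
$$\widehat{\mathrm{Var}}(\hat\theta) = [-L''(\hat\theta)]^{-1};$$
the quantity being inverted is commonly called the "observed information."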
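A brief sketch compares the two variance approximations for a binomial proportion. It assumes the standard binomial results $i(\pi) = n/[\pi(1-\pi)]$ and $-L''(\pi) = y/\pi^2 + (n-y)/(1-\pi)^2$, and reuses the counts 842 out of 842 + 982 from the loglikelihood figure slide. For this particular model the two approaches happen to coincide at $\hat\pi$; in general they can differ.

```python
from math import sqrt

def expected_information(pi, n):
    """Fisher information i(pi) for bin(n, pi)."""
    return n / (pi * (1 - pi))

def observed_information(pi, y, n):
    """Minus the second derivative of the binomial loglikelihood at pi."""
    return y / pi**2 + (n - y) / (1 - pi)**2

y, n = 842, 842 + 982
pi_hat = y / n

var_expected = 1 / expected_information(pi_hat, n)
var_observed = 1 / observed_information(pi_hat, y, n)

print(sqrt(var_expected), sqrt(var_observed))  # the two standard errors
```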
1.3.2 Likelihood Function and ML Estimate for Binomial Parameter
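The binomial log likelihood is
$$L(\pi) = \log\left[\pi^{y}(1-\pi)^{n-y}\right] = y \log \pi + (n-y)\log(1-\pi).$$
Differentiating with respect to $\pi$ yields
$$\frac{\partial L(\pi)}{\partial\pi} = \frac{y}{\pi} - \frac{n-y}{1-\pi} = \frac{y - n\pi}{\pi(1-\pi)}.$$
Equating this to 0 gives the likelihood equation, which has solution
$$\hat\pi = y/n,$$
the sample proportion of successes for the $n$ trials.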
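Calculating $\partial^2 L(\pi)/\partial\pi^2$, taking the expectation, and changing the sign, we get
$$i(\pi) = -E\left[\frac{\partial^2 L(\pi)}{\partial\pi^2}\right] = E\left[\frac{Y}{\pi^2} + \frac{n-Y}{(1-\pi)^2}\right] = \frac{n}{\pi(1-\pi)}.$$
Thus, the asymptotic variance of $\hat\pi$ is $\pi(1-\pi)/n$.
Actually, from $E(Y) = n\pi$ and $\mathrm{var}(Y) = n\pi(1-\pi)$, the distribution of $\hat\pi = Y/n$ has mean and standard error
$$E(\hat\pi) = \pi, \qquad \sigma(\hat\pi) = \sqrt{\frac{\pi(1-\pi)}{n}}.$$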
Likelihood function and MLE summary
We use the maximum likelihood estimate (MLE), which is asymptotically normal, consistent, and asymptotically efficient.
Likelihood function: the probability of the observed data, treated as a function of the unknown parameter.
Maximum likelihood (ML) estimate: the parameter value that maximizes this function.
MLE and its variance
If $y_1, y_2, \dots, y_n$ is a random sample from distribution $f(y \mid \theta)$, then the score function is
$$L'(\theta) = \sum_{i=1}^{n} \frac{\partial}{\partial\theta} \log f(y_i \mid \theta).$$
In regular problems, we can find the ML estimate by setting the score function(s) to zero and solving for $\theta$.
The equations $L'(\theta) = 0$ are called the score equations. More generally, they can be called estimating equations because their solution is the estimate for $\theta$.
We defined the Fisher information as the variance of the score function, and
$$i(\theta) = \mathrm{Var}[L'(\theta)] = -E[L''(\theta)].$$
1.3.3 Wald–Likelihood Ratio–Score Test Triad
Three standard ways exist to use the likelihood function to perform large-sample inference: the Wald test, the score test, and the likelihood-ratio test.
We introduce these for a significance test of a null hypothesis $H_0: \theta = \theta_0$ and then discuss their relation to interval estimation.
They all exploit the large-sample normality of ML estimators.
Wald test
With nonnull standard error SE of $\hat\theta$, the test statistic
$$z = (\hat\theta - \theta_0)/SE$$
has an approximate standard normal distribution when $\theta = \theta_0$.
One- or two-sided P-values are obtained from $z$. Alternatively, $z^2$ has a chi-squared null distribution with 1 df; the P-value is then the right-tailed chi-squared probability above the observed value.
This type of statistic, using the nonnull standard error, is called a Wald statistic (Wald 1943).
Wald test
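For a .05-level two-sided test, we reject $H_0$ if
$$\frac{|\hat\theta - \theta_0|}{SE} > 1.96.$$
Equivalently, if
$$\frac{(\hat\theta - \theta_0)^2}{\mathrm{var}(\hat\theta)} > 1.96^2 = 3.84,$$
where 3.84 is the 95th percentile of $\chi^2(1)$.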
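The two equivalent forms of the Wald test can be sketched as follows. The function name `wald_test` and the example numbers are hypothetical; the chi-squared tail for 1 df is computed from the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$, which follows from $\chi^2_1$ being a squared standard normal.

```python
from math import erfc, sqrt
from statistics import NormalDist

def wald_test(estimate, null_value, se):
    """Return z, the two-sided normal P-value, and the chi-squared(1) P-value."""
    z = (estimate - null_value) / se
    p_normal = 2 * (1 - NormalDist().cdf(abs(z)))
    p_chi2 = erfc(sqrt(z * z / 2))  # right tail of chi-squared(1) above z^2
    return z, p_normal, p_chi2

# Hypothetical estimate 0.62 with SE 0.05, testing H0: theta = 0.5.
z, p_normal, p_chi2 = wald_test(0.62, 0.5, 0.05)
print(z, p_normal, p_chi2)  # the two P-values agree
```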
Wald test
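The multivariate extension for the Wald test of $H_0: \theta = \theta_0$ has test statistic
$$W = (\hat\theta - \theta_0)^{T}\, [\mathrm{cov}(\hat\theta)]^{-1}\, (\hat\theta - \theta_0),$$
where $\mathrm{cov}(\hat\theta)$ is the inverse of the information matrix.
$W$ has an asymptotic chi-squared distribution with df = rank of $\mathrm{cov}(\hat\theta)$.
The Wald test is not invariant to transformations. That is, a Wald test on a transformed parameter $\phi = g(\theta)$ may yield a different P-value than a Wald test on the original scale.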
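The lack of invariance is easy to see in a small sketch that tests the same binomial hypothesis on the proportion scale and on the logit scale. The data ($y = 15$ successes in $n = 20$ trials, $H_0: \pi = 0.5$) are hypothetical, and the logit-scale standard error $1/\sqrt{n\hat\pi(1-\hat\pi)}$ is the usual delta-method approximation, which is an assumption not stated in the slides.

```python
from math import log, sqrt

def wald_z_proportion(y, n, pi0):
    """Wald z statistic for H0: pi = pi0 on the proportion scale."""
    pi_hat = y / n
    return (pi_hat - pi0) / sqrt(pi_hat * (1 - pi_hat) / n)

def wald_z_logit(y, n, pi0):
    """Wald z statistic for the same hypothesis on the logit scale."""
    pi_hat = y / n
    se_logit = sqrt(1 / (n * pi_hat * (1 - pi_hat)))  # delta-method SE
    return (log(pi_hat / (1 - pi_hat)) - log(pi0 / (1 - pi0))) / se_logit

z_prop = wald_z_proportion(15, 20, 0.5)
z_logit = wald_z_logit(15, 20, 0.5)
print(z_prop, z_logit)  # same data and hypothesis, different statistics
```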
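Likelihood ratio test
The likelihood-ratio test uses the likelihood function through the ratio of two maximizations: (1) the maximum over the possible parameter values under $H_0$; (2) the maximum over the larger set of parameter values permitting $H_0$ or an alternative $H_a$ to be true.
The likelihood-ratio test statistic equals
$$-2\log\Lambda = -2\log(l_0/l_1) = -2(L_0 - L_1),$$
where $l_0$ and $l_1$ denote the two maximized likelihoods and $L_0$ and $L_1$ denote the maximized log-likelihood functions.
Under $H_0$, $-2(L_0 - L_1)$ has an asymptotic $\chi^2$ distribution with df = dim($H_a \cup H_0$) − dim($H_0$).
Reject $H_0$ if $-2(L_0 - L_1) > \chi^2_{df}(\alpha)$, for example with $\alpha = 0.05$.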
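Score test
The score test is based on the slope and expected curvature of the log-likelihood function $L(\theta)$ at the null value $\theta_0$.
Score function:
$$u(\theta_0) = \left.\frac{\partial L(\theta)}{\partial\theta}\right|_{\theta = \theta_0}.$$
The value $u(\theta_0)$ tends to be larger in absolute value when $\hat\theta$ is farther from $\theta_0$.
Score statistic:
$$z = \frac{u(\theta_0)}{\left[-E\left(\partial^2 L(\theta)/\partial\theta_0^2\right)\right]^{1/2}}$$
has an approximate standard normal null distribution. The chi-squared form of the score statistic is
$$z^2 = \frac{[u(\theta_0)]^2}{-E\left(\partial^2 L(\theta)/\partial\theta_0^2\right)}.$$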
Why is the score statistic reasonable?
Recall that the mean of the score is zero and its variance is equal to the Fisher information.
In a large sample, the score will also be approximately normally distributed because it is a sum of iid random variables.
Therefore, the squared score divided by the Fisher information will behave like a squared standard normal [$\chi^2(1)$] if $H_0$ is true.
Wald–Likelihood Ratio–Score Test
The three test statistics (Wald, LR, and score) are asymptotically equivalent.
The differences among them vanish in large samples if the null hypothesis is true.
If the null hypothesis is false, they may take very different values. But in that case, in large samples, all the test statistics will be large, the P-values will be essentially zero, and they will all lead us to reject $H_0$.
The score test does not require calculating the MLE. The LR test is scale-invariant. The LR statistic uses the most information of the three types of test statistic and is the most versatile.
1.3.4 Constructing confidence intervals
In practice, it is more informative to construct confidence intervals for parameters than to test hypotheses about their values.
For any of the three test methods, a confidence interval results from inverting the test. For instance, a 95% confidence interval for $\theta$ is the set of $\theta_0$ for which the test of $H_0: \theta = \theta_0$ has a P-value exceeding 0.05.
Let $z_a$ denote the z-score from the standard normal distribution having right-tailed probability $a$; this is the $100(1-a)$ percentile of that distribution.
Let $\chi^2_{df}(a)$ denote the $100(1-a)$ percentile of the chi-squared distribution with degrees of freedom df.
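These percentiles can be computed with Python's standard library, at least for the 1-df case used throughout this section. The helper names below are hypothetical; the chi-squared percentile relies on the fact that a chi-squared variable with 1 df is a squared standard normal, so the sketch does not cover df > 1.

```python
from statistics import NormalDist

def z_upper(a):
    """z_a: standard normal value with right-tailed probability a."""
    return NormalDist().inv_cdf(1 - a)

def chi2_1_upper(a):
    """100(1 - a) percentile of chi-squared with 1 df, via z_{a/2} squared."""
    return z_upper(a / 2) ** 2

print(z_upper(0.025))      # about 1.96
print(chi2_1_upper(0.05))  # about 3.84
```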
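Tests and Confidence Intervals
At significance level $\alpha$, the Wald test rejects $H_0: \theta = \theta_0$ if
$$\frac{|\hat\theta - \theta_0|}{SE} > z_{\alpha/2}.$$
The $100(1-\alpha)\%$ confidence interval is the set of $\theta_0$ values that are not rejected:
$$\left\{\theta_0 : \frac{|\hat\theta - \theta_0|}{SE} < z_{\alpha/2}\right\}.$$
The score and likelihood-ratio intervals are defined in the same way from their own test statistics:
$$\left\{\theta_0 : \frac{|u(\theta_0)|}{[i(\theta_0)]^{1/2}} < z_{\alpha/2}\right\}, \qquad \left\{\theta_0 : -2[L(\theta_0) - L(\hat\theta)] < \chi^2_1(\alpha)\right\}.$$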
Confidence Intervals
The Wald confidence interval is most common in practice because it is simple to construct using ML estimates and standard errors reported by statistical software.
The likelihood-ratio-based interval is becoming more widely available in software and is preferable for categorical data with small to moderate n.
For the best known statistical model, regression for a normal response, the three types of inference necessarily provide identical results.
1.4 STATISTICAL INFERENCE FOR BINOMIAL PARAMETERS
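Recall the log likelihood
$$L(\pi \mid y) = y \log \pi + (n-y)\log(1-\pi).$$
Score function:
$$u(\pi) = \frac{y}{\pi} - \frac{n-y}{1-\pi}.$$
MLE:
$$\hat\pi = y/n.$$
Standard error:
$$SE = \sqrt{\hat\pi(1-\hat\pi)/n}.$$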
1.4.1 Tests about a Binomial Parameter
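Since $H_0: \pi = \pi_0$ has a single parameter, we use the normal rather than chi-squared forms of the Wald and score test statistics. They permit tests against one-sided as well as two-sided alternatives.
Wald statistic:
$$z_W = \frac{\hat\pi - \pi_0}{SE} = \frac{\hat\pi - \pi_0}{\sqrt{\hat\pi(1-\hat\pi)/n}}.$$
Evaluating the binomial score and information at $\pi_0$:
$$u(\pi_0) = \frac{y}{\pi_0} - \frac{n-y}{1-\pi_0}, \qquad i(\pi_0) = \frac{n}{\pi_0(1-\pi_0)}.$$
The normal form of the score statistic simplifies to
$$z_S = \frac{u(\pi_0)}{[i(\pi_0)]^{1/2}} = \frac{y - n\pi_0}{\sqrt{n\pi_0(1-\pi_0)}} = \frac{\hat\pi - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}}.$$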
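The binomial log-likelihood under $H_0$ is
$$L_0 = y \log \pi_0 + (n-y)\log(1-\pi_0).$$
Under $H_a$,
$$L_1 = y \log \hat\pi + (n-y)\log(1-\hat\pi).$$
The likelihood-ratio test statistic
$$-2(L_0 - L_1) = 2\left[y \log\frac{\hat\pi}{\pi_0} + (n-y)\log\frac{1-\hat\pi}{1-\pi_0}\right],$$
or
$$-2(L_0 - L_1) = 2\left[y \log\frac{y}{n\pi_0} + (n-y)\log\frac{n-y}{n - n\pi_0}\right],$$
has an asymptotic chi-squared distribution with df = 1.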
Test
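At significance level $\alpha$, two-sided, reject $H_0: \pi = \pi_0$ if
$|\hat\pi - \pi_0| / \sqrt{\hat\pi(1-\hat\pi)/n} > z_{\alpha/2}$ (Wald test)
$|\hat\pi - \pi_0| / \sqrt{\pi_0(1-\pi_0)/n} > z_{\alpha/2}$ (Score test)
$-2(L_0 - L_1) > \chi^2_1(\alpha)$ (LR test)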
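The three binomial tests can be put side by side in a short sketch. The function names and the data ($y = 15$, $n = 20$, $\pi_0 = 0.5$) are hypothetical, and the formulas are the standard binomial Wald, score, and likelihood-ratio statistics. P-values use `erfc` identities for the normal and 1-df chi-squared tails.

```python
from math import erfc, log, sqrt

def xlogy(x, y):
    """x * log(y) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(y)

def binomial_tests(y, n, pi0):
    """Wald z, score z, and LR chi-squared statistics for H0: pi = pi0."""
    pi_hat = y / n
    # The Wald SE is zero when y is 0 or n, so this line fails in that case.
    z_wald = (pi_hat - pi0) / sqrt(pi_hat * (1 - pi_hat) / n)
    z_score = (pi_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)
    lr = 2 * (xlogy(y, pi_hat / pi0) + xlogy(n - y, (1 - pi_hat) / (1 - pi0)))
    return z_wald, z_score, lr

def p_two_sided(z):
    """Two-sided standard normal P-value."""
    return erfc(abs(z) / sqrt(2))

def p_chi2_1(x):
    """Right-tailed chi-squared(1) P-value."""
    return erfc(sqrt(x / 2))

z_wald, z_score, lr = binomial_tests(15, 20, 0.5)
print(z_wald, p_two_sided(z_wald))
print(z_score, p_two_sided(z_score))
print(lr, p_chi2_1(lr))
```

With a sample this small the three statistics differ noticeably, which is consistent with their equivalence being only asymptotic.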
1.4.2 Confidence Intervals for a Binomial Parameter
Inverting the Wald test gives the interval
$$\hat\pi \pm z_{\alpha/2} \sqrt{\hat\pi(1-\hat\pi)/n}.$$
Unfortunately, it performs poorly unless $n$ is very large. The actual coverage probability usually falls below the nominal confidence coefficient, and much below it when $\pi$ is near 0 or 1.
An adjustment is needed (Problem 1.24).
Simulation to calculate the coverage probability
/* Wald interval: draw &simuN binomial counts and record how often the 95% interval covers &pi */
%let n=1000; %let pi=0.5; %let simuN=10000;
data simu; drop i;
  do i=1 to &simuN;
    k=RAND('BINOMIAL',&pi,&n); output;   /* arguments: success probability, number of trials */
  end;
run;
data res; set simu;
  pihat=k/&n;
  lci=pihat-1.96*sqrt(pihat*(1-pihat)/&n);
  uci=pihat+1.96*sqrt(pihat*(1-pihat)/&n);
  if lci>&pi or uci<&pi then cover=0; else cover=1;
run;
proc sql;
  select sum(cover)/&simuN as coverageprobability from res;
quit;
The Wald interval performs poorly if (1) $n$ is small or (2) $\pi$ is near 0 or 1. Compare the simulated coverage under these settings:
%let n=1000; %let pi=0.5; %let simuN=10000;
%let n=20; %let pi=0.5; %let simuN=10000;
%let n=20; %let pi=0.1; %let simuN=10000;
%let n=20; %let pi=0.9; %let simuN=10000;
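As a cross-check on the simulation, the coverage probability of the 95% Wald interval can also be computed exactly, by adding up the binomial probabilities of every count whose interval contains the true $\pi$. The Python sketch below is not part of the course's SAS material; it applies the same covering rule as the SAS code and loops over the same four settings.

```python
from math import exp, lgamma, log, sqrt

def binom_pmf(y, n, pi):
    """Binomial probability computed on the log scale to avoid overflow."""
    log_coef = lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
    return exp(log_coef + y * log(pi) + (n - y) * log(1 - pi))

def wald_coverage(n, pi, z=1.96):
    """Exact probability that the Wald interval covers the true pi."""
    total = 0.0
    for y in range(n + 1):
        pi_hat = y / n
        half = z * sqrt(pi_hat * (1 - pi_hat) / n)
        if pi_hat - half <= pi <= pi_hat + half:
            total += binom_pmf(y, n, pi)
    return total

for n, pi in [(1000, 0.5), (20, 0.5), (20, 0.1), (20, 0.9)]:
    print(n, pi, round(wald_coverage(n, pi), 4))
```

Because this sums over all outcomes instead of sampling, the result has no Monte Carlo error, so it can serve as a reference value for the simulated coverage.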
An adjustment is needed. (Problem 1.24)
/* Adjusted interval: add z^2/2 successes and z^2/2 failures (z=1.96) before forming the Wald-type interval */
%let n=20; %let pi=0.5; %let simuN=10000;
data simu; drop i;
  do i=1 to &simuN;
    k=RAND('BINOMIAL',&pi,&n); output;
  end;
run;
data res; set simu;
  pihat=(k+1.96*1.96/2)/(&n+1.96*1.96);   /* midpoint of the score interval */
  lci=pihat-1.96*sqrt(pihat*(1-pihat)/(&n+1.96*1.96));
  uci=pihat+1.96*sqrt(pihat*(1-pihat)/(&n+1.96*1.96));
  if lci>&pi or uci<&pi then cover=0; else cover=1;
run;
proc sql;
  select sum(cover)/&simuN as coverageprobability from res;
quit;
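Score confidence interval
The score confidence interval contains the $\pi_0$ values for which
$$\frac{|\hat\pi - \pi_0|}{\sqrt{\pi_0(1-\pi_0)/n}} < z_{\alpha/2}.$$
Its endpoints are the $\pi_0$ solutions to the equations
$$\frac{\hat\pi - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}} = \pm z_{\alpha/2}.$$
Squaring both sides gives an equation that is quadratic in $\pi_0$. With $z = z_{\alpha/2}$, this interval is
$$\frac{\hat\pi + \dfrac{z^2}{2n} \pm z\sqrt{\dfrac{\hat\pi(1-\hat\pi)}{n} + \dfrac{z^2}{4n^2}}}{1 + \dfrac{z^2}{n}}.$$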
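The closed-form solution of that quadratic is often called the Wilson interval. The sketch below computes it and then confirms that each endpoint gives a score statistic of exactly $\pm z$. The function names and the data ($y = 15$, $n = 20$) are hypothetical.

```python
from math import sqrt

def score_interval(y, n, z=1.96):
    """Endpoints of the score (Wilson) confidence interval for pi."""
    pi_hat = y / n
    denom = 1 + z * z / n
    center = (pi_hat + z * z / (2 * n)) / denom
    half = z * sqrt(pi_hat * (1 - pi_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

def score_z(pi_hat, pi0, n):
    """Score statistic for H0: pi = pi0, using the null standard error."""
    return (pi_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)

lower, upper = score_interval(15, 20)
print(lower, upper)
print(score_z(15 / 20, lower, 20), score_z(15 / 20, upper, 20))  # +z and -z
```

Unlike the Wald interval, this interval stays inside $[0, 1]$ and does not collapse to a single point when $y = 0$ or $y = n$.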
LR-based confidence interval
The likelihood-ratio-based confidence interval is more complex computationally, but simple in principle.
It is the set of $\pi_0$ for which the likelihood ratio test has a P-value exceeding $\alpha$.
Equivalently, it is the set of $\pi_0$ for which double the log likelihood drops by less than $\chi^2_1(\alpha)$ from its value at the ML estimate.
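Since the likelihood-ratio interval has no closed form, one simple way to find it is to search numerically for the two values of $\pi_0$ at which the statistic equals the chi-squared critical value. The sketch below does this by bisection on each side of $\hat\pi$. The helper names, the bisection approach, and the data are illustrative assumptions; the constant 3.841459 is $1.959964^2$.

```python
from math import log

CHI2_1_95 = 3.841459  # 95th percentile of chi-squared(1)

def xlogy(x, y):
    """x * log(y) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(y)

def lr_stat(pi0, y, n):
    """-2(L0 - L1) for the binomial test of H0: pi = pi0."""
    pi_hat = y / n
    return 2 * (xlogy(y, pi_hat / pi0) + xlogy(n - y, (1 - pi_hat) / (1 - pi0)))

def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def lr_interval(y, n, crit=CHI2_1_95):
    """Set of pi0 whose LR statistic is below the critical value."""
    pi_hat, eps = y / n, 1e-12
    g = lambda p: lr_stat(p, y, n) - crit
    lower = 0.0 if y == 0 else bisect(g, eps, min(pi_hat, 1 - eps))
    upper = 1.0 if y == n else bisect(g, max(pi_hat, eps), 1 - eps)
    return lower, upper

lower, upper = lr_interval(15, 20)
print(lower, upper)
```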
1.4.3 Proportion of Vegetarians Example