8/3/2019 Health Economics- Lecture Ch03

1/28

Statistical Tools

Dr. Katherine Sauer

Metropolitan State College of Denver

Health Economics


2/28

Outline:

I. Hypothesis Testing

II. Difference of Means

III. Regression Analysis


3/28

I. Hypothesis Testing

A. Simple Hypothesis

Men and women smoke different numbers of cigarettes

State the hypothesis:

Null hypothesis

(hypothesis we wish to disprove):

H0: cm = cw

ex: men and women

smoke the same number

of cigarettes

Alternative hypothesis

(hypothesis that theory

suggests to be the case)

H1: cm ≠ cw

ex: men and women do not smoke the same

number of cigarettes


4/28

B. Composite Hypothesis

Rich people spend more on health care than do poor

people

State the hypothesis

Null hypothesis

(hypothesis we wish to disprove):

H0: Er = Ep

ex: the rich and poor

spend the same amount

Alternative hypothesis

(hypothesis that theory

suggests to be the case)

H1: Er > Ep

ex: the rich spend more than the poor


5/28

II. Difference in Means

Consider the example of men's and women's smoking.

To compare men's and women's smoking rates we could ask people from the population at large how many

cigarettes they smoke per day.

Since we can't ask everyone, how do we decide upon the sample to use?


6/28

Since many things other than gender may affect the

number of cigarettes a person smokes, we can account

for this by selecting a sample of people randomly

from the universe of all people.

We could also select a sample of people from a

relatively homogeneous group, such as college

sophomores from a given college.


7/28

Types of Data

Continuous - natural measures that in principle could take

on different values for each observation

ex: height, weight, income, price

Categorical - refer to arbitrary categories

ex: gender (male or female)

race (black, white, or other)

location (urban or rural)

Is the number of cigarettes smoked continuous or

categorical?


8/28

Using NIH data for smokers from 2001 and 2002, it was found that:

For 4,714 men, cm = 15.60 cigarettes per day

For 4,841 women, cw = 13.47 cigarettes per day

The difference is cm - cw = 2.13 cigarettes per day


9/28

The data shows a difference in the average number of

cigarettes smoked per day by men and women.

Does the difference represent a true difference

between men and women smoking?

or

Did the sample randomly draw a higher average level

for men (15.60) than for women (13.47)?

Let's look at the sample distribution.


10/28

Based on the distribution,

some men and some women smoked far fewer

and some smoked far

more than the average.

Variance is a measure of

the dispersion of

cigarettes smoked around

the average.

mean: men (15.60) , women (13.47)


11/28

The larger the variance, the greater the dispersion

around the mean.

- another observation may be far from the

sample mean

The smaller the variance, the smaller the dispersion

around the mean.

- another observation is likely close to the

sample mean

In testing a hypothesis, would you rather see a large or

small variance in your sample data?


12/28

The square root of the variance is called the standard deviation, s.

A larger standard deviation indicates more dispersion

around the mean.

A smaller standard deviation indicates less dispersion

around the mean.


13/28

The standard error of the mean is the standard deviation

divided by the square root of the number of observations.
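The definitions of variance, standard deviation, and standard error can be sketched in a few lines of Python. The sample values below are hypothetical (they are not the NIH data), and the sketch assumes the sample variance with n - 1 in the denominator, which the slides do not specify.

```python
import math
import statistics

# Hypothetical sample (made-up values, not from the lecture):
# cigarettes smoked per day by eight smokers.
cigarettes_per_day = [5, 10, 12, 15, 15, 18, 20, 25]

n = len(cigarettes_per_day)
mean = statistics.mean(cigarettes_per_day)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
# (the n - 1 convention is an assumption; the slides do not say which is used).
variance = statistics.variance(cigarettes_per_day)

# Standard deviation s: the square root of the variance.
s = math.sqrt(variance)

# Standard error of the mean: s divided by the square root of n.
standard_error = s / math.sqrt(n)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, "
      f"s = {s:.2f}, standard error = {standard_error:.2f}")
```

Holding the spread fixed, a larger sample gives a smaller standard error, because n appears in the denominator.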


14/28

To test our smoking hypothesis formally, we can

construct a difference of means test.

- good for continuous data that can be broken

up by categories

We wish to compare the value,

difference = cm - cw, to zero, which is the value stated in the null hypothesis.

Recall: difference = 2.13

The standard error of the difference is calculated to be

equal to 0.216.
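The slides do not show how 0.216 was obtained. One common formula for the standard error of a difference between two independent sample means is given below as an assumption, not as the method used in the lecture. Here s_m and s_w are the (unreported) sample standard deviations for men and women, and n_m = 4,714 and n_w = 4,841 are the sample sizes.

```latex
\mathrm{SE}(\bar{c}_m - \bar{c}_w) = \sqrt{\frac{s_m^2}{n_m} + \frac{s_w^2}{n_w}}
```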


15/28

About 68 percent of a distribution lies within 1 standard error:

2.13 - 0.216 = 1.91

2.13 + 0.216 = 2.35

About 95 percent of a distribution lies within 2 standard errors:

2.13 - (2)(0.216) = 1.70

2.13 + (2)(0.216) = 2.56

How does this compare to our null hypothesis that the

value difference is zero?
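The one- and two-standard-error ranges can be reproduced with a short Python sketch. The figures 2.13 and 0.216 come from the slides; the helper name `interval` is introduced here only for illustration.

```python
difference = 2.13       # cm - cw, from the slides
standard_error = 0.216  # standard error of the difference, from the slides


def interval(center, se, k):
    """Return the range center +/- k standard errors (hypothetical helper)."""
    return (center - k * se, center + k * se)


one_se = interval(difference, standard_error, 1)  # about 68 percent
two_se = interval(difference, standard_error, 2)  # about 95 percent

# The null hypothesis says the true difference is zero.
# Check whether zero falls inside the two-standard-error range.
zero_inside_two_se = two_se[0] <= 0 <= two_se[1]

print("1 SE range:", one_se)
print("2 SE range:", two_se)
print("zero inside 2 SE range:", zero_inside_two_se)
```

Zero lies well below the lower end of the two-standard-error range, which is evidence against the null hypothesis of no difference.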


16/28

The t test:

The t statistic is calculated as the estimated value (here, the difference in means) divided by its

standard error.

In our example: 2.13 / 0.216 = 9.86

As a rule of thumb, if the absolute value of the t statistic is greater than 2,

the result is statistically significant.
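A minimal Python sketch of this calculation, using the numbers from the slides. The function name `t_statistic` and the threshold variable are illustrative, and the cutoff of 2 is the rule of thumb stated above rather than an exact critical value.

```python
RULE_OF_THUMB_THRESHOLD = 2  # rule of thumb from the slides, not an exact critical value


def t_statistic(value, standard_error):
    """t statistic: the estimated value divided by its standard error."""
    return value / standard_error


t = t_statistic(2.13, 0.216)  # difference and standard error from the slides

# Compare the absolute value, so large negative t statistics also count.
significant = abs(t) > RULE_OF_THUMB_THRESHOLD

print(f"t = {t:.2f}, significant by the rule of thumb: {significant}")
```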


17/28

This experiment would find very good evidence that

among smokers, women smoke fewer cigarettes than

men.

Men smoke more on average than women, and the

difference is statistically significant at well beyond the

95 percent confidence level.


18/28

III. Regression Analysis

- good for data that is continuous

Suppose we wish to explore the relationship between the cigarette tax and the amount of cigarettes smoked per

day.

null hypothesis: no effect (b = 0)

alternative hypothesis: tax is inversely related to

the quantity smoked (b < 0)


19/28

We want to know if the coefficient of -3.24 is

significantly different from zero.
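To make the test of a regression coefficient concrete, here is a rough sketch of a one-variable least-squares regression written out by hand in Python. The data are made up for illustration and will not reproduce the -3.24 reported in the lecture; the variable names are assumptions as well.

```python
import math

# Hypothetical data (not from the lecture): cigarette tax in dollars and
# average cigarettes smoked per day.
tax = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
cigs = [20.0, 18.0, 17.0, 15.0, 15.0, 12.0]

n = len(tax)
mean_x = sum(tax) / n
mean_y = sum(cigs) / n

# Sums of squares and cross products around the means.
sxx = sum((x - mean_x) ** 2 for x in tax)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(tax, cigs))

# Least-squares slope b and intercept a for: cigs = a + b * tax.
b = sxy / sxx
a = mean_y - b * mean_x

# Standard error of the slope, from the residual variance (n - 2 degrees of freedom).
residuals = [y - (a + b * x) for x, y in zip(tax, cigs)]
residual_variance = sum(e ** 2 for e in residuals) / (n - 2)
se_b = math.sqrt(residual_variance / sxx)

# t statistic for the null hypothesis b = 0.
t_b = b / se_b

print(f"b = {b:.2f}, standard error = {se_b:.2f}, t = {t_b:.2f}")
```

As with the difference of means, the coefficient is judged against zero by dividing it by its standard error.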


20/28

A coefficient of -3.24 means:

A $1 increase in the tax is associated with a decrease in

quantity demanded of 3.24 cigarettes.


21/28

The elasticity is -0.09. This means a 1% increase in the

tax will lead to a 0.09% reduction in quantity

demanded.
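The elasticity interpretation amounts to one multiplication, sketched below in Python. The value -0.09 is from the slides; the function name and the 10 percent example are illustrative, and the linear approximation is assumed to hold only for small changes.

```python
elasticity = -0.09  # tax elasticity of quantity demanded, from the slides


def percent_change_in_quantity(percent_change_in_tax, elasticity):
    """Approximate percent change in quantity for a given percent change in tax."""
    return elasticity * percent_change_in_tax


# Illustrative example: a 10 percent tax increase.
change_for_10_percent = percent_change_in_quantity(10.0, elasticity)

print(f"A 10% tax increase changes quantity demanded by about {change_for_10_percent:.1f}%")
```

Because the elasticity is much smaller than 1 in absolute value, quantity demanded responds only slightly to the tax.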


22/28

A multiple regression includes more than one

explanatory variable.

ex: gender, race, age, education, income

Some of the variables may be continuous, some may be

categorical.

- interpretation is different


23/28

Continuous variables

Notice how adding more variables changes the

coefficient on excise tax.

Is it still significant?

(Regression results table not reproduced; in the original slide the continuous variables are labeled C.)


24/28

Income:

Age:

Education:



25/28

When using categorical variables in a regression, we need

to assign them a numerical value.

- dummy variables

Dummy variables are used in regression analysis to

determine whether groups of people differ from others.

For example, maybe we would want to know if African

Americans smoke more than other groups.

We can create a dummy variable that assigns the value 1

if the person is African American or 0 otherwise.
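Creating a dummy variable is a simple recoding step. The following Python sketch uses made-up survey responses; the list names are illustrative only.

```python
# Hypothetical survey responses (made up for illustration).
race = ["white", "African American", "other", "African American", "white"]
gender = ["female", "male", "male", "female", "male"]

# Dummy variable: 1 if the person is African American, 0 otherwise.
african_american = [1 if r == "African American" else 0 for r in race]

# Dummy variable: 1 if the person is male, 0 if female.
male = [1 if g == "male" else 0 for g in gender]

print(african_american)
print(male)
```

The group coded 0 on every dummy becomes the reference group against which the other groups are compared.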


26/28

Because male appears as a variable, we know it was

assigned a value of 1 (female = 0).

Is the male coefficient significant?

(Regression results table not reproduced; in the original slide the dummy variables are labeled D.)


27/28

The interpretation of a dummy variable is different than

that of a continuous variable.

Difference in cigarettes smoked, relative to a white female:

                   African American     African American
                   No = 0               Yes = 1

Male, No = 0       0                    -5.05

Male, Yes = 1      2.23                 2.23 - 5.05 = -2.82

An African American female smokes 5.05 fewer

cigarettes than a white female.

A white male smokes 2.23 more cigarettes than a

white female.

An African American male smokes 2.82 fewer

cigarettes than a white female:

2.23 - 5.05 = -2.82
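The way dummy coefficients combine can be sketched in Python. The two coefficients are the ones reported on the slide; the function name is hypothetical, and the sketch assumes (as the slide's arithmetic does) that the two effects simply add, with no interaction term.

```python
COEF_MALE = 2.23               # coefficient on the male dummy, from the slide
COEF_AFRICAN_AMERICAN = -5.05  # coefficient on the African American dummy, from the slide


def difference_from_white_female(male, african_american):
    """Predicted difference in cigarettes smoked relative to a white female.

    Each argument is a dummy variable (0 or 1). Assumes the effects are
    additive, with no interaction between the two dummies.
    """
    return COEF_MALE * male + COEF_AFRICAN_AMERICAN * african_american


for male in (0, 1):
    for african_american in (0, 1):
        diff = difference_from_white_female(male, african_american)
        print(f"male={male}, african_american={african_american}: {diff:+.2f}")
```

A white female has both dummies equal to 0, so her difference is zero by construction; she is the reference group.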


28/28

Summary of Statistical Competencies:

Formulate questions in terms of hypotheses.

Read statistical test results to determine if the result is

significant.

Understand statistical significance.

Interpret reported regression results.
