Understanding Multivariate Research Berry & Sanders

Upload: candace-jefferson

Post on 27-Dec-2015

Understanding Multivariate Research

Berry & Sanders

Regression Assumptions

1. The independent variables are measured at the interval level or are dichotomous (1/0).

2. The dependent variable is continuous.

3. The variables in the model are measured perfectly (no measurement error).

4. The effect of the independent variable, X, on the dependent variable, Y, is linear.

5. The error or disturbance term is completely uncorrelated with the independent variables.

6. The effects of all independent variables on the dependent variable are additive.

Multivariate Regression

Yi = b0 + b1X1i + b2X2i + b3X3i + ei

Each slope coefficient (bi) measures the responsiveness of the dependent variable to a one-unit change in the associated independent variable when the other independent variables are held constant.
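As a minimal sketch of how the coefficients in such an equation can be estimated, the following fits an ordinary least squares model by solving the normal equations. The data are hypothetical and noise-free (generated from Y = 1 + 2*X1 - 3*X2 + 0.5*X3), and the function names `solve_linear_system` and `fit_ols` are illustrative, not taken from the slides.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals."""
    # Prepend a constant 1 to each case so that b0 is the intercept.
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free data generated from Y = 1 + 2*X1 - 3*X2 + 0.5*X3.
rows = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (2, 1, 3), (3, 2, 1)]
y = [1 + 2 * x1 - 3 * x2 + 0.5 * x3 for x1, x2, x3 in rows]
coefficients = fit_ols(rows, y)
print([round(b, 6) for b in coefficients])
```

Because the data contain no error term, the fitted values of b0 through b3 should match the generating coefficients up to floating-point rounding. With real data, each estimate would differ from the population value by sampling error.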

Example

Y: body weight (lbs)

X1: food intake (average daily calories)

X2: exercise (average daily expenditure, calories)

X3: gender (1=male, 0=female)

Table 3.1 Regression Model of Body Weight

Variable      Coefficient

Intercept       152.0
FOOD              0.028
EXERCISE         -0.045
MALE             35.00

Interpretation

Intercept: A female who eats no food and does not exercise (FOOD = 0, EXERCISE = 0, MALE = 0) is predicted to weigh 152 pounds.

FOOD: Holding exercise and gender constant, a one-calorie increase in average daily food intake increases a person’s predicted weight by 0.028 pounds. A 100-calorie increase results in a 2.8-pound increase in weight (100 x 0.028).

EXERCISE: Holding food intake and gender constant, a one-calorie increase in average daily calories expended through exercise decreases a person’s predicted weight by 0.045 pounds.

Interpretation (continued)

MALE (dichotomous)

The coefficient can be interpreted as the difference in the expected value of Y between a case for which X=0 and a case for which X=1 (holding all other independent variables constant).

For two individuals with identical food intake and exercise, a man can expect to weigh 35 pounds more than a woman.
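The interpretations above can be checked by plugging values into the fitted equation. The sketch below uses the coefficients from Table 3.1; the function name `predicted_weight` and the example values of 2,000 calories eaten and 300 calories expended are hypothetical choices for illustration.

```python
# Coefficients as reported in Table 3.1.
INTERCEPT = 152.0
B_FOOD = 0.028       # pounds per average daily calorie eaten
B_EXERCISE = -0.045  # pounds per average daily calorie expended
B_MALE = 35.0        # pounds, male (1) versus female (0)


def predicted_weight(food, exercise, male):
    """Predicted body weight in pounds from the Table 3.1 model."""
    return INTERCEPT + B_FOOD * food + B_EXERCISE * exercise + B_MALE * male


# Hypothetical cases: identical food intake and exercise, different gender.
woman = predicted_weight(food=2000, exercise=300, male=0)
man = predicted_weight(food=2000, exercise=300, male=1)
print(f"woman: {woman:.1f} lbs, man: {man:.1f} lbs, difference: {man - woman:.1f} lbs")

# Effect of eating 100 more calories per day, everything else held constant.
more_food = predicted_weight(food=2100, exercise=300, male=0)
print(f"100 extra calories: {more_food - woman:+.1f} lbs")
```

Whatever values are chosen for food intake and exercise, the gap between the man and the woman stays at the MALE coefficient, because the model is additive.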

Elements of a Regression Model

1. Measuring the fit of the model: based on a comparison of the actual and predicted values of Y. The further away data points fall from the regression line, the worse the fit.

2. R² = the proportion of the variation in Y that is explained by the independent variables, or the squared correlation between the actual and predicted values. R² ranges from 0 to 1, with 1 indicating a perfect fit (all points on the regression line).
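The two definitions of R² given above can be compared numerically. The following is a minimal sketch with one independent variable and hypothetical data; the helper names are illustrative. The two quantities agree when the predicted values come from a least squares fit that includes an intercept.

```python
def simple_ols(x, y):
    """Least squares intercept and slope for a single independent variable."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(actual, predicted):
    """Proportion of the variation in Y explained: 1 - SS_residual / SS_total."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def squared_correlation(u, v):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv ** 2 / (suu * svv)


# Hypothetical data, roughly linear with a little scatter.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = simple_ols(x, y)
predicted = [intercept + slope * xi for xi in x]

r2 = r_squared(y, predicted)
r2_from_correlation = squared_correlation(y, predicted)
print(f"R^2 = {r2:.4f}, squared correlation = {r2_from_correlation:.4f}")
```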

Elements (continued)

3. Statistical Significance

H0: βi = 0

H1: βi > 0 (or βi < 0, or βi ≠ 0)

t = bi / s.e.(bi)

Rule of thumb: If t > 2 or t < -2, then the coefficient is statistically significant at roughly the 0.05 level (we reject the null hypothesis that the coefficient is zero).
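The rule of thumb can be written as a short check. In this sketch the FOOD coefficient comes from Table 3.1, but the standard errors are hypothetical, since the slides do not report any; the function names are also illustrative.

```python
def t_statistic(coefficient, standard_error):
    """t = b / s.e., the test statistic for H0: beta = 0."""
    return coefficient / standard_error


def is_significant(coefficient, standard_error, threshold=2.0):
    """Apply the rule of thumb: significant if t > 2 or t < -2."""
    return abs(t_statistic(coefficient, standard_error)) > threshold


food_b = 0.028  # FOOD coefficient from Table 3.1
for se in (0.010, 0.020):  # hypothetical standard errors
    t = t_statistic(food_b, se)
    print(f"s.e. = {se:.3f}  t = {t:.2f}  significant: {is_significant(food_b, se)}")
```

The same coefficient is significant with the smaller standard error and not with the larger one, which shows that significance depends on the precision of the estimate as well as on its size.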

Elements (continued)

Confidence interval (95%): We calculate a partial slope coefficient for a sample (bi) and construct an interval around this estimate. If we drew repeated samples and built the interval the same way each time, we would expect it to contain the true (population) coefficient (βi) in about 95 of 100 samples.
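A 95% interval of this kind is commonly approximated as the estimate plus or minus about two standard errors. The sketch below assumes a large sample, so it uses the normal critical value 1.96 (in small samples the multiplier comes from the t distribution and is larger). The MALE coefficient is from Table 3.1, while the standard error of 5.0 is hypothetical.

```python
def confidence_interval(coefficient, standard_error, critical_value=1.96):
    """Approximate confidence interval: b +/- critical_value * s.e."""
    margin = critical_value * standard_error
    return coefficient - margin, coefficient + margin


# MALE coefficient from Table 3.1 with a hypothetical standard error.
low, high = confidence_interval(35.0, 5.0)
print(f"95% CI: [{low:.1f}, {high:.1f}]")

# An interval that excludes zero corresponds to rejecting H0: beta = 0
# in a two-tailed test at the 0.05 level.
excludes_zero = low > 0 or high < 0
print("excludes zero:", excludes_zero)
```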

Potential Problems

Multicollinearity: a high (or perfect) correlation among any of the independent variables (e.g., education and income).

Heteroskedasticity: non-constant variance in the errors.

Autocorrelation (or serial correlation): the error terms are correlated; very common in time series data.

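All of these problems make our estimates less precise or our standard errors unreliable: high multicollinearity inflates standard errors, while heteroskedasticity and autocorrelation make the estimates inefficient and distort the usual standard errors. They do not bias our slope coefficients. The exception is perfect multicollinearity, under which the coefficients cannot be estimated at all.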
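A simple first check for multicollinearity is the correlation between pairs of independent variables. The sketch below uses hypothetical education and income values and also reports the variance inflation factor (VIF), a diagnostic not mentioned in the slides. With only two independent variables the VIF reduces to 1 / (1 - r²); with more variables, r² would be replaced by the R² from regressing one independent variable on all the others.

```python
import math


def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def vif_two_predictors(r):
    """Variance inflation factor when the model has two independent variables."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical data: years of education and income (thousands of dollars).
education = [10, 12, 12, 14, 16, 16, 18, 20]
income = [25, 30, 32, 38, 45, 50, 55, 70]

r = pearson_r(education, income)
vif = vif_two_predictors(r)
print(f"correlation = {r:.3f}, VIF = {vif:.1f}")
```

A correlation near 1 gives a large VIF, meaning the sampling variance of each coefficient is many times what it would be if the two variables were uncorrelated.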