Predicting Poverty
Group 4: Heidi Dong, Tiffany Liu, Megan Fox, Taylor Puccini and Alexandra Houle-Dupont

Upload: tiffany-liu

Post on 20-Feb-2017



Introduction

Statistics is a powerful analytical tool that can be applied to social issues. Poverty is a growing, self-perpetuating problem in the United States; those affected can remain entrenched in the cycle of poverty for several generations. We examined factors that we believe entrench people in poverty by analyzing the relationship between living habits, overall well-being, and poverty. Our unit of analysis was the state: we used 2012 data for each of the fifty states. We attempted to answer the following questions:

1. Is there a correlation between obesity and hypertension across states?
2. Is there a difference in food expenditure between states above and below the national average unemployment rate?
3. Is there a correlation between the poverty rate and the obesity rate by state?
4. Does the poverty rate vary by region?

We checked the assumptions of independence, linearity, equal variance, and normality for each dependent and independent variable.

Variables and Summary Statistics

| Variable | n | Mean | Variance | Std. dev. | Std. err. | Median | Range | Min | Max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Unemployment Rate (% by state) | 50 | 7.34 | 2.935 | 1.713 | 0.2423 | 7.45 | 8.1 | 3.1 | 11.2 |
| Fast food (% users in last 30 days by state) | 50 | 83.46 | 0.08188 | 0.2861 | 0.04047 | 83.42 | 1.45 | 82.79 | 84.24 |
| Hypertension (% by state) | 50 | 27.64 | 9.754 | 3.123 | 0.4417 | 27.2 | 14.3 | 20.5 | 34.8 |
| Total Food Expenditures ($/household) | 50 | 48,480 | 7,716,000 | 2,778 | 392.8 | 48,550 | 14,040 | 41,950 | 56,000 |
| Unemployment coded: above national average (0), below national average (1) | 50 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Obesity Rate (% by state) | 50 | 29.38 | 10.73 | 3.276 | 0.4633 | 29.65 | 14.6 | 21.3 | 35.9 |
| Poverty Rate (% by state) | 50 | 14.04 | 11.04 | 3.322 | 0.4698 | 13.65 | 13.6 | 8.6 | 22.2 |
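The standard error column in the summary statistics can be cross-checked against the standard deviation column, since the standard error of a mean is the standard deviation divided by the square root of n. The sketch below is only a sanity check on the reported figures; the function name and the dictionary layout are our own, not part of the original analysis.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return std_dev / math.sqrt(n)


# (std. dev., reported std. err.) copied from the summary table; n = 50 states.
reported = {
    "Unemployment Rate": (1.713, 0.2423),
    "Hypertension": (3.123, 0.4417),
    "Obesity Rate": (3.276, 0.4633),
    "Poverty Rate": (3.322, 0.4698),
}

for name, (std_dev, reported_se) in reported.items():
    computed = standard_error(std_dev, 50)
    print(f"{name}: computed {computed:.4f}, reported {reported_se}")
```

Each computed value should agree with the reported one up to rounding.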

Is there a correlation between obesity and hypertension across states?

● H0: β1 = 0
● Ha: β1 ≠ 0
● We are setting α = 0.05
● P-value of <0.0001
● There is a positive correlation between the obesity rate and the percentage of people within a state living with hypertension.
● Slope of ≈0.74 indicates that as the hypertension rate within a state increases by 1 percentage point, the predicted obesity rate of that state increases by about 0.74 percentage points.
● 49.44% of the variability in obesity rate by state can be explained by the variability in hypertension rate across states.
● Conclusion: With a P-value of <0.0001, less than α = 0.05, we reject the null hypothesis. We have sufficient evidence to say that there is a correlation between obesity and hypertension across states.

Simple linear regression results:
Dependent Variable: Obesity Rate
Independent Variable: Hypertension (%)
Obesity Rate = 8.9891311 + 0.73763908 Hypertension (%)
Sample size: 50
R (correlation coefficient) = 0.70316441
R-sq = 0.49444019
Estimate of error standard deviation: 2.353631

| Parameter | Estimate | Alternative | DF | T-Stat | P-value |
| --- | --- | --- | --- | --- | --- |
| Intercept | 8.99 | ≠ 0 | 48 | 3.00 | 0.0042 |
| Slope | 0.74 | ≠ 0 | 48 | 6.85 | <0.0001 |
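In simple linear regression, the t statistic for the slope can be recovered from the correlation coefficient and the sample size alone: t = r·sqrt(n − 2) / sqrt(1 − r²). The sketch below applies that identity to the reported R and n as a consistency check; the function name is ours and the code is not the software the group used.

```python
import math


def slope_t_statistic(r: float, n: int) -> float:
    """t statistic for H0: slope = 0 in simple linear regression, from r and n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r = 0.70316441  # reported correlation coefficient
n = 50          # one observation per state

r_squared = r * r
t_stat = slope_t_statistic(r, n)

print(f"R-sq = {r_squared:.4f}")        # compare with the reported 0.49444019
print(f"t = {t_stat:.2f} on {n - 2} DF")  # compare with the reported 6.85 on 48 DF
```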


Is there a difference in food expenditure between states above and below the national average unemployment rate?

● We chose a two-sample t-test to determine the relationship between unemployment rate and food expenditure across the United States.
● We created a dummy variable coding each state by whether its unemployment rate is above or below the national average:
  ○ 32 states are coded 1 (below the national avg. of 8.07%)
  ○ 18 states are coded 0 (above the national avg. of 8.07%)
● We are setting α = 0.05
● P-value of 0.047
● Conclusion: With a p-value of 0.047, less than α = 0.05, we reject H0 and conclude that there is sufficient evidence that mean total food expenditure differs between states above and below the national average unemployment rate.

Hypothesis test results:
μ1 : Mean of Total Food Expenditures ($/household) where "Unemployment Above (0)" = 0
μ2 : Mean of Total Food Expenditures ($/household) where "Unemployment Above (0)" = 1
μ1 - μ2 : Difference between two means
H0 : μ1 - μ2 = 0
HA : μ1 - μ2 ≠ 0
(without pooled variances)

| Difference | Sample Diff. | Std. Err. | DF | T-Stat | P-value |
| --- | --- | --- | --- | --- | --- |
| μ1 - μ2 | 95060.60 | 46581.90 | 50 | 2.04 | 0.047 |
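A two-sample t-test without pooled variances (Welch's test) divides the difference in group means by a standard error built from each group's own variance. The sketch below shows the calculation on made-up expenditure figures; the two sample lists are hypothetical, since the state-level data are not reproduced in these slides. The last lines only confirm that the reported T-Stat equals the reported difference divided by its standard error.

```python
import math
from statistics import mean, variance


def welch_t_test(sample1, sample2):
    """Return (difference, std. err., t statistic, Welch-Satterthwaite DF)."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # squared standard error of mean 1
    v2 = variance(sample2) / n2  # squared standard error of mean 2
    diff = mean(sample1) - mean(sample2)
    std_err = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, std_err, diff / std_err, df


# Hypothetical household food expenditures ($), for illustration only.
above_avg_unemployment = [50200.0, 49100.0, 51800.0, 47500.0]
below_avg_unemployment = [47900.0, 46800.0, 49300.0, 45600.0, 48100.0]

diff, std_err, t_stat, df = welch_t_test(above_avg_unemployment, below_avg_unemployment)
print(f"diff = {diff:.1f}, SE = {std_err:.1f}, t = {t_stat:.2f}, DF = {df:.1f}")

# Consistency check on the reported row: T-Stat = Sample Diff. / Std. Err.
reported_t = 95060.60 / 46581.90
print(f"reported t = {reported_t:.2f}")
```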

Is there a correlation between the poverty rate and obesity rate by state?

● We chose to run a linear regression t-test to test this question.
● H0: β1 = 0
● Ha: β1 ≠ 0
● We are setting α = 0.05
● Predicted Poverty Rate = -0.50 + 0.49 Obesity Rate
● P-value of 0.0003
● Conclusion: Because our p-value is 0.0003, less than α = 0.05, we reject the null hypothesis and conclude that there is a positive, moderate (r = 0.49), statistically significant relationship between the poverty rate within a state and the obesity rate within that state.

Simple linear regression results:
Dependent Variable: Poverty Rate
Independent Variable: Obesity Rate
Poverty Rate = -0.50 + 0.49 Obesity Rate
Sample size: 50
R (correlation coefficient) = 0.49
R-sq = 0.24
Estimate of error standard deviation: 2.93

| Parameter | Estimate | Std. Err. | Alternative | DF | T-Stat | P-value |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | -0.50 | 3.78 | ≠ 0 | 48 | -0.13 | 0.90 |
| Slope | 0.49 | 0.13 | ≠ 0 | 48 | 3.87 | 0.0003 |
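The fitted line for poverty and obesity, Poverty Rate = -0.50 + 0.49 Obesity Rate, can be used to read off a predicted poverty rate for a given obesity rate. The sketch below evaluates it at the mean obesity rate from the summary statistics (29.38%); a least-squares line passes through the point of means, so the prediction should land near the mean poverty rate (14.04%), with a small gap caused by the rounded coefficients. The function name is ours.

```python
def predict_poverty_rate(obesity_rate: float) -> float:
    """Predicted state poverty rate (%) from the reported fitted line."""
    return -0.50 + 0.49 * obesity_rate


mean_obesity = 29.38  # mean obesity rate from the summary statistics
mean_poverty = 14.04  # mean poverty rate from the summary statistics

predicted = predict_poverty_rate(mean_obesity)
print(f"predicted poverty at mean obesity: {predicted:.2f}%")
print(f"gap to mean poverty rate: {mean_poverty - predicted:.2f} percentage points")

# A 1 percentage point rise in obesity moves the prediction by the slope.
step = predict_poverty_rate(30.38) - predict_poverty_rate(29.38)
print(f"change per percentage point of obesity: {step:.2f}")
```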

Does poverty rate vary based on region?

● We tested the popular stereotype that the South is more impoverished than the North by using a 2-sample t-test comparing the South with the Northeast.
● South: Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana, Mississippi, North Carolina, South Carolina, Tennessee, Virginia, West Virginia
● Northeast: Connecticut, Delaware, Maine, Maryland, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont
● Southern average poverty rate: μ1 = 17.39%
● Northeastern average poverty rate: μ2 = 11.83%
● H0 : μsouth - μnortheast = 0; the Northeast and South have the same average poverty rate
● HA : μsouth - μnortheast > 0; the South has a higher average poverty rate than the Northeast
● α = 0.05
● P-value of <0.0001
● Conclusion: With a p-value that is effectively 0, less than α = 0.05, we reject the null hypothesis and conclude that there is enough evidence to say that the stereotype holds in this data: the South has a higher average poverty rate than the Northeast.

Possible Shortcoming: We took each regional mean as the unweighted average of its states' poverty rates. While the poverty rate, a proportion, normalizes for population within a state, averaging the state proportions does not account for differences in population size among the states in a region. We attempted to create our own regional proportion using the data that we did have.
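To illustrate the shortcoming, the sketch below contrasts an unweighted mean of state rates with a population-weighted mean. The three rates and populations are invented for the example and do not correspond to any real states; the point is only that a populous low-poverty state pulls the weighted figure down while the unweighted mean ignores it.

```python
def unweighted_mean(rates):
    """Simple average of state-level rates, every state counted equally."""
    return sum(rates) / len(rates)


def population_weighted_mean(rates, populations):
    """Average of state-level rates weighted by each state's population."""
    total_population = sum(populations)
    return sum(r * p for r, p in zip(rates, populations)) / total_population


# Hypothetical region of three states (illustrative numbers only).
poverty_rates = [18.0, 12.0, 15.0]               # percent
populations = [1_000_000, 9_000_000, 2_000_000]  # residents

print(unweighted_mean(poverty_rates))                        # 15.0
print(population_weighted_mean(poverty_rates, populations))  # 13.0
```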

Hypothesis test results:
μ1 : Mean of South
μ2 : Mean of Northeast
μ1 - μ2 : Difference between two means

| Difference | n1 | n2 | Sample mean diff. | Std. err. | Z-stat | P-value |
| --- | --- | --- | --- | --- | --- | --- |
| μ1 - μ2 | 12 | 11 | 5.56 | 1.05 | 5.29 | <0.0001 |
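The reported test statistic for the regional comparison is the difference in regional means divided by its standard error, and the one-sided p-value is the upper tail of the standard normal distribution beyond that statistic. The sketch below reproduces both from the rounded figures on the slide, so the statistic can differ from the reported 5.29 in the second decimal place. The function name is ours.

```python
import math


def upper_tail_p_value(z: float) -> float:
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


south_mean = 17.39      # reported Southern average poverty rate (%)
northeast_mean = 11.83  # reported Northeastern average poverty rate (%)
std_err = 1.05          # reported standard error of the difference

diff = south_mean - northeast_mean
z = diff / std_err
p_value = upper_tail_p_value(z)

print(f"difference = {diff:.2f}, z = {z:.2f}, one-sided p = {p_value:.1e}")
```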

Multiple Linear Regression Predicting Poverty Rate

Predicted Poverty Rate = -61.67 + 0.30(Obesity) - 0.0004(Total Food Expenditure) + 0.30(Hypertension) + 0.72(Fast food) + 0.77(Unemployment Rate)

Summary of fit:
Root MSE: 2.46
R-squared: 0.51
R-squared (adjusted): 0.45

● Hypertension and obesity have the same regression coefficient (0.3): holding the other variables constant, for every 1 percentage point increase in either rate, the predicted poverty rate increases by 0.3 percentage points.
● Total food expenditure has a very small negative coefficient: for every added dollar of food expenditure per household, the predicted state poverty rate decreases by 0.0004 percentage points.
● Fast food and unemployment rate have the largest coefficients (> 0.7), although only the unemployment coefficient is statistically significant (fast food: p = 0.72).
  ○ For every 1 percentage point increase in fast food users in the last 30 days, the predicted poverty rate increases by 0.72 percentage points.
  ○ For every 1 percentage point increase in the state unemployment rate, the predicted poverty rate increases by 0.766 percentage points.
● Model strength: moderately strong
● About 51% of the variability in the poverty rate is accounted for by the variables listed in the multiple linear regression table below (R-squared).
● About 45% of the variability in the poverty rate is accounted for after adjusting for the number of variables and the sample size (adjusted R-squared).
● Significant variables:
  ○ 95% confidence level: Unemployment Rate (0.0018)
  ○ 90% confidence level: Unemployment Rate (0.0018), Obesity Rate (0.07); Hypertension (0.10) sits at the threshold

| Source | DF | SS | MS | F-stat | P-value |
| --- | --- | --- | --- | --- | --- |
| Model | 5 | 275.3 | 55.05 | 9.121 | <0.0001 |
| Error | 44 | 265.6 | 6.036 | | |
| Total | 49 | 540.8 | | | |

| Parameter | Estimate | T-Stat | P-value |
| --- | --- | --- | --- |
| Intercept | -61.67 | -0.414 | 0.68 |
| Obesity Rate (%) | 0.3 | 1.84 | 0.07 |
| Total Food Expenditures ($/household) | -0.0004 | -0.803 | 0.43 |
| Hypertension (%) | 0.3 | 1.677 | 0.10 |
| Fast food (% users in last 30 days) | 0.72 | 0.39 | 0.72 |
| Unemployment Rate (%) | 0.766 | 3.319 | 0.0018 |
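The summary-of-fit figures follow directly from the sums of squares and degrees of freedom in the ANOVA table. The sketch below recomputes the F statistic, Root MSE, R-squared, and adjusted R-squared from the reported Model and Error rows; because the inputs are rounded, the results match the slide only to about two decimal places. The function name and the returned dictionary keys are ours.

```python
import math


def fit_summary(ss_model: float, ss_error: float, df_model: int, df_error: int) -> dict:
    """Derive F, Root MSE, R-squared and adjusted R-squared from an ANOVA table."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "f_stat": ms_model / ms_error,
        "root_mse": math.sqrt(ms_error),
        "r_squared": ss_model / ss_total,
        "adj_r_squared": 1 - ms_error / (ss_total / df_total),
    }


# Model and Error rows as reported in the ANOVA table.
summary = fit_summary(ss_model=275.3, ss_error=265.6, df_model=5, df_error=44)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The adjusted R-squared is lower than R-squared because it penalizes the five predictors relative to the 50 observations.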


Analysis & Conclusion

● What did we learn?
  ○ Obesity and hypertension rates are strongly correlated across states (r ≈ 0.70): states with higher hypertension rates tend to have higher obesity rates.
  ○ The relationship between food expenditure and unemployment rate is significant: mean annual household food expenditure differs between states above and below the national average unemployment rate.
  ○ Poverty and obesity rates have a significant, positive, moderate relationship. Based on this correlation, targeting poverty may be one way to help prevent obesity.
  ○ Southern states have a higher average poverty rate than Northeastern states, so being born in the South, which an individual cannot control, is associated with a higher likelihood of living in poverty.
  ○ The social factors we examined are only part of the explanation for why poverty exists as it does and continues to cycle in our society. Other factors, including race, marginalization, lack of education, and access to healthcare, likely also contribute to poverty.
● In our analysis, potential sources of error are:
  ○ Multicollinearity: food expenditure and fast food usage (in the last thirty days) had a strong, positive correlation of 0.718, which can make their individual coefficient estimates unstable and diminish the reliability of the model.
  ○ We used a small data set: in the real world, policy makers and NGOs use much larger data sets, which would likely yield different results.
  ○ Inequality within states: a greater number of high-income earners in a state likely skews its overall averages, making problems seem less severe than they are.
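One common way to gauge the multicollinearity concern is the variance inflation factor (VIF), which measures how much a coefficient's variance is inflated by correlation among predictors. The sketch below computes it from the single pairwise correlation reported above (0.718). This is a simplification we are assuming for illustration: in a model with five predictors, the VIF is normally computed from the R-squared of regressing one predictor on all the others, so the pairwise figure is a lower bound.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """Variance inflation factor given the R-squared of a predictor on the others."""
    return 1.0 / (1.0 - r_squared)


# Reported correlation between food expenditure and fast food usage.
pairwise_r = 0.718

# Using only this pairwise correlation gives a lower bound on the VIF.
vif_lower_bound = vif_from_r_squared(pairwise_r ** 2)
print(f"VIF (lower bound) = {vif_lower_bound:.2f}")
```

A VIF near 1 indicates little inflation; larger values mean the standard errors of the affected coefficients are correspondingly wider.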

Works Cited

Centers for Disease Control and Prevention (CDC). (2014). Data, Trends and Maps: Obesity Prevalence Maps. Retrieved from http://www.cdc.gov/obesity/data/table-adults.html.

U.S. Department of Labor, Bureau of Labor Statistics. (2012). Unemployment Rates for States Annual Average Rankings, Year: 2012. Retrieved from http://www.bls.gov/lau/lastrk12.htm.

U.S. Census Bureau. (2012). Annual Estimates of the Resident Population for the United States, Regions, States, and Puerto Rico: April 1, 2010 to July 1, 2012. Retrieved from https://www.census.gov/popest/data/state/totals/2012/.

Yoon SS, Burt V, Louis T, Carroll MD. Hypertension among adults in the United States, 2009–2010. NCHS data brief, no 107. Hyattsville, MD: National Center for Health Statistics. 2012.

U.S. Census Bureau. (2013, September). Poverty: 2000 to 2012. Retrieved from https://www.census.gov/prod/2013pubs/acsbr12-01.pdf.

Easy Analytic Software, Inc. (EASI). (2009-03-01). Consumer Behavior - 2013: Food - Fast Food Restaurants | Socioeconomic Indicator: Fast Food & Drive-In Restaurants: Visited In Last 6 Months: Total Category, 2013. Data-Planet™ Statistical Datasets by Conquest Systems, Inc. [Data-file]. Dataset-ID: 050-301-009

Easy Analytic Software, Inc. (EASI). (2009-03-01). Consumer Expenditures - 2013: Expenditures - Food | Socioeconomic Indicator: Total annual expenditures, 2013. Data-Planet™ Statistical Datasets by Conquest Systems, Inc. [Data-file]. Dataset-ID: 050-302-001
