Is there a correlation between GPA and number of hours worked?
By: Excellent Student #1, Excellent Student #2, Excellent Student #3
Analysis
• For our project we surveyed working students to find out whether having a job affected their GPA.
• We chose this survey topic because we were curious to see at what point the number of hours worked begins affecting a student’s grades and GPA.
• We used convenience sampling to collect our data.
Two quantitative variables that are good for regression analysis
• Explanatory variable (also called the predictor or independent variable): hours worked per week (X):
• 3.5 5 10 14 8 20 30 6 8 8 25 4 4 20 10 12 20 24 24 30 6 6 4 2 20 22 15 9 7 14 10 10 0 0 0
• Response variable (dependent variable): GPA (Y):
• 4.5 4.1 4 4.2 3.46 3.6 3.7 3.7 4.3 3.5 3.6 2.8 3.25 4 3 3.2 4 3.6 2.9 4.3 3.5 3.25 3.2 3.67 4 3.52 3.84 3.7 3.8 3.5 3.7 3.5 4.2 3.5 3.7
Raw data, n=35
Vital Stats for X values
• (Mean for X-values) X-bar: 11.72857143
• (Standard Deviation) S: 8.620549028
• (Variance) S2: 74.31386554
• 5# summary: Minimum= 0,Q1= 5, Q2 Median= 10, Q3= 20, maximum= 30
• N=35
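The vital stats for X can be reproduced from the raw data. The following is a minimal sketch using only Python's standard library; the function name `summarize` is hypothetical, and the choice of `method="exclusive"` for the quartiles is an assumption made so that the result lines up with the calculator-style quartiles reported on the slide.

```python
import statistics

# Hours worked per week (X), copied from the raw data slide.
hours = [3.5, 5, 10, 14, 8, 20, 30, 6, 8, 8, 25, 4, 4, 20, 10, 12, 20, 24,
         24, 30, 6, 6, 4, 2, 20, 22, 15, 9, 7, 14, 10, 10, 0, 0, 0]


def summarize(values):
    """Return n, mean, sample standard deviation, sample variance,
    and the five-number summary of a list of numbers."""
    # Assumption: "exclusive" quartiles, chosen to match the slide's Q1 and Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "s": statistics.stdev(values),       # sample standard deviation
        "s2": statistics.variance(values),   # sample variance
        "five_number": (min(values), q1, median, q3, max(values)),
    }


x_stats = summarize(hours)
for name, value in x_stats.items():
    print(name, value)
```

The same function can be applied to the GPA list to get the vital stats for Y.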
Vital stats for Y values
(Mean for Y-values) Y-bar: 3.665428571
(Standard Deviation) S: .4034289999
(Variance) S2: .162754958
5# summary: Minimum= 2.8,Q1= 3.5, Q2 Median= 3.67, Q3= 4.0, maximum= 4.5
N=35
Outlier Test
1.5 x IQR criterion
A number is an outlier if it is…
less than Q1 - (1.5 x IQR), or
greater than Q3 + (1.5 x IQR)
Outlier Test X values
•For our X values (Q1 = 5, Q3 = 20, IQR = 15) an outlier would be anything less than -17.5 or greater than 42.5
•There are No outliers
Outlier Test Y-values
•For our Y values (Q1 = 3.5, Q3 = 4.0, IQR = 0.5) an outlier would be anything less than 2.75 or greater than 4.75
•There are No Outliers
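As a rough illustration of the 1.5 x IQR criterion, the sketch below computes the two fences from the quartiles in the five-number summaries. The helper names `iqr_fences` and `find_outliers` are hypothetical, introduced only for this example.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences of the 1.5 x IQR criterion."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3):
    """Return the values that fall outside the fences."""
    lower, upper = iqr_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


x_fences = iqr_fences(5, 20)      # hours worked: Q1 = 5, Q3 = 20
y_fences = iqr_fences(3.5, 4.0)   # GPA: Q1 = 3.5, Q3 = 4.0
print(x_fences)  # (-17.5, 42.5)
print(y_fences)  # (2.75, 4.75)

# The minimum and maximum of each variable lie inside the fences,
# so no value in between can be an outlier either.
print(find_outliers([0, 30], 5, 20))         # []
print(find_outliers([2.8, 4.5], 3.5, 4.0))   # []
```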
Histogram for Hours worked per week (X)
This graph is Right Skewed and according to the 1.5 x IQR criterion there are NO outliers.
Histogram for GPA (Y)
This graph is symmetric and has NO outliers
Empirical Rule Test (X)
For symmetric distributions
S= 8.620549028
Xbar= 11.729
• Xbar+/-1s =11.729+1(8.620549028)=20.34954903
11.729-1(8.620549028)=3.108450972
About 68% of the students surveyed would work between 3.108450972 and 20.34954903 hours a week.
• Xbar+/-2s=11.729+2(8.620549028)=28.97009806
11.729-2(8.620549028)= -5.512098056
About 95% of the students surveyed would work between -5.512098056 and 28.97009806 hours a week (the negative lower bound is not a possible number of hours).
At first we applied the Empirical Rule to the X-values. Then we remembered that our X-values are right skewed, so the Empirical Rule does not apply and we have to use Chebyshev’s Theorem instead.
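Chebyshev’s Theorem (X)
• For non-symmetric distributions (skewed data)
S=8.620549028
Xbar= 11.729
• At least ¾ (or 75%) of all students surveyed work within 2 standard deviations of the mean: between -5.512 and 28.970 hours a week (in practice between 0 and 28.970, since hours cannot be negative).
• At least 8/9 (89%) of all students surveyed work within 3 standard deviations of the mean: between -14.133 and 37.591 hours a week (in practice between 0 and 37.591).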
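Chebyshev’s Theorem says that at least 1 - 1/k² of any data set lies within k standard deviations of the mean. A minimal sketch of that calculation, using the X-bar and S values from this slide, is shown here; the function name `chebyshev_interval` is hypothetical.

```python
def chebyshev_interval(mean, s, k):
    """Return (minimum proportion, lower bound, upper bound) for the data
    lying within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem is only informative for k > 1")
    proportion = 1 - 1 / k**2
    return proportion, mean - k * s, mean + k * s


xbar, s = 11.729, 8.620549028  # values from the slide
for k in (2, 3):
    p, lower, upper = chebyshev_interval(xbar, s, k)
    print(f"k={k}: at least {p:.1%} between {lower:.3f} and {upper:.3f} hours")
```

Because hours worked cannot be negative, a negative lower bound can be read as 0 when interpreting the result.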
Empirical Rule (Y)
• Because the Y-values display a bell-shaped (symmetric) distribution, we can use the Empirical Rule.
S= .4034289999
Ybar= 3.6654 (the mean of the 35 GPAs listed in the raw data)
• Ybar+/-1s =3.6654+1(.4034289999)= 4.068829
3.6654-1(.4034289999)= 3.261971
About 68% of the students surveyed have a GPA between 3.261971 and 4.068829.
• Ybar+/-2s =3.6654+2(.4034289999)= 4.472258
3.6654-2(.4034289999)= 2.858542
About 95% of the students surveyed have a GPA between 2.858542 and 4.472258.
• Ybar+/-3s =3.6654+3(.4034289999)= 4.875687
3.6654-3(.4034289999)= 2.455113
About 99.7% of the students surveyed have a GPA between 2.455113 and 4.875687.
Interpret
To interpret r we used Table A-6 in the back of the book, comparing r against the critical value for alpha = 0.05.
• Table A-6 critical value = .335; r = .1871678686
• Because the absolute value of r is smaller than the Table A-6 critical value, there is not sufficient evidence of a linear correlation.
• Another way to test for a linear correlation is to look at the P-value (.352).
• If P is less than alpha (0.05), then there is a linear correlation.
• Since P is greater than alpha here, we conclude there is no linear correlation.
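The comparison against the critical value can be sketched in a few lines. The example below applies the Table A-6 decision rule and also converts r into the equivalent t statistic with n - 2 degrees of freedom. The t critical value 2.0345 (two-tailed, alpha = 0.05, 33 degrees of freedom) is an assumption taken from a standard t table, not from the slides, and the function name is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing H0: no linear correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)


r, n = 0.1871678686, 35
critical_r = 0.335    # Table A-6 value quoted on the slide (alpha = 0.05)
critical_t = 2.0345   # assumed t critical value for 33 df, two-tailed, alpha = 0.05

t = correlation_t_statistic(r, n)
significant_by_r = abs(r) > critical_r
significant_by_t = abs(t) > critical_t

print(f"t = {t:.3f}")
print("linear correlation by Table A-6 rule:", significant_by_r)
print("linear correlation by t test:", significant_by_t)
```

Both rules lead to the same decision here: the sample does not show a significant linear correlation.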
R and R2
• Form: non-linear
• Direction: positive value but no real direction
• Strength: weak
• R2=.035031811
• 3.5% of the variation in Y is explained by the Least Squares Regression Line.
Scatter plot: Hours worked (X) vs. GPA (Y)
• Form: non-linear
• Direction: none
• Strength: weak
• There are no outliers or influential points
Regression Line on New Scatter plot and marginal change
• Y=a+bx
• Y=3.562+.008x
• Marginal change: when hours worked per week increases by one, the predicted GPA increases by .008 (the slope of the LSRL).
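To make the marginal change concrete, here is a small sketch that evaluates the fitted line. It uses the rounded coefficients shown on the slide (a = 3.562, b = .008), so its results can differ slightly from values computed with unrounded coefficients; the function names are hypothetical.

```python
def predict_gpa(hours, a=3.562, b=0.008):
    """Predicted GPA from the line y-hat = a + b*x (rounded slide coefficients)."""
    return a + b * hours


def residual(hours, gpa):
    """Observed GPA minus predicted GPA."""
    return gpa - predict_gpa(hours)


# One extra hour of work per week changes the prediction by the slope.
marginal_change = predict_gpa(11) - predict_gpa(10)
print(round(marginal_change, 3))      # 0.008

# Predicted GPA for a student working 3 hours a week.
print(round(predict_gpa(3), 3))       # 3.586

# Residual for the first student in the raw data (3.5 hours, 4.5 GPA).
print(round(residual(3.5, 4.5), 2))   # 0.91
```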
Would you use this line for prediction?
• I would not use this line for prediction because the linear correlation coefficient (r) and the coefficient of determination (r2) are so weak.
Variation
•Total Variation: 5.971495 Σ(y-ybar)2 (the sum of the explained variation and the unexplained variation)
•Explained Variation: 0.05375 Σ(yhat-ybar)2
• Unexplained Variation: 5.917745 Σ(y-yhat)2
Standard Error
• Standard error is a measure of the differences between the observed sample y-values and the predicted y-values obtained from the regression equation.
• This means standard error is a collective measure of the spread of the sample points about the regression line.
• Se= 0.44414
Residual Graph
• This is a good residual plot because the points are scattered about y=0 and there is no visible pattern.
95% prediction interval when X=3
2.6488<y<4.532
We are 95% confident that a student who works 3 hours a week has a GPA between 2.6488 and 4.532.
Residual values
• .90665, .49351, .34971, .51468, -.1728, -.1379, -.1255, .08475, .66723, -.1328, -.1817, -.7977, -.3477, .26212, -.6503, -.4678, .26212, -.1729
• .52708, -.3653, -.3255, -.4153, .07227, .41979, -.2179, .0846, .00592, .15847, -.124, .01468, -.1503, .54971, -.0627, .1373, -.6627
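One quick sanity check on a residual list is that least-squares residuals should add up to zero, apart from rounding. The sketch below copies the 35 residuals from this slide and sums them; the 0.01 tolerance is an arbitrary assumption that allows for the rounding of the listed values.

```python
# Residuals (observed GPA minus predicted GPA), copied from the slide.
residuals = [
    0.90665, 0.49351, 0.34971, 0.51468, -0.1728, -0.1379, -0.1255, 0.08475,
    0.66723, -0.1328, -0.1817, -0.7977, -0.3477, 0.26212, -0.6503, -0.4678,
    0.26212, -0.1729, 0.52708, -0.3653, -0.3255, -0.4153, 0.07227, 0.41979,
    -0.2179, 0.0846, 0.00592, 0.15847, -0.124, 0.01468, -0.1503, 0.54971,
    -0.0627, 0.1373, -0.6627,
]

total = sum(residuals)
positives = sum(1 for e in residuals if e > 0)
negatives = sum(1 for e in residuals if e < 0)

print("number of residuals:", len(residuals))
print("sum close to zero:", abs(total) < 0.01)   # assumed rounding tolerance
print("above the line:", positives, "below the line:", negatives)
```

A sum near zero with a similar number of points above and below the line is consistent with points scattered about y=0.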
Conclusion
• In conclusion, we found that for students at CCA, the number of hours worked did not affect their GPA.
• Typical students working 20 hours a week would be affected scholastically, but at CCA students can earn a 4.5 and work over 20 hours a week without their GPA being affected.
• There are many possible errors that could have occurred during our survey and skewed our data. For example: someone could have taken the survey twice, giving us the same data point twice; people’s hours worked and GPA could both have been estimates or guesses; and, depending on the job they have, respondents may have different circumstances for getting their homework done, based on the level of their classes, how many classes they’re taking, and how tired they are after work.