exploring relationships between variables ch. 10 scatterplots, associations, and correlations ch. 10...

Post on 24-Dec-2015

Exploring relationships between variables

Ch. 10 Scatterplots, Associations, and Correlations

Scatterplots

• Shows change over time
• Shows patterns
• Shows trends
• Relationships
• Outlier values

Scatterplots

• The association can be positive or negative
• Show the relationship between 2 variables
• Can be shown in more depth through the z-scores of both variables (ZX, ZY)

Z-scores

• ZX = (X - MeanX) / Standard Deviation (SX)
• ZY = (Y - MeanY) / Standard Deviation (SY)
• Calculate the standard deviation in the same way as before.
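A minimal sketch of the z-score formula, assuming the sample standard deviation (n - 1 divisor). The data set and the function name z_scores are made up for illustration and are not from the slides.

```python
import statistics

def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - mean) / sd for v in values]

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
zx = z_scores(x)
print([round(z, 3) for z in zx])
```

After standardizing, the z-scores have mean 0 and standard deviation 1, which is what lets two variables with different units be compared on the same scale.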

Ratio

• Correlation coefficient (r)
• r = Sum of (ZX * ZY) / (n - 1)
• Correlation measures the strength of the linear association between 2 variables
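A short sketch of the correlation coefficient as the sum of the products of the paired z-scores divided by n - 1. The data and the function name correlation are illustrative assumptions, not part of the slides.

```python
import statistics

def correlation(xs, ys):
    """r = sum(zx * zy) / (n - 1), using sample standard deviations."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 4))
```

The result always falls between -1 and 1; a variable correlated with itself gives r = 1.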

Variables

• Explanatory Variable – X
• Response Variable – Y

Least-Squares Line

• y = a + bx
• a = y-intercept
• b = slope
• a = ybar - b * xbar
• b = SSxy / SSx
• SSx = Sum of squares of x

SSx

• This is calculated by first obtaining the sum of each squared x
• You then subtract the square of the sum of x, divided by n: SSx = Sum(x^2) - (Sum(x))^2 / n
• You can get SSx on the calculator by squaring the standard deviation, then multiplying it by (n - 1)
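As a quick check, both routes to SSx can be compared in a few lines. This is a sketch on made-up data; the variable names are illustrative.

```python
import statistics

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
n = len(x)

# Computational formula: sum of squared x minus (sum of x) squared over n
ss_x_formula = sum(v * v for v in x) - sum(x) ** 2 / n

# Calculator shortcut: sample variance times (n - 1)
ss_x_shortcut = statistics.stdev(x) ** 2 * (n - 1)

print(ss_x_formula, round(ss_x_shortcut, 6))
```

The two values agree up to floating-point rounding because the sample variance is defined as SSx / (n - 1).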

SSxy

• Sum of cross products of x and y
• Take the sum of each x value times its paired y value
• You then subtract from that total (Sum of x) * (Sum of y) / n: SSxy = Sum(xy) - (Sum(x) * Sum(y)) / n

SSxy

• The computational formula for SSxy is a more efficient way of computing the sum of each (x - xbar) * (y - ybar)
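A sketch that computes SSxy both ways and then uses it for the slope and intercept of the least-squares line y = a + bx. The data and the function name least_squares are assumptions made for illustration.

```python
def least_squares(xs, ys):
    """Return (a, b, ss_xy) for the line y = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    # Computational form of SSxy
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    b = ss_xy / ss_x          # slope
    a = y_bar - b * x_bar     # intercept
    return a, b, ss_xy

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b, ss_xy = least_squares(x, y)

# Deviation form of SSxy, for comparison
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
ss_xy_dev = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

print(round(a, 4), round(b, 4), round(ss_xy, 4), round(ss_xy_dev, 4))
```

The computational form needs only running totals of x, y, and xy, so it avoids a second pass over the data to subtract the means.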

Complete Guided Ex. #3 page 566

Standard Error of Estimate

• Se = square root of [ Sum of (y - yp)^2 / (n - 2) ]
• How to calculate: Se = square root of [ (SSy - b * SSxy) / (n - 2) ], where SSy is the sum of squares of y, computed the same way as SSx
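A sketch of the standard error of estimate computed two ways: directly from the squared differences between observed and predicted y, and through the algebraically equivalent shortcut (SSy - b * SSxy) / (n - 2). Reading the slide's shortcut in that form is an assumption, as are the data and the function name.

```python
import math

def standard_error(xs, ys):
    """Return Se from the definition and from the SSy - b*SSxy shortcut."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_x
    a = y_bar - b * x_bar
    # Definition: sum of squared (observed - predicted), divided by n - 2
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_definition = math.sqrt(sse / (n - 2))
    # Shortcut (assumed reading of the slide)
    se_shortcut = math.sqrt((ss_y - b * ss_xy) / (n - 2))
    return se_definition, se_shortcut

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
se_definition, se_shortcut = standard_error(x, y)
print(round(se_definition, 4), round(se_shortcut, 4))
```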

Residuals

• You can graph the residuals to check whether the regression model fits the data
• A residual is the difference between the observed value and the predicted value
• Residual = observed - predicted
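A minimal sketch of computing residuals from a fitted least-squares line. The data and the helper names fit_line and residuals are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_x
    return y_bar - b * x_bar, b

def residuals(xs, ys):
    """Residual = observed y minus predicted y, for each point."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
res = residuals(x, y)
print([round(e, 2) for e in res])
```

For a least-squares line the residuals sum to zero, so the thing to look for in a residual plot is a pattern (a curve or a fan shape), not the overall level.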

Confidence Intervals

• yp - E < y < yp + E
• yp = predicted value of y
• E = margin of error

What does this mean? (better understanding)

Types of data

• Outlier
• Leverage
• Influential Point
• Lurking Variable

Outlier

• Any data point that stands away from the others

Leverage

• Data points with X-values that are far from the mean of X
• Can alter the least-squares regression line

Influential Point

• Omitting this point can drastically alter the regression model
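A small sketch of how one high-leverage point can be influential: the slope is fitted with and without a single point whose x-value is far from the others. The data, including the extra point (20, 3), are made up for illustration.

```python
def slope(xs, ys):
    """Least-squares slope b = SSxy / SSx."""
    n = len(xs)
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return ss_xy / ss_x

# Hypothetical data; the last point has an x-value far from the mean
x = [1, 2, 3, 4, 5, 20]
y = [2, 4, 5, 4, 5, 3]

b_with = slope(x, y)
b_without = slope(x[:-1], y[:-1])  # omit the high-leverage point
print(round(b_with, 4), round(b_without, 4))
```

In this made-up example the slope changes sign when the point is omitted, which is the kind of drastic change that marks a point as influential.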

Lurking Variable

• A variable that is hidden from the equation
• It is not explicitly part of the model but affects the way the variables in the model appear to be related
