Upload: raymond-turner

Post on 25-Dec-2015

Diploma in Statistics

Introduction to Regression, Lecture 5.1

1. Review

2. Transforming data, the log transform

i. liver fluke egg hatching rate

ii. explaining CEO remuneration

iii. brain weights and body weights

3. SLR with transformed data

4. Transforming X, quadratic fit

5. Other options

Using t values

Convention: n > 30 is big, n < 30 is small.

$Z_{0.05} = 1.96 \approx 2$

$t_{30,\,0.05} = 2.04 \approx 2$
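The normal critical value quoted above can be checked with Python's standard library. This is a minimal sketch: the helper name `two_tailed_z` is ours, not the lecture's, and the t value is copied from the slide because the standard library has no t-distribution.

```python
from statistics import NormalDist


def two_tailed_z(alpha: float) -> float:
    """Return z such that P(|Z| > z) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)


z_05 = two_tailed_z(0.05)
t_30_05 = 2.04  # copied from the slide, not computed here

print(f"z(0.05)     = {z_05:.2f}")
print(f"t(30, 0.05) = {t_30_05:.2f}")
print("both round to 2:", round(z_05) == 2 and round(t_30_05) == 2)
```

Both values are close enough to 2 that "estimate plus or minus 2 standard errors" is a reasonable working rule for moderate and large samples.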

Selected critical values for the t-distribution

ν      .25    .10    .05     .02     .01     .002     .001
1      2.41   6.31   12.71   31.82   63.66   318.32   636.61
2      1.60   2.92   4.30    6.96    9.92    22.33    31.60
3      1.42   2.35   3.18    4.54    5.84    10.22    12.92
4      1.34   2.13   2.78    3.75    4.60    7.17     8.61
5      1.30   2.02   2.57    3.36    4.03    5.89     6.87
6      1.27   1.94   2.45    3.14    3.71    5.21     5.96
7      1.25   1.89   2.36    3.00    3.50    4.79     5.41
8      1.24   1.86   2.31    2.90    3.36    4.50     5.04
9      1.23   1.83   2.26    2.82    3.25    4.30     4.78
10     1.22   1.81   2.23    2.76    3.17    4.14     4.59
12     1.21   1.78   2.18    2.68    3.05    3.93     4.32
15     1.20   1.75   2.13    2.60    2.95    3.73     4.07
20     1.18   1.72   2.09    2.53    2.85    3.55     3.85
24     1.18   1.71   2.06    2.49    2.80    3.47     3.75
30     1.17   1.70   2.04    2.46    2.75    3.39     3.65
40     1.17   1.68   2.02    2.42    2.70    3.31     3.55
60     1.16   1.67   2.00    2.39    2.66    3.23     3.46
120    1.16   1.66   1.98    2.36    2.62    3.16     3.37
∞      1.15   1.64   1.96    2.33    2.58    3.09     3.29

Homework 4.2.1

Quantify the extent of the recovery in Year 6, Q3.

$\hat{P}$ = 1030 Q1 + 1292 Q2 + 1210 Q3 + 1279 Q4 + 33.7 Time

Year 6 Q2: P = 1657

$\hat{P}$ = 1292 + 33.7 × 22 = 2033

P – $\hat{P}$ = 1657 – 2033 = –376

Year 6 Q3: P = 2185

$\hat{P}$ = 1210 + 33.7 × 23 = 1985

P – $\hat{P}$ = 2185 – 1985 = 200
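The Homework 4.2.1 arithmetic can be reproduced in a few lines. This is a sketch using the rounded coefficients shown on the slide; the function and variable names are ours.

```python
# Quarterly coefficients and trend, rounded as on the slide.
QUARTER_COEF = {"Q1": 1030, "Q2": 1292, "Q3": 1210, "Q4": 1279}
TIME_COEF = 33.7


def fitted(quarter: str, time: int) -> float:
    """Fitted value for a given quarter indicator and time index."""
    return QUARTER_COEF[quarter] + TIME_COEF * time


def residual(actual: float, quarter: str, time: int) -> float:
    """Actual value minus fitted value."""
    return actual - fitted(quarter, time)


res_q2 = residual(1657, "Q2", 22)  # Year 6, Q2
res_q3 = residual(2185, "Q3", 23)  # Year 6, Q3

print(f"Year 6 Q2: fitted {fitted('Q2', 22):.0f}, residual {res_q2:.0f}")
print(f"Year 6 Q3: fitted {fitted('Q3', 23):.0f}, residual {res_q3:.0f}")
```

The residual moves from well below the fitted trend in Q2 to above it in Q3, which is one way of quantifying the recovery.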

Homework 4.2.2

List correspondences between the output from the original regression and the output from the alternative regression.

Confirm that the coefficients of Q1, Q2 and Q3 in the original regression equal the corresponding coefficients in the alternative regression plus its Constant, which is the Q4 coefficient of the original.

Predictor    Coef      SE Coef   T       P
Noconstant
Q1           1029.87   23.41     43.99   0.000
Q2           1292.35   24.45     52.85   0.000
Q3           1210.42   25.55     47.37   0.000
Q4           1278.70   26.71     47.88   0.000
Time         33.725    1.619     20.83   0.000

S = 40.9654

Predictor    Coef      SE Coef   T       P
Constant     1278.70   26.71     47.88   0.000
Q1           -248.82   26.36     -9.44   0.000
Q2           13.65     26.11     0.52    0.609
Q3           -68.27    25.96     -2.63   0.019
Time         33.725    1.619     20.83   0.000

S = 40.9654
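The correspondence asked for in Homework 4.2.2 can be checked numerically from the two coefficient tables. This is a small sketch, with the dictionaries typed in from the output above; the names are ours.

```python
# Coefficients from the original (no-constant) regression.
original = {"Q1": 1029.87, "Q2": 1292.35, "Q3": 1210.42, "Q4": 1278.70}

# Coefficients from the alternative regression (Constant plus Q1..Q3).
constant = 1278.70
alternative = {"Q1": -248.82, "Q2": 13.65, "Q3": -68.27}

gaps = {}
for quarter, coef in alternative.items():
    rebuilt = constant + coef
    gaps[quarter] = abs(rebuilt - original[quarter])
    print(f"{quarter}: {constant} + ({coef}) = {rebuilt:.2f}"
          f"  vs original {original[quarter]:.2f}")

max_gap = max(gaps.values())
print(f"largest discrepancy: {max_gap:.2f}")
```

The discrepancies are no larger than the rounding in the printed output. The Time coefficient and S are identical in both fits, as expected when the two models differ only in how the quarterly levels are parametrised.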

Homework 4.2.3

1. Calculate the simple linear regressions of Jobtime on each of T_Ops and Units. Confirm the corresponding t-values.

2. Calculate the simple linear regression of Jobtime on Ops per Unit. Comment on the negative correlation of Jobtime with Ops per Unit in the light of the corresponding t-value.

3. Confirm the calculation of the $R^2$ values.

Solution 4.2.3

2. Calculate the simple linear regression of Jobtime on Ops per Unit. Comment on the negative correlation of Jobtime with Ops per Unit in the light of the corresponding t-value.

Comment: The t-value is insignificant; the negative correlation is just chance variation, with no substantive meaning.

Variance Inflation Factors

$$\mathrm{SE}(\hat\beta_k) = \frac{\sigma}{\sqrt{n}\, s_k} \times \frac{1}{\sqrt{1 - R_k^2}}$$

$$R_k^2 = 0 \;\Rightarrow\; \mathrm{SE}(\hat\beta_k) = \frac{\sigma}{\sqrt{n}\, s_k}$$

$$\frac{1}{\sqrt{1 - R_k^2}} = \text{standard error inflation factor}$$

$$\frac{1}{1 - R_k^2} = \text{variance inflation factor}$$

Convention: problem if $R_k^2$ > 90% or $\mathrm{VIF}_k$ > 10
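The two inflation factors are simple functions of $R_k^2$, the $R^2$ obtained by regressing the k-th explanatory variable on the others. The sketch below, with function names of our choosing, shows why the 90% and 10 thresholds in the convention are the same rule.

```python
from math import sqrt


def vif(r_squared_k: float) -> float:
    """Variance inflation factor 1 / (1 - R_k^2)."""
    if not 0.0 <= r_squared_k < 1.0:
        raise ValueError("R_k^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared_k)


def se_inflation(r_squared_k: float) -> float:
    """Standard error inflation factor, the square root of the VIF."""
    return sqrt(vif(r_squared_k))


for r2 in (0.0, 0.5, 0.9, 0.99):
    print(f"R_k^2 = {r2:4.2f}  VIF = {vif(r2):7.2f}"
          f"  SE inflation = {se_inflation(r2):5.2f}")
```

At $R_k^2 = 0.9$ the variance is inflated tenfold, so the standard error is a little over three times what it would be with uncorrelated explanatory variables.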

What to do?

• Get new X values, to break correlation pattern

– impractical in observational studies

• Choose a subset of the X variables

– manually

– automatically

• stepwise regression

• other methods

Residential load survey data.

Data collected by a US electricity supplier during an investigation of the factors that influence peak demand for electricity by residential customers.

Load is demand at the system peak demand hour, in kW,

Size (X1) is house size, in SqFt/1000,

Income (X2) is annual family income, in $/1000,

AirCon (X3) is air conditioning capacity, in tons,

Index (X4) is the house appliance index, in kW,

Residents (X5) is number in house on a typical day

Matrix plot

Results

All variables in:

Predictor    Coef         SE Coef      T       P
Constant     0.1263       0.2289       0.55    0.585
Size         -2.6689      0.9059       -2.95   0.006
Income       0.00027912   0.00007892   3.54    0.001
AirCon       0.42462      0.03472      12.23   0.000
Index        0.00038137   0.00007884   4.84    0.000
Residents    0.00197      0.02218      0.09    0.930

Income deleted:

Predictor    Coef      SE Coef   T       P
Constant     -397.0    492.7     -0.81   0.426
Size         10943.3   594.2     18.42   0.000
AirCon       -1.86     75.45     -0.02   0.980
Index        0.0721    0.1709    0.42    0.676
Residents    38.65     47.75     0.81    0.424

Exercise

Calculate the VIF for Size. Comment.

Homework

Calculate variance inflation factors for all explanatory variables. Discuss.

Multicollinearity

Occurs when there is perfect correlation within the X variables.

Example: Indicators

Illustration: Minitab

2. Transforming data, the log transform

(i) Hatching of liver fluke eggs

The life cycle of the liver fluke

1. Adults in liver lay eggs

2. Animals excrete eggs

3. Eggs hatch on ground

4. Larvae seek snail

5. Development within snail

6. Emergence from snail

7. Consumption by animal

8. Penetration to liver

Hatching of liver fluke eggs: Duration and Success rate

Duration and success rate of hatching of 600 liver fluke eggs at a series of fixed temperatures

Temperature (°C)   Number hatched   Duration (mean days)   SD     Hatch%
10                 546              115.75                 2.14   91.0
13                 543              56.50                  2.33   90.5
16                 534              32.39                  1.98   89.0
18                 501              24.49                  1.41   83.5
20                 499              18.92                  1.39   83.1
22                 497              15.58                  1.23   82.8
24                 465              13.39                  1.03   77.5
26                 448              11.98                  1.28   74.0
28                 438              10.16                  0.94   73.0
30                 432              9.45                   0.96   72.0
32                 256              10.37                  0.94   42.5
34                 42               11.52                  0.85   7.0
35                 0

[Figure: Scatterplot of Duration vs Temperature]

[Figure: Scatterplot of Log(Duration) vs Temperature]
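The transformed variable in this plot can be recomputed from the hatching table. The sketch below assumes Log means log base 10, which is consistent with the plot's vertical scale of roughly 1.0 to 2.2; the list names are ours.

```python
from math import log10

# Temperature (°C) and mean hatching duration (days), from the table above.
# The 35 °C row is omitted because no eggs hatched and no duration is given.
temperature = [10, 13, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34]
duration = [115.75, 56.50, 32.39, 24.49, 18.92, 15.58,
            13.39, 11.98, 10.16, 9.45, 10.37, 11.52]

log_duration = [log10(d) for d in duration]

for t, d, ld in zip(temperature, duration, log_duration):
    print(f"{t:3d} °C  duration {d:7.2f}  log10(duration) {ld:5.3f}")
```

On the raw scale the durations span more than a twelvefold range, dominated by the coldest temperatures. On the log scale the values are spread far more evenly, which is what straightens the left-hand part of the curve.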

(ii) Explaining CEO Compensation and Company Sales (Forbes magazine, May 1994)

[Figure: Scatterplot of Total comp vs Sales]

Explaining CEO Remuneration, bivariate log transformation

[Figure: Scatterplot of LogComp vs LogSales]

(iii) Mammals' Brainweight vs Bodyweight

Species                     Bodyweight   Brainweight
African elephant            6654         5712
African giant pouched rat   1            6.6
Arctic fox                  3.385        44.5
Arctic ground squirrel      0.92         5.7
Asian elephant              2547         4603
Brachiosaurus               87000        154.5
Baboon                      10.55        179.5
Big brown bat               0.023        0.3
Brazilian tapir             160          169
Cat                         3.3          25.6
Chimpanzee                  52.16        440
●                           ●            ●
●                           ●            ●
●                           ●            ●

Scatterplot view

[Figure: Scatterplot of Brainweight vs Bodyweight]

Scatterplot view, log transform

[Figure: Scatterplot of LBrainW vs LBodyW]

Scatterplot view, dinosaurs deleted

[Figure: Scatterplot of LBrainW vs LBodyW]

Histogram view

[Figures: Histogram of Brainweight; Histogram of Bodyweight]

Histogram view, log transform

[Figures: Histogram of LBrainW; Histogram of LBodyW]

Changing spread with log

Why the log transform works

High spread at high X is transformed to low spread at high Y.

Low spread at low X is transformed to high spread at low Y.

10 to 100

transformed to

$\log_{10}(10)$ to $\log_{10}(10^{2})$

i.e., 1 to 2

1/10 = 0.1 to 1/100 = 0.01

transformed to

$\log_{10}(10^{-1})$ to $\log_{10}(10^{-2})$

i.e., –1 to –2
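The same point can be seen numerically: successive tenfold steps are very unequal on the raw scale but exactly one unit apart on the log10 scale. A minimal sketch, with variable names of our choosing:

```python
from math import log10

values = [0.01, 0.1, 1, 10, 100]
logs = [log10(v) for v in values]

# Gaps between neighbouring values, before and after the transform.
raw_gaps = [b - a for a, b in zip(values, values[1:])]
log_gaps = [b - a for a, b in zip(logs, logs[1:])]

for v, lv in zip(values, logs):
    print(f"x = {v:6.2f}   log10(x) = {lv:5.2f}")
print("raw gaps:", [round(g, 2) for g in raw_gaps])
print("log gaps:", [round(g, 2) for g in log_gaps])
```

The gap from 10 to 100 is a thousand times the gap from 0.01 to 0.1 on the raw scale, yet both become a gap of 1 after the transform. This is the compression at high values and stretching at low values described above.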

3. SLR with transformed data

SLR with transformed data: LBrainW versus LBodyW

The regression equation is

LBrainW = 0.932 + 0.753 LBodyW

Predictor   Coef      SE Coef   T       P
Constant    0.93237   0.04170   22.36   0.000
LBodyW      0.75309   0.02858   26.35   0.000

S = 0.302949
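A fitted line on the log-log scale corresponds to a power law on the original scale. The sketch below back-transforms the equation above, assuming the L prefix means log base 10 (the base used elsewhere in the lecture). The function names are ours, and the units are whatever units the data table uses.

```python
from math import log10

INTERCEPT = 0.93237  # Constant, from the Minitab output above
SLOPE = 0.75309      # LBodyW coefficient, from the Minitab output above


def predict_log_brain(log_body: float) -> float:
    """Fitted LBrainW for a given LBodyW."""
    return INTERCEPT + SLOPE * log_body


def predict_brain(body: float) -> float:
    """Back-transformed prediction: 10**INTERCEPT * body**SLOPE."""
    return 10 ** predict_log_brain(log10(body))


multiplier = 10 ** INTERCEPT   # predicted Brainweight when Bodyweight = 1
tenfold_factor = 10 ** SLOPE   # effect of a tenfold increase in Bodyweight

print(f"Brainweight ≈ {multiplier:.2f} × Bodyweight^{SLOPE:.3f}")
print(f"a tenfold increase in Bodyweight multiplies predicted "
      f"Brainweight by about {tenfold_factor:.2f}")
```

Because the slope is below 1, predicted brain weight grows less than proportionally with body weight.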

Application: Do humans conform?

[Figure: Scatterplot of LBrainW vs LBodyW, with the Human point labelled]

Application: Do humans conform?

• Delete the Human data,

• calculate regression,

• predict human LBrainW and

• compare to actual, relative to s

Application: Do humans conform?

Regression Analysis: LBrainW versus LBodyW

The regression equation is

LBrainW = 0.924 + 0.744 LBodyW

Predictor   Coef      SE Coef   t       p
Constant    0.92410   0.03933   23.50   0.000
LBodyW      0.74383   0.02706   27.48   0.000

S = 0.285036

Application: Do humans conform?

LBodyW(Human) = 1.79239

LBrainW(Human) = 3.12057

Predicted LBrainW = 0.924 + 0.744 × 1.79239 = 2.25754

Residual = 3.12057 – 2.25754 = 0.86303

Residual / s = 0.86303 / 0.285036 = 3.03
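The calculation above can be reproduced directly. This sketch uses the rounded coefficients from the regression equation fitted with the Human case deleted; the variable names are ours.

```python
# Regression fitted without the Human case (rounded as in the equation).
intercept, slope = 0.924, 0.744
s = 0.285036

# Observed values for Human on the log scale.
log_body_human = 1.79239
log_brain_human = 3.12057

predicted = intercept + slope * log_body_human
residual = log_brain_human - predicted
standardised = residual / s

print(f"Predicted LBrainW = {predicted:.5f}")
print(f"Residual          = {residual:.5f}")
print(f"Residual / s      = {standardised:.2f}")
```

The observed human value lies about three residual standard deviations above the line fitted to the other species.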

Deleted residuals

For each potentially exceptional case:

– delete the case

– calculate the regression from the rest

– use the fitted equation to calculate a deleted fitted value

– calculate the deleted residual = observed value – deleted fitted value

Minitab does this automatically for all cases!
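The steps listed above can be sketched for simple linear regression in plain Python. The data here are hypothetical, chosen so the answer is easy to verify by hand. The function names are ours, and the result is the unscaled deleted residual as defined in the list, not necessarily the scaled quantity a statistics package reports.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def deleted_residuals(xs, ys):
    """For each case: refit without it, then observed minus deleted fitted."""
    results = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(rest_x, rest_y)
        deleted_fit = intercept + slope * xs[i]
        results.append(ys[i] - deleted_fit)
    return results


# Hypothetical data: y = 1 + 2x exactly, except the last case is 5 too high.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 16.0]

del_res = deleted_residuals(xs, ys)
full_intercept, full_slope = fit_line(xs, ys)
ordinary_last = ys[-1] - (full_intercept + full_slope * xs[-1])

print("deleted residuals:", [round(r, 2) for r in del_res])
print(f"last case: ordinary residual {ordinary_last:.2f}, "
      f"deleted residual {del_res[-1]:.2f}")
```

The exceptional case pulls the full-data line towards itself, so its ordinary residual understates how unusual it is. Deleting it before fitting removes that effect.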

Application: Do humans conform?

With 63 cases, we do not expect to see any cases with residuals exceeding 3 standard deviations.

On the other hand, recalling the scatter plot, the humans do not appear particularly exceptional. The dotplot view of deleted residuals emphasises this:

[Figure: dotplot of deleted residuals, with Human and Water Opossum labelled]

Water opossums appear more exceptional.
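The remark that no residual beyond 3 standard deviations is expected among 63 cases can be checked roughly, under the assumption (ours, for illustration) that standardised residuals behave like independent standard normal values.

```python
from statistics import NormalDist

n_cases = 63

# Two-tailed probability that a standard normal value exceeds 3 in size.
p_beyond_3 = 2 * (1 - NormalDist().cdf(3.0))

# Expected number of such cases among 63.
expected_count = n_cases * p_beyond_3

print(f"P(|Z| > 3)               = {p_beyond_3:.4f}")
print(f"expected cases out of {n_cases} = {expected_count:.2f}")
```

The expected count is well below 1, so a single case beyond 3 standard deviations is noteworthy but not impossible by chance.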

Application: Do humans conform?

[Figure: Probability Plot of Deleted Residuals (Deleted Residuals vs Score); AD 0.385, P-Value 0.383]

4. Transforming X, quadratic fit

Optimising a nicotine extraction process

In determining the quantity of nicotine in different samples of tobacco, temperature is a key variable in optimising the extraction process. A study of this phenomenon involving analysis of 18 samples produced these data.

Optimising a nicotine extraction process

Regression Analysis: Nicotine versus Temperature

The regression equation is

Nicotine = 2.61 + 0.0247 Temperature

Predictor     Coef       SE Coef    T       P
Constant      2.6086     0.2121     12.30   0.000
Temperature   0.024656   0.003579   6.89    0.000

S = 0.217412   R-Sq = 74.8%

Optimising a nicotine extraction process, quadratic fit

[Figure: Scatterplot of Nicotine vs Temperature]

Optimising a nicotine extraction process, quadratic fit

The regression equation is

Nicotine = 1.20 + 0.0767 Temperature - 0.000453 Temp-sqr

Predictor     Coef         SE Coef     T       P
Constant      1.2041       0.6312      1.91    0.076
Temperature   0.07674      0.02257     3.40    0.004
Temp-sqr      -0.0004529   0.0001943   -2.33   0.034

S = 0.192398   R-Sq = 81.5%

Optimising a nicotine extraction process, quadratic fit, case 5 excluded

The regression equation is

Nicotine = 1.21 + 0.0750 Temperature - 0.000419 Temp-sqr

Predictor     Coef         SE Coef     T       P
Constant      1.2096       0.5129      2.36    0.033
Temperature   0.07504      0.01835     4.09    0.001
Temp-sqr      -0.0004189   0.0001583   -2.65   0.019

S = 0.156321   R-Sq = 88.6%
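Since the aim is to optimise the process, one natural use of a fitted quadratic is to locate its turning point, which lies at Temperature = –b1 / (2 b2). The sketch below does this for the two quadratic fits reported in this section. The function and dictionary names are ours, and the lecture itself does not report these turning points.

```python
def turning_point(b1: float, b2: float) -> float:
    """Temperature at which b0 + b1*T + b2*T**2 peaks (requires b2 < 0)."""
    if b2 >= 0:
        raise ValueError("the quadratic has a maximum only when b2 < 0")
    return -b1 / (2 * b2)


# (Constant, Temperature, Temp-sqr) coefficients from the Minitab output.
fits = {
    "all 18 cases": (1.2041, 0.07674, -0.0004529),
    "case 5 excluded": (1.2096, 0.07504, -0.0004189),
}

peaks = {}
for label, (b0, b1, b2) in fits.items():
    t_peak = turning_point(b1, b2)
    nicotine_peak = b0 + b1 * t_peak + b2 * t_peak ** 2
    peaks[label] = (t_peak, nicotine_peak)
    print(f"{label}: peak at Temperature ≈ {t_peak:.1f}, "
          f"fitted Nicotine ≈ {nicotine_peak:.2f}")
```

Both turning points fall near the upper end of the temperature axis shown in the scatterplot (30 to 90), and the estimate shifts by several degrees when case 5 is excluded. The location of the optimum should therefore be treated as imprecise.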

5. Other options

• Other functions,

– e.g., 1/Y, √Y, Y², etc., same for X

• Generalised linear models,

– choose a function of Y, a model for

• etc.

Reading

EM Section 6.7.1

Hamilton, Ch. 5

Extra Notes: More on log