final review solns

Upload: morgan-sanchez

Post on 02-Jun-2018


  • 8/10/2019 Final Review Solns

    1/10

    Stat 305 Final Practice Solutions

    1. Enterprise Industries produces Fresh, a brand of liquid laundry detergent. In order to more effectively manage its inventory, the company would like to better predict demand for Fresh. To develop a prediction model, the company has gathered data concerning demand for Fresh over the last 30 sales periods (each sales period is defined to be a four-week period). For this data set, let

    $x_1$ = the price (in dollars) of Fresh as offered by Enterprise Industries in the sales period minus the average industry price (in dollars) of competitors' similar detergents in the sales period

    $x_2$ = Enterprise Industries' advertising expenditure (in hundreds of thousands of dollars) to promote Fresh in the sales period

    $y$ = the demand for Fresh (in hundreds of thousands of bottles) in the sales period

    Refer to Output A for parts (a) and (b).

    a) [4] Based on your interpretation of the scatterplots provided, state the model equations that might adequately describe the relationship of i) $y$ with $x_1$ and ii) $y$ with $x_2$. Your answers here should be similar in form to the following incorrect answer: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$.

    i) $y = \beta_0 + \beta_1 x_1 + \varepsilon$

    ii) $y = \beta_0 + \beta_1 x_2 + \beta_2 x_2^2 + \varepsilon$

    b) [2] If one were to fit the model $y \approx \beta_0 + \beta_1 x_1$ (which may or may not correctly reflect the relationship between $y$ and $x_1$) to the data set, what would be the value of $R^2$, the coefficient of determination?

    $R^2 = (0.8897)^2 = 0.7916$

    From among several models for $y$ as a function of $x_1$ and $x_2$, the following model (Model 1) was selected: $y = \beta_0 + \beta_1 x_2 + \beta_2 x_2^2 + \beta_3 x_1 + \beta_4 x_1 x_2 + \varepsilon$.

    c) [5] A normal quantile plot of residuals and a plot of the residuals versus the predicted values are shown in Output B. Describe how you may use these plots to examine whether certain model assumptions are appropriate here. State the assumptions under consideration and identify clearly the plot you would use for assessing each assumption.

    $\varepsilon \sim \text{iid } N(0, \sigma^2)$

    The constant variance assumption can be assessed by looking at the plot of the residuals versus the predicted values. If the residuals form a fan shape, then the constant variance assumption may not be appropriate for the data. If the model is appropriate for the data, then one hopes to see the residuals forming a cloud shape.


    The normality assumption could be assessed by looking at the normal quantile plot. If the points create a fairly linear pattern, especially in the middle of the plot, then the normality assumption could be appropriate for the data.

    Refer to Output C for parts (d) through (g).

    d) [4] Predict the demand for the next sales period (in hundreds of thousands of bottles) if the price difference will be -.20 (dollars) and the advertising expenditure for Fresh will be 5.0 (hundreds of thousands of dollars).

    $$\hat{y} = 29.11 - 7.61(5) + 0.67(5)^2 + 11.13(-0.20) - 1.48(-0.20 \cdot 5) = 7.064$$
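    The arithmetic in part (d) can be checked with a short Python sketch. The function name `predict_demand` is an illustrative choice, not something from the course materials, and the coefficients are the rounded values quoted in the solution rather than the full-precision estimates in Output C.

```python
def predict_demand(x1: float, x2: float) -> float:
    """Evaluate fitted Model 1 with the rounded coefficients from part (d).

    x1: price difference in dollars
    x2: advertising expenditure in hundreds of thousands of dollars
    Returns predicted demand in hundreds of thousands of bottles.
    """
    return 29.11 - 7.61 * x2 + 0.67 * x2**2 + 11.13 * x1 - 1.48 * x1 * x2


y_hat = predict_demand(x1=-0.20, x2=5.0)
print(round(y_hat, 3))
```

    Using the unrounded estimates from the regression output would change the prediction slightly in the later decimal places.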

    e) [5] Calculate a 90% confidence interval for $\beta_2$ using the information provided. Use this interval to test the hypotheses concerning whether a quadratic term is needed in the model or not. State your decision.

    $$0.6712 \pm 1.708(0.2027) = (0.32, 1.02)$$

    Since 0 is not in the interval, we can conclude that $\beta_2 \neq 0$. Hence the quadratic term is needed in the model. Note that the value used for $t$ is based on 25 df.
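    As a check on the interval in part (e), here is a minimal sketch of the estimate plus-or-minus margin calculation. The helper name `t_interval` is hypothetical; the estimate, standard error and t quantile (1.708, 25 df) are the values used in the solution.

```python
def t_interval(estimate: float, std_err: float, t_quantile: float):
    """Return (lower, upper) for estimate +/- t_quantile * std_err."""
    margin = t_quantile * std_err
    return estimate - margin, estimate + margin


# 90% interval for the quadratic coefficient, using the values from part (e).
lower, upper = t_interval(estimate=0.6712, std_err=0.2027, t_quantile=1.708)
print(f"({lower:.2f}, {upper:.2f})")

# The two-sided test at the 10% level rejects beta_2 = 0 when 0 is outside the interval.
zero_inside = lower <= 0.0 <= upper
print("0 inside interval:", zero_inside)
```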

    f) [4] State the null and the alternative hypotheses concerning whether the interaction term is needed in the model or not. Continue to follow the five-step format to perform a hypothesis test.

    $$H_0: \beta_4 = 0$$

    $$H_a: \beta_4 \neq 0$$

    $$t = \frac{-1.4777 - 0}{0.6672} = -2.21$$

    $$p\text{-value} = 0.0361$$

    Since the p-value is less than 0.05, we can reject the null hypothesis and conclude that the interaction term is needed in the model.
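    The test statistic in part (f) is the estimate divided by its standard error. The sketch below reproduces it and, as an alternative to reading the p-value from the output, compares it with 2.060, the .975 quantile of the t distribution with 25 df. That quantile is a standard table value brought in here as an assumption; the solution itself relies on the reported p-value.

```python
b4 = -1.4777        # estimated interaction coefficient (from the solution)
se_b4 = 0.6672      # its standard error (from the solution)
null_value = 0.0    # value of beta_4 under H0

t_stat = (b4 - null_value) / se_b4
print(round(t_stat, 2))

# Assumed table value: Q(.975) for t with 25 df.
t_crit = 2.060
reject_at_5_percent = abs(t_stat) > t_crit
print("reject H0 at the 5% level:", reject_at_5_percent)
```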

    g) [2] Give the estimate for $\sigma^2$.

    MSE = 0.04258


    h) [6] Since Enterprise Industries has to pay someone to visit several stores and gather information on the prices for similar detergents produced by competitors during every sales period, Enterprise Industries is wondering if using only advertising expenditure to predict demand is equivalent to using both advertising expenditure and price difference to predict demand. Output D contains the output for a model that uses only advertising expenditure to predict demand, i.e., $y = \beta_0 + \beta_1 x_2 + \beta_2 x_2^2 + \varepsilon$ (Model 2). Follow the five-step format to justify using Model 1 or Model 2 to predict demand. Note that you will need to provide the value of a test statistic that is distributed according to an $F$-distribution.

    $$H_0: \beta_3 = \beta_4 = 0$$

    $H_a$: at least one $\beta_j \neq 0$ for $j = 3, 4$.

    $$f = \frac{(2.18 - 1.06)/(27 - 25)}{1.06/25} = 13.2 \sim F_{2,25}$$

    $$Q(.95) = 3.39$$

    $$p\text{-value} < .001$$

    The p-value is less than 0.05, so we reject $H_0$. Therefore, use the full model (Model 1) to predict demand.
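    The partial F statistic in part (h) compares the error sums of squares of the reduced and full models. The sketch below recomputes it from the values in the solution. The function names are hypothetical. The tail-probability helper uses the identity $P[F_{2,\nu} > f] = (1 + 2f/\nu)^{-\nu/2}$, which holds only when the numerator degrees of freedom equal 2; it is included so the example needs nothing beyond the standard library.

```python
def partial_f(sse_reduced: float, df_reduced: int, sse_full: float, df_full: int) -> float:
    """Partial F statistic for comparing a reduced model to a full model."""
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


def f_upper_tail_num_df_2(f: float, df_den: int) -> float:
    """P[F > f] for an F distribution with 2 numerator df (special case only)."""
    return (1.0 + 2.0 * f / df_den) ** (-df_den / 2.0)


# Error sums of squares and error df for Model 2 (reduced) and Model 1 (full).
f_stat = partial_f(sse_reduced=2.18, df_reduced=27, sse_full=1.06, df_full=25)
p_value = f_upper_tail_num_df_2(f_stat, df_den=25)

print(round(f_stat, 1))
print("exceeds Q(.95) = 3.39:", f_stat > 3.39)
print("p-value below .001:", p_value < 0.001)
```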

    Output A:

    Output B:


    Output C:

    Output D:


    2. Fill in the blanks for the following Analysis of Variance table.

    Source     DF      Sum of Squares   Mean Square   F Ratio
    Model      4       400              __c__         __e__
    Error      __a__   __b__            __d__
    C. Total   24      800

    Answers: a=20, b=400, c=100, d=20, e=5.
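    Each blank follows from the additivity of degrees of freedom and sums of squares, and from the definitions of the mean squares and the F ratio. A minimal sketch (the variable names simply mirror the blank labels):

```python
df_model, ss_model = 4, 400
df_total, ss_total = 24, 800

a = df_total - df_model   # error df
b = ss_total - ss_model   # error sum of squares
c = ss_model / df_model   # model mean square
d = b / a                 # error mean square
e = c / d                 # F ratio

print(a, b, c, d, e)
```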

    3. A student measured his car mileage at different combinations of speed (55, 60, 65 and 70 mph) and octane (87 and 90). The data set consists of 24 observations, three for each combination of levels. The fitted model is in the form $\hat{y} = b_0 + b_1 x_{speed} + b_2 x_{octane}$. Refer to Output A to answer the following questions.

    a) Calculate $R^2$, the coefficient of determination.

    $R^2 = 235.14 / 246.63 = .95$

    b) Calculate the residual for the observation given in the first line of the data table.

    $$y_1 - \hat{y}_1 = 30 - (-133.029 - 0.176 \cdot 55 + 1.981 \cdot 87) = 30 - 29.637 = 0.363$$

    c) Give a 99% confidence interval for $\beta_1$.

    $$-0.176 \pm 2.831(0.027) = (-0.252, -0.100)$$

    d) Give the estimate for $\sigma$.

    $s = \sqrt{0.547} = 0.7396$
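    The quantities in parts (a) and (d) come from the same sums of squares. The sketch below ties them together under the assumption that the error sum of squares is 11.489 on 21 df, the values that appear in the F test of part (g). The final square root uses the mean square error rounded to 0.547, as the solution does.

```python
import math

sst = 246.63    # corrected total sum of squares (part a)
ssr = 235.14    # model sum of squares (part a)
sse = 11.489    # error sum of squares (value used in part g)
df_error = 21   # 24 observations minus 3 fitted coefficients

r_squared = ssr / sst
mse = sse / df_error

# Take the root of the rounded MSE, matching the solution's arithmetic.
s = math.sqrt(round(mse, 3))

print(round(r_squared, 2), round(mse, 3), round(s, 4))
```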

    f) Interpret the confidence interval calculated by JMP for the observation described by the last line in the data table.

    For a speed of 55 mph and 87 octane, we are 95% confident that the average mileage will be between 28.97 and 30.19 mpg.


    g) To compare model $y \approx \beta_0 + \beta_1 x_{speed} + \beta_2 x_{octane}$ to model $y \approx \beta_0$, state the null and alternative hypotheses, the formula for the test statistic, the formula with the appropriate values as provided by the JMP output, the p-value and the conclusion.

    $$H_0: \beta_1 = \beta_2 = 0$$

    $H_a$: at least one of $\beta_1$ or $\beta_2 \neq 0$

    $$f = \frac{(246.63 - 11.489)/(23 - 21)}{11.489/21} = 214.90$$

    $$P[F_{2,21} > f] < .0001$$

    Since the p-value is less than 0.05, we can conclude that at least one of the parameters is not equal to zero. Therefore, the model that includes both speed and octane, along with the appropriate parameter estimates, should be used to predict average mileage.
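    The overall F statistic in part (g) can be recomputed from the total and error sums of squares quoted in the solution. This is a sketch only; the variable names are illustrative, and the tail probability again relies on the closed form $(1 + 2f/\nu)^{-\nu/2}$ that applies when the numerator has exactly 2 degrees of freedom.

```python
sse_intercept_only = 246.63   # error SS for the model y ~ beta_0 (the corrected total SS)
df_intercept_only = 23        # 24 observations minus 1 coefficient
sse_full = 11.489             # error SS for the speed + octane model
df_full = 21                  # 24 observations minus 3 coefficients

f_stat = ((sse_intercept_only - sse_full) / (df_intercept_only - df_full)) / (sse_full / df_full)

# Upper tail of F with 2 numerator df (special-case identity, not a general formula).
p_value = (1.0 + 2.0 * f_stat / df_full) ** (-df_full / 2.0)

print(round(f_stat, 2))
print("p-value below .0001:", p_value < 1e-4)
```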

    Output A:

    First five lines from the data table:

    Speed   Octane   Mileage   Lower 95% Mean mileage   Upper 95% Mean mileage
    55      87       30        28.9687622               30.1929045
    60      87       29        28.2334503               29.164883
    65      87       28        27.3517837               28.2832163
    70      87       27        26.3237622               27.5479045
    55      87       30.5      28.9687622               30.1929045


    4. The Department of Transportation (DOT) conducted an experiment to determine the relationship between the curing process, which is characterized by time and temperature, and maximum compressive strength of concrete (psi). There are four levels for time: 1, 2, 5 and 10 days. There are three levels for temperature: 40, 60 and 80 degrees Fahrenheit. And there are three observations for each combination of levels.

    a) In order for inference about quantities such as $\beta_1$ to be valid, what do we need to assume about errors (residuals) for any linear regression model?

    $\varepsilon \sim \text{iid Normal}(0, \sigma^2)$

    b) Compare the model $y \approx \beta_0 + \beta_1 x_{time} + \beta_2 x_{temp}$ (refer to Output B) to the model $y \approx \beta_0 + \beta_1 x_{time} + \beta_2 x_{temp} + \beta_3 x_{time} x_{temp}$ (refer to Output C). Which model (along with the appropriate parameter estimates) would you use to predict strength and why? Follow the five-step format and provide a test statistic that is distributed according to the $F$-distribution.

    $$H_0: \beta_3 = 0$$

    $$H_a: \beta_3 \neq 0$$

    $$f = \frac{(364346.2 - 321450.5)/(33 - 32)}{321450.5/32} = 4.27 \sim F_{1,32}$$

    Using $F_{1,30}$: $Q(.95) = 4.17 < 4.27 < 7.56 = Q(.99)$, so $.01 < p\text{-value} < .05$.
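    The partial F statistic for the interaction term can be recomputed from the two error sums of squares. In this sketch the variable names are illustrative, and the bracketing quantiles 4.17 and 7.56 are the $F_{1,30}$ table values quoted in the solution, used because the table has no row for 32 denominator df.

```python
sse_reduced = 364346.2   # error SS without the interaction term (Output B)
df_reduced = 33          # 36 observations minus 3 coefficients
sse_full = 321450.5      # error SS with the interaction term (Output C)
df_full = 32             # 36 observations minus 4 coefficients

f_stat = ((sse_reduced - sse_full) / (df_reduced - df_full)) / (sse_full / df_full)
print(round(f_stat, 2))

# Table values for F with (1, 30) df, as quoted in the solution.
q95, q99 = 4.17, 7.56
between_quantiles = q95 < f_stat < q99
print("p-value between .01 and .05:", between_quantiles)
```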


    5. Circle either T (true) or F (false).

    T F Suppose a 95% confidence interval for the difference of two population

    means is (-1.3, 4.1). According to the 95% confidence interval, the p-value would

    be less than 0.05 based on a null hypothesis that states there is no difference.

    T F When one says (0,10) is a 95% CI for $\mu$, one means that the probability that $\mu$ lies within the interval is .95.

    T F A 99% confidence interval is wider than a 95% confidence interval for a given data set.

    Answers: F, F, T.

    6. Fill in the blank(s) with the appropriate answer.

    a) A Type __________ error occurs when one says that there is a difference between two population

    means when the difference is zero as stated by the null hypothesis.

    b) The letters iid stand for __________________________________________________.

    c) The sample mean for large samples (samples with 30 or more values) is approximately

    normally distributed according to the ____________ ____________ Theorem.

    Answers: I, independently and identically distributed, Central Limit.

    7. A new experimental drug to reduce cholesterol was developed. Five people were chosen to receive the new drug. Each person had his/her cholesterol measured before taking the drug. Then each person took the drug for a six-week period and had his/her cholesterol measured again.

    a) Give and interpret a 95% confidence interval for the mean difference between the before and after cholesterol measurements.

    Person   1     2     3     4     5
    Before   200   220   180   195   240
    After    180   190   165   175   180

    Differences: -20, -30, -15, -20, -60

    $\bar{d} = -29$, $s_d^2 = 330$, $df = n - 1 = 5 - 1 = 4$

    $$-29 \pm 2.776\sqrt{\frac{330}{5}} = \left(-29 - 2.776\sqrt{\frac{330}{5}},\ -29 + 2.776\sqrt{\frac{330}{5}}\right) = (-51.55, -6.45)$$


    We are 95% confident that the mean decrease in cholesterol after taking the new drug for six weeks will be between 6.45 and 51.55.
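    The paired interval can be reproduced from the raw before and after measurements. The sketch below uses only the standard library; 2.776 is the t quantile with 4 df used in the solution, and differences are taken as after minus before so that the signs match the solution.

```python
import math
import statistics

before = [200, 220, 180, 195, 240]
after = [180, 190, 165, 175, 180]

# Paired differences (after minus before); a negative value is a decrease.
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
d_bar = statistics.mean(diffs)        # sample mean of the differences
s2_d = statistics.variance(diffs)     # sample variance (divides by n - 1)

t_quantile = 2.776                    # Q(.975) for t with n - 1 = 4 df, from the solution
margin = t_quantile * math.sqrt(s2_d / n)
lower, upper = d_bar - margin, d_bar + margin

print(diffs, d_bar, s2_d)
print(f"({lower:.2f}, {upper:.2f})")
```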

    8. An engineer is concerned about spring lifetimes ($10^3$ cycles) under two different levels of stress: 900 N/mm$^2$ and 950 N/mm$^2$. Below are the data.

    950 N/mm$^2$: 225, 171, 198, 189, 189, 135, 162, 135, 117, 162

    900 N/mm$^2$: 216, 162, 153, 216, 225, 216, 306, 225, 243, 189

    Follow the five-step format to assess the strength of evidence that the difference in mean lifetimes between the 900 N/mm$^2$ stress level and the 950 N/mm$^2$ stress level is not equal to zero.

    $\bar{x}_{950} = 168.3$, $s^2_{950} = 1098.9$

    $\bar{x}_{900} = 215.1$, $s^2_{900} = 1844.1$

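    $$H_0: \mu_{900} - \mu_{950} = 0$$

    $$H_a: \mu_{900} - \mu_{950} \neq 0$$

    $$s_p = \sqrt{\frac{9(1098.9) + 9(1844.1)}{18}} = 38.36$$

    $$t = \frac{215.1 - 168.3 - 0}{38.36\sqrt{\frac{1}{10} + \frac{1}{10}}} = 2.73$$

    With 18 df, $.005 < 0.5(p\text{-value}) < .01$, so $.01 < p\text{-value} < .02$.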
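    The summary statistics, pooled standard deviation and t statistic for the ten-observation samples can be recomputed directly from the lifetimes listed in the problem. This is a minimal sketch with illustrative variable names; it assumes equal population variances, as the pooled procedure does.

```python
import math
import statistics

life_950 = [225, 171, 198, 189, 189, 135, 162, 135, 117, 162]
life_900 = [216, 162, 153, 216, 225, 216, 306, 225, 243, 189]

n_900, n_950 = len(life_900), len(life_950)
mean_900, mean_950 = statistics.mean(life_900), statistics.mean(life_950)
var_900, var_950 = statistics.variance(life_900), statistics.variance(life_950)

# Pooled standard deviation: variances weighted by their degrees of freedom.
df = n_900 + n_950 - 2
s_pooled = math.sqrt(((n_900 - 1) * var_900 + (n_950 - 1) * var_950) / df)

# Two-sample t statistic for H0: mu_900 - mu_950 = 0.
t_stat = (mean_900 - mean_950 - 0) / (s_pooled * math.sqrt(1 / n_900 + 1 / n_950))

print(round(mean_950, 1), round(var_950, 1), round(mean_900, 1), round(var_900, 1))
print(df, round(s_pooled, 2), round(t_stat, 2))
```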


    $\bar{x}_{950} = 154.1$, $s^2_{950} = 1315.2$, $n_{950} = 60$

    $\bar{x}_{900} = 168.8$, $s^2_{900} = 1902.8$, $n_{900} = 40$

    Follow the five-step format to assess the strength of evidence that the difference in mean lifetimes between the 900 N/mm$^2$ stress level and the 950 N/mm$^2$ stress level is not equal to zero.

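    $$H_0: \mu_{900} - \mu_{950} = 0$$

    $$H_a: \mu_{900} - \mu_{950} \neq 0$$

    $$z = \frac{168.8 - 154.1 - 0}{\sqrt{\frac{1902.8}{40} + \frac{1315.2}{60}}} = 1.73$$

    $$p\text{-value} = P[|Z| > 1.73] = 2P[Z > 1.73] = .0836$$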

    The p-value is greater than .05, so we will not reject the null hypothesis. Hence, there is not enough evidence to conclude that there is a difference in mean lifetimes between the 900 N/mm$^2$ stress level and the 950 N/mm$^2$ stress level.
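    The two-sided p-value follows from the standard normal distribution once the test statistic is in hand. The sketch below takes the statistic 1.73 reported in the solution as given, rather than recomputing it, and uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

z = 1.73  # test statistic as reported in the solution

# Two-sided p-value: P[|Z| > z] = 2 * P[Z > z] for a standard normal Z.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(p_value, 4))
print("reject H0 at the 5% level:", p_value < 0.05)
```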