bum2413 - applied statistics 21112

Upload: syada-hageda

Post on 17-Mar-2016


DESCRIPTION

ump question

TRANSCRIPT

  • Universiti Malaysia Pahang

    FACULTY OF INDUSTRIAL SCIENCES & TECHNOLOGY

    FINAL EXAMINATION

    COURSE : APPLIED STATISTICS

    COURSE CODE : BUM2413/BSU1023/BCT2053/BPF3313/BKU2032/BAM3022

    LECTURER : DR ROSLINAZAIRIMAH BINTI ZAKARIA, NOR HAFIZAH BINTI MOSLIM, AZLYNA BINTI SENAWI, MOHD RASHID BIN AB HAMID, NOR AZILA BINTI CHE MUSA, NOOR FADHILAH BINTI AHMAD RADI

    DATE : 5 JUNE 2012

    DURATION : 3 HOURS

    SESSION/SEMESTER : SESSION 2011/2012 SEMESTER II

    PROGRAMME CODE : BSB/BSK/BAA/BAE/BCN/BCG/BCS/BPP/BPT/BPS/BFF/BFM/BKC/BKG/BKB/BMM/BMI/BMB/BMF/BMA/BEE/BEP/BEC

    INSTRUCTIONS TO CANDIDATES

    1. This question paper consists of SEVEN (7) questions. Answer all questions.
    2. All answers to a new question should start on a new page.
    3. All the calculations and assumptions must be clearly stated.
    4. Candidates are not allowed to bring any material other than those allowed by the invigilator into the examination room.

    EXAMINATION REQUIREMENTS:
    1. Statistical Table
    2. Scientific Calculator

    DO NOT TURN THIS PAGE UNTIL YOU ARE TOLD TO DO SO

    This examination paper consists of FIFTEEN(15) printed pages including front page.

  • CONFIDENTIAL BSB/BSK/BAA/BAE/BCN/BCG/BCS/BPP/BPT/BPS/BFF/BFM/BKC/BKG/BKB/BMM/BMI/BMB/BMF/BMA/BEE/BEP/BEC/1112II/BUM2413/BSU1023/BCT2053/BPF3313/BKU2032/BAM3022

    QUESTION 1

    An article in the Journal of Strain Analysis compares the Karlsruhe and Lehigh methods for predicting the shear strengths of steel plate girders. The data for the two methods are shown in Table 1.

    Girder   Karlsruhe Method   Lehigh Method
    G1       1.186              1.067
    G2       1.151              0.992
    G3       1.322              1.063
    G4       1.339              1.062
    G5       1.200              1.062
    G6       1.402              1.178
    G7       1.365              1.037
    G8       1.537              1.086
    G9       1.559              1.052

    Table 1: Shear strengths for steel plate girders of two methods

    (a) Find the mean and standard deviation for the difference of methods in Table 1. (3 Marks)

    (b) Find a 98% confidence interval for the mean difference in shear strengths between Karlsruhe and Lehigh methods.

    (4 Marks)

    (c) Is there any mean difference between the two methods? By assuming the data is normally distributed, test the hypothesis at 5% level of significance.

    (7 Marks)
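    A minimal sketch of the arithmetic behind parts (a) to (c), assuming the Table 1 values exactly as transcribed and treating the girders as matched pairs. The critical value 2.896 is an assumption read from a standard t table for 8 degrees of freedom; the variable names are illustrative only. The two-sided 5% decision in part (c) still requires the corresponding table value.

```python
from math import sqrt
from statistics import mean, stdev

# Shear-strength values as transcribed in Table 1 (girders G1..G9).
karlsruhe = [1.186, 1.151, 1.322, 1.339, 1.200, 1.402, 1.365, 1.537, 1.559]
lehigh = [1.067, 0.992, 1.063, 1.062, 1.062, 1.178, 1.037, 1.086, 1.052]

# Paired differences d = Karlsruhe - Lehigh.
diffs = [k - l for k, l in zip(karlsruhe, lehigh)]
n = len(diffs)
d_bar = mean(diffs)        # sample mean of the differences
s_d = stdev(diffs)         # sample standard deviation (n - 1 divisor)
std_err = s_d / sqrt(n)    # standard error of the mean difference

# Assumption: t_{0.01, 8} = 2.896, read from a statistical table.
T_CRIT_98 = 2.896
ci_98 = (d_bar - T_CRIT_98 * std_err, d_bar + T_CRIT_98 * std_err)

# Test statistic for H0: mu_D = 0 against H1: mu_D != 0.
t_test = (d_bar - 0.0) / std_err

print(f"mean = {d_bar:.4f}, sd = {s_d:.4f}")
print(f"98% CI = ({ci_98[0]:.4f}, {ci_98[1]:.4f})")
print(f"t_test = {t_test:.3f} with {n - 1} degrees of freedom")
```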


    QUESTION 2

    The variability in the thickness of the oxide layers is a critical characteristic of semiconductor wafers. Low variability of the oxide thickness is desirable for subsequent processing steps. Two different mixtures of gases are being studied to determine whether one is superior in reducing the variability of the oxide thickness. Twenty-one wafers are etched in each gas. For gas A, the mean oxide thickness is 10.05 angstroms and the standard deviation is 1.96 angstroms, while for gas B the mean is 13.22 angstroms and the standard deviation is 2.13 angstroms.

    (a) Find a 98% confidence interval for population mean for mixture of gas B and give the interpretation of the parameter estimate.

    (5 Marks)

    (b) Determine whether the two mixtures of gases differ in the variability of oxide layer thickness at the 0.1 level of significance.

    (8 Marks)

    (c) Can we conclude that the mean oxide thickness for gas A is not more than that for gas B at the 10% significance level? Base your assumption about the population variances on your answer in (b).

    (8 Marks)
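    The test statistics for parts (b) and (c) can be sketched from the summary figures given in the question. This is an illustration under stated assumptions, not a model answer: the larger sample variance is placed in the numerator of the F ratio (a common convention, assumed here), and the pooled t statistic is appropriate only if part (b) does not reject equal variances. Critical values must still be read from the statistical table.

```python
from math import sqrt

# Summary statistics quoted in Question 2.
n_a, mean_a, sd_a = 21, 10.05, 1.96   # gas A
n_b, mean_b, sd_b = 21, 13.22, 2.13   # gas B

var_a, var_b = sd_a ** 2, sd_b ** 2

# (b) F statistic for comparing two variances
# (assumption: larger variance on top).
f_test = max(var_a, var_b) / min(var_a, var_b)
df_num = df_den = n_a - 1  # both samples have 21 wafers

# (c) Pooled-variance t statistic, valid only under equal variances.
sp2 = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
t_pooled = (mean_a - mean_b) / sqrt(sp2 * (1 / n_a + 1 / n_b))
df_pooled = n_a + n_b - 2

print(f"f_test = {f_test:.4f} with ({df_num}, {df_den}) degrees of freedom")
print(f"t_test = {t_pooled:.4f} with {df_pooled} degrees of freedom")
```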


    QUESTION 3

    A researcher wishes to see whether there is any difference in the weight gains of athletes following one of three special diets. Athletes are randomly assigned to three groups and placed on three different diets for six weeks. The weight gains (kg) are shown below.

    Type of diet   Weight gains (kg)
    Diet           2 3 4 2
    Diet           5 6 5 7 4 3
    Diet           4 1 2 1 1 3

    Table 2: Weight gains (kg) for different types of diet

    (a) How many treatments are involved in the experiment? (1 Mark)

    (b) Based on the data in Table 2, is there any treatment effect among the types of diet at the 5% level of significance?

    (16 Marks)


    QUESTION 4

    (a) What is the difference between simple linear regression and multiple linear regression?

    (1 Mark)

    (b) A fitted simple linear regression equation is given by $\hat{y} = 12.9 + 2.34x$ where $n = 10$, $S_{yy} = 929.98$ and $S_{xy} = 389.93$.

    (i) Calculate the value of correlation coefficient, r and comment on the value.

    (4 Marks)

    (ii) Complete the ANOVA table below and test for linearity of the regression line at the α = 0.05 significance level.

    Source of variation   Sum of squares   Degrees of freedom   Mean square   F test
    Regression                             1
    Residual
    Total                 929.98

    Table 3: ANOVA

    (12 Marks)
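    A hedged sketch of how the ANOVA table and the correlation coefficient follow from the quantities in part (b). It assumes the two garbled sums in the transcript are $S_{yy} = 929.98$ (which matches the Total row of Table 3) and $S_{xy} = 389.93$, and it uses the appendix relations $SSR = \hat{\beta}_1 S_{xy}$ and $MS_{Res} = (S_{yy} - \hat{\beta}_1 S_{xy})/(n-2)$. Variable names are illustrative.

```python
from math import copysign, sqrt

# Quantities from Question 4(b); the S_yy / S_xy reading is an assumption.
n = 10
beta1 = 2.34      # fitted slope
s_yy = 929.98     # total sum of squares
s_xy = 389.93

# ANOVA decomposition for simple linear regression.
ssr = beta1 * s_xy          # regression sum of squares
sse = s_yy - ssr            # residual sum of squares
df_reg, df_res = 1, n - 2
msr = ssr / df_reg
mse = sse / df_res
f_test = msr / mse

# r^2 = SSR / SST; r takes the sign of the slope.
r = copysign(sqrt(ssr / s_yy), beta1)

print(f"Regression: SS = {ssr:.4f}, df = {df_reg}, MS = {msr:.4f}, F = {f_test:.2f}")
print(f"Residual:   SS = {sse:.4f}, df = {df_res}, MS = {mse:.4f}")
print(f"Total:      SS = {s_yy:.2f}, df = {n - 1}")
print(f"r = {r:.4f}")
```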


    QUESTION 5

    A study was performed to investigate the performance of car models produced by the U.S., Japan, Germany and Sweden between 1978 and 1979. The fuel mileage in kilometers per liter is believed to be related to car weight (in thousand kilograms), drive ratio and horsepower. A multiple regression analysis is conducted to determine the multiple linear regression equation which gives the best fit to the data. The following are the Excel outputs of the multiple regression analysis for the study.

    SUMMARY OUTPUT

    Regression Statistics
    Multiple R          0.9031
    R Square            0.8155
    Adjusted R Square   0.8104
    Standard Error      2.9508
    Observations        38

    ANOVA
                 df   SS          MS          F          Significance F
    Regression    1   1293.5156   1293.5155   159.1610   8.88939E-15
    Residual     36    292.5752      8.1271
    Total        37   1586.0908

                Coefficients  Standard Error  t Stat    P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
    Intercept   48.71         1.9537          24.9311   0.0000   44.7452    52.6697    44.7452      52.6697
    Weight      -8.36         0.6630          -12.5159  0.0000   -9.7093    -7.0199    -9.7093      -7.0199

    SUMMARY OUTPUT

    Regression Statistics
    Multiple R          0.4172
    R Square            0.1741
    Adjusted R Square   0.1511
    Standard Error      6.0323
    Observations        38

    ANOVA
                 df   SS          MS         F        Significance F
    Regression    1    276.1009   276.1009   7.5876   0.0092
    Residual     36   1309.9899    36.3886
    Total        37   1586.0908

                  Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
    Intercept     8.44          6.0065          1.4046   0.1687   -3.7453    20.6181    -3.7453      20.6181
    Drive Ratio   5.28          1.9158          2.7546   0.0092   1.3917     9.1624     1.3917       9.1624


    SUMMARY OUTPUT

    Regression Statistics
    Multiple R          0.8713
    R Square            0.7591
    Adjusted R Square   0.7524
    Standard Error      3.2576
    Observations        38

    ANOVA
                 df   SS          MS          F        Significance F
    Regression    1   1204.0530   1204.0530   113.45


    SUMMARY OUTPUT

    Regression Statistics
    Multiple R          0.9095
    R Square            0.8272
    Adjusted R Square   0.8173
    Standard Error      2.7986
    Observations        38

    ANOVA
                 df   SS          MS         F         Significance F
    Regression    2   1311.9659   655.9830   83.7553   4.55362E-14
    Residual     35    274.1249     7.8321
    Total        37   1586.0908

                 Coefficients  Standard Error  t Stat    P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
    Intercept    48.94         1.9240          25.4379   0.0000   45.0361    52.8475    45.0361      52.8479
    Weight       -6.06         1.6338          -3.7119   0.0007   -9.3814    -2.7477    -9.3814      -2.7477
    Horsepower   -0.07         0.0437          -1.5348   0.1338   -0.1557    0.0216     -0.1557      0.0216

    SUMMARY OUTPUT

    Regression Statistics
    Multiple R          0.8793
    R Square            0.7732
    Adjusted R Square   0.7602
    Standard Error      3.2051
    Observations        38

    ANOVA
                 df   SS          MS         F         Significance F
    Regression    2   1226.3751   613.1875   59.6626   5.29113E-12
    Residual     35    359.7157    10.2776
    Total        37   1586.0908

                  Coefficients  Standard Error  t Stat    P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
    Intercept     54.63         5.7676          9.4714    0.0000   42.9182    66.3359    42.9182      66.3359
    Drive Ratio   -1.86         1.2597          -1.4737   0.1495   -4.4140    0.7009     -4.4140      0.7009
    Horsepower    -0.24         0.0247          -9.6157   0.0000   -0.2872    -0.1871    -0.2872      -0.1871


    SUMMARY OUTPUT

    Regression Statistics
    Multiple R          0.9482
    R Square            0.8991
    Adjusted R Square   0.8902
    Standard Error      2.1595
    Observations        38

    ANOVA
                 df   SS          MS         F          Significance F
    Regression    3   1426.0575   475.3525   100.9914   5.26413E-17
    Residual     34    160.0333     4.7069
    Total        37   1586.0908

                  Coefficients  Standard Error  t Stat    P-value       Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
    Intercept     70.28         4.5838          15.3326   0.0000        60.9661    79.5969    60.9661      79.5969
    Weight        -9.28         1.4255          -6.5133   0.0000        -12.1813   -6.3875    -12.1813     -6.3876
    Drive Ratio   -4.72         0.9595          -4.9234   0.0000        -6.6736    -2.7739    -6.6736      -2.7739
    Horsepower    -0.04         0.0342          -1.2432   [illegible]   -0.1121    0.0270     -0.1121      0.0270

    Based on the given Excel outputs, answer the following.

    (a) State the response and predictor variables. (2 Marks)

    (b) Extract the relevant information and summarize your results in a table. (7 Marks)

    (c) Hence, determine the best regression equation for predicting the mileage (kilometer per liter) value.

    (2 Marks)
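    One way to organize the comparison in part (b) is to recompute the adjusted R Square of each model from its R Square, the 38 observations and the number of predictors k, using the standard relation adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below is illustrative only: the R Square and reported adjusted values are as read from the transcribed Excel outputs (some digits in the transcript are damaged, so treat them as assumptions), and the model labels are inferred from the coefficient rows.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for a model with k predictors and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

N_OBS = 38

# (R Square, number of predictors, adjusted R Square as read from the outputs)
models = {
    "Weight": (0.8155, 1, 0.8104),
    "Drive Ratio": (0.1741, 1, 0.1511),
    "Horsepower": (0.7591, 1, 0.7524),
    "Weight + Horsepower": (0.8272, 2, 0.8173),
    "Drive Ratio + Horsepower": (0.7732, 2, 0.7602),
    "Weight + Drive Ratio + Horsepower": (0.8991, 3, 0.8902),
}

for name, (r2, k, reported) in models.items():
    computed = adjusted_r2(r2, N_OBS, k)
    print(f"{name:36s} R2 = {r2:.4f}  adj R2 = {computed:.4f}  (reported {reported:.4f})")
```

    Agreement between the computed and reported values is a quick check that the figures were read correctly before they go into the summary table; the choice of the best equation in part (c) should also weigh the individual P-values of the coefficients.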


    QUESTION 6

    The goals scored per match by MyKid football team gave the following results:

    Number of goals per match   0             1    2    3    4    5    6    7
    Number of matches           [illegible]   18   29   18   10   7    3    1

    Table 4: Goals scored in all football matches

    Test whether the number of goals per match follows a Poisson distribution at the 10% significance level.

    (12 Marks)

    QUESTION 7

    In a study of the television viewing habits of children, a developmental psychologist selects a random sample of 300 primary students: 100 boys and 200 girls. Each student is asked which of the following TV programs they prefer most: Word World, Dibo the Gift Dragon or Mickey Mouse Clubhouse. Results are shown in Table 5 below.

                   Viewing Preferences
                   Word World   Dibo the Gift Dragon   Mickey Mouse Clubhouse   Row total
    Boys           50           30                     20                       100
    Girls          50           80                     70                       200
    Column total   100          110                    90                       300

    Table 5: TV program viewing preferences

    At α = 0.01, is there enough evidence to support the claim that the proportions of viewing preferences for boys are equal for each of the three TV programs?

    (8 Marks)
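    A minimal sketch of the goodness-of-fit statistic for this question, using only the Boys row of Table 5 and equal expected counts under the null hypothesis of equal proportions. The critical value 9.210 is an assumption read from a chi-square table for 2 degrees of freedom at the 0.01 level; the names are illustrative.

```python
# Observed preferences for the 100 boys (Table 5, Boys row).
observed = [50, 30, 20]

# Under H0 the three proportions are equal, so each expected count is 100/3.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E.
chi2_test = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Assumption: chi-square critical value for alpha = 0.01, df = 2, from a table.
CHI2_CRIT = 9.210
reject_h0 = chi2_test > CHI2_CRIT

print(f"chi2_test = {chi2_test:.3f}, df = {df}, reject H0: {reject_h0}")
```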

    END OF QUESTION PAPER


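    Appendix - Table of Formulas

    Confidence Intervals, Sample Sizes and Hypothesis Testing

    Confidence intervals for $\mu$ / Hypothesis testing for $\mu$

    $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$ ; $z_{test} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

    $\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$ ; $z_{test} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$

    $\left(\bar{x} - t_{\alpha/2,v}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,v}\frac{s}{\sqrt{n}}\right)$ ; $t_{test} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ where $v = n - 1$

    Confidence intervals for $\mu_1 - \mu_2$ / Hypothesis testing for $\mu_1 - \mu_2$

    $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ ; $z_{test} = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

    For $\sigma_1^2 \neq \sigma_2^2$:

    $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ ; $z_{test} = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$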


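    For $\sigma_1^2 \neq \sigma_2^2$:

    $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ ; $t_{test} = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

    where $v = \dfrac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$

    For $\sigma_1^2 = \sigma_2^2$:

    $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ ; $z_{test} = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

    For $\sigma_1^2 = \sigma_2^2$:

    $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,v}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ ; $t_{test} = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

    where $v = n_1 + n_2 - 2$

    Pooled estimator, $s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

    Confidence Interval for $\mu_D$ / Hypothesis Testing for $\mu_D$

    $\left(\bar{d} - z_{\alpha/2}\frac{s_D}{\sqrt{n}},\ \bar{d} + z_{\alpha/2}\frac{s_D}{\sqrt{n}}\right)$ ; $t_{test} = \frac{\bar{d} - \mu_D}{s_D/\sqrt{n}}$ where $v = n - 1$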


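    $\left(\bar{d} - t_{\alpha/2,n-1}\frac{s_D}{\sqrt{n}},\ \bar{d} + t_{\alpha/2,n-1}\frac{s_D}{\sqrt{n}}\right)$

    Confidence intervals for $\pi$ / Hypothesis testing for $\pi$

    $\left(p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}},\ p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right)$ ; $z_{test} = \frac{p - \pi_0}{\sqrt{\frac{\pi_0(1-\pi_0)}{n}}}$

    Confidence intervals for $\pi_1 - \pi_2$ / Hypothesis testing for $\pi_1 - \pi_2$

    $(p_1 - p_2) \pm z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$ ; $z_{test} = \frac{(p_1 - p_2) - \pi_0}{\sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}}}$

    If $\pi_0 = 0$, $z_{test} = \frac{p_1 - p_2}{\sqrt{p_p(1-p_p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$ where $p_p = \frac{x_1 + x_2}{n_1 + n_2}$

    Confidence intervals for $\sigma^2$ / Hypothesis testing for $\sigma^2$

    $\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2,v}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,v}}\right)$ ; $\chi^2_{test} = \frac{(n-1)s^2}{\sigma_0^2}$

    where $v = n - 1$

    Confidence intervals for $\sigma_1^2/\sigma_2^2$ / Hypothesis testing for $\sigma_1^2/\sigma_2^2$

    $\left(\frac{s_1^2}{s_2^2}\cdot\frac{1}{f_{\alpha/2,v_1,v_2}},\ \frac{s_1^2}{s_2^2}\cdot f_{\alpha/2,v_2,v_1}\right)$ ; $f_{test} = \frac{s_1^2}{s_2^2}$

    where $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$

    Sample sizes

    $n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$ ; $n = p(1-p)\left(\frac{z_{\alpha/2}}{E}\right)^2$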


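    Analysis of Variance (ANOVA)

    One-way ANOVA

    $SST = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \frac{x_{..}^2}{N}$

    $SS(Tr) = \sum_{i=1}^{k}\frac{x_{i.}^2}{n_i} - \frac{x_{..}^2}{N}$

    $SSE = SST - SS(Tr)$

    Two-way ANOVA

    $SST = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r} x_{ijk}^2 - \frac{x_{...}^2}{abr}$

    $SSA = \frac{1}{br}\sum_{i=1}^{a} x_{i..}^2 - \frac{x_{...}^2}{abr}$

    $SSB = \frac{1}{ar}\sum_{j=1}^{b} x_{.j.}^2 - \frac{x_{...}^2}{abr}$

    $SSAB = \frac{1}{r}\sum_{i=1}^{a}\sum_{j=1}^{b} x_{ij.}^2 - \frac{x_{...}^2}{abr} - SSA - SSB$

    $SSE = SST - SSA - SSB - SSAB$

    Goodness of Fit Test and Contingency Tables

    Goodness of Fit Test

    $\chi^2_{test} = \sum_{i=1}^{k}\frac{(O_i - E_i)^2}{E_i}$

    Free distribution DoF: $v = k - 1$
    Hypothesized distribution DoF: $v = k - p - 1$

    Test using Contingency Tables

    $\chi^2_{test} = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(O_{ij} - E_{ij})^2}{E_{ij}}$ with $E_{ij} = \frac{n_{i.}\, n_{.j}}{n}$

    where $v = (r - 1)(c - 1)$

    Simple Linear Regression and Correlation

    $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$

    $S_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$, $S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$, $S_{yy} = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$

    Regression line equation: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ where $\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$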


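    Hypothesis Testing for Intercept, $\beta_0$

    $t_{test} = \frac{\hat{\beta}_0 - \beta_0}{s.e.(\hat{\beta}_0)}$ where $s.e.(\hat{\beta}_0) = \sqrt{MS_{Res}\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)}$

    Hypothesis Testing for Slope, $\beta_1$

    $t_{test} = \frac{\hat{\beta}_1 - \beta_1}{s.e.(\hat{\beta}_1)}$ where $s.e.(\hat{\beta}_1) = \sqrt{\frac{MS_{Res}}{S_{xx}}}$

    Sum of Squares Regression, SSR

    $SSR = \hat{\beta}_1 S_{xy}$

    Mean Square Residual, $MS_{Res}$

    $MS_{Res} = \frac{S_{yy} - \hat{\beta}_1 S_{xy}}{n - 2}$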
