140610090075 r. nindya kartika rachim (bioinformatika)

Upload: nindydyy

Post on 04-Apr-2018

  • 7/31/2019 140610090075 R. Nindya Kartika Rachim (Bioinformatika)

    1. Gene CD33. Use grep to find the index of the important gene CD33 among the list of characters golub.gnames. For each test below formulate the null hypothesis, the p-value and your conclusion.

    To perform computations on the expression values of this gene we need to know its row index. This can be obtained with the grep function:

    > grep("CD33",golub.gnames[,2])

    [1] 808

    The grep function returns 808, so the expression values of antigen CD33 are available at golub[808,] and further information on the gene at golub.gnames[808,].

    The expression values of gene CD33 from the ALL patients can now be printed to the screen, as follows:
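As a minimal sketch of what grep returns, the example below searches a small made-up character vector and uses the resulting index to select a row of a toy matrix. The names and values in toy_names and toy_expr are illustrative assumptions, not entries copied from golub.gnames or golub.

```r
# Hypothetical gene descriptions standing in for golub.gnames[, 2]
toy_names <- c("Zyxin",
               "CD33 CD33 antigen (differentiation antigen)",
               "MYC proto-oncogene")

# Toy expression matrix: one row per gene, four samples (made-up values)
toy_expr <- matrix(c( 0.1,  0.2, 0.3, 0.4,
                     -0.5, -1.3, 0.5, 0.7,
                      1.0,  1.1, 0.9, 1.2),
                   nrow = 3, byrow = TRUE)

# grep returns the positions of the elements that match the pattern
idx <- grep("CD33", toy_names)

# The same index selects the matching row of the expression matrix
cd33_values <- toy_expr[idx, ]
print(idx)
print(cd33_values)
```

Because grep returns every matching position, it is worth checking that the result has length one before using it as a row index.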

    > golub[808,gol.fac=="ALL"]

    [1] -0.57277 -1.38539 -0.47039 -0.41469 -0.15402 -1.21719 -1.37386

    [8] -0.52956 -1.10366 -0.74396 -0.97673 -0.00787 -0.99141 -1.05662

    [15] -1.39503 -0.73418 -0.67921 -0.87388 -0.82569 -1.12953 -0.75991

    [22] -0.92231 -1.13505 -1.46474 -0.59614 -1.04821 -1.23051

    One way to visualize the data is a histogram, which divides the range of data values into a number of intervals and shows the frequency in each. It can be drawn as follows:

    > hist(golub[808, gol.fac=="ALL"])

    To test the null hypothesis that the ALL gene expression values of CD33 from Golub et al. are normally distributed, the Shapiro-Wilk test can be used as follows.

    > shapiro.test(golub[808, gol.fac=="ALL"])

    Shapiro-Wilk normality test

    data: golub[808, gol.fac == "ALL"]

    W = 0.9696, p-value = 0.592

    From the computation, we get p-value = 0.592 > α = 0.05, so the null hypothesis is not rejected: the CD33 expression values of the ALL patients are consistent with a normal distribution.

    The expression values of gene CD33 from the AML patients can now be printed to the screen, as follows:
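The decision rule applied here can be sketched in a few lines. The vector cd33_all below is typed in from the ALL values printed earlier (rounded to five decimals, so the result may differ slightly from the output obtained on golub itself), and the names alpha, sw and reject_normality are chosen only for this illustration.

```r
# CD33 expression values of the 27 ALL patients, as printed in the text
cd33_all <- c(-0.57277, -1.38539, -0.47039, -0.41469, -0.15402, -1.21719, -1.37386,
              -0.52956, -1.10366, -0.74396, -0.97673, -0.00787, -0.99141, -1.05662,
              -1.39503, -0.73418, -0.67921, -0.87388, -0.82569, -1.12953, -0.75991,
              -0.92231, -1.13505, -1.46474, -0.59614, -1.04821, -1.23051)

alpha <- 0.05                      # significance level used throughout

# Shapiro-Wilk test; H0: the values come from a normal distribution
sw <- shapiro.test(cd33_all)

# Reject H0 only when the p-value falls below alpha
reject_normality <- sw$p.value < alpha
print(sw$p.value)
print(reject_normality)
```

A p-value above alpha means the data give no evidence against normality; it does not prove that the population is normal.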

    > golub[808,gol.fac=="AML"]

    [1] -0.38605 0.50814 0.70283 1.05902 0.38602 -0.19413 1.10560

    [8] 0.76630 0.48881 -0.13785 -0.40721

    As before, a histogram of these values can be drawn as follows:

    > hist(golub[808, gol.fac=="AML"])

    To test the null hypothesis that the AML gene expression values of CD33 from Golub et al. are normally distributed, the Shapiro-Wilk test can be used as follows.

    > shapiro.test(golub[808, gol.fac=="AML"])

    Shapiro-Wilk normality test

    data: golub[808, gol.fac == "AML"]

    W = 0.9121, p-value = 0.2583

    From the computation, we get p-value = 0.2583 > α = 0.05, so the null hypothesis is not rejected: the CD33 expression values of the AML patients are consistent with a normal distribution.

    The null hypothesis for gene CD33 that the variance of the ALL patients equals that of the AML patients can be tested by the built-in function var.test, as follows.

    > var.test(golub[808,] ~ gol.fac)

    F test to compare two variances

    data: golub[808, ] by gol.fac

    F = 0.4605, num df = 26, denom df = 10, p-value = 0.1095

    alternative hypothesis: true ratio of variances is not equal to 1

    95 percent confidence interval:

    0.1376700 1.1923646

    sample estimates:

    ratio of variances

    0.4604523

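    From the computation, we get p-value = 0.1095 > α = 0.05, so the null hypothesis is not rejected: there is no evidence that the variances of the two populations differ.

    The null hypothesis that the mean CD33 expression is equal for the ALL and AML patients can be tested with the built-in function t.test, assuming equal variances. Recall that the corresponding gene expression values are collected in row 808 of golub.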
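To show where the F statistic comes from, the sketch below recomputes the variance ratio and its two-sided p-value by hand. The vectors are typed in from the values printed earlier (rounded to five decimals), and all variable names are chosen for this illustration.

```r
# CD33 expression values as printed in the text (27 ALL, 11 AML patients)
cd33_all <- c(-0.57277, -1.38539, -0.47039, -0.41469, -0.15402, -1.21719, -1.37386,
              -0.52956, -1.10366, -0.74396, -0.97673, -0.00787, -0.99141, -1.05662,
              -1.39503, -0.73418, -0.67921, -0.87388, -0.82569, -1.12953, -0.75991,
              -0.92231, -1.13505, -1.46474, -0.59614, -1.04821, -1.23051)
cd33_aml <- c(-0.38605, 0.50814, 0.70283, 1.05902, 0.38602, -0.19413, 1.10560,
               0.76630, 0.48881, -0.13785, -0.40721)

# F statistic: ratio of the two sample variances (ALL over AML)
f_stat <- var(cd33_all) / var(cd33_aml)
df1 <- length(cd33_all) - 1        # numerator degrees of freedom
df2 <- length(cd33_aml) - 1        # denominator degrees of freedom

# Two-sided p-value: twice the smaller tail probability of the F distribution
p_lower <- pf(f_stat, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

print(c(F = f_stat, df1 = df1, df2 = df2, p = p_value))
```

The result should agree closely with the var.test output above, up to the rounding of the typed-in values.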

    > t.test(golub[808,] ~ gol.fac, var.equal=TRUE)

    Two Sample t-test

    data: golub[808, ] by gol.fac

    t = -7.9813, df = 36, p-value = 1.773e-09

    alternative hypothesis: true difference in means is not equal to 0

    95 percent confidence interval:

    -1.5487898 -0.9211602

    sample estimates:

    mean in group ALL mean in group AML

    -0.8812041 0.3537709
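The statistic reported by t.test with var.equal=TRUE can be reproduced from the pooled variance. The sketch below does this by hand on the values printed earlier (rounded to five decimals); the variable names are chosen for this illustration.

```r
# CD33 expression values as printed in the text (27 ALL, 11 AML patients)
cd33_all <- c(-0.57277, -1.38539, -0.47039, -0.41469, -0.15402, -1.21719, -1.37386,
              -0.52956, -1.10366, -0.74396, -0.97673, -0.00787, -0.99141, -1.05662,
              -1.39503, -0.73418, -0.67921, -0.87388, -0.82569, -1.12953, -0.75991,
              -0.92231, -1.13505, -1.46474, -0.59614, -1.04821, -1.23051)
cd33_aml <- c(-0.38605, 0.50814, 0.70283, 1.05902, 0.38602, -0.19413, 1.10560,
               0.76630, 0.48881, -0.13785, -0.40721)

n1 <- length(cd33_all)
n2 <- length(cd33_aml)
df <- n1 + n2 - 2                  # degrees of freedom of the pooled test

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(cd33_all) + (n2 - 1) * var(cd33_aml)) / df

# t statistic: difference in means divided by its standard error
t_stat <- (mean(cd33_all) - mean(cd33_aml)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

print(c(t = t_stat, df = df, p = p_value))
```

Pooling is appropriate here because the F test did not reject equality of the variances; without that assumption, t.test defaults to the Welch approximation.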

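    From the computation, we get p-value = 1.773e-09 < α = 0.05, so the null hypothesis of equal means is rejected: the mean CD33 expression differs between the ALL and AML patients.

    > boxplot(golub[1788,] ~ gol.fac)

    The null hypothesis that the mean expression is equal for the ALL and AML patients can be tested with the built-in function t.test. Recall that the corresponding gene expression values are collected in row 1788 of golub.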

    > t.test(golub[1788,] ~ gol.fac, var.equal=TRUE)

    Two Sample t-test

    data: golub[1788, ] by gol.fac

    t = -0.178, df = 36, p-value = 0.8597

    alternative hypothesis: true difference in means is not equal to 0

    95 percent confidence interval:

    -0.6874315 0.5764734

    sample estimates:

    mean in group ALL mean in group AML

    -0.3046481 -0.2491691

    From the computation, we get p-value = 0.8597 > α = 0.05, so the null hypothesis is not rejected: there is no evidence that the means of the two populations differ.

    3. HOXA9. Gene "HOXA9 Homeo box A9" with expression values in row 1391, can cause leukemia (Golub et al., 1999).

    To test the null hypothesis that the ALL gene expression values of "HOXA9 Homeo box A9" from Golub et al. are normally distributed, the Shapiro-Wilk test can be used as follows.

    > shapiro.test(golub[1391, gol.fac=="ALL"])

    Shapiro-Wilk normality test

    data: golub[1391, gol.fac == "ALL"]

    W = 0.5831, p-value = 1.318e-07

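    From the computation, we get p-value = 1.318e-07 < α = 0.05, so the null hypothesis of normality is rejected for the ALL expression values of HOXA9.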

    The null hypothesis that the expression values for gene "HOXA9 Homeo box A9" are equally distributed for the ALL patients and the AML patients can be tested by the built-in function wilcox.test, as follows.

    > wilcox.test(golub[1391,] ~ gol.fac)

    Wilcoxon rank sum test

    data: golub[1391, ] by gol.fac

    W = 34, p-value = 7.923e-05

    alternative hypothesis: true location shift is not equal to 0

    From the computation, we get p-value = 7.923e-05 < α = 0.05, so the null hypothesis of equal distributions is rejected.
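The W statistic of the rank sum test can be illustrated on a small example. Since the HOXA9 values are not printed in the text, the vectors x and y below are made-up numbers used only to show the computation; they are not taken from golub.

```r
# Hypothetical samples (no ties), standing in for two patient groups
x <- c(1.2, 0.4, 2.5)
y <- c(3.1, 2.9, 0.9, 4.0)

n1 <- length(x)

# Rank all observations together, then sum the ranks of the first group
ranks <- rank(c(x, y))
rank_sum_x <- sum(ranks[seq_len(n1)])

# W as reported by wilcox.test: rank sum minus its smallest possible value
w_manual <- rank_sum_x - n1 * (n1 + 1) / 2

# Compare with the built-in test
w_builtin <- unname(wilcox.test(x, y)$statistic)
print(c(manual = w_manual, builtin = w_builtin))
```

Because the test uses only ranks, it does not require the normality assumption that the Shapiro-Wilk test rejected for HOXA9.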