meta-analysis

Upload: jude

Post on 26-Jan-2016

DESCRIPTION

Meta-analysis. Sessions 1.2-1.3: Effect Size Calculation. Funded through the ESRC's Researcher Development Initiative. Department of Education, University of Oxford. The effect size makes meta-analysis possible.

TRANSCRIPT

  • Sessions 1.2-1.3: Effect Size Calculation
    Funded through the ESRC's Researcher Development Initiative
    Department of Education, University of Oxford

  • The effect size makes meta-analysis possible
    * It is based on the dependent variable (i.e., the outcome)
    * It standardizes findings across studies such that they can be directly compared
    * Any standardized index can be an effect size (e.g., standardized mean difference, correlation coefficient, odds-ratio), but must
        * be comparable across studies (standardization)
        * represent magnitude & direction of the relationship
        * be independent of sample size
    * Different studies in the same meta-analysis can be based on different statistics, but each has to be transformed to a standardized effect size that is comparable across different studies

  • Sample size, significance and d effect size

    Levene

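    Definition
    The Levene test is defined as:

    H0: $\sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$

    Ha: $\sigma_i^2 \neq \sigma_j^2$ for at least one pair (i,j).

    Test Statistic:
    Given a variable Y with sample of size N divided into k subgroups, where $N_i$ is the sample size of the ith subgroup, the Levene test statistic is defined as:

    $$W = \frac{(N-k)}{(k-1)} \, \frac{\sum_{i=1}^{k} N_i (\bar{Z}_{i.} - \bar{Z}_{..})^2}{\sum_{i=1}^{k} \sum_{j=1}^{N_i} (Z_{ij} - \bar{Z}_{i.})^2}$$

    where $Z_{ij}$ can have one of the following three definitions:

    1. $Z_{ij} = |Y_{ij} - \bar{Y}_{i.}|$, where $\bar{Y}_{i.}$ is the mean of the ith subgroup.

    2. $Z_{ij} = |Y_{ij} - \tilde{Y}_{i.}|$, where $\tilde{Y}_{i.}$ is the median of the ith subgroup.

    3. $Z_{ij} = |Y_{ij} - \bar{Y}'_{i.}|$, where $\bar{Y}'_{i.}$ is the 10% trimmed mean of the ith subgroup.

    $\bar{Z}_{i.}$ are the group means of the $Z_{ij}$ and $\bar{Z}_{..}$ is the overall mean of the $Z_{ij}$.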
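The Levene statistic defined above can be sketched in a few lines of Python. This is a minimal illustration, not the Dataplot implementation: the function name `levene_w` and the toy data are hypothetical, and the median is used as the default centre only because the text recommends it.

```python
from statistics import mean, median

def levene_w(groups, center=median):
    """Levene's W for a list of groups (each a list of numbers).

    center is the function used to locate each group (mean, median, ...).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij: absolute deviation of each observation from its group centre
    z = [[abs(y - center(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Toy data (invented for illustration): the second group is more spread out.
w = levene_w([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]])
print(round(w, 4))
```

The resulting W would then be compared with the upper critical value of the F distribution with k - 1 and N - k degrees of freedom.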

    The three choices for defining Zij determine the robustness and power of Levene's test. By robustness, we mean the ability of the test to not falsely detect unequal variances when the underlying data are not normally distributed and the variables are in f

    Levene's original paper only proposed using the mean. Brown and Forsythe (1974) extended Levene's test to use either the median or the trimmed mean in addition to the mean. They performed Monte Carlo studies that indicated that using the trimmed mean per

    (i.e., skewed) distribution. Using the mean provided the best power for symmetric, moderate-tailed, distributions.

    Although the optimal choice depends on the underlying distribution, the definition based on the median is recommended as the choice that provides good robustness against many types of non-normal data while retaining good power. If you have knowledge of th

    Significance Level: $\alpha$

    Critical Region:
    The Levene test rejects the hypothesis that the variances are equal if

    $$W > F_{\alpha,\,k-1,\,N-k}$$

    where $F_{\alpha,\,k-1,\,N-k}$ is the upper critical value of the F distribution with k - 1 and N - k degrees of freedom at a significance level of $\alpha$.

    In the above formulas for the critical regions, the Handbook follows the convention that $F_{\alpha}$ is the upper critical value from the F distribution and $F_{1-\alpha}$ is the lower critical value. Note that this is the opposite of some texts and software programs. In particular, Dataplot uses the opposite convention.

    Sample Output
    Dataplot generated the following output for Levene's test using the GEAR.DAT data set (by default, Dataplot performs the form of the test based on the median):

    LEVENE F-TEST FOR SHIFT IN VARIATION

    (CASE: TEST BASED ON MEDIANS)

    1. STATISTICS

    NUMBER OF OBSERVATIONS = 100

    NUMBER OF GROUPS = 10

    LEVENE F TEST STATISTIC = 1.705910

    2. FOR LEVENE TEST STATISTIC

    0 % POINT = 0.

    50 % POINT = 0.9339308

    75 % POINT = 1.296365

    90 % POINT = 1.702053

    95 % POINT = 1.985595

    99 % POINT = 2.610880

    99.9 % POINT = 3.478882

    90.09152 % Point: 1.705910

    3. CONCLUSION (AT THE 5% LEVEL):

    THERE IS NO SHIFT IN VARIATION.

    THUS: HOMOGENEOUS WITH RESPECT TO VARIATION.

    Interpretation of Sample Output
    We are testing the hypothesis that the group variances are equal. The output is divided into three sections.

    1. The first section prints the number of observations (N), the number of groups (k), and the value of the Levene test statistic.

    2. The second section prints the upper critical value of the F distribution corresponding to various significance levels. The value in the first column, the confidence level of the test, is equivalent to 100(1-$\alpha$). We reject the null hypothesis at that significance level if the value of the Levene F test statistic printed in section one is greater than the critical value printed in the last column.

    3. The third section prints the conclusion for a 95% test. For a different significance level, the appropriate conclusion can be drawn from the table printed in section two. For example, for $\alpha$ = 0.10, we look at the row for 90% confidence and compare the critical value 1.702 to the Levene test statistic 1.7059. Since the test statistic is greater than the critical value, we reject the null hypothesis at the $\alpha$ = 0.10 level.

    Output from other statistical software may look somewhat different from the above output.

    Question
    Levene's test can be used to answer the following question:

    Is the assumption of equal variances valid?

    Related Techniques
    Standard Deviation Plot

    Box Plot

    Bartlett Test

    Chi-Square Test

    Analysis of Variance

    Software
    The Levene test is available in some general purpose statistical software programs, including Dataplot.

    mean_effect size

    T-test with unpooled variances and effect sizes (with pooled variances)

    Quantity                          A vs B      C vs D      E vs F      G vs H
    M (group 1, group 2)              100, 105    100, 105    5.5, 4      5.5, 4
    SD (group 1, group 2)             15, 15      15, 15      0.5, 0.5    0.5, 0.5
    N (group 1, group 2)              5, 5        10, 10      200, 200    20, 20
    M1 - M2                           -5.0000     -5.0000     1.5000      1.5000
    (s1)^2 / n1                       45.0000     22.5000     0.0013      0.0125
    (s2)^2 / n2                       45.0000     22.5000     0.0013      0.0125
    SQRT((s1)^2/n1 + (s2)^2/n2)       9.4868      6.7082      0.0500      0.1581
    T                                 -0.5270     -0.7454     30.0000     9.4868
    |T|                               0.5270      0.7454      30.0000     9.4868
    one or two tailed                 2           2           2           2
    DF (n1 + n2 - 2)                  8           18          398         38
    sign                              0.6125      0.4657      0.0000      0.0000
    d                                 0.3333      0.3333      3.0000      3.0000

    Cohen's d (pooled variances) = (m1 - m2) / sqrt( ( (n1 - 1) sd1^2 + (n2 - 1) sd2^2 ) / (n1 + n2 - 2) )

    T-test and effect sizes with pooled variances

    * systematically change the mean-level differences, maintaining the variances and sample sizes

    * systematically change the sample sizes, maintaining the variances and mean-level differences

    * systematically change the variances, maintaining the mean-level differences and sample sizes

    * "unbalance" the test, distort sample sizes and / or variances of the groups

    * figure out some "rule of thumb" for your observations
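The exercises above can also be tried outside the spreadsheet. The sketch below recomputes the t statistic (unpooled standard error) and Cohen's d (pooled SD) from summary statistics for samples A/B and C/D; the function names are hypothetical, and d is returned signed, whereas the spreadsheet appears to display its magnitude.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation, weighting each variance by n - 1."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference using the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

def unpooled_t(m1, sd1, n1, m2, sd2, n2):
    """t statistic with the standard error sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (m1 - m2) / se

# (m1, sd1, n1, m2, sd2, n2) taken from samples A/B and C/D above
small = (100, 15, 5, 105, 15, 5)
large = (100, 15, 10, 105, 15, 10)

for label, args in (("A vs B", small), ("C vs D", large)):
    print(label, "t =", round(unpooled_t(*args), 4), "d =", round(cohens_d(*args), 4))
```

Doubling the sample sizes changes t but leaves d unchanged, which is the sense in which an effect size is independent of sample size.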

    chi2

    This spreadsheet calculates the chi-square value for a 2 x 2 contingency table and transforms it into an effect size and a phi coefficient

    Observed frequencies
                          Variable A
                          no         yes        total
    Variable B   no       265        204        469
                 yes      32         75         107
                 total    297        279        576

    Expected frequencies
                          no         yes
                 no       241.83     227.17
                 yes      55.17      51.83

    (observed minus expected frequencies)^2
                          no         yes
                 no       536.94     536.94
                 yes      536.94     536.94

    (observed minus expected frequencies)^2 / expected frequencies
                          no         yes
                 no       2.22       2.36
                 yes      9.73       10.36

    CHISQ = 24.6758732833
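The chi-square computation in this sheet can be reproduced as follows. The conversion formulas did not survive extraction, so the two used here are assumptions: phi = sqrt(chi2 / N), and d = 2 * phi / sqrt(1 - phi^2), a common r-to-d conversion. The function name is hypothetical.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

observed = [[265, 204], [32, 75]]  # rows: Variable B no/yes; columns: Variable A no/yes
chi2, n = chi_square_2x2(observed)
phi = math.sqrt(chi2 / n)                    # assumed conversion to phi
d_equiv = 2 * phi / math.sqrt(1 - phi**2)    # assumed conversion from phi (r) to d

print(round(chi2, 4), round(phi, 3), round(d_equiv, 3))
```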

  • Sample size, significance and d effect size
    ESRC RDI One Day Meta-analysis workshop (Marsh, O'Mara, Malmberg)

  • Simulate ds on homemade calculator (ES.xls)
    * Change direction of effects
    * Change Ns (equal or same?)
    * Change SDs

    Sheet1

    T-test and effect sizes

    Quantity                          Example 1    Example 2
    M, Treatment                      105          100
    M, Control                        100          105
    M difference                      5.0000       -5.0000
    SD, Treatment                     15           15
    SD, Control                       15           15
    pooled SD                         15           15
    N, Treatment                      15           15
    N, Control                        15           15
    (s1)^2 / n1                       15.0000      15.0000
    (s2)^2 / n2                       15.0000      15.0000
    SQRT((s1)^2/n1 + (s2)^2/n2)       5.4772       5.4772
    T                                 0.91         -0.91
    |T|                               0.91         0.91
    DF                                28           28
    sign                              0.3691       0.3691
    d                                 0.33         -0.33
    one or two tailed                 2            2

  • Effect size as proportion in the Treatment group doing better than the average Control group person
    * 57% of T above
    * 69% of T above
    * 79% of T above
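One way to read these percentages is as Cohen's U3: the share of the treatment group scoring above the control-group mean when both groups are normal with equal variance. The sketch below assumes that interpretation and assumes the three panels correspond to d of about 0.2, 0.5 and 0.8; neither is stated on the slide.

```python
import math

def u3(d):
    """Proportion of the treatment group above the control mean,
    assuming normal distributions with equal variances."""
    return 0.5 * (1 + math.erf(d / math.sqrt(2)))

for d in (0.2, 0.5, 0.8):  # assumed effect sizes, not given on the slide
    print(d, f"{u3(d):.1%}")
```

Under these assumptions the results land within about a percentage point of the figures on the slide.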

  • Effect size as proportion of success in the Treatment versus Control group (Binomial Effect Size Display = BESD):
    * Success: 55% of T, 45% of C
    * Success: 62% of T, 38% of C
    * Success: 68% of T, 32% of C
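The BESD re-expresses a correlation r as two success rates centred on 50%: 0.5 + r/2 for treatment and 0.5 - r/2 for control. A minimal sketch follows; the r values are assumptions chosen to reproduce the percentages on the slide, which does not state them.

```python
def besd(r):
    """Binomial Effect Size Display: (treatment, control) success rates."""
    return 0.5 + r / 2, 0.5 - r / 2

for r in (0.10, 0.24, 0.36):  # assumed correlations, not given on the slide
    treatment, control = besd(r)
    print(f"r = {r:.2f}: success {treatment:.0%} of T, {control:.0%} of C")
```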

  • Why effect size?
    * Long focus on significance level (safe-guarding against Type I (α) error); today the focus is on practical and meaningful significance.
    * Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.

    interrater

    Ratings by two raters

    Study      IV     Rater 1   Rater 2
    Study 1    IV1    3         4
    Study 1    IV2    2         3
    Study 1    IV3    3         3
    Study 1    IV4    4         4
    Study 2    IV1    2         2
    Study 2    IV2    3         3
    Study 2    IV3    4         4
    Study 2    IV4    4         3
    .
    Study k    IV1    3         3
    Study k    IV2    2         2
    Study k    IV3    1         2
    Study k    IV4    3         3

    Cross-tabulation for IV1 (rows: Rater 2, columns: Rater 1)

               Cat1   Cat2   Cat3   Cat4
    Cat1       4      0      1      0
    Cat2       1      5      0      0
    Cat3       0      0      3      0
    Cat4       0      0      1      5

    Ratings by n raters

    Study      IV     Rater 1   Rater 2   .   Rater n
    Study 1    IV1    3         4             4
    Study 1    IV2    2         3             2
    Study 1    IV3    3         3             3
    Study 1    IV4    4         4             4
    Study 2    IV1    2         2             2
    Study 2    IV2    3         3             4
    Study 2    IV3    4         4             4
    Study 2    IV4    4         3             3
    .
    Study k    IV1    3         3             2
    Study k    IV2    2         2             1
    Study k    IV3    1         2             1
    Study k    IV4    3         3             3

    H0 H1

                         Real world
    Study                H0 True    H1 True
    Accept H0            ok
    Accept H1                       ok

  • A short history of the effect size (Huberty, 2002; see also Olejnik & Algina, 2000 for review of effect sizes)

  • Power and effect size
    * Power: finding what is out there
    * Type II (β) error: not finding what is out there
    * Power (1 - β): the probability of rejecting a false H0 hypothesis
    * Power of .80 or .90 in primary research

  • Power, sought effect size, at significance level α = .05 in primary research (prior to conducting study)

    Chart1: Sample size for three effect sizes, α = .05 (x-axis: power; y-axis: N needed per sample)

    Power          0.50    0.60    0.70    0.80    0.90
    effect .20     194     246     310     394     527
    effect .25     128     162     204     259     347
    effect .30     87      110     139     176     235

    Sheet1: Effect size = .33, sample for each group

    Power    90% (α = .10)    95% (α = .05)    99%
    0.70     86               113              175
    0.75     98               126              192
    0.80     112              143              212
    0.85     131              163              237
    0.90     155              191              270
    0.95     196              235              323
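The sample sizes charted for this slide can be approximated with the usual normal-approximation formula for a two-sided, two-group comparison: n per group = 2 (z_{1-α/2} + z_{power})^2 / d^2. This is a sketch under that assumption; the source does not say how its numbers were produced, and they run slightly higher than this approximation, possibly because they are t-based.

```python
import math
from statistics import NormalDist

def n_per_group(d, power, alpha=0.05):
    """Approximate n per group for a two-sided two-sample test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d**2)

for d in (0.20, 0.25, 0.30):
    print(d, [n_per_group(d, p) for p in (0.50, 0.60, 0.70, 0.80, 0.90)])
```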

  • How meaningful is a small effect size?
    A small effect size changed the course of an RCT in 1987: placebo group participants were given aspirin instead (see Rosenthal, 1994, p. 242)

    chi2

    This spreadsheet calculates the chi-square value for a 2 x 2 contingency table and transforms it into an effect size and a phi coefficient

    Observed frequencies
                          Variable A
                          no         yes        total
    Variable B   no       104        10933      11037
                 yes      189        10845      11034
                 total    293        21778      22071

    Expected frequencies
                          no         yes
                 no       146.52     10890.48
                 yes      146.48     10887.52

    (observed minus expected frequencies)^2
                          no         yes
                 no       1807.94    1807.94
                 yes      1807.94    1807.94

    (observed minus expected frequencies)^2 / expected frequencies
                          no         yes
                 no       12.34      0.17
                 yes      12.34      0.17

    CHISQ = 25.01

    LM chip in. Compare with emerging themes. Compare with interrater drift. (2 June 2008)

  • Within the one meta-analysis, can include studies based on any combination of statistical analysis (e.g., t-tests, ANOVA, correlation, odds-ratio, chi-square, etc.).
    * The art of meta-analysis is how to compute effect sizes based on non-standard designs and studies that do not supply complete data (see Lipsey&Wilson_AppB.pdf).
    * Convert all effect sizes into a common metric based on the natural metric given research in the area, e.g., d, r, OR.

  • Standardized mean difference
    * Group contrast research
        * Treatment groups
        * Naturally occurring groups
    * Inherently continuous construct
    Correlation coefficient
    * Association between inherently continuous constructs
    Odds-ratio
    * Group contrast research
        * Treatment or naturally occurring groups
    * Inherently dichotomous construct
    Regression coefficients and other multivariate effects
    * Requires access to covariance-variance (correlation) matrices for each included study

  • Calculating ds (1)
    Almost all test statistics can be transformed into a standardized effect size d:
    * Means and standard deviations
    * Correlations
    * P-values
    * F-statistics
    * t-statistics
    * other test statistics

  • Calculating ds (1)
    * Represents a standardized group contrast on an inherently continuous measure
    * Uses the pooled standard deviation
    * Commonly called d

  • Various contrast effect sizes
    * Cohen's d
    * Hedges' g
    * Glass's Δ
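The formulas for these three effect sizes did not survive extraction, so the sketch below uses common textbook definitions as an assumption: Cohen's d with the pooled SD (n - 1 weights), Hedges' g as d times the small-sample factor 1 - 3/(4(n1 + n2) - 9), and Glass's Δ standardized by the control-group SD alone. Naming conventions differ between textbooks, so the slide's own definitions may not match exactly.

```python
import math

def cohens_d(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """Mean difference divided by the pooled SD."""
    pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    return (m_t - m_c) / pooled

def hedges_g(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """Cohen's d multiplied by a small-sample bias correction (assumed form)."""
    return cohens_d(m_t, sd_t, n_t, m_c, sd_c, n_c) * (1 - 3 / (4 * (n_t + n_c) - 9))

def glass_delta(m_t, m_c, sd_c):
    """Mean difference divided by the control-group SD only."""
    return (m_t - m_c) / sd_c

# Illustrative values: treatment M = 25, SD = 5, n = 25; control M = 20, SD = 5, n = 30
print(cohens_d(25, 5, 25, 20, 5, 30))
print(round(hedges_g(25, 5, 25, 20, 5, 30), 4))
print(glass_delta(25, 20, 5))
```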

  • Calculating d (1) using Ms, SDs and ns
    Remember to code treatment effect in positive direction!

    Sheet1

    T-test and effect sizes

    Quantity                          Treatment vs Control    sample C vs sample D
    M (group 1, group 2)              25, 20                  20, 25
    M difference                      5.0000                  -5.0000
    SD (group 1, group 2)             5, 5                    5, 5
    pooled SD                         5                       5
    N (group 1, group 2)              25, 30                  25, 30
    (s1)^2 / n1                       1.0000                  1.0000
    (s2)^2 / n2                       0.8333                  0.8333
    SQRT((s1)^2/n1 + (s2)^2/n2)       1.3540                  1.3540
    T                                 3.6927                  -3.6927
    |T|                               3.6927                  3.6927
    DF                                53                      53
    sign                              0.0005                  0.0005
    d                                 1.0000                  1.0000
    one or two tailed                 2                       2

  • ES_calculator.xls

  • Calculating d (2) using ES calculator, using Ms, ns, and t-value

  • Calculating d (3) using ES calculator, using ns, and t-value
    The treatment group scored higher than the control group at Time 2 (t[28]= 4.11; p
  • Calculating d (3) correcting for small sample bias
    * Hedges proposed a correction for small sample size bias (ns < 20)
    * Must be applied before analysis

  • Calculating d (4) using ES calculator, using ns, and F-value
    Remember: in a two-group ANOVA F = t^2
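The conversions on the last three slides can be sketched without the ES calculator. The formulas are standard but are assumptions here, since the calculator's internals are not shown: d = t * sqrt((n1 + n2) / (n1 n2)), the same with t = sqrt(F) for a two-group ANOVA, and the small-sample factor 1 - 3/(4(n1 + n2) - 9). The group sizes of 15 each are also an assumption, chosen to be consistent with t[28].

```python
import math

def d_from_t(t, n1, n2):
    """d from an independent-samples t statistic and group sizes."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))

def d_from_f(f, n1, n2, positive=True):
    """d from a two-group ANOVA F (F = t^2); the sign must be supplied
    by the coder because F carries no direction."""
    d = math.sqrt(f * (n1 + n2) / (n1 * n2))
    return d if positive else -d

def small_sample_correction(d, n1, n2):
    """Small-sample bias correction (assumed form)."""
    return d * (1 - 3 / (4 * (n1 + n2) - 9))

d = d_from_t(4.11, 15, 15)  # t[28] = 4.11; n1 = n2 = 15 assumed
print(round(d, 4), round(small_sample_correction(d, 15, 15), 4))
```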

  • Calculating d (5) using ES calculator, using p-value
    The mean-level comparison was not significant (p = .53)

  • T-test table
    df = (n1 + n2 - 2)
    Sometimes authors only report e.g., p
  • Example dataset so far (1) (ES_enter.sav):

    enter_w

    study  es    Treat  Cntr  n    Groups  se      (n1+n2)/(n1*n2)  d^2/(2(n1+n2))  se      w      w*es
    1      1.00  25     30    55   2       0.2871  0.0733           0.0091          0.2871  12.13  12.13
    2      0.80  40     40    80   2       0.2324  0.0500           0.0040          0.2324  18.52  14.81
    3      1.46  15     15    30   2       0.4109  0.1333           0.0355          0.4109  5.92   8.65
    4      0.40  20     20    40   2       0.3194  0.1000           0.0020          0.3194  9.80   3.92
    5      0.10  80     75    155  2       0.1608  0.0258           0.0000          0.1608  38.66  3.87
    6      0.10  60     60    120  2       0.1827  0.0333           0.0000          0.1827  29.96  3.00
    7      0.17  45     35    80   2       0.2258  0.0508           0.0002          0.2258  19.62  3.34
    8      0.43  45     40    85   2       0.2198  0.0472           0.0011          0.2198  20.70  8.90
    9      0.40  100    120   220  2       0.1367  0.0183           0.0004          0.1367  53.48  21.39
    10     0.60  200    160   360  2       0.1084  0.0113           0.0005          0.1084  85.11  51.06
    11     0.05  55     65    120  2       0.1832  0.0336           0.0000          0.1832  29.78  1.49
    12     0.10  70     80    150  2       0.1638  0.0268           0.0000          0.1638  37.29  3.73
    13     0.70  80     0     80   1
    14     1.20  90     0     90   1
    15     0.80  20     0     20   1

    Sums                                                                                    360.98 136.29

    average es       0.38
    se of mean es    0.05
    C.I. Lower       0.27
    C.I. Upper       0.48
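The summary rows of this sheet (sums, average es, standard error and confidence interval) can be recomputed from the first twelve studies. The sketch follows the column headers: se = sqrt((n1 + n2)/(n1 n2) + d^2 / (2(n1 + n2))), w = 1 / se^2, and the average es is the w-weighted mean. The 1.96 multiplier for a 95% interval is an assumption; variable names are hypothetical.

```python
import math

# (es, n_treatment, n_control) for studies 1-12, copied from the table
studies = [
    (1.00, 25, 30), (0.80, 40, 40), (1.46, 15, 15), (0.40, 20, 20),
    (0.10, 80, 75), (0.10, 60, 60), (0.17, 45, 35), (0.43, 45, 40),
    (0.40, 100, 120), (0.60, 200, 160), (0.05, 55, 65), (0.10, 70, 80),
]

def se_of_d(d, n1, n2):
    """Standard error of a standardized mean difference."""
    return math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))

weights = [1 / se_of_d(d, n1, n2) ** 2 for d, n1, n2 in studies]
sum_w = sum(weights)
sum_wes = sum(w * d for w, (d, _, _) in zip(weights, studies))

mean_es = sum_wes / sum_w
se_mean = math.sqrt(1 / sum_w)
ci_lower = mean_es - 1.96 * se_mean  # 95% interval, normal approximation
ci_upper = mean_es + 1.96 * se_mean

print(round(sum_w, 2), round(sum_wes, 2))
print(round(mean_es, 2), round(se_mean, 2), round(ci_lower, 2), round(ci_upper, 2))
```

Because the es column is rounded to two decimals, the recomputed sums can differ from the sheet in the last digit.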

  • Use all available tools for calculating the following 5 effect sizes
    * ES 6: MT = 21, MC = 20, nT = 60, nC = 60, t = .55
    * ES 7: MT = 103.5, MC = 100, SDT = 22.0, SDC = 18.5, nT = 45, nC = 35
    * ES 8: nT = 45, nC = 40, p
  • Example dataset so far (2) (ES_enter.sav)

  • Calculating d (11) using ES calculator, using number of successful outcomes per group

    Sheet1

    (unlabeled values: 0.8726666667, 0.8726666667, 0.2348484848, 1.5404040404, 0.1322901849)

    Estimates of Covariance Parameters (a)

    Parameter                                Estimate        Std. Error
    Residual                                 0.2222222222    0.0641500299
    Intercept [subject = study] Variance     1.5446127946    0.6905415581

    (unlabeled values: 0.1257741782, 0.874)

    a. Dependent Variable: rating.

    (unlabeled values: 1.7668350168, 0.8742258218)

    Sheet2

                 Success   Failure   Total
    Treatment    28        28        56
    Control      31        34        65
    Total        59        62        121


  • Calculating d (12) using ES calculator, using proportion of successes per group (53% vs. 48.5%)
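    The slides for d (11) and d (12) do not show which conversion the ES calculator applies. The sketch below assumes the logit method (log odds ratio multiplied by √3/π), which is one of several common options; it is chosen here only because it reproduces the es values entered for studies 11 and 12 in the dataset. Function names are illustrative.

```python
import math

# Factor that rescales a log odds ratio to the d metric (logit method).
LOGIT_TO_D = math.sqrt(3) / math.pi

def d_from_counts(success_t, fail_t, success_c, fail_c):
    """d from a 2x2 table of successes and failures per group."""
    odds_ratio = (success_t * fail_c) / (fail_t * success_c)
    return math.log(odds_ratio) * LOGIT_TO_D

def d_from_proportions(p_t, p_c):
    """d from the proportion of successes in each group."""
    def logit(p):
        return math.log(p / (1 - p))
    return (logit(p_t) - logit(p_c)) * LOGIT_TO_D

# d (11): Treatment 28 successes / 28 failures, Control 31 / 34.
d11 = d_from_counts(28, 28, 31, 34)

# d (12): 53% vs. 48.5% successes.
d12 = d_from_proportions(0.53, 0.485)

print(round(d11, 3), round(d12, 3))
```

    Other conversions (probit, arcsine) give slightly different values, so the method should be recorded when coding studies.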

  • Effect size as proportion of success in the Treatment versus Control group

    Success: 55% of T, 45% of C

    Success: 62% of T, 38% of C

    Success: 68% of T, 32% of C


    2 June 2008

  • Calculating d (13) using paired t-test (only one experimental group; each person is their own control). Don't use the SD of the change score! r = correlation between Time 1 and Time 2.

  • Calculating d (14) using paired t-test (only one experimental group): n (pairs) = 90, t-value = 6.5, r = .70
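    The formula image for the paired t-test slides did not survive extraction. A minimal sketch, assuming the conversion d = t · √(2(1 − r) / n) (a common form for paired designs; the function name is illustrative), applied to the d (14) example:

```python
import math

def d_from_paired_t(t, n_pairs, r):
    """d from a paired t value, the number of pairs, and the
    Time 1 / Time 2 correlation r (assumed formula, see lead-in)."""
    return t * math.sqrt(2 * (1 - r) / n_pairs)

d14 = d_from_paired_t(6.5, 90, 0.70)
print(round(d14, 2))
```

    The result rounds to the es value that the later version of the dataset lists for study 14.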

  • Calculating d (15): The 20 participants increased .84 z-scores between Time 1 and Time 2 (p
  • Example dataset so far (3) (ES_enter.sav): Method difference: mean contrast and gain scores

    study  es    Treat  Cntr  n    Groups  se      (n1+n2)/(n1*n2)  d^2/(2(n1+n2))  se      w      wes
    1      1.00  25     30    55   2       0.2871  0.0733           0.0091          0.2871  12.13  12.13
    2      0.80  40     40    80   2       0.2324  0.0500           0.0040          0.2324  18.52  14.81
    3      1.46  15     15    30   2       0.4109  0.1333           0.0355          0.4109  5.92   8.65
    4      0.40  20     20    40   2       0.3194  0.1000           0.0020          0.3194  9.80   3.92
    5      0.10  80     75    155  2       0.1608  0.0258           0.0000          0.1608  38.66  3.87
    6      0.10  60     60    120  2       0.1827  0.0333           0.0000          0.1827  29.96  3.00
    7      0.17  45     35    80   2       0.2258  0.0508           0.0002          0.2258  19.62  3.34
    8      0.43  45     40    85   2       0.2198  0.0472           0.0011          0.2198  20.70  8.90
    9      0.40  100    120   220  2       0.1367  0.0183           0.0004          0.1367  53.48  21.39
    10     0.60  200    160   360  2       0.1084  0.0113           0.0005          0.1084  85.11  51.06
    11     0.05  56     65    121  2       0.1824  0.0332           0.0000          0.1824  30.07  1.50
    12     0.10  70     80    150  2       0.1638  0.0268           0.0000          0.1638  37.29  3.69
    13     0.70  80     0     80   1
    14     0.53  90     0     90   1
    15     0.80  20     0     20   1

    Sums: w = 361.27, wes = 136.27
    average es = 0.38; se of mean es = 0.05; C.I. lower = 0.27; C.I. upper = 0.48

  • Summary of equations from Lipsey & Wilson (2001) (for more formulae see Lipsey & Wilson, Appendix B)

  • Weighting for mean-level differences

    The effect sizes are weighted by the inverse of the variance, which gives more weight to effects based on larger samples. The variance for a mean-level comparison is calculated as

    $v_i = \frac{n_1 + n_2}{n_1 n_2} + \frac{d_i^2}{2(n_1 + n_2)}$

    The standard error of each effect size is the square root of the sampling variance: $SE_i = \sqrt{v_i}$
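    The two variance terms correspond to the worksheet columns (n1+n2)/(n1*n2) and d2/(2(n1+n2)). A small sketch (function name illustrative) that reproduces the se and w entries for study 1 (d = 1.00, 25 treatment and 30 control participants):

```python
import math

def se_two_groups(d, n1, n2):
    """Standard error of d for a two-group mean contrast."""
    variance = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return math.sqrt(variance)

se_study1 = se_two_groups(1.00, 25, 30)
w_study1 = 1 / se_study1 ** 2  # inverse-variance weight

print(round(se_study1, 4), round(w_study1, 2))
```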

  • Enter_w.xls

    Two-group studies (mean contrasts):

    study  es    Treat  Cntr  n    d     Groups  (n1+n2)/(n1*n2)  d^2/(2(n1+n2))  se      w      wes
    1      1.00  25     30    55   1.00  2       0.0733           0.0091          0.2871  12.13  12.13
    2      0.80  40     40    80   0.80  2       0.0500           0.0040          0.2324  18.52  14.81
    3      1.46  15     15    30   1.46  2       0.1333           0.0355          0.4109  5.92   8.65
    4      0.40  20     20    40   0.40  2       0.1000           0.0020          0.3194  9.80   3.92
    5      0.10  80     75    155  0.10  2       0.0258           0.0000          0.1608  38.66  3.87
    6      0.10  60     60    120  0.10  2       0.0333           0.0000          0.1827  29.96  3.00
    7      0.17  45     35    80   0.17  2       0.0508           0.0002          0.2258  19.62  3.34
    8      0.43  45     40    85   0.43  2       0.0472           0.0011          0.2198  20.70  8.90
    9      0.40  100    120   220  0.40  2       0.0183           0.0004          0.1367  53.48  21.39
    10     0.60  200    160   360  0.60  2       0.0113           0.0005          0.1084  85.11  51.06
    11     0.05  56     65    121  0.05  2       0.0332           0.0000          0.1824  30.07  1.50
    12     0.10  70     80    150  0.10  2       0.0268           0.0000          0.1638  37.29  3.69

    One-group studies (gain scores):

    study  es    Treat  Cntr  n   d     Groups  r     se_1gr  se1     se2     se3     w1        se      w       wes
    13     0.70  80     0     80  0.70  1       0.70  0.1028  0.0075  0.0031  0.1028  94.6746   0.1028  94.67   66.27
    14     0.53  90     0     90  0.53  1       0.65  0.0966  0.0078  0.0016  0.0966  107.0855  0.0966  107.09  56.76
    15     0.80  20     0     20  0.80  1       0.50  0.2569  0.0500  0.0160  0.2569  15.1515   0.2569  15.15   12.12

    Sums: w = 578.18, wes = 271.41
    average es = 0.4694; se of mean es = 0.0416; 95% C.I. lower = 0.3879; 95% C.I. upper = 0.5509


    Sheet3

    es    Treat  Cntr  n   Groups  r     se      w
    0.30  20     0     20  1       0.50  0.2236  20.0000
    0.30  20     0     20  1       0.70  0.1796  31.0078
    0.30  20     0     20  1       0.90  0.1107  81.6327
    0.50  50     0     50  1       0.50  0.1446  47.8469
    0.50  50     0     50  1       0.70  0.1204  68.9655
    0.50  50     0     50  1       0.90  0.0806  153.8462
    0.80  80     0     80  1       0.50  0.1186  71.1111
    0.80  80     0     80  1       0.70  0.1072  86.9565
    0.80  80     0     80  1       0.90  0.0806  153.8462

  • Weighting for gain scores

    T1 and T2 scores are dependent, so the correlation between T1 and T2 has to enter the equation (it is not always reported).

    SE for gain scores: $SE_i = \sqrt{\frac{2(1 - r)}{n} + \frac{d_i^2}{2n}}$

    Inverse variance for gain scores: $w_i = \frac{1}{SE_i^2}$
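    The worksheet columns se1, se2 and se3 for the one-group studies are consistent with the terms 2(1 − r)/n, d²/(2n) and the square root of their sum. A minimal sketch under that reading (function name illustrative), checked against the worksheet row with n = 90, d = 0.53, r = .70:

```python
import math

def se_gain_score(d, n, r):
    """Standard error of d for a one-group gain score, where r is the
    correlation between Time 1 and Time 2 scores."""
    variance = 2 * (1 - r) / n + d ** 2 / (2 * n)
    return math.sqrt(variance)

se_gain = se_gain_score(0.53, 90, 0.70)
w_gain = 1 / se_gain ** 2  # inverse-variance weight

print(round(se_gain, 4), round(w_gain, 2))
```

    A higher pre-post correlation shrinks the first term, so the same d from the same n receives more weight.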

  • Enter_w.xls

    Two-group studies (mean contrasts):

    study  es    Treat  Cntr  n    d     Groups  (n1+n2)/(n1*n2)  d^2/(2(n1+n2))  se      w      wes
    1      1.00  25     30    55   1.00  2       0.0733           0.0091          0.2871  12.13  12.13
    2      0.80  40     40    80   0.80  2       0.0500           0.0040          0.2324  18.52  14.81
    3      1.46  15     15    30   1.46  2       0.1333           0.0355          0.4109  5.92   8.65
    4      0.40  20     20    40   0.40  2       0.1000           0.0020          0.3194  9.80   3.92
    5      0.10  80     75    155  0.10  2       0.0258           0.0000          0.1608  38.66  3.87
    6      0.10  60     60    120  0.10  2       0.0333           0.0000          0.1827  29.96  3.00
    7      0.17  45     35    80   0.17  2       0.0508           0.0002          0.2258  19.62  3.34
    8      0.43  45     40    85   0.43  2       0.0472           0.0011          0.2198  20.70  8.90
    9      0.40  100    120   220  0.40  2       0.0183           0.0004          0.1367  53.48  21.39
    10     0.60  200    160   360  0.60  2       0.0113           0.0005          0.1084  85.11  51.06
    11     0.05  56     65    121  0.05  2       0.0332           0.0000          0.1824  30.07  1.50
    12     0.10  70     80    150  0.10  2       0.0268           0.0000          0.1638  37.29  3.69

    One-group studies (gain scores):

    study  es    Treat  Cntr  n   d     Groups  r     se_1gr  se1     se2     se3     w1        se      w       wes
    13     0.70  80     0     80  0.70  1       0.65  0.1087  0.0088  0.0031  0.1087  84.6561   0.1087  84.66   59.26
    14     0.53  90     0     90  0.53  1       0.70  0.0907  0.0067  0.0016  0.0907  121.5477  0.0907  121.55  64.42
    15     0.80  20     0     20  0.80  1       0.50  0.2569  0.0500  0.0160  0.2569  15.1515   0.2569  15.15   12.12

    Sums: w = 582.63, wes = 272.07
    average es = 0.4670; se of mean es = 0.0414; 95% C.I. lower = 0.3858; 95% C.I. upper = 0.5482


  • Compute the weighted mean ES and s.e. of the ES in SPSS (var_ofES.sps) (1)

  • Compute the weighted mean ES and s.e. of the ES in SPSS (var_ofES.sps) (2)

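  • Compute the weighted mean ES and s.e. of the ES

    Weight each ES by the inverse of its squared s.e. (the sampling variance): $w_i = \frac{1}{SE_i^2}$

    The average ES: $\overline{ES} = \frac{\sum w_i ES_i}{\sum w_i}$

    Standard error of the average ES: $SE_{\overline{ES}} = \sqrt{\frac{1}{\sum w_i}}$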
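    The whole worksheet calculation can be sketched in a few lines. The data below are the rounded es values, group sizes and correlations from the Enter_w.xls example (r = .65, .70 and .50 for the three gain-score studies); the variable and function names are illustrative. Because the inputs are rounded, the results agree with the worksheet only to about three decimals.

```python
import math

# (es, n_treat, n_control, r). r is None for two-group studies;
# one-group gain-score studies have n_control = 0 and a pre-post r.
studies = [
    (1.00, 25, 30, None), (0.80, 40, 40, None), (1.46, 15, 15, None),
    (0.40, 20, 20, None), (0.10, 80, 75, None), (0.10, 60, 60, None),
    (0.17, 45, 35, None), (0.43, 45, 40, None), (0.40, 100, 120, None),
    (0.60, 200, 160, None), (0.05, 56, 65, None), (0.10, 70, 80, None),
    (0.70, 80, 0, 0.65), (0.53, 90, 0, 0.70), (0.80, 20, 0, 0.50),
]

def variance(d, n1, n2, r):
    """Sampling variance of d for a mean contrast or a gain score."""
    if r is None:
        return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return 2 * (1 - r) / n1 + d ** 2 / (2 * n1)

weights = [1 / variance(*study) for study in studies]
sum_w = sum(weights)
sum_wes = sum(w * study[0] for w, study in zip(weights, studies))

mean_es = sum_wes / sum_w          # weighted average ES
se_mean = math.sqrt(1 / sum_w)     # s.e. of the average ES
ci_lower = mean_es - 1.96 * se_mean
ci_upper = mean_es + 1.96 * se_mean

print(round(mean_es, 3), round(se_mean, 4), round(ci_lower, 3), round(ci_upper, 3))
```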


  • Funnel plot for x = sample size, y = ES

    Does the average of the ES converge toward the average of the largest (n) study?

    95% C.I. = ±1.96 × s.e.
    99% C.I. = ±2.58 × s.e.
    99.9% C.I. = ±3.29 × s.e.

    [Figure: funnel plot "Effect sizes by sample size"; plotted series: d]

  • Funnel plot including s.e. of ES

    The ES in a smaller sample has a larger standard error (s.e.)

    [Figure: funnel plot "Effect sizes by sample size" with the s.e. of each ES; plotted series: d]
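    One way to read a funnel plot numerically is to build limits around the pooled ES from each study's own s.e., using the 1.96, 2.58 and 3.29 multipliers from the funnel plot slides. This is a sketch of that idea and not necessarily how the workshop charts were drawn; the function name is illustrative, and the pooled value 0.47 and study 3's figures (d = 1.46, s.e. = 0.4109) are taken from the worksheet.

```python
def funnel_limits(mean_es, se, z=1.96):
    """Lower and upper limits around the pooled ES for a study with the given s.e."""
    return mean_es - z * se, mean_es + z * se

pooled_es = 0.47                 # average es from the worksheet
d_study3, se_study3 = 1.46, 0.4109

lo95, hi95 = funnel_limits(pooled_es, se_study3, 1.96)
lo99, hi99 = funnel_limits(pooled_es, se_study3, 2.58)

# Flag the study if its ES lies outside the limits.
outside_95 = not (lo95 <= d_study3 <= hi95)
outside_99 = not (lo99 <= d_study3 <= hi99)

print(outside_95, outside_99)
```

    Under these assumptions study 3, the smallest two-group study, falls outside the 95% limits but inside the 99% limits.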

  • Sample: n = size, m = mean, d = effect size

  • Means and standard deviations (d)c2 fP-valuesF-statisticsrt-statisticsother test statisticsAlmost all test statistics can be transformed into an standardized effect size rESRC RDI One Day Meta-analysis workshop (Marsh, OMara, Malmberg)*Calculating rs

  • *Correlations / relationships between variables rxy Pearsons product moment coefficient (continuous continuous)Rpb Bi-serial correlation (dichotomous continuous)c2 (dichotomous dichotomous)rsSpearmans rank-order coefficient (ordinal ordinal)And others, e.g., f coefficient, Odds-Ratio (OR) Cramers V, Contingency coefficient C Tetrachoric and polychoric correlations . (etc)

  • Bias when dichotomising continuous variables

    X and Y are both truly continuous, but one of them is dichotomised in the study: with X continuous and Y split 50/50, r_pb is about 80% of the value it would have had if Y had been kept continuous. X and Y are both truly continuous, but both are dichotomised: with X split 30/70 and Y split 50/50, the maximum value of φ is .33.
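The 80% figure for a 50/50 split can be checked numerically. The sketch below assumes the dichotomised variable is normally distributed and uses the standard relation between the point-biserial and biserial correlations (normal density at the cut point divided by the square root of p(1 - p)). The function name is illustrative, and the example does not attempt to reproduce the φ value quoted for the doubly dichotomised case.

```python
from statistics import NormalDist

def dichotomisation_factor(p):
    """Ratio r_pb / r when a normal variable is cut so that proportion p falls in one group."""
    std_normal = NormalDist()
    z_cut = std_normal.inv_cdf(p)      # cut point on the standard normal scale
    height = std_normal.pdf(z_cut)     # normal density at the cut point
    return height / (p * (1 - p)) ** 0.5

factor_even = dichotomisation_factor(0.50)    # 50/50 split
factor_uneven = dichotomisation_factor(0.30)  # 30/70 split
print(round(factor_even, 3), round(factor_uneven, 3))
```

Under these assumptions the 50/50 split retains roughly 0.80 of the original correlation, and less even splits retain less.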

  • Calculating r's from d (1): r can be used in all situations where d can, but d cannot be used in all situations where r is appropriate.
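The conversion formula on this slide did not survive extraction. As a hedged illustration, the sketch below uses the commonly cited conversion r = d / sqrt(d² + a), where a = (n1 + n2)² / (n1 · n2) and a = 4 when the groups are equal in size. The function names are hypothetical.

```python
import math

def r_from_d(d, n1=None, n2=None):
    """Convert d to r; without group sizes, equal groups are assumed (a = 4)."""
    if n1 is None or n2 is None:
        a = 4.0
    else:
        a = (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d ** 2 + a)

def d_from_r(r):
    """Inverse conversion for equal group sizes."""
    return 2 * r / math.sqrt(1 - r ** 2)

r = r_from_d(0.3333)        # d value used in the t-test worksheets
print(round(r, 4), round(d_from_r(r), 4))
```

The round trip through `d_from_r` returns the original d, which is a quick sanity check on both directions.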

    mean_effect size

    T-test with unpooled variances and effect sizes (with pooled variances)

                        sample A  sample B                 sample C  sample D
    M                        100       105   -5.0000       M                        100       105   -5.0000
    SD                        15        15   45.0000       SD                        15        15   22.5000
    N                          5         5   45.0000       N                         10        10   22.5000
                                              9.4868                                                 6.7082
    T                    -0.5270                           T                    -0.7454
    one or two tailed          2    0.5270                 one or two tailed          2    0.7454
    DF                         8                           DF                        18
    sign                  0.6125                           sign                  0.4657
    d                     0.3333                           d                     0.3333

                        sample E  sample F                 sample G  sample H
    M                        5.5         4    1.5000       M                        5.5         4    1.5000
    SD                       0.5       0.5    0.0013       SD                       0.5       0.5    0.0125
    N                        200       200    0.0013       N                         20        20    0.0125
                                              0.0500                                                 0.1581
    T                    30.0000                           T                     9.4868
    one or two tailed          2   30.0000                 one or two tailed          2    9.4868
    DF                       398                           DF                        38
    sign                  0.0000                           sign                  0.0000
    d                     3.0000                           d                     3.0000

    T-test and effect sizes with pooled variances

    * systematically change the mean-level differences, maintaining the variances and sample sizes

    * systematically change the sample sizes, maintaining the variances and mean-level differences

    * systematically change the variances, maintaining the mean-level differences and sample sizes

    * "unbalance" the test, distort sample sizes and / or variances of the groups

    * figure out some "rule of thumb" for your observations
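The exercises above can also be tried outside the spreadsheet. The following is a minimal sketch, assuming the standard pooled-SD definition of Cohen's d and the unpooled standard error named in the worksheet title. The function names are illustrative and not part of the workbook, and the mean difference is taken as 105 minus 100, so t is positive here.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))

def cohens_d(m1, m2, sd1, n1, sd2, n2):
    """Standardized mean difference based on the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

def t_statistic(m1, m2, sd1, n1, sd2, n2):
    """t-value using the unpooled standard error of the mean difference."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se

# Same means (105 vs 100) and SDs (15) as in the worksheet; only n changes.
results = {}
for n in (5, 10, 100):
    d = cohens_d(105, 100, 15, n, 15, n)
    t = t_statistic(105, 100, 15, n, 15, n)
    results[n] = (d, t)
    print(n, round(d, 4), round(t, 4))
```

With the means and standard deviations held fixed, d stays at about 0.33 for every sample size, while t grows with n. That is the pattern the worksheet is designed to show.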

    ES

    T-test and effect sizes (two side-by-side panels, Treatment vs. Control, with identical values)

                          Treatment   pooled SD   Control
    M                     105                     100        5.0000
    SD                    15          15          15         2.2500
    N                     100                     100        2.2500
                                                             2.1213
    T                     2.3570
    one or two tailed     2
    DF                    198
    sign                  0.0194
    Cohen's d             0.3333
    Hedges' g             0.331662479
    Cohen's d from Hedges' g   0.3333333333

    chi2

    This spreadsheet calculates the chi-square value for a 2 x 2 contingency table and transforms it into an effect size and a phi coefficient.

    Observed frequencies
                        Variable A
                        no      yes     total
    Variable B   no     28      28      56
                 yes    31      34      65
                 total  59      62      121

    Expected frequencies
                        no      yes
                 no     27.31   28.69
                 yes    31.69   33.31

    (Observed minus expected frequencies)²
                        no      yes
                 no     0.48    0.48
                 yes    0.48    0.48

    (Observed minus expected frequencies)² / expected frequencies
                        no      yes
                 no     0.02    0.02
                 yes    0.02    0.01

    CHISQ  0.06

    Unlabelled cells: 1.0967741935, 0.0923733201, 0.0504772241, 0.0222680886
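The worksheet's steps (expected frequencies, squared deviations, chi-square) can be sketched as follows. This assumes the ordinary Pearson chi-square without continuity correction. The function name is hypothetical, and the counts are the observed frequencies from the sheet.

```python
def chi_square_2x2(table):
    """Pearson chi-square for a table of counts, plus the expected frequencies and N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, observed_count in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count under independence
            expected_row.append(e)
            chi2 += (observed_count - e) ** 2 / e
        expected.append(expected_row)
    return chi2, expected, n

observed = [[28, 28],   # Variable B = no
            [31, 34]]   # Variable B = yes
chi2, expected, n = chi_square_2x2(observed)
print(round(chi2, 2), [[round(e, 2) for e in row] for row in expected], n)
```

Rounded to two decimals, the chi-square and the expected frequencies agree with the values shown in the sheet.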

  • Calculating r_pb (2): if X and Y are inherently continuous, a mean contrast is a better option than r_pb.

  • Calculating r (3) from a t-value: appropriate for t-values from both independent-samples and dependent-samples t-tests. Calculating r (4) from a χ²-value.
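The formulas on these slides were images and are missing from the transcript. As an assumption, the sketch below uses the widely used conversions r = sqrt(t² / (t² + df)) and, for a chi-square with one degree of freedom, r = sqrt(χ² / N). The function names are illustrative, and the inputs are values that appear in the embedded worksheets.

```python
import math

def r_from_t(t, df):
    """Convert a t-value to r, keeping the sign of t."""
    return math.copysign(math.sqrt(t ** 2 / (t ** 2 + df)), t)

def r_from_chi2(chi2, n):
    """Convert a 1-df chi-square (2 x 2 table) to r; this equals the absolute value of phi."""
    return math.sqrt(chi2 / n)

r_t = r_from_t(2.3570, 198)       # T and DF from the treatment/control worksheet
r_chi = r_from_chi2(0.06, 121)    # CHISQ (as rounded in the sheet) and total N
print(round(r_t, 4), round(r_chi, 4))
```

The chi-square conversion only applies to tables with one degree of freedom. For larger tables a different index, such as Cramér's V, would be needed.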

  • Sources of error

    Cf. Structural Equation Model (circle = latent/unobserved construct, rectangle = manifest/observed variable)

    Diagram labels: manifest (observed) variable x; manifest (observed) variable y; latent (unobserved) X; latent (unobserved) Y; r_x*y*; r_xx; r_yy; r_xy
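The diagram labels suggest the classical correction for attenuation, in which the correlation between the latent variables (r_x*y*) is estimated from the observed correlation (r_xy) and the two reliabilities (r_xx, r_yy). The slide's own formula is not preserved, so the sketch below is an assumption based on that standard correction, and the input values are hypothetical.

```python
import math

def disattenuate(r_xy, r_xx, r_yy):
    """Estimate the latent correlation from an observed correlation and two reliabilities."""
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical values: observed r = .30, reliabilities .80 and .70.
r_latent = disattenuate(0.30, 0.80, 0.70)
print(round(r_latent, 3))
```

With perfect reliabilities (both equal to 1) the corrected and observed correlations coincide. Lower reliabilities push the corrected value further above the observed one.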

  • Alternatively: transform the r's into Fisher's Zr-transformed r's, which are more normally distributed.
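A minimal sketch of the Fisher transformation, assuming the usual definitions: z = atanh(r), sampling variance 1 / (n - 3), and back-transformation with tanh. The correlations and sample sizes below are hypothetical, and the function names are illustrative.

```python
import math

def fisher_z(r):
    """Fisher's Zr transformation of a correlation."""
    return math.atanh(r)

def inverse_fisher_z(z):
    """Back-transform a Zr value to the correlation metric."""
    return math.tanh(z)

def pooled_r(rs, ns, z_crit=1.96):
    """Weighted mean correlation and confidence interval, computed on the Zr scale."""
    weights = [n - 3 for n in ns]              # inverse of the variance 1 / (n - 3)
    zs = [fisher_z(r) for r in rs]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    se = 1.0 / math.sqrt(sum(weights))
    return (inverse_fisher_z(z_bar),
            inverse_fisher_z(z_bar - z_crit * se),
            inverse_fisher_z(z_bar + z_crit * se))

# Hypothetical correlations and sample sizes.
mean_r, lower, upper = pooled_r([0.46, 0.33, 0.25], [40, 60, 100])
print(round(mean_r, 3), round(lower, 3), round(upper, 3))
```

Averaging on the Zr scale and transforming back keeps the confidence limits inside the range -1 to 1, which is one practical reason for using the transformation.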

  • rr.xls

  • [Chart: Ten effect sizes (r); axis and series labels: r, N, ES (r)]

    r: 0.46, 0.33, 0.25, -0.20, -0.25, -0.40, -0.10, 0.10, 0.275, 0.15
