Variance Heterogeneity and Non-Normality


Upload: solifa-sarah

Post on 02-Dec-2014


DESCRIPTION

Statistics course material on ANOVA (Analysis of Variance): Variance Heterogeneity and Non-Normality.

TRANSCRIPT

Page 1: Variance Heterogeneity and Non-Normality
Page 2: Variance Heterogeneity and Non-Normality

Heterogeneity of variance can be classified into two types:

1. Where the variance is functionally related to the mean

2. Where there is no functional relationship between the variance and the mean

Page 3: Variance Heterogeneity and Non-Normality

Variance is functionally related to the mean

– Usually associated with data whose distribution is not normal

– Example I, count data, such as the number of infested plants per plot or the number of galls per leaf

• Usually follow the Poisson distribution, wherein the variance is equal to the mean, s² = x̄

– Example II, binomial distribution, such as percent survival of insects or percent plants infected with disease (e.g., alive or dead and infested or not)

• The variance and the mean are related as s² = x̄(1 − x̄).

Page 4: Variance Heterogeneity and Non-Normality

There is no functional relationship between the variance and the mean

Usually occurs in experiments where, due to the nature of the treatments tested, some treatments have errors that are substantially higher (or lower) than others.

For example, the variance of the F2 generation can be expected to be higher than that of the F1 generation because genetic variability in the F2 is much higher than in the F1.

Page 5: Variance Heterogeneity and Non-Normality

Handling Variance Heterogeneity

• There are two remedial measures for handling variance heterogeneity:

1. The method of data transformation, for variances that are functionally related to the mean

2. The method of error partitioning, for variances that are not functionally related to the mean

Page 6: Variance Heterogeneity and Non-Normality

Handling Variance Heterogeneity

• Procedure for detecting the presence of variance heterogeneity and for diagnosing the type of variance heterogeneity:

1. For each treatment, compute the variance and the mean across replications

2. Plot a scatter diagram between the mean value and the variance

3. Visually examine the scatter diagram to identify the pattern of relationship (Fig. 7.2)
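The first two steps of this procedure can be sketched in a few lines of Python. This is only an illustration: the treatment names and counts below are hypothetical (they are not taken from the source tables), and the sketch prints the mean and variance pairs instead of drawing the scatter diagram, so the pattern has to be read from the printed values.

```python
from statistics import mean, variance


def mean_variance_by_treatment(data):
    """Return {treatment: (mean, sample variance)} computed across replications."""
    return {name: (mean(reps), variance(reps)) for name, reps in data.items()}


# Hypothetical counts: three treatments with four replications each.
counts = {
    "T1": [2, 4, 3, 3],
    "T2": [10, 14, 9, 15],
    "T3": [40, 55, 32, 61],
}

summary = mean_variance_by_treatment(counts)

# List the (mean, variance) pairs in order of increasing mean; these are the
# points that would be plotted in the scatter diagram.
for name, (m, v) in sorted(summary.items(), key=lambda item: item[1][0]):
    print(f"{name}: mean={m:.2f} variance={v:.2f}")
```

In this made-up data set the variance grows steadily with the mean, which is the kind of pattern that points to a functional relationship and therefore to a data transformation.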

Page 7: Variance Heterogeneity and Non-Normality
Page 8: Variance Heterogeneity and Non-Normality

Data Transformation

The most appropriate remedial measure for variance heterogeneity where the variance and the mean are functionally related.

The appropriate data transformation to be used depends on the specific type of relationship between the variance and the mean.

Page 9: Variance Heterogeneity and Non-Normality

Data Transformation

The three most commonly used transformations are:

1. Logarithmic Transformation

2. Square-Root Transformation

3. Arc Sine Transformation

Page 10: Variance Heterogeneity and Non-Normality

Logarithmic Transformation

Most appropriate for data where the standard deviation is proportional to the mean or where the effects are multiplicative.

Generally found in data that are whole numbers and cover a wide range of values.

Examples: number of insects per plot or number of egg masses per plant (per unit area)

Page 11: Variance Heterogeneity and Non-Normality

Logarithmic Transformation

If the data set involves small values (e.g., less than 10), log(x + 1) should be used instead of log x, where x is the original data.
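A minimal sketch of this rule in Python follows. The function name `log_transform` is hypothetical, and the use of base-10 logarithms is an assumption, since the slide does not state the base.

```python
import math


def log_transform(values):
    """Apply log10(x + 1) if any value is below 10, otherwise log10(x).

    Base 10 is an assumption; the slide only says "log".
    """
    use_shift = any(x < 10 for x in values)
    if use_shift:
        return [math.log10(x + 1) for x in values]
    return [math.log10(x) for x in values]


# Hypothetical counts that include small values, so log(x + 1) is applied.
print(log_transform([0, 9, 99]))
# Hypothetical counts with no value below 10, so log x is applied.
print(log_transform([10, 100, 1000]))
```

The same choice is applied to the whole data set, so that every observation is on one scale before the analysis of variance.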

Page 12: Variance Heterogeneity and Non-Normality

The Procedure for Applying the Logarithmic Transformation

Page 13: Variance Heterogeneity and Non-Normality

STEP 1. Verify the functional relationship between the mean and the variance using the scatter diagram.

STEP 2. Because some of the values in Table 7.14 are less than 10, log(x + 1) is applied instead of log x.

STEP 3. Verify the success of the logarithmic transformation in achieving the desired homogeneity of variance, by applying step 1 to the transformed data in Table 7.15.

Page 14: Variance Heterogeneity and Non-Normality

STEP 4. Construct the analysis of variance, in the usual manner, on the transformed data in Table 7.15
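As a rough illustration of steps 2 and 4, the sketch below transforms a small data set with log(x + 1) and computes the F value of a one-way analysis of variance on the transformed values. Several things here are assumptions: the data are hypothetical (Tables 7.14 and 7.15 are not reproduced in this transcript), the logarithm is taken to base 10, and a one-way layout is used for brevity even though the experimental design of the source example is not stated.

```python
import math


def one_way_anova_f(groups):
    """Return (F, df_treatment, df_error) for a one-way layout."""
    all_values = [x for group in groups for x in group]
    n_total, n_groups = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total

    # Between-treatment and within-treatment sums of squares.
    ss_treatment = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    ss_error = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_treatment = n_groups - 1
    df_error = n_total - n_groups
    f_value = (ss_treatment / df_treatment) / (ss_error / df_error)
    return f_value, df_treatment, df_error


# Hypothetical counts: three treatments, four replications each.
raw = [[2, 4, 3, 3], [10, 14, 9, 15], [40, 55, 32, 61]]

# Step 2: some values are below 10, so log(x + 1) is used.
transformed = [[math.log10(x + 1) for x in group] for group in raw]

# Step 4: analysis of variance on the transformed data.
f_value, df_t, df_e = one_way_anova_f(transformed)
print(f"F = {f_value:.2f} with {df_t} and {df_e} degrees of freedom")
```

The F value would then be compared with the tabular F for the same degrees of freedom, as in any analysis of variance.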

Page 15: Variance Heterogeneity and Non-Normality
Page 16: Variance Heterogeneity and Non-Normality
Page 17: Variance Heterogeneity and Non-Normality

Square-Root Transformation

Appropriate for data consisting of small whole numbers, for example, data obtained in counting rare events, such as the number of infested plants in a plot, the number of insects caught in traps, or the number of weeds per plot.

Appropriate also for percentage data where the range is between 0 and 30% or between 70 and 100%.

Page 18: Variance Heterogeneity and Non-Normality

Square-Root Transformation

If most of the values in the data set are small (e.g., less than 10), especially with zeroes present, (x + 0.5)^(1/2) should be used instead of x^(1/2).
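This rule could be coded along the following lines. The function name `sqrt_transform` is hypothetical, and reading "most of the values" as "more than half of the values" is an assumption made only for this sketch.

```python
import math


def sqrt_transform(values):
    """Apply sqrt(x + 0.5) if most values are below 10, otherwise sqrt(x).

    "Most" is interpreted here as more than half of the values (assumption).
    """
    n_small = sum(1 for x in values if x < 10)
    use_shift = n_small > len(values) / 2
    if use_shift:
        return [math.sqrt(x + 0.5) for x in values]
    return [math.sqrt(x) for x in values]


# Hypothetical data with a zero and mostly small values: sqrt(x + 0.5) is used.
print(sqrt_transform([0, 3.5, 8.5]))
# Hypothetical data where only one of three values is small: sqrt(x) is used.
print(sqrt_transform([16, 25, 4]))
```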

Page 19: Variance Heterogeneity and Non-Normality

Square-Root Transformation

• The range of data is from 0 to 26.39%

• Many values are less than 10, so the data are transformed into (x + 0.5)^(1/2)

Page 20: Variance Heterogeneity and Non-Normality

Square-Root Transformation

Page 21: Variance Heterogeneity and Non-Normality
Page 22: Variance Heterogeneity and Non-Normality

Square-Root Transformation

Page 23: Variance Heterogeneity and Non-Normality

Arc Sine Transformation

Appropriate for data on proportion, data obtained from a count, and data expressed as decimal fractions or percentages.

Note: percentages that are not derived from count data, such as percentage protein or carbohydrates, are not included.

The transformation is applied using a table of the arc sine transformation.

The value of 0% should be substituted by (1/4n) and the value of 100% by (100 − 1/4n), where n is the number of units on which the percentage is based.
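Where a printed table is not at hand, the transformed value can be computed directly. The sketch below assumes the usual definition of the angular transformation, the arc sine of the square root of the proportion expressed in degrees, and reads 1/4n as 1/(4n). The function name `arcsine_transform` is hypothetical, and n = 75 is used only as an example value.

```python
import math


def arcsine_transform(percent, n):
    """Return arcsin(sqrt(percent / 100)) in degrees.

    Values of exactly 0% and 100% are first replaced by 1/(4n) and
    100 - 1/(4n), where n is the number of units the percentage is based on.
    """
    if percent == 0:
        percent = 1 / (4 * n)
    elif percent == 100:
        percent = 100 - 1 / (4 * n)
    return math.degrees(math.asin(math.sqrt(percent / 100)))


# Example with n = 75 units per percentage.
for p in (0, 50, 100):
    print(f"{p}% -> {arcsine_transform(p, 75):.2f} degrees")
```

The substitution keeps 0% and 100% from mapping to exactly 0 and 90 degrees, so the transformed values reflect the number of units on which each percentage is based.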

Page 24: Variance Heterogeneity and Non-Normality

Arc Sine Transformation

Not all percentage data need to be transformed. The rule for choosing the proper transformation is as follows:

RULE 1. For percentage data lying within the range of 30 to 70%, no transformation is needed.

RULE 2. For percentage data lying within the range of either 0 to 30% or 70 to 100%, but not both, the square-root transformation should be used.

RULE 3. For percentage data that do not follow the ranges specified in either rule 1 or rule 2, the arc sine transformation should be used.
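The three rules amount to a check on the smallest and largest percentage in the data set. A minimal sketch follows; the function name `choose_transformation` is hypothetical, and treating the boundary values 30% and 70% as inclusive is an assumption, since the rules do not say how to handle them.

```python
def choose_transformation(percentages):
    """Pick a transformation for percentage data from its observed range."""
    lowest, highest = min(percentages), max(percentages)
    if lowest >= 30 and highest <= 70:
        return "none"          # Rule 1: all data within 30 to 70%
    if highest <= 30 or lowest >= 70:
        return "square-root"   # Rule 2: all within 0-30% or all within 70-100%
    return "arc sine"          # Rule 3: everything else


# Hypothetical data sets, one for each outcome.
print(choose_transformation([35, 50, 68]))
print(choose_transformation([0, 12, 26.39]))
print(choose_transformation([0, 45, 100]))
```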

Page 25: Variance Heterogeneity and Non-Normality

Illustration of Arc Sine Transformation

Original data

• The arc sine transformation should be used because the percentage data ranged from 0 to 100%.

• All zero values are replaced by [1/(4 × 75)] and all 100 values by {100 − [1/(4 × 75)]}

Page 26: Variance Heterogeneity and Non-Normality

Illustration of Arc Sine Transformation

Transformed data

Page 27: Variance Heterogeneity and Non-Normality
Page 28: Variance Heterogeneity and Non-Normality

The result of arc sine transformation using original data in Table 7.23

Page 29: Variance Heterogeneity and Non-Normality

Statistical Procedures for Agricultural Research

Kwanchai A. Gomez, Arturo A. Gomez