session 17 & 18

26
SESSION 17 & 18 Last Update 16 th March 2011 Measures of Dispersion Measures of Variability

Upload: lovie

Post on 25-Feb-2016


Page 1: SESSION 17 & 18

SESSION 17 & 18

Last Update: 16th March 2011

Measures of Dispersion
Measures of Variability

Page 2: SESSION 17 & 18

Lecturer: Florian Boehlandt
University: University of Stellenbosch Business School
Domain: http://www.hedge-fund-analysis.net/pages/vega.php

Page 3: SESSION 17 & 18

Grouped Data – Investment B

Intervals       x     f    f(<)   xf
-25 to < -15    -20   2    2      -40
-15 to < -5     -10   5    7      -50
-5 to < 5       0     5    12     0
5 to < 15       10    4    16     40
15 to < 25      20    6    22     120
25 to < 35      30    3    25     90
Total                 25          160
Total / 2             12.5

            Grouped   Actual
Mean        6.400     7.072
Median      6.250     4.700
Mode        19.000    multimodal

Inputs for the median: Ome = 5, f(<) = 12, fme = 4
Inputs for the mode: Omo = 15, fm = 6, fm-1 = 4, fm+1 = 3
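The grouped figures on this slide can be reproduced with the usual grouped-data formulas. The slide does not state those formulas, so the ones used below are an assumption; they do return the slide's values (6.400, 6.250, 19.000). All variable names are illustrative.

```python
# Grouped data for Investment B: (lower bound, upper bound, frequency).
classes = [(-25, -15, 2), (-15, -5, 5), (-5, 5, 5),
           (5, 15, 4), (15, 25, 6), (25, 35, 3)]

n = sum(f for _, _, f in classes)  # total frequency

# Mean: class midpoints weighted by their frequencies.
mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / n

# Median: interpolate inside the first class whose cumulative
# frequency reaches n / 2.
cum = 0
for lo, hi, f in classes:
    if cum + f >= n / 2:
        median = lo + (n / 2 - cum) / f * (hi - lo)
        break
    cum += f

# Mode: interpolate inside the class with the highest frequency,
# using the frequencies of its two neighbouring classes.
i = max(range(len(classes)), key=lambda k: classes[k][2])
lo, hi, fm = classes[i]
f_prev = classes[i - 1][2] if i > 0 else 0
f_next = classes[i + 1][2] if i < len(classes) - 1 else 0
mode = lo + (fm - f_prev) / ((fm - f_prev) + (fm - f_next)) * (hi - lo)

print(mean, median, mode)
```

The "Actual" column cannot be reproduced this way, because it is computed from the ungrouped observations, which the slide does not list.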

Page 4: SESSION 17 & 18

Learning Objectives

1. Measures of relative standing: Median, Quartiles, Deciles and Percentiles

2. Measures of dispersion: Range

3. Measures of variability: Variance and Standard Deviation

Page 5: SESSION 17 & 18

Percentiles

The Pth percentile is the value for which P percent of the observations are less than that value and (100 – P) percent are greater than that value.
Some special percentiles commonly used include the median and the quartiles.
Percentiles are measures of relative standing.

Page 6: SESSION 17 & 18

Terminology

Term        Fraction   Count   Percentiles                           Notation
Median      ½          1       50th Percentile                       Q2
Quartiles   ¼          4       25th, 50th, 75th, 100th Percentile    Q1, Q2, Q3, Q4
Quintiles   1/5        5       20th, 40th,…, 100th Percentile
Deciles     1/10       10      10th, 20th,…, 100th Percentile

Location of a percentile: Lp

Page 7: SESSION 17 & 18

Location of a Percentile

The location L of a percentile is a function of the required percentile P and the sample size n:

Lp = (n + 1) * (P / 100)

As with the median, all observations must be placed in ascending or descending order first.
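The location formula translates directly into a one-line function. This is a minimal sketch; the function name `percentile_location` is made up for illustration.

```python
def percentile_location(n: int, p: float) -> float:
    """Location Lp of the P-th percentile in an ordered sample of size n."""
    return (n + 1) * (p / 100)


# For a sample of ten observations, the 25th percentile sits
# three quarters of the way between the 2nd and 3rd observation.
print(percentile_location(10, 25))  # 2.75
```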

Page 8: SESSION 17 & 18

Calculation of Percentile

1. Place all observations in order

2. Calculate the location of the percentile

3. Since the location will often be a fraction (e.g. n/2), the distance between the two observations in question must be multiplied by the fractional part of the location

4. The result of 3. is added to the preceding observation to yield the percentile
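The four steps can be sketched as a small function. The name `percentile` and the clamping of extreme locations are assumptions added for illustration; the slides do not say what to do when the location falls outside 1..n.

```python
import math


def percentile(data, p):
    """P-th percentile using the (n + 1) * P / 100 location rule with
    linear interpolation between the two neighbouring observations."""
    ordered = sorted(data)                      # step 1: order the data
    n = len(ordered)
    location = (n + 1) * p / 100                # step 2: location Lp
    # Assumption: keep the location inside 1..n for extreme percentiles.
    location = min(max(location, 1), n)
    lower_pos = math.floor(location)
    fraction = location - lower_pos             # fractional part of Lp
    lower = ordered[lower_pos - 1]              # positions are 1-based
    upper = ordered[min(lower_pos, n - 1)]
    return lower + (upper - lower) * fraction   # steps 3 and 4


# Hypothetical data: the location is 2.5, halfway between 4 and 6.
print(percentile([4, 1, 9, 6], 50))  # 5.0
```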

Page 9: SESSION 17 & 18

Percentile: An example

The following denotes the number of hours spent on the internet:
0 0 5 7 8 9 12 14 22 23
The values are already placed in order. The sample size is n = 10. We wish to determine L25, L50 and L75 (this is analogous to the quartiles Q1, Q2 and Q3).

Page 10: SESSION 17 & 18

Solution – Step 1

Use the formula to calculate the location for each percentile / quartile

Obs  Data    Quartile  Lp
1    0       25        2.75   =(10 + 1) * (25 / 100)
2    0       50        5.50   =(10 + 1) * (50 / 100)
3    5       75        8.25   =(10 + 1) * (75 / 100)
4    7
5    8
6    9
7    12
8    14
9    22
10   23

n = 10

Page 11: SESSION 17 & 18

Solution – Step 2

Determine the fractional part of the location

Obs  Data    Quartile  Lp     Fraction
1    0       25        2.75   0.75      =2.75 - 2
2    0       50        5.50   0.50      =5.5 - 5
3    5       75        8.25   0.25      =8.25 - 8
4    7
5    8
6    9
7    12
8    14
9    22
10   23

n = 10

Page 12: SESSION 17 & 18

Solution – Step 3

Obs  Data    Quartile  Lp     Fraction  Lower  Upper
1    0       25        2.75   0.75      0      5
2    0       50        5.50   0.50      8      9
3    5       75        8.25   0.25      14     22
4    7
5    8
6    9
7    12
8    14
9    22
10   23

n = 10

Determine the next lower and next higher observation associated with the location. For Lp = 2.75, these are observation 2 (value 0) and observation 3 (value 5).

Page 13: SESSION 17 & 18

Solution – Step 4

In order to determine the quartile associated with a given location, you need to calculate the following:

Solution = Lower + (Upper – Lower) * Fraction

Obs  Data    Quartile  Lp     Fraction  Lower  Upper  Solution
1    0       25        2.75   0.75      0      5      3.75      =0 + (5 - 0) * 0.75
2    0       50        5.50   0.50      8      9      8.50      =8 + (9 - 8) * 0.5
3    5       75        8.25   0.25      14     22     16.00     =14 + (22 - 14) * 0.25
4    7
5    8
6    9
7    12
8    14
9    22
10   23

n = 10
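As a cross-check, Python's standard library offers `statistics.quantiles`. Its `"exclusive"` method places the cut points with the same (n + 1) rule used in these slides, so it should return the same three quartiles. This is offered as a verification sketch, not as the lecturer's method.

```python
import statistics

# Hours spent on the internet, already in ascending order.
hours = [0, 0, 5, 7, 8, 9, 12, 14, 22, 23]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(hours, n=4, method="exclusive")

print(q1, q2, q3)  # 3.75 8.5 16.0
```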

Page 14: SESSION 17 & 18

Exercises

You may use shortcuts if you want!

1. Determine the first, second and third quartiles:
5 8 2 9 5 3 7 4 2 7 4 10 4 3 5

2. Determine the third and eighth deciles (30th and 80th percentile):
10.5 14.7 15.3 17.7 15.9 12.2 10.0 14.1 13.9 18.5 13.9 15.1 15.7

Page 15: SESSION 17 & 18

Range

The range is the difference between the maximum and the minimum observation. It is a measure of dispersion.
The interquartile range is the difference between the third and the first quartile:

Interquartile Range = Q3 – Q1
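Both measures can be computed in a few lines. The sketch below reuses the ordered internet-usage data from the percentile example and assumes the quartiles are located with the (n + 1) rule, which is what `method="exclusive"` does in the standard library.

```python
import statistics

hours = [0, 0, 5, 7, 8, 9, 12, 14, 22, 23]

# Range: largest observation minus smallest observation.
data_range = max(hours) - min(hours)

# Interquartile range: third quartile minus first quartile.
q1, _, q3 = statistics.quantiles(hours, n=4, method="exclusive")
interquartile_range = q3 - q1

print(data_range, interquartile_range)  # 23 12.25
```

The interquartile range ignores the lowest and highest quarter of the data, so a single extreme observation affects it far less than it affects the range.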

Page 16: SESSION 17 & 18

Variance

The variance expresses the average squared deviation of the observations from the sample / population mean. All differences are squared so that positive and negative deviations from the mean do not cancel out.
The variance is a measure of variability.

Page 17: SESSION 17 & 18

Population and Sample Variance

We need to differentiate between population variance and sample variance. Because the sample mean is itself calculated from the sample, the sample variance has one less degree of freedom, so the sum of squared deviations is divided by n-1. For a population of size N this is not the case: the sum is divided by N.

Page 18: SESSION 17 & 18

Formulas

Sample              Population

Sample size         Total population size
Observation         Observation
Sample Mean         Population Mean
Sample Statistic    Population Parameter
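The formulas themselves did not survive in this transcript. In the standard textbook notation (an assumption here: N and n for the sizes, x_i for an observation, \mu and \bar{x} for the means, \sigma^2 and s^2 for the variances), the population and sample variance would read:

```latex
\[
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2
\qquad\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
\]
```

The only structural difference is the divisor: N for the population, n - 1 for the sample.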

Page 19: SESSION 17 & 18

Calculation of Variance

1. Calculate the average: Sum of observations / number of observations

2. Subtract the average from every observation

3. Square the difference

4. Sum the squared differences

5. Divide the result from 4. by either N (population) or n-1 (sample)
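The five steps map onto a short function, one line per step. The function name and the `sample` flag are illustrative choices, not part of the slides.

```python
def variance(data, sample=True):
    """Variance following the five steps above.

    sample=True divides by n - 1; sample=False divides by N."""
    n = len(data)
    average = sum(data) / n                      # step 1
    differences = [x - average for x in data]    # step 2
    squared = [d ** 2 for d in differences]      # step 3
    sum_of_squares = sum(squared)                # step 4
    divisor = n - 1 if sample else n             # step 5
    return sum_of_squares / divisor


# Hypothetical data: mean 4, squared differences 4, 0, 4.
print(variance([2, 4, 6]))                # 4.0 (sample)
print(variance([2, 4, 6], sample=False))  # population: 8 / 3
```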

Page 20: SESSION 17 & 18

Variance: An example

The following denotes the number of hours spent on the internet for a sample of n = 10 adults:
0 7 12 5 3 14 8 0 9 22
Calculate the variance.

Page 21: SESSION 17 & 18

Solution – Step 1

Use the mean to calculate the differences between the mean and every observation

Obs      Data  Difference
1        0     -8          =(0 - 8)
2        7     -1          =(7 - 8)
3        12    4           =(12 - 8)
4        5     -3          =(5 - 8)
5        3     -5          =(3 - 8)
6        14    6           =(14 - 8)
7        8     0           =(8 - 8)
8        0     -8          =(0 - 8)
9        9     1           =(9 - 8)
10       22    14          =(22 - 8)
Total    80
n        10
n-1      9
Average  8

Page 22: SESSION 17 & 18

Solution – Step 2

Square all differences. Next, sum the squared differences and divide the sum by n – 1 (sample only).

Obs      Data  Difference  Sqr Diff
1        0     -8          64        =(-8)^2
2        7     -1          1         =(-1)^2
3        12    4           16        =(4)^2
4        5     -3          9         =(-3)^2
5        3     -5          25        =(-5)^2
6        14    6           36        =(6)^2
7        8     0           0         =(0)^2
8        0     -8          64        =(-8)^2
9        9     1           1         =(1)^2
10       22    14          196       =(14)^2
Total    80                412
n        10
n-1      9
Average  8                 45.778

The sample variance is 412 / 9 = 45.778. In the case of the sample, the sum of squares is divided by n-1; in the case of the population, it is divided by N.
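The table can be checked against Python's standard library, which provides separate functions for the two divisors. A minimal sketch using the ten observations from the table:

```python
import statistics

hours = [0, 7, 12, 5, 3, 14, 8, 0, 9, 22]

mean = statistics.mean(hours)
sum_of_squares = sum((x - mean) ** 2 for x in hours)

sample_variance = statistics.variance(hours)       # divides by n - 1
population_variance = statistics.pvariance(hours)  # divides by N

print(mean, sum_of_squares)           # 8 412
print(round(sample_variance, 3))      # 45.778
print(round(population_variance, 3))  # 41.2
```

Treating the same ten values as a complete population gives the smaller figure, 412 / 10 = 41.2.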

Page 23: SESSION 17 & 18

Interpretation Variance

The variance may be difficult to interpret. Remember that all differences are squared to prevent positive and negative differences from cancelling out, so the variance is expressed in squared units. The statistic may be brought back to the units of the data by taking the square root of the variance. This statistic is called the standard deviation.
However, the variances of two datasets may still be compared directly when determining the more volatile dataset.

Page 24: SESSION 17 & 18

Example – Standard Deviation

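The population standard deviation:

Population Standard Deviation = sqrt(Population Variance)

Similarly, the sample standard deviation:

Sample Standard Deviation = sqrt(Sample Variance)

Thus, for the internet usage example:

Sample Standard Deviation = sqrt(45.778) = 6.766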

Page 25: SESSION 17 & 18

Solution – Step 3

Obs      Data  Difference  Sqr Diff
1        0     -8          64
2        7     -1          1
3        12    4           16
4        5     -3          9
5        3     -5          25
6        14    6           36
7        8     0           0
8        0     -8          64
9        9     1           1
10       22    14          196
Total    80                412
n        10
n-1      9
Average  8                 45.778
Sqrt                       6.766

Interpretation: On average, observations of internet usage within the sample of ten people deviate by 6.766 h from the sample mean.
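The last step is a single square root. The sketch below computes it with the standard library and again by hand from the sum of squares in the table; the population figure is included only for comparison.

```python
import math
import statistics

hours = [0, 7, 12, 5, 3, 14, 8, 0, 9, 22]

sample_sd = statistics.stdev(hours)       # sqrt of the sample variance
population_sd = statistics.pstdev(hours)  # sqrt of the population variance

# The same sample figure from the table: sqrt(412 / 9).
by_hand = math.sqrt(412 / 9)

print(round(sample_sd, 3))      # 6.766
print(round(population_sd, 3))  # 6.419
```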

Page 26: SESSION 17 & 18

Exercises

1. Calculate the variance and standard deviation for the following data:
2 8 9 4 1 7 5 4

2. Calculate the variance and standard deviation for the following data:
7 -5 -3 8 4 -4 1 -5 9 3