ferguson - variability
8/8/2019 Ferguson - Variability
MEASURES OF VARIATION, SKEWNESS, AND KURTOSIS
5.1 INTRODUCTION
Of great concern to the statistician is the variation in the events of nature. The variation of one measurement from another is a persisting characteristic of any sample of measurements. Measurements of intelligence, eye color, reaction time, and skin resistance, for example, exhibit variation in any sample of individuals. Anthropometric measurements such as height, weight, diameter of the skull, length of the forearm, and angular separation of the metatarsals show variation between individuals. Anatomical and physiological measurements vary; so also do the measurements made by the physicist, chemist, botanist, and agronomist. Statistics has been spoken of as the study of variation. Fisher (1970) has observed,

The conception of statistics as the study of variation is the natural outcome of viewing the subject as the study of populations; for a population of individuals in all respects identical is completely described by a description of any one individual, together with the number in the group. The populations which are the object of statistical study always display variation in one or more respects.

The experimental scientist is frequently concerned with the different circumstances, conditions, or sources which contribute to the variation in the measurements he or she obtains. The analysis of variance (Chapter 15) developed by Fisher is an important statistical procedure whereby the variation in a set of experimental data can be partitioned into components which frequently may be attributed to different causal circumstances.

How may the variation in any set of measurements be described?
Consider the following measurements for two samples:

Sample A    10   12   15   18   20
Sample B     2    8   15   22   28

We note that the two samples have the same mean, namely, 15. Simple inspection indicates, however, that the measurements in sample B are more variable than those in sample A; they differ more one from another. Among the possible measures used to describe this variation are the range, the mean deviation, and the standard deviation. The most important of these is the standard deviation.
5.2 THE RANGE
The range is the simplest measure of variation. In any sample of measurements the range is taken as the difference between the largest and smallest measurements. The range for the measurements 10, 12, 15, 18, and 20 is 20 minus 10, or 10. The range for the measurements 2, 8, 15, 22, and 28 is 28 minus 2, or 26. The measurements in the second set quite clearly exhibit greater variation than those in the first set, and this reflects itself in a much greater range.

The range has two disadvantages. First, for large samples it is an unstable descriptive measure. The sampling variance of the range for small samples is not much greater than that of the standard deviation but increases rapidly with increase in N. Second, the range is not independent of sample size, except under special circumstances. For distributions that taper to 0 at the extremities a better chance exists of obtaining extreme values for large than for small samples. Consequently, ranges calculated on samples composed of different numbers of cases are not directly comparable. Despite these disadvantages the range may be effectively used in the application of tests of significance with small samples.
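As a minimal sketch of the definition of the range, the two five-member sets of measurements discussed in this section can be checked in a few lines of Python. The function name `sample_range` and the variable names are illustrative choices, not part of the text.

```python
def sample_range(values):
    """Return the largest measurement minus the smallest."""
    return max(values) - min(values)


# The two sets of measurements from the text; both have a mean of 15.
first_set = [10, 12, 15, 18, 20]
second_set = [2, 8, 15, 22, 28]

range_first = sample_range(first_set)
range_second = sample_range(second_set)
print(range_first, range_second)
```

Because the range depends only on the two most extreme values, adding further cases to a sample can only leave it unchanged or enlarge it, which is one way to see why ranges from samples of different sizes are not directly comparable.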
5.3 THE MEAN DEVIATION
Consider the following measurements:

Sample A     8    8    8    8    8
Sample B     1    4    7   10   13
Sample C     1    5   20   25   29

Intuitively, the measurements in sample A are less variable than those in B, which in turn are less variable than those in C. Indeed, the measurements in A exhibit no variation at all. The means of the three samples are 8, 7, and 16. If we express the measurements as deviations from their sample means, we obtain

Sample A      0     0     0     0     0
Sample B     -6    -3     0    +3    +6
Sample C    -15   -11    +4    +9   +13

Inspection of these numbers suggests that as variation increases, the departure of the observations from their sample mean increases. We may use this characteristic to define a measure of variation. One such measure is the mean deviation. The mean deviation is the arithmetic mean of the absolute deviations from the arithmetic mean. An absolute deviation is a deviation without regard to algebraic sign. To obtain the mean deviation we simply calculate the deviations from the arithmetic mean, sum these, disregarding algebraic sign, and divide by N. For sample A above, the mean deviation is 0. For sample B the mean deviation is (6 + 3 + 0 + 3 + 6)/5 = 18/5 = 3.6. For sample C the mean deviation is (15 + 11 + 4 + 9 + 13)/5 = 52/5 = 10.4. The mean deviation is given in algebraic language by the formula

[5.1]  $MD = \dfrac{\sum |X - \bar{X}|}{N}$

Here $X - \bar{X}$ is a deviation from the mean and $|X - \bar{X}|$ is a deviation without regard to algebraic sign. The vertical bars mean that signs are ignored.

Hitherto, symbols above and below the summation sign $\sum$ have been used to indicate the limits of the summation. In the above formula for the mean deviation these symbols have been omitted, the summation being clearly understood to extend over the N members in the sample. In this and subsequent chapters symbols indicating the limits of summation will, for convenience, be omitted where these are understood clearly from the context to extend over N sample members. Where any possibility of doubt could exist, the symbols above and below the summation sign will be inserted.

The mean deviation is infrequently used. It is not readily amenable to algebraic manipulation. This circumstance stems from the use of absolute values. In general, in statistical work the use of absolute values should be avoided, if at all possible.

The mean deviation is discussed here primarily for pedagogic reasons. It illustrates how one particular measure of variation may be defined.
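The calculation in formula [5.1] can be sketched in Python as follows. The function name `mean_deviation` is an illustrative choice; the three samples are those used in this section.

```python
def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    # abs() plays the role of the vertical bars in formula [5.1].
    return sum(abs(x - mean) for x in values) / n


sample_a = [8, 8, 8, 8, 8]
sample_b = [1, 4, 7, 10, 13]
sample_c = [1, 5, 20, 25, 29]

md_a = mean_deviation(sample_a)
md_b = mean_deviation(sample_b)
md_c = mean_deviation(sample_c)
print(md_a, md_b, md_c)
```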
5.4 THE SAMPLE VARIANCE AND STANDARD DEVIATION
Some of the deviations about the mean are positive; others are negative.
The sum of deviations is 0. One method for dealing with the presence of
negative signs is to use the absolute deviations, as in the calculation of the
mean deviation as defined in formula [5.1]. This procedure has little to recommend it. An alternative and generally preferable procedure is to square the deviations about the mean, sum these squares, and use this sum of squares in the definition of a measure of variation. For example, the mean of the measurements 1, 4, 7, 10, and 13 is 7. The deviations from the mean are -6, -3, 0, +3, and +6. The squares of these deviations are 36, 9, 0, 9, and 36. The sum of squares is 90.

The sum of squares of deviations about the mean is used in the definition of the variance. Two definitions exist; both are in common use. One definition obtains the variance by dividing the sum of squares of deviations about the mean by N, the number of cases. Denote this statistic by $s^2$. Thus

[5.2]  $s^2 = \dfrac{\sum (X - \bar{X})^2}{N}$

In the illustrative example in the paragraph above, the sum of squares of deviations about the mean was 90, and the variance, according to formula [5.2], is $s^2 = 90/5 = 18$.
An alternative method of defining the variance is to divide the sum of squares by N - 1 rather than N. Thus, according to this definition, the variance $s^2$ is given by

[5.3]  $s^2 = \dfrac{\sum (X - \bar{X})^2}{N - 1}$

Both formulas, [5.2] and [5.3], provide alternative definitions of the variance. These formulas have no derivation but are obtained by a process of plausible reasoning. The reader will note that no notational distinction is made here between the variance defined by formula [5.2] and that defined by formula [5.3].

What is the essential difference between the variance as defined by formula [5.2] and the variance as defined by formula [5.3]? To understand the answer to this question the reader should recall that in Chapter 1 a distinction was made between sample values, or estimates, and population values, or parameters. Both formulas, [5.2] and [5.3], provide estimates of a population variance $\sigma^2$. For certain algebraic reasons, when we divide $\sum (X - \bar{X})^2$ by N we obtain a biased estimate of $\sigma^2$. This estimate will show a systematic tendency to be less than $\sigma^2$. It is biased. When we divide $\sum (X - \bar{X})^2$ by N - 1, however, we obtain an unbiased estimate of $\sigma^2$. Such an estimate will show no systematic tendency to be either greater than or less than $\sigma^2$.
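The difference between the two definitions can be illustrated with a small sketch. The first part applies formulas [5.2] and [5.3] to the measurements 1, 4, 7, 10, and 13. The second part is an assumption made purely for illustration and is not in the text: the five measurements are treated as a complete population with variance 18, and every possible ordered sample of size 2 drawn with replacement is enumerated, so that the average behavior of each estimator can be seen without random sampling. The function names are hypothetical.

```python
from itertools import product


def sum_of_squares(values):
    """Sum of squared deviations about the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def variance_n(values):
    """Formula [5.2]: divide the sum of squares by N."""
    return sum_of_squares(values) / len(values)


def variance_n_minus_1(values):
    """Formula [5.3]: divide the sum of squares by N - 1."""
    return sum_of_squares(values) / (len(values) - 1)


data = [1, 4, 7, 10, 13]
v_n = variance_n(data)            # 90 / 5
v_n1 = variance_n_minus_1(data)   # 90 / 4

# Illustration only (assumption): treat `data` as the whole population,
# so the population variance is 90 / 5 = 18, and enumerate all 25 ordered
# samples of size 2 drawn with replacement.
population_variance = variance_n(data)
samples = list(product(data, repeat=2))
avg_dividing_by_n = sum(variance_n(s) for s in samples) / len(samples)
avg_dividing_by_n_minus_1 = (
    sum(variance_n_minus_1(s) for s in samples) / len(samples)
)
print(v_n, v_n1, avg_dividing_by_n, avg_dividing_by_n_minus_1)
```

Under these assumptions the estimates that divide by N - 1 average out to the population variance, while those that divide by N average to a smaller value, which is the systematic tendency described above.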
In all situations where an estimate of a population variance $\sigma^2$ is required, the statistic $s^2$ which divides the sum of squares by N - 1 and not
N should be used. In the great majority of situations discussed in this book an unbiased estimate is required. In some situations involving descriptive statistics, convenience and simplicity dictate the use of an estimate which divides the sum of squares by N and not N - 1. In g [...]
differences $X_1 - X_2$, $X_1 - X_3$, and $X_2 - X_3$. In general, for N measurements the number of such differences is N(N - 1)/2. To illustrate, for the measurements 1, 4, 7, 10, and 13, the differences between each measurement and every other measurement are -3, -6, -9, -12, -3, -6, -9, -3, -6, and -3. Note that the sign of the difference depends on the order of the measurements. If we obtain the sum of squares of the differences between each measurement and every other measurement and divide by the number of such differences, the result is closely related to $s^2$; in fact it is simply twice $s^2$. In our example the sum of squares of differences is 450. We divide this by 10 to obtain 45.0, which is seen to be twice the variance, 22.5, as calculated by formula [5.3]. In general, in algebraic notation it may be shown that

[5.4]  $s^2 = \dfrac{1}{2} \cdot \dfrac{\sum (X_i - X_j)^2}{N(N - 1)/2}$

where the summation is understood to extend over N(N - 1)/2 differences. This result means that $s^2$ is a descriptive index of how different each value is from every other value; in fact it is an average of the squared differences divided by 2.
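The relation between the variance and the differences between every pair of measurements can be checked with a short sketch. It uses `itertools.combinations` from the Python standard library to enumerate the N(N - 1)/2 pairs; the function and variable names are illustrative.

```python
from itertools import combinations


def squared_differences(values):
    """Squared difference for each of the N(N - 1)/2 pairs of measurements."""
    return [(a - b) ** 2 for a, b in combinations(values, 2)]


data = [1, 4, 7, 10, 13]
n = len(data)

squares = squared_differences(data)
pair_count = len(squares)
mean_squared_difference = sum(squares) / pair_count

# Variance with the N - 1 divisor, as in formula [5.3].
mean = sum(data) / n
variance = sum((x - mean) ** 2 for x in data) / (n - 1)

print(pair_count, sum(squares), mean_squared_difference, variance)
```

Squaring removes the dependence on the order in which each pair is taken, so only the number of pairs and the sizes of the differences matter.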

The variance is a statistic in squared units. If $X - \bar{X}$ is a deviation in feet, then $(X - \bar{X})^2$ is a deviation in feet squared. For many purposes it is desirable to use a measure of variation which is not in squared units but in units of the original measurements themselves. We obtain this result by taking the square root of either formula [5.2] or formula [5.3]. This statistic is called the standard deviation. Thus

[5.5]  $s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N}}$

or

[5.6]  $s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N - 1}}$

5.5 AN ILLUSTRATIVE APPLICATION
Our understanding of the nature of the variance and the standard deviation will be enriched by considering illustrative situations where these statistics are of interest. Consider a simple experiment designed to investigate the effect of a drug on a cognitive task such as coding. An experimental group of subjects, who receive the drug, and a control group, who do not receive the drug, are used. Each group contains 10 subjects. Let us assume the scores on the coding task for the two groups are as follows:

Experimental     5    7   17   31   45   47   68   85   96   99
Control         29   36   37   42   49   58   62   63   69   70

The mean score for the experimental group is 50.0, and that for the control, 51.5. The investigator might be led to conclude from inspecting these means that the drug had little or no effect on the performance of the subjects. The standard deviations for the two groups are, respectively, 35.63 and 14.86, the experimental group being much more variable in performance than the control group. Quite clearly the treatment appears to be exerting a substantial influence on the variation in performance, although its influence on level of performance is negligible. In the analysis of experimental data the investigator must attend to, and if possible interpret, differences in the standard deviation, or variance, as well as differences in the arithmetic mean.
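The comparison of the two groups can be reproduced with the `statistics` module of the Python standard library, whose `stdev` function uses the N - 1 divisor of formula [5.6]. The scores below are typed in as an assumed reading of the table of coding scores; they reproduce the reported means and standard deviations. The dictionary name `summary` is an illustrative choice.

```python
from statistics import mean, stdev

# Coding-task scores for the two groups of 10 subjects (assumed reading).
experimental = [5, 7, 17, 31, 45, 47, 68, 85, 96, 99]
control = [29, 36, 37, 42, 49, 58, 62, 63, 69, 70]

# For each group: (mean, standard deviation rounded to two decimals).
summary = {
    name: (mean(scores), round(stdev(scores), 2))
    for name, scores in (("experimental", experimental), ("control", control))
}

for name, (group_mean, group_sd) in summary.items():
    print(name, group_mean, group_sd)
```

The means differ by only 1.5 points, while the standard deviation of the experimental group is more than twice that of the control group.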

5.6 CALCULATING THE SAMPLE VARIANCE AND THE STANDARD DEVIATION FROM UNGROUPED DATA
For purposes of calculation, it is convenient to write the variance and the standard deviation in a different form. The variance may be written as

$s^2 = \dfrac{\sum (X - \bar{X})^2}{N - 1}$
$\;\;\;\; = \dfrac{\sum (X^2 + \bar{X}^2 - 2X\bar{X})}{N - 1}$
$\;\;\;\; = \dfrac{\sum X^2 + N\bar{X}^2 - 2N\bar{X}^2}{N - 1}$
$\;\;\;\; = \dfrac{\sum X^2 - N\bar{X}^2}{N - 1}$

In this derivation note that the summation of $\bar{X}^2$ over N is simply $N\bar{X}^2$; also the summation of $2X\bar{X}$ is $2\bar{X}\sum X = 2N\bar{X}^2$, since $\sum X = N\bar{X}$. The standard deviation is given by

$s = \sqrt{\dfrac{\sum X^2 - N\bar{X}^2}{N - 1}}$

Thus to calculate the standard deviation using this formula, we sum the squares of the original observations, subtract from this N times the square of the arithmetic mean, divide by N - 1, and then take the square root. For example, the five observations 1, 4, 7, 10, and 13 have a mean of 7. The squares of these observations are 1, 16, 49, 100, and 169. The sum of these squared observations is 335. The variance is then

$s^2 = \dfrac{335 - 5 \times 7^2}{5 - 1} = \dfrac{90}{4} = 22.50$

and the standard deviation is $\sqrt{22.50} = 4.74$.

An alternative formula for the standard deviation which avoids the calculation of the arithmetic mean and may, therefore, be useful for certain computational purposes is

[5.7]  $s = \sqrt{\dfrac{N\sum X^2 - (\sum X)^2}{N(N - 1)}}$

This formula requires one operation of division only.
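The three ways of computing the standard deviation described in this section (the definitional form, the form using the sum of squared observations and the mean, and formula [5.7]) can be compared in a short sketch. The function names are illustrative and not part of the text.

```python
import math


def sd_definitional(values):
    """Square root of the sum of squared deviations divided by N - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def sd_from_squares_and_mean(values):
    """Sum the squared observations, subtract N times the squared mean."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squared = sum(x * x for x in values)
    return math.sqrt((sum_of_squared - n * mean ** 2) / (n - 1))


def sd_single_division(values):
    """Formula [5.7]: avoids computing the mean; one division only."""
    n = len(values)
    total = sum(values)
    sum_of_squared = sum(x * x for x in values)
    return math.sqrt((n * sum_of_squared - total ** 2) / (n * (n - 1)))


observations = [1, 4, 7, 10, 13]
sum_of_squared_observations = sum(x * x for x in observations)
results = [
    sd_definitional(observations),
    sd_from_squares_and_mean(observations),
    sd_single_division(observations),
]
print(sum_of_squared_observations, [round(r, 2) for r in results])
```

The two computing formulas were designed for hand or desk calculation. In floating-point arithmetic with large values and small variation, subtracting two nearly equal quantities can lose precision, so the definitional form is usually the safer choice in software.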
5.7 THE EFFECT ON THE STANDARD DEVIATION OF ADDING OR MULTIPLYING BY A CONSTANT
If a constant is added to all the observations in a sample, the standard deviation remains unchanged. An examiner may conclude, for example, that an examination is too difficult. He may decide to add 10 points to all
the marks assigned. The standard deviation of the original marks will be the same as the standard deviation of marks with the 10 points added. This result follows directly from the fact that if X is an observation, the corresponding observation with the constant c added is X + c. If $\bar{X}$ is the mean of the original observations, the mean with the constant added is $\bar{X} + c$. A deviation from the mean of the observations with the constant added is then $(X + c) - (\bar{X} + c)$, which is readily observed to be equal to $X - \bar{X}$. Since the deviations about the mean are unchanged by the addition of a constant, the standard deviation will remain unchanged. To illustrate, by adding a constant, say, 5, to the measurements 1, 4, 7, 10, and 13, we obtain 6, 9, 12, 15, and 18. The mean of the original measurements is 7, and the mean of the measurements with the constant added is 7 + 5, or 12. The deviations from the mean are in both instances the same, namely, -6, -3, 0, +3, and +6. The standard deviation in both instances is 4.74.

If all measurements in a sample are multiplied by a constant, the standard deviation is also multiplied by the absolute value of that constant. If the standard deviation of examination marks is 4 and all marks are multiplied by the constant 3, then the standard deviation of the resulting marks is 3 x 4 = 12. To demonstrate this result, we observe that if $\bar{X}$ is the mean of a sample of measurements, the mean of the measurements multiplied by c is $c\bar{X}$. A deviation from the mean is then $cX - c\bar{X} = c(X - \bar{X})$. By squaring, summing over N observations, and dividing by N - 1, we obtain

$\dfrac{\sum (cX - c\bar{X})^2}{N - 1} = \dfrac{c^2 \sum (X - \bar{X})^2}{N - 1} = c^2 s^2$

Thus if all measurements are multiplied by a constant c, the variance is multiplied by $c^2$ and the standard deviation by the absolute value of c. If c is a negative number, say, -3, s is multiplied by the absolute value 3. By way of illustration, the measurements 1, 4, 7, 10, 13 have a mean of 7, a variance of 22.50, and a standard deviation of 4.74. If the measurements are multiplied by the constant 5, we obtain 5, 20, 35, 50, 65. The mean is now 5 x 7, or 35. The deviations from the mean are -30, -15, 0, +15, +30. Squaring these we obtain 900, 225, 0, 225, 900. The sum of squares is 2,250, the variance is 562.50, and the standard deviation is 23.72, whereas 5 times the original standard deviation of 4.74 is 23.70. The slight discrepancy results from the rounding of decimals.
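Both results of this section can be verified numerically. The sketch below uses `statistics.stdev` (N - 1 divisor) on the measurements 1, 4, 7, 10, 13, with the constants 5 and -3 used in the text; the variable names are illustrative.

```python
from statistics import stdev

measurements = [1, 4, 7, 10, 13]
s = stdev(measurements)

# Adding a constant leaves every deviation from the mean unchanged.
shifted = [x + 5 for x in measurements]
s_shifted = stdev(shifted)

# Multiplying by c multiplies each deviation by c, so s is multiplied by |c|.
scaled = [5 * x for x in measurements]
s_scaled = stdev(scaled)

negated = [-3 * x for x in measurements]
s_negated = stdev(negated)

print(round(s, 2), round(s_shifted, 2), round(s_scaled, 2), round(s_negated, 2))
```

Computed without intermediate rounding, 5 times the original standard deviation agrees with the standard deviation of the multiplied measurements; the discrepancy between 23.72 and 23.70 in the text arises only because 4.74 is itself a rounded figure.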
5.8 STANDARD SCORES
Hitherto we have considered scores or measurements in the form in which they were originally obtained. Such scores are represented by the symbol X with mean $\bar{X}$ and standard deviation s. Such scores in their original
form are spoken of as raw scores. We have also considered deviations about the arithmetic mean, $x = X - \bar{X}$. These are known as deviation scores and have a mean of 0 and a standard deviation of s. If now we divide the deviation about the mean by the standard deviation, we obtain what is called a standard score, represented by the symbol z. Thus

$z = \dfrac{X - \bar{X}}{s} = \dfrac{x}{s}$

Standard scores have a mean of 0 and a standard deviation of 1. As previously shown, if all measurements in a sample are multiplied by a constant, the standard deviation is also multiplied by the absolute value of that constant. Deviation scores, $x = X - \bar{X}$, have a standard deviation s. Each score has a constant $-\bar{X}$ added. This leaves s unchanged. If all the deviation scores are divided by s, which is the same thing as multiplying by the constant 1/s, the standard deviation of the scores thus obtained is s/s = 1.

To illustrate, the following observations have been expressed in raw-score, deviation-score, and standard-score form:

Individual       X       x       z
A                3      -7   -1.11
B                6      -4    -.63
C                7      -3    -.47
D                9      -1    -.16
E               15       5     .79
F               20      10    1.58
Sum             60     .00     .00
Mean            10     .00     .00
s             6.32    6.32    1.00

Because standard scores have zero mean and unit standard deviation, they are readily amenable to certain forms of algebraic manipulation. Many formulations can be derived more conveniently using standard scores than using raw or deviation scores.

The use of standard scores means, in effect, that we are using the standard deviation as the unit of measurement. In the above example individual A is 1.11 standard deviations, or standard deviation units, below the mean, while individual F is 1.58 standard deviation units above the mean.

Standard scores are frequently used to obtain comparability of observations obtained by different procedures. Consider examinations in English and mathematics applied to the same group of individuals, and assume the means and standard deviations to be as follows:

Examination     Mean     s
English           65     8
Mathematics       52    12

In effect, in relation to the performance of the individuals in the group, a score of 65 on the English examination is the equivalent of a score of 52 on the mathematics examination. To illustrate, a score one standard deviation above the mean, that is, 65 + 8, or 73, on the English examination can be considered to be the equivalent of a score one standard deviation above the mean, that is, 52 + 12, or 64, on the mathematics examination. If an individual makes a score of 57 on the English examination and a score of 58 on the mathematics examination, we may compare his relative performance on the two subjects by comparing his standard scores. On English his standard score is (57 - 65)/8 = -1.0, and on mathematics his standard score is (58 - 52)/12 = .5. Thus on English his performance is one standard deviation unit below the average, while on mathematics his performance is .5 standard deviation unit above the average. Quite clearly, this individual did much more poorly in English than in mathematics relative to the performance of the group of individuals taking the examinations, although this is not reflected in the original marks assigned. To attain rigorous comparability of scores, the distributions of scores on the two tests should be identical in shape. The meaning of this statement will become clear as we proceed.

The reader should note that the sum of squares of standard scores, $\sum z^2$, is equal to N - 1. We observe that $z^2 = (X - \bar{X})^2 / s^2$; hence

$\sum z^2 = \dfrac{\sum (X - \bar{X})^2}{s^2} = \dfrac{(N - 1)s^2}{s^2} = N - 1$

Where the standard deviation is defined as $s = \sqrt{\sum (X - \bar{X})^2 / N}$, the sum of squares of standard scores is N and not N - 1.
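The conversion to standard scores, and the result that their squares sum to N - 1 (or to N when the standard deviation divides by N), can be sketched as follows. The six raw scores are an assumed reading of the table for individuals A to F; `stdev` and `pstdev` from the Python standard library use the N - 1 and N divisors respectively, and the function name `standard_scores` is illustrative.

```python
from statistics import mean, pstdev, stdev


def standard_scores(values, sd_func=stdev):
    """Divide each deviation from the mean by the standard deviation."""
    m = mean(values)
    s = sd_func(values)
    return [(x - m) / s for x in values]


raw = [3, 6, 7, 9, 15, 20]          # individuals A to F (assumed reading)
z = standard_scores(raw)
z_rounded = [round(v, 2) for v in z]

# Sum of squared standard scores: N - 1 with the N - 1 divisor, N with N.
sum_sq_z = sum(v * v for v in z)
sum_sq_z_divisor_n = sum(v * v for v in standard_scores(raw, pstdev))

# Comparing one individual's marks on the two examinations.
english_z = (57 - 65) / 8
mathematics_z = (58 - 52) / 12

print(z_rounded, sum_sq_z, sum_sq_z_divisor_n, english_z, mathematics_z)
```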
5.9 ADVANTAGES OF THE VARIANCE AND STANDARD DEVIATION AS MEASURES OF VARIATION
The variance and standard deviation have many advantages over other measures of variation. Much statistical work involves their use. The variance has certain additive properties and may on occasion be partitioned into additive components, each of which may be related to some causal circumstance. The sample standard deviation is a more stable or accurate estimate of the population parameter than other measures of variation. Under certain assumptions it provides a more stable estimate of the standard deviation in the population than the sample mean deviation, for example, does of the mean deviation in the population. The variance and standard deviation are more amenable to mathematical manipulation than other measures. They enter into formulas for the computation of many types of statistics. They are widely used as measures of error. In later discussion on sampling statistics the reader will observe that the standard error is in effect the standard deviation of errors made in estimating population parameters
from sample values. These errors result from the operation of chance factors in random sampling. A full appreciation of the importance and meaning of the variance and standard deviation in their many ramifications requires considerable familiarity with statistical ideas.

5.10 MOMENTS ABOUT THE MEAN
The mean and the standard deviation are closely related to a family of descriptive statistics known as moments. The first four moments about the arithmetic mean are as follows:

[5.8]
$m_1 = \dfrac{\sum (X - \bar{X})}{N} = 0$
$m_2 = \dfrac{\sum (X - \bar{X})^2}{N} = \dfrac{N - 1}{N}\, s^2$
$m_3 = \dfrac{\sum (X - \bar{X})^3}{N}$
$m_4 = \dfrac{\sum (X - \bar{X})^4}{N}$

In general, the rth moment about the mean is given by

[5.9]  $m_r = \dfrac{\sum (X - \bar{X})^r}{N}$

The term "moment" originates in mechanics. Consider a lever supported by a fulcrum. If a force $f_1$ is applied to the lever at a distance $x_1$ from the origin, then $f_1 x_1$ is called the moment of the force. Further, if a second force $f_2$ is applied at a distance $x_2$, the total moment is $f_1 x_1 + f_2 x_2$. If we square the distances x, we obtain the second moment; if we cube them, we obtain the third moment; and so on. When we come to consider frequency distributions, the origin is the analog of the fulcrum and the frequencies in the various class intervals are analogous to forces operating at various distances from the origin. Observe that the first moment about the mean is 0 and the second moment is (N - 1)/N times the unbiased sample variance. The third moment is used to obtain a measure of skewness, and the fourth moment, a measure of kurtosis.
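The general formula [5.9] translates directly into a short function. The sketch below, with the illustrative name `moment`, computes the first four moments about the mean for the measurements 1, 4, 7, 10, and 13 used earlier in the chapter, and checks the stated relation between the second moment and the unbiased sample variance.

```python
def moment(values, r):
    """r-th moment about the arithmetic mean, as in formula [5.9]."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


data = [1, 4, 7, 10, 13]
n = len(data)

m1, m2, m3, m4 = (moment(data, r) for r in (1, 2, 3, 4))

# Unbiased sample variance (N - 1 divisor), for comparison with m2.
mean = sum(data) / n
unbiased_variance = sum((x - mean) ** 2 for x in data) / (n - 1)

print(m1, m2, m3, m4, unbiased_variance)
```

For this symmetrical set of measurements the third moment is 0, as are all odd moments about the mean.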

5.11 MEASURES OF SKEWNESS AND KURTOSIS
The commonly used measure of skewness makes use of the third moment and is defined as

[5.10]  $g_1 = \dfrac{m_3}{m_2 \sqrt{m_2}}$

The rationale for this statistic is based on the observation that when a distribution (or any set of numbers) is symmetrical, the sum of deviations above the mean, when raised to the third power, will balance the sum of deviations below the mean, when raised to the third power. Thus for a symmetrical distribution, $m_3 = 0$ and $g_1 = 0$. If the distribution is asymmetrical, the sums of deviations above and below the mean, when raised to the third power, will not balance. Thus for an asymmetrical distribution $m_3 \neq 0$ and $g_1 \neq 0$. If the distribution, or set of numbers, is positively skewed, $g_1$ is positive; when negatively skewed $g_1$ is negative. The quantity $m_2 \sqrt{m_2}$ is introduced in order to ensure that $g_1$ is comparable for distributions that differ in variability. Thus $g_1$ is independent of the scale of measurement. The skewness of a set of measurements in grams, meters, pounds, or units of score on a psychological test can be directly compared using $g_1$. The reader will recall that a standard score is defined as $z = (X - \bar{X})/s$. One reason for using standard scores is to achieve comparability of scores from one set of measurements to another. The use of $m_2 \sqrt{m_2}$ in the definition of $g_1$ is directly analogous to the use of s in the definition of a standard score.
As an illustration of $g_1$, consider two sets of numbers, A and B:

A     6     9    10    11    14
B     6     8    10    11    15

These numbers expressed as deviations from the mean become

A    -4    -1     0    +1    +4
B    -4    -2     0    +1    +5

Set A is a symmetrical set of numbers, and set B is asymmetrical. These deviations raised to the third power are as follows:

A   -64    -1     0    +1   +64
B   -64    -8     0    +1  +125

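The skewness measure of formula [5.10] can be computed for sets A and B with a short sketch. The raw values of the two sets are an assumed reading of the table; they reproduce the deviations and cubed deviations listed. The function names `moment` and `skewness_g1` are illustrative.

```python
import math


def moment(values, r):
    """r-th moment about the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


def skewness_g1(values):
    """Third moment divided by m2 * sqrt(m2), as in formula [5.10]."""
    m2 = moment(values, 2)
    m3 = moment(values, 3)
    return m3 / (m2 * math.sqrt(m2))


set_a = [6, 9, 10, 11, 14]   # symmetrical (assumed reading)
set_b = [6, 8, 10, 11, 15]   # asymmetrical (assumed reading)

g1_a = skewness_g1(set_a)
g1_b = skewness_g1(set_b)

# Changing the unit of measurement leaves g1 unchanged.
g1_b_rescaled = skewness_g1([100 * x for x in set_b])

print(moment(set_b, 3), moment(set_b, 2), round(g1_a, 3), round(g1_b, 3))
```

The rescaling check illustrates the point that dividing by $m_2 \sqrt{m_2}$ makes the measure independent of the scale of measurement.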
For set A, $m_3 = 0$ and $g_1 = 0$. For set B, $m_3 = 10.80$, $m_2 = 9.20$, and $g_1 = .387$. Set B is a positively skewed set of numbers.

The commonly used measure of kurtosis involves the fourth moment, and is defined as

[5.11]  [...]

This definition is based on the observation that large deviations from the mean, when raised to the fourth power, will contribute substantially to the fourth moment. The concept of kurtosis is more closely linked to the relative thickness of the tails of distributions than to the idea that one distribution may be flatter or more peaked than another. Large deviations from the mean contribute much more to the fourth moment than smaller deviations. The term $m_2^2$ is used to achieve comparability. It serves the same purpose
Sample standard devi[...]
Standard score
Moments about the m[...]
Measure of skewness
Measure of kurtosis

EXERCISES
1 For the measurem[...] mean deviation, [...]
2 The variance calc[...] sum of squares of [...]
3 A biased variance [...] What is the corres[...]
4 The variance for [...] variance be if all [...] (b) divided by a c[...]
5 Show that $\sum (X$ [...]
6 Scholastic aptitud[...] of 100. A student [...] Express these sco[...]
7 Express the meas[...] the sum of squa[...]
8 The mean and stan[...] for a class of 26 st[...] make scores of 50, [...] scores?
9 Calculate the seco[...] 6, 10, 14, 16. Cor[...]
10 The following are [...]
Group I    2  3 [...]
Group II   2  4 [...]
Calculate measures [...]