ferguson - variability

Upload: ashfaqchauhan

Post on 09-Apr-2018


TRANSCRIPT

  • 8/8/2019 Ferguson - Variability


    MEASURES OF VARIATION, SKEWNESS, AND KURTOSIS

    5.1 INTRODUCTION

    Of great concern to the statistician is the variation in the events of nature. The variation of one measurement from another is a persisting characteristic of any sample of measurements. Measurements of intelligence, eye color, reaction time, and skin resistance, for example, exhibit variation in any sample of individuals. Anthropometric measurements such as height, weight, diameter of the skull, length of the forearm, and angular separation of the metatarsals show variation between individuals. Anatomical and physiological measurements vary, as do the measurements made by the physicist, chemist, botanist, and agronomist. Statistics has been spoken of as the study of variation. Fisher (1970) has observed,

    The conception of statistics as the study of variation is the natural outcome of viewing the subject as the study of populations; for a population of individuals in all respects identical is completely described by a description of any one individual, together with the number in the group. The populations which are the object of statistical study always display variation in one or more respects.

    The experimental scientist is frequently concerned with the different circumstances, conditions, or sources which contribute to the variation in the measurements he or she obtains. The analysis of variance (Chapter 15) developed by Fisher is an important statistical procedure whereby the variation in a set of experimental data can be partitioned into components which frequently may be attributed to different causal circumstances.

    How may the variation in any set of measurements be described?


    Consider the following measurements for two samples:

        Sample A    10   12   15   18   20
        Sample B     2    8   15   22   28

    We note that the two samples have the same mean, namely, 15. Simple inspection indicates, however, that the measurements in sample B are more variable than those in sample A; they differ more one from another. Among the possible measures used to describe this variation are the range, the mean deviation, and the standard deviation. The most important of these is the standard deviation.

    5.2 THE RANGE

    The range is the simplest measure of variation. In any sample of measurements the range is taken as the difference between the largest and smallest measurements. The range for the measurements 10, 12, 15, 18, and 20 is 20 minus 10, or 10. The range for the measurements 2, 8, 15, 22, and 28 is 28 minus 2, or 26. The measurements in the second set quite clearly exhibit greater variation than those in the first set, and this reflects itself in a much greater range.

    The range has two disadvantages. First, for large samples it is an unstable descriptive measure. The sampling variance of the range for small samples is not much greater than that of the standard deviation but increases rapidly with increase in N. Second, the range is not independent of sample size, except under special circumstances. For distributions that taper to 0 at the extremities a better chance exists of obtaining extreme values for large than for small samples. Consequently, ranges calculated on samples composed of different numbers of cases are not directly comparable. Despite these disadvantages the range may be effectively used in the application of tests of significance with small samples.
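The range calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `value_range` is hypothetical, and the two samples are assumed to read 10, 12, 15, 18, 20 and 2, 8, 15, 22, 28, which is the reading consistent with the stated common mean of 15 and the stated ranges of 10 and 26.

```python
def value_range(values):
    """Range: the largest measurement minus the smallest measurement."""
    return max(values) - min(values)


# Assumed readings of the two samples (both have mean 15).
sample_a = [10, 12, 15, 18, 20]
sample_b = [2, 8, 15, 22, 28]

mean_a = sum(sample_a) / len(sample_a)
mean_b = sum(sample_b) / len(sample_b)
range_a = value_range(sample_a)
range_b = value_range(sample_b)

print("means:", mean_a, mean_b)
print("ranges:", range_a, range_b)
```

The two samples share a mean, yet the range separates them clearly, which is the point the section makes before turning to the weaknesses of the range.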

    5.3 THE MEAN DEVIATION

    Consider the following measurements:

        Sample A     8    8    8    8    8
        Sample B     1    4    7   10   13
        Sample C     1    5   20   25   29

    Intuitively, the measurements in sample A are less variable than those in B, which in turn are less variable than those in C. Indeed, the measurements in A exhibit no variation at all. The means of the three samples are 8, 7, and 16. If we express the measurements as deviations from their sample means, we obtain

        Sample A     0    0    0    0    0
        Sample B    -6   -3    0   +3   +6
        Sample C   -15  -11   +4   +9  +13

    Inspection of these numbers suggests that as variation increases, the departure of the observations from their sample mean increases. We may use this characteristic to define a measure of variation. One such measure is the mean deviation. The mean deviation is the arithmetic mean of the absolute deviations from the arithmetic mean. An absolute deviation is a deviation without regard to algebraic sign. To obtain the mean deviation we simply calculate the deviations from the arithmetic mean, sum these, disregarding algebraic sign, and divide by N. For sample A above, the mean deviation is 0. For sample B the mean deviation is (6 + 3 + 0 + 3 + 6)/5 = 18/5 = 3.6. For sample C the mean deviation is (15 + 11 + 4 + 9 + 13)/5 = 52/5 = 10.4. The mean deviation is given in algebraic language by the formula

    [5.1]    $MD = \frac{\sum |X - \bar{X}|}{N}$

    Here $X - \bar{X}$ is a deviation from the mean and $|X - \bar{X}|$ is a deviation without regard to algebraic sign. The vertical bars mean that signs are ignored.
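Formula [5.1] translates directly into code. The sketch below is illustrative only; `mean_deviation` is a hypothetical helper name, and the three samples are assumed to read 8, 8, 8, 8, 8; 1, 4, 7, 10, 13; and 1, 5, 20, 25, 29, the readings consistent with the stated means (8, 7, 16) and mean deviations (0, 3.6, 10.4).

```python
def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


# Assumed readings of samples A, B, and C.
sample_a = [8, 8, 8, 8, 8]
sample_b = [1, 4, 7, 10, 13]
sample_c = [1, 5, 20, 25, 29]

md_a = mean_deviation(sample_a)
md_b = mean_deviation(sample_b)
md_c = mean_deviation(sample_c)

print(md_a, md_b, md_c)
```

The call to `abs` plays the role of the vertical bars in the formula: it discards the algebraic sign of each deviation before the deviations are averaged.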

    Hitherto, symbols above and below the summation sign $\sum$ have been used to indicate the limits of the summation. In the above formula for the mean deviation these symbols have been omitted, the summation being clearly understood to extend over the N members in the sample. In this and subsequent chapters symbols indicating the limits of summation will, for convenience, be omitted where these are understood clearly from the context to extend over N sample members. Where any possibility of doubt could exist, the symbols above and below the summation sign will be inserted.

    The mean deviation is infrequently used. It is not readily amenable to algebraic manipulation. This circumstance stems from the use of absolute values. In general, in statistical work the use of absolute values should be avoided, if at all possible.

    The mean deviation is discussed here primarily for pedagogic reasons. It illustrates how one particular measure of variation may be defined.

    5.4 THE SAMPLE VARIANCE AND STANDARD DEVIATION

    Some of the deviations about the mean are positive; others are negative. The sum of deviations is 0. One method for dealing with the presence of negative signs is to use the absolute deviations, as in the calculation of the mean deviation as defined in formula [5.1]. This procedure has little to recommend it. An alternative and generally preferable procedure is to square the deviations about the mean, sum these squares, and use this sum of squares in the definition of a measure of variation. For example, the mean of the measurements 1, 4, 7, 10, and 13 is 7. The deviations from the mean are -6, -3, 0, +3, and +6. The squares of these deviations are 36, 9, 0, 9, and 36. The sum of squares is 90.

    The sum of squares of deviations about the mean is used in the definition of the variance. Two definitions of the variance exist, and both are in common use. One method defines the variance by dividing the sum of squares of deviations about the mean by N, the number of cases. Denote this statistic by $s^2$. Thus

    [5.2]    $s^2 = \frac{\sum (X - \bar{X})^2}{N}$

    In the illustrative example in the paragraph above, the sum of squares of deviations about the mean was 90, and the variance, according to formula [5.2], is $s^2 = 90/5 = 18$.

    An alternative method of defining the variance is to divide the sum of squares by $N - 1$ rather than $N$. Thus, according to this definition, the variance $s^2$ is given by

    [5.3]    $s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$

    Both formulas, [5.2] and [5.3], provide alternative definitions of the variance. These formulas have no derivation but are obtained by a process of plausible reasoning. The reader will note that no notational distinction is made here between the variance defined by formula [5.2] and that defined by formula [5.3].
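The two definitions can be compared on the measurements 1, 4, 7, 10, and 13 used above. The following is a minimal sketch; the helper names are hypothetical, and the cross-check relies on the Python standard library, where `statistics.pvariance` divides by N and `statistics.variance` divides by N - 1.

```python
import statistics


def sum_of_squares(values):
    """Sum of squared deviations about the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def variance_div_n(values):
    """Variance as in formula [5.2]: sum of squares divided by N."""
    return sum_of_squares(values) / len(values)


def variance_div_n_minus_1(values):
    """Variance as in formula [5.3]: sum of squares divided by N - 1."""
    return sum_of_squares(values) / (len(values) - 1)


data = [1, 4, 7, 10, 13]
ss = sum_of_squares(data)
var_n = variance_div_n(data)
var_n1 = variance_div_n_minus_1(data)

# Cross-check against the standard library.
lib_var_n = statistics.pvariance(data)
lib_var_n1 = statistics.variance(data)

print(ss, var_n, var_n1)
```

With only five measurements the two definitions differ noticeably (18 against 22.5); the gap shrinks as N grows, since the divisors N and N - 1 become nearly equal.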

    What is the essential difference between the variance as defined by formula [5.2] and the variance as defined by formula [5.3]? To understand the answer to this question the reader should recall that in Chapter 1 a distinction was made between sample values, or estimates, and population values, or parameters. Both formulas, [5.2] and [5.3], provide estimates of a population variance $\sigma^2$. For certain algebraic reasons, when we divide $\sum (X - \bar{X})^2$ by $N$ we obtain a biased estimate of $\sigma^2$. This estimate will show a systematic tendency to be less than $\sigma^2$. When we divide $\sum (X - \bar{X})^2$ by $N - 1$, however, we obtain an unbiased estimate of $\sigma^2$. Such an estimate will show no systematic tendency to be either greater than or less than $\sigma^2$.
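The systematic tendency described above can be made concrete with a small exhaustive check. The setup is an assumption introduced purely for illustration and is not taken from the text: the five values 1, 4, 7, 10, 13 are treated as an entire population, and every equally likely ordered sample of size 2, drawn with replacement, is enumerated. Exact fractions are used so that no rounding obscures the comparison.

```python
from fractions import Fraction
from itertools import product


def sum_of_squares(values):
    """Sum of squared deviations about the mean, in exact arithmetic."""
    mean = Fraction(sum(values), len(values))
    return sum((x - mean) ** 2 for x in values)


# Illustrative assumption: these five values are the whole population.
population = [1, 4, 7, 10, 13]
sigma2 = sum_of_squares(population) / len(population)

n = 2  # sample size
# All 25 ordered samples of size 2 drawn with replacement.
samples = list(product(population, repeat=n))

# Average of each estimate over all equally likely samples.
avg_div_n = sum(sum_of_squares(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_of_squares(s) / (n - 1) for s in samples) / len(samples)

print("population variance:", sigma2)
print("average estimate, divisor N:", avg_div_n)
print("average estimate, divisor N - 1:", avg_div_n_minus_1)
```

Under these assumptions the divisor N - 1 averages exactly to the population variance, while the divisor N averages to (N - 1)/N of it, which is half of it for samples of size 2.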

    In all situations where an estimate of a population variance $\sigma^2$ is required, the statistic $s^2$ which divides the sum of squares by $N - 1$ and not $N$ should be used. In the great majority of situations discussed in this book an unbiased estimate is required. In some situations involving descriptive statistics, convenience and simplicity dictate the use of an estimate which divides the sum of squares by $N$ and not $N - 1$. In g…


    … differences $X_1 - X_2$, $X_1 - X_3$, and $X_2 - X_3$. In general, for N measurements the number of such differences is $N(N - 1)/2$. To illustrate, for the measurements 1, 4, 7, 10, and 13, the differences between each measurement and every other measurement are -3, -6, -9, -12, -3, -6, -9, -3, -6, and -3. Note that the sign of the difference depends on the order of the measurements. If we obtain the sum of squares of the differences between each measurement and every other measurement and divide by the number of such differences, the result is closely related to $s^2$; in fact it is simply twice $s^2$. In our example the sum of squares of differences is 450. We divide this by 10 to obtain 45.0, which is seen to be twice the variance, 22.5, as calculated by formula [5.3]. In general, in algebraic notation it may be shown that

    [5.4]    $2s^2 = \frac{\sum (X_i - X_j)^2}{N(N - 1)/2}$

    where the summation is understood to extend over $N(N - 1)/2$ differences. This result means that $s^2$ is a descriptive index of how different each value is from every other value; in fact it is an average of the squared differences divided by 2.

    The variance is a statistic in squared units. If $X - \bar{X}$ is a deviation in feet, then $(X - \bar{X})^2$ is a deviation in feet squared. For many purposes it is desirable to use a measure of variation which is not in squared units but in units of the original measurements themselves. We obtain this result by taking the square root of either formula [5.2] or formula [5.3]. This statistic is called the standard deviation. Thus

    [5.5]    $s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$

    or

    [5.6]    $s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$
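The relation between pairwise differences and the variance can be verified numerically for the measurements 1, 4, 7, 10, and 13. The sketch below is illustrative; the variable names are hypothetical, and the variance is taken with the divisor N - 1, as in formula [5.3].

```python
import math
from itertools import combinations

data = [1, 4, 7, 10, 13]
n = len(data)

# Difference between each measurement and every other measurement,
# taking each pair once: N(N - 1)/2 differences in all.
differences = [a - b for a, b in combinations(data, 2)]
n_pairs = n * (n - 1) // 2

sum_sq_diff = sum(d * d for d in differences)
mean_sq_diff = sum_sq_diff / n_pairs

# Variance with divisor N - 1 and the corresponding standard deviation.
mean = sum(data) / n
s2 = sum((x - mean) ** 2 for x in data) / (n - 1)
s = math.sqrt(s2)

print(differences)
print(sum_sq_diff, mean_sq_diff, s2, round(s, 2))
```

Because each difference is squared, the order in which a pair is subtracted does not affect the result, even though it determines the sign of the difference itself.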

    5.5 AN ILLUSTRATIVE APPLICATION

    Our understanding of the nature of the variance and the standard deviation will be enriched by considering illustrative situations where these statistics are of interest. Consider a simple experiment designed to investigate the effect of a drug on a cognitive task such as coding. An experimental group of subjects, who receive the drug, and a control group, who do not receive the drug, are used. Each group contains 10 subjects. Let us assume the scores on the coding task for the two groups are as follows:

        Experimental     5    7   17   31   45   47   68   85   96   99
        Control         29   36   37   42   49   58   62   63   69   70

    The mean score for the experimental group is 50.0, and that for the control, 51.5. The investigator might be led to conclude from inspecting these means that the drug had little or no effect on the performance of the subjects. The standard deviations for the two groups are, respectively, 35.63 and 14.86, the experimental group being much more variable in performance than the control group. Quite clearly the treatment appears to be exerting a substantial influence on the variation in performance, although its influence on level of performance is negligible. In the analysis of experimental data the investigator must attend to, and if possible interpret, differences in the standard deviation, or variance, as well as differences in the arithmetic mean.
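The summary figures quoted for the two groups can be reproduced with the Python standard library. Two assumptions should be noted: the individual scores below are the readings of the table that are consistent with the reported means (50.0 and 51.5) and standard deviations (35.63 and 14.86), and the standard deviation is computed with the divisor N - 1, which is what `statistics.stdev` uses.

```python
import statistics

# Assumed readings of the coding-task scores for the two groups.
experimental = [5, 7, 17, 31, 45, 47, 68, 85, 96, 99]
control = [29, 36, 37, 42, 49, 58, 62, 63, 69, 70]

mean_exp = statistics.mean(experimental)
mean_ctl = statistics.mean(control)

# statistics.stdev divides the sum of squares by N - 1.
sd_exp = statistics.stdev(experimental)
sd_ctl = statistics.stdev(control)

print("means:", mean_exp, mean_ctl)
print("standard deviations:", round(sd_exp, 2), round(sd_ctl, 2))
```

A comparison of the means alone would hide the effect; the standard deviations differ by a factor of more than two.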

    5.6 CALCULATING THE SAMPLE VARIANCE AND THE STANDARD DEVIATION FROM UNGROUPED DATA

    For purposes of calculation, it is convenient to write the variance and the standard deviation in a different form. The variance may be written as

    $s^2 = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{\sum (X^2 + \bar{X}^2 - 2X\bar{X})}{N - 1} = \frac{\sum X^2 + N\bar{X}^2 - 2N\bar{X}^2}{N - 1} = \frac{\sum X^2 - N\bar{X}^2}{N - 1}$

    In this derivation note that the summation of $\bar{X}^2$ over N is simply $N\bar{X}^2$; also the summation of $2X\bar{X}$ is $2\bar{X}\sum X = 2N\bar{X}^2$, since $\sum X = N\bar{X}$. The standard deviation is given by

    $s = \sqrt{\frac{\sum X^2 - N\bar{X}^2}{N - 1}}$

    Thus to calculate the standard deviation using this formula, we sum the squares of the original observations, subtract from this N times the square of the arithmetic mean, divide by $N - 1$, and then take the square root. For example, the five observations 1, 4, 7, 10, and 13 have a mean of 7. The squares of these observations are 1, 16, 49, 100, and 169. The sum of these squared observations is 335. The variance is then

    $s^2 = \frac{335 - 5 \times 7^2}{5 - 1} = \frac{90}{4} = 22.50$

    and the standard deviation is $\sqrt{22.50} = 4.74$.

    An alternative formula for the standard deviation which avoids the calculation of the arithmetic mean and may, therefore, be useful for certain computational purposes is

    [5.7]    $s = \sqrt{\frac{N\sum X^2 - (\sum X)^2}{N(N - 1)}}$

    This formula requires one operation of division only.
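The three routes to the standard deviation (the definition, the computing form based on the sum of squared observations, and the mean-free form of formula [5.7]) can be compared directly. The function names in this sketch are hypothetical, and all three use the divisor N - 1.

```python
import math


def sd_definition(values):
    """Square root of the sum of squared deviations divided by N - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def sd_from_squares(values):
    """Computing form: (sum of X^2 minus N times the squared mean) / (N - 1)."""
    n = len(values)
    mean = sum(values) / n
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - n * mean ** 2) / (n - 1))


def sd_without_mean(values):
    """Formula [5.7]: needs only N, the sum of X, and the sum of X^2."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


data = [1, 4, 7, 10, 13]
sum_x2 = sum(x * x for x in data)
results = [sd_definition(data), sd_from_squares(data), sd_without_mean(data)]

print(sum_x2, [round(r, 2) for r in results])
```

One caution that goes beyond the text: with floating-point arithmetic the computing forms subtract two large, nearly equal quantities when the mean is large relative to the spread, so the definitional form is usually the safer choice in software.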

    5.7 THE EFFECT ON THE STANDARD DEVIATION OF ADDING OR MULTIPLYING BY A CONSTANT

    If a constant is added to all the observations in a sample, the standard deviation remains unchanged. An examiner may conclude, for example, that an examination is too difficult. He may decide to add 10 points to all the marks assigned. The standard deviation of the original marks will be the same as the standard deviation of marks with the 10 points added. This result follows directly from the fact that if $X$ is an observation, the corresponding observation with the constant $c$ added is $X + c$. If $\bar{X}$ is the mean of the original observations, the mean with the constant added is $\bar{X} + c$. A deviation from the mean of the observations with the constant added is then $(X + c) - (\bar{X} + c)$, which is readily observed to be equal to $X - \bar{X}$. Since the deviations about the mean are unchanged by the addition of a constant, the standard deviation will remain unchanged. To illustrate, by adding a constant, say, 5, to the measurements 1, 4, 7, 10, and 13, we obtain 6, 9, 12, 15, and 18. The mean of the original measurements is 7, and the mean of the measurements with the constant added is 7 + 5, or 12. The deviations from the mean are in both instances the same, namely, -6, -3, 0, +3, and +6. The standard deviation in both instances is 4.74.

    If all measurements in a sample are multiplied by a constant, the standard deviation is also multiplied by the absolute value of that constant. If the standard deviation of examination marks is 4 and all marks are multiplied by the constant 3, then the standard deviation of the resulting marks is 3 x 4 = 12. To demonstrate this result, we observe that if $\bar{X}$ is the mean of a sample of measurements, the mean of the measurements multiplied by $c$ is $c\bar{X}$. A deviation from the mean is then $cX - c\bar{X} = c(X - \bar{X})$. By squaring, summing over N observations, and dividing by $N - 1$, we obtain

    $\frac{\sum (cX - c\bar{X})^2}{N - 1} = \frac{c^2 \sum (X - \bar{X})^2}{N - 1} = c^2 s^2$

    Thus if all measurements are multiplied by a constant $c$, the variance is multiplied by $c^2$ and the standard deviation by the absolute value of $c$. If $c$ is a negative number, say, -3, $s$ is multiplied by the absolute value 3. By way of illustration, the measurements 1, 4, 7, 10, 13 have a mean of 7, a variance of 22.50, and a standard deviation of 4.74. If the measurements are multiplied by the constant 5, we obtain 5, 20, 35, 50, 65. The mean is now 5 x 7, or 35. The deviations from the mean are -30, -15, 0, +15, +30. Squaring these we obtain 900, 225, 0, 225, 900. The sum of squares is 2,250, the variance is 562.50, and the standard deviation is 23.72, whereas 5 times the original standard deviation of 4.74 is 23.70. The slight discrepancy results from the rounding of decimals.
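Both rules can be checked on the measurements 1, 4, 7, 10, and 13. The sketch below uses `statistics.stdev` and `statistics.variance` from the Python standard library (divisor N - 1); the constants 5 and -3 follow the examples in the text, and the variable names are hypothetical.

```python
import math
import statistics

marks = [1, 4, 7, 10, 13]
s = statistics.stdev(marks)

shifted = [x + 5 for x in marks]    # constant added
scaled = [5 * x for x in marks]     # multiplied by a positive constant
negated = [-3 * x for x in marks]   # multiplied by a negative constant

s_shifted = statistics.stdev(shifted)
s_scaled = statistics.stdev(scaled)
s_negated = statistics.stdev(negated)
var_scaled = statistics.variance(scaled)

print(round(s, 2), round(s_shifted, 2), round(s_scaled, 2), round(s_negated, 2))
```

Carrying full precision through the calculation removes the small discrepancy mentioned above: 5 times the unrounded standard deviation agrees with the standard deviation of the multiplied measurements.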

    5.8 STANDARD SCORES

    Hitherto we have considered scores or measurements in the form in which they were originally obtained. Such scores are represented by the symbol $X$ with mean $\bar{X}$ and standard deviation $s$. Such scores in their original form are spoken of as raw scores. We have also considered deviations about the arithmetic mean, $x = X - \bar{X}$. These are known as deviation scores and have a mean of 0 and a standard deviation of $s$. If now we divide the deviation about the mean by the standard deviation, we obtain what is called a standard score, represented by the symbol $z$. Thus

    $z = \frac{X - \bar{X}}{s} = \frac{x}{s}$

    Standard scores have a mean of 0 and a standard deviation of 1. As previously shown, if all measurements in a sample are multiplied by a constant, the standard deviation is also multiplied by the absolute value of that constant. Deviation scores, $x = X - \bar{X}$, have a standard deviation $s$. Each score has a constant $-\bar{X}$ added. This leaves $s$ unchanged. If all the deviation scores are divided by $s$, which is the same thing as multiplying by the constant $1/s$, the standard deviation of the scores thus obtained is $s/s = 1$.

    To illustrate, the following observations have been expressed in raw-score, deviation-score, and standard-score form.

        Individual       X       x       z
        A                3      -7   -1.11
        B                6      -4    -.63
        C                7      -3    -.47
        D                9      -1    -.16
        E               15       5     .79
        F               20      10    1.58
        Sum             60     .00     .00
        Mean            10     .00     .00
        s             6.32    6.32    1.00

    Because standard scores have zero mean and unit standard deviation, they are readily amenable to certain forms of algebraic manipulation. Many formulations can be derived more conveniently using standard scores than using raw or deviation scores.

    The use of standard scores means, in effect, that we are using the standard deviation as the unit of measurement. In the above example individual A is 1.11 standard deviations, or standard deviation units, below the mean, while individual F is 1.58 standard deviation units above the mean.

    Standard scores are frequently used to obtain comparability of observations obtained by different procedures. Consider examinations in English and mathematics applied to the same group of individuals, and assume the means and standard deviations to be as follows:

        Examination     Mean     s
        English           65     8
        Mathematics       52    12

    In effect, in relation to the performance of the individuals in the group, a score of 65 on the English examination is the equivalent of a score of 52 on the mathematics examination. To illustrate, a score one standard deviation above the mean, that is, 65 + 8, or 73, on the English examination can be considered to be the equivalent of a score one standard deviation above the mean, that is, 52 + 12, or 64, on the mathematics examination. If an individual makes a score of 57 on the English examination and a score of 58 on the mathematics examination, we may compare his relative performance on the two subjects by comparing his standard scores. On English his standard score is (57 - 65)/8 = -1.0, and on mathematics his standard score is (58 - 52)/12 = .5. Thus on English his performance is one standard deviation unit below the average, while on mathematics his performance is .5 standard deviation unit above the average. Quite clearly, this individual did much more poorly in English than in mathematics relative to the performance of the group of individuals taking the examinations, although this is not reflected in the original marks assigned. To attain rigorous comparability of scores, the distributions of scores on the two tests should be identical in shape. The meaning of this statement will become clear as we proceed.

    The reader should note that the sum of squares of standard scores, $\sum z^2$, is equal to $N - 1$. We observe that $z^2 = (X - \bar{X})^2 / s^2$; hence

    $\sum z^2 = \frac{\sum (X - \bar{X})^2}{s^2} = \frac{(N - 1)s^2}{s^2} = N - 1$

    The reader should also note that where $s^2$ is defined as $\sum (X - \bar{X})^2 / N$, the sum of squares of standard scores is $N$ and not $N - 1$.
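The properties of standard scores stated above can be checked numerically. In this sketch the six raw scores are assumed to be 3, 6, 7, 9, 15, and 20, the reading consistent with the reported mean of 10, standard deviation of 6.32, and standard scores of -1.11 and 1.58 for individuals A and F; `standard_scores` is a hypothetical helper, and the divisor N - 1 is used throughout.

```python
import math
import statistics


def standard_scores(values):
    """Convert raw scores to standard scores: z = (X - mean) / s."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # divisor N - 1
    return [(x - mean) / s for x in values]


# Assumed raw scores for individuals A through F.
raw = [3, 6, 7, 9, 15, 20]
z = standard_scores(raw)

z_mean = statistics.mean(z)
z_sd = statistics.stdev(z)
sum_z_squared = sum(v * v for v in z)

# Examination example: raw scores of 57 (English) and 58 (mathematics).
z_english = (57 - 65) / 8
z_mathematics = (58 - 52) / 12

print([round(v, 2) for v in z])
print(round(z_mean, 10), round(z_sd, 10), round(sum_z_squared, 10))
print(z_english, z_mathematics)
```

The examination comparison shows why standard scores are used: the raw marks 57 and 58 look nearly identical, yet they lie on opposite sides of their respective group means.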

    5.9 ADVANTAGES OF THE VARIANCE AND STANDARD DEVIATION AS MEASURES OF VARIATION

    The variance and standard deviation have many advantages over other measures of variation. Much statistical work involves their use. The variance has certain additive properties and may on occasion be partitioned into additive components, each of which may be related to some causal circumstance. The sample standard deviation is a more stable or accurate estimate of the population parameter than other measures of variation. Under certain assumptions it provides a more stable estimate of the standard deviation in the population than the sample mean deviation, for example, does of the mean deviation in the population. The variance and standard deviation are more amenable to mathematical manipulation than other measures. They enter into formulas for the computation of many types of statistics. They are widely used as measures of error. In later discussion on sampling statistics the reader will observe that the standard error is in effect the standard deviation of errors made in estimating population parameters from sample values. These errors result from the operation of chance factors in random sampling. A full appreciation of the importance and meaning of the variance and standard deviation in their many ramifications requires considerable familiarity with statistical ideas.

    5.10 MOMENTS ABOUT THE MEAN

    The mean and the standard deviation are closely related to a family of descriptive statistics known as moments. The first four moments about the arithmetic mean are as follows:

    [5.8]    $m_1 = \frac{\sum (X - \bar{X})}{N} = 0$

             $m_2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{N - 1}{N} s^2$

             $m_3 = \frac{\sum (X - \bar{X})^3}{N}$

             $m_4 = \frac{\sum (X - \bar{X})^4}{N}$

    In general, the rth moment about the mean is given by

    [5.9]    $m_r = \frac{\sum (X - \bar{X})^r}{N}$

    The term "moment" originates in mechanics. Consider a lever supported by a fulcrum. If a force $f_1$ is applied to the lever at a distance $x_1$ from the origin, then $f_1 x_1$ is called the moment of the force. Further, if a second force $f_2$ is applied at a distance $x_2$, the total moment is $f_1 x_1 + f_2 x_2$. If we square the distances $x$, we obtain the second moment; if we cube them, we obtain the third moment; and so on. When we come to consider frequency distributions, the origin is the analog of the fulcrum and the frequencies in the various class intervals are analogous to forces operating at various distances from the origin. Observe that the first moment about the mean is 0 and the second moment is $(N - 1)/N$ times the unbiased sample variance. The third moment is used to obtain a measure of skewness, and the fourth moment, a measure of kurtosis.
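The general definition of the rth moment about the mean lends itself to a single short function. The sketch below is illustrative; `moment` is a hypothetical helper name, and the data are the measurements 1, 4, 7, 10, and 13 used in earlier sections.

```python
import statistics


def moment(values, r):
    """r-th moment about the arithmetic mean: sum of (X - mean)^r over N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


data = [1, 4, 7, 10, 13]
n = len(data)

m1 = moment(data, 1)
m2 = moment(data, 2)
m3 = moment(data, 3)
m4 = moment(data, 4)

# Unbiased sample variance (divisor N - 1) for comparison with m2.
s2 = statistics.variance(data)

print(m1, m2, m3, m4)
print(m2, (n - 1) / n * s2)
```

For this symmetrical set the third moment is 0 as well as the first, which anticipates the use of the third moment as a measure of skewness.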

    5.11 MEASURES OF SKEWNESS AND KURTOSIS

    The commonly used measure of skewness makes use of the third moment and is defined as

    [5.10]    $g_1 = \frac{m_3}{m_2 \sqrt{m_2}}$

    The rationale for this statistic is based on the observation that when a distribution (or any set of numbers) is symmetrical, the sum of deviations above the mean, when raised to the third power, will balance the sum of deviations below the mean, when raised to the third power. Thus for a symmetrical distribution, $m_3 = 0$ and $g_1 = 0$. If the distribution is asymmetrical, the sums of deviations above and below the mean, when raised to the third power, will not balance. Thus for an asymmetrical distribution $m_3 \neq 0$ and $g_1 \neq 0$. If the distribution, or set of numbers, is positively skewed, $g_1$ is positive; when negatively skewed $g_1$ is negative. The quantity $m_2 \sqrt{m_2}$ is introduced in order to ensure that $g_1$ is comparable for distributions that differ in variability. Thus $g_1$ is independent of the scale of measurement. The skewness of a set of measurements in grams, meters, pounds, or units of score on a psychological test can be directly compared using $g_1$. The reader will recall that a standard score is defined as $z = (X - \bar{X})/s$. One reason for using standard scores is to achieve comparability of scores from one set of measurements to another. The use of $m_2 \sqrt{m_2}$ in the definition of $g_1$ is directly analogous to the use of $s$ in the definition of a standard score.

    As an illustration of $g_1$, consider two sets of numbers, A and B:

        A      2     6    10    14    18
        B      6     8    10    11    15

    These numbers expressed as deviations from the mean become

        A     -8    -4     0    +4    +8
        B     -4    -2     0    +1    +5

    Set A is a symmetrical set of numbers, and set B is asymmetrical. These deviations raised to the third power are as follows:

        A   -512   -64     0   +64  +512
        B    -64    -8     0    +1  +125

    For set A, $m_3 = 0$ and $g_1 = 0$. For set B, $m_3 = 10.80$, $m_2 = 9.20$, and $g_1 = .387$. Set B is a positively skewed set of numbers.

    The commonly used measure of kurtosis involves the fourth moment, and is defined as

    [5.11]    $g_2 = \frac{m_4}{m_2^2} - 3$

    This definition is based on the observation that large deviations from the mean, when raised to the fourth power, will contribute substantially to the fourth moment. The concept of kurtosis is more closely linked to the relative thickness of the tails of distributions than to the idea that one distribution may be flatter or more peaked than another. Large deviations from the mean contribute much more to the fourth moment than smaller deviations. The term $m_2^2$ is used to achieve comparability. It serves the same purpose…
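The skewness illustration can be reproduced in a few lines. Several assumptions are made in this sketch: sets A and B are taken to be 2, 6, 10, 14, 18 and 6, 8, 10, 11, 15, the readings consistent with the reported moments for set B; the function names are hypothetical; and the kurtosis function uses the common form of the fourth moment divided by the squared second moment, minus 3, which is assumed here rather than confirmed by the surviving text.

```python
import math


def moment(values, r):
    """r-th moment about the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


def skewness_g1(values):
    """g1 = m3 / (m2 * sqrt(m2))."""
    m2 = moment(values, 2)
    return moment(values, 3) / (m2 * math.sqrt(m2))


def kurtosis_g2(values):
    """Assumed form: g2 = m4 / m2^2 - 3."""
    return moment(values, 4) / moment(values, 2) ** 2 - 3


# Assumed readings of the two sets.
set_a = [2, 6, 10, 14, 18]
set_b = [6, 8, 10, 11, 15]

g1_a = skewness_g1(set_a)
g1_b = skewness_g1(set_b)

# Changing the unit of measurement leaves g1 and g2 unchanged.
set_b_rescaled = [10 * x for x in set_b]
g1_b_rescaled = skewness_g1(set_b_rescaled)
g2_b = kurtosis_g2(set_b)
g2_b_rescaled = kurtosis_g2(set_b_rescaled)

print(g1_a, round(g1_b, 3), round(g1_b_rescaled, 3))
```

Rescaling set B by a factor of 10 leaves both statistics unchanged, which is the comparability that the divisors built from the second moment are meant to provide.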


    Sample standard devi…
    Standard score…
    Moments about the m…
    Measure of skewness.
    Measure of kurtosis.

    EXERCISES

    1 For the measurem… mean deviation, (…
    2 The variance calc… sum of squares of…
    3 A biased variance… What is the corres…
    4 The variance for… variance be if all… (b) divided by a c…
    5 Show that $\sum$(X…
    Scholastic aptitud… of 100. A student… Express these scor…
    Express the meas… the sum of squa…
    8 The mean and stan… for a class of 26 st… make scores of 50,… scores?
    9 Calculate the seco… 6, 10, 14, 16. Co…
    10 The following are…
        Group I 2 3
        Group II 2 4
    Calculate measures…
