Day 2, Morning: The Logic of Distributions
Instructor: Marshall A. Taylor, 844 Flanner Hall, mtaylo15@nd.edu, marshalltaylor.net

Recap
• Yesterday morning we talked about:
– The descriptive and inferential purposes of statistics.
– The difference between samples and populations, the issues that arise because of sampling error and sample bias, and the ways in which probability theory can be used to address the former.
– The basics of probability theory.

Recap
• Yesterday evening we talked about:
– The difference between univariate, bivariate, and multivariate statistics.
– What a variable is and the general forms that it can take: nominal, ordinal, or continuous.
– Measures of central tendency for each type of variable.
– Measures of dispersion.

Game Plan for Today
• Morning
– We will bring these previous lectures together to show how we can use probability theory to assess how representative our sample distribution is of the population distribution.
• Evening
– After outlining the asymptotic theory of probability distributions in the morning, we will then examine basic univariate tests of statistical inference to quantify how well our sample statistics approximate population parameters net of sampling error.

What is a distribution?
In statistics, a distribution is simply the array of values for one or more variables across a set of units (people, groups, etc.).

The distribution is everything
• The concept of "distribution" has been at the implicit center of absolutely everything we have talked about so far!
• We have specifically looked at sample statistic distributions, such as…

Frequency Distributions
. tab health
1=poor,..., 5=excellent | Freq. | Percent | Cum.
poor | 729 | 7.05 | 7.05
fair | 1,670 | 16.16 | 23.21
average | 2,938 | 28.43 | 51.64
good | 2,591 | 25.07 | 76.71
excellent | 2,407 | 23.29 | 100.00
Total | 10,335 | 100.00 |
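
As a rough illustration of how the Freq., Percent, and Cum. columns relate to one another, the sketch below rebuilds them in Python from the category counts in the `tab health` output. The `counts` dictionary and all variable names are ours for illustration; they are not part of the original Stata session.

```python
# Category counts copied from the `tab health` output (1 = poor, ..., 5 = excellent).
counts = {"poor": 729, "fair": 1670, "average": 2938, "good": 2591, "excellent": 2407}

total = sum(counts.values())

# Build (category, frequency, percent, cumulative percent) rows.
rows = []
running = 0
for category, freq in counts.items():
    running += freq
    rows.append((category, freq, 100 * freq / total, 100 * running / total))

for category, freq, pct, cum in rows:
    print(f"{category:>10} {freq:>7,} {pct:>8.2f} {cum:>8.2f}")
print(f"{'Total':>10} {total:>7,} {100:>8.2f}")
```

Each percent is the category count divided by the total, and each cumulative percent is the running sum of the percents up to and including that category.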

Frequency distributions as "histograms"
[Figure: histogram of systolic blood pressure. x-axis: systolic blood pressure (50 to 300); y-axis: frequency (0 to 1,500).]

Bimodal distribution with categorical variables
[Figure: bar chart for Male and Female with bars for four categories (Solid R or D, Likely R or D, Leaning R or D, Toss-Up). y-axis: 0 to .8.]

Bimodal distribution with continuous variables
[Figure: histogram of age. x-axis: age in years (20 to 80); y-axis: frequency (0 to 600).]
Note the two peaks.

Distributions and Probability Theory
• Distributions serve much more than a descriptive purpose.
• We rely on the asymptotic theory of probability distributions to make statistical inferences.
• Such a theory gives us dependable ideas about what the sampling distribution will look like.

What is "asymptotic"?
• "Asymptotic" refers to the property that, if sampled an infinite number of times, a statistic will converge to the population parameter it is meant to approximate.
• That is, the estimates settle on the population parameter as the number of samples approaches infinity, assuming that all n are random samples of N.

What is a "probability distribution"?
• A "probability distribution" is an array of probabilistic values for a variable across a set of units, where the values are proportions that must sum to 1.
• In the table below, each row (r1 to r15) is one such distribution over v1 to v5; the RMargins column confirms that each row sums to 1.
row | v1 | v2 | v3 | v4 | v5 | RMargins
r1 | 0.0000576938 | 0.1088889841 | 0.1090984138 | 0.0000576950 | 0.7818972133 | 1
r2 | 0.0000476265 | 0.0000476352 | 0.9123043912 | 0.0119956500 | 0.0756046971 | 1
r3 | 0.0000881123 | 0.0179937158 | 0.0264640660 | 0.0000881160 | 0.9553659899 | 1
r4 | 0.1108000716 | 0.2503997467 | 0.5927814401 | 0.0459855150 | 0.0000332266 | 1
r5 | 0.2411895005 | 0.0404012131 | 0.6524702741 | 0.0001041705 | 0.0658348418 | 1
r6 | 0.1470047772 | 0.0002170118 | 0.6761725881 | 0.0002169660 | 0.1763886570 | 1
r7 | 0.8856909695 | 0.0001638012 | 0.0001637739 | 0.0001637537 | 0.1138177018 | 1
r8 | 0.2807926994 | 0.2093476438 | 0.0530959614 | 0.0000664792 | 0.4566972163 | 1
r9 | 0.5049335568 | 0.0260158232 | 0.0000863853 | 0.4424611335 | 0.0265031011 | 1
r10 | 0.0203117985 | 0.3800703244 | 0.2362073342 | 0.0352103566 | 0.3282001863 | 1
r11 | 0.1901308947 | 0.5532532962 | 0.0002811341 | 0.0002811777 | 0.2560534974 | 1
r12 | 0.3384214744 | 0.4279357679 | 0.1207937032 | 0.1128139993 | 0.0000350552 | 1
r13 | 0.8520671572 | 0.0001154451 | 0.0462275639 | 0.1014744017 | 0.0001154321 | 1
r14 | 0.8163828385 | 0.1158173532 | 0.0000841640 | 0.0213096942 | 0.0464059501 | 1
r15 | 0.4869992267 | 0.2536825278 | 0.0000751098 | 0.0210530230 | 0.2381901127 | 1
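
The "must sum to 1" condition is easy to check by hand or in code. The following is a minimal Python sketch using the first three rows of the table; the helper name `is_probability_distribution` and its tolerance are our own choices for illustration.

```python
import math

# First three rows of the table (columns v1..v5).
rows = {
    "r1": [0.0000576938, 0.1088889841, 0.1090984138, 0.0000576950, 0.7818972133],
    "r2": [0.0000476265, 0.0000476352, 0.9123043912, 0.0119956500, 0.0756046971],
    "r3": [0.0000881123, 0.0179937158, 0.0264640660, 0.0000881160, 0.9553659899],
}

def is_probability_distribution(values, tol=1e-9):
    """True if every value lies in [0, 1] and the values sum to 1 (within tol)."""
    in_range = all(0 <= v <= 1 for v in values)
    return in_range and math.isclose(sum(values), 1.0, abs_tol=tol)

for name, values in rows.items():
    print(name, round(sum(values), 10), is_probability_distribution(values))
```

A set of proportions such as `[0.5, 0.6]` would fail the check, because it sums to more than 1.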

Putting it together
• If we have an infinite number of random sample estimates from the same population, the mean of this distribution of estimates will converge to the population mean.
• Given that we can never really have an infinite number of samples, the asymptotic theory of probability distributions suggests that, with larger sample sizes, we can infer with greater degrees of probabilistic confidence whether or not our single sample statistic accurately reflects the unknown population parameter.
• As n approaches N, the sampling error gets smaller and smaller, meaning that the reliability of our estimate gets better and better. This is because, theoretically, if the sample size (n) keeps growing, it will eventually just be the population (N)!
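
The bullets above can be explored with a small simulation. This is only a sketch under made-up assumptions: the population is 20,000 hypothetical values drawn from a skewed distribution, the seed is arbitrary, and the function name `sample_means` is ours. It draws many random samples at several sample sizes and compares the mean and spread of the resulting sample means.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# A made-up, right-skewed "population" of 20,000 values (hypothetical data).
population = [rng.expovariate(1 / 50) for _ in range(20_000)]
mu = statistics.fmean(population)

def sample_means(n, draws=1_000):
    """Means of `draws` simple random samples of size n (without replacement)."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]

results = {}
for n in (10, 100, 1_000):
    means = sample_means(n)
    # Store (mean of the sample means, standard deviation of the sample means).
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n={n:>5}: mean of means={results[n][0]:.2f}, "
          f"spread={results[n][1]:.2f}, population mean={mu:.2f}")
```

The mean of the sample means should sit close to the population mean at every n, while the spread of the sample means (the sampling error) shrinks as n grows.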

Putting it together
• Of course, such a theory requires that we make assumptions about the shape of the unknown population distribution.
• Otherwise we don't know what n is approximating!

Central Limit Theorem
• Lucky for us, some very intelligent statisticians who came before us noticed that, as sample sizes grow larger and larger, the distribution of sample means becomes approximately normal, regardless of whether or not the variable itself is normally distributed in the population. This is the Central Limit Theorem (CLT).
• By normal distribution, we mean a symmetric distribution where approximately half of the data fall to either side of the mean. It is commonly known as a distribution that follows a bell curve.

Central Limit Theorem
According to the CLT, we can expect that, with a normal distribution of random sample means, approximately 68% of the sample means will be within one standard deviation on either side of the population mean (μ), 95% will be within two, and 99.7% within three.
*Figure from Math Is Fun website (https://www.mathsisfun.com/data/standard-normal-distribution.html).
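
The 68%, 95%, and 99.7% figures come straight from the normal curve. As a quick check, the sketch below computes the share of a normal distribution lying within k standard deviations of its mean using the error function from Python's standard library; the function name `share_within` is ours.

```python
import math

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, p in shares.items():
    print(f"within {k} SD: {100 * p:.1f}%")
```

The unrounded shares are roughly 68.3%, 95.4%, and 99.7%, which is why the rule is stated with "approximately".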

Central Limit Theorem
Notice how the data become more symmetric about the mean as the sample size increases. As such, large sample sizes can serve as "proxies" for repeated random samples and justify the CLT.
[Figure: four density plots, one each for n = 50, n = 500, n = 5,000, and n = 50,000. y-axis: density.]
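
A related way to see the CLT at work is to start from a clearly non-normal population and watch the distribution of sample means lose its skew as n grows. The sketch below is not the simulation behind the slide's figure; it is a separate illustration under our own assumptions (an exponential population, an arbitrary seed, and hypothetical helper names `skewness` and `means_of_samples`).

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Sample skewness: near 0 for a symmetric distribution, positive for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def means_of_samples(n, draws=2_000):
    """Means of `draws` random samples of size n from a skewed (exponential) population."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(draws)]

skews = {n: skewness(means_of_samples(n)) for n in (2, 50, 500)}
for n, sk in skews.items():
    print(f"n={n:>3}: skewness of sample means = {sk:.2f}")
```

With very small samples the sample means inherit the population's right skew; by n = 500 their distribution is close to symmetric, as the CLT leads us to expect.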

Standard Error
• So, what can we say about population parameters given a sample statistic and these population distribution assumptions?
• For starters, we can calculate the standard deviation of the theoretical distribution of random sample means around the unknown population mean, also known as the standard deviation of the sampling distribution. This is known as the standard error, and can be found with:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
– Where $\sigma$ is the standard deviation of the variable in the population and the denominator is the square root of the sample size.

Standard Error
• Of course, $\sigma$ is usually not known, so we use the sample standard deviation as an approximation:
$$s_{\bar{x}} = \frac{\sqrt{\dfrac{\sum_{i}(x_i - \bar{x})^2}{n - 1}}}{\sqrt{n}}$$
• Or more simply:
$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$

Standard Error
• What does the standard error of the systolic blood pressure variable tell us? How is this different from the standard deviation?
. sum bpsystol
Variable | Obs | Mean | Std. Dev. | Min | Max
bpsystol | 10337 | 130.8826 | 23.34159 | 65 | 300
. ci bpsystol
Variable | Obs | Mean | Std. Err. | [95% Conf. Interval]
bpsystol | 10337 | 130.8826 | .2295796 | 130.4325 to 131.3326

Standard Error
• A random sample mean drawn from the population (such as this one) likely differs from the population mean systolic blood pressure by about 0.23 mmHg.

Standard Error
• The standard deviation, however, is merely a descriptive indication of variable dispersion. The average respondent in the sample diverges about 23.34 mmHg from the mean.

Standard Error
• Just to check the math:
$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{23.34159}{\sqrt{10337}} \approx 0.2296$$
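
The same check can be scripted. This short Python sketch plugs the Std. Dev. and Obs values reported by `sum bpsystol` into $s/\sqrt{n}$; the variable names are ours.

```python
import math

# Values reported by `sum bpsystol`.
n = 10337
sd = 23.34159

# Standard error of the mean: sample standard deviation over the square root of n.
standard_error = sd / math.sqrt(n)
print(f"{standard_error:.4f}")
```

The result agrees with the Std. Err. column of `ci bpsystol` (.2295796) to about six decimal places; the small remainder comes from the rounded standard deviation.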

Standard Error
• Notice that $s_{\bar{x}}$ gets smaller when two things happen:
– (1) When s, the standard deviation, is small.
– (2) When the sample size is large.
• But also note that s itself does not systematically shrink as the sample size grows; it settles toward the population standard deviation, $\sigma$. The standard error shrinks because of the $\sqrt{n}$ in the denominator.
• When n is large (and therefore a closer approximation of N), the sampling distribution varies less and it is more likely that the sample mean represents the population mean!
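
To make the two effects concrete, the sketch below holds s fixed at the systolic blood pressure value and varies n, then holds n fixed and varies s. The function name `standard_error` and the particular sample sizes are our own illustrative choices.

```python
import math

def standard_error(s, n):
    """Standard error of the mean for sample standard deviation s and sample size n."""
    return s / math.sqrt(n)

s = 23.34159  # sample standard deviation of bpsystol, held fixed for illustration

# Effect of sample size: quadrupling n halves the standard error.
ses = {n: standard_error(s, n) for n in (100, 400, 1_600, 10_337)}
for n, se in ses.items():
    print(f"n={n:>6}: SE={se:.4f}")

# Effect of dispersion: at the same n, a smaller s gives a smaller standard error.
print(standard_error(10, 100), standard_error(20, 100))
```

Because n enters through a square root, precision improves with diminishing returns: it takes four times the sample to halve the standard error.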

Confidence Interval for Mean
• We can also use the standard error and our knowledge of the normal distribution to construct a confidence interval around the mean: that is, the band of values within which the population mean, µ, is likely to reside.

Confidence Interval for Mean
• The confidence interval can be found with:
$$\bar{x} \pm z\left(\frac{s}{\sqrt{n}}\right) \quad \text{or} \quad \bar{x} \pm z\,(s_{\bar{x}})$$
• Where z is our critical value: i.e., the number of standard deviations away from the mean that mark out the range within which we think the population mean resides.

Wait… z-value? What's that?
• Think back to what the CLT tells us:
– About 68% of sample means fall within about one standard deviation on either side of the population mean.
– About 95% fall within about two.
– About 99.7% fall within about three.
• We can use this information to find the standard deviations that correspond to the distribution percentiles that capture these percentages.

Wait… z-value? What's that?
• For example, though we say that 95% of the estimates fall within about two standard deviations of the population mean, the more precise number is 1.96. It is our z-value!
That is, about 95% of the sample means fall within ±1.96 standard deviations of the population mean. (Never mind that 0; for our purposes, think of it as µ.)
*Photo from Wikipedia (https://en.wikipedia.org/wiki/1.96).
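
Critical values like 1.96 can be looked up in a table or computed from the inverse of the normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from Python's standard library (Python 3.8 or later); the helper name `critical_z` is ours.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-sided critical z-value for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2          # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)

z_values = {level: critical_z(level) for level in (0.68, 0.95, 0.99, 0.997)}
for level, z in z_values.items():
    print(f"{level:.1%} confidence: z = {z:.2f}")
```

The confidence level is split into two equal tails because the interval extends the same distance on both sides of the mean.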

Confidence Interval Example
• The mean weight (in kilograms) in our NHANES sample is 71.90. The standard deviation is 15.36, and our sample size is 10,337. Within what range of kilograms can we be 95% confident includes the population mean?
. sum weight
Variable | Obs | Mean | Std. Dev. | Min | Max
weight | 10337 | 71.90088 | 15.35515 | 30.84 | 175.88

Confidence Interval Example
• Let's start by first computing the standard error:
$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{15.35515}{\sqrt{10337}} \approx 0.1510$$

Confidence Interval Example
• We want to capture the population mean within the band of values that, according to the CLT, likely fall within ±1.96 standard deviations from the population mean. As such:
$$\bar{x} \pm z\,(s_{\bar{x}}) = 71.90088 \pm 1.96\,(0.1510277) \approx 71.90088 \pm 0.296$$

Confidence Interval Example
• We can be 95% confident that the interval between 71.605 kg and 72.197 kg contains the mean population weight.

Confidence Interval Example
• Confirm with Stata:
Variable | Obs | Mean | Std. Err. | [95% Conf. Interval]
weight | 10337 | 71.90088 | .1510277 | 71.60484 to 72.19692
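
The whole example can also be worked in a few lines of Python. This is a sketch that follows the slides' z-based formula, with the mean, standard deviation, and sample size taken from the `sum weight` output; the function name `confidence_interval` is ours.

```python
import math

def confidence_interval(mean, sd, n, z):
    """Return (lower, upper) for mean ± z * sd / sqrt(n)."""
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Values from the `sum weight` output; z = 1.96 for 95% confidence.
lower, upper = confidence_interval(mean=71.90088, sd=15.35515, n=10337, z=1.96)
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

Stata's bounds differ from the hand calculation only in the fifth decimal place. Stata's `ci` command uses a critical value from the t distribution rather than z, and with more than 10,000 observations the two are nearly identical.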

CI example with different critical value
• What if we wanted to be, say, 99% confident that our interval contains µ?
• The critical z-value for a 99% confidence interval is 2.58. This means that, following the CLT, we expect about 99% of sample means pulled randomly from our sampling distribution to fall within ±2.58 standard deviations of the population mean.

CI example with different critical value
• Let's do the math:
$$\bar{x} \pm z\,(s_{\bar{x}}) = 71.90088 \pm 2.58\,(0.1510277) \approx 71.90088 \pm 0.390$$

CI example with different critical value
• We can say that, 99 times out of 100, an interval built this way captures the mean population weight; ours runs from 71.51 kg to 72.29 kg.

CI example with different critical value
• Confirm with Stata:
Variable | Obs | Mean | Std. Err. | [99% Conf. Interval]
weight | 10337 | 71.90088 | .1510277 | 71.51179 to 72.28997

Confidence Interval Precision
• Note that the confidence interval gets wider when we go from 95% to 99% confidence.
– For 95% CI: 72.197 – 71.605 = .592
– For 99% CI: 72.290 – 71.512 = .778
• This is because we have to give up precision when we try to be more confident!
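
The trade-off between confidence and precision can be tabulated for any confidence level. The sketch below reuses the weight example and assumes Python 3.8 or later for `statistics.NormalDist`; the 90% level and the function name `interval_width` are our additions for comparison.

```python
import math
from statistics import NormalDist

# Values from the `sum weight` output.
sd, n = 15.35515, 10337
se = sd / math.sqrt(n)

def interval_width(confidence):
    """Full width of a z-based confidence interval at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * se

widths = {level: interval_width(level) for level in (0.90, 0.95, 0.99)}
for level, width in widths.items():
    print(f"{level:.0%} CI width: {width:.3f} kg")
```

The standard error is the same in every row; only the critical value changes, so a higher confidence level always buys a wider interval.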

z-scores
• Recall that critical z-values (e.g., ±1.96 and ±2.58) are the standard deviations away from µ that we would expect to capture 95% and 99% of random sample means (respectively) in a normal sampling distribution.
• Theoretically, the z-value for any given case can be calculated by dividing its distance from µ by the standard deviation of the distribution. This value would tell us how many standard deviations the case is from µ.
• We can apply this same logic to an individual variable to quantify how far a specific case is from the variable mean. This measure is called a z-score, or a standardized score.

z-scores
• The z-score for an individual case can be found by subtracting the variable mean from the raw score and then dividing the difference by the variable standard deviation:
$$z_i = \frac{x_i - \bar{x}}{s}$$
– Where, as before, $\bar{x}$ is the mean for the variable and s is the variable standard deviation.

z-score example
• Below is the distribution of systolic blood pressure readings for the NHANES sample. The mean is 130.88 mmHg. The green bar is a particular value of the variable: 110 mmHg. The standard deviation is 23.34 mmHg. What is the z-score for this case, and what does this number mean?
[Figure: histogram of systolic blood pressure with the bar at 110 highlighted in green. x-axis: systolic blood pressure (50 to 290); y-axis: frequency.]

z-score example
$$z = \frac{110 - 130.88}{23.34} \approx -0.89$$
• A case with a systolic blood pressure reading of 110 mmHg is a little less than 1 standard deviation below the mean.
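
For readers who want to try other readings, the z-score calculation is a one-line function. The sketch below uses the rounded mean and standard deviation quoted in the example; the function name `z_score` is ours.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Rounded sample values quoted in the example.
mean_bp, sd_bp = 130.88, 23.34

z = z_score(110, mean_bp, sd_bp)
print(f"z = {z:.2f}")
```

A negative z-score means the case is below the mean; a reading equal to the mean has a z-score of 0.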

z-score example
• Confirming with Stata. Note that the same bar is colored green. That's because they are the same cases!
[Figure: two histograms of the same cases, one of systolic blood pressure (x-axis 50 to 290) and one of the standardized variable zsystol (x-axis -2 to 8). y-axis in both: frequency.]

Conclusion
• We have seen how the asymptotic theory of probability distributions allows us to assess how well our single sample mean represents the true population mean in the absence of repeated random samples.
• It does this by following the CLT. This allows us to use our sample size to estimate how well hypothetical random samples (of the same size) would approximate a normal distribution and therefore approximate the population mean.

Conclusion
• Though standard errors and confidence intervals help us get an idea of where the population mean may be, how do we know these numbers are reliable? That is, how do we know that our estimates aren't just the product of sampling error?
• This is the job of statistical inference, and it is the topic for the next session!

Datasets Used
• The Stata survey documentation data, nhanes2f, from the Stata Press website. Retrieved July 24, 2016 (http://www.stata-press.com/data/r11/svy.html).