statistics and experimental design - uc · l.t. gama luís telo da gama [email protected] estação...


Upload: doankhanh

Post on 04-Aug-2018


Page 1: Statistics and Experimental Design - UC · L.T. Gama Luís Telo da Gama ltgama1@yahoo.com Estação Zootécnica Nacional Faculdade de Medicina Veterinária - UTL Statistics and Experimental

L.T. Gama

Luís Telo da Gama
ltgama1@yahoo.com
Estação Zootécnica Nacional
Faculdade de Medicina Veterinária - UTL

Statistics and Experimental Design

Basic Principles

Basic principles

Statistics is often used as a drunk man uses a street lamp…
More for support than for illumination!

Statistics – The basic problem…

Test hypotheses based on information from a sample

Obtain inferences (prediction and decision-making) for a global population

[Diagram: Data → Information → Knowledge, linked by statistical analyses and biological integration]

Example 1

Corn or barley for pigs?

Comparison of effect of:
One factor
• Source of energy
With two treatments
• Corn or barley
One response variable
• Growth rate in pigs

Example 2

What is the effect of the level of PMSG administered on ovulation rate in sows?

Comparison of effect of:
One factor
• Level of PMSG administered
With several treatments
• 250, 500, 750 IU
One response variable
• Ovulation rate

Example 3

Effects of bST administration to dairy cows?
• What is the effect in Holstein and Jersey cows?
• What is the optimum level?
• Is the optimum level the same for the two breeds?

Comparison of effects of:
Two factors studied
• Breed and level of bST
Treatments
• Breed: Holstein and Jersey
• bST/day: 0, 15, 30, 45 mg/day
One response variable
• Milk yield per lactation


Example of responses

[Figure, three panels. Example 1: average daily gain (GMD) for corn (Milho) vs. barley (Cevada). Example 2: ovulation rate (T.O.) vs. PMSG dose (250, 500, 750). Example 3: milk yield (PL, kg) vs. bST (mg/d) for Holstein (H) and Jersey (J).]

Important points

In all cases:

Define clearly what you want to study!
• Factors considered
• Treatments used
• Response variables analysed

Can the results be extrapolated to the population of interest?
• i.e., is the sample representative?

Null hypothesis…

KISS = Keep It Short and Simple

Important points

However…

Example 1: Treatments are discontinuous
• corn vs. barley

Example 2: Treatments are continuous
• Increasing level of PMSG

Example 3: Treatments are both discontinuous and continuous
• Discontinuous: breed
• Continuous: level of bST

Different approaches!!! But similar…

Steps in statistical analysis

Descriptive statistics

Organization and summarization of data
• “Picture” of the experiment
• Detection of trends

Measures of location and dispersion

Frequency tables, graphic displays, etc.
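As an illustration of measures of location and dispersion, the following minimal sketch summarizes a small data set with Python's standard `statistics` module. The daily-gain values and the `describe` helper are invented for this example and do not come from the slides.

```python
from statistics import mean, median, stdev, variance

# Hypothetical daily gains (g/day) for ten pigs; values invented for illustration.
gains = [612, 640, 655, 598, 701, 630, 645, 620, 668, 631]

def describe(values):
    """Return simple measures of location and dispersion for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),          # location
        "median": median(values),      # location, robust to extreme values
        "variance": variance(values),  # dispersion (sample variance, n - 1 denominator)
        "sd": stdev(values),           # dispersion, in the units of the data
        "min": min(values),
        "max": max(values),
    }

summary = describe(gains)
for name, value in summary.items():
    print(f"{name:>8}: {value:.1f}")
```

Comparing the mean with the median, and the range with the standard deviation, is a quick numerical complement to plotting the data.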

Steps in statistical analysis

Descriptive statistics

Rule Nº 1: First plot the data!!!!

Detection of outliers
• Abnormal extreme values
• What should be done?

"You can observe a lot by watching!"
Yogi Berra

Characteristics of the normal distribution

[Figure: normal curve with about 68%, 95% and 99% of the values lying within 1, 2 and 3 standard deviations of the mean]

Approximate!!!
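The percentages quoted for the normal distribution can be checked numerically. This short sketch uses the standard relation between the normal distribution and the error function; the helper name `normal_coverage` is an assumption made for the example.

```python
import math

def normal_coverage(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {100 * normal_coverage(k):.2f}%")
```

The exact figures are close to, but not exactly, 68%, 95% and 99%, which is why the slide labels them as approximate.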


Steps in statistical analysis

Descriptive statistics

Possible need for transformation

[Figure: examples of transformations, showing the original data and the transformed data]

Steps in statistical analysis

Inferential statistics

In a probabilistic manner, obtain conclusions which can be applied to the population of interest:
• Is corn better than barley for pig feeding?
• How much does ovulation rate change per additional IU of PMSG?
• What is the optimum dosage of bST in Jersey and Holstein cows?

Analysis of variance, regression, etc.

Major objective in statistical analysis!

Important concepts

Population - group of interest for the researcher
• Usually not known in detail
• Parameters of the population are not known, but are estimable

Sample – part of the population selected for the experiment
• Conclusions are only valid if the sample is selected at random (i.e., representative of the population)

Similar to survey/poll vs. election

Important concepts

We study a small part of a population to make judgments about that population

Sample
• Results are statistics
  Mean, standard deviation, relationships among variables, etc.
  Observable in the sample
• Statistics are estimators of parameters in the population
  Parameters are fixed, unknown, not calculable

Concept of experimental unit

e.u. = unit of material to which a treatment is applied.
• animal
• treat. applied to a group of animals
  Cage, pen, park, building
• repeated measures of an animal
  Lactation phase in one cow
  Eye of one rabbit

NB: treatments are attributed to e.u. and not the opposite

Randomized choice by:
• Take numbers from a hat
• Random numbers generator (table or computer)
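The "random numbers generator" option can be sketched as follows. This is only an illustration under assumed conditions: twelve hypothetical pens serve as experimental units, two treatments are used, and replication is equal. The function name `allocate` and the pen labels are invented for the example.

```python
import random

def allocate(units, treatments, seed=None):
    """Randomly attribute treatments to experimental units, with equal replication."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    reps = len(units) // len(treatments)
    # One label per experimental unit, e.g. 6 x corn and 6 x barley.
    labels = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)  # a seed is used only to make the plan reproducible
    rng.shuffle(labels)        # the computer equivalent of taking numbers from a hat
    return dict(zip(units, labels))

pens = [f"pen{i}" for i in range(1, 13)]  # hypothetical experimental units
plan = allocate(pens, ["corn", "barley"], seed=1)
for pen, treatment in plan.items():
    print(pen, treatment)
```

Note that the treatments are shuffled and then attributed to the units, matching the remark that treatments are attributed to e.u. and not the opposite.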

Basic principles

Variability among e.u. allows tests
• Residual variability or experimental error
• Obtained by replication of e.u.

Randomization of e.u.
• each e.u. has the same probability of being subject to any of the treatments under study
• none of the treatments is favored
• experimental error is well estimated


Response variables

Quantitative
• Continuous
  Milk yield, ADG
• Discontinuous
  Litter size

[Figure, two histograms: milk yield in 158552 lactations; prolificacy in 52 sows]

Response variables

Qualitative or categorical
• Ordinal (with scale)
  Fertility, dystocia
• Nominal (no scale)
  Brucellosis, IBR, etc.

[Figure: fertility in a group of 200 cows (calved vs. not calved)]

[Figure: causes of culling in a dairy herd (infertility, mastitis, feet, others)]

Most frequent case…

The majority of response variables in biology have a continuous and normal distribution

[Figure: frequency distribution of the level of metabolite X, on a scale from 0 to 12]

Characterized by:
- Mean
- Variance

Mean and variance

[Figure: frequency distributions of the level of metabolite X in 3 breeds (curves 1, 2 and 3), on a scale from 0 to 14]

$\mu_1 = \mu_2 < \mu_3$

$\sigma_1^2 > \sigma_2^2 = \sigma_3^2$

Importance of residual variation

Residual variability
• Experimental error
• Variability among observations subject to the same treatment
• Implies replication of observ. in each treatment
• It is assumed equal for the different treat.

[Figure: two panels showing the distributions of treatments A and B, on a scale from 0 to 14]

Importance of residual variation

Residual variability
• Serves as a scale to test significance of differences among treatments

Example
Difference among treat. is the same, but:
1 – High residual variation
• Difference between treat. could be due to the sampling process
2 – Low residual variation
• Difference between treat. probably is real

[Figure: two panels showing the distributions of treatments A and B, with high and low residual variation, on a scale from 0 to 13]


Test of significance

Concept

We compare:
• variability between treatments
• variability within treatments

If the ratio is high
• We conclude that treat. are indeed different

Reduction of the denominator (exp. error): better capacity to detect real differences among treatments (better precision)

Ratio: variability between treatments / variability within treatments
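One common way to compute such a ratio is the F statistic of a one-way analysis of variance. The sketch below is an assumption about how the idea could be implemented, not the author's own calculation; the data for the two hypothetical treatments are invented.

```python
from statistics import mean

def f_ratio(groups):
    """Ratio of between-treatment to within-treatment variability (one-way ANOVA F)."""
    k = len(groups)                                # number of treatments
    n_total = sum(len(g) for g in groups)          # total number of observations
    grand_mean = mean([x for g in groups for x in g])
    # Variability between treatments: treatment means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability within treatments: observations around their own treatment mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)          # estimate of the experimental error
    return ms_between / ms_within

# Hypothetical responses for two treatments, six experimental units each.
treat_a = [10, 12, 11, 13, 12, 14]
treat_b = [8, 10, 9, 11, 10, 12]
print(f"F = {f_ratio([treat_a, treat_b]):.2f}")
```

With the same difference between treatment means, a smaller within-treatment mean square (the denominator) gives a larger ratio, which is the precision argument made on the slide.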

Residual variability

Reduction of residual variability => ↑ precision
• i.e. attempt to reduce “background noise”

How?
• more homogeneous e.u.
  may limit space of inference
• stratification into homogeneous groups
  blocks – building, litter, etc.
• statistical adjustment for variables which can be identified/controlled – covariables
  initial weight, age of cow, etc.
• increasing the number of e.u.
  does not reduce residual variability
  increases power of the test
  parameters estimated with better precision

Scientific methodology

Application of logic and objectivity to the comprehension of different phenomena (biological or others)
• Examination of what is known
• Formulation of hypotheses which can be verified experimentally
• Carrying-out of experimentation

Scientific methodology

[Flowchart, adapted from Malmfors, Garnsworthy and Grossman]
What is known and not known?
Formulation of question or problem (What is the question?)
Explicit hypothesis
Design of experiment
Experiment, data collection
Analyses of results
Interpretation, conclusions
New knowledge
Scientific publication; other forms of spreading

Matching sections of a scientific publication: Introduction and objectives; Materials and Methods; Results and Discussion


Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis.

Sir R. Fisher - The Design of Experiments, 1935

Steps in experimentation

1. Planning

a) Definition of work hypothesis
• Importance; simple, precise
• Can be verified in the experiment
• Results should allow the researcher to determine the probability of being wrong in his conclusions

b) Definition of population of inference
• Are the animals, facilities, management, etc., used in the experiment representative of the population to which we want to apply the conclusions?


Steps in experimentation

2. Design
• Characteristics to measure
• Factors to study; treatments to be used
• Minimization of uncontrolled influences and subjectivity
• Determine the desired precision and differences which are expected and justifiable
  Considerations of statistical, economic, etc. nature
• Choice of the experimental design
• Make resources compatible; logistics
• “Outline” of statistical analysis

Ask for help

Before you begin!!!

To call in the statistician after the experiment is done may be no more than asking him to perform a postmortem examination: he may be able to say what the experiment died of.

Sir R. Fisher, 1938

Steps in experimentation

3. Execution
• Allocation of different treatments to experimental units
• Rigorous carrying out of experimentation

4. Analyses
• Statistical analysis of the results should lead to the confirmation, rejection or change of the original hypothesis
• Statement, in a probabilistic manner, about the possibility of the researcher being mistaken in his conclusions.

5. Reporting and publication of the results

Publication

Advice on publication
• Have something new to say
• Say it
• Shut up after you’ve said it
• Give the text an appropriate title and order
Ramon y Cajal, 1899

No research is finished until it has been published!

Publication

The massacre of P<0.05

The fact that a difference is statistically significant does not imply that there is a “true” difference
• When treatments do not truly differ, on average 1 in every 20 tests still gives P<0.05
• NS results usually are not published
• Same experiment repeated by different scientists

Sentences such as “the differences were not statistically significant, but they are of biological importance” will cause any referee to jump in his/her chair

Lack of statistical significance only means that, given the existing variability among observations, the difference that was found could very well be due to chance alone
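The 1-in-20 point can be illustrated by simulation. The following sketch is an assumption-laden illustration, not part of the original slides: it repeatedly compares two groups drawn from the same normal population (mean 100 and standard deviation 15 are arbitrary choices) and counts how often a two-sided t test at α = 0.05 declares a difference. The critical value 2.101 is the standard tabulated t value for 18 degrees of freedom.

```python
import math
import random
from statistics import mean, variance

T_CRIT = 2.101  # bilateral alpha = 0.05, 18 d.f. (two groups of 10)

def t_statistic(a, b):
    """Two-sample t statistic for equal group sizes and a pooled variance."""
    n = len(a)
    s2 = (variance(a) + variance(b)) / 2  # pooled residual variance
    return (mean(a) - mean(b)) / math.sqrt(2 * s2 / n)

def false_positive_rate(n_experiments=2000, n=10, seed=42):
    """Share of experiments declared significant when the treatments do not differ."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_experiments):
        # Both groups come from the same population: any difference is sampling noise.
        a = [rng.gauss(100, 15) for _ in range(n)]
        b = [rng.gauss(100, 15) for _ in range(n)]
        if abs(t_statistic(a, b)) > T_CRIT:
            significant += 1
    return significant / n_experiments

rate = false_positive_rate()
print(f"share of 'significant' results with no true difference: {rate:.3f}")
```

The printed share should be in the neighbourhood of 0.05, which is what α means: the rate of Type 1 errors accepted in advance.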

Data available for statistical analysis

1. Designed experiments
• Treatments applied to e.u. and responses observed
• Example: influence of different levels of energy on growth rate of lambs
  30 lambs of breed X, between 2 and 3 months of age
  3 levels of energy in the diet

2. Survey studies
• Data collected in a sample, according to pre-defined criteria
• There are no treats. applied as such to the e.u.
• Example: surveys aimed at characterizing management practices and milk production in farmers of a given breed
  farmers chosen randomly, using some stratification criteria (herd size, farmer’s age, level of education, etc.)
  study of these and other identifiable factors (season, level of supplementation, use of silage, etc.) on dairy performances.


Data available for statistical analysis

3. Observation studies
• Like the previous case, but with no obvious criterion in choosing e.u.
• Examples:
  use of hospital records to study the influence of age, sex, season, year, region, etc., on the incidence of influenza in humans.
  use of dairy records to evaluate effects of age, month of calving, herd, etc. on milk yield

Cases 2 and 3
• There is no clear attribution of treats. to e.u.
• However, there are factors which can be identified as having a potential influence on the response variables
• These factors are considered in statistical analyses in a way similar to treatments in a designed experiment.

Data available for statistical analysis

Balanced data
• Same number of observations per treatment or combination of treats.
• Usually in designed experiments (type 1)

        Sex
Age     M     F
1       10    10
2       10    10

Unbalanced data
• Nº of observ. not equal for different treats.; sometimes “empty cells”
• frequent with field data
• Statistical methods are approximate
• Sometimes problems in interpreting adjusted means (see later)

        Sex
Age     M     F
1       20    15
2       10    0

The foundation of all statistical analyses

Central limit theorem

Example

SMART has received complaints that its cars are too small for the Portuguese young population!

They give you a grant to estimate the average height of Portuguese people

You hire 20 students
• Ask them to go to their favourite bar, and measure 10 people chosen randomly (e.g., the first ones arriving at the bar after 3 a.m. on a Friday)
• Each student calculates the mean of his or her sample of 10 individuals

Example

Suppose the distribution of true heights in the Portuguese population is the following

[Figure: frequency distribution of height (m), from 1.4 to 1.9]

What can be expected from the sampling that you asked your students to carry out?

Example results

[Figure, three histograms of height (m), from 1.4 to 1.9, one per sample. Student 1: mean = 1.59. Student 2: mean = 1.72. Student 3: mean = 1.66.]


Example results

Possible results

[Figure: distribution of heights in the original population (frequency vs. height, m) together with the distribution of the mean of 10 people, collected by 20 students (Nº of means vs. height, m)]

Notice that sample means are closer to the mean of the original population.
Does it make sense?

Basic concepts – CLT

Level of metabolite X in thousands of mice of strain A

[Figure: frequency distribution of the level of metabolite X, on a scale from 0 to 12]

Normal distribution

$\mu = 6$

$\sigma = 2$

Basic concepts

If we take several samples of n mice from this population, what do we expect to get?

[Figure: distribution of the means of samples of n individuals (level of metabolite X, on a scale from 0 to 14), with curves for n=4, n=9 and n=16]

Normal distribution

$\bar{X} = 6$

$s = \sigma / \sqrt{n}$

Central Limit Theorem
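As a worked example, the standard deviation of the sample means can be computed for the three sample sizes shown, taking the population standard deviation of metabolite X as 2. The function name `standard_error` is chosen for this sketch.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the mean of n independent observations."""
    return sigma / math.sqrt(n)

SIGMA = 2  # standard deviation of metabolite X in the population

for n in (4, 9, 16):
    print(f"n = {n:>2}: s = {standard_error(SIGMA, n):.3f}")
```

Quadrupling the sample size from 4 to 16 only halves the standard deviation of the mean, because n enters through its square root.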

Effect of sample size

Sampling: n=2, 5, 10, 25; 10000 replications
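A sampling exercise of this kind can be sketched in a few lines. The population used by the author for this figure is not recoverable from the transcript, so the sketch assumes a normal population with mean 6 and standard deviation 2 (the metabolite X example); the function name and the seed are likewise assumptions.

```python
import math
import random
from statistics import stdev

def simulate_sample_means(n, replications=10000, mu=6.0, sigma=2.0, seed=0):
    """Draw `replications` samples of size n and return the list of their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(replications):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(sum(sample) / n)
    return means

results = {}
for n in (2, 5, 10, 25):
    means = simulate_sample_means(n)
    results[n] = stdev(means)  # spread of the sample means
    print(f"n={n:>2}: SD of sample means = {results[n]:.3f}, "
          f"sigma/sqrt(n) = {2.0 / math.sqrt(n):.3f}")
```

The simulated spread of the means should track sigma/sqrt(n) closely, narrowing as n grows.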


Effect of the original distribution

Sampling: n=10, 10000 replications

Conclusion

The mean of a small sample tends to have a wider distribution around the mean of the population from where it was collected

The distribution gets narrower as n increases

[Figure: distributions of sample means of the level of metabolite X, on a scale from 0 to 14, for n=4, n=9 and n=16]

$n = 4 \Rightarrow s = 2/\sqrt{4} = 1$

$n = 9 \Rightarrow s = 2/\sqrt{9} = 0.667$

$n = 16 \Rightarrow s = 2/\sqrt{16} = 0.50$

Hypothesis testing

Basic principles

Fundamentals

Hypothesis testing is based on some type of “test of significance”
• Tests t, F, etc.

In essence, we want to compare:

$$\frac{\text{Variability between treatments}}{\text{Variability among observ. within treatments}}$$

• Ratio is high: Significant!!!
• Ratio is small: N.S.!!!

The crucial point!

We admit the possibility of being wrong in the conclusion!!!
• i.e., the fact that we are working with a sample (and not with the population) may lead to wrong conclusions.

                          “REALITY”
SAMPLE                    Treat. differ               Treat. do not differ
Treat. differ             Correct                     Error! (Type 1), Prob. α
Treat. do not differ      Error! (Type 2), Prob. β    Correct

Sequence of the analysis

1. We start by assuming that the two treats. do not differ
• i.e. the two groups (one corresponding to each treat.) actually come from the same conceptual population, and they only differ due to the sampling process

2. Test if it is plausible that the observed differences are only due to the sampling process
• we compare the variability between treats. with the variability within treat. (residual)


Sequence of the analysis (cont.)

3. If the differences between treats. are “big” and residual variability is “small”:
• it is not likely that the observed differences are due to chance alone
• it is quite probable that the two treats. are indeed different
• we can never reject with absolute certainty that the two treats. are equal in reality

Sequence in hypothesis testing

Define null hypothesis
• “State of nature”
• $H_0: \mu_A = \mu_B$

Define alternative hypothesis ($H_A$)
• $H_A: \mu_A \neq \mu_B$ (bilateral)
• $H_A: \mu_A > \mu_B$ (unilateral)

Verify if data allow rejection of $H_0$
• Statistical test (t or F)

Sequence in hypothesis testing

• Statistical test: calculate the standardized difference between treatments

$$ t_{obs} = \frac{\bar{X}_A - \bar{X}_B}{s_{\bar{X}_A - \bar{X}_B}} = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{2 s^2 / n}} $$

• Obtain the critical value from the appropriate tables, as a function of α and the error degrees of freedom (d.f.e.)
    e.g. for d.f.e. = 10
    α = 0.10 => t_crit = 1.372
    α = 0.05 => t_crit = 1.812
    α = 0.01 => t_crit = 2.764
• Compare t_obs and t_crit
    If t_obs > t_crit => reject H0 (for a given α)
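The steps of the test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two data sets are hypothetical, both treatments have the same n, and the critical value is the one quoted for d.f.e. = 10 and α = 0.05.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical data for two treatments with n = 6 e.u. each.
group_a = [10, 12, 11, 13, 12, 14]
group_b = [9, 10, 11, 10, 9, 11]

def t_observed(a, b):
    """Standardized difference between two treatment means (equal n)."""
    n = len(a)
    if n != len(b):
        raise ValueError("this sketch assumes the same n in both treatments")
    s2 = (variance(a) + variance(b)) / 2   # pooled residual variance
    se_diff = sqrt(2 * s2 / n)             # standard error of the difference
    return (mean(a) - mean(b)) / se_diff

t_obs = t_observed(group_a, group_b)
dfe = 2 * (len(group_a) - 1)               # error degrees of freedom
t_crit = 1.812                             # tabulated value for alpha = 0.05, d.f.e. = 10
reject_h0 = t_obs > t_crit
print(f"t_obs = {t_obs:.3f}, d.f.e. = {dfe}, reject H0: {reject_h0}")
```

With other group sizes the critical value has to be looked up again, since it depends on d.f.e.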

Distribution t

Unilateral α    0.1     0.05    0.025    0.01     0.005
Bilateral α     0.2     0.1     0.05     0.02     0.01
df
1               3.078   6.314   12.706   31.821   63.656
2               1.886   2.920   4.303    6.965    9.925
3               1.638   2.353   3.182    4.541    5.841
4               1.533   2.132   2.776    3.747    4.604
5               1.476   2.015   2.571    3.365    4.032
6               1.440   1.943   2.447    3.143    3.707
7               1.415   1.895   2.365    2.998    3.499
8               1.397   1.860   2.306    2.896    3.355
9               1.383   1.833   2.262    2.821    3.250
10              1.372   1.812   2.228    2.764    3.169
11              1.363   1.796   2.201    2.718    3.106
12              1.356   1.782   2.179    2.681    3.055
13              1.350   1.771   2.160    2.650    3.012
14              1.345   1.761   2.145    2.624    2.977
15              1.341   1.753   2.131    2.602    2.947
16              1.337   1.746   2.120    2.583    2.921
17              1.333   1.740   2.110    2.567    2.898
18              1.330   1.734   2.101    2.552    2.878
19              1.328   1.729   2.093    2.539    2.861
20              1.325   1.725   2.086    2.528    2.845
30              1.310   1.697   2.042    2.457    2.750
60              1.296   1.671   2.000    2.390    2.660
∞               1.282   1.645   1.960    2.326    2.576

In synthesis

1. Define H0 and HA
2. Calculate the observed test value (t or F)

$$ t_{obs} = \frac{\bar{X}_A - \bar{X}_B}{s_{\bar{X}_A - \bar{X}_B}} $$

3. Obtain the critical value (t or F) in tables, as a function of α and d.f.
4. Compare:
    t_obs > t_crit => reject H0 => accept HA
    t_obs ≤ t_crit => no rejection of H0 => do not accept HA

Power of a test

Assume two strains: A and B
"True" means for metabolite X are 6 and 8 units

[Figure: strains A and B placed on an axis from 1 to 14]

Power of a test

Distribution of individual observations in strains A and B, assuming σ = 2

[Figure: distributions of individual observations in the two strains, on an axis from 0 to 14]

Comparison of the 2 strains, using 4 e.u./strain

Expected distribution of the mean of 4 observations in strain A

$$ s_{\bar{X}} = \frac{2}{\sqrt{4}} = 1 $$

[Figure: distribution of the mean of 4 observations in strain A, on an axis from 0 to 14]

Comparison of the 2 strains, using 4 e.u./strain

Definition of the rejection region for H0

[Figure: distribution of the mean of strain A on an axis from 0 to 14, split into an "Accept" region and a "Reject H0" region with α = 0.05]

If a mean of 4 observations falls to the right of this line, it is probably not from population A (but we cannot affirm that it is not!)

If the value in the sample of B is greater than 7.895...

Critical t
- 8 observations; 2 treatments; 7 d.f.e.
- 1-tail test
- Critical t = 1.895

Comparison of the 2 strains, using 4 e.u./strain

Expected distribution of the mean of the 2 strains

[Figure: distributions of the means of A and B on an axis from 0 to 14, with "Accept" and "Reject H0" regions; α = 0.05, β = 0.6]

Comparison of the 2 strains, using 9 e.u./strain

Expected distribution of the mean of the 2 strains

[Figure: same layout; α = 0.05, β = 0.26]

Comparison of the 2 strains, using 16 e.u./strain

Expected distribution of the mean of the 2 strains

[Figure: same layout; α = 0.05, β = 0.03]

Preliminary conclusions

• Probability of error in a given experiment:
    type I is controllable (α)
    type II is not (β): it results from the expected distribution in the sampling process
• The same difference between μA and μB is easier to detect when n increases
• For a given situation, decreasing α increases β
• Power of a test = 1 - β
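How β shrinks as n grows can be sketched numerically. The snippet below is a rough illustration, not the calculation behind the figures: it assumes a unilateral test on the difference between two means, a known σ and a normal (Z) approximation, so the β values it prints need not coincide with those shown on the slides.

```python
from math import sqrt
from statistics import NormalDist

def beta_one_sided(delta, sigma, n, alpha=0.05):
    """Approximate type II error for a unilateral comparison of two means.

    Normal (Z) approximation with known sigma and n e.u. per treatment.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha)        # critical value for the chosen alpha
    se_diff = sigma * sqrt(2 / n)         # standard error of the difference
    return z.cdf(z_alpha - delta / se_diff)

# Two strains with true means 6 and 8 (delta = 2) and sigma = 2.
for n in (4, 9, 16):
    beta = beta_one_sided(delta=2, sigma=2, n=n)
    print(f"n = {n:2d}: beta = {beta:.2f}, power = {1 - beta:.2f}")
```

Lowering `alpha` in the call raises the returned β for the same n, which is the trade-off stated in the conclusions.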

Power of a test as a function of n

[Figure: power of a test as a function of n]

A difference of 10% is almost undetectable when n is small

Risks of ignoring the power of a test

Example
• Pacemaker A is the standard in the market
• Brand B wants to prove that it has a product as good as brand A
    H0: μA = μB

Assume that:
• in reality B is worse than A
• due to lack of resources, the experiment is small, with low power
• we make a type II error, and do not reject H0

Brand B may (?) claim that it is as good as A (?)

Consequences???

"The results of the experiment were: one third of the animals showed clear improvement with the treatment; one third didn't show any improvement; the third mouse ran away!"

Nº observations/treatment

What is the experimental unit?

Treat. to test | Response variable | Treat. applied to... (e.u.)
Supplementation of 2 antibiotics in feed | Weight of mice after 30 days | Cage
Effect of drug X on cholesterol | Cholesterol level after 30 days | Mouse
Inhibition of coliforms with antiseptic X | Nº colonies at 48 hours | Petri dish
Age at slaughter in cattle | Meat tenderness | Animal
Temperature and humidity in ageing of meat | Meat tenderness | Steak
Effect of 2 drugs on evolution of blood pressure | Blood pressure at 6, 12 and 24 h | Patient
Cosmetic Y applied or not in eye lashes | Incidence of conjunctivitis | Rabbit's eye
Effect of season (Spring vs. Fall) | Motility of semen at 30 min. | Bull
Effect of 2 types of extender solution | Motility of semen at 30 min. | Fraction of ejaculate
Nº milkings/day in 2 phases of the lactation | Milk yield | Cow-lact. stage
Effect of temperature | Social behaviour in penguins | (not given)

Nº observations/treatment

Importance
• Balance between desired precision and available resources
• An insufficient number of e.u. may not allow detection of "significant" differences, even though they exist in reality
    Useless experiment
• An excessive number of e.u. may be questionable from different points of view
    Animal welfare, use of resources, etc.

Principles of the process

First approximation
Difference is significant if:

$$ t_{obs} = \frac{\bar{X}_A - \bar{X}_B}{s_{\bar{X}_A - \bar{X}_B}} = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{2 s^2 / n}} > t_{crit} $$

which gives

$$ n \geq 2 \left( \frac{t_{crit} \, s}{\bar{X}_A - \bar{X}_B} \right)^2 $$

However:
• t_crit already depends on n (i.e. d.f.e.)
• We are ignoring Type II error

Correct way to determine n

Define previously:
• Difference which we want/expect to detect (μ1 - μ2)
• Expected variability among e.u. submitted to the same treatment (σ or s)
• Tolerable probabilities for type I (α) and type II (β) errors

Several methods are possible
• Use of the Z distribution
    Simplest method
    Similar to the t distribution, but independent of d.f.e.

Z distribution (unilateral)

Z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
0.1   .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
0.2   .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
0.3   .3821  .3783  .3745  .3707  .3669  .3632  .3594  .3557  .3520  .3483
0.4   .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
0.5   .3085  .3050  .3015  .2981  .2946  .2912  .2877  .2843  .2810  .2776
0.6   .2743  .2709  .2676  .2643  .2611  .2578  .2546  .2514  .2483  .2451
0.7   .2420  .2389  .2358  .2327  .2296  .2266  .2236  .2206  .2177  .2148
0.8   .2119  .2090  .2061  .2033  .2005  .1977  .1949  .1922  .1894  .1867
0.9   .1841  .1814  .1788  .1762  .1736  .1711  .1685  .1660  .1635  .1611
1.0   .1587  .1562  .1539  .1515  .1492  .1469  .1446  .1423  .1401  .1379
1.1   .1357  .1335  .1314  .1292  .1271  .1251  .1230  .1210  .1190  .1170
1.2   .1151  .1131  .1112  .1093  .1075  .1056  .1038  .1020  .1003  .0985
1.3   .0968  .0951  .0934  .0918  .0901  .0885  .0869  .0853  .0838  .0823
1.4   .0808  .0793  .0778  .0764  .0749  .0735  .0722  .0708  .0694  .0681
1.5   .0668  .0655  .0643  .0630  .0618  .0606  .0594  .0582  .0571  .0559
1.6   .0548  .0537  .0526  .0516  .0505  .0495  .0485  .0475  .0465  .0455
1.7   .0446  .0436  .0427  .0418  .0409  .0401  .0392  .0384  .0375  .0367
1.8   .0359  .0352  .0344  .0336  .0329  .0322  .0314  .0307  .0301  .0294
1.9   .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
2.0   .0228  .0222  .0217  .0212  .0207  .0202  .0197  .0192  .0188  .0183
2.1   .0179  .0174  .0170  .0166  .0162  .0158  .0154  .0150  .0146  .0143
2.2   .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110
2.3   .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
2.4   .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
2.5   .0062  .0060  .0059  .0057  .0055  .0054  .0052  .0051  .0049  .0048
2.6   .0047  .0045  .0044  .0043  .0041  .0040  .0039  .0038  .0037  .0036
2.7   .0035  .0034  .0033  .0032  .0031  .0030  .0029  .0028  .0027  .0026
2.8   .0026  .0025  .0024  .0023  .0023  .0022  .0021  .0021  .0020  .0019

Nº observ./treat. - Z distribution

Ignoring β

$$ n \geq 2 \left( \frac{Z_\alpha \, \sigma}{\mu_1 - \mu_2} \right)^2 $$

e.g. Z_{α=0.05} = 1.65
This really corresponds to β = 0.5

Considering the power of the test

$$ n \geq 2 \left( \frac{(Z_\alpha + Z_\beta) \, \sigma}{\mu_1 - \mu_2} \right)^2 $$

Nº observ./treat. - Example

Considering 1 - β = 0.9
Case of 2 strains
μA - μB = 2    σ = 2
Z_0.05 = 1.65    Z_0.1 = 1.28

$$ n \geq 2 \left( \frac{(Z_\alpha + Z_\beta) \, \sigma}{\mu_1 - \mu_2} \right)^2 $$

$$ n \geq 2 \left[ \frac{(1.65 + 1.28) \times 2}{8 - 6} \right]^2 \approx 17 \text{ e.u.} $$
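The worked example can be reproduced with a small helper. This is a minimal sketch of the Z-based formula only; the function name and its argument names are illustrative, and the Z values are the rounded ones used in the example.

```python
from math import ceil

def n_per_treatment(z_alpha, z_beta, sigma, delta):
    """Lower bound for n per treatment: n >= 2 * ((Za + Zb) * sigma / delta)**2."""
    return 2 * ((z_alpha + z_beta) * sigma / delta) ** 2

# Values from the example: alpha = 0.05, 1 - beta = 0.9, sigma = 2, means 8 and 6.
n_min = n_per_treatment(z_alpha=1.65, z_beta=1.28, sigma=2, delta=8 - 6)
print(f"n >= {n_min:.2f}")
print("e.u. per treatment, rounded up:", ceil(n_min))

# Ignoring beta is the same as setting Z_beta = 0 (i.e. beta = 0.5).
n_ignoring_beta = n_per_treatment(z_alpha=1.65, z_beta=0.0, sigma=2, delta=2)
print(f"ignoring beta: n >= {n_ignoring_beta:.2f}")
```

The unrounded bound is about 17.2, so the "≈ 17" of the example becomes 18 if one insists on rounding up to the next whole e.u.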

Nº observ. with differences expressed as %

$$ CV = \frac{s}{\bar{X}} $$

n must be high if:
• Small differences are expected
• Variability among e.u. is high
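When both the variability and the difference are expressed as a percentage of the mean, the ratio σ/(μ1 - μ2) in the Z-based formula becomes CV/(difference in %). The sketch below applies that substitution; the CV of 15% and the differences tried are hypothetical, and the default Z values assume α = 0.05 and 1 - β = 0.9.

```python
def n_from_cv(cv_pct, diff_pct, z_alpha=1.65, z_beta=1.28):
    """n per treatment when CV and the difference are both in % of the mean."""
    return 2 * ((z_alpha + z_beta) * cv_pct / diff_pct) ** 2

# Hypothetical scenarios: CV = 15 % and differences of 20, 10 and 5 % of the mean.
for diff in (20, 10, 5):
    print(f"difference {diff:2d}%: n >= {n_from_cv(15, diff):.1f}")
```

Halving the difference to detect multiplies the required n by four, which is why small expected differences are so costly.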

Nº observ./treat. as a function of:
• Difference among μ expressed in s units
• Prob. of type II error (β)
Assuming α = 0.05

[Figure: Nº observ./treat. as a function of the difference among μ (in s units), for different values of β]

Bilateral test!!

Internet resources

www.stat.uiowa.edu/~rlenth/

Design of the experiment

Design of experiment

• Treatment structure
    Simple
        each e.u. subject to one treat.
    Factorial
        each e.u. subject to a combination of treats.
        allows study of interactions
• Experimental design
    Attribution of treats. taking into account identifiable factors which cause additional variation (background noise)
    Minimization of residual variability
    Better precision

Experimental design

How to minimize residual variability?
• Take into account other factors which may cause additional variation
    breed, age, sex, season, etc.
• If possible, make comparisons among homogeneous e.u.
    i.e. attribute treats. taking into account other factors
• Examples
    3 diets tested in males and females of 2 strains
        • Comparisons within strain-sex
    2 systems of curing ham in pigs of 2 breeds
        • Comparisons within animal

Treatment structure

One factor

• Treat. A vs. treat. B: traditional ANOVA
    Effect of antibiotic X compared with Y
• Increased levels of a given factor
    e.g. concentration of a given drug, level of metabolizable energy, etc.
    Analysis of regression
• Linear regression
    • relationship between Vit. B12 ingested and retained
    • Protein in the feed and ADG
    • etc.

Y = b0 + b1X

[Figure: scatter plot of ADG against Protein, with the fitted line]
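The line Y = b0 + b1X is usually estimated by least squares. The following is a minimal sketch with hypothetical protein and ADG values (they are not data from the course), written without external libraries.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for Y = b0 + b1*X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx                 # slope: change in Y per unit of X
    b0 = y_bar - b1 * x_bar        # intercept
    return b0, b1

# Hypothetical data: protein in the feed (%) and ADG (g/day).
protein = [12, 14, 16, 18]
adg = [500, 560, 600, 660]
b0, b1 = fit_line(protein, adg)
print(f"ADG = {b0:.1f} + {b1:.1f} * protein")
```

Here b1 is read as the expected change in ADG for each additional percentage point of protein, within the range of levels tested.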

One factor

• Quadratic regression
    One of the most frequent in biology
    Relationship between Y and X is curvilinear
• Examples
    Age of cow and fertility
    Level of energy and growth
    Level of bST and milk yield

Y = b0 + b1X + b2X²

[Figure: PL (kg), axis from 5000 to 11000, against bST (mg), axis from 0 to 40; maximum at 28 mg]
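Once a quadratic has been fitted, the optimum level follows from setting the derivative b1 + 2·b2·X to zero. The sketch below shows that step only; the coefficients are hypothetical and were chosen simply so that the optimum falls at 28 mg, as in the figure.

```python
def quadratic_optimum(b0, b1, b2):
    """Return (x_opt, y_opt) for Y = b0 + b1*X + b2*X**2 (a maximum when b2 < 0)."""
    x_opt = -b1 / (2 * b2)          # solves dY/dX = b1 + 2*b2*X = 0
    y_opt = b0 + b1 * x_opt + b2 * x_opt ** 2
    return x_opt, y_opt

# Hypothetical coefficients, not estimates from real data.
x_opt, y_opt = quadratic_optimum(b0=6000, b1=280, b2=-5)
print(f"maximum at {x_opt:.0f} mg of bST, predicted value {y_opt:.0f} kg")
```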

Example - simple factorial

• Response variable
    Weaning weight in lambs
• Factors considered
    Year (2)
    System: irrigated or not
    Supplementation: yes or no
• Experiment carried out as a factorial
• Questions
    What is the effect of irrigation?
    What is the effect of supplementation?
    Is the effect of supplementation similar in both systems?
        i.e., is there an interaction?

Factorial experiment

• Want to study the joint effect of two factors, each one with several levels
• Interaction among factors!
• Example
    effect of three different growth promoters, in male and female chickens
    is the effect the same in both sexes?
    factorial 3 x 2
    20 chickens per treat. combination

ANOVA
Source of variation    d.f.
Level of protein       2
Sex                    1
Prot. x Sex            2
Error                  114
Total                  119
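The degrees of freedom in a two-factor factorial follow directly from the number of levels and replicates. A minimal sketch, with generic factor names "A" and "B" introduced here for illustration:

```python
def factorial_df(a, b, r):
    """Degrees of freedom for an a x b factorial with r e.u. per combination."""
    total = a * b * r - 1
    df = {"A": a - 1, "B": b - 1, "A x B": (a - 1) * (b - 1)}
    df["Error"] = total - sum(df.values())
    df["Total"] = total
    return df

# 3 x 2 factorial with 20 e.u. per treatment combination.
df_3x2 = factorial_df(a=3, b=2, r=20)
print(df_3x2)
```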

Combining continuous and discontinuous factors

• One continuous and one discontinuous factor
• Example
    Does increasing the energy in the diet have the same effect in Alentejano and Large White pigs?
    Is the metabolic response to increasing levels of thyroxine the same in males and females?
    Is the response to bST the same in Holstein and Jersey cows?
• Analysis of covariance

Combining continuous and discontinuous factors

Example
• Test of two protein sources in the diet
    Soybean meal (SBM) and fishmeal (FM)
    Each diet has 4 levels of crude protein (12, 14, 16 and 18%)
    Trial conducted with 16 pigs (2 per combination source of prot. x level of CP)
    Is the effect of level of CP (continuous factor) the same for the two protein sources (discontinuous factor)?

[Figure: response (axis from 250 to 650) against level of CP (axis from 10 to 18), with one line for SBM and one for FM]

Most frequent designs in animal experiments

Completely Randomized Design (CRD)

• One factor (with i levels); possibly a factorial
• Not possible to group e.u.
• Each animal submitted to one treat. (or combination of treats.), with one observation.
• Example:
    Group of 20 lambs, of the same sex and breed, with similar ages (50 d).
    Test of 4 different diets; effect on weight at 100 d

ANOVA
Source of variation    d.f.
Diet                   3
Error                  16
Total                  19
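In a CRD the treatments are allotted to the e.u. completely at random. One possible way to do this for the lamb example is sketched below; the lamb identifiers and diet labels are hypothetical, and the seed is there only to make the sketch reproducible.

```python
import random

def randomize_crd(units, treatments, seed=None):
    """Assign treatments to e.u. completely at random, with equal replication."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of e.u. must be a multiple of the number of treatments")
    reps = len(units) // len(treatments)
    labels = list(treatments) * reps      # e.g. 5 copies of each diet
    rng = random.Random(seed)
    rng.shuffle(labels)                   # random order of the treatment labels
    return dict(zip(units, labels))

lambs = [f"lamb{i:02d}" for i in range(1, 21)]          # hypothetical identifiers
plan = randomize_crd(lambs, ["diet1", "diet2", "diet3", "diet4"], seed=1)
for lamb, diet in plan.items():
    print(lamb, diet)
```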

Randomized Complete Blocks Design (RCBD)

• Experimental units (e.g., animals) can be grouped in a logical manner
• There is one source of variation which should be taken into account (e.g. litter, building, day of trial, etc.), and the treats. of interest are applied within this factor
• Example:
    Effect of 2 types of protein on growth of 20 male piglets, from 10 litters.
    Litter may have an important effect
        Animals have genes and maternal effects in common
    Treat. assigned within litter (2 pigs per litter, one per treat.)

ANOVA
Source of variation    d.f.
Type of protein        1
Litter                 9
Error                  9
Total                  19
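In an RCBD the randomization is repeated independently inside each block. The sketch below illustrates this for the piglet example, assuming one piglet per treatment in each litter; the identifiers and treatment labels are hypothetical.

```python
import random

def randomize_rcbd(blocks, treatments, seed=None):
    """Randomize treatments separately within each block (one e.u. per treatment)."""
    rng = random.Random(seed)
    plan = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError(f"block {block} needs one e.u. per treatment")
        order = list(treatments)
        rng.shuffle(order)                # independent randomization per block
        plan[block] = dict(zip(units, order))
    return plan

# Hypothetical layout: 10 litters with 2 male piglets each.
litters = {f"litter{i}": [f"pig{i}a", f"pig{i}b"] for i in range(1, 11)}
rcbd_plan = randomize_rcbd(litters, ["protein1", "protein2"], seed=1)

t, b = 2, 10                              # number of treatments and of blocks
rcbd_df = {"Treatment": t - 1, "Block": b - 1,
           "Error": (t - 1) * (b - 1), "Total": t * b - 1}
print(rcbd_df)
```

Because every litter receives both treatments, litter-to-litter differences are removed from the comparison of the two protein types.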

Randomized Complete Blocks Design (RCBD)

Other examples
• effect of two drugs used in dermatology
    both used simultaneously in the same animal
    animal considered as a block
• comparison of two ways to consolidate fractures surgically
    forced fracture of the radius in the two forelimbs of dogs
    dog considered as a block
• effect of supplementation with calcium in laying hens
    each building split in the middle
    2 treat. used in the same building
    experiment repeated in 3 buildings
    building is the block

Latin square

• Applies the principles of RCBD; in this case there are two factors which can be used to group the experimental units.
• However, the number of levels of the two factors must be the same (or a multiple), and equal to the number of treats.
• Example:
    3 litters in mice (9 animals)
    3 weeks of trial
    3 treats. (A, B, C)
• each treat. appears only once in each row and column
• assumes that interactions do not exist!!

          Litter
Week      X    Y    Z
1         B    C    A
2         A    B    C
3         C    A    B

Latin square

Example
• Test of efficacy of 4 antidotes of subst. X
    A = control
    B = phenobarbital
    C = ammonium chloride
    D = lactose
• rabbits (n=16) treated previously with the antidote
• trial carried out in 4 consecutive days
• on the day of the trial injected with substance X with intervals of 0.5, 1, 1.5 or 2 minutes
• response variable is the lethal dose of X

Latin square

Latin square used
Lethal dose expressed in (log mg X / log kg live weight), presented in ( ); average of 4 rabbits/treat.

            Day of trial
Interval    1           2           3           4
1           A(1.576)    D(1.231)    C(1.161)    B(1.231)
2           B(1.220)    C(1.432)    A(1.031)    D(0.935)
3           D(1.266)    A(0.925)    B(1.394)    C(0.934)
4           C(0.665)    B(1.240)    D(1.139)    A(1.168)

ANOVA
Source of variation    d.f.
Days                   3
Interval               3
Antidote               3
Error                  6
Total                  15
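The defining property of a Latin square (each treatment once in every row and every column) is easy to verify in code, and the same structure gives the treatment means. The sketch below uses the values of the antidote example; taking rows as intervals and columns as days is an assumption about the orientation of the original table, which does not affect either result.

```python
from statistics import mean

# Rows: intervals 1-4; columns: days 1-4 (orientation assumed).
square = [
    [("A", 1.576), ("D", 1.231), ("C", 1.161), ("B", 1.231)],
    [("B", 1.220), ("C", 1.432), ("A", 1.031), ("D", 0.935)],
    [("D", 1.266), ("A", 0.925), ("B", 1.394), ("C", 0.934)],
    [("C", 0.665), ("B", 1.240), ("D", 1.139), ("A", 1.168)],
]

def is_latin_square(rows):
    """True if every treatment appears exactly once in each row and column."""
    letters = [[trt for trt, _ in row] for row in rows]
    symbols = set(letters[0])
    size = len(letters)
    rows_ok = all(len(r) == size and set(r) == symbols for r in letters)
    cols_ok = all(set(col) == symbols for col in zip(*letters))
    return len(symbols) == size and rows_ok and cols_ok

# Mean response of each antidote over the four cells where it appears.
treatment_means = {
    t: mean([value for row in square for trt, value in row if trt == t])
    for t in "ABCD"
}
print("Latin square:", is_latin_square(square))
for t, m in treatment_means.items():
    print(t, round(m, 3))
```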

Cross-over (change-over)

• Same individual submitted successively to several treatments
    Treats. are compared within the individual, thus removing the noise introduced by variability among individuals.
    e.g. different drugs to control hypertension
• Similar to latin square
    Individual = column
    Period = row
    Treatments assigned sequentially
        All indiv. subject to every treatment
• It is assumed that there is no residual or carry-over effect
    Withdrawal or resting period between treats.
    Sometimes it is possible to investigate not only the effect of treats., but also of the sequence in which they are applied.
• Design frequently used in medical research

Cross-over

Assume
    4 indiv.
    4 periods
    4 treat.

2 alternatives (Ind. and Period define the two directions of each square)

Alternative 1
    A B C D
    B C D A
    C D A B
    D A B C
• Sequence always the same
• If A is very good, B would be favored

Alternative 2
    A B C D
    B D A C
    C A D B
    D C B A
• Each treat. is preceded by all the others
• Minimizes carry-over
• Williams square
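The difference between the two alternatives can be checked by counting which treatment immediately precedes which. In the sketch below each string is the sequence received by one individual; the second square is entered with D C B A as its last sequence, which is what the Latin-square property requires given the other three sequences (an assumption about the intended layout).

```python
def ordered_pairs(sequences):
    """Collect the (previous, next) treatment pairs over all sequences."""
    pairs = []
    for seq in sequences:
        pairs.extend(zip(seq, seq[1:]))
    return pairs

cyclic = ["ABCD", "BCDA", "CDAB", "DABC"]      # sequence always the same
williams = ["ABCD", "BDAC", "CADB", "DCBA"]    # balanced for first-order carry-over

for name, design in (("cyclic", cyclic), ("Williams", williams)):
    pairs = ordered_pairs(design)
    print(name, "distinct ordered pairs:", len(set(pairs)), "of", len(pairs))
```

In the cyclic square B always follows A, so any carry-over from A is confounded with the effect of B; in the Williams square every treatment follows each of the others exactly once.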

Cross-over

• Factors to consider:
    Individual
    Treatment
    Sequence of treatments?
        If there is carry-over...
• The experiment may be conducted in several repeated latin squares
    Same treats. and sequences in the different squares (e.g., periods, labs, etc.)
    Different individuals in each period/lab

Cross-over

Assuming that carry-over does not exist
    6 indiv.
    3 periods
    2 groups (race, lab, country, etc.)
Assumption of no interaction

ANOVA (sources of variation)
    Group
    Period
    Indiv.
    Treat.

          Period
Indiv.    1    2    3
1         A    C    B
2         B    A    C
3         C    B    A

          Period
Indiv.    1    2    3
4         B    C    A
5         A    B    C
6         C    A    B

Cross-over

If there is carry-over...
• Consider the effect of treat. and of the sequence of treats.
    e.g., we want to test:
        Effect of A
        Whether the effect of A depends on being after or before B
• Analysis
    Individual
    Treat.
    Sequence (AB or BA)

    A B
    B A

Repeated measures

• Split plot in time
    measures taken in the same individual (or e.u.) over time are not independent from each other
        (independence is an assumption in ANOVA)
• It is of interest to study the evolution of a phenomenon over time
    effect of time may be linear, quadratic, etc.
    evaluate if a given treat. has influence on that evolution
    take into account variability among individuals
• Examples
    Evolution of the level of insulin after ingesting two types of food (e.g. rich in starch or in fat)
    Changes in the level of FSH after injection of E2 in sows treated or not with inhibin

Repeated measures

Example
• 9 sows in the same stage of the cycle split into 3 groups
• Each group (n=3) receives one of the following treats.:
    saline (CONT)
    cattle follicular fluid (LFB)
    swine follicular fluid (LFP)
• Blood collected at 6, 12 and 18 hours (h=3) and RIA of plasma FSH.
• Objective is to study the evolution of FSH over time, and how it is affected by treat.
    interaction treat.*time
    test linear and quadratic evolution of FSH (time considered as a continuous variable)

Repeated measures

ANOVA
Source of variation    d.f.
Treatment              t-1
Sow(treat.)            t(n-1)
Time                   1
Time²                  1
Time*treat.            1(t-1)
Time²*treat.           1(t-1)
Error                  t(1)(n-1) + t(1)(n-1)

Results

[Figure: FSH (axis from 0 to 20) against time (h), from 6 to 18, with one line each for Control, LFB and LFP]

Note that treat. means are irrelevant in this case
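The d.f. expressions of the repeated-measures ANOVA can be evaluated for the sow example (t = 3 treatments, n = 3 sows per treatment, h = 3 sampling times). The sketch below simply encodes the expressions of the table; the function name is illustrative, and the check that the terms add up to the total number of observations minus one is an added sanity check.

```python
def repeated_measures_df(t, n):
    """d.f. for t treatments, n individuals per treatment and 3 sampling times."""
    df = {
        "Treatment": t - 1,
        "Sow(treat.)": t * (n - 1),
        "Time": 1,                       # linear trend
        "Time^2": 1,                     # quadratic trend
        "Time*treat.": 1 * (t - 1),
        "Time^2*treat.": 1 * (t - 1),
        "Error": t * 1 * (n - 1) + t * 1 * (n - 1),
    }
    df["Total"] = sum(df.values())
    return df

rm_df = repeated_measures_df(t=3, n=3)
print(rm_df)
print("observations - 1 =", 3 * 3 * 3 - 1)
```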

Software
• Excel
• SAS
• SPSS
• Statistica
• Systat
• Minitab
• R
• Genstat
• etc.

Course on Statistical Methods Applied to Animal Production
(Curso de Métodos Estatísticos Aplicados à Produção Animal)

Estação Zootécnica Nacional
Santarém

Dates?

[email protected]

F distribution for α = 0.05

Denominator   Numerator DF
DF            2      3      4      5      7      10     15     20     30     60     120    500
5             5.81   5.43   5.21   5.06   4.90   4.75   4.64   4.58   4.51   4.45   4.41   4.39
7             4.76   4.36   4.13   3.98   3.80   3.65   3.52   3.45   3.38   3.31   3.27   3.25
10            4.12   3.72   3.49   3.33   3.14   2.98   2.85   2.78   2.70   2.62   2.58   2.55
13            3.82   3.42   3.19   3.03   2.84   2.67   2.54   2.46   2.38   2.30   2.25   2.22
15            3.69   3.29   3.06   2.91   2.71   2.55   2.40   2.33   2.25   2.16   2.11   2.08
20            3.50   3.10   2.87   2.71   2.52   2.35   2.20   2.12   2.04   1.95   1.90   1.86
30            3.32   2.93   2.69   2.54   2.34   2.16   2.02   1.93   1.84   1.74   1.69   1.64
40            3.24   2.84   2.61   2.45   2.25   2.08   1.93   1.84   1.75   1.64   1.58   1.53
60            3.16   2.76   2.53   2.37   2.17   1.99   1.84   1.75   1.65   1.54   1.47   1.41
120           3.08   2.68   2.45   2.29   2.09   1.91   1.75   1.66   1.56   1.43   1.35   1.28
500           3.02   2.63   2.39   2.23   2.03   1.85   1.69   1.59   1.48   1.35   1.26   1.16