TRANSCRIPT
Stat 502: Design and Analysis of Experiments
Two Sample Experiments: Treatment vs Control
Fritz Scholz
Department of Statistics, University of Washington
September 14, 2013
Circuit Board Solder Flux Experiment (Courtesy Denis Janky)
Flux is material used to facilitate soldering on circuit boards.
http://en.wikipedia.org/wiki/Flux_(metallurgy)
Moisture condensing on the boards can interact with soldering contaminants (usually residual solder flux) and result in thin filaments (dendrites) on the surface.
The dendrites can carry current and disrupt the circuits or short circuit the board.
Measure Surface Insulation Resistance (SIR) between two electrically isolated sites.
Electrical problems on aircraft could occur during flight, possibly intermittently.
Troubled circuit boards are yanked, often without findings.
Cleaning off soldering contaminants is a leading contributor to ozone layer damage.
Some fluxes are easier to clean than others.
We have 2 flux candidates.
Treatments and Controls
Here the experimental situation was laid out in terms of two different and competing flux treatments.
One of the treatments could have been used in the past.
It is the familiar treatment, and often it would then be called the control.
The other treatment would be the newer treatment.
Why should we change?
Some possible reasons:
the promise of better results
equal results with a cheaper or cleaner process
treatment flexibility, spreading the risk of treatment availability
The Experimental Process
18 boards are available for the experiment, not necessarily a random sample from all boards.
Test flux brands X and Y: randomly assign 9 boards each to X & Y (FLUX variable).
The boards are soldered and cleaned. Order randomized. (SC.ORDER)
Then the boards are coated and cured to avoid handling contamination. Order randomized. (CT.ORDER)
Then the boards are placed in a humidity chamber and measured for SIR. Position in chamber randomized. (SLOT)
The randomization at the various process steps avoids unknown biases.
When in doubt on the effect of any process step, randomize!
The randomization of flux assignments gives us a mathematical basis for judging flux differences with respect to the response SIR (to become clear later).
DOE Steps Recapitulated
1 Goal of the experiment.
  Answer the question: Is flux X different from flux Y?
  If not, we can use them interchangeably.
  One may be cheaper than the other.
  Test null hypothesis H0: no difference in fluxes.
2 Understand the experimental units: boards.
3 Define the appropriate response variable to be measured: SIR.
4 Define potential sources of response variation:
  a) factors of interest: flux type
  b) nuisance factors: boards, various processing steps, testing.
5 Decide on treatment and blocking variables.
  Treatment = flux type, no blocking.
  With 2 humidity chambers we might have blocked on those, since variation from chamber to chamber may be substantial.
6 Clearly define the experimental process and what is randomized.
  Treatments and all nuisance factors are randomized.
Why Are Treatments and Nuisance Factors Randomized?
The treatments are randomized among boards to avoid confounding and to provide a mathematical basis for inference concerning treatment effects.
We don't want any board peculiarities all ganging up on one treatment.
This would make it difficult to distinguish treatment from board effect.
The nuisance factors all come into play after the flux treatments are applied. If not, such nuisance factor effects could be lumped with the board peculiarities.
Any effect that they have could align with the treatments, i.e., the 9 highest nuisance factor effects could be aligned with flux X. Randomization makes that unlikely.
Randomizing over all nuisance factors separately and independently makes any combined confounding effects extremely unlikely. It reduces variability since + and − effects cancel out to a large extent.
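The random 9/9 assignment of fluxes to boards described above can be sketched in a few lines. This is an illustrative Python sketch (the course itself uses R; the function name and seed here are made up for demonstration):

```python
import random

def randomize_flux(n_boards=18, n_per_flux=9, seed=None):
    """Randomly split board IDs 1..n_boards into two equal flux groups."""
    rng = random.Random(seed)
    boards = list(range(1, n_boards + 1))
    # sample 9 boards for flux X; the remaining 9 boards get flux Y
    x_group = sorted(rng.sample(boards, n_per_flux))
    y_group = sorted(b for b in boards if b not in x_group)
    return x_group, y_group

x_group, y_group = randomize_flux(seed=2013)
```

The same idea (a fresh random order or random draw at each process step) applies to SC.ORDER, CT.ORDER, and SLOT.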
Flux Data
BOARD FLUX SC.ORDER CT.ORDER SLOT  SIR
    1    Y       13       14    5  8.6
    2    Y       16        8    6  7.5
    3    X       18        9   15 11.5
    4    Y       11       11   11 10.6
    5    X       15       18    9 11.6
    6    X        9       15   18 10.3
    7    X        6        1   16 10.1
    8    Y       17       12   17  8.2
    9    Y        5       10   13 10.0
   10    Y       10       13   14  9.3
   11    Y       14        5   10 11.5
   12    X       12       17   12  9.0
   13    X        4        7    3 10.7
   14    X        8        6    1  9.9
   15    Y        3        2    4  7.7
   16    X        7        3    2  9.7
   17    Y        1       16    8  8.8
   18    X        2        4    7 12.6
see Flux.csv or flux
Note the 4 levels of randomization.
Flux Experiment: First Box Plot Look at SIR Data
[Box plot: SIR ( log10(Ohm) ), vertical axis 6 to 14, by Flux (X, Y); annotation: FLUXY − FLUXX = −1.467; individual data points superimposed]
boxplot(SIR~FLUX, data=flux)  # basic boxplot without annotation
# see class web site for flux.plots
What Does a Box Plot Show?
In the box plot (Tukey 1977) the upper/lower quartiles, Q(.75) & Q(.25), and the median, Q(.5), are portrayed by the 3 horizontal line segments of the rectangle.
Dashed lines extend from the ends of the box to the adjacent values, defined below.
Compute the interquartile range IQR = Q(.75) − Q(.25).
The upper adjacent value is the largest observation ≤ Q(.75) + 1.5 × IQR.
The lower adjacent value is the smallest observation ≥ Q(.25) − 1.5 × IQR. (Chambers, Cleveland, Kleiner, Tukey)
Observations beyond adjacent values are shown individually.
For a normal population ≈ .35% of the values fall beyond each adjacent value.
The dots shown superimposed on our SIR box plots are just the added data values.
Sample quantiles, Q(p), are determined by one of many possible schemes, see ?quantile in R.
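The quantities above can be computed directly. Here is a Python sketch using the flux X SIR values from the data table; note the quantile scheme is an assumption (Python's `method="inclusive"`, analogous to R's type 7, just one of the many possible schemes the slide mentions):

```python
import statistics

# SIR values for flux X, taken from the flux data table
sir_x = [11.5, 11.6, 10.3, 10.1, 9.0, 10.7, 9.9, 9.7, 12.6]

q1, med, q3 = statistics.quantiles(sir_x, n=4, method="inclusive")
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr

# adjacent values: the most extreme observations still inside the fences
upper_adjacent = max(v for v in sir_x if v <= upper_fence)
lower_adjacent = min(v for v in sir_x if v >= lower_fence)
# anything beyond the fences would be plotted individually as a potential outlier
outliers = [v for v in sir_x if v > upper_fence or v < lower_fence]
```

For this sample the adjacent values coincide with the sample extremes, so no points would be flagged.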
The Utility of Box Plots
Succinct description of data sets, showing median, upper and lower quartiles. This avoids looking at too many data points.
If median ≈ center of the box
=⇒ the middle 50% of the data appear symmetric.
If in addition adjacent values ≈ symmetric relative to the box
=⇒ this symmetry appears to extend to the tails; otherwise data are skewed.
Any values beyond the adjacent values are singled out as potential outliers, asking for attention.
Especially effective for comparing many data sets side by side.
Detour on Taxiway Deviations
[Diagram: two aircraft on parallel taxiway centerlines, with wing spans W1 and W2, nose-gear deviations d1 and d2 from the respective centerlines, and separation T]
Risk of collision and running off taxiway, getting stuck in the mud.
Two Lasers Measure Distances to Nose & Main Gears
747 Taxiway Deviations
[Scatter plot, Eastbound Traffic: Nose Gear Deviation from Taxiway Centerline (ft), −10 to 10, vs. time of day, 0 to 20]
Box Plots for 747 Taxiway Deviations
[Box plots, Eastbound Traffic: Nose Gear Deviation from Taxiway Centerline (ft), −10 to 10, by time of day 1 to 24]
See class web site for deviation.boxplot.eb
747 Taxiway Deviations
[Scatter plot, Westbound Traffic: Nose Gear Deviation from Taxiway Centerline (ft), −10 to 10, vs. time of day, 0 to 20]
Box Plots for 747 Taxiway Deviations
[Box plots, Westbound Traffic: Nose Gear Deviation from Taxiway Centerline (ft), −10 to 10, by time of day 1 to 24]
See class web site for deviation.boxplot.wb
Some Comments
Eastbound Traffic box plots hinted at a preponderance of medians on one side of the taxiway centerline.
Westbound Traffic made the same point very clearly because of the much larger traffic volume in that direction.
The consistent bias direction suggests that it is not only due to pilot position relative to aircraft centerline (parallax effect).
The Eastbound bias seems less than the Westbound bias.
This suggests two bias components: one switching direction with traffic direction (parallax effect), the other not.
Taxiway Centerline and Centerlights
Pilots avoid hitting the off-centerline light bumps
=⇒ consistent bias (in the same direction) in addition to pilot parallax bias, which changes with direction of travel.
Three Studies on Taxiway Deviations
The links below are active if connected to the web.
http://www.airporttech.tc.faa.gov/Design/Downloads/anc.report.091003.pdf
http://www.airporttech.tc.faa.gov/Design/Downloads/jfk.report.pdf
http://www.airporttech.tc.faa.gov/Design/Downloads/separation new.pdf
Returning to the Flux Experiment: QQ-Plot of SIR Data
[QQ-plot: SIR with Flux Y ( log10(Ohm) ) vs. SIR with Flux X ( log10(Ohm) ), both axes ranging from 8 to 12]
What is a QQ-Plot?
A QQ-plot is another way of comparing 2 data sets visually.
Easiest to construct for equal sample sizes, m = n.
Then you plot the ordered values of one sample against the corresponding ordered values of the other sample.
For m > n you linearly interpolate n values from the sorted larger sample and plot these against the sorted smaller sample.
out <- qqplot(x,y) does it for 2 sample vectors x and y.
The out <- is optional. out is a list with plotting positions out$x and out$y.
abline(lm(y ~ x, out)) fits a line to the QQ-plot.
See the function body of qqplot and ?approx for the interpolation scheme.
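The construction just described can be sketched as follows. This is a rough Python analogue of the qqplot plotting positions, not R's exact approx() scheme; the fractional-index interpolation is an assumption for illustration (requires sample sizes of at least 2):

```python
def qq_points(x, y):
    """Pair sorted samples for a QQ-plot; if sizes differ, linearly
    interpolate the larger sorted sample down to the smaller size."""
    xs, ys = sorted(x), sorted(y)
    if len(xs) == len(ys):
        return xs, ys
    small, large = (xs, ys) if len(xs) < len(ys) else (ys, xs)
    n, m = len(small), len(large)
    interp = []
    for i in range(n):
        pos = i * (m - 1) / (n - 1)   # fractional index into the larger sorted sample
        lo = int(pos)
        frac = pos - lo
        hi = min(lo + 1, m - 1)
        interp.append(large[lo] * (1 - frac) + large[hi] * frac)
    return (xs, interp) if len(xs) < len(ys) else (interp, ys)
```

With equal sizes this reduces to plotting order statistic against order statistic, exactly as on the slide.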
What Can a QQ-Plot Tell You?
If the points scatter around the main diagonal
=⇒ samples appear to come from the same distribution.
An ≈ linear point pattern
=⇒ the two samples differ at most in location and scale.
An ≈ linear point pattern parallel to the main diagonal
=⇒ the two samples differ mainly in location (not scale).
If the point pattern is ≈ linear and crosses the diagonal near the middle of the point pattern,
=⇒ the two samples differ mainly in scale (not location).
If the point pattern does not look ≈ linear, then the samples differ in aspects other than location and scale.
For this there are many possibilities.
Location is captured by mean or median, scale by standard deviation or IQR.
Judgment is subjective and takes experience.
Simulation is one way to gain such experience.
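Such simulation experience is easy to generate yourself. A minimal Python sketch (seed and function name are made up here): draw two same-size N(0,1) samples, sort each, and pair them, which gives exactly the QQ-plot point pattern for two samples from the same distribution:

```python
import random

def simulated_qq(n=9, seed=0):
    """Return QQ-plot pairs for two simulated N(0,1) samples of size n,
    to build intuition for how much same-distribution samples wander."""
    rng = random.Random(seed)
    x = sorted(rng.gauss(0, 1) for _ in range(n))
    y = sorted(rng.gauss(0, 1) for _ in range(n))
    return list(zip(x, y))

pairs = simulated_qq(n=9)
```

Repeating this with different seeds reproduces the kind of panel grid shown on the next slides.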
QQ-Plot of SIR Data (Higher Perspective?)
[QQ-plot: SIR with Flux Y ( log10(Ohm) ) vs. SIR with Flux X ( log10(Ohm) ), both axes ranging from 0 to 20]
Here the same point pattern looks more linear and closer to the main diagonal.
Some QQ-Plots from N(0,1) Samples (m=9, n=9)
[20 simulated QQ-plots from N(0,1) samples (m = 9, n = 9), each annotated with its observed difference of means y − x, ranging from −1.33 to 1.07]
What Do the 20 Simulated QQ-Plots Tell Us?
Very few patterns seem to vary along the main diagonal, the "expected" situation.
We say "expected" because both samples come from the same population/distribution.
Clear linear patterns are also rare. This is due to the small sample size.
12 of 20 patterns have points on both sides of the main diagonal.
Some patterns are clearly offset from the main diagonal.
Thus one should be rather forgiving when judging with small sample sizes.
The next slide illustrates how matters improve when both samples are of size 30.
Some QQ-Plots from N(0,1) Samples (m=30, n=30)
[20 simulated QQ-plots from N(0,1) samples (m = 30, n = 30), each annotated with its observed difference of means y − x, ranging from −0.383 to 0.299]
Is the Difference Y − X = −1.467 Significant?
In comparing SIR for the two fluxes let us focus on the difference of means FLUXY − FLUXX = Y − X.
If the use of flux X or flux Y made no difference (H0), then we should have seen the same results for these 18 boards, no matter which got flux X or Y.
X or Y is then just an artificial label with no consequence.
Due to the different splittings into two groups of 9 & 9, we would see other differences of means.
There are (18 choose 9) = 48620 = choose(18,9) such possible splits. For each split we could obtain Y − X.
Was our observed difference of −1.467 from an unusual random split?
We need the randomization or reference distribution of Y − X for all possible splits.
Some Randomization Examples of Y − X
[Table: the 18 SIR values listed under 7 random splits into groups Y and X (the group labels for each split were lost in extraction), with the resulting differences of means:]
Y−X:  1.1778  0.4222  −0.0889  −0.4000  0.5778  0.7778  0.2000
Reference Distribution of Y − X
Compute Y − X for each of the 48620 possible splits and determine how unusual the observed difference of −1.467 is.
This seems like a lot of computing work. Here it takes just a few seconds in R using the function combn.
But the computing work can grow rapidly, since (m+n choose m) gets large in a hurry. For example, (26 choose 13) = 10400600 (10 million+).
Let SIR be the vector of all 18 SIR values in your workspace.
In your workspace define the following function of the 2 arguments ind and y:
m.diff <- function(ind,y){mean(y[ind])-mean(y[-ind])}
rand.ref.dist <- combn(1:18,9,FUN=m.diff,y=SIR)
gives the vector of all 48620 such differences of averages.
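The same computation can be sketched outside R. The following Python analogue uses the SIR values and flux labels from the data table to enumerate all (18 choose 9) = 48620 splits and compute each difference of means (the variable names are made up; only the data comes from the slides):

```python
from itertools import combinations

# SIR values in board order 1..18, from the flux data table
sir  = [8.6, 7.5, 11.5, 10.6, 11.6, 10.3, 10.1, 8.2, 10.0,
        9.3, 11.5, 9.0, 10.7, 9.9, 7.7, 9.7, 8.8, 12.6]
flux = list("YYXYXXXYYYYXXXYXYX")  # actual flux assignment per board

total = sum(sir)
obs = (sum(s for s, f in zip(sir, flux) if f == "Y") / 9
       - sum(s for s, f in zip(sir, flux) if f == "X") / 9)

# all 18-choose-9 splits; for each, the difference of group means
ref_dist = []
for idx in combinations(range(18), 9):
    y_sum = sum(sir[i] for i in idx)
    ref_dist.append(y_sum / 9 - (total - y_sum) / 9)

# two-sided p-value, with a small tolerance for floating point at the boundary
p_two_sided = sum(abs(d) >= abs(obs) - 1e-9 for d in ref_dist) / len(ref_dist)
```

This reproduces the observed Y − X = −1.4666... and a two-sided p-value near the .02587 reported on the following slides.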
What does combn Do?
The function combn goes through all combinations of 9 indices taken from 1:18. Each such combination is fed as the value for the first argument ind to the function m.diff. Note that ind is not explicitly named in combn(...).
Instead of ind any other unused variable name could be used, e.g., IndianaJones.
The 2nd argument y in m.diff is explicitly named as y=SIR in combn(1:18,9,FUN=m.diff,y=SIR).
For each combination ind and assigned vector SIR = (X1, . . . ,X9,Y1, . . . ,Y9) = (Z1, . . . ,Z18) the function m.diff(ind,y) is evaluated. For example, ind=c(3,5,7,9,10,12,14,17,18) −→
mean(y[ind]) - mean(y[-ind]) = (Z3 + Z5 + Z7 + . . . + Z17 + Z18)/9 − (Z1 + Z2 + Z4 + . . . + Z15 + Z16)/9.
Using IndianaJones
m.diff <- function(IndianaJones,y){mean(y[IndianaJones])-mean(y[-IndianaJones])}
and
rand.ref.dist <- combn(1:18,9,FUN=m.diff,y=SIR)
would have produced the same output vector rand.ref.dist.
If m.diff <- function(z,y1,y2) we would need to explicitly indicate the sources for the two variables y1 and y2 in combn(...,y1=...,y2=...).
combn
The default function call (with no FUN specification) combn(x, m, FUN = NULL, simplify = TRUE, ...) or just combn(x, m) results in a matrix with columns representing all possible combinations of m elements taken from the vector x.
Don't use x= in ... since x is already implied as the first argument.
Here is an example with x=1:5=c(1,2,3,4,5) and m=3:
> combn(1:5,3)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    1    1    1    1    1    2    2    2     3
[2,]    2    2    2    3    3    4    3    3    4     4
[3,]    3    4    5    4    5    5    4    5    5     5
Note (5 choose 3) = (5 choose 2) = 5!/(3! 2!) = 5 · 4/(1 · 2) = 10.
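For comparison, Python's itertools.combinations enumerates the same 10 combinations in the same lexicographic order, as tuples rather than matrix columns:

```python
from itertools import combinations

# analogue of combn(1:5, 3): all 3-element combinations of c(1,2,3,4,5)
combos = list(combinations([1, 2, 3, 4, 5], 3))
# combos runs from (1, 2, 3) up to (3, 4, 5), ten combinations in all
```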
Reference Distribution of Y − X (continued)
rand.ref.dist gives the reference distribution of Y − X.
We find a (two-sided) p-value of p.val=.02587 for our observed Y − X = −1.467 via
p.val <- mean(abs(rand.ref.dist) >= 1.46666)
= proportion of T or 1 in the logic vector abs(rand.ref.dist) >= 1.46666.
The (two-sided) p-value is the probability of seeing a |Y − X|-value as extreme as or more extreme than the actually observed |y − x| = 1.467, when in fact the hypothesis H0 holds true, i.e., under the randomization reference distribution.
The upfront randomization of fluxes and the assumption of H0 make this a valid probability statement!
Without that randomization we cannot speak of chance.
With upfront randomization and assuming H0 to be true, we can view the observed difference as having arisen as a result of one of 48620 equally likely splits.
Randomization Reference Distribution of Y − X
[Histogram of the randomization reference distribution of Y − X = SIRY − SIRX, on (−2, 2):
P(Y − X ≤ −1.466666) = 0.01294, P(Y − X ≥ 1.466666) = 0.01294;
2-sided p-value: 0.02587 reference distribution, 0.02828 normal approximation]
How to Determine the P-Value
How do we obtain the p-value from the reference distribution?
Note the following:
> x <- 1:10
> x>3
 [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
> sum(x>3)
[1] 7
> mean(x>3) # the proportion of x values > 3
[1] 0.7
Note that x>3 produced a logic vector with the same length as x.
In arithmetic expressions or functions the logic values FALSE and TRUE are interpreted as 0 and 1, respectively.
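The same trick carries over to Python, where True and False likewise behave as 1 and 0 in arithmetic, mirroring the R transcript above:

```python
x = list(range(1, 11))           # x <- 1:10
flags = [v > 3 for v in x]       # like the logic vector x > 3
count = sum(flags)               # True counts as 1, False as 0
proportion = count / len(flags)  # the proportion of x values > 3
```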
How to Determine the P-Value (continued)
x = rand.ref.dist is the vector of all the differences of means, Y − X, obtained for all 48620 possible splits.
mean(x <= -1.46666) and mean(x >= 1.46666) give us the respective one-sided p-values p1 = .01294 and p2 = .01294 for the randomization reference distribution.
Rather than adding these 2 p-values to get a two-sided p-value, we can also do this directly via mean(abs(x) >= 1.46666) = .02587.
Here abs(x) is the vector of absolute values in x.
Some Computational Issues
In the p-value calculation we used 1.46666 instead of 1.467:
p.val <- mean(abs(rand.ref.dist) >= 1.46666)
Why?
The number −1.467 is obtained by rounding the true y − x = −1.46666.....
Thus the logic vector abs(rand.ref.dist) >= 1.467 would give an F to all those splits with |Y − X| = 1.46666....
mean(abs(rand.ref.dist) >= 1.467) = .02345 would compute the chance of seeing a result for |Y − X| that is more extreme than the observed |y − x| = 1.46666.....
However, the p-value is the chance of seeing a |Y − X| that is as extreme as or more extreme than the observed |y − x| = 1.46666...., i.e., p.val = .02587.
Interpreting the P-Value
The calculation of the p-value is based on two premises:
Fluxes are assigned randomly to the 18 units upfront. TRUE!!
The hypothesis H0 of no flux effect. TRUE or FALSE.
If the p-value is very small, then we can either believe that we saw a rather rare result for our test statistic Y − X while H0 is true, or we should be induced to disbelieve H0 and reject it.
When is the p-value sufficiently small?
When p-value ≤ .05 the result is said to be significant at level .05, and we should reject H0 if .05 is our prearranged cut-off or significance level.
The actual p-value is more informative than stating a significant result at .05 or .01.
38
One-Sided or Two-Sided P-Value?
A two-sided p-value is appropriate if we test the hypothesis of no treatment difference against the alternative that there is a difference, without specifying in which direction.
Here we may hope for an insignificant result so that we may be able to use both treatments interchangeably.
A one-sided p-value would be appropriate if we test the hypothesis of no treatment difference against an alternative that specifies the direction in which the means may differ, say the newer treatment, flux X, yields higher SIR values than flux Y, the previous treatment.
However, the test result should not influence the choice of the targeted alternative.
39
Randomization Reference Distribution of Y − X
[Figure, repeated from slide 34: histogram of the randomization reference distribution of Y − X = SIR_Y − SIR_X, with P(Y − X ≤ −1.466666) = 0.01294 and P(Y − X ≥ 1.466666) = 0.01294; 2-sided p-value: 0.02587 (reference distribution), 0.02828 (normal approximation).]
40
Which Normal Distribution Do We Use as Approximation?
Taking n values Y1, . . . , Yn randomly and without replacement from Z1, . . . , ZN, sampling theory (see Appendix A, slide 82) gives the mean and variance of Y = ∑_{i=1}^n Yi/n as

E(Y) = Z and var(Y) = (S²/n)(1 − n/N) with S² = (1/(N − 1)) ∑_{i=1}^N (Zi − Z)²

(1 − n/N) is called the finite population correction factor.
Further theory, a version of the Central Limit Theorem (CLT), suggests that Y has an approximately normal distribution with this mean and variance, i.e.,

Y ≈ N(E(Y), var(Y))
41
Approximate Normality for Y − X
If X is the average of the N − n = m Z-values not contained in the Y-sample, then

Z = (nY + mX)/N  ⟹  Y − Z = Y − (nY + mX)/N = (m/N)(Y − X)  ⟹  Y − X = (N/m)(Y − Z).

Since Z remains constant during any of this random sampling, it follows that Y − X is approximately normally distributed with mean and variance given by E(Y − X) = 0 and

var(Y − X) = (N²/m²)(S²/n)(1 − n/N) = S² N/(mn) = S²(1/m + 1/n)

No more finite population correction factor!
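A small numeric sketch (with made-up placeholder Z values, not the lecture data): the mean-zero and variance claims hold exactly for the randomization distribution; only the normal shape is an approximation.

```r
## Sketch: normal approximation for Y-bar - X-bar; Z values are placeholders.
Z <- c(10.2, 9.8, 10.5, 9.9, 10.1, 10.4, 9.7, 10.0, 10.3,
        8.6,  8.9,  8.4, 9.1,  8.7,  8.5, 9.0,  8.8,  8.3)
N <- length(Z); n <- 9; m <- N - n
S2 <- sum((Z - mean(Z))^2) / (N - 1)   # finite-population S^2
v  <- S2 * (1/m + 1/n)                 # var(Y-bar - X-bar), no fpc left
d.obs <- mean(Z[10:18]) - mean(Z[1:9]) # observed difference of split means
## approximate two-sided p-value from the N(0, v) approximation
p.normal <- 2 * pnorm(-abs(d.obs), mean = 0, sd = sqrt(v))
```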
42
Normal QQ-Plot of Y − X Randomization Reference Distribution
[Figure: normal QQ-plot of the 203 unique values of Y − X against the corresponding quantiles of the normal approximation with σ(Y − X) = 0.669; the reference distribution has shorter tails than the normal approximation.]
43
Construction of the Previous Normal QQ-Plot
Let D1 < D2 < . . . < D203 denote the distinct possible values of Y − X and let ni be the frequency of Y − X = Di among the M = 48620 possible splits.
The proportion of Y − X ≤ Di is pi = (n1 + . . . + ni)/M.
We could regard Di as the pi-quantile of the approximating normal distribution with mean zero and standard deviation σ = .669.
Unfortunately p203 = 1 and the normal 1-quantile = ∞.
To remedy this and also to treat p1 and p203 in a more symmetric fashion we modify pi to pi · M/(M + 1) and plot Di against that modified quantile of the above normal distribution. This was done in the previous plot.
Another option, with roughly the same effect, is to use pi = (n1 + . . . + ni − .5)/M.
We note that the QQ-plot shows the discrepancy in the normal approximation much more clearly (in the tails) than the density plot superimposed on the histogram.
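The construction can be sketched in R as follows; rand.ref.dist below is a simulated stand-in (an assumption) for the exact vector of 48620 values, with σ = .669 taken from the previous slide:

```r
## Sketch of the QQ-plot construction; rand.ref.dist is a simulated
## stand-in (assumption) for the exact reference distribution.
set.seed(2)
rand.ref.dist <- round(rnorm(48620, sd = 0.669), 3)
M   <- length(rand.ref.dist)
tab <- table(rand.ref.dist)       # frequencies n_i of the distinct values D_i
D   <- as.numeric(names(tab))     # distinct values D_1 < D_2 < ...
p   <- cumsum(tab) / M            # p_i = (n_1 + ... + n_i)/M
q   <- qnorm(p * M / (M + 1), sd = 0.669)  # modified p_i avoids qnorm(1) = Inf
plot(q, D, xlab = "quantiles of the normal approximation",
     ylab = "distinct values of the reference distribution")
```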
44
Some Experimenting with Randomization Tests
In the following we will take our SIR data and subtract Y − X from all the Y values, i.e., obtain Y′i = Yi − (Y − X), i = 1, 2, . . . , 9.
This altered Y′ sample will have as average Y′ = Y − (Y − X) = X, i.e., both samples are aligned on their common average.
Then we take Y′1 and add 9 × (Y − X), i.e., obtain Y′′1 = Y′1 + 9 × (Y − X), and leave the other Y′′i = Y′i, i = 2, . . . , 9 unchanged.

⟹ (1/9)(Y′′1 + . . . + Y′′9) = (1/9)(Y′1 + . . . + Y′9) + (Y − X) = X + (Y − X) = Y

⟹ Y′′ − X = Y − X, the same difference as before.

However, we round Y′′1, . . . , Y′′9 to the nearest decimal. This changes Y′′ − X slightly.
What is the effect of this data change on the randomization reference distribution?
45
Box Plot View of Altered SIR Data
[Figure: box plots of the altered SIR data, SIR (log10(Ohm)), for fluxes X and Y, annotated FLUX_Y − FLUX_X = −1.463; the vertical scale runs from −3 to 14, accommodating one extreme low outlier in the Y′′ sample.]
46
Comments and Questions
The two samples seem well aligned when comparing medians.
The Y′′ sample has an outlier, which does not affect the median.
Y′′ − X = −1.463 ≈ −1.467, our previous value (rounding). There is no difference in Flux if we disregard the outlier.
The outlier presumably reflects a defective circuit board or something else gone wrong with this board, and is not a treatment effect (otherwise we should see more such cases).
What happens to Y ′′ − X for all 48620 splits?
Where is −1.463 relative to the reference distribution?
What is its two-sided p-value?
47
Randomization Reference Distribution of Y ′′ − X
[Figure: histogram of the randomization reference distribution of Y′′ − X = SIR′′_Y − SIR_X, with P(Y′′ − X ≤ −1.463) = 0.2727 and P(Y′′ − X ≥ 1.463) = 0.2727; 2-sided p-value: 0.5455 (reference distribution), 0.367 (normal approximation).]
48
Comments
The randomization reference distribution does not judge this outlier sample an unusual split.
The randomization test is not fooled by the outlier.
The two-sided p-value of .55 shows no significance.
The outlier has equal chance of being in either the X-sample or the Y-sample under random splits.
The normal approximation is poor and should not be used here.
In fact, the randomization reference distribution looks like a (.5, .5) mixture of two approximately normal distributions, centered at ±1.463, respectively. The CLT effect still shows in these two constituent distributions, generated as randomization distributions from samples of 8 and 9 or 9 and 8, respectively.
49
Sufficient Conditions for CLT
In order for limit theory to make sense we need to let the population size N = m + n → ∞.
Denote the population values by ZN1, . . . , ZNN.
Let Y be the average of a random sample of size n (without replacement) from this population.
Sufficient conditions for asymptotic normality of Y are

m → ∞ and n → ∞ and max_i (ZNi − ZN)² / ∑_{i=1}^N (ZNi − ZN)² → 0

For an accessible proof see Lehmann's Nonparametrics: Statistical Methods Based on Ranks, Springer 2006, pp. 352ff.
While a single outlier may allow a CLT effect, it still shows its adverse effect for small N.
50
Some More Experimenting with Randomization Tests
Take our SIR data and subtract Y − X from all the Y values, i.e., obtain Y′i = Yi − (Y − X), i = 1, 2, . . . , 9.
This altered Y′ sample has average Y′ = Y − (Y − X) = X, i.e., both samples are aligned on their common average.
Then we take Y′1, . . . , Y′4 and add (9/4) × (Y − X) to each, i.e., obtain Y′′i = Y′i + (9/4) × (Y − X) for i = 1, 2, 3, 4, and leave the other Y′′i = Y′i, i = 5, . . . , 9 unchanged.

⟹ (1/9)(Y′′1 + . . . + Y′′9) = (1/9)(Y′1 + . . . + Y′9) + (Y − X) = X + (Y − X) = Y

⟹ Y′′ − X = Y − X, the same difference as before.

Again we will round Y′′1, . . . , Y′′9 to the nearest decimal.
What is the effect on the randomization reference distribution?
51
Box Plot View of Altered SIR Data
[Figure: box plots of the altered SIR data, SIR (log10(Ohm)), for fluxes X and Y, annotated FLUX_Y − FLUX_X = −1.452.]
52
Comments and Questions
The two samples are not so well aligned when comparing medians.
Three Y values seem to fall apart from the other 6, which results in Y′′ − X = −1.452 ≈ −1.467, our previous value. It seems that there is a difference in Flux, but it shows as a difference in mean and scale (center and spread).
The three "outliers" could presumably be defective circuit boards or represent more strongly varying treatment effects (treatment/unit interactions).
What happens to Y′′ − X for all 48620 splits?
What is the two-sided p-value of −1.452?
53
Randomization Reference Distribution of Y ′′ − X
[Figure: histogram of the randomization reference distribution of Y′′ − X, with P(Y′′ − X ≤ −1.452) = 0.06495 and P(Y′′ − X ≥ 1.452) = 0.06495; 2-sided p-value: 0.1299 (reference distribution), 0.1208 (normal approximation).]
54
Comments
The randomization reference distribution does not judge this more dispersed and shifted Y-sample a very unusual split.
The randomization test looks only at the difference in averages and not at scale changes, thus it loses some efficacy.
The two-sided p-value of .13 shows no significance.
The normal approximation is reasonable compared to the previous case, especially when smoothing over the very local jaggedness, i.e., using a wider histogram bin width.
A difference in sample dispersion has a negative impact when comparing means.
55
The Effect of Spread
We take our SIR data and transform them as follows:

X′i = X + .5 × (Xi − X) and Y′i = Y + .5 × (Yi − Y)

X′ = X and Y′ = Y, thus Y′ − X′ = Y − X is unchanged.
However, we have reduced the spread in both samples by a factor .5:

√(∑i (X′i − X′)²/(m − 1)) = .5 × √(∑i (Xi − X)²/(m − 1)), and similarly for the Y′i.

Again round X′1, . . . , X′9 and Y′1, . . . , Y′9 to the nearest decimal.
What is the effect of this change on the randomization reference distribution?
56
Box Plot View of Altered SIR Data
[Figure: box plots of the altered SIR data, SIR (log10(Ohm)), for fluxes X and Y, annotated FLUX_Y − FLUX_X = −1.433.]
57
Comments and Questions
The two samples show roughly the same difference in sample means, Y′ − X′ = −1.433 ≈ −1.467 = Y − X in the original data. The difference comes from the rounding.
Here the difference in flux is more clearly defined, because each sample is more tightly scattered around its center.
Clearer separation of signal from noise!
What is the two-sided p-value of −1.433?
58
Randomization Reference Distribution of Y ′ − X ′
[Figure: histogram of the randomization reference distribution of Y′ − X′, with P(Y′ − X′ ≤ −1.433) = 0.0002262 and P(Y′ − X′ ≥ 1.433) = 0.0002262; 2-sided p-value: 0.0004525 (reference distribution), 0.001152 (normal approximation).]
59
Comments and Questions
As expected, the two-sided p-value .00045 is much smaller.
The normal approximation seems somewhat reasonable, but the resulting two-sided p-value .0012 is more than twice as large as the p-value .00045. (→ Tail discrepancy! Slide 43)
Instead of tightening the spread of each sample by a factor .5, enlarge the spread by a factor of 2. What would be the result?
Instead of tightening the spread of each sample by a factor .5, tighten the spread by a factor of .5 in only one of the samples. What would be the result?
60
Computation Time and Storage Issues
ref.dist2 <- function () {
  SIR <- flux$SIR
  rand.ref.dist <- combn(1:18, 9, FUN = m.diff, y = SIR)
}
> system.time(ref.dist2())
   user  system elapsed
   1.23    0.00    1.26

1.26 seconds to compute the C(18,9) = 48620 values of the reference distribution.
30 experimental units split into 15 and 15 ⟹ C(30,15) = 155,117,520 splits, and it would take at least 1.26 · 155117520/48620 = 4019.9 seconds or 67 minutes.
Computation times and storage requirements get out of hand in a hurry. We need an alternative computational approach.
61
Approximation to Randomization Reference Distribution
A simple way out of this dilemma of exploding computation times and storage requirements is to generate a sufficiently large random sample (with replacement), say K = 10,000, of combinations from this set of all C(m+n, m) combinations.
For each sampled combination compute the statistic of interest, Si = S(Xi, Yi) = Yi − Xi, i = 1, . . . , K, and approximate the randomization reference CDF F(z) by the empirical distribution of the S1, . . . , SK.
Approximate

F(z) = P(S(X, Y) ≤ z) = P(Y − X ≤ z) ≈ FK(z) = (1/K) ∑_{i=1}^K Bi(z)

with Bi(z) = 1 when Si = S(Xi, Yi) ≤ z and Bi(z) = 0 else.
62
The Law of Large Numbers (LLN)
For any z we have E(Bi(z)) = F(z).
The Bi(z) are independent and identically distributed (iid).
The Law of Large Numbers (LLN)

⟹ FK(z) = (1/K) ∑_{i=1}^K Bi(z) → F(z) as K → ∞.

Similarly, we can approximate p(z) = P(|S(X, Y)| ≥ |z|) = P(|Y − X| ≥ |z|) by

pK(z) = (1/K) ∑_{i=1}^K Bi(z)

with Bi(z) = 1 when |Si| ≥ |z| and Bi(z) = 0 else.
For z = y − x, the observed difference in sample means, pK(z) gives us a good approximation for the true two-sided p-value p(z) of z.
63
Sample Simulation Program
This can be done in a loop using the sample function in R.
sim.ref.dist <- function(K = 10000){
  D.star <- numeric(K)
  for(i in 1:K){
    SIR.s <- sample(SIR)
    D.star[i] <- mean(SIR.s[1:9]) - mean(SIR.s[10:18])
  }
  D.star
}

The output vector D.star holds the sampled reference distribution of Y − X, from which estimated p-values can be computed for any observed y − x.
The following slide shows the QQ-plot comparison with the full randomization reference distribution, together with the respective p-values.
This approach should suffice for practical purposes.
It is problematic with many simultaneous tests (microarray data); screen with a normal or Student-t approximation upfront.
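A hypothetical end-to-end use might look as follows; the SIR values here are illustrative placeholders, not the experiment data:

```r
## Hypothetical usage of sim.ref.dist; SIR values are made-up placeholders.
SIR <- c(10.2, 9.8, 10.5, 9.9, 10.1, 10.4, 9.7, 10.0, 10.3,  # "flux X"
          8.6,  8.9,  8.4, 9.1,  8.7,  8.5, 9.0,  8.8,  8.3) # "flux Y"
d.obs <- mean(SIR[10:18]) - mean(SIR[1:9])   # observed y-bar - x-bar

sim.ref.dist <- function(K = 10000) {
  D.star <- numeric(K)
  for (i in 1:K) {
    SIR.s <- sample(SIR)                     # random re-split of all 18 values
    D.star[i] <- mean(SIR.s[1:9]) - mean(SIR.s[10:18])
  }
  D.star
}

set.seed(1)                                  # reproducible simulation
D.star <- sim.ref.dist(10000)
p.sim <- mean(abs(D.star) >= abs(d.obs))     # estimated two-sided p-value
```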
64
QQ-Plot of Simulated & Full Randomization Reference Distribution for Y − X
[Figure: QQ-plot of Y − X for all 10000 sampled combinations against Y − X for all combinations; simulated one-sided p-values p1 = 0.0119 and p2 = 0.0127, versus p1 = 0.01294 and p2 = 0.01294 from the full reference distribution.]
65
The 2-Sample Student t-Test Statistic
When testing the equality of means of two normal populations it is customary and optimal to compare the respective sample means relative to a measure of the dispersion in the two samples ⟹ Student's 2-sample t-statistic

t(X, Y) = [(Y − X)/√(1/n + 1/m)] / √{[∑_{i=1}^n (Yi − Y)² + ∑_{j=1}^m (Xj − X)²]/(m + n − 2)}

In the presence of substantial inherent variation, "small" differences in means could easily be due to this variation and thus will not appear significant.
Judge mean differences relative to a measure of this variation.
So far we have not assumed that we sample normal populations. In fact, we only guaranteed that we sample from a finite population Z1, . . . , ZN.
66
Randomization Distribution of the 2-Sample t-Statistic
The values of t(X, Y) are in 1-1 monotone increasing correspondence with those of Y − X under all splits. See Appendix B, slide 85.
In fact,

⟹ t(X, Y) = [√(m + n − 2)/√(1/n + 1/m)] · W/√(1 − (mn/N)W²), increasing in W,

where

W = (Y − X)/√(∑(Zk − Z)²),

with ∑(Zk − Z)² staying constant over all splits.
Thus p-values for t(x, y) and y − x will be the same under each respective randomization reference distribution.
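For concreteness, a sketch of the pooled 2-sample t-statistic (with placeholder data); R's built-in t.test with var.equal = TRUE computes the same quantity:

```r
## Sketch: pooled 2-sample t statistic; x and y are made-up placeholders.
t.stat <- function(x, y) {
  m <- length(x); n <- length(y)
  ss <- sum((x - mean(x))^2) + sum((y - mean(y))^2)  # pooled sum of squares
  (mean(y) - mean(x)) / (sqrt(ss / (m + n - 2)) * sqrt(1/n + 1/m))
}
x <- c(10.2, 9.8, 10.5, 9.9, 10.1, 10.4, 9.7, 10.0, 10.3)
y <- c( 8.6, 8.9, 8.4, 9.1, 8.7, 8.5, 9.0, 8.8, 8.3)
t.obs <- t.stat(x, y)
## agrees with the equal-variance two-sample t test:
## t.test(y, x, var.equal = TRUE)$statistic
```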
67
Student t-Approximation for the t(X ,Y ) Reference Distribution
In regular¹ situations the randomization reference distribution of t(X, Y) is very well approximated by a Student t-distribution with 16 = 18 − 1 − 1 degrees of freedom.
This was very useful before computing and simulation were readily available. All one needs to do is compute t(X, Y) for the observed data and calculate its p-value from the Student t16-distribution. Here t(x, y) = −2.512236.
Reference: G.E.P. Box and S.L. Anderson, "Permutation theory in the derivation of robust criteria and the study of departures from assumptions," J. Roy. Stat. Soc., Ser. B, Vol 17 (1955), pp. 1-34.
The test based on t(X, Y) and its t-distribution under H0 also shows up in the normal-model-based approach to this problem. More on this later.
It is the concept of the t(X, Y) statistic (not Y − X) that generalizes to more complex experimental situations.
¹Our outlier experiment shows what can happen.
68
t(X ,Y ) Randomization Reference Distribution
[Figure: histogram of the t(X, Y) randomization reference distribution, with P(t(X, Y) ≤ −2.512236) = 0.01294 and P(t(X, Y) ≥ 2.512236) = 0.01294; 2-sided p-value: 0.02587 (reference distribution), 0.023098 (t16 approximation).]
69
t-QQ-Plot of t(X ,Y ) Randomization Reference Distribution
[Figure: t-QQ-plot of the 203 unique values of t(X, Y) against the corresponding quantiles of the Student t16 approximation; the reference distribution has longer tails than the Student t16 approximation.]
70
Randomization Reference Distribution of Y ′′ − X
[Figure: histogram of the t(X, Y) randomization reference distribution for the outlier-altered data, with P(t(X, Y) ≤ −0.8942885) = 0.2727 and P(t(X, Y) ≥ 0.8942885) = 0.2727; 2-sided p-value: 0.5455 (reference distribution), 0.38442 (t16 approximation).]
The outlier example shows that the t-distribution is not always a good fit.
71
Critical Value, Significance Level & Type I Error
Any extreme value of |Y − X| or |t(X, Y)| could either come about due to a rare chance event (via the randomization) or due to H0 actually being wrong.
We have to make a decision: Reject H0 or not?
We may decide to reject H0 when |Y − X| ≥ C or |t(X, Y)| ≥ C̃, where C and C̃ are the respective critical values.
To determine C or C̃ one usually sets a significance level α which limits the probability of rejecting H0 when in fact H0 is true (Type I error).
The requirement

α = P(reject H0 | H0) = P(|Y − X| ≥ C | H0) = P(|t(X, Y)| ≥ C̃ | H0)

then determines C = Cα or C̃ = C̃α. As α ↘ we get Cα ↗ and C̃α ↗.
P(. . . | H0) refers to the randomization reference distribution.
72
Significance Levels and P-Values
When we reject H0 we would say that the results were significant at the level α, agreed upon prior to the experiment.
Commonly used values of α are α = .05 or α = .01, a remnant of statistical tables.
Rejecting at smaller α than these would be even stronger evidence against H0.
For how small an α would we still have rejected?
This leads us to the observed significance level α̂ or p-value for the observed discrepancy value |y − x|:

p-value = P(|Y − X| ≥ |y − x| | H0) = α̂, i.e., Cα̂ = |y − x|

α̂ is the smallest α at which we would have rejected H0 with the observed |y − x|.
73
How to Determine the Critical Value C.crit
For α = .05 we want to find C.crit such that mean(abs(x) >= C.crit) = .05. Here x = rand.ref.dist is the reference distribution vector.
Equivalently, find the .95-quantile of abs(x) via C.crit = quantile(abs(x), .95).
From the full reference distribution we get C.crit(α = .05) = 1.28889 and C.crit(α = .01) = 1.64444.
From the simulated reference distribution we get C.crit(α = .05) = 1.31111 and C.crit(α = .01) = 1.66667.
Note that we avoided the use of C in place of C.crit because in R the letter C has a predetermined meaning. ⟹ ?C.
R used to warn you when your chosen object name masks a system object name. This warning no longer happens.
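A minimal sketch, using a simulated stand-in (an assumption) for rand.ref.dist:

```r
## Minimal sketch; x is a simulated stand-in (assumption) for rand.ref.dist.
set.seed(7)
x <- rnorm(48620, sd = 0.669)
C.crit.05 <- quantile(abs(x), 0.95)  # reject H0 at alpha = .05 when |y-bar - x-bar| >= C.crit.05
C.crit.01 <- quantile(abs(x), 0.99)  # alpha = .01 critical value
```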
74
What Does the t-Distribution Give Us?
What does the observed t-statistic t(x, y) = −2.5122 give as a 2-sided p-value?
P(|t(X, Y)| ≥ 2.5122) = 2*(1 - pt(2.5122, 16)) = 2*pt(-2.5122, 16) = .02310, compared with ≈ .02587 from the randomization reference distribution.
What are the critical values tcrit(α) for |t(X, Y)| for level α = .05, .01 tests?
We find tcrit(α = .05) = C̃.05 = qt(.975, 16) = 2.1199 and tcrit(α = .01) = C̃.01 = qt(.995, 16) = 2.9208, respectively.
With |t(x, y)| = 2.5122 we would reject H0 at α = .05 since |t(x, y)| ≥ 2.1199, but not at α = .01 since |t(x, y)| < 2.9208.
The same message derives from the p-value: .01 < .02310 ≤ .05.
75
Comparing P-Values
The randomization reference distribution gave us a p-value of .02587.
The normal approximation for the Y − X reference distribution gave us .02828.
The t16 approximation for t(X, Y) gave us .02310.
The approximations deviate in opposite directions with roughly the same magnitude: .02828 − .02587 = .00241 and .02310 − .02587 = −.00277.
It is not clear whether the t approximation should be preferred based on approximation quality.
The t approximation involves a single distribution while the normal approximation still requires σ(Y − X).
t(X, Y) fits better into the larger picture of what lies ahead.
76
Critical Values of the Randomization Reference Distribution
[Figure: histogram of the t(X, Y) randomization reference distribution, with the observed value and the critical values for α = 0.05 and α = 0.01 marked.]
77
Some Concluding Comments
Critical values and fixed significance levels were motivated by
theoretical considerations that allowed the comparison of different level α tests and finding the best one (Neyman-Pearson optimality theory),
the planning of sample sizes to achieve sufficient power to detect treatment effects of specified size,
easy tabulation of critical values for a limited set of α values,
the fact that computing was not what it is today.
Statistical tables are following the path of logarithm tables.
78
Calculator Simulations in 1980
Back in 1980 I asked colleagues Piet Groeneboom and Ron Pyke to look at the large sample distribution of a particular statistic Sn (a very difficult problem).
They succeeded in showing that the asymptotic distribution of Sn should be normal:

Sn ≈ N(n log(n), 3n² log(n)) for large n.
I simulated the distribution for Sn on my HP-41 CV.
After days & weeks intermediate results did not look normal.
Since it was known that Sn ≥ 0, a natural heuristic for decent normality would be

0 ≤ n log(n) − 3 × √(3n² log(n)) ⟹ log(n) ≥ 27 or n ≥ 5.3 × 10¹¹
79
Calculators during R.A. Fisher's Time
Photographed at the Bonn Arithmeum, July 2013. Fisher used such a machine for 14 years.
80
Calculators in/before My Time
81
Appendix A: Sampling Theory Proofs
The following two slides establish the mean and variance of Y when Y1, . . . , Yn are randomly drawn without replacement from Z1, . . . , ZN with N = m + n:

E(Y) = Z

and

var(Y) = (S²/n)(1 − n/N) with S² = (1/(N − 1)) ∑_{i=1}^N (Zi − Z)²

Note the evident correctness when n = N (m = 0).
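A quick numeric check of these formulas, enumerating all C(N, n) equally likely samples of a small illustrative population (the Z values are made up for this sketch):

```r
## Numeric check of E(Y-bar) = Z-bar and var(Y-bar) = (S^2/n)(1 - n/N)
## on a small made-up population (assumption for this sketch).
Z <- c(2, 5, 7, 11, 13)
N <- length(Z); n <- 2
## all C(5,2) = 10 equally likely samples of size n and their means
Ybar <- combn(N, n, FUN = function(ind) mean(Z[ind]))
S2 <- sum((Z - mean(Z))^2) / (N - 1)
mean(Ybar)                 # = mean(Z) = 7.6
mean((Ybar - mean(Z))^2)   # exact variance over all splits = 5.94
S2 / n * (1 - n / N)       # formula gives the same 5.94
```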
82
Sampling Theory
Write Y = ∑_{i=1}^N Ii Zi/n, where the random variable Ii = 1 whenever the Y-sample contains Zi and Ii = 0 otherwise. Then

E(Ii) = P(Ii = 1) = C(1,1)·C(N−1, n−1)/C(N, n) = (N−1)! n! (N−n)! / [N! (n−1)! (N−n)!] = n/N,

and for i ≠ j

E(Ii Ij) = P(Ii = 1, Ij = 1) = C(2,2)·C(N−2, n−2)/C(N, n) = (N−2)! n! (N−n)! / [N! (n−2)! (N−n)!] = n(n−1)/(N(N−1)).

The Z1, . . . , ZN are the finite population from which we sample.

⟹ E(Y) = (1/n) ∑_{i=1}^N Zi E(Ii) = (1/n) ∑_{i=1}^N Zi (n/N) = (1/N) ∑_{i=1}^N Zi = Z

Further note

Y = (1/n) ∑_{i=1}^N Ii Zi = (1/n) ∑_{i=1}^N Ii (Zi − Z) + Z since ∑_{i=1}^N Ii = n.
83
var(Y) = (S²/n)(1 − n/N)

Writing zi = Zi − Z with ∑_{i=1}^N zi = 0, and dropping the constant Z in the variance,

⟹ var(Y) = var((1/n) ∑_{i=1}^N Ii zi) = cov((1/n) ∑_{i=1}^N Ii zi, (1/n) ∑_{j=1}^N Ij zj) = (1/n²) ∑_{i=1}^N ∑_{j=1}^N zi zj cov(Ii, Ij)

For i = j we have cov(Ii, Ij) = var(Ii) = (n/N)(1 − n/N) = nm/N², and for i ≠ j we have

cov(Ii, Ij) = E(Ii Ij) − E(Ii)E(Ij) = P(Ii = 1, Ij = 1) − (n/N)(n/N) = n(n−1)/(N(N−1)) − n²/N²
= (n/N)[(n−1)/(N−1) − n/N] = (n/N)[N(n−1) − (N−1)n]/(N(N−1)) = −(n/N)(N−n)/(N(N−1)) = −nm/(N²(N−1))

Hence

var(Y) = (1/n²) ∑_{i=1}^N zi² (nm/N²) − (1/n²) ∑_{i≠j} zi zj nm/(N²(N−1)) = S²(nm(N−1)/(n²N²) + nm/(n²N²)) = (S²/n)(1 − n/N)

since 0 = (∑i zi)(∑j zj) = ∑i ∑j zi zj = ∑i zi² + ∑_{i≠j} zi zj, i.e., ∑_{i≠j} zi zj = −∑i zi².
84
Appendix B: Monotone Equivalence of Y − X and t(X ,Y )
The values of t(X, Y) are in 1-1 monotone increasing correspondence with those of Y − X under all splits.

∑(Zk − Z)² − (mn/N)(Y − X)²
= ∑ Yi² + ∑ Xj² − (1/N)(mX + nY)² − (mn/N)(Y − X)²
= ∑ Yi² + ∑ Xj² − nY² − mX² = ∑(Yi − Y)² + ∑(Xj − X)²

Note

t(X, Y) √(1/n + 1/m) / √(m + n − 2) = (Y − X)/√(∑(Yi − Y)² + ∑(Xj − X)²) = W/√(1 − (mn/N)W²), increasing in W,

with W = (Y − X)/√(∑(Zk − Z)²), where ∑(Zk − Z)² is constant over all splits.

⟹ t(X, Y) = [√(m + n − 2)/√(1/n + 1/m)] · W/√(1 − (mn/N)W²), or W = t(X, Y)/√((m + n − 2)/(1/n + 1/m) + (mn/N) t(X, Y)²)
85