The Fallacies of Learning Prof. Giuseppe De Nicolao


Page 1

The Fallacies of Learning
Prof. Giuseppe De Nicolao

Page 2

Outline

• Correlation is not causation

• Regression to mediocrity

• Randomized Trials

Page 3

Correlation is not causation

Page 4

Correlation Analysis

Data from n = 129 Pavia students (academic year 2001/02):
• Height (cm)
• Weight (kg)
• Exam mark in Algebra & Geometry
• Exam mark in Physics I

Aim: evaluate the correlation of the following pairs:
• Weight – Height
• Algebra & Geometry – Physics I
• Weight – Algebra & Geometry

Page 5

Let’s inspect the scatter plots

[Figure: scatter plots for the 129 Pavia students: weight vs. height; Physics I vs. Algebra & Geometry; Algebra & Geometry vs. weight.]

Page 6

At first sight we note:

• Clear positive correlation between weight and height
• Some positive correlation between A & G and Physics
• Dubious negative correlation (uncorrelation?) between weight and A & G

Quantitative correlation index: Pearson's correlation coefficient.

$$ r_{XY} := \frac{\mathrm{Cov}[X,Y]}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} $$

To estimate it, we resort to the sample correlation:

$$ R_{XY} := \frac{S_{xy}}{S_x S_y}, \qquad S_{xy} := \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) $$

Note: if V = aX + b and W = cY + d (with a, c > 0), then r_{VW} = r_{XY} and R_{VW} = R_{XY}.
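The sample correlation can be computed directly from its definition; a minimal sketch in Python (the data values are made up for illustration):

```python
import numpy as np

def sample_correlation(x, y):
    """R_XY = S_xy / (S_x * S_y), where S_xy is the sample covariance
    (normalized by n - 1) and S_x, S_y are the sample standard deviations."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    s_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)
    s_x = np.sqrt(np.sum((x - x.mean()) ** 2) / (n - 1))
    s_y = np.sqrt(np.sum((y - y.mean()) ** 2) / (n - 1))
    return s_xy / (s_x * s_y)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.2, 5.8, 8.3, 9.9])  # roughly linear in x
r = sample_correlation(x, y)

# Agrees with NumPy's estimator:
assert abs(r - np.corrcoef(x, y)[0, 1]) < 1e-12
# Invariant under affine maps V = aX + b, W = cY + d with a, c > 0:
assert abs(sample_correlation(2 * x + 3, 0.5 * y - 1) - r) < 1e-12
```

The affine-invariance check mirrors the remark: changing units (e.g. kg to pounds, cm to inches) does not change the correlation.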

Page 7

To estimate r_{XY} we resort to the sample correlation R_{XY}.

Remark: if V = aX + b and W = cY + d (with a, c > 0), the correlations are unchanged: r_{VW} = r_{XY} and R_{VW} = R_{XY}.


Page 8

Table: mean, variance, SD

Variable       Mean    Variance   SD
height (cm)    176.3   60.0       7.8
weight (kg)    67.7    111.7      10.6
A & G          24.0    17.2       4.1
Physics I      22.6    16.0       4.0

Page 9

Table: estimated correlation coefficients

Variables          Sample correlation
height – weight    0.67
A & G – Physics    0.38
weight – A & G     -0.25

Page 10

Comments

• As expected, there is a clear correlation between height and weight.
• Weak correlation between A & G and Physics: does it mean that exams are a lottery?
• Negative correlation between weight and A & G: would losing some weight help improve the marks?

Page 11

Problem: how do we determine whether the correlation is significant?
Answer: play devil's advocate and make the Null Hypothesis H0 that the two variables are uncorrelated (r_{XY} = 0), then see whether it is rejected by the data.

In order to reject the Null Hypothesis we need a reference distribution.

Page 12

Computer-simulated experiment

1. Using a random number generator, draw n pairs of independent (r_{XY} = 0) random variables, each distributed as a standard normal.
2. Compute the sample correlation coefficient R_{XY}.
3. Repeat steps 1 and 2 another 999 times.
4. Plot the histogram of the 1000 sample correlations R_{XY} thus obtained.

The histogram approximates the reference distribution of R_{XY} under the Null Hypothesis of zero correlation.
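The four steps above can be sketched as follows (sample size and random seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def null_distribution(n, n_rep=1000):
    """Reference distribution of R_XY under H0 (r_XY = 0): sample
    correlations of n_rep independent standard-normal samples of size n."""
    rs = np.empty(n_rep)
    for k in range(n_rep):
        x = rng.standard_normal(n)          # step 1: n independent pairs
        y = rng.standard_normal(n)
        rs[k] = np.corrcoef(x, y)[0, 1]     # step 2: sample correlation
    return rs                               # steps 3-4: repeat and collect

rs = null_distribution(n=20)
# With n = 20, |R_XY| > 0.4 is not rare even though the true r_XY = 0:
print(np.mean(np.abs(rs) > 0.4))
```

Plotting a histogram of `rs` (e.g. with `matplotlib`) reproduces the reference distributions shown on the next slide.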

Page 13

Approximate reference distributions for different values of n

[Figure: approximate reference distributions (histograms of 1000 simulated sample correlations under H0) for n = 5, 10, 20, 30, 60, 120; horizontal axis: correlation coefficient, from -1 to 1.]

Page 14

Comments

• If n is not large enough, it is relatively easy to obtain values of R_{XY} much greater than zero even if X and Y are actually uncorrelated.
• Example: if n ≤ 20, observing R_{XY} = 0.4 is not a good guarantee that correlation exists (unlike what happens for n ≥ 60).
• We need a quantitative assessment of the significance of R_{XY}.

Page 15

Property: consider n pairs (X, Y) where X and Y are jointly normal and independent. Then

$$ t = \frac{R_{XY}\sqrt{n-2}}{\sqrt{1 - R_{XY}^2}} \sim t_{n-2} $$

(t_{n-2}: Student's t with n - 2 degrees of freedom).

Correlation test (significance level α), having computed t:

• |t| > t_{α/2} ⇒ H0 is rejected ⇒ R_{XY} is significantly ≠ 0
• |t| ≤ t_{α/2} ⇒ H0 is not rejected ⇒ R_{XY} is not significantly ≠ 0
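As a sketch, the test applied to the pooled weight vs. A & G correlation from the example (R_XY = -0.25, n = 129); `scipy` supplies the t quantile:

```python
import math
from scipy import stats

def correlation_test(r, n, alpha=0.05):
    """t = R * sqrt(n - 2) / sqrt(1 - R^2), compared against the
    Student-t critical value with n - 2 degrees of freedom."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return t, t_crit, abs(t) > t_crit   # True -> reject H0

t, t_crit, reject = correlation_test(-0.25, 129)
print(t, t_crit, reject)  # |t| exceeds the critical value: significant
```

This matches the slides: with all 129 students pooled, even the weight–A & G correlation of -0.25 comes out significant.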

Page 16

Table: values of tα/2

P(t_ν ≥ t_{α/2}) = α/2, i.e. P(|t_ν| > t_{α/2}) = α (ν: degrees of freedom)

ν     α = 0.20   α = 0.10   α = 0.05
1     3.078      6.314      12.706
2     1.886      2.920      4.303
3     1.638      2.353      3.182
4     1.533      2.132      2.776
5     1.476      2.015      2.571
6     1.440      1.943      2.447
7     1.415      1.895      2.365
8     1.397      1.860      2.306
9     1.383      1.833      2.262
10    1.372      1.812      2.228
11    1.363      1.796      2.201
12    1.356      1.782      2.179
13    1.350      1.771      2.160
14    1.345      1.761      2.145
15    1.341      1.753      2.131
16    1.337      1.746      2.120
17    1.333      1.740      2.110
18    1.330      1.734      2.101
19    1.328      1.729      2.093
20    1.325      1.725      2.086
21    1.323      1.721      2.080
22    1.321      1.717      2.074
23    1.319      1.714      2.069
24    1.318      1.711      2.064
25    1.316      1.708      2.060
26    1.315      1.706      2.056
27    1.314      1.703      2.052
28    1.313      1.701      2.048
29    1.311      1.699      2.045
30    1.310      1.697      2.042
40    1.303      1.684      2.021
60    1.296      1.671      2.000
120   1.289      1.658      1.980
∞     1.282      1.645      1.960

Page 17

Even easier: table for r_{2.5%}

A more convenient method: fixing α = 5%, we can compute the critical values of R_{XY} for different sample sizes n.

If |R_{XY}| > r_{2.5%}, then R_{XY} is significantly ≠ 0.

Table: values of r_{2.5%}

n     r_{2.5%}
3     0.997
4     0.95
5     0.88
6     0.81
7     0.75
8     0.71
9     0.67
10    0.63
11    0.60
12    0.58
13    0.55
14    0.53
15    0.51
16    0.50
17    0.48
18    0.47
19    0.46
20    0.44
21    0.43
22    0.42
23    0.41
24    0.40
25    0.40
26    0.39
27    0.38
28    0.37
29    0.37
30    0.36
60    0.25
120   0.18
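The table can be reproduced by inverting the t statistic: |R_XY| is significant exactly when |t| > t_{α/2}, which gives the critical value r = t_{α/2} / sqrt(t_{α/2}^2 + n - 2). A sketch using `scipy`:

```python
import math
from scipy import stats

def critical_r(n, alpha=0.05):
    """Smallest |R_XY| that is significant at level alpha for sample size n."""
    t = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return t / math.sqrt(t ** 2 + n - 2)

for n in (10, 20, 30, 60, 120):
    print(n, round(critical_r(n), 2))  # matches the table: 0.63, 0.44, 0.36, 0.25, 0.18
```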

Page 18

Back to the example

Variables          Sample correlation
height – weight    0.67     significantly ≠ 0
A & G – Physics    0.38     significantly ≠ 0
weight – A & G     -0.25    significantly ≠ 0 (!)

Page 19

«Beware the lurking variable!»
Correlation does not imply causation

• Example #1: shoe size – reading ability
• Example #2: educational attainment – unemployment duration in the US during the Great Depression

In many cases, a «spurious» correlation is observed when both variables are correlated with a third variable (the lurking variable).

• Example #1: age.
• Example #2: again age, because the young were more educated.

Page 20

Causal diagram

[Diagram: AGE (the confounder) → SHOE SIZE, and AGE → READING ABILITY]

Page 21

Let's re-examine the scatter plot distinguishing males (*) and females (o) (nM = 104, nF = 25):

[Figure: scatter plot of Algebra & Geometry mark vs. weight, with males and females marked separately.]

Conjecture: there is correlation between weight and algebra because females obtain (on average) higher marks and (on average) weigh less.

Page 22

Conjecture: gender is a confounder

[Diagram: GENDER (the confounder) → WEIGHT, and GENDER → A & G MARK]

Page 23

Histograms

[Figure: histograms of weight and of Algebra & Geometry marks, for males (top) and females (bottom).]

Page 24

Table: mean, variance, SD

Variable          Mean   Variance   SD
weight (female)   53.9   38.2       6.2
weight (male)     71.1   72.3       8.5
A & G (female)    26.1   17.7       4.2
A & G (male)      23.5   15.9       4.0

Page 25

Estimated correlation coefficients

Variables                  Sample correlation
weight – A & G (female)    -0.17
weight – A & G (male)      -0.11

Page 26

If |R_{XY}| > r_{2.5%}, then R_{XY} is significantly ≠ 0.

Page 27

Estimated correlation coefficients

Variables                  Sample correlation
weight – A & G (female)    -0.17    not significantly ≠ 0
weight – A & G (male)      -0.11    not significantly ≠ 0

Page 28

Comments

• If the subpopulations (males and females) are analyzed separately, the weight–A & G correlation is in both cases below the significance threshold.
• Gender is likely to be a confounder.
• For some reason (to be better understood), females obtain better marks (on average).

Page 29

Final remarks on correlation analysis

Beware! Correlation analysis is sensitive to outliers: they can distort the correlations.

[Figure: scatter plot in which a single outlier masks an otherwise clear linear pattern; R_{XY} = 0.08.]

Beware! There are relationships between variables that are not revealed by the correlation coefficient.

[Figure: scatter plot of a parabola-shaped relationship; R_{XY} = 3.0774e-17, essentially zero.]

Some nonlinear relationships are not detected by Pearson's correlation coefficient.
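A minimal sketch of the second warning: a perfect, but symmetric, nonlinear dependence yields a Pearson coefficient of essentially zero (the parabola-like data here are illustrative, not the slide's data):

```python
import numpy as np

# A deterministic, perfectly predictable, yet symmetric relationship:
x = np.linspace(0, 10, 21)
y = (x - 5) ** 2
r = np.corrcoef(x, y)[0, 1]
print(r)  # essentially zero: Pearson's r measures only *linear* association
```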

Page 30

Final remarks on correlation analysis

• Pearson's correlation coefficient measures linear association, not association in general.
• Its use is most appropriate when the bivariate scatter plot is elliptical.
• Best results are achieved for jointly normal random variables, for which uncorrelatedness is equivalent to independence.

Page 31

Simpson’s Paradox

Page 32

[Figure: reading ability of kids vs. their shoe size, color-coded by age (6 to 10 years).]

• Age, represented as different colors, is the confounder.
• The data are a mixture of 5 distributions.
• Within each distribution no correlation exists.
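The shoe-size example can be mimicked numerically; the group means below are made up for illustration, but they reproduce the mechanism: within each age group the two variables are independent, yet the pooled data show a strong correlation.

```python
import numpy as np

rng = np.random.default_rng(1)

ages = np.arange(6, 11)                 # 5 age groups (the lurking variable)
shoe, reading, group = [], [], []
for age in ages:
    n = 40
    # Within each age, shoe size and reading ability vary independently:
    shoe.append(rng.normal(16 + 2 * age, 0.8, n))   # mean grows with age
    reading.append(rng.normal(age - 4, 0.5, n))     # mean grows with age
    group.append(np.full(n, age))
shoe, reading, group = map(np.concatenate, (shoe, reading, group))

pooled = np.corrcoef(shoe, reading)[0, 1]
within = [np.corrcoef(shoe[group == a], reading[group == a])[0, 1]
          for a in ages]
print(round(pooled, 2), [round(c, 2) for c in within])
```

The pooled correlation is high only because both means grow with age; every within-group correlation hovers around zero.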

Page 33

Let's re-examine the scatter plot distinguishing males (*) and females (o) (nM = 104, nF = 25):

[Figure: scatter plot of Algebra & Geometry mark vs. weight, with males and females marked separately.]

Conjecture: there is correlation between weight and algebra because females obtain (on average) higher marks and (on average) weigh less.

Page 34

Apparent negative correlation overall, but positive correlation within the groups.

Page 35

Regression to mediocrity

Page 36

February 9, 1877, Royal Institution, London

Page 37

Friday evening lecture: F. Galton, "Typical laws of heredity"

Page 38

How do children's statures depend on their parents'?

• Females' statures multiplied by 1.08
• Mid-parent = average of the parents
• Statures normalized around zero

Page 39

Gaussianity: the «quincunx»

Quincunx: originally the name of a Roman coin

https://en.wikipedia.org/wiki/Bean_machine

Page 40

Gaussianity: the «quincunx»

The balls «inherit» their position from those in the higher rows, in the same way as humans inherit their stature from their ancestors.

The width of the Gaussian depends on the number of rows.

Page 41

According to the quincunx, variance should increase from one generation to the next (and eventually diverge), but...

[Figure: Children's statures vs. Mid-Parents' statures.]

When Mid-Parents are taller than mediocrity, their Children tend to be shorter than they are; when Mid-Parents are shorter than mediocrity, their Children tend to be taller than they are.

Page 42

Galton's empirical law of reversion (1877): there must be a counteracting force to maintain population stability.

The two-stage quincunx:
• The quincunx increases variance: this is the law of error (central limit theorem).
• The chutes reduce variance: this is the empirical law of reversion.
• Initial variance → larger variance → variance returns to its initial value.

Page 43

As a matter of fact, the law of reversion is a fallacy: no counteracting force is needed.

Page 44

1885, Aberdeen: Galton's Presidential Address at the British Association for the Advancement of Science

• [I was] «blind to what I now perceive to be the simple explanation»
• «The explanation of it is as follows. The child inherits partly from his parents, partly from his ancestry.»

F. Galton, "Regression Towards Mediocrity in Hereditary Stature", The Journal of the Anthropological Institute of Great Britain and Ireland, Vol. 15 (1886), pp. 246-263.

Page 45

Statistical formulation of Galton's 1885 explanation

• P ∼ N(0, σ²): stature of Mid-Parent
• C ∼ N(0, σ²): stature of (grown-up) Child
• C = αP + V, 0 < α < 1
• V ∼ N(0, λ²): ancestry contribution, independent of P

Since P and C have the same variance, λ² must satisfy

Var(C) = α²σ² + λ² = σ²  ⟹  λ² = (1 − α²)σ²

Moreover,

Cov(C, P) = E((αP + V)P) = αE(P²) = ασ²

r_CP = Cov(C, P) / √(Var(C) Var(P)) = α
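A quick simulation confirms the two properties derived above, namely Var(C) = σ² and r_CP = α (the values σ = 7 cm and α = 0.65 are arbitrary illustrative choices, not estimates from Galton's data):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, alpha, n = 7.0, 0.65, 500_000

P = rng.normal(0.0, sigma, n)                          # mid-parent stature, as deviation from the mean
V = rng.normal(0.0, sigma * np.sqrt(1 - alpha**2), n)  # ancestry term, variance (1 - alpha^2) * sigma^2
C = alpha * P + V                                      # child stature

print(np.var(C) / sigma**2)      # close to 1: C has the same variance as P
print(np.corrcoef(P, C)[0, 1])   # close to alpha
```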

Page 46

Just a statistical property

Complete model:

(P, C)ᵀ ∼ N( (0, 0)ᵀ , σ² [ 1 α ; α 1 ] )

How is C influenced by P? This is equivalent to asking what is the distribution of C conditional on P = p. Recalling that C = αP + V, with V independent of P,

E(C | P = p) = αp

Therefore, if we consider Parents whose stature is p > 0 (above the mean), on average the stature of their Children is strictly less than p.
There is no counteracting force; it is just a statistical property.
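The conditional-mean property E(C | P = p) = αp can be checked by slicing a simulated joint sample around a chosen parental stature (σ = 7 and α = 0.65 are again arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, alpha, n = 7.0, 0.65, 1_000_000

P = rng.normal(0.0, sigma, n)
C = alpha * P + rng.normal(0.0, sigma * np.sqrt(1 - alpha**2), n)

p = 10.0                       # parents 10 cm above the population mean
band = np.abs(P - p) < 0.5     # narrow slice around P = p
cond_mean = C[band].mean()
print(cond_mean)               # close to alpha * p = 6.5, strictly less than p
```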

Page 47

Galton abandoned the "law of reversion" in favour of "regression to the mean" or "regression towards mediocrity".

THE AMERICAN STATISTICIAN

HISTORY CORNER

On Galton's Change From "Reversion" to "Regression"

Prakash Gorroochurn

KEYWORDS: Ancestral type; Atavism; Imperfect correlation

ABSTRACT: Galton's first work on regression probably led him to think of it as a unidirectional, genetic process, which he called "reversion." A subsequent experiment on family heights made him realize that the phenomenon was symmetric and nongenetic. Galton then abandoned "reversion" in favor of "regression." Final confirmation was provided through Dickson's mathematical analysis and Galton's examination of height data on brothers.

1. Introduction

Regression, as we know it today, was born from Galton's investigations into the laws of heredity. The phenomenon that Galton discovered is best described in his own words:

…offspring did not tend to resemble their parent seeds in size, but to be always more mediocre than they—to be smaller than the parents, if the parents were large; to be larger than the parents, if the parents were very small… (Galton 1886)

That Galton came to this conclusion almost single-handedly and not by drawing on the contributions from his predecessors is testimony to his genius. The various experiments and analyses that Galton performed before he reached his conclusion have been well described in works such as Cowan (1972); MacKenzie (1978); Porter (1986); Stigler (1986, 1989); and Bulmer (2003). However, what is often not properly discussed is that Galton at first very probably did not understand regression as we know it today. He first called the phenomenon "reversion" (indeed the symbol r was first used by Galton to signify "the coefficient of reversion" (Pearson 1930, p. 9)), which was a genetic process well known to both him and his contemporaries. One of his first discoveries was not that there was a regression effect, but rather that the reversion phenomenon he had observed and had assumed would occur was operating in a linear fashion. Galton also thought the phenomenon was a unidirectional process operating on offspring from remote ancestors (Gorroochurn to appear, Chap. 2). The realization that something other than a unidirectional genetic process was going on, however, soon came about when he found that reversion was also occurring on parents from their offspring. At this stage, Galton made the decision to change "reversion" to "regression." Galton confirmed his hypothesis through the mathematical analysis of J. Hamilton Dickson and later through the examination of height data on brothers. The phenomenon was not genetic reversion as he had at first thought, but a nongenetic, purely statistical phenomenon that could operate in either direction.

2. The 1877 Paper: Reversion or "Atavism"

Galton's (1877) groundbreaking paper, "Typical Laws of Heredity," dealt with a problem that had preoccupied him for a while, indeed since his 1869 book Hereditary Genius (Galton 1869): Why do the characteristics (mean and variance) of a hereditary attribute (such as height) from an isolated human population remain constant from generation to generation? To explain the constancy in attributes, Galton invoked reversion, also known as atavism, which is the genetic process by which an individual resembles a grandparent or more distant ancestor with respect to some trait not possessed by the parents. This can happen, for example, if a recessive and previously suppressed trait reappears through the combination of two recessive alleles in a genotype. Alternatively, the process of recombination can give rise to a unique constellation of genes resulting in a long suppressed character reappearing (these two mechanisms were not known to Galton, as Mendelism was yet to be rediscovered in 1900). Atavism is thus reversion to ancestral type and was well known by Galton's contemporaries, including Darwin (1859, p. 14), who, in fact, first proposed it. This genetic process is quite different from the purely statistical phenomenon that Galton soon discovered and at first identified with reversion.

There is undeniable evidence that Galton believed that atavism was the process that would revert offspring's traits to those of their distant ancestors. Thus, back in 1865, he made the following statement:

Lastly, though the talent and character of both of the parents might, in any particular case, be of a remarkably noble order, and thoroughly congenial, yet they would necessarily have such mongrel antecedents that it would be absurd to expect their children to invariably equal them in their natural endowments. The law of atavism prevents it. (Galton 1865, p. 319)

Galton (1877) explained in his paper that he resorted to experiments with sweet peas to answer his questions. He sorted a large number of sweet pea seeds into seven equally spaced size (weight) classes and sent each of his friends seven packets, each containing 10 seeds of a given class size. The seeds from the offspring were then collected and sent back to Galton. From his analysis of the seed results, Galton made the following two key conclusions:

1. For a given parental class size, the size of the filial seeds was normally distributed, with the same probable error e_p within each class (i.e., the same family variability).

© American Statistical Association

Page 48

Graphical interpretation

via scatter plot

The Book of Why: The New Science of Cause and Effect – Pearl and Mackenzie

Figure 2. The scatterplot shows a dataset of heights, with each dot representing the height of a father (on the X-axis) and his son (on the Y-axis). The dashed line coincides with the major axis of the ellipse, while the solid line (called the regression line) connects the rightmost and leftmost points on the ellipse. The difference between them accounts for regression to the mean. For example, the black star shows that 72-inch fathers have, on the average, 71-inch sons. (That is, the average height of all the data points in the vertical strip is 71 inches.) The horizontal strip and white star show that the same loss of height occurs in the non-causal direction (backward in time). (Figure by Maayan Harel, with a contribution from Christopher Boucher.)

REGRESSION OF Y ON X

AXIS OF THE ELLIPSE (THE «SD LINE»)

45°
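The figure's point, that the same "loss of height" appears in both the causal and the non-causal direction, can be reproduced with simulated heights (the mean of 68 in, standard deviation of 2 in, and correlation of 0.5 used below are hypothetical parameters, so the conditional means come out near 70 in rather than the 71 in of the figure):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sd, r, n = 68.0, 2.0, 0.5, 2_000_000     # hypothetical height parameters (inches)

father = rng.normal(mu, sd, n)
son = mu + r * (father - mu) + rng.normal(0.0, sd * np.sqrt(1 - r**2), n)

tall_fathers = np.abs(father - 72.0) < 0.25  # fathers of about 72 in
tall_sons = np.abs(son - 72.0) < 0.25        # sons of about 72 in
print(son[tall_fathers].mean())              # ~70: shorter than their fathers, on average
print(father[tall_sons].mean())              # ~70: ALSO shorter, in the non-causal direction
```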

Page 49

Figure. Ellipse generated by Galton by joining points with the same frequency.

irrefutable confirmation needed a proper mathematical analysis, a task that Galton thought was beyond his analytical skills. Therefore, he solicited the help of the able mathematician J. Hamilton Dickson. In modern mathematical language (this is exactly how the problem was phrased: "A point P is capable of moving along a straight line P'OP, making an angle tan⁻¹(2/3) with the axis of y, which is drawn through O the mean position of P; the probable error of the projection of P on Oy is 1.22 inch: another point p, whose mean position at any time is P, is capable of moving from P parallel to the axis of x (rectangular coordinates) with a probable error of 1.50 inch. To discuss the 'surface of frequency of p'" (Galton 1886)), Dickson was provided with the information that Y ∼ N(0, σ²_Y) and X|Y ∼ N(β_{X|Y} y, σ²_{X|Y}), and was asked the following questions:

1. What is the joint density of (X, Y), and what is the shape of the contours of equal probability density?
2. How can the regression coefficient β_{Y|X} be calculated?
3. What is the density of Y given X?
4. What is the relationship between β_{Y|X} and β_{X|Y}?

Dickson answered each of the above questions without much trouble, and the solution was published as an Appendix to Galton's (1886) paper "Family Likeness in Stature." In modern notation, the joint density of X and Y is

f_{XY}(x, y) = f_Y(y) f_{X|Y}(x|y) ∝ exp{ −[ y²/(2σ²_Y) + (x − β_{X|Y} y)²/(2σ²_{X|Y}) ] }.   (4)

(Pearson (1921) expressed his puzzlement as to why Galton did not himself derive the joint density of X and Y, since he already knew both f_Y(y) and f_{X|Y}(x|y). However, it is unlikely that Galton thought in terms of conditional and marginal distributions.)

To obtain the contours of equal probability, Dickson sets the expression in the above exponent to a constant (say K):

y²/σ²_Y + (x − β_{X|Y} y)²/σ²_{X|Y} = K,   (5)

which is the equation of a set of ellipses.

To obtain the required regression coefficient β_{Y|X}, Equation (5) is first differentiated:

(y/σ²_Y)(dy/dx) + (x − β_{X|Y} y)(1 − β_{X|Y} dy/dx)/σ²_{X|Y} = 0,

so that

(y/σ²_Y) dy + (x − β_{X|Y} y)(dx − β_{X|Y} dy)/σ²_{X|Y} = 0.

By setting the coefficient of dy to zero, tangents to the ellipse in Equation (5) parallel to the y-axis can be obtained, and these intersect the ellipse at points lying on the line OM (see Figure 2) with the following equation:

y/σ²_Y − (x − β_{X|Y} y) β_{X|Y}/σ²_{X|Y} = 0,

or

y = [β_{X|Y} σ²_Y / (σ²_{X|Y} + β²_{X|Y} σ²_Y)] x.

Thus,

β_{Y|X} = β_{X|Y} σ²_Y / (σ²_{X|Y} + β²_{X|Y} σ²_Y).   (6)

REGRESSION OF CHILDREN

ON PARENTS

REGRESSION OF PARENTS

ON CHILDREN
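Dickson's relation (6) between the two regression coefficients can be verified numerically under the same model he was given, comparing the formula with an empirical least-squares slope of Y on X (the parameter values below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
sig_y, b_xy, sig_xgy, n = 1.5, 0.8, 1.0, 1_000_000  # arbitrary illustrative parameters

Y = rng.normal(0.0, sig_y, n)
X = b_xy * Y + rng.normal(0.0, sig_xgy, n)          # X | Y ~ N(b_xy * y, sig_xgy^2)

b_yx_formula = b_xy * sig_y**2 / (sig_xgy**2 + b_xy**2 * sig_y**2)  # Equation (6)
b_yx_lsq = np.cov(X, Y)[0, 1] / np.var(X)                           # least-squares slope of Y on X
print(b_yx_formula, b_yx_lsq)                                       # the two values agree
```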

Page 50

There are always two regression lines!

• Regression of Y on X
• Regression of X on Y
• They coincide only if the correlation is ±1
• The paradox is only apparent: they answer distinct questions
• Predict Y given X: use the regression of Y on X
• Predict X given Y: use the regression of X on Y
• Non-Gaussian data: they are the lines that minimize the mean-square prediction error given either X or Y
• No causality assumption is needed: it is just a consequence of the joint distribution
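The "distinct questions" point can be made concrete: to predict Y, the regression of Y on X has a strictly smaller mean-square error than the inverted regression of X on Y (the correlation r = 0.6 and the unequal variances below are illustrative choices, picked so the two lines are visibly different):

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 500_000, 0.6
x = rng.normal(size=n)
y = 2.0 * (r * x + np.sqrt(1 - r**2) * rng.normal(size=n))  # corr(x, y) = r, sd(y) = 2

c = np.cov(x, y)[0, 1]
slope_y_on_x = c / np.var(x)   # ~ r * 2 = 1.2, for predicting Y from X
slope_x_on_y = c / np.var(y)   # ~ r / 2 = 0.3, for predicting X from Y

# Predicting Y with the Y-on-X line versus the inverted X-on-Y line:
mse_y_on_x = np.mean((y - slope_y_on_x * x) ** 2)
mse_inverted = np.mean((y - x / slope_x_on_y) ** 2)
print(mse_y_on_x, mse_inverted)  # the first is strictly smaller
```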

Page 51

Regression toward the mean: is it universal?

It is not universal!

Statistical Reversion Toward the Mean: More Universal Than Regression Toward the Mean

MYRA L. SAMUELS*

Schmittlein discussed the lack of universality of regression toward the mean. The present note emphasizes the universality of a similar effect, dubbed "reversion" toward the mean, defined as the shift in conditional expectation of the upper or lower portion of a distribution. Reversion toward the mean is a useful concept for statistical reasoning in applications and is more self-evidently plausible than regression toward the mean.

KEY WORDS: Probability mixture models; Regression to the mean; Reversion to the mean.

1. INTRODUCTION

In a recent commentary, Schmittlein (1989) demonstrated that the phenomenon of statistical regression toward the mean is by no means universal, even in mixture models where universality might have been hoped for. The purpose of the present note is to point out that a closely related phenomenon is much more nearly universal and to present a heuristic proof of this universality that is easily accessible to nonstatisticians.

2. REGRESSION AND REVERSION TOWARD THE MEAN

Let X₁ and X₂ be random variables with joint distribution function F. Assume that X₁ and X₂ have the same marginal distribution and let μ denote their common mean. The distribution F exhibits regression toward the mean if, for all c > μ,

μ ≤ E[X₂ | X₁ = c] < c,   (1)

with the reverse inequalities holding for c < μ.

As noted by Schmittlein (1989), Galton (1877) originally used the term reversion rather than regression. Let us resurrect this archaic term for a new use, and say that F exhibits reversion toward the mean if, for any c,

μ ≤ E[X₂ | X₁ > c] < E[X₁ | X₁ > c]   (2a)

and

μ ≥ E[X₂ | X₁ < c] > E[X₁ | X₁ < c].   (2b)

(To avoid trivialities, we restrict attention throughout to values of c for which the conditional expectations are defined.) Clearly, regression toward the mean implies reversion toward the mean, but not vice versa.

As defined by (2), reversion toward the mean occurs when the conditional mean of the upper or lower portion of the distribution shifts, or reverts, toward the unconditional mean μ. In many applications the upper or lower portion of interest would be a small portion (a tail) of the distribution, but reversion is not restricted to this case; note that (2) places no restriction on the location of c relative to μ.

Reversion can serve as well as regression to motivate statistical cautionary tales. For example, educational researchers can be warned that a group of school children selected because their performance is below some cutoff would be expected, on the average, to show improvement when observed later. Or medical investigators can be alerted that patients selected for levels of serum potassium higher than some cutoff would be expected, on the average, to show reduced levels when observed later.

The phenomenon (2) has been discussed in this kind of context (for example, by Davis 1976, 1986; McDonald, Mazzuca, and McCabe 1983), but terminology is often vague and the distinction between (1) and (2) is frequently blurred. Recently Senn (1990) has emphasized that (2) is a phenomenon of practical importance and has suggested that both (1) and (2) should be considered forms of regression toward the mean. It might be less confusing, however, to give distinct names to these distinct phenomena. Since the conditional expectation function f(x) = E[X₂ | X₁ = x] is generally called the "regression" function (Rao 1973, p. 264; Dixon and Massey 1983, pp. 210-211; Kendall, Stuart, and Ord 1987, p. 524), it seems appropriate to continue to refer to (1) as "regression" and to choose a different name for (2).

3. THE UNIVERSALITY OF REVERSION TOWARD THE MEAN

To investigate the universality of reversion toward the mean, it is helpful to split the definition into two parts by noting that (2) holds iff

E[X₂ | X₁ > c] < E[X₁ | X₁ > c]  for all c,   (3a)

E[X₂ | X₁ < c] > E[X₁ | X₁ < c]  for all c,   (3b)

and

E[X₂ | X₁ > c] ≥ μ,  E[X₂ | X₁ < c] ≤ μ  for all c.   (4)

The condition (3) can be called reversion to the mean or beyond; (3a) asserts that the mean of the upper portion of the distribution reverts to a lower position, and (3b) asserts contrariwise for the mean of the lower portion. The additional requirement (4) assures that the reversion cannot go beyond the mean μ.

*Myra L. Samuels is Instructor and Assistant Supervisor of Statistical Consulting, Department of Statistics, Purdue University, West Lafayette, IN 47907.

The American Statistician, November 1991, Vol. 45, No. 4. © 1991 American Statistical Association


Surprising Inferences From Unsurprising Observations: Do Conditional Expectations Really Regress to the Mean?

DAVID C. SCHMITTLEIN*

Most social science descriptions of the statistical regression effect envision the effect as occurring toward the population mean. If individuals that initially had extreme values regress back toward the mean on subsequent observations, then as a corollary individuals who were at the population mean initially are expected to stay at the population mean. Both the regression to the mean statement and its corollary are generally false for models exhibiting a regression effect. For commonly used probability mixture models, conditional expectations of subsequent observations based on previous observations regress not to the mean, but to some other value. Examples are presented using mixtures of normal, Poisson, and binomial random variables.

KEY WORDS: Prior distributions; Probability mixture models; Regression to the mean.

1. INTRODUCTION

A statistical phenomenon that has misled researchers is the so-called regression effect. Test scores change as a statistical fact of life: on retest, on the average, they regress toward the mean. The regression effect operates because of the imperfect correlation between the pretest and posttest scores. (Kerlinger 1986, p. 296)

Regression is always to the population mean of a group. Its magnitude depends both on the test-retest reliability of a measure and on the difference between the mean of a deliberately selected subgroup and the mean of the population from which the subgroup was chosen. (Cook and Campbell 1979, pp. 52-53)

These two quotations motivate a concept common in social science research, which is usually called "statistical regression" or "regression to the mean." In addition to the foregoing descriptions, which are taken from popular research methodology textbooks for the social sciences, the concept of regression to the mean surfaces often in several particular disciplines, including psychology:

Regression simply means that through . . . chance variation, extreme scores tend to "regress" or move toward the mean upon retesting. For example, if children are given Form L of the Stanford-Binet and then retested with Form M six months later, there will be a tendency for those who scored above average on the first test to fall closer to the mean on the second test. (Anastasi 1958, p. 203)

and marketing research:

This regression-toward-the-mean phenomenon is found universally for characteristics that vary not only among people but also over time for each person. . . . In fact, thinking about this phenomenon, one may realize that it is no remarkable discovery but a logical inevitability. (Greene 1982, pp. 29-31)

The italics (added) in each quotation underscore the special role that the population mean is presumed to play in such an effect: it is this quantity toward which an individual's future behavior "regresses." Explicit consideration of this regression effect is important in many longitudinal studies. It must be considered when groups of individuals being analyzed have been created based on an initial "score" of each individual, with that score having a stochastic component. Nesselroade, Stigler, and Baltes (1980) listed studies in a number of disciplines that have been seriously compromised through failure to consider such an effect.

The intent in this article is to show that there really is no logical imperative for the regression effect to occur toward the mean. Simple, reasonable processes in common use exhibit regression to a quantity other than the mean. It is only a specific class of models that produces regression to the mean. In describing this class we shall see that it includes, but is not limited to, models based on conjugate priors for the distribution of a latent trait across a population.

There is a sort of corollary to this regression-to-the-mean phenomenon that will also be of interest. It concerns the expected future behavior of an individual who scores exactly the population mean in an initial observation period. Presumably, the expected future observed score is a continuous, monotonically increasing function of the initial observed score. Then if initial scores above the mean regress back toward the mean, and if initial scores below the mean regress up toward the mean, what do we expect of an individual who initially scores exactly the population mean? Within this logical framework we must expect this individual's future score to be unchanged. That is, the population mean is the only fixed point for the function linking the observed

*David C. Schmittlein is Associate Professor, Marketing Department, the Wharton School, University of Pennsylvania, Philadelphia, PA 19104.

The American Statistician, August 1989, Vol. 43, No. 3. © 1989 American Statistical Association
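Schmittlein's point is easy to reproduce in a two-component normal mixture in which X₁ and X₂ are conditionally i.i.d. given the latent component (the weights and component means below are arbitrary choices, not taken from the paper): an individual who scores exactly the population mean is not expected to stay there, so the mean is not the fixed point of the regression function.

```python
import numpy as np

def cond_mean_next(c, p=(0.7, 0.3), m=(-1.0, 4.0), s=1.0):
    """E[X2 | X1 = c] when X1, X2 are conditionally i.i.d. N(m_k, s^2)
    given a latent component k drawn with probabilities p."""
    w = np.array(p) * np.exp(-(c - np.array(m)) ** 2 / (2 * s**2))
    w = w / w.sum()                 # posterior component probabilities given X1 = c
    return float(w @ np.array(m))   # posterior expectation of the component mean

mu = 0.7 * (-1.0) + 0.3 * 4.0       # population mean: 0.5
print(cond_mean_next(mu))           # about -0.99, well below mu: the mean is not a fixed point
```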


Page 52

Theorem: If X₁ and X₂ are identically distributed and the nondegeneracy condition (5) holds, then (2a) holds.


Reversion to the mean or beyond is virtually a universal phenomenon. Only the nondegeneracy condition

Pr[X2 > c | X1 > c] < 1, Pr[X2 < c | X1 < c] < 1, for all c (5)

is required to assure that the inequalities in (3) are strict.

Proposition. If X1 and X2 are identically distributed and (5) holds, then (3) holds; that is, F exhibits reversion to the mean or beyond.

Here are two proofs of the proposition. The first is a straightforward mathematical proof. (A different mathematical proof, assuming X1 and X2 nonnegative, was given by McDonald et al. 1983.)

Mathematical Proof. Let Ii, for i = 1, 2, denote indicator random variables defined by

Ii = 1 if Xi > c, and Ii = 0 otherwise,

and let J = I1 − I2. Then

E[X2 I1] = E[X2 I2] + E[X2 J] = E[X1 I1] + E[X2 J]. (6)

(The second equality follows because X1 and X2 are identically distributed.) The fact that X2 ≤ c when J = 1 and X2 > c when J = −1, together with (5), implies that

E[X2 J] < c Pr[J = 1] − c Pr[J = −1] = c E[J] = 0,

so that (6) yields E[X2 I1] < E[X1 I1], from which (3a) follows immediately. Relation (3b) is proved similarly.
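The proposition is easy to check numerically. A minimal Monte Carlo sketch in plain Python (standard library only; the function name, the correlation value, and the sample size are illustrative assumptions, not from the paper), using a bivariate normal pair with identical standard normal marginals:

```python
import math
import random

def conditional_means(c, rho=0.6, n=200_000, seed=1):
    """Estimate E[X1 | X1 > c] and E[X2 | X1 > c] for a standard
    bivariate normal pair with correlation rho (identical marginals)."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    sel1, sel2 = [], []
    for _ in range(n):
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = z0
        x2 = rho * z0 + s * z1  # same N(0, 1) marginal as x1
        if x1 > c:
            sel1.append(x1)
            sel2.append(x2)
    return sum(sel1) / len(sel1), sum(sel2) / len(sel2)

m1, m2 = conditional_means(1.0)
print(m1, m2)  # roughly 1.53 and 0.92
```

With ρ = 0.6 and c = 1, the truncated-normal calculation gives E[X1 | X1 > 1] ≈ 1.53 and E[X2 | X1 > 1] = ρ · 1.53 ≈ 0.92: the selected group reverts toward μ = 0 without going beyond it, exactly as (3) and (4) require.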

The second proof of the Proposition is an easily understood heuristic argument, framed in terms of IQ, a variate that has the same distribution at any age.

The Red T-Shirt Argument. Let X1 and X2 represent IQ at age 8 and at age 18, respectively. Visualize a very long row of chairs, each occupied by an 8-year-old child; each chair bears a label with the child's IQ, and the chairs are arrayed in nondecreasing order of IQ from left to right. Suppose a kindly teacher decides to reward the "smart" children whose IQ exceeds c; say, c = 120. She places a marker on the rightmost chair labeled 120, and all children sitting to the right of the marker receive red T-shirts. What happens to the mean IQ of the red T-shirted children as they grow from age 8 to 18? Imagine that the chairs and labels remain in place (representing the stationarity of the distribution) while the children get up, have various adventures (but retain their red T-shirts), and return at age 18 to take a chair corresponding to their current IQ. Possibly all red T-shirts are still sitting to the right of the marked chair; in this case [(5) being violated] the mean IQ of the red T-shirt group has not changed. But, if any red T-shirts have moved to the left of the marked chair, then clearly the mean IQ of the red T-shirt group must have decreased. This proves the first relation in (3); the second would be argued similarly.

If, in addition to having identical marginals, the distribution F satisfies the condition (4), then F exhibits reversion toward the mean (not beyond). Condition (4) asserts a weak form of positive dependence between X1 and X2, weaker than positive quadrant dependence as defined by Lehmann (1966), but stronger than nonnegative correlation. Reversion toward the mean is universal, then, among distributions with X1 and X2 identically distributed and weakly positively dependent in the sense of (4). In longitudinal studies, where X1 and X2 are measurements on the same subject, it would usually be reasonable to assume that (4) should hold. In such a setting, (4) simply asserts that the group of subjects whose initial scores are higher [lower] than c will score higher [lower] than average when measured subsequently; this will certainly be true if the expected future score of an individual is a monotonically increasing function of his initial score.

As mathematical models for longitudinal studies, Schmittlein (1989) considered latent trait mixture models, in which X1 is an observation from a mixture distribution of the form ∫ G(x | θ) dH(θ) and X2 is another observation for the same value of θ. It is easy to show that in these models (4) will hold if the distribution functions G(x | θ) are monotone in θ. Thus, for example, any distribution generated by mixing Poissons will exhibit reversion toward the mean, even though, as in examples 4.1 and 4.2 of Schmittlein (1989), it may not exhibit regression toward the mean. Similarly, any distribution generated by mixing normals (with equal variances) on their mean will exhibit reversion toward the mean, even though it may not exhibit regression toward the mean, and, indeed, it will generally (if it is unimodal) exhibit regression toward the mode, as shown by Das and Mulder (1983). As further examples, any distribution generated by mixing binomials (with equal n's) on their success probability, or by mixing gammas (with equal scale parameters) on their shape parameter, or by mixing gammas (with equal shape parameters) on their scale parameter, will exhibit reversion toward the mean.
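The claim that any Poisson mixture reverts toward the mean can be checked by simulation. A minimal sketch in plain Python (standard library only; the Gamma mixing distribution, its parameters, and the cutoff are illustrative assumptions):

```python
import math
import random

def poisson_draw(rng, lam):
    # Knuth's multiplication method; adequate for small lam
    target, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= target:
            return k
        k += 1

def mixture_reversion(c=3, shape=2.0, scale=1.0, n=200_000, seed=2):
    """X1, X2 are Poisson(lam) counts sharing lam ~ Gamma(shape, scale):
    a mixture of Poissons with identical marginals (mean 2 here).
    Returns the conditional means E[X1 | X1 > c] and E[X2 | X1 > c]."""
    rng = random.Random(seed)
    sel1, sel2 = [], []
    for _ in range(n):
        lam = rng.gammavariate(shape, scale)
        x1, x2 = poisson_draw(rng, lam), poisson_draw(rng, lam)
        if x1 > c:
            sel1.append(x1)
            sel2.append(x2)
    return sum(sel1) / len(sel1), sum(sel2) / len(sel2)

m1, m2 = mixture_reversion()
print(m1, m2)  # mu = 2 <= m2 < m1: reversion toward the mean
```

For these parameters the exact values are E[X1 | X1 > 3] = 16/3 ≈ 5.33 and, since E[λ | X1 = k] = (2 + k)/2, E[X2 | X1 > 3] ≈ 3.67: the selected group reverts toward μ = 2 but stays above it.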

The preceding discussion has assumed X1 and X2 to be identically distributed. If they are not, the notions of regression and reversion toward the mean can be generalized to mean that the standardized random variables Xi* = (Xi − μi)/σi exhibit regression or reversion toward zero, where μi and σi are the mean and standard deviation of Xi, for i = 1, 2. Clearly, reversion toward the mean in this generalized sense is universal whenever X1 and X2 belong to the same location-scale family and are positively dependent in the sense that the Xi* satisfy (4).

4. THE MAGNITUDE OF THE REVERSION EFFECT

When both regression and reversion toward the mean occur, we can compare the magnitudes of the two effects. The comparison is very simple in the case of linear


Interpretation: if you select subjects whose X1 is above a threshold, then the average of X2 will be smaller than that of X1.

Statistical Reversion Toward the Mean: More Universal Than Regression Toward the Mean

MYRA L. SAMUELS*

Schmittlein discussed the lack of universality of regression toward the mean. The present note emphasizes the universality of a similar effect, dubbed "reversion" toward the mean, defined as the shift in conditional expectation of the upper or lower portion of a distribution. Reversion toward the mean is a useful concept for statistical reasoning in applications and is more self-evidently plausible than regression toward the mean.

KEY WORDS: Probability mixture models; Regression to the mean; Reversion to the mean.

1. INTRODUCTION

In a recent commentary, Schmittlein (1989) demonstrated that the phenomenon of statistical regression toward the mean is by no means universal, even in mixture models where universality might have been hoped for. The purpose of the present note is to point out that a closely related phenomenon is much more nearly universal and to present a heuristic proof of this universality that is easily accessible to nonstatisticians.

2. REGRESSION AND REVERSION TOWARD THE MEAN

Let X1 and X2 be random variables with joint distribution function F. Assume that X1 and X2 have the same marginal distribution and let μ denote their common mean. The distribution F exhibits regression toward the mean if, for all c > μ,

μ ≤ E[X2 | X1 = c] < c, (1)

with the reverse inequalities holding for c < μ.

As noted by Schmittlein (1989), Galton (1877) originally used the term reversion rather than regression. Let us resurrect this archaic term for a new use, and say that F exhibits reversion toward the mean if, for any c,

μ ≤ E[X2 | X1 > c] < E[X1 | X1 > c] (2a)

and

μ ≥ E[X2 | X1 < c] > E[X1 | X1 < c]. (2b)

(To avoid trivialities, we restrict attention throughout to values of c for which the conditional expectations are defined.) Clearly, regression toward the mean implies reversion toward the mean, but not vice versa.

As defined by (2), reversion toward the mean occurs when the conditional mean of the upper or lower portion of the distribution shifts, or reverts, toward the unconditional mean μ. In many applications the upper or lower portion of interest would be a small portion (a tail) of the distribution, but reversion is not restricted to this case; note that (2) places no restriction on the location of c relative to μ.

Reversion can serve as well as regression to motivate statistical cautionary tales. For example, educational researchers can be warned that a group of school children selected because their performance is below some cutoff would be expected, on the average, to show improvement when observed later. Or medical investigators can be alerted that patients selected for levels of serum potassium higher than some cutoff would be expected, on the average, to show reduced levels when observed later.

The phenomenon (2) has been discussed in this kind of context (for example, by Davis 1976, 1986; McDonald, Mazzuca, and McCabe 1983), but terminology is often vague and the distinction between (1) and (2) is frequently blurred. Recently Senn (1990) has emphasized that (2) is a phenomenon of practical importance and has suggested that both (1) and (2) should be considered forms of regression toward the mean. It might be less confusing, however, to give distinct names to these distinct phenomena. Since the conditional expectation function f(x) = E[X2 | X1 = x] is generally called the "regression" function (Rao 1973, p. 264; Dixon and Massey 1983, pp. 210-211; Kendall, Stuart, and Ord 1987, p. 524), it seems appropriate to continue to refer to (1) as "regression" and to choose a different name for (2).

3. THE UNIVERSALITY OF REVERSION TOWARD THE MEAN

To investigate the universality of reversion toward the mean, it is helpful to split the definition into two parts by noting that (2) holds iff

E[X2 | X1 > c] < E[X1 | X1 > c], for all c, (3a)

E[X2 | X1 < c] > E[X1 | X1 < c], for all c, (3b)

and

E[X2 | X1 > c] ≥ μ, E[X2 | X1 < c] ≤ μ, for all c. (4)

The condition (3) can be called reversion to the mean or beyond; (3a) asserts that the mean of the upper portion of the distribution reverts to a lower position, and (3b) asserts contrariwise for the mean of the lower portion. The additional requirement (4) assures that the reversion cannot go beyond the mean μ.

*Myra L. Samuels is Instructor and Assistant Supervisor of Statistical Consulting, Department of Statistics, Purdue University, West Lafayette, IN 47907.

344 The American Statistician, November 1991, Vol. 45, No. 4 © 1991 American Statistical Association



Page 53: The Fallacies of Learning

Placebo effect: is it genuine?

Significance, September 2011, p. 125

different factor, yet the result came out closely the same.

The year being 1886, the computer in question was, of course, a human and not an electronic assistant! The more interesting point, however, is that Galton is describing what we would now call robustness in statistics – and, simultaneously, provides an early example of what is now recognised as a general scientific phenomenon: scientists never seem to fail the robustness checks they report. It is interesting to speculate why.

Galton's data consisted of 928 adult children and 205 "parentages" – that is to say, father-and-mother couples. (The mean number of children per couple was thus just over 4.5 – families were larger in those days.) He represented the height of parents using a single statistic, the "mid-parent", this being the mean of the height of the father and of his wife's height multiplied by 1.08. Of course, as previously noted, for the female children the heights were also multiplied by 1.08. For the male children they were unadjusted.

Figure 1 is a modern graphical representation of Galton's data. Galton had grouped his results by intervals of 1 inch, and in consequence, if a given child's recorded height were plotted against its recorded mid-parent height, many points would be superimposed on top of each other. I have added a small amount of "jitter" in either dimension to separate the points, which are shown in blue. The data are plotted in two ways: child against mid-parent on the left and mid-parent against child on the right. The thin solid black diagonal line in each case is the line of equality. If a point lies on this line then child and mid-parent were identical in height. Also shown in red in each case are two different approaches one might use to predict "output" from "input". The dashed line is the least squares fit, what we now (thanks to Galton) call a regression line. The thick red line is a more local fit, in fact a so-called LOWESS (or locally weighted scatterplot smoothing) line. The point about either of these two approaches – irrespective of whether we predict child from mid-parent or vice versa – is that the line that is produced is less steep than the line of exact equality. The consequence is that we may expect that an adult child is closer to average height than its parents – but also, paradoxically, that parents are closer to average height than is their child.

The first part we might expect. The second may seem absurd – but is just as true. I will say it again: a tall child will have parents, on average, less tall than himself. This particular point is both deep and trivial. It is deep because the first time that students encounter it (I can still remember my own reaction) they assume that it is wrong; its truth is well hidden. Once understood, however, it becomes so obvious that one is amazed at how regularly it is overlooked. It is a point not about genetics but about statistics.

In fact I am confident that at this stage I can divide my readers into two: those who will claim that I am wasting their time repeating a hackneyed truth, and those who will say that I have so far failed in anything that I have said to show that the hackney in question is a genuine carriage. So I will say farewell to the members of the first group and address myself to the second – but before I say goodbye to the first I will ask them one question. Do you think that there is good evidence that the placebo effect is genuine? If so, stick around for a while because I will try and show you that you (and ten thousand physicians with you) are wrong. What this has to do with Francis Galton will be revealed in due course.

So let us leave Francis Galton for the moment and consider another example, this time a simulated one. Figure 2 shows simulated values in diastolic blood pressure (DBP) for a group of 1000 individuals measured on two occasions: at baseline and at outcome. ("Outcome" simply means "some time later"; they have not received any medical treatment between the two occasions.) If your blood pressure is high, you are hypertensive. Using a common but arbitrary definition of hypertension as a diastolic pressure of 95 mmHg or more, the subjects have been classified as consistently hypertensive (red diamonds), consistently normotensive (blue circles) or inconsistent – hypertensive on one occasion, normal on the other (orange stars). The distributions at outcome and baseline are very similar, with means close to 90 mmHg and a spread that can be defined by that Galtonian statistic, the inter-quartile range, as being close to 11 mmHg on either occasion. In other words, what the picture shows is a population that all in all has not changed over time although, since the correlation (to use another Galtonian term) is just under 0.8 and therefore less than 1, there is, of course, variability over time for many individuals. Some have increased their blood pressure between the measurements, some have reduced it.
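A simulation of this kind is easy to reproduce. A minimal sketch in plain Python (standard library only); the marginal standard deviation of about 8.2 mmHg is back-calculated from the quoted inter-quartile range of 11 mmHg, the correlation of 0.8 and the cutoff of 95 mmHg are taken from the text, and everything else (function name, seed) is an illustrative assumption:

```python
import math
import random

def dbp_study(n=1000, mean=90.0, sd=8.2, rho=0.8, cutoff=95.0, seed=7):
    """Simulate DBP (mmHg) at baseline and outcome for n untreated
    subjects, then keep only those hypertensive (>= cutoff) at baseline."""
    rng = random.Random(seed)
    s = math.sqrt(1 - rho * rho)
    base, out = [], []
    for _ in range(n):
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        b = mean + sd * z0
        o = mean + sd * (rho * z0 + s * z1)  # same marginal; no treatment
        if b >= cutoff:
            base.append(b)
            out.append(o)
    return len(base), sum(base) / len(base), sum(out) / len(out)

k, mb, mo = dbp_study()
print(k, mb - mo)  # roughly 270 selected; mean DBP about 2 mmHg lower at outcome
```

With these assumptions the selected group's mean DBP falls by about 2 mmHg between baseline and outcome even though no one was treated, which matches the apparent "improvement" described below.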

However, in the setting of many clinical trials, Figure 2 is not a figure we would see. The reason is simple: we would not follow up individuals who were observed to be normotensive at baseline. If you are "healthy", we would not bother to call you back for the second test.

Figure 1. Galton's height data: two scatterplots showing the regression phenomenon (drawn from data listed at http://www.math.uah.edu/stat/data/Galton.txt)

Figure 2. Simulated diastolic blood pressure for 1000 patients measured on two occasions – blue circles, normotensive on both occasions; red diamonds, hypertensive on both occasions; orange stars, inconsistent

Senn, S. (2011). Francis Galton and regression to the mean. Significance, 8(3), 124-126.

Page 54: The Fallacies of Learning

Instead what doctors and medics see is the picture given in Figure 3. Of the 1000 subjects seen at baseline, 285 had DBP values in excess of 95 mmHg. We did not bother to call the other 715 back; we concentrated instead on those we deemed to have a medical problem – and those are the only ones shown in the figure. And if now we compare the outcome values of those 285 subjects we have left to the values they showed at baseline, we will find that mean DBP seems to have gone down. At outcome it is more than 2 mmHg lower than it was at baseline. What we have just observed is what Francis Galton called regression to the mean.

There has been an apparent spontaneous improvement in blood pressure. Apparently many patients who were hypertensive at baseline became normotensive. It is important to understand here that this observed "improvement" is a consequence of this stupid (but very common) way of looking at the data. It arises because of the way we select the values. What is missing because of our selection method is bad news. We can only see patients who remain hypertensive or who become normotensive. We left out the patients who were normotensive but became hypertensive. They are shown in Figure 4. If we had their data they would correct the misleading picture in Figure 3, but the way we have gone about our study means that we will not see their outcome values.

Regression to the mean is a consequence of the observation that, on average, extremes do not survive. In our height example, extremely tall parents tend to have children who are taller than average and extremely small parents tend to have children who are smaller than average, but in both cases the children tend to be closer to the average than were their parents. If that were not the case the distribution of height would have to get wider over time. Of course there can be changes in such distributions over time and it is the case that people are taller now than in Galton's day, but this is a separate phenomenon in addition to regression to the mean.

However, regression to the mean is not restricted to height nor even to genetics. It can occur anywhere where repeated measurements are taken.

Does it happen that scientists get fooled by Galton's regression to the mean? All the time! Right this moment all over the world in dozens of disciplines, scientists are fooling themselves either by not having a control group, which would also show the regression effect, or, if they do have a control group, by concentrating on the differences within groups between outcome and baseline rather than the differences between groups at outcome. It is regression to the mean that is a very plausible explanation for the placebo effect, since entry into clinical trials is usually only by virtue of an extreme baseline value. This does not matter as long as you compare the treated group to the placebo group, since both groups will regress to the mean. It does mean, however, that you have to be very careful before claiming that any improvement in the placebo group is due to the healing hands of the physician or psychological expectancy.

To prove that would require a three-arm trial: an active group, a placebo group and a group given nothing at all. Then all three groups would have the same regression to the mean improvement and differences between the placebo and the open arm could be judged to be due to a true placebo effect. Not surprisingly, very few such trials have been run. However, analysis of those that have been run suggests that only in the area of pain control do we have reliable evidence of a placebo effect [4, 5].

But regression to the mean is not just limited to clinical trials. Did you choose dangerous road intersections in your region for corrective engineering work based on their record of traffic accidents? Did you fail to have a control group of similar black spots that went untreated? Are you going to judge efficacy of your intervention by comparing before and after? Then you should know that Francis Galton's regression to the mean predicts that sacrificing a chicken on such black spots can be shown to be effective by the methods you have chosen [6]. Did you give failing students a remedial class and did they improve again when tested? Are you sure that subsequence means consequence? What have you overlooked?

A Victorian eccentric who died 100 years ago, although no great shakes as a mathematician, made an important discovery of a phenomenon that is so trivial that all should be capable of learning it and so deep that many scientists spend their whole career being fooled by it.

References
1. Forrest, D.W. (1974) Francis Galton: The Life and Work of a Victorian Genius. London: Paul Elek.
2. David, H.A. (1995) First (questionable) occurrence of common terms in mathematical statistics. American Statistician, 49, 121-133.
3. Galton, F. (1886) Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute of Great Britain and Ireland, 15, 246-263.
4. Hrobjartsson, A. and Gotzsche, P.C. (2001) Is the placebo powerless? An analysis of clinical trials comparing placebo with no treatment. New England Journal of Medicine, 344, 1594-1602.
5. Kienle, G.S. and Kiene, H. (1997) The powerful placebo effect: fact or fiction? Journal of Clinical Epidemiology, 50, 1311-1318.
6. Senn, S.J. (2003) Dicing with Death. Cambridge: Cambridge University Press.

Stephen Senn is a professor of statistics at Glasgow University. His book Dicing with Death is published by Cambridge University Press and covers (among other matters) Francis Galton and the history of regression to the mean.


Figure 3. Diastolic blood pressure on two occasions for patients observed to be hypertensive at baseline

Figure 4. Patients from Figure 2 who were normotensive at baseline but hypertensive at outcome

What if I study only patients that were hypertensive at baseline?

There has been an apparent spontaneous improvement in blood pressure. Apparently many patients who were hypertensive at baseline became normotensive. What is missing because of our selection method is bad news. We can only see patients who remain hypertensive or who become normotensive. We left out the patients who were normotensive but became hypertensive.

Patients that improve: good news (observed) vs. bad news (unobserved)

Page 55: The Fallacies of Learning

Dangerous crossroads: how to (apparently) secure them by Galton's regression to the mean (or by killing chickens)

Page 56: The Fallacies of Learning

Expected #accidents 1981-1982 conditional on #accidents 1979-1980

Page 57: The Fallacies of Learning

Expected #accidents 1981-1982 conditional on #accidents 1979-1980

There is an apparent improvement, but let us reverse the time axis.

Page 58: The Fallacies of Learning

Expected #accidents 1979-1980 conditional on #accidents 1981-1982

Sites decrease their accidents if we go back in time! Absurd? No! Just an instance of regression to the mean.

Page 59: The Fallacies of Learning

Expected #accidents 1981-1982 conditional on #accidents 1979-1980

Mean #accidents 1979-1980 = 0.98
Mean #accidents 1981-1982 = 0.96

During 1979-1980 most sites (1779 out of 3112) had zero accidents.

Sites over the 1979-1980 mean (0.98) tend to improve in 1981-1982; sites under the 1979-1980 mean tend to worsen in 1981-1982.
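The black-spot effect, including its absurd-looking time reversal, can be reproduced with a toy model in which each site has a fixed accident rate and the counts in the two periods are independent Poisson draws around that rate. A minimal sketch (plain Python standard library; the Gamma parameters and site count are illustrative assumptions chosen to give a mean near 0.98 accidents and a majority of zero-accident sites):

```python
import math
import random

def poisson(rng, lam):
    # Knuth's multiplication method; adequate for small lam
    target, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= target:
            return k
        k += 1

def accident_counts(n_sites=30_000, shape=0.6, scale=1.633, seed=3):
    """Per-site accident risk lam ~ Gamma; the counts in two successive
    two-year periods are independent Poisson(lam) draws for that site."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_sites):
        lam = rng.gammavariate(shape, scale)
        pairs.append((poisson(rng, lam), poisson(rng, lam)))
    return pairs

def cond_mean(pairs, select, report):
    sel = [p for p in pairs if select(p)]
    return sum(report(p) for p in sel) / len(sel)

pairs = accident_counts()
# "Black spots": sites with 2+ accidents in the first period
early = cond_mean(pairs, lambda p: p[0] >= 2, lambda p: p[0])
late = cond_mean(pairs, lambda p: p[0] >= 2, lambda p: p[1])
# Reverse the time axis: select on the second period instead
late_r = cond_mean(pairs, lambda p: p[1] >= 2, lambda p: p[1])
early_r = cond_mean(pairs, lambda p: p[1] >= 2, lambda p: p[0])
print(late < early, early_r < late_r)  # prints: True True
```

Selecting the "dangerous" sites on either period makes the other period look better, in both directions of time: regression to the mean, with no engineering work and no chickens harmed.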

Page 60: The Fallacies of Learning

Did you choose dangerous road intersections in your region for corrective engineering work based on their record of traffic accidents? Did you fail to have a control group of similar black spots that went untreated? Are you going to judge efficacy of your intervention by comparing before and after? Then you should know that Francis Galton's regression to the mean predicts that sacrificing a chicken on such black spots can be shown to be effective by the methods you have chosen.

Senn, S. (2011). Francis Galton and regression to the mean. Significance, 8(3), 124-126.

Page 61: The Fallacies of Learning

A misleading plot

• X and Y correlated and with the same distribution
• Example: same variable measured before (X) and after (Y) the administration of an ineffective treatment
• Idea: assess the effect of treatment by comparing the change D = Y − X with the baseline X
• This amounts to studying D conditional on X
• Regression to the mean implies that if x > E[X], then E[Y | X = x] < x, that is E[D | X = x] < 0
• Conversely, if x < E[X], then E[D | X = x] > 0
• The plot tends to be very impressive: emphasis is either on the improvement of the previously underperforming subjects or on «convergence»

Page 62: The Fallacies of Learning

Simulated example: X and Y are both standardized Gaussians (rXY = 0.6)

[Two scatter plots of the simulated data, axes from −3 to 3. Left: Y vs. X, with the regression line of Y on X and the SD line. Right: D = Y − X vs. X, showing that the weaker improve (D > 0) while the stronger worsen (D < 0): the illusion of «convergence».]
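The simulated example can be reproduced with a short script. A sketch, matching the stated setup (standardized Gaussians with correlation 0.6, no real treatment effect):

```python
import math
import random

# X and Y standardized Gaussians with correlation rho = 0.6,
# and NO systematic change between the two measurements.
random.seed(0)
rho, n = 0.6, 100_000
xs, ds = [], []
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    y = rho * x + math.sqrt(1.0 - rho**2) * random.gauss(0.0, 1.0)
    xs.append(x)
    ds.append(y - x)  # the "change" D = Y - X

def mean(v):
    return sum(v) / len(v)

d_weak = mean([d for x, d in zip(xs, ds) if x < 0])    # below-average baseline
d_strong = mean([d for x, d in zip(xs, ds) if x > 0])  # above-average baseline
print(round(d_weak, 2), round(d_strong, 2))  # weak "improve", strong "worsen"
```

Despite the absence of any effect, the below-average group shows a positive mean change and the above-average group a negative one.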

Page 63

Page 64

Variation (productivity growth rate) vs baseline (1970 productivity)

M. Friedman: "I find it surprising that the reviewer and authors, all of whom are distinguished economists, thoroughly conversant with modern statistical methods, should have failed to recognize that they were guilty of the regression fallacy"

Page 65

Signs of convergence of students' performance, or Galton's regression to the mean?

Page 66

INFORMATION AND TRAINING PLAN ON THE OECD-PISA SURVEY AND OTHER NATIONAL AND INTERNATIONAL STUDIES

Hotel Hilton, Giardini Naxos (ME), 25-28 October 2010

PROGRAMME

Monday 25 October 2010

2.00 pm Registration of participants
3.00 pm Presentation of the initiative - Annamaria Leuzzi (Dir. Gen. per gli Affari Intern. - MIUR)
3.30 pm The Italian school system in light of the results of the national surveys - Piero Cipollone (President, INVALSI)
4.30 pm The Italian school system in light of the results of the international surveys - Bruno Losito
5.30 pm Coffee Break
6.00 pm Discussion
7.00 pm End of the first day

Tuesday 26 October 2010

9.00 am Introduction - Annamaria Leuzzi (Dir. Gen. per gli Affari Intern. - MIUR)
9.20 am Language competencies in the national and international surveys - Mimma Siniscalco


Page 67

In conclusion… some good news

Page 68

Trend: signs of convergence

Page 69

Mathematics

Page 70

Programme for International Student Assessment

PISA 2012 Results in Focus
What 15-year-olds know and what they can do with what they know

Standardized Tests: «Why learning lessons from PISA is as hard as predicting who will win a football match»

Page 71

PISA 2012 RESULTS IN FOCUS: WHAT 15-YEAR-OLDS KNOW AND WHAT THEY CAN DO WITH WHAT THEY KNOW © OECD 2014

PISA 2012 results show that many countries and economies have improved their performance, whatever their culture or socio-economic status.

For some of the countries and economies that improved their performance in one or more of the domains assessed, improvements are observed among all students: everyone "moved up". Other countries concentrated their improvements among their low-achieving students, increasing the share of students who begin to show literacy in mathematics, reading or science. Improvement in other countries, by contrast, is concentrated among high-achieving students, so the share of top-performing students grew.

Some of the highest-performing education systems were able to extend their lead, while others with very low performance have been catching up. This suggests that improvement is possible, whatever the starting point for students, schools and education systems.

[Scatter plot: annualised change in mathematics score (in score points), from 5 to −4, vs. mean mathematics score in PISA 2003 (350-600), with one point per country/economy (Japan, Ireland, Austria, Switzerland, Norway, Poland, Brazil, Greece, Luxembourg, Germany, France, Thailand, Turkey, Australia, Sweden, Tunisia, Mexico, Italy, Portugal, Denmark, Iceland, United States, Latvia, Hungary, Uruguay, Spain, Slovak Republic, Netherlands, New Zealand, Finland, Belgium, Canada, Macao-China, Hong Kong-China, Liechtenstein, Korea, Russian Federation, Czech Republic, Indonesia). The plot is divided into quadrants by the OECD average 2003 and the zero-change line: performance improved/deteriorated, PISA 2003 performance below/above the OECD average.]

Notes: Annualised score-point changes in mathematics that are statistically significant are indicated in a darker tone. The annualised change is the average annual change in PISA score points from a country's/economy's earliest participation in PISA to PISA 2012. It is calculated taking into account all of a country's/economy's participations in PISA. Only countries and economies with comparable data from PISA 2003 and PISA 2012 are shown. The correlation between a country's/economy's mean score in 2003 and its annualised performance is -0.60. The OECD average 2003 considers only those countries with comparable data since PISA 2003. Source: OECD, PISA 2012 Database; Figure I.2.18.

Annualised change in performance between 2003 and 2012 and average PISA 2003 mathematics scores
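A strongly negative correlation between baseline score and subsequent change is exactly what regression to the mean produces even when nothing truly converges. A minimal sketch, under an assumed model in which scores in the two waves are equally variable with correlation rho and there is no systematic change:

```python
import math
import random

# X = baseline score, Y = later score, both standardized, corr rho,
# with NO true trend. The change Y - X is still strongly negatively
# correlated with the baseline X.
random.seed(0)
rho, n = 0.6, 100_000
pairs = []
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    y = rho * x + math.sqrt(1.0 - rho**2) * random.gauss(0.0, 1.0)
    pairs.append((x, y - x))

mx = sum(x for x, _ in pairs) / n
md = sum(d for _, d in pairs) / n
cov = sum((x - mx) * (d - md) for x, d in pairs) / n
vx = sum((x - mx) ** 2 for x, _ in pairs) / n
vd = sum((d - md) ** 2 for _, d in pairs) / n
r = cov / math.sqrt(vx * vd)

# Theory: corr(X, Y - X) = -sqrt((1 - rho) / 2), about -0.447 for rho = 0.6
print(round(r, 2))
```

So a negative baseline-vs-change correlation, by itself, is weak evidence of real convergence.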

Page 72

Page 73

Page 74

Randomized trials

Page 75

Comparing fertilizers

• A farmer wants to compare two fertilizers, A and B, in terms of tomato production
• In a row of 11 tomato plants, 5 are treated with A and 6 with B
• Remark: if the first 5 plants of the row are treated with A, the observed differences may (partly or totally) be due to the location (more or less sun, water, and so on)
• Solution: randomization

Page 76

Randomization

• The farmer shuffles 11 playing cards, 5 red and 6 black, obtaining the following sequence:

  1 2 3 4 5 6 7 8 9 10 11
  R R B B R B B B R  R  B

• Red card: fertilizer A
• Black card: fertilizer B

Page 77

Tomato yield

Table: Results of the randomized experiment (tomato yields)

position      1    2    3    4    5    6    7    8    9   10   11
fertilizer    A    A    B    B    A    B    B    B    A    A    B
kg tomatoes 29.9 11.4 26.6 23.7 25.3 28.5 14.2 17.9 16.5 21.1 24.3

Fertilizer A: 29.9, 11.4, 25.3, 16.5, 21.1        (nA = 5, mean yA = 20.84)
Fertilizer B: 26.6, 23.7, 28.5, 14.2, 17.9, 24.3  (nB = 6, mean yB = 22.53)

Difference of the means (fert. B - fert. A): yB - yA = 1.69

Remark: Under the null hypothesis (A and B have the same effect), the letters "A" and "B" are mere labels that do not influence the result. If I rearrange the labels A and B in a different way, I obtain a sequence of yields that has the same probability as the original one.

Page 78

H0: no difference between A and B in terms of yield

• Under H0, A and B are nothing but labels
• If I exchange labels, all the yield tables I obtain have the same probability as the starting yield table
• There exist 11!/(5!6!) = 462 ways to place 5 labels A and 6 labels B on the 11 plants
• If I compute the difference yB - yA in the 462 cases, I obtain a reference distribution against which the actual difference can be compared


Page 79

Reference distribution

There are 11!/(5!6!) = 462 ways to allocate 5 labels A and 6 labels B to the 11 plants. The reference distribution is obtained by computing the difference yB - yA in all 462 cases and building the corresponding histogram.

[Histogram: randomization distribution of yB - yA, ranging roughly from -10 to 15.]

In 33% of the cases the difference yB - yA is larger than 1.69.

P-value = 0.33

Statistically speaking, the difference is not very significant.
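The randomization distribution is small enough to enumerate exactly. A sketch, with the yields and labels taken from the table of the randomized experiment:

```python
from itertools import combinations

# Yields in row order (positions 1..11) and the actual A-treated positions.
yields = [29.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3]
a_positions = (0, 1, 4, 8, 9)  # plants treated with fertilizer A

def diff_b_minus_a(a_idx):
    a = [yields[i] for i in a_idx]
    b = [y for i, y in enumerate(yields) if i not in a_idx]
    return sum(b) / len(b) - sum(a) / len(a)

observed = diff_b_minus_a(a_positions)  # yB - yA = 1.69
# All 11!/(5!6!) = 462 possible labelings
diffs = [diff_b_minus_a(c) for c in combinations(range(11), 5)]
p = sum(d >= observed for d in diffs) / len(diffs)

print(len(diffs), round(observed, 2), round(p, 2))
```

The enumeration reproduces the 462 cases and the p-value of about 0.33 quoted on the slide.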

Page 80

Student's t approximation of the randomization distribution

[Histogram: randomization distribution of yB - yA with the scaled Student's t density superimposed.]

Property: the randomization distribution is well approximated by the distribution of a Student's t with nA + nB - 2 degrees of freedom, scaled by

  S · sqrt(1/nA + 1/nB),   where   S² = [(nA - 1)·SA² + (nB - 1)·SB²] / (nA + nB - 2)

SA²: sample variance of the A yields
SB²: sample variance of the B yields

Page 81

Computational steps:

• Compute

  S² = [(nA - 1)·SA² + (nB - 1)·SB²] / (nA + nB - 2) = (4×52.50 + 5×29.51) / (5 + 6 - 2) = 39.73

  t0 = (yB - yA) / (S·sqrt(1/nA + 1/nB)) = 1.69 / (6.30·sqrt(1/5 + 1/6)) = 1.69 / 3.82 = 0.44

• From the table of the Student's t with nA + nB - 2 = 11 - 2 = 9 degrees of freedom:

  P(t > t0) = P(t > 0.44) > P(t > 0.543) = 0.3

  P-value > 0.3 (using the complete table: P-value = 0.34)

(the B yield is not significantly larger than the A yield)

Moral: provided that I randomize, I can use the t test (as an approximation of the randomization distribution) without having to assume random sampling.
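The numbers above can be checked in a few lines (yields from the randomized experiment; `statistics.variance` is the sample variance with the n−1 denominator):

```python
import math
import statistics

# Yields grouped by fertilizer, from the randomized tomato experiment.
a = [29.9, 11.4, 25.3, 16.5, 21.1]
b = [26.6, 23.7, 28.5, 14.2, 17.9, 24.3]
na, nb = len(a), len(b)

sa2 = statistics.variance(a)  # sample variance of the A yields (~52.50)
sb2 = statistics.variance(b)  # sample variance of the B yields (~29.51)
s2 = ((na - 1) * sa2 + (nb - 1) * sb2) / (na + nb - 2)  # pooled variance

diff = statistics.mean(b) - statistics.mean(a)          # yB - yA
t0 = diff / math.sqrt(s2 * (1 / na + 1 / nb))           # t statistic

print(round(s2, 2), round(diff, 2), round(t0, 2))
```

This reproduces S² = 39.73, yB − yA = 1.69 and t0 = 0.44 from the slide.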

Page 82

Randomized Controlled Trials (RCT)

• One of the groups is a "control group", either non-treated or subject to some standard treatment
• If the control subjects are not treated, the study can be used to investigate and quantify a possible causal relationship between treatment(s) and the outcome
• Especially if double-blinded, an RCT gets rid of confounders
• «A well-blinded RCT is often considered the gold standard for clinical trials. Blinded RCTs are commonly used to test the efficacy of medical interventions» (Wikipedia)

Page 83

The significance fallacy

Page 84

Page 85

Page 86

Basic ingredients

• The News Feed: the posts that you see when you log in
• Your personal page: you can publish your posts and they might enter the News Feeds of your friends
• You can request friendship and admission to groups
• You can write on the pages of friends and groups
• You can put "like", "angry", "love" on posts
• ...

Page 87

Page 88

Page 89

Page 90

Page 91

• Facebook's News Feed—the main list of status updates, messages, and photos you see when you open Facebook on your computer or phone—is not a perfect mirror of the world.

• But few users expect that Facebook would change their News Feed in order to manipulate their emotional state.

• We now know that's exactly what happened two years ago. For one week in January 2012, data scientists skewed what almost 700,000 Facebook users saw when they logged into its service. Some people were shown content with a preponderance of happy and positive words; some were shown content analyzed as sadder than average. And when the week was over, these manipulated users were more likely to post either especially positive or negative words themselves.

• This tinkering was just revealed as part of a new study, published in the prestigious Proceedings of the National Academy of Sciences.

Page 92

"Two parallel experiments were conducted for positive and negative emotion: One in which exposure to friends' positive emotional content in their News Feed was reduced, and one in which exposure to negative emotional content in their News Feed was reduced. In these conditions, when a person loaded their News Feed, each post that contained emotional content of the relevant emotional valence had between a 10% and 90% chance (based on their User ID) of being omitted from their News Feed for that specific viewing."

Page 93

Page 94

Remarkable and unremarkable ...

Page 95

A basic notion (that should be) taught in every statistics course: the difference between statistical and practical significance

Page 96

Page 97

the false belief that [statistically] significant results are automatically big and important

The significance fallacy

Page 98

eliminating a substantial proportion of emotional content from a user's feed had the monumental effect of shifting that user's own emotional word use by two hundredths of a standard deviation. In other words, the manipulation had a negligible real-world impact on users' behavior. To put it in intuitive terms, the effect of condition in the Facebook study is roughly comparable to a hypothetical treatment that increased the average height of the male population in the United States by about one twentieth of an inch (given a standard deviation of ~2.8 inches). Theoretically interesting, perhaps, but not very meaningful in practice.

Tal Yarkoni
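Yarkoni's height analogy is just an effect-size conversion: an effect of d standard deviations corresponds to a shift of d·SD in raw units. A quick check of the numbers quoted above:

```python
# Effect size reported for the Facebook study: d = 0.02 standard deviations.
d = 0.02
height_sd_inches = 2.8  # quoted SD of adult male height in the US

shift = d * height_sd_inches  # equivalent raw shift in inches
print(round(shift, 3))  # -> 0.056, about one twentieth of an inch
```

A statistically significant d of 0.02 is thus a practically invisible shift.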

Page 99

Page 100

Page 101

if the idea that Facebook would actively try to manipulate your behavior bothers you, you should probably stop reading this right now and go close your account. You also should definitely not read this paper suggesting that a single social message on Facebook prior to the last US presidential election may have single-handedly increased national voter turn-out by as much as 0.6%.

Tal Yarkoni

Page 102

Page 103

“Our results suggest that the Facebook social messageincreased turnout directly by about 60,000 voters and indirectly through social contagion by another 280,000 voters, for a total of 340,000 additional votes”

Page 104

For the rest of the story on opinion dynamics:

Page 105

Paolo Bolzern, Patrizio Colaneri, Giuseppe De Nicolao, "Opinion influence and evolution in social networks: A Markovian agents model", Automatica 100 (2019) 219-230. https://doi.org/10.1016/j.automatica.2018.11.023

Abstract: In this paper, the effect on collective opinions of filtering algorithms managed by social network platforms is modeled and investigated. A stochastic multi-agent model for opinion dynamics is proposed, that accounts for a centralized tuning of the strength of interaction between individuals. The evolution of each individual opinion is described by a Markov chain, whose transition rates are affected by the opinions of the neighbors through influence parameters. The properties of this model are studied in a general setting as well as in interesting special cases. A general result is that the overall model of the social network behaves like a high-dimensional Markov chain, which is viable to Monte Carlo simulation. Under the assumption of identical agents and unbiased influence, it is shown that the influence intensity affects the variance, but not the expectation, of the number of individuals sharing a certain opinion. Moreover, a detailed analysis is carried out for the so-called Peer Assembly, which describes the evolution of binary opinions in a completely connected graph of identical agents. It is shown that the Peer Assembly can be lumped into a birth-death chain that can be given a complete analytical characterization. Both analytical results and simulation experiments are used to highlight the emergence of particular collective behaviors, e.g. consensus and herding, depending on the centralized tuning of the influence parameters.

Paolo Bolzern, Patrizio Colaneri, Giuseppe De Nicolao, "Opinion Dynamics in Social Networks: The Effect of Centralized Interaction Tuning on Emerging Behaviors", IEEE Transactions on Computational Social Systems. https://doi.org/10.1109/TCSS.2019.2962273

Abstract: The algorithmic filtering of contents exchanged on digital social networks entails a centralized control of the intensity of users' interaction. The aim of this article is to investigate how this centralized action can affect the time evolution of users' opinions on a specific issue. To this purpose, this article proposes a stochastic multiagent model that incorporates a simplified description of the modulation of the interaction intensity exerted by the platform manager. Various transient and steady-state properties of the model are established. In particular, it is studied how emerging collective behaviors, e.g., consensus, polarization, and community cleavage, depend on the interaction intensity parameter. Notably, the nonmonotonic effects of such a parameter are observed for suitable distributions of influenceability across the community. By means of a newly introduced concept of individual stochastic social power, some insights are given on the role of the interaction intensity parameter in the conflict between opposite factions in simple scenarios. A major finding of this article is that an apparently neutral intervention, i.e., unbiased with respect to the conflicting opinions, can favor one faction just by tuning the interaction intensity.

For the rest of the story on opinion dynamics:

NEW!

Page 106

The two sides of the significance fallacy

• The false belief that [statistically] significant results are automatically big and important
  • Hence the importance of estimating the effect size (and its confidence intervals)

• The false belief that [statistically] non-significant results are automatically small and unimportant
  • If you cannot reject the null hypothesis, this does not imply that it is true
  • Lack of significance may be due to sample size (underpowered studies), so that further studies might reject H0
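The last point is easy to demonstrate by simulation. A sketch, under an assumed scenario: a real effect of 0.3 standard deviations, tested with a two-sample z-test (known unit variance) at the 5% level:

```python
import math
import random

def power(n, effect, reps=1000, rng=random.Random(42)):
    """Fraction of simulated studies that reject H0 at the 5% level."""
    rejections = 0
    for _ in range(reps):
        m_ctrl = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        m_trt = sum(rng.gauss(effect, 1.0) for _ in range(n)) / n
        z = (m_trt - m_ctrl) / math.sqrt(2.0 / n)  # known unit variance
        if abs(z) > 1.96:
            rejections += 1
    return rejections / reps

small = power(n=20, effect=0.3)   # underpowered: usually fails to reject
large = power(n=500, effect=0.3)  # well powered: almost always rejects
print(small, large)
```

With n = 20 per group the very same real effect is missed most of the time, while with n = 500 it is detected almost always: non-significance is not evidence of no effect.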