vladimír jani - master programme in applied...
TRANSCRIPT
![Page 1: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/1.jpg)
Hypothesis testing
Vladimír Janiš
Department of MathematicsMatej Bel University Banská Bystrica
Slovakia
December 13, 2011, Novi Sad
TEMPUS - Master programme in applied statistics
Vladimír Janiš Hypothesis testing
![Page 2: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/2.jpg)
Short history
John Arbuthnot, 1710 - the first published statistical test
observed, that the fraction of boys born is slightly larger than
the fraction of girls
calculated that assuming equal probabilities for boys and girls
this emiprical fact would be exceedingly unlikely - probability1
483600000000000000000000argued that this was a proof of God’s will - boys had higher
risks of an early death
clearly a consideration possessing the basic characterizations of
a hypothesis test
Vladimír Janiš Hypothesis testing
![Page 3: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/3.jpg)
Short history
John Arbuthnot, 1710 - the first published statistical test
observed, that the fraction of boys born is slightly larger than
the fraction of girls
calculated that assuming equal probabilities for boys and girls
this emiprical fact would be exceedingly unlikely - probability1
483600000000000000000000argued that this was a proof of God’s will - boys had higher
risks of an early death
clearly a consideration possessing the basic characterizations of
a hypothesis test
Vladimír Janiš Hypothesis testing
![Page 4: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/4.jpg)
Short history
John Arbuthnot, 1710 - the first published statistical test
observed, that the fraction of boys born is slightly larger than
the fraction of girls
calculated that assuming equal probabilities for boys and girls
this emiprical fact would be exceedingly unlikely - probability1
483600000000000000000000argued that this was a proof of God’s will - boys had higher
risks of an early death
clearly a consideration possessing the basic characterizations of
a hypothesis test
Vladimír Janiš Hypothesis testing
![Page 5: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/5.jpg)
Short history
John Arbuthnot, 1710 - the first published statistical test
observed, that the fraction of boys born is slightly larger than
the fraction of girls
calculated that assuming equal probabilities for boys and girls
this emiprical fact would be exceedingly unlikely - probability1
483600000000000000000000
argued that this was a proof of God’s will - boys had higher
risks of an early death
clearly a consideration possessing the basic characterizations of
a hypothesis test
Vladimír Janiš Hypothesis testing
![Page 6: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/6.jpg)
Short history
John Arbuthnot, 1710 - the first published statistical test
observed, that the fraction of boys born is slightly larger than
the fraction of girls
calculated that assuming equal probabilities for boys and girls
this emiprical fact would be exceedingly unlikely - probability1
483600000000000000000000argued that this was a proof of God’s will - boys had higher
risks of an early death
clearly a consideration possessing the basic characterizations of
a hypothesis test
Vladimír Janiš Hypothesis testing
![Page 7: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/7.jpg)
Short history
John Arbuthnot, 1710 - the first published statistical test
observed, that the fraction of boys born is slightly larger than
the fraction of girls
calculated that assuming equal probabilities for boys and girls
this emiprical fact would be exceedingly unlikely - probability1
483600000000000000000000argued that this was a proof of God’s will - boys had higher
risks of an early death
clearly a consideration possessing the basic characterizations of
a hypothesis test
Vladimír Janiš Hypothesis testing
![Page 8: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/8.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 9: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/9.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 10: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/10.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 11: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/11.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 12: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/12.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 13: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/13.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable X
X has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 14: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/14.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 15: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/15.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 16: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/16.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 17: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/17.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 18: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/18.jpg)
Tests in a modern sense
1900 - Karl Pearson
chi-square test
compared observed frequency distribution to a theoretically
assumed one
1925 - R. A. Fisher
data are regarded as the outcome of a random variable XX has a preassumed probability distribution
null hypothesis - an assertion defining a subset of this family
test statistics T = t(X ) indicates the degree to which the data
deviate from the null hypothesis
significance probability (p-value) 0.05
Fisher was the first to recognise the arbitrary nature of this
treshold
Vladimír Janiš Hypothesis testing
![Page 19: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/19.jpg)
A competing approach to Fischer
1928 - J. Neyman and Egon Pearson (the son of Karl Pearson)
criticized the arbitrariness in Fisher’s choice of of the test
statistic
claimed that for rational choice of a test statistic not only null
hypothesis, but also an alternative one is needed
formalized the testing problem as two decision problem
H0,H1(Ha)
decisions reject H0 or do not reject H0
two types of resulting errors - rejecting a true H0 (the first
kind) and failing to reject a false H0 (the second kind)
power of the test - probability of (correctly) rejecting H0, if H1is true
Vladimír Janiš Hypothesis testing
![Page 20: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/20.jpg)
A competing approach to Fischer
1928 - J. Neyman and Egon Pearson (the son of Karl Pearson)
criticized the arbitrariness in Fisher’s choice of of the test
statistic
claimed that for rational choice of a test statistic not only null
hypothesis, but also an alternative one is needed
formalized the testing problem as two decision problem
H0,H1(Ha)
decisions reject H0 or do not reject H0
two types of resulting errors - rejecting a true H0 (the first
kind) and failing to reject a false H0 (the second kind)
power of the test - probability of (correctly) rejecting H0, if H1is true
Vladimír Janiš Hypothesis testing
![Page 21: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/21.jpg)
A competing approach to Fischer
1928 - J. Neyman and Egon Pearson (the son of Karl Pearson)
criticized the arbitrariness in Fisher’s choice of of the test
statistic
claimed that for rational choice of a test statistic not only null
hypothesis, but also an alternative one is needed
formalized the testing problem as two decision problem
H0,H1(Ha)
decisions reject H0 or do not reject H0
two types of resulting errors - rejecting a true H0 (the first
kind) and failing to reject a false H0 (the second kind)
power of the test - probability of (correctly) rejecting H0, if H1is true
Vladimír Janiš Hypothesis testing
![Page 22: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/22.jpg)
A competing approach to Fischer
1928 - J. Neyman and Egon Pearson (the son of Karl Pearson)
criticized the arbitrariness in Fisher’s choice of of the test
statistic
claimed that for rational choice of a test statistic not only null
hypothesis, but also an alternative one is needed
formalized the testing problem as two decision problem
H0,H1(Ha)
decisions reject H0 or do not reject H0
two types of resulting errors - rejecting a true H0 (the first
kind) and failing to reject a false H0 (the second kind)
power of the test - probability of (correctly) rejecting H0, if H1is true
Vladimír Janiš Hypothesis testing
![Page 23: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/23.jpg)
A competing approach to Fischer
1928 - J. Neyman and Egon Pearson (the son of Karl Pearson)
criticized the arbitrariness in Fisher’s choice of of the test
statistic
claimed that for rational choice of a test statistic not only null
hypothesis, but also an alternative one is needed
formalized the testing problem as two decision problem
H0,H1(Ha)
decisions reject H0 or do not reject H0
two types of resulting errors - rejecting a true H0 (the first
kind) and failing to reject a false H0 (the second kind)
power of the test - probability of (correctly) rejecting H0, if H1is true
Vladimír Janiš Hypothesis testing
![Page 24: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/24.jpg)
A competing approach to Fischer
1928 - J. Neyman and Egon Pearson (the son of Karl Pearson)
criticized the arbitrariness in Fisher’s choice of of the test
statistic
claimed that for rational choice of a test statistic not only null
hypothesis, but also an alternative one is needed
formalized the testing problem as two decision problem
H0,H1(Ha)
decisions reject H0 or do not reject H0
two types of resulting errors - rejecting a true H0 (the first
kind) and failing to reject a false H0 (the second kind)
power of the test - probability of (correctly) rejecting H0, if H1is true
Vladimír Janiš Hypothesis testing
![Page 25: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/25.jpg)
A competing approach to Fischer
1928 - J. Neyman and Egon Pearson (the son of Karl Pearson)
criticized the arbitrariness in Fisher’s choice of of the test
statistic
claimed that for rational choice of a test statistic not only null
hypothesis, but also an alternative one is needed
formalized the testing problem as two decision problem
H0,H1(Ha)
decisions reject H0 or do not reject H0
two types of resulting errors - rejecting a true H0 (the first
kind) and failing to reject a false H0 (the second kind)
power of the test - probability of (correctly) rejecting H0, if H1is true
Vladimír Janiš Hypothesis testing
![Page 26: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/26.jpg)
A competing approach to Fischer
1928 - J. Neyman and Egon Pearson (the son of Karl Pearson)
criticized the arbitrariness in Fisher’s choice of of the test
statistic
claimed that for rational choice of a test statistic not only null
hypothesis, but also an alternative one is needed
formalized the testing problem as two decision problem
H0,H1(Ha)
decisions reject H0 or do not reject H0
two types of resulting errors - rejecting a true H0 (the first
kind) and failing to reject a false H0 (the second kind)
power of the test - probability of (correctly) rejecting H0, if H1is true
Vladimír Janiš Hypothesis testing
![Page 27: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/27.jpg)
A competing approach to Fischer
1928 - J. Neyman and Egon Pearson (the son of Karl Pearson)
criticized the arbitrariness in Fisher’s choice of of the test
statistic
claimed that for rational choice of a test statistic not only null
hypothesis, but also an alternative one is needed
formalized the testing problem as two decision problem
H0,H1(Ha)
decisions reject H0 or do not reject H0
two types of resulting errors - rejecting a true H0 (the first
kind) and failing to reject a false H0 (the second kind)
power of the test - probability of (correctly) rejecting H0, if H1is true
Vladimír Janiš Hypothesis testing
![Page 28: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/28.jpg)
Fisher formulation vs. Neyman-Pearson formulation
Neyman-Pearson
richer results at the cost of a more demanding model
we need to specify an alternative hypothesis
testing problem as a two-decision situation
Different philosophical positions summarised in
Hacking, I.: Logic of Statistical Inference, Cambridge Univ.
Press, Cambridge, 1965
Gigerenzer, G., Swijtnik, Z., Porter, T., Daston, L., Beatty, J.,
Kruger, L.: The Empire of Chance, Cambridge Univ. Press,
New York, 1989 - also dicsusses aspects in teaching and
practise of statistics by a hybrid theory combining elements of
both approaches
Vladimír Janiš Hypothesis testing
![Page 29: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/29.jpg)
Fisher formulation vs. Neyman-Pearson formulation
Neyman-Pearson
richer results at the cost of a more demanding model
we need to specify an alternative hypothesis
testing problem as a two-decision situation
Different philosophical positions summarised in
Hacking, I.: Logic of Statistical Inference, Cambridge Univ.
Press, Cambridge, 1965
Gigerenzer, G., Swijtnik, Z., Porter, T., Daston, L., Beatty, J.,
Kruger, L.: The Empire of Chance, Cambridge Univ. Press,
New York, 1989 - also dicsusses aspects in teaching and
practise of statistics by a hybrid theory combining elements of
both approaches
Vladimír Janiš Hypothesis testing
![Page 30: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/30.jpg)
Fisher formulation vs. Neyman-Pearson formulation
Neyman-Pearson
richer results at the cost of a more demanding model
we need to specify an alternative hypothesis
testing problem as a two-decision situation
Different philosophical positions summarised in
Hacking, I.: Logic of Statistical Inference, Cambridge Univ.
Press, Cambridge, 1965
Gigerenzer, G., Swijtnik, Z., Porter, T., Daston, L., Beatty, J.,
Kruger, L.: The Empire of Chance, Cambridge Univ. Press,
New York, 1989 - also dicsusses aspects in teaching and
practise of statistics by a hybrid theory combining elements of
both approaches
Vladimír Janiš Hypothesis testing
![Page 31: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/31.jpg)
Fisher formulation vs. Neyman-Pearson formulation
Neyman-Pearson
richer results at the cost of a more demanding model
we need to specify an alternative hypothesis
testing problem as a two-decision situation
Different philosophical positions summarised in
Hacking, I.: Logic of Statistical Inference, Cambridge Univ.
Press, Cambridge, 1965
Gigerenzer, G., Swijtnik, Z., Porter, T., Daston, L., Beatty, J.,
Kruger, L.: The Empire of Chance, Cambridge Univ. Press,
New York, 1989 - also dicsusses aspects in teaching and
practise of statistics by a hybrid theory combining elements of
both approaches
Vladimír Janiš Hypothesis testing
![Page 32: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/32.jpg)
Fisher formulation vs. Neyman-Pearson formulation
Neyman-Pearson
richer results at the cost of a more demanding model
we need to specify an alternative hypothesis
testing problem as a two-decision situation
Different philosophical positions summarised in
Hacking, I.: Logic of Statistical Inference, Cambridge Univ.
Press, Cambridge, 1965
Gigerenzer, G., Swijtnik, Z., Porter, T., Daston, L., Beatty, J.,
Kruger, L.: The Empire of Chance, Cambridge Univ. Press,
New York, 1989 - also dicsusses aspects in teaching and
practise of statistics by a hybrid theory combining elements of
both approaches
Vladimír Janiš Hypothesis testing
![Page 33: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/33.jpg)
Problems in the interpretation and use of tests
Hypothesis tests
routinely applied
but there are continuing debates about their use from the
philosophical point of view
Harlow, L.L., Mulaik, S.A., Steiger, J.H. (eds.): What if there
were no Significance Tests?, Erlbaum, Mahwah, NJ
Wilkinson, L.: Task force on statistical inference; Statistical
methods in psychology journals, American Psychologist 54:
594-604, 1999.
Nickerson, R.S.: Null hypothesis significance testing: A review
of an old and continuing controversy, Psychological Methods
5:241-301, 2000.
Vladimír Janiš Hypothesis testing
![Page 34: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/34.jpg)
Problems in the interpretation and use of tests
Hypothesis tests
routinely applied
but there are continuing debates about their use from the
philosophical point of view
Harlow, L.L., Mulaik, S.A., Steiger, J.H. (eds.): What if there
were no Significance Tests?, Erlbaum, Mahwah, NJ
Wilkinson, L.: Task force on statistical inference; Statistical
methods in psychology journals, American Psychologist 54:
594-604, 1999.
Nickerson, R.S.: Null hypothesis significance testing: A review
of an old and continuing controversy, Psychological Methods
5:241-301, 2000.
Vladimír Janiš Hypothesis testing
![Page 35: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/35.jpg)
Problems in the interpretation and use of tests
Hypothesis tests
routinely applied
but there are continuing debates about their use from the
philosophical point of view
Harlow, L.L., Mulaik, S.A., Steiger, J.H. (eds.): What if there
were no Significance Tests?, Erlbaum, Mahwah, NJ
Wilkinson, L.: Task force on statistical inference; Statistical
methods in psychology journals, American Psychologist 54:
594-604, 1999.
Nickerson, R.S.: Null hypothesis significance testing: A review
of an old and continuing controversy, Psychological Methods
5:241-301, 2000.
Vladimír Janiš Hypothesis testing
![Page 36: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/36.jpg)
Problems in the interpretation and use of tests
Hypothesis tests
routinely applied
but there are continuing debates about their use from the
philosophical point of view
Harlow, L.L., Mulaik, S.A., Steiger, J.H. (eds.): What if there
were no Significance Tests?, Erlbaum, Mahwah, NJ
Wilkinson, L.: Task force on statistical inference; Statistical
methods in psychology journals, American Psychologist 54:
594-604, 1999.
Nickerson, R.S.: Null hypothesis significance testing: A review
of an old and continuing controversy, Psychological Methods
5:241-301, 2000.
Vladimír Janiš Hypothesis testing
![Page 37: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/37.jpg)
Reasons for criticism
difficulty of reasoning with uncertain evidence
natural human preference of strict rules and sharp decisions
The empirical researcher would like to conclude whether a theory is
true or false. However, experimental and observational data are
often so variable that the evidence produced by them is uncertain.
A serious and frequent misuse of hypothesis testing
α < p implies that the investigateg effect is absent, while α ≥ pproves that the effect exists.
better interpretation of hypothesis testing should be promoted
tests should be used less mechanically and combined with
other argumentations and other statistical procedures
user has to be aware that the random variability in the data
cannot be filtered out of the results
Vladimír Janiš Hypothesis testing
![Page 38: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/38.jpg)
Reasons for criticism
difficulty of reasoning with uncertain evidence
natural human preference of strict rules and sharp decisions
The empirical researcher would like to conclude whether a theory is
true or false. However, experimental and observational data are
often so variable that the evidence produced by them is uncertain.
A serious and frequent misuse of hypothesis testing
α < p implies that the investigateg effect is absent, while α ≥ pproves that the effect exists.
better interpretation of hypothesis testing should be promoted
tests should be used less mechanically and combined with
other argumentations and other statistical procedures
user has to be aware that the random variability in the data
cannot be filtered out of the results
Vladimír Janiš Hypothesis testing
![Page 39: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/39.jpg)
Reasons for criticism
difficulty of reasoning with uncertain evidence
natural human preference of strict rules and sharp decisions
The empirical researcher would like to conclude whether a theory is
true or false. However, experimental and observational data are
often so variable that the evidence produced by them is uncertain.
A serious and frequent misuse of hypothesis testing
α < p implies that the investigateg effect is absent, while α ≥ pproves that the effect exists.
better interpretation of hypothesis testing should be promoted
tests should be used less mechanically and combined with
other argumentations and other statistical procedures
user has to be aware that the random variability in the data
cannot be filtered out of the results
Vladimír Janiš Hypothesis testing
![Page 40: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/40.jpg)
Reasons for criticism
difficulty of reasoning with uncertain evidence
natural human preference of strict rules and sharp decisions
The empirical researcher would like to conclude whether a theory is
true or false.
However, experimental and observational data are
often so variable that the evidence produced by them is uncertain.
A serious and frequent misuse of hypothesis testing
α < p implies that the investigateg effect is absent, while α ≥ pproves that the effect exists.
better interpretation of hypothesis testing should be promoted
tests should be used less mechanically and combined with
other argumentations and other statistical procedures
user has to be aware that the random variability in the data
cannot be filtered out of the results
Vladimír Janiš Hypothesis testing
![Page 41: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/41.jpg)
Reasons for criticism
difficulty of reasoning with uncertain evidence
natural human preference of strict rules and sharp decisions
The empirical researcher would like to conclude whether a theory is
true or false. However, experimental and observational data are
often so variable that the evidence produced by them is uncertain.
A serious and frequent misuse of hypothesis testing
α < p implies that the investigateg effect is absent, while α ≥ pproves that the effect exists.
better interpretation of hypothesis testing should be promoted
tests should be used less mechanically and combined with
other argumentations and other statistical procedures
user has to be aware that the random variability in the data
cannot be filtered out of the results
Vladimír Janiš Hypothesis testing
![Page 42: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/42.jpg)
Reasons for criticism
difficulty of reasoning with uncertain evidence
natural human preference of strict rules and sharp decisions
The empirical researcher would like to conclude whether a theory is
true or false. However, experimental and observational data are
often so variable that the evidence produced by them is uncertain.
A serious and frequent misuse of hypothesis testing
α < p implies that the investigateg effect is absent, while α ≥ pproves that the effect exists.
better interpretation of hypothesis testing should be promoted
tests should be used less mechanically and combined with
other argumentations and other statistical procedures
user has to be aware that the random variability in the data
cannot be filtered out of the results
Vladimír Janiš Hypothesis testing
![Page 43: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/43.jpg)
Reasons for criticism
difficulty of reasoning with uncertain evidence
natural human preference of strict rules and sharp decisions
The empirical researcher would like to conclude whether a theory is
true or false. However, experimental and observational data are
often so variable that the evidence produced by them is uncertain.
A serious and frequent misuse of hypothesis testing
α < p implies that the investigateg effect is absent, while α ≥ pproves that the effect exists.
better interpretation of hypothesis testing should be promoted
tests should be used less mechanically and combined with
other argumentations and other statistical procedures
user has to be aware that the random variability in the data
cannot be filtered out of the results
Vladimír Janiš Hypothesis testing
![Page 44: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/44.jpg)
Reasons for criticism
difficulty of reasoning with uncertain evidence
natural human preference of strict rules and sharp decisions
The empirical researcher would like to conclude whether a theory is
true or false. However, experimental and observational data are
often so variable that the evidence produced by them is uncertain.
A serious and frequent misuse of hypothesis testing
α < p implies that the investigateg effect is absent, while α ≥ pproves that the effect exists.
better interpretation of hypothesis testing should be promoted
tests should be used less mechanically and combined with
other argumentations and other statistical procedures
user has to be aware that the random variability in the data
cannot be filtered out of the results
Vladimír Janiš Hypothesis testing
![Page 45: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/45.jpg)
Misinterpretations of hypothesis tests - 1
Nonrejection implies support for the null hypothesis.
nonrejection: there is not enough evidence against the null
hypothesis
the sample size may be small, error variability may be large, so
that the data do not contain much information
therefore nonrejection often provides support for alternative
hypothesis practically as strongly as for the null hypothesis
itself
nonrejection may not be interpreted as a support for the null
hypothesis
Vladimír Janiš Hypothesis testing
![Page 46: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/46.jpg)
Misinterpretations of hypothesis tests - 1
Nonrejection implies support for the null hypothesis.
nonrejection: there is not enough evidence against the null
hypothesis
the sample size may be small, error variability may be large, so
that the data do not contain much information
therefore nonrejection often provides support for alternative
hypothesis practically as strongly as for the null hypothesis
itself
nonrejection may not be interpreted as a support for the null
hypothesis
Vladimír Janiš Hypothesis testing
![Page 47: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/47.jpg)
Misinterpretations of hypothesis tests - 1
Nonrejection implies support for the null hypothesis.
nonrejection: there is not enough evidence against the null
hypothesis
the sample size may be small, error variability may be large, so
that the data do not contain much information
therefore nonrejection often provides support for alternative
hypothesis practically as strongly as for the null hypothesis
itself
nonrejection may not be interpreted as a support for the null
hypothesis
Vladimír Janiš Hypothesis testing
![Page 48: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/48.jpg)
Misinterpretations of hypothesis tests - 1
Nonrejection implies support for the null hypothesis.
nonrejection: there is not enough evidence against the null
hypothesis
the sample size may be small, error variability may be large, so
that the data do not contain much information
therefore nonrejection often provides support for alternative
hypothesis practically as strongly as for the null hypothesis
itself
nonrejection may not be interpreted as a support for the null
hypothesis
Vladimír Janiš Hypothesis testing
![Page 49: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/49.jpg)
Misinterpretations of hypothesis tests - 1
Nonrejection implies support for the null hypothesis.
nonrejection: there is not enough evidence against the null
hypothesis
the sample size may be small, error variability may be large, so
that the data do not contain much information
therefore nonrejection often provides support for alternative
hypothesis practically as strongly as for the null hypothesis
itself
nonrejection may not be interpreted as a support for the null
hypothesis
Vladimír Janiš Hypothesis testing
![Page 50: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/50.jpg)
Misinterpretations of hypothesis tests - 1
Nonrejection implies support for the null hypothesis.
nonrejection: there is not enough evidence against the null
hypothesis
the sample size may be small, error variability may be large, so
that the data do not contain much information
therefore nonrejection often provides support for alternative
hypothesis practically as strongly as for the null hypothesis
itself
nonrejection may not be interpreted as a support for the null
hypothesis
Vladimír Janiš Hypothesis testing
![Page 51: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/51.jpg)
Misinterpretations of hypothesis tests - 2
A nonsignificant result supports null hypothesis more in cases of a
high test power.
statistical power is the probability to reject the null hypothesis
if a given effect is presentthis cannot be inverted as a support of null hypothesis in the
case of non-significance
power studies are important while planning an experiment
appropriate procedures after the experiment are confidence
intervals
Test results tell us about the probabilities of null and alternative
hypotheses.
Vladimír Janiš Hypothesis testing
![Page 52: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/52.jpg)
Misinterpretations of hypothesis tests - 2
A nonsignificant result supports null hypothesis more in cases of a
high test power.
statistical power is the probability to reject the null hypothesis
if a given effect is presentthis cannot be inverted as a support of null hypothesis in the
case of non-significance
power studies are important while planning an experiment
appropriate procedures after the experiment are confidence
intervals
Test results tell us about the probabilities of null and alternative
hypotheses.
Vladimír Janiš Hypothesis testing
![Page 53: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/53.jpg)
Misinterpretations of hypothesis tests - 2
A nonsignificant result supports null hypothesis more in cases of a
high test power.
statistical power is the probability to reject the null hypothesis
if a given effect is present
this cannot be inverted as a support of null hypothesis in the
case of non-significance
power studies are important while planning an experiment
appropriate procedures after the experiment are confidence
intervals
Test results tell us about the probabilities of null and alternative
hypotheses.
Vladimír Janiš Hypothesis testing
![Page 54: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/54.jpg)
Misinterpretations of hypothesis tests - 2
A nonsignificant result supports null hypothesis more in cases of a
high test power.
statistical power is the probability to reject the null hypothesis
if a given effect is presentthis cannot be inverted as a support of null hypothesis in the
case of non-significance
power studies are important while planning an experiment
appropriate procedures after the experiment are confidence
intervals
Test results tell us about the probabilities of null and alternative
hypotheses.
Vladimír Janiš Hypothesis testing
![Page 55: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/55.jpg)
Misinterpretations of hypothesis tests - 2
A nonsignificant result supports null hypothesis more in cases of a
high test power.
statistical power is the probability to reject the null hypothesis
if a given effect is presentthis cannot be inverted as a support of null hypothesis in the
case of non-significance
power studies are important while planning an experiment
appropriate procedures after the experiment are confidence
intervals
Test results tell us about the probabilities of null and alternative
hypotheses.
Vladimír Janiš Hypothesis testing
![Page 56: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/56.jpg)
Misinterpretations of hypothesis tests - 2
A nonsignificant result supports null hypothesis more in cases of a
high test power.
statistical power is the probability to reject the null hypothesis
if a given effect is presentthis cannot be inverted as a support of null hypothesis in the
case of non-significance
power studies are important while planning an experiment
appropriate procedures after the experiment are confidence
intervals
Test results tell us about the probabilities of null and alternative
hypotheses.
Vladimír Janiš Hypothesis testing
![Page 57: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/57.jpg)
Misinterpretations of hypothesis tests - 2
A nonsignificant result supports null hypothesis more in cases of a
high test power.
statistical power is the probability to reject the null hypothesis
if a given effect is presentthis cannot be inverted as a support of null hypothesis in the
case of non-significance
power studies are important while planning an experiment
appropriate procedures after the experiment are confidence
intervals
Test results tell us about the probabilities of null and alternative
hypotheses.
Vladimír Janiš Hypothesis testing
![Page 58: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/58.jpg)
Misinterpretations of hypothesis tests - 3
Rejecting the null hypothesis the alternative theory is confirmed.
the alternative hypothesis is not the same as a scientific theory
alternative hypotheses are deduced from the theory, but alway
under asssumption that the study was well designed and
usually also other assumptions (normality of distributions)
alternative hypotheses are consequences of the theory, not its
sufficient conditions
n the other hand, it is possible that the theory is true, but the
alternative hypothesis deduced from it is not true, e.g.
because of the wrong experimental design
... The more you reject the null hypothesis, the more likely it is
that you’ll get {a title, a permanent position, ...}
Vladimír Janiš Hypothesis testing
![Page 59: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/59.jpg)
Misinterpretations of hypothesis tests - 3
Rejecting the null hypothesis the alternative theory is confirmed.
the alternative hypothesis is not the same as a scientific theory
alternative hypotheses are deduced from the theory, but alway
under asssumption that the study was well designed and
usually also other assumptions (normality of distributions)
alternative hypotheses are consequences of the theory, not its
sufficient conditions
n the other hand, it is possible that the theory is true, but the
alternative hypothesis deduced from it is not true, e.g.
because of the wrong experimental design
... The more you reject the null hypothesis, the more likely it is
that you’ll get {a title, a permanent position, ...}
Vladimír Janiš Hypothesis testing
![Page 60: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/60.jpg)
Misinterpretations of hypothesis tests - 3
Rejecting the null hypothesis the alternative theory is confirmed.
the alternative hypothesis is not the same as a scientific theory
alternative hypotheses are deduced from the theory, but alway
under asssumption that the study was well designed and
usually also other assumptions (normality of distributions)
alternative hypotheses are consequences of the theory, not its
sufficient conditions
n the other hand, it is possible that the theory is true, but the
alternative hypothesis deduced from it is not true, e.g.
because of the wrong experimental design
... The more you reject the null hypothesis, the more likely it is
that you’ll get {a title, a permanent position, ...}
Vladimír Janiš Hypothesis testing
![Page 61: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/61.jpg)
Misinterpretations of hypothesis tests - 3
Rejecting the null hypothesis the alternative theory is confirmed.
the alternative hypothesis is not the same as a scientific theory
alternative hypotheses are deduced from the theory, but alway
under asssumption that the study was well designed and
usually also other assumptions (normality of distributions)
alternative hypotheses are consequences of the theory, not its
sufficient conditions
n the other hand, it is possible that the theory is true, but the
alternative hypothesis deduced from it is not true, e.g.
because of the wrong experimental design
... The more you reject the null hypothesis, the more likely it is
that you’ll get {a title, a permanent position, ...}
Vladimír Janiš Hypothesis testing
![Page 62: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/62.jpg)
Misinterpretations of hypothesis tests - 3
Rejecting the null hypothesis the alternative theory is confirmed.
the alternative hypothesis is not the same as a scientific theory
alternative hypotheses are deduced from the theory, but alway
under asssumption that the study was well designed and
usually also other assumptions (normality of distributions)
alternative hypotheses are consequences of the theory, not its
sufficient conditions
n the other hand, it is possible that the theory is true, but the
alternative hypothesis deduced from it is not true, e.g.
because of the wrong experimental design
... The more you reject the null hypothesis, the more likely it is
that you’ll get {a title, a permanent position, ...}
Vladimír Janiš Hypothesis testing
![Page 63: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/63.jpg)
Misinterpretations of hypothesis tests - 3
Rejecting the null hypothesis the alternative theory is confirmed.
the alternative hypothesis is not the same as a scientific theory
alternative hypotheses are deduced from the theory, but alway
under asssumption that the study was well designed and
usually also other assumptions (normality of distributions)
alternative hypotheses are consequences of the theory, not its
sufficient conditions
n the other hand, it is possible that the theory is true, but the
alternative hypothesis deduced from it is not true, e.g.
because of the wrong experimental design
... The more you reject the null hypothesis, the more likely it is
that you’ll get {a title, a permanent position, ...}
Vladimír Janiš Hypothesis testing
![Page 64: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/64.jpg)
Misinterpretations of hypothesis tests - 3
Rejecting the null hypothesis the alternative theory is confirmed.
the alternative hypothesis is not the same as a scientific theory
alternative hypotheses are deduced from the theory, but alway
under asssumption that the study was well designed and
usually also other assumptions (normality of distributions)
alternative hypotheses are consequences of the theory, not its
sufficient conditions
n the other hand, it is possible that the theory is true, but the
alternative hypothesis deduced from it is not true, e.g.
because of the wrong experimental design
... The more you reject the null hypothesis, the more likely it is
that you’ll get {a title, a permanent position, ...}
Vladimír Janiš Hypothesis testing
![Page 65: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/65.jpg)
4 citations
Overemphasis on tests of significance at the expense especiallyof interval estimation has long been condemned (Cox 1977)The continued very extensive use of significance tests isalarming (Cox 1986)The author believes that tests provide a poor model of mostreal problems, usually so poor that their objectivity istangential and often too poor to be useful (Pratt 1976)We do not perform an experiment to find out if two varietiesof wheat or two drugs are equal. We know in advance, withoutspending a dollar on an experiment, that they are not equal(Deming 1975)
Vladimír Janiš Hypothesis testing
![Page 66: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/66.jpg)
4 citations
Overemphasis on tests of significance at the expense especiallyof interval estimation has long been condemned (Cox 1977)
The continued very extensive use of significance tests isalarming (Cox 1986)The author believes that tests provide a poor model of mostreal problems, usually so poor that their objectivity istangential and often too poor to be useful (Pratt 1976)We do not perform an experiment to find out if two varietiesof wheat or two drugs are equal. We know in advance, withoutspending a dollar on an experiment, that they are not equal(Deming 1975)
Vladimír Janiš Hypothesis testing
![Page 67: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/67.jpg)
4 citations
Overemphasis on tests of significance at the expense especiallyof interval estimation has long been condemned (Cox 1977)The continued very extensive use of significance tests isalarming (Cox 1986)
The author believes that tests provide a poor model of mostreal problems, usually so poor that their objectivity istangential and often too poor to be useful (Pratt 1976)We do not perform an experiment to find out if two varietiesof wheat or two drugs are equal. We know in advance, withoutspending a dollar on an experiment, that they are not equal(Deming 1975)
Vladimír Janiš Hypothesis testing
![Page 68: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/68.jpg)
4 citations
Overemphasis on tests of significance at the expense especiallyof interval estimation has long been condemned (Cox 1977)The continued very extensive use of significance tests isalarming (Cox 1986)The author believes that tests provide a poor model of mostreal problems, usually so poor that their objectivity istangential and often too poor to be useful (Pratt 1976)
We do not perform an experiment to find out if two varietiesof wheat or two drugs are equal. We know in advance, withoutspending a dollar on an experiment, that they are not equal(Deming 1975)
Vladimír Janiš Hypothesis testing
![Page 69: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/69.jpg)
4 citations
Overemphasis on tests of significance at the expense especiallyof interval estimation has long been condemned (Cox 1977)The continued very extensive use of significance tests isalarming (Cox 1986)The author believes that tests provide a poor model of mostreal problems, usually so poor that their objectivity istangential and often too poor to be useful (Pratt 1976)We do not perform an experiment to find out if two varietiesof wheat or two drugs are equal. We know in advance, withoutspending a dollar on an experiment, that they are not equal(Deming 1975)
Vladimír Janiš Hypothesis testing
![Page 70: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/70.jpg)
Consequence for our project
Critical reading of applied studies
A deep knowledge of a particulart application is welcome (a
statistically significant result need not necessarily be a
noteworthy result)
Where is the optimal ratio between quantity of taught
statistical methods and the depth of understanding their
nature?
Vladimír Janiš Hypothesis testing
![Page 71: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/71.jpg)
Consequence for our project
Critical reading of applied studies
A deep knowledge of a particulart application is welcome (a
statistically significant result need not necessarily be a
noteworthy result)
Where is the optimal ratio between quantity of taught
statistical methods and the depth of understanding their
nature?
Vladimír Janiš Hypothesis testing
![Page 72: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/72.jpg)
Consequence for our project
Critical reading of applied studies
A deep knowledge of a particulart application is welcome (a
statistically significant result need not necessarily be a
noteworthy result)
Where is the optimal ratio between quantity of taught
statistical methods and the depth of understanding their
nature?
Vladimír Janiš Hypothesis testing
![Page 73: Vladimír Jani - Master programme in applied statisticsstat.uns.ac.rs/workshop2/5hypothesis_testing.pdf · Hypothesis testing Vladimír Jani ... compared observed frequency distribution](https://reader034.vdocuments.net/reader034/viewer/2022051722/5aa7dfa97f8b9a50528ceb16/html5/thumbnails/73.jpg)
Consequence for our project
Critical reading of applied studies
A deep knowledge of a particulart application is welcome (a
statistically significant result need not necessarily be a
noteworthy result)
Where is the optimal ratio between quantity of taught
statistical methods and the depth of understanding their
nature?
Vladimír Janiš Hypothesis testing