Name-Brand vs. Off-Brand: A Twist on Taste Testing for aMathematical Statistics Course
Eric M. Reyes
Rose-Hulman Institute of Technology
JSM 2014
Outline
1 Background
2 Activity
3 Analysis
4 Results
Binary Choice Background
Dataset and Story
Reyes JSM 2014 (1)
Binary Choice Background
The Mathematical Statistics Course
Students have a strong mathematics background including: Calculus,Probability, Programming, and Introductory Statistics, (possibly)Analysis.
Topics include probability, properties of random samples, estimation,and inference via maximum likelihood.
Exposes students to the theory underlying many methods encounteredin other courses.
It is really easy to divorce this course from the pedagogical tools weuse in an Introductory Statistics course.
Reyes JSM 2014 (2)
Binary Choice Background
The Mathematical Statistics Course
Students have a strong mathematics background including: Calculus,Probability, Programming, and Introductory Statistics, (possibly)Analysis.
Topics include probability, properties of random samples, estimation,and inference via maximum likelihood.
Exposes students to the theory underlying many methods encounteredin other courses.
It is really easy to divorce this course from the pedagogical tools weuse in an Introductory Statistics course.
Reyes JSM 2014 (2)
Binary Choice Background
The Mathematical Statistics Course
Students have a strong mathematics background including: Calculus,Probability, Programming, and Introductory Statistics, (possibly)Analysis.
Topics include probability, properties of random samples, estimation,and inference via maximum likelihood.
Exposes students to the theory underlying many methods encounteredin other courses.
It is really easy to divorce this course from the pedagogical tools weuse in an Introductory Statistics course.
Reyes JSM 2014 (2)
Binary Choice Background
The Mathematical Statistics Course
Students have a strong mathematics background including: Calculus,Probability, Programming, and Introductory Statistics, (possibly)Analysis.
Topics include probability, properties of random samples, estimation,and inference via maximum likelihood.
Exposes students to the theory underlying many methods encounteredin other courses.
It is really easy to divorce this course from the pedagogical tools weuse in an Introductory Statistics course.
Reyes JSM 2014 (2)
Binary Choice Background
Background for Students
Name-Brand vs. Off-Brand: Can We Really Distinguish Between the Two?
Reyes JSM 2014 (3)
Binary Choice Activity
Activity and Assignment
Reyes JSM 2014 (4)
Binary Choice Activity
Data Collection
Question: What is the probability that a consumer can actually discernthe difference between Cheerios and an off-brand version?
Taste Test:
Participants were instructors at the university.
Double-blind.
Each participant was given m = 6 samples (3 name-brand and 3off-brand in randomized order).
Class Discussion:
What is the benefit of blinding and how might it be implemented?
How will randomization be utilized in this study?
What variables should be recorded in this study?
Reyes JSM 2014 (5)
Binary Choice Activity
Data Collection
Question: What is the probability that a consumer can actually discernthe difference between Cheerios and an off-brand version?
Taste Test:
Participants were instructors at the university.
Double-blind.
Each participant was given m = 6 samples (3 name-brand and 3off-brand in randomized order).
Class Discussion:
What is the benefit of blinding and how might it be implemented?
How will randomization be utilized in this study?
What variables should be recorded in this study?
Reyes JSM 2014 (5)
Binary Choice Activity
Data Collection
Question: What is the probability that a consumer can actually discernthe difference between Cheerios and an off-brand version?
Taste Test:
Participants were instructors at the university.
Double-blind.
Each participant was given m = 6 samples (3 name-brand and 3off-brand in randomized order).
Class Discussion:
What is the benefit of blinding and how might it be implemented?
How will randomization be utilized in this study?
What variables should be recorded in this study?
Reyes JSM 2014 (5)
Binary Choice Activity
Data Collection
Question: What is the probability that a consumer can actually discernthe difference between Cheerios and an off-brand version?
Taste Test:
Participants were instructors at the university.
Double-blind.
Each participant was given m = 6 samples (3 name-brand and 3off-brand in randomized order).
Class Discussion:
What is the benefit of blinding and how might it be implemented?
How will randomization be utilized in this study?
What variables should be recorded in this study?
Reyes JSM 2014 (5)
Binary Choice Activity
Data Collection
Question: What is the probability that a consumer can actually discernthe difference between Cheerios and an off-brand version?
Taste Test:
Participants were instructors at the university.
Double-blind.
Each participant was given m = 6 samples (3 name-brand and 3off-brand in randomized order).
Class Discussion:
What is the benefit of blinding and how might it be implemented?
How will randomization be utilized in this study?
What variables should be recorded in this study?
Reyes JSM 2014 (5)
Binary Choice Activity
The Data
0
1
2
3
4
5
0 1 2 3 4 5 6Number Correctly Categorized
Fre
quen
cy
Children No Children
Reyes JSM 2014 (6)
Binary Choice Activity
The Assignment
Question: What is the probability that a consumer can actually discernthe difference between Cheerios and an off-brand version?
Describe the distribution of the probability of actually discerningbetween the name-brand and off-brand products.
Account for the fact that a correct response could be due to a truediscernment or a lucky guess.
Construct a poster-presentation summarizing your analysis and results.
Reyes JSM 2014 (7)
Binary Choice Activity
The Assignment
Question: What is the probability that a consumer can actually discernthe difference between Cheerios and an off-brand version?
Describe the distribution of the probability of actually discerningbetween the name-brand and off-brand products.
Account for the fact that a correct response could be due to a truediscernment or a lucky guess.
Construct a poster-presentation summarizing your analysis and results.
Reyes JSM 2014 (7)
Binary Choice Activity
The Assignment
Question: What is the probability that a consumer can actually discernthe difference between Cheerios and an off-brand version?
Describe the distribution of the probability of actually discerningbetween the name-brand and off-brand products.
Account for the fact that a correct response could be due to a truediscernment or a lucky guess.
Construct a poster-presentation summarizing your analysis and results.
Reyes JSM 2014 (7)
Binary Choice Analysis
Analysis
Reyes JSM 2014 (8)
Binary Choice Analysis
Model for Data-Generating Process
Model Proposed by Morrison [Am. Stat. (1978)]
Pi is the probability the i-th subject can actually discern between thename-brand and off-brand.
Then, the probability of correctly categorizing the response is
Ci = Pi +1
2(1− Pi )
Assume PiIID∼ Beta(α, β).
Then, the number correct is Yi | Ci ∼ Bin (6,Ci ).
Reyes JSM 2014 (9)
Binary Choice Analysis
Model for Data-Generating Process
Model Proposed by Morrison [Am. Stat. (1978)]
Pi is the probability the i-th subject can actually discern between thename-brand and off-brand.
Then, the probability of correctly categorizing the response is
Ci = Pi +1
2(1− Pi )
Assume PiIID∼ Beta(α, β).
Then, the number correct is Yi | Ci ∼ Bin (6,Ci ).
Reyes JSM 2014 (9)
Binary Choice Analysis
Model for Data-Generating Process
Model Proposed by Morrison [Am. Stat. (1978)]
Pi is the probability the i-th subject can actually discern between thename-brand and off-brand.
Then, the probability of correctly categorizing the response is
Ci = Pi +1
2(1− Pi )
Assume PiIID∼ Beta(α, β).
Then, the number correct is Yi | Ci ∼ Bin (6,Ci ).
Reyes JSM 2014 (9)
Binary Choice Analysis
Model for Data-Generating Process
Model Proposed by Morrison [Am. Stat. (1978)]
Pi is the probability the i-th subject can actually discern between thename-brand and off-brand.
Then, the probability of correctly categorizing the response is
Ci = Pi +1
2(1− Pi )
Assume PiIID∼ Beta(α, β).
Then, the number correct is Yi | Ci ∼ Bin (6,Ci ).
Reyes JSM 2014 (9)
Binary Choice Analysis
Probabilistic Model
Letting f (y | α, β) represent the marginal density of Yi , then we havethat
`(α, β | y) =n∑
i=1
log [f (yi | α, β)]
Determining the marginal density is a good exercise in its own right(Casella and Berger [Statistical Inference (pg 197)]).
Students are expected to show that
f (y | α, β) =
(my
)Γ(α + β)
2mΓ(α)Γ(β)
y∑k=0
(y
k
)Γ(α + k)Γ(β + m − y)
Γ(α + β + m − y + k)
Reyes JSM 2014 (10)
Binary Choice Analysis
Probabilistic Model
Letting f (y | α, β) represent the marginal density of Yi , then we havethat
`(α, β | y) =n∑
i=1
log [f (yi | α, β)]
Determining the marginal density is a good exercise in its own right(Casella and Berger [Statistical Inference (pg 197)]).
Students are expected to show that
f (y | α, β) =
(my
)Γ(α + β)
2mΓ(α)Γ(β)
y∑k=0
(y
k
)Γ(α + k)Γ(β + m − y)
Γ(α + β + m − y + k)
Reyes JSM 2014 (10)
Binary Choice Analysis
Probabilistic Model
Letting f (y | α, β) represent the marginal density of Yi , then we havethat
`(α, β | y) =n∑
i=1
log [f (yi | α, β)]
Determining the marginal density is a good exercise in its own right(Casella and Berger [Statistical Inference (pg 197)]).
Students are expected to show that
f (y | α, β) =
(my
)Γ(α + β)
2mΓ(α)Γ(β)
y∑k=0
(y
k
)Γ(α + k)Γ(β + m − y)
Γ(α + β + m − y + k)
Reyes JSM 2014 (10)
Binary Choice Analysis
Computation of MLE’s
No closed-form solution exists for α̂ and β̂.
Students constructed R programs to compute the estimates using thedata.
Due to numerical instability, care had to be given to the optimizationroutine chosen and the coding of the likelihood.
Reyes JSM 2014 (11)
Binary Choice Analysis
Computation of MLE’s
No closed-form solution exists for α̂ and β̂.
Students constructed R programs to compute the estimates using thedata.
Due to numerical instability, care had to be given to the optimizationroutine chosen and the coding of the likelihood.
Reyes JSM 2014 (11)
Binary Choice Analysis
Computation of MLE’s
No closed-form solution exists for α̂ and β̂.
Students constructed R programs to compute the estimates using thedata.
Due to numerical instability, care had to be given to the optimizationroutine chosen and the coding of the likelihood.
Reyes JSM 2014 (11)
Binary Choice Results
Results
Reyes JSM 2014 (12)
Binary Choice Results
Results for Primary Question
0.0e+00
5.0e−07
1.0e−06
1.5e−06
2.0e−06
0.00 0.25 0.50 0.75 1.00Probability of Actually Discerning
Den
sity
0.00
0.25
0.50
0.75
1.00
0.00 0.25 0.50 0.75 1.00Probability of Actually Discerning
Dis
trib
utio
n F
unct
ion
Reyes JSM 2014 (13)
Binary Choice Results
Results Comparing Those with and without Children
0
50
100
150
200
0.00 0.25 0.50 0.75 1.00Probability of Actually Discerning
Den
sity
Children No Children
0.00
0.25
0.50
0.75
1.00
0.00 0.25 0.50 0.75 1.00Probability of Actually Discerning
Dis
trib
utio
n F
unct
ion
Children No Children
Reyes JSM 2014 (14)
Binary Choice Results
Conclusions
Students gave positive feedback on the activity, particularly seeing aproblem through from design to results.
Activity ties together several topics used throughout the course, andis a nice capstone to maximum likelihood.
Lends itself well to a short poster presentation.
Reyes JSM 2014 (15)
Binary Choice Results
Conclusions
Students gave positive feedback on the activity, particularly seeing aproblem through from design to results.
Activity ties together several topics used throughout the course, andis a nice capstone to maximum likelihood.
Lends itself well to a short poster presentation.
Reyes JSM 2014 (15)
Binary Choice Results
Conclusions
Students gave positive feedback on the activity, particularly seeing aproblem through from design to results.
Activity ties together several topics used throughout the course, andis a nice capstone to maximum likelihood.
Lends itself well to a short poster presentation.
Reyes JSM 2014 (15)
Binary Choice References
References
I Casella G and Berger RL.Statistical Inference (2nd ed, pg 197).Pacific Grove, CA: Thomson Learning, 2002.
I Morrison DG.A Probability Model for Forced Binary Choices.The American Statistician, 32(1):23-25, 1978.
Reyes JSM 2014 (16)
Binary Choice References
Contact Information
Eric M. ReyesDepartment of MathematicsRose-Hulman Institute of Technology5500 Wabash Ave.Terre Haute, IN 47803
web: www.rose-hulman.edu/∼reyesememail: [email protected]
Reyes JSM 2014 (17)