![Page 1: Machine Teaching as a Probe for Learning Mechanism in Humanspages.cs.wisc.edu/~jerryzhu/machineteaching/pub/MTprobe... · 2018-08-22 · Machine Teaching as a Probe for Learning Mechanism](https://reader034.vdocuments.net/reader034/viewer/2022050519/5fa32628438f3d556252bd7d/html5/thumbnails/1.jpg)
Machine Teaching as a Probe for Learning Mechanism in Humans
Jerry Zhu
University of Wisconsin-Madison
Tsinghua Laboratory of Brain and Intelligence
Workshop on Brain and Artificial Intelligence
Dec. 27, 2017
input → human → performance
Examples:
- human categorization
- human memorization
input → human → performance
input → mathematical model A → performance 2
The usual research flow:
1. run human experiments
2. tweak model A so that “performance 2” ≈ “performance”
3. publish
How to improve “already good” models?
1. no obvious improvements
2. multiple equally good models
Idea: feed atypical input to model A
The most interesting atypical input
Input D∗ that, according to model A, maximizes performance:
$$\max_{D} \; \mathrm{performance}(A(D)) \quad \text{s.t. constraints on } D$$
We call D∗ the optimal teaching input
A little logic
E1 = A is a faithful model of human learning
E2 = D∗ maximizes performance on A
E3 = D∗ maximizes performance on humans
E1 ∧ E2 ⇒ E3
contraposition
¬E3 ⇒ (¬E1 ∨ ¬E2)
If humans do not perform well on D∗ (¬E3), and since Jerry has confidence in how he optimizes D∗ (E2), then the only logical conclusion is ¬E1.
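The argument can be checked mechanically. The following is a minimal sketch that enumerates all truth assignments to E1, E2, E3; the helper name `implies` and the result variables are illustrative, not part of the talk.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication a => b."""
    return (not a) or b

# All 8 truth assignments to (E1, E2, E3).
rows = list(product([False, True], repeat=3))

# (E1 and E2 => E3) is equivalent to its contrapositive (not E3 => not E1 or not E2).
contraposition_ok = all(
    implies(e1 and e2, e3) == implies(not e3, (not e1) or (not e2))
    for e1, e2, e3 in rows
)

# Whenever the implication holds, E2 holds, and E3 fails, E1 must be false.
conclusion_ok = all(
    not e1
    for e1, e2, e3 in rows
    if implies(e1 and e2, e3) and e2 and not e3
)
```

Both checks come out true, which matches the slide: given the implication and confidence in E2, observing ¬E3 leaves only ¬E1.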
The new research flow
1. find the optimal teaching input D∗ that maximizes performance for model A
2. run human experiments with input D∗
3. if human performance improved
   - great! retain model A, publish
   else
   - D∗ exposes problems, revise model A, publish
“Hedging”
Introducing machine teaching
Machine teaching: Finding the optimal teaching input D∗
- Given:
  - model A : D ↦ Θ
  - performance measure p(θ), θ ∈ Θ
  - constraints c(D)
- Optimize: input D∗ that maximizes p(A(D∗))
Case study: human categorization
- training data D = (x1, y1), …, (xn, yn)
- xi: feature vector, yi: class label
- cognitive model A: a (machine) learning algorithm D ↦ Θ
- classifier θ : X ↦ Y
- performance measure p(θ): test set accuracy w.r.t. θ∗ (This requires us to know the target model, or have a labeled test set)
- example constraints c(D):
  - xi ∈ finite candidate pool (vs. ℝ^d)
  - |D| ≤ n
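To make these ingredients concrete, here is a minimal sketch under strong assumptions: a 1D threshold task with target θ∗ = 0.5, a hypothetical midpoint learner standing in for model A, and exhaustive search over a small candidate pool. The names `learner`, `performance`, and `teach` are illustrative and do not come from the talk.

```python
from itertools import combinations

THETA_STAR = 0.5  # assumed target threshold for this toy task

def label(x, theta=THETA_STAR):
    """Classifier theta: X -> Y for a 1D threshold."""
    return 1 if x >= theta else -1

def learner(D):
    """Hypothetical stand-in for model A: the midpoint between the
    largest negative item and the smallest positive item."""
    neg = [x for x, y in D if y == -1]
    pos = [x for x, y in D if y == 1]
    if not neg or not pos:
        return None  # no threshold can be placed from a single class
    return (max(neg) + min(pos)) / 2

def performance(theta, test_x):
    """p(theta): test-set accuracy with respect to the target THETA_STAR."""
    if theta is None:
        return 0.0
    return sum(label(x, theta) == label(x) for x in test_x) / len(test_x)

def teach(pool, test_x, n):
    """Search every D drawn from the pool with |D| <= n; keep the best one."""
    best_D, best_p = None, -1.0
    for k in range(1, n + 1):
        for items in combinations(pool, k):
            D = [(x, label(x)) for x in items]
            p = performance(learner(D), test_x)
            if p > best_p:
                best_D, best_p = D, p
    return best_D, best_p

pool = [i / 10 for i in range(11)]       # finite candidate pool on [0, 1]
test_x = [i / 100 for i in range(101)]   # test items labeled by THETA_STAR
best_D, best_p = teach(pool, test_x, n=2)
```

Exhaustive enumeration is only feasible for tiny pools and budgets; it is used here purely to show how D, A, θ, p, and c(D) fit together.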
Machine teaching: Finding the optimal teaching input
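$$
\begin{aligned}
D^* := \operatorname*{argmax}_{D,\,\theta} \quad & p(\theta) \\
\text{s.t.} \quad & \theta = A(D) \\
& |D| \le n
\end{aligned}
$$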
- first constraint = empirical risk minimization = optimization by itself
- bilevel combinatorial optimization
  - simple A (e.g. linear regression): closed-form D∗
  - convex A (e.g. logistic regression): KKT + implicit function → nonlinear optimization, or mixed-integer nonlinear program
  - complex A (e.g. neural networks): hill climbing etc.
- D∗ usually not i.i.d.
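For the "hill climbing" option, a minimal sketch might look like the following. It assumes a hypothetical 1-nearest-neighbor learner as a stand-in for a complex model A, a toy 1D threshold task, and a random single-item swap as the local move; none of these choices come from the talk, and the search is not guaranteed to find the optimal D∗.

```python
import random

THETA_STAR = 0.5  # assumed target threshold for this toy task

def true_label(x):
    return 1 if x >= THETA_STAR else -1

def nn_predict(D, x):
    """Hypothetical stand-in for model A: 1-nearest-neighbor on a line."""
    return min(D, key=lambda item: abs(item[0] - x))[1]

def performance(D, test):
    """p(A(D)): accuracy of the learner trained on D over the labeled test set."""
    return sum(nn_predict(D, x) == y for x, y in test) / len(test)

def hill_climb(pool, test, n, steps=300, seed=0):
    """Local search: swap one item of D for a random pool item and
    keep the swap whenever performance does not get worse."""
    rng = random.Random(seed)
    D = rng.sample(pool, n)
    history = [performance(D, test)]
    for _ in range(steps):
        candidate = list(D)
        candidate[rng.randrange(n)] = rng.choice(pool)
        score = performance(candidate, test)
        if score >= history[-1]:
            D = candidate
            history.append(score)
    return D, history

pool = [(i / 20, true_label(i / 20)) for i in range(21)]
test = [(i / 100, true_label(i / 100)) for i in range(101)]
D_hat, history = hill_climb(pool, test, n=2)
```

`history` records the accepted scores, so it is non-decreasing by construction. The returned `D_hat` is a deliberately chosen set rather than a random sample, which illustrates why a teaching set is usually not i.i.d.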
Machine teaching example 1
Machine teaching example 2
[Figure: 2D scatter plot (axes: dim 1, dim 2) of + and − labeled items, with labels D1, D2, D3, D4]
Machine teaching example 3
Recent machine teaching research
- NIPS 2017 Workshop on Teaching Machines, Robots, and Humans (my tutorial http://pages.cs.wisc.edu/~jerryzhu/pub/NIPS17WStutorial.pdf)
- Applications:
  - education
  - adversarial attacks
  - human robot interaction
  - interactive machine learning
  - algorithmic fairness
  - machine learning debugging
Human Categorization Example 1 [Patil et al. 2014]
- Human categorization task: line length
- 1D threshold θ∗ = 0.5
- A: kernel density estimator
- Optimal D∗:
  [Figure: the interval from 0 to 1 with threshold θ∗ = 0.5 and regions labeled y = −1 and y = 1]

| human trained on | human test accuracy |
|---|---|
| random items | 69.8% |
| optimal D∗ | 72.5% |

(statistically significant)
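To illustrate what a kernel-density-based learner on a 1D threshold task can look like, here is a minimal sketch. It is a generic Gaussian-kernel classifier with an assumed bandwidth, not the specific model of Patil et al.; the names `kde_score` and `kde_classify` and the two-item training set are hypothetical.

```python
import math

def kde_score(x, items, bandwidth):
    """Unnormalized Gaussian kernel density at x from the items of one class."""
    return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in items)

def kde_classify(x, D, bandwidth=0.1):
    """Predict the class whose kernel density at x is larger (ties go to +1)."""
    pos = [xi for xi, yi in D if yi == 1]
    neg = [xi for xi, yi in D if yi == -1]
    if kde_score(x, pos, bandwidth) >= kde_score(x, neg, bandwidth):
        return 1
    return -1

# Hypothetical teaching set: one item on each side of the threshold 0.5.
D = [(0.4, -1), (0.6, 1)]
predictions = {x: kde_classify(x, D) for x in (0.1, 0.45, 0.55, 0.9)}
```

With one item per class placed symmetrically around 0.5, this learner's decision boundary sits near 0.5. With unequal class sizes, summing kernels (as here) and averaging them give different boundaries, which is a modeling choice left open in this sketch.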
Human Categorization Example 2 [Sen et al. in preparation]
- Human categorization task: same or different molecules
  [Figure: Lewis representation and space-filling representation]
- A: neural network
- Optimal D∗ (n = 60):

| human trained on | human pre-test error | post-test error |
|---|---|---|
| random input | 31.7% | 28.6% |
| expert input | 28.7% | 28.1% |
| D∗ | 30.6% | 25.1% (statistically significant) |
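The pre-test to post-test change per condition follows directly from the reported numbers. A small sketch of that arithmetic, in percentage points (the variable names are illustrative):

```python
# (pre-test error %, post-test error %) as reported on the slide
errors = {
    "random input": (31.7, 28.6),
    "expert input": (28.7, 28.1),
    "D*": (30.6, 25.1),
}

# Error reduction in percentage points, rounded to one decimal place.
reduction = {
    name: round(pre - post, 1) for name, (pre, post) in errors.items()
}

# Condition with the largest reduction.
best = max(reduction, key=reduction.get)
```

This gives reductions of 3.1, 0.6, and 5.5 points for random input, expert input, and D∗ respectively. The significance statement is the slide's own; this arithmetic does not test it.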
Human Categorization Example 3 [Nosofsky & Sanders, Psychonomics 2017]
- Human categorization task: rock type
- Model A: Generalized Context Model (GCM)
- Optimal D∗ does not work better on humans

| human trained on | human accuracy |
|---|---|
| random input | 67.2% |
| coverage input | 71.2% |
| D∗ | 69.3% |

- Experts are revising the model
Summary
1. Find D∗ that maximizes performance for model A
2. Run human experiments with input D∗
   - either human performance improved
   - or model A revised
http://pages.cs.wisc.edu/~jerryzhu/machineteaching/