A MACHINE LEARNING APPROACH FOR AUTOMATIC STUDENT MODEL DISCOVERY
Nan Li, Noboru Matsuda, William Cohen, and Kenneth Koedinger
Computer Science Department, Carnegie Mellon University
STUDENT MODEL
• A set of knowledge components (KCs)
• Encoded in intelligent tutors to model how students solve problems
• Example: what to do next on problems like 3x=12
• A key factor behind instructional decisions in automated tutoring systems
STUDENT MODEL CONSTRUCTION
Traditional methods: structured interviews, think-aloud protocols, rational analysis
• Require expert input; highly subjective
Previous automated methods: Learning Factors Analysis (LFA)
• Stays within the search space of human-provided factors
Proposed approach: use a machine-learning agent, SimStudent, to acquire knowledge
• 1 production rule acquired => 1 KC in the student model (Q matrix)
• Independent of human-provided factors
A BRIEF REVIEW OF SIMSTUDENT
• A machine-learning agent that acquires production rules from examples & problem-solving experience, given a set of feature predicates & functions
PRODUCTION RULES
Skill divide (e.g. -3x = 6)
What: Left side (-3x), Right side (6)
When: Left side (-3x) does not have constant term
=> How: Get coefficient (-3) of left side (-3x); Divide both sides with the coefficient
• Each production rule is associated with one KC
• Each step (-3x = 6) is labeled with one KC, decided by the production applied to that step
• The original model required strong domain-specific operators, like Get-coefficient
• It does not differentiate important distinctions in learning (e.g., -x = 3 vs. -3x = 6)
DEEP FEATURE LEARNING
Expert vs. novice (Chi et al., 1981)
• Example: what's the coefficient of -3x? An expert uses deep functional features to reply -3; a novice may use shallow perceptual features to reply 3
• Model deep feature learning using machine learning techniques
• Integrate acquired knowledge into SimStudent learning
• Remove dependence on strong operators & split KCs into finer grain sizes
FEATURE RECOGNITION AS PCFG INDUCTION
• Underlying structure in the problem ↔ Grammar
• Feature ↔ Non-terminal symbol in a grammar rule
• Feature learning task ↔ Grammar induction
• Student errors ↔ Incorrect parsing
LEARNING PROBLEM
Input: a set of feature recognition records, each consisting of
• An original problem (e.g. -3x)
• The feature to be recognized (e.g. -3 in -3x)
Output:
• A probabilistic context-free grammar (PCFG)
• A non-terminal symbol in a grammar rule that represents the target feature
A TWO-STEP PCFG LEARNING ALGORITHM
• Greedy Structure Hypothesizer: hypothesizes grammar rules in a bottom-up fashion; creates non-terminal symbols for frequently occurring sequences (e.g. – and 3, SignedNumber and Variable)
• Viterbi Training Phase: refines rule probabilities; rules that occur more frequently get higher probabilities
Generalizes the inside-outside algorithm (Lari & Young, 1990)
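The Viterbi-training update can be sketched in a few lines: re-estimate each rule's probability as its relative frequency among rules with the same left-hand side, counted over the most-probable parses. This is a minimal illustration, assuming a toy tuple representation of parse trees; it is not the authors' implementation.

```python
from collections import Counter

def reestimate_rule_probs(parse_trees):
    """Viterbi-training update (sketch): p(rule) = count(rule) /
    count(rules sharing the same left-hand side), counted over
    the current most-probable (Viterbi) parse trees."""
    rule_counts = Counter()
    lhs_counts = Counter()

    def walk(node):
        # A node is (lhs, children); leaf children are plain strings.
        lhs, children = node
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        rule_counts[(lhs, rhs)] += 1
        lhs_counts[lhs] += 1
        for c in children:
            if not isinstance(c, str):
                walk(c)

    for tree in parse_trees:
        walk(tree)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}

# Two Viterbi parses, of "-3" and "3", under a hypothetical toy grammar:
trees = [
    ("SignedNumber", [("MinusSign", ["-"]), ("Number", ["3"])]),
    ("SignedNumber", [("Number", ["3"])]),
]
probs = reestimate_rule_probs(trees)
# SignedNumber -> MinusSign Number and SignedNumber -> Number each get 0.5
```

Rules used in more parses end up with higher probabilities, matching the slide's "occur more frequently => higher probabilities."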
EXAMPLE OF PRODUCTION RULES BEFORE AND AFTER INTEGRATION
Extend the "What" part in the production rule.
Original:
Skill divide (e.g. -3x = 6)
What: Left side (-3x), Right side (6)
When: Left side (-3x) does not have constant term
=> How: Get coefficient (-3) of left side (-3x); Divide both sides with the coefficient (-3)
Extended:
Skill divide (e.g. -3x = 6)
What: Left side (-3, -3x), Right side (6)
When: Left side (-3x) does not have constant term
=> How: Divide both sides with the coefficient (-3)
• Fewer operators
• Eliminates need for domain-specific operators
EXPERIMENT METHOD
SimStudent vs. human-generated model: code real student data
• 71 students used a Carnegie Learning Algebra I Tutor on equation solving
• SimStudent: tutored by a Carnegie Learning Algebra I Tutor; coded each step by the applicable production rule; used human-generated coding in case of no applicable production
• Human-generated model: coded manually based on expertise
HUMAN-GENERATED VS SIMSTUDENT KCS

| | Human-generated Model | SimStudent | Comment |
| --- | --- | --- | --- |
| Total # of KCs | 12 | 21 | |
| # of basic arithmetic operation KCs | 4 | 13 | Split into finer grain sizes based on different problem forms |
| # of typein KCs | 4 | 4 | Approximately the same |
| # of other transformation operation KCs (e.g. combine like terms) | 4 | 4 | Approximately the same |
HOW WELL THE TWO MODELS FIT REAL STUDENT DATA
Used the Additive Factor Model (AFM)
• An instance of logistic regression that uses each student, each KC, and KC-by-opportunity interaction as independent variables
• Predicts the probability of a student making an error on a specific step
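For reference, the AFM's standard logistic form (standard AFM notation, not shown on the slide) is:

```latex
\ln\frac{p_{ij}}{1-p_{ij}} \;=\; \theta_i \;+\; \sum_{k} q_{jk}\,\beta_k \;+\; \sum_{k} q_{jk}\,\gamma_k\,T_{ik}
```

where $p_{ij}$ is the predicted probability for student $i$ on step $j$, $\theta_i$ is the student's proficiency, $\beta_k$ is the easiness of KC $k$, $\gamma_k$ is the learning rate of KC $k$, $T_{ik}$ is the number of prior practice opportunities student $i$ has had on KC $k$, and $q_{jk}$ is the Q-matrix entry indicating whether step $j$ exercises KC $k$.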
AN EXAMPLE OF SPLIT IN DIVISION
Human-generated model: divide (covers both Ax=B and -x=A)
SimStudent: simSt-divide (Ax=B), simSt-divide-1 (-x=A)
Q-matrix rows over ten division steps (seven of form Ax=B followed by three of form -x=A):
divide          1 1 1 1 1 1 1 1 1 1
simSt-divide    1 1 1 1 1 1 1 0 0 0
simSt-divide-1  0 0 0 0 0 0 0 1 1 1
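The split above amounts to replacing one Q-matrix row with two rows that partition the same steps. A minimal sketch (the step strings are hypothetical stand-ins for the tutor's actual problem steps):

```python
# Ten hypothetical division steps: seven of form Ax=B, three of form -x=A.
steps = ["3x=6", "4x=8", "2x=4", "5x=10", "6x=12", "7x=14", "8x=16",
         "-x=3", "-x=5", "-x=7"]

def simstudent_kc(step):
    # SimStudent splits by parse: in -x the coefficient -1 is implicit,
    # so those steps get their own KC.
    return "simSt-divide-1" if step.startswith("-x") else "simSt-divide"

# Human-generated model: a single KC labels every division step.
q_human = {"divide": [1] * len(steps)}

# SimStudent model: the two finer-grained KCs partition the steps.
q_sim = {
    kc: [1 if simstudent_kc(s) == kc else 0 for s in steps]
    for kc in ("simSt-divide", "simSt-divide-1")
}
# q_sim["simSt-divide"]   == [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
# q_sim["simSt-divide-1"] == [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
```

Each step is still labeled by exactly one KC; only the grain size changes.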
PRODUCTION RULES FOR DIVISION
Skill simSt-divide (e.g. -3x = 6)
What: Left side (-3, -3x), Right side (6)
When: Left side (-3x) does not have constant term
How: Divide both sides with the coefficient (-3)

Skill simSt-divide-1 (e.g. -x = 3)
What: Left side (-x), Right side (3)
When: Left side (-x) is of the form -v
How: Generate one (1); Divide both sides with -1
AN EXAMPLE WITHOUT SPLIT IN DIVIDE TYPEIN
Human-generated model: divide-typein; SimStudent: simSt-divide-typein
divide-typein        1 1 1 1 1 1 1 1 1
simSt-divide-typein  1 1 1 1 1 1 1 1 1
SIMSTUDENT VS SIMSTUDENT + FEATURE LEARNING
SimStudent:
• Needs strong operators
• Constructs student models similar to the human-generated model
Extended SimStudent:
• Only requires weak operators
• Splits KCs into finer grain sizes based on different parse trees
Does Extended SimStudent produce a KC model that better fits student learning data?
RESULTS

| | Human-generated Model | SimStudent |
| --- | --- | --- |
| AIC | 6529 | 6448 |
| 3-fold cross-validation RMSE | 0.4034 | 0.3997 |

Significance tests:
• SimStudent outperforms the human-generated model in 4260 out of 6494 steps (p < 0.001)
• SimStudent outperforms the human-generated model across 20 runs of cross validation (p < 0.001)
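AIC rewards fit but penalizes parameters, so the 21-KC SimStudent model's lower AIC means its extra KCs are earning their keep rather than just adding degrees of freedom. A minimal sketch of the criterion (the example numbers are illustrative, not from the study):

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln L.
    Lower is better; each extra parameter costs 2 units,
    so a larger KC model must improve fit to win."""
    return 2 * n_params - 2 * log_likelihood

# Illustrative only: a model with log-likelihood -3200 and 64 parameters.
score = aic(-3200.0, 64)  # 2*64 - 2*(-3200) = 6528.0
```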
SUMMARY
• Presented an innovative application of a machine-learning agent, SimStudent, for automatic discovery of student models.
• Showed that a SimStudent-generated student model was a better predictor of real student learning behavior than a human-generated model.
FUTURE STUDIES
• Test generality on other datasets in DataShop
• Apply the proposed approach in other domains: stoichiometry, fraction addition
AN EXAMPLE IN ALGEBRA
A COMPUTATIONAL MODEL OF DEEP FEATURE LEARNING
• Extended a PCFG learning algorithm (Li et al., 2009)
• Feature learning
• Stronger prior knowledge: transfer learning using prior knowledge
FEATURE LEARNING
• Build most probable parse trees for all observation sequences
• Select the non-terminal symbol that matches the most training records as the target feature
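The selection step above can be sketched as a vote over the training records: count, for each non-terminal, how often the substring it covers equals the annotated feature, and pick the winner. This is a simplified illustration with a hypothetical flat span representation, not the authors' implementation.

```python
from collections import Counter

def select_target_symbol(parses, records):
    """Pick the non-terminal whose covered substring matches the
    annotated target feature in the most training records.
    `parses` gives, per record, {non-terminal: substring it covers}."""
    hits = Counter()
    for spans, target in zip(parses, records):
        for symbol, text in spans.items():
            if text == target:
                hits[symbol] += 1
    return hits.most_common(1)[0][0]

# Two records: problem "-3x" with target feature "-3",
# and problem "-4y" with target feature "-4".
parses = [
    {"SignedNumber": "-3", "Variable": "x", "Expression": "-3x"},
    {"SignedNumber": "-4", "Variable": "y", "Expression": "-4y"},
]
records = ["-3", "-4"]
# select_target_symbol(parses, records) -> "SignedNumber"
```

Here SignedNumber matches both annotated coefficients, so it is selected as the symbol representing the "coefficient" feature.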
TRANSFER LEARNING USING PRIOR KNOWLEDGE
GSH phase:
• Build parse trees based on some previously acquired grammar rules
• Then call the original GSH
Viterbi training:
• Add rule frequencies from the previous task to the current task