A MACHINE LEARNING APPROACH FOR AUTOMATIC STUDENT MODEL DISCOVERY
Nan Li, Noboru Matsuda, William Cohen, and Kenneth Koedinger
Computer Science Department, Carnegie Mellon University


TRANSCRIPT

Page 1: A Machine Learning Approach for Automatic Student Model Discovery

A MACHINE LEARNING APPROACH FOR AUTOMATIC STUDENT MODEL DISCOVERY
Nan Li, Noboru Matsuda, William Cohen, and Kenneth Koedinger

Computer Science Department

Carnegie Mellon University

Page 2: A Machine Learning Approach for Automatic Student Model Discovery

STUDENT MODEL

• A set of knowledge components (KCs)
• Encoded in intelligent tutors to model how students solve problems
• Example: what to do next on problems like 3x = 12
• A key factor behind instructional decisions in automated tutoring systems

Page 3: A Machine Learning Approach for Automatic Student Model Discovery

STUDENT MODEL CONSTRUCTION

• Traditional methods: structured interviews, think-aloud protocols, rational analysis
  - Require expert input; highly subjective
• Previous automated methods: learning factor analysis (LFA)
  - Search only within the space of human-provided factors
• Proposed approach: use a machine-learning agent, SimStudent, to acquire knowledge
  - 1 production rule acquired => 1 KC in the student model (Q matrix)
  - Independent of human-provided factors
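The rule-to-KC mapping forms a Q matrix, with one row per problem step and one column per KC. A minimal illustrative sketch in Python; the step and KC names are made up, not taken from the study:

```python
# Hypothetical Q matrix: rows are problem steps, columns are KCs.
# A 1 means the step exercises that KC. All names are illustrative.
steps = ["3x=12 (divide)", "x+2=5 (subtract)", "2x+1=7 (subtract)", "2x=6 (divide)"]
kcs = ["divide", "subtract"]

q_matrix = [
    [1, 0],  # 3x=12 needs divide
    [0, 1],  # x+2=5 needs subtract
    [0, 1],  # 2x+1=7 needs subtract
    [1, 0],  # 2x=6 needs divide
]

def kcs_for_step(row):
    """Return the KC labels exercised by one step (one Q-matrix row)."""
    return [kc for kc, bit in zip(kcs, row) if bit == 1]

print(kcs_for_step(q_matrix[0]))  # ['divide']
```

Under the proposed approach, each column of this matrix corresponds to one production rule SimStudent acquired.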

Page 4: A Machine Learning Approach for Automatic Student Model Discovery

A BRIEF REVIEW OF SIMSTUDENT

• A machine-learning agent that acquires production rules from examples & problem-solving experience, given a set of feature predicates & functions

Page 5: A Machine Learning Approach for Automatic Student Model Discovery

PRODUCTION RULES

Skill divide (e.g. -3x = 6)
• What: left side (-3x), right side (6)
• When: left side (-3x) does not have a constant term
• => How: get the coefficient (-3) of the left side (-3x); divide both sides by the coefficient

• Each production rule is associated with one KC
• Each step (e.g. -3x = 6) is labeled with one KC, decided by the production applied to that step
• The original model required strong domain-specific operators, like Get-coefficient
• It does not differentiate important distinctions in learning (e.g., -x = 3 vs. -3x = 6)
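The What/When/How anatomy above can be pictured as a small data structure. This is an illustrative Python sketch, not SimStudent's actual representation; `no_constant_term` and `divide_by_coefficient` are simplified stand-ins for its feature predicates and operators:

```python
import re
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProductionRule:
    """Illustrative stand-in for a SimStudent production rule."""
    name: str
    when: Callable[[str, str], bool]   # precondition on (left, right)
    how: Callable[[str, str], str]     # action producing the next step

def no_constant_term(left, right):
    # "When": the left side has no constant term (e.g. -3x, but not -3x+1).
    return re.fullmatch(r"-?\d*x", left) is not None

def divide_by_coefficient(left, right):
    # "How": a stand-in for the strong Get-coefficient operator,
    # then dividing both sides by that coefficient.
    coef = left[:-1] or "1"
    if coef == "-":
        coef = "-1"
    return f"x = {right}/{coef}"

divide = ProductionRule("divide", no_constant_term, divide_by_coefficient)

left, right = "-3x", "6"
if divide.when(left, right):
    print(divide.how(left, right))  # x = 6/-3
```

Note how the sketch already shows the grain-size problem: the same rule fires on -3x = 6 and -x = 3 without distinguishing them.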

Page 6: A Machine Learning Approach for Automatic Student Model Discovery

DEEP FEATURE LEARNING

• Expert vs. novice (Chi et al., 1981). Example: what is the coefficient of -3x?
  - An expert uses deep functional features to reply -3
  - A novice may use shallow perceptual features to reply 3
• Model deep feature learning using machine learning techniques
• Integrate the acquired knowledge into SimStudent learning
• Remove the dependence on strong operators & split KCs into finer grain sizes

Page 7: A Machine Learning Approach for Automatic Student Model Discovery

FEATURE RECOGNITION AS PCFG INDUCTION

• Underlying structure in the problem <-> grammar
• Feature <-> non-terminal symbol in a grammar rule
• Feature learning task <-> grammar induction
• Student errors <-> incorrect parsing

Page 8: A Machine Learning Approach for Automatic Student Model Discovery

LEARNING PROBLEM

• Input: a set of feature recognition records, each consisting of
  - an original problem (e.g. -3x)
  - the feature to be recognized (e.g. -3 in -3x)
• Output:
  - a probabilistic context-free grammar (PCFG)
  - a non-terminal symbol in a grammar rule that represents the target feature

Page 9: A Machine Learning Approach for Automatic Student Model Discovery

A TWO-STEP PCFG LEARNING ALGORITHM

• Greedy Structure Hypothesizer (GSH):
  - hypothesizes grammar rules in a bottom-up fashion
  - creates non-terminal symbols for frequently occurring sequences, e.g. "-" and "3", SignedNumber and Variable
• Viterbi Training Phase:
  - refines rule probabilities; rules that occur more frequently get higher probabilities
  - generalizes the inside-outside algorithm (Lari & Young, 1990)
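A heavily simplified sketch of both steps: the structure hypothesizer repeatedly merges the most frequent adjacent symbol pair into a fresh non-terminal (mirroring how "-" and "3" become a SignedNumber), and rule probabilities come from normalized usage frequencies. This is illustrative only, not the algorithm from the paper:

```python
from collections import Counter

def hypothesize_rules(problems, min_count=2, max_rules=10):
    """GSH-flavored sketch: repeatedly merge the most frequent adjacent
    pair of symbols into a new non-terminal, bottom-up."""
    sequences = [list(p) for p in problems]
    rules = {}  # new non-terminal -> the (left, right) pair it rewrites
    for i in range(max_rules):
        pairs = Counter()
        for seq in sequences:
            pairs.update(zip(seq, seq[1:]))
        if not pairs or pairs.most_common(1)[0][1] < min_count:
            break
        a, b = pairs.most_common(1)[0][0]
        nt = f"N{i}"
        rules[nt] = (a, b)
        for seq in sequences:  # rewrite every occurrence of the pair
            j = 0
            while j < len(seq) - 1:
                if seq[j] == a and seq[j + 1] == b:
                    seq[j:j + 2] = [nt]
                j += 1
    return rules, sequences

def rule_probabilities(rule_counts):
    """Viterbi-training flavor: rules used more often get higher
    probability, normalized per left-hand-side non-terminal."""
    lhs_totals = Counter()
    for (lhs, _), c in rule_counts.items():
        lhs_totals[lhs] += c
    return {rule: c / lhs_totals[rule[0]] for rule, c in rule_counts.items()}

rules, reduced = hypothesize_rules(["-3x", "-4x", "-3"])
print(rules)    # {'N0': ('-', '3')}
print(reduced)  # [['N0', 'x'], ['-', '4', 'x'], ['N0']]
print(rule_probabilities({("S", "N0 x"): 3, ("S", "N0"): 1}))  # 0.75 / 0.25
```

The real GSH builds full grammar rules and the real training phase generalizes inside-outside estimation; this sketch only conveys the bottom-up, frequency-driven flavor.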

Page 10: A Machine Learning Approach for Automatic Student Model Discovery

EXAMPLE OF PRODUCTION RULES BEFORE AND AFTER INTEGRATION

Extend the "What" part in the production rule.

Original: Skill divide (e.g. -3x = 6)
• What: left side (-3x), right side (6)
• When: left side (-3x) does not have a constant term
• => How: get the coefficient (-3) of the left side (-3x); divide both sides by the coefficient (-3)

Extended: Skill divide (e.g. -3x = 6)
• What: left side (-3, -3x), right side (6)
• When: left side (-3x) does not have a constant term
• => How: divide both sides by the coefficient (-3)

• Fewer operators
• Eliminates the need for domain-specific operators


Page 12: A Machine Learning Approach for Automatic Student Model Discovery

EXPERIMENT METHOD

• SimStudent vs. human-generated model: code real student data
  - 71 students used a Carnegie Learning Algebra I Tutor on equation solving
• SimStudent:
  - tutored by a Carnegie Learning Algebra I Tutor
  - coded each step by the applicable production rule
  - used the human-generated coding in case of no applicable production
• Human-generated model: coded manually based on expertise

Page 13: A Machine Learning Approach for Automatic Student Model Discovery

HUMAN-GENERATED VS SIMSTUDENT KCS

• Total # of KCs: human-generated 12, SimStudent 21
• Basic arithmetic operation KCs: 4 vs. 13 (split into finer grain sizes based on different problem forms)
• Typein KCs: 4 vs. 4 (approximately the same)
• Other transformation operation KCs (e.g. combine like terms): 4 vs. 4 (approximately the same)

Page 14: A Machine Learning Approach for Automatic Student Model Discovery

HOW WELL THE TWO MODELS FIT REAL STUDENT DATA

• Used the Additive Factor Model (AFM)
• An instance of logistic regression that
  - uses each student, each KC, and each KC-by-opportunity interaction as independent variables
  - predicts the probability of a student making an error on a specific step
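In AFM the log-odds combine a student proficiency term with, for each KC the step exercises (per the Q matrix), a KC easiness term plus a learning-rate term scaled by prior practice opportunities. A sketch with made-up parameter values (the real model fits these to data; the slide's error probability is one minus the success probability shown here):

```python
import math

def afm_success_probability(theta, q_row, beta, gamma, opportunities):
    """Additive Factor Model sketch: logistic regression whose log-odds add
    student proficiency (theta) and, per KC on the step, KC easiness (beta)
    plus learning rate (gamma) times prior opportunities. Values are made up."""
    log_odds = theta
    for k, uses_kc in enumerate(q_row):
        if uses_kc:
            log_odds += beta[k] + gamma[k] * opportunities[k]
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical numbers: one student, a step exercising KC 0 only.
p_first = afm_success_probability(0.2, [1, 0], [-0.5, 0.1], [0.3, 0.2], [0, 0])
p_fifth = afm_success_probability(0.2, [1, 0], [-0.5, 0.1], [0.3, 0.2], [4, 0])
print(p_first, p_fifth)  # success probability rises with practice
```

A student model (Q matrix) that groups steps into the right KCs makes these practice curves fit the observed data better, which is what the AIC and RMSE comparisons below measure.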

Page 15: A Machine Learning Approach for Automatic Student Model Discovery

AN EXAMPLE OF A SPLIT IN DIVISION

• Human-generated model: divide covers both Ax=B and -x=A
• SimStudent:
  - simSt-divide: Ax=B
  - simSt-divide-1: -x=A

KC coding across ten division steps (the last three are of the form -x=A):
  divide:          1 1 1 1 1 1 1 1 1 1
  simSt-divide:    1 1 1 1 1 1 1 0 0 0
  simSt-divide-1:  0 0 0 0 0 0 0 1 1 1

Page 16: A Machine Learning Approach for Automatic Student Model Discovery

PRODUCTION RULES FOR DIVISION

Skill simSt-divide (e.g. -3x = 6)
• What: left side (-3, -3x), right side (6)
• When: left side (-3x) does not have a constant term
• How: divide both sides by the coefficient (-3)

Skill simSt-divide-1 (e.g. -x = 3)
• What: left side (-x), right side (3)
• When: left side (-x) is of the form -v
• How: generate one (1); divide both sides by -1

Page 17: A Machine Learning Approach for Automatic Student Model Discovery

AN EXAMPLE WITHOUT A SPLIT IN DIVIDE-TYPEIN

• Human-generated model: divide-typein
• SimStudent: simSt-divide-typein

KC coding (identical across all steps):
  divide-typein:        1 1 1 1 1 1 1 1 1
  simSt-divide-typein:  1 1 1 1 1 1 1 1 1

Page 18: A Machine Learning Approach for Automatic Student Model Discovery

SIMSTUDENT VS SIMSTUDENT + FEATURE LEARNING

• SimStudent:
  - needs strong operators
  - constructs student models similar to the human-generated model
• Extended SimStudent:
  - only requires weak operators
  - splits KCs into finer grain sizes based on different parse trees

Does Extended SimStudent produce a KC model that better fits student learning data?

Page 19: A Machine Learning Approach for Automatic Student Model Discovery

RESULTS

• AIC (lower is better): human-generated model 6529, SimStudent 6448
• 3-fold cross-validation RMSE: human-generated model 0.4034, SimStudent 0.3997
• Significance tests:
  - SimStudent outperforms the human-generated model on 4260 out of 6494 steps (p < 0.001)
  - SimStudent outperforms the human-generated model across 20 runs of cross-validation (p < 0.001)

Page 20: A Machine Learning Approach for Automatic Student Model Discovery

SUMMARY

• Presented an innovative application of a machine-learning agent, SimStudent, for the automatic discovery of student models.
• Showed that a SimStudent-generated student model was a better predictor of real student learning behavior than a human-generated model.

Page 21: A Machine Learning Approach for Automatic Student Model Discovery

FUTURE STUDIES

• Test generality on other datasets in DataShop
• Apply the proposed approach in other domains: stoichiometry, fraction addition

Page 22: A Machine Learning Approach for Automatic Student Model Discovery


Page 23: A Machine Learning Approach for Automatic Student Model Discovery


AN EXAMPLE IN ALGEBRA


Page 26: A Machine Learning Approach for Automatic Student Model Discovery

A COMPUTATIONAL MODEL OF DEEP FEATURE LEARNING

• Extended a PCFG learning algorithm (Li et al., 2009) with:
  - feature learning
  - stronger prior knowledge: transfer learning using prior knowledge


Page 28: A Machine Learning Approach for Automatic Student Model Discovery

FEATURE LEARNING

• Build the most probable parse trees for all observation sequences
• Select the non-terminal symbol that matches the most training records as the target feature
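The selection step can be sketched as a vote over parse-tree nodes: the non-terminal whose yield equals the labeled feature in the most records wins. The hand-built trees below stand in for the learned grammar's parses; all names are illustrative:

```python
from collections import Counter

def spans(tree):
    """Return ([(nonterminal, yielded_text), ...], full_text) for a parse
    tree given as (label, children) tuples; leaves are (label, "text")."""
    label, children = tree
    if isinstance(children, str):
        return [(label, children)], children
    pairs, text = [], ""
    for child in children:
        child_pairs, child_text = spans(child)
        pairs.extend(child_pairs)
        text += child_text
    pairs.append((label, text))
    return pairs, text

def pick_feature_nonterminal(records):
    """records: (parse_tree, labeled_feature) pairs. Pick the non-terminal
    whose yield matches the labeled feature in the most records."""
    votes = Counter()
    for tree, feature in records:
        pairs, _ = spans(tree)
        for label, text in pairs:
            if text == feature:
                votes[label] += 1
    return votes.most_common(1)[0][0]

# Hand-built parse of "-3x": Expr -> SignedNumber Variable; SignedNumber -> - 3
tree1 = ("Expr", [("SignedNumber", [("Sign", "-"), ("Number", "3")]),
                  ("Variable", "x")])
tree2 = ("Expr", [("SignedNumber", [("Sign", "-"), ("Number", "4")]),
                  ("Variable", "x")])
print(pick_feature_nonterminal([(tree1, "-3"), (tree2, "-4")]))  # SignedNumber
```

Here the coefficient feature (-3, -4) is yielded only by the SignedNumber node, so that non-terminal is selected as the target feature.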

Page 29: A Machine Learning Approach for Automatic Student Model Discovery

TRANSFER LEARNING USING PRIOR KNOWLEDGE

• GSH phase:
  - build parse trees based on previously acquired grammar rules
  - then call the original GSH
• Viterbi training:
  - add rule frequencies from the previous task to the current task
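The Viterbi-training transfer step can be sketched as merging rule-frequency tables before renormalizing per left-hand-side non-terminal. The rules and counts below are made up for illustration; this is not the paper's implementation:

```python
from collections import Counter

def transfer_rule_probabilities(previous_counts, current_counts):
    """Sketch of transfer: add rule frequencies observed in a previous task
    to the current task's counts, then renormalize per left-hand side.
    Rules are (lhs, rhs) pairs; all counts here are hypothetical."""
    merged = Counter(previous_counts)
    merged.update(current_counts)
    lhs_totals = Counter()
    for (lhs, _), c in merged.items():
        lhs_totals[lhs] += c
    return {rule: c / lhs_totals[rule[0]] for rule, c in merged.items()}

previous = {("SignedNumber", "- Number"): 4, ("SignedNumber", "Number"): 2}
current = {("SignedNumber", "- Number"): 1, ("SignedNumber", "Number"): 1}
probs = transfer_rule_probabilities(previous, current)
print(probs)  # {('SignedNumber', '- Number'): 0.625, ('SignedNumber', 'Number'): 0.375}
```

The effect is that grammar rules already trusted from an earlier task keep high probability even when the current task provides few observations.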
