Predicting Performance on MOOC Assessments using Multi-Regression Models
Zhiyun Ren, Huzefa Rangwala, Aditya Johri
George Mason University
4400 University Drive, Fairfax, Virginia 22030
Outline
q Background
q Personal Linear Multi-Regression Models
q Feature selection
q Experiments and discussion
q Conclusion and future work
Background
Overview
q Information we have: MOOC server log
q Things we want to do: Predict each student’s performance
Challenge
q Various kinds of participants
q High attrition rate
q Flexible timetable
q Baselines we have tried: linear regression model, mean score
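As a rough illustration of the two baselines named above, the sketch below implements a mean-score predictor and a one-feature least-squares fit in plain Python. The slides do not say how the mean score was computed (per student, per homework, or overall), so the per-student mean, the function names, and the single-feature simplification are all assumptions.

```python
from statistics import mean


def mean_score_baseline(previous_scores, default=0.0):
    """Predict the next score as the mean of a student's earlier scores.

    `default` is a hypothetical fallback for students with no history.
    """
    return mean(previous_scores) if previous_scores else default


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # All x values identical: fall back to predicting the mean of y.
        return 0.0, y_bar
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar
```

A prediction for a new feature value `x` is then `slope * x + intercept`.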
Personal Linear Multi-Regression Models
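\[
\hat{g}_{s,a} = b_s + b_{?} + p_s^{T} W f_{sa}
             = b_s + b_{?} + \sum_{d=1}^{l} p_{s,d} \sum_{k=1}^{n_F} f_{sa,k} \, w_{d,k}
\]
(The subscript of the second bias term is not legible in the transcript.)
q p_s: weights of student s over the l regression models (length l)
q W: l × n_F matrix of regression coefficients
q f_sa: feature vector for student s and assessment a (length n_F)
q l: number of regression models
q n_F: number of features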
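The prediction rule on this slide adds two bias terms to a weighted combination of l linear regression models, each applied to the n_F features. A minimal sketch of that computation follows. The function and argument names are hypothetical, and `b_other` stands in for the second bias term, whose meaning is not spelled out here.

```python
def plmr_predict(b_s, b_other, p_s, W, f_sa):
    """Return b_s + b_other + sum_d p_s[d] * sum_k f_sa[k] * W[d][k].

    p_s  : list of length l (student's weight on each regression model)
    W    : l lists of length n_F (one coefficient row per regression model)
    f_sa : list of length n_F (features for this student and assessment)
    """
    if len(p_s) != len(W):
        raise ValueError("p_s needs one entry per regression model")
    mixed = 0.0
    for weight, coefficients in zip(p_s, W):
        if len(coefficients) != len(f_sa):
            raise ValueError("each row of W needs one entry per feature")
        # Output of one linear regression model on the feature vector.
        model_output = sum(w * f for w, f in zip(coefficients, f_sa))
        mixed += weight * model_output
    return b_s + b_other + mixed
```

With l = 1 and p_s = [1.0] this reduces to an ordinary linear regression with a bias, which is one way to see how the model generalizes the linear baseline.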
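\[
\min_{W, P, B} \; \frac{1}{2} \sum_{s,a} \left( g_{s,a} - \hat{g}_{s,a} \right)^2
  + \lambda_1 \left( \|P\|_F^2 + \|W\|_F^2 \right)
  + \lambda_2 \left( \|P\| + \|W\| \right)
\]
(The summation range and the weight names \lambda_1, \lambda_2 are placeholders; the norm used in the last term is not legible in the transcript.)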
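The objective on this slide combines a squared prediction error with regularization on P and W. The sketch below evaluates only the squared-error term and a squared Frobenius penalty. The second penalty is left out because its exact form is not clear from the slide. All names (`plmr_objective`, `lam`, the sample layout) are assumptions made for illustration.

```python
def predict(b_s, b_other, p_s, W, f_sa):
    """Bias terms plus the student-weighted sum of linear model outputs."""
    return b_s + b_other + sum(
        weight * sum(w * f for w, f in zip(row, f_sa))
        for weight, row in zip(p_s, W)
    )


def squared_frobenius(matrix):
    """Sum of squared entries of a matrix stored as a list of rows."""
    return sum(value * value for row in matrix for value in row)


def plmr_objective(samples, student_bias, b_other, P, W, lam):
    """Half the summed squared error plus lam * (||P||_F^2 + ||W||_F^2).

    samples      : list of (student_index, f_sa, observed_grade)
    student_bias : list with one bias per student
    P            : one row of model weights per student
    """
    loss = 0.0
    for s, f_sa, grade in samples:
        error = grade - predict(student_bias[s], b_other, P[s], W, f_sa)
        loss += 0.5 * error * error
    return loss + lam * (squared_frobenius(P) + squared_frobenius(W))
```

Any gradient-based optimizer could minimize this value over W, P, and the biases; the slides do not say which solver was used.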
Data structure
(a) Homework and quiz
(b) Video
(c) Study session
Feature selection
q quiz related features
q time related features
q interval-based features
q homework related features
q Video related features
q Session features
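To make the idea of session features concrete, here is a small sketch that groups timestamped log events into study sessions and summarizes them. The slides do not define a session, so the inactivity threshold of 1800 seconds, the function names, and the three summary values are purely illustrative assumptions.

```python
def split_into_sessions(timestamps, max_gap_seconds=1800):
    """Group event times (in seconds) into sessions.

    A new session starts whenever the gap since the previous event
    exceeds `max_gap_seconds` (a hypothetical threshold).
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= max_gap_seconds:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


def summarize_sessions(sessions):
    """Return simple per-student session statistics."""
    durations = [s[-1] - s[0] for s in sessions]
    count = len(sessions)
    return {
        "n_sessions": count,
        "total_time": sum(durations),
        "mean_session_length": sum(durations) / count if count else 0.0,
    }
```

Values like these could be computed over the events preceding each homework and appended to that homework's feature vector.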
Experimental setup
q Differences in student motivation split the data into two groups.
q A different model is applied to each data type.
Experimental protocol
q PreviousHW-based prediction
q PreviousOneHW-based prediction
[Figure: homework timeline HW1, HW2, HW3, HW4, … shown once for each protocol]
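The two protocols can be read as two ways of choosing training homeworks for a target homework. The sketch below assumes that PreviousHW-based prediction trains on all earlier homeworks and that PreviousOneHW-based prediction trains only on the immediately preceding one. That reading comes from the protocol names, and the function names are hypothetical.

```python
def previous_hw_training_set(target_hw):
    """Homework numbers used for training when predicting `target_hw`
    from all earlier homeworks (1-based numbering)."""
    if target_hw < 2:
        raise ValueError("need at least one earlier homework")
    return list(range(1, target_hw))


def previous_one_hw_training_set(target_hw):
    """Homework numbers used for training when predicting `target_hw`
    from only the homework right before it."""
    if target_hw < 2:
        raise ValueError("need at least one earlier homework")
    return [target_hw - 1]
```

For HW4, the first function yields HW1 through HW3 and the second yields only HW3.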
Experimental baseline: KT-IDEM
[Figure: KT-IDEM diagram with three time steps of K, Q, and I nodes, labeled P(L0), P(T), and per-question P(G1), P(S1) through P(G3), P(S3)]
Model parameters
q P(L0) = Initial knowledge
q P(T) = Probability of learning
q P(G1…n) = Probability of guess per question
q P(S1…n) = Probability of slip per question
n denotes the number of all questions.
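The parameters above drive a standard knowledge-tracing update in which each question has its own guess and slip probability. The following sketch shows the usual Bayesian update and prediction step for such a model. It is not taken from the authors' code: the function names and the dictionary layout are assumptions, and probabilities are assumed to lie strictly between 0 and 1.

```python
def predict_correct(p_known, guess, slip):
    """Probability of a correct answer given the current P(known)."""
    return p_known * (1 - slip) + (1 - p_known) * guess


def update_knowledge(p_known, correct, guess, slip, p_learn):
    """Condition P(known) on the observed answer, then apply learning P(T)."""
    p_correct = predict_correct(p_known, guess, slip)
    if correct:
        posterior = p_known * (1 - slip) / p_correct
    else:
        posterior = p_known * slip / (1 - p_correct)
    return posterior + (1 - posterior) * p_learn


def trace_student(responses, p_l0, p_learn, guess, slip):
    """Return the predicted P(correct) before each response.

    responses  : list of (question_id, answered_correctly)
    guess/slip : dicts mapping question_id to P(G) and P(S)
    """
    p_known = p_l0
    predictions = []
    for question, correct in responses:
        predictions.append(predict_correct(p_known, guess[question], slip[question]))
        p_known = update_knowledge(
            p_known, correct, guess[question], slip[question], p_learn
        )
    return predictions
```

Thresholding each predicted probability (for example at 0.5) gives a binary prediction that can be compared with the regression models on the binary-grade group.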
Comparative Performance
q Prediction results with a varying number of regression models for the student group with continuous grade values
q Prediction results with a varying number of regression models for the student group with binary grade values
q The comparison of the accuracy and F1 scores with baseline approaches.
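For reference, accuracy and F1 on binary predictions can be computed as in the sketch below. It uses the standard definitions; treating label 1 as the positive class and returning 0.0 when F1 is undefined are assumptions, since the slides do not state these conventions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        # Precision and recall are both zero or undefined here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```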
Feature Importance
Conclusion and future work
q Prediction algorithm: personalized multiple linear regression model.
q Experimental results: improved performance compared to baseline methods.
q Other contribution: analysis of feature importance.
q Future work: build an early warning system to help improve students’ performance.
Thank you!