Big Data Research in Undergraduate Education

George Karypis
Department of Computer Science & Engineering
University of Minnesota

PREDICTING STUDENTS’ PERFORMANCE IN COURSE ACTIVITIES

Background & Motivation

• Learning management systems (LMS) are now widely deployed and have become integral components in how universities teach their courses.

– They provide course material distribution, discussion forums, wikis, online quizzes, assignment distribution & submission, an online gradebook, etc.

• They provide a mechanism by which a student’s “engagement” in a course can potentially be observed.

• Research question:

– Can we leverage LMS information to predict how well a student will perform in the course’s assignments?

• Accurate predictions can be used to develop “early warning” systems.

• Task

– Predict the grade that a student will achieve in a graded activity (quiz or assignment) based on information associated with the student’s prior performance, the course, and the student’s LMS interactions.

• Primary data

– University of Minnesota’s Moodle installation.

– Over 11,000 students and 800 courses.

– Over 114,000 assignment submissions, 75,000 quiz submissions and 250,000 forum posts.

Problem setting

Features

• Student performance-specific features:

– Cumulative GPA and cumulative grade in the course so far.

• Activity and course-specific features:

– Activity type, course level, and department.

• Moodle interaction features:

– # of discussions initiated, # of posts written, # of posts read, # of views, # of wiki adds, and # of other activities (e.g., surveys).

– Counts were determined at different time intervals prior to the activity’s due date and covered only the period after the last graded activity.
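The windowed counting described above can be sketched as follows. This is a minimal illustration only: the window lengths (1, 3, and 7 days), the event names, and the helper functions `count_events` and `interaction_counts` are assumptions made for the example, not details given on the slides.

```python
from datetime import datetime, timedelta

def count_events(event_times, due, last_graded, window_days):
    """Count events in the `window_days` days before `due`,
    ignoring anything at or before the last graded activity."""
    start = max(due - timedelta(days=window_days), last_graded)
    return sum(1 for t in event_times if start < t <= due)

def interaction_counts(events_by_type, due, last_graded, windows=(1, 3, 7)):
    """Build one count per (event type, window length) pair.
    The window lengths are hypothetical."""
    return {
        (kind, w): count_events(times, due, last_graded, w)
        for kind, times in events_by_type.items()
        for w in windows
    }

# Toy log for one student and one activity (made-up timestamps).
due = datetime(2014, 3, 10)
last_graded = datetime(2014, 3, 5)
events = {
    "views": [datetime(2014, 3, 1), datetime(2014, 3, 6), datetime(2014, 3, 9, 12)],
    "posts_written": [datetime(2014, 3, 8)],
}
features = interaction_counts(events, due, last_graded)
```

In this toy log the view on March 1 is never counted, because it precedes the last graded activity, even though a longer window would otherwise reach it.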

Models – Baseline

• Linear regression

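$$\hat{g}_{s,a} = \mathbf{w}^{\top} \mathbf{f}_{s,a}$$

– $\hat{g}_{s,a}$: predicted grade for student $s$ on activity $a$

– $\mathbf{f}_{s,a}$: feature vector for student $s$ on activity $a$

– $\mathbf{w}$: estimated linear model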
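As a rough sketch of the baseline, the predicted grade is the dot product of one weight vector with the feature vector of a (student, activity) pair. The fitting procedure below (batch gradient descent on mean squared error), the toy data, and the names `predict`, `fit_linear`, and `mse` are assumptions for illustration; the slides do not say how the model was estimated.

```python
def predict(w, f):
    """Linear model: dot product of weights and features."""
    return sum(wj * fj for wj, fj in zip(w, f))

def mse(w, X, y):
    """Mean squared error of the model over all rows."""
    return sum((predict(w, f) - g) ** 2 for f, g in zip(X, y)) / len(X)

def fit_linear(X, y, lr=0.05, epochs=2000):
    """Fit w by batch gradient descent on mean squared error
    (an assumed estimator, chosen only to keep the sketch self-contained)."""
    n, m = len(X), len(X[0])
    w = [0.0] * m
    for _ in range(epochs):
        grad = [0.0] * m
        for f, g in zip(X, y):
            err = predict(w, f) - g
            for j in range(m):
                grad[j] += 2.0 * err * f[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Made-up rows: [constant 1, cumulative GPA / 4, views / 10]; grades in [0, 1].
X = [[1.0, 0.50, 0.2], [1.0, 0.75, 0.5], [1.0, 1.00, 0.9], [1.0, 0.25, 0.1]]
y = [0.55, 0.75, 0.95, 0.40]
w = fit_linear(X, y)
```

A single weight vector is shared by every student here, which is the limitation the collaborative model addresses.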

Models – Collaborative multi-regression

• Estimates multiple linear regression models with student-specific linear combinations.

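$$\hat{g}_{s,a} = b_s + b_c + \sum_{d=1}^{k} p_{s,d}\, \mathbf{w}_d^{\top} \mathbf{f}_{s,a}$$

– $\hat{g}_{s,a}$: predicted grade for student $s$ on activity $a$

– $\mathbf{f}_{s,a}$: feature vector

– $p_{s,d}$: student-specific combination weight

– $b_s$, $b_c$: student and course bias terms

– $\mathbf{w}_1, \dots, \mathbf{w}_k$: $k$ linear models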
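A minimal sketch of how such a prediction could be computed: each of the k linear models scores the feature vector, the scores are mixed with the student's combination weights, and the student and course bias terms are added. The function name `cmr_predict`, the argument names, and all numbers are hypothetical.

```python
def cmr_predict(f, p_s, W, b_s, b_c):
    """Mix the outputs of the k linear models in W using the student's
    combination weights p_s, then add student and course biases."""
    model_outputs = [sum(wj * fj for wj, fj in zip(w_d, f)) for w_d in W]
    return b_s + b_c + sum(p * out for p, out in zip(p_s, model_outputs))

W = [[0.2, 0.6], [0.7, 0.1]]  # k = 2 made-up linear models over 2 features
f = [0.5, 1.0]                # feature vector for one (student, activity) pair

# Two students with identical features but different memberships and biases.
student_a = cmr_predict(f, p_s=[1.0, 0.0], W=W, b_s=0.05, b_c=0.0)
student_b = cmr_predict(f, p_s=[0.0, 1.0], W=W, b_s=-0.05, b_c=0.0)
```

The two students receive different predictions from the same features, because each leans on a different one of the shared models.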

Collaborative Multi-Regression Models

• Learns a small number of models

– Captures performance patterns of student groups.

– Makes use of the similarities among the students (with respect to performance).

• Achieves personalization through

– Student-specific bias terms.

– Student-specific combination weights (memberships).
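To make concrete how shared models and student-specific parameters can be learned together, here is one stochastic gradient step on the squared error of a single observed grade. The training procedure is an assumption (the slides do not describe it), regularization is omitted, and `sgd_step`, `cmr_predict`, and the numbers are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cmr_predict(f, p_s, W, b_s, b_c):
    """Bias terms plus the membership-weighted sum of the k model outputs."""
    return b_s + b_c + dot(p_s, [dot(w_d, f) for w_d in W])

def sgd_step(f, g, p_s, W, b_s, b_c, lr=0.05):
    """One gradient step on (prediction - g)^2 for one (student, activity) pair.
    Shared models W and the student's own p_s and b_s all move together."""
    outputs = [dot(w_d, f) for w_d in W]
    err = b_s + b_c + dot(p_s, outputs) - g
    step = lr * 2.0 * err
    new_p = [p - step * out for p, out in zip(p_s, outputs)]
    new_W = [[w - step * p * fj for w, fj in zip(w_d, f)]
             for p, w_d in zip(p_s, W)]
    return new_p, new_W, b_s - step, b_c - step

f, g = [0.5, 1.0], 0.9                       # made-up features and observed grade
p_s, W, b_s, b_c = [0.5, 0.5], [[0.2, 0.6], [0.7, 0.1]], 0.0, 0.0

err_before = cmr_predict(f, p_s, W, b_s, b_c) - g
p_s, W, b_s, b_c = sgd_step(f, g, p_s, W, b_s, b_c)
err_after = cmr_predict(f, p_s, W, b_s, b_c) - g
```

Because every student's observations update the same k models, a student with few graded activities still benefits from patterns learned from similar students.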

Results – Prediction accuracy

Results – Effect of bias terms

Results – Feature importance

A deeper look…

[Figure panels: +Moodle features, +assignments, +GPA, +quizzes]

Observations

• Using the Moodle interaction features leads to better prediction accuracy.

• The features that contribute most to the predicted grades relate to:

– Viewing of course material

– Previous performance

• Features related to viewing course material contribute to the predictions of some students more than others.

– Some departments tend to have students whose viewing of course material does not contribute much to their predicted grades.
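One way to read "contributes more for some students than others" under a model that mixes k shared linear models with student-specific weights: the effective weight of a feature for a student is the membership-weighted sum of that feature's weights across the models. The sketch below assumes this decomposition; it is not necessarily how the reported results were computed, and the feature names and numbers are made up.

```python
def feature_contributions(f, p_s, W, names):
    """Per-feature share of the prediction (excluding bias terms):
    feature value times the student's membership-weighted model weight."""
    return {
        name: fj * sum(p * w_d[j] for p, w_d in zip(p_s, W))
        for j, (name, fj) in enumerate(zip(names, f))
    }

names = ["views", "cumulative_grade"]
W = [[0.2, 0.6], [0.7, 0.1]]   # two hypothetical shared linear models
f = [0.5, 1.0]                 # identical feature values for both students

student_a = feature_contributions(f, [1.0, 0.0], W, names)
student_b = feature_contributions(f, [0.0, 1.0], W, names)
```

With identical feature values, viewing activity accounts for a larger part of the second student's prediction than the first's, purely because of their different memberships.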