learning analytics for improving educational games jcsg2017

Using Learning Analytics to improve Serious Games

Baltasar [email protected] , @BaltaFM

e-UCM Research Group , www.e-ucm.es

Jornadas eMadrid, 2017, 04/07/2017

Realising an Applied Gaming Eco-System

JCSG 2017, Universidad Politécnica de Valencia

mailto:[email protected]

http://www.e-ucm.es/

Open issues with serious games- Do serious games actually work?

- Very few SG have a full formal evaluation (e.g. pre-post)- Usually tested with a very limited number of users

- Formal evaluation could be as expensive as creatingthe game (or even more expensive)

- Evaluation is not considered as a strong requirement

- Difficult to deploy games in the classroom- Teachers have very little info about what is happening

when a game is being used

Learning analytics

• Improving education based on data analysis• Data driven• From only theory-driven to evidence-based

• Off-line• Analyzing data after use• Discovering patterns of use• Allows to improve the educacional experience for future

• Real time• Analyzing data while the system is in use to improve/adapt the current learning

experience• Allows to also use it in actual presential classes

Game Analytics

• Application of analytics to game development and research

• Telemetry• Data obtained over distance

• Mobile games, MMOG

• Game metrics• Interpretable measures of data related

to games

• Player behaviour

Game Learning Analytics (GLA) or Informagic?

• GLA is learning analytics applied to serious games• collect, analyze and visualize data from learners’ interactions with SG

• Informagic• False expectations of gaining full insight on the game educational

experience based only on very shallow game interaction data

• Set more realistic expectations about learning analytics with serious games• Requirements • Outcomes• Uses• Cost/Complexity

Perez-Colado, I. J., Alonso-Fernández, C., Freire-Moran, M., Martinez-Ortiz, I., & Fernández-Manjón, B. (2018). Game Learning Analytics is not informagic! In IEEE Global Engineering Education Conference (EDUCON).

Uses of Gaming Learning Analytics in educational games• Game testing – game analytics

• It is the game realiable?• How many students finish the game?• Average time to complete the game?

• Game deployment in the class – tools for teachers• Real-time information for supporting the teacher• Knowing what is happening when the game is deployed in the class• “Stealth” student evaluation

• Formal Game evaluation• From pre-post test to evaluation based on game learning analytics??

Minimun Game Requirements for GLA

• Most of games are black boxes.• No access to what is going on during game play

• We need access to game “guts”• User interactions• Changes of the game state or game variables

• Or the game must communicate with the outside world• Using some logging framework

• What is the meaning of the that data?

• Adequate experimental design and setting• Are users informed?• Anonymization of data could be required

Game Learning Analytics

Evidence-centered assessment design (ECD)The Conceptual Assessment Framework (CAF)

Student Model: What Are We Measuring?Evidence Model: How Do We Measure It?Task Model: Where Do We Measure It?Assembly Model: How Much Do We Need

to Measure It?

(Mislevy, Riconscente, 2005)

Learning Analytics Model (LAM)

LAM: stakeholders, activities, outcomes

xAPI Serious Games application profile

Standard interactions model developed and implemented in Experience API (xAPI) by UCM with ADL (Ángel Serrano et al, 2017).

The model allows tracking of all in-game interactions as xAPI traces(e.g. level started or completed)

Now being extended in BEACONINGto geolocalized games

IEEE stardarization group on xAPI https://www.adlnet.gov/serious-games-cop

https://www.adlnet.gov/serious-games-cop

Game Learning Analytics

Can GLA be systematized?

Realising an Applied Gaming Eco-System

JavaxApi Tracker

UnityxApi Tracker

C#xApi Tracker

Game trackers and cloud analytics frameworks as open code (github)

https://github.com/e-ucm

Client A2 Applications

Games

Analytics

Frontend

A2

Frontend

Pla

ye

rsD

evs,

Stu

de

nts

,

Te

ach

ers

.A

dm

ins

Users

Roles

Resources

Permissions

Applications

Au

the

ntica

tio

n

Au

tho

riza

tio

n

JWT

JSON

WEB

TOKEN

Topology

Analytics Backend

Games

Sessions

Collector

Results

Application 1

Application N

Ka

fka

Qu

eu

e

Games

Sessions

ResultsVisualization

Architecture and technologies

Open code to be deployed in your serverYou own all your game analytics data

Systematization of Analytics Dashboards

As long as traces follow xAPI format, these analysis do not require further configuration!Also possible to configure game-dependent analysis and visualizations for specific games and game characteristics.

Real-time analytics: Alerts and Warnings

• Identify situations that may require teacher intervention

• Fully customizable alert and warning system for real-time teacher feedback

24/11/2017

RAGE Project presentation17

Inactive learner: triggers when no traces received in #number of minutes (e.g. 2 minutes)> High % incorrect answers: after a minimum amount of questions answered, if more than # %of the answers are wrong

Students that need attention

View for an specific student(name anonymized)

xAPI GLA in games authoring

Previous game engine eAdventure (in Java)• Helps to create educational

point & click adventure games

Platform updated to uAdventure (in Unity)

Full integration of game learning analyticsinto uAdventure authoring tool

uAdventure games with default analytics

Include geolocalized games

uAdventure: geolocalized serious games

Geolocalized default analytics visualization

New Geolocalized game scenes and actions

Examples: First Aid - CPR validated game• Collaboration with Centro de Tecnologias Educativas de Aragon, Spain

• Identify a cardiac arrest and teach how to do a cardiopulmonary resuscitation to middle and high school students

• Validated game, in 2011, 4 schools with 340 students

Marchiori EJ, Ferrer G, Fernández-Manjón B, Povar Marco J, Suberviola González JF, Giménez Valverde A. Video-game instruction in basic life support maneuvers. Emergencias. 2012;24:433-7.

BEACONING Experiments: data collection

227 students from 12 to 17 years old

Game rebuilt with uAdventure

Each student completed:

1. Pre-test (multiple-choice questions)

2. Complete gameplay (xAPI traces)

3. Post-test (multiple-choice questions)

104 variables identified for each player.

Replicability of results: knowledge acquisition

Original experiment with

the game

Original experiment, control

group

Current experiment

Lower learning than in original experiment but still significative!

GLA for Improving SG evaluation

Ideally: we would want to find a better evaluation method for SGs, avoiding pre-post experiments. High costs in time and effort.

Our first approach: use data mining techniques to predict pre-test and post-test scores using the data tracked from in-game interactions.

- To measure knowledge acquisition.- But it could also work to measure attitude change or

awareness raised.

GLA data for Improving SG evaluation

Use data mining techniques to predict pre-test and post-test scores using the data tracked from in-game interactions.

1. To avoid pre-test:- Determine the influence of previous knowledge in game

results.

1. To avoid post-test:- Determine the capability of game interactions to predict

post test results when combined with the pre test.- Compare the previous capability to that of game

interactions on their own to predict post test results.

Improving SG evaluation

To avoid pre-test:

- Create prediction models of pre-test using only game data

Score prediction:- As binary category pass / fail

Prediction models used:- Naïve Bayes

Method Precision Recall

Naive Bayes 0.69 0.84

Results obtained:


To avoid post-test:- Create prediction models of post-test score using pre-test + game data- Create prediction models of post-test score using only game data

Score prediction:- As numeric value (from 1 to 15 in our game)- As binary category pass / fail

Prediction models used:- Trees- Regression- Naïve Bayes to compare results


To avoid post-test - summary of results obtained.

-Prediction of post-test score (1 to 15)

-Prediction of post-test pass / fail

Worse predictions without pre-test data but still acceptable results.

Method Pre-test ASE

Regression trees Yes 4.92

No 5.68

Linear regression Yes 5.81

No 5.71

Method Pre-test Precision Recall

Decision trees Yes 0.81 0.94

No 0.88 0.92

Logistic regression

Yes 0.89 0.98

No 0.87 0.98

Naive Bayes Yes 0.92 0.89

No 0.89 0.90

New uses of games based on GLA

- Avoiding pre-test: Games for evaluation

- Avoiding post-test: Games for teaching and measure of learningWith or without pre-test.

BEACONING GLA Case study: Downtown• Serious Game designed and develop to

teach young people with Down Syndrometo move around the city using the subway

• Evaluated with 51 people with cognitivedissabilities (mainly Down Syndrome)

• 42 users with all data

• 3h Gameplay/User

• 300K analytics xAPI data (traces) to analyze

Case Study: Downtown

• From user requirements to a gamedesign and its observables

• Know more about how and what islearn by people with Down Syndrome

31

Media Coverage

32

Game Design and Analysis Workflow

Present

Individualized

Learning Analysis

Collective

Learning Analysis

Predictive

Learning Analysis

….

Group 1

Group 2

Group 3

Game Sessions

Lea

rnin

g P

rogr

ess

d1.a d1.n

d2.a

d3.a

d2.n

d3.n

*d = Data collected during a game session

GLAID (Game Learning Analytics for Intellectual Disabilities) Model

Analytics Framework

User 1

User 2

User n

User 1

User n

User 3 User 2

User 5User 4

User 1

Data Handling

Designer Perspective Educator Perspective

User cognitive

restrictions

Formal

Requirements

Game & Learning

Design

Group of

Observables

Group of

Observables

Descriptive

Analytics

Clustering

Analytics

Predictive/Prescriptive

Analytics

xAPI

33

Hyp 1: Users prefer to identify themselveswith the avatar • REFUTED

• None of the users selected the avatar withDown features despite the trainers showedthem the avatar and pointed that thatcharacter was Down.

• The majority of the users used thepreconfigured character despite they wereasked to customize the avatars at the beginningof the game session.

• We are not observing significative evidences in the users’ play patterns between those whocustomize the character and those who don’t, but it may be significative that the majority of the users that changed the avatar were Down.

34

Hyp: High-Functioning users do a betterperformance using the game• To determine the cognitive skills and autonomy of the users we asked the

trainers to complete a test about each student• 6 intelectual dimensions were measured (5-point Likert scale)

• General cognitive/intellectual ability• Language and communication• Memory acquisition• Attention and distractibility• Processing speed• Executive functioning

• Users were divided in two groups: Medium-Functioning (≤ 3 avg.) and High-Functioning (>3 avg.).

• MF = 19 (45.2%) HF = 23 (54.8%)

35

Hyp 2: High-Functioning users do a betterperformance using the game

36

Number of MF users that played each level Number of HF users that played each level

Hyp 2: High-Functioning users do a betterperformance using the game

37

Average time completing levels for MF Average time completing levels for HF

12:31:21 AM

12:37:02 AM

12:33:16 AM

EASY MEDIUM EXPERT

12:32:44 AM

12:28:33 AM

12:36:51 AM

12:25:58 AM

EASY MEDIUM HARD EXPERT

Hyp: ID users are engaged and motivatedwhile learning with a videogame

38

12:01:30 AM

12:01:03 AM12:00:59 AM12:00:59 AM

12:00:50 AM

12:01:00 AM

12:00:36 AM12:00:38 AM

0:00

0:17

0:35

0:52

1:09

1:26

1:44

1 2 3 4 5 6 7 8

• Inactivity times reducedin a 70,7% avg. fromsession #1 to session #8

• Positive and motivationallearning environment(98,2% users show improvement and engagement performingthe videogame tasks)

Average inactivity time evolution

avg

tim

e

game session

Hyp: The game design of Downtown iseffective as a learning tool

39

• 100% of the trainers agreethat the use of Downtownwould enhance the userlearning adquisition(Perceived Usefulness)

• 85,8% of the users wereable to follow the rightpath (both LF and HF)

• 50,8% of the wrong pathoccured during the first 30 min. of playing

0

20

40

60

80

100

120

140

160

180

1 2 3 4 5 6

Correct vs Incorrect Path per Game session

(#correct stations vs #incorrect stations)

cou

nt

game session

Cyberbullying: Conectado game

Educational desing

● Adventure game- sentiments and emotions are important

● Real situations familiar for students

● Events based on user decision making (but no agression options)

● Scenarios based on research about bullying and cyberbullying

● Different roles of bullying represented

● Designed to be used at classroom

Game mechanics

Seminario eMadrid sobre Serious gaes 2017-02-24 42

New student in schoolOccurs during 5 daysMinigames as “nightmares”Implications of the social networks

The experiment: initial validation

With students from 3 schools( Madrid, Zaragoza, Teruel)

257 students223 Valid pre-post and gameplays (121 males, 102 females)

Significant increase in the ciberbullyingperception

Wilcoxon paired test, p<0.001

5.726.38

Next steps in the project

•Currently validating its use as a tool for teachers• Al least with 100 teachers• At least with 100 educational sciences students

• Validating the dashboards with the teachers• Does the dashboard provide the right and meaningful information• Does the average teacher understand the dashboard• Are the errors and warnings useful for teachers?

• Next: Go into full scale final validation with at least 1000 students• Teachers use the game with their students at class• We got all the analytics data

First validation experience with teachers: perception

13 teachers (11 complete data sets)

Experiment with teachers: applicability of the game

Conclusions

• Game Learning Analytics has a great potential from the business, application and research perspective

• Still complex to implement GLA in SG• Increases the (already high) cost of the games

• Requires expertise not always present in SME

• New standards specifications and open software development could greatly simplify GLA implementation and adoption

48

49

Thank You!

Gracias!

¿Questions?• Mail: [email protected]

• Twitter: @BaltaFM

• GScholar: https://scholar.google.es/citations?user=eNJxjcwAAAAJ&hl=en&oi=ao

• ResearchGate: www.researchgate.net/profile/Baltasar_Fernandez-Manjon

• Slideshare: http://www.slideshare.net/BaltasarFernandezManjon

mailto:[email protected]

https://scholar.google.es/citations?user=eNJxjcwAAAAJ&hl=en&oi=ao

http://www.researchgate.net/profile/Baltasar_Fernandez-Manjon

http://www.slideshare.net/BaltasarFernandezManjon

Hyp 3: Users with previous experience in transportationtraining have a better performance using the game

50

12:30:15 AM12:32:03 AM

12:00:00 AM

12:07:12 AM

12:14:24 AM

12:21:36 AM

12:28:48 AM

12:36:00 AM

12:43:12 AM

12:50:24 AM

No trained Previously trained

Average time completing tasks

Users with no training vs users previously trainedREFUTED

• Almost no differencesbetween trained and no trained users regardingtime completing tasks

• Learning curve using thevideogame is independantof the previous experienceusing the publictransportation system

avg

tim

e

Hyp 4: Technology-trained users do a betterperformance using the game

51

12:32:16 AM

12:28:18 AM

12:00:00 AM

12:07:12 AM

12:14:24 AM

12:21:36 AM

12:28:48 AM

12:36:00 AM

No Videogame Players Videogame Players

Average time completing tasks

No videogame players vs videogame players

CONFIRMED

• Users that play videogamesoften completed the gamesessions faster than usersthat are not used to playvideogames at home

avg

tim

e

http://www.celt.iastate.edu/teaching-resources/effective-practice/revised-blooms-taxonomy/