
Analyzing Student Strategies In Blended Courses Using Clickstream Data

Nil-Jana Akpinar
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213
[email protected]

Aaditya Ramdas
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213
[email protected]

Umut Acar
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213
[email protected]

ABSTRACT
Educational software data promises unique insights into students' study behaviors and drivers of success. While much work has been dedicated to performance prediction in massive open online courses, it is unclear if the same methods can be applied to blended courses, and a deeper understanding of student strategies is often missing. We use pattern mining and models borrowed from Natural Language Processing (NLP) to understand student interactions and extract frequent strategies from a blended college course. Fine-grained clickstream data is collected through Diderot, a non-commercial educational support system that spans a wide range of functionalities. We find that interaction patterns differ considerably based on the assessment type students are preparing for, and many of the extracted features can be used for reliable performance prediction. Our results suggest that the proposed hybrid NLP methods can provide valuable insights even in the low-data setting of blended courses, given enough data granularity.

Keywords
Student Strategies, Blended Courses, hybrid NLP methods

1. INTRODUCTION
Data collected through educational software systems can provide promising starting points to address hard questions rooted in the learning sciences. Modern education relies increasingly on these systems to assist teaching and grading, manage learning content, provide discussion boards, facilitate group work, or replace the traditional classroom setting altogether. While blended courses revolve around the traditional classroom setting accompanied by task-specific software support, Massive Open Online Courses (MOOCs) are usually entirely virtual and often involve video lectures and hundreds to thousands of students in a single course. Almost by design, these systems come with unprecedented opportunities for large-scale data collection on students' study habits, content exposure, and learning trajectories.

Much of the previous research effort has been directed towards performance prediction, with the overall rationale that reliable estimation of students' grades and dropout probability at early course stages can be used to devise Early Warning Systems (EWSs) [e.g. 30, 18, 29, 7]. Despite considerable success in this area, many performance prediction models suffer from a list of shortcomings. Prior work on performance prediction from student online activity data has predominantly focused on MOOCs [e.g. 8, 28, 24], and it is unclear if the same methods can be applied to blended courses [3]. In most blended courses, some of the learning activity takes place offline and cannot be tracked, which leads to relatively shallow data on only fragments of courses. In addition, many of the features that can be derived are simple and coarse summary statistics of students' online activity data, e.g. counts of clicks or logins, that have only a limited capacity to reflect the often complex strategies students take when interacting with course material.

A detailed understanding of how students interact with educational systems and the strategies they take is crucial for reliable performance prediction. We thus seek to understand how students approach learning in blended courses based on the second half of a sophomore-level college course in computer science. Our data is drawn from Diderot, a non-commercial educational software system developed at Carnegie Mellon University which spans functions for virtually all course components outside of face-to-face class and recitation times, and thereby allows us to overcome many of the challenges that are generally faced when mining blended courses. Despite evident similarities, there are several important characteristics which differentiate our blended learning setting from the study of MOOCs. Most importantly, our data spans relatively few students and student actions, which constitutes a challenge for many of the previously proposed methods. In addition, we have access to data that is unique to in-person classes such as individual attendance, and the nature of our activity data facilitates contextualization of student behavior, which promises to increase the interpretability of downstream prediction models.

In this paper, we place a dual focus on methodology and educational insights. On the one hand, we propose new modeling pipelines based on ideas from natural language processing that work well in the low-data setting of blended courses. On the other hand, we apply both new and existing methods to Diderot data and gain valuable insights into student behavior while addressing the following research questions:


RQ1 How do students interact with course material, and what are frequent strategies they take?

RQ2 How do students use these strategies for homework solving as compared to exam preparation?

RQ3 Are student strategies indicative of grade outcomes?

The remainder of this paper is outlined as follows. We discuss related work in Section 2, and proceed to give some context for the data in Section 3. Section 4 describes our methods including the preprocessing of clickstream data, and we discuss our results in Section 5. Finally, conclusions are drawn in Section 6.

2. RELATED WORK

2.1 Analysis Of Online Student Behavior
Raw data from educational software systems often comes in the form of time-stamped student actions with an array of suitable identifiers. Evidence for correlations between activity log-based features and performance outcomes is plentiful. Many of the commonly discussed features revolve around simple summary statistics such as counts of certain types of actions, and have been shown to be indicative of students' success, particularly in MOOCs. Recent lines of research find links between general course completion in MOOCs and the number of watched videos [38, 9], the number of question answer attempts [9], and the time spent on assignments [4]. Similar results have been observed for blended courses but are much scarcer [39, 15]. In [15], the authors analyze sequences of transitions between different online platforms in two undergraduate-level college courses. Their study finds that, although students are generally more likely to stay on the same platform in a study session, high-achieving students transition more often and are more likely to use the discussion board. In many cases, the limited amount of data in blended courses is problematic and can lead to complications such as zero-inflated count variables.

A major shortcoming of count-based methods is their failure to leverage the sequential structure of students' interactions with educational software systems. Both the order of and the time difference between actions promise to carry valuable information that can be taken into consideration when relying on sequence-based methods instead. In this work, we propose a pipeline for analyzing student online behavior based on session study sequences. While the order of actions is taken into account explicitly, time differences help us to derive reliable study sessions.

2.2 Study Sessions
Sequence-based approaches to processing online student activity data group student actions into smaller sessions. In the case of click actions, these sequences are generally referred to as clickstreams. The goal when breaking a flow of actions into session clickstreams is to maintain some notion of interpretability, i.e. to devise meaningful study sessions. While this appears to be easy in some cases, it is generally non-trivial to find automated cut-off rules that yield sensible representations of study sessions for a large and diverse set of clickstreams at once.

Previous research suggests several different strategies to split clickstreams. The authors of [8] choose fixed-duration time frames to group student actions from a several-months-long MOOC. They settle on durations between one day and one month and show some success in the downstream prediction of student achievements with their choices. Similar fixed durations are used in [2]. Another popular splitting strategy is based on time-out thresholds, where a new sub-session is started when no action was performed in a predefined time window [31, 5, 35, 43, 12]. The authors of other studies go one step further and combine the approaches by splitting first at a fixed duration cut-off and second at data-driven timeout thresholds of 15 minutes for 'study sessions' and 40 minutes for 'browser sessions' [15]. Similar data-driven approaches are pursued in [45, 39]. Other common heuristics include splitting at navigational criteria such as reloading of the course page [25].

On a high level, the problem of devising meaningful sub-sessions is closely related to the problem of time-at-task estimation in web-usage mining. Ideally, study sessions reflect time periods in which students interact with the material without any major breaks or distractions. There is a rich body of literature on time-at-task estimation suggesting that there is no one-size-fits-all solution to finding suitable time windows at which to split activity streams [25, 6, 11]. Previous research suggests that the exact splitting heuristic can have a significant effect on overall model fit, model significance, and even the interpretation of findings in downstream modeling tasks [25]. In [25], the authors explore the effect of 15 different time-at-task estimation procedures on five different models of student performance. Overall, the authors conclude that there is no universally best method and recommend a mixture of existing methods including data-driven components. Following this suggestion, we employ a multi-step splitting procedure including navigational criteria, data-driven time-out thresholds, and separation of assessment weeks, inspired by the procedure in [15].

2.3 Sequence Analysis
Different methods have been proposed to process sequence-type student action data depending on the amount of data, the length of sequences, and the goal at hand. Several lines of research rely on Markov chains and hidden Markov models, which lend themselves well to visualization of sequences but can make quantification of group differences in outcomes challenging [14, 13, 19]. Another commonly used class of methods is clustering of activity sequences [12, 22, 16]. Using data from three large MOOCs, the authors of [22] draw on simple k-means clustering of sequences of interactions with video lectures and assessments and observe four high-level student trajectories: completing assessments, auditing the course, disengaging after a while, and sampling content. In order to cluster the sequences, the authors rely on a numerical translation of student actions. The authors of [12] cluster and visualize students' interactions with a college math environment, and instead rely on Levenshtein distance to measure the distance between sequences. Some works combine Markov models and clustering to account for the randomness introduced by the Markov models and report more robust results [40, 26, 23].

Although the described methods allow for a relatively easy grouping of sequences, interpretation of clusters can be non-obvious. One way to address this problem is to deliberately focus on finding relevant sub-parts of action sequences. Methods based on this goal can be summarized under the term pattern mining, and are both widespread and diverse. A relatively recent approach is given by differential pattern mining, which focuses on automatically extracting patterns that are both above a certain threshold in frequency and sufficiently different among groups of interest (e.g. high- and low-achieving students) [21, 20]. Other lines of research rely on more traditional data mining techniques [17, 34], or extraction of n-grams, i.e. sub-sequences of n consecutive actions [8, 32, 44, 36]. The authors of [32] use a multi-step procedure to extract frequent n-grams that are subsequently used to identify different strategies in a collaborative interactive tabletop game. Part of our analysis is based on a similar approach to extract frequent behavioral patterns, and combines ideas of n-gram extraction and clustering to get more robust results.

A different class of promising methods is rooted in Natural Language Processing (NLP). Hybrid language models lend themselves well to the sequential structure of education data, and their use for student activity sequences has led to some success in retrieving patterns and creating new visualizations. The underlying idea is that, given sufficiently fine-grained data, students' sequential actions resemble words building sentences and can be attributed some 'semantic meaning'. The NLP toolbox has not yet been explored fully, but some attempts at using language models for educational data are noteworthy and relevant for the context of our work. The authors of [44] use topical n-gram models to automatically extract 'topics' in the form of frequent patterns from clickstreams. In [36], the authors train a skip-gram neural network to obtain a structure-preserving vector embedding of the types of clicks students can make. After standard dimensionality reduction, the researchers are able to provide a new kind of visualization of students' trajectories through the course. Since modern NLP models generally require large amounts of granular training data, work relying on these models has exclusively focused on MOOCs so far. In this study, we draw on Latent Dirichlet Allocation (LDA) in order to automatically extract frequent patterns and compare derived student strategies against the results of a more traditional n-gram pipeline. In some sense, LDA is similar to the ideas proposed by [44] but requires less training data, which renders it particularly useful for blended courses. In addition, we use an adapted form of the skip-gram model proposed by [36] in order to explore the context of student actions in our data. To the best of our knowledge, this is one of the first works to employ NLP methods for the analysis of blended courses.

3. DATA

3.1 Data Context: Diderot
The data this study builds on was collected through the educational software system Diderot. Diderot is a cloud-based course support system commonly used to assist undergraduate and graduate level college courses. The system spans a wide range of functionalities including sharing of lecture notes, a discussion board (called the post office), in-class attendance polls, homework submission, and automatic code grading. This bandwidth usually renders the use of additional outside technological course support unnecessary. In turn, the student usage data collected from Diderot can give an almost comprehensive view of students' course participation outside of face-to-face class times.

Figure 1: Top: histogram of the number of clicks per student; we observe 138,960 clicks spread between 164 students. Middle: number of clicks over the observation period with assessment deadlines highlighted. Bottom: kernel density estimates of the log-distribution of waiting times between clicks, dependent on the type of the last click, after splitting at assessment weeks and Load course actions; the final cut-offs at 5 and 60 minutes are indicated by vertical lines.

When it comes to sharing of lecture notes, Diderot takes a more granular and interactive approach compared to traditional learning management systems. Content is split into small sub-entities (called atoms) which are displayed in a linear fashion following the outline of a chapter. Atoms are highly interactive and come with a variety of clickable icons that allow students to take notes, bookmark, follow, or like atoms, and, in particular, to ask questions concerning their content. Discussions about course material that are sparked in this way are visually attached to the respective atom, allowing other students to submit comments. This setup results in much richer data on interactions with lecture notes than we can expect from PDF-formatted lecture material.

Most student usage data from Diderot is presented by individual, time-stamped click actions that come with various identifiers coding the exact type, user, and location of the interaction. In turn, the activity data can be broadly separated into navigation (e.g. Load course, Click link, Search), discussion (e.g. Go to post office, Create post), and behaviors (e.g. Like atom, Follow post).

3.2 Data Description And Exploration
Our data is drawn from the second half of a large sophomore-level computer science course taught at Carnegie Mellon University in spring 2019. Since data is not available for the first part of the course due to initial technical difficulties, we exclude all students who dropped the course throughout the semester. One additional student was excluded based on inflated click patterns which suggested an attempt at automatically scraping content. Along with the click data, we rely on performance information measured by homework and exam grades, as well as student-level lecture and recitation attendance logs. All data is collected through Diderot and matched based on anonymous student identifiers. A summary of the click data over the seven-week observation period is displayed in Figure 1.

Types of clicks. At the finest granularity, Diderot allows for several tens of thousands of distinct click actions within a single course, since every individual click is associated with a fully specified object and activity. For the sake of analysis, however, we group clicks into different types, where the appropriate level of granularity is non-obvious. We aggregate clicks based on the type of object they refer to as well as the activity performed. In order to maintain interpretability, this aggregation is performed separately in each sub-part of the course given by lecture notes, homework material, recitation notes, a library documentation (which is comprised of coding references), and practice exams. This leaves us with 37 different click types, the most common of which are summarized in Table 1.

Table 1: Summary of the most frequent click types.

    Click type                    |  Count | Share
    View chapter in lecture notes | 24,420 | 17.57 %
    View general post             | 21,555 | 15.51 %
    Load course                   | 19,677 | 14.16 %
    View post office              | 16,231 | 11.68 %
    View atom post                | 15,468 | 11.13 %
    View homework atom            |  7,888 |  5.68 %

Grades and types of assessment weeks. Performance outcomes are measured by percentage grades in five homeworks and two exams (a midterm and a final exam) that fall into the observation period. This naturally divides the data into seven assessment weeks with a deadline for a homework problem set or exam at the end of each period. Deadlines are approximately evenly spaced, with only one extended homework period of 11 days after the midterm exam (which also spans a four-day spring holiday), followed by a shorter homework period of only 5 days. We take interest in relating students' study behavior to two distinct outcome variables: (1) the type of the assessment week, i.e. homework deadline or exam, and (2) the percentage grade students received in the respective assessment. As depicted in Figure 1, there are visible spikes of increased activity before the assessment week deadlines, especially before the two exams. In addition, we note that the distribution of grades appears notably different between homeworks and exams, which is confirmed by a two-sample Kolmogorov-Smirnov test (p < 0.001). While the distribution of exam grades is approximately bell-shaped with heavy tails and a slight left-skew, i.e. more particularly high scores than particularly low scores, the homework grade distribution is left-skewed with additional modes at 0 and 100. This difference in distributions is unsurprising as exams are generally graded on a curve and cannot be skipped by students, while homeworks allow for more variability.

Class attendance. Attendance in lecture and recitation sessions was taken with Diderot polls. If a student participated in the poll, which was generally only open for a few minutes, it was assumed that they attended the session. We treat attendance in lectures and recitations separately and aggregate the binary information on an assessment week basis by taking the mean. In turn, students' attendance scores lie between 0 and 1, with the exception of the final exam week which is not associated with any contact class time.
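For concreteness, a minimal sketch of this aggregation, assuming a pandas data frame of poll responses with illustrative column names (not Diderot's actual schema):

    import pandas as pd

    # Toy attendance log: one row per student, week, and poll, with a binary
    # flag for whether the student answered the in-class attendance poll.
    polls = pd.DataFrame({
        "student":  ["s1", "s1", "s2", "s2"],
        "week":     [1, 1, 1, 1],
        "attended": [1, 0, 1, 1],
    })

    # The mean of the binary flags per student and week yields a score in [0, 1].
    scores = polls.groupby(["student", "week"])["attended"].mean().reset_index()
    print(scores)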

4. METHODS

4.1 Session Clickstreams
In raw form, each student is associated with a single clickstream which consists of ordered click actions over the whole observed time period. We employ a multi-step procedure to split this data into more meaningful study sessions. First, we divide the clickstreams based on assessment weeks. Second, we split the resulting sub-clickstreams each time a Load course action is recorded, and last, we choose a data-driven timeout threshold to further break up the resulting sequences.

Figure 2: Skip-gram neural network. The hidden layer linearly transforms one-hot encoded inputs while the softmax output layer approximates the probability that each given click type appears in the same context as the input click. After training, the weights of the hidden layer provide a structure-preserving embedding of click types.

In order to find a suitable timeout threshold, we employ a technique similar to [15] and examine the distribution of time differences in the sub-sequences. We find that the distribution of waiting times supports a wide range but is rapidly decaying. While 75 % of clicks are made within 2.81 minutes or less, a small subset of clicks has time differences of up to 7 days. Figure 1 shows kernel density estimates of log-transformed minutes until the next click within the sub-clickstreams obtained after the second step of our procedure. Different estimates are obtained for distinct categories of actions. While the logarithmic distribution of post-related and miscellaneous clicks is unimodal with the majority of follow-up clicks made within one minute, the distribution for clicks related to homework and lecture notes has an additional mode at about 5-10 minutes. This disparity is unsurprising given that most actions can be expected to be short, while reading through lecture notes or homeworks can be a more lengthy process. In order to preserve both types of sessions, we separate clickstreams at a 60-minute threshold if the last action was loading of lecture notes or homework-related content, and at 5 minutes otherwise. As a result, we obtain a total of 35,703 session clickstreams where each clickstream has between one and 115 clicks, with a mean of 3.98 clicks and a standard deviation of 6.23; 75 % of session clickstreams have at most 4 clicks.
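As an illustration, the following is a minimal sketch of the final, timeout-based splitting step under these rules; the click-record format and the trigger click types are hypothetical stand-ins, not Diderot's actual log schema:

    from datetime import timedelta

    LONG_GAP = timedelta(minutes=60)   # after loading lecture notes or homework content
    SHORT_GAP = timedelta(minutes=5)   # after all other click types
    # Illustrative trigger set; the paper does not enumerate the exact click types.
    LONG_TYPES = {"View chapter in lecture notes", "View homework atom"}

    def split_sessions(clicks):
        """Split one student's clickstream into session clickstreams.

        `clicks` is a time-ordered list of (timestamp, click_type) pairs that
        has already been cut at assessment weeks and at Load course actions."""
        sessions, current = [], []
        for ts, click_type in clicks:
            if current:
                prev_ts, prev_type = current[-1]
                gap = LONG_GAP if prev_type in LONG_TYPES else SHORT_GAP
                if ts - prev_ts > gap:      # timeout reached: start a new session
                    sessions.append(current)
                    current = []
            current.append((ts, click_type))
        if current:
            sessions.append(current)
        return sessions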

4.2 Context Of Click Types
We explore the contexts in which different types of clicks are made in order to gain some understanding of how students generally use the course support system. This is crucial since Diderot is a fully integrated interactive platform that allows the same type of click in contexts that can have different interpretations. Inspired by [36], we tackle this problem by devising a structure-preserving embedding of the click types into a real-valued vector space, i.e. each click type is mapped onto a vector such that click types that appear in the same contexts or are interchangeable are close to each other. This type of embedding can be obtained from a skip-gram model, a common supervised two-layer neural network model often used for language-type data (see Figure 2).

Training data for the model is built by extracting pairs of neighboring click types from the session clickstreams. More concretely, each input click is paired with each click appearing within some index window in the same clickstream. Both the window size and the number of hidden units are important hyperparameters. Since most of our session clickstreams are short and we seek an embedding of only 37 click types, we explore small values for both parameters, i.e. window sizes in {1, 2} and embedding sizes in {3, 4}. After this small grid search, we only retain the model with the lowest average training loss in the last 2,000 training steps. In order to speed up training, we rely on mean noise-contrastive estimation (NCE) loss where 8 negative classes are sampled for every batch instead of computing the entire softmax output. All models are trained over a maximum of 300,000 training steps with SGD with learning rate 1 and a batch size of 512. Training is terminated early when the average loss over 2,000 training steps does not change considerably for 5 consecutive non-overlapping 2,000-step periods. Because training the model is only a surrogate task in order to obtain the embedding, we train on all available data, which comprises 206,514 or 363,260 pairs dependent on the window size.
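The model above is a custom two-layer network trained with NCE loss in place of the full softmax. As an illustrative stand-in (not the authors' implementation), gensim's skip-gram Word2Vec with negative sampling yields a comparable click-type embedding, treating each session clickstream as a 'sentence' of click types:

    from gensim.models import Word2Vec

    # Each session clickstream is a list of click-type strings (toy examples).
    sessions = [
        ["Load course", "View post office", "View general post"],
        ["Load course", "View homework atom", "View atom post"],
    ]

    model = Word2Vec(
        sentences=sessions,
        vector_size=4,   # embedding dimension, as in the paper's grid {3, 4}
        window=1,        # context window, as in the paper's grid {1, 2}
        sg=1,            # skip-gram rather than CBOW
        negative=8,      # negative samples, mirroring the 8 sampled NCE classes
        min_count=1,
    )

    embedding = model.wv["View general post"]  # 4-dimensional vector for one click type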

4.3 Frequent Pattern Extraction

4.3.1 Clustered n-grams
We refer to finite sub-sequences of clickstreams as frequent patterns if they appear various times across different students, study sessions, and assessment weeks. Our goal is to automatically extract frequent patterns which represent some kind of strategy or high-level task students are fulfilling. As an example, the sequence [Login - View post office - View general post] could be interpreted as an attempt to catch up on the course news.

Pattern mining in educational data mining can lead to relatively unstable results. In order to increase robustness, we examine and compare the results of two distinct procedures for frequent pattern extraction. The first method resembles the procedure proposed by [32], and consists of a multi-step procedure which first extracts a large set of candidate patterns, and then narrows the selection down by similarity grouping. Formally, we proceed according to the following steps:

(1) All n-grams. We extract n-grams, i.e. consecutive sub-sequences of n clicks, from the session clickstreams. Since we expect very short patterns to be uninterpretable, and particularly long patterns are rare in our dataset, we choose n = 3, 4, 5.

(2) Candidate patterns. Only the most frequent patterns are kept as candidates for further analysis. Following some experimentation, we choose to keep the most frequent 1 % of patterns of each length.

(3) Hierarchical clustering. The set of candidate patterns can be expected to be repetitive in the sense that patterns might be similar but vary in length, or differ in a single click action but yield the same interpretation. To address this issue, we automatically group candidate patterns by agglomerative clustering with average linkage. The number of clusters, and thus of final frequent pattern categories, is chosen by visual inspection of the model's dendrogram (see the sketches below).
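A minimal sketch of steps (1) and (2); the toy all_sessions list stands in for the real session clickstreams, and step (3) is sketched after the distance definitions below:

    from collections import Counter

    def frequent_ngrams(sessions, n, top_share=0.01):
        """Count all n-grams across session clickstreams and keep the top share."""
        counts = Counter()
        for session in sessions:                    # session = list of click types
            for i in range(len(session) - n + 1):
                counts[tuple(session[i:i + n])] += 1
        keep = max(1, int(len(counts) * top_share))  # most frequent 1 % per length
        return counts.most_common(keep)

    all_sessions = [
        ["Load course", "View post office", "View general post", "View general post"],
        ["Load course", "View homework atom", "View atom post", "View atom post"],
    ]
    candidates = [ngram for n in (3, 4, 5)
                  for ngram, _ in frequent_ngrams(all_sessions, n)]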

The final step of this procedure requires us to specify a notion of similarity between patterns. In some sense, it is natural to draw on a string distance measure, as sequences of clicks resemble many of the characteristics we would expect from natural language. While the authors of [32] draw on the traditional Levenshtein distance, we choose the Jaro-Winkler distance between two patterns p1, p2, measured by 1 − jw(p1, p2), where jw(·, ·) denotes the Jaro-Winkler similarity. Jaro-Winkler distance is an adaptation of more traditional edit distances which takes the sequence length as well as common starting sub-sequences into account. This allows more sensible measuring of similarities between repetitive patterns of different lengths such as the 3-gram [View general post - View general post - View general post] and the 5-gram [View general post - View general post - View general post - View general post - View general post]. Intuitively, the two patterns should have a low distance, and in fact their Jaro-Winkler distance is approximately 0.093 while their normalized Levenshtein distance is 0.4. For our purpose, we treat each click as a character that can be exchanged or transposed for a penalty on the distance. Then, the Jaro similarity is defined as

    j(p1, p2) := 0                                           if m = 0,
                 (1/3) · ( m/|p1| + m/|p2| + (m − t)/m )     otherwise,

where m is the number of matching clicks within an index window of ⌊max{|p1|, |p2|}/2⌋ − 1, and t is half the number of required transpositions for matching clicks. Further, the Jaro-Winkler similarity is defined as

    jw(p1, p2) := j(p1, p2) + (l/10) · (1 − j(p1, p2)),

where l is the length of a common starting sequence between p1 and p2 (at most 4). The additional scaling ensures that distances are normalized to lie in [0, 1].
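The following is a direct transcription of these definitions, treating patterns as sequences of click-type strings, followed by the average-linkage clustering of step (3) on the resulting distance matrix (reusing the candidates list from the previous sketch). Two caveats: we clamp the matching window to at least one index, which is our reading needed to reproduce the example distances quoted here and in Section 4.4, and the scikit-learn usage is our own illustration rather than the paper's stated implementation:

    import numpy as np
    from sklearn.cluster import AgglomerativeClustering

    def jaro(p1, p2):
        if not p1 or not p2:
            return 0.0
        # Clamped to >= 1 so that adjacent transpositions in 3-grams register.
        window = max(max(len(p1), len(p2)) // 2 - 1, 1)
        match1, match2 = [], [False] * len(p2)
        for i, c in enumerate(p1):                 # collect matching clicks
            lo, hi = max(0, i - window), min(len(p2), i + window + 1)
            for k in range(lo, hi):
                if not match2[k] and p2[k] == c:
                    match2[k] = True
                    match1.append(c)
                    break
        m = len(match1)
        if m == 0:
            return 0.0
        matched2 = [c for k, c in enumerate(p2) if match2[k]]
        t = sum(a != b for a, b in zip(match1, matched2)) / 2  # half the transpositions
        return (m / len(p1) + m / len(p2) + (m - t) / m) / 3

    def jaro_winkler(p1, p2, max_prefix=4):
        j = jaro(p1, p2)
        l = 0                                      # common starting sequence, at most 4
        for a, b in zip(p1[:max_prefix], p2[:max_prefix]):
            if a != b:
                break
            l += 1
        return j + (l / 10) * (1 - j)

    # Step (3): average-linkage clustering on the precomputed distance matrix.
    dist = np.array([[1 - jaro_winkler(a, b) for b in candidates] for a in candidates])
    labels = AgglomerativeClustering(
        n_clusters=min(9, len(candidates)),        # 9 in the paper; capped for toy data
        metric="precomputed",                      # named `affinity` in scikit-learn < 1.2
        linkage="average",
    ).fit_predict(dist)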

4.3.2 Topic Model
The clustered n-grams procedure for extracting frequent patterns is easy to implement and model-free. However, it requires us to choose several hyperparameters such as the size of n-grams, the share of candidate patterns, or the number of clusters. It is also likely that the exact choice of the edit distance in the clustering step has a non-negligible effect on the observed results. In order to test our results for robustness, we employ a second method for pattern extraction and compare the resulting student strategies. This method draws on the idea that session clickstreams resemble sentences, individual clicks resemble words, and there is some notion of semantics to a sequence of clicks. Based on these similarities, we use Latent Dirichlet Allocation (LDA), a common NLP model that allows automatic extraction of topics from written documents.

LDA is a Bayesian model which, in our case, is built on the assumption that each session clickstream is a mixture of patterns and each pattern is a mixture of click types. We use the words pattern and topic interchangeably here. While the clickstreams (and hence click types) are given to the model, the topics are latent and can be inferred from the fitted model. The prior on the session clickstream generation assumes that M clickstreams of lengths N1, . . . , NM are drawn according to the following steps. (1) Draw a topic distribution θi ∼ Dir_k(α) for each i = 1, . . . , M, where k is the number of topics. (2) Draw a click type distribution φi ∼ Dir_V(β) for each topic i = 1, . . . , k, where V is the number of different click types. (3) For each click position i, j with i ∈ {1, . . . , M} and j ∈ {1, . . . , Ni}, first choose a topic according to z_ij ∼ Multinomial(θi), and second, draw a click type from w_ij ∼ Multinomial(φ_{z_ij}). LDA comes with three hyperparameters: the prior Dirichlet parameters α and β, which express some prior belief on how the mixtures of topics and click types are composed, and the number of latent topics k. While we set the prior Dirichlet parameters to suggested default values, i.e. normalized asymmetric priors, the number of latent topics requires some more thought. Recent research suggests the use of topic coherence measures for comparison of models with different choices of k [33, 42]. On a high level, topic coherence attempts to measure semantic similarity between high-scoring words (or here, click types) in each topic, which gives some indication of how interpretable the topics in question are. We experiment with several numbers of topics ranging around the number of frequent patterns extracted by the clustered n-gram technique. Since no significant differences in coherence can be observed, we resort to using the same number of topics as for the clustered n-gram method for the sake of comparison.
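A sketch of this topic model fit; gensim is our illustrative choice of implementation (the paper does not name one), with each session clickstream treated as a document over click types:

    from gensim import corpora
    from gensim.models import CoherenceModel, LdaModel

    # Each session clickstream is a 'document' whose 'words' are click types.
    sessions = [
        ["Load course", "View post office", "View general post"],
        ["View homework atom", "View atom post", "View atom post"],
    ]

    dictionary = corpora.Dictionary(sessions)
    corpus = [dictionary.doc2bow(s) for s in sessions]

    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=9,
                   alpha="asymmetric", passes=10)

    # Topic coherence as a guide when comparing different numbers of topics k.
    coherence = CoherenceModel(model=lda, texts=sessions, dictionary=dictionary,
                               coherence="c_v").get_coherence()

    # Report the highest-weight click types per topic, as in Table 2.
    for topic_id, terms in lda.show_topics(num_topics=9, num_words=3, formatted=False):
        print(topic_id, [(click, round(w, 3)) for click, w in terms])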

4.4 Prediction Models
Frequent pattern counts as features. In order to explore what role the extracted strategies play in homework solving versus exam preparation and whether they drive success, we build two prediction models based on pattern counts from the clustered n-gram method. For this, a representative pattern of 3 clicks is chosen for each of the devised strategy clusters, and its occurrences in each of the session clickstreams are counted by comparing against each 3-gram derived from the clickstream. Since we cannot expect the chosen pattern to accurately represent the whole cluster, we allow a Jaro-Winkler distance of up to 0.2 when comparing the sub-sequences. This procedure allows matching of click sequences with only one replacement (1 − jw(abc, abd) ≈ 0.18), one transposition (1 − jw(abc, acb) ≈ 0.10), or one replacement and one transposition (1 − jw(abc, adb) ≈ 0.20). In order to build student and assessment week based prediction models, we aggregate pattern counts along assessment weeks and individual students by simple addition. Similar methods have been employed by [8, 27, 41, 10].
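A minimal sketch of this fuzzy counting step, reusing the jaro_winkler helper from the previous sketch; representative_patterns and student_week_sessions are hypothetical names for data prepared as described above:

    def count_strategy(session, pattern, max_dist=0.2):
        """Count 3-grams in a session clickstream within Jaro-Winkler distance 0.2."""
        n = len(pattern)  # representative patterns have 3 clicks
        return sum(
            1 - jaro_winkler(tuple(session[i:i + n]), pattern) <= max_dist
            for i in range(len(session) - n + 1)
        )

    # Aggregate by simple addition over one student's sessions in one assessment week.
    week_counts = {
        strategy: sum(count_strategy(s, pattern) for s in student_week_sessions)
        for strategy, pattern in representative_patterns.items()
    }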

Predicting assessment type. A random forest classifier is trained to predict the assessment type, i.e. homework or exam, from frequent pattern counts, the number of clicks, and the number of session clickstreams a student has within a given week. In practice, it is unlikely that we would need to predict the assessment type as it is usually known. However, when paired with careful analysis of feature importance and partial dependence, such a model can yield valuable insights into the most important differences in student behavior between homework and exam weeks. We use 80 % of the 1,148 student-week combinations for training and hold back 20 % as a test set. Hyperparameters including the maximum tree depth, the maximum number of features to consider at splits, the minimum number of samples per leaf, and the number of trees are chosen by a grid search over a range of values, where models are trained with 5-fold cross-validation on the training set. Our model draws on Gini impurity to measure the quality of splits, and we evaluate feature importance based on the mean decrease in impurity (MDI) associated with splitting at a given feature when predicting Y. For a set of fitted trees T = {T1, . . . , TN}, the MDI of a feature X_m is defined as

    MDI(X_m) = (1/N) · Σ_{T ∈ T} Σ_{t ∈ T : v(s_t) = X_m} p(t) · ∆i(s_t, t),    (1)

where p(t) is the proportion of samples that reaches node t, v(s_t) is the variable used in the split s_t at node t, and ∆i(s_t, t) is the decrease of impurity generated by the split.
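A sketch of this classifier in scikit-learn, whose feature_importances_ attribute reports the (normalized) MDI of Equation (1); the feature matrix X and labels y are assumed to hold the per student-week counts described above, and the grid values are illustrative rather than the paper's exact search space:

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import PartialDependenceDisplay
    from sklearn.model_selection import GridSearchCV, train_test_split

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    # Grid search with 5-fold cross-validation over the hyperparameters named above.
    grid = GridSearchCV(
        RandomForestClassifier(criterion="gini"),
        param_grid={
            "n_estimators": [100, 300],
            "max_depth": [None, 5, 10],
            "max_features": ["sqrt", 0.5],
            "min_samples_leaf": [1, 5],
        },
        cv=5,
    )
    grid.fit(X_train, y_train)

    model = grid.best_estimator_
    print("test accuracy:", model.score(X_test, y_test))
    print("MDI feature importances:", model.feature_importances_)

    # Partial dependence of the predicted class probability on one feature.
    PartialDependenceDisplay.from_estimator(model, X_train, features=[0])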

Predicting grade outcomes. Similar to the assessment type prediction model, we train a random forest regressor to predict students' grade outcomes based on strategy counts, the number of clicks, the number of session clickstreams, and attendance information. The additional consideration of lecture and recitation attendance requires us to remove all observations from finals week, since no face-to-face class time takes place in the last week of the course. Since this constitutes half of all exam observations in the data and grade distributions of homeworks and exams are significantly different (p < 0.001), we limit our prediction model to homework grade prediction entirely. Of the 820 homework samples, 80 % are used for training and 20 % for testing. A grid search of hyperparameters with 5-fold cross-validation on the training set is performed, and feature importance is measured analogously to Equation 1 with the MSE as impurity measure.

Figure 3: Euclidean distances of click type embeddings based on the skip-gram neural network. Darker color indicates that embeddings are close. Proximity in the embedding space suggests clicks generally appear in similar contexts. Rows and columns are clustered for visualization.

5. RESULTS

5.1 RQ1 How do students interact with course material, and what are frequent strategies they take?

5.1.1 Context Of Click Types
In order to gain some initial understanding of online student behavior, we explore the contexts in which different types of actions are performed by deriving a skip-gram neural network based embedding of actions. After exploring a small grid of hyperparameter values, our skip-gram is trained on data pairs with window size 1 to learn a 4-dimensional embedding. Figure 3 depicts the Euclidean distances between the embedding vectors of different click types based on the model. Proximity of embeddings suggests that click types either appear in a similar context, i.e. within a few clicks of each other, or are interchangeable actions, i.e. have the same context. In other words, by exploring which actions lie close to a given click type in the embedding space, we can gain some insight into the set of clicks students typically make right before and after. It is noteworthy that some types of actions appear together by design of the Diderot system, e.g. in order to comment on a post, the post has to be loaded. Figure 3 reflects many of these expected relations, which gives some validation to our methodological approach.

Our results suggest several broad clusters of student actions. The block in the upper left corner of Figure 3 appears to focus on active discussion participation, including click types such as Like post or Create comment. The next block is somewhat close to many of the active discussion actions and concentrates on scrolling through the discussion board, represented by View post office type actions. Although more rigorous statistical analysis is needed, the results suggest some interesting interpretations:

(1) Students ask more questions about homeworks than about any other course materials. This interpretation is based on the proximity of Create post to View homework atom, which appears to be much closer than any other View atom type action. This suggests that student questions, comments and clarifications are more common for homework material than for lecture notes, recitation material, practice exams, or the library documentation.

(2) Students are more likely to interact with course-wide posts than material-specific discussions. The action View general post is close to interactive behavior such as Create comment, Like post or Follow post, while View atom post appears to be performed mostly in a different context. This suggests that discussion-specific reactions and interactions concentrate mostly on general posts such as course announcements or social posts and are less common for questions and comments concerning particular parts of the course materials.

Overall, context analysis for click types based on skip-gram neural networks provides us with some valuable understanding of students' use of Diderot. The same method might be useful to other practitioners, in particular for initial exploration of data collected through educational software systems. It appears that interpretable low-dimensional embeddings of a medium number of action types can be obtained with only a few weeks' worth of data from a single college course, which renders this method particularly useful for blended courses.

5.1.2 Frequent Pattern Extraction
Patterns are extracted with two distinct methods, and subsequently interpreted in terms of underlying student strategies. A summary of the results and a comparison between the methods is given in Table 2. The left side of the table shows the results of the clustered n-gram pipeline for pattern extraction. The most frequent 1 % of n-grams for each n = 3, 4, 5 are extracted from the session clickstreams. This yields a candidate set of 223 sequential patterns which are clustered into 9 groups based on agglomerative clustering with average linkage and Jaro-Winkler distance as distance function. The number of clusters is informed by visual inspection of the respective dendrogram. It is noteworthy that the clusters appear to have imbalanced sizes, with the largest cluster including 106 candidate patterns and the smallest clusters containing only 2 or 3 of the candidate patterns. Yet, inspection of the associated click types and their in-cluster frequencies allows for intuitive interpretations as student strategies. Multiple of the devised strategies revolve around passive review of materials such as lecture notes, homeworks, recitation material, library documentation (which includes code snippets for reference), or practice exams. More involved strategies are given by active homework engagement, in-depth review of lecture notes, catching up on course news, and going through practice exams. For example, the catching up on course news strategy is associated with sequential patterns involving reading of general posts, atom posts, and loading the main post office page.

Table 2: Comparison of student strategies extracted by the clustered n-gram method and LDA. Patterns in the first block (B1) consist of exactly the same click types, while other patterns show differences but allow for similar interpretations (B2). Lastly, the LDA method finds a mixture of practice exam related patterns and a new load course pattern (B3).

       | Clustered n-gram method: strategy and associated click types | LDA method: strategy and high-weight click types
    B1 | Look at lecture notes: View lecture notes chapter (75.88 %) | Look at lecture notes: View lecture notes chapter (1)
       | Look at homeworks: View homework chapter (100 %) | Look at homeworks: View homework chapter (0.826)
       | Look at recitation material: View recitation chapter (100 %) | Look at recitation material: View recitation chapter (0.712)
    B2 | Catch up on news: View general post (52.04 %), View main post office (24.3 %), View atom post (16.13 %) | Catch up on news: View general post (0.543), View main post office (0.410)
       | Active homework engagement: View atom post (50 %), View homework atom (31.02 %), View general post (10.65 %) | Active homework engagement: View atom post (0.653), View homework atom (0.344)
       | In-depth review of lecture notes: View lecture notes atom (50 %), View atom post (28.57 %), View lecture notes chapter (21.43 %) | In-depth review of lecture notes: View lecture notes atom (0.483), View atom post (0.31), Click link lecture notes (0.195)
       | Look at library documentation: View library documentation chapter (85.29 %) | Look at library documentation: View library documentation chapter (0.674), Search atom (0.321)
    B3 | Go through a practice exam: View practice exam atom (100 %) | Practice exams: View practice exams chapter (0.658), View practice exams atom (0.341)
       | Look at practice exams: View practice exams chapter (100 %) | Load course: Load course (0.998)

The right side of Table 2 summarizes the results of pattern extraction based on Latent Dirichlet Allocation (LDA). For the sake of comparison, we keep the number of extracted patterns fixed and derive 9 student strategies. By assumption of the model, each pattern is a mixture of all click types. In turn, extraction of weights is straightforward and we report the click types with the highest weights for each pattern. We find that multiple of the extracted patterns match exactly the patterns retrieved with the clustered n-gram method in the sense that they are based on exactly the same click types (B1). Another set of patterns shows small changes in included click types, but essentially provides the same interpretation as the patterns found with the first method (B2). Lastly, the LDA method finds a practice exam strategy which broadly presents a mixture of the two practice exam related strategies from the first model, and a load course strategy which almost entirely consists of the Load course action (B3). The load course pattern likely arises from the session clickstreams with a single click, which represent 30.30 % of the session clickstreams. A total of 56.66 % of these one-click sequences are Load course actions. Reasons for these single Load course clicks can be manifold. In some cases, students might get distracted immediately after loading the course, or they have to reload the course multiple times. However, we hypothesize that in most cases, the course overview page which is loaded when loading the course provided all the information the student was looking for, since it includes recent updates, posts and announcements. Contrary to the clustered n-gram method, which only takes into consideration session clickstreams of at least three clicks, LDA can leverage even these short clickstreams. Yet, the additional insights gained through the load course pattern are marginal since it is very short and hard to interpret as a strategy.

All in all, both methods roughly extract the same strategies, which speaks in favor of the validity of both approaches. One could argue that the clustered n-gram method yields slightly more tangible insights since the patterns present actually frequently occurring sub-sequences. However, for larger data sets the method can become computationally expensive, rendering LDA a better choice.

5.2 RQ2 How do students use these strategies for homework solving as compared to exam preparation?

We extract strategy features for assessment week level prediction models by matching session clickstreams against the extracted frequent patterns. The results are summed up for each student-week combination and thus roughly represent how often a given student has used a strategy in a given assessment week. After this aggregation, 91.03 % of student-week combinations show at least one occurrence of one of the patterns. We generally expect not all student click behavior to follow the extracted strategies, or stringent strategies at all. Thus, it is unsurprising that some of the student-week combinations do not involve any of the patterns.

Figure 4: Relative feature importance for assessment week random forest prediction (1 = homework, 0 = exam) along with 95 % confidence intervals (bottom) and partial dependence plots for the most important features (top). Features include strategy counts from the clustered n-gram method, the number of clicks, and the number of session clickstreams.

We train a random forest classifier to predict the assessment type from pattern counts, the number of clicks, and the number of sessions within a given week. A total of 80 % of the data is used for hyperparameter tuning and training, while 20 % is withheld for testing. The model reaches a classification accuracy of 93.68 % on the training data, which constitutes an evident improvement over the naive majority class prediction (71.90 % of the training data have the label homework). Based on a permutation test, we find that the model performs better than random on the training set (p < 0.01). A total of 100 permutations of labels were used for this evaluation. Accuracy on the test set is 93.91 %, which suggests sufficient generalization ability of the prediction model.
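The exact permutation protocol is not spelled out beyond the 100 label permutations; scikit-learn's permutation_test_score is one standard realization of such a test, refitting the model on shuffled labels and comparing the resulting score distribution with the score on the true labels (X_train and y_train are assumed from the earlier sketch):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import permutation_test_score

    score, perm_scores, p_value = permutation_test_score(
        RandomForestClassifier(),
        X_train, y_train,
        n_permutations=100,  # 100 label shufflings, as used for the evaluation
        cv=5,
    )
    print(f"accuracy={score:.4f}, permutation p-value={p_value:.3f}")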

The prediction model results suggest that students use the educational support system differently and employ the different strategies at different rates when preparing for exams as compared to doing homeworks. We examine feature importance in the model in order to gain more insight into these differences. Figure 4 depicts the mean decrease in impurity (MDI) for splits at the different covariates, as well as partial dependence of the predictions on the most important features. We see that predictions are mainly driven by pattern counts of the strategies look at lecture notes (MDI = 0.395), in-depth review of lecture notes (MDI = 0.259), look at practice exams (MDI = 0.140), and active homework engagement (MDI = 0.094). Partial dependence plots show that while increased counts in the strategies related to lecture notes and practice exam engagement increase the probability that the model predicts an exam week, higher counts in the active homework engagement strategy increase the model's likelihood of predicting an upcoming homework deadline. These results suggest that students' approach to learning is driven by the kind of performance assessment they are given. It appears that the increased activity in exam weeks (see Figure 1) is largely based on increased engagement with lecture notes and practice exams, while interactions with homework-related content are generally less pronounced.

5.3 RQ3 Are student strategies indicative of grade outcomes?

We train a random forest regression model to predict homework grades on an individual week and student level. Features include students' strategy counts, the number of clicks, the number of sessions, and the mean attendance in both lectures and recitations. Training is conducted on 80 % of the available data while 20 % is withheld for testing. After hyperparameter tuning with 5-fold cross-validation, the prediction model realizes an MSE of 0.046 on the training data set. A permutation test based on 100 permutations of labels shows a significant improvement over random performance with this model (p < 0.01). On the test set, the model attains a prediction MSE of 0.054, which suggests sufficient generalization ability.

Figure 5 explores the importance of the different features for predictions and displays partial dependence relations for the most important covariates. Since we use MSE as impurity measure, the mean decrease in impurity (MDI) for a given feature effectively corresponds to the mean decrease in variance we receive by splitting at the feature. We see that, in fact, the most relevant features appear to be the number of clickstream sessions (MDI = 0.311), the number of clicks (MDI = 0.221), lecture attendance (MDI = 0.123), and recitation attendance (MDI = 0.112). Partial dependence plots reveal that increases in any of the above features increase the predicted homework score percentage by a relatively large margin of up to 20 percentage points. Conversely, strategy counts appear to be less relevant for grade predictions, with some exceptions. Most notably, the predicted grade rises with the number of times students actively engaged in homeworks (MDI = 0.077).

Figure 5: Relative feature importance for the homework grade random forest prediction along with 95% confidence intervals (bottom) and partial dependence plots for the most important features (top). Features include strategy counts from the clustered n-gram method, the number of clicks and session clickstreams, and attendance information.

Overall, our results show some success in the prediction of homework grade outcomes. The extracted features, including some of the pattern counts, add valuable information to the prediction model. In particular, students who come back to Diderot more often and thus use an increased number of study sessions to solve their homeworks, and students who generally interact with the system at high rates, are predicted to have better grade outcomes. In addition to time at task, the mere attendance in lectures and recitations increases students' grade outcome predictions. In fact, students in our data set who attended at least one lecture in a given assessment week on average received a homework percentage grade of 76.58%, while students who skipped lectures on average scored 55.86%. For recitation attendance, the corresponding figures are 74.34% and 49.20%, respectively.
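The comparison amounts to a simple grouped mean; a hypothetical sketch, with column names and values made up purely for illustration:

```python
# Illustrative attendance comparison on a hypothetical per-student-
# per-week frame; none of these numbers are from the actual data set.
import pandas as pd

df = pd.DataFrame({
    "attended_lecture":    [1, 1, 0, 0, 1, 0],
    "attended_recitation": [1, 0, 0, 1, 1, 0],
    "hw_grade_pct":        [82.0, 71.0, 55.0, 60.0, 77.0, 48.0],
})

print(df.groupby(df["attended_lecture"] > 0)["hw_grade_pct"].mean())
print(df.groupby(df["attended_recitation"] > 0)["hw_grade_pct"].mean())
```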

Both of the discussed prediction models provide valuable insights for instructors and educational system design. The tree-based ensemble methods are particularly suitable for initial modeling and for processing features on different scales. Their main advantage over many other models is the relatively straightforward explainability of predictions through partial dependence plots and measures of feature importance, which renders them a useful approach to high-stakes at-risk prediction.

6. CONCLUSIONS
Data from educational software systems provides insights into students' study behaviors. While performance prediction in MOOCs has been explored extensively, similar studies for blended courses are scarce and often lack a deeper understanding of the underlying student strategies. Based on fine-grained, contextualizable click data collected through the non-commercial course support system Diderot, we explore how students interact with educational software systems, which strategies they employ to engage with course materials, and in which ways strategies depend on the assessment type and drive performance. Our contributions are two-fold: (1) We gain relevant understanding of students' learning behavior that both confirms and adds to the existing literature. (2) We propose new NLP-inspired approaches to analyzing student strategies based on clickstream data in blended learning scenarios, which typically come with moderately sized data sets.

On the educational side, our results provide valuable insights into how students interact with course systems. In line with previous research [37, 1], we observe increased activity before deadlines and, in particular, in the days leading up to an exam. Exam preparation appears to come with increased review of lecture notes as compared to homework solving. In general, students seem to ask more questions related to homeworks as compared to other class materials such as lecture notes, recitation materials, or practice exams. At the same time, interactions with already existing posts, such as liking or commenting, seem to concentrate mostly on course-wide announcements, social posts, and course feedback discussions, and appear to be less common for direct questions on course materials. Many of the derived features have some predictive power for performance outcomes. In particular, the number of study sessions, the number of clicks, attendance in lecture and recitation, and engagement with homework-related course content are strong predictors of homework grades in our model. The described observations are entirely based on data from a seven-week period of a large sophomore-level college course, since technical difficulties prohibited collection of data for the remainder of the semester. In the future, more complete data (e.g., from an entire course, or even multiple courses such as the same course offering over several years) could provide an enhanced understanding of student behavior and allow the tackling of more complex problems such as the simultaneous prediction of homework and exam grades, which, as in our data, can have very different distributions.

The methods proposed in this work promise to be useful to a broad range of researchers and practitioners who find themselves analyzing activity log data from blended courses, or are at the initial stages of developing early warning systems. The key insight of this work is that hybrid NLP methods can be used to thoroughly analyze contexts of actions as well as frequent strategies in the relatively low-data setting of blended courses. To the best of our knowledge, similar models have previously only been employed in the setting of MOOCs [e.g. 44, 36]. In fact, our analysis shows that topic models such as latent Dirichlet allocation can recover almost the same student strategies as more traditional data-mining-based pipelines of pattern extraction, and small versions of skip-gram neural networks can provide valuable insights into the context of student actions even with moderately sized data sets.
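To make the two model families concrete, the following is a minimal sketch on a toy corpus of click sessions; the action tokens, corpus, and hyperparameters are illustrative assumptions rather than the paper's configuration.

```python
# Sketch of the two NLP-style analyses on click data: an LDA topic
# model over session "documents" of action tokens, and a small
# skip-gram model embedding actions by their contexts. All tokens and
# settings here are illustrative.
from gensim.models import Word2Vec
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

sessions = [
    "view_lecture_notes view_lecture_notes view_practice_exam",
    "open_homework post_question open_homework",
    "view_lecture_notes view_practice_exam view_practice_exam",
    "open_homework open_homework post_question",
]

# Topic model: each topic is a distribution over actions and can be
# read off as a candidate strategy.
vectorizer = CountVectorizer(token_pattern=r"\S+")
counts = vectorizer.fit_transform(sessions)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[::-1][:2]
    print(f"topic {k}:", [vectorizer.get_feature_names_out()[i] for i in top])

# Skip-gram (sg=1) Word2Vec: actions occurring in similar contexts end
# up close together in the embedding space.
w2v = Word2Vec([s.split() for s in sessions], vector_size=16, window=2,
               sg=1, min_count=1, seed=0)
print(w2v.wv.most_similar("view_lecture_notes", topn=2))
```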


References
[1] Lalitha Agnihotri, Ani Aghababyan, and Shirin Mojarad. "Mining Login Data For Actionable Student Insight". In: Proceedings of the 8th International Conference on Educational Data Mining. 2015, pp. 472–475.
[2] Bussaba Amnueypornsakul, Suma Bhat, and Phakpoom Chinprutthiwong. "Predicting Attrition Along the Way: The UIUC Model". In: Proceedings of the EMNLP 2014 Workshop on Analysis of Large Scale Social Interaction in MOOCs. 2014, pp. 55–59.
[3] Truong-Sinh An, Christopher Krauss, and Agathe Merceron. "Can Typical Behaviors Identified in MOOCs be Discovered in Other Courses?" In: Proceedings of the 10th International Conference on Educational Data Mining. 2017, pp. 220–227.
[4] Juan Miguel L. Andres et al. "Studying MOOC completion at scale using the MOOC replication framework". In: Proceedings of the 8th International Conference on Learning Analytics and Knowledge. 2018, pp. 71–78.
[5] Hafidh Ba-Omar, Ilias Petrounias, and Fahad Anwar. "A framework for using web usage mining to personalise e-learning". In: Proceedings of the 7th IEEE International Conference on Advanced Learning Technologies. 2007, pp. 937–938.
[6] Ryan S. J. D. Baker. "Modeling and Understanding Students' Off-Task Behavior in Intelligent Tutoring Systems". In: Proceedings of ACM SIGCHI: Computer-Human Interaction. 2007.
[7] Behdad Bakhshinategh et al. "Educational data mining applications and tasks: A survey of the last 10 years". In: Education and Information Technologies 23.1 (Jan. 2018), pp. 537–553.
[8] Christopher Brooks, Craig Thompson, and Stephanie Teasley. "A time series interaction analysis method for building predictive models of learners using log data". In: Proceedings of the Fifth International Conference on Learning Analytics And Knowledge. 2015, pp. 126–135.
[9] Yunfan Chen and Ming Zhang. "MOOC student dropout: pattern and prevention". In: Proceedings of the ACM Turing 50th Celebration Conference - China. 2017, pp. 1–6.
[10] Cody A. Coleman, Daniel T. Seaton, and Isaac Chuang. "Probabilistic Use Cases: Discovering Behavioral Patterns for Predicting Certification". In: Proceedings of the Second (2015) ACM Conference on Learning at Scale. 2015, pp. 141–148.
[11] Robert Cooley, Bamshad Mobasher, and Jaideep Srivastava. "Data Preparation for Mining World Wide Web Browsing Patterns". In: Knowledge and Information Systems 1.1 (Feb. 1999), pp. 5–32.
[12] Michel C. Desmarais and Francois Lemieux. "Clustering and Visualizing Study State Sequences". In: Proceedings of the 6th International Conference on Educational Data Mining. 2013, pp. 224–227.
[13] Louis Faucon, Lukasz Kidzinski, and Pierre Dillenbourg. "Semi-Markov model for simulating MOOC students". In: Proceedings of the 9th Conference on Educational Data Mining. 2016, pp. 358–363.
[14] Chase Geigle and ChengXiang Zhai. "Modeling MOOC student behavior with two-layer hidden Markov models". In: Proceedings of the 4th ACM Conference on Learning at Scale. 2017, pp. 205–208.
[15] Niki Gitinabard et al. "What will you do next? A sequence analysis on the student transitions between online platforms in blended courses". In: arXiv:1905.00928 (2019).
[16] Julio Guerra et al. "The Problem Solving Genome: Analyzing Sequential Patterns of Student Work with Parameterized Exercises". In: Proceedings of the 7th International Conference on Educational Data Mining. 2014, pp. 153–160.
[17] James Herold, Alex Zundel, and Thomas F. Stahovich. "Mining Meaningful Patterns from Students' Handwritten Coursework". In: Proceedings of the 6th International Conference on Educational Data Mining. 2013, pp. 67–73.
[18] Ya-Han Hu, Chia-Lun Lo, and Sheng-Pao Shih. "Developing early warning systems to predict students' online learning performance". In: Computers in Human Behavior 36 (July 2014), pp. 469–478.
[19] Hogyeong Jeong and Gautam Biswas. "Mining Student Behavior Models in Learning-by-Teaching Environments". In: Proceedings of the 1st International Conference on Educational Data Mining. 2008, pp. 127–136.
[20] John S. Kinnebrew and Gautam Biswas. "Identifying Learning Behaviors by Contextualizing Differential Sequence Mining with Action Features and Performance Evolution". In: Proceedings of the 5th International Conference on Educational Data Mining. 2012, pp. 57–64.
[21] John S. Kinnebrew, Kirk M. Loretz, and Gautam Biswas. "A Contextualized, Differential Sequence Mining Method to Derive Students' Learning Behavior Patterns". In: Journal of Educational Data Mining 5.1 (May 2013), pp. 190–219.
[22] Rene F. Kizilcec, Chris Piech, and Emily Schneider. "Deconstructing disengagement: analyzing learner subpopulations in massive open online courses". In: Proceedings of the Third International Conference on Learning Analytics and Knowledge. 2013, pp. 170–179.
[23] Severin Klingler, Tanja Käser, and Barbara Solenthaler. "Temporally Coherent Clustering of Student Data". In: Proceedings of the 9th International Conference on Educational Data Mining. 2016, pp. 102–109.
[24] Marius Kloft et al. "Predicting MOOC Dropout over Weeks Using Machine Learning Methods". In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014.
[25] Vitomir Kovanovic et al. "Penetrating the black box of time-on-task estimation". In: Proceedings of the Fifth International Conference on Learning Analytics And Knowledge. 2015, pp. 184–193.
[26] Mirjam Kock and Alexandros Paramythis. "Activity sequence modelling and dynamic clustering for personalized e-learning". In: User Modeling and User-Adapted Interaction 21.1-2 (Apr. 2011), pp. 51–97.
[27] X. Li, L. Xie, and H. Wang. "Grade Prediction in MOOCs". In: 2016 IEEE Intl Conference on Computational Science and Engineering (CSE) and IEEE Intl Conference on Embedded and Ubiquitous Computing (EUC) and 15th Intl Symposium on Distributed Computing and Applications for Business Engineering (DCABES). 2016, pp. 386–392.
[28] Xiao Li, Ting Wang, and Huaimin Wang. "Exploring N-gram Features in Clickstream Data for MOOC Learning Achievement Prediction". In: Database Systems for Advanced Applications. 2017.
[29] Martín Liz-Domínguez et al. "Predictors and Early Warning Systems in Higher Education - A Systematic Literature Review". In: LASI-SPAIN. 2019.
[30] Leah P. Macfadyen and Shane Dawson. "Mining LMS data to develop an "early warning system" for educators: A proof of concept". In: Computers & Education 54.2 (Feb. 2010), pp. 588–599.
[31] Carlos G. Marquardt, Karin Becker, and Duncan Dubugras Alcoba Ruiz. "A pre-processing tool for Web usage mining in the distance education domain". In: Proceedings of the International Database Engineering and Applications Symposium. 2004, pp. 78–87.
[32] R. Martinez, K. Yacef, and J. Kay. "Analysing frequent sequential patterns of collaborative learning activity around an interactive tabletop". In: Proceedings of the 4th International Conference on Educational Data Mining. 2011, pp. 111–120.
[33] David Mimno et al. "Optimizing Semantic Coherence in Topic Models". In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. 2011, pp. 262–272.
[34] Patrick Mukala, J. C. A. M. Buijs, and W. M. P. van der Aalst. "Exploring students' learning behaviour in MOOCs using process mining techniques". In: BPM Reports, Vol. 1510. 2015.
[35] Michal Munk and Martin Drlík. "Impact of Different Pre-Processing Tasks on Effective Identification of Users' Behavioral Patterns in Web-based Educational Systems". In: Procedia Computer Science 4 (Jan. 2011): Proceedings of the International Conference on Computational Science, ICCS 2011, pp. 1640–1649.
[36] Zachary A. Pardos and Lev Horodyskyj. "Analysis of Student Behaviour in Habitable Worlds Using Continuous Representation Visualization". In: Journal of Learning Analytics 6.1 (2019), pp. 1–15.
[37] Jihyun Park et al. "Detecting changes in student behavior from clickstream data". In: Proceedings of the Seventh International Learning Analytics and Knowledge Conference. 2017, pp. 21–30.
[38] B. K. Pursel et al. "Understanding MOOC students: motivations and behaviours indicative of MOOC completion". In: Journal of Computer Assisted Learning 32.3 (2016), pp. 202–217.
[39] Adithya Sheshadri et al. "Predicting Student Performance Based on Online Study Habits: A Study of Blended Courses". In: Proceedings of the 11th International Conference on Educational Data Mining. 2018, pp. 401–410.
[40] Benjamin Shih, Kenneth Koedinger, and Richard Scheines. "Unsupervised Discovery of Student Strategies". In: Proceedings of the 3rd International Conference on Educational Data Mining. 2010, pp. 201–210.
[41] Tanmay Sinha, Patrick Jermann, and Pierre Dillenbourg. "Your click decides your fate: Inferring Information Processing and Attrition Behavior from MOOC Video Clickstream Interactions". In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014, pp. 3–14.
[42] Keith Stevens et al. "Exploring Topic Coherence over Many Models and Many Topics". In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. July 2012, pp. 952–961.
[43] Rodrigo del Valle and Thomas M. Duffy. "Online learning: Learner characteristics and their approaches to managing learning". In: Instructional Science 37.2 (2009), pp. 129–149.
[44] Miaomiao Wen and Carolyn Penstein Rosé. "Identifying Latent Study Habits by Mining Learner Behavior Patterns in Massive Open Online Courses". In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management. 2014, pp. 1983–1986.
[45] Alyssa Friend Wise et al. "Broadening the Notion of Participation in Online Discussions: Examining Patterns in Learners' Online Listening Behaviors". In: Instructional Science: An International Journal of the Learning Sciences 41.2 (Mar. 2013), pp. 323–343.