Learning from Time-to-Event Data from Online Learning Contexts

Shalin Hai-Jew, Kansas State University

4th Annual Big 12 Teaching & Learning Conference, Texas Tech University, June 8 – 9, 2017

Digital Poster Session

Digital Poster Description

Time-to-event analysis is a statistical analysis approach that enables time-based insights about student learning, such as: How long does it take before a learner makes a new acquaintance in an online course? A new friend? How long does it take before a learner achieves breakout capacity in a particular learning sequence? How long does it take for a learner to commit to a course? This digital poster session presents time-to-event analysis (aka “survival analysis”) from real LMS data and shows how this analysis is done. Terms related to time-to-event analysis will be introduced, and the assertability of extracted data is explored.

Time-to-event analysis, in its simplest form, enables the study of in-world phenomena, including the time it takes to achieve a particular defined “event” (whether negative or positive, desirable or undesirable), and it handles the nuance of “censored” data (in-world records for which event achievement was not observed during the time period of the analysis). This presentation introduces “time-to-event analysis” (on IBM’s SPSS Statistics) as applied to online educational data.

Overview

1. “Survival Analysis” and “Time-to-Event” Analyses

2. Required Online Learning Data

3. Structuring the Target Data in SPSS

4. About the Data Visualizations

5. Time-to-Event Analysis

6. Enriched Time-to-Event Analysis

7. Worked Real-World Cases

8. Some Possible Askable Questions in Teaching and Learning

9. Contact and Conclusion

1. “Survival Analysis” and “Time-to-Event” Analyses

Times and Events

“TIME”

In the real:

Relative time as a continuum (in lived experiences, in mental conceptualization)

Time as discrete explicit sequential steps of varying units (in computation)

In research:

Observed and measured time as a continuum (continuous variable)

Observed and measured time as units or phases (fine-grained or coarse-grained as needed for the research)

“EVENT”

In the real:

Event as an experienced reality or “state”

In research:

An occurrence of a defined and specific type and character (optimally observable)

◦ Clear onset

◦ Clear completion

◦ Clear duration

A Brief History of Survival Analysis

“SURVIVAL ANALYSIS”

“Survival analysis” originated in the 1950s to help analysts looking at health data identify duration features, namely, how long individuals and groups “survived” before death or catastrophic system failure.

Historically, this involved longitudinal data, but observation periods of various lengths work fine.

Initially, this approach relied on parametric assumptions about the data; since then, statistical techniques have enabled non-parametric approaches.

TIME-TO-EVENT ANALYSIS

Outside the health sciences realm, this approach is known as “time-to-event” analysis, where the “event” may be any sort of observable occurrence.

These statistical techniques have also been applied in other contexts and applications (beyond biostatistics): in engineering, these approaches are known as “reliability analysis”; in economics, “duration modeling”; in sociology, “event history analysis.”

Basic Research Design

Pre-research theorizing and hypothesizing about possible time-to-event observations (based on empirical data) and what those might mean

Defining a population of interest

◦ Need to control for “selection bias” or “left truncation” limitations in the use of that population, which arise from including only members that have already met certain time data requirements instead of including N=all as much as possible

◦ Truncation risks come from those entering the study (while censoring risks come from those leaving the study)

◦ Require a sufficient population set for power in the research

◦ Some time-to-event analyses have “staggered entries” with studied populations being included at varying times

◦ A population can be animate (like people or animals) or inanimate (like objects such as machines or phenomena such as weather systems)

As a rule of thumb, no more than about 50% of the population should fall into “censored” (“lost to follow-up”) data; with heavier censoring, estimates such as the median time-to-event may not be obtainable

Basic Research Design (cont.)

Defining an “event” of interest (that occurs in measurable time)

◦ Identifying the features of an event that will enable observation of the occurrence of that event

Capturing data of whether, for each member of a population, an event has occurred or not during the period of the research

“Censored Data” and Real-World Observations

One of the strengths of time-to-event analysis is that it includes consideration of “censored” data, or data that is “lost to follow-up”

Three general types of data “censoring” in time-to-event analysis

◦ In cases where an event has not occurred during the research period, that data is known as “right-censored” data because the event, if it occurs at all, occurs outside the research time frame and is “lost to follow-up”

◦ If the data is captured in certain time periods, non-continuously, some data may be “interval censored” (it is known that an event has occurred within a time period, but no exact time is available)

◦ If the data starts after the event has already occurred to some of the members, that is known as “left-censored” data

Censoring accommodates perturbations of experiments that happen in the real world where “stuff happens”

Enables statistical-computational acknowledgment of partial / incomplete data

Some Data Assumptions

Time-to-event data are usually not assumed to be normally distributed (Gaussian distribution): durations cannot be negative and are often right-skewed, and non-parametric methods such as the Kaplan-Meier estimator make no assumption about the shape of the distribution.

Patterns in the timing of events are instead taken as a sign that something is acting on the data to produce such an outcome.

◦ Time is a variable in the onset of the targeted “event.”

◦ There are likely other co-variates that affect the time of the event’s onset.

There are sufficient empirical data records to accurately represent the phenomenon with some power. Excessive censoring in the population may harm the results (because this is lost data).

Population members with censored data are assumed to have the same survival probabilities as non-censored population members (that is, censoring is assumed to be unrelated to the risk of the event; this supports generalizability).

The past is considered fixed, so observed events that occur are theoretically non-reversible. (In the real, some events are reversible, particularly with educational data.)

Some Light Mathematical Notation

Core Representations of Time

Where T = random outcome variable (a unit time when the event occurs)

Where t = observation time (during the observational research)

Research Time Period

t0 = Time 0 when the observations are beginning for the research (and in some studies, none of the participants have achieved event yet)

tmax = end of the observation

Some Light Mathematical Notation (cont.)

Censored or Missing Data

Right censoring (T > t): the time it takes to achieve event is longer than the time of the study

Interval censoring (T ∈ (l, r]): the event is known to have occurred after time l and no later than time r within the research observations, but it is unclear exactly when in that interval

Left censoring (T ≤ t): the time it takes to achieve event is less than the observation time (the event occurred prior to the start of the research), but it is unclear when the actual event occurred specifically
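The three censoring cases can be sketched in code. This is a minimal illustration under stated assumptions, not a procedure from the poster: the function name `classify_observation` and all of its parameters are hypothetical, and times are plain numbers in whatever unit the study uses.

```python
from typing import Optional, Tuple

def classify_observation(
    t_start: float,
    t_end: float,
    event_time: Optional[float] = None,
    event_before_start: bool = False,
    interval: Optional[Tuple[float, float]] = None,
) -> str:
    """Label one record observed over the window [t_start, t_end].

    event_time:         exact time of the event, if it was observed
    event_before_start: True if the event is known to predate the window
    interval:           (l, r] bounds if only a bracketing interval is known
    """
    if event_before_start:
        return "left-censored"        # T <= t_start, exact time unknown
    if interval is not None:
        return "interval-censored"    # T in (l, r], exact time unknown
    if event_time is None or event_time > t_end:
        return "right-censored"       # T > t_end, not seen during the study
    return "event"                    # exact time observed in the window
```

For example, `classify_observation(0, 10, event_time=4)` returns `"event"`, while a record with no observed event returns `"right-censored"`.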

Required Data to Run Time-to-Event Analyses

At its simplest, the following information is needed to run a time-to-event analysis:

◦ A defined population that is “susceptible” to (or at-risk-of) experiencing a particular event

◦ An objectively observable “event” of interest (with specific and defined time-of-onset)

◦ Information on whether each individual in the population has either achieved event during the study period or did not (binary outcome: either event OR censored / “lost to follow-up”)

◦ A start-time and an end-time for the research period

For complexity and richer analysis, it also helps to have the following:

◦ Attribute (descriptive) data for each of the research participants

Some Available Findings

The average time to event

Min-max time ranges (shortest and longest survival periods, per record)

The frequency pattern of when events occur in time (so times of heightened risk of occurrence)

The frequency pattern of when censoring of data occurs

The probabilities of survival (vs. event achievement, and vs. censoring) at various time intervals for the studied population (by analyzing the share of the population still surviving at a particular time)

◦ With the ability to generalize from the research to other similar populations

The estimated “hazard” function (the risk of non-survival / achieving event at a particular time ti, “time i,” among those who have survived up to that time) and the cumulative hazard (that risk accumulated over time)
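For reference, the standard textbook relationships among these quantities can be written as follows. The notation is the conventional one for a continuous time variable and is an assumption here, not taken from the slides: T is the time of the event, S(t) the survival function, F(t) the cumulative probability of the event, h(t) the hazard, and H(t) the cumulative hazard.

```latex
\begin{align*}
S(t) &= P(T > t), \qquad F(t) = 1 - S(t),\\
h(t) &= \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t},\\
H(t) &= \int_0^t h(u)\,du, \qquad S(t) = \exp\{-H(t)\}.
\end{align*}
```

Read this way, the hazard is an instantaneous rate among those still at risk, while F(t) and H(t) are the cumulative quantities.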

Post-Hoc Theorizing / Hypothesizing

From the patterned time data, researchers may be able to consider the following:

Are there particular times when there is heightened risk for achievement of event? Are there particular times when there is lowered risk for achievement of event?

◦ What “hazard” factors could explain these differences?

Which members of a group tend to achieve “early” event vs. “later” event?

◦ What are some differences between these groups?

◦ Which of these attributes (co-variates with time) may explain the time differences?

Which members of a group do not apparently achieve event during the time period of the study?

◦ Why would these members be right-censored? Why would other members not be right-censored?

◦ What attributes (co-variates) may explain whether a member experiences event or is censored?

Post-Hoc Theorizing / Hypothesizing (cont.)

What research participant attributes are correlated or associated with research participants achieving event? Not achieving event?

What research participant attributes might be causal factors with research participants achieving event? Not achieving event?

If interventions were applied, what were differences between one part of the population that experienced intervention vs. the control group (the population that did not experience the intervention)?

What are some next research steps to add more insights?

Some Statistical Approaches in Software Packages

AD HOC STATISTICAL TOOLS

Kaplan-Meier method / Kaplan-Meier estimator (Product-Limit Method)

◦ A non-parametric estimator of the survival function

Life-Table Method

Nelson-Aalen estimator

Cox model

Mantel-Haenszel test

HAZARD FUNCTIONS

Exponential

Weibull

Gompertz

Piecewise Constant
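As a rough illustration of how the four hazard shapes listed above differ, here is a minimal sketch. The function names and parameter names are hypothetical, and the parameterizations shown are common textbook forms; statistical packages may use different ones.

```python
import math

def hazard_exponential(t, lam):
    """Constant hazard: h(t) = lam at every time t."""
    return lam

def hazard_weibull(t, lam, k):
    """h(t) = (k / lam) * (t / lam)**(k - 1); rises if k > 1, falls if k < 1."""
    return (k / lam) * (t / lam) ** (k - 1)

def hazard_gompertz(t, a, b):
    """h(t) = a * exp(b * t); grows exponentially over time when b > 0."""
    return a * math.exp(b * t)

def hazard_piecewise(t, cuts, rates):
    """Constant within each interval; expects len(rates) == len(cuts) + 1."""
    for cut, rate in zip(cuts, rates):
        if t < cut:
            return rate
    return rates[-1]
```

With `k = 1` the Weibull form reduces to the constant (exponential) hazard `1 / lam`, which is one way to see how the models nest.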

Kaplan-Meier Estimate

Table columns:

◦ Time Interval (5) (the beginning of each interval is determined by “death” or “event”)

◦ Number of Population N(t) (includes individuals with censored data at t)

◦ Death / Event

◦ (N-D)/N

◦ S(t) (probability of survival at any particular point-in-time)

At t0: full surviving population at the beginning, S(t) = 1.0

At the end: no surviving or censored population
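The way the (N-D)/N column feeds the S(t) column can be written as a product. The subscripts are an assumption added for clarity: t_i are the distinct event times, N_i is the number at risk just before t_i, and D_i is the number of events at t_i.

```latex
\hat{S}(t) = \prod_{i:\, t_i \le t} \frac{N_i - D_i}{N_i}, \qquad \hat{S}(t_0) = 1.0
```

Each event time multiplies the running survival estimate by the share of the at-risk population that survived that time, which is why the method is also called the product-limit estimator.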

Kaplan-Meier (Product Limit) Assumptions

Re-estimates the survival probability at every event occurrence (to adjust for small sample sizes)

Censoring thought to be independent of the probability of event

Early participants and late participants in a study are thought to have similar survival probabilities

If comparison groups are used, the above assumptions should apply equally to both groups

Additional Terms Related to the K-M Estimates Plots

One-minus calculation: The one-minus plot is created by calculating 1 minus the overall survival probability at the observed time period

Hazard analysis: Event rate (death) at time t as a proportion of the population still at risk at time t (risk of event at any particular measured time based on empirical data)

Life-Table Structure (from K-M)

Table columns:

◦ Time (in Units)

◦ Number at Risk (Population)

◦ Number of Deaths

◦ Number Censored

◦ Survival Probability

Note: Censoring events, or data lost to follow-up, do not by themselves change the survival probability; they reduce the number at risk at later times.
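A table with these columns can be produced by a short routine. This is a minimal sketch of the Kaplan-Meier calculation rather than what SPSS does internally; the function name `kaplan_meier` is hypothetical, and it assumes that records censored at a time t still count as at risk at t.

```python
from collections import Counter

def kaplan_meier(times, statuses):
    """Return rows of (time, at_risk, deaths, censored, survival).

    times:    time-to-event or time-to-censoring for each record
    statuses: 1 = event ("death"), 0 = censored
    """
    deaths = Counter(t for t, s in zip(times, statuses) if s == 1)
    censored = Counter(t for t, s in zip(times, statuses) if s == 0)
    at_risk = len(times)
    survival = 1.0
    rows = []
    for t in sorted(set(times)):
        d, c = deaths[t], censored[t]
        if d:
            survival *= (at_risk - d) / at_risk  # only events change S(t)
        rows.append((t, at_risk, d, c, survival))
        at_risk -= d + c                         # events and censored both leave
    return rows
```

Calling `kaplan_meier([1, 2, 2, 3, 4, 5], [1, 1, 0, 1, 0, 1])` on six made-up records shows the note above in action: the row for time 4 (a censored record) keeps the same survival probability as the row for time 3, but the number at risk drops for time 5.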

Life-Table (Actuarial Table) Structure

Table columns:

◦ Time Period

◦ Death (achieving event) (1)

◦ Censored (0)

◦ Number of Living Individuals at the Beginning of Interval

◦ Number of Individuals at Risk (those still alive at the beginning of the interval, surviving population density)

◦ Probability of Survival at a Particular Point in Time (0 – 1)

◦ Cumulative Survival or “Survival at Time t” or S(t) [The complement is the cumulative probability of event, 1 - S(t); the related hazard function is h(t)]
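The actuarial layout can be sketched as follows. This is an illustration under assumptions: the function name `life_table` is hypothetical, times are assumed non-negative, intervals have a fixed width, and the number at risk uses the common actuarial convention of counting each censored record as half a record in its interval (the slide itself does not specify an adjustment).

```python
def life_table(times, statuses, width):
    """Return rows of (start, deaths, censored, n_alive, at_risk,
    p_survive, cum_survival) for intervals [start, start + width)."""
    n_alive = len(times)
    cum_survival = 1.0
    rows = []
    i = 0
    while n_alive > 0:
        start, end = i * width, (i + 1) * width
        in_interval = [s for t, s in zip(times, statuses) if start <= t < end]
        deaths = sum(in_interval)                # status 1 = event
        censored = len(in_interval) - deaths     # status 0 = censored
        at_risk = n_alive - censored / 2.0       # assumed actuarial adjustment
        p_survive = 1.0 - deaths / at_risk
        cum_survival *= p_survive
        rows.append((start, deaths, censored, n_alive, at_risk,
                     p_survive, cum_survival))
        n_alive -= deaths + censored
        i += 1
    return rows
```

Unlike the Kaplan-Meier table, which recalculates at every event time, this version recalculates once per interval, so the choice of `width` affects the estimates.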

2. Required Online Learning Data

Some Time-Based Issues of Interest in Online Learning

Curriculum designs; course designs; short course designs

Learning sequences (including customized ones)

Learning dynamics

Online learning efficacy

◦ Effects of digital learning objects (DLOs)

◦ Effects of learning assignments

◦ Effects of learning assessments

◦ Remedial learning strategies, and others

Learner decision-making

◦ Learner awareness

◦ Learner metacognition

◦ Indicators of learner choice-making

◦ Catalysts for learner decisions and actions

◦ Learner-created assignments

Indicators of learning acquisition

◦ Indicators of problem-solving capabilities

◦ Indicators of learning “expertise”

◦ Indicators of negative learning (risks to accurate learning)

◦ Indicators of learner innovation and creativity

Some Time-Based Issues of Interest in Online Learning (cont.)

Learner sociality dynamics

◦ Online learning communications

◦ Instructor leadership and communications, instructor modeling

◦ Group work

◦ Learner collaboration

Domain-based online learning

◦ Content-based approaches

◦ Subject matter expert (SME) approaches

Technological dynamics

◦ Third-party applications and tools

◦ Group work dynamics

Capture-able Data that Operationalizes Events-of-Interest

What practically capture-able data may be used to indicate particular “events” in a convincing way?

◦ Are there alternative types of data that may be used to affirm or disconfirm onset of particular events?

◦ What are potential false indicators of particular events? Why are these “false” indicators vs. “true” indicators?

Are the data-of-interest captured as a matter-of-course as part of learning management system (LMS) operations, and are these data available to the institution of higher education?

◦ Are there various datasets captured in student information systems that may be accessible for understanding events-of-interest (registration, course enrollments, final course grades, cumulative GPAs, attribute data, demographic data, and others)?

◦ If the needed data are not captured as a matter-of-course, what additional work will be needed to capture the data accurately and without unnecessary intrusiveness (or any privacy-infringement on all involved)? What are additional costs to the data captures and the data processing?

Capture-able Data that Operationalizes Events-of-Interest (cont.)

What are time-based measures of onset of events-of-interest?

◦ Are there time-based measures of completion of events-of-interest? Or are target events-of-interest continuing?

How much confidence is there in the respective data? Are there ways to combine the data to increase accuracy and confidence?

◦ How so? How not?

Simplest Form

Table columns:

◦ Unique Identifier

◦ Amount of Unit Time to Event (need definition of units)

◦ Censored Column (0 for censoring, 1 for event)

3. Structuring the Target Data in SPSS

Time, Status, Label

Time (“spell”): amount of time before event…or within research observation before being right-censored

Status: event (1) or censored (0)

Label: identifier
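To show how raw dates might be turned into this three-column shape before loading into a statistics package, here is a minimal sketch. The function name `to_record`, the labels, and the dates are all hypothetical and made up for illustration; time is measured in days.

```python
from datetime import date

def to_record(label, start, event, study_end):
    """Turn raw dates into (label, time_in_days, status).

    start:     date the record became at risk (e.g. an item was created)
    event:     date of the event, or None if it was never observed
    study_end: last day of the observation window
    """
    if event is not None and event <= study_end:
        return (label, (event - start).days, 1)   # event observed
    return (label, (study_end - start).days, 0)   # right-censored at study end

# Hypothetical records: one with an observed event, one without.
rows = [
    to_record("A1", date(2017, 1, 10), date(2017, 1, 12), date(2017, 6, 1)),
    to_record("A2", date(2017, 2, 1), None, date(2017, 6, 1)),
]
```

The censored record still contributes its observed “spell” (the time from start to the end of the window), which is what lets the analysis use incomplete data.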

Survival Table

Table columns:

◦ Identifier

◦ Time (by months)

◦ Status (1 is event, 0 is censored)

◦ Estimate (percentage of population surviving after attrition, probability of survival, among the remaining # at risk)

◦ Std. Error (estimated error for the survival estimate, 95% confidence limits)

◦ No. of Cum Events (# of non-survival events, incremented)

◦ No. of Remaining Cases (# of survivors)
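The Estimate, Std. Error, cumulative-events, and remaining-cases columns can be approximated with a short routine. This sketch uses Greenwood's formula for the standard error, a common choice for Kaplan-Meier estimates; treating it as the formula behind the SPSS column is an assumption, and the function name `km_standard_errors` is hypothetical. Rows are reported per distinct event time rather than per case.

```python
import math
from collections import Counter

def km_standard_errors(times, statuses):
    """Return rows of (time, estimate, std_error, cum_events, remaining)."""
    deaths = Counter(t for t, s in zip(times, statuses) if s == 1)
    leaving = Counter(times)                  # events plus censored per time
    at_risk = len(times)
    survival, greenwood_sum, cum_events = 1.0, 0.0, 0
    rows = []
    for t in sorted(leaving):
        d = deaths[t]
        if d:
            survival *= (at_risk - d) / at_risk
            if at_risk > d:                   # skip the term when S(t) hits 0
                greenwood_sum += d / (at_risk * (at_risk - d))
            cum_events += d
            std_error = survival * math.sqrt(greenwood_sum)
            rows.append((t, survival, std_error, cum_events,
                         at_risk - leaving[t]))
        at_risk -= leaving[t]
    return rows
```

With no censoring before the first event, the first standard error matches the familiar binomial form sqrt(p(1-p)/n), which is a quick sanity check on the output.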

Means and Medians for Times-to-Event

Means and Medians

MEANS FOR “SURVIVAL TIME”

Mean: average amount of time before event is achieved (an estimate of the population parameter)

◦ 95% confidence interval: the range within which the population mean time-to-event is estimated to lie, here between 3.388 and 5.838 months (with a standard error of .625 months), in the population of instructional design projects

MEDIANS FOR “SURVIVAL TIME”

Median: midpoint amount of time before an event is achieved (an estimate of the population parameter)

◦ 95% confidence interval: the range within which the population median time-to-event is estimated to lie, here between 1.865 and 6.135 months (the lower and upper bounds), with a standard error of 1.089 months, depending on the variance in the population of instructional design projects
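The reported bounds are consistent with a normal-approximation interval of the form estimate ± 1.96 × standard error. The sketch below checks that arithmetic; the function names are hypothetical, and the point estimates are inferred here as interval midpoints (an assumption, since the slide lists only the bounds and standard errors).

```python
def normal_ci(estimate, std_error, z=1.96):
    """Normal-approximation confidence interval: estimate +/- z * SE."""
    return (estimate - z * std_error, estimate + z * std_error)

def estimate_from_ci(lower, upper):
    """Midpoint of a symmetric interval, taken as the point estimate."""
    return (lower + upper) / 2.0

mean_est = estimate_from_ci(3.388, 5.838)      # about 4.613 months
median_est = estimate_from_ci(1.865, 6.135)    # about 4.0 months

mean_ci = normal_ci(mean_est, 0.625)           # compare with 3.388 to 5.838
median_ci = normal_ci(median_est, 1.089)       # compare with 1.865 to 6.135
```

Both recomputed intervals agree with the reported bounds to within rounding, which supports reading them as intervals for the population mean and median rather than for individual members.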

Percentiles (in Quartiles)

25% of the population will have achieved event by 2 months, leaving 75% surviving (with a standard error of .403 months).

50% of the population will have achieved event by 4 months (with a standard error of 1.089 months).

75% of the population will have achieved event by 7 months, leaving 25% surviving (with a standard error of .593 months).

The tendency in this population is for relative “early” achievement of event (payment for instructional design work) rather than later achievement of event (such as 10 months or later).
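Quartile times can be read off a survival curve as the first time at which the survival probability falls to 75%, 50%, and 25%. The sketch below assumes that reading; the function name `time_at_survival` is hypothetical, and the curve values are invented solely so that the quartiles land on the 2, 4, and 7 months reported above.

```python
def time_at_survival(curve, p):
    """curve: list of (time, survival) pairs sorted by time.
    Return the first time at which survival is at or below p, else None."""
    for t, s in curve:
        if s <= p:
            return t
    return None

# Illustrative (made-up) curve in months.
curve = [(1, 0.90), (2, 0.75), (3, 0.60), (4, 0.50), (7, 0.25), (10, 0.10)]

q75 = time_at_survival(curve, 0.75)   # 25% have achieved event
q50 = time_at_survival(curve, 0.50)   # the median time-to-event
q25 = time_at_survival(curve, 0.25)   # 75% have achieved event
```

Because the survival curve never increases, the time for 75% surviving can never be later than the time for 25% surviving.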

4. About the Data Visualizations

Survival Analysis Plots

To understand the survival analysis data, researchers use not only the table data but also the computer-generated plots.

They make “eye judgments” based on the graphed data.

To this end, it helps to describe what the main plot types are for this simplest version of survival analysis.

Kaplan-Meier Survival Function Curve

Survival function curve shows cumulative survival over time.

The dynamic captured is attrition from the initial population.

The curve is not described as “decreasing” because there are plateaus, but the general trendline is downwards.

The curve is non-increasing (so it’s either plateauing or going downward).

At time 0 (t0), all in the population should be alive (unless left-censored data are included). At time “max” (tmax), at the end of the observation period, the population may all have achieved event, or some have and the rest are in the “censored data” category.

Figure: Kaplan-Meier Survival Function Curve

One Minus Survival Function

The one-minus survival function curve is a non-descending curve that assumes full survival at time 0, so it starts at 0 (at the bottom left of the graph).

This visualization shows the cumulative incidence of the event (non-survival) up to a time point.

At a certain time, 90% of the population is alive, so the curve is at 0.10.

At the next time point, 82% of the population is alive (read: has not yet achieved event), so the curve is at 0.18.

Survival probability is S(t).

Remember that probability is usually expressed as 0 – 1, with 1 as 100% likelihood.

So if one is a member of a target population and is on the timeline, one has descending probabilities of survival over time (if the survival analysis is properly generalizable).

Figure: One Minus Survival Function, 1-S(t) (one minus survival at time t)
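The one-minus values are a direct transformation of the survival probabilities. A minimal sketch, using the 90% and 82% survival figures from the example on this slide (the function name is hypothetical):

```python
def one_minus_survival(survival_probs):
    """Convert survival probabilities S(t) into 1 - S(t)."""
    return [round(1.0 - s, 10) for s in survival_probs]

# Full survival at time 0, then 90% and 82% alive at later time points.
cumulative_event = one_minus_survival([1.0, 0.90, 0.82])
```

The result rises from 0 as survival falls, which is why this plot is the mirror image of the survival function curve.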

Log Survival Function

The log survival function plot shows the natural log of the cumulative survival, ln S(t), against time.

If the risk of the event is constant over time, this plot is roughly a straight line sloping downward; bends in the line can highlight patterns of non-survival events during a time span (of observed time).

This makes the plot useful for events with rates that may increase or decrease at particular time periods. (Testing the “equality” of survival functions between groups is a separate step, done with a test such as the log-rank test, which weights all time points the same.)

For this particular data set, this does not seem to show much more except that time-to-event is achieved fairly early on in this time scale from 0 to about 10 months.

Figure: Log Survival Function
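A small sketch can show why a log scale is informative. It assumes an idealized population with a constant event rate of 0.2 per month (an invented value), for which S(t) = exp(-0.2 t); the function name `log_survival` is hypothetical.

```python
import math

def log_survival(curve):
    """curve: list of (time, survival) pairs with survival > 0.
    Return (time, ln S(t)) pairs; points with S(t) = 0 are skipped."""
    return [(t, math.log(s)) for t, s in curve if s > 0]

# Idealized constant-rate curve over months 0 to 5.
curve_const = [(t, math.exp(-0.2 * t)) for t in range(0, 6)]
points = log_survival(curve_const)

# Change in ln S(t) from one month to the next.
slopes = [points[i + 1][1] - points[i][1] for i in range(len(points) - 1)]
```

Every entry in `slopes` is approximately -0.2, so the plotted line is straight; with real data, segments that are steeper or flatter than their neighbors point to periods of higher or lower event rates.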

Hazard Function Plot

The hazard function captures the amount of risk of non-survival (experiencing target “event”) at any particular time over time.

Depending on the underlying data, hazards may rise over time (ascend), fall over time (descend), or vary in other ways (rising and falling in different patterns over time).

Hazard rates may be constant, or they may be changing.

One classic example of a changing hazard rate is the human life span. This curve is often described as a “bathtub curve,” with high risk at birth, falling risk as a baby gets older, and then rising risk again once a certain age is reached.

Figure: Hazard Function Plot
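One simple way to estimate accumulated hazard from data is the Nelson-Aalen estimator, which appears in the earlier list of statistical tools. The sketch below is an illustration with a hypothetical function name, not a description of how any particular package draws its hazard plot.

```python
from collections import Counter

def nelson_aalen(times, statuses):
    """Return (time, cumulative_hazard) at each event time,
    where the cumulative hazard adds deaths / at_risk per event time."""
    deaths = Counter(t for t, s in zip(times, statuses) if s == 1)
    leaving = Counter(times)              # events plus censored per time
    at_risk = len(times)
    cum_hazard = 0.0
    rows = []
    for t in sorted(leaving):
        if deaths[t]:
            cum_hazard += deaths[t] / at_risk
            rows.append((t, cum_hazard))
        at_risk -= leaving[t]
    return rows
```

Steep stretches of the resulting curve correspond to periods of high hazard, and flat stretches to periods of low hazard.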

5. Time-to-Event Analysis

Time-to-Event Analysis

A shift in thinking from “survival analysis” where…

◦ Population: animate or inanimate individuals or objects (with each “row” an “experimental unit” or member of the population)

◦ Event: an observable occurrence to members of the population (in which time is a factor and in which time may be observed)

◦ Event may be desirable or undesirable

◦ Interventions: Ways to use levers and mechanisms to try to change outcomes

Available empirical data

◦ Available attribute data about the members of the population (from which groupings may be created and time-to-event analyses run on those comparative groups)

◦ Different time-to-event trajectories for different groups and sub-populations?

◦ Available intervention data for “control” and “experimental” groups

Big data are preferable (for higher confidence in results)

Time-to-Event Analysis (cont.)

So essentially time-series data with relative frequencies of occurrences of events for the population members, representing a phenomenon

May include additional qualitative (categorical) and quantitative variables

6. Enriched Time-to-Event Analyses

Adding Local Color

Beyond the numbers, the min-max ranges, the quartiles, the table and chart / plot data, it is important to humanize the time-to-event information

The general extracted data are about macro-level phenomena and often miss the experienced aspects at the micro level (for animate populations)

◦ Some researchers record and share the human stories behind several of the more interesting cases

◦ They humanize the data by capturing local color

◦ They increase the memorability of the research by telling a story

There are still ways to add local color to inanimate populations around the relevance of the time-to-event

◦ There is always a human angle to research, even of inanimate things (such as the expected life of a particular commercial product in “survival analysis”)

Comparing Groups

It is possible to run time-to-event analyses on comparable groups with different attributes

◦ For example, in the instructional design analysis, it is possible to separate projects by “hard” vs. “soft” science projects to see if those differ in terms of actual payment (achieving “event”) and length of the projects before achieving “event” and censoring

Similarly, it is possible to compare groups based on whether they received interventions or not (experimental group vs. the control group)

◦ These may be done in teaching environments (albeit not in ways that disadvantage learners in either group)
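A common way to compare two groups' time-to-event patterns is the log-rank test. The sketch below is a plain implementation of the standard two-group version, offered as an illustration rather than as the method used for the poster's analyses; the function name `logrank_test` is hypothetical, and the p-value uses the chi-square distribution with 1 degree of freedom.

```python
import math

def logrank_test(times_a, status_a, times_b, status_b):
    """Two-group log-rank test. Returns (chi_square, p_value), 1 df.
    Status codes: 1 = event, 0 = censored."""
    data = [(t, s, 0) for t, s in zip(times_a, status_a)]
    data += [(t, s, 1) for t, s in zip(times_b, status_b)]
    event_times = sorted({t for t, s, _ in data if s == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk in A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk in B
        d_a = sum(1 for u, s, g in data if u == t and s == 1 and g == 0)
        d_b = sum(1 for u, s, g in data if u == t and s == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n            # expected events in A under H0
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2.0))  # chi-square tail, 1 df
    return chi_square, p_value
```

A small p-value suggests the two groups' survival curves differ by more than chance would explain; it does not by itself say why they differ.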

Observations of Multiple Events

More nuanced insights may be possible

Observations of multiple sequential events

◦ Precursor events to other events

◦ Intermediate events

Sequences of time-to-events analysis

Predictivity

A major value is in theorizing and hypothesizing around time-to-event data

Groups similar to those observed in a time-to-event analysis may be assumed to be under the same time and other constraints and so to have similar time-to-event patterns

7. Worked Real-World Cases

A Note about the Following “Worked Cases”

To give a sense of how this might work, some real-world data was run through survival analysis, and various visualizations were created. In this section are some questions and resulting visualizations.

To do this accurately, it is important to have the full cycle of work: theorizing, extracting data, processing the data, running the time-to-event analysis, interpreting the data, and analyzing and discussing the findings.

The actual background work was done for the respective articles from which the data visualizations were created.

In some cases, some light work was done to create visualizations for this digital poster session.

How many months does it take before an instructional design project is either paid out or ends without payment (censored)?

(from “Applying ‘Survival Analysis’ to Instructional Design Project Data” by the author)

And Practical Applicability

And is there a way to tell early on which projects will end up being paying ones, before a lot of time is lost working on unfunded projects?

How long should an instructional designer wait to see if a project will make or not?

◦ Of course, it is clear which programs are better funded on a land-grant university campus, so there are other ways to see this as well.

How many days does it take before a created assignment is updated?

(from “Wrangling Big Data in a Small Tech Ecosystem” by the author)

And Practical Applicability

So most assignments in the LMS are updated, and many are updated shortly after initial moments of creation. In some cases, these are run live first before they are updated? In some cases, these are updated very shortly after creation…such as the same day but just an hour later, for example.

Some assignments are updated three years later… Which are these, and why are they updated? (Could the extended date lengths be a product of using LTI-enabled auto-transfers of assignments?)

◦ In one outlier context, this instructional designer found transferred assignments that were 15 years old that had not been updated.

What updates are usually made to assignments, and why?

However, there is a long tail of assignments that are not updated ever or over long periods of time. Should these be updated for relevance and applicability?

How many days do deleted quizzes survive until deletion in an LMS instance?

(from “Wrangling Big Data in a Small Tech Ecosystem” by the author)

And Practical Applicability

Of the quizzes that are deleted, most are deleted within a few months of their creation. Why?

Are some quizzes left dormant for a long time before a decision is made to delete them?

What goes into the decision to delete a quiz (instead of revising it)?

Are deleted quizzes replaced, and if so, with what?

Is it a net “positive” or a net “negative” that created quizzes are deleted? (Or is this the wrong question altogether?)

A Few Caveats: Within-Sets vs. Across-Sets

The prior are within-set analyses…but they could be much more valuable if compared across sets from multiple institutions.

After such comparisons, there can be enriched theorizing and hypothesizing and further analyses run on the empirical data.

◦ For example, why are there certain time-to-event differences between different sets? Among different sets?

8. Some Possible Askable Questions in Teaching and Learning

Page 68: Learning from Time-to-Event Data from Online Learning Contexts

Conceptualizing Possibilities

1. Begin with an event-of-interest.

2. Theorize about what that event may mean.

3. Hypothesize what that event may mean within a certain in-world phenomenon.

4. Ensure that the conceptualized event exists in time.

5. Ensure that there are observable indicators in collectible data to know when the event has occurred.

6. Backtrack to potential precursor events.

7. Hypothesize variables that may affect the event.

8. Identify start points at which to begin the analysis.

9. Record the hypotheses prior to the research observations.
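The steps above can be sketched as a small data structure: a start point, an observable event date (if any), and the end of the observation window. The class and field names below are hypothetical; the point is only that each subject reduces to a duration plus a flag for whether the event was observed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Observation:
    subject_id: str
    start: date                  # step 8: the start point for this subject
    event_date: Optional[date]   # step 5: None if no indicator was observed

def to_record(obs, study_end):
    """Return (duration_days, event_observed) for one subject."""
    if obs.start > study_end:
        raise ValueError("start point falls after the observation window")
    if obs.event_date is not None and obs.event_date < obs.start:
        # Step 4: the event has to exist in time, after the start point.
        raise ValueError("event precedes the start point")
    if obs.event_date is not None and obs.event_date <= study_end:
        return (obs.event_date - obs.start).days, True
    # No event inside the window: the record is censored at study_end.
    return (study_end - obs.start).days, False

# Hypothetical example: a term running from Jan. 17 to May 12, 2017.
term_end = date(2017, 5, 12)
first_post = Observation("learner-a", date(2017, 1, 17), date(2017, 1, 27))
no_post = Observation("learner-b", date(2017, 1, 17), None)
print(to_record(first_post, term_end))
print(to_record(no_post, term_end))
```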

Page 69: Learning from Time-to-Event Data from Online Learning Contexts

Starting with an Event-of-Interest (and Working Backwards and Forwards in Time)

Page 70: Learning from Time-to-Event Data from Online Learning Contexts

If Time-to-Event is the Outcome (or Dependent) Variable…(and it is in time-to-event and “survival” analyses)

What co-variates (independent variables) act on the time-to-event?

What are possible interaction effects between the co-variates?

◦ Which of the variables are likely the most influential on the time-to-event?

Which of the variables are able to be modified and acted-on for a beneficial outcome? (either hastening time-to-event or slowing time-to-event…or preventing the event…)

◦ Is there a way to study the possible effects of independent variables / co-variates on the time-to-event (dependent variable)?

What is the role of time in the time-to-event?

◦ (In aging and health issues, time clearly has a role. In dealing with physical objects, time also has a clear role. Time may have complex effects on other in-world phenomena.)
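One rough way to gauge how a single two-level co-variate acts on the time-to-event is to compare event rates per unit of follow-up time. The sketch below assumes a constant hazard within each group, which is a strong simplification and not the regression modeling a full analysis would use. The co-variate (attending an orientation) and all values are hypothetical.

```python
def event_rate(durations, observed):
    """Events per unit of follow-up time (the estimate of a constant hazard)."""
    total_time = sum(durations)
    if total_time <= 0:
        raise ValueError("total follow-up time must be positive")
    return sum(1 for e in observed if e) / total_time

def hazard_ratio(group_a, group_b):
    """Ratio of event rates; each group is a (durations, observed) pair."""
    return event_rate(*group_a) / event_rate(*group_b)

# Hypothetical weeks until dropout; False marks learners still enrolled (censored).
with_orientation = ([10, 20, 30, 40], [True, True, False, False])
without_orientation = ([5, 10, 15, 20], [True, True, True, False])
ratio = hazard_ratio(with_orientation, without_orientation)
print(ratio)
```

A ratio below 1 would suggest the event arrives more slowly in the first group. It does not show that the co-variate caused the difference, and it says nothing about interaction effects.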

Page 71: Learning from Time-to-Event Data from Online Learning Contexts

If Individuals, Groups, and Entities of Interest Are Available…

What are related individuals, groups, and entities that might be comparable?

What are sub-groups within the population-of-study that may be identified and broken down for comparisons and contrasts?

◦ For example, learners may be divided by demographics (age group, class, race, ethnicity, geographical location, and others), majors, learning sequences, and so on

◦ For example, assignments may be divided by creators, assignment types, courses, learning domains, and so on

Are there naturally occurring “interventions” (experimental groups) and non-interventions (control groups)?

Are there sound interventions (potentially beneficial) that may be conducted while abiding by human subjects research professional ethics and practices?

Page 72: Learning from Time-to-Event Data from Online Learning Contexts

Some Possible Questions from Time-to-Event Data from Online Learning

Learner-Based Questions

How long does it take before a learner makes a friend in an online course? [What are ways to encourage constructive pro-learning social ties sooner rather than later? What are ways to keep these relationships constructive and healthy throughout the learning and beyond?]

How long does it take for a new online learner to acclimate to the online learning ecosystem? How can one tell? [What are ways to increase the speed to learner comfort and confidence? What are ways to increase online learner sense of self-efficacy and venturing (risk-taking)?]

How long does it take before an online message sent by the instructor is read by a majority of the students? [What are ways to make online messages more salient and interesting for earlier uptake? Are there ways to measure the remembrance of the message contents throughout the learning time period?]
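The message-reading question has a simple operational form: find the delay at which more than half of the class has opened the message. The sketch below assumes the LMS can export a read delay (in hours) for each student who opened it; the function name and the numbers are hypothetical.

```python
def time_to_majority(read_delays_hours, class_size):
    """Hours until more than half of the class has read the message.

    Returns None if a majority never read it during the observation window.
    """
    needed = class_size // 2 + 1
    delays = sorted(read_delays_hours)
    if len(delays) < needed:
        return None
    return delays[needed - 1]

# Hypothetical class of 10 in which six students opened the message.
print(time_to_majority([1, 3, 5, 30, 50, 72], class_size=10))
# Hypothetical class of 10 in which only two students opened it.
print(time_to_majority([1, 2], class_size=10))
```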

Page 73: Learning from Time-to-Event Data from Online Learning Contexts

Some Possible Questions from Time-to-Event Data from Online Learning (cont.)

Learner-Based Questions (cont.)

How long does it take learners to achieve a “breakout capacity” in a particular learning sequence (such as a degree program)? [Why? What are ways to speed the time-to-event?]

How long does it take before a student graduates? What are time-based patterns in terms of graduating? [What are ways to increase speed-to-event without compromising learning quality?]

How long does it take before a student drops out? What are time-based patterns in terms of dropping out (such as time periods of greatest risk)? [And knowing those time periods of greatest risks, what sorts of interventions may be done to try to ensure both individual and group learner retention (and long-term positive learning outcomes)?]
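"Time periods of greatest risk" can be read off a simple life table: for each term, divide the number of dropouts by the number of students still enrolled at the start of that term. The sketch below assumes every student in the cohort is followed for all terms, so there is no censoring partway through; the cohort data are invented.

```python
def term_hazards(dropout_terms, n_terms):
    """Per-term dropout hazard.

    dropout_terms: for each student, the 1-based term of dropout,
                   or None if the student did not drop out.
    """
    at_risk = len(dropout_terms)
    hazards = []
    for term in range(1, n_terms + 1):
        dropped = sum(1 for t in dropout_terms if t == term)
        hazards.append(dropped / at_risk if at_risk else 0.0)
        at_risk -= dropped  # dropouts leave the risk set for later terms
    return hazards

# Hypothetical cohort of 10 students followed for 4 terms.
cohort = [1, 1, 1, 2, 4, None, None, None, None, None]
hazards = term_hazards(cohort, n_terms=4)
riskiest_term = hazards.index(max(hazards)) + 1
print(hazards, riskiest_term)
```

Knowing which term carries the highest hazard is what allows an intervention to be timed before that term rather than after it.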

Page 74: Learning from Time-to-Event Data from Online Learning Contexts

Some Possible Questions from Time-to-Event Data from Online Learning (cont.)

Online Learning Communities

What is the time-to-event for when members of an online course experience a sense of being part of a community? What precipitates this experience? Is the precipitating factor serendipitous? Designed? What sparks that experience? Who experienced the communal spark, and who didn’t, and why didn’t they? [What are some early ways to create “communities of practice” in an online learning context? What are some continuing ways to promote a sense of learning community for the online learners? Also, what are some ways to head off negative or anti-learning aspects or dynamics of community?]

Page 75: Learning from Time-to-Event Data from Online Learning Contexts

Some Possible Questions from Time-to-Event Data from Online Learning (cont.)

Online Course Development-Based Questions

How long does it take for an instructor or development team or other to develop an online course? How long is it before a developed course is alpha-tested and beta-tested? [What are ways to speed up course development time without compromising quality?]

How long does it take before an online course is created and all accessibility mitigations are in place? [What are ways to encourage instructors and course builders to design accessible courses right at the beginning? What are ways to encourage them to retrofit courses for accessibility as soon as possible (if this wasn’t done earlier)?]

Page 76: Learning from Time-to-Event Data from Online Learning Contexts

Some Possible Questions from Time-to-Event Data from Online Learning (cont.)

Digital Learning Objects

How long does it take before a digital learning object (DLO) in a particular domain field ages out of cutting-edge applicability? Ages out of learner interest? Learning relevance? [What are ways to design DLOs so that they are more future-proofed—for cutting-edge applicability, learner interest, learning relevance, and other dimensions?]

◦ Likewise with digital maps? Data visualizations? Case studies? Images? Datasets? Cases? Examples?

◦ Software programs?

Page 77: Learning from Time-to-Event Data from Online Learning Contexts

Some Possible Questions from Time-to-Event Data from Online Learning (cont.)

Online Assessments

How long does it take before a quiz becomes dated in a particular course or course sequence? What would indicators of quiz datedness be? [What are interventions to prevent quiz datedness? What are constructive ways to indicate to instructors that their quizzes are dated? Are there ways to design quizzes to be future-relevant for longer?]

How much time passes between when an assessment is created and when it is used? Do the time patterns show that an instructor tends to be fly-by-the-seat-of-his/her-pants or not? Are there quality differences between quickly developed assessments and those requiring more time?

What is the typical lifespan of an assessment? Are there quality differences between assessments used for longer or shorter periods of time? What do these patterns suggest about the quality of assessments (or their lack thereof) and their longevity (or lack thereof)?

Page 78: Learning from Time-to-Event Data from Online Learning Contexts

Some Possible Questions from Time-to-Event Data from Online Learning (cont.)

Time-to-Learning

How much time passes before critical learning decays? [What are ways to delay the effects of learning decay? What are effective ways to ultimately prevent learning decay?]

How much time does it take to effectively relearn after incorrect initial learning? [What are effective methods to help learners get past incorrect initial learning with time efficiency?]

How much time does it take to develop muscle memory for a particular learning task? [What are teaching and learning methods to enhance muscle memory learning for a particular task for learner finesse and speed? Learning tools? Simulations? Equipment?]

How much time does it take to train a new learning-based habit? [What are ways to lower the amount of time to constructive habit-acquisition? What are ways to extend the time of the effect of a new habit (and lower relapse to less-constructive learning habits)?]

Page 79: Learning from Time-to-Event Data from Online Learning Contexts

Some Possible Questions from Time-to-Event Data from Online Learning (cont.)

Online Instruction

What is the time-to-event for the first time an online instructor has a unique and direct interaction with a particular learner? [If this never occurs for a particular learner, why not? What are ways to enable such a sense of customized learning for online learners?]

What is the time-to-event for the first time an online instructor adjusts the learning content for a particular online learner? What is the reason for such a change? Is such a change constructive and beneficial?

Page 80: Learning from Time-to-Event Data from Online Learning Contexts

Some References

Hosmer, D. W., Lemeshow, S., & May, S. (2008). Applied Survival Analysis: Regression Modeling of Time-to-Event Data (2nd ed.). Hoboken: Wiley-Interscience.

Kaplan, E. L., & Meier, P. (1958). Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association, 53(282), 457–481. Retrieved Apr. 3, 2017, from https://www.jstor.org/stable/pdf/2281868.pdf.

“Survival Analysis.” (2017, Mar. 27). Wikipedia. Retrieved Apr. 3, 2017, from https://en.wikipedia.org/wiki/Survival_analysis.

Page 81: Learning from Time-to-Event Data from Online Learning Contexts

9. Contact and Conclusion

Page 82: Learning from Time-to-Event Data from Online Learning Contexts

Contact and Conclusion

Dr. Shalin Hai-Jew, Instructional Designer

◦ iTAC

◦ Kansas State University

◦ 212 Hale Library

[email protected]

◦ 785-532-5262

* Thanks! I am grateful to the organizers of the 4th Annual Big 12 Teaching & Learning Conference at Texas Tech University for accepting this digital poster.

Note about the Data Visualizations: All data visualizations were created by the author. The image of the countdown clock was a free and open-source clipart image (without an obvious citation).
