Applying “Survival Analysis” to Instructional Design Project Data


Upload: shalin-hai-jew

Post on 23-Feb-2017



2/16/2017 Applying “Survival Analysis” to Instructional Design Project Data

http://scalar.usc.edu/works/c2cdigitalmagazinefall2016winter2017/applyingsurvivalanalysistoinstructionaldesignprojectdata?path=index&t=148725… 1/12


C2C Digital Magazine (Fall 2016 / Winter 2017)
Colleague 2 Colleague, Author


Applying “Survival Analysis” to Instructional Design Project Data

By Shalin Hai-Jew, Kansas State University

Traditionally, “survival analysis” is a statistical representation of the speed of human decline under varying health and other conditions. This representation is based on empirically observed data, and from this data a regression curve is extracted, from which generalizations may be made (about similar populations in similar contexts). This data can reveal times of heightened risk, and in that sense, it could indicate possible critical times for extended surveillance and possible interventions.

Simply, at the beginning of the research (Time Zero), there is a population of individuals who are fully present. Over time, members fail to survive and drop out. At any particular time, there is a survival rate, S(t), which is known as “survival at time t”. Survival is conceptualized in part as a function of time, and it is also a function of “hazard,” or the risk of non-survival. The higher the hazard rate, the lower the survival rate; the converse is also true: the lower the hazard rate, the higher the survival rate (these rates have a negative correlation).

As noted earlier, these observations may be applied to other similar populations for deeper understandings. They may be applied to different populations to see if there are differences in survival rates: for example, one group may serve as a control, and another may be the group that receives some intervention to prolong survival. The comparisons made are between population curves to observe differences. With sufficiently large population sets, there may be an underlying assumption of a normal survival distribution (a Gaussian distribution), but there is no such assumption for smaller population sets, which may be convenience data sets and which are not particularly random (and therefore not representative of a full normal distribution and not broadly generalizable). With sufficiently large populations, though, z-scores may be extrapolated, and statistical significance of findings may be assessed.

Also, in terms of data, the longer the time period studied, the more data that may be included.
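The inverse relationship between hazard and survival described above can be written compactly. The following is a sketch of the standard textbook definitions for discrete time units, not notation taken from the article; the symbols T (the time at which the event occurs), t_j (the j-th time unit), and h_j (the hazard in that unit) are introduced here only for illustration.

```latex
% T: time of the event; t_1 < t_2 < ... : discrete time units (e.g., months)
S(t) = \Pr(T > t), \qquad
h_j = \Pr(T = t_j \mid T \ge t_j), \qquad
S(t) = \prod_{j:\, t_j \le t} (1 - h_j)
```

Each factor (1 − h_j) is at most 1, so a larger hazard in any time unit pulls the survival curve down from that point on, and the curve can never rise.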

Figure 1: Sample Survival Function Plot


What a simple survival analysis graph shows, simply, is the time-to-event of death in a non-increasing line graph. (The line is not described as decreasing because there are plateaus, or times when no member of the original population is lost.) Another way to think about this is as a measure of attrition from a particular population over time. Time may be measured as a continuous set of values or as discrete ones. In this example, time is treated as discrete and measured in monthly units (more granularly measured time is not necessary in this example).

The cumulative hazard function is non-decreasing and trends up as hazard accumulates over time. The “hazard” concept is something like “risk” based on empirical observations of events. In a simple survival analysis model, hazard is constant over time. In real-world contexts and more sensitive models, the hazard function changes over time and may increase or decrease. (A classic example of a hazard curve is the "bathtub curve," which is intuitive and worth a look.)

"Censoring"

So why don’t statisticians just use a normal linear regression to capture such changes over time? The thinking is that survival analyses include "censored" data in the time-series analysis. Data censoring refers to the dropout of persons from a survival analysis, due to the ending of the research period or a lack of follow-through of research participants with researchers; such normal random censoring is beyond the control of researchers. Including censored data in a survival analysis helps mitigate "survivorship bias" in statistical analysis, or overweighting the effects of the data points that "survive" the research period and come to researcher attention while not counting or acknowledging the data that drops out of the study or is not considered or seen by the researcher. (Mitigating "survivorship bias" requires researchers to ask themselves what data they are not seeing and not including when they design their research.)

In this case, it helps to conceptualize a timeline with the past to the left and the future to the right. Here, “left censoring” refers to researchers' lack of knowledge of what has happened with participants before they were included in a study, and “right censoring” refers to the completion of the research before the individuals experience the event (non-survival in some cases). After the research is complete, researchers will not be following up with the remaining or surviving participants and so do not know what happens to them (although they may be able to infer what will happen given the captured data from others in that particular cohort).

Research observations are censored when the data about their survival time is undetermined or incomplete. This censoring information, even though it is a data limitation, provides richer real-world information for analysis.

In the same way that all the population is alive at Time 0, there is a point in the future when the entire population is no longer present. It is in the in-between time that there is research interest. For example, there may be periods of intensification of non-survival ("critical phases")…or long periods of survival and non-events…or spates of censoring due to in-environment events or other factors.

There are a variety of variations to these basics. Besides censoring, research projects may have individuals enter at different times in what is known as staggered entry. There are statistical methods to enable study of multiple events…in a survival analysis, and other insights. For this work, a very basic survival analysis will be explored but with real-world and original data.
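The right-censoring idea described above can be made concrete with a short sketch. This is a minimal Python illustration, not part of the article's SPSS workflow: the function names, the `STUDY_END` date, and the sample dates are all hypothetical, and durations are counted in whole calendar months as a simplifying assumption.

```python
from datetime import date
from typing import Optional, Tuple

# Hypothetical end of the observation window (an assumption for illustration).
STUDY_END = date(2016, 8, 31)


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, never negative."""
    return max(0, (end.year - start.year) * 12 + (end.month - start.month))


def code_observation(
    start: date,
    event_date: Optional[date],
    study_end: date = STUDY_END,
) -> Tuple[int, int]:
    """Return (duration_in_months, event_flag).

    event_flag is 1 when the event was observed inside the study window,
    and 0 when the observation is right-censored (no event seen by study_end).
    """
    if event_date is not None and event_date <= study_end:
        return months_between(start, event_date), 1
    # Event unknown or after the window closed: only the time observed counts.
    return months_between(start, study_end), 0


# Staggered entry: each subject has its own start date.
observed = code_observation(date(2016, 1, 15), date(2016, 5, 20))
censored = code_observation(date(2016, 6, 1), None)
```

A censored subject still contributes the months during which it was known to survive, which is the information a plain regression on completed cases would discard.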

Basic Data Elements of a Survival Analysis

To operationalize a survival analysis, researchers have to be able to define a few important elements. So what basic data is needed in a survival analysis? A researcher needs the following:

• a particular population (or phenomenon or object) to study,
• defined units of time (at what level of granularity), and
• what an “event” (non-survival, in the classic survival analysis sense) looks like (objectively).

Time is the independent variable, and time-to-event is the random (outcome) variable. There may be variables (called "covariates" in survival analysis) that affect survival outcomes: positively or negatively, and to varying degrees. Some of these covariates may contribute to hazard, which decreases the time-to-event (or "non-survival"). Some covariates such as health interventions may lower hazard and increase the time-to-event (or "non-survival").

Now that the basic elements have been explained, it is important to conceptualize “survival analysis” in other ways. For example, it may be thought of as time-to-event.

Various Applications of “Survival Analyses”

A “survival analysis” may be applied to many more contexts than people passing on from a population. Other forms of survival analysis are also known as “time-to-event” analysis, event history analysis, reliability analysis, and duration analysis, and variations on this method are applied in engineering, economics, sociology, and other fields. To have a sense of the breadth of “survival analysis,” it may help to view the article network around the Wikipedia article about this in Figure 2.

Figure 2: A “Survival_Analysis” Article Network on Wikipedia (1 deg.)

See if you can identify the “survival analysis” or “time-to-event” aspects of the following questions:

• In political science, what is the time-to-event for a nation-state to reach full collapse?
• In sales and marketing, what is the time-to-event for a groomed potential customer to actually purchase the particular merchandise?
• In sociology, what is the time-to-event for social movements to emerge into a powerful entity?
• In public health, what is the hazard rate of a newborn in a particular population to acquire a particular disease?
• In education, what is the survival rate of learners in the STEM (science, technology, engineering, and math) pipeline (from pre-K all the way through graduate school and beyond)?
• Or how long do learners stay in a particular online learning degree program? Or the college or university before they either graduate or drop out?
• In education, what is the time-to-event for a freshman to earn a baccalaureate degree?
• In a help desk ticketing system, how long do tickets remain open before they are closed, and how many are never closed (censored)?

The general examples above show that the “event” does not have to equal the demise of the subject. The event may be what many may consider a net positive.


There are various types of data structures underlying survival analyses, different computations that may be applied to the data, and different software programs that may be used for such analyses. This particular article will use IBM’s SPSS.

Survival Analyses in Education

Survival analysis, even in its simple form, can be applied to education. Essentially, any phenomenon that has a start-and-stop time element can be depicted in a survival analysis. This is not to say that these are run in a discovery or exploratory method outside the context of some research questions, research design, and hypothesizing.

To see how this might work, survival analysis was run on instructional design (ID) projects from 2010 – 2016. The original set of objects are the projects, and the “events” are completion with bills sent to the principal investigators (PIs) / faculty owners of the instructional design projects. The start date is artificial since there are data from prior years (to 2006), but the billing for prior projects consisted of purchases of half-a-year of instructional designer time...for multiple projects...and those are somewhat beyond the focus of this example (pre-bought time guarantees the project unless the instructional designer fails spectacularly in follow-through on the work).

Censored projects are those for which no paid work was done; in these cases, there were early discussions of work and initial recording of hours, but nothing ultimately came of the work for various reasons. One way to conceptualize censoring is as somewhat incomplete work, something that has not been taken to its full actualization. (Left out entirely from this research are the thousands of consultations and works for which no pay was ever expected because the projects did not rise to the design and development level of funded work. Also left out are the many in-depth multi-year projects which occurred as part of the higher education ecosystem…but for which no funds were directly moved between university units. And finally, there is the grant-chasing work done on various projects. In many cases, no funds are directly acquired. In other cases, grant funds are attained but are distributed more locally to the respective colleges and departments, and no payments are made to the instructional designer.) While the underlying data here are real, the specifics have been removed because these are not important to the method and would not contribute to deeper understandings in this context.

It is thought that such “time-to-event” studies may improve the management of instructional design projects, so that these move from the stages of light talk to actual work and actual completion of paid work. The paying for work often is correctly conflated with the use of the learning objects. Unpaid projects are often not political priorities and garner much less administrative protection over time. While there are many ways to complete instructional design work successfully and to help the university attain a “win,” those projects will not be discussed here.

Hazarding Educated Guesses (“Hypothesizing”)

Understanding which ID projects thrive may shed light on the “hazards” that make a project sufficiently successful (based on the prior terms). As such, this set may be split into two: those that make it to "event" and those which "censor" out. Then the characteristics of the instructional design projects in each grouping may be compared. Another built-in way to break apart the groups into categories is by length of project. Yet another is to cluster projects by high-cost vs. low-cost (with various bounds for each). These categories may be defined already with the given data. The topics of the respective projects may be analyzed, too, based on the redacted data (which is known to the researcher and may be coded in future work).

Beyond those features mentioned earlier, it would help to identify other attribute features of the respective instructional design projects. These may include the following variables:

• types of deliverables
• existence of predefined designs
• standards for the instructional build
• defined technical platforms
• requirements to use proprietary technical platforms
• size of development team
• complexity of the targeted learner population
• whether there is need to collaborate with other institutions, and others

Once the respective features are defined and the individual projects are duly coded, it will be possible to run survival analyses on the respective subclusters of instructional design projects to look for time-pattern differences. For survival analyses involving multivariate data, researchers go to the Cox Regression survival analysis (which is one of four survival analysis types in SPSS).

From the above, there will be a list of "askable questions" based on data queries and data visualizations.

• For example, of the set of ID projects which achieve “event,” how does that set differ from those that end up “censored”? Is it possible to anticipate what sort of ID project may be in one category or the other?
• How do types of required project deliverables affect survival, "event," and censoring?
• Are predefined designs a net positive or a net negative in terms of project attainment of "event" / completion?
• Do collaborations with off-campus entities mean projects take longer periods or shorter periods to complete? Or are such off-campus collaborations not a factor?
• and so on...
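Questions like these, where several coded project features are weighed at once, are what the Cox regression mentioned above is meant to address. The article does not spell out the model, so the following is only the standard textbook form, offered as a sketch; the covariates x_1 … x_p stand for hypothetical coded project features (for example, a 0/1 flag for off-campus collaboration), and the β coefficients would be estimated from the data.

```latex
% h_0(t): baseline hazard shared by all projects
% x_1, ..., x_p: coded project features (hypothetical examples)
h(t \mid x_1, \dots, x_p) = h_0(t)\, \exp(\beta_1 x_1 + \dots + \beta_p x_p)
```

Under this form, a positive coefficient raises the hazard (a shorter expected time-to-event), and a negative coefficient lowers it.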

Real-world debriefing. From experience, projects that are pre-funded with a federal grant tend to proceed to funded work. Those ID projects with required and defined online courses or learning objects will generally result in successful instructional design and project completion and payment to the unit. In part, those projects have to meet stringent legal requirements on a number of fronts: intellectual property protections, accessibility, and so on. These also have to survive project assessment by a third-party entity, and these have to show learning gains. Projects that have strong leadership, a lineup of talent, a clear plan, and sufficient resourcing tend to do well and achieve “event” instead of being “censored”.

Censoring does not always occur due to limitations of the project PI. An instructional designer may have to decline a project because the project requirements are not attainable given local skills and local resources. In that case, IDs have to self-censor projects. If PI expectations are too high in terms of what the technologies can achieve, that is another scenario in which self-censoring of projects occurs. Early and clear communications are critical to help identify projects which are a mismatch with ID capabilities and skills (which vary greatly depending on units).

Instructional Design on a University Campus

It may help to describe the instructional design context on this particular university campus. For some years, instructional design was generally provided as a service to all faculty who requested the service. However, as time passed, it became clear that some clients were those who had federal grant dollars and who needed intensive builds (whole online courses or learning sequences, sophisticated websites, and other objects). An official university rate card was created to address these types of projects that required more than the complimentary initial 10 hours. While some long-term instructional design projects were created for free, given the political horse trading on campuses, these tend to be outliers…maybe only a half-dozen in a decade of ID work at this institution, at least based on the experiences of one instructional designer.

Billing for such projects requires approvals by the principal investigators (of grants) / faculty members. Some projects extend over a budget year and may require multiple bills. In general, bills are finalized at the ends of projects within budget years, but occasionally a few projects may extend over a fiscal year. The average bill for an instructional design project between 2010 – 2016 was $2,488.85, with this average skewed by several projects paying in the five digits. The period of research was seven years (or 84 − 4 = 80 months, since this was done at the end of August 2016). The projects were entered in a staggered entry, and the length of project time would be the months of work spent on the project. Twenty-six projects were identified, with a min-max cost range of $400 to $34,000.

On this campus, because of the sparsity of funds, the idea is to get any ID service gratis and move on without any bill if possible, so it’s important for an instructional designer to discuss the rate card and expectations early on.


The average length of time from the start of a project to the end is four months. Rarely, talks will start on a project many months before the actual start of the project because the funding has not come in until then. A project begins with the first contact and the recording of the gratis hours. For many projects, if it seems that the work will only be consultation without full design and development, those hours are only recorded on a work calendar, and no bill is even started. Bills are only started if there is discussion of more extensive paid work. In Figure 3, the various ID projects with bills are listed. That said, there are sometimes periods of drought for paid work on a campus. Also, the years here are calendar years, not fiscal years (which start in July and end at the end of June the next year).

Figure 3: For-Pay Instructional Design Projects: 2010 – 2016

Setting up for the Survival Analysis

Setting up the data for this survival analysis begins with a review of bill files. From these files, the following are captured: the general name of the project, start dates, end dates, and final bill amount (if any). The length of each project is calculated from the start and end dates, with the unit of time being months. If a project resulted in a bill, that would be labeled a “1” for “event,” and if a project did not result in a bill, that would be labeled a “0” to represent “censored.” This information is represented in three columns:

• ProjectName
• UnitTime (or "Spell")
• EventorCensored

The first column is composed of string data written in camel case; the second is integer data; the third is a dummy variable represented either with a 1 (event) or a 0 (censored), presence or absence. The resulting table data was run through a survival analysis using the Kaplan-Meier estimate in SPSS. The Kaplan-Meier method originated in 1958 and is known as "the product limit estimator". The ingested data may be viewed in Figure 4.
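To show what the product-limit estimate does with a three-column table like this one, here is a minimal Python sketch. The article ran the estimate in SPSS, so this is an independent illustration rather than the author's procedure; the function `kaplan_meier` and the five sample rows are hypothetical stand-ins, not the article's redacted dataset.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def kaplan_meier(rows: Iterable[Tuple[int, int]]) -> List[Tuple[int, float]]:
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    rows: (unit_time, event_flag) pairs; event_flag 1 = event, 0 = censored.
    Returns (time, S(time)) for each time at which at least one event occurred.
    """
    rows = list(rows)
    events = Counter(t for t, flag in rows if flag == 1)  # events per time
    leaving = Counter(t for t, _ in rows)                 # events + censored per time
    at_risk = len(rows)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events.get(t, 0)
        if d:
            # Multiply by the share of at-risk subjects that survive past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who left at t (event or censored) drops out of the risk set.
        at_risk -= leaving[t]
    return curve


# Hypothetical rows: (ProjectName, UnitTime in months, EventorCensored)
projects = [
    ("ProjectA", 1, 1),
    ("ProjectB", 2, 1),
    ("ProjectC", 2, 0),  # censored: shrinks the risk set but adds no step
    ("ProjectD", 4, 1),
    ("ProjectE", 7, 1),
]
curve = kaplan_meier((t, flag) for _, t, flag in projects)
```

With these sample rows the estimate steps down at months 1, 2, 4, and 7, to roughly 0.8, 0.6, 0.3, and 0.0. The censored project causes no step of its own, but removing it from the risk set makes the later steps larger.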


Figure 4: Instructional Design Survival Analysis Data in SPSS

According to the Case Processing Summary, this analysis identified an N = 26, with 23 of these achieving “event” (billed services) (88.5% of the set) and 3 “censored” (non-billed services) (11.5% of the set). The survival table captured 26 time events (Figure 5).

Figure 5: Redacted Survival Analysis Table from the Instructional Design Dataset

By the first month in, 19% of the instructional design projects have reached event. By the end of the second month, 42% of projects have achieved event or been censored, and only 58% survive. By the fifth month, only 35% of projects are still in limbo, without a determination of whether they will become paid projects or will censor out of the study. No projects remain undetermined (neither billed nor censored) by the 11th month, so whether a project will continue as a billed one or not is decided within a calendar year of the start of the project.


The average survival time is 4.613 months, with a standard error of .625 and a 95% confidence interval with lower and upper bounds at 3.388 and 5.838. The median survival time is 4 months, with a standard error of 1.089 and a 95% confidence interval with the lower bound at 1.865 and the upper at 6.135. (The confidence bounds assume an underlying normal curve, which may be an inaccurate assumption. In all likelihood, the underlying distribution is non-normal, given the topic.)

Another way to conceptualize this data is by percentile. At the 25th percentile, a project may take about 7 months (with a standard error of .593). At the 50th percentile, representing half of all projects, these take 4 months (with a standard error of 1.089). At the 75th percentile, these projects take approximately 2 months (with a standard error of .403). The ordering, with the longest time at the 25% mark, suggests that these percentiles are indexed by the share of projects still surviving: survival falls to 75% or below by about month 2, to 50% by month 4, and to 25% by about month 7. (All of the quartile values of the KM estimate are within the ranges and so are defined values.) In other words, the majority of projects tend to be fairly short; said another way, rarer projects may be longer.

Figure 6 shows the survival function curve for these 26 instructional design projects. Note on the y-axis “CumSurvival” (cumulative survival) that the full set is present at the top left with a score of 1, representing full survival of the full set of ID projects. As time passes (as indicated on the x-axis), the curve changes in a stair-step way. When the line is parallel with the x-axis, that plateau shows that there was no project event occurrence. As the line moves perpendicular to the x-axis, that shows a drop-off of projects over time, which indicates the experience of an “event” (the paying of the project). The “plus,” if you will, shows a project being censored, or leaving the set without payment. Note that the censoring does not result in a change to the curve (whether up or down). This line cannot be called a decreasing line graph because of the moments of plateau where there is no decrease. At the bottom right of the plot, the population of instructional design projects has been dissipated through events or censorship to 0 at around the 11-month mark.

What the line graph suggests, visually, is that projects that will censor are identified early and end early. In other words, if it is clear that a project will not be funded, it tends not to continue for long periods of time. Also, most projects achieve “event” at fairly regular intervals at one month, two months, and so on. Longer projects past nine months tend to go to nearly a year. So survival of a project is depicted here as a function of time. The S(t) [the “s” of “t”] or survival at time t may be understood from the line graph. The more time that passes, the greater the likelihood that a project will result in pay, as long as the project hasn’t ended by being censored out.
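The confidence bounds reported for the mean and median survival times are consistent with a normal approximation, the estimate plus or minus 1.96 standard errors. The short Python check below is a sketch under that assumption; the article does not state how SPSS computed the bounds, and the helper name `normal_ci` is hypothetical.

```python
def normal_ci(estimate: float, std_error: float, z: float = 1.96):
    """Normal-approximation interval: estimate +/- z * standard error."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# Figures quoted in the text above.
mean_low, mean_high = normal_ci(4.613, 0.625)
median_low, median_high = normal_ci(4.0, 1.089)

print(f"mean:   {mean_low:.3f} to {mean_high:.3f}")
print(f"median: {median_low:.3f} to {median_high:.3f}")
```

Both intervals agree with the reported bounds to within rounding. That agreement is also why the normality caveat matters: the interval is symmetric around the estimate even though project lengths are unlikely to be symmetrically distributed.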

Figure 6: “Survival Function” Applied to Instructional Design Projects


This same data may be visualized as a hazard function. In Figure 7, the line of the graph starts at the bottom left. At (0,0), the origin of the graph, the hazard function is the same for all the instructional design projects. Over time (expressed on the horizontal x-axis), the hazard rate starts to manifest. One sign of that is in month one, when the line moves vertically, indicating an event (based on the hazard). And so, it all stair-steps upwards cumulatively. The hazards seem to occur earlier in the lifespan of an instructional design project and actually seem to lessen once Month 9 has been reached.
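A cumulative, upward stair-stepping hazard line of this kind can be sketched with the Nelson-Aalen estimator, which adds (events / projects still at risk) at each event time. This is one common estimator chosen for illustration; the article's SPSS plot may be computed differently (for example, as -ln S(t)). The function name `nelson_aalen` and the sample data are hypothetical.

```python
from collections import Counter

def nelson_aalen(durations, observed):
    """Return [(time, cumulative_hazard)] steps of a Nelson-Aalen estimate."""
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    cum_hazard = 0.0
    steps = [(0, 0.0)]  # the line starts at the origin
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:  # each event adds its share of hazard to the running total
            cum_hazard += d / at_risk
            steps.append((t, cum_hazard))
        at_risk -= exits[t]
    return steps

# Hypothetical data for illustration only (not the article's data set).
months = [1, 2, 2, 3, 4, 4, 6, 9]
paid = [True, True, False, True, True, True, False, True]

for t, h in nelson_aalen(months, paid):
    print(f"month {t}: H(t) = {h:.3f}")
```

Unlike survival, cumulative hazard is not a probability, so its running total can climb above 1 as the risk set empties.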

Figure 7: “Hazard Function” Applied to Instructional Design Projects

In the instructional design context, it helps to know whether a project formalizes into a paying one. It helps to hypothesize about the desirable “hazards” to ensure that a project makes it to successful fruition and mutual benefits for the respective university units involved.

This same data may be plotted in a One Minus Survival Function (Figure 8). This shows a line that grows over time. At any point on the line, an individual may capture the cumulative incidence rate at that particular point in time. The “one minus” part works because survival is plotted on a range from 0 to 1 (for normalized probability data). One minus survival [1 - S] equals the rising cumulative incidence of non-survival (event) as seen in time (or time segments). It tracks the cumulative hazard closely while survival is high, although the two are not identical: the cumulative hazard is usually defined as -ln S and can exceed 1.
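The arithmetic behind a One Minus Survival plot is simple enough to sketch directly. The snippet below converts a set of survival estimates into cumulative incidence (1 - S) and, for comparison, into cumulative hazard under its usual definition, H(t) = -ln S(t). The survival values and function names are assumed for illustration and are not taken from the article's output.

```python
import math

# Hypothetical survival estimates S(t) by month (illustrative values only).
survival = {0: 1.0, 1: 0.875, 2: 0.75, 3: 0.6, 4: 0.3}

def one_minus_survival(surv):
    """Cumulative incidence of the event: 1 - S(t), always between 0 and 1."""
    return {t: 1 - s for t, s in surv.items()}

def cumulative_hazard(surv):
    """Cumulative hazard -ln S(t), written as ln(1 / S); defined while S(t) > 0."""
    return {t: math.log(1 / s) for t, s in surv.items() if s > 0}

incidence = one_minus_survival(survival)
hazard = cumulative_hazard(survival)

for t in survival:
    print(f"month {t}: 1 - S = {incidence[t]:.3f}, -ln S = {hazard[t]:.3f}")
```

The two columns stay close while survival is near 1 and separate as survival falls, which is why the incidence plot and the hazard plot look similar early on but are not interchangeable.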

Figure 8: Accumulating Incidence of ID Projects Achieving Event in the One Minus Survival Function Plot

Debriefing

First, it is important to mention that a naive interpretive approach will be unhelpful. Some naive assertions from these "survival analysis" findings may go as follows:

If I stretch out the time of a project, I will achieve completion and payday.
If a project censors early on, it will not be recoverable and may never achieve "event."

And based on the early interpretations, some naive assertions may continue:

A PI with access to federal funds will always be a paying client (most are super frugal, which is good for the institution of higher education overall, but which guarantees a small payday, if any).

What are some more sophisticated insights from these graphs? These may suggest which types of projects to target. A cursory analysis of the underlying data shows that the projects that are generally funded tend to be from the health sciences and agriculture because those are priorities for federal grant funding agencies. Also, federal compliance trainings sometimes result in some project payment (but would more likely result in gratis work because these benefit the campus broadly and are politically justifiable in the work unit). It would be possible to identify PIs who tend to spearhead successful projects. It would be possible to identify which projects may end up in a lot of healthy work and a fair payout based on an analysis of variables which often lead to project success.

It may be possible to separate out the various projects (such as by topical groups) in order to run log-rank survival analyses against these groups to see if there are fundamental differences between these groups. In these visualizations, there are two or more survival lines running through the graph, which enable comparison.

Clearly, it is helpful to understand the underlying context in order to better interpret the data. It is important to know the data itself fairly well to know how to accurately interpret the findings.

In this example, limited but real-world data were used to run a survival analysis in SPSS to surface some basic insights about the “state” of instructional design at one institution of higher education. These are incomplete data, and some data were redacted here. However, this example should give a sense of how a basic survival analysis may work.
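The log-rank comparison of project groups mentioned above can be sketched as follows. At each event time, the test compares the events observed in one group with the number expected if both groups shared the same survival curve. The function name `logrank_test`, the two groups, and their values are all hypothetical; a statistics package such as SPSS would normally be used instead of this hand-rolled version.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e, _ in pooled if e}):
        n = sum(1 for u, _, _ in pooled if u >= t)               # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in pooled if u == t and e)         # events at time t
        d_a = sum(1 for u, e, g in pooled if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of the group A event count
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 degree of freedom
    return chi2, p_value

# Hypothetical groups for illustration only (months, and whether payment occurred).
group_a_months = [1, 2, 2, 3, 4]
group_a_paid = [True, True, True, False, True]
group_b_months = [3, 5, 6, 9, 11]
group_b_paid = [True, True, False, True, True]

chi2, p = logrank_test(group_a_months, group_a_paid, group_b_months, group_b_paid)
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

With groups this small, the chi-square approximation is rough, so any p-value from a sketch like this should be read as indicative only.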

Conclusion

"Survival analyses" may be used not only to describe time-to-event data in a descriptive way (to enable hypothesizing and analysis), but also as models to understand other similarly situated data and even to enable predictions of out-of-sample data. These predictions may inform expectations of future events, with varying levels of confidence. Certainly, as new data become available, it is possible to integrate more data into the survival analyses for the prior applications.

A survival analysis is a straightforward statistical analysis approach that can have wide applications in higher education to provide insights about time-based phenomena (pretty much everything). This approach can provide data points and insights for faculty, administrators, staff, and students.

A Data Visualization Addendum

Sometimes, I catch myself saying that I am mostly funded by those in the hard sciences vs. the soft sciences. Since I'd only recently discovered how to make a streamgraph in Excel 2016, I thought I would map out the number of soft vs. hard science instructional design projects I'd worked in each of the years from 2010 to 2016. It turns out that while I do work more hard science projects, there were some soft science ones as well. The differential between the sizes of the projects is highly weighted towards the hard science projects, too, but that's another story.

Figure 9: Soft vs. Hard Science Instructional Design Projects (in a Streamgraph)

About the Author

Dr. Shalin Hai-Jew works as an instructional designer at Kansas State University. She may be reached at [email protected].

Version 33 of this page, updated 16 February 2017. C2C Digital Magazine (Fall 2016 / Winter 2017) by Colleague 2 Colleague.

Tags: instructional design, event history analysis, One Minus Survival Function, data censoring, survival analysis, time-to-event, SPSS, hazard function