Outflow: Exploring Flow, Factors and Outcomes of Temporal Event Sequences
DESCRIPTION
My presentation at IEEE VisWeek 2012 in Seattle, WA.
Abstract: Event sequence data is common in many domains, ranging from electronic medical records (EMRs) to sports events. Moreover, such sequences often result in measurable outcomes (e.g., life or death, win or loss). Collections of event sequences can be aggregated together to form event progression pathways. These pathways can then be connected with outcomes to model how alternative chains of events may lead to different results. This paper describes the Outflow visualization technique, designed to (1) aggregate multiple event sequences, (2) display the aggregate pathways through different event states with timing and cardinality, (3) summarize the pathways’ corresponding outcomes, and (4) allow users to explore external factors that correlate with specific pathway state transitions. Results from a user study with twelve participants show that users were able to learn how to use Outflow easily with limited training and perform a range of tasks both accurately and rapidly.
TRANSCRIPT
m
Outflow
Krist Wongsuphasawat HCIL, University of Maryland
David Gotz IBM Research
Exploring Flow, Factors and Outcomes of Temporal Event Sequences
InfoVis 2012 Seattle, WA
m
Events
m
Event | 12:15 p.m. Lunch
m
Event Sequences Event Event Event
m
Daily Activity
7:30 a.m. Wake Up
7:45 a.m. Exercise
8:15 a.m. Go to work
m
Soccer Game
90th minute Team A scores
25th minute Team B scores
10th minute Team A scores
m
Game #1
Time
10th minute Goal
90th minute Goal
25th minute Concede
Soccer Game
m
Goal
Game #1
Concede Goal
Goal
Game #2
Goal Concede
Time
Goal
Game #3
Concede Concede
Concede
Game #n
Goal Goal Goal
Many games
m
with outcome
Game #1
Game #2
Time
Game #3
Game #n
Lose (0)
Win (1)
Win (1)
Win (1)
Goal Concede Goal
Goal Goal Concede
Goal Concede Concede
Concede Goal Goal Goal
m
7 event types
7 events per entity
7^7 = 823,543 combinations
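The combination count on this slide can be checked with a one-line calculation. This is a minimal sketch, assuming that any of the event types may appear at any position in a sequence; the function name `count_sequences` is made up for illustration.

```python
def count_sequences(num_event_types: int, events_per_entity: int) -> int:
    """Number of distinct ordered sequences of a fixed length,
    assuming every event type is allowed at every position."""
    return num_event_types ** events_per_entity


print(count_sequences(7, 7))  # 823543
```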
m
Enjoy!
m
consumable
m
Overview / Summary
Event Sequences with Outcome
m
7 Steps
m
Step 1 | Aggregation
m
Entity #1
Entity #2
Entity #4
Entity #3
Entity #5
Entity #6
Entity #n
Entity #7
…
Outflow Graph
Event Sequences
m
Assumption • Events are persistent.
Entity #1 (events as recorded): e1, e2, e3
Entity #1 (with persistent events): e1 | e1 e2 | e1 e2 e3
States: [e1], [e1, e2], [e1, e2, e3]
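The persistence assumption can be sketched in code: once an event has happened it stays in effect, so each entity walks through a growing list of states. This is only an illustration, assuming a state is the sorted set of distinct events seen so far; `to_states` is a hypothetical helper, not a function from the Outflow implementation.

```python
def to_states(events):
    """Turn one entity's event sequence into the states it passes through.

    Assumption: events are persistent, so a state is the set of all
    distinct events seen so far (stored as a sorted tuple). The path
    starts from the empty state.
    """
    seen = set()
    states = [()]
    for event in events:
        if event not in seen:  # a repeated event does not change the state
            seen.add(event)
            states.append(tuple(sorted(seen)))
    return states


print(to_states(["e1", "e2", "e3"]))
```

Under this assumption, the sequences e1, e2, e3 and e1, e3, e2 take different routes but end in the same state [e1, e2, e3].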
m
Select alignment point • Pick a state
What are the paths that led to this state?
What are the paths after this state?
Example (soccer): Goal, Concede, Goal
or just an empty state [ ]
m
Outflow Graph
[e1, e2, e3]
Alignment Point
m
Outflow Graph
[e1, e2, e3]
[e1, e2]
[e1, e2, e3, e5]
[e1]
[ ]
Alignment Point
1 entity
m
Outflow Graph
[e1, e3]
Alignment Point
2 entities
[e1, e2, e3]
[e1, e2]
[e1, e2, e3, e5]
[e1]
[ ]
m
Outflow Graph
[e1, e2, e3, e4]
Alignment Point
[e3]
3 entities
[e1, e3]
[e1, e2, e3]
[e1, e2]
[e1, e2, e3, e5]
[e1]
[ ]
m
Outflow Graph
[e2, e3]
[e2]
Alignment Point
n entities
[e1, e2, e3, e4]
[e3]
[e1, e3]
[e1, e2, e3]
[e1, e2]
[e1, e2, e3, e5]
[e1]
[ ]
m
Outflow Graph
[e2, e3]
[e2]
Alignment Point
n entities
Average outcome = 0.4
Average time = 10 days
Number of entities = 10
[e1, e2, e3, e4]
[e3]
[e1, e3]
[e1, e2, e3]
[e1, e2]
[e1, e2, e3, e5]
[e1]
[ ]
layer
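The aggregation step can be sketched as follows: keep the entities whose path passes through the alignment point, then accumulate the number of entities, average outcome and average transition time for each state-to-state edge. This is a rough sketch under assumptions, not the authors' implementation; the names `states_with_times` and `build_outflow`, the input layout, and the choice to give the first transition out of the empty state a duration of 0 are all made up for illustration.

```python
from collections import defaultdict


def states_with_times(events):
    """events: list of (time, event). Returns [(state, time_entered), ...],
    starting from the empty state, which has no entry time."""
    seen, path = set(), [((), None)]
    for time, event in events:
        if event not in seen:
            seen.add(event)
            path.append((tuple(sorted(seen)), time))
    return path


def build_outflow(entities, alignment):
    """entities: list of (events, outcome). Returns {(src, dst): stats}
    aggregated over the entities whose path contains `alignment`."""
    totals = defaultdict(lambda: [0, 0.0, 0.0])  # count, outcome sum, time sum
    for events, outcome in entities:
        path = states_with_times(sorted(events))
        if alignment not in [state for state, _ in path]:
            continue  # this entity never reaches the alignment point
        for (src, t0), (dst, t1) in zip(path, path[1:]):
            stats = totals[(src, dst)]
            stats[0] += 1
            stats[1] += outcome
            # Assumption: leaving the empty state has no measurable duration.
            stats[2] += (t1 - t0) if t0 is not None else 0.0
    return {
        edge: {"entities": n, "avg_outcome": o / n, "avg_time": t / n}
        for edge, (n, o, t) in totals.items()
    }


entities = [
    ([(1, "e1"), (3, "e2"), (6, "e3")], 1),
    ([(2, "e1"), (4, "e3"), (10, "e2")], 0),
    ([(1, "e2")], 1),  # never reaches the alignment point, so it is skipped
]
graph = build_outflow(entities, ("e1", "e2", "e3"))
print(graph[((), ("e1",))])
```

Passing the empty tuple as the alignment point keeps every entity, which corresponds to aligning on the empty state [ ].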
m
Soccer Results
2-1
2-0
1-1
0-2
2-2
3-1
1-0
0-1
0-0
Alignment Point
m
Step 2 | Visual Encoding
m
Alignment Future Past
e1!e2!
e1!
e2!
e1!e2!e3!
e1!e2!e4!
Color is outcome measure.
Node’s height is number of entities.
Time edge’s width is duration of transition.
Node’s horizontal position shows sequence of states.
time edge
link edge
End of path
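The encoding rules above can be read as a mapping from aggregate statistics to visual attributes. The sketch below is only an illustration: the red-to-green color ramp, the pixel scale factors and the function names are assumptions, not values taken from the slides.

```python
def encode_node(n_entities, avg_outcome, px_per_entity=2.0):
    """Node height grows with the number of entities; color reflects the
    outcome measure. Assumes the outcome lies in [0, 1] and uses a
    hypothetical red (0) to green (1) ramp."""
    t = min(1.0, max(0.0, avg_outcome))
    red, green = round(255 * (1 - t)), round(255 * t)
    return {"height": n_entities * px_per_entity,
            "color": f"#{red:02x}{green:02x}00"}


def encode_time_edge(avg_duration, px_per_time_unit=5.0):
    """Time edge width grows with the average duration of the transition."""
    return {"width": avg_duration * px_per_time_unit}


print(encode_node(10, 0.4))
print(encode_time_edge(10))
```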
m
Step 3 | Graph Drawing
m
3.1 Sugiyama’s heuristics • Directed Acyclic Graph (DAG) layout
– Sugiyama, K., Tagawa, S. & Toda, M., 1981. Methods for Visual Understanding of Hierarchical System Structures. IEEE Transactions on Systems, Man, and Cybernetics, 11(2), pp. 109-125.
• Reduce edge crossings
m
41 crossings
m
12 crossings
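Crossing reduction between two adjacent layers can be sketched with the barycenter heuristic, a common ingredient of Sugiyama-style layouts. This is a simplified two-layer illustration with hypothetical function names; it is not claimed to be the exact procedure used in Outflow.

```python
def count_crossings(left_order, right_order, edges):
    """Count pairs of edges that cross between two adjacent layers.
    edges: list of (left_node, right_node)."""
    lpos = {n: i for i, n in enumerate(left_order)}
    rpos = {n: i for i, n in enumerate(right_order)}
    crossings = 0
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            (a, b), (c, d) = edges[i], edges[j]
            # Two edges cross when their endpoints are ordered oppositely.
            if (lpos[a] - lpos[c]) * (rpos[b] - rpos[d]) < 0:
                crossings += 1
    return crossings


def barycenter_order(left_order, right_order, edges):
    """Reorder the right layer by the mean position of each node's
    neighbors in the (fixed) left layer."""
    lpos = {n: i for i, n in enumerate(left_order)}
    neighbors = {n: [] for n in right_order}
    for a, b in edges:
        neighbors[b].append(lpos[a])

    def key(node):
        ns = neighbors[node]
        # Nodes without neighbors keep their current position.
        return sum(ns) / len(ns) if ns else right_order.index(node)

    return sorted(right_order, key=key)


left, right = ["a", "b", "c"], ["x", "y", "z"]
edges = [("a", "z"), ("b", "y"), ("c", "x")]
print(count_crossings(left, right, edges))   # 3
new_right = barycenter_order(left, right, edges)
print(count_crossings(left, new_right, edges))  # 0
```

A full layout would sweep this step back and forth across all layers until the crossing count stops improving.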
m
3.2 Force-directed layout • Spring simulation
Each node is a particle.
Total force = Force from edges - Repulsion between nodes
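The spring simulation can be sketched in one dimension: each node is a particle with a single free coordinate, edges pull connected nodes together, and nodes that share a layer push each other apart. The constants, the inverse-square repulsion, the step clamp and the name `relax` are all assumptions chosen to keep the sketch stable; they are not taken from the slides.

```python
def relax(pos, edges, same_layer_pairs,
          k_spring=0.1, k_repel=1.0, steps=200, dt=0.1):
    """pos: {node: coordinate}. edges: list of (a, b) pulled together.
    same_layer_pairs: list of (a, b) pushed apart. Returns new positions."""
    pos = dict(pos)
    for _ in range(steps):
        force = {n: 0.0 for n in pos}
        for a, b in edges:  # spring attraction along edges
            d = pos[b] - pos[a]
            force[a] += k_spring * d
            force[b] -= k_spring * d
        for a, b in same_layer_pairs:  # repulsion between nodes
            d = pos[b] - pos[a]
            if d == 0:
                d = 1e-3  # avoid division by zero for coincident nodes
            f = k_repel / (d * d)
            sign = 1 if d > 0 else -1
            force[a] -= sign * f
            force[b] += sign * f
        for n in pos:
            # Clamp each step so a very close pair cannot fly apart.
            pos[n] += max(-1.0, min(1.0, dt * force[n]))
    return pos


pulled = relax({"a": 0.0, "b": 10.0}, [("a", "b")], [])
pushed = relax({"c": 0.0, "d": 0.5}, [], [("c", "d")])
print(pulled, pushed)
```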
m
3.3 Edge Routing • Avoid unnecessary crossings
Reroute
m
3.3 Edge Routing • After routing
m
Step 4 | Interactions
m
Interactions • Panning • Zooming • Brushing • Pinning • Tooltip • Event type selection
m
Demo
m
Step 5 | Simplification
m
Node Clustering • Cluster nodes in each layer • Similarity measure: Outcome, etc. • Threshold (0-1)
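Clustering within a layer can be sketched as a greedy merge of nodes whose outcomes are close. This is a simplified stand-in, not the clustering algorithm used in Outflow: it assumes similarity is judged by the difference in average outcome alone and treats the threshold as the largest difference that still merges two neighbors. The name `cluster_layer` and the input layout are hypothetical.

```python
def cluster_layer(nodes, threshold):
    """nodes: list of (name, avg_outcome, n_entities) for one layer.
    Nodes are visited in order of outcome; a node joins the previous
    cluster when its outcome is within `threshold` of that cluster's
    entity-weighted mean outcome."""
    clusters = []
    for name, outcome, n in sorted(nodes, key=lambda node: node[1]):
        if clusters and abs(outcome - clusters[-1]["outcome"]) <= threshold:
            c = clusters[-1]
            total = c["n"] + n
            c["outcome"] = (c["outcome"] * c["n"] + outcome * n) / total
            c["n"] = total
            c["members"].append(name)
        else:
            clusters.append({"members": [name], "outcome": outcome, "n": n})
    return clusters


layer = [("A", 0.1, 10), ("B", 0.15, 10), ("C", 0.9, 5)]
print(cluster_layer(layer, 0.1))
```

With a threshold of 0 every node stays separate, and with a threshold of 1 the whole layer collapses into one cluster.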
m
Step 6 | Factors
m
Entity #1
Factors Time
[e1] [e1, e2] [e1, e2, e3]
Factor 1 Factor 2 Factor 3 Factor 4
m
Patient #1
Factors Time
Which factors are correlated to each state?
Yellow card, Injury, Red card, Substitution
[e1] [e1, e2] [e1, e2, e3]
m
Which keywords are correlated to each document?
Information Retrieval
State 1 … …
State 2 … … …
State 3 … … …
Doc#1 Doc#2 Doc#3
Factor xxx
Which factors are correlated to each state?
m
Present factors
[e1,e2,e3]
[e1,e2]
[e1,e3]
[e2,e3]
[e1,e2,e3,e4]
[e1,e2,e3,e5]
[e1]
[e2]
[e3]
[ ]
Alignment Point
Factor 1
m
Absent factors
[ ]
Alignment Point
Factor 2
Factor 2 [e1,e2,e3]
[e1,e2]
[e1,e3]
[e2,e3]
[e1,e2,e3,e4]
[e1,e2,e3,e5]
[e1]
[e2]
[e3]
m
tf-idf
• Term frequency
tf = \frac{\text{number of times term } t \text{ appears in the document}}{\text{number of terms in the document}}
• Inverse document frequency
idf = \log\left(\frac{\text{number of documents}}{\text{number of documents that contain term } t + 1}\right)
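The two quantities can be computed directly from tokenized documents. This is a minimal sketch that follows the definitions on the slide, including the +1 in the idf denominator; the function names and the toy documents are made up for illustration.

```python
import math


def tf(term, doc):
    """Term frequency: occurrences of `term` divided by document length."""
    return doc.count(term) / len(doc)


def idf(term, docs):
    """Inverse document frequency with the +1 smoothing from the slide."""
    containing = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / (containing + 1))


docs = [["a", "b", "a"], ["b", "c"], ["c", "d"]]
print(tf("a", docs[0]) * idf("a", docs))
```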
m
Score based on tf-idf
• Ratio (presence)
R_p = \frac{\text{number of entities with factor } f \text{ before the state}}{\text{number of entities in the state}}
• Inverse state ratio (presence)
R^{-1}_{sp} = \log\left(\frac{\text{number of states}}{\text{number of states preceded by factor } f + 1}\right)
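The presence score can be sketched by mirroring tf-idf, with states in the role of documents and factors in the role of terms. Two things here are assumptions: the two quantities are multiplied to form the score (by analogy with tf × idf), and the input layout, which maps each state to one set of preceding factors per entity, is a hypothetical structure chosen for illustration.

```python
import math


def presence_ratio(factor, state, state_factors):
    """Share of entities in `state` that had `factor` before reaching it."""
    entities = state_factors[state]
    return sum(1 for factors in entities if factor in factors) / len(entities)


def inverse_state_ratio(factor, state_factors):
    """log(number of states / (states preceded by `factor` + 1))."""
    preceded = sum(
        1 for entities in state_factors.values()
        if any(factor in factors for factors in entities)
    )
    return math.log(len(state_factors) / (preceded + 1))


def presence_score(factor, state, state_factors):
    # Assumption: combine the two ratios by multiplication, as in tf-idf.
    return (presence_ratio(factor, state, state_factors)
            * inverse_state_ratio(factor, state_factors))


# One set of preceding factors per entity in each state (toy data).
state_factors = {
    "s1": [{"yellow"}, {"yellow", "injury"}],
    "s2": [{"injury"}, set()],
    "s3": [set(), set()],
}
print(presence_score("yellow", "s1", state_factors))
```

A factor that precedes many states gets a low inverse state ratio, so it scores lower than a factor that is concentrated before one state.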
m
Step 7 | User Study
m
User Study • Goal:
Evaluate Outflow’s ability to support event sequence analysis tasks
• 12 participants • 60 minutes each • 9 tasks + 7 training tasks • Questionnaire
m
Results • Accurate:
3 mistakes out of 108 tasks (12 participants × 9 tasks)
• Fast: Average 5-60 seconds
• Findings: – From video – Different outcomes for each incoming path – Etc.
m
Future Work • Integration with prediction algorithm • Additional layout techniques • Advanced factor analysis • Deeper evaluations with domain experts
m
Conclusions • Event sequences with outcome • Outflow
– Interactive visual summary – Explore flow & outcome – Factors – Multi-step layout process
• Not specific to sports
Contact: @kristwongz
m
Patient #1
Time
Aug 1998 Ankle Edema
Jan 1999 Weight Loss
Oct 1998 Cardiomegaly
Heart failure (CHF) patient Die (0)
m
Event Sequences
and more…
Medical Transportation
Education
Web logs
Sports
Logistics
m
Acknowledgement • Charalambos (Harry) Stavropoulos • Robert Sorrentino • Jimeng Sun • Comments from HCIL colleagues
m
Conclusions • Event sequences with outcome • Outflow
– Interactive visual summary – Explore flow & outcome – Factors – Multi-step layout process
• Not specific to medical or sports
Contact: @kristwongz
m
THANK YOU ขอบคุณครับ