extracting and analyzing sequential interaction-patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) opl...

47
Extracting and Analyzing Sequential Interaction-Patterns Simon Walk July 22, 2016 Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 1 / 37

Upload: others

Post on 13-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Extracting and AnalyzingSequential Interaction-Patterns

Simon Walk

July 22, 2016

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 1 / 37

Page 2: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Introduction

Motivation

Over the last decade, ontologies have become the mainstay in thebiomedical domain.

New and complementing areas of application

Increased complexity & size

For example, ICD-11 consists of roughly 50, 000 classes.

Highly specialized knowledge

Many different areas of expertise

Ontologies have become very hard to develop and maintain.

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 2 / 37

Page 3: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Introduction

Collaborative Ontology Engineering

Similar to Wikipedia, contributors engage remotely in developingontologies.

Many new and unexplored problems

Layer of social interactions adds complexity

Administrators are in need of tools to better manage the complexcollaborative engineering process.

Objective: Broaden our understanding of the dynamic social processes byanalyzing edit patterns.

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 3 / 37

Page 4: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Research Questions

Outline

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 3 / 37

Page 5: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Materials & Methods

Datasets

Characteristics of the datasets used for the different collaborativeontology-engineering analyses.

ICD-11 ICTM NCIt BRO OPL

# classes 48, 771 1, 506 102, 865 528 393# changes 439, 229 67, 522 294, 471 2, 507 1, 993

# users 109 27 17 5 3

first change 18.11.2009 02.02.2011 01.06.2010 12.02.2010 09.06.2011last change 29.08.2013 17.07.2013 19.08.2013 06.03.2010 23.09.2011observation period (ca.) 4 years 2.5 years 3 years 1 month 3 months

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 4 / 37

Page 6: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Materials & Methods

(Sequential) Interaction-Sequences

User-based sequences

Class-based sequences

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 5 / 37

Page 7: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Materials & Methods

Identifying Interaction Patterns

Using Markov chains

State space S , listing all possible states s1, s2, ...sn ∈ S with |S | = n.

Transition matrix P with pij listing the probability to go from state si to sj .

First-order Markov chain (Markovian property):

P(Xt+1 = sj |X1 = si1 , ...,Xt−1 = sit−1 ,Xt = sit︸ ︷︷ ︸all previous transitions

) =

P(Xt+1 = sj |Xt = sit︸ ︷︷ ︸current transition

) = pij

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 6 / 37

Page 8: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Materials & Methods

Identifying Interaction-Patterns

Markov chain of order k means that k previous states contain (useful)predictive information about the next state.

P(Xt+1 = sj |X1 = si1 , ...,Xt−1 = sit−1 ,Xt = sit︸ ︷︷ ︸all previous transitions

) =

P(Xt+1 = sj |Xt−k+1 = sit−k+1 , ...,Xt = sit︸ ︷︷ ︸k transitions

)

Overfitting and model complexity are problematic!

Lower order models are nested in higher order models!Solution: Model selection and prediction experiments

Parameter increase: θ = |S |k |S |Solution: Aggregated/abstract states

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 7 / 37

Page 9: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Process to Identify Interaction-Patterns

Preprocessing

Mapping, Session Separation, State Selection, Path Extraction

Model Fitting

Model Selection

Akaike IC, Bayesian IC, Prediction Experiments

Interpretation

[Walk et al., 2015b] Simon Walk, Philipp Singer, Markus Strohmaier, Denis Helic, Natalya Noy, Mark Musen: How to applyMarkov chains for modeling sequential edit patterns in collaborative ontology-engineering projects. Int. J. Hum.-Comput.Stud. 84: 51-66 (2015)

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 8 / 37

Page 10: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Outline

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 8 / 37

Page 11: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Sequences to Analyze for Patterns

User SequencesWho will change a class next?Which type of change will a user perform next?

Content-based SequencesWhich area of the user interface will a user use next?Which property will a user change next?

Structural SequencesWhich class is a user going to edit next?Where is the next class located in the ontology?Do users move along the ontological hierarchy when contributing tothe projects?

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 9 / 37

Page 12: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Do users move along the ontological hierarchy whencontributing to the projects?

States: Self, Parent, Child, Sibling, Cousin, Ascendent, Descendent, Other

To Relationship

Fro

m R

ela

tion

ship

02

00

00

40

00

06

00

00

80

00

0

Fre

qu

en

cy

Ancestor

Child

Cousin

Descendant

Other

Parent

Self

Sibling

BREAK

An

cest

or

Ch

ild

Co

usi

n

De

sce

nd

an

t

Oth

er

Pa

ren

t

Se

lf

Sib

ling

BR

EA

K

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

(a) ICD-11To Relationship

Fro

m R

ela

tion

ship

05

00

01

00

00

15

00

0

Fre

qu

en

cy

Ancestor

Child

Cousin

Descendant

Other

Parent

Self

Sibling

BREAK

An

cest

or

Ch

ild

Co

usi

n

De

sce

nd

an

t

Oth

er

Pare

nt

Se

lf

Sib

ling

BR

EA

K

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

(b) ICTMTo Relationship

Fro

m R

ela

tion

ship

02

00

00

40

00

06

00

00

Fre

qu

en

cy

Ancestor

Child

Cousin

Descendant

Other

Parent

Self

Sibling

BREAK

An

cest

or

Ch

ild

Co

usi

n

De

sce

nd

an

t

Oth

er

Pa

ren

t

Se

lf

Sib

ling

BR

EA

K

−0.1

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

(c) NCItTo State

Fro

m S

tate

05

01

00

15

02

00

25

0

Fre

quency

Ancestor

Child

Cousin

Descendant

Other

Parent

Self

Sibling

BREAK

An

ce

sto

r

Ch

ild

Co

usin

De

sce

nd

an

t

Oth

er

Pa

ren

t

Se

lf

Sib

lin

g

BR

EA

K

0.0

0.1

0.2

0.3

0.4

0.5

0.6

(d) BROTo Relationship

Fro

m R

ela

tion

ship

01

00

20

03

00

Fre

qu

en

cy

Child

Cousin

Other

Parent

Self

Sibling

BREAK

Ch

ild

Co

usi

n

Oth

er

Pa

ren

t

Se

lf

Sib

ling

BR

EA

K

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

(e) OPL

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37

Page 13: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Do users move along the ontological hierarchy whencontributing to the projects?

To Relationship

Fro

m R

ela

tion

ship

02

00

00

40

00

06

00

00

80

00

0

Fre

qu

en

cy

Ancestor

Child

Cousin

Descendant

Other

Parent

Self

Sibling

BREAK

An

cest

or

Child

Cou

sin

De

sce

nda

nt

Oth

er

Pare

nt

Self

Sib

ling

BR

EA

K

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

(a) ICD-11

To Relationship

Fro

m R

ela

tion

ship

020000

40000

60000

Fre

qu

en

cy

Ancestor

Child

Cousin

Descendant

Other

Parent

Self

Sibling

BREAK

An

cest

or

Ch

ild

Co

usi

n

De

sce

nd

an

t

Oth

er

Pa

ren

t

Se

lf

Sib

ling

BR

EA

K

−0.1

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

(b) NCIt

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 11 / 37

Page 14: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Understanding Editing Behaviors in Ontology-DevelopmentProjects!

States: Types of changes (aggregated) available in iCAT (Edit Property Value, CreateClass, etc.).

(a) ICD-11 (b) ICTM

(a) Absolute Differences (b) Relative Differences

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 12 / 37

Page 15: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Modeling Sequential Interaction-Sequences

Summary of findings

The edit behavior of users is influenced by the hierarchy of theontology.

Users edit the ontology top-down and breadth-first.

Users work in micro-workflows.

Roles of users can be identified.

Users edit closely related classes.

Users perform property-based workflows.

[Walk et al., 2014b] Simon Walk, Philipp Singer, Markus Strohmaier, Tania Tudorache, Mark Musen, Natalya Noy: DiscoveringBeaten Paths in Collaborative Ontology-Engineering Projects using Markov Chains. Journal of Biomedical Informatics 51:254-271 (2014)

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 13 / 37

Page 16: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Predicting Aspects of Future Actions

k cross-fold prediction experiment

k stratified splitsk − 1 splits for the training set1 split for the test set

Determine the rank of each transition in test setModified competition rankingNatural occam’s razor

Calculate average rank over all transitions and splits

Lowest average rank determines best performing Markov chain orderBest models: Average rank between 1.7 and 3Worst models: Average rank between 2 and 6

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 14 / 37

Page 17: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Predicting Aspects of Future Actions

Best performing Markov chain orders

ICD-11 ICTM NCIt BRO OPL

Predict Users for Classes 1 1 1 1 2

Predict Change Types for Users 3 2 - 1 1

Predict Change Types for Classes 4 3 - 2 2

Predict Properties for Users 1 1 - 1 0

Predict Properties for Classes 1 1 - 3 2

[Walk et al., 2014a] Simon Walk, Philipp Singer, Markus Strohmaier: Sequential Action Patterns in CollaborativeOntology-Engineering Projects: A Case-Study in the Biomedical Domain. CIKM 2014: 1349-1358

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 15 / 37

Page 18: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Outline

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 15 / 37

Page 19: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Formulating & Comparing hypotheses

Hypotheses are potential explanations as opposed to actual empiricaltransitional observations.

Can be expressed as hypothesis matrix Q whereqij represents the belief in the transition between states si and sj

and∑

j qij = 1 for each row i of Q.

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 16 / 37

Page 20: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Example: Top-down hypothesis

Classes deeper in the hierarchy than the previously edited class are morelikely to be changed next.

qij =

{1, if depthi < depthj ,

0, otherwise.(1)

E F G

isAisA

isA

isA isA isA

CB D

A

(c) Top-Down Example

Hypothesis Transition Matrix

To Class

From

Cla

ss

G

F

E

D

C

B

A

A B C D E F G0.15

0.20

0.25

0.30

0.35

(d) Hypothesis Matrix Q

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 17 / 37

Page 21: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Hypotheses

A

B D

E F G

isAisA

isA

isA

isA isA

C

Uniform

A

B D

E F G

isAisA

isA

isA

isA isA

C

Hierarchy

A

B D

E F G

isAisA

isA

isA

isA isA

C

Similarity

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 18 / 37

Page 22: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

HypTrails

A framework to study the relative plausibility of hypotheses (about theproduction of human edit sequences).

Sequences modeled as first-order Markov chain.

Uses Bayesian inference (marginal likelihood) for comparing differenthypotheses.

The marginal likelihood P(D|H) describes the probability of data Dgiven hypothesis H.Higher evidences indicate higher plausibility.

Factor k , describing the strength of our belief in a hypothesis.

Produces a ranked list of hypotheses (Bayesian evidences).

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 19 / 37

Page 23: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

HypTrails Results

0 1 2 3 4

−180

0000

−120

0000

k

Baye

sian

evi

denc

eRanking of Hypotheses for ICD11

●● ● ● ●

0 1 2 3 4−140

000

−110

000

k

Baye

sian

evi

denc

e

Ranking of Hypotheses for ICTM

●●

● ●

0 1 2 3 4−500

0−4

000

k

Baye

sian

evi

denc

e

Ranking of Hypotheses for BRO

● ● ● ● ●

0 1 2 3 4

−300

0−2

400

k

Baye

sian

evi

denc

e

Ranking of Hypotheses for OPL

●● ● ● ●

● uniformsimilarityhierarchy

top−downshortest pathbreadth−first

connectivitybottom−up

0 1 2 3 4

−180

0000

−120

0000

k

Baye

sian

evi

denc

eRanking of Hypotheses for ICD11

●● ● ● ●

0 1 2 3 4−140

000

−110

000

k

Baye

sian

evi

denc

e

Ranking of Hypotheses for ICTM

●●

● ●

0 1 2 3 4−500

0−4

000

k

Baye

sian

evi

denc

e

Ranking of Hypotheses for BRO

● ● ● ● ●

0 1 2 3 4

−300

0−2

400

k

Baye

sian

evi

denc

e

Ranking of Hypotheses for OPL

●● ● ● ●

● uniformsimilarityhierarchy

top−downshortest pathbreadth−first

connectivitybottom−up

0 1 2 3 4

−180

0000

−120

0000

kBa

yesi

an e

vide

nce

Ranking of Hypotheses for ICD11

●● ● ● ●

0 1 2 3 4−140

000

−110

000

k

Baye

sian

evi

denc

e

Ranking of Hypotheses for ICTM

●●

● ●

0 1 2 3 4−500

0−4

000

k

Baye

sian

evi

denc

eRanking of Hypotheses for BRO

● ● ● ● ●

0 1 2 3 4

−300

0−2

400

k

Baye

sian

evi

denc

e

Ranking of Hypotheses for OPL

●● ● ● ●

● uniformsimilarityhierarchy

top−downshortest pathbreadth−first

connectivitybottom−up

[Walk et al., 2015a] Simon Walk Philipp Singer, Lisette Espin Noboa, Tania Tudorache, Mark Musen, Markus Strohmaier:Understanding How Users Edit Ontologies: Comparing Hypotheses About Four Real-World Projects. International SemanticWeb Conference (1) 2015: 551-568

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 20 / 37

Page 24: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Results

Outline

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 20 / 37

Page 25: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal

BioPortal (Jan – Apr 2016)

Apply Markov chains and conduct analyses on Request Logs of BioPortal

Feature Value

Requests before filtering ∼50 MRequests after filtering ∼16.2 MClick-requests (interactions) ∼2.1 MDistinct IPs 160, 325

Number of Ontologies1, 652 (IDs + Names)

∼730 IDs∼500 unresolvable names

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 21 / 37

Page 26: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal

Seconds between click-requests (cut-off at 4, 200 seconds)

Ordered by IP, considered only users with > 1 request.

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 22 / 37

Page 27: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal

Hours between click-requests (cut-off at 168 hours)

Ordered by IP, considered only users with > 1 request.

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 23 / 37

Page 28: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

Click-Sessions

Definition:A sequence of clicks, where each click is performed within 1,800 seconds(30 minutes) of the previous click. However, sessions can be longer than30 minutes!

Problem:Due to the dynamic nature of BioPortal (AJAX, caching, ...), Referrer isoften “wrong” (or missing), making it impossible to trace sessions, tabbedbrowsing, back-clicks, etc.!

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 24 / 37

Page 29: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

Click-Sessions in BioPortal (Jan – Apr 2016)

Feature Value

Click-requests ∼2.1 M

Sessions 198, 6101-click sessions 64, 492≥10-click sessions 38, 535Min/Max clicks per session 1 / 3, 678Average/Median/Mode clicks per session 10.5 / 3 / 1

Min/Max duration 0 / 6hAverage/Median/Mode duration 175.6s / 2s / 1s

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 25 / 37

Page 30: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

Requests & Duration per Click-Session

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 26 / 37

Page 31: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

Example Click-Session

Timestamp Request

2016-03-14 09:07:46 /2016-03-14 09:07:48 /login?redirect=http%3A%2F%2Fbioportal.bioontology.org%2F2016-03-14 09:07:50 /login2016-03-14 09:08:04 /2016-03-14 09:08:22 /ontologies/MCCV2016-03-14 09:09:34 /ontologies/MCCV/submissions/new2016-03-14 09:08:58 /ontologies/MCCV/submissions2016-03-14 09:07:59 /ontologies/success/MCCV2016-03-14 09:10:14 /ontologies/MCCV

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 27 / 37

Page 32: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

Example Click-Session

Click-Action (General) Click-Action (Specific)

Browse Main Page Browse Ontology ClassBrowse Ontologies Browse Ontology Class TreeBrowse Search Ontology SummaryBrowse Help Browse Ontology ClassesBrowse Mappings Browse Ontology MappingsBrowse Recommender Ontology AnalyticsBrowse Annotator Browse Ontology WidgetsBrowse Resource Index Browse Ontology VisualizationBrowse Projects Browse Ontology NotesBrowse Notes Browse Ontology PropertiesLogin Browse WidgetsLog-Out Browse Ontology Property TreeSign-Up Browse Class NotesLost Password Create Ontology SubmissionBrowse Account Validate Ontology FileFeedback Virtual Appliance Download

Browse Ontology Submission

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 28 / 37

Page 33: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

Frequency of Click-Actions

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 29 / 37

Page 34: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

Example Click-Session

Timestamp Click-Action Request

2016-03-14 09:07:46 Browse Main Page /2016-03-14 09:07:48 Login /login?redirect=http%3A%2F%2Fbioportal.bioontology.org%2F2016-03-14 09:07:50 Login /login2016-03-14 09:08:04 Browse Main Page /2016-03-14 09:08:22 Ontology Summary /ontologies/MCCV2016-03-14 09:09:34 Create Ontology Submission /ontologies/MCCV/submissions/new2016-03-14 09:08:58 Browse Ontology Submission /ontologies/MCCV/submissions2016-03-14 09:07:59 Create Ontology Submission /ontologies/success/MCCV2016-03-14 09:10:14 Ontology Summary /ontologies/MCCV

Interaction Sequence:Browse Main Page → Login → Ontology Summary → Create OntologySubmission → Browse Ontology Submission → Ontology Summary

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 30 / 37

Page 35: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

BioPortal Click-Transitions (first-order)

To State

From

Sta

te

Browse AccountBrowse Annotator

Browse Class NotesBrowse Help

Browse Main PageBrowse Mappings

Browse OntologiesBrowse Ontology Class

Browse Ontology ClassesBrowse Ontology Class TreeBrowse Ontology Mappings

Browse Ontology NotesBrowse Ontology Properties

Browse Ontology Property TreeBrowse Ontology Submission

Browse Ontology VisualizationBrowse Ontology Widgets

Browse ProjectBrowse Projects

Browse RecommenderBrowse Resource Index

Browse SearchBrowse Widgets

Create Ontology SubmissionFeedbackLog−Out

Lost PasswordOntology Analytics

Ontology SummaryValidate Ontology File

Virtual Appliance DownloadBREAK

Brow

se A

ccou

ntBr

owse

Ann

otat

orBr

owse

Cla

ss N

otes

Brow

se H

elp

Brow

se M

ain

Page

Brow

se M

appi

ngs

Brow

se O

ntol

ogie

sBr

owse

Ont

olog

y Cl

ass

Brow

se O

ntol

ogy

Clas

ses

Brow

se O

ntol

ogy

Clas

s Tr

eeBr

owse

Ont

olog

y M

appi

ngs

Brow

se O

ntol

ogy

Note

sBr

owse

Ont

olog

y Pr

oper

ties

Brow

se O

ntol

ogy

Prop

erty

Tre

eBr

owse

Ont

olog

y Su

bmiss

ion

Brow

se O

ntol

ogy

Visu

aliza

tion

Brow

se O

ntol

ogy

Wid

gets

Brow

se P

roje

ctBr

owse

Pro

ject

sBr

owse

Rec

omm

ende

rBr

owse

Res

ourc

e In

dex

Brow

se S

earc

hBr

owse

Wid

gets

Crea

te O

ntol

ogy

Subm

issio

nFe

edba

ckLo

g−O

utLo

st P

assw

ord

Ont

olog

y An

alyt

icsO

ntol

ogy

Sum

mar

yVa

lidat

e O

ntol

ogy

File

Virtu

al A

pplia

nce

Down

load

BREA

K

0.0

0.2

0.4

0.6

0.8

1.0

Brow

se A

ccou

ntBr

owse

Ann

otat

orBr

owse

Cla

ss N

otes

Brow

se H

elp

Brow

se M

ain

Page

Brow

se M

appi

ngs

Brow

se O

ntol

ogie

sBr

owse

Ont

olog

y Cl

ass

Brow

se O

ntol

ogy

Clas

s Tr

eeBr

owse

Ont

olog

y Cl

asse

sBr

owse

Ont

olog

y M

appi

ngs

Brow

se O

ntol

ogy

Note

sBr

owse

Ont

olog

y Pr

oper

ties

Brow

se O

ntol

ogy

Prop

erty

Tre

eBr

owse

Ont

olog

y Su

bmiss

ion

Brow

se O

ntol

ogy

Visu

aliza

tion

Brow

se O

ntol

ogy

Wid

gets

Brow

se P

roje

ctBr

owse

Pro

ject

sBr

owse

Rec

omm

ende

rBr

owse

Res

ourc

e In

dex

Brow

se S

earc

hBr

owse

Wid

gets

Crea

te O

ntol

ogy

Subm

issio

nFe

edba

ckLo

g−O

utLo

st P

assw

ord

Ont

olog

y An

alyt

icsO

ntol

ogy

Sum

mar

yVa

lidat

e O

ntol

ogy

File

Virtu

al A

pplia

nce

Down

load

BREA

K

0e+0

02e

+05

4e+0

56e

+05

8e+0

51e

+06

Freq

uenc

y

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 31 / 37

Page 36: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

BioPortal Click-Requests per Ontology (Top 50)

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 32 / 37

Page 37: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

BioPortal Click-Transitions per Ontology

To State

From

Sta

te

Browse AccountBrowse Annotator

Browse Class NotesBrowse Help

Browse Main PageBrowse Mappings

Browse OntologiesBrowse Ontology Class

Browse Ontology ClassesBrowse Ontology Class TreeBrowse Ontology Mappings

Browse Ontology NotesBrowse Ontology Properties

Browse Ontology Property TreeBrowse Ontology Visualization

Browse Ontology WidgetsBrowse Project

Browse ProjectsBrowse Recommender

Browse Resource IndexBrowse Search

Browse WidgetsCreate Ontology Submission

FeedbackLog−Out

Lost PasswordOntology Analytics

Ontology SummaryBREAK

Brow

se A

ccou

ntBr

owse

Ann

otat

orBr

owse

Cla

ss N

otes

Brow

se H

elp

Brow

se M

ain

Page

Brow

se M

appi

ngs

Brow

se O

ntol

ogie

sBr

owse

Ont

olog

y C

lass

Brow

se O

ntol

ogy

Cla

sses

Brow

se O

ntol

ogy

Cla

ss T

ree

Brow

se O

ntol

ogy

Map

ping

sBr

owse

Ont

olog

y N

otes

Brow

se O

ntol

ogy

Prop

ertie

sBr

owse

Ont

olog

y Pr

oper

ty T

ree

Brow

se O

ntol

ogy

Visu

aliz

atio

nBr

owse

Ont

olog

y W

idge

tsBr

owse

Pro

ject

Brow

se P

roje

cts

Brow

se R

ecom

men

der

Brow

se R

esou

rce

Inde

xBr

owse

Sea

rch

Brow

se W

idge

tsC

reat

e O

ntol

ogy

Subm

issi

onFe

edba

ckLo

g−O

utLo

st P

assw

ord

Ont

olog

y An

alyt

ics

Ont

olog

y Su

mm

ary

BREA

K

0.0

0.2

0.4

0.6

0.8

1.0

(a) CPT

To State

From

Sta

te

Browse AccountBrowse Annotator

Browse Class NotesBrowse Help

Browse Main PageBrowse Mappings

Browse OntologiesBrowse Ontology Class

Browse Ontology ClassesBrowse Ontology Class TreeBrowse Ontology Mappings

Browse Ontology NotesBrowse Ontology Properties

Browse Ontology Property TreeBrowse Ontology Visualization

Browse Ontology WidgetsBrowse Project

Browse ProjectsBrowse Recommender

Browse Resource IndexBrowse Search

Browse WidgetsFeedbackLog−Out

Lost PasswordOntology Analytics

Ontology SummaryVirtual Appliance Download

BREAK

Brow

se A

ccou

ntBr

owse

Ann

otat

orBr

owse

Cla

ss N

otes

Brow

se H

elp

Brow

se M

ain

Page

Brow

se M

appi

ngs

Brow

se O

ntol

ogie

sBr

owse

Ont

olog

y C

lass

Brow

se O

ntol

ogy

Cla

sses

Brow

se O

ntol

ogy

Cla

ss T

ree

Brow

se O

ntol

ogy

Map

ping

sBr

owse

Ont

olog

y N

otes

Brow

se O

ntol

ogy

Prop

ertie

sBr

owse

Ont

olog

y Pr

oper

ty T

ree

Brow

se O

ntol

ogy

Visu

aliz

atio

nBr

owse

Ont

olog

y W

idge

tsBr

owse

Pro

ject

Brow

se P

roje

cts

Brow

se R

ecom

men

der

Brow

se R

esou

rce

Inde

xBr

owse

Sea

rch

Brow

se W

idge

tsFe

edba

ckLo

g−O

utLo

st P

assw

ord

Ont

olog

y An

alyt

ics

Ont

olog

y Su

mm

ary

Virtu

al A

pplia

nce

Dow

nloa

dBR

EAK

0.0

0.2

0.4

0.6

0.8

1.0

(b) MEDDRA

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 33 / 37

Page 38: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

Absolute Click-Transitions of CPT & MEDDRA

To State

From

Sta

te

Browse AccountBrowse Annotator

Browse Class NotesBrowse Help

Browse Main PageBrowse Mappings

Browse OntologiesBrowse Ontology Class

Browse Ontology ClassesBrowse Ontology Class TreeBrowse Ontology Mappings

Browse Ontology NotesBrowse Ontology Properties

Browse Ontology Property TreeBrowse Ontology Visualization

Browse Ontology WidgetsBrowse Project

Browse ProjectsBrowse Recommender

Browse Resource IndexBrowse Search

Browse WidgetsCreate Ontology Submission

FeedbackLog−Out

Lost PasswordOntology Analytics

Ontology SummaryVirtual Appliance Download

BREAK

Brow

se A

ccou

ntBr

owse

Ann

otat

orBr

owse

Cla

ss N

otes

Brow

se H

elp

Brow

se M

ain

Page

Brow

se M

appi

ngs

Brow

se O

ntol

ogie

sBr

owse

Ont

olog

y C

lass

Brow

se O

ntol

ogy

Cla

sses

Brow

se O

ntol

ogy

Cla

ss T

ree

Brow

se O

ntol

ogy

Map

ping

sBr

owse

Ont

olog

y N

otes

Brow

se O

ntol

ogy

Prop

ertie

sBr

owse

Ont

olog

y Pr

oper

ty T

ree

Brow

se O

ntol

ogy

Visu

aliz

atio

nBr

owse

Ont

olog

y W

idge

tsBr

owse

Pro

ject

Brow

se P

roje

cts

Brow

se R

ecom

men

der

Brow

se R

esou

rce

Inde

xBr

owse

Sea

rch

Brow

se W

idge

tsC

reat

e O

ntol

ogy

Subm

issi

onFe

edba

ckLo

g−O

utLo

st P

assw

ord

Ont

olog

y An

alyt

ics

Ont

olog

y Su

mm

ary

Virtu

al A

pplia

nce

Dow

nloa

dBR

EAK

1.00.90.80.70.60.50.40.30.20.10.0

−1.0−0.9−0.8−0.7−0.6−0.5−0.4−0.3−0.2−0.1

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 34 / 37

Page 39: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

BioPortal - Click-Sessions

Next Steps

Interpret and further analyze differences between ontologyclick-transitions on BioPortal.

Compare usage of the REST API and the UI.

Cluster users according to their click-action sequences (similarities)!Calculate stationary distribution vectors and use these to determinedistances for clustering.Compare Browsing behaviors between different clusters!

Compare editing behaviors before and after specific events (e.g.,ICD-11 iCAT editing vs. Public ICD-11 Beta Draft) for differentdatasets!

Use HypTrails to analyze which collaborative ontology-engineeringmethodologies people (most likely) follow, when developing anontology “in the wild”.

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 35 / 37

Page 40: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Questions

Questions?

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 36 / 37

Page 41: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Questions

Thanks!

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 37 / 37

Page 42: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Questions

Hypotheses

A

B D

E F G

isAisA

isA

isA

isA isA

C

Uniform

A

B D

E F G

isAisA

isA

isA

isA isA

C

Top-down

A

B D

E F G

isAisA

isA

isA

isA isA

C

Bottom-up

A

B D

E F G

isAisA

isAisA

isA isA

C

Breadth-first

A

B D

E F G

isAisA

isA

isA

isA isA

C

HierarchyA

B D

E F G

isAisA

isA

isA

isA isA

C

Shortest path

A

B D

E F G

isAisA

isA

isA

isA isA

C

Connectivity

A

B D

E F G

isAisA

isA

isA

isA isA

C

Similarity

A

B D

E F G

isAisA

isA

isA

isA isA

C

Empirical

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 37 / 37

Page 43: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Questions

HypTrails

Distribute chips to elicit Dirichlet prior

β =

uniform prior︷︸︸︷m2 + k ∗m2

︸ ︷︷ ︸informative prior

(2)

Process to αij :

Initial uniform distribution (m2)

Informative distribution (Q = Q||Q||1 ∗ β)

Normalize Q over ℓ1-normMultiply with remaining β

Remaining informative distributionRank and distribute according to Q = Q − ⌊Q⌋

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 37 / 37

Page 44: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Questions

HypTrails - Eliciting Prior Example

β = 32 + k ∗ 32, k = 1

Q =

⎝0 1 20 0 02 0 0

⎠ Q =

⎝0 0.33 0.660 0 01 0 0

Q||Q||1 =

⎝0 0.165 0.330 0 00.5 0 0

⎠ Qβ =

⎝0 ⌊1.485⌋ ⌊2.97⌋0 0 0

⌊4.5⌋ 0 0

β = 9− 7

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 37 / 37

Page 45: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Questions

Model selection

Likelihood ratio test

kηm = −2(

Log-Likelihoodk︷ ︸︸ ︷L(P(D|θk) )−

Log-Likelihoodm︷ ︸︸ ︷L(P(D|θm))) (3)

Significance test for likelihood ratios

χ2-CDF with kηm and degrees of freedom (θm − θk)

p-value defines significance of alternate model

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 37 / 37

Page 46: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Questions

Model selection

Akaike Information Criterion

AIC (k) = kηm − 2(|θm|− |θk |) (4)

Balances model complexity (over/underfitting)

Penalizes model parameters θ

Bayesian Information Criterion

BIC (k) = kηm − 2(|θm|− |θk |) ln (n) (5)

Additionally penalizes the number of observations n (transitions).

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 37 / 37

Page 47: Extracting and Analyzing Sequential Interaction-Patterns · 2016. 8. 12. · 0.5 0.6 0.7 (e) OPL Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 10 / 37. Results

Questions

Model selection

Bayesian Model Selection & HypTrailsBayes’ rule for posterior distribution of θ given data D and hypothesis H.

posterior︷ ︸︸ ︷P(θ|D,H) =

likelihood︷ ︸︸ ︷P(D|θ,H)

prior︷ ︸︸ ︷P(θ|H)

P(D|H)︸ ︷︷ ︸marginal likelihood

(6)

P(D|H) =∏

i

Γ(∑

j αij)∏j Γ(αij)

∏j Γ(nij + αij)

Γ(∑

j(nij + αij)(7)

Hyperparameters α represent pseudo counts

nij is the number of transitions between states si and sj

Simon Walk Interaction-Patterns in Ontology-Engineering July 22, 2016 37 / 37