Reconciling Event-Based Knowledge through RDF2VEC

Mehwish Alam (1,2), Diego Reforgiato Recupero (2,3), Misael Mongiovì (4), Aldo Gangemi (1,2), Petar Ristoski (5)

1. Université Paris 13, Paris, France
2. ST-Lab, National Research Council (CNR), Rome, Italy
3. University of Cagliari, Cagliari, Italy
4. National Research Council (CNR), Catania, Italy
5. University of Mannheim, Mannheim, Germany

21 October 2017, Hybrid Statistical Semantic Understanding and Emerging Semantics @ ISWC 2017


Knowledge Reconciliation

Figure: Overall process of knowledge reconciliation [Mongiovì et al., 2016]

Why Knowledge Reconciliation?

- Text summarization
- Document similarity
- Generating textual analytics

Existing Tool: MERGILO

- Graph compression
- Graph alignment
- Uses string matching and word similarity

Our Goal

- Introduce similarities effectively using event information represented as frames and roles
- Use background knowledge concerning event information

Method

Figure: Pipeline for event-based knowledge reconciliation

Framester [Gangemi et al., 2016]

Figure: Framester, a factual-linguistic linked data hub. Blue, green, orange, yellow, and grey represent role-oriented lexical resources, WordNet-like lexical resources, fact-oriented data, ontology schemas, and topic models, respectively.

Example Framester Frame Graph

Figure: A fragment of the FrameNet-OWL graph. Dotted lines represent the subFrameOf relation and solid lines represent the inheritsFrom relation, as defined in FrameNet-OWL. Frames shown: Event, Event initial state, Event end state, Objective influence, Transitive action, Intentionally affect, Motion, Mass motion, Motion Noise, Control, Invasion Scenario, Attack, Invading, Conquering, Besieging, Repel. Some frames are additionally linked by precedes edges.

RDF Graph Based Frame Embeddings: RDF2Vec [Ristoski and Paulheim, 2016]

Method

- Word2Vec converts raw text into vector representations
- RDF2Vec first converts a graph into sequences of nodes and edges, then trains Word2Vec-style models on those sequences

Models

- Continuous Bag of Words (CBOW): given a context of next and previous words as input, the central word is predicted
  Example: Capital, Austria → Vienna
- Skip-Gram (SG): given a word, its context is predicted
  Example: Vienna → Capital, Austria

Representing RDF graphs as sequences of nodes and edges

- Graph walks
- Weisfeiler-Lehman subtree RDF graph kernels
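The difference between the two models can be sketched by listing the training pairs each one would see. This is a minimal illustration, not the Word2Vec or RDF2Vec implementation: the function names `cbow_pairs` and `skipgram_pairs`, the window size, and the token order of the toy sequence are all assumptions made for the example.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 1) -> List[Tuple[Tuple[str, ...], str]]:
    """CBOW view: (context words, central word) pairs."""
    pairs = []
    for i, target in enumerate(tokens):
        # Context = up to `window` tokens before and after the central token.
        context = tuple(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
        if context:
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens: List[str], window: int = 1) -> List[Tuple[str, str]]:
    """Skip-gram view: (word, one context word) pairs."""
    pairs = []
    for i, word in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((word, tokens[j]))
    return pairs


# Hypothetical token order chosen so that "Vienna" is the central word.
sequence = ["Capital", "Vienna", "Austria"]
print(cbow_pairs(sequence))      # includes (("Capital", "Austria"), "Vienna")
print(skipgram_pairs(sequence))  # includes ("Vienna", "Capital") and ("Vienna", "Austria")
```

In RDF2Vec the tokens would be the node and edge labels of a graph sequence rather than words of a sentence.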

Graph Walks

Method

- Define a depth d, e.g. d = 3
- Perform walks of depth d to generate sequences of nodes and edges

Figure: the fragment of the FrameNet-OWL graph used as the example graph for the walks.

Sequences Generated

- Event → inheritsFrom → Objective Influence → inheritsFrom → Transitive Action
- Intentionally affect → inheritsFrom → Invasion Scenario → subFrameOf → Conquering
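The walk procedure can be sketched on a toy graph as follows. This is an illustrative sketch under stated assumptions, not the RDF2Vec code: `TOY_GRAPH` contains only the edges that appear in the two sequences on the slide plus one hypothetical edge (Invasion_scenario subFrameOf Invading), and depth is taken to mean the number of nodes per walk, which matches the three-node sequences shown for d = 3. Implementations may count hops instead.

```python
from typing import Dict, List, Tuple

# node -> list of (edge label, target node); a toy fragment, not the full Framester graph.
Graph = Dict[str, List[Tuple[str, str]]]

TOY_GRAPH: Graph = {
    "Event": [("inheritsFrom", "Objective_influence")],
    "Objective_influence": [("inheritsFrom", "Transitive_action")],
    "Intentionally_affect": [("inheritsFrom", "Invasion_scenario")],
    "Invasion_scenario": [
        ("subFrameOf", "Conquering"),
        ("subFrameOf", "Invading"),  # hypothetical edge added for illustration
    ],
}


def walks_from(graph: Graph, start: str, depth: int) -> List[List[str]]:
    """Enumerate walks from `start` that contain at most `depth` nodes.

    A walk stops when it holds `depth` nodes or reaches a node without
    outgoing edges. Each walk alternates node and edge labels.
    """
    walks: List[List[str]] = []

    def extend(path: List[str], nodes_used: int) -> None:
        edges = graph.get(path[-1], [])
        if nodes_used == depth or not edges:
            walks.append(path)
            return
        for predicate, target in edges:
            extend(path + [predicate, target], nodes_used + 1)

    extend([start], 1)
    return walks


for walk in walks_from(TOY_GRAPH, "Intentionally_affect", depth=3):
    print(" -> ".join(walk))
```

Collecting the walks from every node gives the corpus of sequences on which the CBOW or Skip-Gram model is then trained.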

Weisfeiler-Lehman Subtree RDF Graph Kernels

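Figure: one Weisfeiler-Lehman relabeling iteration on a graph with nodes a, b, c, d, e.

- Each node label is extended with its neighbours' labels: e → eb, a → abcd, b → baedc, c → cab, d → dab
- Each extended label is compressed to a new label: abcd → f, eb → h, baedc → g, dab → i, cab → j
- Relabeled nodes: e → h, a → f, b → g, c → j, d → i

Generated Sequences

- b → g → j, b → g → i, b → g → f, b → g → h, b → g → j → f
- a → f → g, a → f → j, a → f → i, a → f → g → h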
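One relabeling iteration on the five-node example can be sketched as below. This is a simplified, undirected version written for illustration; the RDF variant of the kernel also handles edge labels and direction. The edge list is inferred from the extended labels on the slide, and `wl_iteration` is a hypothetical helper name. Neighbour labels are sorted here, so node b gets the signature `bacde` where the slide shows `baedc`, and the fresh letters are assigned in sorted signature order, so the particular letters given to c and e differ from the slide. Only the grouping of nodes matters, not which letter is used.

```python
from typing import Dict, Iterator, List, Tuple

# Undirected edges inferred from the neighbour lists in the example figure.
EDGES: List[Tuple[str, str]] = [
    ("e", "b"), ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
]


def build_adjacency(edges: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    adjacency: Dict[str, List[str]] = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency


def wl_iteration(
    labels: Dict[str, str],
    adjacency: Dict[str, List[str]],
    fresh_labels: Iterator[str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run one Weisfeiler-Lehman relabeling step.

    Returns (signatures, new_labels): the extended label of every node and
    the compressed label that replaces it.
    """
    # Step 1: own label followed by the sorted labels of the neighbours.
    signatures = {
        node: labels[node] + "".join(sorted(labels[n] for n in adjacency[node]))
        for node in labels
    }
    # Step 2: map every distinct signature to a fresh, shorter label.
    compressed = {sig: next(fresh_labels) for sig in sorted(set(signatures.values()))}
    new_labels = {node: compressed[sig] for node, sig in signatures.items()}
    return signatures, new_labels


adjacency = build_adjacency(EDGES)
initial = {node: node for node in adjacency}
signatures, new_labels = wl_iteration(initial, adjacency, iter("fghij"))
print(signatures)
print(new_labels)
```

Sequences such as b → g → j then mix original and relabeled nodes, so a token like g stands for a whole neighbourhood rather than a single node.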

Continuous Bag of Words and Skip-gram Models


FRED Graphs Generated for two texts

T1: Spaniards conquered the Incas

T2: Spaniards attacked the Incas

Framester Role and Frame Mappings


Similarity Between two Frames

Wu-Palmer

$$\mathrm{sim}_{wup}(f_1, f_2) = \frac{2 \cdot \mathrm{depth}(\mathrm{lcs}(f_1, f_2))}{\mathrm{depth}(f_1) + \mathrm{depth}(f_2)}$$

Leacock-Chodorow

$$\mathrm{sim}_{lc}(f_1, f_2) = -\log \frac{\mathrm{len}(f_1, f_2) + 1}{2 \cdot D}$$

Path Similarity

$$\mathrm{sim}_{path}(f_1, f_2) = \frac{1}{\mathrm{len}(f_1, f_2) + 1}$$

Cosine Similarity (vectors computed using RDF2Vec)

$$\mathrm{sim}_{cosine}(f_1, f_2) = \frac{V_1 \cdot V_2}{\|V_1\| \cdot \|V_2\|}$$

where

- lcs: least common subsumer
- len: shortest path
- D: maximum depth of the taxonomy
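The four measures can be worked through on a small taxonomy. The sketch below is illustrative only: the `PARENT` hierarchy is a hypothetical single-inheritance toy tree, not the actual FrameNet inheritance graph, the root is assumed to have depth 1, and the two vectors passed to `sim_cosine` are made-up stand-ins for RDF2Vec embeddings.

```python
import math
from typing import Dict, List, Optional, Sequence

# Hypothetical toy taxonomy: frame -> parent frame (None for the root).
PARENT: Dict[str, Optional[str]] = {
    "Event": None,
    "Objective_influence": "Event",
    "Transitive_action": "Objective_influence",
    "Intentionally_affect": "Transitive_action",
    "Attack": "Intentionally_affect",
    "Conquering": "Intentionally_affect",
}


def ancestors(frame: str) -> List[str]:
    """Path from `frame` up to the root, both included."""
    path = [frame]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def depth(frame: str) -> int:
    return len(ancestors(frame))  # assumption: the root has depth 1


def lcs(f1: str, f2: str) -> str:
    """Least common subsumer: the deepest frame that subsumes both."""
    above_f2 = set(ancestors(f2))
    return next(a for a in ancestors(f1) if a in above_f2)


def path_len(f1: str, f2: str) -> int:
    """Number of edges on the shortest path between f1 and f2 in the tree."""
    common = depth(lcs(f1, f2))
    return (depth(f1) - common) + (depth(f2) - common)


D = max(depth(f) for f in PARENT)  # maximum depth of the taxonomy


def sim_wup(f1: str, f2: str) -> float:
    return 2 * depth(lcs(f1, f2)) / (depth(f1) + depth(f2))


def sim_lc(f1: str, f2: str) -> float:
    return -math.log((path_len(f1, f2) + 1) / (2 * D))


def sim_path(f1: str, f2: str) -> float:
    return 1 / (path_len(f1, f2) + 1)


def sim_cosine(v1: Sequence[float], v2: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(y * y for y in v2))
    return dot / (norm1 * norm2)


print(sim_wup("Attack", "Conquering"))   # 2 * 4 / (5 + 5)
print(sim_path("Attack", "Conquering"))  # 1 / (2 + 1)
print(sim_lc("Attack", "Conquering"))    # -log(3 / 10)
print(sim_cosine([1.0, 0.0], [1.0, 1.0]))
```

The first three measures depend only on the position of the frames in the hierarchy, while the cosine measure depends on whatever the embedding has learned from the generated sequences.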

Merged graph


Experimentation

Cross-document Coreference Resolution (CCR) on RDF graphs

- Associates RDF nodes that refer to the same entity (object, person, concept, etc.) across different RDF graphs generated from text

Dataset

- The EECB dataset specifies coreferent mentions (text fragments)
- RDF graphs were generated from EECB using FRED
- Text mentions were manually (via CrowdFlower) associated with graph nodes
- The evaluation framework is built on top of the original MERGILO (http://wit.istc.cnr.it/stlab-tools/mergilo)

Evaluation Measures

- MUC: link-based metric that quantifies the number of merges necessary to cover predicted and gold clusters
- B3 (B-cubed): mention-based metric that quantifies the overlap between predicted and gold clusters for a given mention
- CEAFM (Constrained Entity-Alignment F-measure, mention-based): mention-based metric based on a one-to-one alignment between gold and predicted clusters
- CEAFE (Constrained Entity-Alignment F-measure, entity-based): entity-based metric based on a one-to-one alignment between gold and predicted clusters
- BLANC (Bilateral Assessment of Noun-Phrase Coreference): Rand-index-based metric that considers both coreference and non-coreference links
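As an illustration of a mention-based metric, B3 can be computed as below. This is a minimal sketch of the standard B-cubed definition, not the scorer used in the evaluation framework: the function name `b_cubed`, the toy clusters, and the assumption that gold and predicted clusterings cover exactly the same mentions are all made for the example.

```python
from typing import Dict, List, Set, Tuple


def b_cubed(gold: List[Set[str]], predicted: List[Set[str]]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of the B-cubed metric.

    Assumes every mention occurs in exactly one gold cluster and exactly
    one predicted cluster.
    """
    gold_of: Dict[str, Set[str]] = {m: cluster for cluster in gold for m in cluster}
    pred_of: Dict[str, Set[str]] = {m: cluster for cluster in predicted for m in cluster}
    mentions = list(gold_of)

    # Per-mention overlap between its predicted and its gold cluster,
    # normalised by the predicted size (precision) or gold size (recall).
    precision = sum(
        len(gold_of[m] & pred_of[m]) / len(pred_of[m]) for m in mentions
    ) / len(mentions)
    recall = sum(
        len(gold_of[m] & pred_of[m]) / len(gold_of[m]) for m in mentions
    ) / len(mentions)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical mentions a, b, c, d.
gold_clusters = [{"a", "b", "c"}, {"d"}]
predicted_clusters = [{"a", "b"}, {"c", "d"}]
print(b_cubed(gold_clusters, predicted_clusters))
```

In this toy case the prediction splits mention c from its gold cluster and merges it with d, which lowers both precision and recall.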

Experimental Results

                                      MUC    B3     CEAFM  BLANC  CEAFE
MERGILO Baseline                      24.05  17.36  28.61  10.70  26.20

FrameNet Inheritance Similarity Measures
Wu-Palmer                             27.14  19.91  31.91  12.81  29.41
Path                                  27.16  19.93  31.85  12.73  29.38
Leacock-Chodorow                      27.04  19.80  31.74  12.77  29.21

Graph walks (full frame and role graphs)
Frame2Vec   Role2Vec                  MUC    B3     CEAFM  BLANC  CEAFE
CBOW 200    CBOW 200                  27.34  19.99  32.15  12.66  29.82
CBOW 200    SG 800                    27.38  19.97  32.29  12.69  29.98
CBOW 200    SG 500                    27.28  19.95  31.99  12.69  29.54

Graph kernels (full frame and role graphs)
Frame2Vec   Role2Vec                  MUC    B3     CEAFM  BLANC  CEAFE
CBOW 200    SG 200                    26.70  19.52  31.45  12.40  28.99
CBOW 200    SG 500                    26.70  19.52  31.45  12.40  28.99
SG 200      CBOW 200                  26.86  19.62  31.67  12.48  29.18
SG 500      CBOW 200                  26.90  19.68  31.58  12.60  29.08

Table: Event-based knowledge reconciliation results

Conclusions & Perspectives

- We introduce similarity measures for frames and semantic roles using Framester graphs and RDF2Vec
- We evaluate them as an improvement over MERGILO using event knowledge from FRED graphs
- Frame-based similarity is sensible even with top-down intensional embeddings
- Frame embeddings also seem to improve over classical graph-based similarity algorithms
- Further practical applications of frame embeddings:
  - news series integration
  - knowledge graph evolution with robust event reconciliation
  - conflict detection across texts describing similar facts
  - text summarization or dialogue
- Next: experimenting with extensional embeddings from corpus annotation (see below)
- Extended version of this paper just published in KBS [Alam et al., 2017]
- Frame2Vec models available at http://lipn.univ-paris13.fr/~alam/Frame2Vec
- Anticipation: new (extensional) frame embeddings based on frame extraction from full WFD of Wikipedia (come visit the poster session for a demo)

References

Alam, M., Reforgiato Recupero, D., Mongiovì, M., Gangemi, A., and Ristoski, P. (2017). Event-based knowledge reconciliation using frame embeddings and frame similarity. Knowledge-Based Systems, 135:192–203.

Gangemi, A., Alam, M., Asprino, L., Presutti, V., and Recupero, D. R. (2016). Framester: a wide coverage linguistic linked data hub. In Knowledge Engineering and Knowledge Management, 20th International Conference, Bologna, Italy, pages 239–254.

Mongiovì, M., Recupero, D. R., Gangemi, A., Presutti, V., and Consoli, S. (2016). Merging open knowledge extracted from text with MERGILO. Knowl.-Based Syst., 108:155–167.

Ristoski, P. and Paulheim, H. (2016). RDF2Vec: RDF graph embeddings for data mining. In ISWC, pages 498–514.

Knowledge Reconciliation

Figure Overall Process of Knowledge Reconciliation [Mongiovı et al 2016]

Why Knowledge Reconciliation

Text Summarization

Document Similarity

Generating Textual Analytics

Existing Tool ndash MERGILO

Graph Compression

Graph Alignment

Uses String matching and Word Similarity

Our Goal

Introduce similarities effectively using event information represented as Frames andRoles

Use background knowledge concerning event information2 18

Method

Figure Pipeline for Event-Based Knowledge Reconciliation

3 18

Framester [Gangemi et al 2016]

Figure Framester Factual-Linguistic Linked Data Hub Blue green orange yellow and greycolors represent role-oriented lexical resources wordnet-like lexical resources fact-oriented dataontology schemas and topic models respectively

4 18

Example Framester Frame Graph

Event initial state Event Event end state

Objective influence Motion

Transitive action Control

Intentionally affect Mass motion Motion Noise

Invasion Scenario Attack

Invading Conquering Besieging

Repel

precedes precedes

precedes

precedes

Figure A fragment of FrameNet-OWL graph dotted lines represent subFrameOf relation andsolid lines represent the inheritsFrom relation as defined in FrameNet-OWL

5 18

RDF Graph Based Frame Embeddings ndash RDF2Vec[Ristoski and Paulheim 2016]

Method

Word2Vec converts raw text into vector representations

RDF2Vec converts a graph into a sequence of nodes and edges

Models

Continuous Bag of Words

Given a context of next and previous words as input the central word is predicted

Example Capital Austria Ntilde Vienna

Skip-Gram

Given a word its context is predicted

Example Vienna Ntilde Capital Austria

Representing RDF Graphs into a sequence of nodes and edges

Graph Walks

Weisfeiler-Lehman Subtree RDF Graph Kernels

6 18

Graph Walks

Method

Define a depth d eg d ldquo 3

Perform walks of specific depth d to generate sequences of nodes and edges

Event initial state Event Event end state

Objective influence Motion

Transitive action Control

Intentionally affect Mass motion Motion Noise

Invasion Scenario Attack

Invading Conquering Besieging

Repel

precedes precedes

precedes

precedes

Sequences Generated

Event Ntilde inheritsFrom Ntilde Objective Influence Ntilde inheritsFrom Ntilde Transitive Action

Intentionally affect Ntilde inheritsFrom Ntilde Invasion Scenario Ntilde subFrameOf Ntilde Conquering

7 18

Weisfeiler-Lehman Subtree RDF Graph Kernels

e a

b

c d

eb abcd

baedc

cab dab

abcd Ntilde f

eb Ntilde h

baedc Ntilde g

dab Ntilde i

cab Ntilde j

h f

g

j i

Generated Sequences

bNtildegNtildej bNtildegNtildei bNtildegNtildef bNtildegNtildeh bNtildegNtildejNtildef

aNtildefNtildeg aNtildefNtildej aNtildefNtildei aNtildefNtildegNtildeh

8 18

Continuous Bag of Words and Skip-gram Models

9 18

FRED Graphs Generated for two texts

T1 Spaniards conquered the Incas

T2 Spaniards attacked the Incas

10 18

Framester Role and Frame Mappings

11 18

Similarity Between two Frames

Wu Palmer

simwuppf1 f2q ldquo2 ˚ depthplcspf1 f2qq

depthpf1q ` depthpf2q

Leacock Chodorow

simlcpf1 f2q ldquo acutelogplenpf1 f2q ` 1

2 ˚ Dq

Path Similarity

simpathpf1 f2q ldquo1

lenpf1 f2q ` 1

Cosine Similarity (Vectors Computed using RDF2Vec)

simcosinepf1 f2q ldquoV1 uml V2

||V1|| uml ||V2||

lcs least common subsumer

len shortest path

D maximum depth of taxonomy

12 18

Merged graph

13 18

Experimentation

Cross-document Coreference Resolution (CCR) on RDF graphs

Associates RDF nodes about a same entity (object person concept etc) acrossdifferent RDF graphs generated from text

Dataset

EECB dataset specifies coreferent mentions (text fragments)

RDF graphs were generated from EECB using FRED

Text mentions were manually (via CrowdFlower) associated with graph nodes

The evaluation framework is built on top of the original MERGILOa

ahttpwitistccnritstlab-toolsmergilo

14 18

Evaluation Measures

ndash MUC Link-based metric that quantifies the number of merges necessary to coverpredicted and gold clusters

ndash B3 Mention-based metric that quantifies the overlap between predicted and goldclusters for a given mention

ndash CEAFM (Constrained Entity Aligned F-measure Mention-based) Mention-basedmetric based on a one-to-one alignment between gold and predicted clusters

ndash CEAFE (Constrained Entity Aligned F-measure Entity-Based) Entity-based metricbased on a one-to-one alignment between gold and predicted clusters

ndash BLANC (Bilateral Assessment of NounPhrase Coreference) Rand-index-basedmetric that considers both coreference and non-coreference links

15 18

Experimental Results

muc bcub ceafm blanc ceafeMERGILO Baseline 2405 1736 2861 1070 2620

FrameNet Inheritance Similarity MeasuresWu-Palmer 2714 1991 3191 1281 2941Path 2716 1993 3185 1273 2938Leacock Chodorow 2704 1980 3174 1277 2921

Graph walks (full frame and role graphs)Frame2Vec Role2Vec muc bcub ceafm blanc ceafeCBOW 200 CBOW 200 2734 1999 3215 1266 2982CBOW 200 SG 800 2738 1997 3229 1269 2998CBOW 200 SG 500 2728 1995 3199 1269 2954

Graph kernels (full frame and role graphs)Frame2Vec Role2Vec muc bcub ceafm blanc ceafeCBOW 200 SG 200 2670 1952 3145 1240 2899CBOW 200 SG 500 2670 1952 3145 1240 2899SG 200 CBOW 200 2686 1962 3167 1248 2918SG 500 CBOW 200 2690 1968 3158 1260 2908

Table Event-Based Knowledge Reconciliation Results

16 18

Conclusions amp Perspectives

We introduce similarity measures for frames and semantic roles using Framestergraphs and RDF2Vec

We evaluate them as an improvement over MERGILO using event knowledge fromFRED graphs

Frame-based similarity is sensible even with top-down intensional embeddings

Frame embedding seems to improve also over classical graph-based similarityalgorithms

Further practical applications of frame embeddingsNext experimenting with extensional embeddings from corpus annotation (seebelow)

news series integrationknowledge graph evolution with robust event reconciliationconflict detection across texts describing similar factstext summarization or dialogue

Extended version of this paper just published on KBS [Alam et al 2017]

Frame2Vec models available athttplipnuniv-paris13fr~alamFrame2Vec

Anticipation new (extensional) frame embeddings based on frame extraction fromfull WFD of Wikipedia (come visit at the poster session for a demo)

17 18

References I

Alam M Reforgiato Recupero D Mongiovi M Gangemi A and Ristoski P(2017)Event-based knowledge reconciliation using frame embeddings and frame similarityKnowledge-Based Systems 135192ndash203

Gangemi A Alam M Asprino L Presutti V and Recupero D R (2016)Framester a wide coverage linguistic linked data hubIn Knowledge Engineering and Knowledge Management 20th InternationalConference Bologna Italy pages 239ndash254

Mongiovı M Recupero D R Gangemi A Presutti V and Consoli S (2016)Merging open knowledge extracted from text with MERGILOKnowl-Based Syst 108155ndash167

Ristoski P and Paulheim H (2016)Rdf2vec Rdf graph embeddings for data miningIn ISWC pages 498ndash514

18 18

Method

Figure Pipeline for Event-Based Knowledge Reconciliation

3 18

Framester [Gangemi et al 2016]

Figure Framester Factual-Linguistic Linked Data Hub Blue green orange yellow and greycolors represent role-oriented lexical resources wordnet-like lexical resources fact-oriented dataontology schemas and topic models respectively

4 18

Example Framester Frame Graph

Event initial state Event Event end state

Objective influence Motion

Transitive action Control

Intentionally affect Mass motion Motion Noise

Invasion Scenario Attack

Invading Conquering Besieging

Repel

precedes precedes

precedes

precedes

Figure A fragment of FrameNet-OWL graph dotted lines represent subFrameOf relation andsolid lines represent the inheritsFrom relation as defined in FrameNet-OWL

5 18

RDF Graph Based Frame Embeddings ndash RDF2Vec[Ristoski and Paulheim 2016]

Method

Word2Vec converts raw text into vector representations

RDF2Vec converts a graph into a sequence of nodes and edges

Models

Continuous Bag of Words

Given a context of next and previous words as input the central word is predicted

Example Capital Austria Ntilde Vienna

Skip-Gram

Given a word its context is predicted

Example Vienna Ntilde Capital Austria

Representing RDF Graphs into a sequence of nodes and edges

Graph Walks

Weisfeiler-Lehman Subtree RDF Graph Kernels

6 18

Graph Walks

Method

Define a depth d eg d ldquo 3

Perform walks of specific depth d to generate sequences of nodes and edges

Event initial state Event Event end state

Objective influence Motion

Transitive action Control

Intentionally affect Mass motion Motion Noise

Invasion Scenario Attack

Invading Conquering Besieging

Repel

precedes precedes

precedes

precedes

Sequences Generated

Event Ntilde inheritsFrom Ntilde Objective Influence Ntilde inheritsFrom Ntilde Transitive Action

Intentionally affect Ntilde inheritsFrom Ntilde Invasion Scenario Ntilde subFrameOf Ntilde Conquering

7 18

Weisfeiler-Lehman Subtree RDF Graph Kernels

e a

b

c d

eb abcd

baedc

cab dab

abcd Ntilde f

eb Ntilde h

baedc Ntilde g

dab Ntilde i

cab Ntilde j

h f

g

j i

Generated Sequences

bNtildegNtildej bNtildegNtildei bNtildegNtildef bNtildegNtildeh bNtildegNtildejNtildef

aNtildefNtildeg aNtildefNtildej aNtildefNtildei aNtildefNtildegNtildeh

8 18

Continuous Bag of Words and Skip-gram Models

9 18

FRED Graphs Generated for two texts

T1 Spaniards conquered the Incas

T2 Spaniards attacked the Incas

10 18

Framester Role and Frame Mappings

11 18

Similarity Between two Frames

Wu Palmer

simwuppf1 f2q ldquo2 ˚ depthplcspf1 f2qq

depthpf1q ` depthpf2q

Leacock Chodorow

simlcpf1 f2q ldquo acutelogplenpf1 f2q ` 1

2 ˚ Dq

Path Similarity

simpathpf1 f2q ldquo1

lenpf1 f2q ` 1

Cosine Similarity (Vectors Computed using RDF2Vec)

simcosinepf1 f2q ldquoV1 uml V2

||V1|| uml ||V2||

lcs least common subsumer

len shortest path

D maximum depth of taxonomy

12 18

Merged graph

13 18

Experimentation

Cross-document Coreference Resolution (CCR) on RDF graphs

Associates RDF nodes about a same entity (object person concept etc) acrossdifferent RDF graphs generated from text

Dataset

EECB dataset specifies coreferent mentions (text fragments)

RDF graphs were generated from EECB using FRED

Text mentions were manually (via CrowdFlower) associated with graph nodes

The evaluation framework is built on top of the original MERGILOa

ahttpwitistccnritstlab-toolsmergilo

14 18

Evaluation Measures

ndash MUC Link-based metric that quantifies the number of merges necessary to coverpredicted and gold clusters

ndash B3 Mention-based metric that quantifies the overlap between predicted and goldclusters for a given mention

ndash CEAFM (Constrained Entity Aligned F-measure Mention-based) Mention-basedmetric based on a one-to-one alignment between gold and predicted clusters

ndash CEAFE (Constrained Entity Aligned F-measure Entity-Based) Entity-based metricbased on a one-to-one alignment between gold and predicted clusters

ndash BLANC (Bilateral Assessment of NounPhrase Coreference) Rand-index-basedmetric that considers both coreference and non-coreference links

15 18

Experimental Results

muc bcub ceafm blanc ceafeMERGILO Baseline 2405 1736 2861 1070 2620

FrameNet Inheritance Similarity MeasuresWu-Palmer 2714 1991 3191 1281 2941Path 2716 1993 3185 1273 2938Leacock Chodorow 2704 1980 3174 1277 2921

Graph walks (full frame and role graphs)Frame2Vec Role2Vec muc bcub ceafm blanc ceafeCBOW 200 CBOW 200 2734 1999 3215 1266 2982CBOW 200 SG 800 2738 1997 3229 1269 2998CBOW 200 SG 500 2728 1995 3199 1269 2954

Graph kernels (full frame and role graphs)Frame2Vec Role2Vec muc bcub ceafm blanc ceafeCBOW 200 SG 200 2670 1952 3145 1240 2899CBOW 200 SG 500 2670 1952 3145 1240 2899SG 200 CBOW 200 2686 1962 3167 1248 2918SG 500 CBOW 200 2690 1968 3158 1260 2908

Table Event-Based Knowledge Reconciliation Results

16 18

Conclusions amp Perspectives

We introduce similarity measures for frames and semantic roles using Framestergraphs and RDF2Vec

We evaluate them as an improvement over MERGILO using event knowledge fromFRED graphs

Frame-based similarity is sensible even with top-down intensional embeddings

Frame embedding seems to improve also over classical graph-based similarityalgorithms

Further practical applications of frame embeddingsNext experimenting with extensional embeddings from corpus annotation (seebelow)

news series integrationknowledge graph evolution with robust event reconciliationconflict detection across texts describing similar factstext summarization or dialogue

Extended version of this paper just published on KBS [Alam et al 2017]

Frame2Vec models available athttplipnuniv-paris13fr~alamFrame2Vec

Anticipation new (extensional) frame embeddings based on frame extraction fromfull WFD of Wikipedia (come visit at the poster session for a demo)

17 18

References I

Alam M Reforgiato Recupero D Mongiovi M Gangemi A and Ristoski P(2017)Event-based knowledge reconciliation using frame embeddings and frame similarityKnowledge-Based Systems 135192ndash203

Gangemi A Alam M Asprino L Presutti V and Recupero D R (2016)Framester a wide coverage linguistic linked data hubIn Knowledge Engineering and Knowledge Management 20th InternationalConference Bologna Italy pages 239ndash254

Mongiovı M Recupero D R Gangemi A Presutti V and Consoli S (2016)Merging open knowledge extracted from text with MERGILOKnowl-Based Syst 108155ndash167

Ristoski P and Paulheim H (2016)Rdf2vec Rdf graph embeddings for data miningIn ISWC pages 498ndash514

18 18

Framester [Gangemi et al 2016]

Figure Framester Factual-Linguistic Linked Data Hub Blue green orange yellow and greycolors represent role-oriented lexical resources wordnet-like lexical resources fact-oriented dataontology schemas and topic models respectively

4 18

Example Framester Frame Graph

Event initial state Event Event end state

Objective influence Motion

Transitive action Control

Intentionally affect Mass motion Motion Noise

Invasion Scenario Attack

Invading Conquering Besieging

Repel

precedes precedes

precedes

precedes

Figure A fragment of FrameNet-OWL graph dotted lines represent subFrameOf relation andsolid lines represent the inheritsFrom relation as defined in FrameNet-OWL

5 18

RDF Graph Based Frame Embeddings ndash RDF2Vec[Ristoski and Paulheim 2016]

Method

Word2Vec converts raw text into vector representations

RDF2Vec converts a graph into a sequence of nodes and edges

Models

Continuous Bag of Words

Given a context of next and previous words as input the central word is predicted

Example Capital Austria Ntilde Vienna

Skip-Gram

Given a word its context is predicted

Example Vienna Ntilde Capital Austria

Representing RDF Graphs into a sequence of nodes and edges

Graph Walks

Weisfeiler-Lehman Subtree RDF Graph Kernels

6 18

Graph Walks

Method

Define a depth d eg d ldquo 3

Perform walks of specific depth d to generate sequences of nodes and edges

Event initial state Event Event end state

Objective influence Motion

Transitive action Control

Intentionally affect Mass motion Motion Noise

Invasion Scenario Attack

Invading Conquering Besieging

Repel

precedes precedes

precedes

precedes

Sequences Generated

Event Ntilde inheritsFrom Ntilde Objective Influence Ntilde inheritsFrom Ntilde Transitive Action

Intentionally affect Ntilde inheritsFrom Ntilde Invasion Scenario Ntilde subFrameOf Ntilde Conquering

7 18

Weisfeiler-Lehman Subtree RDF Graph Kernels

e a

b

c d

eb abcd

baedc

cab dab

abcd Ntilde f

eb Ntilde h

baedc Ntilde g

dab Ntilde i

cab Ntilde j

h f

g

j i

Generated Sequences

bNtildegNtildej bNtildegNtildei bNtildegNtildef bNtildegNtildeh bNtildegNtildejNtildef

aNtildefNtildeg aNtildefNtildej aNtildefNtildei aNtildefNtildegNtildeh

8 18

Continuous Bag of Words and Skip-gram Models

9 18

FRED Graphs Generated for two texts

T1 Spaniards conquered the Incas

T2 Spaniards attacked the Incas

10 18

Framester Role and Frame Mappings

11 18

Similarity Between two Frames

Wu Palmer

simwuppf1 f2q ldquo2 ˚ depthplcspf1 f2qq

depthpf1q ` depthpf2q

Leacock Chodorow

simlcpf1 f2q ldquo acutelogplenpf1 f2q ` 1

2 ˚ Dq

Path Similarity

simpathpf1 f2q ldquo1

lenpf1 f2q ` 1

Cosine Similarity (Vectors Computed using RDF2Vec)

simcosinepf1 f2q ldquoV1 uml V2

||V1|| uml ||V2||

lcs least common subsumer

len shortest path

D maximum depth of taxonomy

12 18

Merged graph

13 18

Experimentation

Cross-document Coreference Resolution (CCR) on RDF graphs

Associates RDF nodes about a same entity (object person concept etc) acrossdifferent RDF graphs generated from text

Dataset

EECB dataset specifies coreferent mentions (text fragments)

RDF graphs were generated from EECB using FRED

Text mentions were manually (via CrowdFlower) associated with graph nodes

The evaluation framework is built on top of the original MERGILOa

ahttpwitistccnritstlab-toolsmergilo

14 18

Evaluation Measures

ndash MUC Link-based metric that quantifies the number of merges necessary to coverpredicted and gold clusters

ndash B3 Mention-based metric that quantifies the overlap between predicted and goldclusters for a given mention

ndash CEAFM (Constrained Entity Aligned F-measure Mention-based) Mention-basedmetric based on a one-to-one alignment between gold and predicted clusters

ndash CEAFE (Constrained Entity Aligned F-measure Entity-Based) Entity-based metricbased on a one-to-one alignment between gold and predicted clusters

ndash BLANC (Bilateral Assessment of NounPhrase Coreference) Rand-index-basedmetric that considers both coreference and non-coreference links

15 18

Experimental Results

muc bcub ceafm blanc ceafeMERGILO Baseline 2405 1736 2861 1070 2620

FrameNet Inheritance Similarity MeasuresWu-Palmer 2714 1991 3191 1281 2941Path 2716 1993 3185 1273 2938Leacock Chodorow 2704 1980 3174 1277 2921

Graph walks (full frame and role graphs)Frame2Vec Role2Vec muc bcub ceafm blanc ceafeCBOW 200 CBOW 200 2734 1999 3215 1266 2982CBOW 200 SG 800 2738 1997 3229 1269 2998CBOW 200 SG 500 2728 1995 3199 1269 2954

Graph kernels (full frame and role graphs)Frame2Vec Role2Vec muc bcub ceafm blanc ceafeCBOW 200 SG 200 2670 1952 3145 1240 2899CBOW 200 SG 500 2670 1952 3145 1240 2899SG 200 CBOW 200 2686 1962 3167 1248 2918SG 500 CBOW 200 2690 1968 3158 1260 2908

Table Event-Based Knowledge Reconciliation Results

16 18

Conclusions amp Perspectives

We introduce similarity measures for frames and semantic roles using Framestergraphs and RDF2Vec

We evaluate them as an improvement over MERGILO using event knowledge fromFRED graphs

Frame-based similarity is sensible even with top-down intensional embeddings

Frame embedding seems to improve also over classical graph-based similarityalgorithms

Further practical applications of frame embeddingsNext experimenting with extensional embeddings from corpus annotation (seebelow)

news series integrationknowledge graph evolution with robust event reconciliationconflict detection across texts describing similar factstext summarization or dialogue

Extended version of this paper just published on KBS [Alam et al 2017]

Frame2Vec models available athttplipnuniv-paris13fr~alamFrame2Vec

Anticipation new (extensional) frame embeddings based on frame extraction fromfull WFD of Wikipedia (come visit at the poster session for a demo)

17 18

References I

Alam M Reforgiato Recupero D Mongiovi M Gangemi A and Ristoski P(2017)Event-based knowledge reconciliation using frame embeddings and frame similarityKnowledge-Based Systems 135192ndash203

Gangemi A Alam M Asprino L Presutti V and Recupero D R (2016)Framester a wide coverage linguistic linked data hubIn Knowledge Engineering and Knowledge Management 20th InternationalConference Bologna Italy pages 239ndash254

Mongiovı M Recupero D R Gangemi A Presutti V and Consoli S (2016)Merging open knowledge extracted from text with MERGILOKnowl-Based Syst 108155ndash167

Ristoski P and Paulheim H (2016)Rdf2vec Rdf graph embeddings for data miningIn ISWC pages 498ndash514

18 18

Example Framester Frame Graph

Event initial state Event Event end state

Objective influence Motion

Transitive action Control

Intentionally affect Mass motion Motion Noise

Invasion Scenario Attack

Invading Conquering Besieging

Repel

precedes precedes

precedes

precedes

Figure A fragment of FrameNet-OWL graph dotted lines represent subFrameOf relation andsolid lines represent the inheritsFrom relation as defined in FrameNet-OWL

5 18

RDF Graph Based Frame Embeddings ndash RDF2Vec[Ristoski and Paulheim 2016]

Method

Word2Vec converts raw text into vector representations

RDF2Vec converts a graph into a sequence of nodes and edges

Models

Continuous Bag of Words

Given a context of next and previous words as input the central word is predicted

Example Capital Austria Ntilde Vienna

Skip-Gram

Given a word its context is predicted

Example Vienna Ntilde Capital Austria

Representing RDF Graphs into a sequence of nodes and edges

Graph Walks

Weisfeiler-Lehman Subtree RDF Graph Kernels

6 18

Graph Walks

Method

Define a depth d eg d ldquo 3

Perform walks of specific depth d to generate sequences of nodes and edges

Event initial state Event Event end state

Objective influence Motion

Transitive action Control

Intentionally affect Mass motion Motion Noise

Invasion Scenario Attack

Invading Conquering Besieging

Repel

precedes precedes

precedes

precedes

Sequences Generated

Event Ntilde inheritsFrom Ntilde Objective Influence Ntilde inheritsFrom Ntilde Transitive Action

Intentionally affect Ntilde inheritsFrom Ntilde Invasion Scenario Ntilde subFrameOf Ntilde Conquering

7 18

Weisfeiler-Lehman Subtree RDF Graph Kernels

e a

b

c d

eb abcd

baedc

cab dab

abcd Ntilde f

eb Ntilde h

baedc Ntilde g

dab Ntilde i

cab Ntilde j

h f

g

j i

Generated Sequences

bNtildegNtildej bNtildegNtildei bNtildegNtildef bNtildegNtildeh bNtildegNtildejNtildef

aNtildefNtildeg aNtildefNtildej aNtildefNtildei aNtildefNtildegNtildeh

8 18

Continuous Bag of Words and Skip-gram Models

9 18

FRED Graphs Generated for two texts

T1 Spaniards conquered the Incas

T2 Spaniards attacked the Incas

10 18

Framester Role and Frame Mappings

11 18

Similarity Between two Frames

Wu Palmer

simwuppf1 f2q ldquo2 ˚ depthplcspf1 f2qq

depthpf1q ` depthpf2q

Leacock Chodorow

simlcpf1 f2q ldquo acutelogplenpf1 f2q ` 1

2 ˚ Dq

Path Similarity

simpathpf1 f2q ldquo1

lenpf1 f2q ` 1

Cosine Similarity (Vectors Computed using RDF2Vec)

simcosinepf1 f2q ldquoV1 uml V2

||V1|| uml ||V2||

lcs least common subsumer

len shortest path

D maximum depth of taxonomy

12 18

Merged graph

13 18

Experimentation

Cross-document Coreference Resolution (CCR) on RDF graphs

Associates RDF nodes about a same entity (object person concept etc) acrossdifferent RDF graphs generated from text

Dataset

EECB dataset specifies coreferent mentions (text fragments)

RDF graphs were generated from EECB using FRED

Text mentions were manually (via CrowdFlower) associated with graph nodes

The evaluation framework is built on top of the original MERGILOa

ahttpwitistccnritstlab-toolsmergilo

14 18

Evaluation Measures

ndash MUC Link-based metric that quantifies the number of merges necessary to coverpredicted and gold clusters

ndash B3 Mention-based metric that quantifies the overlap between predicted and goldclusters for a given mention

ndash CEAFM (Constrained Entity Aligned F-measure Mention-based) Mention-basedmetric based on a one-to-one alignment between gold and predicted clusters

ndash CEAFE (Constrained Entity Aligned F-measure Entity-Based) Entity-based metricbased on a one-to-one alignment between gold and predicted clusters

ndash BLANC (Bilateral Assessment of NounPhrase Coreference) Rand-index-basedmetric that considers both coreference and non-coreference links

Experimental Results

| Method | Frame2Vec | Role2Vec | MUC | B³ | CEAFM | BLANC | CEAFE |
|---|---|---|---|---|---|---|---|
| MERGILO Baseline | | | 24.05 | 17.36 | 28.61 | 10.70 | 26.20 |
| FrameNet Inheritance Similarity Measures | | | | | | | |
| Wu-Palmer | | | 27.14 | 19.91 | 31.91 | 12.81 | 29.41 |
| Path | | | 27.16 | 19.93 | 31.85 | 12.73 | 29.38 |
| Leacock-Chodorow | | | 27.04 | 19.80 | 31.74 | 12.77 | 29.21 |
| Graph walks (full frame and role graphs) | | | | | | | |
| | CBOW 200 | CBOW 200 | 27.34 | 19.99 | 32.15 | 12.66 | 29.82 |
| | CBOW 200 | SG 800 | 27.38 | 19.97 | 32.29 | 12.69 | 29.98 |
| | CBOW 200 | SG 500 | 27.28 | 19.95 | 31.99 | 12.69 | 29.54 |
| Graph kernels (full frame and role graphs) | | | | | | | |
| | CBOW 200 | SG 200 | 26.70 | 19.52 | 31.45 | 12.40 | 28.99 |
| | CBOW 200 | SG 500 | 26.70 | 19.52 | 31.45 | 12.40 | 28.99 |
| | SG 200 | CBOW 200 | 26.86 | 19.62 | 31.67 | 12.48 | 29.18 |
| | SG 500 | CBOW 200 | 26.90 | 19.68 | 31.58 | 12.60 | 29.08 |

Table: Event-Based Knowledge Reconciliation Results

Conclusions & Perspectives

We introduce similarity measures for frames and semantic roles using Framester graphs and RDF2Vec

We evaluate them as an improvement over MERGILO, using event knowledge from FRED graphs

Frame-based similarity is sensible even with top-down intensional embeddings

Frame embeddings also seem to improve over classical graph-based similarity algorithms

Next: experimenting with extensional embeddings from corpus annotation (see below)

Further practical applications of frame embeddings:

– news series integration

– knowledge graph evolution with robust event reconciliation

– conflict detection across texts describing similar facts

– text summarization or dialogue

An extended version of this paper has just been published in KBS [Alam et al., 2017]

Frame2Vec models are available at http://lipn.univ-paris13.fr/~alam/Frame2Vec

Anticipation: new (extensional) frame embeddings based on frame extraction from the full WFD of Wikipedia (come visit us at the poster session for a demo)

References

Alam, M., Reforgiato Recupero, D., Mongiovì, M., Gangemi, A., and Ristoski, P. (2017). Event-based knowledge reconciliation using frame embeddings and frame similarity. Knowledge-Based Systems, 135:192–203.

Gangemi, A., Alam, M., Asprino, L., Presutti, V., and Recupero, D. R. (2016). Framester: a wide coverage linguistic linked data hub. In Knowledge Engineering and Knowledge Management, 20th International Conference, Bologna, Italy, pages 239–254.

Mongiovì, M., Recupero, D. R., Gangemi, A., Presutti, V., and Consoli, S. (2016). Merging open knowledge extracted from text with MERGILO. Knowl.-Based Syst., 108:155–167.

Ristoski, P. and Paulheim, H. (2016). Rdf2vec: Rdf graph embeddings for data mining. In ISWC, pages 498–514.

Similarity Between two Frames

Wu Palmer

simwuppf1 f2q ldquo2 ˚ depthplcspf1 f2qq

depthpf1q ` depthpf2q

Leacock Chodorow

simlcpf1 f2q ldquo acutelogplenpf1 f2q ` 1

2 ˚ Dq

Path Similarity

simpathpf1 f2q ldquo1

lenpf1 f2q ` 1

Cosine Similarity (Vectors Computed using RDF2Vec)

simcosinepf1 f2q ldquoV1 uml V2

||V1|| uml ||V2||

lcs least common subsumer

len shortest path

D maximum depth of taxonomy

12 18

Merged graph

13 18

Experimentation

Cross-document Coreference Resolution (CCR) on RDF graphs

Associates RDF nodes about a same entity (object person concept etc) acrossdifferent RDF graphs generated from text

Dataset

EECB dataset specifies coreferent mentions (text fragments)

RDF graphs were generated from EECB using FRED

Text mentions were manually (via CrowdFlower) associated with graph nodes

The evaluation framework is built on top of the original MERGILOa

ahttpwitistccnritstlab-toolsmergilo

14 18

Evaluation Measures

- MUC: Link-based metric that quantifies the number of merges necessary to cover predicted and gold clusters

- B3: Mention-based metric that quantifies the overlap between predicted and gold clusters for a given mention

- CEAFM (Constrained Entity Aligned F-measure, Mention-based): Mention-based metric based on a one-to-one alignment between gold and predicted clusters

- CEAFE (Constrained Entity Aligned F-measure, Entity-based): Entity-based metric based on a one-to-one alignment between gold and predicted clusters

- BLANC (Bilateral Assessment of Noun-Phrase Coreference): Rand-index-based metric that considers both coreference and non-coreference links
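To make the first two measures concrete, here is a minimal sketch of MUC and B3 following their standard textbook definitions. It is not the scorer used in the evaluation framework; the function names and the toy clusters are hypothetical, and `b_cubed` assumes that gold and predicted clusterings cover the same mentions.

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

def b_cubed(gold, pred):
    """B3: average per-mention overlap between predicted and gold cluster.
    Assumes both clusterings contain exactly the same mentions."""
    gold_of = {m: c for c in gold for m in c}
    pred_of = {m: c for c in pred for m in c}
    mentions = list(gold_of)
    p = sum(len(gold_of[m] & pred_of[m]) / len(pred_of[m]) for m in mentions)
    r = sum(len(gold_of[m] & pred_of[m]) / len(gold_of[m]) for m in mentions)
    p, r = p / len(mentions), r / len(mentions)
    return p, r, f1(p, r)

def _muc_side(key, response):
    """Fraction of links in `key` clusters that `response` preserves."""
    resp_of = {m: i for i, c in enumerate(response) for m in c}
    num = den = 0
    for c in key:
        # Mentions absent from the response each count as their own partition.
        parts = {resp_of.get(m, ("singleton", m)) for m in c}
        num += len(c) - len(parts)
        den += len(c) - 1
    return num / den if den else 0.0

def muc(gold, pred):
    """MUC: link-based precision, recall and F1."""
    r = _muc_side(gold, pred)
    p = _muc_side(pred, gold)
    return p, r, f1(p, r)

# Hypothetical toy clusterings of five mentions.
gold = [{"a", "b", "c"}, {"d", "e"}]
pred = [{"a", "b"}, {"c", "d", "e"}]
print("B3 :", b_cubed(gold, pred))
print("MUC:", muc(gold, pred))
```

In the toy example each side needs one extra link to be fully covered by the other, so MUC precision and recall are both 2/3, while B3 averages the per-mention overlaps to 11/15.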


Experimental Results

Method                          MUC    B3     CEAFM  BLANC  CEAFE
MERGILO Baseline                24.05  17.36  28.61  10.70  26.20

FrameNet Inheritance Similarity Measures
Wu-Palmer                       27.14  19.91  31.91  12.81  29.41
Path                            27.16  19.93  31.85  12.73  29.38
Leacock-Chodorow                27.04  19.80  31.74  12.77  29.21

Graph walks (full frame and role graphs)
Frame2Vec    Role2Vec           MUC    B3     CEAFM  BLANC  CEAFE
CBOW 200     CBOW 200           27.34  19.99  32.15  12.66  29.82
CBOW 200     SG 800             27.38  19.97  32.29  12.69  29.98
CBOW 200     SG 500             27.28  19.95  31.99  12.69  29.54

Graph kernels (full frame and role graphs)
Frame2Vec    Role2Vec           MUC    B3     CEAFM  BLANC  CEAFE
CBOW 200     SG 200             26.70  19.52  31.45  12.40  28.99
CBOW 200     SG 500             26.70  19.52  31.45  12.40  28.99
SG 200       CBOW 200           26.86  19.62  31.67  12.48  29.18
SG 500       CBOW 200           26.90  19.68  31.58  12.60  29.08

Table: Event-Based Knowledge Reconciliation Results
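The "graph walks" rows refer to the RDF2Vec strategy of turning a graph into token sequences that a word2vec model (CBOW or skip-gram, SG) can be trained on. The sketch below shows only the walk-generation step, under stated assumptions: the triples, predicate names, and parameters (`walks_per_node`, `depth`, `seed`) are hypothetical and are not the settings used in the experiments.

```python
import random

# Hypothetical toy triples (subject, predicate, object); not Framester data.
TRIPLES = [
    ("Self_motion", "inheritsFrom", "Motion"),
    ("Travel", "inheritsFrom", "Motion"),
    ("Motion", "inheritsFrom", "Event"),
    ("Motion", "hasRole", "Theme"),
    ("Self_motion", "hasRole", "Self_mover"),
]

def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    adj = {}
    for s, p, o in triples:
        adj.setdefault(s, []).append((p, o))
    return adj

def random_walks(triples, walks_per_node=5, depth=2, seed=0):
    """Generate random walks of at most `depth` hops from every subject node.

    Each walk alternates nodes and predicates:
    [node, predicate, node, predicate, node, ...]
    """
    rng = random.Random(seed)  # seeded for reproducibility
    adj = build_adjacency(triples)
    walks = []
    for start in sorted(adj):
        for _ in range(walks_per_node):
            walk, node = [start], start
            for _ in range(depth):
                if node not in adj:  # dead end: no outgoing edges
                    break
                pred, node = rng.choice(adj[node])
                walk += [pred, node]
            walks.append(walk)
    return walks

for walk in random_walks(TRIPLES, walks_per_node=2):
    print(" -> ".join(walk))
```

Each walk is then treated as a sentence whose tokens are node and predicate identifiers. Training a word2vec model on these sentences (not shown here) yields one vector per frame or role, which is what the cosine similarity measure compares.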


Conclusions & Perspectives

We introduce similarity measures for frames and semantic roles using Framester graphs and RDF2Vec

We evaluate them as an improvement over MERGILO using event knowledge from FRED graphs

Frame-based similarity is sensible even with top-down intensional embeddings

Frame embeddings also seem to improve over classical graph-based similarity algorithms

Further practical applications of frame embeddings:

- news series integration

- knowledge graph evolution with robust event reconciliation

- conflict detection across texts describing similar facts

- text summarization or dialogue

Next: experimenting with extensional embeddings from corpus annotation (see below)

Extended version of this paper just published in KBS [Alam et al. 2017]

Frame2Vec models available at http://lipn.univ-paris13.fr/~alam/Frame2Vec

Anticipation: new (extensional) frame embeddings based on frame extraction from full WFD of Wikipedia (come visit at the poster session for a demo)


References I

Alam, M., Reforgiato Recupero, D., Mongiovì, M., Gangemi, A., and Ristoski, P. (2017). Event-based knowledge reconciliation using frame embeddings and frame similarity. Knowledge-Based Systems, 135:192-203.

Gangemi, A., Alam, M., Asprino, L., Presutti, V., and Recupero, D. R. (2016). Framester: a wide coverage linguistic linked data hub. In Knowledge Engineering and Knowledge Management, 20th International Conference, Bologna, Italy, pages 239-254.

Mongiovì, M., Recupero, D. R., Gangemi, A., Presutti, V., and Consoli, S. (2016). Merging open knowledge extracted from text with MERGILO. Knowl.-Based Syst., 108:155-167.

Ristoski, P. and Paulheim, H. (2016). RDF2Vec: RDF graph embeddings for data mining. In ISWC, pages 498-514.
