
Upload: julien-plu

Post on 14-Jul-2015


Page 1: Using DBpedia for Spotting and Disambiguating Entities

Julien Plu, Giuseppe Rizzo, Raphaël Troncy

{firstname.lastname}@eurecom.fr, @julienplu, @giusepperizzo, @rtroncy

Using DBpedia for Spotting and Disambiguating Entities

Page 2: Using DBpedia for Spotting and Disambiguating Entities

Agenda

Entity Linking task

Why use DBpedia?

Workflow

How is the index created?

Experiments on tweets

Future work

09/02/2015 - 3rd DBpedia Community Meeting – Dublin, Ireland - 2

Page 3: Using DBpedia for Spotting and Disambiguating Entities

Entity Linking Task

The purpose is to link entity mentions found in text to their corresponding entries in a knowledge base.

Example:

Last year I went to Paris to see the Eiffel Tower with some friends.

Paris: http://dbpedia.org/resource/Paris

Eiffel Tower: http://dbpedia.org/resource/Eiffel_Tower
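The expected output for the example sentence can be written down as a list of annotations, each pairing a mention and its character offsets with a DBpedia URI. The Python sketch below only illustrates what the task produces; it is not the authors' system. The `Annotation` type, the `annotate` helper, and the hand-written `GOLD_LINKS` table are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    mention: str  # surface form found in the text
    start: int    # character offset where the mention begins
    end: int      # offset one past the last character of the mention
    uri: str      # knowledge-base entry the mention is linked to


TEXT = "Last year I went to Paris to see the Eiffel Tower with some friends."

# Hand-written links for this one sentence (illustrative only).
GOLD_LINKS = {
    "Paris": "http://dbpedia.org/resource/Paris",
    "Eiffel Tower": "http://dbpedia.org/resource/Eiffel_Tower",
}


def annotate(text, links):
    """Locate each known mention in the text and attach its URI."""
    annotations = []
    for mention, uri in links.items():
        start = text.find(mention)
        if start != -1:
            annotations.append(
                Annotation(mention, start, start + len(mention), uri)
            )
    return sorted(annotations, key=lambda a: a.start)


annotations = annotate(TEXT, GOLD_LINKS)
for a in annotations:
    print(a.mention, a.start, a.end, a.uri)
```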


Page 4: Using DBpedia for Spotting and Disambiguating Entities

Why use DBpedia?

No legacy problems compared with Freebase

Knowledge base is constantly evolving

Available in many languages, which are interlinked

Most of the resources have a type

All the resources have semantic relations with others

Possibility to get the popularity of a resource for each language


Page 5: Using DBpedia for Spotting and Disambiguating Entities

Workflow

Text

POS tagging / n-gram analysis to get the entities

Lookup in the index to get candidates for each entity

Linking: choose the right candidate for each entity

• Not domain-dependent

• The lookup and the linking processes run on top of an index created with DBpedia
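The three stages (spotting, lookup, linking) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it spots mentions with n-grams only (no POS tagging), it uses a tiny in-memory `TOY_INDEX` instead of an index built from DBpedia, and it picks a candidate by a made-up score, since the slides do not say how the right candidate is chosen. All function names are hypothetical.

```python
import re

# Toy index: lowercased label -> list of (candidate URI, score).
# The entries and scores are hypothetical stand-ins for a real
# index created from DBpedia.
TOY_INDEX = {
    "paris": [
        ("http://dbpedia.org/resource/Paris", 0.9),
        ("http://dbpedia.org/resource/Paris,_Texas", 0.1),
    ],
    "eiffel tower": [("http://dbpedia.org/resource/Eiffel_Tower", 1.0)],
}


def ngrams(tokens, max_n=3):
    """Yield (start, end, phrase) for every n-gram, longest first."""
    for n in range(max_n, 0, -1):
        for i in range(len(tokens) - n + 1):
            yield i, i + n, " ".join(tokens[i:i + n])


def spot(text, index, max_n=3):
    """Spotting: keep the longest non-overlapping n-grams known to the index."""
    tokens = re.findall(r"\w+", text)
    taken = set()      # token positions already covered by a mention
    mentions = []
    for start, end, phrase in ngrams(tokens, max_n):
        span = set(range(start, end))
        if phrase.lower() in index and not (span & taken):
            taken |= span
            mentions.append((start, phrase))
    return [phrase for _, phrase in sorted(mentions)]


def lookup(mention, index):
    """Lookup: fetch the candidate entities for one mention."""
    return index.get(mention.lower(), [])


def link(candidates):
    """Linking: choose the candidate with the highest (assumed) score."""
    return max(candidates, key=lambda c: c[1])[0]


def annotate(text, index=TOY_INDEX):
    return {m: link(lookup(m, index)) for m in spot(text, index)}


result = annotate(
    "Last year I went to Paris to see the Eiffel Tower with some friends."
)
print(result)
```

Preferring the longest n-gram keeps "Eiffel Tower" as one mention instead of splitting it into two shorter ones.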


Page 6: Using DBpedia for Spotting and Disambiguating Entities

How is the index created?

Three datasets are used:

• Titles

• Redirects

• Disambiguation links

Structure of the index:

• First column is the label of the entity

• Second column is the URI of the entity

• Third column lists all the labels of the redirect pages linked to the entity

• Fourth column is the label of the disambiguation page of the entity
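The four-column layout can be illustrated by joining three small tables into one row per entity. The Python sketch below assumes miniature, hand-written versions of the three datasets; the sample entries, the `build_index` function, and the choice to store a single disambiguation label per entity (empty when there is none) are assumptions made for this example, not details taken from the slides.

```python
from collections import defaultdict

# Hypothetical miniature versions of the three datasets.
TITLES = {  # entity URI -> label
    "http://dbpedia.org/resource/Paris": "Paris",
    "http://dbpedia.org/resource/Eiffel_Tower": "Eiffel Tower",
}
REDIRECTS = [  # (label of the redirect page, URI it redirects to)
    ("Tour Eiffel", "http://dbpedia.org/resource/Eiffel_Tower"),
    ("Paris, France", "http://dbpedia.org/resource/Paris"),
]
DISAMBIGUATIONS = [  # (label of the disambiguation page, URI it lists)
    ("Paris (disambiguation)", "http://dbpedia.org/resource/Paris"),
]


def build_index(titles, redirects, disambiguations):
    """Return rows of (label, URI, redirect labels, disambiguation label)."""
    redirect_labels = defaultdict(list)
    for label, target in redirects:
        redirect_labels[target].append(label)

    # Simplification: keep one disambiguation label per entity.
    disambiguation_label = {uri: label for label, uri in disambiguations}

    rows = []
    for uri, label in titles.items():
        rows.append((
            label,                              # column 1
            uri,                                # column 2
            sorted(redirect_labels[uri]),       # column 3
            disambiguation_label.get(uri, ""),  # column 4
        ))
    return rows


rows = build_index(TITLES, REDIRECTS, DISAMBIGUATIONS)
for row in rows:
    print(row)
```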


Page 7: Using DBpedia for Spotting and Disambiguating Entities

Experiments on tweets

Dataset from the #Microposts2014 NEEL challenge

Entity recognition

Entity recognition + linking

Precision Recall F-measure

31,29% 20,64% 24,88%

Precision Recall F-measure

63,51% 41,91% 50,50%
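The reported F-measures can be cross-checked against the precision and recall values. The Python sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the slides do not state explicitly. Because it starts from the rounded percentages, the recomputed value may differ from the reported one in the last digit.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure), in percent, as listed above.
reported = [
    (31.29, 20.64, 24.88),
    (63.51, 41.91, 50.50),
]

for p, r, f in reported:
    print(f"P={p} R={r} reported F={f} recomputed F={f_measure(p, r):.2f}")
```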


Page 8: Using DBpedia for Spotting and Disambiguating Entities

Future work

Making deeper use of DBpedia:

• Relations among the entities

• Compute the popularity of an entity (i.e. PageRank according to a language)

• Relations between different languages for the same entity

• Using the types of each entity

Using a better algorithm to rank candidates after the lookup
