semashup - ensen in aimashup2014 by m.alsarem and p.portier

SEMashup - Mazen Alsarem & Pierre-Edouard Portier

How to enhance Web snippets with Linked Data?
Mazen Alsarem & Pierre-Edouard Portier
Laboratory LIRIS, INSA de Lyon, France

SEMashup


Given the query “epimenides knossos paradox”, we find these snippets among the first results returned by the Google search engine (SE):



We enhance these snippets:



Our snippet highlights an alternative excerpt to better summarize the conceptual content of the document.



Alternative excerpt:



Our snippet also accentuates concepts that are present in the document and related to the user's information need as expressed by her query.



Important concepts:



After clicking the concept “Epimenides”:



Auto-scrolling to an instance of the concept “Epimenides” in the underlying document:



How is it done?



A mashup of Web of Data services

We use the DBpedia Spotlight service to extract concepts from the document.
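A minimal sketch of this extraction step, assuming the public DBpedia Spotlight REST endpoint and its JSON response fields (`Resources`, `@URI`, `@offset`). The slides do not say which deployment or parameters were used, so the URL, the `confidence` value and the function names are illustrative. Keeping the offsets of each instance is what would later allow scrolling to a concept in the document.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Spotlight endpoint; the slides do not name a deployment.
SPOTLIGHT_URL = "https://api.dbpedia-spotlight.org/en/annotate"


def build_request(text, confidence=0.5):
    """Prepare (but do not send) an annotation request for `text`."""
    query = urllib.parse.urlencode({"text": text, "confidence": confidence})
    return urllib.request.Request(
        SPOTLIGHT_URL + "?" + query,
        headers={"Accept": "application/json"},
    )


def parse_concepts(payload):
    """Map each DBpedia URI to the character offsets of its instances."""
    concepts = {}
    for resource in payload.get("Resources", []):
        concepts.setdefault(resource["@URI"], []).append(int(resource["@offset"]))
    return concepts


def extract_concepts(text, confidence=0.5):
    """Call the service and return {URI: [offsets]} (needs network access)."""
    with urllib.request.urlopen(build_request(text, confidence)) as response:
        return parse_concepts(json.load(response))
```

Separating `parse_concepts` from the network call makes the parsing testable offline with a saved response.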



We query a DBpedia SPARQL endpoint to find existing triples between the concepts.
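One way to phrase such a query, as a sketch: restrict both the subject and the object to the extracted concepts with SPARQL 1.1 `VALUES` clauses. The query shape and the helper names are assumptions; the slides do not show the actual query. The parser expects the standard SPARQL JSON results format.

```python
def triples_between_query(uris):
    """Build a SPARQL query for triples whose subject and object are both in `uris`."""
    values = " ".join("<%s>" % uri for uri in uris)
    return (
        "SELECT ?s ?p ?o WHERE {\n"
        "  VALUES ?s { %s }\n"
        "  VALUES ?o { %s }\n"
        "  ?s ?p ?o .\n"
        "}" % (values, values)
    )


def parse_bindings(results):
    """Turn a SPARQL JSON result document into (subject, predicate, object) tuples."""
    return [
        (row["s"]["value"], row["p"]["value"], row["o"]["value"])
        for row in results["results"]["bindings"]
    ]
```

The query string can be sent to any SPARQL endpoint that serves DBpedia, with JSON requested as the result format.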



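[Figure: graph of the DBpedia triples found between the extracted concepts. Nodes: dbp_res:Bertrand_Russell, dbp_res:Logic, dbp_res:Mathematics, dbp_res:Zondervan, dbp_res:Grand_Rapids,_Michigan, dbp_res:Callimachus, dbp_res:Alexandria. Predicates: dbp_ont:mainInterest, dbp_prop:deathPlace, dbp_prop:headquarters.]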



To benefit from Linked Data, we need to select the concepts to extend.

We propose to rank the concepts by their importance relative to the user's information need.

To do this efficiently, we cannot rely only on the small graph we built; we need to go back to the textual content of the document.

Therefore, we introduce a new iterative SVD algorithm.



To each concept, we associate a text made of its abstract and of the sentences of the document that contain its instances.

We build a concept-stem matrix whose entries are frequencies.

We compute a first SVD.

We give more importance to the concepts and the stems close to the query, then compute a second SVD.

In the reduced SVD space, we measure how the norms of the concepts and the stems evolved.
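The steps above can be sketched with NumPy as follows. This is not the authors' algorithm, only one reading of it under stated assumptions: "close to the query" is taken to mean a positive cosine with the centroid of the query stems in the rank-`k` space, and "more importance" is a multiplicative weight controlled by a hypothetical `boost` parameter.

```python
import numpy as np


def reduced_coords(matrix, k):
    """Coordinates of the rows (concepts) and columns (stems) in the rank-k SVD space."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k], vt[:k].T * s[:k]


def cosine_to(vectors, target):
    """Cosine between each row of `vectors` and `target` (0 for near-null rows)."""
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(target)
    out = np.zeros(len(vectors))
    np.divide(vectors @ target, norms, out=out, where=norms > 1e-12)
    return out


def norm_evolution(matrix, query_stems, k=2, boost=2.0):
    """First SVD, stress what is close to the query, second SVD.

    Returns the change in norm of every concept and of every stem.
    """
    concepts1, stems1 = reduced_coords(matrix, k)
    # Assumption: the query is represented by the centroid of its stems.
    query = stems1[query_stems].mean(axis=0)
    # Assumption: weights grow linearly with positive cosine similarity.
    w_concepts = 1.0 + boost * np.clip(cosine_to(concepts1, query), 0.0, None)
    w_stems = 1.0 + boost * np.clip(cosine_to(stems1, query), 0.0, None)
    weighted = matrix * w_concepts[:, None] * w_stems[None, :]
    concepts2, stems2 = reduced_coords(weighted, k)
    return (
        np.linalg.norm(concepts2, axis=1) - np.linalg.norm(concepts1, axis=1),
        np.linalg.norm(stems2, axis=1) - np.linalg.norm(stems1, axis=1),
    )
```

Here `matrix` is the concept-stem frequency matrix (one row per concept, one column per stem) and `query_stems` lists the column indices of the stems that occur in the query.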



[Figure: evolution of the norms of the concepts in the reduced SVD space, between iterations 1 and 2. Labelled concepts include dbp:Epimenides, dbp:Knossos and dbp:Paradox.]



The stems and the concepts that moved the most will be stressed at the next iteration; the stems that barely moved will be removed.

Concepts linked by a predicate to concepts selected for stressing will also be stressed.
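A small sketch of this selection rule. The slides give no thresholds, so `top` (how many movers to stress) and `min_move` (below which a stem counts as not having moved) are hypothetical parameters, as is the function name.

```python
def plan_next_iteration(concept_moves, stem_moves, edges, top=1, min_move=0.05):
    """Decide what to stress and what to drop at the next iteration.

    concept_moves, stem_moves: dict mapping a name to its absolute change of norm.
    edges: iterable of (subject, predicate, object) triples between concepts.
    Returns (stressed_concepts, stressed_stems, removed_stems) as sets.
    """
    def top_movers(moves):
        return set(sorted(moves, key=moves.get, reverse=True)[:top])

    stressed_concepts = top_movers(concept_moves)
    stressed_stems = top_movers(stem_moves)

    # Stems that barely moved are removed.
    removed_stems = {stem for stem, move in stem_moves.items() if move < min_move}

    # Concepts linked by a predicate to a stressed concept are stressed too.
    neighbours = set()
    for subject, _predicate, obj in edges:
        if subject in stressed_concepts:
            neighbours.add(obj)
        if obj in stressed_concepts:
            neighbours.add(subject)

    return stressed_concepts | neighbours, stressed_stems - removed_stems, removed_stems
```

The direction of the predicate is ignored here, which is also an assumption: a concept is stressed whether it is the subject or the object of the link.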



We use a DBpedia SPARQL endpoint to find new triples about the most important resources.

In a pre-processing step, we kept only the DBpedia predicates that carry enough information: we discarded the predicates whose objects, when concatenated, had low entropy.
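A sketch of such a filter, assuming character-level Shannon entropy over the concatenated object strings. The slides do not specify the unit (characters, tokens, or whole values) or the cut-off, so `min_entropy` and both function names are illustrative.

```python
import math
from collections import Counter


def char_entropy(text):
    """Shannon entropy of `text`, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum(
        count / total * math.log2(count / total)
        for count in Counter(text).values()
    )


def informative_predicates(objects_by_predicate, min_entropy=2.0):
    """Keep the predicates whose concatenated objects reach `min_entropy`."""
    return [
        predicate
        for predicate, objects in objects_by_predicate.items()
        if char_entropy("".join(objects)) >= min_entropy
    ]
```

A predicate whose objects are nearly always the same short value scores low and is dropped, while a predicate pointing to many different resources scores high and is kept.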



To rank the triples of the extended graph and build the snippet, we compute a CP (CANDECOMP/PARAFAC) tensor decomposition of the graph.

To take the types of the predicates into account, we decompose a tensor rather than the adjacency matrix: each horizontal slice of the tensor is the adjacency matrix for one given predicate.
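A sketch of the tensor construction and of a CP decomposition by alternating least squares (ALS) in NumPy. ALS is a common way to compute CP, but the slides do not say which solver was used. They also do not say how the factors are turned into a ranking; scoring each triple by its value in the reconstructed tensor is an assumption, and all function names are illustrative.

```python
import numpy as np


def build_tensor(triples):
    """One adjacency-matrix slice per predicate: x[predicate, subject, object]."""
    entities = sorted({t[0] for t in triples} | {t[2] for t in triples})
    predicates = sorted({t[1] for t in triples})
    e_index = {name: i for i, name in enumerate(entities)}
    p_index = {name: i for i, name in enumerate(predicates)}
    x = np.zeros((len(predicates), len(entities), len(entities)))
    for subject, predicate, obj in triples:
        x[p_index[predicate], e_index[subject], e_index[obj]] = 1.0
    return x, e_index, p_index


def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R), giving (I*J x R)."""
    return (a[:, None, :] * b[None, :, :]).reshape(-1, a.shape[1])


def cp_als(x, rank, iters=50, seed=0):
    """Rank-`rank` CP factors of a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in x.shape]
    for _ in range(iters):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            unfolded = np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfolded @ khatri_rao(first, second) @ np.linalg.pinv(gram)
    return factors


def rank_triples(triples, rank=2):
    """Order triples by their value in the CP reconstruction (an assumed scoring)."""
    x, e_index, p_index = build_tensor(triples)
    approx = np.einsum("kr,ir,jr->kij", *cp_als(x, rank))
    scored = [(approx[p_index[p], e_index[s], e_index[o]], (s, p, o)) for s, p, o in triples]
    return [triple for _score, triple in sorted(scored, key=lambda st: st[0], reverse=True)]
```

Because every predicate has its own slice, two entities linked by different predicates contribute to different parts of the tensor, which a decomposition of a single adjacency matrix could not distinguish.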



Thank you!

And, please, come see the live demo!

http://demo.ensen-insa.org
