March 17, 2008
SAC WT 2008 1
Hermes: a Semantic Web-Based News Decision Support System*
Flavius Frasincar [email protected]
Erasmus University Rotterdam
* Joint work with Jethro Borsje and Leonard Levering
Contents
• Motivation
• Hermes Framework:
  1. News Classification
  2. Query Formulation
  3. Results Presentation
• Hermes News Portal:
  – An example
• Conclusions
• Future Work
Motivation
• Large quantity of news on the Web:
  – Difficult to find the items of interest
• Limited annotation of RSS feeds:
  – Broad categories (business, cars, entertainment, etc.)
• News messages have a big impact on stock prices
• Google Finance shows direct news pertaining to a certain portfolio:
  – Indirect news (e.g., about competitors of Google such as Microsoft) is not presented
  – Not possible to ask (time-related) queries about news
Hermes Framework
• Input:
  – News items from RSS feeds
  – Domain ontology linked to a semantic lexicon (e.g., WordNet)
• Output:
  – News items relevant for a particular user
• Three steps:
  – News Classification:
    • Relate news items to ontology concepts
  – Query Formulation:
    • Allow the user to express their concepts of interest
  – Results Presentation:
    • Present the news items that match the user’s concepts of interest
1. News Classification
• A concept is defined in the ontology (as a class or an individual)
• Multiple lexical representations for the same concept:
  – Ontology synonyms (e.g., New York → New York, Big Apple)
  – Semantic lexicon synonyms (e.g., buy → acquire)
• Concepts without subclasses or instances:
  – Semantic lexicon hyponyms (e.g., company → dot-com)
• Look up ontology concepts in news items
• Heuristic: at least three hits (concepts) in a news item
• Work in progress: use a word sense disambiguation algorithm (e.g., SSI, GAMBL)
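The lookup step and the three-hit heuristic can be sketched as follows. This is a minimal illustration, not the Hermes implementation: the lexicon contents, the function names, and the reading of "three hits" as three distinct concepts are all assumptions made for the example.

```python
import re

# Hypothetical lexical representations per ontology concept (illustrative data only).
LEXICON = {
    "Google": ["Google"],
    "Microsoft": ["Microsoft"],
    "NewYork": ["New York", "Big Apple"],
    "Buy": ["buy", "acquire"],
}

MIN_HITS = 3  # heuristic threshold taken from the slide


def find_concepts(text, lexicon):
    """Return the set of concepts with at least one lexical form found in the text."""
    found = set()
    for concept, forms in lexicon.items():
        for form in forms:
            # Whole-word, case-insensitive match of one lexical representation.
            if re.search(r"\b" + re.escape(form) + r"\b", text, flags=re.IGNORECASE):
                found.add(concept)
                break
    return found


def classify(text, lexicon, min_hits=MIN_HITS):
    """Relate the item to its concepts only if enough distinct concepts were hit."""
    concepts = find_concepts(text, lexicon)
    return concepts if len(concepts) >= min_hits else set()
```

With this sketch, an item mentioning Google, "acquire", and "Big Apple" is related to three concepts, while an item mentioning only Google is left unclassified.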
1. News Classification
• The news classification process:
2. Query Formulation
• Present the domain knowledge as a directed labeled multigraph:
  – with the additional constraint that arcs between the same two nodes are not allowed to share the same label
• User selects the concepts of interest in the original graph (e.g., Google)
• User is able to add to the selection concepts related to the concepts of interest via a certain relation (e.g., hasCompetitors: Microsoft, eBay, and Yahoo)
• The selected concepts are presented in a separate graph (called the search graph)
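The graph constraint and the relation-based expansion can be illustrated with a small sketch. The class and function names are hypothetical, and storing arcs as (source, label, target) triples is just one convenient way to rule out two arcs with the same label between the same pair of nodes. The Google example data comes from the slide.

```python
class ConceptGraph:
    """Directed labeled multigraph stored as a set of (source, label, target) arcs."""

    def __init__(self):
        self._arcs = set()

    def add_arc(self, source, label, target):
        arc = (source, label, target)
        # Two arcs between the same nodes may not share a label.
        if arc in self._arcs:
            raise ValueError(f"duplicate arc: {arc}")
        self._arcs.add(arc)

    def related(self, source, label):
        """Concepts reachable from `source` over arcs carrying `label`."""
        return sorted(t for s, l, t in self._arcs if s == source and l == label)


def build_search_graph(graph, selected, expansions):
    """Collect the selected concepts plus those reached via (concept, label) expansions."""
    concepts = set(selected)
    for concept, label in expansions:
        concepts.update(graph.related(concept, label))
    return concepts


graph = ConceptGraph()
for competitor in ("Microsoft", "eBay", "Yahoo"):
    graph.add_arc("Google", "hasCompetitors", competitor)

search_concepts = build_search_graph(graph, ["Google"], [("Google", "hasCompetitors")])
```

Here `search_concepts` holds Google together with its three competitors, which is the set of nodes the search graph would display.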
2. Query Formulation
• News items are time-stamped
• User is able to specify that only news in a certain time interval should be retrieved
• Time constraints:
  – Last hour
  – Last day
  – Last year
  – [2007-03-01T00:00:00.000+00:01, 2007-05-31T00:00:00.000+00:01]
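One way to turn the relative constraints into concrete intervals is sketched below. The names `WINDOWS`, `resolve`, and `in_interval` are hypothetical, and treating "last year" as 365 days is a simplifying assumption. The slides do not say how Hermes resolves these constraints.

```python
from datetime import datetime, timedelta, timezone

# Assumed window lengths for the relative constraints listed above.
WINDOWS = {
    "last hour": timedelta(hours=1),
    "last day": timedelta(days=1),
    "last year": timedelta(days=365),  # simplification: ignores leap years
}


def resolve(constraint, now):
    """Translate a relative constraint into a closed interval (start, end)."""
    return (now - WINDOWS[constraint], now)


def in_interval(timestamp, interval):
    """Check whether a news item's timestamp falls inside the interval."""
    start, end = interval
    return start <= timestamp <= end


now = datetime(2008, 3, 17, 12, 0, tzinfo=timezone.utc)
last_hour = resolve("last hour", now)
```

An explicit interval such as the one in the last bullet would simply be passed to `in_interval` as a pair of timezone-aware datetimes.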
3. Results Presentation
• Return news items that match a query
• Present the concepts involved in the query
• For each news item, show a summary:
  – Title
  – Source
  – Date
  – A few lines from the news item
• Emphasize the hits (found concepts from the ontology) in the retrieved news items
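Emphasizing hits amounts to marking every matched lexical form in the displayed text. The sketch below wraps hits in square brackets. The function name and the bracket markers are illustrative assumptions, since the slides do not describe how the portal renders the emphasis.

```python
import re


def emphasize_hits(text, lexical_forms, left="[", right="]"):
    """Wrap every whole-word occurrence of a lexical form in emphasis markers."""
    if not lexical_forms:
        return text
    # Try longer forms first so multi-word forms win over their substrings.
    forms = sorted(lexical_forms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(form) for form in forms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: left + match.group(0) + right, text)
```

For instance, calling `emphasize_hits` on a summary line with the forms "Google" and "New York" leaves the text unchanged except for brackets around those two hits.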
Hermes News Portal
• Hermes News Portal (HNP) is an implementation of the Hermes framework
• Implementation language: Java
• Ontology representation language: OWL
• Semantic lexicon: WordNet
• Graph visualization: Prefuse
• Query language: SPARQL
• SPARQL extended with custom time functions (e.g., currentDate(), currentTime())
An Example
• Query:
  Which news items from the past three months are relevant to Google?
1. News Classification
• Conceptual graph:
2. Query Formulation
• Concepts selection:
2. Query Formulation
• Conceptual graph (figure legend: individuals, classes, selected concepts, concepts related to the selected node, concepts from keyword search)
2. Query Formulation
• Search graph:
2. Query Formulation
• SPARQL query:
PREFIX hermes: <http://hermes-news.org/news.owl#>
SELECT ?title
WHERE {
  ?news hermes:title ?title .
  ?news hermes:time ?date .
  ?news hermes:relation ?relation .
  ?relation hermes:relatedTo hermes:Google .
  FILTER(
    ?date > "2007-03-01T00:00:00.000+00:01" &&
    ?date < "2007-05-31T00:00:00.000+00:01"
  )
}
2. Query Formulation
• Custom time functions:
Function name                                        Output type
currentDate()                                        xsd:date
currentTime()                                        xsd:time
now()                                                xsd:dateTime
dateTime-add(xsd:dateTime A, xsd:duration B)         xsd:dateTime
dateTime-substract(xsd:dateTime A, xsd:duration B)   xsd:dateTime
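Subtracting a month-based duration from a dateTime needs calendar arithmetic, because months differ in length. The sketch below shows one way to subtract a whole number of months. The helper name `subtract_months` is hypothetical, and clamping the day to the last valid day of the target month is an assumption about the intended semantics, not something the slides specify.

```python
import calendar
from datetime import datetime


def subtract_months(moment, months):
    """Move a datetime back by whole calendar months, clamping the day if needed."""
    # Work on a zero-based month index so year boundaries are handled by divmod.
    index = moment.year * 12 + (moment.month - 1) - months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))
```

Under the clamping assumption, going back three months from 31 May 2007 lands on 28 February 2007 rather than on an invalid date.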
2. Query Formulation
• Extended SPARQL query:
PREFIX hermes: <http://hermes-news.org/news.owl#>
SELECT ?title
WHERE {
  ?news hermes:title ?title .
  ?news hermes:time ?date .
  ?news hermes:relation ?relation .
  ?relation hermes:relatedTo hermes:Google .
  FILTER(
    ?date > hermes:dateTime-substract(hermes:now(), P0Y3M) &&
    ?date < hermes:now()
  )
}
3. Results Presentation
Conclusions
• Hermes Framework: presents news items that match the user's interests
• Hermes Framework:
  – News Classification
  – Query Formulation
  – Results Presentation
• Hermes News Portal (HNP): an implementation of the Hermes framework
• HNP is based on:
  – WordNet semantic lexicon, OWL ontology, (extended) SPARQL queries, Prefuse visualization
Future Work
• Word Sense Disambiguation:
  – SSI
  – GAMBL
• Ontology updates:
  – Learning from news items
  – Check whether the extracted information obeys the ontology axioms:
    • Faulty extraction
    • Ontology axioms update
• Simplify the query interface:
  – Allow users to pose queries in English using a limited vocabulary
• Evaluate the tool outside the university lab