Post on 19-Dec-2015

A Framework for Pay-as-you-go Extraction Ontology Based Information Retrieval

Andrew Zitzelberger

Problem

• Keyword search doesn’t work well for high precision

• Domain ontologies take a long time to build

Pay-as-you-go

• Keyword Search
• Basic Data Frames
• Derived Attributes
• Interconnected Ontologies
• Domain Ontologies
• Data Frame Hierarchies
• Relationship Data Frames

OSM-O Ontologies

Decidable!

OSM-EO Ontologies

• OSM-O Ontologies with data frames for object and relationship sets.
– Recognition
– Linguistic grounding
– Understanding

Keyword Search

• Honda 2003 or newer for under 15 grand with under 180K miles on it.

Keyword Search

• Honda - 170 Results

• Price max of 15 grand 15

– 15,000 works (kind of)

Number Data Frame

• Number
– Internal representation: Double
– External representation: [1-9]\d*|[1-9]\d{2},\d{3}+|…
– Units: K=1000; [Gg]rand=1000; million=1000000; ...
– Methods:
    • Greater than: (greater than|over|above|more than|>|…)\s+{Number}
    • Less than: (less than|under|below|<|…)\s+{Number}
    • …
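To make the Number data frame concrete, here is a minimal Python sketch of how its units and comparison methods could be applied to a query. It is an illustration under assumptions, not the system's implementation: the names `UNITS`, `NUMBER`, `METHODS`, `to_double`, and `extract_number_constraints` are hypothetical, the external representation is simplified, and only the comparison phrases spelled out on the slide are included (the elided "…" alternatives are left out).

```python
import re

# Unit multipliers as listed on the slide (keys lowercased for lookup).
UNITS = {"k": 1_000, "grand": 1_000, "million": 1_000_000}

# Simplified external representation (an assumption): digits with optional
# thousands separators, then an optional unit word.
NUMBER = (
    r"(?P<digits>\d{1,3}(?:,\d{3})+|\d+)"
    r"\s*(?P<unit>[Kk]|[Gg]rand|million)?\b"
)

# Comparison phrases from the slide, keyed by the operator they express.
METHODS = {
    ">": r"(?:greater than|over|above|more than|>)",
    "<": r"(?:less than|under|below|<)",
}


def to_double(digits, unit):
    """Convert the matched text to the internal representation (a float)."""
    value = float(digits.replace(",", ""))
    if unit:
        value *= UNITS[unit.lower()]
    return value


def extract_number_constraints(query):
    """Return (operator, value) pairs for every comparison phrase found."""
    constraints = []
    for op, phrase in METHODS.items():
        pattern = re.compile(phrase + r"\s+" + NUMBER, re.IGNORECASE)
        for m in pattern.finditer(query):
            constraints.append((op, to_double(m.group("digits"), m.group("unit"))))
    return constraints
```

Run on the example query, this sketch yields the two "under" constraints (15000 and 180000). The "2003 or newer" constraint would need an additional method pattern that the slide does not spell out.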

Number Method Extraction

• Honda 2003 or newer for under 15 grand with under 180K miles on it.
– (Number < 15000), (Number < 180000), (Number >= 2003)
– (2003 <= Number < 15000)

• No change in results. Why?
– Dates, Times

• Miles keyword problem

Data Frame Hierarchies

Method Extraction

• Honda 2003 or newer for under 15 grand with under 180K miles on it.
– (Year >= 2003), (Price < 15000), (Mileage < 180000)

• Significant result reduction.
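One way to picture a data frame hierarchy is a tree in which Year, Price, and Mileage specialize Number, and a mention is assigned to the most specific frame whose recognizer accepts it. The Python sketch below assumes this design. The `Frame` class and every recognizer pattern in it are hypothetical stand-ins, not taken from the slides.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Frame:
    name: str
    recognizer: str                      # regex for the external representation
    children: list = field(default_factory=list)

    def classify(self, mention):
        """Return the most specific frame name that accepts the mention."""
        if re.fullmatch(self.recognizer, mention, re.IGNORECASE) is None:
            return None
        for child in self.children:      # try specializations first
            specific = child.classify(mention)
            if specific is not None:
                return specific
        return self.name


# Hypothetical recognizers; a real library would be richer and context-aware.
YEAR = Frame("Year", r"(?:19|20)\d{2}")
PRICE = Frame("Price", r"\$\d[\d,]*\s*(?:k|grand)?|\d[\d,]*\s*(?:k\s*dollars|grand|dollars)")
MILEAGE = Frame("Mileage", r"\d[\d,]*\s*k?\s*miles")
NUMBER = Frame(
    "Number",
    r"\$?\d[\d,]*\s*(?:k|grand)?(?:\s*(?:miles|dollars))?",
    children=[YEAR, PRICE, MILEAGE],
)
```

With these assumed patterns, `NUMBER.classify("2003")` resolves to Year, `"15 grand"` to Price, and `"180K miles"` to Mileage, while a bare `"42"` stays a generic Number. The first matching child wins in this sketch, so ambiguous mentions (a four-digit price, for instance) would need context that is not modeled here.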

Relationship Data Frames

• {CountryName-Make}
– {CountryName}\s+(makes|manufactures|…)\s+{Make}

• {Make-CountryName}
– {Make}\s+(is\s)?(made in|…)\s+{CountryName}
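The relationship patterns above can be read as regex templates whose {ObjectSet} placeholders are filled in with object-set recognizers. A minimal Python sketch of that reading follows. The tiny `OBJECT_SETS` lexicons and the helper names are hypothetical, and the alternatives elided with "…" on the slide are left out rather than guessed.

```python
import re

# Hypothetical object-set recognizers; real ones would come from data frames.
OBJECT_SETS = {
    "CountryName": r"Japan|Germany",
    "Make": r"Honda|Toyota|BMW",
}

# Relationship templates from the slide, minus the elided alternatives.
RELATIONSHIPS = {
    "CountryName-Make": r"{CountryName}\s+(?:makes|manufactures)\s+{Make}",
    "Make-CountryName": r"{Make}\s+(?:is\s)?(?:made in)\s+{CountryName}",
}


def compile_relationship(template):
    """Replace each {ObjectSet} placeholder with a named capture group."""
    def expand(match):
        name = match.group(1)
        return f"(?P<{name}>{OBJECT_SETS[name]})"

    return re.compile(re.sub(r"\{(\w+)\}", expand, template))


def extract_relationships(text):
    """Return (relationship name, participating objects) for each match."""
    found = []
    for name, template in RELATIONSHIPS.items():
        for m in compile_relationship(template).finditer(text):
            found.append((name, m.groupdict()))
    return found
```

For example, `extract_relationships("Honda is made in Japan")` reports one Make-CountryName instance linking Honda and Japan.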

Domain Ontology

Derived Attributes

• if Make in {JapanMake} then Japan
• if Make in {GermanMake} then German
• if …
• else …
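The derived-attribute rule above amounts to a lookup from an extracted Make to a value that never appears in the text itself. A minimal Python sketch, assuming small hypothetical membership sets for {JapanMake} and {GermanMake}, and returning `None` as a placeholder for the branches the slide leaves elided:

```python
# Hypothetical membership sets standing in for {JapanMake} and {GermanMake}.
JAPAN_MAKES = {"Honda", "Toyota", "Nissan"}
GERMAN_MAKES = {"BMW", "Audi", "Volkswagen"}


def derive_country(make):
    """Derive the country attribute from an extracted Make value."""
    if make in JAPAN_MAKES:
        return "Japan"      # value as written on the slide
    if make in GERMAN_MAKES:
        return "German"     # value as written on the slide
    return None             # placeholder: the slide elides the remaining rules


def filter_by_country(listings, country):
    """Keep listings (dicts with a 'make' key) whose derived country matches."""
    return [item for item in listings if derive_country(item["make"]) == country]
```

With this in place, a query about Japanese cars could be answered from listings that mention only "Honda" or "Toyota", which is the point of deriving the attribute instead of extracting it.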

Interconnected Ontology

Interesting Problems

• Resolving matches across disconnected ontologies

• Choosing the extent of an ontology for extraction

• Adding relationship data frames to extraction processing

• Efficiently choosing the context ontologies when the library becomes large

User Interface

• Traditional text box for search
• Radio options:
– Automatic: Run the system and give me what you get
– Feedback: Run the form feedback loop
– Exact: Let me pick/build the ontology/data frames I want

Form Feedback

• System understanding displayed in a form

• User can modify form for a more structured query

• User can change ontology or append new data frames

Interesting Problems / Contributions

• Representing relationships and derived attributes in the form and ontology editor

• Quick, intuitive way to add data frames from global library
– Suggestions
– Match tests

Architecture

• System starts with keyword search and small personal data frame library

• Can submit to or retrieve from larger global library

The Goal

Future Work

• Knowledge Bundles rather than simple IR
– Extraction relative to ontology from multiple sources

• Relationally complete forms