START: Natural Language Access to Information
Boris Katz, Gary Borchardt, Sue Felshin, Jimmy Lin, Jerome McFarland, Ali Ibrahim, Luciano Castagnola, Baris Temelkuran, Aaron Fernandes, Alp Simsek, Jonathan Wolfe, Matthew Bilotti
MIT Artificial Intelligence Lab
http://www.ai.mit.edu/projects/infolab/
Reality
What we can do:
Understand ordinary sentences and questions
What we can’t do (yet):
1. Full-text NL understanding still beyond reach
• Common sense implication
• Intersentential reference
• Summarization
2. Not all information is language—most Web resources are not textual
• Maps and Images
• Sound and Video
• Multimedia
• Web resources are distributed across numerous non-traditional databases
Bridging the Gap
[Figure: example sentences (“In 1492, Columbus sailed the ocean blue.” “An object at rest tends to remain at rest.” “Four score and seven years ago our forefathers brought forth”) set against the Library of Congress.]
The Solution: Natural Language Annotations
Annotations bridge the gap between our ability to analyze natural language sentences and our desire to access the huge amount of data available in our libraries and on the Web.
Annotations are collections of natural language sentences and phrases that describe the content of various information segments.
START
• analyzes these annotations
• creates the necessary representational structures
• produces special pointers to the information segments summarized by the annotations
Natural Language Annotations
Annotation (from the annotator): “Mars’s year is long.”
Annotated information segment: “... one Mars year lasts 687 Earth days.”
START knowledge base: <year is long> <year related-to Mars>
Questions (from the user):
• “How long is the Martian year?”
• “How long is a year on Mars?”
• “How many days are in a Martian year?”
• …
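The annotation mechanism can be sketched as a small index from T-expressions to information segments. This is only an illustration under stated assumptions: the `AnnotationIndex` class and its method names are hypothetical, and the T-expressions are written by hand here rather than produced by START's parser.

```python
from collections import defaultdict

# A T-expression is modeled as a plain (subject, relation, object) tuple.
TExp = tuple[str, str, str]


class AnnotationIndex:
    """Hypothetical index from annotation T-expressions to information segments."""

    def __init__(self) -> None:
        self._segments: list[str] = []
        self._by_texp: dict[TExp, set[int]] = defaultdict(set)

    def annotate(self, texps: list[TExp], segment: str) -> None:
        """Store a segment and point each annotation T-expression at it."""
        seg_id = len(self._segments)
        self._segments.append(segment)
        for t in texps:
            self._by_texp[t].add(seg_id)

    def lookup(self, query_texps: list[TExp]) -> list[str]:
        """Return segments whose annotation contains every query T-expression."""
        if not query_texps:
            return []
        ids = None
        for t in query_texps:
            found = self._by_texp.get(t, set())
            ids = found if ids is None else ids & found
        return [self._segments[i] for i in sorted(ids)]


index = AnnotationIndex()
index.annotate(
    [("year", "is", "long"), ("year", "related-to", "Mars")],
    "... one Mars year lasts 687 Earth days.",
)
# A question about the Martian year is assumed to yield this T-expression.
answers = index.lookup([("year", "related-to", "Mars")])
```

The point of the design is that the segment itself is never parsed; only its short annotation is, and the pointer leads back to the full segment.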
Parsing
A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.
[Figure: parse tree of the sentence above, with node labels S, NP, VP, PP, N, V, det, prep, noun, and quantity.]
Ternary expressions (T-expressions)
A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.
<chain-1 related-to reactions-1>
<molecules-5 related-to pyruvate-1>
<molecules-5 quantity 2>
<molecules-5 is smaller>
<molecule-1 related-to glucose-1>
<molecule-1 quantifier each>
<<chain-1 convert molecule-1> into molecules-5>
[Figure: the T-expressions above drawn as a graph of nodes and labeled links.]
T-expression Representation
• List of node-link-node triples
• Nouns, adjectives are nodes
• Links cover:
  • relationships between verbs and their arguments
  • fundamental semantic relationships: “is-a” (for equality, membership, and subclass relationships), “related-to” (for possessives, etc.)
  • modification of nouns: “quantifier”, “quantity”, “is” (for adjectives)
  • prepositions
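One way to hold node-link-node triples in code, including the nested form used for prepositions, is a small recursive record. This is a sketch only: the `TExp` class and its field names are assumptions made for illustration, not START's internal representation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TExp:
    """A node-link-node triple; either node may itself be a T-expression."""

    subject: Union[str, "TExp"]
    relation: str
    obj: Union[str, "TExp"]

    def __str__(self) -> str:
        # Render in the angle-bracket notation used on the slides.
        return f"<{self.subject} {self.relation} {self.obj}>"


# The sample sentence about glucose, transcribed from the slides.
sentence = [
    TExp("chain-1", "related-to", "reactions-1"),
    TExp("molecules-5", "related-to", "pyruvate-1"),
    TExp("molecules-5", "quantity", "2"),
    TExp("molecules-5", "is", "smaller"),
    TExp("molecule-1", "related-to", "glucose-1"),
    TExp("molecule-1", "quantifier", "each"),
    TExp(TExp("chain-1", "convert", "molecule-1"), "into", "molecules-5"),
]

rendered = [str(t) for t in sentence]
```

Making the record frozen keeps each triple hashable, so triples can be stored in sets or used as dictionary keys when indexing a knowledge base.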
S-rules for Structural Variation
S-rule for the Property Factoring alternation:
someone1 emotional-reaction-verb someone2 with something
<<someone1 emotional-reaction-verb someone2> with something> <something related-to someone1>
someone1’s something emotional-reaction-verb someone2
<something1 emotional-reaction-verb someone2> <something1 related-to someone1>
The president impressed the country with his determination.
The president’s determination impressed the country.
Emotional reaction verbs:
surprise, stun, amaze, startle, impress, please, embarrass, annoy, etc.
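The Property Factoring alternation can be sketched as a rewrite over T-expressions stored as nested tuples. The function name `property_factoring` and the tuple encoding are assumptions made for illustration. The verb set contains only the verbs named on the slide, and only one direction of the alternation is shown.

```python
# Verbs listed on the slide; the slide's "etc." means the real class is larger.
EMOTIONAL_REACTION_VERBS = {
    "surprise", "stun", "amaze", "startle",
    "impress", "please", "embarrass", "annoy",
}


def property_factoring(texps):
    """Rewrite
        <<someone1 verb someone2> with something>, <something related-to someone1>
    into
        <something verb someone2>, <something related-to someone1>.
    Returns None when the rule does not apply."""
    facts = set(texps)
    for subj, rel, obj in texps:
        # Only nested expressions headed by "with" are candidates.
        if rel != "with" or not isinstance(subj, tuple):
            continue
        someone1, verb, someone2 = subj
        something = obj
        owned = (something, "related-to", someone1)
        if verb in EMOTIONAL_REACTION_VERBS and owned in facts:
            return [(something, verb, someone2), owned]
    return None


# "The president impressed the country with his determination."
before = [
    (("president", "impress", "country"), "with", "determination"),
    ("determination", "related-to", "president"),
]
# "The president's determination impressed the country."
after = property_factoring(before)
```

Restricting the rule to a verb class is what keeps it from firing on sentences where the "with" phrase names an instrument rather than a property of the subject.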
Sample Assertion
A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.
<chain-1 related-to reactions-1>
<molecules-5 related-to pyruvate-1>
<molecules-5 quantity 2>
<molecules-5 is smaller>
<molecule-1 related-to glucose-1>
<molecule-1 quantifier each>
<<chain-1 convert molecule-1> into molecules-5>
Sample Query
How are the glucose molecules converted into pyruvate molecules?
<molecules-5 related-to pyruvate-1>
<molecules-1 related-to glucose-1>
<<something convert molecules-1> into molecules-5>
Matching
[Figure: the Matcher compares T-expressions from the query with T-expressions from the assertion; the query’s “something” node lines up with the assertion’s “chain” node. Key: input processing, query processing.]
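Matching a query against an assertion can be sketched as pattern matching with one variable. This is a simplified illustration, not START's matcher: the function names are hypothetical, and the query's node names are assumed to have been aligned with the assertion's already (the slides index them differently, for example molecules-1 versus molecule-1).

```python
VARIABLE = "something"  # the query's unknown, as written on the slides


def unify(pattern, fact, bindings):
    """Match one query term against one assertion term.
    Returns the extended bindings dict, or None on mismatch."""
    if pattern == VARIABLE:
        if VARIABLE in bindings and bindings[VARIABLE] != fact:
            return None
        return {**bindings, VARIABLE: fact}
    if isinstance(pattern, tuple) and isinstance(fact, tuple):
        if len(pattern) != len(fact):
            return None
        for p, f in zip(pattern, fact):
            bindings = unify(p, f, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == fact else None


def match(query, assertion):
    """Return bindings if every query T-expression matches some assertion one."""
    def solve(remaining, bindings):
        if not remaining:
            return bindings
        first, rest = remaining[0], remaining[1:]
        for fact in assertion:
            extended = unify(first, fact, bindings)
            if extended is not None:
                result = solve(rest, extended)
                if result is not None:
                    return result
        return None
    return solve(list(query), {})


assertion = [
    ("chain-1", "related-to", "reactions-1"),
    ("molecules-5", "related-to", "pyruvate-1"),
    ("molecules-5", "quantity", "2"),
    ("molecules-5", "is", "smaller"),
    ("molecule-1", "related-to", "glucose-1"),
    ("molecule-1", "quantifier", "each"),
    (("chain-1", "convert", "molecule-1"), "into", "molecules-5"),
]
query = [
    ("molecules-5", "related-to", "pyruvate-1"),
    ("molecule-1", "related-to", "glucose-1"),
    (("something", "convert", "molecule-1"), "into", "molecules-5"),
]
bindings = match(query, assertion)
```

The binding for “something” identifies which part of the assertion answers the question, which is what a generator would then need in order to phrase the reply.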
A. Reply by Generating
Query: How are the glucose molecules converted into pyruvate molecules?
Ternary expressions → Generator → Displayed answer
Answer: A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.
B. Reply from annotation
Query: Show me a picture of Cog.
Ternary expressions: <picture related-to Cog>
Find resource → Displayed answer (the annotated resource)
C. Reply from annotation with script
Query: Who directed Gone with the Wind?
T-exps: <any-person directs any-IMDb-movie>
Script:
• get http://us.imdb.com/Details?0031381
• match regexp...
Find resource → Run script (IMDb) → Displayed answer
Answer: Gone with the Wind (1939) was directed by George Cukor, Victor Fleming, and Sam Wood.
Source: The Internet Movie Database
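The "match regexp" step of such a script can be sketched as follows. Everything here is an assumption made for illustration: the `SAMPLE_PAGE` fragment stands in for a fetched page, and the regular expression and function name are hypothetical rather than the pattern or page layout the actual script relied on. No network request is made.

```python
import re

# Hypothetical page fragment standing in for the fetched resource.
SAMPLE_PAGE = """
<h1>Gone with the Wind (1939)</h1>
<b>Directed by</b> George Cukor, Victor Fleming, Sam Wood
"""

# Hypothetical pattern: capture the text that follows a "Directed by" label.
DIRECTED_BY = re.compile(r"<b>Directed by</b>\s*(?P<names>[^<\n]+)")


def extract_directors(page: str) -> list[str]:
    """Return the director names found after the label, or an empty list."""
    m = DIRECTED_BY.search(page)
    if m is None:
        return []
    return [name.strip() for name in m.group("names").split(",") if name.strip()]


directors = extract_directors(SAMPLE_PAGE)
```

Returning an empty list when the pattern fails lets the caller fall back to another resource instead of displaying a malformed answer.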
Uniform Access
[Figure: START takes NL questions and returns multimedia responses; it sends queries to Omnibase and receives data. Sources shown: NASA, POTUS, Webster, U.S. Census, IMDb.]
START
• Local knowledge base of ternary expressions
• Core vocabulary
Omnibase
• Uniform interface to multiple database formats (Web, text, etc.)
• Integration time independent of size of database
• Extended lexicon
How START works
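[Figure: system diagram. A Web browser exchanges English input and HTML/English output with START. Inside START, the Parser produces input T-exps, the Matcher compares them with T-exps from the KB (a database of T-exps), and the Generator produces the reply. Annotations and scripts connect to native knowledge and to Omnibase (external knowledge), whose scripts reach WWW sources such as Potus, IMDb, World Factbook, and U.S. Census.]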