Knowledge Representation and Semantic Capturing

Albena Strupchanska
Linguistic Modelling Department, Institute for Parallel Processing, Bulgarian Academy of Sciences

A few words about me

Programmer at LMD, 2001-2003
Research Associate at LMD since 2003

Research interests:
  knowledge representation: CGs, LFs in NLU; ontologies, semantic web
  information extraction
  e-learning
  question answering

Knowledge Representation: Conceptual Graphs

Realization of CG operations (generalization, specialization, projection and join)

Integration of CG operations in CGWorld

Use of these operations in several system prototypes (simple question answering, e-learning)
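As an illustration of one of the CG operations named above, here is a minimal sketch of projection over a toy type hierarchy. It is not the CGWorld implementation: the hierarchy, the graph encoding (concept nodes plus relation triples) and all names are assumptions made for the example.

```python
# Hypothetical toy type hierarchy: child type -> parent type.
PARENT = {
    "bond": "financial_instrument", "share": "financial_instrument",
    "financial_instrument": "entity", "bank": "organization",
    "organization": "entity", "buy": "act", "sell": "act",
}

def is_subtype(t, ancestor):
    """True if t equals ancestor or lies below it in the hierarchy."""
    while t is not None:
        if t == ancestor:
            return True
        t = PARENT.get(t)
    return False

def project(query, target):
    """Yield mappings from query nodes to target nodes that preserve every
    relation and map each query concept to an equal or more specific one."""
    q_nodes, q_rels = query
    t_nodes, t_rels = target
    t_rel_set = set(t_rels)
    ids = list(q_nodes)

    def extend(i, mapping):
        if i == len(ids):
            if all((r, mapping[a], mapping[b]) in t_rel_set for r, a, b in q_rels):
                yield dict(mapping)
            return
        q = ids[i]
        for t, t_type in t_nodes.items():
            if is_subtype(t_type, q_nodes[q]):
                mapping[q] = t
                yield from extend(i + 1, mapping)
                del mapping[q]

    yield from extend(0, {})

# A graph is (nodes: {id: type}, relations: [(relation, from_id, to_id)]).
fact = ({"b1": "buy", "k1": "bank", "o1": "bond"},
        [("agnt", "b1", "k1"), ("obj", "b1", "o1")])
query = ({"x": "buy", "y": "organization", "z": "financial_instrument"},
         [("agnt", "x", "y"), ("obj", "x", "z")])
matches = list(project(query, fact))
```

Here the more general query ("an organization buys a financial instrument") projects onto the fact ("a bank buys a bond"), so `matches` holds one mapping. Read in the other direction, the query is a generalization of the fact.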

Knowledge Acquisition from Text

General approach used in a few prototypes that process text in controlled English (restricted domains)

Lexical analysis, named-entity recognition and part-of-speech tagging: GATE

Syntactic analysis: parser developed by Milena Yankova

Result: translation of text into Logical Forms (LFs) and other similar formalisms, e.g. Conceptual Graphs

Knowledge-based approaches

Resources used:
  type hierarchy
  domain knowledge

Attempts to:
  treat negation (prototype developed)
  recognize scenarios (FRET system)

“Naive” Negation Processing

Sentence/Query -> LF -> CG

The question:

"Who does not buy bonds?"

will be translated to:

¬(all(X, bond(X) & buy(Y) & (Y,agnt,Univ) & (Y,obj,X)))

The negation scope is set to the whole sentence.

“Naive” Negation Processing

Construct all possible LFs with localization of the negated phrases:

(2.1) exists(X, ¬bond(X) & buy(Y) & (Y,agnt,Univ) & (Y,obj,X))
(2.2) exists(X, bond(X) & ¬buy(Y) & (Y,agnt,Univ) & (Y,obj,X))
(2.3) exists(X, ¬bond(X) & ¬buy(Y) & (Y,agnt,Univ) & (Y,obj,X))

(2.1) Who buys financial instruments other than bonds?
(2.2) Who performs actions with bonds other than buying them?
(2.3) Who performs actions other than buying with something other than bonds?
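The three localized readings correspond to the non-empty subsets of the negatable predicates. A minimal sketch of that enumeration follows. It assumes that LFs are handled as plain strings and that the caller already knows which atoms may be negated; the function name is hypothetical and not taken from the prototype.

```python
from itertools import combinations

def localized_negations(negatable, fixed):
    """Return one LF string per non-empty subset of negated predicates.

    negatable: atoms that may carry the negation, e.g. ["bond(X)", "buy(Y)"]
    fixed:     atoms that are never negated (thematic relations)
    """
    forms = []
    for k in range(1, len(negatable) + 1):
        for chosen in combinations(range(len(negatable)), k):
            atoms = [("¬" if i in chosen else "") + atom
                     for i, atom in enumerate(negatable)]
            forms.append("exists(X," + "&".join(atoms + fixed) + ")")
    return forms

lfs = localized_negations(["bond(X)", "buy(Y)"],
                          ["(Y,agnt,Univ)", "(Y,obj,X)"])
for lf in lfs:
    print(lf)
```

With two negatable predicates this produces three forms, in the same order as (2.1) to (2.3). With n predicates it would produce 2^n - 1 forms, which is one reason to call the approach naive.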

“Naive” Negation Processing

Every negated concept is replaced by its hierarchical environment:

  every concept corresponding to a verb is replaced by its "antonym or complementary events";
  every object is replaced by the so-called restricted universally quantified concepts.

S(nc) = (Sib(nc) ∪ SonSib(nc)) \ Son(nc), where nc is the negated concept

Projection of the query to the KB of CGs => retrieval of answers
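The replacement set S(nc) can be sketched over a small type hierarchy. The reading used here is an assumption, since the slide does not spell it out: Sib(nc) is taken to be the siblings of nc, SonSib(nc) the sons of those siblings, Son(nc) the sons of nc, and the two sets are combined by union. The hierarchy itself is invented for the example.

```python
# Hypothetical type hierarchy: parent -> set of direct children.
# "convertible" has two parents, so the set difference actually removes something.
CHILDREN = {
    "financial_instrument": {"bond", "share", "option"},
    "bond": {"gov_bond", "convertible"},
    "share": {"preferred_share", "convertible"},
    "option": set(),
}

def sons(concept):
    """Direct children of a concept."""
    return set(CHILDREN.get(concept, ()))

def siblings(concept):
    """Other children of every parent of the concept."""
    sib = set()
    for kids in CHILDREN.values():
        if concept in kids:
            sib |= kids - {concept}
    return sib

def replacement_set(nc):
    """S(nc) = (Sib(nc) | SonSib(nc)) - Son(nc) under the assumed reading."""
    sib = siblings(nc)
    son_sib = set().union(*(sons(s) for s in sib))
    return (sib | son_sib) - sons(nc)

print(sorted(replacement_set("bond")))
```

For the negated concept `bond` the sketch returns the siblings `share` and `option` plus `preferred_share`. It excludes `convertible`, which is also a son of `bond`.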

FRET - Football Reports Extraction of Templates

Semantically driven approach for scenario recognition and templates filling
  deep understanding only in “certain scenario-relevant points” by elaborating inference mechanisms
  LF representation for effective inference

Text: football reports with specific paragraph structure (tickers for each minute)

FRET’s Architecture

[Figure: flowchart of FRET's architecture. Components shown: Text, Text Preprocessor, Resource Bank, Logical Form Translator, Templates Filler, Direct Matching, Inference Matching, Filling Templates, KB of filled template forms, STOP. Decision branches are labelled yes/no.]

FRET - Resource Bank

Lexicon
Grammar rules
Rules for translation into logical form
Graphs of events
  description of the domain events (nodes) and relations (arcs) between them
Templates description (uninstantiated LFs)

FRET - Graph of Events

Three types of events (nodes in a directed graph):

Main event - LF description of obligatory and optional fields of the template and relations between them

Base events - LF of most important self-dependent events in the chosen domain

Sub-events - kinds of base events that are immediately connected to the main event (i.e. there exists an arc between the nodes of the main and the sub-events)

FRET - Graph of Events

Four types of relations (each an arc with an associated weight in the graph):

  Event E2 invalidates event E1, i.e. event E2 happens after E1 and annuls it.
  Event E1 entails event E2, i.e. when E1 happens, E2 always happens at the same time.
  Event E1 enables event E2, i.e. event E1 happens before the beginning of event E2 and event E1 is a precondition for E2.
  Event E2 is a part of event E1.

FRET - Graph of Events

Base Event: Player shoots the ball.
LF: time(Minute) & Action2(A2) & theta(A2,agnt,Player) & theta(A2,obj,D) & ball(D)

Base Event: The ball is in the net.
LF: time(Minute) & ball(D) & theta(D,into,G) & Net(G)

Sub Event: Player’s shot hits the net.
LF: time(Minute) & Action1(A1) & theta(A1,agnt,B) & shot(B) & theta(B,poss,Player) & theta(A1,obj,G) & Net(G)

Main Event: Player scores.
LF
  Obligatory: time(Minute) & Score(A) & theta(A,agnt,Player)
  Optional: Action1(C) & ball(D) & theta(C,agnt,Player) & theta(C,obj,D) & Location(E) & theta(C,Loc,E) & Action2(F) & theta(F,agnt,Assistant) & theta(F,obj,D) & theta(F,to,Player)

Arc labels shown in the graph: is a part of, enables, entails
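One possible in-memory encoding of such a graph of events is sketched below. The class and method names are hypothetical and the weights are placeholders. The arc endpoints are an assumed reading of the figure, which the transcript does not preserve: the shot enables the sub-event, the ball being in the net is a part of it, and the sub-event entails the main event.

```python
from dataclasses import dataclass, field

RELATIONS = {"invalidates", "entails", "enables", "part_of"}

@dataclass
class EventGraph:
    events: dict = field(default_factory=dict)  # name -> (kind, LF string)
    arcs: list = field(default_factory=list)    # (source, relation, target, weight)

    def add_event(self, name, kind, lf):
        self.events[name] = (kind, lf)

    def add_arc(self, source, relation, target, weight=1.0):
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        if source not in self.events or target not in self.events:
            raise KeyError("both endpoints must be known events")
        self.arcs.append((source, relation, target, weight))

    def related(self, source, relation):
        """Targets reachable from source over one arc of the given relation."""
        return [t for s, r, t, _ in self.arcs if s == source and r == relation]

g = EventGraph()
g.add_event("player_shoots_ball", "base",
            "time(Minute) & Action2(A2) & theta(A2,agnt,Player) & theta(A2,obj,D) & ball(D)")
g.add_event("ball_in_net", "base",
            "time(Minute) & ball(D) & theta(D,into,G) & Net(G)")
g.add_event("shot_hits_net", "sub",
            "time(Minute) & Action1(A1) & theta(A1,agnt,B) & shot(B) & "
            "theta(B,poss,Player) & theta(A1,obj,G) & Net(G)")
g.add_event("player_scores", "main",
            "time(Minute) & Score(A) & theta(A,agnt,Player)")

# Assumed endpoints; weights are placeholders, not values from FRET.
g.add_arc("player_shoots_ball", "enables", "shot_hits_net")
g.add_arc("ball_in_net", "part_of", "shot_hits_net")
g.add_arc("shot_hits_net", "entails", "player_scores")
```

Storing arcs as plain tuples keeps relation lookups trivial. An adjacency map keyed by source event would be the natural next step for a larger domain.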

FRET - Identification of Negation

Explicit negation
  Short sentences containing “No”
  Complete sentences containing “Not/Non/No”
  Both cases: marker NEG attached to the LF of the previous sentence or succeeding part of the sentence

Implicit negation
  Sentences with “but”, “however”, “although”
  Markers: ‘BAHpos’ and ‘BAHneg’

Markers are inserted during the parsing process

FRET - Negation

Sentence:

79 mins: Henry fires at goal, but misses from a tight angle.

Logical forms:

time(79) & fire(A) & (A,agnt,‘Henry’) & (A,at,B) & goal(B) & marker(‘BAHpos’,7).

time(79) & miss(A) & (A,agnt,‘Henry’) & (A,from,B) & angle(B) & (B,char,C) & tight(C) & marker(‘BAHneg’,7).

FRET - Treatment of Negation

Interpretation of marked LFs

NEG
  the matching result is ignored

BAHpos or BAHneg
  there are two possible interpretations:
    negation
    conjunction of independent statements
  the algorithm checks whether the dual LFs marked with these markers can be matched to events connected with an invalidate relation in the graph
  if this succeeds, the previous matching is ignored.
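The marker interpretation can be sketched as a filter over already matched events. This is an illustrative reading, not FRET's code. It assumes that each LF has already been matched to an event name, that the number inside `marker(...)` pairs a BAHpos LF with its BAHneg counterpart, and that the invalidate arcs are available as (E2, E1) pairs. The event names are invented.

```python
def keep_matches(marked, invalidates):
    """Filter matched events according to their negation markers.

    marked:      list of (event, marker, pair_id) in text order, where marker
                 is None, "NEG", "BAHpos" or "BAHneg"
    invalidates: set of (e2, e1) pairs meaning "e2 invalidates e1"
    """
    dropped = set()
    pos_by_id = {}
    for i, (event, marker, pair_id) in enumerate(marked):
        if marker == "NEG":
            dropped.add(i)                      # matching result is ignored
        elif marker == "BAHpos":
            pos_by_id[pair_id] = i              # remember the first half
        elif marker == "BAHneg" and pair_id in pos_by_id:
            j = pos_by_id[pair_id]
            if (event, marked[j][0]) in invalidates:
                dropped.add(j)                  # negation: drop previous match
            # otherwise: conjunction of independent statements, keep both
    return [e for i, (e, _, _) in enumerate(marked) if i not in dropped]

# "Henry fires at goal, but misses from a tight angle."
marked = [("shot_at_goal", "BAHpos", 7), ("miss", "BAHneg", 7)]
print(keep_matches(marked, {("miss", "shot_at_goal")}))
print(keep_matches(marked, set()))
```

With the invalidate arc present only `miss` survives. Without it both events are kept, which corresponds to the conjunction reading.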

FRET - Templates Filling

The templates filler performs two main steps:
  Matching LFs, based on a modification of the unification algorithm
  Filling templates

The templates filler processes those LFs that are produced from the so-called extended paragraph. Thus each paragraph is treated separately.
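The slides do not describe FRET's modification of unification, so the following is only a textbook-style sketch of the underlying idea: one-way matching of an uninstantiated template LF against ground atoms from a sentence LF. Atoms are encoded as tuples whose first element is the predicate name. Strings that start with an upper-case letter are treated as variables (a Prolog-like convention assumed here), and constants are written in lower case.

```python
def is_var(term):
    """Variables are strings starting with an upper-case letter."""
    return isinstance(term, str) and term[:1].isupper()

def unify_atom(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact; return None on failure."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    new = dict(bindings)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if new.setdefault(p, f) != f:   # already bound to something else
                return None
        elif p != f:
            return None
    return new

def match(template, facts, bindings=None):
    """Return bindings under which every template atom matches some fact."""
    bindings = {} if bindings is None else bindings
    if not template:
        return bindings
    first, rest = template[0], template[1:]
    for fact in facts:
        extended = unify_atom(first, fact, bindings)
        if extended is not None:
            result = match(rest, facts, extended)
            if result is not None:
                return result
    return None

# Obligatory part of the "Player scores" template.
template = [("time", "Minute"), ("score", "A"), ("theta", "A", "agnt", "Player")]
facts = [("time", 79), ("score", "a1"), ("theta", "a1", "agnt", "henry")]
filled = match(template, facts)
```

The returned bindings are exactly what a template-filling step needs: here `Minute` is bound to 79 and `Player` to `henry`.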

FRET - Matching Algorithm

Direct matching
  each LF from the extended paragraph is matched to the main event

Inference matching
  uses inference rules and the knowledge base
  the FRET inference-matching algorithm derives an inference from:
    base events' LFs => sub-events' LFs => main event LF
  if there is necessary information about some sub-event => consider the type of relation between this sub-event and the main event => either recognize the main event or not
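A minimal sketch of the final decision step follows, under a deliberately narrow assumption: only the "entails" relation is taken to license recognizing the target event, since it is the only relation defined as "when E1 happens E2 always happens". How FRET actually uses the other relations and the arc weights is not stated in the slides, so they are ignored here. Event names are invented.

```python
def entailed_closure(recognized, arcs):
    """Return all events that follow from the recognized ones via 'entails'.

    recognized: iterable of event names already matched in the text
    arcs:       list of (source, relation, target) triples
    """
    known = set(recognized)
    changed = True
    while changed:
        changed = False
        for source, relation, target in arcs:
            if relation == "entails" and source in known and target not in known:
                known.add(target)
                changed = True
    return known

arcs = [
    ("player_shoots_ball", "enables", "shot_hits_net"),
    ("shot_hits_net", "entails", "player_scores"),
]

# The sub-event was found in the text: the main event is recognized.
print("player_scores" in entailed_closure({"shot_hits_net"}, arcs))
# Only the enabling base event was found: the main event is not recognized.
print("player_scores" in entailed_closure({"player_shoots_ball"}, arcs))
```

The loop repeats until no new event is added, so chains of entailment longer than one arc are also followed.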

Advantages and disadvantages

+ Logical forms: convenient formalism for making inferences
+ Knowledge representation as a graph of events
+ Partial parsing (better to understand something than nothing)

- Creation of the graph of events (nodes represented as LFs) and templates (represented as LFs)

- Narrow and restricted domains (not scalable)

Conclusion

Knowledge-based approaches are successful when they are applied to specific domains

Choice of domain representation formalism is crucial for semantic capturing

Domain modelling is difficult and time-consuming

Much effort goes into semantic capturing of simple cases. Probably, when these cases are the right ones, the end justifies the means

Thank you!

Any questions?