
Page 1: Shallow semantic parsing: Making most of limited training data

Shallow semantic parsing: Making most of limited training data

Katrin Erk

Sebastian Pado

Saarland University

Page 2: Shallow semantic parsing: Making most of limited training data

Introduction

• Frame semantics:
  – "Who does what to whom" analysis: senses and roles
  – Cross-lingual appeal (Boas 2005)

• Prerequisite for use in NLP: automatic, robust, accurate methods for the analysis of free text

• Predominant machine learning paradigm: supervised classification (see the sketch below)
  – Learn the relation between features and classes from a training corpus; guess classes in the test corpus
  – Gildea and Jurafsky (2002) and many since
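As a concrete (toy) illustration of this paradigm, and not of the actual features or learners used in the cited work, a scikit-learn style sketch might look like this:

```python
# Toy supervised-classification sketch (illustrative features and data;
# not the classifiers or feature sets of the cited work).
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# training corpus: feature dicts and their frame labels
train_feats = [
    {"lemma": "reach", "pos": "VBD", "subj_head": "I"},
    {"lemma": "look", "pos": "VB", "prep": "through"},
]
train_labels = ["Arriving", "Perception_active"]

vec = DictVectorizer()
X_train = vec.fit_transform(train_feats)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, train_labels)

# test corpus: guess classes for unseen instances with the same feature schema
test_feats = [{"lemma": "reach", "pos": "VBZ", "subj_head": "she"}]
print(clf.predict(vec.transform(test_feats)))
```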

Page 3: Shallow semantic parsing: Making most of limited training data

Frame-semantic analysis

• Step 1: Frame disambiguation
  – WSD-style classification of the predicate in terms of frames

• Step 2: Role assignment
  – Classification of syntactic nodes in terms of role labels (see the pipeline sketch below)
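Read as a pipeline, the two steps could be wired up as in the following sketch; the classifier objects and method names are hypothetical stand-ins, not Shalmaneser's actual code.

```python
# Illustrative two-step analysis (hypothetical classifier objects,
# not the actual Shalmaneser implementation).

def analyse(sentence, parse_nodes, frame_classifier, role_classifier):
    # Step 1: frame disambiguation -- WSD-style classification of the predicate
    frame = frame_classifier.predict(sentence)            # e.g. "Arriving"

    # Step 2: role assignment -- classify parse nodes in terms of role labels
    roles = {}
    for node in parse_nodes:
        label = role_classifier.predict(node, frame)      # e.g. "Goal", or None
        if label is not None:
            roles[node] = label
    return frame, roles
```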

Page 4: Shallow semantic parsing: Making most of limited training data

Frame-semantic analysis

Example sentence: "Creeping in its shadow I reached a point whence I could look straight through the uncurtained window." (A. Conan Doyle, The Hound of the Baskervilles)

Page 5: Shallow semantic parsing: Making most of limited training data

Problems of supervised learning setting

• Coverage:
  – lemmas may be missing
  – frames may be missing

• Languages other than English:
  – Training data may not be available
  – Can we take advantage of existing resources for English?

Page 6: Shallow semantic parsing: Making most of limited training data

Today’s talk

• Shalmaneser: a system for automatic frame-semantic analysis

• Unknown sense detection: dealing with missing frames

• Annotation projection for cross-lingual data creation

• Summary

Page 7: Shallow semantic parsing: Making most of limited training data

Shalmaneser: Automatic frame-semantic analysis

• Assignment of
  – senses (frames) to predicates
  – semantic roles

• Aim: easy to use, for exploring applications of frame-semantic analysis
  – Input: plain text
  – Syntactic preprocessing integrated
  – Visualization with the SALTO tool

Page 8: Shallow semantic parsing: Making most of limited training data

Shalmaneser: Automatic frame-semantic analysis

• Semantic analysis as supervised learning tasks
  – Pre-trained classifiers available for English (FrameNet) and German (SALSA)

• Performance of the English models:
  – Frame assignment: accuracy 0.93, baseline 0.89
    • The baseline is high because some senses are missing
  – Role assignment:
    • Role recognition: F-score 0.75
    • Role labeling: accuracy 0.78
  – Not top-scoring, but okay. The focus is on ease of use and on flexibility.

Page 9: Shallow semantic parsing: Making most of limited training data

Shalmaneser: Flexibility

• Processing steps linked only by an interface format: Salsa/Tiger XML (Erk & Pado 2004)
  – Adding a module: it just needs to speak Salsa/Tiger XML

• Model features are specified in an experiment file and can be changed easily

• A new parser is added by instantiating an interface class (illustrative sketch below)

• New language: only the syntactic preprocessing changes
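As a purely hypothetical illustration of the "interface class" idea (class and method names below are invented, not Shalmaneser's real API), a new parser would only need to produce Salsa/Tiger XML behind a shared interface:

```python
# Hypothetical parser-interface sketch: any parser that can emit
# Salsa/Tiger XML can be plugged into the pipeline.  Names are
# illustrative, not Shalmaneser's actual API.
from abc import ABC, abstractmethod


class SynInterface(ABC):
    """Wraps one syntactic parser behind a common interface."""

    @abstractmethod
    def parse(self, sentences: list[str]) -> str:
        """Parse raw sentences and return Salsa/Tiger XML as a string."""


class MyParserInterface(SynInterface):
    def parse(self, sentences: list[str]) -> str:
        # Call the real parser here, then convert its trees to Salsa/Tiger XML.
        return "<corpus>...</corpus>"  # placeholder output
```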

Page 10: Shallow semantic parsing: Making most of limited training data

Today’s talk

• Shalmaneser: a system for automatic frame-semantic analysis

• Unknown sense detection: dealing with missing frames

• Annotation projection for cross-lingual data creation

• Summary

Page 11: Shallow semantic parsing: Making most of limited training data

Detecting unknown word senses (frames)

(Example sentence from Conan Doyle, The Hound of the Baskervilles. Syntax: Collins parser; semantics: Shalmaneser.)

• Unseen senses: a normal WSD approach will assign a wrong sense

• Automatically detect senses we haven’t seen before?

Page 12: Shallow semantic parsing: Making most of limited training data

Unknown sense detection as outlier detection

• Outlier detection: detect occurrences of previously unseen events (overview articles: Markou & Singh 2003a,b)
  – Training data: positive cases only. Derive a model of the "normal" cases
  – Test data: positive and negative cases

(Figure: training items vs. test items)

Page 13: Shallow semantic parsing: Making most of limited training data

A Nearest Neighbor-based outlier detection method

• Tax and Duin (2000): simple method, easy to implement

• Given a test point x and its nearest training neighbour t: is x closer to t than t is to its own nearest training neighbour t'?
  – With (Euclidean) distance d:

      p_NN(x) = d(x, t) / d(t, t')

  – Accept x (its sense counts as known) if p_NN(x) is below a given threshold; otherwise flag it as unknown (see the sketch below)
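A minimal sketch of this criterion (assuming numpy; the data points and the threshold value are illustrative, not from the paper):

```python
# Nearest-neighbour outlier criterion in the spirit of Tax & Duin (2000):
# compare the test point's distance to its nearest training neighbour with
# that neighbour's distance to its own nearest training neighbour.
import numpy as np


def p_nn(x, train):
    """Return d(x, t) / d(t, t') for test point x and training matrix train."""
    dists = np.linalg.norm(train - x, axis=1)
    t_idx = np.argmin(dists)
    t = train[t_idx]
    # nearest neighbour of t among the *other* training points
    d_tt = np.linalg.norm(np.delete(train, t_idx, axis=0) - t, axis=1).min()
    return dists[t_idx] / d_tt


train = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
for x in (np.array([0.05, 0.05]), np.array([2.0, 2.0])):
    score = p_nn(x, train)
    # threshold 1.5 is illustrative, not the value used in the experiments
    print(x, "known sense" if score < 1.5 else "unknown sense", score)
```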

Page 14: Shallow semantic parsing: Making most of limited training data

Unknown sense detection: Results

• Evaluation (Erk, NAACL 2006):
  – Use FrameNet data
  – Treat one sense of a lemma as pseudo-unknown, iterating over all senses (see the sketch below)

• Results (assignment of the label "unknown"):
  – Tax & Duin's method, one lemma at a time: precision 0.70, recall 0.35
  – More data (all data for a frame, not just that of one lemma): precision 0.77, recall 0.82
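The pseudo-unknown protocol could be sketched roughly as follows; detect_unknown() is a hypothetical stand-in for the nearest-neighbour detector above, and only recall of the "unknown" label is computed here:

```python
# Sketch of the pseudo-unknown protocol for one lemma: each sense is held out
# in turn, the detector is "trained" on the remaining senses, and the held-out
# occurrences should all be flagged as unknown.

def pseudo_unknown_recall(instances_by_sense, detect_unknown):
    """instances_by_sense: dict mapping a sense to its feature vectors."""
    flagged = total = 0
    for held_out, test_items in instances_by_sense.items():
        # train only on the other ("known") senses of this lemma
        train = [x for sense, items in instances_by_sense.items()
                 if sense != held_out for x in items]
        for x in test_items:                  # all of these are pseudo-unknown
            flagged += bool(detect_unknown(x, train))
            total += 1
    return flagged / total if total else 0.0
```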

Page 15: Shallow semantic parsing: Making most of limited training data

Results

• What features are important?
  1. Best: just context words
  2. Almost as good: features of 1, 3, and 4 together
  3. Just the subcategorization frame: high precision, low recall
  4. Subcat frame plus head words of arguments: in between 3 and 2, but obviously too sparse

Page 16: Shallow semantic parsing: Making most of limited training data

Unknown sense detection as outlier detection: The bigger picture

• Why assume missing word senses in the sense inventory and in the training data?
  – Growing, unfinished resources, like FrameNet
  – Domain-specific senses may be missing from general-purpose sense inventories

• Outlier detection method presented here: applicable to any resource that groups words into senses, e.g. WordNet

• Using outlier detection to detect occurrences of nonliteral use?

Page 17: Shallow semantic parsing: Making most of limited training data

Today’s talk

• Shalmaneser: a system for automatic frame-semantic analysis

• Unknown sense detection: dealing with missing frames

• Annotation projection for cross-lingual data creation

• Summary

Page 18: Shallow semantic parsing: Making most of limited training data

Motivation

• Frame definitions and role set: language-independent

• Predicate classes: language-specific

• Annotated sentences: language-specific, too

Page 19: Shallow semantic parsing: Making most of limited training data

Agenda

• For a new language, induce:
  1. A frame-semantic predicate classification
  2. A corpus with frame-semantic annotation

• Method: annotation projection in a parallel corpus
  – Word alignments approximate semantic equivalence
    • Corresponding word pairs (predicates)
    • Corresponding constituents

• Evaluation: study on the EUROPARL corpus (De/En/Fr)

Page 20: Shallow semantic parsing: Making most of limited training data

An idealised example

English: Peter [comes]Arriving home
French: Pierre [revient]Arriving à la maison

(The frame annotation on the English predicate "comes" is projected onto the aligned French predicate "revient"; a toy sketch follows below.)
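A toy sketch of this projection step for the example above (the token indices and alignment dictionary are made up for illustration):

```python
# Toy projection of a frame label through a word alignment
# (token indices and the alignment dictionary are illustrative).

en_tokens = ["Peter", "comes", "home"]
fr_tokens = ["Pierre", "revient", "à", "la", "maison"]

en_frames = {1: "Arriving"}          # English predicate index -> frame
alignment = {0: 0, 1: 1, 2: 4}       # English token index -> French token index

# project each frame label onto the aligned French predicate
for i, frame in en_frames.items():
    if i in alignment:
        j = alignment[i]
        print(f"{en_tokens[i]} <-> {fr_tokens[j]}: {frame}")   # comes <-> revient: Arriving
```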

Page 21: Shallow semantic parsing: Making most of limited training data

Frame-semantic classes

• Idea: for each frame, construct a list of predicates in the new language that occur aligned to predicates of this frame => FEEs (frame-evoking elements) for new languages (see the sketch below)

• Main obstacle: translational divergence
  – Corresponding predicates don't evoke the same frame

• Addressed by shallow, language-independent filtering (Pado and Lapata, AAAI 2005)
  – Important: distributional patterns

• Evaluation: predicate classes for German and French can be obtained with a precision of 65-70%
  – Main remaining problem: English polysemy not covered by FrameNet
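The basic induction idea, before the filtering of Pado and Lapata (2005), could look roughly like this; the data structures and the frequency cutoff are illustrative assumptions:

```python
# Sketch of the basic induction step: for each frame, collect target-language
# predicates that are word-aligned to an English predicate evoking that frame.
# The real filtering is reduced here to a crude frequency cutoff.
from collections import Counter, defaultdict


def induce_frame_lexicon(aligned_sentences, min_count=2):
    """aligned_sentences: iterable of (en_predicates, alignment) pairs, where
    en_predicates maps an English token index to its frame and alignment maps
    English token indices to target-language lemmas."""
    counts = defaultdict(Counter)
    for en_predicates, alignment in aligned_sentences:
        for idx, frame in en_predicates.items():
            lemma = alignment.get(idx)
            if lemma is not None:
                counts[frame][lemma] += 1
    # keep only target predicates seen at least min_count times per frame
    return {frame: sorted(l for l, c in ctr.items() if c >= min_count)
            for frame, ctr in counts.items()}
```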

Page 22: Shallow semantic parsing: Making most of limited training data

Role annotations (I)

• Idea: For each sentence, transfer semantic role annotation onto translated sentence

• Obstacle 1: Frame divergence
  – Role projection is only sensible if the frames match
  – Good news: in an En-De test corpus (Pado and Lapata, HLT/EMNLP 2005), 70% of frames match

• Obstacle 2: Role divergence
  – Even if frames are parallel, do the roles match?
  – Good news: in the En-De test corpus, matching frames show 90% role matches
    • Remaining cases are mostly elisions (e.g. passives)

Page 23: Shallow semantic parsing: Making most of limited training data

Role annotations (II)

• Obstacle 3: Errors and omissions in automatically induced word alignments
  – Can be overcome by using bracketing information (chunks / constituents)
  – Induction of cross-lingual correspondences as a graph optimisation problem (Pado and Lapata, ACL 2006)

• Evaluation (all exact-match F-scores; word-based projection is sketched below):
  – Word-based projection: 0.50
  – Constituent-based projection: 0.75
  – Upper limit: 0.85

• Remaining errors mostly parsing-related
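For comparison, word-based projection can be sketched in a few lines; the constituent-based graph optimisation of Pado and Lapata (2006) is not reproduced here, and the data structures are illustrative:

```python
# Sketch of word-based role projection: map each role's English token span
# through the word alignment and let the projected role cover the aligned
# target tokens.  (Constituent-based projection would instead choose the
# best-matching target constituent.)

def project_roles(roles, alignment):
    """roles: dict role label -> set of English token indices.
    alignment: dict English token index -> set of target token indices."""
    projected = {}
    for label, en_span in roles.items():
        target_span = set()
        for i in en_span:
            target_span |= alignment.get(i, set())
        if target_span:
            projected[label] = target_span
    return projected


# toy usage: the "Goal" role covers English token {2}, aligned to target {2, 3, 4}
print(project_roles({"Goal": {2}}, {0: {0}, 1: {1}, 2: {2, 3, 4}}))
```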

Page 24: Shallow semantic parsing: Making most of limited training data

Summary

• Frame-semantic analysis is potentially interesting for many NLP applications
  – Goal of Shalmaneser: a flexible and easy-to-use system

• Addressing incompleteness in resources
  – Unknown sense detection as outlier detection

• Porting Frame Semantics to new languages
  – Parallel corpora for automatic annotation projection