University of Oslo : Department of Informatics A Syntactic Constituent Ranker for Speculation and Negation Scope Resolution Erik Velldal Lilja Øvrelid Jonathon Read Stephan Oepen [email protected] Date: 21 September 2011 Venue: Master Seminar in Language Technology Department of Informatics University of Oslo


University of Oslo : Department of Informatics

A Syntactic Constituent Ranker for Speculation and Negation Scope Resolution

Erik Velldal, Lilja Øvrelid, Jonathon Read†, Stephan Oepen

[email protected]

Date: 21 September 2011
Venue: Master Seminar in Language Technology, Department of Informatics, University of Oslo


Introduction

Task: Given an appropriate cue of speculation or negation, determine its scope, i.e. the subsequence of the sentence that is affected.

For example:
- {The unknown amino acid ⟨may⟩ be used by these species}.
- Samples of the protein pair space were taken {⟨instead of⟩ considering the whole space} [...]

Some applications:
- Information Extraction
- Sentiment Analysis


Introduction

Our approach

Generate candidate scopes from constituents in syntactic trees; use a supervised learner to rank these candidates.

[Figure: processing pipeline. An unlabeled sentence goes through syntactic constituent parsing; data-driven ranking, given the speculation/negation cue, then yields the predicted scope.]


Outline

1 Data sets and evaluation measures

2 A brief review of speculation and negation at LNS

3 A syntactic constituent ranker

4 Experiments


Data Sets

BioScope Corpus (Vincze et al., 2008)
- 20,924 sentences annotated with speculation/negation cues and associated scopes.
- Sentences are drawn from biomedical abstracts, full papers and clinical reports.

CoNLL-2010 Shared Task Evaluation Data (Farkas et al., 2010)
- A further set of 5,003 sentences from biomedical papers, annotated for speculation.


Data Sets

Data         Total        Speculation          Negation
             Sentences    Sentences   Cues     Sentences   Cues
Abstracts    11,871       2,101       2,659    1,597       1,719
Papers        2,670         519         668      339         376
CoNLL         5,003         790       1,033        –           –

Note on data set splits for development and evaluation:
- Speculation: all of the Abstracts and Papers are used for development; the CoNLL set is held out.
- Negation: 90% of the Abstracts and Papers are used for development; the remainder is held out.


Preprocessing

- BioScope XML converted to stand-off characterisation.
- Tokenisation performed using the GENIA tagger (Tsuruoka et al., 2005), supplemented with a finite-state tokeniser.
- Part-of-speech tags from both the GENIA tagger (for higher accuracy in the biomedical domain) and TnT (Brants, 2000).
- Dependency parsing using MaltParser (Nivre et al., 2006) stacked with the XLE platform (Crouch et al., 2008) with the English grammar developed by Butt et al. (2002).
- Constituent tree parsing with PET (Callmeier, 2002) and the English Resource Grammar (Flickinger, 2002).


Evaluation measures

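Evaluation was performed using the scoring software of the CoNLL 2010 Shared Task.

\[
\mathrm{Prec} = \frac{tp}{tp + fp} \qquad
\mathrm{Rec} = \frac{tp}{tp + fn} \qquad
F_1 = \frac{2 \times \mathrm{Prec} \times \mathrm{Rec}}{\mathrm{Prec} + \mathrm{Rec}}
\]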

A true positive requires identification of all words in scope.
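
The exact-match criterion can be illustrated with a minimal sketch. This is not the CoNLL 2010 scoring software; the function name `scope_prf` and the representation of scopes as sets of token indices keyed by a cue id are assumptions made for illustration. A predicted scope that misses even one word counts as both a false positive and a false negative.

```python
def scope_prf(gold, predicted):
    """Scope-level precision, recall and F1 under exact-match scoring.

    gold, predicted: dicts mapping a cue id to a frozenset of token
    indices (a hypothetical representation chosen for this sketch).
    """
    # A true positive requires the predicted scope to equal the gold scope.
    tp = sum(1 for cue, scope in predicted.items() if gold.get(cue) == scope)
    fp = len(predicted) - tp
    fn = sum(1 for cue in gold if predicted.get(cue) != gold[cue])
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1


gold = {1: frozenset({0, 1, 2}), 2: frozenset({4, 5})}
pred = {1: frozenset({0, 1, 2}), 2: frozenset({4})}  # second scope misses a word
print(scope_prf(gold, pred))
```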


Supervised classification of cues

We employ a binary word-by-word linear support vector machine classifier using the SVMlight toolkit.

A simpler approach is to treat the set of cues as a near-closed class, applying filtering to disregard all known non-cues before training and applying the classifier.

Cues are described using features of:
- n-grams of lemmas up to three positions to the left; and
- n-grams of surface forms up to two positions to the right.
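
One possible reading of these two feature templates is sketched below. The function name `cue_features`, the feature string format, and the choice to anchor every n-gram at the candidate token are assumptions for illustration, not the authors' implementation.

```python
def cue_features(forms, lemmas, i, left=3, right=2):
    """Collect n-gram features for the token at position i.

    Lemma n-grams end at the token and reach up to `left` positions to
    the left; surface-form n-grams start at the token and reach up to
    `right` positions to the right (hypothetical feature naming).
    """
    feats = []
    for k in range(left + 1):
        if i - k < 0:
            break  # ran off the start of the sentence
        feats.append("lemma[-%d..0]=%s" % (k, " ".join(lemmas[i - k:i + 1])))
    for k in range(1, right + 1):
        if i + k >= len(forms):
            break  # ran off the end of the sentence
        feats.append("form[0..+%d]=%s" % (k, " ".join(forms[i:i + k + 1])))
    return feats


forms = ["The", "acid", "may", "be", "used"]
lemmas = ["the", "acid", "may", "be", "use"]
for feature in cue_features(forms, lemmas, 2):
    print(feature)
```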


Supervised classification of cues

Multi-word cues are handled in post-processing by matching patterns observed in the lemmas of the development data.

Speculation:
- cannot {be}? exclude
- either .+ or
- indicate that
- may,? or may not
- no {evidence | proof | guarantee}
- not {known | clear | evident | understood | exclude}
- raise the .* {possibility | question | issue | hypothesis}
- whether or not

Negation:
- rather than
- {can|could} not
- no longer
- instead of
- with the * exception of
- neither * nor
- {no(t?)|neither} * nor
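
A few of these patterns can be written as ordinary regular expressions over the space-joined lemma sequence. The sketch below is only an illustration: the translation of the slide notation into Python `re` syntax, the pattern subsets, and the function name `find_multiword_cues` are all assumptions.

```python
import re

# Hypothetical translations of some of the listed patterns.
SPECULATION = [
    r"\bcannot (?:be )?exclude\b",
    r"\bno (?:evidence|proof|guarantee)\b",
    r"\bwhether or not\b",
]
NEGATION = [
    r"\brather than\b",
    r"\b(?:can|could) not\b",
    r"\bno longer\b",
    r"\binstead of\b",
]


def find_multiword_cues(lemmas, patterns):
    """Return every substring of the lemma sequence matched by a pattern."""
    text = " ".join(lemmas)
    hits = []
    for pattern in patterns:
        hits.extend(match.group(0) for match in re.finditer(pattern, text))
    return hits


print(find_multiword_cues("we cannot exclude this".split(), SPECULATION))
print(find_multiword_cues("sample be take instead of consider the space".split(), NEGATION))
```

Matching on lemmas rather than surface forms lets one pattern cover inflected variants such as "excluded" and "excludes".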


Supervised classification of cues

Development Results: Speculation

               Prec     Rec      F1
Baseline       90.49    81.16    85.57
Word-by-word   94.65    82.26    88.02
Filtering      94.13    84.60    89.11

Development Results: Negation

               Prec     Rec      F1
Word-by-word   87.69    97.61    92.38
Filtering      92.49    97.72    95.03


Heuristics for scope resolution

Heuristics, activated by the cue’s part-of-speech, operate over dependency structures:

- Coordinations scope over their conjuncts;
- Prepositions scope over their argument and its descendants;
- Attributive adjectives scope over their nominal head and its descendants;
- Predicative adjectives scope over referential subjects and clausal arguments, if present;
- Modals inherit subject scope from their lexical verb and scope over their descendants; and
- Passive or raising verbs scope over referential subjects and the verbal descendants.


Heuristics for scope resolution

In the case of negation an additional set of rules is applied:

- Determiners scope over their head node and its descendants;
- The noun none scopes over the entire sentence if it is the subject, otherwise it scopes over its descendants;
- Other nouns or verbs scope over their descendants;
- Adverbs with a verbal head scope over the descendants of the lexical verb; and
- Other adverbs scope over the descendants of the head.

If none of these rules activate then a default scope is applied by labeling from the cue to the sentence-final punctuation.
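
The interplay between a part-of-speech-triggered rule and the default scope can be sketched as follows. Everything here is illustrative: the dependency tree is a plain list of head indices (-1 for the root), the Penn Treebank tag `IN` stands in for "preposition", only the preposition rule is implemented, and excluding the final punctuation mark from the default scope is an assumption.

```python
def descendants(heads, node):
    """Return node plus every token transitively headed by it."""
    found = {node}
    changed = True
    while changed:
        changed = False
        for child, head in enumerate(heads):
            if head in found and child not in found:
                found.add(child)
                changed = True
    return found


def default_scope(tokens, cue):
    """Label from the cue up to (here: not including) final punctuation."""
    end = len(tokens)
    if tokens and tokens[-1] in {".", "?", "!"}:
        end -= 1
    return list(range(cue, end))


def resolve_scope(tokens, heads, pos, cue):
    """Apply the preposition rule if it fires, else fall back to the default."""
    if pos[cue] == "IN":
        scope = {cue}
        for child, head in enumerate(heads):
            if head == cue:  # the preposition's argument and its descendants
                scope |= descendants(heads, child)
        return sorted(scope)
    return default_scope(tokens, cue)


# Hypothetical example sentence and analysis.
tokens = ["Cells", "grew", "without", "added", "serum", "."]
pos = ["NNS", "VBD", "IN", "VBN", "NN", "."]
heads = [1, -1, 1, 4, 2, 1]
print(resolve_scope(tokens, heads, pos, 2))  # rule-based scope for "without"
print(resolve_scope(tokens, heads, pos, 1))  # no rule fires: default scope
```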


Heuristics for scope resolution

Development Results: Speculation Scope for Gold Cues

                                F1
Abstracts   Default Baseline    69.84
            Dependency Rules    73.67
Papers      Default Baseline    45.21
            Dependency Rules    72.31

Development Results: Negation Scope for Gold Cues

                                F1
Abstracts   Default Baseline    51.75
            Dependency Rules    70.76
Papers      Default Baseline    31.38
            Dependency Rules    67.16


Selecting candidate constituents

Learn a ranking function over candidate constituents within a parse.

[Figure: ERG derivation tree for "The unknown amino acid ⟨may⟩ be used by these species."]

sb-hd_mc_c
  sp-hd_n_c
    d_-_the_le  the
    aj-hdn_norm_c
      aj_-_i_le  unknown
      n_ms-cnt_ilr
        n_-_mc_le  amino acid
  hd-cmp_u_c
    v_vp_mdl-p_le  ⟨may⟩
    hd-cmp_u_c
      v_prd_be_le  be
      hd-cmp_u_c
        v_pas_odlr
          v_np_le  used
        hd-cmp_u_c
          p_np_ptcl_le  by
          sp-hd_n_c
            d_-_pl_le  these
            w_period_plr
              n_pl_olr
                n_-_c_le  species.


Selecting candidate constituents

Candidates are generated by following the path from the cue to the root of the tree, for example:

- modal verb: False
  The unknown amino acid {⟨may⟩} be used by these species.
- head – complement: False
  The unknown amino acid {⟨may⟩ be used by these species}.
- subject – head: True
  {The unknown amino acid ⟨may⟩ be used by these species}.
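
Walking from the cue to the root can be sketched with a small tree class. The `Node` class and the `candidates` function are hypothetical, the tree is a simplified version of the example parse (the subject and the lower verb phrase are collapsed into single leaves), and the node labels are written with underscores in the ERG style, which is an assumption about their original spelling.

```python
class Node:
    """A parse tree node; leaves carry a token string."""

    def __init__(self, label, children=(), token=None):
        self.label = label
        self.children = list(children)
        self.token = token
        self.parent = None
        for child in self.children:
            child.parent = self

    def leaves(self):
        if self.token is not None:
            return [self.token]
        return [tok for child in self.children for tok in child.leaves()]


def candidates(cue_node):
    """Every constituent on the path from the cue to the root is a candidate."""
    found = []
    node = cue_node
    while node is not None:
        found.append((node.label, node.leaves()))
        node = node.parent
    return found


may = Node("v_vp_mdl-p_le", token="may")
rest = Node("hd-cmp_u_c", token="be used by these species.")
subject = Node("sp-hd_n_c", token="The unknown amino acid")
root = Node("sb-hd_mc_c", [subject, Node("hd-cmp_u_c", [may, rest])])

for label, words in candidates(may):
    print(label, "->", " ".join(words))
```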


Alignment of constituents and scopes

Scope boundaries must align with the boundaries of constituents for the approach to be successful.

Alignment can be improved by applying slackening rules, including:
- eliminating constituent-final punctuation;
- eliminating constituent-final parenthesised elements;
- reducing scope to the left when the left-most terminal is an adverb and not the cue; and
- ensuring the scope starts with the cue when it is a noun.
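
The first two slackening rules lend themselves to a short sketch. The function name `slacken`, the punctuation set, and the token-list representation are assumptions; nested parentheses are not handled, and the two part-of-speech-dependent rules are omitted.

```python
PUNCT = {".", ",", ";", ":", "?", "!"}


def slacken(tokens):
    """Strip constituent-final punctuation and parenthesised elements."""
    tokens = list(tokens)
    changed = True
    while changed and tokens:
        changed = False
        if tokens[-1] in PUNCT:
            tokens.pop()
            changed = True
        elif tokens[-1] == ")" and "(" in tokens:
            # Cut from the last opening parenthesis to the end.
            start = len(tokens) - 1 - tokens[::-1].index("(")
            tokens = tokens[:start]
            changed = True
    return tokens


print(slacken(["may", "be", "used", "(", "Fig.", "2", ")", "."]))
```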


Alignment of constituents and scopes

[Figure: percentage of hedge scopes aligned with constituents (y-axis, 80 to 95) against the number of n-best parsing results (x-axis, 1 to 50), with one curve each for BSA and BSP.]


Alignment of constituents and scopes

Analysis of the non-aligned items indicated that mismatches arise from:
- parse ranking errors (40%),
- non-syntactic scope (25%), e.g. “This allows us to {⟨address a number of questions⟩: what proportion of [...] can be attributed to a known domain-domain interaction}?”,
- divergent syntactic theories (16%),
- parenthesised elements (13%) and
- annotation errors (6%).


Alignment of constituents and scopes

In the papers:
- A parse is produced for 86% of sentences.
- Alignment of constituents and speculation scopes is 81% when inspecting the first parse of sentences.
- The upper-bound performance of the ranker is around 76% when resolving the scope of speculation.
- A similar analysis indicates an upper bound of 77% for negation.


Training procedure

Given a parsed BioScope sentence:
- The candidate constituent that corresponds to the scope is labeled as correct;
- all other candidates are labeled as incorrect.
- Learn a scoring function (using SVMlight).

Describe candidates with three classes of features that record:
- paths in the tree;
- surface order; and
- rule-like linguistic phenomena.
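
As a rough illustration of the labeling step, the sketch below writes candidates out in the input format of SVMlight's ranking mode (`<target> qid:<id> <feature>:<value> ...`). Whether the original system used exactly this mode and encoding is an assumption; the function name `to_svmlight_rank`, the binary feature values, and the feature strings are hypothetical.

```python
def to_svmlight_rank(sentences, feature_index):
    """Encode candidates as SVMlight ranking instances.

    sentences: one list per cue, holding (feature_names, is_gold) pairs.
    feature_index: dict mapping feature names to integer ids, grown as
    new features are seen.
    """
    lines = []
    for qid, cands in enumerate(sentences, start=1):
        for feats, is_gold in cands:
            ids = sorted({feature_index.setdefault(f, len(feature_index) + 1)
                          for f in feats})
            target = 2 if is_gold else 1  # the correct candidate ranks higher
            pairs = " ".join("%d:1" % i for i in ids)
            lines.append("%d qid:%d %s" % (target, qid, pairs))
    return lines


index = {}
example = [[(["path=a", "size=q4"], True), (["path=b", "size=q4"], False)]]
training_lines = to_svmlight_rank(example, index)
print("\n".join(training_lines))
```

Grouping candidates by `qid` means they are only compared with other candidates for the same cue, which matches the idea of ranking constituents within one parse.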


Path features

Paths from speculation cues to candidate constituents:
- specific, e.g. "v_vp_mdl-p_le\hd-cmp_u_c\sb-hd_mc_c"
- general, e.g. "v_vp_mdl-p_le\\sb-hd_mc_c"

With lexicalised variants:
- specific, e.g. "may : v_vp_mdl-p_le\hd-cmp_u_c\sb-hd_mc_c"
- general, e.g. "may : v_vp_mdl-p_le\\sb-hd_mc_c"

Bigrams formed of nodes and their parents, e.g. "v_vp_mdl-p_le hd-cmp_u_c" and "hd-cmp_u_c sb-hd_mc_c"
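
These path templates can be sketched as a function over the list of node labels from the cue's preterminal up to the candidate. The function name `path_features` is hypothetical, the labels are written with underscores in the ERG style (an assumption about their original spelling), and the reading of the "general" variant as keeping only the first and last label is inferred from the examples.

```python
def path_features(cue, path):
    """Build path features from the labels between cue and candidate.

    path: node labels ordered from the cue's preterminal to the candidate.
    """
    specific = "\\".join(path)                 # every node on the path
    general = path[0] + "\\\\" + path[-1]      # intermediate nodes elided
    feats = [
        specific,
        general,
        cue + " : " + specific,                # lexicalised variants
        cue + " : " + general,
    ]
    # Bigrams of each node with its parent.
    feats += [child + " " + parent for child, parent in zip(path, path[1:])]
    return feats


path = ["v_vp_mdl-p_le", "hd-cmp_u_c", "sb-hd_mc_c"]
for feature in path_features("may", path):
    print(feature)
```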


Surface features

- Bigrams formed of preterminal lexical types, e.g. "d_-_the_le aj-hdn_norm_c", "aj-hdn_norm_c n_-_mc_le", etc.
- Cue position within candidate (in tertiles), e.g. "33%-66%".
- Candidate size relative to sentence length (in quartiles), e.g. "75%-100%".
- Punctuation preceding the candidate, e.g. "false".
- Punctuation at end of the candidate, e.g. "true".
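A minimal sketch of the positional and punctuation features, assuming whitespace-tokenised input and a candidate given as a half-open token range. The names `bin_label` and `surface_features`, the exact binning rule and the punctuation test are assumptions; the lexical-type bigrams are left out because they need the parse.

```python
import string
from typing import Dict, List


def bin_label(fraction: float, n_bins: int) -> str:
    """Map a fraction in [0, 1] to a percentage-range label such as '33%-66%'."""
    idx = min(int(fraction * n_bins), n_bins - 1)
    return f"{100 * idx // n_bins}%-{100 * (idx + 1) // n_bins}%"


def surface_features(tokens: List[str], start: int, end: int, cue: int) -> Dict[str, str]:
    """Surface features for the candidate tokens[start:end] that contains token index cue."""
    size = end - start
    before = tokens[start - 1] if start > 0 else ""
    punct_before = bool(before) and all(c in string.punctuation for c in before)
    punct_at_end = tokens[end - 1][-1] in string.punctuation
    return {
        "cue_position": bin_label((cue - start) / size, 3),    # tertiles
        "relative_size": bin_label(size / len(tokens), 4),     # quartiles
        "punct_before": str(punct_before).lower(),
        "punct_at_end": str(punct_at_end).lower(),
    }


tokens = "The unknown amino acid may be used by these species.".split()
# Candidate: the whole sentence; the cue "may" is token 4.
feats = surface_features(tokens, start=0, end=len(tokens), cue=4)
```

For the running example with the whole sentence as candidate, this reproduces the values quoted in the list above.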

Rule-like features

Detection of:
- Passivisation
- Subject control verbs occurring with passivised verbs
- Subject raising verbs
- Predicative adjectives

Detecting control verbs with a passivised verb

1. Cue is a subject control verb.
2. Find the first subject–head parent of (1) on the head path.
3. Find the first head–complement child of (2) on the head path.
4. The right-most daughter of (3) or one of its descendants is a passivised verb.
5. The transitive head daughter of the left-most daughter of (2) is not an expletive it or there.
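The five steps can be sketched as a small tree walk. Everything below is a hypothetical reconstruction: the `Node` class, its `head` index annotation, the label prefixes `sb-hd` and `hd-cmp`, the passive label `v_pas_odlr` and the set of control-verb lexical types are assumptions chosen to fit the running example, not the system's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Set


@dataclass(eq=False)
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    head: int = 0                      # index of the head daughter (assumed annotation)
    word: Optional[str] = None         # surface form, set on preterminals only
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def up_head_path(node: Node, prefix: str) -> Optional[Node]:
    """First ancestor with the given label prefix, climbing only via head daughters."""
    while node.parent is not None and node.parent.children[node.parent.head] is node:
        node = node.parent
        if node.label.startswith(prefix):
            return node
    return None


def down_head_path(node: Node, prefix: str) -> Optional[Node]:
    """First descendant with the given label prefix, descending only via head daughters."""
    while node.children:
        node = node.children[node.head]
        if node.label.startswith(prefix):
            return node
    return None


def descendants(node: Node) -> Iterator[Node]:
    yield node
    for child in node.children:
        yield from descendants(child)


def control_with_passive(cue: Node, control_types: Set[str]) -> bool:
    if cue.label not in control_types:                       # step 1
        return False
    subj_head = up_head_path(cue, "sb-hd")                   # step 2
    if subj_head is None:
        return False
    head_comp = down_head_path(subj_head, "hd-cmp")          # step 3
    if head_comp is None:
        return False
    if not any(n.label == "v_pas_odlr"                       # step 4
               for n in descendants(head_comp.children[-1])):
        return False
    lexical_head = subj_head.children[0]                     # step 5
    while lexical_head.children:
        lexical_head = lexical_head.children[lexical_head.head]
    return (lexical_head.word or "").lower() not in {"it", "there"}


# Running example, with the by-phrase abbreviated to a single leaf.
cue = Node("v_vp_mdl-p_le", word="may")
subject = Node("sp-hd_n_c", [
    Node("d_-_the_le", word="The"),
    Node("aj-hdn_norm_c", [
        Node("aj_-_i_le", word="unknown"),
        Node("n_ms-cnt_ilr", [Node("n_-_mc_le", word="amino acid")]),
    ], head=1),
], head=1)
passive_vp = Node("hd-cmp_u_c", [
    Node("v_prd_be_le", word="be"),
    Node("hd-cmp_u_c", [
        Node("v_pas_odlr", [Node("v_np_le", word="used")]),
        Node("pp", word="by these species."),
    ]),
])
root = Node("sb-hd_mc_c", [subject, Node("hd-cmp_u_c", [cue, passive_vp])], head=1)
result = control_with_passive(cue, {"v_vp_mdl-p_le"})
```

Treating the lexical type of "may" as a control-verb type simply mirrors the slide's example; a real system would take that set from the grammar's lexicon.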

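[Figure: the sentence "The unknown amino acid ⟨may⟩ be used by these species." drawn as a tree with head daughters marked H, annotated with (1) the subject control verb, (2) the subject–head node, (3) the head–complement node, (4) the passive verb and (5) the subject, which is not an expletive pronoun.]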

Using predictions from auxiliary systems

Combine with other systems by introducing features recording matches with:
- the start of the auxiliary prediction;
- the end of the auxiliary prediction; or
- the entirety of the auxiliary prediction.

In cases where an ERG parse is unavailable we back off to the prediction of the dependency rules.
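A minimal sketch of the three match features and the back-off, assuming scopes are represented as half-open token spans. The names `auxiliary_features` and `resolve_scope` and the feature keys are hypothetical.

```python
from typing import Dict, Optional, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive (assumed representation)


def auxiliary_features(candidate: Span, auxiliary: Span) -> Dict[str, bool]:
    """Record how a candidate scope agrees with the auxiliary system's prediction."""
    return {
        "aux_start_match": candidate[0] == auxiliary[0],
        "aux_end_match": candidate[1] == auxiliary[1],
        "aux_full_match": candidate == auxiliary,
    }


def resolve_scope(ranker_prediction: Optional[Span], rule_prediction: Span) -> Span:
    """Use the ranker's scope when a parse produced one, else the dependency-rule scope."""
    return rule_prediction if ranker_prediction is None else ranker_prediction


feats = auxiliary_features(candidate=(0, 10), auxiliary=(4, 10))
scope = resolve_scope(None, (4, 10))
```

Keeping the three matches as separate features lets the ranker learn, for instance, that the auxiliary system is more reliable about scope ends than about scope starts.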

Outline

1 Data sets and evaluation measures

2 A brief review of speculation and negation at LNS

3 A syntactic constituent ranker

4 Experiments

Ranker optimisation

Speculation and negation in aligned items from papers (F1)

                          Speculation  Negation
Random baseline                 26.76     21.73
Path                            78.10     79.28
Path+Surface                    79.93     78.88
Path+Rule-like                  83.72     89.47
Path+Surface+Rule-like          85.30     89.24

Training with the first aligned constituent in n-best parses and testing with m-best parses did not greatly impact performance, but optimal values are: n = 1 and m = 3.

n-best optimisation for combined system

[Figure: surface plot of F1 (axis range 74 to 77) against training n-best and testing m-best (each from 0 to 25).]

Development results

Resolving scope of gold cues (F1)

                      Speculation  Negation
Default baseline            64.89     48.07
Constituent ranker          73.67     67.46
Dependency rules            73.39     70.11
Combined                    78.69     73.98

Evaluation results

We evaluate end-to-end performance for speculation on the CoNLL-2010 Shared Task evaluation data.

Speculation

System                       Prec    Rec     F1
Morante et al. (2010)       59.62  55.18  57.32
Cue classifier + combined   62.00  57.02  59.41
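For reference, F1 is the harmonic mean of precision and recall. The helper below (the name `f1` is just for this example) recomputes it from the rounded precision and recall in the table, so the result can differ from the reported figure in the last digit.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of "Cue classifier + combined" on the speculation task.
score = f1(62.00, 57.02)
```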

Evaluation results

Negation (cross-validated on abstracts)

System                       Prec    Rec     F1
Morante et al. (2009)       66.31  65.27  65.79
Cue classifier + combined   68.93  73.03  70.92

Negation (on held-out papers)

System                       Prec    Rec     F1
Morante et al. (2009)       42.49  39.10  40.72
Cue classifier + combined   61.77  71.55  66.30

Conclusions

- It is possible to learn a discriminative ranking function for choosing subtrees from HPSG-based constituent structures that match speculation scopes;
- Combining the rule-based and ranker-based approaches to resolve the scope of cues predicted by our classifier achieves the best results to date on the CoNLL 2010 Shared Task evaluation data;
- The system is readily adapted to deal with negation, also achieving state-of-the-art results.
