Overview of Machine Learning for NLP Tasks: part II Named Entity Tagging: A Phrase-Level NLP Task

Upload: winifred-shelton

Post on 28-Dec-2015

Outline

Identify a (hard) problem
Frame the problem ‘appropriately’
  (...so that we can apply our tools, find appropriate labeled data)
Preprocess data
Apply FEX and SNoW
Process output from FEX, SNoW to annotate new text
FEX and SNoW server modes

Named Entity Tagging

Identify named entities in text, e.g. people, locations, organizations

After receiving his [MISC M.B.A.] from [ORG Harvard Business School], [PER Richard F. America] accepted a faculty position at the [ORG McDonough School of Business] ([ORG Georgetown University]) in [LOC Washington].
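The bracketed notation above can be read back into (label, text) pairs. The following is a minimal sketch, assuming the labels are upper-case tags and that brackets are never nested; the function name `extract_entities` is made up for illustration and is not part of FEX or SNoW.

```python
import re

# Matches "[LABEL some text]" where LABEL is an upper-case tag such as PER or ORG.
# Assumption: entity brackets are never nested.
BRACKET = re.compile(r"\[([A-Z]+) ([^\]]+)\]")

def extract_entities(annotated):
    """Return (label, text) pairs for every bracketed entity, in order."""
    return [(m.group(1), m.group(2)) for m in BRACKET.finditer(annotated)]

sentence = ("After receiving his [MISC M.B.A.] from [ORG Harvard Business School], "
            "[PER Richard F. America] accepted a faculty position at the "
            "[ORG McDonough School of Business] ([ORG Georgetown University]) "
            "in [LOC Washington].")

for label, text in extract_entities(sentence):
    print(label, text)
```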

Framing NE-tagging Problem

Not an easy problem:
  We won’t seek stellar results; we just want to show that the tools work, and how to apply them
Where to begin?
  Need labeled data
  Data must work with FEX

Ways to Approach NE-tagging

BIO/Open-Close chunking: word-level classification + inference
  What BIO/Open-Close chunking finds depends on the labels you train with (e.g. NE labels)
  Impose common-sense constraints on open/close labels
  Optimize based on classifier confidence
  V. Punyakanok and D. Roth, “The Use of Classifiers in Sequential Inference”, NIPS-13, Dec. 2000
Use a chunker to find phrase boundaries:
  Phrase-level predicate: learn labels for phrases
  Can use FEX’s phrase mode
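To make the word-level option concrete, here is a small sketch of how per-word BIO tags could be turned into labeled phrases. It is an illustration only, not the inference procedure from the cited paper: it simply closes a phrase whenever the tag sequence stops continuing it, and it treats a stray I-X tag as opening a new phrase (an assumption).

```python
def bio_to_spans(tags):
    """Convert BIO tags (B-PER, I-PER, O, ...) into (label, start, end) spans.

    `end` is exclusive. A stray I-X with no matching open span is treated
    as the start of a new span (an assumption of this sketch).
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # the extra "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.append((label, start, i))
        if tag == "O":
            start, label = None, None
        else:
            start, label = i, tag[2:]
    return spans

print(bio_to_spans(["O", "B-ORG", "O", "O", "B-PER", "I-PER"]))
```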

Framing NE-tagging Problem

We have some labeled Named Entity data
We can identify noun phrases with our chunker...
  See the Demos page for an example
...and FEX has a phrase mode...
...so we can frame this as a (noun) phrase classification problem (assume all NEs are NPs)
  Avoids working with invalid phrases
  Avoids inference (as opposed to open-close classifiers)

Review: Machine Learning System

[Figure: system diagram. Components: Preprocessing, Feature Extraction, Machine Learner, Classifier(s), Inference. Data labels: Raw Text, Formatted Text, Feature Vectors, Training Examples, Testing Examples, Function Parameters, Labels.]

Solution Sketch

Use labeled data to develop the core classifier
  Adapt our labeled data to our model of the problem
  Experiment with FEX and SNoW to get good performance using our labeled data
Use the FEX and SNoW resources we develop as the core of our NE tagger
  Write tools to preprocess raw text into the appropriate form for input to FEX, SNoW
  Write tools to convert SNoW output to labels for the preprocessed data
  Convert the labeled preprocessed data into the desired output format

For the training/evaluation data, we’ve done the pre- and post-processing for you...

CONLL03 data

Have some column-format data... any problems?

O      0  0   B-NP  PRP  He            x  TXT/1  0
O      0  1   B-VP  VBD  said          x  TXT/1  0
O      0  2   I-NP  DT   a             x  TXT/1  0
O      0  3   I-NP  NN   proposal      x  TXT/1  0
O      0  4   B-NP  JJ   last          x  TXT/1  0
O      0  5   I-NP  NN   month         x  TXT/1  0
O      0  6   B-PP  IN   by            x  TXT/1  0
B-ORG  0  7   B-NP  NNP  EU            x  TXT/1  0
O      0  8   I-NP  NNP  Farm          x  TXT/1  0
O      0  9   I-NP  NNP  Commissioner  x  TXT/1  0
B-PER  0  10  I-NP  NNP  Franz         x  TXT/1  0
I-PER  0  11  I-NP  NNP  Fischler      x  TXT/1  0

Design Decisions

NE phrases are a subset of NPs
  We can find NPs, so label only NPs
  Given chunking, can use FEX phrase mode
CONLL03 data: NPs are not labeled as NEs
  NE phrases could be embedded
How to resolve embeddings?
  Avoid embedding: ‘enlarge’ NE phrases
Data has been preprocessed to reflect our needs
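One way to picture the 'enlarge' step is to give each whole noun phrase a single NE label derived from the NE tags of the words inside it. The sketch below is purely hypothetical: the slides do not say which rule the preprocessing used, so the majority-vote policy, the function name `label_noun_phrases`, and the span representation are all assumptions.

```python
from collections import Counter

def label_noun_phrases(np_spans, ne_tags):
    """Assign one NE type to each NP span (start, end), end exclusive.

    Hypothetical policy: the NP takes the NE type that covers the most
    words inside it, or "O" if no word carries an NE tag.
    """
    labeled = []
    for start, end in np_spans:
        types = Counter(tag[2:] for tag in ne_tags[start:end] if tag != "O")
        label = types.most_common(1)[0][0] if types else "O"
        labeled.append((start, end, label))
    return labeled

# NE tags for "He said a proposal last month by EU Farm Commissioner Franz Fischler"
ne_tags = ["O"] * 7 + ["B-ORG", "O", "O", "B-PER", "I-PER"]
# NP spans chosen by hand for the example; the last one covers words 7..11.
np_spans = [(0, 1), (2, 4), (4, 6), (7, 12)]
print(label_noun_phrases(np_spans, ne_tags))
```

In this example the final NP contains both an ORG word and a PER name, which is exactly the kind of embedding a real policy has to decide about.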

Setting up...

Download NE data from the CogComp tools page
  file: ne_tut_processed.tar.gz
Download sample FEX script
  link: ‘sample NE FEX script’
  file: NE-simple.scr

Review: What FEX is doing...

Think of FEX as generating a list of boolean variables, X1, X2, ..., Xn
Lexicon maps boolean variable Xi to a propositional logic term
  e.g. “1204 w[rejects*]” could be written X1204 == BEFORE(X, TARG), where X == “rejects”, TARG ∈ {too, to, two}
In FEX output:
  If a boolean variable is present, it is active
  If a boolean variable is not present, it is inactive
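The 'present means active' convention is a sparse encoding of a boolean vector. A minimal sketch follows; the lexicon is modeled as a plain dictionary, and the second entry (`w[*the]`, id 1205) is invented for illustration. The real lexicon file format is not shown in the slides.

```python
# Hypothetical lexicon: feature id -> feature description.
# Only the 1204 entry comes from the slide; 1205 is made up.
lexicon = {1204: "w[rejects*]", 1205: "w[*the]"}

def to_dense(active_ids, lexicon):
    """Expand a sparse example (set of active ids) into {description: bool}."""
    return {desc: (fid in active_ids) for fid, desc in sorted(lexicon.items())}

# An example lists only its active features; everything else is inactive.
example = {1204}
print(to_dense(example, lexicon))
```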

FEX advanced modes: Phrase Mode

Why do we need extensions?
  The original design of FEX is “word-based”
  Each element is a word, and so is the target
Phrase detection/classification problem: the target is a phrase.
  E.g. Named Entity tagging, Shallow Parse tagging
Document classification problem: the target is the whole document.
Relations: the target is at some intermediate level of representation.
  FEX also has an Entity-Relation mode...

Basic Structure

Two types of elements: phrases & words
FEX’s window semantics are different for phrase mode
Column format input only

[Figure: words W1 to W8 in sequence, with a bracketed span of words labeled “Phrase”]

Changes to FEX for Phrase Mode

Only accepts COLUMN format input
1st column is used to store (phrase) labels.
2nd column is used to store named entity tags.
Both use BIO format.
Columns 2-6 have fixed meanings:
  2 NE; 3 Index; 4 Phrase boundary; 5 POS; 6 Word

Sample Column Format Data

O      0  0   I-NP  PRP  He            x  TXT/1  0
O      0  1   I-VP  VBD  said          x  TXT/1  0
O      0  2   I-NP  DT   a             x  TXT/1  0
O      0  3   I-NP  NN   proposal      x  TXT/1  0
O      0  4   B-NP  JJ   last          x  TXT/1  0
O      0  5   I-NP  NN   month         x  TXT/1  0
O      0  6   I-PP  IN   by            x  TXT/1  0
B-ORG  0  7   I-NP  NNP  EU            x  TXT/1  0
O      0  8   I-NP  NNP  Farm          x  TXT/1  0
O      0  9   I-NP  NNP  Commissioner  x  TXT/1  0
B-PER  0  10  I-NP  NNP  Franz         x  TXT/1  0
I-PER  0  11  I-NP  NNP  Fischler      x  TXT/1  0
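Rows in this format are whitespace-separated, so they can be read with a simple split. The sketch below is one reading of the sample rows, assuming one token per line: the field names are made up, the first column is taken to hold the label shown there (here the NE tag), and the three trailing columns are ignored because the slides do not explain them.

```python
from typing import NamedTuple

class Token(NamedTuple):
    label: str   # column 1 in the sample rows (e.g. O, B-ORG, I-PER)
    index: int   # column 3: position of the word in the sentence
    chunk: str   # column 4: phrase boundary tag (e.g. I-NP, B-NP)
    pos: str     # column 5: part-of-speech tag
    word: str    # column 6: the word itself

def parse_row(line):
    """Parse one column-format row; trailing columns are ignored."""
    cols = line.split()
    return Token(cols[0], int(cols[2]), cols[3], cols[4], cols[5])

rows = [
    "B-ORG 0 7 I-NP NNP EU x TXT/1 0",
    "B-PER 0 10 I-NP NNP Franz x TXT/1 0",
    "I-PER 0 11 I-NP NNP Fischler x TXT/1 0",
]
for row in rows:
    print(parse_row(row))
```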

Phrase Mode Option

FEX command line option -P <length>
  -P takes an integer argument: the maximum length of the candidate phrases.
  For example, “fex -P 4” will generate examples for every phrase of length 1, 2, 3 and 4 from the corpus file.
  If the length is equal to 0, then only positive examples will be generated.

> fex -P 0 ne.scr ne.lex ne.corp ne.out
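For a positive length, the candidate set grows quickly, which is worth seeing before choosing a value. The sketch below enumerates every span of up to `max_len` consecutive tokens in one sentence. It illustrates the idea only and is not FEX's implementation; whether FEX restricts candidates further (for example at sentence boundaries) is not stated in the slides.

```python
def candidate_phrases(n_tokens, max_len):
    """All (start, end) spans, end exclusive, of 1..max_len consecutive tokens."""
    return [(s, s + k)
            for s in range(n_tokens)
            for k in range(1, max_len + 1)
            if s + k <= n_tokens]

# A 12-word sentence with a maximum phrase length of 4.
spans = candidate_phrases(12, 4)
print(len(spans), "candidate phrases")
print(spans[:6])
```

Most of these candidates are not named entities, so they become negative examples; a length of 0 sidesteps that by generating positive examples only.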

Window Range in Phrase Mode

The meaning of the offsets in the window is different in Phrase mode:

 w1  w2  w3  W4  W5  W6  w7  w8  w9
 -3  -2  -1   0   0   0   1   2   3

“-1: w[0,0]” returns w[W4], w[W5], w[W6].
“-1 loc: w[0,0]” returns w[*W4]*, w[*_W5]*, w[*__W6]*.
  (NOTE: * after [] indicates ‘within phrase’)
“-1 loc: w[-2,-1]” returns w[w2_*], w[w3*].
“-1 loc: w[1, 2]” returns w[*w7], w[*_w8].
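The offset row in the example can be computed from the phrase boundaries: every word inside the target phrase gets offset 0, and counting starts from the phrase edges on either side. A minimal sketch, assuming the phrase is given as a half-open token range (the function name is hypothetical):

```python
def phrase_offsets(n_tokens, start, end):
    """Offset of each token relative to the target phrase [start, end)."""
    offsets = []
    for i in range(n_tokens):
        if i < start:
            offsets.append(i - start)      # -1 is the word just before the phrase
        elif i < end:
            offsets.append(0)              # every word inside the phrase
        else:
            offsets.append(i - end + 1)    # 1 is the word just after the phrase
    return offsets

# Nine words, with the phrase covering the 4th to 6th word (W4 W5 W6).
print(phrase_offsets(9, 3, 6))
```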

Phrase Type Sensors

How to specify patterns within a phrase?
Several phrase type sensors can be used.
  “-1: phLen[0,0]” returns 3 for the above example, since "W4 W5 W6" contains 3 words.
  phNoSmall is active if all words in the target phrase are either capitalized (initial), symbols, or numbers.
  phAllWord is active if all the elements in the target phrase are words (a-z, A-Z)
Many other custom sensors: check the FEX source code (Sensor.h)
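The three sensors described above can be approximated in a few lines. These are sketches written from the slide descriptions, not translations of the code in Sensor.h: in particular, "capitalized, symbol, or number" is approximated as "does not start with a lower-case letter", and tokens are assumed to be non-empty strings.

```python
import re

def ph_len(phrase):
    """Number of words in the target phrase."""
    return len(phrase)

def ph_no_small(phrase):
    """True if no word starts with a lower-case letter (approximation)."""
    return all(not word[0].islower() for word in phrase)

def ph_all_word(phrase):
    """True if every element consists only of the letters a-z, A-Z."""
    return all(re.fullmatch(r"[A-Za-z]+", word) is not None for word in phrase)

phrase = ["Harvard", "Business", "School"]
print(ph_len(phrase), ph_no_small(phrase), ph_all_word(phrase))
```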

RGF operator conjunct

 w1  w2  w3  W4  W5  W6  w7  w8  w9
 -3  -2  -1   0   0   0   1   2   3

“conjunct(-1:w[-2,-1]; -1:phLen[0,0]; -1:w[1,2])” generates

w[w2]--phLen[3]--w[w7], w[w2]--phLen[3]--w[w8],
w[w3]--phLen[3]--w[w7], w[w3]--phLen[3]--w[w8]
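The output of conjunct behaves like a Cartesian product over the features produced by each argument. The sketch below reproduces the four conjunctions from the example using `itertools.product`; the feature strings are written by hand in the `w[w7]` form, and the function is an illustration of the combinatorics, not FEX's own code.

```python
from itertools import product

def conjunct(*feature_lists):
    """Join one feature from each list with "--", for every combination."""
    return ["--".join(combo) for combo in product(*feature_lists)]

before = ["w[w2]", "w[w3]"]   # from w[-2,-1]
length = ["phLen[3]"]         # from phLen[0,0]
after = ["w[w7]", "w[w8]"]    # from w[1,2]

for feature in conjunct(before, length, after):
    print(feature)
```

The number of generated features is the product of the list sizes, so wide windows inside a conjunct multiply quickly.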

Choose FEX, SNoW parameters

Use FEX phrase mode:

% ./fex -P 0 ne.scr ne.lex data.in ne-snow.ex

Train SNoW with the resulting examples:

% ./snow -train -I ne-snow.ex -F ne.net -W:0-5

Test SNoW with examples from test data:

% ./snow -test -I ne-snow2.ex -F ne.net -o allpredictions -R ne.res

Improving Classifier Performance

Tune FEX script: experiment with different sensors
  InitialCapitalized, NotInitialCapitalized, AllCapitalized
Tune SNoW using test data
analyze.pl: a tool to help with tuning
  Gives accuracy for each label
  Requires SNoW’s ‘-o allpredictions’ mode

% ./analyze.pl snow.res
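A per-label breakdown is easy to compute once gold and predicted labels are paired up. The sketch below is not a port of analyze.pl and does not parse SNoW's result file, whose layout is not shown here; it assumes the pairs have already been extracted, and it reports, for each gold label, the fraction of its examples that were predicted correctly.

```python
from collections import defaultdict

def per_label_accuracy(pairs):
    """pairs: iterable of (gold_label, predicted_label).

    Returns {gold_label: fraction of that label's examples predicted correctly}.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for gold, predicted in pairs:
        total[gold] += 1
        if gold == predicted:
            correct[gold] += 1
    return {label: correct[label] / total[label] for label in total}

pairs = [("PER", "PER"), ("PER", "ORG"), ("LOC", "LOC"), ("ORG", "ORG")]
print(per_label_accuracy(pairs))
```

A label with low accuracy and many examples is usually the best place to start looking at features.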

We now have a classifier…

Need a way to apply it to new text...
  No formatting or Gold Standard labeling
  Need to enrich with POS, SP
  Need to track SNoW output and use it to label the data
Sample tools:
  link: ‘NE tagging: tools for new data’
  file: tut_ne_postprocess.tar.gz

Classifying New Data

First, let’s enrich our input:
  POS-tagging: POS tagger
  Chunking: Shallow Parser
NOTE: SP output format is not FEX-compatible
  Convert to Column format
  Tool available from ccg tools page

% ./chunk-to-column.pl inputFile > outputFile

Run data through FEX and SNoW servers
  One file at a time
  Doesn’t reload lexicon/network each time
  Can pipe test data through both together

Making life easier...

Starting SNoW server:
% ./snow -server <port> -F network.net &

Starting FEX server:
% ./fex -s <port> -P 0 <script> <lexicon> &

Need client scripts to interact with the servers
  See Snow_v3.1/tutorial/example-client.pl for SNoW
  See fex/fexClient.pl for FEX
Clean up after use...
  Use ‘ps’ to find the server processes, then kill them

Post-processing

SNoW ‘-o winners’ mode

% ./snow -test -I ... -F ... -R text.winners.res -o winners

Adding results to original data SNoW output mode must be ‘winners’

% ./numbers-to-labels.pl text.winners.res ne.lex > text.lab

% ./apply-labels.pl text.col text.lab > text.col.lab
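The two post-processing steps amount to a lookup followed by a merge. The sketch below shows the idea under strong assumptions: the winning target ids have already been read into a list (one per segment), the id-to-label mapping is a plain dictionary rather than the real lexicon file, and text outside noun phrases is carried along as "O" segments. None of the function names or formats here come from the actual Perl tools.

```python
def numbers_to_labels(winner_ids, id_to_label):
    """Map each winning target id to its label name."""
    return [id_to_label[i] for i in winner_ids]

def apply_labels(segments, labels):
    """Wrap each segment whose label is not "O" in [LABEL ...] brackets."""
    out = []
    for words, label in zip(segments, labels):
        text = " ".join(words)
        out.append(text if label == "O" else f"[{label} {text}]")
    return " ".join(out)

# Hypothetical id-to-label table and winners for three segments.
id_to_label = {0: "O", 1: "PER", 2: "LOC"}
segments = [["Franz", "Fischler"], ["visited"], ["Washington"]]
labels = numbers_to_labels([1, 0, 2], id_to_label)
print(apply_labels(segments, labels))
```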

In my solution, there is a seeming disparity between performance on held-out data and on completely unseen text
  WHY?
  What is the best way to improve the performance? (i.e., what is likely to give the best return per unit time invested?)

Summary: SNoW and FEX

SNoW is a supervised learning system
  Needs labeled data
  Performance constrained by the quality of the features it is given
  Works with numerical features: needs a preprocessing stage to extract those features
  Fast, with good performance
FEX provides a framework for feature engineering
  Designed to represent examples in SNoW input format
  Does *not* generate features automatically: not a replacement for a human expert!
  Requires certain input formats
  Fairly modular: write new sensors to capture new feature types
  Terse, expressive feature descriptors

Summary: solving NLP problems

Need to frame the problem appropriately (e.g. NE as noun phrase tagging)
Need appropriate labeled data
If you want an application, you will have to write pre- and post-processing
SNoW and FEX work close to the mathematical models underlying machine learning
  User has good control over ML algorithms
Be prepared to spend some time on error analysis and feature engineering!