Final HPSGs: Cleaning up and final aspects, semantics, overview to statistical NLP

CSE6339 3.0 Introduction to Computational Linguistics

Tuesdays, Thursdays 14:30-16:00 – South Ross 101

Fall Semester, 2011

Instructor: Nick Cercone - 3050 CSEB - [email protected]

Upload: brigit

Post on 07-Jan-2016




HPSGs

An Overlooked Topic: Complements vs. Modifiers

• Intuitive idea: Complements introduce essential participants in the situation denoted; modifiers refine the description.

• Generally accepted distinction, but disputes over individual cases.

• Linguists rely on heuristics to decide how to analyze questionable cases (usually PPs).


HPSGs

Heuristics for Complements vs. Modifiers

• Obligatory PPs are usually complements.

• Temporal & locative PPs are usually modifiers.

• An entailment test: If X Ved (NP) PP does not entail X did something PP, then the PP is a complement.

Examples

– Pat relied on Chris does not entail Pat did something on Chris

– Pat put nuts in a cup does not entail Pat did something in a cup

– Pat slept until noon does entail Pat did something until noon

– Pat ate lunch at Bytes does entail Pat did something at Bytes


HPSGs

Agreement

• Two kinds so far (namely?)

• Both initially handled via stipulation in the Head-Specifier Rule

• But if we want to use this rule for categories that don’t have the AGR feature (such as PPs and APs, in English), we can’t build it into the rule.


HPSGs

The Specifier-Head Agreement Constraint (SHAC)

Verbs and nouns must be specified as:
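The feature structure that followed this sentence on the slide is not preserved in the transcript. As a rough illustration only, and assuming the usual reading in which a head's AGR value is identified with the AGR value of its specifier, the constraint can be sketched as a compatibility check. The names `Agr`, `Word`, and `shac_ok` are hypothetical and not part of the course material.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Agr:
    """Toy agreement value (hypothetical encoding): person and number."""
    person: int
    number: str  # "sg" or "pl"


@dataclass
class Word:
    form: str
    agr: Optional[Agr]  # None: AGR absent or unspecified for this word


def shac_ok(specifier: Word, head: Word) -> bool:
    """Return True when specifier and head have compatible AGR values.

    Assumption: a missing AGR value imposes no restriction, which is how
    this sketch treats categories without AGR (e.g. PPs and APs).
    """
    if specifier.agr is None or head.agr is None:
        return True
    return specifier.agr == head.agr


cat = Word("cat", Agr(3, "sg"))
a = Word("a", Agr(3, "sg"))
these = Word("these", Agr(3, "pl"))
print(shac_ok(a, cat), shac_ok(these, cat))
```

Moving the requirement out of the Head-Specifier Rule and onto the words themselves is what lets the same rule serve categories that have no AGR feature.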


HPSGs

The Count/Mass Distinction

• Partially semantically motivated

– mass terms tend to refer to undifferentiated substances (air, butter, courtesy, information)

– count nouns tend to refer to individuatable entities (bird, cookie, insult, fact)

• But there are exceptions:

– succotash (mass) denotes a mix of corn & lima beans, so it’s not undifferentiated.

– furniture, footwear, cutlery, etc. refer to individuatable artifacts with mass terms

– cabbage can be either count or mass, but many speakers get lettuce only as mass.


HPSGs – Semantics

The Linguist’s stance: Building a precise model

• Some statements are statements about how the model works:

“[prep] and [AGR 3sing] cannot be combined because AGR is not a feature of the type prep.”

• Some statements are statements about how (we think) English or language in general works.

“The determiners a and many only occur with count nouns, the determiner much only occurs with mass nouns, and the determiner the occurs with either.”

• Some are statements about how we code a particular linguistic fact within the model.

“All count nouns are [SPR < [COUNT +]>].”
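The third kind of statement can be made concrete with a toy lexicon. The sketch below assumes that a noun records the COUNT value it requires of its specifier and that a determiner may leave COUNT unspecified (as with "the"). The dictionaries and the function `det_noun_ok` are hypothetical illustrations, not the grammar's actual lexical entries.

```python
# Hypothetical mini-lexicon. COUNT is True (+), False (-), or None (unspecified).
DETERMINERS = {"a": True, "many": True, "much": False, "the": None}

# For nouns, the value is the COUNT value required of the specifier.
NOUNS = {"cookie": True, "fact": True, "furniture": False, "information": False}


def det_noun_ok(det: str, noun: str) -> bool:
    """A determiner fits a noun if its COUNT value is unspecified or matches."""
    det_count = DETERMINERS[det]
    return det_count is None or det_count == NOUNS[noun]


for det, noun in [("a", "cookie"), ("much", "cookie"), ("the", "furniture")]:
    print(det, noun, det_noun_ok(det, noun))
```

Leaving COUNT unspecified on "the" means a single entry covers both noun classes, which is the kind of parsimony the model aims for.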


HPSGs – Semantics

The Linguist’s stance: A Vista on the Set of Possible English Sentences

• ... as a background against which linguistic elements (words, phrases) have a distribution

• ... as an arena in which linguistic elements “behave” in certain ways


HPSGs - Semantics

So far, our “grammar” has no semantic representations. We have, however, been relying on semantic intuitions in our argumentation, and discussing semantic contrasts where they line up (or don't) with syntactic ones.

Examples?

• structural ambiguity

• S/NP parallelism

• count/mass distinction

• complements vs. modifiers


HPSGs - Semantics

Aspects of meaning we won’t account for

• Pragmatics

• Fine-grained lexical semantics:


HPSGs - Semantics

Our Slice of a World of Meanings

“... the linguistic meaning of Chris saved Pat is a proposition that will be true just in case there is an actual situation that involves the saving of someone named Pat by someone named Chris.”


HPSGs - Semantics

Our Slice of a World of Meanings

What we are accounting for is the compositionality of sentence meaning.

• How the pieces fit together: semantic arguments and indices

• How the meanings of the parts add up to the meaning of the whole: appending RESTR lists up the tree


HPSGs – Semantics in Constraint-based grammar

Constraints as generalized truth conditions

• proposition: what must be the case for a proposition to be true

• directive: what must happen for a directive to be fulfilled

• question: the kind of situation the asker is asking about

• reference: the kind of entity the speaker is referring to

Syntax/semantics interface: Constraints on how syntactic arguments are related to semantic ones, and on how semantic information is compiled from different parts of the sentence.


HPSGs – Semantics – Feature Geometry


HPSGs – Semantics – How the pieces fit together


HPSGs – Semantics – How the pieces fit together


HPSGs – Semantics – How the pieces fit together


HPSGs – Semantics (pieces together)


HPSGs – Semantics (more detailed view of same tree)


HPSGs – Semantics

To fill in the semantics for the S node, we need the Semantic Principles.

The Semantic Inheritance Principle:

• In any headed phrase, the mother's MODE and INDEX are identical to those of the head daughter.

The Semantic Compositionality Principle:

• In any well-formed phrase structure, the mother's RESTR value is the sum of the RESTR values of the daughters.
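The two principles can be sketched as a single function that builds the mother's semantics from its daughters. This is a minimal sketch under stated assumptions: predications are encoded as plain tuples, "sum" is taken to mean list concatenation in daughter order, and the names `Sem` and `build_mother` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Toy encoding of a predication, e.g. ("save", "s", "i", "j").
Predication = Tuple[str, ...]


@dataclass
class Sem:
    mode: str                                   # e.g. "prop" or "ref"
    index: str                                  # e.g. "s", "i", "j"
    restr: List[Predication] = field(default_factory=list)


def build_mother(daughters: List[Sem], head_pos: int) -> Sem:
    """Apply both principles to a list of daughters.

    Semantic Inheritance: MODE and INDEX come from the head daughter.
    Semantic Compositionality: RESTR is the daughters' RESTR lists appended.
    """
    head = daughters[head_pos]
    restr = [pred for d in daughters for pred in d.restr]
    return Sem(mode=head.mode, index=head.index, restr=restr)


# "Chris saved Pat": the subject NP combines with the VP, which is the head.
np_chris = Sem("ref", "i", [("name", "Chris", "i")])
vp_saved_pat = Sem("prop", "s", [("save", "s", "i", "j"), ("name", "Pat", "j")])
s_node = build_mother([np_chris, vp_saved_pat], head_pos=1)
print(s_node.mode, s_node.index, s_node.restr)
```

Because the shared indices `i` and `j` already link the predications, the S node's RESTR list alone says who saved whom.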


HPSGs – Semantics – semantics inheritance illustrated


HPSGs – Semantics - semantic compositionality illustrated


HPSGs – Semantics – what identifies indices


HPSGs – Semantics – summary, words

• contribute predications

• ‘expose’ one index in those predications, for use by words or phrases

• relate syntactic arguments to semantic arguments


HPSGs – Semantics – summary, grammar rules

identify feature structures (including the INDEX value) across daughters

Head-Specifier Rule

Head-Complement Rule

Head-Modifier Rule


HPSGs – Semantics – summary, grammar rules

• identify feature structures (including the INDEX value) across daughters

• license trees which are subject to the semantic principles

- SIP ‘passes up’ MODE and INDEX from head daughter

- SCP ‘gathers up’ predications (RESTR list) from all daughters


HPSGs – other aspects of semantics

• Tense, Quantification (only touched on here)

• Modification

• Coordination

• Structural Ambiguity


HPSGs – what we are trying to do

Objectives

• Develop a theory of knowledge of language

• Represent linguistic information explicitly enough to distinguish well-formed from ill-formed expressions

• Be parsimonious, capturing linguistically significant generalizations.

Why Formalize?

• To formulate testable predictions

• To check for consistency

• To make it possible to get a computer to do it for us


HPSGs – how we construct sentences

The Components of Our Grammar

• Grammar rules

• Lexical entries

• Principles

• Type hierarchy (very preliminary, so far)

• Initial symbol (S, for now)

We combine constraints from these components.

• Question: What says we have to combine them?


HPSGs – an example

A cat slept.

• Can we build this with our tools?

• Given the constraints our grammar puts on well-formed sentences, is this one?


HPSGs – lexical entry for “a”

• Is this a fully specified description?

• What features are unspecified?

• How many word structures can this entry license?


HPSGs – lexical entry for “cat”

• Which feature paths are abbreviated, and is this fully specified?

• What features are unspecified?

• How many word structures can this entry license?


HPSGs - Effect of Principles: the SHAC


HPSGs - Description of Word Structures for cat


HPSGs - Description of Word Structures for a


HPSGs - Building a Phrase


HPSGs - Constraints Contributed by Daughter Subtrees


HPSGs - Constraints Contributed by the Grammar Rule


HPSGs - A Constraint Involving the SHAC


HPSGs - Effects of the Valence Principle


HPSGs - Effects of the Head Feature Principle


HPSGs - Effects of the Semantic Inheritance Principle


HPSGs - Effects of the Semantic Compositionality Principle


HPSGs - Is the Mother Node Now Completely Specified?


HPSGs - Lexical Entry for slept


HPSGs - Another Head-Specifier Phrase


HPSGs - Is this description fully specified?


HPSGs - Does the top node satisfy the initial symbol?


HPSGs - RESTR of the S node


HPSGs – Another example


HPSGs - Head Features from Lexical Entries


HPSGs - Head Features from Lexical Entries, plus HFP


HPSGs - Valence Features: Lexicon, Rules, and the Valence Principle


HPSGs - Required Identities: Grammar Rules


HPSGs - Two Semantic Features: the Lexicon & SIP


HPSGs - RESTR Values and the SCP


HPSGs - An Ungrammatical Example

What’s wrong with this sentence?

The Valence Principle, Head Specifier Rule


HPSGs – Overview

• Information movement in trees

• Exercise in critical thinking

• SPR and COMPS

• Technical details (lexical entries, trees)

• Analogies to other systems you might know, e.g., How is the type hierarchy like an ontology?


Statistical NLP – Introduction

The NLP we have examined thus far can be contrasted with statistical NLP. For example, statistical parsing researchers assume that there is a continuum and that the only distinction to be drawn is between the correct parse and all the rest.

The “parse” given by the parse tree on the right would support this continuum view.

For statistical NLP researchers, there is no difference between parsing and syntactic disambiguation: it’s parsing all the way!


Statistical NLP

Statistical NLP is normally taught in 2 parts:

• Part I lays out the mathematical and linguistic foundation that the other parts build on. These include concepts and techniques normally referred to throughout the course.

• Part II covers word-centered work in Statistical NLP. There is a natural progression from simple to complex linguistic phenomena in collocations, n-gram models, word sense disambiguation, and lexical acquisition. This work is followed by techniques such as Markov Models, tagging, probabilistic context free grammars, and probabilistic parsing, which build on each other. Finally other applications and techniques are introduced: statistical alignment and machine translation, clustering, information retrieval, and text categorization.


Statistical NLP – What we will discuss

1. Information Retrieval and the Vector Space Model

Typical IR system architecture; steps in document and query processing in IR; vector space model; tf-idf (term frequency, inverse document frequency) weights; term weighting formula; cosine similarity measure; term-by-document matrix; reducing the number of dimensions; Latent Semantic Analysis; IR evaluation.
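To make the vector space model concrete, here is a minimal sketch of tf-idf weighting and the cosine similarity measure. It assumes one common weighting variant, raw term frequency times log(N/df); the course may use a different term weighting formula. The function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: tf(t, d) * log(N / df(t)), one of several variants.
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = [["cat", "sat"], ["cat", "slept"], ["dog", "barked"]]
v0, v1, v2 = tfidf_vectors(docs)
print(cosine(v0, v1), cosine(v0, v2))
```

Documents that share no terms get similarity 0, while the two documents that share "cat" get a small positive score, because a term found in two of the three documents carries little weight.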


Statistical NLP – What we will discuss

2. Text Classification

Text classification and text clustering; types of text classification; evaluation measures in text classification; F-measure; evaluation methods for classification: general issues (overfitting and underfitting); methods: 1. training error, 2. train and test, 3. n-fold cross-validation.
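The evaluation measures listed here can be computed directly from gold and predicted labels. The sketch below uses the standard definitions of precision, recall, and the F-measure for one class treated as positive. The function name and the toy labels are hypothetical.

```python
def precision_recall_f(gold, predicted, positive, beta=1.0):
    """Return (precision, recall, F_beta) for the class `positive`."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if p == positive and g == positive)
    fp = sum(1 for g, p in pairs if p == positive and g != positive)
    fn = sum(1 for g, p in pairs if p != positive and g == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


gold = ["spam", "spam", "ham", "ham"]
pred = ["spam", "ham", "spam", "ham"]
print(precision_recall_f(gold, pred, positive="spam"))  # (0.5, 0.5, 0.5)
```

With `beta=1.0` the F-measure is the harmonic mean of precision and recall; a larger `beta` weights recall more heavily.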


Statistical NLP – What we will discuss

3. Parser Evaluation, Text Clustering and CNG Classification

Parser evaluation: PARSEVAL measures, labeled and unlabeled precision and recall, F-measure; text clustering: task definition, the simple k-means method, hierarchical clustering, divisive and agglomerative clustering; evaluation of clustering: inter-cluster similarity, cluster purity, use of entropy or information gain; CNG (Common N-Grams) classification method.


Statistical NLP – What we will discuss

4. Probabilistic Modeling and Joint Distribution Model

Elements of probability theory; generative models; Bayesian inference; probabilistic modeling: random variables, random configurations, computational tasks in probabilistic modeling, spam detection example; joint distribution model; drawbacks of joint distribution model.


Statistical NLP – What we will discuss

5. Fully Independent Model and Naive Bayes Model

Fully independent model: example, computational tasks, sum-product formula; Naive Bayes model: motivation, assumption, computational tasks, example, number of parameters, pros and cons; N-gram model; language modeling in speech recognition.


Statistical NLP – What we will discuss

6. N-gram Model

N-gram model: n-gram model assumption, graphical representation, use of log probabilities; Markov chain: stochastic process, Markov process, Markov chain; perplexity and evaluation of N-gram models; text classification using language models.
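A minimal sketch of a bigram model shows the n-gram assumption, the use of log probabilities, and perplexity together. It assumes unsmoothed maximum-likelihood estimates and sentence boundary markers `<s>` and `</s>`. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts over tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        contexts.update(tokens[:-1])               # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams


def log_prob(sentence, contexts, bigrams):
    """Sum of log P(w_i | w_{i-1}); -inf if any bigram is unseen (no smoothing)."""
    tokens = ["<s>"] + sentence + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        count = bigrams[(prev, word)]
        if count == 0:
            return float("-inf")
        total += math.log(count / contexts[prev])
    return total


def perplexity(sentence, contexts, bigrams):
    """exp of the negative average log probability per predicted token."""
    n_predicted = len(sentence) + 1                # the words plus </s>
    return math.exp(-log_prob(sentence, contexts, bigrams) / n_predicted)


corpus = [["a", "cat", "slept"], ["a", "dog", "slept"]]
contexts, bigrams = train_bigram(corpus)
print(log_prob(["a", "cat", "slept"], contexts, bigrams))
print(perplexity(["a", "cat", "slept"], contexts, bigrams))
```

Summing logs avoids the numerical underflow that multiplying many small probabilities would cause. The `-inf` result for unseen bigrams is the problem that smoothing addresses.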


Statistical NLP – What we will discuss

7. Hidden Markov Model

Smoothing: add-one (Laplace) smoothing, Witten-Bell smoothing; Hidden Markov Model: graphical representations, assumption, HMM POS example; Viterbi algorithm (use of dynamic programming in HMMs).
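The dynamic programming idea behind the Viterbi algorithm can be sketched for a toy part-of-speech tagger. This is an illustrative sketch, not the course's implementation: the probability tables are invented for the example, missing entries are treated as probability 0, and scores are kept in log space.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for `words` under an HMM."""
    def lp(p):
        return math.log(p) if p > 0 else float("-inf")

    # Each column maps tag -> (best log score of a path ending here, backpointer).
    columns = [{t: (lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)), None)
                for t in tags}]
    for word in words[1:]:
        prev_col, column = columns[-1], {}
        for t in tags:
            score, best_prev = max(
                (prev_col[p][0] + lp(trans_p[p].get(t, 0)), p) for p in tags
            )
            column[t] = (score + lp(emit_p[t].get(word, 0)), best_prev)
        columns.append(column)

    # Follow backpointers from the best final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


# Hypothetical toy model: D = determiner, N = noun, V = verb.
tags = ["D", "N", "V"]
start_p = {"D": 0.8, "N": 0.1, "V": 0.1}
trans_p = {"D": {"N": 0.9, "V": 0.1}, "N": {"V": 0.8, "N": 0.2}, "V": {"D": 0.5, "N": 0.5}}
emit_p = {"D": {"a": 1.0}, "N": {"cat": 0.9, "slept": 0.1}, "V": {"slept": 0.9, "cat": 0.1}}
print(viterbi(["a", "cat", "slept"], tags, start_p, trans_p, emit_p))  # ['D', 'N', 'V']
```

Each column keeps only the best path into each tag, so the cost grows linearly with sentence length and quadratically with the number of tags, instead of enumerating every tag sequence.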


Statistical NLP – What we will discuss

8. Bayesian Networks

Bayesian Networks: definition, example; evaluation tasks in Bayesian Networks: evaluation, sampling, inference in Bayesian Networks by brute force; general inference in Bayesian Networks is NP-hard; efficient inference in Bayesian Networks.


Other Concluding Remarks

ATOMYRIADES

Nature, it seems, is the popular name
for milliards and milliards and milliards
of particles playing their infinite game
of billiards and billiards and billiards.