Natural Language Processing, Gholamreza Ghassem-Sani, Fall 1383

1

Natural Language Processing

Gholamreza Ghassem-Sani

Fall 1383

2

Main Reference:

Natural Language Understanding

By James Allen, 1995

3

Goals of the field

Computers would be a lot more useful if they could handle our email, do our library research, talk to us …

But they are fazed by natural human language.

How can we tell computers about language? (Or help them learn it as kids do?)

4

Goals of the field

• Develop computational models of aspects of human language processing

• ...in order to write computer programs performing smart tasks with language

• ...in order to gain a better understanding of the human language processor, and of communication

5

Interdisciplinary research...

• Psychology, Cognitive Science

• Linguistics

• Philosophy

• Computer Science, Artificial Intelligence

6

NLP: applications

• Speech recognition and synthesis

• Machine translation

• Document processing

– information extraction

– summarization

• Text generation

• Dialog systems (typed and spoken)

7

Speech recognition

• Task: From representation of audio stream to sequence of words

• Word hypotheses lattice

• Applications: dictation, ...
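
A word hypotheses lattice can be pictured as a small directed graph whose edges carry competing word guesses for stretches of audio. The sketch below is only an illustration: the edges, node numbers, and costs are invented, and a real recognizer would derive the scores from acoustic and language models. It picks the cheapest word sequence through such a lattice.

```python
# Toy word lattice: (start_node, end_node, word, cost).
# All words and costs are invented for illustration.
EDGES = [
    (0, 1, "I", 0.4),
    (1, 3, "scream", 0.9),
    (0, 2, "ice", 0.5),
    (2, 3, "cream", 0.6),
]


def best_path(edges, start, goal):
    """Return (total_cost, words) for the cheapest path through the lattice.

    Assumes every edge goes from a lower to a higher node number, so
    processing edges sorted by start node visits them in topological order.
    """
    best = {start: (0.0, [])}
    for u, v, word, cost in sorted(edges):
        if u not in best:
            continue  # node u is not reachable from the start node
        candidate = (best[u][0] + cost, best[u][1] + [word])
        if v not in best or candidate[0] < best[v][0]:
            best[v] = candidate
    return best[goal]


cost, words = best_path(EDGES, 0, 3)
print(" ".join(words))
```

With these made-up costs the path "ice cream" beats "I scream"; changing the costs changes the winner, which is the point of keeping all hypotheses in the lattice until the end.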

8

Machine Translation

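[Diagram: source-language text (SL-Text) is mapped to target-language text (TL-Text) by Analysis and Generation, connected either through Transfer or through an Interlingua]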

9

Translating Speech

• Translate negotiation dialogs between German, English, Japanese in near-real-time

• Process spontaneous speech

10

Document Summarization

On Monday, GreenChip Solutions made an acquisition offer to BuyOut Inc., a St. Louis-based plastic tree manufacturer that had tremendous success in equipping American households with pink plastic oak trees.

Analysis: ( ... ( ... ) ( ... ))

Generation: GreenChip offered to acquire the plastic tree manufacturer BuyOut.

11

Information Extraction

• Dedicated search for key information in the text

– acquirer: GreenChip

– acquiree: BuyOut

– date-offer: Monday

– $$-offer: UNSPEC

• Key attributes are predefined

• Systems are domain-specific
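
The predefined attributes above can be read as slots in a template that a domain-specific pattern fills. The following is a minimal sketch, assuming a single hand-written regular expression for acquisition offers; the pattern, the function name `extract`, and the shortened input sentence are illustrative assumptions, not part of any real extraction system.

```python
import re

# Shortened version of the GreenChip sentence from the summarization slide.
TEXT = ("On Monday, GreenChip Solutions made an acquisition offer to "
        "BuyOut Inc., a St. Louis-based plastic tree manufacturer.")

# One hand-written pattern for one domain (hypothetical).
OFFER = re.compile(
    r"On (?P<date>\w+), (?P<acquirer>[A-Z]\w+)(?: \w+)? "
    r"made an acquisition offer to (?P<acquiree>[A-Z]\w+)"
)
AMOUNT = re.compile(r"\$\s?[\d,.]+")


def extract(text):
    """Fill the predefined slots; anything not found stays UNSPEC."""
    slots = {"acquirer": "UNSPEC", "acquiree": "UNSPEC",
             "date-offer": "UNSPEC", "$$-offer": "UNSPEC"}
    offer = OFFER.search(text)
    if offer:
        slots["acquirer"] = offer.group("acquirer")
        slots["acquiree"] = offer.group("acquiree")
        slots["date-offer"] = offer.group("date")
    amount = AMOUNT.search(text)
    if amount:
        slots["$$-offer"] = amount.group(0)
    return slots


print(extract(TEXT))
```

Because the sentence mentions no price, the `$$-offer` slot stays `UNSPEC`, as on the slide. The pattern would fail on almost any rephrasing, which illustrates why such systems are domain-specific.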

12

Text generation

• Reports

– Data mining results

– Project status

– Continuous data (e.g., air pollution)

• Technical Documentation

– User manuals

– multilingual, many variants, frequent updates

• Adaptive Hypertext

– Tailor the web page to the user

– user model, navigation history

13

Dialog systems

• Web-based, typed

• Spoken: telephone (web)

– Travel timetable

– Cinema programs

• E-Commerce: product search & info

• Combination of almost all NLP tasks, plus dialog strategy, planning, hearer model,...

14

NLP - Methods

• Levels of language analysis

• Symbolic versus statistical techniques

• Parsing - “understanding”

• Syntax

• Semantics, domain modeling

15

Levels of language analysis

• Phonetics/phonology/morphology: what words (or subwords) are we dealing with?

• Syntax: What phrases are we dealing with? Which words modify one another?

• Semantics: What’s the literal meaning?

• Pragmatics: What should you conclude from the fact that I said something? How should you react?

16

Levels of language analysis

• Phonetics: sounds -> words

– /b/ + /o/ + /t/ = boat

• Morphology: morphemes -> words

– friend + ly = friendly

• Syntax: word sequence -> sentence structure

• Semantics: sentence structure + word meaning -> sentence meaning

• Pragmatics: sentence meaning + context -> more precise meaning

• Discourse and world knowledge

17

Levels of language analysis (cont.)

1. Language is one of the fundamental aspects of human behavior and is a crucial component of our lives.

2. Green frogs have large noses.

3. Green ideas have large noses.

4. Large have green ideas nose.

5. I go store.

18

“Symbolics or statistics”

• Statistics used for:

– speech recognition

– part-of-speech tagging

– parsing

– machine translation, info retrieval, summarization

• Symbolic reasoning used for:

– parsing

– semantic analysis

– generation

– MT, dialog management

• Hybrid approaches
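
As one concrete instance of a statistical technique, part-of-speech tagging can be approximated by giving each word the tag it received most often in training data. The sketch below assumes a tiny hand-made tagged corpus and a made-up tag set (N, V, P, PRON); it is a baseline illustration, not the method of any particular tagger.

```python
from collections import Counter, defaultdict

# Tiny hand-made tagged corpus (invented for illustration only).
TAGGED = [
    [("time", "N"), ("flies", "V")],
    [("he", "PRON"), ("flies", "V"), ("planes", "N")],
    [("flies", "N"), ("like", "V"), ("sugar", "N")],
    [("they", "PRON"), ("like", "V"), ("rice", "N")],
    [("eat", "V"), ("like", "P"), ("kings", "N")],
]


def train(corpus):
    """Map each word to the tag it was seen with most often."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag(words, model, default="N"):
    """Tag each word independently; unseen words get the default tag."""
    return [(w, model.get(w.lower(), default)) for w in words]


model = train(TAGGED)
print(tag("Rice flies like sand".split(), model))
```

The baseline ignores context entirely: "flies" and "like" each get their most frequent tag regardless of neighbors, so only one of the sentence's possible readings can ever be produced. That limitation is one motivation for context-sensitive statistical models and for hybrid approaches.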

19

Parsing

[Diagram: an Utterance passes through Syntactic analysis, Semantic analysis, and Pragmatic analysis, drawing on a Grammar and a Lexicon]

20

Syntactic Analysis

• By analyzing sentence structure (parsing), determine how words are related to each other

– (The big furry rabbit) (jumped over (the lazy tortoise))

• Obtain the parse tree to represent groupings
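
The bracketed grouping above already encodes a tree. As a rough sketch, the helper below (a hypothetical name, `parse_brackets`) turns such a parenthesized string into nested Python lists, with one inner list per group. It only reads groupings that someone has already marked; it does not discover the structure from raw words.

```python
def parse_brackets(text):
    """Turn a parenthesized word grouping into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]  # stack[-1] is the group currently being filled
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            group = stack.pop()
            stack[-1].append(group)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


tree = parse_brackets("(The big furry rabbit) (jumped over (the lazy tortoise))")
print(tree)
```

The nested result shows "the lazy tortoise" sitting inside the group headed by "jumped over", which is the kind of grouping a parse tree represents.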

21

Syntactic analysis (2)

• Syntax can make explicit when there are several possible interpretations

– (Rice flies) like sand.

– Rice (flies like sand).

• Knowledge of ‘correct’ grammar can help find the right interpretation

– Flying planes are dangerous.

– Flying planes is dangerous.
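
The two readings of "Rice flies like sand" can be counted mechanically. The sketch below uses a CKY-style chart with a toy grammar and lexicon written just for this sentence; the rules and category names are assumptions for illustration, and the function returns only the number of parses, not the trees.

```python
from collections import defaultdict

# Toy lexicon and binary grammar, invented for this one sentence.
LEXICON = {
    "rice": {"N", "NP"},
    "flies": {"N", "V"},
    "like": {"V", "P"},
    "sand": {"N", "NP"},
}
RULES = [  # (parent, left child, right child)
    ("S", "NP", "VP"),
    ("NP", "N", "N"),   # noun compound: "rice flies"
    ("VP", "V", "NP"),  # "like sand" as verb + object
    ("VP", "V", "PP"),  # "flies [like sand]"
    ("PP", "P", "NP"),  # "like sand" as a prepositional phrase
]


def count_parses(words, goal="S"):
    """Count distinct parse trees with a CKY-style chart."""
    n = len(words)
    # chart[(i, j)][label] = number of trees with that label over words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for label in LEXICON[word.lower()]:
            chart[(i, i + 1)][label] = 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the children
                for parent, left, right in RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)][goal]


print(count_parses("Rice flies like sand".split()))
```

Under this toy grammar the sentence has exactly two parses, matching the two bracketings on the slide: one where "rice flies" is a noun compound and "like" is the verb, and one where "flies" is the verb and "like sand" modifies it.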

22

Semantic Analysis

• Different sentences with the same meaning should have the same representation:

– John gave the book to Mary.

– The book was given to Mary by John.

– give-action: agent: john, object: book, receiver: mary

• Then we can do reasoning, answer questions
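
To make the idea of a shared representation concrete, the sketch below maps the active and the passive sentence to the same give-action frame and then answers a simple question from it. The two regular expressions and the function names `give_frame` and `who_received` are illustrative assumptions; a real system would build the frame from a parse rather than from surface patterns.

```python
import re

# Two surface patterns for the same underlying action (hypothetical).
ACTIVE = re.compile(
    r"^(?P<agent>\w+) gave the (?P<object>\w+) to (?P<receiver>\w+)\.$")
PASSIVE = re.compile(
    r"^The (?P<object>\w+) was given to (?P<receiver>\w+) by (?P<agent>\w+)\.$")


def give_frame(sentence):
    """Return a give-action frame, or None if no pattern matches."""
    for pattern in (ACTIVE, PASSIVE):
        match = pattern.match(sentence)
        if match:
            return {
                "action": "give",
                "agent": match.group("agent").lower(),
                "object": match.group("object").lower(),
                "receiver": match.group("receiver").lower(),
            }
    return None


def who_received(frame, thing):
    """Answer 'who received the <thing>?' from a frame."""
    if frame and frame["object"] == thing:
        return frame["receiver"]
    return None


active = give_frame("John gave the book to Mary.")
passive = give_frame("The book was given to Mary by John.")
print(active == passive, who_received(passive, "book"))
```

Both sentences yield the same frame, so the question-answering step does not need to know which surface form was used.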

23

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

24

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“donut”: To get a donut (spare tire) for his car?

25

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“donut store”: store where donuts shop? or is run by donuts? or looks like a big donut? or made of donut? or has an emptiness at its core?

26

What’s hard about this story?

I stopped smoking freshman year, but

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

27

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“on his way home”: Describes where the store is? Or when he stopped?

28

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“from work”: Well, actually, he stopped there from hunger and exhaustion, not just from work.

29

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“thought”: At that moment, or habitually? (Similarly: Mozart composed music.)

30

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“every few hours”: That’s how often he thought it?

31

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“good every few hours”: But actually, a coffee only stays good for about 10 minutes before it gets cold.

32

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“a coffee ... every few hours”: Similarly: In America a woman has a baby every 15 minutes. Our job is to find that woman and stop her.

33

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“it”: the particular coffee that was good every few hours? the donut store? the situation?

34

What’s hard about this story?

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there.

“too expensive”: too expensive for what? what are we supposed to conclude about what John did?

how do we connect “it” to “expensive”?

35

The Flow of Information