natural language processing in augmentative and alternative communication yael netzer


Post on 22-Dec-2015


Natural Language Processing in Augmentative and Alternative Communication

Yael Netzer

Outline

• Natural Language Processing
• Augmentative and Alternative Communication
• My work - Generation of messages
– What does the process look like
– What is needed
• NLP in AAC
– Word prediction
– Message generation
– IR methods

Natural Language Processing

Applications and models that use linguistic knowledge, or that provide linguistic knowledge (POS taggers, parsers, etc.)

• Language applications:
– Machine translation
– Text summarization
– Information retrieval/extraction
– Human-computer interfaces

Alternative and Augmentative Communication

• AAC users
– Congenital diseases, e.g. cerebral palsy
– Progressive diseases, e.g. ALS (Amyotrophic Lateral Sclerosis, Lou Gehrig's Disease)
– Trauma, e.g. head injury
• Cognitive disabilities vs. physical disabilities (each requires different methods and assumptions).
• Slow rate of conversation
– Speech rate: 150-200 wpm; skilled typist: 60 wpm
– Speech prosthesis users: 10-15 wpm
• Each ‘key stroke’ may consume a lot of energy.
• Trade-off between conversation rate and cohesion of utterances.
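To make the rates above concrete, here is a small illustrative calculation in Python. The 30-word message length and the function name are assumptions made for this example; the rates are the low ends of the ranges quoted on the slide.

```python
def minutes_to_say(words, words_per_minute):
    """Time in minutes needed to produce `words` words at a given rate."""
    return words / words_per_minute

# Low end of each range quoted on the slide (words per minute).
RATES_WPM = {
    "speech": 150,
    "skilled typist": 60,
    "speech prosthesis": 10,
}

MESSAGE_LENGTH = 30  # assumed length of a short message, in words

for mode, wpm in RATES_WPM.items():
    seconds = minutes_to_say(MESSAGE_LENGTH, wpm) * 60
    print(f"{mode}: about {seconds:.0f} seconds")
```

Under these assumptions the same message takes a speech prosthesis user fifteen times longer than a speaker, which is the gap the rate-enhancement techniques try to narrow.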

AAC Techniques

• Simple pointing on boards and letter charts
• Portable keyboard devices
• Computer-based systems using single-switch access for severely impaired subjects.
• Symbols or letters
– Various symbol systems (Blissymbolics) / sets (PCS).
• Pre-stored phrases accessible via grid or iconic buttons.

AAC and NLP

• Common issues:
– Text generation
– Speech recognition
– Text-to-speech synthesis
– Information retrieval.

• 3 workshops, 1 special issue of a journal (Natural Language Engineering).

Our framework

• Natural language generation
– Content planning
– Surface realization

• Lexical choice

• Syntactic realization

• Morphological processing.

• FUF/SURGE, HUGG

• Lexicon (Jing et al.)

Desired Scenario

Content planning

lexicalization

Syntactic realization

vocalization

Examples

• ME / TO SEE / CAT / TO EAT
I saw the cat eating.
• CAT / TO EAT / TO SEE / ME
The cat ate and I saw it.
The cat that ate saw me.

Blissymbolics

• Invented by Charles Bliss in 1965 as a written universal language.
• Adapted by Canadian speech therapists in the early 70s: a successful alternative to verbal communication.
• Consists of approx. 100 basic symbols.
• The language now consists of more than 2000 complex ‘words’.
• Not so easy to learn, but…
– Enables good novel/personal expressions
– Good basis for literacy
– Adults like to use it too.

Bliss Example

Afraid = sad future


Example - Minspeak /PCS lexicon

apple, food, eat

house + food = grocery

rainbow + apple = red

Me See Cat Eat

Cat Eat See Me (Heb. Ver.)

Syntax: Ambiguity

• ME / SEE / CAT / TO EAT
I saw the cat eating.
• CAT / TO EAT / SEE / ME
The cat ate and I saw it.
The cat that ate saw me.
• However, users of AAC don’t usually obey the word order of spoken language:
– go+girl+house or girl+house+go or house+go+girl
– Two+bed+sleep+boy+one+girl+white+bed+brown+bed (the boy and the girl are sleeping in two beds, one in a white bed and the other in a brown bed).
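One way to cope with free symbol order is to assign roles by semantic type rather than by position. The Python sketch below is only an illustration: the toy lexicon, the type-to-role table and the function names are assumptions for this example, not part of the system described in the talk.

```python
# Hypothetical toy lexicon: symbol -> semantic type.
LEXICON = {
    "go": "action",
    "sleep": "action",
    "girl": "person",
    "boy": "person",
    "house": "place",
}

# Hypothetical mapping from semantic type to the role it fills.
ROLE_FOR_TYPE = {
    "action": "predicate",
    "person": "agent",
    "place": "location",
}

def assign_roles(symbols):
    """Build a role frame from symbols, ignoring the order they came in."""
    frame = {}
    for symbol in symbols:
        role = ROLE_FOR_TYPE[LEXICON[symbol]]
        if role in frame:
            # Two candidates for one role: type information alone cannot decide.
            raise ValueError(f"ambiguous role {role!r}: {frame[role]} / {symbol}")
        frame[role] = symbol
    return frame

def parse(sequence):
    """Parse a '+'-separated symbol sequence such as 'go+girl+house'."""
    return assign_roles(sequence.split("+"))

for sequence in ("go+girl+house", "girl+house+go", "house+go+girl"):
    print(sequence, "->", parse(sequence))
```

All three orderings yield the same frame. The sketch deliberately fails on inputs such as ME / SEE / CAT / TO EAT, where two animate participants compete for the same role; that is exactly the ambiguity shown above, and resolving it needs more than type information.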

Pragmatics

• Where the situation is taking place
• Who the hearer is
– Good morning vs. Hi
– Open the window vs. Can you please open the window?
• Gestures (facial, body)

Pragmatics: Ambiguity

Do you want to eat?

I want to eat.

Contextual Resources

• What is the context of the things that are said: what was already said before, and which referential expressions can be used.

• In a restaurant you can talk about “the menu”

• In front of a computer, “the menu” is a set of commands.

Textual Context: Previous Utterances

someone

a person

a woman

a mother

a female parent

Lucy

she

etc…

I met Lucy. She looks great.
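The choice among the expressions listed above depends on what has already been said. A minimal Python sketch of that idea follows; the entity record, the `refer` function and the rule "name on first mention, pronoun afterwards" are simplifying assumptions for illustration only.

```python
def refer(entity, mentioned):
    """Pick a referring expression based on previous utterances.

    `mentioned` is the set of entity names already introduced in the
    discourse; it is updated in place.
    """
    if entity["name"] in mentioned:
        return entity["pronoun"]      # already known to the hearer
    mentioned.add(entity["name"])
    return entity["name"]             # first mention: use the name

lucy = {"name": "Lucy", "pronoun": "she"}
mentioned = set()

first = f"I met {refer(lucy, mentioned)}."
second = f"{refer(lucy, mentioned).capitalize()} looks great."
print(first, second)
```

A fuller treatment would also choose among descriptions such as "a woman" or "a mother" and handle competing referents, which this sketch does not attempt.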

Generating from Symbols: Issues

• Syntactic ambiguity
• Contextual ambiguity
• No strict rules for use of symbols
– Individual codes, conventions, abbreviations.
• Textual: how one word affects the choice of another, ordering words, fluency.
• Practical: enhancing communication rate without limiting expressive abilities.
– (efficient keyboard setup, word prediction, structure prediction).

An overview of the architecture

Content planning

lexicalization

Syntactic realization

vocalization

Task: Parsing
Requirements: world knowledge, lexical information
Output: well-formed input for syntactic realizer
Tools: Vaillant’s conceptual graphs parsing, lexicon for verbs (Jing et al.), Bliss lexicon

Task: generating well-formed sentences
Tools: SURGE/HUGG

Lexicon

• Mapping concepts/symbols to words

• Compositional vs. non-compositional

• Organization of symbols for efficient retrieval.
– (POS, semantic connections)

• Available lexical knowledge
– Syntactic structure, irregularities, etc.

Methodology

Test interaction of different aspects
– Word/symbol/structure prediction

With more specific questions:
– Concepts to words
– Referential expression generation
– Pragmatic considerations.

Semantic Network (Vaillant)

Blissymbols Grid

Word -> Symbols
Symbols -> Symbol | Symbol near Symbols
Symbol -> FeaturedComponent | Symbol++FeaturedComponent
FeaturedComponent -> (Atomic)Position+Size+Direction
Atomic -> Pictographs | Arbitrary
Pictographs -> protection, house, circle, plus, pointer^, arrow, room, body, legs, chair, water, wheel, feeling
Arbitrary -> Articles | Numerals | Math-sign | Bliss-arbitrary
Position -> Vertical and Horizontal

Vertical -> VerticalPosition Spacing @ VerticalSigner
VerticalPosition -> right, left, centralized
VerticalSigner -> skyline, midline, earthline

Horizontal -> HorizontalPosition Spacing @ HorizontalSigner
HorizontalPosition -> above, under, centralized
HorizontalSigner -> leftline, middleline, rightline

Spacing -> zero | one | two ;; the distance between the constituents
Size -> full, half, quarter

Direction -> as-is, horizontal, vertical, left, right, upside-down | Direction-Direction
Bliss-arbitrary -> action, enclosure, multiplication, evaluation, nature, horizontal-line, vertical-line,
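A grammar like the one above maps naturally onto a small data structure. The Python sketch below covers only part of it (pictographs, Size and simple Direction values); Position, Spacing, arbitrary signs and compound directions are left out. The class and field names are assumptions for illustration, and the composed symbol is a made-up example, not claimed to be a real Blissymbol.

```python
from dataclasses import dataclass
from typing import Tuple

# Terminal values copied from the grammar above.
PICTOGRAPHS = {
    "protection", "house", "circle", "plus", "pointer^", "arrow", "room",
    "body", "legs", "chair", "water", "wheel", "feeling",
}
SIZES = {"full", "half", "quarter"}
SIMPLE_DIRECTIONS = {
    "as-is", "horizontal", "vertical", "left", "right", "upside-down",
}

@dataclass(frozen=True)
class FeaturedComponent:
    """An atomic shape plus its features (Position omitted in this sketch)."""
    atomic: str
    size: str = "full"
    direction: str = "as-is"

    def __post_init__(self):
        if self.atomic not in PICTOGRAPHS:
            raise ValueError(f"unknown pictograph: {self.atomic}")
        if self.size not in SIZES:
            raise ValueError(f"unknown size: {self.size}")
        if self.direction not in SIMPLE_DIRECTIONS:
            raise ValueError(f"unknown direction: {self.direction}")

@dataclass(frozen=True)
class Symbol:
    """Symbol -> FeaturedComponent | Symbol++FeaturedComponent."""
    components: Tuple[FeaturedComponent, ...]

    def extend(self, component):
        """Return a new Symbol with one more component appended (the ++ rule)."""
        return Symbol(self.components + (component,))

# Made-up composition, for illustration only.
base = Symbol((FeaturedComponent("house"),))
composed = base.extend(FeaturedComponent("feeling", size="half"))
print([c.atomic for c in composed.components])
```

Validating against the terminal sets at construction time means an ill-formed component is rejected before it reaches the lexicon or the grid display.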

SURGE

Work left to do:

• Integration

• Evaluation
– symbols-to-utterances corpus
– keystroke savings

Previous work of NLP-AAC

• Word prediction

• Message Generation

• Text simplification

Word Prediction

• Simple non-linguistic methods: possibly up to 50% savings of keystrokes.

• Improvement is still required.
• Including syntactic/semantic knowledge in the prediction process, using machine learning methods, based on corpus analysis.

• Methods:
– Frequency-based models (bi/tri-grams)
– Grammatical and conceptual modeling to predict well-formed utterances, such as the use of POS tags.
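A frequency-based bigram predictor and a keystroke count can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the three-sentence corpus, the function names, the prediction list size `k` and the cost model (one keystroke per typed letter, plus one to select a prediction or to type the following space) are all invented for the example.

```python
from collections import Counter

def train_bigrams(sentences):
    """Count how often each word follows the previous one."""
    counts = {}
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()
        for prev, word in zip(words, words[1:]):
            counts.setdefault(prev, Counter())[word] += 1
    return counts

def predict(counts, prev, prefix, k=3):
    """Top-k words seen after `prev` that start with the typed `prefix`."""
    followers = counts.get(prev, Counter())
    matches = [(w, c) for w, c in followers.items() if w.startswith(prefix)]
    matches.sort(key=lambda wc: (-wc[1], wc[0]))  # by frequency, then alphabetically
    return [w for w, _ in matches[:k]]

def keystrokes_needed(counts, sentence, k=3):
    """Keystrokes to enter `sentence` when predictions can be selected."""
    total, prev = 0, "<s>"
    for word in sentence.lower().split():
        typed = 0
        # Type letters until the word appears in the prediction list.
        while typed < len(word) and word not in predict(counts, prev, word[:typed], k):
            typed += 1
        total += typed + 1  # +1: select the prediction, or type the space
        prev = word
    return total

def keystroke_savings(counts, sentence, k=3):
    """Fraction of keystrokes saved relative to typing every letter and space."""
    baseline = sum(len(w) + 1 for w in sentence.split())
    return 1 - keystrokes_needed(counts, sentence, k) / baseline

corpus = ["i want to eat", "i want to sleep", "do you want to eat"]
model = train_bigrams(corpus)
print(predict(model, "to", ""))
print(f"{keystroke_savings(model, 'i want to eat'):.0%}")
```

The savings on a sentence taken from the training corpus are unrealistically high; an honest evaluation would measure them on held-out text from the user's own communication.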

Word Prediction

• KOMBE project – hand written syntactic rules.

• Carlberger

• Different languages?

Message Generation

• Language generation from reduced input
– Telegraphic text [Cushler, Badman, Demasco and McCoy]
• think red hammer break John => I think that the red hammer was broken by John.
– Cogeneration [Copestake]
• Construction of full sentences from templates.
– PVI
• Main assumption: order of word choice implies topicalization and should be considered.

The common architecture

Iconic/telegraphic input

Semantic parser: identification of predicator, unification of arguments.

Lexical choice

Syntactic realization: closed-words selection, linearization, morphology

Cogeneration approach

• Situation-based approach.
• A set of pre-defined templates:

Topic of discussion: <>
Participants: <>
Time of discussion: <> (optional)
You know <participants> talking about <topic>

Preferred: You know we were talking at breakfast about buying a desk lamp.
Ambiguous: You know we were talking about buying a desk lamp at breakfast.
Templates without cogeneration: You know us talking about buy desk lamp breakfast.
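The last line above is what plain slot filling produces. A minimal Python sketch of that baseline, with a hypothetical `fill_template` function, shows why grammatical knowledge has to be added on top of the template:

```python
def fill_template(topic, participants, time=None):
    """Naive slot filling: paste the raw slot values into the template.

    No inflection, no function words and no ordering decisions are made,
    which is what cogeneration is meant to supply.
    """
    message = f"You know {participants} talking about {topic}"
    if time:  # the time slot is optional
        message += f" {time}"
    return message + "."

print(fill_template(topic="buy desk lamp", participants="us", time="breakfast"))
```

Turning this output into "You know we were talking at breakfast about buying a desk lamp." requires choosing the pronoun case, inflecting the verbs, inserting the preposition and deciding where the time phrase attaches.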

PVI

• Paradigmatic dimension: icons organized in taxemes, further grouped in semantic domains.
• Syntagmatic dimension: build a casual structure of predicative concepts.
• Meaning of an icon: the features that distinguish it from the other icons.
• Semantic analysis: reconstructing the meaning of the icon sequence by building a semantic network.
• Lexical choice: assuming there is no bijective mapping between icons and words.
• Generation

Message Selection Systems

• Discourse structure

• Talk:About (Univ. of Dundee)

• A user uses pre-stored sentences.

• The sentences are indexed using rhetorical structure assumptions.

Language Simplification and Language Understanding

• PSET project [Carroll et al.]

• Intended for aphasic readers with lexical or syntactic impairments.

• Syntactic simplification:
– Passive to active

• Lexical simplification: look up synonyms, use the most frequent one.
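Lexical simplification by synonym lookup can be sketched as follows in Python. The synonym table and the frequency counts are invented for illustration; a real system would draw them from a thesaurus and a corpus, and this is not a description of how PSET itself is implemented.

```python
# Hypothetical synonym sets and corpus frequencies, for illustration only.
SYNONYMS = {
    "purchase": ["buy", "acquire"],
    "commence": ["begin", "start"],
}
FREQUENCY = {
    "buy": 900, "purchase": 120, "acquire": 60,
    "start": 800, "begin": 700, "commence": 15,
}

def simplify_word(word):
    """Replace a word by its most frequent synonym, if one is more frequent."""
    candidates = [word] + SYNONYMS.get(word, [])
    # max() keeps the first candidate on ties, so the original word wins a tie.
    return max(candidates, key=lambda w: FREQUENCY.get(w, 0))

def simplify(sentence):
    """Apply word-level simplification to a whitespace-tokenized sentence."""
    return " ".join(simplify_word(w) for w in sentence.split())

print(simplify("i will purchase a lamp"))
```

Words without an entry in the table are left unchanged. The sketch ignores word sense and inflection, both of which matter in practice.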

To sum up…

• NLP can be naturally and effectively integrated into AAC systems.

• Relaxations – user feedback is available on the spot.

• Data collection IS an issue here.
• The aim: make more flexible, expressive tools, with enhanced rate.
• Possibly, combined approaches.