
Post on 22-Dec-2015


Siridus

an information state approach to flexible spoken dialogue systems

David Milward Staffan Larsson [email protected] [email protected]

Overview

• Siridus project
• Information State Approach
• Main areas of work
• Some selected topics
  – TrindiKit
  – Flexible issue based dialogue management
  – Robust interpretation

• Conclusions

Siridus Project

• EU Framework 5
• Partners
  – Universities of
    • Gothenburg (technical coordinator)
    • Seville
    • Saarland (administrative coordinator)
  – Telefonica
  – Linguamatics
  – (SRI)
• Duration
  – Jan 2000 - Dec 2002

The IS Approach

• considering dialogue in terms of transitions between states
  – structured “information states”
  – can be used to model mental states of the dialogue participants
• advantages
  – theoretical objects/data structures for dialogue analysis
  – perspective for comparison of systems and theories

IS transitions

IS for System

– what are the best structures to represent the state of the dialogue?
  – stacks? sets of feature structures?
  – private beliefs + shared commitments?
– is information in the IS relevant to:
  – recognition/interpretation
  – generation/synthesis
– what is explicitly coded, what is emergent?
  – moves/games emerge as an aggregate of updates

[Diagram labels: Interpret U; Generate U’; IS for System]

IS inspired systems

• IS seems well suited for building systems between e.g. VoiceXML and full BDI

• Examples: Delfos, GoDiS
• Information structure including
  – lists of feature structures (Delfos)
  – stack of QUD (GoDiS)
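To make the idea of an information state with update transitions concrete, here is a minimal Python sketch. It assumes an IS made of private beliefs, shared commitments and a QUD stack, as the slides suggest. The class and rule names are hypothetical and are not the TrindiKit or GoDiS API.

```python
from dataclasses import dataclass, field


@dataclass
class InformationState:
    """Toy information state (illustrative only, not the TrindiKit data model)."""
    private_beliefs: set = field(default_factory=set)
    shared_commitments: set = field(default_factory=set)
    qud: list = field(default_factory=list)  # last element is the top of the stack


def integrate_ask(state, question):
    """Update rule: an 'ask' move pushes its question onto QUD."""
    state.qud.append(question)
    return state


def integrate_answer(state, proposition):
    """Update rule: an answer resolves the topmost question and becomes shared."""
    if state.qud:
        state.qud.pop()
    state.shared_commitments.add(proposition)
    return state


# A dialogue is then a sequence of transitions between states.
state = InformationState()
integrate_ask(state, "?x.destination(x)")
integrate_answer(state, "destination(paris)")
```

In this reading, moves are not primitive objects: each one is just the aggregate effect of the update rules it triggers.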

Main areas of work in Siridus

• Dialogue types
• Generating prosody using IS
• Enhancing speech recognition through IS
• Demonstrators
• Toolkit for researchers (TrindiKit 2/3)
• Flexible dialogue
• Robust interpretation and reconfigurability of dialogue systems

Dialogue Types

• Natural command languages
  – Quesada, Amores: Cooperation and collaboration in Natural Command Language Dialogues
• Conditional responses
  – Kruijff-Korbayova, Karagjosova, Larsson: Enhancing collaboration with conditional responses in information seeking dialogues
• Clarifications
  – Cooper & Ginzburg: Using Dependent Record Types in Clarification Ellipsis

• Negotiative dialogue

Limited Enquiry Negotiation Dialogue (LEND)

• information seeking dialogues often assume a fixed set of parameters users must fill out

• in human-human corpora, participants often negotiate which parameters to use

• they are batting proposals to and fro
• users need not supply any particular subset of the parameters of an action
• similarly, the system needn’t be indifferent about the best set of parameters to use
  – it may greatly prefer to get a departure-time if it can

• decide on the next question according to maximal utility
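Choosing the next question by maximal utility could look like the following sketch. The utility values and parameter names are invented for illustration; the slides do not say how utilities are estimated.

```python
def next_question(utilities, filled):
    """Return the still-open parameter with the highest utility, or None."""
    open_params = {p: u for p, u in utilities.items() if p not in filled}
    if not open_params:
        return None
    return max(open_params, key=open_params.get)


# Hypothetical utilities: the system greatly prefers a departure time.
utilities = {"departure_time": 0.9, "destination": 0.8, "travel_class": 0.2}
choice = next_question(utilities, filled={"destination"})
```

Because no parameter is mandatory, the user can still volunteer any subset; the utilities only steer which question the system asks next.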


Generating Prosody using IS: Do we need it?

• Sample dialogue
  – S: How can I help you?
  – U: I'd like a flight from London to Paris
  – S: When do you want to travel?
  – S: WHEN do you want to travel?
• Default intonation in Text-To-Speech systems:
  – When do you WANT to travel? (Festival, ViaVoice)
  – When DO you want to travel? (Articulator)
  – When do you want TO travel? (AT&T)

Can we fix the intonation for particular sentences/phrases?

No! Intonation is context dependent

U: A train from London to Edinburgh, the first

S: Sorry, do you want to travel on the first of SEPTEMBER or do you want to travel first CLASS?

U: First of September how much is it?

S: SECOND class costs FIFTY pounds. FIRST class costs ONE HUNDRED pounds

Information State and Prosody

• Motivate prosody using abstract notion of information structure partitioning
  – cf. Steedman, Vallduví, Prague school etc.
• Integration of these concepts into an information state update approach
  – Theme/rheme
    • questions under discussion (QUD)
    • QUD can be implicit so more general than QA pairs
  – Focus/background
    • comparison of current utterance vs
      – shared commitments from dialogue history
      – alternatives available in the domain
• aim to avoid overgeneration of focus
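As a rough illustration of the focus/background comparison, the sketch below marks as focused (upper case) only those words of an utterance that do not occur in a contrasting alternative. This word-level contrast is a deliberately crude stand-in chosen for illustration; it is not the method developed in the project.

```python
def mark_focus(utterance, alternative):
    """Upper-case the words of `utterance` that are absent from `alternative`."""
    alternative_words = set(alternative.lower().split())
    return " ".join(
        word.upper() if word.lower() not in alternative_words else word
        for word in utterance.split()
    )


second = "second class costs fifty pounds"
first = "first class costs one hundred pounds"
marked_second = mark_focus(second, first)
marked_first = mark_focus(first, second)
```

Words shared by both alternatives ("class", "costs", "pounds") stay in the background, which is one simple way to avoid marking too much material as focus.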


Enhancing speech recognition and synthesis using IS

• For finite state or form based dialogue can associate particular grammars with particular prompts
• But what if you generate your prompts automatically?
  – Set of LMs to cover classes of generated prompts e.g.
    • Prompts expecting an NP reply
  – Mix of grammar based and statistical LMs
    • Statistical LM used for back off
    • or embed grammars in a statistical LM
  – Single statistical LM for the whole domain + rescoring of n-best/lattice according to syntax/semantics/context

Lattice/n-best approach

• Potential sources of information:
  – previous move/dialogue history
    Where do you want to go.... Boston ...
  – syntactic/semantic coherence of fragments
    switch on the door
  – coherence of fragments relative to each other
    leave ..... at 6pm
  – reference resolution
    turn off all heaters vs. turn off hall heater
  – state of the world
    turn on|off the light in the hall
• Combine/contrast sources of evidence to decide
  – most likely utterance
  – when to clarify
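A minimal sketch of n-best rescoring with one source of evidence, the previous move, is given below. The bonus, the clarification margin, the scores and the `type_of` helper are all assumptions made for illustration; a real system would combine several of the sources listed above.

```python
def rescore(nbest, expected_type, type_of, context_bonus=0.2, clarify_margin=0.1):
    """Rescore (hypothesis, recogniser_score) pairs using dialogue context.

    Returns the best hypothesis and whether the system should clarify
    because the top two rescored hypotheses are too close.
    """
    rescored = []
    for hypothesis, score in nbest:
        bonus = context_bonus if type_of(hypothesis) == expected_type else 0.0
        rescored.append((score + bonus, hypothesis))
    rescored.sort(reverse=True)
    best_score, best = rescored[0]
    needs_clarification = (
        len(rescored) > 1 and best_score - rescored[1][0] < clarify_margin
    )
    return best, needs_clarification


CITIES = {"boston", "austin"}  # hypothetical domain knowledge


def type_of(hypothesis):
    return "city" if hypothesis in CITIES else "other"


# Previous move: "Where do you want to go?" so a city is expected.
best, clarify = rescore([("bosom", 0.50), ("boston", 0.45)], "city", type_of)
```

Here the context lifts "boston" above the acoustically preferred "bosom"; with two plausible cities that score almost the same, the function would instead signal that a clarification question is appropriate.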


TrindiKit

[Diagram labels: GoDiS; GoDiS-I; GoDiS-A; Travel Agency; VCR manager; IBDM/KOS; ISU approach; Delfos basic system; Delfos framework; DELFOS-NCL; Home manager; ATOS]

Demonstrators: Automated telephone operator

• Natural language telephone-based access to company telephone directory/PABX
  – Dial by name
  – Email address
  – Multi-party conference
  – Call transfers
• KQML based message passing
• Delfos dialogue management
• User trials at Telefonica I+D

Demonstrators: Home device control

• Command and control of home devices (following on from the D’Homme Project)
  – Is the lounge TV on
  – Switch the TV to Channel 4
  – Turn all the lights off in the bedrooms
  – I’d like to record a programme
  – Night mode
• Reconfigurability issues, “plug and play” of devices: what happens when you move devices around or add new devices?
• Advanced reference resolution
  – Quesada and Amores: Knowledge based reference resolution for Dialogue Management in a Home Domain Environment

Home Simulation

Por favor, enciende la luz. (“Please turn on the light.”)

USDD

Real devices


Interpretation in flexible dialogue systems

• System initiative useful in constraining what users are likely to say

• User initiated utterances are more variable and less easy to predict

• Compiling a precise grammar into the recogniser is less likely to give good results

• Want to be able to extract partial results and avoid total failure

Task

• map from IS and recognition lattice/n-best list to a (partial) interpretation

• using IS information after interpretation may be too late

S: what kind of route would you prefer?
U: the quick route pleased

• need to use information in IS to help
  – choose between fragments,
  – find sensible combinations of fragments

• but can we deal with fragments at all?

Self contained fragments

I want to go from London ... to Birmingham

origin = London destination = Birmingham

• No need to put fragments together
• Can extract pieces of semantics independently
• Combination of meaning is via implicit conjunction, i.e.
  origin = London & destination = Birmingham
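Extracting self-contained fragments independently can be sketched with two simple phrase spotters. The regular expressions below are illustrative assumptions (for instance, that city names are single capitalised words), not the grammars used in the project.

```python
import re

CITY = r"([A-Z][a-z]+)"  # assumption: a city is one capitalised word

SPOTTERS = {
    "origin": re.compile(r"\bfrom " + CITY),
    "destination": re.compile(r"\bto " + CITY),
}


def spot(utterance):
    """Run each spotter independently; the result is an implicit conjunction."""
    slots = {}
    for slot, pattern in SPOTTERS.items():
        match = pattern.search(utterance)
        if match:
            slots[slot] = match.group(1)
    return slots


slots = spot("I want to go from London ... to Birmingham")
```

Each spotter ignores the rest of the utterance, so a gap or misrecognition between the two fragments does not affect the result.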

Dependent fragments

I want to leave Birmingham .... at 3 pm

departure-time = 3pm

• Can’t decide between arrival/departure time without considering other fragments

• Can use some keyword spotters - phrase grammars not enough

Context vs. Syntax

S: When do you want to leave?

U: ... 4pm ...
U: I’m not sure when I’d like to leave, but I’d like to arrive by 4pm

• Need to be able to override use of keywords + context using syntactic information
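The interplay of context and syntax in the two replies above can be sketched as follows. Splitting the utterance into clauses at commas and "but" is a crude stand-in for real syntactic information, and the slot names and verb table are assumptions made for illustration.

```python
import re

TIME = re.compile(r"\b(\d{1,2}\s?(?:am|pm))\b")
VERB_TO_SLOT = {"leave": "departure-time", "arrive": "arrival-time"}


def interpret_time(utterance, expected_slot):
    """Return (slot, time) for the first time expression, or None.

    A verb in the same clause as the time expression overrides the slot
    expected from the dialogue context.
    """
    for clause in re.split(r",|\bbut\b", utterance):
        match = TIME.search(clause)
        if not match:
            continue
        for verb, slot in VERB_TO_SLOT.items():
            if re.search(r"\b" + verb + r"\b", clause):
                return slot, match.group(1)
        return expected_slot, match.group(1)  # fall back on context
    return None


# The system has just asked "When do you want to leave?"
short_reply = interpret_time("... 4pm ...", "departure-time")
long_reply = interpret_time(
    "I'm not sure when I'd like to leave, but I'd like to arrive by 4pm",
    "departure-time",
)
```

In the long reply "leave" occurs in a different clause from "4pm", so it is ignored and "arrive" wins over the context; a plain keyword spotter over the whole utterance could not make that distinction.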

Integrating syntactic constraints

• Phrase spotters go to a particular depth of analysis even if a full parse is possible

• When syntactic information is available should use it
• When not should do as well as keyword/phrase spotting
• Approach:
  – distribute the semantic representation
  – get inside fragments to access particular components e.g. “leave”
  – structural constraints available where necessary
• Allows
  – rules similar to keyword/phrase spotting
  – more specialised rules involving syntactic constraints

Indexed representations (1)

• Can split up any recursive representation into a set of indexed constraints e.g.

John wants to leave Boston

wants(john, leave(john, boston))

e1: wants

e2: john

e3: leave

e4: john

e5: boston

e0: e1(e2,e6)

e6: e3(e4,e5)
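Splitting a recursive term into indexed constraints can be sketched in a few lines. The nested-tuple encoding of terms is an assumption made for illustration, and the indices are assigned in depth-first order, so the numbering differs from the one on the slide even though the structure is the same.

```python
from itertools import count


def to_indexed(term):
    """Flatten a nested (functor, arg, ...) tuple into indexed constraints.

    Returns the root index and a dict mapping each index either to an atom
    or to a tuple of indices (functor index followed by argument indices).
    """
    constraints = {}
    fresh = count()

    def visit(subterm):
        index = f"e{next(fresh)}"
        if isinstance(subterm, tuple):
            constraints[index] = tuple(visit(part) for part in subterm)
        else:
            constraints[index] = subterm
        return index

    return visit(term), constraints


# wants(john, leave(john, boston))
root, constraints = to_indexed(("wants", "john", ("leave", "john", "boston")))
```

A fragmentary input would simply yield fewer constraints, for example the ones for leave(john, boston) without any constraint linking them to "wants".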

Indexed Representations (2)

• Individual constraints provide content and structure
• Uniform representation for utterance fragments or fully connected sentence
  – fragments miss certain constraints, or identity between indices
• Provides a description of the original semantic representation
• semantic chart or lattice is an indexed representation where readings are packed
  – same index -> alternative reading
  – cf. edges with ‘index’: 0-1-np

Achieving compositionality

• What happens when the output semantics needs to be more complex?
  – I want to go to Glasgow on Tuesday and Edinburgh on Wednesday
  – Turn on all the lights except in the kitchen

• Not enough to analyse each fragment independently

• Need to put semantics together

Compositional concept based semantics

What can provide clues for putting things together if syntax is inadequate?

• regular patterns of hesitations and repairs

• function argument structure

• general ontological knowledge

• specific real world knowledge

Example

• Consider
  – the light in the back bedroom
  – the back bedroom light
  – light .... back ... bedroom

• if there is only one light in each bedroom, no need for clarification

• general ontological knowledge might suggest a device in a room

• specific knowledge about the house allows choice of “back bedroom” vs “back light”

Approach

• Connect concepts according to a model of the domain
• Ontological relationships:
  – devices can be in locations
  – lights are devices
  – rooms are locations
• Specific information:
  – b1 is a bedroom
  – b1 orientation is back
  – l1 is a light
  – l1 in b1
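The following sketch connects the concepts in "light .... back ... bedroom" using the ontology and facts listed above. The extra entities b2 and l2, and the assumption that a bedroom is a kind of room, are added for illustration only; the encoding as Python dictionaries is likewise hypothetical.

```python
# Ontology: lights are devices, rooms are locations.
# Assumption added here: a bedroom is a kind of room.
ISA = {"light": "device", "bedroom": "room", "room": "location"}

# Specific facts: b1 and l1 come from the slide; b2 and l2 are hypothetical.
TYPE = {"b1": "bedroom", "l1": "light", "b2": "bedroom", "l2": "light"}
ORIENTATION = {"b1": "back", "b2": "front"}
IN = {"l1": "b1", "l2": "b2"}


def is_a(entity, concept):
    """True if the entity's type is `concept` or a subtype of it."""
    current = TYPE.get(entity)
    while current is not None:
        if current == concept:
            return True
        current = ISA.get(current)
    return False


def resolve(device_concept, modifier, location_concept):
    """Find devices in locations and attach the modifier where the facts allow."""
    readings = []
    for device, location in IN.items():
        if not (is_a(device, device_concept) and is_a(device, "device")):
            continue
        if not (is_a(location, location_concept) and is_a(location, "location")):
            continue
        if ORIENTATION.get(location) == modifier:
            readings.append((device, f"{modifier} {location_concept}"))
        if ORIENTATION.get(device) == modifier:
            readings.append((device, f"{modifier} {device_concept}"))
    return readings


readings = resolve("light", "back", "bedroom")
```

The general ontology licenses "a device in a location", while the house-specific facts select the "back bedroom" reading over "back light"; a single reading means no clarification is needed.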

Conclusions

• Semantic chart provides a convenient knowledge representation for manipulating fragment semantics

• Ontological/domain knowledge used to combine fragments
  – allows recursive combination
  – not just filling in a fixed number of slots in a template

• Allows mapping from IS + word lattice to interpretation without requiring a full parse

• Reconciling traditional syntax and compositional semantics with robust approaches

IS Approach: Conclusions

• a framework in which we can do interesting theoretical work
  – improving prosody/recognition using IS
  – exploration of different dialogue genres
  – better comparison of theories

• a research tool for dialogue developers (TrindiKit)

• systems which are
  – modular
  – reconfigurable
  – between FS/form filling and BDI/planning

Siridus

www.ling.gu.se/projekt/siridus