Learning to Talk Through Listening
Alexander I. Rudnicky
with Ananlada Chotimongkol and Dan Bohus
Carnegie Mellon University
CATALOG 2004 – Barcelona
July 21, 2004
Outline
• Empirical approaches to understanding dialogue and building dialogue systems
• A task-based approach to dialogue
• Fundamental representations and observable events
• Learning through observation
Why Build Dialogue Systems?
• The devil is in the details
• Better understand the actual complexities of human-computer interaction
• Create specific artifacts that embody theories of dialogue and interaction (and thereby allow us to test them directly)
Domains, Tasks and Applications
[Figure: diagram relating one Domain, three Tasks and an Application]
Task representation/specification alternatives
• Code (unspecialized representations, procedural)
– Difficult to manage
• Forms (properly, F→A sets)
– Works for the simplest tasks, which can be easily cast as such
– Many examples
• Forms + graph-based dialogue structure
– Graph-based part essentially = code, same problems
– Examples: VXML, SALT
• Hierarchical, plan-based
– Task specified as a hierarchical plan (recipe) for the domain
– Examples: RavenClaw, Collagen
CMU dialogue approaches and systems
• Procedural
– Command and control [OM, etc.]
– Information access [MovieLine, etc.]
• Script-based and graph-based
– Travel planning; maintenance [SpeechWear]
• AGENDA-based
– Communicator: travel planning
– LARRI: task guidance [m-modal]
– Roomline, etc.: information access and transactions
– Madeleine: medical diagnosis
– TeamTalk: multi-participant dialogue
– Valerie: interviews
Graph-based systems
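[Figure: graph of system prompts linked by keyword transitions]
Prompt: Welcome to Bank ABC! Please say one of the following: Balance, Hours, Loan, ...
Transition: Loan
Prompt: What type of loan are you interested in? Please say one of the following: Mortgage, Car, Personal, ...
Transition: Car
. . . .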
Frame-based systems
• I would like to fly to Boston
• When would you like to fly?
• Friday
Destination_City: Boston
Departure_Date: 20030822
Departure_Time: ______
Preferred_Airline: ______
...
Frame-based systems
• I’d like to go to Boston on Friday, …
• What time would you like to leave?
Destination_City: Boston
Departure_Date: 20030822
Departure_Time: ______
Preferred_Airline: ______
...
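The frame-filling behaviour in the two exchanges above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation behind the slides: the slot names come from the frame shown, but `new_frame`, `fill`, `next_prompt` and two of the four prompts are hypothetical, and the parsed slot values are supplied by hand rather than by a recognizer.

```python
from typing import Dict, Optional

# Slot names as they appear in the frame on the slide.
FRAME_SLOTS = ["Destination_City", "Departure_Date", "Departure_Time", "Preferred_Airline"]

# Only the date and time questions are quoted on the slides; the other two
# prompts are invented for this sketch.
PROMPTS = {
    "Destination_City": "Where would you like to fly?",
    "Departure_Date": "When would you like to fly?",
    "Departure_Time": "What time would you like to leave?",
    "Preferred_Airline": "Which airline do you prefer?",
}


def new_frame() -> Dict[str, Optional[str]]:
    """Create a frame with every slot empty."""
    return {slot: None for slot in FRAME_SLOTS}


def fill(frame: Dict[str, Optional[str]], parsed: Dict[str, str]) -> None:
    """Copy recognized slot values into the frame; unknown keys are ignored."""
    for slot, value in parsed.items():
        if slot in frame:
            frame[slot] = value


def next_prompt(frame: Dict[str, Optional[str]]) -> Optional[str]:
    """Ask about the first slot that is still empty, or return None when done."""
    for slot in FRAME_SLOTS:
        if frame[slot] is None:
            return PROMPTS[slot]
    return None


# One utterance may fill several slots at once ("to Boston on Friday").
frame = new_frame()
fill(frame, {"Destination_City": "Boston", "Departure_Date": "20030822"})
prompt = next_prompt(frame)
```

Because the system always asks about the first empty slot, a user who volunteers more information in one utterance simply skips the corresponding questions.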
Frame-based systems
[Figure: ten forms, each with placeholder slot names and empty values]
Transition on keyword or phrase
Outline
• Empirical approaches to understanding dialogue and building dialogue systems
• A task-based approach to dialogue
• Learning through observation
• Fundamental representations and observable events
Task-oriented Interaction
• Implicit system goal is to create products
– Data structures that specify information for action
• Sessions can generate multiple products
– Immediate products, e.g., information requests
– Products that are built up incrementally over the course of a session, e.g., a plan such as an itinerary
• An Agenda to order (and re-order) topics for discussion
Products and Actions
• Products and Actions are domain-specific
– e.g., itineraries bookings, queries information display
• Products are represented as an ordered tree
– nodes in the trees correspond to schemas (handlers, agents, etc.) and are slots or forms
• Slot-specific computation is encapsulated in schema (handler objects)
• Agenda is generated from the current product tree
– defines the sequence of topics to take up with the user
Agenda Structure
• Ordered list of conversational topics
– current goal: focussed topic
– pending goals: schema yet to be filled
– persistent goals: handlers that are always active
• constructors
• generic help
• garble
[Figure: agenda list divided into current focus, pending goals and persistent goals]
Simple and Compound Schema
[Figure: a simple schema and a compound schema. Labels: receptors; value; transform; focus hook; prompt; invalidate value; self-promote; reorder tree; Domain Agent; Value_1, Value_2, Value_3; report; Domain Agent (e.g. SQL query); receptor]
Agendas from product tree traversal
• Default traversal of current product tree
– left-to-right, depth-first
– all nodes in the current product tree are always on the agenda
• Persistent goals sort to the bottom of the list
[Figure: product tree with nodes root, profile, Leg_1, Leg_2, Flight_1, Hotel_1, Car_1, Dest_1, Date_1 and Time_1, numbered 1 to 10]
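The default traversal described above can be sketched as a pre-order walk of the product tree. This is a minimal illustration under assumptions: the node names are taken from the slide's figure, but the exact tree shape, the `Node` class and `build_agenda` are hypothetical, and reading "left-to-right, depth-first" as pre-order is itself an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One schema (slot or form) in the product tree."""
    name: str
    children: List["Node"] = field(default_factory=list)


def build_agenda(root: Node, persistent: List[str]) -> List[str]:
    """List topic names left-to-right, depth-first (pre-order),
    then append the persistent goals at the bottom."""
    order: List[str] = []

    def visit(node: Node) -> None:
        order.append(node.name)
        for child in node.children:
            visit(child)

    visit(root)
    return order + list(persistent)


# Assumed tree shape; only the node names come from the slide.
flight = Node("Flight_1", [Node("Dest_1"), Node("Date_1"), Node("Time_1")])
leg_1 = Node("Leg_1", [flight, Node("Hotel_1"), Node("Car_1")])
root = Node("root", [Node("profile"), leg_1, Node("Leg_2")])

agenda = build_agenda(root, persistent=["generic help"])
```

Every node of the tree ends up on the agenda, and the persistent goal sorts to the end of the list regardless of the tree's shape.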
Shifting focus
• Agenda has linear structure
– Derived from product tree
• Focus capture implies reordering sibling nodes
– Reordering propagates to root
• enclosing topic contexts get promoted
– Focus node is promoted to top of the agenda
[Figure: a tree with nodes a to i, shown before and after node i gets focus, with the nodes' agenda positions renumbered after the shift]
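One way to read the focus-shift rules above is sketched below. It is an interpretation rather than the authors' algorithm: `Node`, `path_to`, `shift_focus` and `agenda_with_focus` are hypothetical names, the tree shape for nodes a to i is assumed, and placing the focus node ahead of a pre-order listing is only one possible reading of "promoted to top of the agenda".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity so list.remove() is exact
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


def path_to(node: Node, target: str) -> Optional[List[Node]]:
    """Return the list of nodes from `node` down to the node named `target`."""
    if node.name == target:
        return [node]
    for child in node.children:
        sub = path_to(child, target)
        if sub is not None:
            return [node] + sub
    return None


def shift_focus(root: Node, target: str) -> None:
    """Move the focus node, and each enclosing topic, to the front of its siblings."""
    path = path_to(root, target)
    if path is None:
        raise KeyError(target)
    for parent, child in zip(path, path[1:]):
        parent.children.remove(child)
        parent.children.insert(0, child)


def agenda_with_focus(root: Node, target: str) -> List[str]:
    """Reorder the tree, then list the focus node first and the rest in pre-order."""
    shift_focus(root, target)
    order: List[str] = []

    def visit(node: Node) -> None:
        order.append(node.name)
        for child in node.children:
            visit(child)

    visit(root)
    order.remove(target)
    return [target] + order


# Assumed shape for the tree a..i from the figure.
g = Node("g", [Node("h"), Node("i")])
e = Node("e", [Node("f"), g])
b = Node("b", [Node("c"), Node("d")])
a = Node("a", [b, e])

agenda = agenda_with_focus(a, "i")
```

After the call, `e` precedes `b` under `a`, `g` precedes `f` under `e`, and `i` precedes `h` under `g`, which is the propagation of the reordering to the root.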
Constructors
• Products are not fixed data structures but may expand through the course of a session
• Users can modify the product
– “I’d like to go on to Syracuse”
– [system adds a new leg sub-tree to the product]
[Figure: product tree gaining an additional leg sub-tree]
Hierarchical Plan-based Representation
[Figure: plan tree in which Login has the children AskRegistered, AskName, GreetUser, GetProfile and GreetGuest]
PRE: registered=false
PRE: AVAILABLE(name)
PRE: AVAILABLE(name)
GOAL: (registered = false) || AVAILABLE(profile)
Execution policy
• Dialog control:
– Task constraints (Declarative): define the boundaries of the space of possible dialogs
– Execution policy (Procedural/Workflow): actively defines dialogue control
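The split between declarative task constraints and a procedural execution policy can be sketched for the Login sub-tree as follows. This is a toy under loud assumptions, not RavenClaw code: the pairing of each PRE with a particular agent is guessed from the figure, the precondition on AskName does not appear on the slide, and `run_login`, `login_goal` and `available` are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, Optional[object]]


def available(key: str) -> Callable[[State], bool]:
    """AVAILABLE(key): true once the concept has a value."""
    return lambda state: state.get(key) is not None


# Declarative part: (agent, precondition). The pairing is an assumption.
LOGIN_PLAN: List[Tuple[str, Callable[[State], bool]]] = [
    ("AskRegistered", lambda s: True),
    ("AskName", lambda s: s["registered"] is True),  # assumption, not on the slide
    ("GreetUser", available("name")),
    ("GetProfile", available("name")),
    ("GreetGuest", lambda s: s["registered"] is False),
]


def login_goal(state: State) -> bool:
    """GOAL: (registered = false) || AVAILABLE(profile)."""
    return state["registered"] is False or available("profile")(state)


def run_login(answers: Dict[str, object]) -> Tuple[List[str], State]:
    """Procedural part: visit agents left to right, skipping any whose
    precondition does not hold in the current state."""
    state: State = {"registered": None, "name": None, "profile": None}
    trace: List[str] = []
    for agent, precondition in LOGIN_PLAN:
        if not precondition(state):
            continue
        trace.append(agent)
        if agent == "AskRegistered":
            state["registered"] = answers["registered"]
        elif agent == "AskName":
            state["name"] = answers.get("name")
        elif agent == "GetProfile":
            state["profile"] = {"name": state["name"]}  # stand-in for a real lookup
    return trace, state


user_trace, user_state = run_login({"registered": True, "name": "Alex"})
guest_trace, guest_state = run_login({"registered": False})
```

The same plan produces two different dialogues: a registered user is asked for a name, greeted and looked up, while a guest goes straight from AskRegistered to GreetGuest. In both cases the Login goal holds at the end.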
Hierarchical Plan-based Representation
[Figure: plan tree with the execution stack and concept values]
Communicator: Welcome, Login, Travel, Locals, Bye
Login: AskRegistered, AskName, GreetUser, GetProfile
Travel: Leg1
Leg1: GetQuery, ExecuteQuery, DiscussLeg1
Stack: AskRegistered, Login, Communicator (labels: FOCUS, MAIN TOPIC)
Concept values as the dialogue proceeds:
Registered: [yes]
Registered: [yes] Name: [user_name]
Registered: [yes] Name: [user_name] Departure: [City] Arrival: [City] … … …
S: Are you a registered user?
U: Yes, this is Alex [yes] [user_name]
Hierarchical Plan-based Representation
Leg1: GetQuery (FORM), ExecuteQuery, DiscussLeg1
GetQuery form:
DepartureLocation: TCity
ArrivalLocation: TCity
DepartureDate: TDate
DepartureTime: TTime
Common task skills
Dialog Engine
• Controls the dialog by executing the hierarchical plan-based task specification
• In the process, automatically exhibits appropriate generic (task- and domain-independent) conversational skills:
– Global dialogue mechanisms
• repeat, suspend, start-over, help, where are we?
– Grounding
• Implicit and explicit confirmations, disambiguations, various non-understanding handling strategies
– Timing and turn-taking
Issues that remain
• Parallel activities and asynchronous events
– Understanding the scope of “dialogue”
• Knowledge engineering dialogue systems
– Building the interface between the dialogue engine and the world (“pragmatics”)
– Capturing human speech and language behavior within tasks and domains
– Reasoning about the world within applications
– Communicating meaningfully and efficiently with the user about the state of the world
Outline
• Empirical approaches to understanding dialogue and building dialogue systems
• A task-based approach to dialogue
• Learning through observation
• Fundamental representations and observable events
Learning by observation
• Many automatic systems are meant to substitute for current human-based operations (e.g., a travel agency or a call center)
• Can we use such existing working human systems to infer the structure of a corresponding automatic system?
• If so, what might be the requisite representations and learning heuristics?
Learning to dialogue
• Goal-directed conversation is regular
– Both participants can agree on the same goal and both participants want to achieve this goal
– Correct transmission of information is at a premium
• Can we exploit the regularity to extract the (currently human engineered) structure of the dialogue?
Learning structure from dialogue
• Concept identification
• Form (topic) segmentation
• Task graphs
• Multiple data streams
• Lightly supervised learning
Travel agent and client
[Figure: a travel agent and client dialogue divided into topic segments labeled greeting, out leg, return, hotel, car, confirm and payment / close]
Outline
• Empirical approaches to understanding dialogue and building dialogue systems
• A task-based approach to dialogue
• Learning through observation
• Fundamental representations and observable events
Properties of a dialogue representation
1. Sufficiency
– Captures sufficient information for the creation of a dialogue system
– Describes the important (i.e., operative) phenomena in conversations
2. Generality
– Covers conversations in dissimilar domains
3. Learnability
– Can be populated through observation (e.g., from a corpus of human-human conversations)
• Components of task structure
– Procedures for completing task goal(s)
• Steps in the task and their dependencies (i.e., the workflow)
– Domain language
• Concepts and idioms that humans use to communicate about the task
– Domain reasoning
• The relationships between language and task, and the domain of the application
Task-centric dialogue representation
• Components of task structure
– Procedures for completing task goal(s)
• Steps in the task and their dependencies (i.e., the workflow)
– Domain language
• Words, constructs and idioms that humans use to communicate about the task
– Domain reasoning
• The relationships between language and task, and the domain of the application
Dialogue primitives
Levels of representation
1. Task: a subset of conversational sequences that achieves a particular (human/system) goal
2. Sub-task: a step in a task that contributes toward the fulfillment of the task goal
– The smallest unit of a dialogue that contains information sufficient to execute a specific domain action
3. Concept: key domain entities (perhaps organized into a type-hierarchy or ontology)
Mechanisms
1. Task oriented: form-filling and result negotiation
2. Discourse oriented: grounding, etc.
Task Structure Representation
• Task = collection of forms
• Sub-task = a form
• Concept = a slot in a form
F: Query_Departure_Time
Depart_Location: carnegie_mellon
Arrive_Location: the airport
Arrive_Time: Hour: four Minute: thirty
Bus_Number: 28X
Example: Air travel planning
1. Task: create itinerary
2. Sub-tasks:
– Flight reservation
– Hotel reservation
– Car rental reservation
3. Concepts:
– Airline = { Continental, Iberia, … }
– Hotel = { Novotel, Hilton, … }
Example: Bus schedule enquiry
1. Task (multiple tasks):
– Find bus numbers that run between two locations
– Find a departure time given a bus number and stop location
2. Sub-tasks:
– No further decomposition needed
3. Concepts:
– Bus Number = { 61C, 28X, … }
– Location = { CMU, airport, … }
Dialogue mechanisms
• Operations invoked by participants:
– Correspond to an utterance or a part of an utterance
– Each has a unique consequence for the state of the conversation
– e.g., init_form causes a system to create a new form
– The behavior of an operation is the same regardless of the domain (only the parameters differ)
Dialogue mechanisms (2)
• Dialogue procedure
– Requires more than one utterance to complete
– A confirmation mechanism = 2 operations (confirmation_request + respond)
• Non-verbal operation
– Activated by a state of the representation rather than a verbal expression
– access_database is activated by the completion of the query form
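The difference between verbal operations and a non-verbal one can be sketched as below. The operation names `init_form`, `fill_form_info` and `access_database` come from the slides; everything else is an assumption made for the sketch, including the `DialogueState` class, the `Query_Bus_Number` form, its slots and the toy database.

```python
from typing import Dict, List, Optional, Tuple


class DialogueState:
    """Holds the open forms and any results produced by completed forms."""

    def __init__(self, schemas: Dict[str, List[str]],
                 database: Dict[Tuple[str, ...], List[str]]) -> None:
        self.schemas = schemas      # form name -> ordered slot names
        self.database = database    # toy lookup table (hypothetical)
        self.forms: Dict[str, Dict[str, Optional[str]]] = {}
        self.results: Dict[str, List[str]] = {}

    def init_form(self, form_name: str) -> None:
        """Verbal operation: create a new, empty form."""
        self.forms[form_name] = {slot: None for slot in self.schemas[form_name]}

    def fill_form_info(self, form_name: str, slot: str, value: str) -> None:
        """Verbal operation: put a value into one slot of an open form."""
        self.forms[form_name][slot] = value
        self._check_completion(form_name)

    def _check_completion(self, form_name: str) -> None:
        """Fire the non-verbal operation once every slot has a value."""
        form = self.forms[form_name]
        if all(value is not None for value in form.values()):
            self.results[form_name] = self.access_database(form_name)

    def access_database(self, form_name: str) -> List[str]:
        """Non-verbal operation: triggered by form state, not by an utterance."""
        form = self.forms[form_name]
        key = tuple(form[slot] for slot in self.schemas[form_name])
        return self.database.get(key, [])


schemas = {"Query_Bus_Number": ["Depart_Location", "Arrive_Location"]}
database = {("CMU", "airport"): ["28X"]}  # invented data for illustration

state = DialogueState(schemas, database)
state.init_form("Query_Bus_Number")
state.fill_form_info("Query_Bus_Number", "Depart_Location", "CMU")
pending = dict(state.results)  # still empty: the form is not complete yet
state.fill_form_info("Query_Bus_Number", "Arrive_Location", "airport")
```

Nothing in the operations refers to buses: moving to another domain would only change the schemas and the database passed in, which is the sense in which only the parameters differ.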
An example from the Map Task
• Forms
– Action forms ( →draw_line )
– Entity forms ( landmark )
• Operations ( various )
• Resolving a misunderstanding through grounding [session q8nc7]
[Figure: Giver’s Map and Follower’s Map]
Episode 11-1
Operation:
GIVER87: ask_landmark: have you got a TarLM:[golden beach((left))]?
FOLLOWER88: respond: yes uh-huh. add_landmark: (golden beach (right)) (Misunderstanding: the follower grounds the right one while the giver asks about the left one)
Giver’s
Landmark: golden beach (left)
Giver Map: yes
Follower Map:
Location:
Follower’s
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Giver’s
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Follower’s
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Giver’s
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Episode 11-1 (2)
Operation:
GIVER87: ask_landmark: have you got a TarLM:[golden beach((left))]?
FOLLOWER88: respond: yes uh-huh. add_landmark: (golden beach (right)) (Misunderstanding: the follower grounds the right one while the giver asks about the left one)
Grounding Form
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Origin:
Orientation:
Distance:
Path:
Destination:
Episode 11-2
Operation:
GIVER89: fill_form_info: well go Dir:[straight up] ... ... from Ori:[Loc:[the top of the white mountain]] 'til you're just Dest:[Loc:[beside the golden beach]] to Dest:[Loc:[the right of it (white mountain)]]
FOLLOWER90: acknowledge: right,
Grounding Form
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Origin: Ori:[Loc:[the top of the white mountain]]
Orientation: Dir:[straight up ]
Distance:
Path:
Destination: Dest:[Loc:[beside the golden beach]] to Dest:[Loc:[the right of it (white mountain)]]
Origin: Ori:[Loc:[the top of the white mountain]]
Orientation: Dir:[straight up ]
Distance:
Path:
Destination: Dest:[Loc:[beside the golden beach]] to Dest:[Loc:[the right of it (white mountain)]]
Episode 11-3
Operation:
ask_fill_form_info: you want me to go dilect-- ... Dir:[directly right]?
GIVER91: respond: no, fill_form_info: Dir:[directly up].
Grounding Form
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Episode 11-4
Operation:
FOLLOWER92: fill_form_info: but golden beach((right)) is away in Loc:[the far right].
(The follower explicitly fills in the location of the golden beach (right).)
GIVER93: acknowledge: ah right. (Agrees with the location of the golden beach (right))
Giver’s
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Follower’s
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Follower’s
Landmark: golden beach (right)
Giver Map:
Follower Map: yes
Location: the far right
Giver’s
Landmark: golden beach (left)
Giver Map: yes
Follower Map:
Location:
Giver’s
Landmark: golden beach (right)
Giver Map: yes
Follower Map:
Location:
Episode 11-5
Operation:
FOLLOWER94: ask_landmark: have you got TarLM:[your (golden beach (right))]?
GIVER95: inform_other_info: i've got two golden beaches.
FOLLOWER96: acknowledge: ah. add_landmark: (golden beach (right))
Landmark: golden beach (right)
Giver Map:
Follower Map: yes
Location: the far right
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: the far right
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: the far right
Episode 11-5 (2)
Operation:
FOLLOWER94: ask_landmark: have you got TarLM:[your (golden beach (right))]?
GIVER95: inform_other_info: i've got two golden beaches.
FOLLOWER96: acknowledge: ah. add_landmark: (golden beach (right))
Grounding Form
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: the far right
Episode 11-6
Operation:
GIVER97: fill_form_info: sorry ... so there's TarLM:[the one (golden beach (left))] Loc:[above the ... white mountain] as well to Loc:[the left of it (white mountain)] for me.
FOLLOWER98: fill_form_info: is there, yeah there's nothing nothing there.
add_landmark: golden beach (left)
GIVER99: acknowledge: right okay,
Landmark: golden beach (left)
Giver Map: yes
Follower Map:
Location:
Landmark: golden beach (left)
Giver Map: yes
Follower Map: no
Location: above the ... white mountain, the left of it (white mountain)
Landmark: golden beach (left)
Giver Map: yes
Follower Map: no
Location: above the ... white mountain, the left of it (white mountain)
Landmark: golden beach (left)
Giver Map: yes
Follower Map:
Location: above the ... white mountain, the left of it (white mountain)
Grounding Form
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: the far right
Episode 11-6 (2)
Operation:
GIVER97: fill_form_info: sorry ... so there's TarLM:[the one (golden beach (left))] Loc:[above the ... white mountain] as well to Loc:[the left of it (white mountain)] for me.
FOLLOWER98: fill_form_info: is there, yeah there's nothing nothing there.
add_landmark: golden beach (left)
GIVER99: acknowledge: right okay,
Grounding Form
Landmark: golden beach (left)
Giver Map: yes
Follower Map:
Location: above the ... white mountain, the left of it (white mountain)
Grounding Form
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: the far right
Applying the representation
• Four different task-oriented domains
– Air travel planning
• Professional travel agent and volunteer clients (re)booking former trips
– HCRC map-reading task
• Hired subjects communicating path information
– Bus schedule information
• Professional agents helping customers
– UAV operation
• Trainees flying an unmanned aerial vehicle in a simulation
Evaluation Corpora
• Annotated conversations from the four task-oriented domains
Domain           Available #Dialogs   Analyzed #Dialogs   Analyzed #Utterances
Bus schedule     12                   5                   90
Air travel       43                   4                   273
Map reading      128                  4                   498
UAV operation    2                    1                   224
Rejected utterances
• Utterances that could not be described by the proposed structure
– Out Of Domain (OOD)
– Out Of Scope (OOS): in-domain but out of the conversation goal
– Indirect: requires substantial reasoning or world-knowledge to interpret
– Task Management (TM): manages the overall state of the dialogue, rather than a particular form
Rejected utterance percentage
Rejected utterances (%) by domain
Domain           OOD    OOS    Indirect   TM     Total
Bus schedule     4.4    4.4    6.7        0.0    15.6
Air travel       1.8    4.4    0.4        2.6    9.2
Map reading      0.0    0.0    2.2        0.0    2.2
UAV simulation   1.0    0.0    1.0        4.0    5.9
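As a quick consistency check on the table above, the four category percentages in each row should add up to the reported total, up to rounding of the individual entries. The short sketch below does that arithmetic; the figures are copied from the table, while `ROWS` and `total_gap` are names introduced here.

```python
from typing import Dict, List, Tuple

# (OOD, OOS, Indirect, TM) percentages and the reported total, per domain.
ROWS: Dict[str, Tuple[List[float], float]] = {
    "Bus schedule": ([4.4, 4.4, 6.7, 0.0], 15.6),
    "Air travel": ([1.8, 4.4, 0.4, 2.6], 9.2),
    "Map reading": ([0.0, 0.0, 2.2, 0.0], 2.2),
    "UAV simulation": ([1.0, 0.0, 1.0, 4.0], 5.9),
}


def total_gap(parts: List[float], total: float) -> float:
    """Reported total minus the sum of the categories, to one decimal place."""
    return round(total - sum(parts), 1)


gaps = {domain: total_gap(parts, total) for domain, (parts, total) in ROWS.items()}

for domain, gap in gaps.items():
    print(f"{domain}: reported total differs from category sum by {gap:+.1f}")
```

No row is off by more than 0.1 percentage points, which is what one would expect if each entry was rounded independently from underlying counts.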
Summary
• Human-computer dialogue is organized around specific tasks within domains
• The key level of representation is in fact the task; applications are particular embodiments of these tasks
• All applications necessarily include a large amount of detail
– Such detail is not knowable a priori (and much of it cannot be generated from principle)
– Either extensive knowledge engineering or (better) systems that learn are necessary to produce systems that function robustly