probabilistic approaches to reasoning and control: towards autonomous interactive mobile robots...

Probabilistic approaches to reasoning and control:

Towards autonomous interactive mobile robots

Joelle Pineau, Carnegie Mellon University

TAMALE Seminar

March 28, 2003


Our vision of robotic-assisted health-care

Moving things around

Enabling use of remote health services

Supporting inter-personal communication

Calling for help in emergencies

Monitoring Rx adherence & safety

Providing information (TV, weather)

Management support of ADLs

Reminding to eat, drink, & take meds

Providing physical assistance

Linking the caregiver to resources


Introducing Pearl: A mobile robotic assistant for elderly people and nurses

cameras

sonars

handle bars

mobile base

carrying tray

LCD mouth

touchscreen

microphone & speakers

laser


What are the challenges?

• Interaction with the environment:

– navigating robustly

– handling dynamic obstacles

• Interaction with individuals:

– communicating by speech

– providing cognitive reminders

– interpreting and satisfying user requests


System Overview

Cognitive support, Navigation, Communication

High-level controller


System Overview

Navigation:

• Localization and map building (Burgard et al., 1999)

• People detection and tracking (Montemerlo et al., 2002)

Other modules: Cognitive support, Communication


Navigation and people tracking


System Overview

Cognitive support:

• Autominder system (Pollack et al., 2002)

Other modules: Navigation, Communication


System Overview

Communication:

• Speech recognition: Sphinx system (Ravishankar, 1996)

• Speech synthesis: Festival system (Black et al., 1999)

Other modules: Cognitive support, Navigation


Speech recognition with Sphinx


The role of the top-level controller

Cognitive support, Navigation, Communication

ACTION SELECTION - based on the trade-off between:

- goals from different modules;

- goals with varying costs / rewards;

- reducing uncertainty versus accomplishing goals.



Types of uncertainty in robotics

• Cause #1: Non-deterministic effects of actions

• Cause #2: Partial and noisy sensor information

• Cause #3: Inaccurate model of the world and the user


Robot control under uncertainty using Partially Observable Markov Decision Processes

State: User + Environment + Robot

e.g. request-weather-today

Action = {say-weather, update-appointment, clarify-query}

Speech = “today”

Belief state:

e.g. P(s_t = weather-today) = 0.5, P(s_t = appointment-today) = 0.5


Existing applications of POMDPs

– Maintenance scheduling

» Puterman, 1994

– Robot navigation

» Koenig & Simmons, 1995; Roy & Thrun, 1999

– Helicopter control

» Bagnell & Schneider, 2001; Ng et al., 2002

– Dialogue modeling

» Roy, Pineau & Thrun, 2000; Paek & Horvitz, 2000

– Preference elicitation

» Boutilier, 2002


Graphical Model Representation

POMDP is n-tuple { S, A, Ω, T, O, R }:

S = state set

A = action set

Ω = observation set

T(s,a,s’) = state-to-state transition probabilities

O(s,a,o) = observation generation probabilities

R(s,a) = reward function

What goes on: s_{t-1}, s_t

a_{t-1}, a_t

What we see: o_{t-1}, o_t

Belief update:

b_t(s_j) \propto O(s_j, a, o) \sum_{s_i \in S} T(s_i, a, s_j) \, b_{t-1}(s_i)
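The belief update can be sketched in a few lines of Python. This is a minimal illustration, not the Nursebot implementation: the dictionary layout, the function name `belief_update`, and the 0.8 / 0.2 observation probabilities are assumptions made for the example. Only the state and action names come from the dialogue slide.

```python
def belief_update(belief, action, observation, T, O):
    """Posterior belief after taking `action` and then seeing `observation`.

    belief: dict mapping state -> probability
    T: dict mapping (s, a, s_next) -> transition probability
    O: dict mapping (s_next, a, o) -> observation probability
    """
    unnormalized = {}
    for s_next in belief:
        # Prediction step: sum over previous states s_i of T(s_i, a, s_j) * b(s_i).
        predicted = sum(T.get((s, action, s_next), 0.0) * p
                        for s, p in belief.items())
        # Correction step: weight by the observation probability O(s_j, a, o).
        unnormalized[s_next] = O.get((s_next, action, observation), 0.0) * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    # Normalize so that the posterior sums to one.
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical model: the user's intent does not change when the robot asks.
states = ["weather-today", "appointment-today"]
T = {(s, "clarify-query", s): 1.0 for s in states}
# Hypothetical speech model: the right keyword is heard 80% of the time.
O = {
    ("weather-today", "clarify-query", "weather"): 0.8,
    ("weather-today", "clarify-query", "appointment"): 0.2,
    ("appointment-today", "clarify-query", "weather"): 0.2,
    ("appointment-today", "clarify-query", "appointment"): 0.8,
}

prior = {"weather-today": 0.5, "appointment-today": 0.5}
posterior = belief_update(prior, "clarify-query", "weather", T, O)
print(posterior)
```

Starting from the 0.5 / 0.5 belief on the dialogue slide, a single clarifying question shifts most of the probability mass onto one intent under these assumed numbers.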


Understanding the belief state

• A belief is a probability distribution over states

Where Dim(B) = |S|-1

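– E.g. Let S={s1, s2}

[Figure: belief space drawn as a line segment, with P(s1) running from 0 to 1]

– E.g. Let S={s1, s2, s3}

[Figure: belief space drawn in the plane, with axes P(s1) and P(s2), each from 0 to 1]

– E.g. Let S={s1, s2, s3, s4}

[Figure: belief space drawn in three dimensions, with axes P(s1), P(s2), and P(s3)]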


Exact planning for POMDPs

• Simple problem: |S|=2, |A|=3, |Ω|=2

Iteration    # hyper-planes
0            1
1            3
2            27
3            2187
4            14,348,907

[Figure: value function V_n(b) plotted against the belief b = P(s1), redrawn at each iteration]
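The counts in this table are consistent with the standard unpruned recurrence |Γ_n| = |A| · |Γ_{n-1}|^|Ω|, starting from a single hyper-plane. The short sketch below reproduces them. The function name is made up for the example, and the recurrence is stated here as an assumption about how the slide's numbers were obtained.

```python
def unpruned_hyperplane_counts(num_actions, num_observations, iterations):
    """Number of hyper-planes per iteration when nothing is pruned."""
    counts = [1]  # iteration 0: a single hyper-plane
    for _ in range(iterations):
        # One new hyper-plane per action and per assignment of a previous
        # hyper-plane to each observation.
        counts.append(num_actions * counts[-1] ** num_observations)
    return counts


counts = unpruned_hyperplane_counts(num_actions=3, num_observations=2, iterations=4)
print(counts)  # [1, 3, 27, 2187, 14348907]
```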


Properties of exact planning

• Value function is always piecewise-linear and convex

• Many hyper-planes can be pruned away

[Figure: value function V_n(b) plotted against the belief b = P(s1)]

|S|=2, |A|=3, |Ω|=2

Iteration    # hyper-planes
0            1
1            3
2            5
3            9
4            7
5            13
10           27
15           47
20           59
…
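Both properties can be illustrated with a small sketch. The value function is the maximum over a set of hyper-planes, which is why it is piecewise-linear and convex. A hyper-plane that is no better than some other hyper-plane in every state can be dropped without changing that maximum. Note that this pointwise-dominance check is a deliberately simple stand-in: exact solvers typically use linear programs, which prune more. The vectors below are invented for illustration.

```python
def value(belief, hyperplanes):
    """V(b) = max over hyper-planes of the dot product alpha . b."""
    return max(sum(a * b for a, b in zip(alpha, belief)) for alpha in hyperplanes)


def prune_pointwise_dominated(hyperplanes):
    """Drop every hyper-plane that another one matches or beats in all states."""
    kept = []
    for i, alpha in enumerate(hyperplanes):
        dominated = False
        for j, other in enumerate(hyperplanes):
            if i == j:
                continue
            at_least_as_good = all(o >= a for o, a in zip(other, alpha))
            # Exact duplicates: keep only the first copy.
            if at_least_as_good and (other != alpha or j < i):
                dominated = True
                break
        if not dominated:
            kept.append(alpha)
    return kept


# Hypothetical hyper-planes for a two-state problem; entries are values per state.
hyperplanes = [(1.0, 0.0), (0.0, 1.0), (0.4, 0.4), (0.6, 0.6), (-0.5, -0.5)]
pruned = prune_pointwise_dominated(hyperplanes)
print(pruned)  # [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
```

In this example the value at any belief is the same before and after pruning, because the removed hyper-planes never attain the maximum.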


Is pruning sufficient?

|S|=20, |A|=6, |Ω|=8

Iteration    # hyper-planes
0            1
1            5
2            213
3            ?????
…

Not for this problem!


Certainly not for this problem!

[Figure: map of the environment showing the Physiotherapy room, Patient room, and Robot home]

|S|=576, |A|=19, |O|=17

State Features: {RobotLocation, ReminderGoal, UserLocation, UserMotionGoal, UserStatus, UserSpeechGoal}


The two curses of POMDP planning

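• The curse of dimensionality:

– the dimension of each hyper-plane = # of states

• The curse of history:

– the number of hyper-planes grows exponentially with the planning horizon

Complexity of POMDP value iteration:

O(|S|^2 \, |A| \, |\Gamma_{n-1}|^{|\Omega|})

where \Gamma_n is the set of hyper-planes at iteration n; the |S|^2 factor reflects dimensionality and the |\Gamma_{n-1}|^{|\Omega|} factor reflects history.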
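As a rough worked example, assume the standard unpruned recurrence for the number of hyper-planes, starting from one hyper-plane, and plug in the Nursebot problem sizes quoted on the earlier slide (|S| = 576, |A| = 19, 17 observations). The recurrence and the rounding are assumptions of this sketch, not figures from the talk.

```latex
\[
|\Gamma_n| = |A| \, |\Gamma_{n-1}|^{|\Omega|}, \qquad |\Gamma_0| = 1
\]
\[
|\Gamma_1| = 19 \cdot 1^{17} = 19, \qquad
|\Gamma_2| = 19 \cdot 19^{17} = 19^{18} \approx 1.0 \times 10^{23}
\]
```

Each of these hyper-planes has 576 entries, so after only two iterations both curses are already at work unless pruning or approximation removes most of them.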


Methods to solve POMDPs

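[Figure: existing methods (MDP, QMDP, FIB, Grid, POMDP) arranged on axes of complexity and performance, with complexity labels O(S^2 A), O(S^2 A O), O(S^2 A O), O(S^2 A B); a "New methods?" marker is also shown]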

Objective: Find a policy, π(b), which maximizes reward.
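To make "a policy π(b)" concrete, here is a minimal sketch of QMDP, the cheapest approximation named on this slide: solve the underlying fully observable MDP, then pick the action whose Q-values are best on average under the current belief. The model below (static user intent, rewards of +10, -20 and -1, discount 0.95) is entirely hypothetical, and the function names are invented for the example.

```python
def mdp_q_values(states, actions, T, R, gamma=0.95, sweeps=200):
    """Q(s, a) of the fully observable MDP, computed by value iteration."""
    V = {s: 0.0 for s in states}
    Q = {}
    for _ in range(sweeps):
        Q = {
            (s, a): R[(s, a)]
            + gamma * sum(T.get((s, a, s2), 0.0) * V[s2] for s2 in states)
            for s in states
            for a in actions
        }
        V = {s: max(Q[(s, a)] for a in actions) for s in states}
    return Q


def qmdp_action(belief, actions, Q):
    """QMDP policy: argmax over actions of sum_s b(s) * Q(s, a)."""
    return max(actions, key=lambda a: sum(p * Q[(s, a)] for s, p in belief.items()))


states = ["want-weather", "want-appointment"]
actions = ["say-weather", "update-appointment", "clarify-query"]
# Hypothetical dynamics: the user's intent stays the same under every action.
T = {(s, a, s): 1.0 for s in states for a in actions}
# Hypothetical rewards: right answer +10, wrong answer -20, asking costs 1.
R = {
    ("want-weather", "say-weather"): 10.0,
    ("want-weather", "update-appointment"): -20.0,
    ("want-weather", "clarify-query"): -1.0,
    ("want-appointment", "say-weather"): -20.0,
    ("want-appointment", "update-appointment"): 10.0,
    ("want-appointment", "clarify-query"): -1.0,
}

Q = mdp_q_values(states, actions, T, R)
confident = qmdp_action({"want-weather": 0.9, "want-appointment": 0.1}, actions, Q)
unsure = qmdp_action({"want-weather": 0.5, "want-appointment": 0.5}, actions, Q)
print(confident, unsure)
```

QMDP treats the state as fully observable after the next step, so it does not plan to gather information. In this toy model it picks clarify-query at the 0.5 / 0.5 belief only because asking has the smallest expected immediate penalty.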


New approach: A hierarchy of POMDPs

Idea: Exploit domain knowledge to divide one POMDP into many smaller ones.

Motivation: Smaller action sets help overcome the curse of history.

Assumption: We are given POMDP M = {S, A, Ω, b, T, O, R} and hierarchy H

Act
    ExamineHealth
        VerifyPulse
        VerifyMeds
    Navigate
        Move
            North, South, East, West
        ClarifyGoal

Legend: subtask, abstract action, primitive action
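One simple way to hold such a hierarchy in code is a mapping from each subtask to its children. This is a sketch only: the dictionary layout and function names are assumptions for illustration, and the tree is read as Act → {ExamineHealth, Navigate}, ExamineHealth → {VerifyPulse, VerifyMeds}, Navigate → {Move, ClarifyGoal}, Move → {North, South, East, West}.

```python
# Each key is a subtask; its list holds that subtask's action set, which may
# mix abstract actions (other subtasks) and primitive actions.
HIERARCHY = {
    "Act": ["ExamineHealth", "Navigate"],
    "ExamineHealth": ["VerifyPulse", "VerifyMeds"],
    "Navigate": ["Move", "ClarifyGoal"],
    "Move": ["North", "South", "East", "West"],
}


def is_subtask(name):
    """A node is a subtask if it has children; otherwise it is primitive."""
    return name in HIERARCHY


def primitive_actions(name):
    """All primitive actions reachable from `name`, in depth-first order."""
    if not is_subtask(name):
        return [name]
    result = []
    for child in HIERARCHY[name]:
        result.extend(primitive_actions(child))
    return result


print(HIERARCHY["Navigate"])          # action set seen when planning Navigate
print(primitive_actions("Navigate"))  # primitives it can eventually execute
```

The point of the decomposition is visible in the sizes: each subtask plans over only its own short action list, rather than over all seven primitive actions at once.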


PolCA+: Planning with a hierarchy of POMDPs

Step 1: Select the action set

Navigate → Move, ClarifyGoal

Move → North, South, East, West

ACTIONS: North, South, East, West, ClarifyGoal, VerifyPulse, VerifyMeds

A_Move = {N, S, E, W}



Step 2: Minimize the state set

Navigate → Move, ClarifyGoal

A_Move = {N, S, E, W}

STATE FEATURES: X-position, Y-position, X-goal, Y-goal, HealthStatus

S_Move = {X, Y}



Step 3: Choose parameters

Navigate → Move, ClarifyGoal

A_Move = {N, S, E, W}

S_Move = {X, Y}

Other state features: X-goal, Y-goal, HealthStatus

PARAMETERS: {b_h, T_h, O_h, R_h}



Step 3: Choose parameters

Step 4: Plan task h

Navigate → Move, ClarifyGoal

A_Move = {N, S, E, W}

S_Move = {X, Y}

Other state features: X-goal, Y-goal, HealthStatus

PARAMETERS: {b_h, T_h, O_h, R_h}

PLAN: π_h


Results on small dialogue domain

[Figure: R (vertical axis, from -120 to 0) versus Time in secs (horizontal axis, log scale from 0.01 to 1,000,000) for POMDP, PolCA-D1, PolCA-D2, FIB, and QMDP]

|S|=12, |A|=20, |O|=3


Achieving a flexible trade-off

[Figure: reward versus planning time, locating QMDP, FIB, POMDP, PolCA+ D1, and PolCA+ D2]


PolCA+ in the Nursebot domain

• Goal: A robot is deployed in a nursing home, where it provides reminders to elderly users and accompanies them to appointments.


Sample scenario


Comparing user performance

[Figure; extracted values: 0.1, 0.1, 0.18]


The effects of confirmation actions

[Figure: cumulative reward (from -2000 to 14000) versus time steps (from 0 to 1200) for PolCA+, PolCA, and QMDP]


Addressing the curse of dimensionality

[Figure: number of states (from 0 to 4500) for NoAbs, PolCA, and PolCA+, broken down by subtask: subInform, subMove, subContact, subRest, subAssist, subRemind, act]


Ongoing work

• New POMDP approximation techniques.

• Parameter estimation for adaptation to user-specific speech patterns and preferences.

• Exploration of emotion and personality types using a new head.

• Addition of an arm for object manipulation.

• Addition of weight-bearing bars for assisted walking.


Summary

• We have developed a first prototype robot able to serve as a mobile nursing assistant for elderly people.

• The top-level controller uses a hierarchical variant of POMDPs to select actions.

– PolCA+ addresses both the curse of dimensionality and the curse of history.

• Lessons learned during our experiments:

– Uncertainty is crucial when dealing with people.

– Probabilistic techniques are necessary to reason about uncertainty.

– Real belief tracking and planning really matter!

Project information: www.cs.cmu.edu/~nursebot

Navigation software: www.cs.cmu.edu/~carmen

Papers and more: www.cs.cmu.edu/~jpineau

Joint work with: Michael Montemerlo, Martha Pollack, Nicholas Roy, Sebastian Thrun


The Nursebot project in its early days


Autominder System
