

Stochastic Search and Surveillance Strategies for Mixed Human-Robot Teams

Vaibhav Srivastava

Department of Mechanical Engineering

University of California Santa Barbara

October 31, 2012

PhD Dissertation Defense

Vaibhav Srivastava (UCSB) Mixed Team Surveillance October 31, 2012 1 / 38

Big Picture: Human-robot decision dynamics

Uncertain environment surveyed by human-UAV team

(Courtesy: Prof. Kristi Morgansen)

UCSB Camera Network

UAV surveillance (Courtesy: http://www.modsim.org/)

A surveillance operator (Courtesy: http://www.modsim.org/)


Information Overload


January 16, 2011

In New Military, Data Overload Can Be Deadly

By THOM SHANKER and MATT RICHTEL

When military investigators looked into an attack by American helicopters last February that left 23 Afghan civilians dead, they found that the operator of a Predator drone had failed to pass along crucial information about the makeup of a gathering crowd of villagers.

But Air Force and Army officials now say there was also an underlying cause for that mistake: information overload.

At an Air Force base in Nevada, the drone operator and his team struggled to work out what was happening in the village, where a convoy was forming. They had to monitor the drone’s video feeds while participating in dozens of instant-message and radio exchanges with intelligence analysts and troops on the ground.

There were solid reports that the group included children, but the team did not adequately focus on them amid the swirl of data — much like a cubicle worker who loses track of an important e-mail under the mounting pile. The team was under intense pressure to protect American forces nearby, and in the end it determined, incorrectly, that the villagers’ convoy posed an imminent threat, resulting in one of the worst losses of civilian lives in the war in Afghanistan.

“Information overload — an accurate description,” said one senior military officer, who was briefed on the inquiry and spoke on the condition of anonymity because the case might yet result in a court martial. The deaths would have been prevented, he said, “if we had just slowed things down and thought deliberately.”

Data is among the most potent weapons of the 21st century. Unprecedented amounts of raw information help the military determine what targets to hit and what to avoid. And drone-based sensors have given rise to a new class of wired warriors who must filter the information sea. But sometimes they are drowning.

Military Struggles to Harness a Flood of Data - NYTimes.com http://www.nytimes.com/2011/01/17/technology/17brain.html?...


Publications

Search and Surveillance

V. Srivastava, K. Plarre, and F. Bullo. Randomized sensor selection in sequential hypothesis testing. IEEE Trans Signal Processing, 59(5):2342–2354, 2011

V. Srivastava, F. Pasqualetti, and F. Bullo. Stochastic surveillance strategies for spatial quickest detection. Int J Robotic Research, 2013. to appear

Attention Allocation

V. Srivastava, R. Carli, C. Langbort, and F. Bullo. Attention allocation for decision making queues. Automatica, February 2012. conditionally accepted

V. Srivastava and F. Bullo. Knapsack problems with sigmoid utility: Approximation algorithms via hybrid optimization. European Journal of Operational Research, October 2012. Submitted

Other Topics

V. Srivastava, J. Moehlis, and F. Bullo. On bifurcations in nonlinear consensus networks. Journal of Nonlinear Science, 21(6):875–895, 2011

L. Carlone, V. Srivastava, F. Bullo, and G. C. Calafiore. Distributed random convex programming via constraints consensus. SIAM J Ctrl Optm, July 2012. Submitted


Mixed Team Setup

[Figure: block diagram of the mixed team. Incoming tasks arrive at rate λ and wait in a queue of length n. Labeled blocks: Cognition & Autonomy Management System, Decision Support System, Anomaly Detection Algorithm, Vehicle Routing Algorithm, Autonomy, Cognition. Labeled signals: optimal allocations, region selection policy, distribution of tasks, decision on tasks, human operator performance, outgoing tasks. Exogenous factors: situational awareness, fatigue & sleep cycle, forgetting, boredom.]

- Information aggregation: sensor selection policy/ vehicle routing policy

- Information processing: human attention allocation policy

- Mission goal: efficient search / surveillance


Incomplete Literature Review

Vehicle Routing for Information Gathering

D. J. Klein, J. Schweikl, J. T. Isaacs, and J. P. Hespanha. On UAV routing protocols for sparse sensor data exfiltration. In Proc ACC, pages 6494–6500, Baltimore, MD, USA, June 2010

V. Gupta, T. H. Chung, B. Hassibi, and R. M. Murray. On a stochastic sensor selection algorithm with applications in sensor scheduling and sensor coverage. Automatica, 42(2):251–260, 2006

G. A. Hollinger, U. Mitra, and G. S. Sukhatme. Autonomous data collection from underwater sensor networks using acoustic communication. In Proc IROS, pages 3564–3570, San Francisco, CA, USA, September 2011

Stochastic Surveillance and Pursuit Evasion

J. P. Hespanha, H. J. Kim, and S. S. Sastry. Multiple-agent probabilistic pursuit-evasion games. In Proc CDC, pages 2432–2437, Phoenix, AZ, USA, December 1999

J. Grace and J. Baillieul. Stochastic strategies for autonomous robotic surveillance. In Proc CDC-ECC, pages 2200–2205, Seville, Spain, December 2005

K. Srivastava, D. M. Stipanovic, and M. W. Spong. On a stochastic robotic surveillance problem. In Proc CDC, pages 8567–8574, Shanghai, China, December 2009

Outline

1 Introduction

2 Stochastic Surveillance Strategies
   Single Vehicle Policies
   Multiple Vehicle Policies

3 Attention Allocation for human operator
   Time Constrained Static Queue
   Dynamic Queue with Latency Penalty

4 Mixed Team Surveillance

5 Conclusions


Stochastic Surveillance: Problem Setup

a UAV surveys n regions

Objective: quickly detect anomalies

processing time at region k : $T_k$

distance between region i and j : $d_{ij}$

observations at each region: i.i.d.

pdf of nominal & anomalous observation at region k : $f_k^0$ & $f_k^1$

[Figure: Surveillance Setup. The control center hosts the anomaly detection algorithm and the vehicle routing algorithm. Labeled signals: observations collected by UAVs, anomaly likelihood, decision, vehicle routing policy.]

Cumulative Sum Algorithm

nominal observations are sampled from distribution $f^0$

anomalous observations are sampled from distribution $f^1$

Given a bound on the false alarm rate, the CUSUM algorithm detects the change in minimum expected time

CUSUM Procedure

1 set statistic $\Lambda = 0$

2 collect an observation $y$

3 update statistic $\Lambda = \max\{0, \Lambda + \log \frac{f^1(y)}{f^0(y)}\}$

4 if $\Lambda > \eta$: declare anomaly detected

5 else go to step 2.

E. S. Page. Continuous inspection schemes. Biometrika, 41(1/2):100–115, 1954
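
The five steps of the CUSUM procedure can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the talk: the Gaussian observation model, the function names, and all numeric values are assumptions chosen for the example.

```python
import math
import random


def gaussian_log_likelihood_ratio(y, mu0, mu1, sigma):
    """log(f1(y) / f0(y)) for two Gaussians with a common standard deviation.
    The Gaussian model is an assumption of this sketch."""
    return ((y - mu0) ** 2 - (y - mu1) ** 2) / (2.0 * sigma ** 2)


def cusum(observations, log_lr, eta):
    """Run the CUSUM procedure. Returns the 1-based index of the observation
    at which the statistic first exceeds eta, or None if it never does."""
    stat = 0.0                                   # step 1: initialize
    for step, y in enumerate(observations, 1):   # step 2: collect y
        stat = max(0.0, stat + log_lr(y))        # step 3: update
        if stat > eta:                           # step 4: threshold test
            return step
    return None                                  # data ran out (step 5 loop ended)


if __name__ == "__main__":
    rng = random.Random(0)
    # 200 nominal samples followed by 200 anomalous ones (illustrative values).
    data = [rng.gauss(0.0, 1.0) for _ in range(200)]
    data += [rng.gauss(1.0, 1.0) for _ in range(200)]
    llr = lambda y: gaussian_log_likelihood_ratio(y, 0.0, 1.0, 1.0)
    print("alarm raised at observation", cusum(data, llr, eta=5.0))
```

Raising `eta` lowers the false alarm rate at the price of a longer detection delay, which is the trade-off the threshold plots in the talk describe.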

Proposed Policy: Randomized ensemble CUSUM algorithm

1 Anomaly detection algorithm:

n parallel CUSUM algorithms (one for each region)

2 Vehicle routing policy:

at each iteration sample region to visit from a probability distribution


Randomized Ensemble CUSUM Algorithm

1 n parallel CUSUM algorithms (one for each region)

2 region k visited with stationary prob $q_k$

3 KL divergence at region k : $D_k = E_{f_k^1}\big[\log\big(f_k^1(Y)/f_k^0(Y)\big)\big]$

Expected detection delay at region k

$$E[\delta_k(q)] = \frac{e^{-\eta} + \eta - 1}{q_k D_k} \sum_{i=1}^{n} \sum_{j=1}^{n} q_i q_j (T_i + d_{ij})$$

[Figure: expected detection delay versus threshold]
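
As a sanity check, the expected detection delay formula can be evaluated directly. The sketch below transcribes it term by term; the function name and argument layout are hypothetical, and the inputs are plain Python lists.

```python
import math


def expected_detection_delay(k, q, D, T, d, eta):
    """Expected detection delay at region k under stationary policy q.

    q: visit probabilities, D: KL divergences, T: processing times,
    d: matrix of distances d[i][j], eta: CUSUM threshold.
    """
    n = len(q)
    # Double sum from the formula: sum_i sum_j q_i q_j (T_i + d_ij).
    double_sum = sum(q[i] * q[j] * (T[i] + d[i][j])
                     for i in range(n) for j in range(n))
    return (math.exp(-eta) + eta - 1.0) / (q[k] * D[k]) * double_sum
```

For two regions with `q = [0.5, 0.5]`, `T = [1, 1]` and `d = [[0, 2], [2, 0]]` the double sum equals 2, so the delay at a region with `D_k = 1` is `4 * (exp(-eta) + eta - 1)`.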


Optimal Stationary Policy

$E[\delta_k(q)] = \frac{\eta}{q_k D_k}(q \cdot T + q \cdot Dq)$ and $\pi_k$ : prior for anomaly at region k

Optimal stationary policy

$$q^* = \operatorname{argmin}_{q \in \Delta_n} \sum_{k=1}^{n} \pi_k E[\delta_k(q)]$$

Chernoff bound based guarantees that only one minimum exists

[Figure: UCSB Campus Optimal Stationary Policy; plot axes $q_1$ and $q_2$]


Efficient Stationary Policy

Upper bound on performance:

$$\sum_{k=1}^{n} \pi_k E[\delta_k(q)] \le \sum_{k=1}^{n} \frac{\pi_k \eta}{q_k D_k}(T_{\max} + d_{\max}).$$

$T_{\max} = \max_k T_k$, and $d_{\max} = \max_i \max_j d_{ij}$

Efficient Stationary policy

Minimizer of upper bound

$$q_k^\dagger = \frac{\sqrt{\pi_k / D_k}}{\sum_{j=1}^{n} \sqrt{\pi_j / D_j}}$$

$T_{\min} = \min_k T_k$, $D_{\min} = \min_k D_k$, and $D_{\max} = \max_k D_k$

Factor of optimality

$\frac{T_{\max} + d_{\max}}{T_{\min}}$, w.r.t. stationary policy

$n \, \frac{T_{\max} + d_{\max}}{T_{\min}} \, \frac{D_{\max}}{D_{\min}}$, w.r.t. any policy
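
The efficient stationary policy has a closed form, so it is easy to compute and to compare against other policies. The following sketch assumes the priors and divergences are given as Python lists; the function names are hypothetical.

```python
import math


def efficient_stationary_policy(prior, D):
    """q_k proportional to sqrt(prior_k / D_k), normalized to sum to one."""
    weights = [math.sqrt(p / dk) for p, dk in zip(prior, D)]
    total = sum(weights)
    return [w / total for w in weights]


def upper_bound_cost(q, prior, D):
    """sum_k prior_k / (q_k D_k): the part of the upper bound that depends
    on q (the common factor eta * (T_max + d_max) is left out)."""
    return sum(p / (qk * dk) for p, qk, dk in zip(prior, q, D))
```

With equal priors and `D = [1, 4]` the policy is `[2/3, 1/3]`: the region whose anomaly is harder to distinguish (smaller divergence) is visited more often, and the bound drops from 1.25 under uniform visiting to 1.125.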


Adaptive Ensemble CUSUM

Adaptive policy

1 at each iteration: update prior $\pi_k \propto \Lambda_k$

2 adapt the efficient stationary policy: $q_k^\dagger = \frac{\sqrt{\pi_k / D_k}}{\sum_{j=1}^{n} \sqrt{\pi_j / D_j}}$

3 visit a region, and update CUSUM statistic

Performance of adaptive policy

$$E[\delta_k(a)] \le \Big( \frac{\eta}{D_k} + \frac{2(n-1)\, e^{\eta/2} \sqrt{D_k}\, (1 - e^{-\eta/2})}{\sqrt{D_{\min}}\, (1 - e^{-D_k/2})} + \frac{(n-1)^2 e^{\eta} D_k (1 - e^{-\eta})}{D_{\min} (1 - e^{-D_k})} \Big) (T_{\max} + d_{\max}).$$

[Figure: Delay versus CUSUM Threshold, comparison with stationary policy; axes: threshold, expected detection delay]

[Figure: Delay versus Divergence, comparison with stationary policy; axes: (K-L divergence)$^{-1}$, expected detection delay]
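
The three steps of the adaptive policy can be simulated with a short loop. This is a rough sketch under several assumptions that are not in the slides: observations are unit-variance Gaussians, the anomalous mean is 1, travel and processing times are ignored, and a small `floor` keeps regions with a zero statistic reachable (without it the prior would be undefined at the start, when every statistic is zero).

```python
import math
import random


def adaptive_policy(stats, D, floor=1e-3):
    """Routing probabilities: prior proportional to the CUSUM statistics,
    then q_k proportional to sqrt(prior_k / D_k).
    `floor` is an assumption of this sketch, not part of the slides."""
    prior = [max(s, floor) for s in stats]
    weights = [math.sqrt(p / dk) for p, dk in zip(prior, D)]
    total = sum(weights)
    return [w / total for w in weights]


def run_adaptive_ensemble_cusum(means, eta, rng, max_iter=100000):
    """Simulate until some region's statistic exceeds eta.

    Region k emits N(means[k], 1) samples. The nominal model is N(0, 1)
    and the anomalous model is N(1, 1), so log(f1/f0)(y) = y - 0.5 and
    D_k = 0.5. Returns (detected_region, iterations) or (None, max_iter).
    """
    n = len(means)
    D = [0.5] * n
    stats = [0.0] * n
    for it in range(1, max_iter + 1):
        q = adaptive_policy(stats, D)               # steps 1 and 2
        k = rng.choices(range(n), weights=q)[0]     # step 3: pick a region
        y = rng.gauss(means[k], 1.0)                # observe it
        stats[k] = max(0.0, stats[k] + (y - 0.5))   # CUSUM update
        if stats[k] > eta:
            return k, it
    return None, max_iter
```

A call such as `run_adaptive_ensemble_cusum([0.0, 1.0, 0.0], eta=5.0, rng=random.Random(1))` places the anomaly in the second region; as that region's statistic grows, the policy shifts visits toward it.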


Performance of Adaptive Policy

[Figure: Adaptive policy with no anomaly. Panels: routing policy versus time, CUSUM statistic versus time]

[Figure: Adaptive policies with anomalies. Panels: routing policy versus time, CUSUM statistic versus time]

- frequent false alarms at low thresholds

- adaptive policy visits anomalous regions with higher probability

- adaptive policy very effective for high thresholds


Extension to Multiple Vehicles

m identical vehicles simultaneously surveying the regions

Partitioning Policy

1 m-partition regions with each partition having at most $\lceil n/m \rceil$ regions

2 allocate one vehicle to each partition

3 implement single vehicle policy in each partition

[Figure: Stationary policy with partitioning; axes: threshold, average detection delay]

Factor of optimality

$4 \, \frac{\pi_{\max}}{\pi_{\min}} \, \frac{T_{\max} + d_{\max}}{T_{\min}} \, \frac{D_{\max}}{D_{\min}}$, w.r.t. stationary policy

$m^2 \left\lceil \frac{n}{m} \right\rceil \frac{T_{\max} + d_{\max}}{T_{\min}} \, \frac{D_{\max}}{D_{\min}}$, w.r.t. any policy


Further Relaxations

Not all-to-all topology

1 Construct a Markov chain with desired stationary distribution

Dependent Observations

1 Use CUSUM like algorithm for HMMs (Chen and Willett ’00)

Dependence across Regions

1 More information available, can be used to improve performance

More than one kind of anomaly

1 Use Generalized likelihood ratio

2 Side product: type of anomaly

[Figure: routing policy versus time and GLR statistic versus time]

[Figure: normalized likelihood (posterior probability) of each hypothesis]
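
For the first relaxation, one standard way to build a Markov chain with a prescribed stationary distribution on a sparse graph is the Metropolis-Hastings rule. The slides do not say which construction was used, so the sketch below is an illustration of that well-known technique rather than the author's method; the adjacency-list format and the function name are assumptions.

```python
def metropolis_hastings_chain(adjacency, target):
    """Transition matrix on the given graph whose stationary distribution
    is `target`, built with the Metropolis-Hastings rule.

    adjacency[i]: list of neighbors of region i. The graph is assumed
    undirected, connected, and free of self-loops. target[i] > 0 is the
    desired long-run visit frequency of region i.
    """
    n = len(target)
    degree = [len(nbrs) for nbrs in adjacency]
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in adjacency[i]:
            # Propose a uniformly random neighbor, accept with the MH ratio.
            accept = min(1.0, (target[j] * degree[i]) / (target[i] * degree[j]))
            P[i][j] = accept / degree[i]
        P[i][i] = 1.0 - sum(P[i])   # rejected proposals keep the vehicle in place
    return P
```

On the path graph `0 - 1 - 2` with target `[0.5, 0.3, 0.2]`, the resulting matrix satisfies detailed balance, so the vehicle visits the regions with the desired frequencies while only moving between adjacent regions.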


Incomplete Literature Review

Human Decision Making

R. Bogacz, E. Brown, J. Moehlis, P. Holmes, and J. D. Cohen. The physics of optimal decision making: A formal analysis of performance in two-alternative forced choice tasks. Psychological Review, 113(4):700–765, 2006

R. W. Pew. The speed-accuracy operating characteristic. Acta Psychologica, 30:16–26, 1969

Control of Queues

O. Hernandez-Lerma and S. I. Marcus. Adaptive control of service in queueing systems. IFAC Syst & Control L, 3(5):283–289, 1983

S. Agrali and J. Geunes. Solving knapsack problems with S-curve return functions. European Journal of Operational Research, 193(2):605–615, 2009

Human-in-the-loop Control

K. Savla and E. Frazzoli. A dynamical queue approach to intelligent task management for human operators. IEEE Proceedings, 100(3):672–686, 2012

L. F. Bertuccelli, N. Pellegrino, and M. L. Cummings. Choice modeling of relook tasks for UAV search missions. In Proc ACC, pages 2410–2415, Baltimore, MD, USA, June 2010

N. D. Powel and K. A. Morgansen. Multiserver queueing for supervisory control of autonomous vehicles. In Proc ACC, pages 3179–3185, Montreal, Canada, June 2012

Physics of Human Decision Making

[Figure: Evolution of evidence for decision, plotted against time]

[Figure: Probability of correct decision, plotted against time with $t_{\mathrm{inf}}$, $t_{\min}$ and $t_{\max}$ marked]

1 Evidence for decision making evolves as a drift-diffusion process

2 Probability of correct decision evolves as a sigmoid function

Sigmoid performance also occurs in

1. Human-machine communication
2. Advertising response
3. Bidding in simultaneous auctions
4. Human assisted multiple target search


Attention Allocation for Human Operator

Problem: How to optimally allocate operator attention to a batch of tasks or to an incoming stream of tasks

– The performance of operator evolves as sigmoid function

– Static queue: serve N tasks in time T

– Dynamic queue: tasks arrive continuously at some known rate

– Optimal design of queue: What is an optimal arrival rate?


Knapsack Problem with Sigmoid Utility

Human operator to perform N surveillance tasks in time T

Expected reward for allocation $t_\ell$ to task $\ell$ is $f_\ell(t_\ell)$

Find allocation that maximizes total expected reward

maximize $f_1(t_1) + \cdots + f_N(t_N)$

subject to $t_1 + \cdots + t_N = T$

$t_\ell \ge 0$, $\ell \in \{1, \ldots, N\}$

[Figure: knapsack illustration. Courtesy: Wikipedia]

– knapsack problem: $f_\ell$ is step function

– If $f_\ell$ are sigmoid functions: decision variables are hybrid

– knapsack problem with sigmoid utility is NP hard

Standard Approach for Sigmoid Functions

Construct a concave envelope

[Figure: probability of correct decision versus time, with $t_{\mathrm{inf}}$, $t_{\min}$ and $t_{\max}$ marked]

Concave envelope may yield a very bad policy

Example: Identical Sigmoid Functions

$$\underset{t_\ell \ge 0}{\text{maximize}} \ \sum_{\ell=1}^{10} \frac{1}{1 + \exp(-t_\ell + 5)}$$

subject to $t_1 + \ldots + t_{10} = 8$.

Optimal policy: $t_1^* = 8$, $t_2^* = \ldots = t_{10}^* = 0$. Reward: 0.9526

Concave envelope policy: $t_1 = t_2 = \ldots = t_{10} = 0.8$. Reward: 0.1477
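
The two rewards quoted in the example can be reproduced with a few lines of arithmetic. The sketch below evaluates the sigmoid from the example at both allocations. The quoted 0.9526 matches the reward of the single served task, f(8); crediting the nine idle tasks with f(0) each would add about 0.06, which the last variable makes explicit.

```python
import math


def sigmoid(t):
    """The example's utility: 1 / (1 + exp(-t + 5))."""
    return 1.0 / (1.0 + math.exp(-t + 5.0))


# Reward earned by the one task that receives all 8 time units.
served_reward = sigmoid(8.0)

# Reward of the equal split t_1 = ... = t_10 = 0.8.
equal_split_reward = 10 * sigmoid(0.8)

# Total of the first allocation when the nine unserved tasks are
# also credited with f(0) each.
total_with_idle_tasks = sigmoid(8.0) + 9 * sigmoid(0.0)

print(round(served_reward, 4))       # 0.9526
print(round(equal_split_reward, 4))  # 0.1477
```

The equal split keeps every task on the flat, convex part of the sigmoid, which is why spreading the time evenly earns so little.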


Sigmoid Function and Linear Penalty

Sigmoid function and linear penalty

$$\underset{t \ge 0}{\text{maximize}} \ f(t) - \psi t$$

[Figure: Derivative of a sigmoid function; time axis with $t_{\mathrm{inf}}$, $t_{\min}$ and $t_{\max}$ marked]

[Figure: Optimal allocation v/s penalty rate; the critical penalty rate $\psi_f$ is marked on the penalty-rate axis]

– The optimal allocation jumps down to zero at critical penalty rate

– Jump creates combinatorial effects
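
The jump in the optimal allocation can be seen numerically. The sketch below assumes a specific logistic sigmoid, f(t) = 1 / (1 + exp(-(t - 5))), which is an illustrative choice and not taken from the slides. It compares the interior stationary point on the concave side of f with the boundary point t = 0.

```python
import math


def f(t, b=5.0):
    """Illustrative logistic sigmoid with inflection point b (an assumption)."""
    return 1.0 / (1.0 + math.exp(-(t - b)))


def optimal_allocation(psi, b=5.0):
    """Maximizer of f(t) - psi * t over t >= 0 for the logistic f above."""
    if psi <= 0.0:
        raise ValueError("psi must be positive")
    if psi >= 0.25:            # f'(t) never exceeds 1/4, so no time is worth it
        return 0.0
    # Larger root of f'(t) = s (1 - s) = psi, where s = f(t).
    s = 0.5 * (1.0 + math.sqrt(1.0 - 4.0 * psi))
    t = b + math.log(s / (1.0 - s))
    # Interior stationary point versus the boundary t = 0.
    return t if f(t, b) - psi * t > f(0.0, b) else 0.0
```

For this sigmoid the allocation is about 7.06 at `psi = 0.1` and exactly 0 at `psi = 0.13`: it never passes through the values in between, which is the discontinuity that makes the multi-task problem combinatorial.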

KP with Sigmoid Utility: Approximation Algorithm I

KP with Sigmoid Utility

maximize $f_1(t_1) + \cdots + f_N(t_N)$

subject to $t_1 + \cdots + t_N = T$

$t_\ell \ge 0$, $\ell \in \{1, \ldots, N\}$

Lagrangian

$$L(t, \alpha) = \sum_{\ell=1}^{N} \big( f_\ell(t_\ell) - \alpha t_\ell \big)$$

α parametrized non-zero allocations

$$t_\ell^\dagger = f_\ell^\dagger(\alpha) \equiv \begin{cases} \max\{ t \mid f_\ell'(t) = \alpha \}, & \text{if } \alpha \in \operatorname{range}(f_\ell'), \\ 0, & \text{otherwise.} \end{cases}$$

Allocations at boundary: $t_\ell^* \in \{0, T\}$

α-parametrized knapsack problem

maximize $x_1 f_1(t_1^\dagger) + \cdots + x_N f_N(t_N^\dagger)$

subject to $x_1 t_1^\dagger + \cdots + x_N t_N^\dagger = T$

$x_\ell \in \{0, 1\}$, $\ell \in \{1, \ldots, N\}$


KP with Sigmoid Utility: Approximation Algorithm II

2-factor approximation algorithm

1: Parametrize via Lagrange multiplier: $t_\ell^\dagger = f_\ell^\dagger(\alpha)$

2: Solve α-parametrized relaxed knapsack

maximize $x_1 f_1(t_1^\dagger) + \cdots + x_N f_N(t_N^\dagger)$

subject to $x_1 t_1^\dagger + \cdots + x_N t_N^\dagger \le T$

$x_\ell \in [0, 1]$, $\ell \in \{1, \ldots, N\}$

3: Search optimal Lagrange multiplier α

4: Serve tasks with $x_\ell^* = 1$

5: Compare the reward with $f_\ell(T)$, $\forall \ell$

6: Pick the better policy

[Figure: α-parametrized knapsack; max objective function versus Lagrange multiplier α]

[Figure: Optimal allocations; optimal allocation versus task]

[Figure: Approx allocations; approx allocation versus task]
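
The six steps can be sketched in Python for logistic utilities. Several choices below are assumptions made for illustration and are not specified in the slides: each utility is f(t) = w / (1 + exp(-a (t - b))), the search over α is a plain grid search, and the relaxed knapsack is solved with the usual greedy rule by value per unit time. The sketch follows the steps of the algorithm but makes no claim about its approximation guarantee.

```python
import math


def make_sigmoid(w, a, b):
    """Logistic utility f(t) = w / (1 + exp(-a (t - b))) (assumed form)."""
    return lambda t: w / (1.0 + math.exp(-a * (t - b)))


def inverse_derivative(alpha, w, a, b):
    """Largest t with f'(t) = alpha, or 0 if alpha exceeds max f' = w a / 4."""
    ratio = 4.0 * alpha / (w * a)
    if ratio > 1.0:
        return 0.0
    s = 0.5 * (1.0 + math.sqrt(1.0 - ratio))   # s = f(t) / w on the concave side
    if s >= 1.0:                               # alpha numerically zero
        return math.inf
    return b + math.log(s / (1.0 - s)) / a


def approx_sigmoid_knapsack(params, T, alphas):
    """params: list of (w, a, b). Returns (reward, allocation list)."""
    fs = [make_sigmoid(*p) for p in params]
    n = len(params)
    best_reward, best_alloc = -math.inf, None
    for alpha in alphas:                                   # step 3: grid search
        t_dag = [inverse_derivative(alpha, *p) for p in params]   # step 1
        # Step 2: greedy order by value density solves the relaxed knapsack.
        order = sorted((l for l in range(n) if t_dag[l] > 0.0),
                       key=lambda l: fs[l](t_dag[l]) / t_dag[l], reverse=True)
        alloc, used = [0.0] * n, 0.0
        for l in order:
            if used + t_dag[l] > T:
                break          # this item would be fractional: not served
            alloc[l], used = t_dag[l], used + t_dag[l]     # step 4: x = 1
        reward = sum(f(t) for f, t in zip(fs, alloc))
        if reward > best_reward:
            best_reward, best_alloc = reward, alloc
    # Steps 5 and 6: compare with giving all of T to one task.
    for l in range(n):
        alloc = [0.0] * n
        alloc[l] = T
        reward = sum(f(t) for f, t in zip(fs, alloc))
        if reward > best_reward:
            best_reward, best_alloc = reward, alloc
    return best_reward, best_alloc
```

On the identical-sigmoid example (ten tasks with `(w, a, b) = (1, 1, 5)` and `T = 8`) the sketch returns the allocation that gives all 8 units to a single task. The returned reward sums the utility over all ten tasks, so it includes the f(0) terms of the idle ones.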


Outline

1 Introduction

2 Stochastic Surveillance Strategies
  Single Vehicle Policies
  Multiple Vehicle Policies

3 Attention Allocation for human operator
  Time Constrained Static Queue
  Dynamic Queue with Latency Penalty

4 Mixed Team Surveillance

5 Conclusions

Decision Making Queue with Penalty

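Tasks arrive as a Poisson process with rate $\lambda$

Tasks sampled from a distribution $p : D \to \mathbb{R}_{\ge 0}$

Reward $w_d$ for each correct decision on task $d$

Latency penalty per unit time $c_d$ for task $d \in D$, and $c = \mathbb{E}_p[c_d]$

Objective of task release algorithm:

$$\max_{t_1, t_2, t_3, \dots} \ \lim_{L \to \infty} \frac{1}{L} \sum_{\ell=1}^{L} \mathbb{E}\left[ w_{d_\ell} f_{d_\ell}(t_\ell) - \frac{1}{2} \left( \sum_{i=\ell}^{\ell + n_\ell - 1} c_{d_i} + \sum_{j=\ell}^{\ell + n_{\ell+1} - 1} c_{d_j} \right) t_\ell \right]$$

where queue length $n_{\ell+1} = \max\{1, n_\ell - 1 + \mathrm{Poisson}(\lambda t_\ell)\}$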

Approach: Certainty-equivalent receding horizon optimization
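The queue-length recursion on this slide can be simulated directly. The sketch below is illustrative only: the processing times passed in are arbitrary placeholders rather than optimized allocations, and the helper names are hypothetical. Arrivals during a processing window are counted by summing exponential inter-arrival times, which reproduces a Poisson process with rate λ using only the standard library.

```python
import random


def poisson_arrivals(lam, window, rng):
    """Number of arrivals of a rate-lam Poisson process in a window of given length."""
    count = 0
    clock = rng.expovariate(lam)
    while clock < window:
        count += 1
        clock += rng.expovariate(lam)
    return count


def simulate_queue(lam, durations, n0=1, seed=0):
    """Apply n_{l+1} = max(1, n_l - 1 + Poisson(lam * t_l)) along the given durations."""
    rng = random.Random(seed)
    n = n0
    lengths = [n]
    for t in durations:
        n = max(1, n - 1 + poisson_arrivals(lam, t, rng))
        lengths.append(n)
    return lengths


if __name__ == "__main__":
    # Hypothetical constant processing time of 1.0 per task, arrival rate 1.5.
    print(simulate_queue(lam=1.5, durations=[1.0] * 10, n0=1))
```

Longer processing times admit more arrivals per task served, which is the coupling between allocation and latency penalty that the objective trades off.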


Certainty-equivalent receding horizon optimization

Certainty-equivalent approximation: replace future uncertainties with their nominal values

CE queue length: $n_{\ell+1} = \max\{1, n_\ell - 1 + \lambda t_\ell\}$

CE performance function: $f(t_\ell) = \dfrac{\sum_{d \in D} w_d p_d f_d(t_\ell)}{\sum_{d \in D} w_d p_d}$

Finite horizon optimization problem for task $\ell$:

$$\underset{t_1, \dots, t_N}{\text{maximize}} \quad \sum_{j=1}^{n_\ell} \left( w_j f_j(t_j) - \Big( \sum_{i=j}^{n_\ell} c_i + (n_j - n_\ell - j + 1)\, c \Big) t_j - \frac{1}{2} c \lambda t_j^2 \right) + \sum_{j=n_\ell+1}^{N} \left( w f(t_j) - c\, n_j t_j - \frac{1}{2} c \lambda t_j^2 \right)$$

– Univariate DP with continuous action and state variables!
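The two certainty-equivalent quantities above are simple enough to compute directly. The following is a small sketch under stated assumptions: task types are represented as hypothetical `(w_d, p_d, f_d)` triples, and the function names are illustrative. It covers only the CE queue length and CE performance function, not the finite-horizon optimization itself.

```python
def ce_queue_lengths(lam, durations, n0):
    """CE recursion n_{l+1} = max(1, n_l - 1 + lam * t_l), with arrivals at their mean."""
    lengths = [n0]
    for t in durations:
        lengths.append(max(1, lengths[-1] - 1 + lam * t))
    return lengths


def ce_performance(t, task_types):
    """Reward-weighted average performance over task types.

    task_types is a list of (w_d, p_d, f_d): reward, probability of the type,
    and the performance function of that type.
    """
    numerator = sum(w * p * f(t) for w, p, f in task_types)
    denominator = sum(w * p for w, p, _ in task_types)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical task types with step-like performance, for illustration only.
    types = [
        (2.0, 0.5, lambda t: 1.0 if t >= 1.0 else 0.0),
        (1.0, 0.5, lambda t: 1.0 if t >= 3.0 else 0.0),
    ]
    print(ce_queue_lengths(0.5, [2.0, 2.0, 4.0], 3))
    print(ce_performance(2.0, types))
```

Unlike the stochastic recursion, the CE queue length is generally not an integer, which is consistent with the slide's remark that the resulting dynamic program has continuous state variables.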


Numerical Illustration

[Figure, four panels, each plotted against task index (0 to 25): Allocations (allocation per task); Queue Length (queue length per task); Difficulty of tasks (inflection point per task); Importance of tasks (weight per task)]

– Difficult and unimportant tasks are dropped

– Tasks dropped at high queue lengths


Experimental Validation

Task: spot the differences

Expected number of detected differences is a linear function of time (DDM)

Probability of detecting more than 60% of the differences is a sigmoid (threshold-based decision making)

[Figure, two panels plotted against time (0 to 60): "Information aggregation satisfies DDM" (information aggregated vs. time); "Probability of correct decision is sigmoid" (expected reward vs. time)]

Acknowledgment: Christopher J. Ho

Thanks to volunteers: Fabio, Anahita, Florian, Rush, and John


Mixed Team Setup

[Block diagram. Blocks: Cognition & Autonomy Management System; Vehicle Routing Algorithm; Decision Support System; Anomaly Detection Algorithm; Autonomy; Cognition. Labeled signals: incoming tasks (rate λ); queue length n; optimal allocations; decision on tasks; distribution of tasks; outgoing tasks; region selection policy; human operator performance. Tasks & exogenous factors: situational awareness; fatigue & sleep cycle; forgetting; boredom.]

Critical Issues:

1 no sensor observations for surveillance

2 operator's decision: binary random variable

3 sequence of decisions: dependent and non-identically distributed

4 standard CUSUM not applicable

5 performance function on a task varies throughout mission

Mixed Team Surveillance

[Block diagram. Blocks: Cognition & Autonomy Management System; Vehicle Routing Algorithm; Decision Support System; Anomaly Detection Algorithm; Autonomy; Cognition. Labeled signals: incoming tasks (rate λ); queue length n; optimal allocations; decision on tasks; distribution of tasks; outgoing tasks; region selection policy; human operator performance. Tasks & exogenous factors: situational awareness; fatigue & sleep cycle; forgetting; boredom.]

Good News:

1 CUSUM-like algorithm applicable for dependent data

2 performance function varies but can be characterized

Bad News:

1 No detection delay expressions

Simplified routing policy:

Region selection probability ∝ likelihood of anomaly
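The simplified routing policy amounts to normalizing a per-region anomaly score into a probability vector and sampling the next region from it. The sketch below is an assumption-laden illustration: the slides do not say which quantity serves as the "likelihood of anomaly", so the scores here are arbitrary non-negative numbers, and the function names are hypothetical.

```python
import random


def region_selection_probabilities(scores):
    """Normalize non-negative per-region anomaly scores into probabilities."""
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    total = sum(scores)
    if total == 0:
        # No evidence anywhere: fall back to uniform selection (an assumption).
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def select_region(scores, rng):
    """Sample a region index with probability proportional to its score."""
    probs = region_selection_probabilities(scores)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [1.0, 1.0, 2.0]  # hypothetical scores for three regions
    print(region_selection_probabilities(scores))
    print([select_region(scores, rng) for _ in range(10)])
```

Because every region with a positive score keeps a positive selection probability, no region is permanently ignored while evidence accumulates elsewhere.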


Operator performance in Surveillance Mission

Drift-diffusion model:

$$dx(t) = \mu\, dt + \sigma\, dW(t), \qquad x(0) = \frac{\mu}{2\sigma^2} \log \frac{\pi}{1 - \pi}$$

$\pi$: operator's prior belief on anomaly

[Figure: evidence evolution vs. time, with the decision threshold marked]

Performance function:

$$\pi \left( 1 - \Phi\!\left( \frac{-\mu t - x_0}{\sigma \sqrt{t}} \right) \right) + (1 - \pi)\, \Phi\!\left( \frac{\mu t - x_0}{\sigma \sqrt{t}} \right)$$

$\Phi(\cdot)$: standard normal cdf
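The performance function above can be evaluated numerically with the standard library alone. In this sketch the initial condition `x0` is passed in as a parameter rather than derived from the prior, and all numeric values are hypothetical. The standard normal cdf is computed from `math.erf`.

```python
import math


def std_normal_cdf(z):
    """Standard normal cdf, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ddm_performance(t, mu, sigma, x0, prior):
    """Probability of a correct decision after observing for time t > 0.

    prior is the operator's prior belief pi that an anomaly is present.
    """
    if t <= 0.0:
        raise ValueError("t must be positive")
    scale = sigma * math.sqrt(t)
    correct_if_anomaly = 1.0 - std_normal_cdf((-mu * t - x0) / scale)
    correct_if_nominal = std_normal_cdf((mu * t - x0) / scale)
    return prior * correct_if_anomaly + (1.0 - prior) * correct_if_nominal


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 5.0, 10.0):
        print(t, ddm_performance(t, mu=1.0, sigma=1.0, x0=0.0, prior=0.5))
```

With an unbiased start (`x0 = 0`) and an even prior, the expression reduces to Φ(μ√t/σ), which increases toward 1 as the operator spends more time on the task.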


Mixed Team Surveillance: Update Rules

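Prior Belief Update Rule

$$\pi_{\text{new}} = \frac{\pi\, P_{\text{anom}}(\text{dec}_\ell \mid t)}{(1 - \pi)\, P_{\text{no-anom}}(\text{dec}_\ell \mid t) + \pi\, P_{\text{anom}}(\text{dec}_\ell \mid t)}$$

CUSUM-like update rule

$$\Lambda_{\ell+1} = \max\left\{ 0,\ \Lambda_\ell + \log \frac{P_{\text{no-anom}}(\text{dec}_\ell \mid t_\ell, \text{dec}_{\ell-1}, t_{\ell-1}, \dots)}{P_{\text{anom}}(\text{dec}_\ell \mid t_\ell, \text{dec}_{\ell-1}, t_{\ell-1}, \dots)} \right\}$$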
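Both update rules are one-line recursions, sketched below. The function names and all numeric values are hypothetical. The CUSUM-like step takes the log-ratio increment as an argument supplied by the caller, so the sketch only illustrates the clipped accumulation max{0, Λ + increment} and does not model how the conditional decision probabilities are obtained.

```python
def update_prior(prior, p_dec_given_anom, p_dec_given_no_anom):
    """Bayes update of the belief that an anomaly is present, given one decision."""
    numerator = prior * p_dec_given_anom
    denominator = (1.0 - prior) * p_dec_given_no_anom + numerator
    return numerator / denominator


def cusum_step(statistic, increment):
    """One clipped accumulation step: max(0, statistic + increment)."""
    return max(0.0, statistic + increment)


if __name__ == "__main__":
    # Belief after a decision that is four times likelier under an anomaly.
    print(update_prior(0.5, 0.8, 0.2))

    # Statistic driven by a hypothetical sequence of log-ratio increments.
    statistic = 0.0
    for increment in (0.5, -1.0, 0.7, 0.9):
        statistic = cusum_step(statistic, increment)
        print(statistic)
```

The clipping at zero is what lets the statistic forget stale evidence, so a late change is not masked by a long history of contrary observations.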


Mixed Team Surveillance: Numerical Illustration

[Figure, four panels: Allocations (allocation vs. task, tasks 0 to 100); Queue Length (queue length vs. task, tasks 0 to 100); Region Selection Probability (region selection probability vs. time, 0 to 1500); CUSUM Statistics (CUSUM statistics vs. time, 0 to 1500)]


Conclusions

Stochastic Surveillance

Surveillance for anomaly detection

Ensemble CUSUM algorithm with stochastic routing policy

Surveillance policy depends on geography, difficulty, and anomaly likelihood

Attention Allocation

Decision making performance = speed/accuracy trade-off

Sigmoid performance renders combinatorial effects

Blend of combinatorial and convex optimization

Optimal policies drop tasks for static as well as dynamic problems

Mixed Team Surveillance

Time-varying operator performance

CUSUM-like algorithm for anomaly detection

Future Directions

Stochastic Surveillance

More efficient partitioning policies

Maximum entropy Markov chain

Inter-region dynamics of anomalies

Adversarial anomalies

Attention Allocation

More efficient methods of incorporating deadlines

Experimental validation

Mixed Team Surveillance

Incorporating exogenous factors into decision making models

Real-time adaptation of parameters, e.g., by introducing control tasks


References

Search and Surveillance Strategies

V. Srivastava, F. Pasqualetti, and F. Bullo. Stochastic surveillance strategies for spatial quickest detection. Int J Robotic Research, 2013. To appear.

V. Srivastava, K. Plarre, and F. Bullo. Randomized sensor selection in sequential hypothesis testing. IEEE Trans Signal Processing, 59(5):2342–2354, 2011.

Attention Allocation Strategies

V. Srivastava, R. Carli, C. Langbort, and F. Bullo. Attention allocation for decision making queues. Automatica, February 2012. Conditionally accepted.

V. Srivastava and F. Bullo. Knapsack problems with sigmoid utility: Approximation algorithms via hybrid optimization. European Journal of Operational Research, October 2012. Submitted.

Mixed Team Surveillance

V. Srivastava, A. Surana, M. Eckstein, and F. Bullo. Mixed human-robot team surveillance with guaranteed performance. 2012. In preparation.

Funding: AFOSR MURI Program “Behavioral Dynamics in Mixed Human/Robotics Teams” 5/07-6/12