
UNIVERSITY OF CALIFORNIA, SAN DIEGO

Extremum Seeking for Mobile Robots

A Dissertation submitted in partial satisfaction of the

requirements for the degree Doctor of Philosophy

in

Engineering Sciences (Mechanical Engineering)

by

Nima Ghods

Committee in charge:

Professor Miroslav Krstic, Chair
Professor Robert Bitmead
Professor William Helton
Professor Raymond de Callafon
Professor Michael Todd

2011

Copyright

Nima Ghods, 2011

All rights reserved.

The Dissertation of Nima Ghods is approved, and

it is acceptable in quality and form for publication

on microfilm and electronically:

Chair

University of California, San Diego

2011


For my mother

who suddenly faced adversity in this great land

and sacrificed in order to keep opportunity alive for her children.


TABLE OF CONTENTS

Signature Page
Dedication
Table of Contents
List of Figures
Acknowledgements
Vita
Abstract of the Dissertation

1 Introduction
  1.1 Thesis Overview

2 Slow Sensor
  2.1 Introduction
  2.2 Model of a Metal Oxide Sensor
  2.3 Extremum Seeking Design for Slow Sensors
  2.4 Slow Sensor and a Static Map
  2.5 Drifting Sensor and a Static Map
  2.6 Navigation of a 2D Point Mass With a Slow Sensor

3 Source Seeking for Nonholonomic Unicycle with Speed Regulation
  3.1 Introduction
  3.2 Vehicle Model and Control Design
  3.3 The Average System
  3.4 Stability for Small Positive or Negative Vc
  3.5 Stability for Medium and Large Positive Vc
  3.6 Conclusion

4 Multi-Agent Deployment Over a Source
  4.1 Introduction
  4.2 Control Design
  4.3 Free Anchors
  4.4 Fixed Anchors
  4.5 Simulation Results
  4.6 Conclusions

5 Multi-agent Deployment with Stochastic Extremum Seeking
  5.1 Introduction
  5.2 Vehicle Model and Local Agent Cost
  5.3 Control Design
  5.4 Stability Analysis
    5.4.1 Case 1
    5.4.2 Case 2
  5.5 Simulation
  5.6 Conclusion

6 Light Source Seeking Experiments
  6.1 Introduction
  6.2 Vehicle Design
  6.3 Experiment
    6.3.1 Localization and Tracking of a Light Source
    6.3.2 Level Set Tracking of a Light Source
    6.3.3 Collision Avoidance
  6.4 Conclusion and Future Work

7 Plume Source Seeking Experiments
  7.1 Introduction
  7.2 Testbed Setup
  7.3 Robot Design
  7.4 Experiment Results
  7.5 Conclusion and Future Work

A Stability Analysis

B Averaging in Infinite Dimensions

Bibliography


LIST OF FIGURES

Figure 2.1: (a) An example of metal oxide sensor TGS2602 responding to four different concentrations of ethanol. (b) Comparison of the first order sensor model and the real sensor reaction to ethanol.

Figure 2.2: Extremum seeking block diagrams. The modified extremum seeking algorithm (b) applies both to the case with a slow sensor (ε > 0) and to the case with a sensor modeled as a pure integrator, which we also refer to as a ‘drifting sensor’ (ε = 0). In both cases (ε > 0 and ε = 0), the washout filter is optional (both h > 0 and h = 0 are permissible).

Figure 2.3: Gas concentration distribution along the pipe with gas leak at position 0.

Figure 2.4: Simulation results for modified extremum seeking with slow sensor dynamics. (a) Output of the nonlinear map. (b) The sensor position relative to θ∗. (c) The signal after the high pass filter. (d) The slow sensor reading.

Figure 2.5: Simulation results for extremum seeking with Gsensor(s) = b/s with washout filter. (a) Output of the nonlinear map. (b) The sensor position relative to θ∗. (c) The signal after the high pass filter.

Figure 2.6: Simulation results for extremum seeking with Gsensor(s) = b/s and without washout filter. (a) Output of the nonlinear map. (b) The sensor position relative to θ∗.

Figure 2.7: Modified ES for 2D point mass vehicle with slow sensor. The scheme applies both to the case with a slow sensor (ε > 0) and to the case with a sensor modeled as a pure integrator, which we also refer to as a ‘drifting sensor’ (ε = 0), and with both h > 0 and h = 0 being permissible.

Figure 2.8: Simulation results for extremum seeking on a 2D point mass with a slow sensor. (a) Vehicle trajectory with the intensity of the nonlinear map in the background. (b) Output of the nonlinear map. (c) The slow sensor output. (e) The output of the washout filter. (d) and (f) The control input of x-axis and y-axis before the addition of the perturbation, respectively.

Figure 3.1: The notation used in the model of vehicle sensor and center dynamics.

Figure 3.2: Block diagram of source seeking via tuning of angular velocity and forward velocity using one reading.

Figure 3.3: Diagram of the error variables relating the vehicle and the source.

Figure 3.4: Simulation results for steering-based unicycle source seeking with forward speed regulation: (a), (b), (c) showing the evolution of the variables rc, θ, and Vc + bξ, respectively, and (d) showing the trajectory of the vehicle.

Figure 3.5: The difference in trajectories for small positive and negative Vc. The two cases yield convergence to the average equilibria (3.31) and (3.32), respectively. For Vc < 0 the vehicle points towards the source at the end of the transient, whereas for Vc > 0 the vehicle points away from the source at the end of the transient.

Figure 3.6: Simulation result of vehicle trajectory using steering-based source seeking and forward speed regulation on a Rosenbrock function (the white shading represents the maximum).

Figure 3.7: Two trajectories of the same vehicle, with the only difference being the initial condition in θ. The vehicle converges to two different average equilibria, (3.33) and (3.34). (a) shows the evolution of the relative angle between the vehicle heading and the source, with µ0 ≈ π/3. (b) shows the trajectory of the vehicles.

Figure 3.8: Three trajectories of the same vehicle, with the only difference being the value of $V_c$. The vehicle converges to three different trajectories that encircle the source. (a) shows the evolution of the relative angle between the vehicle heading and the source, with $\mu_0 \approx 0$ when $V_c$ is close to $V_c^{\mathrm{upper}}$ and $\mu_0 \approx \pi/2$ when $V_c \gg V_c^{\mathrm{upper}}$. (b) shows the trajectory of the vehicles.

Figure 4.1: Vehicle density function for λ = 5 and λ(α) = 5(2 − α).

Figure 4.2: Block diagram of a single follower agent.

Figure 4.3: Double y-axis plots of the vehicle trajectories showing time scale on the left y-axis, the signal field strength on the right y-axis, and the location of the vehicles on the x-axis. (a) Agent deployment with fixed anchors. (b) Agent deployment with free anchors.

Figure 4.4: Theoretical plot of (a) Formation distribution function and (b) Formation density function for the fixed and free anchor cases.

Figure 4.5: (a) Agent deployment with free anchors starting far from the equilibrium with linearly increasing parameters. (b) Group of 11 agents using free anchor case to achieve seeking of a moving source.

Figure 5.1: Shows a group of vehicles using the stochastic extremum seeking algorithm with Case 1 perturbations and interaction gains given by (5.68). The anchor agents are denoted by red triangles and the follower agents are denoted by blue dots. The agents start inside the dashed black line and converge to a circular formation around the source.

Figure 5.2: Shows a group of vehicles using the stochastic extremum seeking algorithm with Case 2 perturbations. The agents start inside the dashed black line and converge to a line formation centered around the source with the anchor agents at the end of the line formation.

Figure 6.1: Graphical interpretation of the unicycle model with a decoupled sensor. The red dot indicates the sensor's location.

Figure 6.2: ANT (a) top view (b) bottom view.

Figure 6.3: CAD rendering of the PCB.

Figure 6.4: Photographs of the ANT performing source seeking with overlaid trajectory, appearing in order from left to right, top to bottom.

Figure 6.5: Photographs of the ANT performing level set tracing at 15 sec intervals, appearing in order from left to right, top to bottom.

Figure 6.6: Picture of the testbed after the ANT had traced the level set several times.

Figure 6.7: Photographs of two ANTs performing source seeking in a field produced by two light sources at 10 sec intervals, appearing in order from left to right, top to bottom.

Figure 6.8: Photographs of the ANT performing obstacle avoidance while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.

Figure 6.9: Photographs of the ANTs avoiding each other while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.

Figure 7.1: Wind tunnel (a) the intake (b) the outlet.

Figure 7.2: Smoke chamber (a) picture of the smoke chamber (b) diagram of smoke chamber.

Figure 7.3: Matlab GUI used to run experiments. The GUI has communication states on the left, the test controls on the top right, and the real time plots on the bottom right.

Figure 7.4: Plume-bot (a) picture of the plume-bot (b) CAD of plume-bot.

Figure 7.5: Custom designed circuit board.

Figure 7.6: Smoke sensor (a) picture of the smoke sensor (b) circuit diagram for particulate sensors.

Figure 7.7: Circuit diagram for wind sensors.

Figure 7.8: Block diagram of the overall experiment.

Figure 7.9: Picture of the plume-bot during a plume source seeking test.

Figure 7.10: A 35 sec trajectory of the plume-bot performing smoke plume localization in a wind tunnel with a rightward wind of 1 m/s.


ACKNOWLEDGEMENTS

First and foremost, I would like to express my gratitude to my advisor Professor

Miroslav Krstic for all the opportunities that he has provided for me. His excellent

advice and guidance have tremendously helped my academic as well as professional

development. It was truly an honor to work with him.

I would like to thank my mother Tara, my sister Mashia, my niece Armita, and my brother-in-law Shahram for their love and support.

I would like to thank the members of my committee for their helpful questions

and comments, and lending their time and expertise to this project.

I would like to thank my fellow graduate students, Antranik Siranosian, Jennie Cochran, Dan Arnold, James Gray, Paul Frihauf, David Zhang, Andrew Kwok, Gabe Graham, Chad Foerster, Christopher Colburn, Ahsan Samiee, James Krieger, Alicia Powers, Nikos Berkiaris-Liberis, Halil Basturk, Alex Scheinker, Alex Simpkins, Ameet Deshpande, Charles Kinney, Matthew Graham, Bahman Gharesifard, Mike Ouimet, Delphine Bresch-Pietri, Artem Chakirov, Michael Bohm and Gideon Prior for creating an enjoyable and collaborative research environment. A special thanks goes to Jennie, Antranik, and Paul for all their advice and help.

I would like to thank the iBotics team, Gregory Mills, Jenny Wize, Paul Wisecaver, Thomas Denewiler, Andrew Meares, and Chris Barngrover. A special thanks to Andrew and Thomas for all their enthusiasm and love for robots.

I would like to thank the people on the MURI and LANL plume project, Ramon Huerta, Lev Tsimring, Alexander Vergara Tinoco, Kerem Muezzinoglu, Terry Peters, Nikolai Rulkov, Mikhail Rabinovich, Matt Bement, and Charles Farrar.

I would like to thank the MAE machine shop staff, Chris Cassidy, Thomas

Chalfant, and David Lischer for lending me lots of help and letting me use the shop

at ungodly hours.

I would like to thank all the undergrad team members that helped me with my

research experiments and side projects.

Finally, I would like to thank my friends John Crawford, Rody Tebcherani, and

Cezario Tebcherani, who I can always count on. A special thanks to John for his

listening ear and for always helping me put things in perspective.


This dissertation includes reprints of the following papers:

N. Ghods and M. Krstic, “Source seeking with very slow or drifting sensors,” provisionally accepted for Journal of Dynamic Systems, Measurement, and Control. (Chapter 2)

N. Ghods and M. Krstic, “Speed regulation in steering-based source seeking,” Automatica, vol. 46, pp. 452–459, 2010. (Chapter 3)

N. Ghods and M. Krstic, “Multi-agent deployment over a source,” provisionally accepted for IEEE Transactions on Control Systems Technology. (Chapter 4)

N. Ghods, P. Frihauf, and M. Krstic, “Multi-Agent Deployment in the Plane Using Stochastic Extremum Seeking,” IEEE Conference on Decision and Control, 2010. (Chapter 5)

The dissertation author was the primary investigator and author of these publications.


VITA

2006  B.S. in Mechanical Engineering, University of California, San Diego

2006-2008  Teaching Assistant, Department of Mechanical Engineering, University of California, San Diego

2011  Ph.D. in Engineering Sciences (Mechanical Engineering), University of California, San Diego

PUBLICATIONS

C. Zhang, D. Arnold, N. Ghods, A.A. Siranosian and M. Krstic, “Source seeking with non-holonomic unicycle without position measurement and with tuning of forward velocity,” Systems & Control Letters, vol. 56, issue 3, pp. 245–252, 2007.

J. Cochran, N. Ghods, A. Siranosian, and M. Krstic, “3D source seeking for underactuated vehicles without position measurement,” IEEE Transactions on Robotics, vol. 25, pp. 245–252, 2009.

N. Ghods and M. Krstic, “Speed regulation in steering-based source seeking,” Automatica, vol. 46, pp. 452–459, 2010.

N. Ghods and M. Krstic, “Source seeking with very slow or drifting sensors,” provisionally accepted for Journal of Dynamic Systems, Measurement, and Control.

N. Ghods and M. Krstic, “Multi-agent deployment over a source,” provisionally accepted for IEEE Transactions on Control Systems Technology.


ABSTRACT OF THE DISSERTATION

Extremum Seeking for Mobile Robots

by

Nima Ghods

Doctor of Philosophy in Engineering Sciences (Mechanical Engineering)

University of California, San Diego, 2011

Professor Miroslav Krstic, Chair

The work in this thesis describes theoretical and experimental results of extremum seeking applied to vehicle(s) with the objective of localizing the source of an unknown, nonlinear signal field. For environments where position information is unavailable, the extremum seeking method is applied to autonomous vehicles as a means of navigating to find the source of some signal which the vehicles can measure locally. The signal is at maximum intensity at the source and decreases with distance away from the source. Although in experiments we only assume that the signal field has a maximum, to prove theoretical stability we use a quadratic form as a local approximation of the signal field.

We explore the idea of dealing with a very slow or drifting sensor and provide

stability results for several distinct variations of an extremum seeking scheme for 1D

optimization and 2D source localization with point-mass vehicle dynamics. Detailed

convergence analysis and simulations for steering-based source seeking with forward

velocity regulation applied to nonholonomic vehicles are provided. We develop a

deterministic algorithm in a continuum to deploy a group of autonomous vehicles

(agents) capable of measuring relative position to neighbors, in a line formation,

which has a higher density of agents near the source of a measurable signal and a

lower density away from the source in 1D. We also consider stochastic swarming

algorithms in 2D that force the net of agents to spread, maintain a formation, and

seek a source without position information, whereby each agent is given a local


measurement of signal field and the relative distance from neighbors.

Experimental results of extremum seeking applied to mobile vehicles to perform

localization, tracking, and level-set tracing of a light source are shown. We perform

experiments with multiple vehicles using extremum seeking not only to localize the

light source but also to avoid objects and each other. Finally, we discuss details

of setting up a testbed to produce a characterized smoke plume and the results of

plume source seeking experiments.


1 Introduction

The main goal of this work is to develop algorithms based on extremum seeking

for autonomous vehicle(s). Using theoretical and experimental results we show that

these algorithms allow the vehicle(s) to localize an unknown source. Throughout

this work we assume the signal is at maximum intensity at the source and decreases

with distance away from the source. The main idea for the control law is to guide

the vehicle(s) up the gradient of the signal to find the source. For the theoretical

results we assume a quadratic form for the signal field. The quadratic assumption

can be relaxed using the same methods in [2, 53].

For coordinated motion control and autonomous agents, operation without position information is an area of rapidly growing interest. Extremum seeking is a useful concept in environments where GPS is unavailable and inertial navigation is too expensive, such as urban environments, underwater, under ice, and in caves. Extremum seeking is a real-time, non-model-based adaptive control technique for tuning parameters to optimize an unknown nonlinear map. Extremum seeking relies on a persistently exciting perturbation, usually a sinusoid, applied to the parameters being tuned. The perturbation quantifies the effects of the parameters on the output of the nonlinear map, and that information is then used to generate estimates of the optimal parameter values. Extremum seeking [2] has been advanced or employed in applications by several other authors [10, 4, 38, 53, 54, 1, 44, 45, 61, 46, 52, 62, 8, 57, 56].
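As a rough illustration of the perturb, demodulate, and integrate loop described above, the following minimal sketch simulates a single-parameter extremum seeking loop on a static quadratic map. The map coefficients, the gains `a`, `omega`, `k`, `h`, the step size, and the function name `extremum_seek` are illustrative assumptions and are not taken from this thesis; the loop follows the generic textbook structure and is not claimed to reproduce the exact scheme of [2].

```python
import math


def extremum_seek(theta_hat0=1.0, theta_star=2.0, f_star=5.0, q=1.0,
                  a=0.2, omega=10.0, k=2.0, h=1.0, dt=1e-3, t_end=40.0):
    """Forward-Euler simulation of a basic extremum seeking loop.

    All numerical values are hypothetical and chosen only for illustration.
    Returns the final parameter estimate theta_hat.
    """
    def f(theta):
        # Unknown static map with a maximum f_star at theta_star (assumed quadratic).
        return f_star - q * (theta - theta_star) ** 2

    theta_hat = theta_hat0
    chi = f(theta_hat0)  # low-pass state of the washout filter s/(s + h)
    for i in range(int(round(t_end / dt))):
        s = math.sin(omega * i * dt)
        y = f(theta_hat + a * s)         # probe the map with a sinusoidal perturbation
        eta = y - chi                    # washout (high-pass) output
        chi += dt * h * (y - chi)        # update the low-pass state
        theta_hat += dt * k * eta * s    # demodulate and integrate (gradient ascent)
    return theta_hat


if __name__ == "__main__":
    print("final estimate:", extremum_seek())
```

With these assumed values the estimate should settle close to `theta_star`, up to a small ripple caused by the perturbation.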

In the present work we attempt to overcome some of the newly faced challenges of source seeking with autonomous vehicle(s) using extremum seeking. In working on chemical localization, one is faced with the problem of very slow sensor dynamics, which causes the overall system to perform poorly or become unstable. In Chapter 2 the problem of slow sensor dynamics is addressed. In [11] the vehicle is constrained to have a constant forward velocity, which creates a trade-off between convergence speed and the size of the ring to which the vehicle converges. The forward velocity constraint is unrealistic for ground and underwater vehicles, since most of the time they are able not only to slow down but also to go backwards. In Chapter 3 we explore the benefits of being able to regulate the forward velocity. The problem of multiple vehicles with local information about the source and their neighbors performing source seeking is analyzed in Chapters 4 and 5.

The thrust of the investigator’s effort as a Ph.D. candidate has been theoretical development of control algorithms. These algorithms have been employed in several applications including an autonomous underwater robot, a light-source seeking robot, and a plume-source seeking robot. The final chapters set forth the experimental work that employs the algorithms. As is often the case when theory meets application, the applications give insight into future theoretical work that could improve robotic performance in extremum seeking. Some of the theoretical work done in [11, 12] is experimentally verified in Chapter 6. The difficult task of seeking the source of a complex smoke plume experimentally is considered in Chapter 7.

1.1 Thesis Overview

The contents of this thesis are as follows.

Chapter 2 presents a modified extremum seeking scheme to account for and

exploit slow sensor dynamics. We also consider the worst case, which is sensor

dynamics governed by a pure integrator.

Chapter 3 presents an extremum seeking based design, with the intent of bringing the vehicle to a stop, or as close to a stop as possible. The vehicle speed is controlled using simple derivative-like feedback of the sensor measurement (the derivative is approximated with a washout filter) to which a speed bias parameter Vc is added. The angular velocity is tuned using standard extremum seeking.

Chapter 4 presents a control algorithm for vehicles that are capable of sensing a local signal field and their position relative to their neighbors. The control law combines two components. One component is inspired by the heat partial differential equation (PDE) and results in the agents deploying between two anchor agents. The other component is based on extremum seeking and achieves higher vehicle density around the source. Using averaging theory for PDEs we prove that the vehicle density will be highest around the source.

Chapter 5 presents the deployment of a group of N autonomous fully actuated

vehicles (agents) in a non-cooperative manner in a planar signal field using stochastic

extremum seeking, with the objective of spreading, maintaining a formation, and

seeking a source. The vehicles are not able to sense their own positions but are

capable of sensing the distance between their neighbors and themselves.

Chapter 6 presents the robot design and experimental results for localizing,

tracking, and level-set tracing of a light source. The experimental results in this chapter

validate some of the numerical and theoretical results presented in [11, 12].

Chapter 7 presents the construction of a testbed and experimental results for

smoke plume source localization experiments. The experiments done in this chapter

are the first steps in validating the theoretical work in Chapter 2.

2 Slow Sensor

In this chapter we introduce a new idea of how to extend extremum seeking to deal with a slow or drifting sensor. Slow sensors arise in many applications, including sensing chemical concentrations in tracking of contaminant plumes. Slow sensors are often the cause of poor performance and a potential cause of instability. We design a modified extremum seeking scheme to account for and exploit slow sensor dynamics. We also consider the worst case, which is sensor dynamics governed by a pure integrator. We provide stability results for several distinct variations of an extremum seeking scheme for one-dimensional optimization. Then we develop a design for source seeking in a plane using a fully actuated vehicle, prove its closed-loop convergence, and present simulation results. We use metal-oxide microhotplate gas sensors as a real world example of slow sensor dynamics, model the sensor based on experimental data, and employ the identified sensor model in our source seeking simulations.

2.1 Introduction

Recent advances in extremum seeking have shown it to be a powerful tool in real

time non-model based control and optimization [10, 4, 38, 51, 54, 1]. Success has

been achieved in compensating slow actuator dynamics [60, 59, 11], but no results

have been reported on extremum seeking for plants with slow sensor dynamics, or

in the extreme case of sensors governed by a pure integrator (drifting sensors). In


this thesis we introduce a new idea of how to extend extremum seeking to deal with

a slow or drifting sensor.

For simplicity, we first consider a single-parameter extremum seeking problem

with a static map, and sensor dynamics. Then we consider a 2D problem with

simple vehicle dynamics, and with slow sensor dynamics. The classical extremum

seeking scheme [2] is modified by observing that the integrator, a key adaptation

element, is already present in the sensor dynamics, if they are governed by a pure

integrator. We perform an appropriate (time-varying) swap of the integrator block

and the demodulation block (Section 2.3), and as a result obtain a scheme where

the map output converges to the extremum quickly, while the sensor output may

converge slowly, or it may even drift to infinity (in the case of a sensor modeled by

a pure integrator). Stability and simulation results are presented first for a system

with a slow sensor (Section 2.4). This is followed by results for a sensor governed by

a pure integrator (Section 2.5). (These results do not imply one another.) Finally,

results for the case of a 2D point mass vehicle with a slow sensor are presented

(Section 2.6).

Traditional methods for gas plume seeking using slow metal oxide sensors [28, 29, 30] (reviewed in Section 2.2) either wait for a large enough change in the sensor reading or for the sensor reading to settle before they act. Most of these search methods [5, 34, 35] are based on mimicking insect behavior (mainly moths) to localize the source of an odor without much consideration of the sensor dynamics. The modified ES scheme reacts to the sensor reading continuously, which allows the overall system to converge to an optimum much faster than the sensor settling time.

Our compensation of slow sensor dynamics does not amount to employing a

differentiator after the sensor to cancel the integrator in the sensor and act on the

trend of the signal, rather than on the value of the signal. This approach would result

in amplification of noise. Instead, our approach leverages the integrator action in the

sensor, to have it assume the role of the tuning element in the extremum seeking

loop. We highlight this by considering both a version of the modified extremum

seeking scheme with the standard washout filter in the loop and a version without

the washout filter, proving stability in each case.


To show the capabilities of the modified extremum seeking scheme with metal oxide sensors, we consider the realistic two-dimensional problem of localizing a gas leak in a room with a single moving sensor. In the 2D source seeking problem, two integrators exist in the loop, one from the sensor and one associated with the vehicle model. A modification of the extremum seeking scheme is needed to reduce the loop phase drop from 180° to a lesser value. This modification comes in the form of a washout filter that approximates a differentiator or, if preferred, in the form of a phase-lead compensator.
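The phase argument can be illustrated with a small frequency-response calculation. The sketch below assumes a loop made of a vehicle integrator 1/s, a sensor b/(s + ε), and an optional washout filter s/(s + h), and sums the phase contributions of the individual factors at one frequency. The evaluation frequency, the washout pole h = 1, and the function names are hypothetical choices for illustration only; the actual loop analyzed in Section 2.6 may contain further elements.

```python
import cmath
import math


def phase_deg(g):
    """Phase of a complex gain in degrees."""
    return math.degrees(cmath.phase(g))


def loop_phase_deg(omega, eps=0.0, b=0.037, h=None):
    """Sum of the phases of the loop factors at frequency omega (rad/s).

    eps = 0 models a drifting sensor (pure integrator).
    h = None omits the washout filter; otherwise s/(s + h) is included.
    """
    s = 1j * omega
    factors = [
        1.0 / s,        # vehicle modeled as a pure integrator (assumption)
        b / (s + eps),  # first order sensor model
    ]
    if h is not None:
        factors.append(s / (s + h))  # washout filter contributes phase lead
    return sum(phase_deg(g) for g in factors)


if __name__ == "__main__":
    print("without washout:", loop_phase_deg(1.0))
    print("with washout   :", loop_phase_deg(1.0, h=1.0))
```

With a drifting sensor and these illustrative values, the two integrators contribute -180 degrees in total, and the washout filter with h equal to the evaluation frequency recovers 45 degrees of that phase drop.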

2.2 Model of a Metal Oxide Sensor

Due to their small size, metal oxide based microhotplate sensors can be used

to develop portable, sensitive, and low-cost gas monitoring systems to detect, for

example, leakage of hazardous gases. Modeling metal oxide microhotplate sensor

dynamics accurately can prove to be very difficult, as seen in [22, 20, 21]. In this

section we make a reasonable assumption to simplify the complicated models. The

basic premise of the sensor model in [22, 20, 21] is that the sensor reading is driven

by an exponential of the concentration of several gases, and the gas concentrations

are governed by several coupled ODEs, which correspond to chemical reactions. We

are concerned with locating the maximum of a single gas with little fluctuation in

temperature.

Tests were performed to better understand the leading dynamics of the sensor.

A gas with a certain concentration was released at 30 [sec] into the experiment, then

the gas was flushed out at 600 [sec]. Figure 2.1 (a) shows the reaction of a TGS2602

metal oxide microhotplate sensor [19] to ethanol at four different concentrations.

Note in Figure 2.1 (a) that the sensor reading takes around 120 [sec] to settle,

independently of the gas concentration.

Figure 2.1: (a) An example of metal oxide sensor TGS2602 responding to four different concentrations of ethanol (250, 200, 150, and 100 ppm), plotted as sensor resistance (kΩ) versus time (sec). (b) Comparison of the first order sensor model and the real sensor reaction to ethanol at 250 ppm, plotted as sensor resistance (kΩ) versus time (sec).

From these tests we see that the dominant dynamics of the sensor are governed by a first order system

$$G_{\mathrm{sensor}}(s) = \frac{b}{s + \varepsilon}, \tag{2.1}$$

where b and ε are positive constants that depend on the sensor and the type of

gases. After performing several tests we observed that, although ε is positive, its

magnitude is quite small (on the order of $10^{-2}$). By inspection we set b = 0.037 and

ε = 0.046 to get the model for the gas sensor reacting to ethanol. Figure 2.1 (b)

compares the identified sensor model against the real TGS2602 gas sensor reading.

The sensor model parameters change for different gases and different sensors but

always stay positive. Note that methods in [2] can be applied if the sensor also

contains any fast dynamics.
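As a rough illustration of what the identified model (2.1) implies, the following Python sketch evaluates the closed-form step response of $b/(s+\varepsilon)$ and its settling time. The 2% settling band, the unit step height, and the function names are assumptions made here for illustration; they are not taken from the experiments above.

```python
import math

def step_response(t, u, b=0.037, eps=0.046):
    """Output of b/(s + eps) at time t for a step input of height u applied at t = 0."""
    return (b / eps) * u * (1.0 - math.exp(-eps * t))

def settling_time(eps=0.046, band=0.02):
    """Time after which the step response stays within `band` of its final value."""
    return math.log(1.0 / band) / eps

time_constant = 1.0 / 0.046          # roughly 22 s
t_settle = settling_time()           # 2% settling time of the model
print(f"time constant = {time_constant:.1f} s, 2% settling time = {t_settle:.1f} s")
```

Under these assumptions the model settles on the order of 100 s, the same order of magnitude as the settling time observed in Figure 2.1 (a).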

2.3 Extremum Seeking Design for Slow Sensors

In this section, we modify the classical extremum seeking scheme to work with

very slow sensors. In the extreme case the sensors are governed by a pure integrator,

namely drifting sensors. We start with a key observation that an integrator is already

a part of the classical extremum seeking loop in Figure 2.2(a). We need to modify the

scheme so that the sensor itself is performing the task of this integrator. To do this,

we need to swap the integrator and the multiplication by sin(ωt) in Figure 2.2(a),

i.e., to move the integrator upstream in the signal path. This is not a simple swap of

linear blocks because a multiplication by a time varying signal is involved. However,

using integration by parts, we get that

\[ \int_0^t \eta(\tau)\sin(\omega\tau)\,d\tau = \sin(\omega t)\int_0^t \eta(\tau)\,d\tau - \omega\int_0^t \cos(\omega\tau)\int_0^\tau \eta(\sigma)\,d\sigma\,d\tau . \tag{2.2} \]

We use this observation to convert the scheme in Figure 2.2(a) to the scheme in

Figure 2.2(b), where the guiding idea is that the sensor is a pure integrator, namely,

ε = 0. As we shall see, this modification also works when ε > 0.
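The identity (2.2) can be checked numerically. The sketch below is only an illustration: the test signal, the frequency, the horizon, and the helper names are arbitrary choices made here, and the integrals are approximated with the trapezoidal rule.

```python
import math

def cumulative_trapezoid(values, dt):
    """Running trapezoidal integral of uniformly sampled values, starting at 0."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (left + right) * dt)
    return out

def both_sides(eta, omega, t_end, n=20000):
    """Return (left side, right side) of identity (2.2) evaluated at t = t_end."""
    dt = t_end / n
    ts = [i * dt for i in range(n + 1)]
    eta_vals = [eta(t) for t in ts]
    # Left side: integral of eta(tau) * sin(omega * tau).
    lhs = cumulative_trapezoid(
        [e * math.sin(omega * t) for e, t in zip(eta_vals, ts)], dt)[-1]
    # Right side: uses the running integral of eta (the "sensor as integrator").
    eta_int = cumulative_trapezoid(eta_vals, dt)
    inner = cumulative_trapezoid(
        [math.cos(omega * t) * acc for t, acc in zip(ts, eta_int)], dt)[-1]
    rhs = math.sin(omega * t_end) * eta_int[-1] - omega * inner
    return lhs, rhs

lhs, rhs = both_sides(lambda t: math.exp(-0.3 * t) + 0.5 * t, omega=3.0, t_end=2.0)
print(lhs, rhs)  # the two values should agree up to discretization error
```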

In the following sections we will show, using averaging theory, that the modified

extremum seeking scheme can be used to maximize a signal (for example gas con-

centration), using just the output of the sensor and without any knowledge of the

map parameters or the sensor parameters.

[Figure 2.2 block diagrams: (a) Classical extremum seeking algorithm. (b) Modified ES for slow sensor.]

Figure 2.2: Extremum seeking block diagrams. The modified extremum seeking algorithm (b) applies both to the case with a slow sensor (ε > 0) and to the case with a sensor modeled as a pure integrator, which we also refer to as a ‘drifting sensor’ (ε = 0). In both cases (ε > 0 and ε = 0), the washout filter is optional (both h > 0 and h = 0 are permissible).

2.4 Slow Sensor and a Static Map

We consider applications in which the goal is to maximize the output of an

unknown nonlinear map f(θ) by varying the input θ. The signal f(θ(t)) is measured

through a slow sensor, namely, the signal µ(t), governed by the ODE

\[ \dot\mu = -\varepsilon\mu + b f(\theta) . \tag{2.3} \]

Let the maximizing value of θ be denoted as θ∗. We assume that the nonlinear map

is quadratic,

\[ J = f(\theta) = f^* - q_\theta(\theta - \theta^*)^2, \tag{2.4} \]

where besides θ∗ and f ∗ being unknown, qθ is an unknown positive constant.

In this section we study the case of a slow sensor (ε > 0 but small). We consider

both the ES scheme with a washout filter (h > 0) and without a washout filter

(h = 0). In the next section we address the same two cases but for a sensor modeled

as a pure integrator (ε = 0).

Let $\hat\theta$ be the estimate of $\theta^*$, and $\tilde\theta = \hat\theta - \theta^*$ be the error. From Figure 2.2 (b) we obtain

\[ \hat\theta = k\left(\eta\sin(\omega t) + \frac{1}{s}\left[-\eta\omega\cos(\omega t)\right]\right). \tag{2.5} \]

Note that we mix time-domain and frequency-domain notation by using the brackets [·] to denote that the transfer function acts as an operator on a time-domain function.

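To prove stability we are going to analyze $\tilde\theta$, η, and µ. Assuming the nonlinear map (2.4) and the block diagram in Figure 2.2 (b) we obtain

\[ \mu = \frac{b}{s+\varepsilon}\left[f^* - q_\theta(\theta - \theta^*)^2\right] \tag{2.6} \]

\[ \eta = \frac{s}{s+h}[\mu] \tag{2.7} \]

\[ \tilde\theta = k\left(\eta\sin(\omega t) + \frac{1}{s}\left[-\eta\omega\cos(\omega t)\right]\right) - \theta^* . \tag{2.8} \]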

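By rearranging (2.7), multiplying (2.6) and (2.8) by s, replacing θ with $\tilde\theta + \theta^* + a\sin(\omega t)$, and setting τ = ωt we obtain

\[ \frac{d\mu}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta(\tilde\theta + a\sin(\tau))^2 - \varepsilon\mu\right] \tag{2.9} \]

\[ \frac{d\eta}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta(\tilde\theta + a\sin(\tau))^2 - \varepsilon\mu - h\eta\right] \tag{2.10} \]

\[ \frac{d\tilde\theta}{d\tau} = -\frac{1}{\omega}k\left(h\eta + \varepsilon\mu - b f^* + b q_\theta(\tilde\theta + a\sin(\tau))^2\right)\sin(\tau) . \tag{2.11} \]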

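Using the following two identities

\[ \frac{1}{2\pi}\int_0^{2\pi}(\tilde\theta + a\sin(\tau))^2\,d\tau = \tilde\theta^2 + \frac{a^2}{2} \tag{2.12} \]

\[ \frac{1}{2\pi}\int_0^{2\pi}(\tilde\theta + a\sin(\tau))^2\sin(\tau)\,d\tau = \tilde\theta a, \tag{2.13} \]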

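to average (2.9)–(2.11) we obtain

\[ \frac{d\mu_{\text{avg}}}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta\left(\tilde\theta_{\text{avg}}^2 + \frac{a^2}{2}\right) - \varepsilon\mu_{\text{avg}}\right] \tag{2.14} \]

\[ \frac{d\eta_{\text{avg}}}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta\left(\tilde\theta_{\text{avg}}^2 + \frac{a^2}{2}\right) - \varepsilon\mu_{\text{avg}} - h\eta_{\text{avg}}\right] \tag{2.15} \]

\[ \frac{d\tilde\theta_{\text{avg}}}{d\tau} = -\frac{k b a q_\theta}{\omega}\,\tilde\theta_{\text{avg}} . \tag{2.16} \]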

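The equilibrium of the averaged system (2.14)–(2.16), obtained by setting the right-hand sides to zero, is

\[ \mu_{\text{avg}}^e = \frac{b}{\varepsilon}\left(f^* - \frac{q_\theta a^2}{2}\right) \tag{2.17} \]

\[ \eta_{\text{avg}}^e = 0 \tag{2.18} \]

\[ \tilde\theta_{\text{avg}}^e = 0 . \tag{2.19} \]

The Jacobian of (2.14)–(2.16) at $(\mu_{\text{avg}}^e, \eta_{\text{avg}}^e, \tilde\theta_{\text{avg}}^e)$ is

\[ J_{\text{avg}} = \frac{1}{\omega}\begin{bmatrix} -\varepsilon & 0 & 0 \\ -\varepsilon & -h & 0 \\ 0 & 0 & -k b a q_\theta \end{bmatrix} . \tag{2.20} \]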

Given that the nonlinear map has a maximum (qθ > 0) and that the sensor is stable (ε > 0) and non-inverting (b > 0), it follows that, if we choose a, ω, k, h > 0, the Jacobian (2.20) is Hurwitz and the equilibrium of the averaged system (2.14)–(2.16) is locally exponentially stable. From the averaging theorem [36] we get the following result.

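Theorem 2.1 There exists ω∗ such that for all finite ω > ω∗ the system in Figure 2.2 (b) with nonlinear map (2.4) has a unique exponentially stable periodic solution $(\mu^{2\pi/\omega}(t), \eta^{2\pi/\omega}(t), \tilde\theta^{2\pi/\omega}(t))$ of period 2π/ω which satisfies

\[ \left\| \begin{bmatrix} \mu^{2\pi/\omega}(t) - \dfrac{b}{\varepsilon}\left(f^* - \dfrac{q_\theta a^2}{2}\right) \\ \eta^{2\pi/\omega}(t) \\ \tilde\theta^{2\pi/\omega}(t) \end{bmatrix} \right\| \le O(1/\omega), \quad \forall\, t \ge 0 . \tag{2.21} \]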

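Since $\theta - \theta^* = \tilde\theta + a\sin(\omega t) = (\tilde\theta - \tilde\theta^{2\pi/\omega}) + \tilde\theta^{2\pi/\omega} + a\sin(\omega t)$, the theorem implies that the first term converges to zero, the second term is O(1/ω), and the third term is O(a). Thus $\limsup_{t\to\infty}|\theta(t) - \theta^*| = O(a + 1/\omega)$. Hence, we get

\[ \limsup_{t\to\infty} |f(\theta(t)) - f^*| = O(a^2 + 1/\omega^2) , \tag{2.22} \]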

which characterizes the asymptotic performance of the extremum seeking loop in

Figure 2.2 (b).
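The loop in Figure 2.2 (b) can be sketched in a few lines of Python. This is a minimal forward-Euler illustration, not the simulation used for the figures: it assumes the quadratic map (2.4) with illustrative values f∗ = 1, qθ = 0.5, θ∗ = 0 and b = 1, the slow-sensor pole ε = 0.046, and the ES parameters ω = 30, a = 0.2, k = 10, h = 1. The update for the estimate uses the fact that differentiating (2.5) gives $\dot{\hat\theta} = k\,\dot\eta\,\sin(\omega t)$. The function name, step size, and horizon are choices made here.

```python
import math

def simulate_modified_es(theta_hat0=3.0, t_end=20.0, dt=2e-4,
                         f_star=1.0, q=0.5, theta_star=0.0,
                         b=1.0, eps=0.046, k=10.0, a=0.2, omega=30.0, h=1.0):
    """Forward-Euler sketch of the sensor-compensated ES loop with a quadratic map."""
    mu, eta, theta_hat = 0.0, 0.0, theta_hat0
    for i in range(int(round(t_end / dt))):
        t = i * dt
        theta = theta_hat + a * math.sin(omega * t)      # probing perturbation
        f = f_star - q * (theta - theta_star) ** 2       # quadratic map (2.4)
        mu_dot = -eps * mu + b * f                       # slow sensor (2.3)
        eta_dot = mu_dot - h * eta                       # washout filter s/(s+h)
        theta_hat_dot = k * eta_dot * math.sin(omega * t)  # derivative of (2.5)
        mu += dt * mu_dot
        eta += dt * eta_dot
        theta_hat += dt * theta_hat_dot
    return theta_hat, eta, mu

theta_hat_final, eta_final, mu_final = simulate_modified_es()
print(f"estimate after 20 s: {theta_hat_final:.3f}")
```

With these assumed values the averaged convergence rate in (2.16) is $k b a q_\theta = 1$ per second in the original time scale, so the estimate is expected to approach a small neighborhood of θ∗ well before the sensor state µ has settled.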

Figure 2.4 shows simulations for a moving sensor along the length of a pipe, where

the objective is to localize a gas leak on the pipe with the use of sensor-compensated

extremum seeking, with the gas distribution, which is shown in Figure 2.3, modeled

in the form

\[ f(\theta) = \frac{\delta^*}{1 + p_\theta(\theta - \theta^*)^2}, \tag{2.23} \]

where δ∗ = 250, pθ = 0.5, and θ∗ = 0. The extremum seeking parameters were

chosen as ω = 30, a = 0.2, k = 10, and h = 1. We assume the sensor model (2.1)

with the parameters ε = 0.046 and b = 0.037. Figure 2.4(b) shows the position of

the sensor in reference to the gas leak with a starting position of 3. The nonlinear

map output (J) and the sensor position (θ) quickly converge to a periodic motion

around f ∗ and θ∗, respectively. The signal after the washout filter (η), shown in

Figure 2.4(c), goes to zero.

Note in Figure 2.4(d) that the sensor reading converges very slowly. The time interval for which J and θ are shown in Figure 2.4 is only one tenth of the time interval on which η and µ are shown. This is done in order to display the details of the rapidly convergent sensor position θ, while the sensor reading µ is about ten times slower. More specifically, even though it takes the sensor reading 120 [sec] to settle, the extremum seeking algorithm is able to tune θ to achieve maximum output from the nonlinear map in less than 6 [sec]. The convergence would be orders of magnitude slower if the algorithm had to wait for the sensor reading to settle every time it wanted to tweak θ.

[Figure 2.3 plot, "Distribution of Ethanol Gas": gas concentration (ppm) versus distance (m).]

Figure 2.3: Gas concentration distribution along the pipe with gas leak at position 0.

In some applications the use of washout filters may be undesirable because they

act as approximate differentiators and therefore may result in the amplification of

noise. Dropping the washout filter still results in a stable system. The washout filter

is used for performance reasons, not for stability reasons or to ‘cancel’ the extremely

slow (integrator-like) dynamics of the sensor. The proof for this case (omitted) is

very similar to the proof for the case where the sensor is a pure integrator but the

ES scheme does employ a washout filter (Theorem 2.3), with the Jacobian of the

averaged system given as

\[ J_{\text{avg}} = \frac{1}{\omega}\begin{bmatrix} -\varepsilon & 0 \\ 0 & -k b a q_\theta \end{bmatrix} . \tag{2.24} \]

Theorem 2.2 Consider the system in Figure 2.2 (b) with the nonlinear map of form (2.4) and without the washout filter. There exists ω∗ such that for all finite ω > ω∗ the system has a unique exponentially stable periodic solution $(\mu^{2\pi/\omega}(t), \tilde\theta^{2\pi/\omega}(t))$ of period 2π/ω which satisfies

\[ \left\| \begin{bmatrix} \mu^{2\pi/\omega}(t) - \dfrac{b}{\varepsilon}\left(f^* - \dfrac{q_\theta a^2}{2}\right) \\ \tilde\theta^{2\pi/\omega}(t) \end{bmatrix} \right\| \le O(1/\omega), \quad \forall\, t \ge 0 . \tag{2.25} \]

[Figure 2.4 plots. Panel (a), "Ethanol Concentration": J (ppm) versus time (sec). Panel (b), "Position Estimate of the Gas Leak": θ (m) versus time (sec). Panel (c), "High Pass Filter of the Sensor Reading": η versus time (sec). Panel (d), "Sensor Reading": µ (kΩ) versus time (sec).]

Figure 2.4: Simulation results for modified extremum seeking with slow sensor dynamics. (a) Output of the nonlinear map. (b) The sensor position relative to θ∗. (c) The signal after the high pass filter. (d) The slow sensor reading.

Simulation (not included) for the system in Theorem 2.2 shows a convergence

rate that is inferior to that of the algorithm with the washout filter (Theorem 2.1).

This convergence rate difference is not captured by the averaging analysis because

the approximation accuracy of averaging is low when some of the eigenvalues of the

average system are small due to small ε.

2.5 Drifting Sensor and a Static Map

Our scheme works even when ε = 0, which is the case when the sensor is a

pure integrator. This is a rather extreme situation of a sensor that responds, but

permanently drifts in its value (towards infinity). All that we can achieve in this

case is to maximize the sensor’s input, since its output never settles.

The stability analysis for this case mimics some parts of the proof for Theorem

2.1. Assuming the nonlinear map in (2.4) and setting ε = 0, we write (2.6) as

\[ \mu = \frac{b}{s}\left[f^* - q_\theta\left(\tilde\theta + a\sin(\omega t)\right)^2\right] . \tag{2.26} \]

Since the sensor output µ is not going to settle when its input θ settles, we do not

include the sensor output as a state for which we are proving convergence. We only

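study the states $\tilde\theta$ and η, whose equations are

\[ \frac{d\eta}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta(\tilde\theta + a\sin(\tau))^2 - h\eta\right] \tag{2.27} \]

\[ \frac{d\tilde\theta}{d\tau} = -\frac{1}{\omega}k\left(h\eta - b f^* + b q_\theta(\tilde\theta + a\sin(\tau))^2\right)\sin(\tau) . \tag{2.28} \]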

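Using the identities (2.12) and (2.13) we obtain the following averaged equations

\[ \frac{d\eta_{\text{avg}}}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta\left(\tilde\theta_{\text{avg}}^2 + \frac{a^2}{2}\right) - h\eta_{\text{avg}}\right] \tag{2.29} \]

\[ \frac{d\tilde\theta_{\text{avg}}}{d\tau} = \frac{1}{\omega}\left[-k b a q_\theta\,\tilde\theta_{\text{avg}}\right] . \tag{2.30} \]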

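The averaged system (2.29), (2.30) has the equilibrium

\[ \left[\eta_{\text{avg}}^e,\ \tilde\theta_{\text{avg}}^e\right] = \left[\frac{b}{h}\left(f^* - \frac{q_\theta a^2}{2}\right),\ 0\right], \tag{2.31} \]

with the Jacobian

\[ J_{\text{avg}} = \frac{1}{\omega}\begin{bmatrix} -h & 0 \\ 0 & -k b a q_\theta \end{bmatrix} . \tag{2.32} \]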


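Theorem 2.3 There exists ω∗ such that for all ω > ω∗ the system in Figure 2.2 (b) with the nonlinear map of form (2.4) and ε = 0 in the sensor dynamics has a unique exponentially stable periodic solution $(\eta^{2\pi/\omega}(t), \tilde\theta^{2\pi/\omega}(t))$ of period 2π/ω which satisfies

\[ \left\| \begin{bmatrix} \eta^{2\pi/\omega}(t) - \dfrac{b}{h}\left(f^* - \dfrac{q_\theta a^2}{2}\right) \\ \tilde\theta^{2\pi/\omega}(t) \end{bmatrix} \right\| \le O(1/\omega), \quad \forall\, t \ge 0 . \tag{2.33} \]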

Figure 2.5 shows a simulation with a sensor Gsensor(s) = b/s, θ∗ = 0, f ∗ = 1,

qθ = 0.5, and b = 1. The ES parameters are chosen as ω = 30, a = 0.2, k = 10,

and h = 1. Figure 2.5(a) shows the ability of the sensor-compensated ES scheme

to maximize the output of a nonlinear map even with a marginally stable sensor.

Figure 2.5(b) shows θ starting from 3 and converging to a periodic motion around

θ∗. Figure 2.5(c) shows how the signal after the washout filter (η) converges to a

periodic motion around ηeavg = 1.02. The response for µ(t) is not shown since it

drifts in a linear manner towards infinity, as expected.

The scheme studied in Theorem 2.3 contains a cascade of the sensor’s integrator

dynamics and of a washout filter. It may appear that the key to the result is that a

differentiator cancels an integrator. This is not the case at all, as we illustrate with

the next simulations, for the system with Gsensor(s) = b/s and without the washout

filter (i.e., with h = 0). This simple result is given without a proof, which follows

from the fact that the (scalar) Jacobian is −kbaqθ/ω (in the τ time scale).

[Figure 2.5 plots. Panel (a), "Nonlinear Map Output": J versus time (sec). Panel (b), "Estimate of θ*": θ versus time (sec). Panel (c): η versus time (sec).]

Figure 2.5: Simulation results for extremum seeking with Gsensor(s) = b/s with washout filter. (a) Output of the nonlinear map. (b) The sensor position relative to θ∗. (c) The signal after the high pass filter.

Theorem 2.4 Consider the system in Figure 2.2 without the washout filter, with ε set to zero in the sensor dynamics, and the nonlinear map of form (2.4). There exists ω∗ such that for all ω > ω∗ the system has a unique exponentially stable periodic solution $\tilde\theta^{2\pi/\omega}(t)$ of period 2π/ω which satisfies

\[ \left\| \tilde\theta^{2\pi/\omega}(t) \right\| \le O(1/\omega), \quad \forall\, t \ge 0 . \tag{2.34} \]

Simulation results for the system in Theorem 2.4 are shown in Figure 2.6 for f∗ = 1, qθ = 0.5, b = 1, ω = 30, a = 0.2, k = 10, and h = 0. As expected, θ and f converge to a periodic motion around θ∗ and f∗, respectively. The drifting sensor without the washout filter has significant oscillations after settling compared to the previous case with the washout filter. The significance of the result in Theorem 2.4, shown in Figure 2.6, is that the modified extremum seeking scheme does not merely act on the signal trend/derivative instead of the signal value, which would have been the case if the inclusion of a washout filter had turned out to be crucial. Rather than ‘canceling’ the sensor’s integrator, our scheme leverages it, by using its presence for the function of tuning θ(t) in the ES loop.

2.6 Navigation of a 2D Point Mass With a Slow Sensor

In this section we study the case of a slow sensor (ε > 0 but small) on a vehicle

modeled as a 2D point mass

\[ \dot x(t) = u_x(t) \tag{2.35} \]

\[ \dot y(t) = u_y(t) , \tag{2.36} \]

where ux(t), uy(t) are two independent velocity inputs to the vehicle. For simplicity

of our presentation we assume that the nonlinear map is quadratic with the form

\[ f(x, y) = f^* - q_x(x - x^*)^2 - q_y(y - y^*)^2, \tag{2.37} \]

where (x∗, y∗) is the maximizer, f∗ is the maximum and qx, qy are some unknown positive constants.

[Figure 2.6 plots. Panel (a): J versus time (sec). Panel (b), "Estimate of θ*": θ versus time (sec).]

Figure 2.6: Simulation results for extremum seeking with Gsensor(s) = b/s and without washout filter. (a) Output of the nonlinear map. (b) The sensor position relative to θ∗.

We develop a two-input scheme, which accounts for the integrator dynamics of the vehicle's two actuation channels, in the following manner. We start from the scheme for a static map in Figure 2.2. To get an integrator to appear at the input of the nonlinear map, we first place the term $1 = \frac{s}{s}$ between the ES gain (k) and the addition of the perturbation a sin(ωt). Then, taking the term $\frac{1}{s}$ from the term $\frac{s}{s}$ and moving it downstream in the signal flow direction and past the perturbation input, which results in a differentiation of the perturbation, we get an integrator to appear at the input of the nonlinear map. Then, realizing that a differentiator s, which remains from the term $\frac{s}{s}$, cannot be implemented, we replace it with an approximate differentiator, i.e., a washout filter $\frac{s}{s+d_x}$. Finally, we take advantage of the availability of the integrator in the lowest branch of the extremum seeking loop and, with a suitable block diagram manipulation, arrive at the scheme given in the x-channel of the scheme in Figure 2.7.

[Figure 2.7 block diagram: the nonlinear map f(x, y), the sensor b/(s+ε) with output µ, the washout filter s/(s+h) with output η, and the x- and y-channels with gains kx, ky, feedback terms −dx, −dy, signals ξx, ξy, and perturbations aω cos(ωt) and −aω sin(ωt).]

Figure 2.7: Modified ES for 2D point mass vehicle with slow sensor. The scheme applies both to the case with a slow sensor (ε > 0) and to the case with a sensor modeled as a pure integrator, which we also refer to as a ‘drifting sensor’ (ε = 0), and with both h > 0 and h = 0 being permissible.

To go from a 1D scheme to a two-input 2D-navigation scheme we simply add another extremum seeking channel with the perturbation and the demodulation applied with a 90° phase shift, as was done in [60]. The vehicle control is given by

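\[ u_x(t) = a\omega\cos(\omega t) + k_x\xi_x(t) \tag{2.38} \]

\[ u_y(t) = -a\omega\sin(\omega t) + k_y\xi_y(t) . \tag{2.39} \]

We introduce the new coordinates

\[ \tilde x = x - x^* - a\sin(\omega t) \tag{2.40} \]

\[ \tilde y = y - y^* - a\cos(\omega t) . \tag{2.41} \]

With the new coordinates the map (2.37) becomes

\[ f(x, y) = f^* - q_x(\tilde x + a\sin(\omega t))^2 - q_y(\tilde y + a\cos(\omega t))^2 . \tag{2.42} \]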

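From the block diagram in Figure 2.7 we write the equations for µ, η, ξx, and ξy

\[ \mu = \frac{b}{s+\varepsilon}\left[f^* - q_x(x - x^*)^2 - q_y(y - y^*)^2\right] \tag{2.43} \]

\[ \eta = \frac{s}{s+h}[\mu] \tag{2.44} \]

\[ \xi_x = \eta\sin(\omega t) - \frac{1}{s}\left[\eta\omega\cos(\omega t) + d_x\xi_x\right] \tag{2.45} \]

\[ \xi_y = \eta\cos(\omega t) + \frac{1}{s}\left[\eta\omega\sin(\omega t) - d_y\xi_y\right] . \tag{2.46} \]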

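By replacing (x, y) with $(\tilde x, \tilde y)$, letting τ = ωt, and rearranging (2.43)–(2.46) we obtain the ODEs

\[ \frac{d\mu}{d\tau} = \frac{1}{\omega}\left[-\varepsilon\mu + b f^* - b q_x(\tilde x + a\sin(\tau))^2 - b q_y(\tilde y + a\cos(\tau))^2\right] \tag{2.47} \]

\[ \frac{d\eta}{d\tau} = \frac{1}{\omega}\left[-\varepsilon\mu - h\eta + b f^* - b q_x(\tilde x + a\sin(\tau))^2 - b q_y(\tilde y + a\cos(\tau))^2\right] \tag{2.48} \]

\[ \frac{d\xi_x}{d\tau} = -\frac{1}{\omega}\left[\left(h\eta + \varepsilon\mu - b f^* + b q_x(\tilde x + a\sin(\tau))^2 + b q_y(\tilde y + a\cos(\tau))^2\right)\sin(\tau) + d_x\xi_x\right] \tag{2.49} \]

\[ \frac{d\tilde x}{d\tau} = \frac{k_x}{\omega}\xi_x \tag{2.50} \]

\[ \frac{d\xi_y}{d\tau} = -\frac{1}{\omega}\left[\left(h\eta + \varepsilon\mu - b f^* + b q_x(\tilde x + a\sin(\tau))^2 + b q_y(\tilde y + a\cos(\tau))^2\right)\cos(\tau) + d_y\xi_y\right] \tag{2.51} \]

\[ \frac{d\tilde y}{d\tau} = \frac{k_y}{\omega}\xi_y . \tag{2.52} \]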

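Using the identities (2.12) and (2.13), together with their counterparts for cos(τ), to average (2.47)–(2.52), we get

\[ \frac{d\mu_{\text{avg}}}{d\tau} = \frac{1}{\omega}\left[-\varepsilon\mu_{\text{avg}} + b f^* - b q_x\left(\tilde x_{\text{avg}}^2 + \frac{a^2}{2}\right) - b q_y\left(\tilde y_{\text{avg}}^2 + \frac{a^2}{2}\right)\right] \tag{2.53} \]

\[ \frac{d\eta_{\text{avg}}}{d\tau} = \frac{1}{\omega}\left[-h\eta_{\text{avg}} - \varepsilon\mu_{\text{avg}} + b f^* - b q_x\left(\tilde x_{\text{avg}}^2 + \frac{a^2}{2}\right) - b q_y\left(\tilde y_{\text{avg}}^2 + \frac{a^2}{2}\right)\right] \tag{2.54} \]

\[ \frac{d\xi_{x\,\text{avg}}}{d\tau} = -\frac{1}{\omega}\left(b a q_x\tilde x_{\text{avg}} + d_x\xi_{x\,\text{avg}}\right) \tag{2.55} \]

\[ \frac{d\tilde x_{\text{avg}}}{d\tau} = \frac{k_x}{\omega}\xi_{x\,\text{avg}} \tag{2.56} \]

\[ \frac{d\xi_{y\,\text{avg}}}{d\tau} = -\frac{1}{\omega}\left(b a q_y\tilde y_{\text{avg}} + d_y\xi_{y\,\text{avg}}\right) \tag{2.57} \]

\[ \frac{d\tilde y_{\text{avg}}}{d\tau} = \frac{k_y}{\omega}\xi_{y\,\text{avg}} . \tag{2.58} \]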

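The equilibrium of the averaged system, obtained by setting the right-hand sides of (2.53)–(2.58) to zero, is

\[ \mu_{\text{avg}}^e = \frac{b}{\varepsilon}\left(f^* - \frac{(q_x + q_y)a^2}{2}\right) \tag{2.59} \]

\[ \eta_{\text{avg}}^e = 0 \tag{2.60} \]

\[ \xi_{x\,\text{avg}}^e = 0 \tag{2.61} \]

\[ \tilde x_{\text{avg}}^e = 0 \tag{2.62} \]

\[ \xi_{y\,\text{avg}}^e = 0 \tag{2.63} \]

\[ \tilde y_{\text{avg}}^e = 0 , \tag{2.64} \]

with the Jacobian of (2.53)–(2.58) at $(\mu_{\text{avg}}^e, \eta_{\text{avg}}^e, \xi_{x\,\text{avg}}^e, \tilde x_{\text{avg}}^e, \xi_{y\,\text{avg}}^e, \tilde y_{\text{avg}}^e)$ given by

\[ J_{\text{avg}} = \frac{1}{\omega}\begin{bmatrix} -\varepsilon & 0 & 0 & 0 & 0 & 0 \\ -\varepsilon & -h & 0 & 0 & 0 & 0 \\ 0 & 0 & -d_x & -b a q_x & 0 & 0 \\ 0 & 0 & k_x & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -d_y & -b a q_y \\ 0 & 0 & 0 & 0 & k_y & 0 \end{bmatrix} . \tag{2.65} \]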
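One way to check the eigenvalues of (2.65) is to use its block-triangular structure: they are −ε/ω, −h/ω, and 1/ω times the roots of the characteristic polynomials of the two 2×2 channel blocks. As a worked step (not part of the original derivation), the x-channel block gives the polynomial below; the y-channel block is identical with dy, ky, qy taken from (2.57)–(2.58).

```latex
\det\!\left(\lambda I - \begin{bmatrix} -d_x & -b a q_x \\ k_x & 0 \end{bmatrix}\right)
  = \det\begin{bmatrix} \lambda + d_x & b a q_x \\ -k_x & \lambda \end{bmatrix}
  = \lambda^2 + d_x \lambda + k_x b a q_x .
```

For a second-order polynomial, both roots have negative real parts exactly when both coefficients are positive, i.e., when $d_x > 0$ and $k_x b a q_x > 0$.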

Given that the nonlinear map has a maximum (qx, qy > 0) and that the sensor is stable (ε > 0) and non-inverting (b > 0), it follows that, if we choose a, ω, kx, ky, dx, dy, h > 0, the Jacobian (2.65) is Hurwitz and the equilibrium (2.59)–(2.64) of the averaged system (2.53)–(2.58) is locally exponentially stable. From the averaging theorem [36] we get the following result.


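Theorem 2.5 There exists ω∗ such that for all finite ω > ω∗ the system in Figure 2.7 with nonlinear map (2.37) has a unique exponentially stable periodic solution $(\mu^{2\pi/\omega}(t), \eta^{2\pi/\omega}(t), \xi_x^{2\pi/\omega}(t), \tilde x^{2\pi/\omega}(t), \xi_y^{2\pi/\omega}(t), \tilde y^{2\pi/\omega}(t))$ of period 2π/ω which satisfies

\[ \left\| \begin{bmatrix} \mu^{2\pi/\omega}(t) - \dfrac{b}{\varepsilon}\left(f^* - \dfrac{(q_x + q_y)a^2}{2}\right) \\ \eta^{2\pi/\omega}(t) \\ \xi_x^{2\pi/\omega}(t) \\ \tilde x^{2\pi/\omega}(t) \\ \xi_y^{2\pi/\omega}(t) \\ \tilde y^{2\pi/\omega}(t) \end{bmatrix} \right\| \le O(1/\omega), \quad \forall\, t \ge 0 . \tag{2.66} \]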

Since $x - x^* = \tilde x + a\sin(\omega t) = (\tilde x - \tilde x^{2\pi/\omega}) + \tilde x^{2\pi/\omega} + a\sin(\omega t)$, the theorem implies that the first term converges to zero, the second term is O(1/ω), and the third term is O(a). Thus $\limsup_{t\to\infty}|x(t) - x^*| = O(1/\omega) + O(a)$. Similarly in y we obtain $\limsup_{t\to\infty}|y(t) - y^*| = O(1/\omega) + O(a)$. Hence, we get

\[ \limsup_{t\to\infty} |f(x(t), y(t)) - f^*| = O(a^2 + 1/\omega^2) , \tag{2.67} \]

which characterizes the asymptotic performance of the extremum seeking loop in

Figure 2.7.

Figure 2.8 shows simulations of a point mass vehicle starting at position (1,1) us-

ing a sensor with slow dynamics and actuator-sensor-compensated extremum seeking

on a nonlinear map modeled in the form

\[ f(x, y) = \frac{\delta^*}{1 + p_x(x - x^*)^2 + p_y(y - y^*)^2}, \tag{2.68} \]

where δ∗ = 250, px = 1, py = 0.5 and (x∗, y∗) = (0, 0). The extremum seeking

parameters are ω = 20, a = 0.5, kx = 1, ky = 1, dx = 0.2, dy = 0.2 and h = 1.

We assume the sensor model (2.1) with the parameters ε = 0.046 and b = 0.037.

It is interesting to note that the time it takes the vehicle to settle to the location

of the maximum concentration is one fourth the time that it takes the sensor reading

to settle. The increase in convergence time of the position of the sensor from the

previous 1D case to the 2D case is mainly due to the addition of the actuator

dynamics for the vehicle.

[Figure 2.8 plots. Panel (b): J versus time (sec). Panel (c), "Output of Sensor Reading": µ versus time (sec). Panel (d), "Control Input of X-axis": $\dot x$ versus time. Panel (e): η versus time (sec). Panel (f), "Control Input of Y-axis": $\dot y$ versus time.]

Figure 2.8: Simulation results for extremum seeking on a 2D point mass with a slow sensor. (a) Vehicle trajectory with the intensity of the nonlinear map in the background. (b) Output of the nonlinear map. (c) The slow sensor output. (e) The output of the washout filter. (d) and (f) The control input of x-axis and y-axis before the addition of the perturbation, respectively.

Similar to the modified one dimensional slow sensor case, the modified extremum seeking scheme for the two dimensional case with point mass actuator dynamics can also be proven to be stable with no washout filter or with a purely drifting sensor.

This chapter is in full a reprint of the material as it appears in: N. Ghods, and M.

Krstic, “Source seeking with very slow or drifting sensors,” provisionally accepted

for Journal of Dynamic Systems, Measurement, and Control.

The dissertation author was the primary investigator and author of this paper.

3 Source Seeking for Nonholonomic Unicycle with Speed Regulation

The simplest strategy for extremum seeking-based source localization, for sources

with unknown spatial distributions and nonholonomic unicycle vehicles without po-

sition measurement, employs a constant positive forward speed. Steering of the

vehicle in the plane is performed using only the variation of the angular velocity.

While keeping the forward speed constant is a reasonable strategy motivated by im-

plementation with aerial vehicles, it leads to complexities in the asymptotic behavior

of the vehicle, since the vehicle cannot settle—at best it can converge to a small-size

attractor around the source. In this paper we regulate the forward velocity, with

the intent of bringing the vehicle to a stop, or as close to a stop as possible. The

vehicle speed is controlled using simple derivative-like feedback of the sensor mea-

surement (the derivative is approximated with a washout filter) to which a speed

bias parameter Vc is added. The angular velocity is tuned using standard extremum

seeking. We prove two results. For Vc in a certain range around zero, we show that

the vehicle converges to a ring around the source and on average the limit of the

vehicle’s heading is either directly away or towards the source. For other values of

Vc > 0, the vehicle converges to a ring around the source and it revolves around the

source. Interestingly, the average heading of this revolution around the source is

more outward than inward. The theoretical results are illustrated with simulations.


3.1 Introduction

In [11, 59], we considered the problem of seeking the source of a scalar signal

using a nonholonomic vehicle with no position information. We designed two distinct

strategies—keeping the angular velocity constant and tuning the forward speed by

extremum seeking [59]; and keeping the forward speed constant and tuning the

angular velocity by extremum seeking [11]. The strategy in [59] generates vehicle

motions that resemble triangles, rhombi, or stars (with arc-shaped sides), which drift

towards the source, resulting in periodic motions around the source. The strategy

in [11] generates motions that sinusoidally converge towards the source and settle

into an almost periodic (in a mathematical sense of the term) motion in a ring around

the source. While the proof of the result [11] is more challenging, the vehicle motion

is much more efficient than with the strategy in [59], since the simple tuning of

the heading results in trajectories where the distance of the vehicle from the source

decreases monotonically.

Neither of the strategies in [11, 60] is ideal, since [60] sacrifices the transients,

whereas [11] complexifies the asymptotic performance. In this paper we aim for the

best of both worlds, but not by simply combining the strategies in [11] and [60].

We propose something more elegant, a strategy that partly simplifies the approach

in [11], while adding a simple derivative-like feedback to a nominal forward speed

Vc. This feedback allows the vehicle to slow down as it gets closer to the source and

converge closer to the source without giving up convergence speed.

We prove two results, for quadratic signal fields that decay with the distance from

the source. For Vc in a certain range around zero, we show that the vehicle converges

to a ring around the source and on average the limit of the vehicle’s heading is either

directly away or towards the source. For other values of Vc > 0, the vehicle converges

to a ring around the source and it revolves around the source. Interestingly, the

average heading of this revolution around the source is more outward than inward—

this is possible because the vehicle’s speed is not constant, it is lower during the

outward steering intervals and higher during the inward steering intervals. The

theoretical results are illustrated with simulations. A simulation is also done to

consider the case of a Rosenbrock function as the signal field.

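[Figure 3.1 diagram: the x-y plane with the vehicle center rc, the sensor position rs, the offset R, the forward velocity v, and the heading angle.]

Figure 3.1: The notation used in the model of vehicle sensor and center dynamics.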

In Section 3.2 a description of the vehicle model and extremum seeking scheme

is given. We derive the averaged system in Section 3.3. We prove local exponential

convergence results to ring/annulus-shaped sets around the source in Sections 3.4

and 3.5. Section 3.4 deals with the case of small |Vc|, whereas Section 3.5 deals with

medium and large positive values of Vc. Simulation results in Sections 3.4 and 3.5

illustrate the distinct behaviors exhibited using different values of Vc. In Section 3.6

we summarize the set of possible motions and attractors near the source that are

achieved for different values of a key design parameter.

3.2 Vehicle Model and Control Design

We consider a mobile agent modeled as a unicycle with a sensor mounted a

distance R away from the center. The diagram in Figure 3.1 depicts the position,

heading, angular and forward velocities for the vehicle center and sensor. The

equations of motion for the vehicle’s center are

\[ \dot r_c = v e^{j\theta} \tag{3.1} \]

\[ \dot\theta = \Omega \tag{3.2} \]

where $r_c$ is a complex variable that represents the center of the vehicle in 2D, θ is the orientation and v and Ω are the forward and angular velocity inputs, respectively. The sensor is located at $r_s = r_c + R e^{j\theta}$. Note that this convenient complex representation of the position would be less useful if extending this work to a 3D setting.

The task of the vehicle is to seek a source that emits a signal (for example, the

concentration of a chemical, biological agent, electromagnetic, acoustic, or even ther-

mal signal) which decays as a function of distance away from the source. We assume

this signal field is distributed according to an unknown nonlinear map f (r(x, y))

which has an isolated local maximum f ∗ = f(r∗) where r∗ is the location of the lo-

cal maximum. We design a controller that achieves local convergence to r∗ without

knowledge of the shape of f , using only the measurement f(rs). We could design

a control law to force the vehicle’s trajectory to evolve according to the gradient of

the dynamical system $\dot r_c = -\nabla f$, if we knew both the shape of the map f and the

position of the vehicle rc, and further if the vehicle were fully actuated. In that case

the trajectory of rc would asymptotically converge to the set of stationary points of

f where ∇f(r∗) = 0. In the absence of the knowledge of function f(x, y) and of the

vehicle’s position, we have to employ techniques of non model-based optimization.

In addition, in the absence of direct actuation of the vehicle’s position, namely,

for a nonholonomic vehicle that cannot be directly steered sideways and all of its

motion has to be produced using forward and angular velocity inputs, the task of

source-seeking becomes even more challenging.

We employ extremum-seeking to tune the angular velocity (Ω) directly and the

forward velocity (v) indirectly. This scheme is depicted by the block diagram in

Figure 3.2. The control laws are given by

Ω = aω cos(ωt) + cξ sin(ωt) (3.3)

v = Vc + bξ , (3.4)

where ξ is the output of the washout filter, namely, of the approximate differentiator

of f(rs, t). The performance can be influenced by the parameters a, c, b, R, h, ω

and Vc. We tune angular velocity Ω with the basic extremum-seeking tuning law,

which has a perturbation term, aω cos(ωt), to excite the system. The ξ sin(ωt) term

estimates the angular gradient of the map.
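A discrete-time sketch of the control laws (3.3) and (3.4) may help fix ideas. It assumes a forward-Euler realization of the washout filter s/(s+h) as the measurement minus a low-pass filter state; the function names, the step size, and the parameter values in the usage lines are hypothetical choices made here, not values from the text.

```python
import math

def washout_step(f_meas, lp_state, h, dt):
    """One Euler step of the washout filter s/(s+h), written as f minus its low-pass part.

    Returns the filter output xi and the updated low-pass state.
    """
    xi = f_meas - lp_state
    lp_state_next = lp_state + dt * h * xi
    return xi, lp_state_next

def control_inputs(xi, t, a, omega, c, b, v_c):
    """Forward speed v and angular velocity Omega from (3.4) and (3.3)."""
    Omega = a * omega * math.cos(omega * t) + c * xi * math.sin(omega * t)
    v = v_c + b * xi
    return v, Omega

# Usage: a constant measurement is "washed out", so xi decays and v returns to V_c.
xi, lp = 0.0, 0.0
for _ in range(2000):
    xi, lp = washout_step(2.0, lp, h=1.0, dt=0.01)
v, Omega = control_inputs(xi, t=0.0, a=0.1, omega=10.0, c=5.0, b=2.0, v_c=0.3)
print(v, Omega)
```

The usage lines illustrate the intuition behind (3.4): when the measurement stops changing, ξ tends to zero and the forward speed tends to the bias Vc, while the perturbation term in (3.3) keeps exciting the heading.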

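[Figure 3.2 block diagram: the nonlinear map f(r), the washout filter s/(s+h) with output ξ, the demodulation by sin(ωt) with gain c, the perturbation aω cos(ωt), the gain b with speed bias Vc, and the unicycle dynamics.]

Figure 3.2: Block diagram of source seeking via tuning of angular velocity and forward velocity using one reading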

The forward velocity v = Vc + bξ is chosen using the following intuition. When

the vehicle is approaching the source, heading straight towards it, the sensor reading

is increasing and hence ξ > 0. It is reasonable to increase the speed of the vehicle

when it is going towards the source. Conversely, when the vehicle is past the source

and the signal reading is decreasing, i.e., ξ < 0, the vehicle should be slowed down,

which (3.4) achieves.

We stress that the steering feedback (3.3) does not employ the nonlinear damp-

ing introduced in [11]. The damping needed to exponentially stabilize the average

equilibria is provided by the forward speed feedback (3.4).

3.3 The Average System

We focus on maps which depend on the distance from the source only. Since

our goal is only the establishment of local convergence, we assume that the map is

quadratic, and given by

\[ f(r_s) = f^* - q_r|r_s - r^*|^2 \tag{3.5} \]

where r∗ is the unknown maximizer, f∗ = f(r∗) is the unknown maximum and qr is an unknown positive constant.

We define an output error variable

\[ e = \frac{h}{s+h}[f] - f^* , \tag{3.6} \]

where $\frac{h}{s+h}[f]$ is a low-pass filter applied to the sensor reading f, which allows us to express ξ, the output of the washout filter, as $\xi = \frac{s}{s+h}[f] = f(r_s) - \frac{h}{s+h}[f] = f(r_s) - f^* - e$, noting also that $\dot e = h\xi$.

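Consider the system

\[ \dot r_c = (V_c + b\xi)e^{j\theta} \tag{3.7} \]

\[ \dot\theta = a\omega\cos(\omega t) + c\xi\sin(\omega t) \tag{3.8} \]

\[ \dot e = h\xi \tag{3.9} \]

\[ \xi = -(q_r|r_s - r^*|^2 + e) \tag{3.10} \]

\[ r_s = r_c + R e^{j\theta} \tag{3.11} \]

shown in Figure 3.2. To analyze this system we start by defining the shifted variables

\[ \hat r_c = r_c - r^* \tag{3.12} \]

\[ \hat\theta = \theta - a\sin(\omega t) \tag{3.13} \]

\[ \hat e = e + q_r R^2 . \tag{3.14} \]

We also introduce the time scale change

\[ \tau = \omega t, \tag{3.15} \]

and introduce a map from the position $r_c$ to a scalar quantity θ∗, given by

\[ -\hat r_c = |\hat r_c| e^{j\theta^*} \tag{3.16} \]

\[ \theta^* = -\frac{j}{2}\ln\left(-\frac{\hat r_c}{\bar{\hat r}_c}\right) = \arg(r^* - r_c) , \tag{3.17} \]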

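where θ∗ represents the heading angle towards the source located at r∗ when the vehicle is at $r_c$, and $\bar{\hat r}_c$ is the complex conjugate of $\hat r_c$. Using these definitions, the expression for ξ is

\[ \begin{aligned} \xi &= -\left(q_r|r_c + R e^{j\theta} - r^*|^2 + \hat e - q_r R^2\right) \\ &= -\left(q_r\left(|\hat r_c|^2 - 2R|\hat r_c|\cos(\hat\theta - \theta^* + a\sin(\tau))\right) + \hat e\right) . \end{aligned} \tag{3.18} \]

[Figure 3.3 diagram: the x-y plane with the source r∗, the vehicle center rc, the heading θ∗ towards the source, and the error variables.]

Figure 3.3: Diagram of the error variables relating the vehicle and the source.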

The dynamics of the shifted system are

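$$\frac{d\hat r_c}{d\tau} = \frac{1}{\omega}(V_c + b\xi)e^{j(\hat\theta + a\sin(\tau))} \tag{3.19}$$

$$\frac{d\hat\theta}{d\tau} = \frac{1}{\omega}c\xi\sin(\tau) \tag{3.20}$$

$$\frac{d\hat e}{d\tau} = \frac{1}{\omega}h\xi. \tag{3.21}$$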

We next define error variables $\tilde r_c$ and $\tilde\theta$ (depicted in Figure 3.3), which represent the distance to the source, and the difference between the vehicle's heading and the optimal heading, respectively,

$$\tilde r_c = |\hat r_c| \tag{3.22}$$

$$\tilde\theta = \hat\theta - \theta^*. \tag{3.23}$$

The resulting dynamics for the error variables are

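$$\frac{d\tilde r_c}{d\tau} = \frac{d\sqrt{\hat r_c\bar{\hat r}_c}}{d\tau} = \frac{1}{2|\hat r_c|}\left(\frac{d\hat r_c}{d\tau}\bar{\hat r}_c + \hat r_c\frac{d\bar{\hat r}_c}{d\tau}\right) = -\frac{V_c + b\xi}{\omega}\cos\left(\tilde\theta + a\sin(\tau)\right) \tag{3.24}$$

$$\frac{d\tilde\theta}{d\tau} = \frac{d\hat\theta}{d\tau} - \frac{d\theta^*}{d\tau} = \frac{d\hat\theta}{d\tau} + \frac{j}{2|\hat r_c|^2}\left(\frac{d\hat r_c}{d\tau}\bar{\hat r}_c - \hat r_c\frac{d\bar{\hat r}_c}{d\tau}\right) = \frac{1}{\omega}\left[c\xi\sin(\tau) + \frac{V_c + b\xi}{\tilde r_c}\sin\left(\tilde\theta + a\sin(\tau)\right)\right] \tag{3.25}$$

$$\frac{d\hat e}{d\tau} = \frac{1}{\omega}h\xi \tag{3.26}$$

$$\xi = -\left(q_r\tilde r_c^2 + \hat e - 2q_rR\tilde r_c\cos\left(\tilde\theta + a\sin(\tau)\right)\right). \tag{3.27}$$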

The system of equations is periodic with a period 2π, and the averaged error

system is

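$$\begin{aligned}
\frac{d\tilde r_c^{ave}}{d\tau} = \frac{1}{\omega}\Big[\, & bJ_0(a)\left(q_r(\tilde r_c^{ave})^2 + \hat e^{ave}\right)\cos(\tilde\theta^{ave}) \\
& - bq_rR\,\tilde r_c^{ave}\left(1 + J_0(2a)\cos(2\tilde\theta^{ave})\right) - V_cJ_0(a)\cos(\tilde\theta^{ave}) \Big]
\end{aligned} \tag{3.28}$$

$$\begin{aligned}
\frac{d\tilde\theta^{ave}}{d\tau} = \frac{1}{\omega}\Big[\, & -q_r\left(2cRJ_1(a) + bJ_0(a)\right)\tilde r_c^{ave}\sin(\tilde\theta^{ave}) + bq_rRJ_0(2a)\sin(2\tilde\theta^{ave}) \\
& + \frac{V_cJ_0(a) - bJ_0(a)\hat e^{ave}}{\tilde r_c^{ave}}\sin(\tilde\theta^{ave}) \Big]
\end{aligned} \tag{3.29}$$

$$\frac{d\hat e^{ave}}{d\tau} = -\frac{h}{\omega}\left[\left(q_r(\tilde r_c^{ave})^2 + \hat e^{ave}\right) - 2q_rRJ_0(a)\tilde r_c^{ave}\cos(\tilde\theta^{ave})\right], \tag{3.30}$$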

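where $J_0(a)$ and $J_1(a)$ are Bessel functions of the first kind. The averaged error system (3.28)–(3.30) has four equilibria defined by

$$\left[\tilde r_c^{ave,eq1},\ \tilde\theta^{ave,eq1},\ \hat e^{ave,eq1}\right] = \left[\frac{V_cJ_0(a)}{bq_rR\rho_1},\ \pi,\ e_{12}\right], \tag{3.31}$$

$$\left[\tilde r_c^{ave,eq2},\ \tilde\theta^{ave,eq2},\ \hat e^{ave,eq2}\right] = \left[-\frac{V_cJ_0(a)}{bq_rR\rho_1},\ 0,\ e_{12}\right], \tag{3.32}$$

$$\left[\tilde r_c^{ave,eq3},\ \tilde\theta^{ave,eq3},\ \hat e^{ave,eq3}\right] = \left[\rho_0,\ \pi + \mu_0,\ e_{34}\right], \tag{3.33}$$

$$\left[\tilde r_c^{ave,eq4},\ \tilde\theta^{ave,eq4},\ \hat e^{ave,eq4}\right] = \left[\rho_0,\ \pi - \mu_0,\ e_{34}\right], \tag{3.34}$$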

where

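$$\rho_0 = \frac{\sqrt{\gamma_1}}{\sqrt{2}\,cJ_1(a)} \tag{3.35}$$

$$\mu_0 = \arctan\left(\frac{\sqrt{\gamma_2}}{b\sqrt{q_rR}\,(1 - J_0(2a))}\right) \tag{3.36}$$

$$e_{12} = -\frac{2V_cJ_0^2(a)}{b\rho_1} - \frac{\left(V_cJ_0^2(a)\right)^2}{q_rb^2R^2\rho_1^2} \tag{3.37}$$

$$e_{34} = -\frac{\gamma_1}{2c^2RJ_1^2(a)} \tag{3.38}$$

$$\qquad\quad + \frac{bq_rRhJ_0(a)\sqrt{2\gamma_1}\,(1 - J_0(2a))}{cJ_1(a)\sqrt{\gamma_2 + b^2q_rR(1 - J_0(2a))^2}}. \tag{3.39}$$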

and

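$$\begin{aligned}
\gamma_1 &= cJ_1(a)J_0(a)V_c + b^2q_rR\rho_2 \\
\gamma_2 &= 2cJ_1(a)J_0(a)V_c - b^2q_rR\rho_3 \\
\rho_1 &= 1 + J_0(2a) - 2J_0^2(a) \ge 0 \\
\rho_2 &= J_0^2(a) - J_0(2a) - J_0(2a)J_0^2(a) + J_0^2(2a) \\
\rho_3 &= -2J_0^2(a) + 2J_0(2a)J_0^2(a) - J_0^2(2a) + 1 \ge 0.
\end{aligned}$$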

Note that, due to properties of Bessel functions, $1 - J_0(2a)$ is positive for all positive $a$. In addition, $\rho_1(a)$ and $\rho_3(a) = (1 - J_0(2a))\rho_1(a)$ are positive for all positive and sufficiently small values of $a$. In fact, both $\rho_1(a)$ and $\rho_3(a)$ appear to be positive for all positive values of $a$ (rather than only for small $a > 0$), but this may be difficult to prove.

Due to the transformation (3.22), the four equilibria (3.31)–(3.34) can only be related back to the original system if $\tilde r_c^{ave}$ is real and positive. It should be noted that $\tilde r_c^{ave,eq1}$ and $\tilde r_c^{ave,eq2}$ cannot simultaneously be positive (note that $V_c$ can be either positive or negative), and also that $\tilde r_c^{ave,eq3}$ and $\tilde r_c^{ave,eq4}$ are real only when $\gamma_1 > 0$. In the next two sections we will show stability of the four average equilibria (not all of them simultaneously) for different values of the speed bias parameter $V_c$, and infer the appropriate convergence properties for the non-average system (3.24)–(3.27).

Each of the four average equilibria (3.31)–(3.34) represents a ring around the

source. However, more interesting information is obtained when considering the

average values of θ. With equilibrium 1 the vehicle points away from the source,

with equilibrium 2 it points directly towards the source, and with equilibria 3 and

4 the vehicle points, on the average, outwards relative to the ring, revolving around

the source in the counterclockwise direction for equilibrium 3 and in the clockwise

direction for equilibrium 4.

3.4 Stability for Small Positive or Negative Vc

In this section we analyze the stability properties of the system shown in Figure 3.2 when the parameter Vc is small but not zero.

Theorem 3.1 Consider the system in Figure 3.2 with nonlinear map (3.5) that has a maximum ($q_r > 0$). Let the parameters $c, b, R, h$ be chosen as positive. Let the parameter $a$ be chosen so that $J_0(a), J_0(2a), J_1(a), 1 + J_0(2a) - 2J_0^2(a) > 0$. Let the parameter $V_c$ be nonzero and such that either

$$V_c \in (V_c^{lower}, 0), \quad \text{where } V_c^{lower} \triangleq -\frac{bq_rR(1 + J_0(2a)) + h}{2J_0^2(a)}R\rho_1, \tag{3.40}$$

or

$$V_c \in (0, V_c^{upper}), \quad \text{where } V_c^{upper} \triangleq \frac{b^2q_rR\rho_3}{2cJ_1(a)J_0(a)}. \tag{3.41}$$

There exist constants $\omega^* > 0$ and $\delta > 0$ such that, for all $\omega > \omega^*$, if the initial conditions $r_c(0), \theta(0), e(0)$ are such that the following quantities are sufficiently small,

$$\left|\,|r_c(0) - r^*| - \frac{|V_c|J_0(a)}{bq_rR\rho_1}\right| < \delta R \tag{3.42}$$

$$|\theta(0) - \arg(r_c(0) - r^*) - n\pi| < \delta a, \quad n \in \mathbb{N} \tag{3.43}$$

$$\left|e(0) - q_rR^2 - e_{12}\right| < \delta q_rR^2, \tag{3.44}$$

then the trajectory of the vehicle center $r_c(t)$ locally exponentially converges to, and remains in, the ring

$$\frac{|V_c|J_0(a)}{bq_rR\rho_1} - O(1/\omega) \le \|r_c - r^*\| \le \frac{|V_c|J_0(a)}{bq_rR\rho_1} + O(1/\omega). \tag{3.45}$$

Proof: The Jacobian of the average system (3.28)–(3.30) at the equilibria

(3.31) and (3.32) is (at both equilibria) given by

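$$A^{eq1} = \frac{1}{\omega}\begin{bmatrix}
-\dfrac{2V_cJ_0^2(a)}{R\rho_1} - bq_rR(1 + J_0(2a)) & 0 & -bJ_0(a) \\
0 & \eta & 0 \\
-2hJ_0(a)\left(q_rR + \dfrac{V_c}{bR\rho_1}\right) & 0 & -h
\end{bmatrix} \tag{3.46}$$

where

$$\eta = \frac{2cJ_1(a)J_0(a)}{b\rho_1}V_c - \frac{bq_rR\rho_3}{\rho_1}. \tag{3.47}$$

By applying a similarity transformation with the matrix

$$T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \tag{3.48}$$

we convert the Jacobian (3.46) into the block diagonal matrix

$$\mathrm{diag}\left\{\frac{1}{\omega}\begin{bmatrix}
-\dfrac{2V_cJ_0^2(a)}{R\rho_1} - bq_rR(1 + J_0(2a)) & -bJ_0(a) \\
-2hJ_0(a)\left(q_rR + \dfrac{V_c}{bR\rho_1}\right) & -h
\end{bmatrix},\ \frac{\eta}{\omega}\right\}. \tag{3.49}$$

The characteristic equation for this Jacobian is the combination of the characteristic equations of the two blocks, which is

$$(\omega s)^2 + \zeta(\omega s) + hbq_rR\rho_1 = 0 \tag{3.50}$$

$$\omega s - \eta = 0, \tag{3.51}$$

where

$$\zeta = \frac{2J_0^2(a)V_c}{R\rho_1} + bq_rR(1 + J_0(2a)) + h. \tag{3.52}$$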

According to the Routh-Hurwitz criterion, to guarantee that the roots of the polynomial have negative real parts, each coefficient must be greater than zero. Hence, we need η < 0 in (3.47) and ζ > 0 in (3.52). Both of these conditions are satisfied under either condition (3.40) or (3.41) of Theorem 3.1. By applying Theorem 10.4 from [36] to this result, we conclude that the error system (3.24)–(3.27) has two distinct, exponentially stable periodic solutions within O(1/ω) of the equilibria (3.31) and (3.32), which proves that the vehicle center rc converges to the annulus (3.45) around the source r∗.
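The conditions used in the proof can be evaluated numerically for a given parameter set. The following is a minimal sketch, not part of the original analysis: the helper names `bessel_j` and `small_vc_check` are hypothetical, the Bessel functions are computed from their power series (an implementation choice made here to avoid external libraries), and the parameter values are the ones quoted for the simulation of Figure 3.4.

```python
import math

def bessel_j(n, x, terms=40):
    """Bessel function of the first kind J_n(x), from its power series."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(k + n))
               * (x / 2.0) ** (2 * k + n) for k in range(terms))

def small_vc_check(Vc, a, R, c, b, h, q_r):
    """Evaluate the speed-bias bounds, eta (3.47), zeta (3.52) and the ring radius in (3.45)."""
    J0a, J02a, J1a = bessel_j(0, a), bessel_j(0, 2 * a), bessel_j(1, a)
    rho1 = 1 + J02a - 2 * J0a ** 2
    rho3 = (1 - J02a) * rho1
    return {
        # Negative bound: zeta > 0 exactly when Vc > V_lower.
        "V_lower": -(b * q_r * R * (1 + J02a) + h) * R * rho1 / (2 * J0a ** 2),
        # Positive bound: eta < 0 exactly when Vc < V_upper.
        "V_upper": b ** 2 * q_r * R * rho3 / (2 * c * J1a * J0a),
        "eta": 2 * c * J1a * J0a * Vc / (b * rho1) - b * q_r * R * rho3 / rho1,
        "zeta": 2 * J0a ** 2 * Vc / (R * rho1) + b * q_r * R * (1 + J02a) + h,
        "radius": abs(Vc) * J0a / (b * q_r * R * rho1),
    }

result = small_vc_check(Vc=0.005, a=1.8, R=0.1, c=80.0, b=4.0, h=2.0, q_r=1.0)
print(result)
```

For these values the sketch reports a negative `eta` and a positive `zeta`, so both characteristic equations (3.50) and (3.51) have roots in the open left half-plane.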

Simulation: Figure 3.4 shows the simulation with the map parameters r∗ = (0, 0), qr = 1 and vehicle initial conditions r0 = (1, 1) and θ0 = −π/2. The ES parameters are chosen as ω = 20, a = 1.8, R = 0.1, c = 80, b = 4, h = 2, and Vc = 0.005, which satisfies (3.41). Figures 3.4 (a), (b), and (c) show that the error variables converge very near the theoretical equilibrium values. Figure 3.4 (d) shows the trajectory of the vehicle in the signal field. It appears from Figure 3.4 (d) as if the vehicle comes to a full stop. This is actually not the case, as we note from the zoom frame in Figure 3.4 (c), and as we further explain in Remark 3.1.
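A simulation of this kind can be sketched in a few lines. The code below integrates the closed-loop system (3.7)–(3.11) with the parameter values quoted above. Several details are assumptions made for illustration and are not stated in the text: forward-Euler integration with step `dt`, a peak value f∗ = 0, and a filter state initialized so that ξ(0) = 0. The function name `simulate_unicycle` is hypothetical.

```python
import cmath
import math

def simulate_unicycle(T=30.0, dt=1e-3, omega=20.0, a=1.8, R=0.1, c=80.0,
                      b=4.0, h=2.0, Vc=0.005, q_r=1.0,
                      r_star=0 + 0j, r0=1 + 1j, theta0=-math.pi / 2):
    """Forward-Euler integration of (3.7)-(3.11); returns the final distance |r_c - r*|."""
    f_star = 0.0  # assumed peak value of the quadratic map (3.5)

    def f(rs):
        return f_star - q_r * abs(rs - r_star) ** 2

    rc, theta, t = r0, theta0, 0.0
    # Assumed initialization: choose e(0) so that xi(0) = f - f* - e = 0.
    e = f(rc + R * cmath.exp(1j * theta)) - f_star
    for _ in range(int(round(T / dt))):
        rs = rc + R * cmath.exp(1j * theta)        # sensor position (3.11)
        xi = f(rs) - f_star - e                    # washout output (3.10)
        rc += dt * (Vc + b * xi) * cmath.exp(1j * theta)                          # (3.7)
        theta += dt * (a * omega * math.cos(omega * t)
                       + c * xi * math.sin(omega * t))                            # (3.8)
        e += dt * h * xi                                                          # (3.9)
        t += dt
    return abs(rc - r_star)

final_distance = simulate_unicycle()
print("distance to source after 30 time units:", final_distance)
```

The vehicle starts at distance √2 from the source. The sketch only reports the final distance, which can be compared with the ring radius predicted by (3.45).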

Figure 3.5 shows the main difference between the small positive and negative

Vc with the map parameters, initial conditions, and ES parameters chosen to be

the same as the simulation in Figure 3.4 for both vehicles except for the parameter

Vc, which was set to +0.02 for one and −0.02 for the other. While with Vc > 0

the vehicle heading converges to a value pointing directly away from the source,

as predicted by the average equilibrium for the heading in (3.31), with Vc < 0

the vehicle heading converges to a value pointing directly towards the source, as

predicted by the average equilibrium for the heading in (3.32).

The abilities of this extremum seeking scheme on a non-quadratic function can be seen in Figure 3.6, where the vehicle converges to the maximum with the unknown map being a Rosenbrock function. The Rosenbrock function is characterized by an extremely deep valley along the parabola $x^2 = y$ that leads to the global minimum and is often used to test the abilities of an optimization scheme [48]. The Rosenbrock function used in Figure 3.6 has a maximum at (1, 1) with the following form

$$f(r_s) = -\frac{1}{2}(1 - x_s)^2 - (y_s - x_s^2)^2, \tag{3.53}$$

where $x_s = \mathrm{Re}(r_s)$ and $y_s = \mathrm{Im}(r_s)$. The vehicle is given the starting position r0 = (−0.5, −0.5) and heading θ0 = π. The ES parameters are chosen as ω = 20, a = 1.8, R = 0.1, c = 80, b = 5, h = 1, and Vc = −0.005.

Figure 3.4: Simulation results for steering-based unicycle source seeking with forward speed regulation: (a), (b), (c) showing the evolution of the variables $\tilde r_c$, $\tilde\theta$, and $V_c + b\xi$, respectively, and (d) showing the trajectory of the vehicle. Panel titles: (a) Absolute Distance from the Source, (b) Relative Angle between Vehicle and Source, (c) Forward speed, (d) Vehicle Trajectory.
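As a quick sanity check of the map (3.53), the sketch below evaluates it at the claimed maximizer and at a few nearby points. The function name `rosenbrock_map` and the choice of test offsets are illustrative assumptions.

```python
def rosenbrock_map(rs: complex) -> float:
    """Negated Rosenbrock-type map (3.53); the sensor position is the complex number x_s + j*y_s."""
    x_s, y_s = rs.real, rs.imag
    return -0.5 * (1 - x_s) ** 2 - (y_s - x_s ** 2) ** 2

peak = rosenbrock_map(1 + 1j)
# Small displacements in several directions around (1, 1).
offsets = [0.1, -0.1, 0.1j, -0.1j, 0.1 + 0.1j, -0.1 - 0.1j]
neighbors = [rosenbrock_map(1 + 1j + d) for d in offsets]
print(peak, max(neighbors))
```

Both squared terms vanish only at $x_s = 1$, $y_s = x_s^2 = 1$, so every other point gives a strictly negative value.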

Remark 3.1: The vehicle does not come to a full stop, as evident from Figure

3.4 (c), even though it slows down nearly to a stop due to a very small Vc = 0.005.

However, unlike in [11], the vehicle, after entering the annulus, does not revolve

around the source. It points, on the average, towards or away from the source, de-

pending on the sign of Vc. The vehicle’s angular velocity and forward speed oscillate

but the vehicle does not drift clockwise or counter-clockwise in the annulus. While

this fact is evident from the simulations, unfortunately it cannot be proved. This

is because only the relative heading with respect to the source has an exponentially

stable equilibrium. The absolute heading, after averaging the θ-system (3.25), has a

continuum of equilibria, but none of them are exponentially stable, which precludes

the possibility of proving, using the averaging method, that no drift occurs.

Similar to [11], the vehicle converges to an annulus around the source with a

radius proportional to Vc. From (3.49) we see that when h is large the decay rate

in the radial state rc of the vehicle is a function of two terms, one with Vc and the

other with b, unlike [11], where the convergence rate depends only on Vc, and where a

trade-off between the annulus size and convergence speed exists (faster convergence

implies a larger annulus, because the vehicle has constant speed). In the present

design we can choose Vc ≪ b and achieve fast convergence to a small annulus around

the source. With the choice of small Vc the vehicle comes almost to a stop, as shown

in Figure 3.4.

The linearization step fails when $V_c = 0$, due to the singularity at $\tilde r_c = 0$ in (3.25). For this reason, nothing can be said about the system behavior even though $V_c = 0$ verifies the Routh-Hurwitz criterion. The singularity at $\tilde r_c = 0$ also manifests itself in the average equilibria (3.31) and (3.32), where $\tilde r_c = 0$ at both equilibria, but the heading has a non-unique value ($\tilde\theta = \pi$ or $\tilde\theta = 0$).


3.5 Stability for Medium and Large Positive Vc

For medium or large values of Vc the vehicle converges to the average equilibria

3 and 4, namely to an annulus within which the vehicle revolves around the source,

similar to the vehicle trajectories produced by the algorithm in [11]. However, as

we shall see, an interesting difference relative to [11] arises thanks to the fact that

forward speed is not constant, which allows the vehicle to revolve around the source

with non-tangential average heading.

Theorem 3.2 Consider the system in Figure 3.2 with nonlinear map (3.5) that has a maximum ($q_r > 0$). Let the parameters $c, b, R, h$ be chosen as positive. Let the parameter $a$ be chosen so that $J_0(a), J_0(2a), J_1(a), 1 + J_0(2a) - 2J_0^2(a) > 0$. Let

$$V_c > V_c^{upper}, \tag{3.54}$$

where $V_c^{upper}$ is defined in (3.41). There exist constants $\omega^* > 0$ and $\delta > 0$ such that, for all $\omega > \omega^*$, if the initial conditions $r_c(0), \theta(0), e(0)$ are such that

$$\left|\,|r_c(0) - r^*| - \frac{\sqrt{\gamma_1}}{\sqrt{2}\,cJ_1(a)}\right| < \delta R \tag{3.55}$$

$$|\theta(0) - \arg(r_c(0) - r^*) - (2n + 1)\pi \pm \mu_0| < \delta a, \quad n \in \mathbb{N} \tag{3.56}$$

$$\left|e(0) - q_rR^2 - e_{34}\right| < \delta q_rR^2, \tag{3.57}$$

where $\tilde\theta^{ave,eq3\,or\,4}$ and $\hat e^{ave,eq3\,or\,4}$ are from the equilibria (3.33) and (3.34), then the trajectory of the vehicle center $r_c(t)$ locally exponentially converges to, and remains in, the annulus

$$\frac{\sqrt{\gamma_1}}{\sqrt{2}\,cJ_1(a)} - O(1/\omega) \le |r_c - r^*| \le \frac{\sqrt{\gamma_1}}{\sqrt{2}\,cJ_1(a)} + O(1/\omega). \tag{3.58}$$

Proof: We first note that condition (3.54) ensures that $\gamma_2 > 0$. We also note that the statement of the theorem relies on $\gamma_1$ being positive, since it appears under the square root. To see that $\gamma_1$ is indeed positive, we express it as

$$\gamma_1 = \frac{\gamma_2}{2} + b^2q_rR\left(\frac{\rho_3}{2} + \rho_2\right), \tag{3.59}$$

where

$$\frac{\rho_3}{2} + \rho_2 = \frac{1}{2}(1 - J_0(2a))^2 \ge 0, \quad \forall a. \tag{3.60}$$

Since $\gamma_2 > 0$, it follows that $\gamma_1 > 0$ and thus that the average equilibria (3.33) and (3.34) are well defined.

Figure 3.5: The difference in trajectories for small positive and negative Vc. The two cases yield convergence to the average equilibria (3.31) and (3.32), respectively. For Vc < 0 the vehicle points towards the source at the end of the transient, whereas for Vc > 0 the vehicle points away from the source at the end of the transient.

Figure 3.6: Simulation result of vehicle trajectory using steering-based source seeking and forward speed regulation on a Rosenbrock function (the white shading represents the maximum).
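The two algebraic identities used here, $\rho_3 = (1 - J_0(2a))\rho_1$ and (3.60), can be spot-checked numerically. The sketch below is only an illustration: the helper names are hypothetical, $J_0$ is evaluated from its power series, and the sample values of $a$ are arbitrary.

```python
import math

def bessel_j0(x, terms=40):
    """J_0(x) from its power series; adequate for the moderate arguments used here."""
    return sum((-1) ** k / math.factorial(k) ** 2 * (x / 2.0) ** (2 * k)
               for k in range(terms))

def rho_terms(a):
    """Return J_0(2a) and the quantities rho_1, rho_2, rho_3 as defined after (3.39)."""
    J0a, J02a = bessel_j0(a), bessel_j0(2 * a)
    rho1 = 1 + J02a - 2 * J0a ** 2
    rho2 = J0a ** 2 - J02a - J02a * J0a ** 2 + J02a ** 2
    rho3 = -2 * J0a ** 2 + 2 * J02a * J0a ** 2 - J02a ** 2 + 1
    return J02a, rho1, rho2, rho3

residuals = []
for a in (0.5, 1.0, 1.8, 2.5):
    J02a, rho1, rho2, rho3 = rho_terms(a)
    residuals.append(abs(rho3 - (1 - J02a) * rho1))                  # rho_3 = (1 - J_0(2a)) rho_1
    residuals.append(abs(rho3 / 2 + rho2 - 0.5 * (1 - J02a) ** 2))   # identity (3.60)
print(max(residuals))
```

Both identities are polynomial in $J_0(a)$ and $J_0(2a)$, so the residuals are at the level of floating-point round-off for any $a$.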

As done in the proof of Theorem 3.1, we can calculate the Jacobians for equilibria (3.33) and (3.34), which happen to be the same matrix at both equilibria. Due to the complicated form of the Jacobian matrix, we do not show the matrix and instead just show its characteristic polynomial:

$$\begin{aligned}
0 = {} & (\omega s)^3 + \left(Rbq_r(1 + J_0(2a)) + \frac{b^2q_rJ_0(a)}{cJ_1(a)}(1 - J_0(2a)) + h\right)(\omega s)^2 \\
& + \left(\left(2q_rR + \frac{bq_rJ_0(a)}{cJ_1(a)}\right)\gamma_2 + Rbq_rh\rho_1\right)(\omega s) + 2Rq_rh\gamma_2.
\end{aligned} \tag{3.61}$$

According to the Routh-Hurwitz criterion, to guarantee that the roots of the polynomial have negative real parts, each coefficient must be greater than zero and the product of the $s^2$ and $s^1$ coefficients must be greater than the $s^0$ coefficient. The product of the $s^2$ and $s^1$ coefficients minus the $s^0$ coefficient is

$$\begin{aligned}
& bq_r\left(\left(2q_rR + \frac{bq_rJ_0(a)}{cJ_1(a)}\right)\gamma_2 + Rbq_rh\rho_1\right)\left(R(1 + J_0(2a)) + \frac{bJ_0(a)}{cJ_1(a)}(1 - J_0(2a))\right) \\
& \quad + q_rh\left(\frac{bJ_0(a)}{cJ_1(a)}\gamma_2 + Rbh\rho_1\right).
\end{aligned} \tag{3.62}$$

With the condition (3.54), the Routh-Hurwitz criterion is satisfied and therefore

the Jacobian for the equilibria (3.33) and (3.34) is Hurwitz. By applying Theorem

10.4 from [36] to this result, we conclude that the error system (3.24)–(3.27) has

two distinct, exponentially stable periodic solutions within O(1/ω) of the equilibria

(3.33) and (3.34), which proves that the vehicle center rc converges to the annulus

(3.58) around the source r∗.


Simulation: On the approach towards the source, the vehicle trajectory with $V_c > V_c^{upper}$ is very similar to the trajectory for $V_c \in (V_c^{lower}, V_c^{upper})$. However, as the vehicle for $V_c > V_c^{upper}$ gets close to the source, it begins to encircle the source clockwise or counterclockwise, depending on the initial conditions. Figure 3.7 shows the simulation for $V_c > V_c^{upper}$, with two different initial conditions, one that converges to the average equilibrium (3.33) and the other that converges to the average equilibrium (3.34). The simulations in Figure 3.7 were done with map parameters and ES parameters chosen to be the same as in the simulation in Figure 3.4 except for $V_c = 1$, which satisfies (3.54).

Figure 3.8 shows a simulation of three vehicles with three different values for $V_c$. The simulations in Figure 3.8 were done with map parameters and ES parameters chosen to be the same as in the simulation in Figure 3.4 except for $V_c$. The three values of $V_c$ were chosen as $1.001 \times V_c^{upper}$, $10 \times V_c^{upper}$, and $100 \times V_c^{upper}$ to show that the vehicle's average heading ranges from directly away from the source for $V_c$ slightly larger than $V_c^{upper}$ to almost tangential to the ring for $V_c \gg V_c^{upper}$. Note that this behavior is explained by (3.36) and how it relates to $V_c^{upper}$.

3.6 Conclusion

We have proposed a modification of the nonholonomic source seeking algorithm

in [11], with a regulation of the vehicle forward speed which allows the vehicle to

slow down as it gets close to the source. We have proved the convergence to a

neighborhood of the source in three cases, identifying three classes of attractors:

• $V_c \in (V_c^{lower}, 0)$: the vehicle points, on the average, directly towards the source, and does not drift around the ring. This is a continuum of attractors, parametrized by the position on the ring.

• $V_c \in (0, V_c^{upper})$: the vehicle points, on the average, directly away from the source, and does not drift around the ring. This is a continuum of attractors, parametrized by the position on the ring.

• $V_c > V_c^{upper}$: the vehicle revolves around the source in the clockwise or counterclockwise direction, depending on the initial condition. The vehicle's average heading ranges from slightly outward relative to the ring (for $V_c \gg V_c^{upper}$) to almost directly away from the source (for $V_c$ only slightly larger than $V_c^{upper}$).

While our new strategy is not applicable to fixed-wing aircraft, it is applicable to mobile robots, marine vehicles, and rotorcraft. Of the three ranges for the speed bias parameter $V_c$, namely, $V_c \in (V_c^{lower}, 0)$, $V_c \in (0, V_c^{upper})$, and $V_c > V_c^{upper}$, from the point of view of asymptotic performance, the negative range $V_c \in (V_c^{lower}, 0)$ seems preferable, because the vehicle virtually stops near the source and because it points directly towards the source on average.

Figure 3.7: Two trajectories of the same vehicle, with the only difference being the initial condition in θ (θ(0) = π/4 and θ(0) = −π/4). The vehicle converges to two different average equilibria, (3.33) and (3.34). (a) shows the evolution of the relative angle between the vehicle heading and the source, with $\mu_0 \approx \pi/3$. (b) shows the trajectory of the vehicles.

Figure 3.8: Three trajectories of the same vehicle, with the only difference being the value of $V_c$ ($1.001 \times V_c^{upper}$, $10 \times V_c^{upper}$, and $100 \times V_c^{upper}$). The vehicle converges to three different trajectories that encircle the source. (a) shows the evolution of the relative angle between the vehicle heading and the source, with $\mu_0 \approx 0$ when $V_c$ is close to $V_c^{upper}$ and $\mu_0 \approx \pi/2$ when $V_c \gg V_c^{upper}$. (b) shows the trajectory of the vehicles.

This chapter is in full a reprint of the material as it has been submitted to: N. Ghods, and M. Krstic, “Speed regulation in steering-based source seeking,” Automatica, vol. 46, pp. 452-459, 2010.


4 Multi-Agent Deployment Over a Source

We consider the problem of deploying a group of autonomous vehicles (agents)

in a formation which has higher density near the source of a measurable signal

and lower density away from the source. The spatial distribution of the signal and

the location of the source are unknown but the signal is known to decay with the

distance from the source. The vehicles do not have the capability of sensing their

own positions but they are capable of sensing the relative position between them

and their neighbors. We design a control algorithm based on a combination of two

components. One component of the control law is inspired by the heat PDE and it

results in the agents deploying between two anchor agents. The other component of

the control law is based on extremum seeking and it achieves higher vehicle density

around the source. Using averaging theory for PDEs we prove that the vehicle

density will be highest around the source. We also quantify the density function

of the agents’ deployment position. By discretizing the model with respect to the

continuous agent index, we obtain decentralized control laws for discrete agents and

illustrate the theoretical results with simulations.


4.1 Introduction

Extremum seeking has proved to be a powerful tool in real-time non-model based

control and optimization for single unmanned autonomous vehicles [59, 60, 11, 13,

39]. In recent years, extremum seeking has also been used for groups of unmanned

autonomous vehicles in a network with each vehicle having limited local information

[51, 49].

We consider the task of seeking the maximum of a signal field while simulta-

neously achieving a formation distribution which has higher density around the

areas with higher signal strength. We combine the method of extremum-seeking

with diffusion feedback to have a group of vehicles complete the task of formation

deployment and source-seeking.

With the new method we explore two different types of control for the agents on

the boundary, which we refer to as anchors: (1) the case of free anchors and (2) the

case of fixed anchors. The free anchor case allows the agents on the edge of the for-

mation to freely move, whereas the fixed anchors case has stationary anchor agents

that start at a desired location. Different deployment distributions are achieved in

the two cases.

The diffusion-based feedback enables the overall multi-agent formation to act

as a net of source seekers, rather than as a group of independent, uncoordinated

seekers, who intrude upon each others’ space. With the free anchors the user casts

the net in a manner to prompt attraction towards the source and spread around

the source. In the fixed anchor case the ends of the net are fixed and the agents in

between distribute such that they have the highest density near the source.

In the present paper we consider only the one-dimensional problem. The two-

dimensional coordinated source seeking problem allows a much broader array of

problem formulations, depending on various possible formation topologies. For this

reason, we focus on the 1D situation to introduce the design ideas and analysis

techniques.

The motivation for using the diffusion/heat PDE is that the diffusion action

induces each agent to take a position half way between his two neighbors. By

combining diffusion with extremum seeking one obtains a swarm of agents where


each agent is driven by two competing strategies, extremum seeking which aims to

place all the agents at the extremum, and diffusion which aims to spread the agents

evenly, provided the anchors are apart. The overall result of these two effects is

that the agents are deployed more densely near the extremum than away from the

extremum. We quantify this density in the paper.

The problem of understanding when the individual actions of interacting agents

give rise to a coordinated behavior has received considerable attention in many fields.

In the control community, the interest in coordination phenomena has been recently

promoted by the need of controlling groups of unmanned autonomous vehicles. A

basic, fairly simplified setup considers a group of n mobile agents, each one described

by a dynamic system capturing the evolution of its heading angle [31] or its position

and velocity [55]. When agents interact with a limited number of neighbors, one

faces the problem of designing a decentralized control scheme (where each agent

uses only the neighbors’ information) in order to orchestrate the collective behavior.

Decentralization implies that the control action can be computed in a distributed

fashion.

A method often used to design and analyze a decentralized controller for a group

of agents is to treat the agents as a continuum. Relations between distributed

consensus algorithms and the heat equation are made in [18]. In [37], agents use

model reference adaptive control laws to track desired trajectories, using either the

heat equation or the wave equation as reference models. Boundary control of PDEs

was used to deploy vehicles into planar curves in [23]. A continuum model for

a swarm of vehicles is formulated using a vehicle density function in [32]. In [9]

deployment on a line segment is achieved by using feedback laws consistent with the

spatially discretized heat equation.

Multi-agent and GPS-enabled source seeking problems have been solved in [43,

47]. A hybrid strategy for solving the source seeking problem was developed in [41].

In [33, 63, 14] the problem proposed in this paper is considered as a GPS-enabled game problem where each agent is trying to maximize its own cost function, but in these algorithms the agents also require the cost information of their neighbors.

Section 4.2 presents a description of the vehicle model and the control scheme

for both free and fixed anchor cases. In Sections 4.3 and 4.4 we prove local exponential convergence to an equilibrium whose density function has its maximum around the source. Section 4.3 deals with the case of free anchors, whereas Section 4.4 deals with fixed anchors. Simulation results in Section 4.5 illustrate the distinct behavior exhibited using free and fixed control for the anchor agents with and without independent parameters for each agent.

4.2 Control Design

We consider vehicles modeled as a velocity-actuated point mass

xt = v (4.1)

where x is the vector of positions of the point masses, and v is the vector of the vehicles' velocity inputs. It is common to consider the heat equation

xt(α, t) = xαα(α, t) (4.2)

as a model that governs the position x(α, t) at time t of an agent indexed by α in

a large (continuum) group of agents, where each agent is able to sense its nearest

neighbor and apply diffusion feedback actuated through the velocity input, namely

v(α, t) = xαα(α, t), (4.3)

with the boundary conditions at xt(0, t) and xt(1, t). The subscripts are used to

denote a partial derivative in the respective variable. For simplicity without loss of

generality we choose the spatial domain α ∈ [0, 1].
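The diffusion feedback (4.3) has a simple discrete-agent reading. The sketch below assumes a central-difference approximation of $x_{\alpha\alpha}$ on a uniform grid in the agent index, stationary end agents, forward-Euler time stepping, and a gain `kappa` multiplying the feedback. None of these choices is prescribed by the text, and the function name `diffuse` is hypothetical.

```python
def diffuse(x, kappa=1.0, dt=0.002, steps=1000):
    """Nearest-neighbor diffusion feedback for agents on a line; the two end agents stay put."""
    n = len(x)
    d_alpha = 1.0 / (n - 1)            # uniform spacing of the agent index on [0, 1]
    x = list(x)
    for _ in range(steps):
        v = [0.0] * n                  # end agents keep zero velocity (assumption)
        for i in range(1, n - 1):
            # Discrete version of v = x_alpha_alpha: move towards the midpoint of the two neighbors.
            v[i] = kappa * (x[i + 1] - 2 * x[i] + x[i - 1]) / d_alpha ** 2
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

initial = [(i / 10) ** 2 for i in range(11)]   # unevenly spaced agents between 0 and 1
final = diffuse(initial)
print([round(p, 4) for p in final])
```

Each interior agent only uses the positions of its two neighbors, and the agents end up evenly spaced between the two end agents. The step size satisfies `kappa * dt / d_alpha**2 <= 0.5`, which keeps the explicit scheme stable.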

Extremum-seeking on a single vehicle modeled as a velocity-actuated point mass

has been studied in [60]. The control law used in [60] is

$$v(t) = a\omega\cos(\omega t) + c\xi\sin(\omega t) \tag{4.4}$$

$$\xi = \frac{s}{s+h}[J], \tag{4.5}$$

where J is the measurement of the signal field and a, ω, c, and h are parameters chosen by the designer. The washout filter (4.5) is not required for stability [53], but is used to achieve better performance.
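The control law (4.4)–(4.5) is easy to try on a scalar example. The following sketch assumes a quadratic signal field with maximizer `x_star`, a washout filter realized as the measurement minus a first-order low-pass state, forward-Euler integration, and parameter values picked only for illustration. The function name `es_point_mass` is hypothetical.

```python
import math

def es_point_mass(x0=2.0, x_star=1.0, f_star=0.0, q=1.0, a=0.2, omega=100.0,
                  c=10.0, h=5.0, T=10.0, dt=5e-4):
    """Extremum seeking (4.4)-(4.5) for a velocity-actuated point mass on a line."""
    def J(x):
        return f_star - q * (x - x_star) ** 2   # assumed quadratic signal field

    x, t = x0, 0.0
    eta = J(x0)                                 # low-pass state; xi = J - eta is the washout output
    for _ in range(int(round(T / dt))):
        xi = J(x) - eta                         # xi = s/(s+h) [J]
        v = a * omega * math.cos(omega * t) + c * xi * math.sin(omega * t)   # (4.4)
        x += dt * v
        eta += dt * h * xi                      # d(eta)/dt = h (J - eta)
        t += dt
    return x

x_final = es_point_mass()
print("final position:", x_final)
```

The position keeps oscillating with amplitude of order `a` because of the probing term, so the final value is expected to lie near `x_star` rather than exactly on it.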


In this paper, given only the measurements of the values of the function J = f(x),

we employ a mix of extremum-seeking and nearest-neighbor based diffusion feedback

given by

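$$v(\alpha, t) = \kappa(\alpha)x_{\alpha\alpha}(\alpha, t) + a(\alpha)\omega\cos(\omega t) + c(\alpha)\xi(\alpha, t)\sin(\omega t) \tag{4.6}$$

$$\xi(\alpha, t) = \frac{s}{s + h(\alpha)}[J(\alpha, t)], \tag{4.7}$$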

where the performance can be influenced by the positive parameters a(α), c(α),

κ(α), h(α), and ω. The parameters can vary with respect to α, which allows each

vehicle to have different parameters.

For the agents on the boundary (anchor agents) we consider two different types of

control laws. We explore first the case of having the anchors free to move according

to the shape and location of the signal field, and then consider the case where the

user deploys the anchors to desired locations.

The free anchor boundary conditions have the form

$$v(0, t) = -\kappa(0)\nu + a(0)\omega\cos(\omega t) + c(0)\xi(0, t)\sin(\omega t) \tag{4.8}$$

$$v(1, t) = \kappa(1)\nu + a(1)\omega\cos(\omega t) + c(1)\xi(1, t)\sin(\omega t), \tag{4.9}$$

where ν is a constant velocity which makes the anchors expand out until the extremum seeking (ES) term is big enough to counteract ν and stop the expansion of the anchors.

The fixed anchor boundary conditions have the form

$$x(0, t) = \underline{x} \tag{4.10}$$

$$x(1, t) = \bar{x}, \tag{4.11}$$

where $\underline{x}$ and $\bar{x}$ are the desired fixed locations of the boundary agents. The fixed

boundary conditions are used to force the agents in between the anchors (follower

agents) to distribute between the desired locations. The fixed anchors can be virtual

points whose positions are fed to the nearest followers, or the fixed anchors can

represent a physical boundary like a wall that the followers can sense.

With the free anchors there are no restrictions on where the formation will end

up. The deployment range depends primarily on the initial anchor velocities ν. On


the other hand, the fixed anchor case allows the user to pick an area of interest and

have the agents explore all of this area.

We assume that the nonlinear map defining the distribution of the signal field is

quadratic and takes the form

$$J = f(x) = f^* - q(x - x^*)^2, \tag{4.12}$$

where x is the position of the vehicle, x∗ is the maximizer, f ∗ = f(x∗) is the

maximum, and q is an unknown positive constant. The assumption of the quadratic

form for the signal field is used to simplify the stability proof.

4.3 Free Anchors

In this section we analyze the convergence properties of the feedback law (4.6)–(4.9). We define an output error variable $e(\alpha, t) = \frac{h(\alpha)}{s + h(\alpha)}[J(\alpha, t)] - f^*$, where $\frac{h(\alpha)}{s + h(\alpha)}$ is a low-pass filter applied to the sensor reading $J$, which allows us to express $\xi(\alpha, t)$, the signal from the washout filter, as $\xi(\alpha, t) = \frac{s}{s + h(\alpha)}[J(\alpha, t)] = J(\alpha, t) - f^* - e(\alpha, t)$, noting also that $e_t(\alpha, t) = h(\alpha)\xi(\alpha, t)$.

To study the vehicle formation in a continuum case we use the formation density

function

$$p(x) = \frac{d}{dx}\phi^{-1}(x) = \frac{1}{\phi'(\phi^{-1}(x))} \tag{4.13}$$

where $\phi^{-1}(x)$ is the inverse function of vehicle position $\phi(\alpha)$ and $\phi'$ denotes the derivative with respect to the function's only argument.

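Theorem 4.1 Consider the closed-loop system

$$x_t(\alpha, t) = \kappa(\alpha)x_{\alpha\alpha}(\alpha, t) + a(\alpha)\omega\cos(\omega t) + c(\alpha)\xi(\alpha, t)\sin(\omega t) \tag{4.14}$$

$$e_t(\alpha, t) = h(\alpha)\xi(\alpha, t) \tag{4.15}$$

$$\xi(\alpha, t) = -q(x(\alpha, t) - x^*)^2 - e(\alpha, t) \tag{4.16}$$

with the free boundary conditions (B.C.)

$$x_t(0, t) = -\kappa(0)\nu + a(0)\omega\cos(\omega t) + c(0)\xi(0, t)\sin(\omega t) \tag{4.17}$$

$$x_t(1, t) = \kappa(1)\nu + a(1)\omega\cos(\omega t) + c(1)\xi(1, t)\sin(\omega t), \tag{4.18}$$

where $\kappa(\alpha), h(\alpha), a(\alpha), c(\alpha) > 0$ and $a(\alpha), c(\alpha)$ are chosen such that $\frac{d}{d\alpha}(a(\alpha)c(\alpha)) < \frac{a(\alpha)c(\alpha)}{2}$, $\forall\alpha \in [0, 1]$, $q > 0$, and $\nu \in \mathbb{R}$. There exists $\omega^* > 0$ such that, for all $\omega > \omega^*$, there exists a periodic solution $(x^{2\pi/\omega}(\alpha, t), e^{2\pi/\omega}(\alpha, t))$ of period $2\pi/\omega$ in $t$ and with the property that

$$|x^{2\pi/\omega}(\alpha, t) - x^* - \rho(\alpha)|^2 \le O\left(\frac{1}{\omega} + \max_\alpha a(\alpha)\right) \tag{4.19}$$

$\forall\alpha \in [0, 1]$, $t \ge 0$, where

$$\rho(\alpha) = a_0^{free}e^{\gamma(\alpha)} - a_1^{free}e^{-\gamma(\alpha)}, \tag{4.20}$$

$$a_0^{free} = \frac{\nu}{e^{\gamma(1)} - e^{-\gamma(1)}}\left(\frac{1}{\lambda^2(1)} + \frac{e^{-\gamma(1)}}{\lambda^2(0)}\right), \tag{4.21}$$

$$a_1^{free} = \frac{\nu}{e^{\gamma(1)} - e^{-\gamma(1)}}\left(\frac{1}{\lambda^2(1)} + \frac{e^{\gamma(1)}}{\lambda^2(0)}\right), \tag{4.22}$$

$$\gamma(\alpha) = \int_0^\alpha \lambda(\sigma)\,d\sigma, \quad\text{and} \tag{4.23}$$

$$\lambda(\sigma) = \sqrt{\frac{qc(\sigma)a(\sigma)}{\kappa(\sigma)}} \tag{4.24}$$

such that whenever the quantities

$$|x(0, 0) - x^* - \rho(0)|^2, \quad \int_0^1 |x(\alpha, 0) - x^* - \rho(\alpha)|^2\,d\alpha, \tag{4.25}$$

$$\int_0^1 |x_\alpha(\alpha, 0) - \rho_\alpha(\alpha)|^2\,d\alpha, \quad\text{and}\quad \int_0^1 \left|e(\alpha, 0) + \frac{qa^2}{2} + q\rho^2(\alpha)\right|^2 d\alpha \tag{4.26}$$

are sufficiently small, the solution $(x(\alpha, t), e(\alpha, t))$ exponentially converges to $(x^{2\pi/\omega}(\alpha, t), e^{2\pi/\omega}(\alpha, t))$ in $H_1[0, 1] \times L_2[0, 1]$ norm.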
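To get a feel for the profile ρ(α) in (4.20)–(4.24), the sketch below evaluates it for the special case of constant parameters, so that γ(α) = λα. It also evaluates the deployment density $1/\rho'(\alpha)$ obtained from (4.13) with $\phi(\alpha) = x^* + \rho(\alpha)$. The constant-parameter restriction, the numerical values and the function name `rho_profile` are assumptions made for illustration.

```python
import math

def rho_profile(nu, q, c, a, kappa, n=101):
    """Equilibrium profile (4.20)-(4.24) for constant parameters, sampled on n agent indices."""
    lam = math.sqrt(q * c * a / kappa)           # (4.24); constant, so gamma(alpha) = lam * alpha
    denom = math.exp(lam) - math.exp(-lam)
    a0 = nu / denom * (1 / lam ** 2 + math.exp(-lam) / lam ** 2)   # (4.21)
    a1 = nu / denom * (1 / lam ** 2 + math.exp(lam) / lam ** 2)    # (4.22)
    alphas = [i / (n - 1) for i in range(n)]
    rho = [a0 * math.exp(lam * s) - a1 * math.exp(-lam * s) for s in alphas]            # (4.20)
    d_rho = [lam * (a0 * math.exp(lam * s) + a1 * math.exp(-lam * s)) for s in alphas]
    density = [1.0 / d for d in d_rho]           # p = 1 / phi'(alpha), from (4.13)
    return alphas, rho, density, lam

alphas, rho, density, lam = rho_profile(nu=1.0, q=1.0, c=10.0, a=0.2, kappa=0.5)
peak_index = density.index(max(density))
print("densest agent index alpha =", alphas[peak_index], "offset from source =", rho[peak_index])
```

With these values λ = 2, the end agents sit at offsets ∓ν/λ² from the source, and the density is largest for the agent whose offset ρ is zero, that is, at the source.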

Proof: We start the proof by defining the error variable

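$$\tilde x = x - x^* - a(\alpha)\sin(\omega t), \tag{4.27}$$

where $x^*$ is the location of the source, and the new time variable

$$\tau = \omega t. \tag{4.28}$$

The resulting dynamics become

$$\tilde x_\tau(\alpha, \tau) = \frac{1}{\omega}\left(\kappa(\alpha)\left(\tilde x_{\alpha\alpha}(\alpha, \tau) + a''(\alpha)\sin(\tau)\right) + c(\alpha)\xi(\alpha, \tau)\sin(\tau)\right), \tag{4.29}$$

$$e_\tau(\alpha, \tau) = \frac{h(\alpha)}{\omega}\xi(\alpha, \tau), \tag{4.30}$$

$$\xi(\alpha, \tau) = -q(\tilde x(\alpha, \tau) + a(\alpha)\sin(\tau))^2 - e(\alpha, \tau) \tag{4.31}$$

with B.C.

$$\tilde x_\tau(0, \tau) = \frac{1}{\omega}(-\kappa(0)\nu + c(0)\xi(0, \tau)\sin(\tau)) \tag{4.32}$$

$$\tilde x_\tau(1, \tau) = \frac{1}{\omega}(\kappa(1)\nu + c(1)\xi(1, \tau)\sin(\tau)). \tag{4.33}$$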

The average error system is

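$$\tilde x^{ave}_\tau(\alpha, \tau) = \frac{1}{\omega}\left(\kappa(\alpha)\tilde x^{ave}_{\alpha\alpha}(\alpha, \tau) - qc(\alpha)a(\alpha)\tilde x^{ave}(\alpha, \tau)\right) \tag{4.34}$$

$$e^{ave}_\tau(\alpha, \tau) = -\frac{h(\alpha)}{\omega}\left(q(\tilde x^{ave}(\alpha, \tau))^2 + \frac{qa^2(\alpha)}{2} + e^{ave}(\alpha, \tau)\right) \tag{4.35}$$

with B.C.

$$\tilde x^{ave}_\tau(0, \tau) = \frac{1}{\omega}\left(-\kappa(0)\nu - qc(0)a(0)\tilde x^{ave}(0, \tau)\right) \tag{4.36}$$

$$\tilde x^{ave}_\tau(1, \tau) = \frac{1}{\omega}\left(\kappa(1)\nu - qc(1)a(1)\tilde x^{ave}(1, \tau)\right). \tag{4.37}$$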

The equilibrium profile of the average error system (4.34)–(4.37) is

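$$\left[\tilde{x}^{{\rm ave}^e}(\alpha),\; e^{{\rm ave}^e}(\alpha)\right] = \left[\rho(\alpha),\; -\frac{q a^2(\alpha)}{2} - q\rho^2(\alpha)\right], \tag{4.38}$$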

where ρ(α) is given in (4.20).
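The equilibrium profile can be checked numerically. The following is a minimal sketch, assuming constant a, c, and κ (so that λ is constant and γ(α) = λα); the function name `free_anchor_profile` and the parameter values are illustrative and are not taken from the simulations in this chapter. It evaluates (4.20)–(4.24) and confirms that the profile satisfies the averaged boundary conditions (4.36)–(4.37) at equilibrium, that is, ρ(0) = −ν/λ² and ρ(1) = ν/λ².

```python
import numpy as np

def free_anchor_profile(alpha, q, a, c, kappa, nu):
    """Evaluate rho(alpha) from (4.20)-(4.24), assuming constant a, c, kappa."""
    lam = np.sqrt(q * c * a / kappa)          # (4.24), constant in alpha
    gamma = lam * alpha                       # (4.23) with constant lambda
    g1 = lam                                  # gamma(1)
    denom = np.exp(g1) - np.exp(-g1)
    a0 = nu / denom * (1.0 / lam**2 + np.exp(-g1) / lam**2)   # (4.21)
    a1 = nu / denom * (1.0 / lam**2 + np.exp(g1) / lam**2)    # (4.22)
    rho = a0 * np.exp(gamma) - a1 * np.exp(-gamma)            # (4.20)
    return rho, lam

# Illustrative (hypothetical) parameter values giving lambda close to 5.
nu = 2.0
alpha = np.linspace(0.0, 1.0, 11)
rho, lam = free_anchor_profile(alpha, q=1.0, a=0.1, c=10.0, kappa=0.04, nu=nu)

print("lambda =", lam)
print("rho(0) =", rho[0], " expected", -nu / lam**2)
print("rho(1) =", rho[-1], " expected", nu / lam**2)
```

Under these assumptions the anchors settle at a distance ν/λ² = νκ/(qca) on either side of the source, which is the ratio that governs how far the formation spreads.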

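We shift the system state by its equilibrium profile with the following transformation

$$w(\alpha,\tau) = \tilde{x}^{\rm ave}(\alpha,\tau) - \tilde{x}^{{\rm ave}^e}(\alpha) \tag{4.39}$$

$$z(\alpha,\tau) = e^{\rm ave}(\alpha,\tau) - e^{{\rm ave}^e}(\alpha), \tag{4.40}$$

which results in the following dynamics

$$w_\tau(\alpha,\tau) = \frac{1}{\omega}\big(\kappa(\alpha)\,w_{\alpha\alpha}(\alpha,\tau) - q\,c(\alpha)a(\alpha)\,w(\alpha,\tau)\big) \tag{4.41}$$

$$z_\tau(\alpha,\tau) = -\frac{h(\alpha)}{\omega}\Big(q\big(w(\alpha,\tau) + \rho(\alpha)\big)^2 + z(\alpha,\tau) - q\rho^2(\alpha)\Big) = -\frac{h(\alpha)}{\omega}\Big(q\,w^2(\alpha,\tau) + 2q\rho(\alpha)\,w(\alpha,\tau) + z(\alpha,\tau)\Big) \tag{4.42}$$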

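with B.C.

$$w_\tau(0,\tau) = \frac{1}{\omega}\Big(-\kappa(0)\nu - q\,c(0)a(0)\big(w(0,\tau) + \rho(0)\big)\Big) = -\frac{q\,c(0)a(0)}{\omega}\,w(0,\tau) \tag{4.43}$$

$$w_\tau(1,\tau) = \frac{1}{\omega}\Big(\kappa(1)\nu - q\,c(1)a(1)\big(w(1,\tau) + \rho(1)\big)\Big) = -\frac{q\,c(1)a(1)}{\omega}\,w(1,\tau) \tag{4.44}$$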

Linearizing the averaged error system produces

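$$w_\tau(\alpha,\tau) = \frac{1}{\omega}\big(\kappa(\alpha)\,w_{\alpha\alpha}(\alpha,\tau) - q\,c(\alpha)a(\alpha)\,w(\alpha,\tau)\big) \tag{4.45}$$

$$z_\tau(\alpha,\tau) = -\frac{h(\alpha)}{\omega}\big(2q\rho(\alpha)\,w(\alpha,\tau) + z(\alpha,\tau)\big) \tag{4.46}$$

with B.C.

$$w_\tau(0,\tau) = -\frac{q\,c(0)a(0)}{\omega}\,w(0,\tau) \tag{4.47}$$

$$w_\tau(1,\tau) = -\frac{q\,c(1)a(1)}{\omega}\,w(1,\tau). \tag{4.48}$$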

Using Lemma A.1 in Appendix A, where k1 = κ(α), k2 = qa(α)c(α), k3 =

2qh(α)ρ(α), and k4 = h(α), we get that the averaged error system has an expo-

nentially stable equilibrium. Applying Theorem 3.6 and Example 6.4 in [27] (details

in Appendix B), we can state that there exists ω∗ > 0 such that, for all ω > ω∗,

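there exists a periodic solution $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ of period 2π/ω in t and with the property that

$$\left|x^{2\pi/\omega}(0,t) - x^* - \rho(0)\right|^2 + \int_0^1 \left|x^{2\pi/\omega}(\alpha,t) - x^* - \rho(\alpha)\right|^2 d\alpha + \int_0^1 \left|x^{2\pi/\omega}_\alpha(\alpha,t) - \rho'(\alpha)\right|^2 d\alpha \le O\!\left(\frac{1}{\omega} + \max_\alpha a(\alpha)\right), \tag{4.49}$$

so that the solution (x(α, t), e(α, t)) locally exponentially converges to $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ in $H^1[0,1] \times L^2[0,1]$ norm. Agmon's inequality combined with Young's inequality yields

$$\sup_\alpha |\zeta(\alpha,t)|^2 \le \zeta^2(0,t) + \int_0^1 |\zeta(\alpha,t)|^2\,d\alpha + \int_0^1 |\zeta_\alpha(\alpha,t)|^2\,d\alpha. \tag{4.50}$$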

By applying (4.50) to (4.49) we get the bound (4.19).


Now we take a look at how the parameters affect the density function.

Proposition 4.1 The averaged equilibrium (4.20)–(4.24) has the following formation density function

$$p(x) = \frac{1 + \dfrac{x - x^*}{\sqrt{(x-x^*)^2 + 4a_0a_1}}}{\lambda(\phi^{-1}(x))\left(x - x^* + \sqrt{(x-x^*)^2 + 4a_0a_1}\right)}, \tag{4.51}$$

where $a_0 = a_0^{\rm free}$, $a_1 = a_1^{\rm free}$ are given in (4.21) and (4.22).

Proof: We start by taking the vehicle position function, which has the form

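$$x = \phi(\alpha) = \rho(\alpha) + x^* = a_0 e^{\gamma(\alpha)} - a_1 e^{-\gamma(\alpha)} + x^*, \tag{4.52}$$

and solving (4.52) for γ to obtain

$$\gamma(\alpha) = \ln\left(\frac{x - x^* + \sqrt{(x-x^*)^2 + 4a_0a_1}}{2a_0}\right). \tag{4.53}$$

We use (4.23) to rewrite γ in terms of λ and differentiate both sides with respect to x to obtain

$$\left(\frac{d}{dx}\phi^{-1}(x)\right)\lambda(\phi^{-1}(x)) = \frac{1 + \dfrac{x - x^*}{\sqrt{(x-x^*)^2 + 4a_0a_1}}}{x - x^* + \sqrt{(x-x^*)^2 + 4a_0a_1}} \tag{4.54}$$

and then simply solve for the density function $p(x) = \frac{d}{dx}\phi^{-1}(x)$.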
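As a sanity check on the proposition, the closed-form density can be compared with a direct numerical derivative of the inverse position map. The sketch below assumes a constant λ and uses arbitrary illustrative coefficients a0 and a1 (they are not computed from (4.21)–(4.22) here); the helper names `position` and `density` are hypothetical.

```python
import numpy as np

def position(alpha, x_star, a0, a1, lam):
    """Vehicle position x = phi(alpha) from (4.52), constant lambda assumed."""
    return a0 * np.exp(lam * alpha) - a1 * np.exp(-lam * alpha) + x_star

def density(x, x_star, a0, a1, lam):
    """Formation density p(x) from (4.51), constant lambda assumed."""
    s = x - x_star
    root = np.sqrt(s**2 + 4.0 * a0 * a1)
    return (1.0 + s / root) / (lam * (s + root))

# Illustrative (hypothetical) values.
lam, a0, a1, x_star = 5.0, 0.002, 0.3, 0.6

alpha = np.linspace(0.0, 1.0, 2001)
x = position(alpha, x_star, a0, a1, lam)

p_formula = density(x, x_star, a0, a1, lam)
# p(x) = d(alpha)/dx = 1 / phi'(alpha), estimated by finite differences.
p_numeric = 1.0 / np.gradient(x, alpha)

# The density integrates to one over the deployed interval (trapezoidal rule).
total = np.sum(0.5 * (p_formula[1:] + p_formula[:-1]) * np.diff(x))

print("max relative difference:",
      np.max(np.abs(p_formula[1:-1] / p_numeric[1:-1] - 1.0)))
print("integral of p over the formation:", total)
print("density peaks at x =", x[np.argmax(p_formula)])
```

With these values the formula and the finite-difference estimate agree in the interior of the interval, and the peak of the density sits at the source location x∗.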

Figure 4.1 shows two density plots with the parameters chosen in a way to make

λ = 5 for the solid black line and λ(α) = 5(2 − α) for the dashed blue line with

ν = 2 and x∗ = 0 for both. Figure 4.1 shows that the vehicles with higher value of

λ(α) squeeze towards the maximum x∗ and the vehicles with lower values of λ(α)

spread out more.

We consider the simple case of constant λ, to show the effect of λ and ν on the

density function at x∗. The formula for density function at x∗ with constant λ is

given by

p(x∗) =

√κλ sinh(λ)

ν√

2 + 2 cosh(λ), (4.55)

where it can be noted that as λ increases so does the density function at x∗, while

the opposite is true for ν.

Figure 4.1: Vehicle density function for λ = 5 and λ(α) = 5(2 − α). Axes: position (x − x∗) versus density (n vehicles/∆x).

4.4 Fixed Anchors

In this section we highlight the differences in the analysis of the fixed anchor case from the free anchor case. The main difference between the two cases is that the fixed anchor case forces the formation deployment profile to lie between $\underline{x}$ and $\bar{x}$, which in turn restricts the density function to the same range. Unlike in the free anchor case, in the fixed anchor case the anchors are stationary.

Theorem 4.2 Consider the system

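$$x_t(\alpha,t) = \kappa\,x_{\alpha\alpha}(\alpha,t) + a\omega\cos(\omega t) + c\,\xi(\alpha,t)\sin(\omega t) \tag{4.56}$$

$$e_t(\alpha,t) = h\,\xi(\alpha,t) \tag{4.57}$$

$$\xi(\alpha,t) = -q\,(x(\alpha,t) - x^*)^2 - e(\alpha,t) \tag{4.58}$$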

with the fixed boundary conditions

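$$x(0,t) = \underline{x} \tag{4.59}$$

$$x(1,t) = \bar{x} \tag{4.60}$$

where $\underline{x}, \bar{x} \in \mathbb{R}$. There exists ω∗ > 0 such that, for all ω > ω∗, there exists a periodic solution $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ of period 2π/ω in t and with the property that

$$\left|x^{2\pi/\omega}(\alpha,t) - x^* - \rho(\alpha)\right|^2 \le O\!\left(\frac{1}{\omega} + \max_\alpha a(\alpha)\right) \tag{4.61}$$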

∀α ∈ [0, 1], t ≥ 0, where

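$$\rho(\alpha) = a_0^{\rm fixed}\,e^{\gamma(\alpha)} - a_1^{\rm fixed}\,e^{-\gamma(\alpha)}, \tag{4.62}$$

$$a_0^{\rm fixed} = \frac{\bar{x} - x^*\big(1 - e^{-\gamma(1)}\big) - \underline{x}\,e^{-\gamma(1)}}{e^{\gamma(1)} - e^{-\gamma(1)}}, \tag{4.63}$$

$$a_1^{\rm fixed} = \frac{\bar{x} - x^*\big(1 - e^{\gamma(1)}\big) - \underline{x}\,e^{\gamma(1)}}{e^{\gamma(1)} - e^{-\gamma(1)}}, \tag{4.64}$$

and γ given by (4.23), such that whenever the quantities

$$\int_0^1 |x(\alpha,0) - x^* - \rho(\alpha)|^2\,d\alpha, \tag{4.65}$$

and

$$\int_0^1 \left|e(\alpha,0) + \frac{q a^2}{2} + q\rho^2(\alpha)\right|^2 d\alpha \tag{4.66}$$

are sufficiently small, the solution (x(α, t), e(α, t)) exponentially converges to $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ in $L^2[0,1] \times L^2[0,1]$ norm.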

Proof: Similar to the proof for Theorem 4.1, we start by applying (4.27) and

(4.28) to system (4.56)–(4.58) with the B.C. (4.59)–(4.60), and then by averaging

we obtain

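$$\tilde{x}^{\rm ave}_\tau(\alpha,\tau) = \frac{1}{\omega}\big(\kappa(\alpha)\,\tilde{x}^{\rm ave}_{\alpha\alpha}(\alpha,\tau) - q\,c(\alpha)a(\alpha)\,\tilde{x}^{\rm ave}(\alpha,\tau)\big) \tag{4.67}$$

$$e^{\rm ave}_\tau(\alpha,\tau) = -\frac{h(\alpha)}{\omega}\left(q\big(\tilde{x}^{\rm ave}(\alpha,\tau)\big)^2 + \frac{q a(\alpha)^2}{2} + e^{\rm ave}(\alpha,\tau)\right) \tag{4.68}$$

with B.C.

$$\tilde{x}^{\rm ave}(0,\tau) = \underline{x} - x^* \quad \text{and} \quad \tilde{x}^{\rm ave}(1,\tau) = \bar{x} - x^*. \tag{4.69}$$

The average error system (4.67)–(4.69) has an equilibrium defined by

$$\left[\tilde{x}^{{\rm ave}^e}(\alpha),\; e^{{\rm ave}^e}(\alpha)\right] = \left[\rho(\alpha),\; -\frac{q a^2}{2} - q\rho^2(\alpha)\right], \tag{4.70}$$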

where ρ(α) is given in (4.62). We omit the details of the averaging, but would like

to point out that the main difference in averaging the fixed case from the free case


is in the boundary condition, which yields different coefficients for the equilibrium

(4.62).

Shifting the averaged system by the equilibrium and linearizing we get

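$$w_\tau(\alpha,\tau) = \frac{1}{\omega}\big(\kappa(\alpha)\,w_{\alpha\alpha}(\alpha,\tau) - q\,c(\alpha)a(\alpha)\,w(\alpha,\tau)\big) \tag{4.71}$$

$$z_\tau(\alpha,\tau) = -\frac{h(\alpha)}{\omega}\big(2q\rho(\alpha)\,w(\alpha,\tau) + z(\alpha,\tau)\big) \tag{4.72}$$

with B.C. w(0, τ) = w(1, τ) = 0.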

Using Lemma A.3 in Appendix A, where k1 = κ(α), k2 = qa(α)c(α), k3 =

2qh(α)ρ(α), and k4 = h(α), we get that the averaged error system has an expo-

nentially stable equilibrium. Using Theorem 3.6 and Example 6.4 in [27] (details

in Appendix B), we can state that there exists ω∗ > 0 such that, for all ω > ω∗,

there exists a periodic solution $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ of period 2π/ω in t and with the property that

$$\int_0^1 \left|x^{2\pi/\omega}(\alpha,t) - x^* - \rho(\alpha)\right|^2 d\alpha \le O\!\left(\frac{1}{\omega} + \max_\alpha a(\alpha)\right) \tag{4.73}$$

so that the solution (x(α, t), e(α, t)) locally exponentially converges to $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ in $L^2[0,1] \times L^2[0,1]$ norm. By applying (4.50) to (4.73) we get the bound (4.61).

The same result holds as in Proposition 4.1 for the averaged equilibrium of the fixed anchor case (4.62)–(4.64), with the formation density function given as (4.51), where $a_0 = a_0^{\rm fixed}$ and $a_1 = a_1^{\rm fixed}$ are given in (4.63) and (4.64), respectively. Evaluating (4.51) at x = x∗ with a constant λ gives the formation density at the source,

$$p(x^*) = \frac{\sinh(\lambda)}{\lambda\sqrt{\beta}}, \tag{4.74}$$

where

$$\beta = \big(x^{*2} - x^*(\underline{x} + \bar{x})\big)\big(2 - 2\cosh(\lambda)\big) - 2\,\underline{x}\,\bar{x}\cosh(\lambda) + \underline{x}^2 + \bar{x}^2, \tag{4.75}$$

which increases with λ and decreases as the difference between $\underline{x}$ and $\bar{x}$ grows.


4.5 Simulation Results

To implement the algorithm in Section 4.2 we must first understand how to

choose and tune the parameters a, c, κ, ω, h, and ν. Higher values of a and c cause

the attraction of the vehicle towards the source to increase and the opposite is true

for κ. The parameters ω and a are chosen such that the quantity 1/ω +maxα a(α)

is sufficiently small. The cutoff frequency h for the washout filter has to be high

enough to significantly get rid of the DC term but smaller than the perturbation

frequency ω. In the free anchor case, the higher the ratio νκ/(ac), the farther the anchor vehicles will settle from the source, thereby causing the formation to spread out.

To apply the algorithm in Section 4.2, we discretize the continuous model (4.6). The two anchor agents do not require any modification of their control laws (4.8), (4.9), and (4.10), since these laws do not include any partial differentiation with respect to the agent index. The state variables

x(α, t) and ξ(α, t) become x(iδ, t) and ξ(iδ, t) where i = 0, ..., n + 1, δ = 1/(n + 1),

and n is the number of follower agents. We denote the two anchor agents’ states as

[x0, ξ0] and [xn+1, ξn+1], and the interior seeking agents’ states as [xi, ξi].

We discretize the seeking agents’ control laws (4.6) by using three-point central

differencing to approximate the spatial derivatives, obtaining

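$$v_i(t) = \kappa_i\,\frac{x_{i+1} - 2x_i + x_{i-1}}{\delta^2} + a_i\cos(\omega t) + c_i\,\xi_i(t)\sin(\omega t), \tag{4.76}$$

which can be rearranged as

$$v_i(t) = \kappa_i\,\frac{\Delta x_{i+1,i} + \Delta x_{i-1,i}}{\delta^2} + a_i\cos(\omega t) + c_i\,\xi_i(t)\sin(\omega t), \tag{4.77}$$

where $\Delta x_{j,i} = x_j - x_i$. The washout filter becomes

$$\xi_i(t) = \frac{s}{s+h}\,[J_i(t)], \tag{4.78}$$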

where Ji is the sensor reading of agent i. Figure 4.2 shows the block diagram for

one follower agent.
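The discretized follower law and washout filter can be sketched in a few lines. This is a minimal forward-Euler illustration under stated assumptions, not the code used to produce the figures in this chapter: the signal field is assumed to be f(x) = f∗ − q(x − x∗)², the perturbation is taken as aω cos(ωt) as in (4.56), the washout filter is realized through the state equation de/dt = hξ with ξ = J − e, and the anchors are held fixed at 0 and 1. The values of a, c, κ, h, q, and x∗ follow Section 4.5, while ω = 400 and the step size are choices made here to keep the averaging approximation comfortable with a simple integrator. The function name `simulate_fixed_anchors` is hypothetical.

```python
import numpy as np

def simulate_fixed_anchors(n_followers=9, T=8.0, dt=1e-4,
                           a=0.008, c=15.0, kappa=0.05, omega=400.0, h=10.0,
                           q=10.0, f_star=1.0, x_star=0.6):
    """Forward-Euler sketch of (4.77)-(4.78) with anchors held at 0 and 1.

    Returns agent positions averaged over the last perturbation period.
    """
    delta = 1.0 / (n_followers + 1)               # spacing in the agent index
    x = np.linspace(0.0, 1.0, n_followers + 2)    # equidistant release

    def signal(pos):
        # Assumed quadratic signal field with its maximum at x_star.
        return f_star - q * (pos - x_star) ** 2

    e = signal(x)                                 # start the filter state at J(0)
    n_steps = int(round(T / dt))
    n_avg = int(round(2.0 * np.pi / (omega * dt)))
    x_sum = np.zeros_like(x)

    for k in range(n_steps):
        t = k * dt
        J = signal(x)                             # sensor readings J_i
        xi = J - e                                # washout output s/(s+h)[J]
        lap = (x[2:] - 2.0 * x[1:-1] + x[:-2]) / delta**2   # central difference
        v = (kappa * lap + a * omega * np.cos(omega * t)
             + c * xi[1:-1] * np.sin(omega * t))
        x[1:-1] += dt * v                         # followers move, anchors stay
        e += dt * h * xi                          # low-pass state of the washout
        if k >= n_steps - n_avg:
            x_sum += x
    return x_sum / n_avg

x_final = simulate_fixed_anchors()
gaps = np.diff(x_final)
print("mean positions:", np.round(x_final, 3))
print("smallest gap:", gaps.min(), " gap next to anchor at 0:", gaps[0])
```

In this sketch the spacing between neighbors shrinks near x∗ = 0.6 and widens toward the anchors, which is the qualitative behavior described by the density function (4.51).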

Figure 4.2: Block diagram of a single follower agent.

The signal field parameters for plots in Figure 4.3 are f∗ = 1, q = 10 and x∗ = 0.6. We apply (4.77), where a = 0.008, c = 15, κ = 0.05, ω = 45, and h = 10, for all follower agents and x0 = 0, xn = 1 for the anchor agents to simulate the fixed anchor case on 11 agents. Figure 4.3(a) shows the evolution of a group of autonomous vehicles, with fixed boundary agents, all released equidistantly between x0 and xn. The agents deploy more densely around the signal source (peak) than away from the source, which is consistent with the form of the density function (4.51), where $a_0 = a_0^{\rm fixed}$ and $a_1 = a_1^{\rm fixed}$ are given in (4.63) and (4.64), respectively.

We simulate the free boundary condition case using

v0(t) = −κν + aω cos(ωt) + cξ0(t) sin(ωt), (4.79)

vn(t) = κν + aω cos(ωt) + cξn(t) sin(ωt), (4.80)

and (4.77), where a, c, κ, ω, and h have the same value as the first simulation and

ν = 0.5. Figure 4.3(b) shows the evolution of a group of 11 autonomous vehicles,

with free boundary control, released starting with the anchor agents at position

0 and 0.1 and the follower agents spread equally between them. The deployment

density is consistent with the theoretically predicted solid curve in Figure 4.1.

The theoretical distribution and density functions for the free and fixed anchor cases are shown in Figure 4.4. Figure 4.4(a) shows the normalized vehicle ID number (α) on the y-axis and the vehicle location on the x-axis. Figure 4.4 shows that, in the free anchor case, the agents cover less of the area (between 0.2 and 1) than in the fixed anchor case, where the agents are forced to cover the area between 0 and 1. Figure 4.4(b) shows that the free anchor case has higher density around the source than the fixed anchor case.


The simulation in Figure 4.5 is produced with the same parameters as the simulation shown in Figure 4.3(b), except that in Figure 4.5(a) the extremum seeking parameters are ai = 0.008(1 + i/n), ci = 15(1 + i/n), and in Figure 4.5(b) the source is moving according to x∗(t) = cx∗ + ax∗ sin(ωx∗ t), where cx∗ = 0.6, ax∗ = 0.2, and ωx∗ = π/5. Figure 4.5(a) shows how increasing the parameters a and c with respect to the agent index i pulls the agents with a higher i closer to the source. Figure 4.5(b) shows how the algorithm handles a moving source.

4.6 Conclusions

We have introduced algorithms that expand the capability of previous single-

agent source seeking algorithms. The new multi-agent source seeking algorithms

cover the area around the source in such a manner that the highest density of

agents is achieved at the source and the density decreases away from the source.

This form of deployment is achieved by combining standard extremum seeking with

consensus-type ideas, namely, by using algorithms that are simultaneously driven by

the local signal strength and by diffusion feedback, which employs the distance to

the nearest agents. While diffusion aims to place an agent exactly halfway between

its neighbors, extremum seeking aims to pull the agent closer to the source. In

the presence of anchor agents, which deploy some distance apart, the result is that

agents deploy more densely near the source than away from the source.

Of interest for future research is to extend the present algorithms to the stochastic

case, namely, to replace the sinusoidally forced extremum seeking algorithms by

extremum seeking algorithms forced by white noise [39]. In addition, it is of interest

to extend the current results for one-dimensional formations in one-dimensional

space to higher-dimensional formations in higher-dimensional space. Finally, it is of

interest to extend the present results to non-holonomic vehicles.

This chapter is in full a reprint of the material as it has been submitted to: N.

Ghods, and M. Krstic, “Multi-agent deployment over a source,” under review.


Figure 4.3: Double y-axis plots of the vehicle trajectories showing time scale on the left y-axis, the signal field strength on the right y-axis, and the location of the vehicles on the x-axis. (a) Agent deployment with fixed anchors. (b) Agent deployment with free anchors.

Figure 4.4: Theoretical plot of (a) formation distribution function and (b) formation density function for the fixed and free anchor cases.

Figure 4.5: (a) Agent deployment with free anchors starting far from the equilibrium with linearly increasing parameters. (b) Group of 11 agents using free anchor case to achieve seeking of a moving source.

5

Multi-agent Deployment with

Stochastic Extremum Seeking

We consider the problem of deployment of a group of N autonomous fully ac-

tuated vehicles (agents) in a non-cooperative manner in a planar signal field using

the recently introduced method of stochastic extremum seeking. The spatial dis-

tribution of the signal is unknown to the vehicles but known to be convex. The

vehicles are not able to sense their own positions but are capable of sensing the

distance between their neighbors and themselves. Each vehicle employs a stochastic

extremum seeking control law whose goal is to minimize the value of the measured

signal, namely to be as close as possible to the bottom of the signal field, while simultaneously minimizing a function of the distances between neighboring agents.

Such a seemingly conflicting and mutually competitive nature of the agents’ control

laws produces a Nash equilibrium that depends on the agents’ control parameters

and the unknown signal distribution. We prove local exponential convergence, both

almost surely and in probability, to a small neighborhood near the Nash equilibrium.

The theoretical results are illustrated with simulations.


5.1 Introduction

Recently, extremum seeking has been considered for distributed control of vehi-

cles in a network with each vehicle having limited local information in [51, 50, 24].

The applications include groups of vehicles operating underwater, under ice, in caves

or in urban environments where GPS is unavailable, or where inertial navigation sys-

tems are too costly. Other applications include scenarios where communication or

interaction among all agents is not feasible.

We investigate a stochastic version of non-cooperative source seeking by navi-

gating the autonomous vehicles with the help of a random perturbation. We use

stochastic extremum seeking and apply an extra force to some of the vehicles, which

we refer to as anchor agents, to increase the deployment area. The remaining

agents, which we refer to as follower agents, achieve deployment over a source by

using stochastic extremum seeking to maximize or minimize their local costs. The

vehicles have no knowledge of their own position, nor the position of the source,

and are only required to sense the distance between their neighbors and themselves.

In an application, the signal could be the concentration of a chemical or biological

agent, or it could be an electromagnetic, acoustic, or thermal source. The strength

of the signal is assumed to decay away from the source through diffusion or other

physical processes, but the spatial distribution of the signal is not available to the

vehicles.

The work [50] considers a non-cooperative problem where each agent is trying to

maximize or minimize their local cost function, which results in the convergence of

the group of agents to a Nash equilibrium. In [50], similar to the one agent case [60],

each agent employs two out-of-phase sinusoidal perturbations in order to generate

gradient estimates in the x and y directions for the extremum seeking algorithm.

We consider two cases of excitation for the group of N vehicles. Case 1 uses

an independent Brownian motion on a unit circle for every vehicle, and Case 2

uses only one Brownian motion on a unit circle for all vehicles, but with limited

interaction between neighbors. We provide a stability analysis for both cases based

on stochastic averaging theorems recently developed in [40]. The choice of using

random processes for perturbation was motivated by [6] and [7], where it is observed

that the bacterium Escherichia coli (E. coli) is able to move up chemical gradients

towards higher densities of nutrients by using what appears to be random searching

from time to time. In the works [42, 39], also motivated by E. coli, the problem of

stochastic source seeking was considered for vehicles with unicycle dynamics.

In Section 5.2, we give a description of the vehicle model and the cost function

used by each agent. Section 5.3 presents the control scheme applied according to

Case 1, which allows interaction among all agents, and Case 2, which allows limited

interaction between the agents. We prove convergence of a group of vehicles to a Nash equilibrium, in probability and almost surely, for the control laws in Case 1 and in Case 2 in Section 5.4. Simulation results in Section 5.5 illustrate the distinct behavior exhibited in the two cases.

5.2 Vehicle Model and Local Agent Cost

We consider vehicles modeled as a velocity-actuated point mass

$$\frac{dx_i}{dt} = v_{xi}, \qquad \frac{dy_i}{dt} = v_{yi}, \tag{5.1}$$

where (xi, yi) is the position of the vehicle in the plane, and vxi, vyi are the vehicle

velocity inputs. The subscript i is used to denote the ith vehicle.

We assume that the nonlinear map defining the distribution of the signal field is

quadratic and takes the form

$$f_i(x_i, y_i) = f^* + q_x(x_i - x^*)^2 + q_y(y_i - y^*)^2 \tag{5.2}$$

where (x∗, y∗) is the minimizer, f ∗ = f(x∗, y∗) is the minimum, and (qx, qy) are

unknown positive constants. To account for the interactions between the vehicles

we assume that each vehicle can sense the distance,

$$d_{ij}(x, y) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}, \tag{5.3}$$

between itself and other vehicles. The cost function

$$J_i = f_i + \sum_{j \in N} q_{ij}\,d_{ij}^2 \tag{5.4}$$

includes inter-vehicle interactions, where qij ≥ 0 is the weighting that vehicle i puts

on its distance to vehicle j.
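To make the cost concrete, the following minimal sketch evaluates (5.2)–(5.4) for all vehicles at once. The function name `local_costs` and the two-vehicle example values are illustrative assumptions; the weight matrix is assumed to hold q_ij with zeros on the diagonal.

```python
import numpy as np

def local_costs(pos, weights, f_star, qx, qy, source):
    """Evaluate J_i from (5.4) for every vehicle.

    pos     : (N, 2) array of vehicle positions (x_i, y_i)
    weights : (N, N) array of interaction weights q_ij (zero diagonal assumed)
    source  : (x*, y*), the minimizer of the signal field (5.2)
    """
    dx = pos[:, 0] - source[0]
    dy = pos[:, 1] - source[1]
    f = f_star + qx * dx**2 + qy * dy**2              # signal field (5.2)
    diff = pos[:, None, :] - pos[None, :, :]          # pairwise differences
    d2 = np.sum(diff**2, axis=2)                      # squared distances d_ij^2 from (5.3)
    return f + np.sum(weights * d2, axis=1)           # cost (5.4)

# Two vehicles, one at the source and one a unit distance away (illustrative).
pos = np.array([[1.0, 0.0], [0.0, 0.0]])
weights = np.array([[0.0, 0.5], [0.5, 0.0]])
J = local_costs(pos, weights, f_star=0.0, qx=1.0, qy=1.0, source=(0.0, 0.0))
print(J)
```

The vehicle sitting on the source still pays a cost through its weighted distance to its neighbor, which is what pulls the formation together while the field term pulls it toward the minimizer.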


5.3 Control Design

To deploy the agents about the source position, we propose a control scheme

that utilizes Brownian motion on the unit circle as the excitation signal to perform

stochastic extremum seeking. Brownian motion on the unit circle has the following

form:

$$Y = e^{jB} = [Y_1, Y_2]^T = [\cos(B), \sin(B)]^T, \tag{5.5}$$

where j is the imaginary unit, and B is a 1-dimensional Brownian motion.
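The excitation signal in (5.5) is easy to sample. The sketch below, with a hypothetical helper `circle_brownian`, draws a Brownian path on a uniform time grid (increments with variance dt), maps it onto the unit circle, and reports long-run time averages of the components. The horizon, step size, and seed are arbitrary choices made for illustration.

```python
import numpy as np

def circle_brownian(T, dt, phase=0.0, seed=0):
    """Sample Y = [cos(B), sin(B)] from (5.5) on a uniform time grid.

    B is a one-dimensional Brownian motion started at zero; `phase` shifts
    the initial point on the circle.
    """
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    dB = rng.normal(0.0, np.sqrt(dt), size=n)      # independent increments
    B = np.concatenate(([0.0], np.cumsum(dB)))
    return np.cos(B + phase), np.sin(B + phase)

y1, y2 = circle_brownian(T=20000.0, dt=0.01)

print("on the unit circle:", np.allclose(y1**2 + y2**2, 1.0))
print("time average of Y1:     ", y1.mean())
print("time average of Y1^2:   ", (y1**2).mean())
print("time average of Y1 * Y2:", (y1 * y2).mean())
```

Over a long horizon the averages of Y1 and Y1·Y2 are close to zero and the average of Y1² is close to 1/2, so the two components behave like a pair of uncorrelated, bounded probing signals.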

First, for clarity, we introduce the control scheme for a single vehicle i, which does not impose any constraints on the excitation signal. Then, we discuss the deployment of N vehicles under two types of excitation: one where each vehicle uses an independent Brownian motion, and the other where every vehicle uses the same Brownian motion process but the initial conditions of the processes differ by kπ/2, k ∈ Z, between neighboring vehicles.

We propose the following stochastic control algorithm for vehicle i:

vxi =− aη1i + cxξiη1i + νxi, (5.6)

vyi =− aη2i + cyξiη2i + νyi, (5.7)

ξi =s

s+ h[Ji], (5.8)

η1i =cos(Wi(t/ϵ)), (5.9)

η2i =sin(Wi(t/ϵ)), (5.10)

where ξi is the output of the washout filter for the cost Ji, η1i and η2i are used as perturbations in the stochastic extremum seeking scheme, a, cx, cy, ϵ, h > 0 are extremum seeking design parameters, and νxi, νyi ∈ R. In (5.8), s denotes the Laplace variable of the transfer function acting on the cost Ji. We consider vehicles with (νxi, νyi) ≠ (0, 0) to be the anchor agents and those with νxi = νyi = 0 to be the follower agents. The signal (Wi(t), t ≥ 0) is a standard Brownian motion defined in a complete probability space (Ω, F, P) with sample space Ω, σ-field F, and probability measure P.


Using Ito’s formula, η1i and η2i can be written as the solution to the differential

equations,

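$$d\eta_{1i} = -\frac{1}{2\epsilon}\cos(W_i(t/\epsilon))\,dt - \sin(W_i(t/\epsilon))\,dW_i(t/\epsilon), \tag{5.11}$$

$$d\eta_{2i} = -\frac{1}{2\epsilon}\sin(W_i(t/\epsilon))\,dt + \cos(W_i(t/\epsilon))\,dW_i(t/\epsilon), \tag{5.12}$$

which are equivalent to the stochastic differential equations

$$d\eta_{1i} = -\frac{1}{2\epsilon}\eta_{1i}\,dt - \eta_{2i}\,dW_i, \tag{5.13}$$

$$d\eta_{2i} = -\frac{1}{2\epsilon}\eta_{2i}\,dt + \eta_{1i}\,dW_i, \tag{5.14}$$

with initial condition Wi(0) = 0 and $[\eta_{1i}(0), \eta_{2i}(0)]^T = [\cos(\phi_i), \sin(\phi_i)]^T$, ϕi ∈ R, i.e., $\eta_{1i}^2(0) + \eta_{2i}^2(0) = 1$. This equivalence is shown with more detail in Section II of [40]. Hence, the control signals (5.6) and (5.7) become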

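$$v_{xi} = -\frac{a}{2\epsilon}\eta_{1i} - a\,\eta_{2i}\dot{W}_i + c_x\xi_i\eta_{1i} + \nu_{xi}, \tag{5.15}$$

$$v_{yi} = -\frac{a}{2\epsilon}\eta_{2i} + a\,\eta_{1i}\dot{W}_i + c_y\xi_i\eta_{2i} + \nu_{yi}, \tag{5.16}$$

where $\dot{W}_i$ is a white noise signal for all i ∈ {1, 2, ..., N}. The vehicle dynamics in closed-loop with control laws (5.6)–(5.10) are rewritten as

$$dx_i = -\frac{a}{2\epsilon}\eta_{1i}\,dt - a\,\eta_{2i}\,dW_i + c_x\xi_i\eta_{1i}\,dt + \nu_{xi}\,dt, \tag{5.17}$$

$$dy_i = -\frac{a}{2\epsilon}\eta_{2i}\,dt + a\,\eta_{1i}\,dW_i + c_y\xi_i\eta_{2i}\,dt + \nu_{yi}\,dt, \tag{5.18}$$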

where η1i and η2i are given in (5.13) and (5.14), respectively.

Remark 5.1: It is not necessary to choose the Brownian motion on the unit

circle as the probing signal in the stochastic design for one vehicle. It is only re-

quired that the excitation signals in the x and y directions are uncorrelated and

bounded. Note that the Brownian motion on the unit circle was primarily chosen

for the ease that it provides in the stochastic averaging and the ability to use one

Brownian motion per vehicle or, as it will be shown in the next section, use only

one Brownian motion for the entire group of vehicles.


For the stable deployment of N vehicles, with dynamics (5.13)–(5.14), (5.17)–

(5.18), we impose additional constraints on the Brownian motion Wi and the initial

condition of the Brownian motion on a unit circle (cos(ϕi), sin(ϕi)) according to the

two types of excitation that we consider.

Case 1: For this case, we require the Brownian motion used by the ith agent, Wi(t), to be uncorrelated with the Brownian motion used by the jth agent, Wj(t), for i ≠ j and allow ϕi ∈ R. Under these constraints, each vehicle is allowed to interact with any of the other vehicles.

Case 2: This case allows every vehicle to use the same Brownian motion, Wi(t) = W(t), ∀i ∈ {1, 2, ..., N}, but requires the initial condition of the Brownian motion on a unit circle to satisfy

$$\begin{aligned} \phi_i - \phi_j &= \frac{k\pi}{2}, && i \in \Omega_{\rm odd},\ j \in \Omega_{\rm even}, \\ \phi_i - \phi_j &= k\pi, && i, j \in \Omega_{\rm odd} \ \text{or}\ i, j \in \Omega_{\rm even}, \end{aligned} \tag{5.19}$$

∀i, j ∈ {1, 2, ..., N}, k ∈ Z, where Ωodd and Ωeven are nonempty sets chosen such that

$$\Omega_{\rm even} \cup \Omega_{\rm odd} = \{1, 2, \ldots, N\} \quad \text{and} \quad \Omega_{\rm even} \cap \Omega_{\rm odd} = \emptyset. \tag{5.20}$$

With these constraints, a vehicle in the set Ωodd cannot gather distance information about vehicles in the same set because their perturbation signals are correlated. Therefore a vehicle in the set Ωodd indirectly interacts with another vehicle in the same set by influencing vehicles in the set Ωeven. The same is true for vehicles in the set Ωeven.

Remark 5.2: Besides dealing with N vehicles converging to a Nash equilibrium,

the main difference between [60], where cos(ωt) and sin(ωt) were used as probing

signals, and this work is that we use components of Brownian motion on the unit

circle as a probing signal.

5.4 Stability Analysis

In this section, we present and prove local stability, in a specific probabilistic

sense, for a group of vehicles with the two excitation cases presented in Section 5.3.

We define an output error variable $e_i = \frac{h}{s+h}[J_i(t)] - f^*$, where $\frac{h}{s+h}$ is a low-pass filter applied to the cost $J_i$, which allows us to express ξi(t), the signal from the washout filter, as $\xi_i(t) = \frac{s}{s+h}[J_i(t)] = J_i(t) - f^* - e_i(t)$, noting also that $\dot{e}_i = h\xi_i$.

5.4.1 Case 1

Here we show stability for a group of fully actuated vehicles with control laws

(5.6)–(5.10) using the Case 1 perturbations.

Theorem 5.1 Consider the closed-loop system,

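$$dx_i = -\frac{a}{2\epsilon}\eta_{1i}\,dt - a\,\eta_{2i}\,dW_i + c_x\xi_i\eta_{1i}\,dt + \nu_{xi}\,dt, \tag{5.21}$$

$$dy_i = -\frac{a}{2\epsilon}\eta_{2i}\,dt + a\,\eta_{1i}\,dW_i + c_y\xi_i\eta_{2i}\,dt + \nu_{yi}\,dt, \tag{5.22}$$

$$de_i = h\xi_i\,dt, \tag{5.23}$$

$$d\eta_{1i} = -\frac{1}{2\epsilon}\eta_{1i}\,dt - \eta_{2i}\,dW_i, \tag{5.24}$$

$$d\eta_{2i} = -\frac{1}{2\epsilon}\eta_{2i}\,dt + \eta_{1i}\,dW_i, \tag{5.25}$$

$$\xi_i = q_x(x_i - x^*)^2 + q_y(y_i - y^*)^2 + \sum_{j=1}^{N} q_{ij}\,d_{ij}^2 - e_i, \tag{5.26}$$

∀i ∈ {1, 2, ..., N}, where the parameters νx, νy ∈ R^N, a, cx, cy, h, qx, qy > 0 and qij ≥ 0, ∀i, j ∈ {1, 2, ..., N}, and the signal Wi is a Brownian motion with Wi(0) = 0 and Wi(t) ≠ Wj(t) for i ≠ j. If the initial conditions are $[\eta_{1i}(0), \eta_{2i}(0)]^T = [\cos(\phi_i), \sin(\phi_i)]^T$ with ϕi ∈ R and x(0), y(0), e(0) are such that the quantities $|x_i(0) - x^* - x_i^{\rm eq}|$, $|y_i(0) - y^* - y_i^{\rm eq}|$, $|e_i(0) - e_i^{\rm eq}|$ are sufficiently small, where (x∗, y∗) is the minimizer of (5.2),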

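$$x^{\rm eq} = \frac{1}{c_x a}\,Q_x^{-1}\nu_x, \tag{5.27}$$

$$y^{\rm eq} = \frac{1}{c_y a}\,Q_y^{-1}\nu_y, \tag{5.28}$$

$$e_i^{\rm eq} = q_x\big(x_i^{\rm eq}\big)^2 + q_y\big(y_i^{\rm eq}\big)^2 + \frac{a^2}{2}(q_x + q_y) + \sum_{j \in N} q_{ij}\Big[\big(x_i^{\rm eq} - x_j^{\rm eq}\big)^2 + \big(y_i^{\rm eq} - y_j^{\rm eq}\big)^2\Big], \tag{5.29}$$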

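and the matrices Qx and Qy, given by

$$Q_{x,ij} = \begin{cases} -q_x - \sum_{k=1}^{N} q_{ik}, & i = j \\ q_{ij}, & i \ne j, \end{cases} \tag{5.30}$$

$$Q_{y,ij} = \begin{cases} -q_y - \sum_{k=1}^{N} q_{ik}, & i = j \\ q_{ij}, & i \ne j, \end{cases} \tag{5.31}$$

are invertible, then there exist constants Cx, Cy, γx, γy > 0 and a function T(ϵ) : (0, ϵ0) → N such that for any δ > 0

$$\lim_{\epsilon \to 0} \inf\left\{t \ge 0 : |x_i(t) - x^* - x_i^{\rm eq}| > C_x e^{-\gamma_x t} + \delta + a\right\} = \infty, \quad \text{a.s.}, \tag{5.32}$$

$$\lim_{\epsilon \to 0} \inf\left\{t \ge 0 : |y_i(t) - y^* - y_i^{\rm eq}| > C_y e^{-\gamma_y t} + \delta + a\right\} = \infty, \quad \text{a.s.}, \tag{5.33}$$

and

$$\lim_{\epsilon \to 0} P\left\{|x_i(t) - x^* - x_i^{\rm eq}| \le C_x e^{-\gamma_x t} + \delta + a,\ \forall t \in [0, T(\epsilon)]\right\} = 1, \tag{5.34}$$

$$\lim_{\epsilon \to 0} P\left\{|y_i(t) - y^* - y_i^{\rm eq}| \le C_y e^{-\gamma_y t} + \delta + a,\ \forall t \in [0, T(\epsilon)]\right\} = 1, \tag{5.35}$$

∀i ∈ {1, 2, ..., N}, with $\lim_{\epsilon \to 0} T(\epsilon) = \infty$. The constants Cx and Cy are dependent on both the initial condition (x(0), y(0), e(0)) and the parameters a, cx, cy, νx, νy, h, qx, qy. The constants γx, γy are dependent on the parameters a, cx, cy, νx, νy, h, qx, qy.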

Proof: We start by defining the error variables

$$\tilde{x} = x - x^* - a\eta_1, \tag{5.36}$$

$$\tilde{y} = y - y^* - a\eta_2, \tag{5.37}$$

and define [χ1(t), χ2(t)] = [η1(ϵt), η2(ϵt)]. Since

dx =dx− a

2χ1(t/ϵ) dt− χ2(t/ϵ) dW, (5.38)

dy =dy − a

2χ2(t/ϵ) dt+ χ1(t/ϵ) dW, (5.39)


we obtain the following dynamics for the error variables:

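$$\frac{d\tilde{x}}{dt} = c_x\,\xi\,\chi_1(t/\epsilon) + \nu_x, \tag{5.40}$$

$$\frac{d\tilde{y}}{dt} = c_y\,\xi\,\chi_2(t/\epsilon) + \nu_y, \tag{5.41}$$

$$\frac{de}{dt} = h\xi, \tag{5.42}$$

$$\begin{aligned} \xi_i = {} & q_x\big(\tilde{x}_i + a\chi_{1i}(t/\epsilon)\big)^2 + q_y\big(\tilde{y}_i + a\chi_{2i}(t/\epsilon)\big)^2 \\ & + \sum_{j \in N} q_{ij}\Big[\big(\tilde{x}_i + a\chi_{1i}(t/\epsilon) - \tilde{x}_j - a\chi_{1j}(t/\epsilon)\big)^2 + \big(\tilde{y}_i + a\chi_{2i}(t/\epsilon) - \tilde{y}_j - a\chi_{2j}(t/\epsilon)\big)^2\Big] - e_i, \end{aligned} \tag{5.43}$$

$$d\chi_1(t) = -\frac{1}{2}\chi_1(t)\,dt - \chi_2(t)\,dW, \tag{5.44}$$

$$d\chi_2(t) = -\frac{1}{2}\chi_2(t)\,dt + \chi_1(t)\,dW. \tag{5.45}$$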

We use general stochastic averaging given in Theorem 2 of [40] to analyze the

error system. We first calculate the average system of (5.40)–(5.42). The signals

χ1 and χ2 are both components of the Brownian motion on a unit circle, which is

known to be exponentially ergodic with invariant distribution $\mu(S) = \frac{l(S)}{2\pi}$ for any set $S \subset \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}$, where l(S) denotes the length (Lebesgue measure) of S [3]. The integral over the entire space of functions of Brownian motion on a unit circle can be reduced to the integral from 0 to 2π. Since

$$\int_{\mathbb{R}} \cos^{2k+1}(s)\,\mu(ds) = \int_0^{2\pi} \cos^{2k+1}(s)\,\frac{1}{2\pi}\,ds = 0, \tag{5.46}$$

$$\int_{\mathbb{R}} \cos^{2}(s)\,\mu(ds) = \int_0^{2\pi} \cos^{2}(s)\,\frac{1}{2\pi}\,ds = \frac{1}{2}, \tag{5.47}$$

$$\int_{\mathbb{R}}\int_{\mathbb{R}} \cos(s)\cos(r)\,\mu(ds)\,\mu(dr) = \int_0^{2\pi}\int_0^{2\pi} \cos(s)\cos(r)\,\frac{1}{4\pi^2}\,ds\,dr = 0, \tag{5.48}$$

(note that the same applies to the sine function) and

$$\int_{\mathbb{R}} \cos(s)\sin(s)\,\mu(ds) = \int_0^{2\pi} \cos(s)\sin(s)\,\frac{1}{2\pi}\,ds = 0, \tag{5.49}$$

we obtain the average error system

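$$\frac{d\tilde{x}^{\rm ave}}{dt} = c_x a\,Q_x\,\tilde{x}^{\rm ave} + \nu_x, \tag{5.50}$$

$$\frac{d\tilde{y}^{\rm ave}}{dt} = c_y a\,Q_y\,\tilde{y}^{\rm ave} + \nu_y, \tag{5.51}$$

$$\frac{de_i^{\rm ave}}{dt} = h\left(-e_i^{\rm ave} + q_x\big(\tilde{x}_i^{\rm ave}\big)^2 + q_y\big(\tilde{y}_i^{\rm ave}\big)^2 + \frac{a^2}{2}(q_x + q_y)\right) + h\sum_{j \in N} q_{ij}\Big[\big(\tilde{x}_i^{\rm ave} - \tilde{x}_j^{\rm ave}\big)^2 + \big(\tilde{y}_i^{\rm ave} - \tilde{y}_j^{\rm ave}\big)^2\Big]. \tag{5.52}$$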

Using the fact that Qx and Qy have the special form, shown in (5.30) and (5.31),

with Gershgorin Circle Theorem (Theorem 7.2.1 in [25]) we get that as long as

qx, qy > 0, the matrices Qx, Qy have eigenvalues that are all negative (i.e. they are

Hurwitz and invertible).
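The Gershgorin argument can be illustrated numerically. The sketch below builds a matrix with the structure of (5.30), taking the off-diagonal entries to be q_ij for i ≠ j and assuming q_ii = 0, for a chain of five vehicles with neighbor weights 0.5; the helper name `build_Q` and the value q_x = 1 are illustrative choices. Every Gershgorin disc is centered at −q_x − Σ_k q_ik with radius Σ_{j≠i} q_ij, so its rightmost point is −q_x.

```python
import numpy as np

def build_Q(q_map, weights):
    """Matrix with the structure of (5.30)/(5.31).

    Diagonal:     -q_map - sum_k q_ik
    Off-diagonal:  q_ij  (weights are assumed to have a zero diagonal)
    """
    W = np.asarray(weights, dtype=float)
    Q = W.copy()
    np.fill_diagonal(Q, -q_map - W.sum(axis=1))
    return Q

# Chain of N vehicles interacting only with their nearest-indexed neighbors.
N, qx = 5, 1.0
W = np.zeros((N, N))
for i in range(N - 1):
    W[i, i + 1] = W[i + 1, i] = 0.5

Qx = build_Q(qx, W)

# Gershgorin discs: center = diagonal entry, radius = off-diagonal row sum.
centers = np.diag(Qx)
radii = np.abs(Qx).sum(axis=1) - np.abs(centers)
gershgorin_upper = (centers + radii).max()

eig_max = np.linalg.eigvals(Qx).real.max()
print("rightmost point of all Gershgorin discs:", gershgorin_upper)
print("largest real part of the eigenvalues:   ", eig_max)
```

For this example both quantities equal −q_x, so the matrix is Hurwitz with a margin set by the unknown field curvature rather than by the interaction weights.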

The average error system has equilibria (5.27), (5.28), and (5.29) with the Jaco-

bian,

$$A = \begin{bmatrix} \dfrac{c_x a}{2\pi}\,Q_x & 0 & 0 \\ 0 & \dfrac{c_y a}{2\pi}\,Q_y & 0 \\ 0 & 0 & -hI \end{bmatrix}. \tag{5.53}$$

The matrices Qx and Qy are Hurwitz, which implies that A is Hurwitz and that the

equilibria (5.27), (5.28), and (5.29) are exponentially stable.

Using Theorem 2 in [40] there exist constants c > 0, r > 0, γ > 0 and functions

T (ϵ) : (0, ϵ0) → N, such that for any δ > 0, and any initial conditions |Λϵ(0)| < r,

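$$\lim_{\epsilon \to 0} \inf\left\{t \ge 0 : |\Lambda^\epsilon(t)| > c\,|\Lambda^\epsilon(0)|\,e^{-\gamma t} + \delta\right\} = \infty, \quad \text{a.s.}, \tag{5.54}$$

and

$$\lim_{\epsilon \to 0} P\left\{|\Lambda^\epsilon(t)| \le c\,|\Lambda^\epsilon(0)|\,e^{-\gamma t} + \delta,\ \forall t \in [0, T(\epsilon)]\right\} = 1, \tag{5.55}$$

with $\lim_{\epsilon \to 0} T(\epsilon) = \infty$, where

$$\Lambda^\epsilon(t) = \begin{bmatrix} \tilde{x} - x^{\rm eq} \\ \tilde{y} - y^{\rm eq} \\ e - e^{\rm eq} \end{bmatrix}. \tag{5.56}$$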


The results (5.54) and (5.55) state that the norm of the error vector Λϵ(t) expo-

nentially converges, both almost surely and in probability, to a point below an

arbitrarily small residual value δ over an arbitrarily long time interval, which tends

to infinity as ϵ goes to zero. In particular, each xi-component and yi-component

for all i ∈ 1, 2, . . . , N of the error vector converges to below δ, which gives us

(5.32)–(5.35).

5.4.2 Case 2

Here we show stability for a group of fully actuated vehicles with control laws

(5.6)–(5.10) using the Case 2 perturbations.

Theorem 5.2 Consider the closed-loop system (5.21)–(5.26) where the parameters

νx, νy ∈ RN , a, cx, cy, h, qx, qy > 0, and qij ≥ 0 ∀ i, j ∈ 1, 2, ..., N. If the initial

conditions are Wi(0) = 0, [η1i(0), η2i(0)]T = [cos(ϕi), sin(ϕi)]

T with ϕi chosen such

that (5.19)–(5.20) holds and x(0), y(0), e(0) are such that the following quantities

|xi(0)− x∗ − xeqi |, |yi(0)− y∗ − yeqi |, |ei(0)− eeq|, (5.57)

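are sufficiently small, where

$$\begin{bmatrix} x^{\rm eq} \\ y^{\rm eq} \end{bmatrix} = A_{xy}^{-1} \begin{bmatrix} \nu_x \\ \nu_y \end{bmatrix}, \tag{5.58}$$

$$e_i^{\rm eq} = q_x\big(x_i^{\rm eq}\big)^2 + q_y\big(y_i^{\rm eq}\big)^2 + \frac{a^2}{2}(q_x + q_y) + \sum_{j \in N} q_{ij}\Big[\big(x_i^{\rm eq} - x_j^{\rm eq}\big)^2 + \big(y_i^{\rm eq} - y_j^{\rm eq}\big)^2\Big], \tag{5.59}$$

and the matrix Axy is invertible and is given by

$$A_{xy} = \begin{bmatrix} c_x a\,(Q_\Omega - I q_x) & -c_x a\,Q_\Omega \\ -c_y a\,Q_\Omega & c_y a\,(Q_\Omega - I q_y) \end{bmatrix}, \tag{5.60}$$

$$Q_{\Omega,ij} = \begin{cases} -\sum_{k \in \Omega_{\rm odd}} q_{ik}, & i \in \Omega_{\rm even},\ i = j \\ -\sum_{k \in \Omega_{\rm even}} q_{ik}, & i \in \Omega_{\rm odd},\ i = j \\ q_{ij}, & i \in \Omega_{\rm even},\ j \in \Omega_{\rm odd} \\ q_{ij}, & i \in \Omega_{\rm odd},\ j \in \Omega_{\rm even} \\ 0, & \text{otherwise}, \end{cases} \tag{5.61}$$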

then there exist constants Cx, Cy, γx, γy > 0 and a function T (ϵ) : (0, ϵ0) → N such

that for any δ > 0

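$$\lim_{\epsilon \to 0} \inf\left\{t \ge 0 : |x_i(t) - x^* - x_i^{\rm eq}| > C_x e^{-\gamma_x t} + \delta + a\right\} = \infty, \quad \text{a.s.} \tag{5.62}$$

$$\lim_{\epsilon \to 0} \inf\left\{t \ge 0 : |y_i(t) - y^* - y_i^{\rm eq}| > C_y e^{-\gamma_y t} + \delta + a\right\} = \infty, \quad \text{a.s.} \tag{5.63}$$

and

$$\lim_{\epsilon \to 0} P\left\{|x_i(t) - x^* - x_i^{\rm eq}| \le C_x e^{-\gamma_x t} + \delta + a,\ \forall t \in [0, T(\epsilon)]\right\} = 1 \tag{5.64}$$

$$\lim_{\epsilon \to 0} P\left\{|y_i(t) - y^* - y_i^{\rm eq}| \le C_y e^{-\gamma_y t} + \delta + a,\ \forall t \in [0, T(\epsilon)]\right\} = 1 \tag{5.65}$$

∀i ∈ {1, 2, ..., N}, with $\lim_{\epsilon \to 0} T(\epsilon) = \infty$.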

Proof: Similar to the proof of Theorem 5.1, we start by applying (5.36), (5.37) and defining $[\chi_1(t), \chi_2(t)] = [\eta_1(\epsilon t), \eta_2(\epsilon t)]$. By employing stochastic averaging to compute the average system and then linearizing the average system about the equilibrium $[x^{\mathrm{eq}}, y^{\mathrm{eq}}, e^{\mathrm{eq}}]^T$, we obtain the Jacobian

\[ A = \begin{bmatrix} A_{xy} & 0 \\ 0 & -hI \end{bmatrix}, \tag{5.66} \]

which is block diagonal and Hurwitz since $A_{xy}$ and $-hI$ are both Hurwitz. By applying Theorem 2 in [40], as in the proof of Theorem 5.1, we obtain the results (5.62)–(5.65).
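The Hurwitz condition on $A_{xy}$ and the equilibrium (5.58) can be checked numerically for a given set of gains. The sketch below is only an illustration: it assumes that $\Omega_{\mathrm{even}}$ and $\Omega_{\mathrm{odd}}$ are the agents with even and odd (1-based) indices, and it borrows the five-vehicle gains quoted for the Case 2 simulation in Section 5.5. The function names are hypothetical.

```python
import numpy as np

def build_q_omega(q):
    """Assemble Q_Omega from the interaction gains q[i, j] following (5.61)."""
    n = q.shape[0]
    # ASSUMPTION: Omega_even / Omega_odd are agents with even / odd 1-based index.
    is_even = (np.arange(1, n + 1) % 2) == 0
    Q = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal: minus the summed gains to agents of opposite parity.
                Q[i, i] = -q[i, is_even != is_even[i]].sum()
            elif is_even[i] != is_even[j]:
                # Off-diagonal: gain between agents of opposite parity.
                Q[i, j] = q[i, j]
    return Q

def build_a_xy(Q, a, cx, cy, qx, qy):
    """Assemble the block matrix A_xy of (5.60)."""
    I = np.eye(Q.shape[0])
    return np.block([[cx * a * (Q - qx * I), -cx * a * Q],
                     [-cy * a * Q, cy * a * (Q - qy * I)]])

# Five vehicles in a chain, q_{i,i+1} = q_{i+1,i} = 0.5 (values from Section 5.5).
n = 5
q = np.zeros((n, n))
for i in range(n - 1):
    q[i, i + 1] = q[i + 1, i] = 0.5

# Stacked forcing vector [nu_x; nu_y]; only agents 1 and 5 are anchors.
nu = np.zeros(2 * n)
nu[0], nu[n] = -0.1, 0.1            # agent 1: (nu_x1, nu_y1)
nu[n - 1], nu[2 * n - 1] = 0.1, -0.1  # agent 5: (nu_x5, nu_y5)

A_xy = build_a_xy(build_q_omega(q), a=0.01, cx=150.0, cy=150.0, qx=1.0, qy=1.0)
max_real_part = np.linalg.eigvals(A_xy).real.max()  # negative if A_xy is Hurwitz
equilibrium = np.linalg.solve(A_xy, nu)             # [x_eq; y_eq] from (5.58)
x_eq, y_eq = equilibrium[:n], equilibrium[n:]
print("largest real part of eig(A_xy):", max_real_part)
print("x_eq:", x_eq)
print("y_eq:", y_eq)
```

For these gains the computed equilibrium satisfies $y^{\mathrm{eq}} = -x^{\mathrm{eq}}$, so the agents sit on a line through the source, which is in keeping with the line formation described for the Case 2 simulation.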


5.5 Simulation

In this section, we show numerical results for a group of vehicles with the control

scheme presented in Section 5.3. For the following simulations, without loss of

generality, we let the unknown location of the signal field be at the origin (x∗, y∗) =

(0, 0), and let the unknown signal field parameters be (qx, qy) = (1, 1).

In Figure 5.1 we consider 13 vehicles with Case 1 perturbations. We choose the

design parameters as a = 0.01, cx = cy = 150, h = 10, and define agents 1 through 6

as the anchor agents with the forcing terms,

\[ (\nu_{xi}, \nu_{yi}) = 0.05\left(\cos\left(\frac{i\pi}{3}\right), \sin\left(\frac{i\pi}{3}\right)\right), \tag{5.67} \]

where $i = 1, \ldots, 6$. In addition to the design parameters, we picked the interaction gains $q_{ij}$ such that

\[ \begin{cases} q_{i,i+1} = q_{i+1,i} = 0.5, & i \in \{1, \ldots, 12\},\ i \ne 6 \\ q_{i,13} = 0.5, & i \in \{7, \ldots, 12\} \\ q_{i,i-6} = q_{i-6,i} = 1, & i \in \{7, \ldots, 12\} \\ q_{i,j} = 0, & \text{otherwise} \end{cases}. \tag{5.68} \]

Figure 5.1 shows the ability of the control algorithm to produce a circular distribu-

tion around the source with a higher density of vehicles near the source. In this plot,

the trajectories of the vehicles are not shown to avoid obscuring the final vehicle

formation.

In Figure 5.2 we consider 5 vehicles with Case 2 constraints. We pick agent 1 and

agent 5 as the anchor agents with (νx1, νy1) = (−0.1, 0.1), (νx5, νy5) = (0.1,−0.1),

and choose the other design parameters to be the same as in the previous simulation.

We assume that each vehicle interacts with only the closest indexed agents with a

weighting of 0.5, i.e., qi,i+1 = qi+1,i = 0.5, i = 1, ..., 4. Figure 5.2 shows a line

formation centered at the source, with a higher density of agents near the source,

and generated by agents using a single Brownian motion signal.

Figure 5.1: A group of vehicles using the stochastic extremum seeking algorithm with Case 1 perturbations and interaction gains given by (5.68). The anchor agents are denoted by red triangles and the follower agents are denoted by blue dots. The agents start inside the dashed black line and converge to a circular formation around the source.

Illustrated in these figures is the effect of the forcing terms $(\nu_{xi}, \nu_{yi})$ assigned to the anchor agents. By carefully selecting these forcing terms, other geometric deployments can be made, which will be distorted by the signal field. For instance, if $\nu_{xi}$ and $\nu_{yi}$ in (5.67) were defined as

\[ \nu_{xi} = 0.05\left(a\cos\left(\frac{i\pi}{3}\right) - b\sin\left(\frac{i\pi}{3}\right)\right) \tag{5.69} \]

\[ \nu_{yi} = 0.05\left(a\cos\left(\frac{i\pi}{3}\right) + b\sin\left(\frac{i\pi}{3}\right)\right), \tag{5.70} \]

where $a$ is the semimajor axis and $b$ is the semiminor axis, an elliptical deployment will result.
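As a quick check of (5.69)–(5.70), the sketch below verifies that the six forcing vectors lie on an ellipse whose axes are rotated by 45 degrees and whose semi-axes are in the ratio $a : b$. The values $a = 2$, $b = 1$ and the function names are hypothetical; here $a$ denotes the semimajor axis as in (5.69)–(5.70), not the perturbation amplitude.

```python
import math

def elliptical_forcing(i, a, b, scale=0.05):
    """Forcing terms (nu_xi, nu_yi) of (5.69)-(5.70) for anchor agent i."""
    angle = i * math.pi / 3
    nu_x = scale * (a * math.cos(angle) - b * math.sin(angle))
    nu_y = scale * (a * math.cos(angle) + b * math.sin(angle))
    return nu_x, nu_y

def ellipse_residual(nu_x, nu_y, a, b, scale=0.05):
    """Zero when (nu_x, nu_y) lies on the 45-degree-rotated ellipse."""
    # Coordinates along the diagonals (1, 1)/sqrt(2) and (-1, 1)/sqrt(2).
    u = (nu_x + nu_y) / math.sqrt(2)   # equals scale*sqrt(2)*a*cos(angle)
    v = (nu_y - nu_x) / math.sqrt(2)   # equals scale*sqrt(2)*b*sin(angle)
    semi_u = scale * math.sqrt(2) * a
    semi_v = scale * math.sqrt(2) * b
    return (u / semi_u) ** 2 + (v / semi_v) ** 2 - 1.0

A_AXIS, B_AXIS = 2.0, 1.0  # hypothetical semimajor / semiminor values
residuals = [ellipse_residual(*elliptical_forcing(i, A_AXIS, B_AXIS), A_AXIS, B_AXIS)
             for i in range(1, 7)]
print(residuals)
```

The forcing terms are inputs to (5.58), so the final formation is the image of this ellipse under $A_{xy}^{-1}$ rather than the ellipse itself.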

5.6 Conclusion

In this chapter, we presented a stochastic extremum seeking algorithm for a group of agents, with two different constraints on the agents, to achieve stable deployment over a source. We presented a stability proof that shows convergence of the vehicles to a Nash equilibrium, both in the almost sure sense and in probability, when using two kinds of excitation signals. We showed simulation results for the control algorithm applied to agents on a static source.

Figure 5.2: A group of vehicles using the stochastic extremum seeking algorithm with Case 2 perturbations. The agents start inside the dashed black line and converge to a line formation centered around the source with the anchor agents at the ends of the line formation.

This chapter is in full a reprint of the material as it has been submitted to:

N. Ghods, P. Frihauf, and M. Krstic, “Multi-agent deployment in the plane using stochastic extremum seeking,” IEEE Conference on Decision and Control, 2010.


6 Light Source Seeking Experiments

In this chapter we consider the problem of seeking a light source with an au-

tonomous ground vehicle. The vehicle does not have the capability of sensing its

position or the position of the source but is capable of sensing the light signal orig-

inating from the light bulb. The light field created by the light bulb decays away

from the position of the light bulb but the vehicle does not have the knowledge of the

functional form of this field. We employ a control strategy that keeps the forward

velocity constant and tunes the angular velocity via extremum seeking. First, we

present the design for a light-seeking robot. We then present experimental results of a vehicle localizing a light source, tracking it as it moves, and tracing its level sets. We also present multiple vehicles seeking a light source while avoiding objects and each other.

6.1 Introduction

Research on applications of autonomous vehicles is wide, varied, and constantly growing. In particular, the field of research dealing with vehicles deprived

of position information is rapidly gaining interest. These vehicles must navigate and

perform a desired task without the use of GPS or inertial navigation. The vehicles

that we use in this work do not have lateral motion capabilities.

In this chapter we present experiments to support some of the theoretical and numerical results covered in [11, 12]. We employ a control scheme based on extremum seeking to control the heading of ground vehicles while keeping their forward speed constant. In [11], theoretical results for basic extremum seeking applied to the steering of autonomous vehicles are provided. In [12], the application of extremum seeking to vehicles with objectives and configurations different from those covered by the theory is presented.

Extremum seeking employs a periodic probing motion of the vehicle to search

the signal space, which then provides the necessary information to orient the vehicle

in the correct direction. There exist applications for which this probing motion is

undesirable, in which case extremum seeking can still be applied with a slight modification: decoupling the sensor from the body of the vehicle. In the experiments

presented here we modified the extremum seeking method to separate the desired

tuning of the vehicle orientation from the undesirable periodic probing. The concept

behind decoupled extremum seeking is that the sensor can move along the vehicle

body, providing the necessary probing motion, while the vehicle itself moves in a

smooth fashion. Implementing decoupled extremum seeking does not hinder the

vehicle’s capability of source seeking.

In Section 6.2 we provide the design of the Autonomous Nonholonomic Tracker

(ANT), which is used in the light-seeking experiments. Section 6.3 presents exper-

imental results for localizing a stationary light source, tracking a moving source,

tracing the level sets, and source seeking and collision avoidance with one or two

robots. We conclude this chapter with our future intentions in Section 6.4.

6.2 Vehicle Design

The basic vehicle configuration used for extremum seeking assumes the vehicle itself can readily perform the movement caused by the periodic perturbation used to search the space. In our case it is inefficient to have the entire robot move in these periodic probing motions, so we decouple the sensor from the body of the robot. The ANT was designed around the unicycle model with a decoupled sensor depicted in Figure 6.1. The key aspects in the design of the ANT were keeping the sensor at a distance $R$ from the center, keeping the axis of rotation of the vehicle at its center $r_c$, and having separate actuation to decouple the sensor sweeping ($\theta_s$) from the vehicle turning ($\theta$). Numerical validation of the decoupled unicycle model used for source seeking is discussed in [12].

Figure 6.1: Graphical interpretation of the unicycle model with a decoupled sensor. The red dot indicates the sensor's location. Diagram labels: $x$, $y$, $\theta$, $\theta_s$, $v$.

The ANT was assembled with two decks made of acrylic. As shown in Figure

6.2 the wireless communications, battery, steering servo, and the decoupling sensor

servo are housed in between the acrylic decks. The bottom of the lower deck houses

the steering gears and the driving servo. The top deck contains the light sensor arm

and circuit board.

The ANT uses two types of sensors on board: a light sensor and IR proximity sensors. The TAOS TSL14S-LF light sensor is placed at the tip of the sweeping sensor arm to provide light intensity readings. The light sensor output passes through a low-pass RC filter with a cutoff frequency of 10 Hz, built on a custom printed circuit board (PCB), to remove high-frequency noise. The ANT has two Sharp GP2D120XJ00F IR proximity sensors, located at the front left and front right of the robot, which help the robot detect and avoid obstacles.
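For reference, the cutoff of a first-order RC low-pass filter is $f_c = 1/(2\pi R C)$. The sketch below sizes a capacitor for the 10 Hz cutoff quoted above. The 16 kΩ resistor is a hypothetical value chosen only for illustration; the component values actually used on the ANT are not stated.

```python
import math

def rc_cutoff_hz(r_ohm, c_farad):
    """Cutoff frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

def capacitor_for_cutoff(r_ohm, f_c_hz):
    """Capacitance that gives cutoff f_c_hz with resistance r_ohm."""
    return 1.0 / (2.0 * math.pi * r_ohm * f_c_hz)

def first_order_gain(f_hz, f_c_hz):
    """Magnitude response of the filter at frequency f_hz."""
    return 1.0 / math.sqrt(1.0 + (f_hz / f_c_hz) ** 2)

R_OHM = 16e3                                   # HYPOTHETICAL resistor value
c_farad = capacitor_for_cutoff(R_OHM, 10.0)    # roughly 1 microfarad
print("C [F]:", c_farad)
print("check cutoff [Hz]:", rc_cutoff_hz(R_OHM, c_farad))
print("gain at 100 Hz:", first_order_gain(100.0, 10.0))
```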

The ANT uses two Hitec HS-85MG micro servos for locomotion, one for continuously moving forward and the other for steering. A Cirrus CS301 micro servo is used to provide the sweeping motion of the sensor arm. All servos are controlled by pulse width modulation (PWM) signals from the microprocessor. To power all the electronics on the ANT we use a Tenergy 2S-500-10 lithium polymer battery pack. A 5 V voltage regulator is used to maintain a consistent supply voltage to the electrical components.

Figure 6.2: ANT (a) top view (b) bottom view. Labeled components: light-sensor arm, battery, IR sensor, wireless communication, servo, PCB board, wheel, axle, bearing, steering gear, driving servo.

A custom designed PCB, shown in Figure 6.3, was created for the ANT. At the

core of the PCB is the dsPIC30F4012 microprocessor from Microchip. There are

six analog input channels and six PWM output channels on the 28-pin microproces-

sor, which allow for the addition of three more sensors in the future depending on

the application. The microprocessor on the PCB also connects through the MPR

connector to a wireless xBee communication module used for data collation.

Figure 6.3: CAD rendering of the PCB. Labeled components: sensor inputs, battery input, ON/OFF switch, MPR connection, programming connection, motor outputs, PIC microcontroller.

The control algorithm for the ANTs is given as

\[ v = v_0 \tag{6.1} \]

\[ \dot{\theta} = c \sin(\omega t)\, \frac{s}{s+h}[J] + d\,(IR_{\mathrm{left}} - IR_{\mathrm{right}}) \tag{6.2} \]

\[ \dot{\theta}_s = a\omega \cos(\omega t) \tag{6.3} \]

where $v$ commands the surge servo, $\dot{\theta}$ commands the steering servo, and $\dot{\theta}_s$ commands the decoupled sensor arm. The light sensor reading $J$ and the IR sensor readings $IR_{\mathrm{left}}$ and $IR_{\mathrm{right}}$ are the inputs to the control algorithm. The parameter $v_0$ is a constant that determines the forward speed of the robot. The extremum seeking parameters are $a$, $\omega$, $c$, and $h$, where $a$ is the probing amplitude, $\omega$ is the sinusoidal sweeping frequency, $c$ is the adaptive gain, and $h$ is the cutoff frequency of the washout filter. The $d$ term in (6.2), which acts as an obstacle avoidance gain, was added to give the robot the ability to avoid obstacles and other robots. The ANT was programmed with a digital version of the extremum seeking algorithm (6.1)–(6.3) in the MPLAB integrated development environment provided by Microchip.
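A discrete-time version of (6.1)–(6.3) can be sketched as follows. This is an illustrative forward-Euler simulation, not the firmware that ran on the ANT. It reads (6.2) and (6.3) as angular-rate commands, places the sensor at distance $R$ from the vehicle center at angle $\theta + \theta_s$, uses a hypothetical quadratic light field with its peak at the origin, and sets both IR readings to zero (no obstacles). All numerical values and function names are assumptions chosen for the example.

```python
import math

def light_field(x, y):
    """HYPOTHETICAL light intensity: a quadratic bowl peaking at the origin."""
    return -(x * x + y * y)

def simulate_ant(x=2.0, y=1.0, theta=0.0, v0=0.05, R=0.1, a=0.5,
                 omega=20.0, c=20.0, h=2.0, d=0.0, dt=1e-3, t_end=120.0):
    """Forward-Euler discretization of (6.1)-(6.3); returns the (x, y) path."""
    steps = int(round(t_end / dt))
    eta = None                      # low-pass state used to build the washout
    path = [(x, y)]
    for k in range(steps):
        t = k * dt
        # Sensor arm angle: integral of (6.3) with theta_s(0) = 0.
        theta_s = a * math.sin(omega * t)
        # ASSUMED geometry: sensor at distance R, angle theta + theta_s.
        sx = x + R * math.cos(theta + theta_s)
        sy = y + R * math.sin(theta + theta_s)
        J = light_field(sx, sy)
        if eta is None:
            eta = J                 # start the filter at the first reading
        xi = J - eta                # washout s/(s+h)[J] = J - h/(s+h)[J]
        eta += dt * h * (J - eta)
        ir_left = ir_right = 0.0    # no obstacles in this sketch
        theta_dot = c * math.sin(omega * t) * xi + d * (ir_left - ir_right)
        # Unicycle kinematics with constant forward speed (6.1).
        x += dt * v0 * math.cos(theta)
        y += dt * v0 * math.sin(theta)
        theta += dt * theta_dot
        path.append((x, y))
    return path

path = simulate_ant()
distances = [math.hypot(px, py) for px, py in path]
print("start distance:", distances[0])
print("closest approach:", min(distances))
print("final distance:", distances[-1])
```

Because the forward speed is constant, the simulated vehicle does not stop at the source; it approaches it and then keeps moving in its vicinity, which is the behavior reported in the experiments of Section 6.3.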

6.3 Experiment

In this section we show the extremum seeking method employed on the ANT.

The ANT is given the task of seeking the source or a level set produced by a light

source while avoiding obstacles. The ANT has no information about its position or

the position of the source. Similar to most mobile vehicles, the ANT has kinematic

constraints, which do not allow the robot to move sideways. Considering these

constraints, one of the advantages of the extremum seeking method is being able to

simultaneously solve a nonholonomic steering problem while also solving an adaptive

optimization problem. The experiments done in this section use one or two desk

lamps as the source and a table gridded with 0.15m (6.0in) squares to give a better

idea of relative distance as the vehicle moves around on the table.


6.3.1 Localization and Tracking of a Light Source

Here we show experimental results of the extremum seeking method to not only

localize a light source but also to track the light source once the source moves.

We designed an experiment to test how the algorithm would handle the worst case

scenario of moving sources, i.e., instantly moving from one location to another. The

experiment was done with two light bulbs. The experiment begins with one light

bulb turned on, then once the robot has converged to the light bulb it is turned

off and a second light bulb is turned on. From the perspective of the vehicle this

experiment emulates a source that can instantly move from one location to another.

The first two photos in Figure 6.4 show the ANT starting from a location away from

the light bulb and then quickly converging to it. Because the extremum seeking algorithm never stops searching, the vehicle continues to sniff around the light source. As shown in the last two photos of Figure 6.4, once the light source is switched, the vehicle starts converging to the new light source.

6.3.2 Level Set Tracking of a Light Source

Tracing out the curves along which the signal takes a specific value is a good way to gain more information about the signal field. Such a curve is referred to as a level set or isoline. A simple modification to the extremum seeking algorithm allows the ANT to perform level set tracing. In the tracking experiment the robot was trying to maximize the light intensity $J$ that it was measuring. In this experiment we modify (6.2) by replacing $J$ with the negative absolute value of the difference between the sensor reading $J$ and the desired level set value $J_d$. The steering control law, modified for level set tracing, becomes

\[ \dot{\theta} = c \sin(\omega t)\, \frac{s}{s+h}\left[-|J - J_d|\right] + d\,(IR_{\mathrm{left}} - IR_{\mathrm{right}}). \tag{6.4} \]
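The effect of the substitution in (6.4) can be seen on a one-dimensional slice of a field. In the sketch below the modified cost $-|J - J_d|$ is evaluated along a radial line of a hypothetical field $J(r) = 1/(1 + r^2)$; its maximum sits on the level set $J = J_d$ rather than at the source. The field and the value $J_d = 0.2$ are assumptions for illustration only.

```python
import math

def level_set_cost(J, J_d):
    """Cost fed to the extremum seeking loop for level set tracing, as in (6.4)."""
    return -abs(J - J_d)

def hypothetical_field(r):
    """HYPOTHETICAL light intensity at distance r from the source."""
    return 1.0 / (1.0 + r * r)

J_D = 0.2                                     # hypothetical desired level
radii = [k * 0.001 for k in range(5001)]      # 0 m to 5 m in 1 mm steps
best_r = max(radii, key=lambda r: level_set_cost(hypothetical_field(r), J_D))
expected_r = math.sqrt(1.0 / J_D - 1.0)       # radius where J(r) = J_d
print("radius maximizing the modified cost:", best_r)
print("radius of the level set J = J_d:", expected_r)
```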

For this experiment we hang the two lamps above the table to produce a peanut-shaped signal field. Figure 6.5 shows a sequence of pictures of the vehicle employing extremum seeking to trace a level set of the light source. The pictures are taken at fifteen second intervals. A marker attached to the bottom of the vehicle is used to draw the vehicle’s path as it performs the level set tracing. Figure 6.6 shows the test table after five minutes, where the ANT has traced out the level set two and a half times. The vehicle traced a peanut-like shape of approximately 45 in × 27 in (115 cm × 70 cm) with a maximum deviation of approximately 2 in (5 cm) between laps. From these pictures we can conclude that a vehicle employing extremum seeking can successfully perform level set tracing on a static unknown source given a desired signal intensity $J_d$.

Figure 6.4: Photographs of the ANT performing source seeking with overlaid trajectory, appearing in order from left to right, top to bottom.

Figure 6.5: Photographs of the ANT performing level set tracing at 15 sec intervals, appearing in order from left to right, top to bottom.

6.3.3 Collision Avoidance

In almost all applications of mobile vehicles collision avoidance is an important part of the task. Here we present three experiments that show the collision avoidance capabilities of the ANTs. The experimental setup is very similar to the setup in the level set tracing experiment, where the light sources were hung above the test table. Figure 6.7 shows a sequence of pictures of a red and a black ANT employing extremum seeking to track two light sources. The pictures are taken at ten second intervals. The first picture of Figure 6.7 shows the two desk lamps being used as sources as well as the starting position of the robots. The red and the black ANTs start next to each other, but once they are turned on they repel each other and head to two different light sources. The last picture in Figure 6.7 shows each robot settling to a different light source.

Figure 6.6: Picture of the testbed after the ANT had traced the level set several times.

A second experiment was done with one light source and some obstacles in the way of the robot. As shown in Figure 6.8, the robot avoids the two objects on its way to the light source. A final experiment was done to see how well the two robots can avoid each other while tracking one light source. Figure 6.9 shows how they avoided each other once they both arrived at the source. After some time the two robots settled into a small circular trajectory with the robots being at opposite ends.

6.4 Conclusion and Future Work

In this chapter we showed that extremum seeking applied to autonomous vehicles allows for the completion of a variety of tasks, such as source tracking, level set tracing, and multi-vehicle source seeking while avoiding collisions. In the future, we plan to experiment with multi-vehicle algorithms using methods similar to the ones presented in Chapters 4 and 5, but for nonholonomic vehicles. We also plan to investigate the application of extremum seeking to cooperative tracking of multiple targets.

Figure 6.7: Photographs of two ANTs performing source seeking in a field produced by two light sources at 10 sec intervals, appearing in order from left to right, top to bottom.

Figure 6.8: Photographs of the ANT performing obstacle avoidance while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.

Figure 6.9: Photographs of the ANTs avoiding each other while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.

7 Plume Source Seeking Experiments

Tracking a plume of chemical back to its source is made difficult by the com-

plexity of a plume structure caused by turbulence and shifts in the prevailing wind

direction. Insects overcome this problem using forms of anemotaxis, which involve

traveling upwind when an attractive chemical is perceived. We combine the method

of extremum seeking with the biologically inspired idea of traveling upwind to

achieve plume source localization. We create an apparatus that is able to produce

a wide range of plumes. We present experimental results of an autonomous vehicle

equipped with a smoke sensor and a wind direction sensor seeking the source of a

smoke plume.

7.1 Introduction

Tracking plumes to their source is a difficult task, as it is highly affected by the

turbulence of the media and by the sensitivity of the sensors to both the media

and other contaminants in the media. In general, most attempts at plume tracking

have used the “PC on board” philosophy. The assumption is that a great deal of

processing is required to extract enough data to track a plume, as the data used

by biological systems ([16, 17, 26]) may be quite detailed and subtle. Data ranging
from edge detection to gradient calculations might be used to track plumes.

In this chapter, we describe a robot implementing a simple algorithm. This algo-

rithm is based on a combination of extremum seeking and wind direction feedback,

and contains no explicit state or memory and no internal processing of sensory data.

The robot simply reacts to external environmental conditions. However, the robot

is capable of tracking an odor plume reliably upstream, and has a high success rate

from anywhere within the plume, and with any initial configuration. In Section 7.2,

we cover the construction of a testbed that allows the operator to control the smoke

concentration at the source and the wind speed. Section 7.3 shows the design and

assembly of a mobile robot with the capability of tracking a smoke plume source,

which we refer to as plume-bot. The experimental results are shown in Section 7.4.

We conclude this chapter with potential future work in Section 7.5.

7.2 Testbed Setup

This testbed consists of three main parts: a wind tunnel, a chamber with a

known smoke concentration, and a base station computer. The wind tunnel has two fans that control wind speed, which allows us to perform tests over a wide range of wind flow conditions. The smoke chamber allows us to produce a smoke source at the intake of the wind tunnel. The base station computer is used to control the fans and the smoke release and to record the status of the plume-bot during an experiment.

The wind tunnel has overall interior dimensions of 1.2 m wide by 2.4 m long and

0.33 m high. The entire tunnel was constructed using plywood, except for the top,

which needed to be clear acrylic in order for the vision system to track the position

of the plume-bot. To avoid muzzle turbulence, which would misrepresent natural

conditions in the tunnel, an intake was designed and constructed using standard

0.15 m long drinking straws stacked together to form a honeycomb structure. To

maximize intake flow, the honeycomb has the same cross-sectional dimensions as the

tunnel itself. The outlet section houses two 0.10 m diameter DC brushless fans that are attached to a 0.20 m wide by 0.25 m high tapered outlet. The fans pull the air through the system and force it through a 0.20 m diameter air duct that leads to the lab fume hood. An electronic ignition device and a smoke chamber are located at the intake, where the smoke can be released into the box. Ignition and fan speed

controls are provided through a micro-controller board with a serial RS232 interface

to the base station computer. Figure 7.1 shows the intake and the outlet of the

wind tunnel box. The clear acrylic is attached to a metal frame which hinges onto

the wind tunnel box. The hinged acrylic allows us to easily access the inside of the

wind tunnel box for placing and moving the plume-bot.

Figure 7.1: Wind tunnel: (a) the intake, (b) the outlet.

Creating an apparatus with a reliable and characterized smoke plume is the most difficult task of this testbed due to the complex nature of the plume. Characterizing our smoke plume allows us to understand how our system will work in similar environments outside our testbed and allows us to reliably compare the different experiments with each other. To characterize the smoke plume we first characterize the flow through the box. A good descriptor of the wind flow is the Reynolds number, which is given by

\[ Re = \frac{\rho u d_n}{\mu} \tag{7.1} \]

\[ d_n = \frac{4A}{p} \tag{7.2} \]

where $u$, $\rho$, $d_n$, $\mu$ are wind velocity, air density, hydraulic diameter of the tunnel, and dynamic viscosity of air, respectively. The hydraulic diameter is given in terms of the cross-sectional area $A$ and the perimeter $p$. For our case the formula simplifies to

\[ Re = \frac{2\rho u a b}{\mu (a + b)} \tag{7.3} \]

where $a$ and $b$ are the width and height of the box. The Reynolds number can be used to determine whether the flow is laminar, transitional, or turbulent. The flow is laminar when $Re \le 2300$, transitional when $2300 < Re < 4000$, and turbulent when $Re \ge 4000$. Given that air has a density of 1.205 kg/m³ and a dynamic viscosity of $1.983 \times 10^{-5}$ kg/(m·s) and that the box’s cross section is 1.2 m wide and 0.33 m high, (7.3) gives the Reynolds number in terms of the wind velocity alone,

\[ Re \approx 3.1 \times 10^{4}\, u, \tag{7.5} \]

with $u$ in m/s. By controlling the wind velocity we can produce all three types of flow. For example, for laminar flow we would keep the wind speed below about 0.07 m/s, and for turbulent flow we would set the wind speed above about 0.13 m/s.
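The following sketch evaluates (7.1)–(7.3) with the air properties and tunnel cross section quoted above, and classifies the flow using the stated thresholds. It is a direct arithmetic check; the function and variable names are hypothetical.

```python
def hydraulic_diameter(width, height):
    """Hydraulic diameter d_n = 4A/p of a rectangular duct, as in (7.2)."""
    area = width * height
    perimeter = 2.0 * (width + height)
    return 4.0 * area / perimeter

def reynolds_number(u, rho, mu, d_n):
    """Reynolds number of (7.1)."""
    return rho * u * d_n / mu

def flow_regime(re):
    """Classify the flow using the thresholds given in the text."""
    if re <= 2300.0:
        return "laminar"
    if re < 4000.0:
        return "transitional"
    return "turbulent"

RHO_AIR = 1.205        # kg/m^3
MU_AIR = 1.983e-5      # kg/(m s)
WIDTH, HEIGHT = 1.2, 0.33   # tunnel cross section in m

d_n = hydraulic_diameter(WIDTH, HEIGHT)
re_per_unit_speed = reynolds_number(1.0, RHO_AIR, MU_AIR, d_n)
u_laminar_max = 2300.0 / re_per_unit_speed      # highest speed with Re <= 2300
u_turbulent_min = 4000.0 / re_per_unit_speed    # lowest speed with Re >= 4000
print("hydraulic diameter [m]:", d_n)
print("Re per (m/s) of wind speed:", re_per_unit_speed)
print("laminar below [m/s]:", u_laminar_max)
print("turbulent above [m/s]:", u_turbulent_min)
print("regime at 1 m/s:", flow_regime(reynolds_number(1.0, RHO_AIR, MU_AIR, d_n)))
```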

The smoke chamber allows us to control the concentration and pressure of the

infused smoke released at the intake. The smoke chamber consists of a cylinder tube

with sealed ends, a pressure controlled inlet, a hot plate to create smoke particulates,

and an outlet hose that releases the smoke into the wind tunnel box. Figure 7.2

shows a picture of the smoke chamber. During each test a set amount of powder is

placed onto the hot plate igniter and the inlet pressure is set to be slightly above

the pressure inside the wind tunnel to allow the smoke to leak into the wind tunnel.

The base station computer interfaces with the control box, the plume-bot wireless

serial link, and the overhead video camera (mounted six feet above the apparatus).

A Matlab GUI running on the base station computer collects data and controls

the experiment. Matlab provides image processing tools that we use to locate a

bright light on the plume-bot and track its position as the plume-bot moves across

the camera’s field of vision. The Matlab GUI was used to collect data from the

camera, plume-bot, and the wind tunnel. Figure 7.3 shows a snapshot of the GUI

where the controls are on the top right, the real time video and vehicle trajectory

are on the bottom right, and the connection states to the plume-bot and the wind

tunnel are on the left.
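The bright-light tracking step can be illustrated with a simple threshold-and-centroid computation. The sketch below is a Python stand-in, not the Matlab code used on the base station, and the threshold, image size, and function name are assumptions.

```python
import numpy as np

def locate_bright_spot(frame, threshold):
    """Return the intensity-weighted (row, col) centroid of pixels >= threshold.

    frame is a 2-D grayscale image; returns None if no pixel is bright enough.
    """
    mask = frame >= threshold
    if not mask.any():
        return None
    rows, cols = np.nonzero(mask)
    weights = frame[rows, cols].astype(float)
    return (float(np.average(rows, weights=weights)),
            float(np.average(cols, weights=weights)))

# Synthetic 60 x 80 frame with a 3 x 3 bright blob standing in for the light.
frame = np.zeros((60, 80), dtype=np.uint8)
frame[20:23, 40:43] = 255
position = locate_bright_spot(frame, threshold=200)
print("bright spot at (row, col):", position)
```

Applying the same function to each video frame and converting pixel coordinates to meters would give a trajectory like the one plotted in the GUI; that calibration step is not shown here.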

Figure 7.2: Smoke chamber: (a) picture of the smoke chamber, (b) diagram of the smoke chamber. Labeled components: igniter, smoke output line, smoke powder bowl, dry air inlet.

7.3 Robot Design

In this section we discuss the design of the plume-bot. The plume-bot consists

of an acrylic frame with two in-line wheels and two side supports. The two in-line

wheels are both steered by a gear assembly and a radio controlled (RC) servo. The

rear wheel, which moves the vehicle forward, is turned by another servo modified for

continuous rotation. The side supports are each terminated with a single bearing

and serve to prevent the plume-bot from tipping. The plume-bot is shown in Figure 7.4.

At the core of the electronics system on the plume-bot is the plume-bot con-

troller board (shown in Figure 7.5). This custom-made printed circuit board (PCB)

is based upon an Atmel microcontroller and was designed as a general purpose tool

for controlling the vehicle hardware, interfacing with analog sensors, and communi-

cating via serial links with other devices or computers.

Figure 7.3: Matlab GUI used to run experiments. The GUI has communication states on the left, the test controls on the top right, and the real time plots on the bottom right.

Figure 7.4: Plume-bot: (a) picture of the plume-bot, (b) CAD model of the plume-bot.

Low cost, wireless telemetry at 9600 bps was obtained with a pair of 433 MHz RF transceivers from Parallax Inc. The link is unidirectional, with the base station receiving real-time data from the plume-bot. In order to facilitate modulation with the carrier wave for transmission, the data packets are prefixed and suffixed with symmetrical bit patterns. In addition, each data packet contains a packet ID and an error checksum. The standard data packet from the plume-bot consists of the current battery voltage and the current smoke sensor reading. Other packets may contain control parameters for debugging. The packet IDs are sequential and the base station software, upon missing a packet ID, will attempt to re-synchronize the connection.
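A packet layout of this kind can be sketched as follows. The actual byte format is not given, so everything here is hypothetical: the framing bytes, the field order and widths, the 8-bit additive checksum, and the function names are assumptions made only to illustrate framing, checksum validation, and detection of a skipped packet ID.

```python
import struct

PREAMBLE = b"\xAA\x55"    # HYPOTHETICAL prefix bit pattern
POSTAMBLE = b"\x55\xAA"   # HYPOTHETICAL suffix bit pattern
FRAME_LENGTH = 10         # 2 prefix + 5 body + 1 checksum + 2 suffix bytes

def encode_packet(packet_id, battery_mv, smoke_raw):
    """Frame one standard packet: ID, battery voltage, smoke reading."""
    body = struct.pack("<BHH", packet_id & 0xFF, battery_mv, smoke_raw)
    checksum = sum(body) & 0xFF            # simple additive checksum (assumed)
    return PREAMBLE + body + bytes([checksum]) + POSTAMBLE

def decode_packet(frame):
    """Return (packet_id, battery_mv, smoke_raw), or None if the frame is bad."""
    if (len(frame) != FRAME_LENGTH or not frame.startswith(PREAMBLE)
            or not frame.endswith(POSTAMBLE)):
        return None
    body, checksum = frame[2:7], frame[7]
    if sum(body) & 0xFF != checksum:
        return None
    return struct.unpack("<BHH", body)

def is_next_in_sequence(previous_id, new_id):
    """True if new_id directly follows previous_id (IDs wrap at 256)."""
    return new_id == (previous_id + 1) % 256

frame = encode_packet(7, 7400, 512)
decoded = decode_packet(frame)
corrupted = bytearray(frame)
corrupted[4] ^= 0xFF                       # flip one payload byte
print(decoded, decode_packet(bytes(corrupted)))
```

A receiver built on these helpers would trigger its re-synchronization routine whenever `decode_packet` returns `None` or `is_next_in_sequence` returns `False`.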

The plume-bot is equipped with a single compact optical smoke sensor that

allows the plume-bot to avoid colliding with the walls of the wind tunnel. The

smoke sensor, shown in Figure 7.6 (a), comes in a 46 × 30 × 18 mm package. The

smoke sensor outputs a voltage proportional to smoke density in the sensor’s opening

located in its center. The output voltage goes from 0 to 4 volts, which corresponds

to a dust density of 0 to 0.5 mg/m3, respectively. A circuit diagram of the optical

smoke sensor is shown in Figure 7.6 (b). The smoke sensor is mounted on a forward-facing arm that can be moved side to side with an RC servo. A 15 mm diameter fan is mounted in an acrylic box behind the sensor to force the air through the sensor’s opening and to prevent false readings from stagnant smoke in the particle sensor’s detection chamber.

Figure 7.5: Custom designed circuit board

Figure 7.6: Smoke sensor: (a) picture of the smoke sensor, (b) circuit diagram for particulate sensors.

The plume-bot is equipped with a novel wind direction sensor consisting of a pair of self-heated thermistor anemometers. The cooling effect of wind blowing over a thermistor causes its temperature to drop. A differential amplifier, shown in Figure 7.7, is used to amplify the voltage difference between the two thermistors. By placing the thermistors on the right and the left side of the plume-bot, the voltage output of the amplifier can be used to determine whether the plume-bot is facing with the wind or against it, i.e., giving the angle of attack. The wind sensor is calibrated to give 0 volts when the plume-bot is facing 90 degrees to the left of the wind flow, 2.5 volts when the plume-bot is facing upwind, and 5 volts when the plume-bot is facing 90 degrees to the right of the wind flow. The wind sensor does not produce any meaningful output when the plume-bot is facing downwind; therefore, the plume-bot’s initial heading in the experiments is always set within 90 degrees to the left or right of the wind flow.

Figure 7.7: Circuit diagram for wind sensors.
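The two sensor calibrations described above can be written as simple conversions. The smoke sensor mapping follows from the stated proportionality (0 to 4 V for 0 to 0.5 mg/m³). For the wind sensor only three calibration points are given, so the linear interpolation between them, the sign convention (positive to the right), and the function names are assumptions.

```python
def smoke_density_mg_m3(volts):
    """Smoke density from sensor voltage: 0 V -> 0, 4 V -> 0.5 mg/m^3."""
    clamped = min(max(volts, 0.0), 4.0)
    return clamped * 0.5 / 4.0

def wind_angle_deg(volts):
    """Heading relative to upwind, in degrees (positive = right of the wind).

    ASSUMPTION: linear between the calibration points 0 V (-90 deg),
    2.5 V (0 deg, facing upwind) and 5 V (+90 deg).
    """
    clamped = min(max(volts, 0.0), 5.0)
    return (clamped - 2.5) * 90.0 / 2.5

print(smoke_density_mg_m3(2.0), wind_angle_deg(3.75))
```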

The algorithm used on the plume-bot is a combination of extremum seeking and wind direction feedback. The extremum seeking algorithm tries to drive the plume-bot to the location of highest smoke concentration, while the wind feedback tries to make the plume-bot go upstream. The full control law consists of setting the forward velocity to a constant and the angular velocity ($\dot{\theta}$) to

\[ \dot{\theta} = a\omega \cos(\omega t) + \frac{s}{s+h}[\mu]\, \sin(\omega t) + p \sin(\phi) \tag{7.6} \]

where $\phi$ is the robot angle relative to the incoming wind, $\mu$ is the smoke sensor reading, $a$, $\omega$, and $h$ are extremum seeking parameters, and $p$ is the weighting on the wind feedback term. A block diagram of the entire system is shown in Figure 7.8. The addition of wind feedback to the extremum seeking algorithm was biologically inspired. Moths, for example, do not only search for the plume but also surge upwind [58].

Figure 7.8: Block diagram of the overall experiment. Blocks: plume (unknown function of the position), smoke sensor, wind sensor, controller, and vehicle dynamics.

7.4 Experiment Results

In this section we discuss the experimental procedure then show the results of

plume experiments. In this experiment the plume-bot searches for a smoke source

using two kinds of information: smoke concentration detected by the smoke sensors

and wind direction detected by the wind sensors. The basic strategy given in (7.6)

is to perform local search for a plume and to track it in the upwind direction.

Figure 7.9 shows a picture of the robot performing source seeking on the smoke plume. After tuning the parameters in the algorithm we started testing. Thirty tests were run for a wind speed of 1 m/s with the robot placed 1.8 m (6.0 ft) downstream and 0.61 m (2.0 ft) to the right of the source with a heading of 55 degrees to the right of the oncoming wind. The starting location was chosen as far as possible downstream and close to the edge of the smoke plume. Out of the thirty tests, twenty-one were successful, where success is defined as the smoke sensor on the robot coming within 0.15 m (6.0 in) of the smoke source. Figure 7.10 shows a plot of a successful run, where the plume-bot travels from the edge of the smoke plume to the source of the smoke plume within 35 sec. Tests with different wind speeds proved to have similar rates of success.

Figure 7.9: Picture of the plume-bot during a plume source seeking test

Twenty tests were performed without the wind sensor feedback. In these twenty tests the plume-bot reached the plume source only eight times. We speculate that the tests without wind sensor feedback failed more often because of the pockets of smoke that the plume-bot would encounter. Once the plume-bot met a pocket of smoke, it would try to follow the high concentration in the smoke pocket downstream.
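The two sets of trials can be compared numerically. The sketch below computes the success rates from the counts reported above (21 of 30 with wind feedback, 8 of 20 without) and, as an optional extra that is not part of the original analysis, a one-sided Fisher exact test of whether the difference could plausibly be due to chance. The function names are hypothetical.

```python
from math import comb

def success_rate(successes, trials):
    """Fraction of successful trials."""
    return successes / trials

def fisher_one_sided_p(s1, n1, s2, n2):
    """P(group 1 has >= s1 successes) if both groups shared one success rate.

    Uses the hypergeometric distribution with all margins held fixed.
    """
    total_success = s1 + s2
    total = n1 + n2
    upper = min(n1, total_success)
    tail = sum(comb(total_success, k) * comb(total - total_success, n1 - k)
               for k in range(s1, upper + 1))
    return tail / comb(total, n1)

rate_with_wind = success_rate(21, 30)
rate_without_wind = success_rate(8, 20)
p_value = fisher_one_sided_p(21, 30, 8, 20)
print("success rate with wind feedback:", rate_with_wind)
print("success rate without wind feedback:", rate_without_wind)
print("one-sided Fisher p-value:", p_value)
```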

7.5 Conclusion and Future Work

We showed experimentally that the extremum seeking algorithm with wind feedback found the source of a smoke plume in roughly two thirds of the trials (21 of 30). In the future we would like to use chemical sensors with slow sensor dynamics and implement the extremum seeking algorithm for slow sensors, discussed in Chapter 2, to perform source seeking of a chemical. We would also like to perform plume source localization in a more realistic, less controlled environment. Using multiple plume-bots could help increase the success rate.


Figure 7.10: A 35 sec trajectory of the plume-bot performing smoke plume localization in a wind tunnel with a rightward wind of 1 m/s. (Axes: X [m], Y [m]. Legend: vehicle trajectory, plume source, approximate plume edge, starting location.)

Appendix A

Stability Analysis

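Lemma A.1 Consider the following system

\[ w_\tau(\alpha,\tau) = k_1(\alpha)\, w_{\alpha\alpha}(\alpha,\tau) - k_2(\alpha)\, w(\alpha,\tau) \tag{A.1} \]

\[ z_\tau(\alpha,\tau) = -k_3(\alpha)\, z(\alpha,\tau) - k_4(\alpha)\, w(\alpha,\tau) \tag{A.2} \]

with boundary conditions

\[ w_\tau(0,\tau) = -k_2(0)\, w(0,\tau) \quad\text{and}\quad w_\tau(1,\tau) = -k_2(1)\, w(1,\tau), \tag{A.3} \]

where $k_1(\alpha)$, $k_2(\alpha)$, $k_3(\alpha)$, $k_4(\alpha)$ are strictly positive bounded functions, and $k_2(\alpha)$ satisfies $k_2'(\alpha) < \tfrac{1}{2} k_2(\alpha)$, $\forall \alpha \in [0,1]$. The system (A.1)–(A.3) is exponentially stable at the equilibrium $w = 0$, $z = 0$, i.e., there exist $M > 0$ and $\mu > 0$ such that for all $\tau > 0$,

\[ \Omega(\tau) \le M e^{-\mu\tau}\, \Omega(0), \tag{A.4} \]

where

\[ \Omega(\tau) = \int_0^1 w(\alpha,\tau)^2\, d\alpha + \int_0^1 w_\alpha(\alpha,\tau)^2\, d\alpha + w(0,\tau)^2 + \int_0^1 z(\alpha,\tau)^2\, d\alpha. \tag{A.5} \]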

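Proof: Let $V(\tau)$ be the Lyapunov functional,

\[ V(\tau) = \frac{m}{2}\int_0^1 w_\alpha(\alpha,\tau)^2\, d\alpha + \frac{m}{2}\, w(0,\tau)^2 + \frac{1}{2}\int_0^1 z(\alpha,\tau)^2\, d\alpha, \tag{A.6} \]

where $m$ is a positive scalar to be determined. Computing the derivative of $V(\tau)$ gives

\[ \dot V = m\int_0^1 w_{\tau\alpha}(\alpha,\tau)\, w_\alpha(\alpha,\tau)\, d\alpha + m\, w_\tau(0,\tau)\, w(0,\tau) + \int_0^1 z_\tau z\, d\alpha. \tag{A.7} \]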

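Integrating the first term by parts, we obtain

\[ \dot V = m\, w_\tau w_\alpha \big|_0^1 - m\int_0^1 w_\tau(\alpha,\tau)\, w_{\alpha\alpha}(\alpha,\tau)\, d\alpha + m\, w_\tau(0,\tau)\, w(0,\tau) + \int_0^1 z_\tau(\alpha,\tau)\, z(\alpha,\tau)\, d\alpha. \tag{A.8} \]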

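Substituting (A.1)–(A.3) yields

\[
\begin{aligned}
\dot V ={}& -m\, k_2(\alpha)\, w(\alpha,\tau)\, w_\alpha(\alpha,\tau) \big|_0^1 - m\int_0^1 k_1(\alpha)\, w_{\alpha\alpha}(\alpha,\tau)^2\, d\alpha \\
& + m\int_0^1 k_2(\alpha)\, w(\alpha,\tau)\, w_{\alpha\alpha}(\alpha,\tau)\, d\alpha - m\, k_2(0)\, w(0,\tau)^2 - \int_0^1 k_3(\alpha)\, z(\alpha,\tau)^2\, d\alpha \\
& - \int_0^1 k_4(\alpha)\, w(\alpha,\tau)\, z(\alpha,\tau)\, d\alpha.
\end{aligned}
\tag{A.9}
\]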

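The second term is nonpositive and can be dropped. Integrating the third term of (A.9) by parts, where the resulting boundary term cancels the first term of (A.9), gives

\[
\begin{aligned}
\dot V \le{}& -m\int_0^1 k_2(\alpha)\, w_\alpha(\alpha,\tau)^2\, d\alpha - m\, k_2(0)\, w(0,\tau)^2 \\
& - m\int_0^1 k_2'(\alpha)\, w(\alpha,\tau)\, w_\alpha(\alpha,\tau)\, d\alpha - \int_0^1 k_3(\alpha)\, z(\alpha,\tau)^2\, d\alpha \\
& - \int_0^1 k_4(\alpha)\, w(\alpha,\tau)\, z(\alpha,\tau)\, d\alpha.
\end{aligned}
\tag{A.10}
\]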

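We now bound $\dot V$ by applying the Cauchy-Schwarz and Young's inequalities to the third and last terms with the parameters $\theta_1, \theta_2 > 0$:

\[
\begin{aligned}
\dot V \le{}& -m\int_0^1 k_2(\alpha)\, w_\alpha(\alpha,\tau)^2\, d\alpha - m\, k_2(0)\, w(0,\tau)^2 - \int_0^1 k_3(\alpha)\, z(\alpha,\tau)^2\, d\alpha \\
& + \frac{1}{2\theta_1}\int_0^1 z(\alpha,\tau)^2\, d\alpha + \frac{m}{2\theta_2}\int_0^1 k_2'(\alpha)\, d\alpha \int_0^1 w_\alpha(\alpha,\tau)^2\, d\alpha \\
& + \frac{m}{2}\int_0^1 \left( \frac{\theta_1}{m}\, k_4^2(\alpha) + \theta_2\, k_2'(\alpha) \right) d\alpha \int_0^1 w(\alpha,\tau)^2\, d\alpha.
\end{aligned}
\tag{A.11}
\]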


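Applying the Poincaré inequality to the last term, which states

\[ \int_0^1 w(\alpha,\tau)^2\, d\alpha \le 2\, w(0,\tau)^2 + 4\int_0^1 w_\alpha(\alpha,\tau)^2\, d\alpha, \tag{A.12} \]

letting

\[ \underline{k}_2 = \min_{\alpha\in[0,1]} \left( k_2(\alpha) - 2k_2'(\alpha) \right), \tag{A.13} \]

\[ \underline{k}_3 = \min_{\alpha\in[0,1]} k_3(\alpha), \tag{A.14} \]

\[ \bar{k}_4 = \max_{\alpha\in[0,1]} k_4(\alpha), \tag{A.15} \]

and choosing $\theta_1 = 1/\underline{k}_3$ and $\theta_2 = 1/2$, we get

\[
\begin{aligned}
\dot V \le{}& -m\left( \underline{k}_2 - 2\,\frac{\bar{k}_4^2}{m\,\underline{k}_3} \right)\int_0^1 w_\alpha(\alpha,\tau)^2\, d\alpha - m\left( \underline{k}_2 - \frac{\bar{k}_4^2}{m\,\underline{k}_3} \right) w(0,\tau)^2 \\
& - \frac{\underline{k}_3}{2}\int_0^1 z(\alpha,\tau)^2\, d\alpha.
\end{aligned}
\tag{A.16}
\]

Selecting the analysis parameter $m = 4\,\bar{k}_4^2 / (\underline{k}_2\, \underline{k}_3)$, we find

\[
\begin{aligned}
\dot V \le{}& -\frac{m\mu}{2}\int_0^1 w_\alpha(\alpha,\tau)^2\, d\alpha - \frac{m\mu}{2}\, w(0,\tau)^2 - \frac{\mu}{2}\int_0^1 z(\alpha,\tau)^2\, d\alpha \\
\le{}& -\mu V,
\end{aligned}
\tag{A.17}
\]

where $\mu = \min(\underline{k}_2, \underline{k}_3)$. From the comparison lemma [36] and Lemma A.2, we have

\[ \Omega(\tau) \le \frac{1}{p_1} V(\tau) \le \frac{1}{p_1}\, e^{-\mu\tau}\, V(0) \le \frac{p_2}{p_1}\, e^{-\mu\tau}\, \Omega(0), \tag{A.18} \]

where $p_1 = \tfrac{1}{2}\min\left(\tfrac{m}{8}, 1\right)$ and $p_2 = \tfrac{1}{2}\max(m, 1)$. The result (A.4) is obtained from (A.18) with $M = p_2/p_1$.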

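Lemma A.2 There exist $p_1 > 0$ and $p_2 > 0$ such that

\[ p_1\, \Omega(\tau) \le V(\tau) \le p_2\, \Omega(\tau), \tag{A.19} \]

where $\Omega(\tau)$ and $V(\tau)$ are given in (A.5) and (A.6), respectively.

Proof: With $p_2 = \tfrac{1}{2}\max(m, 1)$, the right-hand inequality of (A.19) is immediate. Bounding $V(\tau)$ from below using the Poincaré inequality (A.12),

\[ V(\tau) \ge \frac{m}{4}\int_0^1 w_\alpha(\alpha,\tau)^2\, d\alpha + \frac{m}{16}\int_0^1 w(\alpha,\tau)^2\, d\alpha + \frac{3m}{8}\, w(0,\tau)^2 + \frac{1}{2}\int_0^1 z(\alpha,\tau)^2\, d\alpha, \tag{A.20} \]

we obtain the left-hand inequality of (A.19), with $p_1 = \tfrac{1}{2}\min\left(\tfrac{m}{8}, 1\right)$.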
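The Poincaré inequality (A.12) and the two-sided bound (A.19) can be spot-checked numerically. The following is a minimal sketch, not part of the proof: the test functions are random trigonometric polynomials, and the grid size, seeds, and values of m are arbitrary choices made here for illustration.

```python
import math
import random

N = 400  # number of quadrature subintervals on [0, 1] (arbitrary choice)
GRID = [i / N for i in range(N + 1)]

def integrate(values):
    """Composite trapezoid rule on the uniform grid over [0, 1]."""
    return (sum(values) - 0.5 * (values[0] + values[-1])) / N

def random_profile(rng, terms=4):
    """Return a random trigonometric polynomial and its exact derivative."""
    c0 = rng.uniform(-1.0, 1.0)
    coeffs = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(terms)]

    def f(a):
        return c0 + sum(s * math.sin(k * math.pi * a) + c * math.cos(k * math.pi * a)
                        for k, (s, c) in enumerate(coeffs, start=1))

    def df(a):
        return sum(k * math.pi * (s * math.cos(k * math.pi * a) - c * math.sin(k * math.pi * a))
                   for k, (s, c) in enumerate(coeffs, start=1))

    return f, df

def check_bounds(seed, m_values=(0.1, 1.0, 25.0)):
    """Check (A.12) once and (A.19) for each m, for one random pair (w, z)."""
    rng = random.Random(seed)
    w, w_alpha = random_profile(rng)
    z, _ = random_profile(rng)
    int_w2 = integrate([w(a) ** 2 for a in GRID])
    int_wa2 = integrate([w_alpha(a) ** 2 for a in GRID])
    int_z2 = integrate([z(a) ** 2 for a in GRID])
    w0_sq = w(0.0) ** 2
    results = [int_w2 <= 2.0 * w0_sq + 4.0 * int_wa2]           # (A.12)
    omega = int_w2 + int_wa2 + w0_sq + int_z2                    # (A.5)
    for m in m_values:
        V = 0.5 * m * int_wa2 + 0.5 * m * w0_sq + 0.5 * int_z2   # (A.6)
        p1 = 0.5 * min(m / 8.0, 1.0)
        p2 = 0.5 * max(m, 1.0)
        results.append(p1 * omega <= V <= p2 * omega)            # (A.19)
    return all(results)

all_ok = all(check_bounds(seed) for seed in range(10))
print("all sampled profiles satisfy (A.12) and (A.19):", all_ok)
```

A passing check on sampled functions does not replace the proof; it only guards against sign or constant errors when transcribing $p_1$ and $p_2$.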


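Lemma A.3 Consider the following system

\[ w_\tau(\alpha,\tau) = k_1(\alpha)\, w_{\alpha\alpha}(\alpha,\tau) - k_2(\alpha)\, w(\alpha,\tau) \tag{A.21} \]

\[ z_\tau(\alpha,\tau) = -k_3(\alpha)\, z(\alpha,\tau) - k_4(\alpha)\, w(\alpha,\tau) \tag{A.22} \]

with boundary conditions $w(0,\tau) = 0$, $w(1,\tau) = 0$, where $k_1(\alpha)$, $k_2(\alpha)$, $k_3(\alpha)$, and $k_4(\alpha)$ are strictly positive bounded functions $\forall \alpha \in [0,1]$. The system (A.21)–(A.22) is exponentially stable at the equilibrium $w = 0$, $z = 0$, i.e., there exists $\mu > 0$ such that for all $\tau > 0$,

\[ V(\tau) \le e^{-\mu\tau}\, V(0), \tag{A.23} \]

where $V(\tau) = \frac{1}{2}\int_0^1 \frac{m\, w(\alpha,\tau)^2}{k_1(\alpha)}\, d\alpha + \frac{1}{2}\int_0^1 z(\alpha,\tau)^2\, d\alpha$ and $m > 0$ is given in the proof.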

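Proof: Computing the derivative of $V$ gives

\[ \dot V = \int_0^1 \frac{m}{k_1(\alpha)}\, w_\tau(\alpha,\tau)\, w(\alpha,\tau)\, d\alpha + \int_0^1 z_\tau(\alpha,\tau)\, z(\alpha,\tau)\, d\alpha. \tag{A.24} \]

Substituting (A.21) and (A.22), we obtain

\begin{align}
\dot V ={}& m\int_0^1 w_{\alpha\alpha}(\alpha,\tau)\, w(\alpha,\tau)\, d\alpha - m\int_0^1 \frac{k_2(\alpha)}{k_1(\alpha)}\, w(\alpha,\tau)^2\, d\alpha \tag{A.26} \\
& - \int_0^1 k_3(\alpha)\, z(\alpha,\tau)^2\, d\alpha - \int_0^1 k_4(\alpha)\, w(\alpha,\tau)\, z(\alpha,\tau)\, d\alpha. \tag{A.27}
\end{align}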

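Integrating the first term by parts and applying the Cauchy-Schwarz and Young's inequalities with the parameter $\theta > 0$ to the last term, we get

\[
\begin{aligned}
\dot V \le{}& m\, w(\alpha,\tau)\, w_\alpha(\alpha,\tau) \big|_0^1 - m\int_0^1 w_\alpha(\alpha,\tau)^2\, d\alpha - m\int_0^1 \frac{k_2(\alpha)}{k_1(\alpha)}\, w(\alpha,\tau)^2\, d\alpha \\
& - \int_0^1 k_3(\alpha)\, z(\alpha,\tau)^2\, d\alpha + \int_0^1 \frac{\theta\, k_4^2(\alpha)}{2}\, w(\alpha,\tau)^2\, d\alpha + \int_0^1 \frac{1}{2\theta}\, z(\alpha,\tau)^2\, d\alpha.
\end{aligned}
\tag{A.28}
\]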

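Given the boundary conditions, the first term is zero. The second term is nonpositive and can be dropped. Combining the common terms, we get

\[
\begin{aligned}
\dot V \le{}& -\int_0^1 \frac{m}{k_1(\alpha)} \left( k_2(\alpha) - \frac{\theta\, k_1(\alpha)\, k_4^2(\alpha)}{2m} \right) w(\alpha,\tau)^2\, d\alpha \\
& - \int_0^1 \left( k_3(\alpha) - \frac{1}{2\theta} \right) z(\alpha,\tau)^2\, d\alpha.
\end{aligned}
\tag{A.29}
\]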


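Letting

\[ \bar{k}_1 = \max_{\alpha\in[0,1]} k_1(\alpha), \tag{A.30} \]

\[ \underline{k}_2 = \min_{\alpha\in[0,1]} k_2(\alpha), \tag{A.31} \]

\[ \underline{k}_3 = \min_{\alpha\in[0,1]} k_3(\alpha), \tag{A.32} \]

\[ \bar{k}_4 = \max_{\alpha\in[0,1]} k_4(\alpha), \tag{A.33} \]

and choosing $\theta = 1/\underline{k}_3$ and $m = \bar{k}_1 \bar{k}_4^2 / (\underline{k}_2\, \underline{k}_3)$, we get

\[
\begin{aligned}
\dot V \le{}& -\frac{m}{2}\, \underline{k}_2 \int_0^1 \frac{w(\alpha,\tau)^2}{k_1(\alpha)}\, d\alpha - \frac{1}{2}\, \underline{k}_3 \int_0^1 z(\alpha,\tau)^2\, d\alpha \\
\le{}& -\mu V,
\end{aligned}
\tag{A.34}
\]

where $\mu = \min(\underline{k}_2, \underline{k}_3)$. By solving (A.34) for $V(\tau)$ we get (A.23).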
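The decay estimate (A.23) can be illustrated with a small finite-difference simulation of (A.21)–(A.22) under the Dirichlet conditions $w(0,\tau) = w(1,\tau) = 0$. This is a sketch under stated assumptions rather than a verification of the lemma: the coefficient functions, initial profiles, grid, and explicit Euler step are hypothetical choices, and $m$ is taken as $\max k_1 \cdot (\max k_4)^2 / (\min k_2 \cdot \min k_3)$, the value for which the first bracket in (A.29) is at least $k_2(\alpha)/2$ when $\theta = 1/\min k_3$.

```python
import math

N = 50                       # number of grid intervals on [0, 1] (assumed)
d_alpha = 1.0 / N
d_tau = 1e-3                 # explicit Euler step, below d_alpha**2 / (2 * max k1)
alphas = [i / N for i in range(N + 1)]

# Hypothetical coefficient functions, strictly positive on [0, 1].
k1 = [0.05 + 0.05 * a for a in alphas]
k2 = [1.0 + 0.5 * math.sin(math.pi * a) for a in alphas]
k3 = [2.0 + a for a in alphas]
k4 = [1.0 + a for a in alphas]

# Constants used in the proof (assumed form of m, see the lead-in).
k1_max, k2_min, k3_min, k4_max = max(k1), min(k2), min(k3), max(k4)
m = k1_max * k4_max ** 2 / (k2_min * k3_min)
mu = min(k2_min, k3_min)

def lyapunov(w, z):
    """Trapezoid approximation of V = 1/2 int m w^2 / k1 + 1/2 int z^2."""
    f = [m * wi ** 2 / k1i + zi ** 2 for wi, zi, k1i in zip(w, z, k1)]
    return 0.5 * d_alpha * (sum(f) - 0.5 * (f[0] + f[-1]))

def step(w, z):
    """One explicit Euler step of (A.21)-(A.22) with w = 0 at both ends."""
    new_w = [0.0] * (N + 1)
    for i in range(1, N):
        w_aa = (w[i - 1] - 2.0 * w[i] + w[i + 1]) / d_alpha ** 2
        new_w[i] = w[i] + d_tau * (k1[i] * w_aa - k2[i] * w[i])
    new_z = [zi + d_tau * (-k3i * zi - k4i * wi)
             for zi, wi, k3i, k4i in zip(z, w, k3, k4)]
    return new_w, new_z

# Hypothetical initial profiles; w satisfies the Dirichlet conditions.
w = [math.sin(math.pi * a) for a in alphas]
w[0], w[-1] = 0.0, 0.0
z = [math.cos(2.0 * math.pi * a) for a in alphas]

V0 = lyapunov(w, z)
bound_holds = True
tau = 0.0
for _ in range(2000):        # simulate up to tau = 2
    w, z = step(w, z)
    tau += d_tau
    if lyapunov(w, z) > math.exp(-mu * tau) * V0 * (1.0 + 1e-9):
        bound_holds = False

print(f"m = {m:.3f}, mu = {mu:.3f}, V(0) = {V0:.4f}, V(2) = {lyapunov(w, z):.3e}")
print("V(tau) <= exp(-mu * tau) * V(0) along the run:", bound_holds)
```

In runs of this kind the simulated $V$ typically decays faster than the bound, since the proof discards the diffusion term $-m\int_0^1 w_\alpha^2\, d\alpha$.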

Appendix B

Averaging in Infinite Dimensions

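We rewrite the system as

\[ \dot u = Au + F(t/\epsilon, u) \tag{B.1} \]

with $\epsilon = 1/\omega$. For the PDE system (4.14)–(4.18) in Chapter 4 Section 3 with dynamic boundary conditions, we introduce a system of the form (B.1) with $u = (x, e, x_l, x_r)^T$ by defining its linear operator as

\[ A = \begin{pmatrix} A_0 & 0 & 0 & 0 \\ 0 & L & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{B.2} \]

\[ D(A) = \left\{ u \in D(A_0) \times L^2(0,1) \times \mathbb{R}^2 \;\middle|\; B_l x = x_l \text{ and } B_r x = x_r \right\}. \tag{B.3} \]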

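The linear operator $A_0$ is defined as

\[ A_0 f(\alpha) = \kappa(\alpha)\, \frac{d^2 f(\alpha)}{d\alpha^2}, \tag{B.4} \]

with the domain

\[ D(A_0) = \left\{ f(\alpha) \in L^2(0,1) : f(\alpha) \text{ and } \frac{df(\alpha)}{d\alpha} \text{ are abs. cont.},\ \frac{d^2 f(\alpha)}{d\alpha^2} \in L^2(0,1) \right\}, \tag{B.5} \]

and the linear operator $L$ is defined as

\[ L f(\alpha) = -h(\alpha)\, f(\alpha) \tag{B.6} \]

with the domain $D(L) = L^2(0,1)$. The linear operators $B_l$ and $B_r$ are defined as

\begin{align}
B_l f(\alpha) &= f(0) \tag{B.7} \\
B_r f(\alpha) &= f(1) \tag{B.8} \\
D(B_l) &= \left\{ f(\alpha) \in L^2(0,1) : f(\alpha) \text{ is abs. cont.} \right\} \tag{B.9} \\
D(B_r) &= \left\{ f(\alpha) \in L^2(0,1) : f(\alpha) \text{ is abs. cont.} \right\}. \tag{B.10}
\end{align}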

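The nonlinear operator $F = (F_1, F_2, F_3, F_4)^T$ is defined with $\epsilon = 1/\omega$ and

\begin{align}
F_1(\omega t, x, e)(\alpha) &= \kappa(\alpha)\, a''(\alpha) \sin(\omega t) - c(\alpha)\, \xi(\omega t, x, e)(\alpha) \sin(\omega t) \tag{B.11} \\
F_2(\omega t, x, e)(\alpha) &= -q\, (x(\alpha) + a(\alpha) \sin(\omega t))^2 \tag{B.12} \\
F_3(\omega t, x, e) &= -\nu + \kappa(0)\, a''(0) \sin(\omega t) - c(0)\, \xi(\omega t, x, e)(0) \sin(\omega t) \tag{B.13} \\
F_4(\omega t, x, e) &= \nu + \kappa(1)\, a''(1) \sin(\omega t) - c(1)\, \xi(\omega t, x, e)(1) \sin(\omega t) \tag{B.14} \\
\xi(\omega t, x, e)(\alpha) &= -q\, (x(\alpha) + a(\alpha) \sin(\omega t))^2 - e(\alpha). \tag{B.15}
\end{align}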

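Similarly, for the PDE system (4.14), (4.15), (4.16), (4.59) in Chapter 4 Section 4 with homogeneous Dirichlet boundary conditions, we define the operator $A$ by

\[ A = \begin{pmatrix} A_0 & 0 \\ 0 & L \end{pmatrix} \tag{B.16} \]

\[ D(A) = \left\{ \begin{pmatrix} x \\ e \end{pmatrix} \in D(A_0) \times L^2(0,1) \;\middle|\; B_l x = 0 \text{ and } B_r x = 0 \right\}, \tag{B.17} \]

with the nonlinearity $F = (F_1, F_2)^T$.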

To use Theorem 3.6 in [27], the system (B.1) must satisfy the following assumptions:

• $F$ is almost periodic and satisfies the smoothness conditions from Section 2 of [27] (continuously differentiable). Both conditions are trivially satisfied for (B.11)–(B.15).

• The linear operator $A$, which generates a semigroup $T_A(t)$ with $\|T_A(t)\| \le M e^{kt}$ for some positive $M$ and $k$, must satisfy hypothesis (H) given in [27]: if $h : [s, \infty) \to X$ is norm-continuous, then

(i) $\int_s^t T_A(t-\tau)\, h(\tau)\, d\tau \in D(A)$, for $s \le t$;

(ii) $\left\| A \int_s^t T_A(t-\tau)\, h(\tau)\, d\tau \right\| \le M e^{kt} \sup_{s \le \tau \le t} \|h(\tau)\|$, for $s \le t$.

It is a routine extension of known results [15] that, for both (B.2) and (B.16), $A$ generates an analytic semigroup and that properties (i) and (ii) in hypothesis (H) hold. Hence the conditions of [27] are satisfied and Theorems 1 and 2 in Chapter 4 follow.
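To make the role of averaging concrete, the period averages of the interior terms (B.11) and (B.12) can be computed at a single point $\alpha$ with $x$ and $e$ held fixed. The sketch below is illustrative only: the numerical values of $q$, $a$, $c$, $\kappa a''$, $x$, and $e$ are hypothetical, and the closed forms it compares against, $c\,q\,a\,x$ for $F_1$ and $-q\,(x^2 + a^2/2)$ for $F_2$, follow from the elementary averages of $\sin s$, $\sin^2 s$, and $\sin^3 s$ over one period.

```python
import math

def period_average(f, samples=2000):
    """Average f(s) over one period s in [0, 2*pi) using the midpoint rule."""
    h = 2.0 * math.pi / samples
    return sum(f((i + 0.5) * h) for i in range(samples)) / samples

# Hypothetical pointwise values at one fixed alpha (not taken from the thesis).
q, a, c = 1.5, 0.2, 0.8
kappa_a2 = 0.3      # stands for kappa(alpha) * a''(alpha)
x, e = 0.7, -0.4

def xi(s):
    """xi from (B.15) at the fixed alpha, with s = omega * t."""
    return -q * (x + a * math.sin(s)) ** 2 - e

def F1(s):
    """F1 from (B.11) at the fixed alpha."""
    return kappa_a2 * math.sin(s) - c * xi(s) * math.sin(s)

def F2(s):
    """F2 from (B.12) at the fixed alpha."""
    return -q * (x + a * math.sin(s)) ** 2

F1_avg = period_average(F1)
F2_avg = period_average(F2)
print(f"average of F1: {F1_avg:.6f}   closed form c*q*a*x: {c * q * a * x:.6f}")
print(f"average of F2: {F2_avg:.6f}   closed form -q*(x^2 + a^2/2): {-q * (x**2 + a**2 / 2):.6f}")
```

The averaged $F_1$ is linear in $x$ and independent of $e$ and of $\kappa a''$, since those contributions multiply $\sin(\omega t)$ alone and average to zero.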

Bibliography

[1] V. Adetola and M. Guay, “Parameter convergence in adaptive extremum-seeking control,” Automatica, vol. 43, no. 1, pp. 105–110, 2007.

[2] K. B. Ariyur and M. Krstic, Real Time Optimization by Extremum Seeking Control. Wiley-Interscience, 2003.

[3] J. R. Baxter and G. A. Brosamler, “Energy and the law of iterated logarithm,” Mathematica Scandinavica, vol. 38, pp. 115–136, 1976.

[4] R. Becker, R. King, R. Petz, and W. Nitsche, “Adaptive closed-loop separation control on a high-lift configuration using extremum seeking,” AIAA, vol. 45, no. 6, p. 1382, 2007.

[5] J. Belanger and E. Arbas, “Behavioral strategies underlying pheromone-modulated flight in moths: lessons from simulation studies,” Journal of Comparative Physiology A: Sensory, Neural, and Behavioral Physiology, vol. 183, no. 3, pp. 345–360, 1998.

[6] H. Berg, E Coli in Motion. Springer New York, 2003.

[7] H. Berg and D. A. Brown, “Chemotaxis in E. coli analyzed by three-dimensional tracking,” Nature, vol. 239, pp. 500–504, 1972.

[8] E. Biyik and M. Arcak, “Gradient climbing in formation via extremum-seeking and passivity-based coordination rules,” Asian J. Control: Special Issue on “Collective Behavior and Control of Multi-Agent Systems”, vol. 10, no. 2, pp. 201–211, March 2008.

[9] R. Carli and F. Bullo, “Quantized coordination algorithms for rendezvous and deployment,” SIAM J. Control Optim., vol. 48, no. 3, pp. 1251–1274, 2009.

[10] C. Centioli, F. Iannone, G. Mazza, M. Panella, L. Pangione, S. Podda, A. Tuccillo, V. Vitale, and L. Zaccarian, “Maximization of the lower hybrid power coupling in the Frascati Tokamak Upgrade via extremum seeking,” Control Engineering Practice, vol. 16, no. 12, pp. 1468–1478, 2008.


[11] J. Cochran and M. Krstic, “Nonholonomic source seeking with tuning of angular velocity,” IEEE Transactions on Automatic Control, vol. 54, pp. 717–731, 2009.

[12] J. Cochran, A. Siranosian, N. Ghods, and M. Krstic, “Source seeking with nonholonomic unicycle without position measurements and with tuning of angular velocity part II: Applications,” IEEE Conference on Decision and Control, 2007.

[13] J. Cochran, A. Siranosian, N. Ghods, and M. Krstic, “3D source seeking for underactuated vehicles without position measurement,” IEEE Transactions on Robotics, pp. 117–129, 2009.

[14] J. Cortes, S. Martínez, T. Karatas, and F. Bullo, “Coverage control for mobile sensing networks,” IEEE Transactions on Robotics and Automation, vol. 20, no. 2, pp. 243–255, 2004.

[15] R. F. Curtain and H. J. Zwart, An introduction to infinite-dimensional linear systems theory. Springer-Verlag, New York, 1995.

[16] K. Dittmer, F. Grasso, and J. Atema, “Effects of varying plume turbulence on temporal concentration signals available to orienting lobsters,” Biological Bulletin, pp. 232–233, 1995.

[17] ——, “Obstacles to flow produce distinctive patterns of odor dispersal on a scale that could be detected by marine animals,” Biological Bulletin, pp. 313–314, 1996.

[18] G. Ferrari-Trecate, A. Buffa, and M. Gati, “Analysis of coordination in multi-agent systems through partial difference equations,” IEEE Transactions on Automatic Control, vol. 51, no. 6, pp. 1058–1063, 2006.

[19] Technical information for TGS2620 data sheet, revised 03/05 ed., Figaro Engineering Inc.

[20] A. Fort, M. Mugnaini, S. Rocchi, V. V. M.B. Serrano-Santos, and R. Spinicci, “Surface state model for conductance responses during thermal-modulation of SnO2-based thick film sensors. Part I. Model derivation,” IEEE Trans. Instr. Meas., 2006.

[21] ——, “Surface state model for conductance responses during thermal-modulation of SnO2-based thick film sensors. Part II. Experimental verification,” IEEE Trans. Instr. Meas., 2006.

[22] A. Fort, M. Mugnaini, S. Rocchi, M. Serrano-Santos, V. Vignoli, and R. Spinicci, “Simplified models for SnO2 sensors during chemical and thermal transients in mixtures of inert, oxidizing and reducing gases,” Sensors and Actuators B: Chemical, vol. 124, no. 1, pp. 245–259, 2007.


[23] P. Frihauf and M. Krstic, “Leader-enabled deployment into planar curves,” IEEE Transactions on Automatic Control, submitted.

[24] N. Ghods and M. Krstic, “Multi-agent deployment over a source,” submitted to IEEE Transactions on Control Systems Technology.

[25] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed. Baltimore, MD: The Johns Hopkins University Press, 1996.

[26] F. Grasso, T. Consi, D. Mountain, and J. Atema, “Behavior of purely chemotactic robot lobster reveals different odor dispersal patterns in the jet region and the patch field of a turbulent plume,” Biological Bulletin, pp. 312–313, 1996.

[27] J. Hale and S. V. Lunel, “Averaging in infinite dimensions,” Integral Equations Appl., vol. 2, no. 4, pp. 463–494, 1990.

[28] H. Ishida, T. Nakamoto, T. Moriizumi, T. Kikas, and J. Janata, “Plume-tracking robots: A new application of chemical sensors,” Biological Bulletin, vol. 200, pp. 222–226, 2001.

[29] H. Ishida, G. Nakayama, T. Nakamoto, and T. Moriizumi, “Odor-source localization in the clean room by an autonomous mobile sensing system,” Sensors and Actuators B: Chemical, vol. 33, no. 1-3, pp. 115–121, 1996, Eurosensors IX.

[30] ——, “Controlling a gas/odor plume-tracking robot based on transient responses of gas sensors,” Sens., Proceedings of IEEE, vol. 2, pp. 1665–1670, 2002.

[31] A. Jadbabaie, J. Lin, and A. S. Morse, “Coordination of groups of mobile autonomous agents using nearest neighbor rules,” IEEE Transactions on Automatic Control, vol. 48, no. 6, pp. 988–1001, 2003.

[32] E. W. Justh and P. S. Krishnaprasad, “Equilibria and steering laws for planar formations,” Systems & Control Letters, vol. 52, no. 1, pp. 25–38, 2004.

[33] J. C. K. Laventall, “Coverage control by multi-robot networks with limited-range anisotropic sensory,” International Journal of Control, vol. 82, pp. 1113–1121, 2009.

[34] R. Kanzaki, “Coordination of wing motion and walking suggests common control of zigzag motor program in a male silkworm moth,” Sensory, Neural, and Behavioral Physiology, vol. 182, no. 3, pp. 267–276, 1998.


[35] R. Kanzaki, N. Sugi, and T. Shibuya, “Self-generated zigzag turning of Bombyx mori males during pheromone-mediated upwind walking,” Zoological Science, vol. 9, no. 3, pp. 515–527, 1992.

[36] H. Khalil, Nonlinear Systems. Prentice-Hall, 2002.

[37] J. Kim, K.-D. Kim, V. Natarajan, S. D. Kelly, and J. Bentsman, “PDE-based model reference adaptive control of uncertain heterogeneous multiagent networks,” Nonlinear Analysis: Hybrid Systems, vol. 2, no. 4, pp. 1152–1167, 2008.

[38] Y. Li, A. Rotea, G. T.-C. Chiu, L. Mongeau, and I.-S. Paek, “Extremum seeking control of a tunable thermoacoustic cooler,” IEEE Trans. Contr. Syst. Technol., vol. 13, pp. 527–536, 2005.

[39] S. J. Liu and M. Krstic, “Stochastic source seeking for nonholonomic unicycle,” Automatica, to appear.

[40] ——, “Stochastic averaging in continuous time and its applications to extremum seeking,” IEEE Transactions on Automatic Control, to appear.

[41] C. G. Mayhew, R. G. Sanfelice, and A. Teel, “Robust source-seeking hybrid controllers for nonholonomic vehicles,” American Control Conference, pp. 2722–2727, June 2008.

[42] A. R. Mesquita, J. P. Hespanha, and K. Astrom, “Optimotaxis: A stochastic multi-agent optimization procedure with point measurements,” in HSCC, 2008, pp. 358–371.

[43] P. Ogren, E. Fiorelli, and N. Leonard, “Cooperative control of mobile sensor networks: adaptive gradient climbing in a distributed environment,” IEEE Trans. Automat. Contr., vol. 29, pp. 1292–1302, 2004.

[44] Y. Ou, C. Xu, E. Schuster, T. Luce, J. R. Ferron, and M. Walker, “Extremum-seeking finite-time optimal control of plasma current profile at the DIII-D tokamak,” 2007 American Ctrl. Conf., 2007.

[45] K. Peterson and A. Stefanopoulou, “Extremum seeking control for soft landing of an electromechanical valve actuator,” Automatica, vol. 29, pp. 1063–1069, 2004.

[46] D. Popovic, M. Jankovic, S. Magner, and A. Teel, “Extremum seeking methods for optimization of variable cam timing engine operation,” IEEE Transactions on Control Systems Technology, vol. 14, no. 3, pp. 398–407, 2006.

[47] B. Porat and A. Nehorai, “Localizing vapor-emitting sources by moving sensors,” IEEE Trans. Signal Processing, vol. 44, pp. 1018–1021, 1996.


[48] M. Potter and K. De Jong, “A cooperative coevolutionary approach to function optimization,” in Parallel Problem Solving from Nature PPSN III. Springer Berlin / Heidelberg, 1994, vol. 866, pp. 249–257.

[49] M. S. Stankovic, K. H. Johansson, and D. M. Stipanovic, “Distributed seeking of Nash equilibria with applications to mobile sensor networks,” submitted to IEEE Trans. on Automatic Control.

[50] M. S. Stankovic, K. Johansson, and D. M. Stipanovic, “Distributed seeking of Nash equilibria in mobile sensor networks,” submitted to 2010 Proc. IEEE Conf. on Decision and Control.

[51] M. S. Stankovic and D. Stipanovic, “Stochastic extremum seeking with applications to mobile sensor networks,” 2009 American Control Conference, 2009.

[52] K. Stegath, N. Sharma, C. Gregory, and W. E. Dixon, “An extremum seeking method for non-isometric neuromuscular electrical stimulation,” IEEE International Conference on Systems, Man and Cybernetics, pp. 2528–2532, 2007.

[53] Y. Tan, D. Nesic, and I. M. Mareels, “On non-local stability properties of extremum seeking controllers,” Automatica, vol. 42, pp. 889–903, 2006.

[54] M. Tanelli, A. Astolfi, and S. Savaresi, “Non-local extremum seeking control for active braking control systems,” Conf. on Control Applications, 2006.

[55] H. Tanner, A. Jadbabaie, and G. Pappas, “Flocking in fixed and switching networks,” IEEE Transactions on Automatic Control, vol. 52, pp. 863–868, 2007.

[56] H.-H. Wang and M. Krstic, “Extremum seeking for limit cycle minimization,” IEEE Transactions on Automatic Control, vol. 45, pp. 2432–2436, 2000.

[57] H.-H. Wang, S. Yeung, and M. Krstic, “Experimental application of extremum seeking on an axial-flow compressor,” IEEE Transactions on Control Systems Technology, vol. 8, pp. 300–309, 1999.

[58] T. D. Wyatt, “Moth flights of fancy,” Nature, vol. 369, pp. 98–99, 1994.

[59] C. Zhang, D. Arnold, N. Ghods, A. Siranosian, and M. Krstic, “Source seeking with nonholonomic unicycle without position measurement and with tuning of forward velocity,” Systems & Ctrl. Letters, vol. 56, pp. 245–252, 2007.

[60] C. Zhang, A. Siranosian, and M. Krstic, “Extremum seeking for moderately unstable systems and for autonomous vehicle target tracking without position measurements,” Automatica, vol. 43, pp. 1832–1839, 2007.


[61] X. Zhang, D. Dawson, W. Dixon, and B. Xian, “Extremum seeking nonlinear controllers for a human exercise machine,” Proc. 2004 IEEE Conf. Decision and Ctrl., 2004.

[62] X. Zhang, D. M. Dawson, W. E. Dixon, and B. Xian, “Extremum seeking nonlinear controllers for a human exercise machine,” IEEE Transactions on Mechatronics, vol. 14, no. 2, pp. 233–240, 2006.

[63] M. Zhu and S. Martínez, “Distributed coverage games for mobile visual sensor networks,” SIAM Journal on Control and Optimization, submitted, January 2010.
