
Sequential Composition of Dynamically Dexterous Robot Behaviors

R. R. Burridge∗
Metrica, Incorporated
[email protected]

A. A. Rizzi∗
The Robotics Institute
Carnegie Mellon University
[email protected]

D. E. Koditschek†
Artificial Intelligence Laboratory and Controls Laboratory
EECS Department, College of Engineering
University of Michigan
[email protected]

Abstract

We report on our efforts to develop a sequential robot controller composition technique in the context of dexterous “batting” maneuvers. A robot with a flat paddle is required to strike repeatedly at a thrown ball until the ball is brought to rest on the paddle at a specified location. The robot’s reachable workspace is blocked by an obstacle that disconnects the freespace formed when the ball and paddle remain in contact, forcing the machine to “let go” for a time in order to bring the ball to the desired state. The controller compositions we create guarantee that a ball introduced in the “safe workspace” remains there and is ultimately brought to the goal. We report on experimental results from an implementation of these formal composition methods, and present descriptive statistics characterizing the experiments.

∗Supported in part by NSF Grant IRI-9396167.
†Supported in part by NSF Grants IRI-9396167, CDA-9422014, and IRI-9510673.


1 Introduction

We are interested in tasks requiring dynamical dexterity — the ability to perform work on the environment by effecting changes in its kinetic as well as potential energy. Specifically, we want to explore the control issues that arise from dynamical interactions between robot and environment such as active balancing, hopping, throwing, catching, and juggling. At the same time, we wish to explore how such dexterous behaviors can be marshaled toward goals whose achievement requires at least the rudiments of strategy.

In this paper, we explore empirically a simple but formal approach to the sequential composition of robot behaviors. We cast “behaviors” — robotic implementations of user-specified tasks — in a form amenable to representation as state regulation via feedback to a specified goal set in the presence of obstacles. Our approach results ideally in a partition of state space induced by a palette of pre-existing feedback controllers. Each cell of this partition is associated with a unique controller, chosen in such a fashion that entry into any cell guarantees passage to successively “lower” cells until the “lowest,” the goal cell, has been achieved. The domain of attraction to the goal set for the resulting switching controller is shown to be formed from the union of the domains of the constituent controllers.¹

1.1 Statement of the Problem

We have chosen the task domain of paddle juggling as one that epitomizes dynamically dexterous behavior. The ball will fall to the floor unless repeatedly struck from below or balanced on the paddle surface. Thus the dynamics of the environment necessitate continual action from the robot. Moreover, the intermittent nature of the robot–ball interaction provides a focused testbed for studying the hybrid (mixed continuous and discrete) control problems that arise in many areas of robotics. In this paper, the machine must acquire and contain a thrown ball, maneuver it through the workspace while avoiding obstacles that necessitate regrasping² along the way, and finally bring it to rest on the paddle at a desired location. We refer to this task as “Dynamical Pick and Place” (DPP). By explicitly introducing obstacles,³ and requiring a formal guarantee that the system will not drive the ball into them, we add a strategic element to the task.

¹Portions of this paper have appeared previously in (Burridge 1996).

²In general, regrasping is required when the free configuration space attendant upon the current grasp is disconnected and the goal does not lie in the presently occupied connected component. As will be seen, in the DPP problem the configuration space of the “palming” grasp disconnects the goal from the component in which the ball is typically introduced. The robot can only “connect” the two components by adopting a “juggling” grasp, whose configuration space has a dramatically different topology.

³The finite length of the paddle has always induced an implicit obstacle, but we have not hitherto modeled or accounted for it in our juggling work (Rizzi, Whitcomb, and Koditschek 1992).

From the perspective of control theory, our notion of “behavior” comprises merely the closed-loop dynamics of a plant operating under feedback. Yet for our task, as is often the case when trying to control a dynamical system, no available feedback control algorithm will successfully stabilize as large a range of initial conditions as desired. Instead, there exists a collection of control laws, each capable of stabilizing a different local region of the system’s state space. If they were properly coordinated, the region of the workspace across which their combined influence might be exerted would be significantly larger than the domain of attraction for any one of the available local controllers. The process of creating a switching strategy between control modes is an aspect of hybrid control theory that is not presently well established. In this paper, we use conservative approximations of the domains of attraction and goal sets of the local controllers to create a “prepares” graph. Then, backchaining away from the controller that stabilizes the task goal, we use the graph to create a partition of the state space of the robot into regions, within each of which a unique controller is active. This leads to the creation of complex switching controllers with formal stability properties, and this specific instance of stable hybrid control represents our present notion of behavioral composition.

In order to test our methods, we use a three degree-of-freedom robot whose paddle-juggling capabilities are already well established (Rizzi 1994); see also the associated video of juggling one and two balls. Although our available juggling control laws induce very good regulation about their respective goal sets, there will always be ball states that each cannot contain: the goal sets have limited domains of attraction. However, many of the states “uncontainable” by one control law can be successfully handled by another with different parameter


settings, such as a new goal point, or perhaps different gain settings. In this paper, we resort solely to such parameter retunings to explore the problem of behavioral composition. For all aspects of the robot’s task — containing the initial throw, moving the ball through the workspace, negotiating the obstacle that divides the workspace, and finally bringing the ball to rest on the paddle — we require that it use members of an established palette of juggling control laws with different goal points or gain tunings.
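The “prepares” graph and backchaining construction described above can be sketched in a few lines. The palette, domains, and goal sets below are toy stand-ins (finite sets of discretized states), not the paper’s actual controllers:

```python
# Sketch of the "prepares" graph and backchaining construction described
# above. Controllers are idealized: each has a domain of attraction and a
# goal set, both represented here as finite sets of (discretized) states.
# All names and sets are illustrative, not the paper's actual controllers.

def prepares(controllers):
    """Build the prepares graph: controller i prepares controller j
    if the goal set of i lies inside the domain of attraction of j."""
    graph = {}
    for i, ci in controllers.items():
        graph[i] = [j for j, cj in controllers.items()
                    if j != i and ci["goal"] <= cj["domain"]]
    return graph

def backchain(controllers, task_goal_controller):
    """Order controllers by backchaining away from the goal controller.
    Earlier entries in the returned list are "lower" (closer to the goal)."""
    graph = prepares(controllers)
    ordered = [task_goal_controller]
    frontier = {task_goal_controller}
    while frontier:
        # Add any controller that prepares one already in the ordering.
        new = {i for i, preps in graph.items()
               if i not in ordered and frontier & set(preps)}
        ordered.extend(sorted(new))
        frontier = new
    return ordered

# Toy palette: states are integers 0..9; controller "A" stabilizes the task goal.
palette = {
    "A": {"domain": {0, 1, 2, 3}, "goal": {0}},
    "B": {"domain": {2, 3, 4, 5, 6}, "goal": {3}},
    "C": {"domain": {5, 6, 7, 8, 9}, "goal": {6}},
}

print(backchain(palette, "A"))   # C prepares B, which prepares A
```

Here `B`’s goal lies inside `A`’s domain and `C`’s goal inside `B`’s, so the backchained priority order is A, B, C.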

Our group has explored paddle juggling for some time (Rizzi, Whitcomb, and Koditschek 1992; Rizzi 1994), and it should be emphasized here that this is not a paper about juggling per se. Rather, our juggling apparatus provides a convenient testbed for the more general switching control methods presented in Section 3, which necessitate the exploration of the juggling behavior described in Section 2.5.

1.2 Previous Literature

1.2.1 Pick and Place in Cluttered Environments

Our focus on the dynamical pick and place problem is inspired in part by the Handey system developed by Lozano-Perez and colleagues (Lozano-Perez et al. 1987), who emphasized the importance of developing task planning capabilities suitable for situations where regrasping is necessitated by environmental clutter (see Footnote 2); however, our setup presents dynamical rather than quasi-static regrasping problems. Indeed, our emphasis on robustness and error recovery as the driving consideration in robot task planning derives from their original insights on fine motion planning (Lozano-Perez, Mason, and Taylor 1984), and the subsequent literature in this tradition bears a close relation to our concerns, as will be touched upon below.

Comprehensive work by Kak and colleagues (Tung and Kak 1994) yielded a contemporary quasi-static assembly environment in the best traditions of Handey. The system built a plan based upon sensed initial conditions, spawned an execution process that sensed exceptions to the planned evolution of states, and produced actions to bring the world’s state back to the intermediate situation presumed by the plan. We wish to examine the ways in which this higher level “feedback” (local replanning is a form of feedback) can be reasoned about and guaranteed to succeed, albeit in the drastically simplified setting of the dynamical pick and place problem outlined above.

1.2.2 Hybrid Control and Discrete Event Systems

The difficult question of how to reason about the interplay between sensing and recovery at the event level in robotics has been considerably stimulated by the advent of the Ramadge-Wonham DES control paradigm (Ramadge and Wonham 1987). In some of the most careful and convincing of such DES-inspired robotics papers, Lyons proposes a formalism for encoding and reasoning about the construction of action plans and their sensor-based execution (Lyons 1993; Lyons and Hendricks 1994). In contrast to our interest, however, Lyons explicitly avoids the consideration of problems wherein geometric and force sensor reports must be used to estimate progress and thereby stimulate the appropriate event transitions.

1.2.3 Error Detection and Recovery

Despite the many dissimilarities in problem domain, models, and representation, the work discussed here seems to bear the closest correspondence to the “fine-motion planning with uncertainty” literature in robotics (for example, as discussed in Latombe’s text (Latombe 1991)), originated by Lozano-Perez and colleagues (Lozano-Perez, Mason, and Taylor 1984). Traditionally, this literature focuses on quasi-static problems (generalized damper dynamics (Whitney 1977) with Coulomb reaction forces (Erdmann 1984) on the contact set). Furthermore, the possible control actions typically are restricted to piecewise constant velocity vectors, each constant piece corrupted by a constant disturbance vector of bounded magnitude and unknown direction. In contrast, we are interested in Newtonian dynamical models, and our control primitive is not a constant action but rather the entire range of actions consequent upon the closed-loop policy, Φ, to be specified below. Differences in models notwithstanding, we have derived considerable motivation from the “LMT” (Lozano-Perez, Mason, and Taylor 1984) framework, as detailed in Section 1.3.3, below.

1.2.4 Juggling

In (Burridge, Rizzi, and Koditschek 1997) we introduced the notions of dynamical safety and obstacle avoidance, while in (Burridge, Rizzi, and


Koditschek 1995) we discussed the controller composition algorithm mentioned in Section 3. The major contribution of this paper is in furthering our work by showing how to compose controllers in such a manner as to avoid an obstacle that disconnects the workspace.

In previous work in our lab (Buehler, Koditschek, and Kindlmann 1990b; Rizzi and Koditschek 1994), a great deal of attention has been paid to the lower level palette of controllers. Our earliest endeavors were limited to planar juggling and are shown in the associated video. Our architecture implements event-driven robot policies whose resulting closed-loop dynamics drive the coupled robot–environment state toward a goal set. We strive to develop control algorithms that are sufficiently tractable as to allow correctness guarantees with estimates of the domain of attraction as well. Thus, we have focused theoretical attention on “practicable stability mechanisms” — dynamical systems for which effectively computable local tests provide global (or, at least, over a “large” volume) conclusions. At the same time, we have focused experimental attention on building a distributed computational environment that supports flexibly reconfigurable combinations of sensor and actuator hardware controllers, motion estimation and control algorithms, and event-driven reference trajectory generators. The result has been a series of laboratory robots that exhibit indefatigable goal-seeking behavior, albeit in a very narrow range of tasks. In this paper, we explore the “higher level” problem of generalizing the range of tasks by composing the existing controllers, achieving new tasks that no single existing controller is capable of in isolation.

1.3 Overview of the Approach

1.3.1 Feedback Confers Robustness

Physical disturbances to a system can be classified, broadly speaking, into two categories. The first includes the small, persistent disturbances arising from model inaccuracies and sensor and actuator noise. Ball spin, air drag on a ball modeled as a point mass, and spurious changes in camera measurements introduced by lighting variations in a scene represent typical examples of this first type of disturbance. The second category includes large, unanticipated changes to the world. For example, if a parts tray is severely jostled or arrives unexpectedly empty, an assembly robot’s

world model is likely to have been grossly violated. Such large disturbances occur very rarely (otherwise, they would have been accounted for systematically by the robot’s designers!), but are potentially catastrophic.

A common tradition in robotics and control has been to model these disturbances in the hope of treating them rationally. One typically appeals to stochastic representations of small perturbations and introduces an optimal design (for example, a Kalman filter) to handle them. Analogously, recovery from catastrophe is similarly model-based, typically being addressed by special “exception handling” routines specifically written for each particular failure anticipated. But disturbances are generally left out of nominal models precisely because they are so difficult to model. The systematic origin of the typical small disturbance often defies a stochastic representation, and the resulting optimal designs may be overly aggressive. If the human system designers haven’t imagined a severe failure mode, then the system will be unable to react to it when it occurs, with possibly disastrous results.

Feedback controlled systems can be designed to confer a level of robustness against both types of disturbance without recourse to explicit disturbance models. On the one hand, asymptotically stable systems are generically locally structurally stable. This means that the first class of disturbances can degrade performance (e.g., biasing the goal set, or coarsening the neighborhood of the goal set to which all initial conditions in its domain of attraction are eventually brought) but, if small enough, cannot destroy the stability of the (possibly biased) goal. On the other hand, if the closed-loop system induced by a feedback controller exhibits a global (or at least “large”) domain of attraction, then large disturbances can never “take it by surprise”, so long as the original state space and dynamical model used to represent the world (not the disturbance) remain valid following the perturbation. The system will merely continue to pursue the same shepherding function, now applied at the new state, with the same tireless progression toward the goal as before. Of course, such disturbances need to be rare relative to the time required to recover from them, or we should not expect the system to ever approach the goal.


1.3.2 Feedback Strategies as Funnels

A decade ago, Mason introduced the notion of a funnel as a metaphor for the robust, self-organizing emergent behavior displayed by otherwise unactuated masses in the presence of passive mechanical guideways (Mason 1985).⁴ He understood that the shape of the guideways constituted an embodied computational mechanism of control that ought to be rendered programmable. In this paper, we embrace Mason’s metaphor and seek to extend it to active electromechanical systems (i.e., possessing sensors and actuators), which are more obviously programmable.

Figure 1: The Lyapunov Function as Funnel: Idealized graph of a positive definite function over its state space.

Figure 1 depicts the idealized graph of a positive definite scalar valued function centered at a goal point — its only zero and unique minimum. If a feedback control strategy results in closed-loop dynamics on the domain (depicted as the x-y plane in these figures) such that all motions, when projected up onto the funnel, lead strictly downward (ż < 0), then this funnel is said to be a Lyapunov function. In such a case, Mason’s metaphor of sand falling down a funnel is particularly relevant, since all initial conditions on the plane lying beneath the over-spreading shoulders of the funnel (now interpreted as the graph of the Lyapunov function) will be drawn down the funnel toward its center — the goal.
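A standard textbook example, not drawn from the paper, makes the funnel picture concrete: for the damped oscillator x″ = −x − x′, the quadratic form V(x, v) = 1.5x² + xv + v² is a Lyapunov function with V̇ = −(x² + v²), so every sublevel set {V ≤ c} is an invariant funnel. A numerical check:

```python
# Minimal numerical illustration (not from the paper) of a Lyapunov funnel:
# for the damped oscillator  x' = v,  v' = -x - v,  the function
#   V(x, v) = 1.5*x**2 + x*v + v**2
# satisfies dV/dt = -(x**2 + v**2) < 0 away from the origin, so every
# sublevel set {V <= c} is an invariant region, a "funnel" in the text's sense.

def V(x, v):
    return 1.5 * x * x + x * v + v * v

def step(x, v, dt=1e-3):
    # One explicit Euler step of the closed-loop dynamics.
    return x + dt * v, v + dt * (-x - v)

x, v = 1.0, -0.5          # start somewhere inside the funnel
c = V(x, v)
for _ in range(20000):    # integrate for 20 seconds
    x, v = step(x, v)
    assert V(x, v) <= c + 1e-9   # never climbs back up the funnel
    c = V(x, v)                  # V decreases monotonically

print(V(x, v))  # state has been drawn down toward the goal at the origin
```

The per-step assertion is exactly the funnel property: once beneath a shoulder of the funnel, the state can only slide downward toward the goal.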

Although the funnels strictly represent Lyapunov functions, which may be difficult to find, we note that such functions must exist within the domain of attraction for all asymptotically stable dynamical systems. Thus we loosen our notion of “funnels”, and let them represent attracting dynamical systems, with the extent of the funnel representing a known invariant region (for a Lyapunov function, this would be any of the level curves).

⁴Other researchers in robotics and fine motion planning have introduced comparable metaphors (Ish-Shalom 1984; Lozano-Perez, Mason, and Taylor 1984).

Figure 2: Ideal funnel capturing the entire obstacle-free state space of the system.

In Figure 2 we depict a complicated boundary such as might arise when the user specifies an “obstacle” — here, all states outside the bounded area of “good” states.⁵ Ideally, we would like a control strategy that funnels all “good” states to the goal, without allowing any to enter the obstacle. We suggest this in the figure by sketching a complex funnel whose shape is exactly adapted to the domain.⁶

Unfortunately, it is usually very difficult or impossible to find globally attractive control laws, so Figure 2 often cannot be realized for practical reasons. Moreover, there is a large and important class of mechanical systems for which no single funnel (continuous stabilizing feedback law) exists (Brockett 1983; Koditschek 1992). Nevertheless, one need not yet abandon the recourse to feedback.

1.3.3 Sequential Composition as Preimage Backchaining

We turn next to a tradition from the fine motion planning literature, introduced by (Lozano-Perez, Mason, and Taylor 1984) — the notion of preimage backchaining. Suppose we have a palette of controllers whose induced funnels have goal sets that can be placed at will throughout the obstacle-free state space. Then we may deploy a collection of controllers whose combined (local) domains of attraction cover a large part of the relevant region of the state space (e.g., states within some bound of the origin). If such a set is rich enough that the domains overlap and each controller’s goal set lies within the domains of other controllers, then it is possible to “backchain” away from the overall task goal, partitioning the state space into cells in which the different controllers will be active, and arrive at a scheme for switching between the controllers that will provably drive any state in the union of all the various domains to the single task goal state.

⁵Of course, this picture dramatically understates the potential complexity of such problems, since the free space need not be simply connected as depicted!

⁶(Rimon and Koditschek 1992) addressed the design of such global funnels for purely geometric problems wherein the complicated portion of the domain is limited to the configuration space. In contrast, in this paper we are concerned with complex boundaries that run through the velocities as well — e.g., see Figure 11.

Figure 3: Sequential composition of funnels: the goal point of each controller lies within the domain of attraction induced by the next lower controller. Each controller is only active outside the domains of lower controllers. The lowest controller stabilizes the system at the final destination.

This method of combining controllers in sequence is depicted in Figure 3. Note that the controller represented by each funnel is only active when the system state is in the appropriate region of state space (beyond the reach of lower funnels), as sketched at the bottom of the figure, and indicated by the dashed portion of the funnel. As each controller drives the system toward its local goal, the state crosses a boundary into another region of state space where another controller is active. This process is repeated until the state reaches the final cell, which is the only one to contain the goal set of its own controller.
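A minimal sketch of this switching rule, with made-up one-dimensional “controllers”: the active controller is always the lowest entry (the one closest to the goal) whose domain contains the current state, so switches happen automatically as the state crosses cell boundaries.

```python
# Hypothetical sketch of the switching execution implied by Figure 3.
# `ordered` lists controllers from "lowest" (contains the task goal) to
# "highest". Each entry carries a domain membership test and a one-step
# closed-loop map; the induced cells are the implied set differences.

def active_controller(ordered, state):
    """Return the lowest (closest-to-goal) controller whose domain
    contains the state."""
    for c in ordered:
        if c["in_domain"](state):
            return c
    raise ValueError("state outside the union of all domains")

def execute(ordered, state, steps=100):
    """Run the switching controller: at each step the active controller's
    closed-loop map is applied; switching needs no memory or plan."""
    for _ in range(steps):
        state = active_controller(ordered, state)["step"](state)
    return state

# Toy 1-D palette: each "controller" halves the error to its own goal
# point and owns an interval domain (all names and values illustrative).
def make(goal, lo, hi):
    return {"in_domain": lambda x, lo=lo, hi=hi: lo <= x <= hi,
            "step": lambda x, g=goal: x + 0.5 * (g - x)}

ordered = [make(0.0, -1.0, 1.0),    # lowest: stabilizes the task goal
           make(0.5, 0.0, 3.0),
           make(2.0, 1.5, 6.0)]

print(execute(ordered, 5.0))   # driven from x = 5 down the funnels toward 0
```

Starting at x = 5 only the highest funnel applies; its progress toward 2 carries the state into the middle domain, which hands it to the lowest, exactly the cascade sketched in Figure 3.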

We are now in a position to approximate Figure 2 with Figure 4, and this simple idea captures the essence of the contribution of this paper.

Figure 4: A suitable composition of local funnels partitions the obstacle-free state space (or a close approximation to it) into cells. Within each cell, a unique controller will be active. See Figure 16 for a composition of controllers that avoids the obstacles in the real robot’s state space.

This scheme affords the same robustness against disturbances discussed above. Local stability against small disturbances is obtained because each funnel implies asymptotic stability for that particular controller and goal point. It also provides robustness against large, unanticipated disturbances, because the control system has no higher-level state, or preconceived plan. If the state of the system shifts to a distant location, then the controller appropriate for that particular region of state space automatically activates, and the process begins again. Similarly, if the state takes a favorable leap toward the goal state, or starts in the goal partition, then intermediate controllers will never be activated. Thus, as long as the disturbance doesn’t send the state outside the union of all the domains in the partition, it will be effectively handled.

As we try to suggest in Figure 4, this version of preimage backchaining is particularly convenient when the obstacle-free state space has a complicated geometry. Because the domains used for the partition of state space must be invariant, there is a natural notion of safety that arises from this approach. If there are obstacles in the state space, but they do not intersect any of the cells of the partition, then it follows that the state will never reach an obstacle while under the influence of the composite controller.


1.4 Contributions of the Paper

When written a little more carefully, as will be done below, it is fairly easy to verify the theoretical validity of the backchaining approach outlined above. In the absence of noise, each local controller is guaranteed by definition to drive its entire domain to its goal. In doing so, the controller forces the system into the domain of a higher-priority controller. This process must repeat until the domain of the goal controller is reached.

For the nonlinear problems that inevitably arise in robotics, however, invariant domains of attraction are difficult to derive in closed form, so the utility of the approach is called into question. Moreover, physical systems are inevitably “messier” than the abstracted models their designers introduce for convenience, and the question arises whether these ideas incur an overly optimistic reliance on closed-loop models. Modeling errors may, for instance, destroy the correctness of our method, allowing cycling among the controllers. Thus we must turn to empirical tests to verify that our formally correct method can produce useful controllers in practice.

The experimental environment for our juggling robot is highly nonlinear and “messy,” and challenges the idealized view of the formal method in several ways. We have no closed-form Lyapunov functions from which to choose our domains of attraction. Instead, we are forced to use conservative invariant regions found using experimental data and numerical simulation. Also, there is significant local noise in the steady-state behavior of the system, so that even our “invariant” regions are occasionally violated. Nevertheless, as will be shown, our composition method proves to be very robust in the face of these problems. We are able to create a palette of controllers that successfully handles a wide range of initial conditions, persevering to the task goal while avoiding the obstacles, despite the significant noise along the way.
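Conservative invariant regions of the kind mentioned above can be estimated numerically: sample a candidate region, push each sample through the return map, and shrink the candidate until no sample escapes. The sketch below uses a hypothetical one-dimensional return map, not the robot’s identified dynamics:

```python
import random

# Hedged sketch: empirically certify a conservative invariant region for a
# return map f by shrinking a symmetric interval about the goal until every
# sampled state maps back inside it. The map below is a toy stand-in, not
# the robot's actual return dynamics.

def certify_interval(f, radius, samples=2000, shrink=0.8, seed=0):
    rng = random.Random(seed)
    while radius > 1e-6:
        pts = [rng.uniform(-radius, radius) for _ in range(samples)]
        if all(abs(f(x)) <= radius for x in pts):
            return radius          # empirically invariant interval [-r, r]
        radius *= shrink           # an escape was found: be more conservative
    return 0.0

# Toy return map: contracting near the goal, expanding far from it.
f = lambda x: 0.5 * x + 0.2 * x ** 3

print(certify_interval(f, radius=3.0))   # certified (conservative) radius
```

Because the test is sampled rather than exhaustive, the certificate is only statistical; this mirrors the paper’s situation, in which even the “invariant” regions are occasionally violated by noise.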

1.5 Organization of Paper

The paper follows a logical development roughly suggested by Figures 1 through 4. In Section 2 we provide a high level overview of the hardware and software, describe how entries in the controller palette (suggested by Figure 1) arise, and introduce the Juggle, ΦJ, a controller used as a basis for all entries in our palette. We also introduce variants of ΦJ: Catching, ΦC; and Palming, ΦP. In Section 3 we introduce the notion of a safe controller suggested by Figure 2, as well as formally define the sequential composition depicted in Figure 3. In Section 4 we implement the partition suggested by Figure 4, and report on our empirical results.

2 Physical Setting

2.1 Hardware System

For the experiments described in this paper, we use the “Buhgler,” a three degree-of-freedom direct drive machine sketched in Figure 5(a), which has been discussed in a variety of other settings (Whitcomb, Rizzi, and Koditschek 1993; Rizzi 1994; Rizzi and Koditschek 1996). Two cameras provide images of the ping-pong ball(s) at 60 Hz, and all computation, from the image processing to the generation of the analog signals to the motor amplifiers, is carried out on a network of twenty transputers.

Figure 5(b) shows the workspace from above, and depicts the various obstacles with which the robot must contend. The inner and outer edges of the annular workspace are caused by the “shoulder” offset and finite paddle length, respectively. The visual boundaries are considered obstacles because if the ball strays beyond these lines, the robot will lose track of it and fail to juggle it. Finally, a flat obstacle (which we will refer to as the Beam) is introduced that disconnects the horizontal workspace. The Beam is very thin in the vertical dimension, protruding into the workspace just above the plane of zero height, and is situated as shown in Figure 5(b). The robot may pass under it, and the ball may pass over it, but no contact is allowed between robot and ball in the region it occupies, and neither is allowed to touch it. Thus, in addressing the problem of how to bring a ball from one side of the Beam to the other, we encounter a simple situation wherein a regrasp maneuver (see Footnote 2) is necessary for the task to succeed.

Using the tangent space notation of (Abraham and Marsden 1978), we denote the position of the ball by b ∈ B := ℝ³, and the full state of the ball by Tb = (b, ḃ) ∈ TB. Similarly, the position of the robot will be denoted by q = (φ, θ, ψ) ∈ Q := S³, and the full state by Tq = (q, q̇) ∈ TQ. While the ball is in flight, we use the standard Newtonian flight model, ignoring all effects other than gravity (which is assumed to be a constant acceleration acting vertically downward). Robot–ball collisions



Figure 5: (a) The Buhgler Arm. The three joint values, q1, q2, and q3, are referred to as φ, θ, and ψ, respectively, in this paper. (b) The horizontal workspace with obstacles: the Beam, inner and outer paddle limits, and the boundary of the visible workspace.

are assumed to be instantaneous, and are modeled using Newton’s restitution model. The reader is referred to (Rizzi 1994) for more detailed discussion of these models.
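Both models admit a compact statement. The sketch below is our illustration of the vertical coordinate only, with an illustrative restitution coefficient; see (Rizzi 1994) for the actual models:

```python
# Illustration (not the paper's code) of the two ball models used above:
# Newtonian flight under constant gravity, and Newton's restitution model
# for the instantaneous paddle impact, applied to the vertical coordinate.

G = 9.81      # gravitational acceleration (m/s^2)
ALPHA = 0.7   # coefficient of restitution, an illustrative value

def flight(b, bdot, t):
    """Ballistic flight: vertical position and velocity after t seconds."""
    return b + bdot * t - 0.5 * G * t * t, bdot - G * t

def impact(bdot, paddle_vel):
    """Newton's restitution model: the normal relative velocity is
    reversed and scaled by the restitution coefficient ALPHA."""
    rel = bdot - paddle_vel
    return paddle_vel - ALPHA * rel

# Ball dropped from 1 m onto a paddle moving upward at 0.5 m/s at height 0.
t_fall = (2 * 1.0 / G) ** 0.5
_, v_minus = flight(1.0, 0.0, t_fall)   # downward velocity at contact
v_plus = impact(v_minus, 0.5)           # upward velocity after the strike
print(v_minus, v_plus)
```

The impact map is what makes the interaction intermittent: the continuous flight model holds between collisions, and each collision is a discrete jump in the ball’s velocity.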

2.2 Software System

The full software system is depicted in Figure 6, with the various blocks representing high-level subsystems. Rectangles represent the subsystems with considerable internal state — Vision, Observer, Control, and Actuator — while the circle in the middle represents a transformation that has no significant internal state.

Ball states, Tb, are interpreted at 60 Hz by the vision system, V (cameras and image processing), which produces image plane locations at the same rate. These image plane locations are passed to the observer, O, which triangulates estimates of ball position, b̂ ∈ B, and uses standard observer methods to estimate the full state of the ball, Tb̂ = (b̂, ḃ̂) ∈ TB. The estimated ball state, Tb̂, is fed back to V to direct its focus of attention, and also interpolated to produce estimates at 330 Hz for the rest of the system. This nonlinear V–O sensing loop is discussed in detail and shown to be asymptotically stable in (Rizzi and Koditschek 1996). In the present work we have simply assumed that convergence, Tb̂ → Tb, occurs at a time scale much faster than that relevant to the larger loop, an assumption which occasionally proves invalid, as discussed in Section 4.4.3.
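The flavor of such an observer can be suggested in one dimension: predict with the flight model, then correct with the 60 Hz position measurement. The gains below are made-up values, not those of the actual system:

```python
# Hedged sketch of the kind of "standard observer" referred to above: a
# constant-gain (Luenberger-style) observer for vertical ballistic flight,
# fed 60 Hz position measurements. Gains are illustrative, not the system's.

G, DT = 9.81, 1.0 / 60.0
L1, L2 = 0.5, 6.0          # observer gains (made-up values)

def observer_step(b_hat, v_hat, z):
    """Predict with the flight model, then correct with measurement z."""
    b_pred = b_hat + v_hat * DT - 0.5 * G * DT * DT
    v_pred = v_hat - G * DT
    err = z - b_pred
    return b_pred + L1 * err, v_pred + L2 * err

# True ball: dropped from 1 m; the observer starts with a wrong velocity guess.
b, v = 1.0, 0.0
b_hat, v_hat = 1.0, 2.0
for _ in range(30):        # half a second of measurements
    b, v = b + v * DT - 0.5 * G * DT * DT, v - G * DT
    b_hat, v_hat = observer_step(b_hat, v_hat, b)   # z is exact position here

print(v_hat - v)   # velocity estimate error after the transient
```

With these gains the estimation error decays well within the half second simulated, which is the sense in which the paper assumes the sensing loop converges faster than the robot–ball interaction.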

Estimated ball states, Tb̂, are passed through a memoryless transformation, m : TB → Q, to produce reference robot positions, r = m(Tb̂). Since m is analytic, this block also produces ṙ and r̈ for use by the control system. Following (Buehler, Koditschek, and Kindlmann 1990a), the m used for juggling leads to robot motion that appears to be a distorted reflection of the ball’s motion, and we call it a mirror law. The juggling mirror law used in this paper was developed by Rizzi, and a detailed description can be found in (Rizzi 1994). The mirror law uses nine key parameters that prescribe the spatial location and feedback gains for the juggle behavior. In this work we will mostly be interested in modifying the location of the set point, G (denoting the three parameters used to prescribe horizontal position and the apex height of the ball’s desired periodic vertical trajectory), thereby defining juggling control strategies active in different regions of the workspace. We will also modify the vertical energy gains to create the palming behavior. The juggling mirror law is discussed in more detail in Section 2.5, and we introduce control laws for other behaviors in Section 2.6.
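Rizzi’s nine-parameter mirror law itself is beyond the scope of a sketch, but its flavor, following the planar mirror algorithm of (Buehler, Koditschek, and Kindlmann 1990a), can be suggested in a single vertical degree of freedom: the paddle reference mirrors the ball’s height, with the mirror gain modulated by the error in the ball’s vertical energy. All constants here are illustrative, and the setpoint corresponds to a 1 m apex:

```python
# Toy one-dimensional "mirror law" in the spirit of (Buehler, Koditschek,
# and Kindlmann 1990a); NOT Rizzi's actual nine-parameter law. The paddle
# reference mirrors the ball's height, modulated by the vertical energy
# error, so the paddle swings harder when the ball's energy is below the
# setpoint. All constants are illustrative.

G = 9.81
KAPPA = 0.35       # mirror gain (illustrative)
ETA = 0.04         # energy-error gain (illustrative)
E_STAR = 9.81      # setpoint: vertical energy for a 1 m apex (m^2/s^2)

def vertical_energy(b, bdot):
    """Total vertical energy per unit mass of the ball."""
    return G * b + 0.5 * bdot ** 2

def mirror(b, bdot):
    """Paddle reference height: a distorted reflection of the ball."""
    return -KAPPA * (1.0 + ETA * (E_STAR - vertical_energy(b, bdot))) * b

# A low-energy ball at 0.5 m draws a deeper paddle reflection than a
# ball at the same height that already carries the setpoint energy:
low = mirror(0.5, 0.0)                       # E = 4.905 < E_STAR
nominal = mirror(0.5, (2 * G * 0.5) ** 0.5)  # E = E_STAR exactly
print(low, nominal)
```

Because the reference is a continuous function of the ball state, no explicit impact timing is ever computed; the strike emerges from the mirroring geometry, which is the property the text calls memoryless.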

The reference robot states created by the mirror law are fed to an inverse dynamics joint-space controller (Whitcomb 1992; Whitcomb, Rizzi, and Koditschek 1993), C, which produces torque values, τ. The torques are used by the actuator block, A (amplifiers, motors, etc.), which generates true robot joint states, Tq ∈ TQ. The robot states are sensed and returned to C for feedback control. The C–A control/actuation loop has been shown to be globally asymptotically stable, so that Tq → Tr whenever r is continuous.


Figure 6: Flow chart showing the various functional blocks of the system: vision, V; observer, O; mirror law, m; control, C; and actuation, A. The parameters of interest to this paper all reside in m.

As for the sensor block, V–O, we have assumed that the transients of the C–A loop have died out by the time of any robot–ball interaction, and, hence, that Tq = Tr. Again, in practice, the transients are generally rapid enough to support this simplified reasoning; however, in some situations we inevitably pay a price in performance. This topic is further explored in Section 4.4.3.

Together, all the blocks depicted in Figure 6 form a dynamical plant, or filter, relating inputs Tb to outputs Tq. We denote this filter by Φ, and loosely refer to it as a "controller."

2.3 Closed Loop System


Figure 7: The closed-loop dynamics, FΦ, induced by Φ and the environment, E.

Figure 7 depicts the interconnected robot–environment (ball) system that arises when a falling ball excites the visual field of the hardware and software system just described. We denote the resulting closed-loop dynamics FΦ, where the subscript reflects the dependence of the robot's reactions, and hence the evolution of the coupled robot–environment states, upon the blocks of Φ depicted in Figure 6. It turns out that the repetitive continuous trajectories of this closed-loop system can be much more readily understood by passage to an induced discrete-event sampled mapping — the "return map" of the periodic orbits — which we denote by fΦ. This map (more exactly, its iteration considered as a dynamical system) represents the focus of analysis and experiment throughout the remainder of the paper. We now describe in some detail the manner in which the robot controller, Φ, gives rise to the closed-loop dynamics, FΦ, and the relationship of these continuous dynamics to the central object of study, fΦ.

2.3.1 Collapse of Dimension to Ball Apex States

The robot controller, Φ, has been expressly designed to cause all the robot's internal state to "collapse" onto a lower-dimensional local embedding of the environment's state in a manner determined by the mirror law, m. In consequence, after initial transients, the closed-loop dynamics, FΦ, can be written in the coordinates of the ball apex states, as we now explain. A much more thorough presentation of these ideas may be found in (Buehler, Koditschek, and Kindlmann 1990b).

Consider the "total system" configuration space, X := B × Q × B, and its tangent space, TX = TB × TQ × TB, which represents the state space for the coupled robot–environment pair. Given the physical models introduced above, the dynamics of this coupling, denoted XΦ, are induced by Φ, Newtonian ball flight, and impacts between robot and ball. In particular, the coupling between robot and environment occurs via collisions, which take place on the contact set, C ⊂ X, comprising those configurations where the ball and the robot's paddle are touching. Our simple impact model, introduced in Section 2.1, amounts to a morphism (Hirsch 1976) of TCX into itself 7 — a rule by which a new ball velocity at contact is determined by a linear combination of its old velocity and that of the robot's paddle normal, with positions instantaneously unchanged.

7 The ball's velocity changes discontinuously while its position remains fixed. In contrast, since the mirror law specifies robot positions that depend upon the ball velocities, the robot's position is commanded to change discontinuously after each impact (Rizzi and Koditschek 1993). Of course, this discontinuity can never be tracked "instantaneously" by the robot controller, and the impact event represents one of the chief events that belie our naive assumption in the next paragraph concerning the invariance of M.

Define the mirror surface M ⊂ TX to consist of those states satisfying simultaneously the constraint equations T b̂ = Tb and Tq = Tm(Tb).8 The first of these corresponds to an error-free estimator (the internal states of V–O in Figure 6 having converged), and the second corresponds to an error-free controller (the internal states of C–A in Figure 6 having converged). Since M is attracting and invariant under XΦ, we concentrate on the restriction dynamics, XΦ | M, as mentioned in Section 2.2. We find it most convenient to do so in "ball coordinates," as follows. Notice that the mirror surface, M, being the graph of a function

M : TB → TX;  Tb ↦ (Tb, Tm(Tb), Tb),     (1)

is parametrized by TB using M as a global change of coordinates. Take FΦ to be the representation of the restriction dynamics, XΦ | M, expressed in TB coordinates via M. In symbols, denoting the respective flows (i.e., the time trajectories generated by the respective vector fields) by the superscript t, we have FΦ^t := M⁻¹ ∘ (XΦ | M)^t ∘ M.

The mirror law, m, is so designed that if the robot's paddle had infinite extent, and if the robot's visual apparatus had an unbounded view, then the ball's flight under gravity would ensure a return to the contact set after each collision. In ball coordinates, this amounts to asserting that P := M⁻¹(TCX ∩ M) ⊂ TB is a global "Poincaré section" (Guckenheimer and Holmes 1983; Koditschek and Buehler 1991) for the relaxation oscillator FΦ.9

8 This is a convenient abuse of notation, since in reality Tq = Tm(TTb). However, for our simplified models TTb = N(Tb), where N denotes the six-dimensional vector field of Newtonian flight in gravity. Hence, to be truly precise we would have to write Tq = Tm ∘ N(Tb), a usage whose complexity seems not to be justified by any gain in understanding and whose abandonment should not cause any confusion.

9 Since the paddle is finite and the cameras admit a relatively narrow viewing area, P is in fact only a local Poincaré section, and part of the obstacle-avoidance behavior to be discussed at length in the sequel involves modifying Φ so that P is indeed always visited forever into the future.

From the computational and conceptual point of view, it turns out that the collisions, P, are less convenient to examine than the set of apex points, at which the ball's vertical velocity vanishes. Namely, decompose the ball's (Euclidean) configuration space into its one-dimensional "Vertical" and two-dimensional "Horizontal" components, B = V × H, with the induced state-space decomposition TB = TV × TH. Denote by FΦ^−ta the map that takes a ball just before collision with the robot back to its previous apex. Then, using this last change of coordinates,

FΦ^−ta : P → TH × V,

we may study the iterates of the ball's return to apex on the transformed Poincaré section, TH × V ⊂ TB.

2.3.2 The Horizontal Return Map at Ball Apex, fΦ

Denote by (fΦ, gΦ) the manner in which the ball's state at a previous apex gives rise to its state at the next apex — this is a mapping from TH × V onto itself. (Buehler, Koditschek, and Kindlmann 1990a; Buehler, Koditschek, and Kindlmann 1990b) showed that when the juggling mirror law is restricted to V, gΦ has global attraction to GV (the vertical component of the set point). In contrast, only local attraction of (fΦ, gΦ) to GH × GV has been demonstrated to date.

Furthermore, (Rizzi and Koditschek 1994) showed that fΦ and gΦ are nearly decoupled. Indeed, empirical results confirm that the horizontal behavior of fΦ changes very little within a large range of apex heights, and that the vertical system is regulated much faster than the horizontal.

Thus, except where otherwise specified, we ignore V throughout this paper and concentrate our attention on the evolution of fΦ in TH. Accordingly, for ease of exposition, we denote the horizontal goal set G (dropping the H subscript), a singleton subset of TH comprising a desired horizontal position at zero velocity.

2.4 Obstacles

In this section, we briefly describe the origin of the obstacles in TH × V with which we are concerned. We must start by reverting to the full six-dimensional space, TB, and the restriction flow FΦ^t.



Although we attempt to be fairly thorough in this section, some details are omitted in the interest of readability. The reader is referred to (Rizzi 1994; Burridge 1996) for the missing details.

There are three types of physical obstacle with which we are concerned. The first consists of regions of the robot space "off limits" to the robot, OQ ⊂ Q, whether or not it is in contact with the ball. Robot obstacles include the floor, joint limits, etc., as well as any physical obstacle in the workspace, and have been studied in the kinematics literature. Recall that for a particular mirror law, m, each ball state induces a corresponding robot state. Stretching our terminology, we loosely define m⁻¹(OQ) = {Tb ∈ TB | m(Tb) ∈ OQ}.

The second type of physical obstacle consists of subsets of ball state space that are "off limits" to the ball, OTB ⊂ TB, whether or not it is touching the robot. This set may include walls, the floor, visual boundaries, and any other objects in the workspace that the ball should avoid (such as the Beam). Note that this region resides in the full state space of the ball, so, for example, if the vision system fails for ball velocities above a certain value at certain locations, such states can be included in OTB.

The final type of physical obstacle is defined as the set of ball states that are "lost." Intuitively, this set, which we will call the "Workspace Obstacle," OWS, consists of all ball states on the boundary of the workspace that are never coming back.

We combine these three types into one general physical obstacle occupying the ball's configuration space,

O = m⁻¹(OQ) ∪ OTB ∪ OWS.

These physical obstacles must be "dilated" in the (discrete-time) domain of fΦ to account for physical collisions that may occur during the continuous-time interval between apex events. We proceed as follows.

For any ball state, Tb, denote by τO+(Tb) the "time to reach the physical obstacle,"

τO+(Tb) = min { t > 0 : F^t(Tb) ∈ TO },

and by τO−(Tb) the "time elapsed from the last physical obstacle,"

τO−(Tb) = max { t < 0 : F^t(Tb) ∈ TO }.

Similarly, denote the times to contact with the robot by

τΦ+(Tb) = min { t > 0 : M ∘ F^t(Tb) ∈ TC },

and

τΦ−(Tb) = min { t < 0 : ∃ Tx ∈ TC : F^t ∘ Imp(Tx) = Tb },

where Imp is the impact map.

A discrete obstacle is any apex state whose future continuous-time trajectory must intersect the obstacle, or whose past continuous-time trajectory may have intersected the obstacle:

O := { Tb ∈ TH × V : τO+(Tb) < τΦ+(Tb) ∨ τO−(Tb) > τΦ−(Tb) }.

We will return to obstacles in the discussion of safety in Section 3.2.
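As a toy numeric illustration of this dilation (not the paper's implementation), consider a ballistic apex state above a paddle plane at height zero, with a vertical wall standing in for the obstacle: the apex state is discretely unsafe when the flight reaches the wall before it can reach the paddle, i.e., τO+ < τΦ+. All names and the one-wall geometry below are hypothetical simplifications.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def time_to_height(z0, vz0, z_target):
    """Smallest t > 0 with z0 + vz0*t - 0.5*G*t^2 == z_target, or None."""
    a, b, c = -0.5 * G, vz0, z0 - z_target
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    roots = sorted((-b + s * math.sqrt(disc)) / (2 * a) for s in (1.0, -1.0))
    positive = [t for t in roots if t > 0]
    return positive[0] if positive else None

def discretely_unsafe(x0, vx0, z0, wall_x, paddle_z=0.0):
    """Toy dilation test: does ballistic flight from this apex state cross
    the wall plane (tau_O+) before it can reach the paddle (tau_Phi+)?"""
    tau_phi = time_to_height(z0, 0.0, paddle_z)      # at apex, vz = 0
    if vx0 == 0 or (wall_x - x0) / vx0 <= 0:
        return False                                 # flight never reaches the wall
    tau_obs = (wall_x - x0) / vx0
    return tau_phi is None or tau_obs < tau_phi
```

For example, an apex 1.25 m above the paddle falls for about 0.50 s, so a wall 1 m away is struck first only when the horizontal speed exceeds roughly 2 m/s.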

2.5 Example: The Juggle ΦJ

The generic juggling behavior developed by Rizzi for the Buhgler is induced by a family of control laws in which a single ball is struck repeatedly by the robot paddle in such a way as to direct it toward a limit cycle characterized by a single zero-velocity apex point, or set point, G ⊂ H (we ignore the vertical component here). We refer to members of this generic class of controllers as ΦJ. Under the assumptions made in Section 2.3, ΦJ reduces to its associated mirror law, mJ, which is detailed in (Rizzi and Koditschek 1993; Rizzi 1994) and described below.

2.5.1 The Mirror Law, mJ

For a given ball state, Tb, we begin by inverting the kinematics of the arm to calculate a robot position that would just touch the ball: q(b) = (φb, θb, 0) (recall (5)). We choose ψ = 0 to eliminate redundancy. This effectively gives us the "ball in joint-space coordinates."

Next, we express informally a robot strategy that causes the paddle to respond to the motions of the ball in four ways. Here, φr, θr, and ψr refer to the commanded joint values, whereas φb and θb refer to the values necessary for the paddle to touch the ball.

(i) φr = φb causes the paddle to track under theball at all times.


(ii) θr "mirrors" the vertical motion of the ball (as it evolves in θb). As the ball goes up, the paddle goes down, and vice versa, meeting at zero height. Differences between the desired and actual total ball energy lead to changes in paddle velocity at impact.

(iii) Radial motion or offset of the ball causes changes in θr, resulting in a slight adjustment of the normal at impact, tending to push the ball back toward the set point.

(iv) Angular motion or offset of the ball causes changes in ψr, again adjusting the normal so as to correct for lateral position errors.

For the exact formulation of mJ, the reader is referred to (Rizzi 1994).

The mirror law has three parameters for each of the three cylindrical coordinates (angular, radial, and vertical): a set point and two PD gains. In Section 2.6 we will set the vertical gains to zero to create qualitatively different behavior (palming), while in Section 4 we will generate a family of different juggling controllers by changing only the angular set point.

2.5.2 Analysis, Simulation, and Empirical Exploration of the Domain of Attraction, D(ΦJ)

The mirror law used in the experiments discussed in this paper does not permit a closed-form expression for the apex-to-apex return map. Recently, we have made strides toward developing a mirror law that has a solvable return map (Rizzi and Koditschek 1994), yet still generates similar behavior. As will be seen in the sequel, this would be very helpful to our methods, but the absence of closed-form expressions does not impede the ideas introduced in Section 1.3.

We define the domain of attraction of Φ to G by D(Φ) = {Tb ∈ TB : lim_{n→∞} fΦ^n(Tb) = G} in the absence of obstacles. For the choice of mirror-law parameters used in this paper, D(Φ) contains nearly all of TB such that the ball is above the paddle. In Section 3.2 we will constrain this definition by incorporating all the obstacles from Section 2.4.

Due to the complexity of the system, it is difficult to ascertain the shape of the boundaries of D(ΦJ) through systematic juggling experiments, especially in the presence of obstacles. Nevertheless, in Figure 8 we display experimental data used to formulate an approximation of what we will come to call, in Section 3.2, the safe domain for a juggling controller operating within the bounded workspace formed from the robot's paddle and visible region — the region of Figure 5(b) without the Beam obstacle. Data representing a sequence of typical juggling experiments used to develop this figure are available in the associated electronic Appendix (Burridge, Rizzi, and Koditschek 1997). The particular choice of mirror-law parameters for this controller comes from previous empirical work. In these plots, the axes Rho and Phi refer to the horizontal polar coordinates of the ball relative to the robot's base.

To overcome this difficulty, and to speed up deployment, we created a numerical simulation of the juggler (the source code for the simulation is included in the electronic Appendix). In Figure 9, we display the same data as in Figure 8, but run through the numerical simulation of the robot operating under ΦJ rather than the real robot. Since all but one of the starting conditions that failed on the real robot also failed in simulation, it is clear that the simulation provides a conservative approximation to the robot's true actions. This simulation is used extensively in Section 4 for finding conservative, invariant ellipsoids to approximate D(ΦJ).

2.6 The Complete Palette

In Section 4 we deploy a selection of different controllers from the palette represented by changes in the mirror-law parameters. We start with a set of parameters that gives rise to a robust juggling controller, ΦJ0. Next, we take advantage of the rotational symmetry of mJ, changing only the value of the angular component of the set point, φb, while rotating a numerically derived domain of attraction to match.

In addition to varying the set point for ΦJ, we use two other controllers based on the mirror law, but with choices of parameters that lead to qualitatively different behavior.

2.6.1 Palming: ΦP

If we set to zero the gains regulating the vertical energy of the ball, then we find that the ball will lose energy with each impact, finally coming to rest on the paddle. The lateral terms, however, will continue to stabilize the ball at the horizontal set point, G ∈ H. The closed-loop dynamics of this new behavior are significantly different from juggling, as the paddle is in continuous contact with the ball,



Figure 8: Empirical data used to estimate the juggling domain, D(ΦJ). Each "." ("+") represents in apex coordinates a ball trajectory that was successfully (unsuccessfully) handled under the action of ΦJ. Due to the properties of the vertical subsystem, most of these points are at nearly the same height, so only the horizontal coordinates are plotted.


Figure 9: Starting with the same set of data points as in Figure 8, we used the numerical simulator of ΦJ to predict whether the states would be successfully (".") or unsuccessfully ("+") handled.

exerting continuous control over it. We refer to this behavior generically as palming, ΦP. The continuous interaction between ball and paddle allows us to tune the lateral PD gains higher for ΦP than for ΦJ.

As our numerical simulator relies on the intermittent nature of contact in juggling, we cannot use it to simulate palming. Currently, we have neither an analytically nor a numerically derived domain of attraction for fΦP, but empirical results have shown that the projection of the domain into H is comparable in size to that of fΦJ, if not larger. We will use the same horizontal domain for palming as for juggling, but will only allow ΦP to be active when the ball has very low vertical energy, as suggested by empirical tests.

2.6.2 Catching: ΦC

The assumption that we may ignore the vertical subsystem becomes invalid for very low set points, and we find that ΦJ behaves poorly when the ball is nearly in continuous contact with the paddle. In these situations, the V–O subsystem may not notice small bounces quickly enough, and is very susceptible to noise. ΦP, on the other hand, has PD gains tuned too aggressively for such bounces of the ball, and should be switched on only when the ball is essentially resting on the paddle. In order to guarantee a smooth transition from juggling to palming, we introduce a catching controller, ΦC, designed to drain the vertical energy from the ball and bring it down close to rest on the paddle. Currently, a variant of ΦJ is used in which the vertical set point is very low, and the robot meets the ball higher than normal for ΦJ. For an intuitive view of how the domains for ΦJ, ΦC, and ΦP overlap, see Figure 15.


Due to the mixed nature of this controller, we have neither analytical nor numerical approximations to its domain of attraction. Instead, we use a very conservative estimate based on the numerically derived D(ΦJ0) of Section 4.1. Although this controller is not ideal, it removes enough of the vertical energy from the ball that we can safely turn on palming. Currently, catching is the dominant source of task failures, as we report in Section 4.4.

3 Formal Idea: Safe Sequential Composition

The technique proposed here is a variant of the preimage backchaining idea introduced by (Lozano-Perez, Mason, and Taylor 1984), although their algorithm was used for compliant motion in quasi-static environments. We substitute sensory events for the physical transitions that characterized their control sequences.

3.1 Sequential Composition

Say that controller Φ1 prepares controller Φ2 (denoted Φ1 ≻ Φ2) if the goal of the first lies within the domain of the second: G(Φ1) ⊂ D(Φ2). For any set of controllers, U = {Φ1, ..., ΦN}, this relation induces a directed (generally cyclic) graph, denoted Γ.

Assume that the overall task goal, G, coincides with the goal of at least one controller, which we call Φ1: G(Φ1) = G.10 Starting at the node of Γ corresponding to Φ1, we traverse the graph breadth-first, keeping only those arcs that point back in the direction of Φ1, thus ending up with an acyclic subgraph, Γ′ ⊂ Γ. The construction of Γ′ induces a partial ordering of U, but the algorithm below actually induces a total ordering, with one controller being handled in each iteration of the loop.

In the following algorithm, we recursively process controllers in U in a breadth-first manner leading away from Φ1. We build U′(N) ⊂ U, the set of all controllers that have been processed so far, and its domain, DN(U′), which is the union of the domains of its elements. As we process each controller, Φi, we define a subset of its domain, C(Φi) ⊂ D(Φi), that is the cell of the partition of state space within which Φi should be active.

10 If the goal set of more than one controller coincides with G, then we choose one arbitrarily (e.g., the one with the largest domain), and note that all the others will be added to the queue in the first iteration of the algorithm.

1. Let the queue contain Φ1. Let C(Φ1) = D(Φ1), N = 1, U′(1) = {Φ1}, and D1(U′) = D(Φ1).

2. Remove the first element of the queue, and append the list of all controllers that prepare it to the back of the queue.

3. While the first element of the queue has a previously defined cell, C, remove the first element without further processing.

4. For Φj, the first unprocessed element of the queue, let C(Φj) = D(Φj) − DN(U′). Let U′(N + 1) = U′ ∪ {Φj}, and DN+1(U′) = DN(U′) ∪ D(Φj). Increment N.

5. Repeat steps 2, 3, and 4 until the queue is empty.

At the end of this process, we will have a region of state space, DM(U′), partitioned into M cells, C(Φj). When the state is within C(Φj), controller Φj will be active. The set U′(M) ⊂ U contains all the controllers that can be useful in driving the state to the task goal. Note that the process outlined above is only used to impose a total order on the members of U′(M) — the actual cells C(Φj) need never be computed in practice. This is because the automaton can simply test the domains of the controllers in order of priority. As soon as a domain is found that contains the current ball state, the associated controller is activated, being the highest-priority valid controller.
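The breadth-first ordering and the priority-based dispatch described above can be sketched in a few lines. The sketch below is illustrative only: the `Controller` class, its interval domains, and goal points are hypothetical stand-ins for the paper's palette of ΦJ, ΦC, and ΦP controllers with their ellipsoidal domains.

```python
from collections import deque

class Controller:
    """Hypothetical local controller: a domain-membership predicate and a goal state."""
    def __init__(self, name, in_domain, goal):
        self.name = name
        self.in_domain = in_domain  # in_domain(x) -> bool: is x in D(Phi)?
        self.goal = goal            # representative goal state G(Phi)

def prepares(phi1, phi2):
    # Phi1 "prepares" Phi2 when G(Phi1) lies inside D(Phi2).
    return phi2.in_domain(phi1.goal)

def back_chain(goal_ctrl, palette):
    """Breadth-first traversal away from the goal controller, producing the
    total priority order; the cells C(Phi_j) are never computed explicitly."""
    order, queue = [goal_ctrl], deque([goal_ctrl])
    while queue:
        phi = queue.popleft()
        for cand in palette:
            if cand not in order and prepares(cand, phi):
                order.append(cand)   # first (highest-priority) encounter wins
                queue.append(cand)
    return order

def dispatch(order, x):
    """Activate the highest-priority controller whose domain contains x."""
    for phi in order:
        if phi.in_domain(x):
            return phi
    return None   # x lies outside the composite domain D(Phi)
```

For instance, with three one-dimensional controllers whose goals chain back toward the task goal, `back_chain` returns them in priority order, and `dispatch` simply scans that list at each sensory event.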

The combination of the switching automaton and the local controllers leads to the creation of a new controller, Φ, whose domain is the union of all the domains of the elements of U′ (i.e., D(Φ) = ⋃_{Φi ∈ U′} D(Φi)), and whose goal is the task goal. The process of constructing a composite controller from a set of "simpler" controllers is denoted by Φ = ⋁ U′, and is depicted in Figure 10. In that figure, the solid part of each funnel represents the partition cell, C, associated with that controller, while the dashed parts indicate regions already handled by higher-priority funnels.

3.2 Safety

It is not enough for the robot simply to reach theball and hit it – the impact must produce a “good”ball trajectory that doesn’t enter the obstacle, and


Figure 10: Sequential composition of controllers: Each controller is only active in the part of its domain not already covered by those nearer the goal. Here, Φ3 ≻ Φ2 ≻ Φ1, and Φ1 is the goal controller.

is in some sense closer to the goal. Intuitively, we need to extend the obstacle to include all those ball states which induce a sequence of impacts that eventually leads to the obstacle, regardless of how many intervening impacts there may be. We do this by restricting the definition of the domain of attraction.

We define DO(Φ), the safe domain of attraction for controller Φ, given obstacle O, to be the largest (with respect to inclusion) positive invariant subset of D(Φ) − O under the map fΦ(·).
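On a finite sample of states, one plausible way to approximate this largest positively invariant subset is to prune, repeatedly, every sampled state whose image under fΦ has already been pruned; whatever survives maps into itself. The toy integer return map in the test below is a hypothetical stand-in for the paper's fΦ, which is not available in closed form.

```python
def safe_domain(states, in_D, in_O, f):
    """Approximate the largest positively invariant subset of D - O under
    the return map f, over a finite set of sampled states.  States whose
    image leaves the candidate set (including images off the sample grid)
    are discarded until the set stabilizes."""
    safe = {s for s in states if in_D(s) and not in_O(s)}
    changed = True
    while changed:
        changed = False
        for s in list(safe):
            if f(s) not in safe:   # image escapes the candidate set
                safe.discard(s)
                changed = True
    return safe
```

Because pruning only ever shrinks the candidate set, the loop terminates, and the result is invariant by construction: every surviving state maps to a survivor.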

3.2.1 Safe Deployments

If all the domains of the controllers Φj in a deployment are safe with respect to an obstacle, then it follows that the resulting composite controller, Φ = ⋁ Φj, must also be safe with respect to the obstacle. This is because any state within D(Φ) is by definition within DO(Φj) for whichever Φj is active, and the state will not be driven into the obstacle from within DO(Φj) under the influence of fΦj.

4 Implementation: Deploying the Palette

4.1 Parameterization of Domains

While fΦ is not available in closed form, its local behavior at the goal, G, is governed by its Jacobian, DfΦ, which might be computed using calculus and the Implicit Function Theorem for simple G. However, in practice (Buehler, Koditschek, and Kindlmann 1990b; Buehler, Koditschek, and Kindlmann 1994) we have found that the Lyapunov functions that arise from DfΦ yield a disappointingly small domain around G relative to the empirical reality, and we wish to find a much larger, physically representative domain. Thus, we resort to experimental and numerical methods to find a large and usable invariant domain.

We created a "typical" juggling controller, ΦJ0, by using an empirically successful set of parameters, with goal point G0 = (ρ0, φ0, η0) = (0.6, 0.0, 4.5). Using the numerical simulator introduced in Section 2.5, we then tested a large number of initial conditions, (hi, ḣi) = (xi, yi, ẋi, ẏi), covering relevant ranges of TH (the starting height, vi, is the same for all runs, as we are not concerned here with the vertical subsystem, owing to the decoupling of H and V discussed in Section 2.5). In Figure 11 we show samples of the "velocity spines" of these tests. In each plot, one initial velocity, ḣi, is chosen, and a range of positions is mapped forward under iterates of fΦ. The shaded areas show the location of the first impact of all states that are successfully brought to G0 (denoted "∗") without leaving the workspace serviced by the paddle. In the left-hand column, ẏi = 0, while ẋi varies from large negative to large positive. On the right side, the roles of x and y are reversed. These plots demonstrate the rather complicated shape of the domain of attraction, as we have tried to suggest intuitively in Figure 2.



Figure 11: Shaded regions denote initial conditions that were successfully contained in the workspace while being brought to the goal via iteration of fΦJ. (a) Left column: varying initial ẋi from negative (top) to positive (bottom) with ẏi = 0. (b) Right column: varying initial ẏi from negative to positive with ẋi = 0. The scale for all plots is meters.


4.1.1 D(ΦJ0): The Domain of ΦJ0

Currently, we have no convenient means of parameterizing the true domain of ΦJ0, pictured in Figure 11. Yet to build the sequential compositions suggested in this paper we require just such a parameterization of domains. Thus, we are led to find a smaller domain, D0, whose shape is more readily represented, as introduced in Figure 1, and which is invariant with respect to fΦJ (i.e., fΦJ(D0) ⊂ D0). Ellipsoids make an obvious candidate shape, and have added appeal from the analytical viewpoint. Moreover, the ellipsoids we find will be considerably larger than those we have obtained in the past from Lyapunov functions derived from DfΦJ. Inclusion within an ellipsoid is also an easy test computationally — an important consideration when many such tests must be done every update (currently at 330 Hz).

To find an invariant ellipsoid, we start with a candidate ellipsoid, Ntest ⊂ TH, centered at G0, and simulate one iteration of the return map, fΦ, to get the forward image of the ellipsoid, fΦJ(Ntest). If fΦJ(Ntest) ⊂ Ntest, then Ntest is invariant under fΦJ. Otherwise, we adjust the parameters of Ntest to arrive at N′test, and repeat the process. In Figure 12, we display four slices through the surface of such an invariant ellipsoid found after many trials (smooth ellipses), along with the same slices of its single-iteration forward image (dots). Although these pictures do not in themselves suffice to guarantee invariance, they provide a useful visualization of the domain and its forward image, and comprise the primary tool used in the search.11 The invariant ellipsoid depicted in Figure 12 will be denoted D0 throughout the sequel.
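A minimal sketch of this containment test, assuming an axis-aligned ellipsoid and a stand-in return map (the paper's simulator is not reproduced here): sample states inside the candidate ellipsoid, map each forward once, and reject the candidate if any image escapes. Like the figure-based check in the text, sampling suggests but does not prove invariance.

```python
import math
import random

def contains(axes, c, x):
    """Axis-aligned ellipsoid test: sum_i ((x_i - c_i)/a_i)^2 <= 1."""
    return sum(((xi - ci) / ai) ** 2 for xi, ci, ai in zip(x, c, axes)) <= 1.0

def looks_invariant(axes, c, f, n_samples=200, seed=0):
    """Sample points in the candidate ellipsoid N_test, push each through one
    iterate of the (simulated) return map f, and check containment of the
    forward image.  Returns False as soon as one image escapes."""
    rng = random.Random(seed)
    dim = len(c)
    for _ in range(n_samples):
        # Draw a direction from a Gaussian, then a radius uniform in volume,
        # giving a point uniformly distributed in the unit ball; scale onto
        # the ellipsoid via its semi-axes.
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(t * t for t in v))
        r = rng.random() ** (1.0 / dim)
        x = [ci + ai * r * vi / norm for vi, ci, ai in zip(v, c, axes)]
        if not contains(axes, c, f(x)):
            return False
    return True
```

A contracting map passes the test, while an expanding one is quickly rejected; in the paper's setting, f would be the apex-to-apex simulation and the semi-axes the tunable parameters of Ntest.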

Relying on the rotational symmetry of the mirror law, we now use ΦJ0 to create a family of controllers by varying only the angular component of the goal point, φ, keeping all other parameters fixed, and rotating D0 to match. We denote a member of this family by ΦJ0(φ), with goal G0(φ) and domain D0(φ).

11 Since G0 is an attractor, the existence of open forward-invariant neighborhoods is guaranteed. However, "tuning the shape" of Ntest by hand can be an arduous process. This process will be sped up enormously by access to an analytically tractable return map. Even without analytical results, the search for invariance could be automated by recourse to positive definite programming (a convex nonlinear programming problem) on the parameters (in this case, the entries of a positive definite matrix) defining the shape of the invariant set. We are presently exploring this approach to the generation of Ntest.

4.1.2 DPADDLE(ΦJ0): Safety with respect to the Workspace Boundaries

The largest safe domain with respect to the paddle's workspace is indicated by Figure 11. However, since there is no obvious way of parameterizing this set (recall that geometrically close approximations to this volume in TH will typically fail to be invariant with respect to fΦJ), we resort to the more conservative approximation, D0, shown in Figure 12.

4.1.3 DBEAM(ΦJ0): Safety with respect to the Beam

In Figure 13(a), we depict the simulated iterates of fΦJ again, but add the Beam to the workspace. The zero-velocity slice of the safe domain, D0, has also been added for comparison (ellipse with dotted boundary). The gray area represents those initial states with zero initial horizontal velocity that are successfully brought to G0 without leaving the workspace or hitting the Beam (compare to the middle pictures of Figure 11). The dark region represents all states which hit the Beam, either initially or after a few bounces. Note that all states in this figure that don't hit the Beam initially are successfully brought to the goal. This happens because G0 is so strongly attracting that states on the wrong side of the Beam are simply knocked over it, whereas those already near the goal never stray far enough from it to reach the Beam. In Figure 13(b), we add horizontal velocity to our initial conditions, and note that secondary images of the Beam appear, demonstrating the added complexity in the shape of the largest safe domain arising from the introduction of the Beam.

Any ΦJ0(φ) whose domain does not intersect the Beam or its preimages, as shown in Figure 13, will be safe with respect to the Beam, as well as the paddle's workspace.

Because the Beam divides the horizontal workspace into disjoint regions, we need to take advantage of the discrete dynamics of fΦJ to cross it (recall from Section 2.3 that this is a return map). We do this by finding a disjoint, yet still invariant, safe domain for ΦJ0. We generate a series of small ellipsoids starting on the far side of the Beam, such that each maps forward into the next, and the last maps forward entirely into D0. We will call this new disconnected forward-invariant domain D1 throughout the sequel.

Figure 12: Our 2-D slices of the invariant ellipsoid, D0 ⊂ H (smooth curves), and its one-bounce forward image, fΦ(D0) (dots). [Axis data omitted.]

In principle, finding D1 is not much more difficult than finding D0. First we choose another controller close to the Beam on the far side, ΦJ0(φpre), and take a small neighborhood, N1, surrounding its goal, G0(φpre). This will be the "launch window". Next, we take the forward image of N1 under one iteration of the discrete return map, and find an ellipsoid, N2, that completely contains it. We then map N2 forward, and continue this process until we have found an ellipsoid, Nn, whose forward image is completely contained in D0. By construction, the union of ellipsoids, D1 := D0 ∪ N1 ∪ … ∪ Nn, is invariant under fΦJ, and safe with respect to the Beam and the workspace boundaries.
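The chaining construction is easy to prototype when the return map can be simulated. A minimal sketch, simplified to balls rather than general ellipsoids; the names, the sampling scheme, and the ball simplification are illustrative assumptions, not the paper's code:

```python
import numpy as np

def chain_ellipsoids(f, c1, r1, D0_center, D0_radius, max_steps=10, n=500):
    """Build the disconnected invariant domain of Section 4.1.3 in a
    simplified form: each Ni is a ball, and Ni+1 is a ball containing
    the one-bounce forward image f(Ni).  Returns the chain of centers
    and radii once the image lands entirely inside the goal domain D0."""
    rng = np.random.default_rng(1)
    centers, radii = [np.asarray(c1, float)], [float(r1)]
    for _ in range(max_steps):
        c, r = centers[-1], radii[-1]
        # push a cloud of boundary states through one bounce of the return map
        pts = np.array([f(c + r * u / np.linalg.norm(u))
                        for u in rng.normal(size=(n, len(c)))])
        c_next = pts.mean(axis=0)
        r_next = np.max(np.linalg.norm(pts - c_next, axis=1))
        # stop once the image ball fits inside D0 (triangle inequality)
        if np.linalg.norm(c_next - D0_center) + r_next <= D0_radius:
            return centers, radii
        centers.append(c_next)
        radii.append(r_next)
    raise RuntimeError("chain did not reach D0")
```

With a contracting return map the chain terminates after a few steps; as the text notes, on the real machine the simulated neighborhoods had to be replaced with empirically derived ones.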

In practice, differences between the true robot behavior and the simulator have proven too substantial for the small neighborhoods, Ni, to effectively contain the trajectory of the ball during the activation of the trans-Beam controller. Instead, we have been forced to build our D1 domain via direct experimentation. We inspect the actual experimental distribution of apex points (that should have been in N2 according to the simulation), and create an empirically-derived N′2. We then repeat this process for N3, arriving at an empirically-derived N′3, and then again to create N′4. For the present case, this turns out to be completely contained in D0. The empirically-derived trans-Beam domain, D1, used in the experiments below has four parts: N1, N′2, N′3, and D0, and is shown in Figure 14 projected onto four orthogonal planes in H.

4.2 Composition of Domains

The robot's task is to capture and contain a ball thrown into the workspace on one side of the Beam, negotiate it over the Beam without hitting it, and bring it to rest on the paddle at location GT = (0, ρ0, φT). We now describe the deployment created by hand using the controllers constructed in the previous sections to achieve this task.

Figure 13: Safe domain for ΦJ0 with the Beam inserted. Light gray areas represent successful initial states, darker areas show states that eventually hit the Beam. (a) Zero initial velocity. The appropriate slice of D0 has been added for comparison. (b) ẋ = 1.5 m/s, showing preimages of the Beam. [Axis data omitted.]

4.2.1 Deployment

The only controllers in our palette capable of stabilizing the dynamics about GT are the stable palming controllers with goal point GT, so we choose one with empirically successful parameters: ΦP(GT). The palming domain is a large ellipsoid in TH, but its projection onto V is much smaller, thus we only allow palming for states with very low vertical energy. We also use a catching controller, ΦC, that has a small domain at GT, and that successfully reduces the ball's vertical energy from typical juggling values to values near zero (i.e., palming values). This introduces the need for ΦJ0(φT), a juggler that attracts juggled balls to GT, but with non-zero vertical energy. Note that by our construction, ΦJ0(φT) ⪰ ΦC(GT) ⪰ ΦP(GT), but ΦJ0(φT) ⋡ ΦP(GT), so the system must go through catching to get from juggling to palming. This idea is depicted intuitively in Figure 15: in the left-hand figure, ΦJ draws a flat circular region of high-energy states toward its goal, GJ, which lies within DC. In the right-hand figure, ΦC draws its (ellipsoidal) domain down in vertical energy to its goal, which lies within DP, a flat circular region of low energy.
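The prepares chain described here (each controller's goal lying in the domain of the next) reduces, for a point goal and an ellipsoidal domain, to a single quadratic-form test. A minimal sketch under those simplifying assumptions; the function name and the point-goal/ellipsoid-domain forms are illustrative, not the paper's implementation:

```python
import numpy as np

def prepares(goal1, center2, Q2):
    """Check the 'prepares' relation, Phi1 >= Phi2, in the simplest
    illustrative case: the goal of Phi1 is a single point and the
    domain of Phi2 is the ellipsoid {x : (x - c2)^T Q2 (x - c2) <= 1}."""
    d = np.asarray(goal1, float) - np.asarray(center2, float)
    return float(d @ Q2 @ d) <= 1.0
```

Checking ΦJ0(φT) ⪰ ΦC(GT) then amounts to calling this with the juggler's goal point and the catching domain's shape matrix.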

Starting with ΦJ0(φT), we add new jugglers with new values of φ such that each new goal lies just within the last domain (which was a rotated copy of D0), repeating this process until we reach φ0, and the Beam blocks further progress. At this point we use D1, the trans-Beam domain, for ΦJ0(φ0), as described in Section 4.1.3.

Finally, we create another sequence of rotationally symmetric controllers on the far side of the Beam, starting with one whose goal lies at G0(φpre), and continuing until the edge of the visible workspace blocks further progress. The entire palette of controllers is shown in Figure 16. Note that all domains are simply rotated copies of D0 except the four ellipsoids surrounding the Beam, which make up the invariant domain, D1, for the trans-Beam controller of Figure 14. Some of the interesting parameters of this deployment are listed in Table 1.

    Ellipses   Type   Goal: φ   Domain Type
    1          ΦP       0.3     DP
    2          ΦC       0.3     DC
    3          ΦJ       0.3     D0
    4          ΦJ       0.15    D0
    5 - 8      ΦJ       0.0     D1
    9          ΦJ      −0.64    D0
    10         ΦJ      −0.81    D0
    11         ΦJ      −0.97    D0
    12         ΦJ      −1.12    D0
    13         ΦJ      −1.26    D0
    14         ΦJ      −1.4     D0

Table 1: The full deployment, with controller types, goal points, and domain types. All goal points have the same radial component, ρ0 = 0.6, so we list here only the angular component, φ. Ellipse numbers correspond to those in Figure 16.

Figure 14: Four projections of the union of the disjoint ellipsoids, D1 := D0 ∪ N1 ∪ N2 ∪ N3, created by analysis of empirical data. Notice the velocity components of N2, which consists of apex points over the Beam. [Axis data omitted.]

Every member of this deployment was chosen by hand to ensure that a path exists from all the domains to the task goal state. The algorithm of Section 3.1 leads in a straightforward manner to a linear tree with no branches; thus the ideal trajectory of the system should be a stepwise descent from the first cell of the partition entered, through all the intermediate cells, and finally to the catching and then the palming cell, where it should remain.
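The backchaining that produces this priority ordering, and the resulting switching rule, can be sketched as follows; the data-structure choices (integer controller ids, a `prepares` adjacency map, predicate-valued domains) are illustrative assumptions, not the system's actual representation:

```python
def backchain_priorities(controllers, prepares, goal_id):
    """Order controllers by backchaining away from the task goal, in the
    spirit of Section 3.1: the goal controller gets priority 0, and any
    controller that 'prepares' an already-ordered one is appended next.
    prepares[i] lists the controllers whose domains contain goal(i)."""
    order, frontier = [goal_id], [goal_id]
    while frontier:
        nxt = [i for i in controllers
               if i not in order and any(j in prepares[i] for j in order)]
        order += nxt
        frontier = nxt
    return {cid: rank for rank, cid in enumerate(order)}

def active_controller(x, deployment, priority):
    """At each impact, run the highest-priority (lowest-rank) controller
    whose domain contains the current estimated ball state x."""
    candidates = [cid for cid, dom in deployment.items() if dom(x)]
    return min(candidates, key=priority.get) if candidates else None
```

With the linear tree of this deployment, `active_controller` realizes exactly the stepwise descent described above: each impact hands the ball to a cell no farther from the goal.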

4.2.2 A Hierarchy of Controllers

The deployment and controller-switching scheme we have just described relies for its formal properties on the abstracted domain and goal properties of the constituent return maps. In turn, the invariance and attracting properties of fΦ arise from abstractions whose validity depends on the underlying tracking and convergence properties of the continuous controller and observer. Thus, we have a hierarchy of controllers (and, hence, of abstraction), where each level functions on the basis of certain assumptions about the quality of performance of the lower levels. As we will see in Section 4.4.3, however, there will be situations where the transients in the robot's tracking of the reference are too large for it to get to the ball despite the mirror law, thus violating the underlying assumptions at the lower level. Moreover, we will see in Section 4.3 that the noise in the system sometimes causes the ball to stray from all the domains of the deployment, despite their theoretical invariance, violating the properties assumed by the higher level.

The entire deployment of Section 4.2.1 may be thought of as generating a single (complex) controller, Φ, with its own goal and domain, which could be used as an element of a still more complex deployment. When this is done, there will be assumptions about the ability of Φ to safely move the ball across the Beam and to its goal, and so the process continues.

In summary, a hierarchical system will function only as well as its constituent abstractions can be realized empirically. Since no model is ever perfect, all theoretical guarantees are subject to operating assumptions. It is important to keep in mind the lower-level assumptions being made, so that empirical settings wherein such assumptions fail can be identified.

Figure 15: Juggle to Catch to Palm: (a) The Juggle drives all of its domain to GJ, which lies within the domain for the Catch, but with non-zero vertical energy. (b) The Catch reduces the vertical energy of the ball, bringing it down into the domain for Palming.

Figure 16: A theoretically sound deployment using the jump controller of Figure 14. [Axis data omitted.]

4.3 Characteristics of the Test Runs

In Figure 16, we depict the entire deployment of the previous section by superimposing the zero-velocity slices of all the various domains onto the horizontal workspace, along with projections of the non-zero velocity parts of D1. There are a total of fourteen ellipsoids for eleven different controllers: the union of four of them is associated with the trans-Beam controller's domain, D1, and the same ellipsoid is used for the final juggling and palming controllers (numbers 1 and 3 in the figure). This is our hand-crafted realization of the concept depicted in Figure 4.

4.3.1 A Typical Run

Figure 17(a) shows the trace of a typical run, with the active controller plotted against time, while Figure 17(b) shows the same run projected onto the horizontal workspace, superimposed on the deployment of Figure 16. Typically, the ball was introduced into a cell of the partition that had very low priority, such as ellipses 13 or 14 in Figure 16. The robot brought it fairly quickly, within a few bounces, to the controller closest to the Beam on the far side (9), where it stayed for some time, waiting for the ball to enter the launch window, N1 (8), of the trans-Beam controller's domain, D1.

At this point the "messiness" of the system mentioned in Section 1.4 is most evident, as the observer (recall Figure 6) sometimes "hallucinates" that the ball state is within N1 (8), activating the trans-Beam controller, and then changes its estimate, which takes the ball state back out of N1, turning on the pre-Beam controller (9) again (these spurious transitions are labeled "Launch Attempts" in Figure 17(a)). Although the eventual action of the robot at impact (dots in Figure 17) is almost always correct, the third degree of freedom, ψr, visibly shifts back and forth during the flight of the ball due to the large difference in goal location between the pre- and trans-Beam controllers (9 and 8). Eventually, the true ball state enters N1, the trans-Beam controller (8) becomes active, and the task proceeds.

Figure 17: (a) Cell number (mode, also ellipse number from Figure 16) plotted against time for a typical run, with a trace of the ball's vertical position added for reference. The dots represent the mode at the moment of impact. (b) Horizontal projection of the same trace, superimposed on the deployment of Figure 16. The dots show the apex positions. [Axis data omitted.]

In the particular run shown in Figure 17, the ball overshoots controllers (5) and (4), and goes straight from (6) to (3). The first catch attempt is unsuccessful, with the ball exiting DC (2) before entering DP (1), so the juggle (3) turns back on; after a couple of hits, ΦC turns on again, and the ball is brought down to the paddle successfully.

4.3.2 Descriptive Statistics

In Table 2 we list the percentage of successful trials within each of three sets of experiments to be described below. We also show, for all successful trials, the mean and standard deviation of the length of time from introduction to pre-launch mode, the time spent there before launch, the time from launch to palming, and the total time from start to finish. We also show the total number of regressive steps considered. The "hallucinated" mode switches result from controller transitions based on false observer transients, which change back after the observer settles. All regressive steps actually taken were associated with failed Catch attempts, where the ball moved out of DC before it was in DP, and sometimes moved to juggles further away than (3). The regressive steps, both those "hallucinated" and those taken, remind us of the gulf between the ideal assumption, T̂b ≈ Tb, and reality, as discussed in Section 2.1 and Section 4.4.3.
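The distinction between "hallucinated" and actually taken regressive switches can be made precise on an event log. A sketch, assuming a hypothetical log of time-ordered switch and impact events; this log format is illustrative, not the system's actual telemetry:

```python
def classify_regressions(events):
    """Classify regressive mode switches from a time-ordered event log.
    events is a list of ("switch", mode) and ("impact", mode) pairs, where
    a higher mode number means lower priority (farther from the goal).
    A regressive switch is "hallucinated" if the controller switches back
    down before the next impact, and "taken" otherwise."""
    hallucinated = taken = 0
    for i, (kind, mode) in enumerate(events):
        if kind != "switch" or i == 0:
            continue
        prev_mode = events[i - 1][1]
        if mode <= prev_mode:          # progress (or no change): not regressive
            continue
        # scan forward: does an impact occur before we switch back down?
        outcome = "taken"
        for kind2, mode2 in events[i + 1:]:
            if kind2 == "impact":
                break
            if kind2 == "switch" and mode2 < mode:
                outcome = "hallucinated"
                break
        if outcome == "hallucinated":
            hallucinated += 1
        else:
            taken += 1
    return hallucinated, taken
```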

4.4 Deployment of Invariant Sets

We ran three sets of experiments, each with sixty trials. We started off carefully introducing the ball to a theoretically correct deployment. Next, we added a set of "catch-alls" (controllers with domains not demonstrably safe, yet practically effective) and once again carefully introduced the ball. We also modified DC to be more conservative in order to reduce the number of failures due to catching, letting a catch-all surrounding it save the ball if it strayed out before reaching DP. Finally, we ran the second deployment again, but this time with wild, almost antagonistic, introductory tosses of the ball. The results of these experiments we now discuss in detail and have summarized in Table 2.

4.4.1 Theoretically Correct Deployment

In our first set of experiments, the deployment was exactly that of Figure 16, an exact implementation of the formal algorithm presented in Section 3.1. The ball was carefully introduced into the workspace 60 times, 49 of which (82%) were successfully negotiated along the workspace, over the Beam, and safely brought to rest on the paddle. Five times the ball was knocked out of all safe domains while going over the Beam, and six times during the transition from catching to palming, which is still being refined.

Some statistics concerning the various segments of a typical run are given in the first column of Table 2. Note that more than half of the total time is spent waiting for the ball to enter the launch window and get knocked over the Beam, and differences in the length of this delay account for most of the variance in total task time. This results from the launch window being small relative to the noise in the robot-ball system.

                                                 Experiment 1   Experiment 2   Experiment 3
    Number of successful runs:                     49 (82%)       57 (95%)       23 (38%)
    Time (s) to reach Pre-launch mode:   mean        3.954          2.775          1.276
                                     std. dev.       2.086          1.283          0.428
    Time (s) spent in Pre-launch mode:   mean        9.811          8.368         13.671
                                     std. dev.      10.020          5.879          8.498
    Time (s) from Launch to Palming:     mean        3.871          5.837          4.351
                                     std. dev.       1.821          4.456          2.204
    Time (s) from Throw to Palming:      mean       17.636         16.980         19.299
                                     std. dev.      10.190          7.809          8.300
    Regressive mode switches:    "hallucinated"        98             71             34
                                  actually taken        4              0              0
    Failed Catch attempts:            recovered         6             52              7
                                           lost         6              3              1

Table 2: Statistics characterizing three sets of 60 trials each. The first three rows give a breakdown for the successful runs of the total time into three segments: Throw to Pre-launch, time spent in Pre-launch, and Launch to Palming. The fourth row has the same statistics for the total time from Throw to Palm. Regressive mode switches are "hallucinated" if the mode switches back before impact.

The imperfect success rate for this formally correct control scheme belies the exact accuracy of our models and assumptions. Nevertheless, despite the "messiness" of our physical system, the relatively high success rate validates the utility of our approach in a practical setting.

4.4.2 Adding “Safety Nets”

Given the inevitable infidelities of the robot and ball with respect to the ideal models of ball flight, impacts, and robot kinematics, a physical implementation will remain roughly faithful to the simulation results, but will inevitably depart frequently enough, and with significant enough magnitude, that some further "safety net" is required to bring balls that start (or "unexpectedly stray") outside the domain of the composite back into it.

Using Figure 11 as a guide, we added very low priority controllers with larger domains of attraction. These controllers have domains that in some cases extend beyond the end of the paddle (but not into the Beam), and are not theoretically invariant. Nonetheless, the idea behind our second set of experiments was to use such controllers as backups for the regular deployment, capturing the ball if it strays away from the controllers of Experiment 1, and channeling it back into the theoretically sound portion of the deployment. In addition, the presence of the catch-all controllers allowed us to reduce the size of DC, thus avoiding some of the failures resulting from bad catches.

The ball was again carefully introduced 60 times (video of a typical experiment can be seen in the electronic Appendix). Three failures resulted from bad transitions from catching to palming, but the remaining 57 runs (95%) were successful. The success of these catch-all controllers confirms the simulation results of Figure 11, suggesting that there exist larger invariant domains than the conservative D0, or even D1.

In the second column of Table 2, we give statistics concerning these test runs. Note that the more aggressive domains bring the ball to the pre-launch mode faster, and actually launch the ball slightly faster. However, the more conservative domain for catching results in longer delays before palming, with many more failed catch attempts, although the catch-alls nearly always provided a safe recovery.

4.4.3 Testing the Limits

The final set of tests involved significantly wilder introductions of the ball. Several new controllers were added to handle the high lateral velocities of balls thrown into the workspace. Although the base motor is quite powerful, we found that the robot could react quickly enough to contain only those throws landing relatively near its paddle. For those landing further away, the mirror law generated a good reference trajectory, but the robot was not physically able to track it in time (the ball lands, i.e., reaches z = 0, in roughly 10 camera frames, or 0.17 s). Of the balls that were thrown "well" and landed near the paddle, including some with large horizontal velocity, 23 of 25 (92%) were successfully contained and negotiated to the final state. In fact, the robot succeeded 23 of 24 times when its favorable initial configuration enabled it to hit the ball more than twice.

This experiment demonstrates, again, the problems that may arise when the underlying assumptions (in this case, that the robot is successfully tracking the mirror law reference) are no longer valid, as discussed in Section 4.2.2. Since the mirror law generates "reasonable" trajectories during the normal course of juggling, it would suffice to ensure that the robot start in a state from which it can handle any ball state within any domain of the deployment. This was not possible within our taxing framework, thus the ball had to be introduced near the paddle in order for the robot to contain it.

The other breakdown of our assumptions comes when T̂b is not tracking Tb correctly at the moment of impact. This leads to the incorrect choice of partition, and thus of control law, leading to undesirable behavior. Since our observer is quite good, this only occurs when the ball meets the paddle soon after leaving it, which only happens during catches. Thus, we see that when the ball is thrown in such a manner as to allow the robot to track the mirror law, it is the failure of the observer to track the ball that is the dominant source of failure.

5 Conclusions

We have described an approach to creating switching control algorithms that provides a theoretical guarantee of the safety of the resultant controller based on the properties of the constituent controllers. Although the methods proposed in this paper require theoretical constructs that may be difficult or impossible to obtain exactly, we have shown that conservative approximations can be used in their place with very good success. We believe that the developing systems discipline described herein may be extended to build a variety of useful dexterous machines that are similarly single-minded in their pursuit of the user's goal behavior and able to surmount unanticipated perturbations along the way.

Although the robot is running with a deployment of many controllers, it is difficult for a casual observer to tell that there is more than one controller. The machine's motion seems coordinated, the progress of the ball toward its goal seems reasonably timely (see Table 2), and the response to obstacles, and especially to unanticipated disturbances (such as an experimenter deflecting the ball in flight), looks dexterous.

There are several interesting directions of further study highlighted by these experiments. Clearly, we need to develop automated methods for finding larger invariant domains than the extremely conservative D0 found in Section 4.1. This would allow fewer members in the deployment. It would also be useful to extend the model introduced in Section 2 to explicitly include robot states in the design of a deployment of Section 3, avoiding some of the problems mentioned in Section 4.4.3.

The deployment created and used in Section 4 was carefully hand-crafted in order to ensure that there was a path from the starting region to the goal. We would like to develop automatic methods for creating deployments from a given palette of controllers, such as the rotationally symmetric family of controllers stemming from D0. If the system were able to automatically go from G to Φ to D, then it could choose its own deployment by backchaining away from the task goal in a manner analogous to the method used in Section 4.2.1.

We are also exploring ways to move the controller smoothly from starting point to task goal. At any point in time, the local controller's goal would be moving toward the task goal by descending a potential field such as those discussed in (Rimon and Koditschek 1992). At any moment, there would be tradeoffs between the size of the domain and the velocity of the local goal toward the task goal.

6 Acknowledgments

We are particularly grateful for the helpful and insightful comments of Professors Elmer Gilbert and Mike Wellman.


References

Abraham, R., and Marsden, J. E. 1978. Foundations of Mechanics. Reading, MA: Benjamin/Cummings.

Brockett, R. W. 1983. Asymptotic stability and feedback stabilization. In Differential Geometric Control Theory, ed. R. W. Brockett, R. S. Millman, and H. J. Sussman. Birkhauser, pp. 181–191.

Buehler, M., Koditschek, D. E., and Kindlmann, P. J. 1990. A simple juggling robot: Theory and experimentation. In Experimental Robotics I, ed. V. Hayward and O. Khatib. Springer-Verlag, pp. 35–73.

Buehler, M., Koditschek, D. E., and Kindlmann, P. J. 1990. A family of robot control strategies for intermittent dynamical environments. IEEE Control Systems Magazine, 10(2):16–22.

Buehler, M., Koditschek, D. E., and Kindlmann, P. J. 1994. Planning and control of a juggling robot. International Journal of Robotics Research, 13(2):101–118.

Burridge, R. R., Rizzi, A. A., and Koditschek, D. E. 1997. Obstacle avoidance in intermittent dynamical environments. In Experimental Robotics IV, ed. O. Khatib and J. K. Salisbury. Springer-Verlag, pp. 43–48.

Burridge, R. R., Rizzi, A. A., and Koditschek, D. E. 1995. Toward a dynamical pick and place. In Proc. 1995 Int. Conf. on Intelligent Robots and Systems. Los Alamitos, CA: IEEE Computer Society Press, pp. 2:292–297.

Burridge, R. R. 1996. Sequential Composition of Dynamically Dexterous Robot Behaviors. Ph.D. thesis, University of Michigan, Department of Electrical Engineering and Computer Science.

Erdmann, M. A. 1984. On motion planning with uncertainty. AI-TR-810 (MS thesis), MIT AI Lab.

Guckenheimer, J., and Holmes, P. 1983. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. New York, NY: Springer-Verlag.

Hirsch, M. W. 1976. Differential Topology. New York, NY: Springer-Verlag.

Ish-Shalom, J. 1984. The CS language concept: A new approach to robot motion design. In Proc. 1984 IEEE Conf. on Decision and Control, pp. 760–767.

Koditschek, D. E., and Buehler, M. 1991. Analysis of a simplified hopping robot. International Journal of Robotics Research, 10(6):587–605.

Koditschek, D. E. 1992. Task encoding: Toward a scientific paradigm for robot planning and control. Journal of Robotics and Autonomous Systems, 9:5–39.

Latombe, J. 1991. Robot Motion Planning. Boston, MA: Kluwer.

Lozano-Perez, T., Mason, M. T., and Taylor, R. H. 1984. Automatic synthesis of fine-motion strategies for robots. International Journal of Robotics Research, 3(1):3–23.

Lozano-Perez, T., Jones, J. L., Mazer, E., and O'Donnell, P. A. 1987. Handey: A robot system that recognizes, plans, and manipulates. In Proc. 1987 IEEE International Conference on Robotics and Automation, pp. 843–849.

Lyons, D. M., and Hendriks, A. J. 1994. Planning by adaption: Experimental results. In Proc. 1994 IEEE International Conference on Robotics and Automation, pp. 855–860.

Lyons, D. M. 1993. Representing and analyzing action plans as networks of concurrent processes. IEEE Transactions on Robotics and Automation, 9(3):241–256.

Mason, M. T. 1985. The mechanics of manipulation. In Proc. 1985 IEEE International Conference on Robotics and Automation, pp. 544–548.

Ramadge, P. J., and Wonham, W. M. 1987. Supervisory control of a class of discrete event processes. SIAM J. Control and Optimization, 25(1):206–230.

Rimon, E., and Koditschek, D. E. 1992. Exact robot navigation using artificial potential fields. IEEE Transactions on Robotics and Automation, 8(5):501–518.

Rizzi, A. A., and Koditschek, D. E. 1993. Further progress in robot juggling: The spatial two-juggle. In Proc. 1993 IEEE International Conference on Robotics and Automation, pp. 919–924.

Rizzi, A. A., and Koditschek, D. E. 1994. Further progress in robot juggling: Solvable mirror laws. In Proc. 1994 IEEE International Conference on Robotics and Automation, pp. 2935–2940.

Rizzi, A. A., and Koditschek, D. E. 1996. An active visual estimator for dexterous manipulation. IEEE Transactions on Robotics and Automation, 12(5).

Rizzi, A. A., Whitcomb, L. L., and Koditschek, D. E. 1992. Distributed real-time control of a spatial robot juggler. IEEE Computer, 25(5).

Rizzi, A. A. 1994. Dexterous Robot Manipulation. Ph.D. thesis, Yale University.

Tung, C. P., and Kak, A. C. 1994. Integrating sensing, task planning, and execution. In Proc. 1994 IEEE International Conference on Robotics and Automation, pp. 2030–2037.

Whitcomb, L. L., Rizzi, A. A., and Koditschek, D. E. 1993. Comparative experiments with a new adaptive controller for robot arms. IEEE Transactions on Robotics and Automation, 9(1):59–70.

Whitcomb, L. L. 1992. Advances in Architectures and Algorithms for High Performance Robot Control. Ph.D. thesis, Yale University.

Whitney, D. E. 1977. Force feedback control of manipulator fine motions. ASME Journal on Dynamical Systems, 98:91–97.