

Gradient-based Model Predictive Control Design with Application to Driftless Nonlinear Systems

John T. Wen∗ and Sooyong Jung†

Abstract

Systems subject to nonholonomic constraints or nonintegrable conservation laws may be modeled as driftless nonlinear control systems. Such systems are not stabilizable using continuous time-invariant state feedback. Time varying or piecewise continuous feedback are the only stabilization strategies currently available. This paper presents a new feedback stabilization algorithm for this class of systems by choosing the control action based on a gradient matrix to reduce the predicted state error. This approach is similar to model predictive control but does not consider optimality or constraints explicitly. As a result, the on-line computation load is greatly reduced. Existing stability analysis techniques for model predictive control are not directly applicable due to the lack of local stabilizability. We show that under the full rank condition of a gradient matrix, the closed loop system is globally asymptotically stable. Examples of several nonlinear control systems, including a unicycle, a double-chain system, an under-actuated satellite, and a nonholonomic integrator, are provided to illustrate the proposed method.

1 Introduction

Control systems subject to nonholonomic constraints, such as the wheel no-slip constraint in vehicles, or non-integrable conservation laws, such as the conservation of angular momentum in multi-body systems in space or free fall, may be modeled as a continuous time driftless nonlinear control system [1, 2]:

$$\dot{x} = g(x)u, \qquad x \in \mathbb{R}^n, \; u \in \mathbb{R}^m. \qquad (1)$$

For $n > m$, this system linearized about any state is not controllable, which means that it cannot be stabilized by a linear controller. Furthermore, there does not even exist a continuous time-invariant stabilizing feedback control law [3, 4]. For these reasons, this class of nonlinear systems is considered to be challenging in terms of control design. Various open loop control schemes have been proposed by using sinusoidal inputs [5], optimal control [6], piecewise constant inputs [7], approximation of holonomic trajectories [8], and search based motion planning [9]. For feedback stabilization, the only available techniques use either piecewise continuous control [10, 11] or time varying feedback [12, 13, 14, 15]. In this paper, we consider the finite difference approximation of (1):

$$x_{k+1} = x_k + t_s g(x_k) u_k, \qquad (2)$$

where $t_s$ is the sampling period. We present a new feedback stabilization scheme that chooses the control action to reduce the predicted state error. This approach is similar to model predictive control but does not consider optimization or constraints explicitly.

∗Electrical, Computer, & Systems Eng., Rensselaer Polytechnic Institute, Troy, NY 12180, [email protected]
†School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA, [email protected]


Model predictive control (MPC) has been a popular approach to address control problems involving state and input constraints and optimality considerations [16]. The basic formulation involves solving a finite horizon constrained optimal control problem at each sampling time. MPC was first proposed for linear systems [17] and later extended to nonlinear systems [18] (nonlinear MPC, or NMPC). In the absence of constraints, NMPC may be considered as a feedback stabilization approach for general nonlinear control systems [19]. Surveys of various NMPC strategies and their stability properties may be found in [20, 21]. Closed loop stability for NMPC typically requires a terminal constraint to enforce the predicted terminal state to be inside a constraint set at the end of the optimization horizon, within which the desired stabilizing property holds. The terminal equality constraint approach is used in [18, 22, 19], but is challenging to implement due to the computational requirement. A terminal inequality approach combined with a local stabilizing controller is proposed in [23, 24] in the so-called dual-mode control. In [25], stability is ensured by choosing a terminal constraint based on an approximation of the truncated portion of the infinite horizon optimization. The constrained optimization problem in these approaches is computationally demanding, and their requirement of a locally stabilizing feedback precludes application to nonholonomic systems. A contractive state constraint is used in [26] to ensure stability without necessarily solving for the optimal control, but it is in general difficult to guarantee that a feasible solution exists. The approach of ensuring stability by requiring a certain feasibility condition instead of optimality is also used in [27], but a suitable terminal constraint set and a local stabilizing controller are still required. In a similar vein, if a control Lyapunov function (CLF) is known (equivalent to the knowledge of a stabilizing feedback), then it may be used as the terminal constraint function to guarantee stability without the complete solution of the optimal control problem at each time [28, 29]. Another recently proposed NMPC scheme [30] does not require the a priori knowledge of a CLF but needs to construct an appropriate optimization index and terminal cost to ensure stability. This scheme also addresses the potential lack of robustness in NMPC methods with equality terminal constraints and a short optimization horizon [31]. The computation burden associated with the solution of a constrained optimal or suboptimal control problem has limited the application of NMPC mostly to process control problems with slow time constants. This has motivated increasing efforts to develop more computationally efficient implementations of MPC and NMPC [32]. In addition to the feasibility based approaches mentioned above (which do not require a complete solution of the optimal control problem), other more computationally efficient NMPC schemes have been proposed, e.g., based on linearization [33], linear time varying system approximation together with robust control [34], and combining NMPC with a family of stabilizing linear controllers [35].

Application of the numerous existing NMPC algorithms to driftless nonlinear systems (1) or (2) is hampered by the lack of a locally stabilizing controller. A simple nonholonomic example was given in [26], but asymptotic stability is not achieved. The recently introduced NMPC scheme in [30] shows promise and has been applied to several nonholonomic examples, but the on-line computation load associated with finding the optimal solution is still potentially high.

In this paper, we consider a feedback control strategy similar in spirit to NMPC, but the control sequence in the prediction horizon is chosen to reduce the predicted state error based on a gradient matrix rather than by minimizing an objective function. As a result, the on-line computation requirement is much reduced. The motivation of this approach is based on path planning for nonholonomic systems using a gradient-type iteration [36, 37, 38]. Though the linearized system about an equilibrium is not controllable, the system linearized about a finite trajectory (resulting in a linear time varying system) is almost always controllable. This scheme was extended to a model predictive implementation in [39], in which the look-ahead control vector is updated by a Newton-Raphson step based on the predicted error. Using the linearized time varying system in a Newton-Raphson iteration is also employed in an NMPC scheme introduced in [40]. However, here


we only require improvement of the predicted state error instead of convergence of this error to zero.

Existing stability analysis techniques in NMPC are not directly applicable to our gradient based control scheme since the optimization index is not positive definite (it is zero over the entire prediction horizon) and the system is not locally stabilizable. The goal of this paper is to show global asymptotic stability of the desired state and to demonstrate the effectiveness of this controller through a number of nonlinear examples. The only assumptions that are required are the uniform full rank condition of a gradient matrix (equivalent to the controllability of the system linearized about the predicted state trajectory) and the boundedness of the evolution of the predicted control sequence by the predicted state error.

To illustrate the proposed nonlinear control scheme, we have included several nonlinear control examples: kinematic control of a unicycle (used in [26]), a so-called double-chain system with 5 states and 3 inputs, an under-actuated satellite orientation control using two angular velocities as the control inputs, a nonholonomic integrator example from [30], and a nonlinear affine system used in [31] to demonstrate the lack of robustness in NMPC with a terminal constraint and a short horizon. In all the examples, we use a nonlinear transformation to enforce the input saturation constraint.

Notation: We use $\|x\|$ to denote the Euclidean norm, and $\|x\|_P^2 := x^T P x$, $P > 0$, to denote the weighted norm (only for states). All matrix norms are induced 2-norms (maximum singular value). The notation $I_m$ is used for the $m \times m$ identity matrix and $0_m$ denotes the $m \times m$ zero matrix. Given $f : \mathbb{R}^n \to \mathbb{R}^m$, $\frac{\partial f(x)}{\partial x}$ denotes the $m \times n$ gradient matrix, and $\frac{\partial f(z)}{\partial x}$ denotes $\left. \frac{\partial f(x)}{\partial x} \right|_{x=z}$.

2 Main Result

2.1 Overview of Algorithm

We will state our algorithm for a general discrete time-invariant nonlinear control system:

$$x_{k+1} = f(x_k, u_k) \qquad (3)$$

where $x_k \in \mathbb{R}^n$, $u_k \in \mathbb{R}^m$, and $f$ is smooth in both variables. Let $x_d \in \mathbb{R}^n$ be the desired closed loop equilibrium state that satisfies

$$x_d = f(x_d, u_d) \qquad (4)$$

for some input vector $u_d \in \mathbb{R}^m$. Note that for discretized driftless nonlinear systems (2), $u_d = 0$. We shall address the following full state feedback stabilization problem:

For each $k \ge 0$, find $u_k \in \mathbb{R}^m$ in terms of current and past state vectors, $\{x_i : i \le k\}$, such that $x_d$ is an asymptotically stable equilibrium.

At time $k$, given the state $x_k$ and an $M$-step-ahead control sequence, $u_{k,M} \in \underbrace{\mathbb{R}^m \times \cdots \times \mathbb{R}^m}_{M \text{ times}}$,

$$u_{k,M} = \begin{bmatrix} u_1^{(k)} \\ \vdots \\ u_M^{(k)} \end{bmatrix}, \qquad (5)$$

the $M$-step-ahead error is defined as

$$e_{k,M} = \phi_M(x_k, u_{k,M}) - x_d \qquad (6)$$


where $\phi_M$ denotes the state at the end of $M$ steps with the initial state $x_k$ and input sequence $u_{k,M}$.
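For concreteness, the following sketch shows one way to compute the $M$-step prediction $\phi_M$ and the predicted error (6) numerically; the function names, the stacked-vector layout, and the generic step map f are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def phi_M(f, x_k, u_kM, M, m):
    """Roll the model x_{i+1} = f(x_i, u_i) forward M steps.

    u_kM is the stacked M-step-ahead control sequence of length m*M, as in (5)."""
    x = np.asarray(x_k, dtype=float)
    for i in range(M):
        u_i = u_kM[i * m:(i + 1) * m]     # i-th block of the stacked sequence
        x = f(x, u_i)
    return x

def predicted_error(f, x_k, u_kM, x_d, M, m):
    """M-step-ahead error e_{k,M} = phi_M(x_k, u_{k,M}) - x_d, as in (6)."""
    return phi_M(f, x_k, u_kM, M, m) - np.asarray(x_d, dtype=float)
```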

The proposed NMPC algorithm is simple to describe: at the $k$th time step, the $M$-step-ahead control sequence, $u_{k,M}$, is refined to $v_{k,M}$ based on the $M$-step-ahead prediction error, $e_{k,M}$ (e.g., by using Newton-Raphson, steepest descent, Levenberg-Marquardt, etc.); the first $m$ elements of $v_{k,M}$ are used as the actual control, $u_k$; then $v_{k,M}$ is shifted forward with the last $m$ elements replaced by an appropriately chosen vector, $\bar{u}_k$. A more detailed description is given below.

Algorithm 1  Choose an initial $M$-step-ahead control sequence, $u_{0,M}$.

For $k = 0, 1, \ldots$,

1. Refine the $M$-step-ahead control sequence $u_{k,M}$ to $v_{k,M} = \begin{bmatrix} v_1^{(k)} \\ \vdots \\ v_M^{(k)} \end{bmatrix}$ based on the $M$-step-ahead predicted state error, $e_{k,M}$:
$$v_{k,M} = u_{k,M} + \Delta u_{k,M} \qquad (7)$$
where $\Delta u_{k,M}$ is an $(mM)$-vector to be chosen.

2. Generate the control action $u_k$ from $v_{k,M}$:
$$u_k = v_1^{(k)} = \Lambda v_{k,M} \qquad (8)$$
where $\Lambda := \begin{bmatrix} I_m & 0_m & \ldots & 0_m \end{bmatrix}$.

3. Update the $M$-step-ahead sequence vector:
$$u_{k+1,M} = \begin{bmatrix} u_1^{(k+1)} \\ \vdots \\ u_{M-1}^{(k+1)} \\ u_M^{(k+1)} \end{bmatrix} = \begin{bmatrix} v_2^{(k)} \\ \vdots \\ v_M^{(k)} \\ \bar{u}_k \end{bmatrix} = \Gamma v_{k,M} + \Phi \bar{u}_k \qquad (9)$$
where $\bar{u}_k$ is an $m$-vector to be chosen, and $\Gamma : \mathbb{R}^{mM} \to \mathbb{R}^{mM}$ and $\Phi : \mathbb{R}^m \to \mathbb{R}^{mM}$ are defined as
$$\Gamma := \begin{bmatrix} 0_m & I_m & 0_m & \ldots & 0_m \\ 0_m & 0_m & I_m & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0_m \\ 0_m & & & 0_m & I_m \\ 0_m & \ldots & & & 0_m \end{bmatrix}, \qquad \Phi := \begin{bmatrix} 0_m \\ \vdots \\ 0_m \\ I_m \end{bmatrix}.$$

If the Newton-Raphson algorithm is used in Step 1, then the control sequence update law is given by

$$\Delta u_{k,M} = -\eta_k D_k^\dagger e_{k,M}, \qquad (10)$$

where $D_k := \frac{\partial \phi_M(x_k, u_{k,M})}{\partial u_{k,M}} \in \mathbb{R}^{n \times (mM)}$ is the gradient of the predicted state with respect to the $M$-step-ahead control sequence, $D_k^\dagger$ is the Moore-Penrose pseudo-inverse of $D_k$, and $\eta_k$ will be


determined to satisfy Assumption 1 in the next section. If (3) is a linear time invariant (LTI) system, then Algorithm 1 with (10) is an $(mM)$th order linear compensator which degenerates to a static full state feedback for $\eta_k = 1$. If $M = n$, it becomes a dead beat controller with gain given by the Ackermann formula placing all closed loop poles at the origin.

If $D_k$ is of full row rank (i.e., rank $n$), then $D_k^\dagger$ is just the right inverse of $D_k$:

$$D_k^\dagger = D_k^T (D_k D_k^T)^{-1}. \qquad (11)$$

In terms of implementation, the update law (10) can be performed more efficiently by solving a matrix equation instead of computing the pseudo-inverse explicitly [41]:

$$D_k \Delta u_{k,M} = -\eta_k e_{k,M}. \qquad (12)$$

If $D_k$ is very large (e.g., due to a long prediction horizon or large input and state dimensions), other efficient numerical methods such as the matrix-free method used in finite element analysis [42] may be employed to avoid explicit calculation and storage of $D_k$.
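As a rough illustration of one pass through Steps 1–3 with the Newton-Raphson refinement (10) solved in least-squares form (12), consider the sketch below, which builds on the phi_M helper above. The finite-difference gradient, the fixed step size eta, and the choice $\bar{u}_k = 0$ (appropriate for driftless systems with $u_d = 0$) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gradient_Dk(f, x_k, u_kM, M, m, n, eps=1e-6):
    """Finite-difference approximation of D_k = d(phi_M)/d(u_{k,M})."""
    base = phi_M(f, x_k, u_kM, M, m)
    D = np.zeros((n, m * M))
    for j in range(m * M):
        du = np.zeros(m * M)
        du[j] = eps
        D[:, j] = (phi_M(f, x_k, u_kM + du, M, m) - base) / eps
    return D

def mpc_step(f, x_k, u_kM, x_d, M, m, n, eta=1.0):
    """One iteration of Algorithm 1 with the Newton-Raphson update (10)/(12)."""
    e_kM = phi_M(f, x_k, u_kM, M, m) - x_d
    D_k = gradient_Dk(f, x_k, u_kM, M, m, n)
    # Step 1: refine the sequence by solving D_k * du = -eta * e_{k,M} in least squares.
    du, *_ = np.linalg.lstsq(D_k, -eta * e_kM, rcond=None)
    v_kM = u_kM + du
    # Step 2: the first block of v_{k,M} is the control actually applied, eq. (8).
    u_k = v_kM[:m].copy()
    # Step 3: shift the sequence and append u_bar_k (here u_bar_k = u_d = 0).
    u_next = np.concatenate([v_kM[m:], np.zeros(m)])
    return u_k, u_next, e_kM
```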

The gradient matrix $D_k$ is related to the system linearized about the predicted state trajectory (which is a linear time varying system):

$$\delta x_{k+i+1} = A_i \delta x_{k+i} + B_i \delta u_{i+1}^{(k)}, \quad \delta x_k = 0, \quad i = 0, \ldots, M-1, \qquad (13)$$
$$A_i := \frac{\partial f(x_{k+i}, u_{i+1}^{(k)})}{\partial x}, \qquad B_i := \frac{\partial f(x_{k+i}, u_{i+1}^{(k)})}{\partial u},$$
$$D_k = \left[\, A_{M-1} A_{M-2} \cdots A_1 B_0 \;\middle|\; A_{M-1} \cdots A_2 B_1 \;\middle|\; \cdots \;\middle|\; A_{M-1} B_{M-2} \;\middle|\; B_{M-1} \,\right]. \qquad (14)$$

The full row rank condition on $D_k$ is equivalent to the controllability of the above linear time varying system. In fact, $D_k D_k^T$ is precisely the controllability gramian of this system. Given $x_k$, the control sequences at which $D_k$ loses rank are exactly the singular controls of the optimal control problem for (3) with only the terminal state cost. These singularities have been shown to be non-generic [38] for continuous time driftless systems in the sense that the non-singular controls form a dense set in $C^\infty$. In [43], a sufficient condition, called the strong bracket generating condition, is derived for a driftless system to have only singular controls that are identically zero. This condition is satisfied by the so-called unicycle model (see Sec. 3.1), but is too stringent in general. Singular controls for certain classes of driftless and affine nonlinear systems (including multi-chain systems) are completely characterized in [44]. Singular controls have also been considered in the context of mobile robots in [45].
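Equivalently, $D_k$ can be assembled from the Jacobians $A_i$, $B_i$ along the predicted trajectory as in (13)–(14); a minimal sketch, assuming Jacobian functions dfdx and dfdu are available, is given below.

```python
import numpy as np

def gradient_Dk_linearized(f, dfdx, dfdu, x_k, u_kM, M, m, n):
    """Assemble D_k from the linearization (13)-(14) along the predicted trajectory."""
    x = np.asarray(x_k, dtype=float)
    A, B = [], []
    for i in range(M):                       # forward pass along the prediction
        u_i = u_kM[i * m:(i + 1) * m]
        A.append(dfdx(x, u_i))
        B.append(dfdu(x, u_i))
        x = f(x, u_i)
    D = np.zeros((n, m * M))
    for i in range(M):                       # block i of (14): A_{M-1} ... A_{i+1} B_i
        block = B[i]
        for j in range(i + 1, M):
            block = A[j] @ block
        D[:, i * m:(i + 1) * m] = block
    return D
```

The product $D_k D_k^T$ computed from this matrix is the controllability gramian mentioned above, so its minimum eigenvalue gives a direct numerical check of the full rank condition.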

With Algorithm 1, the closed loop system may be represented as the interconnected system shown in Figure 1:

$$P: \quad x_{k+1} = f(x_k, u_k), \quad u_k = \Lambda v_{k,M} \qquad (15)$$
$$K: \quad x_{j+1}^{(k)} = f(x_j^{(k)}, \Lambda u_j^{(k)}), \quad x_0^{(k)} = x_k \qquad (16)$$
$$\phantom{K:} \quad u_{j+1}^{(k)} = \Gamma u_j^{(k)}, \quad u_0^{(k)} = u_{k,M} \qquad (17)$$
$$\phantom{K:} \quad v_{k,M} = u_{k,M} + \Delta u_{k,M}(e_{k,M}), \quad e_{k,M} = x_M^{(k)} - x_d \qquad (18)$$
$$Q: \quad u_{k+1,M} = \Gamma v_{k,M} + \Phi \bar{u}_k(e_{k,M}). \qquad (19)$$

We will use the following steps to show the closed loop asymptotic stability:

1. We first show that under Assumption 1, which is essentially a full rank condition on $D_k$, $\Delta u_{k,M}$ and $\bar{u}_k$ may be chosen so that $e_{k,M}$ converges asymptotically and monotonically.


2. We next show that under Assumption 2, which imposes bounds on $\Delta u_{k,M}$ and $\bar{u}_k$ as a function of $e_{k,M}$, $u_k$ converges to $u_d$, and $u_{k,M}$ and $v_{k,M}$ converge to $\mathbf{u}_d := \begin{bmatrix} u_d \\ \vdots \\ u_d \end{bmatrix}$.

3. From step 2, $u_{k+j}$ and $\Lambda u_j^{(k)}$ will be arbitrarily close for $k$ sufficiently large. By the continuity of $f$, it follows from (15)–(16) that $x_{k+M}$ converges to $x_M^{(k)}$, which in turn converges to $x_d$ from step 1.

4. Finally, with $u_{0,M}$ chosen to be continuous with respect to $x_0$, $x_d$ is stable in the sense of Lyapunov. Combining this with the global asymptotic convergence in step 3, $x_d$ is a globally asymptotically stable equilibrium.

Figure 1: Closed Loop System under Algorithm 1

2.2 Asymptotic Convergence of the Predicted State Error

The first step of the stability proof is to show that the $M$-step-ahead prediction error, $e_{k,M}$, converges to zero under the following assumption, which states that the combination of the $M$-step-ahead control refinement and an appropriate choice of $\bar{u}_k$ will result in a decrease of the $M$-step-ahead prediction error.

Assumption 1  For all $k \ge 0$, the following conditions hold:

1. The $M$-step-ahead error under the refined $M$-step-ahead sequence, $v_{k,M}$, satisfies the bound
$$\left\| \phi_M(x_k, v_{k,M}) - x_d \right\|_P \le \lambda_k \left\| \phi_M(x_k, u_{k,M}) - x_d \right\|_P \qquad (20)$$
where $\lambda_k$ is a positive constant and $P > 0$.

2. A control vector $\bar{u}_k$ (in (9)) can be found to satisfy the following bound:
$$\left\| \phi_{M+1}(x_k, (v_{k,M}, \bar{u}_k)) - x_d \right\|_P \le \alpha_k \left\| \phi_M(x_k, v_{k,M}) - x_d \right\|_P, \qquad (21)$$
where $\alpha_k$ is a positive constant.


3. The constants $\lambda_k$ and $\alpha_k$ from (20)–(21) satisfy
$$\lim_{k \to \infty} \prod_{j=0}^{k} \lambda_j \alpha_j = 0 \qquad (22)$$
$$\alpha_k \lambda_k \le 1, \quad \text{for all } k \ge 0. \qquad (23)$$

Suppose that $f$ is affine in the control, i.e.,

$$f(x, u) = f_0(x) + g(x) u. \qquad (24)$$

In this case, the best $\bar{u}_k$ (the one that minimizes the left hand side of (21)) is

$$\bar{u}_k = -b^\dagger a = -(b^T P b)^{-1} b^T P a \qquad (25)$$

where

$$a = f_0(\phi_M(x_k, v_{k,M})) - x_d, \qquad b = g(\phi_M(x_k, v_{k,M})), \qquad b^\dagger = (b^T P b)^{-1} b^T P.$$

With this choice, $\alpha_k$ in (21) becomes

$$\alpha_k = \left\| P^{\frac{1}{2}} \left( \frac{\partial f_0(z_1)}{\partial x} + \frac{\partial (g(z_2) u_d)}{\partial x} \right) P^{-\frac{1}{2}} \right\| \, \left\| I - P^{\frac{1}{2}} b (b^T P b)^{-1} b^T P^{\frac{1}{2}} \right\| \qquad (26)$$

where $z_i = \xi_i \phi_M(x_k, v_{k,M}) + (1 - \xi_i) x_d$ for some $\xi_i \in [0, 1]$, $i = 1, 2$. For bilinear systems, i.e., when $f_0(x)$ and $g(x)$ are both linear (e.g., as in the double-chain example in Sec. 3.2, the orientation control example in Sec. 3.3, and the nonholonomic example in Sec. 3.4), the first norm is a constant and the second norm is upper bounded by 1. Then $\alpha_k$ may be chosen as a constant (the first norm in (26)). Note that the choice of $P$ also affects the size of the second norm.

For discretized driftless systems (2), $\alpha_k$ from (26) satisfies $\alpha_k \le 1$. In fact, we could also just choose $\bar{u}_k = u_d = 0$ and $\alpha_k = 1$. In this case, Assumption 1 reduces to the uniform full row rank condition on $D_k$ (i.e., the minimum singular value of $D_k$ is uniformly positive). If $D_k$ is of uniform full row rank, then $\eta_k$ may be updated by using a line search to ensure $\lambda_k < 1$ uniformly. This implies that Assumption 1.3 is satisfied with $\alpha_k \lambda_k < 1$ uniformly.

We now show that it follows from Assumption 1 that the $M$-step-ahead error is non-increasing and converges to zero asymptotically.

Lemma 1  Given the initial $M$-step-ahead control sequence $u_{0,M}$ and the control law in Algorithm 1 satisfying Assumption 1, $e_{k,M}$ converges to 0 as $k \to \infty$ and satisfies the bound

$$\| e_{k+j,M} \|_P \le \left( \prod_{i=0}^{j-1} \alpha_{k+i} \lambda_{k+i} \right) \| e_{k,M} \|_P, \quad \text{for all } k, j \ge 0. \qquad (27)$$

Proof: Using (6), $e_{k+1,M}$ can be written as

$$e_{k+1,M} = \phi_M(x_{k+1}, u_{k+1,M}) - x_d = \phi_M(f(x_k, u_k), u_{k+1,M}) - x_d = \phi_{M+1}(x_k, \{u_k, u_{k+1,M}\}) - x_d.$$


By using (9) and the fact that $u_k$ is just the first element of $v_{k,M}$, we have

$$e_{k+1,M} = \phi_{M+1}(x_k, \{v_{k,M}, \bar{u}_k\}) - x_d.$$

Applying (21) in Assumption 1, we get

$$\| e_{k+1,M} \|_P = \left\| \phi_{M+1}(x_k, \{v_{k,M}, \bar{u}_k\}) - x_d \right\|_P \le \alpha_k \left\| \phi_M(x_k, v_{k,M}) - x_d \right\|_P.$$

By (20) in Assumption 1,

$$\| e_{k+1,M} \|_P \le \lambda_k \alpha_k \left\| \phi_M(x_k, u_{k,M}) - x_d \right\|_P = \lambda_k \alpha_k \| e_{k,M} \|_P.$$

The bound (27) then follows directly. The asymptotic convergence of $e_{k,M}$ follows from (22) in Assumption 1.

As discussed earlier, for discretized driftless nonlinear systems, if $D_k$ is of uniform full row rank, then $e_{k,M}$ converges to 0 exponentially.

Instead of the two-step approach in Assumption 1, i.e., choosing $\Delta u_{k,M}$ based on (20) and $\bar{u}_k$ based on (21), we can choose $(\Delta u_{k,M}, \bar{u}_k)$ together to satisfy the monotonically decreasing requirement on $e_{k,M}$. However, Assumption 2 below will then be harder to guarantee.

2.3 Asymptotic Convergence of the Control Signal

We next show that the convergence of the predicted error implies the convergence of $u_{k,M}$, $v_{k,M}$, and $u_k$. To this end, we assume that $\Delta u_{k,M}$ and $\bar{u}_k$ in (18)–(19) are bounded by the prediction error, $e_{k,M}$.

Assumption 2  There exist positive constants $\beta$ and $\rho$ such that for all $k \ge 0$

$$\left\| \Delta u_{k,M} \right\| = \left\| v_{k,M} - u_{k,M} \right\| \le \beta \| e_{k,M} \|_P \qquad (28)$$
$$\| \bar{u}_k - u_d \| \le \rho \| e_{k,M} \|_P. \qquad (29)$$

For the Newton-Raphson update (10), the boundedness assumption on $\Delta u_{k,M}$, (28), is equivalent to the uniform full row rank condition on $D_k$ (which is also needed for Assumption 1), since $\| D_k^\dagger \| = 1/\sigma_{\min}(D_k)$, where $\sigma_{\min}(D_k)$ is the minimum singular value of $D_k$.

Now consider the boundedness assumption on $\bar{u}_k$, (29). For affine nonlinear systems, if the optimal $\bar{u}_k$ is chosen as in (25), then

$$\| \bar{u}_k - u_d \| = \left\| -b^\dagger a - u_d \right\| = \left\| g^\dagger(\phi_M(x_k, v_{k,M}))(x_d - f_0(\phi_M(x_k, v_{k,M}))) - g^\dagger(x_d)(x_d - f_0(x_d)) \right\| \le \left\| \frac{\partial \left( g^\dagger(z)(x_d - f_0(z)) \right)}{\partial x} P^{-\frac{1}{2}} \right\| \| e_{k,M} \|_P,$$

where $z = \xi \phi_M(x_k, v_{k,M}) + (1 - \xi) x_d$ for some $\xi \in [0, 1]$. For driftless nonlinear systems, if $\bar{u}_k$ is chosen according to (25), then $\rho$ is an upper bound of $\left\| g^\dagger(x) P^{-\frac{1}{2}} \right\| = \left\| (g^T(x) P g(x))^{-1} g^T(x) P^{\frac{1}{2}} \right\|$. If $\bar{u}_k$ is chosen to be $u_d$ (i.e., 0), then $\rho = 0$ and (29) is trivially satisfied.
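A minimal sketch of the weighted pseudo-inverse choice (25) of the appended control, assuming $f_0$, $g$, and the weight $P$ are supplied as functions and a matrix (the identifiers are illustrative):

```python
import numpy as np

def u_bar_optimal(f0, g, x_end, x_d, P):
    """u_bar minimizing the weighted norm in (21) for f affine in u, eq. (25).

    x_end is the predicted terminal state phi_M(x_k, v_{k,M})."""
    a = f0(x_end) - x_d
    b = g(x_end)
    b_dagger = np.linalg.solve(b.T @ P @ b, b.T @ P)   # (b^T P b)^{-1} b^T P
    return -b_dagger @ a
```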

Under Assumptions 1–2, the following lemma shows that $u_k$ converges to $u_d$, and $u_{k,M}$ and $v_{k,M}$ both converge to $\mathbf{u}_d$, as $k \to \infty$.


Lemma 2  For the control law in Algorithm 1, if Assumptions 1–2 hold, then for all $k \ge M$

$$\left\| u_{k,M} - \mathbf{u}_d \right\| \le \left( \beta \sum_{i=k-M+1}^{k-1} \prod_{j=0}^{i-1} \alpha_j \lambda_j + \rho \sum_{i=k-M}^{k-1} \prod_{j=0}^{i-1} \alpha_j \lambda_j \right) \| e_{0,M} \| \qquad (30)$$

$$\left\| v_{k,M} - \mathbf{u}_d \right\| \le \left( \beta \sum_{i=k-M+1}^{k} \prod_{j=0}^{i-1} \alpha_j \lambda_j + \rho \sum_{i=k-M}^{k-1} \prod_{j=0}^{i-1} \alpha_j \lambda_j \right) \| e_{0,M} \| \qquad (31)$$

$$\| u_k - u_d \| \le \left( \beta \sum_{i=k-M+1}^{k-1} \prod_{j=0}^{i-1} \alpha_j \lambda_j + \rho \prod_{j=0}^{k-M-1} \alpha_j \lambda_j \right) \| e_{0,M} \|. \qquad (32)$$

Proof: We can eliminate $v_{k,M}$ from (18)–(19) to obtain the recursion on $u_{k,M}$:

$$u_{k+1,M} = \Gamma u_{k,M} + \Gamma \Delta u_{k,M} + \Phi \bar{u}_k. \qquad (33)$$

Iterating from $k = 0$, we have

$$u_{k,M} = \Gamma^k u_{0,M} + \sum_{i=0}^{k-1} \left( \Gamma^{k-i} \Delta u_{i,M} + \Gamma^{k-i-1} \Phi \bar{u}_i \right).$$

Noting that $\Gamma^j = 0$ for $j \ge M$, we have for all $k \ge M$

$$u_{k,M} - \mathbf{u}_d = \sum_{i=k-M+1}^{k-1} \Gamma^{k-i} \Delta u_{i,M} + \sum_{i=k-M}^{k-1} \Gamma^{k-i-1} \Phi (\bar{u}_i - u_d). \qquad (34)$$

Taking the norm of both sides, using Assumption 2, and noting $\|\Gamma\| = \|\Phi\| = 1$, we obtain

$$\left\| u_{k,M} - \mathbf{u}_d \right\| \le \beta \sum_{i=k-M+1}^{k-1} \| e_{i,M} \| + \rho \sum_{i=k-M}^{k-1} \| e_{i,M} \|. \qquad (35)$$

Using Lemma 1, (30) follows immediately. Next, write $v_{k,M} - \mathbf{u}_d$ as

$$v_{k,M} - \mathbf{u}_d = u_{k,M} - \mathbf{u}_d + \Delta u_{k,M}.$$

Taking the norm of both sides and using (28) in Assumption 2 and (35), we get

$$\left\| v_{k,M} - \mathbf{u}_d \right\| \le \beta \sum_{i=k-M+1}^{k} \| e_{i,M} \| + \rho \sum_{i=k-M}^{k-1} \| e_{i,M} \|. \qquad (36)$$

Again by using Lemma 1, we obtain (31). Finally, for $u_k$, we multiply both sides of (34) by $\Lambda$ and, noting $\Lambda \Gamma^j \Phi = 0$ for all $j \ne M-1$ and $\Lambda \Gamma^{M-1} \Phi = I$, we obtain

$$u_k - u_d = \Lambda (u_{k,M} - \mathbf{u}_d) = \sum_{i=k-M+1}^{k-1} \Lambda \Gamma^{k-i} \Delta u_{i,M} + (\bar{u}_{k-M} - u_d).$$

Taking the norm of both sides, we obtain

$$\| u_k - u_d \| \le \beta \sum_{i=k-M+1}^{k-1} \| e_{i,M} \| + \rho \| e_{k-M,M} \|. \qquad (37)$$

Using Lemma 1, the bound (32) follows.


2.4 Asymptotic Convergence of the State

Given that $u_{k,M}$ converges to $\mathbf{u}_d$ and $u_k$ converges to $u_d$, we can show that $x_{k+M}$ converges to $x_M^{(k)}$ by using the continuity of $f$. This in turn shows that $x_k$ converges to $x_d$, since $e_{k,M} \to 0$ from Lemma 1.

Lemma 3  Given the control law in Algorithm 1, if Assumptions 1–2 hold, then for all $x_0 \in \mathbb{R}^n$ and $u_{0,M} \in \mathbb{R}^{mM}$, $x_k \to x_d$ as $k \to \infty$.

Proof: By the continuity of $f$, for any $\epsilon > 0$, there exist $\delta_{M-1}^x > 0$ and $\delta_{M-1}^u > 0$ such that if

$$\left\| x_{k+M-1} - x_{M-1}^{(k)} \right\| \le \delta_{M-1}^x, \qquad \left\| u_{k+M-1} - \Lambda u_{M-1}^{(k)} \right\| \le \delta_{M-1}^u,$$

then

$$\left\| x_{k+M} - x_M^{(k)} \right\| = \left\| f(x_{k+M-1}, u_{k+M-1}) - f(x_{M-1}^{(k)}, \Lambda u_{M-1}^{(k)}) \right\| \le \epsilon. \qquad (38)$$

We now apply this argument repeatedly: for $j = 0, \ldots, M-1$, given $\delta_{j+1}^x$, there exist $\delta_j^x > 0$ and $\delta_j^u > 0$ such that if

$$\left\| x_{k+j} - x_j^{(k)} \right\| \le \delta_j^x, \qquad \left\| u_{k+j} - \Lambda u_j^{(k)} \right\| \le \delta_j^u \qquad (39)$$

then

$$\left\| x_{k+j+1} - x_{j+1}^{(k)} \right\| = \left\| f(x_{k+j}, u_{k+j}) - f(x_j^{(k)}, \Lambda u_j^{(k)}) \right\| \le \delta_{j+1}^x. \qquad (40)$$

At $j = 0$, $x_0^{(k)} = x_k$; therefore, $\left\| x_k - x_0^{(k)} \right\| \le \delta_0^x$ is always satisfied. From Lemma 2, for any $\delta_j^u$, $j = 0, \ldots, M-1$, there exists $K$ sufficiently large such that for all $k \ge K$,

$$\left\| u_{k+j} - \Lambda u_j^{(k)} \right\| = \left\| u_{k+j} - \Lambda \Gamma^j u_{k,M} \right\| \le \delta_j^u.$$

Since (39) is satisfied for $j = 0, \ldots, M-1$, (38) is satisfied for all $k \ge K$. This means that $x_{k+M} \to x_M^{(k)}$ as $k \to \infty$. Since $x_M^{(k)} \to x_d$ from Lemma 1, it follows that $x_{k+M} \to x_d$ as $k \to \infty$.

2.5 Closed Loop Asymptotic Stability

The asymptotic stability of the desired state $x_d$ requires stability in the sense of Lyapunov in addition to the asymptotic convergence. This can be shown if the initial $M$-step-ahead control sequence, $u_{0,M}$, is appropriately chosen.

Theorem 1  Consider the control law given in Algorithm 1 with $u_{0,M}$ continuous in $x_0$, and $u_{0,M} = \mathbf{u}_d$ if $x_0 = x_d$. Suppose that Assumptions 1–2 hold. Then $x_d$ is a globally asymptotically stable equilibrium.

Proof: If $u_{0,M}$ is chosen as stated, then the initial predicted state error, $e_{0,M} = \phi_M(x_0, u_{0,M}) - x_d$, is continuous in $x_0$ and equal to 0 if $x_0 = x_d$. Therefore, from Lemma 2, if $x_0$ is sufficiently close to $x_d$, then $\left\| u_k - \Lambda u_{k,M} \right\|$ can be made arbitrarily small for all $k \ge M$. By using the same argument as in the proof of Lemma 3, $\| x_k - x_d \|$ can also be made arbitrarily small for $k \ge M$. For $k < M$, it follows from the continuity of $f$ and the continuity of $u_{0,M}$ and $e_{0,M}$ with respect to $x_0$ that, for $x_0$ sufficiently close to $x_d$, $\| x_k - x_d \|$ can be made arbitrarily small. This shows that $x_d$ is stable in the sense of Lyapunov. Since global asymptotic convergence has already been shown in Lemma 3, $x_d$ is a globally asymptotically stable equilibrium.

If the constant control sequence $\mathbf{u}_d$ is not a singular control, we can just choose $u_j^{(0)} = u_d$ for all $x_0$. Otherwise, we may use $u_j^{(0)} = u_d + a n_j$, where $n_j$ is a random vector bounded by $\| n_j \| \le 1$ and $a$ is proportional to $\| x_0 - x_d \|$.


3 Simulation Results

We will present the results of applying Algorithm 1 with the Newton-Raphson update (10) to several nonlinear examples, including the kinematic control of a unicycle, a double-chain nonholonomic system, an under-actuated satellite orientation control, a nonholonomic example from [30], and a discrete time system used in [31] to show the lack of robustness of conventional NMPC schemes.

In all the nonholonomic examples, we consider the unconstrained case as well as the case with an input saturation constraint. Suppose a hard constraint $|u| \le u_{\max}$ is required for a given input channel; we then use the transformation

$$u = s(v) = \frac{2 u_{\max}}{\pi} \tan^{-1}\!\left( \frac{\pi v}{2 u_{\max}} \right), \qquad (41)$$

and apply our algorithm with $v$ as the input. Note that with this transformation, an LTI system would become nonlinear.
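A one-line sketch of the saturation map (41), written here assuming the standard arctan scaling that keeps $|u| < u_{\max}$ with unit slope at the origin:

```python
import numpy as np

def saturate(v, u_max):
    """Smooth saturation u = s(v) as in (41): |u| < u_max, unit slope at v = 0."""
    return (2.0 * u_max / np.pi) * np.arctan(np.pi * v / (2.0 * u_max))
```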

The step size $\eta_k$ in (10) is chosen with Armijo's rule [46]. We start with an initial step size $h_0$. The step size is repeatedly halved until either

$$\left\| \phi_M(x_k, u_{k,M} - h D_k^\dagger e_{k,M}) \right\| < \left\| \phi_M(x_k, u_{k,M}) \right\|$$

or $h < h_{\min}$, in which case $h$ is set to zero. When $h = 0$, the system essentially switches to open loop control using the control vectors in $u_{k,M}$.
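A sketch of this step-size halving, reusing the phi_M helper from Sec. 2.1 and measuring the decrease through the predicted error to the target (the values of h0 and h_min are assumed tuning parameters):

```python
import numpy as np

def select_step(f, x_k, u_kM, x_d, D_dagger_e, M, m, h0=1.0, h_min=1e-3):
    """Halve the step size until the predicted error decreases; return 0 to go open loop."""
    base = np.linalg.norm(phi_M(f, x_k, u_kM, M, m) - x_d)
    h = h0
    while h >= h_min:
        trial = np.linalg.norm(phi_M(f, x_k, u_kM - h * D_dagger_e, M, m) - x_d)
        if trial < base:
            return h
        h /= 2.0
    return 0.0
```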

3.1 Kinematic Model of Unicycle

The kinematic model of a planar unicycle is commonly used to represent mobile robots. Under the assumption that the wheel does not slip, the kinematic model is given by

$$\dot{x} = \cos\theta \, u_1, \qquad \dot{y} = \sin\theta \, u_1, \qquad \dot{\theta} = u_2, \qquad (42)$$

where $(x, y, \theta)$ represent the position and orientation of the unicycle and $(u_1, u_2)$ the driving and steering velocities.
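A minimal sketch of the finite-difference step (2) applied to the unicycle model (42), with the sampling period used in this section (the function name is illustrative):

```python
import numpy as np

def unicycle_step(x, u, ts=0.01):
    """One finite-difference step of the unicycle (42); x = (x, y, theta), u = (u1, u2)."""
    px, py, theta = x
    u1, u2 = u
    return np.array([px + ts * np.cos(theta) * u1,
                     py + ts * np.sin(theta) * u1,
                     theta + ts * u2])
```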

We consider the finite difference discretization of the equation of motion with $t_s = 0.01$ sec. By calculating the gradient matrix $D_k$ in (14), it is straightforward to show that the singularity corresponds to $u_1$ and $u_2$ being identically zero over the entire prediction horizon. Since $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial u}$ are uniformly bounded in $x$, our control scheme is globally asymptotically stable if the initial control sequence is not identically zero. We therefore choose $u_{0,M}$ as a random vector with a small amplitude. To illustrate the parallel parking maneuver, the initial state is chosen to be $x_0 = (0, 5, 0)$.
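Putting the pieces together, a hypothetical closed-loop run of Algorithm 1 on the discretized unicycle, reusing the unicycle_step and mpc_step sketches above (the number of steps and the initial-sequence amplitude are illustrative choices, not the paper's settings):

```python
import numpy as np

M, m, n = 10, 2, 3
x = np.array([0.0, 5.0, 0.0])              # parallel-parking initial state
x_d = np.zeros(n)
u_seq = 0.01 * np.random.randn(m * M)      # small random initial sequence (avoids the singular zero control)

for k in range(3000):
    u_k, u_seq, e_kM = mpc_step(unicycle_step, x, u_seq, x_d, M, m, n)
    x = unicycle_step(x, u_k)              # apply only the first block of the refined sequence
```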

First we consider the unconstrained problem with two different look-ahead horizons: $M = 4$ and $M = 10$. As shown in Figure 2, the predicted state errors converge monotonically, and the actual states converge asymptotically. The corresponding inputs are shown in Figure 3. As expected, with a smaller horizon, the convergence is faster but the control effort is larger.

Without the input constraint, the input can be very large and render the finite difference discretization invalid. We therefore apply the transformation (41) to impose the input saturation constraint. With this constraint, the control signal becomes oscillatory, see Figure 4, but the state still converges to zero, though over a longer period. The x-y motion trajectory comparison for different input constraints is shown in Figure 5. For tighter constraints, the motion steps become more tightly spaced (due to the constraint on $u_1$) and jagged (due to the constraint on $u_2$).


Figure 2: Unicycle Example: Convergence of predicted state vs. actual state, M = 4 and M = 10, no input constraint

Figure 3: Unicycle Example: Input trajectories for M = 4 and M = 10

Figure 4: Unicycle Example: Input trajectories for umax = [0.3, 0.3] and umax = [0.5, 0.5], M = 10


Figure 5: Unicycle Example: x-y motion trajectory comparison for umax = [0.5, 0.5], [0.3, 0.3] and [0.2, 0.2]; M = 10

3.2 Double-Chain System

Nonholonomic systems can sometimes be transformed to the chain form (e.g., a multi-trailer system). We consider the following double-chain system where the depth of each chain is 2:

$$\dot{x}_1 = u_1, \qquad \dot{x}_2 = u_2, \qquad \dot{x}_3 = x_2 u_1, \qquad \dot{x}_4 = u_3, \qquad \dot{x}_5 = x_4 u_1. \qquad (43)$$

The singularity corresponds to $u_1$ being identically zero over the entire prediction horizon. We again discretize the equation using finite difference with $t_s = 0.01$ sec. The initial state is arbitrarily chosen to be $x_0 = (-5, 5, 10, -20, 8)$. The initial control vector is set to zero. Though this is a singular control, the initial predicted error, $e_{0,M}$, is not in the null space of $D_0^\dagger$; therefore, the algorithm is able to proceed. We consider both the unconstrained case and the input constrained case with the saturation level at $[1, 1, 1]$. The predicted state error vs. the actual state error plots for the two cases are shown in Figure 6. The actual state trajectories are shown in Figure 7. In both cases, the predicted error converges monotonically, and the actual error converges asymptotically. The convergence in the constrained input case takes about twice as long, but the maximum state error is much smaller. This is due to the much larger input amplitude in the unconstrained case, as shown in Figure 8. As in the unicycle case, the input constraint also leads to rapid oscillations in the control trajectory.
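For reference, a sketch of the finite-difference step of the double-chain model (43) (the name and layout are illustrative):

```python
import numpy as np

def double_chain_step(x, u, ts=0.01):
    """One finite-difference step of the double-chain system (43)."""
    x1, x2, x3, x4, x5 = x
    u1, u2, u3 = u
    return np.array([x1 + ts * u1,
                     x2 + ts * u2,
                     x3 + ts * x2 * u1,
                     x4 + ts * u3,
                     x5 + ts * x4 * u1])
```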


Figure 6: Double Chain Example: Convergence of predicted state vs. actual state, M = 10, no input constraint vs. umax = [1, 1, 1]

Figure 7: Double Chain Example: State trajectories, M = 10, no input constraint vs. umax = [1, 1, 1]


Figure 8: Double Chain Example: Control trajectories, M = 10, no input constraint vs. umax = [1, 1, 1]

3.3 Under-actuated Satellite System

Consider the orientation control of a satellite using only two of the angular velocities. Using the vector quaternion representation, the continuous time equation of motion is given by

$$\dot{x} = \frac{1}{2}\left( -x^\times + \sqrt{1 - \|x\|^2}\, I_3 \right) \begin{bmatrix} u_1 \\ u_2 \\ 0 \end{bmatrix} \qquad (44)$$

where $x = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}^T$ is the vector quaternion, $x^\times$ is the $3 \times 3$ matrix representation of the vector cross product, and $(u_1, u_2)$ are the two angular velocities considered as the control variables. The equation of motion ensures that $\|x(t)\| \le 1$ if $\|x(0)\| \le 1$. We discretize the equation using finite difference with $t_s = 0.01$ sec. The initial state is chosen to be $x_0 = (0.1, 0.1, 0.707)$. The initial control vector, $u_{0,M}$, is chosen to be all zeros. The look-ahead horizon is $M = 10$.
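A sketch of the corresponding finite-difference step of (44) (the clamping of the square-root argument is an added numerical safeguard, not part of the model):

```python
import numpy as np

def satellite_step(x, u, ts=0.01):
    """One finite-difference step of the vector-quaternion kinematics (44); u = (u1, u2)."""
    cross = np.array([[0.0, -x[2], x[1]],
                      [x[2], 0.0, -x[0]],
                      [-x[1], x[0], 0.0]])        # the cross-product matrix x^x
    w = np.array([u[0], u[1], 0.0])               # only two angular velocities are actuated
    scalar = np.sqrt(max(0.0, 1.0 - float(x @ x)))
    return x + ts * 0.5 * (-cross + scalar * np.eye(3)) @ w
```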

The predicted state error vs. the actual state error plots for the unconstrained input and the input constrained at $[20, 20]$ are shown in Figure 9. The actual state trajectories are shown in Figure 10. In both cases, the predicted error converges monotonically, and the actual error converges asymptotically. The convergence in the constrained input case takes longer due to the input constraint. The input trajectories are shown in Figure 11. As in the other examples, the input constraint also leads to rapid oscillations in the control trajectory.

3.4 Nonholonomic Example from [30]

The following three-state, two-input nonholonomic integrator is used as an example in [30]:

$$\dot{x}_1 = u_1, \qquad \dot{x}_2 = u_2, \qquad \dot{x}_3 = u_1 x_2 - u_2 x_1. \qquad (45)$$

The discretization time is chosen as 1 (the same as in [30]). The initial state is arbitrarily chosen as $x_0 = (-5, 5, 10)$. The initial control vector, $u_{0,M}$, is chosen to be identically zero over the prediction horizon.


Figure 9: Satellite Example: Convergence of predicted state vs. actual state, M = 10, no input constraint vs. umax = [20, 20]

Figure 10: Satellite Example: State trajectories, M = 10, no input constraint vs. umax = [20, 20]

Figure 11: Satellite Example: Control trajectories, M = 10, no input constraint vs. umax = [20, 20]


The comparison between $\|e_{k,M}\|$ and $\|x_k\|$ for the unconstrained and input constrained cases is shown in Figure 12. The predicted errors and actual state errors all decay monotonically in both cases. As expected, the constrained input case shows a slower response. The comparisons of the state trajectories and input trajectories are shown in Figures 13–14.

Figure 12: Example from [30]: Convergence of predicted state vs. actual state, M = 10, no input constraint vs. umax = [0.25, 0.25]

Figure 13: Example from [30]: State trajectories, M = 10, no input constraint vs. umax = [0.25, 0.25]

3.5 A Nonlinear Example from [31]

In [31], the following single input discrete time example is used to demonstrate the lack of robustness of conventional NMPC schemes when a terminal constraint (the origin) is used together with a short horizon ($M = 2$):

$$x_{1,k+1} = x_{1,k}(1 - u_k), \qquad x_{2,k+1} = \sqrt{x_{1,k}^2 + x_{2,k}^2}\; u_k. \qquad (46)$$
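A sketch of the discrete-time map (46) (scalar input; the function name is illustrative):

```python
import numpy as np

def terminal_constraint_example_step(x, u):
    """The discrete-time map (46) from [31]; u is a scalar input."""
    x1, x2 = x
    return np.array([x1 * (1.0 - u),
                     np.sqrt(x1**2 + x2**2) * u])
```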


Figure 14: Example from [30]: Control trajectories, M = 10, no input constraint vs. umax = [1, 1, 1]

Note that in contrast to the earlier nonlinear examples, this system does not correspond to the finite difference discretization of a continuous time driftless system. The initial condition is arbitrarily chosen to be $x_0 = (5, 10)$. Since this example was used to show the lack of robustness when the prediction horizon is small, we set $M = 1$; other prediction horizons produced similar results. In addition to the nominal case, we also consider the case when a random state noise is added (which would cause any NMPC with a terminal constraint and $M = 2$ to be unstable). The predicted state vs. actual state plots for the nominal and perturbed cases are shown in Figure 15. Though we have not proven robustness of our method, it can be seen, at least for this example, that the system remains stable in the perturbed case. The phase plane trajectories for the two cases are shown in Figure 16.

Figure 15: Example from [31]: Convergence of predicted state vs. actual state, nominal vs. disturbed, M = 1


Figure 16: Example from [31]: Phase plane trajectories, nominal vs. disturbed, M = 1

4 Conclusions

This paper presents a gradient-based nonlinear feedback stabilization strategy for discrete time nonlinear systems. The approach is similar to NMPC, but the control vector is updated to reduce the predicted state error rather than to minimize an optimization index. We show that under the full rank assumption of a gradient matrix and the boundedness of the predicted control sequence refinement by the predicted state error, the closed loop system is globally asymptotically stable. These assumptions are reasonable for the finite difference approximation of driftless nonlinear systems motivated by systems subject to nonholonomic constraints or non-integrable conservation laws. The efficacy of the algorithm is demonstrated by several nonlinear examples.

We are currently working on analyzing the robustness of the proposed scheme; the exponential convergence of the predicted error (when the gradient matrix is uniformly full rank) offers a promising starting point. We are also extending our approach to the dual problem of observer design (similar to the Newton approach in [47]) and to the combined NMPC and observer based control.

Acknowledgment

The authors gratefully acknowledge the careful and helpful comments from the anonymous reviewers, and helpful discussions with Professors Fernando Lizarralde, Liu Hsu, Murat Arcak, Andy Teel, and Dr. Dan Popa. This material is based upon work supported in part by the National Science Foundation under Grants CMS-9813099 and CMS-0301827. This work is supported in part by the Center for Automation Technologies under a block grant from the New York State Office of Science, Technology, and Academic Research.

References

[1] J.T. Wen. Control of nonholonomic systems. In W. Levine, editor, Control Handbook, pages 1359–1368. CRC Press, 1995.

[2] I. Kolmanovsky and N. H. McClamroch. Developments in nonholonomic control problems. Control Systems Magazine, 15:20–36, December 1995.

[3] R.W. Brockett. Asymptotic stability and feedback stabilization. In Differential Geometric Control Theory, edited by R.W. Brockett, R.S. Millman, and H.J. Sussmann, volume 27, pages 181–208. Birkhauser, 1983.

[4] J. Zabczyk. Some comments on stabilizability. Applied Mathematics and Optimization, 19:1–9, 1989.

[5] R.M. Murray and S.S. Sastry. Nonholonomic motion planning – steering using sinusoids. IEEE Transactions on Automatic Control, 38:700–716, May 1993.

[6] C. Fernandes, L. Gurvits, and Z.X. Li. Near-optimal nonholonomic motion planning for a system of coupled rigid bodies. IEEE Transactions on Automatic Control, 39(3):450–463, March 1994.

[7] G. Lafferriere and H.J. Sussmann. Motion planning for controllable systems without drift. In 1991 IEEE R&A Workshop on Nonholonomic Motion Planning, Sacramento, CA, April 1991.

[8] H.J. Sussmann, Y. Yang, and W. Liu. Limits of highly oscillatory controls and the approximation of general paths by admissible trajectories. In Proceedings of the IEEE Conference on Decision and Control, pages 437–442, Tucson, AZ, December 1992.

[9] J. Barraquand and J. Latombe. On nonholonomic mobile robots and optimal maneuvering. In Intelligent Control Workshop, pages 340–347, Albany, NY, September 1989.

[10] A.M. Bloch, M. Reyhanoglu, and N.H. McClamroch. Control and stabilization of nonholonomic dynamic systems. IEEE Transactions on Automatic Control, 37:1746–1757, November 1992.

[11] C. Canudas de Wit and O.J. Sordalen. Exponential stabilization of mobile robots with nonholonomic constraints. IEEE Transactions on Automatic Control, 37(11):1791–1797, 1992.

[12] J.-M. Coron. Global asymptotic stabilization for controllable systems without drift. Mathematics of Control, Signals, and Systems, 5(3), 1992.

[13] C. Samson and K. Ait-Abderrahim. Feedback stabilization of a nonholonomic wheeled mobile robot. In IEEE/RSJ International Workshop on Intelligent Robots and Systems, pages 1242–1247, Osaka, Japan, November 1991.

[14] A. Teel, R. Murray, and G. Walsh. Nonholonomic control systems: From steering to stabilization with sinusoids. In Proc. 31st IEEE Conference on Decision and Control, Tucson, AZ, December 1992.

[15] R.T. M'Closkey and R.M. Murray. Convergence rates for nonholonomic systems in power form. In Proceedings of 1993 American Control Conference, San Francisco, CA, 1993.

[16] C.E. Garcia, D.M. Prett, and M. Morari. Model predictive control: Theory and practice – a survey. Automatica, 25(3):335–348, May 1989.

[17] W. H. Kwon and A. E. Pearson. On feedback stabilization of time-varying discrete linear systems. IEEE Trans. on Automatic Control, 23(3):479–481, June 1978.

[18] C. C. Chen and L. Shaw. On receding horizon feedback control. Automatica, 18(3):349–352, March 1982.


[19] D. Q. Mayne and H. Michalska. Receding horizon control of nonlinear systems. IEEE Trans. on Automatic Control, 35(7):814–824, July 1990.

[20] D. Q. Mayne, J. B. Rawlings, C.V. Rao, and P. Scokaert. Constrained model predictive control: Stability and optimality. Automatica, 36(6):789–814, June 2000.

[21] J. B. Rawlings. Tutorial overview of model predictive control. IEEE Control Systems Magazine, pages 38–52, June 2000.

[22] S. S. Keerthi and E. G. Gilbert. Optimal infinite-horizon feedback laws for a general class of constrained discrete-time systems: Stability and moving-horizon approximation. Journal of Optimization Theory and Applications, 57(2):265–293, May 1988.

[23] D. Q. Mayne and H. Michalska. Robust receding horizon control of constrained nonlinear systems. IEEE Trans. on Automatic Control, 38(11):1623–1633, November 1993.

[24] H. Chen and F. Allgower. A quasi-infinite horizon nonlinear model predictive control scheme with guaranteed stability. Automatica, 34(10):1205–1217, October 1998.

[25] L. Magni, G. De Nicolao, L. Magnani, and R. Scattolini. A stabilizing model-based predictive control algorithm for nonlinear systems. Automatica, 37:1351–1362, 2001.

[26] S. L. Oliveira and M. Morari. Contractive model predictive control for constrained nonlinear systems. IEEE Trans. on Automatic Control, 45(6):1053–1071, June 2000.

[27] P. Scokaert, D. Mayne, and J. Rawlings. Suboptimal model predictive control (feasibility implies stability). IEEE Trans. on Automatic Control, 44(3):648–654, March 1999.

[28] A. Jadbabaie, J. Yu, and J. Hauser. Receding horizon control of nonlinear systems. In Proc. 1999 IEEE Conference on Decision and Control, pages 3376–3381, Phoenix, AZ, December 1999.

[29] J. A. Primbs, V. Nevistic, and J. C. Doyle. A receding horizon generalization of pointwise min-norm controllers. IEEE Trans. on Automatic Control, 45(6):898–909, June 2000.

[30] G. Grimm, M.J. Messina, A.R. Teel, and S.E. Tuna. Model predictive control when a local control Lyapunov function is not available. In Proc. of American Control Conference, pages 4125–4130, Denver, CO, June 2003.

[31] G. Grimm, M.J. Messina, A.R. Teel, and S.E. Tuna. Examples of zero robustness in constrained model predictive control. In Proc. of 42nd Conference on Decision and Control, Maui, HI, December 2003.

[32] B. Kouvaritakis, J.A. Rossiter, and J. Schuurmans. Efficient robust predictive control. IEEE Transactions on Automatic Control, 45(8):1545–1549, 2000.

[33] S.L. De Oliveira, V. Nevistic, and M. Morari. Control of nonlinear systems subject to input constraints. In Proc. of IFAC Symposium on Nonlinear Control Systems Design, pages 15–20, Tahoe City, CA, 1995.

[34] M. Cannon, B. Kouvaritakis, M. Grimble, and B. Bulut. Nonlinear predictive control of hot strip rolling mill. International Journal of Robust and Nonlinear Control, 13:365–380, 2003.

[35] A. Zheng. A computationally efficient nonlinear MPC algorithm. In Proc. of American Control Conference, pages 1623–1627, Albuquerque, NM, 1997.


[36] H.J. Sussmann. A continuation method for nonholonomic path-finding problems. In Proc. 32nd IEEE Conference on Decision and Control, pages 2718–2723, San Antonio, TX, December 1993.

[37] A. Divelbiss and J.T. Wen. Trajectory tracking control of a car-trailer system. IEEE Transactions on Control Systems Technology, 5(3):269–278, May 1997.

[38] E. D. Sontag. Control of systems without drift via generic loops. IEEE Trans. on Automatic Control, 40(7):1210–1219, July 1995.

[39] F. Lizarralde, J. T. Wen, and L. Hsu. A new model predictive control strategy for affine nonlinear control systems. In Proc. 1999 American Control Conference, pages 4263–4267, San Diego, CA, June 1999.

[40] B. Kouvaritakis, M. Cannon, and J.A. Rossiter. Nonlinear model based predictive control. International Journal of Control, 72(10):919–928, 1999.

[41] G.H. Golub and C.F. Van Loan. Matrix Computations. Johns Hopkins University Press, 3rd edition, 1996.

[42] R. Choquet and J. Erhel. Newton-GMRES algorithm applied to compressible flows. International Journal for Numerical Methods in Fluids, pages 177–190, 1996.

[43] H.J. Sussmann and Y. Chitour. A continuation method for nonholonomic path finding problems. IMA Workshop on Robotics, January 1993.

[44] D. Popa and J.T. Wen. Singularity computation for iterative control of nonlinear affine systems. Asian Journal of Control, 2(2):57–75, June 2000.

[45] K. Tchon. On kinematic singularities of nonholonomic robotic systems. In 13th CISM-IFToMM Symposium, Zakopane, Poland, July 2000.

[46] E. Polak. Optimization: Algorithms and Consistent Approximations. Springer-Verlag, New York, 1997.

[47] P.E. Moraal and J.W. Grizzle. Observer design for nonlinear systems with discrete-time measurements. IEEE Trans. on Automatic Control, 40(3):395–404, March 1995.
