CONSTRAINED OPTIMIZATION - University of Pittsburgh



1. EQUALITY CONSTRAINTS

Consider the problem (P1): Minimize f(x)

st hj(x) = 0, j=1,2,…,m

x ∈ Rn

Let us first examine the case where m=1 (i.e., a single constraint). Without this constraint the necessary condition for optimality was ∇f(x) = 0. With the constraint h(x)=0, we also require that x lie on the graph of the (nonlinear) equation h(x)=0.

-∇f(x*), the steepest descent direction, is orthogonal to the tangent of the contour of f through the point x*. It must also be orthogonal to the tangent of the curve h(x)=0 at x*! But we know that ∇h(x*) is also orthogonal to the curve h(x)=0 at x*. Therefore...

[Figure: contours of f(x), the unconstrained minimum, and the curve h(x)=0, with the vectors -∇f(x*) and ∇h(x*) drawn at the constrained minimizer x*.]


-∇f(x*) and ∇h(x*) MUST BOTH LIE ALONG THE SAME LINE, i.e., for some λ ∈ R, we must have

-∇f(x*) = λ∇h(x*)

So, if x* is a minimizer, the necessary condition reduces to

∇f(x*) + λ∇h(x*) = 0

We now proceed to the general case where the above necessary condition should

hold for each constraint.

DEFINITION: The Lagrangian function for Problem P1 is defined as

L(x,λ) = f(x) + Σj=1…m λj hj(x)

The KARUSH-KUHN-TUCKER Conditions

If the point x* ∈ Rn is a minimizer for problem P1, it must satisfy the following necessary condition for some λ* ∈ Rm:

∇f(x*) + Σj=1…m λj*∇hj(x*) = 0 and hj(x*) = 0 ∀j

or equivalently,

∇x,λ L(x*, λ*) = 0.


EXAMPLE:

Minimize ½x1² + ½x2² - x1x2 - 3x2

st x1 + x2 = 3 (i.e., h(x) = x1 + x2 - 3 = 0)

The Lagrangian is

L(x,λ) = (½x1² + ½x2² - x1x2 - 3x2) + λ(x1 + x2 - 3)

The K-K-T conditions yield

∂L/∂x1 = x1 - x2 + λ = 0

∂L/∂x2 = x2 - x1 - 3 + λ = 0

∂L/∂λ = x1 + x2 - 3 = 0 (original constraint)

Solving these (in this particular case, linear) equations:

x1* = 0.75, x2* = 2.25, λ* = 1.5, with f(x*) = -5.625

The above point thus satisfies the necessary conditions for it to be a minimum.
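Since the K-K-T system above is linear, it can be solved in one step. The following is a minimal numerical check (a sketch using numpy; it is not part of the original notes):

```python
import numpy as np

# KKT system for L(x,λ) = (½x1² + ½x2² - x1x2 - 3x2) + λ(x1 + x2 - 3):
#   x1 - x2 + λ = 0
#  -x1 + x2 + λ = 3
#   x1 + x2     = 3
K = np.array([[ 1.0, -1.0, 1.0],
              [-1.0,  1.0, 1.0],
              [ 1.0,  1.0, 0.0]])
rhs = np.array([0.0, 3.0, 3.0])

x1, x2, lam = np.linalg.solve(K, rhs)
f_val = 0.5*x1**2 + 0.5*x2**2 - x1*x2 - 3*x2
print(x1, x2, lam, f_val)   # 0.75 2.25 1.5 -5.625
```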

SUFFICIENT CONDITIONS: Suppose that, in addition to the K-K-T necessary conditions, for arbitrary z ∈ Rn with z ≠ 0 the following statement is true:

"zᵀ∇hj(x*) = 0, j=1,2,…,m implies that zᵀ∇x²L(x*,λ*)z > 0"

THEN f has a strict local minimum at x*, st hj(x*) = 0, j=1,2,…,m.


For our example,

∇x²L(x*,λ*) = [1 -1; -1 1] and ∇h(x*) = [1; 1]

Consider z ∈ R² with z ≠ 0. Then zᵀ∇h(x*) = z1 + z2 = 0 (i.e., z2 = -z1) implies that

zᵀ∇x²L(x*,λ*)z = z1² - 2z1z2 + z2² = (z1 - z2)² = (2z1)² > 0

⇒ x* is a strict local minimum.
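A quick numerical version of this projected-Hessian check (a sketch, assuming numpy; any nonzero z in the tangent space works):

```python
import numpy as np

H = np.array([[1.0, -1.0],
              [-1.0, 1.0]])     # ∇²x L(x*,λ*) for the example
grad_h = np.array([1.0, 1.0])   # ∇h(x*) for h(x) = x1 + x2 - 3

z = np.array([1.0, -1.0])       # spans the null space of ∇h(x*)ᵀ
print(grad_h @ z)               # 0.0 -> z is a tangent (feasible) direction
print(z @ H @ z)                # 4.0 > 0 -> second-order sufficiency holds
```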

As in unconstrained optimization, in practice, sufficient conditions become quite

complicated to verify, and most algorithms only look for points satisfying the

necessary conditions.

2. INEQUALITY-CONSTRAINED OPTIMIZATION

Consider the problem (P2): Minimize f(x)

st gj(x) ≤ 0, j=1,2,…,m

x ∈ Rn.

Again, let us first consider the case with a single inequality constraint (i.e., m=1).


Q. Can this be converted to an equivalent equality constrained problem?

1. Suppose that (as in LP) we write g(x) ≤ 0 in the form g(x) + S = 0, where S is a nonnegative slack variable. Then we have added the constraint S ≥ 0; effectively, we have not accomplished anything (we just trade one inequality for another)!

2. Instead, let us write g(x) + S² = 0

(S² ≥ 0 is of course unnecessary…)

The Lagrangian is

L(x,S,λ) = f(x) + λ(g(x) + S²)

and the K-K-T optimality conditions become

∂f(x*)/∂xi + λ*∂g(x*)/∂xi = 0, i = 1,2,…,n

g(x*) + (S*)² = 0 (original constraint)

2λ*S* = 0 (COMPLEMENTARY SLACKNESS)

EXAMPLE: Minimize x1² + ½x2² - 8x1 - 2x2 - 60

st 40x1 + 20x2 - 140 ≤ 0

Converting to an equality constraint, we have

40x1 + 20x2 + S² - 140 = 0

L(x1, x2, S, λ) = (x1² + ½x2² - 8x1 - 2x2 - 60) + λ(40x1 + 20x2 + S² - 140)


The necessary conditions are therefore

∂L/∂x1 = 2x1 - 8 + 40λ = 0, ∂L/∂x2 = x2 - 2 + 20λ = 0

40x1 + 20x2 + S² - 140 = 0 (original constraint)

∂L/∂S = 2λS = 0

Here we have a nonlinear system of equations. Again, note that these are only

necessary conditions, and there may be one or more points that satisfy these. We

seek an optimal solution to the original problem that is among the solution(s) to

the necessary conditions.

Rather than attempt a direct solution, let us use the complementary slackness condition to investigate two cases: (i) λ = 0, (ii) S = 0.

Case (i): λ = 0 ⇒ 2x1 - 8 = 0 ⇒ x1 = 4

x2 - 2 = 0 ⇒ x2 = 2

40x1 + 20x2 + S² - 140 = 0 ⇒ S² = -60 INFEASIBLE!

Case (ii): S = 0 ⇒ 2x1 - 8 + 40λ = 0 ⇒ x1 = 4 - 20λ

x2 - 2 + 20λ = 0 ⇒ x2 = 2 - 20λ

40x1 + 20x2 - 140 = 0 ⇒ x1 = 3, x2 = 1, λ = 0.05

⇒ the only solution to the necessary conditions is the point x = (3,1). Thus, if the original problem had a solution, then this must be it!
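The same two-case argument can be carried out mechanically. A minimal sketch (numpy only; the case labels follow the text above):

```python
import numpy as np

def f(x1, x2):
    # objective of the example: x1² + ½x2² - 8x1 - 2x2 - 60
    return x1**2 + 0.5*x2**2 - 8*x1 - 2*x2 - 60

# Case (i): λ = 0, so x is a stationary point of f itself
x1, x2 = 4.0, 2.0
print("case (i): S^2 =", 140 - 40*x1 - 20*x2)      # -60 < 0 -> infeasible

# Case (ii): S = 0, constraint active; solve the linear system
#   2x1 + 40λ = 8,  x2 + 20λ = 2,  40x1 + 20x2 = 140
K = np.array([[ 2.0,  0.0, 40.0],
              [ 0.0,  1.0, 20.0],
              [40.0, 20.0,  0.0]])
rhs = np.array([8.0, 2.0, 140.0])
x1, x2, lam = np.linalg.solve(K, rhs)
print("case (ii):", x1, x2, lam, f(x1, x2))        # 3.0 1.0 0.05 ...
```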


Geometric Interpretation

Consider Minimize f(x), st g(x) ≤ 0.

Just as in equality constrained optimization, here too we have one of the necessary conditions as

-∇f(x*) = λ∇g(x*) for some value of λ

With equality constraints, recall that λ was just some real number with no sign restrictions (λ > 0 or λ < 0).

With inequality constraints we can "predict" the correct sign for λ: at the optimum, -∇f must point out of the feasible region, along the direction of increase of g (see the sketch below). For a constraint g(x) ≤ 0 this gives λ > 0, while for g(x) ≥ 0 it gives λ < 0.

Thus for minimization, g(x) ≤ 0 ⇒ λ ≥ 0; and g(x) ≥ 0 ⇒ λ ≤ 0.

[Figure: sketches of the two cases. For an equality constraint h(x)=0, -∇f and ∇h lie along the same line (pointing in the same or in opposite directions). For an inequality constraint with boundary g(x)=0, -∇f points away from the feasible region, along the direction of increase of g (g(x)<0 inside the region, g(x)>0 outside).]


In practice, we dispense with the slack variable S and state the NECESSARY CONDITIONS in a standard form that does not use the slack variable.

Let L(x,λ) = f(x) + λg(x)

Then the necessary conditions are:

(1) ∂L(x*,λ*)/∂xi = ∂f(x*)/∂xi + λ*∂g(x*)/∂xi = 0 ∀i,

i.e., ∇xL(x*,λ*) = ∇f(x*) + λ*∇g(x*) = 0

(2) g(x*) ≤ 0

(3) λ*g(x*) = 0, λ* ≥ 0

Condition 2 yields the original constraints.

Condition 3 is complementary slackness: (g(x*) < 0 ⇒ λ* = 0; λ* > 0 ⇒ g(x*) = 0).

For the general case with m inequality constraints (i.e., Problem P2) the K-K-T necessary conditions are that if x* is a minimizer, then there exist λj* ≥ 0 for j=1,2,...,m such that for L(x,λ) = f(x) + Σj=1…m λj gj(x):

(1) ∇xL(x*,λ*) = ∇f(x*) + Σj=1…m λj*∇gj(x*) = 0 (n equations)

(2) gj(x*) ≤ 0, j=1,2,…,m

(3) λj*·gj(x*) = 0, j=1,2,…,m

NOTE: If a constraint were of the form g(x) ≥ 0, we would require that λj* ≤ 0. Alternatively, we could redefine gj(x) ≥ 0 as -gj(x) ≤ 0 and retain λj* ≥ 0!
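These conditions are easy to test numerically at a candidate point. A small sketch (the helper function and its name are mine, not from the notes), applied to the candidate x = (3, 1), λ = 0.05 found earlier:

```python
import numpy as np

def kkt_residuals(grad_f, grads_g, g_vals, lams):
    """Residuals of conditions (1)-(3) for Min f(x) st gj(x) <= 0, with λj >= 0."""
    stationarity = grad_f + sum(l*g for l, g in zip(lams, grads_g))   # condition (1)
    infeasibility = [max(gj, 0.0) for gj in g_vals]                   # condition (2)
    comp_slack = [l*gj for l, gj in zip(lams, g_vals)]                # condition (3)
    neg_multipliers = [min(l, 0.0) for l in lams]                     # sign of λj
    return stationarity, infeasibility, comp_slack, neg_multipliers

x = np.array([3.0, 1.0])
grad_f = np.array([2*x[0] - 8, x[1] - 2])        # ∇f for the earlier slack-variable example
grad_g = np.array([40.0, 20.0])                  # ∇g for g(x) = 40x1 + 20x2 - 140
g_val = 40*x[0] + 20*x[1] - 140
print(kkt_residuals(grad_f, [grad_g], [g_val], [0.05]))   # everything ≈ 0
```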


We now look at the general NLP problem (P3):

Minimize f(x)

st gj(x) ≤ 0, j=1,2,…,m,

gj(x) ≥ 0, j=m+1,…,p,

gj(x) = 0, j=p+1,…,q,

x ∈ Rn.

Karush-Kuhn-Tucker Necessary Conditions

If x* minimizes f(x) while satisfying the above constraints, then provided certain regularity conditions are met, there exist vectors λ* and μ* such that

(1) ∇f(x*) + Σj=1…p λj*∇gj(x*) + Σj=p+1…q μj*∇gj(x*) = 0

(2) All original constraints are satisfied

(3) λj*·gj(x*) = 0, j=1,2,…,p

(4) λj* ≥ 0 for j=1,2,…,m, and λj* ≤ 0 for j=m+1,…,p

(5) μj* are unrestricted in sign for j=p+1,…,q.

The regularity conditions referred to above are also known as constraint

qualifications. We will address these later…


NON-NEGATIVITY: Many engineering problems require the decision variables to be non-negative. While each non-negativity requirement could be treated as a constraint, we don't have to...

Consider the following problem (for simplicity, assume only ≤ constraints…):

Minimize f(x)

st gj(x) ≤ 0; j=1,2,…,m

qi(x) = xi ≥ 0; i=1,2,…,n.

Let us define Lagrange multipliers μ1, μ2,..., μn corresponding to the non-negativity constraints. Then in the K-K-T conditions we have, via complementary slackness,

μi*·xi* = 0; i=1,2,…,n

and Condition (1) becomes

∇f(x*) + Σj=1…m λj*∇gj(x*) - Σi=1…n μi*∇qi(x*) = 0

⟹ ∇f(x*) + Σj=1…m λj*∇gj(x*) = μ1*[1 0 0 ⋯ 0 0]ᵀ + μ2*[0 1 0 ⋯ 0 0]ᵀ + … + μn-1*[0 0 0 ⋯ 1 0]ᵀ + μn*[0 0 0 ⋯ 0 1]ᵀ

⟹ ∇f(x*) + Σj=1…m λj*∇gj(x*) = μ*, where μ* ∈ Rn = [μ1* μ2* … μn*]ᵀ

Note that we want μ* ≥ 0 (the qi are ≥ constraints, so their terms enter the Lagrangian with a minus sign when the μi are kept nonnegative).


Thus the necessary conditions become

(1) ∇f(x*) + Σj=1…m λj*∇gj(x*) ≥ 0 (n conditions)

(2) All original constraints are satisfied

(3) λj*·gj(x*) = 0, j = 1,2,…,m

(4) λj* ≥ 0, j = 1,2,…,m

(5) xi*·μi* = 0 ∀i, i.e., Σi xi*·[∂f(x*)/∂xi + Σj λj*·∂gj(x*)/∂xi] = x*ᵀ[∇f(x*) + Σj λj*∇gj(x*)] = 0

NOTE THAT NO EXPLICIT MULTIPLIERS ARE USED FOR THE NON-NEGATIVITY CONSTRAINTS!

The Karush-Kuhn-Tucker conditions are the most commonly used ones to check

for optimality. However, they are actually not valid under all conditions.

There is a somewhat weaker set of necessary conditions (known as the Fritz John conditions) that is valid at all times; however, in many instances these conditions do not provide the same information as the Karush-Kuhn-Tucker conditions.


FRITZ JOHN CONDITIONS

These weaker necessary conditions for the problem

Minimize f(x)

st gj(x) ≤ 0, j=1,2,…,m

x ∈ Rn

are based on the so-called weak Lagrangian

L(x,λ) = λ0 f(x) + Σj=1…m λj gj(x)

and may be stated as follows:

If x* is a minimizer then there exists λ* ∈ Rm+1 such that

(1) ∇xL(x*,λ*) = λ0*∇f(x*) + Σj=1…m λj*∇gj(x*) = 0

(2) gj(x*) ≤ 0, j=1,2,…,m

(3) λj*·gj(x*) = 0, j=1,2,…,m

(4) λ* ≥ 0 and λ* ≠ 0.

While the Fritz John conditions are always necessary for x* to be a solution, the Karush-Kuhn-Tucker conditions are necessary provided certain conditions known as CONSTRAINT QUALIFICATIONS are met. In effect:

Local Optimum ⇒ Fritz John conditions hold; Local Optimum + Constraint Qualification ⇒ Karush-Kuhn-Tucker conditions hold.


An example of a constraint qualification is:

The set S = {∇gj(x*) | j ∈ J} is linearly independent, where J = {j | gj(x*) = 0}.

The above C.Q. states that the gradient vectors of all the constraints that are binding (satisfied as equalities) at x* should be linearly independent.

In the Fritz John conditions, suppose that λ0 = 0. Then the conditions are satisfied for any function f at the point x*, regardless of whether or not that function has a minimum at x*!

This is the main weakness of the Fritz John conditions. If λ0 = 0, then they make no use of the objective and they are of no practical use in locating an optimal point.

Constraint qualifications are essentially conditions that ensure that λ0 > 0, so that if we redefine λj by (λj/λ0) for j=0,1,...,m, we can reduce the Fritz John conditions to the Karush-Kuhn-Tucker conditions (λ0 = 1 now and may thus be ignored).


EXAMPLE: Minimize -x1, st g1(x) = x2 - (1 - x1)³ ≤ 0; g2(x) = -x2 ≤ 0.

The optimum solution is at x* = [1, 0]ᵀ, where we also have

∇f(x*) = [-1, 0]ᵀ; ∇g1(x*) = [0, 1]ᵀ; ∇g2(x*) = [0, -1]ᵀ

Note that the constraint qualification is not met: both constraints are active at x*, but the gradient vectors of these constraints are not linearly independent.

FRITZ JOHN CONDITIONS: λ0[-1, 0]ᵀ + λ1[0, 1]ᵀ + λ2[0, -1]ᵀ = [0, 0]ᵀ

Satisfied for λ0 = 0, and λ1 = λ2 = arbitrary > 0.

KARUSH-KUHN-TUCKER CONDITIONS: -∇f(x*) = [1, 0]ᵀ = λ1[0, 1]ᵀ + λ2[0, -1]ᵀ

i.e., λ1(0) + λ2(0) = 1 and λ1(1) + λ2(-1) = 0

Inconsistent! (cannot be satisfied)

[Figure: the feasible region in the (x1, x2) plane bounded by g1(x)=0 and g2(x)=0, with the gradients ∇g1(x*), ∇g2(x*), and ∇f(x*) drawn at x* = (1, 0).]
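A short numeric check of this example (a sketch; the gradients are those computed above):

```python
import numpy as np

# Gradients at x* = (1, 0) for f = -x1, g1 = x2 - (1 - x1)³, g2 = -x2
grad_f  = np.array([-1.0, 0.0])
grad_g1 = np.array([3*(1 - 1.0)**2, 1.0])   # = [0, 1]
grad_g2 = np.array([0.0, -1.0])

# Fritz John: λ0∇f + λ1∇g1 + λ2∇g2 = 0 holds with λ0 = 0, λ1 = λ2 = 1
print(0*grad_f + 1*grad_g1 + 1*grad_g2)     # [0. 0.]

# KKT: ∇f + λ1∇g1 + λ2∇g2 = 0 has no solution; check via least squares
G = np.column_stack([grad_g1, grad_g2])     # columns are the active-constraint gradients
lam, _, rank, _ = np.linalg.lstsq(G, -grad_f, rcond=None)
print(rank, G @ lam + grad_f)               # rank 1, residual != 0 -> inconsistent
```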


CONVEX PROGRAMS

The following NLP (Program P4) is referred to as a convex program:

Minimize f(x)

st gj(x) ≤ 0, j=1,2,…,m,

gj(x) ≥ 0, j=m+1,…,p,

hj(x) = 0, j=1,2,…,q,

x ∈ Rn,

where f is a proper convex function on Rn,

gj are proper convex functions on Rn, j=1,...,m,

gj are proper concave functions on Rn, j=m+1,...,p,

hj are linear (affine) functions, i.e., of the form hj(x) = ajᵀx - bj.

(Note: A proper function has at least one point in its domain where it is < +∞ (i.e., its effective domain is non-empty), and it is also > -∞ everywhere in its domain.)

Thus a convex minimization program has a convex objective, and the set of

feasible solutions is a convex set.

EXERCISE: Show that the set of feasible points determined by the constraints of

program P4 is convex.   


Q. Why are convex programs important?

A. Because they are 'well-behaved' programs with nice results!

Slater's Condition: For convex programs, if there exists x' ∈ Rn such that

gj(x') < 0 for j=1,2,...,m, and

gj(x') > 0 for j=m+1,...,p,

then the constraint qualification holds at x' (the feasible region has a non-empty relative interior).

Furthermore, for a convex program where the constraint qualification holds, the K-K-T necessary conditions are also sufficient.

Many algorithms have been designed specifically for convex programming

problems. A special case is (posynomial) geometric programming, where the

constrained local optima also turn out to be global optima.


LAGRANGIAN DUALITY

Just like LP, NLP also has a duality theory. For the problem

Min f(x), st gj(x) ≤ 0, j=1,2,…,m; hj(x) = 0, j=1,2,…,p; x ∈ S

this is based upon the Lagrangian function L(x,λ,μ) = f(x) + Σj λj gj(x) + Σj μj hj(x).

PRIMAL (MINIMAX):

Minimize L̄(x), st x ∈ S,

where L̄(x) = Max{λ≥0, μ} L(x,λ,μ)

= f(x) if gj(x) ≤ 0 ∀j and hj(x) = 0 ∀j

= +∞ if gj(x) > 0 or hj(x) ≠ 0 for some j,

and so

Min{x∈S} L̄(x) = Min{x∈S} Max{λ≥0,μ} L(x,λ,μ) = Min{x∈S} {f(x) | gj(x) ≤ 0 ∀j, hj(x) = 0 ∀j}

This is our original (Primal) problem!

DUAL (MAXIMIN):

Maximize L̂(λ,μ), st λ ≥ 0 (μ unrestricted),

where L̂(λ,μ) = Min{x∈S} L(x,λ,μ).

In particular, under conditions of convexity & differentiability this reduces to

Max L(x,λ,μ)

st ∇xL(x,λ,μ) = 0, λ ≥ 0, μ unrestricted, x ∈ S.

WEAK DUALITY THEOREM: This states that

L̂(λ,μ) ≤ L(x,λ,μ) ≤ L̄(x) for all λ ≥ 0 and x ∈ S

⇒ Max{λ≥0,μ} {Min{x∈S} L(x,λ,μ)} ≤ Min{x∈S} {Max{λ≥0,μ} L(x,λ,μ)}

The difference L̄(x) - L̂(λ,μ) (in particular, the gap between the optimal primal and dual objective values) is referred to as the "duality gap."


GEOMETRIC INTERPRETATION OF DUALITY

Consider the problem: Minimize f(x), st g(x) ≤ 0, x ∈ S.

Let z1 = g(x) and z2 = f(x).

G is the image of the set S under the map z1 = g(x), z2 = f(x). Thus the original problem is equivalent to

Minimize z2, st z1 ≤ 0, z = (z1, z2) ∈ G,

which yields the optimum solution shown in the figure below.

For λ ≥ 0 given, L̂(λ) = Min{x∈S} L(x,λ) = Min{x∈S} {f(x) + λg(x)} = Min{z∈G} {z2 + λz1}

Thus the dual problem is to find the slope (-λ) of a supporting (tangential) line (or hyperplane) z2 + λz1 = α for which the intercept α with the z2 axis is maximized.

[Figure: the set S in x-space mapped into the set G in the (z1, z2) plane via z = [g(x), f(x)]ᵀ; a supporting line z2 + λz1 = α (slope -λ) touches G from below, and the dual maximizes its z2-intercept α.]


EXAMPLE 1: Minimize 4x1² + 2x1x2 + x2²

st 3x1 + x2 ≥ 6; x1, x2 ≥ 0

Let S = {x | x1 ≥ 0, x2 ≥ 0}. The Lagrangian is

L(x,λ) = 4x1² + 2x1x2 + x2² + λ(-3x1 - x2 + 6)

The PRIMAL problem is

Minimize{x∈S} Maximize{λ≥0} L(x,λ),

i.e., Min 4x1² + 2x1x2 + x2², st 3x1 + x2 ≥ 6; x1, x2 ≥ 0.

The corresponding DUAL problem is

Maximize{λ≥0} Minimize{x∈S} L(x,λ), i.e., Max{λ≥0} L̂(λ).

Consider the dual objective

L̂(λ) = Minimize{x∈S} L(x,λ) = Min{x∈S} {4x1² + 2x1x2 + x2² + λ(-3x1 - x2 + 6)}

For this particular example, L(x,λ) (as a function of x) is convex; therefore to get the minimum the necessary and sufficient conditions are that

∂L(x,λ)/∂x1 = 8x1 + 2x2 - 3λ = 0

∂L(x,λ)/∂x2 = 2x1 + 2x2 - λ = 0, i.e., x1* = λ/3, x2* = λ/6

Note that if λ ≥ 0, this satisfies x ∈ S.


Thus we have L̂(λ) = L[(λ/3), (λ/6), λ] = 6λ - (7/12)λ²

The dual problem thus reduces to

Maximize 6λ - (7/12)λ²

st λ ≥ 0

The objective function for this subproblem turns out to be concave (CHECK and VERIFY for yourself):

∴ dL̂/dλ = 6 - (7/12)·2λ = 0 ⟹ λ = 36/7

Since 36/7 is not < 0, it solves the problem, i.e.,

DUAL SOLUTION: λ* = 36/7, L̂(λ*) = 108/7

PRIMAL SOLUTION: x1* = λ*/3 = 12/7 ≥ 0; x2* = λ*/6 = 6/7 ≥ 0;

and 3x1* + x2* = (36/7) + (6/7) = 6 ---- feasible.

Complementary slackness is satisfied since λ*(6 - 3x1* - x2*) = 0

⇒ (12/7, 6/7) is optimal for the Primal.

Furthermore,

f(x*) = 4(12/7)² + 2(12/7)(6/7) + (6/7)² = 108/7

⟹ primal objective = dual objective (no duality gap)!
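These primal and dual calculations are easy to reproduce numerically. A small sketch (numpy; the dual function is the expression derived above):

```python
import numpy as np

def f(x):
    return 4*x[0]**2 + 2*x[0]*x[1] + x[1]**2

def dual(lam):                         # L̂(λ) = 6λ - (7/12)λ² for λ >= 0
    return 6*lam - (7.0/12.0)*lam**2

lams = np.linspace(0.0, 10.0, 100001)
lam_star = lams[np.argmax(dual(lams))]
x_star = np.array([lam_star/3, lam_star/6])     # minimizer of L(x,λ*) over S
print(lam_star, dual(lam_star))                 # ≈ 36/7, 108/7
print(x_star, f(x_star))                        # ≈ (12/7, 6/7), 108/7 -> no gap
```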


EXAMPLE 2: Min (-x² - x³)

(P) st x² ≤ 1

The Lagrangian for this problem is L(x,λ) = -x² - x³ + λ(x² - 1)

Consider the PRIMAL problem; the optimality conditions are:

∇xL(x,λ) = 0 ⇒ -2x - 3x² + 2λx = 0

λ(x² - 1) = 0, λ ≥ 0; x² ≤ 1.

(a) λ = 0 ⇒ x = -2/3 OR x = 0;

(b) λ > 0 ⇒ x² - 1 = 0 ⇒ x = ±1

x = +1 ⇒ λ = 2.5; x = -1 ⇒ λ = -0.5 (infeasible…)

Therefore the solutions are

(x,λ) = (-2/3, 0) or (0, 0) or (1, 2.5)

f(x) = -4/27 or 0 or -2

Thus the primal optimum is x* = 1, f(x*) = -2.

[Figure: plot of f(x) = -x² - x³ for -2 ≤ x ≤ 2, showing the stationary point at x = -2/3 and the constrained minimum at x* = 1 with f(x*) = -2.]


Let us look at the dual (D) and see if there is any duality gap. The dual is:

(D) Maximize L̂(λ); st λ ≥ 0

where

L̂(λ) = Min_x L(x,λ) = Min_x {-x² - x³ + λ(x² - 1)}

= -∞ for all values of λ!

∴ Max{λ≥0} L̂(λ) = -∞

Thus the primal objective = -2; the dual objective = -∞. There is a DUALITY GAP!

Looking at it geometrically, let z1 = g(x) = x² - 1, and z2 = f(x) = -x² - x³.

The supporting hyperplane is the plane (line) through (-1, 0) that intersects the z2-axis at -∞.
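The unboundedness of the inner minimization is easy to see numerically: for any fixed λ, L(x,λ) is a cubic in x whose -x³ term eventually dominates. A tiny sketch:

```python
def L(x, lam):
    return -x**2 - x**3 + lam*(x**2 - 1)

for lam in (0.0, 1.0, 5.0, 50.0):
    # For every λ, L(x,λ) -> -infinity as x grows, so the dual function is -infinity.
    print(lam, [L(x, lam) for x in (10.0, 100.0, 1000.0)])
```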

[Figure: the image set G in the (z1, z2) plane for z1 = x² - 1 and z2 = -x² - x³, with the optimum primal value marked at (z1, z2) = (0, -2).]


Saddlepoint Sufficiency

Recall from Weak Duality that L̂(λ) ≤ L(x,λ) ≤ L̄(x).

Definition: The point (x̄,λ̄) is a saddlepoint of L(x,λ) if

1. L(x̄,λ̄) ≤ L(x,λ̄) ∀x ∈ S, i.e., L(x̄,λ̄) = Min{x∈S} L(x,λ̄) = L̂(λ̄), and

2. L(x̄,λ) ≤ L(x̄,λ̄) ∀λ ≥ 0, i.e., L(x̄,λ̄) = Max{λ≥0} L(x̄,λ) = L̄(x̄).

In other words, if (x̄,λ̄) is a saddlepoint of L(x,λ) then L̄(x̄) = L(x̄,λ̄) = L̂(λ̄), i.e., primal objective = dual objective at the point and the duality gap = 0.

Now consider the following Primal problem:

Min f(x), st gj(x) ≤ 0, j=1,2,…,m; x ∈ S,

where f(x) and the gj(x) are all convex functions, and S is a convex set.

Saddlepoint Sufficiency Condition: If x̄ ∈ S and λ̄ ≥ 0, then (x̄,λ̄) is a saddlepoint of L(x,λ) if, and only if,

x̄ minimizes L(x,λ̄) = f(x) + λ̄ᵀg(x) over S,

gj(x̄) ≤ 0 for each j=1,2,…,m, and

λ̄j gj(x̄) = 0 ∀j, which implies that f(x̄) = L(x̄,λ̄).

If (x̄,λ̄) is a saddlepoint of L(x,λ), then x̄ solves the Primal problem above and λ̄ solves the Dual problem: Maximize L̂(λ) st λ ≥ 0, where L̂(λ) = Min{x∈S} L(x,λ).


Strong Duality Theorem

Consider the primal problem: Find

γ = inf f(x)

st gj(x) ≤ 0, j=1,2,...,m1

hj(x) = 0, j=1,2,...,m2

x ∈ S

where S ⊆ Rn is nonempty and convex, f(x) and gj(x) are convex & hj(x) are linear.

Note: the infimum may be replaced by a minimum if the minimum is attained at some x.

Define the dual problem: Find

δ = sup L̂(λ,μ), st λ ≥ 0

where L̂(λ,μ) = inf{x∈S} {f(x) + λᵀg(x) + μᵀh(x)}.

Strong Duality Theorem: Assume that the following Constraint Qualification holds:

There exists x̂ ∈ S such that gj(x̂) < 0, j=1,2,…,m1, hj(x̂) = 0, j=1,2,…,m2, and 0 ∈ int h(S),

where h(S) = {h(x): x ∈ S}.

Then γ = δ, i.e., there is no duality gap. Furthermore, if γ > -∞, then δ = L̂(λ*,μ*) for some λ* ≥ 0.

If x* solves the primal, it satisfies complementary slackness: λj*gj(x*) = 0 ∀j.


QUADRATIC PROGRAMMING

Quadratic Programs are convex minimization programs that have a convex (quadratic) objective and a convex constraint set formed by linear constraints. The QP primal problem is:

(P) Minimize ½ xᵀQx + cᵀx

st Ax ≥ b (i.e., b - Ax ≤ 0)

(any nonnegativity restrictions are included in Ax ≥ b)

where

x ∈ Rn, Q is an (n×n), real, symmetric, positive definite matrix,

c ∈ Rn, A is an (m×n) real matrix, b ∈ Rm.

Thus we have a convex objective & linear constraints.

The Lagrangian function is

L(x,λ) = ½ xᵀQx + cᵀx + λᵀ(b - Ax)

The dual QP problem is

Max{λ≥0} Min_x L(x,λ), i.e., Max L(x,λ), st ∇xL(x,λ) = 0, λ ≥ 0.

This yields

(D) Maximize ½ xᵀQx + cᵀx + λᵀ(b - Ax)

st ∇xL(x,λ) = Qx + c - Aᵀλ = 0; λ ≥ 0.


The dual constraints imply that xᵀ[Qx + c - Aᵀλ] = xᵀ[0],

i.e., xᵀQx + xᵀc - xᵀ(Aᵀλ) = xᵀQx + xᵀc - λᵀAx = 0 (*)

The objective function can be rearranged as

λᵀb - ½ xᵀQx + (xᵀQx + xᵀc - λᵀAx) = λᵀb - ½ xᵀQx from (*)

Thus the dual reduces to

Maximize λᵀb - ½ xᵀQx

st Qx + c - Aᵀλ = 0; λ ≥ 0.

If Q = [0], then (P) is Min cᵀx, st Ax ≥ b

(D) is Max λᵀb, st Aᵀλ = c, λ ≥ 0

⇒ the regular LP primal-dual pair!

For Q ≠ [0]:

(P) has n variables and m linear inequality constraints

(D) has (m+n) variables, n equality constraints, and m nonnegativity constraints

If Q is nonsingular (i.e., Q⁻¹ exists), then from the dual constraints,

Qx + c - Aᵀλ = 0 ⇒ x + Q⁻¹c - Q⁻¹Aᵀλ = 0

i.e., x = Q⁻¹[Aᵀλ - c]; thus we may eliminate x altogether from the dual problem!


First, recall that given any two matrices U and V (assuming compatibility), (i) [UV]ᵀ = VᵀUᵀ, (ii) [Uᵀ]ᵀ = U, and (iii) uᵀv = vᵀu whenever the product is a scalar (e.g., for vectors u and v). Also, (iv) Q and Q⁻¹ are symmetric and identical to their transposes.

Substituting this value of x into the dual objective function and rearranging:

λᵀb - ½ xᵀQx = λᵀb - ½ [Q⁻¹(Aᵀλ - c)]ᵀ Q [Q⁻¹(Aᵀλ - c)]

= bᵀλ - ½ (Aᵀλ - c)ᵀ Q⁻¹ Q Q⁻¹ (Aᵀλ - c) ... (i) and (iv)

= bᵀλ - ½ [(Aᵀλ)ᵀ - cᵀ] Q⁻¹ (Aᵀλ - c)

= bᵀλ - ½ [λᵀAQ⁻¹ - cᵀQ⁻¹] (Aᵀλ - c) ... (i) and (ii)

= bᵀλ - ½ [λᵀAQ⁻¹Aᵀλ + cᵀQ⁻¹c - cᵀQ⁻¹Aᵀλ - λᵀAQ⁻¹c]

= bᵀλ - ½ [λᵀAQ⁻¹Aᵀλ + cᵀQ⁻¹c - cᵀQ⁻¹Aᵀλ - (AQ⁻¹c)ᵀλ] ... (iii)

= bᵀλ - ½ [λᵀAQ⁻¹Aᵀλ + cᵀQ⁻¹c - cᵀQ⁻¹Aᵀλ - cᵀ(Q⁻¹)ᵀAᵀλ] ... (i)

= bᵀλ - ½ [λᵀAQ⁻¹Aᵀλ + cᵀQ⁻¹c - 2cᵀQ⁻¹Aᵀλ] ... (iv)

= [bᵀ + cᵀQ⁻¹Aᵀ]λ - ½ [λᵀAQ⁻¹Aᵀλ] - ½ [cᵀQ⁻¹c]

= eᵀλ + ½ [λᵀDλ] - ½ [cᵀQ⁻¹c]

where

e = b + AQ⁻¹c and D = -AQ⁻¹Aᵀ,

so that the dual is (as long as Q is nonsingular)

Maximize eᵀλ + ½ λᵀDλ - ½ cᵀQ⁻¹c, st λ ≥ 0.


EXAMPLE: Minimize ½x1² + ½x2² - 2x1 - 2x2

st 0 ≤ x1 ≤ 1; 0 ≤ x2 ≤ 1

i.e., Minimize ½ xᵀQx + cᵀx with Q = [1 0; 0 1] and cᵀ = [-2 -2],

st Ax ≥ b with A = [-1 0; 0 -1; 1 0; 0 1] and b = [-1 -1 0 0]ᵀ.

In order to define the dual we have Q⁻¹ (= Q) = [1 0; 0 1],

e = b + AQ⁻¹c = [-1; -1; 0; 0] + [-1 0; 0 -1; 1 0; 0 1][-2; -2] = [-1; -1; 0; 0] + [2; 2; -2; -2] = [1; 1; -2; -2]

D = -AQ⁻¹Aᵀ = -[-1 0; 0 -1; 1 0; 0 1][1 0; 0 1][-1 0 1 0; 0 -1 0 1] = [-1 0 1 0; 0 -1 0 1; 1 0 -1 0; 0 1 0 -1]

Therefore the dual is

Maximize eᵀλ + ½ λᵀDλ - ½ cᵀQ⁻¹c, st λ ≥ 0,

with the e and D as given above. The optimal primal solution is then recovered as

x* = Q⁻¹[Aᵀλ* - c] = [1 0; 0 1]([-1 0 1 0; 0 -1 0 1]λ* - [-2; -2]).


AN ALGORITHM FOR QUADRATIC PROGRAMMING PROBLEMS: The Method of HILDRETH AND D'ESOPO

This is a "cyclic co-ordinate search" method applied to the QP dual:

Maximize φ(λ) = eᵀλ + ½ λᵀDλ, st λ ≥ 0

STEP 1: Start with an initial λ ≥ 0 (e.g. λ = 0).

STEP 2: For j=1,2,...,m do the following:

Search for the maximum in the direction parallel to the λj-axis by fixing λk for k≠j and solving ∂φ/∂λj = 0 for λj. If this λj < 0, then fix λj = 0.

STEP 3: If the λ from this iteration is the same as the λ from the previous iteration, STOP; else go to STEP 2.

NOTE: In order to obtain the partial derivatives we need ∇φ(λ) = e + Dλ.

Consider the previous example, where the dual was defined as:

Maximize eᵀλ + ½ λᵀDλ - ½ cᵀQ⁻¹c

with

c = [-2; -2]; Q = [1 0; 0 1]; e = [1; 1; -2; -2]; D = [-1 0 1 0; 0 -1 0 1; 1 0 -1 0; 0 1 0 -1]


Substituting the appropriate values of D, c, Q and e yields

Maximize φ(λ) = [1 1 -2 -2]λ + ½ λᵀ[-1 0 1 0; 0 -1 0 1; 1 0 -1 0; 0 1 0 -1]λ - ½ [-2 -2][1 0; 0 1][-2; -2]

i.e., Maximize φ(λ) = λ1 + λ2 - 2λ3 - 2λ4 - ½(λ1² + λ2² + λ3² + λ4²) + λ1λ3 + λ2λ4 - 4

The K-K-T conditions for this yield

λ1(1 - λ1 + λ3) = 0

λ2(1 - λ2 + λ4) = 0

λ3(-2 - λ3 + λ1) = 0

λ4(-2 - λ4 + λ2) = 0

We would need to solve this system to obtain the optimal λ.

Instead, let us try the method of Hildreth and D'Esopo. The partial derivatives are the components of

∇φ(λ) = Dλ + e = [-1 0 1 0; 0 -1 0 1; 1 0 -1 0; 0 1 0 -1]λ + [1; 1; -2; -2] → [∂φ/∂λ1; ∂φ/∂λ2; ∂φ/∂λ3; ∂φ/∂λ4]


CYCLE 1: Let λ = [0 0 0 0]ᵀ

Solve ∂φ/∂λ1 = 0 with λ2 = λ3 = λ4 = 0 to obtain

-λ1 + λ3 + 1 = 0 ⇒ -λ1 + 0 + 1 = 0 ⇒ λ1 = 1

λ = [1 0 0 0]ᵀ

Solve ∂φ/∂λ2 = 0 with λ1 = 1, λ3 = λ4 = 0 to obtain

-λ2 + λ4 + 1 = 0 ⇒ -λ2 + 0 + 1 = 0 ⇒ λ2 = 1

λ = [1 1 0 0]ᵀ

Solve ∂φ/∂λ3 = 0 with λ1 = 1, λ2 = 1, λ4 = 0 to obtain

λ1 - λ3 - 2 = 0 ⇒ 1 - λ3 - 2 = 0 ⇒ λ3 = -1 ⇒ Fix λ3 = 0

λ = [1 1 0 0]ᵀ

Solve ∂φ/∂λ4 = 0 with λ1 = 1, λ2 = 1, λ3 = 0 to obtain

λ2 - λ4 - 2 = 0 ⇒ 1 - λ4 - 2 = 0 ⇒ λ4 = -1 ⇒ Fix λ4 = 0

λ = [1 1 0 0]ᵀ

END OF CYCLE 1: Another cycle is required since λ has changed from [0 0 0 0]ᵀ to [1 1 0 0]ᵀ.


CYCLE 2:

1. -λ1 + λ3 + 1 = 0; with λ2 = 1, λ3 = 0, λ4 = 0 ⇒ λ1 = 1

λ = [1 1 0 0]ᵀ

2. -λ2 + λ4 + 1 = 0; with λ1 = 1, λ3 = 0, λ4 = 0 ⇒ λ2 = 1

λ = [1 1 0 0]ᵀ

3. λ1 - λ3 - 2 = 0; with λ1 = 1, λ2 = 1, λ4 = 0 ⇒ λ3 = -1

λ = [1 1 0 0]ᵀ (λ3 < 0, so we set it to 0)

4. λ2 - λ4 - 2 = 0; with λ1 = 1, λ2 = 1, λ3 = 0 ⇒ λ4 = -1

λ = [1 1 0 0]ᵀ (λ4 < 0, so we set it to 0)

END OF CYCLE 2

Since the value of λ has not changed from the previous cycle, the algorithm has converged.

Thus λ1* = λ2* = 1; λ3* = λ4* = 0. In order to now obtain the optimal primal vector x*:

x* = Q⁻¹[Aᵀλ* - c] = [1 0; 0 1]([-1 0 1 0; 0 -1 0 1][1; 1; 0; 0] - [-2; -2]) = [1; 1]

Dual Objective = Primal Objective = -3.

EXERCISE: Verify that the K-K-T conditions for the dual are satisfied at the solution (x*, λ*).
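The two hand cycles above are easy to automate. A sketch of the cyclic coordinate ascent (pure numpy; the data and stopping rule follow the example, and the coordinate update is the closed-form solution of ∂φ/∂λj = 0 clipped at zero):

```python
import numpy as np

Q = np.eye(2)
c = np.array([-2.0, -2.0])
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 0.0], [0.0, 1.0]])   # rows of Ax >= b
b = np.array([-1.0, -1.0, 0.0, 0.0])

Qinv = np.linalg.inv(Q)
e = b + A @ Qinv @ c
D = -A @ Qinv @ A.T

lam = np.zeros(len(b))
for _ in range(100):                          # cycle limit, for safety
    lam_prev = lam.copy()
    for j in range(len(lam)):
        # dφ/dλj = e[j] + (Dλ)[j]; solve for λj with the other components fixed
        rest = e[j] + D[j] @ lam - D[j, j]*lam[j]
        lam[j] = max(0.0, -rest / D[j, j])    # D[j, j] < 0, so this maximizes φ in λj
    if np.allclose(lam, lam_prev):
        break

x = Qinv @ (A.T @ lam - c)
print(lam, x, 0.5*x @ Q @ x + c @ x)          # [1 1 0 0], [1 1], -3.0
```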


WOLFE'S ALGORITHM FOR QP

Minimize ½ xᵀQx + cᵀx, st Ax ≥ b, x ≥ 0.

KARUSH-KUHN-TUCKER CONDITIONS (ignoring x ≥ 0)

∇xL(x,λ) = 0 ⇒ Qx + c - Aᵀλ = 0 ⇒ Qx - Aᵀλ = -c

Original Constraints: Ax ≥ b ⇒ Ax - y = b (y = surplus)

Complementary Slackness: λᵀy = 0

Nonnegativity: y ≥ 0, λ ≥ 0.

Thus, we need to solve complementary slackness and nonnegativity for λ, y and x, along with the linear equalities

Qx - Aᵀλ = -c

Ax - y = b

This is solved in a manner similar to the PHASE-1 procedure of the simplex method, while also using a "restricted basis entry" (RBE) rule:

"If yi is already in the basis at a level > 0, do not consider λi as a candidate for entry into the basis, and vice versa. If yi (λi) is in the basis already at a 0 level, λi (yi) may enter only if yi (λi) remains at the zero level."


Let wj represent the artificial variable for the jth equality. Then, for the objective, we minimize the sum of the artificial variables (Σj wj). Thus the problem is to

Minimize W = Σj wj

st the equalities Qx - Aᵀλ = -c and Ax - y = b, each augmented with its artificial variable wj,

x, λ, y, w ≥ 0,

while using the RBE rule.

If the optimum objective value of the LP is zero, we have an optimum solution to the original problem; else the latter has no solution.

======================================

EXAMPLE: Minimize ½x1² + x2² - x1x2 - x1 - x2

st x1 + x2 ≤ 3 (i.e., -x1 - x2 ≥ -3)

2x1 + 3x2 ≥ 6

x1, x2 ≥ 1.

i.e., Min ½ xᵀQx + cᵀx with Q = [1 -1; -1 2] and cᵀ = [-1 -1],

st Ax ≥ b, x ≥ 0, with A = [-1 -1; 2 3; 1 0; 0 1] and b = [-3 6 1 1]ᵀ.

Then it follows that


in the variable order z = (x1, x2, λ1, λ2, λ3, λ4, y1, y2, y3, y4)ᵀ, the equality system Qx - Aᵀλ = -c, Ax - y = b is

[ 1  -1 |  1  -2  -1   0 |  0   0   0   0 ]        [ 1]
[-1   2 |  1  -3   0  -1 |  0   0   0   0 ]        [ 1]
[-1  -1 |  0   0   0   0 | -1   0   0   0 ]  z  =  [-3]
[ 2   3 |  0   0   0   0 |  0  -1   0   0 ]        [ 6]
[ 1   0 |  0   0   0   0 |  0   0  -1   0 ]        [ 1]
[ 0   1 |  0   0   0   0 |  0   0   0  -1 ]        [ 1]

The simplex tableau for Wolfe's method consists of these six rows, each augmented with an artificial variable wj, together with the Phase-1 objective row for Minimize W = Σj wj.

The simplex method requires that the RHS be non-negative, so both sides of the equality in the 3rd constraint must be multiplied by -1 (after which y1 appears with a +1 coefficient in that row and can serve as its initial basic variable, so no artificial is needed there).

Adjusting the objective row via Row 0 ← Row 0 + Σ{j≠3}(Row j), which zeroes out the reduced costs of the basic artificial variables, yields the final starting tableau; now we can proceed with the Simplex method (with RBE).
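As a cross-check on this algebra, the equality system can be assembled directly. A sketch (numpy; it only builds the coefficient matrix M and right-hand side r in the order (x, λ, y), and does not run the restricted-entry simplex itself):

```python
import numpy as np

Q = np.array([[1.0, -1.0], [-1.0, 2.0]])
c = np.array([-1.0, -1.0])
A = np.array([[-1.0, -1.0], [2.0, 3.0], [1.0, 0.0], [0.0, 1.0]])   # Ax >= b
b = np.array([-3.0, 6.0, 1.0, 1.0])

n, m = Q.shape[0], A.shape[0]
# Rows 1..n:    Q x - Aᵀ λ        = -c
# Rows n+1..:   A x         - I y =  b
M = np.block([[Q, -A.T,              np.zeros((n, m))],
              [A,  np.zeros((m, m)), -np.eye(m)      ]])
r = np.concatenate([-c, b])
print(M)
print(r)    # flip the sign of any row with a negative RHS before adding artificials
```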


BEALE'S ALGORITHM FOR QP

Minimize ½ xᵀQx + cᵀx, st Ax = b, x ≥ 0.

Let B ≡ Basic, N ≡ Nonbasic, so that ABxB + ANxN = b,

x = [xB | xN] is the initial basic feasible solution, and

A = [AB | AN], where AB has columns corresponding to the basic variables xB and is nonsingular. If xN = 0, then xB = [AB]⁻¹b.

In general, xB = [AB]⁻¹b - [AB]⁻¹ANxN for any x that satisfies Ax = b.

Let us also partition c and Q as

c = [cB | cN] and Q = [QBB QBN; QNB QNN], where QNB = QBNᵀ.

EXAMPLE: Min ½x1² + ½x2² + x3² + 2x1x2 + 2x1x3 + x2x3 + 2x1 + x2 + x3

i.e., ½ xᵀQx + cᵀx with Q = [1 2 2; 2 1 1; 2 1 2] and cᵀ = [2 1 1],

st x1 + x3 = 8, x2 + x3 = 10, x ≥ 0 ⟹ A = [1 0 1; 0 1 1], b = [8; 10].


An initial BFS might be x1 = 8, x2 = 10, x3 = 0, with the objective equal to 268, so that

[xB | xN] = [x1 x2 | x3] = [8 10 | 0],

[AB | AN] = [1 0 | 1; 0 1 | 1]; [cB | cN] = [2 1 | 1],

and Q = [1 2 2; 2 1 1; 2 1 2] is partitioned as QBB = [1 2; 2 1], QBN = [2; 1], QNB = [2 1], QNN = [2].

Note that

xB = [AB]⁻¹b - [AB]⁻¹ANxN = [1 0; 0 1][8; 10] - [1 0; 0 1][1; 1]x3 = [8; 10] - [1; 1]x3 (= [8; 10] at xN = 0).

The objective function is f(x) = ½ xᵀQx + cᵀx

= ½ [xBᵀ xNᵀ][QBB QBN; QNB QNN][xB; xN] + [cBᵀ | cNᵀ][xB; xN]

Upon substituting xB = [AB]⁻¹b - [AB]⁻¹ANxN we obtain the objective as a function of the nonbasic variables xN alone, say

f(xN) = ½ xNᵀR xN + pᵀxN + constant.


Computing the partial derivatives of this reduced objective, we have

∂f/∂xi = (RxN + p)i for xi ∈ xN

= pi at xN = 0.

Choose the most negative pi, and increase the corresponding value of xi until

(1) One of the basic variables becomes zero (cf. the Simplex method...), OR

(2) ∂f/∂xi reaches 0, i.e., the minimum of f along that coordinate direction is attained

(this results in a feasible solution in which a nonbasic variable is positive).

For our example, with [AB]⁻¹b = [8; 10] and [AB]⁻¹AN = [1; 1],

p = cN - ([AB]⁻¹AN)ᵀcB + QNB([AB]⁻¹b) - ([AB]⁻¹AN)ᵀQBB([AB]⁻¹b)

= 1 - [1 1][2; 1] + [2 1][8; 10] - [1 1][1 2; 2 1][8; 10]

= 1 - 3 + 26 - 54 = -30

∴ ∂f/∂x3 at xN = 0 is -30 < 0.

Increase x3 from 0 until one of the basic variables goes to zero, i.e., by 8 units, when x1 = 0, x2 = 2, x3 = 8 (we don't need R here since xN = 0). The new objective function value is 92 (compared to 268 at the previous iteration).

At this point [xB | xN] = [x2 x3 | x1] = [2 8 | 0], etc., etc., etc…
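The first Beale iteration above can be reproduced with a few matrix operations. A sketch (numpy; the index sets follow the initial basis xB = (x1, x2), xN = (x3), and the formula for p comes from differentiating the substituted objective at xN = 0):

```python
import numpy as np

Q = np.array([[1.0, 2.0, 2.0],
              [2.0, 1.0, 1.0],
              [2.0, 1.0, 2.0]])
c = np.array([2.0, 1.0, 1.0])
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([8.0, 10.0])

Bidx, Nidx = [0, 1], [2]                    # basic: x1, x2; nonbasic: x3
AB, AN = A[:, Bidx], A[:, Nidx]
cB, cN = c[Bidx], c[Nidx]
QBB = Q[np.ix_(Bidx, Bidx)]
QNB = Q[np.ix_(Nidx, Bidx)]

bhat = np.linalg.solve(AB, b)               # [AB]^-1 b  = (8, 10)
W = np.linalg.solve(AB, AN)                 # [AB]^-1 AN = (1, 1)ᵀ

# Reduced gradient p = ∂f/∂xN at xN = 0 after substituting xB = bhat - W xN
p = cN - W.T @ cB + QNB @ bhat - W.T @ QBB @ bhat
print(p)                                    # [-30.] -> increase x3

# Ratio test (all entries of W are positive here): x1 leaves the basis first
step = (bhat / W.ravel()).min()
print(step, bhat - W.ravel()*step)          # 8.0, [0. 2.] -> x1 = 0, x2 = 2, x3 = 8
```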