Lec 17: Multivariable OT — Transcript
-
8/9/2019 Lec 17 Multivariable OT
1/30
Optimality Criteria
• The definitions of a local optimum, a global optimum, and an inflection point remain the same as for single-variable functions.
• However, the optimality criteria for multivariable functions are different.
• For a multivariable function, the gradient is not a scalar quantity; it is a vector.
• The optimality criteria can be derived from the definition of a local optimal point together with a Taylor expansion of the function.
• We present these results here.
-
Optimality Criteria
The unconstrained optimization problem considered in this section is stated as follows:
Find a vector of optimization variables x = (x1, x2, x3, ..., xn)^T in order to minimize f(x), where f(x) is termed the objective function.
-
The first-order optimality condition for the minimum of f(x) can be derived by considering a linear expansion of the function around the optimum point x* using a Taylor series:

Necessary Condition for Optimality:

f(x) ≈ f(x*) + ∇f(x*)^T (x − x*)
f(x) − f(x*) = ∇f(x*)^T (x − x*)

where ∇f(x*) is the gradient of the function f(x) at x* and x − x* is the distance from the optimum.
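As a quick numerical sanity check (a sketch, not from the slides; the quadratic f and the expansion point are illustrative choices), the linear Taylor model can be compared with the true function value at a nearby point:

```python
import numpy as np

def f(x):
    # illustrative function, not from the slides
    return x[0]**2 + 3 * x[1]**2

def grad_f(x):
    return np.array([2 * x[0], 6 * x[1]])

xs = np.array([1.0, 2.0])            # expansion point x*
x = xs + np.array([1e-3, -1e-3])     # nearby point
# first-order Taylor model: f(x) ~= f(x*) + grad f(x*)^T (x - x*)
approx = f(xs) + grad_f(xs) @ (x - xs)
print(abs(f(x) - approx))            # error shrinks as O(||x - x*||^2)
```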
-
Unconstrained Problems:
• If x* is a minimum point, then this condition can only be ensured if ∇f(x*) = 0; the gradient of f(x) must vanish at the optimum.
• Thus the first-order necessary condition for the minimum of a function is that its gradient is zero at the optimum.
• This condition also holds at a maximum point and at any other point where the slope is zero.
• Therefore, it is only a necessary condition, not a sufficient condition.
Conditions for Optimality
[Figure: graph of f(x) = x(−cos(1) − sin(1) + sin(x)) for −6 ≤ x ≤ 6, with a local max, two local minima, and an inflection point marked.]
-
Conditions for Optimality
[Figure: side-by-side graphs of f(x) and df(x)/dx for −6 ≤ x ≤ 6; the local maxima, local minima, and inflection point of f(x) correspond to zero crossings of df(x)/dx.]
-
How to Plot Multivariable functions
% matlab function for 3D plotclear all[X,Y] = meshgrid(-8:.5:8);R = sqrt(X.^2 + Y.^2) + eps;
Z = sin(R)./R;mesh(X,Y,Z)%contour(X,Y,Z,20)
MATLAB PROGRAM:
-
Contours for 3-D Plots

MATLAB PROGRAM:
% MATLAB script for a contour plot
clear all
[X,Y] = meshgrid(-8:.5:8);
R = sqrt(X.^2 + Y.^2) + eps;
Z = sin(R)./R;
contour(X,Y,Z,100)

The direction of steepest ascent (gradient) is generally perpendicular, or orthogonal, to the elevation contour.
-
Optimality Conditions – Unconstrained Case
• Let x* be the point that we think is the minimum of f(x).
• Necessary condition (for optimality): ∇f(x*) = 0
• A point that satisfies the necessary condition is a stationary point.
• It can be a minimum, a maximum, or a saddle point.
• How do we know that we have a minimum? Answer: the sufficiency condition.
The sufficient conditions for x* to be a strict local minimum are:
∇f(x*) = 0
∇²f(x*) is positive definite
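A minimal sketch of checking both conditions numerically, assuming NumPy is available; the quadratic f(x, y) = x² + 2y² (minimum at the origin) and its closed-form gradient and Hessian are illustrative choices, not from the slides:

```python
import numpy as np

# f(x, y) = x**2 + 2*y**2  (illustrative; minimum at the origin)
def grad(p):
    x, y = p
    return np.array([2 * x, 4 * y])

H = np.array([[2.0, 0.0],
              [0.0, 4.0]])     # Hessian is constant for a quadratic

x_star = np.array([0.0, 0.0])
stationary = np.allclose(grad(x_star), 0)      # first-order condition
pos_def = np.all(np.linalg.eigvalsh(H) > 0)    # second-order condition
print(stationary and pos_def)                  # True -> strict local minimum
```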
-
Definition of Gradient
The gradient vector of a function f, denoted ∇f, tells us, from an arbitrary point:
• Which direction is the steepest ascent/descent? That is the direction that will yield the greatest change in f.
• How much will we gain by taking that step? This is indicated by the magnitude ||∇f||.

∇f = (∂f/∂x1, ∂f/∂x2, ..., ∂f/∂xn)^T
-
Gradient – Example
Problem: Employ the gradient to evaluate the steepest-ascent direction for the function f(x, y) = xy² at the point (2, 2).

Solution:
∂f/∂x = y² = (2)² = 4
∂f/∂y = 2xy = 2(2)(2) = 8

Steepest-ascent direction: ∇f = 4i + 8j, where i and j are the unit vectors in the x and y directions, respectively.

Magnitude of ascent = ||∇f|| = sqrt(4² + 8²) = 8.944
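The hand computation above can be confirmed with central finite differences (a sketch; the step size h is an illustrative choice):

```python
import numpy as np

f = lambda x, y: x * y**2          # function from the slide
h = 1e-6                           # central-difference step (illustrative)

# numeric partial derivatives at (2, 2)
fx = (f(2 + h, 2) - f(2 - h, 2)) / (2 * h)   # analytic: y**2 = 4
fy = (f(2, 2 + h) - f(2, 2 - h)) / (2 * h)   # analytic: 2*x*y = 8
mag = np.hypot(fx, fy)                       # ||grad f|| = sqrt(80)
print(round(fx, 3), round(fy, 3), round(mag, 3))
```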
-
Gradient – Example
The direction of steepest ascent (gradient) is generally perpendicular, or orthogonal, to the elevation contour.

[Figure: surface and contour plots of f(x, y) = xy² over 0 ≤ x, y ≤ 5.]

MATLAB PROGRAM:
% MATLAB script for the plots of f(x,y) = x*y^2
clear all
[x,y] = meshgrid(0:.2:5);
z = x.*y.^2;
% mesh(x,y,z)
contour(x,y,z,20)
-
Testing Optimum Point for 1-D
• For 1-D problems:
If f'(x') = 0, and
If f''(x') < 0, then x' is a maximum point
If f''(x') > 0, then x' is a minimum point
If f''(x') = 0, the test is inconclusive; x' may be a maximum, a minimum, or an inflection point, and higher-order derivatives are needed.
• What about multi-dimensional problems?
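The 1-D rules above can be sketched in code; the test function f(x) = x³ − 3x (with f'(x) = 3x² − 3 and stationary points x = ±1) and the classify helper are illustrative, not from the slides:

```python
# classify a 1-D stationary point via the second derivative
def classify(fpp):
    if fpp > 0:
        return "minimum"
    if fpp < 0:
        return "maximum"
    return "inconclusive"   # higher-order derivatives needed

# illustrative example: f(x) = x**3 - 3*x, so f''(x) = 6*x
fpp = lambda x: 6 * x
print(classify(fpp(1.0)))   # x = 1:  f'' = 6  -> minimum
print(classify(fpp(-1.0)))  # x = -1: f'' = -6 -> maximum
```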
-
Testing Optimum Point for Two-D
• For 2-D problems, if a point is an optimum point, then
∂f/∂x = 0 and ∂f/∂y = 0
• In addition, if the point is a maximum point, then
∂²f/∂x² < 0 and ∂²f/∂y² < 0
• Question: If both of these conditions are satisfied for a point, can we conclude that the point is a maximum point?
-
Testing Optimum Point for Two-D Systems
• For 2-D functions, we also have to take into consideration the mixed second partial derivative ∂²f/∂x∂y.
• That is, whether a maximum or a minimum occurs involves both partial derivatives with respect to x and y and the second partials with respect to x and y.
(Note: ∂²f/∂x∂y = ∂²f/∂y∂x.)
-
Hessian Matrix (or Hessian of f)
• Also known as the matrix of second partial derivatives.
• It provides a way to discern whether a function has reached an optimum or not.

For n = 2:
H = | ∂²f/∂x²    ∂²f/∂x∂y |
    | ∂²f/∂y∂x   ∂²f/∂y²  |

In general, for n variables:
H = | ∂²f/∂x1²     ∂²f/∂x1∂x2   ...   ∂²f/∂x1∂xn |
    | ∂²f/∂x2∂x1   ∂²f/∂x2²     ...   ∂²f/∂x2∂xn |
    | ...          ...          ...   ...        |
    | ∂²f/∂xn∂x1   ∂²f/∂xn∂x2   ...   ∂²f/∂xn²   |
-
Testing Optimum Point (General Case)
• Suppose the gradient ∇f and the Hessian H are evaluated at x* = (x*1, x*2, ..., x*n).
• If ∇f = 0, the point x* is a stationary point.
• Further, if H is positive definite, then x* is a minimum.
• If −H is positive definite (i.e., H is negative definite), then x* is a maximum point.
• If H is indefinite (neither positive nor negative definite), then x* is a saddle point.
• If H is singular, no conclusion can be drawn (further investigation is needed).
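These cases can be sketched as an eigenvalue test (positive definite = all eigenvalues positive, negative definite = all negative, mixed signs = indefinite); the classify helper and the tolerance are illustrative assumptions:

```python
import numpy as np

def classify(H, tol=1e-10):
    """Classify a stationary point from the Hessian's eigenvalues (sketch)."""
    ev = np.linalg.eigvalsh(H)           # eigenvalues of a symmetric matrix
    if np.all(ev > tol):
        return "minimum"                 # H positive definite
    if np.all(ev < -tol):
        return "maximum"                 # H negative definite
    if np.any(ev > tol) and np.any(ev < -tol):
        return "saddle"                  # H indefinite
    return "inconclusive"                # H (near-)singular

print(classify(np.array([[2.0, 0.0], [0.0, 4.0]])))    # minimum
print(classify(np.array([[-1.0, 0.0], [0.0, -3.0]])))  # maximum
print(classify(np.array([[1.0, 0.0], [0.0, -1.0]])))   # saddle
```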
-
Testing Optimum Point (Special Case – Function with Two Variables)
Assume the partial derivatives are continuous at and near the point being evaluated. For a function with two variables (i.e., N = 2):

|H| = (∂²f/∂x²)(∂²f/∂y²) − (∂²f/∂x∂y)²

If |H| > 0 and ∂²f/∂x² > 0, then f(x, y) has a local minimum
If |H| > 0 and ∂²f/∂x² < 0, then f(x, y) has a local maximum
If |H| < 0, then f(x, y) has a saddle point

The quantity |H| is equal to the determinant of the Hessian matrix of f.
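A sketch of the two-variable test applied to the illustrative saddle f(x, y) = x² − y² at the origin (not a slide example; there fxx = 2, fyy = −2, fxy = 0):

```python
# two-variable determinant test for f(x, y) = x**2 - y**2 at (0, 0)
fxx, fyy, fxy = 2.0, -2.0, 0.0
detH = fxx * fyy - fxy**2          # |H| = -4
if detH > 0 and fxx > 0:
    kind = "local minimum"
elif detH > 0 and fxx < 0:
    kind = "local maximum"
elif detH < 0:
    kind = "saddle point"
else:
    kind = "inconclusive"
print(kind)                        # saddle point
```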
-
Principal Minors Test
• This test usually requires less computational effort than the eigenvalue test. If all principal minors A_i, for i = 1, 2, ..., n, of the n × n matrix A in the quadratic form f(x) = 0.5 x^T A x are known, then the sign of the quadratic form is determined as follows:
1. Positive definite if A_i > 0 for all i = 1, ..., n.
2. Positive semi-definite if A_i ≥ 0 for all i = 1, ..., n.
3. Negative definite if A_i < 0 for all i = 1, 3, 5, ... (odd indices) and A_i > 0 for all i = 2, 4, 6, ... (even indices).
4. Negative semi-definite if A_i ≤ 0 for all i = 1, 3, 5, ... (odd indices) and A_i ≥ 0 for all i = 2, 4, 6, ... (even indices).
5. Indefinite if none of the above cases applies.
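One way to sketch the positive-definiteness check, assuming the minors are taken as determinants of the leading (top-left) i × i submatrices; the example matrix is the Hessian from Example 4.1:

```python
import numpy as np

def leading_minors(A):
    """Determinants of the leading principal submatrices A[:i, :i] (sketch)."""
    return [np.linalg.det(A[:i, :i]) for i in range(1, A.shape[0] + 1)]

A = np.array([[2.0, -1.0],
              [-1.0, 4.0]])         # Hessian of Example 4.1
minors = leading_minors(A)          # A_1 = 2, A_2 = 2*4 - (-1)**2 = 7
print(all(m > 0 for m in minors))   # True -> A is positive definite
```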
-
Example 4.1:
Find all stationary points for the following function. Using the optimality conditions, classify them as minimum, maximum, or inflection points.

The objective function is: f(x, y) = −2x + x² − xy + 2y²

The gradient vector:
∇f = ( ∂f/∂x, ∂f/∂y )^T = ( −2 + 2x − y, −x + 4y )^T

The Hessian matrix:
H = | ∂²f/∂x²    ∂²f/∂x∂y |  =  |  2  −1 |
    | ∂²f/∂y∂x   ∂²f/∂y²  |     | −1   4 |

Setting ∇f = 0 gives 2x − y = 2 and −x + 4y = 0, so the only stationary point is x* = 8/7 ≈ 1.14286, y* = 2/7 ≈ 0.285714.
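Since the gradient of this quadratic is linear, the stationary point can be found by solving a 2 × 2 linear system; a sketch using NumPy:

```python
import numpy as np

# Example 4.1: f(x, y) = -2x + x**2 - x*y + 2*y**2
# grad f = 0 gives the linear system  H @ [x, y] = [2, 0]
H = np.array([[2.0, -1.0],
              [-1.0, 4.0]])
b = np.array([2.0, 0.0])
x_star = np.linalg.solve(H, b)     # stationary point (8/7, 2/7)
f_star = (-2 * x_star[0] + x_star[0]**2
          - x_star[0] * x_star[1] + 2 * x_star[1]**2)
print(np.round(x_star, 5), round(f_star, 5))
```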
-
Example 4.1:
The value of f(x, y) at the point x* = 1.14286, y* = 0.285714 is f = −1.14286.
The point is a minimum point. Since the Hessian matrix is positive definite, we know the function is convex. Therefore any local minimum is a global minimum.

% MATLAB program to draw the contour of the function
[X,Y] = meshgrid(-1:.1:2);
Z = -2.*X + X.*X - X.*Y + 2.*Y.*Y;
contour(X,Y,Z,100)
-
Example 4.1:
• Contour graph using MATLAB: graphical presentation of the function and its minimum at the point (x*, y*).
-
Example 4.1:
Important observations:
• The minimum point does not change if we add a constant to the objective function.
• The minimum point does not change if we multiply the objective function by a positive constant.
• The problem changes from a minimization to a maximization problem if we multiply the objective function by a negative sign.
• The unconstrained problem is a convex problem if the objective function is convex. For convex problems, any local minimum is also a global minimum.
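The first three observations can be checked with a small numeric sketch, using the illustrative 1-D function f(x) = (x − 3)² on a grid (not from the slides):

```python
import numpy as np

# the minimizer of f is unchanged by adding a constant or by a
# positive scaling; illustrative check on f(x) = (x - 3)**2
x = np.linspace(-10, 10, 2001)
f = (x - 3)**2
argmin = lambda v: x[np.argmin(v)]

print(argmin(f))            # minimizer near x = 3
print(argmin(f + 100.0))    # adding a constant: same minimizer
print(argmin(5.0 * f))      # positive scaling: same minimizer
print(x[np.argmax(-f)])     # negating f turns the minimum into a maximum
```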