a brief introduction to pde constrained optimization · a brief introduction to pde constrained...

100
A brief introduction to PDE constrained optimization Harbir Antil Department of Mathematical Sciences George Mason University [email protected] June 6, 2016 Frontiers in PDE-constrained Optimization Institute for Mathematics and its Applications Funding: ExxonMobil, IMA H. Antil 06/06/16 1

Upload: others

Post on 27-May-2020

24 views

Category:

Documents


0 download

TRANSCRIPT

A brief introduction to PDE constrainedoptimization

Harbir Antil

Department of Mathematical SciencesGeorge Mason University

hantilgmuedu

June 6 2016

Frontiers in PDE-constrained OptimizationInstitute for Mathematics and its Applications

Funding ExxonMobil IMA

H Antil 060616 1

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 2

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 3

Controlling pollutants in river Antil Heinkenschloss Hoppe 2010

(Loading movieavi)

H Antil 060616 4

riveravi
Media File (videoavi)

Magnetic drug targeting Antil Nochetto Venegas 2016

The magnetic field exerts a force on magnetic materials such as magneticnanoparticles (MNPs)

Experimental setup (Shapiro et al 2013)

H Antil 060616 5

Electrowetting on Dielectric (EWOD)

maximal ldquopinning pressurerdquo Ppin=2cpin H in dimensionalform which represents the maximum opposing force perliquid-gas interface area that the contact line can applyagainst motion of the interface The factor of ldquo2rdquo accountsfor the interface contact line pinning at the floor and ceilingof the EWOD device The nondimensional pinning pressureis then given by

Ppin =1P0

2cpin

H 12

where P0 is the dimensional reference pressure scale forceper unit of area

This allows us to introduce a locally defined pinningpressure into the boundary conditions in nondimensionalform

p = + E + 13

= Ppin sgnu middot n

In other words if the normal velocity of the liquid-gas inter-face is positive then the pinning pressure will push backwith maximum positive pressure +Ppin to limit the motionLikewise if the normal velocity is negative the pinningpressure will push back in the opposite direction Ppin Andif the normal velocity is zero then takes on a value be-tween Ppin and acts as a Lagrange multiplier to enforce theconstraint that the interface does not move also see Fig 5This definition of ensures $ Ppin and so is consistentwith the experimental observation in Sec II D 1

Numerical implementation of this simple phenomeno-logical model is difficult because of the discontinuity $thesign function of Eq 13 see Fig 6 This discontinuityhowever is essential If it was replaced by a smooth functionthen no droplet could ever be pinned in a noncircular shapesomething that does occur in the experiments see Fig 12below To show this we argue by contradiction Assume no

EWOD forcing so the boundary pressure is given by p=+ $from Eq 13 Suppose the droplet has become com-pletely pinned in a noncircular shape but the function relat-ing to the normal front velocity u middotn is smooth Any sym-metric smoothed version f of the sign function ldquosgnrdquo musthave = Ppinf0=0 which implies p= by Eq 13 But ifthe droplet is pinned in a noncircular shape then will notbe constant around the droplets circumference hence thepressure field inside the droplet will not be constant72 ByEq 1 the velocity u will not be zero which contradicts thedroplet being pinned Note that a smooth but steep function fwill lead to a slowly creeping liquid but only a discontinu-ous f can truly pin the liquid Moreover our model with thesgn function is a physically motivated description538ndash4463ndash68

Our model with Eq 13 is nonlinear and introduces thevelocity into the pressure boundary conditions Moreover itis local meaning that multiple parts of the boundary canbe pinned and unpinned simultaneously Furthermore the re-gions where the droplet is pinnedunpinned are not known apriori this must be determined as part of the solution Ourpinning model is similar to the Signorini problem inelasticity73 which models the deformation of an elastic bodyagainst a rigid obstacle and utilizes a contact variable to en-force the rigid constraint similarly to our Our model canbe increased in complexity see Sec II D 3 to account formore interesting contact line dynamics54ndash57 We are able toinclude this model into our variational formulation of SecIII and we have a method of solving for the velocity pres-sure and 2574 Knowing immediately yields where theboundary is pinnedunpinned

One important issue here is the three-phase linesingularity75ndash77 which originates because of an apparentparadox between the no-slip boundary condition on the wallsof the device and the fact that the contact line moves Ourformulation is only a two dimensional description as appro-priate for planar devices We do not track the location of thethree-phase line in our model in three dimensions we onlyhave an ldquoaveragedrdquo description of the liquid-gas interfaceRecall that the velocity field in the z direction is parabolic afundamental assumption in Hele-Shaw flow Thus in our

Pinned

Pinned

Pinned

= Ppin

= Ppin

= +Ppin

Ppin +Ppin

FIG 5 A two dimensional droplet top view with parts of the boundarypinned electrode grid not shown The pinned regions are denoted by asolid line unpinned regions are shown as a dashed line with velocity arrowsindicating direction of motion An outward motion is considered positiveu middotnamp0 and an inward motion is negative u middotn0 The pinning variable is defined on the boundary of the droplet On the unpinned regions thevalue of saturates to Ppin On the pinned regions u middotn=0 variesbetween Ppin and +Ppin see Fig 6 In our simulations Ppin is used toindicate where the boundary is pinned

+Ppin

Ppin

u middot n

FIG 6 A more realistic relationship between the resistive pressure at theinterface and the normal velocity Here $+Dviscu middotn is plotted as the thickline Note how the interface pressure increases with increasing velocity Thedashed line depicts a slightly more nonlinear relationship $+Dviscu middotn+Gfricu middotn $see Eq 16 The qualitative form of the dashed line has beenshown in the work of Refs 54ndash57 Note the dashed line asymptotes towardthe thin line Dviscu middotn

102103-6 Walker Shapiro and Nochetto Phys Fluids 21 102103 2009

Downloaded 29 Jun 2010 to 168725468 Redistribution subject to AIP license or copyright see httppofaiporgpofcopyrightjsp

I Forward problem Walker Nochetto Shapiro 2009

I Optimal control Antil Hintermuller Nochetto Surowiec Wegner2015

H Antil 060616 6

Shape optimization of microfluidic biochip Antil

Heinkenschloss Hoppe Linsenmann Wixforth 2010

Microfluidic Biochip

H Antil 060616 7

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 2

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 3

Controlling pollutants in river Antil Heinkenschloss Hoppe 2010

(Loading movieavi)

H Antil 060616 4

riveravi
Media File (videoavi)

Magnetic drug targeting Antil Nochetto Venegas 2016

The magnetic field exerts a force on magnetic materials such as magneticnanoparticles (MNPs)

Experimental setup (Shapiro et al 2013)

H Antil 060616 5

Electrowetting on Dielectric (EWOD)

maximal ldquopinning pressurerdquo Ppin=2cpin H in dimensionalform which represents the maximum opposing force perliquid-gas interface area that the contact line can applyagainst motion of the interface The factor of ldquo2rdquo accountsfor the interface contact line pinning at the floor and ceilingof the EWOD device The nondimensional pinning pressureis then given by

Ppin =1P0

2cpin

H 12

where P0 is the dimensional reference pressure scale forceper unit of area

This allows us to introduce a locally defined pinningpressure into the boundary conditions in nondimensionalform

p = + E + 13

= Ppin sgnu middot n

In other words if the normal velocity of the liquid-gas inter-face is positive then the pinning pressure will push backwith maximum positive pressure +Ppin to limit the motionLikewise if the normal velocity is negative the pinningpressure will push back in the opposite direction Ppin Andif the normal velocity is zero then takes on a value be-tween Ppin and acts as a Lagrange multiplier to enforce theconstraint that the interface does not move also see Fig 5This definition of ensures $ Ppin and so is consistentwith the experimental observation in Sec II D 1

Numerical implementation of this simple phenomeno-logical model is difficult because of the discontinuity $thesign function of Eq 13 see Fig 6 This discontinuityhowever is essential If it was replaced by a smooth functionthen no droplet could ever be pinned in a noncircular shapesomething that does occur in the experiments see Fig 12below To show this we argue by contradiction Assume no

EWOD forcing so the boundary pressure is given by p=+ $from Eq 13 Suppose the droplet has become com-pletely pinned in a noncircular shape but the function relat-ing to the normal front velocity u middotn is smooth Any sym-metric smoothed version f of the sign function ldquosgnrdquo musthave = Ppinf0=0 which implies p= by Eq 13 But ifthe droplet is pinned in a noncircular shape then will notbe constant around the droplets circumference hence thepressure field inside the droplet will not be constant72 ByEq 1 the velocity u will not be zero which contradicts thedroplet being pinned Note that a smooth but steep function fwill lead to a slowly creeping liquid but only a discontinu-ous f can truly pin the liquid Moreover our model with thesgn function is a physically motivated description538ndash4463ndash68

Our model with Eq 13 is nonlinear and introduces thevelocity into the pressure boundary conditions Moreover itis local meaning that multiple parts of the boundary canbe pinned and unpinned simultaneously Furthermore the re-gions where the droplet is pinnedunpinned are not known apriori this must be determined as part of the solution Ourpinning model is similar to the Signorini problem inelasticity73 which models the deformation of an elastic bodyagainst a rigid obstacle and utilizes a contact variable to en-force the rigid constraint similarly to our Our model canbe increased in complexity see Sec II D 3 to account formore interesting contact line dynamics54ndash57 We are able toinclude this model into our variational formulation of SecIII and we have a method of solving for the velocity pres-sure and 2574 Knowing immediately yields where theboundary is pinnedunpinned

One important issue here is the three-phase linesingularity75ndash77 which originates because of an apparentparadox between the no-slip boundary condition on the wallsof the device and the fact that the contact line moves Ourformulation is only a two dimensional description as appro-priate for planar devices We do not track the location of thethree-phase line in our model in three dimensions we onlyhave an ldquoaveragedrdquo description of the liquid-gas interfaceRecall that the velocity field in the z direction is parabolic afundamental assumption in Hele-Shaw flow Thus in our

Pinned

Pinned

Pinned

= Ppin

= Ppin

= +Ppin

Ppin +Ppin

FIG 5 A two dimensional droplet top view with parts of the boundarypinned electrode grid not shown The pinned regions are denoted by asolid line unpinned regions are shown as a dashed line with velocity arrowsindicating direction of motion An outward motion is considered positiveu middotnamp0 and an inward motion is negative u middotn0 The pinning variable is defined on the boundary of the droplet On the unpinned regions thevalue of saturates to Ppin On the pinned regions u middotn=0 variesbetween Ppin and +Ppin see Fig 6 In our simulations Ppin is used toindicate where the boundary is pinned

+Ppin

Ppin

u middot n

FIG 6 A more realistic relationship between the resistive pressure at theinterface and the normal velocity Here $+Dviscu middotn is plotted as the thickline Note how the interface pressure increases with increasing velocity Thedashed line depicts a slightly more nonlinear relationship $+Dviscu middotn+Gfricu middotn $see Eq 16 The qualitative form of the dashed line has beenshown in the work of Refs 54ndash57 Note the dashed line asymptotes towardthe thin line Dviscu middotn

102103-6 Walker Shapiro and Nochetto Phys Fluids 21 102103 2009

Downloaded 29 Jun 2010 to 168725468 Redistribution subject to AIP license or copyright see httppofaiporgpofcopyrightjsp

I Forward problem Walker Nochetto Shapiro 2009

I Optimal control Antil Hintermuller Nochetto Surowiec Wegner2015

H Antil 060616 6

Shape optimization of microfluidic biochip Antil

Heinkenschloss Hoppe Linsenmann Wixforth 2010

Microfluidic Biochip

H Antil 060616 7

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 3

Controlling pollutants in river Antil Heinkenschloss Hoppe 2010

(Loading movieavi)

H Antil 060616 4

riveravi
Media File (videoavi)

Magnetic drug targeting Antil Nochetto Venegas 2016

The magnetic field exerts a force on magnetic materials such as magneticnanoparticles (MNPs)

Experimental setup (Shapiro et al 2013)

H Antil 060616 5

Electrowetting on Dielectric (EWOD)

maximal ldquopinning pressurerdquo Ppin=2cpin H in dimensionalform which represents the maximum opposing force perliquid-gas interface area that the contact line can applyagainst motion of the interface The factor of ldquo2rdquo accountsfor the interface contact line pinning at the floor and ceilingof the EWOD device The nondimensional pinning pressureis then given by

Ppin =1P0

2cpin

H 12

where P0 is the dimensional reference pressure scale forceper unit of area

This allows us to introduce a locally defined pinningpressure into the boundary conditions in nondimensionalform

p = + E + 13

= Ppin sgnu middot n

In other words if the normal velocity of the liquid-gas inter-face is positive then the pinning pressure will push backwith maximum positive pressure +Ppin to limit the motionLikewise if the normal velocity is negative the pinningpressure will push back in the opposite direction Ppin Andif the normal velocity is zero then takes on a value be-tween Ppin and acts as a Lagrange multiplier to enforce theconstraint that the interface does not move also see Fig 5This definition of ensures $ Ppin and so is consistentwith the experimental observation in Sec II D 1

Numerical implementation of this simple phenomeno-logical model is difficult because of the discontinuity $thesign function of Eq 13 see Fig 6 This discontinuityhowever is essential If it was replaced by a smooth functionthen no droplet could ever be pinned in a noncircular shapesomething that does occur in the experiments see Fig 12below To show this we argue by contradiction Assume no

EWOD forcing so the boundary pressure is given by p=+ $from Eq 13 Suppose the droplet has become com-pletely pinned in a noncircular shape but the function relat-ing to the normal front velocity u middotn is smooth Any sym-metric smoothed version f of the sign function ldquosgnrdquo musthave = Ppinf0=0 which implies p= by Eq 13 But ifthe droplet is pinned in a noncircular shape then will notbe constant around the droplets circumference hence thepressure field inside the droplet will not be constant72 ByEq 1 the velocity u will not be zero which contradicts thedroplet being pinned Note that a smooth but steep function fwill lead to a slowly creeping liquid but only a discontinu-ous f can truly pin the liquid Moreover our model with thesgn function is a physically motivated description538ndash4463ndash68

Our model with Eq 13 is nonlinear and introduces thevelocity into the pressure boundary conditions Moreover itis local meaning that multiple parts of the boundary canbe pinned and unpinned simultaneously Furthermore the re-gions where the droplet is pinnedunpinned are not known apriori this must be determined as part of the solution Ourpinning model is similar to the Signorini problem inelasticity73 which models the deformation of an elastic bodyagainst a rigid obstacle and utilizes a contact variable to en-force the rigid constraint similarly to our Our model canbe increased in complexity see Sec II D 3 to account formore interesting contact line dynamics54ndash57 We are able toinclude this model into our variational formulation of SecIII and we have a method of solving for the velocity pres-sure and 2574 Knowing immediately yields where theboundary is pinnedunpinned

One important issue here is the three-phase linesingularity75ndash77 which originates because of an apparentparadox between the no-slip boundary condition on the wallsof the device and the fact that the contact line moves Ourformulation is only a two dimensional description as appro-priate for planar devices We do not track the location of thethree-phase line in our model in three dimensions we onlyhave an ldquoaveragedrdquo description of the liquid-gas interfaceRecall that the velocity field in the z direction is parabolic afundamental assumption in Hele-Shaw flow Thus in our

Pinned

Pinned

Pinned

= Ppin

= Ppin

= +Ppin

Ppin +Ppin

FIG 5 A two dimensional droplet top view with parts of the boundarypinned electrode grid not shown The pinned regions are denoted by asolid line unpinned regions are shown as a dashed line with velocity arrowsindicating direction of motion An outward motion is considered positiveu middotnamp0 and an inward motion is negative u middotn0 The pinning variable is defined on the boundary of the droplet On the unpinned regions thevalue of saturates to Ppin On the pinned regions u middotn=0 variesbetween Ppin and +Ppin see Fig 6 In our simulations Ppin is used toindicate where the boundary is pinned

+Ppin

Ppin

u middot n

FIG 6 A more realistic relationship between the resistive pressure at theinterface and the normal velocity Here $+Dviscu middotn is plotted as the thickline Note how the interface pressure increases with increasing velocity Thedashed line depicts a slightly more nonlinear relationship $+Dviscu middotn+Gfricu middotn $see Eq 16 The qualitative form of the dashed line has beenshown in the work of Refs 54ndash57 Note the dashed line asymptotes towardthe thin line Dviscu middotn

102103-6 Walker Shapiro and Nochetto Phys Fluids 21 102103 2009

Downloaded 29 Jun 2010 to 168725468 Redistribution subject to AIP license or copyright see httppofaiporgpofcopyrightjsp

I Forward problem Walker Nochetto Shapiro 2009

I Optimal control Antil Hintermuller Nochetto Surowiec Wegner2015

H Antil 060616 6

Shape optimization of microfluidic biochip Antil

Heinkenschloss Hoppe Linsenmann Wixforth 2010

Microfluidic Biochip

H Antil 060616 7

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Controlling pollutants in river Antil Heinkenschloss Hoppe 2010

(Loading movieavi)

H Antil 060616 4

riveravi
Media File (videoavi)

Magnetic drug targeting Antil Nochetto Venegas 2016

The magnetic field exerts a force on magnetic materials such as magneticnanoparticles (MNPs)

Experimental setup (Shapiro et al 2013)

H Antil 060616 5

Electrowetting on Dielectric (EWOD)

maximal ldquopinning pressurerdquo Ppin=2cpin H in dimensionalform which represents the maximum opposing force perliquid-gas interface area that the contact line can applyagainst motion of the interface The factor of ldquo2rdquo accountsfor the interface contact line pinning at the floor and ceilingof the EWOD device The nondimensional pinning pressureis then given by

Ppin =1P0

2cpin

H 12

where P0 is the dimensional reference pressure scale forceper unit of area

This allows us to introduce a locally defined pinningpressure into the boundary conditions in nondimensionalform

p = + E + 13

= Ppin sgnu middot n

In other words if the normal velocity of the liquid-gas inter-face is positive then the pinning pressure will push backwith maximum positive pressure +Ppin to limit the motionLikewise if the normal velocity is negative the pinningpressure will push back in the opposite direction Ppin Andif the normal velocity is zero then takes on a value be-tween Ppin and acts as a Lagrange multiplier to enforce theconstraint that the interface does not move also see Fig 5This definition of ensures $ Ppin and so is consistentwith the experimental observation in Sec II D 1

Numerical implementation of this simple phenomeno-logical model is difficult because of the discontinuity $thesign function of Eq 13 see Fig 6 This discontinuityhowever is essential If it was replaced by a smooth functionthen no droplet could ever be pinned in a noncircular shapesomething that does occur in the experiments see Fig 12below To show this we argue by contradiction Assume no

EWOD forcing so the boundary pressure is given by p=+ $from Eq 13 Suppose the droplet has become com-pletely pinned in a noncircular shape but the function relat-ing to the normal front velocity u middotn is smooth Any sym-metric smoothed version f of the sign function ldquosgnrdquo musthave = Ppinf0=0 which implies p= by Eq 13 But ifthe droplet is pinned in a noncircular shape then will notbe constant around the droplets circumference hence thepressure field inside the droplet will not be constant72 ByEq 1 the velocity u will not be zero which contradicts thedroplet being pinned Note that a smooth but steep function fwill lead to a slowly creeping liquid but only a discontinu-ous f can truly pin the liquid Moreover our model with thesgn function is a physically motivated description538ndash4463ndash68

Our model with Eq 13 is nonlinear and introduces thevelocity into the pressure boundary conditions Moreover itis local meaning that multiple parts of the boundary canbe pinned and unpinned simultaneously Furthermore the re-gions where the droplet is pinnedunpinned are not known apriori this must be determined as part of the solution Ourpinning model is similar to the Signorini problem inelasticity73 which models the deformation of an elastic bodyagainst a rigid obstacle and utilizes a contact variable to en-force the rigid constraint similarly to our Our model canbe increased in complexity see Sec II D 3 to account formore interesting contact line dynamics54ndash57 We are able toinclude this model into our variational formulation of SecIII and we have a method of solving for the velocity pres-sure and 2574 Knowing immediately yields where theboundary is pinnedunpinned

One important issue here is the three-phase linesingularity75ndash77 which originates because of an apparentparadox between the no-slip boundary condition on the wallsof the device and the fact that the contact line moves Ourformulation is only a two dimensional description as appro-priate for planar devices We do not track the location of thethree-phase line in our model in three dimensions we onlyhave an ldquoaveragedrdquo description of the liquid-gas interfaceRecall that the velocity field in the z direction is parabolic afundamental assumption in Hele-Shaw flow Thus in our

Pinned

Pinned

Pinned

= Ppin

= Ppin

= +Ppin

Ppin +Ppin

FIG 5 A two dimensional droplet top view with parts of the boundarypinned electrode grid not shown The pinned regions are denoted by asolid line unpinned regions are shown as a dashed line with velocity arrowsindicating direction of motion An outward motion is considered positiveu middotnamp0 and an inward motion is negative u middotn0 The pinning variable is defined on the boundary of the droplet On the unpinned regions thevalue of saturates to Ppin On the pinned regions u middotn=0 variesbetween Ppin and +Ppin see Fig 6 In our simulations Ppin is used toindicate where the boundary is pinned

+Ppin

Ppin

u middot n

FIG 6 A more realistic relationship between the resistive pressure at theinterface and the normal velocity Here $+Dviscu middotn is plotted as the thickline Note how the interface pressure increases with increasing velocity Thedashed line depicts a slightly more nonlinear relationship $+Dviscu middotn+Gfricu middotn $see Eq 16 The qualitative form of the dashed line has beenshown in the work of Refs 54ndash57 Note the dashed line asymptotes towardthe thin line Dviscu middotn

102103-6 Walker Shapiro and Nochetto Phys Fluids 21 102103 2009

Downloaded 29 Jun 2010 to 168725468 Redistribution subject to AIP license or copyright see httppofaiporgpofcopyrightjsp

I Forward problem Walker Nochetto Shapiro 2009

I Optimal control Antil Hintermuller Nochetto Surowiec Wegner2015

H Antil 060616 6

Shape optimization of microfluidic biochip Antil

Heinkenschloss Hoppe Linsenmann Wixforth 2010

Microfluidic Biochip

H Antil 060616 7

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Magnetic drug targeting Antil Nochetto Venegas 2016

The magnetic field exerts a force on magnetic materials such as magneticnanoparticles (MNPs)

Experimental setup (Shapiro et al 2013)

H Antil 060616 5

Electrowetting on Dielectric (EWOD)

maximal ldquopinning pressurerdquo Ppin=2cpin H in dimensionalform which represents the maximum opposing force perliquid-gas interface area that the contact line can applyagainst motion of the interface The factor of ldquo2rdquo accountsfor the interface contact line pinning at the floor and ceilingof the EWOD device The nondimensional pinning pressureis then given by

Ppin =1P0

2cpin

H 12

where P0 is the dimensional reference pressure scale forceper unit of area

This allows us to introduce a locally defined pinningpressure into the boundary conditions in nondimensionalform

p = + E + 13

= Ppin sgnu middot n

In other words if the normal velocity of the liquid-gas inter-face is positive then the pinning pressure will push backwith maximum positive pressure +Ppin to limit the motionLikewise if the normal velocity is negative the pinningpressure will push back in the opposite direction Ppin Andif the normal velocity is zero then takes on a value be-tween Ppin and acts as a Lagrange multiplier to enforce theconstraint that the interface does not move also see Fig 5This definition of ensures $ Ppin and so is consistentwith the experimental observation in Sec II D 1

Numerical implementation of this simple phenomeno-logical model is difficult because of the discontinuity $thesign function of Eq 13 see Fig 6 This discontinuityhowever is essential If it was replaced by a smooth functionthen no droplet could ever be pinned in a noncircular shapesomething that does occur in the experiments see Fig 12below To show this we argue by contradiction Assume no

EWOD forcing so the boundary pressure is given by p=+ $from Eq 13 Suppose the droplet has become com-pletely pinned in a noncircular shape but the function relat-ing to the normal front velocity u middotn is smooth Any sym-metric smoothed version f of the sign function ldquosgnrdquo musthave = Ppinf0=0 which implies p= by Eq 13 But ifthe droplet is pinned in a noncircular shape then will notbe constant around the droplets circumference hence thepressure field inside the droplet will not be constant72 ByEq 1 the velocity u will not be zero which contradicts thedroplet being pinned Note that a smooth but steep function fwill lead to a slowly creeping liquid but only a discontinu-ous f can truly pin the liquid Moreover our model with thesgn function is a physically motivated description538ndash4463ndash68

Our model with Eq 13 is nonlinear and introduces thevelocity into the pressure boundary conditions Moreover itis local meaning that multiple parts of the boundary canbe pinned and unpinned simultaneously Furthermore the re-gions where the droplet is pinnedunpinned are not known apriori this must be determined as part of the solution Ourpinning model is similar to the Signorini problem inelasticity73 which models the deformation of an elastic bodyagainst a rigid obstacle and utilizes a contact variable to en-force the rigid constraint similarly to our Our model canbe increased in complexity see Sec II D 3 to account formore interesting contact line dynamics54ndash57 We are able toinclude this model into our variational formulation of SecIII and we have a method of solving for the velocity pres-sure and 2574 Knowing immediately yields where theboundary is pinnedunpinned

One important issue here is the three-phase linesingularity75ndash77 which originates because of an apparentparadox between the no-slip boundary condition on the wallsof the device and the fact that the contact line moves Ourformulation is only a two dimensional description as appro-priate for planar devices We do not track the location of thethree-phase line in our model in three dimensions we onlyhave an ldquoaveragedrdquo description of the liquid-gas interfaceRecall that the velocity field in the z direction is parabolic afundamental assumption in Hele-Shaw flow Thus in our

Pinned

Pinned

Pinned

= Ppin

= Ppin

= +Ppin

Ppin +Ppin

FIG 5 A two dimensional droplet top view with parts of the boundarypinned electrode grid not shown The pinned regions are denoted by asolid line unpinned regions are shown as a dashed line with velocity arrowsindicating direction of motion An outward motion is considered positiveu middotnamp0 and an inward motion is negative u middotn0 The pinning variable is defined on the boundary of the droplet On the unpinned regions thevalue of saturates to Ppin On the pinned regions u middotn=0 variesbetween Ppin and +Ppin see Fig 6 In our simulations Ppin is used toindicate where the boundary is pinned

+Ppin

Ppin

u middot n

FIG 6 A more realistic relationship between the resistive pressure at theinterface and the normal velocity Here $+Dviscu middotn is plotted as the thickline Note how the interface pressure increases with increasing velocity Thedashed line depicts a slightly more nonlinear relationship $+Dviscu middotn+Gfricu middotn $see Eq 16 The qualitative form of the dashed line has beenshown in the work of Refs 54ndash57 Note the dashed line asymptotes towardthe thin line Dviscu middotn

102103-6 Walker Shapiro and Nochetto Phys Fluids 21 102103 2009

Downloaded 29 Jun 2010 to 168725468 Redistribution subject to AIP license or copyright see httppofaiporgpofcopyrightjsp

I Forward problem Walker Nochetto Shapiro 2009

I Optimal control Antil Hintermuller Nochetto Surowiec Wegner2015

H Antil 060616 6

Shape optimization of microfluidic biochip Antil

Heinkenschloss Hoppe Linsenmann Wixforth 2010

Microfluidic Biochip

H Antil 060616 7

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Electrowetting on Dielectric (EWOD)

maximal ldquopinning pressurerdquo Ppin=2cpin H in dimensionalform which represents the maximum opposing force perliquid-gas interface area that the contact line can applyagainst motion of the interface The factor of ldquo2rdquo accountsfor the interface contact line pinning at the floor and ceilingof the EWOD device The nondimensional pinning pressureis then given by

Ppin =1P0

2cpin

H 12

where P0 is the dimensional reference pressure scale forceper unit of area

This allows us to introduce a locally defined pinningpressure into the boundary conditions in nondimensionalform

p = + E + 13

= Ppin sgnu middot n

In other words if the normal velocity of the liquid-gas inter-face is positive then the pinning pressure will push backwith maximum positive pressure +Ppin to limit the motionLikewise if the normal velocity is negative the pinningpressure will push back in the opposite direction Ppin Andif the normal velocity is zero then takes on a value be-tween Ppin and acts as a Lagrange multiplier to enforce theconstraint that the interface does not move also see Fig 5This definition of ensures $ Ppin and so is consistentwith the experimental observation in Sec II D 1

Numerical implementation of this simple phenomeno-logical model is difficult because of the discontinuity $thesign function of Eq 13 see Fig 6 This discontinuityhowever is essential If it was replaced by a smooth functionthen no droplet could ever be pinned in a noncircular shapesomething that does occur in the experiments see Fig 12below To show this we argue by contradiction Assume no

EWOD forcing so the boundary pressure is given by p=+ $from Eq 13 Suppose the droplet has become com-pletely pinned in a noncircular shape but the function relat-ing to the normal front velocity u middotn is smooth Any sym-metric smoothed version f of the sign function ldquosgnrdquo musthave = Ppinf0=0 which implies p= by Eq 13 But ifthe droplet is pinned in a noncircular shape then will notbe constant around the droplets circumference hence thepressure field inside the droplet will not be constant72 ByEq 1 the velocity u will not be zero which contradicts thedroplet being pinned Note that a smooth but steep function fwill lead to a slowly creeping liquid but only a discontinu-ous f can truly pin the liquid Moreover our model with thesgn function is a physically motivated description538ndash4463ndash68

Our model with Eq 13 is nonlinear and introduces thevelocity into the pressure boundary conditions Moreover itis local meaning that multiple parts of the boundary canbe pinned and unpinned simultaneously Furthermore the re-gions where the droplet is pinnedunpinned are not known apriori this must be determined as part of the solution Ourpinning model is similar to the Signorini problem inelasticity73 which models the deformation of an elastic bodyagainst a rigid obstacle and utilizes a contact variable to en-force the rigid constraint similarly to our Our model canbe increased in complexity see Sec II D 3 to account formore interesting contact line dynamics54ndash57 We are able toinclude this model into our variational formulation of SecIII and we have a method of solving for the velocity pres-sure and 2574 Knowing immediately yields where theboundary is pinnedunpinned

One important issue here is the three-phase linesingularity75ndash77 which originates because of an apparentparadox between the no-slip boundary condition on the wallsof the device and the fact that the contact line moves Ourformulation is only a two dimensional description as appro-priate for planar devices We do not track the location of thethree-phase line in our model in three dimensions we onlyhave an ldquoaveragedrdquo description of the liquid-gas interfaceRecall that the velocity field in the z direction is parabolic afundamental assumption in Hele-Shaw flow Thus in our

Pinned

Pinned

Pinned

= Ppin

= Ppin

= +Ppin

Ppin +Ppin

FIG 5 A two dimensional droplet top view with parts of the boundarypinned electrode grid not shown The pinned regions are denoted by asolid line unpinned regions are shown as a dashed line with velocity arrowsindicating direction of motion An outward motion is considered positiveu middotnamp0 and an inward motion is negative u middotn0 The pinning variable is defined on the boundary of the droplet On the unpinned regions thevalue of saturates to Ppin On the pinned regions u middotn=0 variesbetween Ppin and +Ppin see Fig 6 In our simulations Ppin is used toindicate where the boundary is pinned

+Ppin

Ppin

u middot n

FIG 6 A more realistic relationship between the resistive pressure at theinterface and the normal velocity Here $+Dviscu middotn is plotted as the thickline Note how the interface pressure increases with increasing velocity Thedashed line depicts a slightly more nonlinear relationship $+Dviscu middotn+Gfricu middotn $see Eq 16 The qualitative form of the dashed line has beenshown in the work of Refs 54ndash57 Note the dashed line asymptotes towardthe thin line Dviscu middotn

102103-6 Walker Shapiro and Nochetto Phys Fluids 21 102103 2009

Downloaded 29 Jun 2010 to 168725468 Redistribution subject to AIP license or copyright see httppofaiporgpofcopyrightjsp

I Forward problem Walker Nochetto Shapiro 2009

I Optimal control Antil Hintermuller Nochetto Surowiec Wegner2015

H Antil 060616 6

Shape optimization of microfluidic biochip Antil

Heinkenschloss Hoppe Linsenmann Wixforth 2010

Microfluidic Biochip

H Antil 060616 7

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Shape optimization of microfluidic biochip Antil

Heinkenschloss Hoppe Linsenmann Wixforth 2010

Microfluidic Biochip

H Antil 060616 7

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

PDE constrained optimization

PDE constrained optimization or optimal control of PDEs (OCP) has 4major components

I control z

I state u which associates with every control z a state u

I cost functional J which depends on z and u to be minimized

I constraints obeyed by z (control constraints) and u (stateconstraints)

Today

I Control constraints are pointwise ae

I State equation is a boundary value problem or a variationalinequality

H Antil 060616 8

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Steps to analyze OCPAnalysis

I Existence of solution to PDE Is it unique

I Existence of solution to optimization problem

I First-order necessary conditionsI Adjoint equations

I If possible second order necessarysufficient conditions

Solver development (function space)

I GradientHessian based

Discrete problem solver

I FD FV FE for problem and solver

I Analysis of discrete solver mesh independence

I EfficiencyI Adaptive finite element methods (AFEM)I Model reduction Proper orthogonal decomposition reduced basis

I Software development Trilinos Rapid Optimization Library (ROL)Dolfin-Adjoint

H Antil 060616 9

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Optimal heat sourceI Domain Let Ω sub Rd d ge 1 be open bounded Lipschitz domain

with boundary partΩ

I Goal Choose a distributed source z in such a way that thetemperature u = u(x) is close to a desired ud = ud(x) in Ω

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 10

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Problems involving semilinear elliptic PDEsI Heating with radiation boundary condition If the heat radiation

of the heated body is taken into account then we obtain

minus∆u = 0 in Ω

partνu = α(z4 minus u4) on partΩ

where the control z is the temperature of the surrounding medium

I Simplified supercondutivity

minus∆u+ u+ u3 = z in Ω

u = 0 on partΩ

I Control of stationary flows

minusmicro∆u + (u middot nabla)u +nablap = f in Ω

div u = 0 in Ω

u = g on partΩ

H Antil 060616 11

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Optimal control of variational inequalitiesProblem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to u isin K = w isin H10 (Ω) w(x) ge 0 ae x isin Ωint

Ω

nablau middot nabla(uminus v) dx le (z + f uminus v) forallv isin K

with z isin Zad

Regularized problem

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to (semilinear PDE)

u isin H10 (Ω) minus∆u+ γρε(minusu) = z + f in Ω

and z isin Zad γ gt 0 and ε gt 0 are the penealization and regularizationparameters respectively

H Antil 060616 12

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 13

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear elliptic problemsΩ sub Rn bounded Lipschitz domain

Sobolev spacesI Lp(Ω) 1 le p le infin are Lebesgue spaces p = 2 L2(Ω) is a Hilbert

space with inner product (u v) =int

Ωu(x)v(x) dx

I W sp for s isin R 1 le p le infin are Sobolev spaces When p = 2W s2(Ω) = Hs(Ω) is a Hilbert space

I H10 (Ω) = v isin H1(Ω) v|partΩ = 0 with norm (due to Poincare

inequality)uH1

0 (Ω) = nablauL2(Ω)

I H10 (Ω)lowast = Hminus1(Ω)

I Note L2(Ω) sub Hminus1(Ω)

Model problem

minusdiv (Anablau) = f in Ω

u = 0 on partΩ

H Antil 060616 14

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Well-posed

I Weak solution Given f isin Hminus1(Ω) a function u isin H10 (Ω) is called

a weak solution iffintΩ

Anablau middot nablav = 〈f v〉minus11 forallv isin H10 (Ω)

where 〈middot middot〉minus11 denotes the Hminus1(Ω)ndashH1(Ω) duality pairing

I Existence and uniqueness For each f isin Hminus1(Ω) there exists aunique u isin H1

0 (Ω) such that

nablauL2(Ω) le CdΩfHminus1(Ω)

I A = I the result is due to the Riesz-representation theorem

I A 6= I (maybe non-symmetric) the result is due to Lax-Milgramlemma

I For general W 1p-spaces use inf-sup conditions combined withBanach-Necas theorem which is a necessary and sufficient condition

H Antil 060616 15

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear-quadratic elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minusdiv (Anablau) = z in Ω

u = 0 on partΩ

and the pointwise control constraints

z isin Zad = v isin L2(Ω) za(x) le v(x) le zb(x) ae x isin Ω

λ gt 0 is a parameter za zb isin Linfin(Ω) with za lt zb are given A isbounded symmetric positive

H Antil 060616 16

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Reduced OCP

I Control to state map The solution operator defined as

S Hminus1(Ω)rarr H10 (Ω) z 7rarr u(z) = Sz

Note In view of the embedding

H10 (Ω) rarr L2(Ω) rarr Hminus1(Ω)

we may also consider S from L2(Ω) to L2(Ω) We will use the samenotation for this operator (for simplicity)

I Reduced cost functional The reduced cost functionalJ L2(Ω)rarr R is defined as

J (z) = J(Sz z)

I Reduced problem minzisinZadJ (z)

H Antil 060616 17

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Existence of solution Direct methodThere exists a unique solution to the above problem

I Minimizing sequence J is bounded below by zero therefore thereexists a minimizing sequence znnisinN such that

limnrarrinfin

J (zn) = inf J (z)

I Weak compactness zn sub L2(Ω) is bounded Since L2(Ω) isreflexive therefore there exists a subsequence (not relabeled) suchthat

zn z and z isin Zad

where the latter is a consequence of Zad being closed and convex

I Optimality Since J is continuous and convex and therefore isweakly lower semicontinuous We obtain

lim inf J (zn) ge J (z)

which implies that z is optimal

I Uniqueness Exercise

H Antil 060616 18

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 19

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Differentiation in Banach spacesLet (Z middot Z) (V middot V ) B-spaces Z sub Z open and F Z rarr V

Let z isin Z

I If exist δF (z h) = limtdarr01t (F (z + th)minus F (z)) then δF (z h) is called

the directional derivative of F at z in direction h

I If exist δF (z h) forallh isin Z then h 7rarr δF (z h) is the first variation of Fat z

I Let exist the first variation δF (z middot) F is said to be Gateauxdifferentiable at z iff existA isin L(Z V ) such that δF (z h) = Ahforallh isin Z We write A = F primeG(z)

I F is said to be Frechet differentiable at z iff existA isin L(Z V ) and amapping r(z middot) Z rarr V such that for all h isin U with z + h isin Zwe have

F (z+h) = F (z)+Ah+r(z h) withr(z h)VhZ

rarr 0 as hZ rarr 0

We write F prime(z) = A

H Antil 060616 20

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)

H Antil 060616 21

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

ExamplesExample 1 (H (middot middot)H) Hilbert space F (z) = z2H = (z z)H

forallz h F (z + h)minus F (z) = 2(z h)H + h2HThen

F prime(z)h = (2z h)H

Reisz representation theorem (identify H with Hlowast) we can write

F prime(z)h = (nablaF (z) h)H

where nablaF (z) = 2z is the gradient

Example 2 Let (Z (middot middot)Z) (H (middot middot)H) Hilbert spaces ud isin H fixedLet S isin L(ZH) Consider E Z rarr R

E(z) = Sz minus ud2H

Then E(z) = G(F (z)) where G(v) = v2H and F (z) = Sz minus ud

Eprime(z)h = Gprime(F (z))F prime(z)h = (2v F prime(z)h)H

= 2(Sz minus ud Sh)H = 2(Slowast(Sz minus ud) h)Z

we denote the gradient as nablaE(z) = 2Slowast(Sz minus ud)H Antil 060616 21

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 22

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

First order necessary optimality conditions

Most of the theory is based on these two simple results

Theorem Let (Z middot Z) be a normed space J Z rarr (infin+infin] amapping with J 6equiv +infin Then z isin Z minimizer of J hArr 0 isin partJ (z)

Proof 0 isin partJ (z) means by definition of partJ (z) J (z)minus J (z) le 0forallz isin Z

Theorem Let (Z middot Z) be a normed space Zad sub Z nonemptyconvex closed J Z rarr R Gateaux differentiable where Zad sub Z sub ZZ open If z isin Zad is a solution to

minzisinZad

J (z)

then z solvesJ prime(z)(z minus z) ge 0 forallz isin Zad

Proof Exercise

H Antil 060616 23

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Linear quadratic OCPI Reduced funcational

J (z) =1

2Sz minus ud2L2(Ω) +

λ

2z2L2(Ω)

I Gradient computation using sensitivity

J prime(z)h = (Slowast(Sz minus ud) + λz h)L2(Ω) forallh isin L2(Ω)

we denote the gradient by

nablaJ (z) = Slowast(Sz minus ud) + λz

I First order necessary and sufficient optimality condition

(nablaJ (z) z minus z)L2(Ω) ge 0 forallz isin Zad

I Adjoint equation Let p = Slowast(Sz minus ud) which translates intop isin H1

0 (Ω) solving

minusdiv (Anablap) = uminus ud in Ω

p = 0 on partΩ

I Gradient computation using adjoint nablaJ (z) = p+ λz

H Antil 060616 24

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Optimality system

I State u isin H10 (Ω)

intΩAnablau middot nablav =

intΩzv forallv isin H1

0 (Ω)

I Adjoint p isin H10 (Ω)

intΩAnablap middot nablav =

intΩ

(uminus ud)v forallv isin H10 (Ω)

I Control z isin Zad (p+ λz z minus z) ge 0 forallz isin Zad

H Antil 060616 25

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Pointwise interpretation

The variational inequality

(p+ λz z minus z) ge 0 forallz isin Zad

is equivalent to for ae x isin Ω

z(x) =

za(x) if p(x) + λz(x) gt 0isin [za(x) zb(x)] if p(x) + λz(x) = 0zb(x) if p(x) + λz(x) lt 0

Consequence

I When λ gt 0 then ae in Ω

z(x) = P[za(x)zb(x)]

minus 1

λp(x)

= min

zb(x)max

za(x)minus 1

λp(x)

H Antil 060616 26

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Formal Lagrange method applied to linear OCPIntroduce L H1

0 (Ω)times Zad timesH1(Ω)rarr R defined as

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablau)minus z

)p dx

Integrate by parts to arrive at

L(u z p) = J(u z)minusint

Ω

(minusdiv (Anablap)uminus zp

)dx+

intpartΩ

(Anablau)p ds

If (u z p) is a stationary point then

I State

DpL(u z p)h = 0 forallh isin H1(Ω)

I AdjointDuL(u z p)h = 0 forallh isin H1

0 (Ω)

I ControlDzL(u z p)(z minus z) ge 0 forallz isin Zad

H Antil 060616 27

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 28

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Semilinear elliptic OCP

min J(u z) =1

2uminus ud2L2(Ω) +

λ

2z2L2(Ω)

subject to

minus∆u+ g(x u) = z in Ω

u = 0 on partΩ

and z isin Zad with za zb isin Linfin(Ω)

H Antil 060616 29

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

State space(A1) Ω sub Rd open bounded partΩ is Lipschitz

(A2) g Ωtimes Rrarr R is bounded and measurable with respect to x isin Ωfor every u isin R

(A3) g is continuous monotone increasing and Lipschitz continuous in ufor ae x isin Ω

(A4) g(x 0) = 0 for ae x isin Ω (not really necessary)

Theorem Suppose (A1)minus (A4) hold For every z isin Lr(Ω) withr gt d2 the state equation has a unique weak solutionu isin H1

0 (Ω) cap C(Ω) ie we haveintΩ

nablau middot nablav dx+

intΩ

g(x u(x))v(x) dx =

intΩ

zv dx forallv isin H10 (Ω)

Moreover there exists a constant C gt 0 such that

uH1(Ω) + uC(Ω) le CzLr(Ω)

RemarkI The proof of the theorem uses the BrowderndashMinty theorem on

monotone operators

H Antil 060616 30

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Control-to-state map and reduced problem

I Control to state map

S Linfin(Ω)rarr H10 (Ω) cap C(Ω)

is well defined and is globally Lipschitz continuous

I Reduced problem

minJ (z) =1

2S(u)minus ud2L2(Ω) +

λ

2z2L2(Ω)

H Antil 060616 31

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin Zad

I GradientnablaJ (z) = p+ λz

H Antil 060616 32

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

First-order necessary conditionsI Lagrangian functional

L(u z p) = J(u z)minusint

Ω

(minus∆u+ g(x u(x))minus z

)p dx

= J(u z)minusint

Ω

(nablau middot nablap+ g(x u(x))pminus zp

)dx

I Adjoint system For all h isin H10 (Ω)

0 = DuL(u z p)h =

intΩ

(uminus ud)h dxminusint

Ω

(nablah middot nablap+ gu(x u(x))ph

)which yields

minus∆p+ gu(x u(x))p = uminus ud in Ω

p = 0 on partΩ

I Control

DzL(u z p)(z minus z) = (p+ λz z minus z) dx ge 0 forallz isin ZadI Gradient

nablaJ (z) = p+ λz

H Antil 060616 32

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Differentiability of Nemytskii operator

The rigorous proof uses

Theorem In addition to (A1)-(A4) if

(A5) g is continuously differentiable with respect to u for ae x isin Ωand we have

(i) |gu(x 0)| le K for ae x isin Ω(ii) gu is locally Lipschitz with respect to u isin R

then the Nemytskii operator G u 7rarr g(middot u(middot)) is continuously Frechetdifferentiable from Linfin(Ω) into itself and

(Gprime(u)h)(x) = gu(x u(x))h(x) ae in Ω forallh isin Linfin(Ω)

Remark (A5) holds if g(x u) = g(u) and g isin C2(R)

H Antil 060616 33

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 34

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin Zad

I Directional derivative Let S be the control to state map(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Gradient computation via adjoint general formI Abstract problem

min J(u z)

subject toc(u z) = 0

and z isin ZadI Directional derivative Let S be the control to state map

(nonlinear in general)

z 7rarr S(z) = u(z)

Under the assumptions of ldquoimplicit function theoremrdquo

J prime(z)h = (nablauJ (u(z) z) Sprime(z)h) + (nablazJ (u(z) z) h)

I Sensitivity of u with respect to z Differentiating the stateequation

cu(S(z) z)Sprime(z)h = minuscz(S(z) z)h

leads toSprime(z)h = minuscu(S(z) z)minus1

(cz(S(z) z)h

)

provided cu(S(z) z)minus1 is well-defined

H Antil 060616 35

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Adjoint equation and gradient general form

I Directional derivative

J prime(z)h = minus(nablauJ (u(z) z) cu(u(z) z)minus1cz(u(z) z)h

)+ (nablazJ (u(z) z) h)

I Gradient

nablaJ (z) = minuscz(u(z) z)lowast(cu(u(z) z)minuslowastnablauJ (u(z) z)

)+nablazJ (u(z) z)

I Introducing adjoint variable p solving

cu(u(z) z)lowastp = nablauJ (u(z) z)

we arrive at following form of gradient

nablaJ (z) = minuscz(u(z) z)lowastp+nablazJ (u(z) z)

which is tractable

H Antil 060616 36

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Hessian computation general form

I Lagrangian L(u z p) = J(u z)minusint

Ωc(u z)p dx Then we have

DzJ (z)h = DzL(u z p)h

I Second order derivative

D2zJ (z)[h1 h2] = DuDzL(u z p)[h1 S

prime(z)h2] +D2zL(u z p)[h1 h2]

+DpDzL(u z p)[h1 Dzp(z)h2]

I We already know Sprime(z)h2 moreover

DpDzL(u z p)[h1 Dzp(z)h2] = minus(cu(u z)h1 Dzp(z)h2

)

It remain to identify Dzp(z)h2

H Antil 060616 37

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Outline

Motivation and examples

Linear elliptic optimal control problems

Differentiation in Banach spaces

First-order necessary conditions

Optimal control of semilinear PDEs

An abstract problem

Discretization using finite element methods

H Antil 060616 38

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub K

I Quasi-uniform A family of shape regular meshes is called-quasiuniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Numerical approximation of Poisson equation MeshTo avoid technical issues we will assume that partΩ is polygonal

I Mesh Let T = K be a mesh of Ω where K sub Rd is an elementthat is isoparametrically equivalent to a unit simplex in Rd

I Conforming mesh We say that the mesh is conforming ifI Ω = cupKisinT KI If K1 cap K2 = x then x is a node of K1 and K2I If K1 cap K2 6= empty and K1 cap K2 6= x then K1 cap K2 is an edgeface

contained in partK1 cap partK2

I Shape regular Let T be the collection of all conformingrefinements of an original mesh T 0 Then T is called shape regularif exist κ gt 0 such that

hKρKle κ forallK isin cupiT i

where hK = diam K ρK = maxρ gt 0|Bρ sub KI Quasi-uniform A family of shape regular meshes is called-quasi

uniform if there exist a constant σ gt 0 such that

maxhKminhK

le σ

uniformly for all meshes in the family

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1

I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posed

I Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )satisfying

ψxi(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodes

I Galerkin orthogonality(nabla(uminus UT )nablaW

)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

FE space and Galerkin approximationI Finite element space For T isin T we define the finite element

space as

V(T ) = W isin C0(Ω) WK isin P1forallK isin T W |partΩ = 0

where P1 is the set of polynomials of degree at most 1I Galerkin approximation of continuous problem is given by

Find UT isin V(T )

intΩ

AnablaUT middot nablaV = 〈f V 〉 forallV isin V(T )

The discrete problem is well-posedI Nodal basis Consists of functions ψi isin V(T ) with xi isin N (T )

satisfyingψxi

(xj) = δij

where N (T ) denotes the interior (plus Neumann) nodesI Galerkin orthogonality(

nabla(uminus UT )nablaW)L2(Ω)

= 0 forallW isin V(T )

H Antil 060616 40

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Best approximation

I Cearsquos lemma

nabla(uminus UT )L2(Ω) le C infWisinV(T )

nabla(uminusW )L2(Ω)

I Nodal interpolation operator ΠT C(Ω)rarr V(T ) satisfies

ΠT u(xi) = u(xi) forallxi isin N (T )

and is defined as

ΠT u =sum

xiisinN (T )

u(xi)ψxi

I Error estimates In Cearsquos lemma replace W by ΠT u and thencompute the interpolation error

H Antil 060616 41

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Algebraic system state equation

I Recall UT isin V(T ) solves

a(UT V ) = 〈f V 〉 forallV isin V(T )

I Now setting UT =sumN

j=1 Ujψj V = ψi we get

Nsumj=1

Uja(φj φi) = 〈f ψi〉 i = 1 N

which is the matrix-vector system

AU = F

I Stiffness matrix A is SPD

H Antil 060616 42

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Discrete linear elliptic OCP

min JT (UT ZT ) =1

2UT minus ud2L2(Ω) +

λ

2ZT 2L2(Ω)

subject to

Find UT isin V(T )

intΩ

nablaUT middot nablaV = (ZT V )L2(Ω) forallV isin V(T )

and whereZT isin Zad(T ) = Zad cap Z(T )

is either one of these

Z(T ) =

V(T )L2(Ω) variational discretization (Hinze)WT isin Linfin(Ω) ZT |K isin P0(K)K isin T

H Antil 060616 43

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods

Recent books

F TroltzschOptimal control of partial differential equations volume 112 ofGraduate Studies in MathematicsAmerican Mathematical Society Providence RI 2010Theory methods and applications Translated from the 2005 Germanoriginal by Jurgen Sprekels

K Ito and K KunischLagrange multiplier approach to variational problems andapplications volume 15 of Advances in Design and ControlSociety for Industrial and Applied Mathematics (SIAM)Philadelphia PA 2008

H Antil 060616 44

  • Motivation and examples
  • Linear elliptic optimal control problems
  • Differentiation in Banach spaces
  • First-order necessary conditions
  • Optimal control of semilinear PDEs
  • An abstract problem
  • Discretization using finite element methods