
Introduction and Motivation for PDE-Constrained Optimal Control

University of Pisa

Overview

This lecture introduces optimal control as an optimization problem constrained by equations. We first build the finite-dimensional analogy, then move to PDE-constrained models.

Logical path of the lecture:

  1. start from a constrained optimization problem in $(y,u)$;

  2. eliminate the state through the model equation when possible;

  3. obtain a reduced optimization problem in the control variable only;

  4. derive first-order conditions in finite dimension;

  5. transfer the same structure to PDE settings.


General Optimal Control Problem

Choose a control variable $u$ and a state variable $y$ such that

$$(y^\star,u^\star)\in\operatorname*{argmin}_{(y,u)} J(y,u)$$

subject to

$$\mathcal{E}(y,u)=0,\qquad u\in\mathcal{U}_{\mathrm{ad}}.$$

Key ingredients:

  * the state $y$, determined by the model;
  * the control $u$, the decision variable;
  * the cost functional $J(y,u)$;
  * the state equation $\mathcal{E}(y,u)=0$;
  * the admissible set $\mathcal{U}_{\mathrm{ad}}$ of controls.


Forward Problem vs Control Problem

Forward problem: given a control $u$, solve the model $\mathcal{E}(y,u)=0$ for the state $y$.

Optimal control: find an admissible control $u$, together with its state $y$, that minimizes $J(y,u)$ subject to the model.


Finite-Dimensional Setting (Simultaneous vs Reduced)

Consider

$$\min_{y\in\mathbb{R}^n,\,u\in\mathbb{R}^m} J(y,u) \quad\text{s.t.}\quad Ay=Bu,$$

with $A\in\mathbb{R}^{n\times n}$ invertible.

Simultaneous formulation

Optimize in $(y,u)$ and enforce $Ay=Bu$ explicitly.

Interpretation: both $y$ and $u$ are treated as unknowns, and the constraint $Ay=Bu$ is enforced explicitly, e.g. through a Lagrange multiplier.

Reduced formulation

Since $A$ is invertible,

$$y=A^{-1}Bu=:S(u),$$

where $S$ is the control-to-state operator. Then define

$$f(u):=J(S(u),u),$$

and solve

$$\min_{u\in\mathcal{U}_{\mathrm{ad}}} f(u).$$

This reduces the control problem to a standard finite-dimensional optimization problem in $u$ alone.

Step-by-step logic:

  1. the constraint $Ay=Bu$ defines $y$ uniquely as a function of $u$;

  2. therefore the only independent decision variable is $u$;

  3. the objective becomes a composite map $u\mapsto J(S(u),u)$;

  4. all constraint information is encoded in $S$.
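
The reduction above can be sketched numerically; a minimal example, assuming numpy, where the data `A`, `B`, `y_d`, `alpha` and the names `S`, `f_reduced` are all illustrative:

```python
import numpy as np

# Illustrative data: invertible A (model) and B (control-to-forcing map)
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0],
              [2.0]])
y_d = np.array([1.0, 0.0])   # target state for a tracking-type objective
alpha = 1e-2                 # Tikhonov regularization weight

def S(u):
    """Control-to-state map: solve A y = B u (never form A^{-1} explicitly)."""
    return np.linalg.solve(A, B @ u)

def f_reduced(u):
    """Reduced objective f(u) = J(S(u), u) for a quadratic tracking J."""
    y = S(u)
    return 0.5 * np.sum((y - y_d) ** 2) + 0.5 * alpha * np.sum(u ** 2)

print(f_reduced(np.array([0.0])))   # u = 0 gives 0.5 * ||y_d||^2 = 0.5
```

Note that `S` solves the state equation rather than inverting $A$; this is the pattern that carries over to PDE discretizations, where forming $A^{-1}$ is infeasible.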


Existence in Finite Dimensions

A standard existence result for the reduced problem:

If $f$ is continuous and $\mathcal{U}_{\mathrm{ad}}\subset\mathbb{R}^m$ is nonempty, closed, and bounded, then a minimizer exists.

Reason: in finite dimensions, closed and bounded sets are compact, and a continuous function attains its minimum on a compact set (Weierstrass theorem).

Why this matters for control: the reduced objective $f(u)=J(S(u),u)$ inherits continuity from $S$ and $J$, so a minimizer exists whenever $\mathcal{U}_{\mathrm{ad}}$ is nonempty, closed, and bounded.


First- and Second-Order Optimality Conditions

For convex differentiable $f$ on a convex set $K$:

$$\nabla f(\bar u)\cdot (u-\bar u)\ge 0\quad\forall u\in K$$

is the first-order optimality condition.

Special case (interior point):

$$\nabla f(\bar u)=0.$$

If $f\in C^2$ and $\bar u$ is an interior local minimizer, then $\nabla f(\bar u)=0$ and $D^2 f(\bar u)$ is positive semidefinite.

If moreover $D^2 f(\bar u)$ is positive definite, then $\bar u$ is a strict local minimizer.
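
These conditions are easy to check numerically for a quadratic $f(u)=\frac12 u^T A u$; the matrix below is illustrative:

```python
import numpy as np

# Illustrative symmetric Hessian of f(u) = 0.5 u^T A u
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])

u_bar = np.zeros(2)               # candidate minimizer
grad = A @ u_bar                  # nabla f(u_bar): must vanish
eigs = np.linalg.eigvalsh(A)      # spectrum of the Hessian D^2 f(u_bar)

print(grad, eigs)
# all eigenvalues positive => positive definite => strict local minimizer
```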


Constrained Minimization (2D Example)

Let

$$\mathcal{U}_{\mathrm{ad}}=\mathbb{R}^2,\qquad f(u)=\frac12 u^T A u,\qquad \varphi(u)=Bu-g=0,$$

with

$$A\in\mathbb{R}^{2\times 2}\text{ SPD},\quad B\in\mathbb{R}^{1\times 2},\quad g\in\mathbb{R}.$$

Logical interpretation:

  1. objective level sets are ellipses (because $A$ is SPD);

  2. feasible points lie on an affine line $Bu-g=0$;

  3. the minimizer is the feasible point where the smallest level set of $f$ touches that line.

At an optimal feasible point, $\nabla f$ is orthogonal to the feasible tangent direction, so it must be parallel to $\nabla\varphi$:

$$\nabla f(\bar u)=(\nabla\varphi(\bar u))^T\lambda.$$

Equivalent first-order form:

$$\nabla f(\bar u)-(\nabla\varphi(\bar u))^T\lambda=0, \qquad \varphi(\bar u)=0.$$

Meaning: the multiplier $\lambda$ measures how the optimal value changes when the constraint level $g$ is perturbed.

Lagrangian Formalism

Define the Lagrangian

$$\mathcal{L}(u,\lambda)=f(u)-\varphi(u)\cdot\lambda$$

and search for saddle points:

$$(\bar u, \bar\lambda) = \arg\min_u\,\max_\lambda\, \mathcal{L}(u,\lambda).$$

Stationarity gives

$$\frac{\partial\mathcal{L}}{\partial u}=0, \qquad \frac{\partial\mathcal{L}}{\partial \lambda}=0.$$

For this quadratic/affine case:

$$\nabla_u\mathcal{L}=Au-B^T\lambda=0, \qquad \nabla_\lambda\mathcal{L}=-Bu+g=0,$$

which is the KKT linear system

$$\begin{pmatrix}A & -B^T \\ -B & 0\end{pmatrix}\begin{pmatrix}u \\ \lambda\end{pmatrix}=\begin{pmatrix}0 \\ -g\end{pmatrix}.$$
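
This KKT system can be assembled and solved directly; a small sketch with illustrative data ($A$ SPD, $B$ a single row):

```python
import numpy as np

# Illustrative data: A SPD (2x2), B a single row, scalar constraint level g
A = np.array([[2.0, 0.0],
              [0.0, 2.0]])
B = np.array([[1.0, 1.0]])
g = np.array([1.0])

# Assemble the saddle-point (KKT) matrix [[A, -B^T], [-B, 0]]
K = np.block([[A, -B.T],
              [-B, np.zeros((1, 1))]])
rhs = np.concatenate([np.zeros(2), -g])

sol = np.linalg.solve(K, rhs)
u, lam = sol[:2], sol[2:]
print(u, lam)    # for this data: u = (0.5, 0.5), lam = 1
```

Both optimality conditions can be verified a posteriori: $Bu=g$ (feasibility) and $Au=B^T\lambda$ (stationarity).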

This is the prototype for all later optimality systems: a symmetric saddle-point system coupling the primal variable $u$ and the multiplier $\lambda$.

Figures: geometry of the 2D minimization example; the objective restricted to the feasible line.

From Finite to Infinite Dimensions

For PDE-constrained control, the state and the control live in function spaces (typically Hilbert spaces), e.g. $y\in H^1_0(\Omega)$ and $u\in L^2(\Omega)$.

Typical difficulties:

  * closed and bounded sets are no longer compact, so existence arguments need weak compactness and lower semicontinuity;
  * well-posedness and differentiability of the control-to-state map must be established;
  * optimality conditions become equations and variational inequalities posed in function spaces.

Conceptual continuity with the finite-dimensional case:

  1. same optimization structure;

  2. same reduced-vs-simultaneous viewpoints;

  3. same KKT logic;

  4. only the functional-analytic setting changes.


Prototypical PDE-Constrained Models

Elliptic distributed control

$$\min_{(y,u)} \frac12\|y-y_d\|_{L^2(\Omega)}^2 +\frac\alpha2\|u\|_{L^2(\Omega)}^2$$

subject to

$$-\Delta y=u\ \text{in }\Omega, \qquad y=0\ \text{on }\partial\Omega, \qquad u\in\mathcal{U}_{\mathrm{ad}}.$$
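
One way to see the reduced formulation at work on this model is a rough 1D sketch: discretize $-y''=u$ on $(0,1)$ by finite differences so that $S$ becomes a matrix, and (without control constraints) the reduced optimality condition reads $(S^T S+\alpha I)u=S^T y_d$. Grid size, target, and weight below are illustrative, and plain Euclidean norms stand in for the $L^2$ norms:

```python
import numpy as np

n = 50                                 # interior grid points on (0, 1)
h = 1.0 / (n + 1)

# 3-point finite-difference Laplacian with homogeneous Dirichlet BCs
L = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

S = np.linalg.inv(L)                   # discrete control-to-state operator
x = np.linspace(h, 1.0 - h, n)
y_d = np.sin(np.pi * x)                # illustrative target state
alpha = 1e-4

# Normal equations of the reduced problem: (S^T S + alpha I) u = S^T y_d
u = np.linalg.solve(S.T @ S + alpha * np.eye(n), S.T @ y_d)
y = S @ u
print(np.max(np.abs(y - y_d)))         # small tracking error; shrinks as alpha -> 0
```

Forming `inv(L)` is acceptable only at this toy scale; in practice one works with solves against $L$ and an adjoint equation instead.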

Parabolic control

$$\partial_t y-\Delta y=u\ \text{in }Q, \qquad y(\cdot,0)=y_0,$$

with tracking over space-time and Tikhonov regularization.

Flow and inverse problems

Control of flow models and parameter-identification (inverse) problems fit the same abstract framework, with different state equations, observation operators, and objectives.


Control Constraints

Common box constraints:

$$u_{\min}\le u\le u_{\max}.$$

They lead to variational inequalities and KKT systems in function spaces.
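
In finite dimensions the projection onto a box is just clipping, which gives the projected-gradient method; a sketch on an illustrative quadratic (matrix, bounds, and step size are all made up for the demo):

```python
import numpy as np

# Illustrative quadratic: f(u) = 0.5 u^T A u - b^T u, with A SPD
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])
u_min, u_max = 0.0, 0.2               # illustrative box constraints

u = np.zeros(2)
step = 0.1                            # step < 2 / lambda_max(A), so it converges
for _ in range(500):
    grad = A @ u - b
    u = np.clip(u - step * grad, u_min, u_max)   # project onto the box

print(u)                              # both components end at the upper bound
# At the solution, grad <= 0 on active upper bounds: the gradient points out
# of the box, which is exactly the variational inequality condition.
```

The unconstrained minimizer here lies outside the box, so the iteration settles on the boundary, where the first-order condition holds as a variational inequality rather than $\nabla f=0$.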


Typical Variations of OCP Formulations

Many optimal control models keep the same abstract structure but vary in where the control acts, what is observed, and how the objective is measured.

Common variations include:

  * distributed vs. boundary control;
  * observation of the full state vs. partial (e.g. boundary or final-time) observation;
  * different norms in the objective and different regularization terms.

These variants motivate why we need a flexible theoretical framework and multiple numerical methods in the rest of the course.


References for This Lecture