Overview¶
The previous lecture established the continuous framework for linear-quadratic elliptic control:
the reduced problem $\min_{u \in U} \hat J(u) = J(y(u), u)$;
the operatorial all-at-once formulation on the product space $Y \times U$;
the adjoint-based gradient formula $\nabla \hat J(u) = \alpha u + B^* p(u)$;
and the unconstrained KKT system in dual spaces.
This lecture extends the framework to the case of control constraints.
The goal is to derive:
the variational inequality for box-constrained controls;
the operatorial KKT system with a nonlinear control condition;
the pointwise projection formula;
the saddle-point interpretation in infinite dimensions.
Notation and Setting¶
The operators are
$$A \in \mathcal{L}(Y, Y^*), \qquad B \in \mathcal{L}(U, Y^*),$$
with
$$Y = H_0^1(\Omega), \qquad U = L^2(\Omega).$$
The state equation is
$$A y = B u + f, \qquad f \in Y^* \ \text{given},$$
and the cost functional is
$$J(y, u) = \frac{1}{2}\,\|y - y_d\|_{L^2(\Omega)}^2 + \frac{\alpha}{2}\,\|u\|_{L^2(\Omega)}^2, \qquad \alpha > 0.$$
Hence the all-at-once problem is
$$\min_{(y,u) \in Y \times U} J(y, u) \quad \text{subject to} \quad A y - B u = f.$$
From the previous lecture, in the unconstrained case, the first-order system is
$$A \bar y = B \bar u + f, \qquad A^* \bar p = \bar y - y_d, \qquad \alpha \bar u + B^* \bar p = 0.$$
Equivalently, in reduced form,
$$\nabla \hat J(\bar u) = \alpha \bar u + B^* \bar p = 0.$$
This is the only fact we need to import from the previous lecture.
Admissible Controls¶
We now impose box constraints on the control:
$$U_{\mathrm{ad}} = \{ u \in L^2(\Omega) : u_a(x) \le u(x) \le u_b(x) \ \text{a.e. in } \Omega \}.$$
We assume
$$u_a, u_b \in L^\infty(\Omega), \qquad u_a \le u_b \ \text{a.e. in } \Omega,$$
so that $U_{\mathrm{ad}}$ is nonempty, closed, and convex.
The constrained reduced problem is therefore
$$\min_{u \in U_{\mathrm{ad}}} \hat J(u).$$
The state equation is unchanged. Only the admissible set for the control changes.
Variational Inequality¶
For minimization of a differentiable functional over a closed convex set in a Hilbert space, the first-order condition is a variational inequality rather than the equation $\nabla \hat J(\bar u) = 0$.
Thus $\bar u \in U_{\mathrm{ad}}$ is optimal if and only if
$$\langle \nabla \hat J(\bar u), u - \bar u \rangle \ge 0 \qquad \text{for all } u \in U_{\mathrm{ad}}.$$
Using the reduced gradient from the previous lecture, this becomes
$$(\alpha \bar u + B^* \bar p, \, u - \bar u)_{L^2(\Omega)} \ge 0 \qquad \text{for all } u \in U_{\mathrm{ad}}.$$
This is the constrained analogue of stationarity. If the constraint is inactive, one recovers $\nabla \hat J(\bar u) = 0$. If a bound is active, only admissible perturbations are allowed, so equality is replaced by an inequality.
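As a finite-dimensional sanity check (an illustrative sketch, not the lecture's PDE model), the variational inequality can be verified numerically for a scalar quadratic over an interval, where the constrained minimizer is the clipped unconstrained one:

```python
import numpy as np

# Scalar sanity check of the variational inequality: minimize
# J(u) = (alpha/2)*(u - z)^2 over [ua, ub]. The minimizer is clip(z, ua, ub),
# and J'(u_bar)*(u - u_bar) >= 0 must hold for every admissible u.
alpha, z, ua, ub = 2.0, 1.7, -1.0, 1.0

u_bar = float(np.clip(z, ua, ub))    # constrained minimizer (here the bound ub)
grad = alpha * (u_bar - z)           # J'(u_bar), nonzero: the bound is active

u_samples = np.linspace(ua, ub, 101) # admissible comparison points
vi_holds = bool(np.all(grad * (u_samples - u_bar) >= -1e-12))
```

At the active upper bound the gradient is negative, but all admissible perturbations point downward, so the product stays nonnegative.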
Operatorial KKT System¶
The cleanest way to express the constrained problem in the operatorial setting is through the normal cone to $U_{\mathrm{ad}}$. For a closed convex set $C \subset U$, the normal cone at $u \in C$ is
$$N_C(u) = \{ z \in U : (z, v - u) \le 0 \ \text{for all } v \in C \},$$
with $N_C(u) = \emptyset$ for $u \notin C$.
Then the variational inequality is equivalent to the inclusion
$$0 \in \alpha \bar u + B^* \bar p + N_{U_{\mathrm{ad}}}(\bar u).$$
Therefore the full KKT system becomes
$$A \bar y = B \bar u + f, \qquad A^* \bar p = \bar y - y_d, \qquad 0 \in \alpha \bar u + B^* \bar p + N_{U_{\mathrm{ad}}}(\bar u).$$
Compared with the unconstrained system from the previous lecture, only the third equation changes: the linear equation in $\bar u$ is replaced by a nonlinear inclusion in $\bar u$.
This can also be written as a block saddle-point relation:
$$0 \in \begin{pmatrix} I & 0 & -A^* \\ 0 & \alpha I & B^* \\ -A & B & 0 \end{pmatrix} \begin{pmatrix} \bar y \\ \bar u \\ \bar p \end{pmatrix} + \begin{pmatrix} -y_d \\ N_{U_{\mathrm{ad}}}(\bar u) \\ f \end{pmatrix}$$
in the dual product space $Y^* \times U^* \times Y^*$.
So the constrained problem still has the same KKT block structure as in the previous lecture, but now one block is nonlinear because of the cone term.
Projection Formula¶
For box constraints, the normal-cone inclusion has a pointwise characterization. It is equivalent to the projection formula
$$\bar u = P_{U_{\mathrm{ad}}}\!\left( -\frac{1}{\alpha} B^* \bar p \right),$$
where $P_{U_{\mathrm{ad}}}$ denotes the pointwise projection onto the interval $[u_a(x), u_b(x)]$.
Explicitly, for almost every $x \in \Omega$,
$$\bar u(x) = \max\!\left\{ u_a(x), \, \min\!\left\{ u_b(x), \, -\frac{1}{\alpha}\,\bar p(x) \right\} \right\}.$$
Thus the control law remains explicit, but it is no longer linear. The unconstrained formula $\bar u = -\frac{1}{\alpha} B^* \bar p$ is simply clipped onto the admissible interval.
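In a discretization, the pointwise projection is just a clip of nodal values. A minimal sketch (the adjoint values and bounds below are illustrative placeholders, not computed from a PDE):

```python
import numpy as np

# Discrete form of the projection formula: the optimal control is the
# pointwise clip of -p_bar/alpha onto [ua, ub].
alpha = 0.1
x = np.linspace(0.0, 1.0, 201)
p_bar = np.sin(np.pi * x)            # stand-in adjoint state (illustrative)
ua, ub = -2.0, 2.0                   # constant box bounds (illustrative)

u_unconstrained = -p_bar / alpha     # unconstrained control law
u_bar = np.clip(u_unconstrained, ua, ub)

# On the inactive set the unconstrained law is recovered exactly.
inactive = (u_unconstrained > ua) & (u_unconstrained < ub)
max_dev_inactive = float(np.max(np.abs(u_bar[inactive] - u_unconstrained[inactive])))
feasible = bool(np.all(u_bar >= ua) and np.all(u_bar <= ub))
```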
Full Optimality System¶
Combining state equation, adjoint equation, and control characterization, we obtain
$$A \bar y = B \bar u + f, \qquad A^* \bar p = \bar y - y_d, \qquad \bar u = P_{U_{\mathrm{ad}}}\!\left( -\frac{1}{\alpha} B^* \bar p \right).$$
In the Poisson model of this course, this reads
$$
\begin{aligned}
-\Delta \bar y &= \bar u + f \ \text{in } \Omega, & \bar y &= 0 \ \text{on } \partial\Omega, \\
-\Delta \bar p &= \bar y - y_d \ \text{in } \Omega, & \bar p &= 0 \ \text{on } \partial\Omega, \\
\bar u(x) &= \max\{ u_a(x), \min\{ u_b(x), -\tfrac{1}{\alpha}\,\bar p(x) \} \} & &\text{for a.e. } x \in \Omega.
\end{aligned}
$$
The first two equations are linear and exactly the same kind of objects as in the previous lecture. The third one carries the entire effect of the control constraint.
Interpretation as Infinite-Dimensional KKT¶
The constrained system is the direct analogue of finite-dimensional KKT conditions:
the unknowns are functions $(\bar y, \bar u, \bar p) \in Y \times U \times Y$;
the equations live in the dual product space $Y^* \times U^* \times Y^*$;
the state and adjoint equations are linear operator equations;
the control condition is a complementarity relation encoded either as a normal-cone inclusion or as a projection formula.
So the conceptual picture is:
previous lecture: linear saddle-point system;
current lecture: saddle-point system with a nonlinear control block.
This is exactly the structure later exploited by active-set and semismooth Newton methods.
Projected Gradient Methods¶
The most direct numerical method works on the reduced functional $\hat J$ over the convex set $U_{\mathrm{ad}}$. Starting from $u_0 \in U_{\mathrm{ad}}$, one computes at each iteration:
state solve: $A y_k = B u_k + f$;
adjoint solve: $A^* p_k = y_k - y_d$;
reduced gradient: $g_k = \alpha u_k + B^* p_k$;
projected update: $u_{k+1} = P_{U_{\mathrm{ad}}}(u_k - s_k g_k)$, with step size $s_k > 0$.
This is the Hilbert-space analogue of projected gradient descent for bound-constrained finite-dimensional optimization.
Two remarks are important:
the projection enforces admissibility at every iteration;
each step still requires one state solve and one adjoint solve, exactly as in the unconstrained reduced method.
For the linear-quadratic case, if one chooses the step size
$$s_k = \frac{1}{\alpha},$$
then
$$u_k - \frac{1}{\alpha}\left( \alpha u_k + B^* p_k \right) = -\frac{1}{\alpha} B^* p_k,$$
so the update becomes
$$u_{k+1} = P_{U_{\mathrm{ad}}}\!\left( -\frac{1}{\alpha} B^* p_k \right).$$
This explains why the projection formula is both an optimality condition and a natural fixed-point iteration.
Projected gradient is robust and easy to implement, but it is usually only linearly convergent and may become slow when the active set is nearly identified but not yet fixed.
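The four steps above can be sketched on a 1D finite-difference discretization of the Poisson model. All data (desired state, bounds, regularization) are illustrative; discrete l2 inner products are used and mesh weights are omitted for brevity:

```python
import numpy as np

# Projected gradient for the discretized model problem (sketch):
#   min 0.5*||A^{-1}(u+f) - yd||^2 + 0.5*alpha*||u||^2,  ua <= u <= ub,
# with A the 1D Dirichlet finite-difference Laplacian.
n = 99
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
A = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

alpha = 0.1
f = np.zeros(n)
yd = np.sin(np.pi * x)                  # desired state (illustrative)
ua, ub = 0.0, 0.5                       # box bounds (illustrative)

u = np.zeros(n)                         # feasible initial control
s = 1.0 / alpha                         # step size s_k = 1/alpha
for k in range(200):
    y = np.linalg.solve(A, u + f)       # state solve
    p = np.linalg.solve(A, y - yd)      # adjoint solve
    g = alpha * u + p                   # reduced gradient
    u_new = np.clip(u - s * g, ua, ub)  # projected update = clip(-p/alpha)
    if np.max(np.abs(u_new - u)) < 1e-12:
        u = u_new
        break
    u = u_new

# At the fixed point the projection formula holds: u = clip(-p/alpha, ua, ub).
y = np.linalg.solve(A, u + f)
p = np.linalg.solve(A, y - yd)
residual = float(np.max(np.abs(u - np.clip(-p / alpha, ua, ub))))
feasible = bool(np.all(u >= ua) and np.all(u <= ub))
```

With $s_k = 1/\alpha$ each iteration is exactly the fixed-point map given by the projection formula, which is a contraction here because $\|B^* S^* S B\|/\alpha < 1$ for this data.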
Primal-Dual Active Set Methods¶
The projection formula suggests splitting the domain into active and inactive regions. Given an iterate $(u_k, \mu_k)$ and a constant $c > 0$, define the active sets
$$\mathcal{A}_k^+ = \{ x \in \Omega : \mu_k(x) + c\,(u_k(x) - u_b(x)) > 0 \}, \qquad \mathcal{A}_k^- = \{ x \in \Omega : \mu_k(x) + c\,(u_k(x) - u_a(x)) < 0 \},$$
and the inactive set
$$\mathcal{I}_k = \Omega \setminus (\mathcal{A}_k^+ \cup \mathcal{A}_k^-).$$
To make the primal-dual structure explicit, introduce a multiplier
$$\mu = \mu_b - \mu_a, \qquad \mu_a, \mu_b \ge 0,$$
for the box constraints, and write the control optimality condition as
$$\alpha \bar u + B^* \bar p + \bar\mu = 0.$$
For box constraints, $\bar\mu$ satisfies the complementarity pattern
$$\bar\mu_a \ge 0, \quad \bar u - u_a \ge 0, \quad \bar\mu_a (\bar u - u_a) = 0, \qquad \bar\mu_b \ge 0, \quad u_b - \bar u \ge 0, \quad \bar\mu_b (u_b - \bar u) = 0 \quad \text{a.e. in } \Omega.$$
Once the active sets are frozen, the inequalities are replaced by equality constraints. On a given active-set guess $(\mathcal{A}_k^-, \mathcal{A}_k^+)$, one may therefore consider the Lagrangian
$$\mathcal{L}(y, u, p, \mu_a, \mu_b) = J(y, u) - \langle p, A y - B u - f \rangle - (\mu_a, u - u_a)_{L^2(\mathcal{A}_k^-)} + (\mu_b, u - u_b)_{L^2(\mathcal{A}_k^+)},$$
where the equality constraints $u = u_a$ and $u = u_b$ are imposed only on the active sets. The multipliers $\mu_a$ and $\mu_b$ are supported on $\mathcal{A}_k^-$ and $\mathcal{A}_k^+$, respectively.
Then the optimality conditions suggest the update rules
$$u_{k+1} = u_a \ \text{on } \mathcal{A}_k^-, \qquad u_{k+1} = u_b \ \text{on } \mathcal{A}_k^+,$$
and
$$\alpha u_{k+1} + B^* p_{k+1} = 0 \ \text{on } \mathcal{I}_k.$$
Equivalently, the multiplier can be recovered from
$$\mu_{k+1} = -(\alpha u_{k+1} + B^* p_{k+1}),$$
with
$$\mu_{k+1} = 0 \ \text{on } \mathcal{I}_k,$$
and sign restrictions $\mu_{k+1} \le 0$ on $\mathcal{A}_k^-$ and $\mu_{k+1} \ge 0$ on $\mathcal{A}_k^+$ at convergence.
So, once the active sets are guessed, one solves a linear system for $(y_{k+1}, u_{k+1}, p_{k+1})$ with those sets frozen. In strong form this means:
$$
\begin{aligned}
-\Delta y_{k+1} &= u_{k+1} + f \ \text{in } \Omega, & y_{k+1} &= 0 \ \text{on } \partial\Omega, \\
-\Delta p_{k+1} &= y_{k+1} - y_d \ \text{in } \Omega, & p_{k+1} &= 0 \ \text{on } \partial\Omega, \\
u_{k+1} &= u_a \ \text{on } \mathcal{A}_k^-, \qquad u_{k+1} = u_b \ \text{on } \mathcal{A}_k^+, & \alpha u_{k+1} + p_{k+1} &= 0 \ \text{on } \mathcal{I}_k.
\end{aligned}
$$
Algorithmically:
guess active and inactive sets from the current iterate;
solve the equality-constrained KKT system with these sets fixed;
update the sets and repeat until they stop changing.
This is called a primal-dual active set method because:
the primal variable $u_k$ determines where bounds are active;
the dual variables $\mu_a$ and $\mu_b$ (equivalently, $\mu_k$) decide whether a point should be active or inactive.
In practice, PDAS is often much faster than projected gradient because once the active set is identified, the remaining step is essentially the solution of the correct linearized KKT system.
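The loop can be sketched in a discrete setting, using the standard PDAS tests $\mu + c(u - u_b) > 0$ and $\mu + c(u - u_a) < 0$ with $c > 0$. The 1D finite-difference model and all data are illustrative; the dense reduced map $p(u) = Mu + r$ eliminates the state and adjoint solves for compactness:

```python
import numpy as np

# Primal-dual active set (PDAS) sketch for the discretized problem
#   min 0.5*||A^{-1}(u+f) - yd||^2 + 0.5*alpha*||u||^2,  ua <= u <= ub.
n = 60
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
A = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

alpha = 1e-4
f = np.zeros(n)
yd = np.sin(np.pi * x)
ua, ub = 0.0, 6.0
c = 1.0

Ainv = np.linalg.inv(A)
M = Ainv @ Ainv                      # reduced map: p(u) = M u + r
r = M @ f - Ainv @ yd

u = np.zeros(n)
mu = np.zeros(n)
n_active_up = 0
for k in range(50):
    act_lo = mu + c * (u - ua) < 0   # guessed lower-active set
    act_up = mu + c * (u - ub) > 0   # guessed upper-active set
    inact = ~(act_lo | act_up)

    # Freeze the sets: u = ua on act_lo, u = ub on act_up,
    # alpha*u + p = 0 on inact, i.e. (alpha*I + M) u = -r there.
    K = np.eye(n)
    rhs = np.where(act_lo, ua, np.where(act_up, ub, 0.0))
    K[inact, :] = alpha * np.eye(n)[inact, :] + M[inact, :]
    rhs[inact] = -r[inact]
    u = np.linalg.solve(K, rhs)

    p = M @ u + r
    mu = -(alpha * u + p)            # multiplier update

    new_lo = mu + c * (u - ua) < 0
    new_up = mu + c * (u - ub) > 0
    if np.array_equal(new_lo, act_lo) and np.array_equal(new_up, act_up):
        n_active_up = int(np.sum(new_up))  # sets stopped changing
        break

# At convergence the projection formula is satisfied.
residual = float(np.max(np.abs(u - np.clip(-p / alpha, ua, ub))))
```

The iteration stops as soon as the guessed sets reproduce themselves, which for this piecewise-affine problem happens after a handful of iterations.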
The same idea extends beyond box constraints. Let $C \subset U$ be any closed convex set for which the projection
$$P_C : U \to C$$
is available in explicit or computational form. Then one may still write the optimality condition as
$$\bar u = P_C\big( \bar u - s\, \nabla \hat J(\bar u) \big) \qquad \text{for any } s > 0,$$
or equivalently as a normal-cone inclusion
$$0 \in \nabla \hat J(\bar u) + N_C(\bar u),$$
and build active-set type methods by locally identifying the part of the constraint that behaves as an equality constraint at the current iterate. For box constraints this identification is pointwise and especially transparent, which is why PDAS takes its simplest form there. For more general admissible sets, the same principle survives, but the geometry of the active manifold and the structure of the projected Newton step become more problem-dependent.
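A simple non-box example with an explicit projection is a ball in the discrete l2 norm, where $P_C$ is a radial scaling. The sketch below (illustrative data) also verifies the variational characterization of the projection at sampled admissible points:

```python
import numpy as np

# C = { v : ||v|| <= rho }: the projection scales v radially onto the ball.
# We verify the characterization (v - P_C(v), z - P_C(v)) <= 0 for z in C.
rng = np.random.default_rng(0)
n, rho = 50, 1.0

def project_ball(v, rho):
    nv = np.linalg.norm(v)
    return v if nv <= rho else (rho / nv) * v

v = 2.0 * rng.standard_normal(n)     # a point outside the ball
w = project_ball(v, rho)

vi_ok = True
for _ in range(100):
    z = project_ball(rng.standard_normal(n), rho)  # sample z in C
    if float(np.dot(v - w, z - w)) > 1e-10:
        vi_ok = False
```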
Semismooth Newton Interpretation¶
The projection operator $P_{U_{\mathrm{ad}}}$ is not differentiable in the classical sense, but it is semismooth. This allows one to apply generalized Newton methods to the nonsmooth optimality system.
Recall the basic definition. Let $F : \mathbb{R}^n \to \mathbb{R}^m$ be locally Lipschitz and directionally differentiable. Its generalized derivative at $x$ is the Clarke generalized Jacobian
$$\partial F(x) = \operatorname{co} \Big\{ \lim_{k \to \infty} F'(x_k) : x_k \to x, \ F \ \text{differentiable at } x_k \Big\},$$
that is, the convex hull of all limits of classical derivatives computed at nearby points where $F$ is differentiable.
Then $F$ is called semismooth at $x$ if, for every perturbation $h \to 0$ and every generalized derivative $M \in \partial F(x + h)$ in the sense of Clarke,
$$\| F(x + h) - F(x) - M h \| = o(\|h\|).$$
If the stronger estimate
$$\| F(x + h) - F(x) - M h \| = O(\|h\|^2)$$
holds, one speaks of strong semismoothness. Semismoothness is the regularity that replaces classical differentiability in generalized Newton theory.
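A small scalar illustration (not part of the lecture's model): for $F(x) = \max(0, x)$ at $x = 0$, the residual in the semismoothness estimate, computed with the classical derivative at the perturbed point, vanishes identically, so the strong estimate holds trivially:

```python
import numpy as np

# Strong semismoothness of F(x) = max(0, x) at 0: for h != 0, F is
# classically differentiable at h with derivative dF(h), and the residual
# |F(h) - F(0) - dF(h)*h| is identically zero, hence O(|h|^2).
def F(x):
    return max(0.0, x)

def dF(h):                            # classical derivative at h != 0
    return 1.0 if h > 0 else 0.0

steps = np.geomspace(1e-8, 1e-1, 15)
residuals = [abs(F(h) - F(0.0) - dF(h) * h)
             for h in list(steps) + list(-steps)]
max_residual = max(residuals)
```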
A convenient reduced equation is
$$\Phi(u) := u - P_{U_{\mathrm{ad}}}\big( u - \beta\, \nabla \hat J(u) \big) = 0,$$
with some parameter $\beta > 0$. At a solution, this is equivalent to the variational inequality.
Indeed, in a Hilbert space the projection onto a closed convex set $C$ is characterized by
$$w = P_C(v) \iff (v - w, z - w) \le 0 \quad \text{for all } z \in C.$$
Apply this with
$$v = u - \beta\, \nabla \hat J(u), \qquad w = u.$$
Then $\Phi(u) = 0$ reads
$$u = P_C\big( u - \beta\, \nabla \hat J(u) \big).$$
Using the projection characterization, this is equivalent to
$$\big( u - \beta\, \nabla \hat J(u) - u, \; z - u \big) \le 0 \quad \text{for all } z \in C,$$
that is,
$$-\beta\, \big( \nabla \hat J(u), z - u \big) \le 0 \quad \text{for all } z \in C.$$
Since $\beta > 0$, this is equivalent to
$$\big( \nabla \hat J(u), z - u \big) \ge 0 \quad \text{for all } z \in C,$$
which is exactly the variational inequality.
So the fixed-point equation and the constrained first-order condition are equivalent.
For box constraints, the projection acts pointwise, so its generalized derivative is easy to characterize:
it is 0 on points that are projected onto an active bound;
it is 1 on inactive points.
Therefore a generalized Newton step for the projection equation amounts to freezing the current active set and solving the corresponding linearized state-adjoint-control system. This is exactly the mechanism behind PDAS.
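This mechanism can be sketched directly: a generalized Newton iteration for the equation $u - \mathrm{clip}(-p(u)/\alpha, u_a, u_b) = 0$ on a 1D finite-difference model (all data illustrative), where the diagonal 0/1 pattern is the generalized derivative of the pointwise clip:

```python
import numpy as np

# Semismooth Newton sketch for Phi(u) = u - clip(-p(u)/alpha, ua, ub) = 0.
# The generalized derivative of clip is 1 at inactive points, 0 at active ones.
n = 60
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
A = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

alpha = 1e-4
f = np.zeros(n)
yd = np.sin(np.pi * x)
ua, ub = 0.0, 6.0

Ainv = np.linalg.inv(A)
M = Ainv @ Ainv                      # reduced map: p(u) = M u + r
r = M @ f - Ainv @ yd

def Phi(u):
    q = -(M @ u + r) / alpha         # unconstrained control law -p(u)/alpha
    return u - np.clip(q, ua, ub), q

u = np.zeros(n)
for k in range(30):
    val, q = Phi(u)
    if np.max(np.abs(val)) < 1e-10:
        break
    inactive = (q > ua) & (q < ub)   # 0/1 generalized derivative of clip
    # Generalized Jacobian G = I + D*(M/alpha), D = diag(inactive).
    G = np.eye(n)
    G[inactive, :] += M[inactive, :] / alpha
    u = u + np.linalg.solve(G, -val)

final_residual = float(np.max(np.abs(Phi(u)[0])))
```

Each Newton step freezes the rows where a bound is active (Jacobian row = identity) and solves the reduced optimality condition on the remaining rows, which is precisely the PDAS subproblem.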
So, in the linear-quadratic box-constrained case, one can view:
PDAS as an active-set method;
semismooth Newton as a generalized Newton method for the projection equation;
and, in this specific setting, the two viewpoints are essentially equivalent.
This equivalence is one of the main reasons why active-set methods are so effective for elliptic control problems with box constraints: they inherit Newton-type local behavior while preserving the simple geometric interpretation of active and inactive regions.
Algorithmic Comparison¶
The three methods fit the same state-adjoint-control structure, but they use it differently:
projected gradient uses the projection only as a feasible descent step;
PDAS uses the projection to identify active sets and then solves a linear KKT system;
semismooth Newton interprets the projection equation as a nonsmooth root-finding problem.
Typical qualitative behavior:
projected gradient: cheapest iteration, strongest robustness, slowest asymptotically;
PDAS: more expensive iteration, often very fast once the active set is close to correct;
semismooth Newton: Newton-type local convergence, but requires a good formulation and linear solver strategy.
This is why the continuous optimality system matters algorithmically: it tells us not only what the solution must satisfy, but also what class of numerical methods is natural for the problem.
Summary¶
This lecture builds directly on the previous one:
the state equation and adjoint equation are unchanged;
the unconstrained stationarity condition
$$\alpha \bar u + B^* \bar p = 0$$
is replaced by the constrained inclusion
$$0 \in \alpha \bar u + B^* \bar p + N_{U_{\mathrm{ad}}}(\bar u);$$
equivalently, the control satisfies the projection formula
$$\bar u = P_{U_{\mathrm{ad}}}\!\left( -\frac{1}{\alpha} B^* \bar p \right);$$
the full KKT system remains a saddle-point problem in dual spaces
with a nonlinear control block.
this structure naturally leads to three algorithmic families:
projected gradient methods;
primal-dual active set methods;
semismooth Newton methods.
This closes the continuous first-order theory for linear-quadratic elliptic control with box constraints.
References¶
Fredi Tröltzsch, Optimal Control of Partial Differential Equations: Theory, Methods and Applications, Chapter 2.
Alfio Borzì and Volker Schulz, Computational Optimization of Systems Governed by Partial Differential Equations.
Juan Carlos De los Reyes, Numerical PDE-Constrained Optimization.