Implementing SQH for a Poisson Control Problem - NMOPT — Numerical Methods for Optimal Control

Lecture 17 introduced the Sequential Quadratic Hamiltonian method from the Pontryagin point of view. This lecture records the implementation now added to the deal.II code base. The important point is that the code is not a full all-at-once Newton or SQP solver. It is a callback-based SQH driver, specialized here to a distributed Poisson control problem with an $L^2$ plus cutoff $L^1$ control cost.

The new implementation is split into two layers:

SQH<VectorType> in codes/dealii/include/sqh.h, a generic algorithmic loop that knows how to accept and reject Hamiltonian updates;
Poisson<dim> in codes/dealii/include/poisson_sqh.h and codes/dealii/source/poisson_sqh.cc, a concrete finite element problem that supplies state solves, adjoint solves, cost evaluation, distances, and the pointwise Hamiltonian maximizer.

The executable codes/dealii/execs/poisson_sqh.cc only initializes MPI, reads the parameter file through PoissonParameters<dim>, constructs a Poisson<dim> object, and calls run_sqh().

Mathematical Problem Implemented

Let $\Omega=(-2,2)^d$. The code builds a hypercube mesh on this domain and solves a distributed control problem governed by the Poisson equation

$$-\Delta y = u+f \qquad\text{in }\Omega, \tag{1}$$

with Dirichlet boundary values prescribed by the parameter Exact solution expression. The control is a discontinuous finite element field, while the state and adjoint are continuous finite element fields.

The objective functional implemented in Poisson<dim>::cost_function is

$$J(y,u) = \int_\Omega \left[ \frac12 (y-y_d)^2 + \frac{\alpha}{2}u^2 + \beta |u|\,\mathbf 1_{\{|u|>s\}} \right]\,dx. \tag{2}$$

The parameters are:

$\alpha$, from L2 control parameter (alpha);
$\beta$, from L1 control parameter (beta);
$s$, from Cutoff parameter for L^1 (s);
$y_d$, from Target expression;
$f$, from Right hand side expression.

Notice the exact cutoff used in the code:

$$\beta |u|\,\mathbf 1_{\{|u|>s\}}. \tag{3}$$

This is not the shifted Huber-like expression $\beta(|u|-s)_+$; it is literally the product implemented by

par.beta * (std::abs(u) * (std::abs(u) > par.cutoff_s))

at quadrature points.
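To make the difference concrete, the following sketch evaluates both variants at a single point. The function names `cutoff_l1`, `shifted_l1`, and `control_cost_density` are hypothetical and do not come from the code base; only the expression inside `cutoff_l1` mirrors the product quoted above.

```cpp
#include <algorithm>
#include <cmath>

// Cutoff L^1 term as described in the text: beta*|u| if |u| > s, else 0.
// The comparison yields a bool, which converts to 0 or 1 in the product.
double cutoff_l1(const double u, const double beta, const double s)
{
  return beta * (std::abs(u) * (std::abs(u) > s));
}

// Shifted variant beta*(|u|-s)_+, shown only for comparison.
double shifted_l1(const double u, const double beta, const double s)
{
  return beta * std::max(std::abs(u) - s, 0.0);
}

// Pointwise control cost density: alpha/2 u^2 plus the cutoff L^1 term.
double control_cost_density(const double u,
                            const double alpha,
                            const double beta,
                            const double s)
{
  return 0.5 * alpha * u * u + cutoff_l1(u, beta, s);
}
```

With $\beta=2$ and $s=0.5$, the cutoff term at $u=1$ equals $2$, while the shifted expression equals $1$. Both vanish for $|u|\le s$, but the cutoff term jumps by $\beta s$ at $|u|=s$, which is why the Hamiltonian (7) is only stated away from that discontinuity.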

The weak state equation assembled by assemble_rhs() and solved by solve() is

$$\int_\Omega \nabla y\cdot\nabla v\,dx = \int_\Omega (u+f)v\,dx \qquad \forall v. \tag{4}$$

The adjoint right-hand side assembled by assemble_adjoint_rhs() is

$$\int_\Omega (y-y_d)q\,dx. \tag{5}$$

Therefore the adjoint solve is

$$\int_\Omega \nabla p\cdot\nabla q\,dx = \int_\Omega (y-y_d)q\,dx \qquad \forall q, \tag{6}$$

with homogeneous Dirichlet boundary conditions.

With the sign convention used by the code, the local Hamiltonian contribution relevant for the control update can be read as

$$H(y,v,p) = -\frac{\alpha}{2}v^2 -\beta |v| -pv, \tag{7}$$

away from the cutoff discontinuity. The forcing term and terms independent of $v$ do not affect the pointwise maximization. The SQH regularized Hamiltonian is

$$H_\varepsilon(y,v,w,p) = H(y,v,p) - \varepsilon |v-w|^2, \tag{8}$$

where $w$ is the current control value.

Ignoring the cutoff in the derivative of the nonsmooth term, the pointwise maximizer implemented in argmax_u_of_H() is obtained from the two signs of $v$:

$$v_+ = \frac{2\varepsilon w-p-\beta}{\alpha+2\varepsilon}, \qquad v_- = \frac{2\varepsilon w-p+\beta}{\alpha+2\varepsilon}. \tag{9}$$
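As a short check of (9): on a branch where the sign of $v$ is fixed and the cutoff is ignored, the regularized Hamiltonian (8) is a concave quadratic in $v$, so setting its derivative to zero gives the candidate for that branch. The upper sign below corresponds to $v>0$, the lower sign to $v<0$.

```latex
% Stationarity of H_eps on a branch with sign(v) = +1 (upper) or -1 (lower)
\begin{aligned}
\partial_v H_\varepsilon(y,v,w,p)
  &= -\alpha v \mp \beta - p - 2\varepsilon (v-w) = 0 \\
\Longrightarrow\quad (\alpha + 2\varepsilon)\, v_\pm
  &= 2\varepsilon w - p \mp \beta .
\end{aligned}
```

Dividing by $\alpha+2\varepsilon>0$ recovers the two expressions in (9).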

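The code returns $v_+$ if $v_+>0$ and otherwise returns $v_-$. This mirrors the scalar soft-threshold structure associated with the nonsmooth $L^1$ term, expressed in the Hamiltonian maximization convention used here. Note that for $\beta\ge 0$ we have $v_- - v_+ = 2\beta/(\alpha+2\varepsilon)\ge 0$, so when $v_+\le 0\le v_-$ neither candidate has the sign assumed in its derivation; exact soft-thresholding returns $0$ in that case, whereas the rule described here returns $v_-$.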
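The selection rule can be sketched as a scalar function. This is a minimal illustration, not the body of argmax_u_of_H(): the function names and argument order are assumptions, and the cutoff $s$ is ignored. The second function is the textbook three-branch soft-threshold, included only for comparison.

```cpp
// Two-branch rule as described in the text: return v_plus if it is
// positive, otherwise return v_minus. The cutoff s is ignored.
double argmax_two_branch(const double w,
                         const double p,
                         const double eps,
                         const double alpha,
                         const double beta)
{
  const double denom   = alpha + 2.0 * eps;
  const double v_plus  = (2.0 * eps * w - p - beta) / denom;
  const double v_minus = (2.0 * eps * w - p + beta) / denom;
  return (v_plus > 0.0) ? v_plus : v_minus;
}

// Textbook soft-threshold for comparison (an assumption for illustration):
// it adds a zero branch when neither candidate has the sign it assumes.
double argmax_soft_threshold(const double w,
                             const double p,
                             const double eps,
                             const double alpha,
                             const double beta)
{
  const double denom   = alpha + 2.0 * eps;
  const double v_plus  = (2.0 * eps * w - p - beta) / denom;
  const double v_minus = (2.0 * eps * w - p + beta) / denom;
  if (v_plus > 0.0)
    return v_plus;
  if (v_minus < 0.0)
    return v_minus;
  return 0.0;
}
```

For $\alpha=1$, $\varepsilon=0.5$, $\beta=1$, $w=0$, both functions agree when $p=-3$ (value $1$) and when $p=3$ (value $-1$). They differ only when $p=0$, where the two-branch rule gives $0.5$ and the three-branch variant gives $0$.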

Discretization and Linear Algebra

The implementation uses two finite element spaces:

FE_Q<dim> for the state $y$ and adjoint $p$;
FE_DGQ<dim> for the control $u$.

The state stiffness matrix is assembled once per mesh by MatrixCreator::create_laplace_matrix(). Both the state and adjoint solves reuse this matrix and a Trilinos AMG preconditioner inside a CG solve.

The control update is pointwise at quadrature points, but the result must be stored in the DG control space. The code therefore:

evaluates $w$, $y$, and $p$ at quadrature points;
computes the scalar Hamiltonian maximizer at each quadrature point;
assembles these values as a control-space right-hand side;
applies a cellwise inverse DG mass matrix to recover the control coefficient vector.

The inverse control mass matrix is assembled cell by cell in assemble_system(). Since the control space is discontinuous, this inverse is local to each cell and can be inserted directly into the global sparse matrix.
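The cellwise projection can be sketched on a single 1D cell. The example assumes a nodal linear basis ($\phi_0=1-x$, $\phi_1=x$ on the reference cell) and two-point Gauss quadrature; these choices and the name `project_on_cell` are illustrative assumptions and need not match the degree or quadrature used in the actual code.

```cpp
#include <array>
#include <cmath>

// L^2-project values given at the two Gauss points of a 1D cell of length h
// onto the local linear DG basis. Returns the (left, right) coefficients.
std::array<double, 2> project_on_cell(const std::array<double, 2> &g_q,
                                      const double                 h)
{
  // Gauss points on the reference cell [0,1]; both weights equal 1/2.
  const double                s  = 0.5 / std::sqrt(3.0);
  const std::array<double, 2> xq = {{0.5 - s, 0.5 + s}};
  const double                wq = 0.5;

  // Right-hand side b_i = sum_q (w_q h) g(x_q) phi_i(x_q).
  std::array<double, 2> b = {{0.0, 0.0}};
  for (int q = 0; q < 2; ++q)
    {
      b[0] += wq * h * g_q[q] * (1.0 - xq[q]);
      b[1] += wq * h * g_q[q] * xq[q];
    }

  // Local mass matrix M = h/6 [[2,1],[1,2]], so M^{-1} = 2/h [[2,-1],[-1,2]].
  return {{(2.0 / h) * (2.0 * b[0] - b[1]),
           (2.0 / h) * (-b[0] + 2.0 * b[1])}};
}
```

Because the basis functions of one cell do not overlap with those of any other cell, the global mass matrix is block diagonal and this small inverse is all that is needed per cell. A function that is already linear on the cell is reproduced exactly by the projection.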

The distances used by SQH are true $L^2$ distances computed by VectorTools::integrate_difference():

$$\|u-w\|_{L^2(\Omega)} \qquad\text{and}\qquad \|y-y_{\mathrm{old}}\|_{L^2(\Omega)}. \tag{10}$$
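The word "true" matters because the Euclidean norm of a coefficient vector depends on the mesh size, while the $L^2(\Omega)$ norm does not. A minimal sketch, assuming piecewise-constant functions on a uniform 1D mesh with cell size `h` and vectors of equal length (the function names are hypothetical):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// L^2 distance of two piecewise-constant functions on a uniform mesh:
// each cell contributes h * (u_i - w_i)^2 to the squared norm.
double l2_distance(const std::vector<double> &u,
                   const std::vector<double> &w,
                   const double               h)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < u.size(); ++i)
    sum += h * (u[i] - w[i]) * (u[i] - w[i]);
  return std::sqrt(sum);
}

// Euclidean distance of the coefficient vectors, for comparison only.
double euclidean_distance(const std::vector<double> &u,
                          const std::vector<double> &w)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < u.size(); ++i)
    sum += (u[i] - w[i]) * (u[i] - w[i]);
  return std::sqrt(sum);
}
```

For the constant function $1$ on four cells of size $0.25$, the $L^2$ distance to zero is $1$, while the Euclidean distance of the coefficients is $2$ and would keep growing under mesh refinement.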

Sketch of the SQH Algorithm

The implementation follows the robust SQH algorithm from Lecture 17, with the names used in SQHParameters:

Initial epsilon: initial $\varepsilon$ ;
Increase factor for epsilon (sigma): $\sigma>1$ ;
Decrease factor for epsilon (zeta): $\zeta\in(0,1)$ ;
Eta: $\eta$ in the sufficient decrease test;
Tolerance: stopping tolerance for the control update;
Max iterations: maximum number of outer trials.

The actual loop in SQH<VectorType>::optimize() is:

Solve the forward problem for the initial control:

$$y^0=S(u^0). \tag{11}$$

Store the initial cost:

$$J^0=J(y^0,u^0). \tag{12}$$

For each trial index $k$, solve the adjoint equation at the last accepted pair:

$$p^k=P(y^k,u^k). \tag{13}$$

Compute a trial control by pointwise maximization:

$$\tilde u = \operatorname*{arg\,max}_v \left[ H(y^k,v,p^k) - \varepsilon |v-u^k|^2 \right]. \tag{14}$$

Solve the state equation for the trial control:

$$\tilde y=S(\tilde u). \tag{15}$$

Compute

$$\Delta J = J(\tilde y,\tilde u)-J(y^k,u^k), \qquad \tau=\| \tilde u-u^k\|_{L^2(\Omega)}^2. \tag{16}$$

Reject the trial if

$$\Delta J > -\eta \tau. \tag{17}$$

On rejection, the code restores the previous state and control, increases

$$\varepsilon \leftarrow \sigma \varepsilon, \tag{18}$$

logs the new value of $\varepsilon$, and repeats from the adjoint and Hamiltonian update step.

Accept the trial otherwise. On acceptance, the code decreases

$$\varepsilon \leftarrow \zeta \varepsilon, \tag{19}$$

stores the cost, squared control distance, state distance, and accepted iterate, and emits the accepted-iteration callback.

Stop when

$$\tau < \texttt{Tolerance}. \tag{20}$$

The code records the histories of $J$, $\tau$, $\Delta y$, and $\varepsilon$. If deal.II was configured with HDF5 support (DEAL_II_WITH_HDF5), these histories are written to sqh_log.h5.

The implementation follows the notation of Lecture 17: the control distance callback returns the $L^2$ norm, and SQH::optimize() squares it to form

$$\tau_k=\|u^{k+1}-u^k\|_{L^2(\Omega)}^2. \tag{21}$$

This squared quantity is used in the sufficient decrease test, the stopping test, and the HDF5 history dataset tau.
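The loop described in this section can be condensed into a short, self-contained sketch. It is not the actual SQH<VectorType> class: the names `SqhSketch`, `SqhSketchParameters`, and `run_scalar_toy`, the callback signatures, and the default parameter values are assumptions made for illustration, and logging, state distances, and iteration callbacks are omitted. The toy problem replaces the PDE by the scalar state equation $y=u+f$ with $\beta=0$.

```cpp
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical parameter bundle; field names and defaults are illustrative.
struct SqhSketchParameters
{
  double       initial_epsilon = 1.0;
  double       sigma           = 2.0;   // increase factor, > 1
  double       zeta            = 0.5;   // decrease factor, in (0,1)
  double       eta             = 1e-6;  // sufficient decrease constant
  double       tolerance       = 1e-12; // compared against tau
  unsigned int max_iterations  = 100;
};

template <typename VectorType>
struct SqhSketch
{
  using V = VectorType;
  std::function<void(V &, const V &)>                    solve_forward_problem;
  std::function<void(V &, const V &, const V &)>         solve_adjoint_problem;
  std::function<double(const V &, const V &)>            cost_function;
  std::function<void(V &, const V &, const V &, const V &, double)>
                                                         pointwise_argmax_u;
  std::function<double(const V &, const V &)>            control_distance;

  std::vector<double> cost_history, tau_history, epsilon_history;

  void optimize(V &y, V &u, const SqhSketchParameters &par)
  {
    double eps = par.initial_epsilon;
    solve_forward_problem(y, u);                       // (11)
    double J = cost_function(y, u);                    // (12)
    cost_history.push_back(J);
    V p = y, u_trial = u, y_trial = y;
    for (unsigned int k = 0; k < par.max_iterations; ++k)
      {
        solve_adjoint_problem(p, y, u);                // (13)
        pointwise_argmax_u(u_trial, y, p, u, eps);     // (14)
        solve_forward_problem(y_trial, u_trial);       // (15)
        const double J_trial = cost_function(y_trial, u_trial);
        const double dist    = control_distance(u_trial, u);
        const double tau     = dist * dist;            // (16)
        if (J_trial - J > -par.eta * tau)              // (17): reject
          {
            eps *= par.sigma;                          // (18); y, u untouched
            epsilon_history.push_back(eps);
            continue;
          }
        u = u_trial;                                   // accept the trial
        y = y_trial;
        J = J_trial;
        eps *= par.zeta;                               // (19)
        cost_history.push_back(J);
        tau_history.push_back(tau);
        epsilon_history.push_back(eps);
        if (tau < par.tolerance)                       // (20)
          break;
      }
  }
};

// Scalar toy problem: y = u + f, J = 1/2 (y - yd)^2 + alpha/2 u^2, beta = 0.
double run_scalar_toy()
{
  const double      f = 0.0, yd = 2.0, alpha = 1.0;
  SqhSketch<double> sqh;
  sqh.solve_forward_problem = [=](double &y, const double &u) { y = u + f; };
  sqh.solve_adjoint_problem = [=](double &p, const double &y, const double &) {
    p = y - yd;
  };
  sqh.cost_function = [=](const double &y, const double &u) {
    return 0.5 * (y - yd) * (y - yd) + 0.5 * alpha * u * u;
  };
  sqh.pointwise_argmax_u = [=](double &v, const double &, const double &p,
                               const double &w, double eps) {
    v = (2.0 * eps * w - p) / (alpha + 2.0 * eps);
  };
  sqh.control_distance = [](const double &a, const double &b) {
    return std::abs(a - b);
  };
  double y = 0.0, u = 0.0;
  sqh.optimize(y, u, SqhSketchParameters());
  return u;
}
```

For this toy problem the minimizer is $u^\ast=(y_d-f)/(1+\alpha)=1$, so `run_scalar_toy()` should return a value close to $1$. In the sketch, rejection needs no explicit restore because the trial quantities live in separate variables and the accepted pair is only overwritten on acceptance.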

Mermaid Diagram of the Algorithm

Mermaid Diagram of the Code

The generic SQH class and the Poisson specialization

The generic SQH class stores only algorithmic state. It does not know what a PDE is. Its public interface consists of user-supplied callbacks:

solve_forward_problem(y,u);
solve_adjoint_problem(p,y,u);
cost_function(y,u);
pointwise_argmax_u(u,y,p,w,eps);
control_distance(u,w);
state_distance(y,yd).

This makes the SQH loop reusable for other problems, provided they can supply the same operations.

The Poisson specialization supplies those callbacks in its constructor. In particular:

the forward callback assembles the right-hand side from $u+f$ and solves the Poisson equation;
the adjoint callback assembles the right-hand side $y-y_d$ and solves the homogeneous adjoint equation;
the cost callback integrates the tracking, $L^2$ control, and cutoff $L^1$ terms over cells;
the Hamiltonian callback computes the pointwise scalar maximizer and projects it back into the DG control space;
the accepted and rejected iteration callbacks optionally write VTU files, depending on the corresponding parameters.
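The wiring pattern itself, a problem class that fills in the driver's callbacks from its constructor, can be sketched as follows. `DriverSketch` and `ToyProblem` are hypothetical stand-ins, not the classes in sqh.h or poisson_sqh.h, and the "solve" is a placeholder identity map.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the generic driver: it only stores callbacks.
struct DriverSketch
{
  using Vec = std::vector<double>;
  std::function<void(Vec &, const Vec &)>         solve_forward_problem;
  std::function<double(const Vec &, const Vec &)> cost_function;
};

// Hypothetical problem class that wires its member functions into the driver.
class ToyProblem
{
public:
  using Vec = std::vector<double>;

  explicit ToyProblem(const double alpha_in)
    : alpha(alpha_in)
  {
    // Capturing `this` lets the driver call problem-specific code without
    // knowing anything about the problem itself.
    driver.solve_forward_problem = [this](Vec &y, const Vec &u) {
      solve(y, u);
    };
    driver.cost_function = [this](const Vec &y, const Vec &u) {
      return cost(y, u);
    };
  }

  // Copies would leave the stored lambdas pointing at the original object.
  ToyProblem(const ToyProblem &)            = delete;
  ToyProblem &operator=(const ToyProblem &) = delete;

  DriverSketch driver;

private:
  // Placeholder for a PDE solve: the state is simply a copy of the control.
  void solve(Vec &y, const Vec &u) const { y = u; }

  // Discrete tracking-type cost with target zero and unit weights.
  double cost(const Vec &y, const Vec &u) const
  {
    double J = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i)
      J += 0.5 * y[i] * y[i] + 0.5 * alpha * u[i] * u[i];
    return J;
  }

  double alpha;
};
```

The copy operations are deleted in this sketch because lambdas that capture `this` would silently refer to the source object after a copy. Whether the real classes take the same precaution is not stated in these notes.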

The result is a compact implementation of the Lecture 17 SQH idea: global PDE solves for state and adjoint, local Hamiltonian optimization for the control, and adaptive regularization through accept/reject decisions.