Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Implementing SQH for a Poisson Control Problem

University of Pisa

Lecture 17 introduced the Sequential Quadratic Hamiltonian method from the Pontryagin point of view. This lecture records the implementation now added to the deal.II code base. The important point is that the code is not a full all-at-once Newton or SQP solver. It is a callback-based SQH driver, specialized here to a distributed Poisson control problem with an L2L^2 plus cutoff L1L^1 control cost.

The new implementation is split into two layers:

The executable codes/dealii/execs/poisson_sqh.cc only initializes MPI, reads the parameter file through PoissonParameters<dim>, constructs a Poisson<dim> object, and calls run_sqh().


Mathematical Problem Implemented

Let Ω=(2,2)d\Omega=(-2,2)^d. The code builds a hypercube mesh on this domain and solves a distributed control problem governed by the Poisson equation

Δy=u+fin Ω,-\Delta y = u+f \qquad\text{in }\Omega,

with Dirichlet boundary values prescribed by the parameter Exact solution expression. The control is a discontinuous finite element field, while the state and adjoint are continuous finite element fields.

The objective functional implemented in Poisson<dim>::cost_function is

J(y,u)=Ω[12(yyd)2+α2u2+βu1{u>s}]dx.J(y,u) = \int_\Omega \left[ \frac12 (y-y_d)^2 + \frac{\alpha}{2}u^2 + \beta |u|\,\mathbf 1_{\{|u|>s\}} \right]\,dx.

The parameters are:

Notice the exact cutoff used in the code:

βu1{u>s}.\beta |u|\,\mathbf 1_{\{|u|>s\}}.

This is not the shifted Huber-like expression β(us)+\beta(|u|-s)_+; it is literally the product implemented by

par.beta * (std::abs(u) * (std::abs(u) > par.cutoff_s))

at quadrature points.

The weak state equation assembled by assemble_rhs() and solved by solve() is

Ωyvdx=Ω(u+f)vdxv.\int_\Omega \nabla y\cdot\nabla v\,dx = \int_\Omega (u+f)v\,dx \qquad \forall v.

The adjoint right-hand side assembled by assemble_adjoint_rhs() is

Ω(yyd)qdx.\int_\Omega (y-y_d)q\,dx.

Therefore the adjoint solve is

Ωpqdx=Ω(yyd)qdxq,\int_\Omega \nabla p\cdot\nabla q\,dx = \int_\Omega (y-y_d)q\,dx \qquad \forall q,

with homogeneous Dirichlet boundary conditions.

With the sign convention used by the code, the local Hamiltonian contribution relevant for the control update can be read as

H(y,v,p)=α2v2βvpv,H(y,v,p) = -\frac{\alpha}{2}v^2 -\beta |v| -pv,

away from the cutoff discontinuity. The forcing term and terms independent of vv do not affect the pointwise maximization. The SQH regularized Hamiltonian is

Hε(y,v,w,p)=H(y,v,p)εvw2,H_\varepsilon(y,v,w,p) = H(y,v,p) - \varepsilon |v-w|^2,

where ww is the current control value.

Ignoring the cutoff in the derivative of the nonsmooth term, the pointwise maximizer implemented in argmax_u_of_H() is obtained from the two signs of vv:

v+=2εwpβα+2ε,v=2εwp+βα+2ε.v_+ = \frac{2\varepsilon w-p-\beta}{\alpha+2\varepsilon}, \qquad v_- = \frac{2\varepsilon w-p+\beta}{\alpha+2\varepsilon}.

The code returns v+v_+ if v+>0v_+>0 and otherwise returns vv_-. This is the standard scalar soft-threshold structure associated with the nonsmooth L1L^1 term, expressed in the Hamiltonian maximization convention used here.


Discretization and Linear Algebra

The implementation uses two finite element spaces:

The state stiffness matrix is assembled once per mesh by MatrixCreator::create_laplace_matrix(). Both the state and adjoint solves reuse this matrix and a Trilinos AMG preconditioner inside a CG solve.

The control update is pointwise at quadrature points, but the result must be stored in the DG control space. The code therefore:

  1. evaluates ww, yy, and pp at quadrature points;

  2. computes the scalar Hamiltonian maximizer at each quadrature point;

  3. assembles these values as a control-space right-hand side;

  4. applies a cellwise inverse DG mass matrix to recover the control coefficient vector.

The inverse control mass matrix is assembled cell by cell in assemble_system(). Since the control space is discontinuous, this inverse is local to each cell and can be inserted directly into the global sparse matrix.

The distances used by SQH are true L2L^2 distances computed by VectorTools::integrate_difference():

uwL2(Ω)andyyoldL2(Ω).\|u-w\|_{L^2(\Omega)} \qquad\text{and}\qquad \|y-y_{\mathrm{old}}\|_{L^2(\Omega)}.

Sketch of the SQH Algorithm

The implementation follows the robust SQH algorithm from Lecture 17, with the names used in SQHParameters:

The actual loop in SQH<VectorType>::optimize() is:

  1. Solve the forward problem for the initial control:

y0=S(u0).y^0=S(u^0).
  1. Store the initial cost:

J0=J(y0,u0).J^0=J(y^0,u^0).
  1. For each trial index kk, solve the adjoint equation at the last accepted pair:

pk=P(yk,uk).p^k=P(y^k,u^k).
  1. Compute a trial control by pointwise maximization:

u~=arg maxv[H(yk,v,pk)εvuk2].\tilde u = \operatorname*{arg\,max}_v \left[ H(y^k,v,p^k) - \varepsilon |v-u^k|^2 \right].
  1. Solve the state equation for the trial control:

y~=S(u~).\tilde y=S(\tilde u).
  1. Compute

ΔJ=J(y~,u~)J(yk,uk),τ=u~ukL2(Ω)2.\Delta J = J(\tilde y,\tilde u)-J(y^k,u^k), \qquad \tau=\| \tilde u-u^k\|_{L^2(\Omega)}^2.
  1. Reject the trial if

ΔJ>ητ.\Delta J > -\eta \tau.

On rejection, the code restores the previous state and control, increases

εσε,\varepsilon \leftarrow \sigma \varepsilon,

logs the new value of ε\varepsilon, and repeats from the adjoint and Hamiltonian update step.

  1. Accept the trial otherwise. On acceptance, the code decreases

εζε,\varepsilon \leftarrow \zeta \varepsilon,

stores the cost, squared control distance, state distance, and accepted iterate, and emits the accepted-iteration callback.

  1. Stop when

τ<Tolerance.\tau < \texttt{Tolerance}.

The code records the histories of JJ, τ\tau, Δy\Delta y, and ε\varepsilon. If deal.II was configured with HDF5 support (DEAL_II_WITH_HDF5), these histories are written to sqh_log.h5.

The implementation follows the notation of Lecture 17: the control distance callback returns the L2L^2 norm, and SQH::optimize() squares it to form

τk=uk+1ukL2(Ω)2.\tau_k=\|u^{k+1}-u^k\|_{L^2(\Omega)}^2.

This squared quantity is used in the sufficient decrease test, the stopping test, and the HDF5 history dataset tau.


Mermaid Diagram of the Algorithm


Mermaid Diagram of the Code


The generic SQH class and the Poisson specialization

The generic SQH class stores only algorithmic state. It does not know what a PDE is. Its public interface consists of user-supplied callbacks:

This makes the SQH loop reusable for other problems, provided they can supply the same operations.

The Poisson specialization supplies those callbacks in its constructor. In particular:

The result is a compact implementation of the lecture-17 SQH idea:

global PDE solves for state and adjoint, local Hamiltonian optimization for the control, and adaptive regularization through accept/reject decisions.