Overview¶
In the previous lecture we discussed parameter estimation in elliptic problems at the continuous level. We now turn that theory into a concrete all-at-once finite element algorithm for the identification of the diffusion coefficient in Poisson’s equation.
The goal of this lecture is twofold:
derive the optimality system for a coefficient-identification problem with box constraints;
explain in detail the deal.II implementation in
codes/dealii/source/inverse_poisson_kkt.cc, together with the class declarationcodes/dealii/include/inverse_poisson_kkt.h, the drivercodes/dealii/execs/inverse_poisson_kkt.cc, and the default parameter filecodes/dealii/parameters/inverse_poisson_kkt.prm.
The computational strategy is not reduced optimization. Instead, we assemble and solve the Karush-Kuhn-Tucker system for the state, adjoint, and control together. Because the control enters the operator, the KKT system is nonlinear. The code therefore combines:
a one-shot KKT formulation;
a Newton linearization of the nonlinear optimality system;
a primal-dual active set strategy for the box constraints;
AffineConstraints<double>to freeze active control unknowns directly in the linearized solve.
Throughout the lecture, let be a bounded polygonal or polyhedral Lipschitz domain.
The Continuous Inverse Problem¶
We want to reconstruct a diffusion coefficient from measurements of the corresponding state . The forward model is
The coefficient is the control variable. We assume pointwise bounds
with
almost everywhere. The positivity assumption is essential because it keeps the state equation uniformly elliptic.
Given a desired state and a reference coefficient , we consider
subject to the PDE and the box constraints, where .
The admissible set is
The natural weak state equation is: find such that
If , then
is bounded on and coercive:
Hence Lax-Milgram gives a unique state .
Sensitivity with Respect to the Coefficient¶
Unlike distributed control, the control does not appear additively in the right-hand side. It multiplies the highest-order term. This is the key structural difference.
Let and perturb it by . Formally differentiating the weak state equation gives the sensitivity :
Thus the derivative is obtained by solving another coercive elliptic problem with right-hand side depending on the current state gradient.
This immediately shows why coefficient identification is more nonlinear than source control:
the state depends nonlinearly on the control;
the sensitivity equation itself depends on the current state;
the optimality system will contain products of state and adjoint gradients.
Lagrangian and First-Order Conditions¶
Introduce the Lagrangian
where is the adjoint variable.
The derivative with respect to gives back the state equation:
The derivative with respect to yields the adjoint equation:
The sign depends on the sign convention chosen in the Lagrangian. The code uses the equivalent convention
which is the same equation written with the right-hand side moved to the other side.
For the control variable, the Gâteaux derivative is
Therefore the gradient density is
The constrained first-order condition is the variational inequality
Equivalently, if the control space is interpreted with the inner product, one has the projection formula
This is the starting point for active-set methods.
Primal-Dual Active Set Interpretation¶
Define the multiplier-like quantity
Then the box-constrained optimality conditions can be written pointwise as
In a primal-dual active set method, one predicts the active sets through a shifted complementarity test. Given a parameter , define
Then the inactive set is
The active-set update is:
on , impose ;
on , impose ;
on , solve the stationarity equation together with state and adjoint equations.
For linear-quadratic box-constrained control this is equivalent to a semismooth Newton method. In our coefficient-identification problem, the state equation is already nonlinear in the control, so PDAS is combined with an additional Newton linearization of the KKT system.
One-Shot KKT Formulation¶
Let the unknown be the triple
The optimality system can be written as a nonlinear operator equation
where the three blocks are:
state residual,
adjoint residual,
control stationarity residual.
The one-shot philosophy means that we do not eliminate the state with a reduced map. We solve directly for all variables at once.
This has two computational consequences:
the assembled linear system is a coupled saddle-point-type Jacobian;
each Newton step simultaneously updates state, adjoint, and coefficient.
The price is a larger linear algebra problem. The benefit is that the code reflects the structure of the continuous optimality system almost literally.
Newton Linearization of the KKT System¶
Suppose is the current iterate. Newton’s method solves
and updates
Because the diffusion coefficient multiplies both and , the Jacobian contains the couplings
for the state block, and similarly
for the adjoint block.
The derivative of the stationarity equation produces
This explains why the matrix assembled in the code is not the same as the linear-quadratic KKT matrix from distributed Poisson control.
Discretization Used in the Code¶
The code uses a single FESystem with three scalar components:
state: continuous
FE_Q;adjoint: continuous
FE_Q;control: discontinuous
FE_DGQ.
The choice
is natural:
and need conformity;
the coefficient only appears as an or quantity in the variational equations;
a discontinuous coefficient space makes pointwise box projection simple and local.
The implementation creates exactly this structure in
create_fe_system() is implicit in the constructor and parameter parsing
through
FE_Q(state_degree), FE_Q(state_degree), and FE_DGQ(control_degree).
Homogeneous Dirichlet conditions for state and adjoint are imposed with
VectorTools::interpolate_boundary_values(...) on the corresponding
components in
setup_system().
Algorithm Structure¶
The overall solver is easier to understand from the following conceptual diagram than from the exact file layout.
This is not a literal copy of the source code. It is the mathematical logic implemented by the solver.
Mapping the Theory to the deal.II Class¶
The main class is
InversePoissonKKT.
Its public interface is intentionally small: the driver constructs the class,
reads the parameter file, and calls run().
The most important internal methods are:
setup_system(): distributes DoFs, renumbers by block, and builds the global constraint object for homogeneous Dirichlet data and hanging nodes.
initialize
_control _mass _matrix(): assembles the mass matrix on the control block and factorizes it. This is used to transform the control residual into an -type stationarity indicator. assemble_system(): assembles both the nonlinear KKT residual and its Jacobian at the current iterate.
compute
_stationarity(): extracts the control residual and applies the inverse control mass matrix to obtain the stationarity quantity used by PDAS. compute
_lower _active _set() and compute _upper _active _set(): implement the shifted active-set tests. make
_control _constraints(): converts the active-set prediction into AffineConstraints<double>by freezing active control degrees of freedom to lower or upper bounds.solve
_linearized _system(): condenses the matrix with the active constraints and solves the Newton system with SparseDirectUMFPACK.run(): orchestrates the full PDAS/Newton loop, damping, convergence logic, and output.
The Residual Assembled by the Code¶
In assemble_system() the code evaluates the current state, adjoint, and control at quadrature points and assembles three residual contributions on each cell.
For a test basis function in the state component, it inserts
For a test basis function in the adjoint component, it inserts
For a test basis function in the control component, it inserts
The sign in the last term reflects the sign convention chosen in the discrete Lagrangian. This is why the stationarity residual appears in the source as
regularization * (coefficient - reference) - state_gradient * adjoint_gradientrather than with a plus sign.
Why AffineConstraints<double> Are a Good Fit for PDAS¶
The key idea of the implementation is that an active set is not handled by a special projection after solving the linear system. Instead, it is inserted into the linear algebra structure itself.
If a control DoF is predicted active at the lower bound, the code adds the constraint
and similarly at the upper bound
This means the Newton correction already respects the active-set decision. The constrained linear system is built by:
copying the current Jacobian;
merging the PDE constraints with the active-set constraints;
condensing matrix and right-hand side;
solving for the correction;
redistributing the constrained values.
This is exactly the role of
make
For teaching purposes, this is an excellent example of how a nonsmooth optimization idea can be translated into standard finite element tools.
Damping and Stopping¶
Because the KKT system is nonlinear in the coefficient, a full Newton step may be too aggressive. The code therefore performs a simple residual-based damping strategy inside run():
compute the Newton update;
try step lengths ;
accept the first step that decreases the KKT residual norm;
otherwise keep the best trial found during backtracking.
This is not yet a polished globalization strategy, but it is mathematically honest and easy to inspect.
The solver stops in one of two situations:
the active set is unchanged and both the update norm and residual norm are below tolerance;
the active set is unchanged and no damped Newton step improves the residual.
The second condition is important in a teaching code: it prevents endless iterations when the current globalization strategy stalls.
Build and Run¶
From the deal.II directory:
cd codes/dealii
cmake -S . -B build
cmake --build build --target inverse_poisson_kkt_2d.g
./build/inverse_poisson_kkt_2d.g parameters/inverse_poisson_kkt.prmThe default parameter file is written for a two-dimensional manufactured solution. In particular:
the desired state is
sin(pi*x)*sin(pi*y);the exact diffusion coefficient is
1+x+y;the forcing term is chosen consistently with that exact pair;
the coefficient bounds are positive, ensuring ellipticity.
The output consists of:
one
.vtufile for each PDAS/Newton iteration;one
.pvdfile collecting the full nonlinear history;one final
.vtufile with the last iterate.
The VTU fields include:
state;adjoint;diffusion;projected desired state;
reference and exact coefficient;
lower and upper bounds;
lower-active, upper-active, and inactive indicators.
What to Learn from This Example¶
This laboratory shows a useful transition point in PDE-constrained optimization.
For distributed linear-quadratic control, the KKT system is already linear. For coefficient identification, three new ingredients appear simultaneously:
nonlinearity through the control-dependent operator;
nonsmoothness through box constraints;
mixed regularity, since the coefficient naturally lives in a discontinuous finite element space.
The deal.II implementation shows that these difficulties can still be handled within the same finite element framework, provided one is careful about:
the variational structure of the residual;
the Jacobian couplings;
the choice of discrete spaces;
the enforcement of active constraints.
In this sense, the code is not just an application of the previous lectures. It is a prototype for much more advanced inverse problems and PDE-constrained Newton methods.