Preconditioned Krylov subspace method in Feel++
We wish to solve a linear system of the form:
where \(A\) is a sparse large matrix, \(x\) is the solution, and \(F\) is the right-hand side.
Our objective is to solve a linear system \(Ax = F\) with an iterative process efficiently, a direct method is not suitable because of the size of the matrix. We will use a Krylov subspace method with a preconditioner to accelerate the convergence.
Given \(x^0\) the initial guess, we wish to compute \(x^{1},x^2, \ldots x^n,x^{n+1}, \ldots\) through the iterative process.
1. Krylov subspace method
The Krylov subspace methods finds such an approximate solution in a space which the dimension increase at each step of the method. It is based on the Arnoldi process which generates an orthogonal basis of the Krylov subspace.
where \(r^k\) is the residual of the system at the step \(k\).
There are many Krylov subspace methods in the literature, here are some of them:
- Cg (Conjugate Gradient)
-
for symmetric positive definite matrices
- Gmres (Generalized Minimal Residual)
-
for non-symmetric matrices
- Bicg (BiConjugate Gradient)
-
for non-symmetric matrices
- Fgmres (Flexible Generalized Minimal Residual)
-
for non-symmetric matrices enabling the use of preconditioners changing at each iteration
- Minres (Minimal Residual)
-
for symmetric indefinite matrices
Their convergence rate depend of condition number of \(A\), it is highly recommended to precondition the system to accelerate the convergence.
From now on, we use preconditioned Krylov subspace method to solve the system \(Ax=F\). We transform the original system into an equivalent system which has a better conditioning:
- Left preconditioning
-
denote \(M_L^{-1}\) the left preconditioner, we solve
\[M_L^{-1} A x = M_L^{-1} F\] - Right preconditioning
-
denote \(M_R^{-1}\) the right preconditioner, we solve
\[A M_R^{-1} y = F, \quad x=M_R^{-1} y\] - Left and right preconditioning
-
denote \(M_L^{-1}\) and \(M_R^{-1}\) the left and right preconditioners, we solve
\[M_L^{-1} A M_R^{-1}, \quad y = M_L^{-1} F \quad x=M_R^{-1} y\]
We now describe the different preconditioners \(M\) that can be used in Feel++.
2. LU
First, we start with the LU factorization of the matrix \(A\) to solve the system \(Ax=F\).
We compute the factorisation \(A=LU\)
-
Basic algo: step \(k\) of \(LU\) factorization \(a_{kk}\) pivot
-
For \(i > k\): \(l_{ik} = a_{ik} / a_{kk}\)
-
For \(i > k, j > k\): \(u_{ij} = a_{ij} - l_{ik} a_{kj}\)
Note that
-
Algorithm complexity (\(A\) is sparse matrix \(n \times n\) with bandwidth \(b\)): \(\mathcal{O}(n b^2)\) in time, \(\mathcal{O}(n b)\) in space.
-
Ordering routine (to reduce fill) to be used in the LU factorization \(LU = PAQ\):
The options set in Feel++ are in fact PETSc options. Here is how to set the LU factorization options.
- Matrix ordering
-
--pc_factor_mat_ordering_type=nd [natural, nd, 1wd, rcm, qmd, rowlength]
- Factorization package
-
mumps, petsc, pastix, umfpack, superlu, …
--pc-factor-mat-solver-package-type=mumps
MUMPS and Pastix are parallel factorization packages. |
mumps is the default factorization package in Feel++.
|
To use a direct solver, you can disable the iterative solver and use a direct solver
--ksp-type=preonly
--pc-type=lu
If you do not disable the iterative solver, you should expect that the iterative solver will take 1 iteration to solve the system. |
3. ILU
An alternative to the LU factorization is the Incomplete LU factorization. ILU factorizations provide approximations of the LU factorization. Various ILU factorizations are available in Feel++:
- ILU(k)
-
approximate factorization
\[\forall {i,j > k} : \mathrm{if} (i,j)\in S a_{ij} \leftarrow a_{ij} - a_{ik} a_{kk}^{-1} a_{kj}\] - ILUT
-
only fill in zero locations if \(a_{ik} a_{kk}^{-1} a_{kj}\) large enough
--pc-factor-levels=3 --pc-factor-fill=6
The options indicate the amount of fill you expect in the factored matrix (fill = number nonzeros in factor/number nonzeros in original matrix).
To select an incomplete factorization, you have to first select the factorization package using --pc-factor-mat-solver-package-type=…
Factorization package |
Option |
Comment |
PETSc |
|
PETSc ILU factorization, not parallel |
HYPRE |
|
HYPRE ILU factorization, works in sequential (ILU(k) and ILUT) and parallel (ILU(k)) |
HYPRE |
|
HYPRE ILUT factorization, works in sequential and parallel |
the table needs to be checked and updated. |
4. Relaxation methods
An alternative to the LU, ILU factorization is the relaxation methods. The principle is to split \(A\) into lower, diagonal, upper parts: \(A = L +D + U\) and to solve the system \(Ax=F\) with the following iterative process:
- Jacobi
-
--pc-type=jacobi
\[P^{-1} = D^{-1}\] - Successive over-relaxation (SOR)
-
--pc-type=sor
\[P = (1/\omega)D + U forward, P = backward x^{k+1} = -(D+\omega L)^{-1} ((\omega U + (1-\omega) D)) x^{k} + \omega (D +\omega L)^{-1} F\]
The SOR method is a generalization of the Jacobi method and has various options:
--pc-sor-type=local_symmetric [symmetric, forward, backward, local_symmetric, local_forward, local_backward]
--pc-sor-omega=1 : omega in ]0,2[. if omega=1 => Gauss-Seidel
--pc-sor-lits=1 : the number of smoothing sweeps on a process before doing a ghost point update from the other processes
--pc-sor-its=1
The total number of SOR sweeps is given by lits*its
.
5. Domain decomposition
- Additive Schwarz with overlap
where \(R_i,\tilde{R}_i,\hat{R}_i\) are restriction operators.
--pc-type=gasm
--pc-gasm-overlap=2 [size of extended local subdomains]
--pc-gasm-type=restrict [basic, restrict, interpolate, none]
--sub-pc-type=lu [preconditioner used for A_i^{-1}]
--sub-pc-factor-mat-solver-package-type=mumps
--pc-type=gasm
--pc-gasm-overlap=2 # [size of extended local subdomains]
--pc-gasm-type=restrict # [basic, restrict, interpolate, none]
--sub-pc-type=ilu # [preconditioner used for A_i^{-1}]
--sub-pc-factor-levels=3
--sub-pc-factor-fill=6
6. Algebraic multigrid
-
Basic multigrid with two levels
-
Solve on fine grid: \(A_1 u_1 = F_1\)
-
Compute residual: \(r_1 = F_1 - A_1 u_1\)
-
Restrict \(r_1\) to the coarse grid: \(r_{0}= R r_1\)
-
Solve on coarse grid: \(A_{0} e_{0}=r_{0}\)
-
Interpolate: \(e_{1}=I e_{0}\)
-
Solve on fine grid by starting with \(u_1 = u_1 + e_1\)
boomeramg
only --pc-mg-levels is taken into account. A lot of options can be given with PETSc options (see doc or code). Seams to be very efficient but strange behavior with fine meshes (convergence saturation)#
|
- MultiGrid options
-
here is an example of options to use with multigrid:
--pc-type=ml [ml,gamg,boomeramg] --pc-mg-levels=10 --pc-mg-type=multiplicative [multiplicative, additive, full, kaskade]
- Coarse options
--mg-coarse.pc-type=redundant
- All fine levels options (not including coarse)
--mg-levels.pc-type=sor
- Last fine levels options
--mg-fine-level.pc-type=sor
- Specific levels options (up to 5)
--mg-levels3.pc-type=sor
7. FieldSplit
- IndexSplit
-
computed automatically with composite space, block matrix, and block graph. User can redefine split definition (union of split):
--fieldsplit-fields=0->(1),1->(2,0)
--fieldsplit-type=additive [additive, multiplicative, symmetric-multiplicative, schur]
- Block Jacobi (additive)
-
filedsplit additive
\[\left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & A_{11}^{-1} \end{array} \right)\] - Block Gauss-Seidel (multiplicative)
-
here is the block Gauss-Seidel (multiplicative) preconditioner:
\[\left( \begin{array}{cc} I & 0 \\ 0 & A_{11}^{-1} \end{array} \right) \left( \begin{array}{cc} I & 0 \\ -A_{10} & I \end{array} \right) \left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & I \end{array} \right)\] - Symmetric Block Gauss-Seidel (symmetric-multiplicative)
-
\[\left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & I \end{array} \right) \left( \begin{array}{cc} I & -A_{01} \\ 0 & I \end{array} \right) \left( \begin{array}{cc} A_{00} & 0 \\ 0 & A_{11}^{-1} \end{array} \right) \left( \begin{array}{cc} I & 0 \\ -A_{10} & I \end{array} \right) \left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & I \end{array} \right)\]
--fieldsplit-0.pc-type=gasm
--fieldsplit-0.sub-pc-type=lu
--fieldsplit-1.pc-type=boomeramg
--fieldsplit-1.ksp-type=gmres
--fieldsplit-fields=0->(0,2),1->(1)
--fieldsplit-0.pc-type=fieldsplit
--fieldsplit-0.fieldsplit-fields=0->(0),1->(2)
8. Schur complement
First we compute the \(LDU\) factorization of the block matrix:
where the \(S\) is called the Schur complement of \(A_{00}\) and is equal to
the LDU factorization works only with 2 fields! |
We can now write the inverse of the matrix A:
--fieldsplit-type=schur
--fieldsplit-schur-fact-type=full [diag, lower, upper, full]
Preconditioner |
Mathematical object |
Comment |
|
\(\tilde{D}^{-1}\) |
positive definite, suitable for MINRES |
|
\((LD)^{-1}\) |
suitable for left preconditioning |
|
\((DU)^{-1}\) |
suitable for right preconditioning |
|
\(U^{-1} (LD)^{-1} = (DU)^{-1} L^{-1}\) |
an exact solve if applied exactly, needs one extra solve with A |
By default, all \(A_{00}^{-1}\) approximation use the same Krylov Subspace (KSP) however you can change it with the following options:
- Inner solver in the Schur complement
-
recal that \(S = A_{11} - A_{10} A_{00}^{-1} A_{01}\)
--fieldsplit-schur-inner-solver.use-outer-solver=false --fieldsplit-schur-inner-solver.pc-type=jacobi --fieldsplit-schur-inner-solver.ksp-type=preonly
- Upper solver
-
in full Schur factorisation:
\[A^{-1} = \left( \begin{array}{cc} I & -A_{00}^{-1}A_{01} \\ 0 & I \end{array} \right) \left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & S^{-1} \end{array} \right) \left( \begin{array}{cc} I & 0 \\ -A_{10}A_{00}^{-1} & I \end{array} \right)\]
--fieldsplit-schur-upper-solver.use-outer-solver=false
--fieldsplit-schur-upper-solver.pc-type=jacobi
--fieldsplit-schur-upper-solver.ksp-type=preonly
Here are some preconditioning strategies for the Schur complement:
- SIMPLE
-
Semi-Implicit Method for Pressure-Linked Equations (Patankar and Spalding 1972)
\[P_\text{SIMPLE} = \left( \begin{array}{cc} A_{00} & 0 \\ A_{10} & \tilde{S} \end{array} \right) \left( \begin{array}{cc} I & D_{00}^{-1} A_{01} \\ 0 & I \end{array} \right)\]with \(\tilde{S} = A_{10}D_{00}^{-1}A_{01}\)
--fieldsplit-schur-precondition=user
--fieldsplit-schur-upper-solver.use-outer-solver=false
--fieldsplit-1.pc-type=lu
--fieldsplit-1.ksp-type=gmres
--fieldsplit-1.ksp-maxit=10
--fieldsplit-1.rtol=1e-3
Variants: SIMPLER (Patankar - 1980), SIMPLEC (Van Doormaal and Raithby - 1984), SIMPLEST (Spalding - 1989) |
- Approximate \(S^{-1}\) with preconditioner
PCLSC
-
\(S^{-1} \approx (A_{10}A_{01})^{-1} A_{10} A_{00} A_{01} (A_{10}A_{01})^{-1}\)
--fieldsplit-schur-precondition=self --fieldsplit-1.pc-type=lsc --fieldsplit-1.ksp-type=gmres --fieldsplit-1.ksp-maxit=10 --fieldsplit-1.ksp-rtol=1e-3 --fieldsplit-1.lsc.ksp-type=gmres --fieldsplit-1.lsc.pc-type=boomeramg --fieldsplit-1.lsc.ksp-rtol=1e-4
- Other preconditioners
-
Yosida, PCD (Pressure Convection Diffusion), aSIMPLE, aPCD,…