Preconditioned Krylov subspace method in Feel++
We wish to solve a linear system of the form:
where \(A\) is a sparse large matrix, \(x\) is the solution, and \(F\) is the right-hand side.
Our objective is to solve a linear system \(Ax = F\) with an iterative process efficiently, a direct method is not suitable because of the size of the matrix. We will use a Krylov subspace method with a preconditioner to accelerate the convergence.
Given \(x^0\) the initial guess, we wish to compute \(x^{1},x^2, \ldots x^n,x^{n+1}, \ldots\) through the iterative process.
1. LU
First, we start with the LU factorization of the matrix \(A\) to solve the system \(Ax=F\).
We compute the factorisation \(A=LU\)
-
Basic algo: step \(k\) of \(LU\) factorization \(a_{kk}\) pivot
-
For \(i > k\): \(l_{ik} = a_{ik} / a_{kk}\)
-
For \(i > k, j > k\): \(u_{ij} = a_{ij} - l_{ik} a_{kj}\)
Note that
-
Algorithm complexity (\(A\) is sparse matrix \(n \times n\) with bandwidth \(b\)): \(\mathcal{O}(n b^2)\) in time, \(\mathcal{O}(n b)\) in space.
-
Ordering routine (to reduce fill) to be used in the LU factorization \(LU = PAQ\):
The options set in Feel++ are in fact PETSc options. Here is how to set the LU factorization options.
- Matrix ordering
-
--pc_factor_mat_ordering_type=nd [natural, nd, 1wd, rcm, qmd, rowlength]
- Factorization package
-
mumps, petsc, pastix, umfpack, superlu, …
--pc-factor-mat-solver-package-type=mumps
MUMPS and Pastix are parallel factorization packages. |
mumps is the default factorization package in Feel++.
|
To use a direct solver, you can disable the iterative solver and use a direct solver
--ksp-type=preonly
--pc-type=lu
If you do not disable the iterative solver, you should expect that the iterative solver will take 1 iteration to solve the system. |
2. ILU
An alternative to the LU factorization is the Incomplete LU factorization. ILU factorizations provide approximations of the LU factorization. Various ILU factorizations are available in Feel++:
- ILU(k)
-
approximate factorization
\[\forall {i,j > k} : \mathrm{if} (i,j)\in S a_{ij} \leftarrow a_{ij} - a_{ik} a_{kk}^{-1} a_{kj}\] - ILUT
-
only fill in zero locations if \(a_{ik} a_{kk}^{-1} a_{kj}\) large enough
--pc-factor-levels=3 --pc-factor-fill=6
The options indicate the amount of fill you expect in the factored matrix (fill = number nonzeros in factor/number nonzeros in original matrix).
To select an incomplete factorization, you have to first select the factorization package using --pc-factor-mat-solver-package-type=…
Factorization package |
Option |
Comment |
PETSc |
|
PETSc ILU factorization, not parallel |
HYPRE |
|
HYPRE ILU factorization, works in sequential (ILU(k) and ILUT) and parallel (ILU(k)) |
HYPRE |
|
HYPRE ILUT factorization, works in sequential and parallel |
the table needs to be checked and updated. |
3. Relaxation methods
An alternative to the LU, ILU factorization is the relaxation methods. The principle is to split \(A\) into lower, diagonal, upper parts: \(A = L +D + U\) and to solve the system \(Ax=F\) with the following iterative process:
- Jacobi
-
--pc-type=jacobi
\[P^{-1} = D^{-1}\] - Successive over-relaxation (SOR)
-
--pc-type=sor
\[P = (1/\omega)D + U forward, P = backward x^{k+1} = -(D+\omega L)^{-1} ((\omega U + (1-\omega) D)) x^{k} + \omega (D +\omega L)^{-1} F\]
The SOR method is a generalization of the Jacobi method and has various options:
--pc-sor-type=local_symmetric [symmetric, forward, backward, local_symmetric, local_forward, local_backward]
--pc-sor-omega=1 : omega in ]0,2[. if omega=1 => Gauss-Seidel
--pc-sor-lits=1 : the number of smoothing sweeps on a process before doing a ghost point update from the other processes
--pc-sor-its=1
The total number of SOR sweeps is given by lits*its
.
4. Domain decomposition
- Additive Schwarz with overlap
where \(R_i,\tilde{R}_i,\hat{R}_i\) are restriction operators.
--pc-type=gasm
--pc-gasm-overlap=2 [size of extended local subdomains]
--pc-gasm-type=restrict [basic, restrict, interpolate, none]
--sub-pc-type=lu [preconditioner used for A_i^{-1}]
--sub-pc-factor-mat-solver-package-type=mumps
--pc-type=gasm
--pc-gasm-overlap=2 # [size of extended local subdomains]
--pc-gasm-type=restrict # [basic, restrict, interpolate, none]
--sub-pc-type=ilu # [preconditioner used for A_i^{-1}]
--sub-pc-factor-levels=3
--sub-pc-factor-fill=6
5. Algebraic multigrid
-
Basic multigrid with two levels
-
Solve on fine grid: \(A_1 u_1 = F_1\)
-
Compute residual: \(r_1 = F_1 - A_1 u_1\)
-
Restrict \(r_1\) to the coarse grid: \(r_{0}= R r_1\)
-
Solve on coarse grid: \(A_{0} e_{0}=r_{0}\)
-
Interpolate: \(e_{1}=I e_{0}\)
-
Solve on fine grid by starting with \(u_1 = u_1 + e_1\)
boomeramg
only --pc-mg-levels is taken into account. A lot of options can be given with PETSc options (see doc or code). Seams to be very efficient but strange behavior with fine meshes (convergence saturation)#
|
- MultiGrid options
-
here is an example of options to use with multigrid:
--pc-type=ml [ml,gamg,boomeramg] --pc-mg-levels=10 --pc-mg-type=multiplicative [multiplicative, additive, full, kaskade]
- Coarse options
--mg-coarse.pc-type=redundant
- All fine levels options (not including coarse)
--mg-levels.pc-type=sor
- Last fine levels options
--mg-fine-level.pc-type=sor
- Specific levels options (up to 5)
--mg-levels3.pc-type=sor
6. FieldSplit
- IndexSplit
-
computed automatically with composite space, block matrix, and block graph. User can redefine split definition (union of split):
--fieldsplit-fields=0->(1),1->(2,0)
--fieldsplit-type=additive [additive, multiplicative, symmetric-multiplicative, schur]
- Block Jacobi (additive)
-
filedsplit additive
\[\left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & A_{11}^{-1} \end{array} \right)\] - Block Gauss-Seidel (multiplicative)
-
here is the block Gauss-Seidel (multiplicative) preconditioner:
\[\left( \begin{array}{cc} I & 0 \\ 0 & A_{11}^{-1} \end{array} \right) \left( \begin{array}{cc} I & 0 \\ -A_{10} & I \end{array} \right) \left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & I \end{array} \right)\] - Symmetric Block Gauss-Seidel (symmetric-multiplicative)
-
\[\left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & I \end{array} \right) \left( \begin{array}{cc} I & -A_{01} \\ 0 & I \end{array} \right) \left( \begin{array}{cc} A_{00} & 0 \\ 0 & A_{11}^{-1} \end{array} \right) \left( \begin{array}{cc} I & 0 \\ -A_{10} & I \end{array} \right) \left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & I \end{array} \right)\]
--fieldsplit-0.pc-type=gasm
--fieldsplit-0.sub-pc-type=lu
--fieldsplit-1.pc-type=boomeramg
--fieldsplit-1.ksp-type=gmres
--fieldsplit-fields=0->(0,2),1->(1)
--fieldsplit-0.pc-type=fieldsplit
--fieldsplit-0.fieldsplit-fields=0->(0),1->(2)
7. Schur complement
First we compute the \(LDU\) factorization of the block matrix:
where the \(S\) is called the Schur complement of \(A_{00}\) and is equal to
the LDU factorization works only with 2 fields! |
We can now write the inverse of the matrix A:
--fieldsplit-type=schur
--fieldsplit-schur-fact-type=full [diag, lower, upper, full]
Preconditioner |
Mathematical object |
Comment |
|
\(\tilde{D}^{-1}\) |
positive definite, suitable for MINRES |
|
\((LD)^{-1}\) |
suitable for left preconditioning |
|
\((DU)^{-1}\) |
suitable for right preconditioning |
|
\(U^{-1} (LD)^{-1} = (DU)^{-1} L^{-1}\) |
an exact solve if applied exactly, needs one extra solve with A |
By default, all \(A_{00}^{-1}\) approximation use the same Krylov Subspace (KSP) however you can change it with the following options:
- Inner solver in the Schur complement
-
recal that \(S = A_{11} - A_{10} A_{00}^{-1} A_{01}\)
--fieldsplit-schur-inner-solver.use-outer-solver=false --fieldsplit-schur-inner-solver.pc-type=jacobi --fieldsplit-schur-inner-solver.ksp-type=preonly
- Upper solver
-
in full Schur factorisation:
\[A^{-1} = \left( \begin{array}{cc} I & -A_{00}^{-1}A_{01} \\ 0 & I \end{array} \right) \left( \begin{array}{cc} A_{00}^{-1} & 0 \\ 0 & S^{-1} \end{array} \right) \left( \begin{array}{cc} I & 0 \\ -A_{10}A_{00}^{-1} & I \end{array} \right)\]
--fieldsplit-schur-upper-solver.use-outer-solver=false
--fieldsplit-schur-upper-solver.pc-type=jacobi
--fieldsplit-schur-upper-solver.ksp-type=preonly
Here are some preconditioning strategies for the Schur complement:
- SIMPLE
-
Semi-Implicit Method for Pressure-Linked Equations (Patankar and Spalding 1972)
\[P_\text{SIMPLE} = \left( \begin{array}{cc} A_{00} & 0 \\ A_{10} & \tilde{S} \end{array} \right) \left( \begin{array}{cc} I & D_{00}^{-1} A_{01} \\ 0 & I \end{array} \right)\]with \(\tilde{S} = A_{10}D_{00}^{-1}A_{01}\)
--fieldsplit-schur-precondition=user
--fieldsplit-schur-upper-solver.use-outer-solver=false
--fieldsplit-1.pc-type=lu
--fieldsplit-1.ksp-type=gmres
--fieldsplit-1.ksp-maxit=10
--fieldsplit-1.rtol=1e-3
Variants: SIMPLER (Patankar - 1980), SIMPLEC (Van Doormaal and Raithby - 1984), SIMPLEST (Spalding - 1989) |
- Approximate \(S^{-1}\) with preconditioner
PCLSC
-
\(S^{-1} \approx (A_{10}A_{01})^{-1} A_{10} A_{00} A_{01} (A_{10}A_{01})^{-1}\)
--fieldsplit-schur-precondition=self --fieldsplit-1.pc-type=lsc --fieldsplit-1.ksp-type=gmres --fieldsplit-1.ksp-maxit=10 --fieldsplit-1.ksp-rtol=1e-3 --fieldsplit-1.lsc.ksp-type=gmres --fieldsplit-1.lsc.pc-type=boomeramg --fieldsplit-1.lsc.ksp-rtol=1e-4
- Other preconditioners
-
Yosida, PCD (Pressure Convection Diffusion), aSIMPLE, aPCD,…