Preconditioners

Today we are going to look at some important concepts behind matrix solvers, in particular preconditioners. They are often used as a black box in packages such as ABAQUS: we use the output while ignoring the mechanism behind it. I hope that by going through this blog post you will get a fair idea of the following broad areas:

  • What are condition numbers and why are they important.
  • Why there is a need to use Preconditioners.
  • Some basic Preconditioners and Solvers used in FEniCS.

Condition Number

The condition number measures the change in the dependent variables caused by a change in the independent variables.
Consider a system of equations, $[A][x]=[b]$. The condition number is defined as the ratio of the relative error in $[x]$ to the relative error in $[b]$: it shows how much a change in the elements of $[b]$ changes the solution of the system. If the condition number is high, a small change in the inputs produces a large change in the solution. Let us take an example and find the condition number of a matrix.

$[A]= \left[\begin{array}{lll} 3 & 0 & 2 \\ 2 & 0 & -2 \\ 0 & 1 & 1 \end{array}\right]$

$Adj(A)=\left[\begin{array}{lll} 2 & 2 & 0\\ -2 & 3 & 10 \\ 2 & -3 & 0 \end{array}\right]$

$A^{-1}=\frac{A d j(A)}{|A|}=\frac{1}{10}\left[\begin{array}{ccc} 2 & 2 & 0 \\ -2 & 3 & 10 \\ 2 & -3 & 0 \end{array}\right]$

Condition number of $[A] = \|A\|_{\infty}\left\|A^{-1}\right\|_{\infty}$, where the infinity norm is the maximum absolute row sum. Here $\|A\|_{\infty} = 3+0+2 = 5$ and $\|A^{-1}\|_{\infty} = \frac{1}{10}(2+3+10) = 1.5$.
$\therefore$ Condition number $= 5\times1.5= 7.5$
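This calculation can be checked with a few lines of NumPy (a sketch; `np.linalg.cond(A, np.inf)` returns the same value directly):

```python
import numpy as np

# The example matrix from the text.
A = np.array([[3.0, 0.0,  2.0],
              [2.0, 0.0, -2.0],
              [0.0, 1.0,  1.0]])

# Infinity-norm condition number: ||A||_inf * ||A^{-1}||_inf,
# where the infinity norm is the maximum absolute row sum.
cond_inf = np.linalg.norm(A, np.inf) * np.linalg.norm(np.linalg.inv(A), np.inf)
print(cond_inf)  # approximately 7.5
```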

Classification of a matrix on the basis of Condition Number

Based on Condition number, matrices are of two types.

$\text{Condition No.} = \begin{cases} \text{small} & \Rightarrow \text{well-conditioned matrix} \\ \text{large} & \Rightarrow \text{ill-conditioned matrix} \end{cases}$

  • For a well-conditioned matrix, the convergence rate is faster and the correct solution is easy to find, so fewer iterations are required.
  • For an ill-conditioned matrix, the convergence rate is slower and the correct solution is hard to find, so more iterations are required.

In the last example, the condition number is 7.5. This is fairly small, so the matrix is in fact well-conditioned; note that there is no hard threshold, and what counts as "large" depends on the problem and the working precision.

Preconditioners

Preconditioning means multiplying both sides of the equation by a matrix (here, on the left):

$$PAx=Pb$$

The matrix $P$ is called the preconditioner.
It conditions a given problem into a form that is more suitable for numerical solution methods: a good preconditioner reduces the condition number, which makes convergence faster.

Consider a linear system of equations:

$𝐴π‘₯=𝑏………….(a)$

Residual error at iteration $k$: $r_k=b-Ax_k$

Equation (a) can be rewritten as $Px=(P-A)x+b$

For successive iterations, $Px_{k+1}=(P-A)x_k+b$

The error at iteration $k$ is $e_k=x-x_k$, and likewise $e_{k+1}=x-x_{k+1}$.

Replacing $𝑒_π‘˜$ and $𝑒_{π‘˜+1}$ in the above equation, we get

$Pe_{k+1}=(P-A)e_k$

$e_{k+1}=(I-P^{-1}A)e_k \quad \ldots (b)$

$\Rightarrow$ $e_{k+1}=Me_k$, where $M=I-P^{-1}A$

$\therefore$ The convergence rate depends entirely on M.
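The iteration $Px_{k+1}=(P-A)x_k+b$ can be sketched directly in NumPy; the function name and the small test system below are illustrative, not from the text:

```python
import numpy as np

def stationary_iteration(A, b, P, tol=1e-10, maxiter=500):
    """Iterate P x_{k+1} = (P - A) x_k + b until the residual b - A x is small."""
    x = np.zeros_like(b, dtype=float)
    for k in range(1, maxiter + 1):
        x = np.linalg.solve(P, (P - A) @ x + b)
        if np.linalg.norm(b - A @ x) < tol:
            return x, k
    return x, maxiter

# Illustrative diagonally dominant system, for which Jacobi converges:
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, its = stationary_iteration(A, b, P=np.diag(np.diag(A)))  # Jacobi: P = D
```

With $P=A$ the iteration converges in one step; cheaper choices of $P$ trade iterations for cost per iteration.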

Requirements of Good Preconditioners

The preconditioner should satisfy certain requirements:

  • The preconditioner should be easy to form.
  • Convergence should be much faster such that the computational cost can be minimized.
  • For convergence to take place, the spectral radius $\rho(M)$ should be less than 1.
  • The spectral radius $[\rho(M)]$ of a square matrix is defined as the largest absolute eigenvalue among all its eigenvalues.
    $\Rightarrow$ $\rho(M)= |{\lambda_{max}(M)}|$

Let us discuss some of the basic preconditioners.

a) Jacobi Preconditioner

The Jacobi Preconditioner is defined by-

$$𝑃=π·π‘–π‘Žπ‘”π‘œπ‘›π‘Žl\ of\ [𝐴]$$

Let us understand how it is used with the help of an example.

Consider a matrix $[A]= \left[\begin{array}{lll} 1 & 2 & 3 \\ 4 & 6 & 2 \\ 3 & 5 & 6 \end{array}\right]$
P= Diagonal of $[A]= \left[\begin{array}{lll} 1 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 6 \end{array}\right]$
The coefficient matrix of the error equation (b),

$M = I-P^{-1}A$

So, $[M]= \left[\begin{array}{ccc} 0 & -2 & -3 \\ -0.667 & 0 & -0.333 \\ -0.5 & -0.833 & 0 \end{array}\right]$

Eigenvalues of $[A]= -0.1719, 2.1015, 11.0704$

Eigenvalues of $[M]= -2.024, 0.82, 1.20$

Since $\rho(M)=|{\lambda_{max}(M)}|=2.024>1$, the solution is not going to converge with the Jacobi preconditioner, although the largest eigenvalue magnitude has been reduced from 11.0704 (for $A$) to 2.024 (for $M$). So, we need to look for other preconditioners for convergence of this system.
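A quick NumPy check of this example (variable names are illustrative):

```python
import numpy as np

# The example matrix from the text.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 2.0],
              [3.0, 5.0, 6.0]])
P = np.diag(np.diag(A))                 # Jacobi preconditioner: diagonal of A
M = np.eye(3) - np.linalg.inv(P) @ A    # error-propagation matrix M = I - P^{-1} A
rho = max(abs(np.linalg.eigvals(M)))    # spectral radius of M
print(rho)  # about 2.02, i.e. > 1, so Jacobi diverges here
```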

b) Gauss-Seidel Preconditioner

The Gauss-Seidel preconditioner is defined by-

$$P=L+D$$
where $L$ = strictly lower-triangular part of $[A]$ and $D$ = diagonal of $[A]$

For an important class of matrices (Young's "consistently ordered" matrices, which include the standard finite-difference Poisson matrices), one Gauss-Seidel step is worth two Jacobi steps: the Gauss-Seidel eigenvalues are the squares of the Jacobi eigenvalues, so the spectral radius is squared and the number of iterations required for convergence is roughly halved.
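The "one Gauss-Seidel step is worth two Jacobi steps" relation can be checked numerically. A sketch on the 1-D Poisson matrix, which is consistently ordered so the relation holds exactly (the matrix and names are illustrative, not from the text):

```python
import numpy as np

# 1-D Poisson matrix: consistently ordered, so rho(GS) = rho(Jacobi)^2.
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
D = np.diag(np.diag(A))   # diagonal part
L = np.tril(A, -1)        # strictly lower-triangular part

def spectral_radius(P):
    # Spectral radius of the iteration matrix M = I - P^{-1} A.
    M = np.eye(len(A)) - np.linalg.inv(P) @ A
    return max(abs(np.linalg.eigvals(M)))

rho_jacobi = spectral_radius(D)      # ~0.7071
rho_gs     = spectral_radius(L + D)  # ~0.5 = 0.7071^2
```

Here $\rho_{Jacobi}=\cos(\pi/4)\approx 0.707$ and $\rho_{GS}=0.5=0.707^2$, confirming the squared relation for this matrix.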

c) Successive Over-relaxation

It is an extension of the Gauss-Seidel iteration with a relaxation factor $\omega$; taking $\omega = 1$ recovers Gauss-Seidel.
The preconditioner used is-

$$P = D+\omega L$$

As it is a triangular matrix, $x^{(k+1)}$ can be computed sequentially using forward substitution. The factor $\omega$ is chosen between 0 and 2 to minimize the spectral radius $\rho(M)$.

d) Incomplete LU factorization

The preconditioner used is-

$$P = (\text{approx. of } L)\times(\text{approx. of } U)$$
This preconditioner is based on LU factorization. During elimination, entries that are zero in the original matrix often become non-zero (called fill-in), making the factors denser and the solution more expensive. Incomplete LU drops some of this fill-in, and a minimum degree algorithm can be used to reorder the matrix and reduce fill-in further.
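SciPy exposes an incomplete LU factorization through `scipy.sparse.linalg.spilu`; a sketch of using it as a preconditioner for GMRES (the 1-D Poisson test matrix is illustrative, not from the text):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Illustrative sparse system: 1-D Poisson matrix in CSC format (as spilu expects).
n = 100
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

ilu = spla.spilu(A, drop_tol=1e-4)  # incomplete LU factors of A
# Wrap the ILU solve as a linear operator so GMRES can apply P^{-1} to vectors:
M = spla.LinearOperator((n, n), matvec=ilu.solve)

x, info = spla.gmres(A, b, M=M)     # info == 0 means GMRES converged
```

The `drop_tol` parameter controls how much fill-in is discarded: smaller values give a more accurate (but denser) preconditioner.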

e) Incomplete Cholesky factorization

The preconditioner used is-

$$P=\tilde{L}\tilde{L}^{T}\approx LL^{T}$$
where $\tilde{L}$ is a sparse lower-triangular approximation of the Cholesky factor $L$ of the original matrix.
Incomplete Cholesky factorization drops fill-in entries so that $\tilde{L}$ remains about as sparse as the lower triangle of the original matrix.

f) Multigrid Preconditioner

Multigrid Preconditioners remove both low and high-frequency errors. The steps involved in the Multigrid Method are-

  • Removal of high-frequency errors using the Jacobi or Gauss-Seidel method.
  • Projection of the remaining low-frequency errors onto coarser grids, where they appear as high-frequency errors.
  • Removal of these high-frequency errors on the coarser grid.
  • Interpolation of the correction back onto the fine grid.
Image Courtesy: [https://images.app.goo.gl/81CjXXPMD5GeMMLd8]
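The steps above can be sketched as a minimal two-grid cycle for a 1-D Poisson problem; the names `jacobi_smooth` and `two_grid` and the grid sizes are illustrative, not from the text:

```python
import numpy as np

def jacobi_smooth(A, x, b, sweeps=3, w=2.0/3.0):
    # Weighted Jacobi: damps the high-frequency components of the error.
    Dinv = 1.0 / np.diag(A)
    for _ in range(sweeps):
        x = x + w * Dinv * (b - A @ x)
    return x

def two_grid(A, b, x, Ac, R, T):
    x = jacobi_smooth(A, x, b)       # 1. pre-smooth: remove high frequencies
    rc = R @ (b - A @ x)             # 2. restrict the residual to the coarse grid
    ec = np.linalg.solve(Ac, rc)     # 3. solve the coarse-grid error equation
    x = x + T @ ec                   # 4. interpolate the correction back
    return jacobi_smooth(A, x, b)    # 5. post-smooth

# 1-D Poisson with 7 fine and 3 coarse interior points.
nf, nc = 7, 3
A = 2.0 * np.eye(nf) - np.eye(nf, k=1) - np.eye(nf, k=-1)
T = np.zeros((nf, nc))               # linear interpolation (prolongation)
for j in range(nc):
    i = 2 * j + 1                    # coarse point j sits at fine point i
    T[i, j], T[i - 1, j], T[i + 1, j] = 1.0, 0.5, 0.5
R = 0.5 * T.T                        # full-weighting restriction
Ac = R @ A @ T                       # Galerkin coarse-grid operator

b = np.ones(nf)
x = np.zeros(nf)
for _ in range(15):                  # a few cycles reach machine-level accuracy
    x = two_grid(A, b, x, Ac, R, T)
```

A full multigrid method applies this recursively, solving the coarse-grid equation with another two-grid cycle rather than directly.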

Common Preconditioners and Solvers used in FEniCS

preconditioners = (
    "amg",
    "default",
    "hypre_amg",
    "hypre_euclid",
    "hypre_parasails",
    "icc",
    "ilu",
    "jacobi",
    "none",
    "petsc_amg",
    "sor",
)

solvers = (
    "bicgstab",
    "cg",
    "default",
    "gmres",
    "minres",
    "mumps",
    "petsc",
    "richardson",
    "superlu",
    "tfqmr",
    "umfpack",
)

Among preconditioners, amg stands for Algebraic Multigrid, icc stands for Incomplete Cholesky factorization, ilu stands for Incomplete LU factorization, sor stands for Successive Over-relaxation.

Among solvers, bicgstab stands for the Biconjugate Gradient Stabilized method, cg for the Conjugate Gradient method, gmres for the Generalized Minimal Residual method, minres for the Minimal Residual method, and tfqmr for the Transpose-Free Quasi-Minimal Residual method; mumps, superlu, and umfpack are sparse direct solvers (MUMPS, SuperLU and UMFPACK).

FEniCS stands for Finite Element Computational Software, which is an open-source computing platform for solving partial differential equations. It enables users to quickly translate scientific models into efficient finite element codes.
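In legacy FEniCS (the dolfin Python interface), a solver/preconditioner pair from the tuples above can be requested through the `solver_parameters` argument of `solve`. A sketch for a simple Poisson problem, assuming the dolfin package is installed:

```python
from dolfin import (Constant, DirichletBC, Function, FunctionSpace,
                    TestFunction, TrialFunction, UnitSquareMesh,
                    dot, dx, grad, solve)

# Poisson problem -div(grad u) = 1 on the unit square with u = 0 on the boundary.
mesh = UnitSquareMesh(32, 32)
V = FunctionSpace(mesh, "P", 1)
u, v = TrialFunction(V), TestFunction(V)
a = dot(grad(u), grad(v)) * dx
L = Constant(1.0) * v * dx
bc = DirichletBC(V, Constant(0.0), "on_boundary")

uh = Function(V)
# Any solver/preconditioner pair from the tuples above can be requested here:
solve(a == L, uh, bc,
      solver_parameters={"linear_solver": "cg", "preconditioner": "amg"})
```

For a symmetric positive definite system like this one, cg with amg is a natural choice; for nonsymmetric systems gmres or bicgstab with ilu is common.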

Summary

  • If the condition number is small, the solution converges faster and is easier to compute.
  • For fast convergence, the spectral radius of the iteration matrix should be less than 1; the smaller it is, the faster the convergence.
  • The Multigrid preconditioner is often the most effective, as it removes both low- and high-frequency errors.
  • No single preconditioner can solve all types of problems. The choice of a particular preconditioner depends on the type of problem, its documentation, and the user's experience.
  • We should have a good knowledge and understanding of the underlying mathematical principles associated with each method to get the desired solution in a cost-effective and timely manner.
