Iterative methods

1 Iterative methods

Suppose we have the system of equations

$A X = B .$

The aim here is to find a sequence of approximations which gradually approach $X$ . We will denote these approximations

$X^{(0)}, X^{(1)}, X^{(2)}, \dots, X^{(k)}, \dots$

where $X^{(0)}$ is our initial “guess", and the hope is that after a short while these successive iterates will be so close to each other that the process can be deemed to have converged to the required solution $X$ .

Key Point 10

An iterative method is one in which a sequence of approximations (or iterates ) is produced. The method is successful if these iterates converge to the true solution of the given problem.

It is convenient to split the matrix $A$ into three parts. We write

$A = L + D + U$

where $L$ consists of the elements of $A$ strictly below the diagonal and zeros elsewhere; $D$ is a diagonal matrix consisting of the diagonal entries of $A$ ; and $U$ consists of the elements of $A$ strictly above the diagonal. Note that $L$ and $U$ here are not the same matrices as appeared in the $L U$ decomposition! The current $L$ and $U$ are much easier to find.

For example

$\begin{matrix} \underset{⏟}{[\begin{matrix} 3 & - 4 \\ 2 & 1 \end{matrix}]} & = & \underset{⏟}{[\begin{matrix} 0 & 0 \\ 2 & 0 \end{matrix}]} & + & \underset{⏟}{[\begin{matrix} 3 & 0 \\ 0 & 1 \end{matrix}]} & + & \underset{⏟}{[\begin{matrix} 0 & - 4 \\ 0 & 0 \end{matrix}]} \\ ↑ & ↑ & ↑ & ↑ \\ A & = & L & + & D & + & U \end{matrix}$

and

$\begin{matrix} \underset{⏟}{[\begin{matrix} 2 & - 6 & 1 \\ 3 & - 2 & 0 \\ 4 & - 1 & 7 \end{matrix}]} & = & \underset{⏟}{[\begin{matrix} 0 & 0 & 0 \\ 3 & 0 & 0 \\ 4 & - 1 & 0 \end{matrix}]} & + & \underset{⏟}{[\begin{matrix} 2 & 0 & 0 \\ 0 & - 2 & 0 \\ 0 & 0 & 7 \end{matrix}]} & + & \underset{⏟}{[\begin{matrix} 0 & - 6 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}]} \\ ↑ & ↑ & ↑ & ↑ \\ A & = & L & + & D & + & U \end{matrix}$ and, more generally for $3 \times 3$ matrices

$\begin{matrix} \underset{⏟}{[\begin{matrix} ∙ & ∙ & ∙ \\ ∙ & ∙ & ∙ \\ ∙ & ∙ & ∙ \end{matrix}]} & = & \underset{⏟}{[\begin{matrix} 0 & 0 & 0 \\ ∙ & 0 & 0 \\ ∙ & ∙ & 0 \end{matrix}]} & + & \underset{⏟}{[\begin{matrix} ∙ & 0 & 0 \\ 0 & ∙ & 0 \\ 0 & 0 & ∙ \end{matrix}]} & + & \underset{⏟}{[\begin{matrix} 0 & ∙ & ∙ \\ 0 & 0 & ∙ \\ 0 & 0 & 0 \end{matrix}]} . \\ ↑ & ↑ & ↑ & ↑ \\ A & = & L & + & D & + & U . \end{matrix}$

1.1 The Jacobi iteration

The simplest iterative method is called Jacobi iteration and the basic idea is to use the $A = L + D + U$ partitioning of $A$ to write $A X = B$ in the form

$D X = - (L + U) X + B .$

We use this equation as the motivation to define the iterative process

$D X^{(k + 1)} = - (L + U) X^{(k)} + B$

which gives $X^{(k + 1)}$ as long as $D$ has no zeros down its diagonal, that is as long as $D$ is invertible. This is Jacobi iteration.

Key Point 11

The Jacobi iteration for approximating the solution of $A X = B$ where $A = L + D + U$ is given by

$X^{(k + 1)} = - D^{- 1} (L + U) X^{(k)} + D^{- 1} B$

Example 18

Use the Jacobi iteration to approximate the solution $X = [\begin{matrix} x_{1} \\ x_{2} \\ x_{3} \end{matrix}]$ of $[\begin{matrix} 8 & 2 & 4 \\ 3 & 5 & 1 \\ 2 & 1 & 4 \end{matrix}] [\begin{matrix} x_{1} \\ x_{2} \\ x_{3} \end{matrix}] = [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}] .$

Use the initial guess $X^{(0)} = [\begin{matrix} 0 \\ 0 \\ 0 \end{matrix}]$ .

Solution

In this case $D = [\begin{matrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{matrix}]$ and $L + U = [\begin{matrix} 0 & 2 & 4 \\ 3 & 0 & 1 \\ 2 & 1 & 0 \end{matrix}]$ .

First iteration .

The first iteration is $D X^{(1)} = - (L + U) X^{(0)} + B$ , or in full

$[\begin{matrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(1)} \\ x_{2}^{(1)} \\ x_{3}^{(1)} \end{matrix}] = [\begin{matrix} 0 & - 2 & - 4 \\ - 3 & 0 & - 1 \\ - 2 & - 1 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(0)} \\ x_{2}^{(0)} \\ x_{3}^{(0)} \end{matrix}] + [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}] = [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}],$

since the initial guess was $x_{1}^{(0)} = x_{2}^{(0)} = x_{3}^{(0)} = 0$ .

Taking this information row by row we see that

\begin{array}{rcl} 8 x_{1}^{(1)} & = & - 16 ∴ x_{1}^{(1)} = - 2 \\ 5 x_{2}^{(1)} & = & 4 ∴ x_{2}^{(1)} = 0.8 \\ 4 x_{3}^{(1)} & = & - 12 ∴ x_{3}^{(1)} = - 3 \end{array}

Thus the first Jacobi iteration gives us $X^{(1)} = [\begin{matrix} x_{1}^{(1)} \\ x_{2}^{(1)} \\ x_{3}^{(1)} \end{matrix}] = [\begin{matrix} - 2 \\ 0.8 \\ - 3 \end{matrix}]$ as an approximation to $X$ .

Second iteration .

The second iteration is $D X^{(2)} = - (L + U) X^{(1)} + B$ , or in full

$[\begin{matrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(2)} \\ x_{2}^{(2)} \\ x_{3}^{(2)} \end{matrix}] = [\begin{matrix} 0 & - 2 & - 4 \\ - 3 & 0 & - 1 \\ - 2 & - 1 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(1)} \\ x_{2}^{(1)} \\ x_{3}^{(1)} \end{matrix}] + [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}] .$

Taking this information row by row we see that

\begin{array}{rcl} 8 x_{1}^{(2)} & = & - 2 x_{2}^{(1)} - 4 x_{3}^{(1)} - 16 = - 2 (0.8) - 4 (- 3) - 16 = - 5.6 ∴ x_{1}^{(2)} = - 0.7 \\ 5 x_{2}^{(2)} & = & - 3 x_{1}^{(1)} - x_{3}^{(1)} + 4 = - 3 (- 2) - (- 3) + 4 = 13 ∴ x_{2}^{(2)} = 2.6 \\ 4 x_{3}^{(2)} & = & - 2 x_{1}^{(1)} - x_{2}^{(1)} - 12 = - 2 (- 2) - 0.8 - 12 = - 8.8 ∴ x_{3}^{(2)} = - 2.2 \end{array}

Therefore the second iterate approximating $X$ is $X^{(2)} = [\begin{matrix} x_{1}^{(2)} \\ x_{2}^{(2)} \\ x_{3}^{(2)} \end{matrix}] = [\begin{matrix} - 0.7 \\ 2.6 \\ - 2.2 \end{matrix}]$ .

Third iteration .

The third iteration is $D X^{(3)} = - (L + U) X^{(2)} + B$ , or in full

$[\begin{matrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(3)} \\ x_{2}^{(3)} \\ x_{3}^{(3)} \end{matrix}] = [\begin{matrix} 0 & - 2 & - 4 \\ - 3 & 0 & - 1 \\ - 2 & - 1 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(2)} \\ x_{2}^{(2)} \\ x_{3}^{(2)} \end{matrix}] + [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}]$

Taking this information row by row we see that

\begin{array}{rcl} 8 x_{1}^{(3)} & = & - 2 x_{2}^{(2)} - 4 x_{3}^{(2)} - 16 = - 2 (2.6) - 4 (- 2.2) - 16 = - 12.4 ∴ x_{1}^{(3)} = - 1.55 \\ 5 x_{2}^{(3)} & = & - 3 x_{1}^{(2)} - x_{3}^{(2)} + 4 = - 3 (- 0.7) - (2.2) + 4 = 8.3 ∴ x_{2}^{(3)} = 1.66 \\ 4 x_{3}^{(3)} & = & - 2 x_{1}^{(2)} - x_{2}^{(2)} - 12 = - 2 (- 0.7) - 2.6 - 12 = - 13.2 ∴ x_{3}^{(3)} = - 3.3 \end{array}

Therefore the third iterate approximating $X$ is $X^{(3)} = [\begin{matrix} x_{1}^{(3)} \\ x_{2}^{(3)} \\ x_{3}^{(3)} \end{matrix}] = [\begin{matrix} - 1.55 \\ 1.66 \\ - 3.3 \end{matrix}]$ .

More iterations ...

Three iterations is plenty when doing these calculations by hand! But the repetitive nature of the process is ideally suited to its implementation on a computer. It turns out that the next few iterates are

$X^{(4)} = [\begin{matrix} - 0.765 \\ 2.39 \\ - 2.64 \end{matrix}], X^{(5)} = [\begin{matrix} - 1.277 \\ 1.787 \\ - 3.215 \end{matrix}], X^{(6)} = [\begin{matrix} - 0.839 \\ 2.209 \\ - 2.808 \end{matrix}],$

to 3 d.p. Carrying on even further $X^{(20)} = [\begin{matrix} x_{1}^{(20)} \\ x_{2}^{(20)} \\ x_{3}^{(20)} \end{matrix}] = [\begin{matrix} - 0.9959 \\ 2.0043 \\ - 2.9959 \end{matrix}]$ , to 4 d.p. After about 40 iterations successive iterates are equal to 4 d.p. Continuing the iteration even further causes the iterates to agree to more and more decimal places. The method converges to the exact answer

$X = [\begin{matrix} - 1 \\ 2 \\ - 3 \end{matrix}]$ .

The following Task involves calculating just two iterations of the Jacobi method.

Task!

Carry out two iterations of the Jacobi method to approximate the solution of

$[\begin{matrix} 4 & - 1 & - 1 \\ - 1 & 4 & - 1 \\ - 1 & - 1 & 4 \end{matrix}] [\begin{matrix} x_{1} \\ x_{2} \\ x_{3} \end{matrix}] = [\begin{matrix} 1 \\ 2 \\ 3 \end{matrix}]$

with the initial guess $X^{(0)} = [\begin{matrix} 1 \\ 1 \\ 1 \end{matrix}]$ .

The first iteration is $D X^{(1)} = - (L + U) X^{(0)} + B$ , that is,

$[\begin{matrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(1)} \\ x_{2}^{(1)} \\ x_{3}^{(1)} \end{matrix}] = [\begin{matrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(0)} \\ x_{2}^{(0)} \\ x_{3}^{(0)} \end{matrix}] + [\begin{matrix} 1 \\ 2 \\ 3 \end{matrix}]$

from which it follows that $X^{(1)} = [\begin{matrix} 0.75 \\ 1 \\ 1.25 \end{matrix}]$ .

The second iteration is $D X^{(1)} = - (L + U) X^{(0)} + B$ , that is,

$[\begin{matrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(2)} \\ x_{2}^{(2)} \\ x_{3}^{(2)} \end{matrix}] = [\begin{matrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(0)} \\ x_{2}^{(0)} \\ x_{3}^{(0)} \end{matrix}] + [\begin{matrix} 1 \\ 2 \\ 3 \end{matrix}]$

from which it follows that $X^{(2)} = [\begin{matrix} 0.8125 \\ 1 \\ 1.1875 \end{matrix}]$ .

Notice that at each iteration the first thing we do is get a new approximation for $x_{1}$ and then we continue to use the old approximation to $x_{1}$ in subsequent calculations for that iteration! Only at the next iteration do we use the new value. Similarly, we continue to use an old approximation to $x_{2}$ even after we have worked out a new one. And so on.

Given that the iterative process is supposed to improve our approximations why not use the better values straight away? This observation is the motivation for what follows.

1.2 Gauss-Seidel iteration

The approach here is very similar to that used in Jacobi iteration. The only difference is that we use new approximations to the entries of $X$ as soon as they are available. As we will see in the Example below, this means rearranging $(L + D + U) X = B$ slightly differently from what we did for Jacobi. We write

$(D + L) X = - U X + B$

and use this as the motivation to define the iteration

$(D + L) X^{(k + 1)} = - U X^{(k)} + B .$

Key Point 12

The Gauss-Seidel iteration for approximating the solution of $A X = B$ is given by

$X^{(k + 1)} = - {(D + L)}^{- 1} U X^{(k)} + {(D + L)}^{- 1} B$

Example 19 which follows revisits the system of equations we saw earlier in this Section in Example 18.

Example 19

Use the Gauss-Seidel iteration to approximate the solution $X = [\begin{matrix} x_{1} \\ x_{2} \\ x_{3} \end{matrix}]$ of $[\begin{matrix} 8 & 2 & 4 \\ 3 & 5 & 1 \\ 2 & 1 & 4 \end{matrix}] [\begin{matrix} x_{1} \\ x_{2} \\ x_{3} \end{matrix}] = [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}] .$ Use the initial guess $X^{(0)} = [\begin{matrix} 0 \\ 0 \\ 0 \end{matrix}]$ .

Solution

In this case $D + L = [\begin{matrix} 8 & 0 & 0 \\ 3 & 5 & 0 \\ 2 & 1 & 4 \end{matrix}]$ and $U = [\begin{matrix} 0 & 2 & 4 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{matrix}]$ .

First iteration .

The first iteration is $(D + L) X^{(1)} = - U X^{(0)} + B$ , or in full

$[\begin{matrix} 8 & 0 & 0 \\ 3 & 5 & 0 \\ 2 & 1 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(1)} \\ x_{2}^{(1)} \\ x_{3}^{(1)} \end{matrix}] = [\begin{matrix} 0 & - 2 & - 4 \\ 0 & 0 & - 1 \\ 0 & 0 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(0)} \\ x_{2}^{(0)} \\ x_{3}^{(0)} \end{matrix}] + [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}] = [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}],$

since the initial guess was $x_{1}^{(0)} = x_{2}^{(0)} = x_{3}^{(0)} = 0$ .

Taking this information row by row we see that

\begin{array}{rcl} 8 x_{1}^{(1)} & = & - 16 ∴ x_{1}^{(1)} = - 2 \\ 3 x_{2}^{(1)} + 5 x_{2}^{(1)} & = & 4 ∴ 5 x_{2}^{(1)} = - 3 (- 2) + 4 ∴ x_{2}^{(1)} = 2 \\ 2 x_{1}^{(1)} + x_{2}^{(1)} + 4 x_{3}^{(1)} & = & - 12 ∴ 4 x_{3}^{(1)} = - 2 (- 2) - 2 - 12 ∴ x_{3}^{(1)} = - 2.5 \end{array}

(Notice how the new approximations to $x_{1}$ and $x_{2}$ were used immediately after they were found.)

Thus the first Gauss-Seidel iteration gives us $X^{(1)} = [\begin{matrix} x_{1}^{(1)} \\ x_{2}^{(1)} \\ x_{3}^{(1)} \end{matrix}] = [\begin{matrix} - 2 \\ 2 \\ - 2.5 \end{matrix}]$ as an approximation to $X$ .

Solution

Second iteration .

The second iteration is $(D + L) X^{(2)} = - U X^{(1)} + B$ , or in full

$[\begin{matrix} 8 & 0 & 0 \\ 3 & 5 & 0 \\ 2 & 1 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(2)} \\ x_{2}^{(2)} \\ x_{3}^{(2)} \end{matrix}] = [\begin{matrix} 0 & - 2 & - 4 \\ 0 & 0 & - 1 \\ 0 & 0 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(1)} \\ x_{2}^{(1)} \\ x_{3}^{(1)} \end{matrix}] + [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}]$

Taking this information row by row we see that

\begin{array}{rcl} 8 x_{1}^{(2)} & = & - 2 x_{2}^{(1)} - 4 x_{3}^{(1)} - 16 ∴ x_{1}^{(2)} = - 1.25 \\ 3 x_{1}^{(2)} + 5 x_{2}^{(2)} & = & - x_{3}^{(1)} + 4 ∴ x_{2}^{(2)} = 2.05 \\ 2 x_{1}^{(2)} + x_{2}^{(2)} + 4 x_{3}^{(2)} & = & - 12 ∴ x_{3}^{(2)} = - 2.8875 \end{array}

Therefore the second iterate approximating $X$ is $X^{(2)} = [\begin{matrix} x_{1}^{(2)} \\ x_{2}^{(2)} \\ x_{3}^{(2)} \end{matrix}] = [\begin{matrix} - 1.25 \\ 2.05 \\ - 2.8875 \end{matrix}]$ .

Third iteration .

The third iteration is $(D + L) X^{(3)} = - U X^{(2)} + B$ , or in full

$[\begin{matrix} 8 & 0 & 0 \\ 3 & 5 & 0 \\ 2 & 1 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(3)} \\ x_{2}^{(3)} \\ x_{3}^{(3)} \end{matrix}] = [\begin{matrix} 0 & - 2 & - 4 \\ 0 & 0 & - 1 \\ 0 & 0 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(2)} \\ x_{2}^{(2)} \\ x_{3}^{(2)} \end{matrix}] + [\begin{matrix} - 16 \\ 4 \\ - 12 \end{matrix}] .$

Taking this information row by row we see that

\begin{array}{rcl} 8 x_{1}^{(3)} & = & - 2 x_{2}^{(2)} - 4 x_{3}^{(2)} - 16 ∴ x_{1}^{(3)} = - 1.0687 \\ 3 x_{1}^{(3)} + 5 x_{2}^{(3)} & = & - x_{3}^{(2)} + 4 ∴ x_{2}^{(3)} = 2.0187 \\ 2 x_{1}^{(3)} + x_{2}^{(3)} + 4 x_{3}^{(3)} & = & - 12 ∴ x_{3}^{(3)} = - 2.9703 \end{array}

to 4 d.p. Therefore the third iterate approximating $X$ is

$X^{(3)} = [\begin{matrix} x_{1}^{(3)} \\ x_{2}^{(3)} \\ x_{3}^{(3)} \end{matrix}] = [\begin{matrix} - 1.0687 \\ 2.0187 \\ - 2.9703 \end{matrix}] .$

More iterations ...

Again, there is little to be learned from pushing this further by hand. Putting the procedure on a computer and seeing how it progresses is instructive, however, and the iteration continues as follows:

$X^{(4)} = [\begin{matrix} - 1.0195 \\ 2.0058 \\ - 2.9917 \end{matrix}], X^{(5)} = [\begin{matrix} - 1.0056 \\ 2.0017 \\ - 2.9976 \end{matrix}], X^{(6)} = [\begin{matrix} - 1.0016 \\ 2.0005 \\ - 2.9993 \end{matrix}],$

$X^{(7)} = [\begin{matrix} - 1.0005 \\ 2.0001 \\ - 2.9998 \end{matrix}], X^{(8)} = [\begin{matrix} - 1.0001 \\ 2.0000 \\ - 2.9999 \end{matrix}], X^{(9)} = [\begin{matrix} - 1.0000 \\ 2.0000 \\ - 3.0000 \end{matrix}]$

(to 4 d.p.). Subsequent iterates are equal to $X^{(9)}$ to this number of decimal places. The Gauss-Seidel iteration has converged to 4 d.p. in 9 iterations. It took the Jacobi method almost 40 iterations to achieve this!

Task!

Carry out two iterations of the Gauss-Seidel method to approximate the solution of

$[\begin{matrix} 4 & - 1 & - 1 \\ - 1 & 4 & - 1 \\ - 1 & - 1 & 4 \end{matrix}] [\begin{matrix} x_{1} \\ x_{2} \\ x_{3} \end{matrix}] = [\begin{matrix} 1 \\ 2 \\ 3 \end{matrix}]$

with the initial guess $X^{(0)} = [\begin{matrix} 1 \\ 1 \\ 1 \end{matrix}]$ .

The first iteration is $(D + L) X^{(1)} = - U X^{(0)} + B$ , that is,

$[\begin{matrix} 4 & 0 & 0 \\ - 1 & 4 & 0 \\ - 1 & - 1 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(1)} \\ x_{2}^{(1)} \\ x_{3}^{(1)} \end{matrix}] = [\begin{matrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(0)} \\ x_{2}^{(0)} \\ x_{3}^{(0)} \end{matrix}] + [\begin{matrix} 1 \\ 2 \\ 3 \end{matrix}]$

from which it follows that $X^{(1)} = [\begin{matrix} 0.75 \\ 0.9375 \\ 1.1719 \end{matrix}]$ .

The second iteration is $(D + L) X^{(1)} = - U X^{(0)} + B$ , that is,

$[\begin{matrix} 4 & 0 & 0 \\ - 1 & 4 & 0 \\ - 1 & - 1 & 4 \end{matrix}] [\begin{matrix} x_{1}^{(2)} \\ x_{2}^{(2)} \\ x_{3}^{(2)} \end{matrix}] = [\begin{matrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{matrix}] [\begin{matrix} x_{1}^{(1)} \\ x_{2}^{(1)} \\ x_{3}^{(1)} \end{matrix}] + [\begin{matrix} 1 \\ 2 \\ 3 \end{matrix}]$

from which it follows that $X^{(2)} = [\begin{matrix} 0.7773 \\ 0.9873 \\ 1.1912 \end{matrix}]$ .

1 Iterative methods

1.1 The Jacobi iteration

Answer

Answer

1.2 Gauss-Seidel iteration

Answer

Answer