Iterative Methods for Linear Systems II
Iterative methods are a powerful way to solve systems of linear equations, especially when the system is large and direct methods like elimination become slow or expensive. In this lesson you will learn the main ideas behind iterative methods, how they connect to conditioning and other linear systems topics, and why methods such as Jacobi and Gauss-Seidel are so important in numerical analysis 🔍.
Why iterative methods matter
A linear system can be written as $A\mathbf{x}=\mathbf{b}$, where $A$ is a matrix, $\mathbf{x}$ is the unknown vector, and $\mathbf{b}$ is the known right-hand side. For small systems, methods like Gaussian elimination can work well. But for large systems, especially those arising from engineering, computer graphics, weather prediction, or networks, direct methods may require too much memory or time.
Iterative methods approach the solution by starting with an initial guess $\mathbf{x}^{(0)}$ and repeatedly improving it to get a sequence $\mathbf{x}^{(1)},\mathbf{x}^{(2)},\mathbf{x}^{(3)},\dots$. The goal is that this sequence gets closer and closer to the true solution $\mathbf{x}$ 📈.
The key idea is simple: instead of solving everything at once, we make a guess, correct it, and repeat. This is useful when exact arithmetic is impossible or when we only need a good approximation rather than a perfectly exact answer.
Core ideas and terminology
An iterative method uses a recurrence relation to generate better approximations. A common general form is
$$\mathbf{x}^{(k+1)} = G\mathbf{x}^{(k)} + \mathbf{c},$$
where $G$ is the iteration matrix and $\mathbf{c}$ is a constant vector. If the method converges, then $\mathbf{x}^{(k)}$ approaches the true solution as $k$ increases.
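To make this general form concrete, here is a minimal NumPy sketch (the matrix and right-hand side are made up for illustration) that builds the Jacobi choice of $G$ and $\mathbf{c}$ from the standard splitting $A = D + L + U$ into diagonal, strictly lower, and strictly upper parts, and checks that the true solution is a fixed point of the update:

```python
import numpy as np

# Minimal sketch: the Jacobi choice of G and c for A x = b,
# using the standard splitting A = D + L + U.
A = np.array([[4.0, 1.0],
              [2.0, 5.0]])
b = np.array([1.0, 2.0])

D = np.diag(np.diag(A))       # diagonal part of A
R = A - D                     # off-diagonal part L + U
G = -np.linalg.solve(D, R)    # G = -D^{-1} (L + U)
c = np.linalg.solve(D, b)     # c = D^{-1} b

# The exact solution x* satisfies x* = G x* + c (a fixed point).
x_exact = np.linalg.solve(A, b)
print(np.allclose(G @ x_exact + c, x_exact))  # True
```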
Important terms include:
- Initial guess $\mathbf{x}^{(0)}$: the starting point.
- Iteration: one update step from $\mathbf{x}^{(k)}$ to $\mathbf{x}^{(k+1)}$.
- Convergence: the approximations approach the exact solution.
- Divergence: the approximations move away or fail to settle.
- Stopping criterion: a rule for deciding when to stop iterating.
A common stopping criterion is based on the residual vector
$$\mathbf{r}^{(k)} = \mathbf{b} - A\mathbf{x}^{(k)}.$$
If $\|\mathbf{r}^{(k)}\|$ is small, then $\mathbf{x}^{(k)}$ is usually a good approximation. Another common rule is to stop when the change between two successive guesses is small, such as $\|\mathbf{x}^{(k+1)}-\mathbf{x}^{(k)}\|<\varepsilon$, where $\varepsilon$ is a chosen tolerance.
The residual is important because a small update does not always mean the answer is accurate. Since the error satisfies $\mathbf{x}-\mathbf{x}^{(k)} = A^{-1}\mathbf{r}^{(k)}$, a poorly conditioned problem can hide a noticeable error behind a small residual or a small change in the guess.
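Both stopping tests translate directly into code. A minimal sketch (the function name and default tolerance are illustrative choices, not standard names):

```python
import numpy as np

def should_stop(A, b, x_new, x_old, eps=1e-8):
    """Two common stopping tests: small residual or small step."""
    residual_small = np.linalg.norm(b - A @ x_new) < eps   # ||r^(k)|| < eps
    step_small = np.linalg.norm(x_new - x_old) < eps       # ||x^(k+1) - x^(k)|| < eps
    return residual_small or step_small
```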
Jacobi method: updating with old values only
The Jacobi method is one of the simplest iterative methods. It is based on solving each equation for one variable and using only values from the previous iteration when updating all components.
Suppose a system is written in component form as
$$a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n=b_1,$$
$$a_{21}x_1+a_{22}x_2+\cdots+a_{2n}x_n=b_2,$$
and so on. If $a_{ii}\neq 0$ for all $i$, then each equation can be rearranged to isolate $x_i$:
$$x_i = \frac{1}{a_{ii}}\left(b_i-\sum_{j\neq i}a_{ij}x_j\right).$$
In the Jacobi method, the new value $x_i^{(k+1)}$ is computed using only the old values $x_j^{(k)}$ for $j\neq i$:
$$x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i-\sum_{j\neq i}a_{ij}x_j^{(k)}\right).$$
This makes Jacobi easy to understand and parallelize because each component update can be computed independently from the same previous vector. That is useful in modern computing systems 🖥️.
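Here is a minimal sketch of one Jacobi sweep in NumPy (the function name is illustrative); note that every component reads only the previous vector `x_old`:

```python
import numpy as np

def jacobi_step(A, b, x_old):
    """One Jacobi sweep: every x_i is computed from the previous iterate only."""
    n = len(b)
    x_new = np.empty(n)
    for i in range(n):
        # s = sum of a_ij * x_j^(k) over j != i
        s = A[i, :] @ x_old - A[i, i] * x_old[i]
        x_new[i] = (b[i] - s) / A[i, i]
    return x_new
```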
Example of the Jacobi idea
Consider the system
$$3x+y=5,$$
$$x+2y=5.$$
Rewriting gives
$$x=\frac{5-y}{3}, \quad y=\frac{5-x}{2}.$$
Start with $\mathbf{x}^{(0)}=(0,0)$. Then
$$x^{(1)}=\frac{5-0}{3}=\frac{5}{3}, \quad y^{(1)}=\frac{5-0}{2}=\frac{5}{2}.$$
Using these values again,
$$x^{(2)}=\frac{5-\frac{5}{2}}{3}=\frac{5}{6}, \quad y^{(2)}=\frac{5-\frac{5}{3}}{2}=\frac{5}{3}.$$
The approximations continue changing, and because this coefficient matrix is strictly diagonally dominant (a condition discussed below), the iterates converge to the exact solution $x=1$, $y=2$.
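The same computation can be checked numerically; a sketch reusing the `jacobi_step` function above (assumed to be in scope):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([5.0, 5.0])

x = np.zeros(2)                  # x^(0) = (0, 0)
for k in range(30):
    x = jacobi_step(A, b, x)     # first two sweeps give (5/3, 5/2), then (5/6, 5/3)

print(x)  # close to the exact solution (1, 2)
```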
Gauss-Seidel method: using the newest values as soon as possible
The Gauss-Seidel method improves on Jacobi by using updated values immediately within the same iteration. This often makes it converge faster.
For the same system structure, the update formula is
$$x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i-\sum_{j<i}a_{ij}x_j^{(k+1)}-\sum_{j>i}a_{ij}x_j^{(k)}\right).$$
The main difference is that when computing $x_i^{(k+1)}$, the method uses the newest available values $x_1^{(k+1)},x_2^{(k+1)},\dots,x_{i-1}^{(k+1)}$ as soon as they are known.
This often gives Gauss-Seidel an advantage over Jacobi because the method uses fresh information instead of waiting until the next full iteration. In practical terms, it is like updating a group project draft as soon as one person finishes a section, rather than waiting for everyone to finish before revising ✍️.
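A minimal sketch of one Gauss-Seidel sweep (the function name is illustrative); the only change from the Jacobi sweep is that the loop overwrites `x` in place, so later components immediately see the new values:

```python
import numpy as np

def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel sweep: x is updated in place, so component i
    already sees the new values x_1, ..., x_{i-1} from this sweep."""
    n = len(b)
    for i in range(n):
        # uses new x_j for j < i and old x_j for j > i
        s = A[i, :] @ x - A[i, i] * x[i]
        x[i] = (b[i] - s) / A[i, i]
    return x
```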
Convergence and why it is not guaranteed
Not every iterative method works for every system. Convergence depends on properties of the matrix $A$ and on the method used.
A useful sufficient condition for both Jacobi and Gauss-Seidel is strict diagonal dominance, meaning that for each row $i$,
$$|a_{ii}| > \sum_{j\neq i}|a_{ij}|.$$
If a matrix is strictly diagonally dominant, these methods will converge. Another important case is when $A$ is symmetric positive definite, where Gauss-Seidel is known to converge.
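Strict diagonal dominance is easy to test programmatically; a sketch, using the matrix from the worked example above:

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """Check |a_ii| > sum_{j != i} |a_ij| for every row i."""
    diag = np.abs(np.diag(A))
    off_diag_sums = np.sum(np.abs(A), axis=1) - diag
    return bool(np.all(diag > off_diag_sums))

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
print(is_strictly_diagonally_dominant(A))  # True: |3| > |1| and |2| > |1|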
The speed of convergence depends on the iteration matrix. If the iteration matrix is $G$, then the iteration converges for every initial guess exactly when the spectral radius $\rho(G)$ satisfies
$$\rho(G)<1.$$
The spectral radius $\rho(G)$ is the largest absolute value of the eigenvalues of $G$. This condition connects iterative methods to linear algebra theory and shows why some systems are easy to solve iteratively while others are not.
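For Jacobi, the iteration matrix is $G=-D^{-1}(L+U)$ from the splitting shown earlier, so the condition can be checked directly (a sketch; the function name is illustrative):

```python
import numpy as np

def jacobi_spectral_radius(A):
    """Spectral radius of the Jacobi iteration matrix G = -D^{-1}(L + U)."""
    D = np.diag(np.diag(A))
    G = -np.linalg.solve(D, A - D)
    return np.max(np.abs(np.linalg.eigvals(G)))

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
print(jacobi_spectral_radius(A))  # about 0.408 < 1, so Jacobi converges here
```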
If the matrix is poorly conditioned, small changes in the data can cause large changes in the solution. That does not automatically mean the iterative method fails, but it can make accurate computation harder. This is why iterative methods belong naturally in the broader study of conditioning in Linear Systems II.
A practical workflow for solving systems iteratively
When using an iterative method, the process usually looks like this (a code sketch follows the list):
- Rewrite the system into update form.
- Choose an initial guess $\mathbf{x}^{(0)}$.
- Compute the next approximation $\mathbf{x}^{(k+1)}$.
- Check the residual $\mathbf{r}^{(k+1)}=\mathbf{b}-A\mathbf{x}^{(k+1)}$.
- Stop when the residual or change is small enough.
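Putting the pieces together, here is a sketch of the whole loop, reusing the `gauss_seidel_step` function from above (the tolerance and iteration cap are illustrative choices):

```python
import numpy as np

def solve_iteratively(A, b, x0, eps=1e-10, max_iters=1000):
    """Iterate until the residual norm drops below eps (or give up)."""
    x = x0.astype(float)
    for k in range(max_iters):
        x = gauss_seidel_step(A, b, x)
        r = b - A @ x                  # residual r^(k+1)
        if np.linalg.norm(r) < eps:    # stopping criterion
            return x, k + 1
    return x, max_iters                # no convergence within the cap

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([5.0, 5.0])
x, iters = solve_iteratively(A, b, np.zeros(2))
print(x, iters)  # close to (1, 2) in a small number of sweeps
```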
For example, if a numerical model gives a system with thousands of unknowns, an engineer may only need a solution accurate to a certain tolerance. Then an iterative method can be much more efficient than trying to compute a fully exact answer.
The choice of stopping rule matters. If the tolerance is too large, the result may be inaccurate. If it is too small, the method may waste time doing extra work. Good numerical analysis balances accuracy and efficiency.
Comparing Jacobi and Gauss-Seidel
Jacobi and Gauss-Seidel are closely related, but they differ in how they use information.
- Jacobi uses only values from the previous iteration.
- Gauss-Seidel uses new values as soon as they are available.
Because of this, Gauss-Seidel often converges in fewer iterations than Jacobi for the same system. However, Jacobi can be easier to parallelize because each update is independent within one iteration.
In real applications, the best method depends on the matrix, the computing environment, and the required accuracy. Sometimes an iterative method is used as a solver by itself; sometimes it is used as part of a larger algorithm, such as a preconditioned method for very large sparse systems.
Conclusion
Iterative methods provide a practical and flexible way to solve linear systems, especially when systems are large, sparse, or too expensive for direct methods. The central idea is to begin with a guess and improve it step by step until the approximations are good enough. Jacobi and Gauss-Seidel show two major styles of iteration, with Gauss-Seidel often converging faster because it reuses updated values immediately.
This topic fits into Linear Systems II by connecting computational procedures with theory: matrix structure, convergence, residuals, and conditioning all affect whether an iterative method succeeds. Understanding these ideas helps you judge not only how to compute a solution, but also whether the answer can be trusted ✅.
Study Notes
- Iterative methods solve $A\mathbf{x}=\mathbf{b}$ by starting with an initial guess and improving it repeatedly.
- The sequence $\mathbf{x}^{(0)},\mathbf{x}^{(1)},\mathbf{x}^{(2)},\dots$ should approach the true solution if the method converges.
- The residual is $\mathbf{r}^{(k)}=\mathbf{b}-A\mathbf{x}^{(k)}$.
- A common stopping rule is to stop when $\|\mathbf{r}^{(k)}\|$ or $\|\mathbf{x}^{(k+1)}-\mathbf{x}^{(k)}\|$ is smaller than a tolerance $\varepsilon$.
- The Jacobi method updates all variables using only values from the previous iteration.
- The Gauss-Seidel method uses the newest available values during the same iteration.
- Gauss-Seidel often converges faster than Jacobi, but Jacobi is easier to parallelize.
- Strict diagonal dominance, $|a_{ii}|>\sum_{j\neq i}|a_{ij}|$, is a sufficient condition for convergence of Jacobi and Gauss-Seidel.
- Convergence is linked to the iteration matrix $G$ and the condition $\rho(G)<1$.
- Conditioning affects how reliably a computed solution reflects the true solution.
- Iterative methods are especially useful for large sparse systems and when approximate answers are acceptable.
