The Big Picture: Iterative Averaging
Poisson's equation, , has a beautifully simple meaning: at every point, the value of differs from the average of its surroundings by exactly the amount of source sitting there. No source means "equal to the average"; positive source means "a bit above the average"; negative source means "a bit below." That single sentence is the entire physical content of the equation — and it is also, almost word-for-word, an algorithm.
Imagine a heated metal plate whose edges are held at fixed temperatures and whose interior has a few heat sources sprinkled around. To find the steady-state temperature distribution, you could solve a huge sparse linear system. Or you could do something far simpler: walk around the plate and adjust each interior point so it matches the average of its neighbours, plus a little kick from any local heat source. Repeat. The temperature field will relax into the steady state — hence the name relaxation methods.
The core idea in one line. A relaxation method takes the discretized Poisson equation, , and applies it as an assignment rule, sweep after sweep, until the values stop changing. The PDE's defining property — every cell averages its neighbours — becomes the literal source code.
In this section we will walk through the three classical relaxation methods in order of cleverness:
- Jacobi — sweep the grid using only the old neighbour values. Maximum parallelism, slowest convergence.
- Gauss-Seidel — sweep the grid using the freshest values available. Roughly twice as fast, harder to parallelize.
- Successive Over-Relaxation (SOR) — do Gauss-Seidel, then deliberately overshoot by a factor . With a well-chosen , an order of magnitude faster than either of the previous two.
Then we will sketch how multigrid attacks the same problem from a completely different angle, achieving the asymptotically optimal cost in 2D — meaning constant work per unknown, regardless of grid size.
From PDE to Grid: The 5-Point Stencil
Before we can iterate, we need a finite version of . Replace the continuous square with a uniform grid of points and spacing . Write . Now Taylor-expand:
Add them. Odd-order terms cancel:
Solving for gives the famous centred second difference:
Do the same on the -axis and add. The discrete Laplacian is:
This is the 5-point stencil. Set it equal to (where ), and solve for :
This is the master formula every relaxation method below applies, over and over. The only thing the three methods disagree about is which values of the neighbours to use when computing the average.
Picture the stencil. The interior cell reaches out to its four nearest neighbours — up, down, left, right — and to a single source term. No diagonals. The pattern looks like a plus-sign:V_{i, j+1} | V_{i-1, j} --- V_{i, j} --- V_{i+1, j} + (h² / 4) · f_{i,j} | V_{i, j-1}
Jacobi: The Cleanest Idea
Carl Gustav Jacob Jacobi's relaxation idea (1845) is the most literal possible reading of the discretized equation. Make two copies of the grid. In one sweep, read from the old copy, write to the new copy:
Then swap the copies and repeat. The superscript means "value at the start of iteration ." The defining property: every cell on iteration uses only iteration- values. The order in which we update cells does not matter, so we can do them all simultaneously — perfect for vectors and GPUs.
Why this works (intuition). Picture a hot point in the middle of a cold plate. After the first Jacobi sweep, only the four cells immediately touching the hot point feel anything; everyone else is still cold. After the second sweep, the next ring out feels it. Heat — or potential, or whatever represents — diffuses outward by exactly one cell per sweep. To reach a boundary that is cells away takes at least sweeps. That is the geometric reason Jacobi is slow.
Formally, Jacobi's iteration matrix has spectral radius for the discretized Dirichlet Poisson problem on a unit square. For small that is , which means each iteration cuts the error by a factor that is very close to 1. Halving roughly quadruples the number of iterations needed. Jacobi is iterations on an grid.
Gauss-Seidel: Use the Fresh News
Imagine the same sweep, but as you walk from cell to cell, you immediately write the new value back. By the time you reach cell , the neighbours and (assuming row-major sweep order) have already been updated this iteration. So when you compute the new , you use:
Mixed superscripts — two updated, two not yet — because cells are visited in lexicographic order. This is the Gauss-Seidel method (Gauss 1823 in letters, Seidel 1874 in print).
Intuition. Why should using fresher data be faster? Picture the hot-point example again. In Jacobi, a single update at the hot spot only sends information to its immediate neighbours by the end of the sweep. In Gauss-Seidel, once the hot spot updates, every cell scanned after it in the same sweep already sees the change. Information propagates further per iteration. In fact, the spectral radius drops to , so Gauss-Seidel takes roughly half the iterations of Jacobi to reach the same accuracy.
The downside: the update order matters. Cell depends on the freshly-updated cell , which we cannot compute in parallel with . On a GPU this is a disaster — we would lose the very parallelism that made Jacobi attractive. The fix is a red-black ordering (chessboard colouring), which we will use in the PyTorch implementation below: update all red cells simultaneously, then all black cells, alternating sweeps. With this trick Gauss-Seidel and SOR vectorize cleanly.
SOR: Overshoot on Purpose
Here is a strange-sounding observation: when Gauss-Seidel updates a cell, the new value almost always lies between the old value and the eventual converged value. The iteration is, on average, under-correcting. What if we deliberately jumped a bit past the Gauss-Seidel update — would we get there faster? That gamble is Successive Over-Relaxation, due to David M. Young Jr. in his 1950 thesis.
Let denote the value that Gauss-Seidel would assign. SOR replaces it with:
The factor is the relaxation parameter. Three regimes:
| Range | Behaviour | Use |
|---|---|---|
| ω = 1 | Pure Gauss-Seidel — no overshoot. | Baseline. |
| 0 < ω < 1 | Under-relaxation: blends old + GS. Smaller step, more stable. | Stiff non-linear problems; rarely for Poisson. |
| 1 < ω < 2 | Over-relaxation: jumps past GS by an extra (ω−1) of the step. | Standard choice for SPD problems like Poisson. |
| ω ≥ 2 | Diverges. The iteration matrix exits the unit circle. | Forbidden. |
The miracle is that for the discretized Poisson problem there is an exact formula for the optimal value:
For this is ; for it is . As the grid gets finer the optimal omega creeps toward 2 (but never reaches it).
The payoff. With optimal omega, the spectral radius becomes , which behaves like for small . Compare to Jacobi's . Halving only doubles the iteration count for SOR, while Jacobi's quadruples. That is the difference between and iterations — a giant speedup for the cost of two extra multiplies per cell.
Why SOR Wins: Spectral Radius Intuition
Each relaxation method can be written as a linear iteration for some iteration matrix . The error evolves as , so after iterations the error is reduced by a factor of approximately , where is the spectral radius (largest eigenvalue magnitude) of .
For the discretized Poisson problem on an unit square the three numbers are known in closed form:
| Method | Spectral radius ρ(G) | Iterations to drop error by 10⁻⁶ |
|---|---|---|
| Jacobi | cos(πh) ≈ 1 − ½ π² h² | ≈ 1.4 · N² / π² |
| Gauss-Seidel | cos²(πh) = ρ_J² ≈ 1 − π² h² | ≈ 0.7 · N² / π² |
| SOR (optimal ω) | ω_opt − 1 ≈ 1 − 2π h | ≈ 2.2 · N / π |
For an example, set :
- Jacobi: ≈ 1400 iterations.
- Gauss-Seidel: ≈ 700 iterations.
- SOR(optimal): ≈ 70 iterations.
SOR is a full 20× faster than Jacobi at , and the gap grows linearly with . The interactive plot below sweeps on a 32×32 grid and shows the characteristic basin — under-relaxation slows GS down, optimal omega crashes the residual, and over-shooting past 2 makes the iteration explode:
Residual after 200 iterations of SOR, swept across all omega values on a 32×32 grid with a point source. The basin is shallow on the left side (under-relaxation just slows GS) but sharpens to a dramatic minimum near ; past the method diverges.
Fourier intuition for what relaxation does. Any error on the grid can be decomposed into spatial frequencies (modes). Jacobi and GS damp high-frequency modes (the wiggly errors) very quickly, but low-frequency modes (the long-wavelength ones) decay agonisingly slowly — they need information to propagate across the whole grid. SOR's overshoot is engineered to cancel the slowest mode at every step, dramatically accelerating the low-frequency damping. Multigrid (see below) attacks the same problem in a different way: solve the low-frequency part on a coarser grid where it is no longer low-frequency.
Interactive: One-Step Relaxation Animator
Click Step ▶ repeatedly and watch a 7-cell 1D grid relax. The single red bar is a source at ; the grey bars at the ends are the Dirichlet boundaries . The dashed green outlines show the exact converged solution. Switch methods between Jacobi, Gauss-Seidel, and SOR; with SOR enabled, slide and see how aggressively the cells overshoot.
One sweep at a time: watch the cells average their neighbours
A 7-cell 1D grid with and a single source . The true converged answer is — see if you can reach it.
Things to notice as you play:
- Jacobi. Information spreads one cell per step. After one step, only the centre cell is non-zero. After two, the immediate neighbours light up. After three, the next ring out. Each cell takes many sweeps to reach its true value.
- Gauss-Seidel. After a single sweep the cells right-of-centre already feel the source (because we sweep left-to-right and the centre updates first). After two or three sweeps the field is already close to the truth.
- SOR. At you will see cells overshoot the truth, then settle. At the convergence is near-optimal. Push above 1.9 and the iteration may oscillate or refuse to settle.
Worked Example: 1D Poisson by Hand
Let's do the same problem on paper to feel the mechanics. Consider the 1D Poisson equation on with , a single source at and . The discrete equation is:
The unknowns are . The exact converged solution comes from a 4×4 linear system — derived inside the collapsible below — and equals . Below we run Jacobi, Gauss-Seidel, and SOR for a few sweeps against this target and watch how quickly each one closes the gap.
📐 Solve the worked example step by step
1. Exact converged answer (linear-algebra route)
Write the four equations:
V_1 = ½ (V_0 + V_2 + h² f_1) = ½ (0 + V_2 + 0) = ½ V_2 V_2 = ½ (V_1 + V_3 + 0) = ½ (V_1 + V_3) V_3 = ½ (V_2 + V_4 + h² f_3) = ½ (V_2 + V_4 + 8) V_4 = ½ (V_3 + V_5 + 0) = ½ (V_3 + 0) = ½ V_3
Substitute step by step. From eq. 1, ; from eq. 4, . Plug into eq. 2:
V_2 = ½ (V_2/2 + V_3) → 2 V_2 = V_2/2 + V_3 → V_3 = (3/2) V_2
Plug both into eq. 3:
(3/2) V_2 = ½ (V_2 + (3/2) V_2 · ½ + 8) (3/2) V_2 = ½ (V_2 + (3/4) V_2 + 8) (3/2) V_2 = (V_2 + (3/4) V_2 + 8) / 2 3 V_2 = V_2 + (3/4) V_2 + 8 3 V_2 - V_2 - (3/4) V_2 = 8 (5/4) V_2 = 8 V_2 = 32 / 5 = 6.4
So:
V_1 = V_2 / 2 = 3.2 V_2 = 6.4 V_3 = (3/2) V_2 = 9.6 V_4 = V_3 / 2 = 4.8
2. Four Jacobi sweeps from V = 0
Use the OLD values everywhere; each row reads only from the row above.
| Iter k | V_1 | V_2 | V_3 | V_4 |
|---|---|---|---|---|
| 0 (init) | 0.000 | 0.000 | 0.000 | 0.000 |
| 1 | 0.000 | 0.000 | 4.000 | 0.000 |
| 2 | 0.000 | 2.000 | 4.000 | 2.000 |
| 3 | 1.000 | 2.000 | 6.000 | 2.000 |
| 4 | 1.000 | 3.500 | 6.000 | 3.000 |
| 5 | 1.750 | 3.500 | 7.250 | 3.000 |
| … | … | … | … | … |
| true | 3.200 | 6.400 | 9.600 | 4.800 |
After 5 Jacobi sweeps against a true value of 9.6 (still 25% off).
3. Four Gauss-Seidel sweeps from V = 0
Sweep left-to-right; use the freshest value of every neighbour. When computing we already have the new from this same sweep.
| Iter k | V_1 | V_2 | V_3 | V_4 |
|---|---|---|---|---|
| 0 (init) | 0.000 | 0.000 | 0.000 | 0.000 |
| 1 | 0.000 | 0.000 | 4.000 | 2.000 |
| 2 | 0.000 | 2.000 | 6.000 | 3.000 |
| 3 | 1.000 | 3.500 | 7.250 | 3.625 |
| 4 | 1.750 | 4.500 | 8.062 | 4.031 |
| 5 | 2.250 | 5.156 | 8.594 | 4.297 |
| … | … | … | … | … |
| true | 3.200 | 6.400 | 9.600 | 4.800 |
After 5 GS sweeps — already much closer to 9.6.
4. Two SOR sweeps with ω = 1.5
Compute the GS update, then move 50% further:.
Iter 1: V_1^GS = ½(0 + 0 + 0) = 0 → V_1 = -0.5·0 + 1.5·0 = 0.000 V_2^GS = ½(0 + 0 + 0) = 0 → V_2 = -0.5·0 + 1.5·0 = 0.000 V_3^GS = ½(0 + 0 + 8) = 4 → V_3 = -0.5·0 + 1.5·4 = 6.000 V_4^GS = ½(6 + 0 + 0) = 3 → V_4 = -0.5·0 + 1.5·3 = 4.500 Iter 2: V_1^GS = ½(0 + 0 + 0) = 0 → V_1 = -0.5·0 + 1.5·0 = 0.000 V_2^GS = ½(0 + 6 + 0) = 3 → V_2 = -0.5·0 + 1.5·3 = 4.500 V_3^GS = ½(4.5 + 4.5 + 8) = 8.5 → V_3 = -0.5·6 + 1.5·8.5 = 9.750 V_4^GS = ½(9.75 + 0 + 0) = 4.88 → V_4 = -0.5·4.5 + 1.5·4.88 = 5.06
After just two SOR sweeps against true 9.6 — within 1.6%. SOR is mildly oscillating around the answer (overshooting then under), and converges fast.
Interactive: Jacobi vs Gauss-Seidel vs SOR
Now back to the 2D Poisson problem on an grid. All three methods start from and try to drive the residual to zero. The curves below show that residual on a log scale per iteration. The thumbnails on the bottom show the final field each method produces.
Race the three solvers on the same problem
All three start from . Every iteration the residual is measured (log scale). The fastest curve wins.
Jacobi after 400 iters
Gauss-Seidel after 400 iters
SOR after 400 iters
Slide the grid resolution and watch what happens to the gap between the three curves. At the gap is already noticeable; by Jacobi has barely made progress in the same 400 iterations while SOR has bottomed out near machine precision. Try setting — the SOR curve collapses onto Gauss-Seidel (as expected). Push above 1.95 and the SOR curve may bend back upward as you approach divergence.
Python: All Three Solvers in One File
Here is a complete, runnable Python implementation of Jacobi, Gauss-Seidel, and SOR. The test uses the manufactured-solution method: choose , compute , feed that as the source, and check the solver recovers back. With this gold-standard test you cannot fool yourself: the iteration count gap and the final-error gap separately tell you about solver speed and about discretization quality.
Running this file produces output of the form:
Jacobi: 1804 iters, max error = 2.45e-03 Gauss-Seid: 902 iters SOR (w=1.826): 118 iters
The Jacobi count, the GS count, and the SOR count are very close to the theoretical predictions , , and for . The max-error value is the same for all three methods because it is the discretization error of the 5-point stencil, not the solver. Refining the grid would shrink that error; faster iteration would not.
PyTorch on the GPU: Red-Black SOR
Plain Gauss-Seidel and SOR have a sequential data dependency that makes them awkward on a GPU. The standard fix is a red-black ordering: colour grid cells like a chessboard, update all red cells in parallel, then all black cells in parallel. Each colour's neighbours are entirely of the opposite colour, so the update inside one colour is fully independent. The iteration matrix is slightly different from natural-ordered SOR but has the same asymptotic spectral radius — convergence rates are identical for fine grids.
Two things make this implementation worth the extra lines. First, it is fully vectorized: the inner update is four calls plus a , all GPU kernels. No Python loop over space. Second, it is differentiable: every operation is a registered PyTorch op, so we can call and get gradients with respect to the source or the boundary values. That makes this routine a drop-in layer inside a neural network, which is the basis of physics-informed networks (PINNs), neural Poisson solvers, and differentiable rendering.
Multigrid: A Sketch of the Real Champion
Even SOR has an Achilles heel. As the Fourier picture shows, relaxation kills high-frequency error fast but low-frequency error slowly. Halving the grid spacing doubles the wavelength of every mode relative to the grid, so what used to be high-frequency suddenly is low-frequency on the new grid. Multigrid exploits this by alternating between fine and coarse grids:
- Smooth on the fine grid. Do a few Jacobi or GS sweeps. This kills the highest-frequency error.
- Restrict the residual to a coarser grid. Average or inject the residual onto an grid. On the coarse grid the surviving low-frequency error looks like high-frequency error.
- Solve (recursively) on the coarse grid. Either run a few sweeps and keep going coarser, or solve exactly if the grid is small enough.
- Prolong (interpolate) the correction back up. Linearly interpolate the coarse-grid solution back to the fine grid and add it to the current iterate.
- Smooth again on the fine grid to fix any high-frequency noise the interpolation introduced.
One pass through this V-cycle has cost on an grid (just constant work per unknown) and reduces the error by a fixed factor independent of grid size. So multigrid converges in iterations, with total cost — optimal up to constants. Whatever speedup SOR gave over Jacobi, multigrid gives again over SOR for large grids.
Why mention multigrid here? The relaxation methods of this section are not just a stepping stone. Multigrid uses them as its smoother — the inner loop that kills high-frequency error on each level. So every multigrid solver is built on top of Jacobi, Gauss-Seidel, or SOR. Understand the simple methods, and you understand the engine inside the fast ones. Most production Poisson solvers in computational physics, image processing, and machine learning today are multigrid or one of its conjugate-gradient cousins.
Summary
- Discretization. Replace by the 5-point stencil: . Applied as an assignment, this is the heart of every relaxation method.
- Jacobi. Use only OLD values during a sweep. Maximally parallel, but spectral radius means iterations.
- Gauss-Seidel. Use the freshest values in place. Spectral radius — roughly half the iterations of Jacobi. Sequential, unless red-black ordered.
- SOR. Gauss-Seidel + an overshoot factor . With the spectral radius drops to — iterations, an asymptotic speedup of a factor over Jacobi.
- Red-black ordering recovers full parallelism for Gauss-Seidel and SOR on GPUs with no loss of convergence rate.
- Multigrid uses these methods as building blocks ("smoothers") and achieves total work for the entire solve.
- Manufactured solutions (pick , derive , solve, compare) are the standard way to validate any PDE solver — the iteration-count gap measures solver speed, and the final-error gap measures discretization quality. Both gaps are independent.
Take-away. Relaxation methods turn the differential statement "the value here is the average of the neighbours, plus the source" into a literal program. The three classical methods — Jacobi, Gauss-Seidel, SOR — differ only in which neighbour values the assignment reads from and how aggressively to use them. Their spectral radii live in closed form for Poisson, and the gap between Jacobi's and SOR's iteration count is the entire reason large-scale physics simulations are feasible. Beyond them lies multigrid, the optimal-complexity champion — but every modern fast solver is still standing on the shoulders of Jacobi.