Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section you will be able to:

Replace the continuous wave PDE $\,u_{tt} = c^{2} u_{xx}\,$ with its discrete cousin on a space-time grid.
Derive the explicit five-point stencil $\,u_i^{n+1} = 2u_i^{n} - u_i^{n-1} + r^{2}\bigl(u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}\bigr)\,$ and read it as "curvature in space drives curvature in time."
Prove and feel the CFL bound $\,r = c\,\Delta t/\Delta x \le 1\,$ — the one inequality that decides whether your simulation is a wave or an explosion.
Handle Dirichlet, Neumann, and absorbing boundary conditions, and explain physically what each one does to a reflected pulse.
Implement the scheme in plain Python first, then translate it to vectorized PyTorch so it runs on a GPU and plugs into machine-learning pipelines.

Why Go Numerical?

"Analytic solutions are gifts. Numerical solutions are tools."

The previous four sections solved the wave equation by hand — d'Alembert's travelling waves, separation of variables, Fourier series, Bessel functions. Those methods are beautiful, but they only work on clean geometries with clean initial data. Real problems rarely cooperate:

Variable wave speed

Seismic waves slow down in soft sediments and speed up in granite, so $c(x)$ changes with position. Fourier modes do not decouple any more — the analytical machinery falls apart.

Irregular geometry

A guitar body, a violin sound box, a human skull. None of these admit a separable solution in any classical coordinate system.

Nonlinear effects

Shock fronts, breaking surface waves, sonic booms. The PDE picks up terms like $u\, u_x$ and superposition stops working.

Arbitrary initial data

You want to drop a Gaussian pulse, pluck a string with a triangle, or hand-draw your initial condition. There is no clean Fourier series for that.

The cure is to discretize: replace the continuous space-time with a finite grid, replace each derivative with a finite difference, and march the solution forward in time. The result is the Finite Difference Method (FDM) — the oldest, simplest, and arguably most pedagogically valuable way to solve a PDE on a computer.

Intuition: imagine the membrane of a drum painted with a grid. At every intersection we track a single number, the height of that pinpoint at a single instant. We then step time forward in tiny ticks. To know tomorrow's height of a point, we only need today's heights of that point and its immediate neighbors. Local rule, global wave. That is the whole game.

Discretizing Space and Time

Pick a spatial step $\Delta x$ and a temporal step $\Delta t$ . Replace the continuous function $u(x, t)$ by samples on a lattice:

u_i^{n} \;\approx\; u\bigl(i\,\Delta x,\; n\,\Delta t\bigr).

The superscript $n$ is the time row, the subscript $i$ is the spatial column. Think of $\,u_i^n\,$ as a giant matrix where each column is one moving point of the string and each row is one snapshot in time.

Approximating the derivatives

Taylor expansion of $u$ around a grid point gives, after the linear terms cancel, the famous second-order central differences:

Continuous	Discrete approximation	Order of accuracy
∂²u/∂t² at (x_i, t_n)	(u_i^{n+1} − 2 u_i^{n} + u_i^{n-1}) / Δt²	O(Δt²)
∂²u/∂x² at (x_i, t_n)	(u_{i+1}^{n} − 2 u_i^{n} + u_{i-1}^{n}) / Δx²	O(Δx²)
∂u/∂t at (x_i, t_n)	(u_i^{n+1} − u_i^{n-1}) / (2 Δt)	O(Δt²)

Each formula has the same shape: the second derivative of a smooth function at a point is, up to an error of order $h^{2}$ , the difference between the value there and the average of its two neighbors, scaled by $1/h^{2}$ . That is geometry: it measures how much the curve bulges away from the straight line through its neighbors. That bulge is curvature, and curvature is what drives the wave.

The discrete wave equation

Plug the two central differences into $u_{tt} = c^{2} u_{xx}$ at the grid point $(i, n)$ :

\dfrac{u_i^{n+1} - 2u_i^{n} + u_i^{n-1}}{\Delta t^{2}} \;=\; c^{2}\,\dfrac{u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}}{\Delta x^{2}}.

Solve for the only unknown, $u_i^{n+1}$ , and introduce the CFL number $r = c\,\Delta t / \Delta x$ :

\boxed{\;u_i^{n+1} \;=\; 2u_i^{n} - u_i^{n-1} \;+\; r^{2}\,\bigl(u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}\bigr).\;}

This is the only formula you have to remember. Everything else — stability, accuracy, boundary handling — spins out of this one line.

The Five-Point Stencil

The update rule reads from five grid points: the three neighbors at the current time row, the same point one step in the past, and writes one new value in the future. Drawn on a space-time grid it looks like a plus sign:

Two intuitions sit inside this stencil:

Spatial part — curvature drives acceleration. The bracket $u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}$ is the discrete Laplacian: positive when the point is below its neighbors, negative when above. The tension wants to pull the membrane back toward the average, and the wave equation says this force is proportional to acceleration.
Temporal part — momentum. The combination $2u_i^{n} - u_i^{n-1}$ is exactly where the point would be next if there were no force at all (constant velocity extrapolation). It is the discrete version of inertia.

Add inertia to the curvature-driven kick and you get the next time step. That is Newton's second law on a grid.

The First-Step Trick

The stencil needs two previous time rows, but at $n = 0$ we only have one. We need a special recipe for the first step.

Use the initial-velocity data $u_t(x, 0) = g(x)$ and a Taylor expansion in time:

u(x, \Delta t) \;=\; u(x, 0) + \Delta t\,u_t(x, 0) + \tfrac{1}{2}\Delta t^{2}\,u_{tt}(x, 0) + \mathcal{O}(\Delta t^{3}).

Substitute the wave equation $u_{tt}(x, 0) = c^{2}\,u_{xx}(x, 0)$ and replace $u_{xx}$ by central differences:

\boxed{\;u_i^{1} \;=\; u_i^{0} + \Delta t\,g(x_i) + \tfrac{1}{2}\,r^{2}\,\bigl(u_{i+1}^{0} - 2u_i^{0} + u_{i-1}^{0}\bigr).\;}

The half in front of $r^{2}$ is the signature of this special first step — miss it and the whole scheme silently drops to first-order accuracy. Released-from-rest data means $g(x) \equiv 0$ and the middle term vanishes.

Worked Example: Step Through By Hand

Take the classic plucked string setup: $u_{tt} = u_{xx}$ on $[0, 1]$ with $u(x, 0) = \sin(\pi x)$ , $u_t(x, 0) = 0$ and fixed ends. The exact solution is $u(x, t) = \sin(\pi x)\cos(\pi t)$ — the fundamental mode breathing at frequency one. We will compute three time steps by hand and compare to the exact answer.

► Step-by-step numerical walkthrough (click to expand)

Setup. Use a coarse grid so the arithmetic stays readable: $\Delta x = 0.25,\; \Delta t = 0.1$ . Then $r = c\,\Delta t / \Delta x = 0.4$ and $r^{2} = 0.16$ . CFL is satisfied (0.4 ≤ 1), so we expect a stable march.

Initial row $u^{0}_i = \sin(\pi x_i)$ :

i	x_i	u_i^0
0	0.00	0.00000
1	0.25	0.70711
2	0.50	1.00000
3	0.75	0.70711
4	1.00	0.00000

First step: because $g(x) = 0$ we use $u_i^{1} = u_i^{0} + \tfrac{1}{2} r^{2}\bigl(u_{i+1}^{0} - 2u_i^{0} + u_{i-1}^{0}\bigr)$ . For $i=1$ :

u_1^{1} = 0.70711 + 0.08\,(1 - 2\cdot 0.70711 + 0) = 0.70711 - 0.03314 = 0.67397.

For $i=2$ : $u_2^{1} = 1 + 0.08\,(0.70711 - 2 + 0.70711) = 1 - 0.04686 = 0.95314$ . By symmetry $u_3^{1} = u_1^{1} = 0.67397$ . The exact values at $t = 0.1$ are $\sin(\pi x_i)\cos(0.1\pi)$ :

i	u_i^1 (FDM)	Exact	Error
1	0.67397	0.67250	+0.00147
2	0.95314	0.95106	+0.00208
3	0.67397	0.67250	+0.00147

Second step. Now we can use the main rule $u_i^{n+1} = 2u_i^{n} - u_i^{n-1} + r^{2}\,(u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n})$ . For $i = 1$ :

u_1^{2} = 2(0.67397) - 0.70711 + 0.16\,(0.95314 - 2(0.67397) + 0) = 0.64083 - 0.06317 = 0.57766.

i	u_i^2 (FDM)	Exact at t=0.2	Error
1	0.57766	0.57206	+0.00560
2	0.81694	0.80902	+0.00792
3	0.57766	0.57206	+0.00560

Third step:

i	u_i^3 (FDM)	Exact at t=0.3	Error
1	0.42722	0.41563	+0.01159
2	0.60418	0.58779	+0.01639
3	0.42722	0.41563	+0.01159

What we observe. The scheme tracks the exact cosine-breathing motion correctly. The amplitude at $x = 0.5$ drops from 1.00000 to 0.95314 to 0.81694 to 0.60418 — matching the sampling of $\cos(\pi t)$ almost exactly. The error grows roughly linearly with $n$ , which is the expected behaviour for a second-order scheme on a coarse grid. Refining $\Delta x$ by a factor of two (keeping $r$ fixed) would cut every error in this table by a factor of about four.

The CFL Stability Condition

Why $r \le 1$ ? The physical intuition is beautiful: information in the wave equation travels at speed $c$ . In one numerical time step $\Delta t$ , the physical wave moves a distance $c\,\Delta t$ . The numerical stencil gathers information from one grid spacing $\Delta x$ away.

Courant's rule: the numerical "sight" of one cell per step must reach at least as far as the physical wave front travels. Otherwise the scheme is trying to predict a value the wave has already passed through — information arrives that the stencil has not even looked at yet, and the simulation goes berserk.

Mathematically the same condition pops out of von Neumann stability analysis. Substitute the Fourier mode $u_j^{n} = \xi^{n} e^{ikj\Delta x}$ into the update rule. You get a quadratic in the growth factor $\xi$ :

\xi^{2} - 2\bigl(1 - 2r^{2}\sin^{2}(k\Delta x / 2)\bigr)\,\xi + 1 = 0.

For every wavenumber $k$ , both roots have modulus $|\xi| = 1$ exactly when the coefficient stays in $[-2, 2]$ , which forces $r \le 1$ . Equality, the so-called magic CFL number, is even better: it makes the FDM update exactly equal to d'Alembert's solution — no dispersion at all.

Push the slider just past $r = 1$ and the Gaussian pulse explodes within a few dozen steps. Pull it back and watch the wave snap right back into clean propagation. The transition is razor sharp because the underlying inequality is binary — you are either inside the unit circle or you are not.

Boundary Conditions

The interior stencil has the same shape everywhere, but at $i = 0$ and $i = N$ we need to say something about what is beyond the edge. Three flavors cover essentially every classical problem.

Dirichlet (fixed end)

$u(0, t) = u(L, t) = 0$ . Force the boundary values to zero every step. A right-moving pulse hits the wall and bounces back upside down.

Neumann (free end)

$u_x(0, t) = u_x(L, t) = 0$ . Implement by copying the inner neighbor: $u_0^{n+1} = u_1^{n+1}$ . Pulses bounce back right-side up.

Absorbing (open)

Mur first-order: $u_0^{n+1} = u_1^{n} + \tfrac{r-1}{r+1}(u_1^{n+1} - u_0^{n})$ . Lets waves leave the domain. Perfect at $r = 1$ , slightly leaky otherwise.

The physics: a fixed end is a wall — force the displacement to zero, the wave hands its energy back inverted. A free end is a ring on a frictionless rod — no transverse force, so the slope must be flat. An absorbing end is an imaginary infinite extension — we model what a far-field observer would see at the truncation point.

Interactive 1D Wave Simulator

Time to play. The next panel runs the full FDM scheme live in your browser. Switch initial conditions, slide the CFL number into the danger zone, swap boundary conditions, and observe what each knob does to the physics.

Things worth trying:

With sine mode + fixed BC, turn on "show exact". At $r = 1$ the cyan curve sits exactly on top of the yellow dashed line forever — the magic CFL number.
Launch a Gaussian with absorbing BCs. The pulse escapes through the boundaries instead of bouncing.
With two pulses + fixed BCs, watch them collide, pass through each other (linearity!), and rebound off the walls inverted.
Push the CFL slider past 1. Watch the wave turn red and detonate.

Plain Python Implementation

Now that you have the rule in muscle memory, here is the whole solver as roughly thirty lines of NumPy. Read the explanations and feel how each line maps to a physical idea.

Finite-difference wave solver in NumPy

🐍wave_fdm.py

Explanation(18)

Code(48)

1Import numpy

NumPy gives us vectorized array operations. We need it because the FDM update can be written as a single slice expression that runs in C-speed instead of a Python loop.

EXAMPLE

np.array([1,2,3]) + np.array([4,5,6]) -> [5,7,9]

4Function signature: every physical and numerical knob

c is wave speed, L is domain length, T is total simulated time, nx is the number of spatial grid points, r is the CFL number we want to use (r = c·dt/dx). f(x) is the initial displacement profile, g(x) is the initial velocity. boundary picks Dirichlet/Neumann/absorbing.

EXAMPLE

solve_wave_1d(c=2.0, r=0.8, boundary='absorbing')

9Grid spacing dx

We place nx points across [0, L] inclusive, so the gap between neighbors is L/(nx-1). With nx=101 and L=1 we get dx = 0.01.

EXAMPLE

L=1, nx=5 -> dx = 0.25

10Pick dt from r — not the other way around

The classical FDM-wave stability bound is r = c·dt/dx ≤ 1. Instead of choosing dt and praying, we pick r ≤ 1 first and back-solve for dt = r·dx/c. With c=1, dx=0.01, r=0.9 → dt = 0.009.

EXAMPLE

r=1.0 means dt = dx/c (sharpest stable scheme)

11Number of time steps

Total simulated time T divided by dt, rounded up so we never undershoot T. The +1 accounts for storing both endpoints of every interval.

12The spatial grid x[i] = i·dx

linspace gives 101 points from 0 to 1 inclusive. This is the discrete approximation of the continuous spatial axis.

EXAMPLE

np.linspace(0,1,5) -> [0, 0.25, 0.5, 0.75, 1.0]

13Cache r² (used inside the inner loop)

The Laplacian coefficient in the update rule is r², not r. Pre-computing avoids redoing this multiplication every step.

15Three time layers — the only ones we ever need

The FDM update u(n+1) depends only on u(n) and u(n-1), so we keep three arrays: u_prev, u_curr, u_next. No need to store the whole history (we do here for plotting, but the algorithm itself is 3-layer).

16u_curr — preallocated buffer

np.empty_like skips zero-initialization; we will overwrite every interior element in the next few lines.

19The first step uses a SECOND-ORDER Taylor trick

We cannot use the main rule for n=0 because we have no u(-1). Taylor expanding u(x, dt) around t=0 and using u_tt = c² u_xx gives u¹ ≈ u⁰ + dt·g(x) + ½·r²·(u⁰_{i+1} − 2u⁰_i + u⁰_{i−1}). This special formula keeps the scheme second-order accurate from step 1.

EXAMPLE

If g(x)=0 (released from rest), the dt·g term drops out.

22Fix the boundaries at zero for step 1

Dirichlet conditions u(0,t) = u(L,t) = 0 must hold for all n, including the first step. We set them explicitly so the boundary values are not garbage from np.empty_like.

26Main time-stepping loop

From step 1 onwards, the standard 3-level formula applies. Each iteration computes one new time row from the two previous rows, then rotates the references.

27THE FDM update — vectorized in one line

This is the heart of the method. u_curr[2:] − 2·u_curr[1:-1] + u_curr[:-2] is the discrete Laplacian as a NumPy slice. Combined with 2·u_curr − u_prev it gives u_next at every interior point in one shot.

EXAMPLE

If u_curr = [0, 0.67, 0.95, 0.67, 0] and u_prev = [0, 0.71, 1, 0.71, 0], with r²=0.16, the interior of u_next is [0.578, 0.817, 0.578]

30Dirichlet (fixed-end) boundary

Physical picture: clamped string. Mathematical picture: u = 0 at both ends. Pulses reflect with a sign flip — the inverted echo you hear off a hard wall.

33Neumann (free-end) boundary

Physical picture: the end of the string is on a frictionless ring. Mathematical picture: ∂u/∂x = 0, which in FDM means the ghost point equals its neighbor. Pulses reflect WITHOUT a sign flip.

36Mur first-order absorbing boundary

Open-domain trick. For a right-going wave at the right edge, u_t + c·u_x = 0. Discretizing gives a relation that absorbs the wave instead of reflecting it. Perfect when r = 1, increasingly leaky as r drifts away from 1.

EXAMPLE

When r=1 the formula collapses to u_next[N-1] = u_curr[N-2]: the right boundary just copies its inner neighbor (perfectly transparent).

39Buffer rotation — no allocation

Instead of copying arrays we rotate the three references: tomorrow becomes today, today becomes yesterday, and yesterday's buffer is reused for tomorrow. This is the secret to fast time-stepping in any explicit scheme.

45Sanity check the result

For a sin(πx) initial condition with c=1 and fixed ends, the exact solution is sin(πx)·cos(πt), so max|u| should stay ≈ 1 forever. If you see exponential growth, r > 1 sneaked in (CFL violated).

30 lines without explanation

1import numpy as np
2import matplotlib.pyplot as plt
3
4def solve_wave_1d(c=1.0, L=1.0, T=2.0, nx=101, r=0.9,
5                  f=lambda x: np.sin(np.pi * x),
6                  g=lambda x: np.zeros_like(x),
7                  boundary="fixed"):
8    """Finite-difference solver for u_tt = c^2 u_xx on [0, L] x [0, T]."""
9    dx = L / (nx - 1)
10    dt = r * dx / c                    # CFL: r = c*dt/dx <= 1
11    nt = int(np.ceil(T / dt)) + 1
12    x  = np.linspace(0.0, L, nx)
13    r2 = r * r
14
15    u_prev = f(x).astype(float)        # u at time level n-1
16    u_curr = np.empty_like(u_prev)     # u at time level n
17    u_next = np.empty_like(u_prev)     # u at time level n+1
18
19    # Special first step uses the initial velocity g(x)
20    u_curr[1:-1] = (u_prev[1:-1]
21        + dt * g(x[1:-1])
22        + 0.5 * r2 * (u_prev[2:] - 2 * u_prev[1:-1] + u_prev[:-2]))
23    u_curr[0] = u_curr[-1] = 0.0
24
25    history = [u_prev.copy(), u_curr.copy()]
26
27    for n in range(1, nt - 1):
28        u_next[1:-1] = (2 * u_curr[1:-1] - u_prev[1:-1]
29            + r2 * (u_curr[2:] - 2 * u_curr[1:-1] + u_curr[:-2]))
30
31        if boundary == "fixed":        # Dirichlet
32            u_next[0]  = 0.0
33            u_next[-1] = 0.0
34        elif boundary == "free":       # Neumann (zero-derivative)
35            u_next[0]  = u_next[1]
36            u_next[-1] = u_next[-2]
37        elif boundary == "absorbing":  # Mur 1st-order open BC
38            u_next[0]  = u_curr[1]  + ((r - 1) / (r + 1)) * (u_next[1]  - u_curr[0])
39            u_next[-1] = u_curr[-2] + ((r - 1) / (r + 1)) * (u_next[-2] - u_curr[-1])
40
41        u_prev, u_curr, u_next = u_curr, u_next, u_prev
42        history.append(u_curr.copy())
43
44    return x, np.array(history), dt
45
46x, U, dt = solve_wave_1d(nx=101, r=0.9)
47print(f"steps={U.shape[0]}, dt={dt:.4f}")
48print(f"max|u| over time = {np.max(np.abs(U)):.4f}")

Reading the algorithm in code form makes the trade-off explicit: we traded a clean closed-form Fourier series for a vectorized 3-line update we can run on anything — arbitrary c(x), arbitrary domain shapes, arbitrary forcing.

Vectorized PyTorch Implementation

The NumPy version maxes out a single CPU core. For 2D / 3D wave problems, batch simulation, or for plugging the solver into a neural network, we rewrite the same code in PyTorch. The pleasant surprise: only the import line really changes.

GPU-ready wave solver in PyTorch + a Conv1d view

🐍wave_torch.py

Explanation(10)

Code(41)

3Disable autograd

We are doing forward simulation, not training, so we tag the whole solver with no_grad. This frees PyTorch from building a computational graph and roughly doubles speed and halves memory.

EXAMPLE

Drop the decorator if you want gradients of T(u₀) for adjoint optimization.

4Same API, different backend

Notice how the function signature is essentially the NumPy one. Same physical knobs (c, L, T) and same numerical knobs (nx, r). The mathematics did not change — only the array library.

6Device autodetect

PyTorch chooses CUDA if a GPU is visible, otherwise CPU. On an A100 a 1-million-point wave simulation runs ~50× faster than the NumPy version.

11linspace on GPU

torch.linspace(..., device='cuda') allocates the grid directly in GPU memory; no host-to-device copy is needed afterwards.

12Initial condition as a tensor

torch.sin works elementwise on the tensor. clone() detaches the underlying storage so we don't accidentally alias the grid x.

EXAMPLE

u_prev = exp(-200·(x-0.5)²) for a pulse instead of a sine

17Vectorized FDM (no Python loop in space)

Exact same slice notation as NumPy. PyTorch dispatches it to a fused CUDA kernel under the hood — the loop inside is in C++/CUDA, not Python.

22Time loop — only Python overhead is per-step, not per-cell

Each iteration of this loop launches one CUDA kernel. With nx=2049 the kernel is short, so the loop is launch-bound. Batching multiple simulations along a new axis (e.g., u shape [B, nx]) amortizes the launch cost.

29Rotation of references — zero-copy

On a GPU, swapping Python references is free; no tensor data moves. This is the right way to ping-pong buffers for any explicit time integrator.

32Conv1d view of the same operation

The discrete Laplacian [1, -2, 1] is just a 1-D convolution kernel. Re-expressing the update as conv1d makes the FDM scheme look like the simplest possible CNN — which is exactly the framing modern PINNs and neural-operator papers use to bridge classical PDE solvers and deep learning.

EXAMPLE

torch.nn.functional.conv1d(u.view(1,1,-1), torch.tensor([[[1,-2,1]]]))

34Why this view matters

Once the FDM step is a Conv1d, you can stack it with learnable layers (residual corrections from data), batch many simulations at once, or replace the kernel with a learned one. This is the core trick behind 'differentiable solvers' for PDEs.

31 lines without explanation

1import torch
2
3@torch.no_grad()
4def solve_wave_1d_torch(c=1.0, L=1.0, T=2.0, nx=2049, r=0.9,
5                         device="cuda" if torch.cuda.is_available() else "cpu",
6                         dtype=torch.float32):
7    """Same FDM scheme as the NumPy version, but as a GPU-friendly torch.Tensor."""
8    dx = L / (nx - 1)
9    dt = r * dx / c
10    nt = int((T / dt)) + 1
11    r2 = r * r
12
13    x = torch.linspace(0.0, L, nx, device=device, dtype=dtype)
14    u_prev = torch.sin(torch.pi * x).clone()
15    u_curr = torch.empty_like(u_prev)
16    u_next = torch.empty_like(u_prev)
17
18    # First step (g(x)=0 here, so the dt*g term vanishes)
19    u_curr[1:-1] = (u_prev[1:-1]
20                    + 0.5 * r2 * (u_prev[2:] - 2 * u_prev[1:-1] + u_prev[:-2]))
21    u_curr[0] = u_curr[-1] = 0.0
22
23    for _ in range(1, nt - 1):
24        u_next[1:-1] = (2 * u_curr[1:-1] - u_prev[1:-1]
25                        + r2 * (u_curr[2:] - 2 * u_curr[1:-1] + u_curr[:-2]))
26        u_next[0] = u_next[-1] = 0.0
27        u_prev, u_curr, u_next = u_curr, u_next, u_prev
28
29    return x.cpu(), u_curr.cpu()
30
31# Conv1d view — the Laplacian is literally a kernel convolution
32def laplacian_kernel(device, dtype):
33    return torch.tensor([[[1.0, -2.0, 1.0]]], device=device, dtype=dtype)
34
35def step_conv(u_prev, u_curr, kernel, r2):
36    """One time step expressed as a 1-D convolution. Drop-in batched version."""
37    lap = torch.nn.functional.conv1d(u_curr.view(1, 1, -1), kernel).view(-1)
38    u_next = 2 * u_curr[1:-1] - u_prev[1:-1] + r2 * lap
39    out = torch.zeros_like(u_curr)
40    out[1:-1] = u_next
41    return out

That last function is the bridge to modern research. Once your FDM time step is a $\texttt{conv1d}$ , the boundary between "classical PDE solver" and "learnable neural network" collapses. Differentiable solvers, neural operators, and physics-informed networks all start exactly here.

Accuracy, Dispersion, and Error

The scheme is second-order accurate in both $\Delta x$ and $\Delta t$ : halving the grid roughly quarters the error.

Numerical dispersion

A subtler issue is numerical dispersion. The exact wave equation has every wavenumber propagating at the same speed $c$ . The discrete scheme has a slightly wavenumber-dependent speed:

c_{\text{num}}(k) \;=\; \dfrac{2}{k\,\Delta t}\,\arcsin\bigl(r\,\sin(k\,\Delta x / 2)\bigr).

Long-wavelength components (small $k$ ) travel at almost $c$ ; short ones lag. The net effect is that a sharp Gaussian pulse spreads out slightly with time — energy at high $k$ gets left behind. At the magic value $r = 1$ the formula collapses to $c_{\text{num}} = c$ identically — no dispersion, full agreement with d'Alembert.

How to keep the error small in practice

Pick the finest $\Delta x$ your memory allows.
Pick the largest $\Delta t$ that still satisfies $r \le 1$ . Smaller $\Delta t$ is not always better — it actually increases dispersion.
Resolve the shortest physical wavelength with at least 10–20 grid points.
If you need very long-time integration, switch to a higher-order scheme (Crank–Nicolson, spectral, or DG).

Beyond Basic FDM

Method	Strength	Use it when
Explicit FDM (this section)	Simple, fast, parallelizable	Linear waves on regular grids
Crank-Nicolson / implicit	Unconditionally stable, A-stable	Stiff or diffusive problems
Pseudo-spectral	Spectral (exponential) accuracy	Periodic domains, smooth solutions
Finite-element method	Handles complex geometries	Irregular domains, structural acoustics
Discontinuous Galerkin	High order + shock capturing	Nonlinear / discontinuous data
PINN / neural operator	Learns from data + physics	Inverse problems, parameter sweeps

Every one of these methods sits on the same conceptual chassis as the five-point stencil: replace continuous operators with discrete proxies, respect stability, and march in time. Master the simplest one and the rest will read like dialect.

Connection to Machine Learning

Modern deep-learning research has rediscovered FDM in three flavours, all of which lean on the equation you just implemented:

Differentiable solvers

Wrap the FDM step in PyTorch and you can back-propagate through time. Used in adjoint optimization for waveform inversion in seismology.

Physics-informed neural networks

PINNs minimize the residual $u_{tt} - c^{2} u_{xx}$ on collocation points — the same residual the FDM stencil discretizes.

Neural operators

Fourier and DeepONet operators learn the solution map $u_0 \mapsto u(\cdot, t)$ . FDM provides the ground truth they are trained against.

Common Pitfalls

Violating CFL by a hair. Set $r = 1.0001$ by accident in a test and the simulation looks fine for 50 steps before exploding. Always pin $r$ to at most 0.99 in production code.
Forgetting the first-step factor of one-half. Using the standard rule at $n = 0$ drops the scheme to first-order overall and the phase error grows visibly fast.
Mixing units. $c$ , $\Delta x$ , and $\Delta t$ must be in consistent units, or your CFL number lies to you. Common SI bug: $c$ in m/s, $\Delta x$ in cm.
Reflections from open boundaries. Plain Dirichlet turns "outflow" into a wall. Use Mur or PML if you do not want echoes.
Under-resolved wavelengths. Fewer than ~10 grid points per wavelength makes dispersion catastrophic. Refine the grid or use a higher-order scheme.

Summary

The wave equation, once reserved for clean geometries and clever substitutions, is now in your hands as a thirty-line solver. The ingredients:

Discretize on a regular grid; approximate $u_{tt}$ and $u_{xx}$ by central differences.
Solve for the future value to get the five-point stencil $u_i^{n+1} = 2u_i^{n} - u_i^{n-1} + r^{2}\Delta_x^{2}u_i^{n}$ .
Use the half-step Taylor trick for the first iteration so the scheme stays second-order accurate.
Respect the CFL bound $r = c\,\Delta t / \Delta x \le 1$ . Magic accuracy at $r = 1$ ; explosion just past it.
Pick a boundary condition that matches your physics: fixed for clamped, Neumann for free, Mur for open.
Vectorize in NumPy for clarity, port to PyTorch for speed and differentiability, and a Conv1d view drops you straight into modern ML research.

The next sections take this scheme into the wild: musical instruments, electromagnetic waves, and seismic propagation. With the FDM solver in hand, every one of those becomes a small modification, not a new chapter of mathematics.