Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will be able to:

Apply separation of variables to solve the heat equation on a finite rod with Dirichlet boundary conditions
Derive the spatial eigenvalue problem and identify the eigenvalues $\lambda_n = (n\pi/L)^2$ and eigenfunctions $\sin(n\pi x/L)$
Construct the general Fourier series solution and compute Fourier coefficients from initial conditions
Interpret the physical meaning of mode decay and explain why higher modes decay faster
Implement the Fourier series solution in Python and visualize the evolution of temperature
Connect the heat equation solution to machine learning concepts including score matching and diffusion models

The Big Picture: From Physics to Fourier

"The profound study of nature is the most fertile source of mathematical discoveries." — Joseph Fourier, 1822

In the previous section, we derived the heat equation from first principles. Now we face a fundamental challenge: how do we solve this partial differential equation? The answer reveals one of the most beautiful connections in mathematics — the link between differential equations and Fourier series.

Joseph Fourier discovered that any "reasonable" function can be expressed as an infinite sum of sines and cosines. When applied to the heat equation, this leads to an elegant solution: each Fourier mode evolves independently, decaying exponentially at a rate proportional to the square of its frequency.

The Central Insight

The heat equation on a finite rod can be solved exactly using separation of variables. The solution is a Fourier sine series where each mode decays exponentially:

u(x,t) = \sum_{n=1}^{\infty} B_n \sin\left(\frac{n\pi x}{L}\right) e^{-\alpha(n\pi/L)^2 t}

The coefficients $B_n$ are determined by projecting the initial condition onto the eigenfunctions.

Why This Matters for Machine Learning

The heat equation solution has profound connections to modern deep learning:

Diffusion models (DALL-E, Stable Diffusion) reverse a heat diffusion process to generate images from noise
Score matching learns the gradient of the log probability, which evolves according to a Fokker-Planck equation
Graph neural networks propagate information using discrete Laplacians — discretized heat equations on graphs
Regularization in neural networks often corresponds to adding diffusion terms that smooth the loss landscape

Problem Setup: The Finite Rod

Consider a thin rod of length $L$ with its ends held at a fixed temperature (which we take to be zero for simplicity). The temperature distribution $u(x,t)$ along the rod satisfies:

The Heat Equation Boundary Value Problem

PDE:

\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2}, \quad 0 < x < L, \quad t > 0

BCs:

u(0, t) = 0, \quad u(L, t) = 0

(Dirichlet conditions)

IC:

u(x, 0) = f(x)

(Initial temperature)

Understanding the Boundary Conditions

The Dirichlet boundary conditions $u(0,t) = u(L,t) = 0$ represent holding both ends of the rod at zero temperature. Physically, this could be achieved by immersing the endpoints in ice baths.

Boundary Type	Condition	Physical Meaning
Dirichlet	u = 0 (or constant)	Fixed temperature (e.g., contact with reservoir)
Neumann	∂u/∂x = 0	Insulated end (no heat flux)
Robin	∂u/∂x + hu = 0	Convective cooling (Newton's law)

Why Zero?

Setting the boundary values to zero is not a limitation. If the boundary temperatures are $T_0$ and $T_L$ , we can define a new variable $v = u - u_{ss}$ where $u_{ss}$ is the steady-state solution. Then $v$ satisfies the heat equation with zero boundary conditions.

Separation of Variables

The method of separation of variables is one of the most powerful techniques for solving linear PDEs. The key idea is to assume the solution can be written as a product of single-variable functions:

u(x, t) = X(x) \cdot T(t)

This ansatz transforms the PDE into two ordinary differential equations that can be solved independently. Let us walk through the derivation step by step.

🧮Separation of Variables: Step-by-Step

Walk through the classical method for solving the heat equation

Step 1: The Problem Statement

We want to solve the 1D heat equation on a finite rod of length L with homogeneous Dirichlet boundary conditions:

PDE: The heat equation describes how temperature evolves
BCs: Both ends are held at zero temperature
IC: Initial temperature distribution is f(x)

∂u/∂t = α ∂²u/∂x², u(0,t) = u(L,t) = 0, u(x,0) = f(x)

The boundary conditions will constrain which spatial functions are allowed.

1 of 10

Why Separation Works

The success of separation of variables relies on three key properties:

Linearity: The heat equation is linear, so any linear combination of solutions is also a solution. This allows us to superpose infinitely many separable solutions.
Homogeneous boundary conditions: The zero boundary conditions are "homogeneous" — they don't introduce additional terms. Non-homogeneous conditions require extra steps.
Completeness of eigenfunctions: The sine functions $\sin(n\pi x/L)$ form a complete orthonormal basis for functions on $[0, L]$ that vanish at the endpoints.

The Spatial Eigenvalue Problem

When we separate variables, the spatial part becomes an eigenvalue problem:

\frac{d^2 X}{dx^2} + \lambda X = 0, \quad X(0) = X(L) = 0

This is a Sturm-Liouville problem. The boundary conditions constrain which values of $\lambda$ are allowed.

Finding the Eigenvalues

The general solution to $X'' + \lambda X = 0$ depends on the sign of $\lambda$ :

Case	General Solution	Boundary Conditions
λ < 0	X = Ae^{√\|λ\|x} + Be^{-√\|λ\|x}	Only X ≡ 0 satisfies both BCs
λ = 0	X = Ax + B	Only X ≡ 0 satisfies both BCs
λ > 0	X = A cos(√λ x) + B sin(√λ x)	Non-trivial solutions exist!

For $\lambda > 0$ , applying $X(0) = 0$ gives $A = 0$ . Then $X(L) = B\sin(\sqrt{\lambda}L) = 0$ requires:

\sqrt{\lambda} L = n\pi \quad \Rightarrow \quad \lambda_n = \left(\frac{n\pi}{L}\right)^2, \quad n = 1, 2, 3, \ldots

Eigenvalues

\lambda_n = \frac{n^2 \pi^2}{L^2}

The eigenvalues grow as $n^2$ . They determine both the spatial frequency and the temporal decay rate.

Eigenfunctions

X_n(x) = \sin\left(\frac{n\pi x}{L}\right)

These are the "normal modes" — each satisfies the boundary conditions independently.

Orthogonality

The eigenfunctions are orthogonal on $[0, L]$ :

\int_0^L \sin\left(\frac{m\pi x}{L}\right) \sin\left(\frac{n\pi x}{L}\right) dx = \begin{cases} L/2 & \text{if } m = n \\ 0 & \text{if } m \neq n \end{cases}

This orthogonality is essential for computing the Fourier coefficients.

Fourier Series Solution

Each separable solution $X_n(x)T_n(t)$ satisfies the PDE and boundary conditions. By linearity, any linear combination is also a solution:

The Complete Solution

u(x, t) = \sum_{n=1}^{\infty} B_n \sin\left(\frac{n\pi x}{L}\right) e^{-\alpha (n\pi/L)^2 t}

Each mode $n$ oscillates spatially with wavelength $2L/n$ and decays exponentially with rate $\gamma_n = \alpha(n\pi/L)^2$ .

Finding the Coefficients

The coefficients $B_n$ are determined by the initial condition $u(x, 0) = f(x)$ . At $t = 0$ :

f(x) = \sum_{n=1}^{\infty} B_n \sin\left(\frac{n\pi x}{L}\right)

This is the Fourier sine series of $f(x)$ . Using orthogonality, we can extract each coefficient:

Fourier Coefficient Formula

B_n = \frac{2}{L} \int_0^L f(x) \sin\left(\frac{n\pi x}{L}\right) dx

This computes the "projection" of $f(x)$ onto the $n$ -th eigenfunction.

Physical Interpretation

The solution tells us that heat diffusion can be understood as the independent evolution of infinitely many "modes," each with its own characteristic spatial pattern and decay rate.

🔥Heat Equation on a Finite Rod

Solution using Fourier series: u(x,t) = Σ B_n sin(nπx/L) e^{-α(nπ/L)²t}

Time: t = 0.000s|Decay rate (n=1): λ₁²α = 0.0987

Thermal Diffusivity (α): 0.010

Higher α = faster heat diffusion

Fourier Modes: 20

More modes = better approximation of initial condition

Initial Condition f(x)

u(x, t) = Σ_n=1^∞ B_n sin(nπx/L) e^{-α(nπ/L)²t}

Each mode decays exponentially with rate proportional to n² — higher modes decay faster

Key Observations

• Boundary conditions: u(0,t) = u(L,t) = 0 (Dirichlet, fixed ends)
• Mode decay: Mode n decays as e^{-αn²π²t/L²} — higher modes vanish first
• Long-time behavior: Solution approaches zero as t → ∞
• Smoothing: Sharp features (high-frequency modes) smooth out quickly

Key Physical Insights

Smoothing of Sharp Features

Sharp discontinuities (like step functions) contain high-frequency modes (large $n$ ). These decay fastest, so sharp features smooth out rapidly.

Long-Time Behavior

As $t \to \infty$ , all modes decay and $u(x,t) \to 0$ . Heat escapes through the boundaries, which are held at zero temperature.

Dominant Mode

After sufficient time, only the $n = 1$ mode remains significant. The solution approaches $B_1 \sin(\pi x/L) e^{-\alpha\pi^2 t/L^2}$ .

Characteristic Time Scale

The characteristic time for mode $n = 1$ is $\tau = L^2/(\alpha\pi^2)$ . Larger rods or lower diffusivity means slower equilibration.

Mode Decay Visualization

The decay rate of each Fourier mode is $\gamma_n = \alpha(n\pi/L)^2$ . Since this grows as $n^2$ , higher modes decay much faster than lower modes.

📉Fourier Mode Decay Rates

Each mode n decays as e^{-α(nπ/L)²t} — higher modes decay faster

Select Modes

Time: t = 0.00

α = 0.010

Half-Life Times

n=1: 7.02s

n=2: 1.76s

n=3: 0.78s

Decay Rate

γ_n = α(nπ/L)² = αn²π²/L²

Half-Life

t_1/2 = ln(2)/γ_n ∝ 1/n²

Mode n=1 vs n=k

Mode k decays k² times faster

Physical Interpretation

Higher modes = shorter wavelengths = sharper features. These decay fastest because heat can flow quickly over short distances. This is why sharp temperature discontinuities smooth out rapidly, while broad temperature variations persist longer. The n=2 mode decays 4× faster than n=1, n=3 decays 9× faster, and so on.

The n² Rule

Mode $n$ decays $n^2$ times faster than mode $n = 1$ . This means:

Mode 2 decays 4× faster than mode 1
Mode 3 decays 9× faster than mode 1
Mode 10 decays 100× faster than mode 1

This rapid decay of high frequencies is why heat diffusion is such an effective "smoothing" process.

Special Cases

Case 1: Single Sine Mode Initial Condition

If $f(x) = \sin(m\pi x/L)$ for some integer $m$ , then by orthogonality:

B_n = \begin{cases} 1 & \text{if } n = m \\ 0 & \text{otherwise} \end{cases}

The solution is simply the single decaying mode:

u(x, t) = \sin\left(\frac{m\pi x}{L}\right) e^{-\alpha(m\pi/L)^2 t}

Case 2: Constant Initial Temperature

If $f(x) = T_0$ (constant), the Fourier coefficients are:

B_n = \frac{2T_0}{L} \int_0^L \sin\left(\frac{n\pi x}{L}\right) dx = \frac{2T_0}{n\pi}(1 - \cos(n\pi)) = \begin{cases} \frac{4T_0}{n\pi} & n \text{ odd} \\ 0 & n \text{ even} \end{cases}

Gibbs Phenomenon

When approximating a discontinuous function (like a step) with a Fourier series, the partial sums overshoot near the discontinuity by about 9%. This is the Gibbs phenomenon. As we add more modes, the overshoot moves closer to the discontinuity but never disappears.

Case 3: Point Source (Delta Function)

If $f(x) = \delta(x - x_0)$ (heat concentrated at a single point $x_0$ ):

B_n = \frac{2}{L} \sin\left(\frac{n\pi x_0}{L}\right)

The solution spreads out from the initial point, demonstrating diffusion.

Machine Learning Connections

The Fourier series solution of the heat equation provides deep insights into modern machine learning methods.

1. Diffusion Models (Score-Based Generative Models)

Diffusion models like DALL-E 3 and Stable Diffusion work by:

Forward process: Gradually add noise to data (like heat diffusion)
Reverse process: Learn to denoise (reverse diffusion)

The forward process follows a stochastic differential equation closely related to the heat equation:

dx = -\frac{1}{2}\beta(t)x \, dt + \sqrt{\beta(t)} \, dW

The "score" $\nabla_x \log p(x,t)$ that diffusion models learn is the gradient of the log probability density, which satisfies a Fokker-Planck equation — a generalization of the heat equation.

2. Graph Neural Networks and the Graph Laplacian

Many GNN architectures implement discrete heat diffusion on graphs. The graph Laplacian $L = D - A$ plays the role of the continuous Laplacian $\nabla^2$ , and the update rule:

h_i^{(t+1)} = h_i^{(t)} + \sum_{j \in \mathcal{N}(i)} (h_j^{(t)} - h_i^{(t)})

is the discrete analog of $\partial u/\partial t = \nabla^2 u$ .

3. Spectral Methods in Deep Learning

The Fourier decomposition we used corresponds to the spectral decomposition of the Laplacian operator. This connects to:

Spectral graph convolutions: Apply filters in the eigenspace of the graph Laplacian
Fourier neural operators: Learn operators in Fourier space for PDE solving
Positional encodings: Use sinusoidal functions (like our eigenfunctions!) to encode position in Transformers

Fourier Features in ML

Random Fourier features and spectral normalization both leverage the properties of Fourier representations. The smoothing behavior of the heat equation (high-frequency damping) is related to regularization and generalization in neural networks.

Python Implementation

Let us implement the Fourier series solution in Python. This code demonstrates:

Computing Fourier coefficients by numerical integration
Evaluating the series solution at any point (x, t)
Visualizing the evolution of temperature over time
Analyzing mode decay rates

Fourier Series Solution of Heat Equation on a Finite Rod

🐍heat_equation_rod.py

Explanation(6)

Code(113)

4Fourier Coefficient Formula

The Fourier sine coefficient Bₙ is the projection of f(x) onto the n-th eigenfunction sin(nπx/L). This uses the orthogonality of sine functions on [0, L].

14Numerical Integration

We use the trapezoidal rule to approximate the integral. For smooth functions, this gives O(dx²) accuracy.

18Fourier Series Solution

Each term in the sum is a separable solution: spatial eigenfunction × temporal decay. The decay rate is proportional to n², so higher modes vanish faster.

27Mode-by-Mode Summation

We sum contributions from each Fourier mode. In practice, we truncate at N modes. The error decreases as N increases, with higher modes contributing less at later times.

62Computing All Coefficients

We precompute all Fourier coefficients once. This is the 'expensive' step, but it only needs to be done once per initial condition.

94Decay Rate Analysis

The decay rate γₙ = α(nπ/L)² grows as n². Mode n=5 decays 25× faster than mode n=1. This explains the rapid smoothing of sharp features.

107 lines without explanation

1import numpy as np
2import matplotlib.pyplot as plt
3from matplotlib.animation import FuncAnimation
4
5def fourier_coefficient(n, f, L, num_points=1000):
6    """
7    Compute Fourier sine coefficient B_n for the initial condition f(x).
8
9    B_n = (2/L) * integral_0^L f(x) * sin(n*pi*x/L) dx
10
11    This projects f(x) onto the n-th eigenfunction sin(n*pi*x/L).
12    """
13    x = np.linspace(0, L, num_points)
14    dx = L / (num_points - 1)
15    # Numerical integration using trapezoidal rule
16    integrand = f(x) * np.sin(n * np.pi * x / L)
17    return (2 / L) * np.trapz(integrand, dx=dx)
18
19def heat_solution_fourier(x, t, alpha, L, coefficients, num_modes):
20    """
21    Compute heat equation solution using Fourier series.
22
23    u(x,t) = sum_{n=1}^{N} B_n * sin(n*pi*x/L) * exp(-alpha*(n*pi/L)^2*t)
24
25    Each mode decays exponentially with rate proportional to n^2.
26    """
27    u = np.zeros_like(x)
28    for n in range(1, num_modes + 1):
29        B_n = coefficients[n - 1]
30        lambda_n = n * np.pi / L
31        # Spatial mode * temporal decay
32        u += B_n * np.sin(lambda_n * x) * np.exp(-alpha * lambda_n**2 * t)
33    return u
34
35def solve_heat_equation_rod(L=1.0, alpha=0.01, T_final=2.0,
36                            num_modes=50, initial_condition='step'):
37    """
38    Solve the heat equation on a finite rod [0, L] with:
39    - Dirichlet BCs: u(0,t) = u(L,t) = 0
40    - Initial condition: u(x,0) = f(x)
41
42    Returns arrays for animation.
43    """
44    # Define initial condition
45    if initial_condition == 'step':
46        f = lambda x: np.where((x > 0.25*L) & (x < 0.75*L), 1.0, 0.0)
47    elif initial_condition == 'sine':
48        f = lambda x: np.sin(np.pi * x / L)
49    elif initial_condition == 'triangle':
50        f = lambda x: np.where(x < L/2, 2*x/L, 2*(1 - x/L))
51    elif initial_condition == 'gaussian':
52        f = lambda x: np.exp(-50 * (x - L/2)**2)
53    else:
54        raise ValueError(f"Unknown initial condition: {initial_condition}")
55
56    # Compute Fourier coefficients
57    print(f"Computing {num_modes} Fourier coefficients...")
58    coefficients = [fourier_coefficient(n, f, L) for n in range(1, num_modes + 1)]
59
60    # Display first few coefficients
61    print("First 5 coefficients:")
62    for n in range(1, min(6, num_modes + 1)):
63        print(f"  B_{n} = {coefficients[n-1]:.6f}")
64
65    # Spatial grid
66    x = np.linspace(0, L, 200)
67
68    # Time steps for animation
69    times = np.linspace(0, T_final, 100)
70
71    # Compute solution at each time
72    solutions = []
73    for t in times:
74        u = heat_solution_fourier(x, t, alpha, L, coefficients, num_modes)
75        solutions.append(u)
76
77    return x, times, solutions, f(x)
78
79# Solve for step function initial condition
80x, times, solutions, initial = solve_heat_equation_rod(
81    initial_condition='step',
82    num_modes=50,
83    alpha=0.01
84)
85
86# Plot snapshots at different times
87fig, axes = plt.subplots(2, 3, figsize=(14, 8))
88time_indices = [0, 10, 25, 50, 75, 99]
89
90for ax, idx in zip(axes.flat, time_indices):
91    ax.fill_between(x, solutions[idx], alpha=0.3, color='red')
92    ax.plot(x, solutions[idx], 'r-', linewidth=2)
93    ax.plot(x, initial, 'k--', alpha=0.3, linewidth=1, label='Initial')
94    ax.set_xlim(0, 1)
95    ax.set_ylim(-0.1, 1.1)
96    ax.set_xlabel('x')
97    ax.set_ylabel('u(x,t)')
98    ax.set_title(f't = {times[idx]:.3f}')
99    ax.grid(True, alpha=0.3)
100
101plt.suptitle('Heat Equation on Finite Rod: Step Function Initial Condition', fontsize=14)
102plt.tight_layout()
103plt.show()
104
105# Mode decay analysis
106print("\n--- Mode Decay Analysis ---")
107alpha = 0.01
108L = 1.0
109for n in [1, 2, 3, 4, 5]:
110    lambda_n = n * np.pi / L
111    decay_rate = alpha * lambda_n**2
112    half_life = np.log(2) / decay_rate
113    print(f"Mode n={n}: decay_rate = {decay_rate:.4f}, half-life = {half_life:.3f}s")

Common Pitfalls

Truncation Error

In practice, we truncate the Fourier series at $N$ modes. For discontinuous initial conditions, many modes are needed to accurately represent the initial state. However, at later times fewer modes suffice since high modes decay quickly.

Gibbs Phenomenon at t = 0

Near discontinuities in $f(x)$ , the partial Fourier sums oscillate and overshoot. This is not a bug — it's a fundamental feature of Fourier series. The overshoot smooths out immediately as $t > 0$ .

Confusing Boundary Condition Types

The eigenvalue problem changes with boundary conditions:

Dirichlet (u = 0): Sine functions
Neumann (∂u/∂x = 0): Cosine functions
Periodic: Both sines and cosines

Using the wrong eigenfunctions leads to solutions that don't satisfy the boundary conditions!

Numerical Stability

The analytical Fourier solution is unconditionally stable — no CFL condition needed! This is an advantage over finite difference methods. However, computing many Fourier coefficients can be expensive.

Test Your Understanding

📝Test Your Understanding

Question 1 of 8

Why do we use sine functions (not cosine) as the spatial eigenfunctions for the heat equation on a rod with u(0,t) = u(L,t) = 0?

Summary

In this section, we developed the complete analytical solution to the heat equation on a finite rod with Dirichlet boundary conditions. The solution beautifully illustrates the connection between PDEs, eigenvalue problems, and Fourier analysis.

Key Equations

Name	Formula
Eigenvalues	λₙ = (nπ/L)²
Eigenfunctions	Xₙ(x) = sin(nπx/L)
Temporal decay	Tₙ(t) = exp(-αλₙt)
Fourier coefficient	Bₙ = (2/L)∫f(x)sin(nπx/L)dx
Full solution	u(x,t) = ΣBₙsin(nπx/L)exp(-αn²π²t/L²)
Mode decay rate	γₙ = αn²π²/L² (grows as n²)

Key Takeaways

Separation of variables transforms the heat equation PDE into two ODEs — one spatial and one temporal
The spatial eigenvalue problem yields eigenvalues $\lambda_n = (n\pi/L)^2$ and eigenfunctions $\sin(n\pi x/L)$
The solution is a Fourier sine series with time-dependent coefficients that decay exponentially
Higher modes decay faster (rate ∝ n²) — this is why sharp features smooth out quickly
The Fourier coefficients are found by projecting the initial condition onto the eigenfunctions using orthogonality
As $t \to \infty$ , the solution approaches zero (all heat escapes through the boundaries)
These concepts directly connect to diffusion models, GNNs, and spectral methods in machine learning

The Essence of Heat Diffusion:

"Heat diffusion is the independent decay of infinitely many Fourier modes, each smoothing the temperature distribution at its own characteristic rate."

Coming Next: In the next section, we'll explore Fourier Series Solutions in more depth, examining the properties of Fourier series, convergence, and applications to more complex boundary conditions.