Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will be able to:

Derive the 2D heat equation from energy conservation and understand its physical meaning
Apply separation of variables to transform the 2D PDE into three ODEs
Identify the 2D eigenvalue problem and compute eigenvalues $\lambda_{mn} = \pi^2(m^2/a^2 + n^2/b^2)$
Construct the double Fourier series solution and compute Fourier coefficients
Visualize the 2D heat kernel and explain its Gaussian spreading behavior
Connect 2D diffusion to image processing and machine learning applications
Implement the 2D heat equation solution in Python

The Big Picture: From 1D to 2D

"The extension of Fourier's methods to two dimensions opened the door to understanding heat flow in plates, membranes, and ultimately images."

In the previous sections, we mastered the 1D heat equation — heat flowing along a rod. Now we extend these ideas to two dimensions: heat spreading across a plate, a membrane, or even a digital image.

The mathematical framework generalizes beautifully: the second derivative becomes the Laplacian, separation of variables yields a double Fourier series, and the heat kernel becomes a 2D Gaussian. These concepts are fundamental to understanding:

Thermal Engineering

Heat distribution in circuit boards, solar panels, and building materials

Image Processing

Gaussian blur, denoising, edge detection, and scale-space theory

Diffusion in Biology

Chemical signaling, morphogen gradients, and pattern formation

Diffusion Models (AI)

DALL-E, Stable Diffusion, and image generation via 2D diffusion

Financial Mathematics

2D option pricing models with multiple underlying assets

Geology & Climate

Heat flow in Earth's crust, ocean temperature modeling

Historical Context

Joseph Fourier's 1822 masterwork Théorie Analytique de la Chaleur (The Analytical Theory of Heat) actually focused heavily on 2D and 3D problems. His original motivation was understanding heat flow in the Earth's crust and in heated plates.

Fourier's Plate Problem (1807)

Fourier first analyzed heat flow in a semi-infinite plate with one edge held at a fixed temperature. This led him to discover that the temperature distribution involves infinite series of sine and cosine functions.

The Laplacian Operator

Pierre-Simon Laplace had earlier studied the operator $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ in the context of gravitational potentials. Fourier showed it naturally describes heat diffusion in multiple dimensions.

Modern Applications (2020s)

The 2D heat equation is now central to diffusion models in AI. Image generation systems like DALL-E and Stable Diffusion exploit the mathematics of 2D diffusion to create stunning images from noise.

The 2D Heat Equation

The 2D heat equation describes how temperature $u(x, y, t)$ evolves over a 2D domain:

The 2D Heat Equation

\frac{\partial u}{\partial t} = \alpha \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right) = \alpha \nabla^2 u

Rate of temperature change = Diffusivity × Laplacian of temperature

Understanding the Laplacian

The Laplacian $\nabla^2 u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$ measures how the temperature at a point differs from the average of its neighbors. Physically:

Laplacian	Physical Meaning	Result
∇²u < 0	Point is hotter than neighbors (concave down)	Temperature decreases
∇²u > 0	Point is colder than neighbors (concave up)	Temperature increases
∇²u = 0	Point equals neighbor average (harmonic)	No change (equilibrium)

The Laplacian as Mean-Value Operator

For small radius $r$ , the average of $u$ on a circle of radius $r$ centered at $(x, y)$ is approximately:

Average ≈ u(x, y) + (r²/4)∇²u(x, y)

So $\nabla^2 u$ tells us how much $u$ differs from its local average.

Separation of Variables in 2D

We solve the 2D heat equation on a rectangular plate $[0, a] \times [0, b]$ with Dirichlet boundary conditions (zero temperature on all edges):

u(0, y, t) = u(a, y, t) = 0 (left and right edges)

u(x, 0, t) = u(x, b, t) = 0 (bottom and top edges)

u(x, y, 0) = f(x, y) (initial temperature)

Follow the step-by-step derivation below to see how separation of variables transforms the PDE into three independent ODEs:

Separation of Variables in 2D

Step 1 of 157%

Start with the 2D Heat Equation

∂u/∂t = α(∂²u/∂x² + ∂²u/∂y²)

We want to solve the heat equation on a rectangular plate [0, a] × [0, b] with zero temperature on all boundaries (Dirichlet conditions).

The 2D Eigenvalue Problem

Separation of variables leads to two spatial eigenvalue problems:

X-Direction

X'' + μX = 0, X(0) = X(a) = 0

\mu_m = (m\pi/a)^2, \quad X_m = \sin(m\pi x/a)

Y-Direction

Y'' + νY = 0, Y(0) = Y(b) = 0

\nu_n = (n\pi/b)^2, \quad Y_n = \sin(n\pi y/b)

The combined eigenvalue for mode $(m, n)$ is the sum:

2D Eigenvalues

\lambda_{mn} = \mu_m + \nu_n = \pi^2 \left( \frac{m^2}{a^2} + \frac{n^2}{b^2} \right)

Each mode (m, n) decays at rate $\alpha\lambda_{mn}$

The Eigenfunction Family

The 2D eigenfunctions (normal modes) are products of 1D eigenfunctions:

\phi_{mn}(x, y) = \sin\left(\frac{m\pi x}{a}\right) \sin\left(\frac{n\pi y}{b}\right)

These form an orthonormal basis for functions on the rectangular domain that vanish on the boundary.

Mode (m,n)	λₘₙ (unit square)	Decay Rate αλ	Pattern
(1, 1)	2π² ≈ 19.7	Slowest	Single bump
(1, 2) or (2, 1)	5π² ≈ 49.3	2.5× faster	Two bumps
(2, 2)	8π² ≈ 78.9	4× faster	Checkerboard (2×2)
(3, 3)	18π² ≈ 177.7	9× faster	Checkerboard (3×3)

The n² + m² Rule

Higher modes decay much faster than low modes. Mode $(m, n)$ decays at a rate proportional to $m^2 + n^2$ . This explains why:

Sharp features (high frequency) smooth out quickly
Broad features (low frequency) persist longer
Eventually, only mode (1, 1) remains

Double Fourier Series Solution

The general solution is a double Fourier series over all modes:

The Double Fourier Series Solution

u(x, y, t) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} B_{mn} \sin\left(\frac{m\pi x}{a}\right) \sin\left(\frac{n\pi y}{b}\right) e^{-\alpha\lambda_{mn}t}

B_mn:Fourier coefficient from initial condition

sin(...)sin(...):2D eigenfunction (spatial pattern)

e^−αλt:Exponential decay (mode-dependent rate)

Computing Fourier Coefficients

The coefficients $B_{mn}$ are found by projecting the initial condition $f(x, y) = u(x, y, 0)$ onto each eigenmode:

Fourier Coefficient Formula

B_{mn} = \frac{4}{ab} \int_0^a \int_0^b f(x, y) \sin\left(\frac{m\pi x}{a}\right) \sin\left(\frac{n\pi y}{b}\right) \, dy \, dx

This double integral "projects" the initial condition onto mode (m, n)

Orthogonality

The formula works because the 2D eigenfunctions are orthogonal:

∫∫ sin(mπx/a)sin(nπy/b) × sin(m'πx/a)sin(n'πy/b) dx dy = (ab/4)δ_mm'δ_nn'

The 2D Heat Kernel

On an infinite plane, the fundamental solution (heat kernel) is the 2D Gaussian:

2D Heat Kernel (Fundamental Solution)

G(x, y, t) = \frac{1}{4\pi\alpha t} \exp\left(-\frac{x^2 + y^2}{4\alpha t}\right)

A 2D Gaussian with standard deviation $\sigma = \sqrt{2\alpha t}$ in each direction

The 2D Heat Kernel

White dashed circle: one standard deviation

2D Heat Kernel (Fundamental Solution)

G(x, y, t) = (1/4παt) exp(-(x² + y²)/4αt)

A 2D Gaussian centered at the origin with standard deviation σ = √(2αt)

Time: t = 0.050

Diffusivity: α = 0.50

Standard Deviation

σ = 0.2236

Spreads as √t

Peak Value

3.18

Decreases as 1/t

Key Properties

• Total integral = 1 (energy conservation)
• Width grows as √t (sub-linear spreading)
• Height decreases as 1/t to maintain unit integral
• Solution to point-source initial condition

Properties of the 2D Heat Kernel

Normalization: The total integral is 1 for all t > 0 (energy conservation)
Spreading: Width grows as $\sqrt{t}$ , height decreases as $1/t$
Radial symmetry: Depends only on $r = \sqrt{x^2 + y^2}$
Initial condition: As $t \to 0$ , approaches a delta function at the origin
Convolution solution: For initial condition $f(x, y)$ , the solution is $u = G * f$

Interactive 2D Heat Diffusion

Explore how heat diffuses on a 2D plate. Try different initial conditions and observe how sharp features smooth out while the total energy is conserved.

Interactive 2D Heat Diffusion

Time: t = 0.0000

Initial Condition

Thermal Diffusivity: α = 0.50

Higher α means faster heat spreading

The 2D Heat Equation

∂u/∂t = α(∂²u/∂x² + ∂²u/∂y²)

Temperature at each point evolves based on the 2D Laplacian (sum of second derivatives). The boundary is held at zero temperature.

Temperature Scale

Cold (0)Hot (1)

What to Observe

Checkerboard pattern: Watch how the high-frequency pattern disappears almost instantly
Center spot: The Gaussian spreads while maintaining its shape
Corners: Heat flows away from corners toward the zero-temperature boundaries

Visualizing 2D Eigenmodes

Each eigenmode $(m, n)$ has a characteristic spatial pattern that persists while its amplitude decays. Higher modes have more oscillations and decay faster.

2D Fourier Eigenmodes

Mode (m, n) = (1, 1)

u_1,1(x,y,t) = sin(1πx)sin(1πy) × e^−αλ_1,1t

Mode m = 1

Mode n = 1

Time t = 0.00

Eigenvalue

λ_1,1 = 19.74

π²(m² + n²)

Decay Rate

1.974

αλ_mn

Half-Life

0.351

ln(2)/(αλ)

Amplitude

100.0%

e^−αλt

Temperature

NegativeZeroPositive

Mode Analysis

The mode structure reveals why diffusion is a low-pass filter:

Mode (1, 1) has λ = 2π² and decays slowest — it's the "fundamental frequency"
Mode (2, 2) has λ = 8π² and decays 4× faster — checkerboard patterns vanish quickly
The eigenvalue grows as m² + n², so decay rate grows quadratically with "frequency"
After sufficient time, only mode (1, 1) remains visible

Machine Learning Connections

The 2D heat equation is fundamental to modern machine learning, particularly in computer vision and generative AI.

1. Gaussian Blur in Image Processing

Applying the heat equation to an image for time $t$ is equivalent to Gaussian blur with standard deviation $\sigma = \sqrt{2t}$ :

blurred_image = convolve(image, gaussian_kernel(σ))

This is why the heat equation is central to scale-space theory in computer vision.

2. Diffusion Models (DALL-E, Stable Diffusion)

Diffusion models work by simulating the heat equation on images:

Forward process: Gradually add Gaussian noise to images — this is 2D heat diffusion!
Reverse process: Train a neural network to reverse the diffusion
Generation: Start from pure noise and denoise step by step

The Mathematical Connection

The noise added at each step follows a schedule that corresponds to the time parameter in the heat equation. The variance $\sigma^2 = 2\alpha t$ controls how much the image "diffuses."

3. Denoising and Regularization

The heat equation naturally removes high-frequency noise while preserving low-frequency structure. This connects to:

Tikhonov regularization: Adding a smoothness penalty ∥∇u∥² corresponds to heat diffusion
Anisotropic diffusion: Perona-Malik equation preserves edges while smoothing flat regions
Graph neural networks: Message passing is discrete heat diffusion on graphs

4. Fourier Features in Neural Networks

The 2D Fourier modes we studied appear in modern architectures:

Positional encodings: Transformers use sinusoidal functions similar to Fourier modes
Fourier neural operators: Learn PDE solutions in Fourier space
Spectral normalization: Controls eigenvalues of weight matrices

Python Implementation

Implement the 2D heat equation solution using the double Fourier series method:

Solving the 2D Heat Equation

🐍heat_equation_2d.py

Explanation(5)

Code(129)

3The 2D Heat Kernel

The fundamental solution is a 2D Gaussian with standard deviation σ = √(2αt). It represents the temperature from a point source of heat at the origin.

17Single Eigenmode

Each mode (m, n) has a spatial pattern sin(mπx/a)sin(nπy/b) that decays exponentially with rate αλₘₙ where λₘₙ = π²(m²/a² + n²/b²).

29Double Fourier Series Solution

The general solution is a double sum over all modes. The coefficients Bₘₙ are computed by projecting the initial condition onto each eigenmode.

43Computing Fourier Coefficients

We use numerical integration (trapezoidal rule) to compute each coefficient. The formula is Bₘₙ = (4/ab)∫∫f(x,y)sin(mπx/a)sin(nπy/b)dxdy.

72Mode Decay Analysis

Higher modes (larger m or n) have larger eigenvalues and decay faster. Mode (1,1) decays slowest and eventually dominates the solution.

124 lines without explanation

1import numpy as np
2import matplotlib.pyplot as plt
3from mpl_toolkits.mplot3d import Axes3D
4from matplotlib import cm
5
6def heat_kernel_2d(x, y, t, alpha=1.0):
7    """
8    The 2D heat kernel (fundamental solution).
9
10    G(x, y, t) = (1/4παt) exp(-(x² + y²)/4αt)
11
12    This is a 2D Gaussian with variance σ² = 2αt.
13    """
14    if t <= 0:
15        return np.where((x == 0) & (y == 0), np.inf, 0)
16
17    prefactor = 1.0 / (4 * np.pi * alpha * t)
18    exponent = -(x**2 + y**2) / (4 * alpha * t)
19    return prefactor * np.exp(exponent)
20
21
22def fourier_mode_2d(x, y, t, m, n, a=1.0, b=1.0, alpha=0.1):
23    """
24    A single eigenmode of the 2D heat equation on [0,a] × [0,b].
25
26    u_mn(x, y, t) = sin(mπx/a) sin(nπy/b) exp(-αλ_mn t)
27
28    where λ_mn = π²(m²/a² + n²/b²)
29    """
30    lambda_mn = np.pi**2 * (m**2/a**2 + n**2/b**2)
31    spatial = np.sin(m * np.pi * x / a) * np.sin(n * np.pi * y / b)
32    temporal = np.exp(-alpha * lambda_mn * t)
33    return spatial * temporal
34
35
36def solve_heat_2d_fourier(x, y, t, initial_func, a=1.0, b=1.0,
37                          alpha=0.1, n_modes=20):
38    """
39    Solve 2D heat equation using double Fourier series.
40
41    u(x,y,t) = Σ_m Σ_n B_mn sin(mπx/a) sin(nπy/b) exp(-αλ_mn t)
42
43    where B_mn = (4/ab) ∫∫ f(x,y) sin(mπx/a) sin(nπy/b) dx dy
44    """
45    # Grid for numerical integration
46    nx, ny = 100, 100
47    xi = np.linspace(0, a, nx)
48    yi = np.linspace(0, b, ny)
49    Xi, Yi = np.meshgrid(xi, yi)
50    dx, dy = a/(nx-1), b/(ny-1)
51
52    # Evaluate initial condition
53    f_vals = initial_func(Xi, Yi)
54
55    # Compute Fourier coefficients and sum modes
56    u = np.zeros_like(x)
57
58    for m in range(1, n_modes + 1):
59        for n in range(1, n_modes + 1):
60            # Compute coefficient B_mn
61            integrand = f_vals * np.sin(m*np.pi*Xi/a) * np.sin(n*np.pi*Yi/b)
62            B_mn = (4/(a*b)) * np.sum(integrand) * dx * dy
63
64            if abs(B_mn) < 1e-10:
65                continue
66
67            # Add this mode's contribution
68            u += B_mn * fourier_mode_2d(x, y, t, m, n, a, b, alpha)
69
70    return u
71
72
73def simulate_and_plot():
74    """Simulate and visualize 2D heat diffusion."""
75    # Setup domain
76    a, b = 1.0, 1.0
77    alpha = 0.1
78    nx, ny = 50, 50
79    x = np.linspace(0, a, nx)
80    y = np.linspace(0, b, ny)
81    X, Y = np.meshgrid(x, y)
82
83    # Initial condition: hot spot in center
84    def initial_condition(x, y):
85        cx, cy = 0.5, 0.5
86        return np.exp(-50 * ((x - cx)**2 + (y - cy)**2))
87
88    # Solve at different times
89    times = [0, 0.05, 0.2, 0.5, 1.0, 2.0]
90
91    fig = plt.figure(figsize=(15, 10))
92
93    for i, t in enumerate(times):
94        ax = fig.add_subplot(2, 3, i+1, projection='3d')
95
96        if t == 0:
97            Z = initial_condition(X, Y)
98        else:
99            Z = solve_heat_2d_fourier(X, Y, t, initial_condition,
100                                       a, b, alpha, n_modes=15)
101
102        ax.plot_surface(X, Y, Z, cmap=cm.hot, linewidth=0,
103                       antialiased=True, alpha=0.9)
104        ax.set_xlabel('x')
105        ax.set_ylabel('y')
106        ax.set_zlabel('u')
107        ax.set_title(f't = {t}')
108        ax.set_zlim(0, 1)
109
110    plt.suptitle('2D Heat Equation: Gaussian Initial Condition',
111                fontsize=14)
112    plt.tight_layout()
113    plt.show()
114
115    # Mode decay analysis
116    print("\n--- 2D Mode Decay Analysis ---")
117    print("Mode (m,n) | λ_mn | Decay rate | Half-life")
118    print("-" * 50)
119
120    for m in range(1, 4):
121        for n in range(1, 4):
122            lambda_mn = np.pi**2 * (m**2 + n**2)
123            decay_rate = alpha * lambda_mn
124            half_life = np.log(2) / decay_rate
125            print(f"  ({m},{n})   | {lambda_mn:6.2f} | {decay_rate:8.4f} | {half_life:8.4f}")
126
127
128# Run simulation
129simulate_and_plot()

Common Pitfalls

Truncating the Double Series

With two indices (m, n), the number of modes grows as N² if you keep modes up to m, n ≤ N. For discontinuous initial conditions, you may need many modes for accuracy near t = 0.

Boundary Conditions Matter

The sine-sine eigenfunctions are specific to Dirichlet (zero-value) conditions on all edges. Different boundary conditions require different eigenfunctions:

Neumann on x, Dirichlet on y: cos(mπx/a)sin(nπy/b)
Periodic: sines and cosines in both directions
Mixed: case-by-case analysis

The Laplacian Has Units

In the heat equation, $\nabla^2 u$ has units of $[u]/[length]^2$ . Always verify that $\alpha \nabla^2 u$ has the same units as $\partial u/\partial t$ .

Numerical Stability

For finite difference methods, the stability condition in 2D is more restrictive than 1D:

α Δt (1/Δx² + 1/Δy²) ≤ 0.5

This means the time step must be smaller in 2D for the same spatial resolution.

Test Your Understanding

Test Your Understanding: 2D Heat Equation

1. The 2D heat equation ∂u/∂t = α(∂²u/∂x² + ∂²u/∂y²) involves which differential operator on the right-hand side?

2. When solving the 2D heat equation on a rectangular domain using separation of variables, how many separate ODEs do we obtain?

3. For the 2D heat equation on a unit square with zero boundary conditions, the eigenvalue for mode (m, n) is λₘₙ = ...

4. The 2D heat kernel (fundamental solution) has what shape?

5. Which mode (m, n) decays the SLOWEST on a unit square?

6. The standard deviation of the 2D heat kernel grows proportionally to...

7. What determines the Fourier coefficient Bₘₙ for the 2D heat equation?

8. In image processing, applying the 2D heat equation has what effect?

Summary

The 2D heat equation extends the 1D theory to plates and surfaces, revealing the power of the Laplacian operator and double Fourier series.

Key Equations

Name	Formula
2D Heat Equation	∂u/∂t = α(∂²u/∂x² + ∂²u/∂y²) = α∇²u
2D Eigenvalues	λₘₙ = π²(m²/a² + n²/b²)
2D Eigenfunctions	φₘₙ(x,y) = sin(mπx/a)sin(nπy/b)
General Solution	u = Σₘ Σₙ Bₘₙ φₘₙ(x,y) exp(-αλₘₙt)
Fourier Coefficient	Bₘₙ = (4/ab)∫∫f(x,y)φₘₙ(x,y)dxdy
2D Heat Kernel	G = (1/4παt)exp(-(x²+y²)/4αt)

Key Takeaways

The 2D heat equation involves the Laplacian $\nabla^2 u$ , which measures how $u$ differs from its local average
Separation of variables in 2D leads to three ODEs: one for each spatial direction and one for time
The 2D eigenvalue $\lambda_{mn}$ is the sum of the 1D eigenvalues in each direction
Higher modes decay much faster — rate grows as $m^2 + n^2$
The 2D heat kernel is a Gaussian spreading in all directions with width $\propto \sqrt{t}$
Gaussian blur in images is exactly 2D heat diffusion
Diffusion models in AI (DALL-E, Stable Diffusion) exploit 2D heat equation mathematics

The Essence of 2D Heat Diffusion:

"Heat flows from hot to cold in all directions at once, with the Laplacian measuring how each point differs from its 2D neighborhood."

Coming Next: In the next section, we'll explore Numerical Methods: Finite Differences — practical algorithms for solving the heat equation when analytical solutions are not available.