Learning Objectives
By the end of this section, you will be able to:
- Apply separation of variables to solve the heat equation on a finite rod with Dirichlet boundary conditions
- Derive the spatial eigenvalue problem and identify the eigenvalues and eigenfunctions
- Construct the general Fourier series solution and compute Fourier coefficients from initial conditions
- Interpret the physical meaning of mode decay and explain why higher modes decay faster
- Implement the Fourier series solution in Python and visualize the evolution of temperature
- Connect the heat equation solution to machine learning concepts including score matching and diffusion models
The Big Picture: From Physics to Fourier
"The profound study of nature is the most fertile source of mathematical discoveries." — Joseph Fourier, 1822
In the previous section, we derived the heat equation from first principles. Now we face a fundamental challenge: how do we solve this partial differential equation? The answer reveals one of the most beautiful connections in mathematics — the link between differential equations and Fourier series.
Joseph Fourier discovered that any "reasonable" function can be expressed as an infinite sum of sines and cosines. When applied to the heat equation, this leads to an elegant solution: each Fourier mode evolves independently, decaying exponentially at a rate proportional to the square of its frequency.
The Central Insight
The heat equation on a finite rod can be solved exactly using separation of variables. The solution is a Fourier sine series where each mode decays exponentially:
$$u(x,t) = \sum_{n=1}^{\infty} B_n \sin\left(\frac{n\pi x}{L}\right) e^{-\alpha (n\pi/L)^2 t}$$
The coefficients $B_n$ are determined by projecting the initial condition onto the eigenfunctions.
Why This Matters for Machine Learning
The heat equation solution has profound connections to modern deep learning:
- Diffusion models (DALL-E, Stable Diffusion) reverse a heat diffusion process to generate images from noise
- Score matching learns the gradient of the log probability density; the density itself evolves according to a Fokker-Planck equation
- Graph neural networks propagate information using discrete Laplacians — discretized heat equations on graphs
- Regularization in neural networks often corresponds to adding diffusion terms that smooth the loss landscape
Problem Setup: The Finite Rod
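Consider a thin rod of length $L$ with its ends held at a fixed temperature (which we take to be zero for simplicity). The temperature distribution $u(x,t)$ along the rod satisfies:
The Heat Equation Boundary Value Problem
$$\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2}, \quad 0 < x < L,\ t > 0$$
$$u(0,t) = 0, \quad u(L,t) = 0, \quad t > 0$$
$$u(x,0) = f(x), \quad 0 \le x \le L$$
Here $\alpha > 0$ is the thermal diffusivity and $f(x)$ is the initial temperature distribution.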
Understanding the Boundary Conditions
The Dirichlet boundary conditions $u(0,t) = u(L,t) = 0$ represent holding both ends of the rod at zero temperature. Physically, this could be achieved by immersing the endpoints in ice baths.
| Boundary Type | Condition | Physical Meaning |
|---|---|---|
| Dirichlet | u = 0 (or constant) | Fixed temperature (e.g., contact with reservoir) |
| Neumann | ∂u/∂x = 0 | Insulated end (no heat flux) |
| Robin | ∂u/∂x + hu = 0 | Convective cooling (Newton's law) |
Why Zero?
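Setting the boundary values to zero is not a limitation. If the boundary temperatures are $u(0,t) = a$ and $u(L,t) = b$, we can define a new variable $v(x,t) = u(x,t) - u_s(x)$, where $u_s(x) = a + (b - a)x/L$ is the steady-state solution. Because $u_s$ is linear in $x$ and independent of $t$, $v$ satisfies the heat equation with zero boundary conditions and initial condition $f(x) - u_s(x)$.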
Separation of Variables
The method of separation of variables is one of the most powerful techniques for solving linear PDEs. The key idea is to assume the solution can be written as a product of single-variable functions:
$$u(x,t) = X(x)\,T(t)$$
This ansatz transforms the PDE into two ordinary differential equations that can be solved independently. Let us walk through the derivation step by step.
Step 1: The Problem Statement
We want to solve the 1D heat equation on a finite rod of length L with homogeneous Dirichlet boundary conditions:
- PDE: The heat equation describes how temperature evolves
- BCs: Both ends are held at zero temperature
- IC: Initial temperature distribution is f(x)
The boundary conditions will constrain which spatial functions are allowed.
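The remaining steps of the separation argument can be sketched as follows. This is a standard outline rather than a reproduction of the original walkthrough; writing the separation constant as $-\lambda$ is an assumed sign convention, chosen so that it agrees with the eigenvalue table in the summary.

```latex
\begin{align*}
u(x,t) &= X(x)\,T(t)
  && \text{(product ansatz)} \\
X(x)\,T'(t) &= \alpha\,X''(x)\,T(t)
  && \text{(substitute into } u_t = \alpha u_{xx}\text{)} \\
\frac{T'(t)}{\alpha\,T(t)} &= \frac{X''(x)}{X(x)} = -\lambda
  && \text{(divide by } \alpha X T\text{)} \\
X'' + \lambda X &= 0, \quad X(0) = X(L) = 0
  && \text{(spatial problem)} \\
T' + \alpha\lambda T &= 0 \;\Longrightarrow\; T(t) = e^{-\alpha\lambda t}
  && \text{(temporal problem)}
\end{align*}
```

The third line is the crux: the left side depends only on $t$ and the right side only on $x$, so both must equal the same constant. The boundary conditions $u(0,t) = X(0)T(t) = 0$ and $u(L,t) = X(L)T(t) = 0$ pass to $X$ because a non-trivial solution needs $T(t) \not\equiv 0$.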
Why Separation Works
The success of separation of variables relies on three key properties:
- Linearity: The heat equation is linear, so any linear combination of solutions is also a solution. This allows us to superpose infinitely many separable solutions.
- Homogeneous boundary conditions: The zero boundary conditions are "homogeneous" — they don't introduce additional terms. Non-homogeneous conditions require extra steps.
- Completeness of eigenfunctions: The sine functions $\sin(n\pi x/L)$ form a complete orthogonal basis (orthonormal after scaling by $\sqrt{2/L}$) for functions on $[0, L]$ that vanish at the endpoints.
The Spatial Eigenvalue Problem
When we separate variables, the spatial part becomes an eigenvalue problem:
$$X''(x) + \lambda X(x) = 0, \quad X(0) = 0, \quad X(L) = 0$$
This is a Sturm-Liouville problem. The boundary conditions constrain which values of $\lambda$ are allowed.
Finding the Eigenvalues
The general solution to $X'' + \lambda X = 0$ depends on the sign of $\lambda$:
| Case | General Solution | Boundary Conditions |
|---|---|---|
| λ < 0 | X = Ae^{√(-λ) x} + Be^{-√(-λ) x} | Only X ≡ 0 satisfies both BCs |
| λ = 0 | X = Ax + B | Only X ≡ 0 satisfies both BCs |
| λ > 0 | X = A cos(√λ x) + B sin(√λ x) | Non-trivial solutions exist! |
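For $\lambda > 0$, applying $X(0) = 0$ gives $A = 0$. Then $X(L) = B\sin(\sqrt{\lambda}\,L) = 0$ with $B \neq 0$ requires:
$$\sqrt{\lambda}\,L = n\pi, \quad n = 1, 2, 3, \ldots$$
Eigenvalues
$$\lambda_n = \left(\frac{n\pi}{L}\right)^2, \quad n = 1, 2, 3, \ldots$$
The eigenvalues grow as $n^2$. They determine both the spatial frequency and the temporal decay rate.
Eigenfunctions
$$X_n(x) = \sin\left(\frac{n\pi x}{L}\right)$$
These are the "normal modes": each satisfies the boundary conditions independently.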
Orthogonality
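The eigenfunctions are orthogonal on $[0, L]$:
$$\int_0^L \sin\left(\frac{m\pi x}{L}\right)\sin\left(\frac{n\pi x}{L}\right)dx = \begin{cases} L/2, & m = n \\ 0, & m \neq n \end{cases}$$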
This orthogonality is essential for computing the Fourier coefficients.
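The orthogonality relation can be checked numerically. The sketch below is only an illustration: the function name `inner_product`, the rod length `L = 2.0`, and the midpoint-rule sample count are all assumptions, and it uses only the standard library.

```python
import math

def inner_product(m, n, L=1.0, samples=2000):
    """Approximate the integral over [0, L] of sin(m*pi*x/L) * sin(n*pi*x/L)
    using the composite midpoint rule."""
    dx = L / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx  # midpoint of the i-th subinterval
        total += math.sin(m * math.pi * x / L) * math.sin(n * math.pi * x / L)
    return total * dx

L = 2.0  # illustrative rod length
for m in range(1, 4):
    row = [abs(round(inner_product(m, n, L), 6)) for n in range(1, 4)]
    print(f"m = {m}: {row}")
```

The diagonal entries should come out close to $L/2$ and the off-diagonal entries close to zero, which is what lets a single coefficient be isolated from the series.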
Fourier Series Solution
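Each separable solution $u_n(x,t) = \sin(n\pi x/L)\,e^{-\alpha(n\pi/L)^2 t}$ satisfies the PDE and boundary conditions. By linearity, any linear combination is also a solution:
The Complete Solution
$$u(x,t) = \sum_{n=1}^{\infty} B_n \sin\left(\frac{n\pi x}{L}\right) e^{-\alpha (n\pi/L)^2 t}$$
Each mode oscillates spatially with wavelength $2L/n$ and decays exponentially with rate $\alpha(n\pi/L)^2$.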
Finding the Coefficients
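The coefficients $B_n$ are determined by the initial condition $u(x,0) = f(x)$. At $t = 0$:
$$f(x) = \sum_{n=1}^{\infty} B_n \sin\left(\frac{n\pi x}{L}\right)$$
This is the Fourier sine series of $f(x)$. Using orthogonality, we can extract each coefficient:
Fourier Coefficient Formula
$$B_n = \frac{2}{L}\int_0^L f(x)\sin\left(\frac{n\pi x}{L}\right)dx$$
This computes the "projection" of $f$ onto the $n$-th eigenfunction.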
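The projection step can be written out explicitly. The derivation below is a sketch that assumes the series may be integrated term by term; the index $m$ is introduced only to label the mode being extracted.

```latex
\begin{align*}
\int_0^L f(x)\,\sin\frac{m\pi x}{L}\,dx
  &= \sum_{n=1}^{\infty} B_n \int_0^L \sin\frac{n\pi x}{L}\,\sin\frac{m\pi x}{L}\,dx \\
  &= B_m \cdot \frac{L}{2}
  && \text{(all terms with } n \neq m \text{ vanish)} \\
\Longrightarrow\quad
B_m &= \frac{2}{L}\int_0^L f(x)\,\sin\frac{m\pi x}{L}\,dx
\end{align*}
```

The factor $2/L$ is simply the reciprocal of the squared norm $L/2$ of each eigenfunction.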
Physical Interpretation
The solution tells us that heat diffusion can be understood as the independent evolution of infinitely many "modes," each with its own characteristic spatial pattern and decay rate.
Solution using Fourier series: u(x,t) = Σ Bₙ sin(nπx/L) exp(-α(nπ/L)²t)
Higher α = faster heat diffusion
More modes = better approximation of initial condition
Key Observations
- Boundary conditions: u(0,t) = u(L,t) = 0 (Dirichlet, fixed ends)
- Mode decay: Mode n decays as exp(-αn²π²t/L²); higher modes vanish first
- Long-time behavior: Solution approaches zero as t → ∞
- Smoothing: Sharp features (high-frequency modes) smooth out quickly
Key Physical Insights
Smoothing of Sharp Features
Sharp discontinuities (like step functions) contain high-frequency modes (large $n$). These decay fastest, so sharp features smooth out rapidly.
Long-Time Behavior
As $t \to \infty$, all modes decay and $u(x,t) \to 0$. Heat escapes through the boundaries, which are held at zero temperature.
Dominant Mode
After sufficient time, only the $n = 1$ mode remains significant. Provided $B_1 \neq 0$, the solution approaches $B_1 \sin(\pi x/L)\,e^{-\alpha\pi^2 t/L^2}$.
Characteristic Time Scale
The characteristic time for mode $n$ is $\tau_n = L^2/(\alpha n^2 \pi^2)$. Larger rods or lower diffusivity mean slower equilibration.
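To get a feel for the numbers, the sketch below evaluates the e-folding time $\tau_n = L^2/(\alpha n^2\pi^2)$, the reciprocal of the mode decay rate listed in the summary table. The function name `mode_time_constant` and the diffusivity value (roughly that of copper) are illustrative assumptions, not values from this section.

```python
import math

def mode_time_constant(n, L, alpha):
    """Return tau_n = L^2 / (alpha * n^2 * pi^2), the time over which
    mode n shrinks by a factor of e."""
    return L ** 2 / (alpha * (n * math.pi) ** 2)

alpha = 1.1e-4  # m^2/s, roughly copper (illustrative value)
for L in (0.1, 1.0):  # rod lengths in metres
    tau_1 = mode_time_constant(1, L, alpha)
    print(f"L = {L} m: tau_1 = {tau_1:.1f} s")
```

With these assumed values the slowest mode of a 10 cm rod relaxes in about 9 s, while a 1 m rod takes about 15 minutes. The factor of 100 reflects the $L^2$ scaling.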
Mode Decay Visualization
The decay rate of each Fourier mode is $\gamma_n = \alpha n^2 \pi^2 / L^2$. Since this grows as $n^2$, higher modes decay much faster than lower modes.
Each mode n decays as exp(-α(nπ/L)²t); higher modes decay faster
Physical Interpretation
Higher modes = shorter wavelengths = sharper features. These decay fastest because heat can flow quickly over short distances. This is why sharp temperature discontinuities smooth out rapidly, while broad temperature variations persist longer. The n=2 mode decays 4× faster than n=1, n=3 decays 9× faster, and so on.
The n² Rule
Mode $n$ decays $n^2$ times faster than mode 1. This means:
- Mode 2 decays 4× faster than mode 1
- Mode 3 decays 9× faster than mode 1
- Mode 10 decays 100× faster than mode 1
This rapid decay of high frequencies is why heat diffusion is such an effective "smoothing" process.
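The effect of the $n^2$ rule on amplitudes is easy to tabulate. In the sketch below, time is measured in units of $\tau_1 = L^2/(\alpha\pi^2)$, so mode $n$ retains a fraction $e^{-n^2 t/\tau_1}$ of its initial amplitude; the function name `remaining_fraction` is an assumption for illustration.

```python
import math

def remaining_fraction(n, t_over_tau1):
    """Fraction of mode n's initial amplitude left at time t,
    with t expressed in units of tau_1 = L^2 / (alpha * pi^2)."""
    return math.exp(-(n ** 2) * t_over_tau1)

# Amplitudes after one characteristic time of the fundamental mode.
for n in (1, 2, 3, 10):
    print(f"mode {n:2d}: {remaining_fraction(n, 1.0):.3e}")
```

After a single $\tau_1$, mode 1 still holds about 37% of its amplitude, mode 2 under 2%, and mode 10 is smaller than any realistic measurement could resolve.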
Special Cases
Case 1: Single Sine Mode Initial Condition
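If $f(x) = \sin(m\pi x/L)$ for some integer $m \geq 1$, then by orthogonality:
$$B_n = \begin{cases} 1, & n = m \\ 0, & n \neq m \end{cases}$$
The solution is simply the single decaying mode:
$$u(x,t) = \sin\left(\frac{m\pi x}{L}\right) e^{-\alpha (m\pi/L)^2 t}$$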
Case 2: Constant Initial Temperature
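If $f(x) = u_0$ (constant), the Fourier coefficients are:
$$B_n = \frac{2u_0}{n\pi}\left(1 - (-1)^n\right) = \begin{cases} \dfrac{4u_0}{n\pi}, & n \text{ odd} \\ 0, & n \text{ even} \end{cases}$$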
Gibbs Phenomenon
When approximating a discontinuous function (like a step) with a Fourier series, the partial sums overshoot near the discontinuity by about 9% of the jump height. This is the Gibbs phenomenon. As we add more modes, the overshoot narrows and moves closer to the discontinuity, but its height does not shrink to zero.
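The overshoot can be observed directly for the constant initial condition, whose sine series has coefficients $4u_0/(n\pi)$ for odd $n$ and zero for even $n$. The sketch below assumes $u_0 = 1$ and $L = 1$ and scans a uniform grid for the largest partial-sum value; the function names and grid size are illustrative choices.

```python
import math

def partial_sum(x, n_max, u0=1.0, L=1.0):
    """Partial Fourier sine sum of the constant u0 using odd n <= n_max only
    (the even coefficients are zero): B_n = 4*u0 / (n*pi)."""
    return sum(4.0 * u0 / (n * math.pi) * math.sin(n * math.pi * x / L)
               for n in range(1, n_max + 1, 2))

def peak_value(n_max, points=4000):
    """Largest partial-sum value on a uniform grid over (0, 1/2], for L = 1."""
    return max(partial_sum(0.5 * (i + 1) / points, n_max) for i in range(points))

for n_max in (11, 51, 101):
    print(f"n_max = {n_max:3d}: peak = {peak_value(n_max):.4f}")
```

The sine series extends $f$ as an odd function, so the jump at $x = 0$ has height $2u_0$. A 9% overshoot of that jump puts the peak near $1.18\,u_0$, and it stays there as `n_max` grows instead of settling toward $u_0$.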
Case 3: Point Source (Delta Function)
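If $f(x) = \delta(x - x_0)$ (heat concentrated at a single point $x_0$, with $0 < x_0 < L$):
$$B_n = \frac{2}{L}\sin\left(\frac{n\pi x_0}{L}\right)$$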
The solution spreads out from the initial point, demonstrating diffusion.
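A short numerical sketch makes the spreading visible. It uses the coefficients $B_n = (2/L)\sin(n\pi x_0/L)$ obtained by inserting the delta function into the coefficient formula. The comparison with the free-space Gaussian peak $1/\sqrt{4\pi\alpha t}$ is an added sanity check, valid only while the heat has not yet reached the ends of the rod. All parameter values are illustrative.

```python
import math

def point_source_solution(x, t, x0, L=1.0, alpha=1.0, n_modes=200):
    """Truncated series solution for f(x) = delta(x - x0).
    Only meaningful for t > 0, where the exponential factor makes the sum converge."""
    total = 0.0
    for n in range(1, n_modes + 1):
        k = n * math.pi / L
        total += (2.0 / L) * math.sin(k * x0) * math.sin(k * x) * math.exp(-alpha * k * k * t)
    return total

for t in (0.001, 0.005, 0.02):
    peak = point_source_solution(0.5, t, x0=0.5)
    free_space = 1.0 / math.sqrt(4.0 * math.pi * t)  # unbounded-rod value for alpha = 1
    print(f"t = {t}: u(x0, t) = {peak:.3f}, free-space Gaussian peak = {free_space:.3f}")
```

The peak temperature falls as the profile widens, and at early times it tracks the Gaussian heat kernel closely because the boundaries have not yet been felt.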
Machine Learning Connections
The Fourier series solution of the heat equation provides deep insights into modern machine learning methods.
1. Diffusion Models (Score-Based Generative Models)
Diffusion models like DALL-E 3 and Stable Diffusion work by:
- Forward process: Gradually add noise to data (like heat diffusion)
- Reverse process: Learn to denoise (reverse diffusion)
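The forward process follows a stochastic differential equation closely related to the heat equation. In the simplest driftless case, with $W_t$ a Wiener process:
$$dX_t = \sqrt{2\alpha}\,dW_t$$
the probability density $p(x,t)$ of $X_t$ satisfies $\partial p/\partial t = \alpha\,\partial^2 p/\partial x^2$, which is exactly the heat equation.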
The "score" that diffusion models learn is the gradient of the log probability density, $\nabla_x \log p(x,t)$. The density $p$ itself satisfies a Fokker-Planck equation, a generalization of the heat equation.
2. Graph Neural Networks and the Graph Laplacian
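Many GNN architectures implement discrete heat diffusion on graphs. The graph Laplacian $\mathcal{L} = D - A$ (degree matrix minus adjacency matrix) plays the role of the negative continuous Laplacian $-\nabla^2$, and the update rule:
$$H^{(k+1)} = H^{(k)} - \epsilon\,\mathcal{L}\,H^{(k)}$$
is the discrete analog of $\partial u/\partial t = \alpha \nabla^2 u$, advanced by one explicit time step of size $\epsilon$.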
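A minimal sketch of one such diffusion step on a small graph is shown below. It assumes the unnormalized Laplacian $D - A$ and a hypothetical step size `eps`; real GNN layers typically use normalized Laplacians and learned weights, so this illustrates only the heat-equation analogy.

```python
def laplacian_step(h, edges, eps):
    """One explicit diffusion step h <- h - eps * (D - A) h on an undirected graph.

    h     : list of node values
    edges : list of (i, j) index pairs, one per undirected edge
    eps   : step size (hypothetical parameter; must be small for stability)
    """
    flow = [0.0] * len(h)
    for i, j in edges:
        # ((D - A) h)_i sums h_i - h_j over the neighbours j of node i.
        flow[i] += h[i] - h[j]
        flow[j] += h[j] - h[i]
    return [h_i - eps * f_i for h_i, f_i in zip(h, flow)]

# Path graph 0 - 1 - 2 - 3 - 4 with all the "heat" on the middle node.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
h = [0.0, 0.0, 1.0, 0.0, 0.0]
for step in range(1, 4):
    h = laplacian_step(h, edges, eps=0.25)
    print(step, [round(v, 4) for v in h])
```

Unlike the rod with ends held at zero, this graph has no sink nodes, so the total sum of `h` is conserved while the values even out. This is closer to the insulated (Neumann) case.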
3. Spectral Methods in Deep Learning
The Fourier decomposition we used corresponds to the spectral decomposition of the Laplacian operator. This connects to:
- Spectral graph convolutions: Apply filters in the eigenspace of the graph Laplacian
- Fourier neural operators: Learn operators in Fourier space for PDE solving
- Positional encodings: Use sinusoidal functions (like our eigenfunctions!) to encode position in Transformers
Fourier Features in ML
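Random Fourier features leverage the properties of Fourier representations, approximating shift-invariant kernels with random sinusoidal features. Spectral normalization, despite the name, constrains the largest singular value of each weight matrix rather than Fourier modes. The smoothing behavior of the heat equation (high-frequency damping) is related to regularization and generalization in neural networks.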
Python Implementation
Let us implement the Fourier series solution in Python. This code demonstrates:
- Computing Fourier coefficients by numerical integration
- Evaluating the series solution at any point (x, t)
- Visualizing the evolution of temperature over time
- Analyzing mode decay rates
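A minimal sketch of such an implementation follows, using only the standard library so that it runs anywhere. The function names, the midpoint-rule quadrature, the triangular initial profile, and the parameter values are all assumptions made for illustration.

```python
import math

def fourier_coefficients(f, L, n_modes, samples=2000):
    """B_n = (2/L) * integral_0^L f(x) sin(n*pi*x/L) dx via the midpoint rule."""
    dx = L / samples
    xs = [(i + 0.5) * dx for i in range(samples)]
    fx = [f(x) for x in xs]
    return [
        (2.0 / L) * dx * sum(fv * math.sin(n * math.pi * x / L) for x, fv in zip(xs, fx))
        for n in range(1, n_modes + 1)
    ]

def heat_solution(x, t, coeffs, L, alpha):
    """Truncated series u(x,t) = sum_n B_n sin(n*pi*x/L) exp(-alpha*(n*pi/L)^2 * t)."""
    total = 0.0
    for n, b_n in enumerate(coeffs, start=1):
        k = n * math.pi / L
        total += b_n * math.sin(k * x) * math.exp(-alpha * k * k * t)
    return total

def decay_rates(n_modes, L, alpha):
    """gamma_n = alpha * (n*pi/L)^2 for n = 1..n_modes."""
    return [alpha * (n * math.pi / L) ** 2 for n in range(1, n_modes + 1)]

# Illustrative setup: unit rod, unit diffusivity, triangular initial profile.
L, alpha = 1.0, 1.0

def initial_profile(x):
    return min(x, L - x)  # peak of 0.5 at the midpoint, zero at both ends

B = fourier_coefficients(initial_profile, L, n_modes=25)

for t in (0.0, 0.01, 0.05, 0.2):
    profile = [heat_solution(x, t, B, L, alpha) for x in (0.25, 0.5, 0.75)]
    print(f"t = {t:<5}", [round(u, 4) for u in profile])
print("decay rates:", [round(g, 2) for g in decay_rates(3, L, alpha)])
```

Evaluating `heat_solution` on a fine grid of `x` values at several times and passing the results to a plotting library such as Matplotlib gives the temperature evolution plot. The printed profiles already show the peak flattening while the ends stay at zero.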
Common Pitfalls
Truncation Error
In practice, we truncate the Fourier series at $N$ modes. For discontinuous initial conditions, many modes are needed to accurately represent the initial state. However, at later times fewer modes suffice since high modes decay quickly.
Gibbs Phenomenon at t = 0
Near discontinuities in $f(x)$, the partial Fourier sums oscillate and overshoot. This is not a bug; it is a fundamental feature of Fourier series. The overshoot smooths out immediately for any $t > 0$.
Confusing Boundary Condition Types
The eigenvalue problem changes with boundary conditions:
- Dirichlet (u = 0): Sine functions
- Neumann (∂u/∂x = 0): Cosine functions
- Periodic: Both sines and cosines
Using the wrong eigenfunctions leads to solutions that don't satisfy the boundary conditions!
Numerical Stability
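The analytical Fourier solution can be evaluated directly at any time $t$. There is no time stepping, so no CFL-type stability condition applies. This is an advantage over explicit finite difference methods, which require $\alpha\,\Delta t/\Delta x^2 \le 1/2$. However, computing many Fourier coefficients can be expensive.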
Summary
In this section, we developed the complete analytical solution to the heat equation on a finite rod with Dirichlet boundary conditions. The solution beautifully illustrates the connection between PDEs, eigenvalue problems, and Fourier analysis.
Key Equations
| Name | Formula |
|---|---|
| Eigenvalues | λₙ = (nπ/L)² |
| Eigenfunctions | Xₙ(x) = sin(nπx/L) |
| Temporal decay | Tₙ(t) = exp(-αλₙt) |
| Fourier coefficient | Bₙ = (2/L)∫₀ᴸ f(x)sin(nπx/L)dx |
| Full solution | u(x,t) = ΣBₙsin(nπx/L)exp(-αn²π²t/L²) |
| Mode decay rate | γₙ = αn²π²/L² (grows as n²) |
Key Takeaways
- Separation of variables transforms the heat equation PDE into two ODEs — one spatial and one temporal
- The spatial eigenvalue problem yields eigenvalues $\lambda_n = (n\pi/L)^2$ and eigenfunctions $X_n(x) = \sin(n\pi x/L)$
- The solution is a Fourier sine series with time-dependent coefficients that decay exponentially
- Higher modes decay faster (rate ∝ n²) — this is why sharp features smooth out quickly
- The Fourier coefficients are found by projecting the initial condition onto the eigenfunctions using orthogonality
- As $t \to \infty$, the solution approaches zero (all heat escapes through the boundaries)
- These concepts directly connect to diffusion models, GNNs, and spectral methods in machine learning
Coming Next: In the next section, we'll explore Fourier Series Solutions in more depth, examining the properties of Fourier series, convergence, and applications to more complex boundary conditions.