Learning Objectives
By the end of this section, you will be able to:
- Derive the heat equation from first principles using conservation of energy and Fourier's law
- Understand each term in the heat equation and its physical meaning
- Explain the role of thermal diffusivity and how it affects heat propagation
- Visualize the heat kernel (fundamental solution) and its Gaussian shape
- Connect the heat equation to diffusion models in modern machine learning
- Identify why sharp temperature features smooth out over time
- Apply the convolution solution formula using the heat kernel
The Big Picture: Why the Heat Equation Matters
"Heat, like gravity, penetrates every substance of the universe." — Joseph Fourier, 1822
The heat equation is arguably the most important PDE in applied mathematics. It describes not just thermal diffusion, but any process where a quantity spreads from high to low concentration:
🔥 Thermal Diffusion
Heat spreading through materials, from CPU cooling to climate models
🧪 Chemical Diffusion
Molecules spreading through fluids, drug delivery, pollution dispersal
💰 Financial Diffusion
Option pricing: the Black-Scholes model is a modified heat equation
📚 Probability Diffusion
Random walks, Brownian motion, the Fokker-Planck equation
📷 Image Processing
Gaussian blur, noise removal, scale-space theory in computer vision
🤖 Generative AI
DALL-E, Stable Diffusion, and other diffusion models for image generation
The Central Equation
∂u/∂t = α ∂²u/∂x²
In words: The rate of change of temperature equals the diffusivity times the curvature of the temperature profile.
Historical Context: Fourier's Revolution
In 1822, Jean-Baptiste Joseph Fourier published "Théorie Analytique de la Chaleur" (The Analytical Theory of Heat), introducing both the heat equation and Fourier series. This work was revolutionary for several reasons:
1. A PDE Derived Directly from Physics
Fourier derived the heat equation from physical principles — not abstract mathematics. He showed that calculus could describe the continuous flow of heat through matter.
2. Introduction of Fourier Series
To solve the heat equation, Fourier decomposed arbitrary functions into sums of sines and cosines. This was initially controversial but became one of the most powerful tools in mathematics and engineering.
3. Dimensional Analysis
Fourier pioneered the systematic use of physical dimensions, showing that equations must be dimensionally consistent. This idea underpins all of modern physics and engineering.
Fourier's Legacy: The techniques he developed for heat conduction — separation of variables, Fourier series, and the convolution integral — are now fundamental tools across all of science, from signal processing to quantum mechanics.
Physical Setup: Heat in a Rod
Consider a thin rod of length L made of some material. We want to describe how the temperature u(x,t) varies along the rod and changes over time.
Key Assumptions
- One-dimensional flow: Heat only flows along the rod (not through its sides)
- Homogeneous material: The rod has uniform properties throughout
- No internal heat sources: Heat is neither created nor destroyed inside the rod
- Temperature varies continuously: We can use calculus to describe the temperature field
The Variables
| Symbol | Name | Description | Units |
|---|---|---|---|
| u(x,t) | Temperature | Temperature at position x and time t | K or °C |
| x | Position | Location along the rod | m |
| t | Time | Time since initial condition | s |
| q(x,t) | Heat flux | Rate of heat flow per unit area | W/m² |
| k | Thermal conductivity | How easily heat flows | W/(m·K) |
| ρ | Density | Mass per unit volume | kg/m³ |
| cₚ | Specific heat | Energy to raise 1 kg by 1 K | J/(kg·K) |
Conservation of Energy
The foundation of the heat equation is the First Law of Thermodynamics: energy cannot be created or destroyed, only transferred. For a small segment of the rod from x to x + dx:
Energy Balance for a Control Volume
Visualizing how energy flows in and out of a control volume
Conservation Law:
The rate of change of energy in the control volume equals the net heat flux through its boundaries.
Mathematical Expression
The thermal energy in our small segment is:
E = ρ cₚ A u(x,t) dx
where A is the cross-sectional area. Taking the time derivative:
∂E/∂t = ρ cₚ A (∂u/∂t) dx
Why Partial Derivative?
We use ∂u/∂t because temperature depends on both x and t. The partial derivative means "rate of change with time, holding position fixed."
Fourier's Law of Heat Conduction
Energy conservation tells us that temperature changes due to heat flux, but we need another equation to relate the heat flux to temperature. This is Fourier's Law, the "constitutive relation" for heat conduction:
Fourier's Law
Heat flux is proportional to the negative temperature gradient: q = -k ∂u/∂x
The Meaning of the Negative Sign
The negative sign is crucial! It encodes the Second Law of Thermodynamics:
- If temperature increases to the right (∂u/∂x > 0), heat flows to the left (q < 0)
- If temperature decreases to the right (∂u/∂x < 0), heat flows to the right (q > 0)
- Heat always flows from hot to cold — down the temperature gradient
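The sign convention can be checked numerically. This is a minimal sketch: the conductivity value and the linear temperature profile are illustrative assumptions, not values from the text.

```python
import numpy as np

k = 50.0                          # assumed conductivity in W/(m·K), illustrative only
x = np.linspace(0.0, 1.0, 11)     # positions along a 1 m rod
u = 300.0 + 40.0 * x              # temperature rises to the right: du/dx = +40 K/m

dudx = np.gradient(u, x)          # numerical temperature gradient
q = -k * dudx                     # Fourier's law: q = -k du/dx

print(dudx[5], q[5])              # positive gradient, negative flux (heat flows left)
```

Flipping the sign of the slope in `u` flips the sign of `q`, so the flux always points toward the colder end.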
Conductivity k
The thermal conductivity k measures how easily heat flows through a material. Metals have high k (good conductors); insulators like wood or plastic have low k.
The Derivation: Putting It Together
Now we combine energy conservation with Fourier's law to derive the heat equation.
Start with Energy Conservation
Consider a small segment of a rod from x to x + dx. The thermal energy inside this segment can only change if heat flows in or out through the boundaries.
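The remaining steps can be written out as follows. This is a sketch of the standard argument in this section's notation (A is the cross-sectional area, q the heat flux), not a transcription of the original walkthrough.

```latex
\begin{aligned}
\text{Energy in the segment:}\quad
  & E = \rho c_p A\, u(x,t)\, dx \\
\text{Net heat flow in:}\quad
  & A\,\bigl[\,q(x,t) - q(x+dx,t)\,\bigr] \approx -A\,\frac{\partial q}{\partial x}\, dx \\
\text{Energy balance:}\quad
  & \rho c_p A\,\frac{\partial u}{\partial t}\, dx = -A\,\frac{\partial q}{\partial x}\, dx
  \;\Longrightarrow\;
  \rho c_p\,\frac{\partial u}{\partial t} = -\frac{\partial q}{\partial x} \\
\text{Fourier's law } q = -k\,\frac{\partial u}{\partial x}:\quad
  & \rho c_p\,\frac{\partial u}{\partial t} = k\,\frac{\partial^2 u}{\partial x^2}
\end{aligned}
```

Dividing by ρcₚ and collecting the material constants into a single coefficient α = k/(ρcₚ) gives the final form.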
The Final Result
Combining everything, we arrive at the heat equation:
∂u/∂t = α ∂²u/∂x², where α = k/(ρcₚ)
Physical Interpretation
The heat equation says: Temperature at a point changes based on how different it is from its neighbors.
| Curvature | ∂²u/∂x² | Result |
|---|---|---|
| Point is HOTTER than neighbors | < 0 (concave down) | Temperature decreases ↓ |
| Point is COLDER than neighbors | > 0 (concave up) | Temperature increases ↑ |
| Point equals neighbor average | = 0 (flat) | No change |
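The table can be checked with a centered finite-difference approximation of ∂²u/∂x². This is a minimal sketch; the bump-shaped profile, the grid, and the diffusivity value are illustrative assumptions.

```python
import numpy as np

alpha = 1.0e-4                            # assumed diffusivity in m²/s (illustrative)
x = np.linspace(-1.0, 1.0, 201)
dx = x[1] - x[0]
u = 20.0 + 10.0 * np.exp(-x**2 / 0.02)    # a hot spot centered at x = 0

# Centered second difference at interior points: (u[i+1] - 2 u[i] + u[i-1]) / dx²
curvature = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
dudt = alpha * curvature                  # right-hand side of the heat equation

center = len(curvature) // 2              # interior index corresponding to x = 0
print(dudt[center] < 0)                   # peak is concave down, so it cools
print(dudt[center + 30] > 0)              # flank at x = 0.3 is concave up, so it warms
```

The peak loses heat to both neighbors, while the flanks receive more from the hot side than they pass on to the cold side.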
Thermal Diffusivity: The Speed of Heat
The parameter α = k/(ρcₚ) is called thermal diffusivity. It determines how fast heat spreads through a material and has units of m²/s.
Understanding Diffusivity
Intuition for Diffusivity
- High k (good conductor): Heat flows easily through the material → faster diffusion
- High ρcₚ (large thermal mass): Lots of energy needed to change temperature → slower diffusion
- Metals: High k and moderate ρcₚ → high diffusivity (copper: α ≈ 111 mm²/s)
- Insulators: Low k → low diffusivity (wood: α ≈ 0.08 mm²/s)
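As a worked example, the definition α = k/(ρcₚ) can be evaluated directly. The copper property values below are assumed round numbers for illustration (handbook values vary with temperature and purity), and `thermal_diffusivity` is a hypothetical helper name.

```python
def thermal_diffusivity(k, rho, cp):
    """Return alpha = k / (rho * cp) in m²/s, given SI inputs."""
    return k / (rho * cp)

# Assumed representative values for copper (illustrative, not from the text)
alpha_cu = thermal_diffusivity(k=385.0,    # W/(m·K)
                               rho=8960.0, # kg/m³
                               cp=385.0)   # J/(kg·K)

print(f"copper: {alpha_cu * 1e6:.1f} mm²/s")
```

With these inputs the result lands close to the copper figure quoted in the list above; small differences in the assumed properties shift it by a few percent.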
The √t Scaling Law
Heat spreads a distance d ~ √(αt) in time t, so reaching a distance d takes a time t ~ d²/α. This means:
- Double the distance → 4× the time (heat spreads sub-linearly)
- This is why insulation works! Doubling insulation thickness quadruples the protection time
- Same reason Brownian motion scales as √t
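The scaling can be turned into an order-of-magnitude estimate. This is a rough sketch: `diffusion_time` is a hypothetical helper, the thicknesses are illustrative, and t ~ d²/α ignores geometry-dependent prefactors.

```python
def diffusion_time(distance, alpha):
    """Order-of-magnitude time for heat to diffuse `distance`: t ~ d² / alpha."""
    return distance**2 / alpha

alpha_wood = 8.0e-8                       # m²/s, i.e. the 0.08 mm²/s quoted for wood
t1 = diffusion_time(0.05, alpha_wood)     # 5 cm of wood
t2 = diffusion_time(0.10, alpha_wood)     # 10 cm of wood

print(t1 / 3600.0, t2 / 3600.0)           # estimated times in hours
print(t2 / t1)                            # doubling the thickness quadruples the time
```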
The Heat Kernel: The Fundamental Solution
What happens if we start with all heat concentrated at a single point? The answer is the heat kernel (also called the fundamental solution or Green's function):
The Heat Kernel
The Gaussian that spreads: G(x,t) = (1/√(4παt)) · exp(-x²/(4αt))
As t increases, the Gaussian spreads
Key Properties:
- Integral always = 1 (conservation)
- Width grows as √t
- This is a Gaussian with σ = √(2αt)
- Foundation of diffusion models in ML!
Properties of the Heat Kernel
- It's a Gaussian: The bell curve shape is characteristic of diffusion processes
- Total integral = 1: ∫ G(x,t) dx = 1 for every t > 0 (energy is conserved)
- Width grows as √t: The standard deviation is σ = √(2αt)
- Height decreases as 1/√t: The peak flattens to maintain constant area
- As t → 0, G becomes a delta function: G(x,t) → δ(x), recovering the point source
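These properties can be verified numerically. The sketch below assumes α = 1 in arbitrary units and a wide but finite grid; `heat_kernel` is an illustrative function name.

```python
import numpy as np

def heat_kernel(x, t, alpha):
    """G(x,t) = exp(-x²/(4 alpha t)) / sqrt(4 pi alpha t), valid for t > 0."""
    return np.exp(-x**2 / (4.0 * alpha * t)) / np.sqrt(4.0 * np.pi * alpha * t)

alpha = 1.0                                   # assumed diffusivity (illustrative units)
x = np.linspace(-50.0, 50.0, 20001)
dx = x[1] - x[0]

results = []
for t in (0.1, 1.0, 10.0):
    G = heat_kernel(x, t, alpha)
    area = np.sum(G) * dx                     # total integral, should stay near 1
    sigma = np.sqrt(np.sum(x**2 * G) * dx)    # standard deviation, near sqrt(2 alpha t)
    results.append((t, area, sigma, G.max()))
    print(t, area, sigma, np.sqrt(2.0 * alpha * t), G.max())
```

Each row shows the area staying fixed while the width grows and the peak drops, both by a factor of √10 per decade in t.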
The Convolution Solution
The heat kernel gives us a beautiful formula for solving any initial value problem on the infinite line. If the initial temperature is u(x,0) = f(x), then:
u(x,t) = ∫ G(x − y, t) f(y) dy  (integral over −∞ < y < ∞)
The solution is the convolution of the initial condition with the heat kernel. Each point of the initial distribution spreads according to the heat kernel, and we sum all these contributions.
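As a worked example, take a unit step as the initial condition: f(y) = 1 for y > 0 and f(y) = 0 otherwise (an illustrative choice, not from the text). The convolution then reduces to an error function:

```latex
u(x,t) = \int_{0}^{\infty} \frac{1}{\sqrt{4\pi\alpha t}}\,
         \exp\!\left(-\frac{(x-y)^{2}}{4\alpha t}\right) dy
       = \frac{1}{2}\left[\,1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{4\alpha t}}\right)\right]
```

For every t > 0 the jump at x = 0 is replaced by a smooth transition whose width grows like √(αt).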
Key Properties of the Heat Equation
1. Smoothing Property
The heat equation smooths out discontinuities immediately. Even if the initial condition has jumps or corners, for any t > 0 the solution is infinitely differentiable.
Instant Smoothing
This is actually controversial physically: it implies that heat "knows" instantly about distant changes (infinite propagation speed). Real heat has finite speed due to molecular interactions. But for most applications, the approximation is excellent.
2. Maximum Principle
The maximum (and minimum) temperature in a domain can only occur:
- At the initial time (t = 0)
- On the boundary of the domain
In other words, new extremes cannot form inside the domain. Temperature naturally tends toward the average of its surroundings.
3. Energy Conservation
With appropriate boundary conditions (like insulated ends, where ∂u/∂x = 0), the total thermal energy is conserved:
d/dt ∫ u(x,t) dx = 0, where the integral runs over the length of the rod
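The maximum principle and energy conservation can both be observed in a small simulation. This is a minimal sketch using an explicit finite-difference update on a unit rod with insulated ends; the grid, time step, and initial hot block are illustrative assumptions.

```python
import numpy as np

alpha, dx, dt = 1.0, 0.01, 4.0e-5        # assumed values; r = alpha*dt/dx² = 0.4
r = alpha * dt / dx**2
n = 100
x = (np.arange(n) + 0.5) * dx            # cell centers on a rod of unit length
u = np.where(np.abs(x - 0.3) < 0.1, 100.0, 20.0)   # hot block on a cool rod

total0, max0, min0 = u.sum() * dx, u.max(), u.min()
for _ in range(5000):
    # Ghost cells copy the end values, so no heat crosses the ends (insulated)
    padded = np.concatenate(([u[0]], u, [u[-1]]))
    u = u + r * (padded[2:] - 2.0 * padded[1:-1] + padded[:-2])

drift = u.sum() * dx - total0
print(drift)                                          # round-off only: energy is conserved
print(u.max() <= max0 + 1e-9, u.min() >= min0 - 1e-9) # no new extremes appear
```

The profile flattens toward the rod's average temperature, but the integral of u never changes and no point ever leaves the initial range of 20 to 100.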
4. Irreversibility
The heat equation is not time-reversible. If you run time backward (t → -t), the problem becomes ill-posed: instead of decaying, small high-frequency perturbations grow without bound. This is a manifestation of the Second Law of Thermodynamics: heat diffusion increases entropy.
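A single Fourier mode makes the asymmetry concrete. The symbol ω below denotes a spatial wavenumber, introduced here only for illustration:

```latex
u(x,t) = e^{-\alpha\omega^{2}t}\,\sin(\omega x)
\quad\Longrightarrow\quad
\frac{\partial u}{\partial t} = -\alpha\omega^{2}\,u = \alpha\,\frac{\partial^{2}u}{\partial x^{2}}
```

Forward in time the mode decays at rate αω², fastest for the sharpest features. Replacing t with −t turns the decay into growth e^{+αω²t}, so arbitrarily fine-scale noise is amplified arbitrarily fast.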
Connection to Machine Learning: Diffusion Models
One of the most exciting developments in AI is diffusion models, which power image generators like DALL-E, Stable Diffusion, and Midjourney. These models are directly connected to the heat equation!
See how the heat equation's forward process connects to generative AI
The Connection:
In its simplest (drift-free) form, the forward process is governed by the heat equation: the noise variance grows like σ² = 2αt, matching the heat kernel's spreading. AI models learn to reverse this diffusion.
How Diffusion Models Work
- Forward Process (Adding Noise): Starting from a clean image, progressively add Gaussian noise. This follows the heat equation — the image "diffuses" into noise.
- Train a Denoiser: A neural network learns to predict and remove the noise at each step.
- Reverse Process (Generation): Start from pure noise and iteratively denoise. The network guides the reverse diffusion, creating realistic images.
The Mathematical Connection
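The forward diffusion is described by a stochastic differential equation. In the simplest drift-free case, with W(t) a standard Brownian motion:
dx = √(2α) dW
The probability density p(x,t) of the noised images then satisfies:
∂p/∂t = α ∇²p
This is a heat equation in the space of images. The noise variance grows like σ² = 2αt, analogous to the heat kernel's spreading. Practical diffusion models replace the constant α with a time-dependent noise schedule β(t) and often add a drift term, which changes the coefficients but not the diffusive character of the equation.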
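The link between adding Gaussian noise and the heat kernel can be simulated directly. This sketch assumes the simplest drift-free forward process with constant α; it is not the schedule used by any particular diffusion model, and the parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, dt, n_steps = 0.5, 0.01, 200       # assumed values (illustrative)
x = np.zeros(100_000)                     # many copies of one "pixel", all starting at 0

for _ in range(n_steps):
    # Drift-free forward step: dx = sqrt(2 alpha) dW, with dW ~ N(0, dt)
    x += np.sqrt(2.0 * alpha * dt) * rng.standard_normal(x.size)

t = n_steps * dt
print(x.var(), 2.0 * alpha * t)           # sample variance vs. heat-kernel variance 2 alpha t
```

A histogram of `x` approximates the heat kernel G(x,t): the density of many independent random walkers spreads as a Gaussian of variance 2αt.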
Why This Matters for ML
Understanding the heat equation gives you deep insight into:
- Why diffusion models work (the forward process has a known solution)
- How to choose the noise schedule (β(t))
- Why score matching is the right training objective
- Connections to denoising autoencoders and energy-based models
Python Implementation
Solving with the Heat Kernel
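The following is a minimal sketch of the convolution formula in NumPy. The function names (`heat_kernel`, `solve_heat`), the box-shaped initial condition, and the parameter values are illustrative assumptions. The integral is approximated by a midpoint Riemann sum, and the result is compared against the closed-form error-function solution for a box.

```python
import numpy as np
from math import erf

def heat_kernel(x, t, alpha):
    """G(x,t) = exp(-x²/(4 alpha t)) / sqrt(4 pi alpha t), valid for t > 0."""
    return np.exp(-x**2 / (4.0 * alpha * t)) / np.sqrt(4.0 * np.pi * alpha * t)

def solve_heat(f, x, t, alpha):
    """Approximate u(x,t) = ∫ G(x - y, t) f(y) dy on a uniform grid x."""
    dx = x[1] - x[0]
    # Entry (i, j) is G(x_i - y_j, t): the contribution of source y_j at point x_i
    G = heat_kernel(x[:, None] - x[None, :], t, alpha)
    return (G @ f) * dx

alpha, t = 1.0, 0.5                        # assumed values (illustrative units)
dx = 0.02
x = (np.arange(2000) + 0.5) * dx - 20.0    # cell centers covering [-20, 20]
f = np.where(np.abs(x) < 1.0, 1.0, 0.0)    # initial condition: box of height 1 on |x| < 1

u = solve_heat(f, x, t, alpha)

# Closed form for the box: u = [erf((x+1)/s) - erf((x-1)/s)] / 2 with s = sqrt(4 alpha t)
s = np.sqrt(4.0 * alpha * t)
exact = 0.5 * np.array([erf((xi + 1.0) / s) - erf((xi - 1.0) / s) for xi in x])
max_err = np.max(np.abs(u - exact))
print(max_err)                             # discretization error of the Riemann sum
```

Building the full kernel matrix costs O(N²) memory, which is fine for a few thousand grid points; for larger grids an FFT-based convolution would be the usual alternative.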
Common Pitfalls
Confusing Flux Direction
Remember: q = -k(∂u/∂x). The negative sign means heat flows opposite to the temperature gradient — from hot to cold. Forgetting this sign leads to equations that predict temperature running "uphill"!
Infinite vs. Finite Domains
The heat kernel formula applies to the infinite line. For finite domains (like a rod), you need boundary conditions and the solution involves Fourier series, not just convolution.
Dimensional Consistency
Always check units! The heat equation requires:
- [∂u/∂t] = [α][∂²u/∂x²] → K/s = (m²/s)(K/m²) ✓
- Fourier's Law: [q] = [k][∂u/∂x] → W/m² = (W/(m·K))(K/m) ✓
Numerical Stability
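When solving the heat equation numerically with an explicit finite-difference scheme (forward in time, centered in space), you must satisfy the stability condition α Δt/Δx² ≤ 1/2, often loosely called a CFL condition. Violating it causes the solution to blow up. We'll cover this in the section on finite difference methods.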
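The stability threshold is easy to observe. This is a minimal sketch of the explicit scheme with the ends held at zero, written in terms of the ratio r = α Δt/Δx²; `ftcs_max_abs` is a hypothetical helper, and the spike initial condition and step counts are illustrative.

```python
import numpy as np

def ftcs_max_abs(r, n_steps=200, n=51):
    """Run the explicit scheme with ends held at 0 and return max |u| at the end."""
    u = np.zeros(n)
    u[n // 2] = 1.0                       # unit spike in the middle of the rod
    for _ in range(n_steps):
        # Right-hand side is evaluated in full before the interior is overwritten
        u[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return np.abs(u).max()

stable = ftcs_max_abs(0.4)                # r <= 1/2: bounded, the spike spreads and decays
unstable = ftcs_max_abs(0.6)              # r > 1/2: grid-scale oscillations grow each step
print(stable, unstable)
```

Because the limit is on Δt/Δx², halving the grid spacing requires a time step four times smaller, which is the main practical cost of explicit schemes.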
Test Your Understanding
What physical principle is the foundation for deriving the heat equation?
Summary
We have derived the heat equation from first principles by combining energy conservation with Fourier's law. This parabolic PDE is the prototype for all diffusion phenomena.
Key Equations
| Equation | Name | Meaning |
|---|---|---|
| ∂u/∂t = α ∂²u/∂x² | Heat Equation | Temperature change = Diffusivity × Curvature |
| q = -k ∂u/∂x | Fourier's Law | Heat flows down the temperature gradient |
| α = k/(ρcₚ) | Thermal Diffusivity | How fast heat spreads (m²/s) |
| G(x,t) = (1/√(4παt)) · exp(-x²/(4αt)) | Heat Kernel | Fundamental solution (Gaussian) |
| u = G * f | Solution Formula | Convolution of initial condition with kernel |
Key Takeaways
- The heat equation comes from energy conservation (no heat created or destroyed) plus Fourier's law (heat flows from hot to cold)
- The thermal diffusivity α = k/(ρcₚ) determines how fast heat spreads; higher α means faster diffusion
- The second spatial derivative ∂²u/∂x² measures curvature: points hotter than their neighbors cool down, and vice versa
- The heat kernel is a Gaussian with width growing as √t — the characteristic signature of diffusion
- General solutions are convolutions: each point of the initial condition spreads according to the heat kernel
- Diffusion models in AI are built on the same mathematics: the forward process is essentially the heat equation applied to images
- The heat equation smooths out sharp features and is irreversible (entropy increases)
Coming Next: In the next section, we'll solve the heat equation on a finite rod with boundary conditions. You'll see how Fourier series provide beautiful solutions that separate space and time dependencies.