Learning Objectives
By the end of this section you will be able to:
- See why the ordinary chain rule fails when noise has unbounded variation.
- Derive the heuristic \((dW)^2 = dt\) from quadratic variation.
- State and apply Itô's lemma to functions of a stochastic process.
- Transform a stochastic differential equation by changing variables (log‑transform of GBM).
- Simulate SDEs with Euler–Maruyama in Python and in vectorised PyTorch.
Why the Ordinary Chain Rule Breaks
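In ordinary calculus, if \(f\) is a smooth function and \(x(t)\) is differentiable, the chain rule says
\[ df = f'(x)\,dx. \]
That is, the change in \(f\) is linear in the change in \(x\). The reason this works is buried in a Taylor expansion:
\[ df = f'(x)\,dx + \tfrac{1}{2} f''(x)\,(dx)^2 + \cdots \]
For smooth \(x(t)\), the increment \(dx\) is of order \(dt\), so \((dx)^2\) is of order \((dt)^2\), utterly negligible next to \(dt\) as \(dt \to 0\). We drop it without guilt.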
The Twist for Brownian Motion
If \(x(t) = W_t\), a Brownian motion, then \(dW\) is not of order \(dt\). It is of order \(\sqrt{dt}\).
So \((dW)^2\) is of order \(dt\), the same order as the drift term we are trying to keep. We cannot throw it away.
This single observation is the entire reason Itô's lemma exists, and the entire reason stochastic calculus is its own subject. Everything below is just careful book-keeping around that one fact.
The Secret: (dW)² = dt
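Let us pin down what "of order dt" really means. Subdivide \([0, T]\) into \(n\) intervals of size \(\Delta t = T/n\). The Brownian increment over each interval is \(\Delta W_i = W_{t_{i+1}} - W_{t_i} \sim \mathcal{N}(0, \Delta t)\).
The quadratic variation of the path is the limit
\[ [W]_T = \lim_{n \to \infty} \sum_{i=0}^{n-1} (\Delta W_i)^2. \]
Each summand is the square of a Gaussian, and its expectation is \(E[(\Delta W_i)^2] = \Delta t\). So the expectation of the sum is exactly \(n\,\Delta t = T\). The variance of each summand is \(2(\Delta t)^2\), so the variance of the whole sum is \(2n(\Delta t)^2 = 2T^2/n \to 0\). Mean stays at \(T\), fluctuations vanish:
\[ \sum_{i=0}^{n-1} (\Delta W_i)^2 \longrightarrow T \quad \text{as } n \to \infty. \]
Heuristic shorthand: \((dW)^2 = dt\). Plus, by similar but easier arguments, \(dW\,dt = 0\) and \((dt)^2 = 0\). These three rules are the entire algebra of stochastic differentials.
Compare this with a smooth function: the same sum of \((\Delta x_i)^2\) would be of order \(\Delta t\) and vanish. Brownian motion is wiggly enough that its squared increments add up to something finite and deterministic. That paradox (random in every increment, deterministic in the sum of squares) is the engine of Itô's lemma.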
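The variance figure quoted for a single squared increment can be checked in two lines. The sketch below uses the standard Gaussian fourth moment \(E[Z^4] = 3\) for \(Z \sim \mathcal{N}(0,1)\) and the independence of Brownian increments; the symbol \(Z\) is introduced here only for illustration.

```latex
\begin{aligned}
\operatorname{Var}\!\left[(\Delta W_i)^2\right]
  &= E\!\left[(\Delta W_i)^4\right] - \left(E\!\left[(\Delta W_i)^2\right]\right)^2
   = 3(\Delta t)^2 - (\Delta t)^2 = 2(\Delta t)^2, \\
\operatorname{Var}\!\left[\sum_{i=0}^{n-1} (\Delta W_i)^2\right]
  &= n \cdot 2(\Delta t)^2 = \frac{2T^2}{n} \;\longrightarrow\; 0
  \quad \text{as } n \to \infty .
\end{aligned}
```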
Interactive: Quadratic Variation Explorer
Below we sample a Brownian path on \([0, T]\), then compute two sums over the increments: \(\sum_i (\Delta W_i)^2\) and \(\sum_i |\Delta W_i|\). Move the slider to increase \(n\). Watch the green box snap to \(T\) while the red box blows up.
Drag the slider. The squared sum locks onto T; the absolute sum keeps growing. That tiny green box is the entire reason Itô's lemma needs an extra term.
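For readers without the interactive widget, the same experiment can be approximated offline. This is a minimal sketch using only the standard library; the function name `variation_sums`, the seed, and \(T = 1\) are assumptions made for illustration, and each value of \(n\) draws a fresh path rather than refining a single one.

```python
import math
import random


def variation_sums(n, T=1.0, seed=0):
    """Return (sum of squared increments, sum of absolute increments)
    for one Brownian path on [0, T] sampled with n equal steps."""
    rng = random.Random(seed)
    dt = T / n
    sq_sum = 0.0
    abs_sum = 0.0
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))  # increment ~ N(0, dt)
        sq_sum += dW * dW
        abs_sum += abs(dW)
    return sq_sum, abs_sum


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000, 100_000):
        sq, ab = variation_sums(n)
        print(f"n={n:>6}  sum dW^2 = {sq:.4f}   sum |dW| = {ab:.2f}")
```

The squared sum should settle near \(T\), while the absolute sum grows roughly like \(\sqrt{n}\).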
Itô's Lemma: Statement and Intuition
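Suppose \(X_t\) follows a stochastic differential equation
\[ dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dW_t \]
and let \(f(x, t)\) be any function with two continuous spatial derivatives and one time derivative. Then Itô's lemma says
\[ df = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \tfrac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x}\,dW_t. \]
Compare this with what the chain rule from ordinary calculus would have given you:
\[ df = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} \right) dt + \sigma \frac{\partial f}{\partial x}\,dW_t. \]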
The Itô Correction
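The only difference is the extra term \(\tfrac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2}\,dt\). It is called the Itô correction and it comes entirely from the \((dW)^2 = dt\) rule.
Where the correction comes from. Taylor-expand \(f(X_t, t)\) in both arguments to second order:
\[ df = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX + \tfrac{1}{2} \frac{\partial^2 f}{\partial x^2}\,(dX)^2 + \frac{\partial^2 f}{\partial x\,\partial t}\,dX\,dt + \tfrac{1}{2} \frac{\partial^2 f}{\partial t^2}\,(dt)^2 + \cdots \]
Now substitute \(dX = \mu\,dt + \sigma\,dW\) into \((dX)^2\) and apply the three multiplication rules:
\[ (dX)^2 = \mu^2 (dt)^2 + 2\mu\sigma\,dt\,dW + \sigma^2 (dW)^2 = \sigma^2\,dt. \]
Plug that back in, gather \(dt\) and \(dW\) terms, and Itô's lemma falls out. The whole derivation is a Taylor expansion plus the bookkeeping rule \((dW)^2 = dt\).
Analogy. Think of \(dW\) as a coin flip that has zero average but bounces with size \(\sqrt{dt}\). Squaring it knocks out the sign, and what survives is the squared size of a single bounce, \(dt\). That residual deterministic effect is the Itô correction.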
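The simplest place to see the correction numerically is \(f(W) = W^2\), for which Itô's lemma gives \(d(W^2) = dt + 2W\,dW\), whereas the ordinary chain rule would predict only \(2W\,dW\). The sketch below is an illustration under assumed settings (the function name `ito_check`, the step count, and the seed are not from the text). It accumulates the left-endpoint sum \(\sum W_i\,\Delta W_i\) and the sum of squared increments along one path.

```python
import math
import random


def ito_check(n=100_000, T=1.0, seed=1):
    """Simulate one Brownian path and return (W_T, sum W_i dW_i, sum dW_i^2)."""
    rng = random.Random(seed)
    dt = T / n
    W = 0.0
    ito_sum = 0.0  # left-endpoint sum: discrete Ito integral of W dW
    qv = 0.0       # discrete quadratic variation
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))
        ito_sum += W * dW  # uses W before the increment (non-anticipating)
        qv += dW * dW
        W += dW
    return W, ito_sum, qv


if __name__ == "__main__":
    W_T, ito_sum, qv = ito_check()
    print(f"W_T^2                    = {W_T**2:.6f}")
    print(f"2*sum(W dW) + sum(dW^2)  = {2 * ito_sum + qv:.6f}  (exact telescoping)")
    print(f"2*sum(W dW) + T          = {2 * ito_sum + 1.0:.6f}  (Ito's lemma)")
    print(f"2*sum(W dW)              = {2 * ito_sum:.6f}  (ordinary chain rule)")
```

The first two printed values agree up to rounding because \((W + \Delta W)^2 - W^2 = 2W\,\Delta W + (\Delta W)^2\) telescopes; the third is close because the quadratic variation is close to \(T\). The last line misses by roughly \(T\).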
Application: Geometric Brownian Motion
The most famous SDE in finance is geometric Brownian motion (GBM):
\[ dS_t = \mu S_t\,dt + \sigma S_t\,dW_t. \]
Read it like this: in a small time \(dt\), the stock price moves by a deterministic drift \(\mu S_t\,dt\) plus a random kick \(\sigma S_t\,dW_t\). Both pieces scale with the current price, which is what we want: a $100 stock should fluctuate more (in dollar terms) than a $1 stock.
Here \(\mu\) is the expected return per unit time and \(\sigma\) is the volatility. Typical values differ widely: a broad equity index has a moderate \(\mu\) and \(\sigma\), while a single biotech stock can have a much larger \(\sigma\).
Interactive: Stock Price Simulator
Below you can play with \(\mu\) and \(\sigma\) and see how an ensemble of futures for the same stock evolves. The orange curve is the deterministic mean. Each blue line is one of many possible paths the world could take.
The orange curve is the deterministic mean E[S_t]. Each thin blue line is one possible future of the stock — generated using the closed-form lognormal update derived from Itô's lemma. Crank σ up to feel why volatility, not drift, is the dominant force.
Notice that turning the drift up shifts the orange line, but turning the volatility up fans the paths out exponentially. Volatility is what makes options valuable in the first place — and we are about to see why.
Worked Example: From dS to log(S)
The whole reason Itô's lemma is worth knowing is that it lets us change variables in an SDE. The single most important change of variables in finance is \(f(S) = \log S\). Let us apply Itô's lemma by hand.
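Step 1. Identify the pieces. From \(dS = \mu S\,dt + \sigma S\,dW\) we read off the drift \(\mu S\) and the diffusion \(\sigma S\) in the generic SDE notation.
Step 2. Choose \(f(S, t) = \log S\). Compute the partials:
\[ \frac{\partial f}{\partial t} = 0, \qquad \frac{\partial f}{\partial S} = \frac{1}{S}, \qquad \frac{\partial^2 f}{\partial S^2} = -\frac{1}{S^2}. \]
Step 3. Plug into Itô's lemma:
\[ d(\log S) = \left( 0 + \mu S \cdot \frac{1}{S} + \tfrac{1}{2} \sigma^2 S^2 \cdot \left( -\frac{1}{S^2} \right) \right) dt + \sigma S \cdot \frac{1}{S}\,dW. \]
Step 4. Simplify each piece. The two factors of \(S\) in the drift cancel, leaving \(\mu\). The \(\tfrac{1}{2} \sigma^2 S^2 \cdot (-1/S^2)\) term gives the Itô correction \(-\tfrac{1}{2}\sigma^2\). The diffusion term collapses to \(\sigma\,dW\):
\[ d(\log S_t) = \left( \mu - \tfrac{1}{2}\sigma^2 \right) dt + \sigma\,dW_t. \]
Step 5. Integrate from \(0\) to \(T\). The right-hand side is the deterministic constant \(\mu - \tfrac{1}{2}\sigma^2\) times \(T\) plus \(\sigma\) times the total Brownian increment:
\[ \log S_T - \log S_0 = \left( \mu - \tfrac{1}{2}\sigma^2 \right) T + \sigma W_T. \]
Because \(W_T \sim \mathcal{N}(0, T)\), the log return is normally distributed:
\[ \log \frac{S_T}{S_0} \sim \mathcal{N}\!\left( \left( \mu - \tfrac{1}{2}\sigma^2 \right) T,\ \sigma^2 T \right). \]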
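Numerical check. With \(\mu = 0.10\), \(\sigma = 0.20\), \(T = 1\):
- Mean of log return = \((\mu - \tfrac{1}{2}\sigma^2)T = 0.10 - 0.02 = 0.08\).
- Std of log return = \(\sigma\sqrt{T} = 0.20\).
- Naive (wrong) answer: mean = 0.10. The difference, \(\tfrac{1}{2}\sigma^2 T = 0.02\), is exactly the Itô correction.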
Takeaway. A stock with expected return 10% does not have expected log return 10%; it has 8%. That two-percent gap, hammered out by volatility, is the origin of \(d_1\) and \(d_2\) in the Black–Scholes formula you will meet in section 31.5.
Two stocks with the same expected return but different volatilities compound to different typical futures. Volatility eats geometric growth.
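The gap between the two kinds of average can be checked by sampling. The sketch below draws log returns from the lognormal law derived above and compares the mean of the log return with the log of the mean gross return. The parameter values \(\mu = 0.10\), \(\sigma = 0.20\), \(T = 1\), the sample size, the seed, and the function name `sample_log_returns` are assumptions chosen to mirror the numerical check.

```python
import math
import random
import statistics


def sample_log_returns(mu=0.10, sigma=0.20, T=1.0, n_paths=200_000, seed=7):
    """Draw log(S_T / S_0) from N((mu - sigma^2/2) T, sigma^2 T)."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma**2) * T
    vol = sigma * math.sqrt(T)
    return [drift + vol * rng.gauss(0.0, 1.0) for _ in range(n_paths)]


log_returns = sample_log_returns()

# Average of the log return: should sit near (mu - sigma^2/2) T = 0.08.
mean_log = statistics.fmean(log_returns)

# Log of the average gross return S_T / S_0: should sit near mu T = 0.10.
log_of_mean = math.log(statistics.fmean(math.exp(r) for r in log_returns))

print(f"E[log(S_T/S_0)]   ~ {mean_log:.4f}")
print(f"log E[S_T/S_0]    ~ {log_of_mean:.4f}")
print(f"gap               ~ {log_of_mean - mean_log:.4f}  (theory: sigma^2 T / 2 = 0.02)")
```

Raising `sigma` while holding `mu` fixed widens the gap, which is the sense in which volatility eats geometric growth.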
Python: Euler–Maruyama from First Principles
Let us write the SDE simulator with no abstractions — one path, one explicit step at a time. This is the discrete cousin of the SDE itself: drift plus diffusion, step by step.
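The listing itself does not appear in this copy of the text, so what follows is a minimal sketch of such a simulator, not the original snippet. The starting price 100 and seed 42 follow the surrounding text; \(\mu\), \(\sigma\), \(T\), the step count, and the function name `euler_maruyama_gbm` are assumptions. Because the random number generator and step count may differ from the original, the printed values need not match the figures quoted in the text.

```python
import math
import random


def euler_maruyama_gbm(S0=100.0, mu=0.10, sigma=0.20, T=1.0, n_steps=252, seed=42):
    """Simulate one GBM path with the Euler-Maruyama scheme.

    Each step applies S <- S + mu*S*dt + sigma*S*dW with dW ~ N(0, dt).
    Returns the list of prices, including the starting value.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    S = S0
    path = [S]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))    # Brownian increment
        S = S + mu * S * dt + sigma * S * dW  # drift step plus diffusion step
        path.append(S)
    return path


path = euler_maruyama_gbm()
S_T = path[-1]
print(f"S_T        = {S_T:.4f}")
print(f"log return = {math.log(S_T / path[0]):+.4f}")
```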
Verify it yourself. Copy the snippet, run it, and you should see something close to:
S_T = 113.4205
log return = +0.1259
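With seed=42 the value is reproducible. Taking the values from the numerical check (\(\mu = 0.10\), \(\sigma = 0.20\), \(T = 1\)), the expected log return is \((\mu - \tfrac{1}{2}\sigma^2)T = 0.08\), so \(+0.1259\) is well within one standard deviation (\(\sigma\sqrt{T} = 0.20\)) of it, perfectly normal for a single sample.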
PyTorch: Vectorised Path Generation
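For pricing options by Monte Carlo we want tens of thousands of paths. A Python loop crawls; a tensor batch flies. The trick is to use the closed-form log-Euler update we derived above, \(\log S_{t+\Delta t} = \log S_t + (\mu - \tfrac{1}{2}\sigma^2)\Delta t + \sigma\sqrt{\Delta t}\,Z\) with \(Z \sim \mathcal{N}(0, 1)\), then take a cumulative sum (`torch.cumsum`) along the time axis.
Why the closed-form update beats Euler–Maruyama on \(S\) directly. The naive Euler step \(S_{t+\Delta t} = S_t\,(1 + \mu\,\Delta t + \sigma\sqrt{\Delta t}\,Z)\) can produce negative prices when \(Z\) is very negative and \(\sigma\sqrt{\Delta t}\) is large. Working in \(\log S\) space and exponentiating at the end is exact in distribution and never goes negative.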
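The batch idea can be sketched as follows. To keep the example light on dependencies it uses NumPy as a stand-in for PyTorch; this is a substitution, not the section's original listing. In PyTorch the corresponding calls would be `torch.randn`, `torch.cumsum(..., dim=1)`, `torch.cat`, and `torch.exp`. The function name `simulate_gbm_paths` and all parameter values are assumptions.

```python
import numpy as np


def simulate_gbm_paths(S0=100.0, mu=0.10, sigma=0.20, T=1.0,
                       n_steps=100, n_paths=20_000, seed=0):
    """Return an array of shape (n_paths, n_steps + 1) of GBM prices,
    generated with the exact log-space update and a cumulative sum."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    Z = rng.standard_normal((n_paths, n_steps))
    # Per-step log increments: (mu - sigma^2/2) dt + sigma sqrt(dt) Z
    log_increments = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z
    # Cumulative sum along the time axis gives log(S_t / S_0) at every step.
    log_paths = np.cumsum(log_increments, axis=1)
    # Prepend a zero column so that every path starts at S0.
    log_paths = np.concatenate([np.zeros((n_paths, 1)), log_paths], axis=1)
    return S0 * np.exp(log_paths)


paths = simulate_gbm_paths()
print("shape          :", paths.shape)
print("min price      :", paths.min())
print("mean of S_T    :", paths[:, -1].mean(), "(theory: S0 * exp(mu * T))")
```

No Python-level loop over time steps or paths appears; all of the work happens inside array operations, which is what makes the batch version fast.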
Stochastic Differential Equations
A general one-dimensional SDE has the form
\[ dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dW_t. \]
The two ingredients are the drift coefficient \(\mu(x, t)\) and the diffusion coefficient \(\sigma(x, t)\). Different choices give very different processes:
| Name | Drift μ(x,t) | Diffusion σ(x,t) | Typical use |
|---|---|---|---|
| Geometric Brownian motion | μ x | σ x | Stock prices in Black–Scholes |
| Ornstein–Uhlenbeck | −θ(x − m) | σ | Mean-reverting interest rates, asset volatility |
| Cox–Ingersoll–Ross | κ(θ − x) | σ √x | Interest rate models that stay non-negative |
| Heston | κ(θ − x) | ξ √x | Stochastic-volatility option pricing |
| Langevin equation | −∇U(x) | √(2/β) | Statistical physics, diffusion models in ML |
Itô's lemma applies uniformly to all of them. Once you can differentiate a function and substitute its drift and diffusion into the formula, you can transform any SDE into a new SDE for any smooth function of the state.
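Because every row of the table differs only in its drift and diffusion functions, a single Euler–Maruyama routine can take them as arguments. The sketch below is one possible arrangement, applied to the Ornstein–Uhlenbeck row; the function name `euler_maruyama`, the parameter values for \(\theta\), \(m\), \(\sigma\), and the sample sizes are assumptions for illustration.

```python
import math
import random
import statistics


def euler_maruyama(drift, diffusion, x0, T, n_steps, rng):
    """Integrate dX = drift(X, t) dt + diffusion(X, t) dW and return X_T."""
    dt = T / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x, t) * dt + diffusion(x, t) * dW
        t += dt
    return x


# Ornstein-Uhlenbeck: drift -theta (x - m), constant diffusion sigma.
theta, m, sigma = 2.0, 0.05, 0.10  # illustrative values (assumed)
rng = random.Random(123)
finals = [
    euler_maruyama(lambda x, t: -theta * (x - m),
                   lambda x, t: sigma,
                   x0=0.20, T=5.0, n_steps=500, rng=rng)
    for _ in range(2_000)
]

print(f"sample mean of X_T : {statistics.fmean(finals):.4f}  (long-run mean m = {m})")
print(f"sample std of X_T  : {statistics.pstdev(finals):.4f}  "
      f"(stationary std = {sigma / math.sqrt(2 * theta):.4f})")
```

Swapping in `lambda x, t: mu * x` and `lambda x, t: sigma * x` would give the geometric Brownian motion row. The square-root rows need extra care, since a plain Euler step can push \(x\) below zero.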
Important caveat. Itô calculus is one of two consistent stochastic calculi. The other, Stratonovich, drops the \(\tfrac{1}{2}\sigma^2 \frac{\partial^2 f}{\partial x^2}\) term from the chain rule but interprets the integral differently. Finance and machine learning conventionally use Itô (forward-looking, non-anticipating). Physics and engineering often use Stratonovich (consistent with the smooth limit). They are translatable, but mixing them silently is a classic bug.
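The translation between the two conventions is a drift adjustment. In the standard one-dimensional form, with \(\circ\,dW\) denoting the Stratonovich integral and \(\tilde{\mu}\) a symbol introduced here for the Stratonovich drift, it reads:

```latex
\begin{aligned}
\text{Stratonovich:}\quad & dX_t = \tilde{\mu}(X_t, t)\,dt + \sigma(X_t, t) \circ dW_t \\
\Longleftrightarrow\quad
\text{It\^o:}\quad & dX_t = \left( \tilde{\mu}(X_t, t)
   + \tfrac{1}{2}\,\sigma(X_t, t)\,\frac{\partial \sigma}{\partial x}(X_t, t) \right) dt
   + \sigma(X_t, t)\,dW_t .
\end{aligned}
```

For geometric Brownian motion, \(\sigma(x, t) = \sigma x\) gives an extra drift of \(\tfrac{1}{2}\sigma^2 x\), so a model calibrated in one convention and simulated in the other is off by exactly that amount. When the diffusion coefficient does not depend on \(x\), the two conventions coincide.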
Where Itô's Lemma Lives
Itô's lemma is not a finance-only tool. It is the chain rule of every field that deals with continuous-time noise.
- Black–Scholes PDE. Apply Itô to an option price \(V(S, t)\). The \(dW\) term is hedgeable by holding \(\partial V / \partial S\) shares of stock. Setting the drift of the hedged portfolio equal to the risk-free rate gives the PDE, the subject of the next section.
- Interest-rate models. The short rate \(r_t\) is an SDE; bond prices are expectations of \(\exp\!\left(-\int_t^T r_s\,ds\right)\). Itô is how you derive the bond PDE.
- Filtering and Kalman–Bucy. Conditioning a signal on noisy observations gives an SDE for the conditional mean. Itô delivers the gain equation.
- Stochastic gradient Langevin dynamics. In Bayesian deep learning, parameters evolve via \(d\theta = -\nabla U(\theta)\,dt + \sqrt{2/\beta}\,dW\). Itô calculus lets you analyse the long-run distribution.
- Diffusion generative models. The forward and reverse SDEs of score-based models are calibrated using Itô's lemma. The famous score term \(\nabla_x \log p_t(x)\) in the reverse drift is a second-order effect of the same kind as the Itô correction.
The slogan. \((dW)^2 = dt\) is the smallest mathematical statement with the largest economic, scientific, and ML footprint of the twentieth century.
Summary
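- Brownian motion has nonzero quadratic variation: \([W]_T = T\), almost surely.
- Heuristic algebra of differentials: \((dW)^2 = dt\), \(dW\,dt = 0\), \((dt)^2 = 0\).
- Itô's lemma: for \(f(X_t, t)\) with \(dX = \mu\,dt + \sigma\,dW\), \(df = \left( f_t + \mu f_x + \tfrac{1}{2}\sigma^2 f_{xx} \right) dt + \sigma f_x\,dW\), where subscripts denote partial derivatives.
- The extra \(\tfrac{1}{2}\sigma^2 f_{xx}\) term is the Itô correction. It is what makes stochastic calculus different from ordinary calculus.
- Applied to \(f = \log S\) on geometric Brownian motion, Itô gives the closed-form lognormal solution that lets us simulate paths exactly in log-space.
- The Euler–Maruyama scheme is the simplest numerical solver for an SDE: drift step plus \(\sqrt{\Delta t}\)-scaled Gaussian noise.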
- Itô's lemma is the foundation of option pricing, interest-rate modelling, filtering, Langevin dynamics, and modern diffusion generative models.
Up next (Section 31.4): we will apply Itô's lemma to an option price \(V(S, t)\), build a self-financing replicating portfolio, and derive the famous Black–Scholes partial differential equation.