Chapter 31

Itô's Lemma and Stochastic Differential Equations

The Black-Scholes Equation

Learning Objectives

By the end of this section you will be able to:

  1. See why the ordinary chain rule fails when noise has unbounded variation.
  2. Derive the heuristic $(dW)^2 = dt$ from quadratic variation.
  3. State and apply Itô's lemma to functions of a stochastic process.
  4. Transform a stochastic differential equation by changing variables (log‑transform of GBM).
  5. Simulate SDEs with Euler–Maruyama in Python and in vectorised PyTorch.

Why the Ordinary Chain Rule Breaks

In ordinary calculus, if $x(t)$ is a smooth function and $f$ is differentiable, the chain rule says

$$df = f'(x)\, dx.$$

That is, the change in $f$ is linear in the change in $x$. The reason this works is buried in a Taylor expansion:

$$df = f'(x)\,dx + \tfrac{1}{2}f''(x)\,(dx)^2 + \cdots$$

For smooth $x(t)$, the increment $dx$ is of order $dt$, so $(dx)^2$ is of order $(dt)^2$, which is negligible as $dt \to 0$. We drop it without guilt.

The Twist for Brownian Motion

If $X(t) = W(t)$, a Brownian motion, then $dW$ is not of order $dt$. It is of order $\sqrt{dt}$.

So $(dW)^2$ is of order $dt$, the same order as the drift term we are trying to keep. We cannot throw it away.
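
The scaling claim can be checked numerically. The sketch below is illustrative only: the seed, the grid size `n_fine`, the comparison path $\sin(2\pi t)$, and the helper name `rms_increment` are all assumptions made for this example. It builds one Brownian path on a fine grid and measures how large a typical increment is over windows of different length $dt$.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility
n_fine = 2**18                      # number of fine steps on [0, 1]
dt_fine = 1.0 / n_fine
t = np.linspace(0.0, 1.0, n_fine + 1)

# One Brownian path built from independent N(0, dt_fine) increments,
# and a smooth path for comparison.
W = np.concatenate([[0.0], np.cumsum(np.sqrt(dt_fine) * rng.standard_normal(n_fine))])
x = np.sin(2.0 * np.pi * t)

def rms_increment(path, lag):
    """Root-mean-square change of `path` over non-overlapping windows of `lag` fine steps."""
    inc = path[lag::lag] - path[:-lag:lag]
    return np.sqrt(np.mean(inc**2))

for lag in (4096, 256, 16):
    dt = lag * dt_fine
    print(f"dt = {dt:.2e}   "
          f"smooth rms/dt = {rms_increment(x, lag) / dt:6.3f}   "
          f"Brownian rms/sqrt(dt) = {rms_increment(W, lag) / np.sqrt(dt):6.3f}   "
          f"Brownian rms/dt = {rms_increment(W, lag) / dt:9.1f}")
```

For the smooth path the ratio to $dt$ settles near a constant, while for the Brownian path it is the ratio to $\sqrt{dt}$ that stays near 1 and the ratio to $dt$ grows as the window shrinks.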

This single observation is the entire reason Itô's lemma exists, and the entire reason stochastic calculus is its own subject. Everything below is just careful book-keeping around that one fact.


The Secret: (dW)² = dt

Let us pin down what "of order $dt$" really means. Subdivide $[0, T]$ into $N$ intervals of size $\Delta t = T/N$. The Brownian increment over each interval is $\Delta W_i = W(t_{i+1}) - W(t_i) \sim \mathcal{N}(0, \Delta t)$.

The quadratic variation of the path is the limit

$$\langle W \rangle_T \;=\; \lim_{N\to\infty}\, \sum_{i=0}^{N-1} (\Delta W_i)^2.$$

Each summand is the square of a Gaussian, and its expectation is $\mathbb{E}[(\Delta W_i)^2] = \Delta t$. So the expectation of the sum is exactly $N \cdot \Delta t = T$. The variance of each summand is $2(\Delta t)^2$, and the increments are independent, so the variance of the whole sum is $N \cdot 2(\Delta t)^2 = 2T \cdot \Delta t \to 0$. The mean stays at $T$ and the fluctuations vanish, so the sum converges to $T$ in mean square (and almost surely along nested partitions such as dyadic ones):

$$\langle W \rangle_T = T \quad \text{almost surely.}$$

Heuristic shorthand: $(dW)^2 = dt$. By similar but easier arguments, $dt \cdot dW = 0$ and $(dt)^2 = 0$. These three rules are the entire algebra of stochastic differentials.

Compare this with a smooth function: the same sum of $(\Delta x_i)^2$ would be of order $\Delta t$ and vanish. Brownian motion is wiggly enough that its squared increments add up to something finite and deterministic. That paradox, random in every increment yet deterministic in the sum of squares, is the engine of Itô's lemma.


Interactive: Quadratic Variation Explorer

Below we sample a Brownian path on $[0, 1]$, then compute two sums over the increments: $\sum (\Delta W_i)^2$ and $\sum |\Delta W_i|$. Move the slider to increase $N$. Watch the green box snap to $T = 1$ while the red box blows up.

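[Interactive widget: Quadratic Variation Explorer. It displays the sum of squared increments, $\sum (\Delta W)^2$ (target as $N \to \infty$: $T = 1$), and the sum of absolute increments, $\sum |\Delta W|$, which diverges as $N$ grows because Brownian motion has unbounded variation. Sample readout: $\sum (\Delta W)^2 = 1.0757$, $\sum |\Delta W| = 12.87$.]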

Drag the slider. The squared sum locks onto T; the absolute sum keeps growing. That tiny green box is the entire reason Itô's lemma needs an extra term.

The squared sum is a stable, deterministic quantity, exactly what we need to define $(dW)^2 = dt$. The absolute sum being infinite is why you cannot define stochastic integrals pathwise as Riemann–Stieltjes integrals. You have to build a new machinery: the Itô integral.
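
The two sums from the explorer can be reproduced offline. This is a minimal sketch, not the widget's actual code: the function name `variation_sums`, the seed, and the list of $N$ values are assumptions for illustration, and each $N$ draws a fresh path rather than refining a single one.

```python
import numpy as np

def variation_sums(N, T=1.0, seed=0):
    """Return (sum of squared increments, sum of absolute increments)
    for one Brownian path sampled on N equal steps of [0, T]."""
    rng = np.random.default_rng(seed)
    dW = np.sqrt(T / N) * rng.standard_normal(N)
    return np.sum(dW**2), np.sum(np.abs(dW))

for N in (16, 256, 4096, 65536):
    qv, tv = variation_sums(N)
    # E[sum |dW|] = sqrt(2 N T / pi), which grows without bound in N
    print(f"N = {N:6d}   sum dW^2 = {qv:.4f}   sum |dW| = {tv:8.2f}   "
          f"sqrt(2N/pi) = {np.sqrt(2 * N / np.pi):8.2f}")
```

The squared sum tightens around $T = 1$ because its standard deviation is $\sqrt{2/N}$, while the absolute sum tracks $\sqrt{2N/\pi}$ and keeps growing.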

Itô's Lemma: Statement and Intuition

Suppose $X_t$ follows a stochastic differential equation

$$dX_t = \mu(X_t, t)\, dt + \sigma(X_t, t)\, dW_t,$$

and let $f(x, t)$ be any function with two continuous spatial derivatives and one time derivative. Then Itô's lemma says

$$df = \left(\frac{\partial f}{\partial t} + \mu\,\frac{\partial f}{\partial x} + \tfrac{1}{2}\sigma^2\,\frac{\partial^2 f}{\partial x^2}\right) dt \;+\; \sigma\,\frac{\partial f}{\partial x}\, dW.$$

Compare this with what the chain rule from ordinary calculus would have given you:

$$df_{\text{naive}} = \left(\frac{\partial f}{\partial t} + \mu\,\frac{\partial f}{\partial x}\right) dt \;+\; \sigma\,\frac{\partial f}{\partial x}\, dW.$$

The Itô Correction

The only difference is the extra term $\tfrac{1}{2}\sigma^2 f_{xx}\, dt$. It is called the Itô correction and it comes entirely from the $(dW)^2 = dt$ rule.

Where the correction comes from. Taylor-expand $f(X_{t+dt}, t+dt)$ in both arguments, keeping terms up to second order in $dX$:

$$df = f_t\, dt + f_x\, dX + \tfrac{1}{2} f_{xx}\,(dX)^2 + \text{higher order}.$$

Now substitute $dX = \mu\, dt + \sigma\, dW$ into $(dX)^2$ and apply the three multiplication rules:

$$(dX)^2 = \mu^2(dt)^2 + 2\mu\sigma\,dt\,dW + \sigma^2(dW)^2 = 0 + 0 + \sigma^2\, dt.$$

Plug that back in, gather $dt$ and $dW$ terms, and Itô's lemma falls out. The whole derivation is a Taylor expansion plus the bookkeeping rule $(dW)^2 = dt$.

Analogy. Think of $dW$ as a coin flip that has zero average but bounces with size $\sqrt{dt}$. Squaring it knocks out the sign, and what survives is the average squared size of a single bounce: $dt$. That residual deterministic effect is the Itô correction.
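
A quick numerical check of the correction, using the simplest nontrivial case $f(x) = x^2$ with $X = W$ (so $\mu = 0$, $\sigma = 1$). Itô's lemma then reads $d(W^2) = 2W\,dW + dt$, whereas the ordinary chain rule would give only $2W\,dW$. The sketch below is an illustration under assumed values (seed, `N`, variable names); it approximates the stochastic integral by a left-endpoint sum.

```python
import numpy as np

rng = np.random.default_rng(1)      # arbitrary seed
T, N = 1.0, 100_000
dW = np.sqrt(T / N) * rng.standard_normal(N)
W = np.concatenate([[0.0], np.cumsum(dW)])

# Left-endpoint (Ito) sum approximating the integral of 2 W dW over [0, T]
ito_integral = np.sum(2.0 * W[:-1] * dW)

lhs = W[-1]**2                      # f(W_T) - f(W_0) for f(x) = x^2
naive = ito_integral                # ordinary chain rule: d(W^2) = 2 W dW
ito = ito_integral + T              # Ito's lemma:         d(W^2) = 2 W dW + dt

print(f"W_T^2              = {lhs:+.4f}")
print(f"naive chain rule   = {naive:+.4f}   (error {lhs - naive:+.4f})")
print(f"with Ito correction = {ito:+.4f}   (error {lhs - ito:+.4f})")
```

The naive prediction misses by almost exactly $T$, which is the accumulated $\tfrac{1}{2} f_{xx}\,(dW)^2 = dt$ term; with the correction the residual is just discretisation noise.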


Application: Geometric Brownian Motion

The most famous SDE in finance is geometric Brownian motion (GBM):

$$dS_t = \mu\, S_t\, dt + \sigma\, S_t\, dW_t.$$

Read it like this: in a small time $dt$, the stock price moves by a deterministic drift $\mu S\, dt$ plus a random kick $\sigma S\, dW$. Both pieces scale with the current price, which is what we want: a 100-dollar stock should fluctuate more (in dollar terms) than a 1-dollar stock.

Here $\mu$ is the expected return per unit time and $\sigma$ is the volatility. Typical values: a broad equity index has $\mu \approx 8\%/\text{yr}$, $\sigma \approx 16\%/\text{yr}$; a single biotech stock can have $\sigma \approx 60\%/\text{yr}$.


Interactive: Stock Price Simulator

Below you can play with $\mu$ and $\sigma$ and see how an ensemble of futures for the same stock evolves. The orange curve is the deterministic mean. Each blue line is one of many possible paths the world could take.

[Interactive widget: geometric Brownian motion, $dS = \mu S\, dt + \sigma S\, dW$. Sample statistics at $t = T$ over 40 paths: empirical mean 110.63 (theory $S_0 e^{\mu T} = 110.52$); empirical std 28.85 (theory 28.07).]

The orange curve is the deterministic mean E[S_t]. Each thin blue line is one possible future of the stock — generated using the closed-form lognormal update derived from Itô's lemma. Crank σ up to feel why volatility, not drift, is the dominant force.

Notice that turning the drift up shifts the orange line, but turning the volatility up fans the paths out exponentially. Volatility is what makes options valuable in the first place — and we are about to see why.


Worked Example: From dS to log(S)

The whole reason Itô's lemma is worth knowing is that it lets us change variables in an SDE. The single most important change of variables in finance is $f(S) = \log S$. Let us apply Itô's lemma by hand.


Step 1. Identify the pieces. From $dS = \mu S\, dt + \sigma S\, dW$ we read off $\mu_X = \mu S$ and $\sigma_X = \sigma S$ in the generic SDE notation.

Step 2. Choose $f(S, t) = \log S$. Compute the partials:

$$\frac{\partial f}{\partial t} = 0,\quad \frac{\partial f}{\partial S} = \frac{1}{S},\quad \frac{\partial^2 f}{\partial S^2} = -\frac{1}{S^2}.$$

Step 3. Plug into Itô's lemma:

$$d(\log S) = \left(0 + \mu S \cdot \frac{1}{S} + \tfrac{1}{2}(\sigma S)^2 \cdot \left(-\frac{1}{S^2}\right)\right) dt + (\sigma S)\cdot \frac{1}{S}\, dW.$$

Step 4. Simplify each piece. The two $S$ factors in the drift cancel, leaving $\mu$. The $(\sigma S)^2 / S^2 = \sigma^2$ term gives the Itô correction $-\tfrac{1}{2}\sigma^2$. The diffusion term collapses to $\sigma$.

$$\boxed{\, d(\log S) = \left(\mu - \tfrac{1}{2}\sigma^2\right) dt + \sigma\, dW. \,}$$

Step 5. Integrate from $0$ to $T$. The right-hand side is a constant times $T$ plus $\sigma$ times the total Brownian increment:

$$\log\frac{S_T}{S_0} \;=\; \left(\mu - \tfrac{1}{2}\sigma^2\right) T \;+\; \sigma\, W_T.$$

Because $W_T \sim \mathcal{N}(0, T)$, the log return is normally distributed:

$$\log\frac{S_T}{S_0} \sim \mathcal{N}\!\left((\mu - \tfrac{1}{2}\sigma^2)T,\, \sigma^2 T\right).$$

Numerical check. With $\mu = 0.10$, $\sigma = 0.20$, $T = 1$:

  • Mean of log return = $0.10 - \tfrac{1}{2}(0.04) = 0.08$.
  • Std of log return = $0.20$.
  • Naive (wrong) answer: mean = 0.10. The difference, $-\tfrac{1}{2}\sigma^2 = -0.02$, is exactly the Itô correction.

Takeaway. A stock with an expected return of 10% does not have an expected log return of 10%; it has 8%. That two-percent gap, hammered out by volatility, is the origin of the $\pm\tfrac{1}{2}\sigma^2$ terms in $d_1$ and $d_2$ in the Black–Scholes formula you will meet in section 31.5.

Two stocks with the same expected return but different volatilities compound to different typical futures. Volatility eats geometric growth.
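
The gap between the mean and the typical outcome can be seen in a small Monte Carlo experiment. The sketch below samples $S_T$ directly from the lognormal solution derived above; the seed, `n_paths`, and variable names are assumptions for illustration. It uses the fact that $\mathbb{E}[e^{\sigma W_T}] = e^{\sigma^2 T/2}$, so $\mathbb{E}[S_T] = S_0 e^{\mu T}$, while the median is $S_0 e^{(\mu - \sigma^2/2)T}$.

```python
import numpy as np

S0, mu, sigma, T = 100.0, 0.10, 0.20, 1.0
rng = np.random.default_rng(7)      # arbitrary seed
n_paths = 200_000

# Exact terminal sampling: log(S_T / S_0) = (mu - sigma^2/2) T + sigma W_T
W_T = np.sqrt(T) * rng.standard_normal(n_paths)
log_ret = (mu - 0.5 * sigma**2) * T + sigma * W_T
S_T = S0 * np.exp(log_ret)

print(f"mean log return : {log_ret.mean():.4f}   (theory {(mu - 0.5 * sigma**2) * T:.4f})")
print(f"std  log return : {log_ret.std():.4f}   (theory {sigma * np.sqrt(T):.4f})")
print(f"mean   S_T      : {S_T.mean():.2f}   (theory {S0 * np.exp(mu * T):.2f})")
print(f"median S_T      : {np.median(S_T):.2f}   (theory {S0 * np.exp((mu - 0.5 * sigma**2) * T):.2f})")
```

The mean of $S_T$ grows at rate $\mu$, but the median (the typical path) grows at the lower rate $\mu - \tfrac{1}{2}\sigma^2$.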

Python: Euler–Maruyama from First Principles

Let us write the SDE simulator with no abstractions — one path, one explicit step at a time. This is the discrete cousin of the SDE itself: drift plus diffusion, step by step.

Plain Python Euler–Maruyama simulator (simulate_gbm.py)

Walkthrough of the listing that follows, statement by statement.

NumPy import (`import numpy as np`). We need fast vector math and a good random generator. `numpy.random.default_rng` returns the modern `Generator` API, backed by PCG64 by default, which is preferable to the legacy `np.random.seed`/`randn` pair.

Function signature (`def simulate_gbm(...)`). Four model parameters, one numerical parameter, and a seed. `S0` is today's price, `mu` is the annualised expected return, `sigma` is the annualised volatility, `T` is the time horizon in years, and `N` is the number of Euler steps. Setting `seed` makes the path reproducible, which is essential for unit tests.
Example: the defaults are S₀ = 100, μ = 10%/yr, σ = 20%/yr, T = 1 yr, N = 1000 steps.

Create the random generator (`rng = np.random.default_rng(seed)`). `default_rng(seed)` returns a fresh `Generator` with its own state. All randomness in this function is drawn from it, so the same seed always reproduces the same path.

Time-step size (`dt = T / N`). We split [0, T] into N equal intervals of length dt = T/N. Smaller dt brings us closer to the true continuous SDE but costs more compute. For a general SDE the strong (pathwise) error of Euler–Maruyama is O(√dt), slower than the O(dt) that Euler's method achieves for ODEs; this is the price of randomness.
Example: T = 1.0, N = 1000 gives dt = 0.001 years, about a quarter of a trading day (assuming 252 trading days per year).

Time grid (`t = np.linspace(0.0, T, N + 1)`). The array of N+1 sample times. We need N+1 points because we keep S[0] and produce N updates.

Allocate the price array (`S = np.empty(N + 1)`). Pre-allocate an uninitialised float64 array of length N+1. Every entry is overwritten below, so `np.empty` is safe, and the result is already an array with no list-to-array conversion at the end.

Initial condition (`S[0] = S0`). The SDE is first-order, so a single initial value pins down the whole path (given the same Brownian sample).

Step loop (`for i in range(N)`). We walk the timeline left to right. At each step we sample one new Brownian increment and use it to push S forward by one dt.

Brownian increment (`dW = np.sqrt(dt) * rng.standard_normal()`). In theory dW ~ N(0, dt). We synthesise it as √dt · Z where Z ~ N(0, 1). This is the discretised version of the integral ∫_t^{t+dt} dW = W(t+dt) − W(t).
Example: if Z = 0.3 and dt = 0.001, then dW = √0.001 · 0.3 ≈ 0.00949.

Euler–Maruyama update (`dS = mu * S[i] * dt + sigma * S[i] * dW`). Direct discretisation of dS = μ·S·dt + σ·S·dW. The first term is deterministic drift; the second is the random kick. Both scale with the current price S[i], which is what makes GBM multiplicative. The exact solution stays positive; the Euler step does so only while 1 + μ·dt + σ·dW > 0.
Example: S[i] = 100, μ = 0.10, σ = 0.20, dt = 0.001, dW = 0.00949 gives drift = 100·0.10·0.001 = 0.01, diffusion = 100·0.20·0.00949 ≈ 0.1898, dS ≈ 0.2.

Advance the price (`S[i + 1] = S[i] + dS`). The simplest possible time-marching scheme: there is no implicit step and no Newton solve. All the randomness lives in dW.

Return both arrays (`return t, S`). Returning `(t, S)` lets the caller plot, log-transform, or compute summary statistics. We never mutate global state.

Call the simulator (`t, S = simulate_gbm()`). Run with default parameters and capture both arrays.

Print summary (the `print` calls). The log return ln(S_T/S_0) is the natural quantity. By Itô's lemma it should be approximately Normal with mean (μ − σ²/2)·T = 0.08 and standard deviation σ√T = 0.20. A fixed seed always prints the same value; across different seeds, about 95% of values fall within two standard deviations (±0.40) of the mean.
Example: theory gives mean = 0.08, std = 0.20. One realised path could be +0.0612, −0.1145, +0.3417, and so on.
```python
import numpy as np

def simulate_gbm(S0=100.0, mu=0.10, sigma=0.20, T=1.0, N=1000, seed=42):
    """
    Simulate one sample path of geometric Brownian motion
        dS = mu * S * dt + sigma * S * dW
    using the Euler-Maruyama scheme.
    """
    rng = np.random.default_rng(seed)   # reproducible random stream
    dt = T / N                          # step size
    t = np.linspace(0.0, T, N + 1)      # N + 1 sample times
    S = np.empty(N + 1)                 # every entry is overwritten below
    S[0] = S0

    for i in range(N):
        dW = np.sqrt(dt) * rng.standard_normal()   # Brownian increment ~ N(0, dt)
        dS = mu * S[i] * dt + sigma * S[i] * dW    # drift + diffusion
        S[i + 1] = S[i] + dS

    return t, S

t, S = simulate_gbm()
print(f"S_0   = {S[0]:.4f}")
print(f"S_T   = {S[-1]:.4f}")
print(f"log return = {np.log(S[-1] / S[0]):+.4f}")
```

Verify it yourself. Copy the snippet, run it, and you should see something close to:

S_0 = 100.0000
S_T = 113.4205
log return = +0.1259

With seed=42 the result is reproducible from run to run. The expected log return is $0.08$, so $+0.126$ sits about $0.046$ above it, roughly a quarter of a standard deviation ($\sigma\sqrt{T} = 0.20$), which is entirely unremarkable for a single sample.
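
How fast does the Euler–Maruyama path approach the true one as the step shrinks? For GBM we can measure this, because the exact solution driven by the same Brownian increments is known in closed form. The sketch below is an illustration, not part of the original listing: the helper name `strong_error`, the path count, and the step counts are assumptions. It vectorises the Euler recursion as a product of per-step factors.

```python
import numpy as np

def strong_error(N, n_paths=2_000, S0=100.0, mu=0.10, sigma=0.20, T=1.0, seed=0):
    """Mean absolute gap at time T between Euler-Maruyama and the exact GBM
    solution, both driven by the same Brownian increments."""
    rng = np.random.default_rng(seed)
    dt = T / N
    dW = np.sqrt(dt) * rng.standard_normal((n_paths, N))
    # Euler-Maruyama: S_{i+1} = S_i * (1 + mu*dt + sigma*dW_i), so S_N is a product
    S_em = S0 * np.prod(1.0 + mu * dt + sigma * dW, axis=1)
    # Exact solution uses only W_T, the sum of the same increments
    S_exact = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dW.sum(axis=1))
    return np.mean(np.abs(S_em - S_exact))

for N in (10, 40, 160, 640):
    print(f"N = {N:4d}   dt = {1.0 / N:.5f}   strong error = {strong_error(N):.4f}")
```

Each fourfold increase in `N` should roughly halve the error, consistent with a strong error of order $\sqrt{dt}$ for this scheme.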


PyTorch: Vectorised Path Generation

For pricing options by Monte Carlo we want tens of thousands of paths. A Python loop crawls; a tensor batch flies. The trick is to use the closed-form log-Euler update we derived above, then `cumsum` along the time axis.

Vectorised PyTorch simulator, GPU-ready (simulate_gbm_torch.py)

Walkthrough of the listing that follows, statement by statement.

Import torch (`import torch`). Same job as `numpy` but with GPU support and autograd. We will not need autograd here; we just want the fast tensorised random number generator and `cumsum`.

Vectorised function signature (`def simulate_gbm_batch(...)`). Same five parameters as the NumPy version, plus `num_paths`, `device`, `dtype`, and `seed`. The big change: we now simulate `num_paths` Brownian motions simultaneously. With 10,000 paths and N = 1000 we are about to draw 10 million normals in one shot.
Example: on a GPU a batch like this typically finishes in a small fraction of the time the scalar NumPy loop above would need for the same number of draws; exact timings depend on hardware.

Per-call generator (`g = torch.Generator(device=device).manual_seed(seed)`). This is the device-local counterpart of NumPy's `default_rng`. Passing an explicit generator makes the function reproducible without touching the global random state that `torch.manual_seed` sets for the whole process.

Time step (`dt = T / N`). Same definition as before. It is a plain Python float, computed once, because the time loop is about to be vectorised away entirely.

Sample standard normals (`Z = torch.randn(num_paths, N, ...)`). Produces a (num_paths × N) matrix of i.i.d. N(0, 1) samples. Row p, column i is the standard normal we will use for path p at step i.
Example: Z[0, 0] = 0.32, Z[0, 1] = −1.14, …; each row is one path's random ingredients.

Scale to Brownian increments (`dW = Z * (dt ** 0.5)`). Multiply elementwise by √dt to convert each Z into a dW with variance dt. Now `dW[p, i]` is the Brownian increment for path p during step i.

Deterministic per-step drift (`drift = (mu - 0.5 * sigma * sigma) * dt`). From Itô's lemma applied to log(S): d(log S) = (μ − σ²/2) dt + σ dW. The −σ²/2 part is the Itô correction; naive calculus would have given just μ.
Example: μ = 0.10, σ = 0.20, dt = 0.001 gives drift = (0.10 − 0.02)·0.001 = 8e-5.

Log-price increment per step (`log_increments = drift + sigma * dW`). Each entry of `log_increments` is one realisation of d(log S) for one path at one step. Same shape (num_paths × N).

cumsum builds the path (`torch.cumsum(log_increments, dim=1)`). Summing along the time axis integrates the increments: column i becomes the sum of log increments from step 0 up to step i. This is the discrete analogue of ∫_0^t (μ − σ²/2) ds + ∫_0^t σ dW(s).
Example: if increments along one path are [0.001, −0.002, 0.003], cumsum gives [0.001, −0.001, 0.002].

Prepend a zero column (`torch.cat([...], dim=1)`). We need a column of zeros at t = 0 so the final array has N+1 time points. `torch.cat` glues it on. The next line lifts everything up by log(S0).

Add log(S0) (`log_S = log_S + torch.log(...)`). Broadcasting `+ log(S0)` shifts every path so that S(0) = S0. Now `log_S[p, 0] = log(S0)` and `log_S[p, N]` is the realised log final price for path p.

Exponentiate to get S (`S = torch.exp(log_S)`). This recovers the actual price levels and is what guarantees S stays positive: taking a Wiener path through log space and then exponentiating is the "geometric" in geometric Brownian motion.

Time grid (`t = torch.linspace(...)`). Same as the NumPy version. Used only for plotting.

Call and report (the final `print` calls). Run the batch and check that the empirical mean of S_T matches the theoretical S₀·exp(μT). With 10,000 paths the standard error of the mean is about 0.22, so agreement within 1% is typical.
Example: empirical mean ≈ 110.4; theory: 100·exp(0.10) ≈ 110.52.
```python
import torch

def simulate_gbm_batch(S0=100.0, mu=0.10, sigma=0.20,
                       T=1.0, N=1000, num_paths=10_000,
                       device="cpu", dtype=torch.float32, seed=42):
    """
    Vectorised geometric Brownian motion. Generates many paths in parallel.
    Uses the exact log-Euler update from Ito's lemma applied to log(S).
    """
    g = torch.Generator(device=device).manual_seed(seed)
    dt = T / N

    # Brownian increments: shape (num_paths, N), each ~ N(0, dt)
    Z = torch.randn(num_paths, N, generator=g, device=device, dtype=dtype)
    dW = Z * (dt ** 0.5)

    # Exact GBM increment for log S:  d(log S) = (mu - sigma^2/2) dt + sigma dW
    drift = (mu - 0.5 * sigma * sigma) * dt
    log_increments = drift + sigma * dW

    # Cumulative sum along the time axis, then prepend a zero column for t = 0
    log_S = torch.cumsum(log_increments, dim=1)
    zeros = torch.zeros(num_paths, 1, device=device, dtype=dtype)
    log_S = torch.cat([zeros, log_S], dim=1)
    # Shift every path so that S(0) = S0
    log_S = log_S + torch.log(torch.tensor(S0, device=device, dtype=dtype))

    S = torch.exp(log_S)
    t = torch.linspace(0.0, T, N + 1, device=device, dtype=dtype)
    return t, S

t, S = simulate_gbm_batch()
print("S shape:", tuple(S.shape))
print("Mean S_T :", S[:, -1].mean().item())
# S0 * exp(mu * T) for the default arguments
print("Theory   :", 100.0 * torch.exp(torch.tensor(0.10)).item())
```

Why the closed-form update beats Euler–Maruyama on $S$ directly. The naive Euler step $S_{i+1} = S_i + \mu S_i\, dt + \sigma S_i\, dW$ can produce negative prices when $dW$ is very negative and $\sigma$ is large. Working in $\log S$ space and exponentiating at the end is exact in distribution and never goes negative.
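
To see how often the naive step can actually go negative, compare the two one-step growth factors on the same normal draws. The parameters below are deliberately extreme and are assumptions chosen for illustration (200% volatility, monthly steps); with the defaults used earlier the event is vanishingly rare. The Euler factor $1 + \mu\,dt + \sigma\sqrt{dt}\,Z$ is negative exactly when $Z < -(1 + \mu\,dt)/(\sigma\sqrt{dt})$.

```python
import math
import numpy as np

mu, sigma, dt = 0.10, 2.0, 1.0 / 12.0   # illustrative stress case, not typical values
rng = np.random.default_rng(3)          # arbitrary seed
Z = rng.standard_normal(1_000_000)

# One-step growth factors S_{i+1} / S_i under each scheme, same Z for both
euler_factor = 1.0 + mu * dt + sigma * np.sqrt(dt) * Z
exact_factor = np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z)

# Theoretical probability that the Euler factor is negative (standard normal CDF)
threshold = -(1.0 + mu * dt) / (sigma * math.sqrt(dt))
p_theory = 0.5 * (1.0 + math.erf(threshold / math.sqrt(2.0)))

print(f"Euler steps with negative factor : {(euler_factor < 0).mean():.4%}")
print(f"Theoretical probability          : {p_theory:.4%}")
print(f"Smallest exact (log-space) factor: {exact_factor.min():.4f}")
```

The exponential form is positive by construction, so the log-space update cannot produce a negative price no matter how extreme the draw.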


Stochastic Differential Equations

A general one-dimensional SDE has the form

$$dX_t = \mu(X_t, t)\, dt + \sigma(X_t, t)\, dW_t,\qquad X_0 \text{ given.}$$

The two ingredients are the drift coefficient $\mu$ and the diffusion coefficient $\sigma$. Different choices give very different processes:

| Name | Drift μ(x,t) | Diffusion σ(x,t) | Typical use |
|---|---|---|---|
| Geometric Brownian motion | μ x | σ x | Stock prices in Black–Scholes |
| Ornstein–Uhlenbeck | −θ(x − m) | σ | Mean-reverting interest rates, asset volatility |
| Cox–Ingersoll–Ross | κ(θ − x) | σ √x | Interest rate models that stay non-negative |
| Heston (variance process) | κ(θ − x) | ξ √x | Stochastic-volatility option pricing |
| Langevin equation | −∇U(x) | √(2/β) | Statistical physics, diffusion models in ML |

Itô's lemma applies uniformly to all of them. Once you can differentiate a function and substitute its drift and diffusion into the formula, you can transform any SDE into a new SDE for any smooth function of the state.
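
Because every row of the table differs only in its drift and diffusion functions, one Euler–Maruyama routine can serve them all. The sketch below is one possible way to organise that, not a reference implementation: the function name `euler_maruyama`, its signature, and the Ornstein–Uhlenbeck parameter values are assumptions for illustration.

```python
import numpy as np

def euler_maruyama(drift, diffusion, x0, T, N, n_paths=1, seed=0):
    """Simulate dX = drift(X, t) dt + diffusion(X, t) dW with Euler-Maruyama.
    Returns the time grid, shape (N+1,), and the paths, shape (n_paths, N+1)."""
    rng = np.random.default_rng(seed)
    dt = T / N
    t = np.linspace(0.0, T, N + 1)
    X = np.empty((n_paths, N + 1))
    X[:, 0] = x0
    for i in range(N):
        dW = np.sqrt(dt) * rng.standard_normal(n_paths)   # one increment per path
        X[:, i + 1] = X[:, i] + drift(X[:, i], t[i]) * dt + diffusion(X[:, i], t[i]) * dW
    return t, X

# Ornstein-Uhlenbeck row of the table: drift -theta (x - m), constant diffusion sigma
theta, m, sigma = 2.0, 0.05, 0.10        # illustrative values
t, X = euler_maruyama(lambda x, t: -theta * (x - m),
                      lambda x, t: sigma,
                      x0=0.20, T=5.0, N=1000, n_paths=5000)

print(f"mean of X_T : {X[:, -1].mean():.4f}   (long-run mean m = {m})")
print(f"std  of X_T : {X[:, -1].std():.4f}   (stationary sigma/sqrt(2 theta) = {sigma / np.sqrt(2 * theta):.4f})")
```

Swapping in `lambda x, t: mu * x` and `lambda x, t: sigma * x` would give the GBM row. Square-root diffusions such as CIR need extra care near zero, which this sketch does not handle.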

Important caveat. Itô calculus is one of two consistent stochastic calculi. The other, Stratonovich, drops the $\tfrac{1}{2}\sigma^2 f_{xx}$ term, so the ordinary chain rule holds, but interprets the integral differently. Finance and machine learning conventionally use Itô (left-endpoint, non-anticipating). Physics and engineering often use Stratonovich (consistent with the smooth limit). They are translatable, but mixing them silently is a classic bug.
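
The difference between the two conventions shows up in the simplest stochastic integral, $\int_0^T W\, dW$. In the sketch below (seed, `N`, and variable names are assumptions for illustration) the Itô sum evaluates the integrand at the left endpoint of each step, while the Stratonovich-type sum uses the average of the two endpoints.

```python
import numpy as np

rng = np.random.default_rng(5)      # arbitrary seed
T, N = 1.0, 100_000
dW = np.sqrt(T / N) * rng.standard_normal(N)
W = np.concatenate([[0.0], np.cumsum(dW)])

ito_sum = np.sum(W[:-1] * dW)                      # left endpoint (non-anticipating)
strat_sum = np.sum(0.5 * (W[:-1] + W[1:]) * dW)    # endpoint average (Stratonovich)

print(f"Ito sum          = {ito_sum:+.4f}   (W_T^2 - T)/2 = {0.5 * (W[-1]**2 - T):+.4f}")
print(f"Stratonovich sum = {strat_sum:+.4f}   W_T^2 / 2     = {0.5 * W[-1]**2:+.4f}")
print(f"difference       = {strat_sum - ito_sum:+.4f}   T / 2 = {0.5 * T:+.4f}")
```

The Stratonovich value matches the ordinary-calculus answer $W_T^2/2$, and the Itô value differs from it by $T/2$, which is the accumulated correction term.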


Where Itô's Lemma Lives

Itô's lemma is not a finance-only tool. It is the chain rule of every field that deals with continuous-time noise.

  1. Black–Scholes PDE. Apply Itô to an option price $V(S, t)$. The $dW$ term is hedgeable by holding $\partial V / \partial S$ shares of stock. Requiring the hedged portfolio to earn the risk-free rate gives the PDE, the subject of the next section.
  2. Interest-rate models. The short rate $r_t$ follows an SDE; bond prices are expectations of $e^{-\int r\, ds}$. Itô is how you derive the bond PDE.
  3. Filtering and Kalman–Bucy. Conditioning a signal on noisy observations gives an SDE for the conditional mean. Itô delivers the gain equation.
  4. Stochastic gradient Langevin dynamics. In Bayesian deep learning, parameters evolve via $d\theta = -\nabla U(\theta)\, dt + \sqrt{2/\beta}\, dW$. Itô calculus lets you analyse the long-run distribution.
  5. Diffusion generative models. The forward and reverse SDEs of score-based models are analysed with Itô calculus. The score term in the reverse drift comes out of the Fokker–Planck equation, which is itself derived with Itô's lemma.

The slogan. $(dW)^2 = dt$ is the smallest mathematical statement with the largest economic, scientific, and ML footprint of the twentieth century.

Summary

  • Brownian motion has nonzero quadratic variation: $\sum (\Delta W_i)^2 \to T$, almost surely.
  • Heuristic algebra of differentials: $(dW)^2 = dt$, $dt\, dW = 0$, $(dt)^2 = 0$.
  • Itô's lemma: for $f(X_t, t)$ with $dX = \mu\, dt + \sigma\, dW$, $df = (f_t + \mu f_x + \tfrac{1}{2}\sigma^2 f_{xx})\, dt + \sigma f_x\, dW$.
  • The extra term $\tfrac{1}{2}\sigma^2 f_{xx}$ is the Itô correction. It is what makes stochastic calculus different from ordinary calculus.
  • Applied to $f = \log S$ on geometric Brownian motion, Itô gives the closed-form lognormal solution that lets us simulate paths exactly in log-space.
  • The Euler–Maruyama scheme is the simplest numerical solver for an SDE: drift step plus $\sqrt{dt}$-scaled Gaussian noise.
  • Itô's lemma is the foundation of option pricing, interest-rate modelling, filtering, Langevin dynamics, and modern diffusion generative models.

Up next (Section 31.4): we will apply Itô's lemma to an option price $V(S, t)$, build a self-financing replicating portfolio, and derive the famous Black–Scholes partial differential equation.
