Learning Objectives
By the end of this section, you will be able to:
- Explain why Monte Carlo pricing works at all by identifying the option price with an expectation under the risk-neutral measure.
- Simulate a Geometric Brownian Motion path by hand and recognise where the Itô correction comes from in the analytic terminal-price formula.
- State and use the Monte Carlo error formula $\mathrm{SE} = \sigma_X/\sqrt{N}$ (with $\sigma_X$ the standard deviation of one discounted payoff), including its consequence: 10× accuracy ⇒ 100× compute.
- Hand-trace a six-path Monte Carlo pricing of a European call and read off both the estimate and the 95% confidence interval.
- Implement a clean Python pricer and a PyTorch differentiable pricer that returns the price and all four Greeks from a single backward pass.
- Choose when Monte Carlo is worth the variance — path-dependent payoffs, high-dimensional baskets, and pricing under dynamics with no closed form.
The Big Picture: When Formulas Run Out
Section-05 gave us a closed-form Black-Scholes price for European calls and puts. So why bother simulating?
Because the closed form is a special case: a beautiful one, but a special case. The instant the payoff stops being a simple function of the terminal price $S_T$, the integral that defines the price loses its tidy form. Look at a few examples that real desks deal with every day:
| Option type | Payoff | Closed form? |
|---|---|---|
| European call/put | (S_T − K)₊ | Yes — Black-Scholes |
| Asian (avg-price) | (avg(S_t) − K)₊ | No — average of a lognormal is not lognormal |
| Barrier (knock-out) | (S_T − K)₊ · 𝟙{max S_t < B} | Only under constant vol; breaks down quickly otherwise |
| Basket on 5 assets | (Σ wᵢ Sᵢ − K)₊ | No — sum of lognormals is not lognormal |
| American put | max early exercise | No — free-boundary PDE |
| Path-dependent custom | anything | Almost never |
Every one of these has the same mathematical form:
$$V_0 = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}\big[\text{Payoff}\big]$$
Risk-neutral pricing is, at its heart, the statement that today's fair price is the expected discounted payoff under a special probability measure in which the stock's expected return is the risk-free rate. The only thing that changes between products is the payoff functional. And computing an expectation — when no clever integral substitution makes it pop out in closed form — is exactly what Monte Carlo was invented for.
The Core Idea: Pricing = Average Discounted Payoff
Start from the risk-neutral formula. For a European call,
$$C_0 = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}\big[(S_T - K)^+\big].$$
Suppose for a moment we have a magic machine that can, on demand, produce one independent draw $S_T^{(i)}$ from the risk-neutral distribution of $S_T$. We do not (yet) know the density; we just need samples from it.
The strong law of large numbers guarantees that
$$\frac{1}{N}\sum_{i=1}^{N} e^{-rT}\big(S_T^{(i)} - K\big)^+ \;\xrightarrow{\text{a.s.}}\; C_0 \quad \text{as } N \to \infty,$$
as long as the payoff has finite mean, which it does for any payoff bounded by a polynomial in $S_T$. Pricing reduces to three independent tasks:
- Sample from the risk-neutral law (the only place model assumptions live).
- Evaluate the payoff at each sample (problem-specific, but always a deterministic function).
- Average the discounted payoffs and report the mean with a standard error.
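The three tasks can be sketched as one small generic function in which only the sampler and the payoff change between products. This is a minimal illustration, not a production pricer: the names `mc_price`, `gbm_terminal` and `call_payoff` are hypothetical, and the lognormal sampler and the parameter values are assumptions borrowed from the Black-Scholes setting.

```python
import math
import random
import statistics

def mc_price(sample_terminal, payoff, r, T, n_paths, seed=0):
    """Sample, evaluate, average: returns (price estimate, standard error)."""
    rng = random.Random(seed)
    discount = math.exp(-r * T)
    # Tasks 1 and 2: draw a terminal price, then evaluate the discounted payoff on it.
    samples = [discount * payoff(sample_terminal(rng)) for _ in range(n_paths)]
    # Task 3: average, and report the standard error alongside the mean.
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n_paths)
    return mean, std_err

# Illustrative inputs (assumed here): a lognormal sampler and a call payoff.
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.25, 1.0

def gbm_terminal(rng):
    z = rng.gauss(0.0, 1.0)
    return S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)

def call_payoff(s_T):
    return max(s_T - K, 0.0)

price, std_err = mc_price(gbm_terminal, call_payoff, r, T, n_paths=50_000)
print(f"call estimate: {price:.3f} +/- {1.96 * std_err:.3f}")
```

Swapping `call_payoff` for another function of the terminal price (a put, a digital) reuses the same sampler unchanged, which is the separation of concerns the list describes.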
Simulating One Stock Path Under GBM
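Under Black-Scholes the stock follows geometric Brownian motion under $\mathbb{Q}$:
$$dS_t = r\,S_t\,dt + \sigma\,S_t\,dW_t.$$
Applying Itô's lemma to $\ln S_t$ gives the integrated solution
$$S_T = S_0 \exp\!\Big(\big(r - \tfrac{1}{2}\sigma^2\big)T + \sigma W_T\Big),$$
with $W_T \sim \mathcal{N}(0, T)$, so $W_T \overset{d}{=} \sqrt{T}\,Z$ for $Z \sim \mathcal{N}(0, 1)$. Substituting,
$$S_T = S_0 \exp\!\Big(\big(r - \tfrac{1}{2}\sigma^2\big)T + \sigma\sqrt{T}\,Z\Big).$$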
Three small things are worth pausing on:
- The drift is $r$, not the real-world expected return. That is the whole substance of risk-neutral pricing: the asset pretends to drift at the risk-free rate under $\mathbb{Q}$.
- The $-\tfrac{1}{2}\sigma^2$ correction is not optional. It is the Itô correction. Without it, $\mathbb{E}^{\mathbb{Q}}[S_T] \neq S_0 e^{rT}$ and discounted prices would fail to be martingales, and our entire pricing edifice would collapse.
- For path-dependent payoffs we cannot jump to expiry; we have to step. The Euler-Maruyama discretization is the go-to: $S_{t+\Delta t} = S_t\big(1 + r\,\Delta t + \sigma\sqrt{\Delta t}\,Z_t\big)$, with $Z_t \sim \mathcal{N}(0, 1)$ drawn independently at each step.
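The three points above can be checked numerically. The sketch below compares the average terminal price from the closed-form draw, from a deliberately wrong version with the Itô correction dropped, and from Euler-Maruyama stepping against the risk-neutral forward $S_0 e^{rT}$. Function names, parameter values, path counts and the 20-step grid are illustrative assumptions.

```python
import math
import random

def exact_terminal(S0, r, sigma, T, z):
    """Closed-form GBM terminal price, Ito correction included."""
    return S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)

def naive_terminal(S0, r, sigma, T, z):
    """Same formula with the -sigma^2/2 term dropped (deliberately wrong)."""
    return S0 * math.exp(r * T + sigma * math.sqrt(T) * z)

def euler_path(S0, r, sigma, T, n_steps, rng):
    """One Euler-Maruyama path of dS = r S dt + sigma S dW (n_steps + 1 points)."""
    dt = T / n_steps
    path = [S0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        path.append(path[-1] * (1.0 + r * dt + sigma * math.sqrt(dt) * z))
    return path

S0, r, sigma, T = 100.0, 0.05, 0.25, 1.0
rng = random.Random(42)
n_terminal, n_euler = 20_000, 10_000
draws = [rng.gauss(0.0, 1.0) for _ in range(n_terminal)]

forward = S0 * math.exp(r * T)  # the value E^Q[S_T] must equal
mean_exact = sum(exact_terminal(S0, r, sigma, T, z) for z in draws) / n_terminal
mean_naive = sum(naive_terminal(S0, r, sigma, T, z) for z in draws) / n_terminal
mean_euler = sum(euler_path(S0, r, sigma, T, 20, rng)[-1] for _ in range(n_euler)) / n_euler

print(f"forward    {forward:.2f}")
print(f"exact      {mean_exact:.2f}")
print(f"no Ito fix {mean_naive:.2f}")
print(f"Euler      {mean_euler:.2f}")
```

The uncorrected version should overshoot the forward by a factor of about $e^{\sigma^2 T/2}$, while the other two should agree with it up to sampling noise.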
Interactive: Watch Many Worlds Unfold
Each curve below is one possible future of the stock, a single draw from the risk-neutral distribution. Green paths end above the strike (they pay out $S_T - K$); red paths end below it (they pay out nothing). The Monte Carlo price is the sum of the green payoffs, discounted, divided by the total path count.
Each green/red curve is one simulated future of the stock. The right-edge histogram is the empirical distribution of the terminal price $S_T$. The call payoff $(S_T - K)^+$ is the green-shaded mass above $K$; the Monte Carlo price is just the average of that payoff over all paths, discounted back at rate $r$.
Push $\sigma$ up and watch the fan splay wider: winning paths finish further above the strike and losing paths further below it. Push $T$ up and the fan stretches in time. Compare the "MC" readout against "BS analytic" in the top-left corner; the gap is your Monte Carlo error.
The Monte Carlo Estimator and Its Standard Error
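Write $X_i = e^{-rT}\big(S_T^{(i)} - K\big)^+$ for the $i$-th discounted payoff. Each $X_i$ is an independent draw from the same distribution; call its true mean $\mu$ and its variance $\sigma_X^2$ (the subscript keeps it apart from the volatility $\sigma$). The MC estimator is the sample mean,
$$\hat{C}_N = \frac{1}{N}\sum_{i=1}^{N} X_i.$$
Two consequences from elementary probability:
- Unbiased: $\mathbb{E}[\hat{C}_N] = \mu$. There is no systematic error from finite sampling.
- Variance scales like 1/N: $\mathrm{Var}(\hat{C}_N) = \sigma_X^2 / N$. Hence the standard error $\mathrm{SE} = \sigma_X/\sqrt{N}$.
The Central Limit Theorem then gives us a confidence band:
$$\mathbb{P}\!\left(\big|\hat{C}_N - \mu\big| \le 1.96\,\frac{s}{\sqrt{N}}\right) \approx 0.95,$$
where $s$ is the sample standard deviation of the discounted payoffs. This single inequality is the rule that lets a trading desk sleep at night.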
The 1/√N tax. To cut the error by a factor of 10, you must increase $N$ by a factor of 100. To cut by 100, by 10,000. Monte Carlo never escapes this geometry: every variance-reduction technique we will meet is some clever way to shrink the payoff standard deviation $\sigma_X$ in the numerator of $\sigma_X/\sqrt{N}$ instead of fighting $\sqrt{N}$ in the denominator.
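The standard error, the confidence band and the 100× rule fit in a few lines. This is a sketch under assumptions: `mc_summary` and `paths_needed` are hypothetical helper names, and the payoff standard deviation of 20 used at the bottom is a made-up round number, not a value taken from the text.

```python
import math
import statistics

def mc_summary(discounted_payoffs, z_score=1.96):
    """Return (mean, standard error, 95% confidence interval) for a sample."""
    n = len(discounted_payoffs)
    mean = statistics.fmean(discounted_payoffs)
    s = statistics.stdev(discounted_payoffs)  # sample std, N - 1 denominator
    std_err = s / math.sqrt(n)
    return mean, std_err, (mean - z_score * std_err, mean + z_score * std_err)

def paths_needed(s, half_width, z_score=1.96):
    """Smallest N for which z * s / sqrt(N) is at most the target half-width."""
    return math.ceil((z_score * s / half_width) ** 2)

mean, std_err, ci = mc_summary([1.0, 2.0, 3.0, 4.0])
print(f"mean {mean:.2f}, SE {std_err:.4f}, CI ({ci[0]:.2f}, {ci[1]:.2f})")

# Assumed payoff standard deviation of 20: tightening the band 10x costs ~100x paths.
n_coarse = paths_needed(s=20.0, half_width=0.10)
n_fine = paths_needed(s=20.0, half_width=0.01)
print(n_coarse, n_fine, round(n_fine / n_coarse, 2))
```

Inverting the half-width formula this way is a common planning step: estimate $s$ from a small pilot run, then size the production run from the precision you need.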
Interactive: Convergence at Rate 1/√N
The plot below draws the running MC estimate after every new path (amber), the true Black-Scholes price (cyan dashed), and a 95% confidence band (light cyan ribbon) that widens or narrows as the empirical variance updates.
The amber curve is the Monte Carlo estimate after each new path. The dashed cyan line is the exact Black-Scholes value. The shaded band is the running ±1.96 standard errors. Notice how the band, like every honest Monte Carlo error, shrinks like $1/\sqrt{N}$: a 10× improvement in accuracy costs a 100× increase in sample size.
Three things to notice as you play:
- The amber curve is jagged at the start and gradually flattens. That is the law of large numbers in action.
- The cyan band shrinks roughly like $1/\sqrt{N}$. Doubling $N$ shrinks it by a factor of $\sqrt{2} \approx 1.41$, not 2.
- The estimate stays inside the band roughly 95% of the time. Move the seed slider; you can find runs where the estimate briefly drifts outside it — that is the 5% the CLT allows.
Worked Example (Try It By Hand)
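Let us price a one-year ATM European call with $S_0 = K = 100$, $r = 0.05$, $\sigma = 0.25$, $T = 1$ using only six paths. The closed-form Black-Scholes answer is $C_{\text{BS}} \approx 12.336$; we will see how close a tiny MC sample can get.
We need six standard-normal draws. Suppose our RNG hands us
$$Z = (-0.55,\; +1.21,\; +0.04,\; -1.83,\; +0.72,\; +1.55).$$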
Step 1 — Pre-compute the GBM constants
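With $r = 0.05$, $\sigma = 0.25$, $T = 1$:
- drift = $(r - \tfrac{1}{2}\sigma^2)\,T = (0.05 - 0.03125)\cdot 1 = 0.01875$
- diffusion = $\sigma\sqrt{T} = 0.25$
- discount = $e^{-rT} = e^{-0.05} \approx 0.95123$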
Step 2 — Map each Z to S_T
Use $S_T^{(i)} = 100\,\exp(0.01875 + 0.25\,Z_i)$.
| i | Zᵢ | exponent | S_T(i) | Payoff (S_T − 100)₊ |
|---|---|---|---|---|
| 1 | −0.55 | 0.01875 − 0.1375 = −0.11875 | 88.80 | 0.00 |
| 2 | +1.21 | 0.01875 + 0.3025 = 0.32125 | 137.89 | 37.89 |
| 3 | +0.04 | 0.01875 + 0.0100 = 0.02875 | 102.92 | 2.92 |
| 4 | −1.83 | 0.01875 − 0.4575 = −0.43875 | 64.48 | 0.00 |
| 5 | +0.72 | 0.01875 + 0.1800 = 0.19875 | 121.99 | 21.99 |
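| 6 | +1.55 | 0.01875 + 0.3875 = 0.40625 | 150.12 | 50.12 |
Step 3 — Average and discount
Sum of payoffs = 0 + 37.89 + 2.92 + 0 + 21.99 + 50.12 = 112.92.
Sample mean = $112.92 / 6 = 18.82$.
Discounted estimate = $0.95123 \times 18.82 \approx 17.90$.
Step 4 — Compute the 95% confidence interval
Discounted payoffs: $(0,\ 36.04,\ 2.78,\ 0,\ 20.92,\ 47.68)$.
Mean of those = 17.90 (matches above). Sample variance $s^2 \approx 418.9$ (using the $N - 1 = 5$ denominator), so $s \approx 20.47$.
Standard error = $s/\sqrt{N} = 20.47/\sqrt{6} \approx 8.36$.
95% CI ≈ $17.90 \pm 1.96 \times 8.36 = 17.90 \pm 16.38$, i.e. the interval roughly $[1.5,\ 34.3]$.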
The lesson. Our point estimate (17.90) is wildly off the true value (12.336), but the 95% CI does contain the true price. With only six samples that is the best honest summary we can give. Push $N$ to 100,000 and the standard error drops to about 0.06 (a 95% half-width of roughly 0.11); the estimate becomes a precision measurement.
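The hand trace can be replayed in a few lines of Python as a cross-check. The parameters $S_0 = K = 100$, $r = 0.05$, $\sigma = 0.25$, $T = 1$ are the ones implied by the drift and diffusion constants in Step 1; the variable names are illustrative.

```python
import math
import statistics

S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.25, 1.0
zs = [-0.55, 1.21, 0.04, -1.83, 0.72, 1.55]  # the six draws from the table

drift = (r - 0.5 * sigma**2) * T      # Step 1 constants
diffusion = sigma * math.sqrt(T)
discount = math.exp(-r * T)

terminal = [S0 * math.exp(drift + diffusion * z) for z in zs]   # Step 2
discounted = [discount * max(s - K, 0.0) for s in terminal]

estimate = statistics.fmean(discounted)                          # Step 3
std_err = statistics.stdev(discounted) / math.sqrt(len(zs))      # Step 4
ci = (estimate - 1.96 * std_err, estimate + 1.96 * std_err)

for z, s in zip(zs, terminal):
    print(f"Z = {z:+.2f}  S_T = {s:7.2f}  payoff = {max(s - K, 0.0):6.2f}")
print(f"estimate {estimate:.2f}, SE {std_err:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f})")
```

Small differences from the hand-worked digits are expected, because the script carries full floating-point precision instead of rounding each intermediate value to two decimals.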
Variance Reduction: Same Accuracy, Fewer Paths
The 1/√N rate is unbreakable, but the constant in front is up for grabs. Every variance-reduction technique attacks the same target: shrink the variance of one payoff, and you need fewer paths for the same precision. Three techniques cover ~90% of real-world MC work.
1. Antithetic variates
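For every standard normal $Z$ you draw, also evaluate the payoff at its mirror $-Z$, and average the pair. Because $-Z \sim \mathcal{N}(0, 1)$ too, each pair average is an unbiased sample of the discounted payoff $X$. But the two members of a pair are negatively correlated: high on one comes with low on the other. Writing $X^+$ and $X^-$ for the payoffs at $Z$ and $-Z$, and $\sigma_X^2$ for the variance of a single discounted payoff, the variance of their average is
$$\mathrm{Var}\!\left(\frac{X^+ + X^-}{2}\right) = \frac{\sigma_X^2}{2}\,(1 + \rho),$$
with $\rho = \mathrm{Corr}(X^+, X^-) < 0$, strictly less than the $\sigma_X^2/2$ you would get from averaging two independent samples. For at-the-money European calls $\rho$ is roughly $-0.45$, giving a variance reduction of about 1.8× (close to halving) for the same compute.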
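A small experiment makes the comparison concrete. The sketch below builds both estimators at equal cost (two payoff evaluations per sample) and reports the ratio of their variances. The function names and the ATM parameter set are assumptions chosen for illustration.

```python
import math
import random
import statistics

def discounted_call(z, S0, K, r, sigma, T):
    """Discounted call payoff for one standard-normal draw z."""
    s_T = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
    return math.exp(-r * T) * max(s_T - K, 0.0)

def compare_plain_antithetic(S0, K, r, sigma, T, n_pairs, seed=0):
    rng = random.Random(seed)
    plain, anti = [], []
    for _ in range(n_pairs):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = discounted_call(z1, S0, K, r, sigma, T)
        # Plain: average of two independent draws.
        plain.append(0.5 * (x1 + discounted_call(z2, S0, K, r, sigma, T)))
        # Antithetic: average of a draw and its mirror image.
        anti.append(0.5 * (x1 + discounted_call(-z1, S0, K, r, sigma, T)))
    return {
        "plain_mean": statistics.fmean(plain),
        "anti_mean": statistics.fmean(anti),
        "variance_ratio": statistics.variance(plain) / statistics.variance(anti),
    }

result = compare_plain_antithetic(100.0, 100.0, 0.05, 0.25, 1.0, n_pairs=20_000)
print(result)
```

The `variance_ratio` entry estimates $1/(1 + \rho)$; rerunning with a far out-of-the-money strike should show the ratio drifting towards 1 as the negative correlation fades.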
2. Control variates
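Suppose $Y$ is some related payoff whose expectation $\mathbb{E}[Y]$ we know in closed form. Then, for any fixed coefficient $\beta$, the unbiased estimator
$$\hat{C}_{\text{cv}} = \frac{1}{N}\sum_{i=1}^{N}\Big(X_i - \beta\,\big(Y_i - \mathbb{E}[Y]\big)\Big)$$
has variance $\sigma_X^2\,(1 - \rho_{XY}^2)/N$ at the optimal $\beta^* = \mathrm{Cov}(X, Y)/\mathrm{Var}(Y)$, where $X_i$ is the $i$-th discounted payoff, $\sigma_X^2$ its variance, and $\rho_{XY} = \mathrm{Corr}(X, Y)$. A common choice for option pricing: use the underlying itself as the control, $Y = S_T$, since we know $\mathbb{E}^{\mathbb{Q}}[S_T] = S_0 e^{rT}$. Variance reductions of 10× or more are routine when the control is highly correlated with the payoff.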
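As an illustration, the sketch below uses the terminal price as control for a European call. One assumption to flag: $\beta$ is estimated from the same sample that is being averaged, which is common practice but introduces a small bias of order $1/N$; a strictly unbiased version would fix $\beta$ from a separate pilot run. Function name and parameters are hypothetical.

```python
import math
import random
import statistics

def control_variate_call(S0, K, r, sigma, T, n_paths, seed=0):
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * T
    vol = sigma * math.sqrt(T)
    discount = math.exp(-r * T)
    xs, ys = [], []
    for _ in range(n_paths):
        s_T = S0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        xs.append(discount * max(s_T - K, 0.0))  # payoff we want to price
        ys.append(s_T)                           # control with known mean
    known_mean_y = S0 * math.exp(r * T)          # E^Q[S_T]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n_paths - 1)
    beta = cov_xy / statistics.variance(ys)      # sample estimate of beta*
    adjusted = [x - beta * (y - known_mean_y) for x, y in zip(xs, ys)]
    return {
        "plain_mean": mean_x,
        "cv_mean": statistics.fmean(adjusted),
        "beta": beta,
        "variance_ratio": statistics.variance(xs) / statistics.variance(adjusted),
    }

cv = control_variate_call(100.0, 100.0, 0.05, 0.25, 1.0, n_paths=20_000)
print(cv)
```

For an at-the-money strike the underlying is only a moderately good control, so expect a several-fold reduction here; controls that track the payoff more closely are what deliver the larger gains.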
3. Importance sampling
Deep out-of-the-money options have many paths returning zero — wasted samples. Importance sampling reweights the distribution to oversample the in-the-money region, then corrects with a likelihood ratio. Spectacular gains (1000×+) on rare-event problems; requires care to keep the estimator unbiased.
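One simple way to realise this for a deep OTM call is to shift the mean of the normal draw and multiply each payoff by the density ratio $\varphi(z)/\varphi(z - \theta) = \exp(-\theta z + \tfrac{1}{2}\theta^2)$. The sketch below is an assumption-laden illustration: the shift $\theta$ is picked by the heuristic "centre the shifted paths on the strike", the strike of 200 is a made-up deep OTM example, and the function names are hypothetical.

```python
import math
import random
import statistics

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, r, sigma, T):
    """Black-Scholes call price, used only as a reference value."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def importance_sampled_call(S0, K, r, sigma, T, n_paths, seed=0):
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * T
    vol = sigma * math.sqrt(T)
    discount = math.exp(-r * T)
    theta = (math.log(K / S0) - drift) / vol  # heuristic shift: median path hits K
    plain, weighted = [], []
    for _ in range(n_paths):
        eps = rng.gauss(0.0, 1.0)
        # Plain estimator: Z = eps, no weight.
        plain.append(discount * max(S0 * math.exp(drift + vol * eps) - K, 0.0))
        # Importance sampling: Z = eps + theta, corrected by the likelihood ratio.
        z = eps + theta
        weight = math.exp(-theta * z + 0.5 * theta**2)
        weighted.append(discount * max(S0 * math.exp(drift + vol * z) - K, 0.0) * weight)
    return {
        "plain_mean": statistics.fmean(plain),
        "is_mean": statistics.fmean(weighted),
        "variance_ratio": statistics.variance(plain) / statistics.variance(weighted),
    }

reference = bs_call(100.0, 200.0, 0.05, 0.25, 1.0)
out = importance_sampled_call(100.0, 200.0, 0.05, 0.25, 1.0, n_paths=40_000)
print(f"reference {reference:.5f}", out)
```

The likelihood-ratio weight is what keeps the estimator unbiased; dropping it would price the option under the wrong (shifted) distribution.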
Interactive: Antithetic vs Plain MC
Both lines below see the same stream of random numbers; the only difference is that the purple estimator also evaluates the payoff at $-Z$ and averages the pair. Watch how it converges to the analytic price along a noticeably tighter trajectory.
Both estimators see exactly the same random numbers; antithetic MC simply also evaluates the payoff at $-Z$ and averages the pair. The ratio of standard errors at the top, squared, tells you how many fewer paths you would need for the same precision; for at-the-money European calls the ratio is typically around $\sqrt{2} \approx 1.4$, i.e. variance roughly cut in half for the same compute budget.
The "ratio" readout shows $\mathrm{SE}_{\text{plain}} / \mathrm{SE}_{\text{anti}}$. At ATM you should see something close to 1.3–1.4, equivalent to squeezing nearly 2× as many independent paths out of the same compute budget. Push $K$ far OTM (say $K = 150$) and the gain shrinks; the negative correlation weakens because both members of every pair tend to be zero.
Plain Python: A Vanilla Monte Carlo Pricer
Before we let PyTorch do the heavy lifting, we build the same pricer in plain Python so every operation is visible. No numpy, no autograd: just a for-loop and standard-normal draws.
Running this on the worked-example scenario with 200,000 paths gives an estimate within ~5 cents of the Black-Scholes truth — better than the bid-ask of any liquid option. The pricer is exactly 17 lines of computation, no closed forms, no integrals. That is the Monte Carlo bargain: trade analytic effort for compute.
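A plain-Python pricer in the spirit described here might look like the following sketch. The function name `mc_european_call`, its signature, the seed and the use of running sums (rather than storing every payoff) are assumptions made for illustration; the parameters are the worked-example scenario.

```python
import math
import random

def mc_european_call(S0, K, r, sigma, T, n_paths, seed=0):
    """Vanilla Monte Carlo call pricer: returns (price estimate, standard error)."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * T   # Ito-corrected drift over [0, T]
    vol = sigma * math.sqrt(T)
    discount = math.exp(-r * T)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)                   # one standard-normal draw
        s_T = S0 * math.exp(drift + vol * z)      # terminal price in one jump
        x = discount * max(s_T - K, 0.0)          # discounted payoff
        total += x
        total_sq += x * x
    mean = total / n_paths
    sample_var = (total_sq - n_paths * mean * mean) / (n_paths - 1)
    return mean, math.sqrt(sample_var / n_paths)

price, std_err = mc_european_call(100.0, 100.0, 0.05, 0.25, 1.0, n_paths=200_000)
print(f"MC price {price:.3f} +/- {1.96 * std_err:.3f} (95% CI half-width)")
```

Accumulating the sum and the sum of squares keeps memory use constant in the number of paths, at the cost of slightly less numerically robust variance arithmetic than a two-pass computation.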
PyTorch: Pathwise Greeks via Autograd
The plain-Python loop gives us the price. To trade an option we also need its sensitivities — the Greeks. Two paths from here:
- Bumping (finite differences): run the pricer again with one input shifted (for delta, $S_0 \to S_0 + h$), subtract, divide by $h$. One full MC repricing per Greek. Five inputs ⇒ five reruns. Noisy because you are subtracting two noisy numbers.
- Pathwise via autograd: the chain rule, applied inside every Monte Carlo path, then averaged. One backward pass yields all Greeks, exactly as variance-reduced as the price itself.
Because the closed-form GBM formula is smooth in every input, and the call payoff is continuous with a single kink at $S_T = K$ (an event of probability zero), the pathwise gradient estimator is unbiased for first-order Greeks. PyTorch's autograd computes it for us essentially for free.
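To see what the pathwise estimator computes on each path, the derivatives can be written out by hand for delta, vega and rho. The sketch below does this in plain Python, without autograd, and is meant only as an illustration of the chain rule: the per-path formulas follow from differentiating $e^{-rT}(S_T - K)^+$ with $S_T = S_0 \exp((r - \tfrac{1}{2}\sigma^2)T + \sigma\sqrt{T}Z)$, and the function names and parameters are assumptions.

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_greeks(S0, K, r, sigma, T):
    """Closed-form reference values for delta, vega and rho of a call."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    pdf_d1 = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return {
        "delta": norm_cdf(d1),
        "vega": S0 * pdf_d1 * math.sqrt(T),
        "rho": K * T * math.exp(-r * T) * norm_cdf(d2),
    }

def pathwise_call_greeks(S0, K, r, sigma, T, n_paths, seed=0):
    rng = random.Random(seed)
    sqrt_T = math.sqrt(T)
    discount = math.exp(-r * T)
    sums = {"price": 0.0, "delta": 0.0, "vega": 0.0, "rho": 0.0}
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_T = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * sqrt_T * z)
        if s_T > K:  # the payoff and all its derivatives vanish below the strike
            sums["price"] += discount * (s_T - K)
            sums["delta"] += discount * s_T / S0                      # dS_T/dS0 = S_T/S0
            sums["vega"] += discount * s_T * (sqrt_T * z - sigma * T)  # dS_T/dsigma
            sums["rho"] += discount * K * T                            # discount and drift combined
    return {name: total / n_paths for name, total in sums.items()}

mc = pathwise_call_greeks(100.0, 100.0, 0.05, 0.25, 1.0, n_paths=50_000)
ref = bs_call_greeks(100.0, 100.0, 0.05, 0.25, 1.0)
for name in ("delta", "vega", "rho"):
    print(f"{name:5s}  pathwise {mc[name]:8.4f}   closed form {ref[name]:8.4f}")
```

An autograd framework arrives at the same per-path quantities mechanically, which is why a single backward pass through the averaged payoff is enough to obtain every first-order sensitivity.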
Compare the output to the closed-form Greeks from section-06:
| Greek | Pathwise MC | BS closed form |
|---|---|---|
| Price | 12.34 | 12.336 |
| Delta = N(d₁) | ≈ 0.627 | 0.6274 |
| Vega = S φ(d₁) √T | ≈ 37.8 | 37.842 |
| Theta | ≈ −7.3 | −7.250 |
| Rho | ≈ 50.4 | 50.405 |
Where Monte Carlo Earns Its Keep
Monte Carlo is the second-choice tool for vanilla European options — the closed form is faster and noise-free. Where MC dominates is any setting where the closed form does not exist:
| Problem | Why MC wins |
|---|---|
| Asian options | Average of a lognormal is not lognormal — no closed form. Discretize the path, average over draws. |
| Basket options on ≥3 assets | Sum of correlated lognormals is not lognormal. Sample correlated normals via Cholesky, then average. |
| American-style with regression (Longstaff-Schwartz) | Backward induction on regressed continuation values; the underlying paths are MC. |
| Counterparty credit (CVA / xVA) | Joint simulation of all market factors over the life of every trade — millions of dimensions. Only MC scales. |
| Insurance / annuity guarantees | Equity-linked products with policyholder behaviour models too complex for any PDE solver. |
| Model calibration with stochastic vol (Heston, SABR) | Even when the model has a quasi-closed form for vanillas, MC is the universal validator for exotics priced under the same model. |
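The first row of the table can be sketched in a few lines: step the path, average the monitored prices, apply the payoff. This is an illustrative assumption-heavy example, not a desk implementation: monthly monitoring (12 steps), the parameter values and the function name `asian_call_mc` are all hypothetical, and each step uses the exact lognormal transition so that the only error is sampling noise.

```python
import math
import random

def asian_call_mc(S0, K, r, sigma, T, n_steps, n_paths, seed=0):
    """Arithmetic-average Asian call: returns (price estimate, standard error)."""
    rng = random.Random(seed)
    dt = T / n_steps
    step_drift = (r - 0.5 * sigma**2) * dt
    step_vol = sigma * math.sqrt(dt)
    discount = math.exp(-r * T)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        s = S0
        running_sum = 0.0
        for _ in range(n_steps):
            s *= math.exp(step_drift + step_vol * rng.gauss(0.0, 1.0))  # exact GBM step
            running_sum += s
        average = running_sum / n_steps            # arithmetic average of monitored prices
        x = discount * max(average - K, 0.0)
        total += x
        total_sq += x * x
    mean = total / n_paths
    sample_var = (total_sq - n_paths * mean * mean) / (n_paths - 1)
    return mean, math.sqrt(sample_var / n_paths)

asian_price, asian_se = asian_call_mc(100.0, 100.0, 0.05, 0.25, 1.0,
                                      n_steps=12, n_paths=20_000)
print(f"Asian call estimate {asian_price:.3f} +/- {1.96 * asian_se:.3f}")
```

Because averaging damps the volatility of the quantity being optioned, the estimate should come out well below the European price on the same parameters, which is a quick sanity check on any Asian pricer.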
Summary
- Risk-neutral pricing identifies an option's fair value with the discounted expected payoff under $\mathbb{Q}$. Monte Carlo computes that expectation by sampling.
- For European payoffs under GBM, one normal draw $Z$ yields one sample of $S_T$ via the closed-form solution $S_T = S_0 \exp\big((r - \tfrac{1}{2}\sigma^2)T + \sigma\sqrt{T}\,Z\big)$.
- The estimator is unbiased, consistent, and has standard error $\sigma_X/\sqrt{N}$, where $\sigma_X$ is the standard deviation of one discounted payoff. The 1/√N rate is the unavoidable tax.
- Variance reduction attacks the numerator $\sigma_X$. Antithetic variates roughly halve variance for ATM options at no extra compute cost; control variates push that to 10× or more; importance sampling rescues deep-OTM rare events.
- Pathwise Greeks via autograd deliver every sensitivity from one backward pass, on the same paths and with the same variance reduction as the price itself, replacing five separate bumps with one differentiable forward pass.
- Monte Carlo's niche is high dimensions, path dependence, and any setting where no closed form exists. Together with the PDE perspective from section-04, it forms the full numerical toolbox for option pricing.