Learning Objectives
By the end of this section, you will be able to:
- Explain why the intuitive idea of "getting close to L" is not rigorous enough to build calculus on.
- State the ε–δ definition of a limit and identify each quantifier and inequality in plain language.
- Construct a δ that responds to a given ε for linear and quadratic functions, both visually and algebraically.
- Recognise when no δ can possibly work — the geometric signature of a failed limit.
- Connect the limiting ratio ε / δ to the derivative and to the gradient autograd computes in PyTorch.
The Problem with "Approaches"
"What does it mean for a quantity to approach a value? For two centuries, this question stalled the foundations of calculus."
In every previous section we have written things like and trusted that "as x gets close to 2, x² gets close to 4". That sentence is convincing, but it is not a definition — it is a description of feelings.
Three uncomfortable questions
- How close is close? If I demand that x² be within 0.0001 of 4, how close does x have to be to 2?
- What if someone disagrees with my answer? "Close" is subjective; we need a referee that anyone can check.
- For which functions does "close in x ⇒ close in f(x)" even hold? Some functions (think of a step) explode the moment you cross a boundary.
In the early 1800s Augustin-Louis Cauchy and Karl Weierstrass faced this head-on. They flipped the question upside-down: instead of letting the prover decide what "close" meant, they let any opponent name the tolerance. The prover then has to produce a matching guarantee. If the prover can win every round of this game — for arbitrarily small tolerances — the limit is real. Otherwise it isn't.
The ε–δ Game: A Conversation Between Two People
Picture two mathematicians arguing about whether :
Names a positive number . Means: "I don't believe you. Here is how close to L the output must come — within . Beat that."
Responds with a positive number . Means: "If you keep x within of c (and not equal to c), I promise f(x) will land inside your tolerance."
The limit equals L if and only if the Defender has a winning strategy — a recipe that produces a working for every positive the Challenger could possibly throw, no matter how small.
Why the Challenger goes first
If the Defender went first and chose , they would be tempted to hide behind a big window and call it close enough. By forcing the Challenger to pre-commit to , the definition rules out cheating: the Defender must actually shrink to fit any tolerance.
The Formal Definition
ε–δ definition of a limit
We say if and only if for every there exists a such that
In symbolic shorthand: .
Anatomy of the Definition
The definition is dense. Every symbol is doing serious work. Let's slow down and translate piece-by-piece.
| Symbol | Reads as | What it actually means |
|---|---|---|
| ∀ ε > 0 | for every positive ε | The Challenger is allowed to pick any tolerance, no matter how tiny. |
| ∃ δ > 0 | there exists a positive δ | The Defender must produce some response — usually a recipe δ(ε) that depends on ε. |
| 0 < |x − c| | x is not equal to c | We never require f(c) itself to behave. The limit is about behavior NEAR c, not AT c. |
| |x − c| < δ | x is within δ of c | x lives in the punctured input window (c − δ, c) ∪ (c, c + δ). |
| ⇒ | implies | Whenever the input condition holds, the output condition is forced to hold. |
| |f(x) − L| < ε | f(x) is within ε of L | f(x) lives in the output window (L − ε, L + ε) — the green band. |
The order of quantifiers is everything
∀ε ∃δ means the Defender hears ε before choosing δ — δ is allowed to depend on ε. Swapping to ∃δ ∀ε would mean a single δ has to handle every ε, which is impossible for any non-constant function. Many failed limit proofs collapse the moment you check the order.
Interactive: Watch ε Force δ
The picture is the easiest way to feel the definition. Drag the ε slider — that is the Challenger setting the height of the green band. Then drag δ to set the width of the violet band. The limit holds when the curve segment inside the violet band stays entirely inside the green band.
Try this experimental loop:
- Pick the linear function. Set ε = 1.0. Click Auto-fit best δ. Note δ = 0.5.
- Drop ε to 0.1. Auto-fit again. δ shrinks to 0.05 — exactly ε/2.
- Switch to . Observe that δ now depends on the curve's steepness near c = 2.
- Click Make δ too big — red dots appear: those are the witness points where the curve escapes the green band.
- Switch to the sinc function. Even though f(0) is undefined, the limit is still 1 — δ traps the surrounding values just fine.
Worked Example: lim (2x + 1) = 7
Let's do a complete ε–δ proof for . The strategy has two parts: scratch-work (private — figure out what δ must be) and the polished proof (public — present δ and verify).
📝 Step-by-step numerical walkthrough — try it yourself first
Step 1 — Set up the bound we must satisfy. The Challenger picks some ε > 0 (for instance 0.1). We need
Step 2 — Simplify the left side.
Step 3 — Solve for |x − 3|. Divide both sides by 2:
Step 4 — Read off δ. The condition we need on x is exactly |x − 3| < ε/2. So choose
Step 5 — Verify. Suppose 0 < |x − 3| < δ = ε/2. Then
Step 6 — Plug in numbers and feel it. If ε = 0.1, take δ = 0.05. Then x = 3.04 gives f(x) = 7.08 and |7.08 − 7| = 0.08 < 0.1. ✓ Try x = 2.97: f(x) = 6.94, |6.94 − 7| = 0.06 < 0.1. ✓
Notice that we never had to choose specific x values to win — δ = ε/2 handles all of them at once. That is what makes this a proof rather than a spot-check.
The pattern works for every linear function
For with m ≠ 0, the same algebra gives . Steeper lines (bigger |m|) force smaller δ — the input window has to shrink faster to compensate.
Play the Challenger–Defender Game
Below, you are the Challenger. Pick any ε you like — even unreasonably small. Watch the Defender produce a δ that always wins. The numerical witness table proves that every x in the input window really does land in the output window.
A Trickier Case: lim x² = 4
For curves, the Defender's job is harder because steepness changes from point to point. Take . We want
The factor is the one we control with δ. The factor wobbles as x moves. The standard trick: tame the wobble first, then choose δ.
- Tame the wobble. Restrict δ ≤ 1 once and for all. Then means , which gives , so .
- Bound the product. Under the restriction, . We need this to be less than ε, so we need .
- Combine the two requirements. Choose .
Why min?
We need both conditions: δ ≤ 1 (so the wobble is bounded) and δ ≤ ε/5 (so the product is small enough). The smaller of the two satisfies both at once.
For ε = 0.1: . Verification: if then . ✓
The Limit Cage in 3D
The two-dimensional picture has the function as a curve and the bands as strips. In three dimensions we can see the bands as slabs, and the place where the green slab and the violet slab intersect is a yellow box — the limit cage. The ε–δ definition becomes a single sentence: the curve segment for x in the input window must fit inside the cage.
Rotate to look down the x-axis: you see the green ε-strip head-on. Now look down the y-axis: the violet δ-strip becomes obvious. Look from above: the box footprint shows the actual region of the (x, y) plane that the definition cares about.
When the Limit Fails
The definition is also a powerful tool for proving a limit does not exist. Consider the sign function near 0:
Suppose someone claims . The Challenger picks ε = 0.5. Whatever δ the Defender names, the input window (−δ, 0) ∪ (0, δ) contains points where f = +1 and points where f = −1. Both are 1 unit away from 0 — far outside the 0.5 tolerance. No δ can ever work.
The geometry of failure
Whenever the function makes an unavoidable jump, twist, or oscillation that no shrinking input window can suppress, the limit fails. Visually this is red in the 2D visualizer — change the function or pick a small ε at a discontinuity and watch the green band turn rose.
Python: Verifying ε–δ Numerically
The ε–δ definition is a quantifier statement, but a computer can approximate the Defender's job. We pick a function, sample the input window densely, and check every sample against the ε bound. If we want the smallest workable δ, we shrink δ in a halving search until the check passes.
What you should see in the output
Every row prints a smaller ε and a correspondingly smaller δ — and the ratio ε/δ stays glued to ≈ 2.00 across nine orders of magnitude. That stable ratio is the derivative sneaking out of the ε–δ definition.
PyTorch: From ε–δ to Automatic Differentiation
The Python search measures how steep the function is by hand. PyTorch's autograd gives us that slope in one call to .backward(). The slope IS the limiting ratio ε/δ — so once we have it, we can predict δ for any ε without searching.
The bridge between ε–δ and machine learning
When you train a neural network with billions of parameters, every backward pass is asking, in effect, "If I nudge this parameter by an infinitesimal δ, how much does the loss change?" The answer is the gradient — the same that the ε–δ game makes precise. Without this definition, modern AI's mathematical foundations would still be hand-wave.
Why This Definition Changed Everything
The derivative is a limit of difference quotients. Once limits are made air-tight, every theorem of differentiation (chain rule, mean value theorem) becomes provable from first principles.
Convergence of sequences, continuity of functions, the completeness of the real numbers — all of modern analysis is built on top of the ε–δ language.
Every error bound in scientific computing — Newton's method, Runge–Kutta, finite differences — is an ε statement: "the answer is within ε if you push the algorithm far enough."
Universal approximation theorems, gradient convergence proofs, and the very concept of a continuous loss landscape all rest on the ε–δ definition of continuity and differentiability.
Common Pitfalls
Pitfall 1 — Choosing δ that depends on x
δ may depend on ε but not on the variable x. δ is a single number that must work uniformly for every x in the input window. Sneaking x into the formula for δ is a frequent (and silent) source of broken proofs.
Pitfall 2 — Forgetting 0 < |x − c|
The condition 0 < |x − c| is what excludes x = c from the discussion. If you drop the strict 0 <, you have implicitly required f(c) = L — and many limits (like the sinc function at 0) hold even though f(c) is undefined.
Pitfall 3 — Treating ε and δ as fixed numbers
ε is a stand-in for "any positive number, no matter how small". A proof that works only for ε = 0.1 is no proof at all. Always carry ε through the algebra; the symbol must survive to the end.
Pitfall 4 — Confusing the definition with the test
Sampling x values numerically and checking the bound is a great way to gain confidence, but a finite computer cannot verify ∀ε. Numerical checks falsify limits; only algebra proves them.
Summary
The ε–δ definition closed a 200-year-old gap in the foundations of calculus. It turned the vague notion of "approaching" into a precise game played between a Challenger and a Defender, and gave mathematicians the language they needed to prove every theorem about limits, continuity, derivatives, and integrals from scratch.
| Concept | What it means in the game |
|---|---|
| ε > 0 | The Challenger's tolerance — how close to L the output must be |
| δ > 0 | The Defender's response — how close to c the input must be |
| 0 < |x − c| < δ | x is in the punctured input window around c |
| |f(x) − L| < ε | f(x) is in the output window around L |
| ∀ε ∃δ | For every challenge there is a working response |
| Limit fails | Some ε exists for which no δ can ever work |
| ε / δ ratio | Local sensitivity → the derivative as δ → 0 |
Key Takeaways
- A limit is the formal answer to the question "how close can the output be made by squeezing the input?"
- The Challenger always moves first; δ may depend on ε but never on x.
- For linear functions works exactly. For curves, bound the wobble first, then take of the bounds.
- Limits care about behavior near c, not at c — that is why limits generalise function evaluation.
- The limiting ratio ε / δ converges to the derivative — the same number autograd computes in PyTorch.
Coming Next: Now that limits are rigorous, we can stop computing them from scratch every time. The next section develops the Limit Laws — algebraic shortcuts that let us combine known limits into new ones, all justified by the ε–δ definition you just learned.