Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section you will be able to:

Derive Euler's update rule directly from the definition of the derivative, without memorization.
Visualize the method as walking along tangent lines of the slope field.
Execute the algorithm by hand on a small initial-value problem and check the answer against the exact solution.
Distinguish local truncation error $O(h^2)$ from global error $O(h)$ and explain why one is bigger than the other.
Empirically verify the first-order convergence rate using a log-log plot.
Implement Euler's method in plain Python and as a vectorized PyTorch routine that supports batching and autograd.
Recognize the stability barrier that limits how big $h$ can be on stiff problems.

Why We Need Numerical Methods

In Section 20.3 we solved differential equations by separation of variables. That trick is wonderful when it works — but it works rarely. Most equations that describe real physical systems have no elementary closed-form solution at all. Try to analytically solve

\displaystyle \frac{dy}{dt} = \sin(t \, y) + t^2,\qquad y(0) = 1

and you will quickly discover there is no antiderivative we can write down with familiar functions. Yet a physicist still needs to know what $y(2.5)$ is. The way out is to give up on a formula and ask only for numbers: a table of $(t_i, y_i)$ pairs that approximate the true curve as closely as we want.

The numerical pivot. Instead of asking "what is the function y(t)?" we ask "starting from y(0), where will y be a tiny step later, and a tiny step after that, and after that…?"

Euler's method, published by Leonhard Euler in 1768 in his Institutionum Calculi Integralis, is the oldest and simplest answer to that question. Every modern ODE solver — from Runge–Kutta to the implicit BDF integrators that underlie scientific Python — can be understood as a smarter cousin of Euler's idea. Once you grasp Euler, the rest of the family becomes natural.

The Core Idea: Walk Along the Tangent

Imagine you are standing on a hillside in dense fog. You cannot see the valley, but the differential equation acts like a compass that tells you, at every spot you stand on, the exact direction of steepest descent. What do you do? You take one careful step in that direction, look at the compass again, take another step, and so on. Each step is a straight line, but stitched together they trace out a path that closely follows the true downhill curve.

That is precisely Euler's method. The differential equation $dy/dt = f(t, y)$ gives the slope of the unknown solution at every point, even though we do not know the solution itself. We choose a small step size $h$ and we walk along the tangent for that length, recompute the slope at the new point, walk again, recompute, repeat.

The one sentence summary

At each step, replace the curved true solution by its tangent line and take a step of length

h

along it.

Deriving the Update Rule

Start from the very definition of the derivative:

\displaystyle \frac{dy}{dt}\Big|_{t=t_n} \;=\; \lim_{h \to 0} \frac{y(t_n + h) - y(t_n)}{h}

For a small but nonzero $h$ we drop the limit and accept an approximation:

\displaystyle \frac{y(t_n + h) - y(t_n)}{h} \;\approx\; \frac{dy}{dt}\Big|_{t=t_n} \;=\; f(t_n, y(t_n))

Multiply through by $h$ and rearrange:

\displaystyle y(t_n + h) \;\approx\; y(t_n) + h\, f(t_n, y(t_n))

Now let $y_n$ stand for our running approximation to $y(t_n)$ and let $t_{n+1} = t_n + h$ . Replacing the unknown true values by our approximations we obtain Euler's rule:

Euler's update

\displaystyle y_{n+1} \;=\; y_{n} \;+\; h \cdot f(t_n,\, y_n)

Repeated indefinitely from the initial condition $y_0 = y(t_0)$ , this produces the entire approximate trajectory.

It is worth pausing on what just happened. We replaced an infinite limit — the derivative — by a finite difference. That single substitution is the bridge between calculus and computation. Every numerical ODE method does some version of it; what differs is how cleverly each method recovers the curvature that this naive truncation throws away.

We can also reach the same formula from the first-order Taylor expansion:

\displaystyle y(t_n + h) \;=\; y(t_n) + h\, y'(t_n) + \tfrac{1}{2}h^2 y''(\xi)

Euler keeps the linear term $h\,y'(t_n)$ and discards everything from $h^2$ onward. That discarded tail is the local error per step — we will quantify it in a moment.

Geometric Picture

Recall the slope field from Section 20.2: at every point $(t, y)$ we draw a tiny arrow whose direction is the slope $f(t, y)$ . The true solution through $(t_0, y_0)$ is the curve tangent to every arrow it passes through.

Euler's method is the polygonal-line approximation to that curve: instead of following the field continuously, we read the arrow once, stride $h$ units along it, read the arrow at our new location, stride again, and so on. The result is a piecewise-linear path that secretly lives entirely on tangent lines of the true solution family — but never quite on the true curve we want.

What goes wrong, in one picture

After taking a tangent step, the next slope is read from our approximate point, not the true one. The error from the first step gets fed into the slope used for the second step, the second step's error into the third, and so on. This compounding is why local error

O(h^2)

accumulates into global error

O(h)

The Algorithm in Plain Steps

Inputs. The slope function $f(t, y)$ , the initial point $(t_0, y_0)$ , the stopping time $t_{\text{end}}$ , and a step size $h > 0$ .
Setup. Set the step count $N = \lceil (t_{\text{end}} - t_0)/h \rceil$ , set $t \leftarrow t_0$ , $y \leftarrow y_0$ , and record $(t, y)$ .
Repeat $N$ times.
- Evaluate the slope at the current point: $k = f(t, y)$ .
- Take a tangent step: $y \leftarrow y + h \cdot k$ .
- Advance time: $t \leftarrow t + h$ .
- Record the new $(t, y)$ .
Output. The list of pairs $(t_0, y_0), (t_1, y_1), \dots, (t_N, y_N)$ which discretely approximate the solution curve.

Notation we will use throughout

y_n

is the Euler estimate at step

n

;

y(t_n)

is the (usually unknown) true value at the same time. The whole point of the error analysis below is to bound

|y_n - y(t_n)|

Interactive Explorer

The visualization below lets you take Euler steps yourself on six different equations. The green curve is the true solution; the orange polygon is Euler's approximation. Pull the step size slider toward zero and watch the orange path snap onto the green one. Press Step to take a single Euler step at a time — the dashed red line is the tangent actually used to take the next step.

Loading Euler explorer...

Things to try

Choose Logistic, set $h = 0.9$ and watch Euler overshoot the carrying capacity before relaxing back. This is a tiny taste of numerical instability.
On Exponential growth, halve $h$ and observe the final error roughly halving. That is the empirical signature of a first-order method.
On dy/dt = sin(t) - 0.3 y, no closed form exists — but a high-resolution reference is plotted as the green curve. Make $h$ tiny and you reproduce it.

Worked Example (by hand)

We pick an equation whose true solution we happen to know, so we can score each Euler step directly.

\displaystyle \frac{dy}{dt} = y - t^2 + 1, \qquad y(0) = 0.5, \qquad h = 0.5

Its analytic solution (verifiable by direct substitution) is $y(t) = (t + 1)^2 - \tfrac{1}{2} e^t$ . We will compute four Euler steps to reach $t = 2$ and compare against the truth.

Click to expand the full hand calculation

Step 1. Start at $(t_0, y_0) = (0,\, 0.5)$ .

Slope: $f(0, 0.5) = 0.5 - 0^2 + 1 = 1.5$ .
Update: $y_1 = 0.5 + 0.5 \cdot 1.5 = 1.25$ .
New time: $t_1 = 0.5$ .
True value: $y(0.5) = (1.5)^2 - 0.5 e^{0.5} \approx 1.4256$ . Euler error: $\approx 0.176$ .

Step 2. Now at $(0.5,\, 1.25)$ .

Slope: $f(0.5, 1.25) = 1.25 - 0.25 + 1 = 2.0$ .
Update: $y_2 = 1.25 + 0.5 \cdot 2.0 = 2.25$ .
True value: $y(1) \approx 2.6409$ . Euler error: $\approx 0.391$ .

Step 3. Now at $(1.0,\, 2.25)$ .

Slope: $f(1.0, 2.25) = 2.25 - 1.0 + 1 = 2.25$ .
Update: $y_3 = 2.25 + 0.5 \cdot 2.25 = 3.375$ .
True value: $y(1.5) \approx 4.0092$ . Euler error: $\approx 0.634$ .

Step 4. Now at $(1.5,\, 3.375)$ .

Slope: $f(1.5, 3.375) = 3.375 - 2.25 + 1 = 2.125$ .
Update: $y_4 = 3.375 + 0.5 \cdot 2.125 = 4.4375$ .
True value: $y(2) \approx 5.3055$ . Euler error: $\approx 0.868$ .

Observation. The errors $0.176, 0.391, 0.634, 0.868$ grow as we march forward. That growth is the global error compounding — exactly as theory predicts. If we cut $h$ from $0.5$ to $0.1$ and re-run, the final error drops from $\approx 0.868$ to $\approx 0.242$ , a factor of $\approx 3.6$ for a 5× reduction in $h$ — broadly consistent with the linear rate, give or take a constant that depends on $y''$ .

Here is a summary of the trace, computed and verified in Python:

step	t	Euler y	slope	true y	\|error\|
0	0.0	0.5000	1.500	0.5000	0.0000
1	0.5	1.2500	2.000	1.4256	0.1756
2	1.0	2.2500	2.250	2.6409	0.3909
3	1.5	3.3750	2.125	4.0092	0.6342
4	2.0	4.4375	—	5.3055	0.8680

How Wrong Is It? Local and Global Error

Two different errors are at play and it is essential to keep them apart.

Local truncation error

Suppose by some miracle we are exactly on the true curve at time $t_n$ . How wrong is a single Euler step? From the Taylor expansion we already wrote,

\displaystyle y(t_n + h) - \bigl[y(t_n) + h\, f(t_n, y(t_n))\bigr] \;=\; \tfrac{1}{2} h^2 y''(\xi)

So per step, the error is $O(h^2)$ . Halve $h$ and a single step becomes four times more accurate.

Global error

But we take roughly $N = (t_{\text{end}} - t_0)/h$ steps, and each step injects its own error into the input of the next step. A careful argument (using the Lipschitz continuity of $f$ ) shows that all those $O(h^2)$ contributions add up to an accumulated error

\displaystyle \max_{n} \bigl|y_n - y(t_n)\bigr| \;\le\; C \cdot h

where the constant $C$ depends on the problem (specifically, on the size of $y''$ and on the Lipschitz constant of $f$ ). This is why we say Euler is a first-order method: global error scales with the first power of $h$ .

quantity	scaling	meaning
Local truncation error per step	O(h²)	How much one step deviates from the truth
Global error after N steps	O(h)	How much the whole trajectory has drifted
Cost (function evaluations)	O(1/h)	More accuracy ⇒ proportionally more work

The grim trade-off

To shrink Euler's global error by a factor of 10, you need 10× more steps. That is why Euler is rarely used in production: methods like RK4 give 10,000× less error for the same cost. But every fancier method is just Euler with a better slope estimate.

Convergence Demo: Order of Accuracy

The convergence claim above is testable. The widget below solves the chosen IVP with successively halved step sizes and plots $\log|\text{error}|$ against $\log h$ . A first-order method must land on a line of slope 1.

Loading convergence explorer...

The empirically fitted slope (green) sits right next to the reference slope-1 line (purple dashed) for every problem. This is experimental confirmation of the theory: Euler is first-order, no more and no less.

Stability: When Small Steps Are Still Not Small Enough

Accuracy is not the only concern. Some equations punish big $h$ not by being slightly wrong, but by producing solutions that explode to infinity even though the true solution is calm. The cleanest example is the linear decay equation

\displaystyle \frac{dy}{dt} = -\lambda y, \qquad \lambda > 0

whose true solution $y(t) = y_0 e^{-\lambda t}$ decays smoothly to zero. Applying Euler:

\displaystyle y_{n+1} = y_n + h(-\lambda y_n) = (1 - \lambda h)\, y_n

So $y_n = (1 - \lambda h)^n y_0$ . For the Euler solution to also decay, we need $|1 - \lambda h| < 1$ , which requires

\displaystyle h < \frac{2}{\lambda}

Push $h$ past this barrier and Euler oscillates with growing amplitude — a complete failure mode unrelated to accuracy. Equations with very large $\lambda$ (or, more generally, with very fast and very slow dynamics mixed together) are called stiff, and explicit Euler is a poor tool for them. Implicit methods (Section 20.x and beyond) cure this.

Mental rule of thumb

Stability sets an upper bound on

h

before the method even gets a chance to be accurate.

Python Implementation

Below is the cleanest possible Python implementation, with every line annotated. Read the explanations to the right — they explain not just what each line does but why the algorithm needs it.

Plain Euler in 25 lines

🐍euler.py

Explanation(11)

Code(38)

5Function signature

We pass in the slope function f(t, y), the starting point (t0, y0), the stopping time t_end, and the step size h. Everything Euler needs to march forward.

13Number of steps

Total walk length is (t_end - t0). Each step is h units long, so we need (t_end - t0)/h steps. round avoids floating-point off-by-one when h doesn't divide the interval cleanly.

EXAMPLE

t0=0, t_end=1, h=0.25 → n_steps = 4

14Initialize the trajectory

We seed the output lists with the only point we know exactly — the initial condition. Every later (t, y) pair is an approximation.

20Read the slope at the current point

This is the only place we actually use the differential equation. f(t, y) returns dy/dt at the spot where we are standing right now.

EXAMPLE

If y' = y and we're at (t=0, y=1), then slope = 1.

21The Euler update — the heart of the method

Walk h units along the tangent line: new_y = old_y + h × slope. This is the discrete version of the differential dy = (dy/dt)·dt with dt replaced by the finite step h.

EXAMPLE

y_new = 1 + 0.25 × 1 = 1.25

22Advance time

Move the clock forward by exactly the same h that we used in the y update. Keeping these two h's identical is what makes the slope assumption consistent.

23Record the new point

We append (t, y) to the trajectory so we can plot it, table it, or feed it to the next step.

29Return the full trajectory

ts and ys are parallel lists of length n_steps + 1. ts[i] is the time, ys[i] is the Euler estimate of y at that time.

34Demo slope function

y' = y is the canonical exponential equation. We pick it because the exact answer is e^t, so we can directly measure Euler's error.

37Run Euler with h = 0.25

Four steps of size 0.25 carry us from t=0 to t=1. With y₀=1 we expect e ≈ 2.718, but Euler will undershoot.

41Compare against the truth

Euler returns ≈ 2.441 — about 10% below e. That gap is the global error, and it shrinks linearly as we cut h.

27 lines without explanation

1# Plain-Python Euler's method
2# Solves the IVP:  dy/dt = f(t, y),  y(t0) = y0
3# Returns the (t, y) pairs produced by walking along the slope field.
4
5def euler(f, t0, y0, t_end, h):
6    """
7    f      : callable f(t, y) returning the slope
8    t0, y0 : initial condition  y(t0) = y0
9    t_end  : final time
10    h      : step size (must be > 0 and small enough)
11    """
12    n_steps = int(round((t_end - t0) / h))   # number of Euler steps
13    ts = [t0]                                # collected time grid
14    ys = [y0]                                # collected solution values
15
16    t = t0
17    y = y0
18    for _ in range(n_steps):
19        slope = f(t, y)        # 1. read the slope at the current point
20        y = y + h * slope      # 2. take a tangent step of length h
21        t = t + h              # 3. advance time
22        ts.append(t)
23        ys.append(y)
24
25    return ts, ys
26
27
28# --- demo: solve  y' = y,  y(0) = 1  on [0, 1]
29def f(t, y):
30    return y                  # exponential growth: slope = current value
31
32ts, ys = euler(f, t0=0.0, y0=1.0, t_end=1.0, h=0.25)
33for t, y in zip(ts, ys):
34    print(f"t={t:.2f}  y_euler={y:.6f}")
35
36# True solution is y(t) = exp(t), so y(1) = e = 2.71828...
37print("Euler at t=1 :", ys[-1])
38print("True  at t=1 :", 2.718281828459045)

And here is the experiment that confirms the first-order rate — the same one whose plot you played with above, but now as code you can actually run:

Empirical convergence: Euler is first-order

🐍euler_error_study.py

Explanation(6)

Code(28)

7Define the slope

Same model: y' = y. The true solution is y(t) = e^t.

10Return only the final value

For an order-of-accuracy study we don't need the whole trajectory, just y at t=1. This keeps the inner loop tight.

11Pick h from n

Fixing the interval [0, 1] and the step count n forces h = 1/n. Doubling n halves h.

13Plain Euler loop

Identical to the previous example, just without storing the trajectory.

24Sweep step counts

We double n each time. If Euler is first-order, the error should halve each time too.

26Compute the empirical constant

The theory says error ≈ C·h for some C that depends on the equation. If we divide error by h and get a roughly constant number across rows, we've confirmed the first-order rate experimentally.

EXAMPLE

On y'=y from t=0..1, err/h converges to about 1.35 — that's the empirical C.

22 lines without explanation

1# Measuring the error of Euler as h shrinks.
2# We solve  y' = y,  y(0) = 1  on [0, 1].
3# The true answer is  y(1) = e = 2.718281828459045
4
5import math
6
7def f(t, y):
8    return y
9
10def euler_final(f, t0, y0, t_end, n):
11    h = (t_end - t0) / n
12    t, y = t0, y0
13    for _ in range(n):
14        y = y + h * f(t, y)
15        t = t + h
16    return y
17
18print(f"{'n':>5} {'h':>10} {'y_Euler':>12} {'|error|':>12} {'err/h':>10}")
19print("-" * 54)
20
21true_value = math.e
22for n in [4, 8, 16, 32, 64, 128, 256]:
23    h        = 1.0 / n
24    y_hat    = euler_final(f, 0.0, 1.0, 1.0, n)
25    error    = abs(true_value - y_hat)
26    # Empirical "constant" in the bound  error ≈ C · h
27    constant = error / h
28    print(f"{n:>5d} {h:>10.5f} {y_hat:>12.6f} {error:>12.6e} {constant:>10.4f}")

When you run it you should see the $|\text{error}|$ column roughly halve every time $n$ doubles, and the rightmost column (the empirical $C = |\text{error}|/h$ ) converge to a single constant — the punchline of the whole error analysis.

PyTorch / Vectorized Implementation

The plain version is great for understanding but slow for large problems. The PyTorch version below is structurally identical — same update rule, same loop — but every value is a tensor. Three real advantages drop out for free:

Batching. Solve a whole population of trajectories (different initial conditions, different parameters) in a single call.
GPU. Move tensors to cuda and the same code runs there.
Autograd. Because every step is a differentiable tensor op, we can differentiate the final state with respect to anything that fed into it — initial conditions, model parameters, the integration time. This is exactly the trick that makes Neural ODEs trainable.

Batched, differentiable Euler in PyTorch

🐍euler_torch.py

Explanation(7)

Code(49)

9Why use a tensor library at all?

Plain Python loops are fine for textbook examples, but real workloads need (1) GPU support, (2) batching over many initial conditions, and (3) differentiation through the solver. PyTorch gives us all three for free.

11f takes batched y

Compared to the scalar version, f now consumes a tensor of shape [B, D]: B independent trajectories with D-dimensional state. Inside f we use broadcasting so a single line of code handles the whole batch.

17Keep t as a tensor too

Even though t is scalar, making it a tensor on the same device/dtype as y prevents silent CPU↔GPU copies when f mixes the two.

20The same Euler update — broadcasted

y has shape [B, D] and f(t, y) returns the same shape. Adding h * f(t, y) updates every trajectory in parallel without a Python loop over the batch.

EXAMPLE

If B=4, one tensor add replaces four scalar updates.

26A batch of decay rates

Each row is a different λ. Putting them in shape [4, 1] makes them broadcast against y of shape [4, 1], so each trajectory sees only its own λ.

36Closed form for sanity

Decay's exact solution is e^(-λ·t). Comparing tells us each error is around 1e-3 with h=0.01 — exactly what first-order O(h) predicts.

42Differentiating through the solver

Because every operation inside euler_batched is a differentiable tensor op, y_T inherits a computation graph all the way back to y0. Calling .backward() fills y0.grad with the sensitivity ∂y(T)/∂y(0) — this is the very mechanism Neural ODEs use to learn vector fields.

EXAMPLE

For y'=-0.5 y the true sensitivity is e^(-0.5·2) ≈ 0.3679; Euler reproduces it to four decimal places.

42 lines without explanation

1# Vectorized Euler in PyTorch.
2# Same idea as the plain-Python version, but:
3#   • everything is a torch.Tensor, so it runs on GPU when you ask it to
4#   • we solve a batch of trajectories in parallel (different initial values)
5#   • autograd is preserved, so the *final state* can be differentiated
6#     with respect to the initial condition or any parameter of f.
7
8import torch
9
10def euler_batched(f, t0, y0, t_end, h):
11    """
12    f     : callable f(t: float, y: Tensor[B, D]) -> Tensor[B, D]
13    y0    : Tensor of shape [B, D]  — B independent initial conditions
14    Returns y(t_end) as a Tensor of shape [B, D].
15    """
16    n_steps = int(round((t_end - t0) / h))
17    t = torch.tensor(t0, dtype=y0.dtype, device=y0.device)
18    y = y0
19    for _ in range(n_steps):
20        y = y + h * f(t, y)
21        t = t + h
22    return y
23
24
25# --- demo: solve  y' = -lambda * y  (decay) for a batch of lambdas
26lambdas = torch.tensor([0.1, 0.5, 1.0, 2.0]).view(-1, 1)   # shape [4, 1]
27
28def f(_t, y):
29    # Element-wise: each row uses its own lambda
30    return -lambdas * y
31
32y0 = torch.ones(4, 1)               # all start at y=1
33y_T = euler_batched(f, t0=0.0, y0=y0, t_end=2.0, h=0.01)
34
35true = torch.exp(-lambdas * 2.0)    # closed form for comparison
36print("Euler  y(2) :", y_T.flatten().tolist())
37print("Truth  y(2) :", true.flatten().tolist())
38print("Errors      :", (y_T - true).abs().flatten().tolist())
39
40
41# --- bonus: differentiate the final state w.r.t. an initial condition
42y0 = torch.tensor([[1.0]], requires_grad=True)
43
44def f_single(_t, y):
45    return -0.5 * y
46
47y_T = euler_batched(f_single, 0.0, y0, 2.0, 0.01)   # y_T depends on y0
48y_T.sum().backward()                                # populate y0.grad
49print("dy(2)/dy(0) ≈", y0.grad.item())              # ≈ exp(-0.5*2) = 0.3679

From textbook to research

The last block of that example — calling y_T.sum().backward() — is the conceptual seed of the entire Neural ODE literature. Replace the hand-coded f with a neural network, integrate with Euler (or RK4), backprop the downstream loss, and you have a learnable continuous-depth model.

Beyond Euler: A Glimpse at RK4 and Neural ODEs

Once you see that Euler is just "estimate the slope, take a step", every improvement is easy to anticipate:

method	slope used	global order
Forward Euler	slope at the start of the interval	O(h)
Midpoint (RK2)	slope at the interval midpoint	O(h²)
Classical RK4	weighted average of four slope estimates inside the interval	O(h⁴)
Implicit (Backward) Euler	slope at the end of the interval (solve a small equation)	O(h), but unconditionally stable
Adaptive RK (RKF45, Dormand–Prince)	two RK estimates of different orders, used to pick h	variable, controlled by tolerance

Every one of these is a refinement on the same skeleton you just coded. The whole field of numerical analysis for ODEs is the search for ways to extract more accuracy or more stability per slope evaluation than Euler manages.

Common Pitfalls

1. Using the same h that worked on another problem

A step size that gives wonderful results on a slow exponential decay can blow up entirely on a stiff oscillator. The maximum allowable

h

depends on both accuracy and stability requirements, and both depend on the equation.

2. Reading the slope at the wrong point

The slope

f(t, y)

in Euler's update is evaluated at the current

(t_n, y_n)

— not at

(t_{n+1}, y_{n+1})

. Putting the index wrong turns explicit Euler into implicit Euler, which is a different method with very different behavior.

3. Storing only the endpoint when you need the path

Many real applications need the entire trajectory (think plotting, event detection, control). Make sure your implementation records each step, not just the final value.

4. Confusing local and global error

A loss in per-step accuracy of

O(h^2)

looks tiny on a log plot, but over

1/h

steps it adds up to

O(h)

globally. Tracking only the per-step residual hides the real error you care about.

Summary

Euler's method is the canonical bridge from calculus to computation. Its update rule

\displaystyle y_{n+1} = y_n + h\, f(t_n, y_n)

falls out directly from the definition of the derivative; geometrically it walks along tangent lines of the slope field. The cost of this simplicity is first-order accuracy — global error scales linearly with the step size — and a stability barrier on stiff equations. Yet every fancier ODE solver is a refinement of the same skeleton, and the PyTorch version is the first step toward Neural ODEs and the modern continuous-depth models built on top of them.

Take home. If you remember one thing, remember this rhythm: read the slope, walk a step, repeat. Everything beyond Euler is just how to read the slope more cleverly.