Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will be able to:

Recognize when a first-order ODE can be tamed by a change of variable, even though it is neither linear nor exact.
Apply the homogeneous substitution $v = y/x$ to any ODE of the form $dy/dx = F(y/x)$ .
Linearize a Bernoulli equation $y' + P(x)y = Q(x)y^n$ via $u = y^{1-n}$ .
Handle equations of the form $dy/dx = f(ax + by + c)$ with the linear shift $v = ax + by + c$ .
Verify the algebra numerically using Python and PyTorch's autograd.
Connect these substitutions to real-world models: population growth, fluid drag, mixing, and beyond.

The Big Idea: Why Substitute?

"We do not solve hard problems. We trade them for easy ones."

In Sections 21.1 and 21.2 you mastered two structurally easy equations: linear and exact. The integrating factor handled the first, the potential function the second. But the wild ODE you meet in a physics lab rarely walks in already wearing one of those costumes. Consider three real specimens:

Not linear

dy/dx = (x² + y²) / (x y)

Not exact

dy/dx = r y (1 − y/K)

Not separable

dy/dx = (x + y + 1)²

None of these is linear or exact as written, so the methods of Sections 21.1 and 21.2 do not apply directly. And yet, with the right change of variable, all three collapse into separable or linear equations you can solve in your sleep.

The substitution mindset

Substitution is the universal escape hatch. Whenever a first-order ODE doesn't fit any standard mould, ask: is there a single quantity inside this equation that, once given its own name, lets everything else simplify around it? That quantity is your $v$ (or $u$ ), and naming it is half the battle.

The three patterns below come up so often that they have dedicated names. In every case the recipe is the same four-step dance:

Diagnose the structure on the right side.
Introduce a new variable that captures that structure.
Rewrite the ODE in the new variable — it will be separable or linear.
Solve and undo the substitution to get $y(x)$ back.

Diagnosing the Right Substitution

Pattern matching the right side of $dy/dx = f(x,y)$ tells you which substitution to reach for:

Right side depends on…	Call it	Substitution	Equation becomes
only the ratio y/x	Homogeneous	v = y/x	separable in v, x
y^n with a linear y elsewhere	Bernoulli	u = y^(1−n)	linear in u
only ax + by + c	Linear shift	v = ax + by + c	separable in v, x

Spotting homogeneity at a glance

A right side $f(x,y)$ is homogeneous of degree zero exactly when $f(\lambda x, \lambda y) = f(x,y)$ for every $\lambda \neq 0$ . Try $\lambda = 1/x$ : you get $f(1, y/x)$ , a function of $y/x$ alone.
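The scaling test can also be run numerically before doing any algebra. The sketch below is a minimal illustration, assuming a hypothetical helper `is_degree_zero` and a handful of arbitrarily chosen sample points and scale factors; passing it is evidence of homogeneity, not a proof.

```python
import math

def is_degree_zero(f, samples, scales=(0.5, 2.0, -3.0), tol=1e-9):
    """Return True if f(lam*x, lam*y) matches f(x, y) at every sample and scale."""
    for x, y in samples:
        base = f(x, y)
        for lam in scales:
            if not math.isclose(f(lam * x, lam * y), base, rel_tol=tol, abs_tol=tol):
                return False
    return True

# Sample points avoid x = 0, y = 0 and x = y, where some right sides are undefined.
points = [(1.0, 2.0), (2.0, -0.5), (3.0, 1.5)]

def ratio_rhs(x, y):
    return (x**2 + y**2) / (x * y)      # dy/dx = (x^2 + y^2)/(x y)

def quotient_rhs(x, y):
    return (x + y) / (x - y)            # dy/dx = (x + y)/(x - y)

def shift_rhs(x, y):
    return (x + y + 1.0) ** 2           # dy/dx = (x + y + 1)^2

print(is_degree_zero(ratio_rhs, points))     # True
print(is_degree_zero(quotient_rhs, points))  # True
print(is_degree_zero(shift_rhs, points))     # False
```

The third right side fails the test, which is the cue to look for a different pattern (here, the linear shift).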

Homogeneous Equations: v = y/x

A first-order ODE is homogeneous of degree zero when its right side depends only on the ratio $y/x$ :

\frac{dy}{dx} = F\!\left(\frac{y}{x}\right)

Set $v = y/x$ , so $y = v\,x$ . Differentiating with the product rule gives $y' = v + x\,v'$ . Substituting:

v + x\,\frac{dv}{dx} = F(v) \quad\Longrightarrow\quad \frac{dv}{dx} = \frac{F(v) - v}{x}

The right side is a product of a function of $v$ and a function of $x$ — it is separable:

\int \frac{dv}{F(v) - v} = \int \frac{dx}{x} = \ln|x| + C

What is the geometric meaning?

Slopes of a homogeneous ODE are constant along rays from the origin. Walk out along the ray $y = mx$ and the slope $F(m)$ never changes.
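The ray picture is easy to confirm by direct evaluation. The sketch below assumes the right side $(x + y)/(x - y)$ and an arbitrarily chosen ray slope `m = 0.5`; both are illustrative choices.

```python
def F(v):
    """Right side of dy/dx = (x + y)/(x - y) written in terms of v = y/x."""
    return (1.0 + v) / (1.0 - v)

def slope(x, y):
    """Right side of the same ODE in the original variables."""
    return (x + y) / (x - y)

m = 0.5                                  # assumed ray y = m*x
xs = (0.1, 1.0, 7.0, -4.0)               # points along the ray, both sides of the origin
slopes = [slope(x, m * x) for x in xs]

# Every entry agrees with F(m) up to floating-point rounding.
print(slopes, F(m))
```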

The Three-Line Recipe

Write the ODE so its right side is a function of $y/x$ alone, call it $F(v)$ .
Replace with $v = y/x,\; y = vx,\; y' = v + xv'$ and rearrange to $\frac{dv}{F(v) - v} = \frac{dx}{x}$ .
Integrate both sides, then back-substitute $v = y/x$ .

Worked Example — Homogeneous

Solve $\dfrac{dy}{dx} = \dfrac{x + y}{x - y}, \quad y(1) = 0$ .


Step 1 — Confirm homogeneity. Divide numerator and denominator by $x$ :

\frac{dy}{dx} = \frac{1 + y/x}{1 - y/x} = F(v) \;\text{ with } v = y/x.

Step 2 — Substitute. Let $y = vx$ , so $y' = v + xv'$ :

v + x\,v' = \frac{1+v}{1-v} \;\Longrightarrow\; x\,v' = \frac{1+v}{1-v} - v = \frac{1+v - v + v^2}{1-v} = \frac{1+v^2}{1-v}.

Step 3 — Separate variables.

\frac{1-v}{1+v^2}\,dv = \frac{dx}{x}.

Step 4 — Integrate. Split the left integral into two pieces:

\int \frac{dv}{1+v^2} - \int \frac{v\,dv}{1+v^2} = \int \frac{dx}{x}

\arctan v - \tfrac12 \ln(1+v^2) = \ln|x| + C.

Step 5 — Back-substitute $v = y/x$ :

\arctan\!\left(\frac{y}{x}\right) - \tfrac12 \ln\!\left(1 + \frac{y^2}{x^2}\right) = \ln|x| + C.

Step 6 — Apply the initial condition $y(1) = 0$ : the left side becomes $\arctan 0 - \tfrac12 \ln 1 = 0$ , and the right side is $\ln 1 + C = C$ , so $C = 0$ . The implicit solution is:

\arctan\!\left(\frac{y}{x}\right) = \ln|x| + \tfrac12 \ln\!\left(1 + \frac{y^2}{x^2}\right).

Step 7 — Sanity check. The point $(1, 0)$ satisfies the implicit equation because both sides vanish there. Differentiating the implicit solution and simplifying returns $y' = (x + y)/(x - y)$ , so the slope at $(1, 0)$ is $(1 + 0)/(1 - 0) = 1$ , as the ODE requires.
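One way to double-check the implicit solution is to integrate the original ODE numerically and confirm that the computed point still satisfies it. The sketch below uses a hand-written fourth-order Runge-Kutta stepper; the helper names `rk4` and `implicit_gap`, the end point `x = 1.4` and the step count are all assumptions made for this illustration. It also estimates where the curve meets the line $y = x$ , beyond which the solution cannot be continued as a function of $x$ because the slope becomes infinite.

```python
import math

def rhs(x, y):
    """Right side of the original ODE dy/dx = (x + y)/(x - y)."""
    return (x + y) / (x - y)

def rk4(f, x, y, x_end, steps):
    """Classical fourth-order Runge-Kutta from x to x_end in equal steps."""
    h = (x_end - x) / steps
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def implicit_gap(x, y):
    """arctan(v) - 0.5*ln(1 + v^2) - ln|x| with v = y/x; zero on the C = 0 curve."""
    v = y / x
    return math.atan(v) - 0.5 * math.log(1.0 + v * v) - math.log(abs(x))

x_end = 1.4                                    # assumed end point, short of y = x
y_end = rk4(rhs, 1.0, 0.0, x_end, steps=2000)  # start from y(1) = 0
gap = implicit_gap(x_end, y_end)

# On this curve, v = 1 (the line y = x) is reached at ln x = arctan(1) - 0.5*ln(2).
x_singular = math.exp(math.pi / 4 - 0.5 * math.log(2.0))

print(f"y({x_end}) = {y_end:.6f}, implicit-solution gap = {gap:.2e}")
print(f"curve meets y = x near x = {x_singular:.4f}")
```

A gap many orders of magnitude smaller than the terms in the formula indicates that the integration and the hand derivation agree.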

Bernoulli Equations: u = y^(1−n)

A Bernoulli equation is anything of the form

\frac{dy}{dx} + P(x)\,y = Q(x)\,y^n, \qquad n \neq 0, 1.

It is one tiny step removed from a linear equation — only that troublesome $y^n$ on the right makes it non-linear. The magic substitution is $u = y^{1 - n}$ .

Why this exponent?

We want a quantity whose derivative absorbs the $y^n$ . Differentiate $u = y^{1-n}$ :

\frac{du}{dx} = (1-n)\,y^{-n}\,\frac{dy}{dx}.

Now multiply the original ODE by $(1-n)\,y^{-n}$ :

(1-n)\,y^{-n}\,y' + (1-n)\,P(x)\,y^{\,1-n} = (1-n)\,Q(x).

The first term is exactly $du/dx$ , and $y^{1-n} = u$ . So:

\boxed{\;\frac{du}{dx} + (1-n)\,P(x)\,u = (1-n)\,Q(x)\;}

A linear first-order ODE in $u$ ! Solve it with the integrating factor method from Section 21.1, then convert back via $y = u^{1/(1-n)}$ .
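The reduction can be sketched as a small routine that never touches the nonlinear equation: it integrates the linear equation for $u$ and converts back at the end. Everything here is illustrative. The function names `rk4` and `solve_bernoulli` are hypothetical, and the test equation $y' + y/x = x\,y^2$ with $y(1) = 1$ does not come from the text; it was chosen because its exact solution $y = 1/(2x - x^2)$ is easy to verify by hand.

```python
def rk4(f, x, u, x_end, steps=2000):
    """Classical fourth-order Runge-Kutta from x to x_end in equal steps."""
    h = (x_end - x) / steps
    for _ in range(steps):
        k1 = f(x, u)
        k2 = f(x + h / 2, u + h * k1 / 2)
        k3 = f(x + h / 2, u + h * k2 / 2)
        k4 = f(x + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return u

def solve_bernoulli(P, Q, n, x0, y0, x_end):
    """Solve y' + P(x) y = Q(x) y^n through the linear equation for u = y^(1-n)."""
    u0 = y0 ** (1 - n)

    def du_dx(x, u):
        # u' + (1-n) P u = (1-n) Q, rearranged for u'
        return (1 - n) * (Q(x) - P(x) * u)

    u_end = rk4(du_dx, x0, u0, x_end)
    return u_end ** (1.0 / (1 - n))      # undo the substitution

# Assumed test problem: y' + y/x = x*y**2, y(1) = 1, exact y = 1/(2x - x^2).
x_end = 1.5
y_num = solve_bernoulli(lambda x: 1.0 / x, lambda x: x, 2, 1.0, 1.0, x_end)
y_exact = 1.0 / (2.0 * x_end - x_end**2)

print(y_num, y_exact)
```

The conversion back to $y$ assumes $u$ stays positive on the interval, which holds for this test problem.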

The famous case n = 2 — the logistic equation

With $n = 2$ the substitution becomes $u = y^{-1} = 1/y$ , and the equation $y' = ry\bigl(1 - y/K\bigr)$ , rewritten as $y' - ry = -(r/K)\,y^2$ , turns into the linear equation $u' + ru = r/K$ . Its solution is the S-shaped logistic curve.

Worked Example — Bernoulli (Logistic)

Solve the logistic ODE $\dfrac{dy}{dx} = ry\bigl(1 - \dfrac{y}{K}\bigr), \quad y(0) = y_0,$ with $r, K, y_0 > 0$ .


Step 1 — Rewrite in Bernoulli standard form.

y' = ry - \tfrac{r}{K}\,y^2 \;\Longrightarrow\; y' - ry = -\tfrac{r}{K}\,y^2.

Here $P(x) = -r,\; Q(x) = -r/K,\; n = 2$ .

Step 2 — Substitute $u = y^{1-n} = y^{-1}$ :

\frac{du}{dx} = -y^{-2}\,\frac{dy}{dx}, \quad y = 1/u.

Step 3 — Multiply the ODE by $(1-n)\,y^{-n} = -y^{-2}$ :

-y^{-2}y' + r\,y^{-1} = \tfrac{r}{K}.

The left side is exactly $u' + ru$ , giving the linear equation:

\frac{du}{dx} + r\,u = \frac{r}{K}.

Step 4 — Integrating factor $\mu = e^{rx}$ :

\frac{d}{dx}\bigl(e^{rx}\,u\bigr) = \frac{r}{K}\,e^{rx}.

Step 5 — Integrate both sides:

e^{rx}\,u = \frac{1}{K}\,e^{rx} + C \;\Longrightarrow\; u = \frac{1}{K} + C\,e^{-rx}.

Step 6 — Apply $u(0) = 1/y_0$ :

\frac{1}{y_0} = \frac{1}{K} + C \;\Longrightarrow\; C = \frac{1}{y_0} - \frac{1}{K} = \frac{K - y_0}{K\,y_0}.

Step 7 — Convert back to $y = 1/u$ and clean up:

y(x) = \frac{K}{1 + \dfrac{K - y_0}{y_0}\,e^{-rx}}.

Step 8 — Read the physics off the formula.

x = 0: the exponential is 1, so the denominator is 1 + (K − y₀)/y₀ = K/y₀, giving y(0) = y₀. ✓
x → ∞: the exponential → 0, so y → K (the carrying capacity).
Inflection point: y = K/2, reached only when y₀ < K/2. The population grows fastest when half of capacity is filled, which is why S-curves show up in advertising spend, virus spread, and tech adoption.
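These three readings can be checked numerically. The sketch below assumes the illustrative values $r = 1.2$ , $K = 10$ , $y_0 = 1$ and a scan grid with spacing 0.01. It locates the inflection point from $A e^{-rx} = 1$ with $A = (K - y_0)/y_0$ , which gives $x = \ln(A)/r$ , and compares it with the point of fastest growth found by brute force.

```python
import math

r, K, y0 = 1.2, 10.0, 1.0        # assumed illustrative parameters
A = (K - y0) / y0

def y(x):
    """Closed-form logistic solution."""
    return K / (1.0 + A * math.exp(-r * x))

def growth(x):
    """Right side of the logistic ODE evaluated on the solution."""
    yx = y(x)
    return r * yx * (1.0 - yx / K)

x_infl = math.log(A) / r                   # where y = K/2 (requires y0 < K/2)
xs = [i * 0.01 for i in range(601)]        # scan x in [0, 6]
x_fastest = max(xs, key=growth)            # grid point with the largest dy/dx

print(f"y(0) = {y(0.0):.4f}")
print(f"y at inflection = {y(x_infl):.4f} (K/2 = {K / 2})")
print(f"inflection x = {x_infl:.4f}, fastest growth on grid at x = {x_fastest:.2f}")
```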

Linear-Shift Substitution: v = ax + by + c

If the entire right side depends on a single linear combination,

\frac{dy}{dx} = f(ax + by + c),

the right substitution is to give that combination a name. Set $v = ax + by + c$ . Differentiating:

\frac{dv}{dx} = a + b\,\frac{dy}{dx} = a + b\,f(v).

The new equation is separable: $\dfrac{dv}{a + b\,f(v)} = dx$ . Integrate, then undo the substitution.


Worked micro-example

For $y' = (x + y + 1)^2$ , set $v = x + y + 1$ . Then $v' = 1 + v^2$ , separable to $\arctan v = x + C$ . Undoing: $x + y + 1 = \tan(x + C)$ , so $y = \tan(x + C) - x - 1$ .
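A quick finite-difference check confirms the formula. The sketch assumes the initial condition $y(0) = 0$ , which is not specified in the text; it fixes $\tan C = 1$ , so $C = \pi/4$ . The solution then exists only while $x + C < \pi/2$ , so the check stays on $0 \le x \le 0.5$ .

```python
import math

C = math.pi / 4                  # from the assumed condition y(0) = 0

def y(x):
    """Candidate solution y = tan(x + C) - x - 1."""
    return math.tan(x + C) - x - 1.0

def residual(x, h=1e-5):
    """Central-difference dy/dx minus the ODE right side (x + y + 1)^2."""
    dy = (y(x + h) - y(x - h)) / (2.0 * h)
    return dy - (x + y(x) + 1.0) ** 2

worst = max(abs(residual(0.05 * i)) for i in range(11))   # x = 0, 0.05, ..., 0.5
print(f"y(0) = {y(0.0):.2e}, worst residual = {worst:.2e}")
```

The residual is limited by the finite-difference approximation, so it is small but not zero.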

Python: Hand-rolled Substitution Solver

Let's see the homogeneous substitution work end-to-end on a computer. We'll start from the original ODE, apply the substitution by hand to derive the simplified equation, then ask SciPy to solve that simplified equation — and finally un-substitute to recover $y(x)$ . As a last step we will plug the answer back into the original ODE and measure the residual — if our algebra was right, the residual should be tiny.

Solving a homogeneous ODE by substitution

substitution_solver.py

Explanation

NumPy + SciPy imports

NumPy gives us array math; scipy.integrate.odeint is the workhorse ODE solver (LSODA under the hood). The substitution trick is independent of these libraries: it only changes what equation we hand to the solver.

Pin down the ORIGINAL equation

The equation dy/dx = (x + y)/(x − y) is homogeneous: divide top and bottom by x and the right side becomes (1 + y/x)/(1 − y/x), a function of the single ratio v = y/x. That observation is the whole reason this section exists.

F(v): what the right side becomes after substitution

Plugging v = y/x into the original right side cancels every loose x. F encapsulates that simplified expression so the rest of the code never has to know what the original equation looked like.

dv/dx = (F(v) − v) / x: the substituted ODE

Differentiating y = v·x with the product rule gives y' = v + x·v'. Setting y' equal to F(v) and solving for v' yields v' = (F(v) − v)/x. Notice it is now separable: dv/(F(v) − v) = dx/x. The if-guard avoids dividing by zero exactly at x = 0.

Pick a point on the solution curve

We start at (x₀, y₀) = (1.0, 0.0), the initial condition of the worked example. The very first thing the substitution requires is converting the initial condition: v₀ = y₀/x₀ = 0. We solve in v-coordinates from now on.

Integrate in v-space

odeint marches v from x = 1 to x = 1.4 using the v-equation. The interval is deliberately short: F(v) blows up at v = 1 (the line y = x), and the curve through (1, 0) reaches that line near x ≈ 1.55, so the solution cannot be continued in x beyond that point. Internally odeint uses adaptive step control, but conceptually each step is Euler-like: v_{n+1} = v_n + h · (F(v_n) − v_n)/x_n.

Undo the substitution: y = v · x

We only solved for v, but the user wants y. Multiplying element-wise rebuilds the original curve. This 'forward map then inverse map' pattern is the universal recipe of all substitution methods.

Numerical derivative for verification

np.gradient uses central differences in the interior (and, with edge_order=2, second-order one-sided differences at the two end points) to approximate dy/dx from the y values we just produced. If our substitution work is correct, this should match the original ODE's right-hand side closely.

Plug y back into the ORIGINAL right side

We compute (x + y)/(x − y) at every grid point and compare. The maximum absolute difference is our residual. It is limited mainly by the finite-difference error of np.gradient on this grid rather than by the solver tolerance, so expect a value that is small compared with the slopes themselves (which are of order 1 to 3 here), not machine precision.

Read the printout

v rises from 0 at x = 1 and stays below the singular value v = 1 over the whole grid. The residual line is the trust check: a residual comparable to the slopes themselves would tell you the substitution had a bug.

Code

import numpy as np
from scipy.integrate import odeint

# ------------------------------------------------------------------
# Original homogeneous ODE:   dy/dx = (x + y) / (x - y)
# Substitution:               v = y / x   =>   y = v x
# Derived ODE in v:           dv/dx = (F(v) - v) / x
# where F(v) = (1 + v) / (1 - v).
# ------------------------------------------------------------------

def F(v):
    return (1.0 + v) / (1.0 - v)

def dv_dx(v, x):
    # Guard against division by zero exactly at x = 0.
    if abs(x) < 1e-9:
        return 0.0
    return (F(v) - v) / x

# Initial point in (x, y), then convert to v.
x0, y0 = 1.0, 0.0
v0 = y0 / x0                 # = 0.0

# Stop well short of x ~ 1.55, where this curve meets the line y = x (v = 1).
x_grid = np.linspace(1.0, 1.4, 200)
v_sol  = odeint(dv_dx, v0, x_grid).flatten()
y_sol  = v_sol * x_grid      # convert v back to y

# Verify by plugging back into the ORIGINAL ODE.
dy_numeric = np.gradient(y_sol, x_grid, edge_order=2)
dy_formula = (x_grid + y_sol) / (x_grid - y_sol)
max_residual = np.max(np.abs(dy_numeric - dy_formula))

print(f"v(x0) = {v0:.4f}, y(x0) = {y0:.4f}")
print(f"v(x_end) = {v_sol[-1]:.4f}, y(x_end) = {y_sol[-1]:.4f}")
print(f"max |dy/dx_numeric - F(x,y)| = {max_residual:.4e}")

PyTorch: Verifying with Autograd

Once we hand-derive a closed-form solution from a substitution, how do we know we didn't flip a sign or forget a chain-rule factor? PyTorch's autograd engine gives us a one-line answer: differentiate the candidate solution and compare with the right side of the original ODE. If they match at every point, the derivation is correct.

Verifying the logistic solution with PyTorch autograd

autograd_verify.py

Explanation

Why PyTorch at all?

We are not training a model. We are using PyTorch's automatic differentiation engine as a derivative oracle: it computes dy/dx to floating-point precision from any expression we build, so it can verify that a closed-form solution actually satisfies its ODE.

Logistic parameters

Growth rate r = 1.2 (units 1/time), carrying capacity K = 10 (max value y can approach), initial population y₀ = 1.0. With y₀ < K the solution starts below capacity and grows toward K.

x as a leaf tensor with requires_grad=True

Setting requires_grad turns x into a leaf node in the autograd graph. Anything we compute downstream remembers how to send a gradient back to x. Without this flag the call to torch.autograd.grad would raise an error.

A = (K − y₀)/y₀: the integration constant

When we solved u = 1/y → du/dx + ru = r/K and applied the initial condition u(0) = 1/y₀, the resulting integration constant in y-form is exactly A. This single number is what 'memory' of the initial condition becomes.

y(x) = K / (1 + A · e^(−r x))

This is the famous logistic curve. As x → ∞ the exponential decays to 0, so y → K (saturation). At x = 0 the denominator is 1 + A = K/y₀, giving y(0) = K · y₀/K = y₀. ✓

Why grad_outputs=ones?

torch.autograd.grad computes the sum Σᵢ gᵢ · dyᵢ/dxⱼ where gᵢ is the i-th entry of grad_outputs. Here each yᵢ depends only on its own xᵢ, so the Jacobian is diagonal and passing a vector of 1's returns dyᵢ/dxᵢ at each grid point, which is exactly the pointwise derivative we want.

create_graph=False

We only need a first derivative, so autograd does not record a graph for the derivative itself. Setting create_graph to True would let us differentiate dy/dx again (useful for second-order ODE checks).

Recompute the RHS of the ORIGINAL ODE

r · y · (1 − y/K) is what dy/dx is supposed to equal. If our algebra was right, autograd's answer and this expression must agree everywhere up to rounding.

L∞ error: the trust check

We take the absolute difference at every point and grab the maximum. A correct derivation produces an error at the level of float32 rounding (around 1e-6 or below for these parameter values). A residual far above that level would point to a sign error or a missed term.

Reading the printout

The first line confirms numerically, at every grid point, that the closed form satisfies the ODE. The next two check the end behaviour: y(0) equals y₀, and y at the last grid point is close to, but still slightly below, K. Three checks, all green, give good evidence that the closed form is right.

Code

import torch

# ------------------------------------------------------------------
# Bernoulli case: logistic equation
#     dy/dx = r * y * (1 - y/K)
# Substitution u = 1/y turns it into the LINEAR equation
#     du/dx + r*u = r/K
# with closed-form solution
#     u(x) = 1/K + (1/y0 - 1/K) * exp(-r*x)
# Therefore
#     y(x) = K / (1 + ((K - y0)/y0) * exp(-r*x))
# ------------------------------------------------------------------

r  = 1.2
K  = 10.0
y0 = 1.0

# 1. Build x as a PyTorch tensor that tracks gradients.
x = torch.linspace(0.0, 6.0, 200, requires_grad=True)

# 2. Evaluate the closed-form y(x) we derived from the substitution.
A = (K - y0) / y0
y = K / (1 + A * torch.exp(-r * x))

# 3. Ask autograd for dy/dx at every grid point in one shot.
ones = torch.ones_like(y)
(dy_dx,) = torch.autograd.grad(y, x, grad_outputs=ones, create_graph=False)

# 4. Compute what dy/dx SHOULD be from the original ODE.
rhs = r * y * (1 - y / K)

# 5. The closed form is consistent with the ODE if autograd matches the rhs
#    up to float32 rounding.
err = (dy_dx - rhs).abs().max()

print(f"max |autograd(y') - r*y*(1 - y/K)| = {err.item():.3e}")
print(f"y(0)   = {y[0].item():.4f}    (expected {y0})")
print(f"y(end) = {y[-1].item():.4f}   (approaches K = {K} as x grows)")
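For readers without PyTorch installed, the same derivative-oracle idea can be sketched in plain Python with forward-mode dual numbers. This is a stand-in technique, not a description of how PyTorch's reverse-mode autograd works internally. The `Dual` class and the `dexp` helper are hypothetical names introduced for this illustration, and only the operations needed by the logistic formula are implemented.

```python
import math

class Dual:
    """Number val + der*eps with eps**2 = 0; `der` carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule on the derivative part.
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__

    def __rtruediv__(self, other):
        # Computes other / self with the quotient rule.
        other = self._lift(other)
        return Dual(other.val / self.val,
                    (other.der * self.val - other.val * self.der) / self.val**2)

def dexp(d):
    """Exponential of a dual number (chain rule on the derivative part)."""
    e = math.exp(d.val)
    return Dual(e, e * d.der)

r, K, y0 = 1.2, 10.0, 1.0
A = (K - y0) / y0

def y(x):
    """Closed-form logistic solution; accepts a Dual so the derivative rides along."""
    return K / (1.0 + A * dexp(-r * x))

worst = 0.0
for i in range(61):                        # x = 0.0, 0.1, ..., 6.0
    out = y(Dual(0.1 * i, 1.0))            # seed dx/dx = 1
    rhs = r * out.val * (1.0 - out.val / K)
    worst = max(worst, abs(out.der - rhs))

print(f"max |dual-number y' - r*y*(1 - y/K)| = {worst:.3e}")
```

Because this runs in double precision, the mismatch should sit near 1e-15 rather than the float32 level seen with the PyTorch listing.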

Real-World Applications

🌱 Population Biology (Bernoulli)

The logistic ODE is the workhorse of ecology — every bounded population, from yeast in a flask to wolves in Yellowstone, settles onto an S-curve toward its carrying capacity $K$ .

💨 Fluid Drag (Bernoulli, n = 2)

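A body coasting through a fluid with linear plus quadratic drag obeys $m\,dv/dt = -b\,v - c\,v^2$ , a Bernoulli equation with $n = 2$ . Dividing by $v^2$ and substituting $u = v^{-1}$ gives the linear ODE $u' - (b/m)\,u = c/m$ . Once gravity is added, as in $m\,dv/dt = mg - c\,v^2$ , the constant term means the equation is no longer Bernoulli; that terminal-velocity problem is separable instead.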

🔬 Reaction Kinetics (Bernoulli)

Self-catalytic reactions $A + B \to 2B$ obey $dB/dt = kAB$ . Because the total $T = A + B$ is conserved, this is the logistic equation $dB/dt = kB(T - B)$ , a Bernoulli equation with $n = 2$ . Equivalently, the ratio $v = B/A$ satisfies the linear equation $dv/dt = kT\,v$ .

🧠 Machine Learning (Bernoulli)

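The replicator equation of evolutionary game theory, $\dot x_i = x_i\bigl(f_i(x) - \bar f(x)\bigr)$ , reduces for two strategies with constant payoffs $f_1, f_2$ to $\dot x_1 = (f_1 - f_2)\,x_1(1 - x_1)$ , a logistic (Bernoulli, $n = 2$ ) equation. Replicator dynamics are also used to analyze learning in multi-agent reinforcement learning.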

Common Pitfalls

Forgetting the product rule

When you substitute $y = vx$ , the derivative is $y' = v + xv'$ , not just $v'$ . Missing the $v$ term is the single most common mistake in homogeneous substitutions.
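A concrete function makes the slip visible. The sketch below uses the arbitrarily chosen test function $y = x^2 + 1$ (an assumption for illustration), for which $v = x + 1/x$ and every derivative is known exactly.

```python
def y(x):
    return x**2 + 1.0

def dy(x):
    return 2.0 * x               # exact derivative of y

def v(x):
    return y(x) / x              # v = y/x = x + 1/x

def dv(x):
    return 1.0 - 1.0 / x**2      # exact derivative of v

x = 2.0
correct = v(x) + x * dv(x)       # product rule: y' = v + x v'
wrong = dv(x)                    # the common slip: y' taken to be v'

print(dy(x), correct, wrong)     # 4.0 4.0 0.75
```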

Bernoulli with n = 0 or n = 1

The substitution $u = y^{1-n}$ is degenerate when $n = 1$ (gives $u = y^0 = 1$ ) or trivial when $n = 0$ . But in those cases the original equation is already linear — solve it directly with the integrating factor method.

Don't forget to back-substitute

After integrating the simplified equation in $v$ or $u$ , you MUST express the answer back in terms of the original variables $y$ and $x$ . A solution "in v" is not yet a solution to the original problem.

Linear shift with b = 0

For $v = ax + by + c$ with $b = 0$ , the substitution degenerates: the right side is then a function of $x$ alone and the ODE is already directly integrable. When $b \neq 0$ , keep the constant term in $dv/dx = a + b\,f(v)$ ; dropping the $a$ is an easy slip.


Summary

Substitution is the most strategic tool in the first-order ODE toolbox. With three patterns you can crack equations that look forbidding at first glance:

Pattern	Substitution	Becomes	Hallmark
dy/dx = F(y/x)	v = y/x	separable in v, x	right side depends only on the ratio y/x
y' + Py = Qy^n	u = y^(1−n)	linear in u	rogue y^n term ruins linearity
dy/dx = f(ax+by+c)	v = ax+by+c	separable in v, x	right side groups x and y as a single block

Key Takeaways

Substitution trades a difficult ODE for an easier one in a new variable — the rest of the work is back-substitution.
Homogeneous: slopes are constant along rays from the origin → $v = y/x$ always works.
Bernoulli: a single $y^n$ spoils linearity → kill it with $u = y^{1-n}$ .
Linear shift: when $ax + by + c$ appears as a unit, name it $v$ .
Numerical verification with SciPy and analytic verification with PyTorch autograd are cheap insurance against algebra mistakes — use them.
Real-world S-curves, drag, and reaction kinetics are all logistic / Bernoulli equations in disguise.

The Substitution Principle:

"Don't solve the problem you have. Rename it so it becomes a problem you've already solved."

Coming Next: Section 21.4 turns these tools toward their most famous applications — exponential growth and decay. We'll meet radioactive isotopes, compound interest, and the half-life formula as one unified story.