Chapter 9

Integration by Substitution (U-Substitution)

The Indefinite Integral and Antiderivatives

Learning Objectives

By the end of this section, you will be able to:

  1. Recognize when an integral is suitable for u-substitution by identifying the chain rule pattern in reverse
  2. Apply the u-substitution method to evaluate both indefinite and definite integrals
  3. Transform integration limits when using u-substitution with definite integrals
  4. Master pattern recognition for common substitution types: powers, exponentials, trigonometric, and logarithmic
  5. Connect u-substitution to the chain rule and understand why this technique works
  6. Apply these concepts to machine learning: reparameterization tricks, normalizing flows, and gradient computation

The Big Picture: Reversing the Chain Rule

"U-substitution is the art of seeing composite functions inside integrals and unwinding them — it turns complicated integrals into simple ones by recognizing hidden structure."

In the previous section, we learned basic integration rules that work when the integrand matches standard forms directly. But what happens when we encounter integrals like:

$$\int 2x \cdot e^{x^2} \, dx \quad \text{or} \quad \int \cos(x) \cdot \sin^3(x) \, dx$$

These don't match any basic form — they involve compositions of functions. U-substitution is the technique that handles precisely these cases by reversing the chain rule.

The Key Insight

Remember the chain rule from differentiation: $\frac{d}{dx}[F(g(x))] = F'(g(x)) \cdot g'(x)$

Reading this equation from right to left gives us the integration rule: $\int F'(g(x)) \cdot g'(x) \, dx = F(g(x)) + C$

U-substitution is simply a systematic way to recognize and apply this pattern!


Historical Context: From Chain Rule to Substitution

The method of substitution has been used since the earliest days of calculus. Both Newton and Leibniz recognized that differentiation and integration are inverse operations, and that the chain rule for differentiation should have a corresponding rule for integration.

Leibniz's Notation

Leibniz's notation $\frac{du}{dx}$ makes substitution feel almost algebraic. When we write $du = \frac{du}{dx} \, dx$, we're treating the differential as if it can be "solved for" and substituted, and remarkably, this informal manipulation gives correct results!

While mathematicians later formalized these manipulations rigorously using limits and the chain rule, Leibniz's intuitive notation remains the most practical tool for performing substitutions.

Modern Applications

Today, u-substitution appears throughout applied mathematics:

  • Physics: Simplifying integrals in mechanics, electromagnetism, and quantum mechanics
  • Probability: Computing expectations via change of variables
  • Machine Learning: The reparameterization trick in VAEs and normalizing flows
  • Signal Processing: Fourier and Laplace transforms involve sophisticated substitutions

The Chain Rule in Reverse

To understand u-substitution, let's first recall exactly how the chain rule works, and then see how to read it backward.

The Chain Rule (Differentiation)

If $y = F(u)$ and $u = g(x)$, then:

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = F'(u) \cdot g'(x) = F'(g(x)) \cdot g'(x)$$

Example: Let $y = (x^2 + 1)^4$. Here $u = x^2 + 1$ and $y = u^4$.

$$\frac{dy}{dx} = \frac{d}{du}(u^4) \cdot \frac{d}{dx}(x^2 + 1) = 4u^3 \cdot 2x = 4(x^2+1)^3 \cdot 2x = 8x(x^2+1)^3$$

Reading the Chain Rule Backward (Integration)

Now imagine we're given the integral:

$$\int 8x(x^2+1)^3 \, dx$$

We recognize this as the derivative of $(x^2+1)^4$! So:

$$\int 8x(x^2+1)^3 \, dx = (x^2+1)^4 + C$$

U-substitution systematizes this recognition process, letting us handle more complex cases without needing to guess the answer.
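As a quick numerical sanity check on this backward reading, the sketch below approximates $\int_0^1 8x(x^2+1)^3 \, dx$ with composite Simpson's rule and compares it with $(x^2+1)^4$ evaluated at the endpoints. The helper name `simpson` and the interval $[0, 1]$ are illustrative choices, not part of the original text.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(x):
    return 8 * x * (x**2 + 1) ** 3


def antiderivative(x):
    return (x**2 + 1) ** 4


numeric = simpson(integrand, 0.0, 1.0)
exact = antiderivative(1.0) - antiderivative(0.0)  # 2**4 - 1**4 = 15
print(numeric, exact)
```

The two values agree to many decimal places, which is what the recognized antiderivative predicts.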


The U-Substitution Method

Here is the formal procedure for u-substitution:

The U-Substitution Algorithm

  1. Identify the inner function: Look for a composite function $f(g(x))$ in the integrand
  2. Choose u: Let $u = g(x)$, the inner function
  3. Compute du: Find $\frac{du}{dx} = g'(x)$, then write $du = g'(x) \, dx$
  4. Substitute: Replace all $x$-expressions with $u$-expressions
  5. Integrate: Evaluate the (hopefully simpler) integral in terms of $u$
  6. Back-substitute: Replace $u$ with $g(x)$ to get the answer in terms of $x$
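The six steps can be sketched as a small SymPy helper. This is a minimal illustration under stated assumptions, not a general integrator: the function name `u_substitute` is hypothetical, the caller supplies the candidate inner function `g`, and the helper simply gives up when any `x` is left over after substituting.

```python
import sympy as sp

x, u = sp.symbols("x u")


def u_substitute(integrand, g):
    """Try to integrate `integrand` dx with the substitution u = g(x).

    Returns an antiderivative in terms of x, or None when x-terms remain
    after substituting (meaning this choice of u does not fit).
    """
    du_dx = sp.diff(g, x)                      # Step 3: du = g'(x) dx
    in_u = (integrand / du_dx).subs(g, u)      # Step 4: divide out g'(x), replace g(x) by u
    if in_u.has(x):                            # leftover x: substitution failed
        return None
    antiderivative_u = sp.integrate(in_u, u)   # Step 5: integrate in u
    return antiderivative_u.subs(u, g)         # Step 6: back-substitute


result = u_substitute(2 * x * sp.exp(x**2), x**2)
failed = u_substitute(x * sp.sqrt(x**3 + 5), x**3 + 5)
print(result)  # exp(x**2)
print(failed)  # None
```

The second call returns `None` because dividing by $g'(x) = 3x^2$ leaves a factor of $1/x$ that cannot be written in terms of $u$ alone.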

Example: The Complete Process

Let's evaluate $\int 2x \cdot e^{x^2} \, dx$ step by step:

Step 1: Identify the composition

We have $e^{x^2}$, which is $e^u$ with $u = x^2$.

Step 2: Choose u

Let $u = x^2$

Step 3: Compute du

$\frac{du}{dx} = 2x$, so $du = 2x \, dx$

Notice: the $2x \, dx$ in our integral is exactly $du$!

Step 4: Substitute

$$\int 2x \cdot e^{x^2} \, dx = \int e^u \, du$$

Step 5: Integrate

$$\int e^u \, du = e^u + C$$

Step 6: Back-substitute

$$e^u + C = e^{x^2} + C$$

Verification

Always verify by differentiating: $\frac{d}{dx}[e^{x^2}] = e^{x^2} \cdot 2x = 2x \cdot e^{x^2}$
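The same verification can be done numerically when no symbolic tool is at hand. The sketch below compares a central-difference derivative of the candidate antiderivative $e^{x^2}$ with the original integrand at a few sample points; the step size `h` and the sample points are arbitrary illustrative choices.

```python
import math


def F(x):
    return math.exp(x**2)            # candidate antiderivative


def f(x):
    return 2 * x * math.exp(x**2)    # original integrand


def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


sample_points = (-1.0, 0.0, 0.5, 1.5)
max_error = max(abs(central_difference(F, x) - f(x)) for x in sample_points)
print(max_error)
```

A tiny `max_error` (limited by floating-point rounding) indicates that $F'(x)$ matches the integrand at those points.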

Interactive: Step-by-Step Examples

[Interactive widget: a step-by-step walkthrough of u-substitution, starting from the original integral $\int 2x(x^2 + 1)^3 \, dx$. Key insight: the factor $2x$ is exactly the derivative of the inner function $x^2 + 1$.]

Pattern Recognition: Choosing the Right u

The hardest part of u-substitution is choosing what to call $u$. Here are the key patterns to recognize:

Pattern 1: Look for the Inner Function

In a composite function $f(g(x))$, choose $u = g(x)$, the inner function.

| Integral | Inner Function u | Why It Works |
| --- | --- | --- |
| ∫cos(3x) dx | u = 3x | 3x is inside cos |
| ∫(x² + 1)⁵ · 2x dx | u = x² + 1 | x² + 1 is the base of the power |
| ∫sin(x)/cos²(x) dx | u = cos(x) | cos(x) is the denominator base |
| ∫ln(x)/x dx | u = ln(x) | ln(x) is being operated on |

Pattern 2: The Derivative Must Be Present

For the substitution to work cleanly, the derivative of your chosen $u$ must appear in the integrand (possibly with a constant factor).

Good Choice ✓

$$\int 3x^2 \sqrt{x^3 + 5} \, dx$$

If $u = x^3 + 5$, then $du = 3x^2 \, dx$. The $3x^2$ is present!

Bad Choice ✗

$$\int x \sqrt{x^3 + 5} \, dx$$

If $u = x^3 + 5$, then $du = 3x^2 \, dx$, but we only have $x \, dx$. Missing an $x$!

Pattern 3: Adjusting for Constant Factors

If the derivative appears with a different constant factor, you can compensate:

Example: $\int x \cdot e^{x^2} \, dx$

Let $u = x^2$, so $du = 2x \, dx$.

We have $x \, dx$, which is $\frac{1}{2} du$.

$$\int x \cdot e^{x^2} \, dx = \int e^u \cdot \frac{1}{2} du = \frac{1}{2} e^u + C = \frac{1}{2} e^{x^2} + C$$

What You Cannot Do

You can only adjust for constant factors. You cannot compensate for missing $x$ terms by "dividing by x"; that doesn't work!

If the derivative of $u$ doesn't appear (up to a constant), you need a different technique or a different choice of $u$.
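One way to make the "constant factor only" rule concrete is to test whether the integrand equals a constant times $f(g(x)) \, g'(x)$. The sketch below does this numerically at a handful of sample points, so it is a heuristic check rather than a proof; the function name `ratio_is_constant`, its parameters, and the sample points are all hypothetical.

```python
import math


def ratio_is_constant(integrand, outer, g, g_prime, points, tol=1e-9):
    """Return (True, c) if integrand(x) == c * outer(g(x)) * g_prime(x) at every sample point."""
    ratios = [integrand(x) / (outer(g(x)) * g_prime(x)) for x in points]
    c = ratios[0]
    ok = all(abs(r - c) <= tol * max(1.0, abs(c)) for r in ratios)
    return ok, (c if ok else None)


points = [0.5, 1.0, 2.0, 3.0]


def g(x):
    return x**3 + 5


def g_prime(x):
    return 3 * x**2


# Good choice: 3x^2 sqrt(x^3 + 5) matches sqrt(u) du exactly (constant 1)
good = ratio_is_constant(lambda x: 3 * x**2 * math.sqrt(x**3 + 5), math.sqrt, g, g_prime, points)

# Bad choice: x sqrt(x^3 + 5) leaves a ratio of 1/(3x), which varies with x
bad = ratio_is_constant(lambda x: x * math.sqrt(x**3 + 5), math.sqrt, g, g_prime, points)

# Constant-factor case: x e^(x^2) with u = x^2 gives the constant 1/2
half = ratio_is_constant(lambda x: x * math.exp(x**2), math.exp, lambda x: x**2, lambda x: 2 * x, points)

print(good, bad, half)
```

A constant ratio is the factor you pull outside the integral; a ratio that changes with $x$ signals that this choice of $u$ will not work.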

Interactive: Visualizing the Transformation

Watch how u-substitution transforms both the function AND the integration limits. The key insight: the shaded areas are equal before and after substitution!

[Interactive figure: the original integral $\int_{0}^{2} 2x(x^2 + 1)^2 \, dx$, with adjustable lower limit $a = 0$ and upper limit $b = 2$; shaded area ≈ 41.333.]

U-Substitution with Definite Integrals

When applying u-substitution to definite integrals, you have two options:

Method 1: Change the Limits

Transform the limits of integration from $x$-values to $u$-values. This eliminates the need to back-substitute!

Changing Limits Formula

$$\int_a^b f(g(x)) \cdot g'(x) \, dx = \int_{g(a)}^{g(b)} f(u) \, du$$

When $x = a$, $u = g(a)$. When $x = b$, $u = g(b)$.

Example: Evaluate $\int_0^2 2x(x^2 + 1)^3 \, dx$

Let $u = x^2 + 1$, so $du = 2x \, dx$

Change limits: When $x = 0$, $u = 0^2 + 1 = 1$. When $x = 2$, $u = 2^2 + 1 = 5$.

$$\int_0^2 2x(x^2 + 1)^3 \, dx = \int_1^5 u^3 \, du = \left[ \frac{u^4}{4} \right]_1^5 = \frac{625}{4} - \frac{1}{4} = \frac{624}{4} = 156$$

Method 2: Back-Substitute First

Alternatively, find the indefinite integral in terms of $x$, then apply the original limits.

First, find the antiderivative: $\int 2x(x^2 + 1)^3 \, dx = \frac{(x^2+1)^4}{4} + C$

Then apply original limits:

$$\left[ \frac{(x^2+1)^4}{4} \right]_0^2 = \frac{(5)^4}{4} - \frac{(1)^4}{4} = \frac{625 - 1}{4} = 156$$
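Both methods can be checked side by side. The sketch below evaluates the example exactly with `fractions.Fraction` using each method, then compares against a midpoint Riemann sum of the original integrand; the helper names and the number of subintervals are illustrative assumptions.

```python
from fractions import Fraction

a, b = 0, 2


def g(x):
    return x**2 + 1                       # inner function, u = g(x)


def F_u(u):
    return Fraction(u**4, 4)              # antiderivative of u**3 in terms of u


def F_x(x):
    return Fraction((x**2 + 1) ** 4, 4)   # the same antiderivative, back-substituted


# Method 1: transform the limits and stay in u (u runs from g(0) = 1 to g(2) = 5)
method_1 = F_u(g(b)) - F_u(g(a))

# Method 2: back-substitute first and keep the original x-limits
method_2 = F_x(b) - F_x(a)

# Independent check: midpoint Riemann sum of the original integrand in x
n = 100_000
h = (b - a) / n
riemann = h * sum(2 * x * (x**2 + 1) ** 3 for x in (a + (i + 0.5) * h for i in range(n)))

print(method_1, method_2, riemann)
```

Both exact results equal 156, and the Riemann sum lands very close to the same value.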

Which Method to Use?

Method 1 (Change limits) is usually faster because you never have to back-substitute. It's especially useful for computer algebra systems.

Method 2 (Back-substitute) is useful when you need the antiderivative for other purposes, or when you're more comfortable working in the original variable.


Advanced Substitution Patterns

Exponential with Linear Argument

$$\int e^{ax+b} \, dx \quad (a \neq 0)$$

Let $u = ax + b$, so $du = a \, dx$, meaning $dx = \frac{1}{a} du$

$$\int e^{ax+b} \, dx = \frac{1}{a} \int e^u \, du = \frac{1}{a} e^u + C = \frac{1}{a} e^{ax+b} + C$$

Trigonometric Compositions

$$\int \sin^n(x) \cos(x) \, dx \quad (n \neq -1)$$

Let $u = \sin(x)$, so $du = \cos(x) \, dx$

$$\int \sin^n(x) \cos(x) \, dx = \int u^n \, du = \frac{u^{n+1}}{n+1} + C = \frac{\sin^{n+1}(x)}{n+1} + C$$

For $n = -1$ the power rule does not apply; that case is the logarithmic pattern.

Logarithmic Patterns

$$\int \frac{f'(x)}{f(x)} \, dx$$

Let $u = f(x)$, so $du = f'(x) \, dx$

$$\int \frac{f'(x)}{f(x)} \, dx = \int \frac{1}{u} \, du = \ln|u| + C = \ln|f(x)| + C$$

Summary Table of Common Substitutions

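| Integral Form | Substitution | Result |
| --- | --- | --- |
| ∫f(ax + b) dx, a ≠ 0 | u = ax + b | (1/a)F(ax + b) + C |
| ∫f(g(x))g'(x) dx | u = g(x) | F(u) + C = F(g(x)) + C |
| ∫[f(x)]ⁿ f'(x) dx, n ≠ −1 | u = f(x) | [f(x)]ⁿ⁺¹/(n+1) + C |
| ∫f'(x)/f(x) dx | u = f(x) | ln\|f(x)\| + C |
| ∫eᶠ⁽ˣ⁾f'(x) dx | u = f(x) | eᶠ⁽ˣ⁾ + C |

Here F denotes an antiderivative of f.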
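Each row of the table can be spot-checked by differentiating the claimed result and comparing it with the integrand. The sketch below does this in SymPy for one concrete instance per row; the specific functions chosen (cosine, $x^2 + 1$, and so on) are illustrative assumptions, and a passing check on one instance is not a proof of the general rule.

```python
import sympy as sp

x = sp.symbols("x")

# (integrand, claimed antiderivative) pairs, one concrete instance per table row
rows = [
    (sp.cos(3 * x + 1), sp.sin(3 * x + 1) / 3),            # f(ax + b) with f = cos, a = 3, b = 1
    (sp.cos(x**2) * 2 * x, sp.sin(x**2)),                  # f(g(x)) g'(x) with g = x^2
    ((x**2 + 1) ** 5 * 2 * x, (x**2 + 1) ** 6 / 6),        # [f(x)]^n f'(x) with n = 5
    (2 * x / (x**2 + 1), sp.log(x**2 + 1)),                # f'(x) / f(x), f > 0 so no |.| needed
    (sp.exp(sp.sin(x)) * sp.cos(x), sp.exp(sp.sin(x))),    # e^{f(x)} f'(x) with f = sin
]

# Differentiate each claimed result and compare with its integrand
checks = [sp.simplify(sp.diff(result, x) - integrand) == 0 for integrand, result in rows]
print(checks)
```

Every entry in `checks` should be `True` for these instances.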

Real-World Applications

Physics: Work Done by Variable Force

When a force varies with position, the work done is an integral. If the force is a composite function, u-substitution is needed.

Example: A spring with non-linear force law $F(x) = k \cdot x \cdot e^{-x^2}$

Work from $x = 0$ to $x = 1$:

$$W = \int_0^1 kx \cdot e^{-x^2} \, dx$$

With $u = -x^2$, $du = -2x \, dx$, so $x \, dx = -\frac{1}{2} du$ and the limits $x = 0, 1$ become $u = 0, -1$:

$$W = -\frac{k}{2} \int_0^{-1} e^u \, du = -\frac{k}{2}[e^u]_0^{-1} = -\frac{k}{2}(e^{-1} - 1) = \frac{k}{2}(1 - e^{-1})$$
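The sign flip and the reversed limits are easy to get wrong, so a numerical cross-check is worthwhile. The sketch below evaluates the work integral three ways: directly in $x$, in $u$ with the transformed limits $0 \to -1$, and from the closed form. The value `k = 1.0` is a hypothetical spring constant and `simpson` is an illustrative helper.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with even n; also handles b < a (h is then negative)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


k = 1.0  # hypothetical spring constant

# Original integral in x
W_x = simpson(lambda x: k * x * math.exp(-(x**2)), 0.0, 1.0)

# After u = -x^2: W = -(k/2) * integral of e^u from u = 0 to u = -1
W_u = -(k / 2) * simpson(math.exp, 0.0, -1.0)

# Closed form from the derivation
W_closed = (k / 2) * (1 - math.exp(-1))

print(W_x, W_u, W_closed)
```

All three values agree, which confirms that the negative sign from $du = -2x \, dx$ and the reversed $u$-limits cancel as the derivation claims.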

Probability: Computing Expectations

Expected values often require integration of composite functions. For example, if $X$ follows a distribution with PDF $f(x)$, and we want $\mathbb{E}[g(X)]$:

$$\mathbb{E}[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx$$

The change of variables formula (u-substitution extended to probability) allows us to transform this integral when we know the distribution of $Y = g(X)$.
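As a small worked instance, assume $X \sim \text{Uniform}(0, 1)$ and $g(x) = x^2$ (both chosen here purely for illustration). Then $Y = X^2$ has density $f_Y(y) = \frac{1}{2\sqrt{y}}$ on $(0, 1)$, and the expectation can be computed either in $x$ or in $y$. The sketch below checks that both integrals give $\frac{1}{3}$ using a midpoint rule, which avoids evaluating the $y$-density at $y = 0$.

```python
import math


def midpoint(f, a, b, n=100_000):
    """Midpoint rule; never evaluates f at the endpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# E[g(X)] as an integral over x, with f_X(x) = 1 on [0, 1]
expectation_in_x = midpoint(lambda x: x**2 * 1.0, 0.0, 1.0)

# The same expectation as an integral over y, using f_Y(y) = 1 / (2 sqrt(y))
expectation_in_y = midpoint(lambda y: y * (1.0 / (2.0 * math.sqrt(y))), 0.0, 1.0)

print(expectation_in_x, expectation_in_y)
```

The factor $\frac{1}{2\sqrt{y}}$ is exactly $\left|\frac{dx}{dy}\right|$ for $x = \sqrt{y}$, the same Jacobian-style factor that appears in u-substitution.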

Economics: Present Value of Income Stream

The present value of a continuous income stream $R(t)$ with continuous discounting at rate $r$ is:

$$PV = \int_0^T R(t) e^{-rt} \, dt$$

When $R(t)$ has a specific form (like exponential growth), u-substitution simplifies the calculation.
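For instance, assume exponential growth $R(t) = R_0 e^{gt}$ with growth rate $g \neq r$. The symbols $R_0$ and $g$ are introduced here only for illustration. A sketch of the substitution, using $u = -(r - g)t$:

```latex
\begin{aligned}
PV &= \int_0^T R_0 e^{gt} e^{-rt} \, dt = R_0 \int_0^T e^{-(r-g)t} \, dt \\
   &\quad \text{let } u = -(r-g)t, \quad du = -(r-g) \, dt, \quad
      t = 0 \Rightarrow u = 0, \quad t = T \Rightarrow u = -(r-g)T \\
   &= -\frac{R_0}{r-g} \int_0^{-(r-g)T} e^{u} \, du
    = -\frac{R_0}{r-g} \left( e^{-(r-g)T} - 1 \right)
    = \frac{R_0}{r-g} \left( 1 - e^{-(r-g)T} \right)
\end{aligned}
```

This is the exponential-with-linear-argument pattern with $a = -(r - g)$ and $b = 0$.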


Machine Learning Connection

U-substitution (change of variables) is fundamental to several key machine learning techniques. Understanding the mathematical basis helps you understand why these methods work.

The Reparameterization Trick (VAEs)

In Variational Autoencoders, we need to backpropagate through stochastic nodes. The reparameterization trick does exactly this:

Instead of sampling $z \sim \mathcal{N}(\mu, \sigma^2)$

We sample $\epsilon \sim \mathcal{N}(0, 1)$ and compute $z = \mu + \sigma \epsilon$

This is a change of variables! The transformation $z = \mu + \sigma \epsilon$ is exactly the kind of substitution we've been studying.
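To see why the substitution helps with gradients, the sketch below estimates $\frac{\partial}{\partial \mu} \mathbb{E}[z^2]$ and $\frac{\partial}{\partial \sigma} \mathbb{E}[z^2]$ by differentiating through $z = \mu + \sigma \epsilon$ sample by sample. The test function $h(z) = z^2$ is an assumption made for illustration because its expectation $\mu^2 + \sigma^2$ has known gradients $2\mu$ and $2\sigma$; the parameter values and sample count are also arbitrary.

```python
import random

random.seed(0)  # fixed seed so the estimate is reproducible

mu, sigma = 2.0, 0.5
n = 200_000

grad_mu = 0.0
grad_sigma = 0.0
for _ in range(n):
    eps = random.gauss(0.0, 1.0)   # randomness does not depend on mu or sigma
    z = mu + sigma * eps           # the substitution
    dh_dz = 2 * z                  # derivative of h(z) = z**2
    grad_mu += dh_dz * 1.0         # chain rule: dz/dmu = 1
    grad_sigma += dh_dz * eps      # chain rule: dz/dsigma = eps

grad_mu /= n
grad_sigma /= n
print(grad_mu, grad_sigma)  # Monte Carlo estimates of 2*mu = 4.0 and 2*sigma = 1.0
```

Because $\epsilon$ carries all the randomness, $\mu$ and $\sigma$ enter only through a deterministic formula, so ordinary chain-rule derivatives can be averaged over samples.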

Normalizing Flows

Normalizing flows learn complex probability distributions by chaining together simple, invertible transformations. The change of variables formula (the multivariate generalization of u-substitution) gives:

$$p(y) = p(x) \left| \det \frac{\partial x}{\partial y} \right| = p(x) \left| \det \frac{\partial f^{-1}}{\partial y} \right|$$

This is why normalizing flows require computing Jacobian determinants: it's the multidimensional version of the $|du/dx|$ factor from u-substitution!

Backpropagation and the Chain Rule

The chain rule for differentiation (which u-substitution reverses for integration) is the foundation of backpropagation. When we compute:

$$\frac{\partial L}{\partial \theta} = \frac{\partial L}{\partial y} \cdot \frac{\partial y}{\partial z} \cdot \frac{\partial z}{\partial \theta}$$

Each multiplication corresponds to a "substitution": we're chaining together derivatives through composed functions, exactly as the chain rule dictates.

The Deep Connection

U-substitution for integration and the chain rule for differentiation are two sides of the same coin. Every neural network gradient computation relies on the chain rule, which means understanding u-substitution helps you understand why backpropagation works.


Python Implementation

Symbolic and Numerical U-Substitution

Here's how to verify u-substitution using Python's symbolic and numerical libraries:

U-Substitution: Symbolic and Numerical
🐍u_substitution_demo.py
Define the Variable

We use SymPy to create a symbolic variable x. This allows us to manipulate mathematical expressions algebraically.

Original Integrand

The integrand 2x(x² + 1)³ is a classic u-substitution problem. The 2x factor is exactly the derivative of (x² + 1).

SymPy Integration

SymPy's integrate() function chooses among several integration algorithms internally. For a polynomial integrand like this one it may return an expanded antiderivative, which can differ from the hand-derived (x² + 1)⁴/4 by a constant.

Manual U-Substitution

We verify by manually setting u = x² + 1. The integral transforms to ∫u³ du, which integrates to u⁴/4.

Back-Substitution

After integrating in terms of u, we substitute back u = x² + 1 to get the final answer in terms of x.

Numerical Verification

We use scipy.integrate.quad() to numerically compute the definite integral, verifying our symbolic result.
import sympy as sp
from scipy import integrate


def demonstrate_u_substitution():
    """
    Demonstrate u-substitution both symbolically and numerically.
    Example: ∫ 2x(x² + 1)³ dx
    """
    # Symbolic approach with SymPy
    x = sp.Symbol('x')

    # Original integrand
    f = 2*x * (x**2 + 1)**3
    print("Original integrand: f(x) =", f)

    # Direct symbolic integration. SymPy picks its own algorithm, so the
    # result may be expanded and differ from the manual answer by a constant.
    F = sp.integrate(f, x)
    print("Antiderivative: F(x) =", F)

    # Manual u-substitution verification
    u = sp.Symbol('u')
    # If u = x² + 1, then du = 2x dx
    # The integral becomes ∫ u³ du
    g = u**3
    G = sp.integrate(g, u)
    print("After u-sub: ∫u³ du =", G)

    # Substitute back: u = x² + 1
    result = G.subs(u, x**2 + 1)
    print("Back-substituted:", result)

    # Both antiderivatives must differentiate back to the integrand
    assert sp.expand(sp.diff(F, x) - f) == 0
    assert sp.expand(sp.diff(result, x) - f) == 0

    # Numerical verification for definite integral
    a, b = 0, 2
    numerical_result, _ = integrate.quad(
        lambda t: 2*t * (t**2 + 1)**3, a, b
    )

    # Symbolic definite integral (any constant offset cancels here)
    symbolic_result = float(F.subs(x, b) - F.subs(x, a))

    print(f"\nDefinite integral from {a} to {b}:")
    print(f"  Numerical: {numerical_result:.6f}")
    print(f"  Symbolic:  {symbolic_result:.6f}")
    print(f"  Match: {abs(numerical_result - symbolic_result) < 1e-9}")


demonstrate_u_substitution()

U-Substitution in Machine Learning

See how u-substitution concepts appear in VAEs, normalizing flows, and gradient computation:

U-Substitution in ML
🐍u_substitution_ml.py
Reparameterization Trick

The VAE reparameterization trick is fundamentally a change of variables (u-substitution). Instead of sampling z ~ N(μ, σ²) directly, we sample ε ~ N(0, 1) and compute z = μ + σε.

The Substitution

z = μ + σε is the substitution. This allows gradients to flow through μ and σ while ε provides the stochasticity. The math is identical to u-substitution in integration.

Normalizing Flows

Normalizing flows use the multivariate generalization of u-substitution. The Jacobian determinant accounts for how the transformation stretches or compresses probability density.

Change of Variables Formula

p(y) = p(x) · |dx/dy| is the change of variables formula for probability densities. This is exactly the 1D Jacobian from u-substitution: if y = f(x), then dy = f'(x)dx.

Backpropagation as Chain Rule

Backpropagation IS the chain rule, which is differentiation's analog of u-substitution. Each layer's gradient computation uses the same 'substitution' logic.

Gradient Chain

The product dL/dz · dz/du · du/dθ mirrors how u-substitution works: we 'chain' together the derivatives just as we chain substitutions.
import numpy as np

# Seeded generator so the printed statistics are reproducible
rng = np.random.default_rng(0)


def reparameterization_connection():
    """
    U-substitution is the mathematical foundation of the
    reparameterization trick in Variational Autoencoders (VAEs).

    In VAEs, we need to compute gradients through random samples.
    The trick: instead of sampling z ~ N(μ, σ²), we sample ε ~ N(0, 1)
    and compute z = μ + σε. This is a change of variables!
    """
    print("=" * 60)
    print("REPARAMETERIZATION TRICK - U-Substitution in ML")
    print("=" * 60)

    # Parameters of the target Gaussian N(μ, σ²)
    mu = 2.0      # Mean of target distribution
    sigma = 0.5   # Std dev of target distribution

    # The change of variables formula for probability densities:
    # If Z = μ + σε where ε ~ N(0, 1)
    # Then p(z) = p(ε) / |dz/dε| = p(ε) / σ

    def sample_direct(n_samples):
        """Direct sampling from N(μ, σ²)"""
        return rng.normal(mu, sigma, n_samples)

    def sample_reparameterized(n_samples):
        """Reparameterized sampling: z = μ + σε, ε ~ N(0, 1)"""
        epsilon = rng.normal(0, 1, n_samples)
        return mu + sigma * epsilon  # This IS the substitution!

    # Verify both methods produce the same distribution
    n = 100_000
    direct_samples = sample_direct(n)
    reparam_samples = sample_reparameterized(n)

    print(f"\nDirect sampling:       mean = {np.mean(direct_samples):.4f}, "
          f"std = {np.std(direct_samples):.4f}")
    print(f"Reparameterized:       mean = {np.mean(reparam_samples):.4f}, "
          f"std = {np.std(reparam_samples):.4f}")
    print(f"Expected:              mean = {mu:.4f}, std = {sigma:.4f}")


def jacobian_in_normalizing_flows():
    """
    Normalizing flows use u-substitution's multivariable form:
    the Jacobian determinant for density transformation.

    If y = f(x), then p(y) = p(x) · |det(∂x/∂y)|
    """
    print("\n" + "=" * 60)
    print("NORMALIZING FLOWS - Jacobian from U-Substitution")
    print("=" * 60)

    # Simple 1D flow: y = exp(x) (mapping R to R+)
    def flow_forward(x):
        """Transform x to y"""
        return np.exp(x)

    def flow_inverse(y):
        """Transform y back to x"""
        return np.log(y)

    def log_det_jacobian(x):
        """Log |dy/dx| = log(exp(x)) = x"""
        return x

    def standard_normal_pdf(x):
        return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

    def density_via_change_of_variables(y):
        """p(y) = p(x) · |dx/dy| = p(x) / |dy/dx|, evaluated at x = f⁻¹(y)"""
        x = flow_inverse(y)
        return standard_normal_pdf(x) * np.exp(-log_det_jacobian(x))

    # Known closed form for comparison (log-normal density)
    def log_normal_pdf(y, mu=0, sigma=1):
        return (1 / (y * sigma * np.sqrt(2 * np.pi))) * \
               np.exp(-0.5 * ((np.log(y) - mu) / sigma)**2)

    # Start with standard normal samples and push them through the flow
    n = 10_000
    x_samples = rng.normal(0, 1, n)
    y_samples = flow_forward(x_samples)

    # Verify that the change-of-variables density matches the log-normal density
    y_test = np.linspace(0.1, 5, 100)
    densities_match = np.allclose(
        density_via_change_of_variables(y_test), log_normal_pdf(y_test)
    )

    print("\nFlow transformation: y = exp(x)")
    print("Input distribution: X ~ N(0, 1)")
    print("Output distribution: Y ~ LogNormal(0, 1)")
    print(f"Change-of-variables density matches log-normal: {densities_match}")
    print(f"\nSample mean of Y: {np.mean(y_samples):.4f}")
    print(f"Theoretical mean:  {np.exp(0.5):.4f}")  # E[Y] = exp(μ + σ²/2)


def gradient_computation_example():
    """
    The chain rule (which u-substitution reverses) enables efficient
    gradient computation through composed functions - the core of
    backpropagation.
    """
    print("\n" + "=" * 60)
    print("GRADIENT COMPUTATION - Chain Rule as U-Substitution")
    print("=" * 60)

    # Loss function: L(θ) = (sigmoid(θx) - y)²
    # We need ∂L/∂θ

    # Name the intermediate quantities, as in a substitution:
    # If u = θx, z = sigmoid(u), L = (z - y)²
    # Then ∂L/∂θ = ∂L/∂z · ∂z/∂u · ∂u/∂θ

    def sigmoid(x):
        return 1 / (1 + np.exp(-np.clip(x, -500, 500)))

    def sigmoid_derivative(x):
        s = sigmoid(x)
        return s * (1 - s)

    # Example values
    theta = 2.0
    x = 1.5
    y_true = 0.8

    # Forward pass (computing u, z, L)
    u = theta * x
    z = sigmoid(u)
    L = (z - y_true)**2

    # Backward pass (chain rule, one factor per intermediate variable)
    dL_dz = 2 * (z - y_true)
    dz_du = sigmoid_derivative(u)
    du_dtheta = x

    # Total gradient
    dL_dtheta = dL_dz * dz_du * du_dtheta

    # Finite-difference check of the analytic gradient
    def loss(t):
        return (sigmoid(t * x) - y_true)**2

    h = 1e-6
    numeric_grad = (loss(theta + h) - loss(theta - h)) / (2 * h)

    print("Forward pass:")
    print(f"  u = θx = {theta} × {x} = {u}")
    print(f"  z = sigmoid(u) = {z:.6f}")
    print(f"  L = (z - y)² = ({z:.4f} - {y_true})² = {L:.6f}")
    print("\nBackward pass (chain rule = reverse u-substitution):")
    print(f"  ∂L/∂z = 2(z - y) = {dL_dz:.6f}")
    print(f"  ∂z/∂u = sigmoid'(u) = {dz_du:.6f}")
    print(f"  ∂u/∂θ = x = {du_dtheta}")
    print(f"\n∂L/∂θ = {dL_dz:.4f} × {dz_du:.4f} × {du_dtheta} = {dL_dtheta:.6f}")
    print(f"Finite-difference check: {numeric_grad:.6f}")


reparameterization_connection()
jacobian_in_normalizing_flows()
gradient_computation_example()

Common Mistakes to Avoid

Mistake 1: Forgetting to Substitute dx

Wrong: $\int \cos(x^2) \, dx$ with $u = x^2$ gives $\int \cos(u) \, dx$

Correct: Since $du = 2x \, dx$, we need $dx = \frac{du}{2x}$. But the integrand has no factor of $x$ to cancel the one that remains, so the substitution does not simplify anything. In fact this integral has no elementary antiderivative and needs a different approach.

Mistake 2: Not Changing Limits for Definite Integrals

When using Method 1 (changing limits), don't mix $x$-limits with $u$-integrands.

Wrong: $\int_0^1 u^2 \, du$ where 0 and 1 are $x$-values

Correct: If $u = g(x)$, use limits $g(0)$ and $g(1)$
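To see how much the mistake costs, take a hypothetical integral that produces $u^2$ after substitution: $\int_0^1 2x(x^2+1)^2 \, dx$ with $u = x^2 + 1$. This example is an assumption chosen to match the pattern above. The sketch compares the correct $u$-limits with the mistaken reuse of the $x$-limits, using exact fractions.

```python
from fractions import Fraction


def antiderivative_u(u):
    return Fraction(u) ** 3 / 3    # antiderivative of u**2


def g(x):
    return x**2 + 1                # substitution u = g(x)


# Correct: u runs from g(0) = 1 to g(1) = 2
correct = antiderivative_u(g(1)) - antiderivative_u(g(0))

# Wrong: the x-limits 0 and 1 reused as if they were u-limits
wrong = antiderivative_u(1) - antiderivative_u(0)

print(correct, wrong)  # 7/3 1/3
```

The two answers differ by a factor of seven, so a mismatch between limits and variable is not a small rounding issue.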

Mistake 3: Choosing the Wrong u

Common error: choosing $u$ to be the "complicated" part without checking if its derivative appears.

Tip: The derivative of your chosen $u$ should appear as a factor in the integrand (possibly with a constant multiple).

Mistake 4: Forgetting the Constant of Integration

For indefinite integrals, always include $+C$ in your final answer after back-substitution.

Mistake 5: Trying to 'Solve for x'

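Risky: If $u = x^2$, writing $x = \sqrt{u}$ and substituting without checking the sign of $x$

Why it often fails: Unless the interval guarantees $x \geq 0$ (or $x \leq 0$), this introduces $\pm$ ambiguity, and it often makes the integral more complicated. Prefer matching $du$ against factors already present in the integrand, and solve for $x$ only when the substitution is invertible on the region of integration.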


Test Your Understanding

What is the best choice of $u$ for this integral?

$$\int 6x^2 \sqrt{x^3 + 1} \, dx$$

Summary

U-substitution is the most important technique for evaluating integrals involving composite functions. It works by reversing the chain rule.

Key Concepts

| Concept | Description | Key Formula |
| --- | --- | --- |
| Core Idea | Reverse the chain rule | ∫f(g(x))g'(x) dx = F(g(x)) + C |
| The Substitution | Let u = g(x), inner function | du = g'(x) dx |
| Definite Integrals | Change limits or back-substitute | x ∈ [a,b] → u ∈ [g(a), g(b)] |
| Pattern Recognition | Look for derivative of inner function | Derivative of u must appear |

Key Takeaways

  1. Look for compositions: When you see $f(g(x))$, consider letting $u = g(x)$
  2. Check for the derivative: The factor $g'(x)$ (or a constant multiple) must appear in the integrand
  3. Transform completely: Every part of the integrand including $dx$ must become $u$-expressions
  4. For definite integrals: Either change the limits to $u$-values or back-substitute before evaluating
  5. Verify by differentiating: Always check your answer by differentiating and confirming you get the original integrand
  6. ML connection: U-substitution underlies the reparameterization trick, normalizing flows, and is the "inverse" of the chain rule used in backpropagation
The Core Insight:
"U-substitution transforms complicated integrals into simple ones by recognizing that every composite function hides a chain rule derivative."
Coming Next: In Integration by Parts, we'll learn another powerful technique — the integration analog of the product rule — that handles integrals involving products of different function types.