Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will be able to:

Recognize when an integral is suitable for u-substitution by identifying the chain rule pattern in reverse
Apply the u-substitution method to evaluate both indefinite and definite integrals
Transform integration limits when using u-substitution with definite integrals
Master pattern recognition for common substitution types: powers, exponentials, trigonometric, and logarithmic
Connect u-substitution to the chain rule and understand why this technique works
Apply these concepts to machine learning: reparameterization tricks, normalizing flows, and gradient computation

The Big Picture: Reversing the Chain Rule

"U-substitution is the art of seeing composite functions inside integrals and unwinding them — it turns complicated integrals into simple ones by recognizing hidden structure."

In the previous section, we learned basic integration rules that work when the integrand matches standard forms directly. But what happens when we encounter integrals like:

\int 2x \cdot e^{x^2} \, dx \quad \text{or} \quad \int \cos(x) \cdot \sin^3(x) \, dx

These don't match any basic form — they involve compositions of functions. U-substitution is the technique that handles precisely these cases by reversing the chain rule.

The Key Insight

Remember the chain rule from differentiation: $\frac{d}{dx}[F(g(x))] = F'(g(x)) \cdot g'(x)$

Reading this equation from right to left gives us the integration rule: $\int F'(g(x)) \cdot g'(x) \, dx = F(g(x)) + C$

U-substitution is simply a systematic way to recognize and apply this pattern!

Historical Context: From Chain Rule to Substitution

The method of substitution has been used since the earliest days of calculus. Both Newton and Leibniz recognized that differentiation and integration are inverse operations, and that the chain rule for differentiation should have a corresponding rule for integration.

Leibniz's Notation

Leibniz's notation $\frac{du}{dx}$ makes substitution feel almost algebraic. When we write $du = \frac{du}{dx} \, dx$, we're treating the differential as if it can be "solved for" and substituted, and remarkably, this informal manipulation gives correct results!

While mathematicians later formalized these manipulations rigorously using limits and the chain rule, Leibniz's intuitive notation remains the most practical tool for performing substitutions.

Modern Applications

Today, u-substitution appears throughout applied mathematics:

Physics: Simplifying integrals in mechanics, electromagnetism, and quantum mechanics
Probability: Computing expectations via change of variables
Machine Learning: The reparameterization trick in VAEs and normalizing flows
Signal Processing: Fourier and Laplace transforms involve sophisticated substitutions

The Chain Rule in Reverse

To understand u-substitution, let's first recall exactly how the chain rule works, and then see how to read it backward.

The Chain Rule (Differentiation)

If $y = F(u)$ and $u = g(x)$, then:

\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = F'(u) \cdot g'(x) = F'(g(x)) \cdot g'(x)

Example: Let $y = (x^2 + 1)^4$. Here $u = x^2 + 1$ and $y = u^4$.

\frac{dy}{dx} = \frac{d}{du}(u^4) \cdot \frac{d}{dx}(x^2 + 1) = 4u^3 \cdot 2x = 4(x^2+1)^3 \cdot 2x = 8x(x^2+1)^3

Reading the Chain Rule Backward (Integration)

Now imagine we're given the integral:

\int 8x(x^2+1)^3 \, dx

We recognize the integrand as the derivative of $(x^2+1)^4$, which we computed in the example above. So:

\int 8x(x^2+1)^3 \, dx = (x^2+1)^4 + C

U-substitution systematizes this recognition process, letting us handle more complex cases without needing to guess the answer.
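As a quick sanity check of this backward reading, the sketch below compares a numerical estimate of $\int_0^1 8x(x^2+1)^3 \, dx$ with $(x^2+1)^4$ evaluated between the same limits. The helper name `simpson`, the interval $[0, 1]$, and the step count are illustrative choices, not part of the lesson itself.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(x):
    return 8 * x * (x**2 + 1) ** 3


def antiderivative(x):
    # The function whose derivative we recognized in the integrand
    return (x**2 + 1) ** 4


numeric = simpson(integrand, 0.0, 1.0)
exact = antiderivative(1.0) - antiderivative(0.0)
print(f"numerical estimate: {numeric:.8f}")
print(f"F(1) - F(0):        {exact:.8f}")
```

The two printed values should agree closely, since the antiderivative gives $2^4 - 1^4 = 15$.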

The U-Substitution Method

Here is the formal procedure for u-substitution:

The U-Substitution Algorithm

Identify the inner function: Look for a composite function $f(g(x))$ in the integrand
Choose u: Let $u = g(x)$, the inner function
Compute du: Find $\frac{du}{dx} = g'(x)$, then write $du = g'(x) \, dx$
Substitute: Replace every $x$-expression, including $dx$, with its $u$-expression
Integrate: Evaluate the (hopefully simpler) integral in terms of $u$
Back-substitute: Replace $u$ with $g(x)$ to get the answer in terms of $x$

Example: The Complete Process

Let's evaluate $\int 2x \cdot e^{x^2} \, dx$ step by step:

Step 1: Identify the composition

We have $e^{x^2}$, which is $e^u$ with $u = x^2$.

Step 2: Choose u

Let $u = x^2$

Step 3: Compute du

$\frac{du}{dx} = 2x$, so $du = 2x \, dx$

Notice: the $2x \, dx$ in our integral is exactly $du$!

Step 4: Substitute

$\int 2x \cdot e^{x^2} \, dx = \int e^u \, du$

Step 5: Integrate

$\int e^u \, du = e^u + C$

Step 6: Back-substitute

$e^u + C = e^{x^2} + C$

Verification

Always verify by differentiating: $\frac{d}{dx}[e^{x^2}] = e^{x^2} \cdot 2x = 2x \cdot e^{x^2}$ ✓
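The verification step can also be done numerically. The sketch below, under the assumption that a central finite difference is accurate enough for a spot check, differentiates the candidate answer $e^{x^2}$ at a few arbitrary sample points and compares it with the integrand $2x \cdot e^{x^2}$. The function names and sample points are illustrative.

```python
import math


def candidate_antiderivative(x):
    return math.exp(x**2)


def integrand(x):
    return 2 * x * math.exp(x**2)


def central_difference(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


sample_points = [-1.0, 0.0, 0.5, 1.0]
max_error = max(
    abs(central_difference(candidate_antiderivative, x) - integrand(x))
    for x in sample_points
)
print(f"largest mismatch over the sample points: {max_error:.2e}")
```

A mismatch near zero supports the answer; a large mismatch would signal a slip in the substitution.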

Interactive: Step-by-Step Examples

Explore how u-substitution works through several examples. Use the slider to step through each stage of the substitution process:


\int 2x(x^2 + 1)^3 \, dx

Key Insight:

The 2x is exactly the derivative of the inner function x² + 1

Pattern Recognition: Choosing the Right u

The hardest part of u-substitution is choosing what to call $u$. Here are the key patterns to recognize:

Pattern 1: Look for the Inner Function

In a composite function $f(g(x))$, choose $u = g(x)$, the inner function.

Integral	Inner Function u	Why It Works
∫cos(3x) dx	u = 3x	3x is inside cos
∫(x² + 1)⁵ · 2x dx	u = x² + 1	x² + 1 is the base of the power
∫sin(x)/cos²(x) dx	u = cos(x)	cos(x) is the denominator base
∫ln(x)/x dx	u = ln(x)	ln(x) is being operated on
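One way to check the choices in this table is to carry each substitution through by hand and then differentiate the result. In the sketch below, the antiderivatives were worked out from each row's choice of $u$ (they are not listed in the table itself), and the test point `x0 = 0.7` is an arbitrary value where every integrand is defined.

```python
import math

# Each entry: (integrand, antiderivative obtained with the table's choice of u)
rows = {
    "u = 3x": (
        lambda x: math.cos(3 * x),
        lambda x: math.sin(3 * x) / 3,
    ),
    "u = x^2 + 1": (
        lambda x: (x**2 + 1) ** 5 * 2 * x,
        lambda x: (x**2 + 1) ** 6 / 6,
    ),
    "u = cos(x)": (
        lambda x: math.sin(x) / math.cos(x) ** 2,
        lambda x: 1 / math.cos(x),
    ),
    "u = ln(x)": (
        lambda x: math.log(x) / x,
        lambda x: math.log(x) ** 2 / 2,
    ),
}


def central_difference(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 0.7
errors = {
    label: abs(central_difference(F, x0) - f(x0))
    for label, (f, F) in rows.items()
}
for label, err in errors.items():
    print(f"{label:12s} |F'(x0) - f(x0)| = {err:.2e}")
```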

Pattern 2: The Derivative Must Be Present

For the substitution to work cleanly, the derivative of your chosen $u$ must appear in the integrand (possibly with a constant factor).

Good Choice ✓

$\int 3x^2 \sqrt{x^3 + 5} \, dx$

If $u = x^3 + 5$, then $du = 3x^2 \, dx$. The $3x^2$ is present!

Bad Choice ✗

$\int x \sqrt{x^3 + 5} \, dx$

If $u = x^3 + 5$, then $du = 3x^2 \, dx$, but we only have $x \, dx$. The integrand is missing a factor of $x$!

Pattern 3: Adjusting for Constant Factors

If the derivative appears with a different constant factor, you can compensate:

Example: $\int x \cdot e^{x^2} \, dx$

Let $u = x^2$, so $du = 2x \, dx$.

We have $x \, dx$, which is $\frac{1}{2} du$.

$\int x \cdot e^{x^2} \, dx = \int e^u \cdot \frac{1}{2} du = \frac{1}{2} e^u + C = \frac{1}{2} e^{x^2} + C$
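To see that the factor of $\frac{1}{2}$ really matters, the sketch below evaluates $\int_0^1 x \cdot e^{x^2} \, dx$ numerically and compares it with the answer with and without the constant adjustment. The interval $[0, 1]$ and the helper `simpson` are illustrative assumptions.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


numeric = simpson(lambda x: x * math.exp(x**2), 0.0, 1.0)

# (1/2) e^{x^2} evaluated from 0 to 1
with_factor = 0.5 * (math.exp(1.0) - 1.0)
# What we would get if the 1/2 were forgotten
without_factor = math.exp(1.0) - 1.0

print(f"numerical integral:  {numeric:.6f}")
print(f"with the 1/2 factor: {with_factor:.6f}")
print(f"without the factor:  {without_factor:.6f}")
```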

What You Cannot Do

You can only adjust for constant factors, because only constants can be moved across the integral sign. A missing factor of $x$ cannot be fixed by "dividing by x": $x$ varies over the interval, so $\frac{1}{x}$ cannot be pulled outside the integral.

If the derivative of $u$ doesn't appear (up to a constant), you need a different technique or a different choice of $u$.
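The sketch below illustrates why "dividing by x" fails, using the bad choice $\int x \sqrt{x^3 + 5} \, dx$ with $u = x^3 + 5$. It builds the bogus answer obtained by treating $\frac{1}{3x}$ as if it were a constant, differentiates it numerically, and compares the result with the integrand. The function names and the test point are illustrative.

```python
import math


def integrand(x):
    return x * math.sqrt(x**3 + 5)


def bogus_antiderivative(x):
    # Result of (incorrectly) treating 1/(3x) as a constant:
    # (1/(3x)) * integral of sqrt(u) du = (1/(3x)) * (2/3) * u**1.5
    return (1 / (3 * x)) * (2 / 3) * (x**3 + 5) ** 1.5


def central_difference(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 1.0
bogus_derivative = central_difference(bogus_antiderivative, x0)
mismatch = bogus_derivative - integrand(x0)
print(f"integrand at x0:            {integrand(x0):.4f}")
print(f"derivative of bogus answer: {bogus_derivative:.4f}")
```

The derivative of the bogus answer does not reproduce the integrand, so the manipulation is invalid.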

Interactive: Visualizing the Transformation

Watch how u-substitution transforms both the function AND the integration limits. The key insight: the shaded areas are equal before and after substitution!

Original integral, with lower limit $a = 0$ and upper limit $b = 2$:

\int_{0}^{2} 2x(x^2 + 1)^2 \, dx

Area ≈ 41.333
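The equal-area claim can be checked without the visualization. The following sketch, assuming the substitution $u = x^2 + 1$ for the integrand $2x(x^2+1)^2$, computes the area in $x$ over $[a, b]$ and the area in $u$ over $[a^2 + 1, b^2 + 1]$ for a few limit pairs. The helper names and the extra limit pairs are illustrative.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def area_in_x(a, b):
    # Original integral: 2x (x^2 + 1)^2 over [a, b]
    return simpson(lambda x: 2 * x * (x**2 + 1) ** 2, a, b)


def area_in_u(a, b):
    # Transformed integral: u^2 over [g(a), g(b)] with g(x) = x^2 + 1
    return simpson(lambda u: u**2, a**2 + 1, b**2 + 1)


for a, b in [(0.0, 2.0), (0.5, 1.5), (1.0, 3.0)]:
    print(f"a={a}, b={b}: x-area={area_in_x(a, b):.6f}, u-area={area_in_u(a, b):.6f}")
```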

U-Substitution with Definite Integrals

When applying u-substitution to definite integrals, you have two options:

Method 1: Change the Limits

Transform the limits of integration from $x$-values to $u$-values. This eliminates the need to back-substitute!

Changing Limits Formula

\int_a^b f(g(x)) \cdot g'(x) \, dx = \int_{g(a)}^{g(b)} f(u) \, du

When $x = a$, $u = g(a)$. When $x = b$, $u = g(b)$.

Example: Evaluate $\int_0^2 2x(x^2 + 1)^3 \, dx$

Let $u = x^2 + 1$, so $du = 2x \, dx$

Change limits: When $x = 0$, $u = 0^2 + 1 = 1$. When $x = 2$, $u = 2^2 + 1 = 5$.

\int_0^2 2x(x^2 + 1)^3 \, dx = \int_1^5 u^3 \, du = \left[ \frac{u^4}{4} \right]_1^5 = \frac{625}{4} - \frac{1}{4} = \frac{624}{4} = 156

Method 2: Back-Substitute First

Alternatively, find the indefinite integral in terms of $x$, then apply the original limits.

First, find the antiderivative: $\int 2x(x^2 + 1)^3 \, dx = \frac{(x^2+1)^4}{4} + C$

Then apply original limits:

\left[ \frac{(x^2+1)^4}{4} \right]_0^2 = \frac{(5)^4}{4} - \frac{(1)^4}{4} = \frac{625 - 1}{4} = 156
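Both methods can be reproduced with exact rational arithmetic. In this sketch, Method 1 evaluates $u^4/4$ at the transformed limits, while Method 2 evaluates the back-substituted antiderivative, written out in powers of $x$, at the original limits. The function names are illustrative.

```python
from fractions import Fraction


def g(x):
    """The substitution u = g(x) = x^2 + 1."""
    return x**2 + 1


def antiderivative_in_u(u):
    return Fraction(u**4, 4)


def antiderivative_in_x(x):
    # (x^2 + 1)^4 / 4, expanded in powers of x
    return Fraction(x**8 + 4 * x**6 + 6 * x**4 + 4 * x**2 + 1, 4)


a, b = 0, 2

# Method 1: change the limits to u-values, never return to x
method_1 = antiderivative_in_u(g(b)) - antiderivative_in_u(g(a))

# Method 2: back-substitute first, then apply the original x-limits
method_2 = antiderivative_in_x(b) - antiderivative_in_x(a)

print("Method 1:", method_1)
print("Method 2:", method_2)
```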

Which Method to Use?

Method 1 (Change limits) is usually faster because you never have to back-substitute. It also suits numerical evaluation, since the transformed integral can be computed directly in $u$.

Method 2 (Back-substitute) is useful when you need the antiderivative for other purposes, or when you're more comfortable working in the original variable.

Advanced Substitution Patterns

Exponential with Linear Argument

\int e^{ax+b} \, dx

Let $u = ax + b$ with $a \neq 0$, so $du = a \, dx$, meaning $dx = \frac{1}{a} du$

\int e^{ax+b} \, dx = \frac{1}{a} e^{ax+b} + C

Trigonometric Compositions

\int \sin^n(x) \cos(x) \, dx

Let $u = \sin(x)$, so $du = \cos(x) \, dx$. For $n \neq -1$:

\int \sin^n(x) \cos(x) \, dx = \int u^n \, du = \frac{u^{n+1}}{n+1} + C = \frac{\sin^{n+1}(x)}{n+1} + C

Logarithmic Patterns

\int \frac{f'(x)}{f(x)} \, dx

Let $u = f(x)$, so $du = f'(x) \, dx$

\int \frac{f'(x)}{f(x)} \, dx = \int \frac{1}{u} \, du = \ln|u| + C = \ln|f(x)| + C
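The three patterns in this section can be spot-checked numerically. The sketch below picks hypothetical parameters ($a = 3$, $b = 1$ for the exponential, $n = 3$ for the sine power, and $f(x) = x^2 + 1$ for the logarithmic pattern) and compares a Simpson's rule estimate of each definite integral with the value predicted by the formula.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Hypothetical parameter choices for the spot checks
slope, shift = 3.0, 1.0   # plays the role of a and b in e^{ax+b}
power = 3                 # plays the role of n in sin^n(x) cos(x)

checks = {
    # (1/a) e^{ax+b} evaluated from 0 to 1
    "exponential": (
        simpson(lambda x: math.exp(slope * x + shift), 0.0, 1.0),
        (math.exp(slope + shift) - math.exp(shift)) / slope,
    ),
    # sin^{n+1}(x) / (n+1) evaluated from 0 to pi/2
    "trigonometric": (
        simpson(lambda x: math.sin(x) ** power * math.cos(x), 0.0, math.pi / 2),
        1.0 / (power + 1),
    ),
    # ln|x^2 + 1| evaluated from 0 to 2
    "logarithmic": (
        simpson(lambda x: 2 * x / (x**2 + 1), 0.0, 2.0),
        math.log(5.0),
    ),
}

for name, (numeric, predicted) in checks.items():
    print(f"{name:14s} numeric={numeric:.8f} formula={predicted:.8f}")
```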

Summary Table of Common Substitutions

Integral Form	Substitution	Result
∫f(ax + b) dx, a ≠ 0	u = ax + b	(1/a)F(ax + b) + C
∫f(g(x))g'(x) dx	u = g(x)	F(u) + C = F(g(x)) + C
∫[f(x)]ⁿ f'(x) dx, n ≠ −1	u = f(x)	[f(x)]ⁿ⁺¹/(n+1) + C
∫f'(x)/f(x) dx	u = f(x)	ln|f(x)| + C
∫eᶠ⁽ˣ⁾f'(x) dx	u = f(x)	eᶠ⁽ˣ⁾ + C

Here F denotes an antiderivative of f.

Real-World Applications

Physics: Work Done by Variable Force

When a force varies with position, the work done is an integral. If the force is a composite function, u-substitution can often evaluate it.

Example: A spring with non-linear force law $F(x) = k \cdot x \cdot e^{-x^2}$

Work from $x = 0$ to $x = 1$:

W = \int_0^1 kx \cdot e^{-x^2} \, dx

With $u = -x^2$, $du = -2x \, dx$, so $x \, dx = -\frac{1}{2} \, du$ and the limits $x = 0$, $x = 1$ become $u = 0$, $u = -1$:

W = -\frac{k}{2} \int_0^{-1} e^u \, du = -\frac{k}{2}[e^u]_0^{-1} = -\frac{k}{2}(e^{-1} - 1) = \frac{k}{2}(1 - e^{-1})
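The work calculation can be checked in both variables. The sketch below assumes a hypothetical spring constant `k = 2.0` and evaluates the original integral in $x$, the transformed integral in $u$ (with its reversed limits), and the closed form.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule; a > b is allowed and gives the oriented integral."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


k = 2.0  # hypothetical spring constant, chosen only for illustration

# Original integral in x over [0, 1]
work_in_x = simpson(lambda x: k * x * math.exp(-x**2), 0.0, 1.0)

# Transformed integral: -(k/2) * integral of e^u from u = 0 to u = -1
work_in_u = -(k / 2.0) * simpson(math.exp, 0.0, -1.0)

# Closed form from the derivation: (k/2)(1 - e^{-1})
work_closed_form = (k / 2.0) * (1.0 - math.exp(-1.0))

print(f"in x:        {work_in_x:.8f}")
print(f"in u:        {work_in_u:.8f}")
print(f"closed form: {work_closed_form:.8f}")
```

Note how the upper limit in $u$ is smaller than the lower limit; the leading minus sign compensates for that reversal.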

Probability: Computing Expectations

Expected values often require integration of composite functions. For example, if $X$ follows a distribution with PDF $f(x)$, and we want $\mathbb{E}[g(X)]$:

\mathbb{E}[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx

The change of variables formula (u-substitution extended to probability) allows us to transform this integral when we know the distribution of $Y = g(X)$.
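Here is a small sketch of that idea under assumed choices that are not in the text above: $X \sim \text{Uniform}(0, 1)$ and $g(x) = x^2$. The density of $Y = X^2$ on $(0, 1)$ then follows from the change of variables as $\frac{1}{2\sqrt{y}}$, and the expectation can be computed either in $x$ or in $y$.

```python
import math


def midpoint(f, a, b, n=100_000):
    """Midpoint rule; it never evaluates f at the endpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def g(x):
    return x**2


def pdf_x(x):
    # Uniform(0, 1) density (an assumed example distribution)
    return 1.0


def pdf_y(y):
    # Density of Y = X^2: p_X(sqrt(y)) * |d sqrt(y)/dy| = 1 / (2 sqrt(y))
    return 1.0 / (2.0 * math.sqrt(y))


# E[g(X)] computed against the density of X
expectation_in_x = midpoint(lambda x: g(x) * pdf_x(x), 0.0, 1.0)

# E[Y] computed against the density of Y = g(X)
expectation_in_y = midpoint(lambda y: y * pdf_y(y), 0.0, 1.0)

print(f"integral in x: {expectation_in_x:.6f}")
print(f"integral in y: {expectation_in_y:.6f}")
```

The midpoint rule is used here because the density of $Y$ is unbounded at $y = 0$, and this rule avoids evaluating at the endpoints.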

Economics: Present Value of Income Stream

The present value of a continuous income stream $R(t)$ with continuous discounting at rate $r$ is:

PV = \int_0^T R(t) e^{-rt} \, dt

When $R(t)$ has a specific form (like exponential growth), u-substitution simplifies the calculation.
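For instance, suppose (as an assumption for illustration) that income grows exponentially, $R(t) = R_0 e^{gt}$. Then the integrand is $R_0 e^{-(r-g)t}$, and the substitution $u = -(r-g)t$ gives $PV = R_0 \frac{1 - e^{-(r-g)T}}{r-g}$ for $r \neq g$. The sketch below checks this closed form numerically; all parameter values are hypothetical.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Hypothetical values: initial income rate, growth rate, discount rate, horizon
R0, growth, rate, horizon = 100.0, 0.02, 0.05, 10.0


def discounted_income(t):
    # R(t) * e^{-rt} with R(t) = R0 * e^{gt}
    return R0 * math.exp(growth * t) * math.exp(-rate * t)


pv_numeric = simpson(discounted_income, 0.0, horizon)

# Closed form obtained via u = -(r - g) t
pv_closed_form = R0 * (1.0 - math.exp(-(rate - growth) * horizon)) / (rate - growth)

print(f"numerical PV:   {pv_numeric:.6f}")
print(f"closed-form PV: {pv_closed_form:.6f}")
```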

Machine Learning Connection

U-substitution (change of variables) is fundamental to several key machine learning techniques. Understanding the mathematical basis helps you understand why these methods work.

The Reparameterization Trick (VAEs)

In Variational Autoencoders, we need to backpropagate through stochastic nodes. The reparameterization trick does exactly this:

Instead of sampling $z \sim \mathcal{N}(\mu, \sigma^2)$

We sample $\epsilon \sim \mathcal{N}(0, 1)$ and compute $z = \mu + \sigma \epsilon$

This is a change of variables! The transformation $z = \mu + \sigma \epsilon$ is exactly the kind of substitution we've been studying.
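To make the gradient claim concrete, here is a minimal sketch that uses only the standard library. It assumes a toy objective $\mathbb{E}[z^2]$, which is not from the text above, chosen because its exact value $\mu^2 + \sigma^2$ gives known gradients $2\mu$ and $2\sigma$. Because $z = \mu + \sigma \epsilon$ is an explicit function of $\mu$ and $\sigma$, the chain rule can be applied sample by sample.

```python
import random


def reparam_gradients(mu, sigma, n_samples=200_000, seed=0):
    """Monte Carlo estimates of d/dmu and d/dsigma of E[z^2], with z = mu + sigma*eps."""
    rng = random.Random(seed)
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # noise that does not depend on mu or sigma
        z = mu + sigma * eps        # the substitution
        # Chain rule per sample: d(z^2)/dmu = 2z * 1, d(z^2)/dsigma = 2z * eps
        grad_mu += 2.0 * z
        grad_sigma += 2.0 * z * eps
    return grad_mu / n_samples, grad_sigma / n_samples


mu, sigma = 2.0, 0.5
est_grad_mu, est_grad_sigma = reparam_gradients(mu, sigma)

print(f"d/dmu    estimate = {est_grad_mu:.4f}, exact = {2 * mu:.4f}")
print(f"d/dsigma estimate = {est_grad_sigma:.4f}, exact = {2 * sigma:.4f}")
```

The estimates are noisy, so they should only match the exact gradients approximately.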

Normalizing Flows

Normalizing flows learn complex probability distributions by chaining together simple, invertible transformations. The change of variables formula (the multivariate generalization of u-substitution) gives:

p_Y(y) = p_X(x) \left| \det \frac{\partial x}{\partial y} \right| = p_X(f^{-1}(y)) \left| \det \frac{\partial f^{-1}}{\partial y} \right|, \quad \text{where } y = f(x)

This is why normalizing flows require computing Jacobian determinants — it's the multidimensional version of the $|du/dx|$ factor from u-substitution!
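In one dimension the Jacobian determinant reduces to a single derivative. The sketch below assumes a hypothetical affine flow $y = Ax + B$ applied to a standard normal $X$, with $A$ negative to show why the absolute value is needed. It checks that the transformed density integrates to 1 and matches the density of $\mathcal{N}(B, A^2)$.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def standard_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


A, B = -2.0, 1.0  # hypothetical flow parameters: y = A * x + B


def flow_inverse(y):
    return (y - B) / A


def flow_density(y):
    """p_Y(y) = p_X(f^{-1}(y)) * |d f^{-1}/dy|, where d f^{-1}/dy = 1/A."""
    return standard_normal_pdf(flow_inverse(y)) * abs(1.0 / A)


def normal_pdf(y, mean, std):
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


# The density should integrate to 1 (the range covers 10 standard deviations)
total_mass = simpson(flow_density, B - 20.0, B + 20.0, n=4000)

# It should also agree pointwise with the N(B, A^2) density
pointwise_gap = abs(flow_density(0.3) - normal_pdf(0.3, B, abs(A)))

print(f"total probability mass: {total_mass:.8f}")
print(f"gap to N(B, A^2) at y = 0.3: {pointwise_gap:.2e}")
```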

Backpropagation and the Chain Rule

The chain rule for differentiation (which u-substitution reverses for integration) is the foundation of backpropagation. When we compute:

\frac{\partial L}{\partial \theta} = \frac{\partial L}{\partial y} \cdot \frac{\partial y}{\partial z} \cdot \frac{\partial z}{\partial \theta}

Each multiplication corresponds to a "substitution" — we're chaining together derivatives through composed functions, exactly as the chain rule dictates.

The Deep Connection

U-substitution for integration and the chain rule for differentiation are two sides of the same coin. Every neural network gradient computation relies on the chain rule, which means understanding u-substitution helps you understand why backpropagation works.

Python Implementation

Symbolic and Numerical U-Substitution

Here's how to verify u-substitution using Python's symbolic and numerical libraries:

u_substitution_demo.py

Define the Variable: We use SymPy to create a symbolic variable x. This allows us to manipulate mathematical expressions algebraically.
Original Integrand: The integrand 2x(x² + 1)³ is a classic u-substitution problem. The 2x factor is exactly the derivative of (x² + 1).
SymPy Integration: SymPy's integrate() function finds the antiderivative automatically. Its result may be printed in expanded form and can differ from the hand-derived answer by a constant.
Manual U-Substitution: We verify by manually setting u = x² + 1. The integral transforms to ∫u³ du, which integrates to u⁴/4.
Back-Substitution: After integrating in terms of u, we substitute back u = x² + 1 to get the final answer in terms of x.
Numerical Verification: We use scipy.integrate.quad() to numerically compute the definite integral, verifying our symbolic result.

from scipy import integrate
import sympy as sp

def demonstrate_u_substitution():
    """
    Demonstrate u-substitution both symbolically and numerically.
    Example: ∫ 2x(x² + 1)³ dx
    """
    # Symbolic approach with SymPy
    x = sp.Symbol('x')

    # Original integrand
    f = 2*x * (x**2 + 1)**3
    print("Original integrand: f(x) =", f)

    # Direct symbolic integration (SymPy chooses the method itself)
    F = sp.integrate(f, x)
    print("Antiderivative: F(x) =", F)

    # Manual u-substitution verification
    u = sp.Symbol('u')
    # If u = x² + 1, then du = 2x dx
    # The integral becomes ∫ u³ du
    g = u**3
    G = sp.integrate(g, u)
    print("After u-sub: ∫u³ du =", G)

    # Substitute back: u = x² + 1
    result = G.subs(u, x**2 + 1)
    print("Back-substituted:", result)

    # Antiderivatives may differ by a constant, so compare derivatives instead
    print("Derivative matches integrand:", sp.simplify(sp.diff(result, x) - f) == 0)

    # Numerical verification for definite integral
    a, b = 0, 2
    numerical_result, _ = integrate.quad(
        lambda t: 2*t * (t**2 + 1)**3, a, b
    )

    # Symbolic definite integral
    symbolic_result = float(F.subs(x, b) - F.subs(x, a))

    print(f"\nDefinite integral from {a} to {b}:")
    print(f"  Numerical: {numerical_result:.6f}")
    print(f"  Symbolic:  {symbolic_result:.6f}")
    print(f"  Match: {abs(numerical_result - symbolic_result) < 1e-10}")

demonstrate_u_substitution()

U-Substitution in Machine Learning

See how u-substitution concepts appear in VAEs, normalizing flows, and gradient computation:

u_substitution_ml.py

Reparameterization Trick: The VAE reparameterization trick is fundamentally a change of variables (u-substitution). Instead of sampling z ~ N(μ, σ²) directly, we sample ε ~ N(0, 1) and compute z = μ + σε.
The Substitution: z = μ + σε is the substitution. This allows gradients to flow through μ and σ while ε provides the stochasticity. The math is the same change of variables used in u-substitution.
Normalizing Flows: Normalizing flows use the multivariate generalization of u-substitution. The Jacobian determinant accounts for how the transformation stretches or compresses probability density.
Change of Variables Formula: p(y) = p(x) · |dx/dy| is the change of variables formula for probability densities. This is exactly the 1D Jacobian from u-substitution: if y = f(x), then dy = f'(x)dx.
Backpropagation as Chain Rule: Backpropagation IS the chain rule, which is differentiation's analog of u-substitution. Each layer's gradient computation uses the same 'substitution' logic.
Gradient Chain: The product dL/dz · dz/du · du/dθ mirrors how u-substitution works: we 'chain' together the derivatives just as we chain substitutions.

import numpy as np
from scipy import integrate

def reparameterization_connection():
    """
    U-substitution is the mathematical foundation of the
    reparameterization trick in Variational Autoencoders (VAEs).

    In VAEs, we need to compute gradients through random samples.
    The trick: instead of sampling z ~ N(μ, σ²), we sample ε ~ N(0, 1)
    and compute z = μ + σε. This is a change of variables!
    """
    print("=" * 60)
    print("REPARAMETERIZATION TRICK - U-Substitution in ML")
    print("=" * 60)

    # Parameters of the target Gaussian
    mu = 2.0      # Mean of target distribution
    sigma = 0.5   # Std dev of target distribution

    # The change of variables formula for probability densities:
    # If Z = μ + σε where ε ~ N(0, 1)
    # Then p(z) = p(ε) / |dz/dε| = p(ε) / σ

    def sample_direct(n_samples):
        """Direct sampling from N(μ, σ²)"""
        return np.random.normal(mu, sigma, n_samples)

    def sample_reparameterized(n_samples):
        """Reparameterized sampling: z = μ + σε, ε ~ N(0, 1)"""
        epsilon = np.random.normal(0, 1, n_samples)
        return mu + sigma * epsilon  # This IS the substitution!

    # Verify both methods produce the same distribution
    n = 100000
    direct_samples = sample_direct(n)
    reparam_samples = sample_reparameterized(n)

    print(f"\nDirect sampling:       mean = {np.mean(direct_samples):.4f}, "
          f"std = {np.std(direct_samples):.4f}")
    print(f"Reparameterized:       mean = {np.mean(reparam_samples):.4f}, "
          f"std = {np.std(reparam_samples):.4f}")
    print(f"Expected:              mean = {mu:.4f}, std = {sigma:.4f}")

def jacobian_in_normalizing_flows():
    """
    Normalizing flows use u-substitution's multivariable form:
    the Jacobian determinant for density transformation.

    If y = f(x), then p(y) = p(x) · |det(∂x/∂y)|
    """
    print("\n" + "=" * 60)
    print("NORMALIZING FLOWS - Jacobian from U-Substitution")
    print("=" * 60)

    # Simple 1D flow: y = exp(x) (mapping R to R+)
    def flow_forward(x):
        """Transform x to y"""
        return np.exp(x)

    def flow_inverse(y):
        """Transform y back to x"""
        return np.log(y)

    def log_det_jacobian(x):
        """Log |dy/dx| = log(exp(x)) = x"""
        return x

    # Start with standard normal
    n = 10000
    x_samples = np.random.normal(0, 1, n)

    # Apply flow
    y_samples = flow_forward(x_samples)

    # The transformed density follows from change of variables
    # p(y) = p(x) · |dx/dy| = p(x) / |dy/dx|
    # This is the same math as u-substitution!
    def flow_pdf(y):
        x = flow_inverse(y)
        p_x = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
        return p_x * np.exp(-log_det_jacobian(x))

    # Expected density of Y at a point (log-normal density)
    def log_normal_pdf(y, mu=0, sigma=1):
        return (1 / (y * sigma * np.sqrt(2 * np.pi))) * \
               np.exp(-0.5 * ((np.log(y) - mu) / sigma)**2)

    # Verify: the flow density matches the log-normal density, and the
    # fraction of samples in [1, 2] matches the integrated density
    y_test = np.linspace(0.1, 5, 100)
    max_pdf_gap = np.max(np.abs(flow_pdf(y_test) - log_normal_pdf(y_test)))
    prob_theory, _ = integrate.quad(log_normal_pdf, 1.0, 2.0)
    prob_samples = np.mean((y_samples >= 1.0) & (y_samples <= 2.0))

    print("\nFlow transformation: y = exp(x)")
    print("Input distribution: X ~ N(0, 1)")
    print("Output distribution: Y ~ LogNormal(0, 1)")
    print(f"\nSample mean of Y: {np.mean(y_samples):.4f}")
    print(f"Theoretical mean:  {np.exp(0.5):.4f}")  # E[Y] = exp(μ + σ²/2)
    print(f"Max |flow pdf - log-normal pdf|: {max_pdf_gap:.2e}")
    print(f"P(1 <= Y <= 2): samples = {prob_samples:.4f}, theory = {prob_theory:.4f}")

def gradient_computation_example():
    """
    U-substitution enables efficient gradient computation
    through composed functions - the core of backpropagation.
    """
    print("\n" + "=" * 60)
    print("GRADIENT COMPUTATION - Chain Rule as U-Substitution")
    print("=" * 60)

    # Loss function: L(θ) = (sigmoid(θx) - y)²
    # We need ∂L/∂θ

    # The chain rule IS u-substitution in differentiation:
    # If u = θx, z = sigmoid(u), L = (z - y)²
    # Then ∂L/∂θ = ∂L/∂z · ∂z/∂u · ∂u/∂θ

    def sigmoid(x):
        return 1 / (1 + np.exp(-np.clip(x, -500, 500)))

    def sigmoid_derivative(x):
        s = sigmoid(x)
        return s * (1 - s)

    # Example values
    theta = 2.0
    x = 1.5
    y_true = 0.8

    # Forward pass (computing u, z, L)
    u = theta * x
    z = sigmoid(u)
    L = (z - y_true)**2

    # Backward pass (chain rule = u-sub in reverse)
    dL_dz = 2 * (z - y_true)
    dz_du = sigmoid_derivative(u)
    du_dtheta = x

    # Total gradient
    dL_dtheta = dL_dz * dz_du * du_dtheta

    print("Forward pass:")
    print(f"  u = θx = {theta} × {x} = {u}")
    print(f"  z = sigmoid(u) = {z:.6f}")
    print(f"  L = (z - y)² = ({z:.4f} - {y_true})² = {L:.6f}")
    print("\nBackward pass (chain rule = reverse u-substitution):")
    print(f"  ∂L/∂z = 2(z - y) = {dL_dz:.6f}")
    print(f"  ∂z/∂u = sigmoid'(u) = {dz_du:.6f}")
    print(f"  ∂u/∂θ = x = {du_dtheta}")
    print(f"\n∂L/∂θ = {dL_dz:.4f} × {dz_du:.4f} × {du_dtheta} = {dL_dtheta:.6f}")

reparameterization_connection()
jacobian_in_normalizing_flows()
gradient_computation_example()
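A common way to sanity-check a hand-derived chain rule gradient is to compare it with a finite difference. The sketch below does this for the same loss $L(\theta) = (\text{sigmoid}(\theta x) - y)^2$ and the same example values, using only the standard library. The function names and the step size `h` are illustrative choices.

```python
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def loss(theta, x, y_true):
    return (sigmoid(theta * x) - y_true) ** 2


def analytic_gradient(theta, x, y_true):
    """dL/dtheta = dL/dz * dz/du * du/dtheta, with u = theta * x and z = sigmoid(u)."""
    z = sigmoid(theta * x)
    dL_dz = 2.0 * (z - y_true)
    dz_du = z * (1.0 - z)
    du_dtheta = x
    return dL_dz * dz_du * du_dtheta


def numeric_gradient(theta, x, y_true, h=1e-5):
    """Central finite-difference approximation of dL/dtheta."""
    return (loss(theta + h, x, y_true) - loss(theta - h, x, y_true)) / (2.0 * h)


theta, x, y_true = 2.0, 1.5, 0.8
grad_chain = analytic_gradient(theta, x, y_true)
grad_fd = numeric_gradient(theta, x, y_true)

print(f"chain rule gradient:        {grad_chain:.8f}")
print(f"finite-difference gradient: {grad_fd:.8f}")
```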

Common Mistakes to Avoid

Mistake 1: Forgetting to Substitute dx

Wrong: $\int \cos(x^2) \, dx$ with $u = x^2$ gives $\int \cos(u) \, dx$

Correct: Since $du = 2x \, dx$, we need $dx = \frac{du}{2x}$, which leaves a factor of $x$ that the integrand cannot absorb. Substitution does not simplify this integral; in fact, $\int \cos(x^2) \, dx$ has no elementary antiderivative (it defines a Fresnel integral).

Mistake 2: Not Changing Limits for Definite Integrals

When using Method 1 (changing limits), don't mix $x$-limits with $u$-integrands.

Wrong: $\int_0^1 u^2 \, du$ where 0 and 1 are $x$-values

Correct: If $u = g(x)$, use limits $g(0)$ and $g(1)$
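The sketch below shows how much this mistake can cost, using an assumed example that fits the pattern above: $\int_0^1 2x(x^2+1)^2 \, dx$ with $u = x^2 + 1$, which becomes $\int u^2 \, du$. Exact fractions make the two answers easy to compare.

```python
from fractions import Fraction


def g(x):
    """The substitution u = g(x) = x^2 + 1."""
    return x**2 + 1


def antiderivative_in_u(u):
    # Antiderivative of u^2
    return Fraction(u**3, 3)


x_lower, x_upper = 0, 1

# Correct: evaluate at the transformed limits g(0) = 1 and g(1) = 2
correct = antiderivative_in_u(g(x_upper)) - antiderivative_in_u(g(x_lower))

# Wrong: reuse the x-limits 0 and 1 as if they were u-values
wrong = antiderivative_in_u(x_upper) - antiderivative_in_u(x_lower)

print("correct limits:", correct)
print("x-limits reused:", wrong)
```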

Mistake 3: Choosing the Wrong u

Common error: choosing $u$ to be the "complicated" part without checking if its derivative appears.

Tip: The derivative of your chosen $u$ should appear as a factor in the integrand (possibly with a constant multiple).

Mistake 4: Forgetting the Constant of Integration

For indefinite integrals, always include $+C$ in your final answer after back-substitution.

Mistake 5: Trying to 'Solve for x'

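Risky: If $u = x^2$, writing $x = \sqrt{u}$ and substituting it everywhere

Why it often fails: $x = \sqrt{u}$ holds only for $x \geq 0$, so it introduces a $\pm$ ambiguity, and it often makes the integral more complicated. Solving for $x$ is safe only when $u = g(x)$ is invertible on the interval of integration (for example, $u = x + 1$ gives $x = u - 1$); otherwise, only solve for $dx$ in terms of $du$.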

Test Your Understanding


What is the best choice of u for this integral?

\int 6x^2 \sqrt{x^3 + 1} \, dx

Summary

U-substitution is one of the most widely used techniques for evaluating integrals involving composite functions. It works by reversing the chain rule.

Key Concepts

Concept	Description	Key Formula
Core Idea	Reverse the chain rule	∫f(g(x))g'(x) dx = F(g(x)) + C
The Substitution	Let u = g(x), inner function	du = g'(x) dx
Definite Integrals	Change limits or back-substitute	x ∈ [a,b] → u ∈ [g(a), g(b)]
Pattern Recognition	Look for derivative of inner function	Derivative of u must appear

Key Takeaways

Look for compositions: When you see $f(g(x))$, consider letting $u = g(x)$
Check for the derivative: The factor $g'(x)$ (or a constant multiple) must appear in the integrand
Transform completely: Every part of the integrand, including $dx$, must become a $u$-expression
For definite integrals: Either change the limits to $u$-values or back-substitute before evaluating
Verify by differentiating: Always check your answer by differentiating and confirming you get the original integrand
ML connection: U-substitution underlies the reparameterization trick, normalizing flows, and is the "inverse" of the chain rule used in backpropagation

The Core Insight:

"U-substitution transforms complicated integrals into simple ones by recognizing that every composite function hides a chain rule derivative."

Coming Next: In Integration by Parts, we'll learn another powerful technique — the integration analog of the product rule — that handles integrals involving products of different function types.