Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will be able to:

Define left Riemann sums, right Riemann sums, and the midpoint rule
Compute each type of Riemann sum for a given function and interval
Visualize how rectangles approximate the area under a curve
Compare the accuracy of different approximation methods
Explain why the midpoint rule is typically more accurate than endpoints
Apply numerical integration in scientific computing and machine learning
Implement these methods in Python code

The Big Picture: From Rectangles to Integrals

"The integral is nothing but the limit of a sum." — Bernhard Riemann

In the previous section, we discovered that the area under a curve can be approximated by filling the region with rectangles. But we glossed over a critical question: where exactly should we sample the function to determine each rectangle's height?

This question leads to three fundamental approaches, each with its own geometric interpretation and accuracy characteristics:

Left Riemann Sum: sample at the left endpoint of each subinterval
Right Riemann Sum: sample at the right endpoint of each subinterval
Midpoint Rule: sample at the midpoint of each subinterval

Why This Matters

These three methods form the foundation of numerical integration: the art of computing integrals using algorithms rather than analytical formulas. The integration routines in scientific computing libraries such as NumPy and SciPy build on the same idea, a weighted sum of sampled function values.

Understanding why certain methods are more accurate than others prepares you for more advanced techniques like Simpson's Rule and Gaussian quadrature.

Historical Context: Riemann's Revolutionary Idea

Bernhard Riemann (1826-1866) was a German mathematician who, in just 39 years of life, revolutionized multiple areas of mathematics. His 1854 habilitation lecture on geometry laid the groundwork for Einstein's general relativity.

Riemann's approach to integration was fundamentally different from his predecessors. Rather than starting with antiderivatives (the approach of Newton and Leibniz), he defined the integral directly as the limit of sums. This allowed mathematicians to integrate functions that have no elementary antiderivative.

The Riemann Integral Definition

Riemann showed that, for a well-behaved function (for example, any continuous function on $[a, b]$), if we partition $[a, b]$ into $n$ subintervals and sample any point $x_i^*$ in each subinterval, the sum $\sum_{i=1}^{n} f(x_i^*) \, \Delta x$ converges to the same value as $n \to \infty$. This value is the definite integral.

The left, right, and midpoint rules are simply different choices for where to place the sample point $x_i^*$. All three converge to the same answer, but at different rates.
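The idea that the three rules differ only in where $x_i^*$ sits can be sketched with a single function. This is a minimal illustration, not a library API: the parameter `t` (a fraction between 0 and 1 locating the sample point inside each subinterval) and the helper names are assumptions introduced here.

```python
def riemann_sum(f, a, b, n, t):
    """Riemann sum of f on [a, b] using n equal subintervals.

    t picks the sample point inside each subinterval (illustrative convention):
    t = 0.0 is the left endpoint, t = 1.0 the right endpoint, t = 0.5 the midpoint.
    """
    dx = (b - a) / n
    return dx * sum(f(a + (i + t) * dx) for i in range(n))


def square(x):
    return x ** 2


# Every choice of t approaches 8/3 as n grows, just at different speeds.
for n in (10, 100, 1000):
    values = [round(riemann_sum(square, 0, 2, n, t), 5) for t in (0.0, 0.5, 1.0)]
    print(n, values)
```

Any other value of `t` between 0 and 1 also produces a valid Riemann sum with the same limit.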

Partitioning the Interval

Before we can approximate an integral, we need to set up the machinery. Given a function $f(x)$ on an interval $[a, b]$ :

Step 1: Divide into Subintervals

We partition $[a, b]$ into $n$ equal subintervals, each of width:

$$\Delta x = \frac{b - a}{n}$$

Step 2: Identify the Partition Points

The endpoints of the subintervals are:

$$x_0 = a, \quad x_1 = a + \Delta x, \quad x_2 = a + 2\Delta x, \quad \ldots, \quad x_n = b$$

In general:

$$x_i = a + i \cdot \Delta x \quad \text{for } i = 0, 1, 2, \ldots, n$$

Symbol	Meaning	Formula
n	Number of subintervals (rectangles)	User-specified positive integer
Δx	Width of each subinterval	(b - a) / n
xᵢ	The i-th partition point	a + i · Δx
[xᵢ₋₁, xᵢ]	The i-th subinterval	Length = Δx
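As a quick check of the two steps above, the partition can be computed directly. The function name `partition` is a hypothetical helper chosen for this sketch.

```python
def partition(a, b, n):
    """Return (dx, points) for n equal subintervals of [a, b]."""
    dx = (b - a) / n
    points = [a + i * dx for i in range(n + 1)]  # x_0, x_1, ..., x_n
    return dx, points


dx, xs = partition(0, 2, 4)
print(dx)  # 0.5
print(xs)  # [0.0, 0.5, 1.0, 1.5, 2.0]
```

Note that n subintervals always produce n + 1 partition points.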

Left Riemann Sums

In a left Riemann sum, we evaluate the function at the left endpoint of each subinterval to determine the rectangle's height.

Definition: Left Riemann Sum

$$L_n = \sum_{i=1}^{n} f(x_{i-1}) \cdot \Delta x = \Delta x \cdot \left[ f(x_0) + f(x_1) + \cdots + f(x_{n-1}) \right]$$

We sum the areas of rectangles whose heights are determined by the left endpoint of each subinterval.

Geometric Interpretation

For an increasing function, the left Riemann sum underestimates the true area. Each rectangle lies entirely below the curve.

For a decreasing function, the left Riemann sum overestimates the true area. Each rectangle extends above the curve.

Example: Left Riemann Sum of f(x) = x² on [0, 2] with n = 4

Step 1: Calculate Δx = (2 - 0) / 4 = 0.5

Step 2: Identify left endpoints: x₀ = 0, x₁ = 0.5, x₂ = 1, x₃ = 1.5

Step 3: Calculate function values:

f(0) = 0² = 0
f(0.5) = 0.5² = 0.25
f(1) = 1² = 1
f(1.5) = 1.5² = 2.25

Step 4: Sum the areas:

$$L_4 = 0.5 \cdot (0 + 0.25 + 1 + 2.25) = 0.5 \cdot 3.5 = 1.75$$

The exact value is $\int_0^2 x^2 \, dx = \frac{8}{3} \approx 2.667$, so the left sum underestimates by about 0.92 (34% error).
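The four steps of this worked example translate almost line for line into code. The variable names below are illustrative choices, not taken from the text.

```python
def f(x):
    return x ** 2


a, b, n = 0, 2, 4
dx = (b - a) / n                               # Step 1: width of each rectangle
left_points = [a + i * dx for i in range(n)]   # Step 2: x_0 .. x_{n-1}
heights = [f(x) for x in left_points]          # Step 3: function values
L4 = dx * sum(heights)                         # Step 4: sum of the areas
exact = 8 / 3

print(left_points)  # [0.0, 0.5, 1.0, 1.5]
print(heights)      # [0.0, 0.25, 1.0, 2.25]
print(L4)           # 1.75
print(f"relative error: {abs(L4 - exact) / exact:.1%}")
```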

Right Riemann Sums

In a right Riemann sum, we evaluate the function at the right endpoint of each subinterval.

Definition: Right Riemann Sum

$$R_n = \sum_{i=1}^{n} f(x_i) \cdot \Delta x = \Delta x \cdot \left[ f(x_1) + f(x_2) + \cdots + f(x_n) \right]$$

We sum the areas of rectangles whose heights are determined by the right endpoint of each subinterval.

Geometric Interpretation

For an increasing function, the right Riemann sum overestimates the true area. Each rectangle extends above the curve.

For a decreasing function, the right Riemann sum underestimates the true area.

Example: Right Riemann Sum of f(x) = x² on [0, 2] with n = 4

Step 1: Δx = 0.5 (same as before)

Step 2: Identify right endpoints: x₁ = 0.5, x₂ = 1, x₃ = 1.5, x₄ = 2

Step 3: Calculate function values:

f(0.5) = 0.25
f(1) = 1
f(1.5) = 2.25
f(2) = 4

Step 4: Sum the areas:

$$R_4 = 0.5 \cdot (0.25 + 1 + 2.25 + 4) = 0.5 \cdot 7.5 = 3.75$$

The right sum overestimates by about 1.08 (41% error).

Left + Right Bound the True Area

For a monotonic function, the true integral lies between the left and right sums:

$$L_n \leq \int_a^b f(x)\,dx \leq R_n \quad \text{(if } f \text{ is increasing)}$$

In our example: 1.75 ≤ 2.667 ≤ 3.75 ✓
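The sandwich can be verified numerically. The sketch below also prints the gap between the two sums: since the left and right sums share every term except $f(x_0)$ and $f(x_n)$, the difference telescopes to $R_n - L_n = \Delta x \,(f(b) - f(a))$, so the bracket tightens as n grows. Variable names are illustrative.

```python
def f(x):
    return x ** 2


a, b, n = 0, 2, 4
dx = (b - a) / n
L4 = dx * sum(f(a + i * dx) for i in range(n))        # left endpoints
R4 = dx * sum(f(a + (i + 1) * dx) for i in range(n))  # right endpoints
exact = 8 / 3

print(L4, R4)                   # 1.75 3.75
print(L4 <= exact <= R4)        # True
gap = R4 - L4
print(gap, dx * (f(b) - f(a)))  # 2.0 2.0
```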

The Midpoint Rule

The midpoint rule evaluates the function at the center of each subinterval. This seemingly simple change leads to dramatically improved accuracy.

Definition: Midpoint Rule

$$M_n = \sum_{i=1}^{n} f(\bar{x}_i) \cdot \Delta x$$

where $\bar{x}_i = \frac{x_{i-1} + x_i}{2}$ is the midpoint of the i-th subinterval.

Equivalently:

$$\bar{x}_i = a + (i - 0.5) \cdot \Delta x$$

Why Is Midpoint More Accurate?

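The key insight is error cancellation. Consider what happens at a single rectangle. The midpoint rectangle has exactly the same area as the trapezoid whose top edge is the tangent line at the midpoint, because the extra area on one side of the midpoint cancels the missing area on the other. The linear part of the function's variation therefore contributes no error, and only curvature remains:

If the function is concave up (like x²), the tangent line at the midpoint lies below the curve, causing a slight underestimate.
If the function is concave down, the tangent line lies above the curve, causing a slight overestimate.

The crucial difference: the total error is second-order. Each rectangle contributes an error proportional to $(\Delta x)^3$, and summing over $n = (b - a)/\Delta x$ rectangles gives a total that scales with $(\Delta x)^2$ rather than $\Delta x$.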
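One way to see where the second-order behavior comes from is a Taylor expansion around the midpoint. The sketch below assumes $f$ is twice continuously differentiable and writes $h = \Delta x$ and $m = \bar{x}_i$; the intermediate points $\xi_x$ and $\eta_i$ are notation introduced here for illustration.

```latex
\begin{aligned}
f(x) &= f(m) + f'(m)(x - m) + \tfrac{1}{2} f''(\xi_x)(x - m)^2 \\
\int_{m - h/2}^{m + h/2} f(x)\,dx
  &= f(m)\,h
   + f'(m)\underbrace{\int_{m - h/2}^{m + h/2} (x - m)\,dx}_{=\,0}
   + \tfrac{1}{2} f''(\eta_i)\underbrace{\int_{m - h/2}^{m + h/2} (x - m)^2\,dx}_{=\,h^3/12} \\
  &= f(m)\,h + \frac{f''(\eta_i)}{24}\,h^3
\end{aligned}
```

The first-order term integrates to zero by symmetry. Each rectangle is therefore off by at most $M h^3 / 24$, where $M$ is the largest value of $|f''|$ on $[a, b]$, and adding up $n = (b - a)/h$ of them gives a total error of at most $M (b - a)^3 / (24 n^2)$.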

Example: Midpoint Rule for f(x) = x² on [0, 2] with n = 4

Step 1: Δx = 0.5

Step 2: Identify midpoints: 0.25, 0.75, 1.25, 1.75

Step 3: Calculate function values:

f(0.25) = 0.0625
f(0.75) = 0.5625
f(1.25) = 1.5625
f(1.75) = 3.0625

Step 4: Sum the areas:

$$M_4 = 0.5 \cdot (0.0625 + 0.5625 + 1.5625 + 3.0625) = 0.5 \cdot 5.25 = 2.625$$

The midpoint gives 2.625 with only 1.6% error — dramatically better than left (34%) or right (41%) with the same number of rectangles!
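The same calculation in code, as a minimal sketch with illustrative variable names:

```python
def f(x):
    return x ** 2


a, b, n = 0, 2, 4
dx = (b - a) / n
midpoints = [a + (i + 0.5) * dx for i in range(n)]  # centers of the subintervals
M4 = dx * sum(f(x) for x in midpoints)
exact = 8 / 3

print(midpoints)  # [0.25, 0.75, 1.25, 1.75]
print(M4)         # 2.625
print(f"relative error: {abs(M4 - exact) / exact:.1%}")
```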

Interactive Riemann Sum Explorer

The interactive explorer visualizes how left, right, and midpoint Riemann sums approximate the area under various functions, and how the approximation improves as the number of rectangles increases. In its default state it shows a left Riemann sum for $f(x) = x^2$ on $[0, 3]$ with $n = 4$, so $\Delta x = \frac{3 - 0}{4} = 0.75$:

Metric	Value
Left Riemann Sum (n = 4)	5.906250
Exact Area (definite integral)	9.000000
Error	3.093750 (34.38%)

Key Insight

As n increases, the Riemann sum approaches the exact area under the curve. With n = 50 or more, the error is visibly smaller.

Try These Experiments

Start with n = 4 and compare left vs right vs midpoint for the same function
Increase n step by step and watch all methods approach the true value
Try a decreasing function (like 1/(1+x²)) and notice how left/right swap behaviors
Observe that midpoint consistently shows smaller error with the same n
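The decreasing-function experiment can also be run without the interactive tool. The sketch below uses $1/(1+x^2)$ on $[0, 1]$, an interval chosen here because the exact integral is $\arctan(1) = \pi/4$; the interval and variable names are assumptions of this example.

```python
import math


def g(x):
    return 1 / (1 + x ** 2)


a, b, n = 0, 1, 4
dx = (b - a) / n
left = dx * sum(g(a + i * dx) for i in range(n))
right = dx * sum(g(a + (i + 1) * dx) for i in range(n))
mid = dx * sum(g(a + (i + 0.5) * dx) for i in range(n))
exact = math.atan(1)  # pi / 4

# A positive error means an overestimate, a negative one an underestimate.
for name, value in (("left", left), ("right", right), ("midpoint", mid)):
    print(f"{name:>8}: {value:.6f}  error {value - exact:+.6f}")
```

For this decreasing function the roles swap: the left sum comes out too high and the right sum too low, while the midpoint error stays much smaller than either.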

Error Analysis: How Wrong Are We?

Understanding the error in numerical integration is crucial for scientific computing. The error bounds depend on the derivatives of the function:

Method	Error Bound	Order	Behavior
Left Riemann	$|E| \leq \frac{M_1(b-a)^2}{2n}$	O(1/n)	First-order; doubling n halves the error
Right Riemann	$|E| \leq \frac{M_1(b-a)^2}{2n}$	O(1/n)	First-order; doubling n halves the error
Midpoint	$|E| \leq \frac{M_2(b-a)^3}{24n^2}$	O(1/n²)	Second-order; doubling n quarters the error

Here $M_1 = \max_{x \in [a,b]} |f'(x)|$ and $M_2 = \max_{x \in [a,b]} |f''(x)|$.
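To make the bounds concrete, the sketch below evaluates them for $f(x) = x^2$ on $[0, 2]$ with $n = 4$ and compares them with the actual errors. The constants $M_1 = 4$ and $M_2 = 2$ follow from $f'(x) = 2x$ and $f''(x) = 2$; variable names are illustrative.

```python
def f(x):
    return x ** 2


a, b, n = 0, 2, 4
dx = (b - a) / n
exact = 8 / 3

left = dx * sum(f(a + i * dx) for i in range(n))
mid = dx * sum(f(a + (i + 0.5) * dx) for i in range(n))

M1 = 4  # max of |f'(x)| = |2x| on [0, 2]
M2 = 2  # max of |f''(x)| = 2 on [0, 2]
left_bound = M1 * (b - a) ** 2 / (2 * n)
mid_bound = M2 * (b - a) ** 3 / (24 * n ** 2)

print(f"left:     error {abs(left - exact):.6f}, bound {left_bound:.6f}")
print(f"midpoint: error {abs(mid - exact):.6f}, bound {mid_bound:.6f}")
```

The left-sum bound (2.0) is loose compared with the actual error (about 0.92). The midpoint bound matches the actual error (about 0.0417) because the second derivative of $x^2$ is constant, so the worst case is attained on every subinterval.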

The Power of Second-Order Methods

A second-order method like the midpoint rule has an error that shrinks like $1/n^2$: each time you double the number of rectangles, the error decreases by a factor of about 4, not 2.

For $f(x) = x^2$ on $[0, 2]$, reaching 0.01% relative error takes about 15,000 rectangles with the left or right sum, but only about 50 with the midpoint rule.
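The halving and quartering behavior can be observed by doubling n repeatedly and printing the ratio of successive errors. This is a small experiment on $f(x) = x^2$ over $[0, 2]$; the helper `riemann` and its `offset` parameter (0.0 for left endpoints, 0.5 for midpoints) are conventions assumed for this sketch.

```python
def f(x):
    return x ** 2


def riemann(f, a, b, n, offset):
    """Riemann sum sampling at a + (i + offset) * dx in each subinterval."""
    dx = (b - a) / n
    return dx * sum(f(a + (i + offset) * dx) for i in range(n))


exact = 8 / 3
left_ratios, mid_ratios = [], []
prev_left = prev_mid = None

for n in (8, 16, 32, 64, 128):
    err_left = abs(riemann(f, 0, 2, n, 0.0) - exact)
    err_mid = abs(riemann(f, 0, 2, n, 0.5) - exact)
    if prev_left is not None:
        # Ratio of the previous error to the current one after doubling n.
        left_ratios.append(prev_left / err_left)
        mid_ratios.append(prev_mid / err_mid)
        print(f"n={n:>4}: left ratio {left_ratios[-1]:.3f}, midpoint ratio {mid_ratios[-1]:.3f}")
    prev_left, prev_mid = err_left, err_mid
```

The left-sum ratios approach 2 as n grows, while the midpoint ratios sit at 4.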

Convergence to the Definite Integral

The magic of Riemann's approach: for an integrable function, regardless of which sampling method you choose, all Riemann sums converge to the same value as $n \to \infty$:

$$\lim_{n \to \infty} L_n = \lim_{n \to \infty} R_n = \lim_{n \to \infty} M_n = \int_a^b f(x) \, dx$$

This limiting value is the definite integral — Riemann's definition. The integral exists if and only if this limit exists (and is the same for any choice of sample points).

Convergence: How Fast Do Approximations Improve?

Consider the area under $f(x) = x^2$ from $x = 0$ to $x = 2$, whose exact value is $\frac{8}{3} \approx 2.667$. All three methods approach this value as the number of rectangles increases, but at different rates.
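For this particular function the sums can be worked out in closed form, which makes the convergence rates explicit. The derivation below is a sketch using $\Delta x = 2/n$ together with the standard identities $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$ and $\sum_{i=1}^{n} (2i-1)^2 = \frac{n(4n^2-1)}{3}$.

```latex
\begin{aligned}
L_n &= \sum_{i=1}^{n} \left(\frac{2(i-1)}{n}\right)^2 \frac{2}{n}
     = \frac{8}{n^3} \cdot \frac{(n-1)\,n\,(2n-1)}{6}
     = \frac{8}{3} - \frac{4}{n} + \frac{4}{3n^2} \\
R_n &= \sum_{i=1}^{n} \left(\frac{2i}{n}\right)^2 \frac{2}{n}
     = \frac{8}{n^3} \cdot \frac{n\,(n+1)\,(2n+1)}{6}
     = \frac{8}{3} + \frac{4}{n} + \frac{4}{3n^2} \\
M_n &= \sum_{i=1}^{n} \left(\frac{2i-1}{n}\right)^2 \frac{2}{n}
     = \frac{2}{n^3} \cdot \frac{n\,(4n^2-1)}{3}
     = \frac{8}{3} - \frac{2}{3n^2}
\end{aligned}
```

With $n = 4$ these give 1.75, 3.75, and 2.625. The leading error term is $4/n$ for the endpoint sums and $2/(3n^2)$ for the midpoint rule.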

Applications in Scientific Computing and Machine Learning

Numerical integration is ubiquitous in computational science. Here are key applications:

🧮 Physics Simulations

Computing trajectories, forces, and energy requires integrating differential equations. Monte Carlo methods in quantum physics use these techniques extensively.

📊 Probability

Computing expected values and probabilities from PDFs requires integration. Many distributions (like the normal) have no closed-form CDF.

🤖 Machine Learning

Bayesian inference requires integrals such as the marginal likelihood (evidence), which rarely have closed forms. Variational autoencoders approximate them with sampled estimates, and continuous normalizing flows integrate an ODE numerically.

📈 Financial Math

Option pricing (Black-Scholes), risk assessment, and portfolio optimization involve integrating probability distributions over possible outcomes.

From Riemann to Monte Carlo

In high dimensions, Riemann sums become impractical (the "curse of dimensionality"). Monte Carlo integration — which randomly samples points instead of using a regular grid — becomes essential. But the core idea of summing f(sample) × (volume) remains the same!
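To show how closely the two ideas are related, here is a minimal one-dimensional Monte Carlo estimate of the same integral used throughout this section. It is only an illustration of the principle (the real payoff comes in many dimensions); the function name, sample count, and fixed seed are choices made for this sketch.

```python
import random


def monte_carlo_integral(f, a, b, num_samples, seed=0):
    """Estimate the integral of f on [a, b] by averaging f at random points."""
    rng = random.Random(seed)  # local generator so the result is reproducible
    total = sum(f(rng.uniform(a, b)) for _ in range(num_samples))
    # Average height times interval length, the analogue of sum f(x_i*) * dx.
    return (b - a) * total / num_samples


estimate = monte_carlo_integral(lambda x: x ** 2, 0, 2, 100_000)
print(estimate)  # close to 8/3, with random error on the order of 0.01
```

The random error shrinks like $1/\sqrt{N}$ regardless of dimension, which is slower than the midpoint rule in one dimension but does not degrade as dimensions are added.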

Python Implementation

Let's implement all three methods and compare their convergence:

Implementing Riemann Sums in Python

riemann_sums.py

Left Riemann Sum Function

This function calculates the left Riemann sum by sampling the function at the left endpoint of each subinterval. For increasing functions, this will underestimate the true integral.

Example: for f(x) = x on [0, 2] with n = 2, the left sum is 0(1) + 1(1) = 1, but the exact value is 2.

Right Riemann Sum Function

The right Riemann sum samples at the right endpoint of each subinterval. For increasing functions, this overestimates the integral. Notice how we use (i + 1) * delta_x to get the right endpoint.

Midpoint Rule Function

The midpoint rule samples at the center of each subinterval using (i + 0.5) * delta_x. This typically gives better accuracy because errors from overestimating and underestimating tend to cancel out.

Test Function: x squared

We use f(x) = x^2 because we know its exact integral: the antiderivative is x^3/3, so the integral from 0 to 2 equals 8/3 - 0 = 8/3.

Convergence Demonstration

As n increases, all three methods converge to the exact value. However, the midpoint rule converges faster (error shrinking like 1/n² rather than 1/n). With n = 128, the midpoint error is roughly 770 times smaller than the left and right errors.

def left_riemann_sum(f, a, b, n):
    """
    Compute the Left Riemann Sum of f from a to b using n rectangles.

    The left endpoint of each subinterval determines the height.
    This underestimates the integral for increasing functions.
    """
    delta_x = (b - a) / n
    total = 0.0

    for i in range(n):
        x_i = a + i * delta_x  # Left endpoint
        total += f(x_i) * delta_x

    return total


def right_riemann_sum(f, a, b, n):
    """
    Compute the Right Riemann Sum of f from a to b using n rectangles.

    The right endpoint of each subinterval determines the height.
    This overestimates the integral for increasing functions.
    """
    delta_x = (b - a) / n
    total = 0.0

    for i in range(n):
        x_i = a + (i + 1) * delta_x  # Right endpoint
        total += f(x_i) * delta_x

    return total


def midpoint_rule(f, a, b, n):
    """
    Compute the Midpoint Rule approximation of f from a to b.

    Uses the midpoint of each subinterval for the height.
    This is more accurate than left/right sums (second-order).
    """
    delta_x = (b - a) / n
    total = 0.0

    for i in range(n):
        x_mid = a + (i + 0.5) * delta_x  # Midpoint
        total += f(x_mid) * delta_x

    return total


# Example: Approximate the integral of x^2 from 0 to 2
# Exact answer: [x^3/3] from 0 to 2 = 8/3 ≈ 2.6667

def f(x):
    return x ** 2


a, b = 0, 2
exact = 8 / 3  # = 2.6666...

print("Approximating integral of x^2 from 0 to 2")
print(f"Exact value: {exact:.6f}")
print()
print(f"{'n':>6} | {'Left':>12} | {'Right':>12} | {'Midpoint':>12} | {'Best Error':>12}")
print("-" * 65)

for n in [4, 8, 16, 32, 64, 128]:
    left = left_riemann_sum(f, a, b, n)
    right = right_riemann_sum(f, a, b, n)
    mid = midpoint_rule(f, a, b, n)

    # Smallest absolute error among the three methods for this n
    errors = [abs(left - exact), abs(right - exact), abs(mid - exact)]
    best_error = min(errors)

    print(f"{n:>6} | {left:>12.6f} | {right:>12.6f} | {mid:>12.6f} | {best_error:>12.6f}")

Visualizing the Three Methods

Here's code to create side-by-side visualizations:

Visualizing Riemann Sums with Matplotlib

visualize_riemann.py

Three-Panel Visualization

We create a side-by-side comparison of all three methods. This helps visualize why midpoint tends to be more accurate: it balances overestimates and underestimates.

Drawing Rectangles

Each rectangle is drawn using matplotlib's bar function. The height is determined by evaluating f at the appropriate sample point (left, right, or midpoint).

Sample Point Markers

We mark the sample point on each rectangle with a dot. This helps visualize where the function is being evaluated to determine each rectangle's height.

import numpy as np
import matplotlib.pyplot as plt


def visualize_riemann_sums(f, a, b, n, title="Riemann Sums"):
    """
    Create a visualization comparing all three Riemann sum types.
    """
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))

    delta_x = (b - a) / n
    x_curve = np.linspace(a, b, 200)
    y_curve = f(x_curve)

    methods = ['Left', 'Right', 'Midpoint']
    colors = ['#3B82F6', '#22C55E', '#A855F7']

    for ax, method, color in zip(axes, methods, colors):
        # Plot the curve
        ax.plot(x_curve, y_curve, 'r-', linewidth=2, label='f(x)')
        ax.fill_between(x_curve, y_curve, alpha=0.1, color='red')

        # Draw rectangles
        for i in range(n):
            left = a + i * delta_x
            right = left + delta_x

            if method == 'Left':
                height = f(left)
                sample_x = left
            elif method == 'Right':
                height = f(right)
                sample_x = right
            else:  # Midpoint
                sample_x = (left + right) / 2
                height = f(sample_x)

            # Draw rectangle
            ax.bar(left, height, width=delta_x, alpha=0.4,
                   color=color, edgecolor=color, align='edge')

            # Mark sample point
            ax.plot(sample_x, height, 'o', color=color, markersize=6)

        ax.set_title(f'{method} Rule (n={n})', fontsize=12, fontweight='bold')
        ax.set_xlabel('x')
        ax.set_ylabel('f(x)')
        ax.axhline(y=0, color='black', linewidth=0.5)
        ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.savefig('riemann_comparison.png', dpi=150, bbox_inches='tight')
    plt.show()


# Visualize for x^2 from 0 to 2 with 6 rectangles
visualize_riemann_sums(lambda x: x**2, 0, 2, 6)

Common Pitfalls

Pitfall 1: Confusing Indexing

Left sums use $f(x_0), f(x_1), \ldots, f(x_{n-1})$, while right sums use $f(x_1), f(x_2), \ldots, f(x_n)$. The indices differ by one. Draw a picture if confused!

Pitfall 2: Assuming Error Direction

Left sums only underestimate for increasing functions. For decreasing functions, left sums overestimate. For non-monotonic functions, you can't easily predict whether the sum is too high or too low.

Pitfall 3: Expecting Midpoint to Always Win

The midpoint rule is generally more accurate, but the error bound involves $|f''(x)|$ . If the second derivative is very large (highly curved function), the midpoint advantage diminishes. For linear functions (f'' = 0), the midpoint rule is exact even with n = 1!
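The claim about linear functions is easy to confirm with a single rectangle. The test function `line` below is an arbitrary example chosen for this sketch.

```python
def midpoint_rule(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + (i + 0.5) * dx) for i in range(n))


def left_riemann_sum(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + i * dx) for i in range(n))


def line(x):
    return 3 * x + 1


# Exact integral of 3x + 1 on [0, 2] is (3/2) * 2^2 + 2 = 8.
print(midpoint_rule(line, 0, 2, 1))     # 8.0
print(left_riemann_sum(line, 0, 2, 1))  # 2.0
```

With one rectangle the midpoint rule already returns the exact value, while the left sum is far off.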

Numerical Precision Considerations

For very large n, floating-point errors can accumulate. In practice, n = 1000 to n = 100,000 is usually sufficient. Beyond that, more sophisticated quadrature methods (Simpson's Rule, Gaussian quadrature) are preferred.

Test Your Understanding


In a left Riemann sum, where is the height of each rectangle determined?

Summary

We've explored three fundamental methods for approximating definite integrals using rectangles, each with distinct characteristics and trade-offs.

The Three Methods at a Glance

Method	Sample Point	Error Order	Best For
Left Riemann	Left endpoint xᵢ₋₁	O(1/n)	Simple bounds on increasing functions
Right Riemann	Right endpoint xᵢ	O(1/n)	Simple bounds on decreasing functions
Midpoint Rule	Center x̄ᵢ	O(1/n²)	General use — more accurate!

Key Takeaways

All three methods approximate the integral as a sum of rectangle areas: $\sum f(x_i^*) \cdot \Delta x$
The choice of sample point ($x_i^*$) affects accuracy but not the limiting value as $n \to \infty$
Midpoint is more accurate because errors from overestimating and underestimating tend to cancel
Midpoint error scales as $O(1/n^2)$, while left/right scale as $O(1/n)$
For monotonic functions, left and right sums bound the true integral from below and above (or vice versa)
These methods form the foundation for all numerical integration techniques in scientific computing

The Central Insight:

"The definite integral is nothing more than the limit of Riemann sums — the culmination of adding up infinitely many infinitesimally thin rectangles."

Coming Next: In the next section, we'll formalize the notation for Riemann sums using sigma notation, giving us a precise and compact way to express these sums mathematically.