Chapter 3
18 min read
Section 13 of 261

Summation Formulas and Series

Mathematical Foundations

Why Summations Are the Currency of Algorithm Analysis

Almost every running-time analysis you will write in this book ends with a summation. The reason is structural: a loop that runs nn times and does f(i)f(i) work on iteration ii costs exactly i=1nf(i)\sum_{i=1}^{n} f(i) primitive operations. Nested loops produce nested sums. Recursive calls unfolded into a recursion tree produce a sum over levels. Amortized analyses produce sums of credit. The whole subject of analysis of algorithms is, at one level, the systematic study of which sums can be replaced by closed forms — expressions you can evaluate in O(1)O(1) arithmetic ops instead of O(n)O(n).

Consider the simplest possible non-trivial loop:

🐍python
1total = 0
2for i in range(1, n + 1):
3    total += i
4return total

The cost (counting the addition as one unit) is nn operations, and the output the loop computes is i=1ni\sum_{i=1}^{n} i. Both halves of that sentence are summations. The cost summation gives us the running time; the value summation gives us a closed form — n(n+1)/2n(n+1)/2 — that lets us replace the entire loop with a single arithmetic expression. Knowing summation identities is what lets you collapse O(n)O(n) code into O(1)O(1) code, and what lets you simplify a four-page running-time derivation into a two-line statement.

Working slogan. Loops produce sums. Recursion produces sums. Probabilistic analyses produce sums. Master a dozen summation identities and you have already mastered most of the algebra you will do in this book.

Sigma Notation: A Compact Vocabulary

The capital sigma notation packs a loop into one symbol. The expression

i=1nai=a1+a2+a3++an\sum_{i=1}^{n} a_i = a_1 + a_2 + a_3 + \cdots + a_n

has four moving parts: the index ii, the lower bound 1, the upper bound nn, and the body aia_i. The bounds are inclusive on both ends: the sum has exactly n1+1=nn - 1 + 1 = n terms when the index runs from 1 to nn.

Three identities do almost all of the algebraic work you will ever do with sigma notation:

  • Linearity. i=1n(αai+βbi)=αi=1nai+βi=1nbi\sum_{i=1}^{n} (\alpha\, a_i + \beta\, b_i) = \alpha \sum_{i=1}^{n} a_i + \beta \sum_{i=1}^{n} b_i for any constants α,β\alpha, \beta. Constants slide out of sums; sums of sums split.
  • Splitting at an index. i=1nai=i=1mai+i=m+1nai\sum_{i=1}^{n} a_i = \sum_{i=1}^{m} a_i + \sum_{i=m+1}^{n} a_i for any 1m<n1 \leq m < n. Useful for recursion tree analyses where different levels behave differently.
  • Reindexing. i=1nai=j=0n1aj+1\sum_{i=1}^{n} a_i = \sum_{j=0}^{n-1} a_{j+1} (shift by 1), or more generally i=abf(i)=j=0baf(j+a)\sum_{i=a}^{b} f(i) = \sum_{j=0}^{b-a} f(j+a). This is the "rename the loop variable" identity.

A constant is just a sum of constants

A common slip: i=1ncc\sum_{i=1}^{n} c \neq c. The sum has nn copies of cc, so i=1nc=cn\sum_{i=1}^{n} c = c n. Watching for this catches a surprising number of off-by-an-nn bugs in undergrad analyses.

The Arithmetic Series: Gauss's Trick

Legend: in the 1780s, a German schoolmaster tried to keep a class of ten-year-olds quiet by asking them to add the integers from 1 to 100. Carl Friedrich Gauss handed in the answer in seconds: 50505050. He had spotted that the sum could be paired up:

S=1+2+3++99+100S=100+99+98++2+12S=101+101+101++101(100 copies)2S=100101S=1001012=5050.\begin{aligned} S &= 1 + 2 + 3 + \cdots + 99 + 100 \\ S &= 100 + 99 + 98 + \cdots + 2 + 1 \\ 2S &= 101 + 101 + 101 + \cdots + 101 \quad (\text{100 copies}) \\ 2S &= 100 \cdot 101 \\ S &= \frac{100 \cdot 101}{2} = 5050. \end{aligned}

The same pairing works for any nn:

i=1ni=n(n+1)2.\sum_{i=1}^{n} i = \frac{n(n+1)}{2}.

This is the most-used closed form in algorithm analysis. It appears every time a nested loop has the shape for i in range(n): for j in range(i):, because the inner loop runs 0+1+2++(n1)=n(n1)/20 + 1 + 2 + \cdots + (n-1) = n(n-1)/2 times in total. Any time you see a half-triangle of work, you should instantly think Θ(n2)\Theta(n^2).

A second proof: induction

Pairing is satisfying but only a sketch. Induction makes the same identity airtight. Base. For n=1n = 1: LHS =1= 1, RHS =12/2=1= 1 \cdot 2 / 2 = 1. Step. Assume i=1ni=n(n+1)/2\sum_{i=1}^{n} i = n(n+1)/2. Then

i=1n+1i=n(n+1)2+(n+1)=(n+1)n+22=(n+1)(n+2)2,\sum_{i=1}^{n+1} i = \frac{n(n+1)}{2} + (n+1) = (n+1) \cdot \frac{n + 2}{2} = \frac{(n+1)(n+2)}{2},

which is the formula with nn replaced by n+1n+1. QED.


Sum of Squares, Sum of Cubes, and Faulhaber

For squared and cubed indices the closed forms are:

  • i=1ni2=n(n+1)(2n+1)6\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}
  • i=1ni3=(n(n+1)2)2=(i=1ni)2\sum_{i=1}^{n} i^3 = \left(\frac{n(n+1)}{2}\right)^2 = \left(\sum_{i=1}^{n} i\right)^2

That last identity — the sum of cubes equals the square of the sum — is one of the small miracles of elementary number theory. It is also genuinely useful: it converts a cubic-looking nested-loop bound into a clean square.

Deriving the sum of squares by telescoping

A beautiful trick: take the algebraic identity (i+1)3i3=3i2+3i+1(i+1)^3 - i^3 = 3 i^2 + 3 i + 1 and sum both sides from i=1i = 1 to nn. The left side telescopes (almost everything cancels):

i=1n[(i+1)3i3]=(n+1)31.\sum_{i=1}^{n} \left[ (i+1)^3 - i^3 \right] = (n+1)^3 - 1.

The right side, by linearity of \sum, equals 3i2+3i+n3 \sum i^2 + 3 \sum i + n. We already know i=n(n+1)/2\sum i = n(n+1)/2. Plug in and solve for i2\sum i^2 and out pops the formula. A five-line proof for an identity that, written out, looks intimidating.

The general pattern: Faulhaber

The pattern continues: i=1nik\sum_{i=1}^{n} i^k is always a polynomial of degree k+1k+1 in nn, with leading term nk+1k+1\frac{n^{k+1}}{k+1}. Faulhaber's 17th century formula gives the explicit polynomial. For algorithm analysis, the only fact you usually need is the asymptotic statement:

i=1nik=Θ(nk+1)for any fixed k0.\sum_{i=1}^{n} i^k = \Theta(n^{k+1}) \qquad \text{for any fixed } k \geq 0.

Why this matters in practice

A nested loop where the inner loop does i2i^2 work on iteration ii runs in Θ(n3)\Theta(n^3) time, not Θ(n2)\Theta(n^2). The exponent goes up by one when you sum.

The Geometric Series: Three Regimes

After the arithmetic series, the geometric series is the second load-bearing identity in algorithms. For any r1r \neq 1:

i=0n1ri=1+r+r2++rn1=1rn1r=rn1r1.\sum_{i=0}^{n-1} r^i = 1 + r + r^2 + \cdots + r^{n-1} = \frac{1 - r^n}{1 - r} = \frac{r^n - 1}{r - 1}.

Proof. Let S=1+r+r2++rn1S = 1 + r + r^2 + \cdots + r^{n-1}. Then rS=r+r2++rnr S = r + r^2 + \cdots + r^n. Subtract: (r1)S=rn1(r - 1) S = r^n - 1. Divide. Two lines.

The behavior of this series splits sharply into three regimes, and which regime you are in dominates the asymptotic class:

RegimeClosed formAsymptoticWhere you see it
r < 1 (decreasing)1/(1-r) − r^n/(1-r) → 1/(1-r)Θ(1) (bounded by a constant!)Geometric backoff, decaying probabilities, fixed-point iteration
r = 1 (flat)nΘ(n)Trivial case: just n copies of 1
r > 1 (increasing)(r^n − 1)/(r − 1) ≈ r^n/(r−1)Θ(r^n) — last term dominatesRecursion-tree levels in T(n) = 2T(n/2) + n^2; doubling arrays

The decreasing case is the source of one of the most counterintuitive results in algorithm design: a sum with infinitely many positive terms can still be a constant. 1+1/2+1/4+1/8+=21 + 1/2 + 1/4 + 1/8 + \cdots = 2 is the canonical example. This is what makes amortized analysis of dynamic-array doubling work: copying happens infinitely often as the array grows, but the total copy cost over nn appends is bounded by a constant times nn, because the sequence of resize costs is geometric.

The dominant-term rule

For an increasing geometric series, only the last term matters asymptotically: i=0n12i=2n1=Θ(2n)\sum_{i=0}^{n-1} 2^i = 2^n - 1 = \Theta(2^n). The earlier terms are dwarfed. For a decreasing geometric series, only the first term matters: i=0(1/2)i=2\sum_{i=0}^{\infty} (1/2)^i = 2. This front-or-back asymmetry is what makes recursion-tree analysis work: you check whether the leaves or the root dominate.

The Harmonic Series and the Logarithm Hidden Inside

The harmonic numbers are the partial sums

Hn=i=1n1i=1+12+13++1n.H_n = \sum_{i=1}^{n} \frac{1}{i} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}.

Unlike the previous sums, HnH_n has no clean closed form in elementary functions. But its asymptotic behavior is known with great precision:

Hn=lnn+γ+12n112n2+O ⁣(1n4),H_n = \ln n + \gamma + \frac{1}{2n} - \frac{1}{12 n^2} + O\!\left(\frac{1}{n^4}\right),

where γ0.5772156649\gamma \approx 0.5772156649\ldots is the Euler-Mascheroni constant. For algorithm analysis the only thing that usually matters is the leading-order statement:

Hn=Θ(logn).H_n = \Theta(\log n).

Why is the harmonic sum logarithmic? The integral argument

The continuous analogue of the discrete sum i=1n1/i\sum_{i=1}^{n} 1/i is the integral 1n1xdx=lnn\int_1^n \frac{1}{x} \, dx = \ln n. Because 1/x1/x is decreasing, the staircase of rectangles of height 1/i1/i sandwiches the area under the curve from both sides. We get the two-sided bound

ln(n+1)    Hn    1+lnn,\ln(n+1) \;\leq\; H_n \;\leq\; 1 + \ln n,

which already gives Hn=Θ(logn)H_n = \Theta(\log n) with no further work.

Where the harmonic sum hides in algorithms

  • Quicksort average case. The expected number of comparisons satisfies a recurrence whose solution involves HnH_n; that is exactly why average-case quicksort is Θ(nlogn)\Theta(n \log n), not Θ(n)\Theta(n).
  • Coupon collector. The expected number of trials to collect all nn coupons is nHnnlnnn H_n \approx n \ln n. Used in load-balancer analysis, randomized algorithms, hashing.
  • Hash chains. Expected length of the longest chain in a hash table with nn keys and nn buckets is Θ(logn/loglogn)\Theta(\log n / \log \log n) — an analysis that uses harmonic-sum bounds.
  • DCG ranking metric. Search engines and recommenders discount the ii-th rank by 1/log2(i+1)1/\log_2(i+1). Total ideal-DCG sums to a harmonic-like quantity.
  • Disk seek scheduling, skip lists, treaps. Many analyses end with HnH_n and conclude O(logn)O(\log n).

Telescoping Sums

A telescoping sum is a sum whose body can be written as f(i+1)f(i)f(i+1) - f(i). When summed, all interior terms cancel:

i=1n[f(i+1)f(i)]=f(n+1)f(1).\sum_{i=1}^{n} \bigl[ f(i+1) - f(i) \bigr] = f(n+1) - f(1).

We already used this once (for the sum of squares). It is the discrete analogue of the fundamental theorem of calculus, and it is the cleanest way to evaluate a sum without guessing an answer first.

Worked example: a small partial-fraction telescope

Compute S=i=1n1i(i+1)S = \sum_{i=1}^{n} \frac{1}{i(i+1)}. Partial fractions: 1i(i+1)=1i1i+1\frac{1}{i(i+1)} = \frac{1}{i} - \frac{1}{i+1}. So

S=i=1n(1i1i+1)=11n+1=nn+1.S = \sum_{i=1}^{n} \left( \frac{1}{i} - \frac{1}{i+1} \right) = 1 - \frac{1}{n+1} = \frac{n}{n+1}.

Three lines, no induction, exact closed form. Telescoping is the algebra-level shortcut you should reach for whenever you see a difference structure.


Bounding Sums by Integrals

When no closed form exists, you can still pin down the asymptotic class of a sum by bounding it above and below with integrals. For a monotone function ff on [1,n][1, n]:

If ff is increasing, 0nf(x)dx    i=1nf(i)    1n+1f(x)dx.\int_0^n f(x) \, dx \;\leq\; \sum_{i=1}^{n} f(i) \;\leq\; \int_1^{n+1} f(x) \, dx. If ff is decreasing, the inequalities flip:

1n+1f(x)dx    i=1nf(i)    f(1)+1nf(x)dx.\int_1^{n+1} f(x) \, dx \;\leq\; \sum_{i=1}^{n} f(i) \;\leq\; f(1) + \int_1^n f(x) \, dx.

This is, hands down, the most useful technique for asymptotic bounds when you cannot find a closed form. Examples:

  • i=1nlogi    1nlogxdx=nlognn+1=Θ(nlogn)\sum_{i=1}^{n} \log i \;\approx\; \int_1^n \log x \, dx = n \log n - n + 1 = \Theta(n \log n). That single line is the analysis of comparison-sort lower bounds and of log(n!)\log(n!) (Stirling).
  • i=1ni    1nxdx=23(n3/21)=Θ(n3/2)\sum_{i=1}^{n} \sqrt{i} \;\approx\; \int_1^n \sqrt{x} \, dx = \tfrac{2}{3}(n^{3/2} - 1) = \Theta(n^{3/2}).
  • i=1n1i2\sum_{i=1}^{n} \frac{1}{i^2} is bounded by 1+1dxx2=21 + \int_1^\infty \frac{dx}{x^2} = 2 — constant! In fact it converges to π2/6\pi^2/6 (the Basel problem).

A discrete-vs-continuous reflex

Whenever you see f(i)\sum f(i) with ff monotone, your first instinct should be: "What does f(x)dx\int f(x) \, dx look like?" The integral and the sum agree to leading order, and the integral is often something you can do in your head.

Interactive: The Summation Estimator

Pick a summation type, slide nn, and watch the amber staircase (the brute-force loop's running cumulative total) meet the blue dashed curve (the closed-form formula evaluated at the same nn). They land at the same point because the closed form is, by construction, equal to the loop — just evaluable in O(1)O(1) instead of O(n)O(n) arithmetic.

0.0531.1e+21.6e+22.1e+2i (1 to n)brute-force loopclosed form
formula
i=1ni=n(n+1)2\sum_{i=1}^{n} i = \frac{n(n+1)}{2}
closed form @ n = 20
210.00
loop sum @ n = 20
210.00
asymptotic class
Θ(n²)
✓ closed form ≡ loop (exact)

Things to notice as you flip between the choices:

  • Σ i grows like a parabola; the staircase hugs the quadratic curve.
  • Σ 2^i shoots straight up: only the last bar matters — this is the "dominant-term rule" for increasing geometric series, visible to the eye.
  • Σ (1/2)^i levels off at 2. You can crank nn to 100 and it stays bounded — that is what Θ(1)\Theta(1) looks like for an infinite sum.
  • Σ 1/i grows but very slowly. The shape is lnn\ln n, just shifted up by γ\gamma.

Closed Form vs. Loop, in Two Languages

When you have a closed form, evaluating it is O(1)O(1) — a handful of arithmetic operations no matter how big nn is. The loop version is O(n)O(n). For n=109n = 10^9, that is the difference between a nanosecond and a second. Below we compute three sums both ways and assert they agree.

Python: side-by-side closed form vs. loop

Closed form vs. loop, with timing
🐍summation_demo.py
4Function signature

Takes n, returns a dict with both methods' results and microsecond timings for the three classical sums.

EXAMPLE
closed_form_vs_loop(20) returns six numbers plus six timings.
8Start arithmetic-sum loop timer

perf_counter() is monotonic and high-resolution. We time only the work, not the function-call overhead.

9Initialize accumulator

arith_loop starts at 0 and will accumulate the running total. After n iterations it will hold 1 + 2 + ... + n.

EXAMPLE
n = 20: starts at 0, ends at 210.
10for i in range(1, n+1)

Standard Python idiom: range(1, n+1) yields 1, 2, ..., n. Note the +1 — Python's range is half-open at the top.

EXAMPLE
n = 20 iterates i over 1, 2, ..., 20 — twenty additions.
11Accumulate i

After iteration k, arith_loop holds k(k+1)/2 by Gauss's identity. The loop has done k additions to get there.

EXAMPLE
After i = 5: arith_loop = 1+2+3+4+5 = 15. Verify: 5·6/2 = 15. ✓
13Closed form: n(n+1)//2

Three operations: one multiplication, one addition, one floor-division by 2. O(1) regardless of n. We use // (integer division) so the result stays an exact int.

EXAMPLE
n = 20: 20 · 21 / 2 = 420 / 2 = 210. Three ops total.
15Assert exact equality

This is the key sanity check: the loop and the closed form MUST produce identical integers. If they ever disagreed, one of them would be wrong.

EXAMPLE
n = 20: 210 == 210 ✓.
19Geometric loop initialization

p tracks the current power of 2. We start at 1 = 2^0 and double each iteration. Avoids recomputing 2**i from scratch each time.

EXAMPLE
Iteration sequence of p: 1, 2, 4, 8, 16, ...
20Range over n iterations

The underscore _ is Pythonic for 'I don't care about the loop variable'; we only need n iterations.

21Add current power

Each iteration adds the next power of 2: 1, then 1+2, then 1+2+4, ... After n iterations: 2^n - 1.

EXAMPLE
n = 20 after iter 1: geom_loop = 1. After iter 2: 3. After iter 3: 7. After iter 20: 1,048,575.
22Double p

Update p for the NEXT iteration. p *= 2 is one multiplication; faster than recomputing 2**i.

EXAMPLE
After iter 1: p = 2. After iter 2: p = 4. After iter 20: p = 2^20 = 1,048,576.
24Closed form: (1 << n) - 1

Bit shift: 1 << n is exactly 2^n in Python (and is O(1) for fixed n in machine words; arbitrary-precision for big n). Subtract 1 to get the geometric-series formula 2^n − 1.

EXAMPLE
n = 20: 1 << 20 = 1,048,576. Minus 1: 1,048,575. Same as the loop.
31Harmonic loop initializer

Note 0.0 (float) — we are summing fractions. If we used 0 (int), 1/i would still be float in Python 3, but starting as float makes the type contract explicit.

33Add reciprocal

Adds 1/i. After iteration k, harm_loop holds H_k = 1 + 1/2 + ... + 1/k. Floating-point accumulation introduces O(n · ε) rounding error, but for n ≤ 10^7 the result is accurate to many digits.

EXAMPLE
n = 20 after iter 1: harm_loop = 1.0. After iter 2: 1.5. After iter 20: ~3.5977.
35Asymptotic approximation

There is no exact closed form for H_n in elementary functions. We use the leading-order expansion ln(n) + γ. The error is about 1/(2n), so for n = 20 the gap is ~0.025 — visible at the displayed digits but small.

EXAMPLE
n = 20: ln(20) ≈ 2.9957, plus γ ≈ 0.5772 = 3.5729. True H_20 ≈ 3.5977. Gap ≈ 0.0248 ≈ 1/(2·20) ✓.
38Return everything

We return both values AND the wall-clock times. For n = 1,000,000 the loop takes milliseconds while the closed form takes nanoseconds — visible in the timings.

35 lines without explanation
1import math
2import time
3
4def closed_form_vs_loop(n: int) -> dict:
5    """Compute three classical sums by both brute-force loop and closed form,
6    assert exact agreement, and time both methods."""
7    # --- Arithmetic sum: 1 + 2 + ... + n ---
8    t0 = time.perf_counter()
9    arith_loop = 0
10    for i in range(1, n + 1):
11        arith_loop += i
12    t1 = time.perf_counter()
13    arith_closed = n * (n + 1) // 2
14    t2 = time.perf_counter()
15    assert arith_loop == arith_closed
16
17    # --- Geometric sum: 1 + 2 + 4 + ... + 2^(n-1)  (r = 2) ---
18    t3 = time.perf_counter()
19    geom_loop = 0
20    p = 1
21    for _ in range(n):
22        geom_loop += p
23        p *= 2
24    t4 = time.perf_counter()
25    geom_closed = (1 << n) - 1     # 2**n - 1
26    t5 = time.perf_counter()
27    assert geom_loop == geom_closed
28
29    # --- Harmonic sum: 1 + 1/2 + 1/3 + ... + 1/n ---
30    t6 = time.perf_counter()
31    harm_loop = 0.0
32    for i in range(1, n + 1):
33        harm_loop += 1.0 / i
34    t7 = time.perf_counter()
35    harm_approx = math.log(n) + 0.5772156649  # Euler-Mascheroni constant
36    t8 = time.perf_counter()
37
38    return {
39        "n": n,
40        "arith":    {"loop": arith_loop, "closed": arith_closed,
41                     "loop_us": (t1 - t0) * 1e6, "closed_us": (t2 - t1) * 1e6},
42        "geometric":{"loop": geom_loop,  "closed": geom_closed,
43                     "loop_us": (t4 - t3) * 1e6, "closed_us": (t5 - t4) * 1e6},
44        "harmonic": {"loop": harm_loop,  "approx": harm_approx,
45                     "loop_us": (t7 - t6) * 1e6, "approx_us": (t8 - t7) * 1e6},
46    }
47
48print(closed_form_vs_loop(20))
49# n=20: arith_loop=210, arith_closed=210
50#       geom_loop=1048575, geom_closed=1048575 (= 2^20 - 1)
51#       harm_loop=3.5977..., ln(20)+gamma=3.5728  (gap shrinks like 1/(2n))

TypeScript: same identities, BigInt where needed

The same idea in TypeScript — the closed-form path stays O(1)O(1); only the loop scales with nn. We switch to BigInt for the geometric sum because 2^n overflows JS's 253 safe-integer range past n=53n = 53.

Same identities, TypeScript / BigInt
📘summation_demo.ts
1TypeScript signature

Takes a JS number for n. Returns a structured result with both values and a boolean ok flag verifying loop ≡ closed form.

4Loop accumulate

Standard for-loop sum. After the loop runs, arithLoop holds 1 + 2 + ... + n. n iterations, one addition each.

EXAMPLE
n = 20: 0 → 1 → 3 → 6 → 10 → ... → 210.
5Closed form in JS

Three ops: multiply, add, divide-by-2. Beware: for n > 2^26 the multiplication n*(n+1) overflows JS's 2^53 safe integer range — for huge n you would switch to BigInt.

EXAMPLE
n = 20: 20 * 21 / 2 = 210. Exact.
8Why BigInt for the geometric sum

JS doubles only hold integers exactly up to 2^53. For n = 60, 2^60 ≈ 1.15 · 10^18 already exceeds that. BigInt (the n suffix) gives arbitrary precision.

10BigInt loop

p *= 2n doubles a BigInt each iteration. After k iterations p holds 2^k.

EXAMPLE
After iter 60: p = 2^60 = 1152921504606846976n.
11Bit-shift closed form

1n << BigInt(n) is exactly 2^n. Subtract 1n for the geometric-series formula 2^n − 1. O(1) in shifts, O(n) in bits — but bit-counting overhead doesn't show up in our running-time analysis at this level.

EXAMPLE
n = 20: (1n << 20n) - 1n = 1048575n. Same as the loop.
14Float harmonic sum

Same 1/i accumulation. Floating-point: rounding error roughly n · 10^-16 per accumulation, so for n ≤ 10^9 the answer is accurate to about 7 significant digits.

EXAMPLE
n = 20: harmLoop ≈ 3.5977396571.
15Asymptotic approximation

ln n + γ. The error is O(1/n), so this is useful for sanity checks at moderate n and is exact in the leading-order asymptotic sense.

EXAMPLE
n = 20: Math.log(20) + 0.5772 ≈ 3.5729. Gap to true value ≈ 0.0248 ≈ 1/(2 · 20).
19Strict equality (===) for ints

For arithmetic and geometric sums the answers are exact integers; we can use ===. For harmonic we report the gap instead, because floating-point and asymptotic-approximation cannot match exactly.

22 lines without explanation
1function summationDemo(n: number) {
2  // ---- Arithmetic ----
3  let arithLoop = 0;
4  for (let i = 1; i <= n; i++) arithLoop += i;
5  const arithClosed = (n * (n + 1)) / 2;
6
7  // ---- Geometric (r = 2) using BigInt to stay exact ----
8  let geomLoop = 0n;
9  let p = 1n;
10  for (let i = 0; i < n; i++) { geomLoop += p; p *= 2n; }
11  const geomClosed = (1n << BigInt(n)) - 1n;
12
13  // ---- Harmonic ----
14  let harmLoop = 0;
15  for (let i = 1; i <= n; i++) harmLoop += 1 / i;
16  const harmApprox = Math.log(n) + 0.5772156649;  // Euler-Mascheroni
17
18  return {
19    arith:     { loop: arithLoop, closed: arithClosed,
20                 ok: arithLoop === arithClosed },
21    geometric: { loop: geomLoop,  closed: geomClosed,
22                 ok: geomLoop === geomClosed },
23    harmonic:  { loop: harmLoop,  approx: harmApprox,
24                 gap: Math.abs(harmLoop - harmApprox) },
25  };
26}
27
28console.log(summationDemo(20));
29// arith.ok    = true  (210 === 210)
30// geometric.ok = true (1048575n === 1048575n)
31// harmonic.gap ≈ 0.0248 (matches 1/(2n) = 0.025)

The point of two languages, again

Notice that the algebra didn't change. n(n+1)/2n(n+1)/2 is n(n+1)/2n(n+1)/2 whether you write it in Python, TypeScript, Rust, or on a whiteboard. The language only affects how you handle integer overflow and floating-point error — implementation details, not the mathematics.

Three Worked Algorithm-Cost Calculations

Time to apply the machinery. Three classical algorithms, three different summations, three sharply different asymptotic conclusions.

1. Insertion sort: an arithmetic-series argument for Θ(n²)

Insertion sort, in its worst case, scans backwards through the sorted prefix to find the right slot for the next element. Iteration ii (with ii from 11 to n1n - 1) does up to ii comparisons. Total worst-case comparison count:

T(n)=i=1n1i=(n1)n2=Θ(n2).T(n) = \sum_{i=1}^{n-1} i = \frac{(n-1)n}{2} = \Theta(n^2).

Two lines, no hand-waving. The arithmetic-series identity converted a loop count into a clean asymptotic. Insertion sort is Θ(n2)\Theta(n^2) in the worst case — a fact that justifies why merge sort, quicksort, and heapsort exist.

2. Building a heap: top-down vs. bottom-up

A binary heap of nn elements can be built two ways. The naive way: insert elements one at a time, each insertion sifting up at most logi\log i levels:

Ttop(n)=i=1nlogi=log(n!)=Θ(nlogn).T_{\text{top}}(n) = \sum_{i=1}^{n} \log i = \log(n!) = \Theta(n \log n).

We used the integral bound logilogxdx=nlognn+1\sum \log i \approx \int \log x \, dx = n \log n - n + 1 (or equivalently Stirling's formula for log(n!)\log(n!)).

The clever way (Floyd's heap construction): start from the bottom and sift down. A node at height hh does at most hh work, and there are about n/2h+1n / 2^{h+1} nodes at height hh. Total work:

Tbot(n)=h=0lognn2h+1h=n2h=0lognh2h.T_{\text{bot}}(n) = \sum_{h=0}^{\log n} \frac{n}{2^{h+1}} \cdot h = \frac{n}{2} \sum_{h=0}^{\log n} \frac{h}{2^h}.

The inner sum is bounded by the closed form h=0h2h=2\sum_{h=0}^{\infty} \frac{h}{2^h} = 2 (a standard arithmetico-geometric series — differentiate the geometric formula). So Tbot(n)n22=n=Θ(n)T_{\text{bot}}(n) \leq \frac{n}{2} \cdot 2 = n = \Theta(n). We just turned an O(nlogn)O(n \log n) algorithm into an O(n)O(n) algorithm by reordering the work and invoking a bounded geometric-like series. This is the actual reason heapify in the standard library uses bottom-up.

3. Tower of Hanoi: a geometric series for 2ⁿ − 1

Tower of Hanoi with nn disks satisfies the recurrence T(n)=2T(n1)+1T(n) = 2 T(n-1) + 1 with T(1)=1T(1) = 1. Unfolding:

T(n)=2T(n1)+1=2(2T(n2)+1)+1=4T(n2)+2+1=8T(n3)+4+2+1 =2n1T(1)+2n2+2n3++1=i=0n12i=2n1.\begin{aligned} T(n) &= 2 T(n-1) + 1 \\ &= 2(2 T(n-2) + 1) + 1 = 4 T(n-2) + 2 + 1 \\ &= 8 T(n-3) + 4 + 2 + 1 \\ &\ \vdots \\ &= 2^{n-1} T(1) + 2^{n-2} + 2^{n-3} + \cdots + 1 \\ &= \sum_{i=0}^{n-1} 2^i = 2^n - 1. \end{aligned}

The geometric-series identity took us from a recurrence to a closed form in one line. T(n)=2n1T(n) = 2^n - 1 is exact, not asymptotic — for 64 disks the famous answer is 26411.8×10192^{64} - 1 \approx 1.8 \times 10^{19} moves. The same technique handles T(n)=2T(n/2)+nT(n) = 2 T(n/2) + n (merge sort: gives Θ(nlogn)\Theta(n \log n)) and any divide-and-conquer recurrence in the Master Theorem family.


Where These Sums Show Up in Real Systems

These are not classroom curiosities. Each summation identity is doing work in production code right now.

System / domainSum that appearsWhat it tells you
Quicksort (numpy.sort, std::sort, V8 Array.sort fallback)Average comparisons ≈ 2 n H_nAverage runtime is Θ(n log n), not Θ(n)
Dynamic-array doubling (Python list, C++ vector, Java ArrayList)Total resize cost = Σ 2^i ≤ 2 · 2^k = O(n)Amortized append is O(1)
Hash-table chain analysisExpected chain length involves H_nAverage lookup is O(1); worst-case chain is Θ(log n / log log n)
TCP exponential backoff / Ethernet CSMAΣ 2^i wait slots up to ceilingTotal wait grows geometrically until cap; expected collisions stay bounded
Recommendation ranking (DCG metric)Σ rel_i / log_2(i+1)Discount factor ≈ 1/log i caps top-k contribution; ties to harmonic asymptotic
Compound interest, mortgage amortizationΣ (1+r)^i annuity factorGeometric series gives exact closed-form payment formula
PageRank power iterationConvergence rate Σ λ_2^kGeometric → number of iterations is O(log(1/ε)/log(1/λ_2))
Skip lists / treapsExpected height = H_nO(log n) search/insert with high probability
Bloom filter false-positive rate(1 − e^{−kn/m})^k via lim Σ identityOptimal hash count k = (m/n) ln 2
Coupon collector (load balancers, ML data sampling)Σ n/(n−i+1) = n H_nNeed Θ(n log n) random draws to see all n items
Numerical integration (trapezoid rule)Σ f(x_i) ΔxThe sum IS the algorithm
Computer graphics LOD / mipmap memoryΣ (1/4)^i image sizeMipmap pyramid uses ≤ 4/3 of original memory (geometric, r = 1/4)

Notice how often the same three or four series — arithmetic, geometric, harmonic, and their integrals — reappear. The world of algorithms is mostly built out of a small toolkit; you are not learning a list of formulas, you are learning the few that the rest of the field actually uses.


Pitfalls and Sanity Checks

A short list of mistakes that everyone makes once. The point is to make them once and never again.

  • Off-by-one in the bounds. i=0n\sum_{i=0}^{n} has n+1n+1 terms, not nn. i=1n1=n\sum_{i=1}^{n} 1 = n, but i=0n1=n+1\sum_{i=0}^{n} 1 = n+1. Always count by subtracting endpoints and adding one.
  • Forgetting to multiply a constant by nn. i=1nc=cn\sum_{i=1}^{n} c = c n, not cc. Easy to drop in the middle of a longer derivation.
  • Confusing geometric with arithmetic. 2i\sum 2 i is arithmetic (linear terms, Θ(n2)\Theta(n^2)); 2i\sum 2^i is geometric (exponential terms, Θ(2n)\Theta(2^n)). The position of the ii matters.
  • Mistaking a divergent sum for a constant. The harmonic sum diverges: i=11/i=\sum_{i=1}^{\infty} 1/i = \infty. It is Θ(logn)\Theta(\log n) as a partial sum, but it is not bounded. Contrast with 1/i2=π2/6\sum 1/i^2 = \pi^2/6, which is bounded.
  • Treating (1/2)n(1/2)^n as "goes to zero so I can drop it" when working out exact closed forms. It does asymptotically, but for finite nn a sum like 2(1/2)n12 - (1/2)^{n-1} is not exactly 2.
  • Double-counting when splitting an index. If you split i=1n\sum_{i=1}^{n} as i=1m+i=m+1n\sum_{i=1}^{m} + \sum_{i=m+1}^{n}, the ranges must be disjoint. i=1m+i=mn\sum_{i=1}^{m} + \sum_{i=m}^{n} would include ama_m twice.

Always sanity-check at small n

Whenever you derive a closed form, plug in n=1n = 1 and n=2n = 2 and compare to the loop's output. The arithmetic sum at n=1n = 1: loop gives 1, formula 12/2=11 \cdot 2 / 2 = 1. ✓ At n=2n = 2: loop gives 3, formula gives 23/2=32 \cdot 3 / 2 = 3. ✓ Two ten-second checks catch nine out of ten algebra errors.

Quick Check

Compute Σᵢ₌₁¹⁰⁰ (3i + 2) using linearity and the arithmetic-series formula.

Quick Check

Which is the correct asymptotic class for Σᵢ₌₁ⁿ log i?


We now have the algebraic vocabulary — sigma notation, arithmetic and geometric and harmonic identities, telescoping, integral bounds — needed to convert almost any loop or recurrence into a clean asymptotic statement. In the next section we extend the same toolkit to recurrences directly, deriving the Master Theorem, which compresses the three-way comparison "leaves vs. internal vs. work per level" into a single recipe. Every recurrence we solve there will dissolve into one of the summations we just learned to evaluate.

Loading comments...