The Core Idea: Reduce, Then Solve
Algorithmic thinking is, at its heart, the art of not solving the problem you were given. Instead, you transform that problem — usually too large or too tangled to attack head-on — into one or more strictly easier instances whose answers can be assembled into the answer you actually want. Decomposition is the lever; recursion, induction, and the asymptotics of are the machinery that makes the lever safe.
Formally, given a problem instance of size , we seek a function producing subproblems each of size strictly smaller than , together with a combining function . When , is a base case whose answer is known directly.
Strategic question. “What is the smallest move I can make on this problem that turns it into a problem I either already know how to solve, or that has the same shape but is strictly easier?”
Almost every classical algorithm — merge sort, binary search, FFT, Dijkstra, Strassen's matrix multiplication, dynamic programming over a DAG — is the answer to that question for some specific . The strategies below are the recurring shapes the answer takes.
The Five Decomposition Strategies
Decomposition strategies are not disjoint — many real algorithms are hybrids — but each captures a distinct structural insight. The characteristic differences are: how many subproblems are produced, how much smaller they are, whether they overlap, and whether the input itself must first be transformed.
| Strategy | Subproblems | Recurrence shape | Canonical example |
|---|---|---|---|
| Divide & Conquer | ≥ 2, independent, disjoint | T(n) = a·T(n/b) + f(n) | Merge sort, FFT, Strassen |
| Decrease & Conquer | Exactly 1, slightly smaller | T(n) = T(n−c) + f(n) or T(n/c) + f(n) | Binary search, Euclid GCD, insertion sort |
| Transform & Conquer | Same problem, easier representation | T(n) = transform(n) + solve(n') | Heapsort (build-heap), Gaussian elimination, hashing |
| Brute Force / Exhaustive | All candidate solutions | T(n) = |candidates| · check(n) | Try-all-subsets, generate-and-test |
| Dynamic Programming | Many, overlapping | T(n) = #states · transition cost | Fibonacci, LCS, edit distance, Bellman-Ford |
1. Divide and Conquer
Split the input into independent pieces, recursively solve each, then merge. The hallmark is independence: subproblem answers do not depend on one another, so the recursive calls can in principle run in parallel. Merge sort splits an array of size into halves of size and merges them in linear time, yielding . The Fast Fourier Transform splits a length- signal into even-indexed and odd-indexed halves, recurses, and combines with twiddle factors in . Strassen multiplies two matrices using 7 recursive multiplications of half-sized matrices instead of the naive 8, giving .
2. Decrease and Conquer
Make the problem one step smaller — reduce by a constant (usually 1), or by a constant factor. There is exactly one recursive call. Binary search shrinks the candidate range by half each step (decrease by factor); the Euclidean algorithm replaces with , decreasing by a variable but provably factor; insertion sort sorts by sorting and inserting in place (decrease by 1).
3. Transform and Conquer
Don't shrink the input — change its representation so that the original problem becomes easy. Three sub-flavors:
- Instance simplification — presort, then run a simpler algorithm. Finding the median of a sorted array is .
- Representation change — heapsort first builds a binary heap from the input array (), then repeatedly extracts the max ( each). The tree-shaped heap turns “find max” from a linear scan into a constant-time peek.
- Problem reduction — map your problem to one with a known solver. Counting inversions reduces to a tweaked merge sort; shortest-path on a DAG reduces to topological order plus relaxation; maximum-flow reduces to linear programming.
4. Brute Force / Exhaustive Search
Enumerate every candidate solution and test each. Almost always suboptimal, but invaluable as: a baseline for measuring speedups; a correctness oracle for randomized testing of optimized algorithms; the only known approach for many NP-hard problems at small (e.g., the traveling salesman with ). The decomposition is trivial — “try this candidate, then try the rest” — but the candidate space has size , , or worse.
5. Dynamic Programming / Memoization
Use this when subproblems overlap — that is, when a naive recursion would solve the same subproblem an exponential number of times. The naive recursive Fibonacci recomputes twice, three times, and roughly times where . Caching each subproblem the first time it's solved collapses the exponential tree into a DAG of size . The same trick rescues longest common subsequence (), edit distance, knapsack, matrix-chain multiplication, and the Bellman equation in reinforcement learning.
Definition: Overlapping Subproblems
The Decomposition Tree
Every recursive decomposition induces a tree: the root is the original problem, the children of any node are the subproblems produced by , and the leaves are base cases. The shape of this tree is the geometry of the algorithm.
| Strategy | Tree shape | Depth | Leaves |
|---|---|---|---|
| Divide & Conquer (merge sort) | Balanced binary | log₂ n | n |
| Decrease & Conquer (linear) | A long stick | n | 1 |
| Decrease & Conquer (halving) | A long stick | log₂ n | 1 |
| Brute force (subsets) | Balanced binary, full | n | 2ⁿ |
| DP after memoization | DAG, not a tree | depends | O(states) |
The total cost of an algorithm is . Equivalently, when work is uniform across a level. Merge sort has depth and work per level → . A naive recursive Fibonacci has depth and a roughly doubling number of nodes per level → .
Recurrences: The Algebra of Decomposition
A recurrence is an equation (or inequality) that expresses the running time of a recursive algorithm on input of size in terms of its running time on smaller inputs. It is the algebraic form of the decomposition tree.
The most-cited general form is the divide-and-conquer recurrence:
Read this as: “solving a problem of size costs recursive solves on subproblems of size , plus non-recursive work to split and combine.” Specific instances:
| Algorithm | a | b | f(n) | Closed form for T(n) |
|---|---|---|---|---|
| Binary search | 1 | 2 | Θ(1) | Θ(log n) |
| Merge sort | 2 | 2 | Θ(n) | Θ(n log n) |
| Karatsuba mult. | 3 | 2 | Θ(n) | Θ(n^{log₂ 3}) ≈ Θ(n^{1.585}) |
| Strassen mult. | 7 | 2 | Θ(n²) | Θ(n^{log₂ 7}) ≈ Θ(n^{2.807}) |
| Naive matrix mult. | 8 | 2 | Θ(n²) | Θ(n³) |
The Master Theorem reads off directly from by comparing with . We'll prove and apply it in Chapter 2; for now the take-away is that decomposition shape determines asymptotic running time, not constants and not implementation tricks.
Why this matters
Quick Check
An algorithm solves a size-n problem by making 4 recursive calls on subproblems of size n/2, plus Θ(n²) combining work. Which closed form is correct?
Worked Example 1: Sum of an Array
Decrease-and-conquer in its purest form. To sum , recursively sum and add . Base case: an empty slice sums to 0.
The recurrence is , which solves to . Stack depth is also — for on CPython this will hit the recursion limit, a textbook reason to prefer iteration when the decomposition is linear.
Worked Example 2: Maximum Subarray (Three Decompositions)
Given an array of integers (possibly negative), find the maximum sum achievable by any contiguous subarray. Example: for the answer is 6, attained by the subarray . This problem is famous because it admits three textbook decompositions of strikingly different cost, making it ideal for studying how decomposition shape drives running time.
Decomposition A: Brute Force —
Enumerate every pair with , sum from scratch, take the max. Three nested loops: pairs and a sum of length up to for each.
1def max_subarray_brute(a):
2 n = len(a)
3 best = a[0]
4 for i in range(n):
5 for j in range(i, n):
6 s = sum(a[i:j+1]) # O(j - i + 1) work
7 if s > best:
8 best = s
9 return bestDecomposition B: Divide and Conquer —
Split at the midpoint . The optimal subarray either (i) lies entirely in the left half, (ii) lies entirely in the right half, or (iii) crosses . Cases (i) and (ii) are recursive calls; case (iii) is the only one we have to handle directly — sweep outward from in both directions tracking the best suffix sum to the left and best prefix sum to the right.
1def max_subarray_dc(a, lo, hi):
2 if lo == hi: # base case: single element
3 return a[lo]
4 m = (lo + hi) // 2
5
6 # case (i) and (ii): entirely on one side
7 best_left = max_subarray_dc(a, lo, m)
8 best_right = max_subarray_dc(a, m + 1, hi)
9
10 # case (iii): crosses m — best suffix of A[lo..m] + best prefix of A[m+1..hi]
11 s = 0; left_max = float("-inf")
12 for k in range(m, lo - 1, -1):
13 s += a[k]
14 left_max = max(left_max, s)
15 s = 0; right_max = float("-inf")
16 for k in range(m + 1, hi + 1):
17 s += a[k]
18 right_max = max(right_max, s)
19
20 return max(best_left, best_right, left_max + right_max)The cross-case sweep is , giving the recurrence → .
Decomposition C: Kadane (DP) —
Define as the maximum sum of a contiguous subarray ending exactly at index . Then either that subarray is the single element alone, or it extends the best subarray ending at :
The answer is . Because each depends only on we keep one rolling scalar. This is Kadane's algorithm, and it's the textbook example of how transforming a brute-force search into a DP with overlapping-subproblems structure collapses down to .
Lesson from this example
Kadane in Two Languages
First, pure Python from scratch — no libraries, no tricks — so the algorithm is naked.
Now a vectorized PyTorch version — useful when you have a batch of sequences (e.g., per-instrument returns in a finance pipeline, or per-channel anomaly scores in time-series models) and want to fuse the sweep into a single GPU kernel via on prefix sums.
Both implementations compute the same answer, but the second one leverages a transform-and-conquer insight: rewriting the problem in terms of prefix sums turns it into a difference query over running minima, which fuses cleanly onto vector hardware. Same problem — better representation.
Interactive: Decomposing Merge Sort
Decomposition trees are easier to feel than to describe. The widget below walks merge sort through both phases on an 8-element array. Press Step to either split a node or merge a pair.
How to read this. Step forward to watch split itself into halves until each piece has length 1 (the base case). Then watch sorted pieces merge back upward in post-order. The split phase makes levels; the merge phase does work per level — total .
Things to notice as you step through:
- The split phase is a pre-order walk — we always handle a parent before its children.
- The merge phase is the mirror image: a post-order walk — a parent can only merge after both its children have already merged. This is induction in motion.
- The tree has exactly levels (excluding the leaves) and leaves — matching the asymptotics for .
- Each level processes elements in total during merge, and there are levels → the famous .
Decomposition in Real Systems
The five strategies aren't academic exercises — they are the skeletons of the systems you use every minute.
| System | Strategy | How decomposition shows up |
|---|---|---|
| Google MapReduce / Spark | Divide & Conquer | Map: split a 100 TB log file into independent 64 MB shards. Reduce: combine per-shard answers. T(n) = a · T(n/a) + Θ(n) at each stage. |
| PostgreSQL query planner | Transform & Conquer | Rewrites SQL into a relational-algebra DAG, then picks join orders via DP over connected subsets — bitmasked Bellman-Held-Karp on subgraphs of size ≤ 12. |
| GCC / LLVM | Divide & Conquer + DP | Recursive descent splits source by grammar rules; instruction selection on the resulting tree is DP (tree pattern matching, Aho-Johnson). |
| Operating system schedulers (CFS) | Transform & Conquer | Represent the process set as a red-black tree keyed by virtual runtime → 'pick next process' becomes O(log n) tree-min instead of O(n) scan. |
| TCP/IP routing (BGP, OSPF) | Divide & Conquer (hierarchy) | The Internet is recursively partitioned into autonomous systems → areas → subnets, so route lookup is bounded by tree depth, not the global table size. |
| MP3, JPEG, H.264, DSP | Divide & Conquer | FFT in Θ(n log n) (Cooley-Tukey) splits a length-n DFT into two length-n/2 DFTs. Without it, modern audio/video codecs would not exist. |
| Google Maps routing | DP / Decrease & Conquer | Dijkstra (DP over the relaxation Bellman equation) plus contraction-hierarchies (transform-and-conquer: precompute shortcuts) gives sub-millisecond cross-continent routes. |
| Recommender systems | DP / matrix factorization | Cast 'predict user-item rating' as low-rank matrix completion; solve via alternating least squares — each subproblem (one user fixed) is a small linear system. |
| Distributed gradient descent | Divide & Conquer | All-reduce splits an N-GPU gradient sum into log N pairwise reductions of half the data each — same recursive shape as merge sort over a ring. |
| 3D rendering (BVH / scene graphs) | Divide & Conquer | Bounding Volume Hierarchies recursively partition space → ray-scene intersection drops from Θ(#triangles) to Θ(log #triangles). |
| Genome assembly | Transform & Conquer | Read overlaps reduce to an Eulerian path on a de Bruijn graph; problem-reduction lets us reuse 80 years of graph theory. |
| Black-Scholes option pricing (Monte Carlo) | Brute Force + parallelism | Generate 10⁸ candidate price paths, average the payoffs. Embarrassingly parallel — exhaustive search becomes feasible because the candidates are independent. |
Notice how often transform-and-conquer is the hidden engine: red-black trees in the Linux kernel, prefix sums on GPUs, de Bruijn graphs in bioinformatics. The work is in choosing a representation in which the problem you care about is trivial.
Recognizing Which Strategy Fits
Given a fresh problem, run this checklist. The first “yes” usually wins.
- Can I split the input into ~ independent pieces of size ? If the pieces don't need to talk to each other, you have divide and conquer. (Sorting halves, FFT halves, BVH octants.)
- Can I solve a slightly smaller instance and finish in or ? Then it's decrease and conquer. (Insertion sort, GCD, binary search.)
- Is there a representation in which my problem is easy? Transform and conquer. Sorting first, building a hash table, mapping to a graph, switching from time domain to frequency domain.
- Do my recursive subproblems repeat? Then memoize and you have dynamic programming. The diagnostic: draw two levels of the recursion tree and look for nodes labeled with the same subproblem.
- Is the candidate space small (or am I just sanity-checking my fast algorithm)? Brute force is fine and is often the right baseline.
- Is the input itself a tree, graph, or recursively-defined object? Then the natural decomposition follows the structure of the input. (Tree algorithms, JSON/AST traversals, file-system walks.)
Heuristic
Quick Check
You're asked to compute the number of inversions (pairs i < j with A[i] > A[j]) in an array of length 10⁶. Which decomposition strategy gives the best asymptotic running time?
Correctness and Efficiency Intuition
Correctness via structural induction
A recursive algorithm is correct iff: (1) it returns the right answer on every base case, and (2) assuming the recursive calls return the right answer on their (strictly smaller) inputs, the combine step produces the right answer on the current input. This is the induction hypothesis in computational form. For kadane the base case is the singleton subarray with answer ; the inductive step is the recurrence — if we believe is correct, the case analysis (start fresh vs. extend) is exhaustive, so is correct.
Termination is part of correctness
Efficiency = depth × work-per-level
Once you have a decomposition tree, asymptotic cost falls out by accounting:
- Depth — how many levels deep does the tree go? Halving gives ; decreasing by 1 gives .
- Branching — how many children per node? Constant children with shrinkage factor means level has nodes of size .
- Per-node work — the non-recursive cost of
decompose+combineat a node of size . - Sum it up: total work = . For balanced trees this collapses to a clean closed form via the Master Theorem.
Two algorithms with the same big-O often differ in which term dominates — e.g., merge sort vs. heapsort are both , but heapsort's constant is larger because heap operations have worse cache locality. The shape of the decomposition predicts not just asymptotic order but also memory access patterns, parallelism, and even numerical stability (Strassen is famously less numerically stable than the naive algorithm despite being asymptotically faster).