Learning Objectives
By the end of this section, you will be able to:
- Understand amortized analysis and how it differs from worst-case and average-case analysis
- Apply the aggregate method to calculate average cost over a sequence of operations
- Master the accounting method by assigning charges that pay for expensive operations
- Use the potential method to analyze data structures using a potential function
- Prove that dynamic array push is O(1) amortized using all three methods
- Recognize when amortized analysis applies in real-world data structures
Why This Matters: Amortized analysis is essential for understanding the true performance of data structures like dynamic arrays (std::vector, ArrayList), hash tables, and self-balancing trees. Without it, you would incorrectly conclude that dynamic arrays have O(n) push operations—a claim that contradicts everyday programming experience.
The Problem: Why Worst-Case Lies
Consider the humble dynamic array—the ArrayList in Java, std::vector in C++, or list in Python. You've used it countless times:
```python
arr = []
for i in range(1000000):
    arr.append(i)  # What's the cost of this?
```
If you do a naive worst-case analysis, you might reason:
- When the array is full, append must copy all n elements to a new, larger array
- Therefore, the worst-case cost of append is O(n)
- So n appends would cost O(n²)!
But this analysis is misleading. In practice, those million appends complete nearly instantly. The worst-case analysis gives a true but useless bound—it tells us each operation might be expensive, but ignores that expensive operations are rare.
What is Amortized Analysis?
Amortized analysis determines the average performance of an operation in a sequence, guaranteeing the average cost over that sequence even in the worst case.
Key Characteristics
| Analysis Type | What It Measures | Guarantee |
|---|---|---|
| Worst-Case | Single operation's maximum cost | Every operation costs at most this |
| Average-Case | Expected cost with probabilistic assumptions | Average over random inputs |
| Amortized | Average cost over any sequence | Total cost / number of operations |
The Key Insight
Unlike average-case analysis, amortized analysis does not involve probability. It provides a guaranteed average per operation over any sequence of operations. The idea is:
If we can prove that n operations cost at most c·n in total (for some constant c), then the amortized cost per operation is c = O(1).
Intuition: Amortized analysis "spreads out" the cost of expensive operations over many cheap ones. It's like a subscription service—you pay a steady monthly fee rather than paying per-use with occasional huge bills.
Running Example: Dynamic Arrays
Throughout this section, we'll analyze a dynamic array with a doubling strategy:
The Data Structure
- Start with an array of capacity 1
- When pushing to a full array, allocate a new array of twice the capacity
- Copy all existing elements to the new array, then insert the new element
Cost Model
- Inserting an element into an available slot costs 1
- Copying an element during resize costs 1
- So resizing n elements and inserting costs n + 1
Cost Sequence Example
Let's trace the cost of pushing elements 1, 2, 3, ..., 8:
| Push # | Size Before | Capacity | Resize? | Cost | Cumulative |
|---|---|---|---|---|---|
| 1 | 0 | 1 | No | 1 | 1 |
| 2 | 1 | 1 | Yes → 2 | 1+1=2 | 3 |
| 3 | 2 | 2 | Yes → 4 | 2+1=3 | 6 |
| 4 | 3 | 4 | No | 1 | 7 |
| 5 | 4 | 4 | Yes → 8 | 4+1=5 | 12 |
| 6 | 5 | 8 | No | 1 | 13 |
| 7 | 6 | 8 | No | 1 | 14 |
| 8 | 7 | 8 | No | 1 | 15 |
After 8 pushes, total cost = 15. Amortized cost = 15/8 = 1.875 per push!
Pattern to Notice
Resizes occur only when pushing into a full array, at pushes 2, 3, 5, 9, 17, ... (push number 2^k + 1). Each resize doubles the capacity, so the copy costs 1, 2, 4, 8, ... form a geometric series that never sums to more than twice the number of pushes.
Key Insight
Although individual push operations can cost up to O(n) when resizing occurs, the amortized cost per operation is O(1). Expensive resize operations are rare enough that their cost "spreads out" over all operations.
Quick Check
After 16 push operations on a dynamic array that doubles, what is the total cost?
The Aggregate Method
The aggregate method is the most straightforward approach: calculate the total cost of n operations, then divide by n.
Approach
- Compute an upper bound on the total cost of n operations
- Divide by n to get the amortized cost per operation
- All operations get the same amortized cost
Analysis for Dynamic Array
For n push operations on a dynamic array that doubles:
Step 1: Count the Total Cost
Each push costs 1 for insertion. Additionally, resize operations cost:
- Push 2: copy 1 element
- Push 3: copy 2 elements
- Push 5: copy 4 elements
- Push 9: copy 8 elements
- In general: copy 2^k elements at push number 2^k + 1
Total copy cost for n pushes:
1 + 2 + 4 + ... + 2^⌊log₂(n-1)⌋ < 2n
Step 2: Sum Total Costs
Total cost ≤ n (insertions) + 2n (copies) = 3n
Step 3: Compute Amortized Cost
Amortized cost = 3n / n = 3 = O(1)
Result
Every push is O(1) amortized; in fact, the amortized cost is at most 3.
The Accounting Method (Banker's Method)
The accounting method assigns different "charges" to different operations. Some operations are overcharged; the extra credit pays for future expensive operations.
The Key Rule
The total amortized cost must always be at least the total actual cost. We can never "go into debt":
Σᵢ ĉᵢ ≥ Σᵢ cᵢ
where ĉᵢ is the amortized cost and cᵢ is the actual cost of the i-th operation.
Analysis for Dynamic Array
We assign an amortized cost of 3 to each push operation:
- 1 credit pays for the actual insertion
- 2 credits are stored with the element for future copying
Why This Works
When a resize happens from capacity c to 2c:
- We need to copy c elements (cost = c)
- The last c/2 elements were added since the previous resize
- Each of those c/2 elements has 2 stored credits
- Total stored: (c/2) × 2 = c credits, exactly enough!
Visualization
Think of each element carrying 2 "coins":
- 1 coin to pay for copying itself in the next resize
- 1 coin to "sponsor" an older element that has already spent its coins
How It Works
1. We charge 3 credits for each push operation (this is our amortized cost)
2. The actual insert costs 1 credit
3. We store 2 credits with each element for future copying
4. When a resize happens, stored credits pay for copying elements
Key Insight
By charging 3 credits per push and storing extras, we always have enough "savings" to pay for expensive resize operations when they occur. The accounting method proves the amortized cost is O(1).
Quick Check
In the accounting method for dynamic arrays, why do we charge 3 credits instead of 2?
The Potential Method (Physicist's Method)
The potential method is the most powerful and general technique. It defines a potential function that maps the data structure state to a non-negative real number, representing "stored energy."
The Framework
Define a potential function Φ that maps each state Dᵢ of the data structure to a real number, such that:
- Φ(D₀) = 0 (the initial state has zero potential)
- Φ(Dᵢ) ≥ 0 for all states (potential is never negative)
The amortized cost of the i-th operation is:
ĉᵢ = cᵢ + Φ(Dᵢ) - Φ(Dᵢ₋₁)
That is: amortized cost = actual cost + change in potential.
Why This Works
Summing over n operations (the potential terms telescope):
Σᵢ₌₁ⁿ ĉᵢ = Σᵢ₌₁ⁿ cᵢ + Φ(Dₙ) - Φ(D₀)
Since Φ(Dₙ) ≥ 0 and Φ(D₀) = 0:
Σᵢ₌₁ⁿ ĉᵢ ≥ Σᵢ₌₁ⁿ cᵢ
The sum of amortized costs is an upper bound on the actual total cost!
Analysis for Dynamic Array
Define the potential function:
Φ(D) = 2n - c
where n = number of elements, c = capacity.
Case 1: No Resize (n < c before push)
- Actual cost: 1
- Potential change: ΔΦ = (2(n+1) - c) - (2n - c) = 2
- Amortized: ĉ = 1 + 2 = 3
Case 2: Resize Needed (n = c before push)
- Actual cost: n + 1 (copy n elements + insert)
- New state: n+1 elements, capacity 2c = 2n
- Potential change: ΔΦ = (2(n+1) - 2n) - (2n - n) = 2 - n
- Amortized: ĉ = (n + 1) + (2 - n) = 3
Beautiful Result
In both cases the amortized cost is exactly 3 = O(1), no matter when resizes occur.
Potential Function
Φ(D) = 2n - c
where n = size, c = capacity
Amortized cost: ĉᵢ = cᵢ + Φ(Dᵢ) - Φ(Dᵢ₋₁)
Key Insight
When a resize occurs, the potential drops significantly (ΔΦ is very negative), which cancels out the high actual cost. This keeps the amortized cost constant at 3 for every push operation!
Quick Check
Why does the potential drop during a resize?
Comparing the Three Methods
| Method | Approach | Best For | Difficulty |
|---|---|---|---|
| Aggregate | Calculate total cost, divide by n | Simple data structures, same amortized cost for all ops | Easy |
| Accounting | Assign charges, maintain credit balance | When different operations should have different costs | Medium |
| Potential | Define potential function on states | Complex data structures, proving tight bounds | Hard (but powerful) |
When to Use Each
- Aggregate: Start here. If you can sum up total cost easily, this is the simplest approach.
- Accounting: Use when you need to "prepay" for future operations. Good for intuitive explanations.
- Potential: Use for complex analyses, especially when operations interact in complicated ways. The potential function often reveals deep structure.
Implementation in Code
Let's implement a dynamic array from scratch, tracking costs to verify our analysis.
C Implementation
Real-World Applications
Amortized analysis is crucial for understanding the performance of many common data structures:
1. Hash Tables
Hash tables resize when the load factor exceeds a threshold. As with dynamic arrays, this gives O(1) amortized insert/lookup despite occasional O(n) rehashing.
2. Splay Trees
Splay trees have O(log n) amortized operations, even though individual operations can be O(n). The potential method is essential for analyzing them.
3. Union-Find (Disjoint Set)
With path compression and union by rank, Union-Find achieves O(α(n)) amortized time per operation, where α is the inverse Ackermann function (effectively constant).
4. Fibonacci Heaps
Fibonacci heaps have O(1) amortized insert, find-min, and decrease-key operations, enabling efficient implementations of Dijkstra's and Prim's algorithms.
| Data Structure | Worst-Case | Amortized | Key Insight |
|---|---|---|---|
| Dynamic Array (push) | O(n) | O(1) | Doubling spreads resize cost |
| Hash Table (insert) | O(n) | O(1) | Rehashing is infrequent |
| Splay Tree (search) | O(n) | O(log n) | Recently accessed items move to root |
| Union-Find | O(log n) | O(α(n)) | Path compression flattens tree |
| Fibonacci Heap (decrease-key) | O(n) | O(1) | Cascading cuts are prepaid by earlier marks |
Test Your Understanding
What is the key insight behind amortized analysis?
Summary
Amortized analysis reveals the true cost of data structure operations by considering sequences rather than individual worst cases.
Key Concepts
| Concept | Key Insight | Formula/Idea |
|---|---|---|
| Amortized Cost | Average over sequence, not probability | Total cost / number of ops |
| Aggregate Method | Sum all costs, divide by n | Simple and direct |
| Accounting Method | Charge extra, save for expensive ops | Credits ≥ actual costs always |
| Potential Method | Define energy function on states | ĉᵢ = cᵢ + Φ(Dᵢ) - Φ(Dᵢ₋₁) |
| Dynamic Array Push | O(1) amortized despite O(n) worst case | Φ = 2n - c |
Key Takeaways
- Worst-case per-operation can be misleading when expensive operations are rare
- Amortized analysis provides guaranteed average cost without probabilistic assumptions
- Three methods: Aggregate (easiest), Accounting (intuitive), Potential (most powerful)
- Dynamic array push is O(1) amortized, which is why vectors/ArrayLists are so efficient
- The potential method can prove tight bounds for complex data structures
Exercises
Conceptual Questions
- Tripling Strategy: Suppose a dynamic array triples its capacity instead of doubling. Prove that push is still O(1) amortized. What is the constant factor compared to doubling?
- Incrementing by 1: If a dynamic array increases capacity by 1 (instead of doubling), what is the amortized cost of push? Prove your answer.
- Multipop Stack: Consider a stack with push (cost 1) and multipop(k) that removes k items (cost min(k, size)). Using the potential function Φ = number of items on the stack, prove that both operations are O(1) amortized.
- Binary Counter: A binary counter starts at 0. The INCREMENT operation flips bits from the rightmost until finding a 0, then flips it to 1. Prove INCREMENT is O(1) amortized using the potential Φ = number of 1 bits in the counter.
Coding Exercises
- Implement and Verify: Implement a dynamic array that tracks costs per operation. Run 1 million pushes and verify that the amortized cost approaches 3.
- Shrinking Array: Extend the dynamic array to shrink (halve capacity) when size drops below 1/4 capacity. Analyze the amortized cost of pop operations.
- Visualize Potential: Create a visualization that shows the potential function value after each operation for both push and pop on a dynamic array.
Challenge Problems
- Two-Stack Queue: A queue can be implemented using two stacks. Analyze the amortized cost of enqueue and dequeue operations.
- Splay Tree Analysis: Research and explain the Access Lemma used in splay tree amortized analysis. What potential function is used?
- Hash Table Load Factor: If a hash table resizes when the load factor exceeds 0.75 and doubles capacity, prove that insert is O(1) amortized. What changes if we resize at 0.5?
Solution Hints
- Tripling: Copy costs form the geometric series 1 + 3 + 9 + ... = (3^k - 1)/2. The amortized copy cost is at most about 1.5 per push (roughly 2.5 total including the insertion), versus 2 per push (3 total) for doubling.
- Increment by 1: This gives O(n) amortized cost! Total cost is 1 + 2 + 3 + ... + n = n(n+1)/2 = O(n^2) for n pushes.
- Multipop: Push: actual cost 1, ΔΦ = +1, amortized = 2. Multipop(k): actual cost k, ΔΦ = -k, amortized = 0.
- Binary Counter: If we flip t bits to 0 and one bit to 1, actual cost = t+1, ΔΦ = 1-t, amortized = 2.
In the next section, we'll explore the Master Theorem—a powerful tool for solving recurrence relations that arise from divide-and-conquer algorithms.