We have spent six sections using arrays. Now we build one. By the end of this section you will have written, line by line, both a Python DynamicArray backed by raw ctypes memory and a C++ Vector<T> with proper RAII and move semantics. These are not toys: the same ideas, often the same code shape, power CPython's list, libstdc++ std::vector, Java ArrayList, Go slices, Rust Vec, and JavaScript Array. Once you understand this section you understand all of them.
The thesis of this section. A dynamic array is just a fixed buffer plus two integers —sizeandcapacity— managed by exactly two policies: geometric growth (double when full) and hysteretic shrink (halve when usage drops to a quarter). Get those two right and every operation is amortized O(1) or, where it can't be, optimal O(n − i).
The Contract: What a Dynamic Array Promises
Before we write a single line, we pin down the abstract data type — the public surface every reasonable dynamic array must expose, and the cost model the user can rely on:
| Operation | Worst case | Amortized | Notes |
|---|---|---|---|
| len(a) / a.size() | O(1) | O(1) | Read a single integer |
| a[i] (read or write) | O(1) | O(1) | Pointer + offset; bounds-checked |
| a.append(v) | O(n) | O(1) | Slow only when buffer is full |
| a.pop() | O(n) | O(1) | Slow only when shrinking the buffer |
| a.insert(i, v) | O(n) | O(n − i) | Shift A[i..n−1] right |
| a.delete(i) | O(n) | O(n − i) | Shift A[i+1..n−1] left |
| iterate | O(n) | O(n) | One pointer walk; O(1) per element |
Three private fields back this entire interface: a pointer to a contiguous buffer, a logical size , and an allocated capacity with the perpetual invariant . Live elements occupy slots ; slots are reserved memory waiting to be used.
Memory Model: Going Below Python's list
There is a chicken-and-egg problem: Python's built-in list is itself a dynamic array, so if we build “our” dynamic array on top of it, we're really just renaming list operations. We need a real fixed-size primitive. The right tool is ctypes:
1import ctypes, sys
2
3cap = 4
4A = (ctypes.py_object * cap)() # one allocation, 4 PyObject* slots
5print(sys.getsizeof(A)) # ~ 64 bytes on a 64-bit system
6A[0] = "hello"; A[1] = 42
7print(A[0], A[1]) # hello 42
8# A is FIXED-SIZE: there is no A.append, no A.extend.The expression (ctypes.py_object * 4)() calls into the C allocator and reserves exactly bytes (plus a small header) for four object pointers, contiguous in memory. Crucially, that buffer cannot grow itself. To grow, we allocate a bigger one and copy.
This is exactly how CPython lists work
Objects/listobject.c in the CPython source: a PyListObject has fieldsob_item (pointer to a PyObject** buffer), ob_size (n), andallocated (capacity). The growth function is list_resize. Same three fields, same algorithm, written in C instead of Python.Growth Strategy: Why Doubling Wins
When append finds the buffer full, it must allocate a new one of size and copy elements over. The choice of determines everything. Three strategies:
- Constant additive growth: for some constant . Every appends triggers a resize, costing each time. Total cost of appends is . Catastrophic.
- Geometric growth with factor : . Resizes happen at sizes ; total copy cost is . For : at most total copies. Linear total work, O(1) amortized.
- Doubling (): Simplest, cleanest, and what almost everyone uses. The newly-allocated unused space equals the previously used space, which makes the accounting argument fall out cleanly (next subsection).
std::vector, Java ArrayList) prefer because it improves the chance the next allocation can reuse the freed previous buffer (with , the new buffer is always larger than the sum of all previous ones, so it cannot fit). This is a real practical benefit on systems with simple bump allocators.Amortized Analysis: Two Proofs of O(1) append
Aggregate method
Consider appends starting from capacity 1 and doubling. The total cost is the cost of writing each element ( writes, one each) plus the cost of all resizes. Resizes happen exactly when the buffer is full, at sizes , copying that many elements each time:
Dividing by gives an amortized cost of at most 3 per append — independent of , hence .
Accounting (banker's) method
Charge each append a flat amortized cost of 3 dollars. Spend 1 dollar on the actual write, and store 2 dollars as “credit” on the new element. When the buffer fills at capacity , every one of the live elements carries at least 2 dollars of credit (the latest-doubled half each carry 2; the older half carry even more, since they survived the previous resize too). The resize costs copies, payable from the dollars in credit — with dollars to spare. The invariant holds for all time, so amortized cost ≤ 3.
append can still cost in the worst case (when it triggers the resize). Real-time systems that must guarantee per-operation latency therefore prefer fixed-capacity arrays or specialized structures like the “deque of fixed-size chunks”.Shrink Policy: The Hysteresis Trick
Symmetric reasoning suggests: when pop drops the size below half the capacity, halve the buffer to recover memory. This is wrong. Consider a sequence that alternates append and pop right at the boundary:
- Suppose (full, capacity 4).
append→ grows to capacity 8, copies 4, now .pop→ . With shrink-at-half: → halve to capacity 4, copies 4.append→ full again → grows to 8, copies 4.popshrinks to 4, copies 4.Repeat forever.
Each operation now costs . The amortized argument fails because we spend the credit immediately and have nothing left for the next resize.
Fix: shrink only when the size has dropped to a quarter of capacity, and when shrinking, halve to half of capacity. Now after a shrink we're at 50% utilization, and we need to either drop to 25% (requiring pops) or fill to 100% (requiring appends) before another resize. Every resize is matched by a long stretch of cheap operations — the credit fund stays solvent.
Shrink-at-1/2 is the most common amortized-analysis bug
at size ≤ cap/2 and watch the total copies grow without bound.Insert and Delete: The O(n - i) Shifts
Random-position insert(i, v) and delete(i) are not amortized O(1). They require physically moving every element to the right of position by one slot — elements in the worst case. Inserting at the front is ; inserting at the end is amortized.
Bounds Checks and the Iterator Protocol
Two more responsibilities round out the implementation:
- Bounds checks on every read/write/delete. Negative indices should be normalized () the way Python does. Out-of-range indices must raise
IndexError— never write past the buffer end. Skipping this is the most common source of memory-corruption bugs in C-level dynamic arrays. - Iterator protocol.
__iter__yieldsA[0..n-1]. A subtle real-world bug: if the user mutates the array during iteration (e.g.,appendinside aforloop), a resize relocates the buffer; if your iterator captured the old pointer, it is now dangling. Python's built-in list readsself._A[k]on each step and tolerates appends-during-iteration (you may or may not see the new elements); insert/delete during iteration is undefined behavior even there. We follow the same rule.
Full Implementation: Python with ctypes
Here is the complete DynamicArray. Each line is annotated below — read the code, then click any line on the left to see why it does what it does.
Drive it with the canonical example:
1da = DynamicArray() # n=0, cap=4
2for i in range(10):
3 da.append(i)
4 print(f"after append({i}): n={len(da)}, cap={da._capacity}")
5
6# Output:
7# after append(0): n=1, cap=4
8# after append(1): n=2, cap=4
9# after append(2): n=3, cap=4
10# after append(3): n=4, cap=4 # one slot left
11# after append(4): n=5, cap=8 # FULL → resize, then write
12# after append(5): n=6, cap=8
13# after append(6): n=7, cap=8
14# after append(7): n=8, cap=8
15# after append(8): n=9, cap=16 # FULL → resize again
16# after append(9): n=10, cap=16Capacity follows the geometric sequence , growing precisely at appends 5, 9, 17, 33, … Total copy cost up to appends: copies plus writes = 22 ≤ 3 · 10. The bound holds.
Full Implementation: C++ with RAII and Move Semantics
The Python version is short because Python hides many costs. C++ forces us to confront them: separate allocation from construction, manage destructor calls explicitly, decide between move and copy on resize. The result is a Vector<T> close enough to std::vector that the same optimizations apply.
Why placement-new and explicit ~T()?
std::vector reserves capacity without constructing those elements. If we used new T[cap] instead of ::operator new(cap * sizeof(T)), every reserved slot would be default-constructed up front — that's wrong: the user has not yet stored those values, and T may not even have a default constructor. Placement-new gives us the fine-grained control: construct only when push_back is called, destruct on pop_back, free the raw bytes only in the destructor.Interactive: Build Your Vector
Step through any of the four scenarios below. The amber flash on the buffer marks a real reallocation and copy. Try toggling growth between , , and ; you'll see fewer wasted slots on the smaller factors but more frequent (and slightly more expensive) copies. The fourth scenario is the hysteresis bug — set shrink to at size ≤ cap/2 and watch the total-copy counter explode as the size oscillates across the boundary.
Green = newest element. Blue = occupied. Dashed = unused capacity. Amber = the buffer was just reallocated. Try the “Bad oscillation” scenario with shrink set to at size ≤ cap/2 to see why that policy fails: total copies grow linearly per operation.
Real-World Implementations
Every general-purpose language ships a dynamic array. The differences are in the constants:
| Implementation | Growth factor | Initial cap | Shrink | Notes |
|---|---|---|---|---|
| CPython list | ~1.125× + 6 | 0 (lazy) | explicit only | Tuned for many small lists; over-allocation pattern in list_resize |
| libstdc++ std::vector | 2× | 0 (lazy) | never (shrink_to_fit explicit) | GCC's standard library |
| libc++ std::vector | 2× | 0 (lazy) | never (shrink_to_fit explicit) | Clang's standard library; some forks use 1.5× |
| Java ArrayList | 1.5× (oldCap + (oldCap >> 1)) | 10 | trimToSize() explicit | Capped at Integer.MAX_VALUE - 8 |
| Go slice (append) | 2× until 1024, then 1.25× | 0 | never (re-slice releases nothing) | Hybrid factor for memory friendliness on large slices |
| Rust Vec | 2× | 0 (lazy) | shrink_to_fit / shrink_to | Move semantics native; no exception safety needed |
| Swift Array | 2× | 0 (lazy) | never (auto-COW) | Copy-on-write — a copied Array shares storage until first mutation |
| V8 JS Array | implementation-defined (≈ 1.5–2×) | 0 | may switch to dictionary mode | Sparse arrays fall back to a hash table internally |
Why Python uses such a small growth factor
list_resize uses new_alloc = (newsize + (newsize >> 3) + 6) & ~3, which is roughly . The reasoning is that Python programs create huge numbers of small lists; doubling would waste lots of memory across millions of lists. The amortized analysis still gives O(1) — the constant is just larger.Edge Cases and Exception Safety
- Initial capacity 0 vs lazy alloc.
std::vectorand RustVecallocate nothing on construction; the firstpush_backbootstraps to capacity 1. Our Python version uses 4 for clearer demos. Both are valid. Never use 0 with a multiplicative growth rule unless you also handle the bootstrap. - Capacity overflow. When approaches
SIZE_MAX, doubling overflows. Real implementations check and either throwstd::length_erroror saturate. - Concurrency. None of these structures is thread-safe. Concurrent appends race on both the buffer pointer and the size field; concurrent resizes can free a buffer another thread is reading. Concurrent variants exist (Java
CopyOnWriteArrayList, lock-free rings) but have very different semantics and cost models. - Exception safety on resize. If
T's copy/move throws halfway through a grow, the container must remain in a usable state. The C++ idiom ofstd::move_if_noexceptis precisely the strong-guarantee pattern: move when safe, copy-with-rollback when not. - Move-only types.
Vector<std::unique_ptr<T>>requires move-construction in the resize loop — copies don't exist. Our implementation handles this because we always pass throughstd::move_if_noexcept.
Pitfalls and Common Bugs
- Forgetting to null the popped slot in Python. The
ctypesarray holds a strong reference; withoutself._A[self._n] = Nonethe popped object stays alive until the slot is overwritten. Subtle leak. - Off-by-one in the shift loop for insert/delete. The boundaries are
range(n, i, -1)for insert (n down to i+1, exclusive of i) andrange(i, n - 1)for delete (i up to n−2). Test with insert(0, v), insert(n, v), delete(0), delete(n−1) — the four corner cases catch most off-by-ones. - Iterator invalidation after resize. A C++
std::vector::iteratorcaptured beforepush_backcan be a dangling pointer afterward. Python iterators re-read the buffer pointer each step, so they are safer; but inserting/deleting during a Python loop still skips or repeats elements. - Shrink-at-1/2 oscillation. Already discussed. Use 1/4 trigger, halve on shrink.
- Forgetting
delete[]or::operator deletein C++. Manual memory management is the price of control. RAII (the destructor we wrote) is what makes it bearable. - Pointer aliasing during resize. If two slots in the source buffer point to the same object and you copy them naively, fine; but with non-trivial constructors, double-construction is a real concern. Our placement-new + destruct-old loop handles this correctly because it processes each source slot exactly once.
Quick Checks
Quick Check
A DynamicArray starts with initial_capacity=4 and uses doubling. After 17 consecutive appends, how many element copies have occurred from resizes (excluding the 17 writes themselves)?
Quick Check
Why does shrinking when size ≤ capacity/2 (instead of ≤ capacity/4) break the amortized O(1) guarantee?
With this we close Chapter 5. Every section that follows — linked lists, stacks, queues, hash tables — builds on or contrasts with the dynamic array. When designers ask “is a dynamic array enough for this problem?”, they are weighing exactly the costs we have just measured: amortized O(1) append, O(1) random access, O(n − i) middle-of-array mutation, and contiguous memory for cache locality. In Chapter 6 we trade that contiguity for O(1) front insertion and pay for it in cache misses. The next chapter is, in many ways, this chapter's shadow.