The Operation Zoo
Every array data structure exposes the same handful of primitive operations — read by index, write by index, insert, delete, search, traverse, slice. Their asymptotic costs are deceptively simple to write down. Their real costs on a real machine, fighting cache lines and branch predictors, can differ by two orders of magnitude from what big-O suggests. A serious engineer owns both views.
This section walks every canonical operation, derives its tight bound, and then anchors the bound with hardware-level intuition. We end with the patterns that working engineers actually use to pick the right operation for the right job.
| Operation | Worst-case | Why |
|---|---|---|
| Random access A[i] | O(1) | address arithmetic |
| Update A[i] = v | O(1) | single store |
| Linear search (unsorted) | O(n) | scan all |
| Binary search (sorted) | O(log n) | halve the range |
| Insert at index k | O(n − k) | shift right |
| Delete at index k | O(n − k) | shift left |
| Append (dynamic array) | O(1) amortized / O(n) worst | occasional resize |
| Prepend | O(n) | shift everything |
| Concatenate (n + m) | O(n + m) | copy both |
| Reverse in place | O(n) | n / 2 swaps |
| Rotate by k | O(n) | three reverses |
| Min / max / sum / mean | O(n) | must touch every element |
| Equality compare two arrays | O(n) | element-wise |
| Slice A[i:j] (Python list) | O(j − i) | copy into new list |
| Slice (NumPy basic) | O(1) | view, no copy |
| Membership (unsorted) | O(n) | linear scan |
| Membership (sorted) | O(log n) | binary search |
| Sort (comparison-based) | O(n log n) avg | covered in Ch 26 |
Read this table left-to-right, then right-to-left
Random Access and Update: The O(1) Miracle
Random access — reading for any valid in constant time — is the defining superpower of arrays. The reason is pure arithmetic:
where is the start of the backing buffer and is the stride (size of one element in bytes). One multiply, one add, one load — three machine instructions, zero traversal. No matter how big the array is, the address calculation is the same width. That is where the comes from. Update is symmetric: same address calculation, plus a single store.
What "O(1)" really means here
Searching: Linear and Binary
A linear search examines elements one at a time. Best case the target is the first element (), worst case it is not there at all (), expected case for a uniformly random target position is probes — still because constants disappear in asymptotic notation.
1def linear_search(arr, target):
2 for i, x in enumerate(arr):
3 if x == target:
4 return i # found
5 return -1 # not foundIf the array is sorted, binary search lets us discard half the remaining range with each comparison:, which solves to. We covered the invariant carefully in Chapter 3 §3 (Recurrences). The key trade is up front: you paid to sort once, and you now reap on every query thereafter. For queries the total is — far better than as soon as.
Insertion and Deletion: Where the Cost Hides
The expensive operations on an array are the ones that change the array's shape. Insertion at index requires shifting every element from to one slot to the right to make room. That is writes. Worst case (insert at the front) gives; best case (insert at the end) is .
Deletion is the mirror: elements slide left to close the hole, plus one drop at the tail.
Direction of the shift loop matters. Insertion walks right-to-left. Deletion walks left-to-right. Reverse the direction and you overwrite values before reading them — one of the most common bugs in hand-rolled array code.
Amortized Append and the Resize Drama
Dynamic arrays advertise amortized append, but each individual append can cost when the underlying buffer is full and must be reallocated and copied to a larger one. The standard analysis (covered in detail in Ch 7) uses doubling: when the buffer fills, allocate slots and copy.
The bookkeeping argument: across appends the total work for resizes is. So appends cost total ⇒ per append is averaged out, even though one unfortunate append paid for the copy that day.
Realtime systems beware
Bulk Operations: Concat, Reverse, Rotate, Compare
Operations that must touch every element are inevitably:
- Concatenate arrays of size and :. Allocate one buffer of size, copy both. Repeated concatenation inside a loop becomes — the classic Python "build a string with +=" performance trap. Use instead.
- Reverse in place: swaps via two pointers walking inward. , no extra space.
- Rotate left by k: the three-reverses trick achieves time and space — reverse, reverse, reverse the whole thing. We implement it in the code panel above.
- Equality compare: in the worst case (arrays differ only at the last index) you must read all values — .
- Min / max / sum / mean / variance: a single linear pass — — and trivially parallelizable.
Slicing: Copy or View?
The semantics of change drastically between languages, and this single decision dominates real-world performance.
| Language / Type | Slice cost | Slice semantics |
|---|---|---|
| Python list | O(j − i) | Eager copy. Independent storage. |
| NumPy basic slice (no fancy index) | O(1) | View. Shares memory; mutating the slice mutates the parent. |
| NumPy fancy slice (boolean / index array) | O(j − i) | Copy. |
| Go slice | O(1) | View over a backing array; capacity tracked. |
| Rust &[T] slice reference | O(1) | Borrow; no allocation. |
| C++ std::span (C++20) | O(1) | Non-owning view. |
The same operation in C++ for comparison
Interactive: Array Operation Animator
Pick an operation, drag the index slider and value slider, and click Run step. Amber cells indicate values that changed. Blue cells indicate cells that participated in a shift (read once, written once). The cells touched counter is the literal number of read / write events the operation just performed — the constant factor big-O hides. Compare an "Insert at front" against a "Swap-pop" on the same array and watch cumulative touches diverge.
Array Operation Animator
Amber cells changed value. Blue cells participated in a shift. The "cells touched" counter is the real, observed cost of the operation you just ran — what big-O hides inside its constant factor.
Quick Check
An array of size n receives k front-insertions, one after another. What is the total work performed?
Quick Check
You have arr = [10, 20, 30, 40, 50]. You run swap-and-pop deletion at index 1, then at index 1 again, then at index 0. What is the final array?
Constants Matter: Cache Locality vs Big-O
Two operations that are both can differ in real wall-clock cost by 50× or more. The reason is the memory hierarchy:
| Operation | Big-O | Typical real time | Why |
|---|---|---|---|
| Array index A[i] (cache hot) | O(1) | ≈ 1 ns | L1 cache, 3–4 cycle load |
| Array index A[i] (cache cold) | O(1) | ≈ 80 ns | DRAM round trip |
| Hash map lookup (Python dict) | O(1) avg | ≈ 60 ns | hash + probe + pointer chase |
| Linked list traversal step | O(1) | ≈ 50 ns | pointer chase, no prefetch |
| Sequential array sweep (N=1M) | O(n) | ≈ 0.3–1 ms | fully prefetcher-friendly |
| Random array access (N=1M) | O(n) | ≈ 50–80 ms | cache thrashes on every access |
The bottom two rows are the same algorithm asymptotically, on the same size of input, with a 50–100× gap. Big-O is unbothered. Your latency budget is not.
Real-World Patterns
- Game engines — entity arrays. Append-heavy, delete with swap-pop. Order does not matter; cache-friendly iteration is everything.
- Audio processing — ring buffers. A pure dynamic array would prepend on every input sample (); a ring buffer makes it by reusing slots.
- Trading systems — sorted order books. Sorted array with binary search for price lookup, occasional inserts paid for once on every quote. At HFT scale these arrays are even replaced with cache-line-aligned, branchless variants.
- Time-series databases (Druid, InfluxDB, ClickHouse).Append-only column arrays. Never delete or insert in the middle; all writes are tail-appends. Reads are sequential range scans. This pattern is the only reason these systems can ingest millions of points per second.
- Compiler symbol tables. Small arrays of structs with linear search beat hash maps for — cache locality wins over asymptotics at small sizes.
- GPU vertex / index buffers. Append-only on the CPU side, then frozen and shipped to the GPU. No mid-array edits ever.
- DOM child lists in browsers. When you call 10000 times in a tight loop, you pay the array shift cost plus a layout reflow per call. Hence the pattern: build off-tree, attach once.
- Genomics — sequence arrays. A human genome is bytes as an array of nucleotides. Random access by index is the only way tools like BWA / minimap2 are tractable. Insertions are forbidden; mutations are recorded as a separate diff structure.
- UI virtualization. A list of 100,000 items cannot render every row. Materialize only the visible window into a small array, recycle DOM nodes — turn an render into.
- Database B-tree pages. Each page is a small sorted array of keys. Insert pays shifts — tolerable because page size is bounded (typically 4–16 KB ⇒ a few hundred keys).
Decision Rules: Which Operation, Which Structure?
Memorize these. They cover ~90% of array-related design decisions.
| Access pattern | Use | Why |
|---|---|---|
| Random access + appends only | Dynamic array (Python list, Go slice, std::vector) | O(1) read, amortized O(1) append |
| Random access + occasional middle inserts (small n) | Dynamic array | Constants beat asymptotics for n < ~10⁴ |
| Random access + heavy middle inserts (large n) | Balanced BST or skip list | O(log n) insert AND access |
| Inserts/deletes at both ends | Deque (Ch 8) | O(1) at both ends |
| Order doesn't matter, fast delete | Array + swap-pop | O(1) delete by index |
| Streaming append + bounded window | Ring buffer | O(1) push/pop, fixed memory |
| Sorted, query-heavy | Sorted array + binary search | O(log n) query, paid for at sort time |
| Sorted, mixed query/update | Balanced BST or B-tree | O(log n) for everything |
| Tail-append, range scan | Append-only column array | Fits SSD/columnar storage |
Heuristic: if the operation you run most often is on an array, use an array. If the operation you run most often is on an array and exceeds a few thousand, change structures — you are paying a tax that compounds.
The next two sections (Two Pointer and Sliding Window) are entirely about turning an apparent algorithm on an array into — using nothing but the operations we just catalogued.