Why an Operations Catalogue Matters
We have built singly, doubly, and circular linked lists in the previous four sections. Now we step back and ask the question every working engineer eventually asks: given a problem, will a linked list run fast enough? The answer comes in two parts, and a serious treatment must include both.
- Asymptotic cost — the Big-O behavior as . This is what tells you that delete-head is constant time and access by index is linear, regardless of language or machine.
- Real wall-clock cost — what the CPU actually does. Linked lists pay a hidden price (cache misses, allocator pressure, branch mispredictions) that is invisible to Big-O but very visible on a profiler. Sometimes a linear-time array scan beats a constant-time linked-list operation in practice, and you must know when.
Senior's rule of thumb: Big-O tells you how the algorithm scales. The constant factor tells you whether your users will notice. Linked lists have excellent Big-O for some operations and terrible constants for almost everything. Both facts matter.
The Master Complexity Table
Here is the canonical reference. Memorize the singly-linked column; the rest follow from symmetry. All times are worst case unless noted; is the list length, is an index.
| Operation | Singly (no tail) | Singly (+ tail) | Doubly (+ tail) | Doubly Circular | Dynamic Array |
|---|---|---|---|---|---|
| access by index t[k] | O(k) | O(k) | O(min(k, n−k)) | O(min(k, n−k)) | O(1) |
| find by value | O(n) | O(n) | O(n) | O(n) | O(n) |
| prepend | O(1) | O(1) | O(1) | O(1) | O(n) |
| append | O(n) | O(1) | O(1) | O(1) | O(1) amort. |
| insert after known node | O(1) | O(1) | O(1) | O(1) | O(n) |
| insert before known node | O(n) | O(n) | O(1) | O(1) | O(n) |
| insert at index k | O(k) | O(k) | O(min(k, n−k)) | O(min(k, n−k)) | O(n−k) |
| delete head | O(1) | O(1) | O(1) | O(1) | O(n) |
| delete tail | O(n) | O(n) | O(1) | O(1) | O(1) |
| delete given node ptr | O(n) | O(n) | O(1) | O(1) | O(n) |
| delete by value | O(n) | O(n) | O(n) | O(n) | O(n) |
| reverse in place | O(n) | O(n) | O(n) | O(n) | O(n) |
| length | O(n) / O(1) cached | O(1) cached | O(1) cached | O(1) cached | O(1) |
| concat (splice) two lists | O(n) | O(1) | O(1) | O(1) | O(m) |
Three patterns to internalise
- Tail pointer turns append from O(n) into O(1). Without it, a loop of appends becomes total — a classic interview-killer.
- Doubly linked turns delete-given-pointer from O(n) into O(1). Singly linked must walk to find the predecessor.
- Splicing two lists is O(1) with both tail pointers — array concatenation is . This is the single biggest argument for linked lists in a real codebase.
Asymptotic Cost vs Real Performance
Big-O hides a multiplicative constant that, on modern hardware, can be the difference between a snappy app and a sluggish one. The dominant cost in a linked-list traversal is not the comparison or the pointer dereference; it is the cache miss.
The cache-miss penalty
| Memory tier | Typical latency | Per-element cost |
|---|---|---|
| L1 cache | ≈ 1 ns | Sequential array scan hits this on every element |
| L2 cache | ≈ 4 ns | Larger arrays, still cache-friendly |
| L3 cache | ≈ 15 ns | Working sets up to a few MB |
| Main memory (RAM) | ≈ 100 ns | Cold linked-list traversal pays this per node |
Linked-list nodes are typically allocated independently on the heap, so consecutive nodes can land anywhere in physical memory. Walking the list is a sequence of essentially-random memory accesses — the CPU's prefetcher cannot predict what comes next, so each node = node.next can stall the pipeline waiting on RAM.
A concrete benchmark
On an x86-64 laptop circa 2024, summing an array of 1,000,000 32-bit ints takes about 1 ms (the prefetcher streams cache lines perfectly). Summing an equivalent std::list<int> of the same length takes 10–50 ms. The asymptotic complexity is identical — for both — but the wall-clock difference is 10× to 50×.
Big-O lies (a little)
Anatomy of Each Operation
1. Access by index — O(n), unavoidable
There is no shortcut. We start at head and follow next exactly times. Doubly linked lists with a tail pointer can choose the closer end: cost becomes .
2. Prepend — O(1), the linked list's superpower
Three writes: allocate node, node.next = head, head = node. Compare with a dynamic array's prepend, which must shift every existing element right.
3. Append — O(1) with tail, O(n) without
Without a tail pointer we must walk to the end. This is the most common production bug: a beginner builds a list with appends and accidentally pays . With a tail pointer, append is symmetric to prepend.
4. Insert after a known node — O(1)
If you already hold a pointer to the predecessor, you rewire two next-fields and you are done. This is the operation that makes intrusive linked lists popular in OS kernels: handlers receive a node pointer directly, skip the find phase, and unlink in constant time.
5. Insert before a known node — O(n) singly, O(1) doubly
Singly: you have node but not node.prev, so you walk from head until you find the predecessor. Doubly: node.prev is already there.
6. Delete head / delete tail
- Delete head:
head = head.next. Always . - Delete tail (singly): even with a tail pointer we must find the second-to-last node by walking from
head. . - Delete tail (doubly):
tail = tail.prev; tail.next = null. .
7. Delete given a node pointer — the canonical doubly-linked win
Doubly linked: node.prev.next = node.next; node.next.prev = node.prev. . Singly linked: must walk to find the predecessor. . This single operation is why every production LRU cache uses a doubly linked list.
8. Reverse — O(n), constant extra space
Walk the list flipping each next pointer in place. Three local variables (prev, curr, nxt) and one pass. Far more memory-efficient than building a reversed copy.
9. Length — O(n) by default, O(1) if cached
A bare linked list does not know its own length; counting requires a full traversal. Production implementations cache size as a field and update it on insert/delete — queries, no extra asymptotic cost.
10. Concatenate — O(1) with both tails
a.tail.next = b.head; a.tail = b.tail. Two writes, regardless of how long the lists are. This is the single operation where linked lists genuinely beat every array variant: a billion-element list can be appended to another in constant time.
Interactive: Operation Cost Visualizer
Pick an operation, choose a list size, and step through the work. The cyan bar shows pointer hops in the linked list; the green bar shows the equivalent step count in a dynamic array. Watch how the two curves cross as the operation changes — that crossing is the only reason to ever pick a linked list.
Operation Cost Visualizer — pointer hops vs. array equivalent
n = 8 · visited 0/5Walk from head until we reach index n/2. No shortcut exists — pointers, not arithmetic.
Notice three things while you play:
- access middle visits nodes; arrays jump in one step.
- append (no tail) visits every node; append (with tail) visits one. Same operation, different data structure.
- delete head visits one node for the linked list, for the array. The linked list is asymptotically better, even though the array beats it on most other ops.
Worked Example I: Merge Two Sorted Lists
This is the canonical linked-list interview problem and the recursive step inside merge sort. Given two sorted singly linked lists, produce a single sorted list by rewiring pointers — no new nodes, no copies. The famous trick is the dummy head: a placeholder node whose only purpose is to absorb the "is this the first element?" special case so the loop body has zero branches.
Why the dummy head matters
Complexity. Each iteration moves one of a, b forward by one and does work, so the total cost is time and extra space (we allocate one dummy node, period). This is the engine that powers merge sort on linked lists: time, recursion stack, and no need for random access — making it the sort of choice for linked structures.
Worked Example II: k-th Node From the End
"Find the -th node from the end" is asked at every coding interview because it forces the candidate to discover the two-pointer (lead/follow) idiom. The naive solution walks the list twice (once to compute length, once to walk steps); the elegant solution does it in one pass.
Complexity. One pass, two pointer hops per node visited. time, extra space — and crucially one traversal, not two. On a list whose nodes live in cold RAM, halving the number of passes can halve real wall-clock time even though Big-O is unchanged.
Where this pattern reappears
- Find the middle — slow advances 1, fast advances 2; when fast falls off, slow is at the midpoint.
- Detect a cycle — Floyd's tortoise and hare; if there's a cycle, fast catches slow.
- Find cycle start — once they meet, reset one pointer to head and walk both at speed 1.
Same Algorithms in TypeScript
Here is merge_sorted in TypeScript with class-based nodes. The shape is identical — the dummy-head trick is language-agnostic — but the type system makes the nullable next-pointer explicit, which catches a whole class of bugs at compile time.
1class ListNode {
2 constructor(public val: number, public next: ListNode | null = null) {}
3}
4
5function mergeSorted(
6 a: ListNode | null,
7 b: ListNode | null,
8): ListNode | null {
9 const dummy = new ListNode(0);
10 let tail = dummy;
11
12 while (a !== null && b !== null) {
13 if (a.val <= b.val) {
14 tail.next = a;
15 a = a.next;
16 } else {
17 tail.next = b;
18 b = b.next;
19 }
20 tail = tail.next!; // safe: we just assigned it
21 }
22
23 tail.next = a !== null ? a : b;
24 return dummy.next;
25}
26
27function kthFromEnd(head: ListNode | null, k: number): ListNode | null {
28 if (k <= 0) return null;
29
30 let lead: ListNode | null = head;
31 for (let i = 0; i < k; i++) {
32 if (lead === null) return null;
33 lead = lead.next;
34 }
35
36 let follow: ListNode | null = head;
37 while (lead !== null) {
38 lead = lead.next;
39 follow = follow!.next;
40 }
41 return follow;
42}Two language-level observations:
- The
!non-null assertion ontail.next!is a deliberate signal to the reader: "I have just written a non-null value here, the type checker can't prove it, but I can." In performance-critical code, the alternative (a runtime null check) costs a branch per loop iteration. - A C or C++ port would look almost identical, but you would also write
delete/freecalls when nodes are removed from a list — Python and TypeScript hand that off to the GC.
The Linked-List Algorithm Bestiary
Beyond the basic CRUD operations, a small zoo of classic algorithms operate on linked lists. Memorising these templates pays off in interviews, in production code review, and (most importantly) in spotting when a problem is secretly "just" one of these.
| Algorithm | Time | Space | Key idea |
|---|---|---|---|
| Detect cycle (Floyd's tortoise & hare) | O(n) | O(1) | Slow +1, fast +2; if they meet, there's a cycle. |
| Find cycle entry | O(n) | O(1) | After meeting, reset one to head; walk both at speed 1; meet at entry. |
| Find middle | O(n) | O(1) | Slow +1, fast +2; when fast falls off, slow is at the midpoint. |
| k-th from end | O(n) | O(1) | Two pointers with a fixed k-step lead. |
| Merge two sorted lists | O(n + m) | O(1) | Dummy head + writer pointer; rewire links in place. |
| Merge sort on linked list | O(n log n) | O(log n) stack | Split at middle, recurse, merge — no random access needed. |
| Reverse | O(n) | O(1) | prev/curr/nxt; flip each next pointer in one sweep. |
| Reverse k-group | O(n) | O(1) | Reverse k nodes at a time, link sub-lists. |
| Reorder L0 → Ln → L1 → Ln-1 … | O(n) | O(1) | Find middle, reverse second half, interleave. |
| Palindrome check | O(n) | O(1) | Find middle, reverse second half, compare. |
Why merge sort, not quick sort, for linked lists
Where These Operations Run in Production
| System | Linked-list role | Why it works there |
|---|---|---|
| LRU cache (Redis, browser caches, OS page cache) | Doubly linked list + hashmap; move-to-front on access | O(1) move-to-front and O(1) eviction-at-tail; hashmap gives O(1) lookup. |
| Linux kernel (list.h) | Intrusive doubly-linked lists everywhere — task lists, lock waiters, free pages | O(1) splice/unlink given a pointer; no allocator round-trip. |
| Java collections — java.util.LinkedList | Doubly linked Deque | Rarely the right choice today; ArrayDeque beats it on cache locality. |
| Python — collections.deque | NOT a linked list — block-of-arrays doubly linked | Best of both worlds: O(1) ends, cache-friendly middle. |
| Git reflog / commit chain | Doubly linked head/tail history | O(1) append on each commit; walk-back for git log. |
| Browser tab navigation history | Doubly linked back/forward stack | O(1) back, O(1) forward, splice when navigating from a non-tip page. |
| Database row-level locks (PostgreSQL ProcArray) | Linked waiter queues per lock | O(1) enqueue/dequeue; intrusive — the waiting transaction holds the link. |
| Compiler IRs (LLVM ilist) | Doubly linked instruction lists per basic block | Optimisation passes splice and reorder instructions in O(1). |
| Garbage collector free lists | Singly linked list of free memory blocks | O(1) push/pop on allocation/free; per-size-class free lists. |
| Message queues (Linux msgrcv, in-kernel) | FIFO linked queue | O(1) enqueue at tail, O(1) dequeue at head; lock-free variants exist. |
The pattern across all ten: linked lists win when (a) you already hold pointers to nodes (intrusive use), (b) the dominant operations are splice/unlink, and (c) random access is unnecessary. They lose any time you would have walked the structure linearly anyway.
Pattern Recognition: Reach for What?
| You need… | Best choice | Why not a linked list |
|---|---|---|
| Random access by index | dynamic array | O(1) vs O(n); cache-friendly |
| Sorted with binary search | dynamic array | binary search needs O(1) indexing |
| Frequent push/pop at both ends | deque (block-array) | O(1) ends + cache locality |
| O(1) splice of large sub-lists | doubly linked list | this is THE case where lists win |
| LRU cache | doubly linked list + hashmap | the textbook combination |
| Priority queue | binary heap (array-backed) | list scan is O(n) |
| Sorted dynamic set | balanced BST or skip list | O(log n) search |
| Hot-path traversal of millions of items | array / vector | cache-line streaming |
| Intrusive ownership in C/kernel code | doubly linked list | no allocator overhead, O(1) unlink |
| FIFO queue, simple | deque or ring buffer | rarely worth a linked list |
The senior engineer's default
Pitfalls and Performance Traps
1. The accidental O(n²) append loop
1# WRONG: append walks to the tail every time
2for v in values: # values has length n
3 head = append_no_tail(head, v) # O(n) per call → O(n^2) total
4
5# RIGHT: keep a tail pointer, OR build in reverse and reverse once
6tail = None
7for v in values:
8 node = Node(v)
9 if head is None:
10 head = tail = node
11 else:
12 tail.next = node
13 tail = node2. Reversing without saving next first
1# WRONG: we clobber curr.next BEFORE saving it
2prev = None
3curr = head
4while curr is not None:
5 curr.next = prev # destroys the link to the rest of the list!
6 prev = curr
7 curr = curr.next # now points at prev — infinite loop
8
9# RIGHT: save nxt, then rewire
10prev = None
11curr = head
12while curr is not None:
13 nxt = curr.next # save BEFORE we overwrite
14 curr.next = prev
15 prev = curr
16 curr = nxt3. Forgetting to update tail on delete-tail
After deleting the last node, tail is dangling. Production bug: the next append writes through the old (freed) tail. Always update tail in lockstep with the structural change.
4. Concatenation without both tail pointers
a.tail.next = b.head is only if you have a.tail. Without it, you walk a first — back to . The whole point of a tail pointer is to make this case constant time; do not lose it.
5. Caching length but updating it inconsistently
Cached size is one of the most common sources of correctness bugs. Every mutation method must update the count. A single forgotten increment in insert_after and your iterator runs short by one. Either cache it religiously or compute it on demand — never half-and-half.
Memory: the silent killer
Node{value, next} with an int payload is ~24 bytes of useful data plus ~16 bytes of allocator overhead — a 40-byte node for 4 bytes of payload (10× overhead). The same data in an int[] is 4 bytes per element. For million-element collections, the linked list uses 10× the RAM and runs 10× slower. Choose accordingly.Quick Checks
Quick Check
A junior engineer writes a function that appends n items to a singly linked list with no tail pointer. What is the total time complexity, and why?
Quick Check
You have two sorted singly linked lists of length 100 each. What is the time complexity of merging them with the dummy-head algorithm?
Quick Check
In the LRU cache pattern (doubly linked list + hashmap), what makes 'move recently-used item to front' an O(1) operation?
With the operations catalogued and their real costs understood, we are ready to study the algorithms that make linked lists genuinely indispensable. The next section dives into the two-pointer family: Floyd's cycle detection, finding the middle, finding the cycle entry, and the deeper invariant-based proofs that show why these tricks always work.