The Pointer Idea
A pointer is a value whose meaning is “the location of something else in memory.” Concretely, on a 64-bit machine, a pointer is an 8-byte unsigned integer that names a byte address in your process's virtual address space. Conceptually—and this is the picture you should carry into every diagram in this chapter—a pointer is an arrow from one object to another.
Languages disagree about how loud they are about pointers. In C, you write int* p and you can print the address with %p; the address arithmetic is yours to do. In Java and Python, every variable that names a non-primitive object is implicitly a pointer—assigning b = a copies the arrow, not the object—but the language forbids you from inspecting or arithmetically manipulating the address.
Why this matters: The data structures in this chapter (and most of the rest of the book) are built by chaining objects together with pointers. Every lookup, insert, delete, splice, and reversal is ultimately a few rewrites of pointer fields. If you can fluently picture an arrow being unhooked from one box and rehooked onto another, the rest of linked structures collapses into careful bookkeeping.
What Is a Node?
A node is the smallest unit of a linked structure: a small struct that bundles a value with one or more pointer fields aimed at other nodes. The simplest node has exactly one pointer:
That second component, next, is the entire reason linked lists exist. With it, a node knows about the node that comes after it, and we can chain nodes into arbitrarily long sequences without needing them to live next to each other in memory.
Variants you'll meet in later sections of this chapter:
- Doubly linked node: — two pointers, walk both directions.
- Tree node: — two pointers, branch instead of chain (Chapter 9).
- Graph node: — many pointers, no shape constraint (Chapter 16).
Addresses vs References
Two flavors of the same idea cause endless confusion. It pays to name the difference precisely.
| Concept | C / C++ | Java / Python / JS |
|---|---|---|
| What a variable holds | Either a value or an explicit pointer (int vs int*) | Primitives hold values; everything else holds an implicit reference |
| Take the address | &x produces a pointer to x | Not exposed; the language hides it |
| Follow the pointer | *p or p->field | Implicit on every field access (a.next) |
| Pointer arithmetic | p + 1 advances by sizeof(*p) | Forbidden |
| Null sentinel | NULL or nullptr | None (Python), null (Java/JS) |
Reference is a pointer in disguise
Stack vs Heap Allocation
Nodes live on the heap. Local variables, function arguments, and return addresses live on the stack. The distinction governs lifetime and cost.
| Property | Stack | Heap |
|---|---|---|
| Allocation cost | ~1 ns (move stack pointer) | ~50–100 ns (allocator bookkeeping) |
| Deallocation | Automatic at function return | Manual (C) or via GC (Java/Python) |
| Lifetime | Tied to enclosing function frame | Independent — survives any function |
| Layout | Contiguous, predictable | Scattered wherever the allocator finds room |
| Typical use | Locals, arguments, return addresses | Long-lived objects, dynamic structures |
A linked-list node must live on the heap because it has to outlive whatever function created it. That's why a Python Node() call or a C malloc(sizeof(Node)) is irreducible: there's a real cost, in the tens of nanoseconds, every time you grow the list.
Memory Layout: Array vs Linked List
Picture an array of three ints and a linked list of three ints. The bytes look completely different.
1Array (contiguous):
2
3 0x1000: [ 7 ][ 3 ][ 9 ] one cache line, three values, no pointers
4 ^^^ ^^^ ^^^
5 4B 4B 4B = 12 bytes total payload
6
7Linked list (scattered):
8
9 0x1000: [ 7 | 0x2080 ] 16 bytes (4 value + 4 pad + 8 ptr)
10 0x2080: [ 3 | 0x40C0 ] 16 bytes
11 0x40C0: [ 9 | NULL ] 16 bytes
12
13Total = 48 bytes for the same 3 ints, plus pointer-chase across 3 cache lines.Two consequences fall out of this picture:
- Cache friendliness. Walking an array is one cache line per ~16 ints (typical 64-byte line). Walking a linked list is one cache miss per node in the worst case, because nodes can sit anywhere on the heap. On modern CPUs a cache miss is ~100× slower than an L1 hit—this is the dominant performance gap between arrays and lists.
- Random access. Arrays support in by computing . Linked lists have no such formula: to reach the -th node you must hop times from the head. That's .
So why ever pay for a linked list? Two operations that arrays make slow:
- Insert/delete in the middle given a pointer to the spot: for a list (rewrite two pointers); for an array (shift every element after the gap).
- Splice/concatenate two lists in by re-pointing one tail. With arrays you must allocate-and-copy.
Anatomy of a Node in Python
Python is the easiest place to start because the runtime hides every detail that doesn't matter for the algorithm. Here is a full minimal node, three nodes built by hand, and a complete traversal.
After all three constructor calls, the object graph in CPython's heap looks exactly like the diagram in the playground below: three independent PyObjects, each carrying an integer reference and a next reference; the names head, n2, n3 are local variables that go out of scope when main returns—but the nodes themselves stay alive as long as head reaches them.
Anatomy of a Node in C
C drops the disguise. Every malloc and every dereference is explicit. The same algorithm becomes longer but more honest.
head->next after free(head) is undefined behavior. It might seem to work on your laptop, then segfault on the production server because the allocator reused the freed bytes. This is the archetypal use-after-free bug.Pointer Manipulation Idioms
Almost every linked-list algorithm in the rest of this chapter is built from four primitive moves. Memorize their shapes; you'll see them constantly.
1. The Hop
1p = p.next # advance one nodeCost: one pointer read, one assignment. . The atom of every traversal.
2. Insert After p
1new_node = Node(v)
2new_node.next = p.next # 1) new_node aims at the rest of the list
3p.next = new_node # 2) p hops over to new_nodeOrder matters. If you do step 2 first, you overwrite p.next before you've saved it—the tail of the list becomes unreachable, and the garbage collector will gleefully free everything from p.next onward. Always wire the new node's outgoing pointer before redirecting any existing pointer at it.
3. Delete the Node After p
1p.next = p.next.next # bypass one node; GC reclaims the orphanIn Python the orphan node becomes garbage automatically. In C you must save the doomed node first and call free on it.
4. The Two-Pointer Swap
1# Reverse the link between a and b
2a.next, b.next = b.next, a # tuple-assignment evaluates the RHS firstPython's tuple assignment makes this a one-liner; in C you'd use a temporary. The trick of evaluating the right-hand side fully before any binding is what makes the line safe.
Null Terminators and Sentinel Heads
Every linked structure needs a way to say “there's nothing here.” The standard answer is a designated null pointer:
- Python:
None - Java / JavaScript:
null - C:
NULL(a macro for((void*)0)) - C++ (modern):
nullptr
Treating the empty list as a null head and the end-of-list as a null next is conventional, but it has a cost: every algorithm sprouts special cases “if the list is empty” and “if I'm at the head.” A common cleanup trick is the sentinel head (also called a dummy or header node):
1# Without sentinel: 3 cases (empty, at-head, in-middle)
2# With sentinel: 1 case
3sentinel = Node(value=None) # dummy node, never holds real data
4sentinel.next = first_real_node # the "real" head is sentinel.next
5
6# Now insert and delete look identical everywhere:
7def delete_after(p):
8 if p.next is not None:
9 p.next = p.next.next # works at the head, in the middle, anywhereA sentinel costs one node (16 bytes) but eliminates the “is this the head?” check from every operation. Production-grade linked lists in the Linux kernel (struct list_head) and the C++ STL (std::list) all use sentinels for this reason.
The True Memory Cost
Account for every byte. For a node holding a 4-byte int on a 64-bit system:
Plus per-allocation overhead from the heap allocator—typically 8 to 32 bytes of bookkeeping per malloc chunk. So the practical cost of a linked-list node holding one int is more like bytes. An array slot is 4. The list pays a 6×–12× memory premium for the privilege of local edits.
For a Python list of ints, the picture is worse: each Node is a full PyObject (~56 bytes object header + dict pointer + two attribute references), each attribute is itself a boxed integer (~28 bytes). One node easily costs ~120 bytes. Use __slots__ to chop the per-object overhead in half:
1class Node:
2 __slots__ = ("value", "next") # ~56 -> ~32 bytes per node
3 def __init__(self, value, next=None):
4 self.value = value
5 self.next = nextNode class for millions of records, the right answer is almost always “use a NumPy array, an array.array, or a Pandas column.” Linked lists exist for their algorithmic shape, not for storing bulk numeric data.Interactive: Pointer Playground
The playground below is a live linked list. Each rectangle is one node: the left half holds the value, the right half is the next pointer (drawn as a dot with an arrow). The made-up addresses on top emphasize that nodes do not sit consecutively in memory—each one is wherever the heap allocator put it.
Try this sequence to feel the asymmetry:
- Click Insert at head three times. Watch how each insert is one rewrite — green node, head pointer flips.
- Click Insert at tail. Notice the operation is the same shape, but conceptually we had to walk to the end first — that walk is the cost.
- Drag the Delete index slider to the middle and click delete. The node turns red and disappears; the predecessor's next pointer now skips it.
Each rectangle is one heap-allocated node: the left half stores the value, the right half stores the next pointer (drawn as a dot). Addresses on top are made-up but illustrate that nodes can live anywhere on the heap. Notice how insert at head is O(1) butdelete at index k costs O(k) because we must walk there first.
Where Linked Lists Hide in Real Systems
Linked lists are sometimes accused of being “a textbook structure with no real use.” That is wrong—they're hiding in plain sight all over real systems software. The pattern repeats: whenever an algorithm needs insertion/removal at known points and indexed access is unimportant, a linked list shows up.
| System | Where the linked list lives | Why |
|---|---|---|
| Linux kernel | struct list_head — intrusive doubly linked lists everywhere | Tasks, file descriptors, inodes; need O(1) splice/unlink under locks |
| Memory allocators | Free lists chaining released chunks of the same size class | Allocate / free in O(1) by popping/pushing the head |
| Garbage collectors | Mark queue and reference graph traversal | Workload is pure traversal; no random access ever needed |
| Hash tables (chaining) | Each bucket is a linked list of key–value pairs | Collisions are rare; simple O(1) prepend keeps insert fast |
| LRU caches | Doubly linked list ordered by recency, plus a hash map | Move-to-front and evict-tail are both O(1) given a node pointer |
| Filesystems | Inode block lists, directory entries | Variable-size, append-friendly, no need for random index |
| Browsers / editors | History, undo/redo stacks of unbounded length | Constant-time push at the head; no indexed access |
| Graph algorithms | Adjacency lists (Chapter 16) | Each vertex stores a list of neighbors; degrees vary widely |
| Compilers | SSA def–use chains, basic-block predecessor/successor lists | Edges are added/removed during optimization passes |
Pitfalls and How to Avoid Them
- Memory leaks (C/C++). Every
mallocneeds a matchingfree. Use a teardown function that walks the list and frees in head-to-tail order, savingnextbefore each free. - Use-after-free / dangling pointers. After
free(p), the bits ofpstill look like a valid address. Setp = NULLimmediately if you might touch it again. - Accidental cycles. Writing
tail.next = headturns the list into a ring, and any unsuspecting traversal becomes an infinite loop. Section 6 (Floyd's tortoise and hare) shows the O(n) detection algorithm. - Iterator invalidation. If you delete the node a cursor is currently pointing at, the cursor is dangling on the next hop. Standard fix: keep a previous pointer too, so deletion is “
prev.next = cur.next; cur = prev.next.” - Identity vs equality (Python).
a is basks “same object?”a == basks “same value?” For the null check you wantp is None; for “is this the same node I started from?” you also wantis, not==. - Losing the head. Overwriting
headin the middle of an algorithm makes every node before your cursor unreachable—in Python they get garbage collected; in C they leak. Saveheadat the top of any function that might rebind it.
Quick Checks
Quick Check
Starting from `head -> [7|.] -> [3|.] -> [9|None]`, you run: p = head.next p.next = Node(5, p.next) What does the list look like afterward?
Quick Check
In C, you want to delete the head of a non-empty list AND avoid a memory leak. Which sequence is correct?
With the pointer mental model in place, every linked-list algorithm in the rest of this chapter reduces to careful arrow choreography. Section 6.2 builds a full singly linked list class on top of these primitives; section 6.3 adds the back-pointer for the doubly linked variant; sections 6.4–6.7 turn the primitive moves into named algorithms (reverse, merge, detect-cycle, partition).