Why Trees? The Shape of Hierarchy
Up to this chapter we have worked with structures that are essentially linear: arrays, linked lists, stacks, queues. Each element has at most one predecessor and one successor. Linearity is powerful, but the world is not flat. The folders on your computer, the replies under a forum post, the pages reachable from a homepage, the executives reporting to a CEO, the syntactic structure of a sentence, the state of a chess game several moves ahead — none of these are lines. They branch.
A tree is the data structure that captures branching hierarchy. One designated element — the root — sits at the top, every other element has exactly one immediate ancestor, and there are no cycles. That single restriction (one parent per element, no cycles) is what makes trees enormously useful: it lets us traverse, search, summarize, and modify hierarchies in time proportional to when the tree is balanced, and never worse than in the worst case.
Intuition. If a linked list is a chain, a tree is a family tree. Every person has exactly one biological mother in the diagram, but a mother can have many children. Pick any person; trace upward and you eventually reach the original ancestor. There are no loops — you cannot be your own great-grandparent.
Before any formal definition, here are the everyday hierarchies that trees model in production software. Notice that the "branching factor" differs wildly — sometimes 2, sometimes thousands — and the depth differs too:
| Real-world hierarchy | Root | Internal nodes | Leaves |
|---|---|---|---|
| Unix file system | / | Directories | Regular files |
| DOM of a web page | <html> | Elements (<div>, <ul>, ...) | Text nodes |
| JSON document | Top-level object | Nested objects/arrays | Strings, numbers, true/false/null |
| Decision tree (ML) | Root question | Internal questions | Class label or value |
| Game tree (chess, Go) | Current position | Future positions | Terminal positions |
| Compiler AST | Program | Expressions, statements | Identifiers, literals |
| Org chart | CEO | Directors, managers | Individual contributors |
| Phylogenetic tree | Last common ancestor | Speciation events | Living species |
Understanding tree terminology is therefore not just bookkeeping. It is the vocabulary you will use every time you ship a feature that touches a file system, a UI tree, a configuration document, a routing table, a version-control history, a database index, or a machine-learning model of class hierarchies. The rest of this chapter, and most of the next five chapters, will be unintelligible without it.
A Formal Definition
A tree is a finite set of nodes such that:
- Either (the tree is empty),
- Or there is a designated node , called the root, and the remaining nodes are partitioned into disjoint subsets , each of which is itself a tree, called a subtree of .
The recursive flavor is the point. The definition says a tree is either empty, or a root attached to a list of trees. That single sentence is what makes nearly every tree algorithm in this book a few lines of recursion: a function that handles the empty case and the "root plus subtrees" case is, by induction, a function that handles every tree.
An equivalent graph-theoretic definition is sometimes more useful: a tree is an undirected graph that is connected and acyclic. In a connected graph there is a path between every pair of vertices; in an acyclic graph there is no closed loop. A tree of nodes therefore has exactly edges — one fewer than the number of nodes — because every node except the root contributes exactly one edge to its parent.
Rooted vs. free trees
Core Terminology, Visualized
The visualizer below shows a single tree of nodes. Click any node to see all of the relationships defined in this section colored simultaneously. Each definition will refer back to the same picture, so you can read with one hand on the slider.
With that picture in mind, here is the terminology, in the order you will use it most often:
- Root. The unique node with no parent. In our example the root is . A non-empty tree has exactly one root.
- Parent and child. If is immediately above in the tree, then is the parent of , and is a child of . Every non-root node has exactly one parent. is a parent of and .
- Sibling. Two nodes are siblings if they share the same parent. and are siblings; so are .
- Leaf (or external node). A node with no children. In our example the leaves are . Leaves carry the "data" in many trees: file contents in a file system, text nodes in the DOM, predicted classes in a decision tree.
- Internal node. Any non-leaf node. In our example are internal. Internal nodes carry structure (the directories, the elements, the questions); leaves carry content.
- Edge. The connection between a parent and one of its children. A tree with nodes has exactly edges — we will prove this in a moment.
- Path. A sequence of nodes in which each consecutive pair is connected by an edge. In a tree there is exactly one path between any two nodes — another consequence of being connected and acyclic.
- Ancestor and descendant. If the unique path from upward to the root passes through , then is an ancestor of , and is a descendant of . Every node is its own ancestor and its own descendant by convention (these are the "improper" cases). Click in the visualizer: its ancestors are .
- Subtree. The subtree rooted at is together with all of its descendants. The subtree at consists of . Subtrees are the recursive handle on a tree: every algorithm of the form "do something to a tree" is, almost always, "do something to the root, then recurse on each subtree."
Improper vs. proper
Depth, Height, Level, Degree
Five numbers are attached to every tree and every node. They look similar at first glance, but they measure different things, and mixing them up is the most common bug in undergraduate tree code.
| Quantity | Defined for | Definition | Example in our tree |
|---|---|---|---|
| depth(v) | a node v | Number of edges on the path from the root to v. | depth(K) = 3 |
| height(v) | a node v | Number of edges on the longest downward path from v to a leaf. | height(C) = 2, height(leaf) = 0 |
| height(T) | the whole tree | Equal to height(root). Equivalently: max depth over all nodes. | height(T) = 3 |
| level(v) | a node v | Equal to depth(v) + 1 in the 1-indexed convention, or just depth(v) in the 0-indexed convention. | Root is at level 1 (or 0) |
| degree(v) | a node v | Number of children of v. | degree(D) = 3, degree(leaf) = 0 |
| degree(T) | the whole tree | Maximum degree across all nodes. | degree(T) = 3 in our example |
Three short, important identities follow directly from the table:
- Root has depth 0. The path from the root to itself has zero edges. So .
- Leaf has height 0. The longest downward path from a leaf is the empty path, which has zero edges. So .
- Tree height is bounded by node count. , with equality when the tree is a single chain (a "stick").
Off-by-one trap
Mathematical Properties of Trees
Three properties are worth proving once, in full, because every later chapter quietly assumes them. They are simple, but they are the reason balanced search trees, heaps, segment trees, and tries can promise operations.
1. A tree on n nodes has exactly n − 1 edges.
Every node except the root contributes exactly one edge to its parent. The root contributes none. So the total number of edges equals the number of non-root nodes, which is . As a corollary, if you ever count edges in something you are calling a tree, you have either miscounted or it is not a tree (it has a cycle, or a node with two parents).
2. The unique-path property.
Between any two nodes there is exactly one path. If there were two distinct paths, joining them would form a cycle, contradicting acyclicity; if there were none, the tree would not be connected. This is what allows ancestor/descendant relationships to be well-defined: there is no ambiguity about "the" path from a node to the root.
3. Height is between log₂ n and n − 1.
For a binary tree (every internal node has at most 2 children), the number of nodes at depth is at most . Summing over depths gives at most nodes total, so , which rearranges to . On the other extreme, a tree shaped like a chain has . So:
The whole story of balanced binary trees (AVL, red-black, B-trees) is the story of forcing the height to stay near the lower bound. We will explore that in Chapter 13. For now, slide below to feel the spread:
Notice how at the balanced height is while the worst-case stick height is . That ten-times-shorter path is the difference between an instant search and a sluggish one, and it is why the rest of this book cares so much about keeping trees balanced.
Kinds of Trees You Will Meet
The same word "tree" describes many shapes with very different properties. Knowing the names lets you communicate precisely with other engineers and pick the right algorithm.
| Name | Defining property | Where you see it |
|---|---|---|
| General tree | Each node may have any non-negative number of children. | File systems, DOM, org charts. |
| k-ary tree | Each node has at most k children. (k = 2 is the most common.) | Quad-trees (k=4), oct-trees (k=8) in 3D graphics. |
| Binary tree | Each node has at most 2 children, conventionally labelled left and right. | BSTs, heaps, expression trees, Huffman trees. |
| Full binary tree | Every internal node has exactly 2 children. | Huffman coding, decision tournaments. |
| Complete binary tree | All levels filled except possibly the last, which is filled left-to-right. | Binary heaps stored as arrays (Chapter 14). |
| Perfect binary tree | All internal nodes have 2 children AND all leaves are at the same depth. | Idealized analyses; n = 2^(h+1) − 1. |
| Balanced tree | Height is O(log n). Specific definitions vary (AVL, red-black, weight-balanced). | Maps and sets in std::map, java.util.TreeMap, etc. |
| Binary search tree (BST) | In-order traversal yields a sorted sequence; left < parent < right. | Indexes, ordered associative containers. |
| Heap | Parent dominates children under a fixed order (min-heap or max-heap). | Priority queues, Dijkstra's algorithm. |
| Trie | Edges labelled with alphabet symbols; paths spell out keys. | Autocomplete, IP routing tables. |
| B-tree | Each node holds many keys and many children; designed for disk pages. | Database indexes, file system directory entries. |
We will spend chapters on each of these. For this section the only thing to internalize is that they are all trees — the terminology in this section applies to every one of them — but each one piles extra invariants on top to make a particular operation cheap.
A Tree Node in C
Now let us encode a tree in actual code. We start with C because in C the memory layout of a tree is visible to the eye: a node is a struct with payload data and pointers to other structs. The pointer is the edge.
For a general tree (any number of children), the cleanest layout is the first-child / next-sibling representation: each node stores one pointer to its leftmost child and one pointer to the next sibling on its right. This handles arbitrary fan-out in two pointers per node, no dynamic arrays required.
Why first-child / next-sibling?
left and right pointer per node — because the structure of binary trees benefits from naming the two children. We will use that layout starting in Section 10.2.The Same Tree in Python
Python hides pointers behind the garbage collector and lets us write the same structure as a small class. The semantics are identical (a node holds references to its children) but the lifetime is managed for us, and the algorithms are easier to read because the syntax is closer to the math.
The two implementations are the same algorithm
Computing Height and Depth from a Node
Two of the numbers we defined — depth and height — are computed by tiny recursive walks. They are the "hello world" of tree algorithms, and you should be able to write them without thinking.
Pattern. Almost every tree algorithm in this book follows the shape ofheight: a base case for the empty or leaf subtree, and a recursive case that combines the answers from each child. Internalize this pattern now and the rest of the chapter will feel like variations on a theme.
Edge Cases: Empty Trees, Single Nodes, Forests
Three boundary cases come up constantly in interviews and in real code, and getting them right is what separates a robust tree library from a leaky one.
- The empty tree. The recursive definition allows . In code this means the root pointer can be NULL (in C) or None (in Python). Every recursive algorithm must handle this case explicitly — usually with
if node is None: return ...on the first line. The height of the empty tree is conventionally , so that comes out right. - The single-node tree. One node, zero edges. It is simultaneously the root, a leaf, the only ancestor of itself, and the only descendant of itself. depth = 0, height = 0, degree = 0. Test every traversal you write against this case before trusting it.
- The forest. A collection of disjoint trees. Forests arise when you split a tree into parts (e.g. remove the root) or when an algorithm naturally produces several disconnected hierarchies (e.g. Kruskal's minimum-spanning-tree algorithm maintains a forest of growing components, Chapter 21). A forest can be turned into a single tree by adding a fake super-root with the forest's trees as its children — a useful trick when an algorithm requires a single root.
Off-by-one on the empty tree
Where Trees Live in the Real World
It is worth pausing at the end of a foundational section to look outward. The terminology you just learned is the same terminology used by the engineers who built the systems you use every day.
| System | Tree it uses | What the structure buys |
|---|---|---|
| Linux ext4 / NTFS / APFS | Directory tree + B-tree indexes | O(log n) file lookup; cheap rename = pointer swap. |
| Git | Commit DAG + per-commit Merkle tree of blobs | Content-addressed storage; tamper-evident history. |
| Browsers (DOM) | Hierarchical element tree | CSS selectors, event bubbling, layout all walk the tree. |
| React / Vue / Angular | Virtual-DOM tree + fiber tree | Diffing two trees produces minimal real-DOM updates. |
| PostgreSQL / MySQL / SQLite | B-tree (or B+-tree) indexes | O(log n) range queries on disk. |
| MongoDB / Elasticsearch | B-tree / LSM-tree indexes | Same idea, tuned for write-heavy workloads. |
| IP routing tables | Trie (radix tree) on IP prefixes | Longest-prefix match in time linear in prefix length. |
| Compilers (gcc, clang, rustc) | AST + dominator tree + IR call tree | Optimization passes are tree rewrites. |
| Databases of facts (Wikidata) | Class hierarchy / taxonomy tree | Inheritance of properties between concepts. |
| Game AI (chess engines) | Game tree (minimax + alpha-beta) | Pruning relies on the tree's structure to skip subtrees. |
| Decision-tree models (XGBoost) | Decision tree | Each leaf is a prediction; depth controls capacity. |
| Phylogenetics | Species tree, gene tree | Common-ancestor queries answer evolution questions. |
Read this table back to yourself with the words you learned in this section. The Git commit DAG is "almost a tree" with merge commits introducing extra parents; B-tree indexes are balanced trees with very high degree (typically hundreds), giving an extremely small height; the DOM is a general tree in which leaves are usually text nodes; an AST is a tree whose internal-node degree depends on the syntactic category. The same vocabulary, applied everywhere.
Quick Check
Quick Check
In the visualizer's tree, what is height(C)?
Quick Check
Which statement is FALSE for every tree?
Quick Check
You delete the root of a non-empty tree. What is left?
What Comes Next
You now have the vocabulary (root, parent, child, sibling, leaf, ancestor, descendant, subtree), the numbers (depth, height, level, degree), the mathematical properties ( edges, unique path, height bounds), the C and Python encodings, and the recursive pattern that every tree algorithm follows. Section 10.2 will widen the lens and survey the four standard representations of trees in memory — first-child / next-sibling, array-of-children, parent-pointer-only, and array-indexed (the heap-style flat encoding) — with a clear rubric for which to use when.
Sections 10.3 and 10.4 will then introduce the four classical traversals (preorder, inorder, postorder, level-order), each of which is a one-page recursive program built on the same node structures we defined here. By the end of the chapter you will have written, by hand, every operation you need to implement a basic file viewer, a simple expression evaluator, or a generic tree visitor.