Chapter 3
18 min read
Section 12 of 261

Logarithms in Algorithms

Mathematical Foundations

Learning Objectives

By the end of this section, you will be able to:

  1. Understand what logarithms are and why they're fundamental to algorithm analysis
  2. Recognize logarithmic patterns in algorithms like binary search and divide-and-conquer
  3. Calculate and manipulate logarithms using key properties
  4. Explain why the logarithm base doesn't matter in Big-O notation
  5. Connect logarithms to real-world systems like databases, search engines, and tree structures
  6. Appreciate the dramatic efficiency that logarithmic algorithms provide at scale
Why This Matters: Logarithms are the mathematical foundation of efficient algorithms. They explain why binary search can find an item among a billion in just 30 steps, why balanced trees remain fast as they grow, and why divide-and-conquer algorithms are so powerful. Understanding logarithms transforms algorithm analysis from memorization to intuition.

The Story Behind Logarithms

Long before computers existed, mathematicians faced a practical problem: multiplication and division of large numbers were incredibly tedious and error-prone. In the early 17th century, Scottish mathematician John Napier invented a revolutionary tool—the logarithm.

The Original Purpose

Napier's insight was brilliant: multiplication can be converted to addition using logarithms. If you need to multiply two large numbers, say 3,456 × 7,891:

  1. Look up log(3,456) ≈ 3.539 and log(7,891) ≈ 3.897 in a table
  2. Add them: 3.539 + 3.897 = 7.436
  3. Look up the antilog: 107.436 ≈ 27,279,696

This technique powered centuries of scientific calculation and navigation before calculators. Slide rules—mechanical logarithm computers—helped send astronauts to the moon!

Logarithms in Computer Science

In computer science, logarithms gained a new and profound importance: they describe how many times you can divide a problem in half before reaching a single element. This is the essence of efficient algorithms.

Historical Connection

The term "logarithm" comes from Greek: logos (ratio) + arithmos (number). Napier chose this name because logarithms relate numbers through ratios—exactly what happens when we repeatedly halve a problem.

Definition: What is a Logarithm?

A logarithm answers a fundamental question: "What power do I need to raise a base to, in order to get a given number?"

Formal Definition

The logarithm base bb of xx, written logb(x)\log_b(x), is the exponent yy such that:

logb(x)=yby=x\log_b(x) = y \quad \Leftrightarrow \quad b^y = x

In other words, logb(x)\log_b(x) asks: "To what power must I raise bb to get xx?"

Examples with Different Bases

LogarithmQuestionAnswerVerification
log₂(8)2 to what power = 8?32³ = 8 ✓
log₂(1024)2 to what power = 1024?102¹⁰ = 1024 ✓
log₁₀(1000)10 to what power = 1000?310³ = 1000 ✓
log₂(1)2 to what power = 1?02⁰ = 1 ✓
log₃(81)3 to what power = 81?43⁴ = 81 ✓

Intuitive Understanding

Think of logarithms as counting multiplications:

  • log₂(n): How many times must I multiply 2 by itself to reach n?
  • Equivalently: How many times can I divide n by 2 before reaching 1?

For example, log2(1024)=10\log_2(1024) = 10 because:

📝text
11024 → 512 → 256 → 128 → 64 → 32 → 16 → 8 → 4 → 2 → 1
2  1     2     3     4     5    6    7   8   9  10 divisions

Memory Aid

Think of log₂(n) as "how many times can I halve n?" This directly connects to binary search and divide-and-conquer algorithms.

Quick Check

What is log₂(32)?


Interactive: Logarithm Explorer

Use this interactive tool to explore logarithms with different bases. Notice how all logarithm curves have the same fundamental shape—they differ only by a constant factor.

Loading logarithm explorer

Key Logarithm Properties

These properties are essential for analyzing algorithms and simplifying expressions. Each property corresponds to a rule about exponents.

The Three Fundamental Properties

PropertyFormulaIntuition
Product Rulelog(ab) = log(a) + log(b)Multiplying numbers = adding their exponents
Quotient Rulelog(a/b) = log(a) - log(b)Dividing numbers = subtracting their exponents
Power Rulelog(aⁿ) = n × log(a)Raising to a power = multiplying the exponent

Additional Important Properties

PropertyFormulaExample
log of 1log_b(1) = 0Any number to power 0 is 1
log of baselog_b(b) = 1log₂(2) = 1, log₁₀(10) = 1
Change of baselog_a(x) = log_b(x) / log_b(a)log₂(x) = ln(x) / ln(2)
Inverseb^(log_b(x)) = x2^(log₂(8)) = 8

Change of Base Formula

The change of base formula is particularly important in computer science:

loga(x)=logb(x)logb(a)=ln(x)ln(a)=log10(x)log10(a)\log_a(x) = \frac{\log_b(x)}{\log_b(a)} = \frac{\ln(x)}{\ln(a)} = \frac{\log_{10}(x)}{\log_{10}(a)}

This means any logarithm can be converted to any other base by dividing by a constant. This constant factor is why the base doesn't matter in Big-O notation.

Logarithm Properties in Code
🐍logarithm_properties.py
7Base-2 logarithm

log₂(1024) = 10 because 2¹⁰ = 1024. Python's math.log2() computes this directly.

EXAMPLE
math.log2(8) = 3.0
8Base-10 logarithm

log₁₀(1024) ≈ 3.01 because 10³ = 1000, close to 1024. This is the 'common logarithm'.

9Natural logarithm

ln(1024) ≈ 6.93, using base e ≈ 2.718. The natural log appears frequently in calculus and statistics.

12Change of base

We can compute log₂(x) using log₁₀: just divide log₁₀(x) by log₁₀(2). This works with any base.

EXAMPLE
log₁₀(1024) / log₁₀(2) = 3.01 / 0.301 = 10.0
16Constant ratio

The ratio log₂(n)/log₁₀(n) is always the same: log₂(10) ≈ 3.322. This constant factor is why base doesn't matter in Big-O.

12 lines without explanation
1# Computing logarithms in Python
2import math
3
4n = 1024
5
6# Different bases, same pattern
7log_base_2 = math.log2(n)        # 10.0
8log_base_10 = math.log10(n)      # 3.010...
9log_natural = math.log(n)        # 6.931...
10
11# Converting between bases
12log_2_from_10 = math.log10(n) / math.log10(2)  # 10.0
13log_2_from_e = math.log(n) / math.log(2)       # 10.0
14
15# Verify the relationship
16ratio = log_base_2 / log_base_10  # ≈ 3.322
17# This is always log₂(10) ≈ 3.322

Quick Check

Using the product rule, what is log₂(8 × 16)?


Why Logarithms Appear in Algorithms

Logarithms appear in algorithms for a fundamental reason: they describe repeated halving (or more generally, repeated division). This pattern emerges in three main scenarios:

1. Searching in Sorted Data

When searching a sorted collection, you can eliminate half the remaining elements with each comparison. Starting with nn elements:

nn2n4n81n \to \frac{n}{2} \to \frac{n}{4} \to \frac{n}{8} \to \cdots \to 1

The number of steps to reach 1 element is log2(n)\log_2(n).

2. Balanced Tree Heights

In a balanced binary tree with nn nodes, each level doubles the number of nodes. A tree of height hh can hold 2h12^h - 1 nodes. Inverting this:

h=log2(n+1)h = \lceil \log_2(n + 1) \rceil

3. Divide and Conquer Recursion

When a problem of size nn is divided into smaller subproblems (typically of size n/2n/2), the recursion depth is log(n)\log(n). This is why algorithms like merge sort and quicksort have O(nlogn)O(n \log n) complexity.

The Common Thread

All three scenarios share the same pattern: the problem size shrinks by a constant factor at each step. Whether dividing by 2, 3, or any constant, the number of divisions needed to reach size 1 is logarithmic.

Binary search is the canonical example of logarithmic complexity. It searches a sorted array by repeatedly dividing the search space in half.

The Algorithm

Binary Search Implementation
🐍binary_search.py
7Initialize search range

We start with the entire array: from index 0 to len(arr)-1. This range will shrink by half each iteration.

9Main loop condition

Continue while the search range is valid (left ≤ right). When left > right, the range is empty and target isn't present.

10Find the middle

Calculate the middle index. Integer division ensures we get a valid index. This divides the search space in half.

12Check for match

If the middle element equals our target, we've found it! Return the index immediately.

14Search right half

If arr[mid] < target, the target must be in the right half. Update left to mid + 1, eliminating the left half.

EXAMPLE
If arr = [1,3,5,7,9], mid=2 (value 5), target=7: search [7,9]
16Search left half

If arr[mid] > target, the target must be in the left half. Update right to mid - 1, eliminating the right half.

13 lines without explanation
1def binary_search(arr, target):
2    """
3    Search for target in sorted array arr.
4    Returns index if found, -1 otherwise.
5    Time Complexity: O(log n)
6    """
7    left, right = 0, len(arr) - 1
8
9    while left <= right:
10        mid = (left + right) // 2
11
12        if arr[mid] == target:
13            return mid           # Found!
14        elif arr[mid] < target:
15            left = mid + 1       # Search right half
16        else:
17            right = mid - 1      # Search left half
18
19    return -1  # Not found

Why O(log n)?

Each iteration eliminates half the remaining elements. After kk iterations, we have:

Remaining elements=n2k\text{Remaining elements} = \frac{n}{2^k}

We're done when only 1 element remains, so we solve for kk:

n2k=1n=2kk=log2(n)\frac{n}{2^k} = 1 \quad \Rightarrow \quad n = 2^k \quad \Rightarrow \quad k = \log_2(n)

Interactive: Binary Search in Action

Watch how binary search eliminates half the search space at each step. Notice how even with 64 elements, it takes at most 6 comparisons.

Binary Search: The Logarithm in Action
01
12
23
34
45
56
67
78
89
910
1011
1112
1213
1314
1415
1516
1617
1718
1819
1920
2021
2122
2223
2324
2425
2526
2627
2728
2829
2930
3031
3132
Search Range
Current Middle
Found
Target
0
Steps Taken
5
Max Steps (log₂ n)
32
Elements Remaining
32
Total Elements

Why Binary Search is O(log n)

Each step eliminates half of the remaining elements. Starting with 32 elements:

32168421

After k steps, we have n/2k elements. We need n/2k = 1, so k = log₂(n) = log₂(32) ≈ 5.

Quick Check

Binary search on an array of 1,000,000 elements requires at most how many comparisons?


Case Study: Tree Heights

The height of a balanced tree is logarithmic in the number of nodes. This is why tree-based data structures (BSTs, heaps, B-trees) provide efficient operations.

The Relationship

For a balanced binary tree with nn nodes:

  • Level 0 (root): 1 node = 20
  • Level 1: 2 nodes = 21
  • Level 2: 4 nodes = 22
  • Level k: 2k nodes

A complete binary tree of height hh has 2h+112^{h+1} - 1 nodes. Solving for height:

h=log2(n)h = \lfloor \log_2(n) \rfloor

Interactive: Tree Height Visualization

Loading tree height demo

Why This Matters

Tree operations (search, insert, delete) traverse from root to leaf. In a balanced tree:

  • 1,000 nodes: height ≈ 10, so operations take ~10 steps
  • 1,000,000 nodes: height ≈ 20, so operations take ~20 steps
  • 1,000,000,000 nodes: height ≈ 30, so operations take ~30 steps

A billion-node tree is only 50% taller than a million-node tree!

Unbalanced Trees

If a tree becomes unbalanced (like a linked list), the height can become O(n) instead of O(log n). This is why self-balancing trees (AVL, Red-Black, B-trees) are crucial for maintaining logarithmic performance.

Case Study: Divide and Conquer

Divide and conquer algorithms split problems into smaller subproblems, solve them recursively, and combine the results. The recursion depth is logarithmic because each level reduces problem size by a constant factor.

The Pattern

A typical divide-and-conquer recurrence looks like:

T(n)=aT(nb)+f(n)T(n) = a \cdot T\left(\frac{n}{b}\right) + f(n)

Where:

  • aa = number of subproblems
  • n/bn/b = size of each subproblem
  • f(n)f(n) = work to divide and combine

The recursion depth is logb(n)\log_b(n) because we divide by bb at each level.

Interactive: Divide and Conquer Decomposition

Loading divide-and-conquer demo

Merge Sort Example

Merge sort divides an array in half (b=2b = 2), makes 2 recursive calls (a=2a = 2), and does O(n)O(n) work to merge:

T(n)=2T(n2)+O(n)T(n) = 2 \cdot T\left(\frac{n}{2}\right) + O(n)

There are log2(n)\log_2(n) levels, each doing O(n)O(n) work total:

T(n)=O(n)×log2(n)=O(nlogn)T(n) = O(n) \times \log_2(n) = O(n \log n)
Merge Sort: O(n log n)
🐍merge_sort.py
6Base case

An array of 0 or 1 elements is already sorted. This is where the recursion stops.

9Divide step

Split the array at the midpoint. This is the 'divide' in divide-and-conquer.

10Recursive call (left half)

Recursively sort the left half. Each recursive level processes n/2 elements.

11Recursive call (right half)

Recursively sort the right half. Combined with line 10, this creates the log(n) depth.

13Conquer/Combine step

Merge the two sorted halves. This takes O(n) time, and happens at each of the log(n) levels, giving O(n log n) total.

25 lines without explanation
1def merge_sort(arr):
2    """
3    Sort array using merge sort.
4    Time Complexity: O(n log n)
5    """
6    if len(arr) <= 1:
7        return arr              # Base case
8
9    mid = len(arr) // 2
10    left = merge_sort(arr[:mid])   # Recursive call (n/2)
11    right = merge_sort(arr[mid:])  # Recursive call (n/2)
12
13    return merge(left, right)      # O(n) work
14
15def merge(left, right):
16    """Merge two sorted arrays into one sorted array."""
17    result = []
18    i = j = 0
19
20    while i < len(left) and j < len(right):
21        if left[i] <= right[j]:
22            result.append(left[i])
23            i += 1
24        else:
25            result.append(right[j])
26            j += 1
27
28    result.extend(left[i:])
29    result.extend(right[j:])
30    return result

Common Logarithmic Complexities

Here are the most common complexity classes involving logarithms:

ComplexityNameExamples1 Million Items
O(log n)LogarithmicBinary search, balanced tree lookup~20 ops
O(√n)Square rootSome number theory algorithms1,000 ops
O(n)LinearLinear search, array traversal1,000,000 ops
O(n log n)LinearithmicMerge sort, heap sort, FFT~20,000,000 ops
O(n²)QuadraticBubble sort, naive matrix multiply10¹² ops

The Log n Factor: What It Represents

The logn\log n factor typically represents one of:

  • Recursion depth: How many times you divide the problem (merge sort)
  • Tree height: How many levels in a balanced tree structure (heap sort)
  • Bits in the input: Number of binary digits needed to represent n (radix sort)

Pattern Recognition

When you see an algorithm that repeatedly halves its input or builds a balanced tree structure, expect a log n factor in the complexity.

Why the Base Doesn't Matter in Big-O

You'll often see O(logn)O(\log n) written without specifying the base. This is intentional: in asymptotic analysis, the base is absorbed by the constant factor.

The Mathematical Reason

Using the change of base formula:

loga(n)=logb(n)logb(a)=1logb(a)logb(n)\log_a(n) = \frac{\log_b(n)}{\log_b(a)} = \frac{1}{\log_b(a)} \cdot \log_b(n)

The factor 1logb(a)\frac{1}{\log_b(a)} is a constant that Big-O notation ignores. Therefore:

O(log2n)=O(log10n)=O(lnn)=O(logn)O(\log_2 n) = O(\log_{10} n) = O(\ln n) = O(\log n)

Concrete Example

nlog₂(n)log₁₀(n)Ratio
103.321.003.32
1006.642.003.32
1,0009.973.003.32
1,000,00019.936.003.32

The ratio log₂(n)/log₁₀(n) is always 3.32 (which equals log₂(10)). This constant multiplier is exactly what Big-O ignores.

When Base Does Matter

The base matters in exact complexity analysis (counting precise operations) and when comparing algorithms with the same asymptotic complexity. For example, a 3-way quicksort (log₃ n depth) has fewer recursion levels than 2-way (log₂ n), even though both are O(n log n).

Real-World Impact

The practical impact of logarithmic complexity is staggering. Let's examine some real-world scales:

Database Indexing

Modern databases use B-trees with branching factors around 100-1000. For a database with 10 billion records:

log100(1010)=102=5 disk reads\log_{100}(10^{10}) = \frac{10}{2} = 5 \text{ disk reads}

Just 5 disk accesses to find any record among 10 billion! Without indexing, you'd need to scan all 10 billion records.

Google indexes approximately 100 billion web pages. A tree-based index structure with branching factor 1000:

log1000(1011)4 steps\log_{1000}(10^{11}) \approx 4 \text{ steps}

Four index lookups can narrow down from 100 billion pages to a manageable set!

Version Control (Git)

Git uses Merkle trees (hash trees) to efficiently compare repositories. Comparing two repositories with millions of files:

  • Without tree structure: O(n) = millions of comparisons
  • With Merkle tree: O(log n) ≈ 20 hash comparisons to find differences

Comparison: Linear vs Logarithmic at Scale

Data SizeLinear O(n)Logarithmic O(log n)Speedup
1,0001,000 ops10 ops100×
1,000,0001,000,000 ops20 ops50,000×
1,000,000,0001 billion ops30 ops33 million×
1 trillion1 trillion ops40 ops25 billion×
The Takeaway: Logarithmic algorithms don't just run faster—they transform impossible problems into trivial ones. A billion-element search in 30 steps is the difference between instant results and hours of waiting.

Summary

In this section, we've explored logarithms—one of the most important mathematical concepts in algorithm analysis:

Key Concepts

ConceptKey Insight
Definitionlog_b(x) = y means b^y = x; it counts repeated multiplications or divisions
PropertiesProduct: log(ab) = log(a) + log(b); Power: log(a^n) = n×log(a)
Binary SearchHalving the search space gives O(log n) comparisons
Tree HeightBalanced tree with n nodes has height O(log n)
Divide & ConquerDividing by constant factor gives log n recursion depth
Base IndependenceAll log bases differ by constants, so O(log₂ n) = O(log n)

Key Formulas

logb(x)=yby=xlog(ab)=log(a)+log(b)log(a/b)=log(a)log(b)log(an)=nlog(a)loga(x)=logb(x)logb(a)\begin{aligned} \log_b(x) = y &\Leftrightarrow b^y = x \\ \log(ab) &= \log(a) + \log(b) \\ \log(a/b) &= \log(a) - \log(b) \\ \log(a^n) &= n \cdot \log(a) \\ \log_a(x) &= \frac{\log_b(x)}{\log_b(a)} \end{aligned}

What's Next

In the upcoming sections, we'll explore:

  1. Summation Formulas and Series: Essential for analyzing loops and recursive algorithms
  2. Proof Techniques: Mathematical induction and other methods for proving algorithm correctness
  3. Probability Basics: Randomized algorithms and expected case analysis

Knowledge Check

Test your understanding of logarithms in algorithms with this quiz:

Knowledge Check
Question 1 of 8

What is log₂(1024)?


Exercises

Computation Exercises

  1. Calculate: log₂(256), log₂(2048), log₃(243), log₁₀(1,000,000)
  2. Simplify using logarithm properties: log₂(32 × 64), log₂(1024 / 32), log₂(8⁵)
  3. Convert log₂(1000) to use base 10. Verify your answer with a calculator.
  4. If log₂(n) = 20, what is n? If log₂(n) = 30?

Algorithm Analysis Exercises

  1. A sorted array has 1,000,000 elements. What is the maximum number of comparisons binary search will make?
  2. A balanced ternary tree (each node has 3 children) has 1,000,000 nodes. What is its height?
  3. How many times can you halve the number 1,000,000,000 before reaching 1?
  4. An algorithm has recurrence T(n) = 2T(n/2) + n. Write out the first 4 levels of the recursion tree for n = 64. What is the total work at each level?

Conceptual Questions

  1. Why do we say O(log₂ n) = O(log₁₀ n) = O(log n), but we can't say O(n) = O(n²)?
  2. A database can do binary search on an index in O(log n) time. Why might it use a B-tree with branching factor 100 instead of a binary tree?
  3. If doubling the input size adds one more step to an algorithm, what is its complexity?
  4. Explain intuitively why merge sort is O(n log n) while bubble sort is O(n²).

Programming Challenges

  1. Implement a function that computes log₂(n) using only addition and subtraction (no built-in log functions). Count how many times you halve n.
  2. Modify binary search to return the number of comparisons made. Test on arrays of size 100, 1000, 10000, and verify the results match log₂(n).
  3. Write a function to compute the height of a complete binary tree given the number of nodes. Verify using log₂(n).

Solution Hints

  • Exercise 1.1: 256 = 2⁸, 2048 = 2¹¹, 243 = 3⁵, 1,000,000 = 10⁶
  • Exercise 2.1: log₂(1,000,000) = log₂(10⁶) ≈ 6 × 3.32 ≈ 20 comparisons
  • Exercise 3.1: Different log bases differ by constants; n and n² differ by a factor of n

In the next section, we'll explore Summation Formulas and Series—essential tools for analyzing loops and computing total work in algorithms.

Loading comments...