Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will be able to:

Understand what logarithms are and why they're fundamental to algorithm analysis
Recognize logarithmic patterns in algorithms like binary search and divide-and-conquer
Calculate and manipulate logarithms using key properties
Explain why the logarithm base doesn't matter in Big-O notation
Connect logarithms to real-world systems like databases, search engines, and tree structures
Appreciate the dramatic efficiency that logarithmic algorithms provide at scale

Why This Matters: Logarithms are the mathematical foundation of efficient algorithms. They explain why binary search can find an item among a billion in just 30 steps, why balanced trees remain fast as they grow, and why divide-and-conquer algorithms are so powerful. Understanding logarithms transforms algorithm analysis from memorization to intuition.

The Story Behind Logarithms

Long before computers existed, mathematicians faced a practical problem: multiplication and division of large numbers were incredibly tedious and error-prone. In the early 17th century, Scottish mathematician John Napier invented a revolutionary tool—the logarithm.

The Original Purpose

Napier's insight was brilliant: multiplication can be converted to addition using logarithms. If you need to multiply two large numbers, say 3,456 × 7,891:

Look up log(3,456) ≈ 3.539 and log(7,891) ≈ 3.897 in a table
Add them: 3.539 + 3.897 = 7.436
Look up the antilog: 10^7.436 ≈ 27,279,696

This technique powered centuries of scientific calculation and navigation before calculators. Slide rules—mechanical logarithm computers—helped send astronauts to the moon!

Logarithms in Computer Science

In computer science, logarithms gained a new and profound importance: they describe how many times you can divide a problem in half before reaching a single element. This is the essence of efficient algorithms.

Historical Connection

The term "logarithm" comes from Greek: logos (ratio) + arithmos (number). Napier chose this name because logarithms relate numbers through ratios—exactly what happens when we repeatedly halve a problem.

Definition: What is a Logarithm?

A logarithm answers a fundamental question: "What power do I need to raise a base to, in order to get a given number?"

Formal Definition

The logarithm base $b$ of $x$ , written $\log_b(x)$ , is the exponent $y$ such that:

\log_b(x) = y \quad \Leftrightarrow \quad b^y = x

In other words, $\log_b(x)$ asks: "To what power must I raise $b$ to get $x$ ?"

Examples with Different Bases

Logarithm	Question	Answer	Verification
log₂(8)	2 to what power = 8?	3	2³ = 8 ✓
log₂(1024)	2 to what power = 1024?	10	2¹⁰ = 1024 ✓
log₁₀(1000)	10 to what power = 1000?	3	10³ = 1000 ✓
log₂(1)	2 to what power = 1?	0	2⁰ = 1 ✓
log₃(81)	3 to what power = 81?	4	3⁴ = 81 ✓

Intuitive Understanding

Think of logarithms as counting multiplications:

log₂(n): How many times must I multiply 2 by itself to reach n?
Equivalently: How many times can I divide n by 2 before reaching 1?

For example, $\log_2(1024) = 10$ because:

📝text

11024 → 512 → 256 → 128 → 64 → 32 → 16 → 8 → 4 → 2 → 1
2  1     2     3     4     5    6    7   8   9  10 divisions

Memory Aid

Think of log₂(n) as "how many times can I halve n?" This directly connects to binary search and divide-and-conquer algorithms.

Quick Check

What is log₂(32)?

Interactive: Logarithm Explorer

Use this interactive tool to explore logarithms with different bases. Notice how all logarithm curves have the same fundamental shape—they differ only by a constant factor.

Loading logarithm explorer…

Key Logarithm Properties

These properties are essential for analyzing algorithms and simplifying expressions. Each property corresponds to a rule about exponents.

The Three Fundamental Properties

Property	Formula	Intuition
Product Rule	log(ab) = log(a) + log(b)	Multiplying numbers = adding their exponents
Quotient Rule	log(a/b) = log(a) - log(b)	Dividing numbers = subtracting their exponents
Power Rule	log(aⁿ) = n × log(a)	Raising to a power = multiplying the exponent

Additional Important Properties

Property	Formula	Example
log of 1	log_b(1) = 0	Any number to power 0 is 1
log of base	log_b(b) = 1	log₂(2) = 1, log₁₀(10) = 1
Change of base	log_a(x) = log_b(x) / log_b(a)	log₂(x) = ln(x) / ln(2)
Inverse	b^(log_b(x)) = x	2^(log₂(8)) = 8

Change of Base Formula

The change of base formula is particularly important in computer science:

\log_a(x) = \frac{\log_b(x)}{\log_b(a)} = \frac{\ln(x)}{\ln(a)} = \frac{\log_{10}(x)}{\log_{10}(a)}

This means any logarithm can be converted to any other base by dividing by a constant. This constant factor is why the base doesn't matter in Big-O notation.

Logarithm Properties in Code

🐍logarithm_properties.py

Explanation(5)

Code(17)

7Base-2 logarithm

log₂(1024) = 10 because 2¹⁰ = 1024. Python's math.log2() computes this directly.

EXAMPLE

math.log2(8) = 3.0

8Base-10 logarithm

log₁₀(1024) ≈ 3.01 because 10³ = 1000, close to 1024. This is the 'common logarithm'.

9Natural logarithm

ln(1024) ≈ 6.93, using base e ≈ 2.718. The natural log appears frequently in calculus and statistics.

12Change of base

We can compute log₂(x) using log₁₀: just divide log₁₀(x) by log₁₀(2). This works with any base.

EXAMPLE

log₁₀(1024) / log₁₀(2) = 3.01 / 0.301 = 10.0

16Constant ratio

The ratio log₂(n)/log₁₀(n) is always the same: log₂(10) ≈ 3.322. This constant factor is why base doesn't matter in Big-O.

12 lines without explanation

1# Computing logarithms in Python
2import math
3
4n = 1024
5
6# Different bases, same pattern
7log_base_2 = math.log2(n)        # 10.0
8log_base_10 = math.log10(n)      # 3.010...
9log_natural = math.log(n)        # 6.931...
10
11# Converting between bases
12log_2_from_10 = math.log10(n) / math.log10(2)  # 10.0
13log_2_from_e = math.log(n) / math.log(2)       # 10.0
14
15# Verify the relationship
16ratio = log_base_2 / log_base_10  # ≈ 3.322
17# This is always log₂(10) ≈ 3.322

Quick Check

Using the product rule, what is log₂(8 × 16)?

Why Logarithms Appear in Algorithms

Logarithms appear in algorithms for a fundamental reason: they describe repeated halving (or more generally, repeated division). This pattern emerges in three main scenarios:

1. Searching in Sorted Data

When searching a sorted collection, you can eliminate half the remaining elements with each comparison. Starting with $n$ elements:

n \to \frac{n}{2} \to \frac{n}{4} \to \frac{n}{8} \to \cdots \to 1

The number of steps to reach 1 element is $\log_2(n)$ .

2. Balanced Tree Heights

In a balanced binary tree with $n$ nodes, each level doubles the number of nodes. A tree of height $h$ can hold $2^h - 1$ nodes. Inverting this:

h = \lceil \log_2(n + 1) \rceil

3. Divide and Conquer Recursion

When a problem of size $n$ is divided into smaller subproblems (typically of size $n/2$ ), the recursion depth is $\log(n)$ . This is why algorithms like merge sort and quicksort have $O(n \log n)$ complexity.

The Common Thread

All three scenarios share the same pattern: the problem size shrinks by a constant factor at each step. Whether dividing by 2, 3, or any constant, the number of divisions needed to reach size 1 is logarithmic.

Case Study: Binary Search

Binary search is the canonical example of logarithmic complexity. It searches a sorted array by repeatedly dividing the search space in half.

The Algorithm

Binary Search Implementation

🐍binary_search.py

Explanation(6)

Code(19)

7Initialize search range

We start with the entire array: from index 0 to len(arr)-1. This range will shrink by half each iteration.

9Main loop condition

Continue while the search range is valid (left ≤ right). When left > right, the range is empty and target isn't present.

10Find the middle

Calculate the middle index. Integer division ensures we get a valid index. This divides the search space in half.

12Check for match

If the middle element equals our target, we've found it! Return the index immediately.

14Search right half

If arr[mid] < target, the target must be in the right half. Update left to mid + 1, eliminating the left half.

EXAMPLE

If arr = [1,3,5,7,9], mid=2 (value 5), target=7: search [7,9]

16Search left half

If arr[mid] > target, the target must be in the left half. Update right to mid - 1, eliminating the right half.

13 lines without explanation

1def binary_search(arr, target):
2    """
3    Search for target in sorted array arr.
4    Returns index if found, -1 otherwise.
5    Time Complexity: O(log n)
6    """
7    left, right = 0, len(arr) - 1
8
9    while left <= right:
10        mid = (left + right) // 2
11
12        if arr[mid] == target:
13            return mid           # Found!
14        elif arr[mid] < target:
15            left = mid + 1       # Search right half
16        else:
17            right = mid - 1      # Search left half
18
19    return -1  # Not found

Why O(log n)?

Each iteration eliminates half the remaining elements. After $k$ iterations, we have:

\text{Remaining elements} = \frac{n}{2^k}

We're done when only 1 element remains, so we solve for $k$ :

\frac{n}{2^k} = 1 \quad \Rightarrow \quad n = 2^k \quad \Rightarrow \quad k = \log_2(n)

Interactive: Binary Search in Action

Watch how binary search eliminates half the search space at each step. Notice how even with 64 elements, it takes at most 6 comparisons.

Binary Search: The Logarithm in Action

Array Size: 32

Target Value: 23

Speed:

910

1011

1112

1213

1314

1415

1516

1617

1718

1819

1920

2021

2122

2223

2324

2425

2526

2627

2728

2829

2930

3031

3132

Search Range

Current Middle

Found

Target

Steps Taken

Max Steps (log₂ n)

Elements Remaining

Total Elements

Why Binary Search is O(log n)

Each step eliminates half of the remaining elements. Starting with 32 elements:

32 → 16 → 8 → 4 → 2 → 1

After k steps, we have n/2^k elements. We need n/2^k = 1, so k = log₂(n) = log₂(32) ≈ 5.

Quick Check

Binary search on an array of 1,000,000 elements requires at most how many comparisons?

Case Study: Tree Heights

The height of a balanced tree is logarithmic in the number of nodes. This is why tree-based data structures (BSTs, heaps, B-trees) provide efficient operations.

The Relationship

For a balanced binary tree with $n$ nodes:

Level 0 (root): 1 node = 2⁰
Level 1: 2 nodes = 2¹
Level 2: 4 nodes = 2²
Level k: 2^k nodes

A complete binary tree of height $h$ has $2^{h+1} - 1$ nodes. Solving for height:

h = \lfloor \log_2(n) \rfloor

Interactive: Tree Height Visualization

Loading tree height demo…

Why This Matters

Tree operations (search, insert, delete) traverse from root to leaf. In a balanced tree:

1,000 nodes: height ≈ 10, so operations take ~10 steps
1,000,000 nodes: height ≈ 20, so operations take ~20 steps
1,000,000,000 nodes: height ≈ 30, so operations take ~30 steps

A billion-node tree is only 50% taller than a million-node tree!

Unbalanced Trees

If a tree becomes unbalanced (like a linked list), the height can become O(n) instead of O(log n). This is why self-balancing trees (AVL, Red-Black, B-trees) are crucial for maintaining logarithmic performance.

Case Study: Divide and Conquer

Divide and conquer algorithms split problems into smaller subproblems, solve them recursively, and combine the results. The recursion depth is logarithmic because each level reduces problem size by a constant factor.

The Pattern

A typical divide-and-conquer recurrence looks like:

T(n) = a \cdot T\left(\frac{n}{b}\right) + f(n)

Where:

$a$ = number of subproblems
$n/b$ = size of each subproblem
$f(n)$ = work to divide and combine

The recursion depth is $\log_b(n)$ because we divide by $b$ at each level.

Interactive: Divide and Conquer Decomposition

Loading divide-and-conquer demo…

Merge Sort Example

Merge sort divides an array in half ( $b = 2$ ), makes 2 recursive calls ( $a = 2$ ), and does $O(n)$ work to merge:

T(n) = 2 \cdot T\left(\frac{n}{2}\right) + O(n)

There are $\log_2(n)$ levels, each doing $O(n)$ work total:

T(n) = O(n) \times \log_2(n) = O(n \log n)

Merge Sort: O(n log n)

🐍merge_sort.py

Explanation(5)

Code(30)

6Base case

An array of 0 or 1 elements is already sorted. This is where the recursion stops.

9Divide step

Split the array at the midpoint. This is the 'divide' in divide-and-conquer.

10Recursive call (left half)

Recursively sort the left half. Each recursive level processes n/2 elements.

11Recursive call (right half)

Recursively sort the right half. Combined with line 10, this creates the log(n) depth.

13Conquer/Combine step

Merge the two sorted halves. This takes O(n) time, and happens at each of the log(n) levels, giving O(n log n) total.

25 lines without explanation

1def merge_sort(arr):
2    """
3    Sort array using merge sort.
4    Time Complexity: O(n log n)
5    """
6    if len(arr) <= 1:
7        return arr              # Base case
8
9    mid = len(arr) // 2
10    left = merge_sort(arr[:mid])   # Recursive call (n/2)
11    right = merge_sort(arr[mid:])  # Recursive call (n/2)
12
13    return merge(left, right)      # O(n) work
14
15def merge(left, right):
16    """Merge two sorted arrays into one sorted array."""
17    result = []
18    i = j = 0
19
20    while i < len(left) and j < len(right):
21        if left[i] <= right[j]:
22            result.append(left[i])
23            i += 1
24        else:
25            result.append(right[j])
26            j += 1
27
28    result.extend(left[i:])
29    result.extend(right[j:])
30    return result

Common Logarithmic Complexities

Here are the most common complexity classes involving logarithms:

Complexity	Name	Examples	1 Million Items
O(log n)	Logarithmic	Binary search, balanced tree lookup	~20 ops
O(√n)	Square root	Some number theory algorithms	1,000 ops
O(n)	Linear	Linear search, array traversal	1,000,000 ops
O(n log n)	Linearithmic	Merge sort, heap sort, FFT	~20,000,000 ops
O(n²)	Quadratic	Bubble sort, naive matrix multiply	10¹² ops

The Log n Factor: What It Represents

The $\log n$ factor typically represents one of:

Recursion depth: How many times you divide the problem (merge sort)
Tree height: How many levels in a balanced tree structure (heap sort)
Bits in the input: Number of binary digits needed to represent n (radix sort)

Pattern Recognition

When you see an algorithm that repeatedly halves its input or builds a balanced tree structure, expect a log n factor in the complexity.

Why the Base Doesn't Matter in Big-O

You'll often see $O(\log n)$ written without specifying the base. This is intentional: in asymptotic analysis, the base is absorbed by the constant factor.

The Mathematical Reason

Using the change of base formula:

\log_a(n) = \frac{\log_b(n)}{\log_b(a)} = \frac{1}{\log_b(a)} \cdot \log_b(n)

The factor $\frac{1}{\log_b(a)}$ is a constant that Big-O notation ignores. Therefore:

O(\log_2 n) = O(\log_{10} n) = O(\ln n) = O(\log n)

Concrete Example

n	log₂(n)	log₁₀(n)	Ratio
10	3.32	1.00	3.32
100	6.64	2.00	3.32
1,000	9.97	3.00	3.32
1,000,000	19.93	6.00	3.32

The ratio log₂(n)/log₁₀(n) is always 3.32 (which equals log₂(10)). This constant multiplier is exactly what Big-O ignores.

When Base Does Matter

The base matters in exact complexity analysis (counting precise operations) and when comparing algorithms with the same asymptotic complexity. For example, a 3-way quicksort (log₃ n depth) has fewer recursion levels than 2-way (log₂ n), even though both are O(n log n).

Real-World Impact

The practical impact of logarithmic complexity is staggering. Let's examine some real-world scales:

Database Indexing

Modern databases use B-trees with branching factors around 100-1000. For a database with 10 billion records:

\log_{100}(10^{10}) = \frac{10}{2} = 5 \text{ disk reads}

Just 5 disk accesses to find any record among 10 billion! Without indexing, you'd need to scan all 10 billion records.

Internet Search

Google indexes approximately 100 billion web pages. A tree-based index structure with branching factor 1000:

\log_{1000}(10^{11}) \approx 4 \text{ steps}

Four index lookups can narrow down from 100 billion pages to a manageable set!

Version Control (Git)

Git uses Merkle trees (hash trees) to efficiently compare repositories. Comparing two repositories with millions of files:

Without tree structure: O(n) = millions of comparisons
With Merkle tree: O(log n) ≈ 20 hash comparisons to find differences

Comparison: Linear vs Logarithmic at Scale

Data Size	Linear O(n)	Logarithmic O(log n)	Speedup
1,000	1,000 ops	10 ops	100×
1,000,000	1,000,000 ops	20 ops	50,000×
1,000,000,000	1 billion ops	30 ops	33 million×
1 trillion	1 trillion ops	40 ops	25 billion×

The Takeaway: Logarithmic algorithms don't just run faster—they transform impossible problems into trivial ones. A billion-element search in 30 steps is the difference between instant results and hours of waiting.

Summary

In this section, we've explored logarithms—one of the most important mathematical concepts in algorithm analysis:

Key Concepts

Concept	Key Insight
Definition	log_b(x) = y means b^y = x; it counts repeated multiplications or divisions
Properties	Product: log(ab) = log(a) + log(b); Power: log(a^n) = n×log(a)
Binary Search	Halving the search space gives O(log n) comparisons
Tree Height	Balanced tree with n nodes has height O(log n)
Divide & Conquer	Dividing by constant factor gives log n recursion depth
Base Independence	All log bases differ by constants, so O(log₂ n) = O(log n)

Key Formulas

\begin{aligned} \log_b(x) = y &\Leftrightarrow b^y = x \\ \log(ab) &= \log(a) + \log(b) \\ \log(a/b) &= \log(a) - \log(b) \\ \log(a^n) &= n \cdot \log(a) \\ \log_a(x) &= \frac{\log_b(x)}{\log_b(a)} \end{aligned}

What's Next

In the upcoming sections, we'll explore:

Summation Formulas and Series: Essential for analyzing loops and recursive algorithms
Proof Techniques: Mathematical induction and other methods for proving algorithm correctness
Probability Basics: Randomized algorithms and expected case analysis

Knowledge Check

Test your understanding of logarithms in algorithms with this quiz:

Knowledge Check

Question 1 of 8

What is log₂(1024)?

Exercises

Computation Exercises

Calculate: log₂(256), log₂(2048), log₃(243), log₁₀(1,000,000)
Simplify using logarithm properties: log₂(32 × 64), log₂(1024 / 32), log₂(8⁵)
Convert log₂(1000) to use base 10. Verify your answer with a calculator.
If log₂(n) = 20, what is n? If log₂(n) = 30?

Algorithm Analysis Exercises

A sorted array has 1,000,000 elements. What is the maximum number of comparisons binary search will make?
A balanced ternary tree (each node has 3 children) has 1,000,000 nodes. What is its height?
How many times can you halve the number 1,000,000,000 before reaching 1?
An algorithm has recurrence T(n) = 2T(n/2) + n. Write out the first 4 levels of the recursion tree for n = 64. What is the total work at each level?

Conceptual Questions

Why do we say O(log₂ n) = O(log₁₀ n) = O(log n), but we can't say O(n) = O(n²)?
A database can do binary search on an index in O(log n) time. Why might it use a B-tree with branching factor 100 instead of a binary tree?
If doubling the input size adds one more step to an algorithm, what is its complexity?
Explain intuitively why merge sort is O(n log n) while bubble sort is O(n²).

Programming Challenges

Implement a function that computes log₂(n) using only addition and subtraction (no built-in log functions). Count how many times you halve n.
Modify binary search to return the number of comparisons made. Test on arrays of size 100, 1000, 10000, and verify the results match log₂(n).
Write a function to compute the height of a complete binary tree given the number of nodes. Verify using log₂(n).

Solution Hints

Exercise 1.1: 256 = 2⁸, 2048 = 2¹¹, 243 = 3⁵, 1,000,000 = 10⁶
Exercise 2.1: log₂(1,000,000) = log₂(10⁶) ≈ 6 × 3.32 ≈ 20 comparisons
Exercise 3.1: Different log bases differ by constants; n and n² differ by a factor of n

In the next section, we'll explore Summation Formulas and Series—essential tools for analyzing loops and computing total work in algorithms.