Chapter 0
15 min read
Section 2 of 175

Set Theory Essentials

Prerequisites & Mathematical Foundations

Introduction

Set theory forms the mathematical foundation of probability theory. Before we can discuss events, sample spaces, and probability measures, we need to understand what sets are and how to work with them. This section provides a quick refresher on the essential set theory concepts you'll encounter throughout this book.

Why This Matters for ML: In machine learning, we constantly work with collections of data points, events, and outcomes. Understanding set theory helps us reason about these collections rigorously and provides the language for defining probability spaces.

What is a Set?

A set is a well-defined collection of distinct objects. These objects are called elements or members of the set. Sets are typically denoted by capital letters (A, B, C, ...) and their elements are listed within curly braces.

Examples of Sets

  • Finite set: A={1,2,3,4,5}A = \{1, 2, 3, 4, 5\} - the set of first five positive integers
  • Infinite set: N={1,2,3,...}\mathbb{N} = \{1, 2, 3, ...\} - the set of natural numbers
  • Empty set: ={}\emptyset = \{\} - the set with no elements
  • Real numbers: R\mathbb{R} - all real numbers

Special Number Sets

  • N\mathbb{N} - Natural numbers (1, 2, 3, ...)
  • Z\mathbb{Z} - Integers (..., -2, -1, 0, 1, 2, ...)
  • Q\mathbb{Q} - Rational numbers (fractions)
  • R\mathbb{R} - Real numbers

Set Notation

Understanding set notation is crucial for reading mathematical definitions in probability and statistics.

SymbolMeaningExample
Element of3 ∈ {1, 2, 3}
Not element of4 ∉ {1, 2, 3}
Subset of{1, 2} ⊆ {1, 2, 3}
Proper subset{1, 2} ⊂ {1, 2, 3}
|A|Cardinality (size)|{1, 2, 3}| = 3
Empty set∅ ⊆ A for any set A

Set-Builder Notation

Sets can also be defined by specifying a property that all elements must satisfy:

A={xR:x>0}A = \{x \in \mathbb{R} : x > 0\}

This reads as "A is the set of all x in the real numbers such that x is greater than 0." The colon (:) can also be written as a vertical bar (|).


Set Operations

The following operations allow us to combine and manipulate sets:

Union (∪)

The union of sets A and B contains all elements that are in A or B (or both):

AB={x:xA or xB}A \cup B = \{x : x \in A \text{ or } x \in B\}

Intersection (∩)

The intersection of sets A and B contains all elements that are in A and B:

AB={x:xA and xB}A \cap B = \{x : x \in A \text{ and } x \in B\}

Complement (Aᶜ or A')

The complement of set A contains all elements in the universal set that are not in A:

Ac={xU:xA}A^c = \{x \in U : x \notin A\}

Set Difference (A \ B)

The difference A minus B contains elements in A but not in B:

AB={x:xA and xB}A \setminus B = \{x : x \in A \text{ and } x \notin B\}

Probability Connection

In probability, union corresponds to "OR" events, intersection to "AND" events, and complement to "NOT" events. We'll explore this in Chapter 1.

De Morgan's Laws

De Morgan's laws are fundamental identities that relate unions, intersections, and complements. They are essential for manipulating probability expressions:

(AB)c=AcBc(A \cup B)^c = A^c \cap B^c
(AB)c=AcBc(A \cap B)^c = A^c \cup B^c

In words:

  1. The complement of a union equals the intersection of complements
  2. The complement of an intersection equals the union of complements
Example: If A = "it rains" and B = "it's cold", then (A ∪ B)ᶜ = "it doesn't rain AND it's not cold" = Aᶜ ∩ Bᶜ

Python Implementation

Python has built-in support for sets, making it easy to work with set operations:

🐍set_operations.py
1# Creating sets
2A = {1, 2, 3, 4, 5}
3B = {4, 5, 6, 7, 8}
4
5# Set membership
6print(3 in A)        # True
7print(6 in A)        # False
8
9# Union (elements in A OR B)
10print(A | B)         # {1, 2, 3, 4, 5, 6, 7, 8}
11print(A.union(B))    # Same result
12
13# Intersection (elements in A AND B)
14print(A & B)         # {4, 5}
15print(A.intersection(B))
16
17# Difference (elements in A but NOT in B)
18print(A - B)         # {1, 2, 3}
19print(A.difference(B))
20
21# Symmetric difference (elements in A OR B but NOT both)
22print(A ^ B)         # {1, 2, 3, 6, 7, 8}
23
24# Subset check
25C = {1, 2}
26print(C <= A)        # True (C is subset of A)
27print(C < A)         # True (C is proper subset of A)

Working with Universal Set and Complement

🐍complement.py
1import numpy as np
2
3# Define universal set (e.g., integers 1-10)
4U = set(range(1, 11))
5A = {1, 2, 3, 4, 5}
6
7# Complement of A with respect to U
8A_complement = U - A
9print(f"A = {A}")
10print(f"A^c = {A_complement}")  # {6, 7, 8, 9, 10}
11
12# Verify De Morgan's Law
13B = {4, 5, 6, 7, 8}
14
15# (A ∪ B)^c
16union_complement = U - (A | B)
17print(f"(A ∪ B)^c = {union_complement}")
18
19# A^c ∩ B^c
20intersection_complements = (U - A) & (U - B)
21print(f"A^c ∩ B^c = {intersection_complements}")
22
23# They should be equal!
24print(f"De Morgan verified: {union_complement == intersection_complements}")

Summary

In this section, we covered the essential set theory concepts that form the foundation of probability theory:

  • Sets are collections of distinct objects
  • Set notation provides a precise language for describing membership, subsets, and cardinality
  • Set operations (union, intersection, complement, difference) allow us to combine sets
  • De Morgan's laws relate complements to unions and intersections

In the next section, we'll review combinatorics - the mathematics of counting - which is essential for calculating probabilities in discrete sample spaces.