Introduction
Set theory forms the mathematical foundation of probability theory. Before we can discuss events, sample spaces, and probability measures, we need to understand what sets are and how to work with them. This section provides a quick refresher on the essential set theory concepts you'll encounter throughout this book.
Why This Matters for ML: In machine learning, we constantly work with collections of data points, events, and outcomes. Understanding set theory helps us reason about these collections rigorously and provides the language for defining probability spaces.
What is a Set?
A set is a well-defined collection of distinct objects. These objects are called elements or members of the set. Sets are typically denoted by capital letters (A, B, C, ...) and their elements are listed within curly braces.
Examples of Sets
- Finite set: - the set of first five positive integers
- Infinite set: - the set of natural numbers
- Empty set: - the set with no elements
- Real numbers: - all real numbers
Special Number Sets
- - Natural numbers (1, 2, 3, ...)
- - Integers (..., -2, -1, 0, 1, 2, ...)
- - Rational numbers (fractions)
- - Real numbers
Set Notation
Understanding set notation is crucial for reading mathematical definitions in probability and statistics.
| Symbol | Meaning | Example |
|---|---|---|
| ∈ | Element of | 3 ∈ {1, 2, 3} |
| ∉ | Not element of | 4 ∉ {1, 2, 3} |
| ⊆ | Subset of | {1, 2} ⊆ {1, 2, 3} |
| ⊂ | Proper subset | {1, 2} ⊂ {1, 2, 3} |
| |A| | Cardinality (size) | |{1, 2, 3}| = 3 |
| ∅ | Empty set | ∅ ⊆ A for any set A |
Set-Builder Notation
Sets can also be defined by specifying a property that all elements must satisfy:
This reads as "A is the set of all x in the real numbers such that x is greater than 0." The colon (:) can also be written as a vertical bar (|).
Set Operations
The following operations allow us to combine and manipulate sets:
Union (∪)
The union of sets A and B contains all elements that are in A or B (or both):
Intersection (∩)
The intersection of sets A and B contains all elements that are in A and B:
Complement (Aᶜ or A')
The complement of set A contains all elements in the universal set that are not in A:
Set Difference (A \ B)
The difference A minus B contains elements in A but not in B:
Probability Connection
De Morgan's Laws
De Morgan's laws are fundamental identities that relate unions, intersections, and complements. They are essential for manipulating probability expressions:
In words:
- The complement of a union equals the intersection of complements
- The complement of an intersection equals the union of complements
Example: If A = "it rains" and B = "it's cold", then (A ∪ B)ᶜ = "it doesn't rain AND it's not cold" = Aᶜ ∩ Bᶜ
Python Implementation
Python has built-in support for sets, making it easy to work with set operations:
1# Creating sets
2A = {1, 2, 3, 4, 5}
3B = {4, 5, 6, 7, 8}
4
5# Set membership
6print(3 in A) # True
7print(6 in A) # False
8
9# Union (elements in A OR B)
10print(A | B) # {1, 2, 3, 4, 5, 6, 7, 8}
11print(A.union(B)) # Same result
12
13# Intersection (elements in A AND B)
14print(A & B) # {4, 5}
15print(A.intersection(B))
16
17# Difference (elements in A but NOT in B)
18print(A - B) # {1, 2, 3}
19print(A.difference(B))
20
21# Symmetric difference (elements in A OR B but NOT both)
22print(A ^ B) # {1, 2, 3, 6, 7, 8}
23
24# Subset check
25C = {1, 2}
26print(C <= A) # True (C is subset of A)
27print(C < A) # True (C is proper subset of A)Working with Universal Set and Complement
1import numpy as np
2
3# Define universal set (e.g., integers 1-10)
4U = set(range(1, 11))
5A = {1, 2, 3, 4, 5}
6
7# Complement of A with respect to U
8A_complement = U - A
9print(f"A = {A}")
10print(f"A^c = {A_complement}") # {6, 7, 8, 9, 10}
11
12# Verify De Morgan's Law
13B = {4, 5, 6, 7, 8}
14
15# (A ∪ B)^c
16union_complement = U - (A | B)
17print(f"(A ∪ B)^c = {union_complement}")
18
19# A^c ∩ B^c
20intersection_complements = (U - A) & (U - B)
21print(f"A^c ∩ B^c = {intersection_complements}")
22
23# They should be equal!
24print(f"De Morgan verified: {union_complement == intersection_complements}")Summary
In this section, we covered the essential set theory concepts that form the foundation of probability theory:
- Sets are collections of distinct objects
- Set notation provides a precise language for describing membership, subsets, and cardinality
- Set operations (union, intersection, complement, difference) allow us to combine sets
- De Morgan's laws relate complements to unions and intersections
In the next section, we'll review combinatorics - the mathematics of counting - which is essential for calculating probabilities in discrete sample spaces.