Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will be able to:

📚 Core Knowledge

• Define a Poisson process and its three fundamental properties
• Derive the exponential distribution of inter-arrival times
• Explain the connection to order statistics and Gamma distributions
• Understand superposition and thinning operations

🔧 Practical Skills

• Simulate homogeneous Poisson processes from scratch
• Use the thinning algorithm for time-varying intensities
• Model aggregate claims with compound Poisson processes
• Apply these concepts to real-world event modeling

🧠 Deep Learning Connections

• Neural Point Processes — Modern deep learning extends Poisson processes to learn intensity functions from data
• Temporal Event Prediction — Predicting when the next user action, transaction, or system failure will occur
• Attention Mechanisms — Self-attention can be viewed through the lens of point process theory
• Reinforcement Learning — Poisson processes model random events in environment dynamics

Where You'll Apply This: Queuing systems, call centers, network traffic analysis, insurance claims modeling, stock price jump models, customer arrival prediction, epidemiology, and temporal event prediction in neural networks.

The Big Picture

The Poisson process is the fundamental model for random events occurring in time. Unlike the Poisson distribution (which counts events in a fixed interval), a Poisson process describes the entire sequence of arrival times—a continuous-time stochastic process.

The Core Insight

A Poisson process answers: "When do random events happen?" It models events that occur independently and at a constant average rate—like radioactive decay, customer arrivals, or server requests. The beautiful mathematics: while the process is continuous in time, the count in any interval follows a discrete Poisson distribution.

⏱️

Continuous Time: Events can occur at any moment

🎲

Memoryless: Future is independent of past

📊

Stationary: Same statistics at any time

Historical Context

📜

Siméon Denis Poisson (1837)

Poisson derived the famous distribution while studying court judgments, but didn't develop the process theory. His distribution models "rare events"—the count of occurrences when each individual event has tiny probability.

📞

Agner Krarup Erlang (1909)

Erlang developed queuing theory while working for the Copenhagen Telephone Company. He modeled telephone call arrivals as a Poisson process—founding modern operations research and teletraffic engineering.

⚛️

Ernest Rutherford & Hans Geiger (1910)

Rutherford and Geiger confirmed that radioactive decay follows a Poisson process—each atom decays independently with a constant rate, providing the first rigorous physical validation of the model.

Mathematical Definition

A counting process $\{N(t), t \geq 0\}$ with $N(0) = 0$ counts the number of events that occur by time $t$. It's a Poisson process if it satisfies three fundamental properties:

The Three Defining Properties

1. Independent Increments

The number of events in disjoint time intervals are independent random variables.

N(t_4) - N(t_3) \perp\!\!\!\perp N(t_2) - N(t_1) \text{ for } t_1 < t_2 \leq t_3 < t_4

2. Stationary Increments

The distribution of counts depends only on the length of the interval, not its position.

N(t+s) - N(s) \stackrel{d}{=} N(t) \text{ for all } s \geq 0

3. Orderliness (No Simultaneous Events)

Events occur one at a time—the probability of two or more events in a tiny interval is negligible.

P(N(h) \geq 2) = o(h) \text{ as } h \to 0

Together, these properties imply that, for some rate $\lambda > 0$, $N(t) \sim \text{Poisson}(\lambda t)$:

P(N(t) = k) = \frac{(\lambda t)^k e^{-\lambda t}}{k!}, \quad k = 0, 1, 2, \ldots

where $\lambda$ is the rate parameter (events per unit time)

Property	Formula	Interpretation
Mean	E[N(t)] = λt	Expected number of events grows linearly with time
Variance	Var(N(t)) = λt	Mean equals variance (Poisson property)
Standard deviation	σ = √(λt)	Uncertainty grows as square root of time
Rate	λ = E[N(t)]/t	Average events per unit time
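
One way to see where these moments come from is to approximate the process on a fine time grid: in each small step of width h, an event occurs independently with probability λh. The sketch below assumes NumPy is available; the variable names and parameter values are illustrative, and the grid is a discretized approximation rather than an exact simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

lam, t = 3.0, 2.0        # rate and time horizon (illustrative values)
h = 0.002                # small time step; lam * h must be well below 1
n_steps = int(round(t / h))
n_paths = 4000

# Each step independently contains an event with probability lam * h.
events = rng.random((n_paths, n_steps)) < lam * h
counts = events.sum(axis=1)          # approximate N(t) for each path

sample_mean = counts.mean()
sample_var = counts.var(ddof=1)
print(f"lambda * t      = {lam * t:.3f}")
print(f"sample mean     = {sample_mean:.3f}")
print(f"sample variance = {sample_var:.3f}")
```

Both sample statistics should land close to λt, which is the mean-equals-variance signature of the Poisson distribution.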

Interactive: Process Timeline

Simulate a Poisson process and observe how events arrive over time. Notice how the count in each unit interval follows a Poisson distribution, and the inter-arrival times follow an exponential distribution.


Key Properties of Poisson Process

1. N(t) ~ Poisson(λt) for any interval of length t
2. Inter-arrival times ~ Exponential(λ)
3. Independent increments: counts in non-overlapping intervals are independent
4. Stationary increments: count distribution depends only on interval length

Count in interval [0, t]: N(t) ~ Poisson(λt)

Inter-arrival time: T ~ Exponential(λ), E[T] = 1/λ

Inter-arrival Times

One of the most important properties of the Poisson process: inter-arrival times are exponentially distributed. Let $T_n$ denote the time between the $(n-1)$-th and $n$-th event.

Inter-arrival Time Distribution

T_n \stackrel{iid}{\sim} \text{Exponential}(\lambda)

Mean

E[T_n] = \frac{1}{\lambda}

Variance

\text{Var}(T_n) = \frac{1}{\lambda^2}
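
The exponential law follows from the count distribution $P(N(t) = k)$ given earlier. A short derivation sketch: the first inter-arrival time exceeds $t$ exactly when no event has occurred by time $t$.

```latex
\begin{aligned}
P(T_1 > t) &= P(N(t) = 0) = \frac{(\lambda t)^0 e^{-\lambda t}}{0!} = e^{-\lambda t}, \\
F_{T_1}(t) &= 1 - e^{-\lambda t}, \qquad
f_{T_1}(t) = \lambda e^{-\lambda t}, \quad t \geq 0.
\end{aligned}
```

Because increments are independent and stationary, the same argument can be restarted after each arrival, which is why every $T_n$ has this distribution. Making that restart step fully rigorous takes a little more care than shown here.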

The Memoryless Property: If you've been waiting time $s$ for the next event, the remaining waiting time still has the same exponential distribution. Mathematically: $P(T > t+s \mid T > s) = P(T > t)$. This is unique to the exponential distribution among continuous distributions.
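
The memoryless identity can be checked empirically. This is a minimal sketch assuming NumPy; the values of `lam`, `s`, and `t` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 2.0                 # illustrative rate
s, t = 0.4, 0.7           # time already waited, additional time

waits = rng.exponential(1.0 / lam, size=200_000)

# Conditional survival: among waits longer than s, the fraction exceeding s + t.
survived_s = waits[waits > s]
p_conditional = np.mean(survived_s > s + t)

# Unconditional survival P(T > t) and its closed form exp(-lam * t).
p_unconditional = np.mean(waits > t)
p_exact = np.exp(-lam * t)

print(f"P(T > t+s | T > s) ~ {p_conditional:.4f}")
print(f"P(T > t)           ~ {p_unconditional:.4f}")
print(f"exp(-lam * t)      = {p_exact:.4f}")
```

All three numbers should agree up to sampling noise.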

Arrival Time Distribution

The $n$-th arrival time $S_n = T_1 + T_2 + \cdots + T_n$ is the sum of $n$ independent exponential random variables. This sum follows a Gamma distribution, written here in the shape-scale parameterization:

S_n \sim \text{Gamma}(n, 1/\lambda)

with shape $n$, scale $1/\lambda$, mean $E[S_n] = n/\lambda$, and variance $\text{Var}(S_n) = n/\lambda^2$.
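
As a numerical check of the Gamma result, the sketch below sums exponential inter-arrival times and compares the result with direct Gamma draws. It assumes NumPy, whose `Generator.gamma` takes shape and scale arguments; the values of `lam` and `n` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
lam, n = 2.0, 5           # illustrative rate and arrival index
n_paths = 100_000

# S_n as a sum of n i.i.d. Exponential(lam) inter-arrival times.
inter_arrivals = rng.exponential(1.0 / lam, size=(n_paths, n))
s_n = inter_arrivals.sum(axis=1)

# Direct draws from Gamma(shape=n, scale=1/lam) for comparison.
gamma_draws = rng.gamma(shape=n, scale=1.0 / lam, size=n_paths)

print(f"theory : mean = {n / lam:.3f}, var = {n / lam**2:.3f}")
print(f"sum    : mean = {s_n.mean():.3f}, var = {s_n.var():.3f}")
print(f"gamma  : mean = {gamma_draws.mean():.3f}, var = {gamma_draws.var():.3f}")
```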

Order Statistics Connection: Given $N(T) = n$ events in $[0, T]$, the arrival times $(S_1, \ldots, S_n)$ have the same joint distribution as the order statistics of $n$ i.i.d. Uniform(0, T) random variables. This elegant result connects Poisson processes to order statistics!
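
This result also suggests a second way to simulate a homogeneous process on a fixed window: draw the count first, then place the events uniformly and sort them. The sketch below assumes NumPy; `simulate_via_uniforms` is a hypothetical helper name introduced for illustration.

```python
import numpy as np

def simulate_via_uniforms(lam: float, T: float, rng: np.random.Generator) -> np.ndarray:
    """Arrival times on [0, T]: draw N ~ Poisson(lam * T), then sort N uniforms."""
    n = rng.poisson(lam * T)
    return np.sort(rng.uniform(0.0, T, size=n))

rng = np.random.default_rng(3)
lam, T = 2.0, 1000.0      # illustrative rate and window length
arrivals = simulate_via_uniforms(lam, T, rng)

# If the construction is right, the gaps should look Exponential(lam).
gaps = np.diff(arrivals)
print(f"events          = {arrivals.size} (expected about {lam * T:.0f})")
print(f"mean gap        = {gaps.mean():.4f} (expected about {1 / lam:.4f})")
```

Unlike the sequential method, this approach needs the window length in advance, but it is fully vectorized.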

Interactive: Arrival Times

Explore the relationship between inter-arrival times (exponential) and arrival times (Gamma). Run multiple simulations to see how the empirical distributions match the theoretical predictions.


Key Mathematical Results

1. Arrival Time Distribution: S_n ~ Gamma(n, 1/λ), because it's the sum of n independent Exp(λ) random variables.

2. Expected Value: E[S_n] = n/λ

3. Variance: Var(S_n) = n/λ²

4. Order Statistics: Given N(T)=n, the arrival times are distributed like the order statistics of n Uniform(0,T) random variables.

Superposition and Thinning

Two fundamental operations on Poisson processes allow us to combine and decompose them. These are inverses of each other:

Superposition (Merging)

The sum of independent Poisson processes is a Poisson process.

PP(λ₁) + PP(λ₂) = PP(λ₁ + λ₂)

Example: Combining arrivals from two entrances

Thinning (Splitting)

Randomly classifying events creates independent Poisson processes.

PP(λ) → PP(λp) ⊥ PP(λ(1-p))

Example: Splitting customers by product interest
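
Both operations are easy to try numerically. The sketch below assumes NumPy and uses the conditional-uniform property described earlier to generate each stream; `homogeneous_arrivals` is a hypothetical helper, and the rates and split probability `p` are illustrative.

```python
import numpy as np

def homogeneous_arrivals(lam: float, T: float, rng: np.random.Generator) -> np.ndarray:
    """Arrival times of a rate-lam Poisson process on [0, T]."""
    n = rng.poisson(lam * T)
    return np.sort(rng.uniform(0.0, T, size=n))

rng = np.random.default_rng(4)
T = 2000.0
lam1, lam2, p = 3.0, 2.0, 0.3

# Superposition: merge two independent streams into one.
stream1 = homogeneous_arrivals(lam1, T, rng)
stream2 = homogeneous_arrivals(lam2, T, rng)
merged = np.sort(np.concatenate([stream1, stream2]))
merged_rate = merged.size / T            # should be near lam1 + lam2

# Thinning: label each merged event as type A independently with probability p.
is_type_a = rng.random(merged.size) < p
type_a, type_b = merged[is_type_a], merged[~is_type_a]
rate_a = type_a.size / T                 # should be near (lam1 + lam2) * p
rate_b = type_b.size / T                 # should be near (lam1 + lam2) * (1 - p)

print(f"merged rate = {merged_rate:.3f}")
print(f"type A rate = {rate_a:.3f}, type B rate = {rate_b:.3f}")
```

The labels must be assigned independently of the arrival times and of each other; otherwise the two sub-streams are not guaranteed to be independent Poisson processes.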

Interactive: Operations

Experiment with superposition (merging streams) and thinning (splitting by type). Observe how the rates add in superposition and split proportionally in thinning.


Superposition Theorem

If N₁(t) and N₂(t) are independent Poisson processes with rates λ₁ and λ₂, then their superposition N(t) = N₁(t) + N₂(t) is a Poisson process with rate λ = λ₁ + λ₂.

PP(λ₁) + PP(λ₂) = PP(λ₁ + λ₂)

This extends to any finite number of independent Poisson processes.

Inhomogeneous Poisson Processes

When the arrival rate varies with time, we have an inhomogeneous (non-homogeneous) Poisson process with time-varying intensity function $\lambda(t)$.

Inhomogeneous Poisson Process

N(t) \sim \text{Poisson}\left(\Lambda(t)\right) \text{ where } \Lambda(t) = \int_0^t \lambda(s) \, ds

$\Lambda(t)$ is the cumulative intensity (or integrated rate)

Real-world examples of time-varying intensity:

Call centers: Peak hours in morning and afternoon, low at night
Website traffic: Spikes during promotions, lower on weekends
Hospital admissions: Seasonal flu patterns, weekly cycles
Financial markets: Higher volatility at market open/close

The Thinning Algorithm

To simulate an inhomogeneous Poisson process, we use the thinning algorithm (Lewis & Shedler, 1979):

1. Find $\lambda_{\max} = \sup_t \lambda(t)$, the maximum intensity
2. Generate events from a homogeneous Poisson process with rate $\lambda_{\max}$
3. For each event at time $t$, accept it with probability $\lambda(t)/\lambda_{\max}$
4. The accepted events form the inhomogeneous Poisson process
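
The steps above can be sketched in vectorized form and checked against the cumulative intensity Λ(T). This sketch assumes NumPy; the sinusoidal `intensity` function, its period, and the helper name `thinning` are illustrative assumptions, and the bound `lam_max` is supplied analytically rather than searched for.

```python
import numpy as np

def thinning(intensity, lam_max: float, T: float, rng: np.random.Generator) -> np.ndarray:
    """Thinning on [0, T]; requires intensity(t) <= lam_max for all t in [0, T]."""
    # Step 2: candidates from a homogeneous process with rate lam_max
    # (Poisson count, then sorted uniforms).
    n = rng.poisson(lam_max * T)
    candidates = np.sort(rng.uniform(0.0, T, size=n))
    # Step 3: keep each candidate with probability intensity(t) / lam_max.
    accept = rng.random(n) < intensity(candidates) / lam_max
    return candidates[accept]

def intensity(t):
    """Hypothetical intensity: 5 * (1 + 0.8 * sin(2*pi*t / 10)), with maximum 9."""
    return 5.0 * (1.0 + 0.8 * np.sin(2.0 * np.pi * t / 10.0))

rng = np.random.default_rng(5)
T, lam_max = 10.0, 9.0
counts = np.array([thinning(intensity, lam_max, T, rng).size for _ in range(4000)])

# Lambda(T) is the integral of the intensity over [0, T]; the sine term
# integrates to zero over one full period, leaving 5 * T.
cumulative_intensity = 5.0 * T
print(f"Lambda(T)       = {cumulative_intensity:.2f}")
print(f"mean of N(T)    = {counts.mean():.2f}")
print(f"variance of N(T)= {counts.var():.2f}")
```

Since N(T) ~ Poisson(Λ(T)), both the mean and the variance of the simulated counts should be close to Λ(T). A loose bound `lam_max` stays correct but wastes candidates, because the acceptance rate is Λ(T) / (lam_max · T).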

Interactive: Time-Varying Rates

Visualize inhomogeneous Poisson processes with different intensity functions. Toggle "Show rejected events" to see the thinning algorithm in action.


The Thinning Algorithm (Lewis & Shedler, 1979)

1. Find λ_max = sup_t λ(t), the maximum intensity over the time interval
2. Generate events from a homogeneous Poisson process with rate λ_max
3. For each event at time t, accept it with probability λ(t)/λ_max
4. The accepted events form an inhomogeneous Poisson process with intensity λ(t)

Compound Poisson Processes

A compound Poisson process generalizes the Poisson process by adding random "jump sizes" at each arrival. Instead of counting events, we aggregate random amounts:

Compound Poisson Process

S(t) = \sum_{i=1}^{N(t)} X_i

N(t): Poisson(λt) count process

X_i: i.i.d. jump sizes with mean μ

Property	Formula	Notes
Mean	E[S(t)] = λμt	Wald's equation
Variance	Var(S(t)) = λE[X²]t	Includes jump size variability
MGF	M_S(s) = exp(λt(M_X(s)-1))	Composition of MGFs
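
The mean and variance formulas can be checked by simulation. The sketch below assumes NumPy and exponentially distributed jump sizes, for which E[X²] = 2μ²; the jump distribution and all parameter values are illustrative choices, not requirements of the model.

```python
import numpy as np

rng = np.random.default_rng(6)
lam, t = 4.0, 10.0        # arrival rate and horizon (illustrative)
mu = 100.0                # mean jump size; exponential jumps assumed
n_paths = 20_000

# For each path: draw N(t), then sum that many i.i.d. jumps to get S(t).
counts = rng.poisson(lam * t, size=n_paths)
totals = np.array([rng.exponential(mu, size=k).sum() for k in counts])

mean_theory = lam * mu * t               # lambda * mu * t
var_theory = lam * (2.0 * mu**2) * t     # lambda * E[X^2] * t

print(f"mean: simulated = {totals.mean():.1f}, theory = {mean_theory:.1f}")
print(f"var : simulated = {totals.var():.0f}, theory = {var_theory:.0f}")
```

Note that the variance uses the second moment E[X²], not Var(X): even constant jumps contribute variance through the random number of arrivals.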

Key applications of compound Poisson processes:

Insurance: Claims arrive as a Poisson process; claim sizes are random
Finance: Jump-diffusion models for asset prices with random price jumps
Retail: Customers arrive randomly; purchase amounts vary
Inventory: Demand arrives as Poisson; order quantities are random

Interactive: Aggregate Claims

Simulate a compound Poisson process representing aggregate claims, purchases, or other cumulative random processes. Compare different jump size distributions.


Compound Poisson Process Definition

S(t) = \sum_{i=1}^{N(t)} X_i

where N(t) ~ Poisson(λt) and X_i are i.i.d. with E[X] = μ

Key Properties

Mean: E[S(t)] = λ·μ·t

Variance: Var(S(t)) = λ·E[X²]·t

MGF: M_{S(t)}(s) = exp(λt(M_X(s) - 1))

Applications in Machine Learning

Poisson processes are fundamental to many modern ML systems that deal with temporal events:

🧠 Neural Point Processes

Instead of hand-crafting intensity functions, neural networks learn $\lambda(t)$ from data. Recurrent neural networks capture temporal dependencies, while transformers model long-range interactions. Used for predicting user actions, medical events, and financial transactions.

🔄 Continuous Normalizing Flows

Neural ODEs and continuous normalizing flows have been used to parameterize continuous-time densities and intensities in point process models. Sampling from a learned intensity can rely on thinning-style rejection sampling.

📊 Anomaly Detection

Model normal event patterns as a Poisson process. Deviations from the expected rate signal anomalies: fraud detection (unusual transaction patterns), system monitoring (burst traffic), and cybersecurity (attack detection).
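
A minimal sketch of rate-based anomaly detection, using only the standard library: compute the upper-tail probability of the observed count under the baseline Poisson model and flag the window if that probability is below a threshold. The function names and the threshold `alpha` are hypothetical choices for illustration; a production system would also need to account for testing many windows.

```python
import math

def poisson_upper_tail(k: int, mean: float) -> float:
    """P(N >= k) for N ~ Poisson(mean), computed as 1 - P(N <= k - 1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mean)        # P(N = 0)
    cdf = term
    for i in range(1, k):
        term *= mean / i          # P(N = i) from P(N = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def is_anomalous(observed: int, rate: float, window: float, alpha: float = 0.001) -> bool:
    """Flag a window whose count is improbably high under Poisson(rate * window)."""
    return poisson_upper_tail(observed, rate * window) < alpha

# Baseline of 8 events per hour, one-hour windows.
print(is_anomalous(10, rate=8.0, window=1.0))
print(is_anomalous(25, rate=8.0, window=1.0))
```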

🎮 Reinforcement Learning

Poisson processes model random events in environment dynamics: customer arrivals in inventory management, opponent actions in games, and resource availability in scheduling. Semi-Markov decision processes use inter-event time distributions.

Real-World Poisson Scenarios

Select a scenario to see how Poisson distribution models real-world event counting. Run simulations and calculate probabilities for planning and decision-making.

📞 Call Center: incoming calls to a customer service hotline

Planning questions: How many agents are needed? Peak-hour staffing. Average hold time planning.

Rate (λ): 8 calls/hour, so E[X] = λ = 8.0 and Var(X) = λ = 8.0

Probability calculator, query value k = 10:

P(X = 10) = 9.926%
P(X ≤ 10) = 81.59%
P(X > 10) = 18.41%
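
The calculator values for λ = 8 and k = 10 can be reproduced directly from the Poisson formula. This sketch uses only the standard library; `poisson_pmf` and `poisson_cdf` are illustrative helper names.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) as a finite sum of pmf terms."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam, k = 8.0, 10
p_eq = poisson_pmf(k, lam)
p_le = poisson_cdf(k, lam)
p_gt = 1.0 - p_le

print(f"P(X = {k})  = {p_eq:.3%}")   # 9.926%
print(f"P(X <= {k}) = {p_le:.2%}")   # 81.59%
print(f"P(X > {k})  = {p_gt:.2%}")   # 18.41%
```

For staffing, P(X > 10) is the quantity of interest: with capacity for 10 calls per hour, demand would exceed capacity in roughly 18% of hours.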

Python Implementation

Let's implement Poisson processes from scratch, including homogeneous, inhomogeneous, and compound variants.

Poisson Process Implementation

🐍 poisson_process.py

Explanation

• Import NumPy: NumPy provides efficient array operations and random number generation for simulating stochastic processes.
• Class Definition: We implement the Poisson process as a class that stores the rate parameter and generates samples on demand.
• Constructor: The rate parameter λ determines the expected number of events per unit time. Higher λ means more frequent arrivals.
• Simulate Method: Generate a realization of the Poisson process up to time T. Returns both arrival times and event count.
• Exponential Inter-arrivals: The key simulation technique is to generate exponential inter-arrival times. The code calls np.random.exponential with scale 1/λ, which is equivalent in distribution to the inverse transform -log(U)/λ where U ~ Uniform(0,1). Example: for λ=2, the expected inter-arrival time is 0.5 time units.
• Event Loop: Continue generating events until we exceed time T. Each iteration adds one event at current_time.
• Count Events: Count events in an arbitrary interval from s to t. The count N(t) - N(s) is distributed as Poisson(λ(t-s)).
• Inhomogeneous Simulation: Simulate a non-homogeneous Poisson process with time-varying intensity λ(t) using the thinning algorithm.
• Find Maximum Rate: The thinning algorithm requires λ_max = sup_t λ(t). We evaluate the intensity at many grid points to approximate this.
• Generate with Max Rate: First generate events from a homogeneous Poisson process with rate λ_max. This gives us candidate events.
• Thinning Step: Accept each candidate event at time t with probability λ(t)/λ_max. This produces the correct time-varying intensity.
• Compound Process: A compound Poisson process adds random jump sizes at each arrival. S(t) = ∑X_i where N(t) is Poisson and X_i are i.i.d.
• Generate Jumps: At each event, sample a random jump size from the specified distribution (e.g., exponential for insurance claims).

Code

import numpy as np
from typing import Callable, Tuple


class PoissonProcess:
    """Homogeneous Poisson process with constant rate λ."""

    def __init__(self, rate: float):
        if rate <= 0:
            raise ValueError("rate must be positive")
        self.rate = rate  # λ: events per unit time

    def simulate(self, T: float) -> Tuple[np.ndarray, int]:
        """Simulate the process on [0, T]; return (arrival times, event count)."""
        arrivals = []
        current_time = 0.0

        # Add exponential inter-arrival times until the horizon T is passed.
        # NumPy's exponential() takes the scale (mean) 1/λ, not the rate λ.
        while True:
            current_time += np.random.exponential(1 / self.rate)
            if current_time > T:
                break
            arrivals.append(current_time)

        return np.array(arrivals), len(arrivals)

    def count_in_interval(self, arrivals: np.ndarray, s: float, t: float) -> int:
        """Count events in (s, t], i.e. N(t) - N(s)."""
        return int(np.sum((arrivals > s) & (arrivals <= t)))


def simulate_inhomogeneous(intensity_fn: Callable[[float], float], T: float) -> np.ndarray:
    """Simulate an inhomogeneous Poisson process on [0, T] using thinning."""
    # Approximate λ_max = sup λ(t) on a grid. A grid can miss a narrow peak,
    # so this is only an approximation of the true supremum.
    time_grid = np.linspace(0, T, 1000)
    lambda_max = max(intensity_fn(t) for t in time_grid)
    if lambda_max <= 0:
        return np.array([])  # zero intensity everywhere on the grid: no events

    # Generate candidate events from a homogeneous process with rate λ_max
    pp = PoissonProcess(lambda_max)
    candidates, _ = pp.simulate(T)

    # Thin: accept each candidate with probability λ(t) / λ_max
    accepted = []
    for t in candidates:
        accept_prob = intensity_fn(t) / lambda_max
        if np.random.random() < accept_prob:
            accepted.append(t)

    return np.array(accepted)


def simulate_compound(rate: float, T: float,
                      jump_dist: Callable[[], float]) -> Tuple[np.ndarray, np.ndarray]:
    """Simulate compound Poisson process S(t) = ∑ X_i on [0, T].

    Returns the arrival times and the value of S just after each arrival.
    """
    pp = PoissonProcess(rate)
    arrivals, n = pp.simulate(T)

    # Draw one jump size per arrival and accumulate them
    jumps = np.array([jump_dist() for _ in range(n)])
    cumulative = np.cumsum(jumps)

    return arrivals, cumulative

In Practice: Use libraries like scipy.stats for the Poisson and exponential distributions, or specialized packages like tick for point-process models. Note that pytorch-geometric-temporal targets temporal graph neural networks rather than point processes.

Knowledge Check

Test your understanding of Poisson processes with these questions:


Inter-arrival Times

What distribution do inter-arrival times follow in a homogeneous Poisson process with rate λ?

Summary

Key Takeaways

✅ A Poisson process has independent increments, stationary increments, and orderliness.

✅ Inter-arrival times are i.i.d. Exponential(λ); the n-th arrival time is Gamma(n, 1/λ).

✅ Superposition adds independent Poisson processes; thinning splits them.

✅ The thinning algorithm simulates inhomogeneous Poisson processes.

✅ Compound Poisson processes aggregate random jump sizes at Poisson arrivals.

✅ Given N(T)=n events, the arrival times are distributed as the order statistics of n Uniform(0,T) r.v.s.

✅ Neural point processes learn intensity functions from data for temporal prediction.

✅ Applications: queuing, insurance, finance, anomaly detection, RL.

What's Next

You've now completed Chapter 25 on Stochastic Processes! The next chapter on Probabilistic Graphical Models will show how to represent complex dependencies between random variables using graphs—combining probability theory with graph theory to build interpretable models for reasoning under uncertainty.