Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will be able to:

Understand the physical intuition behind the Weibull distribution: why things fail, age, and wear out over time
Master the two parameters: shape $k$ controls failure behavior, scale $\lambda$ controls timing
Interpret the hazard function: increasing (wear-out), decreasing (burn-in), or constant (random)
Recognize the bathtub curve: how products fail in three phases (infant mortality, useful life, wear-out)
Connect to exponential: understand Weibull as a generalization (k=1 gives exponential)
Apply to real problems: reliability engineering, survival analysis, wind speed modeling
Implement in Python using scipy.stats for practical calculations
Apply in AI/ML: survival prediction, time-to-event modeling, physics-informed neural networks

The Big Picture: A Story of Failure and Survival

"The Weibull distribution answers the question: How do things fail? Randomly? From aging? From manufacturing defects?"

Historical Context

In 1939, Swedish engineer Waloddi Weibull was studying the strength of materials. He observed a fundamental problem: the exponential distribution assumed failure rate was constant, but real materials behaved differently:

Some materials failed early due to manufacturing defects
Some failed randomly with no aging pattern
Some failed late due to wear and fatigue

No single distribution could capture all these behaviors. Weibull created a flexible distribution with a shape parameter that could model all three failure modes.

The Challenge It Solves

Problem: The exponential distribution assumes a constant failure rate (memoryless). But in reality:

New lightbulbs sometimes fail immediately (infant mortality)
Old car engines wear out faster (increasing failure rate)
Electronic components in "burn-in" phase fail less over time (decreasing failure rate)

Weibull's Key Innovation

The shape parameter k controls failure behavior:

$k < 1$ : Failure rate DECREASES (burn-in/infant mortality)
$k = 1$ : Failure rate CONSTANT (exponential distribution)
$k > 1$ : Failure rate INCREASES (wear-out, aging)

Mathematical Definition

PDF (Probability Density Function)

$$f(x; k, \lambda) = \frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k-1} \exp\left(-\left(\frac{x}{\lambda}\right)^k\right), \quad x \geq 0$$

CDF (Cumulative Distribution Function)

$$F(x; k, \lambda) = 1 - \exp\left(-\left(\frac{x}{\lambda}\right)^k\right), \quad x \geq 0$$

Survival Function (Reliability)

$$S(x) = P(X > x) = \exp\left(-\left(\frac{x}{\lambda}\right)^k\right)$$

Hazard Function (Instantaneous Failure Rate)

$$h(x) = \frac{f(x)}{S(x)} = \frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k-1}$$

Symbol Table

Symbol	Name	Meaning	Range
x	Random variable	Time-to-failure	[0, ∞)
k (or β)	Shape parameter	Controls hazard behavior	(0, ∞)
λ (or η)	Scale parameter	Characteristic life	(0, ∞)

Key Statistics

Statistic	Formula	Intuition
Mean	λΓ(1 + 1/k)	Average time-to-failure
Median	λ(ln 2)^(1/k)	50% of items fail before this
Mode	λ((k-1)/k)^(1/k) for k>1, else 0	Most likely failure time
Variance	λ²[Γ(1+2/k) - Γ²(1+1/k)]	Spread of failure times
P(X > λ)	e^(-1) ≈ 36.8%	Always 36.8%, regardless of k
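
The closed-form statistics in the table can be evaluated directly. The following is a minimal sketch using only the Python standard library; the helper name `weibull_summary` and the example values k = 2, λ = 2 are illustrative choices, not part of any library.

```python
import math


def weibull_summary(k, lam):
    """Evaluate the closed-form Weibull statistics for shape k and scale lam."""
    g1 = math.gamma(1 + 1 / k)
    g2 = math.gamma(1 + 2 / k)
    variance = lam ** 2 * (g2 - g1 ** 2)
    return {
        "mean": lam * g1,
        "median": lam * math.log(2) ** (1 / k),
        # The mode is at 0 when k <= 1 (density is largest at the origin).
        "mode": lam * ((k - 1) / k) ** (1 / k) if k > 1 else 0.0,
        "variance": variance,
        "std": math.sqrt(variance),
        # P(X > lam) = exp(-(lam/lam)^k) = exp(-1), whatever k is.
        "surv_at_scale": math.exp(-1.0),
    }


# Illustrative parameters (assumed for this example)
summary = weibull_summary(k=2.0, lam=2.0)
for name, value in summary.items():
    print(f"{name:>14}: {value:.4f}")
```

For k = 2 and λ = 2 this gives a mean of about 1.7725, a median of about 1.6651, a mode of about 1.4142, and a standard deviation of about 0.9265.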

The Scale Parameter λ

The scale parameter $\lambda$ is called the characteristic life because a fraction $1 - e^{-1} \approx 63.2\%$ of items fail before time $\lambda$, regardless of the shape parameter $k$. This makes $\lambda$ a universal "reference point" for comparing Weibull distributions.

Interactive PDF Explorer

Explore how the Weibull distribution changes with different parameters. Pay special attention to how the shape parameter $k$ affects the distribution shape:

Weibull Distribution Explorer (interactive widget). Sliders: shape k from 0.3 to 5 (k = 1 marks the exponential case) and scale λ from 0.5 to 5. Display options: PDF (density), CDF (failure probability), survival function, and mean/median/mode markers.

At the default setting k = 2.00, λ = 2.00 (increasing failure rate, wear-out), the explorer reports:

Statistic	Value	Formula
Mode	1.4142	λ((k-1)/k)^(1/k)
Median	1.6651	λ(ln 2)^(1/k)
Mean	1.7725	λΓ(1 + 1/k)
Std Dev	0.9265	λ√(Γ(1+2/k) - Γ²(1+1/k))
P(T > λ)	36.8%	e^(-1), regardless of k

Key Insight: The Shape Parameter k Controls Everything

With k = 2.00, the distribution becomes more symmetric and bell-shaped. The special case k = 2 is the Rayleigh distribution, used for wind speeds and the magnitude of 2D random vectors.

Things to Try

Set k = 1 to see the exponential distribution (constant failure rate)
Set k = 0.5 to see the decreasing hazard (infant mortality)
Set k = 3.6 to see how Weibull approximates the normal distribution
Toggle the survival function to see reliability curves
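
The claim that k ≈ 3.6 looks nearly normal can be checked through the skewness, which is 2 for the exponential case and close to zero for a symmetric, bell-like shape. This is a small sketch using `scipy.stats.weibull_min.stats`; the particular list of k values is an arbitrary choice for illustration.

```python
from scipy import stats

# Skewness of Weibull(k, lambda) depends only on the shape k, not on the scale.
skew_by_k = {}
for k in (0.5, 1.0, 2.0, 3.6, 10.0):
    skew_by_k[k] = float(stats.weibull_min.stats(k, moments="s"))
    print(f"k = {k:>4}: skewness = {skew_by_k[k]:+.4f}")
```

The skewness is positive for small k, passes through roughly zero near k = 3.6, and turns negative for larger k, so "near-normal" refers to this symmetric crossover rather than to an exact match.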

The Shape Parameter k: The Heart of Weibull

The shape parameter $k$ is what makes Weibull so powerful. It controls whether failure rate increases, decreases, or stays constant.

Three Regimes

k Value	Hazard Behavior	Physical Meaning	Example
k < 1	Decreasing	Infant mortality / burn-in	Software bugs, manufacturing defects
k = 1	Constant	Random failures (exponential)	Cosmic ray damage, accidents
k > 1	Increasing	Wear-out / aging	Mechanical fatigue, corrosion
k = 2	Linear increase	Rayleigh distribution	Wind speeds, 2D random vectors
k ≈ 3.6	Bell-shaped	Near-normal distribution	When positive values needed

The Intuition

Think of the exponent $(k-1)$ in the hazard function $h(t) = (k/\lambda)(t/\lambda)^{k-1}$:

$k-1 < 0$ : Power is negative → h(t) decreases with time
$k-1 = 0$ : Power is zero → h(t) is constant
$k-1 > 0$ : Power is positive → h(t) increases with time
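
The three cases can be seen numerically by evaluating the hazard at a few times. This is a minimal sketch; the function name `weibull_hazard`, the time grid, and the scale λ = 2 are assumptions made for illustration.

```python
import numpy as np


def weibull_hazard(t, k, lam):
    """Weibull hazard h(t) = (k / lam) * (t / lam)**(k - 1) for t > 0."""
    t = np.asarray(t, dtype=float)
    return (k / lam) * (t / lam) ** (k - 1)


times = np.array([0.5, 1.0, 2.0, 4.0])  # illustrative time grid
hazards = {k: weibull_hazard(times, k, lam=2.0) for k in (0.5, 1.0, 2.0)}

for k, h in hazards.items():
    print(f"k = {k}: h(t) = {np.round(h, 4)}")
```

With k = 0.5 the values fall as t grows, with k = 1 they stay at 1/λ = 0.5, and with k = 2 they rise linearly in t.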

The Hazard Function: Understanding Failure Rates

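The hazard function h(t) is the instantaneous failure rate at time t, given that the item has survived up to that time. For a small interval Δt, h(t)Δt approximates the probability of failing in the next Δt given survival to t, so h(t) is a rate and can exceed 1. This is one of the most important concepts in reliability engineering.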

Hazard Rate Explorer (interactive widget): a shape slider for k from 0.3 to 4.0 (default 2.00) shows the hazard changing from decreasing to increasing.

k < 1 (↓): Decreasing hazard, "infant mortality". Failures become less frequent as weak items are eliminated early.
k = 1 (→): Constant hazard. This is the exponential distribution: purely random, memoryless failures.
k > 1 (↑): Increasing hazard, "wear-out". Older items fail more often due to aging and fatigue.

For Weibull, the hazard is

h(t) = (k/λ) × (t/λ)^(k-1)

and the exponent (k-1) determines the behavior:
• k-1 < 0 (k<1): Power is negative → h(t) decreases
• k-1 = 0 (k=1): Power is zero → h(t) is constant and equal to 1/λ
• k-1 > 0 (k>1): Power is positive → h(t) increases

Why Hazard Matters

Maintenance scheduling: High hazard → more frequent inspections
Warranty decisions: Set warranty to cover low-hazard period
Spare parts inventory: Plan based on expected failure rates
Design improvements: Identify failure modes from hazard patterns

The Bathtub Curve: Product Lifecycle Reliability

One of the most famous applications of Weibull is the bathtub curve, which describes the failure rate of products over their entire lifecycle.

Bathtub curve (interactive figure): failure rate versus time across the three phases, with an option to show the component distributions. Example shown: electronic components, with high early failure (screening needed), a long useful life, and eventual wear-out from electromigration.

The Three Phases

Phase I: Infant Mortality (k < 1)
High initial failure rate due to manufacturing defects, assembly errors, and weak components. Mitigated by burn-in testing and quality control.
Phase II: Useful Life (k = 1, exponential)
Low, constant failure rate. Random failures unrelated to age. This is the longest phase with the lowest failure rate, and it is where warranty periods are typically set.
Phase III: Wear-out (k > 1)
Increasing failure rate due to aging, fatigue, corrosion, and degradation. Preventive maintenance schedules target this phase.
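
A single Weibull hazard is monotone, so a bathtub shape needs more than one regime. One common construction, assumed here because the text does not specify one, treats the three phases as independent competing failure mechanisms and adds their hazards. The parameter values below are hypothetical and chosen only to make the shape visible.

```python
import numpy as np


def weibull_hazard(t, k, lam):
    """Weibull hazard h(t) = (k / lam) * (t / lam)**(k - 1) for t > 0."""
    t = np.asarray(t, dtype=float)
    return (k / lam) * (t / lam) ** (k - 1)


# Hypothetical (shape, scale in hours) for each competing failure mechanism
PHASES = {
    "infant mortality": (0.5, 200.0),
    "useful life": (1.0, 1000.0),
    "wear-out": (4.0, 800.0),
}


def bathtub_hazard(t):
    """Total hazard of independent competing risks is the sum of their hazards."""
    return sum(weibull_hazard(t, k, lam) for k, lam in PHASES.values())


t_grid = np.array([1.0, 10.0, 100.0, 400.0, 800.0, 1200.0])
h_total = bathtub_hazard(t_grid)
for t, h in zip(t_grid, h_total):
    print(f"t = {t:7.1f} h: total hazard = {h:.5f} per hour")
```

The total hazard starts high, bottoms out in the middle of the grid, and climbs again once the k = 4 term dominates, which is the bathtub pattern described above.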

Engineering Insight

The bathtub curve explains why:

Burn-in testing weeds out Phase I failures before shipping
Extended warranties cover Phase I but rarely Phase III
Preventive replacement happens before Phase III begins

Key Properties

Property 1: The 63.2% Rule

At time $t = \lambda$, about 63.2% of items have failed and 36.8% survive, regardless of $k$:

$$S(\lambda) = e^{-(\lambda/\lambda)^k} = e^{-1} \approx 0.368$$

Property 2: Percentile Life

The p-th percentile life (the time by which a fraction p of items has failed) is obtained by solving $F(t_p) = p$:

$$t_p = \lambda \left(-\ln(1-p)\right)^{1/k}$$

For example, the B10 life (10% failure) commonly used in bearing specifications is:

$$t_{0.10} = \lambda (-\ln(0.90))^{1/k} \approx \lambda (0.105)^{1/k}$$
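
The percentile-life formula is easy to check by plugging the result back into the CDF. A minimal sketch follows; the helper names and the parameters k = 2, λ = 5 are illustrative assumptions.

```python
import math


def percentile_life(p, k, lam):
    """Time by which a fraction p (0 < p < 1) of items has failed."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be a fraction strictly between 0 and 1")
    return lam * (-math.log(1.0 - p)) ** (1.0 / k)


def failure_fraction(t, k, lam):
    """Weibull CDF F(t) = 1 - exp(-(t / lam)**k)."""
    return 1.0 - math.exp(-((t / lam) ** k))


k, lam = 2.0, 5.0  # illustrative parameters
b10 = percentile_life(0.10, k, lam)
b50 = percentile_life(0.50, k, lam)

print(f"B10 life: {b10:.4f}  (check: F(B10) = {failure_fraction(b10, k, lam):.4f})")
print(f"B50 life: {b50:.4f}  (check: F(B50) = {failure_fraction(b50, k, lam):.4f})")
```

Because the function is the inverse of the CDF, feeding `b10` back into `failure_fraction` returns 0.10 up to floating-point error.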

Property 3: Minimum of Weibulls

If $X_1, X_2, \ldots, X_n$ are independent Weibull with the same shape $k$ but possibly different scales $\lambda_i$, then $\min(X_1, \ldots, X_n)$ is also Weibull with:

$$\text{Shape} = k, \quad \text{Scale} = \left(\sum_{i=1}^n \lambda_i^{-k}\right)^{-1/k}$$

This is crucial for series systems where the system fails when the first component fails.
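
The series-system result follows because independent survival probabilities multiply, so the exponents $(t/\lambda_i)^k$ add. The sketch below checks this numerically; the three component scales and the shape k = 1.5 are hypothetical values.

```python
import math


def survival(t, k, lam):
    """Weibull survival function S(t) = exp(-(t / lam)**k)."""
    return math.exp(-((t / lam) ** k))


def series_scale(scales, k):
    """Scale of min(X_1, ..., X_n) for independent Weibulls sharing shape k."""
    return sum(lam ** (-k) for lam in scales) ** (-1.0 / k)


k = 1.5
component_scales = [800.0, 1200.0, 2000.0]  # hypothetical, in hours
system_scale = series_scale(component_scales, k)
print(f"System scale: {system_scale:.2f} hours")

gaps = []
for t in (100.0, 500.0, 1000.0):
    product = math.prod(survival(t, k, lam) for lam in component_scales)
    combined = survival(t, k, system_scale)
    gaps.append(abs(product - combined))
    print(f"t = {t:6.0f}: product of S_i = {product:.6f}, S_system = {combined:.6f}")
```

The system scale is always smaller than the smallest component scale, since adding components can only make the first failure arrive sooner.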

Real-World Applications

Example 1: Wind Speed Distribution

Problem: Wind speeds at a site follow Weibull(k=2, λ=8 m/s). What is the probability of wind speed exceeding 12 m/s?

Solution:

$$P(V > 12) = \exp\left(-\left(\frac{12}{8}\right)^2\right) = \exp(-2.25) \approx 0.105$$

About 10.5% of the time, wind speed exceeds 12 m/s.

Why k = 2 for Wind?

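Wind speeds often follow Weibull with k ≈ 2 (the Rayleigh distribution). One explanation is that wind speed is the magnitude of a 2D horizontal velocity vector: if the two components are independent, zero-mean Gaussians with equal variance, the magnitude is exactly Rayleigh. This is fundamental to wind turbine design and energy estimation.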
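
The wind-speed calculation and the Rayleigh connection can both be checked with SciPy. This sketch assumes the standard correspondence between a Weibull with k = 2 and scale λ and a Rayleigh with scale σ = λ/√2, and uses `scipy.stats.weibull_min` and `scipy.stats.rayleigh`.

```python
import math

from scipy import stats

k, lam = 2.0, 8.0  # site parameters from the example (m/s)
threshold = 12.0   # wind speed of interest (m/s)

wind = stats.weibull_min(c=k, scale=lam)
rayleigh = stats.rayleigh(scale=lam / math.sqrt(2))

p_weibull = float(wind.sf(threshold))       # P(V > 12) under Weibull(k=2)
p_rayleigh = float(rayleigh.sf(threshold))  # same probability under Rayleigh
p_formula = math.exp(-((threshold / lam) ** k))

print(f"Weibull sf:   {p_weibull:.4f}")
print(f"Rayleigh sf:  {p_rayleigh:.4f}")
print(f"Closed form:  {p_formula:.4f}")
```

All three numbers agree, which confirms that the k = 2 Weibull and the Rayleigh are the same distribution under this scale conversion.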

Example 2: Bearing Life

Problem: Ball bearings have Weibull life with k = 2 and 90% reliability at 1000 hours (B10 life = 1000 hours). Find the scale parameter λ.

Solution:

From $S(1000) = 0.90$ :

$$0.90 = \exp\left(-\left(\frac{1000}{\lambda}\right)^2\right)$$

$$-\ln(0.90) = \left(\frac{1000}{\lambda}\right)^2$$

$$\lambda = \frac{1000}{\sqrt{-\ln(0.90)}} = \frac{1000}{\sqrt{0.10536}} \approx 3081 \text{ hours}$$

The characteristic life is about 3081 hours, and the mean life is $3081 \times \Gamma(1.5) \approx 2730$ hours.
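
The same steps can be carried out in code by inverting the CDF for the scale. This is a minimal sketch with the standard library; the helper name `scale_from_percentile` is hypothetical.

```python
import math


def scale_from_percentile(t_p, p, k):
    """Solve F(t_p) = p for the Weibull scale, given the shape k."""
    return t_p / (-math.log(1.0 - p)) ** (1.0 / k)


k = 2.0
bearing_scale = scale_from_percentile(t_p=1000.0, p=0.10, k=k)  # B10 = 1000 h
bearing_mean = bearing_scale * math.gamma(1.0 + 1.0 / k)

# Sanity check: reliability at 1000 hours should come back as 0.90
reliability_at_b10 = math.exp(-((1000.0 / bearing_scale) ** k))

print(f"Scale (characteristic life): {bearing_scale:.1f} hours")
print(f"Mean life:                   {bearing_mean:.1f} hours")
print(f"S(1000):                     {reliability_at_b10:.4f}")
```

Note how far the characteristic life sits from the B10 life: with k = 2, the time by which 63.2% of bearings fail is roughly three times the time by which 10% fail.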

Example 3: Medical Survival Analysis

Problem: Time to cancer recurrence follows Weibull(k=1.5, λ=36 months). What is the 5-year survival probability?

Solution:

$$S(60) = \exp\left(-\left(\frac{60}{36}\right)^{1.5}\right) = \exp(-2.15) \approx 0.116$$

About 11.6% of patients remain recurrence-free at 5 years.

AI/ML Applications

1. Survival Analysis and Time-to-Event Prediction

Weibull is foundational in survival analysis, which has many ML applications:

Customer churn prediction: Time until customer leaves
Equipment failure: Predictive maintenance
User conversion: Time until first purchase
Clinical trials: Time to adverse event

Deep survival models often use Weibull-based loss functions:

```python
import torch
import torch.nn as nn


class WeibullSurvivalLoss(nn.Module):
    """Weibull negative log-likelihood loss for survival analysis."""

    def forward(self, log_k, log_lambda, t, event):
        """
        Args:
            log_k: Log shape parameter (neural network output)
            log_lambda: Log scale parameter (neural network output)
            t: Observed time (must be positive)
            event: 1 if event occurred, 0 if censored
        """
        k = torch.exp(log_k)
        lambda_ = torch.exp(log_lambda)

        # Log-likelihood components: log h(t) and cumulative hazard H(t)
        log_hazard = torch.log(k) - torch.log(lambda_) + \
                     (k - 1) * (torch.log(t) - torch.log(lambda_))
        cum_hazard = (t / lambda_) ** k

        # Negative log-likelihood: events contribute log h(t) - H(t),
        # censored observations contribute only -H(t)
        nll = -event * log_hazard + cum_hazard
        return nll.mean()
```

2. Physics-Informed Neural Networks (PINNs)

Weibull physics can be embedded into neural networks for reliability prediction:

Enforce monotonicity: Survival curves must decrease
Physical constraints: Hazard must be non-negative
Transfer learning: Use Weibull priors from similar components

3. Learning Rate Scheduling

Weibull-shaped learning rate decay can model training "aging":

```python
import numpy as np


def weibull_lr_schedule(epoch, initial_lr, k=2, lambda_=100):
    """
    Weibull-inspired learning rate decay.

    - Early epochs: Learning rate decreases slowly (k > 1)
    - Later epochs: More aggressive decay
    """
    survival = np.exp(-(epoch / lambda_) ** k)
    return initial_lr * survival


# Example: LR starts at 0.1, decays with k=2 shape
for epoch in [0, 10, 50, 100, 200]:
    lr = weibull_lr_schedule(epoch, 0.1, k=2, lambda_=100)
    print(f"Epoch {epoch}: LR = {lr:.6f}")
```

4. Anomaly Detection

Weibull models help detect anomalies by comparing observed lifetimes to expected distributions:

Component failing much earlier than expected → manufacturing defect
Component lasting much longer than expected → verify model assumptions
Change in hazard pattern → potential design change or environmental factor
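
One simple way to turn this comparison into a rule, offered here as a sketch rather than a standard method, is to flag a lifetime when it falls in either extreme tail of the fitted Weibull. The function names, the threshold `alpha = 0.01`, and the parameters k = 2, λ = 3000 hours are all assumptions for illustration.

```python
from scipy import stats


def classify_lifetime(t, k, lam, alpha=0.01):
    """Label a lifetime by which tail of Weibull(k, lam) it falls in, if any."""
    dist = stats.weibull_min(c=k, scale=lam)
    p_this_early = float(dist.cdf(t))  # probability of failing by time t
    p_this_late = float(dist.sf(t))    # probability of surviving past time t
    if p_this_early < alpha:
        return "early failure"
    if p_this_late < alpha:
        return "unusually long life"
    return "consistent with model"


k, lam = 2.0, 3000.0  # hypothetical fitted parameters (hours)
labels = {t: classify_lifetime(t, k, lam) for t in (150.0, 2500.0, 7000.0)}
for t, label in labels.items():
    print(f"lifetime {t:6.0f} h -> {label}")
```

A flagged early failure points toward a possible defect, while a flagged long life is a prompt to re-examine the fitted model, matching the interpretation in the list above.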

Connections to Other Distributions

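Relationship	Description
Weibull(k=1, λ) = Exponential(rate 1/λ)	Shape k=1 gives the exponential distribution with mean λ
Weibull(k=2, λ) = Rayleigh(σ = λ/√2)	Shape k=2 is the Rayleigh distribution
Weibull(k≈3.6) ≈ Normal	Nearly symmetric; approximates a normal shape for positive data
log(Weibull) = Gumbel (minimum)	If X is Weibull(k, λ), then ln X is Gumbel (minimum type) with location ln λ and scale 1/k
Not an exponential family when k is unknown	For fixed k it is one; with k free it is not, which complicates conjugate Bayesian analysis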

The Distribution Family Tree

Weibull sits in a family of distributions related by transformations:

Exponential → (generalize with shape) → Weibull
Weibull → (take log) → Gumbel (extreme value)
Normal → (sum of squares of two components) → Chi-squared (2 degrees of freedom) → (square root, scale) → Rayleigh
Rayleigh is Weibull(k=2), connecting to 2D random vectors
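
The log transform to Gumbel can be verified numerically. This sketch assumes the minimum-type (left-skewed) Gumbel, available in SciPy as `scipy.stats.gumbel_l`, with location ln λ and scale 1/k; the parameter values are illustrative.

```python
import numpy as np
from scipy import stats

k, lam = 1.5, 36.0  # illustrative parameters
weibull = stats.weibull_min(c=k, scale=lam)

# If X ~ Weibull(k, lam), then ln X ~ Gumbel-min(loc=ln lam, scale=1/k)
log_weibull = stats.gumbel_l(loc=np.log(lam), scale=1.0 / k)

t = np.array([5.0, 20.0, 36.0, 80.0])
cdf_direct = weibull.cdf(t)               # P(X <= t)
cdf_via_log = log_weibull.cdf(np.log(t))  # P(ln X <= ln t)
max_gap = float(np.max(np.abs(cdf_direct - cdf_via_log)))

print("Weibull CDF:      ", np.round(cdf_direct, 6))
print("Gumbel CDF of ln t:", np.round(cdf_via_log, 6))
print(f"Largest difference: {max_gap:.2e}")
```

The two CDFs coincide because $P(\ln X \le y) = 1 - \exp(-\exp(k(y - \ln\lambda)))$, which is exactly the minimum-type Gumbel CDF.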

Parameter Estimation

Maximum Likelihood Estimation

Given failure times $t_1, t_2, ..., t_n$ , the log-likelihood is:

$$\ell(k, \lambda) = n\ln k - nk\ln\lambda + (k-1)\sum_{i=1}^n \ln t_i - \sum_{i=1}^n \left(\frac{t_i}{\lambda}\right)^k$$

There is no closed-form solution. We solve numerically or use the Weibull plot method.
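
One numerical route, sketched here under stated assumptions, is to eliminate λ first. Setting $\partial\ell/\partial\lambda = 0$ gives $\lambda^k = \frac{1}{n}\sum t_i^k$, and substituting into $\partial\ell/\partial k = 0$ leaves a single equation in k: $\frac{\sum t_i^k \ln t_i}{\sum t_i^k} - \frac{1}{k} - \frac{1}{n}\sum \ln t_i = 0$. The code solves it with `scipy.optimize.brentq`; the function names and the bracket [0.05, 50] are illustrative choices, and the data are the twenty failure times used in the fitting example later in this section.

```python
import numpy as np
from scipy import optimize, stats

failure_times = np.array([1.2, 2.5, 3.1, 1.8, 4.2, 2.9, 5.1, 1.5, 2.2, 3.8,
                          2.7, 3.3, 4.5, 2.1, 3.9, 1.9, 4.8, 2.8, 3.5, 4.1])


def shape_score(k, t):
    """Profile score for k after substituting lambda^k = mean(t**k)."""
    tk = t ** k
    return np.sum(tk * np.log(t)) / np.sum(tk) - 1.0 / k - np.mean(np.log(t))


def weibull_mle(t):
    """Return (k_hat, lambda_hat) for complete (uncensored) failure times."""
    t = np.asarray(t, dtype=float)
    # The score is negative for small k and positive for large k,
    # so a root is bracketed by a wide interval (assumed bracket).
    k_hat = optimize.brentq(shape_score, 0.05, 50.0, args=(t,))
    lam_hat = np.mean(t ** k_hat) ** (1.0 / k_hat)
    return k_hat, lam_hat


k_hat, lam_hat = weibull_mle(failure_times)
k_ref, _, lam_ref = stats.weibull_min.fit(failure_times, floc=0)

print(f"Profile-likelihood MLE: k = {k_hat:.4f}, lambda = {lam_hat:.4f}")
print(f"scipy weibull_min.fit:  k = {k_ref:.4f}, lambda = {lam_ref:.4f}")
```

Reducing the problem to one dimension makes the root-finding robust, and the result should agree closely with SciPy's general-purpose fit.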

Weibull Plot (Graphical Method)

Transform data to linearize the CDF:

$$\ln\ln\left(\frac{1}{1-F(t)}\right) = k \ln t - k \ln \lambda$$

Plotting $\ln\ln(1/(1-\hat{F}(t_i)))$ vs $\ln t_i$ gives a line with slope $k$ and intercept $-k\ln\lambda$.
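
The graphical method can be reproduced with a least-squares line. This sketch needs an empirical estimate $\hat{F}(t_i)$, which the text does not specify; it assumes Bernard's median-rank approximation $(i - 0.3)/(n + 0.4)$, a common choice in reliability work. The function name `weibull_plot_fit` is hypothetical, and the data are the failure times from the fitting example in this section.

```python
import numpy as np

failure_times = np.array([1.2, 2.5, 3.1, 1.8, 4.2, 2.9, 5.1, 1.5, 2.2, 3.8,
                          2.7, 3.3, 4.5, 2.1, 3.9, 1.9, 4.8, 2.8, 3.5, 4.1])


def weibull_plot_fit(t):
    """Estimate (k, lambda) from a straight-line fit on Weibull plot axes."""
    t = np.sort(np.asarray(t, dtype=float))
    n = t.size
    ranks = np.arange(1, n + 1)
    f_hat = (ranks - 0.3) / (n + 0.4)  # assumed: Bernard's median ranks

    x = np.log(t)
    y = np.log(-np.log(1.0 - f_hat))   # ln ln(1 / (1 - F))
    slope, intercept = np.polyfit(x, y, 1)

    k_est = slope                            # slope = k
    lam_est = np.exp(-intercept / slope)     # intercept = -k ln(lambda)
    return k_est, lam_est


k_plot, lam_plot = weibull_plot_fit(failure_times)
print(f"Weibull plot estimate: k = {k_plot:.3f}, lambda = {lam_plot:.3f}")
```

The regression estimate generally differs somewhat from the MLE, but plotting the points also shows at a glance whether the data fall on a line, which is the main diagnostic value of the method.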

Python Implementation

Basic Operations with SciPy

```python
from scipy import stats
import numpy as np

# Weibull distribution in scipy
# IMPORTANT: scipy uses shape c (= k) and scale (= λ)
k = 2.0        # Shape parameter
lambda_ = 5.0  # Scale parameter

weibull = stats.weibull_min(c=k, scale=lambda_)

# PDF: f(t)
t = 3.0
pdf = weibull.pdf(t)
print(f"f({t}) = {pdf:.6f}")

# CDF: F(t) = P(T ≤ t)
cdf = weibull.cdf(t)
print(f"P(T ≤ {t}) = {cdf:.6f}")

# Survival: S(t) = P(T > t) = 1 - F(t)
survival = weibull.sf(t)
print(f"P(T > {t}) = {survival:.6f}")

# Hazard: h(t) = f(t) / S(t)
hazard = pdf / survival
print(f"h({t}) = {hazard:.6f}")

# Key statistics
print(f"Mean: {weibull.mean():.4f}")
print(f"Median: {weibull.median():.4f}")
print(f"Std Dev: {weibull.std():.4f}")

# Quantiles (percentile lives)
print(f"B10 life (10% fail): {weibull.ppf(0.10):.4f}")
print(f"B50 life (median):   {weibull.ppf(0.50):.4f}")

# Generate random samples
samples = weibull.rvs(size=1000)
print(f"Sample mean: {samples.mean():.4f}")
print(f"Sample median: {np.median(samples):.4f}")
```

Fitting Weibull to Data

```python
from scipy import stats
import numpy as np

# Example failure time data
failure_times = np.array([1.2, 2.5, 3.1, 1.8, 4.2, 2.9, 5.1, 1.5, 2.2, 3.8,
                          2.7, 3.3, 4.5, 2.1, 3.9, 1.9, 4.8, 2.8, 3.5, 4.1])

# Fit Weibull using MLE
# Returns (c, loc, scale) where c = shape, scale = λ
# floc=0 fixes the location at zero (two-parameter Weibull)
shape, loc, scale = stats.weibull_min.fit(failure_times, floc=0)
k_hat = shape
lambda_hat = scale

print(f"Estimated k (shape): {k_hat:.4f}")
print(f"Estimated λ (scale): {lambda_hat:.4f}")

# Verify with theoretical values
fitted_dist = stats.weibull_min(c=k_hat, scale=lambda_hat)
print(f"Fitted mean: {fitted_dist.mean():.4f}")
print(f"Sample mean: {failure_times.mean():.4f}")

# Goodness-of-fit test (Kolmogorov-Smirnov)
# Note: the parameters were estimated from the same data, so this
# p-value is optimistic and should be read as a rough check only.
ks_stat, p_value = stats.kstest(failure_times, 'weibull_min',
                                args=(k_hat, 0, lambda_hat))
print(f"KS statistic: {ks_stat:.4f}, p-value: {p_value:.4f}")
```

Reliability Analysis

```python
from scipy import stats
import numpy as np


def reliability_analysis(k, lambda_, target_reliability=0.90):
    """
    Comprehensive reliability analysis for a Weibull component.
    """
    weibull = stats.weibull_min(c=k, scale=lambda_)

    # Mean Time To Failure
    mttf = weibull.mean()

    # Percentile lives
    b10 = weibull.ppf(0.10)  # 10% failure
    b50 = weibull.ppf(0.50)  # 50% failure (median)

    # Time for target reliability
    t_reliable = weibull.ppf(1 - target_reliability)

    # Failure rate at MTTF
    hazard_at_mttf = (k / lambda_) * (mttf / lambda_) ** (k - 1)

    print("=== Weibull Reliability Analysis ===")
    print(f"Shape k = {k}, Scale λ = {lambda_}")
    print(f"MTTF: {mttf:.2f}")
    print(f"B10 Life: {b10:.2f}")
    print(f"B50 Life (Median): {b50:.2f}")
    print(f"Time for {target_reliability*100:.0f}% reliability: {t_reliable:.2f}")
    print(f"Hazard at MTTF: {hazard_at_mttf:.4f}")

    return {
        'mttf': mttf,
        'b10': b10,
        'b50': b50,
        't_reliable': t_reliable,
        'hazard_mttf': hazard_at_mttf
    }


# Example: Bearing with k=2, λ=3000 hours
results = reliability_analysis(k=2.0, lambda_=3000)
```

SciPy Parameterization

SciPy uses c for the shape parameter (what we call k) and scale for λ. Some textbooks use β instead of k and η instead of λ. Always check your parameterization!

Failure Time Simulation

Generate random failure times and estimate parameters from the simulated data. This helps you understand how well MLE works with different sample sizes.

Failure Time Simulator (interactive widget): generates failure times with default true k = 2.00, true λ = 3.00, and sample size 200, then shows how closely the parameters can be estimated from the sample.
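
The same experiment can be run offline. This is a minimal sketch that draws samples with NumPy's `Generator.weibull` (which samples the unit-scale Weibull, so the draws are multiplied by λ) and fits them with `scipy.stats.weibull_min.fit`; the seed and the list of sample sizes are arbitrary choices.

```python
import numpy as np
from scipy import stats

TRUE_K, TRUE_LAM = 2.0, 3.0
rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

estimates = {}
for n in (20, 200, 2000):
    # Generator.weibull draws from the standard Weibull (scale 1)
    sample = TRUE_LAM * rng.weibull(TRUE_K, size=n)
    k_hat, _, lam_hat = stats.weibull_min.fit(sample, floc=0)
    estimates[n] = (k_hat, lam_hat)
    print(f"n = {n:5d}: k_hat = {k_hat:.3f}, lambda_hat = {lam_hat:.3f}")
```

Estimates from the smallest sample can be noticeably off, while the largest sample should land close to the true values; rerunning with different seeds gives a feel for the sampling variability at each size.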

Common Pitfalls

Pitfall 1: Confusion with Parameterization

Different sources use different parameterizations. Common variants:

scipy.stats: c = shape (k), scale = λ
Some textbooks: β = shape, η = scale (characteristic life)
R's dweibull: shape = k, scale = λ

Always verify by checking that the mean formula matches your parameterization.

Pitfall 2: Extrapolation Danger

Weibull fits to observed data may not extrapolate well to extreme values. If you observe failures from 100-1000 hours, predicting behavior at 10,000 hours is risky—the physical failure mechanism might change.

Pitfall 3: Ignoring Censored Data

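In reliability testing, not all items fail before the test ends. "Censored" observations (items still working at the end of the test) require special handling: a censored item contributes its survival probability S(t) to the likelihood rather than the density f(t). A standard MLE fit assumes all observations are complete failures, so treating censored times as failures biases the estimates toward shorter lives.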
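
A right-censored likelihood can be sketched as follows: failures contribute $\ln h(t) - (t/\lambda)^k$ and censored items contribute only $-(t/\lambda)^k$. The code below is an illustration under assumptions, not a reference implementation: the function names, the simulated data (true k = 2, λ = 3, test ending at t = 3.5), and the use of Nelder-Mead on log-parameters are all choices made for this example.

```python
import numpy as np
from scipy import optimize


def censored_weibull_nll(params, t, event):
    """Negative log-likelihood with right censoring (event=1 failure, 0 censored)."""
    log_k, log_lam = params
    k = np.exp(log_k)
    log_hazard = log_k - log_lam + (k - 1.0) * (np.log(t) - log_lam)
    cum_hazard = (t / np.exp(log_lam)) ** k
    return -np.sum(event * log_hazard - cum_hazard)


def fit_censored_weibull(t, event):
    """Fit (k, lambda) by minimizing the censored NLL over log-parameters."""
    t = np.asarray(t, dtype=float)
    event = np.asarray(event, dtype=float)
    start = np.array([0.0, np.log(np.mean(t))])
    result = optimize.minimize(censored_weibull_nll, start, args=(t, event),
                               method="Nelder-Mead")
    return float(np.exp(result.x[0])), float(np.exp(result.x[1]))


# Simulated test: items still running at test_end are censored there
rng = np.random.default_rng(seed=1)
true_k, true_lam, test_end = 2.0, 3.0, 3.5
lifetimes = true_lam * rng.weibull(true_k, size=500)
event = (lifetimes <= test_end).astype(float)
observed = np.minimum(lifetimes, test_end)

k_cens, lam_cens = fit_censored_weibull(observed, event)
# Naive fit: pretend every censored time was an actual failure
k_naive, lam_naive = fit_censored_weibull(observed, np.ones_like(observed))

print(f"Censored fit: k = {k_cens:.3f}, lambda = {lam_cens:.3f}")
print(f"Naive fit:    k = {k_naive:.3f}, lambda = {lam_naive:.3f}")
```

The censoring-aware fit should recover parameters near the true values, while the naive fit underestimates the scale because it counts survivors as failures at the end of the test.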

Pitfall 4: Small Sample Bias

MLE estimates of k are biased in small samples. For n < 20, consider bias correction or use a Bayesian approach with informative priors.

Test Your Understanding

When the shape parameter k = 1, what distribution does Weibull become?

Summary

The Weibull distribution is a powerful and flexible model for time-to-failure data, with the shape parameter k controlling whether failures increase, decrease, or remain constant over time.

Key Formulas

Property	Formula
PDF	f(x) = (k/λ)(x/λ)^(k-1) exp(-(x/λ)^k)
CDF	F(x) = 1 - exp(-(x/λ)^k)
Survival	S(x) = exp(-(x/λ)^k)
Hazard	h(x) = (k/λ)(x/λ)^(k-1)
Mean	λΓ(1 + 1/k)
Median	λ(ln 2)^(1/k)
P(X > λ)	e^(-1) ≈ 36.8% (always)

Key Takeaways

Shape k controls failure behavior: k < 1 (decreasing), k = 1 (constant/exponential), k > 1 (increasing)
Scale λ is the characteristic life: 63.2% fail before λ, regardless of k
The bathtub curve combines three Weibull regimes to model complete product lifecycles
Weibull generalizes exponential: Setting k = 1 recovers the exponential distribution
Critical for reliability engineering: MTTF, B10 life, warranty analysis all use Weibull
AI/ML applications: Survival analysis, time-to-event prediction, physics-informed neural networks

The Bottom Line: When you need to model time-to-failure data with the flexibility to capture increasing, decreasing, or constant failure rates, Weibull is your go-to distribution. Its shape parameter makes it one of the most versatile tools in reliability engineering and survival analysis.

From Reliability Engineering to Deep Learning

Weibull connects classical reliability engineering to modern machine learning. Whether you're predicting customer churn, designing maintenance schedules, or building survival models, understanding Weibull gives you the mathematical foundation to model time-to-event phenomena across domains.