Learning Objectives
By the end of this section, you will be able to:
- Understand the physical intuition behind the Weibull distribution: why things fail, age, and wear out over time
- Master the two parameters: shape controls failure behavior, scale controls timing
- Interpret the hazard function: increasing (wear-out), decreasing (burn-in), or constant (random)
- Recognize the bathtub curve: how products fail in three phases (infant mortality, useful life, wear-out)
- Connect to exponential: understand Weibull as a generalization (k=1 gives exponential)
- Apply to real problems: reliability engineering, survival analysis, wind speed modeling
- Implement in Python using scipy.stats for practical calculations
- Apply in AI/ML: survival prediction, time-to-event modeling, physics-informed neural networks
The Big Picture: A Story of Failure and Survival
"The Weibull distribution answers the question: How do things fail? Randomly? From aging? From manufacturing defects?"
Historical Context
In 1939, Swedish engineer Waloddi Weibull was studying the strength of materials. He observed a fundamental problem: the exponential distribution assumed failure rate was constant, but real materials behaved differently:
- Some materials failed early due to manufacturing defects
- Some failed randomly with no aging pattern
- Some failed late due to wear and fatigue
No single distribution could capture all these behaviors. Weibull created a flexible distribution with a shape parameter that could model all three failure modes.
The Challenge It Solves
Problem: The exponential distribution assumes a constant failure rate (memoryless). But in reality:
- New lightbulbs sometimes fail immediately (infant mortality)
- Old car engines wear out faster (increasing failure rate)
- Electronic components in "burn-in" phase fail less over time (decreasing failure rate)
Weibull's Key Innovation
The shape parameter k controls failure behavior:
- k < 1: Failure rate DECREASES (burn-in/infant mortality)
- k = 1: Failure rate CONSTANT (exponential distribution)
- k > 1: Failure rate INCREASES (wear-out, aging)
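Mathematical Definition
PDF (Probability Density Function)
$$f(x; k, \lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^k}, \quad x \ge 0$$
CDF (Cumulative Distribution Function)
$$F(x; k, \lambda) = 1 - e^{-(x/\lambda)^k}$$
Survival Function (Reliability)
$$S(x) = 1 - F(x) = e^{-(x/\lambda)^k}$$
Hazard Function (Instantaneous Failure Rate)
$$h(x) = \frac{f(x)}{S(x)} = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}$$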
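As a quick check on these four functions, here is a minimal NumPy sketch that codes them directly from the closed forms in the Key Formulas table at the end of this section. The function names are illustrative choices, not part of any library, and the parameter values are arbitrary.

```python
import numpy as np

def weibull_pdf(x, k, lam):
    """Density f(x) = (k/lam) * (x/lam)**(k-1) * exp(-(x/lam)**k)."""
    z = x / lam
    return (k / lam) * z ** (k - 1) * np.exp(-z ** k)

def weibull_cdf(x, k, lam):
    """Probability of failure by time x."""
    return 1.0 - np.exp(-(x / lam) ** k)

def weibull_sf(x, k, lam):
    """Survival (reliability) function S(x) = 1 - F(x)."""
    return np.exp(-(x / lam) ** k)

def weibull_hazard(x, k, lam):
    """Instantaneous failure rate h(x) = f(x) / S(x)."""
    return (k / lam) * (x / lam) ** (k - 1)

x = np.linspace(0.5, 10.0, 20)   # arbitrary evaluation grid
k, lam = 1.5, 4.0                # arbitrary illustrative parameters

ratio = weibull_pdf(x, k, lam) / weibull_sf(x, k, lam)
print("h(x) equals f(x)/S(x):", np.allclose(ratio, weibull_hazard(x, k, lam)))
print("F(x) + S(x) equals 1: ", np.allclose(weibull_cdf(x, k, lam) + weibull_sf(x, k, lam), 1.0))
```

Coding the hazard separately rather than always dividing f by S avoids a 0/0 problem far in the right tail, where both the density and the survival function underflow.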
Symbol Table
| Symbol | Name | Meaning | Range |
|---|---|---|---|
| x | Random variable | Time-to-failure | [0, ∞) |
| k (or β) | Shape parameter | Controls hazard behavior | (0, ∞) |
| λ (or η) | Scale parameter | Characteristic life | (0, ∞) |
Key Statistics
| Statistic | Formula | Intuition |
|---|---|---|
| Mean | λΓ(1 + 1/k) | Average time-to-failure |
| Median | λ(ln 2)^(1/k) | 50% of items fail before this |
| Mode | λ((k-1)/k)^(1/k) for k>1, else 0 | Most likely failure time |
| Variance | λ²[Γ(1+2/k) - Γ²(1+1/k)] | Spread of failure times |
| P(X > λ) | e^(-1) ≈ 36.8% | Always 36.8%, regardless of k |
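The formulas in this table are easy to evaluate with the standard library alone. The sketch below uses `math.gamma` for Γ and also checks the last row, P(X > λ) = e^(-1), for several shape values. The helper names and the choice λ = 10 are assumptions made for illustration.

```python
import math

def weibull_stats(k, lam):
    """Mean, median, mode, and variance of Weibull(k, lam) from the table formulas."""
    g1 = math.gamma(1 + 1 / k)
    g2 = math.gamma(1 + 2 / k)
    return {
        "mean": lam * g1,
        "median": lam * math.log(2) ** (1 / k),
        "mode": lam * ((k - 1) / k) ** (1 / k) if k > 1 else 0.0,
        "variance": lam ** 2 * (g2 - g1 ** 2),
    }

def survival(x, k, lam):
    """S(x) = exp(-(x/lam)**k)."""
    return math.exp(-(x / lam) ** k)

lam = 10.0
for k in (0.5, 1.0, 2.0, 3.6):
    stats_k = {name: round(value, 3) for name, value in weibull_stats(k, lam).items()}
    print(f"k = {k}: {stats_k}, P(X > lam) = {survival(lam, k, lam):.4f}")
```

The survival probability at x = λ comes out the same for every k, while the mean, median, and mode move apart as the shape changes.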
The Scale Parameter λ
The scale parameter λ is called the characteristic life because 1 − e^(−1) ≈ 63.2% of items fail before time λ, regardless of the shape parameter k. This makes λ a universal "reference point" for comparing Weibull distributions.
Interactive PDF Explorer
Explore how the Weibull distribution changes with different parameters. Pay special attention to how the shape parameter affects the distribution shape:
Key Insight: The Shape Parameter k Controls Everything
With k = 2, the density has a single interior peak and only a moderate right skew (skewness about 0.63), so it starts to look bell-shaped. The special case k = 2 is the Rayleigh distribution, used for wind speeds and the magnitude of 2D random vectors.
Things to Try
- Set k = 1 to see the exponential distribution (constant failure rate)
- Set k = 0.5 to see the decreasing hazard (infant mortality)
- Set k = 3.6 to see how Weibull approximates the normal distribution
- Toggle the survival function to see reliability curves
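The settings in the list above can also be checked numerically. The sketch below assumes SciPy's `weibull_min` with λ = 1 and reports, for each shape value, the skewness plus the density and survival function at a couple of points. The evaluation points are arbitrary.

```python
from scipy import stats

lam = 1.0                       # scale fixed at 1 so only the shape varies
shapes = [0.5, 1.0, 2.0, 3.6]   # the values suggested in the list above
skewness = {}

for k in shapes:
    dist = stats.weibull_min(c=k, scale=lam)
    skewness[k] = float(dist.stats(moments="s"))
    print(
        f"k = {k}: skewness = {skewness[k]:+.3f}, "
        f"pdf(0.1) = {dist.pdf(0.1):.3f}, pdf(1.0) = {dist.pdf(1.0):.3f}, "
        f"S(1.0) = {dist.sf(1.0):.3f}"
    )
```

A skewness near zero at k = 3.6 is what "approximates the normal distribution" means in practice, and a density that is larger at 0.1 than at 1.0 is the signature of the k ≤ 1 shapes.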
The Shape Parameter k: The Heart of Weibull
The shape parameter is what makes Weibull so powerful. It controls whether failure rate increases, decreases, or stays constant.
Three Regimes
| k Value | Hazard Behavior | Physical Meaning | Example |
|---|---|---|---|
| k < 1 | Decreasing | Infant mortality / burn-in | Software bugs, manufacturing defects |
| k = 1 | Constant | Random failures (exponential) | Cosmic ray damage, accidents |
| k > 1 | Increasing | Wear-out / aging | Mechanical fatigue, corrosion |
| k = 2 | Linear increase | Rayleigh distribution | Wind speeds, 2D random vectors |
| k ≈ 3.6 | Bell-shaped | Near-normal distribution | When positive values needed |
The Intuition
Think of the exponent k − 1 in the hazard function h(t) = (k/λ)(t/λ)^(k−1):
- k < 1: Power is negative → h(t) decreases with time
- k = 1: Power is zero → h(t) is constant
- k > 1: Power is positive → h(t) increases with time
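The three cases can be seen directly by evaluating the hazard at a few times. This is a minimal sketch using the hazard formula h(t) = (k/λ)(t/λ)^(k−1); the scale λ = 10 and the time points are arbitrary choices.

```python
import numpy as np

def weibull_hazard(t, k, lam):
    """Weibull hazard h(t) = (k/lam) * (t/lam)**(k-1)."""
    return (k / lam) * (t / lam) ** (k - 1)

lam = 10.0
t = np.array([1.0, 5.0, 25.0])   # early, middle, and late times
trends = {}

for k in (0.5, 1.0, 2.0):
    h = weibull_hazard(t, k, lam)
    if h[-1] < h[0]:
        trends[k] = "decreasing"
    elif h[-1] > h[0]:
        trends[k] = "increasing"
    else:
        trends[k] = "constant"
    print(f"k = {k}: h(t) = {np.round(h, 4)} -> {trends[k]}")
```

For k = 1 the power term is exactly 1 at every time, so the hazard stays at 1/λ.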
The Hazard Function: Understanding Failure Rates
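The hazard function h(t) is the instantaneous failure rate at time t, given that the item has survived up to that time. For a short interval Δt, h(t)·Δt approximates the probability of failing in the next Δt, conditional on having survived to t. This is one of the most important concepts in reliability engineering.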
Decreasing hazard: "Infant mortality" - failures decrease as weak items are eliminated early.
Constant hazard: Exponential distribution - purely random failures, memoryless.
Increasing hazard: "Wear-out" - older items fail more often due to aging, fatigue.
The Hazard Function: Instantaneous Risk
The hazard function h(t) gives the instantaneous failure rate at time t, given survival up to that time. For Weibull:
$$h(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}$$
The exponent (k-1) determines the behavior:
• k-1 < 0 (k<1): Power is negative → h(t) decreases
• k-1 = 0 (k=1): Power is zero → h(t) = constant = 1/λ
• k-1 > 0 (k>1): Power is positive → h(t) increases
Why Hazard Matters
- Maintenance scheduling: High hazard → more frequent inspections
- Warranty decisions: Set warranty to cover low-hazard period
- Spare parts inventory: Plan based on expected failure rates
- Design improvements: Identify failure modes from hazard patterns
The Bathtub Curve: Product Lifecycle Reliability
One of the most famous applications of Weibull is the bathtub curve, which describes the failure rate of products over their entire lifecycle.
Electronic components: high early failure (screening needed), long useful life, eventual wear-out from electromigration.
The Three Phases
- Phase I: Infant Mortality (Weibull k < 1). High initial failure rate from manufacturing defects, assembly errors, and weak components. Mitigated by burn-in testing and quality control.
- Phase II: Useful Life (Weibull k = 1, exponential). Low, constant failure rate with random failures unrelated to age. This is the longest phase and the one with the lowest failure rate, and it is where warranty periods are typically set.
- Phase III: Wear-out (Weibull k > 1). Increasing failure rate from aging, fatigue, corrosion, and degradation. Preventive maintenance schedules target this phase.
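A single Weibull hazard is monotone, so one Weibull cannot trace the whole bathtub. One common way to sketch the full curve, offered here as an assumption rather than something this section prescribes, is a competing-risks model in which three independent failure modes act at once and their hazards add. All parameter values below are invented for illustration.

```python
def weibull_hazard(t, k, lam):
    """Weibull hazard h(t) = (k/lam) * (t/lam)**(k-1)."""
    return (k / lam) * (t / lam) ** (k - 1)

def bathtub_hazard(t):
    """Sum of three Weibull hazards, one per lifecycle phase (illustrative values)."""
    infant = weibull_hazard(t, k=0.5, lam=2000.0)    # Phase I: decreasing
    random_ = weibull_hazard(t, k=1.0, lam=5000.0)   # Phase II: constant
    wearout = weibull_hazard(t, k=5.0, lam=8000.0)   # Phase III: increasing
    return infant + random_ + wearout

for hours in (10.0, 2000.0, 10000.0):
    print(f"t = {hours:>7.0f} h: hazard = {bathtub_hazard(hours):.6f} per hour")
```

The early and late hazards are both well above the mid-life value, which is the bathtub shape. Adding hazards corresponds to multiplying the survival functions of the three modes.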
Engineering Insight
The bathtub curve explains why:
- Burn-in testing weeds out Phase I failures before shipping
- Extended warranties cover Phase I but rarely Phase III
- Preventive replacement happens before Phase III begins
Key Properties
Property 1: The 63.2% Rule
At time x = λ, 63.2% of items have failed and 36.8% survive, regardless of k:
$$F(\lambda) = 1 - e^{-(\lambda/\lambda)^k} = 1 - e^{-1} \approx 0.632$$
Property 2: Percentile Life
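The p-th percentile life (the time by which a fraction p of items has failed) is found by solving F(t_p) = p:
$$t_p = \lambda\left[-\ln(1 - p)\right]^{1/k}$$
For example, the B10 life (10% failure) commonly used in bearing specifications is:
$$B_{10} = \lambda\left[-\ln(0.90)\right]^{1/k} \approx \lambda\,(0.1054)^{1/k}$$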
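The percentile life follows from inverting the CDF, F(t) = 1 − exp(−(t/λ)^k). A minimal sketch, with p written as a fraction between 0 and 1 and with illustrative parameters k = 2, λ = 3000 hours:

```python
import math

def percentile_life(p, k, lam):
    """Time by which a fraction p (0 < p < 1) of items has failed."""
    return lam * (-math.log(1.0 - p)) ** (1.0 / k)

k, lam = 2.0, 3000.0   # illustrative values
b10 = percentile_life(0.10, k, lam)
b50 = percentile_life(0.50, k, lam)

print(f"B10 life: {b10:.1f} hours")
print(f"B50 life (median): {b50:.1f} hours")

# Round trip: plugging B10 back into the CDF should return 0.10
print(f"F(B10) = {1.0 - math.exp(-(b10 / lam) ** k):.4f}")
```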
Property 3: Minimum of Weibulls
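If $X_1, \dots, X_n$ are independent Weibull with the same shape k but possibly different scales $\lambda_1, \dots, \lambda_n$, then $\min(X_1, \dots, X_n)$ is also Weibull with shape k and scale:
$$\lambda_{\min} = \left(\sum_{i=1}^{n} \lambda_i^{-k}\right)^{-1/k}$$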
This is crucial for series systems where the system fails when the first component fails.
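The reason is that the survival function of a minimum is the product of the individual survival functions, and for a shared shape k the exponents simply add. The sketch below checks this numerically for a hypothetical three-component series system; the scales are made-up values.

```python
import numpy as np

def min_scale(scales, k):
    """Scale of min(X_1..X_n) for independent Weibulls sharing shape k."""
    scales = np.asarray(scales, dtype=float)
    return np.sum(scales ** (-k)) ** (-1.0 / k)

k = 2.0
scales = [1000.0, 1500.0, 3000.0]   # hypothetical component scales (hours)
lam_sys = min_scale(scales, k)

t = np.linspace(1.0, 3000.0, 50)
# System survives only if every component survives
product_of_survivals = np.prod([np.exp(-(t / lam) ** k) for lam in scales], axis=0)
system_survival = np.exp(-(t / lam_sys) ** k)

print(f"System scale: {lam_sys:.1f} hours")
print("Matches product of component survivals:", np.allclose(product_of_survivals, system_survival))
```

The system scale is always smaller than the smallest component scale, which is why adding components in series can only reduce reliability.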
Real-World Applications
Example 1: Wind Speed Distribution
Problem: Wind speeds at a site follow Weibull(k=2, λ=8 m/s). What is the probability of wind speed exceeding 12 m/s?
Solution:
$$P(V > 12) = S(12) = e^{-(12/8)^2} = e^{-2.25} \approx 0.105$$
About 10.5% of the time, wind speed exceeds 12 m/s.
Why k = 2 for Wind?
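Wind speeds often follow Weibull with k ≈ 2 (the Rayleigh distribution) because wind speed is the magnitude of a 2D velocity vector. If the two horizontal components are independent, zero-mean normal variables with equal variance, that magnitude is exactly Rayleigh. This is fundamental to wind turbine design and energy estimation.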
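The wind-speed calculation can be reproduced in three equivalent ways: straight from the survival formula, with SciPy's `weibull_min`, and with SciPy's `rayleigh` using σ = λ/√2. This is a minimal sketch for the parameters given in the problem.

```python
import math
from scipy import stats

k, lam = 2.0, 8.0     # shape and scale (m/s) from the problem statement
threshold = 12.0      # m/s

p_formula = math.exp(-(threshold / lam) ** k)
p_weibull = stats.weibull_min(c=k, scale=lam).sf(threshold)
p_rayleigh = stats.rayleigh(scale=lam / math.sqrt(2)).sf(threshold)

print(f"Formula:  {p_formula:.4f}")
print(f"Weibull:  {p_weibull:.4f}")
print(f"Rayleigh: {p_rayleigh:.4f}")
```

All three agree because Weibull(k = 2, λ) and Rayleigh(σ = λ/√2) are the same distribution.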
Example 2: Bearing Life
Problem: Ball bearings have Weibull life with k = 2 and 90% reliability at 1000 hours (B10 life = 1000 hours). Find the scale parameter λ.
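Solution:
From S(1000) = 0.90:
$$e^{-(1000/\lambda)^2} = 0.90 \;\Rightarrow\; \lambda = \frac{1000}{\sqrt{-\ln 0.90}} \approx 3081 \text{ hours}$$
The characteristic life is about 3081 hours, and the mean life is λΓ(1 + 1/2) ≈ 2730 hours.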
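The bearing calculation is a direct inversion of the survival function followed by the mean formula λΓ(1 + 1/k). A short sketch using only the standard library:

```python
import math

k = 2.0
b10_hours = 1000.0    # time at which reliability is 90%
reliability = 0.90

# Solve exp(-(b10/lam)**k) = reliability for lam
lam = b10_hours / (-math.log(reliability)) ** (1.0 / k)
mean_life = lam * math.gamma(1.0 + 1.0 / k)

print(f"Characteristic life: {lam:.0f} hours")
print(f"Mean life: {mean_life:.0f} hours")
```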
Example 3: Medical Survival Analysis
Problem: Time to cancer recurrence follows Weibull(k=1.5, λ=36 months). What is the 5-year survival probability?
Solution:
$$S(60) = e^{-(60/36)^{1.5}} = e^{-2.152} \approx 0.116$$
About 11.6% of patients remain recurrence-free at 5 years (60 months).
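The survival probability for this example can be evaluated directly from S(t) = exp(−(t/λ)^k), taking care to convert 5 years to 60 months so that t and λ share units. The median recurrence time is included as an extra illustration and is not part of the original problem.

```python
import math

k, lam_months = 1.5, 36.0
t_months = 5 * 12     # 5 years expressed in months, matching the units of lam

survival_5yr = math.exp(-(t_months / lam_months) ** k)
median_months = lam_months * math.log(2) ** (1.0 / k)

print(f"5-year recurrence-free probability: {survival_5yr:.3f}")
print(f"Median time to recurrence: {median_months:.1f} months")
```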
AI/ML Applications
1. Survival Analysis and Time-to-Event Prediction
Weibull is foundational in survival analysis, which has many ML applications:
- Customer churn prediction: Time until customer leaves
- Equipment failure: Predictive maintenance
- User conversion: Time until first purchase
- Clinical trials: Time to adverse event
Deep survival models often use Weibull-based loss functions:
```python
import torch
import torch.nn as nn

class WeibullSurvivalLoss(nn.Module):
    """Weibull negative log-likelihood loss for survival analysis."""

    def forward(self, log_k, log_lambda, t, event):
        """
        Args:
            log_k: Log shape parameter (neural network output)
            log_lambda: Log scale parameter (neural network output)
            t: Observed time
            event: 1 if event occurred, 0 if censored
        """
        k = torch.exp(log_k)
        lambda_ = torch.exp(log_lambda)

        # Log-likelihood components
        log_hazard = torch.log(k) - torch.log(lambda_) + \
            (k - 1) * (torch.log(t) - torch.log(lambda_))
        cum_hazard = (t / lambda_) ** k

        # Negative log-likelihood
        nll = -event * log_hazard + cum_hazard
        return nll.mean()
```
2. Physics-Informed Neural Networks (PINNs)
Weibull physics can be embedded into neural networks for reliability prediction:
- Enforce monotonicity: Survival curves must decrease
- Physical constraints: Hazard must be non-negative
- Transfer learning: Use Weibull priors from similar components
3. Learning Rate Scheduling
Weibull-shaped learning rate decay can model training "aging":
```python
import numpy as np

def weibull_lr_schedule(epoch, initial_lr, k=2, lambda_=100):
    """
    Weibull-inspired learning rate decay.

    - Early epochs: Learning rate decreases slowly (k > 1)
    - Later epochs: More aggressive decay
    """
    survival = np.exp(-(epoch / lambda_) ** k)
    return initial_lr * survival

# Example: LR starts at 0.1, decays with k=2 shape
for epoch in [0, 10, 50, 100, 200]:
    lr = weibull_lr_schedule(epoch, 0.1, k=2, lambda_=100)
    print(f"Epoch {epoch}: LR = {lr:.6f}")
```
4. Anomaly Detection
Weibull models help detect anomalies by comparing observed lifetimes to expected distributions:
- Component failing much earlier than expected → manufacturing defect
- Component lasting much longer than expected → verify model assumptions
- Change in hazard pattern → potential design change or environmental factor
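One simple way to turn the bullets above into a rule, offered as a sketch rather than a standard method, is to flag a lifetime when it falls in an extreme tail of the fitted Weibull. The function name `flag_lifetime`, the threshold `alpha`, and the parameter values are all assumptions made for illustration.

```python
from scipy import stats

def flag_lifetime(t, k, lam, alpha=0.01):
    """Label a lifetime as 'early', 'late', or 'normal' under Weibull(k, lam)."""
    dist = stats.weibull_min(c=k, scale=lam)
    if dist.cdf(t) < alpha:    # fewer than alpha of items fail this soon
        return "early"
    if dist.sf(t) < alpha:     # fewer than alpha of items last this long
        return "late"
    return "normal"

k, lam = 2.0, 3000.0           # hypothetical fitted parameters (hours)
labels = {t: flag_lifetime(t, k, lam) for t in (150.0, 2500.0, 9000.0)}
for t, label in labels.items():
    print(f"Lifetime {t:.0f} h -> {label}")
```

An "early" flag points toward a possible defect, while a "late" flag is a prompt to re-check the fitted model. Neither is proof of anything on its own.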
Connections to Other Distributions
| Relationship | Description |
|---|---|
| Weibull(k=1, λ) = Exponential(rate 1/λ) | Shape k=1 gives the exponential distribution with mean λ |
| Weibull(k=2, λ) = Rayleigh(σ = λ/√2) | Shape k=2 is the Rayleigh distribution |
| Weibull(k≈3.6) ≈ Normal | Approximates normal for positive data |
| log(Weibull) = Gumbel | Taking the log gives a Gumbel (minimum extreme value) distribution with location ln λ and scale 1/k |
| NOT exponential family | Holds when k is unknown (for fixed k it is an exponential family); important for Bayesian analysis limitations |
The Distribution Family Tree
Weibull sits in a family of distributions related by transformations:
- Exponential → (generalize with shape) → Weibull
- Weibull → (take log) → Gumbel (extreme value)
- Normal → (sum of two squared components) → Chi-squared with 2 degrees of freedom → (square root, scale) → Rayleigh
- Rayleigh is Weibull(k=2), connecting to 2D random vectors
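These relationships can be verified numerically by comparing CDFs. The sketch below assumes SciPy's `expon`, `rayleigh`, and `gumbel_l` (the left-skewed, minimum-type Gumbel); the parameter values and evaluation grid are arbitrary.

```python
import numpy as np
from scipy import stats

k, lam = 2.5, 4.0
x = np.linspace(0.2, 12.0, 40)

# k = 1: exponential with mean lam (rate 1/lam)
expo_match = np.allclose(
    stats.weibull_min(c=1.0, scale=lam).cdf(x),
    stats.expon(scale=lam).cdf(x),
)

# k = 2: Rayleigh with sigma = lam / sqrt(2)
rayleigh_match = np.allclose(
    stats.weibull_min(c=2.0, scale=lam).cdf(x),
    stats.rayleigh(scale=lam / np.sqrt(2)).cdf(x),
)

# log of a Weibull: minimum-type Gumbel with loc = ln(lam), scale = 1/k
gumbel_match = np.allclose(
    stats.weibull_min(c=k, scale=lam).cdf(x),
    stats.gumbel_l(loc=np.log(lam), scale=1.0 / k).cdf(np.log(x)),
)

print(expo_match, rayleigh_match, gumbel_match)
```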
Parameter Estimation
Maximum Likelihood Estimation
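Given failure times $x_1, \dots, x_n$, the log-likelihood is:
$$\ell(k, \lambda) = n\ln k - nk\ln\lambda + (k-1)\sum_{i=1}^{n}\ln x_i - \sum_{i=1}^{n}\left(\frac{x_i}{\lambda}\right)^k$$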
There is no closed-form solution. We solve numerically or use the Weibull plot method.
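One standard numerical route, sketched here under stated assumptions, uses the fact that for a fixed k the maximizing scale is λ = (mean of x_i^k)^(1/k). Substituting that back leaves a single equation in k, which a root finder can solve. The bracket [0.01, 100] passed to `scipy.optimize.brentq` is an assumption that suits data like the sample below, not a universal choice.

```python
import numpy as np
from scipy import optimize

def weibull_mle(x):
    """Complete-data Weibull MLE via the one-dimensional profile equation in k."""
    x = np.asarray(x, dtype=float)
    log_x = np.log(x)

    def score(k):
        # Derivative of the profile log-likelihood (divided by n); zero at the MLE
        xk = x ** k
        return np.sum(xk * log_x) / np.sum(xk) - 1.0 / k - log_x.mean()

    k_hat = optimize.brentq(score, 0.01, 100.0)     # assumed bracket
    lam_hat = np.mean(x ** k_hat) ** (1.0 / k_hat)  # closed form given k
    return k_hat, lam_hat

failure_times = np.array([1.2, 2.5, 3.1, 1.8, 4.2, 2.9, 5.1, 1.5, 2.2, 3.8,
                          2.7, 3.3, 4.5, 2.1, 3.9, 1.9, 4.8, 2.8, 3.5, 4.1])
k_hat, lam_hat = weibull_mle(failure_times)
print(f"k_hat = {k_hat:.4f}, lambda_hat = {lam_hat:.4f}")
```

Reducing the problem to one dimension makes it easy to see why no closed form exists: k appears both as an exponent and in the 1/k term.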
Weibull Plot (Graphical Method)
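Transform data to linearize the CDF:
$$\ln\left[-\ln(1 - F(x))\right] = k\ln x - k\ln\lambda$$
Plotting $\ln[-\ln(1 - \hat{F}(x_i))]$ vs $\ln x_i$ gives a line with slope k and intercept $-k\ln\lambda$.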
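The graphical method can be sketched as an ordinary least-squares fit on the transformed axes. The empirical CDF needs a plotting-position rule; this sketch assumes Bernard's median-rank approximation, (i − 0.3)/(n + 0.4), which is a common choice but not one the text specifies. The function name is illustrative.

```python
import numpy as np

def weibull_plot_fit(failure_times):
    """Estimate (k, lam) by a straight-line fit on Weibull-plot axes."""
    x = np.sort(np.asarray(failure_times, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    f_hat = (ranks - 0.3) / (n + 0.4)        # Bernard's approximation (assumed)
    xs = np.log(x)                           # horizontal axis: ln x
    ys = np.log(-np.log(1.0 - f_hat))        # vertical axis: ln(-ln(1 - F))
    slope, intercept = np.polyfit(xs, ys, 1)
    k_hat = slope
    lam_hat = np.exp(-intercept / slope)     # intercept = -k * ln(lam)
    return k_hat, lam_hat

failure_times = np.array([1.2, 2.5, 3.1, 1.8, 4.2, 2.9, 5.1, 1.5, 2.2, 3.8,
                          2.7, 3.3, 4.5, 2.1, 3.9, 1.9, 4.8, 2.8, 3.5, 4.1])
k_plot, lam_plot = weibull_plot_fit(failure_times)
print(f"Weibull plot estimates: k = {k_plot:.3f}, lambda = {lam_plot:.3f}")
```

The regression estimates generally differ somewhat from the MLE, but the plot has a diagnostic benefit: points that bend away from a straight line suggest the data are not Weibull.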
Python Implementation
Basic Operations with SciPy
```python
from scipy import stats
import numpy as np

# Weibull distribution in scipy
# IMPORTANT: scipy uses shape c (= k) and scale (= λ)
k = 2.0        # Shape parameter
lambda_ = 5.0  # Scale parameter

weibull = stats.weibull_min(c=k, scale=lambda_)

# PDF: f(t)
t = 3.0
pdf = weibull.pdf(t)
print(f"f({t}) = {pdf:.6f}")

# CDF: F(t) = P(T ≤ t)
cdf = weibull.cdf(t)
print(f"P(T ≤ {t}) = {cdf:.6f}")

# Survival: S(t) = P(T > t) = 1 - F(t)
survival = weibull.sf(t)
print(f"P(T > {t}) = {survival:.6f}")

# Hazard: h(t) = f(t) / S(t)
hazard = pdf / survival
print(f"h({t}) = {hazard:.6f}")

# Key statistics
print(f"Mean: {weibull.mean():.4f}")
print(f"Median: {weibull.median():.4f}")
print(f"Std Dev: {weibull.std():.4f}")

# Quantiles (percentile lives)
print(f"B10 life (10% fail): {weibull.ppf(0.10):.4f}")
print(f"B50 life (median): {weibull.ppf(0.50):.4f}")

# Generate random samples
samples = weibull.rvs(size=1000)
print(f"Sample mean: {samples.mean():.4f}")
print(f"Sample median: {np.median(samples):.4f}")
```
Fitting Weibull to Data
```python
from scipy import stats
import numpy as np

# Example failure time data
failure_times = np.array([1.2, 2.5, 3.1, 1.8, 4.2, 2.9, 5.1, 1.5, 2.2, 3.8,
                          2.7, 3.3, 4.5, 2.1, 3.9, 1.9, 4.8, 2.8, 3.5, 4.1])

# Fit Weibull using MLE
# Returns (c, loc, scale) where c = shape, scale = λ
shape, loc, scale = stats.weibull_min.fit(failure_times, floc=0)
k_hat = shape
lambda_hat = scale

print(f"Estimated k (shape): {k_hat:.4f}")
print(f"Estimated λ (scale): {lambda_hat:.4f}")

# Verify with theoretical values
fitted_dist = stats.weibull_min(c=k_hat, scale=lambda_hat)
print(f"Fitted mean: {fitted_dist.mean():.4f}")
print(f"Sample mean: {failure_times.mean():.4f}")

# Goodness-of-fit test (Kolmogorov-Smirnov)
ks_stat, p_value = stats.kstest(failure_times, 'weibull_min',
                                args=(k_hat, 0, lambda_hat))
print(f"KS statistic: {ks_stat:.4f}, p-value: {p_value:.4f}")
```
Reliability Analysis
```python
from scipy import stats
import numpy as np

def reliability_analysis(k, lambda_, target_reliability=0.90):
    """
    Comprehensive reliability analysis for a Weibull component.
    """
    weibull = stats.weibull_min(c=k, scale=lambda_)

    # Mean Time To Failure
    mttf = weibull.mean()

    # Percentile lives
    b10 = weibull.ppf(0.10)  # 10% failure
    b50 = weibull.ppf(0.50)  # 50% failure (median)

    # Time for target reliability
    t_reliable = weibull.ppf(1 - target_reliability)

    # Failure rate at MTTF
    hazard_at_mttf = (k / lambda_) * (mttf / lambda_) ** (k - 1)

    print("=== Weibull Reliability Analysis ===")
    print(f"Shape k = {k}, Scale λ = {lambda_}")
    print(f"MTTF: {mttf:.2f}")
    print(f"B10 Life: {b10:.2f}")
    print(f"B50 Life (Median): {b50:.2f}")
    print(f"Time for {target_reliability*100:.0f}% reliability: {t_reliable:.2f}")
    print(f"Hazard at MTTF: {hazard_at_mttf:.4f}")

    return {
        'mttf': mttf,
        'b10': b10,
        'b50': b50,
        't_reliable': t_reliable,
        'hazard_mttf': hazard_at_mttf
    }

# Example: Bearing with k=2, λ=3000 hours
results = reliability_analysis(k=2.0, lambda_=3000)
```
SciPy Parameterization
SciPy uses c for the shape parameter (what we call k) and scale for λ. Some textbooks use β instead of k and η instead of λ. Always check your parameterization!
Failure Time Simulation
Generate random failure times and estimate parameters from the simulated data. This helps you understand how well MLE works with different sample sizes.
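The same experiment can be run as a short script. The sketch below draws repeated samples from a known Weibull using NumPy's `Generator.weibull`, refits each sample with `scipy.stats.weibull_min.fit`, and summarizes the estimates by sample size. The true parameters, the sample sizes, the 50 repetitions, and the seed are arbitrary choices.

```python
import numpy as np
from scipy import stats

def simulate_mle(k_true, lam_true, n, reps, rng):
    """Fit Weibull MLEs to `reps` simulated samples of size n."""
    k_hats = np.empty(reps)
    lam_hats = np.empty(reps)
    for r in range(reps):
        # rng.weibull draws from the standard (scale 1) Weibull; rescale by lam
        sample = lam_true * rng.weibull(k_true, size=n)
        k_hats[r], _, lam_hats[r] = stats.weibull_min.fit(sample, floc=0)
    return k_hats, lam_hats

rng = np.random.default_rng(0)
summary = {}
for n in (10, 100, 1000):
    k_hats, lam_hats = simulate_mle(2.0, 5.0, n, reps=50, rng=rng)
    summary[n] = (k_hats.mean(), k_hats.std(), lam_hats.mean())
    print(f"n = {n:>4}: mean k_hat = {summary[n][0]:.3f}, "
          f"sd k_hat = {summary[n][1]:.3f}, mean lambda_hat = {summary[n][2]:.3f}")
```

The spread of the shape estimates shrinks as n grows, and small samples tend to produce shape estimates that run high on average.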
Common Pitfalls
Pitfall 1: Confusion with Parameterization
Different sources use different parameterization. Common variants:
- scipy.stats: c = shape (k), scale = λ
- Some textbooks: β = shape, η = scale (characteristic life)
- R's dweibull: shape = k, scale = λ
Always verify by checking that the mean formula matches your parameterization.
Pitfall 2: Extrapolation Danger
Weibull fits to observed data may not extrapolate well to extreme values. If you observe failures from 100-1000 hours, predicting behavior at 10,000 hours is risky—the physical failure mechanism might change.
Pitfall 3: Ignoring Censored Data
In reliability testing, not all items fail before the test ends. "Censored" observations (items still working at end of test) require special handling. Standard MLE assumes all data are complete failures.
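With right-censored data, a failed item contributes log h(t) − H(t) to the log-likelihood and a censored item contributes only −H(t), where H(t) = (t/λ)^k is the cumulative hazard. The sketch below maximizes that likelihood with `scipy.optimize.minimize` on simulated data and compares it with a naive fit that treats every observation as a failure. The simulation settings and function names are illustrative assumptions.

```python
import numpy as np
from scipy import optimize

def censored_weibull_nll(params, t, event):
    """Negative log-likelihood; event = 1 for failure, 0 for right-censored."""
    k, lam = np.exp(params)                 # optimize on the log scale
    z = t / lam
    log_hazard = np.log(k) - np.log(lam) + (k - 1) * np.log(z)
    cum_hazard = z ** k
    return -np.sum(event * log_hazard - cum_hazard)

def fit_censored_weibull(t, event):
    start = np.array([0.0, np.log(np.mean(t))])   # k = 1, lam = mean time
    res = optimize.minimize(censored_weibull_nll, start, args=(t, event),
                            method="Nelder-Mead")
    return np.exp(res.x[0]), np.exp(res.x[1])

# Simulated test: true k = 2, lam = 100 hours, test stopped at 80 hours
rng = np.random.default_rng(1)
lifetimes = 100.0 * rng.weibull(2.0, size=2000)
test_end = 80.0
event = (lifetimes <= test_end).astype(float)
observed = np.minimum(lifetimes, test_end)

k_cens, lam_cens = fit_censored_weibull(observed, event)
k_naive, lam_naive = fit_censored_weibull(observed, np.ones_like(observed))
print(f"Censoring-aware: k = {k_cens:.2f}, lambda = {lam_cens:.1f}")
print(f"Naive (all failures): k = {k_naive:.2f}, lambda = {lam_naive:.1f}")
```

The naive fit treats items that were still running at 80 hours as if they had failed then, so it underestimates the characteristic life.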
Pitfall 4: Small Sample Bias
MLE estimates of k are biased in small samples. For n < 20, consider bias correction or use a Bayesian approach with informative priors.
Summary
The Weibull distribution is a powerful and flexible model for time-to-failure data, with the shape parameter k controlling whether failures increase, decrease, or remain constant over time.
Key Formulas
| Property | Formula |
|---|---|
| PDF | f(x) = (k/λ)(x/λ)^(k-1) exp(-(x/λ)^k) |
| CDF | F(x) = 1 - exp(-(x/λ)^k) |
| Survival | S(x) = exp(-(x/λ)^k) |
| Hazard | h(x) = (k/λ)(x/λ)^(k-1) |
| Mean | λΓ(1 + 1/k) |
| Median | λ(ln 2)^(1/k) |
| P(X > λ) | e^(-1) ≈ 36.8% (always) |
Key Takeaways
- Shape k controls failure behavior: k < 1 (decreasing), k = 1 (constant/exponential), k > 1 (increasing)
- Scale λ is the characteristic life: 63.2% fail before λ, regardless of k
- The bathtub curve combines three Weibull regimes to model complete product lifecycles
- Weibull generalizes exponential: Setting k = 1 recovers the exponential distribution
- Critical for reliability engineering: MTTF, B10 life, warranty analysis all use Weibull
- AI/ML applications: Survival analysis, time-to-event prediction, physics-informed neural networks
The Bottom Line: When you need to model time-to-failure data with the flexibility to capture increasing, decreasing, or constant failure rates, Weibull is your go-to distribution. Its shape parameter makes it one of the most versatile tools in reliability engineering and survival analysis.
From Reliability Engineering to Deep Learning
Weibull connects classical reliability engineering to modern machine learning. Whether you're predicting customer churn, designing maintenance schedules, or building survival models, understanding Weibull gives you the mathematical foundation to model time-to-event phenomena across domains.