AI Book - Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will:

Understand the linear RUL problem—why raw RUL labels are problematic
Derive the piecewise linear model that caps RUL at a maximum value
Justify the RUL maximum of 125 cycles based on physical and practical considerations
Apply the mathematical formulation to transform raw labels
Understand evaluation implications of capped RUL during testing

Why This Matters: The choice of RUL target formulation significantly impacts model performance. The piecewise linear model is now standard practice in RUL prediction research—understanding why enables you to make informed decisions for other prognostics problems.

The Linear RUL Problem

The raw RUL label is simply cycles remaining until failure:

\text{RUL}(t) = T_{\text{max}} - t

This creates a linear decrease from $T_{\text{max}} - 1$ at cycle 1 to 0 at failure. However, this formulation has fundamental problems.
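As a minimal sketch of this labeling rule, the snippet below generates the raw labels for a single engine. The function name `linear_rul_labels` and the 200-cycle lifetime are illustrative assumptions, not taken from any dataset.

```python
def linear_rul_labels(total_cycles):
    """Return the raw linear RUL label for cycles 1..total_cycles of one engine."""
    return [total_cycles - t for t in range(1, total_cycles + 1)]


# Hypothetical engine that fails at cycle 200.
labels = linear_rul_labels(200)
print(labels[0], labels[-1])  # label at cycle 1 and label at the failure cycle
```

The first label is $T_{\text{max}} - 1$ and the last is 0, matching the description above.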

Problem 1: No Visible Degradation Early

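In the early phases of engine life, sensor readings are essentially identical from one cycle to the next, so the inputs carry almost no information that could separate, say, RUL = 300 from RUL = 250. The linear label nevertheless assigns a different target to every cycle.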

Problem 2: High Variance Labels

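Linear RUL creates high variance in early-life labels. Engines at the same cycle, with near-identical sensor readings, carry very different RUL labels because their total lifetimes differ: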

Cycle	Min RUL	Max RUL	Range	Std Dev
1	127	361	234	~60
10	118	352	234	~60
50	78	312	234	~60
100	28	262	234	~60
150	0	212	212	~55

Training on high-variance labels leads to high-variance predictions.
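The constant spread in the table above follows directly from the label definition: at a fixed cycle, every label is a lifetime minus the same constant, so the spread of labels equals the spread of lifetimes. The sketch below illustrates this with a small hypothetical fleet. Only the extreme lifetimes (128 and 362) are implied by the table; the other values and the helper name `labels_at` are assumptions for illustration.

```python
from statistics import pstdev

# Hypothetical lifetimes; only the extremes 128 and 362 are implied by the table.
lifetimes = [128, 180, 200, 250, 362]


def labels_at(cycle, lifetimes):
    """Raw linear RUL of every engine that is still running at `cycle`."""
    return [T - cycle for T in lifetimes if T >= cycle]


for cycle in (1, 10, 50, 100):
    labels = labels_at(cycle, lifetimes)
    spread = max(labels) - min(labels)
    print(cycle, min(labels), max(labels), spread, round(pstdev(labels), 1))
```

As long as no engine has failed yet, the range and standard deviation do not change with the cycle number, which is why the early rows of the table share the same spread.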

Problem 3: Asymmetric Importance

In practice, predicting "RUL = 200" vs "RUL = 300" matters much less than "RUL = 20" vs "RUL = 30":

High RUL: No maintenance action needed either way
Low RUL: Critical decision point for scheduling maintenance

Equal weighting of all RUL errors ignores this practical asymmetry.

Piecewise Linear Degradation Model

The piecewise linear model addresses these problems by capping RUL at a maximum value:

\text{RUL}_{\text{PL}}(t) = \min(\text{RUL}_{\text{max}}, T_{\text{max}} - t)

Visual Representation

The piecewise linear RUL function has two phases:

Constant phase: RUL = $\text{RUL}_{\text{max}}$ for early cycles (no visible degradation)
Linear decay phase: RUL decreases linearly as degradation becomes apparent

📝text

RUL
 ^
 |
 |  ___________________    <- Constant at RUL_max
 |                     \
 |                      \
 |                       \   <- Linear decay
 |                        \
 |                         \
 +-------------------------->  Cycle
 0                         T_max

Physical Interpretation

The piecewise linear model reflects a physical reality:

Early in life, degradation is imperceptible. The engine operates within normal parameters, and sensor readings cannot distinguish a "young" engine from a "slightly older" engine. Only after sufficient wear accumulates do degradation signatures become detectable.

Setting RUL = $\text{RUL}_{\text{max}}$ for early cycles says: "We know the engine is healthy, but we cannot predict exactly how long it will last."

Choosing RUL Maximum

The choice of $\text{RUL}_{\text{max}}$ is important. Common values in the literature include 125, 130, and 150 cycles.

Why 125 Cycles?

We use $\text{RUL}_{\text{max}} = 125$ based on:

Literature standard: Most published work uses 125, enabling fair comparison
Statistical analysis: Degradation signatures become detectable around 100-130 cycles before failure
Practical maintenance: 125 cycles (~125 flights) provides sufficient planning window

Analysis Supporting RUL Max

Effect on Label Distribution

Capping at 125 dramatically changes the label distribution:

Metric	Linear RUL	Piecewise (125)
Min value	0	0
Max value	~360	125
Mean	~100	~75
Std dev	~60	~40
Distribution	Skewed right	Spike at 125, roughly uniform below
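The direction of these changes can be checked on a toy example. The sketch below pools the labels of five engines with hypothetical lifetimes (not taken from any dataset, so the printed numbers will not match the table) and compares the linear and capped versions. The helper name `all_labels` is an assumption for illustration.

```python
from statistics import mean, pstdev

# Hypothetical lifetimes, chosen only for illustration.
lifetimes = [128, 180, 200, 250, 362]


def all_labels(lifetimes, rul_max=None):
    """Pool the RUL labels of every cycle of every engine, optionally capped."""
    labels = []
    for T in lifetimes:
        for t in range(1, T + 1):
            rul = T - t
            labels.append(rul if rul_max is None else min(rul_max, rul))
    return labels


raw = all_labels(lifetimes)
capped = all_labels(lifetimes, rul_max=125)
for name, labels in (("linear", raw), ("piecewise", capped)):
    print(name, min(labels), max(labels), round(mean(labels), 1), round(pstdev(labels), 1))
```

Capping leaves the minimum at 0, pulls the maximum down to 125, and lowers both the mean and the standard deviation of the pooled labels.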

Mathematical Formulation

Let's formalize the piecewise linear RUL computation.

Definition

For an engine with total lifetime $T$ cycles (written $T_{\text{max}}$ in the earlier formulas), the piecewise linear RUL at cycle $t$ is:

\text{RUL}_{\text{PL}}(t) = \begin{cases} \text{RUL}_{\text{max}} & \text{if } t \leq T - \text{RUL}_{\text{max}} \\ T - t & \text{if } t > T - \text{RUL}_{\text{max}} \end{cases}

Or equivalently using the min function:

\text{RUL}_{\text{PL}}(t) = \min(\text{RUL}_{\text{max}}, T - t)

Transition Point

The transition from constant to linear occurs at:

t^* = T - \text{RUL}_{\text{max}}
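To make the transition concrete, consider a hypothetical engine with $T = 200$ and $\text{RUL}_{\text{max}} = 125$ (the lifetime is chosen only for illustration):

```latex
t^* = 200 - 125 = 75, \qquad
\text{RUL}_{\text{PL}}(75) = \min(125,\, 200 - 75) = 125, \qquad
\text{RUL}_{\text{PL}}(76) = \min(125,\, 200 - 76) = 124
```

Cycle 75 is the last cycle labeled with the cap, and the linear decay starts at cycle 76. An engine with $T \leq \text{RUL}_{\text{max}}$ has $t^* \leq 0$, so its labels have no constant phase and follow $T - t$ from the first cycle.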

Implementation

🐍python

# Piecewise linear RUL computation
def compute_piecewise_rul(cycle, total_cycles, rul_max=125):
    """
    Compute piecewise linear RUL.

    Args:
        cycle: Current cycle number (1-indexed)
        total_cycles: Total lifetime of the engine in cycles (the cycle at which it fails)
        rul_max: Maximum RUL cap (default 125)

    Returns:
        Capped RUL value
    """
    linear_rul = total_cycles - cycle
    return min(rul_max, linear_rul)
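In practice the same rule is applied to every cycle of an engine at once. The following sketch builds the full label sequence for one engine. The function name `piecewise_rul_labels` and the 200-cycle lifetime are hypothetical, and the function restates the rule above so that the snippet runs on its own.

```python
def piecewise_rul_labels(total_cycles, rul_max=125):
    """Capped RUL labels for cycles 1..total_cycles of a single engine."""
    return [min(rul_max, total_cycles - t) for t in range(1, total_cycles + 1)]


# Hypothetical engine that fails at cycle 200.
labels = piecewise_rul_labels(200)
print(labels[:3])     # first cycles: constant phase
print(labels[74:77])  # cycles 75, 76, 77: around the transition
print(labels[-3:])    # last three cycles before failure
```

Index `i` in the list corresponds to cycle `i + 1`, which is why cycle 75 sits at index 74.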

Training and Evaluation Implications

The piecewise linear model affects both training and evaluation.

Training Implications

Reduced label variance: Model sees consistent RUL = 125 for healthy engines
Focused learning: Model concentrates on the degrading phase where prediction matters
Gradient stability: Loss gradients are more stable without extreme RUL targets

Evaluation Implications

During evaluation, we compare predicted RUL to ground truth. For test samples, the ground truth may be below or above 125:

True RUL	Clipped	Model Should Predict
180	125	~125 (cannot know exact)
100	100	~100 (within range)
50	50	~50 (visible degradation)
10	10	~10 (critical)

Test Label Treatment

Standard practice is to clip test labels at 125 as well, so that results are comparable across papers. A model that predicts 125 for a true RUL of 200 therefore incurs zero error. Metrics reported on unclipped test labels are not directly comparable with metrics computed this way.

Scoring Functions

Standard metrics (RMSE, MAE) are computed on capped RUL:

\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i^{\text{(clipped)}})^2}

where $y_i^{\text{(clipped)}} = \min(125, y_i^{\text{(true)}})$.
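A minimal sketch of this metric is shown below. The function name `clipped_rmse` and the prediction values are hypothetical. The true RUL values reuse the "True RUL" column of the evaluation table above.

```python
import math


def clipped_rmse(predictions, true_rul, rul_max=125):
    """RMSE of predictions against test labels clipped at rul_max."""
    clipped = [min(rul_max, y) for y in true_rul]
    squared_errors = [(p - y) ** 2 for p, y in zip(predictions, clipped)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


true_rul = [180, 100, 50, 10]    # "True RUL" column of the evaluation table
predictions = [125, 95, 55, 10]  # hypothetical model outputs
print(clipped_rmse(predictions, true_rul))
```

The first sample contributes no error because its label is clipped from 180 to 125 before the comparison. Only the two 5-cycle misses contribute to the result.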

Summary

In this section, we introduced the piecewise linear RUL model:

Linear RUL problems: Indistinguishable early-life readings, high label variance, asymmetric importance
Piecewise solution: Cap RUL at maximum value (125 cycles)
Physical motivation: Degradation is imperceptible until ~100-130 cycles before failure
Formula: $\text{RUL}_{\text{PL}} = \min(125, T - t)$
Benefits: Reduced variance, focused learning, stable training
Evaluation: Apply same clipping to test labels for fair comparison

Aspect	Linear RUL	Piecewise RUL
Max value	~360	125
Early-life labels	High, variable	Constant at 125
Learning focus	Entire range	Degradation phase
Variance	High	Reduced
Standard in the literature	No	Yes

Looking Ahead: The piecewise linear model gives us continuous RUL targets. But for multi-task learning, we also need discrete targets. In the next section, we will discretize RUL into health states—categorical labels that enable our classification task and provide regularization for the regression.

With the piecewise RUL model defined, we are ready to create the health state classification targets.
