The Fuel Meter Stays Full Until It Doesn't
A car's fuel meter does not display “413 km to empty” when the tank is fresh - it just shows a near-full bar. Only as the level drops below half does the gauge start moving in meaningful steps; below the warning threshold it animates more urgently. The intuition is the same with engine RUL: there is no useful signal in the difference between “200 cycles remaining” and “180 cycles remaining” - the engine is healthy in both cases. The model wastes capacity learning to discriminate within that flat zone.
The standard fix in C-MAPSS modelling, going back to Heimes (2008), is the piecewise-linear RUL cap at R_max = 125. Below this threshold the target is linear in cycles to failure; above it, the target is flat at R_max. The model trains on a target that emphasises the regime where the prediction matters.
The Piecewise-Linear Function
For an engine that fails at cycle t_fail, the capped RUL at cycle t is
$$y_{\text{capped}}(t) = \min\left(R_{\max},\ t_{\text{fail}} - t\right).$$
Two regimes:
| Regime | Condition | Target |
|---|---|---|
| Early life (cap binds) | t_fail - t > R_max | Constant R_max |
| Late life (cap released) | t_fail - t <= R_max | Linear: t_fail - t |
For our 200-cycle engine, the boundary is at cycle 75 (= 200 - 125). The first 75 cycles all share target 125; cycles 75-200 ramp linearly from 125 down to 0.
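The boundary arithmetic above generalises to any lifetime. The following is a minimal NumPy sketch, assuming cycles are counted from 0 as in the 200-cycle example; the helper names `elbow_cycle` and `capped_targets` are hypothetical and introduced only for illustration. It also covers the edge case of an engine that lives fewer than R_max cycles, where the cap never binds.

```python
import numpy as np

R_MAX = 125  # cap value used throughout this section


def elbow_cycle(t_fail: int, r_max: int = R_MAX) -> int:
    """Cycle at which t_fail - t first equals r_max (0 if the cap never binds)."""
    return max(t_fail - r_max, 0)


def capped_targets(t_fail: int, r_max: int = R_MAX) -> np.ndarray:
    """Capped RUL for cycles t = 0, 1, ..., t_fail (inclusive)."""
    t = np.arange(t_fail + 1)
    return np.minimum(r_max, t_fail - t)


for life in (200, 100):
    y = capped_targets(life)
    # Number of cycles where the raw RUL strictly exceeds the cap.
    n_flat = int(np.sum(life - np.arange(life + 1) > R_MAX))
    print(life, elbow_cycle(life), n_flat, y[0], y[-1])

# 200 75 75 125 0
# 100 0 0 100 0
```

For the 100-cycle engine the target is purely linear: its raw RUL starts at 100, already below the cap.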
Python: Apply the Cap
NumPy is the standard numerical library in Python. It gives us ndarray (N-dimensional array) — a fast, contiguous, vectorised matrix type implemented in C. Every cap operation in this file (np.minimum, np.arange, broadcasting subtraction) runs in compiled C, not slow Python loops. We alias as 'np' by universal convention.
Pandas provides DataFrame and Series — labelled tabular data structures built on top of NumPy. Not strictly needed for this self-contained cap demo, but kept for consistency with the rest of the book where the C-MAPSS pipeline is fed by a pandas DataFrame indexed by (engine_id, cycle).
The cap threshold. This is the C-MAPSS convention introduced by Heimes (2008) and used by virtually every paper since. Empirically tested values 110–150 give nearly identical RMSE; below 100 starts to hurt because too much late-life signal is lost; above 150 introduces regression noise from early cycles where degradation is invisible.
Defines the function that implements y = min(R_max, t_fail − t) for an entire array of RUL values at once. This is the function you call inside Dataset.__getitem__ when you build the supervised target.
Single-line docstring. Mirrors the math from the section above so readers of the implementation immediately see which equation is being implemented. Triple double-quotes are PEP 257's recommendation even for one-line docstrings, because the same prefix scales to multi-line.
The whole cap is one element-wise min. Where raw rul ≤ r_max, np.minimum returns the raw value (the linear regime). Where raw rul > r_max, it returns r_max (the flat regime). The 'piecewise' nature is implicit in the min.
Comment header marking the start of a small standalone demo. We will fabricate one engine that fails at cycle 200 so we can inspect the cap behaviour without loading the C-MAPSS CSV.
Total lifespan of our synthetic engine — it dies at cycle 200. With R_MAX = 125 this means the cap will bind for the first 75 cycles (200 − 125 = 75) and release for the remaining 125.
Generates the time index 0, 1, 2, …, 199 — every cycle the engine has run. np.arange is NumPy's vectorised version of Python's range().
Vectorised broadcast subtraction: scalar (200) minus ndarray (cycles) returns an ndarray where each element is 200 − cycle. This is the raw distance-to-failure target — the un-capped ground truth.
Calls our function. r_max defaults to R_MAX = 125, so np.minimum(raw_rul, 125) runs internally. The first 75 elements (where raw_rul > 125) collapse to 125; the last 125 elements (where raw_rul ≤ 125) pass through unchanged.
Header line for the first chunk of demo output. Pure side effect; no return value.
f-string interpolates the slice raw_rul[:5] (first 5 elements) and converts it to a Python list with .tolist() so the output reads like a familiar list literal instead of NumPy's ndarray repr.
Same slice, but on the capped array. Every value is 125 — proof the cap is binding for the first 75 cycles where raw RUL exceeds R_MAX.
Empty print — emits a blank line. Used purely for visual separation between the two demo blocks.
print accepts multiple positional args and joins them with the default sep=' '. life − R_MAX = 200 − 125 = 75 — the exact cycle where the cap releases.
Sub-header. Computes the inclusive range 73 to 77 — two cycles before and after the elbow at 75 — so we can see the cap releasing in real time.
Slice raw_rul[73:78] — Python slicing is half-open, so the +3 gives us 5 elements (indices 73, 74, 75, 76, 77). Shows the raw target marching linearly through the elbow value 125.
Same window on the capped array. Indices 73 and 74 are still flattened to 125 (raw was 127 and 126, both > R_MAX). Index 75 is exactly 125 in both arrays — the boundary. Indices 76 and 77 pass through (124 and 123 ≤ 125).
```python
import numpy as np
import pandas as pd  # unused in this demo; kept for consistency with the rest of the book

R_MAX = 125

def apply_rul_cap(rul: np.ndarray, r_max: int = R_MAX) -> np.ndarray:
    """Cap RUL at r_max. Implements y = min(r_max, t_fail - t)."""
    return np.minimum(rul, r_max)


# ----- Demo on one synthetic engine -----
life = 200
cycles = np.arange(life)  # 0..199
raw_rul = life - cycles  # 200, 199, ..., 1
capped_rul = apply_rul_cap(raw_rul)

print("first 5 cycles:")
print(f" raw : {raw_rul[:5].tolist()}")
print(f" capped : {capped_rul[:5].tolist()}")
print()
print("transition zone around cycle", life - R_MAX, ":")
print(f" cycles {life - R_MAX - 2}-{life - R_MAX + 2}:")
print(f" raw : {raw_rul[life - R_MAX - 2:life - R_MAX + 3].tolist()}")
print(f" capped : {capped_rul[life - R_MAX - 2:life - R_MAX + 3].tolist()}")

# first 5 cycles:
#  raw : [200, 199, 198, 197, 196]
#  capped : [125, 125, 125, 125, 125]   ← cap holds for first 75 cycles
#
# transition zone around cycle 75 :
#  cycles 73-77:
#  raw : [127, 126, 125, 124, 123]
#  capped : [125, 125, 125, 124, 123]   ← cap releases at cycle 75
```

PyTorch: Cap as a Tensor Op
Top-level PyTorch package. Provides torch.Tensor (the GPU-aware n-dimensional array), torch.clamp (the cap operator we use here), torch.arange (tensor version of np.arange), and the autograd engine. Once we move from the NumPy demo to the actual training Dataset, the cap must operate on tensors so the result lands directly on the right device with the right dtype.
Same cap value as the NumPy version. Defined as a Python int (not a tensor) so it works equally well as a default argument and inside scalar arithmetic. torch.clamp's max= argument accepts a Python scalar directly, so no conversion is needed.
Tensor-flavoured version of the cap. Same signature, same semantics — only the input/output types change from ndarray to torch.Tensor. This is the function you actually call in production training because it keeps the data on whatever device the rest of the pipeline expects.
One-line cap. torch.clamp clips a tensor into [min, max] on whatever device the tensor lives on (CPU or GPU). Passing only max= gives upper-only clipping, which is exactly our piecewise-linear cap. The operation is element-wise and differentiable, with gradient zero on the saturated side.
Comment marking that the production call site is Dataset.__getitem__ — the data boundary. Capping there means every batch the model ever sees has a target already in [0, R_MAX]. The model never has to see, fit, or backprop through raw RUL.
Seeds the global PyTorch RNG so any randomness later in the script is deterministic. This snippet does not actually use randomness, but seeding is included as a hygiene reminder — every Dataset that adds noise, dropout, or random crops needs a deterministic seed for reproducible runs.
Builds a synthetic raw-RUL tensor [200., 199., …, 1.] in two stages. torch.arange generates the integer sequence; .float() casts it from int64 to float32, the standard regression dtype.
Calls the cap. r_max defaults to R_MAX = 125, so torch.clamp(fake_raw_rul, max=125) runs internally. The result is a fresh tensor; the original fake_raw_rul is left untouched (clamp is non-mutating; clamp_ would be the in-place version).
Sanity check on the lower side. capped.min() returns a 0-dim tensor; .item() converts it to a plain Python float for printing. The minimum is 1.0 because our last cycle has raw_rul = 1, well below R_MAX, so it passes through unchanged.
Sanity check on the upper side. The maximum is exactly 125.0 — proof that the cap is binding for every cycle where raw_rul exceeded R_MAX. If we ever saw a value > 125 here, the cap would be silently broken.
Print the first five elements as a Python list. capped[:5] is a length-5 view; .tolist() materialises it into [125.0, 125.0, ...] for clean printing. All five values are 125 because raw_rul[:5] = [200, 199, 198, 197, 196], all above the cap.
Five elements straddling the elbow: indices 73, 74, 75, 76, 77. Identical pattern to the NumPy version: indices 73 and 74 are held at 125 (cap binds), index 75 is exactly 125 at the boundary (both regimes agree), and indices 76 and 77 fall through to the linear regime.
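The annotations above name Dataset.__getitem__ as the production call site. A minimal sketch of what that might look like follows. The class name `CappedRULDataset`, the tensor layout (windows of 30 cycles by 14 sensors), and the all-zero sensor data are assumptions made for illustration, not the book's actual pipeline.

```python
import torch
from torch.utils.data import Dataset

R_MAX = 125


class CappedRULDataset(Dataset):
    """Hypothetical dataset pairing sensor windows with capped RUL targets."""

    def __init__(self, windows: torch.Tensor, raw_rul: torch.Tensor, r_max: int = R_MAX):
        assert len(windows) == len(raw_rul)
        self.windows = windows          # shape (N, window_len, n_sensors)
        self.raw_rul = raw_rul.float()  # shape (N,), un-capped cycles to failure
        self.r_max = r_max

    def __len__(self) -> int:
        return len(self.raw_rul)

    def __getitem__(self, idx: int):
        x = self.windows[idx]
        # Cap at the data boundary: the model only ever sees y <= r_max.
        y = torch.clamp(self.raw_rul[idx], max=self.r_max)
        return x, y


# Toy usage with made-up data: 200 windows of 30 cycles x 14 sensors.
windows = torch.zeros(200, 30, 14)
raw_rul = torch.arange(200, 0, -1)  # 200, 199, ..., 1
ds = CappedRULDataset(windows, raw_rul)

x0, y0 = ds[0]            # raw RUL 200 -> capped to 125
x_last, y_last = ds[199]  # raw RUL 1 -> passes through unchanged
print(y0.item(), y_last.item())  # 125.0 1.0
```

Because the clamp touches only stored targets, no gradient ever flows through it.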
```python
import torch

R_MAX = 125

def apply_rul_cap(rul: torch.Tensor, r_max: int = R_MAX) -> torch.Tensor:
    return torch.clamp(rul, max=r_max)


# Use it inside a Dataset
torch.manual_seed(0)
fake_raw_rul = torch.arange(200, 0, -1).float()  # 200, 199, ..., 1
capped = apply_rul_cap(fake_raw_rul)

print("min capped:", capped.min().item())    # 1.0
print("max capped:", capped.max().item())    # 125.0
print("first 5 :", capped[:5].tolist())      # [125.0, 125.0, 125.0, 125.0, 125.0]
print("around 75:", capped[73:78].tolist())  # [125.0, 125.0, 125.0, 124.0, 123.0]
```

Apply the cap inside Dataset.__getitem__ when computing y. NEVER on a tensor that is a function of model parameters: clamp at the boundary between data and model only.

Bounded Targets in Other Tasks
| Task | Why bound the target | Common bound |
|---|---|---|
| RUL (this book) | Early-life signal is uninformative | R_max = 125 |
| Recommendation (rating prediction) | Targets bounded in [1, 5] | Sigmoid-scaled output |
| Time-to-event prediction | Long-tail rare events dominate gradient | Truncate at 95th percentile |
| Stock price forecasting | Outliers blow up MSE | Quantile clipping |
| Detection: predicted bounding-box scale | Aspect-ratio range | Log-space target with cap |
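Two of the bounds in the table can be sketched in a few lines of NumPy. The data below is synthetic and the helper names `clip_at_percentile` and `scaled_sigmoid` are hypothetical; the 95th-percentile threshold and the [1, 5] rating range are taken from the table. In a real pipeline the percentile would presumably be estimated on the training split only.

```python
import numpy as np


def clip_at_percentile(y: np.ndarray, q: float = 95.0) -> np.ndarray:
    """Upper-clip targets at their q-th percentile (bound estimated from y itself)."""
    bound = np.percentile(y, q)
    return np.minimum(y, bound)


def scaled_sigmoid(z: np.ndarray, low: float = 1.0, high: float = 5.0) -> np.ndarray:
    """Squash unbounded model outputs z into the range between low and high."""
    return low + (high - low) / (1.0 + np.exp(-z))


# Synthetic long-tailed time-to-event targets.
rng = np.random.default_rng(0)
y = rng.exponential(scale=100.0, size=1000)
y_clipped = clip_at_percentile(y)
print(y.max() > y_clipped.max())  # True: the long tail has been truncated

# Rating-style output: a raw score of 0 maps to the midpoint of [1, 5].
print(scaled_sigmoid(np.array([0.0]))[0])  # 3.0
```

The first helper bounds the target, like the RUL cap; the second bounds the model output instead, which is the approach the table lists for rating prediction.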
Two Cap Pitfalls
Do not apply torch.clamp(rul, max=125) after the regression head: it kills the gradient wherever the cap binds. Train on the capped target and let the network learn to predict in [0, 125] organically.

The point. A piecewise-linear cap focuses the model's capacity on the regime that matters. R_max = 125 is the C-MAPSS convention; the cap is applied at the data boundary, not inside the model.
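The gradient problem described above can be checked numerically. This is a small sketch with made-up numbers; `pred` and `pred2` stand in for the output of a regression head and are not part of the book's model.

```python
import torch

R_MAX = 125.0

# Targets are already capped upstream, so both are <= R_MAX.
target = torch.tensor([100.0, 120.0])

# Variant 1: clamp the prediction before the loss (the pitfall).
pred = torch.tensor([90.0, 140.0], requires_grad=True)
loss_clamped = ((torch.clamp(pred, max=R_MAX) - target) ** 2).mean()
loss_clamped.backward()
grad_clamped = pred.grad.tolist()

# Variant 2: no clamp on the prediction.
pred2 = torch.tensor([90.0, 140.0], requires_grad=True)
loss_plain = ((pred2 - target) ** 2).mean()
loss_plain.backward()
grad_plain = pred2.grad.tolist()

print(grad_clamped)  # [-10.0, 0.0]  -> no learning signal for the saturated sample
print(grad_plain)    # [-10.0, 20.0] -> the over-prediction is pushed back down
```

In the clamped variant the second sample over-predicts by 20 cycles yet receives zero gradient, so the network is never told to correct it.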
Takeaway
- R_max = 125 is the standard. Cap raw RUL at this value; ramp linearly to 0 in the last 125 cycles.
- Two implementation lines. NumPy: np.minimum(rul, r_max). PyTorch: torch.clamp(rul, max=r_max).
- Apply at the data boundary. Inside Dataset.__getitem__. Never inside the model.
- Always evaluate on the capped target too. Otherwise your RMSE is inflated by early-life cycles.
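The last takeaway can be illustrated with a short NumPy sketch. It assumes an idealised, hypothetical model that predicts the capped target perfectly on the 200-cycle synthetic engine used earlier; the `rmse` helper is introduced here for illustration.

```python
import numpy as np

R_MAX = 125


def rmse(pred: np.ndarray, target: np.ndarray) -> float:
    """Root-mean-square error between two equally shaped arrays."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))


life = 200
raw_rul = life - np.arange(life)         # 200, 199, ..., 1
capped_rul = np.minimum(raw_rul, R_MAX)  # the target the model was trained on

pred = capped_rul.astype(float)          # idealised model: perfect on the capped target

print(rmse(pred, capped_rul))            # 0.0
print(round(rmse(pred, raw_rul), 2))     # 26.78
```

Scored against the raw target, even this perfect model appears to be off by roughly 27 cycles, all of it coming from the first 75 cycles where the cap binds.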