Chapter 7

Piecewise-Linear RUL Cap (R_max = 125)

Sequences, RUL Cap & Health Labels

The Fuel Meter Stays Full Until It Doesn't

A car's fuel meter does not display “413 km to empty” when the tank is fresh - it just shows a near-full bar. Only as the level drops below half does the gauge start moving in meaningful steps; below the warning threshold it animates more urgently. The intuition is the same with engine RUL: there is no useful signal in the difference between “200 cycles remaining” and “180 cycles remaining” - the engine is healthy in both cases. The model wastes capacity learning to discriminate within that flat zone.

The standard fix in C-MAPSS modelling, going back to Heimes 2008, is the piecewise-linear RUL cap at $R_{\max} = 125$. Below this threshold the target is linear in cycles to failure; above it, the target is flat at $R_{\max}$. The model trains on a target that emphasises the regime where the prediction matters.

The choice. R_max = 125 is conventional; values in [110, 150] perform similarly. Below 100 hurts because the cap flattens too much of the degradation ramp; above 150 wastes capacity on cycles where degradation is invisible.
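To make the trade-off concrete, here is a small sketch that counts how much of a trajectory each cap flattens. The engine lifetimes and the helper name flat_fraction are illustrative assumptions, not values from C-MAPSS.

```python
# Hypothetical engine lifetimes (in cycles), chosen only for illustration.
LIVES = (150, 200, 300)
R_MAX_CANDIDATES = (100, 125, 150)


def flat_fraction(life: int, r_max: int) -> float:
    """Fraction of cycles 0..life-1 whose raw RUL (life - t) exceeds r_max."""
    flat_cycles = max(0, life - r_max)  # cycles t with life - t > r_max
    return flat_cycles / life


for r_max in R_MAX_CANDIDATES:
    row = ", ".join(
        f"life={life}: {flat_fraction(life, r_max):.1%}" for life in LIVES
    )
    print(f"R_max={r_max:>3} -> flat share: {row}")
```

A lower cap flattens a larger share of every trajectory, while a higher cap leaves more early-life cycles on the linear ramp.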

The Piecewise-Linear Function

For an engine that fails at cycle $t_{\text{fail}}$, the capped RUL at cycle $t$ is

$$y_{\text{capped}}(t) = \min\bigl(R_{\max},\, t_{\text{fail}} - t\bigr).$$

Two regimes:

| Regime | Condition | Target |
| --- | --- | --- |
| Early life (cap binds) | t_fail - t > R_max | Constant R_max |
| Late life (cap released) | t_fail - t <= R_max | Linear: t_fail - t |

For our 200-cycle engine, the boundary is at cycle 75 (= 200 - 125). The first 75 cycles all share target 125; cycles 75-200 ramp linearly from 125 down to 0.
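The two regimes can be checked by hand with a scalar version of the formula. This is a minimal sketch; the function name capped_rul and the probe cycles are chosen here for illustration.

```python
R_MAX = 125


def capped_rul(t: int, t_fail: int, r_max: int = R_MAX) -> int:
    """Scalar form of y_capped(t) = min(R_max, t_fail - t)."""
    return min(r_max, t_fail - t)


t_fail = 200
boundary = t_fail - R_MAX  # cycle at which raw RUL equals R_max

# Probe the start, both sides of the boundary, and the failure cycle.
for t in (0, boundary - 1, boundary, boundary + 1, t_fail):
    regime = "flat" if t_fail - t > R_MAX else "linear"
    print(f"t={t:>3}  raw={t_fail - t:>3}  capped={capped_rul(t, t_fail):>3}  {regime}")
```

At the boundary cycle both branches of the min agree, which is why the target is continuous there.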

Python: Apply the Cap

np.minimum implements the piecewise-linear cap
🐍rul_cap_numpy.py
import numpy as np

NumPy is the standard numerical library in Python. It gives us ndarray (N-dimensional array), a fast, vectorised array type implemented in C. Every cap operation in this file (np.minimum, np.arange, broadcasting subtraction) runs in compiled C, not slow Python loops. We alias as 'np' by universal convention.

EXECUTION STATE
numpy = Library for numerical computing — provides ndarray, broadcasting, np.minimum, np.arange, slicing, and all vectorised math used in this snippet.
as np = Creates the alias 'np' so we can write np.minimum() instead of numpy.minimum(). Universal Python convention.
import pandas as pd

Pandas provides DataFrame and Series, labelled tabular data structures built on top of NumPy. Not strictly needed for this self-contained cap demo, but kept for consistency with the rest of the book where the C-MAPSS pipeline is fed by a pandas DataFrame indexed by (engine_id, cycle).

EXECUTION STATE
pandas = Tabular data library. In the wider book we use pd.read_csv() to load C-MAPSS, df.groupby('engine_id') to split per-engine, and df['RUL'] = ... to attach the capped target.
as pd = Standard alias so we write pd.DataFrame() instead of pandas.DataFrame().
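As a rough illustration of how the capped target might be attached to a DataFrame, the sketch below builds a tiny synthetic frame. The two engines, the RUL_raw column, and the convention that the last recorded cycle has RUL 0 are assumptions made for this example only.

```python
import numpy as np
import pandas as pd

R_MAX = 125

# Tiny synthetic frame: two hypothetical engines with lives of 130 and 4 cycles.
df = pd.DataFrame({
    "engine_id": [1] * 130 + [2] * 4,
    "cycle": list(range(1, 131)) + list(range(1, 5)),
})

# Raw RUL: cycles remaining until each engine's last recorded cycle
# (assumed here to be the failure cycle, so the last row has RUL 0).
last_cycle = df.groupby("engine_id")["cycle"].transform("max")
df["RUL_raw"] = last_cycle - df["cycle"]

# Capped target: element-wise min against R_MAX.
df["RUL"] = np.minimum(df["RUL_raw"], R_MAX)

print(df.groupby("engine_id")[["RUL_raw", "RUL"]].max())
```

The short-lived engine never reaches the cap, so its capped and raw targets are identical.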
R_MAX = 125

The cap threshold. This is the C-MAPSS convention introduced by Heimes (2008) and used by virtually every paper since. Empirically tested values 110–150 give nearly identical RMSE; below 100 starts to hurt because too much of the degradation ramp is flattened; above 150 introduces regression noise from early cycles where degradation is invisible.

EXECUTION STATE
R_MAX = 125 — the upper bound on the regression target. Any cycle further than 125 from failure is told 'healthy = 125', not its true distance.
→ why 125 specifically? = C-MAPSS engines run for 130–360 cycles. 125 is roughly the median life of the test set, so it splits each trajectory into a flat 'healthy' phase and a degrading 'late' phase of comparable length.
→ why a module-level constant? = Used both inside apply_rul_cap (default arg) and inside the demo (transition-zone arithmetic). Defining once at the top guarantees the train-time and eval-time caps stay in sync.
def apply_rul_cap(rul, r_max=R_MAX) → np.ndarray

Defines the function that implements y = min(R_max, t_fail − t) for an entire array of RUL values at once. This is the function you call inside Dataset.__getitem__ when you build the supervised target.

EXECUTION STATE
⬇ input: rul (np.ndarray) = Raw RUL values. Any shape — could be a 1D vector of one engine's trajectory or a 2D batch (engines × cycles). Each element is t_fail − t for that timestep.
→ rul example value = np.array([200, 199, 198, ..., 2, 1]) — shape (200,) for one engine that fails at cycle 200.
⬇ input: r_max (int, default = R_MAX = 125) = Cap threshold. Defaults to the module-level R_MAX so callers usually omit it; pass explicitly only when ablating the cap value.
→ r_max purpose = Sets the elbow of the piecewise function. Below the elbow the target equals the raw RUL; above it the target is flattened to r_max.
→ return type hint: np.ndarray = Tells type checkers and readers that the function preserves the NumPy array type — not a Python list, not a scalar.
⬆ returns = np.ndarray, same shape as input rul. Each element ≤ r_max. dtype is preserved (int in, int out; float in, float out).
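The shape and dtype claims above are easy to check on a small batch. A minimal sketch, with a made-up 2×3 batch standing in for (engines × cycles):

```python
import numpy as np


def apply_rul_cap(rul, r_max=125):
    """Cap RUL at r_max (same one-liner as in the walkthrough)."""
    return np.minimum(rul, r_max)


# Made-up 2D batch of raw RUL values: shape (2, 3), integer dtype.
batch = np.array([[200, 126, 125], [124, 50, 1]])

capped_int = apply_rul_cap(batch)
capped_float = apply_rul_cap(batch.astype(np.float32))

print(capped_int.shape, capped_int.dtype)
print(capped_float.shape, capped_float.dtype)
print(capped_int.tolist())
```

This holds for an integer r_max; passing a float cap to an integer array would promote the result to a float dtype.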
Docstring: """Cap RUL at r_max. Implements y = min(r_max, t_fail − t)."""

Single-line docstring. Mirrors the math from the section above so readers of the implementation immediately see which equation is being implemented. Triple double-quotes are PEP 257's recommendation even for one-line docstrings, because the same prefix scales to multi-line.

return np.minimum(rul, r_max)

The whole cap is one element-wise min. Where raw rul ≤ r_max, np.minimum returns the raw value (the linear regime). Where raw rul > r_max, it returns r_max (the flat regime). The 'piecewise' nature is implicit in the min.

EXECUTION STATE
📚 np.minimum(x1, x2) = NumPy ufunc: returns the element-wise minimum of two arrays (or array + scalar via broadcasting). NOT the same as np.min(): np.min reduces an array to a single value along an axis; np.minimum compares two arrays elementwise and preserves shape.
⬇ arg 1: rul = The raw RUL ndarray (e.g. [200, 199, ..., 1]).
⬇ arg 2: r_max = Python int 125. NumPy broadcasts the scalar against rul, so np.minimum compares every element of rul against 125.
→ mini example = np.minimum(np.array([200, 130, 125, 100, 1]), 125) → array([125, 125, 125, 100, 1])
→ np.minimum vs np.min = np.min(np.array([3, 1, 4])) → 1 (reduces to scalar) np.minimum(np.array([3, 1, 4]), 2) → [2, 1, 2] (elementwise)
→ why not np.clip? = np.clip(rul, 0, r_max) is equivalent here, but conveys 'two-sided clipping'. We only need the upper bound, so np.minimum reads more truthfully.
⬆ return = Same-shape ndarray; values ≤ r_max element by element.
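The relationship between np.minimum and np.clip can be seen on a handful of values. In this sketch the trailing -3 is a deliberately invalid RUL, included only to show where the two-sided clip behaves differently.

```python
import numpy as np

# The trailing -3 is a deliberately invalid RUL used to expose the difference.
rul = np.array([200, 130, 125, 100, 1, -3])

upper_only = np.minimum(rul, 125)     # upper bound only
clip_upper = np.clip(rul, None, 125)  # np.clip with no lower bound
clip_both = np.clip(rul, 0, 125)      # two-sided clip

print(upper_only.tolist())
print(clip_upper.tolist())
print(clip_both.tolist())
```

On non-negative inputs all three agree; only the two-sided clip silently rewrites the negative value, which could hide an upstream labelling bug.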
# ----- Demo on one synthetic engine -----

Comment header marking the start of a small standalone demo. We will fabricate one engine that fails at cycle 200 so we can inspect the cap behaviour without loading the C-MAPSS CSV.

life = 200

Total lifespan of our synthetic engine — it dies at cycle 200. With R_MAX = 125 this means the cap will bind for the first 75 cycles (200 − 125 = 75) and release for the remaining 125.

EXECUTION STATE
life = 200 (Python int) — the cycle at which our fake engine fails.
→ cap-release boundary = life − R_MAX = 200 − 125 = 75. From cycle 75 onward the linear regime takes over.
cycles = np.arange(life)

Generates the time index 0, 1, 2, …, 199 — every cycle the engine has run. np.arange is NumPy's vectorised version of Python's range().

EXECUTION STATE
📚 np.arange(stop) = Returns an ndarray of evenly spaced integers in the half-open interval [0, stop). Single-arg form starts at 0 and steps by 1. Three-arg form is np.arange(start, stop, step).
⬇ arg: life = 200 = Stop value (exclusive). Produces 200 elements: 0, 1, 2, …, 199.
⬆ result: cycles (shape (200,)) = array([ 0, 1, 2, ..., 197, 198, 199]) dtype=int64
→ mini example = np.arange(5) → array([0, 1, 2, 3, 4]) np.arange(2, 8, 2) → array([2, 4, 6])
raw_rul = life - cycles

Vectorised broadcast subtraction: scalar (200) minus ndarray (cycles) returns an ndarray where each element is 200 − cycle. This is the raw distance-to-failure target — the un-capped ground truth.

EXECUTION STATE
life (scalar) = 200 — broadcast to every position
cycles (shape (200,)) = [0, 1, 2, ..., 198, 199]
→ broadcasting rule = scalar − ndarray applies the operation element-wise. life − cycles[i] for every i. No Python loop runs; NumPy issues one C-level pass.
⬆ raw_rul (shape (200,)) = array([200, 199, 198, ..., 2, 1])
→ first 5 values = raw_rul[:5] = [200, 199, 198, 197, 196]
→ last 5 values = raw_rul[-5:] = [5, 4, 3, 2, 1]
capped_rul = apply_rul_cap(raw_rul)

Calls our function. r_max defaults to R_MAX = 125, so np.minimum(raw_rul, 125) runs internally. The first 75 elements (where raw_rul > 125) collapse to 125; the last 125 elements (where raw_rul ≤ 125) pass through unchanged.

EXECUTION STATE
⬇ raw_rul[:5] = [200, 199, 198, 197, 196] — all above 125
⬇ raw_rul[73:78] = [127, 126, 125, 124, 123] — straddles the elbow
⬇ raw_rul[-5:] = [5, 4, 3, 2, 1] — well below 125
⬆ capped_rul[:5] = [125, 125, 125, 125, 125] — clipped
⬆ capped_rul[73:78] = [125, 125, 125, 124, 123] — elbow at index 75
⬆ capped_rul[-5:] = [5, 4, 3, 2, 1] — unchanged (already ≤ 125)
→ invariant = capped_rul.max() == 125 and capped_rul.min() == 1.
print("first 5 cycles:")

Header line for the first chunk of demo output. Pure side effect; no return value.

EXECUTION STATE
Output = first 5 cycles:
print(f" raw : {raw_rul[:5].tolist()}")

f-string interpolates the slice raw_rul[:5] (first 5 elements) and converts it to a Python list with .tolist() so the output reads like a familiar list literal instead of NumPy's ndarray repr.

EXECUTION STATE
📚 .tolist() = ndarray method — converts the array to nested Python lists. Used here purely for printing readability: list repr is [200, 199, ...] vs ndarray repr array([200, 199, ...]).
⬇ raw_rul[:5] = Slice — first five elements. Slicing returns a view, not a copy.
Output = raw : [200, 199, 198, 197, 196]
print(f" capped : {capped_rul[:5].tolist()}")

Same slice, but on the capped array. Every value is 125 — proof the cap is binding for the first 75 cycles where raw RUL exceeds R_MAX.

EXECUTION STATE
Output = capped : [125, 125, 125, 125, 125]
→ why 125 here? = raw_rul[:5] = [200, 199, 198, 197, 196], all > 125, so np.minimum picks 125 for every position.
→ pedagogical point = The model is told 'engine is healthy' (target = 125) instead of 'exactly 200 cycles to go'. Wasted capacity recovered.
print()

Empty print — emits a blank line. Used purely for visual separation between the two demo blocks.

EXECUTION STATE
Output = (blank line)
print("transition zone around cycle", life - R_MAX, ":")

print accepts multiple positional args and joins them with the default sep=' '. life − R_MAX = 200 − 125 = 75 — the exact cycle where the cap releases.

EXECUTION STATE
→ arithmetic in print() = Each comma-separated expression is evaluated, then str()-ified, then joined with ' '. So 'cycle' + ' ' + str(75) + ' ' + ':'.
Output = transition zone around cycle 75 :
print(f" cycles {life - R_MAX - 2}-{life - R_MAX + 2}:")

Sub-header. Computes the inclusive range 73 to 77 — two cycles before and after the elbow at 75 — so we can see the cap releasing in real time.

EXECUTION STATE
→ 75 − 2 = 73 — two cycles before the elbow (cap still binding)
→ 75 + 2 = 77 — two cycles after the elbow (cap released)
Output = cycles 73-77:
print(f" raw : {raw_rul[life - R_MAX - 2:life - R_MAX + 3].tolist()}")

Slice raw_rul[73:78] — Python slicing is half-open, so the +3 gives us 5 elements (indices 73, 74, 75, 76, 77). Shows the raw target marching linearly through the elbow value 125.

EXECUTION STATE
→ slice arithmetic = start = life − R_MAX − 2 = 73, stop = life − R_MAX + 3 = 78. Length = 78 − 73 = 5.
→ why +3 not +2 on stop? = Python slice end is exclusive, so to include index 77 we write 78.
→ values at each index = raw_rul[73]=127, raw_rul[74]=126, raw_rul[75]=125, raw_rul[76]=124, raw_rul[77]=123
Output = raw : [127, 126, 125, 124, 123]
print(f" capped : {capped_rul[life - R_MAX - 2:life - R_MAX + 3].tolist()}")

Same window on the capped array. Indices 73 and 74 are still flattened to 125 (raw was 127 and 126, both > R_MAX). Index 75 is exactly 125 in both arrays — the boundary. Indices 76 and 77 pass through (124 and 123 ≤ 125).

EXECUTION STATE
Output = capped : [125, 125, 125, 124, 123]
→ cycle 73 (raw=127) = min(125, 127) = 125 — cap binds
→ cycle 74 (raw=126) = min(125, 126) = 125 — cap binds
→ cycle 75 (raw=125) = min(125, 125) = 125 — exactly at the elbow; both regimes agree
→ cycle 76 (raw=124) = min(125, 124) = 124 — cap releases; linear regime
→ cycle 77 (raw=123) = min(125, 123) = 123 — linear regime
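Rather than reading the elbow off a printed window, it can also be located programmatically. A small sketch under the same synthetic setup (life = 200, R_MAX = 125); the variable names elbow and n_flat are introduced here for illustration.

```python
import numpy as np

R_MAX = 125
life = 200

raw_rul = life - np.arange(life)         # 200, 199, ..., 1
capped_rul = np.minimum(raw_rul, R_MAX)

# Indices where the cap leaves the value untouched (linear regime).
released = np.flatnonzero(capped_rul == raw_rul)
elbow = int(released[0])                 # first index of the linear regime

# Number of cycles where the cap actually changes the target.
n_flat = int((raw_rul > R_MAX).sum())

print("elbow index:", elbow)
print("flattened cycles:", n_flat)
```

A check like this is a cheap assertion to keep in a data-pipeline test, since an off-by-one in the RUL computation shifts the elbow.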
import numpy as np
import pandas as pd  # unused in this demo; kept for consistency with the wider pipeline

R_MAX = 125

def apply_rul_cap(rul: np.ndarray, r_max: int = R_MAX) -> np.ndarray:
    """Cap RUL at r_max. Implements y = min(r_max, t_fail - t)."""
    return np.minimum(rul, r_max)


# ----- Demo on one synthetic engine -----
life = 200
cycles = np.arange(life)              # 0..199
raw_rul = life - cycles               # 200, 199, ..., 1
capped_rul = apply_rul_cap(raw_rul)

print("first 5 cycles:")
print(f"  raw    : {raw_rul[:5].tolist()}")
print(f"  capped : {capped_rul[:5].tolist()}")
print()
print("transition zone around cycle", life - R_MAX, ":")
print(f"  cycles {life - R_MAX - 2}-{life - R_MAX + 2}:")
print(f"    raw    : {raw_rul[life - R_MAX - 2:life - R_MAX + 3].tolist()}")
print(f"    capped : {capped_rul[life - R_MAX - 2:life - R_MAX + 3].tolist()}")

# Expected output:
# first 5 cycles:
#   raw    : [200, 199, 198, 197, 196]
#   capped : [125, 125, 125, 125, 125]    <- cap holds for first 75 cycles
#
# transition zone around cycle 75 :
#   cycles 73-77:
#     raw    : [127, 126, 125, 124, 123]
#     capped : [125, 125, 125, 124, 123]   <- cap releases at cycle 75

PyTorch: Cap as a Tensor Op

torch.clamp(max=R_max) is the one-line equivalent
🐍rul_cap_torch.py
import torch

Top-level PyTorch package. Provides torch.Tensor (the GPU-aware n-dimensional array), torch.clamp (the cap operator we use here), torch.arange (tensor version of np.arange), and the autograd engine. Once we move from the NumPy demo to the actual training Dataset, the cap must operate on tensors so the result lands directly on the right device with the right dtype.

EXECUTION STATE
torch = Provides Tensor, autograd, optimizers, nn.Module, and CUDA bindings. We only use the Tensor + clamp + arange surface in this snippet.
→ why tensors not ndarrays? = Inside Dataset.__getitem__ we want the target ready as a torch.Tensor so the DataLoader can stack and pin-memory it without an extra np→torch conversion per sample.
R_MAX = 125

Same cap value as the NumPy version. Defined as a Python int (not a tensor) so it works equally well as a default argument and inside scalar arithmetic. torch.clamp's max= argument accepts a Python scalar directly, so no conversion is needed.

EXECUTION STATE
R_MAX = 125 — Python int. Identical to the NumPy file so train and eval use exactly the same cap.
def apply_rul_cap(rul, r_max=R_MAX) → torch.Tensor

Tensor-flavoured version of the cap. Same signature, same semantics — only the input/output types change from ndarray to torch.Tensor. This is the function you actually call in production training because it keeps the data on whatever device the rest of the pipeline expects.

EXECUTION STATE
⬇ input: rul (torch.Tensor) = The raw RUL tensor. Any shape and any real-valued numeric dtype (typically float32 or int64). Lives on CPU here, but the function works unchanged on CUDA: torch.clamp is device-agnostic.
→ typical shape = (seq_len,) for one window, or (batch_size, seq_len) for a batch from the DataLoader.
⬇ input: r_max (int, default = R_MAX = 125) = Cap threshold. Python int accepted directly by torch.clamp.
→ return type hint: torch.Tensor = Same dtype, same device, same shape as input rul. The cap has a kink at r_max but is differentiable everywhere else, and autograd handles it directly: gradient is 1 where the cap is inactive, 0 where it binds.
⬆ returns = torch.Tensor — values ≤ r_max element by element.
return torch.clamp(rul, max=r_max)

One-line cap as a tensor op. torch.clamp clips a tensor into [min, max]. Passing only max= gives upper-only clipping, which is exactly our piecewise-linear cap. The operation is element-wise and differentiable everywhere except at the kink, with gradient zero on the saturated side.

EXECUTION STATE
📚 torch.clamp(input, min=None, max=None) = PyTorch elementwise op. clamp(x) = min(max_v, max(min_v, x)). When min is None it leaves the lower side untouched; when max is None the upper side is untouched. Available as torch.clamp(t, ...) or t.clamp(...).
⬇ arg 1: rul (input tensor) = Positional. The tensor whose values we want to bound.
⬇ arg 2: max = r_max = Keyword. Upper bound. Any element greater than r_max is replaced by r_max; elements ≤ r_max pass through.
→ why no min=? = Raw RUL is already ≥ 0 by construction (t ≤ t_fail), so a lower clip is redundant. Adding min=0 would be harmless but uninformative.
→ equivalent forms = torch.clamp(rul, max=r_max) ≡ torch.minimum(rul, torch.tensor(r_max)) ≡ rul.clamp(max=r_max) ≡ rul.clamp_max(r_max)
→ gradient behaviour = Where rul ≤ r_max: ∂clamp/∂rul = 1 Where rul > r_max: ∂clamp/∂rul = 0 This is fine for a TARGET (we don't backprop through targets). It would be a problem only if you put clamp inside the model output — see Pitfall 2 below.
→ mini example = torch.clamp(torch.tensor([200., 130., 125., 100., 1.]), max=125.) → tensor([125., 125., 125., 100., 1.])
⬆ return = Same-shape, same-dtype, same-device tensor; values ≤ r_max.
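The gradient behaviour described above can be observed directly with autograd. A minimal sketch; the four input values are arbitrary and deliberately avoid the exact boundary value 125.

```python
import torch

# Two values above the cap, two below; requires_grad so autograd tracks them.
x = torch.tensor([200.0, 126.0, 124.0, 1.0], requires_grad=True)

y = torch.clamp(x, max=125)
y.sum().backward()  # d(sum)/dx_i equals the local derivative of clamp at x_i

print("clamped :", y.tolist())
print("gradient:", x.grad.tolist())
```

The zeros in the gradient are harmless when the clamped tensor is a target, and are the reason a clamp on the model output stalls learning wherever it saturates.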
# Use it inside a Dataset

Comment marking that the production call site is Dataset.__getitem__ — the data boundary. Capping there means every batch the model ever sees has a target already in [0, R_MAX]. The model never has to see, fit, or backprop through raw RUL.

torch.manual_seed(0)

Seeds the global PyTorch RNG so any randomness later in the script is deterministic. This snippet does not actually use randomness, but seeding is included as a hygiene reminder: every Dataset that adds noise, dropout, or random crops needs a deterministic seed for reproducible runs.

EXECUTION STATE
📚 torch.manual_seed(seed) = Sets the seed for generating random numbers on all devices (CPU and CUDA) and returns a torch.Generator. NumPy and Python's random module keep their own RNG state, so for full reproducibility also call numpy.random.seed(seed) and random.seed(seed).
⬇ arg: 0 = Any integer works; 0 is conventional for 'first run / sanity check'.
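Although the cap demo itself is deterministic, the effect of seeding is easy to see in isolation. A short sketch using torch.rand as a stand-in for any random augmentation:

```python
import torch

torch.manual_seed(0)
a = torch.rand(3)  # first draw after seeding

torch.manual_seed(0)
b = torch.rand(3)  # same seed, so the same first draw
c = torch.rand(3)  # next draw from the same stream, different values

print(torch.equal(a, b))
print(torch.equal(b, c))
```

Re-seeding resets the stream, so any two runs that seed identically and draw in the same order see the same numbers.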
fake_raw_rul = torch.arange(200, 0, -1).float()

Builds a synthetic raw-RUL tensor [200., 199., …, 1.] in two stages. torch.arange generates the integer sequence; .float() casts it from int64 to float32, the standard regression dtype.

EXECUTION STATE
📚 torch.arange(start, end, step) = PyTorch tensor version of np.arange. Returns a 1D tensor of evenly spaced values from start to end (exclusive), stepping by step. Negative step lets you count down.
⬇ arg 1: start = 200 = First element of the sequence (inclusive).
⬇ arg 2: end = 0 = End of the sequence (exclusive). Counting stops at 1 because 0 is excluded.
⬇ arg 3: step = -1 = Negative step → counting down. Each successive element decreases by 1.
→ result before .float() = tensor([200, 199, 198, ..., 2, 1]) shape (200,) dtype=int64
📚 .float() = Tensor method: shorthand for .to(torch.float32). Casts to 32-bit floating point. Equivalent: .to(dtype=torch.float32).
→ why cast to float? = MSE loss and regression heads expect float targets. Mixing int targets with float predictions raises a dtype error.
⬆ fake_raw_rul (shape (200,), float32) = tensor([200., 199., 198., ..., 2., 1.])
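The dtype change from the cast can be confirmed directly. In this sketch the constant prediction of 100.0 is a placeholder, used only to show that a float target feeds into an MSE loss without any further conversion.

```python
import torch
import torch.nn.functional as F

ints = torch.arange(200, 0, -1)  # integer sequence 200, 199, ..., 1
floats = ints.float()            # cast to float32 for regression

# Placeholder prediction: a constant 100.0 for every cycle.
pred = torch.full_like(floats, 100.0)
loss = F.mse_loss(pred, floats)

print(ints.dtype, floats.dtype, loss.dtype)
```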
capped = apply_rul_cap(fake_raw_rul)

Calls the cap. r_max defaults to R_MAX = 125, so torch.clamp(fake_raw_rul, max=125) runs internally. The result is a fresh tensor; the original fake_raw_rul is left untouched (clamp is non-mutating; clamp_ would be the in-place version).

EXECUTION STATE
⬇ fake_raw_rul[:5] = tensor([200., 199., 198., 197., 196.])
⬇ fake_raw_rul[73:78] = tensor([127., 126., 125., 124., 123.])
⬆ capped[:5] = tensor([125., 125., 125., 125., 125.])
⬆ capped[73:78] = tensor([125., 125., 125., 124., 123.])
⬆ capped.shape = torch.Size([200]) — same shape as input.
⬆ capped.dtype = torch.float32 — preserved.
print("min capped:", capped.min().item())

Sanity check on the lower side. capped.min() returns a 0-dim tensor; .item() converts it to a plain Python float for printing. The minimum is 1.0 because our last cycle has raw_rul = 1, well below R_MAX, so it passes through unchanged.

EXECUTION STATE
📚 .min() = Tensor method: returns the minimum value along the specified dim. Without args returns the global min as a scalar tensor.
📚 .item() = Tensor method: pulls the single value out of a 0-dim or 1-element tensor as a Python number. Errors if the tensor has more than one element. Used only for printing/logging — never inside a hot loop.
→ why .item()? = Without it, print would emit 'tensor(1.)' instead of '1.0'. .item() unwraps the value.
Output = min capped: 1.0
print("max capped:", capped.max().item())

Sanity check on the upper side. The maximum is exactly 125.0 — proof that the cap is binding for every cycle where raw_rul exceeded R_MAX. If we ever saw a value > 125 here, the cap would be silently broken.

EXECUTION STATE
📚 .max() = Tensor method: returns the maximum value along the specified dim. Without args returns the global max as a scalar tensor. Mirror of .min().
→ why exactly 125 and not 124.9? = torch.clamp replaces any value > 125 with the literal r_max (125) — not with a near-by value. The result is identically 125.0, bit for bit.
Output = max capped: 125.0
print("first 5 :", capped[:5].tolist())

Print the first five elements as a Python list. capped[:5] is a length-5 view; .tolist() materialises it into [125.0, 125.0, ...] for clean printing. All five values are 125 because raw_rul[:5] = [200, 199, 198, 197, 196], all above the cap.

EXECUTION STATE
📚 .tolist() = Tensor method: copy the tensor to a (nested) Python list of Python numbers. Useful for printing and JSON serialisation; the resulting list is detached from the tensor.
⬇ capped[:5] = Slice — first five elements (a view, not a copy).
Output = first 5 : [125.0, 125.0, 125.0, 125.0, 125.0]
print("around 75:", capped[73:78].tolist())

Five elements straddling the elbow: indices 73, 74, 75, 76, 77. Identical pattern to the NumPy version: indices 73 and 74 are clamped to 125 (cap binds), index 75 is exactly at the boundary (both regimes agree), and indices 76 and 77 fall through to the linear regime.

EXECUTION STATE
→ element-by-element = capped[73] = 125. (raw 127 → clamped) capped[74] = 125. (raw 126 → clamped) capped[75] = 125. (raw 125 → exact match) capped[76] = 124. (raw 124 → unchanged) capped[77] = 123. (raw 123 → unchanged)
Output = around 75: [125.0, 125.0, 125.0, 124.0, 123.0]
→ boundary index = 75 is the unique cycle where raw_rul == R_MAX. From this index onward 'capped' equals 'raw'.
import torch

R_MAX = 125

def apply_rul_cap(rul: torch.Tensor, r_max: int = R_MAX) -> torch.Tensor:
    return torch.clamp(rul, max=r_max)


# Use it inside a Dataset
torch.manual_seed(0)
fake_raw_rul = torch.arange(200, 0, -1).float()    # 200, 199, ..., 1
capped = apply_rul_cap(fake_raw_rul)

print("min capped:", capped.min().item())   # 1.0
print("max capped:", capped.max().item())   # 125.0
print("first 5  :", capped[:5].tolist())     # [125.0, 125.0, 125.0, 125.0, 125.0]
print("around 75:", capped[73:78].tolist())  # [125.0, 125.0, 125.0, 124.0, 123.0]

Where to apply it. Inside the Dataset's __getitem__ when computing y. NEVER on a tensor that is a function of model parameters - clamp at the boundary between data and model only.
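One way this could look in practice is sketched below. The class name CappedRulDataset, the zero-filled feature tensor, and the batch size are hypothetical placeholders; only the placement of the clamp inside __getitem__ reflects the advice above.

```python
import torch
from torch.utils.data import DataLoader, Dataset

R_MAX = 125


class CappedRulDataset(Dataset):
    """Hypothetical dataset that caps the target at the data boundary."""

    def __init__(self, features: torch.Tensor, raw_rul: torch.Tensor,
                 r_max: int = R_MAX):
        self.features = features
        self.raw_rul = raw_rul
        self.r_max = r_max

    def __len__(self) -> int:
        return len(self.raw_rul)

    def __getitem__(self, idx: int):
        x = self.features[idx]
        # The cap is applied to the label only, never to a model output.
        y = torch.clamp(self.raw_rul[idx], max=self.r_max)
        return x, y


# Placeholder inputs: 200 cycles, 4 made-up sensor channels filled with zeros.
features = torch.zeros(200, 4)
raw_rul = torch.arange(200, 0, -1).float()

dataset = CappedRulDataset(features, raw_rul)
loader = DataLoader(dataset, batch_size=50, shuffle=False)

# Every target the model would see is already within [0, R_MAX].
max_target = max(y.max().item() for _, y in loader)
print("largest target in any batch:", max_target)
```

Keeping r_max as a constructor argument makes it straightforward to build the train and evaluation datasets with the same cap.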

Bounded Targets in Other Tasks

| Task | Why bound the target | Common bound |
| --- | --- | --- |
| RUL (this book) | Early-life signal is uninformative | R_max = 125 |
| Recommendation (rating prediction) | Targets bounded in [1, 5] | Sigmoid-scaled output |
| Time-to-event prediction | Long-tail rare events dominate gradient | Truncate at 95th percentile |
| Stock price forecasting | Outliers blow up MSE | Quantile clipping |
| Detection: predicted bounding-box scale | Aspect-ratio range | Log-space target with cap |
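The percentile-based bound from the time-to-event row uses the same np.minimum pattern, with a data-driven cap in place of a fixed one. A sketch on synthetic data; the exponential distribution and its scale are assumptions chosen only to produce a long tail.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic long-tailed targets (not real time-to-event data).
durations = rng.exponential(scale=100.0, size=1000)

# Data-driven cap: the 95th percentile of the training targets.
cap = np.percentile(durations, 95)
truncated = np.minimum(durations, cap)

n_capped = int((durations > cap).sum())
print(f"cap = {cap:.1f}, values truncated: {n_capped} of {durations.size}")
```

Unlike R_max = 125, a percentile cap depends on the data, so it would need to be computed on the training split and reused unchanged at evaluation time.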

Two Cap Pitfalls

Pitfall 1: Forgetting to cap test labels. If you compute MSE on RAW test RUL (not capped), your reported RMSE is artificially inflated by the early-life cycles where you predicted 125 and the “truth” was 200. Always evaluate on the same target you trained on - including the cap.
Pitfall 2: Cap inside the model. Tempting to add a final torch.clamp(rul, max=125) after the regression head. Don't - it kills the gradient when the cap binds. Train on the capped target and let the network learn to predict in [0, 125] organically.
The point. A piecewise-linear cap focuses the model's capacity on the regime that matters. R_max = 125 is the C-MAPSS convention; the cap is applied at the data boundary, not inside the model.
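Pitfall 1 can be quantified on the synthetic 200-cycle engine. The sketch assumes an idealised predictor that reproduces the capped target exactly, so any error it shows against raw labels comes purely from the label mismatch.

```python
import numpy as np

R_MAX = 125
life = 200

raw_rul = (life - np.arange(life)).astype(float)  # 200, 199, ..., 1
capped_rul = np.minimum(raw_rul, R_MAX)

# Idealised predictor: outputs the capped target exactly.
pred = capped_rul.copy()


def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root-mean-square error between two equal-length arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


rmse_capped = rmse(capped_rul, pred)  # scored against the training target
rmse_raw = rmse(raw_rul, pred)        # scored against uncapped labels

print(f"RMSE vs capped labels: {rmse_capped:.2f}")
print(f"RMSE vs raw labels   : {rmse_raw:.2f}")
```

Even a perfect model on the capped target shows an RMSE of roughly 26.8 cycles against raw labels here, all of it from the 75 early-life cycles.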

Takeaway

  • R_max = 125 is the standard. Cap raw RUL at this value; ramp linearly to 0 in the last 125 cycles.
  • Two implementation lines. NumPy: np.minimum(rul, r_max). PyTorch: torch.clamp(rul, max=r_max).
  • Apply at the data boundary. Inside Dataset __getitem__. Never inside the model.
  • Always evaluate on the capped target too. Otherwise your RMSE is inflated by early-life cycles.