The Fuel Gauge for Machines
Every car has a gauge that says “42 miles to empty.” You do not get a stream of fuel-tank pressures, fuel-pump currents, and ECU error codes — the gauge has already collapsed all of that into one number you can act on. Predictive maintenance is the same idea, applied to anything that wears out: squeeze the multi-dimensional sensor history into a single scalar — Remaining Useful Life — that a maintenance scheduler can actually use.
That is why nearly every paper, dashboard, and commercial diagnostic product in this space ultimately reports one number: in cycles, hours, or kilometres remaining. It is the actionable statistic. The model can be a transformer, a CNN, an LSTM, or three of them stacked — the output that the maintenance crew sees is one number.
RUL, Formally
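Let $\mathbf{x}_t \in \mathbb{R}^{d}$ be the vector of sensor readings at cycle $t$, where $d$ is the number of sensors (14 for C-MAPSS, 20 for N-CMAPSS DS02). A run-to-failure trajectory is the sequence $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_T$, where $T$ is the failure cycle.
At any moment $t \le T$, the true remaining useful life is the trivially obvious quantity
$$\mathrm{RUL}(t) = T - t.$$
The prediction problem is to estimate $\mathrm{RUL}(t)$ from a window of past readings only; the model never gets to peek at $T$. Concretely, given a window of the last $W$ cycles
$$X_t = [\mathbf{x}_{t-W+1}, \dots, \mathbf{x}_t] \in \mathbb{R}^{W \times d},$$
we want a function $f_\theta$ (a neural network with parameters $\theta$) such that
$$f_\theta(X_t) \approx \mathrm{RUL}(t).$$
That is the entire problem statement. The next 28 chapters are how to choose $f_\theta$ well, what training objective to put on $f_\theta(X_t)$, and how to get good values for $\theta$ from data.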
Interactive: One Sample, Up Close
The diagram below is a single engine that fails at cycle 200. Five sensors are drifting away from their healthy baselines on physically-motivated curves (vibration up, oil pressure down, exhaust temperature up, and so on). The orange band is the model's 30-cycle input window; the green panel is the scalar RUL the model is asked to predict.
Drag the cursor and watch how a single engine generates many training samples: each cursor position is one row of the supervised dataset. Drag the failure cycle down to 100 and you simulate a short-lived engine; drag the window size down to 5 and the model has almost no context to work with. By Chapter 7 we will commit to specific values for C-MAPSS; this is just where those numbers come from.
Where Do RUL Labels Come From?
For supervised learning we need ground-truth RUL labels on the training data. There is exactly one way to obtain them: run the engine to failure and timestamp the moment it dies. Once you know the failure cycle $T$, every earlier cycle $t$ gets its label $T - t$ for free.
That single-sentence procedure has a consequence that haunts every working prognostic dataset: most engines never run to failure. They get pulled for unrelated reasons (lease return, fleet retirement, regulatory inspection), and you have a sensor history but no label. NASA C-MAPSS dodges this by being a simulation: every trajectory in FD001-FD004 is run to failure by construction. Real-world prognostic projects almost always have to confront the “censoring” problem — we will return to it at the end of this section.
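The labelling rule above fits in a few lines. This is a minimal sketch, assuming cycles are numbered from 1 and the last recorded cycle is the failure cycle; `rul_labels` is an illustrative name, not one taken from the book's code.

```python
import numpy as np

def rul_labels(failure_cycle: int) -> np.ndarray:
    """Return RUL(t) = T - t for every cycle t = 1..T of one engine.

    Assumes a run-to-failure trajectory whose last cycle T is the failure.
    """
    cycles = np.arange(1, failure_cycle + 1)   # t = 1, 2, ..., T
    return failure_cycle - cycles              # T - t, ending at 0

labels = rul_labels(200)
print(labels[:3], labels[-3:])  # [199 198 197] [2 1 0]
```

The label reaches 0 at the failure cycle itself. An engine without an observed failure has no value of T to plug in, which is why it cannot be labelled this way.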
Python: Build (X, y) Pairs From One Engine Run
Before any neural network, the labelling procedure is forty lines of NumPy. We simulate one run-to-failure trajectory, slide a 30-cycle window across it, and emit one (input, target) pair per cursor position — 171 pairs from a single engine with 200 cycles of life.
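A minimal sketch of that procedure follows. The drift curves in `simulate_engine` are hypothetical stand-ins (quadratic wear plus Gaussian noise), not the book's physical model. `build_pairs` borrows the name used in the text, but its signature here is an assumption.

```python
import numpy as np

def simulate_engine(failure_cycle=200, n_sensors=5, seed=0):
    """Hypothetical run-to-failure trajectory, shape (T, n_sensors).

    Each sensor drifts away from a healthy baseline of 0 along a
    quadratic curve, alternating up/down, with Gaussian noise on top.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(1, failure_cycle + 1)                 # cycles 1..T
    wear = (t / failure_cycle) ** 2                     # grows to 1 at failure
    direction = np.where(np.arange(n_sensors) % 2 == 0, 1.0, -1.0)
    noise = 0.05 * rng.standard_normal((failure_cycle, n_sensors))
    return wear[:, None] * direction[None, :] + noise

def build_pairs(trajectory, window=30):
    """Slide a window over one trajectory.

    Returns X with shape (N, window, d) and y with shape (N,),
    where N = T - window + 1 and y holds RUL = T - t at each window end.
    """
    T = trajectory.shape[0]
    if T < window:
        raise ValueError("trajectory is shorter than the window")
    ends = np.arange(window, T + 1)                     # cycle t ending each window
    X = np.stack([trajectory[e - window:e] for e in ends])
    y = T - ends                                        # remaining cycles at t
    return X.astype(np.float32), y.astype(np.float32)

trajectory = simulate_engine(failure_cycle=200, n_sensors=5)
X, y = build_pairs(trajectory, window=30)
print(X.shape, y.shape)  # (171, 30, 5) (171,)
```

The first window ends at cycle 30 with a target of 170, and the last ends at cycle 200 with a target of 0, which gives the 200 - 30 + 1 = 171 pairs.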
From one engine to a fleet
Real C-MAPSS gives you 100 engines in FD001 and 260 in FD002. The pattern stays identical — loop over engines, call build_pairs for each, concatenate the results. We will do exactly that in Chapter 7 once we have a real PyTorch Dataset.
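The fleet loop can be sketched as below. The three lifetimes and the random sensor values are placeholders rather than C-MAPSS data, and `build_pairs` is repeated in a compact, assumed form so the block runs on its own.

```python
import numpy as np

def build_pairs(trajectory, window=30):
    """Sliding-window (X, y) pairs for one run-to-failure trajectory."""
    T = trajectory.shape[0]
    ends = np.arange(window, T + 1)                     # cycle ending each window
    X = np.stack([trajectory[e - window:e] for e in ends])
    return X.astype(np.float32), (T - ends).astype(np.float32)

# Placeholder fleet: three engines with different lifetimes, 14 sensors each.
rng = np.random.default_rng(0)
lifetimes = [200, 150, 120]
fleet = [rng.standard_normal((T, 14)) for T in lifetimes]

# Loop over engines, build pairs per engine, concatenate along the sample axis.
pairs = [build_pairs(traj, window=30) for traj in fleet]
X = np.concatenate([p[0] for p in pairs], axis=0)
y = np.concatenate([p[1] for p in pairs], axis=0)
print(X.shape, y.shape)  # (383, 30, 14) (383,)
```

Windows never straddle two engines because each engine is windowed before concatenation. The sample count is the sum of T - 30 + 1 over engines: 171 + 121 + 91 = 383.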
PyTorch: The Same Idea, but as a Dataset
Now the same logic in PyTorch idiom. Two changes only: numpy.ndarray becomes torch.Tensor, and the loop becomes a Dataset subclass that DataLoader can batch and shuffle. The numerical output is identical.
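One way to write that Dataset is sketched here. The class name `EngineWindowDataset` and the random placeholder trajectory are assumptions for illustration; `Dataset` and `DataLoader` are the standard `torch.utils.data` classes.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class EngineWindowDataset(Dataset):
    """Sliding-window (X, y) pairs from one run-to-failure trajectory."""

    def __init__(self, trajectory, window=30):
        self.trajectory = torch.as_tensor(trajectory, dtype=torch.float32)
        self.window = window
        self.failure_cycle = self.trajectory.shape[0]   # T = last recorded cycle
        if self.failure_cycle < window:
            raise ValueError("trajectory is shorter than the window")

    def __len__(self):
        return self.failure_cycle - self.window + 1

    def __getitem__(self, idx):
        end = idx + self.window                         # cycle t ending the window
        x = self.trajectory[idx:end]                    # shape (window, d)
        y = torch.tensor(float(self.failure_cycle - end))  # RUL = T - t
        return x, y

# Placeholder trajectory: 200 cycles, 5 sensors of random values.
trajectory = np.random.default_rng(0).standard_normal((200, 5))
dataset = EngineWindowDataset(trajectory, window=30)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

xb, yb = next(iter(loader))
print(len(dataset), tuple(xb.shape), tuple(yb.shape))  # 171 (32, 30, 5) (32,)
```

Windows are sliced lazily in `__getitem__` rather than materialised up front, so memory stays at one copy of the trajectory regardless of the window size.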
ConcatDataset over 260 engines is a one-liner.
RUL Beyond Aerospace
The (window, sensor)-to-scalar regression formulation is dominant outside aerospace too. Whenever the failure mode is gradual, monotonic, and partially observable through sensor data, the same machinery applies. The label definition shifts because the “cycle” is replaced by whatever unit of life the equipment uses.
| Domain | Unit of life | Sensors | Public benchmark |
|---|---|---|---|
| Turbofan engine (this book) | Operating cycle | Pressure, temperature, fan-speed, fuel-flow | NASA C-MAPSS, N-CMAPSS DS02 |
| Lithium-ion battery | Charge cycle | Voltage, current, temperature curves | NASA Battery, MIT/Stanford Severson 2019 |
| Rolling-element bearing | Hours under load | Vibration spectrum, acoustic emission | PRONOSTIA / FEMTO bearing dataset |
| Hard-disk drive | Operating hours | SMART attributes (read errors, reallocations) | Backblaze quarterly drive dump |
| Wind-turbine gearbox | Rotations / hours | Vibration, oil debris, temperature | EDP Open Data, EngieWindFarm |
| Patient hospital stay (medical analogue) | Days until discharge / decompensation | Vitals, labs, medication exposure | MIMIC-IV ICU data |
| Software systems | Calls until crash / time-to-anomaly | Latency percentiles, error rates, GC pauses | (internal SRE telemetry) |
Every row of that table consumes a window-shaped input and emits a scalar time-to-event. The CNN-BiLSTM-Attention backbone we build in Part III, the gradient-balancing loss in Part VI, the Pareto-frontier picture in Chapter 23 — all of them transfer with at most a renaming of variables.
The Censoring Pitfall
Every introduction to RUL skips the elephant in the room: the only engines you can label are the ones that did fail. Engines pulled early, swapped out for unrelated reasons, or still happily running at the moment your dataset snapshot is taken provide censored observations. You know the engine survived past some cycle, but you do not know its failure cycle $T$.
C-MAPSS sidesteps the problem by being entirely run-to-failure simulation data, which is one of the reasons it has dominated the prognostic benchmark landscape since 2008. Real-world projects almost always need to either restrict the training set to fully-observed runs or borrow techniques from the survival literature. We flag it here, defer it to Chapter 29 (“Limitations & Open Research Questions”), and proceed with the un-censored idealisation for the next twenty-eight chapters.
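The first option, restricting training to fully observed runs, can be sketched as follows. The `EngineRun` record and its `failed` flag are a hypothetical schema, and the three example runs are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class EngineRun:
    engine_id: str
    n_cycles: int   # number of recorded cycles
    failed: bool    # True only if the last recorded cycle is the failure

def split_by_censoring(runs):
    """Separate fully observed (run-to-failure) runs from censored ones."""
    observed = [r for r in runs if r.failed]
    censored = [r for r in runs if not r.failed]
    return observed, censored

def rul_lower_bound(run):
    """For a censored run, the true RUL at cycle t exceeds n_cycles - t.

    The engine survived past its last recorded cycle, so its unknown
    failure cycle T is larger than n_cycles.
    """
    return [run.n_cycles - t for t in range(1, run.n_cycles + 1)]

runs = [
    EngineRun("A", 200, True),
    EngineRun("B", 140, False),   # pulled early: censored
    EngineRun("C", 180, True),
]
observed, censored = split_by_censoring(runs)
print([r.engine_id for r in observed])   # ['A', 'C']
print([r.engine_id for r in censored])   # ['B']
```

Dropping the censored runs keeps the labels exact but discards data. The lower bounds are the kind of partial information that survival-style methods are designed to use.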
The clean version of the problem. Run-to-failure trajectories with known failure cycle. Sliding-window inputs. Scalar RUL targets. A neural network that maps one to the other. That is the world we will work in.
Takeaway
- RUL is the fuel gauge for machines. Every model in this book, no matter how exotic its internals, ultimately emits one scalar per window: cycles to failure.
- Supervised pairs come from sliding windows. One engine with 200 cycles of life and a window size of 30 yields 171 (input, target) pairs.
- Python and PyTorch say the same thing differently. NumPy gives you the math directly; PyTorch Dataset/DataLoader gives you batching, shuffling, and parallel I/O for free.
- Most real datasets are censored. C-MAPSS hides this behind simulation; outside it, censoring is the thing you have to engineer around.