The ImageNet of Prognostics
ImageNet did not just give vision researchers a benchmark; it gave them a common language. Once a million labelled images existed, an entire generation of architectures became comparable, reproducible, and rankable on a single leaderboard. In prognostics that role belongs to NASA C-MAPSS, a public dataset of simulated turbofan run-to-failure trajectories released by the NASA Prognostics Center of Excellence in 2008. Almost every remaining-useful-life (RUL) paper of the past fifteen years quotes a number from one of its four sub-datasets.
C-MAPSS is short for Commercial Modular Aero-Propulsion System Simulation. NASA built it as a thermo-mechanical model of a high-bypass turbofan with five rotating components (fan, LPC, HPC, HPT, LPT), then ran thousands of simulated lifetimes with health-margin parameters that drift until something breaks. The result: a perfectly censoring-free supervised dataset where every engine runs all the way to failure and the failure cycle is known exactly.
What's in the Box: Files, Columns, Conventions
Each of the four sub-datasets ships as three plain-text files:
| File | Contents | Format |
|---|---|---|
| train_FD00x.txt | Run-to-failure trajectories for ALL engines | 26 cols, space-separated |
| test_FD00x.txt | Truncated trajectories - the model must extrapolate | 26 cols, space-separated |
| RUL_FD00x.txt | Ground-truth RUL at the last test cycle of each engine | 1 col |
The 26 columns are always the same: 2 identifiers (engine id, cycle), 3 operational settings (altitude, Mach, throttle angle), and 21 sensors (temperatures, pressures, speeds, bleed flows). No header row, no missing values, no timestamps — just space-separated floats. Loading is a one-liner.
Labelling the training data with RUL is also a one-liner because the file is fully run-to-failure: the maximum cycle observed for each engine is its failure cycle. The training-time RUL at any earlier cycle $t$ is just $\mathrm{RUL}(t) = T_{\max} - t$, where $T_{\max}$ is that engine's last recorded cycle.
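The test files need one extra step, because their trajectories stop before failure: RUL_FD00x.txt gives the RUL at the last observed cycle, so earlier cycles add the distance to that last cycle on top. The sketch below illustrates the arithmetic on toy in-memory frames. The column names (`engine_id`, `cycle`, `rul`), the numbers, and the assumption that the RUL file lists engines in id order are illustrative choices, not something the files themselves declare.

```python
import pandas as pd

# Toy stand-in for a parsed test_FD00x.txt: two engines, truncated before failure.
test = pd.DataFrame({
    "engine_id": [1, 1, 1, 2, 2],
    "cycle":     [1, 2, 3, 1, 2],
})

# Toy stand-in for RUL_FD00x.txt: one value per engine,
# assumed here to be ordered by engine id (made-up numbers).
final_rul = pd.Series([112, 98], index=[1, 2], name="final_rul")

# RUL at cycle t = RUL at the last observed cycle + (last observed cycle - t).
last_cycle = test.groupby("engine_id")["cycle"].transform("max")
test["rul"] = test["engine_id"].map(final_rul) + (last_cycle - test["cycle"])

print(test)
```

At the last observed cycle of each engine the computed value equals the entry from the RUL file, which is a quick sanity check on the alignment.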
Interactive: The Dataset at a Glance
In the interactive figure, hovering over a value in the column anatomy shows the physical quantity it represents, and clicking an FD card switches the histogram. FD003 and FD004 have noticeably longer tails than FD001 and FD002 because their two-fault populations include slowly degrading engines that survive 350+ cycles.
Two structural takeaways. First, the four sub-datasets are not identical: they vary in operating conditions (1 vs 6) and fault modes (1 vs 2), which is exactly what the rest of Chapter 2 is about. Second, the failure-cycle distributions are wide enough that any model has to handle engines that fail at cycle 110 and engines that survive past 350.
Python: Parse the Raw Files in 20 Lines
Loading C-MAPSS is no harder than reading a CSV. The only quirk is that the delimiter is whitespace and the file has no header row.
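A minimal sketch of such a parser follows, assuming pandas is available. The column names are hypothetical labels chosen for readability (the files carry no header), and the sample rows are synthetic placeholders in the same 26-column, space-separated layout, used so the snippet runs without the real files.

```python
import io

import pandas as pd

# Hypothetical column names: 2 identifiers + 3 settings + 21 sensors = 26.
COLUMNS = (
    ["engine_id", "cycle"]
    + [f"setting_{i}" for i in range(1, 4)]
    + [f"s{i}" for i in range(1, 22)]
)


def load_train(source):
    """Parse a train_FD00x.txt-style file (path or file-like) and add an 'rul' column."""
    # Whitespace-delimited, no header row.
    df = pd.read_csv(source, sep=r"\s+", header=None, names=COLUMNS)
    # Each engine runs to failure, so its last recorded cycle is its failure cycle.
    last_cycle = df.groupby("engine_id")["cycle"].transform("max")
    df["rul"] = last_cycle - df["cycle"]
    return df


def _fake_row(engine, cycle):
    """Build one synthetic 26-column row (placeholder values, not real data)."""
    values = [engine, cycle] + [0.0] * 3 + [1.0] * 21
    return " ".join(str(v) for v in values)


# Synthetic sample: engine 1 lasts 3 cycles, engine 2 lasts 5.
sample = "\n".join(
    [_fake_row(1, c) for c in range(1, 4)] + [_fake_row(2, c) for c in range(1, 6)]
)

train = load_train(io.StringIO(sample))
print(train[["engine_id", "cycle", "rul"]])
```

With the real data on disk, the same function would be called with a file path such as `load_train("train_FD001.txt")`.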
Why pandas, not pure NumPy?
We could call np.loadtxt and skip pandas entirely — it would work. We use pandas because the groupby + transform idiom turns the per-engine RUL labelling into one expressive line. The downstream PyTorch code converts back to NumPy float32 anyway.
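For comparison, here is a rough sketch of the same labelling done with NumPy alone. The sample text is synthetic, and the variable names are illustrative. The per-engine maximum takes a few lines of index bookkeeping where the pandas version needs a single groupby and transform.

```python
import io

import numpy as np


def fake_row(engine, cycle):
    """Build one synthetic 26-column row (placeholder values, not real data)."""
    return " ".join(str(v) for v in [engine, cycle] + [0.0] * 24)


# Synthetic sample: engine 1 lasts 3 cycles, engine 2 lasts 5.
sample = "\n".join(
    [fake_row(1, c) for c in range(1, 4)] + [fake_row(2, c) for c in range(1, 6)]
)

# np.loadtxt splits on whitespace by default and returns a float array.
data = np.loadtxt(io.StringIO(sample))
engine_id = data[:, 0].astype(int)
cycle = data[:, 1]

# Map each row to a dense group index, then take a per-group running maximum.
_, group = np.unique(engine_id, return_inverse=True)
max_cycle = np.zeros(group.max() + 1)
np.maximum.at(max_cycle, group, cycle)

# Broadcast each engine's failure cycle back to its rows and subtract.
rul = max_cycle[group] - cycle
print(rul)
```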
PyTorch: A Reusable CMAPSSDataset
For training we need a Dataset that yields (window, RUL) pairs as PyTorch tensors. The class below is a thin wrapper around the NumPy logic above: same windows, same labels, same order; what changes is that DataLoader can now batch and shuffle for free.
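One way such a class could look is sketched below. Several details are assumptions made for illustration: the constructor signature, the default `window=30`, the choice to label each window with the RUL at its last cycle, and the rule that windows never cross engine boundaries. The toy arrays are random placeholders, not C-MAPSS data.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class CMAPSSDataset(Dataset):
    """Sliding windows per engine, each labelled with the RUL at its last cycle."""

    def __init__(self, engine_ids, features, rul, window=30):
        engine_ids = np.asarray(engine_ids)
        features = np.asarray(features, dtype=np.float32)
        rul = np.asarray(rul, dtype=np.float32)

        windows, labels = [], []
        for eid in np.unique(engine_ids):
            # Rows of one engine, assumed already sorted by cycle.
            rows = np.flatnonzero(engine_ids == eid)
            # Engines shorter than `window` contribute no samples.
            for end in range(window, len(rows) + 1):
                idx = rows[end - window:end]
                windows.append(features[idx])
                labels.append(rul[idx[-1]])

        self.x = torch.from_numpy(np.stack(windows))   # (N, window, n_features)
        self.y = torch.from_numpy(np.asarray(labels, dtype=np.float32))  # (N,)

    def __len__(self):
        return len(self.y)

    def __getitem__(self, i):
        return self.x[i], self.y[i]


# Toy data: engine 1 has 5 cycles, engine 2 has 7, with 4 random features each.
lengths = {1: 5, 2: 7}
engine_ids = np.concatenate([np.full(n, e) for e, n in lengths.items()])
rul = np.concatenate([np.arange(n - 1, -1, -1) for n in lengths.values()])
features = np.random.default_rng(0).normal(size=(len(engine_ids), 4))

ds = CMAPSSDataset(engine_ids, features, rul, window=3)
loader = DataLoader(ds, batch_size=4, shuffle=True)
xb, yb = next(iter(loader))
print(len(ds), tuple(xb.shape), tuple(yb.shape))
```

Building all windows up front keeps `__getitem__` trivial at the cost of memory; for larger datasets, storing only the (start, end) indices and slicing lazily would be a reasonable alternative.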
Other Public Prognostic Benchmarks
C-MAPSS is dominant but not alone. The benchmarks below cover similar territory in adjacent industries; everything in this book transfers to them with at most a renamed loader.
| Benchmark | Domain | Size | Distinctive feature |
|---|---|---|---|
| NASA C-MAPSS (this book) | Turbofan engine | 100-260 engines per subset | Censoring-free, multi-condition variants |
| N-CMAPSS DS01-DS08 | Turbofan engine | Up to 5,000 hours of flight | Realistic flight envelopes (Section 2.3) |
| NASA Battery | Lithium-ion cell | ~30 cells | Charge/discharge cycle health |
| MIT/Stanford Severson 2019 | Lithium-ion cell | 124 cells | Early-cycle prediction of full lifetime |
| PRONOSTIA / FEMTO | Rolling-element bearing | 17 bearings | Vibration spectra to failure |
| IMS Bearing | Rolling-element bearing | 4 bearings, 3 runs | Continuous run-to-failure on test rigs |
| Backblaze drives | Hard-disk drive | ~250k drives, quarterly | Real-world censored failure logs |
| EDP Open Wind | Wind-turbine gearbox | Multiple turbines | SCADA + maintenance logs |
The Simulation-vs-Reality Gap
N-CMAPSS DS02 (Section 2.3) closes part of this gap by simulating real flight envelopes; the remaining gap to deployment is what motivates the Limitations chapter at the end of this book (Chapter 29).
What the benchmark gives us. A single, public, perfectly labelled dataset that lets ten different research groups directly compare their methods. That is enough to drive a decade of progress — even if the numbers do not transfer literally to a real airline.
Takeaway
- Four sub-datasets, three files each, 26 columns, no surprises. 2 identifiers + 3 operational settings + 21 sensors per row, space-separated, no header.
- Labels are free. Training engines run all the way to failure, so RUL at any cycle $t$ is $T_{\max} - t$ per engine, with $T_{\max}$ the engine's last recorded cycle.
- The four sub-datasets vary in difficulty. 1 vs 6 operating conditions, 1 vs 2 fault modes — the topic of Section 2.2.
- Loading is two helpers. load_train() for analysis and CMAPSSDataset for training. We will reuse the latter in every chapter from Part III onward.
- The simulation-reality gap is real but not the point of this book. C-MAPSS is the standardised arena. The methods built here generalise; the exact numbers do not.