Chapter 2

N-CMAPSS DS02: Realistic Flight Envelopes

Benchmarks: C-MAPSS & N-CMAPSS

Simulator vs. Black Box: A Step Closer to Reality

Driving simulators have come a long way, but every autonomous-vehicle team in the world also collects real road miles — because no simulator captures the full distribution of weather, road wear, and bizarre human drivers you encounter on day one of deployment. The same realisation eventually came to the prognostics community.

C-MAPSS taught a generation of researchers how to compare RUL methods. But its operating conditions are six fixed centroids that the simulator samples uniformly — an idealisation that hides the slow, structured way real flights move through the envelope. NASA's 2021 release of the N-CMAPSS dataset added that missing realism: real flight profiles, variable-length missions, and a richer sensor catalog including unmeasurable internal states.

Mental model. C-MAPSS is what you train and benchmark on. N-CMAPSS is what you double-check on. The paper validates GRACE on both.

What Makes N-CMAPSS Different

| Property | C-MAPSS (Section 2.1) | N-CMAPSS DS02 |
| --- | --- | --- |
| File format | Plain-text .txt, 26 columns | HDF5, multiple named datasets per file |
| Operating conditions | 6 fixed centroids, uniformly sampled | Continuous - real flight envelopes |
| Engine count (dev) | 100-260 per FD subset | 6 units (2, 5, 10, 16, 18, 20), one HDF5 file |
| Cycles per engine | ~150-360 | 100,000+ (one cycle = ~1 second of flight) |
| Sensors | 21 physical | 14 physical + 14 virtual = 28 channels |
| Flight structure | None - cycles are i.i.d. regimes | Variable-length flights w/ phases |
| Released | 2008 | 2021 |
| Total file size | <50 MB across all 4 subsets | ~2.3 GB |

The two qualitative changes that matter most: continuous flight profiles (cycles within a flight are correlated, not i.i.d.) and virtual sensors (channels representing internal engine states such as internal flows and stall margins, which a real airline cannot directly instrument but a simulator can expose). Together they make N-CMAPSS more physically realistic but also more demanding of the model.

Interactive: One Flight, Two Datasets

Below is the same time horizon, plotted twice. Red is a 200-cycle slice of a C-MAPSS FD002 engine: altitude jumps every cycle between the six fixed regimes from Section 2.2. Green is one representative N-CMAPSS DS02 flight profile: smooth taxi → takeoff → climb → cruise → descent → approach → landing.

Both arrangements are valid prognostic data. The C-MAPSS view is statistically clean — conditions are uniformly sampled so every cycle stands on its own. The N-CMAPSS view is physically real — cycles are correlated within a flight and the regime is autocorrelated for tens or hundreds of cycles at a time.
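
To make "i.i.d. regimes" versus "autocorrelated regime" concrete, here is a minimal sketch that measures lag-1 autocorrelation on two synthetic altitude traces. Both traces are made up for illustration: the six altitude values and the climb/cruise/descent shape are assumptions, not values read from either dataset.

```python
import random


def lag1_autocorr(series):
    """Lag-1 autocorrelation: how strongly each value predicts the next one."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return cov / var


rng = random.Random(0)

# C-MAPSS-style trace: every cycle draws one of six regimes independently.
# The altitudes (in kft) are illustrative placeholders.
regimes_kft = [0, 10, 20, 25, 35, 42]
jumpy = [rng.choice(regimes_kft) for _ in range(200)]

# N-CMAPSS-style trace: one synthetic flight, climb -> cruise -> descent.
climb = [35 * i / 50 for i in range(50)]
cruise = [35.0] * 100
descent = [35 * (1 - i / 50) for i in range(50)]
smooth = climb + cruise + descent

print(f"i.i.d. regimes : {lag1_autocorr(jumpy):+.2f}")
print(f"smooth flight  : {lag1_autocorr(smooth):+.2f}")
```

The independent draws land near zero while the smooth profile lands close to one, which is the property a sequence model can exploit on N-CMAPSS and cannot on multi-condition C-MAPSS.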

The HDF5 Layout, Up Close

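DS02 is a single HDF5 file. The ten top-level datasets used in this book are listed below; the file also stores health-parameter arrays (T_dev, T_test) and *_var arrays that hold the column names. Knowing the names is 90% of the battle: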

| Dataset | Shape | Contents |
| --- | --- | --- |
| X_s_dev | (N_dev, 14) | Physical sensors (T24, T30, ..., Wf) |
| X_v_dev | (N_dev, 14) | Virtual sensors (T40, P30, ..., stall margins) |
| A_dev | (N_dev, 4) | Auxiliary: unit, cycle (flight number), Fc (flight class), hs (health state) |
| W_dev | (N_dev, 4) | Operating conditions (alt, Mach, TRA, T2) |
| Y_dev | (N_dev, 1) | Per-cycle RUL |
| X_s_test | (N_tst, 14) | Held-out engines, physical sensors |
| X_v_test | (N_tst, 14) | Held-out engines, virtual sensors |
| A_test | (N_tst, 4) | Held-out auxiliary |
| W_test | (N_tst, 4) | Held-out conditions |
| Y_test | (N_tst, 1) | Held-out RUL |

The development split (suffix _dev) carries units 2, 5, 10, 16, 18, and 20; the held-out test split (_test) carries three different engines (units 11, 14, and 15) never seen during training. A random 80/20 row split is neither needed nor appropriate: the test population is fully held out by engine.
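
The same engine-level logic applies if you carve a validation set out of the development split: hold out whole engines, never random rows. A minimal sketch, assuming the unit column has already been read into a plain sequence; `split_rows_by_unit` and the toy data are hypothetical helpers for illustration.

```python
def split_rows_by_unit(unit_column, val_units):
    """Return (train_rows, val_rows) so that no engine appears on both sides."""
    val_units = set(val_units)
    train_rows, val_rows = [], []
    for row, unit in enumerate(unit_column):
        (val_rows if unit in val_units else train_rows).append(row)
    return train_rows, val_rows


# Toy stand-in for the unit column of A_dev (made-up row counts).
unit_column = [2, 2, 2, 5, 5, 5, 5]
train_rows, val_rows = split_rows_by_unit(unit_column, val_units=[5])

print(train_rows)  # [0, 1, 2]
print(val_rows)    # [3, 4, 5, 6]
```

Splitting by row instead would put near-identical neighbouring seconds of the same flight on both sides of the split and inflate validation scores.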

Python: Loading DS02 With h5py

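About twenty lines of real code, no surprises if you have used HDF5 before. The trick is the boolean mask built from the auxiliary unit column: h5py turns it into a selection, so only the matching rows of each matrix are read and the full 5-million-row sensor matrices are never materialised in memory. The one full-length read is the unit column itself, and mask-based reads can be slow on selections this large.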

Pull one engine's full history out of DS02
ncmapss_loader.py

import h5py

h5py is the Python interface to HDF5 - a hierarchical binary format the NASA prognostic group chose so the multi-gigabyte file is queryable without loading everything at once.

EXECUTION STATE
h5py = Pythonic HDF5. Lets you treat the file like a nested dict of ndarrays.
Why HDF5 for N-CMAPSS? = DS02 alone is ~2.3 GB. With HDF5 you can read slices and selections straight from disk and never load more than the rows and columns you need.

import numpy as np

Standard NumPy alias.

import pandas as pd

Used downstream for inspection (not strictly needed here).

PATH = "data/raw/N-CMAPSS_DS02-006.h5"

Filesystem path. The '006' in the filename is the dataset version released by NASA; it has been stable since 2021. The README in the project repo points here.

EXECUTION STATE
size on disk = ~2.3 GB
alternative = The other subsets (DS01, DS03, ..., DS08) ship as separate files with the same layout.

def load_ds02_unit(path, unit_id):

Pull all flight cycles for ONE engine (called a 'unit' in the DS02 nomenclature). The DS02 development units are 2, 5, 10, 16, 18, and 20.

EXECUTION STATE
input: path (str) = HDF5 file location
input: unit_id (int) = Engine number, one of 2, 5, 10, 16, 18, 20 for the DS02 dev split
returns 4-tuple = (X_s, X_v, aux, rul) - sensor channels, virtual sensors, metadata, RUL

with h5py.File(path, "r") as f:

Opens the file in read-only mode. The `with` block guarantees the file handle is closed even if downstream code raises - critical when working with multi-GB datasets.

EXECUTION STATE
h5py.File(path, mode) = mode='r' is read-only. 'r+' would allow writes; 'w' truncates the file. We never want to mutate NASA's published file.
f as a dict = f.keys() includes 'A_dev', 'A_test', 'W_dev', 'W_test', 'X_s_dev', 'X_s_test', 'X_v_dev', 'X_v_test', 'Y_dev', 'Y_test'
units = f["A_dev"][:, 0].astype(np.int32)

Pull the FIRST column of the auxiliary matrix - the 'unit' (engine) ID. The bracket [:, 0] is HDF5-aware and only reads that column from disk.

EXECUTION STATE
f['A_dev'] = (N, 4) auxiliary metadata: unit, cycle (flight number), Fc, hs
[:, 0] = All rows, column 0 (the unit/engine ID).
.astype(np.int32) = HDF5 stores it as float64; we want int for boolean comparison.
units.shape = (5263447,) - total cycles across DS02 development split

mask = units == unit_id

Boolean mask: True where the unit column equals our engine of interest. The same mask slices all the other arrays consistently.

EXECUTION STATE
mask.shape = (5263447,) bool
mask.sum() = 104897 for unit_id=2 - that engine's total cycle count

X_s = f["X_s_dev"][mask].astype(np.float32)

The 'physical' sensor matrix. 14 channels: T24, T30, T48, T50, P15, P2, P21, P24, Ps30, P40, P50, Nf, Nc, Wf. These are the things a real engine sensor measures.

EXECUTION STATE
f['X_s_dev'].shape = (5263447, 14)
mask slicing = Returns just the rows where mask is True - lazy load from disk, no copy of the rest.
X_s.shape after mask = (104897, 14) for unit 2

X_v = f["X_v_dev"][mask].astype(np.float32)

The 'virtual' sensor matrix. 14 more channels representing internal engine states that would be hard to measure physically (e.g., internal flows and stall margins). Together with X_s these are 28 input features per cycle.

EXECUTION STATE
X_v.shape = (104897, 14) for unit 2
→ why useful? = The virtual channels are what give DS02 its physics-rich character. DKAMFormer leans heavily on them.

aux = f["A_dev"][mask].astype(np.float32)

Auxiliary metadata: 4 columns holding unit, flight number, flight class, and health state.

EXECUTION STATE
aux.shape = (104897, 4)
columns = [unit, cycle, Fc, hs]

rul = f["Y_dev"][mask, 0].astype(np.float32)

The RUL target. Y_dev is shape (N, 1); we take column 0 to get a 1-D array.

EXECUTION STATE
f['Y_dev'].shape = (5263447, 1)
[mask, 0] = Boolean rows + integer column 0 → 1-D ndarray
rul.shape = (104897,)
rul[:3] = [63.0, 63.0, 63.0] - all cycles in flight 1 share the flight-level RUL

return X_s, X_v, aux, rul

Four-tuple. Caller can concatenate X_s and X_v if they want the full 28-channel input; we keep them separate for flexibility.

X_s, X_v, aux, rul = load_ds02_unit(PATH, unit_id=2)

Pull engine 2.

EXECUTION STATE
memory cost = 104897 × (14 + 14 + 4 + 1) × 4 bytes ≈ 14 MB - tiny once we slice down to one engine

print("X_s shape :", X_s.shape)

Verify physical-sensor shape.

EXECUTION STATE
Output = X_s shape : (104897, 14)

print("X_v shape :", X_v.shape)

Virtual-sensor shape.

EXECUTION STATE
Output = X_v shape : (104897, 14)

print("aux shape :", aux.shape)

Metadata shape.

EXECUTION STATE
Output = aux shape : (104897, 4)

print("rul range :", rul.min(), "->", rul.max())

RUL spans 0 (last flight) to flight_count − 1 (first flight). Engine 2 in DS02 ran 64 flights before failure, so the first flight carries RUL 63.

EXECUTION STATE
Output = rul range : 0.0 -> 63.0

print("flights :", int(aux[:, 1].max()))

Column 1 of aux is the flight number. The maximum flight number is the engine's total flight count.

EXECUTION STATE
Output = flights : 64
interpretation = Engine 2 was simulated through 64 complete flights of variable length before failure. Total cycle count: ~105k.
```python
import h5py
import numpy as np
import pandas as pd

# DS02 ships as a single ~2.3 GB HDF5 file.  The other N-CMAPSS subsets
# (DS01, DS03..DS08) are separate files; we focus on DS02 in this book.
PATH = "data/raw/N-CMAPSS_DS02-006.h5"


def load_ds02_unit(path: str, unit_id: int):
    """Load all flights from one engine ('unit') in DS02.

    Returns
    -------
    X_s   : (N, 14)  float32 - 'sensor' channels (Xs)
    X_v   : (N, 14)  float32 - 'virtual sensor' channels (Xv)
    aux   : (N, 4)   float32 - auxiliary metadata (unit, cycle, Fc, hs)
    rul   : (N,)     float32 - per-cycle Remaining Useful Life
    """
    with h5py.File(path, "r") as f:
        # Use boolean mask on the auxiliary 'unit' column to slice by engine.
        units = f["A_dev"][:, 0].astype(np.int32)
        mask  = units == unit_id

        X_s   = f["X_s_dev"][mask].astype(np.float32)
        X_v   = f["X_v_dev"][mask].astype(np.float32)
        aux   = f["A_dev"  ][mask].astype(np.float32)
        rul   = f["Y_dev"  ][mask, 0].astype(np.float32)

    return X_s, X_v, aux, rul


# ----- Use it on engine 2 of DS02 -----
X_s, X_v, aux, rul = load_ds02_unit(PATH, unit_id=2)

print("X_s shape :", X_s.shape)
print("X_v shape :", X_v.shape)
print("aux shape :", aux.shape)
print("rul range :", rul.min(), "->", rul.max())
print("flights   :", int(aux[:, 1].max()))    # column 1 of A is flight number

# X_s shape : (104897, 14)
# X_v shape : (104897, 14)
# aux shape : (104897, 4)
# rul range : 0.0 -> 63.0
# flights   : 64
```
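
If the mask-based read turns out to be slow, one alternative is to convert the mask into a single row range and use a plain slice. This is only a sketch under a stated assumption: that each unit's rows sit in one contiguous block of the file, which you should verify rather than take for granted. `contiguous_row_range` is a hypothetical helper, and the toy unit column stands in for column 0 of A_dev.

```python
def contiguous_row_range(unit_column, unit_id):
    """Return (start, stop) for unit_id, or raise if its rows are not one block."""
    rows = [i for i, u in enumerate(unit_column) if u == unit_id]
    if not rows:
        raise KeyError(f"unit {unit_id} not found")
    start, stop = rows[0], rows[-1] + 1
    if stop - start != len(rows):
        # Assumption violated: fall back to the boolean-mask read instead.
        raise ValueError(f"rows for unit {unit_id} are not contiguous")
    return start, stop


# Toy stand-in for the unit column (made-up row counts).
unit_column = [2, 2, 2, 5, 5, 5, 5]
start, stop = contiguous_row_range(unit_column, 5)
print(start, stop)  # 3 7
```

With a valid (start, stop) pair, a read such as f["X_s_dev"][start:stop] is an ordinary slice; the helper fails loudly when the contiguity assumption does not hold.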

Why one engine at a time?

DS02 is large enough that loading every dev unit in one shot takes roughly 0.7 GB as float32, and twice that if the arrays stay float64. Real preprocessing pipelines load one unit, slice it into windows, save the windowed tensor, and move on, never holding more than one engine in memory.
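
The memory arithmetic is easy to check by hand. The sketch below counts only the 28 sensor channels and reuses the row counts quoted in this section; `array_megabytes` is a hypothetical helper, and 1 MB is taken as 10**6 bytes.

```python
def array_megabytes(rows, columns, bytes_per_value=4):
    """Size of a dense rows x columns array in MB (1 MB = 10**6 bytes)."""
    return rows * columns * bytes_per_value / 1e6


SENSOR_COLUMNS = 14 + 14  # X_s + X_v

one_unit = array_megabytes(104_897, SENSOR_COLUMNS)          # unit 2 only
all_dev = array_megabytes(5_263_447, SENSOR_COLUMNS)         # whole dev split
all_dev_f64 = array_megabytes(5_263_447, SENSOR_COLUMNS, 8)  # left as float64

print(f"one unit, float32 : {one_unit:.1f} MB")
print(f"all dev,  float32 : {all_dev:.1f} MB")
print(f"all dev,  float64 : {all_dev_f64:.1f} MB")
```

The per-unit figure is about fifty times smaller than the whole-split figure, which is why the one-engine-at-a-time pattern is worth the extra loop.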

PyTorch: A Flight-Level Dataset

A close cousin of the C-MAPSS Dataset from Section 2.1, with two differences: a longer default window (50 vs 30, since DS02 cycles are denser) and an option to include the 14 virtual-sensor channels alongside the 14 physical ones.

One-engine N-CMAPSS Dataset, virtual-sensor-aware
ncmapss_dataset.py

import h5py

HDF5 reader.

import numpy as np

Required for dtype conversion.

import torch

Tensor type.

from torch.utils.data import Dataset

Base class.

class NCMAPSSFlightDataset(Dataset):

Mirrors the C-MAPSS dataset but for one DS02 engine. Bigger windows (default 50 vs 30) because the rows are denser: roughly one second of flight per row.

EXECUTION STATE
design = One unit per Dataset; ConcatDataset over the six dev units gives the full DS02 dev split

def __init__(self, h5_path, unit_id, window=50, use_virtual=True):

Four constructor args. use_virtual toggles whether the 14 virtual-sensor channels are concatenated to the input.

EXECUTION STATE
input: h5_path = Path to N-CMAPSS_DS02-006.h5
input: unit_id = Engine number, one of 2, 5, 10, 16, 18, 20
input: window = 50 (paper convention) - longer than C-MAPSS's 30
input: use_virtual = True → 28 features; False → 14 physical sensors only
with h5py.File(h5_path, "r") as f:

Same context-managed open as the NumPy version.

units = f["A_dev"][:, 0].astype(np.int32)

Pull unit IDs from disk.

mask = units == unit_id

Boolean mask for our engine.

X_s = f["X_s_dev"][mask].astype(np.float32)

Physical sensors, masked.

EXECUTION STATE
X_s.shape = (104897, 14) for unit 2

if use_virtual:

Toggle on whether to include virtual sensors.

X_v = f["X_v_dev"][mask].astype(np.float32)

Virtual sensors, masked.

feats = np.concatenate([X_s, X_v], axis=1)

Stick X_s and X_v side by side along columns. axis=1 preserves the time axis (rows) and grows the feature axis.

EXECUTION STATE
np.concatenate(arrays, axis) = Joins arrays along an EXISTING axis (unlike np.stack which creates a new one). axis=1 = column-wise on 2-D arrays.
feats.shape = (104897, 28)

else:

Physical-only branch.

feats = X_s

Skip the virtual sensors.

EXECUTION STATE
feats.shape = (104897, 14)

ruls = f["Y_dev"][mask, 0].astype(np.float32)

Per-cycle RUL.

self.feats = torch.from_numpy(feats)

Promote feats to a torch.Tensor (zero-copy bridge). __getitem__ slices from this once per sample.

EXECUTION STATE
self.feats.dtype = torch.float32
self.feats.shape = torch.Size([104897, 28])

self.ruls = torch.from_numpy(ruls)

Same trick on the RUL column.

EXECUTION STATE
self.ruls.shape = torch.Size([104897])

self.window = window

Stash window size.

self.starts = list(range(0, len(self.feats) - window + 1))

Pre-computed valid window starts.

EXECUTION STATE
len(self.starts) = 104848 = 104897 − 50 + 1
def __len__(self):

Window count.

return len(self.starts)

104848 for unit 2 with window 50.

def __getitem__(self, idx):

Standard indexing.

s = self.starts[idx]

Window start index.

e = s + self.window

Window end (exclusive).

X = self.feats[s:e]

Tensor slice, no copy.

EXECUTION STATE
X.shape = torch.Size([50, 28])

y = self.ruls[e - 1]

Scalar RUL at the LAST cycle of the window.

EXECUTION STATE
Example: idx=0, e=50 = y = ruls[49]; for unit 2 the first flight has RUL=63

return X, y

Standard (X, y) contract.

ds = NCMAPSSFlightDataset(...)

Construct on engine 2 with both sensor groups concatenated.

EXECUTION STATE
constructor cost = Keeps ~12 MB in RAM (28 features + RUL as float32); subsequent slicing is free

print("samples:", len(ds))

Sample count.

EXECUTION STATE
Output = samples: 104848

X, y = ds[0]

First window.

print("X shape:", tuple(X.shape), "  y:", float(y))

Verify shapes.

EXECUTION STATE
Output = X shape: (50, 28)   y: 63.0
```python
import h5py
import numpy as np
import torch
from torch.utils.data import Dataset


class NCMAPSSFlightDataset(Dataset):
    """One DS02 unit (engine), exposed as sliding windows over its
    concatenated flight history.

    For each window we return:
        X     : (window, n_features) - X_s + X_v concatenated
        rul   : scalar RUL target at the end of the window
    """

    def __init__(self, h5_path: str, unit_id: int, window: int = 50,
                 use_virtual: bool = True):
        with h5py.File(h5_path, "r") as f:
            units = f["A_dev"][:, 0].astype(np.int32)
            mask  = units == unit_id

            X_s = f["X_s_dev"][mask].astype(np.float32)
            if use_virtual:
                X_v   = f["X_v_dev"][mask].astype(np.float32)
                feats = np.concatenate([X_s, X_v], axis=1)   # (N, 28)
            else:
                feats = X_s                                  # (N, 14)

            ruls = f["Y_dev"][mask, 0].astype(np.float32)

        self.feats   = torch.from_numpy(feats)
        self.ruls    = torch.from_numpy(ruls)
        self.window  = window
        self.starts  = list(range(0, len(self.feats) - window + 1))

    def __len__(self) -> int:
        return len(self.starts)

    def __getitem__(self, idx: int):
        s = self.starts[idx]
        e = s + self.window
        X = self.feats[s:e]              # (window, n_features)
        y = self.ruls[e - 1]             # scalar RUL at end of window
        return X, y


# ----- Use it -----
ds = NCMAPSSFlightDataset(
    "data/raw/N-CMAPSS_DS02-006.h5", unit_id=2,
    window=50, use_virtual=True,
)
print("samples:", len(ds))
X, y = ds[0]
print("X shape:", tuple(X.shape), "  y:", float(y))
# samples: 104848
# X shape: (50, 28)   y: 64.0
```
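
The window bookkeeping can be checked without touching the file. The sketch below reproduces the start-index arithmetic in plain Python and adds a `stride` argument, which is a hypothetical extension not present in the class above, to show how quickly the sample count drops when windows overlap less.

```python
def window_bounds(n_rows, window, stride=1):
    """Return [start, end) pairs for every full window over n_rows rows."""
    return [(s, s + window) for s in range(0, n_rows - window + 1, stride)]


dense = window_bounds(104_897, window=50)              # same as the class
sparse = window_bounds(104_897, window=50, stride=10)  # hypothetical stride

print(len(dense), dense[0], dense[-1])  # 104848 (0, 50) (104847, 104897)
print(len(sparse))                      # 10485
```

Adjacent stride-1 windows share 49 of their 50 rows, so a larger stride trades very little information for a large cut in epoch time.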

Composing units. For multi-engine training, torch.utils.data.ConcatDataset wraps a list of NCMAPSSFlightDatasets (one per unit) into a single map-style dataset. With a shuffling DataLoader every window is equally likely to be drawn, so engines with longer histories contribute proportionally more samples; use a weighted sampler if you want each engine to contribute equally.
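
If each engine should carry the same total weight regardless of how many windows it has, one option is inverse-size sampling weights. A minimal sketch in plain Python; `per_window_weights` is a hypothetical helper, and the window counts are toy numbers.

```python
def per_window_weights(windows_per_engine):
    """One weight per window, chosen so every engine sums to the same total."""
    weights = []
    for n_windows in windows_per_engine:
        weights.extend([1.0 / n_windows] * n_windows)
    return weights


# Toy example: engine A has 3 windows, engine B has 1.
weights = per_window_weights([3, 1])
print(weights)

engine_a_total = sum(weights[:3])
engine_b_total = sum(weights[3:])
print(engine_a_total, engine_b_total)
```

The resulting list can be handed to torch.utils.data.WeightedRandomSampler, provided the weights are ordered the same way as the datasets inside the ConcatDataset.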

Realism-Check Datasets in Other ML Areas

The pattern “train on a clean simulator, validate on a realism-check dataset” is everywhere in modern ML. Each pair below is loosely analogous to the C-MAPSS → N-CMAPSS relationship.

| Field | Clean simulator / benchmark | Realism-check dataset |
| --- | --- | --- |
| Prognostics (this book) | C-MAPSS (2008) | N-CMAPSS DS01-DS08 (2021) |
| Robotics / RL | MuJoCo, PyBullet | Real-robot data, sim-to-real transfer benchmarks |
| Speech recognition | LibriSpeech (clean) | TED-LIUM, in-the-wild ASR |
| Autonomous driving | CARLA, AirSim | Waymo Open, nuPlan |
| Medical imaging | BraTS challenge slabs | Clinical PACS scans |
| NLP | GLUE benchmarks | Real customer-support transcripts |
| Climate ML | ERA5 reanalysis | Local station observations |

The book's methodological core (multi-task learning + gradient balancing + asymmetric safety loss) transfers to all of these — what you swap is the loader and the per-condition normaliser.

Where C-MAPSS Numbers Fail to Transfer

Beware the leaderboard transplant. A method that wins on C-MAPSS FD002 does not necessarily win on N-CMAPSS DS02 - the paper's Table III shows DKAMFormer (the published SOTA on N-CMAPSS at RMSE 5.46) beating GRACE (RMSE 6.35) on this benchmark, even though GRACE wins decisively on multi-condition C-MAPSS. The two datasets stress different capabilities.

Two reasons. First, N-CMAPSS's virtual sensors carry physics-rich information about internal temperatures, pressures, flows, and stall margins; methods like DKAMFormer that build a knowledge graph over them get a real boost. Second, the continuous flight envelope means conditions are autocorrelated - a model can implicitly track the regime over tens of cycles rather than learning to be invariant to abrupt jumps. The accuracy-safety tradeoff is still real, but the relative weights between RMSE and the NASA score do shift.
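
To see why RMSE and the NASA score can rank methods differently, here is a minimal sketch of both metrics. It assumes the NASA score meant here is the commonly used PHM08 form, with time constants 13 for early predictions and 10 for late ones; the toy predictions are made up.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error: symmetric in early and late predictions."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def nasa_score(y_true, y_pred):
    """PHM08-style score (assumed form): late predictions are penalised harder."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        d = p - t  # d < 0: predicted failure too early; d > 0: too late
        total += math.exp(-d / 13) - 1 if d < 0 else math.exp(d / 10) - 1
    return total


y_true = [50.0, 50.0]
early = [40.0, 40.0]  # 10 cycles pessimistic
late = [60.0, 60.0]   # 10 cycles optimistic

print(rmse(y_true, early), rmse(y_true, late))
print(nasa_score(y_true, early), nasa_score(y_true, late))
```

Both prediction sets have the same RMSE, yet the late set scores worse under the asymmetric metric, so a model tuned for one metric is not automatically the winner on the other.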

The book's honest claim. The proposed framework wins decisively on multi-condition C-MAPSS and matches the best within the framework on N-CMAPSS. It does not displace domain-specific physics engineering on benchmarks where physics knowledge is the primary lever.

Takeaway

  • N-CMAPSS is the realism check. Real flight envelopes, variable-length missions, 28 channels (14 physical + 14 virtual), one ~2.3 GB HDF5 file.
  • HDF5 lets you slice on disk. f['X_s_dev'][mask] reads only the rows you need.
  • Different geometry, different winners. The continuous flight envelope makes physics-based methods (DKAMFormer) comparatively stronger on N-CMAPSS than on multi-condition C-MAPSS.
  • Same Dataset shape, longer windows. Window=50 is conventional on N-CMAPSS; the rest of the API matches our C-MAPSS class.
  • Use both. Train and ablate on C-MAPSS for compute economy; validate the headline result on N-CMAPSS before claiming victory.