Chapter 5

The 21 Sensors and 3 Operational Settings

NASA Datasets Deep Dive

A Tour of a Turbofan

A modern high-bypass turbofan moves air through five rotating components and a combustor: a fan that pushes most of the airflow around the core, a low-pressure compressor (LPC), a high-pressure compressor (HPC), the combustor, then a high-pressure turbine (HPT) and a low-pressure turbine (LPT) that extract energy. Sensors are placed at standard gas-path stations (T2 at the fan inlet, T24 at the LPC outlet, T30 at the HPC outlet, T50 at the LPT outlet, and so on), alongside rotational-speed and bleed-flow measurements.

The C-MAPSS simulator outputs 21 such gas-path readings on every cycle, alongside 3 operational settings that describe what regime the engine is currently in. Together they form the 24-column feature vector (plus engine_id and cycle = 26 columns total) that every paper in this space consumes.

Three Operational Settings

| Column | Symbol | Physical meaning | Typical range |
|---|---|---|---|
| op_set_1 | altitude (k ft) | Flight altitude in 1,000 ft units | 0 - 42 |
| op_set_2 | Mach number | Aircraft speed / speed of sound | 0 - 0.84 |
| op_set_3 | TRA (%) | Throttle resolver angle | 60 - 100 |

On FD001 / FD003 these three columns are essentially constant: the engine runs at a single sea-level operating condition. On FD002 / FD004 they jump between six discrete regimes (Section 2.2). In both cases, the op_set columns are NOT modelling features. They describe the current operating point, which downstream code uses to normalise sensor readings within their regime (Chapter 6).
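As a rough illustration of how the three op-setting columns can be turned into a regime label, the sketch below rounds them and labels the unique rows, then z-scores one sensor within each regime. Everything here is synthetic: the six regime centres are assumed values chosen to fall inside the ranges in the table above, and the jitter size, the rounding precision and the `sensor` channel are illustrative choices rather than properties of the real files.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed regime centres: altitude (k ft), Mach number, TRA (%). Illustrative values.
REGIMES = np.array([
    [0.0, 0.00, 100.0],
    [10.0, 0.25, 100.0],
    [20.0, 0.70, 100.0],
    [25.0, 0.62, 60.0],
    [35.0, 0.84, 100.0],
    [42.0, 0.84, 100.0],
])

n_rows = 600
true_regime = rng.integers(0, len(REGIMES), size=n_rows)

# Synthetic op-settings: regime centre plus a small non-negative jitter.
jitter = rng.uniform(0.0, 1.0, size=(n_rows, 3)) * np.array([0.003, 0.0003, 0.0])
op = REGIMES[true_regime] + jitter

# Round each setting to a precision coarser than the jitter, then label unique rows.
keys = np.column_stack([
    np.round(op[:, 0], 0),
    np.round(op[:, 1], 2),
    np.round(op[:, 2], 0),
])
uniq, regime_id = np.unique(keys, axis=0, return_inverse=True)
regime_id = regime_id.reshape(-1)  # flatten; the inverse shape differs across NumPy versions

# Synthetic sensor whose baseline depends on the regime.
sensor = 500.0 + 20.0 * true_regime + rng.normal(0.0, 1.0, size=n_rows)

# Z-score the sensor within each regime so the regime offset disappears.
z = np.empty_like(sensor)
for k in range(len(uniq)):
    rows = regime_id == k
    z[rows] = (sensor[rows] - sensor[rows].mean()) / sensor[rows].std()

print(len(uniq), "regimes found")
```

On real data the rounding precision would need checking against the actual spread of each op-setting. Clustering the three columns is a common alternative when the regimes are not known in advance.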

Twenty-One Sensors, Catalogued

| # | Symbol | Description | Informative on FD001? |
|---|---|---|---|
| 1 | T2 | Total temperature at fan inlet | No (constant) |
| 2 | T24 | Total temperature at LPC outlet | Yes |
| 3 | T30 | Total temperature at HPC outlet | Yes |
| 4 | T50 | Total temperature at LPT outlet | Yes |
| 5 | P2 | Pressure at fan inlet | No (constant) |
| 6 | P15 | Total pressure in bypass duct | No (constant) |
| 7 | P30 | Total pressure at HPC outlet | Yes |
| 8 | Nf | Physical fan speed | Yes |
| 9 | Nc | Physical core speed | Yes |
| 10 | epr | Engine pressure ratio | No (constant) |
| 11 | Ps30 | Static pressure at HPC outlet | Yes |
| 12 | phi | Ratio of fuel flow to Ps30 | Yes |
| 13 | NRf | Corrected fan speed | Yes |
| 14 | NRc | Corrected core speed | Yes |
| 15 | BPR | Bypass ratio | Yes |
| 16 | farB | Burner fuel-air ratio | No (constant) |
| 17 | htBleed | Bleed enthalpy | Yes |
| 18 | Nf_dmd | Demanded fan speed | No (constant) |
| 19 | PCNfR_dmd | Demanded corrected fan speed | No (constant) |
| 20 | W31 | HPT coolant bleed | Yes |
| 21 | W32 | LPT coolant bleed | Yes |

14 of 21 sensors are informative on FD001. The rest are constants (control commands such as demanded fan speed) or sensors with so little variance that they cannot encode degradation. Section 5.3 formalises the selection rule, and Chapter 6 onwards works with the 14-channel input (F = 14) on FD001 / FD003, or with 17 channels (14 sensors + 3 op-settings) when we choose to feed the regime information to the model.
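
To make the two input widths concrete, here is a minimal shape-only sketch using random stand-in arrays. The batch size, window length and the choice to append the op-settings as the last three channels are assumptions made for illustration; the book's own pipeline may order the channels differently.

```python
import numpy as np

rng = np.random.default_rng(0)
B, T = 2, 30                                # batch size and window length (illustrative)

sensors_14 = rng.normal(size=(B, T, 14))    # stand-in for the 14 informative sensors
op_settings = rng.normal(size=(B, T, 3))    # stand-in for op_set_1..op_set_3

# Sensors-only input: F = 14
x_14 = sensors_14

# Sensors plus regime information: F = 17, op-settings appended as the last 3 channels
x_17 = np.concatenate([sensors_14, op_settings], axis=-1)

print(x_14.shape, x_17.shape)  # (2, 30, 14) (2, 30, 17)
```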

Why Each Sensor Drifts

Physical degradation in a turbofan shows up in predictable directions on each sensor:

| Failure mode | Sensor effect | Example sensor |
|---|---|---|
| HPC efficiency drop | Higher T30, higher P30 | T30 drifts UP (~80 R over life) |
| LPT degradation | Higher T50 | T50 drifts UP |
| Fan blade wear | Lower BPR, higher fuel flow | BPR drifts DOWN, phi UP |
| Bleed-flow imbalance | W31 / W32 drift apart | W31 - W32 widens |

Knowing the direction matters when reading attention maps later — the model spends more attention on sensors that drift consistently with the failure mode.
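
One simple way to check a drift direction is the sign of a least-squares slope of the sensor against the cycle count. The sketch below does this on two synthetic channels, one drifting up and one drifting down. The names `t30_like` and `bpr_like`, their baselines, drift rates and noise levels are made up for illustration and are not C-MAPSS values.

```python
import numpy as np

rng = np.random.default_rng(0)
cycles = np.arange(1, 201)

# Synthetic stand-ins, not C-MAPSS data: one upward and one downward drifting channel.
t30_like = 1580.0 + 0.05 * cycles + rng.normal(0.0, 1.0, size=cycles.size)
bpr_like = 8.40 - 0.0005 * cycles + rng.normal(0.0, 0.01, size=cycles.size)


def drift_slope(cycle, values):
    """Least-squares slope of values against cycle (sensor units per cycle)."""
    slope, _intercept = np.polyfit(cycle, values, deg=1)
    return slope


slope_up = drift_slope(cycles, t30_like)
slope_down = drift_slope(cycles, bpr_like)

for name, slope in [("t30_like", slope_up), ("bpr_like", slope_down)]:
    direction = "UP" if slope > 0 else "DOWN"
    print(f"{name}: slope = {slope:+.5f} per cycle -> drifts {direction}")
```

On real engines the slope would be fitted per engine, and degradation is usually far from linear near end of life, so the sign is more trustworthy than the magnitude.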

Python: Range and Variance Per Sensor

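A quick statistical pass confirms which sensors carry signal. Any sensor whose std is below 1e-6 is treated as constant. The script also lists the five sensors with the largest raw std. Because raw variance depends on each sensor's units, these are the channels that would dominate an unnormalised model input.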

Per-sensor mean / std / range to identify constants
sensor_stats.py

import numpy as np
Standard alias. NumPy is imported but not used directly in this script.

import pandas as pd
Loader.

COLUMNS = ...
The 26-column layout.

SENSOR_COLS = [...]
Just the 21 sensor names. Used to slice the statistics.

df = pd.read_csv(...)
Loads the FD001 training file as a DataFrame.

mean = df[SENSOR_COLS].mean()
Per-sensor mean across the entire training file. Pandas returns a Series indexed by column name.
Execution state:
  mean.shape = (21,)
  Example: mean['sensor_2'] = 642.68 (Rankine)

std = df[SENSOR_COLS].std()
Per-sensor standard deviation. Sensors with std ≈ 0 are constant.
Execution state:
  std['sensor_1'] = 0.0 (constant - drop)
  std['sensor_2'] = 0.50 (informative)

range_ = df[SENSOR_COLS].max() - df[SENSOR_COLS].min()
Range = max - min. Another way to spot constants.

stats = pd.DataFrame({"mean": mean, "std": std, "range": range_}).round(3)
Combines the three Series into one DataFrame indexed by sensor name. .round(3) keeps the printout readable.
Execution state:
  stats.shape = (21, 3)

print(stats)
Prints the table: one row per sensor, three columns of statistics.
Execution state:
  First few rows = sensor_1: mean=518.67 std=0.0 range=0.0; sensor_2: mean=642.68 std=0.50 range=4.6; ...

constant = stats[stats["std"] < 1e-6].index.tolist()
Filters for sensors with vanishing std and returns their names. Because stats was already rounded to 3 decimals, this keeps exactly the sensors whose std rounds to 0.000; a near-constant sensor with a slightly larger std would need a looser threshold.
Execution state:
  .index.tolist() = the filtered frame's index holds the sensor names; .tolist() converts it to a Python list
  Output = ['sensor_1', 'sensor_5', 'sensor_6', 'sensor_10', 'sensor_16', 'sensor_18', 'sensor_19']

print(f"\nconstant sensors ({len(constant)}):", constant)
Prints the constant set. 7 sensors on FD001; they will be dropped in §5.3.
Execution state:
  Output = constant sensors (7): [...]

top5 = stats.sort_values("std", ascending=False).head(5)
Sorts by std in descending order and takes the top 5. These are the highest-variance sensors. Raw variance depends on units, so treat it as a rough proxy for informativeness rather than a ranking.

print("\ntop-5 high-variance sensors:")
Header.

print(top5)
Names will likely include sensor_9 (Nc, core speed), sensor_14 (NRc, corrected core speed), and the temperature sensors 2, 3, 4.
Execution state:
  Output = sensor_9, sensor_14, sensor_2, sensor_3, sensor_4 (order varies by FD subset)

Full listing:
import numpy as np
import pandas as pd

# 26-column layout: engine_id, cycle, 3 op-settings, 21 sensors
COLUMNS = (
    ["engine_id", "cycle"]
    + [f"op_set_{i}" for i in range(1, 4)]
    + [f"sensor_{i}" for i in range(1, 22)]
)
SENSOR_COLS = [f"sensor_{i}" for i in range(1, 22)]


# Whitespace-delimited file with no header row
df = pd.read_csv("data/raw/train_FD001.txt", sep=r"\s+", header=None, names=COLUMNS)


# ----- Per-sensor diagnostics -----
mean   = df[SENSOR_COLS].mean()
std    = df[SENSOR_COLS].std()
range_ = df[SENSOR_COLS].max() - df[SENSOR_COLS].min()

stats = pd.DataFrame({"mean": mean, "std": std, "range": range_}).round(3)
print(stats)

# Tag the constant ones.
# stats is already rounded to 3 decimals, so this keeps sensors whose std rounds to 0.000.
constant = stats[stats["std"] < 1e-6].index.tolist()
print(f"\nconstant sensors ({len(constant)}):", constant)

# Tag the high-variance ones (top 5 by raw std)
top5 = stats.sort_values("std", ascending=False).head(5)
print("\ntop-5 high-variance sensors:")
print(top5)

# constant sensors (7): ['sensor_1', 'sensor_5', 'sensor_6',
#                        'sensor_10', 'sensor_16', 'sensor_18', 'sensor_19']
# top-5: sensor_9, sensor_14, sensor_2, sensor_3, sensor_4

PyTorch: Selecting Sensors as Channels

Drop the 7 constant sensors with one slice
sensor_select.py

import torch
Top-level PyTorch.

import numpy as np
Imported but not needed here: the index list below is a plain Python list.

torch.manual_seed(0)
Determinism.

X = torch.randn(2, 30, 21)
Stand-in for a real (B, T, 21) batch from CMAPSSDataset.
Execution state:
  X.shape = torch.Size([2, 30, 21])

INFORMATIVE_IDX = [1, 2, 3, 6, 7, 8, 10, 11, 12, 13, 14, 16, 19, 20]
0-based indices of the 14 informative sensors on FD001. The dropped indices 0, 4, 5, 9, 15, 17, 18 correspond to the constant sensors 1, 5, 6, 10, 16, 18, 19.
Execution state:
  len(INFORMATIVE_IDX) = 14

X14 = X[:, :, INFORMATIVE_IDX]
PyTorch advanced indexing: pass a list of indices to select them along the last axis. Equivalent to NumPy fancy indexing.
Execution state:
  X14.shape = torch.Size([2, 30, 14])
  Useful pattern: the same trick lets you pick any subset of channels at any layer.

print("X.shape :", tuple(X.shape))
Verifies the input shape.
Execution state:
  Output = X.shape : (2, 30, 21)

print("X14.shape :", tuple(X14.shape))
Verifies the output shape: 14 channels.
Execution state:
  Output = X14.shape : (2, 30, 14)

idx_tensor = torch.tensor(INFORMATIVE_IDX, dtype=torch.long)
Builds the index tensor for torch.index_select. Same idea as above but more explicit, and sometimes faster on TPU/XLA.

X14b = torch.index_select(X, dim=2, index=idx_tensor)
Functional equivalent of advanced indexing. Some compilers optimise this better; the result is identical.

print("match :", torch.equal(X14, X14b))
Sanity check: both approaches produce the same tensor.
Execution state:
  Output = match : True

Full listing:
import torch
import numpy as np

# Pretend X is loaded as (B, T, 21) with all 21 sensors
torch.manual_seed(0)
X = torch.randn(2, 30, 21)

# ----- Drop constant sensors via index list -----
INFORMATIVE_IDX = [1, 2, 3, 6, 7, 8, 10, 11, 12, 13, 14, 16, 19, 20]   # 0-based; 14 sensors

X14 = X[:, :, INFORMATIVE_IDX]                    # PyTorch advanced indexing on last axis
print("X.shape   :", tuple(X.shape))
print("X14.shape :", tuple(X14.shape))            # (2, 30, 14)

# Equivalently with torch.index_select
idx_tensor = torch.tensor(INFORMATIVE_IDX, dtype=torch.long)
X14b = torch.index_select(X, dim=2, index=idx_tensor)
print("match     :", torch.equal(X14, X14b))     # True
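
A hard-coded 0-based index list is easy to get wrong by one position. As a small sketch, the same list can be derived from the 1-based sensor numbers that the catalogue marks as constant on FD001. The names `N_SENSORS`, `CONSTANT_SENSORS` and `informative_idx` are introduced here for illustration only.

```python
N_SENSORS = 21
CONSTANT_SENSORS = [1, 5, 6, 10, 16, 18, 19]  # 1-based sensor numbers, FD001

# Convert to 0-based channel positions, then keep every position not in that set.
constant_idx = {n - 1 for n in CONSTANT_SENSORS}
informative_idx = [i for i in range(N_SENSORS) if i not in constant_idx]

print(informative_idx)
# [1, 2, 3, 6, 7, 8, 10, 11, 12, 13, 14, 16, 19, 20]
```

Deriving the list this way also makes it straightforward to swap in a different constant set for another FD subset.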

What ‘Sensor’ Means in Other Domains

| Domain | Sensor analogue | Channel count |
|---|---|---|
| RUL turbofan (this book) | Gas-path probe | 21 raw / 14 informative |
| ECG analysis | Electrode lead | 12-lead ECG |
| EEG / brain-computer | Scalp electrode | 32 / 64 / 128 channels |
| Smartphone activity | Accelerometer / gyroscope axis | 6 (3 acc + 3 gyro) |
| Automotive diagnostics | OBD-II PIDs | 20-50 depending on vehicle |
| Speech recognition | Mel filterbank | ~80 mels |
| Climate stations | Temperature, humidity, pressure, wind | 5-10 |

Two Sensor-Catalog Pitfalls

Pitfall 1: Cross-FD sensor sets. The constant set is FD-specific. On FD002 some “FD001-constant” sensors actually move because the operating regime changes. Always recompute the constant set per FD subset before dropping anything.
Pitfall 2: Op-settings are NOT sensors. op_set_1, op_set_2, op_set_3 describe the regime. Feeding them as sensor channels mixes degradation signal with regime signal, which is exactly the problem that per-condition normalisation (Chapter 6) avoids.
The bottom line. There are 21 sensors and 3 operational settings, of which 14 sensors are useful for FD001 modelling. The rest of the data-processing pipeline (Section 5.3, Chapter 6, Chapter 7) works through the remaining cleanup steps before any neural network sees the data.
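
Pitfall 1 can be sketched as a small helper that recomputes the constant set for each subset instead of reusing a fixed list. The function name `constant_sensors`, the `tol` default and both DataFrames are hypothetical. The frames are synthetic stand-ins for a single-regime and a multi-regime subset, with made-up values.

```python
import numpy as np
import pandas as pd


def constant_sensors(df, sensor_cols, tol=1e-6):
    """Return the sensor columns whose standard deviation is below tol."""
    std = df[sensor_cols].std()
    return std[std < tol].index.tolist()


rng = np.random.default_rng(0)
n = 300
cols = ["sensor_1", "sensor_2"]

# Single-regime stand-in: sensor_1 never changes.
single_regime = pd.DataFrame({
    "sensor_1": np.full(n, 518.67),
    "sensor_2": 642.0 + rng.normal(0.0, 0.5, size=n),
})

# Multi-regime stand-in: the same channel now takes a different level per regime.
multi_regime = pd.DataFrame({
    "sensor_1": rng.choice([445.0, 490.0, 518.67], size=n),
    "sensor_2": 600.0 + rng.normal(0.0, 0.5, size=n),
})

const_single = constant_sensors(single_regime, cols)
const_multi = constant_sensors(multi_regime, cols)

print("single-regime constants:", const_single)  # ['sensor_1']
print("multi-regime constants :", const_multi)   # []
```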

Takeaway

  • 3 op-settings + 21 sensors per row. 24 numeric columns total (plus engine_id, cycle = 26).
  • Op-settings describe the regime. They are not modelling features; they tell us how to normalise the sensors.
  • 14 informative sensors on FD001. The other 7 are constants - drop them in feature selection.
  • Drift direction encodes failure mode. T30 up = HPC degradation; BPR down = fan wear. Useful for interpreting attention maps later.
  • PyTorch advanced indexing handles selection. X[:, :, INFORMATIVE_IDX] reduces (B, T, 21) to (B, T, 14) in one line.