Chapter 3

FD001-FD004: Operating Conditions and Fault Modes

NASA C-MAPSS Dataset Deep Dive

Learning Objectives

By the end of this section, you will:

  1. Understand the complexity hierarchy from FD001 (simplest) to FD004 (most challenging)
  2. Interpret operating conditions (altitude, Mach number, throttle) and their physical meaning
  3. Distinguish fault modes: HPC degradation vs Fan degradation
  4. Explain why multiple conditions complicate prediction through sensor value distributions
  5. Recognize the need for condition-aware normalization to handle multi-regime data
Why This Matters: Many RUL prediction methods achieve good results on FD001 but fail on FD004. Understanding why requires grasping the fundamental differences between these datasets. Our AMNL model achieves state-of-the-art on all four precisely because it addresses these differences through careful design choices.

Dataset Complexity Spectrum

The four C-MAPSS sub-datasets form a complexity spectrum, varying along two axes:

Two Dimensions of Complexity

| Dataset | Operating Conditions | Fault Modes | Complexity |
|---------|----------------------|-------------|------------|
| FD001   | 1 (Sea Level)        | 1 (HPC)       | Lowest  |
| FD002   | 6 (Various)          | 1 (HPC)       | Medium  |
| FD003   | 1 (Sea Level)        | 2 (HPC + Fan) | Medium  |
| FD004   | 6 (Various)          | 2 (HPC + Fan) | Highest |

FD001 is the controlled experiment: single condition, single fault. Each additional complexity dimension (multiple conditions OR multiple faults) creates a new challenge. FD004 combines both.

Published Performance Gap

The performance gap between datasets is substantial. For a typical deep learning model:

| Dataset | Typical RMSE | Relative Difficulty |
|---------|--------------|---------------------|
| FD001   | 12-15        | 1× (baseline)       |
| FD002   | 18-25        | ~1.5× harder        |
| FD003   | 13-17        | ~1.2× harder        |
| FD004   | 22-30        | ~2× harder          |

Our AMNL model narrows this gap significantly by addressing the root causes of these performance differences.


Operating Conditions Explained

Each data row includes three operating condition settings that define the engine's current operating regime.

The Three Operating Settings

| Setting | Physical Meaning | Range in Data |
|---------|------------------|---------------|
| Altitude | Flight altitude above sea level | 0 - 42,000 ft |
| Mach Number | Flight speed relative to the speed of sound | 0 - 0.84 |
| Throttle Resolver Angle (TRA) | Pilot throttle position | 0 - 100 |

Why Operating Conditions Matter

Operating conditions fundamentally change sensor readings even for a healthy engine: the same sensor reads differently at sea-level takeoff than at high-altitude cruise, before any degradation has occurred.

The Six Operating Regimes in FD002/FD004

FD002 and FD004 include six distinct operating condition combinations:

| Regime | Altitude (ft) | Mach | TRA | Flight Phase |
|--------|---------------|------|-----|--------------|
| 1      | 0             | 0    | 100 | Ground idle / takeoff  |
| 2      | 10,000        | 0.25 | 100 | Low altitude climb     |
| 3      | 20,000        | 0.70 | 100 | Mid altitude cruise    |
| 4      | 25,000        | 0.62 | 60  | Reduced thrust cruise  |
| 5      | 35,000        | 0.84 | 100 | High altitude cruise   |
| 6      | 42,000        | 0.84 | 100 | Max altitude cruise    |

Each regime produces different "normal" sensor readings. A model must learn that the same sensor value means different things in different regimes.
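In practice, the regime label is usually not given directly; it has to be recovered from the three operating settings. Because the six regimes form tight clusters in (altitude, Mach, TRA) space, a common trick is to round the settings and group identical rounded rows. The sketch below assumes the settings are columns of a NumPy array; the rounding tolerances are assumptions about the noise scale, not values from the text.

```python
import numpy as np

def assign_regimes(settings: np.ndarray) -> np.ndarray:
    """Map each row of (altitude, Mach, TRA) settings to an integer regime id.

    The six regimes form tight clusters, so rounding the settings
    collapses within-regime noise; unique rounded rows then index regimes.
    """
    rounded = np.column_stack([
        np.round(settings[:, 0], 0),   # altitude-like setting
        np.round(settings[:, 1], 2),   # Mach-like setting
        np.round(settings[:, 2], 0),   # TRA-like setting
    ])
    rounded = rounded + 0.0            # fold -0.0 into +0.0 for byte-wise row comparison
    _, regime_ids = np.unique(rounded, axis=0, return_inverse=True)
    return regime_ids
```

The returned ids are arbitrary but consistent, which is all that per-condition normalization needs.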

The Multi-Condition Challenge

In FD002/FD004, sensor distributions are multimodal. A temperature of 600°R might be normal at sea level but indicate severe degradation at high altitude. Global normalization destroys this information—we need per-condition normalization (covered in Chapter 4).


Fault Modes and Degradation Patterns

The C-MAPSS simulation introduces degradation through specific failure modes affecting different engine components.

Fault Mode 1: HPC Degradation

High Pressure Compressor (HPC) degradation is present in all four datasets:

  • Physical cause: Blade tip erosion, fouling, increased clearances
  • Effect: Reduced compression efficiency and flow capacity
  • Sensor signatures: Increased HPC outlet temperature (T30), decreased efficiency ratios

HPC degradation affects the thermodynamic cycle:

$$\eta_{\text{HPC}} \downarrow \;\Rightarrow\; T_{30} \uparrow,\quad P_{30} \downarrow,\quad \text{SFC} \uparrow$$

Where SFC is Specific Fuel Consumption (more fuel needed for same thrust).

Fault Mode 2: Fan Degradation

Fan degradation is present only in FD003 and FD004:

  • Physical cause: Foreign object damage, blade erosion, tip rubs
  • Effect: Reduced fan efficiency and bypass ratio
  • Sensor signatures: Changed bypass ratio (BPR), altered fan speed characteristics

Why Two Fault Modes Are Harder

With a single fault mode, degradation patterns are consistent:

$$\text{Degradation} = f(\text{HPC wear})$$

With two fault modes, degradation can follow different paths:

$$\text{Degradation} = f(\text{HPC wear}) \;\text{OR}\; g(\text{Fan wear}) \;\text{OR}\; h(\text{both})$$

FD001: The Baseline Case

FD001 is the simplest and most studied sub-dataset. Understanding it deeply provides the foundation for tackling more complex variants.

FD001 Characteristics

| Property | Value |
|----------|-------|
| Training engines | 100 |
| Test engines | 100 |
| Operating conditions | 1 (sea level) |
| Fault mode | 1 (HPC degradation only) |
| Total training cycles | 20,631 |
| Min trajectory length | 128 cycles |
| Max trajectory length | 362 cycles |
| Mean trajectory length | 206 cycles |
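The trajectory statistics above are easy to verify yourself. In the C-MAPSS text files the first column is the engine/unit id, so counting rows per id gives each engine's run-to-failure length. A minimal sketch, assuming you have already loaded that id column into a NumPy array:

```python
import numpy as np

def trajectory_lengths(engine_ids: np.ndarray) -> dict:
    """Summarize per-engine trajectory lengths from the unit-id column.

    Each engine's trajectory length (in cycles) is simply the number
    of data rows carrying that engine's id.
    """
    _, counts = np.unique(engine_ids, return_counts=True)
    return {
        "engines": len(counts),
        "min": int(counts.min()),
        "max": int(counts.max()),
        "mean": float(counts.mean()),
    }
```

Run on FD001's training file, this should reproduce the table above: 100 engines, lengths from 128 to 362 cycles.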

Why FD001 is "Easy"

  1. Unimodal sensor distributions: All engines operate at sea level, so sensor values cluster around single means
  2. Consistent degradation pattern: Only HPC fault, so degradation signatures are uniform
  3. Simple normalization: Global mean/std normalization works well
  4. Clean trends: Sensors show monotonic degradation without regime jumps

FD001 Baseline Performance

State-of-the-art methods on FD001 achieve RMSE around 11-12 cycles. Our AMNL model achieves RMSE ≈ 11.44, competitive with the best published results.


FD002, FD003, FD004: Increasing Complexity

FD002: Multiple Operating Conditions

| Property | Value |
|----------|-------|
| Training engines | 260 |
| Test engines | 259 |
| Operating conditions | 6 |
| Fault mode | 1 (HPC only) |
| Total training cycles | 53,759 |

Challenge: Same fault mode, but sensors now have 6 different "normal" baselines depending on operating regime.

Solution approach: Per-condition normalization to remove regime effects before feeding data to the model.

FD003: Multiple Fault Modes

| Property | Value |
|----------|-------|
| Training engines | 100 |
| Test engines | 100 |
| Operating conditions | 1 (sea level) |
| Fault modes | 2 (HPC + Fan) |
| Total training cycles | 24,720 |

Challenge: Single operating condition, but degradation can follow different patterns depending on which component fails first.

Solution approach: Model must learn multiple degradation signatures and potentially identify fault type implicitly.

FD004: Maximum Complexity

| Property | Value |
|----------|-------|
| Training engines | 249 |
| Test engines | 248 |
| Operating conditions | 6 |
| Fault modes | 2 (HPC + Fan) |
| Total training cycles | 61,249 |

Challenge: Everything is variable—operating conditions AND fault modes. This is closest to real-world conditions where engines operate across regimes and can fail in multiple ways.

Solution approach: Combine per-condition normalization with a powerful model that can learn multiple degradation patterns simultaneously.

FD004: The Real Test

FD004 is where many methods fail. A model that achieves 12 RMSE on FD001 might achieve 28+ RMSE on FD004. Our AMNL model achieves RMSE ≈ 19.34 on FD004—a 21% improvement over previous state-of-the-art.


Implications for Model Design

Understanding the dataset differences leads directly to design decisions in our AMNL model:

1. Per-Condition Normalization

For FD002 and FD004, we normalize each sensor within each operating condition separately:

$$x_{\text{normalized}}^{(c)} = \frac{x - \mu^{(c)}}{\sigma^{(c)}}$$

Where $\mu^{(c)}$ and $\sigma^{(c)}$ are computed only from data in operating condition $c$.
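The formula translates almost directly into code. A minimal sketch, assuming the sensors are columns of a NumPy matrix and each row already carries an integer condition label (in a real pipeline the statistics would be fit on the training set and reused on the test set):

```python
import numpy as np

def per_condition_normalize(x: np.ndarray, regime_ids: np.ndarray) -> np.ndarray:
    """Z-score each sensor column within each operating condition.

    x          : (n_samples, n_sensors) sensor matrix
    regime_ids : (n_samples,) integer condition label per row
    mu^(c) and sigma^(c) are computed only from rows in condition c.
    """
    out = np.empty_like(x, dtype=float)
    for c in np.unique(regime_ids):
        mask = regime_ids == c
        mu = x[mask].mean(axis=0)
        sigma = x[mask].std(axis=0)
        sigma = np.where(sigma == 0, 1.0, sigma)   # guard near-constant sensors
        out[mask] = (x[mask] - mu) / sigma
    return out
```

After this step, each sensor has zero mean and unit variance within every regime, so the multimodal "six baselines" problem disappears before the model ever sees the data.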

2. Multi-Task Learning

With multiple fault modes, predicting both RUL (continuous) and health state (categorical) helps the model learn more robust features:

  • Health state classification learns to distinguish degradation stages
  • RUL regression learns fine-grained cycle predictions
  • Shared features benefit from both signal types
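The two heads can be trained jointly with a weighted sum of their losses. The sketch below illustrates the idea in NumPy; the weighting `alpha` and the function names are illustrative assumptions, not the AMNL model's actual loss definition.

```python
import numpy as np

def multitask_loss(rul_pred, rul_true, state_logits, state_true, alpha=0.5):
    """Weighted sum of RUL regression loss and health-state classification loss.

    rul_pred, rul_true : (n,) continuous RUL values in cycles
    state_logits       : (n, k) unnormalized scores over k health states
    state_true         : (n,) integer health-state labels
    alpha              : assumed task-balancing hyperparameter
    """
    # Regression head: mean squared error on RUL
    mse = np.mean((rul_pred - rul_true) ** 2)
    # Classification head: cross-entropy via a numerically stable log-softmax
    z = state_logits - state_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -np.mean(log_probs[np.arange(len(state_true)), state_true])
    return alpha * mse + (1 - alpha) * ce
```

Because both heads backpropagate into the same shared encoder, features that help stage classification also regularize the fine-grained RUL regression.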

3. Attention for Variable Patterns

With different fault modes, important timesteps vary between engines. Attention allows the model to focus on relevant degradation signatures regardless of fault type.
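The mechanism can be sketched as soft attention over the timesteps of one engine's hidden sequence. This is a generic scaled dot-product formulation for illustration, not the exact attention used in AMNL:

```python
import numpy as np

def temporal_attention(hidden: np.ndarray, query: np.ndarray):
    """Soft attention over the timesteps of one engine's hidden sequence.

    hidden : (T, d) per-timestep features
    query  : (d,) context vector
    Returns (context, weights): weights sum to 1 over time and emphasize
    the timesteps most similar to the query, whatever the fault type.
    """
    scores = hidden @ query / np.sqrt(hidden.shape[1])   # (T,) similarity scores
    scores = scores - scores.max()                       # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()      # softmax over time
    context = weights @ hidden                           # (d,) weighted summary
    return context, weights
```

An engine degrading via the fan and one degrading via the HPC produce different hidden sequences, so the learned query ends up emphasizing different timesteps for each, without any explicit fault label.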

4. Dataset-Specific Evaluation

We must evaluate on all four datasets separately. A method that only works on FD001 is not useful for real-world deployment where conditions vary.

| Challenge | Source | Our Solution |
|-----------|--------|--------------|
| Multimodal distributions | Multiple conditions  | Per-condition normalization |
| Variable degradation     | Multiple fault modes | Attention + multi-task learning |
| Different data sizes     | Dataset variability  | Same architecture, dataset-specific training |
| Performance gap          | FD001 vs FD004       | Careful preprocessing + powerful model |

Summary

In this section, we explored the four C-MAPSS sub-datasets and their key differences:

  1. Complexity spectrum: FD001 (simplest) → FD004 (most complex)
  2. Operating conditions: 1 regime (FD001/FD003) vs 6 regimes (FD002/FD004) affecting all sensor readings
  3. Fault modes: HPC only (FD001/FD002) vs HPC + Fan (FD003/FD004) creating variable degradation patterns
  4. Multi-condition challenge: Same sensor value means different things in different regimes
  5. Multi-fault challenge: Same symptom can indicate different fault types
  6. Design implications: Per-condition normalization, attention, multi-task learning

| Dataset | Conditions | Faults | Key Challenge |
|---------|------------|--------|---------------|
| FD001   | 1          | 1      | Baseline: learn HPC degradation |
| FD002   | 6          | 1      | Handle regime shifts in sensors |
| FD003   | 1          | 2      | Handle variable degradation patterns |
| FD004   | 6          | 2      | Handle both simultaneously |
Looking Ahead: Not all 21 sensors are equally informative. Some remain nearly constant; others are dominated by noise. In the next section, we will analyze each sensor and justify our selection of 17 informative features—the input to our model.

With the dataset complexity understood, we are ready to select the most predictive features.