Chapter 16

FD002 Results: +37.0% Improvement

Main Results: State-of-the-Art

Learning Objectives

By the end of this section, you will:

  1. Understand FD002's complexity with 6 operating conditions
  2. Analyze the breakthrough +37.0% improvement over DKAMFormer
  3. Examine exceptional consistency across all 5 seeds
  4. Understand why multi-condition learning benefits from AMNL
  5. Interpret statistical significance (p < 0.0001)

Key Result: On FD002, AMNL achieves 6.74 ± 0.91 RMSE—a +37.0% improvement over DKAMFormer (10.70) and +65.9% improvement over published SOTA (19.77). This is the most consistent result across seeds, with all 5 seeds beating both baselines. Statistical significance: p < 0.0001.

FD002 Dataset Characteristics

FD002 is a complex dataset with 6 distinct operating conditions, presenting significant challenges for RUL prediction.

Dataset Configuration

| Property | Value | Implication |
|---|---|---|
| Operating Conditions | 6 (various altitudes/speeds) | High condition variability |
| Fault Modes | 1 (HPC degradation) | Single failure pattern |
| Training Engines | 260 | Large training set |
| Test Engines | 259 | Comprehensive evaluation |
| Total Training Cycles | ~53,000 | Substantial data volume |
| Condition Distribution | Variable per engine | Condition shifts during operation |

Operating Condition Details

The 6 operating conditions span different combinations of altitude, Mach number, and throttle resolver angle:

| Condition | Altitude (ft) | Mach | TRA |
|---|---|---|---|
| 1 | 0 (Sea Level) | 0.00 | 100 |
| 2 | 10,000 | 0.25 | 100 |
| 3 | 20,000 | 0.70 | 100 |
| 4 | 25,000 | 0.62 | 60 |
| 5 | 35,000 | 0.84 | 100 |
| 6 | 42,000 | 0.84 | 100 |

Why FD002 is Challenging

Unlike FD001, engines in FD002 operate across multiple conditions. This creates condition-dependent sensor distributions: the same degradation level produces different sensor readings at different altitudes. Models must learn condition-invariant degradation features to predict RUL accurately.
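A common way to handle this in C-MAPSS preprocessing pipelines is to first assign each cycle to one of the six discrete regimes (the operating settings cluster tightly, so rounding or k-means both work), then z-score each sensor within its regime. The sketch below assumes this standard approach; the function names and the rounding heuristic are illustrative, not necessarily the pipeline used in this work.

```python
import numpy as np

def condition_ids(op_settings: np.ndarray) -> np.ndarray:
    """Map each cycle's 3 operating settings to a discrete condition ID.
    The 6 FD002 regimes are well separated, so rounding to one decimal
    collapses the small within-regime jitter."""
    keys = np.round(op_settings, 1)
    _, ids = np.unique(keys, axis=0, return_inverse=True)
    return ids

def normalize_per_condition(sensors: np.ndarray, ids: np.ndarray) -> np.ndarray:
    """Z-score each sensor channel within its operating condition, so the
    same degradation level maps to comparable values across regimes."""
    out = np.empty_like(sensors, dtype=float)
    for c in np.unique(ids):
        mask = ids == c
        mu = sensors[mask].mean(axis=0)
        sd = sensors[mask].std(axis=0) + 1e-8  # guard against constant channels
        out[mask] = (sensors[mask] - mu) / sd
    return out
```

After this step, condition-driven offsets are removed and the remaining variation in each channel is dominated by degradation.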


Per-Seed Results

AMNL demonstrates exceptional consistency on FD002, with all 5 seeds significantly outperforming baselines.

Comprehensive Per-Seed Data

| Seed | RMSE | MAE | R² | NASA Score | Epochs | vs DKAMFormer |
|---|---|---|---|---|---|---|
| 42 | 6.29 | 4.04 | 0.910 | 314.5 | 244 | +41.2% |
| 123 ✓ | 6.19 | 3.95 | 0.912 | 333.5 | 219 | +42.1% |
| 456 | 6.52 | 4.45 | 0.903 | 352.1 | 236 | +39.1% |
| 789 | 6.33 | 3.97 | 0.908 | 360.5 | 183 | +40.8% |
| 1024 | 8.35 | 7.21 | 0.841 | 419.4 | 237 | +22.0% |

Statistical Summary

| Statistic | RMSE | MAE | R² | NASA Score |
|---|---|---|---|---|
| Mean | 6.74 | 4.73 | 0.895 | 356.0 |
| Std Dev | 0.91 | 1.35 | 0.030 | 40.8 |
| Best | 6.19 | 3.95 | 0.912 | 314.5 |
| Worst | 8.35 | 7.21 | 0.841 | 419.4 |

Exceptional Consistency

The standard deviation of 0.91 RMSE corresponds to a coefficient of variation of only 13.5%, much lower than FD001's 18.6%. Even the worst seed (1024, with 8.35 RMSE) significantly beats DKAMFormer (10.70).
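The summary statistics follow directly from the five per-seed RMSE values; a few lines of standard-library Python reproduce them (sample standard deviation, i.e. ddof = 1):

```python
import statistics

# per-seed AMNL RMSE on FD002, from the table above
rmse = {42: 6.29, 123: 6.19, 456: 6.52, 789: 6.33, 1024: 8.35}
vals = list(rmse.values())

mean = statistics.mean(vals)   # 6.736, reported as 6.74
std = statistics.stdev(vals)   # sample std dev, ~0.91
cv = std / mean * 100          # coefficient of variation, ~13.5%
print(f"mean={mean:.2f}, std={std:.2f}, CV={cv:.1f}%")
```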

All Seeds Beat Both Baselines

| Seed | RMSE | vs DKAMFormer (10.70) | vs SOTA (19.77) |
|---|---|---|---|
| 42 | 6.29 | ✓ +41.2% | ✓ +68.2% |
| 123 | 6.19 | ✓ +42.1% | ✓ +68.7% |
| 456 | 6.52 | ✓ +39.1% | ✓ +67.0% |
| 789 | 6.33 | ✓ +40.8% | ✓ +68.0% |
| 1024 | 8.35 | ✓ +22.0% | ✓ +57.8% |

Breakthrough Analysis

Understanding the magnitude of AMNL's improvement on FD002.

Statistical Significance

Highly Significant: p < 0.0001

The improvement is highly statistically significant. With p < 0.0001, there is less than a 0.01% probability that a result this extreme would arise by chance alone. This is the strongest statistical evidence in our evaluation.

| Statistical Measure | Value | Interpretation |
|---|---|---|
| p-value | < 0.0001 | Highly significant (****) |
| Effect Size (Cohen's d) | 4.37 | Very large effect |
| 95% CI Lower | 5.61 | Lower bound of mean RMSE |
| 95% CI Upper | 7.86 | Upper bound of mean RMSE |
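Both the confidence interval and the effect size can be reproduced from the per-seed RMSEs. The sketch below treats the DKAMFormer mean as a fixed reference for Cohen's d and uses the tabulated t critical value for df = 4; small differences from the reported figures (e.g. 4.35 vs 4.37) are within rounding, and the exact formulation used in the evaluation may differ slightly:

```python
import statistics

rmse = [6.29, 6.19, 6.52, 6.33, 8.35]  # per-seed AMNL RMSE on FD002
baseline = 10.70                        # DKAMFormer mean RMSE

n = len(rmse)
mean = statistics.mean(rmse)
sd = statistics.stdev(rmse)

# Cohen's d with the baseline treated as a fixed reference point
d = (baseline - mean) / sd              # ~4.35: a very large effect

# 95% CI for the mean RMSE; t critical value for df = 4 is 2.776 (t table)
margin = 2.776 * sd / n ** 0.5
lo, hi = mean - margin, mean + margin   # ~[5.61, 7.87]
```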

NASA Score Improvement

Unlike FD001, AMNL improves both RMSE and NASA Score on FD002:

| Metric | AMNL | DKAMFormer | Improvement |
|---|---|---|---|
| RMSE | 6.74 | 10.70 | +37.0% ✓ |
| NASA Score | 356.0 | 498.0 | +28.5% ✓ |

Dual Improvement: On FD002, AMNL achieves better RMSE and lower NASA Score simultaneously. This indicates the model not only predicts more accurately but also makes fewer dangerous late predictions.
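The NASA (PHM08) scoring function is what makes "fewer late predictions" visible: it penalizes late predictions exponentially more than early ones, since a late RUL estimate means the engine fails before maintenance is scheduled. The standard definition can be written in a few lines:

```python
import math

def nasa_score(pred, true):
    """PHM08 / C-MAPSS scoring function. For error d = pred - true RUL:
    early (d < 0):  exp(-d / 13) - 1
    late  (d >= 0): exp(d / 10) - 1
    Lower totals are better; late errors grow faster than early ones."""
    total = 0.0
    for p, t in zip(pred, true):
        d = p - t  # positive = late (dangerous)
        total += math.exp(d / 10) - 1 if d >= 0 else math.exp(-d / 13) - 1
    return total

# the asymmetry: a 10-cycle late error costs more than a 10-cycle early one
late = nasa_score([60], [50])   # exp(1) - 1  ~= 1.72
early = nasa_score([40], [50])  # exp(10/13) - 1 ~= 1.16
```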

Why AMNL Excels on FD002

Understanding the mechanisms behind AMNL's breakthrough performance on multi-condition data.

The Multi-Condition Challenge

Traditional RUL models struggle with FD002 because sensor readings depend on both degradation level and operating condition:

\text{Sensor}_i(t) = f(\text{Degradation}(t)) + g(\text{Condition}(t)) + \epsilon

The challenge: extracting f(\text{Degradation}) when it's confounded with g(\text{Condition}).
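A toy simulation of this additive model makes the confound concrete. Pooled over mixed conditions, a sensor barely correlates with degradation, because the condition offset g dominates; within a single condition, the relationship is nearly perfect. The numbers below are illustrative, not fitted to real C-MAPSS data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3000

degradation = rng.uniform(0, 100, n)        # hidden health state
condition = rng.integers(0, 3, n)           # 3 of the 6 regimes, for brevity
g = np.array([0.0, 10.0, 20.0])[condition]  # condition-dependent offset
sensor = 0.05 * degradation + g + rng.normal(0, 0.2, n)

# pooled across conditions, the sensor barely tracks degradation...
pooled = np.corrcoef(sensor, degradation)[0, 1]     # ~0.17
# ...but within a single condition the relationship is nearly perfect
mask = condition == 0
within = np.corrcoef(sensor[mask], degradation[mask])[0, 1]  # ~0.99
```

A model that never learns the condition structure sees only the weak pooled signal; one that accounts for it recovers the strong within-condition signal.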

How AMNL Solves This

  1. Dual-task learning: The health classification task provides an auxiliary signal that guides feature learning
  2. Equal weighting (0.5/0.5): Balanced gradients prevent either task from dominating the representation
  3. Shared encoder: The CNN-BiLSTM-Attention backbone must learn features useful for both tasks
  4. Condition-invariant representations: Health states (Healthy/Degrading/Critical) are defined by RUL, not conditions. Learning to classify health therefore requires learning condition-invariant features
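Points 2 and 4 can be sketched together: health labels derived purely from RUL, and an equal-weight (0.5/0.5) combination of the regression and classification losses. The thresholds and function names below are illustrative assumptions, not the exact values used by AMNL:

```python
import numpy as np

def health_state(rul, degrading=100, critical=30):
    """Label each cycle from RUL alone (thresholds here are illustrative):
    0 = Healthy, 1 = Degrading, 2 = Critical. Because the labels never
    reference the operating condition, a classifier trained on them is
    pushed toward condition-invariant features."""
    return np.where(rul <= critical, 2, np.where(rul <= degrading, 1, 0))

def dual_task_loss(rul_pred, rul_true, cls_logits, cls_true, w=0.5):
    """Equal-weight dual-task objective: w * MSE + (1 - w) * cross-entropy.
    With w = 0.5, neither task's gradients dominate the shared encoder."""
    mse = np.mean((rul_pred - rul_true) ** 2)
    logp = cls_logits - np.log(np.exp(cls_logits).sum(axis=1, keepdims=True))
    ce = -np.mean(logp[np.arange(len(cls_true)), cls_true])
    return w * mse + (1 - w) * ce
```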

The Regularization Effect

On single-condition FD001, the health-classification task acts mainly as a mild regularizer; on multi-condition FD002, it becomes the mechanism that forces condition-invariant features, which is why the improvement jumps from +2.3% to +37.0%.

FD002 vs FD001 Comparison

| Aspect | FD001 (Simple) | FD002 (Complex) |
|---|---|---|
| Operating Conditions | 1 | 6 |
| AMNL Improvement | +2.3% | +37.0% |
| Multi-task Benefit | Limited | Substantial |
| Condition-invariance Needed | No | Yes |
| Health Task Value | Moderate regularization | Essential for generalization |

The Negative Transfer Gap

Conventional wisdom suggests multi-task learning often suffers from "negative transfer"—where auxiliary tasks hurt primary task performance. AMNL shows the opposite: multi-task learning helps more as complexity increases.


Summary

FD002 Results Summary:

  1. Mean RMSE: 6.74 ± 0.91 (highly consistent)
  2. Improvement: +37.0% vs DKAMFormer, +65.9% vs SOTA
  3. Best single result: 6.19 RMSE (seed 123)
  4. Statistical significance: p < 0.0001 (highly significant)
  5. All seeds beat baselines: 5/5 seeds outperform DKAMFormer and SOTA
| Key Metric | FD002 Result | Significance |
|---|---|---|
| RMSE Improvement | +37.0% | Largest vs DKAMFormer |
| Consistency (CV) | 13.5% | Lower than FD001 |
| NASA Score | 356.0 (vs 498.0) | Also improved |
| Effect Size | 4.37 | Very large |

Key Insight: FD002 represents AMNL's breakthrough dataset. The +37% improvement over DKAMFormer demonstrates that equal task weighting enables superior learning of condition-invariant features. This pattern continues with FD004 (also 6 conditions), where we see similar gains.

FD002 shows AMNL's strength on multi-condition data. Next, we examine FD003 (2 fault modes).