Learning Objectives
By the end of this section, you will:
- Understand FD002's complexity with 6 operating conditions
- Analyze the breakthrough +37.0% improvement over DKAMFormer
- Examine exceptional consistency across all 5 seeds
- Understand why multi-condition learning benefits from AMNL
- Interpret statistical significance (p < 0.0001)
Key Result: On FD002, AMNL achieves 6.74 ± 0.91 RMSE—a +37.0% improvement over DKAMFormer (10.70) and +65.9% improvement over published SOTA (19.77). This is the most consistent result across seeds, with all 5 seeds beating both baselines. Statistical significance: p < 0.0001.
FD002 Dataset Characteristics
FD002 is a complex dataset with 6 distinct operating conditions, presenting significant challenges for RUL prediction.
Dataset Configuration
| Property | Value | Implication |
|---|---|---|
| Operating Conditions | 6 (Various altitudes/speeds) | High condition variability |
| Fault Modes | 1 (HPC Degradation) | Single failure pattern |
| Training Engines | 260 | Large training set |
| Test Engines | 259 | Comprehensive evaluation |
| Total Training Cycles | ~53,000 | Substantial data volume |
| Condition Distribution | Variable per engine | Condition shifts during operation |
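The training cycles above carry no explicit RUL column; the label used throughout these experiments is conventionally derived as the number of cycles remaining before each engine's final recorded cycle. A minimal pure-Python sketch of that derivation (toy records stand in for the real file, which also carries 3 operating-setting and 21 sensor columns):

```python
from collections import defaultdict

# Each record: (engine unit id, cycle number). Real C-MAPSS rows add
# 3 operating settings and 21 sensor readings per cycle.
records = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]

def add_rul(rows):
    """Training RUL label: cycles remaining until each engine's final cycle."""
    last = defaultdict(int)
    for unit, cycle in rows:
        last[unit] = max(last[unit], cycle)
    return [(unit, cycle, last[unit] - cycle) for unit, cycle in rows]

print(add_rul(records))  # [(1, 1, 2), (1, 2, 1), (1, 3, 0), (2, 1, 1), (2, 2, 0)]
```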
Operating Condition Details
The 6 operating conditions span different combinations of altitude, Mach number, and throttle resolver angle:
| Condition | Altitude (ft) | Mach | TRA |
|---|---|---|---|
| 1 | 0 (Sea Level) | 0.00 | 100 |
| 2 | 10,000 | 0.25 | 100 |
| 3 | 20,000 | 0.70 | 100 |
| 4 | 25,000 | 0.62 | 60 |
| 5 | 35,000 | 0.84 | 100 |
| 6 | 42,000 | 0.84 | 100 |
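FD002 engines do not report a condition ID directly; the recorded operating settings cluster tightly around the six setpoints in the table, so a common preprocessing step (illustrative here, not necessarily AMNL's pipeline) is to assign each cycle to its nearest setpoint:

```python
import math

# The six FD002 setpoints from the table: (altitude in kft, Mach, TRA).
# Altitude is scaled to thousands of feet so the three axes are comparable.
SETPOINTS = {
    1: (0.0, 0.00, 100), 2: (10.0, 0.25, 100), 3: (20.0, 0.70, 100),
    4: (25.0, 0.62, 60),  5: (35.0, 0.84, 100), 6: (42.0, 0.84, 100),
}

def condition_id(alt_kft, mach, tra):
    """Assign a cycle to the nearest of the six setpoints (Euclidean distance)."""
    return min(SETPOINTS, key=lambda k: math.dist(SETPOINTS[k], (alt_kft, mach, tra)))

print(condition_id(34.7, 0.83, 100))  # noisy reading near condition 5 → 5
```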
Why FD002 is Challenging
Unlike FD001, engines in FD002 operate across multiple conditions. This creates condition-dependent sensor distributions: the same degradation level produces different sensor readings at different altitudes. Models must learn condition-invariant degradation features to predict RUL accurately.
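One standard way to remove this condition-dependence is to z-score each sensor within its operating condition, so the same degradation level maps to comparable values in every regime. A self-contained sketch of this baseline trick (illustrative only; the section does not state that AMNL uses it):

```python
from statistics import mean, stdev

def normalize_per_condition(readings):
    """Z-score sensor values within each operating condition.

    readings: list of (condition_id, value) pairs. Returns z-scores aligned
    with the input, making values comparable across conditions.
    """
    by_cond = {}
    for cond, value in readings:
        by_cond.setdefault(cond, []).append(value)
    stats = {c: (mean(v), stdev(v)) for c, v in by_cond.items()}
    return [(value - stats[c][0]) / stats[c][1] for c, value in readings]

# The same degradation trend, shifted and scaled differently in two conditions:
data = [(1, 10.0), (1, 11.0), (1, 12.0), (2, 100.0), (2, 110.0), (2, 120.0)]
print([round(z, 2) for z in normalize_per_condition(data)])
# [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0] — identical trend after normalization
```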
Per-Seed Results
AMNL demonstrates exceptional consistency on FD002, with all 5 seeds significantly outperforming baselines.
Comprehensive Per-Seed Data
| Seed | RMSE | MAE | R² | NASA Score | Epochs | vs DKAMFormer |
|---|---|---|---|---|---|---|
| 42 | 6.29 | 4.04 | 0.910 | 314.5 | 244 | +41.2% |
| 123 (best) | 6.19 | 3.95 | 0.912 | 333.5 | 219 | +42.1% |
| 456 | 6.52 | 4.45 | 0.903 | 352.1 | 236 | +39.1% |
| 789 | 6.33 | 3.97 | 0.908 | 360.5 | 183 | +40.8% |
| 1024 | 8.35 | 7.21 | 0.841 | 419.4 | 237 | +22.0% |
Statistical Summary
| Statistic | RMSE | MAE | R² | NASA Score |
|---|---|---|---|---|
| Mean | 6.74 | 4.73 | 0.895 | 356.0 |
| Std Dev | 0.91 | 1.35 | 0.030 | 40.8 |
| Best | 6.19 | 3.95 | 0.912 | 314.5 |
| Worst | 8.35 | 7.21 | 0.841 | 419.4 |
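The summary statistics above can be reproduced directly from the per-seed RMSEs, using the sample (n − 1) standard deviation:

```python
from statistics import mean, stdev

rmse = [6.29, 6.19, 6.52, 6.33, 8.35]  # per-seed RMSEs from the table above
m, s = mean(rmse), stdev(rmse)         # sample standard deviation (n - 1)
cv = 100 * s / m                       # coefficient of variation, in percent
print(f"mean={m:.2f}, std={s:.2f}, CV={cv:.1f}%")  # mean=6.74, std=0.91, CV=13.5%
```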
Exceptional Consistency
The standard deviation of 0.91 RMSE represents only 13.5% coefficient of variation—much lower than FD001's 18.6%. Even the worst seed (1024 with 8.35 RMSE) significantly beats DKAMFormer (10.70).
All Seeds Beat Both Baselines
| Seed | RMSE | vs DKAMFormer (10.70) | vs SOTA (19.77) |
|---|---|---|---|
| 42 | 6.29 | ✓ +41.2% | ✓ +68.2% |
| 123 | 6.19 | ✓ +42.1% | ✓ +68.7% |
| 456 | 6.52 | ✓ +39.1% | ✓ +67.0% |
| 789 | 6.33 | ✓ +40.8% | ✓ +68.0% |
| 1024 | 8.35 | ✓ +22.0% | ✓ +57.8% |
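The improvement percentages in these tables follow the usual relative-RMSE-reduction formula:

```python
def improvement(baseline: float, ours: float) -> float:
    """Relative RMSE reduction versus a baseline, in percent (higher is better)."""
    return 100 * (baseline - ours) / baseline

DKAMFORMER, SOTA = 10.70, 19.77
for seed, rmse in [(42, 6.29), (123, 6.19), (456, 6.52), (789, 6.33), (1024, 8.35)]:
    print(seed, f"+{improvement(DKAMFORMER, rmse):.1f}%", f"+{improvement(SOTA, rmse):.1f}%")
# e.g. seed 42 → +41.2% vs DKAMFormer, +68.2% vs SOTA
```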
Breakthrough Analysis
Understanding the magnitude of AMNL's improvement on FD002.
Statistical Significance
Highly Significant: p < 0.0001
The improvement is highly statistically significant: under the null hypothesis of no improvement over DKAMFormer, a difference this large would arise by chance less than 0.01% of the time. This is the strongest statistical evidence in our evaluation.
| Statistical Measure | Value | Interpretation |
|---|---|---|
| p-value | < 0.0001 | Highly significant (****) |
| Effect Size (Cohen's d) | 4.37 | Very large effect |
| 95% CI Lower | 5.61 | Lower bound of mean RMSE |
| 95% CI Upper | 7.86 | Upper bound of mean RMSE |
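The confidence interval can be reproduced from the five seeds with a t-based interval. For Cohen's d, only DKAMFormer's published point estimate is available here, so AMNL's own standard deviation stands in for the pooled one; this convention lands near, though not exactly at, the reported 4.37:

```python
from statistics import mean, stdev
from math import sqrt

rmse = [6.29, 6.19, 6.52, 6.33, 8.35]
n, m, s = len(rmse), mean(rmse), stdev(rmse)

# 95% CI for the mean via the t distribution; t(df=4, 0.975) ≈ 2.776
t_crit = 2.776
half = t_crit * s / sqrt(n)
print(f"95% CI: [{m - half:.2f}, {m + half:.2f}]")  # ≈ [5.61, 7.87]

# Cohen's d vs DKAMFormer's mean (10.70), using AMNL's std as denominator
d = (10.70 - m) / s
print(f"Cohen's d ≈ {d:.2f}")
```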
NASA Score Improvement
Unlike FD001, AMNL improves both RMSE and NASA Score on FD002:
| Metric | AMNL | DKAMFormer | Improvement |
|---|---|---|---|
| RMSE | 6.74 | 10.70 | +37.0% ✓ |
| NASA Score | 356.0 | 498.0 | +28.5% ✓ |
Dual Improvement: On FD002, AMNL achieves better RMSE and lower NASA Score simultaneously. This indicates the model not only predicts more accurately but also makes fewer dangerous late predictions.
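The asymmetry behind this claim comes from the standard C-MAPSS scoring function, which penalizes late predictions (predicted RUL above the true value, the dangerous case) exponentially harder than early ones:

```python
from math import exp

def nasa_score(y_true, y_pred):
    """Standard C-MAPSS scoring function: late predictions (pred > true)
    incur an exponentially steeper penalty than early ones."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        d = p - t
        total += exp(-d / 13) - 1 if d < 0 else exp(d / 10) - 1
    return total

# A 10-cycle-late miss costs more than a 10-cycle-early one:
print(round(nasa_score([50], [60]), 2))  # late:  e^(10/10) - 1 ≈ 1.72
print(round(nasa_score([50], [40]), 2))  # early: e^(10/13) - 1 ≈ 1.16
```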
Why AMNL Excels on FD002
Understanding the mechanisms behind AMNL's breakthrough performance on multi-condition data.
The Multi-Condition Challenge
Traditional RUL models struggle with FD002 because sensor readings depend on both degradation level and operating condition: roughly, sensor(t) = f(degradation(t), condition(t)).
The challenge: extracting the degradation signal when it is confounded with the operating condition.
How AMNL Solves This
- Dual-task learning: The health classification task provides an auxiliary signal that guides feature learning
- Equal weighting (0.5/0.5): Balanced gradients prevent either task from dominating the representation
- Shared encoder: The CNN-BiLSTM-Attention backbone must learn features useful for both tasks
- Condition-invariant representations: Health states (Healthy/Degrading/Critical) are defined by RUL, not conditions— learning to classify health requires learning condition-invariant features
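A minimal sketch of the mechanisms above, with illustrative RUL thresholds for the three health states (the exact class boundaries are not stated in this section):

```python
# Illustrative thresholds only — the section does not specify the boundaries.
HEALTHY, DEGRADING, CRITICAL = 0, 1, 2

def health_state(rul: float) -> int:
    """Health labels derive from RUL alone, so they are condition-invariant."""
    if rul > 100:
        return HEALTHY
    if rul > 30:
        return DEGRADING
    return CRITICAL

def total_loss(rul_loss: float, health_loss: float, w: float = 0.5) -> float:
    """Equal 0.5/0.5 weighting: neither task's gradient dominates the shared encoder."""
    return w * rul_loss + (1 - w) * health_loss

print(health_state(120), health_state(50), health_state(10))  # 0 1 2
print(total_loss(2.0, 0.8))  # 0.5*2.0 + 0.5*0.8 = 1.4
```

Because the classification target depends only on RUL, the shared encoder cannot satisfy it by memorizing condition-specific sensor levels.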
The Regularization Effect
The auxiliary health-classification task acts as a regularizer on the shared encoder: because health labels are defined by RUL rather than by operating condition, the encoder is pushed toward features that track degradation itself. This pressure matters little on single-condition FD001 but becomes essential on FD002.
FD002 vs FD001 Comparison
| Aspect | FD001 (Simple) | FD002 (Complex) |
|---|---|---|
| Operating Conditions | 1 | 6 |
| AMNL Improvement | +2.3% | +37.0% |
| Multi-task Benefit | Limited | Substantial |
| Condition-invariance Needed | No | Yes |
| Health Task Value | Moderate regularization | Essential for generalization |
The Negative Transfer Gap
Conventional wisdom suggests multi-task learning often suffers from "negative transfer"—where auxiliary tasks hurt primary task performance. AMNL shows the opposite: multi-task learning helps more as complexity increases.
Summary
FD002 Results Summary:
- Mean RMSE: 6.74 ± 0.91 (highly consistent)
- Improvement: +37.0% vs DKAMFormer, +65.9% vs SOTA
- Best single result: 6.19 RMSE (seed 123)
- Statistical significance: p < 0.0001 (highly significant)
- All seeds beat baselines: 5/5 seeds outperform DKAMFormer and SOTA
| Key Metric | FD002 Result | Significance |
|---|---|---|
| RMSE Improvement | +37.0% | Largest vs DKAMFormer |
| Consistency (CV) | 13.5% | Lower than FD001 |
| NASA Score | 356.0 (vs 498.0) | Also improved |
| Effect Size | 4.37 | Very large |
Key Insight: FD002 represents AMNL's breakthrough dataset. The +37% improvement over DKAMFormer demonstrates that equal task weighting enables superior learning of condition-invariant features. This pattern continues with FD004 (also 6 conditions), where we see similar gains.
FD002 shows AMNL's strength on multi-condition data. Next, we examine FD003 (2 fault modes).