Early vs Late: Two Different Failures
Predict an engine's RUL too early and you replace a part with life left in it - cost a few thousand dollars, ten or twenty cycles of wasted life, a bit of unplanned tear-down. Predict it too late and you miss the failure - cost a few hundred thousand dollars, an unplanned outage, possible cascade damage, possibly a safety incident. Same magnitude of error; very different cost.
| Error sign | Meaning | Operational consequence | Typical $$ scale |
|---|---|---|---|
| d < 0 | predicted RUL < true RUL | early/premature replacement | $5K - $20K (part + labour) |
| d ≈ 0 | near-perfect prediction | ideal maintenance timing | minimal |
| d > 0 | predicted RUL > true RUL | late - failure missed | $100K - $1M+ (downtime + cascade) |
RMSE - which is the headline metric in nearly every RUL paper - treats $+d$ and $-d$ as identical. The NASA scoring function does not. This section is about that scoring function.
The NASA Scoring Function
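For prediction error $d_i = \hat{y}_i - y_i$ (predicted RUL minus true RUL) on engine $i$, the NASA C-MAPSS score is

$$
s(d_i) =
\begin{cases}
e^{-d_i / a_1} - 1, & d_i < 0 \quad \text{(early)} \\
e^{d_i / a_2} - 1, & d_i \ge 0 \quad \text{(late)}
\end{cases}
$$

The total score for a test set is $S = \sum_{i=1}^{N} s(d_i)$; lower is better. The two decay constants $a_1 = 13$ (early) and $a_2 = 10$ (late) come from the original PHM 2008 challenge brief and have been the standard ever since.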
How Lopsided Is It?
Side-by-side cost for symmetric errors $\pm d$:
| \|d\| (cycles) | early s(-d) | late s(+d) | late / early ratio |
|---|---|---|---|
| 5 | 0.469 | 0.649 | 1.4× |
| 10 | 1.158 | 1.718 | 1.5× |
| 15 | 2.170 | 3.482 | 1.6× |
| 20 | 3.657 | 6.389 | 1.7× |
| 25 | 5.842 | 11.182 | 1.9× |
| 30 | 9.051 | 19.086 | 2.1× |
| 40 | 20.692 | 53.598 | 2.6× |
| 50 | 45.813 | 147.413 | 3.2× |
Read this table once, never forget it. A 20-cycle late prediction costs the same as a 26-cycle early prediction. A 30-cycle late prediction costs the same as a 39-cycle early prediction. The model should err early - and the loss function should make that easy.
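The equal-cost pairs follow from setting the early and late branches of the score equal. A short derivation, assuming the standard constants $a_1 = 13$ and $a_2 = 10$; the symbols $d_{\text{early}}$ and $d_{\text{late}}$ are introduced here only for illustration and denote the magnitudes of an early and a late error:

$$
e^{d_{\text{early}} / a_1} - 1 = e^{d_{\text{late}} / a_2} - 1
\;\Longrightarrow\;
d_{\text{early}} = \frac{a_1}{a_2}\, d_{\text{late}} = 1.3\, d_{\text{late}}
$$

So 20 cycles late is matched by $1.3 \times 20 = 26$ cycles early, and 30 cycles late by 39 cycles early. For errors of equal magnitude $d$, the ratio $(e^{d/a_2} - 1) / (e^{d/a_1} - 1)$ starts at $a_1 / a_2 = 1.3$ for small $d$ and grows without bound as $d$ increases.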
Interactive: Cost vs Prediction Error
Two knobs control the picture: bias and noise. Bias slides the entire prediction-error distribution; noise spreads it. A POSITIVE bias (late on average) raises the score faster than a NEGATIVE bias (early) of the same size, and the gap widens as the bias grows.
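The effect of the two knobs can be approximated offline with a small Monte Carlo sketch. It assumes Gaussian errors with mean `bias` and standard deviation `noise`, and the standard constants a1 = 13, a2 = 10. The helper names `nasa_cost` and `mean_cost`, the sample size, and the seed are illustrative choices, not part of the original interactive.

```python
import numpy as np

A_EARLY, A_LATE = 13.0, 10.0  # decay constants a1 (early) and a2 (late)


def nasa_cost(d):
    """Per-sample NASA cost for errors d = predicted RUL - true RUL."""
    d = np.asarray(d, dtype=float)
    return np.where(d < 0, np.exp(-d / A_EARLY) - 1.0, np.exp(d / A_LATE) - 1.0)


def mean_cost(bias, noise, n_samples=100_000, seed=0):
    """Average cost when errors are drawn from a normal(bias, noise) distribution."""
    rng = np.random.default_rng(seed)
    d = rng.normal(loc=bias, scale=noise, size=n_samples)
    return float(nasa_cost(d).mean())


# Sweep the two knobs: same noise, mirrored biases.
for noise in (5.0, 10.0):
    for bias in (-5.0, -2.0, 0.0, 2.0, 5.0):
        cost = mean_cost(bias, noise)
        print(f"noise={noise:4.1f}  bias={bias:+5.1f}  mean cost={cost:.3f}")
```

Comparing the rows for `+bias` and `-bias` at the same noise level shows how much more a late shift costs than an early shift of equal size.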
Python: NASA Score from Scratch
Pure NumPy implementation - five lines of math, the rest is bookkeeping. Worked example: five engines with a mix of early and late errors demonstrate the cost asymmetry.
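A minimal NumPy sketch of such an implementation, assuming d = predicted RUL - true RUL, a cost of exp(-d/a1) - 1 for early errors (d < 0) and exp(d/a2) - 1 for late ones, with a1 = 13 and a2 = 10. The function name `nasa_score`, the true RUL values, and the five errors are illustrative assumptions, not the original worked example.

```python
import numpy as np


def nasa_score(y_pred, y_true, a_early=13.0, a_late=10.0):
    """Return (total score, per-engine costs) for d = y_pred - y_true."""
    d = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    per_engine = np.where(
        d < 0,
        np.exp(-d / a_early) - 1.0,  # early: predicted RUL below true RUL
        np.exp(d / a_late) - 1.0,    # late: predicted RUL at or above true RUL
    )
    return float(per_engine.sum()), per_engine


# Hypothetical example: five engines with symmetric errors around zero.
y_true = np.array([100.0, 80.0, 60.0, 40.0, 20.0])
d_demo = np.array([-20.0, -10.0, 0.0, 10.0, 20.0])
y_pred = y_true + d_demo

total, costs = nasa_score(y_pred, y_true)
rmse = float(np.sqrt(np.mean(d_demo ** 2)))

for d_i, c_i in zip(d_demo, costs):
    print(f"d={d_i:+6.1f}  cost={c_i:.3f}")
print(f"total NASA score={total:.3f}  RMSE={rmse:.3f}")
```

RMSE sees the -20 and +20 errors as the same; the per-engine costs do not, with the late error costing roughly 1.7 times the early one.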
PyTorch: Differentiable NASA Loss
Production version. NASAScoreLoss as an nn.Module with torch.where for the piecewise cost, plus a clip_error guard against gradient explosion. Same numerical answer as the NumPy block.
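A sketch of what such a module could look like, using the names from the description (`NASAScoreLoss`, `clip_error`) and the standard constants a1 = 13, a2 = 10. The default `clip_error=50.0`, the `reduction` argument, and the example tensors are assumptions made for illustration, not values from the original listing.

```python
import torch
from torch import nn


class NASAScoreLoss(nn.Module):
    """Asymmetric exponential loss on d = y_pred - y_true (NASA score)."""

    def __init__(self, a_early=13.0, a_late=10.0, clip_error=50.0, reduction="sum"):
        super().__init__()
        if reduction not in ("sum", "mean"):
            raise ValueError("reduction must be 'sum' or 'mean'")
        self.a_early = a_early
        self.a_late = a_late
        self.clip_error = clip_error  # assumed default; tune per dataset
        self.reduction = reduction

    def forward(self, y_pred, y_true):
        # Clamp the residual so neither exp() branch can overflow.
        d = torch.clamp(y_pred - y_true, -self.clip_error, self.clip_error)
        cost = torch.where(
            d >= 0,
            torch.exp(d / self.a_late) - 1.0,    # late branch
            torch.exp(-d / self.a_early) - 1.0,  # early branch
        )
        return cost.sum() if self.reduction == "sum" else cost.mean()


# Hypothetical example: same five errors as a symmetric -20 .. +20 sweep.
y_true = torch.tensor([100.0, 80.0, 60.0, 40.0, 20.0])
errors = torch.tensor([-20.0, -10.0, 0.0, 10.0, 20.0])
y_pred = (y_true + errors).requires_grad_()

loss = NASAScoreLoss()(y_pred, y_true)
loss.backward()
print(f"loss={loss.item():.3f}")
print("grad wrt predictions:", y_pred.grad)
```

One design caveat: `torch.clamp` passes zero gradient for residuals outside `[-clip_error, +clip_error]`, so a clipped sample stops contributing to the update. The clip value should therefore sit above the errors expected once training is under way.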
Asymmetric Cost in Other Domains
Asymmetric loss is not a C-MAPSS-only idea - any safety-critical decision with different costs for over- vs under-prediction wants something like NASA score.
| Domain | Cheaper error | Costlier error | Asymmetry constants |
|---|---|---|---|
| RUL prediction (C-MAPSS) | early replacement ($5K) | missed failure ($1M+) | a1=13, a2=10 (NASA) |
| Wildfire risk score | false alarm (annoyance) | missed wildfire ($10M+) | a1=20, a2=5 (typical agency) |
| Battery SoC for EV range | over-conservative range | stranding the driver | a1=15, a2=4 (OEM) |
| Hospital ICU triage score | extra observation hour | missed deterioration | a1=25, a2=2 (clinical) |
| Inventory days-to-stockout | early reorder (carrying cost) | stockout (lost sale + brand) | a1=10, a2=4 (retail) |
| Power-grid load forecast | spot-buy more capacity | blackout | a1=12, a2=3 (TSO) |
Three NASA-Score Pitfalls
- Training without clipping. The exponential is unbounded, so large residuals at initialisation can blow up the gradients; clamp(-clip_error, +clip_error) the residual. Evaluation on a finite test set has no such risk.
- Dropping the -1. Writing exp(d/a) - 1 as just exp(d/a) makes a perfect prediction cost 1 instead of 0, a constant offset of +N per evaluation. The offset is constant, so gradients are unchanged: with or without the -1, the slope at d=0 is -1/a1 on the early side and +1/a2 on the late side. The damage is to the reported score, which no longer compares with published baselines.

The point. RMSE is a stand-in convenience metric; NASA score is the one operators care about. §13.2 shows the Pareto frontier between them; §13.3 reports published baselines on both metrics; §13.4 closes the chapter with the operator-cost framing.
Takeaway
- Lateness is the feature, not magnitude. NASA s(d) penalises late predictions ~1.5-3× more than early predictions of the same magnitude.
- Two decay constants. $a_1 = 13$ (early), $a_2 = 10$ (late). Standard since PHM 2008.
- Differentiable surrogate. torch.where(d >= 0, exp(d/a2)-1, exp(-d/a1)-1) + clamp on |d| = a working PyTorch loss.
- Always clip when training. Unbounded exp() is a gradient bomb at init.
- Always report RMSE alongside. NASA alone can be gamed by systematically early predictions.