Learning Objectives
By the end of this section, you will:
- Understand why RUL errors are asymmetric in their consequences
- Analyze the NASA scoring function and its asymmetric penalties
- Design asymmetric loss functions for training
- Balance asymmetry with training stability
- Implement differentiable asymmetric losses in PyTorch
Why This Matters: In real maintenance operations, predicting failure too late is catastrophic (unplanned downtime, safety risks), while predicting too early is merely costly (premature replacement). Asymmetric losses encode this operational reality into the learning objective.
Asymmetric Nature of RUL Errors
RUL prediction errors have fundamentally different consequences depending on their direction.
Error Direction Analysis
Define the prediction error for sample i as:

d_i = ŷ_i - y_i

where ŷ_i is the predicted RUL and y_i is the true RUL.
| Error Sign | Meaning | Consequence | Severity |
|---|---|---|---|
| d < 0 | Predicted RUL < True RUL | Early prediction (premature action) | Cost inefficiency |
| d = 0 | Perfect prediction | Optimal maintenance timing | Ideal |
| d > 0 | Predicted RUL > True RUL | Late prediction (delayed action) | Safety risk, failure |
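The sign convention is easy to get backwards, so here is a minimal framework-free sketch (`error_direction` is an illustrative helper, not a library function):

```python
def error_direction(pred_rul: float, true_rul: float) -> str:
    """Classify a RUL prediction by the sign of d = pred - true."""
    d = pred_rul - true_rul
    if d < 0:
        return "early"  # predicted less life than remains: premature action
    if d > 0:
        return "late"   # predicted more life than remains: delayed action
    return "exact"

# A unit with 50 cycles left, predicted at 40, triggers maintenance early.
print(error_direction(40, 50))  # early
print(error_direction(60, 50))  # late
```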
Operational Consequences
A late prediction means the asset is still in service when it fails: unplanned downtime, possible secondary damage, and safety exposure. An early prediction only sacrifices some remaining useful life through premature replacement, a bounded and plannable cost. A good loss function should reflect this imbalance.
NASA Scoring Function
NASA introduced an asymmetric scoring function for C-MAPSS evaluation.
Scoring Function Definition
For a test set of N samples, the total score is

S = Σ_{i=1}^{N} s_i, where s_i = exp(-d_i / a_1) - 1 if d_i < 0, and s_i = exp(d_i / a_2) - 1 if d_i ≥ 0

Where:
- d_i = ŷ_i - y_i: Prediction error for sample i
- a_1 = 13: Early prediction decay constant
- a_2 = 10: Late prediction decay constant

Lower scores are better; a perfect prediction contributes 0.
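The NASA per-sample score can be sketched directly in plain Python (`per_sample_score` and `nasa_score` are illustrative names, not library functions):

```python
import math

A1, A2 = 13.0, 10.0  # early / late decay constants

def per_sample_score(d: float) -> float:
    """NASA C-MAPSS per-sample score for error d = pred - true."""
    if d < 0:
        return math.exp(-d / A1) - 1.0  # early: gentler exponential
    return math.exp(d / A2) - 1.0       # late: steeper exponential

def nasa_score(preds, targets) -> float:
    """Total score over a test set (lower is better)."""
    return sum(per_sample_score(p - t) for p, t in zip(preds, targets))

print(round(per_sample_score(10), 2))   # 1.72 (late by 10 cycles)
print(round(per_sample_score(-10), 2))  # 1.16 (early by 10 cycles)
```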
Asymmetry Ratio
Because a_2 = 10 < a_1 = 13, the late-prediction penalty grows faster, so late predictions are penalized more severely:
| Error (d) | Early Score (d<0) | Late Score (d≥0) | Ratio |
|---|---|---|---|
| ±5 | 0.47 | 0.65 | 1.4× |
| ±10 | 1.16 | 1.72 | 1.5× |
| ±15 | 2.17 | 3.48 | 1.6× |
| ±20 | 3.66 | 6.39 | 1.7× |
| ±30 | 9.05 | 19.09 | 2.1× |
Exponential Asymmetry
Both branches are exponential, but the late branch grows faster (decay constant 10 vs. 13), so the penalty gap widens with error magnitude: a 30-cycle late prediction is penalized about 2.1× more than a 30-cycle early prediction, and the ratio itself grows roughly as exp(3|d|/130).
Score Visualization
NASA Score vs. Prediction Error:

Score
 20 ─┤                           ╱
    │                          ╱
 15 ─┤                        ╱
    │                       ╱
 10 ─┤                     ╱
    │                    ╱
  5 ─┤            ______╱
    │       ____─
  0 ─┼──────●──────────────────────
    │       │
 -5 ─┴──┬───┼──┬───┬───┬───┬───┬──
      -30 -20 -10  0  10  20  30
        Prediction Error (d = ŷ - y)

Key:
  d < 0: Early (gradual penalty)
  d > 0: Late (steep penalty)

Asymmetric Loss Formulation
We design a differentiable asymmetric loss for training.
Smooth Asymmetric MSE
A simple approach uses different coefficients for positive and negative errors:

L_asym = (1/N) Σ_{i=1}^{N} α(d_i) · d_i²

Where:
- α(d) = α_early if d < 0 (early prediction)
- α(d) = α_late if d ≥ 0 (late prediction)
- Typically α_early = 1.0 and α_late > 1 (e.g., 1.3)
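A worked example of the coefficient scheme, computed by hand in plain Python (α_early = 1.0, α_late = 1.3; `asymmetric_mse` is an illustrative helper):

```python
def asymmetric_mse(preds, targets, alpha_early=1.0, alpha_late=1.3):
    """Mean of alpha(d) * d**2 over the batch, d = pred - target."""
    total = 0.0
    for p, t in zip(preds, targets):
        d = p - t
        alpha = alpha_late if d >= 0 else alpha_early
        total += alpha * d * d
    return total / len(preds)

# Two predictions, each off by 10 cycles in opposite directions:
# early error contributes 1.0 * 100, late error contributes 1.3 * 100.
print(asymmetric_mse([40, 60], [50, 50]))  # 115.0
```

With symmetric coefficients (α_late = 1.0) the same batch gives plain MSE = 100, so the 1.3 coefficient raises the batch loss by 15% here, entirely due to the late sample.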
Differentiable NASA-Style Loss
For direct optimization toward the NASA score, use the mean score itself as the loss:

L_NASA = (1/N) Σ_{i=1}^{N} s_i, where s_i = exp(-d_i / a_1) - 1 if d_i < 0, and s_i = exp(d_i / a_2) - 1 if d_i ≥ 0
Training Instability
The exponential form can cause gradient explosion for large errors. In practice, we clip errors or use a hybrid approach that switches to linear penalty beyond a threshold.
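The instability is easy to quantify: for the late branch, the gradient with respect to the error is exp(d / a_2) / a_2, which explodes for large d. A quick framework-free check with illustrative error values:

```python
import math

A2 = 10.0  # late-prediction decay constant

def late_gradient(d: float) -> float:
    """Derivative of exp(d / A2) - 1 with respect to d, for d >= 0."""
    return math.exp(d / A2) / A2

for d in (10, 50, 100):
    print(d, round(late_gradient(d), 1))
# roughly: d = 10 -> 0.3, d = 50 -> 14.8, d = 100 -> 2202.6
```

A 100-cycle error produces a gradient four orders of magnitude larger than a 10-cycle error, which is why clipping errors to a range like ±50 before exponentiating is standard practice.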
Hybrid Asymmetric Loss
Combine an MSE base with an asymmetric adjustment:

L_hybrid = (1/N) Σ_{i=1}^{N} d_i² + λ · (1/N) Σ_{i=1}^{N} max(d_i, 0)²

This adds extra penalty (scaled by λ > 0) only for late predictions while keeping the base MSE for all samples.
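To make the hybrid behavior concrete, here is a small framework-free sketch (λ = 0.5 is an illustrative choice, not a recommended default):

```python
def hybrid_loss(preds, targets, lam=0.5):
    """MSE over all samples plus lam times the mean squared late error."""
    n = len(preds)
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / n
    late = sum(max(p - t, 0.0) ** 2 for p, t in zip(preds, targets)) / n
    return mse + lam * late

# Errors are -10 (early) and +10 (late):
# MSE = 100, late term = (0 + 100) / 2 = 50, total = 100 + 0.5 * 50.
print(hybrid_loss([40, 60], [50, 50]))  # 125.0
```

Because max(d, 0)² is zero for early errors and grows only quadratically for late ones, this form keeps gradients bounded like MSE while still tilting the optimum away from late predictions.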
Implementation
Complete PyTorch implementation of asymmetric RUL losses.
Asymmetric MSE
import torch
import torch.nn as nn


class AsymmetricMSELoss(nn.Module):
    """
    Asymmetric Mean Squared Error loss.

    Penalizes late predictions (over-estimation of RUL) more severely
    than early predictions (under-estimation).

    Args:
        alpha_early: Coefficient for early predictions (d < 0)
        alpha_late: Coefficient for late predictions (d >= 0)
    """

    def __init__(
        self,
        alpha_early: float = 1.0,
        alpha_late: float = 1.3
    ):
        super().__init__()
        self.alpha_early = alpha_early
        self.alpha_late = alpha_late

    def forward(
        self,
        pred: torch.Tensor,
        target: torch.Tensor
    ) -> torch.Tensor:
        """
        Compute asymmetric MSE loss.

        Args:
            pred: Predicted RUL, shape (batch,)
            target: True RUL, shape (batch,)

        Returns:
            Asymmetric MSE loss (scalar)
        """
        pred = pred.view(-1)
        target = target.view(-1)

        # Compute errors: d = pred - target
        errors = pred - target
        squared_errors = errors ** 2

        # Asymmetric coefficients
        # Late: d >= 0 (predicted >= actual, over-estimation)
        # Early: d < 0 (predicted < actual, under-estimation)
        coefficients = torch.where(
            errors >= 0,
            torch.tensor(self.alpha_late, device=errors.device),
            torch.tensor(self.alpha_early, device=errors.device)
        )

        # Weighted loss
        weighted_errors = coefficients * squared_errors
        loss = weighted_errors.mean()

        return loss

NASA-Style Exponential Loss
class NASAScoreLoss(nn.Module):
    """
    Differentiable approximation of NASA scoring function.

    Uses exponential penalties with different decay constants
    for early vs. late predictions.

    Args:
        a1: Decay constant for early predictions (default 13)
        a2: Decay constant for late predictions (default 10)
        clip_error: Maximum error magnitude to prevent explosion
    """

    def __init__(
        self,
        a1: float = 13.0,
        a2: float = 10.0,
        clip_error: float = 50.0
    ):
        super().__init__()
        self.a1 = a1
        self.a2 = a2
        self.clip_error = clip_error

    def forward(
        self,
        pred: torch.Tensor,
        target: torch.Tensor
    ) -> torch.Tensor:
        """
        Compute NASA-style exponential loss.

        Args:
            pred: Predicted RUL, shape (batch,)
            target: True RUL, shape (batch,)

        Returns:
            NASA score loss (scalar)
        """
        pred = pred.view(-1)
        target = target.view(-1)

        # Compute errors with clipping to keep gradients bounded
        errors = pred - target
        errors = torch.clamp(errors, -self.clip_error, self.clip_error)

        # Compute per-sample scores, branch by error sign
        early_mask = errors < 0
        late_mask = ~early_mask

        scores = torch.zeros_like(errors)
        scores[early_mask] = torch.exp(-errors[early_mask] / self.a1) - 1
        scores[late_mask] = torch.exp(errors[late_mask] / self.a2) - 1

        # Mean score
        loss = scores.mean()

        return loss

Combined Weighted Asymmetric Loss
class WeightedAsymmetricMSE(nn.Module):
    """
    Combines sample weighting (linear decay) with asymmetric penalties.

    This is the recommended loss for RUL prediction when both
    sample importance and error direction matter.

    Args:
        r_max: Maximum RUL for weight computation
        w_min: Minimum sample weight
        w_max: Maximum sample weight
        alpha_early: Asymmetry coefficient for early predictions
        alpha_late: Asymmetry coefficient for late predictions
    """

    def __init__(
        self,
        r_max: float = 125.0,
        w_min: float = 1.0,
        w_max: float = 2.0,
        alpha_early: float = 1.0,
        alpha_late: float = 1.3
    ):
        super().__init__()
        self.r_max = r_max
        self.w_min = w_min
        self.w_max = w_max
        self.alpha_early = alpha_early
        self.alpha_late = alpha_late

    def forward(
        self,
        pred: torch.Tensor,
        target: torch.Tensor
    ) -> torch.Tensor:
        pred = pred.view(-1)
        target = target.view(-1)

        # Sample weights (linear decay: low RUL gets higher weight)
        capped_target = torch.clamp(target, max=self.r_max)
        sample_weights = self.w_max - (self.w_max - self.w_min) * capped_target / self.r_max

        # Asymmetric coefficients
        errors = pred - target
        asym_coeffs = torch.where(
            errors >= 0,
            torch.tensor(self.alpha_late, device=errors.device),
            torch.tensor(self.alpha_early, device=errors.device)
        )

        # Combined weighted loss, normalized by total sample weight
        squared_errors = errors ** 2
        weighted_errors = sample_weights * asym_coeffs * squared_errors

        loss = weighted_errors.sum() / sample_weights.sum()

        return loss

Summary
In this section, we explored asymmetric RUL loss:
- Motivation: Late predictions (failure miss) are far more costly than early predictions
- NASA score: Exponential penalties with decay constants a_1 = 13 (early) and a_2 = 10 (late)
- Asymmetric MSE: Simple coefficient-based approach
- Hybrid loss: MSE base + extra late penalty
- Combined: Sample weighting + asymmetric coefficients
| Loss Type | α_early | α_late | Use Case |
|---|---|---|---|
| Symmetric MSE | 1.0 | 1.0 | Baseline |
| Mild asymmetry | 1.0 | 1.3 | General RUL (recommended) |
| Strong asymmetry | 1.0 | 2.0 | Safety-critical systems |
| NASA-style | exp(-d/13) | exp(d/10) | Match evaluation metric |
Looking Ahead: We have addressed RUL-specific losses. The next section introduces focal loss for health classification, a technique for handling the imbalanced distribution of health states in training data.