Section 1.4 introduced three archetype industries to motivate the accuracy-safety tradeoff: the delivery truck (cheap repairs, RMSE matters), the airline 787 (balanced - both metrics matter), and the cruise ship (catastrophic late-failure cost, NASA matters). With §13.1's asymmetric NASA score and §13.2's Pareto frontier formalised, we can now pin each regime to a specific position on the frontier - and to a specific number w∈[0,1] that selects it.
What “revisited” means. §1.4 gave the qualitative metaphor. This section gives you the operational formula: w=cℓ/(cℓ+ce) where cℓ is your cost-per-cycle of LATE prediction and ce is your cost-per-cycle of EARLY prediction. Plug in your numbers; out comes the regime.
Combined Cost on the Frontier
Define a single combined cost over a model i:
$$J_i(w) = (1 - w) \cdot \widetilde{\mathrm{RMSE}}_i + w \cdot \widetilde{\mathrm{NASA}}_i,$$
where the tilde denotes min-max normalisation across the candidate set, so both metrics lie in [0,1] before they are combined. The model that minimises J_i(w) is the “regime winner” for that w.
Why min-max normalise. RMSE on FD002 is ~10; NASA score is ~1,000. Without normalisation, the NASA term dominates J for any non-trivial w. After normalisation both metrics are in [0,1] and w truly controls the trade-off.
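To make the effect of normalisation concrete, here is a minimal sketch that compares the argmin of J on raw and on min-max normalised metrics at a small safety weight. It uses the five (RMSE, NASA) pairs from this section's worked example; the helper names min_max and winner are illustrative assumptions, not part of the original code.

```python
import numpy as np

# (RMSE, NASA) pairs taken from this section's worked example.
names = ["Baseline-0.5/0.5", "AMNL", "GradNorm", "GABA", "GRACE"]
rmse = np.array([8.06, 7.45, 7.96, 7.89, 7.92])
nasa = np.array([252.5, 446.7, 241.9, 235.7, 232.7])


def min_max(v: np.ndarray) -> np.ndarray:
    """Rescale a vector to [0, 1]; assumes max(v) > min(v)."""
    return (v - v.min()) / (v.max() - v.min())


def winner(w: float, normalise: bool) -> str:
    """Name of the model minimising J(w), on raw or normalised metrics."""
    r, n = (min_max(rmse), min_max(nasa)) if normalise else (rmse, nasa)
    J = (1.0 - w) * r + w * n
    return names[int(np.argmin(J))]


raw_pick = winner(0.05, normalise=False)
norm_pick = winner(0.05, normalise=True)
print(f"raw metrics:        w=0.05 -> {raw_pick}")
print(f"normalised metrics: w=0.05 -> {norm_pick}")
```

With these numbers the raw combination already abandons AMNL at w = 0.05, because the NASA values are far larger than the RMSE values; after normalisation the same w still selects the most accurate model.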
Picking w From Operational Cost
Let cℓ be the dollar cost of a single cycle's LATENESS (failure caught one cycle late) and ce the cost of a single cycle's EARLINESS (replacement one cycle early). Then a principled choice of w is
$$w = \frac{c_\ell}{c_\ell + c_e}.$$
Anchored at the symmetric case w=0.5⇔cℓ=ce; pushed toward 1 when late costs grow.
| Industry archetype | c_late | c_early | w | Regime |
|---|---|---|---|---|
| Delivery truck (Class 4) | $10K/cyc | $8K/cyc | 0.56 | near-balanced, slight safety bias |
| Civil aviation (787) | $50K/cyc | $5K/cyc | 0.91 | strong safety bias |
| Cruise ship (mid-ocean) | $2M/cyc | $20K/cyc | 0.99 | extreme safety bias |
| Datacentre disk (RAID-protected) | $200/cyc | $50/cyc | 0.80 | moderate safety bias |
| Hospital MRI | $30K/cyc | $8K/cyc | 0.79 | moderate safety bias |
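The w column can be reproduced directly from the two cost columns. The sketch below is a small check under that assumption; the function name safety_weight and the dictionary layout are hypothetical, chosen only for illustration.

```python
def safety_weight(cost_late: float, cost_early: float) -> float:
    """w = c_late / (c_late + c_early); both costs must be positive."""
    if cost_late <= 0 or cost_early <= 0:
        raise ValueError("costs must be positive")
    return cost_late / (cost_late + cost_early)


# (c_late, c_early) in dollars per cycle, copied from the table above.
archetypes = {
    "Delivery truck (Class 4)": (10_000, 8_000),
    "Civil aviation (787)": (50_000, 5_000),
    "Cruise ship (mid-ocean)": (2_000_000, 20_000),
    "Datacentre disk (RAID-protected)": (200, 50),
    "Hospital MRI": (30_000, 8_000),
}

weights = {name: safety_weight(*costs) for name, costs in archetypes.items()}
for name, w in weights.items():
    print(f"{name:<34s} w = {w:.2f}")
```

Only the ratio of the two costs matters: scaling both by the same factor (for example, switching currency) leaves w unchanged.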
Interactive: Slide w, Watch the Winner
Real numbers from the IEEE/CAA JAS paper's Table II (FD002 + FD004 averages, 5 seeds). Slide the safety knob; the highlighted model is the one minimising J at that w.
Try this. At w = 0 the winner is AMNL (best RMSE = 7.45). Slide upward: for the five models used in the Python worked example below, the winner switches to GABA at w ≈ 0.42 and to GRACE at w ≈ 0.78, and GRACE stays the winner up to w = 1. The frontier is traversed by ONE slider; every model on it is the optimum for some operational cost ratio.
The Three Regimes With Numbers
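| Regime | w range (approx.) | Winner (FD002+FD004) | RMSE | NASA | Why |
|---|---|---|---|---|---|
| delivery-truck (RMSE) | 0.00 - 0.42 | AMNL | 7.45 | 446.7 | best accuracy; late bias acceptable |
| airline-787 (balanced) | 0.42 - 0.78 | GABA | 7.89 | 235.7 | balances both metrics |
| cruise-ship (NASA) | 0.78 - 1.00 | GRACE | 7.92 | 232.7 | best safety; tiny RMSE penalty |

The w ranges are computed from the five-model candidate set of the Python worked example (min-max normalised); they are consistent with its winners at w = 0.10, 0.50 and 0.90.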
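Why the low-NASA models take over before w = 0.5. GRACE's RMSE is only 0.47 cycles worse than AMNL's, but its NASA score is roughly half (232.7 vs 446.7). After min-max normalisation the RMSE gap between the two is ~0.77 while the NASA gap is 1.00 (the full range), so GRACE already beats AMNL once w exceeds about 0.44 (solve 0.77·(1 − w) = w). GABA sits between them and is the overall minimiser from w ≈ 0.42 to w ≈ 0.78, above which GRACE wins. AMNL therefore loses its lead well before w = 0.9: the asymmetric NASA score does most of the work.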
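The band edges can be checked numerically. The sketch below sweeps w over a grid of 1001 points for the five models of this section's worked example and records where the argmin of J changes; the grid resolution and the helper names are assumptions made for illustration.

```python
import numpy as np

# (RMSE, NASA) pairs taken from this section's worked example.
names = ["Baseline-0.5/0.5", "AMNL", "GradNorm", "GABA", "GRACE"]
rmse = np.array([8.06, 7.45, 7.96, 7.89, 7.92])
nasa = np.array([252.5, 446.7, 241.9, 235.7, 232.7])


def min_max(v: np.ndarray) -> np.ndarray:
    """Rescale a vector to [0, 1]; assumes max(v) > min(v)."""
    return (v - v.min()) / (v.max() - v.min())


rmse_n, nasa_n = min_max(rmse), min_max(nasa)

ws = np.linspace(0.0, 1.0, 1001)  # grid step 0.001
# J[k, i] is the combined cost of model i at weight ws[k].
J = (1.0 - ws)[:, None] * rmse_n + ws[:, None] * nasa_n
winners = J.argmin(axis=1)

# Collapse the per-grid-point winners into contiguous bands.
bands = []
start = 0
for k in range(1, len(ws) + 1):
    if k == len(ws) or winners[k] != winners[start]:
        bands.append((names[winners[start]], float(ws[start]), float(ws[k - 1])))
        start = k

for name, lo, hi in bands:
    print(f"{name:<6s} wins for w in [{lo:.3f}, {hi:.3f}]")
```

On this candidate set the sweep reports AMNL up to w ≈ 0.42, GABA from there to w ≈ 0.78, and GRACE above that. A different candidate set will move the boundaries, because the min-max bounds change with it.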
Python: Regime-Aware Model Selector
A 30-line picker. Min-max normalise the (RMSE, NASA) table, compute the convex combination J, return the argmin.
select_for_regime() over 5 trained models
File: regime_selector_numpy.py
import numpy as np
NumPy provides the (n,)-vector arithmetic for min-max normalisation and np.argmin for the winner-pick. We could do this with plain Python lists but the vectorised version reads cleaner.
EXECUTION STATE
📚 numpy = Library: ndarray, broadcasting, linear algebra, math.
as np = Universal alias.
from typing import NamedTuple
NamedTuple gives us a tiny immutable dataclass with field-name access. m.name reads better than m[0], and the tuple is hashable so models can live in sets / dict keys.
EXECUTION STATE
📚 typing.NamedTuple = Stdlib helper. Subclass it to declare a typed immutable record. Equivalent to collections.namedtuple but with type annotations.
class Model(NamedTuple):
Three-field record: name (str), rmse (float), nasa (float). Each instance is a tuple; fields are positional AND keyword-accessible.
EXECUTION STATE
field: name = Display string.
field: rmse = Root mean squared error in cycles. Lower is better.
field: nasa = Total NASA score across the test set. Lower is better.
def normalise(models: list[Model]) -> tuple[np.ndarray, np.ndarray]:
Min-max normalise both metrics to [0, 1] across the model list. Without this step, the combined cost J would be dominated by NASA (which is in the hundreds) regardless of how we set w.
EXECUTION STATE
⬇ input: models = list[Model] - the candidate set we're ranking.
⬆ returns = Tuple of two (n,) NumPy arrays - normalised RMSE and NASA per model. Both in [0, 1] with the best model at 0 and the worst at 1.
rmse = np.array([m.rmse for m in models], dtype=np.float32)
List comprehension extracts the rmse field from each Model, then np.array wraps as float32. The dtype is overkill here (we have only ~10 numbers) but matches the rest of the pipeline.
EXECUTION STATE
📚 np.array(seq, dtype) = Construct an ndarray from a Python sequence. dtype is optional; defaults to inferred.
→ list comprehension = [m.rmse for m in models] yields the RMSE values one by one. Equivalent to a for-loop with append, but one-liner.
⬇ arg: dtype = np.float32 = Match downstream pipeline. Mixing dtypes is a silent bug source.
def select_for_regime(models: list[Model], w: float) -> Model:
Pick the model that minimises J = (1-w)·RMSE_n + w·NASA_n. w is the SAFETY WEIGHT - 0 cares only about accuracy, 1 cares only about safety / NASA cost.
EXECUTION STATE
⬇ input: models = list[Model] - candidate set.
⬇ input: w = Float in [0, 1]. Operational meaning: fraction of total cost driven by safety. w=0 → reward accuracy only; w=1 → reward conservatism only; w=0.5 → balanced.
⬆ returns = The single Model with smallest combined cost J under this w.
if not 0.0 <= w <= 1.0:
Defensive bounds check. Python supports chained comparisons - a <= b <= c is shorthand for a <= b AND b <= c.
EXECUTION STATE
→ chained comparison = 0.0 <= w <= 1.0 is equivalent to (0.0 <= w) and (w <= 1.0), except that the middle operand w is evaluated only once.
raise ValueError(f"w must be in [0, 1], got {w}")
Fail loudly on bad input. f-string interpolates the offending value into the message so the caller can see what was passed.
EXECUTION STATE
📚 raise = Python statement that throws an exception. Stops the function and propagates up the call stack.
⬇ exception type: ValueError = Built-in exception for invalid argument values. Convention: type-mismatched ⇒ TypeError; valid type but invalid value ⇒ ValueError.
rmse_n, nasa_n = normalise(models)
Tuple unpacking. Calls normalise() and binds the two returned arrays in one go.
EXECUTION STATE
→ tuple unpacking = Python convention. The right-hand side is a 2-tuple; the left-hand side has 2 names ⇒ each name binds to the matching element.
J = (1.0 - w) * rmse_n + w * nasa_n
The combined cost. Convex combination so total weight (1-w) + w = 1 stays constant - changing w only re-tilts the cost surface, never inflates it.
EXECUTION STATE
operator: * = Scalar × array broadcast.
operator: + = Element-wise array add.
→ at w = 0.50 = J = 0.5 · rmse_n + 0.5 · nasa_n = balanced.
for regime, w in [("delivery-truck (RMSE)", 0.10), ("airline-787 (balanced)", 0.50), ("cruise-ship (NASA)", 0.90)]:
Loop over three regime archetypes from §1.4. The delivery-truck regime cares about accuracy (RMSE), the cruise-ship regime cares about safety (NASA), and the airline-787 regime is balanced.
EXECUTION STATE
iter var: regime = Display string for the regime.
iter var: w = Safety weight in [0, 1].
LOOP TRACE · 3 iterations
delivery-truck (w=0.10)
logic = RMSE dominates J. Picks model with smallest rmse_n.
winner = AMNL (best RMSE 7.45)
rationale = Truck repairs are cheap; the cost of a wrong-by-20-cycles prediction is just an early oil change. Optimise pure accuracy.
airline-787 (w=0.50)
logic = Balanced - both metrics matter.
winner = GABA (best balanced J=0.368)
rationale = Aircraft engines have high but not catastrophic failure costs. Need both accuracy AND a small safety margin.
cruise-ship (w=0.90)
logic = NASA dominates J. Picks model with smallest nasa_n.
winner = GRACE (best NASA 232.7)
rationale = Ocean liner engine failure mid-voyage = thousands of stranded passengers, possible safety incident. Pay for conservatism.
pick = select_for_regime(models, w)
Run the selector for this regime.
EXECUTION STATE
⬇ args used = models (5-element list) and w (loop variable).
→ reading = Three regimes, three different winners. AMNL wins on pure accuracy; GRACE on pure safety; GABA on balance. The mapping repeats in §13.4 with full method derivations.
```python
import numpy as np
from typing import NamedTuple


class Model(NamedTuple):
    """One trained model and its measured (RMSE, NASA score) on a held-out set."""
    name: str
    rmse: float
    nasa: float


def normalise(models: list[Model]) -> tuple[np.ndarray, np.ndarray]:
    """Min-max normalise RMSE and NASA across the model list to [0, 1]."""
    rmse = np.array([m.rmse for m in models], dtype=np.float32)
    nasa = np.array([m.nasa for m in models], dtype=np.float32)
    # The 1e-12 term avoids division by zero when all models share a value.
    rmse_n = (rmse - rmse.min()) / (rmse.max() - rmse.min() + 1e-12)
    nasa_n = (nasa - nasa.min()) / (nasa.max() - nasa.min() + 1e-12)
    return rmse_n, nasa_n


def select_for_regime(models: list[Model], w: float) -> Model:
    """Return the model that minimises J = (1-w)*RMSE_n + w*NASA_n."""
    if not 0.0 <= w <= 1.0:
        raise ValueError(f"w must be in [0, 1], got {w}")
    rmse_n, nasa_n = normalise(models)
    J = (1.0 - w) * rmse_n + w * nasa_n
    best = int(np.argmin(J))
    return models[best]


# ---------- Worked example: 5 trained models, 3 regimes ----------
models = [
    Model("Baseline-0.5/0.5", rmse=8.06, nasa=252.5),
    Model("AMNL", rmse=7.45, nasa=446.7),
    Model("GradNorm", rmse=7.96, nasa=241.9),
    Model("GABA", rmse=7.89, nasa=235.7),
    Model("GRACE", rmse=7.92, nasa=232.7),
]

for regime, w in [("delivery-truck (RMSE)", 0.10),
                  ("airline-787 (balanced)", 0.50),
                  ("cruise-ship (NASA)", 0.90)]:
    pick = select_for_regime(models, w)
    print(f"{regime:<28s} w={w:.2f} → {pick.name:<18s} "
          f"(RMSE={pick.rmse:.2f}, NASA={pick.nasa:.1f})")
```
PyTorch: Pick a Checkpoint by Regime
Production version. Walks every checkpoint over a held-out DataLoader, computes both metrics inside a torch.no_grad() block, then returns the (name,model) tuple of the J-minimum winner.
regime_safety_weight() + select_checkpoint() with stub smoke test
def regime_safety_weight(cost_late: float, cost_early: float) -> float:
Translate operational $ cost asymmetry into the safety weight w. Anchors w=0.5 at cost_late == cost_early; pushes w → 1 when late costs grow.
EXECUTION STATE
⬇ input: cost_late = Dollars (or any positive unit) per cycle of LATE prediction. For a 787 turbofan: ~$50,000/cycle (delayed inspection cost + fuel + cascade).
⬇ input: cost_early = Dollars per cycle of EARLY prediction. For the same engine: ~$5,000/cycle (premature replacement). Ratio: 10×.
⬆ returns = Float in (0, 1). Strictly positive because both costs must be positive. w = 50000 / (50000 + 5000) = 0.909 for the 787 example.
if cost_late <= 0 or cost_early <= 0:
Defensive bounds check. Negative or zero costs are nonsensical.
EXECUTION STATE
📚 or = Logical OR. Short-circuits - if cost_late ≤ 0, cost_early is not even evaluated.
raise ValueError("costs must be positive")
Fail loudly on bad input.
return cost_late / (cost_late + cost_early)
Convex-combination weight. With c_l=50K, c_e=5K: 50000/55000 ≈ 0.909 ⇒ w ≈ 0.91, a heavy safety bias.
EXECUTION STATE
operator: / = Python true division (float).
operator: + = Float add.
→ 787 example = cost_late=50000, cost_early=5000 ⇒ w = 0.909.
→ truck example = cost_late=10000, cost_early=8000 ⇒ w = 0.556 (slightly safety-leaning).
→ cruise example = cost_late=2000000, cost_early=20000 ⇒ w = 0.990.
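def select_checkpoint(checkpoints: dict[str, nn.Module], eval_set, w: float) -> tuple[str, nn.Module]:
Evaluate every checkpoint on the same held-out set, then return the one that minimises J = (1-w)·RMSE_n + w·NASA_n.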
EXECUTION STATE
⬇ input: checkpoints = dict[name → nn.Module]. Each value is a trained DualTaskModel checkpoint.
⬇ input: eval_set = Iterable of (x, y_rul, y_hs) batches. Typically a DataLoader.
⬇ input: w = Safety weight, e.g. from regime_safety_weight().
⬆ returns = (name, model) tuple of the winner.
metrics = {}
Empty dict to accumulate {checkpoint_name → (rmse, nasa)} pairs.
for name, model in checkpoints.items():
Iterate the checkpoint dict. dict.items() yields (key, value) pairs.
EXECUTION STATE
📚 dict.items() = Returns a view of (key, value) pairs. Stable iteration order in Python 3.7+.
iter vars = name (str), model (nn.Module).
LOOP TRACE · 3 iterations
name='amnl_ck'
expected = Late-leaning → low RMSE, high NASA. Wins for w near 0.
name='gaba_ck'
expected = Mildly late → balanced. Wins for w ≈ 0.5.
name='grace_ck'
expected = Early-leaning → low NASA, slightly higher RMSE. Wins for w near 1.
model.eval()
Switch to evaluation mode. Disables dropout, makes BatchNorm use running stats. Always call before measuring metrics.
EXECUTION STATE
📚 .eval() = Sets self.training = False on the module and all sub-modules. Affects nn.Dropout, nn.BatchNorm*. Complement of .train().
with torch.no_grad():
Context manager that disables autograd inside the block. We do not need gradients for evaluation - skipping the autograd graph saves memory and time.
EXECUTION STATE
📚 torch.no_grad() = Context manager / decorator. Inside the block, requires_grad is forced to False on op results. Preferred over @torch.no_grad() decorator when you only want to disable for a small section.
preds, targets = [], []
Two empty lists to accumulate per-batch predictions and targets.
for x, y_rul, _ in eval_set:
Iterate batches. The third element of each batch tuple (health-state label) is unused here so we discard it with the underscore convention.
EXECUTION STATE
iter vars = x (B, T, F), y_rul (B,), _ (B,) discarded.
→ underscore convention = Python idiom for "I do not need this value". Just a regular variable name; nothing magical about it.
p, _ = model(x)
Forward pass. DualTaskModel returns (rul, logits); we keep rul, drop logits (unused for RUL metrics).
preds.append(p)
.append on a Python list. O(1) amortised.
targets.append(y_rul)
Same.
pred = torch.cat(preds)
Concatenate per-batch tensors into one (N,) tensor. dim=0 (the default) joins along the batch axis.
EXECUTION STATE
📚 torch.cat(seq, dim=0) = Concatenate a sequence of tensors along an existing dim. Without `dim`, joins along axis 0 (batch axis).
⬇ arg: seq = preds = List of (B,) tensors.
⬇ arg: dim = 0 (default) = Join along axis 0. Result shape: (N,) where N = sum of per-batch B.
⬆ result: pred = (N,) - all predictions across the eval set.
tgt = torch.cat(targets)
Same trick for targets.
EXECUTION STATE
⬆ result: tgt = (N,) - all ground-truth RUL values.
rmse = torch.sqrt(F.mse_loss(pred, tgt)).item()
RMSE = sqrt(MSE).
EXECUTION STATE
📚 F.mse_loss(input, target, reduction='mean') = Standard mean squared error. Default reduction is 'mean' - average over all elements.
📚 torch.sqrt(t) = Element-wise √x. On a 0-D tensor returns a 0-D tensor.
📚 .item() = 0-D tensor → Python float.
⬆ result: rmse = Python float in cycle units.
d = (pred - tgt).clamp(-50, 50)
Signed error, clipped to [-50, 50] to prevent overflow in exp() below.
EXECUTION STATE
📚 .clamp(min, max) = Element-wise clip. Differentiable: gradient is 1 inside the range, 0 outside.
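⬇ arg 1: min = -50 = Errors below -50 saturate at -50, capping the early-side penalty at exp(50/13) - 1 ≈ 45.8.
⬇ arg 2: max = +50 = Errors above +50 saturate at +50, capping the late-side penalty at exp(50/10) - 1 = exp(5) - 1 ≈ 147.4. Without this, a prediction of 125 against a true RUL of 0 would contribute exp(12.5) ≈ 2.7 × 10^5 on its own and swamp the sum.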
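→ reading = With real checkpoints, the three regimes are expected to pick different winners; the PyTorch path mirrors the NumPy logic on tensors. The stub smoke test only checks that the code path runs: its inputs and targets are independent random draws, so the names it prints are not meaningful.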
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def regime_safety_weight(cost_late: float, cost_early: float) -> float:
    """Convert operational $ cost-of-late vs cost-of-early into a safety weight w in (0, 1).

    w = cost_late / (cost_late + cost_early)

    Anchored at the symmetric case w=0.5 ⇔ cost_late == cost_early.
    """
    if cost_late <= 0 or cost_early <= 0:
        raise ValueError("costs must be positive")
    return cost_late / (cost_late + cost_early)


def select_checkpoint(checkpoints: dict[str, nn.Module],
                      eval_set,
                      w: float) -> tuple[str, nn.Module]:
    """Evaluate every checkpoint, pick the one that minimises the combined cost J."""
    metrics = {}
    for name, model in checkpoints.items():
        model.eval()
        with torch.no_grad():
            preds, targets = [], []
            for x, y_rul, _ in eval_set:
                p, _ = model(x)
                preds.append(p)
                targets.append(y_rul)
            pred = torch.cat(preds)
            tgt = torch.cat(targets)
            rmse = torch.sqrt(F.mse_loss(pred, tgt)).item()
            # Signed error: d >= 0 is a late prediction, d < 0 an early one.
            d = (pred - tgt).clamp(-50, 50)
            nasa = torch.where(
                d >= 0,
                torch.exp(d / 10) - 1,
                torch.exp(-d / 13) - 1,
            ).sum().item()
        metrics[name] = (rmse, nasa)

    # Min-max normalise both metrics, mix by w
    rmses = torch.tensor([v[0] for v in metrics.values()])
    nasas = torch.tensor([v[1] for v in metrics.values()])
    rmse_n = (rmses - rmses.min()) / (rmses.max() - rmses.min() + 1e-12)
    nasa_n = (nasas - nasas.min()) / (nasas.max() - nasas.min() + 1e-12)
    J = (1 - w) * rmse_n + w * nasa_n
    best = int(J.argmin())
    name = list(checkpoints.keys())[best]
    return name, checkpoints[name]


# ---------- Smoke test ----------
class StubModel(nn.Module):
    def __init__(self, bias: float):
        super().__init__()
        self.bias = bias

    def forward(self, x):
        # Returns (rul, logits) like DualTaskModel; logits are dummy zeros.
        return x.float() + self.bias, torch.zeros(x.shape[0], 3)


checkpoints = {
    "amnl_ck": StubModel(bias=+3.0),   # stands in for a late-leaning checkpoint
    "gaba_ck": StubModel(bias=+0.5),   # stands in for a mildly late checkpoint
    "grace_ck": StubModel(bias=-2.0),  # stands in for an early-leaning checkpoint
}
# Inputs and targets are independent random draws, so this only checks that
# the code path runs; the winners it prints are not meaningful.
eval_set = [(torch.randint(0, 126, (32,)).float(),
             torch.randint(0, 126, (32,)).float(),
             torch.randint(0, 3, (32,)))]

for regime, w in [("truck", 0.10), ("airline", 0.50), ("cruise", 0.90)]:
    name, _ = select_checkpoint(checkpoints, eval_set, w)
    print(f"{regime:>8s} w={w:.2f} → {name}")
```
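The asymmetry inside select_checkpoint is easier to see in isolation. The following NumPy sketch restates the same penalty (late errors divided by 10, early errors by 13, clipped to ±50 as in the code above) and evaluates it on one late and one early prediction of equal size. The function name nasa_score and the example values are assumptions for illustration.

```python
import numpy as np


def nasa_score(pred, target, clip: float = 50.0) -> float:
    """Sum of asymmetric exponential penalties over all predictions.

    d = pred - target. A late prediction (d >= 0) costs exp(d/10) - 1,
    an early one (d < 0) costs exp(-d/13) - 1. The clip mirrors the
    clamp(-50, 50) used in select_checkpoint.
    """
    d = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    d = np.clip(d, -clip, clip)
    penalty = np.where(d >= 0, np.exp(d / 10.0) - 1.0, np.exp(-d / 13.0) - 1.0)
    return float(penalty.sum())


late = nasa_score([120.0], [100.0])   # predicted RUL is 20 cycles too long
early = nasa_score([80.0], [100.0])   # predicted RUL is 20 cycles too short
print(f"late by 20 cycles:  {late:.2f}")
print(f"early by 20 cycles: {early:.2f}")
```

For the same 20-cycle error the late prediction scores about 6.39 and the early one about 3.66, which is why an early-leaning checkpoint can have a much lower NASA score at nearly the same RMSE.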
Same Knob, Other Industries
| Industry | Late metric | Early metric | Typical w |
|---|---|---|---|
| Aircraft turbofan | NASA score (cycles) | Premature replacement (USD) | 0.85 - 0.95 |
| Wind-turbine gearbox | Crane + outage hours | Premature gearbox swap | 0.80 - 0.90 |
| EV battery thermal runaway risk | Risk × incident cost | Premature derate | 0.92 - 0.98 |
| Datacentre HDD (RAID-protected) | Rebuild time + double-failure risk | Premature retirement | 0.65 - 0.80 |
| Hospital MRI cryostat | Down-time + helium boil-off | Premature service | 0.70 - 0.85 |
| City traffic-light controller | Intersection outage cost | Truck-roll labour | 0.55 - 0.70 |
Three Regime-Selection Pitfalls
Pitfall 1: Forgetting to normalise. If you compute J on RAW RMSE (~10) and RAW NASA (~1000), then at w = 0.01 the NASA term (0.01 × 1000 = 10) is already as large as the RMSE term (0.99 × 10 ≈ 9.9). The whole point of min-max normalisation is to make w mean what you want it to mean.
Pitfall 2: Picking w by hand. Eyeballing “feels like 0.7” is how reviewers catch you. Use the cost-ratio formula instead. If you do not know your operational cost ratio, ASK the operator - they have the numbers.
Pitfall 3: Normalising on training set, picking on test set. Min-max stats from one set do not transfer to another. Always normalise WITHIN the candidate set you are picking over. If you re-evaluate on a new deployment site, recompute the min-max bounds.
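Pitfall 3 follows from the fact that min-max bounds are a property of the candidate set. As a rough illustration, the sketch below picks a winner at the same w from the five models of the worked example and again after dropping AMNL; the helper pick and the choice w = 0.65 are assumptions made for this example.

```python
import numpy as np

# (RMSE, NASA) pairs taken from this section's worked example.
table = {
    "Baseline-0.5/0.5": (8.06, 252.5),
    "AMNL": (7.45, 446.7),
    "GradNorm": (7.96, 241.9),
    "GABA": (7.89, 235.7),
    "GRACE": (7.92, 232.7),
}


def pick(candidates: dict, w: float) -> str:
    """Min-max normalise within `candidates`, then return the argmin of J(w)."""
    names = list(candidates)
    rmse = np.array([candidates[n][0] for n in names])
    nasa = np.array([candidates[n][1] for n in names])
    rmse_n = (rmse - rmse.min()) / (rmse.max() - rmse.min())
    nasa_n = (nasa - nasa.min()) / (nasa.max() - nasa.min())
    J = (1.0 - w) * rmse_n + w * nasa_n
    return names[int(np.argmin(J))]


w = 0.65
all_five = pick(table, w)
without_amnl = pick({k: v for k, v in table.items() if k != "AMNL"}, w)
print(f"five candidates: {all_five}")
print(f"AMNL removed:    {without_amnl}")
```

At w = 0.65 the five-model set selects GABA, while the four-model set selects GRACE: removing AMNL shrinks both ranges, so the same raw gaps map to different normalised gaps. Bounds carried over from another set would distort J in the same way.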
The point. Three industries, one slider, one formula. §13.4 closes the chapter by mapping each regime to its winning method (AMNL / GABA / GRACE) with the method-specific motivations.
Takeaway
Combined cost. J(w) = (1 − w)·RMSE + w·NASA on min-max normalised metrics.
w from cost ratio. w = cℓ/(cℓ + ce). Numbers come from the operator, not the modeller.
Three regimes, three winners. AMNL for truck, GABA for airline, GRACE for cruise ship.
The low-NASA models take over early. On the five-model candidate set, GABA overtakes AMNL at w ≈ 0.42 and GRACE overtakes GABA at w ≈ 0.78, because AMNL's NASA gap spans the full range after normalisation.