The Broadcast-Microphone Analogy
A radio engineer normalises the audio level so loud and quiet speakers sound “equally loud” to the listener. If two speakers share the same broadcast - one whispering close to the mic, one shouting from across the room - applying a single gain knob to the whole show does not equalise them; it just rescales the same lopsided mix. To actually equalise, you need to apply a different gain per speaker. That is the heart of per-condition normalisation: one gain knob per regime, not one for the whole signal.
Section 2.4 demonstrated this empirically with a multi-condition sensor histogram. This section formalises the math, walks through the Python and PyTorch implementations, and explains why the bug survives code review in so many production pipelines.
Why Global Z-Score Cannot Erase Regime Variance
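Recall the law of total variance from §2.4: the total variance of a pooled sample decomposes into between-cluster and within-cluster components,

$$\operatorname{Var}(X) = \underbrace{\operatorname{Var}\big(\mathbb{E}[X \mid C]\big)}_{\text{between}} + \underbrace{\mathbb{E}\big[\operatorname{Var}(X \mid C)\big]}_{\text{within}},$$

where $X$ is the sensor value and $C$ is the cluster (operating condition) it was recorded under.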
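Global Z-score is a linear (strictly, affine) transformation: $z = (x - \mu)/\sigma$, where $\mu$ and $\sigma$ are the mean and standard deviation of the pooled sample. Subtracting $\mu$ shifts every cluster by the same amount, and dividing by $\sigma$ divides both the between-cluster and the within-cluster variance by $\sigma^2$. The transformation therefore rescales variance, but it does NOT change the partition between the two components. After Z-score the total variance is 1.0 and the between/within ratio is unchanged from the raw data. If the raw data was 99% regime, the Z-scored data is also 99% regime - just on a different scale.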
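The invariance can be checked term by term. The notation below is assumed for this sketch rather than taken from §2.4: $\pi_c$ is the fraction of samples in condition $c$, $\mu_c$ and $\sigma_c^2$ are that condition's mean and variance, and $\mu$, $\sigma^2$ are the pooled mean and variance.

```latex
\begin{aligned}
B &= \sum_c \pi_c\,(\mu_c - \mu)^2, \qquad
W = \sum_c \pi_c\,\sigma_c^2, \qquad
\sigma^2 = B + W \\[4pt]
\text{After } z = \tfrac{x - \mu}{\sigma}:\quad
\mu_c' &= \frac{\mu_c - \mu}{\sigma}, \qquad
{\sigma_c'}^2 = \frac{\sigma_c^2}{\sigma^2}, \qquad
\mu' = 0 \\[4pt]
B' &= \sum_c \pi_c\,{\mu_c'}^2 = \frac{B}{\sigma^2}, \qquad
W' = \sum_c \pi_c\,{\sigma_c'}^2 = \frac{W}{\sigma^2} \\[4pt]
B' + W' &= 1, \qquad
\frac{B'}{B' + W'} = \frac{B}{B + W}
\end{aligned}
```

Both components shrink by the same factor $\sigma^2$, so the total becomes 1 while the between/total ratio stays exactly where it was.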
Interactive: Mixing Hides the Signal
The visualization from §2.4, reproduced here so the math has somewhere to land. Toggle between “raw”, “global Z”, and “per-condition Z” and watch the variance partition update.
Python: Quantify the Damage on Real Data
Twenty lines of NumPy. Compute the variance partition before and after global Z-score. The between/total ratio stays at 98.9% no matter what scale you apply.
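The original listing is not reproduced here. What follows is a minimal sketch of the same computation on synthetic data, so the ratio it prints will not match the 98.9% quoted above. The helper name `variance_partition`, the six condition baselines, and the noise level are all assumptions made for illustration.

```python
import numpy as np

def variance_partition(x, cond):
    """Return (between, within, total) population variance of x grouped by cond."""
    x = np.asarray(x, dtype=float)
    cond = np.asarray(cond)
    grand_mean = x.mean()
    between, within = 0.0, 0.0
    for c in np.unique(cond):
        xc = x[cond == c]
        weight = xc.size / x.size            # fraction of samples in this condition
        between += weight * (xc.mean() - grand_mean) ** 2
        within += weight * xc.var()          # ddof=0, so between + within == total
    return between, within, x.var()

# Synthetic stand-in for a multi-condition sensor (assumed values, not real data):
# widely separated per-condition baselines plus small within-condition noise.
rng = np.random.default_rng(0)
baselines = np.array([100.0, 250.0, 400.0, 550.0, 700.0, 850.0])
cond = np.repeat(np.arange(baselines.size), 500)
x = baselines[cond] + rng.normal(0.0, 5.0, size=cond.size)

# Global Z-score: one mean and one std for the whole pooled sample.
z = (x - x.mean()) / x.std()

b_raw, w_raw, t_raw = variance_partition(x, cond)
b_z, w_z, t_z = variance_partition(z, cond)
ratio_raw = b_raw / t_raw
ratio_z = b_z / t_z

print(f"raw:      total={t_raw:.2f}  between/total={ratio_raw:.4f}")
print(f"global Z: total={t_z:.2f}  between/total={ratio_z:.4f}")
```

The two printed ratios agree to floating-point precision, while the total variance drops to 1. Swapping in a real sensor column and its condition labels for `x` and `cond` gives the same check on real data.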
PyTorch: A Global vs Per-Condition Comparison
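The comparison can be sketched in a few tensor operations. This is an illustrative sketch on synthetic data, not the nn.Module from Section 6.4; the function names `global_zscore`, `per_condition_zscore`, and `between_total_ratio`, the four baselines, and the `eps` guard are all assumptions.

```python
import torch

def global_zscore(x):
    """One mean and std per feature, computed over the whole pooled sample."""
    return (x - x.mean(dim=0)) / x.std(dim=0, unbiased=False)

def per_condition_zscore(x, cond, eps=1e-8):
    """One mean and std per feature *per condition*; eps guards against zero std."""
    out = torch.empty_like(x)
    for c in torch.unique(cond):
        mask = cond == c
        xc = x[mask]
        out[mask] = (xc - xc.mean(dim=0)) / (xc.std(dim=0, unbiased=False) + eps)
    return out

def between_total_ratio(x, cond):
    """Per-feature share of total variance explained by the condition means."""
    grand = x.mean(dim=0)
    between = torch.zeros_like(grand)
    for c in torch.unique(cond):
        mask = cond == c
        weight = mask.float().mean()         # fraction of rows in this condition
        between = between + weight * (x[mask].mean(dim=0) - grand) ** 2
    return between / x.var(dim=0, unbiased=False)

# Synthetic data (assumed): 4 conditions, 3 features, unit noise around shifted baselines.
torch.manual_seed(0)
baselines = torch.tensor([0.0, 50.0, 100.0, 150.0]).unsqueeze(1)
cond = torch.arange(4).repeat_interleave(400)
x = baselines[cond] + torch.randn(cond.numel(), 3)

ratio_raw = between_total_ratio(x, cond)
ratio_global = between_total_ratio(global_zscore(x), cond)
ratio_per = between_total_ratio(per_condition_zscore(x, cond), cond)

print("raw          :", ratio_raw)
print("global Z     :", ratio_global)
print("per-condition:", ratio_per)
```

Global Z-score leaves the ratio where the raw data had it; per-condition Z-score drives it to roughly zero, because every condition now has mean 0 and the condition means no longer differ.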
The Same Failure Mode Elsewhere
| Domain | Regime | What global Z-score loses |
|---|---|---|
| RUL (this book) | Operating condition | Degradation signal under regime variance |
| Speech | Speaker | Phonemes under speaker pitch/timbre |
| Multi-site MRI | Scanner vendor | Pathology under site bias |
| A/B testing | User segment | Treatment effect under cohort means |
| Federated learning | Client | Model updates under client drift |
| Genomics | Sequencing batch | Differential expression under batch effect |
Three Reasons This Bug Survives Code Review
StandardScaler().fit_transform(X) is the wrong tool. You need either a custom transform or sklearn's ColumnTransformer applied to each per-condition slice.

The lesson. Global mean / std is mathematically unable to remove between-cluster variance. The fix is one nn.Module away (Section 6.4) and accounts for ~1 percentage point of NASA score on FD002: small per section, but compounded across all of Parts V-VII.
Takeaway
- Linear scaling cannot erase between-cluster variance. Global Z-score is linear; it preserves the between/total ratio.
- The diagnostic is between/total after normalisation. It should be near 0 if the normaliser is correct; a value near 1 means the regime is still present.
- The bug is silent. Train loss looks fine; only test loss reveals it. Diagnose with the partition ratio, not the training loss curve.
- The fix is per-condition Z-score. Sections 6.2-6.4 implement it.