AI Book - Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section, you will:

Understand the motivation for discretizing RUL into health states
Define the 5 health state categories and their RUL boundaries
Apply the discretization formula to convert RUL to class labels
Analyze class distribution and address imbalance concerns
Connect discretization to multi-task learning as described in Chapter 2

Why This Matters: Our AMNL model performs both RUL regression and health state classification. The classification task provides categorical structure that regularizes the regression, leading to better overall performance. This section defines how we create the classification labels.

Why Discretize RUL?

We already have continuous RUL labels. Why create a parallel discrete representation?

Motivation 1: Practical Decision Making

In real maintenance operations, decisions are categorical, not continuous:

Healthy: Continue normal operation
Degrading: Schedule inspection at next opportunity
Warning: Plan maintenance within days
Critical: Ground aircraft, inspect immediately

A health state classification directly supports these decisions without requiring engineers to interpret continuous RUL values.

Motivation 2: Regularization

Classification provides categorical structure that constrains the learned representation:

The model learns features that distinguish broad degradation stages
Classification loss provides gradients even when regression is noisy
Shared features benefit from both signal types

Motivation 3: Robustness

Exact RUL prediction is inherently uncertain. Health states provide a more robust target:

Scenario	RUL Prediction	Health State
True RUL = 50, Pred = 40	Error = 10 cycles	Same state (correct)
True RUL = 50, Pred = 45	Error = 5 cycles	Same state (correct)
True RUL = 50, Pred = 20	Error = 30 cycles	Wrong state (error)

Small RUL errors within a state are acceptable; large errors that cross state boundaries are not.
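The table can be reproduced with a few lines of code. This is a minimal sketch that assumes the state boundaries (100, 75, 50, 25) defined in the next subsection; `health_state` is a hypothetical helper name, not part of the AMNL codebase. The extra prediction of 52 shows that an error sitting right next to a boundary changes the state even when it is only 2 cycles.

```python
def health_state(rul):
    """Map an RUL value to a health state using the boundaries 100, 75, 50, 25."""
    if rul > 100:
        return 0
    if rul > 75:
        return 1
    if rul > 50:
        return 2
    if rul > 25:
        return 3
    return 4


true_rul = 50
for pred_rul in (40, 45, 20, 52):
    error = abs(true_rul - pred_rul)
    same_state = health_state(true_rul) == health_state(pred_rul)
    print(f"pred={pred_rul:3d}  error={error:2d} cycles  same state: {same_state}")
```

Whether a prediction lands in the correct state therefore depends on where the true RUL sits relative to the nearest boundary, not only on the size of the error.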

Defining Health States

We define 5 health states based on RUL ranges. This follows the approach used in several prior works on C-MAPSS.

The 5 Health States

State	Label	RUL Range	Description	Action
0	Healthy	RUL > 100	Normal operation, no degradation	Continue operation
1	Minor Degradation	75 < RUL ≤ 100	Early signs of wear	Monitor closely
2	Moderate Degradation	50 < RUL ≤ 75	Clear degradation trends	Schedule inspection
3	Significant Degradation	25 < RUL ≤ 50	Advanced degradation	Plan maintenance
4	Critical	RUL ≤ 25	Failure imminent	Immediate action

Boundary Visualization

📝text

RUL
   ^
   |
125|  ─────────────────────  State 0: Healthy
   |
100|  · · · · · · · · · · ·  Boundary
   |                         State 1: Minor Degradation
 75|  · · · · · · · · · · ·  Boundary
   |                         State 2: Moderate Degradation
 50|  · · · · · · · · · · ·  Boundary
   |                         State 3: Significant Degradation
 25|  · · · · · · · · · · ·  Boundary
   |                         State 4: Critical
  0|  ─────────────────────  Failure
   +------------------------> Cycle

Why These Boundaries?

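The boundaries (100, 75, 50, 25) create equal-width RUL intervals of 25 cycles each. State 0 is the exception in terms of time steps: every cycle whose raw RUL exceeds 100 falls into it, including all cycles capped at 125.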

Equal intervals: Balanced class sizes within the degrading range
Meaningful thresholds: 25-cycle intervals align with typical maintenance planning windows
Prior work: These boundaries are established in C-MAPSS literature, enabling comparison

Discretization Formula

Given a (piecewise linear) RUL value, we compute the health state:

$$\text{state} = \begin{cases} 0 & \text{if } \text{RUL} > 100 \\ 1 & \text{if } 75 < \text{RUL} \leq 100 \\ 2 & \text{if } 50 < \text{RUL} \leq 75 \\ 3 & \text{if } 25 < \text{RUL} \leq 50 \\ 4 & \text{if } \text{RUL} \leq 25 \end{cases}$$

For RUL values already capped to the range $[0, 125]$, the same mapping can be computed efficiently using floor division:

$$\text{state} = \min\left(4, \left\lfloor \frac{125 - \text{RUL}}{25} \right\rfloor\right)$$
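As a sanity check, the two formulations can be compared over every integer RUL from 0 to 125. The sketch below uses hypothetical helper names (`state_from_cases`, `state_from_floor`) and also shows what happens when an uncapped RUL is passed to the floor formula.

```python
import math


def state_from_cases(rul):
    """Case-by-case definition of the health state."""
    if rul > 100:
        return 0
    if rul > 75:
        return 1
    if rul > 50:
        return 2
    if rul > 25:
        return 3
    return 4


def state_from_floor(rul):
    """Floor-division formula: min(4, floor((125 - RUL) / 25))."""
    return min(4, math.floor((125 - rul) / 25))


# Compare both definitions on every integer RUL in the capped range.
mismatches = [r for r in range(0, 126) if state_from_cases(r) != state_from_floor(r)]
print("mismatches on [0, 125]:", mismatches)

# An uncapped RUL breaks the floor formula: the result is a negative index.
print("floor formula at RUL = 150:", state_from_floor(150))
```

The two definitions agree on the capped range, while the floor formula produces a negative state for RUL above 125. That is one reason to apply it to the piecewise RUL only.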

Implementation

🐍python

import numpy as np


def rul_to_health_state(rul, num_states=5, rul_max=125):
    """
    Convert RUL to health state label.

    Args:
        rul: Remaining Useful Life (scalar or array)
        num_states: Number of discrete states (default 5)
        rul_max: Maximum (capped) RUL value (default 125)

    Returns:
        Integer health state label(s) in range [0, num_states-1]
    """
    interval = rul_max // num_states  # 25 for 5 states
    # Floor division maps each 25-cycle band of RUL to one state index
    state = np.floor_divide(rul_max - np.asarray(rul), interval)
    # Clip so that RUL > rul_max maps to state 0 and RUL near 0 maps to the
    # last state; unlike the built-in min(), np.clip also works on arrays
    return np.clip(state, 0, num_states - 1).astype(int)

# Example usage:
# rul_to_health_state(85)  -> 1
# rul_to_health_state(30)  -> 3
# rul_to_health_state(5)   -> 4

Class Distribution Analysis

Understanding the class distribution helps anticipate training challenges.

Theoretical Distribution

For an engine with lifetime $T > 125$ cycles:

State	RUL Range	Cycles in State
0	> 100	T - 125 + 25 = T - 100
1	75-100	25
2	50-75	25
3	25-50	25
4	0-25	25

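States 1-4 each span roughly 25 cycles (the exact count can differ by one, depending on whether the final cycle is labeled RUL = 0), but State 0 absorbs all remaining cycles. Because State 0 grows with the engine lifetime $T$ while the other states stay fixed, this creates inherent class imbalance.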
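The per-state counts for a single engine can be tallied directly. This sketch assumes that RUL counts down to 0 at the final recorded cycle (a labeling convention, not something fixed by the text above), and `state_counts` is a hypothetical helper name.

```python
from collections import Counter


def state_counts(lifetime, rul_max=125, interval=25, num_states=5):
    """Count how many cycles of one engine fall into each health state."""
    counts = Counter()
    for cycle in range(1, lifetime + 1):
        raw_rul = lifetime - cycle      # assumed convention: RUL = 0 at the last cycle
        rul = min(raw_rul, rul_max)     # piecewise (capped) RUL
        state = min(num_states - 1, (rul_max - rul) // interval)
        counts[state] += 1
    return [counts[s] for s in range(num_states)]


counts = state_counts(200)
print(counts)
print([round(c / sum(counts), 3) for c in counts])
```

Under this convention, State 4 picks up one extra cycle (RUL = 0 through 25) and State 0 one fewer than the table suggests. The qualitative picture is unchanged: State 0 dominates for long-lived engines.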

FD001 Class Distribution

Addressing Class Imbalance

Several strategies can address the imbalance:

Strategy	Mechanism	Trade-off
Class weights	Weight loss by inverse frequency	May overfit to rare classes
Oversampling	Duplicate minority samples	Increases training time
Undersampling	Remove majority samples	Loses information
Focal loss	Down-weight easy examples	Complex tuning
Accept imbalance	Let model learn natural distribution	May underperform on rare classes
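To make the first row concrete, here is a minimal sketch of inverse-frequency class weights. The label counts are illustrative values for a single 200-cycle engine, not measured FD001 statistics, and `inverse_frequency_weights` is a hypothetical helper. The weighting formula, total / (num_classes × count), is the same heuristic scikit-learn documents for `class_weight="balanced"`.

```python
def inverse_frequency_weights(labels, num_classes=5):
    """Return one weight per class, inversely proportional to its frequency."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    total = len(labels)
    # weight_c = total / (num_classes * count_c); absent classes get weight 0
    return [total / (num_classes * c) if c > 0 else 0.0 for c in counts]


# Illustrative label counts (assumption): State 0 holds about half the samples.
labels = [0] * 99 + [1] * 25 + [2] * 25 + [3] * 25 + [4] * 26
weights = inverse_frequency_weights(labels)
print([round(w, 3) for w in weights])
```

The majority class receives a weight below 1 and the minority classes a weight above 1, so each class contributes a comparable total to a weighted loss.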

In our AMNL model, we rely on the AMNL loss normalization, which automatically balances the contribution of the classification task regardless of class imbalance.

Connection to Multi-Task Learning

The health state labels enable the multi-task learning framework we introduced in Chapter 2.

Two-Head Architecture

Our model produces two outputs from the shared representation:

$$\mathbf{c} \xrightarrow{\text{Shared}} \begin{cases} \text{RUL Head} \to \hat{y}_{\text{RUL}} \in \mathbb{R} \\ \text{Health Head} \to \hat{\mathbf{p}} \in \Delta^5 \end{cases}$$

Here $\Delta^5$ denotes the probability simplex over the 5 health states: the softmax output $\hat{\mathbf{p}}$ has 5 non-negative entries that sum to 1.

Loss Components

Task	Target	Loss	Purpose
RUL Regression	Piecewise RUL	MSE	Precise cycle prediction
Health Classification	Health state (0-4)	Cross-Entropy	Categorical structure
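The two loss terms in the table can be computed independently from the two head outputs. The sketch below uses plain Python with made-up predictions and logits. It deliberately stops short of combining the terms, because the AMNL normalization is not defined in this section.

```python
import math


def mse(preds, targets):
    """Mean squared error for the RUL regression head."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def softmax(logits):
    """Numerically stable softmax over one logit vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits_batch, labels):
    """Mean negative log-likelihood of the true health state."""
    losses = [-math.log(softmax(logits)[y]) for logits, y in zip(logits_batch, labels)]
    return sum(losses) / len(losses)


# Hypothetical batch of two samples.
rul_pred = [52.0, 110.0]
rul_true = [48.0, 120.0]
logits = [[0.1, 0.2, 2.0, 1.5, 0.0], [3.0, 0.5, 0.1, 0.0, 0.0]]
states = [3, 0]  # derived from rul_true: 48 -> State 3, 120 -> State 0

reg_loss = mse(rul_pred, rul_true)
cls_loss = cross_entropy(logits, states)
print(f"regression loss (MSE): {reg_loss:.2f}")
print(f"classification loss (CE): {cls_loss:.4f}")
```

The two terms live on very different scales (squared cycles versus nats), which is why some form of normalization or weighting is needed before they are summed.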

Synergy Between Tasks

The tasks reinforce each other:

Classification → Regression: Categorical boundaries discourage regression errors that cross state boundaries
Regression → Classification: Fine-grained RUL signal helps classification near boundaries
Shared features: Both tasks train the shared backbone, leading to richer representations

The Key Insight: A model that predicts RUL = 52 when the true RUL is 48 makes a small regression error (4 cycles), but it crosses the State 2/3 boundary at RUL = 50, which is a classification error. Multi-task learning penalizes both, encouraging predictions that respect categorical structure while maintaining regression precision.

Label Consistency

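Always derive health states from the piecewise RUL, not raw RUL. This ensures State 0 corresponds to RUL = 100-125, not RUL = 100-∞. It also keeps the floor-division formula valid, since a raw RUL above 125 would yield a negative state index. Consistency between regression and classification targets is essential.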
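One way to enforce this is to build both targets in a single function, so the cap is applied before the state is derived. This is a sketch with a hypothetical helper name (`make_targets`), assuming the cap of 125 and the 25-cycle interval used throughout this section.

```python
RUL_MAX = 125


def make_targets(raw_rul, rul_max=RUL_MAX, interval=25, num_states=5):
    """Return (regression target, classification target) for one raw RUL value."""
    piecewise_rul = min(raw_rul, rul_max)  # cap first
    # Derive the state from the capped value so both targets agree
    state = min(num_states - 1, (rul_max - piecewise_rul) // interval)
    return piecewise_rul, state


print(make_targets(180))  # (125, 0)
print(make_targets(90))   # (90, 1)
print(make_targets(10))   # (10, 4)
```

Because both targets come from the same capped value, the classification label can never disagree with the regression label about which state a sample belongs to.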

Summary

In this section, we defined the health state discretization scheme:

Motivation: Practical decision making, regularization, robustness
5 health states: Healthy (0), Minor (1), Moderate (2), Significant (3), Critical (4)
Boundaries: 100, 75, 50, 25 cycles create equal intervals
Formula: $\text{state} = \min(4, \lfloor (125 - \text{RUL})/25 \rfloor)$
Class imbalance: State 0 dominates (~50%), addressed through AMNL loss
Multi-task connection: Classification provides categorical structure for regression

State	RUL Range	Meaning	Typical Action
0	> 100	Healthy	Normal operation
1	75-100	Minor degradation	Monitor
2	50-75	Moderate degradation	Schedule inspection
3	25-50	Significant degradation	Plan maintenance
4	≤ 25	Critical	Immediate action

Chapter Summary: We have now explored the NASA C-MAPSS dataset in depth: its structure, operating conditions, fault modes, sensor selections, and target formulations. With this understanding, we are ready to build the data preprocessing pipeline in Chapter 4: normalization, windowing, and creating PyTorch datasets that our model can consume.
