Why Windows, Not Whole Histories
We have already discussed (Section 1.2) why a model takes a WINDOW of past cycles instead of the entire history: old cycles rarely matter for current prognosis, the cost of self-attention grows quadratically with sequence length, and the windowed view turns a single run-to-failure trajectory into many supervised samples. This section formalises the construction.
Output Count and Stride
Given an engine with $N$ cycles, window length $W$, and stride $S$, the number of valid windows is
$$n_{\text{windows}} = \left\lfloor \frac{N - W}{S} \right\rfloor + 1.$$
With $N = 200$, $W = 30$, $S = 1$ we get 171 windows; with $S = 5$ we get 35. The per-window RUL target is $t_{\text{fail}} - t_{\text{end}}$, where $t_{\text{end}}$ is the LAST cycle of the window.
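The window-count formula can be cross-checked against a brute-force enumeration. The sketch below is illustrative only: `count_windows` and `last_cycles` are hypothetical helper names, and returning 0 when the trajectory is shorter than the window is an assumption, not something the formula itself states.

```python
def count_windows(n_cycles: int, window: int, stride: int) -> int:
    """Closed form floor((N - W) / S) + 1; assumed to be 0 when N < W."""
    if n_cycles < window:
        return 0
    return (n_cycles - window) // stride + 1


def last_cycles(n_cycles: int, window: int, stride: int) -> list:
    """Last cycle (counted from 1) of every valid window, by enumeration."""
    return list(range(window, n_cycles + 1, stride))


dense = last_cycles(200, 30, 1)
sparse = last_cycles(200, 30, 5)

print(count_windows(200, 30, 1), len(dense))    # 171 171
print(count_windows(200, 30, 5), len(sparse))   # 35 35

# RUL target for the first three sparse windows, with failure at cycle 200.
print([200 - t_end for t_end in sparse[:3]])    # [170, 165, 160]
```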
Interactive: Slide Across One Engine
The viewer from Section 1.2, reproduced - drag the cursor and watch the (X, y) pair update.
Python: Build Pairs From One Engine
NumPy is Python's foundational numerical-computing library. It provides ndarray (N-dimensional array) — a fast, memory-efficient matrix type backed by compiled C code. We use it here for two reasons: (1) np.random.randn to fabricate a fake sensor matrix, and (2) np.stack / np.array to glue per-window slices and per-window RUL targets into single tensors at the end.
Generic windowing helper that turns ONE engine's full run-to-failure trajectory into many supervised (X, y) pairs. This is the canonical reference function for the entire book — every loader in later chapters reuses this exact contract.
Documents the contract: input is one engine's full trajectory plus its failure cycle; output is supervised pairs ready for a regression model. The Returns block names the two output shapes so callers don't need to read the body.
Total cycles in this engine's trajectory. For our toy example, n = 200.
Two empty Python lists that accumulate per-window slices and their RUL targets. We use lists (not pre-allocated arrays) because list.append is amortised O(1) and we don't need to compute n_pairs ahead of time.
Slide the window across the trajectory. `end` is the EXCLUSIVE upper bound of each window — the cycle just past the window's last observation. The window covers cycles [end - window, end).
Slice the past `window` cycles ending at (but not including) `end`. NumPy slicing returns a VIEW into the original array — no data copy — so this is essentially free.
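RUL ('Remaining Useful Life') after the window closes. Convention: `end` is exclusive, so the window's last observation sits at 0-based index end−1, which is cycle number `end` when cycles are counted from 1. failure_cycle − end is therefore the number of cycles left after the window's last cycle.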
Convert the two Python lists into NumPy tensors. np.stack creates a NEW leading axis (so 171 shape-(30,14) arrays become one shape-(171,30,14) tensor); np.array converts a flat list of ints into a 1D ndarray.
Sets NumPy's global random state to a fixed seed so np.random.randn produces the SAME numbers on every run. Critical for reproducible examples and reproducible book figures.
Tuple unpacking — two assignments in one line. Defines the toy engine's dimensions.
Generates fake sensor data for the toy example. Read left-to-right: sample standard normals, cast to float32, scale by 5 (std=5), shift by 100 (mean=100).
Dense windowing: every cycle from the 30th onward closes a training window. Maximises training samples (171) from a single engine.
Sparse windowing: slide 5 cycles at a time. Roughly 5× fewer windows (35 instead of 171). Useful when training data is overwhelming or compute is constrained.
Verify dense-windowing shapes and RUL targets at runtime.
Verify sparse-windowing shapes. Note RUL jumps by 5 — the visible signature of stride>1.
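Two claims in the notes above are easy to check directly: that basic slicing yields a view, and that np.stack allocates a new array with a fresh leading axis. A minimal sketch, assuming a small hypothetical trajectory (6 cycles, 2 sensors) in place of the real one:

```python
import numpy as np

# Hypothetical toy trajectory: 6 cycles, 2 sensors.
sensors = np.arange(12, dtype=np.float32).reshape(6, 2)
window = 3

# Basic slicing returns views: no sensor values are copied here.
first = sensors[0:window]
second = sensors[1:window + 1]
slice_is_view = bool(np.shares_memory(first, sensors))

# np.stack copies the slices into a new array with a new leading axis.
X = np.stack([first, second])
stack_is_copy = not np.shares_memory(X, sensors)

print(X.shape)         # (2, 3, 2)
print(slice_is_view)   # True
print(stack_is_copy)   # True
```

So the per-window slices are cheap, but the stacked X does hold its own copy of the data; with stride 1 that copy is roughly `window` times the size of the original trajectory.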
```python
import numpy as np


def build_pairs(sensors: np.ndarray, failure_cycle: int,
                window: int = 30, stride: int = 1):
    """One engine's run-to-failure -> (X, y) regression pairs.

    Returns
    -------
    X : (n_pairs, window, n_sensors) - sliding-window inputs
    y : (n_pairs,)                   - per-window scalar RUL target
    """
    n = len(sensors)
    X_list, y_list = [], []
    for end in range(window, n + 1, stride):
        X_list.append(sensors[end - window:end])
        y_list.append(failure_cycle - end)
    return np.stack(X_list), np.array(y_list)


# ----- Run on one synthetic engine -----
np.random.seed(0)
n_cycles, n_sensors = 200, 14
sensors = np.random.randn(n_cycles, n_sensors).astype(np.float32) * 5 + 100

X1, y1 = build_pairs(sensors, failure_cycle=200, window=30, stride=1)
X5, y5 = build_pairs(sensors, failure_cycle=200, window=30, stride=5)

print("stride 1: X.shape", X1.shape, " y[:5]", y1[:5].tolist())
print("stride 5: X.shape", X5.shape, " y[:5]", y5[:5].tolist())

# stride 1: X.shape (171, 30, 14)  y[:5] [170, 169, 168, 167, 166]
# stride 5: X.shape (35, 30, 14)  y[:5] [170, 165, 160, 155, 150]
```
PyTorch: A Sliding-Window Dataset
We still need NumPy here as the source format for sensor data. PyTorch's torch.from_numpy is the standard zero-copy bridge — the ndarray's memory becomes the tensor's storage, no duplication.
PyTorch root module. Provides torch.Tensor (the GPU-aware n-dimensional array), torch.from_numpy, torch.tensor, torch.manual_seed, and the autograd engine. Everything in this Dataset stores its data as torch.Tensors so the DataLoader can collate batches and move them to GPU.
Imports the abstract base class for ALL PyTorch map-style datasets. Subclassing Dataset and implementing __len__ + __getitem__ is the only contract a DataLoader needs — it then handles batching, shuffling, multiprocessing, and pinning automatically.
Lazy single-engine windowing dataset. Unlike the NumPy helper that materialises ALL 171 windows up front, this class stores ONLY the (200,14) sensor matrix plus a 171-element list of start indices, and slices a window on-demand inside __getitem__. Saves memory when you have many engines.
One-line summary describing the strategy: lazy (no upfront materialisation) and single-engine (one trajectory per Dataset instance). For multi-engine training you'd wrap this in torch.utils.data.ConcatDataset.
Constructor. Same four arguments as the NumPy helper, but this version stores them and pre-computes start indices instead of building windows.
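Zero-copy bridge from ndarray to torch.Tensor, followed by a cast to float32. torch.from_numpy always shares memory with the source ndarray; .float() then returns that same tensor when the data is already float32 (as in this example), so modifying one modifies the other. For any other dtype, .float() allocates a converted copy and the sharing ends.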
Stash the failure cycle on the instance so __getitem__ can compute RUL = failure_cycle − e at sample-fetch time. Plain int — no tensor conversion needed because we'll wrap the per-sample y as a tensor inside __getitem__.
Stash window length. Used in __getitem__ to compute the slice end e = s + self.window.
Pre-compute every valid window start index as a Python list. With stride=1 we get [0, 1, 2, ..., 170] (171 entries); with stride=5 we get [0, 5, 10, ..., 170] (35 entries). __getitem__ later indexes into this list to find s.
PyTorch DataLoader contract method #1. Returns the total number of samples the dataset exposes. DataLoader uses this to compute number of batches and to bound the sampler's index range.
The number of valid window starts equals the number of available samples. 171 for stride=1, 35 for stride=5.
PyTorch DataLoader contract method #2. Given a sample index in [0, len(self)), return one (X, y) pair. DataLoader calls this once per sample then collates results into a batch.
Look up the start cycle for this sample.
Exclusive upper bound of the slice. With s=0 and window=30, e=30 — slice covers cycles [0, 30).
Tensor slice along the first axis. Like NumPy slicing, returns a VIEW into self.sensors — no data copy. The DataLoader's collate_fn later stacks batch_size such views into a (B, 30, 14) batch tensor.
Build the scalar RUL target as a 0-D tensor. Casting through Python float makes torch.tensor infer the default floating dtype (torch.float32 unless it has been changed); passing a Python int directly would yield torch.int64, which mismatches the float32 X and the floating-point target an MSE loss expects.
DataLoader contract: __getitem__ returns one sample as a tuple. Default collate_fn stacks the X tensors into (batch, 30, 14) and the y tensors into (batch,) automatically.
Seeds PyTorch's random number generators. It does not seed NumPy: the two libraries keep independent RNGs, so reproducibility requires seeding each one you use.
Fake sensor matrix with the same shape and distribution as in the NumPy block. The values match that block only if NumPy is reseeded with np.random.seed(0) immediately before this call. Pre-converted to float32 so torch.from_numpy gives a float32 tensor without an extra cast.
Dense windowing — every cycle yields a sample. The constructor pre-computes self.starts = [0..170] and stores the (200,14) tensor. No windows materialised yet.
Sparse windowing — only every 5th cycle yields a sample. self.starts = [0, 5, 10, ..., 170]. Same self.sensors tensor — only the starts list shrinks.
Verifies __len__ wiring. Python's len() built-in calls Dataset's __len__(self) under the hood.
Verifies stride=5 yields exactly 35 samples — matches floor((200−30)/5)+1.
Subscript notation triggers __getitem__(self, idx=0). The class returns a (X, y) tuple; Python unpacks it into two variables.
Verify shapes and the RUL value of the first sample.
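The index arithmetic described in these notes can be checked without PyTorch. A minimal plain-Python sketch, assuming the same toy dimensions (200 cycles, window 30, stride 5); the variable names are illustrative and not part of the Dataset class:

```python
n_cycles, failure_cycle, window, stride = 200, 200, 30, 5

# Start-indexed form (Dataset notes): pick s, then e = s + window.
starts = list(range(0, n_cycles - window + 1, stride))
targets_from_starts = [failure_cycle - (s + window) for s in starts]

# End-indexed form (NumPy helper notes): iterate over e directly.
ends = list(range(window, n_cycles + 1, stride))
targets_from_ends = [failure_cycle - e for e in ends]

print(len(starts), starts[:3], starts[-1])                # 35 [0, 5, 10] 170
print(targets_from_starts == targets_from_ends)           # True
print(targets_from_starts[:3], targets_from_starts[-1])   # [170, 165, 160] 0
```

Both enumerations describe the same windows, so the lazy Dataset and the eager NumPy helper should produce identical RUL targets for the same window and stride.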
```python
import numpy as np
import torch
from torch.utils.data import Dataset


class SlidingWindowDataset(Dataset):
    """Lazy windowing - one engine, sliding stride."""

    def __init__(self, sensors: np.ndarray, failure_cycle: int,
                 window: int = 30, stride: int = 1):
        self.sensors = torch.from_numpy(sensors).float()
        self.failure_cycle = failure_cycle
        self.window = window
        self.starts = list(range(0, len(sensors) - window + 1, stride))

    def __len__(self) -> int:
        return len(self.starts)

    def __getitem__(self, idx: int):
        s = self.starts[idx]
        e = s + self.window
        X = self.sensors[s:e]                            # (window, n_sensors)
        y = torch.tensor(float(self.failure_cycle - e))  # scalar RUL
        return X, y


# ----- Use it -----
torch.manual_seed(0)
np.random.seed(0)  # torch.manual_seed does not seed NumPy's generator
sensors = np.random.randn(200, 14).astype(np.float32) * 5 + 100

ds_dense = SlidingWindowDataset(sensors, 200, window=30, stride=1)
ds_sparse = SlidingWindowDataset(sensors, 200, window=30, stride=5)

print("dense size :", len(ds_dense))    # 171
print("sparse size :", len(ds_sparse))  # 35

X, y = ds_dense[0]
print("X.shape:", tuple(X.shape), "y:", float(y))  # (30, 14) 170.0
```
Sliding Windows in Other Domains
| Domain | Window | Stride | Notes |
|---|---|---|---|
| RUL (this book) | 30 cycles | 1 (train) / last (eval) | Window in cycles |
| Speech recognition | 25 ms frames | 10 ms (60% overlap) | Hop length = stride |
| EEG seizure detection | 1-second windows | 0.5 s (50% overlap) | Continuous classification |
| Network IDS | 60 packets | 1 packet (heavy overlap) | Stream-mode windowing |
| Stock-trading signals | 60 minutes | 5 minutes | Hourly indicators |
| Wearable activity recognition | 200 samples (~2 s) | 100 samples (50% overlap) | Continuous classification, as for EEG |
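The overlap percentages in the table follow from the window and stride alone: consecutive windows share a fraction 1 − S/W of their samples. A small sketch under that assumption, using three of the rows above; `overlap_fraction` is a hypothetical helper name:

```python
def overlap_fraction(window: float, stride: float) -> float:
    """Fraction of a window shared with the next one: 1 - S/W, floored at 0."""
    return max(0.0, 1.0 - stride / window)


# (window, stride) pairs taken from the table; units cancel within a row.
rows = {
    "EEG seizure detection": (1.0, 0.5),          # seconds
    "Wearable activity recognition": (200, 100),  # samples
    "Network IDS": (60, 1),                       # packets
}

for name, (w, s) in rows.items():
    print(f"{name}: {overlap_fraction(w, s):.1%}")
# EEG seizure detection: 50.0%
# Wearable activity recognition: 50.0%
# Network IDS: 98.3%
```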
Three Window-Construction Pitfalls
min(engine_length) when picking W.
The point. One run-to-failure engine becomes 100+ training samples via windowing. The model never sees more than 30 cycles at a time. The RUL target is always the cycles-to-go at the window's last cycle.
Takeaway
- Window length 30 is the C-MAPSS standard. Stride 1 for training, last-window-only for evaluation.
- n_windows = floor((N - W)/S) + 1. Memorise this formula; it is the foundation of every loader in this book.
- RUL is computed at the window's LAST cycle. Anchoring it at the window's first cycle instead inflates every target by W − 1 cycles, so the model learns the RUL as it stood in the past, not the RUL now.
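Last-window-only evaluation and the choice of anchor cycle can be illustrated in a few lines. This is a sketch under stated assumptions: the test engine is hypothetical (observed for cycles 1 to 120, with true failure at cycle 200), and `last_window` is an illustrative helper, not a function defined elsewhere in this book.

```python
def last_window(trajectory: list, window: int) -> list:
    """Return the final `window` cycles, the only window scored at evaluation."""
    if len(trajectory) < window:
        raise ValueError("trajectory is shorter than the window length")
    return trajectory[-window:]


# Hypothetical truncated test engine: cycle numbers 1..120, failure at 200.
cycles = list(range(1, 121))
failure_cycle = 200
window = 30

w = last_window(cycles, window)
rul_at_last = failure_cycle - w[-1]    # anchored at the window's last cycle
rul_at_first = failure_cycle - w[0]    # anchored (wrongly) at its first cycle

print(w[0], w[-1])                                          # 91 120
print(rul_at_last, rul_at_first, rul_at_first - rul_at_last)  # 80 109 29
```

The gap between the two targets is W − 1 = 29 cycles, which is how far a first-cycle anchor would overstate the remaining life.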