Chapter 15

Permutation Tests

Common Statistical Tests

Learning Objectives

By the end of this section, you will be able to:

📚 Core Knowledge

  • Understand the principle of exchangeability under the null hypothesis
  • Explain how permutation tests construct a null distribution empirically
  • Describe when permutation tests are preferred over parametric alternatives
  • Distinguish between exact and Monte Carlo permutation tests
  • Compare permutation tests with bootstrap methods

🔧 Practical Skills

  • Implement permutation tests for two-sample comparisons
  • Calculate exact p-values for small samples
  • Apply Monte Carlo approximation for large datasets
  • Extend permutation logic to correlation and paired tests
  • Use scipy and custom implementations for permutation inference

🧠 AI/ML Applications

  • A/B Testing - Robust hypothesis tests for skewed metrics like revenue or conversion
  • Feature Importance - Permutation importance for model interpretation
  • Model Comparison - Statistical testing when comparing model performance
  • Cross-Validation - Significance of CV score differences
  • Fairness Auditing - Testing for disparate impact without distributional assumptions
Central Message: Permutation tests provide exact inference without distributional assumptions beyond exchangeability. By leveraging exchangeability under the null hypothesis, we construct a null distribution directly from the data itself, a powerful technique that predates and complements modern machine learning.

The Big Picture: Distribution-Free Inference

Throughout this chapter, we have explored tests like the t-test, chi-square test, and likelihood ratio test. These powerful methods all share a common requirement: they rely on asymptotic theory or distributional assumptions to derive the null distribution. But what if your data is highly skewed, has outliers, or comes from an unknown distribution?

Permutation tests (also called randomization tests or exact tests) offer an elegant solution. Instead of assuming a theoretical distribution, they construct the null distribution empirically by repeatedly shuffling the data. The core insight is beautifully simple:

💡

The Core Insight

If the null hypothesis is true, then the group labels are arbitrary. We could shuffle them randomly without affecting the underlying structure of the data. By observing how our test statistic behaves under all possible shuffles, we learn what values are "typical" under H₀.

Historical Context: Fisher's Lady Tasting Tea

The permutation test was pioneered by Ronald Fisher in the 1930s through his famous "Lady Tasting Tea" experiment. A colleague, Dr. Muriel Bristol, claimed she could tell whether milk or tea was poured first into a cup. Fisher designed a rigorous test:

Fisher's Experimental Design (1935)

  1. Prepare 8 cups: 4 with milk first, 4 with tea first
  2. Present cups in random order; she identifies which 4 had milk first
  3. Count how many she correctly identifies
  4. Calculate: What's the probability of this success rate if she were just guessing?

If she were guessing, all $\binom{8}{4} = 70$ ways of choosing 4 cups would be equally likely. The p-value is simply the proportion of these 70 arrangements that are as impressive as (or more impressive than) what she achieved.
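The counting argument can be checked directly. The sketch below is an illustration rather than Fisher's own calculation: it counts how many of the 70 equally likely selections contain exactly k truly milk-first cups. The helper name `prob_exactly` is introduced here for illustration.

```python
from math import comb

n_cups = 8
n_milk_first = 4

# Number of equally likely ways to pick which 4 cups are "milk first"
total_arrangements = comb(n_cups, n_milk_first)

def prob_exactly(k: int) -> float:
    """P(exactly k of her 4 picks are truly milk-first) under pure guessing."""
    # Choose k of the 4 true milk-first cups and 4 - k of the 4 tea-first cups
    return comb(4, k) * comb(4, 4 - k) / total_arrangements

p_all_correct = prob_exactly(4)                       # only 1 arrangement
p_three_or_more = prob_exactly(3) + prob_exactly(4)   # 16 + 1 arrangements

print(f"Total arrangements: {total_arrangements}")
print(f"P(all 4 correct | guessing) = {p_all_correct:.4f}")
print(f"P(at least 3 correct | guessing) = {p_three_or_more:.4f}")
```

Only a perfect score gives a p-value below 0.05 in this design (1/70 ≈ 0.014); three correct out of four corresponds to 17/70 ≈ 0.243.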

Why This Was Revolutionary: Fisher showed that meaningful statistical inference could be conducted without assuming any probability distribution. The null distribution comes directly from the randomization procedure itself.

The Core Principle: Exchangeability

The mathematical foundation of permutation tests is the concept of exchangeability. Under the null hypothesis, observations are exchangeable if their joint distribution is invariant to permutations of their labels.

Formal Definition of Exchangeability

$$(X_1, \ldots, X_n) \stackrel{d}{=} (X_{\pi(1)}, \ldots, X_{\pi(n)})$$

for every permutation $\pi$ of $\{1, \ldots, n\}$

Intuition: If there is truly no treatment effect (H₀ is true), then whether an observation came from the "treatment" group or "control" group is just an arbitrary label. The data would look the same regardless of how we assigned these labels.

| Scenario | Are Labels Exchangeable Under H₀? | Why? |
| --- | --- | --- |
| A/B test with random assignment | Yes | Random assignment means labels are arbitrary if no effect |
| Drug trial: treatment vs placebo | Yes | If drug has no effect, assignment is irrelevant |
| Observational study: smokers vs non-smokers | Caution needed | Groups may differ systematically beyond just smoking |
| Time series: before vs after | Usually no | Temporal ordering typically matters |

Critical Assumption: Permutation tests are not a free lunch. They require that observations are exchangeable under H₀, which typically holds when treatments are randomly assigned. For observational data, the assumption may be violated if the groups differ in ways beyond the treatment of interest.

Interactive: Understanding Exchangeability

This interactive demonstration shows how exchangeability works. Under the null hypothesis, we can shuffle group labels and create equally plausible datasets.

Understanding Exchangeability

The Key Insight: Exchangeability

Under the null hypothesis (no treatment effect), the group labels are arbitrary. If there's truly no difference between groups, we could shuffle the labels without changing the underlying data structure. This is the principle of exchangeability.

Original Data with Labels

  • Group A: 12, 15, 18 (mean 15.0)
  • Group B: 25, 28, 30 (mean 27.7)
  • Difference in means (B - A): 12.7

Key Takeaway

The permutation test asks: "How often would we see a difference as extreme as 12.7 if we randomly shuffled the labels?" If such extreme differences are rare among all permutations, we have evidence against H₀.
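That question can be answered exactly for this six-value dataset, because there are only 20 ways to choose which three values carry the "A" label. The following is a minimal sketch using only the standard library; the variable names and the small tolerance are choices made here for illustration.

```python
from itertools import combinations

group_a = [12, 15, 18]
group_b = [25, 28, 30]
pooled = group_a + group_b

def mean(values):
    return sum(values) / len(values)

observed = mean(group_b) - mean(group_a)

# Enumerate every way to choose which 3 of the 6 values are labelled "A"
diffs = []
for idx_a in combinations(range(len(pooled)), len(group_a)):
    relabelled_a = [pooled[i] for i in idx_a]
    relabelled_b = [pooled[i] for i in range(len(pooled)) if i not in idx_a]
    diffs.append(mean(relabelled_b) - mean(relabelled_a))

tol = 1e-9  # guards the >= comparisons against floating-point round-off
p_two_sided = sum(abs(d) >= abs(observed) - tol for d in diffs) / len(diffs)
p_greater = sum(d >= observed - tol for d in diffs) / len(diffs)

print(f"Observed difference: {observed:.2f}")
print(f"Relabellings enumerated: {len(diffs)}")
print(f"Two-sided p-value: {p_two_sided:.2f}")
print(f"One-sided (greater) p-value: {p_greater:.2f}")
```

The observed split is the most extreme one possible, yet the two-sided p-value is 2/20 = 0.10: with three observations per group, a permutation test cannot go below that value, which is worth keeping in mind for very small samples.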


Mathematical Framework

Let us formalize the permutation testing procedure for a two-sample comparison.

The Permutation Distribution

Consider two groups with observations $X_1, \ldots, X_{n_1}$ (Group A) and $Y_1, \ldots, Y_{n_2}$ (Group B). Let $T$ be our test statistic (e.g., difference in means).

Permutation Test Procedure

  1. Compute observed statistic: Calculate $T_{\text{obs}}$ from the original data
  2. Pool the data: Combine all $n = n_1 + n_2$ observations into one set
  3. Generate permutations: For each of the $\binom{n}{n_1}$ possible ways to assign $n_1$ observations to "Group A":
    • Calculate the test statistic $T^{(b)}$
  4. Build null distribution: The collection $\{T^{(1)}, T^{(2)}, \ldots\}$ forms the permutation distribution
  5. Calculate p-value: Compare $T_{\text{obs}}$ to the permutation distribution

P-Value Calculation

The permutation p-value is calculated as the proportion of permuted statistics that are at least as extreme as the observed statistic:

Permutation P-Value Formulas

Two-sided:

$$p = \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\left(|T^{(b)}| \geq |T_{\text{obs}}|\right)$$

Right-tailed:

$$p = \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\left(T^{(b)} \geq T_{\text{obs}}\right)$$

Left-tailed:

$$p = \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\left(T^{(b)} \leq T_{\text{obs}}\right)$$

where $B$ is the number of permutations and $\mathbf{1}(\cdot)$ is the indicator function

Exact vs Monte Carlo: When $\binom{n}{n_1}$ is small (roughly < 10,000), we can enumerate all permutations for an exact p-value. For larger samples, we sample B permutations at random for a Monte Carlo approximation. With B = 10,000 permutations, the Monte Carlo error in the estimated p-value is at most about ±0.01 (a 95% margin, reached near p = 0.5) and shrinks for p-values close to 0 or 1.
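The size of the Monte Carlo error follows from treating each random permutation as a Bernoulli trial: the estimated p-value then has standard error sqrt(p(1 - p)/B). The sketch below tabulates that quantity under this binomial assumption; the helper name `mc_standard_error` is introduced here for illustration.

```python
import math

def mc_standard_error(p: float, n_resamples: int) -> float:
    """Standard error of a Monte Carlo p-value estimate (binomial approximation)."""
    return math.sqrt(p * (1.0 - p) / n_resamples)

B = 10_000
for p in (0.5, 0.05, 0.01):
    se = mc_standard_error(p, B)
    print(f"true p = {p:<4}  SE = {se:.4f}  approx. 95% margin = ±{1.96 * se:.4f}")
```

Near the usual 0.05 threshold the margin is therefore closer to ±0.004 than ±0.01, so B = 10,000 is usually enough unless the p-value sits very close to the cutoff.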

Interactive: Permutation Test Explorer

This interactive visualization lets you see the permutation test in action. Run permutations, build the null distribution, and observe how the p-value is calculated.

Permutation Test Explorer

Group A (Control): 23, 28, 31, 25, 27 (mean 26.80)

Group B (Treatment): 35, 38, 32, 40, 36 (mean 36.20)

Observed Difference (B - A): 9.40

H₀: This difference is due to random chance
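The explorer's data can also be run through a Monte Carlo permutation test offline. This is a minimal sketch assuming NumPy is available; the seed, the number of resamples, and the variable names are arbitrary choices made here. It reports both the plain proportion used in the formulas above and the (count + 1)/(B + 1) variant that some implementations prefer because it can never return exactly zero.

```python
import numpy as np

control = np.array([23, 28, 31, 25, 27], dtype=float)
treatment = np.array([35, 38, 32, 40, 36], dtype=float)
observed = treatment.mean() - control.mean()

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n_resamples = 10_000
pooled = np.concatenate([control, treatment])
n_control = len(control)

perm_diffs = np.empty(n_resamples)
for b in range(n_resamples):
    shuffled = rng.permutation(pooled)   # random relabelling of the pooled data
    perm_diffs[b] = shuffled[n_control:].mean() - shuffled[:n_control].mean()

# Small tolerance so that exact ties are not lost to round-off
n_extreme = int(np.sum(np.abs(perm_diffs) >= abs(observed) - 1e-12))
p_plain = n_extreme / n_resamples
p_adjusted = (n_extreme + 1) / (n_resamples + 1)

print(f"Observed difference: {observed:.2f}")
print(f"Monte Carlo p-value (plain proportion): {p_plain:.4f}")
print(f"Monte Carlo p-value (+1 adjusted):      {p_adjusted:.4f}")
```

Because every control value is smaller than every treatment value, only the observed split and its mirror image are this extreme among the 252 possible splits, so the exact two-sided p-value is 2/252 ≈ 0.0079. The Monte Carlo estimates should land close to that.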

Types of Permutation Tests

The permutation principle extends to many testing scenarios beyond two-sample means:

Two-Sample Tests

Compare two independent groups. Test statistic options:

  • Difference in means: $\bar{Y} - \bar{X}$
  • Difference in medians
  • t-statistic (studentized; more robust when variances differ)
  • Any function of the two groups

Paired Tests

For matched pairs (before/after, twins, etc.):

  • Compute differences $D_i = Y_i - X_i$
  • Randomly flip signs of differences
  • Test if mean difference differs from zero
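For a small number of pairs, the sign-flip test can be carried out exactly by enumerating all 2^n sign patterns. The sketch below does this for the ten blood-pressure pairs used in this section's Python implementation; it compares sums rather than means (equivalent, since n is fixed) so that all arithmetic stays in integers.

```python
from itertools import product

before = [140, 145, 138, 150, 142, 148, 155, 140, 143, 147]
after = [132, 138, 130, 145, 135, 140, 148, 132, 138, 140]

diffs = [a - b for a, b in zip(after, before)]
observed_sum = sum(diffs)

n_patterns = 0
n_extreme = 0
# Under H0 each difference is equally likely to be positive or negative,
# so every one of the 2**n sign patterns is equally likely
for signs in product((-1, 1), repeat=len(diffs)):
    flipped_sum = sum(s * d for s, d in zip(signs, diffs))
    n_patterns += 1
    if abs(flipped_sum) >= abs(observed_sum):
        n_extreme += 1

p_exact = n_extreme / n_patterns
print(f"Mean difference (after - before): {observed_sum / len(diffs):.1f} mmHg")
print(f"Sign patterns enumerated: {n_patterns}")
print(f"Exact two-sided p-value: {p_exact:.6f}")
```

All ten differences are negative, so only the all-unflipped and all-flipped patterns match the observed magnitude, giving p = 2/1024 ≈ 0.002.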

Correlation Tests

Test H₀: X and Y are independent:

  • Keep X values fixed
  • Permute Y values (break the pairing)
  • Calculate correlation under each permutation
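With only a handful of pairs, every reordering of Y can be enumerated. The sketch below uses six hypothetical (x, y) pairs invented for illustration and a hand-written Pearson correlation, so it needs nothing beyond the standard library.

```python
from itertools import permutations
from math import sqrt

# Hypothetical data for illustration: y increases steadily with x
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

def pearson_r(u, v):
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)

r_obs = pearson_r(x, y)

n_perms = 0
n_extreme = 0
for y_perm in permutations(y):          # keep x fixed, reorder y
    n_perms += 1
    # Tolerance protects the tie at |r| = |r_obs| from round-off
    if abs(pearson_r(x, y_perm)) >= abs(r_obs) - 1e-12:
        n_extreme += 1

p_exact = n_extreme / n_perms
print(f"Observed r: {r_obs:.4f}")
print(f"Orderings enumerated: {n_perms}")
print(f"Exact two-sided p-value: {p_exact:.4f}")
```

Enumeration grows as n!, so this is only practical for very small n; beyond roughly ten pairs, random shuffles of Y are the usual route.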

Multi-Group Tests (k groups)

Extension to ANOVA-style comparisons:

  • Pool all observations
  • Randomly assign to k groups (respecting sizes)
  • Use F-statistic or sum of squared deviations
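A sketch of the multi-group case, assuming NumPy: the one-way F-statistic is computed by hand, and the group labels are shuffled (which keeps the group sizes fixed). The three groups are hypothetical values invented for illustration, and the p-value uses the (count + 1)/(B + 1) convention, which is one common choice for Monte Carlo tests.

```python
import numpy as np

def f_statistic(values: np.ndarray, labels: np.ndarray, k: int) -> float:
    """One-way ANOVA F-statistic for k groups encoded as labels 0..k-1."""
    grand_mean = values.mean()
    ss_between = 0.0
    ss_within = 0.0
    for g in range(k):
        group = values[labels == g]
        ss_between += len(group) * (group.mean() - grand_mean) ** 2
        ss_within += np.sum((group - group.mean()) ** 2)
    return (ss_between / (k - 1)) / (ss_within / (len(values) - k))

# Hypothetical measurements for three groups
groups = [
    np.array([20, 22, 19, 24, 21], dtype=float),
    np.array([25, 27, 26, 29, 28], dtype=float),
    np.array([31, 30, 33, 32, 35], dtype=float),
]
k = len(groups)
values = np.concatenate(groups)
labels = np.repeat(np.arange(k), [len(g) for g in groups])

observed_f = f_statistic(values, labels, k)

rng = np.random.default_rng(0)   # arbitrary seed
n_resamples = 5000
perm_f = np.array([
    f_statistic(values, rng.permutation(labels), k)  # shuffling labels preserves group sizes
    for _ in range(n_resamples)
])

p_value = (int(np.sum(perm_f >= observed_f)) + 1) / (n_resamples + 1)
print(f"Observed F: {observed_f:.2f}")
print(f"Permutation p-value: {p_value:.4f}")
```

Only large values of F count as extreme here, so the test is one-sided in F even though it detects differences in any direction among the group means.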


Advantages and Limitations

| Aspect | Advantage | Limitation |
| --- | --- | --- |
| Assumptions | No distributional assumptions (non-parametric) | Requires exchangeability under H₀ |
| Validity | Exact p-values for any sample size | Only tests H₀, not parameters |
| Robustness | Works with outliers, skewed data, any shape | May be less powerful than parametric tests when assumptions hold |
| Computation | Conceptually simple; easy to implement | Can be slow for large datasets |
| Flexibility | Any test statistic can be used | No confidence intervals directly |

Interactive: Robustness Comparison

This simulation compares the Type I error rates of permutation tests versus t-tests under various conditions. See how permutation tests maintain validity even when the t-test assumptions are violated.

Simulation: Permutation vs Parametric robustness comparison. Skewness control: 0 = normal, higher = more right-skewed.
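The permutation half of such a simulation can be sketched offline, assuming NumPy. Both groups are drawn from the same log-normal distribution, so H₀ is true by construction and every rejection is a Type I error. The sample size, number of simulations, and seed are arbitrary choices made here; a t-test column could be added alongside for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
n_sims = 400          # simulated experiments, all with H0 true
n_per_group = 10
n_resamples = 199     # permutations per experiment
alpha = 0.05

rejections = 0
for _ in range(n_sims):
    # Same right-skewed distribution for both groups, so there is no true effect
    pooled = rng.lognormal(mean=0.0, sigma=1.0, size=2 * n_per_group)
    observed = abs(pooled[n_per_group:].mean() - pooled[:n_per_group].mean())

    # Each row of `order` is an independent random ordering of the indices
    order = np.argsort(rng.random((n_resamples, pooled.size)), axis=1)
    shuffled = pooled[order]
    perm = np.abs(
        shuffled[:, n_per_group:].mean(axis=1) - shuffled[:, :n_per_group].mean(axis=1)
    )

    p_value = (np.sum(perm >= observed) + 1) / (n_resamples + 1)
    rejections += int(p_value <= alpha)

type_i_error = rejections / n_sims
print(f"Estimated Type I error at alpha = {alpha}: {type_i_error:.3f}")
```

With 400 simulated experiments the estimate itself has a standard error of about 0.01, so values in the neighbourhood of 0.05 are what a valid test would be expected to produce.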


Permutation vs Bootstrap

Both permutation tests and bootstrap are resampling methods, but they serve different purposes:

Permutation Tests

  • Purpose: Hypothesis testing
  • Sampling: Without replacement (shuffle labels)
  • Generates: Null distribution
  • Centered at: Zero (or null value)
  • Answers: "Is the observed effect real?"

Bootstrap

  • Purpose: Estimation uncertainty
  • Sampling: With replacement
  • Generates: Sampling distribution
  • Centered at: Observed statistic
  • Answers: "How precise is our estimate?"

Interactive: Resampling Methods Comparison

Compare the permutation and bootstrap distributions side by side. Notice how the permutation distribution is centered at zero (the null hypothesis) while the bootstrap distribution is centered at the observed difference.

Permutation vs Bootstrap: Two Resampling Philosophies

Group A: 18, 21, 24, 19, 23 (mean 21.0)

Group B: 28, 31, 27, 32, 29 (mean 29.4)

Observed Difference: 8.4

Permutation Test

  • Shuffles labels between groups
  • Samples without replacement
  • Tests H₀: groups are exchangeable
  • Distribution centered at zero

Bootstrap

  • Resamples observations within groups
  • Samples with replacement
  • Estimates sampling distribution
  • Distribution centered at observed

Key Difference

Permutation tests generate a null distribution (what we'd see if H₀ were true), while bootstrap estimates the sampling distribution of the statistic. Use permutation for hypothesis testing; use bootstrap for confidence intervals.

When to Use Each:
  • Use permutation tests when testing hypotheses (p-values)
  • Use bootstrap when constructing confidence intervals
  • For A/B tests: Use permutation for the test, bootstrap for effect size CIs
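The contrast between the two resampling schemes can be reproduced with the small two-group dataset from the comparison above. This is a sketch assuming NumPy; the seed and the number of resamples are arbitrary, and the interval shown is a simple percentile bootstrap interval, one of several possible bootstrap constructions.

```python
import numpy as np

group_a = np.array([18, 21, 24, 19, 23], dtype=float)
group_b = np.array([28, 31, 27, 32, 29], dtype=float)
observed = group_b.mean() - group_a.mean()

rng = np.random.default_rng(0)   # arbitrary seed
n_resamples = 5000
pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)

perm_diffs = np.empty(n_resamples)
boot_diffs = np.empty(n_resamples)
for i in range(n_resamples):
    # Permutation: shuffle the pooled data (without replacement), then re-split
    shuffled = rng.permutation(pooled)
    perm_diffs[i] = shuffled[n_a:].mean() - shuffled[:n_a].mean()

    # Bootstrap: resample each group separately, with replacement
    boot_a = rng.choice(group_a, size=n_a, replace=True)
    boot_b = rng.choice(group_b, size=len(group_b), replace=True)
    boot_diffs[i] = boot_b.mean() - boot_a.mean()

ci_low, ci_high = np.percentile(boot_diffs, [2.5, 97.5])

print(f"Observed difference: {observed:.2f}")
print(f"Permutation distribution mean: {perm_diffs.mean():.2f}  (centered near 0)")
print(f"Bootstrap distribution mean:   {boot_diffs.mean():.2f}  (centered near observed)")
print(f"95% percentile bootstrap CI:   [{ci_low:.2f}, {ci_high:.2f}]")
```

With only five observations per group the percentile interval is rough, but the centering of the two distributions is the point of the comparison.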

Applications in AI/ML

Permutation tests have become increasingly important in modern machine learning. Here are key applications:

🎯 Permutation Feature Importance

Permutation importance measures feature importance by shuffling each feature and observing the drop in model performance. Unlike built-in importance measures, it works for any model and doesn't require model internals.

from sklearn.inspection import permutation_importance
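To show the mechanics without relying on any particular library, here is a hand-rolled sketch of the idea using NumPy and an ordinary least-squares model. It is not scikit-learn's implementation: the synthetic data, the helper names `predict` and `mse`, and the choice of 20 repeats are all assumptions made for illustration, and the importances are computed on the training data for brevity, whereas held-out data is usually preferable.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
n = 500

# Synthetic data: feature 0 matters a lot, feature 1 a little, feature 2 not at all
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Fit ordinary least squares with an intercept column
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), y, rcond=None)

def predict(features: np.ndarray) -> np.ndarray:
    return np.column_stack([np.ones(len(features)), features]) @ coef

def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.mean((y_true - y_pred) ** 2))

baseline = mse(y, predict(X))
n_repeats = 20

importances = []
for j in range(X.shape[1]):
    increases = []
    for _ in range(n_repeats):
        X_shuffled = X.copy()
        # Shuffling one column breaks its link to y but keeps its marginal distribution
        X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
        increases.append(mse(y, predict(X_shuffled)) - baseline)
    importances.append(float(np.mean(increases)))

for j, imp in enumerate(importances):
    print(f"feature {j}: mean increase in MSE = {imp:.3f}")
```

The fitted model is never retrained; only its inputs are scrambled, which is why the technique works for any model that can produce predictions.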

🧪 A/B Testing for Skewed Metrics

Revenue, purchase amount, and session duration are often highly skewed with outliers. The t-test's normal approximation may fail. Permutation tests provide valid inference regardless of the metric's distribution.

📊 Model Comparison Testing

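Is Model A's CV accuracy of 0.92 significantly better than Model B's 0.89? A paired permutation test on the per-fold score differences (randomly flipping their signs) addresses this without asymptotic assumptions. For per-example classification outcomes, McNemar's test applies the same paired logic to the cases where the two models disagree.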

⚖️ Algorithmic Fairness Auditing

Testing whether a model's predictions have disparate impact across demographic groups. Permutation tests assess whether observed disparities could arise by chance, without requiring strong distributional assumptions.


Python Implementation

Complete Permutation Test Implementation
Notes on the code:

  • Imports: We use numpy for numerical operations and scipy.stats for comparison with parametric tests. The typing module supplies the type hints.
  • Function signature: The permutation_test function is designed to be flexible: it accepts any test statistic function and handles both exact enumeration and Monte Carlo sampling.
  • Observed statistic: First, we compute the test statistic on the original data. This is the value we compare against the permutation distribution.
  • Exact vs Monte Carlo: We check whether the number of possible splits C(n, n_a) is small enough to enumerate (no larger than n_permutations). Small samples get exact p-values; larger samples use Monte Carlo.
  • Exact enumeration: For small samples, we enumerate all possible ways to assign n_a observations to group A using combinations. This gives an exact p-value.
  • Monte Carlo sampling: For large samples, we randomly shuffle the pooled data and split it into two groups. Repeating this n_permutations times approximates the exact distribution.
  • P-value calculation: The p-value is the proportion of permuted statistics at least as extreme as the observed one. For two-sided tests, we compare absolute values.
  • Skewed data example: This A/B test example uses log-normal data (common for revenue metrics). The permutation test handles skewness correctly without assuming normality.
  • Correlation test: To test independence between X and Y, we shuffle Y (breaking the pairing) while keeping X fixed. This preserves both marginal distributions.
  • Paired test: For paired data, we randomly flip the signs of the differences. Under H₀ (no effect), each difference is equally likely to be positive or negative.
  • SciPy integration: scipy.stats has provided permutation_test since SciPy 1.8. It supports three permutation types: 'independent' for two-sample tests, 'samples' for paired tests, and 'pairings' for correlation.
```python
from itertools import combinations
from math import comb
from typing import Callable, Literal, Optional

import numpy as np
from scipy import stats

# =============================================
# Generic Permutation Test Framework
# =============================================

def permutation_test(
    group_a: np.ndarray,
    group_b: np.ndarray,
    statistic: Callable[[np.ndarray, np.ndarray], float] = lambda a, b: np.mean(b) - np.mean(a),
    n_permutations: int = 10000,
    alternative: Literal['two-sided', 'greater', 'less'] = 'two-sided',
    seed: Optional[int] = None
) -> dict:
    """
    Perform a two-sample permutation test.

    Parameters
    ----------
    group_a : array-like
        Observations from first group
    group_b : array-like
        Observations from second group
    statistic : callable
        Function that computes test statistic from (group_a, group_b)
    n_permutations : int
        Number of random permutations for the Monte Carlo approximation.
        If the number of distinct splits C(n, n_a) does not exceed this
        value, all splits are enumerated and the test is exact.
    alternative : str
        'two-sided', 'greater', or 'less'
    seed : int, optional
        Random seed for reproducibility

    Returns
    -------
    dict with 'statistic', 'p_value', 'permutation_distribution',
    'n_permutations', and 'exact'
    """
    # Local generator: avoids mutating NumPy's global random state
    rng = np.random.default_rng(seed)

    group_a = np.asarray(group_a)
    group_b = np.asarray(group_b)

    # Compute observed statistic
    observed = statistic(group_a, group_b)

    # Pool all observations
    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)
    n_total = len(pooled)

    # Check if exact enumeration is feasible
    n_exact = comb(n_total, n_a)
    is_exact = n_exact <= n_permutations

    if is_exact:
        # Exact test: enumerate every way to choose which n_a values form group A
        perm_stats = []
        for indices in combinations(range(n_total), n_a):
            in_a = np.zeros(n_total, dtype=bool)
            in_a[list(indices)] = True
            perm_stats.append(statistic(pooled[in_a], pooled[~in_a]))
        perm_stats = np.array(perm_stats)
        actual_perms = n_exact
    else:
        # Monte Carlo approximation: random shuffles of the pooled data
        perm_stats = np.zeros(n_permutations)
        for i in range(n_permutations):
            shuffled = rng.permutation(pooled)
            perm_a = shuffled[:n_a]
            perm_b = shuffled[n_a:]
            perm_stats[i] = statistic(perm_a, perm_b)
        actual_perms = n_permutations

    # Calculate p-value based on alternative
    if alternative == 'two-sided':
        p_value = np.mean(np.abs(perm_stats) >= np.abs(observed))
    elif alternative == 'greater':
        p_value = np.mean(perm_stats >= observed)
    else:  # 'less'
        p_value = np.mean(perm_stats <= observed)

    return {
        'statistic': observed,
        'p_value': p_value,
        'permutation_distribution': perm_stats,
        'n_permutations': actual_perms,
        'exact': is_exact
    }


# =============================================
# Example 1: Basic two-sample test
# =============================================

# Simulated A/B test data (revenue per user)
rng = np.random.default_rng(42)
control = rng.lognormal(3, 1, 50)      # Control group: skewed revenue
treatment = rng.lognormal(3.2, 1, 50)  # Treatment group: about 22% higher mean (exp(0.2) ≈ 1.22)

result = permutation_test(control, treatment, n_permutations=10000, seed=0)

print("=== Two-Sample Permutation Test ===")
print(f"Observed difference in means: {result['statistic']:.2f}")
print(f"P-value: {result['p_value']:.4f}")
print(f"Exact test: {result['exact']}")

# Compare with t-test (may be unreliable for skewed data!)
t_stat, t_pval = stats.ttest_ind(treatment, control)
print(f"\nFor comparison - t-test p-value: {t_pval:.4f}")


# =============================================
# Example 2: Permutation test for correlation
# =============================================

def permutation_correlation_test(
    x: np.ndarray,
    y: np.ndarray,
    n_permutations: int = 10000,
    seed: Optional[int] = None
) -> dict:
    """Test H0: X and Y are independent."""
    rng = np.random.default_rng(seed)

    observed_r, _ = stats.pearsonr(x, y)

    perm_correlations = np.zeros(n_permutations)
    for i in range(n_permutations):
        # Shuffle y only: breaks the pairing, keeps both marginals
        perm_y = rng.permutation(y)
        perm_correlations[i], _ = stats.pearsonr(x, perm_y)

    p_value = np.mean(np.abs(perm_correlations) >= np.abs(observed_r))

    return {
        'correlation': observed_r,
        'p_value': p_value,
        'permutation_distribution': perm_correlations
    }

# Test correlation between advertising spend and sales
ad_spend = np.array([10, 15, 20, 25, 30, 35, 40, 45, 50, 55])
sales = np.array([120, 145, 170, 190, 220, 245, 260, 290, 310, 340])

corr_result = permutation_correlation_test(ad_spend, sales, seed=0)
print("\n=== Permutation Correlation Test ===")
print(f"Observed correlation: {corr_result['correlation']:.4f}")
print(f"P-value: {corr_result['p_value']:.4f}")


# =============================================
# Example 3: Paired permutation test (sign flip)
# =============================================

def paired_permutation_test(
    before: np.ndarray,
    after: np.ndarray,
    n_permutations: int = 10000,
    seed: Optional[int] = None
) -> dict:
    """Test H0: No difference (by randomly flipping signs of differences)."""
    rng = np.random.default_rng(seed)

    differences = after - before
    observed_mean = np.mean(differences)

    perm_means = np.zeros(n_permutations)
    for i in range(n_permutations):
        # Randomly flip signs
        signs = rng.choice([-1, 1], size=len(differences))
        perm_means[i] = np.mean(differences * signs)

    p_value = np.mean(np.abs(perm_means) >= np.abs(observed_mean))

    return {
        'mean_difference': observed_mean,
        'p_value': p_value,
        'permutation_distribution': perm_means
    }

# Blood pressure before and after treatment
bp_before = np.array([140, 145, 138, 150, 142, 148, 155, 140, 143, 147])
bp_after = np.array([132, 138, 130, 145, 135, 140, 148, 132, 138, 140])

paired_result = paired_permutation_test(bp_before, bp_after, seed=0)
print("\n=== Paired Permutation Test ===")
print(f"Mean BP change (after - before): {paired_result['mean_difference']:.2f} mmHg")
print(f"P-value: {paired_result['p_value']:.4f}")


# =============================================
# Using scipy.stats.permutation_test (SciPy 1.8+)
# =============================================

# scipy provides permutation_test in the stats module
from scipy.stats import permutation_test as scipy_perm_test

def stat_func(x, y, axis):
    return np.mean(x, axis=axis) - np.mean(y, axis=axis)

scipy_result = scipy_perm_test(
    (treatment, control),
    stat_func,
    vectorized=True,  # stat_func accepts an axis argument
    n_resamples=10000,
    alternative='two-sided',
    permutation_type='independent'
)

print("\n=== scipy.stats.permutation_test ===")
print(f"Statistic: {scipy_result.statistic:.4f}")
print(f"P-value: {scipy_result.pvalue:.4f}")
```

Knowledge Check

Test your understanding of permutation tests with this interactive quiz.


What is the key assumption that permutation tests rely on under the null hypothesis?


Chapter 15: Complete Test Selection Guide

After covering all the major statistical tests in this chapter, here's a comprehensive guide to help you choose the right test for your situation.

Decision Flowchart: Which Test Should I Use?

Step 1: What type of data?

  • Continuous (means) → Go to Step 2
  • Categorical (counts) → Chi-square tests (Section 2)
  • Variances → F-tests (Section 3)

Step 2: How many groups?

  • One group vs known value → One-sample t-test
  • Two groups (independent) → Two-sample t-test (or Welch's)
  • Two groups (paired/matched) → Paired t-test
  • 3+ groups → ANOVA/F-test

Step 3: Are assumptions met?

  • Normality holds, large n → Parametric test (t, F, χ²)
  • Normality violated, small n → Non-parametric alternative
  • Outliers or skewed data → Permutation test (Section 6)

Parametric vs Non-Parametric Alternatives

When distributional assumptions are violated or sample sizes are small, non-parametric tests provide valid alternatives. Here's a comprehensive mapping:

| Situation | Parametric Test | Non-Parametric Alternative | When to Use Alternative |
| --- | --- | --- | --- |
| One sample, location | One-sample t-test | Wilcoxon signed-rank | Non-normal, small n, outliers |
| Two independent samples | Two-sample t-test | Mann-Whitney U (Wilcoxon rank-sum) | Skewed data, ordinal data |
| Two paired samples | Paired t-test | Wilcoxon signed-rank | Non-normal differences, small n |
| 3+ independent groups | One-way ANOVA | Kruskal-Wallis H | Unequal variances, non-normal |
| 3+ related samples | Repeated measures ANOVA | Friedman test | Non-normal, ordinal data |
| Correlation | Pearson r | Spearman ρ or Kendall τ | Non-linear, ordinal, outliers |
| 2×2 contingency | Chi-square test | Fisher's exact test | Small expected counts (<5) |
| General two-sample | t-test | Permutation test | Any violation, skewed, small n |

When to Use Parametric

  • Data approximately normal (or large n by CLT)
  • Variances roughly equal across groups
  • Need maximum statistical power
  • Want confidence intervals for parameters
  • Sample size is moderate to large (n > 30)

When to Use Non-Parametric

  • Data heavily skewed or with outliers
  • Sample size is small (n < 20-30)
  • Data is ordinal (rankings) not interval
  • Uncertain about distributional assumptions
  • Want robustness over efficiency

Complete Test Summary

| Test (Section) | Purpose | Key Formula/Statistic | Assumptions |
| --- | --- | --- | --- |
| Z-test (1) | Mean when σ known | Z = (x̄ - μ₀) / (σ/√n) | Normal data, known σ |
| t-test (1) | Mean when σ unknown | t = (x̄ - μ₀) / (s/√n) | Normal (or large n), unknown σ |
| Chi-square (2) | Categorical associations | χ² = Σ(O-E)²/E | Expected counts ≥ 5 |
| F-test (3) | Variance comparison, ANOVA | F = MS_between / MS_within | Normal, equal variances |
| LRT (4) | Nested model comparison | -2 log(L₀/L₁) ~ χ² | Large samples (asymptotic) |
| Wald (5) | Parameter significance | (θ̂ - θ₀)² / Var(θ̂) | Large samples, MLE computed |
| Score (5) | Parameter significance | U²/I(θ₀) | Large samples, null computed |
| Permutation (6) | Distribution-free test | Any statistic | Exchangeability under H₀ |

Rule of Thumb: When in doubt, start with the permutation test. It's valid under the weakest assumptions and often has power comparable to parametric tests. Use parametric tests when you need confidence intervals or when you're confident in assumptions.

Summary

Key Takeaways

  1. Distribution-free inference: Permutation tests require no distributional assumptions beyond exchangeability under H₀. They remain valid for any data shape: skewed, multimodal, or with outliers.
  2. Exchangeability principle: Under H₀, group labels are arbitrary. We can shuffle them to build the null distribution directly from the data.
  3. Exact p-values: For small samples, we can enumerate all permutations for exact inference. For large samples, Monte Carlo sampling provides accurate approximations.
  4. Flexibility: Any test statistic can be used (means, medians, custom functions). The same principle extends to paired tests, correlation, and multi-group comparisons.
  5. Bootstrap distinction: Permutation tests shuffle labels to create a null distribution (hypothesis testing). Bootstrap resamples with replacement to estimate sampling variability (confidence intervals).
  6. ML applications: Permutation importance for feature selection, A/B testing for skewed metrics, model comparison, and fairness auditing all leverage permutation logic.

Quick Reference

| Test Type | What Gets Permuted | Test Statistic | Use Case |
| --- | --- | --- | --- |
| Two-sample | Group labels | Mean difference, t-statistic | A/B tests, treatment effects |
| Paired | Signs of differences | Mean of signed differences | Before/after comparisons |
| Correlation | Y values (keep X fixed) | Pearson r, Spearman ρ | Testing independence |
| Multi-group | Group labels | F-statistic, Kruskal-Wallis H | Comparing >2 groups |

Final Thought: Permutation tests embody a beautiful principle: when we don't know the null distribution, we can construct it from the data itself. This approach, pioneered by Fisher nearly a century ago, remains one of the most powerful and underutilized tools in the modern data scientist's toolkit. With computational power now abundant, there's rarely a reason not to use permutation tests when parametric assumptions are questionable.