Correlation – Comparison

Pearson vs Spearman vs Kendall



Three correlations. Same purpose: to check if variables move together. But each answers a different question.

Think of the three like this:

  • Pearson → Measures LINEAR movement (straight-line alignment)
  • Spearman → Measures ORDER movement (monotonic direction)
  • Kendall → Measures AGREEMENT movement (how consistent pairs agree)

A simple story to separate them

Imagine 6 students attend a test. Two teachers evaluate them:

  • Teacher A gives marks (0–100) — numeric, measurable
  • Teacher B gives ranks (1st, 2nd, 3rd…) — ordered, not equal jumps

If we compare the actual marks, we use Pearson. If we compare the rank order, we use Spearman. If we compare how consistently each pair of students is placed in the same order, we use Kendall.


✈ Aviation Perspective — same crew dataset, 3 interpretations

Dataset: (Crew experience in years vs average delay)


      Experience (yrs): [1, 3, 4, 6, 8, 10]
      Delay (mins):     [22,19,18,15,13,12]
        
✔ Delay decreases as crew experience increases.

  • Pearson (r): “How straight is the line?”
  • Spearman (ρ): “Does greater experience always mean lower delay?”
  • Kendall (τ): “How consistently do pairs agree?”

🧠 Psychology / Education Perspective

Dataset: (Sleep hours vs memory score)


      Sleep (hrs):      [5, 6, 7, 8, 9, 10]
      Memory score:     [55,63,72,76,79,72]
        
  • Sleep and memory both increase up to 9 hours
  • At 10 hours, memory drops from 79 to 72 (a non-linear, non-monotonic pattern)

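Pearson understates the pattern here because the relationship is not linear: scores climb, flatten, then dip. Spearman and Kendall do not ignore the last value either. The drop at 10 hours breaks the monotonic trend and lowers both. On this data Pearson comes out around 0.82, Spearman around 0.75 and Kendall's τ-b around 0.69, so all three report a positive but imperfect association, and none of them describes the rise-then-fall shape on its own.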
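The coefficients for the sleep data can be checked with a short pure-Python sketch. Because the memory scores contain a tie (72 appears twice), the sketch assumes average ranks for Spearman and the τ-b variant of Kendall. Both choices, and the helper names `pearson`, `average_ranks` and `kendall_tau_b`, are assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson r from the deviation-product formula."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank 1 = smallest; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def kendall_tau_b(x, y):
    """Kendall tau-b: tied pairs are removed from the denominator."""
    n = len(x)
    conc = disc = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            ties_x += dx == 0
            ties_y += dy == 0
            if dx * dy > 0:
                conc += 1
            elif dx * dy < 0:
                disc += 1
    n0 = n * (n - 1) // 2
    return (conc - disc) / sqrt((n0 - ties_x) * (n0 - ties_y))

sleep = [5, 6, 7, 8, 9, 10]
memory = [55, 63, 72, 76, 79, 72]

r = pearson(sleep, memory)
rho = pearson(average_ranks(sleep), average_ranks(memory))  # Spearman = Pearson on ranks
tau = kendall_tau_b(sleep, memory)
print(round(r, 2), round(rho, 2), round(tau, 2))  # 0.82 0.75 0.69
```

On this data the rank-based measures are not higher than Pearson: the dip at 10 hours costs them as well.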


When to use which?

  • Pearson → When values matter (precise numbers)
  • Spearman → When order matters (ranking or monotonic trend)
  • Kendall → When agreement matters (consistency / small datasets)

Memory anchor (easy to recall)

  • P → Precision → Pearson → Precise linear relation
  • S → Sequence → Spearman → Ranks sequence
  • K → Kalm (calm) → Kendall → Stable, robust agreement

Same goal, different lens. Pearson measures linear fit. Spearman measures monotonic direction. Kendall measures pairwise agreement.

Formulas, Mini-Examples & Quick Decision Guide

1) Pearson Correlation (r)

Measures: Linear strength between two continuous variables


        Σ (xᵢ − x̄)(yᵢ − ȳ)
  r =  ------------------------------------
        √[ Σ (xᵢ − x̄)² × Σ (yᵢ − ȳ)² ]
        

Mini-example (Crew Experience vs Delay)

Experience up -> Delay down -> r ≈ -0.99
(very strong negative)
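Working the formula through on the crew data term by term gives r of about -0.99. The sketch below is a direct transcription of the formula, with variable names chosen for illustration.

```python
from math import sqrt

experience = [1, 3, 4, 6, 8, 10]     # years
delay = [22, 19, 18, 15, 13, 12]     # minutes

n = len(experience)
mean_x = sum(experience) / n         # about 5.33
mean_y = sum(delay) / n              # 16.5

# Numerator: sum of products of deviations from the means
cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(experience, delay))

# Denominator: square root of the product of the two sums of squares
ss_x = sum((x - mean_x) ** 2 for x in experience)
ss_y = sum((y - mean_y) ** 2 for y in delay)

r = cross / sqrt(ss_x * ss_y)
print(round(cross, 2), round(ss_x, 2), round(ss_y, 2))  # -63.0 55.33 73.5
print(round(r, 3))                                      # -0.988
```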

2) Spearman Rank Correlation (ρ)

Measures: Monotonic direction based on ranks


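              6 Σ dᵢ²
  ρ = 1 - -----------------
            n (n² - 1)

where dᵢ is the difference between the two ranks of observation i and n is the number of observations. This shortcut assumes no tied ranks.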

Mini-example (Crew Experience vs Delay ranked)

Ranks match in reverse order -> ρ = -1.0 
(perfect monotonic inverse)
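The rank-difference formula can be applied to the crew data in a few lines. This is a minimal sketch that assumes no tied values (true for this dataset); the `ranks` helper is illustrative.

```python
def ranks(values):
    """Rank 1 = smallest value. Assumes every value is distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

experience = [1, 3, 4, 6, 8, 10]
delay = [22, 19, 18, 15, 13, 12]

rank_x = ranks(experience)   # [1, 2, 3, 4, 5, 6]
rank_y = ranks(delay)        # [6, 5, 4, 3, 2, 1]

n = len(rank_x)
d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))  # 70
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(d_squared, rho)        # 70 -1.0
```

Only the ordering enters the calculation, so any strictly decreasing set of delays would give the same ρ.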

3) Kendall’s Tau (τ)

Measures: Agreement — pairwise consistency of ordering


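        (#Concordant pairs - #Discordant pairs)
  τ = ------------------------------------------
               Total number of pairs

A pair of observations is concordant when both variables order it the same way and discordant when they order it oppositely. With n observations there are n(n - 1)/2 pairs.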

Mini-example

If most pairs agree -> τ approaches +1
If most pairs contradict -> τ approaches -1
Crew Experience vs Delay: all 15 pairs are discordant -> τ = -1.0
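Counting the pairs explicitly makes the idea concrete. The sketch below applies the simple pair-count formula (no tie correction) to the crew data, then to a hypothetical variation in which the last two delays are swapped so that exactly one pair changes sides. The function name and the swapped dataset are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Return (concordant, discordant, tau) from the pair-count formula."""
    concordant = discordant = 0
    pairs = list(combinations(zip(x, y), 2))
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1     # both variables order the pair the same way
        elif sign < 0:
            discordant += 1     # the variables order the pair oppositely
    return concordant, discordant, (concordant - discordant) / len(pairs)

experience = [1, 3, 4, 6, 8, 10]
delay = [22, 19, 18, 15, 13, 12]
print(kendall_tau(experience, delay))     # (0, 15, -1.0)

# Hypothetical variation: swap the last two delays
delay_swapped = [22, 19, 18, 15, 12, 13]
c, d, tau = kendall_tau(experience, delay_swapped)
print(c, d, round(tau, 3))                # 1 14 -0.867
```

A single flipped pair moves τ by 2/15, which is why τ reads naturally as a tally of agreements against disagreements.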

Side-by-side Snapshot

Measure        Best For                      Data Type         Sensitivity
Pearson (r)    Linear relationships          Continuous        Highly affected by outliers
Spearman (ρ)   Monotonic trends              Ranks / Ordinal   Stable with outliers
Kendall (τ)    Small samples / consistency   Ranks / Ordinal   Most robust
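The outlier sensitivity in the snapshot can be illustrated with the crew data. In the sketch below, one delay is replaced by a hypothetical extreme value (90 minutes) that keeps the ordering intact; the altered dataset and helper names are assumptions for illustration only.

```python
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

def ranks(values):
    """Rank 1 = smallest value. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman(x, y):
    # Spearman is Pearson computed on the ranks
    return pearson(ranks(x), ranks(y))

experience = [1, 3, 4, 6, 8, 10]
delay = [22, 19, 18, 15, 13, 12]
delay_outlier = [90, 19, 18, 15, 13, 12]   # hypothetical: one extreme delay

r_clean, r_out = pearson(experience, delay), pearson(experience, delay_outlier)
rho_clean, rho_out = spearman(experience, delay), spearman(experience, delay_outlier)

print(round(r_clean, 2), round(r_out, 2))      # -0.99 -0.7
print(round(rho_clean, 2), round(rho_out, 2))  # -1.0 -1.0
```

The outlier is still the largest delay, so the ranks do not change and Spearman stays at -1, while Pearson weakens because the points no longer sit near one straight line. Kendall depends only on ordering as well, so it would be unchanged here too.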

Quick Decision Flow

Is relationship LINEAR? -> Pearson
Is relationship MONOTONIC? -> Spearman
Is dataset SMALL or needs CONSISTENCY? -> Kendall

Real-world usage

  • Aviation: ranking crews on performance metrics → Spearman / Kendall
  • Ops: fuel burn vs payload → Pearson (precise numeric linear)
  • 🧠 Psychology: anxiety level vs test score → Spearman (monotonic, not necessarily linear)
  • 🧠 Education: consistency between multiple evaluators → Kendall

Remember:

Statistics is not about being perfect — it’s about being honest with uncertainty.
It’s the language of reason in a world full of noise.