Practical Work in Commercial Airlines

Crew Planning with Scatter Plot Thinking

Context Building: Scatter Plots in Crew Planning

Crew planning in commercial airlines combines schedule design, crew legality, duty-time limits, reserve coverage, airport constraints, and disruption recovery. This creates a planning environment where many variables move together and simple averages are often not enough.

A scatter plot helps planners see relationships between two operational variables at the same time. Instead of only reading tables, teams can quickly detect clusters, outliers, trend direction, and unusual stations or pairings.

Why Scatter Plots Are Useful for Crew Planning

Pattern discovery: identify whether delay exposure grows with duty length or sector count.
Outlier detection: spot pairings or bases with unusual delay or legality pressure.
Capacity planning: compare reserve coverage against disruption volume.
Decision support: guide roster redesign using evidence rather than assumptions.

Typical Crew-Planning Scatter Plot Questions

Duty Hours vs Delay Minutes: do longer duties face more operational disruption?
Sectors per Duty vs Turnaround Buffer: where does schedule tightness become risky?
Reserve Crew % vs Flights Recovered: what reserve level gives practical benefit?
Sign-in Time (local) vs Late Departure Rate: are certain report windows more fragile?

Scatter plots do not prove causation, but they are excellent for showing where to investigate next. In crew planning, that makes them a strong first-step tool before building optimization or machine-learning models.

Data Preparation Points (Important)

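Crew-planning scatter plots are only useful when both variables are prepared at the same level: flight leg, pairing, duty period, or roster line. Mixing levels can create misleading relationships. For example, plotting leg-level delay against duty-level hours repeats the same duty value for every leg in that duty.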
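As a minimal sketch of bringing data to one level, the snippet below rolls hypothetical flight-leg records up to one row per duty period before any plotting. The column names (duty_id, leg_delay_minutes, block_hours) and the values are assumptions for illustration, not a real airline schema.

```python
import pandas as pd

# Hypothetical leg-level records; column names and values are illustrative only.
legs = pd.DataFrame({
    "duty_id": ["D1", "D1", "D1", "D2", "D2"],
    "leg_delay_minutes": [5, 0, 12, 20, 3],
    "block_hours": [1.5, 2.0, 1.8, 3.2, 3.0],
})

# Aggregate to one row per duty period so both plotted variables share a level.
duty = (
    legs.groupby("duty_id")
    .agg(
        sectors=("leg_delay_minutes", "size"),       # number of legs in the duty
        delay_minutes=("leg_delay_minutes", "sum"),  # accumulated delay per duty
        block_hours=("block_hours", "sum"),          # total block time per duty
    )
    .reset_index()
)

print(duty)
```

The resulting duty table can then feed a duty-level scatter plot such as sectors against accumulated delay.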

Use consistent timestamps (UTC vs local time clearly defined)
Remove invalid records (negative delay, missing duty end, duplicated pairings)
Separate planned vs actual values to avoid interpretation errors
Tag operational context such as base, fleet, season, weather class, or route type
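The preparation points above can be sketched in pandas roughly as follows. The dataset, the column names, and the choice to treat a pairing plus its duty start as the duplicate key are all assumptions for illustration; a real cleaning rule set would come from the airline's own data definitions.

```python
import pandas as pd

# Hypothetical duty-level extract; names and values are illustrative only.
raw = pd.DataFrame({
    "pairing_id": ["P1", "P2", "P2", "P3", "P4"],
    "base": ["DXB", "DOH", "DOH", "SIN", "LHR"],
    "duty_start_utc": ["2024-01-05 04:00", "2024-01-05 06:30", "2024-01-05 06:30",
                       "2024-01-05 22:15", "2024-01-06 05:00"],
    "duty_end_utc": ["2024-01-05 13:10", "2024-01-05 15:00", "2024-01-05 15:00",
                     None, "2024-01-06 14:45"],
    "delay_minutes": [12, 25, 25, 40, -5],
})

clean = raw.copy()

# Parse timestamps as timezone-aware UTC so every record uses one reference.
clean["duty_start_utc"] = pd.to_datetime(clean["duty_start_utc"], utc=True)
clean["duty_end_utc"] = pd.to_datetime(clean["duty_end_utc"], utc=True)

clean = (
    clean.dropna(subset=["duty_end_utc"])   # missing duty end
    .query("delay_minutes >= 0")            # negative delay treated as invalid
    .drop_duplicates(subset=["pairing_id", "duty_start_utc"])  # duplicated pairings
    .reset_index(drop=True)
)

# Derive actual duty hours from the cleaned timestamps.
clean["duty_hours"] = (
    (clean["duty_end_utc"] - clean["duty_start_utc"]).dt.total_seconds() / 3600
)

print(clean[["pairing_id", "base", "duty_hours", "delay_minutes"]])
```

Keeping the base column through the cleaning step preserves the operational context needed for later segmentation.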

How Teams Use the Result

After spotting a pattern, analysts usually segment the data by base, fleet, route family, or season. That second step confirms whether a trend is network-wide or only a local operational issue.
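One hedged way to run that second step is to compare the network-wide correlation with the correlation inside each segment. The sketch below uses made-up data for two bases and Pearson correlation as the summary; both the data and the choice of metric are assumptions for illustration.

```python
import pandas as pd

# Hypothetical duty records for two bases; values are illustrative only.
df = pd.DataFrame({
    "base": ["DXB"] * 5 + ["LHR"] * 5,
    "duty_hours": [6.0, 7.5, 9.0, 10.5, 12.0, 6.0, 7.5, 9.0, 10.5, 12.0],
    "delay_minutes": [8, 15, 22, 34, 50, 20, 18, 22, 19, 21],
})

# Network-wide Pearson correlation between duty length and delay.
overall_r = df["duty_hours"].corr(df["delay_minutes"])

# The same correlation computed separately for each base.
per_base_r = {
    base: grp["duty_hours"].corr(grp["delay_minutes"])
    for base, grp in df.groupby("base")
}

print(f"overall: {overall_r:.2f}")
for base, r in per_base_r.items():
    print(f"{base}: {r:.2f}")
```

In this sample the relationship is strong at one base and weak at the other, which is the kind of result that points to a local issue rather than a network-wide one.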

In practice, scatter plots support discussions between crew planning, operations control, and station teams because the chart is visual, fast to interpret, and easy to compare across scenarios.

Analysis and Sample Python Scatter Plots

The examples below build scatter plots for crew-planning analysis with pandas and matplotlib. They use small sample datasets; the same structure applies once the DataFrame is loaded from an airline operations dataset.

1. Basic Scatter: Duty Hours vs Delay Minutes

Objective: check whether longer duty periods are associated with higher accumulated delay.

import pandas as pd
import matplotlib.pyplot as plt

# Example crew duty-level dataset
df = pd.DataFrame({
    "duty_hours": [6.2, 7.5, 8.1, 9.0, 10.2, 11.4, 7.0, 8.8, 9.6, 12.1],
    "delay_minutes": [8, 12, 18, 20, 31, 45, 10, 23, 29, 54]
})

plt.figure(figsize=(8, 5))
plt.scatter(df["duty_hours"], df["delay_minutes"], color="#1a5276", alpha=0.75, s=70)
plt.title("Crew Planning: Duty Hours vs Delay Minutes")
plt.xlabel("Duty Hours (actual)")
plt.ylabel("Total Delay Minutes")
plt.grid(alpha=0.25)
plt.show()

What to look for: upward trend, isolated outliers, and thresholds (for example, disruption growing after 9-10 duty hours).
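To back the visual reading with numbers, a rough sketch is to compute the correlation and then average delay within duty-hour bands. The band edges below (6, 8, 10, 13 hours) are an assumption chosen for this sample, not a regulatory or operational threshold.

```python
import pandas as pd

# Same sample duty-level data as in the scatter plot above.
df = pd.DataFrame({
    "duty_hours": [6.2, 7.5, 8.1, 9.0, 10.2, 11.4, 7.0, 8.8, 9.6, 12.1],
    "delay_minutes": [8, 12, 18, 20, 31, 45, 10, 23, 29, 54],
})

# Pearson correlation as a one-number summary of the upward trend.
r = df["duty_hours"].corr(df["delay_minutes"])

# Assumed duty-hour bands; left edge inclusive, right edge exclusive.
bins = [6, 8, 10, 13]
labels = ["6-8 h", "8-10 h", "10-13 h"]
df["duty_band"] = pd.cut(df["duty_hours"], bins=bins, labels=labels, right=False)

# Record count and mean delay per band to see where delay steps up.
band_summary = (
    df.groupby("duty_band", observed=True)["delay_minutes"].agg(["count", "mean"])
)

print(f"correlation: {r:.2f}")
print(band_summary)
```

A jump in mean delay between adjacent bands suggests where a threshold might sit, though with only a few records per band it is a prompt for investigation rather than a conclusion.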

2. Operational Scatter with Size + Color Encoding

Objective: compare reserve coverage and delay recovery by base, while visualizing flight volume.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "base": ["DXB", "DOH", "SIN", "LHR", "FRA", "JFK"],
    "reserve_coverage_pct": [8, 10, 7, 12, 9, 11],
    "recovered_flights_pct": [72, 81, 68, 86, 77, 83],
    "daily_flights": [310, 280, 190, 340, 260, 220]
})

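# Illustrative colors only; they do not encode a variable in this sample
colors = ["#1a5276", "#239b56", "#c0392b", "#1a5276", "#239b56", "#c0392b"]
# Marker area scales with daily flight volume
sizes = df["daily_flights"] * 2.2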

plt.figure(figsize=(9, 5.5))
plt.scatter(
    df["reserve_coverage_pct"],
    df["recovered_flights_pct"],
    s=sizes,
    c=colors,
    alpha=0.7,
    edgecolors="white",
    linewidth=1
)

# Label each point with its base code, offset slightly from the marker
for _, row in df.iterrows():
    plt.text(row["reserve_coverage_pct"] + 0.1, row["recovered_flights_pct"] + 0.2, row["base"], fontsize=9)

plt.title("Crew Planning: Reserve Coverage vs Flights Recovered")
plt.xlabel("Reserve Coverage (%)")
plt.ylabel("Recovered Flights During Disruption (%)")
plt.grid(alpha=0.25)
plt.show()

3. Add a Trend Line for Faster Decision Support

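A trend line helps leadership read direction quickly during planning reviews. The example fits a straight least-squares line with np.polyfit; on a sample this small it describes direction only and should not be read as a forecast.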

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "duty_hours": [6.2, 7.5, 8.1, 9.0, 10.2, 11.4, 7.0, 8.8, 9.6, 12.1],
    "delay_minutes": [8, 12, 18, 20, 31, 45, 10, 23, 29, 54]
})

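x = df["duty_hours"].to_numpy()
y = df["delay_minutes"].to_numpy()
# Fit a straight line (degree-1 polynomial): m is the slope, b the intercept
m, b = np.polyfit(x, y, 1)
# Evaluate the line on sorted x values so it is drawn once, left to right
x_line = np.sort(x)

plt.figure(figsize=(8, 5))
plt.scatter(x, y, color="#239b56", alpha=0.75, s=70, label="Duty records")
plt.plot(x_line, m * x_line + b, color="#c0392b", linewidth=2, label="Trend line")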
plt.title("Duty Hours vs Delay Minutes (with Trend)")
plt.xlabel("Duty Hours")
plt.ylabel("Delay Minutes")
plt.legend()
plt.grid(alpha=0.25)
plt.show()

Recommended workflow: scatter plot -> segment by base/fleet/season -> validate with operational context -> act. This keeps crew-planning decisions evidence-based and operationally realistic.

Analysis Notes for Airline Teams

Use filters for fleet/base/season to avoid mixed-pattern charts.
Mark outliers and verify whether they are data errors, weather events, or schedule-design issues.
Compare planned vs actual scatter plots to measure robustness of the roster design.
Use scatter plots regularly in review cycles, not only after disruptions.
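For the outlier note above, one simple sketch is to flag records whose distance from a fitted line is unusually large. The data below is the sample duty set plus one made-up record (7.2 duty hours, 60 delay minutes), and the cutoff of 2 standard deviations is an assumed rule of thumb, not an operational standard.

```python
import numpy as np
import pandas as pd

# Sample duty data plus one hypothetical outlier as the last record.
df = pd.DataFrame({
    "duty_hours": [6.2, 7.5, 8.1, 9.0, 10.2, 11.4, 7.0, 8.8, 9.6, 12.1, 7.2],
    "delay_minutes": [8, 12, 18, 20, 31, 45, 10, 23, 29, 54, 60],
})

# Fit a straight line and measure each record's vertical distance from it.
slope, intercept = np.polyfit(df["duty_hours"], df["delay_minutes"], 1)
df["residual"] = df["delay_minutes"] - (slope * df["duty_hours"] + intercept)

# Standardize residuals and flag those beyond the assumed cutoff.
df["residual_z"] = df["residual"] / df["residual"].std()
outliers = df[df["residual_z"].abs() > 2]

print(outliers[["duty_hours", "delay_minutes", "residual_z"]])
```

Flagged records are candidates for review only; whether each one is a data error, a weather event, or a schedule-design issue still needs operational context.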

A good scatter plot turns complex crew-planning data into clear operational insight.

When used with clean data and domain judgment, it helps airlines see patterns early, question assumptions, and design stronger schedules.