A stratum · what the survivors can't tell you

The Planes That Didn't Come Back

In 1943 the bombers returned from Europe covered in bullet holes, and the military asked the obvious question: where do we add armor? The obvious answer, where the holes are, is exactly wrong. The holes are a map of where a bomber can be hit and still come home. Add the armor where the survivors are unmarked.

The mathematician Abraham Wald, working for the Statistical Research Group at Columbia, saw the trap. Every plane the brass could study had one thing in common: it came back. The planes hit in the truly fatal spots weren't in the data — they were in the North Sea. So the survivors' damage is survival-filtered: holes pile up where damage is survivable, and thin out exactly where it kills. The empty patches on the returning planes mark the lethal regions. This is survivorship bias, and below you can run it.

Fly the fleet

Each region has a hidden lethality — the chance a single hit there downs the plane. Drag it up and watch two things: the plane is lost more often, and the holes on the planes that do return drain away from that region. You only ever see the survivors' holes (amber). The blue is what you're not shown: where the flak actually lands. Wald's move is to compare them.

Instrument: set each region's lethality; read the survivor damage

Sample readout: survival rate 37.2%; of 100 bombers, 63 are lost.

Reinforce the holes (naïve): fuselage, where survivors carry the most damage.
Reinforce the gaps (Wald): engine, with the fewest holes per unit of exposure and therefore the deadliest hit.

Per-region readout: survivor holes · exposure · recovered lethality

Recovered from the survivors alone (areas + loss rate known): every region's true lethality, to the decimal.
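
For readers without the live instrument, the fleet can be roughly imitated in a few lines. This is a minimal sketch, not the page's own code: it assumes the number of hits per plane is Poisson with mean `LAM`, and the region names other than fuselage and engine, along with every number in `AREA` and `LETHALITY`, are made up for illustration.

```python
import math
import random

# Hypothetical setup: only "fuselage" and "engine" are named in the text;
# the other regions and all numbers here are illustrative assumptions.
AREA = {"fuselage": 0.45, "wings": 0.30, "tail": 0.15, "engine": 0.10}       # exposure shares a_i
LETHALITY = {"fuselage": 0.05, "wings": 0.10, "tail": 0.15, "engine": 0.60}  # hidden v_i
LAM = 6.0  # assumed mean number of hits per plane


def poisson(rng, lam):
    """Draw a Poisson(lam) count using Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def fly_fleet(n_planes, seed=1943):
    """Simulate a fleet; return (survival rate, all-hit counts, survivor-hole counts)."""
    rng = random.Random(seed)
    names, weights = list(AREA), list(AREA.values())
    all_hits = dict.fromkeys(names, 0)        # where the flak actually lands
    survivor_holes = dict.fromkeys(names, 0)  # what the analyst gets to see
    returned = 0
    for _ in range(n_planes):
        # Each hit lands in a region with probability equal to its area share.
        hits = rng.choices(names, weights=weights, k=poisson(rng, LAM))
        for region in hits:
            all_hits[region] += 1
        # The plane is lost if any single hit turns out to be fatal.
        if any(rng.random() < LETHALITY[region] for region in hits):
            continue
        returned += 1
        for region in hits:
            survivor_holes[region] += 1
    return returned / n_planes, all_hits, survivor_holes


def shares(counts):
    """Normalise raw counts to fractions that sum to 1."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    rate, all_hits, survivor_holes = fly_fleet(50_000)
    flak, seen = shares(all_hits), shares(survivor_holes)
    print(f"survival rate: {rate:.1%}")
    for name in AREA:
        print(f"{name:9s} flak {flak[name]:.3f}   survivor holes {seen[name]:.3f}")
```

With these made-up settings the engine takes about a tenth of the flak but shows up in well under a tenth of the survivors' holes, which is the gap Wald went looking for.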

Why the gaps, not the holes

Hits scatter across the plane in proportion to each region's exposed area — call that share aᵢ. A hit in region i downs the plane with probability vᵢ (its lethality). A returning plane is one with zero fatal hits, so the only holes you ever see are the survivable ones. Work it through and the survivor holes land in region i with probability

qᵢ ∝ aᵢ · (1 − vᵢ)

That is the whole bias in one line. Holes-per-area, qᵢ ⁄ aᵢ, is proportional to (1 − vᵢ): the more lethal a region, the fewer survivor holes it shows relative to its size. The naïve eye reads qᵢ and armors the biggest pile of holes — the fuselage, which is huge and rarely fatal. Wald reads qᵢ ⁄ aᵢ and armors the region the survivors are quietly missing — the engine.
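
The two readings can be set side by side in code. The sketch below assumes a four-region plane with invented areas and lethalities (only fuselage and engine are named in the text), computes qᵢ from the proportionality above, and then picks a region to armor each way.

```python
# Hypothetical setup: only "fuselage" and "engine" come from the text;
# the other regions and all numbers are illustrative assumptions.
AREA = {"fuselage": 0.45, "wings": 0.30, "tail": 0.15, "engine": 0.10}       # a_i
LETHALITY = {"fuselage": 0.05, "wings": 0.10, "tail": 0.15, "engine": 0.60}  # v_i


def survivor_hole_shares(area, lethality):
    """q_i proportional to a_i * (1 - v_i), normalised to sum to 1."""
    weights = {r: area[r] * (1.0 - lethality[r]) for r in area}
    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}


def naive_pick(q):
    """Armor the region with the biggest pile of survivor holes."""
    return max(q, key=q.get)


def wald_pick(q, area):
    """Armor the region with the fewest survivor holes per unit of exposure."""
    return min(q, key=lambda r: q[r] / area[r])


if __name__ == "__main__":
    q = survivor_hole_shares(AREA, LETHALITY)
    for region in AREA:
        print(f"{region:9s} q = {q[region]:.3f}   q/a = {q[region] / AREA[region]:.3f}")
    print("naive:", naive_pick(q))       # fuselage
    print("Wald: ", wald_pick(q, AREA))  # engine
```

The only difference between the two functions is the division by `area[r]`: drop the exposure baseline and the largest, safest region wins.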

The inputs the naïve reading throws away are aᵢ, the exposure baseline, and the count of planes that never returned. Keep them, and you can do better than rank the regions: you can recover every lethality exactly. Write Z = Σⱼ aⱼ · (1 − vⱼ) for the chance that a single hit is survivable, so that qᵢ = aᵢ · (1 − vᵢ) ⁄ Z. If the number of hits per plane is Poisson with mean λ (which is what the next formula presumes), the fraction lost fixes that scale, Z = 1 + ln P(survive) ⁄ λ, and then vᵢ = 1 − qᵢ · Z ⁄ aᵢ reads each region's true danger straight off the survivors' damage. The instrument above does this live, in the red bars.
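
The recovery can be checked as a round trip. This sketch rests on assumptions the text does not spell out: λ is taken to be the mean number of hits per plane, hit counts are taken to be Poisson (so that P(survive) = exp(−λ · (1 − Z))), and the regions and numbers are invented. `forward` plays the part of the war, producing what the analyst sees; `recover_lethality` applies the two formulas above.

```python
import math

# Hypothetical setup: only "fuselage" and "engine" come from the text;
# the other regions and all numbers are illustrative assumptions.
AREA = {"fuselage": 0.45, "wings": 0.30, "tail": 0.15, "engine": 0.10}            # a_i
TRUE_LETHALITY = {"fuselage": 0.05, "wings": 0.10, "tail": 0.15, "engine": 0.60}  # v_i
LAM = 6.0  # assumed mean number of hits per plane (Poisson)


def forward(area, lethality, lam):
    """What the analyst observes: survivor hole shares q_i and P(survive)."""
    z = sum(area[r] * (1.0 - lethality[r]) for r in area)  # chance one hit is survivable
    q = {r: area[r] * (1.0 - lethality[r]) / z for r in area}
    p_survive = math.exp(-lam * (1.0 - z))  # Poisson: no fatal hit arrives
    return q, p_survive


def recover_lethality(q, area, p_survive, lam):
    """Invert the filter: Z = 1 + ln P(survive) / lam, then v_i = 1 - q_i * Z / a_i."""
    z = 1.0 + math.log(p_survive) / lam
    return {r: 1.0 - q[r] * z / area[r] for r in area}


if __name__ == "__main__":
    q, p_survive = forward(AREA, TRUE_LETHALITY, LAM)
    recovered = recover_lethality(q, AREA, p_survive, LAM)
    print(f"P(survive) = {p_survive:.3f}")
    for region in AREA:
        print(f"{region:9s} true {TRUE_LETHALITY[region]:.2f}   recovered {recovered[region]:.2f}")
```

Note that `recover_lethality` never sees `TRUE_LETHALITY`; it gets only the survivors' hole shares, the areas, the survival rate, and the hit rate. In a real fleet `q` and `p_survive` would be noisy estimates, so the recovery would be approximate rather than exact.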

The same hole in everything

Wald's bombers are the cleanest case, but the same shape appears wherever a sample is filtered by the very outcome you're studying: you measure the survivors and forget the fallen.

In each such case the fix is Wald's: ask what got filtered out of the sample by the thing I'm trying to measure, and put it back. The gap in the data is the data.

The check

Every number here is re-derived offline, both in closed form and with a seeded Monte-Carlo fleet, in research/the-planes-that-didnt-come-back/verify.mjs.

Reproduce: node research/the-planes-that-didnt-come-back/verify.mjs — 18/18 pass.