The Pitch Doesn't Slide
A siren passing you doesn't glide down in pitch. It holds one high note while it comes, drops fast right as it passes, then holds one low note going away. Two plateaus and a step — not a slope. Send it past your ear and listen.
Watch the rings. Ahead of the source they pile up — each new crest is launched from a little closer to you, so they arrive bunched: a higher pitch. Behind, they spread out: a lower pitch. The source's own note never changes. Only the spacing you receive does.
Now watch the graph, not the rings. The pitch you hear is nearly flat and high for the whole approach, then falls through the source's true note in a rush, then is nearly flat and low. The closer it passes, the sharper that fall — at a few metres it's almost a cliff; from far away it's a gentle ramp. The “whole-way-down slide” people describe is really the brief middle, stretched in memory to cover the whole event.
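If you'd rather see that curve as numbers, here is a minimal sketch. It assumes a geometry the text doesn't spell out: the source runs in a straight line at constant speed and passes you at perpendicular distance d. The name emittedPitch and its option names are made up for illustration, and the pitch is indexed by when each crest leaves the source, not by when it reaches you.

```javascript
// Sketch only: pitch carried by the crest emitted at time tEmit (seconds),
// with t = 0 at closest approach. Assumes a straight-line pass at constant
// speed vs, at perpendicular distance d > 0 from a listener at rest.
function emittedPitch(tEmit, { f0 = 1000, v = 343, vs = 30, d = 10 } = {}) {
  const x = vs * tEmit;          // metres along the track, past closest approach
  const gap = Math.hypot(x, d);  // source-to-listener distance
  const vr = (vs * x) / gap;     // rate the gap is opening (negative on approach)
  return (f0 * v) / (v + vr);
}

// Same five instants, three passing distances.
for (const d of [3, 10, 60]) {
  const row = [-2, -0.5, 0, 0.5, 2].map(t => emittedPitch(t, { d }).toFixed(1));
  console.log(`d = ${d} m:`, row.join("  "));
}
```

At d = 3 m the row is close to two flat values with a jump between them; at 60 m the same five instants sit on a gentle ramp.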
The bit nobody mentions
There's a delay built into hearing. The note that marks closest approach (the exact instant the pitch equals the siren's real pitch) leaves the source when it's directly level with you, but the sound still has to cross the gap to reach your ear. So you hear “the middle” a fraction of a second after the source is actually nearest you: a lag of exactly the gap divided by the speed of sound. At 10 m and 343 m/s that's about 29 ms. The dot is already pulling away by the time your ears report the turn.
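The lag is one division, and one multiplication more tells you how far the source has moved on by the time you hear the middle. A small sketch, using the 10 m gap from the text plus a 343 m/s speed of sound and a 30 m/s source as assumed defaults; middleAsHeard is a hypothetical helper name.

```javascript
// Sketch only: when you hear the crest emitted at closest approach,
// and where the source has got to by then.
function middleAsHeard({ v = 343, vs = 30, d = 10 } = {}) {
  const lagSeconds = d / v;            // travel time across the gap
  const metresPast = vs * lagSeconds;  // distance the source covers meanwhile
  return { lagSeconds, metresPast };
}

const { lagSeconds, metresPast } = middleAsHeard();
console.log(
  `${(lagSeconds * 1000).toFixed(1)} ms late, ` +
  `source ${metresPast.toFixed(2)} m past closest approach`
);
// prints: 29.2 ms late, source 0.87 m past closest approach
```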
The check — every number here, recomputed live
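The pitch you receive from a wavefront is set by how fast the gap to the source is changing when that wavefront leaves. In one line (v = speed of sound, vs = source speed, vr = how fast the gap is opening):
f′ = f₀ · v / (v + vr), where vr runs from −vs far ahead of you, through 0 at closest approach, to +vs far behind.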
- At your current settings (vs = 30.0 m/s, f₀ = 1000 Hz, v = 343 m/s): approaching 1095.85 Hz, receding 919.57 Hz, and exactly 1000 Hz at closest approach (the gap is momentarily not changing).
- The plateau ratio is 1.1917 = (v+vs)/(v−vs), a drop of 3.04 semitones — and that ratio doesn't depend on how close it passes, only on the speed.
- The drop is not symmetric about f₀: +1.58 semitones up to the high plateau but only −1.45 down to the low one. The same holds in Hz: the high plateau sits 95.85 Hz above f₀, the low one only 80.43 Hz below.
- You hear closest approach 29.2 ms late — the gap (10 m) divided by v.
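The four bullets can be recomputed in a few lines. This is a sketch, not the page's own code: plateauNumbers and its field names are hypothetical, and the plateaus are taken as the far-away limits where the gap changes at the full source speed.

```javascript
// Sketch only: recompute the quoted figures from f0, v, vs and the gap d.
function plateauNumbers({ f0 = 1000, v = 343, vs = 30, d = 10 } = {}) {
  const semitones = ratio => 12 * Math.log2(ratio);  // equal-tempered interval size
  const high = (f0 * v) / (v - vs);                  // gap closing at vs
  const low = (f0 * v) / (v + vs);                   // gap opening at vs
  return {
    high,
    low,
    ratio: high / low,                   // equals (v + vs) / (v - vs)
    dropSemitones: semitones(high / low),
    upSemitones: semitones(high / f0),
    downSemitones: semitones(low / f0),  // negative: below f0
    lagMs: (1000 * d) / v,
  };
}

for (const [name, value] of Object.entries(plateauNumbers())) {
  console.log(name.padEnd(14), value.toFixed(4));
}
```

Changing d moves only lagMs; the plateaus and their ratio depend on the speeds alone.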
The closed form above is checked against a second, independent method that knows nothing of it: a wavefront-counting simulation that just emits a crest every 1/f₀ seconds from the moving source, propagates each at v, and times the arrivals. The two agree to a maximum relative error of 3×10⁻⁸ across 6,000 samples. Both live in research/doppler-siren/verify.mjs — run it yourself.
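For a feel of what such a cross-check looks like, here is a minimal sketch of the wavefront-counting idea. It is not the contents of verify.mjs: the straight-line geometry, the choice to compare at the midpoint between two emissions, and every name below are assumptions, with only the 6,000-crest count borrowed from the text.

```javascript
// Sketch only. Closed form, evaluated at an emission time (t = 0 at closest approach).
function closedFormPitch(tEmit, f0, v, vs, d) {
  const x = vs * tEmit;
  return (f0 * v) / (v + (vs * x) / Math.hypot(x, d));
}

// Counting method: emit a crest every 1/f0 s, add its travel time,
// and read the received pitch off the spacing of consecutive arrivals.
function countedVsClosedForm({ f0 = 1000, v = 343, vs = 30, d = 10, crests = 6000 } = {}) {
  const period = 1 / f0;
  const tStart = -(crests / 2) * period;  // centre the run on closest approach
  const arrival = k => {
    const tEmit = tStart + k * period;
    return tEmit + Math.hypot(vs * tEmit, d) / v;  // emission time + travel time
  };
  let worst = 0;
  for (let k = 0; k < crests - 1; k++) {
    const counted = 1 / (arrival(k + 1) - arrival(k));
    const tMid = tStart + (k + 0.5) * period;  // compare midway between the two crests
    const predicted = closedFormPitch(tMid, f0, v, vs, d);
    worst = Math.max(worst, Math.abs(counted - predicted) / predicted);
  }
  return worst;  // maximum relative disagreement
}

console.log("max relative error:", countedVsClosedForm().toExponential(1));
```

The counting loop never touches the Doppler formula; it only knows when crests leave and how long each takes to arrive, which is what makes the agreement meaningful.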
So why does everyone say it slides?
Because for a source that passes close and fast, the transition is so brief and the change so vivid that the two plateaus get forgotten and only the drop is remembered — and a drop, replayed in the mind, becomes a slide. Push the “how close” slider out to 50–60 m and the truth comes back: a long flat high, a slow sag, a long flat low. The physics was two plateaus all along.
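One way to put a number on "brief" is to time how long the gap's rate of change takes to swing from 80% of the source speed closing to 80% opening. The 80% threshold is an arbitrary choice for illustration, the straight-line pass is assumed, and transitionSeconds is a hypothetical name.

```javascript
// Sketch only: seconds during which vr sweeps from -fraction*vs to +fraction*vs.
// From vr/vs = x / sqrt(x^2 + d^2) = fraction it follows that
// x = d * fraction / sqrt(1 - fraction^2) on either side of closest approach.
// The gap is equal at both ends, so the heard span matches the emitted span.
function transitionSeconds(d, vs = 30, fraction = 0.8) {
  const halfWidth = (d * fraction) / Math.sqrt(1 - fraction * fraction);
  return (2 * halfWidth) / vs;
}

for (const d of [3, 10, 60]) {
  console.log(`d = ${d} m: ${transitionSeconds(d).toFixed(2)} s`);
}
```

At 30 m/s the swing takes a little under a second at 10 m and more than five seconds at 60 m, and it scales in direct proportion to the passing distance.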
Honest apparatus
Idealizations, named: a point source in still air (no wind), no reflections or echoes, and the observer at rest (only the source moves; moving-observer Doppler has a slightly different form, f′ = f₀(v+vo)/v with vo the observer's speed toward the source, not used here). Speed of sound taken as 343 m/s (dry air, 20 °C); it rises ~0.6 m/s per °C, which shifts the plateau Hz a little but not the shape. The rings you see are slowed: a real 1000 Hz source emits a thousand crests a second, far too many to draw, so the visual crest rate is reduced for the eye; the audio uses the real f₀ curve, and the geometry (bunching ahead, stretching behind) is exact at any rate. The drop is given in equal-tempered semitones, 12·log₂(ratio), a convention for naming musical interval size, not a claim about perception. Real sirens often wail (sweep f₀ on purpose); here f₀ is held fixed so the Doppler shift is the only thing moving the pitch.
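To see how little the air temperature matters, here is a rough sketch that linearises the speed of sound around the quoted 343 m/s at 20 °C with the quoted 0.6 m/s per °C. The linear fit is an assumption that only holds over everyday temperatures, and plateausAt is a hypothetical name.

```javascript
// Sketch only: plateau pitches at a given air temperature, using a linear
// approximation for the speed of sound built from the two figures in the text.
function plateausAt(tempC, { f0 = 1000, vs = 30 } = {}) {
  const v = 343 + 0.6 * (tempC - 20);  // m/s, assumed linear near 20 °C
  return { v, high: (f0 * v) / (v - vs), low: (f0 * v) / (v + vs) };
}

for (const tempC of [0, 20, 35]) {
  const { v, high, low } = plateausAt(tempC);
  console.log(`${tempC} °C: v = ${v.toFixed(1)} m/s, high ${high.toFixed(1)} Hz, low ${low.toFixed(1)} Hz`);
}
```

Across that range the plateaus move by a few Hz each, while the two-plateaus-and-a-step shape stays put.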