Artificial Wasteland · A stratum · Number

All Edge, No Middle

THE UNIT n-BALL IN THE n-CUBE · CURSE OF DIMENSIONALITY · MONTE-CARLO vs. EXACT

“In view of all that we have said in the foregoing sections, the many obstacles we appear to have surmounted, what casts the pall over our victory celebration? It is the curse of dimensionality.” — Richard Bellman, Dynamic Programming (1957), coining the phrase

Your sense of what a shape is was trained in three dimensions and never updated. A ball sits snugly inside a box; a point picked at random from a ball is usually somewhere in the middle; two arrows pointing “different ways” can point at almost any angle to each other. All three of these are true in the room you are sitting in — and all three become false, dramatically and provably, as you add dimensions.

This is not a metaphor and not a paradox. Every claim below is a one-line formula you can hold in your hand, and every one can be watched happening here: random points are thrown into a high-dimensional box and counted, and the sampled number converges to the exact one in front of you. Nothing is asserted that is not also shown. What emerges is a single, strange picture: in high dimensions, space has no middle. It is all edge.

1 · The box that is all corner

Drop a ball into a box so it just touches every wall — a circle in a square, a sphere in a cube. In two dimensions the disk fills a generous π/4 ≈ 78.5% of the square. In three, the ball fills π/6 ≈ 52.4% of the cube — already less than half. Keep going. The ball's share of the box is

inscribed ball ÷ box, in n dimensions: R(n) = V(n) / 2ⁿ, where V(n) = π^(n/2) / Γ(n/2 + 1)

and that ratio does not drift down — it falls off a cliff. Slide the dimension up and watch the box empty out. By ten dimensions the ball you inscribed fills barely a quarter of one percent of the box it fits inside; by a hundred, it is a speck of order 10⁻⁷⁰. Where did the room go? Into the corners — and the corners run away from you: the far corner of the box sits a distance √n from the centre while the ball's wall stays fixed at 1. Add dimensions and the box grows spikes.
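The closed form is short enough to evaluate directly. The following is a minimal sketch in Python using only the standard library; the function names are illustrative and are not taken from the page's own verifier. It assumes the box is the cube [-1, 1]ⁿ, so the inscribed ball has radius 1.

```python
import math


def ball_volume(n: int) -> float:
    """Volume of the unit n-ball: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def inscribed_ratio(n: int) -> float:
    """Share of the cube [-1, 1]^n occupied by the inscribed unit ball."""
    return ball_volume(n) / 2.0 ** n


def corner_distance(n: int) -> float:
    """Distance from the centre of [-1, 1]^n to any one of its corners."""
    return math.sqrt(n)


if __name__ == "__main__":
    for n in (2, 3, 10, 100):
        print(f"n={n:3d}  R(n)={inscribed_ratio(n):.3e}  "
              f"far corner at {corner_distance(n):.2f}")
```

Note that `math.gamma` overflows for arguments above roughly 171, so this direct form is only suitable for n up to a few hundred; working with `math.lgamma` would be one way around that.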

[Interactive figure: throw random points into the box · count how many land inside the ball. Controls and readouts: dimension n, sampled (Monte-Carlo) fraction, exact R(n), points thrown, distance to the far corner.]

The gauge fills to the sampled fraction; the tick is the exact R(n). They meet. Notice how few dimensions it takes: at n = 7 only about one point in 27 lands inside the ball, by n = 10 you have to throw hundreds of points to land even one, and past n ≈ 12 the rain almost never hits it. The box is, to any sampler, pure corner.
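The point-rain experiment can be imitated in a few lines. This is a sketch under stated assumptions, not the page's own sampler: points are drawn uniformly from [-1, 1]ⁿ with Python's `random` module, and `sample_ratio` and `exact_ratio` are names chosen here for illustration.

```python
import math
import random


def exact_ratio(n: int) -> float:
    """Closed-form share of [-1, 1]^n taken by the inscribed unit ball."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) / 2.0 ** n


def sample_ratio(n: int, trials: int, rng: random.Random) -> float:
    """Fraction of uniform points in [-1, 1]^n that land inside the unit ball."""
    hits = 0
    for _ in range(trials):
        # Squared distance from the centre for one uniform point in the cube.
        r2 = sum(rng.uniform(-1.0, 1.0) ** 2 for _ in range(n))
        if r2 <= 1.0:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run is repeatable
    for n in (2, 3, 5, 7, 10):
        est = sample_ratio(n, 20_000, rng)
        exact = exact_ratio(n)
        print(f"n={n:2d}  sampled={est:.4f}  exact={exact:.4f}  "
              f"throws per hit ~ {1 / exact:.0f}")
```

The reciprocal 1/R(n) is the average number of throws needed per hit, which is the quantity that makes naive rejection sampling impractical as n grows.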

[Chart: the whole cliff · R(n) for n = 1…20 (bar = share of the box, exact). Columns: n, ball ÷ box, share.]

A curiosity hiding in the same formula: the ball's own volume V(n) — forget the box — rises to a maximum at n = 5 (about 5.264) and then also falls to zero. Adding a sixth dimension to a unit ball makes it smaller. There is simply less and less room near the middle of anything.
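The peak at n = 5 can be checked in two independent ways, in the spirit of the verification described at the end of the page. The sketch below compares the closed form with the two-step recursion V(n) = (2π/n)·V(n−2), seeded with V(0) = 1 and V(1) = 2; the function names are illustrative.

```python
import math


def ball_volume_closed(n: int) -> float:
    """Closed form: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def ball_volume_recursive(n: int) -> float:
    """Recursion V(n) = (2*pi/n) * V(n-2), with V(0) = 1 and V(1) = 2."""
    if n == 0:
        return 1.0
    if n == 1:
        return 2.0
    return 2.0 * math.pi / n * ball_volume_recursive(n - 2)


if __name__ == "__main__":
    for n in range(1, 13):
        closed = ball_volume_closed(n)
        recursive = ball_volume_recursive(n)
        print(f"n={n:2d}  closed={closed:.6f}  recursive={recursive:.6f}")
    peak = max(range(1, 25), key=ball_volume_closed)
    print("volume is largest at n =", peak)
```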

2 · The ball that is all peel

Forget the box; take the ball itself and pick a point inside it uniformly at random. Where does it land? Your intuition says “somewhere in there, probably not right at the edge.” But volume in a ball grows like rⁿ, so the fraction of the ball lying within a radius r of the centre is exactly rⁿ — and for large n that is almost zero until r is almost 1. The point lands in the skin, essentially always.

half the ball's volume lies beyond radius: r½ = (½)^(1/n); at n = 100, r½ = 0.99309 (half the volume sits in the outer 0.69% of the radius)

An orange in a hundred dimensions is all peel and no fruit: 99% of its substance lies in the outermost 4.5% of its radius. The centre — the place your intuition reaches for — is empty.
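Both figures follow from the rⁿ rule by taking an n-th root. A small sketch, with function names chosen here for illustration:

```python
def radius_enclosing(fraction: float, n: int) -> float:
    """Radius r such that a share `fraction` of the unit n-ball lies within r.

    The volume share inside radius r is r^n, so r = fraction^(1/n).
    """
    return fraction ** (1.0 / n)


def shell_thickness(outer_fraction: float, n: int) -> float:
    """Thickness of the outermost shell holding `outer_fraction` of the volume."""
    return 1.0 - radius_enclosing(1.0 - outer_fraction, n)


if __name__ == "__main__":
    n = 100
    print(f"median radius       : {radius_enclosing(0.5, n):.5f}")
    print(f"shell holding 50%   : {shell_thickness(0.50, n):.2%} of the radius")
    print(f"shell holding 99%   : {shell_thickness(0.99, n):.2%} of the radius")
```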

[Interactive figure: sample uniform points in the ball · histogram of how far they land from the centre. Readouts: dimension n, median radius (sampled), exact (½)^(1/n), share of the radius holding the outer half of the volume.]

The bars are sampled radii; the line is the exact density n·rⁿ⁻¹. As you raise n the whole distribution stampedes to the wall at r = 1 and thins to a blade. The most likely place to be, in a high-dimensional ball, is pressed against the outside.
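One common way to draw uniform points in a ball, which may or may not be what the live instrument does, is to pick a direction by normalising a Gaussian vector and a radius by inverse-transform sampling, r = U^(1/n), since the radial CDF is rⁿ. A minimal sketch with illustrative names:

```python
import math
import random
import statistics


def sample_ball_point(n: int, rng: random.Random) -> list[float]:
    """One point drawn uniformly from the unit n-ball."""
    # Direction: a normalised Gaussian vector is uniform on the sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    # Radius: the CDF is r^n, so inverting a uniform draw gives U^(1/n).
    radius = rng.random() ** (1.0 / n)
    return [radius * x / norm for x in g]


def sample_radii(n: int, count: int, rng: random.Random) -> list[float]:
    """Distances from the centre for `count` uniform points in the ball."""
    return [math.sqrt(sum(x * x for x in sample_ball_point(n, rng)))
            for _ in range(count)]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so a run is repeatable
    for n in (2, 5, 20, 100):
        radii = sample_radii(n, 4000, rng)
        print(f"n={n:3d}  sampled median={statistics.median(radii):.4f}  "
              f"exact={(0.5) ** (1.0 / n):.4f}")
```

Rejection sampling from the enclosing cube would also give uniform points, but as the first section shows, its acceptance rate collapses with n, which is why the direct construction is used here.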

3 · Every direction a right angle

Pick two points at random and look at the directions in which they lie from the centre. In two dimensions the angle between them is anything at all, uniformly. Raise the dimension and something rigid takes hold: the two directions become almost exactly perpendicular, nearly every time. The cosine of the angle between two random unit vectors has mean zero and variance 1/n

cosine of the angle between two random directions: E[cos θ] = 0, Var[cos θ] = 1/n, so a typical cos θ ≈ ±1/√n; at n = 1000 the spread is 0.032, which is within ~1.8° of a right angle

so the spread collapses like 1/√n. In a thousand dimensions a random pair of directions sits within two degrees of a right angle. High-dimensional space is a place where nearly everything is orthogonal to nearly everything else — which is exactly why, in the same breath, every point becomes the same distance from every other. There is no near and no far. That is the curse in its purest form.
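The 1/n variance is easy to test by sampling. The sketch below assumes directions are drawn uniformly on the sphere by normalising Gaussian vectors; the function names are illustrative rather than taken from the page's code.

```python
import math
import random
import statistics


def random_unit_vector(n: int, rng: random.Random) -> list[float]:
    """A direction drawn uniformly from the unit sphere in n dimensions."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def cosine_samples(n: int, pairs: int, rng: random.Random) -> list[float]:
    """Cosines of the angle between `pairs` independent random direction pairs."""
    out = []
    for _ in range(pairs):
        u = random_unit_vector(n, rng)
        v = random_unit_vector(n, rng)
        out.append(sum(a * b for a, b in zip(u, v)))  # dot product of unit vectors
    return out


def typical_offset_degrees(n: int) -> float:
    """Angle from perpendicular that corresponds to one standard deviation, 1/sqrt(n)."""
    return math.degrees(math.asin(1.0 / math.sqrt(n)))


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so a run is repeatable
    for n in (2, 10, 100):
        cos = cosine_samples(n, 3000, rng)
        print(f"n={n:4d}  sampled var={statistics.pvariance(cos):.5f}  "
              f"exact 1/n={1 / n:.5f}  "
              f"typical offset={typical_offset_degrees(n):.2f} deg")
```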

[Interactive figure: sample pairs of random directions · histogram of the cosine of the angle between them. Readouts: dimension n, std of cos θ (sampled), exact 1/√n, how close a typical pair is to a right angle (90°).]

The spike sharpens onto 0 as n climbs. The bars are sampled; the Gaussian curve is the exact limiting shape (width 1/√n). By a few hundred dimensions, “point a different way” and “point at a right angle” mean almost the same thing.

Why it is a curse

When every distance is the same distance, the idea of a nearest neighbour quietly dies. Beyer, Goldstein, Ramakrishnan and Shaft proved the sharp version in 1999: for a broad class of distributions, as the dimension grows the nearest point and the farthest point in a sample become indistinguishable, because the ratio of their distances tends to 1. Any method that leans on “which stored example is closest?” (nearest-neighbour classification, clustering, similarity search) loses its footing. So does sampling: a grid with just ten points along each axis, a resolution that is coarse even on a line, needs 10¹⁰ points to cover a 10-dimensional box, and the volume that hides in the corners is volume you can never afford to visit. This is the curse Bellman named, and it is why brute-force search over a high-dimensional space is hopeless.
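Distance concentration can be seen in a small experiment. The sketch below is an illustration under stated assumptions, not a reproduction of the Beyer et al. setting: it draws standard Gaussian points, measures all pairwise Euclidean distances, and reports the relative contrast (Dmax − Dmin)/Dmin. The names and the sample size of 60 points are choices made here.

```python
import itertools
import math
import random
import statistics


def gaussian_cloud(n: int, points: int, rng: random.Random) -> list[list[float]]:
    """`points` independent standard Gaussian vectors in n dimensions."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(points)]


def pairwise_distances(cloud: list[list[float]]) -> list[float]:
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(a, b) for a, b in itertools.combinations(cloud, 2)]


def relative_contrast(distances: list[float]) -> float:
    """(Dmax - Dmin) / Dmin: large when near and far differ, small when they do not."""
    return (max(distances) - min(distances)) / min(distances)


if __name__ == "__main__":
    rng = random.Random(3)  # fixed seed so a run is repeatable
    for n in (2, 10, 100, 1000):
        d = pairwise_distances(gaussian_cloud(n, 60, rng))
        print(f"n={n:5d}  mean distance={statistics.fmean(d):7.3f}  "
              f"sqrt(2n)={math.sqrt(2 * n):7.3f}  "
              f"contrast={relative_contrast(d):.3f}")
```

The exact contrast values depend on the number of points and the seed, so they should not be expected to match figures computed with a different sample.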

Why your embeddings work anyway

And yet high-dimensional machinery — the word and image embeddings that a modern model lives inside, running to hundreds or thousands of dimensions — plainly does work. Two honest reasons, both visible above.

First, the near-orthogonality is a gift, not only a curse. Because two random directions are almost always nearly perpendicular, a high-dimensional space has room for an astronomical number of almost-orthogonal directions, so distinct concepts can be given distinct directions that barely interfere, and a cosine similarity that rises well above the 1/√n noise floor is real signal, not coincidence. It is because random vectors cluster at a right angle that a measured alignment means something. (The same blessing lets you squash the dimension back down almost for free: the Johnson–Lindenstrauss lemma says a random projection to a number of dimensions on the order of log(m)/ε² preserves, with high probability, all pairwise distances among m points to within a factor of 1 ± ε.)
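The Johnson–Lindenstrauss effect can be illustrated with a dense Gaussian random projection. This is a sketch only: the sizes d = 300, k = 150 and m = 15 are hand-picked for a quick pure-Python run and are not derived from the lemma's constants, and the function names are hypothetical.

```python
import itertools
import math
import random


def random_projection_matrix(k: int, d: int, rng: random.Random) -> list[list[float]]:
    """k x d matrix of N(0, 1/k) entries, so squared lengths are preserved on average."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix: list[list[float]], point: list[float]) -> list[float]:
    """Matrix-vector product: maps a d-dimensional point to k dimensions."""
    return [sum(w * x for w, x in zip(row, point)) for row in matrix]


def max_distortion(points: list[list[float]], projected: list[list[float]]) -> float:
    """Largest relative change in any pairwise distance after projection."""
    worst = 0.0
    for i, j in itertools.combinations(range(len(points)), 2):
        ratio = math.dist(projected[i], projected[j]) / math.dist(points[i], points[j])
        worst = max(worst, abs(ratio - 1.0))
    return worst


if __name__ == "__main__":
    rng = random.Random(11)  # fixed seed so a run is repeatable
    d, k, m = 300, 150, 15   # illustrative sizes, not values from the lemma
    points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    matrix = random_projection_matrix(k, d, rng)
    projected = [project(matrix, p) for p in points]
    print(f"{d} -> {k} dimensions, worst pairwise distortion: "
          f"{max_distortion(points, projected):.3f}")
```

Raising k tightens the distortion roughly like 1/√k, which is the ε-dependence the lemma describes.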

Second, and decisively: real data is not uniform in the box. The facts on this page are theorems about points spread evenly through a high-dimensional cube or ball. Real embeddings are not spread evenly — they lie clustered on a thin, curved, far lower-dimensional surface threaded through the big space (the manifold hypothesis; Fefferman, Mitter & Narayanan gave it a testable form in 2016). The ambient dimension is enormous and cursed; the intrinsic dimension the data actually occupies is small and friendly. The curse is real — it is a statement about empty space. Your data escapes it by not being empty space.

The check — every number here, recomputed

The offline verifier research/high-dimensional-geometry/verify.mjs reproduces each headline figure two independent ways (a closed form and a from-scratch recursion or Monte-Carlo estimate) and they agree; the instruments above recompute the sampled numbers live in your browser and converge to the same exact values.

  • Inscribed ball ÷ box: R(2)=π/4=78.54%, R(3)=π/6=52.36%, R(10)≈0.249%, R(100)≈10⁻⁷⁰ — closed form matched by the point-rain sampler.
  • Unit-ball volume V(n) is maximised at n=5 (V₅≈5.264) then falls to zero; closed form π^(n/2)/Γ(n/2+1) equals the recursion V(n)=(2π/n)·V(n−2) to 8 digits, n=1…24.
  • Half a ball's volume lies beyond r=(½)^(1/n); for n=100 that is 0.99309 (the outer 0.69% of the radius), matched by the sampled median radius.
  • Cosine of two random directions: mean 0, Var=1/n; sampled variance tracks 1/n from n=2 to n=1000; at n=1000 a typical pair is within 1.8° of perpendicular.
  • Mean pairwise distance of Gaussian points ≈ √(2n), and the relative contrast (Dmax−Dmin)/Dmin shrinks with n (≈2700 at n=2 → 0.18 at n=1000): distances concentrate.

All 60 offline checks pass. What is proven here is geometry of the uniform ball/cube; what is assumed in the last section — that real embeddings live on a low-dimensional manifold — is an empirical hypothesis, named as such, not a theorem.