When the NCAA Tournament rolls around there's an inevitable flurry of blog posts and news articles about some fellow or another who has predicted the Tournament outcome by running a Tournament simulation a million times! Now that's impressive!
Or maybe not.
These simulations are nothing more than taking someone's win probabilities (usually Pomeroy or Sagarin, since these are available with little effort) and then rolling a die against those probabilities for each of the 63 games. On a modern computer you can do this a million times in a second with no real strain.
More importantly, though, does running this sort of simulation a million times actually reveal anything interesting?
Imagine that we decided to do this for just the title game. In our little thought experiment, the title game this year has (most improbably) come down to Duke versus Furman, thanks in no small part to Furman's huge upset of the University of Kentucky in their opening round game.
(Furman -- one of the worst teams in the nation, with only 5 wins in the lowly Southern Conference -- has somehow won through to the conference title game and actually does have a chance to get to the Tournament. If this happens, they'll undoubtedly be the worst 16 seed and matched up against UK in Louisville. So this is totally a plausible scenario.)
We look up the probability of Duke beating Furman in our table of Jeff Sagarin's strengths (or Ken Pomeroy, whomever it was) and we see that Duke is favored to win that game 87% of the time. So now we're ready to run our simulation.
We run our simulation a million times. No, wait. We want to be as accurate as possible for the Championship game, so we run it ten million times.
(We have plenty of time to do this while Jim Nantz narrates a twenty minute piece on the unlikely Furman Paladins and their quixotic quest to win the National Championship. This includes a long interview with a frankly baffled Coach Calipari.)
We anxiously watch the results tally as our simulation progresses. (Or rather we don't, because the whole thing finishes before we can blink, but I'm using some dramatic license here.) Finally our simulation is complete, and we proudly announce that in ten million simulated games, Duke won 8,700,012 of the games! Whoo hoo!
But wait.
The sharp-eyed amongst you might have noticed that Duke's 8,700,012 wins out of 10,000,000 is almost the same percentage as our original winning probability that we borrowed from Ken Pomeroy. (Or Jeff Sagarin, whomever it was.) Well, no kidding. It had better be, or our random number generator is seriously broken.
Welcome to the Law of Large Numbers. To quote Wikipedia: "[T]he average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed." The more times we run this "simulation," the closer we can expect to land to exactly 87%.
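If you'd like to watch the Law of Large Numbers at work, here is a minimal Python sketch of the single-game "simulation." The function name, the seed, and the trial counts are my own choices for illustration; the only number taken from the story above is the 87%.

```python
import random

def simulate_win_fraction(p_win, trials, seed=0):
    """Play one game `trials` times and return the fraction the favorite wins.

    p_win is the favorite's win probability (0.87 for Duke in the thought
    experiment). The seed is fixed only so that reruns are repeatable.
    """
    rng = random.Random(seed)
    # Each "game" is one uniform draw compared against the win probability.
    wins = sum(1 for _ in range(trials) if rng.random() < p_win)
    return wins / trials

# More trials should pull the fraction toward the probability we fed in.
for n in (10, 1_000, 100_000):
    print(n, simulate_win_fraction(0.87, n))
```

The fraction wanders at small trial counts and settles near 0.87 as the count grows, which is all the ten-million-game run was ever going to tell us.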
This is why the whole notion of "simulating" the tournament this way is silly. The point of doing a large number of trials (simulations) is to reveal the expected value. But we already know the expected value: it's the winning probability we stole from Jeff Sagarin. (Or Ken Pomeroy, whomever it was.) It's just a waste of perfectly good random numbers to get us back to the place we started.
To be fair, there's one reason that it makes some sense to do this for the entire Tournament. If for some reason you want to know before the Tournament the chances of a particular team winning the whole thing, then this sort of simulation is a feasible way to calculate that result. (Or if you're Ed Feng you create this thing.) And if that's your goal, I give you a pass.
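For the curious, here is roughly what that whole-Tournament calculation looks like, shrunk to a four-team bracket. Everything specific in it is made up for illustration: the team labels, the ratings, and especially the logistic win_prob formula, which is a stand-in and not how Sagarin or Pomeroy actually turn ratings into probabilities.

```python
import random
from collections import Counter

def win_prob(rating_a, rating_b):
    """Hypothetical model: chance that team A beats team B, from a rating gap."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 10.0))

def play_bracket(teams, ratings, rng):
    """Play a single-elimination bracket once; adjacent teams meet each round."""
    field = list(teams)
    while len(field) > 1:
        next_round = []
        for i in range(0, len(field), 2):
            a, b = field[i], field[i + 1]
            # One random draw per game, compared against A's win probability.
            winner = a if rng.random() < win_prob(ratings[a], ratings[b]) else b
            next_round.append(winner)
        field = next_round
    return field[0]

def title_odds(teams, ratings, trials, seed=0):
    """Estimate each team's chance of winning it all by repeated simulation."""
    rng = random.Random(seed)
    champions = Counter(play_bracket(teams, ratings, rng) for _ in range(trials))
    return {team: champions[team] / trials for team in teams}

# Made-up four-team field; the number of teams must be a power of two.
teams = ["A", "B", "C", "D"]
ratings = {"A": 20.0, "B": 10.0, "C": 15.0, "D": 5.0}
print(title_odds(teams, ratings, trials=20_000))
```

Here the simulation earns its keep, because a team's title odds depend on who it might meet in each later round, and that is tedious to work out by hand for 64 teams.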
On the other hand, if you're doing all this simulation to fill out a bracket for (say) the Machine Madness competition, then it makes more sense to run your simulation for a small number of trials. The number of trials is essentially a sliding control, with Very Random (1 trial) at one end and Very Boring (1 billion trials) at the other. Arguably it is good meta-strategy in pool competitions not to predict the favorite in every game, so by lowering the number of trials you can inject some randomness into your entry. (I don't think this is necessarily a good approach, but at least it is rational.)
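Here is a small sketch of that sliding control for a single game. I'm assuming the bracket pick is whichever team wins the majority of the simulated games, which is my guess at how such an entry would be built; the 60% favorite and the function names are likewise invented for the example.

```python
import random

def pick_by_simulation(p_favorite, trials, rng):
    """Pick whichever side wins the majority of `trials` simulated games.

    Use an odd number of trials to avoid ties (a tie goes to the underdog here).
    """
    wins = sum(1 for _ in range(trials) if rng.random() < p_favorite)
    return "favorite" if 2 * wins > trials else "underdog"

def upset_pick_rate(p_favorite, trials, repeats=2000, seed=0):
    """Fraction of brackets that would end up picking the underdog in this game."""
    rng = random.Random(seed)
    upsets = sum(
        1 for _ in range(repeats)
        if pick_by_simulation(p_favorite, trials, rng) == "underdog"
    )
    return upsets / repeats

# Hypothetical 60% favorite: watch the upset picks dry up as trials grow.
for n in (1, 11, 101, 301):
    print(n, upset_pick_rate(0.60, n))
```

With one trial the underdog gets picked about as often as it would actually win; by a few hundred trials the favorite is picked essentially every time, and we're back to Very Boring.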
Now I'm off to root for Furman in the Southern Conference title game.