Information theory wonks define the smallest unit of information as a bit -- essentially the answer to one yes-no question. (Disclaimer: yes, I know the real definition is more complex. :-) Let us suppose, then, that we can get the answer to one yes-no question about a basketball game. What question should we ask, and how much can we improve our prediction based upon that information?
Basketball aficionados will know that "home court advantage" (HCA) is a major factor in college basketball. We will examine HCA in more detail soon, but for now let us use our one bit of information to determine the home team. How much does knowing the answer to that question improve our prediction?
It turns out that in college basketball, the home team wins an astonishing 66% of the games, and outscores the visiting team by an average of 4.5 points. We can use this information to create our "1-Bit Predictor":
The home team will win by 4.5 points.
This predictor picks the correct winner in 66% of its games, with a margin-of-victory (MOV) error of about 13.5 points, over all games in the 2009-2011 seasons. So with that one bit of information we've improved our predictor by 32% on one performance metric: going from a 50% coin flip to 66% correct is a relative gain of (66 - 50) / 50 = 32%. (But only by about 7% on the other metric, which will prove a tougher nut to crack.)
| Predictor | % Correct | MOV Error (points) |
(So that it can be compared with later predictors, the performance shown in this table was measured using a standard testing methodology, to be explained shortly.)
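For readers who want to see the 1-Bit Predictor scored concretely, here is a minimal Python sketch. It is not the code behind the numbers above. The `Game` record, the function names, and the four sample games are all made up for illustration. I have also assumed that the MOV error is a root-mean-square error, which the post has not specified, and that the baseline for the "% correct" improvement is a 50% coin flip.

```python
from dataclasses import dataclass
from math import sqrt

# Average home margin of victory quoted in the post.
HOME_MARGIN = 4.5


@dataclass
class Game:
    """Hypothetical game record; only the final scores are needed."""
    home_score: int
    away_score: int


def predict_margin(game: Game) -> float:
    """1-Bit Predictor: the home team wins by HOME_MARGIN, whatever the matchup."""
    return HOME_MARGIN


def evaluate(games):
    """Return (fraction of winners picked correctly, RMS error of predicted margin).

    Assumption: MOV error is measured as a root-mean-square error.
    """
    correct = 0
    squared_error = 0.0
    for game in games:
        actual = game.home_score - game.away_score
        predicted = predict_margin(game)
        # The pick is correct when predicted and actual margins have the same sign.
        if (actual > 0) == (predicted > 0):
            correct += 1
        squared_error += (predicted - actual) ** 2
    n = len(games)
    return correct / n, sqrt(squared_error / n)


def relative_improvement(baseline: float, new: float) -> float:
    """Relative gain of `new` over `baseline`, e.g. 0.50 -> 0.66 gives 0.32."""
    return (new - baseline) / baseline


# Illustrative games only; these are NOT real results.
sample_games = [Game(70, 60), Game(65, 68), Game(80, 75), Game(55, 54)]

pct_correct, mov_error = evaluate(sample_games)
print(f"correct: {pct_correct:.0%}, MOV error: {mov_error:.2f} points")
print(f"gain over a coin flip at 66%: {relative_improvement(0.50, 0.66):.0%}")
```

Swapping in a real list of games with home and away scores would yield the two columns of the table above, under the stated assumption about how the error is measured.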
So the "1-Bit Predictor" provides a reasonable lower bound on prediction performance. This may seem like a trivial result, but consider that [Sokol 2006] compared four rating systems and the Las Vegas betting line over six seasons of tournament games and found that they picked the correct winner in 70-75% of the games. The 1-Bit Predictor is already within a few percentage points of these much more sophisticated systems. If nothing else, this suggests that improving prediction performance is going to be a difficult task.
Having established a reasonable lower bound, the obvious next question is "What is a reasonable upper bound for prediction performance?" That is the topic of the next posting.