Net Prophet: Performance Versus "The Line"

As I've mentioned earlier, I use my models to bet (in some theoretical sense) against "the line". Typically I bet the games where my model differs significantly from the line (e.g., by more than 4 points or so). As I've documented here, I have a number of different models, all of which have around the same performance (~11 points RMSE).

In the past I've usually averaged the predictions of these models for betting purposes, but for some time I've wondered whether they all perform equally well against the line. Although they all have similar errors, it's possible that some of the models err more consistently to the winning side of the line. To test this, I gathered three seasons' worth of Vegas closing line data (about 7700 games) and tested each model for how often its predictions were correct versus the line. (The predictor is "correct" if it would make a winning bet given the line.) I also looked at each predictor's error versus the line (i.e., how accurately it predicted the line).
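To make the two measures concrete, here is a minimal sketch of how they could be computed. It is not the code behind the table below. The function name, the sign convention (every number is a home-team margin, so a line of -3 means the home team is expected to lose by 3), the choice of RMSE for the error versus the line, and the handling of pushes are all assumptions.

```python
import math

def evaluate_vs_line(predicted, line, actual, min_diff=0.0):
    """Score a model against the line (hypothetical helper).

    predicted, line, actual: sequences of home-team margins in points.
    min_diff: only bet games where |predicted - line| exceeds this value.
    Returns (win_rate, bets, rmse_vs_line).
    """
    wins = 0
    bets = 0
    squared_error = 0.0
    for p, l, a in zip(predicted, line, actual):
        edge = p - l      # which side of the line the model favors
        result = a - l    # which side of the line the game landed on
        squared_error += edge ** 2
        # No bet when the model is too close to the line (this also
        # covers edge == 0); a push (result == 0) is not counted.
        if abs(edge) <= min_diff or result == 0:
            continue
        bets += 1
        if (edge > 0) == (result > 0):
            wins += 1
    n = len(predicted)
    win_rate = wins / bets if bets else float("nan")
    rmse = math.sqrt(squared_error / n) if n else float("nan")
    return win_rate, bets, rmse
```

Calling it with `min_diff=2.0` would correspond to a row like "All (Difference > 2)", where only games with a disagreement of more than two points are bet.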

Model	Performance vs. Line	Error vs. Line
TrueSkill	49.89%	3.75
Govan	49.28%	3.49
BGD	49.58%	3.51
Base Statistical	50.12%	4.34
Statistical w/ Derived	50.15%	4.34
All	52.00%	3.49
All (Difference > 2)	53.15%

The "All" model here is a linear predictor using all the inputs to TrueSkill, Govan, BGD and Statistical w/ Derived. (I also tested some voting models, but they all under-perform the Statistical/All models.)

There are a couple of interesting results.

Most noticeably, the "All" predictor is at break-even versus the line. (Due to the "house cut" on sports bets, you need to win about 52% of your bets to break even.) If we restrict ourselves to bets where the predictor differs from the line by more than two points, performance moves into (barely) positive territory. This is very good performance; the best predictors tracked at The Prediction Tracker do not even break 50%. (Furthermore, I am using the "closing" line, which is a tougher measure [by about one point] than the opening line used at The Prediction Tracker.)
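The break-even figure can be worked out directly. The sketch below assumes the common pricing of risking 110 to win 100 on a point-spread bet; the post does not state the odds, so treat those numbers and the function names as assumptions. Under that assumption the exact break-even rate is 110/210, or roughly 52.4%, in line with the "about 52%" quoted above.

```python
def break_even_win_rate(risk=110.0, win=100.0):
    """Win rate at which expected profit is zero.

    Solves p * win - (1 - p) * risk = 0 for p.
    The 110/100 defaults are an assumed house cut, not a figure
    taken from the post.
    """
    return risk / (risk + win)

def expected_profit_per_bet(win_rate, risk=110.0, win=100.0):
    """Average profit per bet, in the same units as risk and win."""
    return win_rate * win - (1.0 - win_rate) * risk
```

For example, `expected_profit_per_bet(0.5315)` is positive under these assumed odds, while a 50% win rate loses the house cut on average.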

It's also intriguing that TrueSkill/Govan/BGD all underperform the line but track it noticeably better than the statistical predictor. This suggests to me that the line is set not by wily veteran gamblers in the smoky back rooms, but by a computer program using some kind of team strength measure.

A (possibly interesting) side note: All models that under-perform the line are going to fall into the seemingly minuscule range of 48-52%. (If a model performs worse than 48% against the line, we would simply bet against the model.) Pick any crazy model you like -- "Always bet the home team," "Always bet on the team whose trainer's name is first alphabetically," etc. -- and the performance is almost certainly going to fall in that 48-52% range against the line. (If it doesn't, you've found the key to beating Vegas!)
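The "bet against the model" argument is simple arithmetic: ignoring pushes, taking the opposite side of every pick turns a win rate of p into 1 - p. A small sketch, with a hypothetical function name:

```python
def fade_if_losing(win_rate):
    """Return (side, effective_win_rate) for a model's picks.

    side is "against" when betting the opposite of every pick wins
    more often than following the model, otherwise "with".
    Pushes are ignored, so the two rates sum to 1.
    """
    faded = 1.0 - win_rate
    if faded > win_rate:
        return "against", faded
    return "with", win_rate
```

So a model that wins only 47% of the time against the line is, in effect, a 53% model once you fade it.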

Friday, February 17, 2012
