Net Prophet
Exploring algorithms for predicting NCAA basketball games.
Thursday, April 4, 2013
NIT Final
Tonight, Iowa met Baylor in the final game of the NIT.
(Virginia, also mentioned in that posting, lost to Iowa in the quarterfinals of the NIT.)
Wednesday, April 3, 2013
Final Four Predictions
| Home | Away | Prediction |
| (1) Louisville | (9) Wichita St. | 12.4 |
| (4) Michigan | (4) Syracuse | 5.4 |
| (1) Louisville | (4) Michigan | 10.2 |
| (1) Louisville | (4) Syracuse | 10.9 |
The Prediction Machine likes Michigan over Syracuse, but that game presents a bit of a predictive dilemma because it involves two #4 seeds facing each other. Normally the better-seeded team is the Home team, and benefits from the Tournament version of Home Court Advantage. (*) It isn’t clear how to resolve this when two identical seeds face each other. This happens so rarely in the Tournament that there isn’t clear precedent. In this case, the PM rates the teams nearly identically, and predicts the win for whichever team is the “home” team. Since the NCAA has chosen Michigan as the home team, I’ll go with that. However, I’ve shown both possible matchups in the final game just in case.
(*) Why is there a home court advantage in the Tournament? My theory: The Home Court Advantage derives largely from referee bias. During the regular season the referee bias is that “teams play better at home” and so they give the home team the benefit of calls, etc. During the tournament the bias is “the better seeded team is better” and so that team gets the benefit of calls.
Saturday, March 30, 2013
Sweet Sixteen Update, Part 2
| Home | Away | Line | Pred | Delta | Result |
| Miami (FL) | Marquette | 6 | 3.7 | -2.3 | -10 |
| Louisville | Oregon | 10 | 11.6 | 1.6 | 8 |
| Ohio State | Arizona | 3.5 | 5.4 | 1.9 | 3 |
| Indiana | Syracuse | 5.5 | 7.7 | 2.2 | -11 |
| Duke | Michigan State | 2 | 4.3 | 2.3 | 10 |
| Kansas | Michigan | 2 | 4.9 | 2.9 | -2 |
| Wichita State | La Salle | 4 | 7.6 | 3.6 | 14 |
| Florida | Florida Gulf Coast | 12.5 | 20.2 | 7.7 | 12 |
The PM went 1-2 on betting predictions. The Louisville-Oregon prediction was too close for a bet, but the PM would have been on the wrong side there as well.
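The 1-2 record can be reproduced from the table with a few lines of Python. This is only a sketch under assumptions not spelled out in the post: Line and Pred are both read as the Home team's expected margin, Result is the Home team's actual margin, a bet is made only when Pred and Line differ by more than 2 points, and the function name `grade_bets` is made up for illustration. Only the four games that were still undecided in the Part 1 table are included.

```python
# Rows copied from the table above: (home, away, line, pred, result).
# Only the four games played after the Part 1 update are listed.
GAMES = [
    ("Louisville", "Oregon", 10, 11.6, 8),
    ("Duke", "Michigan State", 2, 4.3, 10),
    ("Kansas", "Michigan", 2, 4.9, -2),
    ("Florida", "Florida Gulf Coast", 12.5, 20.2, 12),
]


def grade_bets(games, threshold=2.0):
    """Return (wins, losses, pushes) for games where |pred - line| > threshold.

    Assumption: a positive delta means betting the home team to cover the
    line, a negative delta means betting the away team.
    """
    wins = losses = pushes = 0
    for home, away, line, pred, result in games:
        delta = pred - line
        if abs(delta) <= threshold:
            continue  # too close to the Vegas line to bet
        cover = result - line  # > 0 means the home team covered
        if cover == 0:
            pushes += 1
        elif (cover > 0) == (delta > 0):
            wins += 1
        else:
            losses += 1
    return wins, losses, pushes


record = grade_bets(GAMES)
print(record)
```

Under these assumptions Louisville-Oregon is skipped (delta 1.6), Duke covers, and Kansas and Florida do not.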
Michigan-Kansas was the surprise result to me – I really expected Kansas to win that game fairly handily. They might have still squeaked out the win if not for some bone-headed late-game plays.
I’m traveling again today, so I don’t know if I’ll manage to get out Elite Eight predictions.
Friday, March 29, 2013
Sweet Sixteen Update (Part 1)
| Home | Away | Line | Pred | Delta | Result |
| Miami (FL) | Marquette | 6 | 3.7 | -2.3 | -10 |
| Louisville | Oregon | 10 | 11.6 | 1.6 | |
| Ohio State | Arizona | 3.5 | 5.4 | 1.9 | 3 |
| Indiana | Syracuse | 5.5 | 7.7 | 2.2 | -11 |
| Duke | Michigan State | 2 | 4.3 | 2.3 | |
| Kansas | Michigan | 2 | 4.9 | 2.9 | |
| Wichita State | La Salle | 4 | 7.6 | 3.6 | 14 |
| Florida | Florida Gulf Coast | 12.5 | 20.2 | 7.7 | |
The Arizona-Ohio State prediction was too close to recommend a bet, but the PM ended up on the wrong side of the line there as well. Ross’s defensive blunder on the next-to-last play of the game probably cost quite a few gamblers a payout.
Wednesday, March 27, 2013
The Prediction Machine’s Bracket
Initially, the Prediction Machine picked the most likely winner of each game, that is, whichever team it deemed stronger. But there’s a serious drawback to this approach. The Committee is already pretty good at determining the relative strength of the teams, so by and large the Prediction Machine’s picks agreed with the seedings. It only differed where the Committee had “mis-seeded” teams. That seems to happen every year, but there are usually only one or two mis-seeds. So you end up with a bracket that may be the most likely outcome, but which is also going to be very similar to many other brackets. (In fact, we see that very thing in this year’s Machine Madness competition: “Danny’s Dangerous Picks” and “Predict the Madness” are identical after the second round.) This makes it very hard to finish high in a pool with a lot of entrants.
In the next iteration, I forced the Prediction Machine to pick about 15% of the games as upsets. I chose that number because historically, that’s about how many upsets there are each Tournament. The Prediction Machine did this by ranking the upsets and selecting the top 6 upsets in the first round and 5 more in the rest of the tournament. The idea was to get away from the consensus picks of the other competitors while picking the most likely upsets. But this is too risky a strategy. Depending upon the size of the pool, you probably don’t need to get 11 upsets correct to do very well. For example, in last year’s Machine Madness pool, it would have been sufficient to get 8 points from upsets – which could be just one correct upset pick in the round of 8.
This year, the Prediction Machine used an algorithm which took a target number of upset points and tried to select the most likely set of upsets to meet that total. Initially I planned to use a target number of 8 points, based on last year’s results, but in the end decided to set the target higher, with the goal of ending up in the top 5% of the ESPN contest if the upsets occurred as predicted. I placed that goal at a (somewhat arbitrary) 50 points. I then used the Prediction Machine to predict all the chalk matchups in the tournament. This identified a number of games where the Prediction Machine thought the lower-seeded team would win:
| Value | Home | Away | Pred |
| 4 | Georgetown | Florida | -4.8 |
| 1 | UCLA | Minnesota | -4.7 |
| 2 | Kansas St. | Wisconsin | -3.3 |
| 1 | Colorado St. | Missouri | -2.3 |
| 1 | Memphis | St. Mary's | -1.6 |
| 2 | New Mexico | Arizona | -1.2 |
This adds up to 11 points of mis-seeds. That’s a surprising number and may reflect an unusual basketball season. When I plugged these upsets in and ran the tournament again, I discovered that the Prediction Machine also favored #3 Florida over #1 Kansas (an 8 point game), so I added that in for 19 total points of mis-seeds.
The PM then identified the most likely upsets in the remaining games. These were the top results:
| Value | Home | Away | Upset |
| 8 | Gonzaga | Ohio St. | 17.5 |
| 1 | Notre Dame | Iowa St. | 17.4 |
| 4 | Miami (FL) | Marquette | 17.3 |
| 32 | Louisville | Indiana | 14.1 |
The PM then added upsets in order of likelihood until it reached 50 (or in this case, 64). (The next upset on the list was Oklahoma over San Diego State.)
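The "add upsets in order of likelihood until the target is reached" step might look something like the sketch below. It is not the PM's actual code: the `Upset` class and the function name are hypothetical, the 19 starting points are the mis-seed points counted above, and the Oklahoma row is an assumption (a first-round value of 1, with the 7.1 metric taken from the Upset Picks Review table).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Upset:
    value: int     # bracket points the pick is worth
    home: str
    away: str
    metric: float  # PM's Upset metric; higher is treated as more likely


def add_upsets_until_target(candidates, target, starting_points=0):
    """Greedily add the most likely upsets until the point total reaches target."""
    chosen = []
    total = starting_points
    for upset in sorted(candidates, key=lambda u: u.metric, reverse=True):
        if total >= target:
            break  # target already met; leave the remaining upsets out
        chosen.append(upset)
        total += upset.value
    return chosen, total


CANDIDATES = [
    Upset(8, "Gonzaga", "Ohio St.", 17.5),
    Upset(1, "Notre Dame", "Iowa St.", 17.4),
    Upset(4, "Miami (FL)", "Marquette", 17.3),
    Upset(32, "Louisville", "Indiana", 14.1),
    Upset(1, "San Diego St.", "Oklahoma", 7.1),  # assumed row, see note above
]

chosen, total = add_upsets_until_target(CANDIDATES, target=50, starting_points=19)
print(total, [u.away for u in chosen])
```

Because the last pick added is worth 32 points, the total jumps from 32 straight past 50 to 64, which is the overshoot described above.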
There are a couple of refinements to this approach that I haven’t had time to incorporate. A simple refinement would be to drop 14 points of upsets to get back to 50 points. A more complex refinement would be to try different combinations of upsets to get the most likely combination that reaches the target points. Either refinement this year would have ended up keeping just the Louisville-Indiana upset in the final game.
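A brute-force version of the second refinement could be sketched as follows. The post does not say how the Upset metrics of several games would be combined into one likelihood, so the ranking used here is purely an assumption: a combination is scored by its weakest (lowest-metric) upset, and ties go to the combination with fewer picks. The function name and the tuple layout are also made up for illustration.

```python
from itertools import combinations

# (value, home, away, upset metric), copied from the upset table above.
CANDIDATES = [
    (8, "Gonzaga", "Ohio St.", 17.5),
    (1, "Notre Dame", "Iowa St.", 17.4),
    (4, "Miami (FL)", "Marquette", 17.3),
    (32, "Louisville", "Indiana", 14.1),
]


def best_combination(candidates, target, starting_points=0):
    """Try every subset of upsets and keep the best one that reaches target.

    Returns a tuple of candidate rows, an empty tuple if no upsets are
    needed, or None if no subset reaches the target.
    """
    if starting_points >= target:
        return ()
    best, best_key = None, None
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            points = starting_points + sum(row[0] for row in combo)
            if points < target:
                continue
            # Assumed ranking: highest weakest-link metric, then fewest picks.
            key = (min(row[3] for row in combo), -len(combo))
            if best_key is None or key > best_key:
                best, best_key = combo, key
    return best


best = best_combination(CANDIDATES, target=50, starting_points=19)
print(best)
```

With 19 points of mis-seeds already in the bracket, every subset that reaches 50 has to include the 32-point final, so under this ranking the search keeps only Louisville-Indiana, for 51 points.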
It’s just as well that I didn’t have time to implement either refinement. This year’s Machine Madness field turned out much larger than expected (27 competitors!) and even if Indiana wins everything, I won’t win the competition unless Marquette beats Miami – one of the upsets that would be dropped to get back to 50.
Looking at the Prediction Machine’s performance, in the first round it went 2-2 for mis-seeds/upsets, and in the second round 1-1. 50% correct on picking upsets is probably a pretty good performance. In the ESPN competition, the Prediction Machine’s bracket is at 94.4% out of about 8 million entries, with 7 out of the Round of Eight still alive.
Tuesday, March 26, 2013
Sweet Sixteen Predictions
Here are the Prediction Machine’s thoughts on the Sweet Sixteen games:
| Home | Away | Line | Pred | Delta |
| Miami (FL) | Marquette | 6 | 3.7 | -2.3 |
| Louisville | Oregon | 10 | 11.6 | 1.6 |
| Ohio State | Arizona | 3.5 | 5.4 | 1.9 |
| Indiana | Syracuse | 5.5 | 7.7 | 2.2 |
| Duke | Michigan State | 2 | 4.3 | 2.3 |
| Kansas | Michigan | 2 | 4.9 | 2.9 |
| Wichita State | La Salle | 4 | 7.6 | 3.6 |
| Florida | Florida Gulf Coast | 12.5 | 20.2 | 7.7 |
The PM likes mostly home teams, although it thinks Marquette +6 is a good bet. The PM has Marquette picked as a likely upset in its bracket. It needs Marquette to win this game and Indiana to win out in order to finish first in the Machine Madness Contest. (I’ll have a blog post shortly about how the PM picked its bracket.)
At the other end of the spectrum, the PM likes Florida to crush FGCU, even with FGCU's recent victories taken into account. I'm dubious. I'm also dubious of the Indiana prediction, given how impressive Syracuse was in San Jose. And the PM has liked Wichita State all along, and is looking for a fairly routine victory over La Salle.
The PM doesn't usually see this many "bettable" games, where the difference between the Vegas line and the PM’s prediction is greater than 2 points. It’s likely that – since there are many more regular season games to train upon – the PM doesn't do as good a job accounting for the Tournament conditions as Vegas. Alternatively, it may be doing a better job, or the lines may be more influenced by betting during the Tournament when there’s more action.
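For reference, the Delta column and the "bettable" filter amount to a subtraction and a threshold check. The sketch below assumes that Line and Pred are both the Home team's expected margin and that a positive Delta means backing the Home side; `bettable_games` is a hypothetical name, not part of the PM.

```python
# (home, away, line, pred), copied from the table above.
GAMES = [
    ("Miami (FL)", "Marquette", 6, 3.7),
    ("Louisville", "Oregon", 10, 11.6),
    ("Ohio State", "Arizona", 3.5, 5.4),
    ("Indiana", "Syracuse", 5.5, 7.7),
    ("Duke", "Michigan State", 2, 4.3),
    ("Kansas", "Michigan", 2, 4.9),
    ("Wichita State", "La Salle", 4, 7.6),
    ("Florida", "Florida Gulf Coast", 12.5, 20.2),
]


def bettable_games(games, threshold=2.0):
    """Return (home, away, side_to_back, delta) for games with |delta| > threshold."""
    picks = []
    for home, away, line, pred in games:
        delta = round(pred - line, 1)  # rounded to match the table's precision
        if abs(delta) > threshold:
            side = home if delta > 0 else away
            picks.append((home, away, side, delta))
    return picks


picks = bettable_games(GAMES)
for home, away, side, delta in picks:
    print(f"{home} vs. {away}: back {side} (delta {delta:+.1f})")
```

Six of the eight games clear the 2-point threshold, and Marquette is the only one where the pick is the Away team.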
Monday, March 25, 2013
Upset Picks Review
| Home | Away | Upset | MOV |
|---|---|---|---|
| Notre Dame | Iowa St. | 17.4 | -18 |
| San Diego St. | Oklahoma | 7.1 | 15 |
| Memphis | St. Mary's | 2.5 | 2 |
| Oklahoma St. | Oregon | 2.4 | -13 |
| N.C. State | Temple | 2.2 | -4 |
| Illinois | Colorado | 2.1 | 8 |
| Colorado St. | Missouri | 1.8 | 12 |
| Creighton | Cincinnati | 1.6 | 4 |
| Pittsburgh | Wichita St. | 1.6 | -18 |
| UNLV | California | 1.5 | -3 |
| Butler | Bucknell | 1.5 | 12 |
| North Carolina | Villanova | 1.2 | 7 |
According to the PM, the Notre Dame-Iowa St. game had significantly higher upset chances than any other game, and in fact Iowa State won the game handily. The PM also liked Oklahoma to upset San Diego State, but San Diego State won that game handily.
The next tier of upset possibilities (> 2) was less likely, but the PM also went 50% in this tier. (Actually 75%. Although I marked it as a missed upset here, after the play-in game, the PM had St. Mary's as an outright favorite in the game against Memphis, so the Upset probability was actually for Memphis to win! The Upset metric remained almost the same, by the way.)
The next tier is below the cutoff for consideration as an upset, although both the Wichita State and Cal upsets were identified here. (And Cincinnati, Missouri and Bucknell were all popular upset picks by pundits.)
Combined with last year's results, it appears that the PM's algorithm for detecting likely upsets works fairly well. (Note that whether the PM should include the upset in its bracket is a different question!)