Friday, February 7, 2014

A Good Offensive Rebounding Team? (Part 1)

I've been working a lot lately on refactoring the Prediction Machine code for improved performance.  Previously, it took about 15 minutes to process a single season of data, and due to memory issues, I couldn't process all of my data in one pass; I had to stop and restart my processing for each season.  So I spent a week or so profiling the code, removing memory leaks, speeding up the slowest processing, and so on.  The results were pretty remarkable -- about a 25x speedup overall.  This makes it much easier to try out new ideas on the entire data set.

Speaking of which, I was watching a college basketball game the other night and the announcer claimed that one of the teams was "a good offensive rebounding team".  He said this based upon nothing more than the team grabbing two offensive rebounds in a row, but it made me wonder exactly how one could tell if a team was a good offensive rebounding team, or not.

For any particular game we know the number of offensive rebounds each team grabbed.  But that tells us very little.  If Duke and UCLA played each other and Duke grabbed 13 offensive rebounds and UCLA grabbed 4, we'd be tempted to say that Duke is the better offensive rebounding team.  But are they?

Well, it's obvious that one game could be just a fluke.  So we need to look at performance over a number of games.  So let's suppose that Duke averages 13 offensive rebounds a game, while UCLA only averages 4.  Now can we say that Duke is a better offensive rebounding team?  Maybe not.

Suppose we found that Duke is shooting 28% from the field while UCLA is shooting 87% from the field.  The difference in offensive rebounds might simply reflect a difference in opportunities.

Let's correct for that by expressing offensive rebounding as a percentage of available opportunities (e.g., offensive rebounds / missed shots).  So now suppose that Duke is grabbing 35% of its offensive rebound opportunities while UCLA is grabbing only 27%.  Now can we say that Duke is a better offensive rebounding team?  Maybe not.
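To make the "percentage of opportunities" idea concrete, here is a minimal Python sketch.  The function name and the box-score numbers are made up for illustration; it simply divides offensive rebounds by missed shots, as described above, and is not the Prediction Machine code.

```python
def offensive_rebound_rate(offensive_rebounds, missed_shots):
    """Share of a team's own missed shots that it rebounded."""
    if missed_shots <= 0:
        raise ValueError("missed_shots must be positive")
    return offensive_rebounds / missed_shots


# Hypothetical single-game numbers (invented for this example).
duke_rate = offensive_rebound_rate(offensive_rebounds=13, missed_shots=40)
ucla_rate = offensive_rebound_rate(offensive_rebounds=4, missed_shots=10)

print(f"Duke: {duke_rate:.1%}  UCLA: {ucla_rate:.1%}")
```

With these invented numbers, UCLA ends up with the higher rate even though Duke grabbed more than three times as many offensive rebounds, because UCLA had far fewer misses to chase.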

If Duke and UCLA didn't play all the same opponents, then those aren't apples-to-apples numbers.  Suppose we found that Duke's opponents had held their opponents to a 45% offensive rebounding rate, while UCLA's opponents had held their opponents to a 15% offensive rebounding rate.  Duke's 35% is then below what its opponents usually allow, and UCLA's 27% is well above what its opponents usually allow.  Now it appears that Duke is a comparatively weak offensive rebounding team, while UCLA is comparatively strong.

Did you follow all that?   Express offensive rebounding as a percentage of the available rebounds, average it over all the games, and then adjust it for opponents.  And then you -- maybe -- have a number that you can use to compare teams.
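Here is a rough Python sketch of those three steps.  One loud assumption: the adjustment below is a simple ratio of a team's rate to the rate its opponents typically allow, which is only one of several ways to adjust for opponents and is not necessarily how the Prediction Machine does it.  The function names are hypothetical, and the Duke and UCLA figures are the made-up ones from the example above.

```python
from statistics import mean


def season_rate(games):
    """Average per-game offensive rebound rate.

    `games` is a list of (offensive_rebounds, missed_shots) pairs.
    """
    return mean(orb / missed for orb, missed in games)


def opponent_adjusted(team_rate, opponents_allowed_rate):
    """Ratio of a team's rate to the rate its opponents typically allow.

    Assumed adjustment: above 1.0 means the team rebounds better than
    its opponents usually permit; below 1.0 means worse.
    """
    return team_rate / opponents_allowed_rate


# Step 1 and 2: per-game rates averaged over a (hypothetical) three-game season.
sample_season = season_rate([(12, 40), (14, 35), (10, 30)])

# Step 3: the hypothetical rates from the Duke/UCLA example.
duke = opponent_adjusted(team_rate=0.35, opponents_allowed_rate=0.45)
ucla = opponent_adjusted(team_rate=0.27, opponents_allowed_rate=0.15)

print(f"Duke: {duke:.2f}  UCLA: {ucla:.2f}")
```

Under this assumed ratio, Duke lands below 1.0 and UCLA well above it, which matches the conclusion that UCLA is the comparatively stronger offensive rebounding team.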

If we run this statistic for the current season, here are the top offensive rebounding teams:

Rank  Team                       Rating  Total
1     North Carolina (15-7)      1.36    295
2     Quinnipiac (13-8)          1.35    357
3     San Diego St. (20-1)       1.31    217
4     UAB (14-8)                 1.28    284
5     Northern Illinois (10-11)  1.26    257
6     Tennessee (14-8)           1.26    312
7     Purdue (14-9)              1.24    289
8     Arizona (22-1)             1.24    277
9     Indiana (14-8)             1.23    268
10    Long Beach St. (9-13)      1.23    212

You might be surprised to see Quinnipiac at #2, but they lead the nation in total number of offensive rebounds (as shown in the last column).  (Tennessee, with the second-highest total in this list, has 45 (!) fewer offensive rebounds.)  What's interesting here is the obvious disparity between the raw number of offensive rebounds and the rankings.  San Diego St., with only 217 offensive rebounds, is #3 largely because they've played teams that are tough to rebound against.  It's also interesting to note that there are some very good teams in this list.

In my model this statistic doesn't have a lot of predictive value -- but then, I have a variety of other statistics that characterize offensive rebounding performance.  One interesting thing about this statistic is that it is about ten times more important for the Away team than for the Home team.  This suggests that good offensive rebounding teams might play a little better on the road.

Next time we'll look at another way to measure offensive rebounding performance and see how the two measures compare.