Wednesday, February 4, 2015

Strength of Schedule & Adjusted Statistics (Part 5)

The method for calculating adjusted statistics I outlined in previous posts turns out to be moderately useful.  It's better for prediction than raw statistics, but not as useful as some other approaches.

One problem is that my approach doesn't explicitly provide a measure of defense against a statistic.  We get an adjusted statistic for (say) 3 pt % that tells us how good a team is at that statistic relative to other teams, but we don't get a measure of how good that team is at defending against it.  We can remedy this with a slightly different approach.

In this new approach, we assume that the performance for a team in a particular game is a function of both the team's offensive strength and its opponent's defensive strength.

`S_"ij" = (O_i)/(D_j)`

`S_"ij"` represents the value of the statistic in a game between the offensive team (i) and the defensive team (j).  `O_i` then represents the offensive team's adjusted strength at this statistic, and `D_j` represents the defensive team's adjusted strength at defending this statistic.

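As an example of this approach, assume we have the following schedule of games and performances (the teams are labeled g, s and b, and each value is what that team recorded on offense in that game):

g vs. s:     g 0.43,  s 0.30
g vs. b:     g 0.35,  b 0.23
s vs. b:     s 0.28,  b 0.26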


This yields six equations for the Offensive and Defensive strength ratings:

`0.43 = O_g / D_s`            `0.30 = O_s / D_g`
`0.35 = O_g / D_b`            `0.23 = O_b / D_g`
`0.28 = O_s / D_b`             `0.26 = O_b / D_s`

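and solving these equations yields (to within rounding)

`O_g = 0.50`     `D_g = 1/0.75`
`O_s = 0.40`     `D_s = 1/0.85`
`O_b = 0.30`     `D_b = 1/0.70`

Because `D_j` is a divisor in this model, a larger `D_j` means a stronger defense.  The ratings are also determined only up to a common factor: multiplying every O and every D by the same constant leaves every ratio unchanged.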
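For a system this small, the equations can be worked through by substitution. Here is a minimal Python sketch of that, not necessarily how the numbers in this post were produced. Since only the ratios O/D are observed, the overall scale is arbitrary, so the sketch pins `O_g = 0.50` (my assumption). The variable names, and the choice to combine the two estimates of `D_g` with a geometric mean, are also just illustrative.

```python
# Observed values from the six equations, keyed by (offense, defense).
S = {
    ("g", "s"): 0.43, ("s", "g"): 0.30,
    ("g", "b"): 0.35, ("b", "g"): 0.23,
    ("s", "b"): 0.28, ("b", "s"): 0.26,
}

# Assumption: fix the scale by pinning one rating.
O = {"g": 0.50}
D = {}

# S = O / D  =>  D = O / S, using the two games where g is on offense.
D["s"] = O["g"] / S[("g", "s")]
D["b"] = O["g"] / S[("g", "b")]

# S = O / D  =>  O = S * D, using the defenses just found.
O["s"] = S[("s", "b")] * D["b"]
O["b"] = S[("b", "s")] * D["s"]

# Two equations remain, and both involve D_g, so we get two estimates.
d_g_from_s = O["s"] / S[("s", "g")]
d_g_from_b = O["b"] / S[("b", "g")]

# Assumption: reconcile them with a geometric mean.
D["g"] = (d_g_from_s * d_g_from_b) ** 0.5

print("O:", {t: round(v, 3) for t, v in O.items()})
print("D:", {t: round(v, 3) for t, v in D.items()})
print("two estimates of D_g:", round(d_g_from_s, 3), round(d_g_from_b, 3))
```

The two estimates of `D_g` differ slightly (roughly 1.33 versus 1.31), which is what you would expect from observations rounded to two decimal places. It also shows that six equations already over-determine the five free ratings.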

In general we'll have many more games than teams, and there won't be an exact solution for this system of equations; the best we can do is find ratings that fit the observed values as closely as possible.  This can be done with an iterative approach similar to the one described for the previous system: assign some initial values to O and D, then alternately recompute O and D until the values converge.
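Here is one way the alternating recomputation could look in Python. This is a sketch under assumptions, not necessarily the exact update used for these posts. It works in log space, where the model becomes log S = log O - log D. Each team's log O is recomputed as the average of (log S + log D) over its offensive games, and each log D as the average of (log O - log S) over its defensive games. After every pass both are rescaled so that the geometric mean of D is 1, since the ratios alone don't pin down the overall scale. The function name `fit_ratings`, the tuple layout for games, and the stopping tolerance are all hypothetical.

```python
import math
from collections import defaultdict


def fit_ratings(games, max_iter=500, tol=1e-12):
    """Fit S = O / D to (offense, defense, value) tuples; values must be > 0."""
    teams = sorted({t for off, dfn, _ in games for t in (off, dfn)})
    log_o = dict.fromkeys(teams, 0.0)  # start with every O = 1
    log_d = dict.fromkeys(teams, 0.0)  # start with every D = 1

    for _ in range(max_iter):
        # Recompute offense with defense held fixed.
        total, count = defaultdict(float), defaultdict(int)
        for off, dfn, s in games:
            total[off] += math.log(s) + log_d[dfn]
            count[off] += 1
        new_o = {t: total[t] / count[t] if count[t] else log_o[t] for t in teams}

        # Recompute defense with the new offense held fixed.
        total, count = defaultdict(float), defaultdict(int)
        for off, dfn, s in games:
            total[dfn] += new_o[off] - math.log(s)
            count[dfn] += 1
        new_d = {t: total[t] / count[t] if count[t] else log_d[t] for t in teams}

        # Assumed normalization: shift both so the mean log D is zero.
        # Shifting O and D together leaves every ratio O / D unchanged.
        shift = sum(new_d.values()) / len(teams)
        new_o = {t: v - shift for t, v in new_o.items()}
        new_d = {t: v - shift for t, v in new_d.items()}

        change = max(
            max(abs(new_o[t] - log_o[t]), abs(new_d[t] - log_d[t])) for t in teams
        )
        log_o, log_d = new_o, new_d
        if change < tol:
            break

    offense = {t: math.exp(v) for t, v in log_o.items()}
    defense = {t: math.exp(v) for t, v in log_d.items()}
    return offense, defense


# The six performances from the example above.
games = [
    ("g", "s", 0.43), ("s", "g", 0.30),
    ("g", "b", 0.35), ("b", "g", 0.23),
    ("s", "b", 0.28), ("b", "s", 0.26),
]
offense, defense = fit_ratings(games)
for off, dfn, s in games:
    print(off, dfn, s, round(offense[off] / defense[dfn], 3))
```

Averaging in log space is a design choice: each half-step is then an exact least-squares update for log S, which keeps the iteration well behaved. On the six example performances the fitted ratios land within 0.01 of the observed values.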

This approach is similar to a ranking method described in [Govan 2009].  (You can get this paper from the Paper Archive.)  Govan describes the conditions under which the ratings will converge (hint: they will for NCAA basketball after a few hundred games) and a method of calculation.  Govan's approach isn't exactly the same as the one described here; I leave it to the reader to work out the differences and how to address them.

In the next posting I'll talk about yet another alternative model for adjusted statistics and how that can be calculated.
