Tuesday, January 27, 2015

Strength of Schedule & Adjusted Statistics (Part 1)

The usual method of predicting games is to look at a team's past performance -- usually expressed as a statistical value such as "winning percentage" -- and use that to estimate future performance.  But this approach is problematic, because not all statistics are created equal.  In the case of winning percentage, two teams with a winning percentage of 84% are not necessarily equivalent.  Louisville's 16-3 record, with losses to #1 Kentucky, #4 Duke and #18 UNC, is not the same as Dayton's 16-3 record, with losses to #17 Connecticut, Arkansas and Davidson.

This problem arises because college basketball is a case of incomplete pairwise comparison.  If every team played every other team twenty times, by the end of the (admittedly long) season, winning percentage would be a pretty good measure of team strength.  But that's never going to happen, so we need other ways to compensate for this weakness.  One of the simplest is to calculate a "Strength of Schedule" and use that to interpret the statistic.

In its simplest form, Strength of Schedule (SoS) is calculated as the average of the same statistic across all of a team's opponents.  So if we were looking at "winning percentage", SoS would be calculated by averaging the winning percentages of all of a team's opponents.  Louisville might have a SoS of .57 (meaning its opponents have collectively won 57% of their games) while Dayton has a SoS of .51 (meaning its opponents have only won 51% of their games).  In light of this, we could then say that Louisville's 84% winning percentage is better than Dayton's 84% winning percentage.
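To make this concrete, here's a minimal Python sketch of the simple SoS calculation; the team names, schedules, and records are made up for illustration:

```python
def winning_pct(record):
    """Winning percentage from a (wins, losses) tuple."""
    wins, losses = record
    return wins / (wins + losses)

def simple_sos(team, schedule, records):
    """Average the winning percentage of all of a team's opponents."""
    opponents = schedule[team]
    return sum(winning_pct(records[opp]) for opp in opponents) / len(opponents)

# Illustrative data: records are (wins, losses); schedule lists opponents.
records = {"A": (16, 3), "B": (12, 7), "C": (10, 9), "D": (5, 14)}
schedule = {"A": ["B", "C", "D"]}

print(round(simple_sos("A", schedule, records), 3))  # → 0.474
```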

There are several shortcomings with this definition of SoS.

First, it doesn't always make sense to measure Strength of Schedule using the same statistic.  For example, suppose we're looking at "3 Pt Shooting Percentage".  In this case, SoS would tell us how well our opponents shot the three-pointer.  That doesn't make a lot of sense.  How well our opponent shot the three doesn't affect how well we shot the three.  In this case, we really want to know how well our opponents' opponents shot the three (if you can follow that thought).  The simplistic form of SoS only makes sense for symmetric statistics -- where a plus for one team is automatically a minus for the other -- such as winning percentage, where a win for you necessarily means a loss for the other team.
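A sketch of the "opponents' opponents" version for a non-symmetric statistic follows; the schedule and shooting percentages are invented, and whether to exclude your own games from an opponent's schedule is a design choice (this version excludes them):

```python
def opponents_opponents_avg(team, schedule, stat):
    """Average a statistic over all of a team's opponents' opponents.

    For a non-symmetric statistic like 3-pt shooting percentage, this is
    the relevant schedule strength: it measures what each opponent had to
    defend against, rather than how the opponent itself shot.
    """
    values = []
    for opp in schedule[team]:
        for opp_opp in schedule[opp]:
            if opp_opp != team:          # design choice: skip our own games
                values.append(stat[opp_opp])
    return sum(values) / len(values)

# Illustrative data: season-long 3-pt shooting percentage per team.
stat = {"A": 0.38, "B": 0.35, "C": 0.33, "D": 0.30}
schedule = {"A": ["B", "C"], "B": ["A", "C", "D"],
            "C": ["A", "B", "D"], "D": ["B", "C"]}

print(round(opponents_opponents_avg("A", schedule, stat), 3))  # → 0.32
```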

Even for symmetric statistics, there are problems with this view of SoS.  One is that we've only pushed off the problem one level by looking at a team's opponents.  To return to the previous example, Louisville's opponents seem better because they have better records than Dayton's opponents.  But maybe that's just because Louisville's opponents themselves played weak teams. This is the problem that RPI tries to address by looking at opponents and opponents' opponents.  Two layers is pretty good, and RPI is a much better metric than straight Winning Percentage.  Of course, you can take it deeper.
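The two-layer idea can be sketched as a weighted blend, along the lines of RPI's traditional 25%/50%/25% weighting of a team's winning percentage (WP), its opponents' winning percentage (OWP), and its opponents' opponents' winning percentage (OOWP).  The inputs here are made up, and this ignores RPI refinements such as excluding a team's own games from OWP and weighting home/away results:

```python
def rpi(wp, owp, oowp):
    """Two-layer rating: blend own, opponents', and opponents' opponents'
    winning percentages using the traditional 25/50/25 RPI weights."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp

# Illustrative inputs: an 84% team with a .57 OWP and .52 OOWP.
print(round(rpi(0.84, 0.57, 0.52), 3))  # → 0.625
```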

In general, many of the more sophisticated rating systems (e.g., Massey) can be viewed as different approaches to extending Strength of Schedule as deep as possible.  I'm not sure there's a "right" answer to measuring Strength of Schedule, but it seems clear that the general idea -- to adjust or interpret statistics based upon a team's opponents -- is valuable.
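One way to picture "extending SoS as deep as possible" is a fixed-point iteration: start each team's rating at its winning percentage, then repeatedly blend its own performance with the average rating of its opponents, so that information propagates through opponents, opponents' opponents, and so on.  This is a toy sketch over made-up data, not any particular published system:

```python
def iterate_ratings(wp, schedule, rounds=50, blend=0.5):
    """Repeatedly replace each team's rating with a blend of its own
    winning percentage and the average rating of its opponents."""
    ratings = dict(wp)
    for _ in range(rounds):
        new = {}
        for team, opps in schedule.items():
            opp_avg = sum(ratings[o] for o in opps) / len(opps)
            new[team] = blend * wp[team] + (1 - blend) * opp_avg
        ratings = new
    return ratings

# Illustrative data: winning percentages and a round-robin-ish schedule.
wp = {"A": 0.8, "B": 0.6, "C": 0.4, "D": 0.2}
schedule = {"A": ["B", "C"], "B": ["A", "D"],
            "C": ["A", "D"], "D": ["B", "C"]}

ratings = iterate_ratings(wp, schedule)
print({t: round(r, 3) for t, r in sorted(ratings.items())})
```

Because each round halves the influence of the previous estimates, the ratings converge to a fixed point; teams that beat up on weak schedules drift down, and teams with strong schedules drift up.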