Friday, May 20, 2011

KRACH

College hockey seems to be a hotbed for ratings systems -- possibly because (like college football) it has many teams and few games.  One of the most popular ratings systems for college hockey is Ken's Ratings for American College Hockey (KRACH).  KRACH is based upon the Bradley-Terry model, a probabilistic model for pairwise comparison; that is, it gives a method for ranking a set of entities by making only pairwise comparisons.  Since games are "pairwise comparisons" between teams, there's an obvious application to ranking sports teams.

Unlike some of the rating systems we've looked at, KRACH gives the ratings a specific meaning: an odds scale.  This means that if Team A has a rating of 200 and Team B has a rating of 100, then Team A is a 2:1 favorite to beat Team B.  If Team A and Team B play an infinitely long series of games, then Team A is expected to win 2/3 of those games and Team B to win 1/3.  If Team A and Team B play just a single game, then Team A is expected to win 0.67 of that game and Team B is expected to win 0.33 of that game.

(I'm reminded of the old joke that a statistician is a person who can put his head in an oven, his feet in a freezer, and on average be quite comfortable.)

So if we're given a team and its schedule (along with the KRACH ratings) we can calculate the expected number of wins by summing up the expected wins for each game.  For example, if Team X with a KRACH rating of 100 plays Teams A, B, C, and D with ratings of (respectively) 150, 100, 75 and 50, then the expected number of wins is:
X vs A:  0.40
X vs B:  0.50
X vs C:  0.57
X vs D:  0.67
------
Total:  2.14
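
The arithmetic above can be reproduced with a few lines of Python.  This is only a sketch; the function names are my own for illustration and aren't part of any KRACH implementation.

```python
def win_probability(rating, opponent_rating):
    """Chance of winning one game on the odds scale: Ki / (Ki + Kj)."""
    return rating / (rating + opponent_rating)


def expected_wins(rating, opponent_ratings):
    """Sum the per-game win probabilities over a schedule."""
    return sum(win_probability(rating, opp) for opp in opponent_ratings)


# Team X (rating 100) against Teams A, B, C and D from the example.
schedule = [150, 100, 75, 50]
for opp in schedule:
    print(f"X vs {opp}: {win_probability(100, opp):.2f}")
print(f"Total: {expected_wins(100, schedule):.2f}")  # Total: 2.14
```
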
What the KRACH system does is try to pick ratings for all the teams (simultaneously) so that the expected number of wins for each team matches the actual number of wins.  As you might imagine, this requires an iterative approach.  The update equation for each iteration is given by this formula:
Ki = Vi / Sum over j of [Nij/(Ki+Kj)]
Where Ki is the KRACH rating for team i, Vi is the number of wins for team i, and Nij is the number of times that team i and j have played each other.  To avoid problems with undefeated and winless teams, we also need to add in a "tie" game with a fictitious team with an arbitrary rating.
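
Here is a minimal sketch of that iteration in Python.  It makes some assumptions that the description above leaves open: games are given as (winner, loser) pairs, the fictitious team has a fixed rating of 100, the "tie" counts as half a win and one game played, and each rating is updated in place as soon as it is recomputed.  The names krach_ratings and FICTITIOUS_RATING are just for illustration.

```python
from collections import defaultdict

FICTITIOUS_RATING = 100.0  # assumed arbitrary rating for the fictitious team


def krach_ratings(games, max_iterations=10000, tolerance=1e-10):
    """Fit odds-scale ratings from (winner, loser) pairs by repeated updates."""
    wins = defaultdict(float)                       # Vi
    played = defaultdict(lambda: defaultdict(int))  # Nij
    for winner, loser in games:
        wins[winner] += 1.0
        wins[loser] += 0.0  # make sure winless teams get an entry
        played[winner][loser] += 1
        played[loser][winner] += 1

    ratings = {team: FICTITIOUS_RATING for team in wins}
    for _ in range(max_iterations):
        largest_change = 0.0
        for i in ratings:
            # Sum over j of Nij / (Ki + Kj) for the real opponents...
            denominator = sum(n / (ratings[i] + ratings[j])
                              for j, n in played[i].items())
            # ...plus the one game against the fictitious team.
            denominator += 1.0 / (ratings[i] + FICTITIOUS_RATING)
            # The tie with the fictitious team adds half a win to Vi.
            updated = (wins[i] + 0.5) / denominator
            change = abs(updated - ratings[i]) / ratings[i]
            largest_change = max(largest_change, change)
            ratings[i] = updated
        if largest_change < tolerance:
            break
    return ratings


# A tiny made-up schedule: A is undefeated and C is winless.
for team, rating in sorted(krach_ratings([("A", "B"), ("A", "C"), ("B", "C")]).items()):
    print(team, round(rating, 1))
```

Because the fictitious team's rating never changes, it also pins down the overall scale of the ratings, which would otherwise only be determined up to a constant factor.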

There's one other adjustment necessary for using the KRACH ratings for prediction.  Since the KRACH ratings represent odds, when we compare two ratings to predict an outcome, we need to use a ratio of the ratings:
Ki/(Ki+Kj)
rather than use Ki and Kj directly in our linear regression.
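
To see why the ratio matters, here is a small illustrative snippet (the function name is hypothetical).  Two matchups with the same odds produce the same ratio, even though the raw rating differences are very different:

```python
def krach_feature(k_i, k_j):
    """Regression input for a game between teams i and j: Ki / (Ki + Kj)."""
    return k_i / (k_i + k_j)


# Both matchups are 2:1 on the odds scale.
for k_i, k_j in [(200, 100), (20, 10)]:
    print(f"ratings {k_i} vs {k_j}: "
          f"difference = {k_i - k_j}, ratio = {krach_feature(k_i, k_j):.3f}")
```
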

Implementing the KRACH rating and testing with our usual methodology yields this performance:

Predictor    % Correct    MOV Error
Wilson       77.7%        10.33
KRACH        71.5%        11.50

Once again, this rating does not provide an improvement on our best rating so far.  Because the KRACH rating is an odds rating, we might expect the MOV Error to be higher, but even on predicting outcome, KRACH is not an improvement on Wilson.