Net Prophet: Govan Ratings

The next system we will take a look at is based on [Govan 2009]. Govan calls this the "Offense-Defense Rating," but to avoid confusion with my own Offense-Defense Rating, I will refer to this system as the "Govan Rating."

The idea behind the Govan Rating is to generate an Offense and a Defense rating for every team based upon scoring. Assuming that "Score_ij" represents the points scored by Team j against Team i, then the Offense rating for Team j is given by the formula:

Offense_j = (Score_1j/Defense_1) + ... + (Score_nj/Defense_n)

That is, the Offense rating of Team j is the sum, over all opposing teams, of the points Team j scored against each opponent divided by that opponent's Defense rating. When the opposing team is a good defender (i.e., has a low Defense rating), Team j gets a relatively higher Offense rating. When the opposing team is a poor defender (i.e., has a high Defense rating), Team j gets a relatively lower Offense rating.

The formula for calculating the Defense rating for Team i is the reverse:

Defense_i = (Score_i1/Offense_1) + ... + (Score_in/Offense_n)

Again, you are rewarded for holding down the scoring of teams with potent offenses, and penalized for giving up points to teams with weak offenses.

Since Offense and Defense are interdependent, they must be iteratively approximated. The method varies slightly from what we've seen before. In this case, we will estimate the (k)th iteration of Offense using the (k-1)th iteration values for Defense, and then the (k)th iteration of Defense using the (k)th iteration values for Offense that we just calculated (not the (k-1)th values as you might have expected).
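Here is a minimal sketch of that alternating update in Python. The function name, the starting values of 1.0, and the fixed iteration count are my own assumptions for illustration (a real implementation would probably test for convergence), and the three-team scores are made up. It also assumes every team has scored against at least one opponent, so no rating is ever zero.

```python
def govan_ratings(score, iterations=100):
    """Alternating Offense/Defense updates.

    score[i][j] is the points scored by team j against team i
    (the post's Score_ij). Returns (offense, defense) lists.
    """
    n = len(score)
    offense = [1.0] * n
    defense = [1.0] * n  # assumed starting point: every rating is 1
    for _ in range(iterations):
        # k-th Offense uses the (k-1)-th Defense values.
        offense = [
            sum(score[i][j] / defense[i] for i in range(n))
            for j in range(n)
        ]
        # k-th Defense uses the k-th Offense values just computed.
        defense = [
            sum(score[i][j] / offense[j] for j in range(n))
            for i in range(n)
        ]
    return offense, defense


# Made-up round robin: team 0 scores the most and allows the least.
score = [
    [0, 70, 60],   # points allowed by team 0 to teams 1 and 2
    [80, 0, 65],   # points allowed by team 1 to teams 0 and 2
    [75, 72, 0],   # points allowed by team 2 to teams 0 and 1
]
offense, defense = govan_ratings(score)
```

In this toy schedule team 0 ends up with the highest Offense rating and the lowest Defense rating, which matches the intuition that high Offense and low Defense are both good.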

Not surprisingly, this can be expressed as matrix calculations. See the paper for details.
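For reference, the matrix form follows directly from the two formulas above. The notation here is my own and may differ from the paper's: A is the matrix with entries A_ij = Score_ij, o and d are the column vectors of Offense and Defense ratings, and the superscript ÷ means taking the reciprocal of each element.

```latex
\mathbf{o}^{(k)} = A^{\mathsf{T}} \left(\mathbf{d}^{(k-1)}\right)^{\div},
\qquad
\mathbf{d}^{(k)} = A \left(\mathbf{o}^{(k)}\right)^{\div}
```

Row j of the first product is the sum over i of Score_ij/Defense_i, and row i of the second is the sum over j of Score_ij/Offense_j, which are exactly the Offense and Defense formulas.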

One significant problem with this system is that the ratings do not (necessarily) converge unless every team plays every other team. We can solve this by adding a phantom tie game between every pair of teams.
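One way to sketch the phantom tie game, assuming Score_ij holds the total points team j scored against team i across all their meetings: add the same tie score to every off-diagonal entry of the score matrix. The function name and the example numbers are made up for illustration.

```python
def add_phantom_ties(score, tie_score):
    """Return a copy of score with one phantom tie game added
    between every pair of distinct teams.

    In a tie each side scores tie_score, so every off-diagonal
    entry grows by tie_score. The diagonal (a team against
    itself) is left at its original value.
    """
    n = len(score)
    return [
        [score[i][j] + (tie_score if i != j else 0.0) for j in range(n)]
        for i in range(n)
    ]


# Made-up sparse schedule: teams 0 and 2 never played each other.
sparse = [
    [0, 70, 0],
    [80, 0, 65],
    [0, 72, 0],
]
padded = add_phantom_ties(sparse, 1.0)
```

After padding, every pair of teams has a nonzero score in both directions, so no team is missing an opponent in the sums.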

There are a couple of interesting features of this system. First, this system doesn't consider Won-Loss records at all; you get rewarded equally regardless of whether you score 80 points in a loss or in a win. Second, this rating system rewards teams that play more games. Because Offense is a sum over games, a team that plays 30 games is going to have a higher Offense rating than a comparable team that plays only 22.

So how does Govan perform?

Predictor	% Correct	MOV Error
TrueSkill + iRPI	72.9%	11.01
IMOV	72.7%	11.05
Govan (baseline)	71.8%	11.44

The baseline performs about as well as the improved LRMC. One potential problem here is that the linear regression can only combine the Offense and Defense terms additively. Given how they are defined, we might guess that Offense/Defense would be a better predictive rating than Offense + Defense. If we define a combined rating Combined = Offense/Defense and add that to the model we get a slight improvement:

Predictor	% Correct	MOV Error
TrueSkill + iRPI	72.9%	11.01
IMOV	72.7%	11.05
Govan (baseline)	71.8%	11.44
Govan (w/ Combined)	71.8%	11.43
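The Combined rating is simple to compute once Offense and Defense are in hand. A tiny sketch, with a made-up function name and made-up rating values:

```python
def combined_rating(offense, defense):
    """Combined = Offense / Defense for each team.

    High Offense is good and low Defense is good, so the ratio
    points in a single "higher is better" direction.
    """
    return [o / d for o, d in zip(offense, defense)]


# Illustrative values only: team 0 has the better offense and defense.
example_offense = [150.0, 120.0]
example_defense = [0.9, 1.1]
combined = combined_rating(example_offense, example_defense)
```

Here team 0's Combined rating comes out higher than team 1's, as you would want.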

For the model above, I had set the phantom tie game to a score of 68-68. (68 is the average score in NCAA basketball, so I reasoned this would perturb the ratings the least.) [Govan 2009] uses small numbers for this tie game, so as an experiment, I set the score of the phantom tie game to 1-1, and then to 0.1-0.1:

Predictor	% Correct	MOV Error
TrueSkill + iRPI	72.9%	11.01
IMOV	72.7%	11.05
Govan (baseline)	71.8%	11.44
Govan (w/ Combined)	71.8%	11.43
Govan (w/ Combined; 1-1)	72.2%	11.26
Govan (w/ Combined; 0.1-0.1)	72.6%	11.21

This provides a noticeable improvement (from 71.8% to 72.6% correct, and from 11.43 to 11.21 in MOV error), although it is not yet competitive with the best of breed.

As mentioned above, one of the odd features of this system is that it rewards playing more games. We can try to address that by dividing the Offense and Defense ratings by the number of games played:

Offense_j = (1/N_j) * [(Score_1j/Defense_1) + ... + (Score_nj/Defense_n)]
Defense_i = (1/N_i) * [(Score_i1/Offense_1) + ... + (Score_in/Offense_n)]

This will hopefully remove any reward for playing more games while still permitting the solution to converge:

Predictor	% Correct	MOV Error
TrueSkill + iRPI	72.9%	11.01
IMOV	72.7%	11.05
Govan (w/ Combined; 0.1-0.1)	72.6%	11.21
Govan (w/ Combined; 0.1-0.1, normalized)	73.5%	10.80

And so it does, providing a big boost to performance -- making this version of Govan the best performing rating so far!
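A sketch of the normalized iteration, under the same assumptions as before (made-up function name, starting values of 1.0, fixed iteration count, invented scores). The `games` list holds the number of games each team played. The example runs the same per-game scoring twice, once with every matchup played once and once with every matchup played twice.

```python
def govan_ratings_normalized(score, games, iterations=100):
    """Offense/Defense iteration with each sum divided by the
    number of games the team played.

    score[i][j] is the points scored by team j against team i;
    games[t] is the number of games team t played.
    """
    n = len(score)
    offense = [1.0] * n
    defense = [1.0] * n  # assumed starting point
    for _ in range(iterations):
        offense = [
            sum(score[i][j] / defense[i] for i in range(n)) / games[j]
            for j in range(n)
        ]
        defense = [
            sum(score[i][j] / offense[j] for j in range(n)) / games[i]
            for i in range(n)
        ]
    return offense, defense


# Made-up round robin, each pair playing once (2 games per team).
once = [
    [0, 70, 60],
    [80, 0, 65],
    [75, 72, 0],
]
# Same per-game scores, each pair playing twice (4 games per team).
twice = [[2 * s for s in row] for row in once]

off_a, def_a = govan_ratings_normalized(once, [2, 2, 2])
off_b, def_b = govan_ratings_normalized(twice, [4, 4, 4])
```

The two schedules produce the same ratings, because the extra games are divided back out. Without the 1/N factor, the doubled schedule would inflate the Offense ratings.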

Tweaks like eliminating close games or blowouts don't really apply to this rating, because the rating is based solely on points scored, and not the game outcome. Nonetheless, it's worth a quick experiment to see if ignoring close games has a positive effect:

Predictor	% Correct	MOV Error
TrueSkill + iRPI	72.9%	11.01
Govan (best)	73.5%	10.80
Govan (mov-cutoff=2)	73.5%	10.82

Answer: no.

We'll stop for now with tweaking this rating. One thing we'll want to look at in the future is combining this with other ratings. Because Govan does not use the Won-Loss record, it may capture information not captured by ratings like TrueSkill that don't consider scores at all, and so properly combining the two might improve the overall performance. Or not -- that's an experiment for another day!

Saturday, June 11, 2011
