Wednesday, February 20, 2013

Predictive Ratings vs. Achievement Ratings

In this recent posting, I commented that the AP voters were continuing to under-rank Florida.  The PM ranked Florida #1 (in a virtual tie with Indiana), while the AP voters had them at #7.  In a comment on that posting (and in his own blog postings here and here), Monte McNair makes a distinction between predictive ratings and achievement ratings.  The former are measured by how well they predict future games; the latter by how well they reflect what teams have accomplished.

The question I want to ponder for a moment is whether that's a meaningful distinction.

As a mental exercise, imagine that we decided to create a rating system for the sport of competitive lifting.  Based upon how much weight competitors lifted in various competitions, we'd assign them a rating that would reflect their weightlifting ability.  What would this rating represent?  Most people would say that it represents (or measures) the "strength" of the competitor.

That seems like a silly exercise for weightlifting, because we already have a direct measure of the competitors' strength -- how much they lifted.  It makes more sense for sports like basketball, where we understand that the final score doesn't directly measure a team's "basketball strength" but is instead a complicated function of the two teams' strengths and factors like the officiating crew, the venue, and so on.  The rating is intended to tease out the hidden variable -- the team's basketball strength -- which cannot be measured directly.  So a rating represents a team's basketball strength, and usually a higher number represents more strength.
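
To make the "hidden variable" idea concrete, here is a minimal sketch of one way to back out strengths from scores. It assumes (my simplification for illustration, not a description of any particular rating system) that the margin of a game is roughly the difference between the two teams' strengths, and it ignores venue, officiating, and everything else. The function name `fit_ratings` and the sample games are hypothetical.

```python
def fit_ratings(games, sweeps=200):
    """Estimate one strength number per team from game margins.

    games: list of (team_a, team_b, margin), where margin is team_a's
    score minus team_b's score.
    Assumed model: margin ~ rating[team_a] - rating[team_b].
    Returns a dict of ratings centered so they average to zero.
    """
    teams = sorted({t for a, b, _ in games for t in (a, b)})
    ratings = {t: 0.0 for t in teams}
    for _ in range(sweeps):
        # Gauss-Seidel style sweep: set each team's rating to the average of
        # (its margin + the opponent's current rating) over its games.
        for t in teams:
            total, count = 0.0, 0
            for a, b, margin in games:
                if a == t:
                    total += margin + ratings[b]
                    count += 1
                elif b == t:
                    total += -margin + ratings[a]
                    count += 1
            ratings[t] = total / count
        # Only rating differences matter, so pin the average at zero.
        mean = sum(ratings.values()) / len(ratings)
        for t in teams:
            ratings[t] -= mean
    return ratings


# Made-up results for three made-up teams.
sample_games = [("A", "B", 5), ("B", "C", 5), ("A", "C", 10)]
sample_ratings = fit_ratings(sample_games)
```

With these made-up margins the fitted ratings put A about 5 points above B and B about 5 points above C, which is the sense in which a higher number represents more strength.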

Now let's return to the distinction between predictive ratings and achievement ratings.

There's an easy and intuitive understanding of how to assess a predictive rating:  We test its ability to predict future games.  A rating that does that better is a better predictive rating.  The achievement rating looks backward rather than forward, so we should assess an achievement rating by how well it predicts past games.  A rating that does that better is a better achievement rating.

Here's the rub, though:  those are the same things!  The more accurately a rating reflects the true "basketball strength" of a team, the better it will perform predicting all of the team's games -- whether they have already occurred or are in the future.
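
One way to see that the two assessments are the same test is to write the scoring function down: it takes a set of ratings and a list of games, and nothing in it cares whether those games have been played yet. The sketch below is illustrative only; `pick_accuracy`, the ratings, and the game results are all hypothetical.

```python
def pick_accuracy(ratings, games):
    """Fraction of games in which the higher-rated team was the winner.

    ratings: dict mapping team name to rating.
    games: list of (team_a, team_b, margin), margin = team_a score - team_b score.
    """
    correct = 0
    for a, b, margin in games:
        rating_picks_a = ratings[a] > ratings[b]
        a_actually_won = margin > 0
        if rating_picks_a == a_actually_won:
            correct += 1
    return correct / len(games)


# Made-up ratings and results, purely for illustration.
ratings = {"A": 5.0, "B": 0.0, "C": -5.0}
past_games = [("A", "B", 3), ("B", "C", 7), ("C", "A", 2)]  # already played
future_games = [("A", "C", 10)]                             # played later

past_score = pick_accuracy(ratings, past_games)
future_score = pick_accuracy(ratings, future_games)
```

The same function scores both lists; only the games handed to it differ.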

Monte also argues in this posting that:
"When assessing how well a team has played over a season, the only factors that should come into play are: (1) how often did you win and (2) how difficult was your schedule."
I think there's a simple counter-example to this notion.  Imagine that going into the last game of conference play, Indiana and Michigan have played exactly the same schedule of opponents, and they're about to play each other.  They each played Butler in the third game of the season, but I won't tell you how that game came out.  In all the other games, Indiana beat each opponent by at least 12 points, while Michigan never won by more than 6 points and went to OT in three of the games.

Now I ask you two questions:  (1) Who was more likely to have won against Butler when they played early in the season?  and (2) Who is more likely to win when they play each other tonight on a neutral floor?

My guess is that almost everyone would answer Indiana to both questions -- which means that Indiana should be rated higher than Michigan.  Regardless of whether you're trying to assess what a team has already achieved or how it might perform in a future game, how a team wins (or loses) a game is very important.

Of course, you may reject the notion that ratings should reflect a team's "basketball strength".  But then I challenge you to express clearly what a rating should mean.  I think you'll find it very hard to find a meaningful definition that doesn't come back to being an accurate measure of a team's strength.


  1. I think we still have a misunderstanding. I disagree when you say "The achievement rating looks backward rather than forward, so we should assess an achievement rating by how well it predicts past games. A rating that does that better is a better achievement rating." A predictive rating will obviously be better at predicting both future and past games.

    To take your weightlifting example, let's say lifter A averages 300 pounds on his lift and lifter B 280 pounds. Those are their predictive ratings. In their 30-lift competition, however, A averaged 298 pounds compared to 299 for B. Who deserves to be crowned champion? While A is the "better" lifter, B deserves to be named the winner.

    The same thing applies in rewarding teams in sports. In professional leagues, the schedules are close enough that leagues essentially ignore the strength of schedule portion and call it a wash, awarding playoff spots based on wins. In college basketball, schedules are so disparate that we cannot go simply on wins, so we need to "adjust" those wins based on schedule difficulty. Take college football, for example: Notre Dame by nearly all predictive ratings was not one of the top 2 teams in the nation, but they went undefeated against a strong schedule, so I would say they undoubtedly DESERVED to be in the national title game. For another example, take MLB. The Orioles were likely not the best non-division winner, but they deserved to make the playoffs based on their record. Would you argue that the Angels (likely a "better" team) should have instead been named to the playoffs?

  2. To me, the weakness of this argument is that you take into account SOS. That's because all wins are not equal -- a win over a weak opponent is simply not as good as a win over a strong opponent. Once you've admitted that all wins are not equal, it makes sense to look not only at SOS but quality of the win.

    Your baseball playoff example isn't that relevant because the playoffs by rule go to the team with the better record. But if you wanted to have the best teams in the playoff, then the Angels should have gone.

    Imagine the old days of the NCAA tournament when only one team per conference was invited. At the end of the season, two teams in a conference are tied atop the standings with identical records. One team won all its games by 20 points; the other won all its games by 2 points. Which one is "deserving" of the bid?

    Also, recall that when teams were rewarded solely on won-loss records, the result was that teams scheduled as many cupcake opponents as they could. Is that really the way to get the most "deserving" teams?

  3. I guess we'll have to disagree on this one, but as long as the object of the game is to win, you can't penalize teams for not winning by a lot. Simple case: two teams, same schedule. One team goes 11-1, the other goes 10-2, but the 10-2 team would be favored by 5 points over the 11-1 team. You have to give the nod to the 11-1 team. You mention the "quality" of the win, and I think you mean that winning by 10 is more impressive than winning by 5. But winning by 10 when you hit 80% of your 3PA is NOT as impressive as winning by 10 when you hit 20% of your 3PA. 3P% in a single game is much more luck than offensive rebounding or 2P%. Team B (that won by 10 with 20% 3P%) would be favored over Team A were they to play.

    In your old NCAA Tournament hypothetical, to me those teams are exactly equally deserving. The last paragraph isn't relevant here; I am not for rewarding teams solely based on win-loss record with no regard for their schedule. My system is pretty simple: how many games would a baseline team have won against your schedule and how many games did YOU win against that schedule? If you are 9-4 against a baseline 7-6 schedule, that's the same to me as being 12-1 against an easier baseline 10-3 schedule.

    I don't see why once we take into account SOS that means that we also switch from wins to peripherals. Not taking into account SOS is really just a special case where the SOS are equal. In pro leagues where they do not take into account SOS, all wins STILL are not equal. You still will have teams with worse records that are "better" than teams with better records. If you want to pick the "best" teams to go to the NCAA Tournament, then you must also prefer that the "best" MLB or NFL teams make the postseason.
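
The baseline comparison described in the third comment can be sketched in a few lines. This is only my reading of that description, not the commenter's actual implementation: I assume each game on a schedule comes with a probability that a "baseline" team would win it, and that the achievement number is actual wins minus the baseline team's expected wins. The function name and the per-game probabilities are hypothetical, chosen so the two schedules reproduce the 7-6 and 10-3 baselines from the comment.

```python
def wins_above_baseline(actual_wins, baseline_win_probs):
    """Actual wins minus the wins a baseline team would expect on this schedule.

    baseline_win_probs: one probability per game that the (hypothetical)
    baseline team would win that game.
    """
    expected_baseline_wins = sum(baseline_win_probs)
    return actual_wins - expected_baseline_wins


# Two made-up 13-game schedules.
harder_schedule = [0.5] * 12 + [1.0]    # baseline expects 7 wins (7-6)
easier_schedule = [0.75] * 12 + [1.0]   # baseline expects 10 wins (10-3)

team_9_4 = wins_above_baseline(9, harder_schedule)
team_12_1 = wins_above_baseline(12, easier_schedule)
```

Both teams come out two wins above baseline, which matches the comment's claim that 9-4 against the harder schedule and 12-1 against the easier one are equivalent. Note that margin of victory never enters the calculation, which is exactly the point under dispute in this thread.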