The next step is to predict the number of possessions when two teams play each other. Why would we want to do this? Consider the situation where Maryland beats Duke by 5 points. How strong is that evidence that Maryland is better than Duke? Well, if the final score was 45-40, we might consider that stronger evidence than if the score was 125-120, because a 5-point margin is a much larger share of the scoring in a slow game with few possessions. The number of possessions is also used to calculate "tempo-free statistics," which allow us to make better apples-to-apples comparisons between games played at different paces.
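To make the 45-40 versus 125-120 comparison concrete, here is a minimal sketch that expresses a scoring margin per 100 possessions. The possession counts (55 and 110) are made up for illustration, and `margin_per_100` is a hypothetical helper name, not something from the game data.

```python
def margin_per_100(points_for, points_against, possessions):
    """Scoring margin scaled to a common base of 100 possessions."""
    return 100.0 * (points_for - points_against) / possessions


# Hypothetical possession counts: a slow game and a fast game,
# both decided by the same 5-point margin.
slow_game = margin_per_100(45, 40, possessions=55)
fast_game = margin_per_100(125, 120, possessions=110)

print(f"slow game: {slow_game:.1f} points per 100 possessions")  # about 9.1
print(f"fast game: {fast_game:.1f} points per 100 possessions")  # about 4.5
```

Under these assumed counts, the same 5-point win is worth roughly twice as much per possession in the slow game.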
So how do we predict the number of possessions? The model I'm using right now supposes that each team has a preferred pace -- i.e., an ideal number of possessions per game. Some teams would like to play games with lots of running up and down the court and many possessions; others would like to play a very slow, controlled game. When two teams meet, each tries to play the game at its preferred pace, and as a result the game is played somewhere in between:
$$\mathrm{Poss}_{\mathrm{pred}} = \frac{\mathrm{PreferredPoss}_{\mathrm{Home}} + \mathrm{PreferredPoss}_{\mathrm{Away}}}{2}$$
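As a quick illustration, the averaging formula above is a one-liner in code. The function name and the example paces (72 and 64) are hypothetical, chosen only to show the arithmetic.

```python
def predict_possessions(preferred_home, preferred_away):
    """Predicted possessions: the midpoint of the two teams' preferred paces."""
    return (preferred_home + preferred_away) / 2.0


# Hypothetical preferred paces: an up-tempo home team and a slow away team.
predicted = predict_possessions(72.0, 64.0)
print(predicted)  # 68.0
```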
Of course, we don't know the "preferred pace" for a team, so we have to try to discover that from the game data. One way to do that is gradient descent, as was used for Danny Tarlow's PMM. If we do that, and then test the predictor in the same way we've tested the others, we get this performance:
Is that good performance? If we look at the distribution of possessions/game:
I'm inclined to say the predictor is "meh" -- not particularly good, but probably good enough to be useful.
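For readers who want to see what fitting the preferred paces might look like, here is a minimal sketch of plain batch gradient descent on the squared error of the averaging model. It is not the code behind the results above, and it may differ from the PMM approach: the function name, learning rate, epoch count, and the three-team toy schedule are all assumptions for illustration.

```python
from collections import defaultdict


def fit_preferred_pace(games, learning_rate=1.0, epochs=500):
    """Fit one preferred pace per team by batch gradient descent.

    games: list of (home_team, away_team, actual_possessions) tuples.
    Minimizes the mean of 0.5 * (predicted - actual)**2 over all games,
    where predicted is the average of the two teams' preferred paces.
    """
    n = len(games)
    # Start every team at the overall mean possessions per game.
    mean_poss = sum(poss for _, _, poss in games) / n
    pace = {}
    for home, away, _ in games:
        pace.setdefault(home, mean_poss)
        pace.setdefault(away, mean_poss)

    for _ in range(epochs):
        grad = defaultdict(float)
        for home, away, actual in games:
            error = (pace[home] + pace[away]) / 2.0 - actual
            # Each team contributes half of the prediction, so each
            # gets half of the error as its share of the gradient.
            grad[home] += 0.5 * error / n
            grad[away] += 0.5 * error / n
        for team, g in grad.items():
            pace[team] -= learning_rate * g
    return pace


# Toy schedule (made-up numbers) that is exactly consistent with
# preferred paces of A=70, B=60, C=80.
toy_games = [("A", "B", 65), ("B", "C", 70), ("A", "C", 75)]
fitted = fit_preferred_pace(toy_games)
for team in sorted(fitted):
    print(team, round(fitted[team], 2))
```

On real game data the possession counts are noisy, so the fitted paces would only approximate each game, and the learning rate and number of epochs would likely need tuning.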