Monday, August 8, 2011

Regularization in PMM

I'm back from vacation and slowly finding some time to work on prediction.  One of the things I'm doing is revisiting the code for "Probabilistic Matrix Model" (PMM).  This model is based upon the code Danny Tarlow released for his tournament predictor, which he discusses here. At the heart of this code is an algorithm to minimize the error between the predicted scores and the actual scores using batch gradient descent.  This is something I'll want to do frequently in the next stage of development (e.g., to predict the number of possessions in a game), so I'm looking at adapting Danny's code.  (Or rather, adapting my adaptation of Danny's code :-)
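For readers who haven't seen it, here is a minimal sketch of what batch gradient descent looks like for this kind of score model. It is not Danny's code. The function names (fit_scores, mean_squared_error), the (team_i, team_j, score_i) game format, and the assumption that a predicted score is the dot product of one team's offensive vector with the other team's defensive vector are all mine, chosen only for illustration.

```python
import random


def predict(off, dfn, i, j):
    # Assumed model: team i's score against team j is dot(off[i], dfn[j]).
    return sum(a * b for a, b in zip(off[i], dfn[j]))


def mean_squared_error(games, off, dfn):
    return sum((predict(off, dfn, i, j) - s) ** 2 for i, j, s in games) / len(games)


def fit_scores(games, n_teams, dim=2, lr=0.05, epochs=2000, seed=0):
    """Batch gradient descent on the squared score error.

    games: list of (team_i, team_j, score_i) tuples (hypothetical format).
    """
    rng = random.Random(seed)
    off = [[rng.uniform(0.5, 1.5) for _ in range(dim)] for _ in range(n_teams)]
    dfn = [[rng.uniform(0.5, 1.5) for _ in range(dim)] for _ in range(n_teams)]
    n = len(games)
    for _ in range(epochs):
        g_off = [[0.0] * dim for _ in range(n_teams)]
        g_dfn = [[0.0] * dim for _ in range(n_teams)]
        # "Batch": accumulate the gradient over every game before updating.
        for i, j, score in games:
            err = predict(off, dfn, i, j) - score
            for k in range(dim):
                g_off[i][k] += err * dfn[j][k]
                g_dfn[j][k] += err * off[i][k]
        # One parameter update per full pass over the training set.
        for t in range(n_teams):
            for k in range(dim):
                off[t][k] -= lr * g_off[t][k] / n
                dfn[t][k] -= lr * g_dfn[t][k] / n
    return off, dfn


if __name__ == "__main__":
    # Made-up toy scores, only to show the call.
    toy = [(0, 1, 3.0), (1, 0, 2.0), (0, 2, 4.0), (2, 0, 1.0)]
    off, dfn = fit_scores(toy, n_teams=3)
    print(mean_squared_error(toy, off, dfn))
```

The same loop should carry over to other targets (possessions, say) by swapping what goes in the third slot of each game tuple, which is why I want a reusable version of it.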

Danny's code differs in a couple of ways from a straightforward batch gradient descent.  One difference is that Danny has added in a regularization step.  Danny made this comment about regularization:
"In addition, I regularize the latent vectors by adding independent zero-mean Gaussian priors (or equivalently, a linear penalty on the squared L2 norm of the latent vectors). This is known to improve these matrix-factorization-like models by encouraging them to be simpler, and less willing to pick up on spurious characteristics of the data."
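In gradient descent terms, a penalty of lam times the squared L2 norm adds 2 * lam * x to the gradient of each latent value x, which pulls every vector toward zero on each step. A minimal sketch of that idea follows. The names regularized_step and lam are hypothetical and not taken from Danny's code, and his actual penalty weighting may differ.

```python
def l2_penalty(vectors, lam):
    # Penalty term added to the error: lam * sum of squared entries.
    return lam * sum(x * x for vec in vectors for x in vec)


def regularized_step(vec, data_grad, lam, lr):
    """One gradient step on (data error + lam * ||vec||^2).

    data_grad is the gradient of the data error alone; the penalty
    contributes an extra 2 * lam * x for each entry x of vec.
    """
    return [x - lr * (g + 2.0 * lam * x) for x, g in zip(vec, data_grad)]


if __name__ == "__main__":
    vec = [1.0, -2.0]
    no_data_pull = [0.0, 0.0]
    # With a zero data gradient, only the penalty moves the vector.
    print(regularized_step(vec, no_data_pull, lam=0.0, lr=0.1))
    print(regularized_step(vec, no_data_pull, lam=0.5, lr=0.1))
```

Setting lam to 0 recovers the plain, unregularized update, so turning regularization off is a one-parameter change in a sketch like this.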
I theorize that with a large, diverse training set such as the one I'm using, regularization is unnecessary.  To test that, I re-ran the PMM without any regularization:

  Predictor                   % Correct    MOV Error
  PMM (w/o regularization)    71.8%        11.20

Performance is almost identical to the regularized version, so indeed there doesn't seem to be any value in regularization, at least with this training set.
