A quick posting to highlight a couple of interesting papers.
The first is "A New Bayesian Rating System for Team Competitions" from ICML 2011. (Special thanks to Henry Ware for pointing me at the paper.) This paper addresses two shortcomings of vanilla TrueSkill: (1) multi-way tie games, and (2) how individuals contribute to team performance. The authors are able to show significantly better performance in their problem domain with some fairly straightforward modifications to TrueSkill.

The results don't really apply to what I'm looking at right now, but they would potentially apply to modeling team performance as a function of all the individual players. I'm not convinced there's much to be gained by trying to model individual basketball players. I foresee at least two significant problems. The first is that there are not enough games to build good individual models of the players (although bridging seasons might help with this). The second is that I don't believe team makeup changes significantly enough during the season to make tracking individuals worthwhile -- but I haven't done any analysis to see whether that intuition is correct or not. Still, inasmuch as players don't contribute in a simplistic additive way to team performance, the work here on alternate methods for "summing" individual contributions would probably be very relevant.
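To make the "summing" point concrete, here's a minimal sketch of what additive team performance looks like when each player's skill is a Gaussian, plus one simple alternative (a weighted sum). This is illustrative only -- the function names, the roster numbers, and the weights are all made up, and the weighted version is just one example of a non-uniform aggregation, not the specific scheme from the paper.

```python
import math

# Each player's skill is a Gaussian (mu, sigma). Vanilla TrueSkill models a
# team's performance as the plain sum of its players' performances.
def team_skill_additive(players):
    """Sum of independent Gaussians: means add, variances add."""
    mu = sum(p[0] for p in players)
    var = sum(p[1] ** 2 for p in players)
    return mu, math.sqrt(var)

# One alternative: weight each player's contribution (say, by expected
# playing time), so stars and bench players don't count equally.
# The weights here are hypothetical, not taken from the paper.
def team_skill_weighted(players, weights):
    """Weighted sum of independent Gaussians."""
    mu = sum(w * p[0] for w, p in zip(weights, players))
    var = sum((w ** 2) * (p[1] ** 2) for w, p in zip(weights, players))
    return mu, math.sqrt(var)

roster = [(25.0, 8.3), (22.0, 6.1), (30.0, 4.0)]
print(team_skill_additive(roster))                    # mean 77.0, sigma ~11.05
print(team_skill_weighted(roster, [1.0, 0.5, 0.8]))   # mean 60.0, sigma ~9.40
```

The additive version is what makes vanilla TrueSkill tractable; the paper's contribution is showing that richer aggregations can be folded into the same message-passing machinery.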
The second paper is "An empirical comparison of supervised learning algorithms." This paper surveys a wide variety of learning algorithms (e.g., neural networks, SVMs, naive Bayes, etc.) over a set of sample problems and categorizes their effectiveness. Obviously the effectiveness of the algorithms depends (somewhat) on the problem set, but this paper presents some convincing evidence that boosted, calibrated decision trees, neural networks, and SVMs are the most effective available algorithms. The paper is an interesting read, and the author's web site is also worth a look.
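For flavor, here's a toy version of the paper's methodology: run several learning algorithms over several problems and tabulate mean accuracy. The two algorithms below (1-nearest-neighbor and a majority-class baseline) are cheap stand-ins for the paper's heavyweights (boosted trees, neural networks, SVMs), and the synthetic two-blob problems are mine, not the paper's benchmark suite.

```python
import random

def one_nn(train, test):
    """Classify each test point by the label of its nearest training point."""
    preds = []
    for x, _ in test:
        nearest = min(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)))
        preds.append(nearest[1])
    return preds

def majority(train, test):
    """Always predict the most common training label (a sanity baseline)."""
    labels = [y for _, y in train]
    top = max(set(labels), key=labels.count)
    return [top] * len(test)

def accuracy(preds, test):
    return sum(p == y for p, (_, y) in zip(preds, test)) / len(test)

def make_problem(rng, n=200):
    """Two 2-D Gaussian blobs: label 1 is shifted right of label 0."""
    data = []
    for _ in range(n):
        y = int(rng.random() < 0.5)
        x = (rng.gauss(2.0 if y else 0.0, 1.0), rng.gauss(0.0, 1.0))
        data.append((x, y))
    return data[: n // 2], data[n // 2 :]

rng = random.Random(0)
results = {}
for _ in range(3):  # three toy "problems"
    train, test = make_problem(rng)
    for name, algo in [("1-NN", one_nn), ("majority", majority)]:
        results.setdefault(name, []).append(accuracy(algo(train, test), test))

for name, scores in results.items():
    print(name, round(sum(scores) / len(scores), 3))
```

The paper's real contribution is doing this carefully -- many algorithms, many datasets, multiple metrics, and probability calibration -- but the harness shape is the same: a grid of (algorithm, problem) scores that you can then rank.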