This is a little trickier than you might expect, because my prediction model uses linear regression. Linear regression works fine when we're looking for the relationship between two numerical variables (e.g., how rebounds per game affect score), but it doesn't work so well with polynominal (not polynomial!) variables. A polynominal variable is one that takes on a number of discrete, non-numeric values. In this case, day of the week can be Monday, Tuesday, Wednesday and so on.

To use a polynominal variable in linear regression, we turn it into a number of binominal variables. In this case, we create a new variable called "DOW = Monday" and give it a 1 or 0 value depending upon whether or not the day of the game is Monday. We do this for each possible value of the polynominal variable, so in this case we end up with seven new variables. We can then use these as input to our linear regression.
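As a rough illustration of that conversion, here is a minimal Python sketch. It is not the tool I actually use for the model; the function name `dummy_code` and the sample games are made up for the example, and only the "DOW = Monday" naming comes from the description above.

```python
# The seven possible values of the polynominal variable.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def dummy_code(day):
    """Return seven binominal (0/1) variables, one per day of the week.

    Exactly one of them is 1: the one matching the day of the game.
    """
    return {f"DOW = {d}": int(d == day) for d in DAYS}

# Hypothetical games: (day of week, home margin of victory).
games = [("Monday", 5), ("Friday", -2), ("Saturday", 8)]

# One row of seven 0/1 inputs per game, ready to feed to a regression.
rows = [dummy_code(day) for day, _ in games]

for (day, _), row in zip(games, rows):
    print(day, row)
```

Each game ends up with seven new inputs, six of which are 0 and one of which is 1.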

When I do so, I find that only one of the new variables has any importance in the regression:

0.6636 * DOW = 4=false

Translating: the term adds about 0.66 to the prediction for every game that is not played on day 4 (Friday), so the home team is at a small disadvantage in Friday games. I leave it up to the reader to explain why that might be true. (Ivy League effect?)

We can also look at whether predictions are more or less accurate on some days. When I do that for my model, I find that the predictions are most accurate for Saturday games and least accurate for Sunday games. The difference in RMSE is about 0.6 points, so it's not an entirely trivial difference.
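For anyone who wants to run the same check on their own predictions, here is a minimal sketch of a per-day RMSE calculation in Python. The function name `rmse_by_day`, the record layout, and the numbers are all invented for illustration; they are not my model's data.

```python
import math
from collections import defaultdict

def rmse_by_day(records):
    """Compute RMSE separately for each day of the week.

    records: iterable of (day, predicted, actual) tuples.
    Returns a dict mapping day -> root mean squared error.
    """
    squared_errors = defaultdict(list)
    for day, predicted, actual in records:
        squared_errors[day].append((predicted - actual) ** 2)
    return {day: math.sqrt(sum(errs) / len(errs))
            for day, errs in squared_errors.items()}

# Made-up (day, predicted margin, actual margin) records.
records = [
    ("Saturday", 4.0, 7.0),
    ("Saturday", -2.0, 1.0),
    ("Sunday", 3.0, -5.0),
    ("Sunday", 1.0, 7.0),
]

result = rmse_by_day(records)
for day, rmse in result.items():
    print(f"{day}: RMSE = {rmse:.2f}")
```

With only a handful of games on some days, the per-day RMSE can swing a lot, so it's worth checking how many games sit behind each number before reading much into the differences.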
