Comments on Net Prophet: What Would a Perfect (Knowledge) Predictor Score in the Kaggle Competition?
Blog author: Scott Turner (http://www.blogger.com/profile/03393071448515738228)
Feed updated: 2017-01-24T09:48:45.360-05:00

Comment posted 2016-03-27T00:41:43.205-04:00:

From a Bayesian perspective, I look at the true probability as a posterior distribution rather than a point estimate. This year I just used the mean posterior parameters to generate point estimates, but I think this was a mistake that kept my entry in the pack. While the mean posterior probability minimizes expected log loss, it also comes with less upside, because the best scores it can reach are less extreme (less variance). Next year I think I'll make one submission minimizing average log loss, and another minimizing my bottom quartile of log loss.

Posted by melondonkey (http://www.blogger.com/profile/02818146576436658055)

Comment posted 2016-03-22T21:22:00.312-04:00:

Scoring these kinds of things is hard because overconfidence can be rewarded (and it is more likely to be if you're on the right side of 50% in the long run), at least if you think of an infinite number of Kaggle competitions. It'd be interesting to see someone create a series of NCAA-like seasons of data with known true probability rates, run the various Kaggle submissions against them, and see whose entries come closest to perfect knowledge.

Something that I think is missing from Kaggle is model skill. This article covers it well for the uninitiated: http://fivethirtyeight.com/features/when-picking-a-bracket-its-easier-to-be-accurate-than-skillful/
Log probabilities get at skill a little bit, because being more certain of the correct response is generally more skillful. However, if you said you were 100% certain of both a 2v15 and a 3v14 game, you'd get the same score for both, even though it's less impressive to predict the 2v15 (where you were only 6% more certain than the naive historical perspective). But of course there is the problem of picking a historically reasonable 'naive' baseline for skill estimation, as you have pointed out.

Posted by indpndnt (http://www.blogger.com/profile/10104482482233171142)
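
To make the skill point above concrete, here is a minimal Python sketch, not anything taken from Kaggle's scoring code. It compares the raw log loss of a 100% pick with its improvement over a naive baseline. The function names `log_loss` and `skill` are made up for illustration. The 0.94 baseline for the 2v15 game is inferred from the "6% more certain" remark, and the 0.85 baseline for the 3v14 game is purely an assumed placeholder.

```python
import math


def log_loss(p_win: float, won: bool, eps: float = 1e-15) -> float:
    """Negative log of the probability assigned to the actual outcome.

    Predictions are clipped away from 0 and 1 so a wrong 100% pick
    yields a large finite penalty instead of infinity.
    """
    p = min(max(p_win, eps), 1.0 - eps)
    return -math.log(p if won else 1.0 - p)


def skill(p_forecast: float, p_baseline: float, won: bool) -> float:
    """Log-loss improvement over a baseline; positive means the forecast did better."""
    return log_loss(p_baseline, won) - log_loss(p_forecast, won)


# Assumed naive historical win rates for the favorite (illustrative only).
baselines = {"2v15": 0.94, "3v14": 0.85}

for game, base in baselines.items():
    raw = log_loss(1.0, won=True)      # identical for every correct 100% pick
    gain = skill(1.0, base, won=True)  # depends on how hard the game was to call
    print(f"{game}: raw log loss = {raw:.4f}, gain over baseline = {gain:.4f}")

# The flip side of overconfidence: a 100% pick that loses is punished heavily.
print(f"100% pick, upset occurs: log loss = {log_loss(1.0, won=False):.2f}")
```

Under these assumed baselines, both correct 100% picks get essentially the same raw log loss, while the gain over the baseline is larger for the 3v14 game. The result still depends entirely on which baseline is chosen, which is the difficulty raised in the comment.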