Alternative Modelling Algorithms & Approaches

lackofintelligence · July 5, 2020, 5:27pm

Yup, I figured out how to do it. The mysterious use is to fit a beta distribution to the out-of-sample scores obtained during cross-validation. Why the beta distribution? Because it naturally lives on an interval and can carry skew and excess kurtosis, unlike the Normal distribution, which lives on the whole real line and has neither. We can map the beta distribution to the interval of correlations, [-1,1]. Once you have fitted the distribution of scores by maximum likelihood, you can derive any kind of estimator you like from it. In particular, maximum likelihood estimates are robust against the spurious fluctuations that are statistically guaranteed to occur during parameter optimization. It fits our data nicely.
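For anyone who wants to try the fitting step, here is a minimal sketch using SciPy, not necessarily how I did it. It assumes the scores are correlations strictly inside (-1, 1). The name `fit_beta_on_correlations` and the synthetic `cv_scores` are made up for illustration; swap in your own per-fold or per-era scores. Fixing `floc=-1` and `fscale=2` pins the support to [-1,1], so only the two shape parameters are estimated.

```python
import numpy as np
from scipy import stats


def fit_beta_on_correlations(scores):
    """Maximum likelihood fit of a beta distribution supported on [-1, 1].

    Returns the two fitted shape parameters (a, b).
    """
    scores = np.asarray(scores, dtype=float)
    # floc/fscale fix the location and scale, so the support is [-1, 1]
    # and only the shape parameters a and b are free.
    a, b, _loc, _scale = stats.beta.fit(scores, floc=-1.0, fscale=2.0)
    return a, b


# Synthetic stand-in for cross-validation scores (NOT real tournament data):
# draw from a beta on [0, 1] and map linearly onto [-1, 1].
rng = np.random.default_rng(0)
cv_scores = 2.0 * rng.beta(30.0, 25.0, size=200) - 1.0

a, b = fit_beta_on_correlations(cv_scores)
fitted = stats.beta(a, b, loc=-1.0, scale=2.0)

# Any estimator can now be read off the fitted distribution,
# e.g. mean, variance, skew and excess kurtosis.
mean, var, skew, kurt = fitted.stats(moments="mvsk")
print(f"a={a:.2f} b={b:.2f} mean={float(mean):.4f} skew={float(skew):.4f}")
```

Note that the fit will fail if any score sits exactly on -1 or 1, since the fixed-support likelihood is not defined there.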

I also considered the Logit-Normal, but in the limit where the standard deviation goes to zero, its skew and excess kurtosis also go to zero, which is contrary to what is observed. I am exploring the Beta-Ratio: the ratio of the areas of the fitted beta distribution above and below some threshold. It is in some sense similar to the Sortino ratio, but it remains finite wherever you set the threshold inside the interval.
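A rough sketch of how the Beta-Ratio could be computed, assuming it means the area above the threshold divided by the area below it (that ordering is my reading here, and the function name `beta_ratio` is just illustrative). The shape parameters `a` and `b` are taken to come from a fit on [-1,1].

```python
from scipy import stats


def beta_ratio(a, b, threshold=0.0):
    """Area of the fitted beta above `threshold` divided by the area below it.

    The beta distribution is placed on [-1, 1] via loc=-1, scale=2.
    """
    dist = stats.beta(a, b, loc=-1.0, scale=2.0)
    above = dist.sf(threshold)   # P(score > threshold)
    below = dist.cdf(threshold)  # P(score <= threshold)
    return above / below


# A symmetric distribution splits evenly around 0, so the ratio is 1.
symmetric = beta_ratio(5.0, 5.0, threshold=0.0)

# Shape parameters with a > b shift mass toward positive correlations,
# so the ratio at threshold 0 is larger than 1.
positive_tilt = beta_ratio(30.0, 25.0, threshold=0.0)
```

Both areas are strictly positive for any threshold strictly between -1 and 1, which is why the ratio stays finite there.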
