Participant-centric model benchmark

kayeffnumeraitor · December 14, 2022, 9:41pm

Hello again everyone,

The diagnostics page gives you some insight into your model, provided you trained it only on the vanilla train set. While these metrics can certainly give good feedback about model quality (apart from TC, but let's not dive into that), what I have always found lacking is that they are useful from Numerai's viewpoint, but not from the viewpoint of a tournament participant.

Since the only metric that we are able to back-test AND stake on right now is CORR, I will assume that the participant will stake on CORR only.

A Numerai participant staking on CORR will obviously be burned if the correlation of their predictions is less than zero, and rewarded if it is greater than zero. So the question is: how can the participant minimize the probability of being burned and maximize the probability of getting a reward in any given round?

For that reason, the metric I use for my models is the following: evaluate the rank correlation on all validation eras, assume that the per-era correlation follows a Gaussian distribution, and calculate the probability of a per-era correlation greater than zero.
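A minimal sketch of how this could be computed, not necessarily the exact code behind my numbers. It assumes a pandas DataFrame with hypothetical column names `era`, `prediction`, and `target`, uses plain Spearman correlation as the "rank correlation" (which may differ slightly from Numerai's official CORR scoring), and fits the Gaussian with the sample mean and sample standard deviation of the per-era values:

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def per_era_corr(df, pred_col="prediction", target_col="target", era_col="era"):
    """Spearman rank correlation of predictions vs. targets, one value per era.

    The column names are illustrative assumptions, not a fixed schema.
    """
    corrs = {}
    for era, group in df.groupby(era_col):
        # Rank both columns (ties get the average rank), then take the
        # Pearson correlation of the ranks, which is Spearman's rho.
        ranked_pred = group[pred_col].rank()
        ranked_target = group[target_col].rank()
        corrs[era] = np.corrcoef(ranked_pred, ranked_target)[0, 1]
    return pd.Series(corrs)


def prob_positive_corr(era_corrs):
    """P(per-era corr > 0) under a Gaussian fitted to the per-era correlations."""
    mu = float(np.mean(era_corrs))
    sigma = float(np.std(era_corrs, ddof=1))  # sample standard deviation
    # Needs at least two eras with differing correlations (sigma > 0).
    return 1.0 - NormalDist(mu, sigma).cdf(0.0)
```

Usage would be along the lines of `prob_positive_corr(per_era_corr(validation_df))`, where `validation_df` is a hypothetical frame holding the validation eras. The same two calls can be run on the example predictions, or on a column of uniformly random predictions, to get the benchmarks for comparison.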

As a benchmark, I compare it to the example predictions over the same period, and also to random predictions. Here is such a result from one of my latest models:

This result tells me that, under the assumption that future eras behave similarly to the ones in the validation set, my model should receive a positive correlation in ~85% of the weekly eras, which is also comparable to the example predictions.

murkyautomata · December 15, 2022, 7:23am

If you’re assuming your correlation follows a Gaussian, then your probability of negative CORR is a function of your Sharpe ratio.
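To spell that out, here is a short derivation. It assumes the per-era correlation $c$ is Gaussian with mean $\mu$ and standard deviation $\sigma$, and that Sharpe means the plain per-era ratio $\mu/\sigma$ (not annualized); $\Phi$ is the standard normal CDF:

```latex
c \sim \mathcal{N}(\mu, \sigma^2)
\quad\Longrightarrow\quad
P(c > 0) = 1 - \Phi\!\left(\frac{0 - \mu}{\sigma}\right)
         = \Phi\!\left(\frac{\mu}{\sigma}\right)
         = \Phi(\mathrm{Sharpe}),
\qquad
P(c < 0) = 1 - \Phi(\mathrm{Sharpe}).
```

Under these assumptions, the ~85% quoted above would correspond to a per-era Sharpe of roughly $\Phi^{-1}(0.85) \approx 1.04$.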

kayeffnumeraitor · December 15, 2022, 4:54pm

Yes, but at least for me, the Sharpe ratio is less intuitive than the probability of receiving positive results per round.
