Submission core metrics

olivepossum · August 2, 2020, 6:06pm

Hi,
I wanted to clarify some doubts about the basic metrics shown after a submission is done:

Validation correlation: The mean of your per-era correlations.
Is this computed using the predictions of the validation data set and the real targets of the validation dataset?
The code should look like this (taken from the GitHub examples):

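import numpy as np

# PREDICTION_NAME, TARGET_NAME and tournament_data are assumed to be defined as in the example script.
def score(df):
    # Convert predictions to percentile ranks, then take the Pearson correlation with the targets.
    pct_ranks = df[PREDICTION_NAME].rank(pct=True, method="first")
    targets = df[TARGET_NAME]
    return np.corrcoef(targets, pct_ranks)[0, 1]

# Keep only the validation rows and score each era separately.
validation_data = tournament_data[tournament_data.data_type == "validation"]
validation_correlations = validation_data.groupby("era").apply(score)
validation_correlations_web = validation_correlations.mean()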

Validation Sharpe: This is the mean of your per-era correlations divided by the standard deviation of your per-era correlations.
Based on what I mentioned above regarding Validation Correlation, Validation Sharpe should look like:

validation_sharpe_web = validation_correlations.mean() / validation_correlations.std()

Is that right?
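
For anyone who wants to try the two formulas above without downloading the data, here is a minimal sketch on synthetic eras. The column names, the toy data generator `make_toy_validation`, and the signal strength are all assumptions made up for illustration; only the `score` function and the mean / std formulas come from the post. Note that pandas `.std()` uses the sample standard deviation (`ddof=1`) by default; whether the website uses the same convention is not stated in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the real dataset's names may differ.
PREDICTION_NAME = "prediction"
TARGET_NAME = "target"


def score(df: pd.DataFrame) -> float:
    """Pearson correlation between targets and percentile-ranked predictions."""
    pct_ranks = df[PREDICTION_NAME].rank(pct=True, method="first")
    return np.corrcoef(df[TARGET_NAME], pct_ranks)[0, 1]


def make_toy_validation(n_eras: int = 12, rows_per_era: int = 200, seed: int = 0) -> pd.DataFrame:
    """Build made-up validation data: a weak signal plus noise in each era."""
    rng = np.random.default_rng(seed)
    frames = []
    for i in range(n_eras):
        target = rng.choice([0.0, 0.25, 0.5, 0.75, 1.0], size=rows_per_era)
        prediction = 0.1 * target + rng.normal(size=rows_per_era)
        frames.append(
            pd.DataFrame(
                {"era": f"era{i + 1}", TARGET_NAME: target, PREDICTION_NAME: prediction}
            )
        )
    return pd.concat(frames, ignore_index=True)


validation_data = make_toy_validation()

# One correlation per era (selecting the two columns keeps the era key out of score()).
per_era = validation_data.groupby("era")[[PREDICTION_NAME, TARGET_NAME]].apply(score)

validation_correlation = per_era.mean()             # mean of per-era correlations
validation_sharpe = per_era.mean() / per_era.std()  # mean / sample std of per-era correlations

print(per_era.round(4))
print("validation correlation:", round(validation_correlation, 4))
print("validation sharpe:", round(validation_sharpe, 4))
```

With only a handful of eras the standard deviation in the denominator is itself noisy, so the Sharpe value from a toy run like this can swing a lot between seeds.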

Corr With Example Preds: This is the correlation between your model and the example predictions.
Which predictions are the example predictions, and against which dataset are they calculated?

Thanks in advance.

udit10 · August 4, 2020, 6:32pm

Your understanding of val corr and val sharpe is correct. Regarding the example preds, the example predictions are already included in the downloaded Numerai data as a CSV file, so you can compare them with your preds using the corr metric. These example preds are generated using the example_model.py file, which is also included in the downloaded data.
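
As a rough sketch of that comparison, the snippet below correlates two prediction series after aligning them on their ids. It rests on several assumptions: the helper `corr_with_example_preds` is hypothetical, the series here are random stand-ins rather than the real CSV (its file name and columns are not given in this thread), and ranking both sides before taking the Pearson correlation is one reasonable choice, not necessarily what the website does. It also computes a single overall correlation rather than a per-era one.

```python
import numpy as np
import pandas as pd


def corr_with_example_preds(my_preds: pd.Series, example_preds: pd.Series) -> float:
    """Correlation between two prediction series, matched on their index (ids)."""
    # Keep only ids present in both series so rows line up.
    joined = pd.concat(
        [my_preds.rename("mine"), example_preds.rename("example")],
        axis=1,
        join="inner",
    )
    # Rank both columns to percentiles, then take the Pearson correlation.
    ranked = joined.rank(pct=True, method="first")
    return np.corrcoef(ranked["mine"], ranked["example"])[0, 1]


# Made-up stand-ins for the real predictions, indexed by row id.
rng = np.random.default_rng(1)
ids = [f"id{i}" for i in range(500)]
example_preds = pd.Series(rng.uniform(size=500), index=ids)
my_preds = 0.5 * example_preds + 0.5 * pd.Series(rng.uniform(size=500), index=ids)

corr_to_example = corr_with_example_preds(my_preds, example_preds)
self_corr = corr_with_example_preds(example_preds, example_preds)

print("corr with example preds:", round(corr_to_example, 4))
print("example preds vs themselves:", round(self_corr, 4))
```

Aligning on ids first matters in practice: if the two files are sorted differently, correlating the raw columns position by position would give a meaningless number.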

aif · October 2, 2020, 10:09am

What is considered a good correlation between my model’s predictions and the example predictions? Is a high or a low correlation more appropriate?

themicon · October 2, 2020, 12:30pm

There is no really good answer to that question. If your predictions are highly correlated with the example predictions, you will most probably not have good MMC. If the correlation between the two is very low, you might have high MMC but very bad CORR on the live data. You could also have very low correlation with the example predictions (be completely orthogonal to them) and get both good MMC and good CORR. There really isn’t any way to know for sure. Some models with high correlation to the example predictions have done really well, and so have some with very low correlation.
