Submission core metrics

Hi,
I wanted to clarify some doubts about the basic metrics shown after a submission is made:

Validation correlation: The mean of your per-era correlations.
Is this computed from the predictions on the validation dataset and the actual targets of the validation dataset?
The code should look something like this (taken from the GitHub examples):

import numpy as np

# PREDICTION_NAME, TARGET_NAME and tournament_data are defined in the example script
def score(df):
    # rank the predictions, then correlate the ranks with the targets
    pct_ranks = df[PREDICTION_NAME].rank(pct=True, method="first")
    targets = df[TARGET_NAME]
    return np.corrcoef(targets, pct_ranks)[0, 1]

# per-era correlation on the validation rows only, then the mean across eras
validation_data = tournament_data[tournament_data.data_type == "validation"]
validation_correlations = validation_data.groupby("era").apply(score)
validation_correlations_web = validation_correlations.mean()

Validation Sharpe: This is the mean of your per-era correlations divided by the standard deviation of your per-era correlations.
Based on what I mentioned above regarding Validation Correlation, Validation Sharpe should look like this:

validation_sharpe_web = validation_correlations.mean() / validation_correlations.std()

Is that right?

Corr With Example Preds: This is the correlation between your model's predictions and the example predictions.
What are the example predictions, and against which dataset is this correlation calculated?

Thanks in advance.


Your understanding of val corr and val sharpe is correct. Regarding the example preds: the example predictions are already included in the downloaded Numerai data as a CSV file, so you can compare them with your preds using the corr metric. These example preds are generated by the example_model.py file, which is also included in the downloaded data.
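
If it helps, here is a minimal sketch of how you could compare your own predictions with the example ones. The file names example_predictions.csv and my_submission.csv, the id index and the prediction column name are assumptions, so adjust them to whatever your downloaded data and your submission file actually use; the site may also compute its version of the metric slightly differently.

import pandas as pd

# file names, the "id" index and the "prediction" column are assumptions;
# change them to match your downloaded data bundle and your own submission
example_preds = pd.read_csv("example_predictions.csv", index_col="id")["prediction"]
my_preds = pd.read_csv("my_submission.csv", index_col="id")["prediction"]

# align the two series on id and correlate the percentile-ranked predictions
aligned = pd.concat(
    [my_preds.rank(pct=True), example_preds.rank(pct=True)],
    axis=1, join="inner", keys=["mine", "example"],
)
corr_with_example = aligned["mine"].corr(aligned["example"])
print(corr_with_example)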
