MMC Calculation

Hey everyone,

With the recent scoring changes, I thought I’d post some sample code to compute MMC locally using numerai-tools, since it was a bit confusing for me at first.

For those who also use Colab, I made a notebook that handles setup as well here

I might still have it wrong, so let me know if you have any feedback!

#Install packages with scoring function and numerapi
!git clone
!mv numerai-tools/numerai_tools/ /content/
!pip install numerapi

from numerapi import NumerAPI
import pandas as pd

napi = NumerAPI()


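#Download example predictions, meta model preds, and live targets
napi.download_dataset("v4.2/validation_benchmark_models.parquet", "validation_benchmark_models.parquet")
napi.download_dataset("v4.2/meta_model.parquet", "v4.2/meta_model.parquet")
napi.download_dataset("v4.2/validation_int8.parquet", "v4.2/validation_int8.parquet")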

df_mm = pd.read_parquet("v4.2/meta_model.parquet")

#Get eras that have data from meta model
mm_eras = df_mm["era"].unique()

bm_val = pd.read_parquet("validation_benchmark_models.parquet")

#Get benchmark predictions only for eras that have meta model data
bm_val_recent = bm_val.loc[bm_val["era"].isin(mm_eras)]

#Do the same for live targets
live_targets = pd.read_parquet("v4.2/validation_int8.parquet", columns=["era","target"])
live_targets_recent = live_targets.loc[live_targets["era"].isin(mm_eras)]
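In case the era filtering above isn't obvious, here is a toy version of the same pattern. The ids, eras, and the `model_a` column are made up for illustration and are not from the real files; the only point is that after filtering, the prediction rows share ids with the meta model rows.

```python
import pandas as pd

# Made-up data: the meta model only covers the two most recent eras.
toy_meta = pd.DataFrame(
    {"era": ["0003", "0003", "0004"], "numerai_meta_model": [0.2, 0.7, 0.5]},
    index=["id_c", "id_d", "id_e"],
)
toy_preds = pd.DataFrame(
    {
        "era": ["0001", "0002", "0003", "0003", "0004"],
        "model_a": [0.1, 0.9, 0.3, 0.6, 0.4],  # hypothetical prediction column
    },
    index=["id_a", "id_b", "id_c", "id_d", "id_e"],
)

# Same pattern as in the post: keep only rows whose era appears in the meta model.
toy_eras = toy_meta["era"].unique()
toy_recent = toy_preds.loc[toy_preds["era"].isin(toy_eras)]

# Check whether the filtered ids line up with the meta model's ids.
print(toy_recent.index.equals(toy_meta.index))
```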

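#The mv above puts the numerai_tools package in /content/, so import from the package
from numerai_tools.scoring import correlation_contribution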

#correlation_contribution(predictions: pd.DataFrame, meta_model: pd.Series, live_targets: pd.Series)
#Pass only the prediction columns; "era" is not a prediction, so drop it first
mmc = correlation_contribution(bm_val_recent.drop(columns=["era"]), df_mm["numerai_meta_model"], live_targets_recent["target"])
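For anyone curious what the score is doing, below is a rough sketch of my understanding of MMC, run on random toy data. It is not the numerai-tools implementation: `gaussianize`, `mmc_sketch`, and the toy columns are names I made up, and the steps (rank, gaussianize, remove the part of the prediction explained by the meta model, then take the covariance with the centered target) are an assumption about the method. It also scores each era separately, which is how I'd expect per-era MMC to be reported.

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def gaussianize(s: pd.Series) -> pd.Series:
    """Rank-transform to (0, 1), then map through the standard normal inverse CDF."""
    # Average ranks keep ties tied; the 0.5 shift keeps values strictly inside (0, 1).
    pct = (s.rank(method="average") - 0.5) / len(s)
    inv_cdf = NormalDist().inv_cdf
    return pd.Series([inv_cdf(p) for p in pct], index=s.index)


def mmc_sketch(pred: pd.Series, meta: pd.Series, target: pd.Series) -> float:
    """Hypothetical helper: contribution of pred beyond what meta already explains."""
    # Align all three inputs on their shared ids.
    idx = pred.index.intersection(meta.index).intersection(target.index)
    p = gaussianize(pred.loc[idx]).to_numpy()
    m = gaussianize(meta.loc[idx]).to_numpy()
    t = target.loc[idx].to_numpy(dtype=float)
    # Orthogonalize: subtract the projection of p onto m.
    neutral = p - m * ((p @ m) / (m @ m))
    # Covariance of the centered target with the neutralized prediction.
    t = t - t.mean()
    return float(t @ neutral) / len(t)


# Toy data: 3 eras of 200 rows each, all values random.
rng = np.random.default_rng(0)
n_per_era, eras = 200, ["0001", "0002", "0003"]
n = n_per_era * len(eras)
meta = rng.normal(size=n)
toy = pd.DataFrame(
    {
        "era": np.repeat(eras, n_per_era),
        "meta": meta,
        "pred": meta + rng.normal(size=n),
        "target": 0.1 * meta + rng.normal(size=n),
    },
    index=[f"id{i}" for i in range(n)],
)

# Score each era on its own, then look at the per-era series and its mean.
per_era = pd.Series(
    {era: mmc_sketch(g["pred"], g["meta"], g["target"]) for era, g in toy.groupby("era")}
)
print(per_era)
print("mean MMC:", per_era.mean())
```

One sanity check that falls out of the sketch: if you pass the meta model as its own prediction, the neutralized vector is all zeros, so the score comes out at (numerically) zero.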