TC (True Contribution) is a metric that shows how much your model improves the meta model.
That means that ensembling my model's predictions with the meta model should improve meta model metrics such as corr and Sharpe. The fund also picks its trades from the stocks where the meta model has the most confidence (top/bottom 200).
With that in mind, we can estimate the (past) TC of a model with the following script:
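import pandas as pd

validation['my_prediction'] = my_model.predict(validation[features])
# Load the meta model predictions and attach them to validation
# (assumption: both frames are indexed by the same row id)
mm = pd.read_parquet('v4.1_meta_model.parquet')
validation['numerai_meta_model'] = mm['numerai_meta_model']
# Rank each column within its era, then average the ranks to form the ensemble
validation.loc[:, 'mm_ensemble'] = validation[['era', 'my_prediction', 'numerai_meta_model']].dropna().groupby('era').rank(pct=True).mean(axis=1)
validation_stats = validation_metrics(
    ['my_prediction', 'mm_ensemble', 'numerai_meta_model'],
    ...  # the remaining arguments were cut off in the original post
)
print(validation_stats[['mean', 'sharpe', 'tb200_mean', 'tb200_sharpe']])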
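For anyone without the validation_metrics helper at hand, here is a minimal sketch of what a top/bottom-200 score could look like. It is an approximation, not the official scoring code: it uses plain Pearson correlation, the function name `tb_corr_by_era` and the column names `pred` and `target` are made up for this example, and the data is synthetic.

```python
import numpy as np
import pandas as pd


def tb_corr_by_era(df, pred_col, target_col="target", era_col="era", tb=200):
    """Per-era Pearson correlation on the tb highest and tb lowest predictions.

    Hypothetical helper for illustration; assumes every era has at least 2 * tb rows.
    """
    scores = {}
    for era, era_df in df.groupby(era_col):
        ordered = era_df.sort_values(pred_col)
        # Keep only the rows where the prediction is most extreme.
        extremes = pd.concat([ordered.head(tb), ordered.tail(tb)])
        scores[era] = np.corrcoef(extremes[pred_col], extremes[target_col])[0, 1]
    return pd.Series(scores)


# Synthetic data: 5 eras of 1000 rows, predictions are a noisy copy of the target.
rng = np.random.default_rng(0)
n_eras, n_rows = 5, 1000
frame = pd.DataFrame({
    "era": np.repeat(np.arange(n_eras), n_rows),
    "target": rng.normal(size=n_eras * n_rows),
})
frame["pred"] = frame["target"] + rng.normal(size=len(frame))

scores = tb_corr_by_era(frame, "pred")
tb200_mean = scores.mean()
tb200_sharpe = scores.mean() / scores.std()  # Sharpe-like ratio of the per-era scores
print(tb200_mean, tb200_sharpe)
```

Running the same function on my_prediction, mm_ensemble and numerai_meta_model would give three per-era series that can be compared directly.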
So TC should be correlated with the gain in the meta model's metrics after ensembling.
I guess that if the tb200_mean of the ensemble is lower than that of the meta model, there is no TC to expect from that model.
Ideally, the ensemble should outperform both of its components.
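The comparison could be summarised per era rather than only through the overall mean. The sketch below assumes you already have per-era scores for the ensemble and for the meta model; `ensemble_gain` is a hypothetical helper name, and the numbers are made up purely to show the mechanics.

```python
import pandas as pd


def ensemble_gain(ensemble_scores, meta_scores):
    """Summarise how the ensemble's per-era scores compare with the meta model's."""
    diff = (ensemble_scores - meta_scores).dropna()
    return {
        "mean_gain": diff.mean(),                     # average improvement per era
        "share_of_eras_improved": (diff > 0).mean(),  # fraction of eras with a gain
    }


# Made-up per-era scores, for illustration only.
meta = pd.Series([0.020, 0.010, -0.005, 0.030], index=["era1", "era2", "era3", "era4"])
ens = pd.Series([0.025, 0.008, 0.000, 0.035], index=meta.index)

gain = ensemble_gain(ens, meta)
print(gain)
```

A positive mean gain that comes from most eras, rather than from one or two outliers, would be a stronger hint that the model adds something to the meta model.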
Do you think that the above estimation method is correct?
Has anyone come up with a better estimation?