We are reviving Meta Model Contribution (MMC) to replace True Contribution (TC). For rounds starting on or after January 2nd, 2024, staking and payouts will transition to the fixed multipliers 0.5xCORR + 2xMMC. Furthermore, the 2024 Grandmasters season will be determined by CORR and MMC.
We are doing this for a few reasons:
- MMC is more stable than TC
- MMC is locally calculable while TC is not
- We saw our most stable performance when we previously paid out MMC
What is MMC (and BMC)?
From our docs:
MMC is the covariance of a model with the target, after its predictions have been neutralized to the Meta Model. Similarly, Benchmark Model Contribution (BMC) is the covariance of a model with the target, after its predictions have been neutralized to the stake-weighted Benchmark Models.
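Concretely, that definition amounts to "project out the reference, then take covariance". Here is a minimal sketch in plain numpy (the function and variable names are mine, not the production scoring code):

```python
import numpy as np

def neutralized_covariance(predictions, reference, target):
    """Covariance of the predictions with the target after the
    predictions are neutralized to (made orthogonal to) a reference
    vector: the Meta Model for MMC, the benchmark ensemble for BMC."""
    p = predictions - predictions.mean()
    r = reference - reference.mean()
    y = target - target.mean()
    # remove the component of p that lies along r
    neutral = p - r * (r @ p) / (r @ r)
    # covariance of the neutralized predictions with the target
    return (y @ neutral) / len(y)
```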
The idea to revive MMC started with a simple question from our founder, Richard Craib:
"Given a model, how much does the Meta Model's correlation with the target change if we increase the model's stake by some small amount?"
He asked this because we know that the Meta Model's correlation with our target is a directly monetizable metric for which we could optimize. This produced a simple formula for calculating MMC that I call "Richard's MMC":

$$\mathrm{MMC}_{\text{Richard}} = \mathrm{corr}(y,\ m + 0.001\,p) - \mathrm{corr}(y,\ m)$$

Where y is the target, m is the Meta Model, and p are a model's predictions.
Taking the derivative with respect to that 0.001 stake bump as it goes to 0, we reach the following formula that I call "Murky's MMC" (big thanks to our user Murky for this derivation):

$$\mathrm{MMC}_{\text{Murky}} = \left.\frac{d}{d\varepsilon}\,\mathrm{corr}(y,\ m + \varepsilon p)\right|_{\varepsilon = 0} = \mathrm{corr}(y, p) - \mathrm{corr}(y, m)\,\mathrm{corr}(m, p)$$
Assuming that p and m are both centered, normalized column vectors (using the tie_kept_rank and gaussian functions from our open-source scoring tools package), both formulas produce results that are 100% correlated with each other.
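To see that 100% correlation numerically, here is a self-contained sketch; center_normalize below is my own rough stand-in for tie_kept_rank + gaussian, and the random "models" are synthetic:

```python
import numpy as np
from scipy.stats import rankdata, norm

def center_normalize(v):
    """Rough stand-in for tie_kept_rank + gaussian: rank, map the
    ranks through the inverse normal CDF, then standardize."""
    g = norm.ppf((rankdata(v) - 0.5) / len(v))
    return (g - g.mean()) / g.std()

rng = np.random.default_rng(0)
n = 5000
y = rng.normal(size=n)                    # synthetic target
m = center_normalize(rng.normal(size=n))  # synthetic meta model

richards, murkys = [], []
for _ in range(100):
    p = center_normalize(rng.normal(size=n) + 0.1 * m)
    # Richard's MMC: bump the model's stake by 0.001
    r = np.corrcoef(y, m + 0.001 * p)[0, 1] - np.corrcoef(y, m)[0, 1]
    # Murky's MMC: the derivative as the bump goes to 0
    d = np.corrcoef(y, p)[0, 1] - np.corrcoef(y, m)[0, 1] * np.corrcoef(m, p)[0, 1]
    richards.append(r)
    murkys.append(d)

print(np.corrcoef(richards, murkys)[0, 1])  # ~1.0
```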
Finally, to sanity check these methods of calculating MMC, I revived the old method for calculating MMC and removed the bagging and uniform transformation to yield a third formula that I dub "Mike's MMC":

$$\mathrm{MMC}_{\text{Mike}} = \mathrm{cov}\!\left(y,\ p - m\,\frac{m^\top p}{m^\top m}\right)$$

That is: neutralize the predictions with respect to the Meta Model, then take the covariance of the result with the target.
It took some time to convince myself of the mathematical equivalence of Murky's and Mike's.
Here are some thoughts (a numeric check of this equivalence follows the list below):
- the tie_kept_rank and gaussian functions make both the predictions and the Meta Model centered and normalized: $\bar{p} = \bar{m} = 0$ and $\sigma_p = \sigma_m = 1$
- mean = 0 and std = 1 allow the following relationships: $\sigma_p^2 = \frac{p^\top p}{n} = 1 \implies p^\top p = n$ (and likewise $m^\top m = n$)
- when mean = 0: $\mathrm{cov}(y, p) = \frac{y^\top p}{n}$ (taking $y$ as the centered target)
- the inverse of a vector can be defined as: $m^{-1} = \frac{m^\top}{m^\top m} = \frac{m^\top}{n}$, so that $m^{-1}m = 1$
- using the above we can convert between Murky's MMC and Mike's MMC:

$$\mathrm{cov}\!\left(y,\ p - m\,m^{-1}p\right) = \frac{y^\top p}{n} - \frac{m^\top p}{n} \cdot \frac{y^\top m}{n} = \sigma_y \left(\mathrm{corr}(y, p) - \mathrm{corr}(y, m)\,\mathrm{corr}(m, p)\right)$$

so Mike's MMC is just Murky's MMC scaled by $\sigma_y$, which is constant within a round and therefore cannot change the ranking of models.
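And here is the numeric check promised above, reusing center_normalize, rng, n, y, and m from the earlier snippet:

```python
# Mike's MMC (neutralize, then covariance) vs. Murky's MMC
# on the same kind of randomly generated models as before
mikes, murkys = [], []
for _ in range(100):
    p = center_normalize(rng.normal(size=n) + 0.1 * m)
    neutral = p - m * (m @ p) / (m @ m)  # p - m * (m^{-1} p)
    mikes.append(((y - y.mean()) @ neutral) / n)
    murkys.append(
        np.corrcoef(y, p)[0, 1]
        - np.corrcoef(y, m)[0, 1] * np.corrcoef(m, p)[0, 1]
    )

print(np.corrcoef(mikes, murkys)[0, 1])  # ~1.0: identical up to the sigma_y scale
```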
Murky's version is the fastest and simplest to compute, so we are using it to calculate MMC:
```python
import pandas as pd

# helper functions from Numerai's open-source scoring tools package
from numerai_tools.scoring import (
    filter_sort_index,
    gaussian,
    orthogonalize,
    tie_kept_rank,
)


def contribution(
    predictions: pd.DataFrame,
    meta_model: pd.Series,
    live_targets: pd.Series,
) -> pd.Series:
    """Calculate the contribution of the given predictions
    to the given meta model.

    Contribution is calculated by:
    1. tie-kept ranking each prediction and the meta model
    2. gaussianizing each prediction and the meta model
    3. orthogonalizing each prediction wrt the meta model
    4. multiplying the orthogonalized predictions and the targets

    Arguments:
        predictions: pd.DataFrame - the predictions to evaluate
        meta_model: pd.Series - the meta model to evaluate against
        live_targets: pd.Series - the live targets to evaluate against

    Returns:
        pd.Series - the resulting contributive correlation
                    scores for each column in predictions
    """
    # filter and sort preds, mm, and targets wrt each other
    meta_model, predictions = filter_sort_index(meta_model, predictions)
    live_targets, predictions = filter_sort_index(live_targets, predictions)
    live_targets, meta_model = filter_sort_index(live_targets, meta_model)

    # rank and normalize meta model and predictions so mean=0 and std=1
    p = gaussian(tie_kept_rank(predictions)).values
    m = gaussian(tie_kept_rank(meta_model.to_frame()))[meta_model.name].values

    # orthogonalize predictions wrt meta model
    neutral_preds = orthogonalize(p, m)

    # center the target
    live_targets -= live_targets.mean()

    # multiply target and neutralized predictions
    # this is equivalent to covariance b/c mean = 0
    mmc = (live_targets @ neutral_preds) / len(live_targets)
    return pd.Series(mmc, index=predictions.columns)
```
We divide by the length of the target to bring the final values inside the range of something like CORR20v2.
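Calling it looks something like this on toy data (the ids, model names, and values below are made up for illustration, and this assumes the scoring helpers above are importable):

```python
import pandas as pd

# hypothetical round data; real ids, columns, and values will differ
predictions = pd.DataFrame(
    {"model_a": [0.2, 0.8, 0.5, 0.1], "model_b": [0.9, 0.3, 0.6, 0.4]},
    index=["id1", "id2", "id3", "id4"],
)
meta_model = pd.Series([0.4, 0.7, 0.2, 0.6], index=predictions.index, name="meta")
live_targets = pd.Series([0.0, 1.0, 0.5, 0.25], index=predictions.index)

mmc = contribution(predictions, meta_model, live_targets)
print(mmc)  # one MMC score per model column
```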
Your BMC is basically MMC, but neutralized to just the stake-weighted Benchmark Models instead of the Meta Model. This helps tell us how well your model ensembles with our internal Benchmark Models. A high score in both would indicate a truly unique and contributive signal.
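Continuing the toy example above, BMC would be the same computation with a different reference series (the benchmark values here are again made up):

```python
# same helper, different reference: neutralize to the stake-weighted
# Benchmark Models instead of the Meta Model
stake_weighted_benchmark = pd.Series(
    [0.5, 0.6, 0.3, 0.7], index=predictions.index, name="benchmark"
)
bmc = contribution(predictions, stake_weighted_benchmark, live_targets)
```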
Why MMC?
The fact that we can calculate MMC three different ways, all 100% correlated, means it is an easily explainable metric regardless of how you intuit the linear algebra, and it can be calculated locally (unlike TC).
Furthermore, MMC is much more stable than TC. Take a look at the following charts showing the distribution of each score over time:
Clearly, MMC's distribution is much closer to CORR's than TC's ever was or will be. This stability in the score is significant when we consider how users need to optimize their models for both MMC and CORR.