When we released Metamodel Contribution (MMC) earlier this year, we said it was a test run to see how users respond before we begin payouts for it. During our observation and experimentation period, we devised a slightly different version of MMC.
The original MMC formulation essentially takes the stake-weighted metamodel, removes each user from it in turn, and measures how much that removal hurts the metamodel.
MMC2 has a slight adjustment designed to push users to improve the hedge fund's performance more directly: the residual MMC method.
The new MMC will residualize your predictions against the stake-weighted metamodel, and then score the resulting vector against kazutsugi.
A way to think about it: we take your signal, remove the part that can be accounted for by the stake-weighted metamodel's signal, and then score what is left over.
One note is that we score by covariance rather than correlation here, to account for the magnitude of your difference from the stake-weighted metamodel. We first normalize everyone's predictions, so only the prediction ranks still matter for users. If you submit exactly the stake-weighted metamodel, your score will be exactly 0. The further your predictions are from the stake-weighted metamodel, the more potential for large MMC scores.
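As a rough sketch of the scoring step described above (the function names here are illustrative, not Numerai's actual implementation): rank-normalize the predictions, residualize against the stake-weighted metamodel, and take the covariance of the residual with the targets.

```python
import numpy as np
import pandas as pd

def rank_uniform(s: pd.Series) -> pd.Series:
    # map predictions to uniform ranks in (0, 1); only ranks matter
    return (s.rank(method="first") - 0.5) / len(s)

def residual_score(user: pd.Series, metamodel: pd.Series,
                   targets: pd.Series) -> float:
    u = rank_uniform(user).values
    m = rank_uniform(metamodel).values
    # least-squares residual of the user vs. the metamodel (with intercept)
    x = np.vstack([m, np.ones(len(m))]).T
    beta = np.linalg.lstsq(x, u, rcond=None)[0]
    residual = u - x.dot(beta)
    # score by covariance (not correlation) with the targets
    return float(np.cov(residual, targets.values)[0, 1])

rng = np.random.RandomState(0)
mm = pd.Series(rng.rand(100))
tgt = pd.Series(rng.rand(100))
# submitting exactly the stake-weighted metamodel scores 0
print(round(abs(residual_score(mm, mm, tgt)), 10))  # → 0.0
```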
We still subsample users for the stake-weighted metamodel. If you followed the original MMC closely, you know this trick prevents enormous stakers from being penalized by the metric and from affecting other users too dramatically, and it helps reward redundancy more than a pure stake-weighted-metamodel comparison would.
This diagram helps visualize what this neutralization/residualization operation does.
The user and the metamodel point in very similar directions, so when you neutralize the user to the metamodel, you are left with only the independent component of the user's predictions, and that vector has a much smaller magnitude than the original predictions.
Code for neutralizing exactly one vector by one other vector (assuming they are pandas Series):
```python
import pandas as pd
import numpy as np

def neutralize_series(series, by, proportion=1.0):
    scores = series.values.reshape(-1, 1)
    exposures = by.values.reshape(-1, 1)

    # this line makes series neutral to a constant column so that it's
    # centered and for sure gets corr 0 with exposures
    exposures = np.hstack(
        (exposures,
         np.array([np.mean(series)] * len(exposures)).reshape(-1, 1)))

    correction = proportion * exposures.dot(
        np.linalg.lstsq(exposures, scores, rcond=None)[0])
    corrected_scores = scores - correction
    neutralized = pd.Series(corrected_scores.ravel(), index=series.index)
    return neutralized
```
Some code and a demonstration of using neutralization in other ways can be found at https://github.com/numerai/example-scripts/blob/master/analysis_and_tips.ipynb (near the end; see the function normalize_and_neutralize).
The stake-weighted metamodel is now transformed to be uniform before we neutralize each model against it (each model is also uniform-transformed before anything else).
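The uniform transform above amounts to a rank transform rescaled into (0, 1). A minimal sketch (the helper name is illustrative):

```python
import pandas as pd

def to_uniform(s: pd.Series) -> pd.Series:
    # rank the values (ties broken by order of appearance),
    # then rescale so the result lies strictly inside (0, 1)
    return (s.rank(method="first") - 0.5) / len(s)

preds = pd.Series([0.2, 0.9, 0.5, 0.7])
print(to_uniform(preds).tolist())  # → [0.125, 0.875, 0.375, 0.625]
```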
The covariance metric is now divided by 0.29^2 to get the MMC displayed on the website. This is because the standard deviation of a uniform distribution on [0, 1] is about 0.29 (exactly 1/√12), so to bring covariance up to correlation space, you divide by 0.29^2. This gives MMC roughly the same magnitude as the main tournament score when correlation with the metamodel is 0, making the metric more interpretable.
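A quick numerical check of that rescaling (the covariance value here is made up for illustration):

```python
import numpy as np

# std of a uniform [0, 1] distribution: 1/sqrt(12), rounded to 0.29
std_uniform = 1 / np.sqrt(12)
print(round(std_uniform, 4))  # → 0.2887

# cov / (std * std) brings a covariance between two
# uniform-transformed vectors up to correlation scale
cov = 0.01  # hypothetical covariance value
corr_like = cov / 0.29 ** 2
print(round(corr_like, 4))  # → 0.1189
```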