Changing Scoring & Payouts Again To MMC Only

Another thought… removing “optional multipliers” seems to be the best of the coming changes… I did a little plotting and found that the current regime of “optional multipliers” has not produced any meaningful change to the stake weighting of the MM.

Take a look at the top 10 stakes over the last year…

The whales have just been allowed to sit and take up space. They’re diluting good signal.

(also kudos to Mistreated for taking profits.)
… and the code for anyone to play with…

import pandas as pd
import matplotlib.pyplot as plt
from numerapi import NumerAPI
napi = NumerAPI()

list_of_usernames = ['crowdcent', 'atol', 'shatteredx', 'mistreated', 'efficient_meerkat', 'aininja', 'halsmith99', 'phorex', 'yoshiso','ummon']

def get_account_profile_details(username, tournament=8):
    # The 'date' and 'value' fields are inferred from the DataFrame columns used below
    query = """
        query($tournament: Int!, $username: String!) {
          accountProfile(tournament: $tournament, username: $username) {
            totalStakeTs {
              date
              value
            }
          }
        }
    """
    args = {'tournament': tournament, 'username': username}
    account_data = napi.raw_query(query, args)['data']['accountProfile']
    return account_data

all_profiles = pd.DataFrame()

for username in list_of_usernames:
    profile_data = get_account_profile_details(username, tournament=8)
    profile_totalStakeTs = profile_data['totalStakeTs']

    # Convert to DataFrame and process
    df = pd.DataFrame(profile_totalStakeTs)
    df['date'] = pd.to_datetime(df['date'])
    df.set_index('date', inplace=True)
    # Aggregate data
    all_profiles[username] = df['value']

# Forward-fill any missing dates (fillna(method='ffill') is deprecated in recent pandas)
all_profiles = all_profiles.astype('float').ffill()

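# Plot each account's total stake over time
plt.figure(figsize=(15, 10))
for username in list_of_usernames:
    plt.plot(all_profiles.index, all_profiles[username], label=username)

plt.title('Account Values Over Time')
plt.legend()
plt.show()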

Excuse my Numerai ignorance, but I was not around when MMC was used as a payout multiplier. First, to be sure: is it the same as the current CWMM metric in the dashboard? Second, can you provide links to more reading on the evolution of MMC and on how to optimize a model’s MMC performance?


If we were to split the top 10 stakes into two groups – ones that beat the 1yr average and ones that did not – we see that it could still take a year or two for the competitive models to come to prominence. Numerai should be enforcing a high multiplier, not lowering it.

```
columns_to_sum = ['crowdcent', 'atol', 'efficient_meerkat', 'ummon']
all_profiles['lame_ducks'] = all_profiles[columns_to_sum].sum(axis=1)
columns_to_sum = ['shatteredx', 'mistreated', 'aininja', 'halsmith99', 'phorex', 'yoshiso']
all_profiles['winners'] = all_profiles[columns_to_sum].sum(axis=1)

# Plot the combined stake of each group over time
plt.figure(figsize=(15, 10))
for group in ['lame_ducks', 'winners']:
    plt.plot(all_profiles.index, all_profiles[group], label=group)

plt.title('Account Values Over Time')
plt.legend()
plt.show()
```

The gap widens even more when you look at the top 20 stakes…

At present, the top 20 stakes control 50.6% of the meta model. The 12 lame ducks control 38.8%. The 8 winners control just 11.8%.

I went ahead and ran it for the top 50 stakes… I’d say that’s a pretty damn representative sample.

At present, the top 50 stakes control 63.3% of the meta model. The 33 lame ducks control 48.7%. The 17 winners control just 14.7%.
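For anyone who wants to reproduce percentages like these, here is a minimal sketch. It assumes a group's share is simply its summed stake divided by the total stake of everyone in the meta model. The function name `stake_shares` and every number in the example are made up for illustration, not real tournament data.

```python
def stake_shares(stakes, groups):
    """Return each group's fraction of the total stake.

    stakes: dict of username -> staked amount (everyone in the meta model)
    groups: dict of group name -> list of usernames in that group
    """
    total = sum(stakes.values())
    return {name: sum(stakes[u] for u in users) / total
            for name, users in groups.items()}

# Hypothetical stakes, not real tournament data
stakes = {"whale_a": 500.0, "whale_b": 300.0, "winner_a": 120.0, "everyone_else": 1080.0}
groups = {"lame_ducks": ["whale_a", "whale_b"], "winners": ["winner_a"]}

for name, share in stake_shares(stakes, groups).items():
    print(f"{name}: {share:.1%}")
```

The total has to include every staked account, not just the top N, otherwise the shares are relative to the sample rather than to the whole meta model.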

I’m actually shocked at the results… Half of the meta model is controlled by users that can’t figure out how to get in the 50th percentile!

Look at the graph… It’ll take years for the meta model to sort itself out!

My point is this… don’t keep cutting the payout multipliers. That would be a huge disappointment and bad for the fund. There’s one type of user that’s a BIG problem for the fund… the ones with more money than sense. Why not just crank up and enforce a big multiplier? It doesn’t really matter what metric you use to score predictions… Corr, MMC, TC… whatever… half of the meta model is controlled by people who aren’t even coming close to hitting the mark.

Why not MMCx5? Why not x10? The multiplier and PF are two absolutely arbitrary coefficients, and you let them control your MM’s learning rate?! I never got the payout factor… why let the public decide how fast or slow the stake-weighted meta model adjusts itself?
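To make the “learning rate” point concrete, here is a toy sketch. It assumes each stake compounds by `1 + multiplier * score` every round and that the meta model weight is just the stake share. That payout rule, the function name, and all the numbers are simplifications for illustration, not Numerai's actual payout formula.

```python
def simulate_stake_weights(stakes, scores, multiplier, n_rounds):
    """Compound each stake by (1 + multiplier * score) per round; return final weights."""
    stakes = [float(s) for s in stakes]
    for _ in range(n_rounds):
        # Assumed payout rule: stake changes in proportion to multiplier * score
        stakes = [s * (1.0 + multiplier * sc) for s, sc in zip(stakes, scores)]
    total = sum(stakes)
    return [s / total for s in stakes]

# Hypothetical: a large, slightly losing stake and a small, winning one
initial_stakes = [900.0, 100.0]
per_round_scores = [-0.01, 0.02]

for m in (0.5, 2.0, 5.0):
    weights = simulate_stake_weights(initial_stakes, per_round_scores, m, n_rounds=52)
    print(f"multiplier={m}: weights after 52 rounds = {[round(w, 3) for w in weights]}")
```

With a small multiplier the big losing stake still dominates after 52 rounds; with a large one the weight moves to the winning model much faster. The flip side is that the same multiplier also scales everyone's round-to-round risk.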


MMC != CWMM. MMC is well described here: MMC used to be calculated as performance of leave one (your model) out - performance of meta model. MMC2 is basically performance of your model’s residuals against the target.
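A rough sketch of the “residuals against the target” idea, in plain Python: remove the part of your predictions that the meta model already explains, then take the covariance of what is left with the target. The function name `residual_contribution` is made up, and Numerai's actual scoring may add ranking and normalization steps that are not shown here, so treat this as an illustration of the concept only.

```python
def _center(xs):
    """Subtract the mean from every element."""
    mean = sum(xs) / len(xs)
    return [x - mean for x in xs]

def residual_contribution(predictions, meta_model, target):
    """Covariance between the target and the part of `predictions`
    that is not explained by `meta_model` (simple one-variable projection)."""
    p, m, t = _center(predictions), _center(meta_model), _center(target)
    # Project the predictions onto the meta model and subtract that component
    beta = sum(pi * mi for pi, mi in zip(p, m)) / sum(mi * mi for mi in m)
    residual = [pi - beta * mi for pi, mi in zip(p, m)]
    # Score the leftover signal against the target
    return sum(ri * ti for ri, ti in zip(residual, t)) / len(t)

# Hypothetical toy data
meta_model = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [1.0, 3.0, 2.0, 5.0, 4.0]

print(residual_contribution(meta_model, meta_model, target))  # a copy of the meta model adds nothing
print(residual_contribution(target, meta_model, target))      # a model with extra signal scores positive
```

A model that just duplicates the meta model has a zero residual and therefore a zero score, no matter how good the meta model is. That is the sense in which this kind of metric rewards only what you add.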


So MMC score is not calculated and published right now? How can we check our past MMC performance?


Let’s hope MMC scores get backfilled soon :face_with_spiral_eyes:


I think the idea is that you can calculate MMC on your own against the historical meta model predictions.

But I don’t want to “re-evaluate” all 100 of my models on my own. I would prefer that Numerai just backfill and show the MMC scores. Besides, there is no way to compare your MMC to other users’ MMC scores without Numerai backfilling the metric.


I think this is going to backfire. You are changing the tournament from “make models that predict the target” to “retrain models every round to farm whatever residuum is left”. It is like doing gradient descent without momentum. IMO you should keep corr as part of the reward. If models that are highly correlated with the target don’t help your fund, then it is the target’s fault.


The magnitude of the risk is controlled by the magnitudes of the scores, i.e. the multiplier/scaling level, but again if there is only one level we don’t need that concept. Anyway, if the risk is too high relative to the probability of being positive each round (and by how much), then everybody is guaranteed to go bankrupt no matter how good their model is. This is a simple Kelly criterion calculation. If everyone is forced to “overbet” then we all go broke…mathematically guaranteed. (We can’t only focus on getting rid of bad performers – you can do that by getting rid of everybody. Good performers must be rewarded and not ground into dust also.)
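Here is that Kelly calculation as a small sketch for a simplified two-outcome round: you either gain or lose a fixed fraction of stake per unit of exposure (think of exposure as the multiplier). The function names and the 55% / 2% numbers are hypothetical, chosen only to show the shape of the curve.

```python
import math

def kelly_exposure(p_win, win_return, loss_return):
    """Growth-optimal exposure for a two-outcome bet.

    p_win: probability of a winning round
    win_return: fractional gain per unit of exposure on a win (e.g. 0.02)
    loss_return: fractional loss per unit of exposure on a loss (e.g. 0.02)
    """
    return p_win / loss_return - (1.0 - p_win) / win_return

def expected_log_growth(exposure, p_win, win_return, loss_return):
    """Expected log growth of the stake per round at a given exposure."""
    return (p_win * math.log(1.0 + exposure * win_return)
            + (1.0 - p_win) * math.log(1.0 - exposure * loss_return))

# Hypothetical model: wins 55% of rounds, gains or loses 2% per unit of exposure
p, gain, loss = 0.55, 0.02, 0.02
best = kelly_exposure(p, gain, loss)
print(f"growth-optimal exposure: {best:.1f}x")
for exposure in (1.0, best, 12.0):
    g = expected_log_growth(exposure, p, gain, loss)
    print(f"exposure {exposure:>4.1f}x -> expected log growth per round {g:+.4f}")
```

The model in this example has a real edge, yet at 12x exposure its expected log growth is negative, so the stake shrinks over time. That is the “overbetting kills everybody” effect: past a certain exposure, even a good model loses.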


The big danger with MMC only (and we discussed this a lot back in the MMC days the first time) is that without a “center” of corr (or something absolute) the metamodel will just move around (in predictive space or whatever that would be called – the direction of the predictions) for no other reason than people are constantly moving their models around chasing the residuals, since that is all there is. (Most likely it would oscillate between two or three main directions that match the inductive biases of the underlying models.) However, since the market is a moving target to begin with and it is all very complex, this dynamic is not quite guaranteed; we’ll just see. The old MMC tended to be just an extension/magnifier of whatever your corr was most of the time. It behaved nothing like TC – we’ll have to see on a new target.

But if it is a zero-sumish game, then it comes down to something like “you must be right at least 52% of the time” (or some X%). This is how sports bettors think – meet that mark or else you’re losing. And if the math/magnitude of the scores and the risk controls we are offered are not worked out to be just right (if they even can be in such a crazy game) then there is definitely a danger of enforced losing for everybody – it will simply be impossible to make any reward except by luck in the short term. (Time for an asymmetrical score perhaps?)
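The 52% figure is the usual sports-betting break-even rate. A minimal sketch of where it comes from, assuming a simple bet with one fixed win amount and one fixed loss amount (the function name is made up for illustration):

```python
def breakeven_win_rate(win_amount, loss_amount):
    """Minimum win probability for a bet to have non-negative expected value.

    Solves p * win_amount - (1 - p) * loss_amount = 0 for p.
    """
    return loss_amount / (win_amount + loss_amount)

# Standard sports-betting example: risk 110 to win 100
print(f"risk 110 to win 100: {breakeven_win_rate(100, 110):.1%}")
# Symmetric payoffs need only a coin flip to break even
print(f"risk 100 to win 100: {breakeven_win_rate(100, 100):.1%}")
```

Risking 110 to win 100 gives 110/210, or about 52.4%. The more the loss side outweighs the win side, the higher the mark you have to hit, which is why the relative size of positive and negative scores matters as much as how often you are right.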

So far they have stubbornly refused to give us any real risk controls (i.e. stake management, which Richard explicitly refused in the last fireside) and of course automatic compounding is great if you’re winning, but it is also an excellent way to drive you straight into that overbetting zone where you are guaranteed to lose. (AGAIN, even with a good model! Overbetting kills everybody – good and bad.)

So while I’m actually cautiously optimistic that this general idea can be made to work…well, their track record of thinking these things through is, unfortunately and undeniably at this point…not great. And they always seem to be in a mad rush to implement the new thing and dump the old thing before we can really tell how it is going to go. Like…why not actually see how it goes instead of just jumping off the cliff and hoping? (EVERY policy has its price and its unintended consequences – we could all 100% agree that “whatever plan” sounds great and the math is perfect, etc., and it could still have some fatal flaw nobody thought of but that is easily seen in practice…if you only give it some time in practice to find out (before totally abandoning the old thing).) Too many nasty rug pulls and there just won’t be anybody left…and we’ve already had one or maybe two this year depending on how you count them…


Numerai will no longer make payouts based on CORR or TC, all payouts will be on MMC only on a new upcoming target called Teager starting the end of this year.

By this, do you mean that the scoring system will change to a new target with a “new-old” metric in the next month? Or just that the new target will be released?

I think that some overlap is needed to let people adjust or create new models. If lots of people unstake their models not because they are performing badly, but because they don’t know how they are going to perform, that could be very dangerous for the MM performance.


The approach does feel disappointingly haphazard and ad hoc. If Numerai and @richardai want longer term commitments (as stated in the linked video), I think they can foster that by achieving greater stability in the way the competitions are organized.

For example, several months before introducing a substantial change, perhaps Numerai could open a third competition–let’s call it Playground (to go with Classic and Signals)–where everyone recognizes that things could change quickly, and there would be no staking. Just some gold stars for consistent participation, or maybe some credits to apply to the staked competitions.

The purpose would be to provide an area where there really isn’t any risk, so Numerai could beta test their ideas and competitors could get a handle on what they need to do to compete in the Tournament given a proposed update. Then if Numerai finds that the Playground results are leaning towards an improvement in the Tournament, and the bugs have been worked out, then switching that into the main Tournament would be easy.

And when there are no proposed updates, just let the Playground run under the same process as the Tournament, but without staking.


My Numerai Tourney experience lately.
Sometimes you eat the Bear, sometimes the Bear eats you.


So if we are now going to have a new Target and MMC to evaluate our predictions, how do I calculate MMC?


Your fund is performing poorly compared to the metrics? I would be very surprised if it was otherwise. This is the Representation Problem: whatever metrics you may design and model, the unpredictable markets will always end up being worse for you. This goes against your core belief, more of an unfounded ‘chartist’ hype really, that you can make money with trading robots. (In other ways than instantaneous reactions to data that the suckers out there have not even seen yet). So you struggle on, fiddling endlessly with the metrics. “Let us just adjust the numerous ‘targets’ again and everything will be good”.

Moving on to a concrete question. I have TC=0.0104 and CORRV2=0.0058 but a negative CWMM=-0.0175. Am I going to be penalised by these changes, or am I going to benefit?



Best regards to everyone!

  1. What current target best approximates the new “teager” target?
  2. When will we see our CWMM rank in the tournament?