Changing Scoring & Payouts Again To MMC Only

tl;dr: Numerai will no longer make payouts based on CORR or TC. All payouts will be on MMC only, computed against a new upcoming target called Teager, starting at the end of this year.

Numerai has made payouts in a number of different ways over the years, as this meme from RocketChat shows. As I mentioned in the Fireside chat, we need to make changes to payouts to discourage poor-performing models from staking. I went through different ideas, but I think we’ve settled on something. Thanks @murkyautomata for your help.

On the one hand, it seems natural to pay Numerai participants based on their predictions’ correlation with the target Numerai defines. On the other hand, Numerai already has benchmark models with very good correlation with the targets (see: Numerai).

Another payout mechanism which seems natural is True Contribution (TC): how much your staked predictions improve the post-optimization portfolio returns for Numerai. However, TC has some weaknesses: it is a black box, and it is tied to particular optimizer settings. The optimizer settings for Numerai One and Supreme are different from each other and change from time to time. To have TC stay “True” the whole time would require constant alterations to it; even the size of the funds influences TC. So TC is challenging to maintain without becoming even more black-box and mysterious.

MMC (MMC2 Announcement) was the previous way we solved the problem of incentivizing orthogonal signals that actually contribute to the Meta Model. We take the Meta Model predictions and ask: does adding a small weight of your signal to the Stake Weighted Meta Model improve or hurt its correlation with the target? (This is equivalent to the residual MMC discussed in the old MMC2 post.)

In the past, MMC had some weaknesses. It used to be computed against old targets without feature penalization or liquidity adjustments (many small stocks need a liquidity adjustment to reflect that a decent-sized hedge fund can’t buy them in large size). Because MMC on the old targets had these weaknesses, we developed TC. However, we are now almost ready with a new target (called Teager) which builds almost all of the important transformations the optimizer does into the target itself, making MMC on this target a great measurement of contribution. In other words, MMC on this target is quite close to TC under the new, minimal optimizer settings Numerai intends to move to. Also important to note: Numerai now gives out the Meta Model signal, so MMC is no longer a black box and can be computed locally.
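To make that concrete, here is a minimal sketch of an MMC-style check against the released Meta Model predictions (the column names, the rank-based neutralization, and the per-era averaging are illustrative assumptions, not the exact official scoring code):

```python
import numpy as np
import pandas as pd

def neutralize_to_meta_model(preds: pd.Series, meta_model: pd.Series) -> pd.Series:
    """Remove the part of your predictions explained by the Meta Model (least-squares projection)."""
    p = preds.rank(pct=True) - 0.5      # rank-transform so scale/outliers don't dominate
    m = meta_model.rank(pct=True) - 0.5
    beta = np.dot(m, p) / np.dot(m, m)  # projection coefficient of your signal onto the Meta Model
    return p - beta * m                 # residual: what your signal adds beyond the Meta Model

def rough_mmc(df: pd.DataFrame) -> float:
    """Average per-era correlation of your Meta-Model-neutral residual with the target."""
    per_era = []
    for _, era in df.groupby("era"):
        residual = neutralize_to_meta_model(era["prediction"], era["meta_model"])
        per_era.append(np.corrcoef(residual, era["target"])[0, 1])
    return float(np.mean(per_era))

# df needs columns: era, prediction, meta_model, target
# (meta_model being the released Meta Model predictions joined on id)
# print(rough_mmc(df))
```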

So why change to MMC only payouts? Why can’t it be optional? Why can’t you keep CORR or TC?
In the past, MMC and TC were optional to stake on. Many users would simply stake CORR with a large stake and not stake MMC or TC at all. The problem is that many large stakers this year have had persistently negative TC and far worse TC than the benchmark models. This would be fine if these models were being burned away, but because they weren’t staking TC this year, they would hardly burn at all as long as their CORR was okay or flat. The point of payouts is to get feedback into the stakes so that the Stake Weighted Meta Model can improve. Users who persistently hurt the Meta Model but do okay on CORR shouldn’t be able to earn a positive return on their stake; they should earn a strongly and persistently negative return on their stake.

How much payout multiplier will be available on MMC?
We think MMCx2 probably makes sense. Whatever multiplier we choose, there will only be one multiplier. For example, if we choose MMCx2, everyone who stakes will have to stake with that multiplier and there will be no option for MMCx0.5 or MMCx3. This puts all staked NMR on a level playing field for earnings.

Aren’t you worried this will cause many users to unstake because earning MMC is too hard?
We are not worried; in fact, we sincerely hope fewer users stake on Numerai after this change. Clearly, unlike any web service I can think of, Numerai can actually benefit from losing users if they aren’t good. Of course, we are hurt badly if we lose the best data scientists, i.e. the data scientists who will be best at MMC going forward. But by making the tournament harder and getting users who don’t believe they have any MMC to stop staking, we are going to get to a world where the payout factor is far higher for the top data scientists who remain. And of course, users who don’t think they have MMC can still submit unstaked predictions to Numerai until they feel confident enough in their MMC to be prepared to stake on it.

To put it even more simply: Numerai data scientists can currently earn NMR even though they hurt the fund’s performance (they had bad TC but didn’t stake on it) and performed far worse than the Benchmark Models (on CORR or any other metric we display). Does it make sense for Numerai to reward models like this? What for? It’s a bad incentive. You can see the problem by noting that the average return on stakes on Numerai is 30.65% over a year in which the Meta Model and fund performance have been poor. Clearly, payouts are too generous in aggregate at the moment to make sense for us or for you in the long term. We were happy to have generous payouts to seed the data science community, but going into next season, we need much closer alignment. Producing a model that contributes to the Meta Model is very difficult; earning payouts on Numerai has to be exactly as difficult.

[Screenshot, 2023-11-14]

What about Signals?
There are no changes planned for Signals just yet. Numerai Signals already has payouts on the stricter FNC rather than CORR. Though it is possible the TC payouts on Signals could change to something more MMC-like in the future as well.

More details to follow and we plan to release the new Teager target by the end of the month.

5 Likes

Sounds good. Thanks for the heads up. I’ve never understood why optional multipliers were allowed. I think Crowdcent and Atol would have changed tack or de-staked a long time ago if they had been forced to stake on 1xCorr+3xTC. I will say that I’ve always thought the fact that we couldn’t backtest TC was actually a good thing for the fund, as it removed or reduced the possibility of introducing overfitting error into the MM.

Sounds basically right, yes. Confused about the multiplier a bit – if it is all one multiplier, then that means essentially no multiplier is needed, right? (Or are you saying there will still be a 1x and a 2x?) In other words, the magnitude of the metric would just be set to whatever you thought correct and no multipliers would be needed (which are a weird thing – basically a hack – and confusing for newbies anyway). From what I remember, the last version of MMC had a scaling factor that was supposed to make it about equivalent to corr (which then wasn’t adjusted when the target changed and MMC magnitudes went down by 10%, so think about that).

So as long as the math is right, sounds good. There was that thing (that I was never quite convinced of, but others were) where in negative corr environments you’d get bad MMC if you were too far away from the crowd (even if your corr was better than most), and therefore originality was being punished just at the time it was most needed. Hopefully we’ll have this up and running with enough time to see what it is like before staking? (With backfill, hopefully? This should be way less expensive to calculate than TC, eh?)

9 Likes

Another thought… removing “optional multipliers” seems to be the best of the coming changes. I did a little plotting and found that the current regime of optional multipliers has not produced any meaningful change to the stake weighting of the MM.

Take a look at the top 10 stakes over the last year…

the whales have just been allowed to sit and take up space. They’re diluting good signal.

(also kudos to Mistreated for taking profits.)
… and the code for anyone to play with…

```python
import pandas as pd
import matplotlib.pyplot as plt
from numerapi import NumerAPI
napi = NumerAPI()

list_of_usernames = ['crowdcent', 'atol', 'shatteredx', 'mistreated', 'efficient_meerkat', 'aininja', 'halsmith99', 'phorex', 'yoshiso','ummon']


def get_account_profile_details(username, tournament=8):
    query = """
        query($tournament: Int!, $username: String!) {
          accountProfile(tournament: $tournament, username: $username) {
            totalStakeTs {
              date
              delta
              time
              value
            }
          }
        }
        """
    args = {'tournament': tournament, 'username': username}
    account_data = napi.raw_query(query, args)['data']['accountProfile']
    return account_data

all_profiles = pd.DataFrame()

for username in list_of_usernames:
    profile_data = get_account_profile_details(username, tournament=8)
    profile_totalStakeTs = profile_data['totalStakeTs']

    # Convert to DataFrame and process
    df = pd.DataFrame(profile_totalStakeTs)
    df['date'] = pd.to_datetime(df['date'])
    df.set_index('date', inplace=True)
    
    # Aggregate data
    all_profiles[username] = df['value']

# Handle any missing dates
all_profiles = all_profiles.ffill().astype('float')

# Plotting the data
plt.figure(figsize=(15, 10))
for username in list_of_usernames:
    all_profiles[username].plot(label=username)

plt.title('Account Values Over Time')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.show()
```

Excuse my Numeraic ignorance, but I was not around when MMC was used as a payout multiplier. First, to be sure: is it the same as the current CWMM metric in the dashboard? Second, can you provide more reading links on the evolution of MMC and how to optimize a model’s MMC performance?

1 Like

If we were to split the top 10 stakes into two groups – ones that beat the 1yr average and ones that did not – we see that it could still take a year or two for the competitive models to come to prominence. Numerai should be enforcing a high multiplier, not lowering it.

```python
columns_to_sum = ['crowdcent', 'atol', 'efficient_meerkat', 'ummon']
all_profiles['lame_ducks'] = all_profiles[columns_to_sum].sum(axis=1)
columns_to_sum = ['shatteredx', 'mistreated', 'aininja', 'halsmith99', 'phorex', 'yoshiso',]
all_profiles['winners'] = all_profiles[columns_to_sum].sum(axis=1)

plt.figure(figsize=(15, 10))
for username in ['lame_ducks','winners']:
    all_profiles[username].plot(label=username)

plt.title('Account Values Over Time')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.show()
```
2 Likes

The gap widens even more when you look at the top 20 stakes…

at present the top 20 stakes control 50.6% of the meta model. The 12 lame ducks control 38.8%; the 8 winners control just 11.8%.

I went ahead and ran it for the top 50 stakes… I’d say that’s a pretty damn representative sample.

at present the top 50 stakes control 63.3% of the meta model. The 33 lame ducks control 48.7%. The 17 winners control just 14.7%.

I’m actually shocked at the results… Half of the meta model is controlled by users who can’t figure out how to get into the 50th percentile!

look at the graph… It’ll take years for the meta model to sort itself out!

my point is this… don’t keep cutting the payout multipliers. That would be a huge disappointment and bad for the fund. There’s one type of user that’s a BIG problem for the fund… the ones with more money than sense. Why not just crank up and enforce a big multiplier? It doesn’t really matter what metric you use to score predictions… Corr, MMC, TC… whatever… half of the meta model is controlled by people who aren’t even coming close to hitting the mark.

Why not MMCx5? Why not x10? The multiplier and PF are two absolutely arbitrary coefficients, and you let them control your MM’s learning rate?! I never got the payout factor… why let the public decide how fast or slowly the stake weighted meta model adjusts itself?

8 Likes

MMC != CWMM. MMC is well described here: https://forum.numer.ai/t/mmc2-announcement/93. MMC used to be calculated as the performance of the leave-one-out Meta Model (with your model left out) minus the performance of the full Meta Model. MMC2 is basically the performance of your model’s residuals against the target.
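Roughly, and glossing over the exact rank/gaussianization transforms and scaling constants described in the MMC2 post, the residual version is something like:

$$\text{MMC}_u \approx \operatorname{cov}\!\left(t,\; p_u - \frac{\langle p_u, m \rangle}{\langle m, m \rangle}\, m\right)$$

where $p_u$ is your (transformed) predictions, $m$ is the stake-weighted Meta Model, and $t$ is the target.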

1 Like

So MMC score is not calculated and published right now? How can we check our past MMC performance?

1 Like

Let’s hope MMC scores get backfilled soon :face_with_spiral_eyes:

2 Likes

I think the idea is that you can calculate MMC on your own against the historical meta model predictions.

But I don’t want to “reevaluate” all of my 100 models on my own. I would prefer Numerai to just do the backfill and show MMC scores; plus, there is no way to compare your MMC to other users’ MMC scores without Numerai backfilling the metric.

7 Likes

I think this is going to backfire. You are changing the tournament from “make models that predict the target” to “retrain models every round to farm whatever residuum is left”. It is like doing gradient descent without momentum. IMO you should keep corr as part of the reward. If models that are highly correlated with the target don’t help your fund, then it is the target’s fault.

8 Likes

The magnitude of the risk is controlled by the magnitudes of the scores, i.e. the multiplier/scaling level, but again if there is only one level we don’t need that concept. Anyway, if the risk is too high relative to the probability of being positive each round (and by how much), then everybody is guaranteed to go bankrupt no matter how good their model is. This is a simple Kelly criterion calculation. If everyone is forced to “overbet” then we all go broke…mathematically guaranteed. (We can’t only focus on getting rid of bad performers – you can do that by getting rid of everybody. Good performers must be rewarded and not ground into dust also.)
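To make the overbetting point concrete, here is a toy two-outcome calculation (made-up numbers, not Numerai’s actual payout math): a model that is right 55% of the time at ±5% per round compounds nicely at 1x, but its expected log growth flips negative around a 5x effective multiplier.

```python
import numpy as np

def expected_log_growth(p_win: float, gain: float, loss: float, multiplier: float) -> float:
    """Expected per-round log growth of a compounding stake in a two-outcome toy model."""
    return (p_win * np.log(1 + multiplier * gain)
            + (1 - p_win) * np.log(1 - multiplier * loss))

# A hypothetical model that wins 55% of rounds at +/-5% per round:
for k in [1, 2, 5, 10]:
    g = expected_log_growth(p_win=0.55, gain=0.05, loss=0.05, multiplier=k)
    print(f"{k:>2}x multiplier: expected log growth per round = {g:+.4f}")
# Positive at 1x and 2x; by roughly 5x the same (good!) model has negative
# expected log growth, i.e. compounding grinds the stake toward zero over time.
```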

8 Likes

The big danger with MMC only (and we discussed this a lot back in the MMC days the first time) is that without a “center” of corr (or something absolute), the metamodel will just move around (in predictive space, or whatever that would be called – the direction of the predictions) for no other reason than that people are constantly moving their models around chasing the residuals, since that is all there is. (Most likely it would oscillate between two or three main directions that match the inductive biases of the underlying models.) However, since the market is a moving target to begin with and it is all very complex, this dynamic is not quite guaranteed; we’ll just see. The old MMC tended to be just an extension/magnifier of whatever your corr was most of the time. It behaved nothing like TC – we’ll have to see on a new target.

But if it is a zero-sumish game, then it comes down to something like “you must be right at least 52% of the time” (or some X%). This is how sports bettors think – meet that mark or else you’re losing. And if the math/magnitude of the scores and the risk controls we are offered are not worked out to be just right (if they even can be in such a crazy game), then there is definitely a danger of enforced losing for everybody – it will simply be impossible to make any reward except by luck in the short term. (Time for an asymmetrical score, perhaps?)

So far they have stubbornly refused to give us any real risk controls (i.e. stake management, which Richard explicitly refused in the last fireside) and of course automatic compounding is great if you’re winning, but it is also an excellent way to drive you straight into that overbetting zone where you are guaranteed to lose. (AGAIN, even with a good model! Overbetting kills everybody – good and bad.)

So while I’m actually cautiously optimistic that this general idea can be made to work… well, their track record of thinking these things through is, unfortunately and undeniably at this point… not great. And they always seem to be in a mad rush to implement the new thing and dump the old thing before we can really tell how it is going to go. Like… why not actually see how it goes instead of just jumping off the cliff and hoping? (EVERY policy has its price and its unintended consequences – we could all 100% agree that “whatever plan” sounds great and the math is perfect, etc., and it could still have some fatal flaw nobody thought of but that is easily seen in practice… if you only give it some time in practice to find out, before totally abandoning the old thing.) Too many nasty rug pulls and there just won’t be anybody left… and we’ve already had one or maybe two this year, depending on how you count them…

21 Likes

Numerai will no longer make payouts based on CORR or TC, all payouts will be on MMC only on a new upcoming target called Teager starting the end of this year.

With this, do you mean that the scoring system will change to a new target with a “new-old” metric in the next month? Or just that the new target will be released?

I think that some overlap is needed to let people adjust or create new models. If lots of people unstake their models not because they are performing badly, but because they don’t know how they are going to perform, that could be very dangerous for the MM’s performance.

5 Likes

The approach does feel disappointingly haphazard and ad hoc. If Numerai and @richardai want longer-term commitments (as stated in the linked video), I think they can foster that by achieving greater stability in the way the competitions are organized.

For example, several months before introducing a substantial change, perhaps Numerai could open a third competition–let’s call it Playground (to go with Classic and Signals)–where everyone recognizes that things could change quickly, and there would be no staking. Just some gold stars for consistent participation, or maybe some credits to apply to the staked competitions.

The purpose would be to provide an area where there really isn’t any risk, so Numerai could beta test their ideas and competitors could get a handle on what they would need to do to compete in the Tournament given a proposed update. Then, if Numerai finds that the Playground results are leaning towards an improvement in the Tournament and the bugs have been worked out, switching that into the main Tournament would be easy.

And when there are no proposed updates, just let the Playground run under the same process as the Tournament, but without staking.

11 Likes

My Numerai Tourney experience lately.
Sometimes you eat the Bear, sometimes the Bear eats you.

9 Likes

So if we are now going to have a new Target and MMC to evaluate our predictions, how do I calculate MMC?

3 Likes