MMC Payouts Adjustment Proposal

Obviously this looks great for my models. I like to pick models that are off the beaten path, and up until now that has treated me quite well. I switched to the MMC tournament for 3 submissions and I’m significantly under-performing CORR. I am VERY interested in this new approach. Can you explain what you mean by “exposed to MMC at the same time as correlation”? Does it mean take the max(CORR, 2*MMC)? Probably not, so what are the cons? I see the pros!

3 Likes

It means simply CORR + MMC. (both 1x)

2 Likes

@master_key, would you be increasing the 1*MMC multiplier as MMC gets more difficult to find?

@jrb, we probably want the encrypted one-way MMC loss function for the training and validation sets, not for unlabeled eras, and we want to be able to make 100 passes across it per week. @master_key, we only need this function for the last week. That is, publish only last week’s encrypted meta model loss function, not even an estimate of the future model. That would be a very solid way of creating unique models for the future, and it’s a lot easier.

@lackofintelligence What is an “encrypted one way MMC loss function”? That’s not how a HE system works. What you’re asking for could be trivially exploited, if implemented.

@jrb, I think you understand. The idea is that there is some difficulty in obtaining the loss. Maybe a minute to obtain the loss for the entire dataset. Even if somebody expends a week of computer power to obtain an estimate of last week’s metamodel predictions for all of the rows, it is just last week’s metamodel predictions on the training set. What good is it? In fact, if you think about it, why wouldn’t Numer.ai release it totally free? It does not give you predictions on any live data. In fact, I’ll just say that right now. Why not just release last week’s metamodel predictions on the training and validation data sets? That would be the absolute best place to start looking for a unique model.

@wigglemuse is correct, it’s literally MMC + CORR = Payouts. So if your MMC is -0.1 and your Corr is 0.3, your payout would be 0.2
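
The arithmetic above can be sketched in a few lines. The function name `combined_payout` and its multiplier parameters are hypothetical, chosen only for illustration; they are not part of any Numerai API.

```python
def combined_payout(corr, mmc, corr_mult=1.0, mmc_mult=1.0):
    """Per-round payout factor under the proposed CORR + MMC scheme (both 1x by default)."""
    return corr_mult * corr + mmc_mult * mmc


# the example from the post: MMC of -0.1 and CORR of 0.3
payout = combined_payout(corr=0.3, mmc=-0.1)
print(round(payout, 4))

# for comparison, staking on MMC only at 2x (assumed multipliers)
mmc_only = combined_payout(corr=0.3, mmc=-0.1, corr_mult=0.0, mmc_mult=2.0)
print(round(mmc_only, 4))
```

Setting the multipliers differently makes it easy to compare CORR+MMC against the existing CORR-only and 2x MMC options for the same round.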

Good question. I’m not sure how much harder it will be to get MMC over time! We’ll stay committed to making sure we reward models that help us though. Just don’t know if it will be by a multiplier for MMC or what

3 Likes

Seems like a quite usable system. Can’t speak for anyone else but it’d certainly push me more towards MMC.

2 Likes

For Numerai’s long-term success, among other things, it is important that the users with the best and most unique models have the highest returns.

Just CORR only covers rewarding the ‘best’ models and is therefore unfit. I would even go as far as saying that it is up for replacement as soon as something better comes along.

To increase model uniqueness, MMC was introduced earlier this year. But MMC is not just model uniqueness, it also covers relative model performance. Unfortunately, this proved to be its biggest flaw. Let me give an (unfortunately fictional) example: I have made the perfect model (for Numerai), an insanely consistent model with 0.03 CORR and a seemingly impossibly high sharpe. With such a consistent model I would for sure like to stake on CORR: a stable ~3% return each week is both safe (low risk) and more than profitable enough. Given the uniqueness of this model you would think MMC would be even more profitable. And it might be in the long term. But in the recent period this would result in losses far exceeding your CORR risks & drawdowns. Why? Because all the boosting models achieve >0.06 CORR in these tournament rounds, leaving you with negative MMC. And such a risk makes it not worth switching to MMC from your stable CORR returns.
Relative performance is not a good metric. Integration_Test outperforms NasdaqJockey on more weeks than vice versa, but it is in those weeks where Integration_Test does badly that NasdaqJockey shines, and that is the sole reason why it is rightfully praised as a model.

Now MMC+CORR is proposed. I like it much more than just increasing the MMC payouts (i.e.: MMC payouts get 3*MMC values). It covers both unique and good models and could be seen as best of both CORR and MMC. In theory, one could therefore suggest to replace both current tournaments with this new CORR+MMC tournament.

Although parts might be unclear or misunderstood, this message is not meant as critique. I applaud MikeP in his search for the ‘best’ (in all ways) payout metric and whilst it might not last for much longer, MMC was a good step in this journey and we can learn much from it. I believe that CORR+MMC could be a useful next step. But I have the feeling we can come up with something better. CORR+MMC still has the ‘relative performance’ drawback of MMC that I do not like.

I have some ideas, but nothing worth sharing yet. But these thoughts might help others:

  • Can we change MMC to cover only uniqueness? Just as CORR rewards raw performance?
  • Is uniqueness as simple as correlations with other models, or more complicated? If trading is done only on the largest sell/buy signals and/or Numerai first neutralizes our predictions, shouldn’t we include this somewhere/somehow?
  • If NasdaqJockey is one of the ‘best’ models for Numerai - why is it not on top of either (or the combined) leaderboard? Does MMC not correlate that well with their in-house metrics of model usefulness?
  • To reward unique and consistent good performing models like NasdaqJockey, we might need to move away from just correlation and move towards sharpe-based metrics? Sharpe between rounds? Sharpe within rounds (over the individual days of a round)?
  • EDIT: If we would have a metric for uniqueness, integration_test should be one of the worst models on this metric. OLD: Why does integration_test not have a negative MMC consistently? It is the least unique model there is.
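
To make two of the questions above concrete, here is a minimal sketch of a sharpe-between-rounds score and a naive uniqueness measure based on correlation with other models. Both definitions are assumptions made for illustration, not Numerai's metrics; the function names and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def round_sharpe(round_scores):
    """Mean of per-round scores divided by their sample standard deviation."""
    scores = np.asarray(round_scores, dtype=float)
    return scores.mean() / scores.std(ddof=1)


def mean_corr_with_others(preds, model):
    """Average rank correlation between one model's predictions and every other column."""
    corr = preds.rank().corr()  # Pearson on ranks, i.e. Spearman correlation
    return corr.loc[model].drop(model).mean()


# toy predictions: 'clone' is a monotone transform of 'example', 'unique' is independent noise
rng = np.random.default_rng(0)
example = rng.random(1000)
preds = pd.DataFrame({
    "example": example,
    "clone": example ** 2,
    "unique": rng.random(1000),
})

for name in preds.columns:
    print(name, round(mean_corr_with_others(preds, name), 3))

print(round(round_sharpe([0.02, 0.04]), 4))
```

A lower average correlation means a more unique model under this naive definition. Note that it says nothing about performance, which is exactly the separation the first bullet asks about.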
3 Likes

It makes sense for integration_test to have positive MMC as it still is one of the best models outright. Any model can (theoretically) get copied, so uniqueness changes with time and trends. (The only reason integration_test is not unique is nothing intrinsic to it – it is because everybody copies it or just submits the exact duplicate predictions.) Negative MMC for a round is basically saying the metamodel would have been better off without this model in it (for this round). So lack of uniqueness itself can’t be the thing that gets you negative MMC – it always has to be tied up with performance somehow.

2 Likes

Thanks for your response Wigglemuse. I understand what you mean and fully agree. My last sentence was meant slightly differently, more like: “If we would have a metric for uniqueness, integration_test should be one of the worst models on this metric.” Edited this in my original post.

If there were a payout scheme based purely on performance and uniqueness, integration_test should be rewarded positively for its performance, and negatively because it is probably one of the least unique models around.

1 Like

Well, that’s the question – “negatively” doesn’t seem right, but whatever uniqueness bonus is given, it shouldn’t get much relative to others certainly. And simple CORR+MMC accomplishes that. I’d still be interested in comparing the numbers of different ideas though, so I’m gonna see if I can pull some data since nobody is taking my hints to do it for me. (Somebody else will have to make graphs though.)

Another proposal to merge CORR & MMC: A Dynamic Payout Scheme

Motivation:
If models with MMC>0 have Mean(CORR)=0.0318 and Mean(MMC)=0.015 and you demand models with high MMC, the multiplier of MMC should be at least twice the multiplier of CORR. That is to say:
Payout = w CORR + (2-w) MMC ; such that w<0.667
Why? Because improving CORR by +2d is easier than improving MMC by +d. This holds in average terms; in marginal terms it can change a little.

Proposal:
1.- Start with this initial scheme:
Payout = w CORR + (2-w) MMC ; such that w=0.65
2.- Adjust “w” depending on the marginal improvement of the average CORR and MMC over time.
3.- In this way the payout scheme can be changed for every tour in order to give an incentive to the submission of high MMC models.
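
The initial scheme can be sketched as follows. Requiring the MMC multiplier (2-w) to be at least twice the CORR multiplier w gives 2-w >= 2w, i.e. w <= 2/3, which is where the 0.667 bound comes from. The function name `dynamic_payout` is hypothetical, and the rule for adjusting w in step 2 is left out because the proposal does not specify it.

```python
def dynamic_payout(corr, mmc, w=0.65):
    """Payout = w * CORR + (2 - w) * MMC, with the MMC multiplier at least twice the CORR multiplier."""
    if w > 2.0 / 3.0:
        raise ValueError("w must be at most 2/3 so that (2 - w) >= 2 * w")
    return w * corr + (2.0 - w) * mmc


# the average figures quoted in the motivation: Mean(CORR)=0.0318, Mean(MMC)=0.015
print(dynamic_payout(corr=0.0318, mmc=0.015))
```

With w=0.65 the CORR and MMC terms contribute almost equally for these average figures, which is the balance the motivation is aiming for.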

An example is included in a “Tournament category” post.

@master_key would you consider keeping 2*MMC as an option, for people who want to only stake on MMC?

Some code to check your historical payout using only CORR or MMC or CORR+MMC:

#!/usr/bin/env python3

import numerapi
import matplotlib.pyplot as plt
import pandas as pd
import sys
import numpy as np

api = numerapi.NumerAPI()

# metrictoplot = 'corr'
# metrictoplot = 'mmc'
# metrictoplot = 'comb'
# metrictoplot = 'all'

metrictoplot = sys.argv[1]

username_list = ['integration_test', 'sugaku']

fig1 = plt.figure()
cmap = plt.get_cmap('tab20b', len(username_list)*3) # plt.cm.get_cmap was removed in recent matplotlib releases
i = 0
for user in username_list:
	print("Collecting data for: ", user)
	user_df = pd.DataFrame(api.daily_submissions_performances(user)).sort_values(by="date").groupby("roundNumber").last()
	start_round=np.min(user_df.index)
	end_round=np.max(user_df.index) # most recent resolved round
	stake_corr = 1.0 # initial stake
	stake_mmc = 1.0
	stake_comb = 1.0
	for r in range(start_round, end_round + 1): # include the most recent resolved round
		if r in user_df.index:
			corr_score = user_df.loc[r, "correlation"]
			mmc_score = user_df.loc[r, "mmc"]
		else:
			corr_score = None # no submission for this round
			mmc_score = None
		# missing or unresolved scores leave the stake unchanged
		if pd.isna(corr_score) or pd.isna(mmc_score):
			corr_score = 0.0
			mmc_score = 0.0
		stake_corr *= 1.0 + corr_score*1.0
		stake_mmc *= 1.0 + mmc_score*2.0 #2x leverage for mmc
		stake_comb *= 1.0 + corr_score+mmc_score
		user_df.loc[r, "weekly_stakes_corr"] = stake_corr
		user_df.loc[r, "weekly_stakes_mmc"] = stake_mmc
		user_df.loc[r, "weekly_stakes_comb"] = stake_comb
	user_df = user_df.sort_index() # keep rounds in order if rows were added for gaps

	color = cmap(float(i)/len(username_list))

	if metrictoplot == "corr":
		plt.title('Expected CORR payout for models', fontsize=17)
		user_df.weekly_stakes_corr.plot(label=user, color=color)
		plt.text(end_round-0.75, user_df.loc[r, "weekly_stakes_corr"], user, color=color, fontweight="bold")
	if metrictoplot == "mmc":
		plt.title('Expected MMC payout for models', fontsize=17)
		user_df.weekly_stakes_mmc.plot(label=user, color=color)
		plt.text(end_round-0.75, user_df.loc[r, "weekly_stakes_mmc"], user, color=color, fontweight="bold")
	if metrictoplot == "comb":
		plt.title('Expected CORR+MMC payout for models', fontsize=17)
		user_df.weekly_stakes_comb.plot(label=user+'_comb', color=color)
		plt.text(end_round-0.75, user_df.loc[r, "weekly_stakes_comb"], user+'_COMB', color=color, fontweight="bold")
	if metrictoplot == "all":
		plt.title('Expected CORR and MMC payout for models', fontsize=17)
		user_df.weekly_stakes_corr.plot(label=user+'_corr', color=color)
		plt.text(end_round-0.75, user_df.loc[r, "weekly_stakes_corr"], user+'_CORR', color=color, fontweight="bold")
		user_df.weekly_stakes_mmc.plot(label=user+'_mmc', color=color)
		plt.text(end_round-0.75, user_df.loc[r, "weekly_stakes_mmc"], user+'_MMC', color=color, fontweight="bold")
		user_df.weekly_stakes_comb.plot(label=user+'_comb', color=color)
		plt.text(end_round-0.75, user_df.loc[r, "weekly_stakes_comb"], user+'_COMB', color=color, fontweight="bold")
	i += 1

plt.grid(linestyle='--', linewidth=0.5, color="black")
plt.xlabel('Round number')
plt.ylabel('Expected payout factor')
plt.xticks(np.arange(start_round, end_round + 1, 1), rotation=60) # include a tick for the last round
ax = plt.gca()
ax.set_facecolor((0.9, 0.9, 0.9))
plt.show()

sys.exit()
10 Likes

Similar to the above, but with a rolling window over the rounds
(update 12:01:31 25 July 2020: reduce repeated code, accept rolling window size as input)

#!/usr/bin/env python3

import sys
import numerapi
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

api = numerapi.NumerAPI()

metrics = ["correlation", "mmc", "comb"]
metrictoplot = sys.argv[1]
if metrictoplot not in metrics + ["all"]:
    raise Exception("Valid metric values are %s" % (metrics + ["all"]))

username_list = ['integration_test', 'nasdaqjockey']

plt.figure(figsize=(18, 6))
cmap = plt.get_cmap('cubehelix', len(username_list)*3)  # plt.cm.get_cmap was removed in recent matplotlib releases
rolling_window_size = int(sys.argv[2])

metric_to_style = {"correlation": "-", "mmc": "--", "comb": ":"}

for i, user in enumerate(username_list):
    print("Collecting data for:", user, flush=True)
    user_df = pd.DataFrame(api.daily_submissions_performances(user))\
        .sort_values(by="date")\
        .groupby("roundNumber")\
        .last()
    start_round = user_df.index.min()
    end_round = user_df.index.max()
    user_df["comb"] = user_df["correlation"] + user_df["mmc"]
    user_df["mmc"] *= 2

    rolling_series = (user_df[metrics].fillna(0) + 1) \
        .rolling(rolling_window_size)\
        .apply(np.prod, raw=True)

    color = cmap(float(i)/len(username_list))

    if metrictoplot == "all":
        plt.title('Expected correlation, mmc and comb payout for models', fontsize=17)
        for metric in metrics:
            rolling_series[metric].plot(label=user, color=color, ls=metric_to_style[metric])
            plt.text(end_round, rolling_series[metric].values[-1], "%s - %s" % (user, metric), color=color, fontweight="bold")    
    else:
        plt.title('Expected %s payout for models' % metrictoplot, fontsize=17)
        rolling_series[metrictoplot].plot(label=user, color=color, ls=metric_to_style[metrictoplot])
        plt.text(end_round, rolling_series[metrictoplot].values[-1], "%s - %s" % (user, metrictoplot), color=color, fontweight="bold")    

plt.grid(linestyle='--', linewidth=0.2, color="black")
plt.xlabel('Round number')
plt.ylabel('Expected payout factor')
plt.xticks(np.arange(start_round, end_round + 1, 1), rotation=60)  # include a tick for the last round
ax = plt.gca()
plt.show()
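
The rolling computation in the script above can be checked on toy numbers. This is a minimal sketch with made-up scores and a window of 2 rounds; the round numbers and values are hypothetical.

```python
import numpy as np
import pandas as pd

# hypothetical per-round scores for a single model, indexed by round number
scores = pd.Series([0.10, -0.05, 0.02, 0.03], index=[200, 201, 202, 203])

# compounded payout factor over each window of 2 consecutive rounds
rolling_factor = (scores + 1).rolling(2).apply(np.prod, raw=True)
print(rolling_factor)
```

The first `window - 1` entries are NaN because a full window is not yet available, so the plotted line starts that many rounds after the model's first round.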
3 Likes

It is an improvement compared to the previous situation. In particular, it is fairer to the best MMC users.
However, I think it is not enough of an incentive for users to focus on MMC rather than CORR. You will see it in a couple of months.

Why do you think that such incentives should exist? In my opinion, developing high MMC models is good for your CORR by itself. If your high MMC model has low CORR but is unique enough, you’ll just combine your model with the example predictions and get a model with positive MMC and high CORR. If your high MMC model has high CORR, you will already be happy.
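
One simple way to do the combination mentioned above is to average percentile ranks. This is only a sketch under that assumption; the post does not say how the models would be combined, and `blend_ranked`, `own_weight` and the toy values are hypothetical.

```python
import pandas as pd


def blend_ranked(own, example, own_weight=0.5):
    """Weighted average of the percentile ranks of two prediction series sharing an index."""
    own_rank = own.rank(pct=True)
    example_rank = example.rank(pct=True)
    return own_weight * own_rank + (1.0 - own_weight) * example_rank


# toy predictions for four rows
own = pd.Series([0.9, 0.1, 0.5, 0.3], index=["a", "b", "c", "d"])
example = pd.Series([0.2, 0.4, 0.6, 0.8], index=["a", "b", "c", "d"])

blended = blend_ranked(own, example)
print(blended)
```

Ranking first puts both models on the same scale, so `own_weight` controls how much of the unique signal is kept relative to the example predictions.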

1 Like

jackerparker look at this topic
Discussion on incentives: MM clones vs MM improvement