Signals payout improvement?

v1nc3n7 · April 27, 2021, 1:43pm

Hi!

Signals payouts are currently low, and the Numerai team is looking for possible improvements. One purpose of this post is to discuss whether the Signals payout curve could be modified to make Signals more attractive.

Below is the message that I originally posted in the chat:

Currently, on Signals, a very high variance signal that is correct 56% of the time loses money. Here, very high variance means a signal that reaches the payout cutoff in both directions all the time. However, combining signals from multiple users should reduce the variance, so such a signal may still be of interest for the meta-model.
Let’s now consider a more reasonable signal that is staked with 2xMMC to maximize profit. Applying a 2x coefficient to both correlation and MMC increases the variance. To simplify, let’s assume that the payout (equal to 2 x corr + 2 x MMC) only takes values in {+10%, -10%}. To break even, the signal must be correct 52.5% of the time. If the signal is correct 55% of the time, it will have an average return of 0.5% per round. To increase the return of such a signal, we could change the payout curve. What about attacks? It is important to note that, to avoid attacks of type p | 1-p, the payout curve doesn’t have to be symmetric. Taking into account how returns are compounded, a payout function x |-> x if x >= 0 and x / (1 - x) if x < 0 would make p | 1-p attacks just break even. With such a payout function, the previous signal’s return would go up from 0.5% to 0.96%, almost double. Furthermore, if the signal is correct only 52.5% of the time, rather than breaking even, it would have a positive return of 0.48%.
I believe that the above payout function would be more in line with the purpose of Signals, namely gathering new original signals that could improve the meta-model, even if these signals are very weak. Furthermore, this change would be particularly easy to implement.

I am going to give a few more details now.

To simplify the problem, let’s assume that our weekly result (for example 2 x corr + 2 x MMC) only takes values in \{r, -r\}, and write p for the fraction of rounds with a positive result. Let’s check how often we must get a positive result with the current symmetric payout curve in order to break even:

(1 + r)^p (1 - r)^{1 - p} = 1 \iff p = \frac{-\log(1-r)}{\log(1+r) - \log(1-r)}.

That means that if our results only take values in \pm 25\%, we need to be correct \frac{-\log(1-0.25)}{\log(1+0.25) - \log(1-0.25)} = 56.3\% of the time to break even.
If the values are in \pm 10\% (resp. \pm 5\%), the predictions need to be correct 52.5\% (resp. 51.25\%) of the time to break even.
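The break-even percentages above can be reproduced with a few lines of Python. This is only a sketch of the closed-form expression for p; the function name `break_even_probability` is made up for illustration.

```python
import math


def break_even_probability(r):
    """Fraction p of winning rounds such that (1 + r)^p * (1 - r)^(1 - p) == 1."""
    return -math.log(1 - r) / (math.log(1 + r) - math.log(1 - r))


for r in (0.25, 0.10, 0.05):
    p = break_even_probability(r)
    print(f"results in +-{r:.2f}: break-even at {p * 100:.2f}% correct")
```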

We see here that any signal that doesn’t have a high number of positive eras will probably either lose money or barely make any money. That seems contradictory to the fact that with Signals, we have to bring our own data, and while such data can be of interest to Numerai, we cannot expect to have results as good as the ones we can get in the classic tournament that is using expensive financial data.

Let’s now consider an asymmetric payout curve defined by:

f(x) = \left\{\begin{array}{ll} x & \text{if } x \ge 0 \\ \frac{x}{1 - x} & \text{if } x < 0 \end{array} \right.

This function has the following property: (1 + f(x)) (1 + f(-x)) = 1.
That is, if we submit p and 1 - p, we get corr and -corr as results, and therefore the payout is break-even.
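For completeness, here is a short check of that property, written for x \ge 0 (the case x < 0 follows by swapping the roles of x and -x). Note that it concerns the product of the two growth factors, that is, both results applied one after the other to the same stake.

```latex
1 + f(-x) = 1 + \frac{-x}{1 - (-x)} = 1 - \frac{x}{1 + x} = \frac{1}{1 + x},
\qquad\text{so}\qquad
(1 + f(x))\,(1 + f(-x)) = (1 + x) \cdot \frac{1}{1 + x} = 1 .
```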

def f(x):
    # Asymmetric payout: gains are kept as-is, losses are shrunk to x / (1 - x)
    if x >= 0:
        return x
    else:
        return x / (1 - x)


r = 0.1
print('With results in +-0.10:')
for c in [55, 52.5, 51]:  # c is the percentage of correct predictions
    # Geometric mean of the per-round growth factors over 100 rounds, minus 1
    payout = ((1 + f(r))**c * (1 + f(-r))**(100 - c)) ** (1 / 100) - 1
    print(f'- if the signal is correct {c}% of the time, the average payout is {payout * 100:.2f}%')

With results in +-0.10:
- if the signal is correct 55% of the time, the average payout is 0.96%
- if the signal is correct 52.5% of the time, the average payout is 0.48%
- if the signal is correct 51% of the time, the average payout is 0.19%

We could also add to this payout function a small penalty. That would make signals with very high variance and no prediction value lose a bit of money rather than just being break-even:

def f(x, eps):
    # eps >= 0 is the size of the extra penalty applied to losses only
    if x >= 0:
        return x
    else:
        return x / (1 - x) + eps * x
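To get a feel for the size of the penalty, the sketch below compounds one +10% result and one -10% result on the same stake for a few values of eps. The eps values and the helper name `pair_growth` are arbitrary choices for illustration, not something proposed by Numerai.

```python
def f(x, eps):
    # Penalized payout from the post: losses get an extra eps * x term.
    if x >= 0:
        return x
    return x / (1 - x) + eps * x


def pair_growth(r, eps):
    # Growth factor of one stake after a +r round followed by a -r round.
    return (1 + f(r, eps)) * (1 + f(-r, eps))


for eps in (0.0, 0.05, 0.10):  # hypothetical penalty sizes
    growth = pair_growth(0.10, eps)
    print(f"eps = {eps:.2f}: growth over a +10% / -10% pair = {growth:.4f}")
```

With eps = 0 the pair is exactly break-even; any positive eps pushes the product below 1.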

That’s all for today, thank you for reading.

of_s · April 27, 2021, 2:11pm

Comparing Signals with the classic tournament is apples to oranges, solely because of the difference in prediction windows. If Signals were extended to predicting returns a month out, then we could generalize to statements about data quality. I doubt the classic tournament would have such positive results predicting 6 days out…

Otherwise, your payout suggestion is sensible to reward participation with lower percentages.

degerhan · April 27, 2021, 5:56pm

Following up on the comment from @of_s, with the huge caveat that this data is 10+ years old and the rows below are cherry-picked: these are rank-based predictions that drove an S&P 500 market-neutral fund for 5+ years. Annual returns averaged 20.2%, which used to be near the top of the heap for market-neutral systematic funds.

Prediction horizon (weeks)	1	2	4	13
Return metric, for horizon	4.79	17.41	40.54	149.68
Return metric, avg per week	4.79	8.71	10.14	11.51
Alpha contribution (annualized)	-0.20%	5.09%	6.48%	8.58%

My intuition is that a 2 to 4 week prediction window will yield better results for Signals, and the performance gap between the main tournament and Signals might support that intuition. However, feel free to give me a hard time about this, as I haven’t done a study with recent data or the Signals universe.

I think I heard @mdo say before that targets matter more than modeling techniques. I wonder if Numerai has done or will do a study to determine the optimal Signals target horizon.

gund · April 28, 2021, 9:57am

Interesting! Can you please explain how you came up with this formula?

Or why do you take the payout to be (1+r)^p * (1-r)^(1-p)? Shouldn’t the expected payout be something like (1+r)p + (1-r)(1-p)?

v1nc3n7 · April 28, 2021, 11:24am

Indeed, it is not such a great comparison. For sure, the shorter prediction window makes the problem significantly more difficult. Overall, due to the multiple differences, the correlation we get on Signals should on average be much lower than the one on the classic tournament, and the variance should also be higher. That’s why rewarding lower percentages seems important.

v1nc3n7 · April 28, 2021, 11:30am

What would be great is to be able to make predictions at different horizons. Furthermore, having multiple targets to optimize for could improve the results. That’s actually what I would prefer.
But from the data you show, if we had to stick to only one horizon, 2 weeks could be a sweet spot for Numerai since it would still be significantly different from the classic tournament.

v1nc3n7 · April 28, 2021, 11:47am

The name of the variable (payout) is not a good choice; average return might be better.
Here, f(0.1) and f(-0.1) are respectively the returns for positive and negative rounds. Over 100 rounds, let’s assume that 55 rounds are positive. Then the invested amount will be multiplied by a coefficient (1 + f(0.1))^{55} (1 + f(-0.1))^{45}. To get the average per round, we raise to the power of 1/100.
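To make the difference between the two formulas concrete, here is a small sketch comparing the arithmetic expectation of a single round with the compounded per-round growth. It uses the current symmetric payout, r = 0.1 and 55% positive rounds; the function names are made up for illustration.

```python
def arithmetic_mean_return(r, p):
    # Expected return of a single round: p * (+r) + (1 - p) * (-r).
    return p * r + (1 - p) * (-r)


def geometric_mean_return(r, p):
    # Per-round growth when the stake is compounded round after round.
    return (1 + r) ** p * (1 - r) ** (1 - p) - 1


r, p = 0.10, 0.55
print(f"arithmetic mean per round: {arithmetic_mean_return(r, p) * 100:.2f}%")
print(f"geometric mean per round:  {geometric_mean_return(r, p) * 100:.2f}%")

# Same thing the long way: compound 55 winning and 45 losing rounds, then take the 100th root.
total = (1 + r) ** 55 * (1 - r) ** 45
print(f"100th root of the total:   {(total ** (1 / 100) - 1) * 100:.2f}%")
```

Because the stake is reinvested each round, the geometric figure is the one that describes how the staked amount actually grows, and it is always at or below the arithmetic one.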

of_s · April 28, 2021, 2:30pm

Why should the variance (of correlations) be higher? I would think the longer the timeframe, the more dispersed the scores.

If you proxy the 6 day returns with noise, then while scores on average will be lower, their variance or range should be too.

v1nc3n7 · April 28, 2021, 2:58pm

I was thinking with respect to the scaled correlation, so it would be proportionally higher. It is probably better to say that Sharpe would be lower. Thank you for following.

_liamhz · May 26, 2021, 7:44pm

Does this payout curve prevent p | 1-p attacks when stakes are rebalanced amongst the models trying to exploit the payouts?

Model A has a corr of 0.05, model B has a corr of -0.05. Both are staked with 1 NMR.

After a round, we get the following stake values (assuming payouts on 1xCorr and 0xMMC for simplicity):

Model A: 1.05
Model B: 0.952380952 (1 + (-0.05) / (1 - (-0.05)))

After the two models rebalance their stakes to be equal between them, each will have a stake of 1.00119048 NMR ((1.05 + 0.952380952) / 2), i.e. a net gain for the pair.
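The example above can be extended to several rounds with a small simulation. This is only a sketch: it assumes, as in the example, that the two models always score exactly opposite correlations, that payouts are on 1xCorr only, and that the stakes are equalized after every round. The function name `rebalanced_pair` is made up for illustration.

```python
def f(x):
    # Asymmetric payout proposed earlier in the thread.
    return x if x >= 0 else x / (1 - x)


def rebalanced_pair(corr, rounds, stake=1.0):
    """Combined stake of a p | 1-p pair that rebalances after each round."""
    a = b = stake
    for _ in range(rounds):
        a *= 1 + f(corr)    # model A scores +corr
        b *= 1 + f(-corr)   # model B scores -corr
        a = b = (a + b) / 2  # equalize the two stakes
    return a + b


for rounds in (1, 10, 52):
    total = rebalanced_pair(0.05, rounds)
    print(f"after {rounds:2d} round(s): {total / 2:.8f} NMR per model")
```

After one round this gives the 1.00119048 NMR per model from the example, and the gain then compounds round after round even though the pair carries no information.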

v1nc3n7 · May 27, 2021, 9:15am

You are right, it doesn’t prevent attacks. It does allow weaker signals to get positive returns, but at the cost of opening the door to attacks. The error above is that in the case of p | 1-p attacks, we are not compounding the returns consecutively with respect to one initial stake, but separately with respect to two stakes. Since we don’t want to leave room for any attack, this payout is not a good proposal. This means we have to keep a purely symmetrical payout function, and that weak signals will not be able to get a positive return. My bad for this mistake.
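A sketch of the argument in formulas, under the assumption that the attacker puts the same stake s on both models and that they score c > 0 and -c. Here g stands for a generic payout function (a symbol introduced only for this note) and f is the asymmetric payout defined earlier.

```latex
% Combined stake after one round with payout function g:
s\,(1 + g(c)) + s\,(1 + g(-c)) = 2s + s\,\bigl(g(c) + g(-c)\bigr)

% Symmetric payout g(x) = x: the pair exactly breaks even.
g(c) + g(-c) = c - c = 0

% Asymmetric payout f: the pair gains on every round.
f(c) + f(-c) = c - \frac{c}{1 + c} = \frac{c^2}{1 + c} > 0
```

For c = 0.05 the last expression gives 0.0025 / 1.05 ≈ 0.00238, which matches the combined stake of about 2.00238 NMR in the example above.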

_liamhz · May 27, 2021, 4:57pm

No worries, I really appreciate the effort you put into designing this! It was interesting enough that internally we had started thinking about implementing this, which is why I spent the time searching for attacks.
