We’ve been working on new ways to increase the payouts on Numerai Signals. We want early Numerai Signals users to be able to earn as much as the early Numerai users have.
We analyzed how well all current Numerai Signals users would do if their signals were instead scored against the new longer horizon target. It turns out that most users do much better on the new longer horizon target – even though the new target is still neutral to the same features.
| | Current target | New target |
|---|---|---|
| Mean user corr | 0.000801 | 0.003860 |
| Standard deviation of corr | 0.004622 | 0.005687 |
| Average corr Sharpe | 0.0495 | 0.2340 |
| # of users who have higher corr mean on target | 20 | 58 |
| # of users who have higher corr Sharpe on target | 17 | 61 |
If we had scored on this new target, the mean correlation with the target would rise to about 4.8 times its current value (0.000801 to 0.003860, an increase of roughly 380%), while the standard deviation would increase by only about 23% (0.004622 to 0.005687), resulting in a much higher average Sharpe for NMR staked on Numerai Signals (0.0495 to 0.2340). It’s important to note that payouts would likely improve even further once we gave out these targets historically, updated the validation diagnostics to show results based on these targets, users optimized their signals to be good at this target, etc.
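The ratios behind this comparison can be recomputed directly from the table. The snippet below is a minimal sketch that uses only the numbers reported above; the dictionary keys and the `ratio` helper are illustrative names, not part of any Numerai API. Note that the reported average Sharpe is presumably an average of per-user Sharpe ratios, so it need not equal the mean divided by the standard deviation.

```python
# Values copied from the comparison table in the post.
current = {"mean_corr": 0.000801, "std_corr": 0.004622, "avg_sharpe": 0.0495}
new = {"mean_corr": 0.003860, "std_corr": 0.005687, "avg_sharpe": 0.2340}


def ratio(key):
    """New-target value divided by current-target value."""
    return new[key] / current[key]


mean_ratio = ratio("mean_corr")
std_ratio = ratio("std_corr")
sharpe_ratio = ratio("avg_sharpe")

for name, r in [("mean corr", mean_ratio),
                ("std of corr", std_ratio),
                ("avg corr Sharpe", sharpe_ratio)]:
    # Show both the multiple and the percentage change.
    print(f"{name}: {r:.2f}x ({(r - 1) * 100:+.0f}%)")
```

Reporting both the multiple and the percentage change avoids confusing "4.8 times as large" with "up 480%".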
In the next few days we will be showing a new correlation on Numerai Signals called CORR20. This will be your signal’s correlation on the new target.
Wow awesome news! Great timing with the signals roundtable coming up today too. This is even more of a motivation to get moving on signals models. Looking forward to learning more tomorrow.
This sounds like a great idea, Richard; having both the existing CORR6 comp and a CORR20 one might also be attractive for users, and if that were feasible to operate, running like that for at least some period of time to see what happens and whether it’s beneficial could be worthwhile.
Allowing 3-4x MMC is a great way to increase payouts, but it also increases the risk of large negative returns.
One thing important to me with Signals is that staked NMR are blocked only for 10 days, vs 4 weeks on the tournament. Staking on CORR20 wouldn’t be a good thing in that sense.
What about allowing both short-term AND long-term payouts for the same model:
after 6 days, payout 2CORR6 + xMMC
after 20 days, payout 2CORR20 + xMMC
And let the user choose if he wants to get payout after 6 days only, 20 days only, or both?
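A rough sketch of how this opt-in scheme could be computed is below. It reads "2CORR6 + xMMC" as 2 × CORR6 + x × MMC, applied per unit of stake. The function names, the horizon labels, the per-horizon MMC values, and the rule that the full stake is scored on each chosen horizon are all assumptions made for illustration; none of this is Numerai's actual payout code.

```python
def payout_factor(corr, mmc, mmc_multiplier, corr_multiplier=2.0):
    """Payout per unit of stake: corr_multiplier * CORR + mmc_multiplier * MMC."""
    return corr_multiplier * corr + mmc_multiplier * mmc


def round_payouts(stake, scores, choice, mmc_multiplier):
    """Return the payout for each horizon the user opted into.

    scores maps a horizon label ("6d" or "20d") to a (corr, mmc) pair.
    choice is "6d", "20d", or "both".
    Assumption: with "both", the full stake is scored on each horizon.
    """
    if choice not in ("6d", "20d", "both"):
        raise ValueError(f"unknown choice: {choice!r}")
    horizons = ("6d", "20d") if choice == "both" else (choice,)
    return {h: stake * payout_factor(*scores[h], mmc_multiplier) for h in horizons}


# Hypothetical scores for one round: (CORR, MMC) per horizon.
scores = {"6d": (0.01, 0.005), "20d": (0.02, 0.004)}
payouts = round_payouts(100.0, scores, "both", mmc_multiplier=1.0)
print(payouts)
```

Whether a "both" choice should split the stake or score it twice is a design decision the proposal leaves open; the sketch picks the simpler option.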
It’s exactly what I was looking for when I first signed up for Signals (I discovered Numerai by searching for something like Signals). A one-week time horizon can be too noisy (the term “noise trader” is there for some reason, right :P) and 3 or 4 weeks is way better. But I still think it’s not enough; I’m hoping to see even longer targets in the future (like a target for 2 or 6 months, maybe 1 year). Can’t wait to get back to my efforts on Signals.
I tested my measure with a 6-week lag and it was still performant. I expect a longer horizon to be much better from a volatility perspective and a payout perspective. This is a great development!
Whatever the final payout formula is, I think Signals deserves to pay a lot more than the tournament because it’s more complex in many ways. For the reasons listed below, it will attract far fewer users than the tournament, and it will mainly attract experts who expect to be eligible for high rewards.
entry cost is much higher: users cannot use a pre-built dataset and train a first model in minutes
competing is not necessarily free: users need to call some API to collect data, and these APIs are either free and slow, or faster but not free.
once data is collected, we need to spend time to clean and process it in order to build a proper training dataset.
each week users have to collect data again, to build the test dataset and its features. This takes both human and computing time, which means $ if you do it on an instance
it requires a rare mix of data-science skills, expertise, and creativity, which is not the case for the tournament (anyone can submit a model on the tournament after reading a short tutorial on random forests).
If your intent is to increase participation in Signals, more than just increasing payout, I think a better track for Numerai is to provide free historical price and fundamental data (FactSet or Morningstar). As gund pointed out, there is a cost in trying to get some meaningful models and this could be a deterrent to prospective participants.
I can agree with this because I actually started building a machine on AWS but stopped because I determined that, at least for now, it’s cost-prohibitive.
For the financial data there’s a bit of a start up headache to be sure. But after that, it doesn’t seem to be too much of a problem if you’ve settled on a model. I just download data updates from Yahoo once a week (they seem to have about 98% of the live universe tickers, plus lots of other stuff, like indexes, rates, and currencies) and models (once they are stable) should only need marginal adaptation.
Mind you I’m no expert—this is only my second round in Signals (otoh, I have been doing similar stuff since the 90s). And my models do need more work; I checked my submissions and one of the strongest short term buy recommendations was for GME.
The new scoring might be better for Numerai, but I don’t think it constitutes a higher payout, because of the longer compounding period. User orbitalTeaPot on chat came up with some early calculations:
They hint at a decrease of between 12% and 25%. I had some time to model what that would look like for my 3 main models (BLINKI, INKI and WANDER), and indeed they have a combined decrease in payouts of about 11% (staking on middle-of-the-road CORR + MMC) from round 238. My models are somewhat low quality/time invariant, so take it with a grain of salt (yet they were in the top 10 before signalsgate).
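To illustrate the compounding argument only, here is a toy comparison of restaking payouts every week versus every 4 weeks. It assumes a constant, hypothetical 1% weekly return and a 4-week payout equal to four times that return. It is not orbitalTeaPot's calculation and is not expected to reproduce the 12% to 25% range, which depends on real round-by-round scores.

```python
def growth(return_per_period, periods):
    """Growth multiple of a stake compounded once per period."""
    return (1.0 + return_per_period) ** periods


def yearly_growth(weekly_return, weeks_per_payout, weeks=52):
    """Growth over `weeks` when payouts are restaked every `weeks_per_payout` weeks.

    Assumption: a payout covering k weeks equals k times the weekly return.
    """
    periods = weeks // weeks_per_payout
    return growth(weekly_return * weeks_per_payout, periods)


weekly_return = 0.01  # hypothetical constant return per week

weekly = yearly_growth(weekly_return, 1)       # 52 compounding steps
four_weekly = yearly_growth(weekly_return, 4)  # 13 compounding steps

# Relative loss in yearly profit caused by compounding less often.
relative_drop = 1.0 - (four_weekly - 1.0) / (weekly - 1.0)

print(f"weekly compounding:      {weekly:.4f}x")
print(f"four-weekly compounding: {four_weekly:.4f}x")
print(f"relative drop in profit: {relative_drop:.2%}")
```

With positive returns, compounding more often always ends higher, since (1 + r)^4 ≥ 1 + 4r; the size of the gap grows with the return per period.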
Modifying the payout formula wouldn’t only increase the payouts, but also allow weaker signals to be submitted. In theory, using not only strong signals but also weak signals could improve the metamodel. It seems, however, if I understood correctly, that you currently have difficulty using weak signals? If that is indeed the case, modifying the payout formula may be of lesser interest.
One concern I have about modifying the horizon to 4 weeks is that you may lose part of the information that you find interesting in Signals models. Are you sure that one reason that Signals models can help to improve the Numerai Classic metamodel isn’t partly because models on both tournaments have different horizons?
We were talking about it yesterday and it seems like a very small change. In your proposal, a -0.03 CORR becoming a -0.03/(1+0.03) = -0.0291 doesn’t seem to be a large enough effect even though I agree with the argument.
That’s a good question – we like how Signals is now in terms of how good the signals are for helping the fund. And there’s no guarantee that the top Signals users won’t adjust their models in a way that does better on CORR20 but is actually worse at contributing to our live trading for some other reason. One good thing about optimizing for CORR20 is that it will tend to produce lower-churn models, but that benefit might not offset the cost. Very difficult to say ahead of time – especially with our whole analysis being done on users who don’t yet have the new longer target to train on.
Out of curiosity, why not open up a separate Signals20 competition? To me (and I am admittedly very new to Numerai), 1-week and 4-week predictions are quite different kettles of fish.
I don’t want to be the proverbial sourpuss, but this just struck me. If people believe that the models they’ve built for the 6-2 competition actually would perform better on a 22-2 basis, wouldn’t it just make more sense for them to hold off submitting a given prediction for several weeks (or alternatively not use data from within two or three weeks of the submission date)?
What I like about corr20 is that it should probably be easier: we can expect to have a better Sharpe. What I don’t like about it is that our stakes are going to be locked for a much longer time. What would be great is to have a choice between corr4 and corr20. And that could also be interesting on your side, since you would be able to see how much impact the horizon has on the usefulness of the models for live trading.
Another interesting and fun problem would be to have signals that have good results for both horizons. But that would be more complicated, so it is not really practical.
By the way, you can definitely forget about the modified payout function I was proposing, it is not immune to attacks.