No, it is not. I clearly said that I mean: past performance does not indicate future performance. Totally different statement.
Example: you see a model with really good historical performance. You buy it on NumerBay and stake on it. The model then burns 50% of your stake, because:
The author decided to submit something else after you staked (or has a bug in their pipeline)
The Numerai fund raised AUM significantly and can no longer trade small caps, but the model you bought was trained on an old target and is not optimized for large caps
…
The point is: you base your stake on something that already happened in history, but your profits are based purely on what happens in the future, and you are missing the extra context needed. The historical performance was driven by factors that simply did not persist, but you don't know that. That's why historical performance alone is never enough.
But that actually proves my point. My argument was that selecting models on LB rank alone does not work, and any success with it is pure luck.
If that had ever worked, Numerai would not have needed to introduce staking in the first place. The tournament worked that way until June 2017, and it apparently did not work.
It is not really my problem how Numerai selects their MM weights. If they are unhappy with the linear system, they can use whatever works best for them.
You don't know; maybe without the big stakers the fund's performance would be much worse. Everyone talks about big stakers, but nobody here has shown that average stakers delivered anything much better. Considering how badly LightGBM models did in the drawdown, I would not expect anything great.
Rather than relying on a model's past performance, I would say users should submit predictions for the current round plus some other eras (synthetic eras, or historical eras with different obfuscation settings so that past eras cannot be recognized). These additional eras serve to build confidence in the model over different periods of time. Say these additional eras are 10 per round; then over a period of 4 weeks (a full round) Numerai can collect model performance over 220 eras (20 rounds + 10 test eras * 20 rounds). This would certainly improve confidence in models.
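The era arithmetic above can be made explicit (the round count assumes roughly one month of daily rounds, as in the post):

```python
rounds = 20                  # ~4 weeks of daily tournament rounds
extra_eras_per_round = 10    # proposed synthetic/obfuscated test eras per round

live_eras = rounds                         # the 20 normally scored eras
test_eras = extra_eras_per_round * rounds  # 200 additional test eras
total_eras = live_eras + test_eras         # 220 eras of evidence per month
```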
That would be the case only for a fixed hold-out era set, correct? So my original idea would still work, and I still believe it would make any type of score more meaningful and less noisy. The only problem I see in my approach is the increase in computation: if Numerai asks for X more era submissions, the computational requirements increase X times, and given the limited amount of time we have for the daily tournament, that might be a problem.
On a different note, I believe the stake-weighted portfolio concept has to go away (as many of you have already said). It was an interesting idea, but it bundles two concepts that have nothing to do with each other: the optimal model weight (optimal from the hedge fund's point of view) and the user's investment capacity. To be honest, the whole Numeraire thing has to go away. It's an additional layer that users don't need/want. I hope Numerai can find a way to make the best use of the model predictions so that the hedge fund does well and we can eventually get paid in fiat.
I'm worried more about total signal weight / MM control than staking (or avoiding drawdowns) per se, i.e. I'm in the camp that thinks a tiny number of signals shouldn't essentially control the fund. Why have all these thousands of models when only 10 of them really count?
I think it could perhaps work with synthetic eras, but hardly with historical eras. Where would you get so many historical eras that never wear out, considering the whole history is now public?
About NMR: the fund needs it, and I want it. What are users going to stake … fiat? Also, the whole point of NMR is that payouts don't cost them anything. If they had to pay with fiat, the earnings would be negligible…
Fair enough, but nobody forces Numerai to calculate the weights the way they do now (weight == stake). They can use something else (like log(stake)) if this does not work for them. It is an internal thing.
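To illustrate why a log transform would change the picture (the stake amounts here are made up, and log weighting is just one possible alternative, not Numerai's actual scheme):

```python
import math

stakes = [10_000.0, 1_000.0, 100.0, 10.0]  # hypothetical NMR stakes

# Linear (current) weighting: the whale gets ~90% of the meta model.
total = sum(stakes)
linear_w = [s / total for s in stakes]

# Log-compressed weighting: the whale still leads, but far less extremely.
log_stakes = [math.log1p(s) for s in stakes]
log_total = sum(log_stakes)
log_w = [s / log_total for s in log_stakes]
```

Under linear weights the top staker holds about 0.90 of the weight; under log weights it drops to roughly 0.40, so smaller signals still matter.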
Also, the concentration of MM weights can be caused by NumerBay as well (too many participants staking on the same model)…
Yes, exactly. SOMETHING should be done about too much top-heaviness; it doesn't have to be staking restrictions, although I'd listen to such arguments as one part of a solution. Should there really be no upper limit when the average stake is orders of magnitude less? Earlier in the tournament we had a single guy (who worked for Numerai and designed the scoring!) dominating the staking in a huge way. Giant whales dominating (and reducing the payout factor) negatively impacts everybody else's participation, and that should be recognized. But it is a thorny problem with no perfect solutions. I'd like to keep the crowd in crowd-sourcing, though. (And I tend to think any schemes based on historical performance will be more problematic than beneficial, which I think we agree on.)
Another thought: we already have de facto staking restrictions via the payout factor (they take the form of earning restrictions on your stake… it's similar, anyway). Still, one mega whale can eat the whole pie if they choose to. I keep coming back to the thought of tying the payout factor to CWMM for everybody; then whales would be disincentivized from becoming the MM. They could still stake a ton, but not all on the same/similar signal. And those with high-performing but low-CWMM models could earn at better rates. Doesn't that sound about right? Or is it attackable?
I think the current stake burning and TC are theoretically enough to address most problems, such as big whales and badly performing models, through auto-correction (i.e. burning). But the problem is that the burning is too slow for participants who choose a low TC multiplier. From what I have seen, most big stakers' TC multiplier is small. So the auto-correction is slow, or even nonexistent if TC is set to 0 (during periods where TC is negative but CORR is good).
So I think the stake-weighted ensemble is not ideal if people can choose different multipliers. Maybe the ensemble weights should be based on something like an accumulated "virtual stake" calculated with a fixed formula such as corr + 3*tc. The optimal multipliers should be researched and determined by Numerai for better auto-correction speed; higher multipliers should auto-correct faster but maybe with higher churn, so more research is needed (it could even be a moving average of some sort).
Also, I think the "virtual stake" can be used as an actual staking limit factor, as it is somehow related to the accumulated actual TC of the MM. So, if you have a very high virtual stake, then naturally you can stake more.
Here is one example of how to implement a "virtual stake" (VS) system:
We first initialize the VS to be the current stake
Then we calculate the virtual payout (VP) each round normally using a*Corr + b*TC, where a and b are determined by Numerai
A participant's stake level cannot exceed the accumulated "virtual stake" (AVS), or avs_factor * AVS, where avs_factor is set by Numerai
Now at least you will not be burnt by TC if you choose not to be, but your round-to-round stake limit will change based on your AVS. Stake that exceeds AVS will be returned to your wallet, while stake that does not exceed AVS will be compounded as usual.
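The steps above can be sketched in a few lines. This is only an illustration of the proposal: the function names, the multipliers a=1, b=3, and the example scores are all my assumptions, not anything Numerai has specified.

```python
def update_virtual_stake(avs, corr, tc, a=1.0, b=3.0):
    """Compound the accumulated virtual stake (AVS) by one round's
    virtual payout, computed with the fixed formula a*Corr + b*TC."""
    virtual_payout = avs * (a * corr + b * tc)
    return avs + virtual_payout

def allowed_stake(requested_stake, avs, avs_factor=1.0):
    """Cap the actual stake at avs_factor * AVS; any excess is
    returned to the wallet instead of being staked."""
    cap = avs_factor * avs
    staked = min(requested_stake, cap)
    returned = requested_stake - staked
    return staked, returned

# Example: initialize VS to the current stake, then apply two rounds.
avs = 100.0
avs = update_virtual_stake(avs, corr=0.02, tc=0.01)    # good round: AVS grows
avs = update_virtual_stake(avs, corr=-0.03, tc=-0.02)  # bad round: AVS shrinks
staked, returned = allowed_stake(150.0, avs)           # excess goes back to wallet
```

The key property is that AVS always moves with the fixed a*Corr + b*TC formula, so a participant who sets their real TC multiplier to 0 still sees their staking capacity shrink when TC is negative.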
If they are good scientists with good models, then the "virtual stake" should be basically unlimited for them. I think the main problem here is that they can choose to use just 2xCorr + 0xTC; then they are not burnt properly based on their contribution (here I assume TC actually measures their contribution correctly, which I think it does to some extent).
Many ideas have emerged; the following resonate with me most:
Moving from linear stake MM weights to weights using factors such as CWMM, MCWNM, and the submitter's selected payout multiplier(s). This would help keep multiple voices in the conversation and avoid "whale emphasis". An MCWNM value of 1.0 signals that this is a voice that has already been heard, and that someone is just repeating someone else's (or their own) predictions.
Having a performance-based payout multiplier that starts at 1 (i.e., no change to payouts) for new users but can go up or down based on daily returns. With consistent positive returns it crawls upward, and vice versa for burns. No limit on upward growth. This incentivizes people to find net-positive models and run with them; they could reach a 2.0 multiplier rather quickly. Note this multiplier could also be used in calculating MM stake-weighting contributions, as it is a measure of historical confidence.
Awards/multipliers that incentivize low CWMM and MCWNM (and CWEP, corr with example preds), but only when there is high/positive CORR as well.
Monthly/annual NMR bonuses for steady performance.
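The performance-based multiplier idea above could be sketched as a simple update rule. The sensitivity constant and the sequence of daily returns are purely illustrative assumptions; the only properties taken from the proposal are the 1.0 starting point, unlimited upward growth, and movement in both directions:

```python
def update_multiplier(multiplier, daily_return, sensitivity=2.0, floor=0.0):
    """Nudge a per-user payout multiplier up on positive daily returns
    and down on burns; unbounded above, floored at `floor`."""
    return max(floor, multiplier * (1.0 + sensitivity * daily_return))

m = 1.0  # new users start at 1 (no change to payouts)
for r in [0.01, 0.02, -0.005, 0.015]:  # hypothetical daily returns
    m = update_multiplier(m, r)
# mostly positive returns, so m has crawled above 1.0
```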
I have realized that there is something I still don't understand. I have my own ideas, but I would like to ask whether anybody has a better understanding of Numerai's point of view.
The bad performance of the hedge fund stems largely from the large stakes on models that have performed badly lately, despite the tournament having many models with good performance over the same period (but with smaller stakes).
However, Numerai still likes the idea of the Stake Weighted Meta Model, so they are keeping this approach, and instead they have temporarily changed the payout scheme from a maximum of 1xCorr + 3xTC to 2xCorr + 1xTC.
How would that solve the problem? Didn't the large-stake models perform equally badly on both CORR and TC?
Since they are putting more weight on CORR instead of TC, does that mean TC is not useful for the hedge fund after all?