Is this new metric TC robust against the p / (1-p) type of vulnerability?
Leaderboard Bonus Exploit Uncovered - Tournament - Numerai Forum
It is symmetric (1-p will get exactly the opposite TC), and there is no bonus anymore, so it shouldn't be an issue.
Is there any preprocessing of the user's predictions before SWMModel in the production system? For MMC, the user's predictions are converted to a uniform distribution, but I am wondering what the behavior is for TC.
In particular, does only the rank matter, or does the magnitude matter as well for the TC calculation? This information would be useful for understanding TC's behavior.
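For reference, here is a minimal sketch of the rank-to-uniform transform that MMC applies (whether the same preprocessing happens before TC is exactly what I am asking); under a transform like this only the ordering matters:

```python
import pandas as pd

def to_uniform(preds: pd.Series) -> pd.Series:
    """Map predictions to a uniform (0, 1) distribution by rank.
    After this transform only the ordering matters; magnitudes are discarded."""
    return (preds.rank(method="first") - 0.5) / len(preds)

# Two prediction vectors with the same ranking but different magnitudes
# become identical after the transform.
a = pd.Series([0.10, 0.50, 0.90])
b = pd.Series([0.20, 0.30, 0.80])
assert to_uniform(a).equals(to_uniform(b))
```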
For those commenting that they will likely withdraw from the competition because their currently optimized model (CORR/MMC) is not suitable for TC: I think that is the intention, because those models are earning rewards without helping TC / hedge fund performance.
I am fortunately not in that group, somehow. I recently checked my model and it would have made 1% more per week if I could have staked on 2xTC instead of 2xMMC. It does, however, increase volatility, as others have mentioned, but perhaps for a different reason. I currently have a 2.1 Sharpe with CORR+2xMMC, but it would drop to 1.3 with CORR+2xTC. This is mainly due to the lack of correlation between CORR and MMC in my submissions, versus a 40% correlation between CORR and TC. Ideally, having some validation diagnostics to historically backtest TC would be helpful, if that can be provided in the diagnostics tool.
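To illustrate why that correlation matters (the numbers below are made up, not my actual per-round moments), here is a rough sketch of how the Sharpe of a combined payout stream like CORR + 2xTC degrades as the correlation between the two components rises:

```python
import numpy as np

def combined_sharpe(mu_a, sd_a, mu_b, sd_b, rho, weight=2.0):
    """Per-round Sharpe of the payout A + weight * B, given each component's
    mean, standard deviation, and their correlation rho."""
    mean = mu_a + weight * mu_b
    var = sd_a**2 + (weight * sd_b)**2 + 2 * weight * rho * sd_a * sd_b
    return mean / np.sqrt(var)

# Hypothetical moments: same means and volatilities for MMC and TC,
# only the correlation with CORR differs (0.0 vs 0.4).
print(combined_sharpe(0.02, 0.03, 0.01, 0.04, rho=0.0))  # ~CORR + 2xMMC
print(combined_sharpe(0.02, 0.03, 0.01, 0.04, rho=0.4))  # ~CORR + 2xTC, lower Sharpe
```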
Will a train_example_preds.parquet file, at some point, be provided for those interested in optimizing for Exposure Dissimilarity (mentioned in the original post from @mdo)? For those willing/wanting to build a new train/validation set by mixing eras from each, it is currently impossible to draw from an existing example_pred.
Would a relatively decent equivalent be to use the example model provided to produce predictions for the training set? Or are the example_preds in the validation file specially chosen by the Numerai team for a particular reason?
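In case it helps, here is roughly what I had in mind: fit an example-style LightGBM model era-group-wise out of fold and use the out-of-fold predictions as a stand-in for train example predictions. The file name, the model parameters, and the column names are just my assumptions about the usual dataset layout, not anything official:

```python
import pandas as pd
from lightgbm import LGBMRegressor
from sklearn.model_selection import GroupKFold

# Assumes the training data has already been downloaded, e.g. via
#   numerapi.NumerAPI().download_dataset("v4/train.parquet")
train = pd.read_parquet("v4/train.parquet")
features = [c for c in train.columns if c.startswith("feature")]

# Parameters loosely modeled on the public example script (an assumption,
# not the exact configuration used to make the official example_preds).
params = dict(n_estimators=2000, learning_rate=0.01, max_depth=5,
              num_leaves=32, colsample_bytree=0.1)

# Era-grouped out-of-fold predictions, so no row is predicted by a model
# that saw its own era during training.
oof_preds = pd.Series(index=train.index, dtype=float, name="example_like_preds")
for fit_idx, pred_idx in GroupKFold(n_splits=5).split(train, groups=train["era"]):
    model = LGBMRegressor(**params)
    model.fit(train.iloc[fit_idx][features], train.iloc[fit_idx]["target"])
    oof_preds.iloc[pred_idx] = model.predict(train.iloc[pred_idx][features])
```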
lol, well, I will stay skeptical initially and see what happens with TC rankings and daily/weekly scores. I can imagine we will see some huge swings when people start experimenting. At the moment I have a model ranked #11 for TC, yet I have absolutely no clue why this one would be so high up in the ranking.
Time to get staking! But before you do, tell me about your model…
lol, nothing fancy here. It's an ensemble of different regression algos with a small feature selection, and also only 1/4 of the v2 dataset (1 out of every 4 eras, to avoid overfitting).
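Something along these lines, if you want the gist of it (the specific regressors and the feature selection here are placeholders, not my actual setup):

```python
import pandas as pd
from sklearn.linear_model import Ridge, ElasticNet
from sklearn.ensemble import GradientBoostingRegressor

# Assumes `train` and `live` DataFrames with an "era" column, feature
# columns, and a "target" column, as in the classic dataset.
features = [c for c in train.columns if c.startswith("feature")][:100]  # small feature selection

eras = sorted(train["era"].unique())
subset = train[train["era"].isin(eras[::4])]  # keep 1 out of every 4 eras

models = [Ridge(alpha=1.0), ElasticNet(alpha=0.001), GradientBoostingRegressor()]
preds = []
for m in models:
    m.fit(subset[features], subset["target"])
    preds.append(pd.Series(m.predict(live[features]), index=live.index).rank(pct=True))

ensemble = sum(preds) / len(preds)  # simple equal-weight ensemble of ranked predictions
```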
You know, I tried something like that with ElasticNet, Ridge, Lasso, and Lars. Nil, nada, niet.
I hope you're gonna sell those predictions on numerbay, fame and fortune beckons… if you're in the market for an agent…
Well… be my guest, I would say: https://numerbay.ai/product/numerai-predictions/bigcreeper_4
Maybe this has already been answered somewhere else and I missed it, but how is it that the documentation and the diagnostics tool still mention mmc instead of tc? Is there a way to know the tc for models not submitted?
@mdo wouldn't removing the turnover constraint unfairly benefit "faster" signals that you actually wouldn't be able to trade due to the high turnover? In Signals especially it would be easy to create fast, high-turnover models. If you can't actually trade the signal (due to high turnover, which isn't constrained), then the signal wouldn't be "contributing" to the portfolio returns, right? Maybe I'm misunderstanding "turnover constraint" in this context?
Doesn't the turnover constraint just refer to limitations from the positions they are currently holding (i.e. they can't turn over their whole portfolio every time they trade), and not to some inherent "speed" of certain types of signals? (The removal of this constraint is what I was referring to in the other thread, btw, when I said TC isn't computed against the actual Numerai portfolio.) Sure, in Signals (or Classic) somebody can constantly switch their model to something else, but it has to be something else that is good (if they want to benefit from it). And consider that (at the moment, anyway) there are two different funds with two different portfolios using two different optimizers but only one metamodel.
The definition of a model "contributing" doesn't need to be strictly limited to trades the model "recommended" (via its rankings) that actually happened; it can extend to creating, via the metamodel, a varied menu of good potential trades that the (real, full) optimizer can choose from, trades that also fit the real-world turnover limitations of their current positions. (And again, now you've got two optimizers doing this for two funds.) In other words, good choices on the menu (that time shows would have worked out well) should be rewarded whether they were actually chosen or not. TC is arbitrary enough (from the user's perspective, even if it isn't really). Differences in reward/punishment for what are actually equivalent choices of trades (in terms of "surviving the optimizer" and in ultimate real-world performance), based purely on the timing of what Numerai happens to be holding this week, would introduce a truly arbitrary/random element (because it is all a black box to us) that would mess up the feedback mechanism, since "good" trades would be essentially randomly denied reward or even punished.
While this would probably yield a more stable metric, I suspect that it might be too closely related to CORR, which is what Numerai does not want. Instead, they want predictions that are still correlated with the target even after some of the entries have been filtered out. Otherwise everyone will just optimize for the easiest-to-retrieve signal that works most of the time. But what happens if exactly this signal is filtered out after all the constraints? Then you have something that is random noise, or even systematically anti-correlated with the target.
For example, suppose the optimizer decides that all rows with feature_foo_bar not equal to 0.5 cannot be traded because the risk is too high. Let's say this feature is, most of the time, the one most correlated with the target, so most users' models will like this feature very much and depend heavily on it (similar to the risky features). But now that this feature has been filtered on, the remaining predictions are probably trash.
I guess this is the main reason why FNC and exposure-related metrics are good proxies for TC.
But I agree that we should not be punished for the black box that comes after our submission. When I think about it, that list of proxy variables is probably the best set of metrics Numerai could offer to stake on, if TC is what Numerai wants. These variables basically say: if these are high, your ranking recommendations are likely to survive the constraint black box and still yield profit. This would also decouple user stakes from the noise of the market, which is something that we as users cannot do anything about and which should actually be the task of the risk optimizer.
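For anyone who hasn't seen it, this is roughly what feature neutralization (the basis of FNC-style proxies) looks like; a minimal sketch along the lines of the commonly shared community implementation, so the details (full vs. partial neutralization, per-era grouping) are my assumptions rather than Numerai's exact code:

```python
import numpy as np
import pandas as pd

def neutralize(preds: pd.Series, exposures: pd.DataFrame, proportion: float = 1.0) -> pd.Series:
    """Remove (a proportion of) the linear exposure of the predictions to the given features."""
    p = preds.values.reshape(-1, 1)
    e = exposures.values
    # least-squares projection of the predictions onto the feature space
    correction = proportion * e.dot(np.linalg.pinv(e).dot(p))
    neutral = (p - correction).ravel()
    return pd.Series(neutral / np.std(neutral), index=preds.index)

def fnc_like_score(preds: pd.Series, features: pd.DataFrame, target: pd.Series) -> float:
    """Rank the predictions, neutralize them to all features, then correlate with the target."""
    ranked = (preds.rank(method="first") - 0.5) / len(preds)
    return np.corrcoef(neutralize(ranked, features), target)[0, 1]
```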
What I was saying here was just defending what they are already doing with TC, i.e. we can already be rewarded for good choices even if they don't actually trade them. TC is based on a proxy portfolio created by running the metamodel through the optimizer, but it is not based on their actual trading portfolio, nor should it be. (For the reasons I laid out in the previous post, that would create unnecessary randomness on our end and make TC even more capricious.)
On the subject of proxy measures, there can't really be a single true proxy, because if there were, that would just be TC too. Anything correlated with TC (like FNCv3) is only correlated with a subset of the space; for instance, nobody should conclude that in order to get good TC you must also get good FNCv3 (or whatever), as that just isn't true. It just means that if you are looking for TC, high FNCv3 might be one place to find it, but certainly not the only place. As we are discussing in the other thread, you can get high TC just by being "corrective" to some bias in the metamodel, without being a good signal on your own (and without being correlated to anything in particular).
@pumplerod was there any update on this? A few months ago I tried to build them myself by training while excluding the era to predict, plus some adjacent eras to avoid leakage. Then I applied something like this: Optimizing for FNC and TB scores - #32 by olivepossum
But I didn't have successful results.
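In case anyone wants to try the same thing, this is the leave-one-era-out idea I was describing (the era parsing, purge window, and model parameters are just my choices, and it is slow since it refits once per era):

```python
import pandas as pd
from lightgbm import LGBMRegressor

# Assumes `train` has an "era" column (e.g. "era0001" or "0001"),
# feature columns, and a "target" column.
features = [c for c in train.columns if c.startswith("feature")]
era_num = train["era"].str.extract(r"(\d+)").iloc[:, 0].astype(int)

purge = 4  # adjacent eras excluded on each side to limit target-overlap leakage
oos_preds = pd.Series(index=train.index, dtype=float)
for era in sorted(era_num.unique()):
    fit_mask = (era_num - era).abs() > purge   # train on eras far from the held-out one
    pred_mask = era_num == era                 # predict only the held-out era
    model = LGBMRegressor(n_estimators=200, learning_rate=0.05)
    model.fit(train.loc[fit_mask, features], train.loc[fit_mask, "target"])
    oos_preds[pred_mask] = model.predict(train.loc[pred_mask, features])
```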
Sorry, I tried to delete my post in time because I realized what I was asking isn't actually something the API provides. I was curious about the correlation of TB_Corr and TC.
I was able to run a very limited test on one of my models for rounds 322-334 and found that, with 50% feature neutralization against the fncv3_features, my feature-neutralized TB200 corr score correlated with my TC scores at 0.7562. It's a pretty limited run, so I'm not sure how valuable that is.
@olivepossum I also used that post for some inspiration, though I'm using a TB percentage rather than a strict n (200) samples. I actually saw a reversal in my long negative slide in TC when I incorporated this in rounds 322-334, though the current live rounds have me buried and probably setting new records for how terribly a model can perform. So again, after months of feeling positive about this approach, I find myself questioning everything I thought to be true.
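For anyone curious, this is roughly what I mean by a TB-percentage score; compute it per era on predictions that have already been (partially) neutralized, e.g. with a neutralize() helper like the one sketched earlier in the thread, and then correlate the resulting per-round scores with TC from the API (the fraction and other details are my own choices):

```python
import numpy as np
import pandas as pd

def tb_corr(preds: pd.Series, target: pd.Series, tb_frac: float = 0.1) -> float:
    """Correlation with the target computed only on the top and bottom
    tb_frac of the predictions (a percentage variant of TB200)."""
    n = int(len(preds) * tb_frac)
    order = preds.rank(method="first").sort_values().index
    tb_idx = order[:n].append(order[-n:])  # bottom n and top n predictions
    return np.corrcoef(preds[tb_idx], target[tb_idx])[0, 1]
```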