Numerai Self-Supervised Learning & Data Augmentation Projects

Which is why releasing the targets from concluded rounds is the single biggest improvement they could have made to the dataset, and ultimately to scores.

2 Likes

Excuse me, when and how have they done this?

1 Like

“Could have made”… I’ve been arguing for it for years!

2 Likes

Yeah, because if you look at the t-SNE image, eras 5 steps apart aren’t too far from each other.
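
(For anyone who wants to reproduce that kind of picture, here’s a minimal sketch – assuming a Numerai-style parquet with an “era” column and “feature_”-prefixed columns; the file path, the per-era mean summary, and the t-SNE settings are just placeholder assumptions.)

```python
import pandas as pd
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

df = pd.read_parquet("train.parquet")  # hypothetical path/filename
feature_cols = [c for c in df.columns if c.startswith("feature_")]

# One point per era: the mean feature vector of that era
era_means = df.groupby("era")[feature_cols].mean()

# Embed the eras in 2D; perplexity kept small since there are only a few hundred eras
emb = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(era_means.values)

plt.scatter(emb[:, 0], emb[:, 1], c=list(range(len(era_means))), cmap="viridis")
plt.colorbar(label="era index (old -> new)")
plt.title("t-SNE of per-era mean feature vectors")
plt.show()
```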

1 Like

Well, maybe, maybe not. We’re just looking at the feature data here, I think, not the relationship of the feature data to the targets. (Let’s see that visualization for the regression weights of models trained on single eras – does it look so neat?) If you remove the overlaps between eras so they are at least 4 weeks apart (i.e. like the legacy dataset), and you train on era 120 to predict era 121, train on era 122 to predict era 123, and so on, do you do better than just training on eras 1-120? (I do know that if you look at present eras, try to find similar eras from the distant past, and weight those more heavily, it doesn’t seem to help.)
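
Here is roughly the experiment I mean, as a sketch – the column names (“era”, “target”, “feature_”), the Ridge model, and the 4-era gap are placeholder assumptions for illustration, not anyone’s actual setup:

```python
# Compare "train on the single most recent non-overlapping era" against
# "train on the full history up to that point", scored per test era.
# Assumes era labels sort chronologically (e.g. zero-padded).
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge

df = pd.read_parquet("train.parquet")  # hypothetical path
feature_cols = [c for c in df.columns if c.startswith("feature_")]
eras = sorted(df["era"].unique())

gap = 4  # keep train/test eras 4 apart so their ~1-month targets don't overlap
single_scores, history_scores = [], []
for i in range(30, len(eras)):  # start late enough to have some history
    test = df[df["era"] == eras[i]]
    single = df[df["era"] == eras[i - gap]]            # just one recent era
    history = df[df["era"].isin(eras[: i - gap + 1])]  # everything up to that era
    for train, out in [(single, single_scores), (history, history_scores)]:
        preds = Ridge(alpha=1.0).fit(train[feature_cols], train["target"]).predict(test[feature_cols])
        out.append(np.corrcoef(preds, test["target"])[0, 1])

print("train on one recent era:", np.mean(single_scores))
print("train on full history  :", np.mean(history_scores))
```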

But anyway, you probably do perform better weighting recent data heavily at least some of the time, maybe even most of the time. Therein lies the danger. Because then you will also have some catastrophic failures when things change, because you are so overfit to recency. So it would be a high-risk, high-variance strategy, and I think that is historically one reason they’ve wanted to hold back that recent data. (They used to say just that.) Of course, you can say that no, no, you’ll do it properly and with balance, etc., and I don’t doubt it, but that temptation to roll the dice on recency (especially when recent rounds have been getting big scores) will exist for everybody (not just sober-minded and uber-disciplined folks), and it is a real risk to the metamodel as a whole. (At least under corr scoring, not sure about TC.) And it is hard to see how this will not lead to increased variance of the metamodel, right? Anybody clamoring for an unending stream of the latest resolved targets can only intend to be using the latest resolved targets as they come in, right? i.e., they are planning on updating their models continuously (or fairly often) with the latest data, and this can only lead to weighting recency more and more – any time a trend develops it will prod those people to weight recency even more, because that’s what’s winning right now and they will feel they are missing out. (Isn’t this how bubbles occur?)
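
(To be concrete about what “weighting recent data heavily” usually means in practice – a sketch of era-decayed sample weights, not a recommendation; the column names, the LightGBM parameters, and the half-life are all placeholder assumptions.)

```python
# Exponentially down-weight older eras via per-row sample weights.
# Assumes "era", "target", and "feature_"-prefixed columns, eras sorting chronologically.
import pandas as pd
import lightgbm as lgb

df = pd.read_parquet("train.parquet")  # hypothetical path
feature_cols = [c for c in df.columns if c.startswith("feature_")]

eras = sorted(df["era"].unique())
era_rank = {e: i for i, e in enumerate(eras)}   # 0 = oldest era
half_life = 52                                  # eras until a row's weight halves (a knob, not a recommendation)

age = df["era"].map(lambda e: len(eras) - 1 - era_rank[e])  # 0 = newest era
weights = 0.5 ** (age / half_life)                          # exponential decay toward the past

model = lgb.LGBMRegressor(n_estimators=2000, learning_rate=0.01,
                          max_depth=5, colsample_bytree=0.1)
model.fit(df[feature_cols], df["target"], sample_weight=weights)
```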

Well, you can see where I’m going with this. I’m not saying they shouldn’t release this data, or that it will not be a net positive, but the idea that it is a slam-dunk, obviously 100% correct thing to do that will most definitely be awesome is not supportable imo. And it could easily look great for a while…until it doesn’t. If you are a young market-neutral fund doing well, and then doing even better, but then you suddenly crash when things change…well, that’s that – who’s going to trust you now? Remember, the entire point of a fund of this type is to be invariant to market conditions and trends as much as possible – it is supposed to just chug along getting decent positive returns in all markets – and the more that recency controls the character of the models that make the predictions, the more it is setting itself up to be catastrophically wrong when things suddenly change and all those (now) less-weighted lessons from the past are not saving us anymore. So there is certainly upside to more data, but with continuous availability of the most recent data let’s not pretend there is no potential downside.

But in any case, it sounds like within a month we’ll begin to find out what happens…

4 Likes

We’ve shared this difference of opinion before in RC, and the “so overfit to recency” assumption implies our models degenerating into a simple martingale, which has been shown to be a terrible practice in the finance literature. I don’t think anyone here subscribes to the EMH, otherwise they wouldn’t bother participating!

1 Like

This seems to go against the general ethos of crowdsourcing predictions, which would be closer to “the swarm knows best” and further from “the swarm cannot be trusted and must be limited for its own good”. Not releasing the data is a huge design decision that is being made unilaterally, without any input from “the swarm”, which, if recent Numerai results are anything to go by, does know best when it comes to data science.

3 Likes

Which is exactly why in the above I put the bit about how you can argue that YOU would do it properly, and I don’t doubt you. YOU are not everybody else. You are also making the assumption that there are no degenerate gamblers (as we responsible gamblers like to affectionately call them) in the crowd here, or that we wouldn’t attract any. (We have even seen new people show up asking for the recent data to train on, please, because obviously that’s the most important thing.) And I know from many years’ experience that there are gamblers everywhere just waiting to pounce on anything that remotely seems like an easy score. (Like when the example predictions do well, you’ll find more and more people just submitting the examples.) There will probably be people training on the most recent resolved era only, and then putting those preds up for sale on NumerBay as soon as they hit a lucky streak where recency reigns, because sometimes it will. So there is no question that recency will become more of a factor, which is probably good. It creates a temptation to overweight it though, because it is there and obvious and will do really well at times. People succumb to temptations – that’s the way people are. So basically, I really can’t see people on average landing on just the right balance; that just doesn’t pass the smell test. But right now we are overweighted to the past (since we have no choice). Moving the overweight to the recent past may or may not be better, and could well be worse. We’ll see…

1 Like

Yes, I agree actually. But is it like making weed legal or making heroin legal? You can compellingly argue personal freedom in both cases, but the second one still gives me pause as to whether that is really gonna be better or not…

1 Like

Emphasis on “swarm” here: individuals will always be found pushing their luck and blowing up, but now that reasonable capital is present and two-bit players have a limited impact on the meta-model, the group as a whole may be savvier/more risk-averse. Also, the opposite risk is definitely present: we cannot currently adapt to changing market conditions, which may blow up the fund.

1 Like

Yeah, I don’t think we disagree, but we seem to have a different perspective.

1 Like

I’ll be getting nitpicky here as I broadly agree, but I think the nuance is that individuals tend to overweight short-term reward and underweight long-term risk, i.e., we may see an uptick in heroin addiction in which the individual addicts are net losers. But, philosophically at least, I would consider something like Numerai to be more akin to a market where long-term risk/reward are more accurately weighted, even if the individuals making up Numerai may blow up on risky setups.

The idea that people think they need to deftly and nimbly move their models to some new space to rapidly adapt to current conditions is exactly what worries me.

One of my favorite thinkers in the geopolitical space (Walter Russell Mead) uses an analogy about the stability of countries/societies that compares clipper ships to rafts. Clipper ships are fast and can get where they need to go quickly, they can change course at will, etc. They also tend to capsize in bad storms, crash on rocks, get themselves in bad situations due to arrogant captainship, etc. Whereas a raft is nearly unsinkable, but the price is it is slow, you can’t maneuver much, and your feet are always in the water. Sitting on the raft, the clipper ship seems much nicer…until it doesn’t. Bet on the raft to survive longer.

1 Like

That is probably the correct course for 9 years and 11 months out of a decade. I think we just have different weights internally associated with different values (market solution, stability, long term vs. short term, etc…); there’s no arguing past that if your position is coherent with regard to what you care about, and so is mine.

Getting back to nuts and bolts – somebody show me that it is true. Make me the graph that shows unequivocally that the most recent market data predicts soon-to-come market conditions better overall than not having that data. (But not just in the “more data = better models” sense – show me it HAS to be the most recent data.)
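
Something like the following sketch would make that graph – the same kind of rolling comparison as the earlier one, but here the question is whether the latest k resolved eras specifically add anything over a slightly stale cutoff. Again, the column names, the Ridge model, the gap, and k are placeholder assumptions:

```python
# For each evaluation era, compare a model trained on history INCLUDING the
# latest k resolved eras against one with those k eras removed, all else equal.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge

df = pd.read_parquet("train.parquet")  # hypothetical path
feature_cols = [c for c in df.columns if c.startswith("feature_")]
eras = sorted(df["era"].unique())

gap, k = 4, 12  # gap so targets don't overlap; k = how many "latest" eras are under debate
with_recent, without_recent = [], []
for i in range(60, len(eras)):
    test = df[df["era"] == eras[i]]
    full = df[df["era"].isin(eras[: i - gap + 1])]       # history including the latest resolved eras
    stale = df[df["era"].isin(eras[: i - gap + 1 - k])]  # same history with the latest k removed
    for train, out in [(full, with_recent), (stale, without_recent)]:
        preds = Ridge(alpha=1.0).fit(train[feature_cols], train["target"]).predict(test[feature_cols])
        out.append(np.corrcoef(preds, test["target"])[0, 1])

plt.plot(np.cumsum(np.array(with_recent) - np.array(without_recent)))
plt.xlabel("evaluation era")
plt.ylabel("cumulative corr advantage of including the latest eras")
plt.show()
```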

Well, first, in general “more data = better models” – that’s why I want more data, because we have so little to begin with. But for fast adaptation, we want the swarm to figure that part out, if there is anything to even figure out. The big issue that I’m seeing is that even if the scientific community has not found anything, we probably have a larger community and more “group intelligence” for this specific problem than any before; it may be that we need a group intelligence of a certain level to even be able to entertain certain solutions.

Well, we are getting a lot more data in any case (the whole test set). But some seem intent that we MUST have the latest data, and it MUST keep coming every week, or else how can we possibly be expected to make a decent model? Besides the fact that I think we’ve already proven you can make decent models without it, the idea that SOMETHING CHANGED LAST MONTH and so we must immediately change our models (basically so that they would have done better last month) is pretty much exactly the wrong thing to do, but I think people will be doing it, to the overall detriment of the metamodel. I know they will – I’ve seen it a million times. It doesn’t work like that.

1 Like

These are regressions of the data to the targets on single eras, like you said.

Ooh, I missed the 3d version before.

1 Like

Ok, on the version that moves I can see that there are just isolated clusters that move along in time for a while but then just stop, not connected to anything else. (If you just do the feature data, I think it will be one unbroken string, more or less.)
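
(For reference, a picture like that can be produced with something like the sketch below – one linear fit per era, coefficient vectors projected to 3D and connected in time order, so clusters that “move along for a while and then stop” show up as broken strands. The column names, the Ridge fit, and the PCA projection are assumptions; the original may well have used t-SNE instead.)

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge
from sklearn.decomposition import PCA

df = pd.read_parquet("train.parquet")  # hypothetical path
feature_cols = [c for c in df.columns if c.startswith("feature_")]
eras = sorted(df["era"].unique())

# One linear model per era; its coefficient vector is that era's "regression weights"
coefs = []
for e in eras:
    part = df[df["era"] == e]
    coefs.append(Ridge(alpha=1.0).fit(part[feature_cols], part["target"]).coef_)
coefs = np.array(coefs)

# Project the per-era coefficient vectors to 3D and thread them through time
xyz = PCA(n_components=3).fit_transform(coefs)
ax = plt.figure().add_subplot(projection="3d")
ax.plot(xyz[:, 0], xyz[:, 1], xyz[:, 2], alpha=0.3)  # line connecting consecutive eras
ax.scatter(xyz[:, 0], xyz[:, 1], xyz[:, 2], c=list(range(len(eras))), cmap="viridis")
ax.set_title("Per-era regression coefficients, projected to 3D")
plt.show()
```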

And that’s another aspect of it – the kinds of models will get dumber, because linear models will work better on recent->live until suddenly they don’t, but by then the coefficients will be so specific to the wrong regime that the crash will be worse. Even if you personally are not reckless in these ways, it won’t matter if the fund goes down and your NMR is now worthless. Don’t think that just because you are personally smarter, an overall dumbing down of the metamodel isn’t risky for you. (Or, in the worst-case scenario – unlikely but not impossible – a catastrophic crash that causes investors to flee.) TC may make all of this moot by simply making it unprofitable to be too much like everybody else or too linear (or at all linear).

1 Like