This is where a few improvements can get you some performance gains. We’re evaluated on correlation, so why not use correlation in your loss function? Take a look at this post for some tips:
In my experience, you still want to include MSE. But there are both theoretical and practical merits to including the thing we’re evaluated on in your loss function.
Then it becomes clear why you want your data ordered by era: correlation doesn’t make much sense unless you’re computing it within an era. (To be fair, trying correlation on random samples could be interesting, but I don’t think it has as strong a theoretical justification.)
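To make the idea concrete, here is a minimal sketch of a loss that mixes MSE with a correlation term and computes both within each era. It is not taken from the linked post. The names `era_loss` and `corr_weight` are made up for illustration, and Pearson correlation is used as a stand-in (an assumption, since the exact scoring metric may differ). In a real training loop you would write the same arithmetic with your framework's tensor operations so it stays differentiable.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mse(preds, targets):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def era_loss(preds, targets, eras, corr_weight=0.5):
    """Average over eras of (1 - w) * MSE + w * (1 - correlation).

    corr_weight (w) is a hypothetical knob: 0 gives plain MSE,
    1 gives a pure correlation loss.
    """
    # Group predictions and targets by era so correlation is per era.
    by_era = defaultdict(lambda: ([], []))
    for p, t, e in zip(preds, targets, eras):
        by_era[e][0].append(p)
        by_era[e][1].append(t)

    losses = []
    for p_e, t_e in by_era.values():
        corr_term = 1.0 - pearson(p_e, t_e)  # 0 when perfectly correlated
        losses.append((1 - corr_weight) * mse(p_e, t_e) + corr_weight * corr_term)
    return sum(losses) / len(losses)
```

Keeping some weight on MSE matches the advice above; how much is something to tune rather than a known best value.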
You can also train different models on different eras and then ensemble them.
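As a rough sketch of what that could look like: split the eras into contiguous groups, fit one model per group, and average the predictions. Everything here is illustrative. The function names and `n_groups` are made up, and a single-feature least-squares line stands in for whatever model you actually use.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return a, my - a * mx


def fit_per_era_group(xs, ys, eras, n_groups=2):
    """Split the sorted unique eras into contiguous groups; fit one model per group."""
    unique = sorted(set(eras))
    size = -(-len(unique) // n_groups)  # ceiling division
    models = []
    for start in range(0, len(unique), size):
        group = set(unique[start:start + size])
        rows = [(x, y) for x, y, e in zip(xs, ys, eras) if e in group]
        models.append(fit_line([r[0] for r in rows], [r[1] for r in rows]))
    return models


def ensemble_predict(models, xs):
    """Average the predictions of all per-group models."""
    return [sum(a * x + b for a, b in models) / len(models) for x in xs]


# Toy data: two eras with different slopes.
models = fit_per_era_group(
    xs=[0.0, 1.0, 0.0, 1.0],
    ys=[0.0, 1.0, 0.0, 3.0],
    eras=[1, 1, 2, 2],
)
preds = ensemble_predict(models, [0.0, 1.0])
```

A plain average is the simplest way to ensemble; weighting the models differently is another option you could experiment with.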
There are actually lots of things you can do with eras. Thankfully, a lot of helpful people have shared their ideas in Arbitrage’s office hours, so make sure to go watch all of them:
Arbitrage also has some good suggestions on how to use eras in his intro video series: