Right. So I understand we don’t want to be suckered by the easy lure of superficial linear correlations that may be fleeting. But do we really have hard evidence that deeper features are necessarily more robust? After all, by the very fact that they are more complex (at least given the dataset and features we have), involving interactions between features, couldn’t they be quite brittle if those interactions don’t hold up era to era?
Undoubtedly some of the linear features that do great in specific eras are just lucky/random and should not be relied upon (I think that’s been shown both by you and by me in my own testing), and it makes sense to remove the gravitational pull of those when building models (which I already do in my methods, although I don’t totally eliminate linearity). But as most of us know, modern techniques beyond simple parametric models will find a “signal” in anything, even if it isn’t there, linear or not. So while I think this will be very interesting and useful, getting rid of linearity in and of itself will not prevent overfitting or reliance on fleeting/fickle factors; it may even cause more of it. I bet there are going to be some new gotchas there.
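To make the "find a signal in anything" point concrete, here's a minimal sketch on purely synthetic data. A toy 1-nearest-neighbour regressor (standing in for any flexible nonparametric model; the data and model here are my own illustration, not anything from the dataset under discussion) fits pure noise perfectly in-sample while predicting nothing out-of-sample:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pure noise: features and target are independent random draws,
# so there is no real signal for any model to find.
n_train, n_test, d = 400, 400, 20
X_train = rng.normal(size=(n_train, d))
y_train = rng.normal(size=n_train)
X_test = rng.normal(size=(n_test, d))
y_test = rng.normal(size=n_test)

def knn_predict(X_fit, y_fit, X, k=1):
    """Tiny k-nearest-neighbour regressor: a stand-in for any
    highly flexible, nonlinear model that can memorize its data."""
    d2 = ((X[:, None, :] - X_fit[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(d2, axis=1)[:, :k]
    return y_fit[idx].mean(axis=1)

def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

# With k=1, each training point's nearest neighbour is itself,
# so the model "discovers" a perfect in-sample signal in noise.
in_sample = corr(knn_predict(X_train, y_train, X_train), y_train)
out_sample = corr(knn_predict(X_train, y_train, X_test), y_test)
print(f"in-sample corr:     {in_sample:.2f}")
print(f"out-of-sample corr: {out_sample:.2f}")
```

Nothing about the "signal" here is linear, yet it evaporates completely out-of-sample, which is exactly why dropping linearity alone doesn't protect against fitting to noise.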
(None of this is meant as criticism, just musing.)