Hello! Just making a post to go along with the OHwA interview.
First I wanted to talk about feature neutralisation. In the picture above, the X axis represents the value of a dummy feature and the Y axis represents the predicted value. The blue line represents the original unmodified model fitted to this data. You can see that the original prediction is heavily modified by feature neutralisation: it changes from a strictly monotonically increasing function to one which is decreasing for 80% of the values. To me this is something which we would not want in our modelling.
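Here is a minimal sketch of that effect, assuming feature neutralisation means subtracting the least-squares linear fit of the prediction on the feature. The `neutralise` helper and the `x ** 4` toy prediction are my own illustrative choices and are not the data behind the picture, so the exact fraction of decreasing values will differ.

```python
import numpy as np

def neutralise(pred, feature, proportion=1.0):
    """Subtract the least-squares linear fit of pred on feature (with intercept)."""
    X = np.column_stack([feature, np.ones_like(feature)])
    coef, *_ = np.linalg.lstsq(X, pred, rcond=None)
    return pred - proportion * (X @ coef)

# Toy dummy feature and a strictly increasing, non-linear prediction (assumed shapes).
x = np.linspace(0.0, 1.0, 101)
pred = x ** 4

neut = neutralise(pred, x)

# Fraction of consecutive steps where the neutralised prediction goes down.
frac_decreasing = np.mean(np.diff(neut) < 0)
print(f"original increasing everywhere: {bool(np.all(np.diff(pred) > 0))}")
print(f"fraction of steps decreasing after neutralisation: {frac_decreasing:.2f}")
```

The neutralised prediction has zero linear correlation with the feature, but only because it now falls over a large part of the feature's range before rising again.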
In the next image we see a symmetric prediction. A prediction like this has no linear correlation with the dummy feature, so feature neutralisation changes nothing about the rank of the predictions for the values of the dummy feature and does not perform any kind of risk avoidance. I see no reason why this kind of prediction would be deemed less risky than the first, and yet feature neutralisation does nothing to address it.
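The symmetric case can be sketched the same way, again assuming neutralisation is a linear least-squares fit that gets subtracted. The quadratic toy prediction and the variable names are illustrative assumptions, not the data from the image.

```python
import numpy as np

# Dummy feature centred on zero, and a prediction that is symmetric around that centre.
x = np.arange(-50, 51) / 50.0
pred = x ** 2

# Linear fit of the prediction on the feature (slope and intercept).
X = np.column_stack([x, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(X, pred, rcond=None)
slope, intercept = coef

neut = pred - X @ coef

# If the fitted slope is zero, neutralisation only subtracts a constant,
# so the ordering of the predictions cannot change.
shift = pred - neut
spread = shift.max() - shift.min()
print(f"fitted slope: {slope:.2e}, spread of the subtracted amount: {spread:.2e}")
```

Both numbers come out at floating-point noise level: the neutralised prediction is the original one shifted by a constant, with the same dependence on the feature as before.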
To address both of these concerns I propose a new method, which I will call one hot exposure clipping (not very catchy, so I’m very open to new names). The idea is to make new features which represent the non-linear predictions of particular features we deem ourselves to be overexposed to, and to neutralise based on these rather than on the original features. I’m working on a prototype for this and will explain further in a future post.
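Until the prototype is posted, here is one possible reading of the idea as a rough sketch. It is not the actual method: it assumes the feature takes a small number of discrete levels (five are used here for illustration), builds one one-hot column per level, and neutralises against those columns. The names `one_hot` and `neutralise_one_hot` are hypothetical, and the "clipping" part is not implemented; the `proportion` argument is only a crude stand-in for partial removal.

```python
import numpy as np

def one_hot(feature, levels):
    """One indicator column per discrete level of the feature."""
    return (feature[:, None] == levels[None, :]).astype(float)

def neutralise_one_hot(pred, feature, levels, proportion=1.0):
    """Subtract the fit of pred on the one-hot columns (the per-level mean of pred)."""
    H = one_hot(feature, levels)
    coef, *_ = np.linalg.lstsq(H, pred, rcond=None)
    return pred - proportion * (H @ coef)

rng = np.random.default_rng(0)
levels = np.array([0.0, 0.25, 0.5, 0.75, 1.0])  # assumed discrete levels
feature = rng.choice(levels, size=5000)
other = rng.normal(size=5000)                   # stand-in for the rest of the model

# Symmetric dependence on the feature, which linear neutralisation would miss.
pred = (feature - 0.5) ** 2 + 0.1 * other

neut = neutralise_one_hot(pred, feature, levels)

# Mean neutralised prediction at each level of the feature.
level_means = np.array([neut[feature == v].mean() for v in levels])
print(np.round(level_means, 12))
```

Under these assumptions the per-level means are all pushed to zero, so both the monotonic and the symmetric kind of dependence on the feature are removed, while the part of the prediction driven by `other` is left largely intact.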
Lastly I want to briefly mention autocorrelation, or how to predict when a feature will perform well. Below you see two histograms. The left represents the autocorrelations of 310 random number sequences. The right represents the autocorrelations of sequences derived by calculating the correlation of each of the 310 Numerai feature columns with the target for each training round. As you can see, there is not much difference. This is presented to warn against assuming you have found a particular feature to be autocorrelated just because you have a statistically significant result: across 310 features, some will pass a significance test by chance alone. I still believe there may be some way to predict when a feature is going to do well, but there is much more work to do.
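The random-sequence half of that comparison is easy to reproduce as a sketch. The sequence length of 120, the lag of 1 and the usual approximate 95% band of 1.96 / sqrt(n) are my assumptions for illustration; the histograms above may have used different settings.

```python
import numpy as np

def lag1_autocorr(series):
    """Lag-1 sample autocorrelation of a 1-D sequence."""
    centred = series - series.mean()
    return float(np.dot(centred[:-1], centred[1:]) / np.dot(centred, centred))

rng = np.random.default_rng(0)
n_sequences, n_rounds = 310, 120  # 310 as in the post; 120 rounds is an assumption

# Pure noise: there is no real autocorrelation to find here.
sequences = rng.normal(size=(n_sequences, n_rounds))
autocorrs = np.array([lag1_autocorr(s) for s in sequences])

# Approximate 95% band for the autocorrelation of white noise.
threshold = 1.96 / np.sqrt(n_rounds)
n_significant = int(np.sum(np.abs(autocorrs) > threshold))
print(f"{n_significant} of {n_sequences} noise sequences look 'significant'")
```

At a 5% level, roughly one in twenty pure-noise sequences is expected to land outside the band, so a handful of "significantly autocorrelated" features out of 310 is not evidence of anything by itself.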
Thanks for reading and happy modelling!