Model Diagnostics: Feature Exposure

taori · October 23, 2022, 11:19am

@unsentient make sure to test your model out of sample when you use feature neutralization, because it doesn’t necessarily improve performance. In my experience, feature neutralization helps the correlation of simple models, such as the Numerai example script, but hurts more advanced models. You might not be trying to improve the correlation metrics, but either way, test whether feature neutralization helps your model on the metrics you care about. Do not blindly trust feature neutralization.
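One way to run that kind of check is sketched below. This is only an illustration under assumptions: the helper names (`neutralize`, `per_era_corr`, `compare_out_of_sample`) are hypothetical, the data is synthetic, neutralization is done per era, and plain Pearson correlation stands in for whatever metric you actually care about.

```python
import numpy as np

def neutralize(preds, features, proportion=1.0):
    # Regress preds on the features (plus a constant) and subtract the fitted part
    F = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(F, preds, rcond=None)
    return preds - proportion * (F @ coef)

def per_era_corr(preds, target, eras):
    # Pearson correlation between predictions and target, one score per era
    return np.array([np.corrcoef(preds[eras == e], target[eras == e])[0, 1]
                     for e in np.unique(eras)])

def compare_out_of_sample(preds, features, target, eras, proportion=1.0):
    # Neutralize era by era, then score raw and neutralized predictions
    neutral = np.empty_like(preds, dtype=float)
    for e in np.unique(eras):
        mask = eras == e
        neutral[mask] = neutralize(preds[mask], features[mask], proportion)
    raw = per_era_corr(preds, target, eras)
    neu = per_era_corr(neutral, target, eras)
    return {"raw_mean": raw.mean(), "raw_std": raw.std(),
            "neutral_mean": neu.mean(), "neutral_std": neu.std()}

# Synthetic stand-in for held-out validation data (not real tournament data)
rng = np.random.default_rng(0)
n = 2000
eras = np.repeat(np.arange(10), 200)
features = rng.normal(size=(n, 5))
target = 0.1 * features[:, 0] + 0.1 * features[:, 1] * features[:, 2] + rng.normal(size=n)
preds = features[:, 0] + features[:, 1] * features[:, 2] + 0.5 * rng.normal(size=n)

report = compare_out_of_sample(preds, features, target, eras)
print(report)
```

The point is only to look at both rows of numbers on data the model never saw and to decide from there, rather than assuming the neutralized version is better.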

esedx12 · June 26, 2023, 4:45pm

Thanks for this post, it was an interesting read!

At the end, you mentioned that you tried neutralizing a linear model, which you fitted with OLS.
I don’t really understand this. I’ll explain:

What a linear model does, when you apply OLS, is project the target of your training samples linearly onto the vector space spanned by your feature vectors, i.e. it finds the linear combination of the feature vectors that best approximates the target.

What the function neutralize does (as far as I can tell) is fit a linear model to the column ‘target’, subtract that model’s prediction from the column, and then normalize the result.

However, if the column ‘target’ was already obtained from a linear model fit on the same feature vectors, then ‘target’ is already a linear combination of the feature vectors. Thus, subtracting the predictions of the linear model should theoretically result in a zero vector.

…or did I miss something? I’d appreciate any feedback
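The zero-vector argument can be checked numerically. The snippet below is a minimal sketch on synthetic data, assuming OLS without an intercept in both steps. It skips the final normalization, since that would divide by a standard deviation that is essentially zero here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_features = 500, 10
X = rng.normal(size=(n_rows, n_features))                        # synthetic features
y = X @ rng.normal(size=n_features) + rng.normal(size=n_rows)    # synthetic target

# OLS fit: the predictions are the projection of y onto the column space of X
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
preds = X @ beta

# Neutralization step: regress the predictions on X and subtract the fitted part
gamma, *_ = np.linalg.lstsq(X, preds, rcond=None)
residual = preds - X @ gamma

# preds already lies in the span of X, so only floating-point noise is left
max_abs_residual = float(np.abs(residual).max())
print(max_abs_residual)
```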

wigglemuse · June 26, 2023, 5:18pm

Neutralization is applied to the features->predictions relationship, not to the original training target we are predicting. (So we can still neutralize live predictions without the target.) We fit a linear model that uses the features to predict the predictions our trained model is spitting out, and then subtract that, i.e. we remove the linear relationships between the features and our predictions. That is what we call “feature exposure” around here: the correlations between the features and our trained predictions. So a fully neutralized set of predictions removes the portion of the original predictions that can be generated by a straight linear model (and if you do it at 100%, the result is predictions with zero correlation to any of the features).

So it still doesn’t make sense to neutralize a purely linear model that uses the original features for essentially the same reason – you’d just be zeroing it out. But the training target (which of course is not available for live eras) is not needed for that step.
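To make the description above concrete, here is a rough sketch, not the tournament's actual implementation. The function names and the `proportion` parameter are hypothetical, the data is synthetic, and a constant column is added to the regression as an assumption so that the residual ends up with exactly zero correlation to each feature.

```python
import numpy as np

def neutralize(preds, features, proportion=1.0):
    """Subtract `proportion` of the part of preds that the features explain linearly."""
    # Constant column is an assumption: it makes the residual mean-zero as well
    F = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(F, preds, rcond=None)
    neutral = preds - proportion * (F @ coef)
    return neutral / neutral.std()

def feature_exposure(preds, features):
    """Correlation of the predictions with each feature column."""
    return np.array([np.corrcoef(features[:, j], preds)[0, 1]
                     for j in range(features.shape[1])])

rng = np.random.default_rng(42)
features = rng.normal(size=(1000, 5))
# Stand-in for a non-linear model's predictions: linear part + interaction + noise
preds = (features[:, 0] + 0.5 * features[:, 1]
         + features[:, 2] * features[:, 3] + 0.1 * rng.normal(size=1000))

before = feature_exposure(preds, features)
after = feature_exposure(neutralize(preds, features), features)
print(np.abs(before).max(), np.abs(after).max())
```

No target appears anywhere in this sketch, which is why the same step can be run on live predictions. With `proportion` below 1.0 only part of the linear component is removed, so some exposure remains.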

reeboo · September 16, 2023, 3:45am
