I just want to clarify that I did not use any feature-neutral targets for model training. All I did was my standard training on the standard targets (the base model). After that, I applied normalize_and_neutralize from analysis_and_tips with proportion=1.0 to my predictions, grouped by era (that is what I call 100% neutralized).
To make sure I understood your question, I'm attaching the code I used to calculate the feature correlations.
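For reference, here is a minimal sketch of what that per-era step amounts to. It is not the analysis_and_tips code itself: the function name `neutralize_era`, the toy DataFrame, and the feature names are made up for illustration, and it assumes the step is "Gaussianize the predictions by rank, then subtract their least-squares projection onto the features".

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def neutralize_era(era_df, pred_col, feature_cols, proportion=1.0):
    """Gaussianize one era's predictions, then remove their linear feature exposure."""
    # Rank-transform into (0, 1), then map through the inverse normal CDF.
    ranks = (era_df[pred_col].rank(method="first") - 0.5) / len(era_df)
    scores = ranks.map(NormalDist().inv_cdf).to_numpy()
    exposures = era_df[feature_cols].to_numpy(dtype=float)
    # Subtract `proportion` of the least-squares projection onto the features.
    scores = scores - proportion * exposures @ (np.linalg.pinv(exposures) @ scores)
    # Rescale to unit standard deviation.
    return pd.Series(scores / scores.std(), index=era_df.index)


# Toy data (hypothetical): two eras, three features, predictions leaning on feature_a.
rng = np.random.default_rng(0)
feature_cols = ["feature_a", "feature_b", "feature_c"]
toy = pd.DataFrame(rng.normal(size=(200, 3)), columns=feature_cols)
toy["era"] = np.repeat(["era1", "era2"], 100)
toy["prediction_kazutsugi"] = 0.5 * toy["feature_a"] + rng.normal(size=200)

# Neutralize each era separately, then stitch the pieces back together by index.
parts = [
    neutralize_era(group, "prediction_kazutsugi", feature_cols, proportion=1.0)
    for _, group in toy.groupby("era")
]
toy["neutralized"] = pd.concat(parts)
```

With proportion=1.0 the result is orthogonal to every feature column within each era. Because this sketch fits no intercept, the Pearson correlations with features that are not mean-centered can still be slightly non-zero.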
import numpy as np
import pandas as pd

# Correlation of each feature with the base model's predictions
corr_list1 = []
for feature in feature_columns:
    corr_list1.append(np.corrcoef(df_base[feature], df_base["prediction_kazutsugi"])[0, 1])
corr_series1 = pd.Series(corr_list1, index=feature_columns)
# Print the standard deviation and the root mean square of the correlations
print('base model', np.std(corr_series1), np.sqrt(np.mean(np.power(corr_series1, 2))))

# Same calculation for the 100% neutralized predictions
corr_list2 = []
for feature in feature_columns:
    corr_list2.append(np.corrcoef(df_neutralized[feature], df_neutralized["prediction_kazutsugi"])[0, 1])
corr_series2 = pd.Series(corr_list2, index=feature_columns)
print('neutralized model', np.std(corr_series2), np.sqrt(np.mean(np.power(corr_series2, 2))))
base model 0.06892674709919062 0.07109065523194119
neutralized model 0.00032925807334664204 0.00617411917434977
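In each line of output the first number is the standard deviation of the per-feature correlations and the second is their root mean square. The two are related by RMS² = std² + mean² (with the population standard deviation), so any gap between them comes from the average correlation across features. The sketch below checks that identity on made-up data; `feature_exposure_summary` and the toy columns are hypothetical names, not part of analysis_and_tips.

```python
import numpy as np
import pandas as pd


def feature_exposure_summary(df, feature_cols, pred_col):
    """Return (std, RMS) of the per-feature correlations with the predictions."""
    # Pearson correlation of every feature column with the prediction column.
    corrs = df[feature_cols].corrwith(df[pred_col]).to_numpy()
    return corrs.std(), np.sqrt(np.mean(corrs ** 2))


# Toy data (hypothetical): predictions that lean on the first feature.
rng = np.random.default_rng(1)
cols = ["feature_a", "feature_b", "feature_c", "feature_d"]
demo = pd.DataFrame(rng.normal(size=(500, 4)), columns=cols)
demo["prediction_kazutsugi"] = 0.3 * demo["feature_a"] + rng.normal(size=500)

std, rms = feature_exposure_summary(demo, cols, "prediction_kazutsugi")
mean_corr = demo[cols].corrwith(demo["prediction_kazutsugi"]).mean()

# RMS^2 = std^2 + mean^2, so the RMS can never be smaller than the std.
print(np.isclose(rms ** 2, std ** 2 + mean_corr ** 2))
```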