Model Diagnostics: Feature Exposure

wigglemuse · September 20, 2020, 11:46pm

Here’s that neutralization code for R:

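# Remove the part of scores_v that is linearly explained by the columns of
# exposures_m (scaled by proportion), then rescale to unit standard deviation.
neutralize <- function(scores_v,exposures_m,proportion=1.0) {
  scores_v <- scores_v - (proportion * (exposures_m %*% (MASS::ginv(exposures_m) %*% scores_v)))
  return( scores_v/sd(scores_v) )
}

# Rank-transform a vector to normal quantiles (ties get the average rank).
normalize_vector <- function(v) {
  qnorm( (rank(v)-0.5) / length(v) )
}

# Same transform applied to each column of a matrix.
normalize_matrix <- function(m) {
  qnorm( (Rfast::colRanks(m)-0.5) / nrow(m) )
}

# Normalize both inputs first, then neutralize.
normalize_and_neutralize <- function(scores_v,exposures_m,proportion=1.0) {
  scores_v <- normalize_vector(scores_v)
  exposures_m <- normalize_matrix(exposures_m)
  return( neutralize(scores_v,exposures_m,proportion) )
}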
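
A quick way to sanity-check the projection step used in neutralize is to confirm that the result is orthogonal to every exposure column. The sketch below is only an illustration: the data are randomly generated, and exposures_df, residual_v and the coefficients are hypothetical names and values, not anything from the tournament data. It repeats the same expression as neutralize with proportion = 1 so that it runs on its own.

```r
set.seed(1)

# Hypothetical exposures, built as a data.frame and converted to a matrix
exposures_df <- data.frame(f1 = rnorm(100), f2 = rnorm(100), f3 = rnorm(100))
exposures_m <- as.matrix(exposures_df)

# Hypothetical scores that depend linearly on the exposures plus noise
scores_v <- as.numeric(exposures_m %*% c(0.5, -0.3, 0.2) + rnorm(100))

# Same projection as neutralize() with proportion = 1
residual_v <- scores_v - exposures_m %*% (MASS::ginv(exposures_m) %*% scores_v)
residual_v <- as.numeric(residual_v / sd(residual_v))

# Dot product of each exposure column with the neutralized scores;
# these should all be zero up to floating-point error
max_dot <- max(abs(crossprod(exposures_m, residual_v)))
print(max_dot)
```

Note that this checks dot products rather than correlations: there is no intercept column here, so the correlations are only exactly zero if the inputs are centered.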

You’ll need the “Rfast” package for the colRanks function (note that other packages have a function with the same name). “MASS” should be included in any standard R installation. As I discussed with @jrb, I recommend calling “normalize_and_neutralize” rather than just “neutralize”: your results will be different (unless your data is already normalized in the same way) and probably better. The function expects a numeric vector for the scores and a matrix (not a data.frame) for the exposures.

This differs slightly from the python version given in the tips notebook. The ranking functions here use the “average” method instead of the “first” method for breaking ties, which makes more sense to me for this application (“first” essentially introduces randomness, which might help but might hurt; both ranking functions have a parameter you can set to “first” if you want that, though). [Also, don’t have ties in your predictions.] And I don’t think the python version actually normalizes the exposures, only the scores. That is fine if the exposures matrix is the raw data or is otherwise standardized/normalized, but sometimes I neutralize with respect to other types of transformations of the data, and normalizing is just safer.
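
To make the tie-breaking point concrete, here is a small base-R sketch with a made-up vector v (hypothetical values, chosen only so that two entries tie). It applies the same transform as normalize_vector under both tie methods.

```r
# Hypothetical scores with a tie at 0.2
v <- c(0.1, 0.2, 0.2, 0.4)

rank_avg   <- rank(v)                          # default ties.method = "average": 1 2.5 2.5 4
rank_first <- rank(v, ties.method = "first")   # ties broken by position: 1 2 3 4

# Same normal-quantile transform as normalize_vector
norm_avg   <- qnorm((rank_avg - 0.5) / length(v))
norm_first <- qnorm((rank_first - 0.5) / length(v))

print(norm_avg)    # tied inputs get identical outputs
print(norm_first)  # tied inputs get different outputs, depending on their order
```

With “first”, two identical predictions end up with different normalized values purely because of where they sit in the vector, which is the randomness mentioned above.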
