Here’s a simple linear algebra trick to speed up feature neutralization (I got a 2x speedup on my machine, but this will vary depending on your hardware).
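In the feature neutralization function provided by Numerai, simply replace the line
scores -= proportion * (exposures @ (np.linalg.pinv(exposures) @ scores))
with
scores -= proportion * (exposures @ np.linalg.lstsq(exposures, scores, rcond=None)[0])
Note that np.linalg.lstsq returns a tuple (solution, residuals, rank, singular values), so the [0] is needed to pick out the solution.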
The two are equivalent, since the least squares solution to Ax = b is exactly what you get by taking the pseudo-inverse of A and matrix multiplying it by b. The pseudo-inverse route is slower and less numerically stable, because it forms the pseudo-inverse explicitly instead of solving the least squares problem directly.
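As a rough sketch of the swap, the two variants can be wrapped in small functions and compared on random data. The function names `neutralize_pinv` and `neutralize_lstsq`, the default `proportion=1.0`, and the test data are assumptions made for illustration; this is not Numerai's actual function, only the one line it is built around.

```python
import numpy as np

def neutralize_pinv(scores, exposures, proportion=1.0):
    # Original form: build the pseudo-inverse explicitly, then project.
    return scores - proportion * (exposures @ (np.linalg.pinv(exposures) @ scores))

def neutralize_lstsq(scores, exposures, proportion=1.0):
    # lstsq returns (solution, residuals, rank, singular values); keep the solution.
    coef = np.linalg.lstsq(exposures, scores, rcond=None)[0]
    return scores - proportion * (exposures @ coef)

# Hypothetical data, same shapes as the test in this post.
rng = np.random.default_rng(0)
exposures = rng.normal(size=(2000, 40))
scores = rng.normal(size=(2000, 1))

out_pinv = neutralize_pinv(scores, exposures)
out_lstsq = neutralize_lstsq(scores, exposures)

# Largest elementwise difference between the two variants.
max_diff = np.max(np.abs(out_pinv - out_lstsq))
# With proportion=1.0 the result should be orthogonal to every exposure column.
max_exposure = np.max(np.abs(exposures.T @ out_lstsq))
print(max_diff, max_exposure)
```

Both printed values should be close to zero, up to floating-point error. For rank-deficient exposures the two calls use different default singular-value cutoffs, so tiny differences there would not be surprising.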
This simple test in numpy shows that they’re equivalent:
import numpy as np

M, N = 2000, 40
A = np.random.normal(size=(M, N))
b = np.random.normal(size=(M, 1))
x_pinv = np.linalg.pinv(A) @ b
# lstsq returns (solution, residuals, rank, singular values); take the solution
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.all(np.isclose(x_pinv, x_lstsq)))
This should print True, and the mean-squared error between the two solutions is around 1e-33.
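To check the speedup on your own hardware, something like the following timing sketch might help. The helper names `solve_pinv` and `solve_lstsq`, the repeat count, and the matrix sizes are assumptions chosen for illustration. The ratio it prints will depend on your machine and BLAS/LAPACK build, so no particular number is promised.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2000, 40))
b = rng.normal(size=(2000, 1))

def solve_pinv():
    # Explicit pseudo-inverse followed by a matrix multiply.
    return np.linalg.pinv(A) @ b

def solve_lstsq():
    # Direct least-squares solve; [0] selects the solution from the returned tuple.
    return np.linalg.lstsq(A, b, rcond=None)[0]

repeats = 20  # arbitrary; raise it for more stable timings
t_pinv = timeit.timeit(solve_pinv, number=repeats) / repeats
t_lstsq = timeit.timeit(solve_lstsq, number=repeats) / repeats

print(f"pinv:  {t_pinv * 1e3:.3f} ms per call")
print(f"lstsq: {t_lstsq * 1e3:.3f} ms per call")
print(f"ratio pinv/lstsq: {t_pinv / t_lstsq:.2f}")
```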