Feature Selection with BorutaShap

It might be worth noting that the minimum eras can look correctly ordered:
Min era for train = 1
Min era for test = 2

But the eras will still be shuffled to something like this:
train = [1, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 17, 19, 20, 23, …, 573, 574]
test = [2, 3, 12, 13, 16, 18, 21, 22, 25, …, 564, 568]

So we’re effectively training on eras that come after some of the eras in the test set. I don’t know how big an issue this is for feature selection, but forcing a proper time-series split doesn’t seem like a bad idea here.
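
To make the idea concrete, here is a minimal sketch of an era-ordered split in plain Python. It is not @mdo's splitter and not part of BorutaShap; the function names (`chronological_era_split`, `is_time_ordered`, `row_mask`) and the `test_fraction` and `embargo` parameters are made up for illustration. It assumes eras are integers whose numeric order matches their time order.

```python
from typing import List, Sequence, Tuple


def chronological_era_split(
    eras: Sequence[int],
    test_fraction: float = 0.25,
    embargo: int = 0,
) -> Tuple[List[int], List[int]]:
    """Split unique eras so that every train era precedes every test era.

    `embargo` (hypothetical parameter) drops that many eras from the end of
    the train block to leave a gap before the test block.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    if embargo < 0:
        raise ValueError("embargo must be non-negative")

    unique = sorted(set(eras))
    n_test = max(1, round(len(unique) * test_fraction))
    cut = len(unique) - n_test

    train = unique[: max(0, cut - embargo)]
    test = unique[cut:]
    if not train:
        raise ValueError("not enough eras left for a train block")
    return train, test


def is_time_ordered(train: Sequence[int], test: Sequence[int]) -> bool:
    """True when the latest train era is earlier than the earliest test era."""
    return max(train) < min(test)


def row_mask(row_eras: Sequence[int], selected: Sequence[int]) -> List[bool]:
    """Boolean mask over rows whose era is in `selected`."""
    chosen = set(selected)
    return [era in chosen for era in row_eras]


if __name__ == "__main__":
    # Toy per-row era column: eras 1..10 with three rows each.
    row_eras = [era for era in range(1, 11) for _ in range(3)]
    train_eras, test_eras = chronological_era_split(row_eras, test_fraction=0.2, embargo=1)
    print("train eras:", train_eras)
    print("test eras:", test_eras)
    print("time ordered:", is_time_ordered(train_eras, test_eras))
```

The masks from `row_mask` can then be used to slice the feature matrix before running feature selection. `is_time_ordered` is also a quick way to confirm that a shuffled split like the one above leaks later eras into training.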

I know @mdo created a custom splitter here; maybe we can reuse it with a little tuning.
