XGBoost Parameter Tuning Using Genetic Programming

Hey guys, looking to get some feedback on my approach.

I am using TPOT, an AutoML tool, to tune the hyperparameters of an XGBoost model. It builds a population of models which then reproduce or die off, creating a new generation, and so on.
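In sketch form, wiring up such a run looks something like this (the numbers here are placeholders for illustration, not my actual settings):

from tpot import TPOTRegressor

tpot = TPOTRegressor(
    generations=10,       # rounds of evolution (placeholder value)
    population_size=50,   # candidate pipelines per generation (placeholder)
    verbosity=2,
    random_state=42,
)
# tpot.fit(X_train, y_train); the best pipeline lands in tpot.fitted_pipeline_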

Training is done on the full training set and cross-validated using @mdo’s custom TimeSeriesSplitGroups.
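I won’t repost @mdo’s implementation here; below is a minimal sketch of the idea as I understand it: an sklearn-compatible splitter that works like TimeSeriesSplit but cuts on whole eras (groups), so no era leaks across the train/validation boundary. It assumes np.unique returns the groups in chronological order.

import numpy as np

class TimeSeriesSplitGroups:
    """Sketch of a time-series splitter that keeps whole groups (eras) intact."""

    def __init__(self, n_splits=5):
        self.n_splits = n_splits

    def get_n_splits(self, X=None, y=None, groups=None):
        return self.n_splits

    def split(self, X, y=None, groups=None):
        groups = np.asarray(groups)
        unique_groups = np.unique(groups)  # assumed to sort chronologically
        n_folds = self.n_splits + 1
        test_size = len(unique_groups) // n_folds
        indices = np.arange(len(groups))
        test_starts = range(len(unique_groups) - self.n_splits * test_size,
                            len(unique_groups), test_size)
        for start in test_starts:
            # expanding train window of whole eras, next block of eras as test
            train_mask = np.isin(groups, unique_groups[:start])
            test_mask = np.isin(groups, unique_groups[start:start + test_size])
            yield indices[train_mask], indices[test_mask]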

A quick example notebook can be found here: https://www.kaggle.com/jorijnsmit/xgboost-parameter-tuning-using-genetic-programming.

My biggest concern is the ranges of the parameters I feed TPOT: am I going too broad here, or should I push them even further?
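For reference, the ranges go in via TPOT’s config_dict, roughly like this (the values below are illustrative; the actual ones are in the notebook):

import numpy as np

xgb_config = {
    'xgboost.XGBRegressor': {
        'n_estimators': [100, 200, 500],
        'max_depth': range(2, 11),
        'learning_rate': [1e-3, 1e-2, 1e-1],
        'min_child_weight': range(1, 21),
        'subsample': np.arange(0.5, 1.01, 0.1),
        'colsample_bytree': np.arange(0.5, 1.01, 0.1),
    }
}
# restricting config_dict to a single estimator makes TPOT tune only XGBoost:
# tpot = TPOTRegressor(config_dict=xgb_config, ...)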

When I run your code with verbosity=2 I get messages like

“Generation 1 - Current best internal CV score: nan”

Maybe TimeSeriesSplitGroups is not working here as expected?
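Actually, I suspect the splitter is fine: scipy’s spearmanr returns NaN whenever one of its inputs has zero rank variance (e.g. a constant prediction vector), and a single NaN fold score makes the whole CV score NaN. A quick check:

from scipy import stats

# a constant prediction has no rank variance, so Spearman is undefined:
print(stats.spearmanr([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])[0])  # nan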


Use this code for the rank_correlation function:

import numpy as np
from scipy import stats

def rank_correlation(y_true, y_pred):
    # map NaN Spearman scores (e.g. from constant predictions) to -1
    return np.nan_to_num(stats.spearmanr(y_true, y_pred, axis=1)[0], nan=-1)
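To plug it into TPOT, wrap it in an sklearn scorer (standard make_scorer usage; the scorer name is just a placeholder):

from sklearn.metrics import make_scorer

# rank_correlation follows the (y_true, y_pred) metric signature,
# and a higher Spearman correlation is better
spearman_scorer = make_scorer(rank_correlation, greater_is_better=True)
# pass it to TPOT via scoring=spearman_scorer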