Training on era groups and ensemble models

I’ve been doing some experiments with the idea of training on era groups and ensemble models.

Taking into account that the training dataset has 120 eras (so, 120 sequential months, as far as I understand), I join consecutive eras into groups of 12 and train on each group together.
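Something like this for the grouping step (the `era` column format and the file name are assumptions about the dataset layout):

```python
import pandas as pd

# Assumed: training data with an "era" column like "era1" ... "era120".
train = pd.read_csv("numerai_training_data.csv")

# Map each era to a group index: eras 1-12 -> group 0, 13-24 -> group 1, etc.
era_num = train["era"].str.replace("era", "").astype(int)
train["era_group"] = (era_num - 1) // 12

# One DataFrame of 12 consecutive eras per group, in chronological order.
groups = [df for _, df in train.groupby("era_group")]
```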

An ensemble model is trained for each group of eras, and performance is evaluated by predicting on the validation data with each of those models. Finally, the per-group predictions are averaged.

Each ensemble model is composed of an XGBRegressor, a CatBoostRegressor and an LGBMRegressor.
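In code, the per-group training and averaging looks roughly like this (the feature/target column names and the validation file are assumptions, and I show default hyperparameters for brevity):

```python
import numpy as np
import pandas as pd
from xgboost import XGBRegressor
from catboost import CatBoostRegressor
from lightgbm import LGBMRegressor

# Assumed: validation data loaded the same way as the training data above.
validation = pd.read_csv("numerai_validation_data.csv")

feature_cols = [c for c in train.columns if c.startswith("feature")]
target_col = "target"  # assumed target column name

group_preds = []
for group_df in groups:  # "groups" from the era-grouping sketch above
    X, y = group_df[feature_cols], group_df[target_col]
    # One ensemble = the three gradient-boosting regressors.
    models = [XGBRegressor(), CatBoostRegressor(verbose=0), LGBMRegressor()]
    for m in models:
        m.fit(X, y)
    # Average the three models' predictions on the validation data.
    group_preds.append(
        np.mean([m.predict(validation[feature_cols]) for m in models], axis=0)
    )

# Final prediction: average over the per-group ensembles.
final_pred = np.mean(group_preds, axis=0)
```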

The models within each ensemble are optimized separately, using 3-split k-fold cross-validation without shuffling.
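For the tuning step, a minimal sketch with scikit-learn (the search space is just a placeholder, and `X`, `y` are one group's data from the loop above; the same pattern applies to the other two regressors):

```python
from sklearn.model_selection import KFold, GridSearchCV
from xgboost import XGBRegressor

# 3 splits, no shuffling, so folds respect the time order of the eras.
cv = KFold(n_splits=3, shuffle=False)

# Hypothetical search space, just to illustrate the setup.
param_grid = {"max_depth": [3, 5], "n_estimators": [200, 500]}
search = GridSearchCV(XGBRegressor(), param_grid, cv=cv,
                      scoring="neg_mean_squared_error")
search.fit(X, y)
best_xgb = search.best_estimator_
```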

The idea came up after reading some sample scripts and other resources.

Details are published in a Colab notebook.

Any feedback is welcome!

Is ensembling a good idea by default? I can imagine situations in which one of the three models outperforms the ensemble. Maybe one should compare metrics before deciding whether to submit predictions from a single model or from the ensemble?
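A quick way to check would be something like this, scoring each single model against the averaged ensemble on the validation data (`models` and `validation` come from the sketches above; Spearman correlation is used here as a stand-in for the tournament metric):

```python
import numpy as np
from scipy.stats import spearmanr

y_val = validation[target_col]

# Predictions from the three single models of one ensemble, plus their average.
candidates = {
    name: m.predict(validation[feature_cols])
    for name, m in zip(["xgboost", "catboost", "lightgbm"], models)
}
candidates["ensemble"] = np.mean(list(candidates.values()), axis=0)

# Compare before deciding what to submit.
for name, pred in candidates.items():
    corr, _ = spearmanr(y_val, pred)
    print(f"{name}: {corr:.4f}")
```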