Will early stopping in cross validation introduce overfitting?

maxchu · December 18, 2021, 4:01am

Has anyone run experiments on combining early stopping with cross-validation? Does it give better results than using cross-validation alone?

Also, how do you determine the n_estimator / num_of_epoch for your final model?

jacob_stahl · December 18, 2021, 6:00pm

Check out Optuna - A hyperparameter optimization framework. It makes picking hyperparameters a lot more efficient, and generally gives good results after ~30 trials.

maxchu · December 20, 2021, 11:29pm

Thanks for the suggestion!

neosbrother · December 22, 2021, 2:51am

I’m mostly working with neural nets, so early stopping (ES) is very important. I am using ES + CV, and while I don’t have enough live data to feel confident yet, I think it’s a principled approach that should at least allow me to evaluate a model with some degree of confidence.

maxchu · December 22, 2021, 4:53am

Is your final model trained on the combined train+valid dataset?

neosbrother · December 22, 2021, 1:47pm

I don’t have a final model. I use the models trained during CV as an ensemble.

maxchu · January 1, 2022, 10:02am

So you just average the predictions of the same model trained on different CV folds?

neosbrother · January 1, 2022, 5:29pm

Yes, average the predictions of each of the models.
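
For anyone reading later, here is a minimal sketch of the ES + CV ensemble idea discussed above. It is not the poster's actual code. It assumes a toy 1-D linear model trained by plain gradient descent, and the data generator, the helper names (`make_data`, `fit_early_stopping`, `kfold_models`, `ensemble_predict`), and all hyperparameter values are hypothetical. Each fold's held-out part is used only to decide when to stop, and the fold models are then averaged.

```python
import random
from typing import NamedTuple

class FoldModel(NamedTuple):
    valid_loss: float
    w: float
    b: float
    epoch: int  # epoch at which validation loss was best

def make_data(n, seed):
    """Hypothetical toy data: y = 2*x + 1 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys

def mse(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_early_stopping(xt, yt, xv, yv, lr=0.1, max_epochs=500, patience=10):
    """Gradient descent; keep the weights with the best validation MSE and
    stop once `patience` epochs pass without improvement."""
    w = b = 0.0
    best = FoldModel(float("inf"), w, b, 0)
    n = len(xt)
    for epoch in range(1, max_epochs + 1):
        errs = [w * x + b - y for x, y in zip(xt, yt)]
        grad_w = 2 * sum(e * x for e, x in zip(errs, xt)) / n
        grad_b = 2 * sum(errs) / n
        w, b = w - lr * grad_w, b - lr * grad_b
        loss = mse(w, b, xv, yv)
        if loss < best.valid_loss:
            best = FoldModel(loss, w, b, epoch)
        elif epoch - best.epoch >= patience:
            break
    return best

def kfold_models(xs, ys, k=5):
    """Train one early-stopped model per fold (interleaved split)."""
    models = []
    for i in range(k):
        valid_idx = set(range(i, len(xs), k))
        xt = [x for j, x in enumerate(xs) if j not in valid_idx]
        yt = [y for j, y in enumerate(ys) if j not in valid_idx]
        xv = [xs[j] for j in sorted(valid_idx)]
        yv = [ys[j] for j in sorted(valid_idx)]
        models.append(fit_early_stopping(xt, yt, xv, yv))
    return models

def ensemble_predict(models, x):
    """Average the predictions of all fold models."""
    return sum(m.w * x + m.b for m in models) / len(models)

xs, ys = make_data(200, seed=0)
models = kfold_models(xs, ys, k=5)
print("best epoch per fold:", [m.epoch for m in models])
print("ensemble prediction at x=0.5:", ensemble_predict(models, 0.5))
```

Note that each fold's validation loss was also used to pick the stopping point, so it is an optimistic estimate of that model's performance. The per-fold best epochs printed above are one possible input when choosing a fixed epoch count for a single retrained model, though that is a heuristic and not something established in this thread.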
