Now that I’m starting to work with the larger datasets, my Google Colab functions take hours to run, and the session disconnects whenever I leave my laptop.
Is there another tool out there that can run a Jupyter notebook in the background? What are you using?
Thanks!
I use and recommend Kaggle. Not only can you run Jupyter notebooks for free with 4 CPUs and 30GB RAM (max 12 hours per run, 5 notebooks in parallel), but thanks to the Kaggle API you can fully automate your pipeline. On top of that, there is a great community and the chance to learn more from competitions.
I have created a few public notebooks and datasets to simplify first steps for Numerai Tournament participants:
- weekly updated dataset with latest data, so that you do not need to download them each time:
- Example models - typically forks of Numerai example models with improvements for better results:
- Kaggle automation tips:
Thank you! I’ll review this!!!
Thank you! Also a new user exploring ways to train models (turns out my ~10 year old gaming rig can’t handle much past the “small” feature set)
Also using kaggle.com. Highly convenient option
I train it on my local computer and use a batch script to submit daily.
This is very useful, thank you so much
Some people are using Vast.ai as the most cost-effective GPU option.
I found it useful to have my programs periodically save relevant parameters to files. For example, with my basic Numerai program I save parameters every 250 iterations to files named by the iteration number (like “XXXXX.mat”, where XXXXX is the iteration number, and “mat” because I use MATLAB). In my case 250 iterations represents about half an hour of processing, so not much is lost if processing is interrupted for one reason or another.
That also lets you branch off downstream programs from pretty much wherever you like.
FWIW I train my models at home, and the few times (for other projects) I’ve used Colab I used Google Drive for data storage.
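For anyone working in Python rather than MATLAB, the periodic-save idea above could be sketched roughly as follows. This is only an illustration under assumptions: the function names, the `.pkl` extension, and the toy `state` dictionary are hypothetical stand-ins, and the "training step" is a placeholder rather than a real model update.

```python
import pickle
import re
from pathlib import Path

CHECKPOINT_EVERY = 250  # iterations between saves, as in the post above


def checkpoint_path(directory: Path, iteration: int) -> Path:
    # Files are named by iteration number, zero-padded (e.g. 00250.pkl).
    return directory / f"{iteration:05d}.pkl"


def save_checkpoint(directory: Path, iteration: int, state: dict) -> Path:
    directory.mkdir(parents=True, exist_ok=True)
    final = checkpoint_path(directory, iteration)
    tmp = final.with_suffix(".tmp")
    # Write to a temporary file first, then rename, so an interruption
    # mid-write does not leave a truncated checkpoint behind.
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    tmp.replace(final)
    return final


def load_latest_checkpoint(directory: Path):
    """Return (iteration, state) for the newest checkpoint, or (0, None)."""
    if not directory.exists():
        return 0, None
    found = [p for p in directory.iterdir() if re.fullmatch(r"\d+\.pkl", p.name)]
    if not found:
        return 0, None
    latest = max(found, key=lambda p: int(p.stem))
    with open(latest, "rb") as f:
        return int(latest.stem), pickle.load(f)


def train(directory: Path, total_iterations: int) -> dict:
    # Resume from the newest checkpoint if one exists.
    start, state = load_latest_checkpoint(directory)
    if state is None:
        state = {"weight": 0.0}  # hypothetical model parameters
    for iteration in range(start + 1, total_iterations + 1):
        state["weight"] += 0.001  # placeholder for a real training step
        if iteration % CHECKPOINT_EVERY == 0:
            save_checkpoint(directory, iteration, state)
    return state
```

Calling `train(Path("checkpoints"), 1000)` again after an interruption would pick up from the last saved multiple of 250. On Colab, the directory could presumably point at a mounted Google Drive folder so the files survive a disconnect.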
I’m experimenting with Paperspace Gradient. It’s like Colab but with ‘unlimited’ GPUs on a monthly subscription. I’m currently trying to build the correct virtual environments on Gradient so the pickled model is consistent with Numerai Evaluation. If it works well, I’m guessing it would be a solid place for high-RAM GPU training!