Hello,
I’ve put together a Google Colab notebook covering the following topics:
- Handling numerapi API keys in Colab without too much friction
- Managing memory (reading the whole tournament dataset in one go tends to crash Colab)
- Fitting models: the notebook includes a fastai tabular learner and a scikit-learn regression model
- Evaluating model performance with average per-era correlation and Sharpe score
- Generating and formatting predictions and submitting them to the competition with numerapi
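To make the evaluation step concrete, here is a minimal sketch of how average per-era correlation and a Sharpe score can be computed with pandas. It is not copied from the notebook. It assumes the correlation is a Spearman rank correlation, that the Sharpe score is the mean of the per-era correlations divided by their population standard deviation, and that the data has columns named `era`, `target`, and `prediction`. The function names are made up for illustration.

```python
import numpy as np
import pandas as pd


def per_era_correlation(df, era_col="era", target_col="target", pred_col="prediction"):
    """Return one rank correlation per era (column names are assumptions)."""
    scores = {}
    for era, group in df.groupby(era_col):
        # Ranking both columns and taking the Pearson correlation of the
        # ranks is equivalent to a Spearman rank correlation.
        ranked_pred = group[pred_col].rank(method="average")
        ranked_target = group[target_col].rank(method="average")
        scores[era] = ranked_pred.corr(ranked_target)
    return pd.Series(scores, name="correlation")


def summarize_eras(era_scores):
    """Mean per-era correlation and a Sharpe-style ratio (mean / std)."""
    mean = era_scores.mean()
    std = era_scores.std(ddof=0)  # population std; an assumption of this sketch
    sharpe = mean / std if std > 0 else np.nan
    return {"mean": mean, "sharpe": sharpe}
```

Calling `summarize_eras(per_era_correlation(validation_df))` on a frame of validation predictions gives both numbers at once. A model with a similar mean but a higher Sharpe score is performing more consistently from era to era.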
You can access the notebook from the link below or from the GitHub repository.
Any feedback on ways to make this a more useful resource would be appreciated. It mostly focuses on topics I found challenging during my first few weeks of working on Numerai.
The intent of this project is twofold:
- Walking new users through the whole process of getting the current data, fitting a model, making predictions, and submitting to the competition; and
- Helping those who want to take advantage of Colab GPUs but have found it inconvenient or difficult to deal with secret keys and/or Colab’s memory limitations.
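On the memory point, one common way to stay under Colab’s limits is to read a large CSV in chunks and downcast floating-point columns as you go. The sketch below illustrates that general idea and is not necessarily what the notebook does. The function name `read_csv_low_memory` and the default chunk size are hypothetical.

```python
import pandas as pd


def read_csv_low_memory(path_or_buffer, chunksize=50_000):
    """Read a CSV piece by piece, storing float columns as float32."""
    chunks = []
    for chunk in pd.read_csv(path_or_buffer, chunksize=chunksize):
        # float64 is the pandas default; float32 halves the space those
        # columns take at the cost of some precision.
        float_cols = chunk.select_dtypes(include="float64").columns
        chunk = chunk.astype({col: "float32" for col in float_cols})
        chunks.append(chunk)
    # All downcast chunks are still held in memory here, so this reduces the
    # footprint rather than removing the need to fit the data in RAM.
    return pd.concat(chunks, ignore_index=True)
```

If the result is still too large, the same loop can keep only the rows or columns you need from each chunk before appending it.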