New Data Scientist on board - Where do I start?

Hello dear community.

I am completely new to the topic of stock market prediction.
I understand the concept of Numerai and have worked as a Data Scientist for a tech company for multiple years. My specialties are RNNs, so I know a bit about time series.

I have already built a classic LSTM model and run it on the medium Numerai dataset; however, the results are not great.

Now I am asking for any kind of tips to get better.

As far as I understand, the most popular model architectures are XGBoost, Transformers, and RNNs, right?
Is there a GitHub repo for a model that performs decently?
Any other resources that you can recommend?

Thanks in advance!

I can’t tell you what you’ll find useful, but here’s what I found useful when I started (which was 4 months ago :stuck_out_tongue:):

  1. I think you should go through the Numerai Example Scripts; they guide you through the whole process of training, submitting, etc.
  2. Then check Numerai's benchmark models; this is really fresh information: Benchmark Models - Numerai Tournament
  3. Watch the Numerai content on YouTube, specifically the Quant Club and the Fireside chats, to get better context.
  4. Watch StudyM8’ts Numerai Series
  5. Check the Grid Search article: Super Massive LGBM Grid Search. In general, just search the forum whenever you have questions.
  6. Just hop into Discord. A lot of smart people discuss all kinds of things there, and I picked up a lot.
  7. Marcos López de Prado's book Advances in Financial Machine Learning will give you many new concepts and a basic understanding of what Numerai does, and of finance in general.

As for which models are considered effective:
I can’t really tell. Most people are using GBT (gradient-boosted trees). A few people active on Discord are using genetic algorithms, random forests, Transformers, and RNNs, but not many people are talking about LSTMs.
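In case GBT is less familiar than RNNs, here is a minimal sketch of the idea in plain Python: each round fits a one-split tree (a "stump") to the current residuals and adds a damped version of it to the prediction. Everything in it is an assumption for illustration: the toy data is synthetic and not the Numerai dataset, and the names `fit_stump`, `fit_gbt`, and `predict_gbt` are hypothetical helpers. In practice people use a library such as LightGBM or XGBoost rather than anything like this.

```python
import random

def fit_stump(X, residuals):
    """Find the one feature/threshold split that minimizes squared error."""
    best = None  # (error, feature index, threshold, left value, right value)
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # skip splits that leave one side empty
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    return best[1:]

def predict_stump(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv

def fit_gbt(X, y, n_rounds=20, learning_rate=0.3):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(y) / len(y)  # start from the mean target
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate

def predict_gbt(model, row):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, row) for s in stumps)

def mse(y, preds):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)

# Synthetic toy data (an assumption, not Numerai data): 3 integer features,
# target depends on the first two plus a little noise.
random.seed(0)
X = [[random.randint(0, 4) for _ in range(3)] for _ in range(200)]
y = [0.5 * row[0] - 0.25 * row[1] + random.gauss(0, 0.1) for row in X]

model = fit_gbt(X, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
model_mse = mse(y, [predict_gbt(model, row) for row in X])
print("baseline MSE:", baseline_mse, "boosted MSE:", model_mse)
```

This only compares training error against a predict-the-mean baseline, so it says nothing about out-of-sample performance; it is just meant to show what "boosting trees" means mechanically.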

Amazing, thank you. That helps a lot :slight_smile: :+1: