Help us improve Numerai Compute!

  • Are your submissions automated now?
    Semi-automated
  • Do you submit from a home machine, compute/cloud or website?
    Google Colab
  • Do you want to train your model locally or in the cloud?
    No preference.
  • How often do you retrain your models?
    Every week.
  • What cloud platform are you most comfortable with?
    No preference.
  • Do you use version control for your model code?
    No.
  • What are the biggest pain points with the current Compute setup?
    Haven’t tried Compute
  • How do you typically deploy a model to production?
    Manually.

Whatever can easily and cheaply store a model and trigger it is what I’m after. If Compute already does that, then cool.

  • Are your submissions automated now?
    Yes

  • Do you submit from a home machine, compute/cloud or website?
    Cloud

  • Do you want to train your model locally or in the cloud?
    Cloud

  • How often do you retrain your models?
Classic: Whenever new data is released, for 99% of the models. Experimenting with weekly re-training.
Signals: Weekly for most, monthly or quarterly for some. Some are not models but just features, calculated weekly.

  • What cloud platform are you most comfortable with?
    Google Cloud Platform

  • Do you use version control for your model code?
    Not really

  • What are the biggest pain points with the current Compute setup?
    Never tried it

  • How do you typically deploy a model to production?
Test locally, then upload it to Google Cloud Storage. The most recent version gets downloaded each week and used.

Similar to @jrai, I use GCP and just run everything on a VM. The VM is on a cron job: it downloads the latest Python files from Cloud Storage, runs them, submits the predictions, and then shuts down automatically. For batch predictions like we’re doing, this works very well.
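The batch workflow described above can be sketched roughly like this. The bucket and file names are placeholders, and it assumes the `gsutil` CLI is installed on the VM and that the downloaded script handles prediction and submission itself:

```python
# Sketch of a cron-triggered batch prediction job on a GCP VM.
# Bucket and file names are placeholders; assumes the gsutil CLI and a
# predict script that generates and submits predictions when run.
import subprocess
import sys

BUCKET = "gs://my-numerai-bucket"   # placeholder bucket
SCRIPT = "predict.py"               # latest pipeline code kept in GCS

def download_cmd(bucket: str, name: str) -> list[str]:
    """gsutil command that fetches the newest copy of a pipeline file."""
    return ["gsutil", "cp", f"{bucket}/{name}", name]

def run_round() -> None:
    # 1. Pull the latest Python files from Cloud Storage
    subprocess.run(download_cmd(BUCKET, SCRIPT), check=True)
    # 2. Run the pipeline (downloads data, predicts, submits)
    subprocess.run([sys.executable, SCRIPT], check=True)
    # 3. Power the VM off so it only costs money while working
    subprocess.run(["sudo", "shutdown", "-h", "now"], check=True)

if __name__ == "__main__":
    run_round()
```

A cron entry on the VM (or a Cloud Scheduler job that starts the instance) would then invoke this script once per round.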

  • Are your submissions automated now?
    Yes
  • Do you submit from a home machine, compute/cloud or website?
    Classic: numerai-compute on AWS. Signals: Dedicated server
  • Do you want to train your model locally or in the cloud?
    Classic: Cloud. Signals: Dedicated server
  • How often do you retrain your models?
    Classic: Never. Signals: Every submission
  • What cloud platform are you most comfortable with?
    GCP
  • Do you use version control for your model code?
    Yes
  • What are the biggest pain points with the current Compute setup?
    Signals: Terraform is too opaque for me to understand how to use it to kick off my complex Signals pipeline, so I just roll my own with schedule. I’d be open to setting up a webhook you could call if timing is variable
  • How do you typically deploy a model to production?
    Signals: Docker-compose up on dedicated server

Bonus questions:

  • How long does your pipeline take?
    Classic: 2-5 min (inference only on legacy data). Signals: >24 hours (data collection of the past week + retraining + inference)
  • How easy would it be to move to daily submissions?
    Classic: Pretty easy. Signals: Pretty hard, would require significant rearchitecting of data pipeline
  • Are your submissions automated now?
    Yes

  • Do you submit from a home machine, compute/cloud or website?
    numerai-compute on AWS

  • Do you want to train your model locally or in the cloud?
    locally

  • How often do you retrain your models?
Classic: ~5% of slots weekly; some trained years ago are still running
Signals: Every 3-6 months

  • What cloud platform are you most comfortable with?
    AWS

  • Do you use version control for your model code?
    Yes

  • What are the biggest pain points with the current Compute setup?
Initial setup, diagnosing problems, shutting down/restarting via AWS if needed.
Lack of AWS creds accessible from the environment like the Numerai creds (AWS creds are used for S3 bucket reads/saves, archiving, ensembling, etc.).

  • How do you typically deploy a model to production?
    Numerai node deploy

Bonus questions:

  • How long does your pipeline take?
    Classic: ~20 min (FE then inference only for 50 slots, single slot trigger). Signals: >1 hour (data retrieval + feature engineering + inference)
  • How easy would it be to move to daily submissions?
    Classic: Pretty easy. Signals: Dicey depending upon time allowed after trigger because of data pipeline
• Are your submissions automated now?
    Kind of. I have a single script that submits all of my models that I run manually.

  • Do you submit from a home machine, compute/cloud or website?
    Home.

  • Do you want to train your model locally or in the cloud?
    Locally.

  • How often do you retrain your models?
    Never.

  • What cloud platform are you most comfortable with?
    AWS.

  • Do you use version control for your model code?
    Yes.

  • What are the biggest pain points with the current Compute setup?
    Not enough RAM; a pain to set up 20+ times for each model; don’t want to have to use Docker or other similar dependencies.

  • How do you typically deploy a model to production?
    Create a predict.py script for each model and run these every weekend.

Hello Numeratis,
A few months ago, I made you a proposal to set up a dedicated cloud workspace: [Proposal] Numerai.cloud - Open source cloud workspace for the community. Despite some skepticism, I decided to make this project a reality, so I set up a subscription page for those wishing to access the app in beta and support the project: https://numerai-cloud.ghost.io/.
I am convinced that this would make life easier for all of us, reduce friction, allow us to onboard new Numeratis quickly, and make it easier to explore more TC-friendly ideas.

I had a look at the Compute Lite Beta Testing Document.

napi.deploy(model_id, model, napi.feature_sets('small'), 'requirements.txt')

What does this code do? What is happening behind the scenes?

Will this work with my model?

This works with any model or pipeline that matches the sklearn interface. As long as your model has a predict function, it will work.

This is such a big limitation and assumption. Many user models will not work.
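For context, “matches the sklearn interface” presumably just means duck typing on a predict method, in which case a model that lacks one can in principle be shimmed behind a thin wrapper. This is only an illustrative sketch; the names here are not from the Compute Lite document:

```python
# Illustrative duck-typed wrapper: anything exposing predict(X) satisfies
# the stated requirement, so an arbitrary prediction callable can be
# shimmed behind one. PredictShim and predict_fn are hypothetical names.
class PredictShim:
    def __init__(self, predict_fn):
        self.predict_fn = predict_fn  # any callable: features -> predictions

    def predict(self, X):
        return self.predict_fn(X)

# Example: a hand-rolled "model" that averages each row's features
model = PredictShim(lambda X: [sum(row) / len(row) for row in X])
```

Whether such a shim would survive Compute Lite’s serialization step is a separate question the document doesn’t answer.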

What are the limitations of Compute Lite?

Compute Lite uses Lambda to run your deployed model, so there are run-time and memory constraints. Lambda has a maximum run time of 15 minutes and a maximum memory allocation of 3GB. If your model inference exceeds these limits, it will not work until we add support for AWS Batch.

Same as above

That is actually well explained in the document

Are your submissions automated now?

Yes. I have a cron job on a local Raspberry Pi that wakes up my main computer to run the scripts.
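A wake-up trigger like that is typically done with a Wake-on-LAN magic packet; a minimal stdlib sketch, assuming the main machine has WoL enabled (the MAC address below is a placeholder):

```python
# Minimal Wake-on-LAN sender: a magic packet is 6 bytes of 0xFF followed
# by the target MAC address repeated 16 times, sent as a UDP broadcast.
import socket

def magic_packet(mac: str) -> bytes:
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic_packet(mac), (broadcast, port))

if __name__ == "__main__":
    wake("aa:bb:cc:dd:ee:ff")  # placeholder MAC of the main computer
```

The Pi’s crontab would call this shortly before the submission scripts are due to run.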

Do you submit from a home machine, compute/cloud or website?

Home machine

Do you want to train your model locally or in the cloud?

Locally

How often do you retrain your models?

Depends on the model; those that I do retrain, I retrain once per month.

What cloud platform are you most comfortable with?

None of them

Do you use version control for your model code?

Yes

What are the biggest pain points with the current Compute setup?

Lack of control. I don’t like keeping private code somewhere other than my local machines. I also don’t like the “let me handle everything for you” approach and would rather have “here is a ready-to-use solution, but you can also modify it or do everything on your own”. Also, some of my models require some compute power, and I already have a local machine capable of providing that; I don’t want to spend extra money on expensive cloud services. If a webhook trigger mechanism becomes mandatory, I would really like to be able to set custom webhook URLs in my Numerai account and do my own thing.

How do you typically deploy a model to production?

For most models I create a custom model file that can be added to a folder of deployed models after I have trained it.

To support this future, we are exploring the idea of daily rounds with much shorter submission windows. This change will effectively make model automation mandatory.

Just define a daily time window during which models are supposed to upload their predictions, e.g. every Mon-Fri from 6:00 UTC to 10:00 UTC. A simple cron job will work just fine.
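The suggested window (Mon-Fri, 6:00-10:00 UTC, a hypothetical example) is trivial to check from a cron-driven script; a minimal sketch:

```python
# Checks whether a timestamp falls inside a hypothetical submission
# window of Monday-Friday, 06:00-10:00 UTC. A crontab entry such as
#   0 6 * * 1-5 python submit.py
# would fire right at the start of the window.
from datetime import datetime, timezone

def in_submission_window(now: datetime) -> bool:
    # weekday() is 0 for Monday .. 6 for Sunday
    return now.weekday() < 5 and 6 <= now.hour < 10

if in_submission_window(datetime.now(timezone.utc)):
    print("window open: run the pipeline and upload predictions")
```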

The key message I want to convey is that I am fine with everything unless it becomes impossible to upload predictions other than by using a Numerai Compute node running in a cloud service.

  • Are your submissions automated now?

Semi-automated. All models can be submitted by manually triggering a single script.

  • Do you submit from a home machine, compute/cloud or website?

home machine

  • Do you want to train your model locally or in the cloud?

Locally

  • How often do you retrain your models?

Not too often. Mostly they are trained once and included in the pipeline without retraining.

  • What cloud platform are you most comfortable with?

AWS and GCP

  • Do you use version control for your model code?

No

  • What are the biggest pain points with the current Compute setup?

I often add new models and remove bad ones, which means changing the script and what goes into the ensemble.

Model files are big. Building a container and uploading the whole package to the cloud takes a long time.


While I understand the need and advantage for many users, I am worried that the new Numerai Compute will take away the clean, straightforward and, above all, flexible approach of the Numerai tournament (download data → do whatever you want → upload the predictions).

With Numerai Compute, the user models are run on demand by Numerai. Numerai decides when to call what, which is a big shift from the current standard, where it is the user who decides what to do and when.

I can understand Numerai’s need for this paradigm shift, but I do not accept the decrease in flexibility in how I can run my model or what I can do (a limit imposed both by the current form of Numerai Compute and by the fact that we have to use AWS).

If this paradigm shift becomes mandatory, please, please, please add the possibility to skip Numerai Compute and allow users to register a webhook on their account instead. The webhook would work as a simple trigger that starts user models. That would give us back the flexibility to do anything our models need.


I second @taori: please keep the webhooks, or at least a way to make external custom calls outside of AWS.

Rest assured, we are keeping webhooks. The new Compute setup is not mandatory and never will be. Our goal with Compute Lite is to get as many people as possible automated and our initial offering relies on AWS. For users that are already automated, they can keep their existing setup whether it uses Compute or not.


Ah, I was worrying for no reason then. Thank you for clarifying this.

Compute should support the deployment and ensembling of multiple models!
Also, it would be beneficial if it could support reading model data from S3.

Building and re-uploading a Docker container with several models is very slow, even with a good internet connection.

  • Are your submissions automated now?
    Yes
  • Do you submit from a home machine, compute/cloud or website?
    Cloud
  • Do you want to train your model locally or in the cloud?
    Cloud
  • How often do you retrain your models?
    About three times yearly.
  • What cloud platform are you most comfortable with?
    Saturn
  • Do you use version control for your model code?
    Yes.
  • What are the biggest pain points with the current Compute setup?
    I’m not very familiar with docker containers. It’s a steep learning curve.
  • How do you typically deploy a model to production?
    Straight from github.
  • Are your submissions automated now?
Yes, I have just automated them on my home machine for the daily tournaments, using cron, bash scripts, and a Rust back end. All I ask is a guaranteed deadline within the submission window by which the data will be available for download.

  • Do you submit from a home machine, compute/cloud or website?
    Home machine

  • Do you want to train your model locally or in the cloud?
    Locally

  • How often do you retrain your models?
    Not often, just when I get some ideas for (hoped for) improvements.

  • What cloud platform are you most comfortable with?
    github actions

  • Do you use version control for your model code?
    Yes

  • What are the biggest pain points with the current Compute setup?
Too pythonesque. I have put quite a lot of effort into setting everything up my own way, in Rust, so that it runs orders of magnitude faster. Any changes that assume all kinds of complications on your side induce pain.

  • How do you typically deploy a model to production?
    I get an idea, I change my rust source accordingly, train a new model and run some benchmarks. If the results appear to show promise, I put it into production. Sometimes I iterate this loop a few times.

None of your proposals are appealing, I know nothing about all that bull and don’t need any of it. Mandatory ‘Compute’ and the fees associated with it would make me pull out of numerai entirely. If you follow my suggestion and have a guaranteed UTC time for data being available, I can synchronise my cron to it and do everything locally in just a few short minutes, quicker than your ‘Compute’ can ever aspire to.

PS. I do not think it is at all reasonable to have a hard submission deadline for us but not to have a hard deadline for you to provide the data.

  • Are your submissions automated now?
    Yes!
  • Do you submit from a home machine, compute/cloud or website?
    Azure Cloud
  • Do you want to train your model locally or in the cloud?
    Cloud, because of memory requirements.
  • How often do you retrain your models?
    Never.
  • What cloud platform are you most comfortable with?
    Azure
  • Do you use version control for your model code?
    Yes, I have a private DevOps git.
  • What are the biggest pain points with the current Compute setup?
    I have some monthly free Azure credits, so I prefer Azure vs AWS.
  • How do you typically deploy a model to production?
    I use a container instance to submit all my models, so I must update a script to include the model, then rebuild and push the new docker image.

I need only one trigger to submit all my models, so I had to put a Compute webhook in one of my models, which is a little odd. It seems like I have only one Compute model.
Maybe there could be a place to define a webhook for all models.


If you want to improve Numerai Compute, you could update the docs and provide an updated tutorial. The docs page Numerai CLI and Compute - Numerai Tournament hasn’t been updated in 7 months.

  • Are your submissions automated now?
    No.
  • Do you submit from a home machine, compute/cloud or website?
    Home machine.
  • Do you want to train your model locally or in the cloud?
    Locally
  • How often do you retrain your models?
    Very rarely
  • What cloud platform are you most comfortable with?
    I’ve never used one. I’m such a neanderthal.
  • Do you use version control for your model code?
    No
  • What are the biggest pain points with the current Compute setup?
    Cloud
  • How do you typically deploy a model to production?
Run my script to generate predictions for all my models and then upload them one by one. Deployment is just adding a trained model folder name to my pipeline.