I am working on automating it a bit further and moving it into the cloud, especially since holidays and other trips are coming up. Anyway, using the Oracle free tier I managed to predict and upload 20 v3/v4 models based on the smaller feature set within 10 minutes (with only 1 GB of memory), so that's working nicely.
I also got it working with the medium feature set (I forgot how many features that was, somewhere between 300 and 500?). Loading the training set for neutralization purposes of course takes longer, but I am guessing 20 models would still go within the hour. Not bad for free compute which you can keep on 24/7.
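In case anyone wonders how 20 models can fit on a 1 GB instance: the trick is simply to hold only one model in memory at a time. A minimal sketch of that loop (ToyModel and the file names are made up for illustration; a real setup would load your pickled tournament models and push the predictions with numerapi):

```python
import gc
import os
import pickle

class ToyModel:
    """Stand-in for a real trained model; assumes the model exposes
    a scikit-learn-style predict method."""
    def __init__(self, weight):
        self.weight = weight
    def predict(self, features):
        return [self.weight * x for x in features]

def run_models_one_at_a_time(model_paths, live_features):
    """Load, predict with, and discard each pickled model in turn, so
    peak memory stays near the size of a single model plus the data."""
    predictions = {}
    for path in model_paths:
        with open(path, "rb") as f:
            model = pickle.load(f)
        predictions[os.path.basename(path)] = model.predict(live_features)
        del model        # drop the reference...
        gc.collect()     # ...and reclaim the memory before the next load
    return predictions
```

After the loop you would write each entry of `predictions` to a CSV and upload it per model slot.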
I will write a forum post soon on how I did this with some example code/instructions, maybe others can benefit from it too.
Do you submit from a home machine, compute/cloud or website?
Local machine
Do you want to train your model locally or in the cloud?
Depending on resources needed. Prefer local.
How often do you retrain your models?
Manually, no set schedule.
What cloud platform are you most comfortable with?
AWS (but am open to options)
What are the biggest pain points with the current Compute setup?
Lack of information. More explanation would be helpful, to see if it works for me or not, and whether the cost is worth it. Not even sure what the costs would be.
Some proposals we have: Which of these is most appealing to you? What changes would you make?
None of them look appealing as I'm not so interested in notebooks. You would probably need to store an image of the environment to run all models in a single instance, and run it on your trigger. I would only use a notebook service if it was secure and cheap.
Do you submit from a home machine, compute/cloud or website?
Google Cloud
Do you want to train your model locally or in the cloud?
Locally
How often do you retrain your models?
Tournament: Never, Signals: Weekly
What cloud platform are you most comfortable with?
Google Cloud
Do you use version control for your model code?
Yes, git + yaml-files.
What are the biggest pain points with the current Compute setup?
Not enough Memory
How do you typically deploy a model to production?
Usually dockerized, but currently I have a bash script deployed on a VM, which spins up weekly and runs via a startup script. The bash script pulls from GitHub, loads the model from Google Cloud Storage buckets, and runs the prediction script.
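A sketch of what such a startup script could look like — the repo, bucket, and file names below are all placeholders, not the actual setup:

```shell
#!/usr/bin/env bash
# Hypothetical VM startup script: fetch code, fetch model, predict, power off.
set -euo pipefail

# fetch the latest pipeline code
git clone --depth 1 https://github.com/yourname/numerai-pipeline.git /opt/pipeline

# pull the trained model artifact from a GCS bucket
gsutil cp gs://your-model-bucket/model.pkl /opt/pipeline/model.pkl

# run inference + submission, then let the VM shut itself down
python3 /opt/pipeline/predict.py
shutdown -h now
```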
Embed numerai-cli and its dependencies (Docker, Terraform) into Vagrant (https://www.vagrantup.com/). That would provide a common environment for all Numeratis and help them focus on adding value to the Numerai meta-model.
Build a Docker image containing all the common ML frameworks for building models (TensorFlow, PyTorch, LightGBM, XGBoost, CatBoost), plus VS Code with the notebook and git extensions.
The main objective is to have the whole environment defined in a Vagrantfile that can be run in one click after installing VirtualBox, VMware, or Hyper-V along with Vagrant.
If someone wants to share their work, they could use Vagrant Share.
I would be happy to work on something like this if the idea interests the community.
It's quite interesting to see there are still a lot of people who don't even consider cloud as an option (at least for weekly predictions). It makes me think that a general solution should be capable of supporting both cloud and local compute out of the box.
Do you submit from a home machine, compute/cloud or website?
Cloud
Do you want to train your model locally or in the cloud?
Cloud
How often do you retrain your models?
Classic: Whenever new data is released, for 99% of the models. Experimenting with weekly re-training.
Signals: Weekly for most, monthly or quarterly for some. Some are not models but just features, calculated weekly.
What cloud platform are you most comfortable with?
Google Cloud Platform
Do you use version control for your model code?
Not really
What are the biggest pain points with the current Compute setup?
Never tried it
How do you typically deploy a model to production?
Test locally then upload it to google cloud storage. The most recent version will get downloaded each week and used
Similar to @jrai, I use GCP and just run everything on a VM. The VM is on a cron job: it downloads the latest Python files from Cloud Storage, runs them, submits the predictions, and then shuts down automatically. For batch predictions like we're doing, this works very well.
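For anyone wanting to copy this pattern, a rough sketch — the schedule, paths, and bucket names are placeholders, and the key idea is that the machine only bills while the job runs:

```shell
# In-VM crontab entry: run the pipeline every Saturday at 14:00 UTC,
# then power the machine off so you only pay for actual runtime:
#   0 14 * * 6  /home/me/run_weekly.sh && sudo shutdown -h now
#
# run_weekly.sh pulls the latest code from Cloud Storage and runs it:
gsutil cp gs://your-code-bucket/predict.py /home/me/predict.py
python3 /home/me/predict.py   # downloads data, predicts, submits
```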
Do you submit from a home machine, compute/cloud or website?
Classic: numerai-compute on AWS. Signals: Dedicated server
Do you want to train your model locally or in the cloud?
Classic: Cloud. Signals: Dedicated server
How often do you retrain your models?
Classic: Never. Signals: Every submission
What cloud platform are you most comfortable with?
GCP
Do you use version control for your model code?
Yes
What are the biggest pain points with the current Compute setup?
Signals: Terraform is too opaque for me to understand how to use it to kick off my complex Signals pipeline, so I just roll my own with schedule. I'd be open to setting up a webhook you could call if timing is variable
How do you typically deploy a model to production?
Signals: Docker-compose up on dedicated server
Bonus questions:
How long does your pipeline take?
Classic: 2-5 min (inference only on legacy data). Signals: >24 hours (data collection of the past week + retraining + inference)
How easy would it be to move to daily submissions?
Classic: Pretty easy. Signals: Pretty hard, would require significant rearchitecting of data pipeline
Do you submit from a home machine, compute/cloud or website?
numerai-compute on AWS
Do you want to train your model locally or in the cloud?
locally
How often do you retrain your models?
Classic: ~5% of slots weekly; some models trained years ago are still running. Signals: Every 3-6 months
What cloud platform are you most comfortable with?
AWS
Do you use version control for your model code?
Yes
What are the biggest pain points with the current Compute setup?
Initial setup, diagnosing problems, shutting down/restart via AWS if needed
Lack of AWS creds accessible from the environment the way the Numerai creds are. (The AWS creds are used for S3 bucket reads/saves/archiving/ensembling, etc.)
How do you typically deploy a model to production?
Numerai node deploy
Bonus questions:
How long does your pipeline take?
Classic: ~20 min (feature engineering, then inference only, for 50 slots with a single-slot trigger). Signals: >1 hour (data retrieval + feature engineering + inference)
How easy would it be to move to daily submissions?
Classic: Pretty easy. Signals: Dicey depending upon time allowed after trigger because of data pipeline
Hello Numeratis,
A few months ago, I made you a proposal to set up a dedicated cloud workspace: [Proposal] Numerai.cloud - Open source cloud workspace for the community. Despite some skepticism, I decided to make this project a reality, so I set up a subscription page for those wishing to access the app in beta and support the project at this link: https://numerai-cloud.ghost.io/.
I am convinced that this would make life easier for all of us, reduce friction, and allow us to quickly onboard new Numeratis and explore more TC-friendly ideas.
What does this code do? What is happening behind the scenes?
Will this work with my model?
This works with any model or pipeline that matches the sklearn interface. As long as your model has a predict function, it will work.
This is such a big limitation and assumption. Many user models will not work.
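That said, pipelines that don't natively match the interface can often be adapted with a tiny wrapper. A minimal sketch — the class name and the inference function below are hypothetical, not part of Compute Lite:

```python
class PredictWrapper:
    """Adapter that gives an arbitrary inference callable the minimal
    scikit-learn-style surface (a predict method) described above."""
    def __init__(self, inference_fn):
        self.inference_fn = inference_fn

    def predict(self, X):
        return self.inference_fn(X)

# Example: wrap a custom inference routine that is not an sklearn model.
def my_custom_inference(rows):
    return [sum(row) for row in rows]

model = PredictWrapper(my_custom_inference)
```

Anything that can be expressed as "take features in, return predictions" fits this shape; truly stateful or multi-stage pipelines are where the assumption really breaks down.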
What are the limitations of Compute Lite?
Compute Lite uses Lambda to run your deployed model, so there are runtime and memory constraints. Lambda has a maximum runtime of 15 minutes and a maximum memory allocation of 3GB. If your model inference exceeds these limits, it will not work until we add support for AWS Batch.
Yes, I have a cron job on a local Raspberry Pi waking up my main computer to run the scripts.
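For anyone curious, a sketch of how the wake-up half of such a setup could look — the MAC address, schedule, and timing are placeholders, and it assumes the `wakeonlan` tool is installed on the Pi and Wake-on-LAN is enabled in the main machine's BIOS/NIC settings:

```shell
# Pi crontab: send the magic packet every Saturday at 13:55 UTC,
# a few minutes before the main machine's own cron runs the pipeline:
#   55 13 * * 6  /usr/bin/wakeonlan AA:BB:CC:DD:EE:FF
wakeonlan AA:BB:CC:DD:EE:FF
```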
Do you submit from a home machine, compute/cloud or website?
Home machine
Do you want to train your model locally or in the cloud?
Locally
How often do you retrain your models?
Depends on the model; those that I do retrain, I retrain once per month.
What cloud platform are you most comfortable with?
None of them
Do you use version control for your model code?
Yes
What are the biggest pain points with the current Compute setup?
Lack of control. I don't like private code somewhere other than my local machines. I also don't like the approach "Let me handle everything for you" and would rather have "Here is a ready to use solution, but you can also modify it or do everything on your own". Also, some of my models require some compute power, and I already have a local machine capable of doing that. I don't want to spend extra money for expensive cloud services. If a webhook trigger mechanism becomes mandatory, I would really like to be able to set custom webhook URLs in my Numerai account, and let me do my own thing.
How do you typically deploy a model to production?
For most models I create a custom model file that can be added to a folder of deployed models after I have trained it.
To support this future, we are exploring the idea of daily rounds with much shorter submission windows. This change will effectively make model automation mandatory.
Just define a daily time window during which models are supposed to upload their predictions, e.g. every Mon-Fri from 6:00 UTC to 10:00 UTC. A simple cron job would work just fine.
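Concretely, the cron approach could look like this — the script path is a placeholder, and the 06:30 UTC slot simply lands inside the hypothetical Mon-Fri 6:00-10:00 UTC window:

```shell
# crontab -e on the submission machine:
# minute hour day-of-month month day-of-week  command
30 6 * * 1-5  /home/me/numerai/submit_daily.sh >> /home/me/numerai/cron.log 2>&1
```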
The key message I want to convey is that I am fine with everything unless it becomes impossible to upload predictions other than by using a Numerai Compute node running in a cloud service.
While I understand the need and advantage for many users, I am worried that the new Numerai Compute will take away the clean, straightforward and above all flexible approach of the Numerai tournament (download data → do whatever you want → upload the predictions).
With Numerai Compute, the user models are run on demand by Numerai. Numerai decides when to call what, which is a big shift from the current standard, where it is the user who decides what to do and when.
I can understand Numerai's need for this paradigm shift, but I do not accept the decrease in flexibility in how I can run my model or what I can do (a limit imposed both by the current form of Numerai Compute and by the fact that we have to use AWS).
If this paradigm shift becomes mandatory, please, please, please add the possibility to skip Numerai Compute and allow users to register a webhook on their account instead. The webhook would work as a simple trigger that starts the user's models. That would give us back the flexibility to do anything our models need.
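To make the idea concrete, here is a sketch of what such a self-hosted webhook receiver could look like on the user's side — the /trigger path, the port, and the run_pipeline body are all made up, and a real endpoint should also verify some shared secret before acting on a request:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_pipeline():
    """Placeholder: replace with whatever starts your own models."""
    print("pipeline triggered")

class TriggerHandler(BaseHTTPRequestHandler):
    """Minimal self-hosted endpoint: Numerai would POST here,
    and the user decides what actually runs."""
    def do_POST(self):
        if self.path == "/trigger":
            run_pipeline()
            self.send_response(200)
        else:
            self.send_response(404)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the console quiet

def serve(port=8080):
    # Call serve() to start listening; this blocks forever.
    HTTPServer(("", port), TriggerHandler).serve_forever()
```

The point is exactly the flexibility argued for above: the trigger is standardized, but everything behind run_pipeline stays entirely under the user's control, on whatever hardware they choose.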