Automated submission with google cloud

For those of you who aren’t comfortable following the compute example for aws (I have no idea what I am doing for half the steps), and are happy with something simpler - google cloud made it easier recently to have your virtual machine (VM) spin up on a weekly schedule.

I will explain how in the steps below.

Step 1. Adding the metadata to the VM config that starts your program

This program should fetch the data, run the predictions, and submit those predictions to in any way of your making, and shuts down the VM afterwards.

To add the metadata click on the name of your VM, and click edit

Scroll down and add a metadata entry called “startup-script”, and add your own variant of the startup script you see in the image.

The script I have added there is as follows

#! /bin/bash
cd /home/username
runuser -l username -c ‘./compute.linux’
sudo poweroff

What it does is go to your user directory --do remember to change username with your actual google cloud username!–, and run your program.

As the startup-script is run by the root user, the first it has to do is switch to the username home directory and invoke your program from there. In this example, the program is a script called compute.linux, and for good practices, we don’t run the program as root but as a user.

Remember that the program is responsible for everything related - fetch the data, generate the predictions, and submit the results.

After the program has run its course, the startup-script invokes a “sudo poweroff” to shut down the instance. Note that if the program hangs for some reason, your instance is not shut down!

Step 2. Schedule your VM to start once a week.

Not so long ago, google added an instance scheduler to their compute engine, and it is now trivial to set a weekly schedule for your VM, so that you can submit your predictions every week while on holiday :smiley:. First click on instance schedule, and then click on create schedule.

You see that I already have a named schedule there, but I will go through the process of making another one here.

When you click create schedule, you get the following tab.

What is important for you is to select the same region as your VM is from, and then select a start time/time zone, an initial date from whence forwards the schedule becomes active, the frequency, and when you pick weekly, also the day of the week that your VM is started.

I picked a starting hour that is a few hours later than when the new data typically becomes available. If is a bit late in uploading their new weekly dataset, my program isn’t caught trying to predict and submit last week’s predictions again!

Step 3. Give your schedule permission to start a VM

Now you have to make a small detour to, which is where you can give your schedule permission to start your VM.

Your first action is to click the little box marked in red below, and that will make your compute engine service agent show up. Edit its permissions by clicking the pencil to the right, and add the role of Compute Instance Admin (v1) to the service agent account.



Well done. Just one more step!

Step 4. Back to the VM page, and attach the schedule to your VM

Now, go back to your VM instances page on google cloud, and in the instance schedule tab, click on the weekly schedule that you created.

Now you can add your VM instance to your schedule without errors on permissions.


And congratulations! You are set! You have got something similar to’s AWS compute working, but on google’s cloud - which seems better tailored to people that think of the cloud as ssh’ing into a linux box and running a program, rather than fancy stateless functions and docker images :stuck_out_tongue_winking_eye: .


Works great, thx! I did try numerai compute but I always got an OOM-Error, even after increasing it to 30 GB. Locally, it only uses around 18 GB of Ram (as proven scientifically with Task Manager) but it still crashes with numerai CLI. However, I also need the 64 GB Ram Box on GCP.

Here is a little script I use to set up the boxes. In the future I might do it with Docker, but here is the script and adapt it as you need:

sudo apt update
sudo apt-get install wget -y 
sudo apt install python3 python3-dev python3-venv -y 
sudo python3
sudo apt-get install p7zip-full -y
7z x
pip install pip --upgrade
pip install -r requirements.txt
pip install wandb
python3 -m wandb login <API KEY>

Also, this creates a log in the root folder saving all output (google the syntax a little, you’ll find a stackoverflow post with a good description).

exec 3>&1 4>&2
trap 'exec 2>&4 1>&3' 0 1 2 3
exec 1>"$(date +"%Y_%m_%d_%I_%M_%p").log" 2>&1
echo "Starting run at" $(date -u)