Model Upload Beta!

Hello all!

This morning during the fireside chat I presented on our newest automation feature - model uploads.

Model Upload allows you to upload your entire prediction pipeline to Numerai. Once uploaded, Numerai will take care of running it every day to generate live submissions.

Unlike other automation options, Model Upload is completely free and does not require you to set up any infrastructure.

Here are the slides and here are the instructions on how to join the beta test. Please direct all Model Upload feedback to this channel on Discord.

We welcome everyone to join the beta test and let us know what you think of this new feature!


Thanks for the new feature! Will feature neutralization also be supported?


You can probably add the neutralization to your predict function:

    def predict(live_features: pd.DataFrame) -> pd.DataFrame:
        live_predictions = model.predict(live_features[feature_cols])
        # … do some neutralization
        submission = pd.Series(live_predictions, index=live_features.index)
        return submission.to_frame("prediction")
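
As a rough illustration of what the neutralization step could look like, here is a minimal sketch that removes the linear exposure of the predictions to the features with a least-squares fit. The helper name `neutralize` and its `proportion` parameter are hypothetical and not part of any Numerai API; this is one common way to do it, not necessarily what any particular model uses.

```python
import numpy as np
import pandas as pd


def neutralize(predictions: pd.Series, features: pd.DataFrame,
               proportion: float = 1.0) -> pd.Series:
    """Subtract the part of `predictions` that is linearly explained by `features`.

    Hypothetical helper for illustration only. `proportion` = 1.0 removes the
    full linear exposure, 0.0 leaves the predictions unchanged.
    """
    exposures = features.to_numpy(dtype=np.float64)
    scores = predictions.to_numpy(dtype=np.float64)
    # Least-squares coefficients of the predictions regressed on the features.
    coefs, *_ = np.linalg.lstsq(exposures, scores, rcond=None)
    # Remove the chosen proportion of the fitted (feature-explained) component.
    neutral = scores - proportion * (exposures @ coefs)
    return pd.Series(neutral, index=predictions.index)
```

Inside predict, this could be called on `pd.Series(live_predictions, index=live_features.index)` together with `live_features[feature_cols]` before building the submission frame. Depending on your model, you may also want to rank or rescale the result afterwards.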


Yep, @agorog is correct: you can put any code you want inside your predict function, including neutralization.


Is there a way to control the submission file name through this method? If not, what submission file name will these submissions get?




The submission filenames are formatted as live_predictions-[random 12-digits].csv and partitioned by the model_uuid when uploaded to S3.


Fantastic feature! Thanks @slyfox and whole Numerai team!

To test it, and possibly to show newcomers how easy it is to join Numerai, I have created a public Kaggle notebook based on your hello_numerai.ipynb example. The notebook’s output folder contains the pickled model for download/upload. I created a new model, JOS_KAGGLE_TEST, uploaded the result, and voilà: yesterday it was submitting predictions at 17:24.

Kagglers can now just create a Numerai account and a new model, then upload the pickled model, without even forking their own version of the Hello Numerai notebook. Obviously, they will get sub-par performance (although who knows where this model will end up on the leaderboard), but they can easily tweak it to get better results and start staking.

On the downside, I will now need to rewrite about 20 of my models :crazy_face:, but I think it’s absolutely worth it, as it frees me from the automation tinkering that kept me from spending more time on modelling and data science.


I am not able to give feedback on Discord, so at least here:

I am training my models on the int8 datasets (memory constraints), and predictions are produced on live_int8.parquet. Could the pickled predict function take the type of live data (float or int) as an argument? Otherwise I would need to convert the live data to int, which is unnecessary when the int data are published anyway…
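
For reference, the conversion mentioned above might look roughly like the sketch below. It assumes (this is not confirmed anywhere in this thread) that the float features take the values 0, 0.25, 0.5, 0.75 and 1 and that the int8 dataset encodes them as 0 to 4, and that there are no missing values. The helper name `to_int8_features` is hypothetical.

```python
import pandas as pd


def to_int8_features(live_features: pd.DataFrame, feature_cols: list) -> pd.DataFrame:
    """Return a copy of `live_features` with the feature columns cast to int8.

    ASSUMPTION: float features are in {0, 0.25, 0.5, 0.75, 1} and map to the
    integers 0..4 in the int8 dataset. Check this against your own data first.
    """
    converted = live_features.copy()
    for col in feature_cols:
        # Scale to 0..4, round away float noise, then store compactly as int8.
        converted[col] = (converted[col] * 4).round().astype("int8")
    return converted
```

Calling this at the top of predict would keep the rest of an int8-trained pipeline unchanged, at the cost of one extra pass over the live data.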


That is sad for me, because I evaluate my models based on their submission names :frowning:
It is a stupid idea, but we weren’t allowed to change the names of the models, and at the beginning we didn’t have 70 submission slots, so I had to keep track of what is what somehow.
