Rnumerai - R Interface to the Numerai Machine Learning Tournament API

theomniacs · April 21, 2020, 3:24pm

Hi everyone! This post is to introduce Rnumerai, an R Interface for the Numerai Machine Learning Tournament API. After you’ve spent your time pulling every ounce of performance out of your model, this package steps in to help automate the submission process. Let’s walk through some of the functionality in the Readme.

Installation

For the latest stable release: install.packages("Rnumerai")
For the latest development release: devtools::install_github("Omni-Analytics-Group/Rnumerai")

Automatic submission using this package

1. Load the package.

library(Rnumerai)

2. Set the directory where the data will be downloaded and the submission files will be kept.

Use current working directory

data_dir <- getwd()

Or use temporary directory

data_dir <- tempdir()

Or use a user specific directory

data_dir <- "~/OAG/numerai"
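
A user-specific path may not exist yet on a fresh machine. As a minimal sketch in base R (the helper name ensure_data_dir is hypothetical and not part of Rnumerai), the directory could be created up front:

```r
# Hypothetical helper (not part of Rnumerai): expand "~", create the
# directory if it does not exist yet, and return the full path.
ensure_data_dir <- function(path) {
  path <- path.expand(path)
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE, showWarnings = FALSE)
  }
  normalizePath(path)
}

# Demo on a throwaway folder under the session's temporary directory
data_dir <- ensure_data_dir(file.path(tempdir(), "numerai_demo"))
```

The same call should work with a path such as "~/OAG/numerai".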

3. Set Public Key and Secret API key variables.

Get your public ID and API key by going to numer.ai and opening the Custom API Keys section under your Account tab. Select the appropriate scopes when generating the key, or select all scopes to use the full functionality of this package.

set_public_id("public_id_here")
set_api_key("api_key_here")

Optional: If we choose not to set up the credentials here, the terminal will interactively prompt us to type the values when we make an API call.
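
For unattended runs, one option is to keep the keys out of the script and read them from environment variables. This is only a sketch: the helper read_credential and the variable names NUMERAI_PUBLIC_ID and NUMERAI_API_KEY are assumptions for illustration, not something the package defines.

```r
# Hypothetical helper: read a required value from an environment variable
# and fail loudly if it is missing.
read_credential <- function(var_name) {
  value <- Sys.getenv(var_name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", var_name, " is not set", call. = FALSE)
  }
  value
}

# Demo with a placeholder value; in practice the variables would be set
# outside the script (for example in ~/.Renviron).
Sys.setenv(NUMERAI_PUBLIC_ID = "public_id_here")
public_id <- read_credential("NUMERAI_PUBLIC_ID")

# With Rnumerai loaded, the values could then be passed on:
# set_public_id(read_credential("NUMERAI_PUBLIC_ID"))
# set_api_key(read_credential("NUMERAI_API_KEY"))
```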

4. Download data set for the current round and split it into training data and tournament data

data <- download_data(data_dir)
data_train <- data$data_train
data_tournament <- data$data_tournament

5. Generate predictions

Users can put their own custom model code here to generate the predictions. For demonstration purposes, we will generate random predictions.

submission <- data.frame(id=data_tournament$id,prediction_kazutsugi = sample(seq(.35,.65,by=.1),nrow(data_tournament),replace=TRUE))
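
As one step beyond random numbers, here is a minimal sketch of a linear model fitted with base R. It runs on synthetic toy data so it is self-contained. The column naming (features prefixed with feature_, a target called target_kazutsugi) is an assumption about the data layout, so check it against the downloaded files before adapting this.

```r
set.seed(42)
n_train <- 200
n_tourn <- 50
levels5 <- seq(0, 1, by = 0.25)

# Synthetic stand-ins for data_train and data_tournament (assumed layout)
toy_train <- data.frame(
  id = paste0("train_", seq_len(n_train)),
  feature_a = sample(levels5, n_train, replace = TRUE),
  feature_b = sample(levels5, n_train, replace = TRUE)
)
toy_train$target_kazutsugi <- sample(levels5, n_train, replace = TRUE)

toy_tournament <- data.frame(
  id = paste0("tourn_", seq_len(n_tourn)),
  feature_a = sample(levels5, n_tourn, replace = TRUE),
  feature_b = sample(levels5, n_tourn, replace = TRUE)
)

# Build the formula from every column whose name starts with "feature_"
feature_cols <- grep("^feature_", names(toy_train), value = TRUE)
model_formula <- reformulate(feature_cols, response = "target_kazutsugi")
fit <- lm(model_formula, data = toy_train)

# Predict on the tournament rows and clip the result to [0, 1]
preds <- predict(fit, newdata = toy_tournament)
preds <- pmin(pmax(preds, 0), 1)

toy_submission <- data.frame(
  id = toy_tournament$id,
  prediction_kazutsugi = unname(preds)
)
```

The toy target is pure noise, so the fit itself is meaningless; the point is only the shape of the workflow from features to a two-column submission.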

6. Submit predictions for tournament and get submission id

The submission object should have two columns (id & prediction_kazutsugi) only.

submission_id <- submit_predictions(submission,data_dir,tournament="Kazutsugi")
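
Since the submission needs exactly those two columns, a quick local check before uploading can save a failed round trip. A minimal sketch: check_submission is a hypothetical helper, and the duplicate-id and NA checks are extra sanity checks of my own rather than documented API rules.

```r
# Hypothetical helper: return a character vector of problems found
# (empty if the data frame looks fine).
check_submission <- function(submission, prediction_col = "prediction_kazutsugi") {
  expected <- c("id", prediction_col)
  problems <- character(0)
  if (ncol(submission) != 2 || !setequal(names(submission), expected)) {
    problems <- c(problems,
                  paste("columns must be exactly:", paste(expected, collapse = ", ")))
  } else {
    preds <- submission[[prediction_col]]
    if (anyDuplicated(submission$id) > 0) {
      problems <- c(problems, "duplicate ids")
    }
    if (!is.numeric(preds) || anyNA(preds)) {
      problems <- c(problems, "predictions must be numeric without NA")
    }
  }
  problems
}

good <- data.frame(id = c("a", "b"), prediction_kazutsugi = c(0.45, 0.55))
bad_columns <- data.frame(id = c("a", "b"), prediction_kazutsugi = c(0.45, 0.55), extra = 1)
bad_values <- data.frame(id = c("a", "a"), prediction_kazutsugi = c(0.45, NA))

problems_good <- check_submission(good)
problems_columns <- check_submission(bad_columns)
problems_values <- check_submission(bad_values)
```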

7. Check the status of the submission (Wait for a few seconds to get the submission evaluated)

Sys.sleep(10) ## 10 Seconds wait period
status_submission_by_id(submission_id)
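
A fixed 10 second wait may not always be enough. As a sketch, the check could be retried a few times instead. poll_until is a hypothetical helper, and the demo uses a stand-in status function because the exact structure returned by status_submission_by_id() is not shown here.

```r
# Hypothetical helper: call check() up to max_tries times, waiting between
# attempts, and stop early once is_done(result) is TRUE.
poll_until <- function(check, is_done, max_tries = 6, wait_seconds = 10) {
  result <- NULL
  for (attempt in seq_len(max_tries)) {
    result <- check()
    if (isTRUE(is_done(result))) break
    if (attempt < max_tries) Sys.sleep(wait_seconds)
  }
  result
}

# Demo with a stand-in status function that reports "done" on the third call
calls <- 0
fake_status <- function() {
  calls <<- calls + 1
  if (calls >= 3) "done" else "pending"
}
final <- poll_until(fake_status, function(x) identical(x, "done"), wait_seconds = 0)
```

With the package loaded, check could be function() status_submission_by_id(submission_id), with is_done written against whatever fields the returned status actually contains.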

8. Stake NMR on the submission and get the transaction hash for it.

stake_tx_hash <- stake_nmr(value = 1)
stake_tx_hash

9. Release Stake and get transaction hash for it.

release_tx_hash <- release_nmr(value = 1)
release_tx_hash

Performance

In our newest release, we’ve added functionality to graphically assess the performance of a single model or collection of models across time. The current performance metrics supported are:

Reputation
Rank
NMR_Staked
Leaderboard_Bonus
Payout_NMR
Average_Daily_Correlation
Round_Correlation
MMC
Correlation_With_MM

These metrics are used within three main functions:

performance_distribution(username, metric, merge = FALSE, round_aggregate = TRUE)

performance_over_time(username, metric, merge = FALSE, outlier_cutoff = if (round_aggregate) 0 else 0.0125, round_aggregate = TRUE)

summary_statistics(username, dates = NULL, round_aggregate = TRUE)
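
Because the metric is passed as a string, a typo is easy to make. A small sketch that checks a name against the list above before calling any of these functions: validate_metric is a hypothetical helper, and the exact, case-sensitive match is my assumption about how the names are compared.

```r
# Metric names copied from the list in this post
supported_metrics <- c(
  "Reputation", "Rank", "NMR_Staked", "Leaderboard_Bonus", "Payout_NMR",
  "Average_Daily_Correlation", "Round_Correlation", "MMC", "Correlation_With_MM"
)

# Hypothetical helper: return the metric unchanged if it is in the list,
# otherwise stop with a message naming the valid options.
validate_metric <- function(metric) {
  if (length(metric) != 1 || !metric %in% supported_metrics) {
    stop("Unknown metric. Expected one of: ",
         paste(supported_metrics, collapse = ", "), call. = FALSE)
  }
  metric
}

checked <- validate_metric("MMC")
typo_rejected <- inherits(try(validate_metric("mmc"), silent = TRUE), "try-error")
```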

Here are some samples:

1. Display performance distributions

performance_distribution(c("objectscience"), "Average_Daily_Correlation")

performance_distribution(c("objectscience", "uuazed","arbitrage"), "MMC")

2. Display performance over time

performance_over_time(c("objectscience"), "Correlation_With_MM")

performance_over_time(c("objectscience", "uuazed","arbitrage"), "Correlation_With_MM")

3. Display performance summary statistics

summary_statistics(c("arbitrage", "objectscience","uuazed"))

4. Support for many models and accounts

performance_distribution(c("objectscience", "uuazed","arbitrage","cryptoquant","phorex","wander","madmin","wilfred","beepboopbeep","no_formal_agreement"), "Round_Correlation")

data.frame(summary_statistics(c("objectscience", "uuazed","arbitrage","cryptoquant","phorex","wander","madmin","wilfred","beepboopbeep","no_formal_agreement"))$stats)

Further Updates

As the performance metrics and functionality change, stay tuned for the latest updates! We’ve already included the recent changes in support of multiple models under the same account. If you have any questions or suggestions, please feel free to open a GitHub issue or post here in the forum.

kreator · May 15, 2020, 7:20am

Hi, thanks for developing and maintaining this project. For me as an R user it already saved me days or even weeks in the last years. I really appreciate it. Please keep up the good work!

javibear · May 15, 2020, 11:28pm

Thanks as well. Quick question: how does the submit_predictions() function change with the new multi-model functionality? When I enabled it, I was told to use:

submission_id <- submit_predictions(submission, data_dir, tournament="Kazutsugi",
model_id=models$integration_test)

Is model_id a new parameter?

ssh · May 16, 2020, 4:25pm

I haven’t used multi-model submissions yet, but the example provided also includes a step that initializes the variable “models” before it is used in submit_predictions():

models <- get_models()
submission_id <- submit_predictions(submission, data_dir, tournament="Kazutsugi", model_id=models$YOUR_MODEL_NAME)

If I understand the API correctly, you could also use:

submission_id <- submit_predictions(submission, data_dir, tournament="Kazutsugi", model_id="YOUR_MODEL_ID_FROM_WEBSITE")

where YOUR_MODEL_ID_FROM_WEBSITE is the model ID shown on the website (at https://numer.ai/models, under each of your model names).

kreator · May 17, 2020, 8:12pm

I tried version 2.1.1 of the package. There seems to be a bug in get_models and submit_predictions. I contacted the maintainers and they will have a look at it. Unfortunately, in my understanding the upload does not work for multi-model accounts at the moment.

ssh · May 17, 2020, 8:30pm

@kreator I also faced the same issue with multi-model submissions and reported it at https://community.numer.ai/channel/support (along with a possible solution). I believe @theomniacs will update the code on GitHub, as they are really supportive and quick to react. Or you could apply the fix locally (which I don’t like, as it means extra code to maintain).

ssh · May 17, 2020, 8:41pm

@kreator I also had trouble with get_models, as it didn’t work with my previous API_ID. I believe it needs an extra permission (like “get user info”), not only “submission”. So I created another API_ID with this extra permission on the site, but that is not really a necessary step, as you can use the model_id from the site without calling get_models().

ssh · May 17, 2020, 8:59pm

Another minor thing: get_models() returns a vector, so you can’t index it with “$”. A correct R code example would be:

models <- Rnumerai::get_models()
submission_id <- Rnumerai::submit_predictions(submission, data_dir,
tournament="Kazutsugi", model_id=models["YOUR_MODEL_NAME"])
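
To show the difference in plain R, here is a small sketch that uses a stand-in for the get_models() result. It assumes a named character vector (model name mapped to model id), and the ids are made up.

```r
# Stand-in for the value returned by get_models(), assuming a named
# character vector; the ids here are hypothetical.
models <- c(YOUR_MODEL_NAME = "hypothetical-id-1", other_model = "hypothetical-id-2")

# "$" is not defined for atomic vectors, so this raises an error
dollar_works <- !inherits(try(models$YOUR_MODEL_NAME, silent = TRUE), "try-error")

# Single brackets keep the name attached; double brackets return the bare id
by_single <- models["YOUR_MODEL_NAME"]
by_double <- models[["YOUR_MODEL_NAME"]]
```

If the extra name attribute ever gets in the way, the double-bracket form gives just the id string.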

javibear · May 18, 2020, 5:01am

I had issues with get_models() too, but figured out that you have to authenticate yourself first. Try adding this before get_models():

set_public_id(your_id)
set_api_key(your_key)

It worked fine after this and simply produces a df of your account names and model_ids.

kreator · May 18, 2020, 9:49am

Thank you @ssh, I posted an issue on GitHub too. And right, I agree, they are very supportive. It would be great if they could resolve the issue quickly, especially with your input. Thank you!
PS: Thank you for the hint on get_models too! And right, it is not strictly needed, as one can just hard-code the model ids.

kreator · May 18, 2020, 9:55am

@javibear and @ssh the hint was correct. You have to produce new API keys, then get_models works. But it did not work with API keys that were generated before the change to the multi-model support.

kreator · May 18, 2020, 12:12pm

The bug was corrected today. After re-installation from github the submission for multi-model accounts worked for me.
