Rnumerai - R Interface to the Numerai Machine Learning Tournament API
Hi everyone! This post is to introduce Rnumerai, an R Interface for the Numerai Machine Learning Tournament API. After you’ve spent your time pulling every ounce of performance out of your model, this package steps in to help automate the submission process. Let’s walk through some of the functionality in the Readme.
Installation
- For the latest stable release:
install.packages("Rnumerai")
- For the latest development release:
devtools::install_github("Omni-Analytics-Group/Rnumerai")
Automatic submission using this package
1. Load the package.
library(Rnumerai)
2. Set working directory where data will be downloaded and submission files would be kept.
Use current working directory
data_dir <- getwd()
Or use temporary directory
data_dir <- tempdir()
Or use a user specific directory
data_dir <- "~/OAG/numerai"
3. Set Public Key and Secret API key variables.
Get your public key and api key by going to numer.ai and then going to Custom API Keys
section under your Account
Tab. Select appropriate scopes to generate the key or select all scopes to use the full functionality of this package.
set_public_id("public_id_here")
set_api_key("api_key_here")
Optional: If we choose not to setup the credentials here the terminal will interactively prompt us to type the values when we make an API call.
4. Download data set for the current round and split it into training data and tournament data
data <- download_data(data_dir)
data_train <- data$data_train
data_tournament <- data$data_tournament
5. Generate predictions
A user can put his/her own custom model code to generate the predictions here. For demonstration purposes, we will generate random predictions.
submission <- data.frame(id=data_tournament$id,prediction_kazutsugi = sample(seq(.35,.65,by=.1),nrow(data_tournament),replace=TRUE))
6. Submit predictions for tournament and get submission id
The submission object should have two columns (id & prediction_kazutsugi) only.
submission_id <- submit_predictions(submission,data_dir,tournament="Kazutsugi")
7. Check the status of the submission (Wait for a few seconds to get the submission evaluated)
Sys.sleep(10) ## 10 Seconds wait period
status_submission_by_id(submission_id)
8. Stake submission on submission and get transaction hash for it.
stake_tx_hash <- stake_nmr(value = 1)
stake_tx_hash
9. Release Stake and get transaction hash for it.
release_tx_hash <- release_nmr(value = 1)
release_tx_hash
Performance
In our newest release, we’ve added functionality to graphically assess the performance of a single model or collection of models across time. The current performance metrics supported are:
Reputation
Rank
NMR_Staked
Leaderboard_Bonus
Payout_NMR
Average_Daily_Correlation
Round_Correlation
MMC
Correlation_With_MM
These metrics are used within three main functions:
performance_distribution(username, metric, merge = FALSE, round_aggregate = TRUE)
performance_over_time(username, metric, merge = FALSE, outlier_cutoff = if (round_aggregate) 0 else 0.0125, round_aggregate = TRUE)
summary_statistics(username, dates = NULL, round_aggregate = TRUE)
Here are some samples:
1. Display performance distributions
performance_distribution(c("objectscience"), "Average_Daily_Correlation")
performance_distribution(c("objectscience", "uuazed","arbitrage"), "MMC")
2. Display performance over time
performance_over_time(c(“objectscience”), “Correlation_With_MM”)
performance_over_time(c(“objectscience”, “uuazed”,“arbitrage”), “Correlation_With_MM”)
3. Display performance summary statistics
summary_statistics(c("arbitrage", "objectscience","uuazed"))
4. Support for many models and accounts
performance_distribution(c("objectscience", "uuazed","arbitrage","cryptoquant","phorex","wander","madmin","wilfred","beepboopbeep","no_formal_agreement"), "Round_Correlation")
data.frame(summary_statistics(c("objectscience", "uuazed","arbitrage","cryptoquant","phorex","wander","madmin","wilfred","beepboopbeep","no_formal_agreement"))$stats)
Further Updates
As the performance metrics and functionality change, stay tuned for the latest updates! We’ve already included the recent changes in support of multiple models under the same account. If you have any questions or suggestions, please feel free to open a Github issue or here in the forum.