Benchmark Models

master_key · October 28, 2023, 9:04pm

Numerai develops new datasets and new targets to help our data science community build better models. Numerai builds models on each new target and data release. Today, we will begin giving out the predictions for all of these models, and details about how they are created.

Why?

New User Acceleration

Numerai has a steep learning curve. After you make it through the tutorial notebooks, you are left with several datasets, many targets, and many modeling options. There are an unlimited number of experiments you’ll want to run as you begin your journey to the top of the leaderboard. With benchmark models, you can immediately see how well different combinations of data and targets do. I think you’ll find that exploring these models and their predictions and subsequent performance will inspire even more ideas for new models you can build yourself.

Better Stake Allocation

If you’re a returning user and you’re a few updates behind, you can see at a glance if your model is still competitive, or if you’d be better off staking on one of the newer benchmark models until you have time to catch back up.

A Meta Model of Meta Models

Some users may not have the resources to train large, competitive, cutting-edge models themselves. However, just by downloading the targets, the Meta Model predictions, and the Benchmark Model predictions, you may still be able to recognize that the Meta Model underweights some types of models, find that certain targets ensemble especially well together, or form a strong belief that one target will outperform in the future. You can explore all of these possibilities yourself, and even submit and stake on these ensembles, with minimal resource requirements.

Where?

Go to numer.ai/~benchmark_models to see a list of models and their recent performance.
Go to the docs to see more details about how they are made.

The validation and live predictions are available through the API.

pip install numerapi

from numerapi import NumerAPI
napi = NumerAPI()
napi.download_dataset("v4.2/validation_benchmark_models.parquet", "validation_benchmark_models.parquet")
napi.download_dataset("v4.2/live_benchmark_models.parquet", "live_benchmark_models.parquet")
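
Once the files are downloaded, one way to start exploring is to check how similar the benchmark models are to each other, era by era. The following is only a minimal sketch: it assumes the file has an "era" column plus one column per model, and the function name mean_era_correlation, the column names model_a/model_b/model_c, and the synthetic data are all made up for illustration, not taken from the actual files.

```python
import numpy as np
import pandas as pd


def mean_era_correlation(preds: pd.DataFrame, model_cols, era_col: str = "era") -> pd.DataFrame:
    """Average over eras of the pairwise rank correlation between model columns."""
    per_era = [
        # Pearson correlation of ranks is the Spearman correlation.
        era_df[model_cols].rank().corr()
        for _, era_df in preds.groupby(era_col)
    ]
    # Element-wise mean of the per-era correlation matrices.
    return sum(per_era) / len(per_era)


# Synthetic stand-in for the real predictions file (hypothetical column names).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "era": np.repeat(["0001", "0002"], 50),
    "model_a": rng.normal(size=100),
})
demo["model_b"] = demo["model_a"] + rng.normal(scale=0.5, size=100)  # similar to model_a
demo["model_c"] = rng.normal(size=100)                               # unrelated to model_a

corr = mean_era_correlation(demo, ["model_a", "model_b", "model_c"])
print(corr.round(2))
```

With the real data you would presumably replace demo with pd.read_parquet("validation_benchmark_models.parquet") and pass whichever prediction columns the file actually contains. Pairs with low correlation are the natural candidates to try ensembling.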

There is now a dotted line on your account page’s score charts to directly compare yourself with the benchmark models account.

Happy Modeling

numerologist · October 29, 2023, 3:44am

Thanks for the hard work, @master_key and Numerai team.

Would it be possible to add a toggle switch (next to “Cumulative” in “…”) to compare user/model performance to MetaModel instead of example models? As a not-very-new user, I’d be interested to see how I perform versus the competition and whether I underperform/contribute to the fund.

degerhan · October 29, 2023, 7:53am

Thank you @master_key, very helpful. Is v42_example_preds a rename of the former 20k tree lg_lgbm_v42_cyrus20? The docs say v42_example_preds is a standard model (I assume that means 2k trees); I'm asking to understand benchmark continuity.

master_key · October 29, 2023, 4:33pm

Yeah, that’s correct, it’s a rename. And all of the benchmark models (still) have 20,000 trees.

taori · October 30, 2023, 9:17am

This is a bold move, I like it.

zoliveres · November 2, 2023, 8:09am

@master_key Can you specify what the rank_keep_ties_keep_na function does in the rank_gauss_pow1 function? It would be better to put it in the documentation.

I found a similar function in the numerai-tools repo, is this what you are using?

github.com

numerai/numerai-tools/blob/1c666a480c988578ca63304d7ed6b358c53c9f5a/numerai_tools/scoring.py#L54

    return df.apply(
        lambda series: (series.rank(method=method).values - 0.5) / series.count()
    )


def tie_broken_rank(df: pd.DataFrame) -> pd.DataFrame:
    # rank columns, breaking ties by index
    return rank(df, "first")


def tie_kept_rank(df: pd.DataFrame) -> pd.DataFrame:
    # rank columns, but keep ties
    return rank(df, "average")


def min_max_normalize(s: pd.Series) -> pd.Series:
    # scale a series to be between 0 and 1
    return (s - s.min()) / (s.max() - s.min())


def validate_indices(live_targets: pd.Series, predictions: pd.Series) -> None:

danzell · November 12, 2023, 8:47am

Would be nice to have a functioning example

danzell · November 17, 2023, 9:00am

I’m still confused. What exactly do you mean by:

Ensembles

All of the ensembles use the following steps:

gaussianize each of the predictions on a per-era basis

standardize to standard deviation 1

dot-product the predictions with a weights vector representing the desired weight on each model

gaussianize the resulting predictions vector, and neutralize if there are any features to neutralize to

It would be super helpful if you could provide an example and share underlying code pls.

tessier_ashpool · November 17, 2023, 1:39pm

But they did provide the code.

    def gauss_pred(self, X: pd.DataFrame, ensemble_cols, weight_vector):
        for col in X[ensemble_cols]:
            if "era" in X.columns:
                X[col] = X.groupby("era", group_keys=False)[col].transform(
                    lambda s1: rank_gauss_pow1(s1)
                )
            else:
                # check X contains only a single era
                assert 1800 < X.shape[0] < 6000
                X[col] = rank_gauss_pow1(X[col])
        return X[ensemble_cols].dot(weight_vector)

As for the rank_keep_ties_keep_na method, I imagine it is something like this.

To keep ties, use method="average", and instead of len(s.dropna()) just use len(s) or s.count() (note that these two differ when s contains NaNs, since s.count() counts only non-NaN values), so in essence it looks something like this:

import pandas as pd
import scipy.stats


def rank_gauss_pow1(s: pd.Series) -> pd.Series:
    # do rank-normalize

    # s_rank = rank_keep_ties_keep_na(s)
    # s_rank = (s.rank(method="average") - 0.5) / len(s.dropna())
    s_rank = (s.rank(method="average") - 0.5) / s.count()
    
    # gaussianize
    s_rank_norm = pd.Series(scipy.stats.norm.ppf(s_rank), index=s_rank.index)

    # Standardize to 1 std
    result_series = s_rank_norm / s_rank_norm.std()

    return result_series
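
To tie this back to the four ensemble steps quoted earlier in the thread, here is a rough end-to-end sketch on synthetic data. It is my reading of the description, not Numerai's actual code: the helper names gaussianize and ensemble, the column names, and the weights are hypothetical, applying the final gaussianize per era is an assumption, and the optional neutralization step is left out entirely.

```python
import numpy as np
import pandas as pd
import scipy.stats


def gaussianize(s: pd.Series) -> pd.Series:
    # Rank (ties share the average rank), map ranks into (0, 1),
    # then map those to standard normal quantiles.
    pct = (s.rank(method="average") - 0.5) / s.count()
    g = pd.Series(scipy.stats.norm.ppf(pct), index=s.index)
    # Standardize to standard deviation 1.
    return g / g.std()


def ensemble(preds: pd.DataFrame, model_cols, weights, era_col: str = "era") -> pd.Series:
    by_era = preds.groupby(era_col, group_keys=False)
    # Steps 1-2: gaussianize and standardize each model's predictions per era.
    normed = pd.DataFrame({c: by_era[c].transform(gaussianize) for c in model_cols})
    # Step 3: dot-product with the weights vector (one weight per model).
    combined = normed[model_cols].dot(np.asarray(weights, dtype=float))
    # Step 4: gaussianize the combined vector (per era here, by assumption).
    return combined.groupby(preds[era_col], group_keys=False).transform(gaussianize)


# Synthetic predictions for two eras and three hypothetical models.
rng = np.random.default_rng(1)
demo_preds = pd.DataFrame({
    "era": np.repeat(["0001", "0002"], 50),
    "model_a": rng.uniform(size=100),
    "model_b": rng.uniform(size=100),
    "model_c": rng.uniform(size=100),
})
blended = ensemble(demo_preds, ["model_a", "model_b", "model_c"], [0.5, 0.3, 0.2])
print(blended.groupby(demo_preds["era"]).agg(["mean", "std"]))
```

Since scoring is rank-based, the last gaussianize step mostly matters if you neutralize afterwards or feed the result into another ensemble.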

nasdaqjockey · March 10, 2024, 1:23pm

It would be good if this was updated for MMC now.
