Signal Miner: Find Unique Alpha & Beat the Benchmark

Revolutionizing Staking: Aligning users and the fund through unique models.

:snake: What is Signal Miner?

Signal Miner is a fully automated model mining framework designed to generate models that outperform Numerai’s benchmark models in terms of correlation and Sharpe ratio. Instead of staking on pre-existing models, this tool helps you discover your own unique alpha, which has a better chance of producing positive MMC (Meta Model Contribution).

:bulb: Why use Signal Miner?

  • Unique Alpha: Avoids the trap of staking on common, overused models.
  • Better Payouts: Unique signals increase your expected returns compared to generic staking.
  • Automated Discovery: Efficiently scans a search space for high-performance models using a scalable, asynchronous approach.

:inbox_tray: Quick Start: Install & Run

Clone the repo and set up your environment. Instructions are available in the GitHub project.


:fire: How It Works

:bulb: The core workflow:

  1. Define a Benchmark Model: This is what your models will aim to outperform.
  2. Launch Model Mining: Explore a grid of hyperparameters asynchronously.
  3. Monitor Performance: Track model evaluations across cross-validation folds.
  4. Compare to the Benchmark: Identify models that exceed performance thresholds.
  5. Export Winning Models: Save the best models for staking or further tuning.

:trophy: Defining a Benchmark Model

benchmark_cfg = {
    "colsample_bytree": 0.1,
    "max_bin": 5,
    "max_depth": 5,
    "num_leaves": 15,
    "min_child_samples": 20,
    "n_estimators": 2000,
    "reg_lambda": 0.0,
    "learning_rate": 0.01,
    "target": 'target'  # Using the first target for simplicity
}
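
All keys above except `target` are standard LightGBM hyperparameters; `target` just names the column to predict. As a rough illustration of how such a config could be turned into a trained benchmark (a minimal sketch, not the project's actual training code; `train_df` and `feature_cols` are hypothetical):

```python
import lightgbm as lgb

def train_benchmark(cfg, train_df, feature_cols):
    """Fit a LightGBM regressor using a config shaped like benchmark_cfg.

    The non-LightGBM key "target" selects the label column; everything
    else is passed straight through to LGBMRegressor.
    """
    params = {k: v for k, v in cfg.items() if k != "target"}
    model = lgb.LGBMRegressor(**params)
    model.fit(train_df[feature_cols], train_df[cfg["target"]])
    return model
```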

:rocket: Launch Mining

start_mining()

Once mining is started, models will be trained and evaluated in the background.

Check Progress Anytime:

check_progress()

Progress: 122.0/2002 (6.09%)
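
The actual mining loop lives in the project's notebook; conceptually, the background-training-plus-progress pattern can be reproduced with a plain `concurrent.futures` pool (a simplified sketch with a placeholder `evaluate_model` and a tiny hypothetical grid, not the project's implementation):

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

def evaluate_model(cfg):
    """Placeholder: train a model with cfg and return its cross-validated metrics."""
    return {"cfg": cfg, "corr": 0.0, "sharpe": 0.0}

# Tiny hypothetical grid; the real search space is far larger.
grid = [{"num_leaves": nl, "learning_rate": lr}
        for nl, lr in itertools.product((15, 31, 63), (0.01, 0.05))]

# Submit every configuration; workers train in the background.
executor = ThreadPoolExecutor(max_workers=4)
futures = [executor.submit(evaluate_model, cfg) for cfg in grid]

def report_progress():
    """Print a progress line in the same spirit as check_progress()."""
    done = sum(f.done() for f in futures)
    print(f"Progress: {done}/{len(futures)} ({100 * done / len(futures):.2f}%)")
```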

:bar_chart: Visualizing Cross-Validation Splits

To ensure proper evaluation, the framework implements time-series cross-validation with an embargo period:

Here, training and test sets are sequentially split to mimic live trading conditions—a crucial step for avoiding data leakage.
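
The embargo matters because Numerai's targets overlap in time: after each training window, a block of eras is skipped before the test window starts so the two cannot share information. A minimal sketch of such a splitter (era counts and the embargo length are illustrative, not the project's defaults):

```python
def embargoed_time_series_splits(n_eras, n_folds=4, embargo=4):
    """Yield (train_eras, test_eras) index lists with an embargo gap.

    Eras are split sequentially: each fold trains on everything up to a
    cut-off, skips `embargo` eras, then tests on the next block, so
    overlapping targets cannot leak from train into test.
    """
    fold_size = n_eras // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        test_start = train_end + embargo
        test_end = min(test_start + fold_size, n_eras)
        yield list(range(train_end)), list(range(test_start, test_end))

# Example: 60 eras, 4 folds, a 4-era embargo between train and test
for train, test in embargoed_time_series_splits(60):
    print(f"train eras 0-{train[-1]}, test eras {test[0]}-{test[-1]}")
```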


:chart_with_upwards_trend: Mining Results: Past vs. Future Performance

Since yesterday, I’ve been running Signal Miner to evaluate 70+ models out of 1000, and we already see many models outperforming the benchmark on both validation and test datasets. :rocket:

Below is a scatter plot showing how models that performed well in validation (past) also tended to do well in test (future).

:bar_chart: Sharpe Ratio: Validation vs. Test

:mag_right: Key Insights:

  • The red dot represents the benchmark model.
  • While the top validation model wasn’t the best in test, we found several models that outperformed the benchmark in both.
  • Positive Correlation: The best validation models tended to be among the best in test as well.
  • If the scatter plot looked random (a cloud of points), it would suggest the model selection process is noise—but instead, we see a clear upward trend.
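
For anyone reproducing this kind of plot, here is a minimal matplotlib sketch (the metric values below are made up; in practice they come from the mining results):

```python
import matplotlib.pyplot as plt

# Hypothetical per-model Sharpe ratios from the mining run.
val_sharpe  = [0.8, 1.1, 0.9, 1.3, 0.7]
test_sharpe = [0.6, 0.9, 0.8, 1.0, 0.5]
benchmark   = (1.0, 0.4)  # (validation, test) Sharpe of the benchmark model

plt.scatter(val_sharpe, test_sharpe, label="mined models")
plt.scatter(*benchmark, color="red", label="benchmark")  # the red dot
plt.xlabel("Validation Sharpe")
plt.ylabel("Test Sharpe")
plt.legend()
plt.show()
```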

:loudspeaker: Goal: Find a model that beats the benchmark in both correlation & Sharpe ratio. Still mining! :pick: :snake:


:chart_with_upwards_trend: Scaling Behavior

This entire process can be viewed as a function of the number of trees in the search space.
For this experiment, I set n_estimators=2000—but early results suggest that increasing this value improves overall performance.

This hints at a scaling law, an idea that has come up in community discussions before.


:handshake: Join the Experiment!

This is an open-source project, and everyone is welcome to:
:heavy_check_mark: Run their own mining experiments
:heavy_check_mark: Contribute improvements (PRs welcome!)
:heavy_check_mark: Share results & insights

:rocket: Ready to try? Head over to Signal Miner on GitHub and start mining unique alpha today!

:snake: :pick: Let’s Make Staking Great Again! :rocket:

Consider me thoroughly impressed (though still a bit skeptical—hopefully I’m wrong as is often the case). I’ll definitely give it a try. Thanks for sharing and for the excellent write-up, readme, and model miner notebook!

Thank you @joakim !

:snake: Alright party people, day 2 of mining: I have now processed a total of 112 models (not that many!), and now I have a model that objectively beats the benchmark on both corr and Sharpe.


Also, an interesting thing has emerged on this plot.

The benchmark model has arguably the largest generalization error of any model in my field of random models. In other words, for some reason, this model performed very well on the validation set and considerably worse on the test set. Its generalization error is worse than that of a randomly selected model. Why?

One way to understand it is to say the benchmark model is overfit to the validation set: its high validation Sharpe corresponds to a lower test Sharpe than that of any of the randomized models so far. You would have to be very unlucky to have picked that model. :wink: :snake:
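
Concretely, the generalization error I am talking about is just the validation-to-test drop in Sharpe; with made-up numbers (these are not the benchmark's actual scores):

```python
# Hypothetical scores to illustrate the generalization-error idea.
val_sharpe, test_sharpe = 1.2, 0.4
generalization_error = val_sharpe - test_sharpe  # 0.8: a large validation-to-test drop
print(generalization_error)
```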

Very nice work, thanks for sharing

I have to say very nice work indeed and props for providing the code so swiftly! :slight_smile:
I would still be interested in some comparison of “discovered” model predictions to benchmark model predictions (for uniqueness). This could be a simple correlation of the two or an MMC calculation. Maybe someone else has an even better idea? My hunch is that the new models’ performance is still highly correlated with the benchmark model, and the new model is just better at exploiting the same patterns.
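
A quick first pass at this check could look like the following (hypothetical prediction files; not something Signal Miner ships today):

```python
import pandas as pd

# Hypothetical prediction files for a mined model and the benchmark,
# both indexed by the same prediction ids (paths are placeholders).
mined = pd.read_csv("mined_predictions.csv", index_col="id")["prediction"]
bench = pd.read_csv("benchmark_predictions.csv", index_col="id")["prediction"]

# Rank correlation of the two prediction vectors; a value near 1.0 would
# support the hunch that the new model exploits the same patterns.
print(mined.corr(bench, method="spearman"))
```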

Yes, it certainly seems that additional comparison metrics are a requested feature. It is straightforward to plug in any metric you like. Thanks for the support! The code for this actually grew out of a project I did for my doctoral work. It went into a small part of one chapter of my thesis, but I thought the conclusion was profound. I applied the logic to Numerai's data, and it helped me start seeing the problem in a new light.

Unfortunately, what happened in a previous project was that the validation vs. test scatter plot was like a round ball, zero correlation, and indeed OOS live performance was very spotty and random. Of course, I didn't produce this scatter plot until the end of the project.

What is awesome about Numerai’s data set is that we can usually get a nice positive correlation here, which we see. Of course it depends on the model and what you’re doing with feature selection, etc.

Here is a snapshot of the best model so far…