This topic might have been discussed before, but as one of the Signals participants, I would like to share how I think Signals can be improved.
Currently most of the top submissions seem to be based on my Signals starter:
[NumeraiSignals] Starter for Beginners
This is not great, given that the original intention behind Signals was to reward participants who bring unique data. The starter above uses incomplete yfinance data, which is clearly not unique (I am afraid that the volatility in Signals performance stems from this incompleteness of the yfinance data).
This situation is understandable, given that Signals rewards participants who cover most of the ticker universe. It is very hard to get data that is both consistent and unique for the entire ticker universe.
How can we improve Signals then?
I think Signals should have an evaluation metric limited to a particular country or industry.
For example, I am Japanese, and I have access to rich data on the Japanese stock market. As this new Kaggle competition begins, JPX (Japan Exchange Group) is very willing to provide rich data to users via API.
But I cannot get anything similar for the US or Korea.
So even though my data about Japan might be rich, I would not be able to use it for Signals. How unfortunate.
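To make the idea concrete, here is a minimal sketch of what a country-limited metric could look like. It is only an illustration: it uses a plain Spearman rank correlation per country, which is not Numerai's actual scoring, and the row layout `(ticker, country, signal, target)`, the function names, and the tickers are all made up for this example.

```python
from collections import defaultdict


def _ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def _pearson(x, y):
    """Pearson correlation; NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def score_by_country(rows):
    """Spearman correlation of signal vs. target, computed per country.

    rows: iterable of (ticker, country, signal, target) tuples.
    This layout is hypothetical, not a Numerai file format.
    """
    groups = defaultdict(list)
    for _ticker, country, signal, target in rows:
        groups[country].append((signal, target))
    scores = {}
    for country, pairs in groups.items():
        signals, targets = zip(*pairs)
        scores[country] = _pearson(_ranks(signals), _ranks(targets))
    return scores


# Made-up rows: the JP signals are ordered like the targets, the US ones are reversed.
example_rows = [
    ("AAA", "JP", 0.1, 0.2), ("BBB", "JP", 0.5, 0.5), ("CCC", "JP", 0.9, 0.8),
    ("DDD", "US", 0.9, 0.1), ("EEE", "US", 0.5, 0.5), ("FFF", "US", 0.1, 0.9),
]
print(score_by_country(example_rows))
```

With a metric like this, someone with strong Japanese data could be scored on the JP group alone, without being penalized for tickers they cannot cover.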
The Numerai community is now very international, so Signals could leverage that by collecting unique submissions for a particular country or industry. Numerai's job would then be to ensemble them nicely to cover the whole universe.
This approach has the potential to improve the meta model much further, since it would accept all the niche but rich, unique data from Signals participants.
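As a rough sketch of the ensembling step, assuming each regional submission is just a mapping from ticker to a signal between 0 and 1: average the signals wherever submissions overlap, and fall back to a neutral value for tickers nobody covers. The function name, the tickers, and the choice of 0.5 as the neutral fill are my assumptions for illustration, not how Numerai builds the meta model.

```python
def ensemble_regional(submissions, universe, neutral=0.5):
    """Combine partial submissions into one signal per ticker.

    submissions: list of dicts mapping ticker -> signal in (0, 1).
    universe: iterable of every ticker that needs a value.
    neutral: fill value for tickers no submission covers (an assumption).
    """
    combined = {}
    for ticker in universe:
        values = [sub[ticker] for sub in submissions if ticker in sub]
        combined[ticker] = sum(values) / len(values) if values else neutral
    return combined


# Hypothetical partial submissions, each covering only part of the universe.
japan_a = {"JP1": 0.8, "JP2": 0.2}
japan_b = {"JP1": 0.6}
us_only = {"US1": 0.6}

full = ensemble_regional(
    [japan_a, japan_b, us_only],
    universe=["JP1", "JP2", "US1", "KR1"],
)
print(full)
```

A simple average is only the most basic choice; weighting each submission by its past regional performance would be a natural next step.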
Thanks