If they let everyone with a login download useful data about hedge fund investing, what stops people from taking that data and investing on their own? In that case, Numer.ai has provided a service (the procurement and preparation of data into a convenient and potentially profitable format) at the risk of not being compensated with predictions in return.
The payouts for winning are pretty good, but it is a competition and there is always some degree of luck. The top models each week are all very strong, but there can only be one winner, and which of them wins is generally decided by random chance.
Taking “market ready” data, building a strong model, and investing with it privately removes a lot of the random chance that comes with the competitive payout system. In game theory terms, no rational data scientist would compete on Numer.ai; they would all take the data and invest with it themselves. Numer.ai is in the hedge fund business, not the giving-away-free-data business.
Moreover, Numer.ai appears to be leveraging the blind inputs they give us by saying it takes the human bias out of investing. This is true. A lot of hedge funds are still run by “feel” or some sort of “qualntitative” mix of qualitative and quantitative. In that sense, Numer.ai itself is trying to have a high originality score vs. other hedge funds. This strategy/marketing may attract investors (on the other side of the coin from the data scientists) looking for alternative investment strategies.
I haven’t been able to find a lot of information about that side of the coin and what they are really doing with the predictions, and since I do not currently have the net worth to invest in a hedge fund, I really do not care. It could be anything from a true AI trading robot to a total scam that doesn’t even manage a fund. No clue really. I think the right mindset is to focus on your piece of the puzzle and work with what is given. If it is bothering you too much, you can open up R or Python or the tool of your choice and mess around with some market data from another source until you get something that looks like the Numer.ai data. I have done this myself (as I assume almost everyone else on here with a grain of curiosity has), and I can tell you it isn’t all that difficult to figure out roughly what the data represents, if only in an abstract sense.
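For anyone who wants to try that kind of experiment, here is a minimal Python sketch. It is not a description of how Numer.ai prepares its data, which is not public as far as I can tell. Everything in it is an assumption made for illustration: the prices are simulated rather than real, the 5-day trailing return is an arbitrary feature, and the helper names `simulate_prices`, `trailing_return` and `rank_scale` are made up. The only idea it shows is that rank-scaling a feature into the interval (0, 1) hides its original units while keeping its ordering.

```python
import random


def simulate_prices(n_days, seed=0):
    """Hypothetical stand-in for real market data: a simple random walk."""
    rng = random.Random(seed)
    price = 100.0
    prices = [price]
    for _ in range(n_days - 1):
        # Assumed daily return: mean 0, standard deviation 1%.
        price *= 1.0 + rng.gauss(0.0, 0.01)
        prices.append(price)
    return prices


def trailing_return(prices, window):
    """Return over the previous `window` days, one value per eligible day."""
    return [prices[i] / prices[i - window] - 1.0
            for i in range(window, len(prices))]


def rank_scale(values):
    """Replace each value by its rank, scaled into the open interval (0, 1).

    The ordering is preserved, but the original units are no longer visible.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    scaled = [0.0] * len(values)
    for rank, idx in enumerate(order):
        scaled[idx] = (rank + 0.5) / len(values)
    return scaled


prices = simulate_prices(250)
feature = rank_scale(trailing_return(prices, 5))
```

Swapping `simulate_prices` for prices from a real source and comparing the distribution of `feature` with a column of the Numer.ai data is one way to start guessing what a column might represent, though a similar-looking distribution proves nothing on its own.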