Brute-force trading



I apologize for the probably naive question, but I’ve just finished reading the tournament rules, and I’m curious about the gap between submitter incentives and Numerai’s incentives, and the possibilities for semi-brute-force participation.

To recap my understanding, the tournament requires that each participant submit a set of predictions for n rows of data (where n is approximately 45k at the moment and where each prediction is a probability value in the range [0, 1]). If the predictions are “good” (i.e. they have a low log-loss at the end of the month), Numerai pays small amounts of money to the participant. If the predictions are “bad”, there is no payout.
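(For concreteness, here is a minimal sketch of the scoring as I understand it: each row gets a probability, and a submission is “good” if its mean log-loss beats the ~0.693 you would get by predicting 0.5 everywhere. The variable names and the exact payout rule are my own illustration, not Numerai’s code or schema.)

```python
# Minimal sketch of the scoring as I understand it; nothing here is
# Numerai-specific code, and the variable names are my own invention.
import numpy as np

def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary log-loss; predictions are clipped away from exact 0 and 1."""
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    y_true = np.asarray(y_true, dtype=float)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

n = 45_000
targets = np.random.randint(0, 2, size=n)   # stand-in for the resolved outcomes
benchmark = np.full(n, 0.5)                  # the "know nothing" submission
# ln(2) ~= 0.693; as I understand it, a "good" submission must beat this.
print(log_loss(targets, benchmark))
```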

In theory, at some point Numerai starts trading real money on the predictions (or the cumulative predictions) of users who have shown themselves to be “good” at predicting.

Now, an obvious problem already is how Numerai determines which participants they should trust for making real trades; an evaluation period (i.e. a long period of running the tournament but not trading real money) is (slightly) expensive, in direct proportion to its duration, because Numerai must pay winners but does not itself trade based on those strategies.

But a slightly less obvious problem is that submitters have no disincentive to spam and try to get a payout in the first month, even if they are unable to repeat their successful predictions later on (and even if Numerai never trusts them or trades on their predictions).

Fortunately for Numerai, true brute-forcing is infeasible; even if one simply assigned a probability of either 0 or 1 to each row, there are 2^n (i.e. 2^45000) possible inputs, so autogenerating a new account and submitting a new prediction for each possible outcome is truly infeasible. (Some of these inputs will not be valid because they will not have “concordance”, though how concordance is actually evaluated or defined, or how many of the possible inputs it would reject, is not very clear to me.)
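(A quick back-of-the-envelope check of that 2^n figure, just for scale; nothing Numerai-specific here:)

```python
# Back-of-the-envelope check on the combinatorial argument: even the crude
# "assign each row a hard 0 or 1" strategy has 2^45000 possible submissions.
n = 45_000
patterns = 2 ** n
print(len(str(patterns)))   # 13547 decimal digits
# For comparison, the number of atoms in the observable universe is usually
# estimated at around 10^80 (an 81-digit number), so enumerating one account
# per pattern is hopeless.
```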

But clever submitters can (and probably do) generate hundreds or thousands of predictions by varying assumptions or weights in their models and submitting them separately to Numerai. This behavior is incentivized by the payout scheme.
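(To illustrate what I mean, here is a hypothetical sketch of that kind of “model spamming”, using scikit-learn and made-up data; nothing below reflects Numerai’s actual dataset or upload mechanism:)

```python
# Hypothetical illustration of "model spamming": jitter one hyperparameter to
# produce dozens of near-duplicate models, then submit each from a different
# account. Features and targets below are random stand-ins, not Numerai data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(45_000, 21))        # stand-in features
y = rng.integers(0, 2, size=45_000)      # stand-in binary targets

submissions = []
for c in np.logspace(-3, 1, num=25):     # 25 slightly different regularization strengths
    model = LogisticRegression(C=c, max_iter=200).fit(X, y)
    submissions.append(model.predict_proba(X)[:, 1])

# Each entry in `submissions` would then be uploaded under a separate account,
# in the hope that at least one of them earns a payout this month.
```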

Unfortunately, this behavior compounds the “which participant do I trust?” problem for Numerai, because it creates a larger pool of participants. Numerai is probably trying to counter this through the reputation bonus: it encourages even spammy participants to be consistent about submitting predictions from the same models under the same accounts, and so (similarly to how they must decide whom to trust when actually trading real money) they pay out extra to users who have shown long-term predictive consistency and accuracy.

So, my question: Is this a problem? One could argue that this is simply working as intended, and that Numerai is happy to eat the (very small, in real terms) loss of paying spammy models small bonuses for one or two months of success in exchange for generating very real predictions from long-term consistent users. But the misalignment of incentives still seems somewhat worrisome; submitters have an incentive to submit as much as possible, which is zero-sum from Numerai’s perspective: spammy predictions from young accounts generate payouts but no earnings for the fund. Or am I missing something?


On the rules page, it states:

“You may create up to three accounts to try new models; we limit this to three to prevent spamming the leaderboard. Spammers may be banned or penalized.”

Obviously, a clever and deviant user could circumvent Numerai’s fraud-detection methods by using multiple email addresses/machines/browsers/whatever. But it’s in Numerai’s best interest to carefully screen for users violating that rule. Sure, some will probably slip through the cracks, but it seems to me like they are aware of this problem and are actively trying to move towards a system without this potential issue.


You and Antman are completely right. We do our best to catch spam, both with tournament structure and just banning spammy accounts, but obviously rules can be circumvented and metrics can be gamed. Ultimately we are going to solve this with NMR and better incentive alignment.


I think Numeraire, plus the reputation bonus, plus the analysis we can do on the test set will be enough to solve this problem. Over time we can improve concordance and originality to catch models that appear to be spam rather than real intelligence.


@richard brings up a great point: the latest restructuring of the tournament format allows for a higher degree of integrity, both from a data scientist’s point of view and from Numerai’s.

Regarding concordance, as per @geoffrey’s description:
“Concordance - This is a binary check that the model a user used on validation is the same model they used on test and live.”

This is a great place to start, and I imagine that, after a few rounds with this structure, if it becomes apparent that there’s a lot of spam, the Numerai team will adjust the concordance check (and possibly the originality check) to be more robust.
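(For what it’s worth, one can imagine a simple distribution-based version of such a check; the sketch below uses a two-sample Kolmogorov-Smirnov test and is purely my guess at a mechanism, not Numerai’s actual implementation:)

```python
# Purely hypothetical sketch of a concordance-style check: if the same model
# produced the validation and test predictions, their distributions should be
# similar. This is a guess at a mechanism, not Numerai's implementation.
import numpy as np
from scipy.stats import ks_2samp

def looks_concordant(validation_preds, test_preds, p_threshold=0.01):
    """Two-sample Kolmogorov-Smirnov test on the two sets of predictions."""
    statistic, p_value = ks_2samp(validation_preds, test_preds)
    return p_value > p_threshold   # True = plausibly generated by the same model

rng = np.random.default_rng(1)
same = looks_concordant(rng.beta(2, 2, 5_000), rng.beta(2, 2, 5_000))   # likely True
diff = looks_concordant(rng.beta(2, 2, 5_000), rng.beta(8, 2, 5_000))   # likely False
print(same, diff)
```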

Much of what Numerai is doing exists in uncharted waters, so there’s definitely a learning-by-doing mentality behind a lot of these processes. Time will tell how successful these measures are.


Well, my point was really to call out explicitly (though no doubt you folks have discussed this internally) that misaligned incentives require you to police spammy accounts. That’s sub-optimal.

The best strategy here is (of course) to reduce the incentive misalignment: that is, to reduce the payout one can achieve by creating spammy accounts and submitting low-confidence predictions. But my second point was that the structure here involves potentially painful choices for Numerai. As I see it, you have two basic choices:

  1. Make people ante up their own money for bets, in direct proportion to payouts. (That is, make participants shoulder risk comparable to any returns they may see.) That’s of course undesirable, since it would reduce participation (anyone with a good model would just pursue a non-blinded strategy with their own money on the open market).

  2. Reduce or significantly delay payouts until trust is established. This is the more reasonable approach, and the one you guys have taken in various forms (bonuses, NMR). This works, but it works better if the initial payouts are low (to avoid paying even small amounts of money during the trial phase) and the later payouts are high: effectively, some sort of prize economics, as with other open competitions like the X Prize.

Bonuses and NMR seem in some sense like roundabout ways of achieving prize economies.


Thanks for elaborating; I have a better understanding of what you mean now (though I’m not 100% sure I get it). I think that’s a very interesting question and accompanying thought exercise.

My thought is that the current payout model would only be a problem for spamming (here meaning spamming that has somehow circumvented all of the detection methods already in place) if someone managed to produce a series of profitable, high-ranking models. For example, if you were to somehow take the top 25 spots on the leaderboard.

Even if you had to upload 100 different predictions to get that, and somehow pulled it off without getting flagged, from Numerai’s POV, wouldn’t you just appear as like, 15 different users? And if out of those 100, 25 take the top 25 spots, and another 30 qualify for smaller payouts, aren’t you ultimately contributing to the meta-model in a way that mirrors the work of multiple people? Meaning, Numerai is okay paying you because you’re helping them achieve their goal of getting contributions for the meta-model.

Forgive me if I’m still missing your point; I really like your ideas about prize economies, and I’m just trying to make sure I understand what you mean!


Well, you’re only contributing to the meta-model if there’s some actual value behind your predictions. My initial thought was that Numerai had set up a scheme where you could apply the classic joke about sending half your clients a letter saying the market will go up and the other half a letter saying it will go down.

In fact you can’t really do that, for the reason I mentioned above (the combinatorial problem). So you have to select some subset of predictions, and if you select them randomly you are unlikely to get a good yield.

In that sense, you are incentivized to select somewhat decent predictions and, as a result, yes, contribute to the model. But perhaps less so than under the implied “one person ~ one model” system.


Ahh okay, I see. My thinking was that if your model was good enough to produce predictions that earned some level of payout, then it would be good enough to contribute to the meta-model. Thanks for the added explanation (I’m still new to this level of data science + finance, so I really appreciate the info).


How about a requirement to link the user’s LinkedIn account?