That makes sense. In my opinion, a much bigger issue is fixing (removing) the originality measure. Truth be told, my best models never get accepted, and I always have to do something goofy to make sure I satisfy the snowflake award for creativity. I am starting to think that Numer.ai is maliciously imposing originality to get many more models from me than I would normally submit, because I am certainly submitting many more models than I should be. If this is the case, then smart move by Numer.ai, but things like that rarely work out long term.
One idea I had is to split the competition: have one pool with no originality requirement (and, if you want, none of the other checks either, though I understand they can be useful for sorting out junk) and one normal pool. The normal pool would function as it does now, but the unrestricted pool would likely be a lot more competitive, because there is likely a reason so many models look similar to one another (it's because they're good models).
I also messed around with some scores a few weeks ago (5 weeks of scores) and saw that consistency and log-loss performance were negatively correlated. That is, the less consistent models had, on average, lower (better) log loss. This might have been a fluke, especially since I can't think of a mathy reason that explains it, but it is evidence against the notion that those metrics (consistency, concordance, etc.) create better models overall.
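For anyone who wants to run the same kind of check, here is a minimal sketch of how the correlation could be computed. The numbers are made-up placeholders, not real leaderboard data, and the `pearson` helper and variable names are just for illustration. Note that when you correlate the raw values, less consistent models having lower log loss shows up as a positive coefficient, because lower log loss is better.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-model values, NOT real leaderboard data.
# consistency: percent of eras beating the benchmark (higher is better)
# log_loss: log loss on the same data (lower is better)
consistency = [58.0, 67.0, 75.0, 83.0, 92.0]
log_loss = [0.6905, 0.6908, 0.6912, 0.6915, 0.6921]

r = pearson(consistency, log_loss)
# r > 0 here means the more consistent models had the higher (worse) log loss
print(f"Pearson r = {r:.3f}")
```

With only a handful of weeks of scores, a coefficient like this is noisy, so it is worth rerunning on more rounds before reading much into it.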