I think you raise some good points, and I’m thankful for the thoughtful discussion! I apologize for the long essay below. It’s a public holiday today, but I still have child care and nothing pressing at my real job. Writing like this helps me form my views, so I sometimes throw up on the page as a way to do that.
Firstly, I’m sorry if I sounded like I was suggesting you are arrogant. That would be a failing of my writing style. Suggesting arrogance wasn’t meant to be personal but rather just accepting that we all are arrogant in our opinions. This is a human failing which impacts us on an individual level first and then others second. But this is my point regarding the metamodel. I’ll return to that in a moment.
Your point about crypto is well taken. I’m personally very bearish on the whole crypto market even though I’ve made some money in it. I mean, who could not with a simple buy and wait strategy? But as a system, its future is uncertain, and as you correctly point out its legal issues are real, which means I predict high volatility in that space. I am quite risk averse, which I guess is starting to become obvious. I’d prefer that crypto was not a part of staking with Numerai. The hedge fund is making real currency backed by real nations that have a vested interest in continuing to exist, while we make cryptocurrency backed by the blockchain, and it is less certain that that system will have the same level of desire to continue to exist. This point is controversial and no one knows for sure. But we (the tournament participants) do take on more risk than our masters - it’s not fair - but in the end so be it. It’s an extra level of risk I can tolerate. But I take this position because I’m here more for the data science challenge and less so to make money. I doubt I will make substantial amounts of money anyway and I’m not looking to. So we may have different purposes here that lead to different opinions.
Regarding the metamodel, I think we disagree a lot. I’ll try to justify my position again. I can accept that the ‘wisdom of the crowd’ (WOTC) philosophy can fall into herd mentality. But 1) an individual investor is not immune from this either, and 2) the metamodel can solve this problem so long as it is composed of guesses that are as independently formed as possible. Only a metamodel of some kind can solve this type of problem. An individual cannot. Even if they think they are avoiding the herd mentality, it’s impossible for an individual to do so. That last sentence is opinion, not fact - I don’t believe anyone has facts either way. So, how can WOTC suffer herd mentality as well? Simply by having participants pay too much attention to what each other is doing. Does that happen here at Numerai? Certainly it does. WOTC does not solve the problem with algebraic precision.
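To make the independence point a bit more concrete, here is a minimal sketch of a toy crowd, not anything Numerai actually runs. The function name `crowd_error` and the error model are my own assumptions: every guesser's error is a mix of a shared component (what we copy from each other) and a private one, with `rho` as the assumed correlation between any two guessers. The shared part never averages out, so under these assumptions a herd of 50 ends up about as wrong as one person.

```python
import random
import statistics


def crowd_error(n_guessers, rho, n_trials=2000, seed=0):
    """Mean absolute error of the crowd's average guess (toy model).

    Each guesser's error is sqrt(rho) * shared + sqrt(1 - rho) * private,
    so any two guessers' errors have correlation rho. This error model is
    an assumption made purely for illustration.
    """
    rng = random.Random(seed)
    shared_w = rho ** 0.5
    private_w = (1.0 - rho) ** 0.5
    abs_errors = []
    for _ in range(n_trials):
        # The part of the error everyone has in common (herd behaviour).
        shared = rng.gauss(0.0, 1.0)
        errors = [
            shared_w * shared + private_w * rng.gauss(0.0, 1.0)
            for _ in range(n_guessers)
        ]
        # The crowd's guess is the plain average, so its error is the mean error.
        abs_errors.append(abs(statistics.fmean(errors)))
    return statistics.fmean(abs_errors)


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.9):
        print(f"rho={rho:.1f}  crowd error={crowd_error(50, rho):.3f}")
```

Running it with a few values of `rho` should show the crowd's error growing as the guesses become less independent, which is all I am claiming here.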
But I think Numerai is using WOTC in a way that helps solve the problems associated with WOTC. And again we disagree a lot here. You use the term ‘garbage’ to describe their data - ‘garbage’ is a strong pejorative, but I understand better now what you mean. I’ll use the word ‘obscured’ instead. I maintain that obscured data is a strong way to prevent herd mentality. Let’s say ‘Feature X’ is a stock’s price correlation with Dogecoin. And let’s just say that stock is GE (no particular reason, just making things concrete). We notice that this feature has had a consistent value over a few eras. Now Elon Musk makes some pronouncement about Dogecoin. If too many people worship Musk (sadly too many do and I think a lot of them might be here in Numerai) then many will ignore their objective models and insist that Dogecoin movements as a result of Musk statements will impact GE’s price. Decisions like this have so many levels of bullshit baked into them that they need their own paragraph to discuss.
Firstly, Musk knows next to nothing about what makes a stock price go up or down (IMHO). Even if he knows something good or bad about Dogecoin or GE, and this would have to be insider information (whenever you see insider information, think volatility and run away - you are about to get burnt or imprisoned), his statements are exactly what will make the correlation fail and the feature become incorrect. If you believe something external about the feature you will hypothesize about the feature. An investor will make mistakes because they know what the stock and feature are. But they will also have to fail to understand the difference between correlation and causation. A model uses the features to guess the target by finding correlations between features and targets, and if our model is smart we find correlations between complex nonlinear combinations of features and complex nonlinear combinations of targets. Nonetheless it’s all correlations. Correlations are fine - but if we know stuff about the features and how they reflect the real world we naturally start building a hypothesis about causation. But to establish causation you need to run controlled experiments, and you simply cannot do this robustly on the macroeconomic scale. Reasoning from causation (rather than correlation) will lead investors astray because we cannot establish causation. But OK, maybe you think you won’t be (mis)led by interpreting your knowledge of the features and targets beyond their raw values and you won’t try to establish causation. Then why have any information about the features and targets other than their raw values? This information about features/targets is exactly what will lead to herd mentality because, for example, too many people trust Elon Musk or try to establish a causation between Dogecoin and GE without any basis in reality. Obscured data is the remedy.
In short, we get in our own way.
But as @wigglemuse pointed out, maybe Numerai Signals is more your thing. I want to explore that too at some point. But again, going it alone using your data in isolation (your own silo) is IMHO risky. Sure, use your own data, but I don’t recommend making investment decisions in isolation. Even if you are the smartest person in the room you will blow up one day and maybe for an entire week and maybe until you have no money left. It would be nice if you were buffered by a dissenting opinion. You see, even when you are the smartest person in the room, a dissenting opinion might slow you down, but one day that dissenting opinion will stop you from blowing up. And it’s important that that dissenting opinion be based on methodology as different as possible from your own. I think obscured feature/targets in a metamodel achieve this goal best.
Again, I’m enjoying the discussion. I think I will try to put some numbers behind what I have just said and write some simulations of how I think siloed opinions blow up. I don’t think that is original work, but it will make things more concrete for me and maybe convince you and others about the wisdom of the metamodel. Maybe not.
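As a very rough first pass at the kind of simulation I have in mind, here is a minimal sketch. Everything in it is an assumption for illustration only: the name `blow_up_rate`, the 1% drift, the 2% noise, the 1% chance per period of a 60% crash, and above all the idea that each model crashes independently of the others. It compares going it alone (`n_models=1`) with splitting capital equally with one dissenting model (`n_models=2`), and counts how often the account loses more than half its peak value.

```python
import random


def blow_up_rate(n_models, n_paths=2000, n_periods=100, p_crash=0.01, seed=1):
    """Fraction of simulated paths whose drawdown from peak exceeds 50%.

    Capital is split equally across n_models models. All numbers (drift,
    noise, crash size, crash probability) are made up for illustration.
    """
    rng = random.Random(seed)
    blown = 0
    for _path in range(n_paths):
        wealth, peak = 1.0, 1.0
        for _period in range(n_periods):
            period_return = 0.0
            for _model in range(n_models):
                # Key assumption: each model crashes independently of the others.
                if rng.random() < p_crash:
                    model_return = -0.60
                else:
                    model_return = rng.gauss(0.01, 0.02)
                period_return += model_return / n_models
            wealth *= 1.0 + period_return
            peak = max(peak, wealth)
            if wealth < 0.5 * peak:
                # Count the path as blown up and stop simulating it.
                blown += 1
                break
    return blown / n_paths


if __name__ == "__main__":
    print("solo model:       ", blow_up_rate(1))
    print("with a dissenter: ", blow_up_rate(2))
```

If the dissenting model's crashes were strongly correlated with mine, the benefit would shrink, which is why I care so much about the methodologies being as different as possible.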
Robbo (the Fossil)