Wicked problems

Out of curiosity, has anyone here looked at formulating the TC problem in the Tournament, or the IC one in Signals, in terms of their being “wicked problems” rather than (say) optimization ones? I don’t want to spend time reinventing the wheel :laughing:. But in conversing with someone elsewhere today about such problems, it struck me that perhaps approaching these as wicked problems might prove useful, particularly using Conklin’s generalization (see the attached Wikipedia article, under “Characteristics”).

Of course I could be completely out to lunch. It wouldn’t be the first time :crazy_face:


Our problems here have some of those characteristics, but clearly not all of them. But what does “formulate the TC problem” as being wicked even mean? And how would it help?


If I knew the answers to those questions, I wouldn’t be asking mine :laughing:.

I guess basically it’s just a hunch I had that approaching TC/IC within a wicked-problem framework—rather than (say) an optimizing one—might be productive. I think the root of that hunch lies in my target-analysis background: there, one can measure better or worse in terms of how quickly an algorithm picks up a target, how long you can track it in noise, how quickly you can identify it as a threat or not, that sort of thing. Everything is, in theory, moderately knowable. Systems that work in the lab tend to work in practice, and so on.

TC, on the other hand, doesn’t really seem to be the same. A big problem, of course, is that the target itself is unknown (except in a vague sort of way), and as of this moment I don’t have any sense of whether an approach that’s good for this round will be at all valid the next. And of course there’s the feedback nature of the competition, which from a fairly simple Darwinian perspective might imply that good solutions will cluster, lessening the value of each. I really don’t know; these are just things to ponder.

All that said, I do like having a framework in which to approach complex problems. For example, right now (rounds 310 & 311) in the Tournament I’ve just started running 50 models that might be more wicked-problem-type solutions; they’re built on a genetic algorithm that leans more towards survivability than peak performance. Some of them are doing quite well, and the rest not so well :sob: What’s interesting is that they correlate to each other quite consistently: one group of 25 at about 0.1 among themselves, the other at about 0.7. I’m curious how they will sort themselves out over a number of runs.

Over in Signals, I have been running another 50 models that were (algorithmically) very tied together. They did OK on corr and MMC, and absolutely appallingly on IC. Except for two, and those two were the least tied to the rest.

So putting those experiences together leads me to think that trying to optimize for a best solution isn’t the right way to go. It may be better to evolve a set of solutions each of which performs moderately, but generally avoids getting murdered. Does that make sense?

In any case, I do :heart: things like this that spark my curiosity!

I don’t think TC has the “one shot” character where trial-and-error is useless, etc. But I do think there will be a stochastic element to it, i.e. even with the best approach you are going to win some and lose some. That’s always been true with any measure (this is the live stock market, after all), but the ups-and-downs might be greater and more seemingly random (general trends for TC that apply to most people will be less apparent than corr, for instance).

So I recommend that everybody have several (or many) different, uncorrelated approaches to gaining TC, so you can have a more even-keeled portfolio of results: when one model is getting hammered, another will hopefully be doing very well. Unless you really like the rough seas.
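One crude way to check whether a stable of models is actually uncorrelated is to greedily keep only those whose predictions stay below a correlation threshold against everything already kept. This is just an illustrative sketch, not anything from the thread; the threshold, model names, and data are all made up.

```python
# Greedy selection of a low-correlation model stable.
# Purely illustrative; threshold and data are synthetic.
import numpy as np
import pandas as pd

def pick_uncorrelated(preds: pd.DataFrame, max_corr: float = 0.5) -> list[str]:
    """Keep each model (in column order) only if its |Spearman| correlation
    with every already-kept model is at or below max_corr."""
    corr = preds.corr(method="spearman").abs()
    kept: list[str] = []
    for name in preds.columns:
        if all(corr.loc[name, k] <= max_corr for k in kept):
            kept.append(name)
    return kept

# Synthetic demo: "b" is a near-copy of "a", while "c" is independent.
rng = np.random.default_rng(1)
a = rng.random(300)
demo = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(0, 0.05, 300),  # heavily correlated with a
    "c": rng.random(300),               # independent of a and b
})
print(pick_uncorrelated(demo))  # "b" should be dropped, "a" and "c" kept
```

Column order matters here (an earlier model blocks a later near-duplicate), so in practice you might order the candidates by validation performance before running the greedy pass.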

I have always approached Numerai models as more of a logic puzzle or bit of code-breaking than an optimization or stats problem. Basically, I assume that others following consensus best-practice (or at least very common) methods are quickly going to converge upon a set of (fairly correlated) models that are going to pick all the low-hanging (and probably medium-hanging) fruit, generalize decently, etc. And I’m not going to try to compete to be the best & most comprehensive low-to-medium fruit picker (beyond the extent that I am forced to be because of the scoring metrics and payout system used) as I’m just not going to win that battle…and it’s boring. So I am making the (mostly unfounded) assumptions that the results are more deterministic than they may seem and also that most people are going to be led astray by simplicity bias in some areas. (i.e. so I’m probably overfitting)


I believe that optimizing for CORR (edit: and FNC, FNCv3) is the only NON-wicked problem we have.

CORR we can calculate, and tune our model for… at least for a hold-out from the (now much expanded) validation data. EDIT: one can also calculate feature-neutral correlations.

But MMC (which is gone now) and TC are utterly black boxes for us, since they’re defined using the meta-model, which we don’t have. All we can do is observe our results, and stake accordingly (CORR only, 1 X TC, or 2 X TC).


Well, I do like rough seas. Or at least I’m entertained by them (time for a story about outrunning a hurricane on a 230 ft ship with only emergency steering, :scream:). Anyway, using uncorrelated models is one of the things I’m interested in. For example, these are the Spearman correlations among the predictions I submitted for Tournament round 312 (very similar to 311, which are doing ok):

Each group of 25 has roughly the same Corr scores within the group when run on simulated or randomized Validation data; the 1-25 group runs at about half that of the 26-50 group. In real life, the scores seem fairly evenly distributed against other Tournament submissions, but that’s going only by eyeball for now. Other tests (like against neutralization) I haven’t gotten around to writing yet.
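For anyone wanting to reproduce that kind of within-group view, here is one way to compute the pairwise Spearman matrix and the average correlation inside each group of 25. The model names, 25/25 split, and data are synthetic stand-ins for the actual submissions, but the group-B construction (a shared common component) mimics the ~0.7 within-group figure described above:

```python
# Pairwise Spearman correlations among 50 model prediction columns, plus
# the mean within-group correlation for two groups of 25. All data is
# synthetic: group B shares a common component, so it correlates within
# itself far more strongly than group A does.
from itertools import combinations

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_rows = 500

common = rng.random(n_rows)
preds = pd.DataFrame(
    {f"model_{i:02d}": rng.random(n_rows) for i in range(1, 26)} |
    {f"model_{i:02d}": 0.7 * common + 0.3 * rng.random(n_rows)
     for i in range(26, 51)}
)

corr = preds.corr(method="spearman")  # 50 x 50 matrix

def mean_within(names: list[str]) -> float:
    """Average correlation over all distinct pairs inside one group."""
    return float(np.mean([corr.loc[a, b] for a, b in combinations(names, 2)]))

group_a = [f"model_{i:02d}" for i in range(1, 26)]   # independent models
group_b = [f"model_{i:02d}" for i in range(26, 51)]  # shared component

print(mean_within(group_a), mean_within(group_b))  # low vs. high
```

The same `corr` matrix can feed a heatmap (e.g. `seaborn.heatmap`) to get the kind of picture shown in the post.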