Performance pattern of the leaderboard

I looked at the performance patterns of the top 5 models on the leaderboard. They are almost the same, with some minor differences.

One would assume an investor could stake on different models to reduce risk, but that is not feasible here.

What does it mean?

Is everyone using the same model (a decision tree), or is there one model that is successful in the long run?

I think that it probably reflects more the orderliness of the underlying markets with respect to training data that is years out of date.

  • Same user(s) who submitted slight variations of the same model?

  • The model is an obvious one, so multiple users thought of the same idea, which works well at this specific point in time?

  • It’s the result of using the same training data set?
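
One rough way to put a number on "almost the same" is to correlate the per-round scores of the top models with each other. This is only a sketch: the model names and score values below are invented for illustration, and real values would have to be copied from the leaderboard by hand or from whatever export is available.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(vx * vy)


def pairwise_correlations(scores):
    """Map each pair of model names to the correlation of their round scores."""
    return {
        (a, b): pearson(scores[a], scores[b])
        for a, b in combinations(sorted(scores), 2)
    }


# Hypothetical per-round scores, invented for illustration only.
round_scores = {
    "model_a": [0.02, 0.03, -0.01, 0.04, -0.02, 0.01],
    "model_b": [0.03, 0.03, -0.02, 0.05, -0.01, 0.01],
    "model_c": [0.01, 0.04, -0.01, 0.03, -0.03, 0.02],
}

for pair, corr in pairwise_correlations(round_scores).items():
    print(pair, round(corr, 3))
```

If every pair comes out close to 1, staking across those models gives little diversification, whichever of the explanations above is the real cause.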


Models trained with only one or two features?

Based on the inputs, it could be due to two main factors:

  1. The same validation and training data were used. This is part of the tournament rules, and the data is provided by Numerai.
  2. A similar ML model was used (XGB Tree).

Take a closer look at the first 100 models and most of them will show the same pattern. I believe this pattern comes from periodic shocks, the so-called “good” or “bad” eras, which are easy or difficult to predict with the current dataset.
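
To make the "good" and "bad" era idea concrete, here is a minimal sketch that scores predictions era by era. It assumes the score is a rank (Spearman) correlation between predictions and targets within each era, which this thread does not spell out; the era labels and numbers are made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # convention chosen for this sketch: constant input scores 0
    return cov / sqrt(vx * vy)


def score_by_era(rows):
    """rows: iterable of (era, prediction, target). Returns {era: score}."""
    preds, targets = defaultdict(list), defaultdict(list)
    for era, pred, target in rows:
        preds[era].append(pred)
        targets[era].append(target)
    return {era: spearman(preds[era], targets[era]) for era in preds}


# Hypothetical rows: an "easy" era and a "hard" era for the same model.
rows = [
    ("era1", 0.1, 0.00), ("era1", 0.4, 0.25), ("era1", 0.6, 0.75), ("era1", 0.9, 1.00),
    ("era2", 0.2, 1.00), ("era2", 0.8, 0.00), ("era2", 0.5, 0.50), ("era2", 0.3, 0.75),
]

for era, score in sorted(score_by_era(rows).items()):
    print(era, round(score, 3))
```

If many unrelated models all score well in the same eras and badly in the same eras, the shared pattern comes from the eras themselves rather than from everyone using one model.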


I’d go with your #1 but not #2 as I don’t use any ML techniques, it’s all old-school analysis and stats pour moi, and I have a similar overall pattern.
I think what would really help would be if Numerai simply started providing the target values for the test data once the test data was a year or so old.

Surprised to learn that you are not using ML. What kind of old-school analysis tools are you using? Linear regression?

I actually first started working with various transform methods + simple inversion, and moved on from there. I guess my workhorse routines now are principal components analysis, kernel density estimation, and Gaussian mixtures. I think it’s more a question of habitude than anything else; I’ve been using tools like those for decades, and I’m comfortable with them (I’m old enough to be one of the early adopters of Numerical Recipes, back when 386s and math coprocessors were the hottest things on the block :laughing: ).


Thanks for the trip down memory lane! My first PC was a 486DX4-100, but my school had a couple of 386s before that, and I’d even managed to get a copy of the Intel 386 Programmer’s Reference Manual, with some difficulty. I’d spent a lot of time with those 386s after school, learning to write DOS TSRs with TC and TASM. :slight_smile:



That’s an old classic! Modeling enough patterns that survive, one can see those patterns that did not survive…

In this instance, during World War 2 they mapped all the bullet holes on the airplanes that came back. They used this bullet-hole map to decide to reinforce the airplanes in the areas that did not have red dots; presumably, airplanes that were hit in those areas went down and did not make it back!