Take a closer look at the first 100 models and most of them will demonstrate the same pattern. I believe the nature of this pattern is periodic shocks: so-called “good” or “bad” eras that are easy or difficult to predict on the current dataset.
I’d go with your #1 but not #2, as I don’t use any ML techniques; it’s all old-school analysis and stats for me, and I have a similar overall pattern.
I think what would really help would be if Numerai simply started providing the target values for the test data once it was a year or so old.
I actually first started working with various transform methods plus simple inversion, and moved on from there. I guess my workhorse routines now are principal components analysis, kernel density estimation, and Gaussian mixtures. I think it’s more a question of habit than anything else; I’ve been using tools like those for decades, and I’m comfortable with them (I’m old enough to be one of the early adopters of Numerical Recipes, back when 386s and math coprocessors were the hottest things on the block).
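For anyone curious what those three workhorses look like in practice, here is a minimal sketch. To be clear, this is not the poster’s actual pipeline; it just strings together the three named tools (PCA, kernel density estimation, Gaussian mixtures) on synthetic data, using scikit-learn’s off-the-shelf implementations as a stand-in.

```python
# Hypothetical sketch only: PCA -> KDE -> Gaussian mixture on fake data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KernelDensity
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))  # stand-in for a real feature matrix

# 1. PCA: reduce the 20 raw features to a handful of components.
pca = PCA(n_components=3)
Z = pca.fit_transform(X)

# 2. KDE: nonparametric estimate of the density in the reduced space.
kde = KernelDensity(bandwidth=0.5).fit(Z)
log_density = kde.score_samples(Z)  # per-sample log-density

# 3. Gaussian mixture: parametric clustering of the same reduced space.
gmm = GaussianMixture(n_components=2, random_state=0).fit(Z)
labels = gmm.predict(Z)

print(Z.shape, log_density.shape, sorted(set(labels)))
```

Each step is independent of the others; in real use you would pick bandwidths and component counts by cross-validation rather than the arbitrary values above.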
Thanks for the trip down memory lane! My first PC was a 486DX4-100, but my school had a couple of 386s before that, and I’d even managed to get a copy of the Intel 386 Programmer’s Reference Manual, with some difficulty. I’d spent a lot of time with those 386s after school, learning to write DOS TSRs with TC and TASM.
That’s an old classic! Model enough of the patterns that survived, and you can infer the patterns that did not survive…
In this instance, during World War II they mapped, as red dots, all the bullet holes on the airplanes that came back. They then used this bullet-hole map to decide to reinforce the airplanes in the areas that had no red dots: presumably, airplanes hit in those areas went down and did not make it back!