Do model diagnostics really make sense?

eleven_sigma · July 25, 2021, 10:42pm

Model diagnostics use the validation eras, so the results depend on whether you trained on them or not.
If you trained on the validation eras, the diagnostics are too optimistic, because they are overfitted. If you didn’t, the diagnostics aren’t realistic either, because a lot of people surely do use validation in their real models.
The percentiles returned are biased by models that use validation.
Diagnostics should use a blind part of the test set, not validation.

wigglemuse · July 25, 2021, 11:16pm

Well, that’s the definition of a validation set – a hold-out set that you “validate” on by checking the stats on it. If you train on it, of course it doesn’t mean anything. The problem is that even if you don’t train on it, it doesn’t seem to mean much. What eras are “representative” of the larger population and indicate a good generalized model? Well, probably none of them. We need a validation set with years’ worth of data in it. Sounds like that’s what we’ll actually get soon if the rumblings about releasing the targets for the test set turn out to be true.
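
To make the "if you train on it, it doesn’t mean anything" point concrete, here is a toy sketch. It is not Numerai data or any Numerai tool: the targets are pure noise, the "model" is a hypothetical nearest-neighbour memorizer, and every name in it (`fit_memorizer`, `make_era`) is made up for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def fit_memorizer(features, targets):
    """Hypothetical 'model': answer with the target of the nearest stored feature."""
    stored = list(zip(features, targets))

    def predict(f):
        return min(stored, key=lambda ft: abs(ft[0] - f))[1]

    return predict


def make_era(rng, n):
    """One synthetic era: a single feature and a target that is pure noise."""
    features = [rng.random() for _ in range(n)]
    targets = [rng.random() for _ in range(n)]  # independent of the feature
    return features, targets


rng = random.Random(0)
val_x, val_y = make_era(rng, 500)      # "validation" era, trained on here
blind_x, blind_y = make_era(rng, 500)  # era the model never saw

model = fit_memorizer(val_x, val_y)
score_on_trained_era = pearson([model(f) for f in val_x], val_y)
score_on_blind_era = pearson([model(f) for f in blind_x], blind_y)

print("correlation on the era we trained on:", round(score_on_trained_era, 3))
print("correlation on the blind era:", round(score_on_blind_era, 3))
```

There is nothing to learn in this data, yet the score on the era that was trained on looks perfect, while the blind-era score sits near zero. That is the extreme case, but it is why stats on eras you trained on can’t be read as validation.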

rigrog · July 26, 2021, 5:53am

If they’d just give us the targets for 1/4 of the test eras… I believe that would improve the metamodel a whole lot more than throwing 3,000 columns at us. And they wouldn’t need to buy any more data!

sirmobius · July 26, 2021, 11:28am

I assume they use this data to train and validate the meta-model and back-test their trading strategy. They have probably decided they need these examples more than we do at the modelling stage.

wigglemuse · July 26, 2021, 3:32pm

That’s what they said before – they had to have it for backtests. But in the most recent fireside chat Richard said they didn’t really anymore for whatever reason and will probably be releasing those targets, and maybe even releasing targets weekly for live eras that have recently resolved. We’ll see…sounds like we’ll get something anyway.

eleven_sigma · July 26, 2021, 3:40pm

I think the conversation is going to other site… The problem isn’t whether we have enough data for modelling (we could open another thread for that). The problem is that if they compute diagnostics on a labelled part of the data, the diagnostics become useless as soon as some people train their models on that labelled data.
The only way diagnostics make sense is to compute them on a part of the data that is unlabelled.

rigrog · July 26, 2021, 5:20pm

According to Numerai-guy Anson Chu, they do indeed use it for back testing & stuff. So I’m suggesting that they continue using three fourths of it just that way.

rigrog · July 26, 2021, 5:55pm

“I think the conversation is going to other site…”

Is “chat” the other site, to which you refer? Please help me find the discussion of this topic, if it’s there.

Thx,
Rigrog

wigglemuse · July 26, 2021, 6:46pm

The diagnostics are just for you and for your models (and you can calculate them yourself), so it doesn’t really matter what other people are doing. (Numerai doesn’t care what your diagnostics are and does not use them to evaluate your model internally.) Obviously you will know if you’ve trained your own models using the validation set or not.
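
A rough sketch of what "calculate them yourself" could look like, using only the standard library. This is an assumption about the general approach (a rank correlation per era, then the mean, standard deviation and their ratio across eras), not Numerai’s exact diagnostic formulas. The `(era, prediction, target)` row layout and the function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, pstdev


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def per_era_corr(rows):
    """rows: iterable of (era, prediction, target). Returns {era: Spearman corr}."""
    by_era = defaultdict(lambda: ([], []))
    for era, pred, target in rows:
        by_era[era][0].append(pred)
        by_era[era][1].append(target)
    # Spearman correlation is Pearson correlation computed on the ranks.
    return {era: pearson(ranks(p), ranks(t)) for era, (p, t) in by_era.items()}


def summarize(era_corrs):
    """Mean, population std, and mean/std ratio of the per-era correlations."""
    vals = list(era_corrs.values())
    mu, sd = mean(vals), pstdev(vals)
    return {"mean": mu, "std": sd, "sharpe": mu / sd if sd > 0 else float("nan")}
```

Run it only on eras your model never saw in training. For example, `summarize(per_era_corr(rows))` on your held-out validation rows gives numbers you can compare across your own models, regardless of what anyone else trained on.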

eleven_sigma · July 27, 2021, 7:14am

I know whether I trained on validation or not, but I don’t know how many other people did, so the percentile the diagnostics give is useless to me. I think diagnostics are a very good idea, but they need to be computed on an unlabelled part of the data.

wigglemuse · July 27, 2021, 3:27pm

Those percentages, I believe, are hard-coded thresholds based on the assumption that you are not training on the validation set.
