On what data do I need to send my predictions ('validation', 'test', 'live')?

wtd · February 22, 2021, 9:33pm

The tournament data has 3 different types: ‘validation’, ‘test’, ‘live’

When submitting my predictions to a tournament, do I need to cover all 3, or can I send predictions for just one of them (to reduce file size)?

lliwmc · February 22, 2021, 9:48pm

Your predictions should be based on the live data, not the training data or validation data.

I think that's what you mean. Looking back on it, I'm not sure I fully understand the question.

Will

wtd · February 22, 2021, 9:52pm

Thank you. What I mean is that the data_type field in the tournament.csv file, has 3 types: ‘validation’, ‘test’, ‘live’.

When I want to submit my predictions, I run my model on the tournament data, write the predictions to a CSV file, and send it to Numerai. Do I need to do this for all 3 types?

wigglemuse · February 22, 2021, 10:14pm

All of them. You are sending them a submission file with the same number of rows (and with the same row ids) that are in the “tournament” data file. Look at the “example_predictions.csv” file. Just like that, except with your predictions instead.
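A minimal sketch of what that looks like, using only Python's standard library and a toy in-memory stand-in for the tournament file. The column names (`id`, `era`, `feature_a`, `prediction`) and the `toy_model` function are assumptions for illustration; check them against your own copy of the tournament data and “example_predictions.csv”.

```python
import csv
import io

# Toy stand-in for the tournament file. Column names are assumptions.
TOURNAMENT_CSV = """id,era,data_type,feature_a
n001,era121,validation,0.25
n002,era575,test,0.50
n003,eraX,live,0.75
"""


def toy_model(row):
    """Hypothetical model: echoes one feature as the prediction."""
    return float(row["feature_a"])


def build_submission(tournament_text):
    """Return one (id, prediction) pair per tournament row, in file order."""
    reader = csv.DictReader(io.StringIO(tournament_text))
    return [(row["id"], toy_model(row)) for row in reader]


def write_submission(rows):
    """Serialize the pairs as CSV text with an id,prediction header."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "prediction"])
    writer.writerows(rows)
    return out.getvalue()


submission = build_submission(TOURNAMENT_CSV)
tournament_ids = [r["id"] for r in csv.DictReader(io.StringIO(TOURNAMENT_CSV))]

# The submission must have the same ids, in the same number of rows,
# as the tournament file, regardless of data_type.
assert [row_id for row_id, _ in submission] == tournament_ids

submission_csv = write_submission(submission)
```

The point of the final check is that no row is dropped on the basis of its data_type: validation, test and live rows all get a prediction.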

lliwmc · February 22, 2021, 10:30pm

This

Sorry, completely misread. Going back to reading rather than posting!! It's been a long Monday, it's 10:27pm in the UK, and that's clearly past my bedtime…

asteeber · February 25, 2021, 8:03pm

I believe you can submit a file with only the “live” data_type rows and still earn payouts on what you stake (not 100% certain on this). But the trade-off is that you can’t see predicted performance before Thursday, when results for the round are posted. I think Numerai uses the test/validation data to gauge how well your model might do.

themicon · February 25, 2021, 8:11pm

For Numerai signals you can submit only the live data_type, for the classic tournament you need to upload your predictions for the entire tournament file, which contains live, validation and test.

seunghoans · February 28, 2021, 1:43am

I think Numerai uses the test/validation data to gauge how well your model might do.

Does this mean that if I set all of the ‘test’ and ‘validation’ predictions to ‘0’ and only predict the live data, I get a lower score? I am also curious whether I should actually predict the validation data or just copy the given target values.

wigglemuse · February 28, 2021, 2:00am

You can do whatever you want with the validation data (you won’t get validation diagnostics that mean anything though, of course). If you are just going to play games with it, you might as well use it as additional training data. But if you don’t predict the test data, you are just being a jerk (although it is true you get no feedback from it). Well, at least if you are staking, as they rely on that info at least some of the time. (And if a lot of people made a habit of that, they’d be forced to make the checks on submissions more draconian. And we don’t want that. Been there, done that.)
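If you do want to use the validation rows as additional training data, a rough sketch could look like the following. It assumes the training and tournament files share the same columns and that validation rows carry a `target` value while test and live rows do not; the column names and the toy data are hypothetical.

```python
import csv
import io

# Toy stand-ins for the two files. Column names are assumptions.
TRAINING_CSV = """id,data_type,feature_a,target
t001,train,0.10,0.25
t002,train,0.90,0.75
"""

TOURNAMENT_CSV = """id,data_type,feature_a,target
n001,validation,0.25,0.50
n002,test,0.50,
n003,live,0.75,
"""


def read_rows(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def extended_training_set(training_text, tournament_text):
    """Training rows plus the validation rows taken from the tournament file."""
    train = read_rows(training_text)
    validation = [
        row for row in read_rows(tournament_text)
        if row["data_type"] == "validation"
    ]
    return train + validation


extended = extended_training_set(TRAINING_CSV, TOURNAMENT_CSV)
extended_ids = [row["id"] for row in extended]
```

A model fitted on `extended` has already seen the validation targets, which is why the validation diagnostics stop meaning anything once you do this.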

asteeber · February 28, 2021, 2:26am

No, I believe Numerai payouts depend only on the live data predictions. If you want to use Numerai’s built-in predicted performance measures for your model, then you should predict the validation and test data as well.

wigglemuse · September 23, 2022, 12:57pm

Somebody liked one of my previous posts on this thread today, so I’d just like to point out that this information is outdated, and you no longer need to submit anything but the live era each week. (No more “test”)
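For anyone still working from a file that mixes data types, here is a small sketch of keeping only the live rows before predicting. It assumes a `data_type` column as described earlier in the thread; the current download format may well differ, and the toy data and column names are hypothetical.

```python
import csv
import io

# Toy stand-in for a mixed tournament file. Column names are assumptions.
TOURNAMENT_CSV = """id,era,data_type,feature_a
n001,era121,validation,0.25
n002,era575,test,0.50
n003,eraX,live,0.75
"""


def live_rows(tournament_text):
    """Return only the rows whose data_type is 'live'."""
    reader = csv.DictReader(io.StringIO(tournament_text))
    return [row for row in reader if row["data_type"] == "live"]


live = live_rows(TOURNAMENT_CSV)
live_ids = [row["id"] for row in live]
```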

ryo_matsuzaka · September 23, 2022, 1:09pm

Somebody …

Thanks. That’s me.
