Tournament Targets and Target Types

jx0rd · March 7, 2021, 7:01pm

Hey, I’m new to Numerai and was confused about a few things.

Firstly, why do we have a target field for the numerai_tournament_data.csv?

I thought the Numerai tournament data was data for the next 4 weeks, and we’re submitting predictions about what those values will be? If so, why and how do we already have target data?

Secondly, the targets in the training and tournament data only take the values [0., 0.25, 0.5, 0.75, 1.]. Why, then, does the example submission in “Making your first submission on Numerai - Colaboratory (google.com)” submit values from around 0.45 to 0.55 as predictions? Do these values get rounded to the nearest target value, i.e. 0.45 would become 0.5, or do predictions in the range 0.45 to 0.55 get normalized to the range 0 to 1 to match the training data values?

Thanks

asteeber · March 7, 2021, 7:25pm

The tournament dataset also contains validation data, which includes target values, hence the target column. Targets for rows with data_type “test” and “live” are filled with “x”.
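To make the split concrete, here is a minimal sketch of pulling out only the rows that carry real targets. The column names `id`, `data_type`, and `target` and all of the sample rows are made up for illustration, and the placeholder used for unknown targets in the real file may differ, so the filter relies on `data_type` rather than on the placeholder.

```python
import csv
import io

# Hypothetical miniature of the tournament file: only validation rows
# carry a usable target; test/live rows carry a placeholder instead.
SAMPLE_CSV = """id,data_type,target
a1,validation,0.25
a2,validation,0.75
b1,test,x
c1,live,x
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Keep only the rows whose targets are known.
validation_rows = [row for row in rows if row["data_type"] == "validation"]
validation_targets = [float(row["target"]) for row in validation_rows]

# Rows you still have to predict but cannot score yourself.
unscored_rows = [row for row in rows if row["data_type"] in ("test", "live")]
```

Filtering on `data_type` keeps the target column from ever being parsed for rows where it holds no number.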

The obfuscated values represent movements within those signals. When run through a machine learning algorithm they are treated as continuous variables. These algorithms average out the values when they are continuous, which is why you get predictions really close to 0.5 (you can expect to see an average of 0.5 when you randomly sample a uniform distribution between 0 and 1, which is kinda what is happening).

Numerai gauges your performance based on correlation, so even if your predictions are all really close to 0.5, the relative distance of each prediction from 0.5 is what is used to calculate your correlation score.

wigglemuse · March 7, 2021, 7:35pm

You get scored on RANKING correlation (per era), so only the relative ordering of your predictions matters. If an era had only 5 rows and you submitted predictions:

0.1, 0.2, 0.3, 0.4, 0.5

that would be exactly the same as submitting:

0.491, 0.492, 0.493, 0.494, 0.495

because when converted to ranks they’d both be 1 2 3 4 5
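A rough sketch of why the two submissions above score identically. It assumes the score is computed by ranking the predictions and correlating those ranks with the targets; the helper names and the target values are hypothetical and this is not Numerai's actual scoring code.

```python
def ranks(values):
    """1-based ranks; sorted() is stable, so ties keep row order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(predictions, targets):
    """Correlate the ranks of the predictions with the targets."""
    return pearson(ranks(predictions), targets)


wide = [0.1, 0.2, 0.3, 0.4, 0.5]
narrow = [0.491, 0.492, 0.493, 0.494, 0.495]
targets = [0.25, 0.0, 0.5, 1.0, 0.75]  # made-up targets for one era

print(ranks(wide))    # [1, 2, 3, 4, 5]
print(ranks(narrow))  # [1, 2, 3, 4, 5]
print(rank_correlation(wide, targets) == rank_correlation(narrow, targets))  # True
```

Because both lists collapse to the same ranks before the correlation is taken, the spread of the raw values never enters the score.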

And you don’t want to submit only 5 discrete values, despite the target only having 5 values. Any ties in your predictions are broken automatically by the scoring method (by row order, which is essentially random), so don’t purposefully round them down to only 5 discrete values, as you’d just be removing signal from your predictions.
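Here is a small illustration of what rounding does to the ranks, assuming ties are broken by row order as described. The prediction values and the `ranks` helper are made up for the example.

```python
def ranks(values):
    """1-based ranks; sorted() is stable, so ties are broken by row order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


# Hypothetical model output: all close to 0.5, but with a clear ordering.
predictions = [0.52, 0.48, 0.51, 0.49, 0.53, 0.47]

# Snap each prediction to the nearest of the 5 target values.
rounded = [round(p * 4) / 4 for p in predictions]

print(ranks(predictions))  # [5, 2, 4, 3, 6, 1]
print(rounded)             # [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
print(ranks(rounded))      # [1, 2, 3, 4, 5, 6]
```

After rounding, every row is tied, so the ranks just follow row order and the ordering the model produced is gone.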
