New data frequency

a_sarfa · January 17, 2024, 1:19pm

Hi,
I would like to know whether new data for re-training our models arrives at a constant frequency or at random intervals?

wigglemuse · January 17, 2024, 6:52pm

Random, but with the just-released data they announced their intention to cool it for a while since we’ve had many changes lately. So we’ll probably have at least a good number of months with the v4.3 dataset before more changes… unless they think of a good reason to do otherwise (or especially if they discover some issue with the latest data). That’s about as sure as we can be.

eleven_sigma · January 18, 2024, 8:53am

I don’t understand what random means. I’m submitting daily predictions with the 4.3 dataset and it seems OK. What is the problem?

a_sarfa · January 18, 2024, 9:42am

What I mean is that in 4.3, there is no new era data every week. For example, era 1092 will remain the last one until version 4.4 arrives.

andralienware · January 18, 2024, 2:45pm

I thought that no new data means no new feature columns or target columns, not that we will stop receiving weekly data updates in the validation set. (This way of interpreting the announcement seems consistent with how the announcement for Midnight brought up the idea of no new data.) If you interpret “new data” as just new rows in the dataset (eras and stocks within eras), that should be weekly; if you interpret it as new columns in the dataset, that is mostly random.

a_sarfa · January 18, 2024, 4:05pm

Sorry, I wasn’t clear enough. Thank you for your answer. So I can expect to have the data for era 1093 next week if I run 'napi.download_dataset("v4.3/validation_int8.parquet")'?

andralienware · January 18, 2024, 4:17pm

Yep. Do keep in mind that some of the results from the latest weeks may be unresolved. It might be worth having a part of your scripts that identifies unresolved rounds, saves only new or newly resolved data, and throws out the old data so you don’t have tons of copies of eras ~600 to 1000.
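
A minimal sketch of that bookkeeping, in case it helps. It assumes unresolved eras show up as missing (NaN) values in a column named "target" and that eras are in a column named "era"; both column names and the helper names `era_resolution` and `eras_to_save` are made up for illustration, so adjust them to whatever your file actually contains.

```python
from typing import Dict, List


def era_resolution(df, era_col: str = "era", target_col: str = "target") -> Dict[str, bool]:
    """Map each era to True if every row in that era has a non-null target.

    Expects a pandas DataFrame. The column names are assumptions.
    """
    return df[target_col].notna().groupby(df[era_col]).all().to_dict()


def eras_to_save(latest: Dict[str, bool], stored: Dict[str, bool]) -> List[str]:
    """Pick the eras worth writing to disk after a fresh download.

    latest: era -> resolved flag from the file just downloaded
    stored: era -> resolved flag from what was saved last time
    """
    selected = []
    for era, resolved in latest.items():
        if era not in stored:
            selected.append(era)   # era we have never saved before
        elif resolved and not stored[era]:
            selected.append(era)   # saved earlier as unresolved, now resolved
    return sorted(selected)


# Toy example: 1091 has resolved since last week and 1093 is brand new.
stored = {"1090": True, "1091": False, "1092": False}
latest = {"1090": True, "1091": True, "1092": False, "1093": False}
print(eras_to_save(latest, stored))
```

In practice `latest` would come from calling `era_resolution` on the freshly downloaded validation file after reading it with pandas, and `stored` from whatever you kept from the previous week. Everything not returned by `eras_to_save` is already on disk in its final form and can be skipped.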

wigglemuse · January 18, 2024, 10:54pm

Oh right, I thought you meant whole new data. Yeah, yeah, v4.3 (and even v4.2) data will continue with a new era every week and targets being filled in as available, etc.
