New data frequency

Hi,
I would like to know whether new data for re-training our models arrives at a constant frequency or at random intervals?

Random, but with the just-released data they announced their intention to cool it for a while, since we’ve had many changes lately. So we’ll probably have at least a good number of months with the v4.3 dataset before more changes... unless they think of a good reason to do otherwise (or especially if they discover some issue with the latest data). That’s about as sure as we can be.

I don’t understand what random means. I’m submitting daily predictions with the 4.3 dataset and it seems OK. What is the problem?

What I mean is that in 4.3, there is no new era data every week. For example, era 1092 will remain the last one until version 4.4 arrives.

I thought that no new data means no new feature columns or target columns, not that we will stop receiving weekly data updates in the validation set. (This way of interpreting the announcement seems consistent with how the announcement for Midnight brought up the idea of no new data.) If you interpret “new data” as just new rows in the dataset (eras and stocks within eras), that should be weekly; if you interpret it as new columns in the dataset, that is mostly random.

Sorry, I wasn’t clear enough. Thank you for your answer. So I can expect to have the data for era 1093 next week if I run napi.download_dataset("v4.3/validation_int8.parquet")?

Yep. Do keep in mind that some of the results from the latest weeks may still be unresolved. It might be worth having a part of your scripts that identifies unresolved rounds, saves only new or newly resolved data, and throws out the old data, so you don’t end up with tons of copies of eras ~600 to 1000.
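A rough sketch of that kind of check, in case it helps. It assumes that an unresolved era shows up as rows whose target is missing (None or NaN), which you should verify against your own download. The function names and the `already_saved` bookkeeping are made up for illustration and are not part of numerapi.

```python
import math


def is_missing(value):
    """True for None or NaN, the assumed markers of an unresolved target."""
    if value is None:
        return True
    try:
        return math.isnan(value)
    except TypeError:
        # Non-numeric values (e.g. strings) are treated as present.
        return False


def resolved_eras(eras, targets):
    """Return the set of eras in which every row already has a target."""
    seen = set()
    unresolved = set()
    for era, target in zip(eras, targets):
        seen.add(era)
        if is_missing(target):
            unresolved.add(era)
    return seen - unresolved


def eras_to_save(eras, targets, already_saved):
    """Eras resolved in the fresh download that are not yet stored locally."""
    return sorted(resolved_eras(eras, targets) - set(already_saved))


# Toy data: era 1093 still has a missing target, era 1091 is already on disk.
eras = ["1091", "1091", "1092", "1092", "1093", "1093"]
targets = [0.25, 0.5, 0.75, 0.0, 0.5, float("nan")]
print(eras_to_save(eras, targets, already_saved={"1091"}))  # ['1092']
```

With a downloaded parquet file loaded into a DataFrame, you could pass the era and target columns as the two iterables, then keep only the rows whose era is in the returned list and append those to your local copy.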

Oh right, I thought you meant a whole new dataset. Yeah, yeah, v4.3 (and even v4.2) data will continue with a new era every week and targets being filled in as they become available, etc.
