Discrepancy in versioning between Numerai database and NumerAPI?

The new v4 data announcement looks great! However, it refers to the legacy data (310 features) as “v2”, while in NumerAPI it is referred to as v1. The 1050+ features dataset is referred to as version 2 in NumerAPI, so would v4 be version 3 in NumerAPI?

Am I missing something or is there a discrepancy between how the Numerai team does versioning and how NumerAPI is structured?

Currently, the API changes are breaking pipelines, so it would be great to have clarity on this.


Looking into this now and will get it sorted and reply here, thank you!


Awesome, thank you for the heads up!

Hey, apologies that your pipeline broke.

In reference to the error you reported:
Can you provide the NumerAPI code snippet that is throwing this error? AFAIK all of our example predictions and example models are able to download data normally.

In reference to versioning:
The version you’re referencing is the submission endpoint version, not the dataset version. We did, however, recently remove the need for a version argument, so it doesn’t matter what you give NumerAPI during upload. I’ll work on a PR to remove this version argument so it stops causing confusion.


Hey, thanks for the message. I see now that the error comes from an assert statement on my side. I assert that the filename is in NumerAPI().list_datasets(), which fails because the listed names now carry the v2 and v3 prefixes. I can download the files without this assert.
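
For anyone hitting the same assert, here is a minimal sketch of a check that tolerates the version prefix. The helper name is_listed and the filenames in the listing are placeholders made up for illustration; in a real pipeline the list would presumably come from NumerAPI().list_datasets().

```python
def is_listed(filename, datasets):
    """Return True if filename matches a listed dataset path,
    either exactly or once the "<prefix>/" part is ignored."""
    return any(d == filename or d.endswith("/" + filename) for d in datasets)


# Placeholder listing (hypothetical filenames, not the real dataset names).
listed = ["v2/train.parquet", "v3/train.parquet", "v3/features.json"]

# The bare name passes even though the listing is prefixed.
assert is_listed("train.parquet", listed)
# A fully prefixed name still has to match exactly.
assert is_listed("v3/features.json", listed)
```

Note that a bare name matches under any version prefix, so a stricter pipeline may prefer to assert on the full prefixed path instead.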

Thanks for the heads up on the submission versioning! So for downloading, the version arguments are aligned (v2 = legacy dataset, v3 = 1050+ features dataset)?

There’s no version argument for downloading, just the filename prefixed with v2/ for legacy and v3/ for current. v4/ will be listed upon the v4 data release!
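
To see which prefixes a listing contains, something like the sketch below could help. It assumes the listing is a plain list of "prefix/filename" strings; the function name group_by_version and the sample filenames are hypothetical.

```python
from collections import defaultdict


def group_by_version(datasets):
    """Group dataset paths by the part before the first "/".

    Paths without a "/" are collected under the empty-string key.
    """
    groups = defaultdict(list)
    for path in datasets:
        prefix, sep, name = path.partition("/")
        if sep:
            groups[prefix].append(name)
        else:
            groups[""].append(path)
    return dict(groups)


# Placeholder listing (hypothetical filenames, not the real dataset names).
sample = ["v2/train.parquet", "v3/train.parquet", "v3/features.json"]
by_version = group_by_version(sample)
print(sorted(by_version))  # the prefixes present in the listing
```

Checking for a key such as "v4" in the result is one way to tell whether a new data version has appeared in the listing yet.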