Hi guys, this is my first time trying to get involved in the Numerai tournament.
I downloaded the example script from the main Numerai website.
I was able to download the dataset, which is about 1 GB, but failed to load it into a DataFrame.
napi = NumerAPI()
current_round = napi.get_current_round(tournament=8) # tournament 8 is the primary Numerai Tournament
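# training and validation data only change periodically, so no need to download them over again every single week
napi.download_dataset("numerai_training_data.parquet", "numerai_training_data.parquet")
df = pd.read_parquet("numerai_training_data.parquet")
I got the error below: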
2021-12-24 07:06:39,275 INFO numerapi.utils: target file already exists
2021-12-24 07:06:39,289 ERROR numerapi.utils: deleting file and restarting
numerai_training_data.parquet: 1.01GB [00:32, 30.9MB/s]
OSError Traceback (most recent call last)
7 # training and validation data only change periodically, so no need to download them over again every single week
8 napi.download_dataset("numerai_training_data.parquet", "numerai_training_data.parquet")
----> 9 df = pd.read_parquet('numerai_training_data.parquet')
/usr/local/lib/python3.7/dist-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
This was executed in Google Colab (I am using an M1 Mac) after I ran pip install on all the requirements (some version conflicts were highlighted, but I doubted this was the reason).
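Since version conflicts came up during installation, it can help to print what actually ended up installed. This is a minimal sketch, not part of the example script: `installed_versions` is a hypothetical helper, and it assumes Python 3.8 or newer for `importlib.metadata` (on Python 3.7 the `importlib_metadata` backport offers the same interface).

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing.

    Hypothetical helper for illustration; not part of numerapi or the example script.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The packages involved in the download and in the failing read_parquet call.
    for name, version in installed_versions(["numerapi", "pandas", "pyarrow"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing this output against requirements.txt shows which pins were actually honoured in the Colab runtime.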
Sorry for the long question, but would anybody be able to guide me on this?
I really want to get started on this journey.
Thank you so much!
*** Edit Note ***
This was resolved by ignoring the versions pinned in requirements.txt and installing the latest version of each package instead.
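For anyone who wants to do the same without editing the file by hand, here is a rough sketch that drops the version specifiers from requirement lines and builds the matching pip command. `strip_version_pins` is a hypothetical helper, and the sample lines are made up for illustration; they are not the actual contents of the Numerai requirements.txt.

```python
import re

# Matches the package name (with optional extras) at the start of a requirement line.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*(?:\[[^\]]*\])?)")


def strip_version_pins(lines):
    """Return requirement names with version specifiers removed (hypothetical helper)."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options such as -r / -e
            continue
        match = _NAME.match(line)
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    # Illustrative lines only; substitute the real file contents.
    example = ["pandas==1.3.4", "pyarrow>=5.0,<6.0  # pinned", "", "numerapi"]
    print("pip install --upgrade " + " ".join(strip_version_pins(example)))
```

In Colab the printed command can be run in a cell with a leading `!`, and the runtime usually needs a restart afterwards so the upgraded packages are picked up.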