The basic Modeling/Submissions examples are not working, please help

The code below yields the following error message:


AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11660/55592206.py in <module>
      1 # submit predictions to numer.ai
      2 predictions = model.predict(tournament_data[feature_names])
----> 3 predictions.to_csv("predictions.csv")

AttributeError: 'numpy.ndarray' object has no attribute 'to_csv'

##########################################################################

import pandas as pd
from xgboost import XGBRegressor

# training data contains features and targets
training_data = pd.read_csv("numerai_training_data.csv").set_index("id")

# tournament data contains features only
tournament_data = pd.read_csv("numerai_tournament_data.csv").set_index("id")
feature_names = [f for f in training_data.columns if "feature" in f]

# train a model to make predictions on tournament data
model = XGBRegressor(max_depth=5, learning_rate=0.01,
                     n_estimators=2000, colsample_bytree=0.1)
model.fit(training_data[feature_names], training_data["target"])

# submit predictions to numer.ai
predictions = model.predict(tournament_data[feature_names])
predictions.to_csv("predictions.csv")

I solved the issue above with: pd.DataFrame(predictions).to_csv("predictions.csv")
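For anyone hitting the same error: model.predict returns a plain NumPy array, which carries neither the ids nor a column name. Here is a minimal sketch of one way to keep both. The small frame and the hard-coded array are made-up stand-ins for tournament_data and the real model output, so treat them as assumptions for illustration only.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for tournament_data, indexed by "id" as in the post.
tournament_data = pd.DataFrame(
    {"feature_a": [0.25, 0.5, 0.75]},
    index=pd.Index(["n001", "n002", "n003"], name="id"),
)

# Placeholder for model.predict(...): a bare ndarray, one value per row.
raw_predictions = np.array([0.48, 0.51, 0.53])

# Wrap the array in a Series that reuses the ids and names the column.
predictions = pd.Series(
    raw_predictions, index=tournament_data.index, name="prediction"
)

# Written to an in-memory buffer here; pass a file name such as
# "predictions.csv" to to_csv to write to disk instead.
buffer = io.StringIO()
predictions.to_csv(buffer)
csv_text = buffer.getvalue()
```

The first line of csv_text is id,prediction, followed by one row per id.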
But then the next problem came up: uploading the predictions does not work.
This line of code:
napi.upload_predictions("predictions.csv", model_id="my model")
yields the following error message:
Specified model_id is not a UUID

I tried this:

import uuid
model = str(uuid.uuid4())
napi.upload_predictions("predictions.csv", model_id=model)

This yields the following error message:
Unable to resolve model from id.

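You need to pass your actual model ID, which you can get from Numerai. A freshly generated UUID has the right format but does not belong to any model, so it cannot be resolved.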
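A rough sketch of looking the ID up by model name instead of generating one. It assumes napi.get_models() returns a dict that maps your model names to their UUID strings (worth confirming against the numerapi docs). The dict below is a made-up stand-in so the snippet runs without credentials, and resolve_model_id is a hypothetical helper, not part of numerapi.

```python
import uuid


def resolve_model_id(models, name):
    """Return the UUID string registered under `name`, or raise a clear error."""
    if name not in models:
        raise KeyError(f"No model named {name!r}; available: {sorted(models)}")
    model_id = models[name]
    # uuid.UUID raises ValueError if the string is not a well-formed UUID.
    uuid.UUID(model_id)
    return model_id


# Made-up stand-in for: models = napi.get_models()
models = {"my_model": "123e4567-e89b-12d3-a456-426614174000"}

model_id = resolve_model_id(models, "my_model")
# Then: napi.upload_predictions("predictions.csv", model_id=model_id)
```

Looking the ID up by name also means a typo in the model name fails locally with a readable message rather than at upload time.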

Thanks a lot, now I have the next problem:

Invalid submission headers. Headers must be id and prediction.

OK, you need to do something like this to put the id and prediction headers in your submission CSV:

#download live data
current_round = napi.get_current_round(tournament=8)
napi.download_dataset("v4/live_int8.parquet", f"live_{current_round}_int8.parquet")
live_data = pd.read_parquet(f"live_{current_round}_int8.parquet")

#put predictions into live data dataframe
live_data["prediction"] = model.predict(live_data[feature_names])

#make new dataframe with only the index (contains ids)
predictions_df = live_data.index.to_frame()
#copy predictions into new dataframe
predictions_df["prediction"] = live_data["prediction"].copy()

#csv file will have only id and prediction headers/columns
predictions_df.to_csv("predictions.csv", index=False)
submission_id = napi.upload_predictions("predictions.csv", model_id=model_id)

I’m sure there’s more elegant code to do this. This is just my amateur hack code.
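One small addition that might save a failed upload: checking the header row locally before calling upload_predictions. This is a minimal sketch using only the standard library. has_submission_headers is a hypothetical helper, and the expected header names are taken from the error message quoted earlier in the thread.

```python
import csv
import io


def has_submission_headers(csv_file):
    """Return True if the first row of the open CSV file is exactly id,prediction."""
    header = next(csv.reader(csv_file), None)
    return header == ["id", "prediction"]


# What pd.DataFrame(predictions).to_csv(...) writes: an unnamed index
# column and a data column called 0.
bad = io.StringIO(",0\n0,0.48\n1,0.51\n")
# The layout the error message asks for.
good = io.StringIO("id,prediction\nn001,0.48\nn002,0.51\n")

bad_ok = has_submission_headers(bad)
good_ok = has_submission_headers(good)
```

For a file on disk, open it with open("predictions.csv", newline="") and pass the file object to the helper.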


Thank you very much, I will try it out! :)