I went to upload my predictions for the first time, but I noticed it doesn’t like the sort order. How exactly should the predictions be sorted? My system doesn’t use the order of the records in the tournament data; in fact, that information is lost.
Is it sorted by id? Or are there other fields in the sort as well?
It looks like the sort order is data_type descending, then era, then id ascending, but even that doesn’t work.
Take a look at the example notebook and replicate how it does it. I guess you just have to put them back in the same order as you received them. I didn’t actually know order mattered, as I just copied their code for submitting.
Thanks, I’m not seeing the sort order in that notebook. I guess I need to iterate over the source file as I generate the predictions to get the sort order right.
My system loads the file and then does a lot of processing on it, so I’m trying to reconstruct the sort order, but nothing I try seems to work.
This bit here:
predictions_df = tournament_data["id"].to_frame()
predictions_df["prediction"] = predictions
Since they didn’t change the order before making the predictions, the predictions come out in the same order as the rows of tournament_data, so they can simply be added as a column next to the ids from tournament_data.
So if you’ve lost the ordering, you can just cache the ids in the order they appear in tournament_data, and then construct your predictions file using that ordering.
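To make the caching idea concrete, here is a minimal sketch. It assumes your pipeline can hand back predictions keyed by id; the tiny `tournament_data` frame and the `shuffled_predictions` dict are made-up stand-ins, not real tournament data.

```python
import pandas as pd

# Hypothetical stand-in for the tournament data file (the real one has many more columns).
tournament_data = pd.DataFrame({
    "id": ["n3", "n1", "n2"],
    "era": ["era2", "era1", "era1"],
})

# Cache the ids in their original order before any processing reshuffles the rows.
original_ids = tournament_data["id"].tolist()

# Assumption: after processing, predictions are available keyed by id, in arbitrary order.
shuffled_predictions = {"n1": 0.48, "n2": 0.53, "n3": 0.51}

# Rebuild the submission by walking the cached ids in their original order.
predictions_df = pd.DataFrame({
    "id": original_ids,
    "prediction": [shuffled_predictions[i] for i in original_ids],
})
```

The dict lookup raises a `KeyError` if a prediction is missing for some id, which is probably what you want here rather than silently uploading an incomplete file.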
The important part is to keep the relation of id <-> prediction as you need to provide both fields - no special ordering needed…
Well, I did manage to get it to upload by generating the prediction file using the same ID order as the data file. The prediction file must have the same IDs and be in the same order to be accepted.
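If you want to check this before uploading, something like the following could work. It is only a sketch of a local sanity check (the helper name `first_order_mismatch` is made up, and this is not how the upload validation is actually implemented): it compares the two id sequences position by position.

```python
def first_order_mismatch(data_ids, prediction_ids):
    """Return the first position where the two id sequences differ, or None if they match."""
    for position, (expected, actual) in enumerate(zip(data_ids, prediction_ids)):
        if expected != actual:
            return position
    # All shared positions match; a length difference means ids are missing or extra.
    if len(data_ids) != len(prediction_ids):
        return min(len(data_ids), len(prediction_ids))
    return None


# Made-up ids for illustration.
data_ids = ["n3", "n1", "n2"]
same_order = first_order_mismatch(data_ids, ["n3", "n1", "n2"])
wrong_order = first_order_mismatch(data_ids, ["n1", "n2", "n3"])
missing_row = first_order_mismatch(data_ids, ["n3", "n1"])
```

Here `same_order` is `None`, while the other two calls return the position of the first row that would need fixing.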
If using pandas you can use reindex:
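A minimal sketch of the reindex approach, assuming the predictions ended up in a DataFrame with `id` and `prediction` columns in some arbitrary order (the data below is made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the tournament data, in the order it was received.
tournament_data = pd.DataFrame({"id": ["n3", "n1", "n2"]})

# Predictions that came out of the pipeline in a different order.
predictions_df = pd.DataFrame({
    "id": ["n1", "n2", "n3"],
    "prediction": [0.48, 0.53, 0.51],
})

# Index by id, reorder the rows to follow tournament_data, then turn id back into a column.
ordered = (
    predictions_df.set_index("id")
    .reindex(tournament_data["id"])
    .reset_index()
)

# reindex fills ids that have no prediction with NaN, so check before uploading.
assert not ordered["prediction"].isna().any()

# ordered.to_csv("predictions.csv", index=False)  # file name is just an example
```

Note that `reindex` raises an error if the ids in `predictions_df` contain duplicates, so it is worth deduplicating first if your processing can produce them.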