Help with notebook - pydata


#1

Hi all,
I was wondering if I could get some help with the notebook below.
It looks like a pretty good run-through; however, it does not work with the current data.
Any help would be sincerely appreciated.
Many thanks,
Best,
Andrew


#2

The file you linked is from an earlier version of the tournament. The dataset format has changed somewhat since then (a different number of features, the t_id column is now just id, and data_type and era columns have been added). So you’ll need to change how the script handles the dataset after loading it so that it matches the new format.
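
As a rough illustration of that kind of change, here is a minimal sketch using pandas. It is not taken from the linked notebook. The column names id, era and data_type come from the description above; the name "target" for the target column, the feature names feature1/feature2 and the helper split_features are assumptions made up for the example, so check them against the actual file header.

```python
import pandas as pd

# Columns that are not features in the new format.
# "target" is an assumed name for the target column.
NON_FEATURE_COLUMNS = ["id", "era", "data_type", "target"]


def split_features(df: pd.DataFrame):
    """Split a new-format frame into (ids, features, target).

    target is None when the frame has no "target" column.
    """
    ids = df["id"]
    target = df["target"] if "target" in df.columns else None
    # Drop only the non-feature columns that are actually present.
    to_drop = [c for c in NON_FEATURE_COLUMNS if c in df.columns]
    features = df.drop(columns=to_drop)
    return ids, features, target


# Tiny made-up frame standing in for the real training file.
example = pd.DataFrame({
    "id": ["a1", "b2"],
    "era": ["era1", "era1"],
    "data_type": ["train", "train"],
    "feature1": [0.1, 0.9],
    "feature2": [0.5, 0.4],
    "target": [0, 1],
})

ids, features, target = split_features(example)
```

Dropping columns by name rather than by position means the script does not depend on how many features the current dataset happens to have.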


#3

Hi Daenris!
I tried to change the proposed metrics in the notebook,
but the output seemed strange and incorrect.
I was wondering if somebody else could run through it and see whether they get measurable results.
My notebook was putting zeros in strange places.
Many thanks,
Andrew


#4

I’m not saying anything about metrics. The format of the input files is different from what that notebook expects. In the old format, the input files had 51 or 52 columns: t_id, 50 features, and a target (in the training file). That is no longer the case. So if you’re just trying to run that notebook as is, it won’t be removing the id column (because the column header changed), and it will be incorrectly leaving in several additional columns (data_type, era), so I wouldn’t expect anything downstream of loading the data to work at all.
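
One quick way to see whether this is what is happening is to look at the column names of whatever is being passed to the model. The following is only a sketch: check_columns is a hypothetical helper, not something from the notebook, and it only knows about the column names mentioned in this thread.

```python
def check_columns(columns):
    """Return a list of problems found in a list of feature-matrix column names."""
    columns = list(columns)
    problems = []
    # The old format used t_id; seeing it means the file itself is outdated.
    if "t_id" in columns:
        problems.append("old-format column 't_id' present")
    # In the new format these columns are not features and should be removed.
    for name in ("id", "era", "data_type"):
        if name in columns:
            problems.append(f"non-feature column '{name}' still present")
    return problems


# Made-up column lists: one untouched new-format header, one cleaned up.
raw_problems = check_columns(["id", "era", "data_type", "feature1", "target"])
clean_problems = check_columns(["feature1", "feature2"])
```

An empty list means none of the known non-feature columns are left in; it does not prove the rest of the pipeline is correct.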

If you have changed those things and are still asking for advice, then you need to provide a link to the file you’re actually using, not some outdated file that doesn’t match what you’re using.