Upload pkl predictions

Hi people!

I am trying to upload a prediction function using cloudpickle. It works fine if all the code is in the same file, but when I use a folder structure, the pkl uploads but cannot be unpickled correctly because it is missing modules. My main file looks like:

import pandas as pd
from my_module import my_func

def predict(live_features: pd.DataFrame) -> pd.DataFrame:
    # model is loaded earlier in the file
    features = my_func(live_features)
    live_predictions = model.predict(features)
    submission = pd.Series(live_predictions, index=live_features.index)
    return submission.to_frame("prediction")

How should I proceed in this case? Numerai tells me that "my_module" is not found.

Thanks in advance

I think you have to keep all the code in the same file. Maybe someone can prove me wrong?

This is really frustrating :pensive:, I think it’s the first time I’m punished for giving code some structure…

Did someone find a solution, or do we have to copy-paste all related code into a single Python module to make cloudpickle work (and hence model uploads, which is an amazing feature)?

Completely agree, it is frustrating. I did not manage to solve the issue either. In the end I kept a structured codebase for training, evaluation and the rest, but for submission I use a single file that loads the models and the feature list. The only downside is that I have to copy-paste the code that recreates the generated variables, since importing the code does not work, although importing the lists and models does.


It's marked as experimental, but have you tried marking your module to be pickled by value instead of by reference?

import cloudpickle
import my_module
cloudpickle.register_pickle_by_value(my_module)
cloudpickle.dumps(my_module.my_function)  # my_function is pickled by value
cloudpickle.unregister_pickle_by_value(my_module)
cloudpickle.dumps(my_module.my_function)  # my_function is pickled by reference
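Building on that, here is a minimal end-to-end sketch of the same idea applied to the situation above. The `my_module`/`my_func` names mirror the earlier snippet, and the temp-directory setup is only there to make the example self-contained; in a real project `my_module.py` would just live next to your main file.

```python
import os
import sys
import tempfile

import cloudpickle

# Create a throwaway module standing in for my_module, purely so this
# example runs on its own.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "my_module.py"), "w") as f:
    f.write("def my_func(x):\n    return x * 2\n")
sys.path.insert(0, tmpdir)

import my_module

# Register the module so its contents are serialized into the pickle
# itself, instead of being looked up by import on the other side.
cloudpickle.register_pickle_by_value(my_module)
payload = cloudpickle.dumps(my_module.my_func)

# Simulate the server side, where my_module does not exist: drop it from
# sys.modules and sys.path, then unpickle with the plain pickle module.
del sys.modules["my_module"]
sys.path.remove(tmpdir)

import pickle
restored = pickle.loads(payload)
result = restored(3)  # the by-value copy works without my_module installed
```

Note that `register_pickle_by_value` must be called before `cloudpickle.dumps`, and it is still flagged as experimental in the cloudpickle docs, so behaviour may change between versions.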