I wrote a post discussing the same topic.
I also tried a differentiable Spearman’s loss on my PyTorch MLP, and it performed horribly. I was also thinking of trying an approach like the one you proposed, using fast-soft-sort, but I am stuck writing the custom loss function for boosted models. I don’t know how to define the gradient and the Hessian from the fast-soft-sort code.
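
For illustration, here is a minimal sketch of what a (gradient, Hessian) pair for a soft Spearman objective might look like. It does not use fast-soft-sort: it swaps in a simpler pairwise-sigmoid soft rank so that the gradient can be written by hand in NumPy. The function names, the `tau` temperature, and the constant Hessian of ones are all assumptions made for this sketch. The true Hessian of a rank-based loss is dense rather than per-sample, so a constant is only a common workaround that turns each boosting step into a plain gradient step.

```python
import numpy as np


def soft_rank(scores, tau=1.0):
    """Smooth ranks r_i = 0.5 + sum_j sigmoid((s_i - s_j) / tau).

    Also returns the pairwise sigmoid matrix, which the gradient reuses.
    """
    diff = (scores[:, None] - scores[None, :]) / tau
    sig = 0.5 * (1.0 + np.tanh(0.5 * diff))  # numerically stable sigmoid
    return 0.5 + sig.sum(axis=1), sig


def _target_ranks(y_true):
    # Hard ranks of the targets (ties are not averaged in this sketch),
    # centered and scaled to unit norm.
    ranks = np.argsort(np.argsort(y_true)).astype(float)
    centered = ranks - ranks.mean()
    return centered / np.linalg.norm(centered)


def soft_spearman_loss(y_true, y_pred, tau=1.0):
    """Negative Pearson correlation between soft ranks and target ranks."""
    t = _target_ranks(np.asarray(y_true, dtype=float))
    r, _ = soft_rank(np.asarray(y_pred, dtype=float), tau)
    rc = r - r.mean()
    return -float(rc @ t / np.linalg.norm(rc))


def soft_spearman_objective(y_true, y_pred, tau=1.0):
    """Return (grad, hess) of soft_spearman_loss w.r.t. y_pred.

    hess is a constant vector of ones: an assumption, not the true Hessian.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    t = _target_ranks(np.asarray(y_true, dtype=float))
    r, sig = soft_rank(y_pred, tau)
    rc = r - r.mean()
    norm = np.linalg.norm(rc)
    corr = rc @ t / norm
    # d(loss)/d(rank): derivative of the negative correlation.
    g_rank = -(t - corr * rc / norm) / norm
    # d(rank_i)/d(score_k) is a graph-Laplacian-like matrix built from
    # the sigmoid derivative, so the chain rule reduces to two products.
    w = sig * (1.0 - sig) / tau
    grad = w.sum(axis=1) * g_rank - w @ g_rank
    hess = np.ones_like(grad)
    return grad, hess
```

The `(y_true, y_pred) -> (grad, hess)` shape is meant to match what the scikit-learn wrappers of LightGBM and XGBoost accept as a callable `objective`, though that is worth checking against the version in use. Two caveats: the pairwise matrix costs O(n²) memory, and the correlation is undefined when all predictions are equal, which is the usual starting point of boosting, so a real implementation would need a guard or a non-constant initial score. With fast-soft-sort itself, the gradient could presumably be obtained by running autograd through its PyTorch op instead of deriving it by hand.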