Distance analysis using Facebook AI Similarity Search (faiss)

I was looking for data points that lie close to each other across the training, validation and live data. It didn’t work :frowning: but I’m a bit surprised that the validation data isn’t closer to the training data.

It may not be of any help, but here is a Colab notebook: Colab: Distance analysis


It would be interesting to do an intra-era analysis and see whether some eras contain more similar points than others.
Perhaps the live era you selected belongs to a group of eras with large distances between points.
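
To make the intra-era idea concrete, here is a rough sketch of what I have in mind. It is not taken from the notebook: it uses plain NumPy brute force instead of faiss, and the names `features`, `eras`, `mean_intra_nn_distance` and `intra_era_distances` are my own assumptions. For each era it computes the mean Euclidean distance from every point to its nearest other point in the same era.

```python
import numpy as np


def mean_intra_nn_distance(x: np.ndarray) -> float:
    """Mean Euclidean distance from each row of x to its nearest *other* row."""
    # Pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d2 = np.maximum(d2, 0.0)      # clip tiny negatives caused by rounding
    np.fill_diagonal(d2, np.inf)  # a point must not match itself
    return float(np.sqrt(d2.min(axis=1)).mean())


def intra_era_distances(features: np.ndarray, eras: np.ndarray) -> dict:
    """Map each era label to its mean nearest-neighbour distance.

    `features` is an (n_rows, n_features) array and `eras` an (n_rows,)
    array of era labels; both names are hypothetical, not from the notebook.
    """
    return {
        era: mean_intra_nn_distance(features[eras == era])
        for era in np.unique(eras).tolist()
    }
```

Eras with a small value are "dense" (many similar points), eras with a large value are spread out. The brute-force distance matrix only scales to a few thousand rows per era; for more rows, a faiss index such as `faiss.IndexFlatL2` could take its place.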

If I read the notebook correctly, you compared the whole training data to the whole validation data.
Would it not be more interesting to compare each era of the validation set to each era of the training set?
If that let you determine the training era closest to a given validation (or, later, live) era, you could choose which per-era training model to use on the live data.
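
A minimal sketch of that per-era comparison, under my own assumptions: NumPy brute force rather than faiss, "closeness" measured as the mean distance from each query row to its nearest neighbour in a training era, and hypothetical names `train_x`, `train_eras`, `query_x`, `mean_nn_distance` and `closest_training_era`.

```python
import numpy as np


def mean_nn_distance(queries: np.ndarray, base: np.ndarray) -> float:
    """Mean Euclidean distance from each query row to its nearest row in base."""
    q2 = (queries ** 2).sum(axis=1)[:, None]
    b2 = (base ** 2).sum(axis=1)[None, :]
    # Squared distances, clipped at zero to absorb rounding error
    d2 = np.maximum(q2 + b2 - 2.0 * (queries @ base.T), 0.0)
    return float(np.sqrt(d2.min(axis=1)).mean())


def closest_training_era(train_x: np.ndarray, train_eras: np.ndarray,
                         query_x: np.ndarray):
    """Return (best_era, scores) for one validation or live era.

    `scores` maps every training era to the mean nearest-neighbour distance
    from the query rows to that era; `best_era` has the smallest score.
    """
    scores = {
        era: mean_nn_distance(query_x, train_x[train_eras == era])
        for era in np.unique(train_eras).tolist()
    }
    best_era = min(scores, key=scores.get)
    return best_era, scores
```

You would call `closest_training_era` once per validation era (or on the live era) and look at how clearly the best score separates from the rest. If all training eras score about the same, picking a per-era model this way probably won't help.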
