Hi, first of all I’m just getting started on this so I’m still learning.
I just wanted to know what good values for the Sharpe ratio and validation correlation are. I understand that higher is better for both, but what is a realistic value? For example, my current model has a validation correlation of 0.0717 and a validation Sharpe of 2.4858. Are these values good, bad, or average? (I have another model that I’m sure cannot be right, with a validation correlation of 0.8 and a Sharpe of 54. Is this overfit? Both models were trained on the same data; the only change was the max depth in the XGBRegressor, from 3 to 10.)
Thanks in advance for any help
If you have held out a test set and got similar Sharpe and correlation on that, then I’d say those are excellent. In the end, all that matters is live performance, and if you can get positive Sharpe and mean correlation consistently, I’d say that’s good too. Ideally your predictions would also differ from the meta model and from what others have already discovered.
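To make "Sharpe and mean correlation" concrete, here is a rough sketch of how they could be computed per era on a held-out set. It assumes the Sharpe figure is the mean of the per-era correlations divided by their standard deviation, and it uses plain Pearson correlation with a population standard deviation; the tournament's official scoring may differ (for example by ranking predictions first). The function names are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def per_era_correlation(eras, preds, targets):
    """Group rows by era label and score each era separately."""
    groups = defaultdict(lambda: ([], []))
    for era, pred, target in zip(eras, preds, targets):
        groups[era][0].append(pred)
        groups[era][1].append(target)
    return {era: pearson(p, t) for era, (p, t) in groups.items()}


def sharpe(era_scores):
    """Assumed definition: mean per-era correlation / std of per-era correlation."""
    values = list(era_scores.values())
    return mean(values) / pstdev(values)
```

Scoring each era separately and then comparing the held-out result with the training result is what shows whether a number like 0.8 correlation survives on data the model has not seen.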
I’m using the train_test_split method to create a training set and a testing set from both the training data and the tournament validation data using a test size of 0.3 so hopefully that means my numbers are alright?
thanks for your help
Instead of treating all the rows as one dataset, you should split based on era. Each era is a single time period of a month, so if rows from the same era end up in both your training set and your test set, information can leak between them and your validation scores may look better than they really are. Check out the example notebooks in the GitHub repo: https://github.com/numerai/example-scripts
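As a rough illustration of splitting by era rather than by row, the sketch below holds out whole eras. It is not taken from the example scripts; the function name `split_rows_by_era` and the toy era labels are made up, and it only assumes you have one era label per row.

```python
import random


def split_rows_by_era(eras, test_size=0.3, seed=0):
    """Return (train_rows, test_rows) row positions with no era in both sets."""
    unique_eras = sorted(set(eras))
    rng = random.Random(seed)
    rng.shuffle(unique_eras)

    # Hold out a fraction of the eras, not a fraction of the rows.
    n_test = max(1, round(len(unique_eras) * test_size))
    test_eras = set(unique_eras[:n_test])

    train_rows = [i for i, era in enumerate(eras) if era not in test_eras]
    test_rows = [i for i, era in enumerate(eras) if era in test_eras]
    return train_rows, test_rows


# Hypothetical toy data: 10 eras with 5 rows each.
toy_eras = [f"era{i}" for i in range(10) for _ in range(5)]
train_rows, test_rows = split_rows_by_era(toy_eras, test_size=0.3)
```

With a pandas DataFrame the returned positions could be passed to `df.iloc[...]`. Since eras are time periods, holding out the most recent eras instead of a random subset is another reasonable choice.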
I didn’t realise that, thanks for the heads up!