I am trying to predict stock prices using Random Forest Regression (RFR), and I'm using scikit-learn's RandomForestRegressor like this:
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor(random_state=200, oob_score=True, max_features='sqrt')
I am using r2_score on the test set's true values and the model's predictions to get an idea of the model's accuracy. However, I always get an R² above 0.9, which seems suspiciously high to me.
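To make the setup concrete, here is a minimal sketch of the evaluation described above. The data is a synthetic stand-in from make_regression and the 75/25 train_test_split is an assumption, since neither the real features nor the actual split are shown in this post. It reports R² on the training set, on the test set, and from the out-of-bag samples so the three can be compared.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): the real stock features are not shown.
X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)

# Hold out 25% of the rows as a test set (assumed split ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same settings as in the question.
rfr = RandomForestRegressor(random_state=200, oob_score=True, max_features='sqrt')
rfr.fit(X_train, y_train)

# R² on data the model has seen, on held-out data, and on out-of-bag samples.
r2_train = r2_score(y_train, rfr.predict(X_train))
r2_test = r2_score(y_test, rfr.predict(X_test))
r2_oob = rfr.oob_score_

print(f"train R2: {r2_train:.3f}")
print(f"test  R2: {r2_test:.3f}")
print(f"OOB   R2: {r2_oob:.3f}")
```

A training R² that sits well above the test and out-of-bag values would point to overfitting; if all three are similarly high, the high score probably has a different cause.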
My data set has 30k data points. The only parameter I change is random_state. My goal with this post is to find out which model parameter is effective at controlling overfitting.
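As a rough sketch of what I mean by a parameter that controls overfitting: random_state only fixes the random seed, whereas RandomForestRegressor parameters such as min_samples_leaf, max_depth, and max_features limit how closely each tree can fit the training data. The loop below varies min_samples_leaf and compares training R² with out-of-bag R². The data is again a synthetic stand-in and the candidate values (1, 5, 20) are arbitrary choices for illustration, not recommendations.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption): replace with the real features/target.
X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)

scores = {}
for leaf in (1, 5, 20):  # arbitrary illustrative values
    model = RandomForestRegressor(
        random_state=200,
        oob_score=True,
        max_features='sqrt',
        min_samples_leaf=leaf,  # larger leaves give smoother, less flexible trees
    )
    model.fit(X, y)
    # score() returns R² for regressors; oob_score_ is R² on out-of-bag samples.
    scores[leaf] = (model.score(X, y), model.oob_score_)

for leaf, (r2_train, r2_oob) in scores.items():
    print(f"min_samples_leaf={leaf:>2}  train R2={r2_train:.3f}  OOB R2={r2_oob:.3f}")
```

The setting where the gap between training and out-of-bag R² shrinks without the out-of-bag score dropping much would be the one to look at more closely.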