I have been working in machine learning for several years, but I'm new here. First, I tried a few models:
Logistic Regression: just as in the example
Linear SVC: seemed no better than LR
Kernel SVC (poly/rbf): very slow to train; I didn't wait for it to finish
Random Forest: not effective and prone to overfitting
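For what it's worth, this is roughly how I'd compare those baselines fairly, with cross-validation on the same splits. This is just a sketch on a synthetic stand-in dataset (`make_classification`); swap in your real `X, y`.

```python
# Compare baseline classifiers with 5-fold cross-validation.
# The dataset here is a synthetic placeholder, not the real task data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=30, n_informative=10,
                           random_state=0)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "LinearSVC": LinearSVC(),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
}

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    results[name] = scores.mean()
    print(f"{name}: {results[name]:.3f} (+/- {scores.std():.3f})")
```

Using the same CV folds for every model makes "seems no better than LR" a measurable claim instead of an impression.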
Then I tried XGBoost and got a better result than before (a small but consistent improvement). I believe it can be improved further with hyperparameter tuning.
I haven't tried ensembling yet.
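If anyone is curious what that next step could look like: a soft-voting ensemble over models you already have is cheap to try. A sketch with hypothetical member models on a synthetic stand-in dataset:

```python
# Soft-voting ensemble of two already-tried models (illustrative choice).
# Dataset is a synthetic placeholder; replace with the real X, y.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=30, n_informative=10,
                           random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ],
    voting="soft",  # average predicted probabilities instead of hard votes
)
score = cross_val_score(ensemble, X, y, cv=5).mean()
print(f"ensemble CV accuracy: {score:.3f}")
```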
I took a break and switched to deep learning: a feedforward neural network with 2 hidden layers. But it seems hard to train, and I couldn't get a model that was clearly better than a random classifier. My plan was to make the model overfit first and then use dropout to reduce the overfitting, but I can't even get it to overfit! The model also seems not to converge when I increase the number of hidden layers to 3. I wonder: is it that I didn't tune the model well, or is a deep model simply not suitable here? Or is XGBoost just the better choice for this kind of problem?
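The "overfit first, then regularize" workflow above can be sketched like this, assuming PyTorch is installed. The data and architecture sizes here are toy placeholders: dropout is set to 0.0 while checking that the net can memorize the training set, and only raised (e.g. to 0.5) once training loss goes near zero. If even this fails on your real data, the usual suspects are unscaled inputs or a too-high learning rate.

```python
# "Overfit first" sanity check for a 2-hidden-layer MLP (PyTorch assumed).
# X, y are a toy synthetic set standing in for the real task data.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 30)
y = (X[:, 0] + X[:, 1] > 0).long()  # toy, learnable target

def make_mlp(p_drop=0.0):
    return nn.Sequential(
        nn.Linear(30, 64), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(64, 2),
    )

model = make_mlp(p_drop=0.0)  # no dropout yet: first prove it CAN overfit
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

train_acc = (model(X).argmax(dim=1) == y).float().mean().item()
print(f"final train loss {loss.item():.4f}, train accuracy {train_acc:.3f}")
```

If the training accuracy here stays near 50%, the problem is the training setup (learning rate, input scaling, initialization), not a lack of regularization; dropout only enters after this check passes.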
I would really appreciate it if anyone could share their experience with this kind of problem.