How computationally intensive is model making? Do you use third-party cloud computing services? Which ones, and what are the costs and times?
It depends entirely on your model. A simple Logistic Regression can be run on a fairly standard laptop within a few seconds; more complex models need more horsepower.
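To put a rough number on the "few seconds" claim, here is a minimal sketch that times a plain gradient-descent logistic regression on a synthetic dataset of tournament-like size. Everything here (the 50k-row dataset, the feature count, the training loop) is made up for illustration; it is not the tournament data or anyone's actual pipeline.

```python
import time
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a tournament-style dataset: 50k rows, 20 features.
X = rng.normal(size=(50_000, 20))
true_w = rng.normal(size=20)
y = (X @ true_w + rng.normal(scale=0.5, size=50_000) > 0).astype(float)

def train_logreg(X, y, lr=0.1, epochs=50):
    """Plain batch gradient descent on the logistic loss."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))       # sigmoid of the linear score
        w -= lr * X.T @ (p - y) / len(y)          # gradient step on log loss
    return w

start = time.perf_counter()
w = train_logreg(X, y)
elapsed = time.perf_counter() - start

# Training accuracy, just as a sanity check on the fit.
acc = (((X @ w) > 0).astype(float) == y).mean()
print(f"trained in {elapsed:.2f}s, train accuracy {acc:.3f}")
```

On any recent laptop this finishes in well under a second, which is why linear models are essentially free compared to deep learning.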
I’ve not looked into cloud computing solutions, but I suspect you would have to be very good for the winnings from a round or two to offset the costs.
As @themicon stated, LR can finish running in a few seconds. However, running Deep Learning models is another story.
A single round of training, validation, and test submission takes around 20 minutes on my computer using PyTorch.
My specs are as follows:
Intel® Core™ i7–5930K CPU @ 3.50GHz, 64GB RAM, 1 GPU GeForce GTX 1080 with CUDA v. 8.0 (driver 375.20) and cuDNN (v. 5005) running on Linux.
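For readers wondering what a PyTorch training round looks like in miniature, here is a heavily stripped-down sketch: a tiny synthetic binary-classification task, a small feed-forward net, and a full-batch training loop that moves to the GPU when one is available. All names, sizes, and hyperparameters are invented for the example; a real 20-minute round would involve far more data, batching, validation, and a submission step.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Tiny synthetic stand-in for the real dataset (sizes are made up).
X = torch.randn(4096, 50)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).float().unsqueeze(1)

model = nn.Sequential(nn.Linear(50, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

# Move to GPU when available -- this is where a GTX 1080 earns its keep.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, X, y = model.to(device), X.to(device), y.to(device)

first_loss = last_loss = None
for epoch in range(20):
    opt.zero_grad()
    loss = loss_fn(model(X), y)   # forward pass + loss
    loss.backward()               # backprop
    opt.step()                    # parameter update
    if first_loss is None:
        first_loss = loss.item()
    last_loss = loss.item()

print(f"loss {first_loss:.3f} -> {last_loss:.3f}")
```

The structure (forward, backward, step, repeated per epoch) is the same whether the round takes two seconds or twenty minutes; the difference is data volume and model size.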
Clustering and similar methods take a really long time. prcomp in Microsoft R Open takes about 0.1 seconds. I use h2o for most of my models, and the slowest part is writing the data over to Java. Even with the new data size, the longest-running model I have takes ~10 minutes, and only because it runs one of each of the popular ML methods and then finds optimal weights for a weighted sum of them. I could do this in parallel and it would take maybe 3 minutes, but we have all week, so there's no need.
I am actually doing this stuff on a newer i5. My point is, at this stage I don't know how necessary cloud is.
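The "optimal weights for a weighted sum" step mentioned above can be sketched without h2o: given held-out predictions from several fitted models, search the weight simplex (non-negative weights summing to one) for the combination that minimizes validation error. The three "models" below are just truth plus different noise levels, and the coarse grid search is a stand-in for whatever optimizer h2o actually uses; it is an illustration, not the poster's pipeline.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical validation-set predictions from three fitted models,
# simulated as the true target plus increasing amounts of noise.
y_true = rng.random(1000)
preds = np.stack([
    y_true + rng.normal(scale=s, size=1000)
    for s in (0.05, 0.10, 0.20)
])

# Coarse grid search over the weight simplex (weights >= 0, sum to 1).
best_w, best_mse = None, np.inf
steps = np.linspace(0.0, 1.0, 21)
for w0, w1 in itertools.product(steps, steps):
    if w0 + w1 > 1.0:
        continue
    w = np.array([w0, w1, 1.0 - w0 - w1])
    mse = np.mean((w @ preds - y_true) ** 2)
    if mse < best_mse:
        best_w, best_mse = w, mse

print("weights:", best_w, "mse:", best_mse)
```

Because averaging uncorrelated errors cancels some noise, the best blend typically beats even the single best model, which is why this kind of ensemble is worth its extra runtime.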