Clustering and similar methods take a really long time. prcomp in Microsoft R Open takes about 0.1 seconds. I use h2o for most of the models, and the slowest part is writing the data over to Java. Even with the new data size, the longest-running model I have takes ~10 minutes, and only because it fits one each of all the favorite ML methods and then finds optimal weights for a weighted sum. I could do this in parallel and it would take about 3 minutes, but we have all week, so there's no need.
I am actually doing this stuff on a newer i5. My point is, at this stage I don't know how necessary the cloud is.
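
For anyone wondering what "finds optimal weights for a weighted sum" means for the 10-minute model above, here is a rough sketch in plain Python. It is not my actual h2o code: the function names (`mse`, `blend`, `best_weights`), the toy predictions, and the grid search itself are all made up for illustration, and a real setup would fit the weights on held-out predictions with a proper optimizer.

```python
from itertools import product


def mse(pred, y):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


def blend(preds, weights):
    """Weighted sum of per-model predictions (one list per model)."""
    n = len(preds[0])
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(n)]


def best_weights(preds, y, steps=10):
    """Grid-search non-negative weights that sum to 1 and minimize MSE.

    Hypothetical helper: tries every weight vector on a grid with
    spacing 1/steps and keeps the one with the lowest error.
    """
    k = len(preds)
    best = None
    for combo in product(range(steps + 1), repeat=k - 1):
        if sum(combo) > steps:
            continue  # the last weight would be negative
        weights = [c / steps for c in combo]
        weights.append((steps - sum(combo)) / steps)
        err = mse(blend(preds, weights), y)
        if best is None or err < best[0]:
            best = (err, weights)
    return best[1], best[0]


# Toy data (invented): two models with opposite bias, one constant guess.
y = [1.0, 2.0, 3.0, 4.0]
model_a = [1.2, 2.2, 3.2, 4.2]
model_b = [0.8, 1.8, 2.8, 3.8]
model_c = [2.0, 2.0, 2.0, 2.0]

weights, err = best_weights([model_a, model_b, model_c], y)
print(weights, err)
```

The grid search is only practical for a handful of models, since the number of weight combinations grows quickly with each one you add, but it shows the idea: the base models are fit once, and the weighting step on top is cheap compared to fitting them.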