Map-Reduce for Machine Learning on Multicore
Cheng T. Chu, Sang K. Kim, Yi A. Lin, Yuanyuan Yu, Gary R. Bradski, Andrew Y. Ng, Kunle Olukotun
In NIPS (2006), pp. 281-288.
http://www.jaso.co.kr/attachment/1263039360.pdf

ABSTRACT: We are at the beginning of the multicore era. Computers will have increasingly many cores (processors), but there is still no good programming framework for these architectures, and thus no simple and unified way for machine learning to take advantage of the potential speed up. In this paper, we develop a broadly applicable parallel programming method, one that is easily applied to many different learning algorithms. Our work is in distinct contrast to the tradition in machine learning of designing (often ingenious) ways to speed up a single algorithm at a time. Specifically, we show that algorithms that fit the Statistical Query model [15] can be written in a certain “summation form,” which allows them to be easily parallelized on multicore computers. We adapt Google’s map-reduce [7] paradigm to demonstrate this parallel speed up technique on a variety of learning algorithms including locally weighted linear regression (LWLR), k-means, logistic regression (LR), naive Bayes (NB), SVM, ICA, PCA, gaussian discriminant analysis (GDA), EM, and backpropagation (NN). Our experimental results show basically linear speedup with an increasing number of processors.
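To give a feel for the paper's "summation form" idea, here is a minimal sketch (not the authors' code): ordinary least-squares regression solves the normal equations, where both X^T X and X^T y are sums over training examples. Each map task can therefore compute partial sums over its shard of the data, and a reduce step adds them before the final solve. The function names and the sequential `map` (standing in for a pool of worker cores) are illustrative assumptions.

```python
# Sketch of the "summation form" map-reduce pattern for linear regression.
# Each shard's sufficient statistics (X^T X, X^T y) are computed independently
# (the "map" phase), then summed (the "reduce" phase) before solving.
import numpy as np

def map_partial_sums(shard):
    """Map phase: partial sufficient statistics for one data shard."""
    X, y = shard
    return X.T @ X, X.T @ y

def reduce_sums(partials):
    """Reduce phase: add the per-shard partial sums."""
    partials = list(partials)
    A = sum(p[0] for p in partials)
    b = sum(p[1] for p in partials)
    return A, b

def fit_linear_regression(X, y, num_shards=4):
    # A real multicore implementation would dispatch each shard to a worker;
    # a sequential map suffices to show the decomposition.
    shards = zip(np.array_split(X, num_shards), np.array_split(y, num_shards))
    A, b = reduce_sums(map(map_partial_sums, shards))
    return np.linalg.solve(A, b)
```

Because the partial sums commute, the result is identical to the single-core solve regardless of how the data is sharded, which is what makes the pattern applicable across the algorithms listed in the abstract.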
