Efficient Mini-Batch Training for Stochastic Optimization
Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’14), August 2014.
Mu Li (1,2), Tong Zhang (2,3), Yuqiang Chen (2), Alexander J. Smola (1,4)
(1) Carnegie Mellon University
(2) Baidu, Inc.
(3) Rutgers University
(4) Google, Inc.
Stochastic gradient descent (SGD) is a popular technique for large-scale optimization problems in machine learning. In order to parallelize SGD, minibatch training needs to be employed to reduce the communication cost. However, an increase in minibatch size typically decreases the rate of convergence. This paper introduces a technique based on approximate optimization of a conservatively regularized objective function within each minibatch. We prove that the convergence rate does not decrease with increasing minibatch size. Experiments demonstrate that with suitable implementations of approximate optimization, the resulting algorithm can outperform standard SGD in many scenarios.
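The abstract does not give the exact form of the per-minibatch objective, so the following is only a rough sketch of the idea under stated assumptions. It assumes the "conservatively regularized" objective is the minibatch loss plus a proximal penalty (gamma/2)·||w − w_prev||² that keeps the new iterate close to the previous one, and that "approximate optimization" means a few inner gradient steps. The least-squares loss, the function names (`minibatch_grad`, `conservative_step`, `train`), and all parameter values are hypothetical choices for illustration and are not taken from the paper.

```python
import random

def minibatch_grad(w, batch):
    """Gradient of the mean of 0.5 * (w.x - y)^2 over one minibatch."""
    g = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            g[j] += err * xj / len(batch)
    return g

def conservative_step(w_prev, batch, gamma, inner_steps=5, lr=0.1):
    """Approximately minimize  loss_batch(w) + (gamma / 2) * ||w - w_prev||^2.

    Assumption: a handful of plain gradient steps stands in for the
    'approximate optimization' mentioned in the abstract.
    """
    w = list(w_prev)
    for _ in range(inner_steps):
        g = minibatch_grad(w, batch)
        # Gradient of the proximal term is gamma * (w - w_prev).
        w = [wi - lr * (gi + gamma * (wi - pi))
             for wi, gi, pi in zip(w, g, w_prev)]
    return w

def train(data, dim, batch_size=32, gamma=1.0, epochs=20, seed=0):
    """Run one conservative update per minibatch, reshuffling each epoch."""
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not reordered
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(data)
        for i in range(0, len(data), batch_size):
            w = conservative_step(w, data[i:i + batch_size], gamma)
    return w
```

In this sketch a larger `gamma` makes each update more conservative, while `inner_steps` controls how much work is done on a minibatch before the next one is drawn; in a parallel setting that inner work is where communication would be saved.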
FULL PAPER: pdf