ISTC-CC Abstract
Scaling Distributed Machine Learning with the Parameter Server
Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI’14), October 2014.
Mu Li*^, David G. Andersen*, Jun Woo Park*, Alexander J. Smola*†, Amr Ahmed†, Vanja Josifovski†, James Long†, Eugene J. Shekita†, Bor-Yiing Su†
* Carnegie Mellon University
^ Baidu
† Google
We propose a parameter server framework for distributed machine learning problems. Both data and workloads are distributed over worker nodes, while the server nodes maintain globally shared parameters, represented as dense or sparse vectors and matrices. The framework manages asynchronous data communication between nodes, and supports flexible consistency models, elastic scalability, and continuous fault tolerance.
To demonstrate the scalability of the proposed framework, we show experimental results on petabytes of real data with billions of examples and parameters on problems ranging from Sparse Logistic Regression to Latent Dirichlet Allocation and Distributed Sketching.
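The division of labor described in the abstract (workers hold the data, server nodes hold the shared parameters as sparse vectors) can be illustrated with a minimal single-process sketch. This is not the paper's API: the class names, the `pull`/`push` method names, the modulo key partitioning, the learning rate, and the toy data are all assumptions made for illustration, and the sketch leaves out asynchronous communication, the consistency models, and fault tolerance entirely.

```python
import math
from collections import defaultdict


class ServerNode:
    """Holds one shard of the globally shared parameters as a sparse vector."""

    def __init__(self):
        self.weights = defaultdict(float)  # key -> value; missing keys read as 0.0

    def pull(self, keys):
        return {k: self.weights[k] for k in keys}

    def push(self, grads, lr):
        # Apply a gradient-descent update to the keys this node owns.
        for k, g in grads.items():
            self.weights[k] -= lr * g


class ParameterServer:
    """Routes each key to a server node (hypothetical modulo partitioning)."""

    def __init__(self, num_servers):
        self.servers = [ServerNode() for _ in range(num_servers)]

    def _owner(self, key):
        return self.servers[key % len(self.servers)]

    def pull(self, keys):
        out = {}
        for k in keys:
            out.update(self._owner(k).pull([k]))
        return out

    def push(self, grads, lr=0.5):
        for k, g in grads.items():
            self._owner(k).push({k: g}, lr)


def worker_step(ps, examples):
    """One pass of a worker over its local data shard: pull, compute, push."""
    for features, label in examples:  # features: {key: value}, label: 0 or 1
        w = ps.pull(features.keys())  # fetch only the keys this example touches
        z = sum(w[k] * v for k, v in features.items())
        p = 1.0 / (1.0 + math.exp(-z))  # logistic regression prediction
        grads = {k: (p - label) * v for k, v in features.items()}
        ps.push(grads)


def train(num_epochs=50):
    ps = ParameterServer(num_servers=2)
    # Toy data, split across two workers (made up for this sketch).
    shards = [
        [({0: 1.0, 3: 1.0}, 1), ({1: 1.0}, 0)],
        [({0: 1.0}, 1), ({1: 1.0, 2: 1.0}, 0)],
    ]
    for _ in range(num_epochs):
        for shard in shards:  # workers take turns here; real workers run concurrently
            worker_step(ps, shard)
    return ps
```

Calling `train()` returns the parameter server, and `ps.pull([0, 1])` then reads back the learned weights for keys 0 and 1. Each worker only ever pulls the keys its own examples touch, which is what makes sparse parameter vectors cheap to share in this style of design.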