INTEL SCIENCE & TECHNOLOGY CENTER

CLOUD COMPUTING

ISTC-CC NEWSLETTER

The ISTC-CC Update 2016 - NEW!

The ISTC-CC Update 2015

The ISTC-CC Update 2014

RESEARCH HIGHLIGHTS

Ling Liu's SC13 paper "Large Graph Processing Without the Overhead" featured by HPCwire.

ISTC-CC provides a listing of useful benchmarks for cloud computing.

Another list highlighting Open Source Software Releases.

Second GraphLab workshop should be even bigger than the first! GraphLab is a new programming framework for graph-style data analytics.

Open-source Spark framework makes iterative and interactive data analytics FAST, both to run and to write.

ISTC-CC Abstract

Managed Communication and Consistency for Fast Data-Parallel Iterative Analytics

ACM Symposium on Cloud Computing 2015. Aug. 27 - 29, 2015, Kohala Coast, HI.
BEST PAPER!

Jinliang Wei, Wei Dai, Aurick Qiao, Qirong Ho*, Henggang Cui, Gregory R. Ganger, Phillip B. Gibbons†, Garth A. Gibson, Eric P. Xing

Carnegie Mellon University
* Institute for Infocomm Research, A*STAR
† Intel Labs

At the core of Machine Learning (ML) analytics applied to Big Data is often an expert-suggested model, whose parameters are refined by iteratively processing a training dataset until convergence. The completion time (i.e. convergence time) and quality of the learned model not only depends on the rate at which the refinements are generated but also the quality of each refinement. While data-parallel ML applications often employ a loose consistency model when updating shared model parameters to maximize parallelism, the accumulated error may seriously impact the quality of refinements and thus delay completion time, a problem that usually gets worse with scale. Although more immediate propagation of updates reduces the accumulated error, this strategy is limited by physical network bandwidth. Additionally, the performance of the widely used stochastic gradient descent (SGD) algorithm is sensitive to initial step size, simply increasing communication without adjusting the step size value accordingly fails to achieve optimal performance.

This paper presents Bösen, a system that maximizes the network communication efficiency under a given intermachine network bandwidth budget to minimize accumulated error, while ensuring theoretical convergence guarantees for large-scale data-parallel ML applications. Furthermore, Bösen prioritizes messages that are most significant to algorithm convergence, further enhancing algorithm convergence. Finally, Bösen is the first distributed implementation of the recently presented adaptive revision algorithm, which provides orders of magnitude improvement over a carefully tuned fixed schedule of step size refinements. Experiments on two clusters with up to 1024 cores show that our mechanism significantly improves upon static communication schedules.

FULL PAPER: pdf