GRASS: Trimming Stragglers in Approximation Analytics
11th USENIX Symposium on Networked Systems Design and Implementation (NSDI'14), April 2014.
Ganesh Ananthanarayanan, Michael Chien-Chun Hung*, Xiaoqi Ren^, Ion Stoica, Adam Wierman^, Minlan Yu*
University of California, Berkeley
*University of Southern California
^California Institute of Technology
In big data analytics, timely results, even if based on only part of the data, are often good enough. For this reason, approximation jobs, which have deadline or error bounds and require only a subset of their tasks to complete, are projected to dominate big data workloads. Straggler tasks are an important hurdle when designing approximate data analytics frameworks, and the widely adopted approach to dealing with them is speculative execution. In this paper, we present GRASS, which carefully uses speculation to mitigate the impact of stragglers in approximation jobs. The design of GRASS is based on a first-principles analysis of the impact of speculative copies. GRASS delicately balances the immediacy of improving the approximation goal with the long-term implications of using extra resources for speculation. Evaluations with production workloads from Facebook and Microsoft Bing in an EC2 cluster of 200 nodes show that GRASS increases the accuracy of deadline-bound jobs by 47% and speeds up error-bound jobs by 38%. GRASS’s design also speeds up exact computations, making it a unified solution for straggler mitigation.
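To make the idea of speculation in a deadline-bound job concrete, here is a toy sketch. It is not GRASS's algorithm, which the abstract does not spell out. It assumes each task comes with an estimated remaining time for its running copy and an estimated duration for a fresh copy, and it spends a limited number of free slots on the stragglers that gain the most. The names `Task`, `should_speculate`, and `tasks_done_by_deadline` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    remaining: float  # assumed estimate: seconds left for the running copy
    fresh: float      # assumed estimate: seconds a newly launched copy would take


def should_speculate(task: Task) -> bool:
    """A duplicate is worthwhile only if it is expected to finish sooner."""
    return task.fresh < task.remaining


def tasks_done_by_deadline(tasks, deadline, free_slots):
    """Count tasks expected to finish by the deadline.

    Free slots go to the stragglers with the largest expected time saving.
    This is a simple greedy rule chosen for illustration only.
    """
    candidates = sorted(
        (t for t in tasks if should_speculate(t)),
        key=lambda t: t.remaining - t.fresh,
        reverse=True,
    )
    speculated = {t.name for t in candidates[:free_slots]}

    done = 0
    for t in tasks:
        # A speculated task finishes when its faster copy finishes.
        finish = min(t.remaining, t.fresh) if t.name in speculated else t.remaining
        if finish <= deadline:
            done += 1
    return done


# Made-up estimates: "b" and "d" are stragglers.
example_tasks = [
    Task("a", remaining=5, fresh=4),
    Task("b", remaining=30, fresh=6),
    Task("c", remaining=8, fresh=9),
    Task("d", remaining=50, fresh=7),
]

for slots in (0, 1, 2):
    print(slots, tasks_done_by_deadline(example_tasks, deadline=10, free_slots=slots))
```

In this toy setting, each extra slot spent on a straggler raises the number of tasks that complete before the deadline, which is the accuracy measure for a deadline-bound job. The sketch ignores the cost side of the trade-off the abstract describes, namely that slots used for speculation are unavailable to other tasks.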
FULL PAPER: pdf