Ling Liu's SC13 paper "Large Graph Processing Without the Overhead" featured by HPCwire.
Another list highlighting Open Source Software Releases.
Second GraphLab workshop should be even bigger than the first! GraphLab is a new programming framework for graph-style data analytics.
Toward Combining Online & Offline Management of Big Data Applications
MBDS Track, ACM International Conference on Autonomic Computing (ICAC’14), June 2014.
Brian Laub, Chengwei Wang, Karsten Schwan, Chad Huneycutt
Georgia Institute of Technology
Traditional data center monitoring systems focus on collecting basic metrics such as CPU and memory usage, in a centralized location, giving administrators a summary of global system health via a database of observations. Conversely, emerging research systems are focusing on scalable, distributed monitoring capable of quickly detecting and alerting administrators to anomalies. This paper outlines VStore, a system that seeks to combine fast online anomaly detection with offline storage and analysis of monitoring data. VStore can be used as a historical reference to help guide administrators towards quickly classifying and fixing anomalous behavior once a problem has been detected. We demonstrate this idea with a distributed big streaming data application, and explore three common fault scenarios in this application. We show that each scenario exhibits a slightly different monitoring history, which may be undetectable by online algorithms that are resource-constrained. We also offer a discussion of how historical data captured by VStore can be combined with online monitoring tools to improve troubleshooting efforts in the data center.
FULL PAPER: pdf