ISTC-CC Abstract
Kineograph: Taking the Pulse of a Fast-Changing and Connected World
EuroSys'12. Bern, Switzerland, April 10-13, 2012.
Raymond Cheng*† Ji Hong*‡ Aapo Kyrola*^ Youshan Miao*§ Xuetian Weng*~
Ming Wu* Fan Yang* Lidong Zhou* Feng Zhao* Enhong Chen§
*Microsoft Research Asia
†University of Washington
‡Fudan University
^Carnegie Mellon University
§University of Science and Technology of China
~Peking University
Kineograph is a distributed system that takes a stream of incoming data to construct a continuously changing graph, which captures the relationships that exist in the data feed. As a computing platform, Kineograph further supports graph-mining algorithms to extract timely insights from the fast-changing graph structure. To accommodate graph-mining algorithms that assume a static underlying graph, Kineograph creates a series of consistent snapshots, using a novel and efficient epoch commit protocol. To keep up with continuous updates on the graph, Kineograph includes an incremental graph-computation engine. We have developed three applications on top of Kineograph to analyze Twitter data: user ranking, approximate shortest paths, and controversial topic detection. For these applications, Kineograph takes a live Twitter data feed and maintains a graph of edges between all users and hashtags. Our evaluation shows that with 40 machines processing 100K tweets per second, Kineograph is able to continuously compute global properties, such as user ranks, with less than 2.5-minute timeliness guarantees. This rate of traffic is more than 10 times the reported peak rate of Twitter as of October 2011.
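The idea of running static-graph algorithms over a series of consistent snapshots can be illustrated with a small single-process sketch. This is not Kineograph's epoch commit protocol or its API: the class `SnapshotGraph`, the `epoch_size` parameter, the `hashtag_counts` function, and the sample feed are all hypothetical names chosen for illustration, and the sketch assumes an epoch is simply a fixed number of updates. It shows only the general pattern: updates are buffered, applied together at an epoch boundary, and frozen so that a computation sees a graph that does not change underneath it.

```python
from collections import defaultdict


class SnapshotGraph:
    """Toy graph that buffers edge updates and commits them once per epoch.

    Hypothetical illustration only; a real system would distribute both
    the ingest path and the graph storage across machines.
    """

    def __init__(self, epoch_size):
        self.epoch_size = epoch_size   # updates per epoch (assumed knob)
        self.pending = []              # edges received in the current epoch
        self.edges = defaultdict(set)  # committed adjacency: user -> hashtags
        self.snapshots = []            # list of (epoch number, frozen adjacency)

    def ingest(self, user, hashtag):
        """Buffer one user-hashtag edge; commit when the epoch is full."""
        self.pending.append((user, hashtag))
        if len(self.pending) >= self.epoch_size:
            self.commit()

    def commit(self):
        """Apply all buffered edges at once and freeze a read-only snapshot."""
        for user, hashtag in self.pending:
            self.edges[user].add(hashtag)
        self.pending = []
        frozen = {u: frozenset(tags) for u, tags in self.edges.items()}
        self.snapshots.append((len(self.snapshots) + 1, frozen))


def hashtag_counts(snapshot):
    """A static-graph computation: number of distinct users per hashtag."""
    counts = defaultdict(int)
    for tags in snapshot.values():
        for tag in tags:
            counts[tag] += 1
    return dict(counts)


# Made-up sample feed of (user, hashtag) pairs.
feed = [
    ("alice", "#eurosys"),
    ("bob", "#eurosys"),
    ("alice", "#graphs"),
    ("carol", "#graphs"),
    ("bob", "#graphs"),
]

g = SnapshotGraph(epoch_size=2)
for user, tag in feed:
    g.ingest(user, tag)

for epoch, snapshot in g.snapshots:
    print(epoch, hashtag_counts(snapshot))
```

With an epoch size of 2, the five sample updates produce two committed snapshots, and the fifth update stays buffered until the next epoch closes. Because each snapshot is frozen, a computation over an earlier epoch is unaffected by updates that arrive later.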
KEYWORDS: Graph processing, Distributed storage
FULL PAPER: pdf