ISTC-CC Technical Report Abstract
Towards understanding heterogeneous clouds at scale: Google trace analysis
Intel Science and Technology Center for Cloud Computing Technical Report
ISTC-CC-TR-12-101, April 27, 2012.
Charles Reiss*, Alexey Tumanov†, Gregory R. Ganger†, Randy H. Katz*, Michael A. Kozuch^
* UC Berkeley
† Carnegie Mellon University
^ Intel Labs
With the emergence of large, heterogeneous, shared computing clusters, their efficient use by mixed distributed workloads and tenants remains an important challenge. Unfortunately, little data has been available about such workloads and clusters. This paper analyzes a recent Google release of scheduler request and utilization data across a large (12,500+ machine) general-purpose compute cluster over 29 days. We characterize cluster resource requests, their distribution, and the actual resource utilization. Unlike previous scheduler traces we are aware of, this one includes diverse workloads – from large web services to large CPU-intensive batch programs – and permits comparison of actual resource utilization with the user-supplied resource estimates available to the cluster resource scheduler. We observe some under-utilization despite over-commitment of resources, difficulty in scheduling high-priority tasks that specify constraints, and a lack of dynamic adjustments to user allocation requests despite the apparent availability of this feature in the scheduler.
KEYWORDS: cloud computing, cluster scheduling, trace characterization
FULL PAPER: pdf