ISTC–CC February 2012 Status Report
Summary:
The Georgia Tech team is setting up a benchmark repository for use by ISTC researchers, including a Rubbos suite, a real-time Flume/HBase/HDFS multi-tier code with up to 1000 VMs, and a scalable OLIO. The team also made good progress in setting up a Xen-based OpenStack cluster donated by Intel. The CMU-IL SOFTscale project team completed a paper reporting the effectiveness of their new approach for absorbing large (100% or more) spikes in front-end demand in a multi-tier data center, by borrowing resources from the middle tier until additional first-tier servers are powered up. The To the Edge projects also made good progress, leveraging new VM synthesis techniques (the CMU-IL Cloudlets project) and new wide-area consistency protocols over richer data models (the Princeton-CMU-IL COPS2 project). A large number of interactions with other parts of Intel have begun ramping up.
Details:
ISTC Mission: Four inter-related research pillars (themes) architected to create a strong foundation for cloud computing of the future
The research agenda of the ISTC-CC is composed of the following four themes:
- Specialization: Explores specialization as a primary means for order of magnitude improvements in efficiency (e.g., energy), including use of emerging technologies like non-volatile memory and specialized cores.
- Automation: Addresses cloud’s particular automation challenges, focusing on order of magnitude efficiency gains from smart resource allocation/scheduling and greatly improved problem diagnosis capabilities.
- Big Data: Addresses the critical need for cloud computing to extend beyond traditional big data usage (primarily, search) to efficiently and effectively support Big Data analytics, including the continuous ingest, integration, and exploitation of live data feeds (e.g., video or Twitter).
- To the Edge: Explores new frameworks for edge/cloud cooperation that can efficiently and effectively exploit billions of context-aware clients and enable cloud-assisted client applications whose execution spans client devices, edge-local cloud resources, and core cloud resources.
Participants
Academic PI: Greg Ganger (CMU)
Executive Sponsor: Wen Hann Wang (CSR)
Managing Sponsor: Rich Uhlig (CSR-SAL)
Program Director: Jeff Parkhurst (APR)
Intel PI: Phil Gibbons
Intel Researchers: Michael Kaminsky, Mike Kozuch, Babu Pillai
Academic Partners: Dave Andersen, Guy Blelloch, Garth Gibson, Carlos Guestrin, Mor Harchol-Balter, Todd Mowry, Onur Mutlu, Priya Narasimhan, M. Satyanarayanan, and Dan Siewiorek (CMU); Mike Freedman, Kai Li, and Margaret Martonosi (Princeton); Anthony Joseph, Randy Katz, and Ion Stoica (UC Berkeley); Ada Gavrilovska, Ling Liu, Calton Pu, Karsten Schwan, and Sudha Yalamanchili (GA Tech).
Technical highlights
- The Georgia Tech team is setting up a benchmark repository for use by ISTC researchers. A Rubbos benchmark suite in use by Calton Pu's (Georgia Tech) group is used for research on auto-tuning and self-configuration for multi-tier web applications. A real-time Flume/HBase/HDFS multi-tier code with up to 1000 VMs drives research on online problem detection and mitigation for SLA-based codes. For further scaling, we hope to work with Amazon in Summer 2012. A scalable version of OLIO is used for large-scale cloud resource management, in joint work with VMware. Toward this end, the non-scalable MySQL backend of OLIO has been replaced with Cassandra.
- The Georgia Tech team is working hard to set up a Xen-based OpenStack cluster donated by Intel for use in our ISTC research. Current status: Xen/OpenStack now works, after some initial issues. An image server has been acquired and installed, with dedup software currently being tested.
- Anshul Gandhi, Timmy Zhu (CMU grad students), Mor Harchol-Balter (CMU) and Michael Kozuch (IL) completed a paper reporting the effectiveness of leveraging the middle tier of a multi-tier data center to absorb spikes in front-end demand when the first tier has been dynamically scaled down. The results indicate that the technique, called SOFTscale, can absorb load spikes of 100% or more while preserving response time performance without consuming many additional resources.
- Anshul Gandhi (CMU grad student), Mor Harchol-Balter (CMU) and Michael Kozuch (IL) completed a paper investigating the regime of sleep states that are advantageous in data centers under various workloads and data center sizes. The results suggest that sleep states can improve aggregate power-performance even as the data center scale increases, particularly if the job scheduling of the data center is adjusted to maximize the benefits.
- Wyatt Lloyd (Princeton grad student), Michael Kaminsky (IL), Dave Andersen (CMU) and Mike Freedman (Princeton) are continuing to make progress on "COPS2", which includes a new, faster partition-tolerant protocol for get transactions, support for put transactions, and a richer data model (closer to Cassandra's than to simple key-value storage).
- Kiryong Ha (CMU grad student), M. Satyanarayanan (CMU), and Babu Pillai (IL) have been extending and generalizing their ideas on fast offload of processing to cloudlets to cover rapid deployment of a customized VM onto cloud services. Kiryong has been experimenting with applying his VM synthesis techniques to launch a new VM on Amazon EC2 without requiring a time- and bandwidth-consuming upload of the complete image.
- Publications summary for the past month (partial list):
  - 2 papers submitted: to ISSTA '12, PODC '12, KDD '12
  - 3 papers accepted: to ISPASS '12, NOCS '12, SIGCSE '12
  - 1 paper published: in International Journal of High Performance Computing Applications
Schedule of upcoming events and milestones
- June 21-22: The next Open Cirrus Summit will be held in Beijing. The CFP is posted at http://labs.chinamobile.com/cloud/opencirrus/OCsummit12, and the submission deadline is March 2. All are invited to submit short papers. Michael Kozuch is co-PC chair.
- June 26-27: ISTC showcase, to be held June 26 in San Francisco (as part of Research@Intel) and June 27 in Santa Clara.
Sponsor group interaction highlights
- Michael Kaminsky (IL), Dave Andersen (CMU), and Garth Gibson (CMU) met with several members of the AON storage team (under Balint Fleischer). Ted Willke (IL CSR) was also on the call. Andy Rudoff led the discussion, which covered several open issues, challenges, and potential solutions in building next-generation storage technologies and the associated software interfaces. Michael and Dave forwarded draft submissions of two pieces of related work, and the two sides agreed to follow up to explore a potential Intel-CMU collaboration (given the success of last year's collaboration).
- Phil Gibbons and Mike Kozuch participated in the Intel Systems Software Conference (ISSC) in Redondo Beach, CA. Phil gave a short presentation on the ISTC-CC research agenda.
- Participants of the 7th Kavli Futures Symposium, Scalable Energy-Efficient Data Centers and Clouds, UC Santa Barbara, CA, Nov 29, 2011, included Garth Gibson (CMU), Rich Uhlig (Intel), and Dean Klein (Micron). A report will be published soon.
- Mike Kozuch presented a report-out on the Berkeley AMPLab to the Cloud Core team.
- Babu Pillai has been working with Diana Hu and Greg Regnier (IL CSR) to get them started with his Sprout software. The goal is to deploy a perception application (e.g., face recognition), using the Sprout framework to partition the workload and distribute it across multiple servers. Potential demonstrations include running the perception task as a service for multiple mobile clients.
- The Intel ISTC-CC members (Phil, Mike, Babu, and Michael) met with Rick Olha, Ted Willke, and Jeff Jackson (IL CSR) to brainstorm possible uses of the uCluster for the Cloud ISTC.
- Jeff Parkhurst gave an overview of the ISTC-CC to John David Miller of IT. He is a Principal Engineer looking into the Business Intelligence area with an eye for Big Data analytics. His interest was in the Big Data pillar, but he also mentioned other interests, including sharing an individual's user experience among devices and compute resources.
- Jeff Parkhurst met with Sebastian Schoenberg of Intel's Intelligent Systems Group. His work focuses on providing value-added services on given architectures that work well on IA. He is specifically looking at applications of Big Data (image processing and data mining). Jeff referred him to Babu Pillai for follow-up discussion.
Other ISTC highlights
- The inaugural issue of the International Journal of Cloud Computing, for which Mike Kozuch is on the editorial board, appeared last month.
- Mike Kozuch collaborated with leaders from the FutureGrid, Grid 5000, and Nimbus projects to craft the "Report on Experimental Computer Science for Distributed Systems." The paper describes the state of the art in distributed systems research testbeds, such as Open Cirrus, and identifies opportunities for further development in the capabilities and management of such systems.
- Hyeontaek Lim (CMU grad student, lead on FAWN SILT work) won a competitive Facebook Fellowship.
- Georgia Tech PhD students are joining Intel, Microsoft (Cosmos team), Amazon (to be finalized), and other cloud technology companies as interns in Summer 2012.
- Georgia Tech and CMU students have started up a joint reading group to review ongoing research in online behavior detection.
- Guy Blelloch (CMU) gave the plenary talk at ALENEX'12.
- Priya Narasimhan was recently awarded Global Pittsburgh's 2012 International Bridge Award to recognize her entrepreneurial and technology transition efforts.
- Priya Narasimhan will serve as a Program Chair for the 2012 ACM/IFIP/IEEE International Conference on Middleware, to be held in Montreal, Quebec, Canada. More information can be found at http://middleware2012.cs.mcgill.ca/.
List of publications
(Publications by center researchers that appeared during the month; submissions and acceptances are not included.)
- "Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous Applications," H. Wu, G. Diamos, J. Wang, S. Li, and S. Yalamanchili, International Journal of High Performance Computing Applications, February 2012.
List of presentations
- Guy Blelloch (CMU) gave the plenary talk, entitled "Problem Based Benchmarks", at the Meeting on Algorithm Engineering and Experiments (ALENEX'12), in Kyoto, Japan, January 2012.
- Garth Gibson (CMU) presented "Recent Work in Storage Systems for BigData" at the NetApp CTO's Distinguished Lecture Series, in Sunnyvale, CA, February 2012.