ISTC–CC Q2 2013 Status Report
Summary:
A great quarter for publications, with 25 papers published and 20 others accepted for publication, including 5 of the 27 papers published at HotOS’13! Dave Andersen (CMU), Michael Kaminsky (Intel Labs), Ling Liu (GA Tech), and Onur Mutlu (CMU) each had 4 publications. Garth Gibson (CMU) gave the keynote talk at HPDC’13. Garth and Ion Stoica (UC Berkeley) were inducted as Fellows of the ACM at the ACM Awards Banquet in June. Satya (CMU), Babu Pillai (Intel Labs), and the rest of the Cloudlets team formulated plans with two Intel business groups (from DCSG and PCCG) for a pilot trial leveraging home gateway platforms that Intel is shipping to Comcast. The ISTC-CC created a benchmark page (http://www.istc-cc.cmu.edu/research/benchmarks/) for cloud-related workloads, with some initial benchmarks.
ISTC-CC grad students continue to shine:
- Shicong Meng (GA Tech) won the 2012 SPEC Distinguished Dissertation Award for his Ph.D. dissertation "Monitoring-as-a-Service in the Cloud."
- Yoongu Kim (CMU) and Daniel Lustig (Princeton) were awarded Intel Graduate Fellowships.
- Wolf Richter (CMU), Haicheng Wu (GA Tech), and Chris Fallin/Gennady Pekhimenko (CMU) were awarded other prestigious fellowships from IBM, NVIDIA, and Qualcomm, respectively.
- Jai Dayal (GA Tech) and co-authors won the best paper award at HPDIC’13.
We welcome Professor Jeff Chase (Duke) to the ISTC-CC BoA. Jeff is a highly distinguished leader in systems research, including cloud computing, and we are delighted to have him join our BoA. Finally, the ISTC-CC retreat will be held at CMU in Pittsburgh, in the CIC Building, on Nov 7th and 8th. More details to come.
Details:
ISTC Mission: Four inter-related research pillars (themes) architected to create a strong foundation for cloud computing of the future
The research agenda of the ISTC-CC is composed of the following four themes:
- Specialization: Explores specialization as a primary means for order of magnitude improvements in efficiency (e.g., energy), including use of emerging technologies like non-volatile memory and specialized cores.
- Automation: Addresses cloud’s particular automation challenges, focusing on order of magnitude efficiency gains from smart resource allocation/scheduling and greatly improved problem diagnosis capabilities.
- Big Data: Addresses the critical need for cloud computing to extend beyond traditional big data usage (primarily, search) to efficiently and effectively support Big Data analytics, including the continuous ingest, integration, and exploitation of live data feeds (e.g., video or Twitter).
- To the Edge: Explores new frameworks for edge/cloud cooperation that can efficiently and effectively exploit billions of context-aware clients and enable cloud-assisted client applications whose execution spans client devices, edge-local cloud resources, and core cloud resources.
Participants
Academic PI: Greg Ganger (CMU)
Executive Sponsor: Wen Hann Wang (CSR)
Managing Sponsor: Rich Uhlig (CSR-SAL)
Program Director: Jeff Parkhurst (APR)
Intel PI: Phil Gibbons
Intel Researchers: Michael Kaminsky, Mike Kozuch, Babu Pillai
Academic Partners: Dave Andersen, Guy Blelloch, Garth Gibson, Carlos Guestrin, Mor Harchol-Balter, Todd Mowry, Onur Mutlu, Priya Narasimhan, M. Satyanarayanan, and Dan Siewiorek (CMU); Mike Freedman, Kai Li, and Margaret Martonosi (Princeton); Anthony Joseph, Randy Katz, and Ion Stoica (UC Berkeley); Ada Gavrilovska, Ling Liu, Calton Pu, Karsten Schwan, and Sudha Yalamanchili (GA Tech).
Technical highlights
- Publications summary for the past quarter:
- 20 papers accepted: to ISCA'13 (3 papers), ICDCS'13 (2 papers), PACT'13 (2 papers), HotCloud'13, SPAA'13, VTDC'13, IEEE Pervasive Computing, IEEE Software, Big Data Benchmarks, WEED'13, OR Letters, DAC'13, MSPC'13, HotStorage'13, InfoVis'13, and USENIX ATC'13.
- 25 papers published, including at: ALENEX'13, HotMobile'13, HPCA'13 (4 papers), PPoPP'13, ACM TACO, IEEE TPDS, and VMware Technical Journal.
- Open Source code releases this past quarter included:
- The FAWN team released their novel rank-and-select algorithm as open source: https://github.com/efficient/rankselect
- The ISTC-CC created a benchmark page http://www.istc-cc.cmu.edu/research/benchmarks/ for cloud-related workloads, with some initial benchmarks.
- Jai Dayal, Karsten Schwan, Jay Lofstead, Matthew Wolf, Scott Klasky, Hasan Abbasi, Norbert Podhorszki, Greg Eisenhauer, and Fang Zheng (GA Tech) won the best paper award at the International Workshop on High Performance Data Intensive Computing (HPDIC’13), a satellite workshop of IPDPS’13, for their paper “I/O Containers: Managing the Data Analytics and Visualization Pipelines of High End Codes.”
- A HotOS’13 paper by James Cipar, Qirong Ho, Jin Kyu Kim, Seunghak Lee (CMU grad students), Gregory R. Ganger (CMU), Garth Gibson (CMU), Kimberly Keeton (HP) and Eric Xing (CMU) entitled “Solving the Straggler Problem with Bounded Staleness” presents an approach for mitigating the well-known straggler problem that arises in bulk-synchronous big data frameworks, in which every transient slowdown of any given thread can delay all other threads. The paper shows that for the broad class of iterative convergent algorithms (e.g., for machine learning), using the proposed Stale Synchronous Parallel model enables efficient execution despite such transient slowdowns. See http://0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com/11161-hotos13-final77.pdf
- Sudha Yalamanchili (GA Tech) and team reached a research milestone: The 'Red Fox' compiler for compiling database queries to GPUs can now compile and execute all 22 queries of the industry standard TPC-H benchmark suite (at scale factor 1) expressed in the Datalog-LB query language. Intermediate forms of the implementations will be packaged and made available as a benchmark suite for GPUs targeted for accelerating data warehousing applications.
- Karsten Schwan (GA Tech) and team have two major open source development efforts underway: 1) enhancing the Flume streaming map reduce framework to better support real-time (online) queries, and 2) scaling the EVPath event transport to be able to operate across 100,000+ nodes in order to monitor large-scale machines and datacenter systems. Concerning monitoring, also underway are (i) the development of graph-based specifications of monitoring overlays, along with automated support for creating and deploying such overlays across large-scale systems, and (ii) the use of time series databases to maintain monitoring data in ways suitable for subsequent analysis.
- Two CMU students focused on problem diagnosis research defended their Ph.D. theses and graduated:
- Soila Pertet (Priya advisee). Soila’s dissertation is here: http://users.ece.cmu.edu/~spertet/papers/soila_thesis.pdf
- Raja Sambasivan (Ganger advisee). Raja's dissertation is here: http://www.pdl.cmu.edu/PDL-FTP/ProblemDiagnosis/CMU-PDL-13-105_abs.shtml
- Other ISTC-CC Ph.D. students who successfully defended include: Shicong Meng, Chengwei Wang, Hrishikesh Amur, Priyanka Tembey, Mukil Kesavan, and Adit Ranadive from Georgia Tech; Anshul Gandhi and Tunji Ruwase from CMU; Wyatt Lloyd and David Shue from Princeton; and Matei Zaharia from UC Berkeley.
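The bounded-staleness idea from the HotOS’13 highlight above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function name `run`, the tick-based model, and the `stalls` parameter are hypothetical, and the rule used here (a worker may start its next iteration only while it has completed at most `staleness` more iterations than the slowest worker) is a simplified reading of the Stale Synchronous Parallel model. With `staleness=0` the rule reduces to bulk-synchronous execution.

```python
def run(num_workers, staleness, stalls, ticks):
    """Toy model of bounded staleness (hypothetical helper, not from the paper).

    stalls is a set of (tick, worker) pairs marking transient slowdowns,
    i.e. ticks in which that worker makes no progress.
    Returns (completed iterations per worker, blocked worker-ticks).
    """
    clocks = [0] * num_workers  # iterations completed by each worker
    blocked = 0                 # worker-ticks spent waiting on the slowest worker
    for t in range(ticks):
        slowest = min(clocks)   # snapshot taken at the start of the tick
        for w in range(num_workers):
            if (t, w) in stalls:
                continue        # transient slowdown: no progress this tick
            if clocks[w] - slowest <= staleness:
                clocks[w] += 1  # within the staleness bound: do one iteration
            else:
                blocked += 1    # too far ahead of the slowest worker: wait
    return clocks, blocked


if __name__ == "__main__":
    one_stall = {(1, 2)}  # worker 2 is slow during tick 1
    print("bulk-synchronous (staleness=0):", run(3, 0, one_stall, 4))
    print("bounded staleness (staleness=1):", run(3, 1, one_stall, 4))
```

In this toy run, a single slowdown of one worker costs the other two workers a tick each when `staleness=0`, while `staleness=1` lets them keep going. The sketch models only the synchronization rule; it does not model the reads of possibly stale shared state that an iterative convergent algorithm would have to tolerate.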
Schedule of upcoming events and milestones
- June 25-26: Intel University Collaboration Symposium, San Francisco, CA
- July 2: ACM SOCC’13 submission deadline
- Nov 7–8: 3rd Annual ISTC-CC Retreat will be held at CMU in Pittsburgh
Sponsor group interaction highlights
- [Cloudlet/HomeCloud] M. Satyanarayanan (CMU) and Babu Pillai (Intel Labs) hosted a productive visit in May by Jim Blakley (Intel DCSG), Paul Diefenbaugh (Intel CSR/SAL), Vince Merrick (Intel CSR/TM), and Jack Weast (Intel PCCG) that showcased the vision and status of the cloudlets work, discussed potential home and automotive use cases, and formulated a plan for a pilot trial leveraging home gateway platforms that Intel is shipping to Comcast. As an important first step, Kiryong Ha (CMU) has been reimplementing his VM Synthesis and Cloudlet discovery code as an extension to OpenStack. In June, a well-attended KVM/QEMU workshop was held to highlight the virtualization technology used in cloudlet research.
- [Genomics/DCSG] Michael Kozuch configured several special machines in the Open Cirrus cluster and installed the genomics pipeline that DCSG is developing along with sample datasets. Thank you to Mishali Naik, Paolo Narvaez, and Mohammad Ghodrat of DCSG for helping with logistics. The test inputs ran successfully on both physical and virtual machines in the Open Cirrus cluster. However, additional storage is required to run the standard input sets—a request has been submitted for the needed equipment. The initial ISTC-CC exploration will be to determine if the MrFast aligning tool developed by Hongyi Xin and Onur Mutlu (with others) at CMU may be integrated into the pipeline.
- [To the Edge / Applied Minds] Babu Pillai (Intel Labs) attended a meeting with Applied Minds discussing a project on edge computing that they will be pursuing over the next few months. The project will likely involve storage at the edge, leveraging the fact that storage capacities are growing faster than compute, and significantly faster than network capacities. Dan Dahle (Intel Labs) has scheduled follow up meetings for the next several months to update status and provide direction to the project.
- [EPaxos / Intel IT] Michael Kaminsky (Intel Labs) and Iulian Moraru (CMU grad student) met with Das Kamhout and others from Intel IT, Data Center Engineering, and SSG in May to discuss their EPaxos protocol. The meeting was productive, and the ISTC researchers followed up by providing a copy of the EPaxos submission as well as the code.
- Onur Mutlu (CMU) has had many interactions with Chris Wilkerson, Asit Mishra, Shih-Lien Lu, and others from Intel.
- Babu Pillai (Intel Labs) and Ren Wang (Intel CSR/SAL) discussed edge computing and cloudlets, and how they intersect with her interest in transparent, limitless storage on mobile devices that leverages local and cloud infrastructure. Some of the CMU work on Gigasight (processing and storing captured video at the edge) and on cloudlet discovery seemed particularly relevant to this vision.
- Justin Rattner (Intel CTO) visited the two ISTCs at CMU in late March, getting status updates and providing feedback on highlighted projects in both centers.
- [Dan Dahle/TM IL]: Jeff Parkhurst had a quick sync meeting with Dan to discuss tech transfer outliers that could have potential homes within Intel. Dan was very helpful in providing contacts for follow up.
Other ISTC highlights
- Daniel P. Siewiorek (CMU) has been named director of the Quality of Life Technology (QoLT) Center, a National Science Foundation Engineering Research Center.
- Margaret Martonosi (Princeton) received the AT&T / NCWIT Undergraduate Research Mentoring Award.
- ISTC-CC Ph.D. students interning at Intel this summer include:
- Dipanjan Sengupta (GA Tech) is at Intel Portland.
- Alex Merritt (GA Tech) is at Intel Portland.
- Liting Hu (GA Tech) is at ISTC-CC @ CMU working with Mike Kozuch.
- Xiaozhou Li (Princeton) is at ISTC-CC @ CMU working with Michael Kaminsky.
- Karsten Schwan and Vanish Talwar (Georgia Tech) organized a research track and panel session on 'Management of Big Data Systems' for the upcoming ICAC conference in San Jose, June 2013.
- Karsten Schwan (GA Tech) served as PC member for Usenix ATC’13, IEEE ICDCS’13, ACM Sigmetrics’13, and IEEE CCGrid’13.
- Michael Kozuch (Intel Labs) served as PC member for ACM Sigmetrics’13, ICAC’13, MBDS’13, and HotCloud’13.
- Ada Gavrilovska (GA Tech) served as PC member for VEE’13 and ASPLOS’13.
- Phil Gibbons (Intel Labs) served as PC member for PODC’13.
- Wyatt Lloyd (Princeton grad student), Michael Kaminsky (Intel Labs), Dave Andersen (CMU) and Mike Freedman (Princeton) wrote an invited article about causal consistency and Eiger for USENIX’s ;login: magazine. The article is scheduled to appear in the August issue.
- Bin Fan (CMU grad student), David Andersen (CMU), and Michael Kaminsky (Intel Labs) wrote an invited article for USENIX’s ;login: magazine about Cuckoo filters. Cuckoo filters are a new approximate set membership data structure that, compared to Bloom filters, offers better memory efficiency, performance, and the ability to delete items. The article is scheduled to appear in the August issue. Bin Fan was invited to do a research internship at Facebook, during which he hopes to test some of the MemC3/Cuckoo hashing ideas in their memcached setup.
- Jeff Parkhurst submitted the presentation “Hitch Hikers Guide to Engaging in Intel Funded Academic Research and Programs” to DTTC and it was accepted as part of their BoF session.
- Jeff Parkhurst’s IDF technical session submission “Beyond Map Reduce: Processing Big Data” was accepted into the IDF technical program.
- [Amplifying Funding] Georgia Tech is successfully using cloud infrastructures in its systems classes, which has, among other things, led to new awards of internal funds:
Karsten Schwan, PI, joint with Matt Wolf and Ada Gavrilovska, “Cloud Infrastructure for Systems Classes,” 2012/2013, $50,000 (builds on a 2010 award of $80,000).
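The Cuckoo filter described in the ;login: highlight above (approximate set membership with support for deletion) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the class name `CuckooFilter`, the 8-bit fingerprints, the 4-entry buckets, the SHA-256 based hashing, and the kick limit are all choices made here for readability. Each item is stored as a short fingerprint in one of two candidate buckets, and the second bucket index is derived from the first index and the fingerprint alone, so a stored fingerprint can be relocated without the original item.

```python
import hashlib
import random


class CuckooFilter:
    """Illustrative cuckoo filter sketch; all parameters are assumptions."""

    def __init__(self, num_buckets=64, bucket_size=4, max_kicks=500, seed=0):
        # A power-of-two bucket count keeps the XOR-derived index in range
        # and makes the alternate-index mapping its own inverse.
        assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item: str) -> int:
        # 8-bit fingerprint in the range 1..255
        return self._hash(b"fp:" + item.encode()) % 255 + 1

    def _alt_index(self, index: int, fp: int) -> int:
        # Depends only on the current index and the fingerprint.
        return (index ^ self._hash(fp.to_bytes(1, "big"))) % self.num_buckets

    def _candidates(self, item: str):
        fp = self._fingerprint(item)
        i1 = self._hash(item.encode()) % self.num_buckets
        return fp, i1, self._alt_index(i1, fp)

    def insert(self, item: str) -> bool:
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # filter is too full; the last evicted fingerprint is dropped

    def contains(self, item: str) -> bool:
        fp, i1, i2 = self._candidates(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item: str) -> bool:
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Because only fingerprints are stored, `contains` can return a false positive when two items share a fingerprint and a bucket, and `delete` is safe only for items that were actually inserted. A Bloom filter, by contrast, cannot remove an item without rebuilding, which is the deletion advantage the highlight refers to.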
List of publications
Publications by center researchers that appeared during the quarter. This does not include submissions or acceptances, just publications.
- “Fairness and Isolation in Multi-Tenant Storage as Optimization Decomposition,” David Shue, Michael J. Freedman, and Anees Shaikh, ACM SIGOPS Operating System Review (OSR), 2013.
- “PRObE: A Thousand-Node Experimental Cluster for Computer Systems Research,” Garth Gibson, Gary Grider, Andree Jacobson, and Wyatt Lloyd, USENIX ;login:, 2013.
- “Performance Analysis of Network I/O Workloads in Virtualized Data Centers,” Yiduo Mei, Ling Liu, Xing Pu, Sankaran Sivathanu, and Xiaoshe Dong, IEEE Transactions on Services Computing, 2013.
- “Distance-aware bloom filters: Enabling collaborative search for efficient resource discovery,” Yiming Zhang and Ling Liu, Future Generation Computer Systems, 2013.
- “Computing infrastructure for big data processing,” Ling Liu, Frontiers of Computer Science, 2013.
- “Error Analysis and Retention-Aware Error Management for NAND Flash Memory,” Yu Cai, Gulay Yalcin, Onur Mutlu, Erich F. Haratsch, Adrian Cristal, Osman Unsal, and Ken Mai, Intel Technology Journal (ITJ) Special Issue on Memory Resiliency, 2013.
- “Relational Algorithms for Multi-Bulk-Synchronous Processors (short paper),” G. Diamos, H. Wu, J. Wang, A. Lele, and S. Yalamanchili, 18th Symposium on Principles and Practice of Parallel Programming (PPoPP’13), February 2013.
- “Extracting Useful Computation From Error-Prone Processors for Streaming Applications,” Yavuz Yetim, Margaret Martonosi, and Sharad Malik, Design, Automation & Test in Europe Conference (DATE’13), March 2013.
- “Threshold Voltage Distribution in MLC NAND Flash Memory: Characterization, Analysis and Modeling,” Yu Cai, Erich F. Haratsch, Onur Mutlu, and Ken Mai, Design, Automation, and Test in Europe Conference (DATE’13), March 2013.
- “The Impact of Mobile Multimedia Applications on Data Center Consolidation,” Kiryong Ha, Padmanabhan Pillai, Grace Lewis, Soumya Simanta, Sarah Clinch, Nigel Davies, and Mahadev Satyanarayanan, IEEE International Conference on Cloud Engineering (IC2E’13), March 2013.
- “MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing,” Bin Fan, David G. Andersen, and Michael Kaminsky, 10th Symposium on Networked Systems Design and Implementation (NSDI'13), April 2013.
- “Stronger Semantics for Low-Latency Geo-Replicated Storage,” Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, and David G. Andersen, 10th Symposium on Networked Systems Design and Implementation (NSDI'13), April 2013.
- “Effective Straggler Mitigation: Attack of the Clones,” Ganesh Ananthanarayanan, Ali Ghodsi, Scott Shenker, and Ion Stoica, 10th Symposium on Networked Systems Design and Implementation (NSDI'13), April 2013.
- “Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative,” Emre Kultursay, Mahmut Kandemir, Anand Sivasubramaniam, and Onur Mutlu, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS’13), April 2013.
- “Asymmetry-Aware Execution Placement on Manycore Chips,” Alexey Tumanov, Joshua Wise, Onur Mutlu, and Gregory R. Ganger, 3rd Workshop on Systems for Future Multicore Architectures (SFMA’13), April 2013.
- “Kinship: Resource Management for Performance and Functionally Asymmetric Platforms,” Vishakha Gupta, Rob Knauerhase, Paul Brett, and Karsten Schwan, ACM International Conference on Computing Frontiers (CF’13), May 2013.
- “Cura: A Cost-optimized Model for MapReduce in a Cloud,” Balaji Palanisamy, Aameek Singh, Ling Liu, and Bryan Langston, 27th IEEE International Parallel & Distributed Processing Symposium (IPDPS’13), May 2013.
- “FlexIO: I/O Middleware for Location-Flexible Scientific Data Analytics,” Fang Zheng, Hongbo Zou, Greg Eisenhauer, Karsten Schwan, Matthew Wolf, Jai Dayal, Tuan-Anh Nguyen, Jianting Cao, Hasan Abbasi, Scott Klasky, Norbert Podhorszki, and Hongfeng Yu, 27th IEEE International Parallel and Distributed Processing Symposium (IPDPS’13), May 2013.
- “I/O Containers: Managing the Data Analytics and Visualization Pipelines of High End Codes,” Jai Dayal, Karsten Schwan, Jay Lofstead, Matthew Wolf, Scott Klasky, Hasan Abbasi, Norbert Podhorszki, Greg Eisenhauer, and Fang Zheng, International Workshop on High Performance Data Intensive Computing (HPDIC’13), with IPDPS 2013, May 2013. Best paper.
- “When Cycles Are Cheap, Some Tables Can Be Huge,” Bin Fan, Dong Zhou, Hyeontaek Lim, Michael Kaminsky, and David G. Andersen, 14th Workshop on Hot Topics in Operating Systems (HotOS’13), May 2013.
- “Solving the straggler problem with bounded staleness,” James Cipar, Qirong Ho, Jin Kyu Kim, Seunghak Lee, Gregory R. Ganger, Garth Gibson, Kimberly Keeton, and Eric Xing, 14th Workshop on Hot Topics in Operating Systems (HotOS’13), May 2013.
- “Making Every Bit Count in Wide-Area Analytics,” Ariel Rabkin, Matvey Arye, Siddhartha Sen, Vivek Pai, and Michael J. Freedman, 14th Workshop on Hot Topics in Operating Systems (HotOS’13), May 2013.
- “The Case for Tiny Tasks in Compute Clusters,” Kay Ousterhout, Aurojit Panda, Joshua Rosen, Shivaram Venkataraman, Reynold Xin, Sylvia Ratnasamy, Scott Shenker, and Ion Stoica, 14th Workshop on Hot Topics in Operating Systems (HotOS’13), May 2013.
- “HAT, Not CAP: Towards Highly Available Transactions,” Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein and Ion Stoica, 14th Workshop on Hot Topics in Operating Systems (HotOS’13), May 2013.
- “Space-Efficient, High-Performance Rank & Select Structures,” Dong Zhou, David G. Andersen, and Michael Kaminsky, 12th International Symposium on Experimental Algorithms (SEA’13), June 2013.
List of presentations
In addition to the conference/workshop presentations associated with each of the published conference/workshop papers listed above, we had the following presentations:
- S. Yalamanchili (Georgia Tech) presented “New Workloads and New Rules → New Challenges”, Invited Talk, Second International Workshop on Performance Analysis of Workload Optimized Systems, April 2013.
- Onur Mutlu (CMU) gave an invited talk, “Memory Scaling: A Systems Architecture Perspective,” at the 5th International Memory Workshop (IMW 2013), Monterey, CA, May 2013.
- Guy Blelloch (CMU) gave an invited talk on "Big Data on Small Machines" at the "Big Data Analytics" workshop in Cambridge, UK in May.
- Guy Blelloch (CMU) gave an invited talk on “Internally Deterministic Parallel Algorithms” at the Workshop on Determinism and Correctness in Parallel Programming, in Houston in March.
- Mahadev Satyanarayanan (Satya) (CMU) gave a well-attended talk on cloudlets at IBM Research on May 21.
- Garth Gibson presented a keynote talk to the 2013 ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC’13), New York, NY, "Concurrent Write Sharing: Overcoming the Bane of File Systems."
- Padmanabhan Pillai presented “Rethinking Cloud Architecture for Mobile Multimedia Applications” at the CERCS IAB meeting at Georgia Tech, April 2013.
- Jeff Parkhurst spoke at the Cisco Big Data research summit on “Big Data Vision and University Research Overview.”