ISTC-CC Q2 2012 Status Report
Summary:
The ISTC-CC Update, a 32-page year-in-review newsletter covering the center’s year 1 accomplishments, is now available at http://www.istc-cc.cmu.edu/publications/news/ISTC-newsletter12.pdf. The first annual Board of Advisors (BoA) meeting took place on August 16 in Pittsburgh; the center received high praise from the BoA and was also challenged to aim for grand slams, not just home runs. The First GraphLab Workshop on Large-scale Machine Learning (http://graphlab.org/workshop2012/) attracted 320 participants from ~60 companies. GraphLab 2.1 was released, providing an updated abstraction that scales to billions of vertices and edges in the cloud (paper accepted to OSDI’12). It was another great quarter for publications, with 30 papers published and another 21 accepted (including 6 of the 25 papers in SOCC’12). It was a great quarter, too, for awards and honors, including:
- Mike Freedman (Princeton) received a Presidential Early Career Award for Scientists and Engineers (PECASE) for “efforts in designing, building, and prototyping a modern, highly scalable, replicated storage cloud system that provides strong robustness guarantees.”
- Ling Liu (GA Tech) and her co-authors received the best paper award at CLOUD’12 for their work on “Reliable State Monitoring in Cloud Datacenters.”
- Ling Liu (GA Tech) served as general chair of VLDB 2012, held in Istanbul, Turkey, in August 2012.
- Onur Mutlu (CMU) received 3 awards: (i) an Intel Early Career Faculty Honor Program Award, (ii) an IBM Faculty Partnership Award, and (iii) an HP Labs Innovation Research Program Award.
- Ada Gavrilovska (GA Tech) was awarded an NSF grant for her work on cloud power management: “PowerMeter: Tracking Energy Usage in the Clouds.”
Note: The ISTC-CC Annual Retreat will be November 29-30 in Pittsburgh. Register at http://www.istc-cc.cmu.edu/events/retreat12.shtml
Details:
ISTC Mission: Four inter-related research pillars (themes) architected to create a strong foundation for cloud computing of the future
The research agenda of the ISTC-CC is composed of the following four themes:
- Specialization: Explores specialization as a primary means for order of magnitude improvements in efficiency (e.g., energy), including use of emerging technologies like non-volatile memory and specialized cores.
- Automation: Addresses cloud’s particular automation challenges, focusing on order of magnitude efficiency gains from smart resource allocation/scheduling and greatly improved problem diagnosis capabilities.
- Big Data: Addresses the critical need for cloud computing to extend beyond traditional big data usage (primarily, search) to efficiently and effectively support Big Data analytics, including the continuous ingest, integration, and exploitation of live data feeds (e.g., video or Twitter).
- To the Edge: Explores new frameworks for edge/cloud cooperation that can efficiently and effectively exploit billions of context-aware clients and enable cloud-assisted client applications whose execution spans client devices, edge-local cloud resources, and core cloud resources.
Participants
Academic PI: Greg Ganger (CMU)
Executive Sponsor: Wen Hann Wang (CSR)
Managing Sponsor: Rich Uhlig (CSR-SAL)
Program Director: Jeff Parkhurst (APR)
Intel PI: Phil Gibbons
Intel Researchers: Michael Kaminsky, Mike Kozuch, Babu Pillai
Academic Partners: Dave Andersen, Guy Blelloch, Garth Gibson, Carlos Guestrin, Mor Harchol-Balter, Todd Mowry, Onur Mutlu, Priya Narasimhan, M. Satyanarayanan, and Dan Siewiorek (CMU); Mike Freedman, Kai Li, and Margaret Martonosi (Princeton); Anthony Joseph, Randy Katz, and Ion Stoica (UC Berkeley); Ada Gavrilovska, Ling Liu, Calton Pu, Karsten Schwan, and Sudha Yalamanchili (GA Tech).
Technical highlights
- [Year-in-Review] The “ISTC-CC Update,” a 32-page year-in-review newsletter of the center’s year 1 accomplishments, is now available at http://www.istc-cc.cmu.edu/publications/news/ISTC-newsletter12.pdf. The table of contents includes ISTC-CC Overview, Message from the PIs, ISTC-CC Personnel, Year in Review, ISTC-CC News, Recent Publications (including paper abstracts, for ease of reference), and Program Director’s Corner. You are encouraged to pass the newsletter on to anyone interested in learning more about the ISTC-CC’s research (the newsletter is not Intel Confidential).
- [Board of Advisors Meeting] Greg Ganger (CMU), Phil Gibbons (Intel Labs), Jeff Parkhurst (Intel UCO), Carlos Guestrin (CMU), Onur Mutlu (CMU), and Babu Pillai (Intel Labs) each gave well-received presentations to the BoA. Feedback on the center from both BoA (Wen-Hann Wang, Balint Fleischer, Rich Uhlig, Pradeep Khosla, Frans Kaashoek) and UnCoR leaders (Limor Fix, Chris Ramming, Tanay Karnik) in attendance was highly positive overall. Balint Fleischer (DCG) remarked that ISTC-CC’s research “really opens my horizon,” helping him and his group to “think about problems differently.”
- [GraphLab] The First GraphLab Workshop on Large-scale Machine Learning (http://graphlab.org/workshop2012/) brought together industry and academic professionals to explore the state of the art in machine-learning techniques for working with huge data sets. The workshop drew about 320 participants and featured 15 talks and demonstrations on systems, abstractions, languages, and algorithms for large-scale data analysis. It also included the release of GraphLab 2.1, an updated abstraction that increases the scalability of GraphLab, and of GraphChi, which is able to solve Web-scale problems on a single personal computer. Several of the workshop’s talks included announcements of new big data developments, including Ted Willke’s (Intel SAL) announcement of the development of GraphBuilder (see below). The workshop also featured several short discussions led by participants from Yahoo!, Twitter, Stanford University, Netflix, Pandora, IBM, and One Kings Lane.
- [Berkeley-CMU-Intel SOCC’12 paper] “Heterogeneity and dynamicity of clouds at scale: Google trace analysis,” by Charles Reiss, Alexey Tumanov, Gregory Ganger, Randy Katz, and Michael Kozuch, was accepted for publication at the ACM Symposium on Cloud Computing (SOCC’12). This paper, which brings together collaborators from Berkeley, CMU, and Intel (thanks entirely to the ISTC-CC), analyzes a publicly released activity trace from an 11,000+ node production Google cluster.
- [Multi-server systems] Mor Harchol-Balter (CMU) and her students had a big research breakthrough on the theoretical front, developing the first queuing analysis of multi-server systems with setup times (that is, there is a setup cost for turning on a server). Previously, this queuing analysis was known only for a single server. Mor flew to the EURANDOM Institute in early September to give a talk on this work, and will later give the talk at MIT.
- [Scheduling for storage access] Randy Katz (UC Berkeley), Ion Stoica (UC Berkeley), and students have been investigating how to extend a two-level scheduling approach, which they originally developed for the Mesos framework manager, to the context of storage access. Initial results, published in a HotCloud’12 paper, show that I/O utilization can be significantly increased with minimal impact on latency by selecting the right scheduling strategy. The ideas have been further developed and will be reported in a paper to appear at the ACM Symposium on Cloud Computing (SOCC’12).
- [OpenStack] The Georgia Tech contingent worked hard to stand up an OpenStack Xen-based infrastructure on their Intel-based Cluster. The infrastructure is now functional (there were many issues with Xen and OpenStack), and the team is now progressing with (i) the deployment of an online monitoring infrastructure for the GT OpenStack cloud and (ii) the deployment of larger cloud applications, for measurement and evaluation.
- Publications summary for the past quarter:
- 21 papers accepted: including to SOCC'12 (3 regular papers + 3 short papers), OSDI'12 (3 papers), Middleware'12 (2 papers), ICCD'12 (2 papers), CCS'12, LISA'12, MobiCase'12, ICAC'12, and ACM TOCS
- 30 papers published: including in CLOUD'12 (3 papers), SPAA'12 (3 papers + 1 short), ICDCS'12 (2 papers), Sigcomm'12, PODC'12, VLDB'12, KDD'12, ICS'12, HPCC'12, AstroHPC'12, ICME'12, SCC'12, DSN'12, MobiSys'12, SACMAT'12, ACC'12, LADIS'12, RV'12, and IEEE TOS (see List of Publications below)
Schedule of upcoming events and milestones
- September 21: 8th Open Cirrus summit in San Jose
- October 14-17: ACM Symposium on Cloud Computing in San Jose
- November 5-7: 20th Parallel Data Lab (PDL) Retreat in Pittsburgh/Bedford Springs
- November 29-30: 2nd Annual ISTC-CC Retreat in Pittsburgh
Sponsor group interaction highlights
- [GraphLab] Ted Willke (Intel SAL) and his team continue to collaborate with Carlos Guestrin (CMU) and his team on solidifying the GraphLab system. Ted’s team developed GraphBuilder, which uses Hadoop to overcome the gap between unstructured data and the formation of the data’s graph of dependencies.
- Onur Mutlu (CMU) spent a month this summer at Intel Labs in Jones Farm, primarily collaborating with Shih-Lien Lu, Chris Wilkerson, Doug Carmean, and Konrad Lai.
- Matthew Wolf (GA Tech) attended the Intel IDF conference in September 2012. He is also working with Intel on a 'hackathon' event to be held at GA Tech in September 2012.
- [Internships] Alex Merritt (GA Tech) interned at Intel Labs in Summer 2012 to explore the implications of computational offloading for threaded/shared memory systems. He is continuing this research at GA Tech, and is also looking at alternative offloading models like those using OpenCL. Other students with Intel internships included four of Onur Mutlu’s (CMU) students, one of Carlos Guestrin’s (CMU) students, and one of Margaret Martonosi’s (Princeton) students. Other ISTC-CC students gaining valuable cloud-related experiences at non-Intel internships included Chengwei Wang (with Amazon Web Store team), Sudarsun Kannan (at HP Labs), Junwei Li (at AT&T Labs), Hobin Yoon (with Microsoft's Azure cloud team), Adit Ranadive (at VMware), and three of Ling Liu’s (GA Tech) students at IBM Almaden, at IBM Watson, and with Amazon's Cloud Infrastructure team.
- Jeff Parkhurst gave an overview of both the Cloud and Big Data centers to Boyd Davis (SSG/DCSG), Dan Dahle (IL), Gopinatth Selvaraje (SSG), and others (approximately 15 in attendance overall). Interest is high in both centers. This group may be a good source of technology stakeholders for the Big Data center once its research begins.
- Jeff Parkhurst gave an overview of memory architecture research at the Big Data and Cloud Computing centers to Mike LaTondre, who is part of Intel’s Cross Platform Technology planning group. We discussed the possibility of monitoring research at the two centers to get a glimpse of future technologies as the research yields fruit. Further discussions will follow, although Mandy Pant will be taking over this interface.
- Jeff Parkhurst gave an overview of the Cloud and Big Data centers to Tom Marchok, who is in Intel Capital. He provided me a contact within McAfee, with the potential opportunity to add a security vertical for the Big Data center. He is interested in both the Cloud and Big Data centers and will be attending the Cloud Retreat in November at CMU.
- Jeff Parkhurst gave an overview of the Cloud and Big Data centers to Harris Joyce, who is the TA for Steve Pawlowski and was interested in further information on both centers. Steve is frequently asked to provide his vision on both of these areas, and we discussed the research at both centers. As the research matures, we will seek potential technology recipients in their group.
- Jeff Parkhurst met with Rich to discuss what is being done within his lab to advance and exploit ISTC/ICRI research as well as identify opportunities for building a connection between UCO research communities and IL and BUs. We also discussed potential gaps (mainly in the “To The Edge” pillar) and where we might find potential stakeholders. This discussion was to prepare in part for the sponsor lab strategy annual review effort with my dept.
Other ISTC highlights
- ISTC-CC presented an overview talk, 12 posters, and 3 demos at Intel’s University Collaboration Office “Showcase” in Santa Clara on June 27.
- Satya (CMU) served as Program Chair for the 3rd ACM Asia-Pacific Systems Workshop (APSys’12) held in Seoul, S. Korea on July 23-24. Frans Kaashoek was the keynote speaker at the workshop.
- The ISTC-CC is happy to host Professor Bhuvan Urgaonkar (Penn State) for a portion of his sabbatical. Bhuvan is collaborating with Mor Harchol-Balter (CMU) and Michael Kozuch (Intel Labs) on issues around performance modeling and cluster scheduling.
- Greg Ganger (CMU) attended the Facebook faculty summit in August, which was an open exchange of technical ideas and information among faculty from many universities and Facebook engineers. Greg gave a talk on elastic distributed storage research, which is joint work between CMU, GA Tech, and Intel. Greg also attended EMC's big-data workshop in Boston and participated in a panel about open research directions and industry-academic collaboration.
- The Eighth Open Cirrus Summit event is organized as a workshop to be held in conjunction with the International Conference on Autonomic Computing (ICAC’12). Michael Kozuch is General Chair.
- Guy Blelloch (CMU) and Phil Gibbons (Intel Labs) co-organized an NSF Workshop on Research Directions in the Principles of Parallel Computing.
- The Fall 2012 offering of 15-821/18-843 by Satya (CMU) and Dan Siewiorek (CMU) has 8 student projects focused on the cloudlet concept. The full list of projects can be found at http://www.cs.cmu.edu/~15-821/project-descriptions.html.
- CMU staff member Michael Stroucken has taken a new position within the university and, consequently, will no longer be providing admin support for the Intel Open Cirrus cluster. We are sorry to lose his help. Fortunately, we have already identified two skilled admins to continue the role, so there was no gap in coverage.
- We congratulate Hrishikesh Amur (GA Tech grad student) for successfully proposing his PhD thesis, entitled “Memory-efficient Distributed Parallel Frameworks with Compressed Buffer Trees,” in August 2012. His thesis is based on joint work with ISTC-CC folks at CMU and Intel Labs, and his committee members include Dave Andersen (CMU) and Greg Ganger (CMU).
List of publications
[List of the publications that were PUBLISHED by center researchers during the quarter. This does not include submissions or acceptances, just publications.]
- "NEAT: Road Network Aware Trajectory Clustering", Binh Han, Ling Liu and Edward Omiecinski, ICDCS’12, June 2012
- “v-Bundle: Flexible Group Resource Offerings in Clouds,” Liting Hu, Kyung Dong Ryu, Dilma Da Silva, and Karsten Schwan, ICDCS’12, June 2012
- “Draco: Statistical Diagnosis of Chronic Problems in Large Distributed Systems,” Soila Pertet Kavulya, Kaustubh Joshi, Matti Hiltunen, Scott Daniels, Rajeev Gandhi, and Priya Narasimhan, IEEE/IFIP Conference on Dependable Systems and Networks (DSN’12), June 2012
- “Human Mobility Modeling at Metropolitan Scales,” S. Isaacman, R. Becker, R. Cáceres, M. Martonosi, J. Rowland, A. Varshavsky, and W. Willinger, 10th ACM International Conference on Mobile Systems, Applications, and Services (MobiSys’12), June 2012
- “A Power Capping Controller for Multicore Processors,” N. Alamoosa, W. J. Song, Y. Wardi, and S. Yalamanchili, IEEE American Control Conference (ACC’12), June 2012
- “Stock Market Volatility Prediction: A Service-Oriented Multi-Kernel Learning Approach,” Feng Wang, Ling Liu, and Chenxiao Dou, Proceedings of IEEE Int. Conf. on Service Computing (SCC’12), June 2012
- "Exact and Approximate Computation of a Histogram of Pairwise Distances between Astronomical Objects," Bin Fu, Eugene Fink, Garth Gibson and Jaime Carbonell, First Workshop on High Performance Computing in Astronomy (AstroHPC’12), June 2012
- “Challenges and Opportunities in Consolidation at High Resource Utilization: Non-monotonic Response Time Variations in n-Tier Applications,” Simon Malkowski, Yasuhiko Kanemasa, Hanwei Chen, Masao Yamamoto, Qingyang Wang, Deepal Jayasinghe, and Calton Pu, Proceedings of the 2012 IEEE CLOUD Conference (CLOUD’12), June 2012
- “Expertus: A Generator Approach to Automate Performance Testing in IaaS Clouds,” Deepal Jayasinghe, Galen Swint, Simon Malkowski, Jack Li, Qingyang Wang, and Calton Pu, Proceedings of the 2012 IEEE CLOUD Conference (CLOUD’12), June 2012
- “Reliable State Monitoring in Cloud Datacenters,” Shicong Meng, Arun Iyengar, Isabelle Rouvellou, Ling Liu, Kisung Lee, and Balaji Palanisamy, Proceedings of the 2012 IEEE CLOUD Conference (CLOUD’12), June 2012. Won best paper award.
- “Characterizing and Improving the Use of Demand-Fetched Caches in GPUs,” Wenhao Jia, Kelly A. Shaw, and Margaret Martonosi, Proceedings of the 26th International Conference on Supercomputing (ICS ‘12), June 2012
- "Brief Announcement: The Problem Based Benchmark Suite,” Julian Shun, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Aapo Kyrola, Harsha Vardhan Simhadri, and Kanat Tangwongsan, Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'12), June 2012
- “Greedy Sequential Maximal Independent Set and Matching are Parallel on Average,” Guy E. Blelloch, Jeremy Fineman, and Julian Shun, Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'12), June 2012
- “Parallel and I/O Efficient Algorithms for Set Covering Problems,” Guy Blelloch, Harsha Vardhan Simhadri and Kanat Tangwongsan, Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'12), June 2012
- “Parallel Probabilistic Tree Embeddings, k-Median, and Buy-at-Bulk Network Design,” Guy Blelloch, Anupam Gupta, and Kanat Tangwongsan, Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'12), June 2012
- “Reliability Implications of Power and Thermal Constrained Operation of Asymmetric Multicore Processors,” W. Song, S. Mukhopadhyay, and S. Yalamanchili, Dark Silicon Workshop, June 2012
- “Commodity Converged Fabrics for Global Address Spaces in Accelerator Clouds,” J. Young and S. Yalamanchili, IEEE International Conference on High Performance Computing and Communications (HPCC’12), June 2012
- “Xerxes: Distributed Load Generator for Cloud-scale Experimentation,” Mukil Kesavan, Ada Gavrilovska, and Karsten Schwan, 7th OpenCirrus Summit, June 2012
- “Fine-Grained Access Control of Personal Data,” Ting Wang, Mudhakar Srivatsa, and Ling Liu, Proceedings of the 2012 ACM Symposium on Access Control Models and Technologies (SACMAT’12), June 2012
- “Role-Based and Time-Bound Access and Management of EHR Data,” Rui Zhang, Ling Liu, and Xue Rui, International Journal of Security and Communication Networks, Wiley, 2012
- “Unsupervised Conversion of 3D Models for Interactive Metaverses,” Jeffrey Terrace, Ewen Cheslack-Postava, Philip Levis, and Michael J. Freedman, Proc. IEEE International Conference on Multimedia and Expo (ICME'12), July 2012
- “The Cost of Fault Tolerance in Multi-Party Communication Complexity,” Binbin Chen, Haifeng Yu, Yuda Zhao, and Phillip B. Gibbons, Proc. 31st ACM Symposium on Principles of Distributed Computing (PODC'12), July 2012
- “Towards Predictable Multi-Tenant Shared Cloud Storage,” David Shue, Michael J. Freedman, and Anees Shaikh, Proc. Large-Scale Distributed Systems and Middleware (LADIS '12), July 2012. One of 6 best papers selected to appear in extended format in special issue of SIGOPS Operating Systems Review
- "On-Chip Networks from a Networking Perspective: Congestion and Scalability in Many-core Interconnects," George Nychis, Chris Fallin, Thomas Moscibroda, Onur Mutlu, and Srinivasan Seshan, Proceedings of the 2012 ACM SIGCOMM Conference (SIGCOMM’12), August 2012
- "Adaptive Usage of Cellular and WiFi Bandwidth: An Optimal Scheduling Formulation,” Ozlem Bilgir Yetim and Margaret Martonosi, ACM MobiCom Workshop on Challenged Networks, August 2012
- “Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads,” Y. Chen, S. Alspaugh, R. H. Katz, 38th International Very Large Databases Conference (VLDB’12), August 2012
- “RainMon: An Integrated Approach to Mining Bursty Timeseries Monitoring Data,” Ilari Shafer, Kai Ren, Vishnu Boddeti, Yoshihisa Abe, Greg Ganger, and Christos Faloutsos, Proc. 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'12), August 2012
- "A Reference Architecture for Mobile Code Offload in Hostile Environments" (Working session paper) Simanta, S., Lewis, G., Morris, E., Ha, K., Satyanarayanan, M. Proceedings of the 10th Working IEEE/IFIP Conference on Software Architecture (WICSA) & 6th European Conference on Software Architecture (ECSA), August 2012
- “File system virtual appliances: Portable file system implementations,” Abd-El-Malek, M., M. Wachs, J. Cipar, K. Sanghi, G. Ganger, G. Gibson, and M. Reiter. IEEE Transactions on Storage (TOS), vol 8, no 3, September 2012
- “Scalable Dynamic Partial Order Reduction,” Jiri Simsa, Randy Bryant, Garth Gibson, and Jason Hickey, Int. Conf. on Runtime Verification (RV’12), September 2012.
List of presentations
In addition to the presentations associated with the 28 published conference/workshop papers listed above, we had the following presentations:
- Wen-Hann Wang (ISTC-CC Sponsoring Executive) gave a keynote talk on “Powering the Cloud Computing of the Future” at the OpenCirrus summit in Beijing, June 2012.
- Phil Gibbons (Intel Labs) gave three 90-minute talks on multi-core computing at the Madalgo Summer School on Algorithms for Modern Parallel and Distributed Models in Aarhus, Denmark, in August 2012, attended by ~70 graduate students and post-docs. He was one of 4 distinguished lecturers at the Summer School.
- S. Yalamanchili (GA Tech) gave an invited talk, “Scalable Resource Composition in a Flat World,” at the 1st Workshop on Unconventional Cluster Architectures and Applications, 2012.
- G. Hsieh, A. Kerr, H. Kim, J. Lee, N. Lakshminarayana, S. Li, A. Rodrigues, and S. Yalamanchili (GA Tech) gave a tutorial at the IEEE International Symposium on Computer Architecture entitled “Ocelot and SST-Macsim,” June 2012.
- Phil Gibbons (Intel Labs) gave a brief talk entitled “Hierarchies, Clouds, and Specialization” at the NSF Workshop on Research Directions in the Principles of Parallel Computation, Pittsburgh, PA, June 2012.
- Ling Liu (GA Tech) presented three invited talks on Big Data and Big Data Analytics: (i) at Tokyo Univ., June 2012; (ii) a keynote at the 2012 Big Data workshop organized by the Big Data Summer School at RUC, Peking, July 2012; and (iii) a keynote at the VLDB 2012 PhD workshop, Istanbul, Turkey, August 2012.
- Calton Pu (GA Tech) presented his ISTC cloud research at the following international sites: Renmin University (Beijing, China), Huazhong Inst. Sci. Tech. (Wuhan, China), Univ. Tokyo (Tokyo, Japan), and Data Eng. Workshop (Nagoya, Japan).
- Onur Mutlu (CMU) gave a distinguished lecture entitled “Scaling the Main Memory System in the Many-Core Era” at the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, June 2012. Onur also gave a number of talks at Intel Labs during his summertime visit, including “Architecting and Exploiting Asymmetry in Multi-Core Architectures” at Intel Archfest (Hillsboro, OR), August 2012.