ISTC-CC Abstract
FlexIO: Location-flexible Execution of In Situ Data Analytics for Large Scale Scientific Applications
27th IEEE International Parallel and Distributed Processing Symposium (IPDPS'13), May 2013.
Fang Zheng, Hongbo Zou, Greg Eisenhauer, Karsten Schwan, Matthew Wolf,
Jai Dayal, Tuan-Anh Nguyen, Jianting Cao, Hasan Abbasi*, Scott Klasky*,
Norbert Podhorszki*, Hongfeng Yu†
Georgia Institute of Technology
*Oak Ridge National Laboratory
†Sandia National Laboratories
Increasingly severe I/O bottlenecks on High-End Computing machines are prompting scientists to process simulation output data while simulations are running and before placing data on disk – "in situ" and/or "in-transit". There are several options for placing in situ data analytics along the I/O path: on compute nodes, on staging nodes dedicated to analytics, or after data is stored on persistent storage. Different placements have different impacts on end-to-end performance and cost. The consequence is a need for flexibility in the location of in situ data analytics. The FlexIO facility described in this paper supports flexible placement of in situ analytics by offering simple abstractions and methods that help developers exploit the opportunities and trade-offs in performing analytics at different levels of the I/O hierarchy. Experimental results with several large-scale scientific applications demonstrate the importance of flexibility in analytics placement.
KEYWORDS: I/O, In Situ Processing, Staging, Placement, Data Analytics
FULL PAPER: pdf