Towards Accurate and Fast Evaluation of Multi-Stage Log-structured Designs
14th USENIX Conference on File and Storage Technologies (FAST'16), Santa Clara, CA, February 2016.
Hyeontaek Lim, David G. Andersen, Michael Kaminsky*
Carnegie Mellon University
Multi-stage log-structured (MSLS) designs, such as LevelDB, RocksDB, HBase, and Cassandra, are a family of storage system designs that exploit the high sequential write speeds of hard disks and flash drives by using multiple append-only data structures. As a first step towards accurate and fast evaluation of MSLS, we propose new analytic primitives and MSLS design models that quickly give accurate performance estimates. Our model can almost perfectly estimate the cost of inserts in LevelDB, whereas the conventional worst-case analysis gives 1.8–3.5X higher estimates than the actual cost. A few minutes of offline analysis using our model can find optimized system parameters that decrease LevelDB’s insert cost by up to 9.4–26.2%; our analytic primitives and model also suggest changes to RocksDB that reduce its insert cost by up to 32.0%, without reducing query performance or requiring extra memory.
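For readers unfamiliar with the "conventional worst-case analysis" the abstract contrasts against, the following is a minimal back-of-envelope sketch of one common form of it for a leveled design. It is not the model proposed in the paper. The function names, the example sizes, and the assumption that each level-to-level merge rewrites the moved data plus up to `growth_factor` times as much overlapping data in the next level are all illustrative assumptions.

```python
def levels_needed(total_size, level1_size, growth_factor):
    """Count the levels needed to hold total_size.

    Assumption: level i has capacity level1_size * growth_factor**(i - 1),
    and the last level must be able to hold the whole data set.
    """
    levels = 1
    capacity = level1_size
    while capacity < total_size:
        capacity *= growth_factor
        levels += 1
    return levels


def worst_case_write_amplification(num_level_merges, growth_factor):
    """Pessimistic write amplification for a leveled design.

    Assumption: every merge into the next level rewrites the data being
    moved (1x) plus growth_factor times as much overlapping data that
    already lives in that level, and every item goes through every merge.
    """
    return num_level_merges * (growth_factor + 1)


# Hypothetical example: 10 MB first level, growth factor 10, 100 GB of data
# (sizes expressed in MB so the arithmetic stays in integers).
levels = levels_needed(total_size=100_000, level1_size=10, growth_factor=10)
estimate = worst_case_write_amplification(levels - 1, growth_factor=10)
print(levels, estimate)
```

A bound of this kind charges every merge its maximum possible overlap, which is one plausible reason such estimates can overshoot measured insert cost by the factors quoted in the abstract.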
FULL PAPER: pdf