ISTC-CC Abstract
Communication-Efficient Distributed Multiple Reference Pattern Matching for M2M Systems
13th IEEE International Conference on Data Mining (ICDM'13), December 2013.
Ruei-Bin Wang†, Yu-Chen Lu†, Mi-Yen Yeh‡, Shou-De Lin†, Phillip B. Gibbons§
†National Taiwan University, Taiwan
‡Institute of Information Science, Academia Sinica, Taiwan
§Intel Labs Pittsburgh
In M2M applications, ad hoc snapshot queries are common, and they require fast responses from the many local machines across which the data are distributed. When the query is complex, the communication cost of sending it to every local machine for processing can be very high. This paper addresses that issue. Given a reference set of multiple large patterns, we propose an approach for identifying its k nearest and farthest neighbors globally across all the local machines. By decomposing the reference patterns into a multi-resolution representation and using novel distance bound designs, our method guarantees exact results in a communication-efficient manner. Analytical and empirical studies show that our method outperforms state-of-the-art methods by saving significant bandwidth, especially for large numbers of machines and large reference patterns.
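The abstract does not spell out the decomposition or the bounds the paper uses, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes Euclidean distance, a single reference pattern, and a coarse view made of equal-width segment means. All function names and the sample data are hypothetical. A coordinator would send only the segment means plus one residual norm, and each machine would bound the true distance from that summary and drop candidates that cannot be among the k nearest.

```python
import math

def segment_means(series, segments):
    """Coarse view: the mean of each equal-width segment."""
    width = len(series) // segments
    if width * segments != len(series):
        raise ValueError("series length must be divisible by segments")
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(segments)]

def residual_norm(series, means):
    """Euclidean norm of the detail that the coarse view leaves out."""
    width = len(series) // len(means)
    return math.sqrt(sum((v - means[i // width]) ** 2
                         for i, v in enumerate(series)))

def distance_bounds(ref_means, ref_residual, local):
    """Lower and upper bounds on the Euclidean distance to the reference.

    The coarse part and the residual part are orthogonal, so the squared
    distance splits into a coarse term (known exactly) and a residual term
    (bounded by the triangle inequality on the two residual norms).
    """
    segments = len(ref_means)
    width = len(local) // segments
    local_means = segment_means(local, segments)
    local_residual = residual_norm(local, local_means)
    coarse_sq = width * sum((a - b) ** 2
                            for a, b in zip(ref_means, local_means))
    lower = math.sqrt(coarse_sq + (ref_residual - local_residual) ** 2)
    upper = math.sqrt(coarse_sq + (ref_residual + local_residual) ** 2)
    return lower, upper

def prune_for_knn(bounds, k):
    """Keep candidates whose lower bound is at most the k-th smallest upper bound."""
    threshold = sorted(upper for _, upper in bounds)[k - 1]
    return [i for i, (lower, _) in enumerate(bounds) if lower <= threshold]

# Hypothetical data: one reference pattern and three local patterns.
reference = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]
local_patterns = [
    [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 2.0],
    [9.0, 9.0, 8.0, 8.0, 9.0, 9.0, 8.0, 8.0],
    [0.0, 2.0, 3.0, 5.0, 4.0, 3.0, 1.0, 1.0],
]

# Only two means and one residual norm are "sent" instead of eight values.
ref_means = segment_means(reference, 2)
ref_residual = residual_norm(reference, ref_means)
bounds = [distance_bounds(ref_means, ref_residual, p) for p in local_patterns]
survivors = prune_for_knn(bounds, k=1)
print(survivors)  # prints [0, 2]; pattern 1 is ruled out by the coarse summary
```

Candidates that survive would be re-examined at a finer resolution (more segments), which costs more communication but tightens the bounds. At full resolution the residual vanishes and both bounds equal the exact distance, which is how a scheme of this kind can remain exact.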