ISTC-CC Research: Benchmarks
Quick Benchmark Links
Cloudsuite: http://parsa.epfl.ch/cloudsuite/cloudsuite.html
Graph500: http://www.graph500.org/specifications
Gridmix: http://hadoop.apache.org/docs/stable/gridmix.html
GTStream: https://github.com/chengweiwang/GTStream
Nectere: http://www.cc.gatech.edu/~adit262/#Nectere
PARSEC: http://parsec.cs.princeton.edu/
PBBS: http://www.cs.cmu.edu/~pbbs/benchmarks.html
SHOC: https://github.com/vetter/shoc/wiki
Xerxes: http://www.cc.gatech.edu/~mukil/xerxes/xerxes_v1.tbz
YCSB: https://github.com/brianfrankcooper/YCSB
(scroll down for more info)
Available Benchmarks
GTSTREAM
A Scalable Big Data Streaming Benchmark for Evaluating Real-Time Cloud
Link: https://github.com/chengweiwang/GTStream
Papers:
- VScope: Middleware for Troubleshooting Time-Sensitive Data Center Applications
- Faster, Larger, Easier: Reining Real-Time Big Data Processing in Cloud
- Project Hoover: Auto-Scaling Streaming Map-Reduce Applications
Source: https://github.com/chengweiwang/GTStream
GTStream is a real-time, multi-tier big data platform consisting of a Flume log-processing tier, an HBase key-value store, an HDFS file-system tier, and an optional MapReduce tier. The GTStream platform supports a variety of big data streaming applications with different types of workloads. Its default application analyzes distributed web server logs to understand the online behavior of web customers. It is completely distributed and scalable, with routine deployment on over 200 virtual machines (we have run up to 1,000 VMs). GTStream is useful for setting up realistically large-scale, online, multi-tier big data applications and for gaining performance, reliability, and availability insights by varying GTStream configurations and workload characteristics (see Project Hoover: Auto-Scaling Streaming Map-Reduce Applications).
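To make the workload-variation knobs above concrete, here is a minimal, hypothetical Python sketch (not part of the GTStream distribution) of a synthetic web-server log generator of the kind that could feed the Flume ingestion tier; the event rate and page-popularity skew stand in for the workload characteristics one would vary.

# Hypothetical sketch (not GTStream code): a synthetic web-server log
# generator that could drive a Flume -> HBase -> HDFS pipeline like the
# one GTStream sets up. Event rate and key skew are the kinds of
# workload knobs the description above refers to.
import random
import time
from datetime import datetime, timezone

PAGES = [f"/product/{i}" for i in range(1000)]

def log_line(rng: random.Random) -> str:
    """One Apache combined-log-format entry for a synthetic customer click."""
    ip = ".".join(str(rng.randint(1, 254)) for _ in range(4))
    # Zipf-like skew: a few hot pages receive most of the traffic.
    page = PAGES[min(int(rng.paretovariate(1.2)) - 1, len(PAGES) - 1)]
    ts = datetime.now(timezone.utc).strftime("%d/%b/%Y:%H:%M:%S +0000")
    status = rng.choices([200, 404, 500], weights=[95, 4, 1])[0]
    size = rng.randint(200, 50_000)
    return f'{ip} - - [{ts}] "GET {page} HTTP/1.1" {status} {size}'

def generate(path: str, events_per_sec: int, duration_sec: int, seed: int = 1) -> None:
    """Append synthetic log lines at a fixed rate; a Flume agent would tail this file."""
    rng = random.Random(seed)
    with open(path, "a", buffering=1) as out:
        for _ in range(duration_sec):
            for _ in range(events_per_sec):
                out.write(log_line(rng) + "\n")
            time.sleep(1)

if __name__ == "__main__":
    generate("access.log", events_per_sec=500, duration_sec=10)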
NECTERE
Benchmark emulating streaming codes for multimedia (high throughput) or finance (low latency), for use in GPU and non-GPU cluster settings.
Link: http://www.cc.gatech.edu/~adit262/#Nectere
Paper:
- Benchmarking Next Generation Hardware Platforms: An Experimental Approach
Source: http://www.cc.gatech.edu/~adit262/docs/nectere.tar.bz2
A benchmark comprising image processing and financial workloads that uses CPUs/GPUs for processing and Ethernet/InfiniBand for communication. The goal of Nectere is to understand how systems with heterogeneity in both processing and networking capabilities perform on such future datacenter workloads. Internally, Nectere is composed of different components that can be combined to test diverse and varying application usage patterns. For example, we combine the InfiniBand/Ethernet communication components with the image pipelining component to measure the performance improvement from using InfiniBand for such applications.
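As a rough illustration of the component-composition idea (a hypothetical Python sketch, not Nectere code), the outline below chains pluggable processing and communication stages into a pipeline and times its throughput; a real harness would substitute GPU kernels and Ethernet/InfiniBand transports for the stand-ins.

# Hypothetical sketch of the composition idea described above (not
# Nectere's actual code): pluggable "processing" and "communication"
# stages that can be combined into a pipeline and timed.
import time
from typing import Callable, List

Stage = Callable[[bytes], bytes]

def cpu_image_stage(data: bytes) -> bytes:
    """Stand-in for an image-processing step (e.g., a filter pass)."""
    return bytes(b ^ 0xFF for b in data)  # trivial per-pixel transform

def loopback_comm_stage(data: bytes) -> bytes:
    """Stand-in for a transport step; a real harness would use Ethernet or IB verbs."""
    return data  # in-process copy instead of a network hop

def run_pipeline(stages: List[Stage], frame: bytes, iterations: int) -> float:
    """Push one frame through the stage chain repeatedly; return frames/sec."""
    start = time.perf_counter()
    for _ in range(iterations):
        out = frame
        for stage in stages:
            out = stage(out)
    elapsed = time.perf_counter() - start
    return iterations / elapsed

if __name__ == "__main__":
    frame = bytes(64 * 1024)  # 64 KiB "image"
    fps = run_pipeline([cpu_image_stage, loopback_comm_stage], frame, 200)
    print(f"throughput: {fps:.1f} frames/sec")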
PARSEC
The Princeton Application Repository for Shared-Memory Computers (PARSEC) is a benchmark suite composed of multithreaded programs. The suite focuses on emerging workloads and was designed to be representative of next-generation shared-memory programs for chip-multiprocessors. [...more]
THE PROBLEM BASED BENCHMARK SUITE
The problem-based benchmark suite (PBBS) is designed to be an open-source repository for comparing different parallel programming methodologies in terms of performance and code quality. The benchmarks define problems in terms of the function they implement, not the particular algorithm or code they use. We encourage people to implement the benchmarks using any algorithm, in any programming language, with any form of parallelism (or sequentially), and for any machine (a minimal illustration of this approach follows the list below). The problems are selected so they:
- Are representative of a reasonably wide variety of real-world tasks
- Can be defined concisely
- Are simple enough that reasonably efficient solutions can be implemented in 500 lines of code, yet are not trivial microbenchmarks
- Have outputs that can be easily tested for correctness and possibly quality. [...more]
Current benchmarks in the PBBS: http://www.cs.cmu.edu/~pbbs/benchmarks.html
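The problem-based approach can be illustrated with a small, hypothetical Python harness (not part of PBBS): the benchmark fixes only the input/output behavior of a problem, here comparison sorting, so any implementation can be timed and then checked against the specification.

# A minimal illustration of the PBBS idea under assumptions of my own
# (this is not PBBS code): the benchmark defines a problem by its
# input/output behavior, so any implementation can be timed and verified.
import heapq
import random
import time
from typing import Callable, List, Sequence

def check_sorted(inp: Sequence[int], out: Sequence[int]) -> bool:
    """Output must equal the input sorted in non-decreasing order."""
    return list(out) == sorted(inp)

def benchmark(impl: Callable[[List[int]], List[int]], n: int, seed: int = 0) -> float:
    """Time one implementation of the 'comparison sort' problem, then verify it."""
    rng = random.Random(seed)
    data = [rng.randint(0, n) for _ in range(n)]
    start = time.perf_counter()
    result = impl(list(data))
    elapsed = time.perf_counter() - start
    assert check_sorted(data, result), "implementation violates the problem spec"
    return elapsed

if __name__ == "__main__":
    # Two interchangeable implementations: the harness only cares about
    # the function they compute, not the algorithm used to compute it.
    print("built-in sort:", benchmark(sorted, 100_000))
    print("heap-based   :", benchmark(lambda xs: heapq.nsmallest(len(xs), xs), 100_000))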
SHOC
The Scalable HeterOgeneous Computing benchmark suite (SHOC) is a collection of benchmark programs that test the performance and stability of systems using computing devices with nontraditional architectures for general-purpose computing, as well as the software used to program them. The suite focuses on systems that contain Graphics Processing Units (GPUs), multi-core processors, and new vector-based coprocessors such as Xeon Phi. It currently supports OpenCL, MIC, and CUDA variants of its microbenchmarks and application kernels, and an OpenACC variant is under development. At a higher level, SHOC uses message-passing-based applications to evaluate system-wide performance for features like intra-node and inter-node communication among devices.
Link: https://github.com/vetter/shoc/wiki
YCSB++
YCSB++ is an advanced benchmark suite developed to understand trade-offs among cloud table-store systems and to debug advanced performance features. Built on top of the Yahoo! Cloud Serving Benchmark (YCSB), it goes beyond measuring the rate of simple workloads and adds a set of extensions that improve performance understanding and debugging of several advanced features. [...more]
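For context, the core YCSB-style measurement loop that YCSB++ extends looks roughly like the following hypothetical Python sketch (not YCSB++ code): a load phase populates the store, then a run phase issues a read/insert mix and records per-operation latencies; an in-memory dict stands in for the real table store (HBase, Accumulo, etc.).

# Hedged sketch of a YCSB-style workload client (not YCSB++ code):
# load phase, then a timed run phase with a read/insert operation mix.
import random
import statistics
import time

def load(store: dict, records: int) -> None:
    """Load phase: populate the (stand-in) table store."""
    for i in range(records):
        store[f"user{i}"] = f"value{i}"

def run(store: dict, operations: int, read_fraction: float = 0.95, seed: int = 0):
    """Run phase: issue a read/insert mix and record per-operation latencies."""
    rng = random.Random(seed)
    latencies = []
    next_key = len(store)
    for _ in range(operations):
        start = time.perf_counter()
        if rng.random() < read_fraction:
            _ = store.get(f"user{rng.randrange(len(store))}")   # read
        else:
            store[f"user{next_key}"] = "new"                    # insert
            next_key += 1
        latencies.append(time.perf_counter() - start)
    return latencies

if __name__ == "__main__":
    store: dict = {}
    load(store, 10_000)
    lat = sorted(run(store, 50_000))
    print(f"mean {statistics.mean(lat) * 1e6:.1f} us, "
          f"p99 {lat[int(0.99 * len(lat))] * 1e6:.1f} us")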
XERXES
Datacenter workload emulation, based on Google traces:
Paper:
- Xerxes: Distributed Load Generator for Cloud-scale Experimentation
Benchmark: http://www.cc.gatech.edu/~mukil/xerxes/xerxes_v1.tbz
Xerxes is a distributed load-generation framework that can produce desired CPU and memory consumption patterns across a large number of machines. It is organized as a collection of individual load generators, one per machine, coordinated by a master node. It can be used to replay traces from datacenter machines, generate loads that fit statistical distributions, and generate resource-usage spikes, all at varying scales. The accuracy of the aggregate resource-consumption pattern degrades gracefully as experiment nodes fail. This lets datacenter/cluster administrators and researchers perform highly scalable load testing without having to deal with application-specific logic.
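As a rough single-machine illustration of this kind of load shaping (a hypothetical Python sketch, not the Xerxes implementation), the worker below follows a trace of (duration, target CPU utilization) steps by busy-spinning for the target fraction of each 100 ms slice; Xerxes coordinates many such per-machine generators from a master node.

# Hypothetical single-node sketch (not Xerxes code): shape CPU load by
# duty-cycling within fixed time slices, following a utilization trace.
import time

SLICE = 0.1  # seconds per duty-cycle slice

def generate_cpu_load(trace):
    """trace: list of (duration_sec, target utilization in [0, 1]) steps."""
    for duration, util in trace:
        end = time.perf_counter() + duration
        while time.perf_counter() < end:
            busy_until = time.perf_counter() + SLICE * util
            while time.perf_counter() < busy_until:
                pass                              # burn CPU for the busy part of the slice
            time.sleep(SLICE * (1.0 - util))      # idle for the rest of the slice

if __name__ == "__main__":
    # A small trace with a spike in the middle, like a replayed datacenter trace.
    generate_cpu_load([(5, 0.2), (2, 0.9), (5, 0.3)])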