1. Motivation / Background

- Growing demand for high memory capacity at low cost
- Emerging NVMs offer Multi-Level Cell (MLC) for low cost per bit
  - PCM projected to offer higher density than DRAM, which stores only a single bit per cell
- Disadvantages of adopting MLC: Higher latency and energy
  - 2 bits/cell: MSB, LSB. High latency and energy for LSB reads and MSB writes
- Existing technology conservatively designed for slowest bit of multi-bit cell

2. Key Idea: Expose Lower Latency & Energy of Faster Bit

- Decoupled Bit Mapping (DeBiM)
  - Maps different bits of multi-bit cell to logically separate memory addresses
  - Exposes lower latency and energy of MSB reads and LSB writes to system software (OS)
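
A minimal sketch of the mapping idea, for illustration only. The function names, the bit-granularity addressing, and the choice to place the MSB region before the LSB region within a row are assumptions, not details taken from the design above. It contrasts a conventional (coupled) layout, where neighboring logical bits share a cell, with a decoupled layout, where the MSBs and LSBs of a row of cells form two separate contiguous logical regions.

```python
def coupled_map(bit_index, cells_per_row):
    """Conventional layout (assumed): consecutive logical bits share one cell."""
    row, offset = divmod(bit_index, 2 * cells_per_row)
    cell, which = divmod(offset, 2)
    return row, cell, "MSB" if which == 0 else "LSB"


def decoupled_map(bit_index, cells_per_row):
    """Decoupled layout (assumed): the first half of a row's logical bits
    maps to the MSBs of its cells, the second half to the LSBs."""
    row, offset = divmod(bit_index, 2 * cells_per_row)
    which, cell = divmod(offset, cells_per_row)
    return row, cell, "MSB" if which == 0 else "LSB"


if __name__ == "__main__":
    # With 4 cells per row, show where the 8 logical bits of row 0 land.
    for i in range(8):
        print(i, coupled_map(i, 4), decoupled_map(i, 4))
```

Under the decoupled layout, a contiguous logical range touches only MSBs or only LSBs, which is what lets software see one region as fast to read and the other as fast to write.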

- Asymmetry-Aware Page Mapping (AsPaM)
  - OS allocates read-intensive pages to MSB and write-intensive pages to LSB
  - Predicts a page's memory access pattern from the address (PC) of the instruction that triggers the page's allocation
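
The PC-based prediction can be sketched roughly as follows. This is a hypothetical model, not the actual OS mechanism: the class name `PcPagePredictor`, the per-PC read/write counters, the simple "more writes than reads" rule, and the MSB default for unseen PCs are all assumptions made for illustration.

```python
from collections import defaultdict


class PcPagePredictor:
    """Hypothetical predictor: pages allocated by the same instruction (PC)
    are assumed to show similar read/write behavior."""

    def __init__(self):
        # pc -> [reads, writes] observed on pages that this PC allocated
        self.stats = defaultdict(lambda: [0, 0])
        # page number -> PC of the instruction that allocated the page
        self.page_owner = {}

    def allocate(self, page, pc):
        """Record the allocating PC and return the predicted target region."""
        self.page_owner[page] = pc
        reads, writes = self.stats[pc]
        # Assumed policy: write-dominated history goes to LSB (faster writes),
        # everything else, including unseen PCs, goes to MSB (faster reads).
        return "LSB" if writes > reads else "MSB"

    def record_access(self, page, is_write):
        """Credit an access to the PC that allocated the page."""
        pc = self.page_owner.get(page)
        if pc is None:
            return
        self.stats[pc][1 if is_write else 0] += 1


if __name__ == "__main__":
    predictor = PcPagePredictor()
    print(predictor.allocate(page=1, pc=0x400))  # no history yet
    for _ in range(3):
        predictor.record_access(1, is_write=True)
    print(predictor.allocate(page=2, pc=0x400))  # history is write-heavy
```

A real implementation would also need to bound the table size and age out stale counts; those details are left out of this sketch.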

- Split Half-Row Buffering (SplitRB)
  - Two logical row buffers from single physical row buffer

3. Results: Improved Performance and Energy-Efficiency

- 8-core x86 system simulation
  - MLC PCM main memory: Conventional vs. (E-)DeBiM

- Improved performance: +26.3%
  - Memory read latency: -19.5%
  - Thread fairness: +20.2%

- Improved memory energy efficiency
  - Performance per memory Watt: +18.9%