Durable Transactional Memory Can Scale with TimeStone
Session: Speculation and consistency--Brain teasers.
Authors: Madhava Krishnan Ramanathan (Virginia Polytechnic Institute and State University); Jaeho Kim (Huawei Dresden Research Center); Ajit Mathew (Virginia Polytechnic Institute and State University); Xinwei Fu (Virginia Polytechnic Institute and State University); Anthony Demeri (Virginia Polytechnic Institute and State University); Changwoo Min (Virginia Polytechnic Institute and State University); Sudarsun Kannan (Rutgers University)
Non-volatile main memory (NVMM) technologies promise byte addressability and near-DRAM access latency, allowing developers to build persistent applications with ordinary load and store instructions. However, these promises are difficult to realize because NVMM software must also provide crash consistency while delivering high performance and scalability. Durable transactional memory (DTM) systems address these challenges, but none of them scale beyond 16 cores. The poor scalability stems either from the underlying software transactional memory (STM) layer or from limited write parallelism (single writer or dual version). Existing approaches to guaranteeing crash consistency also suffer from high write amplification and a large memory footprint. To address these challenges, we propose TimeStone: a highly scalable DTM system with low write amplification and a minimal memory footprint. TimeStone uses a novel multi-layered hybrid logging technique, called TOC logging, to guarantee crash consistency. TimeStone also relies on a multi-version concurrency control (MVCC) mechanism to achieve high scalability and to support different isolation levels on the same data set. Our evaluation against state-of-the-art DTM systems shows that TimeStone significantly outperforms them across a wide range of workloads with varying data-set sizes and contention levels, scaling up to 112 hardware threads. In addition, with TOC logging, TimeStone achieves a write amplification of less than 1, while existing DTM systems incur 2× to 6× write amplification.
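The abstract does not define write amplification. The sketch below assumes the common definition: bytes physically written to NVMM divided by bytes the application asked to update. It illustrates one way a ratio below 1 can arise, namely when repeated updates to the same object are absorbed before being persisted. The function names, the 64-byte object size, and the "log copy plus in-place write" cost model are hypothetical choices for illustration. This is not TimeStone's TOC logging, whose design is not described here.

```python
def write_amplification(bytes_written_to_nvmm: int, bytes_updated_by_app: int) -> float:
    """Assumed definition: NVMM bytes written per byte the application updated."""
    if bytes_updated_by_app <= 0:
        raise ValueError("application must have updated at least one byte")
    return bytes_written_to_nvmm / bytes_updated_by_app


def coalesced_bytes(updated_object_ids: list[str], object_size: int) -> int:
    """Bytes persisted if only the final version of each distinct object is written."""
    return len(set(updated_object_ids)) * object_size


# Hypothetical workload: four updates that touch only two distinct objects.
updates = ["a", "a", "a", "b"]
object_size = 64  # bytes per object (assumption)
app_bytes = len(updates) * object_size

# Hypothetical logging scheme: every update is written twice (log copy + in-place).
wa_logging = write_amplification(2 * app_bytes, app_bytes)

# Hypothetical coalescing scheme: repeated updates are absorbed before persisting.
wa_coalesced = write_amplification(coalesced_bytes(updates, object_size), app_bytes)

print(f"logging: {wa_logging:.1f}x, coalesced: {wa_coalesced:.1f}x")
```

Under these assumptions the double-write scheme lands at 2×, the low end of the range quoted for existing systems, while coalescing skewed updates pushes the ratio below 1.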