P3DC: Reducing DRAM Cache Hit Latency by Hybrid Mappings
Abstract
Die-stacked DRAM caches are increasingly advocated to bridge the performance gap between on-chip caches and
main memory. To fully realize their potential, it is essential to improve the DRAM cache hit rate and lower the cache hit latency. To
combine the high hit rate of set-associative mapping with the low hit latency of direct mapping, we propose
a Partial Direct-mapped Die-stacked DRAM Cache called P3DC. This design is motivated by the following observation:
applying a single, unified mapping policy to different types of blocks cannot achieve both a high cache hit rate and a low hit latency. P3DC
classifies data blocks into leading blocks and following blocks, and places them at static positions and dynamic positions,
respectively, in a unified set-associative structure. We also propose a replacement policy that balances the different miss penalties of the two block types
against temporal locality. In addition, P3DC provides a policy to mitigate cache thrashing due to block type
variations. Experimental results demonstrate that, for workloads that are particularly suitable for direct mapping, P3DC
can reduce the cache hit latency by 20.5% while achieving a cache hit rate similar to that of set-associative caches. P3DC
improves instructions per cycle (IPC) by 12% on average and by up to 66% compared with the state-of-the-art direct-mapped
cache BEAR, and by 6% on average and by up to 19% compared with the tag-data decoupled set-associative cache DEC-A8.