Qi Wang, Jia-Rui Li, Dong-Hui Wang. Improving the Performance and Energy Efficiency of Phase Change Memory Systems[J]. Journal of Computer Science and Technology, 2015, 30(1): 110-120. DOI: 10.1007/s11390-015-1508-3

Improving the Performance and Energy Efficiency of Phase Change Memory Systems

Funds: This work was supported by the National Science and Technology Major Projects of China under Grant No. 2009ZX01 034-001-002-005 and the Knowledge Innovation Project of Institute of Acoustics, Chinese Academy of Sciences.
  • Author Bio:

    Qi Wang received her B.S. degree in electronic information science and technology from Lanzhou University in 2011. She is now a Ph.D. candidate at the Institute of Acoustics, Chinese Academy of Sciences, Beijing. Her research interests include VLSI design, computer architecture, and emerging memory technologies.

  • Received Date: July 12, 2014
  • Revised Date: December 17, 2014
  • Published Date: January 04, 2015
  • Phase change memory (PCM) is a promising technology for future memory thanks to its better scalability and lower leakage power than DRAM (dynamic random-access memory). However, adopting PCM as main memory requires overcoming its write issues, such as long write latency and high write power. In this paper, we propose two techniques to improve the performance and energy efficiency of PCM memory systems. First, we propose a victim cache technique that uses the existing buffer in the memory controller to reduce PCM memory accesses. The key idea is to reorganize the buffer into a victim cache structure (RBC) that provides additional hits for the LLC (last-level cache). Second, we propose a chip parallelism-aware replacement policy (CPAR) for the victim cache to further improve performance. Instead of evicting one cache line at a time, CPAR evicts multiple cache lines that access different PCM chips. CPAR can reduce the frequency of victim cache evictions and improve the write parallelism of PCM chips. The evaluation results show that, compared with the baseline, RBC improves PCM memory system performance by up to 9.4% and by 5.4% on average. Combining CPAR with RBC (RBC+CPAR) improves performance by up to 19.0% and by 12.1% on average. Moreover, RBC and RBC+CPAR reduce memory energy consumption by 8.3% and 6.6% on average, respectively.
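
The chip-parallel eviction idea behind CPAR can be illustrated with a small sketch. This is not the authors' implementation: the abstract only states that CPAR evicts multiple cache lines that access different PCM chips. The address-to-chip mapping (`chip_of`), the constants `LINE_SIZE` and `NUM_CHIPS`, and the choice to scan candidates in least-recently-used order are all assumptions made for illustration.

```python
LINE_SIZE = 64   # bytes per cache line (assumed, not from the paper)
NUM_CHIPS = 8    # PCM chips that can be written in parallel (assumed)


def chip_of(addr: int) -> int:
    """Hypothetical mapping: consecutive cache lines go to consecutive chips."""
    return (addr // LINE_SIZE) % NUM_CHIPS


def select_victims(lru_order, max_victims=NUM_CHIPS):
    """Pick a batch of victim lines, at most one per PCM chip.

    lru_order: iterable of line addresses, least recently used first.
    Returns the addresses chosen for eviction, in the order they were picked.
    """
    victims = []
    busy_chips = set()
    for addr in lru_order:
        chip = chip_of(addr)
        if chip in busy_chips:
            continue  # this chip already has a write in the batch; skip the line
        victims.append(addr)
        busy_chips.add(chip)
        if len(victims) == max_victims:
            break
    return victims


# Lines 0x000 and 0x200 both map to chip 0, so only the older one is chosen.
batch = select_victims([0x000, 0x200, 0x040, 0x240, 0x080])
print([hex(a) for a in batch])
```

Because each selected line targets a different chip, the resulting writes could in principle proceed in parallel, and one eviction pass frees several victim cache entries at once. How the real policy ranks candidates and bounds the batch size is not specified in the abstract.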