›› 2013, Vol. 28 ›› Issue (1): 54-71.doi: 10.1007/s11390-013-1312-x

Special Issue: Computer Architecture and Systems; Computer Networks and Distributed Computing

• Special Section on Selected Paper from NPC 2011 • Previous Articles     Next Articles

Energy Efficient Run-Time Incremental Mapping for 3-D Networks-on-Chip

Xiao-Hang Wang1,2 (王小航), Peng Liu2,* (刘鹏), Mei Yang3 (杨梅), Maurizio Palesi4, Ying-Tao Jiang3 (蒋颖涛), and Michael C Huang5 (黄巍)   

  1. 1. Intelligent Chips and Systems Research Centre, Guangzhou Institute of Advanced Technology, Chinese Academy of Sciences Guangzhou 511458, China;
    2. Department of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China;
    3. Department of Electrical and Computer Engineering, University of Nevada, Las Vegas, Nevada, U.S.A.;
    4. Faculty of Engineering, Kore University, Catania, Italy;
    5. Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York, U.S.A.
  • Received:2012-03-05 Revised:2012-11-01 Online:2013-01-05 Published:2013-01-05
  • Supported by:

    This work is supported by the National Natural Science Foundation of China under Grant Nos. 60873112 and 61028004, the National Natural Science Foundation of USA under Grant No. CNS-1126688.

3-D Networks-on-Chip (NoC) emerge as a potent solution to address both the interconnection and design complexity problems facing future Multiprocessor System-on-Chips (MPSoCs). Effective run-time mapping on such 3-D NoC-based MPSoCs can be quite challenging, as the arrival order and task graphs of the target applications are typically not known a priori, which can be further complicated by stringent energy requirements for NoC systems. This paper thus presents an energy-aware run-time incremental mapping algorithm (ERIM) for 3-D NoC which can minimize the energy consumption due to the data communications among processor cores, while reducing the fragmentation effect on the incoming applications to be mapped, and simultaneously satisfying the thermal constraints imposed on each incoming application. Specifically, incoming applications are mapped to cuboid tile regions for lower energy consumption of communication and the minimal routing. Fragment tiles due to system fragmentation can be gleaned for better resource utilization. Extensive experiments have been conducted to evaluate the performance of the proposed algorithm ERIM, and the results are compared against the optimal mapping algorithm (branch-and-bound) and two heuristic algorithms (TB and TL). The experiments show that ERIM outperforms TB and TL methods with significant energy saving (more than 10%), much reduced average response time, and improved system utilization.

[1] Pavlidis V F, Friedman E G. 3-D topologies for networks-on-chip. IEEE Trans. Very Large Scale Integration Systems,2007, 15(10): 1081-1090
[2] Davis W R, Wilson J, Mick S et al. Demystifying 3D ICs:The pros and cons of going vertical. IEEE Design and Testof Computers, 2005, 22(6): 498-511.
[3] Triviño F, S醤chez J L, Alfaro F J, Flich J. Virtualizingnetwork-on-chip resources in chip-multiprocessors. Micropro-cessors and Microsystems, 2011, 35(2): 230-245.
[4] Zhu C, Gu Z, Shang L, Dick R P, Joseph R. Three-dimensionalchip-multiprocessor run-time thermal management. IEEETrans. Computer-Aided Design of Integrated Circuits andSystems, 2008, 27(8): 1479-1492.
[5] Addo-Quaye C. Thermal-aware mapping and placement for3-D NoC designs. In Proc. Int. SoC Conf., Sept. 2005,pp.25-28.
[6] Chou C L, Marculescu R. Run-time task allocation consider-ing user behavior in embedded multiprocessor networks-on-chip. IEEE Trans. Computer-Aided Design of IntegratedCircuits and Systems, 2010, 29(1): 78-91.
[7] Chou C L, Ogras U Y, Marculescu R. Energy- andperformance-aware incremental mapping for networks on chipwith multiple voltage levels. IEEE Trans. Computer-AidedDesign of Integrated Circuits and Systems, 2008, 27(10):1866-1879.
[8] Matsutani H, Koibuchi M, Amano H. Tightly-coupled multi-layer topologies for 3-D NoCs, In Proc. Int. Conf. ParallelProcessing, Sept. 2007, Article No.75.
[9] Feero B S, Pande P P. Networks-on-chip in a three-dimensional environment: A performance evaluation. IEEETrans. Computers, 2009, 58(1): 32-45.
[10] Kim J, Nicopoulos C, Park D et al. A novel dimensionally-decomposed router for on-chip communication in 3D archi-tectures. In Proc. Int. Symp. Computer Architecture, June2007, pp.138-149.
[11] Seiculescu C, Murali S, Benini L, De Micheli G. Sunfloor 3D:A tool for networks on chip topology synthesis for 3-D systemson chips. IEEE Trans. Computer-Aided Design of IntegratedCircuits and Systems, 2010, 29(12): 1987-2000.
[12] Pavlidis V F, Friedman E G. 3-D topologies for networks-on-chip. IEEE Trans. Very Large Scale Integration Systems,2007, 15(10): 1081-1090.
[13] Wang X, Yang M, Jiang Y, Liu P. A power-aware mappingapproach to map IP cores onto NoCs under bandwidth andlatency constraints. ACM Trans. Architecture and Code Op-timization, 2010, 7(1): 1-30.
[14] Land A H, Doig A G. An automatic method for solving dis-crete programming problems. Econometrica, 1960, 28(3):497-520.
[15] Hu J, Marculescu R. Energy- and performance-aware map-ping for regular NoC architectures. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, 2005, 24(4):551-562.
[16] Addo-Quaye C. Thermal-aware mapping and placement for3-D NoC designs. In Proc. Int. SoC Conf., Sept. 2005,pp.25-28.
[17] Smit L T, Smit G J M, Hurink J L et al. Run-time assignmentof tasks to multiple heterogeneous processors. In Proc. the5th Progress Embedded System Symp., Oct. 2004, pp.185-192.
[18] Carvalho E, Calazans N, Moraes F. Heuristics for dynamictask mapping in NoC-based heterogeneous MPSoCs. In Proc.the 18th IEEE/IFIP Int. Workshop. Rapid System Prototyp-ing, May 2007, pp.34-40.
[19] Lo V, Windisch K J, Liu W, Nitzberg B. Noncontiguous pro-cessor allocation algorithms for mesh-connected multicomput-ers. Trans. Parallel and Distributed Systems, 1997, 8(7):712-726.
[20] Arjomand M, Sarbazi-Azad H. Voltage-frequency planning forthermal-aware, low-power design of regular 3-D NoCs. InProc. the 23rd Int. Conf. VLSI Design, Jan. 2010, pp.57-62.
[21] Zhou X, Yang J, Xu Y, Zhang Y, Zhao J. Thermal-awaretask scheduling for 3D multi-core processors. IEEE Trans.Parallel and Distributed Systems, 2010, 21(1): 60-71.
[22] Chao C H, Jheng K Y, Wang H Y et al. Traffic- and thermal-aware run-time thermal management scheme for 3D NoC sys-tems. In Proc. the 4th ACM/IEEE Int. Symp. Networks-on-Chip, May 2010, pp.223-230.
[23] Truong D, Cheng W, Mohsenin T et al. A 167-processor65nm computational platform with per-processor dynamicsupply voltage and dynamic clock frequency scaling. In Proc.IEEE Symp. VLSI Circuits, June 2008, pp.22-23.
[24] Liu Y, Yang H, Dick R P, Wang H, Shang L. Thermal vs en-ergy optimization for DVFS-enabled processors in embeddedsystems. In Proc. Int. Symp. Quality Electronic Design,March 2007, pp.204-209.
[25] Srinivasan J, Adve S V, Bose P, Rivers J A. The impactof technology scaling on lifetime reliability. In Proc. Int.Conf. Dependable Systems and Networks, June 28-July 1,2004, pp.177-186.
[26] Huang W, Ghosh S, Velusamy S et al. HotSpot: A compactthermal modeling methodology for early-stage VLSI design.IEEE Trans. Very Large Scale Integration Systems, 2006,14(5): 501-513.
[27] Dally W J, Towles B. Principles and Practices of Intercon-nection Networks. San Francisco, USA: Morgan Kaufmann,2004.
[28] Kahng A, Li B, Peh L S et al. Orion 2.0: A fast and accu-rate NoC power and area model for early-stage design spaceexploration. In Proc. Conf. Design, Automation & Test inEurope, Apr. 2009, pp.423-428.
[29] Garey M R, Johnson D S. Computers and Intractability: AGuide to the Theory of NP-Completeness. New York, USA:WH Freeman, 1979.
[30] Lin M, Gamal A E, Lu Y C, Wong S. Performance benefitsof monolithically stacked 3D-FPGA. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, 2007, 26(2):216-229.
[31] Dick R P, Rhodes D L, Wolf W. TGFF: Task graphs for free.In Proc. the 6th Int. Workshop on Hardware/Software Code-sign, March 1998, pp.97-101.
[32] Yang Y S, Bahn J H, Lee S E, Bagherzadeh N. Parallel andpipeline processing for block cipher algorithms on a network-on-chip. In Proc. the 6th Int. Conf. Information Technology:New Generations, Apr. 2009, pp.849-854.
[33] Delorme J, Houzet D. A complete 4G radiocommunicationapplication mapping onto a 2D mesh NoC architecture. InProc. North-East Workshop. Circuits and Systems, June2006, pp.93-96.
[34] Brooks D, Tiwari V, Martonosi M. Wattch: A frameworkfor architectural-level power analysis and optimizations. InProc. the 27th Int. Symp. Computer Architecture, June2000, pp.83-94.
No related articles found!
Full text



[1] Liu Mingye; Hong Enyu;. Some Covering Problems and Their Solutions in Automatic Logic Synthesis Systems[J]. , 1986, 1(2): 83 -92 .
[2] Chen Shihua;. On the Structure of (Weak) Inverses of an (Weakly) Invertible Finite Automaton[J]. , 1986, 1(3): 92 -100 .
[3] Gao Qingshi; Zhang Xiang; Yang Shufan; Chen Shuqing;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[4] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[5] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[6] Min Yinghua; Han Zhide;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[7] Tang Tonggao; Zhao Zhaokeng;. Stack Method in Program Semantics[J]. , 1987, 2(1): 51 -63 .
[8] Min Yinghua;. Easy Test Generation PLAs[J]. , 1987, 2(1): 72 -80 .
[9] Zhu Hong;. Some Mathematical Properties of the Functional Programming Language FP[J]. , 1987, 2(3): 202 -216 .
[10] Li Minghui;. CAD System of Microprogrammed Digital Systems[J]. , 1987, 2(3): 226 -235 .

ISSN 1000-9000(Print)

CN 11-2296/TP

Editorial Board
Author Guidelines
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved