Xiaofang (Maggie) Wang, Swetha Thota. A Resource-Efficient Communication Architecture for Chip Multiprocessors on FPGAs[J]. Journal of Computer Science and Technology, 2011, 26(3): 434-447. DOI: 10.1007/s11390-011-1145-4

A Resource-Efficient Communication Architecture for Chip Multiprocessors on FPGAs

More Information
  • Received Date: February 24, 2010
  • Revised Date: March 13, 2011
  • Published Date: May 04, 2011
  • Significant advances in field-programmable gate arrays (FPGAs) have made it viable to explore innovative multiprocessor solutions on a single FPGA chip. For multiprocessors, an efficient communication network that matches the needs of the target application is critical to overall performance. Wormhole packet-switching network-on-chip (NoC) solutions are replacing conventional shared buses to address the scalability and complexity challenges that come with the increasing number of processing elements (PEs). However, the quest for high-performance networks has led to very complex and resource-expensive NoC designs, leaving little room for the real computing force, i.e., the PEs. Moreover, many techniques increase the resource usage of routers while offering very small performance gains, or none at all, when network traffic is light. We argue that computation is still the primary task of multiprocessors and that sufficient resources should be reserved for PEs. This paper presents our novel design and implementation of a resource-efficient communication network for multiprocessors on FPGAs. By introducing a new PE-router topology, we reduce the number of routers required for a given number of PEs, and we also reduce the resource requirement of each router. Our communication network relies on the NEWS channels to transfer packets in a pipelined fashion, following the path determined by the routing network. The implementation results on various Xilinx FPGAs show good performance in the typical range of network load for multiprocessor applications.
