A Pipelining Loop Optimization Method for Dataflow Architecture
Journal of Computer Science and Technology
Journal of Computer Science and Technology 2018, Vol. 33 Issue (1) :116-130    DOI: 10.1007/s11390-017-1748-5
Computer Architecture and Systems
Xu Tan1,2, Student Member, CCF, Xiao-Chun Ye1,3, Member, CCF, Xiao-Wei Shen1,2, Yuan-Chao Xu1,4,*, Member, CCF, Da Wang1, Member, CCF, Lunkai Zhang5, Wen-Ming Li1, Member, CCF, Dong-Rui Fan1,2, Senior Member, CCF, Zhi-Min Tang1, Distinguished Member, CCF
1 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;
2 School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100049, China;
3 State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125, China;
4 College of Information Engineering, Capital Normal University, Beijing 100048, China;
5 Department of Computer Science, The University of Chicago, Chicago, IL 60637, U.S.A.

Abstract: With the coming of the exascale supercomputing era, power efficiency has become the most important obstacle to building an exascale system. Dataflow architecture has a native advantage in achieving high power efficiency for scientific applications. However, state-of-the-art dataflow architectures fail to exploit high parallelism in loop processing. To address this issue, we propose a pipelining loop optimization method (PLO), which makes the iterations of a loop flow through the processing element (PE) array of a dataflow accelerator. The method consists of two techniques: architecture-assisted hardware iteration and instruction-assisted software iteration. In the hardware iteration execution model, an on-chip loop controller is designed to generate loop indexes, reducing the complexity of the computing kernel and laying a good foundation for pipelined execution. In the software iteration execution model, additional loop instructions are introduced to solve the iteration dependency problem. With these two techniques, the average number of instructions ready to execute per cycle increases, keeping the floating-point unit busy. Simulation results show that the proposed method outperforms the static and dynamic loop execution models in floating-point efficiency by 2.45x and 1.1x on average, respectively, while the hardware cost of the two techniques is acceptable.
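The pipelining idea in the abstract can be illustrated with a toy cycle-count model. This is only a sketch under stated assumptions, not the authors' simulator or hardware design: `loop_controller`, `simulate_pipeline`, and the fixed `depth` are hypothetical names and parameters, and the PE array is reduced to a simple linear pipeline into which one loop index can be injected per cycle.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def loop_controller(start: int, stop: int, step: int = 1) -> Iterator[int]:
    """Hypothetical stand-in for a loop controller: yields one loop index at a time."""
    yield from range(start, stop, step)


def serialized_cycles(num_iterations: int, depth: int) -> int:
    """Cycles needed if each iteration must leave the pipeline before the next enters."""
    return num_iterations * depth


def pipelined_cycles(num_iterations: int, depth: int) -> int:
    """Cycles needed if a new iteration enters every cycle (classic pipeline formula)."""
    if num_iterations == 0:
        return 0
    return depth + num_iterations - 1


def simulate_pipeline(indexes: Iterable[int], depth: int) -> Tuple[int, List[int]]:
    """Cycle-by-cycle model: inject one index per cycle, retire it after `depth` stages."""
    stages: List[Optional[int]] = [None] * depth
    source = iter(indexes)
    retired: List[int] = []
    cycles = 0
    while True:
        incoming = next(source, None)
        if incoming is None and all(slot is None for slot in stages):
            break  # nothing left to inject and the pipeline is empty
        # Shift every in-flight iteration one stage forward and inject the new one.
        stages = [incoming] + stages[:-1]
        cycles += 1
        # An iteration that reaches the last stage completes in this cycle.
        if stages[-1] is not None:
            retired.append(stages[-1])
            stages[-1] = None
    return cycles, retired


if __name__ == "__main__":
    cycles, order = simulate_pipeline(loop_controller(0, 8), depth=4)
    print("pipelined cycles:", cycles, "retire order:", order)
    print("serialized cycles:", serialized_cycles(8, 4))
```

In this simplified model, overlapping iterations keeps every stage occupied once the pipeline fills, which is the generic effect behind keeping execution units busy. The model ignores inter-iteration dependencies, which the abstract says are handled by additional loop instructions.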
Keywords: dataflow model, control-flow model, loop optimization, exascale computing, scientific application
Received: 2016-09-04
Fund:

This work was supported by the National Key Research and Development Program of China under Grant No. 2016YFB0200501, the National Natural Science Foundation of China under Grant Nos. 61332009 and 61521092, the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing under Grant No. 2016A04, the Beijing Municipal Science and Technology Commission under Grant No. Z15010101009, the Open Project Program of the State Key Laboratory of Computer Architecture under Grant No. CARCH201503, the China Scholarship Council, and the Beijing Advanced Innovation Center for Imaging Technology.

Corresponding Author: Yuan-Chao Xu, Email: xuyuanchao@cnu.edu.cn
About the author: Xu Tan received his Bachelor's degree in computer science and technology from Capital Normal University, Beijing, in 2012. He is currently a Ph.D. candidate at the Institute of Computing Technology, Chinese Academy of Sciences, Beijing. His main research interests include dataflow architecture and high-performance computer systems.
Cite this article:
Xu Tan, Xiao-Chun Ye, Xiao-Wei Shen, Yuan-Chao Xu, Da Wang, Lunkai Zhang, Wen-Ming Li, Dong-Rui Fan, Zhi-Min Tang. A Pipelining Loop Optimization Method for Dataflow Architecture[J]. Journal of Computer Science and Technology, 2018, 33(1): 116-130
URL:  
http://jcst.ict.ac.cn:8080/jcst/EN/10.1007/s11390-017-1748-5