|
计算机科学技术学报 ›› 2019,Vol. 34 ›› Issue (5): 1096-1108.doi: 10.1007/s11390-019-1962-4
所属专题: Computer Networks and Distributed Computing
Han Lin, Ming-Fan Li, Cheng-Fan Jia, Jun-Nan Liu, Hong An, Senior Member, CCF, ACM, IEEE
Han Lin, Ming-Fan Li, Cheng-Fan Jia, Jun-Nan Liu, Hong An, Senior Member, CCF, ACM, IEEE
处理器专有化已经成为现代处理器工业的重要发展趋势,在半导体时代的未来十余年,这将很可能仍是主流趋势。随着异构系统中多样性的增加,如何在具备多种异构处理器的系统上高效地组织计算成为一个颇具挑战的问题,也将成为未来异构计算的常态。在本文中,我们分析了一系列在异构系统上目前最先进的任务调度算法,并提出了在异构系统上调度细粒度并行程序的新观点。本文的主要创新在于:1.简化了基于有向无环图的细粒度并行程序调度问题中任务优先级的计算方法,这不仅减小了任务选择阶段的时间复杂度,也使得本文的调度算法可以处理动态有向无环图的调度问题;2.在处理器选择阶段,充分考虑了通信链路对性能的影响,构建了一种新颖的通信模型,使得任务调度更加高效。上述所列两点的实现有赖于对以数据流程序执行模型为代表的细粒度并行问题所进行的探索,并且通过一组选取的基准测试程序进行了实验验证。在人工合成以及真实应用的有向无环图上所进行的实验展现出了非常优秀的性能。在调度长度比和并行效率两个指标上,本文提出的基于节点加权出度的任务调度算法比所有文中评估的目前最优的启发式算法都明显表现更好。
[1] Suetterlein J. DARTS:A runtime based on the Codelet execution model[Master Thesis]. University of Delaware, 2014. [2] Qu P, Yan J, Zhang Y H, Gao G R. Parallel turing machine, a proposal. Journal of Computer Science and Technology, 2017, 32(2):269-285. [3] Valiant L G. A bridging model for parallel computation. Communications of the ACM, 1990, 33(8):103-111. [4] Zuckerman S, Suetterlein J, Knauerhase R, Gao G R. Using a "codelet" program execution model for exascale machines:Position paper. In Proc. the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era, June 2011, pp.64-69. [5] Suettlerlein J, Zuckerman S, Gao G R. An implementation of the Codelet model. In Proc. the 19th Int. Conf. Parallel Processing, August 2013, pp.633-644. [6] Ullman J D. NP-complete scheduling problems. Journal of Computer and System Sciences, 1975, 10(3):384-393. [7] Lenstra J K, Kan A H G R. Computational complexity of discrete optimization problems. Annals of Discrete Mathematics, 1979, 4:121-140. [8] Garey M R, Johnson D S. Computers and Intractability:A Guide to the Theory of NP-Completeness (1st edition). W.H. Freeman, 1979. [9] Arabnejad H, Barbosa J G. List scheduling algorithm for heterogeneous systems by an optimistic cost table. IEEE Trans. Parallel and Distributed Systems, 2014, 25(3):682-694. [10] Dennis J B, Fosseen J B, Linderman J P. Data flow schemas. In Proc. Int. Symp. Theoretical Programming, Aug. 1972, pp.187-216. [11] Dennis J B. First version of a data flow procedure language. In Proc. Programming Symposium, April 1974, pp.362-376. [12] Dennis J B. Data flow supercomputers. IEEE Computer, 1980, 13(11):48-56. [13] Arvind A, Gostelow K P, Plouffe W. Indeterminacy, monitors, and dataflow. ACM SIGOPS Operating Systems Review, 1977, 11(5):159-169. [14] Arvind A, Kathail V. A multiple processor data flow machine that supports generalized procedures. In Proc. the 8th Symp. Computer Architecture, May 1981, pp.291-302. [15] Watson I, Gurd J. A practical data flow computer. IEEE Computer, 1982, 15(2):51-57. [16] Hum H H J, Maquelin O, Theobald K B et al. A design study of the EARTH multiprocessor. In Proc. the IFIP WG10.3 Working Conf. Parallel Architectures and Compilation Techniques, June 1995, pp.59-68. [17] Farquhar W G, Evripidou P. DART:A data-driven processor architecture for real-time computing. In Proc. the IFIP WG10. 3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, January 1993, pp.141-152. [18] Evripidou P. Thread synchronization unit (TSU):A building block for high performance computers. In Proc. the 1997 Int. Symp. High Performance Computing, Nov. 1997, pp.107-118. [19] Topcuouglu H, Hariri S, Wu M Y. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans. Parallel Distrib. Syst., 2002, 13(3):260-274. [20] Bittencourt L F, Sakellariou R, Madeira E R M. DAG scheduling using a lookahead variant of the heterogeneous earliest finish time algorithm. In Proc. the 18th Euromicro Int. Conf. Parallel, Distributed and Network-Based Processing, February 2010, pp.27-34. [21] Munir E U, Mohsin S, Hussain A, Nisar M W, Ali S. SDBATS:A novel algorithm for task scheduling in heterogeneous computing systems. In Proc. the 27th Int. Symp. Parallel and Distributed Processing Symposium Workshops & PhD Forum, May 2013, pp.43-53. [22] Wang G, Wang Y X, Liu H, Guo H. HSIP:A novel task scheduling algorithm for heterogeneous computing. Sci. Program., 2016, 2016:Article No. 3676149. [23] Yelick K, Bonachea D, Chen W Y et al. Productivity and performance using partitioned global address space languages. In Proc. the 2007 International Workshop on Parallel Symbolic Computation, July 2007, pp.24-32. [24] Murray D G, McSherry F, Isaacs R et al. Naiad:A timely dataflow system. In Proc. the 24th ACM SIGOPS Symp. Operating Systems Principles, Nov. 2013, pp.439-455. [25] Daoud M I, Kharma N. A high performance algorithm for static task scheduling in heterogeneous distributed computing systems. Journal of Parallel and Distributed Computing, 2008, 68(4):399-409. [26] Blumofe R D, Joerg C F, Kuszmaul B C et al. Cilk:An efficient multithreaded runtime system. Journal of Parallel and Distributed Computing, 1996, 37(1):55-69. [27] Blumofe R D, Frigo M, Joerg C F et al. DAG-consistent distributed shared memory. In Proc. the 10th International Parallel Processing Symposium, April 1996, pp.132-141. [28] Augonnet C, Thibault S, Namyst R, Wacrenier P A. StarPU:A unified platform for task scheduling on heterogeneous multicore architectures. Concurrency and Computation:Practice and Experience, 2011, 23(2):187-198. [29] Dongarra J, Heroux M A, Luszczek P. HPCG benchmark:A new metric for ranking high performance computing systems. Technical Report, Electrical Engineering and Computer Science Department, Knoxville, Tennessee, 2015. https://www.hpcg-benchmark.org/pubs/index.html, May 2019. [30] Su Z C, Chen J S, Lin H et al. A dataflow-based runtime support on a 100P actual system. In Proc. the 2017 IEEE Int. Symp. Parallel and Distributed Processing with Applications and the 2017 IEEE Int. Conf. Ubiquitous Computing and Communications, December 2017, pp.599-606. [31] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In Proc. the 3rd Int. Conf. Learning Representations, May 2015, Article No. 4. [32] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In Proc. the 26th Annual Conference on Neural Information Processing Systems, December 2012, pp.1106-1114. [33] Sun Y, Liang D, Wang X G et al. DeepID3:Face recognition with very deep neural networks. arXiv:1502.00873, 2015. https://arxiv.org/abs/1502.00873, May 2019. [34] Dai J F, He K M, Sun J. Convolutional feature masking for joint object and stuff segmentation. In Proc. the 2015 IEEE Conference on Computer Vision and Pattern Recognition, June 2015, pp.3992-4000. [35] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In Proc. the 2015 IEEE Conference on Computer Vision and Pattern Recognition, June 2015, pp.3431-3440. [36] Ma C, Huang J B, Yang X K et al. Hierarchical convolutional features for visual tracking. In Proc. the 2015 IEEE Int. Conf. Computer Vision, December 2015, pp.3074-3082. |
[1] | Ying-Lin Zhao, Jian-Lei Yang, Wei-Sheng Zhao, Aida Todri-Sanial, Yuan-Qing Cheng. 考虑供电噪声和散热约束的三维同构片上多核系统任务调度方案研究[J]. 计算机科学技术学报, 2018, 33(5): 966-983. |
[2] | Dong-Fang Zhou, Jian-Xi Fan, Cheng-Kuan Lin, Bao-Lei Cheng, Jing-Ya Zhou, Zhao Liu. 交换交叉立方体中最优路径的嵌入[J]. , 2017, 32(3): 618-629. |
[3] | Peng Qu, Jin Yan, You-Hui Zhang, Guang R. Gao. 一个并行图灵机的提议[J]. , 2017, 32(2): 269-285. |
[4] | Xiao-Hu Yan, Fa-Zhi He, Yi-Lin Chen. 基于野草扰动粒子群算法的新型软硬件划分方法[J]. , 2017, 32(2): 340-355. |
[5] | Kai Zhang, Feng Chen, Xiaoning Ding, Yin Huai, Rubao Lee, Tian Luo, Kaibo Wang. Hetero-DB: 基于异构计算和存储资源的下一代数据库系统[J]. , 2015, 30(4): 657-678. |
[6] | Rong Chen, Jia-Xin Shi, Hai-Bo Chen, Bin-Yu Zang. 大数据学习场景中面向二分图的分布式图划分算法[J]. , 2015, 30(1): 20-29. |
[7] | Juan Fang, Zhen-Yu Leng, Si-Tong Liu, Zhi-Cheng Yao, Xiu-Feng Sui. 面向GPU-CPU异构体系结构的异构片上网络设计空间探索[J]. , 2015, 30(1): 74-83. |
[8] | Zheng Cao, Xiao-Li Liu, ACM Qiang Li, Xiao-Bing Liu, ACM Zhan Wang, Xue-Jun An. 面向异构计算的服务器内互连架构[J]. , 2014, 29(6): 976-988. |
版权所有 © 《计算机科学技术学报》编辑部 本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn 总访问量: |