|
Journal of Computer Science and Technology ›› 2019, Vol. 34 ›› Issue (5): 1096-1108.doi: 10.1007/s11390-019-1962-4
Special Issue: Computer Networks and Distributed Computing
• Computer Networks and Distributed Computing • Previous Articles Next Articles
Han Lin, Ming-Fan Li, Cheng-Fan Jia, Jun-Nan Liu, Hong An, Senior Member, CCF, ACM, IEEE
[1] Suetterlein J. DARTS:A runtime based on the Codelet execution model[Master Thesis]. University of Delaware, 2014. [2] Qu P, Yan J, Zhang Y H, Gao G R. Parallel turing machine, a proposal. Journal of Computer Science and Technology, 2017, 32(2):269-285. [3] Valiant L G. A bridging model for parallel computation. Communications of the ACM, 1990, 33(8):103-111. [4] Zuckerman S, Suetterlein J, Knauerhase R, Gao G R. Using a "codelet" program execution model for exascale machines:Position paper. In Proc. the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era, June 2011, pp.64-69. [5] Suettlerlein J, Zuckerman S, Gao G R. An implementation of the Codelet model. In Proc. the 19th Int. Conf. Parallel Processing, August 2013, pp.633-644. [6] Ullman J D. NP-complete scheduling problems. Journal of Computer and System Sciences, 1975, 10(3):384-393. [7] Lenstra J K, Kan A H G R. Computational complexity of discrete optimization problems. Annals of Discrete Mathematics, 1979, 4:121-140. [8] Garey M R, Johnson D S. Computers and Intractability:A Guide to the Theory of NP-Completeness (1st edition). W.H. Freeman, 1979. [9] Arabnejad H, Barbosa J G. List scheduling algorithm for heterogeneous systems by an optimistic cost table. IEEE Trans. Parallel and Distributed Systems, 2014, 25(3):682-694. [10] Dennis J B, Fosseen J B, Linderman J P. Data flow schemas. In Proc. Int. Symp. Theoretical Programming, Aug. 1972, pp.187-216. [11] Dennis J B. First version of a data flow procedure language. In Proc. Programming Symposium, April 1974, pp.362-376. [12] Dennis J B. Data flow supercomputers. IEEE Computer, 1980, 13(11):48-56. [13] Arvind A, Gostelow K P, Plouffe W. Indeterminacy, monitors, and dataflow. ACM SIGOPS Operating Systems Review, 1977, 11(5):159-169. [14] Arvind A, Kathail V. A multiple processor data flow machine that supports generalized procedures. In Proc. the 8th Symp. Computer Architecture, May 1981, pp.291-302. [15] Watson I, Gurd J. A practical data flow computer. IEEE Computer, 1982, 15(2):51-57. [16] Hum H H J, Maquelin O, Theobald K B et al. A design study of the EARTH multiprocessor. In Proc. the IFIP WG10.3 Working Conf. Parallel Architectures and Compilation Techniques, June 1995, pp.59-68. [17] Farquhar W G, Evripidou P. DART:A data-driven processor architecture for real-time computing. In Proc. the IFIP WG10. 3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, January 1993, pp.141-152. [18] Evripidou P. Thread synchronization unit (TSU):A building block for high performance computers. In Proc. the 1997 Int. Symp. High Performance Computing, Nov. 1997, pp.107-118. [19] Topcuouglu H, Hariri S, Wu M Y. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans. Parallel Distrib. Syst., 2002, 13(3):260-274. [20] Bittencourt L F, Sakellariou R, Madeira E R M. DAG scheduling using a lookahead variant of the heterogeneous earliest finish time algorithm. In Proc. the 18th Euromicro Int. Conf. Parallel, Distributed and Network-Based Processing, February 2010, pp.27-34. [21] Munir E U, Mohsin S, Hussain A, Nisar M W, Ali S. SDBATS:A novel algorithm for task scheduling in heterogeneous computing systems. In Proc. the 27th Int. Symp. Parallel and Distributed Processing Symposium Workshops & PhD Forum, May 2013, pp.43-53. [22] Wang G, Wang Y X, Liu H, Guo H. HSIP:A novel task scheduling algorithm for heterogeneous computing. Sci. Program., 2016, 2016:Article No. 3676149. [23] Yelick K, Bonachea D, Chen W Y et al. Productivity and performance using partitioned global address space languages. In Proc. the 2007 International Workshop on Parallel Symbolic Computation, July 2007, pp.24-32. [24] Murray D G, McSherry F, Isaacs R et al. Naiad:A timely dataflow system. In Proc. the 24th ACM SIGOPS Symp. Operating Systems Principles, Nov. 2013, pp.439-455. [25] Daoud M I, Kharma N. A high performance algorithm for static task scheduling in heterogeneous distributed computing systems. Journal of Parallel and Distributed Computing, 2008, 68(4):399-409. [26] Blumofe R D, Joerg C F, Kuszmaul B C et al. Cilk:An efficient multithreaded runtime system. Journal of Parallel and Distributed Computing, 1996, 37(1):55-69. [27] Blumofe R D, Frigo M, Joerg C F et al. DAG-consistent distributed shared memory. In Proc. the 10th International Parallel Processing Symposium, April 1996, pp.132-141. [28] Augonnet C, Thibault S, Namyst R, Wacrenier P A. StarPU:A unified platform for task scheduling on heterogeneous multicore architectures. Concurrency and Computation:Practice and Experience, 2011, 23(2):187-198. [29] Dongarra J, Heroux M A, Luszczek P. HPCG benchmark:A new metric for ranking high performance computing systems. Technical Report, Electrical Engineering and Computer Science Department, Knoxville, Tennessee, 2015. https://www.hpcg-benchmark.org/pubs/index.html, May 2019. [30] Su Z C, Chen J S, Lin H et al. A dataflow-based runtime support on a 100P actual system. In Proc. the 2017 IEEE Int. Symp. Parallel and Distributed Processing with Applications and the 2017 IEEE Int. Conf. Ubiquitous Computing and Communications, December 2017, pp.599-606. [31] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In Proc. the 3rd Int. Conf. Learning Representations, May 2015, Article No. 4. [32] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In Proc. the 26th Annual Conference on Neural Information Processing Systems, December 2012, pp.1106-1114. [33] Sun Y, Liang D, Wang X G et al. DeepID3:Face recognition with very deep neural networks. arXiv:1502.00873, 2015. https://arxiv.org/abs/1502.00873, May 2019. [34] Dai J F, He K M, Sun J. Convolutional feature masking for joint object and stuff segmentation. In Proc. the 2015 IEEE Conference on Computer Vision and Pattern Recognition, June 2015, pp.3992-4000. [35] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In Proc. the 2015 IEEE Conference on Computer Vision and Pattern Recognition, June 2015, pp.3431-3440. [36] Ma C, Huang J B, Yang X K et al. Hierarchical convolutional features for visual tracking. In Proc. the 2015 IEEE Int. Conf. Computer Vision, December 2015, pp.3074-3082. |
[1] | Zhou Zhang, Pei-Quan Jin, Xiao-Liang Wang, Yan-Qi Lv, Shou-Hong Wan, Xi-Ke Xie. COLIN: A Cache-Conscious Dynamic Learned Index with High Read/Write Performance [J]. Journal of Computer Science and Technology, 2021, 36(4): 721-740. |
[2] | Songjie Niu, Shimin Chen. TransGPerf: Exploiting Transfer Learning for Modeling Distributed Graph Computation Performance [J]. Journal of Computer Science and Technology, 2021, 36(4): 778-791. |
[3] | Heng Bu, Ming-Kai Dong, Ji-Fei Yi, Bin-Yu Zang, Hai-Bo Chen. Revisiting Persistent Indexing Structures on Intel Optane DC Persistent Memory [J]. Journal of Computer Science and Technology, 2021, 36(1): 140-157. |
[4] | Jason Liu, Pedro Espina, Xian-He Sun. A Study on Modeling and Optimization of Memory Systems [J]. Journal of Computer Science and Technology, 2021, 36(1): 71-89. |
[5] | Jian-Bin Fang, Xiang-Ke Liao, Chun Huang, De-Zun Dong. Performance Evaluation of Memory-Centric ARMv8 Many-Core Architectures:A Case Study with Phytium 2000+ [J]. Journal of Computer Science and Technology, 2021, 36(1): 33-43. |
[6] | Lan Huang, Da-Lin Li, Kang-Ping Wang, Teng Gao, Adriano Tavares. A Survey on Performance Optimization of High-Level Synthesis Tools [J]. Journal of Computer Science and Technology, 2020, 35(3): 697-720. |
[7] | Hong-Mei Wei, Jian Gao, Peng Qing, Kang Yu, Yan-Fei Fang, Ming-Lu Li. MPI-RCDD: A Framework for MPI Runtime Communication Deadlock Detection [J]. Journal of Computer Science and Technology, 2020, 35(2): 395-411. |
[8] | Wen-Yan Chen, Ke-Jiang Ye, Cheng-Zhi Lu, Dong-Dai Zhou, Cheng-Zhong Xu. Interference Analysis of Co-Located Container Workloads: A Perspective from Hardware Performance Counters [J]. Journal of Computer Science and Technology, 2020, 35(2): 412-417. |
[9] | Zheng-Hao Jin, Haiyang Shi, Ying-Xin Hu, Li Zha, Xiaoyi Lu. CirroData: Yet Another SQL-on-Hadoop Data Analytics Engine with High Performance [J]. Journal of Computer Science and Technology, 2020, 35(1): 194-208. |
[10] | André Brinkmann, Kathryn Mohror, Weikuan Yu, Philip Carns, Toni Cortes, Scott A. Klasky, Alberto Miranda, Franz-Josef Pfreundt, Robert B. Ross, Marc-André Vef. Ad Hoc File Systems for High-Performance Computing [J]. Journal of Computer Science and Technology, 2020, 35(1): 4-26. |
[11] | Yu-Tong Lu, Peng Cheng, Zhi-Guang Chen. Design and Implementation of the Tianhe-2 Data Storage and Management System [J]. Journal of Computer Science and Technology, 2020, 35(1): 27-46. |
[12] | Qi Chen, Kang Chen, Zuo-Ning Chen, Wei Xue, Xu Ji, Bin Yang. Lessons Learned from Optimizing the Sunway Storage System for Higher Application I/O Performance [J]. Journal of Computer Science and Technology, 2020, 35(1): 47-60. |
[13] | Marc-André Vef, Nafiseh Moti, Tim Süß, Markus Tacke, Tommaso Tocci, Ramon Nou, Alberto Miranda, Toni Cortes, André Brinkmann. GekkoFS—A Temporary Burst Buffer File System for HPC Applications [J]. Journal of Computer Science and Technology, 2020, 35(1): 72-91. |
[14] | Robert B. Ross, George Amvrosiadis, Philip Carns, Charles D. Cranor, Matthieu Dorier, Kevin Harms, Greg Ganger, Garth Gibson, Samuel K. Gutierrez, Robert Latham, Bob Robey, Dana Robinson, Bradley Settlemyer, Galen Shipman, Shane Snyder, Jerome Soumagne, Qing Zheng. Mochi: Composing Data Services for High-Performance Computing Environments [J]. Journal of Computer Science and Technology, 2020, 35(1): 121-144. |
[15] | Suren Byna, M. Scot Breitenfeld, Bin Dong, Quincey Koziol, Elena Pourmal, Dana Robinson, Jerome Soumagne, Houjun Tang, Venkatram Vishwanath, Richard Warren. ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems [J]. Journal of Computer Science and Technology, 2020, 35(1): 145-160. |
|
|