Journal of Computer Science and Technology, 2021, Vol. 36, Issue (1): 123-139. doi: 10.1007/s11390-020-9826-z
Special Issue: Computer Architecture and Systems
Jun-Shi Chen1, Member, CCF, Hong An1, Member, CCF, ACM, IEEE, Wen-Ting Han1,*, Member, CCF, ACM, IEEE, Zeng Lin1, and Xin Liu2, Member, CCF