Journal of Computer Science and Technology, 2020, Vol. 35, Issue (1): 72-91. doi: 10.1007/s11390-020-9797-6
Special Issue: Computer Architecture and Systems
• Special Section on Selected I/O Technologies for High-Performance Computing and Data Analytics •
Marc-André Vef1, Nafiseh Moti1, Tim Süß1, Markus Tacke1, Tommaso Tocci2, Ramon Nou2, Alberto Miranda2, Toni Cortes2,3, André Brinkmann1, Member, ACM
[1] André Brinkmann, Kathryn Mohror, Weikuan Yu, Philip Carns, Toni Cortes, Scott A. Klasky, Alberto Miranda, Franz-Josef Pfreundt, Robert B. Ross, Marc-André Vef. Ad Hoc File Systems for High-Performance Computing [J]. Journal of Computer Science and Technology, 2020, 35(1): 4-26.
[2] Yu-Tong Lu, Peng Cheng, Zhi-Guang Chen. Design and Implementation of the Tianhe-2 Data Storage and Management System [J]. Journal of Computer Science and Technology, 2020, 35(1): 27-46.
[3] Osamu Tatebe, Shukuko Moriwake, Yoshihiro Oyama. Gfarm/BB: Gfarm File System for Node-Local Burst Buffer [J]. Journal of Computer Science and Technology, 2020, 35(1): 61-71.
[4] Anthony Kougkas, Hariharan Devarajan, Xian-He Sun. I/O Acceleration via Multi-Tiered Data Buffering and Prefetching [J]. Journal of Computer Science and Technology, 2020, 35(1): 92-120.