
### ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems

Suren Byna1,*, M. Scot Breitenfeld2, Bin Dong1, Quincey Koziol1, Elena Pourmal2, Dana Robinson2, Jerome Soumagne2, Houjun Tang1, Venkatram Vishwanath3, Richard Warren2

1. Lawrence Berkeley National Laboratory, Berkeley, CA 94597, U.S.A.
2. The HDF Group, Champaign, IL 61820, U.S.A.
3. Argonne National Laboratory, Lemont, IL 60439, U.S.A.
• Received: 2019-07-06 Revised: 2019-08-28 Online: 2020-01-05 Published: 2020-01-14
• Corresponding author: Suren Byna E-mail: sbyna@lbl.gov
• About the author: Suren Byna received his Master's degree in 2001 and his Ph.D. degree in 2006, both in computer science from Illinois Institute of Technology, Chicago. He is a Staff Scientist in the Scientific Data Management (SDM) Group in CRD at Lawrence Berkeley National Laboratory (LBNL). His research interests are in scalable scientific data management. More specifically, he works on optimizing parallel I/O and on developing systems for managing scientific data. He is the PI of the ECP-funded ExaHDF5 project and of the ASCR-funded object-centric data management systems (Proactive Data Containers, PDC) and experimental and observational data management (EOD-HDF5) projects.
• Supported by:
This research was supported by the Exascale Computing Project under Grant No. 17-SC-20-SC, a joint project of the U.S. Department of Energy's Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation's exascale computing imperative. This work is also supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract Nos. DE-AC02-05CH11231 and DE-AC02-06CH11357. This research was funded in part by the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract No. DE-AC02-06CH11357. This research used resources of the National Energy Research Scientific Computing Center, which is a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

Abstract: Scientific applications at exascale generate and analyze massive amounts of data. A critical requirement of these applications is the capability to access and manage this data efficiently on exascale systems. Parallel I/O, the key technology that enables moving data between compute nodes and storage, faces monumental challenges from the new applications, memory, and storage architectures considered in the designs of exascale systems. As the storage hierarchy expands to include node-local persistent memory, burst buffers, etc., as well as disk-based storage, data movement among these layers must be efficient. Parallel I/O libraries of the future should be capable of handling file sizes of many terabytes and beyond. In this paper, we describe new capabilities we have developed in Hierarchical Data Format version 5 (HDF5), the most popular parallel I/O library for scientific applications. HDF5 is one of the most widely used libraries at the leadership computing facilities for performing parallel I/O on existing HPC systems. The state-of-the-art features we describe include: Virtual Object Layer (VOL), Data Elevator, asynchronous I/O, full-featured single-writer and multiple-reader (Full SWMR), and parallel querying. We introduce these features, their implementations, and the performance and feature benefits to applications and other libraries.
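To make the idea behind asynchronous I/O concrete, the following is a minimal conceptual sketch in plain Python. It does not use HDF5 or reproduce the paper's implementation; it only illustrates the general pattern of overlapping the computation of one time step with the write of the previous one. The function names (`compute_step`, `write_step`, `run`) and the fixed-size byte buffers are assumptions made for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def compute_step(step: int, size: int) -> bytes:
    """Hypothetical stand-in for one simulation step: returns `size` bytes."""
    return bytes([step % 256]) * size


def write_step(path: str, step: int, data: bytes) -> int:
    """Blocking write of one step's data at that step's own file offset."""
    with open(path, "r+b") as f:
        f.seek(step * len(data))
        f.write(data)
    return len(data)


def run(path: str, steps: int, size: int) -> int:
    """Compute step i while the write of step i-1 is still in flight."""
    # Pre-size the file so each step can write to its own region.
    with open(path, "wb") as f:
        f.truncate(steps * size)

    total_written = 0
    pending = None  # future for the write currently in flight, if any
    with ThreadPoolExecutor(max_workers=1) as io_thread:
        for step in range(steps):
            data = compute_step(step, size)  # overlaps with the pending write
            if pending is not None:
                total_written += pending.result()  # wait for the previous write
            pending = io_thread.submit(write_step, path, step, data)
        if pending is not None:
            total_written += pending.result()  # drain the last write
    return total_written


def demo() -> int:
    """Run four 8-byte steps against a temporary file and clean up."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return run(path, steps=4, size=8)
    finally:
        os.remove(path)
```

With a single background I/O thread, at most one write is outstanding at a time, so the application only stalls when a write takes longer than the next compute step. A production library would also have to handle errors, ordering across many datasets, and buffer ownership, none of which this sketch attempts.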

Copyright © Editorial Office of the Journal of Computer Science and Technology