ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems

doi:10.1007/s11390-020-9822-9

Abstract: Scientific applications at exascale generate and analyze massive amounts of data. A critical requirement of these applications is the ability to access and manage this data efficiently on exascale systems. Parallel I/O, the key technology that enables moving data between compute nodes and storage, faces monumental challenges from the new applications, memory, and storage architectures considered in the designs of exascale systems. As the storage hierarchy expands to include node-local persistent memory and burst buffers in addition to disk-based storage, data movement among these layers must be efficient. Parallel I/O libraries of the future should be capable of handling file sizes of many terabytes and beyond. In this paper, we describe new capabilities we have developed in Hierarchical Data Format version 5 (HDF5), the most popular parallel I/O library for scientific applications. HDF5 is one of the most used libraries at the leadership computing facilities for performing parallel I/O on existing HPC systems. The state-of-the-art features we describe include: Virtual Object Layer (VOL), Data Elevator, asynchronous I/O, full-featured single-writer and multiple-reader (Full SWMR), and parallel querying. We introduce these features and their implementations, and discuss the performance and feature benefits they bring to applications and other libraries.
