
Ad Hoc File Systems for High-Performance Computing


Abstract: Storage backends of parallel compute clusters are still based mostly on magnetic disks, while newer and faster storage technologies such as flash-based SSDs or non-volatile random access memory (NVRAM) are deployed within the compute nodes. Unfortunately, integrating these new storage technologies into scientific workflows is still a mostly manual task, so most scientists do not take advantage of the faster storage media. One approach to systematically integrating node-local SSDs or NVRAM into scientific workflows is to deploy ad hoc file systems over a set of compute nodes, where they serve as temporary storage systems for single applications or longer-running campaigns. This paper presents results from the Dagstuhl Seminar 17202 "Challenges and Opportunities of User-Level File Systems for HPC" and discusses application scenarios as well as design strategies for ad hoc file systems using node-local storage media. The discussion includes open research questions, such as how to couple ad hoc file systems with the batch scheduling environment and how to schedule the stage-in and stage-out of data between the storage backend and the ad hoc file systems. Also presented are strategies for building ad hoc file systems from reusable networking components and for improving storage device compatibility. Various interfaces and semantics are presented, for example those used by the three ad hoc file systems BeeOND, GekkoFS, and BurstFS. These systems range from file systems running in production to cutting-edge research focused on reaching the performance limits of the underlying devices.

     
