GPU-Based Metadata Acceleration for Large-Scale Parallel File Systems

doi:10.1007/s11390-020-0783-9

A GPU-Accelerated In-Memory Metadata Management Scheme for Large-Scale Parallel File Systems

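Abstract

Summary: Metadata has long been a major bottleneck in file systems. To improve metadata performance, parallel file systems have gradually moved toward distributed metadata management. We argue that distributed metadata management still has shortcomings in consistency and reliability, whereas the metadata performance of a single node still has considerable room for improvement. As the I/O performance of storage devices keeps improving, the main cause of the metadata bottleneck is shifting from I/O to computation. Against this background, we propose a scheme that accelerates metadata processing with a GPU. Specifically, we design a new metadata server architecture consisting of three parts: CPU, GPU, and SSD. The CPU is mainly responsible for interacting with clients: it receives metadata requests from clients, packs them, and passes them to the GPU. The GPU holds all metadata; on receiving a batch of metadata requests from the CPU, it launches a large number of concurrent threads to perform the metadata computation and then returns the results to the CPU, which forwards them to the clients. To make metadata persistent, we store it on the SSD using a combination of logging and checkpointing. To improve the computational efficiency of the concurrent GPU threads, we improve the in-memory data structure for metadata so that it efficiently supports the GPU's SIMT execution. We implement a GPU-based metadata acceleration system using BeeGFS as the prototype. Experiments show that the GPU-based scheme significantly outperforms CPU-based metadata management, and the advantage is especially pronounced when a large number of clients access the system concurrently. In summary, targeting high-performance computing scenarios, this paper proposes a new metadata management scheme that uses the high concurrency of the GPU to significantly alleviate the compute bottleneck in metadata management, and thereby significantly improves single-node metadata performance. Notably, this work does not conflict with distributed metadata management: the resulting system can be integrated directly into a metadata cluster.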
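
The combination of logging and checkpointing mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the class name `MetadataStore`, the file names, the JSON record format, and the two operations `create` and `unlink` are all hypothetical, and a plain Python dictionary stands in for the GPU-resident metadata. The idea shown is only the general technique: append each batch to a log before applying it, periodically write a full snapshot, and recover by loading the snapshot and replaying the log.

```python
import json
import os
import tempfile


class MetadataStore:
    """Hypothetical in-memory metadata table made durable with a log plus checkpoints."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "metadata.log")
        self.ckpt_path = os.path.join(directory, "metadata.ckpt")
        self.table = {}  # path -> attribute dict (stands in for GPU-resident state)
        self._recover()

    def _recover(self):
        # Load the last checkpoint, then replay log records written after it.
        if os.path.exists(self.ckpt_path):
            with open(self.ckpt_path) as f:
                self.table = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, record):
        # Both operations are idempotent, so replaying a record twice is harmless.
        if record["op"] == "create":
            self.table[record["path"]] = record["attrs"]
        elif record["op"] == "unlink":
            self.table.pop(record["path"], None)

    def apply_batch(self, records):
        # Persist the whole batch to the log before mutating memory.
        with open(self.log_path, "a") as f:
            for record in records:
                f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())
        for record in records:
            self._apply(record)

    def checkpoint(self):
        # Write a full snapshot, swap it in atomically, then truncate the log.
        tmp_path = self.ckpt_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.table, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.ckpt_path)
        open(self.log_path, "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        store = MetadataStore(directory)
        store.apply_batch([{"op": "create", "path": "/a", "attrs": {"mode": 420}}])
        store.checkpoint()
        store.apply_batch([{"op": "unlink", "path": "/a"}])
        # A fresh instance rebuilds its state from the checkpoint plus the log.
        print(MetadataStore(directory).table)
```

Writing the log once per batch rather than once per request matches the batched request flow described in the summary, but whether the actual system does so is not stated there.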

Abstract: Driven by the growing requirements of high-performance computing applications, supercomputers tend to contain more and more compute nodes. Applications running on such large-scale systems are likely to spawn millions of parallel processes, which usually generate bursts of I/O requests and pose a great challenge to the metadata management of the underlying parallel file system. The traditional way to overcome this challenge is to adopt multiple metadata servers in a scale-out manner, which inevitably faces serious network and consistency problems. This work instead seeks to improve metadata performance in a scale-up manner. Specifically, we propose to improve the performance of each individual metadata server by employing a GPU to handle metadata requests in parallel. Our proposal designs a novel metadata server architecture, which uses the CPU to interact with file system clients while offloading metadata computation to the GPU. To take full advantage of the parallelism of the GPU, we redesign the in-memory data structure for the file system namespace. The new data structure fits the memory architecture of the GPU well, and thus helps to exploit the large number of parallel GPU threads to serve bursty metadata requests concurrently. We implement a prototype based on BeeGFS and conduct extensive experiments to evaluate our proposal. The experimental results demonstrate that our GPU-based solution outperforms the CPU-based scheme by more than 50% under typical metadata operations. The advantage grows further in highly concurrent scenarios, e.g., high-performance computing systems supporting millions of parallel threads.
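
To make the batched, thread-per-request idea concrete, here is a minimal sketch in Python. The abstract does not describe the actual data structure, so the layout below is an assumption: a hypothetical `FlatNamespace` stores directory entries in parallel fixed-size arrays with open addressing, one plausible way to avoid pointer-chasing tree walks on a GPU. The names `fnv1a`, `FlatNamespace`, and `serve_batch` are invented for illustration, and the sequential loop in `serve_batch` only stands in for a kernel launch in which each index would be handled by its own GPU thread.

```python
def fnv1a(parent_id, name):
    """64-bit FNV-1a hash of (parent inode id, entry name)."""
    h = 0xCBF29CE484222325
    for byte in parent_id.to_bytes(8, "little") + name.encode("utf-8"):
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
    return h


class FlatNamespace:
    """Hypothetical open-addressing hash table stored as parallel arrays."""

    EMPTY = -1

    def __init__(self, capacity):
        self.capacity = capacity
        self.parent = [self.EMPTY] * capacity  # parent inode id per slot
        self.name = [""] * capacity            # entry name per slot
        self.inode = [self.EMPTY] * capacity   # inode id per slot

    def insert(self, parent_id, name, inode_id):
        slot = fnv1a(parent_id, name) % self.capacity
        for _ in range(self.capacity):
            free = self.inode[slot] == self.EMPTY
            same = self.parent[slot] == parent_id and self.name[slot] == name
            if free or same:
                self.parent[slot] = parent_id
                self.name[slot] = name
                self.inode[slot] = inode_id
                return
            slot = (slot + 1) % self.capacity  # linear probing
        raise RuntimeError("namespace table is full")

    def lookup(self, parent_id, name):
        # Work of one simulated GPU thread: a read-only probe over flat arrays.
        slot = fnv1a(parent_id, name) % self.capacity
        for _ in range(self.capacity):
            if self.inode[slot] == self.EMPTY:
                return self.EMPTY
            if self.parent[slot] == parent_id and self.name[slot] == name:
                return self.inode[slot]
            slot = (slot + 1) % self.capacity
        return self.EMPTY


def serve_batch(namespace, requests):
    """Answer a batch of lookups; index i stands in for GPU thread i."""
    results = [FlatNamespace.EMPTY] * len(requests)
    for i, (parent_id, name) in enumerate(requests):  # a kernel would run these in parallel
        results[i] = namespace.lookup(parent_id, name)
    return results
```

In this sketch the CPU side would collect client requests into the `requests` list and send results back by position, so each request's answer lands in its own slot and the lookups never write to shared state. The sketch omits deletion, which would need tombstones or a similar mechanism under linear probing.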
