FuHsi: An In-Cache Computing Acceleration Architecture for Nanopore Base-Calling

FuHsi: Shifting Base-Calling Closer to Sequencer via In-Cache Acceleration

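  • Summary:
    Background: Nanopore (third-generation) genome sequencing is an emerging technology that is transforming genomic research, industry, and the market. Base-calling is the first step of third-generation genome analysis: it converts electrical signals back into raw bases, namely A, C, G, T, and N. Although existing base-callers achieve high accuracy, an order-of-magnitude gap remains between base-calling throughput and sequencer throughput. For example, on the NVIDIA Jetson AGX Orin, a mainstream edge platform, the average throughput of CatCaller is only 0.25x that of a MinION sequencer.
    Objective: We argue that parallelism and data movement are the bottlenecks in base-calling, and that recent designs fail to address them. A CPU is a basic component of many edge devices (for example, ONT sequencers), and the cache is a core component of the CPU. In-cache computing has therefore become, in recent years, a major approach to increasing data-processing parallelism and improving the efficiency of edge devices. This paper explores whether base-calling can be accelerated by enhancing the sequencer's CPU.
    Methods: We introduce FuHsi, a base-calling accelerator built on in-cache computing. The FuHsi system augments a conventional CPU with only three key components: FuHsi Cache, FuHsi Controller, and the FuHsi software interface (FuHsi APIs). FuHsi Cache enhances a conventional cache to accelerate all bottleneck operations in base-calling. On the software side, the FuHsi APIs define all key operations at coarse granularity in a Python style, which makes them easy to integrate with existing machine learning frameworks. FuHsi Controller fetches and decodes FuHsi API calls and provides coarse-grained control for the software interface.
    Results: Experimental results show that, compared with the Jetson AGX Orin and the NVIDIA A100 GPU, FuHsi improves average throughput by 8.34x and 2.13x and reduces power consumption by 5.49x and 24.48x, respectively. Compared with the Helix accelerator, FuHsi also improves throughput per watt by two orders of magnitude. A FuHsi accelerator can serve up to 15 sequencers simultaneously.
    Conclusions: The results show that a base-calling architecture built on in-cache computing can significantly improve base-calling throughput. We observe that in-cache computing not only provides a large number of parallel arithmetic units but also strikes a good trade-off between data-movement overhead and computation latency. This paper also proposes a new sequencing system based on in-cache computing, which may influence future industrial sequencer designs.
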
    Abstract: Base-calling is an essential step in the analysis of third-generation genome data. Many previous hardware efforts have aimed to speed up processing in this workflow, yet an order-of-magnitude throughput gap still exists. In this paper, we propose FuHsi to improve the end-to-end throughput of the base-calling process. FuHsi is an in-cache accelerator that adds only three components to the traditional CPU in the sequencer. We propose FuHsi Cache, which offloads the bottleneck operations to in-cache arithmetic. Specifically, we accelerate beam search, string conversion, and MAC (multiply-accumulate) using algorithm/hardware co-design. We also introduce FuHsi APIs and FuHsi Controller to provide coarse-grained control for FuHsi Cache. Experimental results show that FuHsi achieves 45.7x, 113.1x, and 100x higher throughput per watt than an NVIDIA Jetson baseline, an NVIDIA A100 GPU baseline, and the Helix accelerator, respectively. FuHsi can serve base-calling requests for up to 15 ONT sequencers simultaneously.
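
    For readers unfamiliar with the three bottleneck operations named in the abstract, the following minimal Python sketch shows what beam search, string conversion, and MAC compute in plain software. It is an illustration under stated assumptions, not FuHsi's implementation. The abstract does not specify the decoder, so the sketch assumes a CTC-style output with a blank symbol and uses a simplified beam search that does not merge equivalent prefixes. All names (`ALPHABET`, `beam_search`, `to_bases`, `mac`) are hypothetical.

```python
import math

# Assumption: label 0 is a CTC-style blank; labels 1..4 map to bases.
ALPHABET = "-ACGT"


def beam_search(log_probs, beam_width=3):
    """Keep the beam_width highest-scoring label paths at every time step.

    log_probs is a list of per-step lists of log-probabilities, one per label.
    Simplified: paths that decode to the same string are not merged.
    """
    beams = [((), 0.0)]  # (label path, accumulated log-probability)
    for step in log_probs:
        candidates = [
            (path + (label,), score + lp)
            for path, score in beams
            for label, lp in enumerate(step)
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


def to_bases(path):
    """String conversion: collapse repeated labels, then drop blanks."""
    out = []
    prev = None
    for label in path:
        if label != prev and label != 0:
            out.append(ALPHABET[label])
        prev = label
    return "".join(out)


def mac(weights, inputs):
    """Multiply-accumulate: the inner loop of a neural-network layer."""
    acc = 0.0
    for w, x in zip(weights, inputs):
        acc += w * x
    return acc
```

    In this sketch, each beam-search step scores every extension of every surviving path independently, and each MAC term is an independent multiply. That kind of data parallelism is what an in-cache design can exploit, although how FuHsi maps these operations onto cache arithmetic is described only in the full paper.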

     
