iSCoder: Mitigating Genomic Sequencing Data Compression Bottlenecks via In-SRAM Computing
-
Abstract
With the rapid expansion of genomic sequencing data over the years, the costs associated with storage, transmission, and bandwidth are becoming the primary bottlenecks in genomic research and applications. Data compression is widely used to alleviate this burden, provided it achieves a sufficiently high compression ratio and fast compression speed. MPEG-G is a genome-specific compression standard that offers a higher compression ratio than general-purpose compression tools (4.3x), but at the cost of reduced performance (5x). Following common strategies in compression acceleration, we design iSCoder, the first hardware accelerator for the MPEG-G genomic data compression pipeline, built on in-SRAM computing. We identify and analyze MatchC and LutC as the two bottleneck algorithms within this pipeline, propose an optimized in-SRAM algorithm for each, and design a unified hardware architecture for both that takes the characteristics of genomic data into account. Experimental results demonstrate that, compared to 72-core Intel processors operating at 3 GHz, iSCoder achieves average speedups of 131x for MatchC and 191x for LutC.