TY - GEN
T1 - Algorithm-Hardware Co-design for BQSR Acceleration in Genome Analysis ToolKit
AU - Lo, Michael
AU - Fang, Zhenman
AU - Wang, Jie
AU - Zhou, Peipei
AU - Chang, Mau Chung Frank
AU - Cong, Jason
PY - 2020/5
Y1 - 2020/5
N2 - Genome sequencing is one of the key applications in healthcare and has a great potential to realize precision medicine and personalized healthcare. However, its computing process is very time consuming. Even pre-processing the raw sequence data of a whole genome for a single person to the analysis ready data can take several days on a single-core CPU. In this paper, we propose to accelerate the performance of the widely used Genome Analysis ToolKit (GATK) using FPGAs. More specifically, we focus on the algorithm and hardware co-design for the Base Quality Score Re-calibration (BQSR) step in GATK, which is an important and time-consuming step to correct systematic errors made by a sequencing machine. Prior studies did not consider hardware acceleration for BQSR because it requires a large amount of memory with random access and has a lot of control flow. To address these challenges, we first adapt the algorithm to resolve the random memory access conflicts to achieve a fully pipelined accelerator design and reduce its dataset size. Second, we leverage the newly introduced large-capacity UltraRAM (URAM) in Xilinx UltraScale+ FPGAs to buffer BQSR's large dataset on chip, and further optimize its operating frequency. Finally, we also explore the coarse-grained pipeline and parallelism to improve the overall performance of the BQSR accelerator. Compared to the latest software implementation of BQSR on GATK 4.1, running on single-thread and 56-thread CPUs (14nm Xeon E5-2680 v4), our FPGA accelerator running on Xilinx 16nm UltraScale+ VCU1525 board achieves up to 40.7x and 8.5x speedups, respectively.
AB - Genome sequencing is one of the key applications in healthcare and has a great potential to realize precision medicine and personalized healthcare. However, its computing process is very time consuming. Even pre-processing the raw sequence data of a whole genome for a single person to the analysis ready data can take several days on a single-core CPU. In this paper, we propose to accelerate the performance of the widely used Genome Analysis ToolKit (GATK) using FPGAs. More specifically, we focus on the algorithm and hardware co-design for the Base Quality Score Re-calibration (BQSR) step in GATK, which is an important and time-consuming step to correct systematic errors made by a sequencing machine. Prior studies did not consider hardware acceleration for BQSR because it requires a large amount of memory with random access and has a lot of control flow. To address these challenges, we first adapt the algorithm to resolve the random memory access conflicts to achieve a fully pipelined accelerator design and reduce its dataset size. Second, we leverage the newly introduced large-capacity UltraRAM (URAM) in Xilinx UltraScale+ FPGAs to buffer BQSR's large dataset on chip, and further optimize its operating frequency. Finally, we also explore the coarse-grained pipeline and parallelism to improve the overall performance of the BQSR accelerator. Compared to the latest software implementation of BQSR on GATK 4.1, running on single-thread and 56-thread CPUs (14nm Xeon E5-2680 v4), our FPGA accelerator running on Xilinx 16nm UltraScale+ VCU1525 board achieves up to 40.7x and 8.5x speedups, respectively.
UR - http://www.scopus.com/inward/record.url?scp=85087331482&partnerID=8YFLogxK
U2 - 10.1109/FCCM48280.2020.00029
DO - 10.1109/FCCM48280.2020.00029
M3 - Conference contribution
AN - SCOPUS:85087331482
T3 - Proceedings - 28th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2020
SP - 157
EP - 166
BT - Proceedings - 28th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 3 May 2020 through 6 May 2020
ER -