An Effective and Efficient Algorithm for Detecting Exact Deletion Breakpoints from Viral Next-Generation Sequencing Data

Ji Hong Cheng, Wen Chun Liu, Ting Tsung Chang, Sun Yuan Hsieh, Vincent S. Tseng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The COVID-19 pandemic has caused serious damage to the health, lives, and economic stability of people all over the world. To combat this disease, researchers worldwide, including computer scientists, have begun to engage in cross-regional cooperation on SARS-CoV-2 research. A recent report pointed out that sequence deletion in a specific region of the SARS-CoV-2 genome is related to viral infectivity. In addition, sequence deletion in this specific region is also found in Hepatitis B Virus (HBV) and hepatocellular carcinoma (HCC). Next-generation sequencing (NGS) technology makes it possible to obtain the sequence data of genomes quickly, but the number of short reads generated by NGS is often as high as one million, which amounts to big data. It is a challenge to extract from these NGS data the information needed to determine exact sequence deletion breakpoints, especially in the sequence data of highly variable viral genomes. In our previous research, we proposed VirDelect, a bioinformatics tool that can detect exact breakpoints in viral NGS data. In this paper, a new method, One-base Alignment Plus (OAP), is proposed to further enhance the core VirDelect algorithm and improve the correctness of sequence deletion detection. We conduct experiments on simulated SARS-CoV-2 and HBV data with different deletion lengths, and on real HBV data, to evaluate the correctness. The experimental results show that VirDelect+OAP found deletions in the simulated data that VirDelect could not find, and that on the real data the correctness of VirDelect+OAP was effectively improved.

Original language: English
Title of host publication: Proceedings - 2020 International Computer Symposium, ICS 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 147-152
Number of pages: 6
ISBN (Electronic): 9781728192550
State: Published - Dec 2020
Event: 2020 International Computer Symposium, ICS 2020 - Tainan, Taiwan
Duration: 17 Dec 2020 → 19 Dec 2020

Publication series

Name: Proceedings - 2020 International Computer Symposium, ICS 2020

Conference

Conference: 2020 International Computer Symposium, ICS 2020
Country: Taiwan
City: Tainan
Period: 17/12/20 → 19/12/20

Keywords

  • Big data
  • COVID-19
  • Hepatitis B Virus
  • Next-generation sequencing
  • Viral deletion detection
