Identification and Characterization of Species-Specific Severe Acute Respiratory Syndrome Coronavirus 2 Physicochemical Properties

Srinivasulu Yerukala Sathipati, Shinn Ying Ho*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

There is an urgent need to elucidate the underlying mechanisms of coronavirus disease (COVID-19) so that vaccines and treatments can be devised. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is genetically similar to bat and pangolin viruses, but a comprehensive understanding of the functions of its proteins at the amino acid sequence level is lacking. A total of 4320 sequences of human and nonhuman coronaviruses were retrieved from the Global Initiative on Sharing All Influenza Data and the National Center for Biotechnology Information. This work proposes an optimization method, COVID-Pred, with an efficient feature selection algorithm to classify species-specific coronaviruses based on the physicochemical properties (PCPs) of their sequences. COVID-Pred identified a set of 11 PCPs using a support vector machine and achieved 10-fold cross-validation and test accuracies of 99.53% and 97.80%, respectively. These findings could provide key insights into the driving forces during the course of infection and assist in developing effective therapies.
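To make the idea of a sequence-level PCP feature concrete, the sketch below averages one per-residue property over a protein sequence to produce a single numeric feature. It is an illustration only, not the COVID-Pred implementation: the Kyte-Doolittle hydropathy scale stands in for whichever PCPs the method selects, and the helper names `mean_property` and `encode` are hypothetical.

```python
# Illustrative sketch: encode a protein sequence as averaged physicochemical
# properties (PCPs). The Kyte-Doolittle hydropathy index is an assumed example
# scale; the abstract does not name the 11 PCPs that COVID-Pred selected.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_property(sequence, scale):
    """Average a per-residue property over a sequence.

    Residues not covered by the scale (for example 'X') are skipped.
    """
    values = [scale[aa] for aa in sequence.upper() if aa in scale]
    if not values:
        raise ValueError("sequence contains no residues covered by the scale")
    return sum(values) / len(values)


def encode(sequence, scales):
    """Build a feature vector with one averaged value per property scale."""
    return [mean_property(sequence, scale) for scale in scales]


if __name__ == "__main__":
    # Hypothetical short peptide, used only to show the call pattern.
    print(encode("MFVFLVLLPLVSSQ", [KYTE_DOOLITTLE]))
```

A feature vector of this kind, with one entry per selected property, is the sort of input a support vector machine could be trained on and evaluated with 10-fold cross-validation.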

Original language: English
Journal: Journal of Proteome Research
State: Accepted/In press - 2021

Keywords

  • machine learning
  • physicochemical properties
  • SARS-CoV-2 classification
  • support vector machines
