A hybrid algorithm of Backward Hashing and automaton tracking for virus scanning

Po Ching Lin*, Ying-Dar Lin, Yuan Cheng Lai

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

21 Scopus citations


Virus scanning involves computationally intensive string matching against a large number of signatures with different characteristics. This variety complicates the choice of matching algorithm, because each approach outperforms the others only for certain signature characteristics. We propose a hybrid approach for the open-source ClamAV that partitions the signatures into long and short ones. The Backward Hashing algorithm, an enhancement of the Wu-Manber (WM) algorithm, handles only the long patterns so that the average skip distance is lengthened, while the Aho-Corasick (AC) algorithm scans for only the short patterns so that the automaton size is reduced. The former uses the bad-block heuristic to exploit long shift distances and reduce the verification frequency, making it much faster than the original WM implementation in ClamAV. The latter improves AC performance by around 50 percent owing to better cache locality. We also rank the factors by their importance to string matching performance.
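The partitioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Backward Hashing algorithm and not ClamAV code: the long-pattern side uses the classic Wu-Manber bad-block shift table, the short-pattern side uses a plain dictionary-based Aho-Corasick automaton, and the names `BLOCK`, `LONG_MIN`, `build_wm`, `scan_wm`, `build_ac`, `scan_ac`, and `hybrid_scan` are hypothetical. The block size of 2 and the length threshold of 8 bytes are assumptions chosen for illustration; the abstract does not state the values used in the paper.

```python
from collections import deque

BLOCK = 2      # assumed block size for the bad-block heuristic
LONG_MIN = 8   # assumed threshold: patterns this long go to the shift-based scanner

def build_wm(patterns):
    """Build Wu-Manber style shift and bucket tables for the long patterns."""
    m = min(len(p) for p in patterns)   # window length = shortest long pattern
    default = m - BLOCK + 1             # safe shift for a block seen in no pattern
    shift, buckets = {}, {}
    for p in patterns:
        for j in range(BLOCK, m + 1):   # block ending at offset j of the window
            blk = p[j - BLOCK:j]
            shift[blk] = min(shift.get(blk, default), m - j)
        buckets.setdefault(p[m - BLOCK:m], []).append(p)
    return m, default, shift, buckets

def scan_wm(text, tables):
    """Slide a window over text, skipping ahead by the shift of its last block."""
    m, default, shift, buckets = tables
    hits, pos = [], m                   # pos is the exclusive end of the window
    while pos <= len(text):
        blk = text[pos - BLOCK:pos]
        s = shift.get(blk, default)
        if s == 0:                      # candidate: verify every pattern in the bucket
            start = pos - m
            hits += [(start, p) for p in buckets.get(blk, [])
                     if text.startswith(p, start)]
            s = 1
        pos += s
    return hits

def build_ac(patterns):
    """Build a small Aho-Corasick automaton (goto, fail, output) for short patterns."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({}); fail.append(0); out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    queue = deque(goto[0].values())     # depth-1 states keep fail = 0
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return goto, fail, out

def scan_ac(text, automaton):
    """Feed text through the automaton one symbol at a time."""
    goto, fail, out = automaton
    hits, s = [], 0
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        hits += [(i - len(p) + 1, p) for p in out[s]]
    return hits

def hybrid_scan(text, patterns):
    """Route long patterns to the shift-based scanner and short ones to the automaton."""
    long_p = [p for p in patterns if len(p) >= LONG_MIN]
    short_p = [p for p in patterns if 0 < len(p) < LONG_MIN]
    hits = []
    if long_p:
        hits += scan_wm(text, build_wm(long_p))
    if short_p:
        hits += scan_ac(text, build_ac(short_p))
    return sorted(hits)                 # (offset, pattern) pairs
```

Calling `hybrid_scan(data, signatures)` with `bytes` arguments returns a sorted list of `(offset, pattern)` pairs. The sketch shows why the split helps in principle: removing short patterns raises the window length `m` and therefore the largest possible shift, while removing long patterns keeps the automaton small.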

Original language: English
Article number: 5453354
Pages (from-to): 594-601
Number of pages: 8
Journal: IEEE Transactions on Computers
Issue number: 4
State: Published - 7 Mar 2011


  • automaton
  • filtering
  • string matching
  • virus scanning
