A Hybrid Methodology of Effective Text-Similarity Evaluation

Shu Kai Yang*, Chien Chou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

This paper presents an effective methodology that hybridizes an LCS (longest common subsequence) finding algorithm with SimHash computation to evaluate the text similarity of articles. It reduces the time and space required by the LCS algorithm by breaking the articles into per-sentence word subsequences and by managing and pairing those subsequences through SimHash comparisons. This allows long articles to be evaluated rapidly, while the similar parts and the similarity score of the compared articles are still determined exactly.
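The idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the sentence splitter, the MD5-based 64-bit word hash, the `max_distance` threshold, the best-match pairing rule, and the scoring formula (matched LCS words divided by the word count of the first article) are all assumptions chosen for illustration. The sketch only shows the general shape of the technique: SimHash fingerprints select which sentence pairs are worth comparing, and the quadratic-time LCS runs only on those short pairs instead of on the whole articles.

```python
import hashlib
import re


def sentences(text):
    """Split text into sentences, each a list of lowercase words (assumed tokenizer)."""
    parts = re.split(r"[.!?]+", text)
    tokenized = (re.findall(r"\w+", p.lower()) for p in parts)
    return [words for words in tokenized if words]


def simhash(words, bits=64):
    """Compute a SimHash fingerprint: a per-bit majority vote over word hashes."""
    counts = [0] * bits
    for w in words:
        # First 8 bytes of MD5 give a 64-bit word hash (an illustrative choice).
        h = int.from_bytes(hashlib.md5(w.encode("utf-8")).digest()[:8], "big")
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def lcs_length(a, b):
    """Length of the longest common subsequence, using two rolling DP rows."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(text_a, text_b, max_distance=16):
    """Score text_a against text_b in [0, 1]; threshold and formula are assumptions."""
    sa, sb = sentences(text_a), sentences(text_b)
    if not sa or not sb:
        return 0.0
    fb = [simhash(s) for s in sb]
    matched = 0
    for s in sa:
        fa = simhash(s)
        # Pair each sentence with its nearest fingerprint in the other article.
        j = min(range(len(sb)), key=lambda k: hamming(fa, fb[k]))
        if hamming(fa, fb[j]) <= max_distance:
            # Run LCS only on the paired sentences, not on the full articles.
            matched += lcs_length(s, sb[j])
    return matched / sum(len(s) for s in sa)
```

Under these assumptions, two identical texts score 1.0, because every sentence pairs with its own copy at Hamming distance 0 and the LCS covers every word. A production system would likely index fingerprints (for example with locality-sensitive hashing, which appears among the keywords) rather than scan all sentences linearly as this sketch does.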

Original language: English
Title of host publication: New Trends in Computer Technologies and Applications - 23rd International Computer Symposium, ICS 2018, Revised Selected Papers
Editors: Chuan-Yu Chang, Chien-Chou Lin, Horng-Horng Lin
Publisher: Springer Verlag
Pages: 227-237
Number of pages: 11
ISBN (Print): 9789811391897
State: Published - 1 Jan 2019
Event: 23rd International Computer Symposium, ICS 2018 - Yunlin, Taiwan
Duration: 20 Dec 2018 to 22 Dec 2018

Publication series

Name: Communications in Computer and Information Science
Volume: 1013
ISSN (Print): 1865-0929

Conference

Conference: 23rd International Computer Symposium, ICS 2018
Country: Taiwan
City: Yunlin
Period: 20/12/18 to 22/12/18

Keywords

  • LCS
  • LSH
  • Plagiarism detection
  • SimHash
  • Text similarity
