Multi-layer segmentation of complex document images

Bing-Fei Wu*, Yen Lin Chen, Chung Cheng Chiu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Text is commonly printed on complex backgrounds, and segmenting that text is an important part of document analysis. Several methods have been proposed for segmenting text from images, but previous studies have not sufficiently addressed complex compound documents. This investigation presents an algorithm for segmenting text in various document images. The proposed algorithm applies a new multilayer segmentation method to separate text from compound document images, regardless of whether the text and background overlap. The method addresses several problems associated with the complexity of background images. Experimental results on document images scanned from book covers, advertisements, brochures and magazines show that the proposed algorithm can successfully segment Chinese and English text strings from various backgrounds, regardless of whether the text lies over a simple, slowly varying or rapidly varying background texture.

Original language: English
Pages (from-to): 997-1025
Number of pages: 29
Journal: International Journal of Pattern Recognition and Artificial Intelligence
Volume: 19
Issue number: 8
State: Published - 1 Dec 2005

Keywords

  • Complex compound document
  • Document analysis
  • Image segmentation
  • Text extraction
