Rule-based page segmentation for palm leaf manuscript on color image

Papangkorn Inkeaw, Jakramate Bootkrajang, Phasit Charoenkwan, Sanparith Marukatat, Shinn-Ying Ho, Jeerayut Chaijaruwanich*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Palm leaf manuscripts are an important source of history and ancient wisdom. A large number of manuscripts have already been digitized in the form of folio images. To extract useful information, optical character recognition (OCR) is often considered the first step towards text mining. Unfortunately, folio images contain multiple unsegmented palm leaf images, which makes them difficult to handle in the OCR process. This motivates us to propose a new page segmentation method for palm leaf manuscripts. The method consists of two main steps. The first is the detection of objects in folio images using the Connected Component Labeling method in a transformed L*a*b* color space. The second is rule-based selection of objects as either palm leaf or not palm leaf. Experiments performed on 20 publicly available palm leaf manuscripts, comprising 384 folio images, demonstrated that the proposed method effectively segmented folio images into separate palm leaf images, with 99.86% precision and 96.67% recall.
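The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the color-space transform or the selection rules, so the b* threshold, the minimum area, the minimum aspect ratio, and all function names below are hypothetical choices made only for illustration. The sketch uses a standard sRGB to CIE L*a*b* conversion (D65 white point), 4-connected component labeling, and two simple shape rules.

```python
from collections import deque


def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (D65 white point)."""
    def linearize(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    x = (0.4124 * rl + 0.3576 * gl + 0.1805 * bl) / 0.95047
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = (0.0193 * rl + 0.1192 * gl + 0.9505 * bl) / 1.08883

    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116

    fx, fy, fz = f(x), f(y), f(z)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def find_components(mask):
    """Step 1: 4-connected component labeling; returns box and area per object."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r0 in range(h):
        for c0 in range(w):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            top, left, bottom, right, area = r0, c0, r0, c0, 0
            while queue:
                r, c = queue.popleft()
                area += 1
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            comps.append({"box": (top, left, bottom, right), "area": area})
    return comps


def segment_leaves(image, b_threshold=15.0, min_area=50, min_aspect=3.0):
    """Step 2: keep objects that pass the (assumed) palm leaf rules."""
    # Assumption: leaf pixels are yellowish, i.e. b* above a fixed threshold.
    mask = [[srgb_to_lab(*px)[2] > b_threshold for px in row] for row in image]
    leaves = []
    for comp in find_components(mask):
        top, left, bottom, right = comp["box"]
        height, width = bottom - top + 1, right - left + 1
        # Assumed rules: large enough, and much wider than tall.
        if comp["area"] >= min_area and width / height >= min_aspect:
            leaves.append(comp["box"])
    return sorted(leaves)


def make_demo_folio():
    """Synthetic folio: two tan strips and one small tan blob on dark gray."""
    bg, tan = (40, 40, 40), (200, 170, 110)
    img = [[bg] * 40 for _ in range(20)]
    for top in (2, 10):
        for r in range(top, top + 4):
            for c in range(2, 38):
                img[r][c] = tan
    for r in range(16, 19):
        for c in range(5, 8):
            img[r][c] = tan
    return img


leaves = segment_leaves(make_demo_folio())
print(leaves)
```

On the synthetic folio, the two wide strips pass both rules and the small blob is rejected by the area rule. Each returned tuple is a bounding box in (top, left, bottom, right) order that could be used to crop one palm leaf image from the folio. A real folio would need thresholds tuned to the scanning conditions.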

Original language: English
Title of host publication: Digital Libraries
Subtitle of host publication: Knowledge, Information, and Data in an Open Access Society - 18th International Conference on Asia-Pacific Digital Libraries, ICADL 2016, Proceedings
Editors: Atsuyuki Morishima, Andreas Rauber, Chern li Liew
Publisher: Springer Verlag
Pages: 127-136
Number of pages: 10
ISBN (Print): 9783319493039
State: Published - 1 Jan 2016
Event: 18th International Conference on Asia-Pacific Digital Libraries, ICADL 2016 - Tsukuba, Japan
Duration: 7 Dec 2016 to 9 Dec 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10075 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Asia-Pacific Digital Libraries, ICADL 2016
Country: Japan
City: Tsukuba
Period: 7/12/16 to 9/12/16

Keywords

  • L*a*b* color space
  • Page segmentation
  • Palm leaf manuscripts
  • Rule-based selection
