Journal of Imaging

Article

Benchmarking of Document Image Analysis Tasks for Palm Leaf Manuscripts from Southeast Asia

Made Windu Antara Kesiman 1,2,*, Dona Valy 3,4, Jean-Christophe Burie 1, Erick Paulus 5, Mira Suryani 5, Setiawan Hadi 5, Michel Verleysen 3, Sophea Chhun 4 and Jean-Marc Ogier 1

1 Laboratoire Informatique Image Interaction (L3i), Université de La Rochelle, 17042 La Rochelle, France;
[email protected] (J.-C.B.); [email protected] (J.-M.O.)
2 Laboratory of Cultural Informatics (LCI), Universitas Pendidikan Ganesha, Singaraja, Bali 81116, Indonesia; [email protected]
3 Institute of Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université Catholique de Louvain, 1348 Louvain-la-Neuve, Belgium; [email protected]
4 Department of Information and Communication Engineering, Institute of Technology of Cambodia, Phnom Penh, Cambodia; [email protected]
5 Department of Computer Science, Universitas Padjadjaran, Bandung 45363, Indonesia; [email protected] (E.P.); [email protected] (M.S.); [email protected] (S.H.)
* Correspondence:
[email protected]

Received: 15 December 2017; Accepted: 18 February 2018; Published: 22 February 2018

Abstract: This paper presents a comprehensive test of the principal tasks in document image analysis (DIA), from binarization, text line segmentation, and isolated character/glyph recognition through to word recognition and transliteration, on a new and challenging collection of palm leaf manuscripts from Southeast Asia. The study introduces, and is conducted on, a complete dataset of these manuscripts, which covers three scripts: Khmer script from Cambodia, and Balinese and Sundanese scripts from Indonesia.