Comparative Study of Table Layout Analysis
Master of Science in Computer Science
March 2018

Comparative study of table layout analysis
Layout analysis solutions study for Swedish historical hand-written document

Xusheng Liang

Faculty of Computing, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden

This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. The thesis is equivalent to 20 weeks of full time studies.

The authors declare that they are the sole authors of this thesis and that they have not used any sources other than those listed in the bibliography and identified as references. They further declare that they have not submitted this thesis at any other institution to obtain a degree.

Contact Information:
Author(s): Xusheng Liang
E-mail: [email protected]

University advisor: Cheddad, Abbas
E-mail: [email protected]
Department of Computer Science and Engineering

Faculty of Computing
Blekinge Institute of Technology
SE-371 79 Karlskrona, Sweden
Internet: www.bth.se
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

ABSTRACT

Background. Information retrieval systems are becoming increasingly popular; they help people retrieve information more efficiently and speed up daily tasks. Within this context, image processing technology plays an important role: it helps transcribe the content of printed or handwritten documents into digital data for use in information retrieval systems. This transcription procedure is called document digitization. It employs image processing techniques such as layout analysis and word recognition to segment the document content and transcribe the image content into words. A Swedish company (ArkivDigital® AB) needs to transcribe its document data into digital data.

Objectives.
In this study, the aim is to find an effective solution for extracting the document layout of Swedish handwritten historical documents, which are characterized by tabular forms containing handwritten content. To this end, the outcomes of applying OCRopus, OCRfeeder, traditional image processing techniques, and machine learning techniques to Swedish historical handwritten documents are compared and studied.

Methods. Implementation and experiment are used to develop three comparative solutions in this study. The first is Hessian filtering with a mask operation; the second is Gabor filtering with a morphological opening operation; the third is Gabor filtering with machine learning classification. In the last solution, different alternatives were explored to build the document layout extraction pipeline. First, the Hessian filter and the Gabor filter are evaluated. Second, images are filtered with the better of the two filters, and the filtered image is refined with the Hough line transform. Third, transfer learning features and custom features are extracted. Fourth, a classifier is fed with the extracted features and the result is analyzed. After all the solutions were implemented, they were applied to a sample set of the Swedish historical handwritten documents, and their performance was compared by means of a survey.

Results. Both open source OCR systems, OCRopus and OCRfeeder, fail to deliver the outcome because they are designed to handle general document layouts rather than table layouts. The traditional image processing solutions work in more than half of the cases, but they do not work well. Combining traditional image processing techniques with machine learning techniques gives the best result, but at a great time cost.

Conclusions. The results show that existing OCR systems cannot carry out the layout analysis task on our Swedish historical handwritten documents. Traditional image processing techniques are capable of extracting the general table layout in these documents.
By introducing machine learning techniques, a better and more accurate table layout can be extracted, but at a greater time cost.

Keywords: layout analysis, pattern recognition, document digitalization, table layout extraction, transfer learning, machine learning, image processing, Hessian filter, Gabor filter

ACKNOWLEDGEMENT

Due to personal issues, this study took longer than expected to finish. During this period, I thank my supervisor Abbas for being patient and for providing good guidance and support. I also thank my family for their unconditional support, and the fellows who participated in the evaluation work. Finally, I thank Swedish medical care for taking good care of me.

ACRONYMS

GLOSSARY

true positive / (true positive + false positive)
true positive / (true positive + false negative)

CONTENTS

ABSTRACT
ACKNOWLEDGEMENT
ACRONYMS
GLOSSARY
CONTENTS
1 INTRODUCTION
    1.1 BACKGROUND
    1.2 AIM AND OBJECTIVES
        1.2.1 Research gap
        1.2.2 Aim
        1.2.3 Objectives
    1.3 RESEARCH QUESTIONS
    1.4 STRUCTURE OF STUDY
    1.5 LIMITATION
    1.6 CONTRIBUTION
    1.7 ETHICS
2 RELATED WORK
3 METHOD
    3.1 FILTER APPLICATION
        3.1.1 Hessian filter
        3.1.2 Gabor filter
    3.2 OPEN SOURCE TOOL APPLICATIONS
        3.2.1 OCRopus testing
        3.2.2 OCRfeeder testing
    3.3 IMPLEMENTATION
        3.3.1 Image processing solution I - Hessian filter with mask application
        3.3.2 Image processing solution II - Gabor filter with morphological opening
        3.3.3 Solution III - Connected Component with Machine Learning Classifier
        3.3.4 Experiments in implementation