Arabic Handwriting Synthesis
Total Page:16
File Type:pdf, Size:1020Kb
© Yousef S. I. Elarian 2014 iii Dedication dهل لوالدي ثم لكل محب To Allah Then, to my parents, and to all who helped, cared, or loved. iv ACKNOWLEDGMENTS Thanks again to my Lord, and to my Parents. Thanks to King Abdul Aziz City for Science and Technology (KACST) for granting and supporting this work (Project # GSP-18-112). Thanks to Dr. Sabri Mahmoud and to Dr. Muhammad Al-Mulhem. Deep Personal Thanks to Dr. Zidouri, Dr. Al-Khatib, and the committee members. Thanks to my colleagues: Sameh, Tanvir, Misbahuddin, Irfan, Anas, and to all who helped that this dissertation is completed Thanks from the heart. v TABLE OF CONTENTS ACKNOWLEDGMENTS ............................................................................................... V TABLE OF CONTENTS ............................................................................................... VI LIST OF TABLES .......................................................................................................... IX LIST OF FIGURES ........................................................................................................ XI LIST OF ABBREVIATIONS ..................................................................................... XIV ABSTRACT ................................................................................................................ XVII XIX .................................................................................................................... ملخص الرسالة CHAPTER 1 INTRODUCTION ................................................................................... 20 1.1 Problem Statement ............................................................................................................................ 21 1.2 Motivation and Applications ............................................................................................................. 23 1.3 Background and Pre-Analysis of the Arabic Writing System ....................................................... 24 1.4 List of Contributions ......................................................................................................................... 29 1.5 List of Publications ............................................................................................................................ 30 1.6 Dissertation Organization ................................................................................................................. 32 CHAPTER 2 LITERATURE REVIEW ....................................................................... 33 2.1 Synthesis Applications, Specifications and Evaluation Methods ................................................... 34 2.1.1 Synthesis Applications ............................................................................................................... 34 2.1.2 Specifications of Synthesis Systems and Outputs ...................................................................... 36 vi 2.1.3 Evaluation Methods.................................................................................................................... 41 2.1.4 Linking Applications, Specifications and Evaluation Methods.................................................. 43 2.2 Review on Shape-Simulation Approaches ....................................................................................... 44 2.2.1 Generation Techniques ............................................................................................................... 46 2.2.2 Concatenation Techniques ......................................................................................................... 51 2.3 Overview on Some Other Synthesis Approaches ............................................................................ 56 2.4 Synthesis for Text Recognition ......................................................................................................... 60 2.5 Arabic Handwriting Synthesis .......................................................................................................... 62 CHAPTER 3 ARABIC HANDWRITING ANALYSIS AND DATASET DESIGN . 63 3.1 Analysis of Arabic Typographic Models .......................................................................................... 63 3.2 Analysis and Design of Dataset ......................................................................................................... 67 3.2.1 The Ligatures Part of Dataset ..................................................................................................... 68 3.2.2 The Unligative Text and the Isolated Characters Parts of Dataset ............................................. 74 3.2.3 The Passages Part of Dataset ...................................................................................................... 81 3.2.4 The Repeated Phrases Part of Dataset ........................................................................................ 82 3.3 Form Collection ................................................................................................................................. 82 3.4 Data Preparation ............................................................................................................................... 89 3.4.1 Form-Page Deskew and Classification ....................................................................................... 89 3.4.2 Block of Handwriting Extraction and File Naming .................................................................... 90 CHAPTER 4 SEGMENTATION AND GROUND-TRUTHING .............................. 92 4.1 Preprocessing and Common Tools ................................................................................................... 94 4.1.1 Projections .................................................................................................................................. 95 4.1.2 Block of Handwriting Deskew ................................................................................................... 96 4.1.3 Baseline Estimation .................................................................................................................... 97 4.2 Ground-Truthing and Analysis on the Pixel-Level ....................................................................... 101 4.2.1 Line-Level Ground-Truthing .................................................................................................... 101 4.2.2 Character-Level Ground-Truthing............................................................................................ 102 4.2.3 Words, Pieces of Words, and Extended Character-Shapes Reassembly .................................. 103 4.3 Blind Line Segmentation ................................................................................................................. 104 4.3.1 Line Segmentation Algorithm .................................................................................................. 105 4.3.2 Line Segmentation Evaluation ................................................................................................. 106 4.3.3 Blind Character-Shape Segmentation ....................................................................................... 109 4.4 Non-Blind Segmentation ................................................................................................................. 114 vii 4.4.1 Words-to-Pieces of Arabic Words Non-Blind Segmentation ................................................... 114 4.4.2 Pieces of Arabic Words to Character-Shapes Segmentation .................................................... 118 4.5 Segmentation Evaluation with Ground-Truth .............................................................................. 121 CHAPTER 5 ARABIC HANDWRITING SYNTHESIS .......................................... 127 5.1 Connection-Point Location ............................................................................................................. 128 5.2 Feature Extraction ........................................................................................................................... 129 5.3 Sample Selection .............................................................................................................................. 131 5.4 Concatenation .................................................................................................................................. 132 5.4.1 The Extended Glyph Approach ................................................................................................ 133 5.4.2 The Synthetic-Extension Approach .......................................................................................... 134 5.5 Experimentation and Results .......................................................................................................... 141 5.5.1 Synthesis Experimentation and Results .................................................................................... 141 5.5.2 Recognition Experimentation and Results ............................................................................... 143 CHAPTER 6 CONCLUSION AND RECOMMENDATIONS ................................ 149 REFERENCES .............................................................................................................. 151 APPENDICES ............................................................................................................... 166 Appendix A: Statistics on character and ligature shapes sizes .............................................................. 166 Appendix B: Statistics and comparisons on ligature shape