Arabic Handwriting Synthesis

© Yousef S. I. Elarian 2014 iii Dedication dهل لوالدي ثم لكل محب To Allah Then, to my parents, and to all who helped, cared, or loved. iv ACKNOWLEDGMENTS Thanks again to my Lord, and to my Parents. Thanks to King Abdul Aziz City for Science and Technology (KACST) for granting and supporting this work (Project # GSP-18-112). Thanks to Dr. Sabri Mahmoud and to Dr. Muhammad Al-Mulhem. Deep Personal Thanks to Dr. Zidouri, Dr. Al-Khatib, and the committee members. Thanks to my colleagues: Sameh, Tanvir, Misbahuddin, Irfan, Anas, and to all who helped that this dissertation is completed Thanks from the heart. v TABLE OF CONTENTS ACKNOWLEDGMENTS ............................................................................................... V TABLE OF CONTENTS ............................................................................................... VI LIST OF TABLES .......................................................................................................... IX LIST OF FIGURES ........................................................................................................ XI LIST OF ABBREVIATIONS ..................................................................................... XIV ABSTRACT ................................................................................................................ XVII XIX .................................................................................................................... ملخص الرسالة CHAPTER 1 INTRODUCTION ................................................................................... 20 1.1 Problem Statement ............................................................................................................................ 21 1.2 Motivation and Applications ............................................................................................................. 23 1.3 Background and Pre-Analysis of the Arabic Writing System ....................................................... 24 1.4 List of Contributions ......................................................................................................................... 29 1.5 List of Publications ............................................................................................................................ 30 1.6 Dissertation Organization ................................................................................................................. 32 CHAPTER 2 LITERATURE REVIEW ....................................................................... 33 2.1 Synthesis Applications, Specifications and Evaluation Methods ................................................... 34 2.1.1 Synthesis Applications ............................................................................................................... 34 2.1.2 Specifications of Synthesis Systems and Outputs ...................................................................... 36 vi 2.1.3 Evaluation Methods.................................................................................................................... 41 2.1.4 Linking Applications, Specifications and Evaluation Methods.................................................. 43 2.2 Review on Shape-Simulation Approaches ....................................................................................... 44 2.2.1 Generation Techniques ............................................................................................................... 46 2.2.2 Concatenation Techniques ......................................................................................................... 51 2.3 Overview on Some Other Synthesis Approaches ............................................................................ 56 2.4 Synthesis for Text Recognition ......................................................................................................... 60 2.5 Arabic Handwriting Synthesis .......................................................................................................... 62 CHAPTER 3 ARABIC HANDWRITING ANALYSIS AND DATASET DESIGN . 63 3.1 Analysis of Arabic Typographic Models .......................................................................................... 63 3.2 Analysis and Design of Dataset ......................................................................................................... 67 3.2.1 The Ligatures Part of Dataset ..................................................................................................... 68 3.2.2 The Unligative Text and the Isolated Characters Parts of Dataset ............................................. 74 3.2.3 The Passages Part of Dataset ...................................................................................................... 81 3.2.4 The Repeated Phrases Part of Dataset ........................................................................................ 82 3.3 Form Collection ................................................................................................................................. 82 3.4 Data Preparation ............................................................................................................................... 89 3.4.1 Form-Page Deskew and Classification ....................................................................................... 89 3.4.2 Block of Handwriting Extraction and File Naming .................................................................... 90 CHAPTER ‎4 SEGMENTATION AND GROUND-TRUTHING .............................. 92 4.1 Preprocessing and Common Tools ................................................................................................... 94 4.1.1 Projections .................................................................................................................................. 95 4.1.2 Block of Handwriting Deskew ................................................................................................... 96 4.1.3 Baseline Estimation .................................................................................................................... 97 4.2 Ground-Truthing and Analysis on the Pixel-Level ....................................................................... 101 4.2.1 Line-Level Ground-Truthing .................................................................................................... 101 4.2.2 Character-Level Ground-Truthing............................................................................................ 102 4.2.3 Words, Pieces of Words, and Extended Character-Shapes Reassembly .................................. 103 4.3 Blind Line Segmentation ................................................................................................................. 104 4.3.1 Line Segmentation Algorithm .................................................................................................. 105 4.3.2 Line Segmentation Evaluation ................................................................................................. 106 4.3.3 Blind Character-Shape Segmentation ....................................................................................... 109 4.4 Non-Blind Segmentation ................................................................................................................. 114 vii 4.4.1 Words-to-Pieces of Arabic Words Non-Blind Segmentation ................................................... 114 4.4.2 Pieces of Arabic Words to Character-Shapes Segmentation .................................................... 118 4.5 Segmentation Evaluation with Ground-Truth .............................................................................. 121 CHAPTER ‎5 ARABIC HANDWRITING SYNTHESIS .......................................... 127 5.1 Connection-Point Location ............................................................................................................. 128 5.2 Feature Extraction ........................................................................................................................... 129 5.3 Sample Selection .............................................................................................................................. 131 5.4 Concatenation .................................................................................................................................. 132 5.4.1 The Extended Glyph Approach ................................................................................................ 133 5.4.2 The Synthetic-Extension Approach .......................................................................................... 134 5.5 Experimentation and Results .......................................................................................................... 141 5.5.1 Synthesis Experimentation and Results .................................................................................... 141 5.5.2 Recognition Experimentation and Results ............................................................................... 143 CHAPTER ‎6 CONCLUSION AND RECOMMENDATIONS ................................ 149 REFERENCES .............................................................................................................. 151 APPENDICES ............................................................................................................... 166 Appendix A: Statistics on character and ligature shapes sizes .............................................................. 166 Appendix B: Statistics and comparisons on ligature shape

Arabic Handwriting Synthesis

A Collection of Mildly Interesting Facts About the Little Symbols We Communicate With

Using Action-Level Metrics to Report the Performance of Multi-Step Keyboards

VMD134 Calligraphy

Problems to Be Completed in 90 Minutes

Efficient Acquisition and Synthesis in Computerized Handwriting

Classifying Arabic Fonts Based on Design Characteristics: PANOSE-A

V·Index of Linguistic Items

Adaptifont: Increasing Individuals' Reading Speed with a Generative

Fonts & Encodings

LUCIDAH Ligative and Unligative Characters in a Dataset for Arabic Handwriting

Letter from the President

It's All About Languages! :: Lingvo.Info