<<

Arabic Optical State of the Smart Character Art in Arabic Apps for Recognition OCR PWDs and Assistive Qatari using OCR Issue no. 15 Technology Research Nafath Efforts

Page 04 Page 07 Page 27

Machine Learning, Deep Learning and OCR Revitalizing Technology Arabic Optical Character Recognition (OCR) Technology at Qatar National Library

Overview of Arabic OCR and Related Applications

www.mada.org.qa Nafath About AboutIssue 15 Content Mada Nafath3 Page Nafath aims to be a key information 04 Arabic Optical Character resource for disseminating the facts about Recognition and Assistive Mada Center is a private institution for public benefit, which latest trends and innovation in the field of Technology was founded in 2010 as an initiative that aims at promoting ICT Accessibility. It is published in English digital inclusion and building a technology-based community and Arabic languages on a quarterly basis 07 State of the Art in Arabic OCR that meets the needs of persons with functional limitations and intends to be a window of information Qatari Research Efforts (PFLs) – persons with disabilities (PWDs) and the elderly in to the world, highlighting the pioneering Qatar. Mada today is the world’s Center of Excellence in digital work done in our field to meet the growing access in Arabic. Overview of Arabic demands of ICT Accessibility and Assistive 11 OCR and Related Through strategic partnerships, the center works to Technology products and services in Qatar Applications enable the education, culture and community sectors and the Arab region. through ICT to achieve an inclusive community and educational system. The Center achieves its goals 14 Examples of Optical by building partners’ capabilities and supporting the Character Recognition Tools development and accreditation of digital platforms in accordance with international standards of digital access. Arabic Optical Mada raises awareness, provides consulting services 16 and increases the number of assistive technology Character Recognition solutions in Arabic through the Mada Innovation Program (OCR) Technology at to enable equal opportunities for PWDs and the elderly Qatar National Library in the digital community. 22 The optical recognition technology At the national level, Mada Center has achieved a digital in assistive technology devices accessibility rate of 94% amongst government websites, while Qatar ranks fifth globally on the Digital Accessibility Rights Evaluation Index (DARE). Machine Learning, 24 Deep Learning and OCR Our Vision Enhancing ICT accessibility in Qatar and beyond. Revitalizing Technology

Our Mission 27 Smart Apps for PWDs using OCR Unlock the potential of persons with functional limitations (PFLs) – persons with disabilities (PWDs) and the elderly - 30 Making Social Media Accessible for All through enabling ICT accessible capabilities and platforms. Twitter Nafath Nafath Arabic Optical Character Recognition and Assistive Technology Issue 15 Issue 15 4 5

The advent of OCR based AT has had a materials to accessible format using specialized transforming effect on PWDs in terms of areas or hardware solutions. The first step Arabic Optical Character like increased educational productivity and in the conversion process relies on using OCR enhanced independence in performing daily technologies to identify the characters in the tasks, leading to an improved quality of life. reading materials and convert the content into The ability of OCR technology to offer efficient a digital format. Once converted, the digitized Recognition and and accurate conversion of paper and image document can be used by AT solutions (e.g., documents to editable digitized formats has Duxbury, EZ converter, etc.) to create reading greatly influenced the potential of providing materials in an accessible format like large Assistive Technology information in accessible formats suitable print, text-to-speech, and braille. Without for use by PWDs. Several types of innovative accurate OCR technology, the translation of As Optical Character Recognition (OCR) technology has gone AT solutions based on OCR technologies are scanned documents into digital format would be through substantial improvements over the past decades, the available in the market today, some of which unreliable and thus, hinder the ability for those Assistive Technology (AT) industry has utilized it as a crucial tool to are as follows: with visual impairments to read independently. serve as a foundation for developing breakthroughs in innovative Technologies that help to Wearable technologies that enable Individuals AT solutions. AT for individuals with various kinds of disabilities overcome Learning Difficulties with Visual Impairment to identify key objects has been developed through the use of OCR. These solutions have Individuals with learning difficulties like and improve their ability for independent living enabled Persons with Disabilities (PWDs) to be active members of dyslexia, and Attention Deficit Hyperactivity In recent years the concept of integrating society in several domains such as education, employment, and Disorder (ADHD) often find it challenging to read accessibility features into smart glasses to community. printed materials as it becomes problematic to improve the lives of individuals with visual distinguish between characters and keep track of impairment has been explored. The inclusion the flow of reading. Various AT solutions support of a camera and computing chip in smart the creation of digital documents from scanned glasses allows them to be an ideal platform for documents and images through the use of OCR learning OCR based AT solutions. Smart glass technology. These digitized documents can be technologies like Envision, NuEyes, and OCRam processed automatically and converted into have integrated OCR based features that enable accessible materials tailored fit for the needs the identification of print-based information like of the student with learning disabilities. Once product expiry dates, restaurant menus, and converted to digital format, specialized AT (e.g., printed bills. These glasses are also equipped TextHelp, Clicker, Kurzweil, etc.) can support with audio output which allows text-to-speech features like text-to-speech and text-tracking feedback of OCR identified information. tools that help users visually track the words being read out to them by the software. The Technologies that enable Individuals with software features can also include additional Physical Disabilities to access print materials useful options like creating a user-configurable Accessing conventional reading materials like library, dictionary, highlighting, adjustable word books and newspapers can be challenging and sentence tracking colors, and customizable for Individuals with Physical Disabilities as it backgrounds. requires sufficient dexterity to perform tasks like flipping the document pages. In such cases, the Technologies that enable Individuals with preferred access method is to have the reading Visual Impairment to read independently materials available in an electronic format (e.g., Individuals with visual impairments like HTML, PDF, etc.). OCR based AT solutions like low vision or blindness can often encounter OpenBook allow the creation of documents challenges to read printed materials. AT solutions in an electronic format from scanned printed can automate the process of converting printed documents and graphics-image based text. Nafath Arabic Optical Character Recognition and Assistive Technology Nafath Issue 15 Issue 15 6 7

The development of OCR technologies in Mada Center has continuously been invested other languages like Arabic has progressed in supporting the development of AT based on considerably over the past decade. This has Arabic OCR technologies. This continues to be State of the Art allowed the AT industry to create innovative achieved primarily by supporting innovators OCR based solutions localized in the Arabic and entrepreneurs through the Mada Innovation language as the improved OCR accuracy meant Program. A major contribution of Mada Center a more reliable outcome from the AT solution. towards its commitment of supporting the in Arabic OCR Currently, there are a handful of OCR based evolution of Arabic OCR is the development of the solutions commercially available in the Arabic “Arabic Money Reader App”, which recognizes language, some of which have around 96% or Qatari Riyal currency notes using the mobile Qatari Research Efforts more accuracy. Examples of OCR solutions that phone camera. Mada Center has also been support the Arabic language include: committed to supporting the digitization of Arabic Since the mid-1940s, there has been extensive research and language reading materials in accessible format publications on character recognition. With most of the published Sakhr OCR available worldwide. This objective is achieved work being on Latin characters, and Japanese and Chinese Sakhr OCR solution is capable of identifying by collaborating with international partners characters emerging in the mid-1960s. Despite almost a billion complex fonts (including cursive writing), like Bookshare, which hosts one of the largest diacritics, position-dependent character shapes, platforms of accessible reading materials for people worldwide using Arabic characters for writing (Arabic, overlapping, and non-standard fonts in the individuals with print disabilities. Persian, and Urdu), Arabic character recognition research, starting Arabic language. Sakhr OCR converts scans of in the 1970s, is sparse. both Arabic and Arabic-script based languages and can recognize Arabic text with an output accuracy of up to 99%.

ABBYY FineReader ABBYY FineReader is an OCR solution that includes features like digital conversion of Arabic scanned documents through applying intelligent document layouts, image enhancement, barcode recognition, and command line integration. ABBYY FineReader can directly convert the contents of printed documents into editable , Excel, or PDF format.

ReadIRIS ReadIRIS offers an accurate Arabic OCR recognition rate through its OCR software. The OCR engine supports the recovery of text from printed materials to various file format (e.g., Word, Excel, PowerPoint, or PDF). ReadIRIS supports the digitization of Arabic paper documents and thus, paves the way for AT solutions to create Arabic content in an accessible format. Nafath State of the Art in Arabic OCR Nafath State of the Art in Arabic OCR Issue 15 Qatari Research Efforts Issue 15 Qatari Research Efforts 8 9

This may be attributed to: ii. The dotting challenge iv. The ligatures challenge vi. Size variation challenge • Inadequate journals, books, conferences, Dotting is extensively used to differentiate Certain compounds of characters at particular Different Arabic graphemes do not have a funding, and interaction between characters sharing similar graphemes. positions of word segments are represented fixed height or a fixed width. Moreover, neither researchers. Figure (2) shows small differences between by single atomic graphemes called ligatures; the different nominal sizes of the same font • The lack of utilities like Arabic text members of the same set. Whether the dots found to some extent in most Arabic fonts. scale linearly with their actual line height nor databases, dictionaries, programming are eliminated before the recognition process, Traditional Arabic font contains around 220 the different fonts with the same nominal size tools, and supporting staff. or recognition features are extracted from the graphemes and Simplified Arabic contains have a fixed line height. • Delayed onset of Arabic text recognition. dotted script, dotting is an area of confusion – around 151 graphemes. Compared to • The techniques developed for other English with 40 or 50 graphemes. A broader therefore, recognition errors – in Arabic font- vii. The diacritics challenge writings cannot be successfully applied to written OCR systems, especially when using grapheme set means higher ambiguity for the Arabic writing due to the unique attributes devices such as photocopiers. same recognition methodology; hence, more Arabic diacritics are used only when they help of Arabic script. confusion. Figure (4) illustrates some ligatures resolve linguistic ambiguity of the text. The in Traditional Arabic. problem of diacritics with font written Arabic Arabic OCR challenges OCR is that their direction of flow is vertical Figure (2) while the main writing direction of the body Arabic is written from right to left, which Examples sets of dotting-differentiated graphemes Arabic text is horizontal from right to left. presents many challenges to the OCR (See Figure (6)) Similar dots and diacritics developer, which include (Al-Badr 1995; Attia are a source of confusion of font written OCR Figure (4) 2004): Some ligatures in the Traditional Arabic font systems; due to their relatively larger size they are usually pre-processed. i. The connectivity challenge iii. The multiple grapheme cases challenge v. The overlapping challenge Arabic text can only be scripted cursively, i.e., Due to the connectivity in Arabic orthography, Characters in a word may overlap vertically graphemes are connected and only interrupted the same grapheme representing the same even without touching, as shown in Figure (5). at limited characters or at the end of the word. character can have multiple variants according Figure (6) This necessitates that any Arabic OCR system to its position within the Arabic word segment Arabic text with diacritics undergoes a traditional grapheme recognition (Starting, Middle, Ending, Separate) as task and a more rigorous grapheme exemplified by the four variants of the Arabic .(shown in bold in Figure (3 ”ع“ segmentation (see Figure 1). To complicate character things, both tasks are mutually dependent; Figure (5) therefore, must be done simultaneously. Some overlapped Characters in Demashq Arabic font

Figure (3) ;in its 4 positions ”ع“ Figure (1) Grapheme Grapheme segmentation process illustrated by Starting, Middle, Ending & Separate manually inserting vertical lines at the appropriate grapheme connection points Nafath QCRI has established projects related to Nafath Issue 15 e-education and making non-native language Issue 15 material accessible. The development of an Arabic language supported e-reader and 10 assistive language tutor are examples that will 11 directly impact society and learning.

Qatari Research Efforts Some of the projects run by the Arabic Language Technologies Group at QCRI include: Arabic Language Technologies Group at Qatar Computing Research Institute (QCRI) is QATIP – An Optical Character Recognition leading research on OCR in Qatar. They are System for Arabic Heritage Collections in dedicated to promoting the Arabic language Libraries by conducting world-class research in Arabic language technologies. Ensuring that the The Qatar Computing Research Institute team Arabic language flourishes in the digital worked on an end-user oriented QATIP system world is a primary focus area. Some current for OCR in such documents. The recognition research projects address the challenges is based on the Kaldi toolkit and sophisticated related to the lack of content and extracting text image normalization. The QATIP interface that content. for libraries consists of a graphical user interface for adding and monitoring jobs and QCRI strives to become the regional and global a web API for automated access. It also uses leader in Arabic language technologies – in a novel approach for language modeling the areas of search, information retrieval and and ligature modeling for continuous Arabic Optical Character Recognition analysis, multilingual language processing, OCR. The QATIP system was tested on an (OCR) is a generic term used to advanced machine translation, and leading early print and a historical manuscript and characterize technologies that report substantial improvements – e.g., 12.6% efforts to increase and enrich Arabic language recognize text within scanned content online. character error rate with QATIP compared to 51.8% with the best OCR product (Stahlberg Overview of documents, and photos, to help Moreover, QCRI’s initiatives also examine 2015, 2016). convert them into a digital format. challenges in retrieving content, making it OCR technology is used to convert PrepOCRessor accessible, and enabling information flow virtually any kind of images across language barriers. In this regard, Arabic OCR The QCRI Preprocessing Tool for Arabic OCR containing written text (typed, development is underway to process Arabic was developed for preprocessing document in the search domain such as the use of images for optical character recognition. A handwritten, or printed) into morphological word analysis, named entity set of image processing operations is chained and Related machine-readable text data. OCR recognition, and data learning technology to such that the output of each operation serves has become a key area of interest detect relevant content for more elaborate as an input to the next one. The tool supports over the past two decades with analysis. In addition, developing proofing tools batch processing for high parallelism and Applications such as typographical checks and language scalability. PrepOCRessor is intended to be respect to implementing projects identification for local Arabic dialects and used in combination with the recognition related to digitize historic Arabic written using Latin characters. toolkit Kaldi and supports file formats for documents (e.g., newspapers, feature sets (.ark,t) and fOCRed-alignments manuscripts, constitutional bills, A major effort at QCRI goes into improving (.al) for seamless integration. Though the machine translation. Combining an Arabic focus is on Arabic script, the tool has been letters, etc.). The importance of “Speech-to-Text” engine that permits successfully used for other writing systems, OCR technologies has become instantaneous transcription of videos with a e.g., Latin in the ICDAR2015 Competition HTRtS even more widespread with machine translation system allows access to on historic documents. broadcast news and news distributed over the advent of the internet Stahlberg Felix, and Stephan Vogel. “The QCRI Recognition System for which serves as a resource of the web. Future research will concentrate on Handwritten Arabic.” In Proceedings of the 18th International Conference on applications such as lecture translation. Image Analysis and Processing (ICIAP 2015). Genova, Italy, September 2015. Stahlberg, Felix, and Stephan Vogel. “QATIP - An Optical Character Recognition multilingual information based on System for Arabic Heritage Collections in Libraries.” In DAS, 2016. Al-Badr, B., Mahmoud, S.A. “Survey and Bibliography of Arabic Optical Text digital textual data. Recognition.” Elsevier Science, Signal Processing, 41(1) (1995) pp. 49-77. Nafath Overview of Arabic OCR and Related Applications Nafath Overview of Arabic OCR and Related Applications Issue 15 Issue 15 12 13

While OCR technology identify the concerned OCR systems to recognize on Arabic and Chinese OCR technology has a key role Mada Center has a has undergone several Arabic character the character or word Handwriting Recognition was to play in the development of significant role to propel improvements over time and irrespective of its position accurately. held at College Park, MD in the Assistive Technologies to help the improvement of Arabic achieved close to a hundred within a word. • Loop-Shaped Character US where experts from both improve the lives of People OCR and the development percent accuracy in languages • Dot and No-Dot Character Arabic characters have a research fields presented with Disabilities (PWDs). As of innovative accessible based on Latin scripts (e.g., Certain Arabic characters loop shape, such as Saad their actual work. From that such PWDs, primarily Visually OCR based solutions. This time, intensive research on Impaired individuals, cannot is done by supporting ,ف)) Fa ,(ض) Dhad ,(ص) English), there have been may have the presence An Arabic script recognition use their Assistive Technology relevant innovators and .(ق) and Qaf ,(م) major challenges in enhancing of dots above or under Meem OCR accuracy for languages them which can impact obstacle for Arabic OCR started and has resulted in a to read digital content without entrepreneurs through the based on right-to-left the outcome of the final is to be able to accurately big step forward today. the accurate utilization of OCR Mada Innovation Program reading stylized scripts (e.g., character or word. There recognize Arabic The most common application technology. With improved to successfully develop Arabic, Persian, Urdu, etc.). may be one to three dots characters that contain a of OCR technology is Arabic OCR, PWDs can enjoy their Assistive Technology Arabic is the first language used with the character to loop shape. converting printed paper greater access to digital solutions and prepare them for more than 400 million determine the final word. • Diacritics documents to machine- documents and improve to be market-ready for Qatar people worldwide and Arabic • Dot Character Baseline Some Arabic text may be readable digital text format. their quality of life across and the Arabic speaking speaking readers represents The presence of a dot written with diacritical Some other areas of education, employment, and region. Mada Center strives a major proportion of internet within a character is in marks accompanying application for OCR technology other aspects of daily living. to increase the number of ICT users who are potentially relation to a baseline each character which are as follows (but not limited Additionally, the availability Digital Accessibility solutions interested in accessing Arabic as the dot used with makes it challenging for to): of digital text is vital to make to adequately serve the digital resource. Hence, the the Arabic character the OCR engine to identify • Data Entry Automation printed information accessible growing needs of PWDs in importance of optimizing may be located above the character effectively • Indexing Documents for to PWDs because this enables Qatar and the region. Arabic OCR technology is or below the baseline because this affects the Search Engines the creation of information significantly vital for improved (where applicable). The graphical analysis of the • Automatic Vehicle Plate in other accessible formats information and knowledge baseline is significant image. Number Recognition such as audio, large print, sharing within society. in developing Arabic • Voucher Code Scanning and Braille. Digital text The fundamental challenges OCR systems as it Over the past decades, • Office filing system is especially helpful for involving Arabic OCR is helps to classify Arabic researchers and scientists • Self-service stores / struggling readers, including the fact that recognition characters into two have worked on developing e-Kiosk those who have learning accuracy is harder to achieve classes: dot character various databases of Arabic • Digitizing handwritten difficulties such as dyslexia. . primarily due to the following above the baseline and handwritten words to serve documents, books, and characteristics of the Arabic dot character below the as a reference for OCR manuscripts script set: baseline. developers to create solutions • Assistive Technology • Character Position • Zigzag-Shaped Character for identifying textual shapes, An Arabic character may Another distinguishing characters, and reconciliating have one to four unique characteristic of Arabic them into a digital text format. shapes depending on its script is the presence of In 2002, a database on Arabic position within a word (i.e., Hamza, a zigzag-shaped handwritten words (IFN/ with some Arabic ENIT-database) was made (ء) isolated, initial, middle, mark and end). The OCR solution characters which can available to the community. must be able to effectively pose challenges for the In September 2006, a summit Nafath Nafath Examples of Optical Character Recognition Tools Issue 15 Issue 15 14 15

Name Founded License Online Programming SDK Arabic Examples of Year Language Language QATIP 2016 Free Yes Unknown Arabic Google 2016 Proprietary Yes Unknown Yes Arabic; Optical Character Cloud Vision Modern Standard + More than 100 1985 Apache No ++, C Yes Arabic + Recognition Tools More than 100 ABBYY 1989 Proprietary Yes C/C++ Yes Arabic + 192 Optical Character Recognition (OCR) refers to computer processes for converting FineReader images of printed, typed, or handwritten texts into text files. A computer requires OCR software to perform this task. This allows retrieving the text in the image Asprise OCR 1998 Proprietary Yes Java, C#,VB. Yes Arabic not supported and to save it in a file that can be used in a word processor for enrichment and SDK NET, C/C++/ + 20 stored in a database or on another medium that can be used by a computer system. Today, there many OCR engines that are used including Google Drive AnyDoc 1989 Proprietary No VBScript Arabic not supported OCR, Tesseract, Transym, OmniPage, etc. Many of them are paid, however, some Software are accessible for free. CuneiForm 1996 BSD variant No C/C++ Yes Arabic not supported Dynamsoft 2003 Proprietary Yes C/C++ Yes Arabic + 40 Recognizing Arabic text is a popular research topic, a significant amount of OCR SDK research efforts is being invested to increase the accuracy rate of Arabic OCR by using different approaches and technologies. In 2002, a system to OmniPage 1970s Proprietary Yes C/C++, C#[15] Yes Arabic + 125 recognize Arabic text using a neural network was developed using a set of 2003 GPL Yes C++ Yes Latin alphabet moment invariants descriptors. An artificial neural network (ANN) is used for SmartScore 1991 Proprietary No - Music classification [1]. The study has shown a high accuracy rate of 90% [2]. Another Microsoft - Proprietary No - Arabic research project made in 2017 used a database of 34,000 characters, 70% are Office used for training the machine learning, 15% for the testing phase, and 15% for Document validation. The project has achieved a 98.27% recognition rate [3]. In 2018, a Imaging project aiming to recognize Arabic handwriting used a dataset of greater than 43,000 handwritten Arabic phrases, 30,000 used for training and 13,000 used Puma.NET 2006 BSD No C# Yes Arabic not supported for the testing stage. The recognition result showed a 99% rate of accuracy [4]. + 28 ReadSoft - Proprietary No - Arabic not supported A number of tools and services have emerged in the market as a result of OCRFeeder 2009 GPL No Python Arabic not supported advances in such research. The quality, accuracy, and precision of OCR tools OCRopus 2007 Apache No Python All languages using have become more effective and improved over the years. Today, from simple to Latin script (other complex, there is a wide range of OCR solutions available for use. Some of these languages can be tools may need programming skills to make them work while others are ready trained) to use off-the-shelf solutions. Depending on their features and accuracy, the solution costs may vary, while some OCR tools are available to use for free too. [1] Muna Ahmed Awel, Ali Imam Abidi, Review on optical character recognition, International Research Journal of Details of the most known OCR resources in the market are provided in the table Engineering and Technology (IRJET), p-ISSN: 2395-0072, Volume: 06 Issue: 06 | June 2019 below: [2] M. M. Altuwaijri and M. A. Bayoumi, “Arabic text recognition using neural networks,” pp. 415–418, 2002. [3] N. Lamghari, M. E. H. Charaf, and S. Raghay, “Hybrid Feature Vector for the Recognition of Arabic Handwritten Characters Using Feed-Forward Neural Network,” Arab. J. Sci. Eng., vol. 43, no. 12, pp. 7031– 7039, 2018. [4] N. A. Jebril, H. R. Al-Zoubi, and Q. Abu Al-Haija, “Recognition of Handwritten Arabic Characters using Histograms of Oriented Gradient (HOG),” Pattern Recognit. Image Anal., vol. 28, no. 2, pp. 321–345, 2018. Nafath Nafath Issue 15 Issue 15 16 17

Arabic Optical Character Recognition (OCR) Technology at Qatar National Library Optical Character Recognition (OCR) is the practice of extracting text from images. The process itself is becoming popular in terms of usage and research, as it spans multiple areas of science, including image processing, machine learning, information retrieval, and artificial intelligence. Nafath Arabic Optical Character Recognition (OCR) Nafath Arabic Optical Character Recognition (OCR) Issue 15 Technology at Qatar National Library Issue 15 Technology at Qatar National Library 18 19

In layman’s terms, it is the only way to copy, use, and index the text from a scanned image. With the proper tools and algorithms, QNL This allowed QNL to index the output text using The benefits vary from a simple copy/paste process, citation, search within, text annotation, built universal libraries that cover 99% of the robust yet sophisticated Arabic text lexical and tagging. Moreover, it perfectly meshes with the modern algorithms of text mining, printed Arabic text based on shape, quality, analyzers and offered that to our patrons morphological search, text automatic translation, text summarization, linked data, and indexing and size and we have engineered a clear with just a single click. Our patrons need only tools. workflow to enhance the quality of the images to access the Library’s Digital Repository at the start by raising the DPI, smoothing and enjoy the “search within” feature, which Using OCR images is the ultimate value-add to scanned documents—it brings the documents to the edges, refining smudged printing is expected to immensely help improve the life and allows users to discover every bit of information stored within. and removing the noise. With this image- quality of research in Arabic studies, such as enhancement process and the trained machine art, history, science, and philosophy, to name At Qatar National Library, learning libraries, our OCR accuracy achieves just a few. multiple techniques and 99 percent character level accuracy in Arabic. algorithms to OCR text have been developed; these methods include both human operators and automatic SDKs (Software Development Kits) and APIs (Application Programming Interfaces). Adittionally, QNL built an accurate yet scalable system that will efficiently Figure 02 streamline the operation, Shape classification as it harmonizes the roles and responsibilities between humans and machines to reach the maximum quality of the extracted text.

While we use OCR for a wide array of languages, we are most proud of our achievements with regards to the Arabic text. Since the start of machine learning algorithms, and even with the most modern OCR executables, Arabic text remains a formidable challenge. The multiple shapes and sizes of the Figure 03 Arabic font, in addition to its diacritics, dot usage, cursive Arabic Text Recognition flow characters, and the changing of character shapes based on their location inside a word were all factors that reduced the quality of the OCRed text.

Figure 01 Arabic OCR challenges Nafath Arabic Optical Character Recognition (OCR) Nafath Arabic Optical Character Recognition (OCR) Issue 15 Technology at Qatar National Library Issue 15 Technology at Qatar National Library 20 21

The Library’s state-of-the-art digitization facility makes Arabic content from • Al Shaqab Horse Collection: the Library digitized over 50,000 images from its Heritage Library and other institutions available online, increasing the Al Shaqab’s Horse Collection. availability of Arabic content worldwide. • Ottoman Archive: 1,600 digital images of heritage documents related to the Gulf region from the Ottoman Archive have been processed, to be made The Library harnesses the expertise of a fully trained international team, and available on the Library’s online platforms. laboratories equipped with cutting-edge technology, to undertake various • Qatar Traditional Architecture Photographic Collection: The Library digitized processes of digital preservation. The Library offers the services of bulk a collection of 1793 photographs from a 1985 French archeological digitization, large-format scanning and image stitching, on-site digitization, expedition to Qatar that produced a comprehensive record of traditional E-Pub creation, Optical Character Recognition (OCR), 3D Photography, 19th-century architecture. audiovisual digitization, and long-term preservation. The Digitization Center follows international best practices and guidelines, In addition to ongoing including the Federal Agencies Digitization Guidelines Initiative (FADGI), efforts to digitize the Metamorfoze Preservation Imaging Guidelines, ISO- 19264, and the IFLA Library’s collections guidelines for digitization projects. This enabled the Library to digitize of rare books, 10,277,367 pages from various collections including 4,957,546 Arabic pages manuscripts, maps, from Qatar National Library’s Heritage Collection, and 2,782,016 pages from and photographs, the the Online Arabic Collection of New York University. Library’s Digitization Center is working on Libraries play an important role in preserving heritage for future generations, digitization and Arabic and digitization and its sophisticated processes and operations go a long way OCR projects with other in ensuring this is done. Furthermore, Qatar National Library is committed to heritage collections in the preservation of heritage not only of the region but of the Islamic world as a Qatar and international whole. We have come a long way in building a reliable process for digitization institutions, including: and OCR Arabic content for the benefit of spreading rich Arabic knowledge and heritage; we are committed to working harder to fulfill that goal. • New York University (NYU) project: This joint project applies Hany A. Elsawy Abdellatif optical character Head of Digitization Unit at Qatar National Library recognition to more than 8,000 Arabic books in NYU library collections, which will also be available on Qatar National Library’s online platforms. • Doha Historical Dictionary of Arabic Language: The Library contributes to the field of optical recognition of Arabic characters, which will assist research on the etymology and meaning of Arabic words. • Museum of Islamic Art: A Memorandum of Understanding outlined possible collaborations, including a project to digitize 164 of the rarest books and manuscripts in the museum as well as its and library collections, including the Latin OCR. Nafath Nafath The optical recognition technology Issue 15 Issue 15 in assistive technology devices 22 23

Digital text is one of the numerous formats Future - Machine Learning and Deep Learning that make printed information accessible to a The future of OCR technology is rebooting using The optical recognition broader audience; other forms include audio, artificial intelligence-based machine learning large print, and Braille. Digital text is incredibly and deep learning technologies; these new accommodating for struggling readers, including technologies are not limited by the rules-based those who experience learning differences character matching of existing OCR software. technology in assistive such as dyslexia. The digital format makes it The latest updates in OCR technology will have feasible for readers to see words on a screen a neural network that mirrors human brain and hear them read aloud at the same time, function to confirm that the algorithms don›t have technology devices which promotes more ways to interlock with to depend on historical patterns to determine Optical Character Recognition (OCR) plays a vital the information. It can also help children amplify accuracy and the benefit will be that it can derive role in converting printed materials into digital independent reading skills. meaning from the recognized text and will help text files. The use of OCR technology (OCR) has to automate manually intensive tasks such as transformed productivity, increased engagement Standalone OCR Devices document classification, data extraction, and and motivation, and, most importantly, with The latest OCR technology comes integrated storage. With this adaption, a disabled user will accelerated learning. The technology has also with various hardware like a smart reading also benefit from enriched data, deeper analysis, played a crucial role in improving the lifestyle pen, handheld magnifier devices, standalone and thoughtful recommendations. of individuals requiring support and people with CCTV devices, and braille devices. The basic disabilities. OCR technology provides effective, functionality is the same but the accuracy varies efficient, and accurate document processing that with the device. The device camera scans the converts paper or image original texts to editable document and the OCR software then converts formats and then, translates it into multiple the images into identified characters and words languages. and creates temporary files, including the text›s characters and page design. The identification process considers the logical form of the language. Solutions like Ebot Pro or Compact 10 HD portray all the key components of a flawless OCR device while also having other features. Such a system can recognize that a word spelled incorrectly at the beginning of a sentence is an error and can fix the error.

OCR systems use a dictionary and implement spell-checking techniques comparable to those found in many other word processors. The synthesizer in the OCR system then speaks the recognized text and the knowledge is stored in an electronic format. In certain OCRs, these temporary files can be translated to forms that can be retrieved by commonly used computer applications such as word processors, spreadsheets, and databases. A visually impaired or blind individual can access the scanned text by using adaptive technology devices that magnify the computer screen or provide text to speech or braille output. Nafath Nafath Machine Learning, Deep Learning and OCR Issue 15 Issue 15 Revitalizing Technology 24 25

Until now, OCR has contributed to helping Technology is being revitalized with the business owners to automate the processing introduction of Artificial Intelligence, Machine Machine Learning, of handling physical documents. When it Learning, and Deep Learning. Software comes to people with functional limitations, developers are working on robust solutions virtually all people with visual impairments and upgrades to existing OCR devices. Until Deep Learning and OCR to learning disabilities use OCR technology now, OCR users only option to increase the provided by various entities. Today, OCR reliability of scans is to manually measure programs are still used to transform and evaluate the process resultsy. With the Revitalizing Technology handwritten or printed text into machine- introduction of Machine Learning and Deep encoded text so that it can be retrieved on Learning, solutions will automatically conduct Optical character recognition tools a computer. OCR programs make copies of the evaluation while pulling insights from documents like receipts, bank statements, the text and understanding the meaning of are experiencing a quiet revolution as passports, and other forms of documentation the converted text. In other words, they can aspiring software providers merge that needs managing. process document content more accurately. As visionary technology providers blend OCR with Machine Learning and Deep OCR with Machine Learning and Deep Learning. Therefore, data capturing Learning, these tools are experiencing a quiet revolution. As a result, software for data software is instantaneously capturing capture is simultaneously collecting data information and understanding the and understanding the information, which content. In practice, this means that Machine Learning and Deep Learning tools can check for mistakes independent of a human user providing efficient fault management. Nafath Machine Learning, Deep Learning and OCR Nafath Issue 15 Revitalizing Technology Issue 15 26 27 implies, in practice, that Machine Learning The Mada Innovation Program has worked on and Deep Learning tools can search for a use case to develop an OCR with improved errors independent of a human-user, which Arabic language support with key benefits Smart Apps for will result in simplified and effective fault like improved access to digital documents for management. Arabic speaking People with Disabilities and superior accuracy for Arabic OCR that can PWDs using OCR The text in a grainy photograph can be read by be used across multiple disciplines. This will today’s Deep Learning and machine learning- also allow the conversion, management, and Optical Character Recognition (OCR) is a driven OCR systems, such as the Google privacy of big data in Arabic. technology that allows you to transform Vision API, even if it is thin, in a weird font, different types of documents into editable and upside down, or partially obscured. This is Deep Learning and Machine Learning searchable content, recognize and transform made possible through probabilistic analyses embedded OCR tools are sleeping giants on text or print documents, such as scanned of which letters are likely to occur where the broader topic of digital transformation. paper documents, PDF files, or digital camera considering the context of the scene. While As Deep Learning and Machine Learning are images, and convert them into digital text machine learning offers pioneering results widely embraced as disruptive new technology documents. The OCR program extracts in the extraction of information, extraction of that has automated manual processes, its and converts the characters into machine- receipt data, and freedom from templates, growth has led modern businesses to raise readable data. deep understanding helps to gain insights their expectations of what can be achieved by into the transformed data and algorithms to automation. Those using deep learning and learn from continuous feedback generated machine learning integrated OCR engines to by corrections to the extracted data to create search for errors and meanings are beginning better results over time. to outpace OCR engines that need to be controlled by human users. Deep learning and machine learning innovation in OCR has helped overcome This marks a revolution for users with reading challenges for individuals with functional limitations easing the usage at dyslexia, ADHD, and Irlen Syndrome while schools, the workplace, and home with various enabling visually impaired people by using settings which will improve education and high accuracy image-based with text-to- empower the user to take up challenging speech technology and deriving meaning from tasks, courses, and positions. While AI- the converted phrases. based OCR tools may not be as desirable as other transformative technologies, they will When it comes to the Arabic language, the predictably have a significant influence. accuracy rate for OCR is very low, making the technology effectively unusable on a wide scale. For People with Disabilities, namely people with visual disabilities, this means a low availability of accessible digital content in the Arabic language. Furthermore, it translates into that the means to create such material through OCR is not available as well. Nafath Smart Apps for PWDs using OCR Nafath Smart Apps for PWDs using OCR Issue 15 Issue 15 28 29

With the fundamental technology being Optical Character 3. Snapverter Recognition (OCR) technology, there are many creative For Google ChromeTM, Snapverter is an easy-to-use assisted learning technologies available. add-on for Read&Write that converts classroom papers Using OCR technology has transformed efficiency in the and files into readable PDF documents for easy sharing classroom, increased student participation and motivation, through Google Drive and aloud reading. and, most significantly, accelerated learning. Also, this 4. Prizmo innovation has played a crucial role in improving the lifestyle With advanced editing, OCR, and text-to-speech, scan any of persons with disabilities. image with text into PDF. In order to enhance their learning environment or reading skills, Prizmo may be used by low To convert a physical copy of a document into an interactive vision students or students struggling with dyslexia in (or soft) format, OCR technology can be used. For instance, if classrooms. you scan a multi-page document and want to convert it into a digital image format such as a TIFF file, you can load the 5. Voice Dream Reader document into an OCR program that identifies the text and Voice Dream Reader is a flexible iOS and Android converts it into an editable text file. application for reading. This app is beneficial for anyone, When creating, editing, and reusing different documents, an including people with various disabilities, with Dyslexia advanced, powerful OCR program helps you to save a lot of friendly font, text and audio synchronization, adjustable time and effort. font size and color variations, as well as complete voiceover support. Groundbreaking examples of OCR technology-based solutions 6. Aipoly Vision were developed to support people with disabilities. These Aipoly Vision utilizes artificial intelligence and OCR to help solutions helped reduce reading difficulties for people with low-vision people better understand their surroundings. dyslexia, allowed visually impaired people to read mail and Users point the app at an object and simply press a fill out forms independently, made learning tools and lessons recognition button. Once users encounter a sign or accessible to learners with learning difficulties and reading document, they can switch to the “read text” button to read difficulties through text-to-speech and gave visually impaired it out loud. It can also read text in multiple languages. people access to text-to-speech image-based PDFs. 7. Digit Eyes examples of these apps include: Digit Eyes was created for the visually impaired shopper.

This software audibly reads the manufacturer barcode and 1. ClaroPDF the product’s name. For household products, users may As PDF files are basically images of documents, they pose also record their own labels. a problem with the basic technologies of text-to-speech.

ClaroPDF is an app that can recognize and interpret text in pictures. It retains the formatting of the original text, unlike most OCR apps. It includes synchronized highlighting, text-to-speech, annotation tools, audio and video notes addability, and Dropbox integration. 2. SnapType Pro For students with dyslexia, workbooks and photocopied worksheets may be troublesome. For fill-in-the-blank and matching exercises, the formatting is always lost during regular OCR, an issue that makes it hard to use AT to insert responses. By giving users the ability to overlay text boxes on worksheet images, SnapType solves the problem. A keyboard can then be used by students to position their answers in the correct spaces. Nafath Nafath Making Social Media Accessible for All Issue 15 Issue 15 Twitter 30 31

his article will showcase some of Twitter Accessibility the ways Twitter has been designed With over 285 million users with visual to be accessible to people with impairments around the globe, an accessible Making disabilities, thereby enabling Twitter will undoubtedly have a profound impact Teveryone to share and access content in a on this community. manner that makes the most sense to them. Twitter mostly has text-based tweets, which are Social Media The article is part of a Nafath series that focuses on accessible by default and readable by screen the different ways major social media platforms readers, but often people use photographs and implement the fundamentals of accessibility and videos to tweet, and before publishing, they must universal design to their websites and apps. be checked for accessibility. When you tweet Accessible At a time when the use of these platforms has images using the iOS or Android Twitter app increased, replacing traditional media outlets or on twitter.com, you have the option to write and, at times, even workplace collaboration tools, image descriptions so that more individuals, it is important to ensure that there are ample including those who are blind or low-vision, can for All resources to enable people with disabilities to access the content. Good image representations access them and use them at par with the rest are concise and informative, helping people of the world. understand what is happening in a photo.

Twitter About Twitter Twitter is built in a way that allows users with Twitter is an American microblogging and social screen readers to quickly access the app. This networking platform that users post messages means that a blind person can access the entirety known as tweets on and interact with. Registered of the app by audible signals by triggering the users can upload, like, and retweet tweets, but voiceover function on iOS or Apple computers. For unregistered users can only read them. Via a Windows (JAWS or NVDA) and Android (Talkback) browser, Short Message Service (SMS), or its platforms, this function is also available. mobile device application program (‘app’), users access Twitter. Twitter has more than 25 offices Here are some tips for making Twitter as worldwide and is headquartered in San Francisco, accessible as possible: California. Tweets were initially limited to 140 characters but were doubled to 280 characters Image Descriptions in November 2017. For most accounts, audio and There is an option to create a description video tweets remain restricted to 140 seconds. while tweeting using pictures. This is done by going to In March 2006, Twitter was founded and was Settings > Display and Sound > Accessibility launched in July of that year. By 2012, 340 million and turning on Compose Image Descriptions. tweets a day were posted by over 100 million The next time a picture is attached to one of the people, and there was an average of 1.6 billion tweets, the Add Description button at the bottom daily search requests. It was one of the ten most- of the tweet will appear. Clicking it will take the visited websites in 2013 and was described as user to an Image Description screen, where they “The Internet SMS.” Twitter had more than 321 will be able to add a picture description of up to million active monthly users as of 2018. 420 characters, often known as alternative text or alt text for visually impaired users. نفاذ جعل وسائل التواصل االجتماعي متاحة للجميع Nafath Making Social Media Accessible for All العدد 15 تويتر Issue 15 Twitter 32 32

ومع ذلك، لن يظهر الوصف كجزء من التحديث الرئيسي، ممارســة طريقــة وضــع الوســومات االجتماعيــة However, the description will not appear as part Conclusion ولكــن ســيتمكن األشــخاص ذوي اإلعاقــة البصريــة مــن إذا كان المستخدم يستخدم في رسائله @S أو #S، فيجب of the main update. People who are visually There are different ways major social media الوصــول إلــى الوصــف عبــر التكنولوجيــا المســاعدة )علــى عليــه ًدائمــا وضعهــا فــي نهايــة التغريــدات. عــادة مــا تكون impaired will have access to the description via Major social media platforms implement the ســبيل المثــال، قارئــات الشاشــة وعارضــات برايــل(. هــذه ممارســة جيــدة علــى وســائل التواصــل االجتماعــي their assistive technology (e.g., screen readers fundamentals of accessibility and universal بشــكل عــام، ولكنهــا مفيــدة أيضً ــا لمن يســتخدمون برامج and braille displays). design to their websites and apps in different تباين األلوان قــراءة الشاشــة. باإلضافــة إلــى ذلــك، مــن األفضــل كتابــة ways. At a time when the use of these platforms يمكّ ــن تويتــر المســتخدمين مــن تحســين تبايــن األلــوان الحرف األول من كل كلمة بأحرف كبيرة )وهذا ما يسمى Color contrast is increasingly taking the place of traditional بيــن النــص وألــوان الخلفيــة الخاصــة بالمنصــة، مما يســهل CamelCase( أثناء استخدام عالمات التصنيف التي هي ,Twitter enables users to improve the color media outlets and workplace collaboration tools قــراءة النــص. كلمات مركبة، مثل “#contrast between the text and the background it is important to ensure that there are plenty .”AssistiveTechnology colors of the platform, making it easier to read resources to enable the persons with disabilities يحتــوي إعــداد إمكانيــة الوصــول لتطبيــق الويــب اآلن استخدام الكالم النصي text. The web app’s accessibility setting now to access them and use them equally with the علــى زر تبديــل جديــد مــن أجــل “زيــادة تبايــن األلــوان”، يقيــد تويتــر عــدد األحــرف، لكــن ذلــك ال يعني أن اســتخدام has a new toggle button for “increase color other users in the world. Twitter is built in a way وعنــد تشــغيله، يقــوم الــزر بتنشــيط األلــوان عاليــة التبايــن االختصارات الشائعة مثل U و tho و K وما إلى ذلك ًأمرا contrast”. When switched on, the button activates that allows users with screen readers to quickly لمكونات واجهة المستخدم، ويجعل وضع التباين العالي ًّذكيــا، فقــد يبــدو مــن الغريــب أن يقــرأ قــارئ الشاشــة هــذه ,the high contrast colors for the User Interface access the app, it mostly has text-based tweets اســتخدام تطبيــق تويتــر ويــب أســهل لألشــخاص ذوي الكلمــات، لــذا ُينصــح باســتخدام كلمــات كاملــة. وبالمثــل، UI components. The high contrast mode makes which are accessible by default and readable by اإلعاقــات البصريــة. من األفضل إما تجنب االختصارات حيثما أمكن أو كتابتها using the Twitter Web app easier for people with screen readers, but often people use photographs بشكل كامل بعد االختصار. visual impairments. and videos to tweet, and before publishing, they يمكن للمستخدمين تنشيط الميزة من Users can activate the feature from Settings > must be checked for accessibility and add alt اإلعدادات < إمكانية الوصول < الرؤية < زيادة تباين الخاتمة .Accessibility > Vision > Increase color contrast. text and closed captioning when needed األلوان. هناك طرق مختلفة لتطبيق منصات التواصل االجتماعي الرئيسية ألساسيات الوصول والتصميم الشامل لمواقعها Using indicators in tweets استخدام المؤشرات في التغريدات اإللكترونيــة وتطبيقاتهــا. ففــي الوقــت الــذي أصبــح فيــه Indicators must be used before hyperlinks to يجــب اســتخدام المؤشــرات قبــل االرتباطــات التشــعبية اســتخدام هذه المنصات يحل بشــكل متزايد محل وســائل ensure that visually impaired users know what للتأكــد مــن أن المســتخدمين ذوي اإلعاقــة البصريــة اإلعــالم التقليديــة ووســائل التعــاون فــي مــكان العمــل، ,to expect. “[PIC] for images, [VIDEO] for videos يعرفــون مــا يمكــن توقعــه “]PIC[ للصــور و ]VIDEO[ من المهم التأكد من وجود موارد كافية لتمكين مجتمع ”.and [AUDIO] for audio لمقاطــع الفيديــو و ]AUDIO[ للصــوت” . األشــخاص ذوي اإلعاقــة مــن الوصــول إليها واســتخدامها على نحو متكافئ مع سائر المستخدمين. تم تصميم تويتر Practicing social tagging etiquette بطريقــة تســمح للمســتخدمين الذيــن لديهــم برامــج قراءة In his messages, if a user uses @s or # s, he الشاشــة بالوصــول الســريع إلــى التطبيــق، فهــو يحتــوي In their messages, if a user uses @’s or #’s, he فــي الغالــب علــى تغريــدات نصيــة، والتــي يمكــن الوصــول .must always place these at the end of a tweet إليهــا ًافتراضيــا وقراءتهــا بواســطة برامــج قــراءة الشاشــة، This is usually good practice on social media وطالمــا أنــه ًغالبــا مــا يتــم اســتخدام الصــور الفوتوغرافيــة in general, but it is also useful for those who ومقاطــع الفيديــو فــي التغريــدات، فينبغــي التحقــق قبــل use screen readers. Additionally, it is better to النشــر مــن إمكانيــة الوصــول إليهــا وإضافــة نــص بديــل capitalize the first letter of each word (this is وتسمية توضيحية أو تعليقات نصية مغلقة عند الحاجة. called camel case) while using hashtags that are compound words, like “#AssistiveTechnology”.

Using text-speak Twitter restricts the number of characters, but that does not mean that common abbreviations such as U, tho, K, etc. are clever to use. These would sound weird to read by a screen reader, so it is advisable to use full words. Similarly, it is best to either avoid acronyms where possible or type them out following the abbreviation.