ICTA’07, April 12-14, Hammamet, Tunisia

An Optical Recognition System

AbdulMalik Al-Salman, Yosef AlOhali, Mohammed AlKanhal, and Abdullah AlRajih King Saud University and KACST, Riyadh, Saudi Arabia [email protected]

Abstract positions, or any combination. Counting the space, in which no dot is raised, there are 64 such combinations 6 Technology has shown great promise in providing (that is 2 = 64). access to textual information for visually impaired The dimensions of a Braille people. Optical Braille Recognition (OBR) allows dot have been set according to people with visual impairments to read volumes of the tactile resolution of the typewritten documents with the help of flatbed fingertips of person. The scanners and OBR software. This project looks at horizontal and vertical developing a system to recognize an image of distance between dots in a embossed Arabic Braille and then convert it to text. It character, the distance particularly aims to build fully functional Optical between cells representing a Arabic Braille Recognition system. It has two main word and the inter-line Figure 1. Braille tasks, first is to recognize printed Braille cells, and distance are also specified by Cell. second is to convert them to regular text. Converting the Library of Congress. Dot height is approximately Braille to text is not simply a one to one mapping, 0.02 inches (0.5 mm); the horizontal and vertical because one cell may represent one symbol ( spacing between dot centers within a Braille cell is letter, digit, or special character), two or more approximately 0.1 inches (2.5 mm); the blank space symbols, or part of a symbol. Moreover, multiple cells between dots on adjacent cells is approximately 0.15 may represent a single symbol. inches (3.75 mm) horizontally and 0.2 inches (5.0 mm) vertically. A standard Braille page is 11 inches by 11.5 1. Introduction inches and typically has a maximum of 40 to 43 Braille cells per line and 25 lines. Braille has been adapted to Visually impaired people are part of the society, and write many different languages including Arabic and is can play an integral role in its prosperity. Therefore, it also used for musical and mathematical notation. Note has been a must to provide those people with means that both Arabic and are read from left- and systems through which they may communicate to-right. with the world. These systems should depend on the For OBR, a Braille document is placed on a sense of hearing or touching. Many systems have been standard scanner and scanned by the OBR program, developed to achieve this purpose; the most famous which either converts the document into text or saves it system that is based on touching is Braille system. as a formatted Braille file. OBR works with single-side Braille is a that enables visually people or double-sided Braille using a single scan. to read and write through touch using a series of raised OBR offers many benefits to Braille users and those dots to be read with their fingers. who work with them, facilitating communication, Braille contractions representing groups of letters or reducing storage space, and preserving out-of-print whole words that appear frequently in a language. This Braille texts. Everyone who works with blind people is usually referred to as Grade 2 Braille. The use of and does not read Braille will benefit from using the contractions permits faster Braille reading and helps OBR. For example: parents, teachers, public reduce the size of Braille books, making them organizations communicating with blind individuals, somewhat less cumbersome. Each Braille character or and computerized Braille libraries. All people in "cell" is made of 6 dots arranged in a rectangle workplaces where Braille is used can read Braille comprising 2 columns of 3 dots each as it can be seen easily by using an OBR. in Figure 1. A dot may be raised at any of the 6 This paper looks at developing algorithms to recognize an image of embossed Arabic Braille material obtained by a regular scanner. Although the thresholds used were adaptively calculated from the Braille dots have the same color as the background, histogram of the input image. they cast soft shadows when scanned with a standard In 1995, Hentzschel and Blenkhorn [2] presented a flatbed scanner. These shadows are used to locate the system for optical Braille recognition based on twin dots on the page. As the dots on both sides of the page shadows approach, which subtracts two images of the are visible from one side, both sides of the page can be same Braille page, where each image was taken under recognized in a single scan. The recognition engine is different illumination conditions. This helps eliminate capable of processing Braille documents with various blemish and noise in images caused by the texture of paper colors, number of dots, Braille cell sizes, and the paper used. Unlike Blenkhorn’s work [1] presented orientations. To convert the Braille cells to text, a in 1994 that solely discussed the classification process, translation engine is needed his work in association with Hentzschel in the following year addressed the different modules of a 2. Literature Review pattern recognition system, including image processing. In [2], the image-processing module An OBR system consists of several basic modules encompassed a variety of routines, each serving a including: image acquisition, image processing, dot different and crucial purpose as using random-noise localization and segmentation, and finally dot reduction filter that was necessary to avoid undesired recognition and conversion. Among issues that must be emphasis of noise. taken into consideration when implementing an OBR In [8], preprocessing consists of two sub- system are factors that negatively influence the operations: noise filtering and edge enhancement. identification process, such as lighting conditions, page Noise filtering is achieved via a low-pass special placement in the scanner, and page movement [6]. Gaussian filter. Edge detection is achieved using Image acquisition is the first and most important convolution Sobel kernels. step in any pattern recognition system. In OBR The approach adopted by Mennens et al. in [4] for systems data is provided to the system in the form of extracting Braille dots is based on several assumptions, images of Braille embossed pages. The process of one of which is that a single dot is represented by two acquiring these images digitally can be achieved by gray level intensities, a light area right above a dark using a number of different equipments such as one. Also their system is designed to recognize a scanners or digital cameras, both of which have been double-sided Braille page. This approach may produce used by developers and researchers [2-4, 7, 9-12]. false core regions if two dots are vertically neighbors. Image preprocessing is an essential step during The localization and extraction algorithm developed which errors that occurred while the images were taken by Hermida et al. [3] takes a thresholded image are eliminated. Errors include noise, deformation, bad consisting of couples of white and black spots, where illumination or blurring. Image preprocessing can be each couple denotes a single Braille dot. Though this used for image enhancement by reducing noise, method is easy and quick, it suffers from that: if some sharpening images, or rotating a skewed page. The legitimate points are lost, false ones are produced. algorithms used differ from one system to another The dot localization and extraction technique used depending on the classification approach followed by in [2] consists of two steps; the first is image researchers and developers. registration and the second is character segmentation. An early effort was presented by Mennens et al. in The final step of dot extraction is normalization, where their work [4]. The authors addressed the problem of each Braille character is represented by a 2x3 matrix. false shadows in the image caused by the fact that Each bit in the matrix denotes a dot in Braille cell. Braille pages are never perfectly flat due to the tension Deciding whether a bit value is either 0 or 1 is based in the paper’s surface, by subtracting a locally on a threshold used to test 1/6 of the cell area. averaged image from the original. The authors rotate In 2004, Wong et al. [7] proposed an OBR system skewed images by using the deviation over a vertical that is capable of recognizing a single sided Braille projection of the image. page in addition to preserving the format of the Hermida’s et al. [3] system employed thresholding original document in the produced text file. The before the image is passed on to the Braille dot algorithm processes the image one row at a time extraction module. Their algorithm converts a digital reducing the computation time significantly. image of a scanned Braille page into one consisting The dot detection module incorporated in the OBR mainly of black and white spots denoting the dots. The system proposed by Oyama et al. [6] in 1997 is designed to detect both recto and verso Braille dots; that is detecting dots on both sides of the page. This

82 was possible due to the difference in light reflectance image is converted to a gray color. Following that any between recto and verso dots. A hardware circuit white or black frames are cropped. The image is then configuration that corresponds to the equations and thresholded so that only three classes of regions exist: operates in a similar manner was given as well. dark, light and background. In case of having a tilted Mennens et al. [4] adopted Binary Braille cell sets paper then de-skewing is applied using a binary search as basis for their classifier, which is grouping the dots algorithm. Having labeled each of the different types and representing each dot by a bit position. The of regions, an initial identification of Braille dots is authors did not elaborate on the comparison method performed. Finally, Braille cells are then recognized. used to recognize a Braille cell as a certain letter or Each stage is described below in more detail. digit. The classifier presented by Hermida et al [3] takes as an input the image produced from the dot 3.1 Converting the Image to Gray Level extraction module, where characters are represented as a group of dots, each dot is in turn represented by a Inside a computer system, colored images are stored single bit, with 1 or 0 values. Using this representation, in 3-D arrays while gray level images are stored in 2-D the Braille-to-ASCII conversion is accomplished. The system proposed in [1] is based on finite state approach that operates with a finite number of states that perceives the correct state, in addition to its ability Figure 2. An example of a part of a colored to perform both left and right context checking using scanned Braille document matching algorithms. This feature is very important in determining characters proceeding wildcards. One of the greatest advantages of this system is that it is designed in a way to perform the conversion from Braille to any of the natural languages depending on Figure 3. Gray level of Figure 4 the tables provided containing the conversion rules. arrays. Dealing with 2-D arrays is much easier and The work in [2] did not elaborate on the techniques faster. Therefore, the first step in the proposed system used for converting an extracted Braille cell into converts colored scanned images to gray level so that natural language characters. any pixel value in the image falls within the range 0- The recognition module in [7] is designed to work 255 as could be seen in Figures 2 and 3. with thresholded images resulting from the half- character detection module. The classification process 3.2 Cropping the Image Frame is carried out using a probabilistic neural network. For interpretation purposes, authors in [8] Some of the scanned images suffer from having determine centriod distances between each dot and its either black or white frames that would affect the four possible neighbors. Dots are then grouped into thresholding step coming next. It is a must that those cells. Based on the boundary coordinates, information frames are cropped before proceeding forward. To do and illumination characteristics, two standard that, the average gray level is calculated for the whole templates were then constructed to represent the front- image and then for each of the rows and columns face dots and back-face dots. separately. Finding a row or column average gray level To the best of our knowledge, there is only one that is above or below 15% of the whole image commercial OBR software. It is a Windows-based average gray level is an indication to delete it as software that allows reading single and double-sided experiences have shown. Braille documents with a standard scanner [13]. Unfortunately, it does not support the Arabic language 3.3 Image Thresholding and there is no explanation of the used techniques. This step involves examining each pixel in the 2-D 3. The Arabic OBR System array representing the scanned Braille document and classifying it to one of the following three categories: The Arabic OBR system is able to recognize both (1) Elements having values 32 and above are single-sided and double-sided Arabic Braille considered bright and given a threshold value +1. (2) documents from a single scan. It comprises the Elements having values 23 and below are considered following stages. First, an image of a Braille document dark and given a threshold value -1. (3) Elements page is obtained using a flatbed scanner. Second, the having values between 23 and 32 are considered gray and given a threshold value 0.

83 An algorithm was developed to handle the 1. After thresholding the image (stage 3) we select thresholding. This algorithm works as follows; after either the dark or the bright part that resulted from converting the image to gray level and removing thresholding and delete the other part. highest and lowest values to leave out distinct values, 2. Horizontal projection is performed to count the the average gray level for the whole image is number of pixels in each row for the selected part calculated. After that both a and b values are calculated in the previous step. Then we calculate those rows according to the following equations: having more than 10 points. a = mean ( max(I) + avg ) / 2 3. The image is rotated 4 degrees one time to the left b = mean ( min(I) + avg ) / 2; and one to the right and in each time step (2) is Where I is an array representing the whole image, repeated. max(I) and min(I) represent the highest and lowest 4. Calculating the number of rows in the image in its values respectively in each column of the array I. three case (right/ left/ middle) we have either of the The classification parameters L and H are two possibilities: (a) if the number of rows for the calculated according to the following equations: right and left side is the same then the image is not L = avg – ( avg – min_avg ) / 3 skewed and we stop here (b) if the number of the H = avg + ( max_avg – avg ) /3; rows for the right and left side is different then the Where min_avg image is skewed and we proceed to step 5. represents the average of 5. At this step we have two possibilities as well: (a) if all values less than b the number of rows in the right side is less than that while max_avg for the left side then we set the middle point represents the average of calculated in the fourth step to be the new starting all values higher than a. point for the left side. We then calculate the The image is then number of rows for this new middle point for both classified according to L the right and left sides, (b) if the number of rows in and H values (see Figure the left side is less than that for the right side then 4). Values less than L are we set the middle point calculated in the fourth step considered dark. Values to be the new starting point for the right side. We higher than H are Figure 4. Image then calculate the number of rows for this new considered bright. Values values middle point for both the right and left sides. between L and H are 6. Repeat steps 4 and 5 as long as half the difference considered gray (background). between the right and left deviation is more than Having noticed that the histogram of a scanned 1/16. Braille document is represented by the Gaussian After determining the Distribution, we deviation degree, we found that 1.5* σ, rotate the original gray where σ is the image and then repeat standard steps 2 and 3 for the deviation, gives rotated image. This is good results when done because if the Figure 6. Crop effect finding L and H image is skewed then on skewed image values. Image cropping may remove pixel values Figure 5. L, H and m values some of the image parts that constitute Braille cells as above H are could be seen in Figure 6. bright and those blow L are dark (m is the man). See Figure 5. 3.5 Dot Parts Detection

3.4 Image De-Skewing Experiences have shown that detection of Braille dot parts is better than detection of the whole dot. It We have developed a binary search algorithm to was also proven that the average dot height is 8 pixels. correct any de-skewing in tilted scanned images. The Each dot is composed of a bright and a dark region maximum degree of recognizing a de-skewed image is with a small space between them. See Figure 7. The 4 degrees from either the left or the right side. To implication is that if the bright region comes at top calculate the de-skewing degree we have used the where the dark one comes at bottom then this is a recto following algorithm: dot, the contrary situation results in a verso dot. The

84 algorithm for dot part detection has the following we take the first and last positive value positions. After steps: that one of the two following algorithms can be used: 1. Since the average dot height is 8 pixels and each Algorithm 1. dot is composed of a dark and a bright region with Determine the number of rows and columns in the a space in between, we will look for any 8 pixels defined region. The average line and column width high column. For each column we are interested have been identified as in Figure 10. To calculate the only in the first and second values from top and number of rows and columns we use the following two bottom. equations: 2. Construct an empty array to hold the points. linNum = ( yMax - yMin ) / 59, 3. As indicated before, the value assigned to bright colNum = ( xMax – xMin ) / 34.7; pixels is +1 and to dark pixels is -1, otherwise 0. Then we can reach the beginning of any cell by 4. A vertical search for dot parts is performed on the using the line and column number, as in the following array starting from top to bottom such that: equations: • If pixel(1)+pixel(2)>0 AND pixel(7)+pixel(8) <0 i = yMin + (lin-1) * 59, then this column is part of a recto dot. j = xMin + (col-1) * 34.7; • If pixel(1)+pixel(2)<0 AND pixel(7)+pixel(8) >0 then this column is part of a verso dot. Algorithm 2. 5. In case of having any of the two conditions true we register that in the array and then go down 12 pixels since this is the vertical space between two dots. 6. At the end, we will have recto and verso dots as in Figure 8. Now we can separate them in two arrays. Errors from previous step may occur. To correct that we can use either of the following methods: Figure 7. Dots thresholding Local Measure: select the best column within all into bright and dark regions. columns that are 8 pixels below it. Global Measure: all columns are classified into three categories: columns holding recto dots, columns holding verso dots, or undefined columns. The second method gave excellent results.

3.6 Whole Dot Detection This stage involves the following steps: 1. A detected dot should have at least three columns, less than that is not counted in. Figure 8. Recto and Verso dots 2. An empty array that has the same size of the one in two different colors . holding dots parts is constructed. 3. We perform a vertical search on the array starting from up to down, when three parts of a dot that are not more than 6 pixels apart are found then this is considered a point. This point is registered at the corresponding position in the array as 4x4 point. Figure 9. Recto dots detected from Fig. 8. 4. The region holding the detected dot part in the previous step is deleted so that it will not be This algorithm does not depend on fixed lengths for considered in future searches ( Detect and Clean .) cells heights and distances between them. The steps Figure 9 shows the dots at the end of the search. involved here are as follows: (1) Determine the horizontal projection for the array 3.7 Braille Cells Recognition holding the dots and then determine the average distances between the rows holding the dots. Having identified all possible valid dots, the system (2) Determine the vertical projection for the array defines the region containing all the dots such that no holding the dots and then determine the average dots exist outside this region. To do this, we have to distances between the columns holding the dots. add each row and column in the array separately then

85 (3) Using the horizontal and vertical projections we scanned with different scanners. Overall, on single- can reach any cell. This is because we can consider sided and double-sided documents 99% of the dots are any consecutive 3 rows and 2 columns as a cell as long correctly recognized. The tests included different as the distance between the rows and columns does not variations of Braille documents; skewed, reversed or exceed the averages in (1) and (2). worn-out. By comparing the two algorithms we found that We were unable to compare our work with others even though the latter one does not require fixed due to the unavailability of others as a working lengths to determine the cells, it is less accurate than software, yet it gives almost similar results with the the former one, especially that most Braille documents commercial OBR software (which does not support the use the lengths and distances defined in Figure 10. Arabic language). After using either of the algorithms, we now have to Further work will focus on Arabic/English Braille convert the cell to binary code. To do that, we first documents, large Braille papers, converting the divide the cell into six regions as in Figure 11. Then recognized Braille cells (uncontracted and contracted, we take each region separately, if the sum of its Arabic and English) to text by producing translation elements is more than 8 pixels then it will be assigned rules for the conversion of Braille into print for the the number 1, otherwise it will be assigned 0. Then we required language, and more collaborative user convert the image to its decimal code representation: interface. Decimal-Code=b1+b2*2+b3*4+b4*8+b5*16+ b6*32 The final step is to convert the decimal code to its 5. References corresponding Arabic letters to get the translation. [1] Blenkhorn, P., “A System for Converting Braille into Print”, IEEE transactions on rehabilitation engineering, Vol. 3.8 Correcting the Paper Layout 3, No. 2, June 1995. [2] Hentzschel, T. W., and P. Blenkhorn, “An Optical It might be hard for the user to recognize the correct Reading Systems for Embossed Braille Characters using a layout of a Braille document when placed on a Twin Shadows Approach”, Journal of Microcomputer scanner. As a result some documents may be placed Applications, pp. 341-345. 1995. reversed by 180 degrees. To correct a reversed paper, [3] Hermida, X. F., et al, "A Braille O.C.R. for Blind we determine the percentage of the alphabetic in the People", Proceedings of ICSPAT-96. Boston (U.S.A.). October, 1996. [4] Mennens, J., et al, “Optical Recognition of Braille Writing”, IEEE, 1993. pp. 428-431. [5] Mennens, J., et al, “Optical Recognition of Braille Writing Using Standard Equipment”, IEEE transactions of rehabilitation engineering, Vol. 2, No. 4, December 1994. [6] Oyama, Y., T. Tajima, and H. Koga, “Character Recognition of Mixed Convex- Concave Braille Points and Legibility of Deteriorated Braille Points”, System and Figure 11. One Computer in Japan, Vol. 28, No. 2, 1997. Figure 10. The ave [7] Wong, L., W. Abdulla, and S. Hussmann, “A Software cells’ mesuares. cell measures Algorithm Prototype for Optical Recognition of Embossed page. Having more than 85% indicates that the page is Braille”, the 17th conference of the International Conference in its correct position while less than 45% indicates the in Pattern Recognition, Cambridge, UK, 23-26 August 2004. opposite. [8] C. Ng and V. Lau, “Regular feature extraction for . Horizontal and vertical reflections occur when the recognition of Braille”, ICCIMA'99, 1999 [9] I. Dias, “A portable device for optically recognizing shadow direction is not specified in the calibration. Braille-Part II: Software development”, 7 th Australian & Also, it has been shown that when the page is in its Neazlan Intelligent Information Systems Conf., pp. 18-21, right position then the percentage of correctly Perth, W. Australia, Nov 2001. recognized letters is more than the other positions. [10] T. Gomez, et al, “AIR-Coupled Ultrasonic Scanner for Braille”, IEEE ultrasonic Symposium, pp. 591-594, 2001. 4. Conclusion and Future Work [11] T. Yoshida, A. Ohya and S. Yuta, “Braille Block The proposed system has used some new Detection for Autonomous Mobile Robot Navigation”, Proc. techniques to recognize Braille cells using a standard Of the 2000 IEEE/RSJ intl. Conf. on Intelligent Robots and Systems, pp. 633-638, 2000. scanner. The system has been tested with a wide [12] J. Dubus, et al, “Image Processing Techniques to variety of A4 scanned Braille documents, both single Perform an Autonomous System to Translate Relief Braille and double sided, written in the Arabic language and

86 into Black-Ink, Called: Lectobraille”, IEEE Engineering in Medicine & Biology Society 10 th annual Inter. Conf., 1988. [13] Optical Braille Recognition System, version 3.5, User Manual, October 2000.

87