Mobile Sign Translator for the Thai Language
Tomas Tinoco De Rubira Department of Electrical Engineering Stanford University Stanford, California 94305 Email: [email protected]
Abstract---In this paper, I describe a simple smartphone- as in [1], [2], [3], use different detection approaches, based system that translates signs written in Thai to ranging from manual to fully automatic. In this paper, I English. The system captures an image of a sign written present a simple smartphone-based system that translates in Thai from the phone's camera and detects the text boundaries using an algorithm based on K-means cluster- signs written in Thai to English. The system uses a ing. Then, it extracts the text using Tesseract 1 Optical detection algorithm that requires a simple user input and Character Recognition (OCR) engine and translates it uses K-means for determining the boundaries of the text to English using Google Translate 2. The text detection region, taking advantage of certain features of the Thai algorithm implemented works well for single-line signs language. written in Thai that have a strong color distinction between letters and background within the text's bounding box. II. Related Work The algorithm assumes that the user placed the text at the center of the viewfinder with a horizontal orientation. Several mobile translation devices have been proposed By testing the system using a set of sign images, I found and some are currently available for usage. For example, that the performance of the overall system was limited by the performance of the Thai OCR. Tesseract, with Google Goggles is an Android and iPhone application the Thai language file available, gave inaccurate results, that can be used for translating text captured from a especially classifying vowels that occurred above or below phone's camera. The text detection part in this applica- the line of text. By limiting the number of characters that tion requires users to create a bounding box of the text Tesseract could recognize, the presented system was able manually and the translation requires Internet connec- to successfully translate Thai signs from a subset of the 4 sign images considered. tion. ABBYY's Fototranslate is a Symbian application that lets users capture text images and automatically I. Introduction finds bounding boxes around the words to be translated. A common problem faced by travelers is that of This application currently supports English, French, Ger- interpreting signs written in an unfamiliar language. man, Italian, Polish, Russian, Spanish and Ukrainian and Failing to interpret signs when traveling can lead to does not require Internet connection. minor problems, such as taking a photograph in a place The authors from [2] propose TranslatAR, a mobile where photographs are forbidden, or very serious ones, augmented reality translator. This smartphone-based sys- such as missing a stop sign. With the computing power tem implements a text detection algorithm that requires and capabilities of today's mobile devices, it is possible an initial touch on the screen where the text is located. to design smartphone-based systems that can help travel- Once the user provides such input, the algorithm finds ers by translating signs automatically to their language. a bounding box for the text by moving horizontal and These systems are usually composed of three subsystems vertical line segments until these do no cross vertical that perform text detection, text extraction and text trans- and horizontal edges respectively. It then uses a modi- lation respectively. The extraction and translation parts fied Hough Transform to detect the exact location and are relatively well developed and there exists a large va- orientation of the upper and lower baselines of the text. riety of software packages or web services that perform The text is then warped to have an orthogonal layout these tasks. The challenge is with the detection. Current and then the algorithm applies K-means to extract the available portable translation systems, such as Google colors of the background and letters. The system then Goggles 3, and systems proposed in the literature such extracts the text using Tesseract, obtains a translation using Google Translate and renders the translation on 1http://code.google.com/p/tesseract-ocr/ the phone's screen using the extracted colors. 2http://translate.google.com/ 3http://www.google.com/mobile/goggles/ 4http://www.abbyy.com/fototranslate/ The authors from [1] propose a system for translating CD signs written in Chinese to English. They describe a C