Algorithm for Automatic Text Retrieval from Images of Book Covers
ALGORITHM FOR AUTOMATIC TEXT RETRIEVAL FROM IMAGES OF BOOK COVERS

A dissertation submitted in partial fulfilment of the requirements for the award of the degree of MASTER OF ENGINEERING IN WIRELESS COMMUNICATION

Submitted by
Niharika Yadav
Roll No. 801363021

Under the guidance of
Dr. Vinay Kumar
Assistant Professor, ECED
Thapar University, Patiala

ELECTRONICS AND COMMUNICATION ENGINEERING DEPARTMENT
THAPAR UNIVERSITY
(Established under Section 3 of the UGC Act, 1956)
PATIALA – 147004, PUNJAB, INDIA
JULY 2015

ACKNOWLEDGMENT

With a deep sense of gratitude, I express my sincere thanks to my esteemed and worthy supervisor, Dr. Vinay Kumar, Assistant Professor, Department of Electronics and Communication Engineering, Thapar University, Patiala, for his valuable guidance, effective supervision, encouragement, enlightenment and cooperation in carrying out this work. Most of the novel ideas and solutions found in this dissertation are the result of our numerous stimulating discussions.

I would be failing in my duty if I did not express my deep sense of gratitude towards Dr. Sanjay Sharma, Professor and Head of the Department of Electronics and Communication Engineering, Thapar University, Patiala, who has been a constant source of inspiration for me throughout this work, and who provided us with adequate infrastructure to carry it out.

I am also thankful to Dr. Amit Kumar Kohli, Associate Professor and P.G. Coordinator, and Dr. Hem Dutt Joshi, Assistant Professor and Program Coordinator, Electronics and Communication Engineering Department, for the motivation and inspiration that led me to this work.

I am greatly indebted to all my friends, who constantly encouraged me, and I would also like to thank the entire faculty and staff of the Electronics and Communication Engineering Department for their unfailing encouragement.
Last but not least, my gratitude goes to my parents, who always supported me in doing things my way and whose everlasting desires, selfless sacrifice, encouragement, affectionate blessings and help made it possible for me to complete my degree.

Niharika Yadav
Roll No. 801363021
Place: TU, Patiala
Date:

ABSTRACT

Text extraction is one of the major areas of research in the field of document image analysis. Text retrieval is needed for bibliographic databases, structuring images, etc. Text embedded in multimedia data, as a well-defined model of concepts for human communication, contains much semantic information related to the content. This text information can provide a much truer form of content-based access to image and video documents if it can be extracted and harnessed efficiently. Moreover, automating this process will greatly reduce human intervention when converting books (specifically their covers, where the task becomes extremely difficult) to a readable and editable electronic format, particularly for electronic book readers. However, this is a challenging task because the text in images varies in size, style, orientation and alignment, and is affected by low contrast, noise and complex background structure.

This dissertation proposes a method for extracting text from images of book covers and embedded text. A new text model is constructed to retrieve text regions from scene text images. The image is first clustered to reduce the number of color variations, a suitable plane is identified, and the text region is then segmented using a connected-component-based method. The text thus obtained is then enhanced to improve the results. A detailed study of the various techniques proposed so far, along with their performance analysis, is also included in the work. The algorithm is evaluated comprehensively on various datasets, including the ICDAR 2011 dataset.
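The pipeline outlined above (reduce the colors, pick a plane, segment connected components) can be illustrated with a small sketch. This is not the dissertation's implementation: uniform color quantization stands in for the clustering step, the highest-variance channel is an assumed criterion for the "suitable plane", and every function name and the `min_size` noise threshold are hypothetical choices made for illustration.

```python
from collections import deque


def quantize(image, levels=4):
    """Snap each RGB channel to `levels` values (a stand-in for clustering)."""
    step = 256 // levels
    return [[tuple((c // step) * step + step // 2 for c in px) for px in row]
            for row in image]


def pick_plane(image):
    """Return the channel with the highest variance (assumed criterion)."""
    def variance(ch):
        vals = [px[ch] for row in image for px in row]
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)

    ch = max(range(3), key=variance)
    return [[px[ch] for px in row] for row in image]


def binarize(plane):
    """Threshold at mid-range; treat the minority class as foreground."""
    vals = [v for row in plane for v in row]
    thr = (min(vals) + max(vals)) / 2
    mask = [[1 if v > thr else 0 for v in row] for row in plane]
    if sum(map(sum, mask)) * 2 > len(vals):  # foreground should be the minority
        mask = [[1 - v for v in row] for row in mask]
    return mask


def components(mask, min_size=2):
    """Label 4-connected components; return (top, left, bottom, right) boxes."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_size:  # drop tiny blobs as noise
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes


def find_text_boxes(image):
    """Run the whole sketch: quantize, pick a plane, binarize, segment."""
    return components(binarize(pick_plane(quantize(image))))
```

Here `image` is a list of rows of `(R, G, B)` tuples. A practical version would more likely use a proper clustering step (for example k-means) and an image library's connected-component labeling; the pure-Python form is used only to keep the sketch self-contained.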
The experimental results demonstrate that the proposed text detection method can capture the inherent properties of text and discriminate text from other objects efficiently. The proposed method gives a very high character recognition rate for monochrome images; however, in cases where the text features vary drastically, rejection is noticeable.

TABLE OF CONTENTS

ACKNOWLEDGMENT
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
GLOSSARY OF ACRONYMS
CHAPTER-1 INTRODUCTION
    1.1 Motivation
    1.2 Text Features
    1.3 Text Classification
    1.4 Text Information Extraction
    1.5 Scope of the Dissertation
CHAPTER-2 TEXT INFORMATION EXTRACTION MODEL
    2.1 Text Detection
    2.2 Text Localization
        2.2.1 Region-based methods
        2.2.2 Morphological based methods
        2.2.3 Texture-based methods
    2.3 Performance Analysis
CHAPTER-3 LITERATURE REVIEW
    3.1 Preprocessing Techniques
    3.2 Connected Component Based Methods
    3.3 Edge Based Methods
    3.4 Texture Based Methods
    3.5 Morphological Method
CHAPTER-4 METHODOLOGY
    4.1 Text Information Extraction Model
    4.2 Preprocessing Technique
        4.2.1 Clustering
        4.2.2 Best Plane Identification
    4.3 Text Segmentation
        4.3.1 Bottom up Analysis
        4.3.2 Top Down Analysis
    4.4 Noise Removal
    4.5 Text Extraction and Identification
CHAPTER-5 RESULTS AND PERFORMANCE ANALYSIS
    5.1 Dataset and Experimental Results
    5.2 Performance Analysis
CHAPTER-6 CONCLUSION AND FUTURE SCOPE
REFERENCE
LIST OF PUBLICATIONS

LIST OF FIGURES

Figure 1.1: Image with caption text
Figure 1.2: Scene text image
Figure 1.3: Multi-color document images
Figure 2.1: Text information extraction model
Figure 2.2: Gaussian filter
Figure 2.3: Morphological operations