Algorithm for Automatic Text Retrieval from Images of Book Covers

ALGORITHM FOR AUTOMATIC TEXT RETRIEVAL FROM IMAGES OF BOOK COVERS

A Dissertation submitted in partial fulfilment of the requirements for the award of the degree of
MASTER OF ENGINEERING IN WIRELESS COMMUNICATION

Submitted by
Niharika Yadav
Roll No. 801363021

Under the guidance of
Dr. Vinay Kumar
Assistant Professor, ECED
Thapar University, Patiala

ELECTRONICS AND COMMUNICATION ENGINEERING DEPARTMENT
THAPAR UNIVERSITY
(Established under the section 3 of UGC Act, 1956)
PATIALA – 147004, PUNJAB, INDIA
JULY-2015

ACKNOWLEDGMENT

With a deep sense of gratitude, I express my sincere thanks to my esteemed and worthy supervisor, Dr. Vinay Kumar, Assistant Professor, Department of Electronics and Communication Engineering, Thapar University, Patiala, for his valuable guidance, encouragement, enlightenment and cooperation in carrying out this work under his effective supervision. Most of the novel ideas and solutions found in this dissertation are the result of our numerous stimulating discussions.

I would be failing in my duty if I did not express my deep sense of gratitude towards Dr. Sanjay Sharma, Professor and Head of the Department of Electronics and Communication Engineering, Thapar University, Patiala, who has been a constant source of inspiration for me throughout this work, and for providing us with adequate infrastructure for carrying out the work.

I am also thankful to Dr. Amit Kumar Kohli, Associate Professor and P.G. Coordinator, and Dr. Hem Dutt Joshi, Assistant Professor and Program Coordinator, Electronics and Communication Engineering Department, for the motivation and inspiration that led me to this work.

I am greatly indebted to all my friends, who constantly encouraged me, and I would also like to thank the entire faculty and staff of the Electronics and Communication Engineering Department for their unyielding encouragement.
Last but not least, my gratitude goes to my parents, who always supported me in doing things my way and whose everlasting desires, selfless sacrifice, encouragement, affectionate blessings and help made it possible for me to complete my degree.

Place: TU, Patiala
Date:
Niharika Yadav
Roll No. 801363021

ABSTRACT

Text extraction is one of the major areas of research in the field of document image analysis. Text retrieval is needed for bibliographic databases, structuring images, etc. Text embedded in multimedia data, as a well-defined model of concepts for human communication, contains much semantic information related to the content. This text information can provide a much truer form of content-based access to image and video documents if it can be extracted and harnessed efficiently. Moreover, automating this process will greatly reduce human intervention when converting books (specifically their covers, where this task becomes extremely difficult) to a readable and editable electronic format, particularly for electronic book readers. However, this is a challenging task because images contain text of different size, style, orientation and alignment, suffer from low contrast and noise, and have complex background structure.

This dissertation proposes a method for extracting text from images of book covers and embedded text. A new text model is constructed to retrieve text regions from scene text images. The image is first clustered to reduce the number of color variations, a suitable plane is identified, and the text region is then segmented using a connected-component-based method. The text thus obtained is then enhanced to improve the results. A detailed study of the various techniques that have been proposed so far, along with their performance analysis, has also been incorporated in the work. The algorithm is evaluated comprehensively on various datasets, including the ICDAR-2011 dataset.
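The three stages named above (color clustering, plane identification, connected-component segmentation) can be illustrated with a minimal pure-Python sketch. It is not the dissertation's implementation: the function names, the k-means initialisation, the rule for choosing the plane (largest non-background cluster) and the `min_area` filter are all assumptions made for illustration only.

```python
from collections import deque

def kmeans_colors(pixels, k, iters=10):
    """Cluster RGB tuples into k groups; returns (labels, centers)."""
    # Assumed deterministic initialisation: evenly spaced distinct colors.
    distinct = sorted(set(pixels))
    step = max(1, len(distinct) // k)
    centers = [distinct[min(i * step, len(distinct) - 1)] for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assign every pixel to its nearest center (squared RGB distance).
        for n, p in enumerate(pixels):
            labels[n] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Move each center to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [p for p, l in zip(pixels, labels) if l == c]
            if members:
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centers

def pick_text_plane(labels, k):
    """Assumed heuristic: the largest cluster is background, the next largest is text."""
    counts = [labels.count(c) for c in range(k)]
    background = counts.index(max(counts))
    candidates = [c for c in range(k) if c != background and counts[c] > 0]
    return max(candidates, key=lambda c: counts[c]) if candidates else background

def connected_components(mask):
    """4-connected components of a boolean grid as (x0, y0, x1, y1, area) tuples."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            ys, xs = [y], [x]
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
                        ys.append(ny)
                        xs.append(nx)
            boxes.append((min(xs), min(ys), max(xs), max(ys), len(xs)))
    return boxes

def extract_text_regions(image, k=2, min_area=2):
    """Cluster colors, pick one cluster plane, and return its component boxes."""
    w = len(image[0])
    pixels = [p for row in image for p in row]
    labels, _ = kmeans_colors(pixels, k)
    plane = pick_text_plane(labels, k)
    mask = [[labels[y * w + x] == plane for x in range(w)] for y in range(len(image))]
    # Assumed noise filter: drop components smaller than min_area pixels.
    return [b for b in connected_components(mask) if b[4] >= min_area]

# Toy "cover": white background with two dark strokes standing in for characters.
W, B = (255, 255, 255), (0, 0, 0)
toy = [
    [W, W, W, W, W, W, W],
    [W, B, W, W, B, B, W],
    [W, B, W, W, B, B, W],
    [W, B, W, W, W, W, W],
    [W, W, W, W, W, W, W],
]
regions = extract_text_regions(toy)
print(regions)
```

Each returned tuple is a candidate character box together with its pixel count. A real cover would need more clusters and a more careful plane-selection criterion than the one assumed here.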
The experimental results demonstrate that the proposed text detection method can capture the inherent properties of text and discriminate text from other objects efficiently. The proposed method gives a very high character recognition rate for monochrome images; however, in cases where there is a drastic variation in the text features, rejection is noticeable.

TABLE OF CONTENTS

ACKNOWLEDGMENT
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
GLOSSARY OF ACRONYMS
CHAPTER-1 INTRODUCTION
  1.1 Motivation
  1.2 Text Features
  1.3 Text Classification
  1.4 Text Information Extraction
  1.5 Scope of the Dissertation
CHAPTER-2 TEXT INFORMATION EXTRACTION MODEL
  2.1 Text Detection
  2.2 Text localization
    2.2.1 Region-based methods
    2.2.2 Morphological based methods
    2.2.3 Texture-based methods
  2.3 Performance Analysis
CHAPTER-3 LITERATURE REVIEW
  3.1 Preprocessing Techniques
  3.2 Connected Component Based Methods
  3.3 Edge Based Methods
  3.4 Texture Based Methods
  3.5 Morphological Method
CHAPTER-4 METHODOLOGY
  4.1 Text Information Extraction Model
  4.2 Preprocessing Technique
    4.2.1 Clustering
    4.2.2 Best Plane Identification
  4.3 Text Segmentation
    4.3.1 Bottom up Analysis
    4.3.2 Top Down Analysis
  4.4 Noise Removal
  4.5 Text Extraction and Identification
CHAPTER-5 RESULTS AND PERFORMANCE ANALYSIS
  5.1 Dataset and Experimental Results
  5.2 Performance Analysis
CHAPTER-6 CONCLUSION AND FUTURE SCOPE
REFERENCE
LIST OF PUBLICATIONS

LIST OF FIGURES

Figure 1.1: Image with caption text
Figure 1.2: Scene text image
Figure 1.3: Multi-color document images
Figure 2.1: Text information extraction model
Figure 2.2: Gaussian filter
Figure 2.3: Morphological operations