Optimization Algorithms for Discrete Markov Random Fields, with Applications to Computer Vision
Total Page:16
File Type:pdf, Size:1020Kb
Optimization Algorithms for Discrete Markov Random Fields, with Applications to Computer Vision Nikos Komodakis A thesis submitted for the degree of Doctor of Philosophy University of Crete Computer Science Department Heraklion Supervisor: Professor Georgios Tziritas (May 2006) Abstract We are what we repeatedly do. Excellence, then, is not an act, but a habit. —Aristotle 384 BC–322 BC A large variety of important tasks in lowlevel vision, image analysis and pat tern recognition can be formulated as discrete labeling problems where one seeks to optimize some measure of the quality of the labeling. For example such is the case in optical flow estimation, stereo matching, image restoration to men tion only a few of them. Discrete Markov Random Fields are ideal candidates for modeling these labeling problems and, for this reason, they are ubiquitous in computer vision. Therefore, an issue of paramount importance, that has at tracted a significant amount of computer vision research over the past years, is how to optimize discrete Markov Random Fields efficiently and accurately. The main theme of this thesis is concerned exactly with this issue. Two novel MRF op timization schemes are thus presented, both of which manage to extend current stateoftheart techniques in significant ways. On one hand, a novel framework is proposed that is based on the duality theory of Linear Programming (LP) and provides an alternative as well as more general view of existing graphcut methods such as the αexpansion technique, which is included merely as a special case. Moreover, unlike αexpansion which is valid only for MRFs with metric potentials, the derived algorithms provably generate almost optimal solutions for a much wider class of MRFs that are frequently encountered in computer vision, which is an important advance. Results on a variety of low level vision tasks demonstrate the efficacy of our approach. On the other hand, a novel optimization scheme, called PriorityBP, is proposed which carries two very important extensions over standard Belief Propagation ii (BP): prioritybased message scheduling and dynamic label pruning. For the first time, these two extensions work in cooperation in order to deal with one of the major limitations of BP: its inefficiency in handling MRFs with very large discrete statespaces. Moreover, both extensions are generic and do not make any use of domainspecific knowledge. They are therefore applicable to any discrete Markov Random Field i.e. a very wide class of problems in computer vision. In order to demonstrate the effectiveness of PriorityBP, a novel exemplar based framework (based on PriorityBP) is also proposed which treats the prob lems of image completion, texture synthesis and image inpainting in a unified manner, while managing to compare favorably with related stateoftheart meth ods. According to our framework, all of the above mentioned tasks are posed as discrete MRF optimization problems with a welldefined global objective function. Thanks to our PriorityBP algorithm, the intolerable (due to the huge number of labels) computational cost of optimizing the resulting MRF is drastically reduced. Furthermore, visually inconsistent results due to greedy patch assignments are avoided, since our method always manages to maintain many candidate source patches for each block of missing pixels. Numerous results on a wide variety of difficult image completion cases prove the efficacy of our framework. Finally, as an application of our LPbased MRF optimization techniques, we turn our attention to a research topic that lies at the convergence of the fields of computer vision and computer graphics: the virtual reconstruction of 3D environ ments based on image sequences. Contrary to most of the existing imagebased modelingandrendering (IBMR) methods, which typically require large amount of image data and are thus suitable mostly for small scale scenes, here we propose a novel hybrid (geometry & image based) framework that is capable of providing photorealistic walkthroughs of very large, complex outdoor scenes at interactive frame rates. Furthermore, our framework is fully automatic and takes as input only a sparse set of stereoscopic image pairs from the scene. Based on these image pairs, a novel hybrid data representation of a 3D scene, called morphable 3Dmosaics, is then automatically extracted. According to it, a 3D scene is repre sented as a series of enhance local 3D models that allow a continuous morphing between each successive two of them to be taking place during rendering. The morphing is both photometric as well geometric and always proceeds in a phys iii ically valid way. MRFs play a crucial role for the correct estimation of the mor phing and, for optimizing these MRFs, we make use of our LPbased optimization methods. Our framework has already been successfully applied to the virtual 3D reconstruction of the Samaria gorge in Crete and a sample from the results that have been obtained is shown as well. iv Acknowledgements I first wish to express my sincere gratitude towards my advisor, Professor Geor gios Tziritas, who guided and supported me throughout the years. His office door was always open to me whenever i needed his advice. I also want to thank him for giving me the opportunity to explore and deal with challenging research topics. I would also like to thank assistant Professor George Georgakopoulos. His won derful lectures on algorithms and complexity (that i was lucky enough to attend) gave the inspiration for part of the research that was conducted in this thesis. In addition, i would like to thank Professor Panos Trahanias, as well as all the mem bers of my thesis committee, for their insightfull comments, which were indeed of great value for the improvement of the quality and presentation of this thesis. A special thanks also goes to Professor Nikos Paragios. Although being far away, he has always been willing to offer his valuable advice, as well as its constructive comments, on various early manuscripts. I should note that the biggest part of the research in this thesis has been funded by the European DHX (Digital Ecological and Artistic Heritage Exchange) project. Finally, I want to express my deepest gratitude towards my parents for their unfal tering love and support. They have always been a tremendous source of strength for me. Last but not least, i want to thank my wife, Vaggelio. Through all these years she has always been by my side, encouraging, helping and caring. I am pretty sure that, without her support, this thesis would not have been possible. Nikos Komodakis Heraklion, 2006. vi Contents Abstract i Acknowledgements v Thesis organization 1 1 Introduction 5 1.1 The optimization paradigm in vision . 5 1.2 Discrete labeling problems and Markov Random Fields . 10 1.3 Stateoftheart optimization methods for discrete Markov Random Fields . 17 1.4 Thesis contributions . 19 2 Background on Markov Random Fields 25 2.1 Introduction . 25 2.2 Basics of Markov Random Fields . 26 2.3 Gibbs random fields and MarkovGibbs equivalence . 28 2.4 Maximum a posteriori estimation . 30 2.5 Stateoftheart MRF optimization techniques . 32 2.5.1 The αexpansion algorithm . 32 2.5.2 Loopy belief propagation . 35 2.6 Other MRF optimization methods in vision . 39 3 Approximate Labeling via Graph Cuts Based on Linear Programming 43 3.1 Introduction . 44 3.2 Related work . 50 3.3 The primaldual schema . 52 3.3.1 The primal and dual LPs (Linear Programs) corresponding to Metric Labeling . 54 viii 3.3.2 An intuitive view of the dual variables . 56 3.3.3 Applying the primaldual schema to Metric Labeling . 57 3.4 The PD1 algorithm . 58 3.4.1 Update of the primal and dual variables . 63 3.5 The PD2 algorithm . 66 3.6 PD3: extending PD2 to the semimetric case . 70 3.7 Experimental results . 74 3.7.1 Perinstance suboptimality bounds . 74 3.7.2 Stereo matching . 78 3.7.3 Image restoration and image completion . 80 3.7.4 Optical flow estimation . 82 3.7.5 Synthetic problems . 84 3.8 Conclusions . 85 4 PriorityBP and the Problem of Image Completion 87 4.1 Introduction . 88 4.2 Image completion as a discrete global optimization problem . 94 4.3 PriorityBP . 95 4.3.1 Prioritybased message scheduling . 96 4.3.2 Assigning priorities to nodes . 100 4.3.3 Applying PriorityBP to image completion . 101 4.3.4 Label pruning . 104 4.4 Extensions & further results . 106 4.5 Conclusions . 111 5 3D Visual Reconstruction of Large Scale Natural Sites 113 5.1 Introduction . 115 5.2 Related work . 120 5.3 Overview of the modeling pipeline . 121 5.4 Local 3D models construction . 123 5.4.1 Disparity estimation . 124 5.5 Relative pose estimation between successive local models . 125 5.5.1 Widebaseline feature matching under camera looming . 126 5.6 Morphing estimation between successive local models . 128 ix 5.6.1 Estimating optical flow between widebaseline images Ik and Ik+1 . 130 5.6.2 Geometric morphing in region Ψ¯ . 134 5.7 Rendering pipeline . 138 5.7.1 Decimation of local 3D models . 140 5.8 3Dmosaics construction . 142 5.8.1 Rotation (Rij) estimation between views Ii; Ij . 143 5.8.2 Geometric rectification of local models . 143 5.8.3 Merging the rectified local models . 146 5.9 Further results . 146 5.10Conclusions . 150 A Technical proofs for theorems of chapter 3 153 A.1 Proof of theorem 3.2 about the optimality properties of the PD1 algo rithm . 153 A.2 Proof of theorem 3.3 about the optimality properties of the PD2µ algorithm .