
MULTISCALE STRUCTURE DETECTION AND ITS APPLICATION TO IMAGE SEGMENTATION AND MOTION ANALYSIS

BY

MARK D. TABB

B.S., Cornell University, 1991
M.S., University of Illinois at Urbana-Champaign, 1993

THESIS

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering in the Graduate College of the University of Illinois at Urbana-Champaign, 1996

Urbana, Illinois

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

THE GRADUATE COLLEGE

FEBRUARY 1996

WE HEREBY RECOMMEND THAT THE THESIS BY MARK D. TABB

ENTITLED MULTISCALE STRUCTURE DETECTION AND ITS APPLICATION TO IMAGE SEGMENTATION AND MOTION ANALYSIS

BE ACCEPTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR

THE DEGREE OF DOCTOR OF PHILOSOPHY

Director of Thesis Research

Head of Department

Committee on Final Examination†

Chairperson


† Required for doctor's degree but not for master's.

ABSTRACT

The problem of structure detection in images involves the identification of local groups of pixels that are both homogeneous and dissimilar to all nearby areas. Homogeneity can be measured with respect to any criteria of interest, such as color, texture, motion, or depth. No prior knowledge is assumed regarding the number of structures, their size or shape, or the degree of homogeneity that they must possess. Only the homogeneity criteria of interest have to be known. Structures may be either connected (pixels form contiguous areas) or disconnected, but the former case is treated in detail by this thesis. Structure identification is inherently a multiscale problem. For example, a texture contains subtexture, which itself contains subtexture, etc. In the absence of prior information, an algorithm must identify all such structures present, regardless of the scale. A formulation of scale is given that is able to describe image structures at different scales. A nonlinear transform is presented that has the property that it makes structure information at a given scale explicit in the transformed domain. This property allows the processes of automatic scale selection and structure identification to be integrated and performed simultaneously. Structures that are stable (locally invariant) to changes in scale are identified as being perceptually relevant. The transform can be viewed as collecting spatially distributed evidence for edges and regions and making it available at contour locations, thereby facilitating integrated detection of edges and regions without restrictive models of geometry or homogeneity variation. An application of this structure identification to the problem of estimating 2-D motion fields from video sequences is given. This approach has advantages in being able to compute accurate motion near occlusion boundaries and in areas with little variation in intensity.

ACKNOWLEDGEMENTS

I would like to thank my advisor, Professor Narendra Ahuja, for his support and guidance throughout the course of this research. His enthusiasm for the subject made this work truly enjoyable. I also would like to thank the committee members, Professors Zhi-Pei Liang, William O'Brien, and Michael Orchard, for their suggestions to improve the thesis as well as my presentation of it. In addition, I would like to thank the students in the Computer Vision Laboratory at the Beckman Institute, particularly Arun Krishnan, Tanuja Joshi, and Tod Courtney, who spent a great deal of their time keeping the machines up and running.

TABLE OF CONTENTS

1. INTRODUCTION
    1.1. Motivation and Approach
    1.2. Previous Work
    1.3. Thesis Organization
2. STRUCTURE AND SCALE
    2.1. The Relationship Between Structure, Scale, and Resolution
    2.2. Comparison with the Fractal Model of Structure
    2.3. On Scale
    2.4. Comparison with Other Formulations of Scale
3. A TRANSFORM FOR STRUCTURE DETECTION
    3.1. The Transform
    3.2. Properties of the Homogeneity and Spatial Distance Functions
    3.3. Structure in the Field Domain
    3.4. Computing Feature Vectors
4. MULTISCALE IMAGE SEGMENTATION
    4.1. Previous Work
    4.2. Specific Form of the Transform Used for Image Segmentation
        4.2.1. Specific formulation of scale
        4.2.2. Determining values for σ_s
        4.2.3. Behavior of F with σ_g
        4.2.4. Homogeneity distance function selection
    4.3. The Transform as Multiscale Gradient Operator
    4.4. Identifying Regions of Attraction within F
        4.4.1. Identifying region boundaries
        4.4.2. Structure stability
        4.4.3. Segmentation representation
    4.5. Implementation Details
    4.6. Experimental Results
    4.7. Discussion
5. MULTISCALE REGION-BASED 2-D MOTION ESTIMATION AND SEGMENTATION
    5.1. Previous Work
    5.2. Motivation for the Approach
    5.3. General Overview of the Algorithm
    5.4. Region Matching
        5.4.1. Region matching algorithm overview
        5.4.2. Matching of BRAGs
        5.4.3. Subgraph incorporation at articulation nodes
    5.5. Affine Transformation Parameter Estimation
    5.6. Motion Field Integration
    5.7. Motion Segmentation
    5.8. Layers and Occlusion
    5.9. Motion and Occlusion Prediction
    5.10. Implementation Details
    5.11. Experimental Results
    5.12. Discussion
6. CONCLUSION
    6.1. Primary Contributions of this Work
    6.2. Future Extensions of this Work
REFERENCES
VITA

1. INTRODUCTION

The problem of structure detection involves the identification of local areas within some set of data that are both homogeneous and dissimilar to all nearby areas. Homogeneity can be measured with respect to any criteria of interest. In the fields of image processing and computer vision, some such criteria include color, texture, motion, and depth. This problem is approached from the perspective of a total lack of prior information. No knowledge is assumed regarding the number of structures, their size or shape, or the degree of homogeneity that they must possess. Only the homogeneity criteria of interest have to be known. Further, the structure detection problem is treated from a low-level perspective in this thesis. No high-level reasoning is employed, and no human supervision of the process is allowed.

Because of the lack of a priori information, issues related to scale become significant. It is well-known that the real world contains structure at multiple scales. Consider, for example, structures defined by similarity of texture. A forest is a texture composed of trees. However, each tree is also a texture defined by leaves, and each leaf a texture defined by veins. None of these textures is more relevant than any other; they just exist at different scales. Thus, structure forms a natural pyramidal hierarchy in which a structure contains substructures, which themselves contain substructures, etc. If a particular image contains structures at multiple scales (such as if a forest, individual trees, and individual leaves are all visible), then the best a structure detection algorithm can hope to do is to identify all of the structures in the image, regardless of the scale at which they are present.

A general framework is introduced by this thesis within which this kind of problem can be solved. It is then shown in detail how to apply this framework to solve the specific structure detection problem known as image segmentation. This problem involves the identification of homogeneous areas that are connected (defined by contiguous pixels), and for which homogeneity is defined in terms of variation in the actual values of the data, and not some derived attribute such as texture. A method is then described that uses the structures identified by the image segmentation process to derive 2-D motion fields from a video sequence.

1.1. Motivation and Approach

Identifying structures at different scales requires a formal description of scale and its relationship to structure. This thesis argues that there are three distinct types of scale that are necessary to capture image structure. The first, integration scale, describes the amount of spatial support necessary at a given pixel in order to extract the homogeneity criteria of interest (e.g., texture).

The second, homogeneity scale, expresses the amount of homogeneity present within a structure, and the last, spatial scale, determines the degree of proximity among the elements of a structure.

Together, these scales fully characterize a structure.

For any given structure to be identifiable by a structure detection algorithm, it is necessary that the algorithm be tuned to the scale at which the structure exists. In other words, structure identification requires that the scale of the structure be known. Unfortunately, the scale of a structure cannot be computed unless the structure is itself known. Thus, the processes of scale selection and structure detection must be integrated and performed simultaneously. This is accomplished by a transform that utilizes this formulation of scale. The transform can be thought of as a form of force-based clustering that groups together similar pixels. It does this in a manner that makes explicit the structures present within an image at a given scale. One might imagine that there are many scales at which a particular grouping of pixels is reasonable but not perceptually relevant. It is argued that the importance of a structure is directly related to its extent in scale. Because the transform makes the structure at a particular scale explicitly available, it is easy to examine the change in structure resulting from a change in scale. Those structures that are stable (invariant) to local changes in scale can then be identified as meaningful. In this way, scale selection and structure identification are integrated.

A framework is presented for using the transform in this manner in order to solve the structure detection problem. However, a full solution to the general case is not given. Instead, the thesis focuses on applying the framework to the problem of image segmentation. The structure connectedness requirement of this problem results in structures forming regions that are separated by closed edge contours. For this case, the transform can be viewed equally as either a region detector or an edge detector. This has some interesting ramifications because previous image segmentation algorithms are either solely region-based or edge-based. In addition, because the process of edge detection is typically formulated in terms of the local maximum of the gradient, some comparisons are made between the transform and the notion of a multiscale gradient operator.

Finally, the problem of estimating the 2-D motion field from a video sequence is addressed. An algorithm is presented that utilizes the region structures produced by the image segmentation algorithm to produce 2-D motion fields, identify areas that exit (become occluded) and enter (become disoccluded) the field of view, identify areas having similar motion, and determine the relative distance from the camera of these motion regions. A region-based approach has advantages over other motion estimation methods in terms of its ability to produce accurate motion estimates in the presence of noise and changes in illumination, and within areas having little intensity variation. In addition, the close relationship between occluding contours and region structure boundaries defined by intensity simplifies the problem of producing accurate motion estimates near the occluding contours. The algorithm considers a video sequence two frames at a time. Each image is segmented, and the identified regions in the two frames are matched at multiple scales. The motion of the pixels within a pair of matched regions is modelled by an affine transformation, which is computed independently for each matched pair at all scales.

Because a pixel may belong to several different regions, it may have multiple motion vectors associated with it. An integration module then attempts to select the most correct motion vector at each pixel in order to yield a single motion field. Motion segmentation and occlusion and layer identification algorithms are then applied to this motion field.

1.2. Previous Work

No general framework for approaching the problem of structure detection has been given previously. In terms of image segmentation, many algorithms exist for computing an image segmentation at a single scale. These include thresholding techniques [1, 2], region growing [2-4], split-and-merge [5], watersheds [6], rule-based systems [7], and MRF-based models [8]. Such methods are generally able to identify structure that exists at scales to which they are tuned; however, they cannot identify structures at other scales. In addition, in areas where no structure exists at the tuned scale, unintuitive regions will be identified. Prior attempts at multiscale image segmentation [9-11] represent an image at different scales and apply one of the single-scale segmentation algorithms to the representation of the image at each scale. This type of scale is shown to be an integration scale, and, as such, is only weakly related to structure because the homogeneity and spatial aspects of scale are not modelled.

Previous approaches to the problem of 2-D motion estimation can be classified as either pixel-based (intensity-based) or feature-based. The pixel-based approaches include algorithms that utilize constraints based upon local spatial and temporal derivatives [12, 13], as well as the popular block-based correlation algorithm (BCA). These algorithms generally perform well in textured areas of the scene; however, they suffer from the problematic assumptions that intensity changes are caused solely by motion and that motion causes variations in intensity. As a result, noise and lighting changes induce spurious motion estimates, and motion estimates are difficult to obtain in areas having little variation of intensity. In addition, motion discontinuities at occlusion boundaries also may cause problems. The feature-based methods extract features from images and then match them across frames, thereby obtaining a displacement field. Such features include points defined by local intensity extrema [14], edges [15-17], corners [17, 18], and regions [19-24]. These algorithms generally provide accurate but sparse motion fields. Because each pixel in an image belongs to at least one region, the use of region features can provide dense motion estimates. The previous region-based algorithms follow the same general approach as the algorithm described in this thesis, but are all monoscale, use simple methods to match regions, and are error-prone near occlusion boundaries. The region matching algorithm used in this thesis is similar to the region adjacency graph matching method of [25], but has some advantages with regard to reduced computational complexity, coarse-to-fine matching, and ability to properly match regions lying entirely inside another region.

1.3. Thesis Organization

The relationship between structure and scale is discussed in detail in Chapter 2. This includes a discussion of the three distinct kinds of scale, as well as the number of scale parameters required to fully characterize structures of arbitrary size, shape, and degree of homogeneity. The transform is introduced in Chapter 3. Its ability to make both connected and disconnected structures explicit is explained, and some pointers are given toward using it to solve the general structure detection problem. Chapter 4 considers the problem of image segmentation. A method for automatically selecting scale parameters and detecting region structures is given. A comparison is also made between the transform and the concept of a gradient. Chapter 5 motivates and describes the use of region structures in estimating and segmenting 2-D motion fields. Finally, some conclusions are made in Chapter 6.

2. STRUCTURE AND SCALE

This thesis is concerned with the problem of automatically identifying low-level structures present in data. Image and video data are used exclusively in this thesis, but the approach is truly general and can be applied to any type of dataset. A structure consists of a local group of datapoints (pixels) within a dataset that are relatively homogeneous with respect to some homogeneity criteria and that are dissimilar to other nearby points outside of the structure. No prior information concerning the size, shape, number, or degree of homogeneity of the structures is assumed. Homogeneity can be measured in terms of the variation of the values taken by the pixels within a structure, or by using some attribute of the data. In the latter case, a feature vector that characterizes the homogeneity criteria of interest has to be computed at each pixel.

Structure homogeneity is then measured in terms of the amount of variation in the values of these feature vectors. Consider, for example, a color image. Because color information is available directly, one could identify structures having similar color by using the distance between colors in some color space (e.g., CIE L*a*b* [26]) as a similarity measure. However, one may be interested in identifying textural structures. Textures are comprised of texture elements (texels) that have similar properties. Texels cannot be sensed directly, and the properties by which they are similar are unknown a priori. In this case, a similarity measure is not directly available and has to be estimated. An example of low-level structure in a video sequence is moving objects, which are defined by areas of similar motion, a criterion that, once again, is not directly available and must be estimated.

Humans are quite adept at discriminating colors, textures, and moving objects given these constraints, but one might wonder whether the problem of detecting such structures is truly low-level, or whether higher-level processes such as reasoning have to be involved. Psychophysical experiments on humans, such as those performed by Julesz [27], indicate that a wide variety of textures are discriminated preattentively, that is, using only low-level processing in the visual cortex, and not any cognitive reasoning. Thus, the problem does seem to be well-defined at the low level.

2.1. The Relationship Between Structure, Scale, and Resolution

Before one can address the issue of developing an algorithm capable of automatically performing structure identification, it is first necessary to understand the complex relationship between structure and scale. Structure is inherently recursive, i.e., a structure contains substructure, which itself contains substructure, etc. Which of these structures is "relevant" in some sense is determined by the scale at which one is looking. Scale is not the same thing as resolution, however, even though the terms sometimes are used interchangeably. Resolution describes the distance from which data are viewed, thereby determining which structures are visible in the data. For example, at a particular resolution, one may be able to see a forest of trees. Viewed closer, individual trees become visible, and, closer still, individual leaves. At any given resolution, only a finite number of structures are visible because the number of levels of embedded structure is limited (typically 2-4). For example, if the forest is viewed at a resolution where it encompasses the entirety of one's field of view, then the forest is the coarsest scale structure visible (call this scale the outer scale). Individual trees are also visible, and these can be considered structures at some finer scale. One may imagine that individual leaves are visible on some trees, representing structure at an even finer scale. It is not likely, however, that any substructure within a leaf would be discernible, and, hence, the scale of the leaf represents the inner scale. As a result, for the given resolution, there are three levels of structure visible in this example. It is clear that resolution is extrinsic to the data in that it affects which structures are visible (can be resolved), but has no relationship to the structures themselves. Scale, on the other hand, actually defines and characterizes the structures, and, hence, is an intrinsic property of the data.

2.2. Comparison with the Fractal Model of Structure

Self-similar fractals are sometimes touted as being good models of real-world structure. Fractal models were popularized by Mandelbrot [28] fairly recently, but have a long history dating back at least as far as Cantor [29] and Fatou [30]. Such models are recursive in that a structure at some scale is composed of finer scale structures that are similar to it. Many examples are given in [28] of visually realistic depictions of objects such as ferns, trees, and mountains using these models. Real-world structure, however, is not well-represented by fractal models. For example, a brick wall is a texture composed of individual brick texels. At a finer scale, each brick is a texture, formed not by smaller bricks, but instead by rock and clay grains. In general, the statistics of a texture are unrelated across scales. Self-similar fractals represent a special instance of multiscale structure in which the homogeneity characteristics are unchanged across scales. They produce realistic looking objects because they capture the multiscale aspect of structure. This makes them quite useful for applications in computer graphics involving the synthesis of natural scenes, but not especially useful for applications involving image analysis, such as texture segmentation or image compression.

2.3. On Scale

Recall that a structure consists of a local area that is homogeneous with respect to some similarity measure. The degree of similarity and proximity that must be present among a set of pixels in order for it to be considered a structure has to be specified. In addition, if feature vectors have to be computed, then the extent of the surrounding area used at each pixel in this computation also has to be specified. This information is provided by three distinct kinds of scale: homogeneity scale, spatial scale, and integration scale, respectively. Each structure within the data is characterized by some combination of these scales. Scale parameters are assumed to take values in the range [0, ∞) on the real line. A larger value (coarser scale) indicates more variation in the structure characteristic described by the scale parameter, and a smaller value (finer scale) indicates correspondingly less variation.

Figure 2.1. Synthetic graylevel image. At a scale allowing for little homogeneity variation, the image consists of triangular structures. However, at a coarser homogeneity scale, the image consists of square structures.

Some examples may be useful to clarify these concepts. Consider Fig. 2.1, a synthetic image where each pixel takes on one of four different values. At a fine homogeneity scale (i.e., relatively little variation allowed within the structures), the pixels group into triangular structures. Similarly, at a coarser homogeneity scale, structures have more variation and the squares become relevant. The interaction between homogeneity and spatial scale is demonstrated in Fig. 2.2, which contains discrete 2-D data of 1-D, integer-valued feature vectors. Each data point is shown labelled by its integer value. The group of feature vectors comprising each structure is circled in Fig. 2.2(a)-(d). At some fine homogeneity and spatial scale, the structures shown in (a) are present, and if the homogeneity scale is made more coarse, the structures in (b) become present. If the spatial scale is then made more coarse, the larger structures in (c) become relevant, and if both scales are then made coarser still, the structure in (d) becomes relevant. None of the structures present in (a)-(d) is any more valid or important than any other; they merely exist at different scales. An example of integration scale is given in Fig. 2.3. This figure is an image of a sunflower field comprised of individual sunflower texels. The field has been imaged nonfrontally, so that the texels are at varying distances relative to the camera, causing the nearer texels to be much larger than the farther ones. An operator that computes feature vectors capturing the texel characteristics should have spatial support similar to the texel size. Hence, the texture field is characterized by a coarser integration scale at the bottom of the image than at the top.

Figure 2.2. A 2-D set of discrete, integer-valued data. (a) Some structures with a strong degree of homogeneity and spatial locality. (b) Some structures with less homogeneity than in (a), but the same degree of spatial locality. (c) Some structures with less spatial locality than in (b), but the same degree of homogeneity. (d) A structure with less homogeneity and spatial locality than in (c).

Figure 2.3. Example demonstrating integration scale. The sunflower field is imaged nonfrontally, resulting in the nearer sunflower texels appearing much larger in the image plane than the farther sunflowers. An operator that computes feature vectors capturing the texel characteristics should have spatial support similar to the texel size. Hence, the texture field is characterized by a coarser integration scale at the bottom of the image than at the top.

Figures 2.4-2.5 give more examples of multiscale structure present in real images. Figure 2.4 is a grayscale image of a sailboat on a lake. Let homogeneity be measured in terms of graylevel differences for this image. Both the cloud mass and the water are relevant structures at some scale, and the individual clouds within the mass and the streaks within the water correspond to relevant structure at some finer scale. Figure 2.5 is also a grayscale image, but now let homogeneity be measured in terms of textural characteristics. The woman's hair and sportcoat are both textures at some scale. The primary orientation of her individual hairs is different on either side of her part; thus, the part divides the hair into two different subtextures. Similarly, the orientation of the texture on the sportcoat is horizontal over the main body of the coat, but vertical on the arms. Thus, the coat subdivides into three different textures (main body plus each sleeve) at a finer scale.

Figure 2.4. Image of a sailboat on a lake. The image contains considerable multiscale structure defined by areas of homogeneous intensity. For example, at a scale where regions have significant intensity variation, the cloud mass and the water each can be considered as single regions. However, at a finer scale with increased sensitivity to intensity variation, individual clouds and the streaks within the water should be identified as regions.

Figure 2.5. Image of a television news announcer. The woman's hair and sportcoat each form textures at some scale. The primary orientation of her individual hairs is different on either side of her part; thus, the part divides the hair into two different subtextures at a finer scale. Similarly, the orientation of the texture on the sportcoat is horizontal over the main body of the coat, but vertical on the arms. Thus, the coat subdivides into three different textures (main body plus each sleeve) at a finer scale.

Although three kinds of scale are sufficient to describe low-level structure, many more than three scale parameters are required. An image cannot be described with a single homogeneity scale because it may contain very homogeneous structures in one part of the image and much less homogeneous structures in another part. Hence, at least one homogeneity scale parameter is required for each structure. Further, the lack of any constraints on structure size or shape requires the use of a potentially different spatial scale at each pixel within the structure. For instance, an octopus-shaped structure requires coarse spatial scales to represent the main body and much finer spatial scales to represent long, narrow tentacles. In addition, a potentially different integration scale parameter also may be required at each pixel within a structure. Consider the sunflower field texture in Fig. 2.3. An integration scale parameter large enough to capture the characteristics of one of the closer texels also would capture characteristics of an amalgam of the farther texels, thereby giving unintuitive results. Similarly, a finer integration scale tuned to the farther texels would compute the characteristics of the subtexture present on the closer texels.

Thus, in general, to describe any structure, it is necessary to utilize three scale parameters (one of each kind of scale) for each pixel.

For a given dataset, each perceptually relevant structure has a scale at which the grouping makes sense. However, many other pixel groupings will exist in the data which also are reasonable at some scale, but which are not perceived as being valid. It is assumed that perceptually relevant structures are distinguished from these nonrelevant groupings by being stable with scale (i.e., they make sense over some continuous range of scale). The motivation for this assumption is the following: The degree to which a structure is relevant is directly proportional to the ratio between the homogeneity difference between a structure and adjacent structures (interstructure homogeneity) and the intrastructure homogeneity variation. This same relationship holds for spatial and integration scales as well. These ratios are reflected in the extent of a structure in scale, where higher ratios correspond to greater extent in scale.

Figure 2.6. A 1-D continuous signal and its representation at different integration scales computed by convolving the signal with a Gaussian kernel (isotropic diffusion).

2.4. Comparison with Other Formulations of Scale

In the literature, the term scale generally refers to a one-parameter index into a hierarchical decomposition of a signal. Some examples of such decompositions include isotropic diffusion [9, 31], anisotropic diffusion [32], wavelets [33], and morphology [34]. The original signal is represented with decreasing amounts of detail as the scale becomes coarser. The scale parameter may be continuous [31], or take a discrete set of values [33]. The continuous case is generally referred to as a scale-space. An example of a scale-space decomposition obtained using isotropic diffusion is given in Fig. 2.6.

The type of scale used by all of these methods is actually an integration scale. The scale parameter defines the size of a spatial neighborhood over which a filter is applied. These filters are inherently low-pass, thereby emphasizing the signal mean over the given neighborhood.

The decompositions obtained with this class of methods are not structural. There is no simple relationship between the information removed by these filters and the structure present within the signal. In this thesis, such methods are referred to as representing a signal at different scales, and should not be confused with the general formulation of scale given in this chapter that is directly related to structure.

3. A TRANSFORM FOR STRUCTURE DETECTION

This chapter addresses how the ideas presented in the previous chapter on the relationship between scale and structure can be used to perform structure detection. Because structure is inherently characterized by scale, structure detection cannot be performed unless the scale is known. Similarly, the scale of a structure cannot be computed unless the structure already has been identified. Because neither the structures nor their scale is known a priori, scale selection and structure detection must be integrated. This problem is solved through the use of a new transform [35], which makes the structure present at a given scale explicit in the transformed domain. This allows easy examination of the change in structure resulting from a change in scale. A search across all scales for the set of structures that are invariant to local changes in scale results in the simultaneous identification of the perceptually relevant structures and the scale at which they exist.

3.1. The Transform

For a given scale, the transform maps a set of data into an attraction force field, within which the structure at that scale is explicitly encoded. Let I(x) represent a continuous k-dimensional dataset of l-dimensional vectors, i.e.,

    x = (x_1, \ldots, x_k)^T, \qquad I = (I_1, \ldots, I_l)^T \qquad (3.1)

For a given triplet of scale parameters (σ_g, σ_s, σ_i) at each point, denoting, respectively, the homogeneity, spatial, and integration scale at that point, a force field, F, is computed as follows:

    F(x;\, \sigma_g(x), \sigma_s(x), \sigma_i(x)) = \int \cdots \int_R F_{yx}\, \frac{r_{yx}}{\|r_{yx}\|}\; dy_1 \cdots dy_k

    F_{yx} = d_g(\Delta I, \sigma_g(x)) \cdot d_s(r_{yx}, \sigma_s(x))

    \Delta I = \|\hat{I}(y) - \hat{I}(x)\| \qquad (3.2)

    \hat{I}(x) = k(I(x), \sigma_i(x))

    R = \mathrm{domain}(I) \setminus \{x\}

    r_{yx} = y - x

In the case that I(x) is discrete, Eq. (3.2) becomes

    F(x;\, \sigma_g(x), \sigma_s(x), \sigma_i(x)) = \sum_{y \in R} F_{yx}\, \frac{r_{yx}}{\|r_{yx}\|} \qquad (3.3)

The transform computes at each pixel a vector sum of pairwise affinities between the pixel and all other pixels. The resultant vector produced by the transform at each pixel defines both the direction and magnitude of attraction experienced by the pixel from the rest of the image. The affinity of a pixel x_0 for another pixel y_0 is given by the term F_{y_0 x_0} (r_{y_0 x_0} / ||r_{y_0 x_0}||), where r_{y_0 x_0} is the vector from x_0 to y_0, and F_{y_0 x_0} is the magnitude of the attraction for pixel y_0 experienced by pixel x_0. This magnitude is given by the product of a homogeneity distance function, d_g(·), which measures the degree to which the two pixels are similar, and a spatial distance function, d_s(·), which measures the proximity of the pixels. The homogeneity between two pixels, ΔI, is given by the Euclidean distance between their associated m-dimensional feature vectors,

    \hat{I} = (\hat{I}_1, \ldots, \hat{I}_m)^T \qquad (3.4)

where the feature vector at each pixel is determined by an operator, k(·), which computes the desired homogeneity characteristics over an area surrounding the pixel with spatial support given by σ_i.
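To make the discrete form concrete, the following sketch evaluates Eq. (3.3) at a single pixel. It is only an illustration of the definition, not the implementation used in this thesis; the feature image, the two distance functions, and the scale values are assumed to be supplied by the caller, and d_s is applied to the scalar distance, which is equivalent under the isotropy property introduced in the next section.

    import numpy as np

    def force_at_pixel(features, x0, d_g, d_s, sigma_g, sigma_s):
        # Discrete attraction force (Eq. 3.3) at pixel x0 of a feature image.
        #   features : (H, W, m) array of m-dimensional feature vectors
        #   x0       : (row, col) of the pixel being transformed
        #   d_g, d_s : homogeneity and spatial distance functions
        #   sigma_g, sigma_s : homogeneity and spatial scales at x0
        H, W, _ = features.shape
        r0, c0 = x0
        F = np.zeros(2)
        for r in range(H):
            for c in range(W):
                if (r, c) == (r0, c0):
                    continue
                r_yx = np.array([r - r0, c - c0], dtype=float)   # vector from x0 toward y
                dist = np.linalg.norm(r_yx)
                dI = np.linalg.norm(features[r, c] - features[r0, c0])
                F_yx = d_g(dI, sigma_g) * d_s(dist, sigma_s)     # attraction magnitude
                F += F_yx * r_yx / dist                          # weighted unit direction
        return F

In practice, the locality property of d_s (Property 5 below) restricts the double loop to a neighborhood whose radius is proportional to σ_s, which is what makes the transform computationally tractable.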

3.2. Properties of the Homogeneity and Spatial Distance Functions

With the definition of the force field F given in Section 3.1, pixels are grouped together into structures consisting of sets of pixels that are mutually attracted to one another. The force direction at each pixel points towards the interior of the structure to which it groups. Pixels may group together into any possible spatial configuration, allowing structures of any conceivable size or shape to be detected. As the scale parameters are varied, the force vectors align to form different structures. If the values of the homogeneity scales are increased, less homogeneous structures should be encoded in the field. Similarly, an increase in the spatial scales should cause larger structures to be encoded. To ensure such relationships between scale and structure, the distance functions, d_s(·) and d_g(·), should possess the following properties:

1. Unit Range. The transform measures the degree of attraction among pixels, not repulsion, so the functions should be nonnegative. For convenience, and without loss of any generality, the maximum value of these functions is set to unity. Hence, 0 ≤ d_g(·), d_s(·) ≤ 1.

2. Decreasing Attraction (image characteristic). The degree of attraction between pixels should be directly proportional to their similarity, d_g(ΔI_1, σ_g) ≥ d_g(ΔI_2, σ_g) for ΔI_1 < ΔI_2, ∀σ_g, and proximity, d_s(r_1, σ_s) ≥ d_s(r_2, σ_s) for ||r_1|| < ||r_2||, ∀σ_s.

3. Increasing Attraction (scale characteristic). The attraction between pixels should be directly proportional to both homogeneity scale, d_g(ΔI, σ_g^1) ≤ d_g(ΔI, σ_g^2) for σ_g^1 < σ_g^2, and spatial scale, d_s(r, σ_s^1) ≤ d_s(r, σ_s^2) for σ_s^1 < σ_s^2.

4. Isotropicity. Structure should not be detected by the transform preferentially in any direction. Thus, d_s(r, σ_s) = f(||r||, σ_s) is required.

5. Locality. The field at each pixel should depend only on a local (albeit adaptively determined) neighborhood around that pixel. Hence, let d_s(r, σ_s) = 0 for ||r|| > c·σ_s, for some constant c.

Two possible forms for the functions d_g(·) and d_s(·) satisfying these criteria are the unnormalized Gaussian,

    d_g(\Delta I, \sigma_g) = \exp\!\left( -\frac{\Delta I^2}{2\sigma_g^2} \right) \qquad (3.5)

    d_s(r, \sigma_s) = \begin{cases} \exp\!\left( -\frac{\|r\|^2}{2\sigma_s^2} \right), & \|r\| \le 2\sigma_s \\ 0, & \|r\| > 2\sigma_s \end{cases} \qquad (3.6)

and the box-car window,

    d_g(\Delta I, \sigma_g) = B_{\sigma_g}(\Delta I) \qquad (3.7)

    d_s(r, \sigma_s) = B_{\sigma_s}(\|r\|) \qquad (3.8)

where

    B_\sigma(t) = \begin{cases} 1, & t \le \sigma \\ 0, & t > \sigma \end{cases} \qquad (3.9)
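Written as code, the two families above are straightforward; the sketch below is illustrative only, assumes scalar arguments, and uses the truncation constant c = 2 for the Gaussian spatial function, as in Eq. (3.6).

    import numpy as np

    def d_g_gaussian(dI, sigma_g):
        # Unnormalized Gaussian (Eq. 3.5): equals 1 at dI = 0 and decays with dI.
        return float(np.exp(-dI ** 2 / (2.0 * sigma_g ** 2)))

    def d_s_gaussian(r, sigma_s):
        # Truncated Gaussian (Eq. 3.6): zero beyond 2*sigma_s to satisfy locality.
        return float(np.exp(-r ** 2 / (2.0 * sigma_s ** 2))) if r <= 2.0 * sigma_s else 0.0

    def boxcar(t, sigma):
        # Box-car window B_sigma (Eq. 3.9): full attraction up to sigma, none beyond.
        return 1.0 if t <= sigma else 0.0

    def d_g_boxcar(dI, sigma_g):
        return boxcar(dI, sigma_g)       # Eq. (3.7)

    def d_s_boxcar(r, sigma_s):
        return boxcar(r, sigma_s)        # Eq. (3.8)

Either pair can be passed to the force computation sketched in Section 3.1; the box-car pair is the one adopted for image segmentation in Chapter 4.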

Although F is not invariant to the exact form of d_g(·) and d_s(·), one would expect different forms to result in minimal change in the encoded structure information, so long as the forms satisfy the listed criteria, because relevant structure is locally invariant to changes in scale. This point is discussed in more detail in the context of the problem of image segmentation in the following chapter.

3.3. Structure in the Field Domain

If a structure is connected, then at a scale at which the structure exists, the structure is represented within the field as a region of contracting flow (inward force vectors) defined by contiguous pixels. Such a region is termed a region of attraction. Within this region, the flow sinks form a set of contours that characterize the skeleton of the structure. Similarly, if spatially adjacent structures are also connected, the region boundary is represented by the source of the flow. Consider a region whose boundary is given by a closed curve V, where ∇V is the outward normal of V. Denote by F⁻ the field immediately on the interior of V and by F⁺ the field on the immediate exterior. From the property of contracting flow, V satisfies the two relations

    \nabla V \cdot F^- < 0, \qquad \nabla V \cdot F^+ > 0 \qquad (3.10)

because every point on a boundary curve separates at least two areas of contracting flow. Figure 3.1(a) shows F⁻ and F⁺ vectors for a 2-D scalar image containing a black region on a white background, with the scale parameters selected to yield regions of attraction. Other values of the (σ_g, σ_s) scales may result in contracting flow over a set of disconnected regions (Fig. 3.1(b)). The term zone of attraction is used to denote the image space characterized by contracting flow, regardless of whether or not it consists of one or more regions. Representing both connected and disconnected structures within the field as areas of contracting flow is essential, because real-world structure may not always be connected. An example of both connected and disconnected structures in a real image is given in Fig. 3.2. Individual birds comprise connected structures of homogeneous graylevel, and the water forms a connected texture. The flock of birds, however, is a disconnected texture formed by individual bird texels. In addition, if one had a video sequence of this scene, the flock also would form a disconnected region defined by similarity of motion.

Figure 3.1. (a) Spatial and homogeneity scales selected so that the black region is transformed into a zone of attraction consisting of a single region of contracting flow. (b) Coarser spatial scales are used, resulting in a zone of attraction containing several regions of contracting flow.

Structure identification in the field domain involves identifying zones of attraction that are stable with respect to local changes in scale. A general method for identifying zones of attraction is not presented in this thesis. Instead, we consider only the case where all structures are contiguous, i.e., all zones of attraction are also regions of attraction. A complete solution for this case is given in the following chapter.

Figure 3.2. Image containing both connected and disconnected structure. Individual birds form connected regions of homogeneous intensity. Similarly, the water is a connected region of similar texture. The flock of birds, however, is a disconnected texture formed by individual bird texels.

3.4. Computing Feature Vectors

This section addresses some issues with regard to the operator k(·), which is used to compute feature vectors capturing some homogeneity criteria of interest. Homogeneity is measured in terms of distance between the actual values of the data for the case where σ_i(x) = 0, because no information from any surrounding pixels is available at this integration scale. Thus, all possible operators should satisfy the relationship

    k(I(x), 0) = I(x) \qquad (3.11)

The exact form of k(·) is, of course, problem dependent. Because the application of structure detection considered in this thesis is the problem of image segmentation, for which σ_i(x) = 0, no specific forms of k(·) are given in this thesis.
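As a purely illustrative example (again, no specific form is advocated here), even an operator as simple as a local mean over a window whose half-width is set by σ_i satisfies Eq. (3.11); the function below is hypothetical.

    import numpy as np

    def k_local_mean(image, x0, sigma_i):
        # Hypothetical feature operator k(I(x), sigma_i): the mean intensity over a
        # square window of half-width round(sigma_i) centered at x0.  At sigma_i = 0
        # the window is the single pixel, so k(I(x), 0) = I(x), as Eq. (3.11) requires.
        r0, c0 = x0
        h = int(round(sigma_i))
        H, W = image.shape
        window = image[max(r0 - h, 0):min(r0 + h + 1, H),
                       max(c0 - h, 0):min(c0 + h + 1, W)]
        return np.array([window.mean()])   # a 1-dimensional feature vector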

The problem of devising a specific form of k(-) for measuring local texture statistics has received considerable attention in the literature. Proposed methods utilize local geometric primitives [36-38], filter banks [39-43], local statistical features [44-46], and random field models [47-49]. Of these techniques, only that of Lindeberg [44] directly utilizes an integration scale, and this technique uses a fairly simple operator. Hence, more work still has to be done in this area in order to obtain an operator applicable to a wide variety of textures. In any case, all of these operators can be integrated directly into the transform.

4. MULTISCALE IMAGE SEGMENTATION

This chapter addresses a special case of the general structure detection problem known as image segmentation. This problem assumes that the structures are connected and that homogeneity is measured by differences among the data values (i.e., σ_i(x) = 0). For grayscale images, this involves the identification of regions characterized by homogeneous intensity and high contrast with all spatially adjacent regions. The goal of image segmentation is to identify all such image structures without any smoothing of the structure boundaries, regardless of the scale at which the structures occur, and without incorrectly identifying any irrelevant areas as structures.

Perceptual validity is used as a subjective measure of structure relevance. This chapter describes a method for identifying such structures using the transform, and it is shown how the processes of scale selection and scale-invariant structure identification can be integrated and performed simultaneously.

In the case of image segmentation, the transform can be viewed as collecting spatially distributed evidence for edges and regions and making it explicitly available at contour locations, and, in this sense, it performs Gestalt analysis. Further, it does this without using restrictive models of structure geometry or intensity variation, and without the need for any user-specified parameters. This allows the identification of precise structure boundaries that are completely unsmoothed, even at coarse scales. The transform incorporates into its definition the duality of region- and edge-based descriptions of image structure, namely, that the regions have a smooth variation of some property (e.g., intensity) in the interior and a sharp discontinuity across the boundary. Thus, the transform performs integrated edge and region detection, and can be viewed as a multiscale edge and blob detector at the same time.

The rest of this chapter is organized as follows: Section 4.1 reviews previous work on image segmentation. Section 4.2 describes the specific form of the transform used for image segmentation, and Section 4.3 gives an interpretation of the transform as a multiscale gradient. The extraction of image structure from the transformed image is discussed in Section 4.4, implementation details are given in Section 4.5, and experimental results for synthetic and real images are presented in Section 4.6. Lastly, a final discussion of the method is given in Section 4.7.

4.1. Previous Work

A large number of algorithms for image segmentation have been proposed over the years. However, almost all of these methods are inherently tuned to a particular scale. Any relevant structures that exist at that scale can be identified by such methods; however, structures at other scales cannot be identified. Worse, in any part of an image where no structure exists at that scale, or where structure exists but is not locally invariant to scale, such approaches identify unintuitive regions. Some examples of these approaches include thresholding techniques [1, 2], region growing [2-4], split-and-merge [5], watersheds [6], rule-based systems [7], and MRF-based models [8]. Prior attempts at multiscale segmentation [9-11] all represent the image at different scales with one of the methods in Section 2.4, and then apply a single-scale segmentation algorithm to the representation of the image at each scale. This does yield a region hierarchy, but the regions contained within it are not perceptually relevant for all but the most trivial images. This is because scale-spaces are not structural decompositions. Structure is contained implicitly within the scale-space, and the representation of the image at any given scale will contain, in general, multiscale structure. Hence, a multiscale segmentation algorithm is still required to extract the image structures from the scale-space decomposition. The use of these scale-spaces for image segmentation is inherently misguided because the scale parameter is simply an integration scale. Modeling image structure at different scales requires homogeneity and spatial scales to characterize the intensity variation and geometry (size and shape) of the structures. Further, scale-spaces inherently involve some smoothing of the image at all but the finest scale, so even if a true multiscale segmentation algorithm is applied to the representation of the image at each scale in the scale-space, the boundaries of any extracted structure will be smoothed as well.

4.2. Specific Form of the Transform Used for Image Segmentation

For a discrete, 2-D grayscale image, I(x, y), the general transform (Eq. (3.2)) simplifies to

    F(x, y;\, \sigma_g(x,y), \sigma_s(x,y)) = \sum_{(u,v) \neq (x,y)} d_g(\Delta I, \sigma_g(x,y)) \cdot d_s(\|r\|, \sigma_s(x,y)) \, \frac{r}{\|r\|}

    r = (u - x)\,\hat{i} + (v - y)\,\hat{j} \qquad (4.1)

    \Delta I = |I(x,y) - I(u,v)|

For computational efficiency, the form of the homogeneity and spatial distance functions was selected as box-car. Although F is not invariant to the form of d_g(·) and d_s(·), different forms result in minimal change in the encoded structure information, so long as the forms satisfy the previously listed criteria. This is because structure is represented as converging and diverging vectors that are locally invariant to changes in scale. Experiments have been performed using different functions (box-car, Gaussian, linear, exponential) for both d_g(·) and d_s(·) on various real images. There was no change in the detected structures using different forms for d_s(·), and only marginal changes for different forms of d_g(·). A box-car function for d_g(·) gives the best results, a point that is explained in Section 4.2.4 after some necessary concepts have been introduced in the intervening sections.
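A minimal sketch of Eq. (4.1) under the box-car choice follows. The loop bounds exploit the locality of d_s, so only a square neighborhood of half-width σ_s around the pixel is visited; the function and variable names are illustrative only, not the implementation used in this thesis.

    import numpy as np

    def force_grayscale(image, x0, sigma_g, sigma_s):
        # Attraction force of Eq. (4.1) at pixel x0 with box-car distance functions:
        # every pixel within distance sigma_s whose intensity differs from I(x0) by
        # at most sigma_g contributes a unit vector pointing toward it.
        r0, c0 = x0
        H, W = image.shape
        h = int(np.ceil(sigma_s))
        F = np.zeros(2)
        for r in range(max(r0 - h, 0), min(r0 + h + 1, H)):
            for c in range(max(c0 - h, 0), min(c0 + h + 1, W)):
                if (r, c) == (r0, c0):
                    continue
                r_vec = np.array([r - r0, c - c0], dtype=float)
                dist = np.linalg.norm(r_vec)
                if dist <= sigma_s and abs(float(image[r, c]) - float(image[r0, c0])) <= sigma_g:
                    F += r_vec / dist
        return F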

4.2.1. Specific formulation of scale

Two independent scale parameters per pixel are overgeneral for characterizing image structure. In this thesis, a structure is modelled by a single σ_g that characterizes the degree of homogeneity of the structure, and a σ_s at each pixel that characterizes its size and shape. This number of spatial scales is necessary in order to identify a structure without any spatial smoothing. For example, identifying an octopus-like structure (large main body with fine appendages) requires large values of σ_s within the body and small values within the appendages. In contrast, if retaining fine boundary detail is not necessary, then a (σ_g, σ_s) pair is sufficient to represent the scale of a structure. In Section 4.2.2, it is shown that if the value of σ_g for which a pixel belongs to a particular structure is known, then the appropriate value for σ_s(x, y) at that pixel can be determined automatically, so that

    F(x, y;\, \sigma_g(x,y), \sigma_s(x,y)) = F_{\sigma_g}(x, y) \qquad (4.2)

The homogeneity scale, σ_g, is made spatially invariant and is the sole independent scale parameter to the transform. This results in an image being mapped into a one-parameter family of attraction force fields, F_{σ_g}, which contains all of the multiscale structure present in an image, thereby simplifying the problem of structure identification to a 1-D search.

In addition to regarding the set of spatial scales associated with a structure as describing its size and shape, an alternative (but equivalent) interpretation is that the spatial scales describe the appropriate amount of spatial information necessary to identify a structure having a given degree of homogeneity. For example, if σ_s at a pixel is too small, then there may not be enough spatial information available for the transform to determine to which region the pixel should group. Similarly, if σ_s is too large, then the transform may group the pixel into a region containing very distant and unrelated pixels. Figure 4.1 gives an example that demonstrates the necessity of selecting σ_s properly in order to determine whether or not a region boundary (edge) is present at a pixel. This example is independent of the method actually used to perform the region (edge) identification (the transform in this case).

Figure 4.1. (a) For σ_g < 10, the circle C_1 contains enough information to identify that an edge is present at Point P. For 10 < σ_g < 30, [...]

There are four regions of constant intensity in Fig. 4.1(a), each labelled by its intensity. The most reasonable groupings with respect to σ_g for these regions range from the four individual regions at fine scales to the single group {50, 60, 70, 80} for σ_g ≥ 30. Consider the point P that lies on the high-frequency edge separating regions 60 and 70, and on the low-frequency edge between regions 50 and 80. Let the spatial information usable for determining whether or not an edge is present at P be contained within a circle of radius σ_s centered at P, and also let r_i denote the radius of circle C_i. For σ_g < 10, σ_s = r_1 provides enough information for determining that an edge should be present at P. For 10 < σ_g < 30 an edge should still be present, but σ_s = r_1 does not provide enough information to support this conclusion, although σ_s = r_2 does. Too much information is just as much of a problem as too little. For both these values of σ_g, choosing a much larger σ_s draws in distant, unrelated parts of the image; effectively, the local structural information is lost. It should be apparent from Fig. 4.1 that, for each value of σ_g at each pixel, there is some appropriate range of values for σ_s that is necessary for structure identification to be possible.

4.2.2. Determining values for σ_s

For a given σ_g, at each pixel (x_0, y_0) within some region R, σ_s(x_0, y_0) needs to be chosen such that R corresponds to a region of contracting flow in F. In general, a continuous range of values of σ_s(x_0, y_0) satisfies this requirement. As σ_s(x_0, y_0) increases from zero, ||F_{σ_s}(x_0, y_0)|| becomes larger and the vector direction becomes stable, approximately orthogonal to the nearest part of the region boundary. When this occurs, the pixel has been properly scaled. Define

    \sigma_s^-(x_0, y_0) = \min(\sigma_s) \ \text{such that} \ \|F_{\sigma_s}(x_0, y_0)\| \ge T(\sigma_g, \sigma_s) \qquad (4.3)

The value of T determines the smallest feature size that is captured in the field structure. Explicit relationships between structure scale and size can be created by varying T with scale. For example, making T an increasing function of scale increases the effective minimum structure size as scale becomes coarser. In general, in order to allow connected structure of arbitrary size and shape to be detected, one should select T = ε, where ε ≈ 0. However, if T is too small, the transform is overly sensitive to heavily aliased structures consisting of only a couple of pixels. Experimentally, selecting T ∈ [4, 10] eliminates the detection of such structures while still allowing all other structures to be identified. In general, T represents the minimum value of ||F_{σ_s}(x_0, y_0)|| that signifies the existence of an edge, and when ||F_{σ_s}(x_0, y_0)|| = T, an edge exists roughly within a distance of σ_s^- from the pixel. If the pixel is near the region boundary, σ_s^- is small, and σ_s^- is approximately equal to the radius of the region if the pixel is near the region center.
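The search for σ_s^- implied by Eq. (4.3) can be sketched as a simple sweep over σ_s. The sketch below treats T as a constant (as suggested by the range [4, 10] above) rather than a function T(σ_g, σ_s), reuses the force_grayscale sketch from Section 4.2, and uses an assumed step size and upper bound.

    import numpy as np

    def sigma_s_minus(image, x0, sigma_g, T, sigma_max, step=1.0):
        # Smallest spatial scale at which the force magnitude reaches T (Eq. 4.3).
        # force_grayscale is the sketch from Section 4.2; T is treated as a constant.
        sigma_s = step
        while sigma_s <= sigma_max:
            F = force_grayscale(image, x0, sigma_g, sigma_s)
            if np.linalg.norm(F) >= T:
                return sigma_s
            sigma_s += step
        return None   # no edge response reached T within the allowed range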

The value of σ_s^+ corresponds to the largest value of σ_s(x_0, y_0) that can be selected without the pixel becoming attracted to a disconnected zone of attraction. This situation is detected by examining the behavior of F_{σ_s}(x_0, y_0) as σ_s(x_0, y_0) is increased beyond σ_s^-. As this occurs, the pixel initially belongs to the connected zone of attraction. The vector direction tends to change very slowly as long as this is true. However, when σ_s(x_0, y_0) becomes large enough, a transition occurs and the pixel now belongs to a different zone of attraction. The vector direction changes much more rapidly during such a transition. As σ_s(x_0, y_0) is further increased, the same cycle of behavior may recur: an interval of little change in the vector direction followed by rapid change during a transition. We term the state of little change spatial stability. For each zone of attraction to which the pixel (x_0, y_0) belongs, the value of σ_s(x_0, y_0) for which the vector direction changes the least is called a spatially stable point. Figure 4.2(a) shows a binary image containing squares composed of nine pixels. A plot of dθ(σ_s)/dσ_s for a pixel belonging to one of the squares is given in Fig. 4.2(b), where θ(σ_s) is the direction in radians of F_{σ_s}(x_0, y_0) from some reference direction.

Figure 4.2. (a) Binary image containing square black regions. (b) A plot of the change in vector direction with σ_s for a pixel belonging to the square pointed to by A. The three main minima correspond to the spatially stable points. The pixel belongs to a different zone of attraction at each of these points, corresponding to the three perceptual dot clusters to which the pixel belongs.

The stability points of the graph occur at

    \frac{d\theta(\sigma_s)}{d\sigma_s} = 0 \qquad (4.4)

for σ_s > σ_s^-. Selecting σ_s(x_0, y_0) ∈ [σ_s^-, σ_s^+] for each pixel yields an F with every force vector belonging to a region of attraction.
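Spatially stable points can be located numerically by sweeping σ_s, recording the force direction θ(σ_s), and looking for local minima of a finite-difference approximation to |dθ/dσ_s|. The sketch below reuses the force_grayscale sketch from Section 4.2 and is only an illustration of Eq. (4.4), not the exact procedure used in the thesis.

    import numpy as np

    def spatially_stable_points(image, x0, sigma_g, sigma_values):
        # Sweep sigma_s over sigma_values, record the force direction theta(sigma_s),
        # and return the sigma_s values where the change in direction between
        # consecutive scales is locally smallest (approximating Eq. 4.4).
        thetas = []
        for s in sigma_values:
            F = force_grayscale(image, x0, sigma_g, s)
            thetas.append(np.arctan2(F[1], F[0]))          # direction w.r.t. a reference axis
        dtheta = np.abs(np.diff(np.unwrap(thetas)))        # change between successive scales
        stable = [sigma_values[i + 1]
                  for i in range(1, len(dtheta) - 1)
                  if dtheta[i] <= dtheta[i - 1] and dtheta[i] <= dtheta[i + 1]]
        return stable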

4.2.3. Behavior of F with σ_g

In this section, the effect of varying σ_g on F is examined, first at the level of individual pixels and then for entire structures. For a given value of σ_g, each pixel is assigned a spatial scale σ_s ∈ [σ_s^-, σ_s^+] as described in the previous section. As σ_g varies, a pixel initially belongs to the same region of attraction for some range of values, but at some point it makes a transition and becomes a part of another region of attraction. These transitions are readily discernible by examining the σ_g-σ_s behavior of a pixel: at a transition the pixel groups with a different, typically larger, region of attraction, resulting in a discontinuity in the value of σ_s.

Figure 4.3(a) shows an image of people at a 3-D movie. Plots of σ_s vs. σ_g are shown in Fig. 4.3(b) and (c) for the pixels located at the centers of the leftmost and rightmost crosshairs (+) overlaid onto the image, respectively. The pixel for (b) belongs to two relevant regions of attraction. The first consists of the woman's face, and the second occurs as the face merges into the background region. The pixel for (c) belongs to three relevant regions of attraction. The first consists of the man's shirt, the second occurs when the shirt merges with the rest of the man, and the third occurs when the man merges into the background. For both (b) and (c), transitions between regions of attraction are clearly indicated by discontinuities in the value of σ_s.

From examination of the σ_g-σ_s plot of a pixel, both the number of structures to which the pixel belongs and the range of σ_g for which the pixel belongs to each structure can be determined. However, in and of itself, this does not provide enough information to perform structure identification.

An understanding of the behavior of entire structures with respect to σ_g is also required. As σ_g varies, some structures coalesce while others disappear. Because image structures generally have nonzero variance, regions of attraction form and disappear over some range of σ_g. Hence, for any given structure, as σ_g is increased, the structure gradually forms, then remains relatively stable for some interval, and then gradually disappears as its constituent pixels form into other, coarser structures. The identification of these structures from F is addressed in Section 4.4.


Figure 4.3. (a) Image of people at a 3-D movie. (b) and (c) show plots of σ_s vs. σ_g for the two pixels located under the leftmost and rightmost crosshairs (+) overlaid onto the image, respectively. For (b), the pixel belongs to two regions of attraction: the woman's face before and after it merges into the background. For (c), the pixel belongs to three regions of attraction: the man's shirt, the shirt merged into the rest of the man, and the man merged into the background. Transitions between regions of attraction are clearly indicated by discontinuities in the value of σ_s.

4.2.4. Homogeneity distance function selection

The purpose of this section is to motivate briefly the use of a box-car function for d_g(·). Consider Fig. 4.4, which contains a uniform triangular region having contrast C with the background. The transform is computed at points P_1 and P_2 within the region for the same

value of σ_s, represented by the radius of the drawn circles. If a box-car is used for d_g(·), then

both points merge into the background region at σ_g = C. However, if d_g(·) is Gaussian, P_1 merges into the background earlier than P_2 because the proportion of area within the circle centered at P_1 occupied by the region is less than the proportion within the circle at P_2. The

net effect is that the range of σ_g over which the region merges into the background is increased. This result also holds under the addition of zero-mean noise. It is shown below that, for a 1-D

signal, the form of d_g(·) that yields the shortest transition interval is the box-car function. The same result also holds for higher dimensional signals.

Consider the 1-D signal

    I(x) = i_1 for x < 0,  i_2 for x ≥ 0      (4.5)

where |i_1 − i_2| = C. For the discontinuity at x = 0, we seek the d_g(·) that jointly maximizes

    { d_g(0, σ_g) − d_g(C, σ_g) } ∫ d_s(x, σ_s) dx      (4.6)

for σ_g < C_1, and minimizes

    C_2 − C_1      (4.7)

subject to the constraints

    ‖F_{σ_g}(0)‖ > T   for σ_g < C_1      (4.8)

    ‖F_{σ_g}(0)‖ < T   for σ_g > C_2      (4.9)

    C_2 ≥ C_1      (4.10)

Figure 4.4. Demonstration of the increase in the interval of σ_g over which structures transition when functions other than a box-car are used for d_g(·). Consider the transform computed at points P_1 and P_2 within a uniform triangular region, using the same value of σ_s, represented by the radius of the drawn circles. If d_g(·) is a box-car, both points transition to the background at the same value of σ_g, whereas if d_g(·) is Gaussian, P_1 transitions before P_2, resulting in a finite transition interval for the region.

Eq. (4.6) represents the smallest structures that can be detected by the transform, while Eq. (4.7) minimizes the transition interval. The constraints (Eqs. (4.8)-(4.10)) require the transition to be well-behaved. Given these criteria, it is trivially shown that the optimal form of d_g(·) is the box-car window. For this form, the transition occurs at σ_g = C = C_1 = C_2. Also, it is readily shown that this form is optimal for field vectors at all other values of x.

Other forms for d_g(·) result in a longer transition interval, thereby reducing the interval of σ_g for which a structure is stable, possibly even eliminating the interval entirely. Only structures that are marginally stable (using the stability measure presented in Section 4.4) are negatively affected by this, which explains why the use of a box-car for d_g(·), rather than other functions, results in only a slight improvement in segmentation quality.

4.3. The Transform as Multiscale Gradient Operator

Because, for connected structures, the field domain contains both a region and an edge description of the image structure, it is interesting to consider the transform as an edge detector, since an edge has a precise definition: an edge is a local maximum of the gradient. At the inner scale, the gradient is a linear operator and the standard definition applies, i.e., ∇I = I_x i + I_y j, where I_x and I_y are computed by nearest neighbor differences because I is discrete. Consider the difficulty, however, of defining a gradient at any other scale. A gradient has to capture the local rate of change in intensity over an area defined by the spatial scale, while ignoring intensity variations that are irrelevant at the present homogeneity scale. Virtually all previous work on edge detection (see, for example, [50-57]) defines a gradient using operations based upon local differences in intensity populations. Such approaches fail in areas with complicated geometry, such as in Fig. 4.5(a). This image consists of the five regions of constant intensity shown in (b), where the horizontal and vertical edges are step transitions, and the two diagonal edges have linear ramp profiles six pixels in length, and to which white Gaussian noise having σ = 5 was added. At some scale, each of these regions is relevant. Consider what sort of operator is necessary to compute the gradient at point P, which lies within region 100 and along a diffuse edge of low contrast and low signal-to-noise ratio, and which is in close proximity to much sharper and contrasted edges. An operator that directly looks for local differences in intensity populations is virtually guaranteed to be unduly influenced by the discontinuities between region 100 and regions 200, 210, and 150. As a result, neither the gradient magnitude nor the gradient direction will reflect the edge between regions 100 and 90.


Figure 4.5. Demonstration of the transform as a multiscale gradient operator. (a) Synthetic image consisting of the five regions of constant intensity shown in (b), where the horizontal and vertical edges are step transitions, and the two diagonal edges have linear ramp profiles six pixels in length, and to which white Gaussian noise having σ = 5 was added. (c) Regions of attraction present within F7 displayed by their average intensity.


Figure 4.5 (cont.). (d) F7 vectors for each pixel within the box in (c). The vector directions align very closely with the perceived gradient direction at the given scale. This occurs even at complex geometries, such as at point P, which lies within region 100 and along a diffuse edge of low contrast and low signal-to-noise ratio, and which is in close proximity to much sharper and contrasted edges.

The transform, however, is a fundamentally different sort of operator from those of previous methods because it is based upon pixels grouping together on the basis of similarity. Information about discontinuities in local pixel populations arises purely as a by-product of this grouping process. For example, at a scale where the edge between regions 90 and 100 exists, point P experiences relatively little attraction to any pixels outside of region 100. Hence, both the number and the geometry of the structures outside of region 100 are largely irrelevant to the magnitude and direction of the computed force vector. As a result, the force vector directions align very

closely to the direction of the gradient. One can verify this in Fig. 4.5(d), which displays F7 as a needle diagram for each pixel within the box in (c). At this value of σ_g, all five structures are present, as can be seen from (c), which displays the average intensity of each of the regions of attraction present within F7. Because of the close alignment at any scale between the direction of a force vector and the gradient direction, the transform can be thought of as being very similar to a multiscale gradient operator, but with the bonus that local gradient maxima (edges) and minima (skeletons) are represented within the field by π phase discontinuities.

4.4. Identifying Regions of Attraction within F

This section presents a method for identifying the regions of attraction within F that correspond to relevant image structure. In order to illustrate the integration of edge- and region-based representations within F, Section 4.4.1 describes a purely edge-based structure extraction method that gives closed contours, and, hence, regions. In Section 4.4.2, a stability criterion is used to identify which of these structures is perceptually valid. Finally, the issue of a compact and efficient form for representing the segmentation is addressed in Section 4.4.3.

4.4.1. Identifying region boundaries

Because Eq. (3.10) holds along each boundary curve, the vectors along each boundary curve diverge from one another. For the interval of σ_g for which a structure is fully formed within F, these boundaries form closed contours. Thus, structure identification consists of searching F for all sets of boundaries that form closed contours. Toward this end, a local measure for a region boundary has to be defined. In a continuous space, one can show that the only place within F where vectors diverge from one another is at region boundaries, and that this divergence is always π, regardless of the boundary geometry [35]. However, discretization may reduce this divergence somewhat, and, for regions containing concavities, causes vectors to diverge from one another

across the line segment formed by joining the center of the concavity on the region boundary to the nearest point on the medial axis of the region, as illustrated in Fig. 4.6. In a continuous space, the angle between two vectors across segment AB approaches zero in the limit as the vectors are chosen arbitrarily close to AB. For a digital image, the finite spatial resolution causes a nonzero angle of at most ε to exist between the vectors across the line segment. The value of ε decreases as T increases; for T = 4,

    ε = 2 arctan( (1/√2 + 1/√5 + 1/√10) / (4 + 1/√2 + 2/√5 + 3/√10) ) ≈ π/7.

The angular distributions of the divergence present at region boundaries and concavities never overlap, so ε can be used as a threshold for classifying these two cases. Region boundaries are identified by comparing each vector with its eight nearest neighbors. For example, to determine whether or not a region boundary exists between (x_0, y_0) and (x_0 + 1, y_0) at σ_g = σ_g0, the following test is used

IF  [F_{σ_g0}(x_0, y_0)]_x < 0  &  [F_{σ_g0}(x_0 + 1, y_0)]_x > 0
    &  ( F_{σ_g0}(x_0, y_0) · F_{σ_g0}(x_0 + 1, y_0) ) / ( ‖F_{σ_g0}(x_0, y_0)‖ ‖F_{σ_g0}(x_0 + 1, y_0)‖ ) < cos ε      (4.11)
THEN (x_0, y_0) and (x_0 + 1, y_0) are boundary points.

Analogous tests are used for the other neighboring vector pairs.
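As an illustration, a minimal sketch of this test for horizontally adjacent pixels, assuming the field at a single value of σ_g is stored as component arrays Fx and Fy (the array layout, function name, and vectorized form are assumptions, not the thesis implementation):

import numpy as np

def horizontal_boundary_pairs(Fx, Fy, eps):
    """Apply the test of Eq. (4.11) between each pixel and its right-hand
    neighbour.  Fx, Fy hold the x and y components of the force field F at
    one homogeneity scale; eps is the divergence threshold (radians)."""
    ax, ay = Fx[:, :-1], Fy[:, :-1]          # vector at (x0, y0)
    bx, by = Fx[:, 1:], Fy[:, 1:]            # vector at (x0 + 1, y0)
    opposed = (ax < 0) & (bx > 0)            # x components point away from each other
    dot = ax * bx + ay * by
    norms = np.hypot(ax, ay) * np.hypot(bx, by)
    # the angle between the two vectors exceeds eps  <=>  cos(angle) < cos(eps)
    wide_angle = dot < np.cos(eps) * np.maximum(norms, 1e-12)
    return opposed & wide_angle              # True where a boundary separates the pair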

4.4.2. Structure stability

Not all of the structures identified in the previous section are perceptually valid. This is because σ_g reflects the relative homogeneity within a structure, but not the relative contrast of the structure with neighboring structures. The latter is reflected in the extent of a structure in F, or σ_g-lifetime. This is measured from the point at which a structure is half-formed to when it has half-disappeared. This results in a pixel being assigned to a structure for every value of σ_g.

Exact computation of the σ_g-lifetime of a structure requires examination of the σ_g-σ_s plot of each pixel within the structure for the discontinuities in σ_s that indicate transitions of the pixel into and out of the structure. This could be done but is unnecessarily expensive computationally. A simpler approach is to approximate the lifetime using only the structure boundary pixels, because a transition at a boundary pixel is reflected in an edge appearing or disappearing, and this edge information is already computed in Section 4.4.1.

Figure 4.6. The presence of a concavity causes vectors to diverge across the line segment formed by joining the concavity center on the region boundary (Point A) to the nearest point on the medial axis of the region (Point B).

A structure is considered stable with respect to σ_g, and, hence, perceptually relevant, if it obeys the following "octave" rule:

A structure with σ_g-lifetime [σ_1, σ_2] is stable iff σ_2 ≥ 2σ_1.

The octave rule essentially requires the homogeneity variation within a structure to be less than the relative contrast between the structure and neighboring structures. This results in structures that not only become less homogeneous as scale becomes coarser, but also have increasing contrast with neighboring regions. The octave rule yields good results in practice, but is not based on psychophysical criteria. Experiments are planned in which people subjectively evaluate segmentations produced by different rules in order to determine a perceptually valid criterion for identifying the regions of attraction that correspond to relevant structures.
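A minimal sketch of the octave rule as stated above (the function name is illustrative):

def is_stable(sigma_lifetime):
    """Octave rule: a structure with sigma_g-lifetime [s1, s2] is stable,
    and hence perceptually relevant, iff s2 >= 2 * s1."""
    s1, s2 = sigma_lifetime
    return s2 >= 2.0 * s1

# For example, is_stable((5.0, 12.0)) is True, while is_stable((5.0, 8.0)) is False.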

Figure 4.7. (a) 1-D signal with four levels. (b)-(d) Edges and some field vectors are shown for three different scales. (b) At fine scale, four regions are detected. (c) At a coarser scale, regions II and III merge together. (d) At an even coarser scale, the edge separating regions II and III has reappeared, resulting in one region consisting of regions I and II and another consisting of regions III and IV.

4.4.3. Segmentation representation

This section addresses the issue of how best to represent the segmentation obtained in Sections 4.4.1-4.4.2. The segmentation consists of a list of relevant structures along with their σ_g-lifetimes and constituent pixels. Because this segmentation retains the geometric fidelity of structure boundaries at all scales, the segmentation often can be represented as a merging pyramid, i.e., every structure consists of a set union of structures present at any finer scale. In certain isolated situations, however, this does not hold true, as is demonstrated in Fig. 4.7. A 1-D signal having four levels is shown in (a). At some initial scale, the resulting field is shown in (b) along with the three edges detected. These edges separate the signal into four regions, numbered as shown. At some coarser scale, (c), the edge separating regions II and III

disappears and these regions merge. However, at a yet coarser scale, (d), this edge reappears and the other two edges disappear.

Because a merging pyramid is such a convenient representation, the segmentation is post-processed to explicitly force it into this form. This is accomplished by projecting structure boundaries downward to all finer scales. This prevents certain regions, such as II and III in

Fig. 4.7, from merging together. The segmentation is now representable as a merging pyramid, such as the one in Fig. 4.8. Each node in the pyramid corresponds to a segmented structure.

Information about a structure that can be stored at its associated node includes average intensity, texture statistics, σ_g-lifetime, boundary chain-code, moments, and area. The base of the pyramid contains the finest scale structures, and incrementally coarser structures are represented as the pyramid is traversed upwards. The pyramid representation has many useful aspects, including efficient coarse-to-fine access to the image structures as well as compact storage of the segmentation.
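As a rough sketch, a node of such a merging pyramid might be represented as follows; the field names are illustrative, since the thesis lists candidate attributes but does not prescribe a data structure:

from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class PyramidNode:
    """One segmented structure in the merging pyramid."""
    mean_intensity: float
    area: int
    sigma_lifetime: Tuple[float, float]                            # (sigma_1, sigma_2)
    boundary_chain_code: List[int] = field(default_factory=list)
    children: List["PyramidNode"] = field(default_factory=list)   # finer-scale structures
    parent: Optional["PyramidNode"] = None                         # coarser structure it merges into

    def add_child(self, node: "PyramidNode") -> None:
        node.parent = self
        self.children.append(node)

def coarse_to_fine(node: "PyramidNode"):
    """Traverse the pyramid from coarse to fine (pre-order)."""
    yield node
    for child in node.children:
        yield from coarse_to_fine(child)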

4.5. Implementation Details

This section describes some implementation details of the image segmentation algorithm. The use of box-car functions for the homogeneity and spatial distance functions eliminates the three multiplies required to compute each pairwise pixel attraction vector. Hence, Eq. (4.1) can be evaluated using only look-up tables and integer additions. To further minimize the computation time, it is desirable to select σ_s^max to be as small as possible; it must, however, be large enough to span the most diffuse edge in the image, because at a pixel on the edge boundary (i.e., at the center of the edge profile) the spatial support of the transform should extend into the regions on either side of the edge. Experimentally, σ_s^max = 10 has been found to work well over a wide variety of images.

Figure 4.8. (a) Segmented regions at different scales, with the intensities of the finest scale regions labelled. (b) The pyramid that corresponds to this segmentation. Each node corresponds to a different region, and the range of σ_g for which each region is present is shown within each node.

Thus, in the evaluation of Eq. (4.3), if σ_s reaches σ_s^max, then the computation is stopped and the pixel is considered not to border a region boundary at the given value of σ_g. In addition, it is not necessary to compute F over the entire range of I(x, y) (assume 8-bit data, i.e., a range of 0-255). Edges of contrast less than 3 are not visible, and regions with a homogeneity variation greater than 50 typically correspond to structures that are coarser than desired, because most images have a graylevel histogram with a standard deviation of about 20. Thus, σ_g is restricted to the range σ_g ∈ [3, 50]. Furthermore, σ_g is subsampled within this range. We have found that 10

samples suffice to identify all structure present within this range. Thus, the scale at each pixel is selected from among 100 possible points within the σ_g-σ_s plane of that pixel.
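A small sketch of this sampling; the 10 samples over σ_g ∈ [3, 50] come from the text above, while the linear spacing and the number of σ_s samples per pixel are assumptions:

import numpy as np

SIGMA_G_MIN, SIGMA_G_MAX, N_SIGMA_G = 3.0, 50.0, 10   # range and sample count from the text
N_SIGMA_S = 10                                         # assumed sigma_s samples per pixel

# 10 homogeneity scales (linear spacing assumed)
sigma_g_samples = np.linspace(SIGMA_G_MIN, SIGMA_G_MAX, N_SIGMA_G)

# for each pixel, the scale is then chosen from roughly
# N_SIGMA_G * N_SIGMA_S = 100 candidate points in that pixel's sigma_g - sigma_s plane
n_candidates = N_SIGMA_G * N_SIGMA_S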

4.6. Experimental Results

This section presents the results of the described image segmentation method for a variety of images. Figures 4.9-4.15 use synthetic images to illustrate the performance of the transform in the presence of multiscale structure, complex boundary topology, different forms of the functions d_g(·) and d_s(·), shading, and noise. In all cases, the difference between the actual structure and the detected structure was zero. Next, the performance of the segmentation algorithm on several real images is demonstrated in Figs. 4.16-4.24.

Figure 4.9 demonstrates the ability of the transform to make structure explicitly available. Figure 4.9(a) consists of two rectangles having a constant intensity of 1 on a background of intensity 255. The σ_s values that correspond to F20 are shown intensity coded in (b), with image brightness directly proportional to the σ_s value. The vector directions of F20 are shown in (c), intensity coded so that the brightness is proportional to the clockwise angle the force vector makes from the positive x-axis. Because structure boundaries and skeletons are represented within F by diverging and converging field vectors, respectively, with phase differences of approximately π, they appear as sharp intensity discontinuities. Some intensity discontinuities, however, are artifacts resulting from the branch cut present along the positive x-axis, and should be ignored. In (d), the vector directions are displayed after first being rotated clockwise by π/2. This shifts the bright-dark intensity coding artifacts, allowing easier verification of the image structure in areas in (c) where such artifacts occur. In (e), the vectors in (c) are shown as a needle diagram after 5:1 spatial subsampling to accommodate display on the page. Note from (c)-(e) that the structure boundaries are all present, have closed contours, and retain geometric fidelity.


Figure 4.9. Demonstration of the properties of the transform and the region signatures through a simple synthetic image. (a) Two rectangles having a constant intensity of 1, on a background of intensity 255. (b) Intensity coded image of σ_s values chosen to correspond to the rectangular regions. Image brightness is proportional to the σ_s value. (c) Force vectors obtained after the transform has been applied using the σ_s values in (b).


Figure 4.9 (cont.). (e) The vectors in (c)-(d) shown as line segments. The length of the line segment represents the vector magnitude, and the tail of the vector is indicated by a small square.


Figure 4.10. Demonstration of the multiscale capability of the transform. (a) Same as Fig. 4.9, except that two new, lower contrast regions have been added. The new small rectangle has a graylevel of 30, and the new square has a graylevel of 200. The three contrasts present are 29, 55, and 254. (b) Intensity coded vector direction image using σ_g = 10. Because 10 is less than all contrasts, all four regions are detected. (c) Analogous to (b) using σ_g = 40. Because 40 > 29, the new rectangle is lost. (d) Analogous to (b) using σ_g = 60. Because 60 > 29, 55, both new regions are lost, leaving a result identical to Fig. 4.9(c).

Figure 4.11. Demonstration of the insensitivity of the transform to boundary curvature and topology. (a) Constant value regions having complex boundary structure. The boundaries contain smooth as well as high-curvature segments including corners. Some of the boundary points are vertices where more than two regions meet. All region boundaries have a contrast of at least 15 grey levels. (b) Intensity coded directions computed by the transform using σ_s values chosen corresponding to the region structure for σ_g = 10. (c) Same as (b), but with vector directions rotated 90° clockwise and then intensity coded. The regions are represented in (b)-(c) by their force field signature without distortion.

Figure 4.12. Examples of four functions which are used for both d_g(·) and d_s(·) in Fig. 4.13. (a) Box-car, (b) Gaussian, (c) Exponential, and (d) Linear. The σ shown indicates the σ_g and σ_s values used by d_g(·) and d_s(·), respectively.


Figure 4.13. Demonstration of the insensitivity of the transform to the choice of the functions d_g(·) and d_s(·). (a)-(d) Intensity coded directions of Fig. 4.11(a) computed by the transform using σ_s values chosen corresponding to the region structure for σ_g = 10. The forms used for d_g(·) and d_s(·) are box-car, Gaussian, exponential, and linear, respectively. The region structure is represented within the field without distortion in all cases.


Figure 4.14. Demonstration of the insensitivity of the transform to shading. (a) Same image as in Fig. 4.11(a), but different regions now have spatially varying greylevels. Here, the disk contains linearly increasing greylevels from left to right, whereas the polygonal region has quadratically increasing greylevels from right to left. The irregularly shaped center region contains quadratically varying greylevels which increase from bottom to top. (b) Intensity coded force directions after the transform has been applied using σ_g = 10. (c) Same as (b) but with σ_g = 25. At this value of σ_g, the microstructure within the irregular center region has disappeared.

Figure 4.15. Demonstration of the insensitivity of the transform to additive, white Gaussian noise. (a) Image containing two regions of intensity 0 and 100 on a background of intensity 200, and with zero-mean noise of standard deviation σ = 50. (b) Intensity coded directions computed using σ_g = 50. (c) Same as (b), but with vector directions rotated 90° counterclockwise and then intensity coded. (d)-(f) Same as (a)-(c), but with noise having σ = 100. Note that the structure boundaries are all present, have closed contours, and retain geometric fidelity (for example, the sharp corner of the intensity 100 region has been preserved without smoothing), except, of course, where the region structure has been changed because of the noise, in which case the new structure is reflected.

Figure 4.16. Demonstration of structures at different homogeneity scales present within F for a real image. (a) Donald Duck toy. (b)-(c) Intensity coded vector directions of F10 and F30, respectively. The background substructure, such as the musical notes, is encoded within F10, whereas the entire background is present as a single region in F30.

Figure 4.17. (a) An image containing structure at predominantly two scales, with jagged edges particularly visible inside the lighter, cellular regions. The structure is easy to follow visually, which makes it easy to test the transform performance with respect to boundary complexity. (b)-(c) Intensity coded vector directions of F7 and F26, which correspond to the two major scales.

Figure 4.17 (cont.). (d)-(e) Structures present at σ_g = 7 displayed by both an edge map and by average graylevel. (f)-(g) Same as (d)-(e), but for σ_g = 26.

Figure 4.18. (a) People at a 3-D movie. (b) Intensity coded vector directions of F30.


Figure 4.18 (cont.). (c) F30 with the vector directions displayed by a needle diagram for the pixels within the window in (a).


Figure 4.18 (cont.). (d) People at a 3-D movie. (e)-(g) Region boundaries present at, respectively, σ_g = 5, 21, 39.

Figure 4.19. (a) A sailboat on a lake. (b)-(d) Region boundaries present at, respectively, σ_g = 5, 10, 26.


Figure 4.19 (cont.). (e) A sailboat on a lake. (f)-(h) Average grayscales of the regions present at, respectively, σ_g = 5, 10, 26.


Figure 4.20. (a) Lena. (b)-(d) Region boundaries present at, respectively, σ_g = 7, 10, 21.


Figure 4.20 (cont.). (e) Lena. (f)-(h) Average grayscales of the regions present at, respectively, σ_g = 7, 10, 21.


Figure 4.21. (a) Aerial scene. (b)-(d) Region boundaries present at, respectively, σ_g = 7, 13, 26.

Figure 4.22. (a) Field of sunflowers. (b)-(d) Region boundaries present at, respectively, σ_g = 5, 13, 21.


Figure 4.23. (a) 2-D slice from a CT scan of a dog heart. (b)-(d) Region boundaries present at, respectively, σ_g = 5, 13, 21.


Figure 4.24. (a) Aerial image taken from a camera mounted beneath an airplane. (b)-(d) Region boundaries present at, respectively, σ_g = 5, 10, 26.

Figure 4.24 (cont.).

The ability of the transform to encode information at multiple homogeneity scales is demonstrated by Fig. 4.10. The image in (a) is the same as that of Fig. 4.9(a), except that two new lower contrast regions have been added. The field vectors are displayed in (b)-(d) at correspondingly coarser homogeneity scale. In (b), all of the regions are present, whereas in (c) one of the new low contrast regions has disappeared, and in (d) the other new region has disappeared. Fig. 4.11 demonstrates the insensitivity of the transform to boundary curvature and topology. The image in (a) contains regions with smooth, piecewise linear, and rough boundaries, as well as a vertex where eight regions meet and several sharp corners. Field vectors for F10 are shown in (b)-(c), from which one can see that all regions are represented within the force field without distortion.

Four different functions that could be used for d_g(·) and d_s(·) are displayed in Fig. 4.12. These functions include box-car, Gaussian, exponential, and linear distributions. In Fig. 4.13, the force field from Fig. 4.11(b) is shown computed using each of these four functions for both d_g(·) and d_s(·). The encoded structure remains unchanged for all four cases. The insensitivity of the transform to shading is demonstrated in Fig. 4.14. The image in (a) is identical to that of Fig. 4.11(a), except that both linear and quadratic shading are now present. F10 is displayed in Fig. 4.14(b), and is structurally equivalent to the F10 that is displayed in Fig. 4.11(b). In addition, F25 is displayed in Fig. 4.14(c). The microstructure within the irregular center region has now disappeared, and the remaining structure is unaffected by the shading.

Figure 4.15 demonstrates the insensitivity of the transform to Gaussian noise. Both (a) and (d) consist of a synthetic image containing two regions of intensity 0 and 100 on a background of intensity 200, corrupted by additive, zero-mean, white Gaussian noise of standard deviation σ = 50 in (a) and σ = 100 in (d). Vector directions of F50 are shown in (b) and (e), intensity coded as in Fig. 4.9(c), and in (c) and (f), intensity coded as in Fig. 4.9(d). Note that the structure

boundaries are all present and retain geometric fidelity (for example, the sharp corner of the

intensity 100 region has been preserved without smoothing), except, of course, where the region

structure has been changed because of the noise, in which case the new structure is reflected.

Figure 4.16 demonstrates structure of different homogeneity scales encoded within F for a real

image. Intensity coded vector directions of F10 and F30 are shown for an image of a Donald

Duck toy. The background substructure, such as the musical notes, is encoded within F10, whereas the entire background is present as a single region in F30.

Figures 4.17-4.24 demonstrate the segmentation performance on a variety of real images. For each image, the pyramid representing the multiscale segmentation is computed. The segmentation is visualized by displaying the structure boundaries, as well as the average grayscale of the structures for Figs. 4.19-4.20, for each structure present within the pyramid at a particular value of σ_g. For each image, three different values of σ_g were selected so that a wide variety of different structures can be seen. For Fig. 4.17, in addition to the segmentation results shown in (d)-(g),

F7 and F26 vectors are shown in (b) and (c), respectively. Within F7, the substructure present within the lighter, cellular regions is encoded, and within F26, the cellular regions themselves are present. Although these regions exist at a coarse homogeneity scale, one can verify by examining the field that the region boundaries have been preserved. The format of Fig. 4.18 is similar to that of Fig. 4.17, except that (b) shows F30 with the vector directions encoded as graylevels, and (c) shows the vectors within the white box in (a) displayed by a needle diagram.

Fig. 4.18(c) allows one to inspect closely the structures encoded within F30.

Figures 4.17-4.24 were selected to demonstrate the performance of this algorithm in a variety of different situations, for example, noise (Fig. 4.23), shading (Fig. 4.20), significant multiscale structure (Figs. 4.17, 4.19, 4.20, 4.23, 4.24), sharp and diffuse edges in close proximity

(Figs. 4.21, 4.23, 4.24), and high-curvature structure boundaries (Figs. 4.17, 4.18, 4.22). In all cases, the identified structures closely correspond to human perception of relevant structure, and the structure boundaries closely align with the actual boundaries in the image. Finally, the amount of computation required by this algorithm is reasonable. About 90 sec. is currently required to segment a 512 x 512 image on a Sun SPARC20, and a VLSI chip has been designed to compute the transform at frame rates (30 Hz).

4.7. Discussion

The application of a new nonlinear transform to the problem of image segmentation has been explained. The identified regions correspond to perceptually valid image structure, and the identified region boundaries align closely with the actual boundaries of the structures, regardless of the scale of the structures. Automatic selection of both the homogeneity and spatial scales avoids the need to make restrictive a priori assumptions about either the geometric or homogeneity characteristics of the structure. A pyramid is constructed that contains all of the structure in a given image, as well as the range of homogeneity scale for which each structure is present. This approach to structure detection is distinguished from previous methods in several respects. First, scale is formulated in a manner that naturally represents image structure.

Second, the processes of scale selection and structure detection are integrated and performed automatically. In addition, a unification between region and edge detection is achieved in the transformed domain. Finally, structure of arbitrary geometry can be detected without any smoothing of the structure boundaries, even at coarse scales.

5. MULTISCALE REGION-BASED 2-D MOTION ESTIMATION AND SEGMENTATION

This chapter applies the image segmentation algorithm described in the previous chapter to the problems of estimating the 2-D motion field from a time sequence of images and identifying regions characterized by similar motion (motion segmentation). The goal of the algorithm is to determine the 2-D motion field, the moving objects, the areas of occlusion and disocclusion, the background region, and the relative depth of the moving objects. No restrictive assumptions are made with regard to the type of motion (i.e., slow, temporally smooth, rigid) or the number of moving objects.

The algorithm operates on the images in a video sequence a pair at a time. A multiscale image segmentation is computed independently on each image in the frame pair. Correspondences are then computed between the identified structures in each of the two frames. The motion of the pixels within each set of matched structures is modelled by an affine transformation. Because structure detection is multiscale, a pixel that belongs to multiple structures may have several different affine transformations associated with it. An integration step then is used to combine the multiple motions together, resulting in a motion field for the frame pair. Motion segmentation, occlusion identification, and relative depth are then computed from the motion field.

The rest of this chapter is organized as follows: Section 5.1 reviews previous work on 2-D motion estimation, and Section 5.2 motivates the approach of our algorithm. Section 5.3 then gives an overview of the major steps in the algorithm. The next several sections describe each of these steps in more detail. Section 5.4 describes the current method for matching the segmented structures, Section 5.5 describes the process by which the affine transformations are computed for each set of matched structures, and Section 5.6 describes the process by

which the multiple motion estimates are integrated into a single motion field. This field is then segmented by the method given in Section 5.7. Next, Section 5.8 describes the occlusion and relative depth estimation steps, and Section 5.9 details the last two steps of the algorithm, namely, motion and occlusion prediction. Finally, Section 5.10 gives implementation details of the algorithm, Section 5.11 gives experimental results, and a final discussion of the method is given in Section 5.12.

5.1. Previous Work

Previous approaches to the problem of 2-D motion estimation can be classified as either pixel- based (intensity-based) or feature-based. Some reviews can be found in [58-60]. The pixel-based approaches (commonly referred to as optical flow methods) assume a direct relationship between object motion and intensity changes within a video sequence, i.e., they assume that intensity changes are caused by motion and that motion causes variations in intensity. As a result, motion estimation is formulated as an optimization problem where the motion field corresponds to the operator that best accounts for the intensity variations given various restrictions. Such methods include algorithms that utilize constraints based upon local spatial and temporal derivatives [12,

13], as well as the popular block-based correlation algorithm (BCA). These algorithms yield dense motion estimates.

Feature-based methods extract features from images and then match them across frames, thereby obtaining a displacement field. Such features include points defined by local intensity extrema [14], edges [15-17], corners [17, 18], and regions [19-24]. Most of these algorithms result in sparse motion fields.

5.2. Motivation for the Approach

Optical flow techniques often suffer difficulties near occlusion boundaries, where the motion

is coarse, and in areas where the aperture problem dominates (This refers to the difficulty in computing the component of the motion that is tangential to an isobrightness contour using

a local operator). The more sophisticated algorithms in this class, however, can handle these problems with some degree of success [61, 62]. The more fundamental problem with optical flow algorithms is the assumption of an equivalence between intensity changes and motion [63].

Intensity changes between image frames may be caused by changes in illumination, noise, and spatial sampling, in addition to the changes caused by motion. Also, it is not always correct that motion causes changes in intensity, a primary example being areas with little variation in intensity. The use of structures defined by homogeneity of intensity can help in both of these situations, however. The motion information in areas with little intensity variation is contained in the contours of the structures associated with such areas. In addition, structure geometry is fairly robust to noise and changes in illumination, so differences in the shape and position of the structure contours across time are generally caused by motion. Further, occluding contours are almost always defined by an intensity discontinuity at some scale, so the problems of identifying the occluding contours and obtaining good motion estimates near such contours are simplified with the use of structural information.

Non-region features (intensity extrema, corners, edges) provide some of the same benefits as image structure, and methods such as that of [17] integrate such information into the motion estimation process. These features have several disadvantages compared to region primitives, however. Regions are more stable with regard to noise and lighting changes, and the probability of an incorrect match is much lower because regions have a larger variety and complexity of

attributes than the other types of features. Also, non-region features provide information at only a sparse set of points. Because regions are 2-D primitives, they are much more comprehensive in that every pixel in the image belongs to at least one region.

A few other region-based motion algorithms exist besides the algorithm described in this chapter; however, they are all fairly simplistic. First, these methods all use one of the standard, single-scale segmentation algorithms. Segmentation errors, such as unintuitive regions and inexact region boundaries, resulting from the use of such algorithms increase the difficulty of finding region correspondences across frames. In addition, the region hierarchy provided by a multiscale algorithm provides a much richer description of regions available for matching. Both structural changes and noise within a certain area of an image may cause an absence of matches for regions within that area at a particular scale. However, it is often the case that matches can be found within that area at other scales. As a result, a multiscale method is able to find region correspondences over a larger fraction of the image than a method that extracts regions at only a single scale. Second, these other methods use quite simple approaches to obtain region correspondences. In [24], matching is done by 3-D segmentation (x, y, t) under the assumption that regions overlap from frame to frame, and [20, 22, 64] all perform matching using heuristics on region attributes such as average intensity, area, and moments. In addition, these three methods compute a sparse motion field by assigning to the matched regions motion vectors corresponding to the displacement of their centroids. The motion of the pixels in a matched region pair is also modelled by an affine transformation in [21] and [24]; however, neither of these two methods attempts to account for the change in region shape resulting from occlusion.

All types of motion estimation algorithms make errors because of occlusion, but region-based methods are particularly sensitive to this problem. In pixel-based methods, occlusion can cause motion vectors to be estimated incorrectly in the neighborhood of the occlusion. For feature-based methods, occlusion may cause matches not to exist for features near the occlusion, and, as a result, the motion cannot be estimated in these areas. However, if the features are regions, it is usually the case that the occlusion only partially obscures a region. Hence, a region unoccluded in one frame and partially occluded in the following frame will still be matched, but the partial occlusion can change the shape of the region enough to cause the estimated motion parameters to be incorrect. If this region is large, motion vectors may be estimated incorrectly far away from the occlusion. Thus, occlusion can cause global errors in a region-based method. The method presented in this chapter reduces this problem by estimating and compensating for the effects of occlusion.

5.3. General Overview of the Algorithm

A general overview of the approach is given in Fig. 5.1. Let an image at frame t of a video sequence be given by I_t(x, y). The algorithm processes a video sequence two frames at a time.

For a pair of frames, (I_t, I_{t+1}), the algorithm computes a 2-D motion field, M_{t,t+1}(x, y), which describes the motion of each pixel in the image from frame t to frame t+1. The motion field is then segmented into connected areas characterized by similarity of motion over the pair of frames. This motion segmentation is denoted as MS_{t,t+1}(x, y), where the pixels in each region take on a distinct value that corresponds to that region. Because an image is a 2-D projection of a 3-D world, objects at different depths from the camera occlude one another. As objects move, certain areas of the image disappear (become occluded) while other areas appear (become disoccluded). Thus, the algorithm also computes an occlusion image, O_{t,t+1}(x, y), which contains the areas that become occluded and disoccluded between the pair of frames. Finally, the algorithm computes a layer image, L_{t,t+1}(x, y), which reflects the algorithm's current understanding of the relative depth among the objects in the scene. A value is assigned to each motion region in MS_{t,t+1}(x, y) that is inversely related to the distance of the region from the camera. Thus, the background region is given the smallest value, and the nearest object to the camera is given the largest value.

Figure 5.1. Block diagram of the motion estimation and segmentation algorithm. The letter D indicates a one-timestep delay element.

The region matching process is formulated as a graph matching problem. Three preselected values of homogeneity scale are used as indexes into the segmentation pyramid of each image to produce three different image partitions. Each pair of partitions at the same scale is matched

from coarse to fine, with coarser scale matches guiding the finer scale matching. Each partition is represented as a region adjacency graph (RAG), within which each region is represented as a node and region adjacencies are represented as edges. Region matching at each scale then consists of finding the set of graph transformation operations (edge deletion, edge and node matching, and node merging) of least cost that create an isomorphism between the current graph pair. The cost of matching a pair of regions takes into account their similarity with regard to area, average intensity, expected position as estimated from each region's motion in previous frames, and the spatial relationship of each region with its neighboring regions. Once the image partitions at the three different homogeneity scales have been matched, matchings are then obtained for the regions in the first frame of the frame pair that were identified by the motion segmentation module at the previous timestep. The match in the second frame for each of these motion regions is given as the union of the matches of the set of finest scale regions that comprise the motion region. This gives a fourth matched pair of image partitions, and is considered to be the coarsest scale set of matches.

The affine estimation step computes an affine transformation for each set of matched regions, at all four scales. The change in shape of the regions resulting from occlusion is estimated by predicting the motion for the current frame pair on the basis of the motion from the previous frame pair, and then utilizing the computed layer information from the previous frame pair to determine areas of occlusion and disocclusion. The predicted occlusion information is used to compensate for the effects of occlusion on the affine transformation parameter estimation.
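For illustration, a sketch of estimating the six affine parameters of one matched region pair by least squares, assuming pixel correspondences between the two frames are available; the occlusion compensation described above is omitted, and the names used here are not from the thesis:

import numpy as np

def fit_affine(src_pts, dst_pts):
    """Least-squares fit of the affine transformation
        x' = a1*x + a2*y + a3,   y' = a4*x + a5*y + a6
    mapping points of a region in frame t to the matched region in frame t+1.
    src_pts, dst_pts : (N, 2) arrays of corresponding (x, y) points.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    A = np.hstack([src, np.ones((len(src), 1))])            # design matrix [x y 1]
    px, *_ = np.linalg.lstsq(A, dst[:, 0], rcond=None)      # a1, a2, a3
    py, *_ = np.linalg.lstsq(A, dst[:, 1], rcond=None)      # a4, a5, a6
    return np.concatenate([px, py])

def affine_motion(params, xs, ys):
    """Motion vectors (u, v) = (x' - x, y' - y) implied by the affine model."""
    a1, a2, a3, a4, a5, a6 = params
    return a1 * xs + a2 * ys + a3 - xs, a4 * xs + a5 * ys + a6 - ys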

The computed affine parameters give a motion field at each of the four scales. These motion fields are then integrated into a single motion field by taking the coarsest motion field and then comparing, at each matched region at the next finer scale, the image prediction error generated

81 by the current motion field and the motion field at the next finer scale. At any region where

the prediction error using the finer scale motion improves by a specified percentage, the current

motion is replaced by the finer scale motion. This process is repeated recursively downward

in scale. This biases the final motion field toward the coarser scale motion fields. Because the algorithm only processes a video sequence two frames at a time, situations arise in which the prediction error generated by the true motion is somewhat larger than the error generated by some other motion. Since motion is typically temporally consistent, especially over adjacent pairs of frames, it is desirable to slightly bias the final motion field in favor of temporally consistent trajectories. For the other three scales, it is also desirable to favor the motion of the coarser scale regions. Coarser regions are larger than finer scale regions, which reduces the probability of these regions being incorrectly matched, and they also contain more intensity variation and have more complex boundaries, both of which make the estimated affine transformation more likely to represent the true motion of such regions. The coarser scale region bias is only slight, however, because the affine motion model is less likely to be valid for such regions, and there is an increased likelihood that these regions cross an occlusion boundary.
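A simplified sketch of one refinement step of this integration, assuming dense motion fields and finer-scale region labels are available; the forward-warping prediction error and the improvement threshold used here are illustrative choices rather than the thesis's exact criteria:

import numpy as np

def integrate_motion(coarse_motion, fine_motion, region_labels,
                     frame_t, frame_t1, improvement=0.1):
    """Keep the coarser motion everywhere except in regions where the finer-scale
    motion lowers the image prediction error by more than `improvement`.

    coarse_motion, fine_motion : (H, W, 2) motion fields (u, v)
    region_labels              : (H, W) labels of the finer-scale matched regions
    frame_t, frame_t1          : (H, W) grayscale images
    """
    frame_t = np.asarray(frame_t, dtype=float)
    frame_t1 = np.asarray(frame_t1, dtype=float)
    H, W = frame_t.shape
    ys, xs = np.mgrid[0:H, 0:W]
    out = coarse_motion.copy()

    def prediction_error(motion, mask):
        # predict frame t+1 by moving each pixel of frame t along its motion vector
        xw = np.clip(np.round(xs + motion[..., 0]).astype(int), 0, W - 1)
        yw = np.clip(np.round(ys + motion[..., 1]).astype(int), 0, H - 1)
        return np.abs(frame_t1[yw, xw] - frame_t)[mask].mean()

    for label in np.unique(region_labels):
        mask = region_labels == label
        if prediction_error(fine_motion, mask) < (1.0 - improvement) * prediction_error(coarse_motion, mask):
            out[mask] = fine_motion[mask]     # finer motion wins only by a clear margin
    return out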

The resulting motion field is then segmented into regions of similar motion using a similarity threshold. By examining the image prediction error in areas where two motion regions overlap, occlusion and layer maps are generated. Next, the motion for the next frame pair is predicted by assuming that the affine transformation parameters describing the motion in the next frame pair will be identical to those computed for the present frame pair. This motion prediction plus the current layer information allow the occlusion maps to be estimated for the next frame pair.

The motion prediction is used at the next frame pair to guide the region matching process, and

the occlusion prediction is used to compensate for the effects of occlusion on the affine parameter estimation process.

Each of the modules in the algorithm is now discussed in more detail.

5.4. Region Matching

Each pair of partitions at the same scale is matched from coarse to fine, with coarser scale matches guiding the finer scale matching. Each partition is represented as a region adjacency graph (RAG), within which each region is represented as a node and region adjacencies are represented as edges. Figure 5.2 gives an example of an image partition and its associated RAG.

The image frame is represented by the 0 node. Region matching at each scale then consists of finding the set of graph transformation operations (edge deletion, edge and node matching, and node merging) of least cost that create an isomorphism between the current graph pair.
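A minimal sketch of constructing a RAG from a labelled image partition; 4-connectivity and the frame node 0 follow Fig. 5.2, but the cyclic ordering of edges around each node, on which the matching algorithm relies, is omitted here:

import numpy as np
from collections import defaultdict

def build_rag(labels):
    """Build a region adjacency graph from a labelled partition.

    labels : (H, W) integer array; each region carries a distinct positive label.
    Returns a dict mapping each node to the set of adjacent nodes.  Node 0
    stands for the image frame and is adjacent to every border region.
    """
    rag = defaultdict(set)
    pairs = [(labels[:, :-1], labels[:, 1:]),    # horizontally adjacent pixels
             (labels[:-1, :], labels[1:, :])]    # vertically adjacent pixels
    for a, b in pairs:
        diff = a != b
        for u, v in zip(a[diff], b[diff]):
            rag[int(u)].add(int(v))
            rag[int(v)].add(int(u))
    border = np.unique(np.concatenate([labels[0], labels[-1], labels[:, 0], labels[:, -1]]))
    for u in border:                             # frame node 0 touches all border regions
        rag[0].add(int(u))
        rag[int(u)].add(0)
    return dict(rag)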

Our region matching algorithm is based on that of [25], but has some advantages with regard to reduced computational complexity, coarse to fine matching, and ability to match properly regions lying entirely inside another region.

Some notation and planar graph concepts are now introduced. Let the two RAGs to be matched be denoted by G and G'. Because each graph represents a 2-D image, the graphs are planar. Denote the ith node in G by N_i, the jth node in G' by N'_j, and a matched pair of nodes by (N_i, N'_j). Define the terms articulation node, biconnected graph, and biconnected component as follows:

Articulation node. If there is a triple of distinct nodes N_1, N_2, N_3 in graph G such that N_3 lies on every path that connects N_1 and N_2, then N_3 is an articulation node of G.

Figure 5.2. An image partition containing six regions, and its associated region adjacency graph (RAG). The image frame is denoted by a 0.

Biconnected graph. A graph G is biconnected if for each triple of distinct nodes N_1, N_2, N_3 in G, there exists a path between N_1 and N_2 such that N_3 is not on this path.

Biconnected component. A maximal subgraph that is biconnected.

Articulation nodes correspond to regions, such as C in Fig. 5.2, which fully enclose other regions. Each RAG is partitioned into subgraphs consisting of each biconnected component in the graph. The subgraph that contains the image frame node is considered to be the main subgraph. The edges emanating from each node are represented in the order they occur as a region boundary is traced clockwise (counterclockwise if the node is an articulation node and the contour is one of the node's inner contours). Figure 5.3 shows the appropriate partitioning of the RAG in Fig. 5.2. The matching algorithm proceeds by matching the two main subgraphs, and, whenever an articulation node is matched, incorporating the subgraphs corresponding to the articulation node into the main subgraph of the node. Edges are represented cyclically because

Figure 5.3. The RAG in Fig. 5.2 partitioned into subgraphs consisting of biconnected components.

of the order preserving property of planar graphs:

If a node in a biconnected component of one graph is matched to a node in a biconnected component of another graph, then the edges between the corresponding nodes must be matched in a way that is cyclically order preserving.

Denote the jth edge at node N_i by N_{i_j}, with the edges subscripted by traversing a node clockwise. Let (N_{i_j}) represent the same edge at the corresponding adjacent node, which itself is denoted as [N_{i_j}]. Further, let (N_{i_j})_{±n} be the nth edge from (N_{i_j}) measured clockwise (+) or counterclockwise (−) around [N_{i_j}]. Thus, in Fig. 5.2, if N_{i_j} is the edge at node A and adjacent to node B, then [(N_{i_{j+1}})_{-2}] = C and ((N_{i_{j+1}})_{-2}) is the edge at C and adjacent to D. Also, by simple region adjacency, note that the following relation must hold

    ∀ N_{i_j} :  ((N_{i_j})_{+1}) = …      (5.1)

The isomorphism between G and G' is constructed from four basic graph operations:

1. Node matching. A match (N_i, N'_j) is created.

2. Edge matching. When two node pairs (N_i, N'_j) and (N_k, N'_l) are matched, the edge between N_i and N_k is matched to the edge between N'_j and N'_l, if both edges exist.

3. Edge deletion. Deletion of an edge N_{i_j} causes the insertion of another edge in order to maintain the relationship of Eq. (5.1).

4. Node merging. A node N_i adjacent to node N_k may be merged into N_k, resulting in the deletion of their common edges and a relabelling of the edges N_{i_j} as being attached to node N_k.

Edge deletion and node merging operations may occur on either graph. Each operation has an associated cost. The goal of the region matching algorithm is to find the set of operations of minimal cost that create an isomorphism between graphs G and G'. The algorithm is simple and has linear time complexity, but is not guaranteed to find the optimal matching.

5.4.1. Region matching algorithm overview

The region matching is performed by iteratively building isomorphic basic RAGs (BRAGs) of minimal cost at each pair of matched nodes. A BRAG consists of a matched node N_i, its associated edges N_{i_j}, and its neighboring nodes [N_{i_j}]. The algorithm utilizes two FIFO queues, Q and Q2, which contain the current set of matched nodes for which isomorphic BRAGs have not been computed. Q is initialized to the frame pair (0, 0'), and Q2 is initially empty. The algorithm proceeds as follows:

While Q not empty {
    get matched pair from Q
    if conflict and pair not previously on Q2 then put pair on Q2
    else if articulation node then incorporate associated subgraphs
    else match BRAGs and put new matches on Q
    if Q empty and Q2 not empty then move a matched pair from Q2 onto Q
}

Each pair of BRAGs is matched independently of any previously obtained matches. New pairs of matched nodes created by this matching are placed onto Q. If a pair of nodes is matched that causes a conflict with a previously obtained matching, for example a current match (N_i, N'_k) and a previous match (N_i, N'_j), then the previous match is marked to indicate that a conflict has occurred. If the previous match has not yet been processed, then all unconnected matches are first processed before the previous match. The main reason for this is that most conflicts occur as a result of the two conflicting regions not yet having merged. In the example, the correct match may be (N_i, N'_j ∪ N'_k), but the conflict occurs because N'_j and N'_k have not yet merged. Matching the BRAGs at (N_i, N'_j) rather than (N_i, N'_j ∪ N'_k) generally results in edges being incorrectly deleted, which may cause some nodes not to be matched because all of their edges were deleted. Thus, deferring the processing of (N_i, N'_j) as long as possible increases the likelihood of N'_j and N'_k first merging. In addition, if the conflict does indicate that the previous match is incorrect, then deferring the processing of this match minimizes the influence of the mismatch on the entire graph matching process.

Despite this deferred merging, some conflicted matched pairs may be processed before the conflicted regions have merged. As a result, the algorithm is run a second time, but with Q

initially containing all unconnected pairs of matched nodes from the previous pass for which

matching of their associated BRAGs resulted in regions being merged, as well as the frame pair

(0,0'). As a result, almost all conflicts occurring during the second pass are due to incorrect

matches. Conflicts are resolved by selecting the matched pair (with the union case included)

whose associated BRAG matching cost is the least. For the previous example, this amounts

to selecting the minimum BRAG matching cost among the three matched pairs: (N_i, N'_k),

(N_i, N'_j), and (N_i, N'_j ∪ N'_k).

The procedures for matching BRAGs and incorporating subgraphs at articulation nodes into the main subgraph are now described.

5.4.2. Matching of BRAGs

Assume the BRAGs at the matched pair (N_i, N'_j) are to be matched, and let N_i have edges {N_{i_1}, N_{i_2}, ..., N_{i_k}} and neighbors {[N_{i_1}], [N_{i_2}], ..., [N_{i_k}]}, and let N'_j have edges {N'_{j_1}, N'_{j_2}, ..., N'_{j_l}} and neighbors {[N'_{j_1}], [N'_{j_2}], ..., [N'_{j_l}]}. Because edges must be matched in an order-preserving fashion, the problem of determining the optimal isomorphism of two BRAGs is essentially a circular string matching problem. The minimal cost, C(k, l), of this isomorphism is computed recursively as follows:

    C(k, l) = min{ C(k−1, l−1) + C_mt([N_{i_k}], [N'_{j_l}]),
                   C(k, l−1) + C_de(N'_{j_l}),
                   C(k−1, l) + C_de(N_{i_k}),
                   C(k−1, l−1) + C_dm(N'_{j_l}),
                   C(k−1, l−1) + C_dm(N_{i_k}),
                   C(k, l−1) + C_mg(N'_{j_l}),
                   C(k−1, l) + C_mg(N_{i_k}) }      (5.2)

subject to the constraint

    C(x, y) = { 0,  if x = y = 0
                ∞,  if x < 0 or y < 0      (5.3)

The cost of matching nodes [N_{i_k}] and [N'_{j_l}] is defined as

C_mt([N_{i_k}], [N'_{j_l}]) = { ΔI/σ_g,  if ΔI < σ_g and χ_CS = χ_D = 1
                              { ∞,        else        (5.4)

where ΔI([N_{i_k}], [N'_{j_l}]) is the difference in average intensity of the two regions, σ_g is the homogeneity scale of the current partition, and χ_CS([N_{i_k}], [N'_{j_l}]) and χ_D([N_{i_k}], [N'_{j_l}]) are indicator functions that reflect whether the regions may be treated as possible match candidates.
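For illustration only (Python, not from the thesis; the intensity difference and the two indicator values are assumed to be computed elsewhere), Eq. (5.4) amounts to:

def c_mt(delta_I, sigma_g, chi_cs, chi_d):
    # Eq. (5.4): normalized intensity difference, or an infinite cost when the
    # regions differ too much in intensity or fail the candidacy tests.
    if delta_I < sigma_g and chi_cs and chi_d:
        return delta_I / sigma_g
    return float("inf")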

Matches previously obtained at the next coarser homogeneity scale partition (if it exists) are used to constrain the current scale matching process via χ_CS. Let the nodes N_x and N'_y at the current scale correspond to a pair of nodes at the coarser scale. In lieu of requiring that this coarser-scale pair be a matched pair in order for χ_CS(N_x, N'_y) = 1, the less strict criterion is adopted that either the coarser-scale node corresponding to N_x, or one of its neighbors, be matched to the coarser-scale node corresponding to N'_y.

The indicator function χ_D(N_x, N'_y) determines whether or not the spatial positions of the regions corresponding to the nodes N_x and N'_y are close enough for these nodes to be possible matching candidates, and is defined as

χ_D(N_x, N'_y) = { A(B_e(Ñ_x) ∩ B(N'_y)) > α·A(B(N'_y)) } ∪ { A(B(Ñ_x) ∩ B_e(N'_y)) > α·A(B(Ñ_x)) }        (5.5)

Ñ_x represents the shape and position of region N_x in the following frame, as determined by the affine transformation predicted for this region by the motion prediction module. If no prediction exists, then Ñ_x = N_x. The bounding box of a region is given by B(·), whereas B_e(·) represents this bounding box extended outward by ε_1 pixels. If no prediction exists for N_x, then B_e(·) extends all bounding boxes by ε_2 pixels, where ε_2 > ε_1. Finally, the area of a box is given by A(·), and α is a constant in the range 0 < α < 1.
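The proximity test of Eq. (5.5) can be sketched as follows (Python, added for illustration; the box representation, the helper functions, and the default values, taken from the parameter list in Section 5.10, are assumptions of this sketch):

def area(box):
    # box = (xmin, ymin, xmax, ymax)
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])

def expand(box, e):
    return (box[0] - e, box[1] - e, box[2] + e, box[3] + e)

def intersection(a, b):
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))

def chi_D(box_x_pred, box_y, alpha=0.9, eps=20):
    """Eq. (5.5): are the two regions spatially close enough to be match candidates?

    box_x_pred: bounding box of N_x mapped into the next frame by its predicted
                affine transformation (or its original box if no prediction exists)
    box_y:      bounding box of N'_y
    """
    return (area(intersection(expand(box_x_pred, eps), box_y)) > alpha * area(box_y) or
            area(intersection(box_x_pred, expand(box_y, eps))) > alpha * area(box_x_pred))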

C_de(N_{i_k}) is the cost of deleting edge N_{i_k}, and is defined as

C_de(N_{i_k}) = 1 / (1 + exp(-T(l - ρ·l̄)))        (5.6)

where l is the length of the region boundary corresponding to N_{i_k}, l̄ is the average length of the edges N_{i_m}, ρ is a constant in the range 0 < ρ < 1, and T is a factor controlling the gradient of the function.

C_dm(N_{i_k}) considers the case where [N_{i_k}] matches none of the [N'_{j_l}], but [N'_{j_l}] matches either of [(N_{i_k})_{±2}]. Define

C_dm(N_{i_k}) = C_de(N_{i_k}) + min{ C_mt([(N_{i_k})_{+2}], [N'_{j_l}]), C_mt([(N_{i_k})_{-2}], [N'_{j_l}]) }        (5.7)

Finally, C_mg(N_{i_k}) is the cost of merging [N_{i_k}] into [N_{i_q}], where q < k and ([N_{i_q}], [N'_{j_r}]) is the last node pair matched at the present C(k, l); its definition, Eq. (5.8), involves the term C_a(·), the cost calculated for the case when the area of the merged nodes is greater than that of their corresponding match, with

C_a(·) = 0,  if A([N_{i_k}]) + A([N_{i_q}]) ≤ A([N'_{j_r}]).        (5.9)

Because {N_{i_1}, N_{i_2}, ..., N_{i_k}} and {N'_{j_1}, N'_{j_2}, ..., N'_{j_l}} are circular sequences, C(k, l) is computed k times, with the sequence {N_{i_1}, N_{i_2}, ..., N_{i_k}} shifted by one each time. The minimum of these C(k, l) corresponds to the final minimal cost, and the associated graph-transforming operations are found by tracing back the path of minimal cost from this C(k, l) to C(0, 0).
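The recursion in Eqs. (5.2)-(5.3) is a cyclic edit-distance computation. The following sketch (Python, added for illustration; the cost callbacks are hypothetical stand-ins for C_mt, C_de, C_dm, and C_mg, and the trace-back of the minimizing operations is omitted) evaluates the cost table once per cyclic shift of the first edge sequence:

import math

def brag_match_cost(edges_a, edges_b, c_mt, c_de, c_dm, c_mg):
    """Minimal isomorphism cost between two circular edge sequences.

    edges_a, edges_b: edge lists of the two BRAGs
    c_mt, c_de, c_dm, c_mg: cost callbacks corresponding to Eqs. (5.4)-(5.9)
    """
    k = len(edges_a)
    best = math.inf
    for shift in range(k):  # Eq. (5.2) is evaluated once per cyclic shift
        a = edges_a[shift:] + edges_a[:shift]
        best = min(best, _linear_cost(a, edges_b, c_mt, c_de, c_dm, c_mg))
    return best

def _linear_cost(a, b, c_mt, c_de, c_dm, c_mg):
    k, l = len(a), len(b)
    C = [[math.inf] * (l + 1) for _ in range(k + 1)]
    C[0][0] = 0.0  # boundary condition of Eq. (5.3)
    for x in range(k + 1):
        for y in range(l + 1):
            if x == 0 and y == 0:
                continue
            cands = []
            if x > 0 and y > 0:
                cands.append(C[x - 1][y - 1] + c_mt(a[x - 1], b[y - 1]))  # match neighbors
                cands.append(C[x - 1][y - 1] + c_dm(b[y - 1]))            # double match, Eq. (5.7)
                cands.append(C[x - 1][y - 1] + c_dm(a[x - 1]))
            if y > 0:
                cands.append(C[x][y - 1] + c_de(b[y - 1]))                # delete an edge of N'_j
                cands.append(C[x][y - 1] + c_mg(b[y - 1]))                # merge a neighbor of N'_j
            if x > 0:
                cands.append(C[x - 1][y] + c_de(a[x - 1]))                # delete an edge of N_i
                cands.append(C[x - 1][y] + c_mg(a[x - 1]))                # merge a neighbor of N_i
            C[x][y] = min(cands)
    return C[k][l]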

5.4.3. Subgraph incorporation at articulation nodes

If either of the nodes in a matched pair is an articulation node, each of the nodes is deleted from its main subgraph, and any subgraphs associated with the node are then incorporated into the main subgraph. This process is described pictorially in Fig. 5.4. In (a), region A is an articulation node with three associated subgraphs. The most reasonable procedure for deleting A is probably to assign each pixel in A to the region to which it is closest. Such an assignment, along with the resulting RAG, is depicted in (b). Note that all subgraphs have been incorporated into the main subgraph, and that the main subgraph remains biconnected. This procedure was judged to be too computationally intensive, however, and a procedure that yields an approximation of this result is used instead. This approximation first identifies the midpoint of every boundary between A and a neighboring region, and then computes the Delaunay neighbors of the set of midpoints.

The resulting RAG using such an approach is shown in (c), and is quite similar to that of (b).

Because this procedure is only an approximation, it is important that no preexisting adjacency relations be altered. This is accomplished by adding only the following subset of the Delaunay edges to the subgraphs: Let the ith boundary adjacent to A be given by N_{A_i}. For each [N_{A_i}] belonging to a graph of at least three nodes, consider the clockwise circular sequence formed from the identified Delaunay edges of this node. If a substring {(N_{A_i})_{-1}, ..., (N_{A_i})_{+1}} exists for which (N_{A_i})_{-1} is the first edge and (N_{A_i})_{+1} is the last edge, then the edges in this substring (besides the first and last, which are already present) are added to the subgraphs.

Figure 5.4. (a) Region A is an articulation node. It is to be deleted and its three associated subgraphs incorporated into the main subgraph. In (b), this is done by assigning each pixel in A to the region to which it is closest, and results in the shown RAG. In (c), a Delaunay graph is computed using the midpoints of each boundary between A and a neighboring region.

Although this preserves all of the previous adjacency relationships, it will not necessarily be the case that the resulting main subgraph satisfies Eq. (5.1). As a result, at every edge N_{i_j} that is adjacent to one of the [N_{A_i}] and for which Eq. (5.1) does not hold, an edge is added from between (N_{i_j})_{-1} and N_{i_j} to between (N_{i_j})_{+1} and (N_{i_j})_{+2}. Finally, all of the [N_{A_i}] that have been matched previously and are not currently on either Q or Q2 are placed onto Q in order to establish correspondences for the new edges.

5.5. Affine Transformation Parameter Estimation

The region matching process yields matched regions at four different scales. Because the region matching algorithm allows for region merging, groups of regions generally are matched to one another as opposed to simple one-to-one region correspondences. For each matched region group, the best affine transformation between them is estimated iteratively. Denote by R_i^t the ith region group in frame t, and its matched group by R_i^{t+1}. Also, denote the coordinates of the pixels within R_i^t by (x_{ij}^t, y_{ij}^t), with j = 1..|R_i^t|, where |R_i^t| is the cardinality of R_i^t, and denote the pixel nearest the centroid of R_i^t by (x̄_i^t, ȳ_i^t). Each (x_{ij}^t, y_{ij}^t) is mapped by an affine transformation to the point (x_{ij}^{t+1}, y_{ij}^{t+1}) according to

(x_{ij}^{t+1}, y_{ij}^{t+1})^T = [ A_k (x_{ij}^t - x̄_i^t, y_{ij}^t - ȳ_i^t)^T + T_k ] + (x̄_i^{t+1}, ȳ_i^{t+1})^T        (5.10)

A 2 × 2 deformation matrix, A_k, and a translation vector, T_k, comprise the affine transformation. The subscript k indicates the iteration number, and [·] indicates a vector operator that rounds each vector component to the nearest integer. Define the indicator functions

λ_i^t(x, y) = { 1, (x, y) ∈ R_i^t ; 0, else }        λ_i^{t+1}(x, y) = { 1, (x, y) ∈ R_i^{t+1} ; 0, else }        (5.11)

The amount of mismatch is measured as

M_i^k = Σ_{x,y} |I_t(x, y) - I_{t+1}(x', y')| · λ_i^t(x, y)  +  (a term in λ_i^t and λ_i^{t+1} penalizing pixels where the mapped region and its match do not overlap)        (5.12)

where (x', y') denotes the point to which (x, y) is mapped by Eq. (5.10). The rounding performed in Eq. (5.10) causes M_i^k to be larger than would be the case if the mapped coordinate were left real-valued and interpolation kernels were used to estimate I_{t+1} at that coordinate from its four nearest pixels. However, the estimated affine transformations are very similar with and without rounding.

The affine transformation parameters that minimize M_i^k are estimated iteratively using a straightforward local descent criterion. As initial guesses, we set T_0 = 0 and A_0 = I, where I is the identity matrix. The initial guess is usually very close to the optimal transformation; hence, it is assumed that the first local minimum reached in the descent process corresponds to the global minimum. Typically, about ten iterations are required for convergence.
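One way to realize the local descent described above is sketched below (Python, added for illustration; the perturbation step sizes and the coordinate-wise search order are choices of this sketch, not details from the thesis):

import numpy as np

def estimate_affine(pixels, centroid_t, centroid_t1, mismatch, step_A=0.02, step_T=0.5):
    """Local descent over the affine parameters (A, T), starting from A = I, T = 0.

    pixels:      (n, 2) array of pixel coordinates within R_i^t
    centroid_t:  pixel nearest the centroid of R_i^t
    centroid_t1: pixel nearest the centroid of R_i^{t+1}
    mismatch:    callable returning the mismatch M_i for the mapped coordinates
    """
    A, T = np.eye(2), np.zeros(2)

    def mapped(A, T):
        # Eq. (5.10): transform centered coordinates, round, and re-center.
        rel = np.asarray(pixels, float) - centroid_t
        return np.rint(rel @ A.T + T) + centroid_t1

    best = mismatch(mapped(A, T))
    improved = True
    while improved:                      # stop at the first local minimum reached
        improved = False
        for idx in [(0, 0), (0, 1), (1, 0), (1, 1)]:
            for s in (step_A, -step_A):
                A2 = A.copy(); A2[idx] += s
                m = mismatch(mapped(A2, T))
                if m < best:
                    A, best, improved = A2, m, True
        for i in range(2):
            for s in (step_T, -step_T):
                T2 = T.copy(); T2[i] += s
                m = mismatch(mapped(A, T2))
                if m < best:
                    T, best, improved = T2, m, True
    return A, T, best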

The region matching algorithm measures region similarity with respect to average intensity, area, position, and region adjacencies, but region shape is not taken into account. It is often the case that an incorrectly matched pair of regions have different shapes. This typically results in a computed affine transformation with a larger amount of deformation than well-matched regions exhibit over adjacent frames. Thus, letting a_{ij} denote the elements of A, if

(|1 - a_{00}| > ε) ∪ (|a_{01}| > ε) ∪ (|a_{10}| > ε) ∪ (|1 - a_{11}| > ε)        (5.13)

is true, a match is considered incorrect and is deleted.
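A direct transcription of the test in Eq. (5.13) might look like the following (Python, added for illustration; the function name is ours, and eps corresponds to ε = 0.3 from Section 5.10):

def excessive_deformation(A, eps=0.3):
    # Eq. (5.13): a match is deleted if the deformation matrix departs too far
    # from the identity in any single element.
    return (abs(1 - A[0][0]) > eps or abs(A[0][1]) > eps or
            abs(A[1][0]) > eps or abs(1 - A[1][1]) > eps)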

If the shape of a region is altered between the two current frames because of occlusion or disocclusion, then the affine parameters estimated by Eq. (5.12) may have some error caused by the changed shape of the region. This problem can be reduced by compensating for this changed region shape.

The present motion field, M_{t,t+1}, is predicted from the motion field of the previous frame pair, if it exists, by the motion prediction module. If the motion field is reasonably smooth from frame to frame, then this predicted motion field will be fairly close to the actual motion field in the present frame. When motion vectors from two or more regions map onto the same pixel, the corresponding pixels are visible in the present frame but all except one of them are occluded in the next frame (VPON). Similarly, a pixel to which no motion vectors map is occluded in the present frame and visible in the next frame (OPVN). The occlusion prediction module identifies the VPON and OPVN areas in the predicted motion field, as well as the occluded regions to which they apply. Before the affine transformation parameters are computed between one of these identified occluded regions in the present frame and its match in the next frame, the shape of the region in the next frame is modified. First, connected sets of VPON pixels are identified. Any of these sets that are adjacent to the region are appended to it. Next, any OPVN pixels contained within the region are subtracted from it. This process is demonstrated in Fig. 5.5. The degree to which the shape of the resulting region remains distorted by the occlusion depends upon the accuracy of the predicted motion field, but even if the predicted motion field is very inaccurate, the distortion will almost always be reduced by this process.
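A sketch of this shape compensation step (Python, added for illustration; the pixel-set representation and the connected_components helper are assumptions of this sketch):

def compensate_shape(region_next, vpon_pixels, opvn_pixels, connected_components):
    """Modify a region's match in frame t+1 before affine estimation.

    region_next:          set of (x, y) pixels of the region's match in frame t+1
    vpon_pixels:          set of predicted VPON pixels
    opvn_pixels:          set of predicted OPVN pixels
    connected_components: helper returning the connected sets of a pixel set
    """
    region = set(region_next)
    # Append every connected set of VPON pixels that is adjacent to the region.
    for component in connected_components(vpon_pixels):
        touches = any(n in region
                      for (x, y) in component
                      for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)))
        if touches:
            region |= set(component)
    # Subtract any OPVN pixels contained within the region.
    region -= set(opvn_pixels)
    return region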

5.6. Motion Field Integration

The affine estimation module results in multiple estimated motion fields for the current frame pair. These fields are combined to yield a single motion field that is (hopefully) more accurate than any of the individual motion fields. Let N_σg denote the number of homogeneity scale image partitions, making for N_σg + 1 motion fields. Further, let M_{t,t+1} be the final motion field, M_{t,t+1}^j be the motion field at scale j, and let M_{t,t+1}^j(x, y) = (m_x, m_y) be the motion vector at pixel (x, y). M_{t,t+1} is computed as follows:

Figure 5.5. (a) A stationary region is occluded by a smaller region whose position (predicted and actual) in the present frame is shown with a dotted boundary and in the next frame with a solid boundary. (b) The corresponding VPON and OPVN pixels are shaded. (c)-(d) The shape of the larger region in both frames is shown before (c) and after (d) the occlusion compensation. In (d), the larger region in the next frame has had the VPON appended to it and the OPVN subtracted from it, which compensates for the shape distortion caused by the occluding region. If the predicted motion of the smaller region is inaccurate, as in (e) where the thicker solid boundary corresponds to its actual position, the occlusion compensation results in (f). Even though the motion prediction was somewhat inaccurate, the compensated shape of the stationary region is still an improvement over the uncompensated shape.

j = N_σg
while (j ≥ 0) {
    for each R_i^j present at partition j {
        if  δ · Σ_{x,y} |I_t(x, y) - I_{t+1}(x + m_x^j, y + m_y^j)| · λ_i^j(x, y)
                < Σ_{x,y} |I_t(x, y) - I_{t+1}(x + m_x, y + m_y)| · λ_i^j(x, y)
        then  M_{t,t+1} = (1 - λ_i^j) · M_{t,t+1} + λ_i^j · M_{t,t+1}^j
    }
    j = j - 1
}

Here (m_x^j, m_y^j) = M_{t,t+1}^j(x, y), (m_x, m_y) = M_{t,t+1}(x, y), and λ_i^j(x, y) indicates membership in R_i^j. The parameter δ determines the extent to which the image prediction error at a particular region must be reduced by the present scale motion field, M_{t,t+1}^j, relative to the error resulting from the current final motion field, M_{t,t+1}; it takes values δ > 1.0. Finally, the subset of the regions {R_i^j} whose associated affine transformations comprise M_{t,t+1} is considered to be the set of best matched regions.
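The integration loop can be sketched as follows (Python, added for illustration; how the final field is initialized and how the fields and regions are stored are assumptions of this sketch, not specifics from the thesis):

def integrate_fields(scale_fields, regions_per_scale, prediction_error, delta=1.15):
    """Combine the per-scale motion fields into a single field M_{t,t+1}.

    scale_fields:      list of fields M^j (dict: pixel -> motion vector), j = 0..N
    regions_per_scale: regions_per_scale[j] lists the regions R_i^j of partition j
    prediction_error:  error(field, region) = sum over the region of |I_t - I_{t+1}|
    """
    final = dict(scale_fields[-1])  # initialization is an assumption of this sketch
    best_regions = []
    for j in range(len(scale_fields) - 1, -1, -1):
        for region in regions_per_scale[j]:
            if delta * prediction_error(scale_fields[j], region) < prediction_error(final, region):
                for pixel in region:  # overwrite the final field inside this region
                    final[pixel] = scale_fields[j][pixel]
                best_regions.append((j, region))
    return final, best_regions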

5.7. Motion Segmentation

The motion field M_{t,t+1} is segmented into areas of similar motion. This could be done using the segmentation algorithm described in Chapter 4; however, M_{t,t+1} does not have the same degree of complexity as a typical image does. Because of the affine model, the motion is smooth everywhere except at occlusion boundaries, where it is abruptly discontinuous. As a result, a simple region growing heuristic is used instead. This heuristic gives adequate results, and is much simpler computationally than using the transform to compute the segmentation. The heuristic considers each pair of best matched regions, R_i^t and R_j^t, which share a common border, and merges them if the following relation is satisfied for all (x_{ik}^t, y_{ik}^t) and (x_{jl}^t, y_{jl}^t) that are spatially adjacent to one another:

‖M_{t,t+1}(x_{ik}^t, y_{ik}^t) - M_{t,t+1}(x_{jl}^t, y_{jl}^t)‖ / max( ‖M_{t,t+1}(x_{ik}^t, y_{ik}^t)‖, ‖M_{t,t+1}(x_{jl}^t, y_{jl}^t)‖ ) < m_ag

where m_ag is a constant.
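For illustration (Python, added here; the motion field is assumed to be stored as a mapping from pixel to a 2-vector, and the handling of zero motion is a choice of this sketch), the merge test reads:

import numpy as np

def should_merge(field, border_pairs, m_ag=0.2):
    """Merge test for two adjacent best matched regions.

    field:        mapping from pixel (x, y) to its motion vector (m_x, m_y)
    border_pairs: iterable of ((x_i, y_i), (x_j, y_j)) pixel pairs that are
                  spatially adjacent and belong to the two regions, respectively
    """
    for p, q in border_pairs:
        vp, vq = np.asarray(field[p], float), np.asarray(field[q], float)
        denom = max(np.linalg.norm(vp), np.linalg.norm(vq))
        if denom == 0.0:
            continue  # both vectors are zero: identical motion, no objection to merging
        if np.linalg.norm(vp - vq) / denom >= m_ag:
            return False
    return True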

The segmented motion regions are each represented in MS_{t,t+1} by a different value. Because each of the best matched regions has a match, the matches in frame t + 1 of the regions in MS_{t,t+1} are known and comprise the coarsest scale regions that are used in the affine estimation module for the next frame pair.

It should be noted that because the motion segmentation is done over a single motion field, the segmentation will not necessarily correspond to the moving objects in the scene. Objects that are nonrigid, such as a person, will be segmented into multiple, piecewise rigid regions. In addition, slowly moving objects will not be identified. Handling both of these situations requires examining the motion field over multiple frames.

5.8. Layers and Occlusion

The computed motion field and motion segmentation are examined to determine the areas of occlusion and disocclusion over the current frame pair, as well as the relative depth of the regions in MS_{t,t+1}. Let the ith motion region in MS_{t,t+1} be given by R_i^t, and its associated match by R_i^{t+1}. Further, let M_{t+1,t} denote the reverse motion field obtained by inverting the affine transformations of each of the best matched regions that comprise M_{t,t+1}, and applying each inverted transformation to the region's associated match. For each pair of adjacent regions R_i^t and R_j^t, the following score is computed:

L_ij = Σ_{(x_1,y_1),(x_2,y_2)} { d(I_t(x_1, y_1), I_t(x_2, y_2), I_{t+1}(x_3, y_3)),  if P
                               { 0,                                                    else
     + Σ_{(x_4,y_4),(x_5,y_5)} { d(I_{t+1}(x_4, y_4), I_{t+1}(x_5, y_5), I_t(x_6, y_6)),  if Q
                               { 0,                                                        else

(x_1, y_1) = (x_{ik}^t, y_{ik}^t),        (x_2, y_2) = (x_{jl}^t, y_{jl}^t)
(x_3, y_3)^T = M_{t,t+1}(x_1, y_1) + (x_1, y_1)^T                                        (5.15)
P  iff  (x_3, y_3)^T = M_{t,t+1}(x_2, y_2) + (x_2, y_2)^T
(x_4, y_4) = (x_{ik}^{t+1}, y_{ik}^{t+1}),        (x_5, y_5) = (x_{jl}^{t+1}, y_{jl}^{t+1})
(x_6, y_6)^T = M_{t+1,t}(x_4, y_4) + (x_4, y_4)^T
Q  iff  (x_6, y_6)^T = M_{t+1,t}(x_5, y_5) + (x_5, y_5)^T

The operator d(·) is given by

d(x, y, z) = {  1,  |x - z| < |y - z|
             {  0,  |x - z| = |y - z|        (5.16)
             { -1,  |x - z| > |y - z|

The sign of L_ij indicates whether R_i^t occludes (+) or is occluded by (-) R_j^t, and the magnitude of L_ij represents the degree of confidence in this result.
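As an illustration of the ordering test in Eqs. (5.15)-(5.16), the following Python fragment (added here; the pixel bookkeeping and data structures are assumptions, and integer-valued motion vectors are assumed) computes the contribution of a single pair of boundary pixels:

def d(x, y, z):
    # Eq. (5.16): which of the intensities x or y is closer to z?
    if abs(x - z) < abs(y - z):
        return 1
    if abs(x - z) > abs(y - z):
        return -1
    return 0

def boundary_vote(I_t, I_t1, M_forward, p_i, p_j):
    """Contribution of one adjacent pixel pair (p_i in R_i^t, p_j in R_j^t) to L_ij.

    I_t, I_t1:  mappings from pixel (x, y) to intensity in frames t and t+1
    M_forward:  mapping from pixel (x, y) to its (integer) motion vector
    """
    qi = (p_i[0] + M_forward[p_i][0], p_i[1] + M_forward[p_i][1])
    qj = (p_j[0] + M_forward[p_j][0], p_j[1] + M_forward[p_j][1])
    if qi != qj:          # condition P of Eq. (5.15): both pixels must collide in frame t+1
        return 0
    return d(I_t[p_i], I_t[p_j], I_t1[qi])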

The L_ij are used in assigning a layer number to each region in MS_{t,t+1}, such that a larger number indicates that the region is closer to the camera. In order to make this assignment, any circular dependencies among the pairwise ordering constraints (for example, A occluding B, B occluding C, and C occluding A) must first be resolved by deleting one or more of the conflicting constraints.

Areas of occlusion (VPON) are identified as the set of all (x_3, y_3) pixels in Eq. (5.15) for which the sign of the associated d(·) is in agreement with the sign of L_ij, and for which the layer constraint corresponding to L_ij was not deleted in resolving any circular constraint dependencies. Areas of disocclusion (OPVN) are identified in exactly the same manner, except that the (x_6, y_6) pixels are used. The VPON and OPVN areas are then stored in the occlusion array, O_{t,t+1}.

5.9. Motion and Occlusion Prediction

The process of predicting M_{t+1,t+2} is similar to the method for computing M_{t+1,t}. The affine transformations of each of the best matched regions comprising M_{t,t+1} are applied to each region's associated match in frame t + 1, thereby yielding a prediction of M_{t+1,t+2}. The accuracy of this predicted motion field depends upon the degree to which pixel trajectories are temporally consistent over adjacent frame pairs.

For occlusion prediction, the reverse motion field M_{t+2,t+1} is predicted using the coarsest scale matches obtained from the region matching module for the frame pair (t + 1, t + 2). Occlusion prediction is then performed in exactly the same manner as the algorithm for computing the occlusion described in Section 5.8.

5.10. Implementation Details

To form the region adjacency graphs required by the region matching module, the contours of the regions present at each of the image partitions are followed clockwise, and information regarding region adjacencies, articulation nodes, and the length and midpoint of the boundary between each pair of adjacent regions is computed and stored. In a 2-D image, region adjacencies are measured using 4-nearest neighbor connectivity (i.e., N, S, E, W pixel neighbors). At vertices where four different regions meet, NW-SE connectivity is also used in order for Eq. (5.1) to remain true. Finally, the various parameters in the motion estimation and segmentation algorithm presently have the following values: α = 0.9, ε_1 = 20, ε_2 = 35, ρ = 0.2, T = 20, ε = 0.3, δ = 1.15, and m_ag = 0.2.

5.11. Experimental Results

The performance of the motion estimation algorithm is demonstrated in Figs. 5.6-5.13 for four different image sequences. Three consecutive frames were selected from each of these sequences. The three partitions of each image used by the region matching algorithm correspond to the sets of regions identified by the segmentation algorithm at σ_g = 9, 15, 21. Figures 5.6 and 5.7 contain results from a sequence of a football game. This sequence is characterized by very fast and nonrigid motion, as well as multiple layers of occlusion. The first two frames of this sequence are shown in Fig. 5.6(a)-(b). The result of the region matching process is shown in (c) and (d) for σ_g = 9, and in (e) and (f) for σ_g = 21. The regions in a given matched set are all displayed with the same intensity value, and all other regions spatially adjacent to this set are guaranteed to be assigned a different intensity value. In addition, unmatched areas are displayed as black. The final estimated motion field for this frame pair is shown in (g), with the motion sampled once every ten pixels in the horizontal and vertical directions. The segmentation of this motion field is given in (h), with each pair of adjacent regions labelled with different intensity values, and the layer image associated with the motion segmentation is given in (i). The regions are labelled so that brighter regions are closer to the camera than darker regions. Unmatched areas in (h) and (i) are displayed as black. Areas that become occluded are shown in (j) overlaid onto the first frame. Similarly, areas that become disoccluded are given in (k) overlaid onto the second frame. The next frame pair in this sequence (the second and third frames) is shown in Fig. 5.7(a)-(b). The results displayed in (c)-(f) correspond to the information in Fig. 5.6(c)-(f). In addition, the motion field predicted for this frame pair from the motion computed for the previous frame pair

Figure 5.6. (a)-(b) The first two frames of a football sequence. (c)-(d) Matched regions present at σ_g = 9. Matched regions are displayed with the same intensity value, and adjacent regions are assigned different intensity values. (e)-(f) Same as (c)-(d), but with σ_g = 21.

Figure 5.6 (cont.). (g) The calculated motion field shown subsampled. (h) Segmentation of the motion field in (g), displayed by assigning different intensity values to adjacent regions. (i) Layer image corresponding to the motion segmentation. The brighter the object, the closer it is to the camera. (j) Identified occlusion areas. (k) Identified disocclusion areas.

Figure 5.7. (a)-(b) The second and third frames of the football sequence. (c)-(f) Same as Fig. 5.6(c)-(f).

Figure 5.7 (cont.). (g)-(h) The predicted and calculated motion fields, respectively, shown subsampled. (i) Motion field segmentation. (j) Corresponding layer image.


Figure 5.7 (cont.). (k)-(l) Predicted occlusion and disocclusion areas, respectively. (m)-(n) Computed occlusion and disocclusion areas, respectively.

Figure 5.8. (a)-(b) The first two frames of an aerial sequence. The camera is mounted beneath an airplane. From the camera's perspective, the plane is stationary and the ground is rotating; however, the ground is reflected onto some areas of the surface of the plane, thereby inducing motion in those areas. (c)-(g) Same as Fig. 5.6(c)-(g).


Figure 5.8 (cont.).


Figure 5.9. (a)-(b) The second and third frames of the aerial sequence. (c)-(g) Same as Fig. 5.6(c)-(g).

Figure 5.9 (cont.).


Figure 5.10. (a)-(b) The first two frames of a sequence of a woman talking on the telephone. The telephone and the woman's head are rotating, while the background is stationary. (c)-(g) Same as Fig. 5.6(c)-(g). Note that the lack of subtexture within the background region prevents its motion from being estimated correctly.


Figure 5.10 (cont.).


Figure 5.11. (a)-(b) The second and third frames of the telephone sequence. (c)-(g) Same as Fig. 5.6(c)-(g).


Figure 5.11 (cont.).

Figure 5.12. (a)-(b) The first two frames of a sequence of a traffic scene. Three cars are moving fairly slowly (0.5 - 2 pixels/frame). (c)-(g) Same as Fig. 5.6(c)-(g).

Figure 5.12 (cont.).

Figure 5.13. (a)-(b) The second and third frames of the traffic sequence. (c)-(g) Same as Fig. 5.6(c)-(g).

Figure 5.13 (cont.).

is given in (g), and the actual computed motion field is shown in (h). The corresponding motion segmentation and layer map are given in (i) and (j), respectively. Finally, (k)-(l) represent the predicted occlusion and disocclusion areas, and (m)-(n) represent the actual identified areas.

Note that even though the motion is quite fast and nonrigid, both the predicted motion field and (dis)occlusion areas are fairly close to the actual estimated results. The computed motion fields qualitatively resemble the actual motion quite well, especially near the occluding contours.

Also, from the motion segmentation results, one can see that the identified regions should not be regarded as constituting moving objects. In image sequences where the motion is more rigid, this is a better assumption, but in situations where objects move very slowly or nonrigidly, it is necessary to examine the motion fields over multiple frames in order to identify the moving objects. However, the motion regions identified by segmenting individual motion fields are useful as a coarsest scale image partition.

Figures 5.8 and 5.9 display results from an aerial sequence taken from a camera mounted beneath an airplane. For both figures, (a)-(g) contain the same information as is present in Fig. 5.6(a)-(g). Over the three frames shown, the ground is rotating and the plane is stationary as viewed from the camera's perspective. The motion estimated for the plane is not zero because in some areas on the plane the ground is reflected onto the plane's surface. With regard to the ground motion, one can see that the estimated motion is not entirely smooth. This is caused, not by region mismatches, but by the fact that many of the regions on the ground have little subtexture and large fractions of the regions' boundaries are occluding contours. This causes a certain amount of noise in the estimated affine transformation parameters. This same problem also can be seen in the sequence displayed in Figs. 5.10 and 5.11. This sequence consists of a woman talking on a telephone. The woman's head is rotating, while the background is stationary.

The motion of the woman's head is estimated correctly, but the background motion is not. The background is segmented into two regions, both of whose entire boundaries consist of occluding contours. Thus, the motion information is contained solely in the background region subtexture.

The leftmost background region has no subtexture, and, hence, its motion is undefined, while the rightmost region has a small amount of subtexture, but not enough for the algorithm to determine the correct motion. This situation can be contrasted with that of the football sequence, where the background regions also consist entirely of occluding contours, but which have significant subtexture, thereby allowing the correct motion to be estimated.

Results from the last sequence are displayed in Figs. 5.12 and 5.13. This sequence consists of a traffic scene within which three cars are moving quite slowly (0.5 - 2 pixels/frame) on a stationary background. The motion is estimated accurately except for a small patch on the center car where a shadow causes the estimated motion to deviate from the true motion by a couple of pixels.

5.12. Discussion

A method for region-based 2-D motion estimation has been given in this chapter. A video sequence is processed two frames at a time, although information from the previous frame pair is used in guiding the region matching and affine estimation processes at the current frame. A multiscale segmentation algorithm is applied separately to each frame. This provides a rich set of regions available for matching. The process of region matching is formulated as an inexact region adjacency graph matching problem. Pixel correspondences are obtained for each pair of matched regions by estimating the parameters of the best affine transformation relating the regions. Before computing the affine transformation parameters, the distortion in region shapes caused by occlusion is compensated. A motion field is then estimated from the affine parameters calculated for the set of regions identified as being the best matched.

6. CONCLUSION

This thesis has described the problem of structure detection and its relationship to scale.

Three distinct kinds of scale have been identified that are necessary to characterize structure.

These scales are utilized by a transform that represents structure within a force field as areas of contracting flow. In the case when the structures are connected, the boundaries of the structures are indicated by pairs of diverging field vectors. Boundaries that form closed contours at some scale, and which are stable to changes in scale, are identified as perceptually important. A pyramidal representation is constructed that contains each salient structure along with the range of homogeneity scale for which it exists. The identified pyramids agree fairly closely with human perception over a wide variety of different images. In addition, the application of this pyramidal representation to the problem of estimating 2-D motion fields has been discussed.

The pyramids corresponding to a pair of temporally adjacent image frames are matched by formulating the matching process as an inexact matching of attributed region adjacency graphs.

An affine motion model is then applied to the pixels within each pair of matched structures to give motion information. Although the motion algorithm operates two frames at a time, it feeds back information obtained from the previous matched frame pair to guide the region matching and motion estimation modules.

6.1. Primary Contributions of this Work

The relevant results of the presented research include:

1. A framework within which the general problem of structure detection can be solved.

2. A formulation of scale that is directly related to image structure.

3. A transform that is able to make the structure at a given scale explicit.

4. A method for integrated automatic scale selection and region structure detection.

5. Identification of region structures with unsmoothed boundaries, regardless of the scale of the structure.

6. An integration of the processes of region and edge detection.

7. A 2-D motion estimation algorithm capable of identifying accurate motion at occlusion boundaries and in areas possessing little texture.

6.2. Future Extensions of this Work

The most obvious extension of this work is to apply the given framework to the general structure detection problem. This requires developing a region-based method for extracting both connected and disconnected zones of attraction. One possible approach is to treat the field domain as a true flow, and allow pixels to move according to the force they experience. Structures then naturally coalesce. With regard to texture, more work has to be done on using integration scales to capture local texture statistics. In addition, although the algorithm as applied to image segmentation does seem to indicate that the problem is indeed well-posed, the model used is implicit. Although using a single homogeneity scale for each structure seems reasonable, one would like to know what this truly implies, and whether, for example, there might be any advantages in allowing σ_g to be a slowly varying function within each region. It does not appear that the structure model can be explicated for any but the most trivial (and, hence, uninteresting) cases, but further attempts have to be made. Also, modifications to the transform that result in the vector magnitudes as well as directions agreeing with that of the gradient at a particular scale should be explored. A reasonable definition of a multiscale gradient may have interesting mathematical ramifications. With regard to motion analysis, the motion segmentation process should be extended to consider similarity of motion over multiple frames. This is necessary to allow both nonrigid and slowly moving objects to be segmented. Finally, an analysis has to be done of both the performance advantage and computational penalty involved in extending the algorithm to perform region matching and motion estimation over a batch of frames simultaneously, instead of in pairs.


VITA

Mark D. Tabb received the B.S. degree in Electrical Engineering from Cornell University in 1991, the M.S. degree in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign in 1993, and will receive the Ph.D. degree in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign in the spring of 1996. He was a research assistant at the Beckman Institute for Advanced Science and Technology between 1991 and 1996.
