Modal Matching: a Method for Describing, Comparing, and Manipulating Digital Signals by Stanley Edward Sclaroff
Total Page:16
File Type:pdf, Size:1020Kb
Modal Matching: A Method for Describing, Comparing, and Manipulating Digital Signals by Stanley Edward Sclaroff B.S., Computer Science and English Tufts University (1984) S.M., Massachusetts Institute of Technology (1991) Submitted to the Program in Media Arts and Sciences, School of Architecture and Planning, in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY February 1995 @ Massachusetts Institute of Technology 1995. All rights reserved. Author ........ .................................................. Program in I6iedia Arts and Sdiences January 13, 1995 Certified by....................................... ......... .......... Alex R Pentland Associate Professor of Media Arts and Sciences Program in Media Arts and Sciences Thesis Supervisor Accepted by ............................ ........ - yV - , - w r V Stephen A. Benton Chairman, Departmental Committee on Graduate Students Program in Media Arts and Sciences ~'~2 199~ 2 Modal Matching: A Method for Describing, Comparing, and Manipulating Digital Signals by Stanley Edward Sclaroff Submitted to the Program in Media Arts and Sciences, School of Architecture and Planning, on January 13, 1995, in partial fulfillment of the requirements for the degree of Doctor of Philosophy Abstract This thesis introduces modal matching, a physically-motivated method for establishing cor- respondences and computing canonical shape descriptions. The method is based on the idea of describing objects interms of generalized symmetries, as defined by each object's eigenmodes. The resulting modal description is used for object recognition and categoriza- tion, where shape similarities are expressed as the amounts of modal deformation energy needed to align two shapes. Modal matching is also used for a physically-motivated linear- combinations-of-models paradigm, where the computer synthesizes a shape in terms of a weighted combination of modally deformed prototype shapes. In general, modes provide a global-to-local ordering of shape deformation and thus allow for selecting the types of deformations used in object alignment and comparison. In contrast to previous techniques, which required correspondence to be computed with an initial or prototype shape, modal matching utilizes a new type of finite element formulation that allows for an object's eigenmodes to be computed directly from available shape information. This improved formulation provides greater generality and accuracy, and is applicable to data of any dimensionality. Correspondence results with 2-D contour and point feature data are shown. Recognition experiments for image databases are described, in which a user selects example images and then the computer efficiently sorts the set of images based on the similarity of their shape. While the primary focus of this thesis is matching shapes in2-D images, the underlying shape representation is quite general and can be applied to compare signals in other modalities or in higher dimensions, for instance in sounds or scientific measurement data. Thesis Supervisor: Alex P.Pentland Title: Associate Professor of Media Arts and Sciences Program in Media Arts and Sciences Doctoral Committee -~ I Thesis Advisor............................ ........ .. .......... Alex P Pentland Associate Professor of Media Arts and Sciences Program in Media Arts a iences Thesis Reader ............................. ............ Whitman A. Richards Professor of Psychophysics Dept. of Brain and Cognitive Sciences Thesis Reader............................ ........... ...... .. .. ... ... - .. Tomaso A. Poggio Professor of Vision Sciences and Biophysics Dept. of Brain and Cognitive Sciences Thesis Reader............................ Andrew Witkii Professor of Computer Science Carnegie Mellon University Contents Acknowledgments 1 Introduction 10 1.1 Approach . .. .. ... .. .. .. .. ... .. .... -... 11 1.2 Thesis Overview .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. 16 2 Background 18 2.1 Linear Combinations of Models .. .. .. .. .. .. .. .. .. .. .. 20 2.2 Object-Centered Coordinate Frames .. .. .. .. .. .. .. .. .. 22 2.3 Deformable Models . .. .. .. .. .. .. .. .. ... -. .. 23 2.4 Eigen-representations .. .. .. .. .. .. .. .. .. .. .. .. .. 29 2.5 Summary .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. 32 3 Approach 34 3.1 Modal Matching . .. .. .. .. .. .. .. .. .. .. .. .. .. .. 35 3.2 Modal Descriptions .. .. .. .. .. .. .. .. .. .. .. .. .. .. 40 3.3 Modal Combinations of Models .. .. .. .. .. .. .. .. .. .. .. 41 3.4 Mathematical Formulation .. .. .. .. .. .. .. .. .. .. .. .. 42 3.5 Summary .. .. .. .. .. .. .. .. .. .. .. .. - . .. .. 54 4 Modal Matching 55 4.1 Determining Correspondences . .. .. .. .. .. .. .. .. .. .. .. 55 4.2 Multiresolution Models . .. .. .. .. .. .. .. .. .. .. .. 60 4.3 Coping with Large Rotations .. .. .. .. .. .. .. .. .. .. .. 61 5 Modal Descriptions 63 5.1 Recovering Modal Descriptions via Modal Alignment .. .. .. .. 63 5.2 Coping with Large Rotations . .. .. .. .. .. .. .. .. 67 5.3 Comparing Modal Descriptions .. .. .. .. .. .. .. 71 6 Modal Combinations of Models 6.1 Modal Image Warping . .. .. .. .. .. .. .. 6.2 Morphing: Physically-based Linear Combinations of Models . 6.3 Example-based Shape Comparison . .. .. .. .. CONTENTS 7 Experiments 85 7.1 Correspondence Experiments ................. ....... .. 85 7.2 Alignment and Description ................... ... ... .... 90 7.3 Recognition of Objects and Categories . .. .. .. .. .... ... ... 93 7.4 Structuring Image Databases for Interactive Search .. ... ... .... 98 7.5 Modal Combinations of Models . .. .. .. .. .. .. .. .. .. .. ... 107 8 Discussion 114 8.1 Shape-based Image Database Search .. .. .. .. .. .. .. ... .. 114 8.2 Modal Models, Estimation, and Learning . .. .. .. .. .. .. ... .. 117 8.3 Material Properties, Preset Parameters, and Robustness .. ... .... 119 8.4 Lim itations .. ... .. .. .. .. .. .. .. .. .. .. .. .. ... .. 123 8.5 Future W ork. .. .. .. .. .. .. .. .. .. .. .. .. .... ... ... 125 9 Conclusion 132 A A FEM Formulation for Higher-dimensional Problems 145 A.1 Formulating a 3-D Element .. .. .. .. .. .. .. .. .. .. .. .. 146 A.2 Formulating Higher-Dimensional Elements .. .. .. .. .. .. .. .. 149 B Sound 153 List of Figures 1-1 Fish morphometry: example of biological shape deformation . .. .. .. 1-2 Three types of deformed shapes that the system will match and compare. 3-1 Modal matching system diagram . .. .. .. .. .. .. .. .. .. 36 3-2 The low-order 18 modes computed for the upright tree shape . .. .. .. 37 3-3 Similar shapes have similar low order modes .. .. .. .. .. .. 38 3-4 Finding feature correspondences with modal matching .. .. 39 3-5 Simple correspondence example: two flat tree shapes .. .. .. 39 3-6 Graphs of interpolants for a 1-D element . .. .. .. .. .. .. 48 4-1 Rotation invariant modal matching . .. .. ... .. ... .. .. ... .. 61 5-1 Using initial rigid-body alignment step .. .. .. .. .. .. .. .. .. 72 6-1 Modal flow field diagram . .. .. .. .. .. .. .. .. .. .. .. 76 6-2 Modal flow field diagram including alignment step . .. ... .. .. 79 7-1 Correspondences for various pear shapes .. .. ..... 86 7-2 Correspondence found for two wrenches . .. ... .. 87 7-3 Correspondences obtained for hand silhouettes . .. .. .. 87 7-4 Correspondences obtained for airplane silhouettes . .... .. 88 7-5 Multiresolution correspondence for edge images of . ... .. 89 7-6 Describing planes in terms of a prototype .. .. .. .. 92 7-7 Describing hand tool shapes as deformations from a prototype . .. 94 7-8 How different modes affect alignment .. .. .. .. .. .. .. .. .. 95 7-9 Comparing a prototype wrench with different hand tools. .. .. ... ... 96 7-10 Comparing different hand tools (continued) .. .. .. .. .. .. .. 97 7-11 The two prototype rabbit images .. .. .. .. .. .. .. .. 99 7-12 The five prototype fish used in database experiments .. .. .. .. 99 7-13 Six fish whose modes do not match a Butterfly Fish modes . .. .. .. .. 100 7-14 Scatter plot of modal strain energy for rabbit prototypes . .. .. .. 101 7-15 Searching an image database for similarly-shaped rabbits .. ... .. 102 7-16 Searching an image database for similarly-shaped fish .. .. .. .. 104 7-17 Searching for similarly-shaped fish (continued) . .. .. .. .. .. 105 7-18 Physically-based linear combinations of views . .. .. .. .. .. 108 7-19 Physically-based linear combinations of images .. .. .. .. .. 108 LIST OF FIGURES 7-20 A heart image, its extracted contour, and first six nonrigid modes .. .. 109 7-21 Representing a beating heart in terms of warps of extremal views .. ... 110 7-22 Modal image synthesis . .. .. .. .. .. .. .. .. .. .. .. 112 7-23 First nine nonrigid modal warps for a hand image . .. .. .. .... 113 8-1 How a change in topology can affect modal matching . .. .. ... .. 124 8-2 Shapes as density patterns: spiral galaxies . .. .. .. .. .. ... .. 128 8-3 Modal matching for periodic textures . .. .. .. .. .. .. ... .. 130 B-1 An example of deformed signals in the sound domain. .. .. .. .. .. .. 154 Acknowledgments I gratefully acknowledge my mentor and thesis advisor, Sandy Pentland, who over the last six years has been an inexhaustible source of intellectual and creative energy. I have the deepest respect for Sandy as a colleague, and have felt priviledged to share, invent, and develop research ideas with a person who has such a genuine interest and enthusiasm for collaboration. His guidance, support, vision and friendship will be a long-lasting influence in my professional