DICTIONARY OF DISTANCES This page intentionally left blank DICTIONARY OF DISTANCES
ELENA DEZA Moscow Pedagogical State University and Donghua University
MICHEL-MARIE DEZA Ecole Normale Superieure, Paris, and Shanghai University
Amsterdam • Boston • Heidelberg • London • New York • Oxford Paris • San Diego • San Francisco • Singapore • Sydney • Tokyo ELSEVIER Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK
First edition 2006
Copyright © 2006 Elsevier B.V. All rights reserved
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher
Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333; Email: [email protected]. Alternatively you can submit your request online by visiting the Elsevier web site at http://elsevier.com/locate/permissions, and selecting: Obtaining permission to use Elsevier material
Notice No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made
Library of Congress Cataloging-in-Publication Data A catalog record for this book is available from the Library of Congress
British Library Cataloguing in Publica tion Data A catalogue record for this book is available from the British Library
ISBN-13: 978-0-444-52087-6 ISBN-10: 0-444-52087-2
For information on all Elsevier publications visit our website at books.elsevier.com
Printed and bound in The Netherlands
0607080910 10987654321 On 2 April 2006 it was 100 years from the submission by Maurice FRÉCHET of his outstanding doctoral dissertation Sur quelques points du calcul functionnel that introduced (within a systematic study of functional operations) the notion of metric space. It was also 92 years from 1914, the date of publication by Felix HAUSDORFF of his famous Grundzüge der Mengenlehre where the theory of topological and metric spaces was created. Let this Dictionary be our homage to the memory of these great mathematicians and their lives of dignity through the hard ways of the first half of the XX century.
Maurice FRÉCHET (1878–1973) coined Felix HAUSDORFF (1868–1942) coined in 1906 the concept of écart (semi-metric) in 1914 the term metric space
v This page intentionally left blank Preface
The concept of distance is one of the basic ones in the whole of human experience. In everyday life it usually means some degree of closeness between two physical objects or ideas, i.e., length, time interval, gap, rank difference, coolness or remoteness, while the term metric is often used as a standard for a measurement. But here we consider, except in the last two chapters, the mathematical meaning of these terms. The mathematical notions of distance metric (i.e., a function d(x,y) from X × X to the set of real numbers satisfying d(x,y) 0 with equality only for x = y, d(x,y) = d(y,x), and d(x,y) d(x,z) + d(z,y)) and of metric space (X, d) were originated a century ago by M. Fréchet (1906) and F. Hausdorff (1914) as a special case of an infinite topological space. The triangle inequality above appears already in Euclid. The infinite metric spaces are seen usually as a generalization of the metric |x − y| on the real numbers. Their main classes are the measurable spaces (add measure) and Banach spaces (add norm and completeness). However, starting from K. Menger (1928) and, especially, L.M. Blumenthal (1953), an explosion of interest in finite metric spaces occurred. Another trend: many mathematical theories, in the process of their generalization, settled on the level of metric space. Now finite distance metrics have become an essential tool in many areas of Mathemat- ics and its applications include Geometry, Probability, Statistics, Coding/Graph Theory, Clustering, Data Analysis, Pattern Recognition, Networks, Engineering, Computer Graph- ics/Vision, Astronomy, Cosmology, Molecular Biology, and many other areas of science. Devising the most suitable distance metrics has become a standard task for many re- searchers. Especially intense ongoing searches for such distances occur, for example, in Genetics, Image Analysis, Speech Recognition, Information Retrieval. Often the same dis- tance metric appears independently in several different areas; for example, the edit distance between words, the evolutionary distance in Biology, the Levenstein distance in Coding Theory, and the Hamming+Gap or shuffle-Hamming distance. This body of knowledge has become too large and disparate to operate within. The number of worldwide web entries offered by Google on the topics “distance”, “metric space” and “distance metric” approach 300 million (i.e., about 4% of all), 12 million and 6 million, respectively, not to mention all the printed information outside the Web, or the vast “invisible Web” of searchable databases. However, this huge amount of information on distances is too scattered: the works evaluating distance from some list usually treat very specific areas and are hardly accessible for non-experts. Therefore, many researchers, including us, keep and cherish a collection of distances for use in their own areas of science. In view of the growing general need for an accessible interdisciplinary source for a vast multitude of researchers, we have expanded our private collection into this Dictionary. Some additional material was reworked from various en-
vii viii Preface cyclopedia, especially, Encyclopedia of Mathematics ([EM98]), MathWorld ([Weis99]), PlanetMath ([PM]), and Wikipedia ([WFE]). However, the majority of distances should be extracted directly from specialist literature. The vast reservoir of concepts defined in this Dictionary, aims to be a thought-provoking archive and a valuable resource. Besides distances themselves, we have collected here many distance-related notions (especially in Chapter 1) and paradigms, enabling people from applications to get those, arcane for non-specialists, research tools, in ready-to-use fashion. This and the appearance of some distances in different contexts can be a source of new research. At a time when over-specialization and terminology barriers isolate researchers, this Dictionary tries to be “centripetal” and “ecumenical”, providing some access and altitude of vision but without taking the route of scientific vulgarization. This attempted balance defined the structure and style of the Dictionary. The Dictionary is divided into 28 chapters grouped into 7 Parts of about the same length. The titles of parts are purposely approximative: they just allow a reader to figure out her/his area of interest and competence. For example, Parts II, III and IV, V require some culture in, respectively, pure and applied Mathematics. Part VII can be read by a layman. The chapters are thematic lists, by areas of Mathematics or applications which can be read independently. When necessary, a chapter or a section starts with a short introduction: a field trip with the main concepts. Besides those introductions, the main properties and uses of distances are given, within items, only exceptionally. We also tried, when it was easy, to trace distances to their originator(s), but the proposed extensive bibliography has a less general ambition: it is just to provide convenient sources for a quick search. Each chapter consists of items ordered in a way that hints of connections between them. All item titles and selected key terms can be traced in the large Subject Index (about 1400 entries); they are boldfaced unless the meaning is clear from the context. So, the definitions are easy to locate, by subject, in chapters and/or, by alphabetic order, in the index. The introductions and definitions are reader-friendly and as far as possible independent of each other; still they are interconnected, in the 3-dimensional HTML manner, by hyperlink-like boldfaced references to similar definitions. Many nice curiosities appear in this “Who is Who” of distances. Examples of such sundry terms are: ubiquitous Euclidean distance (“as-the-crow-flies”), flower-shop met- ric (shortest way between two points, visiting a “flower-shop” point first), knight-move metric on a chessboard, Gordian distance of knots, Earth Mover distance, biotope distance, Procrustes distance, lift metric, post-office metric, Internet hop metric, WWW hyperlink quasi-metric, Moscow metric, dogkeeper distance. Besides abstract distances, the distances having physical meaning appear also (especially in Part VI); they range from 1.6×10−35 m (Planck length) to 7.4×1026 m (the estimated size of observable Universe, about 46×1060 Planck lengths). The number of distance metrics is infinite and therefore, our Dictionary cannot enu- merate all of them. But we were inspired by several successful thematic dictionaries on other infinite lists; for example, on Integer Sequences, Inequalities, Numbers, Random Processes, and by atlases of Functions, Groups, Fullerenes, etc. On the other hand, the largeness of the scope forced us often to switch into a laconic tutorial style. Preface ix
The target audience consists of all researchers working on some measuring schemes and, to a certain degree, of students and a part of the general public interested in science. We have tried to address, even if incompletely, all scientific uses of the notion of dis- tance. However some distances did not made it into this Dictionary due to space limitations (being too specific and/or complex) or our oversight. In general, the size/interdisciplinarity cut-off, i.e., decision where to stop, was our main headache. We would be grateful to the readers who send us their favorite distances missed here. Four pages at the end are reserved for such personal additions. We are grateful to many people for their help with this book, especially, to Jacques Beigbeder, Mathieu Dutour, Emmanuel Guerre, Jack Koolen, Jin Ho Kwak, Hiroshi Mae- hara, Sergey Shpectorov, Alexei Sossinsky, and Jiancang Zhuang. This page intentionally left blank Contents
Preface vii
PART I. MATHEMATICS OF DISTANCES
Chapter 1. General Definitions 2 1.1. Basic definitions 2 1.2. Main distance-related notions 7 1.3. General distances 21
Chapter 2. Topological Spaces 31
Chapter 3. Generalizations of Metric Spaces 36 3.1. m-metrics 36 3.2. Indefinite metrics 37 3.3. Topological generalizations 38 3.4. Beyond numbers 40
Chapter 4. Metric Transforms 44 4.1. Metrics on the same set 44 4.2. Metrics on set extensions 46 4.3. Metrics on other sets 48
Chapter 5. Metrics on Normed Structures 50
PART II. GEOMETRY AND DISTANCES
Chapter 6. Distances in Geometry 62 6.1. Geodesic Geometry 62 6.2. Projective Geometry 66
xi xii Contents
6.3. Affine Geometry 71 6.4. Non-Euclidean Geometry 73
Chapter 7. Riemannian and Hermitian Metrics 81 7.1. Riemannian metrics and generalizations 81 7.2. Riemannian metrics in Information Theory 94 7.3. Hermitian metrics and generalizations 99
Chapter 8. Distances on Surfaces and Knots 111 8.1. General surface metrics 111 8.2. Intrinsic metrics on surfaces 116 8.3. Distances on knots 120
Chapter 9. Distances on Convex Bodies, Cones, and Simplicial Complexes 122 9.1. Distances on convex bodies 122 9.2. Distances on cones 127 9.3. Distances on simplicial complexes 129
PART III. DISTANCES IN CLASSICAL MATHEMATICS
Chapter 10. Distances in Algebra 134 10.1. Group metrics 134 10.2. Metrics on binary relations 141 10.3. Lattice metrics 143
Chapter 11. Distances on Strings and Permutations 146 11.1. Distances on general strings 147 11.2. Distances on permutations 151
Chapter 12. Distances on Numbers, Polynomials, and Matrices 154 12.1. Metrics on numbers 154 12.2. Metrics on polynomials 158 12.3. Metrics on matrices 159
Chapter 13. Distances in Functional Analysis 165 13.1. Metrics on function spaces 165 13.2. Metrics on linear operators 172 Contents xiii
Chapter 14. Distances in Probability Theory 176 14.1. Distances on random variables 177 14.2. Distances on distribution laws 178
PART IV. DISTANCES IN APPLIED MATHEMATICS
Chapter 15. Distances in Graph Theory 190 15.1. Distances on vertices of a graph 191 15.2. Distance-defined graphs 195 15.3. Distances on graphs 197 15.4. Distances on trees 202
Chapter 16. Distances in Coding Theory 207 16.1. Minimum distance and relatives 208 16.2. Main coding distances 211
Chapter 17. Distances and Similarities in Data Analysis 217 17.1. Similarities and distances for numerical data 218 17.2. Relatives of Euclidean distance 221 17.3. Similarities and distances for binary data 223 17.4. Correlation similarities and distances 226
Chapter 18. Distances in Mathematical Engineering 230 18.1. Motion planning distances 230 18.2. Cellular automata distances 234 18.3. Distances in Control Theory 234 18.4. MOEA distances 236
PART V. COMPUTER-RELATED DISTANCES
Chapter 19. Distances on Real and Digital Planes 240 19.1. Metrics on real plane 240 19.2. Digital metrics 248
Chapter 20. Voronoi Diagram Distances 253 20.1. Classical Voronoi generation distances 254 20.2. Plane Voronoi generation distances 256 20.3. Other Voronoi generation distances 259 xiv Contents
Chapter 21. Image and Audio Distances 262 21.1. Image distances 262 21.2. Audio distances 273
Chapter 22. Distances in Internet and Similar Networks 279 22.1. Scale-free networks 279 22.2. Network-based semantic distances 281 22.3. Distances in Internet and Web 283
PART VI. DISTANCES IN NATURAL SCIENCES
Chapter 23. Distances in Biology 288 23.1. Genetic distances for gene-frequency data 289 23.2. Distances for DNA data 292 23.3. Distances for protein data 294 23.4. Other biological distances 296
Chapter 24. Distances in Physics and Chemistry 301 24.1. Distances in Physics 301 24.2. Distances in Chemistry 306
Chapter 25. Distances in Geography, Geophysics, and Astronomy 309 25.1. Distances in Geography and Geophysics 309 25.2. Distances in Astronomy 312
Chapter 26. Distances in Cosmology and Theory of Relativity 318 26.1. Distances in Cosmology 318 26.2. Distances in Theory of Relativity 324
PART VII. REAL-WORLD DISTANCES
Chapter 27. Length Measures and Scales 342 27.1. Length scales 342 27.2. Orders of magnitude for length 346
Chapter 28. Non-Mathematical and Figurative Meaning of Distance 350 28.1. Remoteness-related distances 350 28.2. Vision-related distances 357 Contents xv
28.3. Equipment distances 360 28.4. Miscellany 364
References 371
Subject Index 378 This page intentionally left blank Part I Mathematics of Distances
Chapter 1. General Definitions Chapter 2. Topological Spaces Chapter 3. Generalizations of Metric Spaces Chapter 4. Metric Transforms Chapter 5. Metrics on Normed Structures Chapter 1 General Definitions
1.1. BASIC DEFINITIONS
• Distance Let X be a set. A function d : X × X → R is called distance (or dissimilarity)onX if, for all x,y ∈ X, it holds: 1. d(x,y) 0(non-negativity); 2. d(x,y) = d(y,x) (symmetry); 3. d(x,x) = 0. In Topology, it is also called symmetric. The vector from x to y having the length d(x,y) is called displacement. A distance which is a squared metric, is called quad- rance. For any distance d, the function, defined for x = y by D(x,y) = d(x,y) + c, where c = maxx,y,z∈X(d(x, y) − d(x,z) − d(y,z)), and D(x,x) = 0, is a metric.
• Distance space A distance space (X, d) is a set X equipped with a distance d.
• Similarity Let X be a set. A function s : X ×X → R is called similarity (or proximity)onX if s is non-negative, symmetric, and if s(x,y) s(x,x) holds for all x,y ∈ X, with equality if and only if x = y. = Main transforms used√ to obtain a distance (dissimilarity) d from a similarity s are: d − = 1−s = − = − 2 =− = 1 s, d s , d 1 s, d 2(1 s ), d ln s, d arccos s. • Semi-metric Let X be a set. A function d : X × X → R is called semi-metric (or écart, pseudo- metric)onX if d is non-negative, symmetric,ifd(x,x) = 0 holds for all x ∈ X, and if d(x,y) d(x,z) + d(z,y) holds for all x,y,z ∈ X (triangle inequality). For any distance d, the equality d(x,x) = 0 and the strong triangle inequality d(x,y) d(x,z) + d(y,z) imply that d is a semi-metric.
2 Chapter 1: General Definitions [ • Metric] 3
• Metric Let X be a set. A function d : X × X → R is called metric on X if, for all x,y,z ∈ X, it holds: 1. d(x,y) 0(non-negativity); 2. d(x,y) = 0 if and only if x = y (separation or self-identity axiom); 3. d(x,y) = d(y,x) (symmetry); 4. d(x,y) d(x,z) + d(z,y) (triangle inequality).
• Metric space A metric space (X, d) is a set X equipped with a metric d. A metric scheme is a metric space with an integral valued metric.
• Extended metric An extended metric is a generalization of the notion of metric: the value ∞ is allowed for a metric d.
• Near-metric Let X be a set. A distance d on X is called near-metric if 0 holds for all different x,y,z1,...,zn ∈ X and a constant C 1. • Coarse-path metric Let X be a set. A metric d on X is called coarse-path metric if, for a fixed C 0 and for every pair of points x,y ∈ X, there exists a sequence x = x0,x1,...,xt = y for which d(xi−1,xi) C for i = 1,...,t, and d(x,y) d(x0,x1) + d(x1,x2) +···+d(xt−1,xt ) − C, t i.e., the weakened triangle inequality d(x,y) i=1 d(xi−1,xi) becomes an equality up to a bounded error. • Resemblance Let X be a set. A function d : X ×X → R is called resemblance on X if d is symmetric and if, for all x,y ∈ X, either d(x,x) d(x,y) holds (in which case d is called forward resemblance), or d(x,x) d(x,y) holds (in which case d is called backward resemblance). Every resemblance d induces a strict partial order ≺ on the set of all unordered pairs of elements of X by defining {x,y}≺{u, v} if and only if d(x,y) < d(u,v). For any backward resemblance d, the forward resemblance −d induces the same partial order. 4 [ • Quasi-distance] Part I: Mathematics of Distances • Quasi-distance Let X be a set. A function d : X × X → R is called quasi-distance on X if d is non- negative, and if d(x,x) = 0 holds for all x ∈ X. • Quasi-semi-metric Let X be a set. A function d : X ×X → R is called quasi-semi-metric (or weak metric) on X if d is non-negative,ifd(x,x) = 0 holds for all x ∈ X, and if d(x,y) d(x,z) + d(z,y) holds for all x,y,z ∈ X (oriented triangle inequality). • Albert quasi-metric An Albert quasi-metric d is a quasi-semi-metric on X with weak definiteness, i.e., for all x,y ∈ X the equality d(x,y) = d(y,x) implies x = y. • Weak quasi-metric A weak quasi-metric d is a quasi-semi-metric on X with weak symmetry, i.e., for all x,y ∈ X, d(x,y) = 0 if and only if d(y,x) = 0. • Quasi-metric Let X be a set. A function d : X × X → R is called quasi-metric on X if d(x,y) 0 holds for all x,y ∈ X with equality if and only if x = y, and if d(x,y) d(x,z) + d(z,y) holds for all x,y,z ∈ X (oriented triangle inequality). A quasi-metric space (X, d) is asetX equipped with a quasi-metric d. For any quasi-metric d, the function D(x,y) = d(x,y) + d(y,x) is a metric. • 2k-gonal distance An 2k-gonal distance d is a distance on X which satisfies the 2k-gonal inequality bibj d(xi,xj ) 0 1i