Quasiorthogonal Dimension
Paul C. Kainen¹ and Věra Kůrková²

¹ Department of Mathematics and Statistics, Georgetown University, Washington, DC, USA 20057
[email protected]
² Institute of Computer Science, Czech Academy of Sciences, Pod Vodárenskou věží 2, 18207 Prague, Czech Republic
[email protected]

Abstract. An interval approach to the concept of dimension is presented. The concept of quasiorthogonal dimension is obtained by relaxing exact orthogonality so that angular distances between unit vectors are constrained to a fixed closed symmetric interval about $\pi/2$. An exponential number of such quasiorthogonal vectors exist as the Euclidean dimension increases. Lower bounds on quasiorthogonal dimension are proven using geometry of high-dimensional spaces and a separate argument is given utilizing graph theory. Related notions are reviewed.

1 Introduction

The intuitive concept of dimension has many mathematical formalizations. One version, based on geometry, uses "right angles" (in Greek, "ortho gonia"), known since Pythagoras. The minimal number of orthogonal vectors needed to specify an object in a Euclidean space defines its orthogonal dimension.

Other formalizations of dimension are based on different aspects of space. For example, a topological definition (inductive dimension), emphasizing the recursive character of $d$-dimensional objects having $(d-1)$-dimensional boundaries, was proposed by Poincaré, while another topological notion, covering dimension, is associated with Lebesgue. A metric-space version of dimension was developed by Hausdorff, Besicovitch, and Mandelbrot; the new concept of fractal dimension can take nonintegral values.

This chapter presents a geometric concept of dimension, using an interval approach.
We define, as in [31, 32, 38], the $\varepsilon$-quasiorthogonal dimension of $\mathbb{R}^n$,

$$\dim_\varepsilon(n) := \max\{\, |X| : X \subset S^{n-1},\ x \neq y \in X \Rightarrow |x \cdot y| \le \varepsilon \,\} \tag{1}$$

to be the maximum number of unit vectors in $\mathbb{R}^n$ with pairwise dot products in the interval $[-\varepsilon, \varepsilon]$ or, equivalently, the maximum number of nonzero vectors whose pairwise angles lie in the interval $[\arccos(\varepsilon), \arccos(-\varepsilon)]$ centered at $\pi/2$.

Interval analysis was introduced by Moore in 1962 [51] and replaces real numbers by intervals. Kreinovich contributed substantially to its modern reformulation as interval computation (see Kearfott and Kreinovich [35]), and has been the creator and maintainer of the Interval Computation website [37].

Replacing a "crisp" number by a nontrivial (closed) interval has a profound impact on orthogonal dimension. A set of pairwise-orthogonal nonzero vectors in $\mathbb{R}^n$ has at most $n$ members, and this maximum is attained, but for fixed $\varepsilon > 0$, $\dim_\varepsilon(n)$ grows exponentially with $n$.

Quasiorthogonality has found numerous applications, including word-space models for semantic classification (Hecht-Nielsen [28], Kaski [33]), selection of input parameters for neural networks (Gorban, Tyukin, Prokhorov, and Sofeikov [22]), estimates of covering numbers (Kůrková and Sanguineti [39]), and prediction of consumer financial behavior (Lazarus [40]).

The chapter is organized as follows. In Section 2 quasiorthogonal dimension is defined and its growth is estimated via geometrical properties of high-dimensional spaces. Section 3 presents a graph theory approach and includes some new results. In Section 4, quasiorthogonal vectors in Hamming cubes are examined. Section 5 describes concrete constructions utilizing sparse ternary vectors. The application of quasiorthogonality to context vectors and computational semantics is in Section 6. The final section includes a number of classical and recent generalizations which are related to quasiorthogonality in other domains.
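The condition in definition (1) can be checked directly on small examples. The following is a minimal sketch, not a construction from the chapter: the helper names `max_abs_dot` and `is_quasiorthogonal` are hypothetical, and the example set (the standard basis of $\mathbb{R}^4$ together with the all-ones vector) is chosen only for illustration. It exhibits five vectors in $\mathbb{R}^4$ that are $\varepsilon$-quasiorthogonal for $\varepsilon = 1/2$, which witnesses $\dim_{1/2}(4) \ge 5$.

```python
import math
from itertools import combinations


def max_abs_dot(vectors):
    """Largest |x . y| over distinct pairs, after normalizing each vector.

    Assumes every vector is nonzero (illustrative helper, not from the chapter).
    """
    units = []
    for v in vectors:
        norm = math.sqrt(sum(c * c for c in v))
        units.append([c / norm for c in v])
    return max(
        abs(sum(a * b for a, b in zip(x, y)))
        for x, y in combinations(units, 2)
    )


def is_quasiorthogonal(vectors, eps):
    """True if all pairwise dot products of the normalized vectors lie in [-eps, eps]."""
    return max_abs_dot(vectors) <= eps


# Standard basis of R^4: exactly orthogonal, so eps = 0 suffices.
basis = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Adding the all-ones vector gives 5 vectors; its normalized dot product
# with each basis vector is 1 / sqrt(4) = 0.5.
extended = basis + [[1, 1, 1, 1]]

print(is_quasiorthogonal(basis, 0.0))     # True
print(max_abs_dot(extended))              # 0.5
print(is_quasiorthogonal(extended, 0.5))  # True
print(is_quasiorthogonal(extended, 0.4))  # False
```

Normalizing inside the helper reflects the remark that a set of nonzero vectors is quasiorthogonal exactly when the corresponding unit vectors are.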
2 Orthogonal and Quasiorthogonal Geometry

Let $\mathbb{R}^n$ denote $n$-dimensional Euclidean space, let $S^{n-1} := \{h \in \mathbb{R}^n : \|h\| = 1\}$ be the unit sphere in $\mathbb{R}^n$, and let $x \cdot y := \sum_{i=1}^{n} x_i y_i$ be the inner product of $x, y \in \mathbb{R}^n$.

Hecht-Nielsen introduced prior to 1991 (see [21]) the concept of what was later called a quasiorthogonal set (Kůrková and Hecht-Nielsen [38]). For $\varepsilon \in [0, 1)$, a subset $T$ of $S^{n-1}$ is an $\varepsilon$-quasiorthogonal set if

$$x \neq y \in T \Rightarrow |x \cdot y| \le \varepsilon.$$

A set of nonzero vectors is $\varepsilon$-quasiorthogonal if and only if the corresponding set of normalized vectors is $\varepsilon$-quasiorthogonal. Thus, the $\varepsilon$-quasiorthogonal dimension of $\mathbb{R}^n$, $\dim_\varepsilon(n)$, is the maximum cardinality of an $\varepsilon$-quasiorthogonal subset of $\mathbb{R}^n$, as in (1).

We consider two cases: (i) $\varepsilon$ "small" or (ii) $\varepsilon$ "large" w.r.t. $\arcsin(1/n) \sim 1/n$.

In the first case (i), when all pairwise angular measurement errors are small (strictly less than $\arcsin(1/n)$), it was shown (Kainen [31], Kainen and Kůrková [32]) that quasiorthogonal dimension equals orthogonal dimension $n$. To have a quasiorthogonal set with more than $n$ members in $n$-dimensional Euclidean space, some pair of the vectors must be at an angle which deviates from $\pi/2$ by at least $\arcsin(1/n)$. For instance, for $n = 2$, at least one of the measurements must be in error by at least 30 degrees, corresponding to one twelfth of a circle. Hence, one can trust an estimate of orthogonal dimension made in a fixed finite-dimensional space if the error is small enough; that is, orthogonal dimension is determined exactly when the angular error is sufficiently small.

In the second case (ii), assume that $\varepsilon \in (0, 1)$ is fixed and $n$ increases. It was conjectured in [38] and [28] that $\varepsilon$-quasiorthogonal dimension grows exponentially as $n$ increases. We proved the existence of such exponentially large quasiorthogonal sets using geometry of high-dimensional Euclidean spaces [31] and graph theory [32], giving the same lower bound on the rate of growth.
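The threshold in case (i) can be illustrated for $n = 2$. The sketch below is an illustration under stated assumptions rather than part of the cited proofs: it takes three unit vectors in the plane spaced 120 degrees apart (an example chosen here, with the hypothetical helper `deviation_from_right_angle_deg`) and confirms that every pairwise angle deviates from a right angle by 30 degrees, which matches $\arcsin(1/2)$.

```python
import math
from itertools import combinations


def deviation_from_right_angle_deg(x, y):
    """Absolute deviation, in degrees, of the angle between x and y from 90 degrees."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.hypot(*x) * math.hypot(*y)
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return abs(math.degrees(math.acos(cosine)) - 90.0)


n = 2
# Threshold from the text: arcsin(1/n), expressed in degrees.
threshold_deg = math.degrees(math.asin(1.0 / n))

# Three unit vectors in R^2 at angles 0, 120, and 240 degrees (illustrative choice).
vectors = [
    (math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
    for k in range(3)
]

deviations = [
    deviation_from_right_angle_deg(x, y) for x, y in combinations(vectors, 2)
]

print(threshold_deg)  # approximately 30
print(deviations)     # each approximately 30
```

Here three vectors fit in the plane only because each pair is off from orthogonality by the full threshold; the example is consistent with the statement above but does not by itself prove that no smaller deviation is possible.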
We review both the geometric and the graph-theoretic approach, starting with the geometric one.

Let $E$ be any set and $F$ any family of subsets of $E$; $F$ is a packing if its elements are pairwise disjoint and $F$ is a cover if its union is $E$. For real-valued $f$ and $g$, we write $f(n) \gtrsim g(n)$ and $f(n) \sim g(n)$ to mean, respectively,

$$\lim_{n \to \infty} f(n)/g(n) \ge 1 \quad \text{and} \quad \lim_{n \to \infty} f(n)/g(n) = 1.$$

A simple argument for the existence of large quasiorthogonal sets comes from packing spherical caps into the surface of $S^{n-1}$. Each cap consists of all points on the sphere within a fixed angular distance from some center point. More precisely, let $g \in S^{n-1}$ and let $\varepsilon > 0$. Put

$$C(g, \varepsilon) := \{h \in S^{n-1} \mid \langle h, g \rangle \ge \varepsilon\}.$$

Then $C(g, \varepsilon)$ is the set of all unit vectors within angular distance $\alpha = \arccos(\varepsilon)$ from $g$ (see Fig. 1), i.e., the $\alpha$-ball in the angular metric. As $\varepsilon \to 0^+$, $\arccos(\varepsilon)$ approaches $\pi/2$ from below; that is, the cap is nearly a hemisphere.

Fig. 1. Spherical cap

Theorem 1. Let $0 < \varepsilon < 1$. Then for all integers $n \ge 2$,

$$\dim_\varepsilon(n) \ge e^{n\varepsilon^2/2}.$$

Proof. Let $\mu$ be the rotationally symmetric uniform probability measure on $S^{n-1}$ obtained by normalizing Lebesgue measure. The following upper bound on the measure of a cap is well known (Ball [4, p. 11]):

$$\mu(C(g, \varepsilon)) \le \exp(-n\varepsilon^2/2). \tag{2}$$

Since $\mu(S^{n-1}) = 1$, any family of such caps which covers $S^{n-1}$ has at least $e^{n\varepsilon^2/2}$ members. Kolmogorov and Tikhomirov [36] showed that the cardinality of a minimum covering by balls of radius $r$ bounds from below the size of a maximum packing by balls of radius $r/2$, and the latter equals $\dim_\varepsilon(n)$ [31, Theorem 2.3]. □

These properties of quasiorthogonality were already implicit in earlier literature on packing spherical caps (Rankin [53] in 1955 and Wyner [63] in 1967), as described in [31], which includes a few other early references not given here.

The upper bound in (2) is quite counter-intuitive since, for any fixed $\varepsilon$, the bound becomes very small as $n$ increases. Hence, in high dimension, most of the area of the sphere lies very close to its "equator".
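The cap bound (2) can be probed numerically. The following Monte Carlo sketch is an illustration, not part of the proof: it samples uniform points on $S^{n-1}$ by normalizing standard Gaussian vectors (a standard sampling technique assumed here), takes $g$ to be the first coordinate vector, and compares the observed fraction of points in $C(g, \varepsilon)$ with $\exp(-n\varepsilon^2/2)$. The function names, sample size, seed, and the choice $\varepsilon = 0.3$ are all hypothetical.

```python
import math
import random


def random_unit_vector(n, rng):
    """Uniform random point on S^{n-1}, obtained by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def estimate_cap_measure(n, eps, samples, rng):
    """Monte Carlo estimate of mu(C(g, eps)) with g = e_1.

    A point h lies in the cap exactly when <h, e_1> = h[0] >= eps.
    """
    hits = sum(
        1 for _ in range(samples) if random_unit_vector(n, rng)[0] >= eps
    )
    return hits / samples


def cap_bound(n, eps):
    """Right-hand side of (2): exp(-n * eps^2 / 2)."""
    return math.exp(-n * eps * eps / 2.0)


rng = random.Random(0)  # fixed seed so that repeated runs agree
eps = 0.3
for n in (10, 25, 50):
    estimate = estimate_cap_measure(n, eps, 2000, rng)
    print(n, estimate, cap_bound(n, eps))
```

In such runs both the estimate and the bound shrink as $n$ grows for fixed $\varepsilon$, which is the behavior described above; the estimate is only a sample average, so it carries statistical error and says nothing about the sharpness of (2).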
This concentration of area near the equator is a special case of the phenomenon of concentration of measure, which states that for large dimensions most of the values of Lipschitz continuous functions concentrate closely around their medians (see, e.g., Matoušek [47, p. 337]). Due originally to Lévy [42] and Schmidt [60] (see also Boucheron, Lugosi, and Massart [7, p. 4]), concentration of measure had remained obscure for two decades until it was used by Milman in 1971 to prove a theorem of Dvoretzky, which led to the development of the asymptotic theory of normed linear spaces (Milman and Schechtman [49], Ball [4, pp. 41, 47]).

Quasiorthogonality is also a special case of the Johnson-Lindenstrauss Lemma [30] on linear projections from spaces of high dimension to lower-dimensional subspaces that approximately preserve distance on a given finite set. A function $f$ from $\mathbb{R}^D$ to $\mathbb{R}^d$ is called an $\varepsilon$-isometry w.r.t. a subset $A \subseteq \mathbb{R}^D$ if $f$ changes squared distances by a multiplicative factor of at most $1 \pm \varepsilon$; that is, for all $a, a' \in A$, with $\|\cdot\|$ denoting the Euclidean norm,

$$(1 - \varepsilon)\|a - a'\|^2 \le \|f(a) - f(a')\|^2 \le (1 + \varepsilon)\|a - a'\|^2.$$

One has the following result from [7, pp. 39–42].

Lemma 1 (Johnson-Lindenstrauss). Let $A$ be an $n$-element subset of $\mathbb{R}^D$ and let $\varepsilon, \delta \in (0, 1)$. Suppose a random linear mapping $W : \mathbb{R}^D \to \mathbb{R}^d$ is constructed by choosing the $dD$ entries of the standard representing matrix to be normal random variables, centered at zero with variance 1. Then with probability at least $1 - \delta$, the function $W$ changes the pairwise distances between distinct members of $A$ by a multiplicative factor of at most $1 \pm \varepsilon$ (that is, $W$ is an $\varepsilon$-isometry w.r.t. $A$) provided that

$$d \ge \kappa\, \varepsilon^{-2} \log\!\left(n \delta^{-1/2}\right).$$

The result is essentially sharp and $\kappa$ is a universal constant which is not larger than 20 [7, p.