Fractional Norms and Quasinorms Do Not Help to Overcome the Curse of Dimensionality
16 pages, PDF, 1020 KB
Recommended publications
Curse of Dimensionality for TSK Fuzzy Neural Networks
Curse of Dimensionality for TSK Fuzzy Neural Networks: Explanation and Solutions. Yuqi Cui, Dongrui Wu and Yifan Xu, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China. {yqcui, drwu, yfxu}@hust.edu.cn

Abstract—Takagi-Sugeno-Kang (TSK) fuzzy system with Gaussian membership functions (MFs) is one of the most widely used fuzzy systems in machine learning. However, it usually has difficulty handling high-dimensional datasets. This paper explores why TSK fuzzy systems with Gaussian MFs may fail on high-dimensional inputs. After transforming defuzzification to an equivalent form of softmax function, we find that the poor performance is due to the saturation of softmax. We show that two defuzzification operations, LogTSK and HTSK, the latter of which is first proposed in this paper, can avoid the saturation. Experimental results on datasets with various dimensionalities validated our analysis and demonstrated the effectiveness of LogTSK and HTSK.

Index Terms—Mini-batch gradient descent, fuzzy neural network, high-dimensional TSK fuzzy system, HTSK, LogTSK

… usually use distance based approaches to compute membership grades, so when it comes to high-dimensional datasets, the fuzzy partitions may collapse. For instance, FCM is known to have trouble handling high-dimensional datasets, because the membership grades of different clusters become similar, leading all centers to move to the center of gravity [18]. Most previous works used feature selection or dimensionality reduction to cope with high-dimensionality. Model-agnostic feature selection or dimensionality reduction algorithms, such as Relief [19] and principal component analysis (PCA) [20], [21], can filter the features before feeding them into TSK models. Neural networks pre-trained on large datasets can also be used as feature extractors to generate high-level features with low dimensionality [22], [23].
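The softmax saturation described in this abstract is easy to reproduce numerically. The sketch below is my own construction, not the authors' code: it treats the normalized firing levels of a set of Gaussian rules as a softmax over negative scaled squared distances and shows that they collapse toward a one-hot vector as the input dimension grows. The scale_by_dim option roughly mimics a dimension-dependent rescaling in the spirit of HTSK as I read the abstract; the exact operation in the paper may differ.

    import numpy as np

    rng = np.random.default_rng(0)

    def max_normalized_firing(dim, n_rules=10, sigma=1.0, scale_by_dim=False):
        # One random input and n_rules random Gaussian rule centers in `dim` dimensions.
        x = rng.standard_normal(dim)
        centers = rng.standard_normal((n_rules, dim))
        sq_dist = ((x - centers) ** 2).sum(axis=1)   # per-rule squared distance to its center
        logits = -sq_dist / (2.0 * sigma ** 2)       # log firing levels of the Gaussian MFs
        if scale_by_dim:
            logits = logits / dim                    # dimension-dependent rescaling (HTSK-like)
        logits -= logits.max()                       # numerically stable softmax
        w = np.exp(logits)
        w /= w.sum()                                 # normalized firing levels = softmax
        return w.max()

    for d in (5, 50, 500):
        print(d,
              round(max_normalized_firing(d), 3),
              round(max_normalized_firing(d, scale_by_dim=True), 3))

As the dimension grows, the unscaled variant should return values approaching 1 (a single rule dominates, so gradients through all other rules vanish), while the rescaled variant stays far from saturation.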
Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering
Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering. Alexander Hinneburg, Daniel A. Keim. [email protected], [email protected]. Institute of Computer Science, University of Halle, Kurt-Mothes-Str. 1, 06120 Halle (Saale), Germany.

Abstract. Many applications require the clustering of large amounts of high-dimensional data. Most clustering algorithms, however, do not work effectively and efficiently in high-dimensional space, which is due to the so-called "curse of dimensionality". In addition, the high-dimensional data often contains a significant amount of noise which causes additional effectiveness problems. In this paper, we review and compare the existing algorithms for clustering high-dimensional data and show the impact of the curse of dimensionality on their effectiveness and efficiency. The comparison reveals that condensation-based approaches (such as BIRCH or STING) are the most promising candidates for achieving the necessary efficiency, but it also shows that basically all condensation-based approaches have severe weaknesses with respect to their effectiveness in high-dimensional …

1 Introduction. Because of the fast technological progress, the amount of data which is stored in databases increases very fast. This is true for traditional relational databases but also for databases of complex 2D and 3D multimedia data such as image, CAD, geographic, and molecular biology data. It is obvious that relational databases can be seen as high-dimensional databases (the attributes correspond to the dimensions of the data set), but it is also true for multimedia data which, for an efficient retrieval, is usually transformed into high-dimensional feature vectors such as color histograms [SH94], shape descriptors [Jag91, MG95], Fourier vectors [WW80], and text descriptors [Kuk92]. In many of the mentioned applications, the databases are very large and consist of millions of data objects with several tens to a few hundreds of dimensions.
Functional Analysis 1 Winter Semester 2013-14
Functional analysis 1, Winter semester 2013-14. 1. Topological vector spaces. Basic notions.

Notation. (a) The symbol F stands for the set of all reals or for the set of all complex numbers. (b) Let (X, τ) be a topological space and x ∈ X. An open set G containing x is called a neighborhood of x. We denote τ(x) = {G ∈ τ : x ∈ G}.

Definition. Suppose that τ is a topology on a vector space X over F such that
• (X, τ) is T1, i.e., {x} is a closed set for every x ∈ X, and
• the vector space operations are continuous with respect to τ, i.e., + : X × X → X and · : F × X → X are continuous.
Under these conditions, τ is said to be a vector topology on X and (X, +, ·, τ) is a topological vector space (TVS).

Remark. Let X be a TVS. (a) For every a ∈ X the mapping x ↦ x + a is a homeomorphism of X onto X. (b) For every λ ∈ F \ {0} the mapping x ↦ λx is a homeomorphism of X onto X.

Definition. Let X be a vector space over F. We say that A ⊂ X is
• balanced if for every α ∈ F, |α| ≤ 1, we have αA ⊂ A,
• absorbing if for every x ∈ X there exists t ∈ R, t > 0, such that x ∈ tA,
• symmetric if A = −A.

Definition. Let X be a TVS and A ⊂ X. We say that A is bounded if for every V ∈ τ(0) there exists s > 0 such that for every t > s we have A ⊂ tV.
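For orientation, here is a small worked example (added here; it is not part of the excerpt): in a normed space (X, ‖·‖) the open unit ball has every property just defined, using only the norm axioms.

\[
B=\{x\in X:\|x\|<1\}:\qquad
\|\alpha x\|=|\alpha|\,\|x\|<1\ \text{for}\ |\alpha|\le 1\ \Rightarrow\ B\ \text{is balanced};\qquad
B=-B\ \text{(symmetric)};
\]
\[
x\neq 0,\ t>\|x\|\ \Rightarrow\ \|x/t\|<1,\ \text{so}\ x\in tB\ \text{(absorbing)};\qquad
V\in\tau(0),\ \{y:\|y\|<r\}\subset V,\ t>1/r\ \Rightarrow\ B\subset tV\ \text{(bounded)}.
\]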
HYPERCYCLIC SUBSPACES in FRÉCHET SPACES 1. Introduction
PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY, Volume 134, Number 7, Pages 1955–1961. S 0002-9939(05)08242-0. Article electronically published on December 16, 2005. HYPERCYCLIC SUBSPACES IN FRÉCHET SPACES. L. BERNAL-GONZÁLEZ (Communicated by N. Tomczak-Jaegermann). Dedicated to the memory of Professor Miguel de Guzmán, who died in April 2004.

Abstract. In this note, we show that every infinite-dimensional separable Fréchet space admitting a continuous norm supports an operator for which there is an infinite-dimensional closed subspace consisting, except for zero, of hypercyclic vectors. The family of such operators is even dense in the space of bounded operators when endowed with the strong operator topology. This completes the earlier work of several authors.

1. Introduction and notation. Throughout this paper, the following standard notation will be used: N is the set of positive integers, R is the real line, and C is the complex plane. The symbols (m_k), (n_k) will stand for strictly increasing sequences in N. If X, Y are (Hausdorff) topological vector spaces (TVSs) over the same field K = R or C, then L(X, Y) will denote the space of continuous linear mappings from X into Y, while L(X) is the class of operators on X, that is, L(X) = L(X, X). The strong operator topology (SOT) in L(X) is the one where the convergence is defined as pointwise convergence at every x ∈ X. A sequence (T_n) ⊂ L(X, Y) is said to be universal or hypercyclic provided there exists some vector x_0 ∈ X—called hypercyclic for the sequence (T_n)—such that its orbit {T_n x_0 : n ∈ N} under (T_n) is dense in Y.
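A classical single-operator instance of this definition (a standard example added for context; it does not appear in the excerpt) is Rolewicz's operator on ℓ²:

\[
T = 2B \in L(\ell^{2}),\qquad B(x_{1},x_{2},x_{3},\dots)=(x_{2},x_{3},\dots).
\]

Taking T_n = T^n in the definition above, there is a vector x_0 ∈ ℓ² whose orbit {T^n x_0 : n ∈ N} is dense in ℓ², i.e., T is hypercyclic (Rolewicz, 1969). No multiple λB with |λ| ≤ 1 can have this property, since then every orbit is bounded and hence not dense.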
High Dimensional Geometry, Curse of Dimensionality, Dimension Reduction
princeton univ. F'16 cos 521: Advanced Algorithm Design. Lecture 9: High Dimensional Geometry, Curse of Dimensionality, Dimension Reduction. Lecturer: Pravesh Kothari. Scribe: Sanjeev Arora.

High-dimensional vectors are ubiquitous in applications (gene expression data, set of movies watched by a Netflix customer, etc.) and this lecture seeks to introduce high-dimensional geometry. We encounter the so-called curse of dimensionality, which refers to the fact that algorithms are simply harder to design in high dimensions and often have a running time exponential in the dimension. We also encounter the blessings of dimensionality, which allow us to reason about higher-dimensional geometry using tools like Chernoff bounds. We also show the possibility of dimension reduction: it is sometimes possible to reduce the dimension of a dataset, for some purposes.

Notation: For a vector x ∈ R^d its ℓ2-norm is |x|_2 = (Σ_i x_i^2)^{1/2} and the ℓ1-norm is |x|_1 = Σ_i |x_i|. For any two vectors x, y their Euclidean distance refers to |x − y|_2 and Manhattan distance refers to |x − y|_1.

High-dimensional geometry is inherently different from low-dimensional geometry.

Example 1. Consider how many almost orthogonal unit vectors we can have in space, such that all pairwise angles lie between 88 degrees and 92 degrees. In R^2 the answer is 2. In R^3 it is 3. (Prove these to yourself.) In R^d the answer is exp(cd) where c > 0 is some constant.

[Figure 1: Number of almost-orthogonal vectors in R^2, R^3, R^d.]

Example 2. Another example is the ratio of the volume of the unit sphere to its circumscribing cube (i.e. …
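Example 1 is easy to check numerically. The sketch below is added here and is not part of the lecture notes: it samples random unit vectors in R^d and measures how far the pairwise angles stray from 90 degrees. The deviation shrinks roughly like sqrt(log n / d), so in high dimension even hundreds of random vectors are nearly orthogonal.

    import numpy as np

    rng = np.random.default_rng(0)

    def max_angle_deviation(d, n=200):
        """Largest deviation (degrees) of any pairwise angle from 90 degrees,
        over n random unit vectors in R^d."""
        X = rng.standard_normal((n, d))
        X /= np.linalg.norm(X, axis=1, keepdims=True)        # project onto the unit sphere
        cosines = np.abs((X @ X.T)[np.triu_indices(n, k=1)])  # pairwise |cos(angle)|
        return 90.0 - np.degrees(np.arccos(cosines.max()))

    for d in (3, 30, 300, 3000):
        print(f"d={d:5d}  max deviation from 90 deg: {max_angle_deviation(d):6.2f}")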
Lp and Sobolev Spaces
NOTES ON Lp AND SOBOLEV SPACES. STEVE SHKOLLER.

1. Lp spaces. 1.1. Definitions and basic properties.

Definition 1.1. Let 0 < p < ∞ and let (X, M, µ) denote a measure space. If f : X → R is a measurable function, then we define
‖f‖_{L^p(X)} := ( ∫_X |f|^p dx )^{1/p} and ‖f‖_{L^∞(X)} := ess sup_{x∈X} |f(x)|.
Note that ‖f‖_{L^p(X)} may take the value ∞.

Definition 1.2. The space L^p(X) is the set
L^p(X) = { f : X → R | ‖f‖_{L^p(X)} < ∞ }.
The space L^p(X) satisfies the following vector space properties:
(1) For each α ∈ R, if f ∈ L^p(X) then αf ∈ L^p(X);
(2) If f, g ∈ L^p(X), then |f + g|^p ≤ 2^{p−1}(|f|^p + |g|^p), so that f + g ∈ L^p(X).
(3) The triangle inequality is valid if p ≥ 1.
The most interesting cases are p = 1, 2, ∞, while all of the L^p arise often in nonlinear estimates.

Definition 1.3. The space ℓ^p, called "little L^p", will be useful when we introduce Sobolev spaces on the torus and the Fourier series. For 1 ≤ p < ∞, we set
ℓ^p = { (x_n)_{n=1}^∞ | Σ_{n=1}^∞ |x_n|^p < ∞ }.

1.2. Basic inequalities.

Lemma 1.4. For λ ∈ (0, 1), x^λ ≤ (1 − λ) + λx.
Proof. Set f(x) = (1 − λ) + λx − x^λ; hence, f′(x) = λ − λx^{λ−1} = 0 if and only if λ(1 − x^{λ−1}) = 0, so that x = 1 is the critical point of f. In particular, the minimum occurs at x = 1 with value f(1) = 0 ≤ (1 − λ) + λx − x^λ.

Lemma 1.5. …
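A standard application of Lemma 1.4 (a worked example added here, not part of the excerpt) is Young's inequality, which underlies Hölder's inequality and hence the triangle inequality for p ≥ 1. For conjugate exponents 1/p + 1/q = 1 with 1 < p < ∞ and a, b > 0, apply the lemma with λ = 1/p and x = a^p/b^q:

\[
\Bigl(\frac{a^{p}}{b^{q}}\Bigr)^{1/p}\le\Bigl(1-\tfrac{1}{p}\Bigr)+\tfrac{1}{p}\,\frac{a^{p}}{b^{q}}
=\frac{1}{q}+\frac{a^{p}}{p\,b^{q}} .
\]

Multiplying through by b^q and using q − q/p = 1 gives ab ≤ a^p/p + b^q/q.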
Fact Sheet Functional Analysis
Fact Sheet Functional Analysis.

Literature: Hackbusch, W.: Theorie und Numerik elliptischer Differentialgleichungen. Teubner, 1986. Knabner, P., Angermann, L.: Numerik partieller Differentialgleichungen. Springer, 2000. Triebel, H.: Höhere Analysis. Harri Deutsch, 1980. Dobrowolski, M.: Angewandte Funktionalanalysis. Springer, 2010.

1. Banach and Hilbert spaces. Let V be a real vector space.

Normed space: A norm is a mapping ‖·‖ : V → [0, ∞) such that
‖u‖ = 0 ⇔ u = 0 (definiteness),
‖αu‖ = |α| · ‖u‖, α ∈ R, u ∈ V (positive scalability),
‖u + v‖ ≤ ‖u‖ + ‖v‖, u, v ∈ V (triangle inequality).
The pairing (V, ‖·‖) is called a normed space.

Seminorm: In contrast to a norm there may be elements u ≠ 0 such that ‖u‖ = 0. It still holds that ‖u‖ = 0 if u = 0.

Comparison of two norms: Two norms ‖·‖_1, ‖·‖_2 are called equivalent if there is a constant C such that
C^{-1} ‖u‖_1 ≤ ‖u‖_2 ≤ C ‖u‖_1, u ∈ V.
If only one of these inequalities can be fulfilled, e.g. ‖u‖_2 ≤ C ‖u‖_1, u ∈ V, the norm ‖·‖_1 is called stronger than the norm ‖·‖_2, and ‖·‖_2 is called weaker than ‖·‖_1.

Topology: In every normed space a canonical topology can be defined. A subset U ⊂ V is called open if for every u ∈ U there exists an ε > 0 such that
B_ε(u) = {v ∈ V : ‖u − v‖ < ε} ⊂ U.

Convergence: A sequence v_n converges to v w.r.t. the norm ‖·‖ if
lim_{n→∞} ‖v_n − v‖ = 0.
A sequence v_n ⊂ V is called a Cauchy sequence if sup{‖v_n − v_m‖ : n, m ≥ k} → 0 for k → ∞.
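A concrete pair of equivalent norms (an illustration added here, not part of the fact sheet): on V = R^n with ‖x‖_1 = Σ_i |x_i| and ‖x‖_∞ = max_i |x_i|,

\[
\|x\|_{\infty}\le\|x\|_{1}\le n\,\|x\|_{\infty},\qquad x\in\mathbb{R}^{n},
\]

so the two norms are equivalent with C = n. By contrast, on C[0,1] the integral norm ‖u‖_1 = ∫_0^1 |u(t)| dt satisfies only ‖u‖_1 ≤ ‖u‖_∞, so the supremum norm is stronger than the integral norm but the two are not equivalent.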
Basic Differentiable Calculus Review
Basic Differentiable Calculus Review. Jimmie Lawson, Department of Mathematics, Louisiana State University. Spring, 2003.

1 Introduction. Basic facts about the multivariable differentiable calculus are needed as background for differentiable geometry and its applications. The purpose of these notes is to recall some of these facts. We state the results in the general context of Banach spaces, although we are specifically concerned with the finite-dimensional setting, specifically that of R^n.

Let U be an open set of R^n. A function f : U → R is said to be a C^r map for 0 ≤ r ≤ ∞ if all partial derivatives up through order r exist at all points of U and are continuous. In the extreme cases, C^0 means that f is continuous and C^∞ means that all partials of all orders exist and are continuous on U. A function f : U → R^m is a C^r map if f_i := π_i ∘ f is C^r for i = 1, …, m, where π_i : R^m → R is the i-th projection map defined by π_i(x_1, …, x_m) = x_i. It is a standard result that mixed partials of degree less than or equal to r and of the same type up to interchanges of order are equal for a C^r function (sometimes called Clairaut's Theorem). We can consider a category whose objects are nonempty open subsets of R^n for various n and whose morphisms are C^r maps. This is indeed a category, since the composition of C^r maps is again a C^r map.

2 Normed Spaces and Bounded Linear Operators. At the heart of the differential calculus is the notion of a differentiable function.
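A one-line instance of Clairaut's theorem (added for illustration): for the C^∞ function f(x, y) = x²y on R²,

\[
\partial_{y}\partial_{x}f=\partial_{y}(2xy)=2x=\partial_{x}(x^{2})=\partial_{x}\partial_{y}f ,
\]

so the two second-order mixed partials agree, as the theorem guarantees for any C² function.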
Clustering for High Dimensional Data: Density Based Subspace Clustering Algorithms
International Journal of Computer Applications (0975 – 8887), Volume 63, No. 20, February 2013. Clustering for High Dimensional Data: Density based Subspace Clustering Algorithms. Sunita Jahirabadkar and Parag Kulkarni, Department of Computer Engineering & I.T., College of Engineering, Pune, India.

ABSTRACT. Finding clusters in high dimensional data is a challenging task as the high dimensional data comprises hundreds of attributes. Subspace clustering is an evolving methodology which, instead of finding clusters in the entire feature space, aims at finding clusters in various overlapping or non-overlapping subspaces of the high dimensional dataset. Density based subspace clustering algorithms treat clusters as dense regions compared to noise or border regions. Many momentous density based subspace clustering algorithms exist in the literature. Each of them is characterized by different assumptions, input parameters or the use of different techniques. Hence it is quite unfeasible for future developers to compare all these algorithms using one common …

… hiding clusters in noisy data. Then, a common approach is to reduce the dimensionality of the data, of course, without losing important information. Fundamental techniques to eliminate such irrelevant dimensions are feature selection or feature transformation techniques [5]. Feature transformation methods such as aggregation, dimensionality reduction etc. project the higher dimensional data onto a smaller space while preserving the distances between the original data objects. The commonly used methods are Principal Component Analysis [6, 7], Singular Value Decomposition [8] etc. The major limitation of feature transformation approaches is that they do not actually remove any of the attributes, and hence the information from the not-so-useful dimensions is preserved, making the clusters less meaningful.
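A minimal sketch of the feature-transformation route mentioned above (PCA before clustering), added for illustration only; it assumes scikit-learn and uses hypothetical toy data rather than anything from the paper.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    # Toy data: 3 clusters living in 5 informative dimensions,
    # padded with 95 irrelevant noise dimensions.
    centers = 3.0 * rng.standard_normal((3, 5))
    labels = rng.integers(0, 3, size=300)
    informative = centers[labels] + 0.3 * rng.standard_normal((300, 5))
    X = np.hstack([informative, rng.standard_normal((300, 95))])

    # Project onto a few principal components, then cluster in the reduced space.
    X_low = PCA(n_components=5).fit_transform(X)
    pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_low)
    print(pred[:20])
    print(labels[:20])

As the excerpt points out, the projection still mixes all original attributes, so the reduced features are harder to interpret than a result that identifies clusters together with the subspaces they live in.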
Spectral Radii of Bounded Operators on Topological Vector Spaces
SPECTRAL RADII OF BOUNDED OPERATORS ON TOPOLOGICAL VECTOR SPACES. VLADIMIR G. TROITSKY.

Abstract. In this paper we develop a version of spectral theory for bounded linear operators on topological vector spaces. We show that the Gelfand formula for spectral radius and Neumann series can still be naturally interpreted for operators on topological vector spaces. Of course, the resulting theory has many similarities to the conventional spectral theory of bounded operators on Banach spaces, though there are several important differences. The main difference is that an operator on a topological vector space has several spectra and several spectral radii, which fit a well-organized pattern.

0. Introduction. The spectral radius of a bounded linear operator T on a Banach space is defined by the Gelfand formula r(T) = lim_{n→∞} ‖T^n‖^{1/n}. It is well known that r(T) equals the actual radius of the spectrum σ(T), that is, r(T) = sup{|λ| : λ ∈ σ(T)}. Further, it is known that the resolvent R_λ = (λI − T)^{−1} is given by the Neumann series Σ_{i=0}^{∞} T^i/λ^{i+1} whenever |λ| > r(T). It is natural to ask if similar results are valid in a more general setting, e.g., for a bounded linear operator on an arbitrary topological vector space. The author arrived at these questions when generalizing some results on the Invariant Subspace Problem from Banach lattices to ordered topological vector spaces. One major difficulty is that it is not clear which class of operators should be considered, because there are several non-equivalent ways of defining bounded operators on topological vector spaces.
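A tiny (finite-dimensional) Banach-space example of both formulas, added for orientation and not taken from the paper:

\[
A=\begin{pmatrix}0&1\\0&0\end{pmatrix}:\qquad
\|A\|=1,\quad A^{n}=0\ (n\ge 2),\quad
r(A)=\lim_{n\to\infty}\|A^{n}\|^{1/n}=0=\sup\{|\lambda|:\lambda\in\sigma(A)\},
\]
\[
(\lambda I-A)^{-1}=\frac{I}{\lambda}+\frac{A}{\lambda^{2}}\qquad(\lambda\neq 0),
\]

so the Neumann series terminates and converges for every |λ| > r(A) = 0, whereas the cruder bound |λ| > ‖A‖ = 1 alone would be far more restrictive.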
Note on the Solvability of Equations Involving Unbounded Linear and Quasibounded Nonlinear Operators*
JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS 56, 495-501 (1976). Note on the Solvability of Equations Involving Unbounded Linear and Quasibounded Nonlinear Operators*. W. V. PETRYSHYN, Department of Mathematics, Rutgers University, New Brunswick, New Jersey 08903. Submitted by Ky Fan.

INTRODUCTION. Let X and Y be Banach spaces, X* and Y* their respective duals, and (u, x) the value of u in X* at x in X. For a bounded linear operator T : X → Y (i.e., for T ∈ L(X, Y)) let N(T) ⊂ X and R(T) ⊂ Y denote the null space and the range of T respectively, and let T* : Y* → X* denote the adjoint of T. For any set V in X and W in X* we set V^⊥ = {u ∈ X* | (u, x) = 0 for all x ∈ V} and W^⊥ = {x ∈ X | (u, x) = 0 for all u ∈ W}. In [6] Kachurovskii obtained (without proof) a partial generalization of the closed range theorem of Banach for mildly nonlinear equations, which can be stated as follows:

THEOREM K. Let T ∈ L(X, Y) have a closed range R(T) and a finite-dimensional null space N(T). Let S : X → Y be a nonlinear compact mapping such that R(S) ⊂ N(T*)^⊥ and S is asymptotically zero.¹ Then the equation Tx + S(x) = y is solvable for a given y in Y if and only if y ∈ N(T*)^⊥.

The purpose of this note is to extend Theorem K to equations of the form Tx + S(x) = y (x ∈ D(T), y ∈ Y), (1) where T : D(T) ⊂ X → Y is a closed linear operator with domain D(T) not necessarily dense in X and such that R(T) is closed and N(T) (⊂ D(T)) has a closed complementary subspace in X, and S : X → Y is a nonlinear quasibounded mapping but not necessarily compact or asymptotically zero.
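The linear core of Theorem K (the case S ≡ 0) is already familiar from finite dimensions; the following restatement is added for orientation and is not part of the note. For a real m × n matrix A, the fundamental theorem of linear algebra gives

\[
R(A)=N(A^{\mathsf T})^{\perp},\qquad\text{so}\quad Ax=y\ \text{is solvable}\iff \langle u,y\rangle=0\ \text{for every}\ u\ \text{with}\ A^{\mathsf T}u=0 .
\]

Theorem K says that this solvability criterion survives a compact, asymptotically zero nonlinear perturbation S whose range lies in N(T*)^⊥.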
Normed Linear Spaces of Continuous Functions
NORMED LINEAR SPACES OF CONTINUOUS FUNCTIONS. S. B. MYERS.

1. Introduction. In addition to its well known role in analysis, based on measure theory and integration, the study of the Banach space B(X) of real bounded continuous functions on a topological space X seems to be motivated by two major objectives. The first of these is the general question as to relations between the topological properties of X and the properties (algebraic, topological, metric) of B(X) and its linear subspaces. The impetus to the study of this question has been given by various results which show that, under certain natural restrictions on X, the topological structure of X is completely determined by the structure of B(X) [3; 16; 7],¹ and even by the structure of a certain type of subspace of B(X) [14]. Beyond these foundational theorems, the results are as yet meager and exploratory. It would be exciting (but surprising) if some natural metric property of B(X) were to lead to the unearthing of a new topological concept or theorem about X. The second goal is to obtain information about the structure and classification of real Banach spaces. The hope in this direction is based on the fact that every Banach space is (equivalent to) a linear subspace of B(X) [1] for some compact (that is, bicompact Hausdorff) X. Properties have been found which characterize the spaces B(X) among all Banach spaces [6; 2; 14], and more generally, properties which characterize those Banach spaces which determine the topological structure of some compact or completely regular X [14; 15].