Proceedings of Machine Learning Research vol 75:1–32, 2018
31st Annual Conference on Learning Theory

Fitting a putative manifold to noisy data

Charles Fefferman
Princeton University, Mathematics Department, Fine Hall, Washington Road, Princeton NJ, 08544-1000, USA.

Sergei Ivanov
Steklov Institute of Mathematics, Russian Academy of Sciences, 27 Fontanka, 191023 St. Petersburg, Russia.

Yaroslav Kurylev
University College London, Department of Mathematics, Gower Street WC1E 6BT, London, UK.

Matti Lassas
University of Helsinki, Department of Mathematics and Statistics, P.O. Box 68, 00014, Helsinki, Finland.

Hariharan Narayanan
School of Technology and Computer Science, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400 005, India.

Editors: Sebastien Bubeck, Vianney Perchet and Philippe Rigollet

Abstract
In the present work, we give a solution to the following question from manifold learning. Suppose data belonging to a high dimensional Euclidean space is drawn independently, identically distributed from a measure supported on a low dimensional twice differentiable embedded manifold M, and corrupted by a small amount of Gaussian noise. How can we produce a manifold M_o whose Hausdorff distance to M is small and whose reach is not much smaller than the reach of M?

Keywords: Manifold learning, Hausdorff distance, reach

1. Introduction
One of the main challenges in high dimensional data analysis is dealing with the exponential growth of the computational and sample complexity of generic inference tasks as a function of dimension, a phenomenon termed “the curse of dimensionality”.