entropy

Article

Robust and Scalable Learning of Complex Intrinsic Dataset Geometry via ElPiGraph

Luca Albergante 1,2,3,4,*, Evgeny Mirkes 5,6, Jonathan Bac 1,2,3,7, Huidong Chen 8,9, Alexis Martin 1,2,3,10, Louis Faure 1,2,3,11, Emmanuel Barillot 1,2,3, Luca Pinello 8,9, Alexander Gorban 5,6 and Andrei Zinovyev 1,2,3,*

1 Institut Curie, PSL Research University, 75005 Paris, France; [email protected] (J.B.); [email protected] (A.M.); [email protected] (L.F.); [email protected] (E.B.)
2 INSERM U900, 75248 Paris, France
3 CBIO-Centre for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
4 Sensyne Health, Oxford OX4 4GE, UK
5 Center for Mathematical Modeling, University of Leicester, Leicester LE1 7RH, UK; [email protected] (E.M.); [email protected] (A.G.)
6 Lobachevsky University, 603000 Nizhny Novgorod, Russia
7 Centre de Recherches Interdisciplinaires, Université de Paris, F-75000 Paris, France
8 Molecular Pathology Unit & Cancer Center, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA 02114, USA; [email protected] (H.C.); [email protected] (L.P.)
9 Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
10 ECE Paris, F-75015 Paris, France
11 Center for Brain Research, Medical University of Vienna, 22180 Vienna, Austria
* Correspondence: [email protected] (L.A.); [email protected] (A.Z.)

Received: 6 December 2019; Accepted: 2 March 2020; Published: 4 March 2020

Abstract: Multidimensional datapoint clouds representing large datasets are frequently characterized by non-trivial low-dimensional geometry and topology which can be recovered by unsupervised machine learning approaches, in particular, by principal graphs.