Arxiv:1901.09057V3 [Math.DG]

Total Page:16

File Type:pdf, Size:1020Kb

Arxiv:1901.09057V3 [Math.DG] MSC Classification: 58D15, 68T99. Keywords: Manifold Learning, Embedding Spaces, Discretized Gradient Flow DISCRETIZED GRADIENT FLOW FOR MANIFOLD LEARNING DARA GOLD AND STEVEN ROSENBERG Abstract. Gradient descent, or negative gradient flow, is a standard technique in opti- mization to find minima of functions. Many implementations of gradient descent rely on discretized versions, i.e., moving in the gradient direction for a set step size, recomputing the gradient, and continuing. In this paper, we present an approach to manifold learning where gradient descent takes place in the infinite dimensional space E = Emb(M, RN ) of smooth embeddings φ of a manifold M into RN . Implementing a discretized version of gra- dient descent for P : E −→ R, a penalty function that scores an embedding φ ∈E, requires estimating how far we can move in a fixed direction – the direction of one gradient step – before leaving the space of smooth embeddings. Our main result is to give an explicit lower bound for this step length in terms of the Riemannian geometry of φ(M). In particular, we consider the case when the gradient of P is pointwise normal to the embedded manifold φ(M). We prove this case arises when P is invariant under diffeomorphisms of M, a natural condition in manifold learning. 1. Introduction A common approach in data analysis and machine learning is manifold learning, i.e., deter- L RN mining how to approximate a finite set of {yi}i=1 in Euclidean space by a k-dimensional embedded, compact manifold M for some k ≪ N [6, 11, 12, 18, 25, 33]. The mathematical setup involves the space of smooth embeddings E = Emb(M, RN ) considered as an open subset of the infinite dimensional vector space of all maps from M to RN with the Banach space topology coming from a high Sobolev norm or the C∞ Fr´echet space topology. We also have a C1 penalty function P : E → R which typically contains a data fitting term and a regularization term, as explained below. (In keeping with the literature, we assume that k and the diffeomorphism type of M are given.) In theory, finding a global minimum of P via the negative gradient flow of P on E gives an optimal embedding, or one that “best fits” the training set {yi}. arXiv:1901.09057v3 [math.DG] 23 Dec 2020 To avoid overfitting – or choosing a φ such that φ(M) fits the {yi} very closely but performs poorly on new data points – a penalty function P can penalize φ(M) both for being too far from {yi} and for “twisting too much” to fit the data. Thus a typical penalty function R r 2 P = P1 +P2 : E −→ contains two terms: (i) a data fitting term P1(φ)= i=1 d (φ(M),yi), where d(φ(M),yi) is the Euclidean distance from yi to the closest point in φ(M); (ii) a P regularization term P2 designed to prevent overfitting, e.g. P2(φ) = kφks, the s-Sobolev norm of φ. (For overviews of this standard approach, see [5], [36].) Gradient descent, i.e., moving in the direction of −∇P in E, can find a local or global minimum of P , or an optimal manifold to fit {yi}. While there are theoretical challenges with this setup, we focus on an implementation problem in this paper. In theory, to find a negative gradient flow line on E, we need to know the gradient of P at each point of E. This is generally intractable for computer calculations. Instead, the gradient flow is often discretized: we move in the negative gradient direction from an initial point φ0 for a fixed step size to a new point φ1, stop and recompute the gradient at φ1, then iterate until the gradient is smaller than a specified amount. Since a gradient vector in the tangent space of E corresponds to a vector field along φ(M), we need 1 2 DARA GOLD AND STEVEN ROSENBERG to estimate a lower bound for how far we can move from a fixed embedding φ in the negative gradient direction −∇Pφ and still remain in the space of embeddings. This practical issue is the main focus of this paper. In the main result, Theorem 3, we provide such a lower bound. We focus on the case where ∇Pφ is pointwise normal to φ(M), and motivate this restriction in §4. An example of manifold learning is the “learning” of black and white photographic images. Here an image with N pixels is stored as a vector in RN with each component taking values in a finite set which codifies the greyscale magnitude of the pixel. We assume that there locally exist k components of these vectors which implicitly determine the remaining N − k components. In the generic situation where the implicit function theorem applies, the set of images is then determined by an unknown k-manifold embedded in RN . Thus the computationally expensive task of storing many vectors in RN is reduced to finding a much lower dimensional manifold. Given a sample set {yi} of images, we fix a k-manifold M and N search for an embedding φ : M −→ R with φ(xi)= yi, for some xi ∈ M. The crucial issue of the topological type of M is discussed in [15]. This setup applies to recovering partially occluded objects in an image. In this case, the training set might consist of L unoccluded pictures of cars. Given new images of a car partially blocked by a stop sign, say only k pixels in each image are unblocked. Under the assumption that there is a unique way to fill in N −k remaining pixels, a “best fit” k-manifold determines the rest of the car’s shape [1, 21, 26]. We emphasize that our approach to manifold learning directly tackles the infinite di- mensional nature of this optimization problem without making any simplifying choices that reduce the problem to finite dimensions. Typical choices in the literature are parametric methods, which fix a finite dimensional parameter space of embeddings, and RKHS meth- ods, which reduce the optimization to a finite dimensional problem via the Representation Theorem, but only after making a choice of kernel function. In contrast, our approach only assumes that M is compact, possibly with boundary, and so must contend with infinite di- mensional analytic issues. Since we are given a finite set of training data, the compactness assumption is reasonable. We briefly discuss the issues with directly working with the smooth gradient flow on E. It may be difficult to prove that P is differentiable for typical data terms which measure the minimum distance from a data point to the embedded manifold. Even if P is differentiable, in this infinite dimensional case it is not clear that a gradient flow line γ(t) stays in E or converges as t −→ ∞ to a critical point of P , as E is an open dense set in the space of smooth maps from M to RN . (In finite dimensional problems, open dense sets do not occur.) Even if we can prove convergence, since neither P nor E is in general convex, calculus methods cannot tell us if a critical point is a local, much less a global, minimum. Perhaps most fundamentally, even the short time existence for the gradient flow may be difficult to establish, particularly if we use the most natural C∞ topology on E. These problems are well known in differential geometry, e.g., in the study of minimal submanifolds. In contrast, discretized gradient flow is both a tool for theoretical results on gradient flow [2, Ch. 11] and for computer implementations based on discretized, usually linearized, versions of gradient flow [13], although these are not guaranteed to converge [10]. In the theoretical direction, we argue that the entire penalty term should be invariant under the diffeomorphism group of M, just like typical data terms which measure the distance from data points to φ(M). In particular, regularization terms built from geometric quantities like the volume or total curvature of M have this invariance. In contrast, more familiar regular- ization terms like a Sobolev norm of the embedding are not diffeomorphism invariant. We prove in Theorem 1 that for a diffeomorphism invariant penalty function P , the gradient vector field ∇Pφ is guaranteed to be pointwise normal to φ(M). This simplifies the discretiza- tion process, and leads us to determine the distance one can move in a pointwise normal DISCRETIZEDGRADIENTFLOWFOR MANIFOLDLEARNING 3 direction to an embedding φ(M) and still remain in E. In the applied direction, Theorem 5.7 an explicit lower bound for this distance in terms of the geometry of φ(M) and knowledge of local coordinate charts for M. 2. Related work In addition to manifold learning, the use of gradient flow for functionals on infinite dimen- sional manifolds of maps has a large literature in machine learning, where this comes under the general heading of nonparametric methods. (In the parametric approach, one restricts attention to a finite dimensional submanifold depending on a finite dimensional family of parameters, usually in some Rn for some n.) Osher and Sethian introduced the Level Set Method [32], in which a decision boundary is treated as the level set of a function. Viewing the decision boundary this way avoids typical problems that arise with cusps and discon- tinuities in a flow whose speed is curvature dependent. This work has been extended in many directions, e.g., [30, 35, 37]. In supervised learning, [4] applied geometric gradient flow techniques to optimize a statistical labeling function using a penalty function that has a data term and a geometric regularization term, giving more refined information than contained in the decision boundary.
Recommended publications
  • I. Overview of Activities, April, 2005-March, 2006 …
    MATHEMATICAL SCIENCES RESEARCH INSTITUTE ANNUAL REPORT FOR 2005-2006 I. Overview of Activities, April, 2005-March, 2006 …......……………………. 2 Innovations ………………………………………………………..... 2 Scientific Highlights …..…………………………………………… 4 MSRI Experiences ….……………………………………………… 6 II. Programs …………………………………………………………………….. 13 III. Workshops ……………………………………………………………………. 17 IV. Postdoctoral Fellows …………………………………………………………. 19 Papers by Postdoctoral Fellows …………………………………… 21 V. Mathematics Education and Awareness …...………………………………. 23 VI. Industrial Participation ...…………………………………………………… 26 VII. Future Programs …………………………………………………………….. 28 VIII. Collaborations ………………………………………………………………… 30 IX. Papers Reported by Members ………………………………………………. 35 X. Appendix - Final Reports ……………………………………………………. 45 Programs Workshops Summer Graduate Workshops MSRI Network Conferences MATHEMATICAL SCIENCES RESEARCH INSTITUTE ANNUAL REPORT FOR 2005-2006 I. Overview of Activities, April, 2005-March, 2006 This annual report covers MSRI projects and activities that have been concluded since the submission of the last report in May, 2005. This includes the Spring, 2005 semester programs, the 2005 summer graduate workshops, the Fall, 2005 programs and the January and February workshops of Spring, 2006. This report does not contain fiscal or demographic data. Those data will be submitted in the Fall, 2006 final report covering the completed fiscal 2006 year, based on audited financial reports. This report begins with a discussion of MSRI innovations undertaken this year, followed by highlights
    [Show full text]
  • Clay Mathematics Institute 2005 James A
    Contents Clay Mathematics Institute 2005 James A. Carlson Letter from the President 2 The Prize Problems The Millennium Prize Problems 3 Recognizing Achievement 2005 Clay Research Awards 4 CMI Researchers Summary of 2005 6 Workshops & Conferences CMI Programs & Research Activities Student Programs Collected Works James Arthur Archive 9 Raoul Bott Library CMI Profile Interview with Research Fellow 10 Maria Chudnovsky Feature Article Can Biology Lead to New Theorems? 13 by Bernd Sturmfels CMI Summer Schools Summary 14 Ricci Flow, 3–Manifolds, and Geometry 15 at MSRI Program Overview CMI Senior Scholars Program 17 Institute News Euclid and His Heritage Meeting 18 Appointments & Honors 20 CMI Publications Selected Articles by Research Fellows 27 Books & Videos 28 About CMI Profile of Bow Street Staff 30 CMI Activities 2006 Institute Calendar 32 2005 Euclid: www.claymath.org/euclid James Arthur Collected Works: www.claymath.org/cw/arthur Hanoi Institute of Mathematics: www.math.ac.vn Ramanujan Society: www.ramanujanmathsociety.org $.* $MBZ.BUIFNBUJDT*OTUJUVUF ".4 "NFSJDBO.BUIFNBUJDBM4PDJFUZ In addition to major,0O"VHVTU BUUIFTFDPOE*OUFSOBUJPOBM$POHSFTTPG.BUIFNBUJDJBOT ongoing activities such as JO1BSJT %BWJE)JMCFSUEFMJWFSFEIJTGBNPVTMFDUVSFJOXIJDIIFEFTDSJCFE the summer schools,UXFOUZUISFFQSPCMFNTUIBUXFSFUPQMBZBOJOnVFOUJBMSPMFJONBUIFNBUJDBM the Institute undertakes a 5IF.JMMFOOJVN1SJ[F1SPCMFNT SFTFBSDI"DFOUVSZMBUFS PO.BZ BUBNFFUJOHBUUIF$PMMÒHFEF number of smaller'SBODF UIF$MBZ.BUIFNBUJDT*OTUJUVUF $.* BOOPVODFEUIFDSFBUJPOPGB special projects
    [Show full text]
  • David Donoho COMMENTARY 52 Cliff Ord J
    ISSN 0002-9920 (print) ISSN 1088-9477 (online) of the American Mathematical Society January 2018 Volume 65, Number 1 JMM 2018 Lecture Sampler page 6 Taking Mathematics to Heart y e n r a page 19 h C th T Ru a Columbus Meeting l i t h i page 90 a W il lia m s r e lk a W ca G Eri u n n a r C a rl ss on l l a d n a R na Da J i ll C . P ip her s e v e N ré F And e d e r i c o A rd ila s n e k c i M . E d al Ron Notices of the American Mathematical Society January 2018 FEATURED 6684 19 26 29 JMM 2018 Lecture Taking Mathematics to Graduate Student Section Sampler Heart Interview with Sharon Arroyo Conducted by Melinda Lanius Talithia Williams, Gunnar Carlsson, Alfi o Quarteroni Jill C. Pipher, Federico Ardila, Ruth WHAT IS...an Acylindrical Group Action? Charney, Erica Walker, Dana Randall, by omas Koberda André Neves, and Ronald E. Mickens AMS Graduate Student Blog All of us, wherever we are, can celebrate together here in this issue of Notices the San Diego Joint Mathematics Meetings. Our lecture sampler includes for the first time the AMS-MAA-SIAM Hrabowski-Gates-Tapia-McBay Lecture, this year by Talithia Williams on the new PBS series NOVA Wonders. After the sampler, other articles describe modeling the heart, Dürer's unfolding problem (which remains open), gerrymandering after the fall Supreme Court decision, a story for Congress about how geometry has advanced MRI, “My Father André Weil” (2018 is the 20th anniversary of his death), and a profile on Donald Knuth and native script by former Notices Senior Writer and Deputy Editor Allyn Jackson.
    [Show full text]
  • IMS Council Election Results
    Volume 42 • Issue 5 IMS Bulletin August 2013 IMS Council election results The annual IMS Council election results are in! CONTENTS We are pleased to announce that the next IMS 1 Council election results President-Elect is Erwin Bolthausen. This year 12 candidates stood for six places 2 Members’ News: David on Council (including one two-year term to fill Donoho; Kathryn Roeder; Erwin Bolthausen Richard Davis the place vacated by Erwin Bolthausen). The new Maurice Priestley; C R Rao; IMS Council members are: Richard Davis, Rick Durrett, Steffen Donald Richards; Wenbo Li Lauritzen, Susan Murphy, Jane-Ling Wang and Ofer Zeitouni. 4 X-L Files: From t to T Ofer will serve the shorter term, until August 2015, and the others 5 Anirban’s Angle: will serve three-year terms, until August 2016. Thanks to Council Randomization, Bootstrap members Arnoldo Frigessi, Nancy Lopes Garcia, Steve Lalley, Ingrid and Sherlock Holmes Van Keilegom and Wing Wong, whose terms will finish at the IMS Rick Durrett Annual Meeting at JSM Montreal. 6 Obituaries: George Box; William Studden IMS members also voted to accept two amendments to the IMS Constitution and Bylaws. The first of these arose because this year 8 Donors to IMS Funds the IMS Nominating Committee has proposed for President-Elect a 10 Medallion Lecture Preview: current member of Council (Erwin Bolthausen). This brought up an Peter Guttorp; Wald Lecture interesting consideration regarding the IMS Bylaws, which are now Preview: Piet Groeneboom reworded to take this into account. Steffen Lauritzen 14 Viewpoint: Are professional The second amendment concerned Organizational Membership.
    [Show full text]
  • Awards at the International Congress of Mathematicians at Seoul (Korea) 2014
    Awards at the International Congress of Mathematicians at Seoul (Korea) 2014 News release from the International Mathematical Union (IMU) The Fields medals of the ICM 2014 were awarded formidable technical power, the ingenuity and tenac- to Arthur Avila, Manjual Bhagava, Martin Hairer, and ity of a master problem-solver, and an unerring sense Maryam Mirzakhani; and the Nevanlinna Prize was for deep and significant questions. awarded to Subhash Khot. The prize winners of the Avila’s achievements are many and span a broad Gauss Prize, Chern Prize, and Leelavati Prize were range of topics; here we focus on only a few high- Stanley Osher, Phillip Griffiths and Adrián Paenza, re- lights. One of his early significant results closes a spectively. The following article about the awardees chapter on a long story that started in the 1970s. At is the news released by the International Mathe- that time, physicists, most notably Mitchell Feigen- matical Union (IMU). We are grateful to Prof. Ingrid baum, began trying to understand how chaos can Daubechies, the President of IMU, who granted us the arise out of very simple systems. Some of the systems permission of reprinting it.—the Editors they looked at were based on iterating a mathemati- cal rule such as 3x(1 − x). Starting with a given point, one can watch the trajectory of the point under re- The Work of Artur Avila peated applications of the rule; one can think of the rule as moving the starting point around over time. Artur Avila has made For some maps, the trajectories eventually settle into outstanding contribu- stable orbits, while for other maps the trajectories be- tions to dynamical come chaotic.
    [Show full text]
  • Mathematical Challenges of the 21St Century: a Panorama of Mathematics
    comm-ucla.qxp 9/11/00 3:11 PM Page 1271 Mathematical Challenges of the 21st Century: A Panorama of Mathematics On August 7–12, 2000, the AMS held the meeting lenges was a diversity Mathematical Challenges of the 21st Century, the of views of mathemat- Society’s major event in celebration of World ics and of its connec- Mathematical Year 2000. The meeting took place tions with other areas. on the campus of the University of California, Los The Mathematical Angeles, and drew nearly 1,000 participants, who Association of America enjoyed the balmy coastal weather as well as held its annual summer the panorama of contemporary mathematics Mathfest on the UCLA Reception outside Royce Hall. provided in lectures by thirty internationally campus just prior to renowned mathematicians. the Mathematical Chal- This meeting was very different from other na- lenges meeting. On Sunday, August 6, a lecture by tional meetings organized by the AMS, with their master expositor Ronald L. Graham of the Uni- complicated schedules of lectures and sessions versity of California, San Diego, provided a bridge running in parallel. Mathematical Challenges was between the two meetings. Graham discussed a by comparison a streamlined affair, the main part number of unsolved problems that, like those of which comprised thirty plenary lectures, five per presented by Hilbert, have the intriguing combi- day over six days (there were also daily contributed nation of being simple to state while at the same paper sessions). Another difference was that the time being difficult to solve. Among the problems Mathematical Challenges speakers were encour- Graham talked about were Goldbach’s conjecture, aged to discuss the broad themes and major the “twin prime” conjecture, factoring of large outstanding problems in their areas rather than numbers, Hadwiger’s conjecture about chromatic their own research.
    [Show full text]
  • American Statistical Association Presents 2016 Awards at Joint Statistical Meetings
    And the Award Goes to … American Statistical Association Presents 2016 Awards at Joint Statistical Meetings ALEXANDRIA, VA (July 31, 2016) – The American Statistical Association (ASA), the nation’s preeminent professional statistical society, presented its annual awards program last evening at the Joint Statistical Meetings (JSM 2016) in Chicago. A list of honorees and their respective awards follow: Samuel S. Wilks Memorial Award The Wilks award honors the memory and distinguished career of Samuel S. Wilks by recognizing outstanding contributions to statistics that carry on in the spirit of his work. The 2016 Samuel S. Wilks Memorial Award honoree is David Donoho of Stanford University. Donoho is recognized for his contributions to statistics, mathematics, signal processing, information theory and reproducible research, including his innovations in statistical theory, multiscale analysis and compressed sensing, which have had wide influence across science and engineering. The Wilks Award is made possible in part by a donation from Alexander Mood, who was a student of Wilks. Gottfried E. Noether Awards The Noether awards were established to recognize distinguished researchers and teachers and to support the field of nonparametric statistics. The Noether Senior Scholar Award is presented to a distinguished senior researcher or teacher in nonparametric statistics. The 2016 Noether Senior Scholar Award honoree is Jane-Ling Wang of the University of California, Davis, who was recognized for her outstanding contributions to the theory, applications and teaching of nonparametric statistics. The Noether Young Scholar Award is presented to an accomplished young researcher to promote research and teaching in nonparametric statistics. The 2016 Noether Young Scholar Award honoree is Jing Lei of Carnegie Mellon University, who was recognized for his outstanding early-career contributions to nonparametric statistics.
    [Show full text]
  • Emissary | Spring 2021
    Spring 2021 EMISSARY M a t h e m a t i c a lSc i e n c e sRe s e a r c hIn s t i t u t e www.msri.org Mathematical Problems in Fluid Dynamics Mihaela Ifrim, Daniel Tataru, and Igor Kukavica The exploration of the mathematical foundations of fluid dynamics began early on in human history. The study of the behavior of fluids dates back to Archimedes, who discovered that any body immersed in a liquid receives a vertical upward thrust, which is equal to the weight of the displaced liquid. Later, Leonardo Da Vinci was fascinated by turbulence, another key feature of fluid flows. But the first advances in the analysis of fluids date from the beginning of the eighteenth century with the birth of differential calculus, which revolutionized the mathematical understanding of the movement of bodies, solids, and fluids. The discovery of the governing equations for the motion of fluids goes back to Euler in 1757; further progress in the nineteenth century was due to Navier and later Stokes, who explored the role of viscosity. In the middle of the twenti- eth century, Kolmogorov’s theory of tur- bulence was another turning point, as it set future directions in the exploration of fluids. More complex geophysical models incorporating temperature, salinity, and ro- tation appeared subsequently, and they play a role in weather prediction and climate modeling. Nowadays, the field of mathematical fluid dynamics is one of the key areas of partial differential equations and has been the fo- cus of extensive research over the years.
    [Show full text]
  • IMPRINTS (NUS)-Issue5/04
    ISSUE 5 Newsletter of Institute for Mathematical Sciences, NUS 2004 David Donoho: Sparse Data, Beautiful Mine >>> of scientific bodies, such as the Wald Lecture and the Bernoulli Lecture, and at the International Congress of Mathematicians. He has served on the committees of professional scientific bodies and on the editorial boards of leading journals on probability, statistics and mathematics. The Editor of Imprints interviewed him at the Department of Mathematics on 26 August 2004 when he was a guest of the Department of Mathematics and the Department of Statistics and Applied Probability from 11 August to 5 September 2004 and an invited speaker at the Institute's program on image and signal processing. The following is an enhanced and vetted account of excerpts of the interview. David Donoho It reveals little-known facets of his early scientific Excerpts of interview of David Donoho by Y.K. Leong apprenticeship in the primeval and almost unreal world of (Full interview at website: http://www.ims.nus.edu.sg/imprints/interview_donoho.htm) computer programming and data analysis of the seventies. He talks passionately about the trilogy of attraction and David Donoho is world-renowned for many important fascination with computing, statistics and mathematics and contributions to statistics and its applications to image and about the many statistical challenges and opportunities signal processing, in particular to the retrieval of essential arising from the exponential growth in data collected in all information from “sparse” data. He is reputed to be the branches of human knowledge. most highly-cited mathematician for work done in the last decade (1994–2004) — a reflection of the impact of his (Acknowledgement.
    [Show full text]
  • NEWSLETTER Issue: 479 - November 2018
    i “NLMS_479” — 2018/11/19 — 11:26 — page 1 — #1 i i i NEWSLETTER Issue: 479 - November 2018 THE STATISTICS OF HARDY–RAMANUJAN INTERNATIONAL VOLCANIC PARTITION CONGRESS OF SUPER-ERUPTIONS FORMULA MATHEMATICIANS i i i i i “NLMS_479” — 2018/11/19 — 11:26 — page 2 — #2 i i i EDITOR-IN-CHIEF COPYRIGHT NOTICE Iain Moatt (Royal Holloway, University of London) News items and notices in the Newsletter may [email protected] be freely used elsewhere unless otherwise stated, although attribution is requested when reproducing whole articles. Contributions to EDITORIAL BOARD the Newsletter are made under a non-exclusive June Barrow-Green (Open University) licence; please contact the author or photog- Tomasz Brzezinski (Swansea University) rapher for the rights to reproduce. The LMS Lucia Di Vizio (CNRS) cannot accept responsibility for the accuracy of Jonathan Fraser (University of St Andrews) information in the Newsletter. Views expressed Jelena Grbic´ (University of Southampton) do not necessarily represent the views or policy Thomas Hudson (University of Warwick) of the Editorial Team or London Mathematical Stephen Huggett (University of Plymouth) Society. Adam Johansen (University of Warwick) Bill Lionheart (University of Manchester) ISSN: 2516-3841 (Print) Mark McCartney (Ulster University) ISSN: 2516-385X (Online) Kitty Meeks (University of Glasgow) DOI: 10.1112/NLMS Vicky Neale (University of Oxford) Susan Oakes (London Mathematical Society) David Singerman (University of Southampton) Andrew Wade (Durham University) NEWSLETTER WEBSITE The Newsletter is freely available electronically Early Career Content Editor: Vicky Neale at lms.ac.uk/publications/lms-newsletter. News Editor: Susan Oakes Reviews Editor: Mark McCartney MEMBERSHIP CORRESPONDENTS AND STAFF Joining the LMS is a straightforward process.
    [Show full text]
  • 50 Years of Data Science’ Leaves Out” Wise
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS , VOL. , NO. ,– https://doi.org/./.. Years of Data Science David Donoho Department of Statistics, Stanford University, Standford, CA ABSTRACT ARTICLE HISTORY More than 50 years ago, John Tukey called for a reformation of academic statistics. In “The Future of Data Received August Analysis,” he pointed to the existence of an as-yet unrecognized science, whose subject of interest was Revised August learning from data, or “data analysis.” Ten to 20 years ago, John Chambers, JefWu, Bill Cleveland, and Leo Breiman independently once again urged academic statistics to expand its boundaries beyond the KEYWORDS classical domain of theoretical statistics; Chambers called for more emphasis on data preparation and Cross-study analysis; Data presentation rather than statistical modeling; and Breiman called for emphasis on prediction rather than analysis; Data science; Meta inference. Cleveland and Wu even suggested the catchy name “data science” for this envisioned feld. A analysis; Predictive modeling; Quantitative recent and growing phenomenon has been the emergence of “data science”programs at major universities, programming environments; including UC Berkeley, NYU, MIT, and most prominently, the University of Michigan, which in September Statistics 2015 announced a $100M “Data Science Initiative” that aims to hire 35 new faculty. Teaching in these new programs has signifcant overlap in curricular subject matter with traditional statistics courses; yet many aca- demic statisticians perceive the new programs as “cultural appropriation.”This article reviews some ingredi- ents of the current “data science moment,”including recent commentary about data science in the popular media, and about how/whether data science is really diferent from statistics.
    [Show full text]
  • Session at AAAS Annual Meeting
    SESSION on DESIGN THINKING TO MOBILIZE SCIENCE, TECHNOLOGY AND INNOVATION FOR SOCIAL CHALLENGES The aim of this symposium is to highlight the innova- tive approaches towards address social challenges. There has been growing interest in promoting “social innovation” to imbed innovation in the wider economy by fostering opportunities for new actors, such as non-profit foundations, to steer research and collaborate with firms and entrepreneurs and to tackle social challenges. User and consumers are also relevant as they play an important role in demanding innovation for social goals but also as actors and suppliers of solutions. Although the innovation process is now much more open and receptive to social influences, progress on social innovation will call for the greater involvement of stakeholders who can mobilize science, technol- ogy and innovation to address social challenges. Thus, the session requires to be approached from holistic and multidisciplinary mind and needs to cover the issue from different aspects by seven inter- national speakers. Contents Overview ■1 Program ■4 Session Description ■6 Outline of Presentation ■7 Lecture material ■33 Appendix ■55 Program Welcome and Opening 8:30-8:35 • Dr. Yuko Harayama, Deputy Director of the OECD’s Directorate for Science, Technology and Industry (DSTI) ● (moderator) (5min) Session 1: Putting ‘design thinking’ into practice 8:35-9:20 • Dr. Karabi Acharya, Change Leader, Ashoka, USA(15min) ● Systemic Change to Achieve Environmental Impact and Sustainability • Mr. Tateo Arimoto, Director-General, Research Institute of Science and Technology for Society, Japan Science and Technology Agency (JST / RISTEX) (Co-organiser and Host), Japan (15min) Design Thinking to Induce new paradigm for issue-driven approach −Discussant (5min) • Dr.
    [Show full text]