Continuous, Nowhere Differentiable Functions
Total Page:16
File Type:pdf, Size:1020Kb
Load more
										Recommended publications
									
								- 
												  Training Autoencoders by Alternating MinimizationUnder review as a conference paper at ICLR 2018 TRAINING AUTOENCODERS BY ALTERNATING MINI- MIZATION Anonymous authors Paper under double-blind review ABSTRACT We present DANTE, a novel method for training neural networks, in particular autoencoders, using the alternating minimization principle. DANTE provides a distinct perspective in lieu of traditional gradient-based backpropagation techniques commonly used to train deep networks. It utilizes an adaptation of quasi-convex optimization techniques to cast autoencoder training as a bi-quasi-convex optimiza- tion problem. We show that for autoencoder configurations with both differentiable (e.g. sigmoid) and non-differentiable (e.g. ReLU) activation functions, we can perform the alternations very effectively. DANTE effortlessly extends to networks with multiple hidden layers and varying network configurations. In experiments on standard datasets, autoencoders trained using the proposed method were found to be very promising and competitive to traditional backpropagation techniques, both in terms of quality of solution, as well as training speed. 1 INTRODUCTION For much of the recent march of deep learning, gradient-based backpropagation methods, e.g. Stochastic Gradient Descent (SGD) and its variants, have been the mainstay of practitioners. The use of these methods, especially on vast amounts of data, has led to unprecedented progress in several areas of artificial intelligence. On one hand, the intense focus on these techniques has led to an intimate understanding of hardware requirements and code optimizations needed to execute these routines on large datasets in a scalable manner. Today, myriad off-the-shelf and highly optimized packages exist that can churn reasonably large datasets on GPU architectures with relatively mild human involvement and little bootstrap effort.
- 
												  Karl Weierstraß – Zum 200. Geburtstag „Alles Im Leben Kommt Doch Leider Zu Spät“ Reinhard Bölling Universität Potsdam, Institut Für Mathematik Prolog Nunmehr Im 741 Karl Weierstraß – zum 200. Geburtstag „Alles im Leben kommt doch leider zu spät“ Reinhard Bölling Universität Potsdam, Institut für Mathematik Prolog Nunmehr im 74. Lebensjahr stehend, scheint es sehr wahrscheinlich, dass dies mein einziger und letzter Beitrag über Karl Weierstraß für die Mediathek meiner ehemaligen Potsdamer Arbeitsstätte sein dürfte. Deshalb erlaube ich mir, einige persönliche Bemerkungen voranzustellen. Am 9. November 1989 ging die Nachricht von der Öffnung der Berliner Mauer um die Welt. Am Tag darauf schrieb mir mein Freund in Stockholm: „Herzlich willkommen!“ Ich besorgte das damals noch erforderliche Visum in der Botschaft Schwedens und fuhr im Januar 1990 nach Stockholm. Endlich konnte ich das Mittag- Leffler-Institut in Djursholm, im nördlichen Randgebiet Stockholms gelegen, besuchen. Dort befinden sich umfangreiche Teile des Nachlasses von Weierstraß und Kowalewskaja, die von Mittag-Leffler zusammengetragen worden waren. Ich hatte meine Arbeit am Briefwechsel zwischen Weierstraß und Kowalewskaja, die meine erste mathematikhistorische Publikation werden sollte, vom Inhalt her abgeschlossen. Das Manuskript lag in nahezu satzfertiger Form vor und sollte dem Verlag übergeben werden. Geradezu selbstverständlich wäre es für diese Arbeit gewesen, die Archivalien im Mittag-Leffler-Institut zu studieren. Aber auch als Mitarbeiter des Karl-Weierstraß- Institutes für Mathematik in Ostberlin gehörte ich nicht zu denen, die man ins westliche Ausland reisen ließ. – Nun konnte ich mir also endlich einen ersten Überblick über die Archivalien im Mittag-Leffler-Institut verschaffen. Ich studierte in jenen Tagen ohne Unterbrechung von morgens bis abends Schriftstücke, Dokumente usw. aus dem dortigen Archiv, denn mir stand nur eine Woche zur Verfügung. Am zweiten Tag in Djursholm entdeckte ich unter Papieren ganz anderen Inhalts einige lose Blätter, die Kowalewskaja beschrieben hatte.
- 
												  HalloweierstrassHappy Hallo WEIERSTRASS Did you know that Weierestrass was born on Halloween? Neither did we… Dmitriy Bilyk will be speaking on Lacunary Fourier series: from Weierstrass to our days Monday, Oct 31 at 12:15pm in Vin 313 followed by Mesa Pizza in the first floor lounge Brought to you by the UMN AMS Student Chapter and born in Ostenfelde, Westphalia, Prussia. sent to University of Bonn to prepare for a government position { dropped out. studied mathematics at the M¨unsterAcademy. University of K¨onigsberg gave him an honorary doctor's degree March 31, 1854. 1856 a chair at Gewerbeinstitut (now TU Berlin) professor at Friedrich-Wilhelms-Universit¨atBerlin (now Humboldt Universit¨at) died in Berlin of pneumonia often cited as the father of modern analysis Karl Theodor Wilhelm Weierstrass 31 October 1815 { 19 February 1897 born in Ostenfelde, Westphalia, Prussia. sent to University of Bonn to prepare for a government position { dropped out. studied mathematics at the M¨unsterAcademy. University of K¨onigsberg gave him an honorary doctor's degree March 31, 1854. 1856 a chair at Gewerbeinstitut (now TU Berlin) professor at Friedrich-Wilhelms-Universit¨atBerlin (now Humboldt Universit¨at) died in Berlin of pneumonia often cited as the father of modern analysis Karl Theodor Wilhelm Weierstraß 31 October 1815 { 19 February 1897 born in Ostenfelde, Westphalia, Prussia. sent to University of Bonn to prepare for a government position { dropped out. studied mathematics at the M¨unsterAcademy. University of K¨onigsberg gave him an honorary doctor's degree March 31, 1854. 1856 a chair at Gewerbeinstitut (now TU Berlin) professor at Friedrich-Wilhelms-Universit¨atBerlin (now Humboldt Universit¨at) died in Berlin of pneumonia often cited as the father of modern analysis Karl Theodor Wilhelm Weierstraß 31 October 1815 { 19 February 1897 sent to University of Bonn to prepare for a government position { dropped out.
- 
												  On Fractal Properties of Weierstrass-Type FunctionsProceedings of the International Geometry Center Vol. 12, no. 2 (2019) pp. 43–61 On fractal properties of Weierstrass-type functions Claire David Abstract. In the sequel, starting from the classical Weierstrass function + 8 n n defined, for any real number x, by (x) = λ cos (2 π Nb x), where λ W n=0 ÿ and Nb are two real numbers such that 0 λ 1, Nb N and λ Nb 1, we highlight intrinsic properties of curious mapsă ă which happenP to constituteą a new class of iterated function system. Those properties are all the more interesting, in so far as they can be directly linked to the computation of the box dimension of the curve, and to the proof of the non-differentiabilty of Weierstrass type functions. Анотація. Метою даної роботи є узагальнення попередніх результатів автора про класичну функцію Вейерштрасса та її графік. Його можна отримати як границю послідовності префракталів, тобто графів, отри- маних за допомогою ітераційної системи функцій, які, як правило, не є стискаючими відображеннями. Натомість вони мають в деякому сенсі еквівалентну властивість до стискаючих відображень, оскільки на ко- жному етапі ітераційного процесу, який дає змогу отримати префра- ктали, вони зменшують двовимірні міри Лебега заданої послідовності прямокутників, що покривають криву. Такі системи функцій відіграють певну роль на першому кроці процесу побудови підкови Смейла. Вони можуть бути використані для доведення недиференційованості функції Вейєрштрасса та обчислення box-розмірності її графіка, а також для по- будови більш широких класів неперервних, але ніде не диференційовних функцій. Останнє питання ми вивчатимемо в подальших роботах. 2010 Mathematics Subject Classification: 37F20, 28A80, 05C63 Keywords: Weierstrass function; non-differentiability; iterative function systems DOI: http://dx.doi.org/10.15673/tmgc.v12i2.1485 43 44 Cl.
- 
												  Continuous Nowhere Differentiable Functions2003:320 CIV MASTER’S THESIS Continuous Nowhere Differentiable Functions JOHAN THIM MASTER OF SCIENCE PROGRAMME Department of Mathematics 2003:320 CIV • ISSN: 1402 - 1617 • ISRN: LTU - EX - - 03/320 - - SE Continuous Nowhere Differentiable Functions Johan Thim December 2003 Master Thesis Supervisor: Lech Maligranda Department of Mathematics Abstract In the early nineteenth century, most mathematicians believed that a contin- uous function has derivative at a significant set of points. A. M. Amp`ereeven tried to give a theoretical justification for this (within the limitations of the definitions of his time) in his paper from 1806. In a presentation before the Berlin Academy on July 18, 1872 Karl Weierstrass shocked the mathematical community by proving this conjecture to be false. He presented a function which was continuous everywhere but differentiable nowhere. The function in question was defined by ∞ X W (x) = ak cos(bkπx), k=0 where a is a real number with 0 < a < 1, b is an odd integer and ab > 1+3π/2. This example was first published by du Bois-Reymond in 1875. Weierstrass also mentioned Riemann, who apparently had used a similar construction (which was unpublished) in his own lectures as early as 1861. However, neither Weierstrass’ nor Riemann’s function was the first such construction. The earliest known example is due to Czech mathematician Bernard Bolzano, who in the years around 1830 (published in 1922 after being discovered a few years earlier) exhibited a continuous function which was nowhere differen- tiable. Around 1860, the Swiss mathematician Charles Cell´erieralso discov- ered (independently) an example which unfortunately wasn’t published until 1890 (posthumously).
- 
												  Deep Learning Based Computer Generated Face Identification Usingapplied sciences Article Deep Learning Based Computer Generated Face Identification Using Convolutional Neural Network L. Minh Dang 1, Syed Ibrahim Hassan 1, Suhyeon Im 1, Jaecheol Lee 2, Sujin Lee 1 and Hyeonjoon Moon 1,* 1 Department of Computer Science and Engineering, Sejong University, Seoul 143-747, Korea; [email protected] (L.M.D.); [email protected] (S.I.H.); [email protected] (S.I.); [email protected] (S.L.) 2 Department of Information Communication Engineering, Sungkyul University, Seoul 143-747, Korea; [email protected] * Correspondence: [email protected] Received: 30 October 2018; Accepted: 10 December 2018; Published: 13 December 2018 Abstract: Generative adversarial networks (GANs) describe an emerging generative model which has made impressive progress in the last few years in generating photorealistic facial images. As the result, it has become more and more difficult to differentiate between computer-generated and real face images, even with the human’s eyes. If the generated images are used with the intent to mislead and deceive readers, it would probably cause severe ethical, moral, and legal issues. Moreover, it is challenging to collect a dataset for computer-generated face identification that is large enough for research purposes because the number of realistic computer-generated images is still limited and scattered on the internet. Thus, a development of a novel decision support system for analyzing and detecting computer-generated face images generated by the GAN network is crucial. In this paper, we propose a customized convolutional neural network, namely CGFace, which is specifically designed for the computer-generated face detection task by customizing the number of convolutional layers, so it performs well in detecting computer-generated face images.
- 
												  A Century of Mathematics in America, Peter Duren Et Ai., (Eds.), VolGarrett Birkhoff has had a lifelong connection with Harvard mathematics. He was an infant when his father, the famous mathematician G. D. Birkhoff, joined the Harvard faculty. He has had a long academic career at Harvard: A.B. in 1932, Society of Fellows in 1933-1936, and a faculty appointmentfrom 1936 until his retirement in 1981. His research has ranged widely through alge bra, lattice theory, hydrodynamics, differential equations, scientific computing, and history of mathematics. Among his many publications are books on lattice theory and hydrodynamics, and the pioneering textbook A Survey of Modern Algebra, written jointly with S. Mac Lane. He has served as president ofSIAM and is a member of the National Academy of Sciences. Mathematics at Harvard, 1836-1944 GARRETT BIRKHOFF O. OUTLINE As my contribution to the history of mathematics in America, I decided to write a connected account of mathematical activity at Harvard from 1836 (Harvard's bicentennial) to the present day. During that time, many mathe maticians at Harvard have tried to respond constructively to the challenges and opportunities confronting them in a rapidly changing world. This essay reviews what might be called the indigenous period, lasting through World War II, during which most members of the Harvard mathe matical faculty had also studied there. Indeed, as will be explained in §§ 1-3 below, mathematical activity at Harvard was dominated by Benjamin Peirce and his students in the first half of this period. Then, from 1890 until around 1920, while our country was becoming a great power economically, basic mathematical research of high quality, mostly in traditional areas of analysis and theoretical celestial mechanics, was carried on by several faculty members.
- 
												  Matrix CalculusAppendix D Matrix Calculus From too much study, and from extreme passion, cometh madnesse. Isaac Newton [205, §5] − D.1 Gradient, Directional derivative, Taylor series D.1.1 Gradients Gradient of a differentiable real function f(x) : RK R with respect to its vector argument is defined uniquely in terms of partial derivatives→ ∂f(x) ∂x1 ∂f(x) , ∂x2 RK f(x) . (2053) ∇ . ∈ . ∂f(x) ∂xK while the second-order gradient of the twice differentiable real function with respect to its vector argument is traditionally called the Hessian; 2 2 2 ∂ f(x) ∂ f(x) ∂ f(x) 2 ∂x1 ∂x1∂x2 ··· ∂x1∂xK 2 2 2 ∂ f(x) ∂ f(x) ∂ f(x) 2 2 K f(x) , ∂x2∂x1 ∂x2 ··· ∂x2∂xK S (2054) ∇ . ∈ . .. 2 2 2 ∂ f(x) ∂ f(x) ∂ f(x) 2 ∂xK ∂x1 ∂xK ∂x2 ∂x ··· K interpreted ∂f(x) ∂f(x) 2 ∂ ∂ 2 ∂ f(x) ∂x1 ∂x2 ∂ f(x) = = = (2055) ∂x1∂x2 ³∂x2 ´ ³∂x1 ´ ∂x2∂x1 Dattorro, Convex Optimization Euclidean Distance Geometry, Mεβoo, 2005, v2020.02.29. 599 600 APPENDIX D. MATRIX CALCULUS The gradient of vector-valued function v(x) : R RN on real domain is a row vector → v(x) , ∂v1(x) ∂v2(x) ∂vN (x) RN (2056) ∇ ∂x ∂x ··· ∂x ∈ h i while the second-order gradient is 2 2 2 2 , ∂ v1(x) ∂ v2(x) ∂ vN (x) RN v(x) 2 2 2 (2057) ∇ ∂x ∂x ··· ∂x ∈ h i Gradient of vector-valued function h(x) : RK RN on vector domain is → ∂h1(x) ∂h2(x) ∂hN (x) ∂x1 ∂x1 ··· ∂x1 ∂h1(x) ∂h2(x) ∂hN (x) h(x) , ∂x2 ∂x2 ··· ∂x2 ∇ .
- 
												  Pathological” Functions and Their ClassesProperties of “Pathological” Functions and their Classes Syed Sameed Ahmed (179044322) University of Leicester 27th April 2020 Abstract This paper is meant to serve as an exposition on the theorem proved by Richard Aron, V.I. Gurariy and J.B. Seoane in their 2004 paper [2], which states that ‘The Set of Dif- ferentiable but Nowhere Monotone Functions (DNM) is Lineable.’ I begin by showing how certain properties that our intuition generally associates with continuous or differen- tiable functions can be wrong. This is followed by examples of functions with surprisingly unintuitive properties that have appeared in mathematical literature in relatively recent times. After a brief discussion of some preliminary concepts and tools needed for the rest of the paper, a construction of a ‘Continuous but Nowhere Monotone (CNM)’ function is provided which serves as a first concrete introduction to the so called “pathological functions” in this paper. The existence of such functions leaves one wondering about the ‘Differentiable but Nowhere Monotone (DNM)’ functions. Therefore, the next section of- fers a description of how the 2004 paper by the three authors shows that such functions are very abundant and not mere fancy counter-examples that can be called “pathological.” 1Introduction A wide variety of assumptions about the nature of mathematical objects inform our intuition when we perform calculations. These assumptions have their utility because they help us perform calculations swiftly without being over skeptical or neurotic. The problem, however, is that it is not entirely obvious as to which assumptions are legitimate and can be rigorously justified and which ones are not.
- 
												  Differentiability in Several Variables: Summary of Basic ConceptsDIFFERENTIABILITY IN SEVERAL VARIABLES: SUMMARY OF BASIC CONCEPTS 3 3 1. Partial derivatives If f : R ! R is an arbitrary function and a = (x0; y0; z0) 2 R , then @f f(a + tj) ¡ f(a) f(x ; y + t; z ) ¡ f(x ; y ; z ) (1) (a) := lim = lim 0 0 0 0 0 0 @y t!0 t t!0 t etc.. provided the limit exists. 2 @f f(t;0)¡f(0;0) Example 1. for f : R ! R, @x (0; 0) = limt!0 t . Example 2. Let f : R2 ! R given by ( 2 x y ; (x; y) 6= (0; 0) f(x; y) = x2+y2 0; (x; y) = (0; 0) Then: @f f(t;0)¡f(0;0) 0 ² @x (0; 0) = limt!0 t = limt!0 t = 0 @f f(0;t)¡f(0;0) ² @y (0; 0) = limt!0 t = 0 Note: away from (0; 0), where f is the quotient of differentiable functions (with non-zero denominator) one can apply the usual rules of derivation: @f 2xy(x2 + y2) ¡ 2x3y (x; y) = ; for (x; y) 6= (0; 0) @x (x2 + y2)2 2 2 2. Directional derivatives. If f : R ! R is a map, a = (x0; y0) 2 R and v = ®i + ¯j is a vector in R2, then by definition f(a + tv) ¡ f(a) f(x0 + t®; y0 + t¯) ¡ f(x0; y0) (2) @vf(a) := lim = lim t!0 t t!0 t Example 3. Let f the function from the Example 2 above. Then for v = ®i + ¯j a unit vector (i.e.
- 
												  Lipschitz Recurrent Neural NetworksPublished as a conference paper at ICLR 2021 LIPSCHITZ RECURRENT NEURAL NETWORKS N. Benjamin Erichson Omri Azencot Alejandro Queiruga ICSI and UC Berkeley Ben-Gurion University Google Research [email protected] [email protected] [email protected] Liam Hodgkinson Michael W. Mahoney ICSI and UC Berkeley ICSI and UC Berkeley [email protected] [email protected] ABSTRACT Viewing recurrent neural networks (RNNs) as continuous-time dynamical sys- tems, we propose a recurrent unit that describes the hidden state’s evolution with two parts: a well-understood linear component plus a Lipschitz nonlinearity. This particular functional form facilitates stability analysis of the long-term behavior of the recurrent unit using tools from nonlinear systems theory. In turn, this en- ables architectural design decisions before experimentation. Sufficient conditions for global stability of the recurrent unit are obtained, motivating a novel scheme for constructing hidden-to-hidden matrices. Our experiments demonstrate that the Lipschitz RNN can outperform existing recurrent units on a range of bench- mark tasks, including computer vision, language modeling and speech prediction tasks. Finally, through Hessian-based analysis we demonstrate that our Lipschitz recurrent unit is more robust with respect to input and parameter perturbations as compared to other continuous-time RNNs. 1 INTRODUCTION Many interesting problems exhibit temporal structures that can be modeled with recurrent neural networks (RNNs), including problems in robotics, system identification, natural language process- ing, and machine learning control. In contrast to feed-forward neural networks, RNNs consist of one or more recurrent units that are designed to have dynamical (recurrent) properties, thereby enabling them to acquire some form of internal memory.
- 
												  Fundamental Theorems in MathematicsSOME FUNDAMENTAL THEOREMS IN MATHEMATICS OLIVER KNILL Abstract. An expository hitchhikers guide to some theorems in mathematics. Criteria for the current list of 243 theorems are whether the result can be formulated elegantly, whether it is beautiful or useful and whether it could serve as a guide [6] without leading to panic. The order is not a ranking but ordered along a time-line when things were writ- ten down. Since [556] stated “a mathematical theorem only becomes beautiful if presented as a crown jewel within a context" we try sometimes to give some context. Of course, any such list of theorems is a matter of personal preferences, taste and limitations. The num- ber of theorems is arbitrary, the initial obvious goal was 42 but that number got eventually surpassed as it is hard to stop, once started. As a compensation, there are 42 “tweetable" theorems with included proofs. More comments on the choice of the theorems is included in an epilogue. For literature on general mathematics, see [193, 189, 29, 235, 254, 619, 412, 138], for history [217, 625, 376, 73, 46, 208, 379, 365, 690, 113, 618, 79, 259, 341], for popular, beautiful or elegant things [12, 529, 201, 182, 17, 672, 673, 44, 204, 190, 245, 446, 616, 303, 201, 2, 127, 146, 128, 502, 261, 172]. For comprehensive overviews in large parts of math- ematics, [74, 165, 166, 51, 593] or predictions on developments [47]. For reflections about mathematics in general [145, 455, 45, 306, 439, 99, 561]. Encyclopedic source examples are [188, 705, 670, 102, 192, 152, 221, 191, 111, 635].