All-Optical Machine Learning Using Diffractive Deep Neural Networks
RESEARCH | OPTICAL COMPUTING

Xing Lin1,2,3*, Yair Rivenson1,2,3*, Nezih T. Yardimci1,3, Muhammed Veli1,2,3, Yi Luo1,2,3, Mona Jarrahi1,3, Aydogan Ozcan1,2,3,4†

1Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095, USA. 2Department of Bioengineering, University of California, Los Angeles, CA 90095, USA. 3California NanoSystems Institute (CNSI), University of California, Los Angeles, CA 90095, USA. 4Department of Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA.
*These authors contributed equally to this work. †Corresponding author. Email: [email protected]

Lin et al., Science 361, 1004–1008 (2018). 7 September 2018.

Deep learning has been transforming our ability to execute advanced inference tasks using computers. Here we introduce a physical mechanism to perform machine learning by demonstrating an all-optical diffractive deep neural network (D2NN) architecture that can implement various functions following the deep learning–based design of passive diffractive layers that work collectively. We created 3D-printed D2NNs that implement classification of images of handwritten digits and fashion products, as well as the function of an imaging lens at a terahertz spectrum. Our all-optical deep learning framework can perform, at the speed of light, various complex functions that computer-based neural networks can execute; will find applications in all-optical image analysis, feature detection, and object classification; and will also enable new camera designs and optical components that perform distinctive tasks using D2NNs.
Deep learning is one of the fastest-growing machine learning methods (1). This approach uses multilayered artificial neural networks implemented in a computer to digitally learn data representation and abstraction and to perform advanced tasks in a manner comparable or even superior to the performance of human experts. Recent examples in which deep learning has made major advances in machine learning include medical image analysis (2), speech recognition (3), language translation (4), and image classification (5), among others (1, 6). Beyond some of these mainstream applications, deep learning methods are also being used to solve inverse imaging problems (7–13).

Here we introduce an all-optical deep learning framework in which the neural network is physically formed by multiple layers of diffractive surfaces that work in collaboration to optically perform an arbitrary function that the network can statistically learn. Whereas the inference and prediction mechanism of the physical network is all optical, the learning part that leads to its design is done through a computer. We term this framework a diffractive deep neural network (D2NN) and demonstrate its inference capabilities through both simulations and experiments.

Our D2NN can be physically created by using several transmissive and/or reflective layers (14), where each point on a given layer either transmits or reflects the incoming wave, representing an artificial neuron that is connected to other neurons of the following layers through optical diffraction (Fig. 1A). In accordance with the Huygens-Fresnel principle, our terminology is based on each point on a given layer acting as a secondary source of a wave, the amplitude and phase of which are determined by the product of the input wave and the complex-valued transmission or reflection coefficient at that point [see (14) for an analysis of the waves within a D2NN]. Therefore, an artificial neuron in a D2NN is connected to other neurons of the following layer through a secondary wave modulated in amplitude and phase by both the input interference pattern created by the earlier layers and the local transmission or reflection coefficient at that point. As an analogy to standard deep neural networks (Fig. 1D), one can consider the transmission or reflection coefficient of each point or neuron as a multiplicative "bias" term, which is a learnable network parameter that is iteratively adjusted during the training process of the diffractive network, using an error back-propagation method. After this numerical training phase, the D2NN design is fixed and the transmission or reflection coefficients of the neurons of all layers are determined. This D2NN design, once physically fabricated using techniques such as 3D printing or lithography, can then perform, at the speed of light, the specific task for which it is trained, using only optical diffraction and passive optical components or layers that do not need power, thereby creating an efficient and fast way of implementing machine learning tasks.
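This neuron model can be sketched numerically: each layer multiplies the incident field by a complex transmission coefficient, and the modulated field then diffracts to the next layer as a superposition of secondary waves. The angular spectrum method below is one standard discretization of this Huygens-Fresnel picture; the grid size, neuron pitch, wavelength, and layer spacing are illustrative assumptions, not the paper's reported parameters [see (14) for the authors' own wave analysis].

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    """Propagate a complex field a distance z using the angular spectrum
    method, a standard numerical form of the Huygens-Fresnel diffraction
    picture (each point acts as a secondary wave source)."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)                # spatial frequencies (1/m)
    FX, FY = np.meshgrid(fx, fx)
    # Propagating components satisfy fx^2 + fy^2 < 1/wavelength^2;
    # evanescent components are suppressed.
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * z) * (arg > 0)         # free-space transfer function
    return np.fft.ifft2(np.fft.fft2(field) * H)

def dnn_layer(field, phase, wavelength, dx, z):
    """One diffractive layer: each neuron multiplies the incident wave by a
    transmission coefficient t = exp(i*phase), and the modulated field then
    diffracts to the following layer."""
    return angular_spectrum_propagate(field * np.exp(1j * phase),
                                      wavelength, dx, z)
```

A flat (zero-phase) layer illuminated by a plane wave should pass the wave through with unit amplitude, which makes a quick sanity check of the propagator.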
In general, the phase and amplitude of each neuron can be learnable parameters, providing a complex-valued modulation at each layer, which improves the inference performance of the diffractive network (fig. S1) (14). For coherent transmissive networks with phase-only modulation, each layer can be approximated as a thin optical element (Fig. 1). Through deep learning, the phase values of the neurons of each layer of the diffractive network are iteratively adjusted (trained) to perform a specific function by feeding training data at the input layer and then computing the network's output through optical diffraction. On the basis of the calculated error with respect to the target output, determined by the desired function, the network structure and its neuron phase values are optimized via an error back-propagation algorithm, which is based on the stochastic gradient descent approach used in conventional deep learning (14).
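The training loop described above can be sketched for a single phase-only layer: the loss rewards focusing a plane-wave input onto a target detector region, and the neuron phase values are updated by gradient descent. The paper trains with analytic error back-propagation; this illustrative sketch substitutes finite-difference gradients and a tiny 16 x 16 grid purely to stay short and self-contained, and all optical parameters are assumed values.

```python
import numpy as np

N, WL, DX, Z = 16, 0.75e-3, 0.4e-3, 3e-3  # grid, wavelength, pitch, spacing (illustrative)

def propagate(field, z):
    """Angular spectrum free-space propagation over distance z."""
    fx = np.fft.fftfreq(N, d=DX)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / WL**2 - FX**2 - FY**2
    H = np.exp(2j * np.pi * np.sqrt(np.maximum(arg, 0.0)) * z) * (arg > 0)
    return np.fft.ifft2(np.fft.fft2(field) * H)

inp = np.ones((N, N), complex)                         # plane-wave input
det = np.zeros((N, N), bool); det[6:10, 6:10] = True   # target detector region

def loss(phase):
    """Negative fraction of output energy landing on the detector region."""
    inten = np.abs(propagate(inp * np.exp(1j * phase), Z)) ** 2
    return -inten[det].sum() / inten.sum()

rng = np.random.default_rng(0)
phase = rng.uniform(0, 2 * np.pi, (N, N))
history = [loss(phase)]
lr, eps = 20.0, 1e-4
for _ in range(5):
    # Finite-difference gradient (stand-in for error back-propagation).
    grad = np.zeros_like(phase)
    for i in range(N):
        for j in range(N):
            p = phase.copy(); p[i, j] += eps
            grad[i, j] = (loss(p) - history[-1]) / eps
    step = lr
    while step > 1e-6:                  # backtrack if the step overshoots
        cand = phase - step * grad
        l = loss(cand)
        if l < history[-1]:
            phase = cand
            history.append(l)
            break
        step *= 0.5
```

After a few iterations the fractional detector energy (the negative of the loss) should not decrease, mirroring how the phase masks are optimized against the target output.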
To demonstrate the performance of the D2NN framework, we first trained it as a digit classifier to perform automated classification of handwritten digits, from 0 to 9 (Figs. 1B and 2A). For this task, phase-only transmission masks were designed by training a five-layer D2NN with 55,000 images (5,000 validation images) from the MNIST (Modified National Institute of Standards and Technology) handwritten digit database (15). Input digits were encoded into the amplitude of the input field to the D2NN, and the diffractive network was trained to map input digits into 10 detector regions, one for each digit. The classification criterion was to find the detector with the maximum optical signal, and this was also used as a loss function during the network training (14).
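The detector-with-maximum-signal criterion can be written down directly: sum the optical intensity over each of the 10 detector regions and take the argmax. The 2 x 5 tiling of the output plane below is a hypothetical layout used only for illustration; the paper's actual detector geometry is the one shown in Figs. 1B and 3A.

```python
import numpy as np

def detector_signals(intensity, regions):
    """Sum the optical intensity falling on each detector region.
    `regions` is a list of boolean masks, one per class."""
    return np.array([intensity[m].sum() for m in regions])

def classify(intensity, regions):
    """Classification criterion: the predicted digit is the detector
    region receiving the maximum optical signal."""
    return int(np.argmax(detector_signals(intensity, regions)))

# Hypothetical layout: 10 detector regions tiled 2 x 5 on a 20 x 50 output plane.
regions = []
for r in range(2):
    for c in range(5):
        m = np.zeros((20, 50), bool)
        m[r * 10:(r + 1) * 10, c * 10:(c + 1) * 10] = True
        regions.append(m)
```

Because the same detector signals drive the training loss, a well-trained network concentrates each digit's energy onto its target region, and this readout then needs no further computation.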
After training, the design of the D2NN digit classifier was numerically tested using 10,000 images from the MNIST test dataset (which were not used as part of the training or validation image sets) and achieved a classification accuracy of 91.75% (Fig. 3C and fig. S1). In addition to the classification performance of the diffractive network, we also analyzed the energy distribution observed at the network output plane for the same 10,000 test digits (Fig. 3C), the results of which clearly demonstrate that the diffractive network learned to focus the input energy of each handwritten digit into the correct (i.e., the target) detector region, in accord with its training. With the use of complex-valued modulation and increasing numbers of layers, neurons, and connections in the diffractive network, our classification accuracy can be further improved (figs. S1 and S2). For example, fig. S2 demonstrates a Lego-like physical transfer learning behavior for the D2NN framework, where the inference performance of an already existing D2NN can be further improved by adding new diffractive layers or, in some cases, by peeling off (i.e., discarding) some of the existing layers; the new layers to be added are trained for improved inference coming from the entire diffractive network (old and new layers). By using a patch of two layers added to an existing and fixed D2NN design (N = 5 layers), we improved our MNIST classification accuracy to 93.39% (fig. S2) (14); the state-of-the-art convolutional neural network performance has been reported as 99.60 to 99.77% (16–18). More discussion on reconfiguring D2NN designs is provided in the supplementary materials (14).
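A minimal sketch of this Lego-like patching, under the same illustrative assumptions as before: the phases of an existing five-layer design are frozen, and only the two appended patch layers are optimized (again with finite-difference gradients standing in for the paper's analytic error back-propagation, and with assumed, not reported, optical parameters).

```python
import numpy as np

N, WL, DX, Z = 16, 0.75e-3, 0.4e-3, 3e-3  # illustrative grid/optics parameters

def propagate(field, z):
    """Angular spectrum free-space propagation over distance z."""
    fx = np.fft.fftfreq(N, d=DX)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / WL**2 - FX**2 - FY**2
    H = np.exp(2j * np.pi * np.sqrt(np.maximum(arg, 0.0)) * z) * (arg > 0)
    return np.fft.ifft2(np.fft.fft2(field) * H)

def forward(layers, field):
    """Cascade of phase-only diffractive layers."""
    for phase in layers:
        field = propagate(field * np.exp(1j * phase), Z)
    return field

rng = np.random.default_rng(1)
fixed = [rng.uniform(0, 2 * np.pi, (N, N)) for _ in range(5)]  # existing, frozen design
patch = [np.zeros((N, N)) for _ in range(2)]                   # new trainable layers

inp = np.ones((N, N), complex)
det = np.zeros((N, N), bool); det[6:10, 6:10] = True

def loss(patch):
    inten = np.abs(forward(fixed + patch, inp)) ** 2
    return -inten[det].sum() / inten.sum()

# Train only the appended patch; the N = 5 base design stays fixed.
lr, eps = 20.0, 1e-4
before = loss(patch)
cur = before
for _ in range(3):
    grads = [np.zeros((N, N)) for _ in patch]
    for k in range(len(patch)):
        for i in range(N):
            for j in range(N):
                p = [q.copy() for q in patch]
                p[k][i, j] += eps
                grads[k][i, j] = (loss(p) - cur) / eps
    step = lr
    while step > 1e-6:                  # backtrack if the step overshoots
        cand = [q - step * g for q, g in zip(patch, grads)]
        l = loss(cand)
        if l < cur:
            patch, cur = cand, l
            break
        step *= 0.5
after = cur
```

The inference loss of the combined stack can only improve (or stay unchanged) because the frozen base design is a feasible starting point and gradients flow solely into the new layers.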
Following these numerical results, we 3D-printed our five-layer D2NN design (Fig. 2A), with each layer having an area of 8 cm by 8 cm, followed by 10 detector regions defined at the output plane of the diffractive network (Figs. 1B and 3A). We then used continuous-wave illumination at 0.4 THz to test the network's inference performance (Fig. 2, C and D). Phase values of each layer's neurons were physically encoded using the relative thickness of each 3D-printed neuron. Numerical testing of this five-layer D2NN design achieved a classification accuracy of 91.75% over ~10,000 test images (Fig. 3C). To quantify the match between these numerical testing results and our experiments, we 3D-printed 50 handwritten digits. The reduced performance of the experimental network compared to our numerical testing is especially pronounced for the digit 0 because it is challenging to 3D-print the large void region at the center of the digit. Similar printing challenges were also observed for other digits that have void regions, e.g., 6, 8, and 9 (Fig. 3B).

Phase-only and complex-valued modulation D2NNs with N = 5 diffractive layers (sharing the same physical network dimensions as the digit classification D2NN shown in Fig. 2A) reached an accuracy of 81.13 and 86.33%, respectively (fig. S4). By increasing the number of diffractive layers to N = 10 and the total number of neurons
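The physical encoding of a trained phase value as printed neuron thickness follows the standard thin-element relation phi = 2*pi*(n - 1)*t/lambda. In the sketch below, the polymer refractive index is an assumed placeholder, not the paper's measured value; only the 0.4 THz illumination frequency comes from the text.

```python
import numpy as np

C = 299_792_458.0        # speed of light (m/s)
FREQ = 0.4e12            # 0.4 THz continuous-wave illumination (from the text)
WAVELENGTH = C / FREQ    # ~0.75 mm free-space wavelength
N_MATERIAL = 1.7         # ASSUMED printing-polymer refractive index at 0.4 THz

def phase_to_thickness(phase, base=0.0):
    """Convert a trained phase delay (radians, wrapped to [0, 2*pi)) into the
    extra material thickness of a 3D-printed neuron:
        phi = 2*pi*(n - 1)*t / lambda  =>  t = phi*lambda / (2*pi*(n - 1)).
    `base` is an optional substrate thickness added to every neuron."""
    phi = np.mod(phase, 2 * np.pi)
    return base + phi * WAVELENGTH / (2 * np.pi * (N_MATERIAL - 1))
```

Wrapping the phase to one 2*pi period keeps every neuron's printed height within a single wavelength-scale range, which is what makes the relative-thickness encoding practical.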