End-to-end Deep Learning of Optical Fiber Communications

Boris Karanov, Student Member, IEEE, Mathieu Chagnon, Member, IEEE, Félix Thouin, Tobias A. Eriksson, Henning Bülow, Domaniç Lavery, Member, IEEE, Polina Bayvel, Fellow, IEEE, and Laurent Schmalen, Senior Member, IEEE

B. Karanov is with Nokia Bell Labs, Lorenzstr. 10, 70435, Stuttgart, Germany and Optical Networks Group, Dept. Electronic and Electrical Engineering, University College London, Torrington Place, WC1E 7JE, London, United Kingdom. (email:[email protected])
M. Chagnon, H. Bülow and L. Schmalen are with Nokia Bell Labs, Lorenzstr. 10, 70435, Stuttgart, Germany.
F. Thouin is with the School of Physics, Georgia Institute of Technology, Atlanta, GA, USA.
T. A. Eriksson is with the Quantum ICT Advanced Development Center, National Institute of Information and Communications Technology (NICT), 4-2-1 Nukui-kita, Koganei, Tokyo 184-8795, Japan.
D. Lavery and P. Bayvel are with Optical Networks Group, Dept. Electronic and Electrical Engineering, University College London, Torrington Place, WC1E 7JE, London, United Kingdom.
Copyright (c) 2015 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected].

Abstract: In this paper, we implement an optical fiber communication system as an end-to-end deep neural network, including the complete chain of transmitter, channel model, and receiver. This approach enables the optimization of the transceiver in a single end-to-end process. We illustrate the benefits of this method by applying it to intensity modulation/direct detection (IM/DD) systems and show that we can achieve bit error rates below the 6.7% hard-decision forward error correction (HD-FEC) threshold. We model all componentry of the transmitter and receiver, as well as the fiber channel, and apply deep learning to find transmitter and receiver configurations minimizing the symbol error rate. We propose and verify in simulations a training method that yields robust and flexible transceivers that allow, without reconfiguration, reliable transmission over a large range of link dispersions. The results from end-to-end deep learning are successfully verified for the first time in an experiment. In particular, we achieve information rates of 42 Gb/s below the HD-FEC threshold at distances beyond 40 km. We find that our results outperform conventional IM/DD solutions based on 2 and 4 level pulse amplitude modulation (PAM2/PAM4) with feedforward equalization (FFE) at the receiver. Our study is the first step towards end-to-end deep learning-based optimization of optical fiber communication systems.

Index Terms: Machine learning, deep learning, neural networks, optical fiber communication, modulation, detection.

I. INTRODUCTION

The application of machine learning techniques in communication systems has attracted a lot of attention in recent years [1], [2]. In the field of optical fiber communications, various tasks such as performance monitoring, fiber nonlinearity mitigation, carrier recovery and modulation format recognition have been addressed from the machine learning perspective [3]–[5]. In particular, since chromatic dispersion and nonlinear Kerr effects in the fiber are regarded as the major information rate-limiting factors in modern optical communication systems [6], the application of artificial neural networks (ANNs), known as universal function approximators [7], for channel equalization has been of great research interest [8]–[12]. For example, a multi-layer ANN architecture, which enables deep learning techniques [13], has been recently considered in [14] for the realization of low-complexity nonlinearity compensation by digital backpropagation (DBP) [15]. It has been shown that the proposed ANN-based DBP achieves performance similar to that of conventional DBP for a single channel 16-QAM system while reducing the computational demands. Deep learning has also been considered for short-reach communications. For instance, in [16] ANNs are considered for equalization in PAM8 IM/DD systems. Bit-error rates (BERs) below the forward error correction (FEC) threshold have been experimentally demonstrated over 4 km transmission distance. In [17], deep ANNs are used at the receiver of the IM/DD system as an advanced detection block, which accounts for channel memory and linear and nonlinear signal distortions. For short reaches (1.5 km), BER improvements over common feed-forward linear equalization were achieved.

In all the aforementioned examples, deep learning techniques have been applied to optimize a specific function in the fiber-optic system, which itself consists of several signal processing blocks at both transmitter and receiver, each carrying out an individual task, e.g. coding, modulation and equalization. In principle, such a modular implementation allows the system components to be analyzed, optimized and controlled separately and thus presents a convenient way of building the communication link. Nevertheless, this approach can be sub-optimal, especially for communication systems where the optimum receivers or optimum blocks are not known or not available due to complexity reasons. As a consequence, in some systems, a block-based receiver with one or several sub-optimum modules does not necessarily achieve the optimal end-to-end system performance. Especially if the optimum joint receiver is not known or too complex to implement, we require carefully chosen approximations.

Deep learning techniques, which can approximate any nonlinear function [13], allow us to design the communication system by carrying out the optimization in a single end-to-end process including the transmitter and receiver as well as the communication channel. Such a novel design based on full system learning avoids the conventional modular structure, because the system is implemented as a single deep neural network, and has the potential to achieve an optimal end-to-end performance. The objective of this approach is to acquire a robust representation of the input message at every layer of the network. Importantly, this enables a communication system to be adapted for information transmission over any type of channel without requiring prior mathematical modeling and analysis. The viability of such an approach has been introduced for wireless communications [18] and also demonstrated experimentally with a wireless link [19]. Such an application of end-to-end deep learning presents the opportunity to fundamentally reconsider optical communication system design.

Our work introduces end-to-end deep learning for designing optical fiber communication transceivers. The focus in this paper is on IM/DD systems, which are currently the preferred choice in many data center, access, metro and backhaul applications because of their simplicity and cost-effectiveness [20]. The IM/DD communication channel is nonlinear due to the combination of photodiode (square-law) detection and fiber dispersion. Moreover, noise is added by the amplifier and the quantization in both the digital-to-analog converters (DACs) and analog-to-digital converters (ADCs). We model the fiber-optic system as a deep fully-connected feedforward ANN. Our work shows that such a deep learning system, including transmitter, receiver, and the nonlinear channel, achieves reliable communication below FEC thresholds. We experimentally demonstrate the feasibility of the approach and achieve information rates of 42 Gb/s beyond 40 km. We apply re-training of the receiver to account for the specific characteristics of the experimental setup not covered by the model. Moreover, we present a training method for realizing flexible and robust transceivers that work over a range of distances. Precise waveform generation is an important aspect in such an end-to-end system design. In contrast to [18], we do not generate modulation symbols, but perform a direct mapping of the input messages to a set of robust transmit waveforms.

The goal of this paper is to design, in an offline process, transceivers for low-cost optical communication systems that can be deployed without requiring the implementation of a

the training method is also presented in this section. Section IV reports the system performance results in simulation. Section V presents the experimental test-bed and validation of the key simulation results. Section VI contains an extensive discussion on the properties of the transmit signal, the advantages of training the system in an end-to-end manner, and the details about the experimental validation. Finally, Sec. VII concludes the work.

II. DEEP FULLY-CONNECTED FEED-FORWARD ARTIFICIAL NEURAL NETWORKS

A fully-connected $K$-layer feed-forward ANN maps an input vector $\mathbf{s}_0$ to an output vector $\mathbf{s}_K = f_{\mathrm{ANN}}(\mathbf{s}_0)$ through iterative steps of the form

$$\mathbf{s}_k = \alpha_k\left(\mathbf{W}_k \mathbf{s}_{k-1} + \mathbf{b}_k\right), \quad k = 1, \ldots, K, \tag{1}$$

where $\mathbf{s}_{k-1} \in \mathbb{R}^{N_{k-1}}$ is the output of the $(k-1)$-th layer, $\mathbf{s}_k \in \mathbb{R}^{N_k}$ is the output of the $k$-th layer, $\mathbf{W}_k \in \mathbb{R}^{N_k \times N_{k-1}}$ and $\mathbf{b}_k \in \mathbb{R}^{N_k}$ are respectively the weight matrix and the bias vector of the $k$-th layer and $\alpha_k$ is its activation function. The set of layer parameters $\mathbf{W}_k$ and $\mathbf{b}_k$ is denoted by

$$\theta_k = \{\mathbf{W}_k, \mathbf{b}_k\}. \tag{2}$$

The activation function $\alpha_k$ introduces nonlinear relations between the layers and enables the approximation of nonlinear functions by the network. A commonly chosen activation function in state-of-the-art ANNs is the rectified linear unit (ReLU), which acts individually on each of its input vector elements by keeping the positive values and equating the negative to zero [21], i.e., $\mathbf{y} = \alpha_{\mathrm{ReLU}}(\mathbf{x})$ with

$$y_i = \max(0, x_i), \tag{3}$$

where $y_i$, $x_i$ denote the $i$-th elements of the vectors $\mathbf{y}$ and $\mathbf{x}$, respectively. Compared to other popular activation functions such as the hyperbolic tangent and sigmoid, the ReLU function has a constant gradient for positive inputs, which renders training computationally less expensive and avoids the effect of vanishing gradients.