
MGI

Mestrado em Gestão de Informação Master Program in Information Management

Unsupervised feature extraction with autoencoder for the representation of Parkinson’s disease patients

Veronica Kazak

Dissertation presented as partial requirement for obtaining the Master’s degree in Information Management


NOVA Information Management School

Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa

UNSUPERVISED FEATURE EXTRACTION WITH AUTOENCODER FOR THE REPRESENTATION OF PARKINSON'S DISEASE PATIENTS

by

Veronica Kazak

Proposal for Dissertation presented as partial requirement for obtaining the Master’s degree in Information Management, with a specialization in Knowledge Management and Business Intelligence

Advisor: Professor Roberto Henriques, PhD Co-Advisor: Professor Mauro Castelli, PhD

September 2018


Abstract

Data representation is one of the fundamental concepts in machine learning. An appropriate representation is found by discovering the structure and automatically detecting patterns in data. In many domains, representation or feature learning is a critical step in improving the performance of machine learning algorithms due to the multidimensionality of the data that feeds the model. Some tasks may have different perspectives and approaches depending on how data is represented. In recent years, deep artificial neural networks have provided better solutions to several problems and classification tasks. Deep architectures have also shown their effectiveness in capturing latent features for data representation.

In this document, autoencoders will be examined as a means of obtaining the representation of Parkinson's disease patients and compared with conventional representation learning algorithms. The results will show whether the proposed method of feature extraction leads to the desired accuracy in predicting the severity of Parkinson’s disease.

Keywords:

Autoencoder, Representation Learning, Feature Extraction, Unsupervised Learning, Deep Learning.


Acknowledgements

Developing this master's thesis initially seemed extremely challenging due to the requirements for conducting research in the field of deep learning, but it was made possible by the support of several people to whom I would like to express my sincere gratitude.

First and foremost, I would like to thank my two advisors Prof. Dr. Roberto Henriques and Prof. Dr. Mauro Castelli. You both provided me with a perfect balance of guidance and freedom.

Prof. Dr. Roberto Henriques became my guide in the world of data science – first, planting the seed of curiosity in the area and then, encouraging my research. Thank you for helping me carry out this research and for all your valuable advice in developing the document.

Prof. Dr. Mauro Castelli played a key role in determining the area of research by suggesting the source of data on Parkinson's disease patients. Thank you for your availability and support, as well as for resolving my doubts about training neural networks.

I would also like to thank Prof. Dr. Pedro Cabral for teaching me the standards and guidelines for scientific research and writing. Thanks to him, I managed to organize the process from the earliest stage and comply with the deadlines.

I would also like to express my special thanks to Eran Lottem, the neuroscientist from the Champalimaud Foundation who inspired me to study deep learning. Being a brain scientist, he triggered my interest in artificial neural networks that simulate a real brain. Our talks about brain activity further strengthened my decision to do this research.

I also want to thank my friends and fellow colleagues for your moral support and for helping me keep up my mental energy.

Last but not least, truly heartfelt and great thanks to my family and my loved ones, Laurent Filipe Iarincovschi, Ivan Iarincovschi and Svetlana Kalinina, for your blessings, for the amazing support, and for always being by my side, giving me the strength to continue and conclude this work.

Sincere thanks to all,

Veronica Kazak.


Contents

CHAPTER 1 INTRODUCTION ...... 1

1.1 MOTIVATION ...... 1

1.2 BACKGROUND ...... 1
1.2.1 Theoretical framework ...... 1
1.2.2 Autoencoders in medical data research ...... 3

1.3 STUDY RELEVANCE ...... 4

1.4 OBJECTIVES ...... 5

CHAPTER 2 NEURAL NETWORKS: DEFINITIONS AND BASICS ...... 6

2.1 INTRODUCTION TO ARTIFICIAL NEURAL NETWORKS ...... 6
2.1.1 Single perceptron ...... 6
2.1.2 Multilayer perceptron ...... 7
2.1.3 Activation functions ...... 9
2.1.3.1 Sigmoid activation function ...... 9
2.1.3.2 Hyperbolic tangent activation function ...... 10
2.1.3.3 Rectified Linear Unit (ReLU) activation function ...... 10
2.1.3.4 Softmax activation function ...... 11

2.2 TRAINING ARTIFICIAL NEURAL NETWORKS ...... 12
2.2.1 Backpropagation ...... 12

CHAPTER 3 AUTOENCODER FRAMEWORK ...... 16

3.1 OVERVIEW ...... 16

3.2 AUTOENCODER ALGORITHM ...... 17

3.3 PARAMETRIZATION OF AUTOENCODER ...... 17
3.3.1 Code size ...... 17
3.3.1.1 Undercomplete Autoencoders ...... 18
3.3.1.2 Overcomplete Autoencoders ...... 18
3.3.2 Number of layers ...... 18
3.3.2.1 Restricted Boltzmann machine ...... 18
3.3.2.2 Deep belief network ...... 20
3.3.2.3 Stacked autoencoder ...... 20
3.3.3 Loss function ...... 21
3.3.3.1 Mean squared error loss function ...... 22
3.3.3.2 Cross entropy loss function ...... 22
3.3.4 Optimizer ...... 22
3.3.5 Regularization ...... 23
3.3.5.1 Sparse autoencoder ...... 23


3.3.5.2 Denoising autoencoder ...... 25
3.3.5.3 Contractive Autoencoder ...... 27
3.4 OTHER AUTOENCODERS ...... 28
3.4.1 Variational Autoencoder ...... 28
3.4.2 Convolutional autoencoder ...... 29
3.4.3 Sequence-to-sequence autoencoder ...... 31

3.5 AUTOENCODERS SUMMARIZED ...... 32

CHAPTER 4 METHODOLOGY ...... 34

4.1 RESEARCH METHODOLOGY MAIN STEPS ...... 34

4.2 DATASET...... 35

4.3 TECHNICAL RESEARCH TOOLS ...... 36

4.4 INITIAL DATA COMPREHENSION ...... 36

4.5 EXTRACTING FEATURES WITH PRINCIPAL COMPONENT ANALYSIS ...... 37

4.6 EXTRACTING FEATURES WITH AUTOENCODER ...... 40
4.6.1 Training Vanilla autoencoder ...... 40
4.6.2 Visualizing features obtained by training Vanilla autoencoder ...... 41
4.6.3 Training Stacked autoencoder and feature visualization ...... 42

4.7 EVALUATION METHOD ...... 44
4.7.1 Linear Regression ...... 44
4.7.2 Support vector regression ...... 45
4.7.3 Neural Network model ...... 46
4.7.4 Validation method ...... 46

CHAPTER 5 RESULTS AND DISCUSSION ...... 48

5.1 RAW DATA OUTCOME ...... 48

5.2 PCA OUTCOME ...... 49

5.3 AUTOENCODER OUTCOME ...... 51

5.4 STACKED AUTOENCODER OUTCOME ...... 53

5.5 COMPARATIVE ANALYSIS OF RESULTS FROM DIFFERENT INPUTS ...... 55

CHAPTER 6 CONCLUSIONS AND FUTURE WORK ...... 57

BIBLIOGRAPHY ...... 59

APPENDIX A ...... 69

APPENDIX B ...... 70

APPENDIX C ...... 71

APPENDIX D ...... 72

APPENDIX F ...... 74


List of Figures

Figure 1.1 Distribution of published papers that use deep learning in subareas of health informatics ...... 4
Figure 2.1 Illustration of the mathematical model of a biological neuron – single perceptron ...... 7
Figure 2.2 Multilayer perceptron ...... 8
Figure 2.3 Sigmoid activation function ...... 10
Figure 2.4 Hyperbolic tangent activation function ...... 10
Figure 2.5 ReLU activation function ...... 11
Figure 2.6 The output of the Softmax function for 4 classes ...... 11
Figure 2.7 The gradient of the loss function ...... 13
Figure 3.1 Autoencoder architecture ...... 16
Figure 3.2 Boltzmann machine ...... 19
Figure 3.3 Restricted Boltzmann machine ...... 19
Figure 3.4 Training a Stacked Autoencoder ...... 21
Figure 3.5 Sparse Autoencoder ...... 24
Figure 3.6 Denoising autoencoder ...... 26
Figure 3.7 Denoising effect ...... 26
Figure 3.8 Variational autoencoder ...... 29
Figure 3.9 Convolution and pooling in CNN ...... 30
Figure 3.10 Convolutional autoencoder ...... 31
Figure 3.11 The unfolded recurrent neural network ...... 32
Figure 4.1 The core block diagram of the proposed methodology ...... 34
Figure 4.2 Correlation matrix of Parkinson’s patient dataset presented with the heatmap ...... 37
Figure 4.3 Visualization of the cumulative proportion of total variance explained by PCA ...... 38
Figure 4.4 Scatter plot of the three-dimensional representation produced by taking the first three principal components obtained from different sets of input attributes with the color specification of total_UPDRS scores ...... 39
Figure 4.5 Training autoencoders of different architectures ...... 41
Figure 4.6 Scatter plots of three-dimensional codes produced by training autoencoders based on different sets of input attributes with the color specification of total_UPDRS scores ...... 42
Figure 4.7 Scatter plots of three-dimensional codes produced by training the 19-15-10-5-3 stacked autoencoder with the color specification of total_UPDRS scores ...... 43
Figure 4.8 Neural network architectures used in supervised learning ...... 46
Figure 5.1 Training process of the 20-5-NN model with different number of epochs fed with raw data ...... 49
Figure 5.2 Training of 20-5-NN models fed with different number of principal components ...... 51


Figure 5.3 Training of the 20-5-NN model fed with different number of features obtained by training the autoencoder ...... 53
Figure 5.4 Training stacked autoencoders of different architectures ...... 55

Figure Appendix D-1 Scatter plot of total_UPDRS vs motor_UPDRS ...... 72
Figure Appendix D-2 Scatter plot of the three-dimensional representation produced by taking the first three principal components obtained from different sets of input attributes with the color specification of motor_UPDRS scores ...... 72
Figure Appendix E-1 Visualization of training autoencoders with different hyperparameters and number of epochs ...... 73
Figure Appendix F-1 Scatter plots of the three-dimensional codes produced by training autoencoders of different architectures and trained with different hyperparameters ...... 74


List of Tables

Table 3.1 Variants of the gradient descent (GD) based on the amount of data ...... 23
Table 3.2 Extensions of the gradient descent ...... 23
Table 3.3 Literature review on Autoencoders covered in this thesis ...... 32
Table 4.1 Variable description of Parkinson’s telemonitoring dataset ...... 35
Table 5.1 Model performance and computation time using patient data represented by original descriptors ...... 48
Table 5.2 Model performance and computation time using patient data represented by features obtained with PCA ...... 49
Table 5.3 Model performance and computation time using patient data represented by features obtained with autoencoders of different architectures ...... 51
Table 5.4 Model performance and computation time using patient data represented by features obtained with stacked autoencoders ...... 53
Table 5.5 Total UPDRS scores prediction results using patient data represented by 19 original descriptors and pre-processed by principal component analysis, vanilla and stacked autoencoders ...... 55

Table Appendix A-1 Descriptive statistics of Parkinson’s patient dataset ...... 69
Table Appendix B-1 Correlation coefficients between variables in Parkinson’s patient dataset ...... 70
Table Appendix B-2 Top-10 the most correlated variables ...... 70
Table Appendix C-1 Proportion of the variance explained by principal components obtained via transformation of 16 vocal attributes ...... 71
Table Appendix C-2 Proportion of the variance explained by principal components obtained via transformation of 19 original attributes ...... 71


Chapter 1 Introduction

1.1 Motivation

This thesis is aimed at exploring the capabilities of different types of autoencoders in extracting useful features for the representation of Parkinson's disease patients, and at comparing their performance in supervised learning with the representation obtained using the principal component analysis (PCA) technique and with the raw data.

Like PCA, autoencoders can provide a solution for learning a lower-dimensional latent representation, but unlike PCA, they learn non-linear transformations. Since autoencoders have shown better performance than PCA at this task since 2006, when they were applied to the MNIST dataset (Hinton & Salakhutdinov, 2006), it seems appealing to apply this technique, especially when the problem to be solved is characterized by complex interrelationships between variables.

1.2 Background

1.2.1 Theoretical framework

An autoencoder is an artificial neural network that is trained to reconstruct its input at the output (Goodfellow, Bengio, & Courville, 2016). It belongs to the class of unsupervised learning algorithms (Baldi, 2012), so it finds patterns in a dataset by learning the internal structure and features of the data.

Initially, autoencoders were primarily viewed as a tool for dimensionality reduction and feature learning (Goodfellow et al., 2016). With the emergence and development of contemporary deep learning, autoencoders came to be considered an effective way of pre-training neural networks (Hinton & Osindero, 2006; Bengio et al., 2007).

The idea of autoencoders was first associated with the resurgence of interest in the backpropagation algorithm (Rumelhart, Hinton, & Williams, 1986). In 1988, again as a part of the historical background of neural networks, the term "auto-association" was used in studying the multilayer perceptron (Bourlard & Kamp, 1988).

A 1989 paper contributed to further study of autoencoders by examining the auto-associative case in detail and demonstrating how, using the backpropagation algorithm, learning occurs without a priori knowledge of data structure (Baldi & Hornik, 1989). This paper also showed that when an activation function is linear, the auto-associative case is a special case of PCA (Baldi & Hornik, 1989).

In 1994, Hinton and Zemel's paper introduced a new way of training autoencoders to study non-linear representations (Hinton & Zemel, 1994). Here, the term autoencoder was used instead of auto-association, although the terminology associated with the concept has continued to evolve over time.

The next milestone dates back to 2006, when several papers on the subject were published. The paper (Hinton & Osindero, 2006) marked the beginning of the contemporary deep learning era (Wang & Raj, 2017) and introduced the deep belief network with a layer-wise pretraining method. The work of (Hinton & Salakhutdinov, 2006) explored how a multi-layered neural network could be pre-trained one layer at a time and then fine-tuned as a deep autoencoder using backpropagation. Here, feature learning for dimensionality reduction was conducted with both an autoencoder and PCA, with the autoencoder outperforming PCA (Hinton & Salakhutdinov, 2006). This paper was one of the inspirational sources for this thesis.

The idea of layer-wise pretraining has been exploited in several studies in which further explanation of the pre-training mechanism was provided together with some proposed refinements (Bengio et al., 2007; Bengio, 2009; Vincent et al., 2010), but the core idea came from the original work on deep belief networks. The main principle is that, to obtain a low-dimensional representation, each layer is trained one at a time by optimizing a local unsupervised criterion to produce a useful higher-level representation based on the input from the layer below, with the expectation of improving the generalized representation. As a result, several autoencoders are stacked together. This architecture is often referred to as a stacked autoencoder.

Using autoencoders as data compressors is considered a classical approach (Rifai et al., 2011). Modern architectures of autoencoders are mostly associated with so-called overcomplete architectures, with a code size larger than the input. Such autoencoders require the introduction of special regularizers to avoid a useless reconstruction (Goodfellow et al., 2016). Regularized autoencoders will be discussed in the third chapter.

1.2.2 Autoencoders in medical data research

Subsection 1.2.1 describes the important milestones that form the body of knowledge about autoencoders. The mentioned research papers provide a contextual approach to the scientific understanding of the concept. The subject of interest covered by this thesis is applying the autoencoder algorithm to Parkinson's disease patient data and comparing its feature extraction capabilities with other methods.

Since deep learning techniques have shown their effectiveness in both supervised and unsupervised tasks, they have proved well suited to extracting useful knowledge from medical big data (Litjens et al., 2017). There are several interesting papers related to the use of autoencoders in the field of medicine. One of the greatest inspirations for the present work was a study conducted by a research team at Mount Sinai Hospital in New York City in 2015. Based on electronic health records (test results, doctor visits, and so on), a patient's representation was derived using the stacked autoencoder architecture and called "Deep Patient" (Miotto et al., 2016). Without any expert instruction, Deep Patient discovered hidden patterns in the hospital data that can predict the patient's future with respect to the development of certain diseases. The results obtained using the Deep Patient representation were better than those obtained using the raw patient data or the representation obtained with PCA (Miotto et al., 2016).

A similar approach of feature extraction with subsequent use in supervised learning is described in the paper (Zafar et al., 2016). The proposed method, called SAFS, used a stacked autoencoder to obtain a higher-level representation of the most vulnerable demographic subgroup of patients in order to predict risk factors for hypertension. The results showed that the SAFS approach and representation learning outperform models trained on the unrepresented input. This paper demonstrated how deep architectures can be applied to a specific precision medicine problem.

Of particular interest is a recent paper where a stacked autoencoder and softmax classifier were used to diagnose Parkinson's disease (Caliskan et al., 2017). As in previous studies, the framework involved unsupervised learning to extract representation features that served as an input for supervised learning while a softmax classifier was built for prediction purposes.


This paper showed how the data dimension can be reduced with an autoencoder, thereby improving the predictive power of a classifier.

1.3 Study relevance

Deep learning is an active research area. Many interesting papers are constantly appearing in journals. In healthcare, they are primarily pertinent to medical image analysis (Litjens et al., 2017). Nevertheless, the flow of medical data is much more diverse and not limited to images. Figure 1.1 shows an increasing number of publications on deep learning in healthcare subareas.

Figure 1.1 Distribution of published papers that use deep learning in subareas of health informatics Source: (Ravi et al., 2017)

In papers that provide a review of deep learning in medicine, autoencoders are referred to as one of the commonly used deep learning architectures along with other deep learning techniques (Ravi et al., 2017; Litjens et al., 2017). The main idea of their use is to capture useful properties for the representation, whether it is an image or a numeric vector of an observation. Parkinson's disease is a degenerative neurological disorder, the second most common after Alzheimer's disease. About 5 million people suffer from Parkinson's disease worldwide (Gibrat et al., 2009). Some motor symptoms can drastically affect the quality of the patient's daily life (Bryant et al., 2015). Diagnosis of the disease in the early stages can significantly improve the plan for necessary treatment, and thus, maintain the quality of life (Challa et al., 2016). A great interest in detecting markers for the diagnosis of the disease is related to the fact that this can be done in the preclinical stage (Postuma & Montplaisir, 2009).

Parkinson's disease is known for causing disturbances of varying degrees in the production of vocal sounds (Asgari & Shafran, 2010). Voice recording and speech tests can be an effective non-invasive tool for diagnosis (Tsanas et al., 2010). Machine learning algorithms have already been used to extract useful information from patients' speech and to predict the severity of Parkinson's disease (Nilashi et al., 2016; Caliskan et al., 2017). Nevertheless, the improvement of the algorithms themselves, including those related to deep learning, provides the ground for continuing studies of patients' speech data to further improve the overall prognostics and to develop applications for home-based assessment or telemonitoring of Parkinson's disease outside hospital settings (Asgari & Shafran, 2010). Parkinson's patient voice data is measurable, but some characteristics are indecipherable to the human ear, especially in the early stages of the disease (Rouzbahani & Daliri, 2011). Identifying and measuring speech patterns in patients and comparing them with those of healthy people is thereby becoming one of the central tasks in the study of voice data. In this regard, applying deep learning techniques, in particular autoencoders, seems adequate and promising, since these methods have already proved effective in capturing salient features in complex data (Ravi et al., 2017).

1.4 Objectives

The main goal of this study is to explore whether it is possible to improve the Parkinson's patient representation by extracting useful features from a range of biomedical voice measurements using an autoencoder. To achieve the main goal, the following specific objectives were defined:

1. Systematize the knowledge base of autoencoders to provide the background and establish a sense of structure that guides the research.
2. Apply unsupervised learning techniques to train neural networks for the Parkinson's patient representation.
3. Provide a sensitivity analysis when using different types of autoencoders, PCA, and raw data.
4. Compare the performance of the representations obtained by the autoencoder, PCA and raw data in supervised learning.


Chapter 2 Neural networks: Definitions and basics

The aim of this chapter is to provide an essential background of artificial neural networks for a better intuitive understanding of concepts and further exploration of autoencoders.

2.1 Introduction to artificial neural networks

Inspired by biological mechanisms for the processing of natural signals in the human brain, artificial neural networks (ANN) have been the subject of research since the mid-20th century (Deng et al., 2014). The principal unit of an artificial neural network is the node, by analogy with the biological neuron. Connections between neurons in an ANN also simulate the interconnections between neurons in the brain (Gibson & Patterson, 2017). Neurons exchange signals with one another, forming a dense and complex network.

The term "neural network" appeared in the middle of the 20th century. In 1943, McCulloch and Pitts introduced a simplified model of a neuron in their paper (McCulloch & Pitts, 1943), which is considered the beginning of the computational theory of brain activity (Piccinini, 2004). Later, in the 1950s, Frank Rosenblatt developed a neural network model based on mathematical algorithms (Rosenblatt, 1958). Like its biological prototype, an artificial neuron can learn by adjusting the parameters that describe synaptic conductivity. This model laid the foundation both for studies of neural networks as biological processes in the brain and for the use of neural networks as an artificial intelligence method for solving various applied problems (Wang & Raj, 2017).

2.1.1 Single perceptron

The simplest form of a neural network was given the name of a perceptron back in Rosenblatt's work. The perceptron was originally conceived as a simplified mathematical model of neuronal processes in the brain: "The perceptron is designed to illustrate some of the fundamental properties of intelligent systems in general, without becoming too deeply enmeshed in the special, and frequently unknown, conditions which hold for particular biological organisms" (Rosenblatt, 1958).

Figure 2.1 shows a perceptron and zooms in on how data is processed by a neuron. The perceptron is considered a feed-forward network; the term forward indicates the direction of the information flow in the network, from the input to the output layer without cycles. The perceptron takes the input data $x_i$ and passes it to the neuron (cell body). Each incoming signal is multiplied by the weight value $w_i$ assigned to the corresponding input $x_i$. The neuron processes the weighted inputs by summing them into a single value. At this stage, a bias or offset term, referred to as $b$ or $w_0$ in machine learning models, is added. The bias input has a fixed value of 1 and ensures that different functions are computable by shifting them to get a better approximation. Finally, an output signal is produced by applying the activation function $f(x)$ to the above result.

Figure 2.1 Illustration of the mathematical model of a biological neuron – single perceptron Source: adapted from (Karpathy, 2015)
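To make the computation above concrete, the following NumPy sketch performs the forward pass of a single perceptron; the input values, weights and bias are hypothetical, and the sigmoid is just one possible choice of activation function.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def perceptron_forward(x, w, b, activation=sigmoid):
    """Single perceptron: weighted sum of the inputs plus bias, then activation."""
    z = np.dot(w, x) + b      # linear combination of inputs and weights, plus bias
    return activation(z)      # non-linear output signal of the neuron

# Hypothetical example with three inputs
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.1, -0.6])
b = 0.2
print(perceptron_forward(x, w, b))
```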

2.1.2 Multilayer perceptron

In 1969, Minsky and Papert published the book "Perceptrons", in which they expressed their scepticism about the perceptron, in particular about its incapability to learn the non-linearly separable boolean XOR function (Minsky & Papert, 1969). This publication is now regarded in the literature as the beginning of the so-called "AI winter", the period that followed a massive wave of interest in neural networks. Minsky and Papert did not merely show the limitations of the perceptron but argued that a solution to the XOR problem can be found by training networks with multiple layers, although a practical study of such networks happened a little later.


One or more neurons in a hidden layer connected only with the input and output form a single-layer neural network. Increasing the number of neurons in a single layer results in a shallow neural network architecture. By stacking one-layer neural networks upon one another and feeding successor layers with the output of previous layers, one can get a deeper, fully connected neural network, which is also known as a multilayer perceptron (Figure 2.2).

Figure 2.2 Multilayer perceptron

A multilayer perceptron uses the feed-forward mechanism in information processing. The idea of a feed-forward network is to approximate some function $f(x)$ and map the input, represented by a vector $x \in \mathbb{R}^n$, to a target $y$. The network defines the mapping $y' = f(x, w, b)$ and learns the values of the parameters $w$ and $b$ that result in the best function approximation. As examined in subsection 2.1.1, each neuron's output is computed by activating the weighted sum of the previous layer's input. Equation (2.1) represents this mathematically:

$$a_j = f\left(\sum_{i=1}^{n} w_{ij} x_i + b_j\right), \qquad (2.1)$$

where $f$ is an activation function; $w_{ij}$ is the weight for the connection of input $x_i$ to neuron $j$;

$b_j$ is the bias term of the corresponding neuron.

The presence of several hidden layers requires the introduction of additional notation (for convenience, in vectorized form):

$$z = Wx + b \qquad (2.2)$$

$$a = f(z), \qquad (2.3)$$

where $W \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $m$ is the number of units in a layer, and $f$ is applied element-wise, describing how functions are composed together to become the input for the subsequent layer:


$$f(z) = f([z_1, z_2, \ldots, z_m]) = [f(z_1); f(z_2); \ldots; f(z_m)] \qquad (2.4)$$

For subsequent stacked layers, a superscript will be used to denote each layer and parameterize each layer connection. So, the equation takes the following form:

$$a_k^l = f\left(\sum_j w_{kj}^l\, a_j^{l-1} + b_k^l\right), \qquad (2.5)$$

where each weight has two indices: $j$ indicates the emitting node and $k$ the receiving node.

The mathematical description of the feed-forward process in a multilayer perceptron makes it clear that increasing the number of layers makes sense only when using non-linear activation functions; otherwise, several layers of linear perceptrons can be collapsed into one due to the linearity of the entire chain of computations (Ng, Katanforoosh, & Mourri, 2017b).
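As a minimal sketch of the vectorized computation in equations (2.2)-(2.5), the following NumPy code composes two fully connected layers with a tanh activation; the 3-4-2 layer sizes and the random parameters are purely illustrative.

```python
import numpy as np

def layer_forward(a_prev, W, b, f):
    """One layer: z = W a^(l-1) + b (eq. 2.2), a = f(z) (eq. 2.3)."""
    z = W @ a_prev + b
    return f(z)

def mlp_forward(x, params, f=np.tanh):
    """Compose the layers as in equation (2.5); params is a list of (W, b) pairs."""
    a = x
    for W, b in params:
        a = layer_forward(a, W, b, f)
    return a

# Hypothetical 3-4-2 network with random weights and zero biases
rng = np.random.default_rng(0)
params = [(rng.normal(size=(4, 3)), np.zeros(4)),
          (rng.normal(size=(2, 4)), np.zeros(2))]
print(mlp_forward(np.array([1.0, -0.5, 2.0]), params))
```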

2.1.3 Activation functions

In the original perceptron, the activation function was a simple threshold operator. Linear functions have also been exploited as activation functions in neural networks (Baldi & Hornik, 1989). In modern networks, however, activation functions are most often referred to as non-linear functions. Adding nonlinearity to the network allows it to learn more complex functional mappings from data and avoid constraints associated with linear functions.

There are several commonly used activation functions in neural networks. The following subsections discuss how each of these functions processes the linear combination of weighted inputs to produce non-linear decision boundaries.

2.1.3.1 Sigmoid activation function

The sigmoid has the mathematical form shown by equation (2.6). It maps a real number into the interval between 0 and 1 (Figure 2.3).


$$\sigma(x) = \frac{1}{1 + e^{-x}} \qquad (2.6)$$

Figure 2.3 Sigmoid activation function

2.1.3.2 Hyperbolic tangent activation function

The hyperbolic tangent or tanh nonlinearity is shown by equation (2.7). It squashes a real-valued number to the range between -1 and 1. Unlike the sigmoid, its output is zero-centred (Figure 2.4).

$$\tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}} \qquad (2.7)$$

Figure 2.4 Hyperbolic tangent activation function

2.1.3.3 Rectified Linear Unit (ReLU) activation function

This function became popular after showing its effectiveness in training some types of neural networks (Nair & Hinton, 2010), as well as being faster for training larger networks (Krizhevsky, Hinton, & Sutskever, 2012). The ReLU returns zero if the value of the argument is negative, and the raw value otherwise (2.8). In other words, the activation is thresholded at zero. This effect is shown in Figure 2.5.

Rectifying neurons were explored in the paper (Glorot, Bengio, & Bordes, 2011). These neurons, according to the study's authors, are more biologically plausible and perform as well as or better than sigmoid and hyperbolic tangent networks.


$$\mathrm{ReLU}(x) = \max(x, 0) \qquad (2.8)$$

Figure 2.5 ReLU activation function

2.1.3.4 Softmax activation function

The softmax function is usually applied at the output level of a neural network when dealing with multinomial logistic regression, where the dependent variable has more than two levels (although it can also be used for a binary classifier). The softmax function squashes the output of each unit into the range [0, 1] so that the total sum of the outputs is equal to 1. The result of the softmax is equivalent to a categorical probability distribution (Hinton, Srivastava, & Swersky, 2012). The mathematical form of the softmax function for $K$ classes is represented by equation (2.9).

$$\mathrm{softmax}(x_j) = \frac{e^{x_j}}{\sum_{k=1}^{K} e^{x_k}}, \quad \text{for } j = 1, \ldots, K \qquad (2.9)$$

The output of the softmax function can be viewed as a matrix of probabilities (Figure 2.6). For example, let the target have 4 labels (assuming indexing from zero) and take 4 training samples. The resulting prediction matrix contains the probabilities of belonging to each class. The indicator target matrix assigns a class to each sample.

Original target (label) | Probabilities predicted | Indicator target matrix
0 | 0.6 0.1 0.2 0.1 | 1 0 0 0
1 | 0.1 0.7 0.1 0.1 | 0 1 0 0
2 | 0.2 0.1 0.5 0.2 | 0 0 1 0
3 | 0.1 0.1 0.1 0.7 | 0 0 0 1

Figure 2.6 The output of the Softmax function for 4 classes
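The following NumPy sketch implements the activation functions of this section and applies the softmax to a hypothetical matrix of logits (four samples, four classes), producing a probability matrix of the kind shown in Figure 2.6; the numeric values are invented for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))      # equation (2.6)

def tanh(x):
    return np.tanh(x)                     # equation (2.7)

def relu(x):
    return np.maximum(x, 0.0)             # equation (2.8)

def softmax(x):
    # subtract the row-wise maximum for numerical stability; equation (2.9)
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical logits for 4 training samples and 4 classes
logits = np.array([[2.0, 0.2, 0.9, 0.2],
                   [0.1, 2.1, 0.1, 0.1],
                   [0.7, 0.1, 1.6, 0.7],
                   [0.1, 0.1, 0.1, 2.0]])
probs = softmax(logits)
print(np.round(probs, 2))
print(probs.sum(axis=1))   # each row sums to 1
```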


2.2 Training artificial neural networks

The learning algorithm used in Rosenblatt's perceptron for binary classification was based on the perceptron convergence theorem. The idea was to find the upper and lower bounds on the length of the weight vector by updating the weights accordingly (Haykin, 2008). An important variation of the perceptron training algorithm was introduced by (Widrow & Hoff, 1960), who developed a model called ADALINE, based on the least-mean-square (LMS) algorithm. The main difference from the perceptron is the way the output of the neural system is used in the learning rule, known as the Delta rule. These two were the first training algorithms specific to single-layer neural networks. Minsky and Papert's work showed that these techniques do not work for multilayer networks but did not provide a solution to the problem of how to adjust weights in hidden layers that are not connected to the output. The issue of training multilayer networks remained open until the idea of propagating the error in the direction opposite to the forward pass, together with an explanation of the reverse calculations, became the subject of close examination in several works from the early 1970s onwards.

2.2.1 Backpropagation

A valuable contribution to overcoming the restrictions of previous algorithms for training multilayer networks was first made by Paul Werbos, who proposed a backpropagation training technique published in his PhD thesis (Werbos, 1974). This work remained almost unknown in the scientific community until the method was explored again by David Parker, who published his findings in a technical report (Parker, 1985). Finally, in 1986, a well-known paper on backpropagation was published by Geoff Hinton, David Rumelhart and Ronald Williams (Rumelhart et al., 1986). The idea conceived earlier was concisely stated and cast into a clear framework, which enhanced its popularization (Widrow & Lehr, 1990). The 1986 paper showed that backpropagation worked much faster than previous approaches to learning, and it has eventually come to be considered the starting point of contemporary deep learning theory (Wang & Raj, 2017; Beam, 2017).

Backpropagation belongs to the iterative optimization algorithms that progressively improve the results with respect to the objective function. The objective function ($J$), also called the loss or cost function, is the difference between the estimated ($y'$) and true ($y$) values of the target. In other words, it shows how different our mappings are from the actual values. The generic relationship can be denoted as follows:

$$J = y - y' \qquad (2.10)$$

In applied tasks, different loss functions are used depending on the problem at hand. The most used ones are cross-entropy and mean squared error or mean absolute error (Goodfellow et al., 2016).

Changing the parameters towards the optimal solution that minimizes the cost function is essentially what the training process is:

$$J = \min\big(f(x, \theta)\big), \qquad (2.11)$$

where $\theta$ denotes the parameters $w$ and $b$.

Finding an optimal combination of parameters by checking all possible combinations is an exhaustive process. The point is to figure out which way is downhill, that is, whether to increase or decrease the value of the weights to minimize the cost function. Knowing the direction in which to move, rather than trying countless combinations of weights, leads to faster optimization. This numerical procedure is called gradient descent. Graphically, it is reflected by the slope of the loss function at a certain point corresponding to a specific weight value $w_i$ (Figure 2.7).

Figure 2.7 The gradient of the loss function Source: adapted from (Raschka, 2015)

This way, the change in the loss function at each point with respect to a parameter change is nothing other than the derivative of the function with respect to the weights. Parameters will be adjusted incrementally by a step-size value $\alpha$, called the learning rate, by increasing (2.12) or decreasing (2.13) them in the direction of the gradient.

$$w_{i+1} \leftarrow w_i + \alpha \frac{\partial J}{\partial w_i} \qquad (2.12) \qquad\qquad w_{i+1} \leftarrow w_i - \alpha \frac{\partial J}{\partial w_i} \qquad (2.13)$$

Since the derivative of the function is a product of the derivatives of several terms, the concept of a total derivative is applied. The process becomes apparent when applying the chain rule to calculate the derivative of each component. First, the derivatives of the cost function with respect to the parameters of the last hidden layer are calculated. The gradients for the output layer directly affect the cost function. Using the chain rule, the derivative of the cost function with respect to the weights in the last layer is a product of three components. The first component is the derivative of the cost function with respect to the output; its calculation depends on the exact formula of the chosen cost function. The second component is the derivative of the activation function, which will be assumed to be the sigmoid $\sigma$ for ease of presentation. The third component is the derivative of the linear function, and it corresponds to the activation from the previous layer. This is reflected by equation (2.14):

$$\frac{\partial J}{\partial w_{jk}^L} = \frac{\partial J\left(y', f\big(z_j^L(w_{jk}^L)\big)\right)}{\partial w_{jk}^L} = \frac{\partial J}{\partial y'} \cdot \frac{\partial a_j^L}{\partial z_j^L} \cdot \frac{\partial z_j^L}{\partial w_{jk}^L} \qquad (2.14)$$

where $L$ is the last layer in the sequence $l = 1, 2, \ldots, L$.

According to common conventions, the symbol $\delta^l$ is used to denote the error associated with layer $l$. This is the value by which the network parameters will be adjusted. The partial derivative of the error function with respect to the weights in the final layer has the following form:

$$\frac{\partial J}{\partial w_{jk}^L} = \delta^L \cdot \frac{\partial a_j^L}{\partial z_j^L} \cdot \frac{\partial z_j^L}{\partial w_{jk}^L} = \delta^L a_j^{L-1} \qquad (2.15)$$

For the hidden layers, the chain rule for multivariate functions is applied again:

$$\frac{\partial J}{\partial w_{jk}^l} = \frac{\partial J}{\partial a_j^l} \cdot \frac{\partial a_j^l}{\partial z_j^l} \cdot \frac{\partial z_j^l}{\partial w_{jk}^l} = \frac{\partial J}{\partial a_j^l}\, \sigma'(z_j^l)\, a_k^{l-1} \qquad (2.16)$$

The derivative of the cost with respect to the activation, $\frac{\partial J}{\partial a_j^l}$, is part of the input to the $m^{th}$ unit of the subsequent layer, so its contributions will be summed in the next layer:


$$\frac{\partial J}{\partial a_j^l} = \sum_m \frac{\partial J}{\partial z_m^{l+1}} \cdot \frac{\partial z_m^{l+1}}{\partial a_j^l} = \sum_m \delta_m^{l+1} w_{mj}^{l+1} \qquad (2.17)$$

In the propagated sequence, all deltas are connected in a recursive formula where the delta of the previous layer is a function of the delta in the subsequent layer:

$$\delta_j^l = \left(\sum_m \delta_m^{l+1} w_{mj}^{l+1}\right) \sigma'(z_j^l) \qquad (2.18)$$

The generalized formula for updating the weights takes the form:

$$\Delta w_{jk}^l = \alpha \frac{\partial f(x, \theta)}{\partial w_{jk}^l} \qquad (2.19)$$

Summarizing the backpropagation algorithm, the following steps can be identified:

1. Forward path. Computing the sequence of inputs and outputs in each of the layers by summing weighted inputs and adding nonlinearity.
2. Backward path. Obtaining an error term in the final layer and backpropagating it to the hidden layers by reapplying equation (2.18).
3. Combining all the gradients in each layer to get the total gradient.
4. Updating the weights according to the learning rate and total gradient by applying (2.19).
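To make the four steps concrete, the following NumPy sketch performs repeated training iterations of a hypothetical one-hidden-layer network with sigmoid units and a squared-error loss; it is only an illustration of equations (2.14)-(2.19), not the implementation used later in this thesis.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, y, W1, b1, W2, b2, lr=0.1):
    """One forward and backward pass with sigmoid units and squared-error loss."""
    # 1. forward path
    z1 = W1 @ x + b1;  a1 = sigmoid(z1)
    z2 = W2 @ a1 + b2; a2 = sigmoid(z2)
    # 2. backward path: error terms of the output and hidden layers (eq. 2.18)
    delta2 = (a2 - y) * a2 * (1 - a2)
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)
    # 3.-4. gradients per layer and weight update with learning rate lr (eq. 2.19)
    W2 -= lr * np.outer(delta2, a1); b2 -= lr * delta2
    W1 -= lr * np.outer(delta1, x);  b1 -= lr * delta1
    return 0.5 * np.sum((a2 - y) ** 2)

# Hypothetical tiny network fitted to a single input/target pair
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
x, y = np.array([0.2, 0.8]), np.array([1.0])
for _ in range(200):
    loss = train_step(x, y, W1, b1, W2, b2)
print(loss)   # the squared error shrinks over the iterations
```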


Chapter 3 Autoencoder framework

This chapter provides an overview of autoencoders and their varieties. The goal is to explain the difference between autoencoders based on principles of how they learn the representation of data rather than focusing on the mathematical explanation of parameters introduced for this purpose.

3.1 Overview

The idea behind autoencoders is to reconstruct the input data in the output with the least possible distortion (Baldi, 2012). Formulated this way, it may seem useless but the expectation is that useful properties of the input will be extracted in the course of training.

An autoencoder is composed of three components: an encoder, a code, and a decoder. The encoder compresses the input and generates the code; the decoder reconstructs the input based on the code (Figure 3.1). The training process is similar to training a feedforward neural network via backpropagation (Goodfellow et al., 2016). Autoencoders are by nature lossy, which means that the output will be degraded compared to the original input. The goal of training is to minimize the reconstruction error.

Figure 3.1 Autoencoder architecture


3.2 Autoencoder algorithm

Each input to the autoencoder is represented by a vector $x \in \mathbb{R}^n$ of dimension $n$ equal to the input size. The autoencoder takes the input and first maps it to a hidden representation, or code, through a deterministic mapping called the encoder, which can be viewed as a function:

$$h = f_\theta(x) = \sigma(xW + b), \qquad (3.1)$$

where $\theta = \{W, b\}$; $W \in \mathbb{R}^{m \times n}$ is a weight matrix; $b \in \mathbb{R}^m$ is a bias vector; $f_\theta(x)$ is the encoder; and $\sigma$ is an activation function¹.

The hidden representation is mapped back by the decoder, reproducing the output layer $x' \in \mathbb{R}^n$ of the same dimension $n$ as the input. This process can be written as a function:

$$x' = g_{\theta'}(h) = \sigma(hW' + b'), \qquad (3.2)$$

where $\theta' = \{W', b'\}$ and $g_{\theta'}(h)$ is the decoder.

One of the approaches to constraining the weight matrix of the decoder $W'$ is to transpose the encoder weight matrix, $W' = W^T$. This is referred to as tied weights. The use of tied weights is optional, and some experiments with untied weights have yielded similar results (Vincent et al., 2010).
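A minimal sketch of this encoder/decoder pair, assuming the Keras API: a single sigmoid encoding layer produces the code $h$ and a linear decoding layer reproduces $x'$. The input and code dimensions (19 and 3) are illustrative placeholders, X_train is a hypothetical scaled input matrix, and tied weights are not used here.

```python
from tensorflow import keras
from tensorflow.keras import layers

input_dim, code_dim = 19, 3                       # hypothetical sizes

inputs = keras.Input(shape=(input_dim,))
code = layers.Dense(code_dim, activation="sigmoid", name="encoder")(inputs)   # h = f(xW + b)
outputs = layers.Dense(input_dim, activation="linear", name="decoder")(code)  # x' = g(hW' + b')

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X_train, X_train, epochs=50, batch_size=32)  # X_train: scaled input matrix
encoder = keras.Model(inputs, code)               # used to extract the learned code afterwards
```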

3.3 Parametrization of Autoencoder

To train an autoencoder, several parameters must be pre-set. These are parameters that determine the architecture of an autoencoder (code size, number of layers) and training parameters (loss function, activation function at each layer, optimizer, regularizer).

3.3.1 Code size

The code size is the number of units in the middle layer of an autoencoder, that is, the last layer of the encoder. By the size of the code, autoencoders can be classified into two groups: undercomplete and overcomplete autoencoders.

¹ The symbol $\sigma$ in this equation refers to any activation function.


3.3.1.1 Undercomplete Autoencoders

Undercomplete autoencoders have a code size smaller than the input size. These autoencoders are designed to capture useful features in the data and reduce its dimensionality by representing the whole population with the captured salient features. In this case, the network is encouraged to learn some sort of compression.

3.3.1.2 Overcomplete Autoencoders

The dimensionality reduction argument is built on the assumption that the code dimension is smaller than the input. Nevertheless, even in the opposite case, an autoencoder can learn useful information about the data structure by imposing some constraints on the activity of the hidden representations (Ng, 2011). When the architecture is properly adjusted by adding regularization terms, the autoencoder gains other qualities besides its capacity to copy the input to the output. These autoencoders will be considered in section 3.3.5, where different types of regularization are introduced.

3.3.2 Number of layers

Depending on the number of hidden layers, autoencoders can be divided into deep and shallow. A shallow autoencoder has just one hidden layer, and it can be viewed as a Restricted Boltzmann machine (Hinton, 2012c), especially when trained with contrastive divergence (Hinton, 2002). The first deep architectures were trained precisely as stacked RBMs. In the following subsections, the evolution from shallow to deep architectures is considered, and the main difference between stochastic and deterministic algorithms is defined.

3.3.2.1 Restricted Boltzmann machine

The Boltzmann machine was introduced in the paper (Ackley, Hinton, & Sejnowski, 1985) with subsequent variations that have surpassed their original version (Goodfellow et al., 2016), and underlie the modern deep learning (Heaton, 2015).

A Boltzmann machine is a kind of fully connected neural network that can be considered a stochastic generative variant of the Hopfield network (Hopfield, 1982), in which neurons are both input and output. The Boltzmann machine consists of two layers, a visible layer $v \in \mathbb{R}^n$ and a hidden layer $h \in \mathbb{R}^k$ (Figure 3.2). The inputs are binary vectors and the learning algorithm is based on maximum likelihood (Goodfellow et al., 2016).

Figure 3.2 Boltzmann machine

One of the varieties of Boltzmann machines is the Restricted Boltzmann machine (RBM), which was introduced by Paul Smolensky under the name harmonium (Smolensky, 1986). The main difference is that there are no connections between neurons in the same layer (Figure 3.3).

Figure 3.3 Restricted Boltzmann machine

The training algorithm consists of several forward and backward passes, in a fashion similar to training a feedforward network, with the difference that an RBM does not produce an output. The hidden layer is used to model the input data and then to reconstruct it in the visible layer. This implies a bi-directional relationship between layers (Hinton, 2012b), which is reflected by the so-called bipartite graph (Figure 3.3). The fact that RBMs are undirected and neurons are activated probabilistically makes them a special case of Markov networks (Hinton, 2012a).


3.3.2.2 Deep belief network

In 2006, the paper (Hinton & Osindero, 2006) demonstrated how several RBMs can be stacked together and trained in a greedy manner to form a deeper structure, called a Deep Belief Network (DBN). When two layers of the RBM are part of a deeper neural network, the output of the hidden layer is passed as the input to the next hidden layer, and from there through as many hidden layers as defined. These feed-forward movements form an architecture that resembles an autoencoder; however, there is an essential difference between them. The goal of training a DBN is to reproduce the distribution of the input data based on activations in the hidden layer using a stochastic approach, whereas an autoencoder deterministically learns the representation of the input data. In other words, the hidden layer of a DBN is a probabilistic distribution over the hidden variables, while the hidden layer of an autoencoder is a representation of the input data.

The Deep Belief Network is a hybrid involving both directed and undirected connections. The presence of directions distinguishes DBNs from Deep Boltzmann machines (Salakhutdinov & Hinton, 2009), which are fully undirected.

DBNs are not commonly used in modern architectures, but they played a fundamental role in the history of deep learning (Goodfellow et al., 2016) and are considered the predecessors of contemporary generative algorithms (Salakhutdinov, 2015).

3.3.2.3 Stacked autoencoder

The first deep networks were trained in a greedy layer-wise manner. The papers (Hinton & Salakhutdinov, 2006; Bengio et al., 2007) showed how, using this method, one can obtain a representation that yields better performance in supervised learning. This approach underlies the stacked autoencoder, where the codes of several vanilla autoencoders are combined into a more complex structure (Figure 3.4).


Figure 3.4 Training a Stacked Autoencoder

The algorithm of a stacked autoencoder can be described as follows:

1. Each layer is trained one at a time to preserve as much information from the input as possible (Figure 3.4 (a), (b), (c)).
2. The restoring layers $v'$, $w'$, $y'$ are discarded after each training procedure.
3. The weights of each hidden layer $w$, $y$, $z$ are fixed, and the next network is built on the data coming from the previous hidden layer, which now acts as the input layer.
4. All hidden layers are combined, forming a deep structure (Figure 3.4 (d)).
5. The fine-tuning of the entire network with respect to the final measure of interest is performed (Figure 3.4 (e)). Here, the transition from unsupervised to supervised learning takes place. The full neural network processes all layers of the deep encoder as a single entity in such a way that the weights of the stacked autoencoder are improved.
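The following Keras sketch illustrates steps 1-4 of this procedure for a hypothetical 19-15-10-5-3 encoder; the random matrix X stands in for real scaled input data, and the supervised fine-tuning of step 5 is omitted.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def pretrain_layer(data, n_hidden, epochs=20):
    """Train one shallow autoencoder on `data`, discard its decoder and
    return the encoder together with the encoded data (steps 1-3)."""
    inp = keras.Input(shape=(data.shape[1],))
    hid = layers.Dense(n_hidden, activation="sigmoid")(inp)
    out = layers.Dense(data.shape[1], activation="linear")(hid)
    ae = keras.Model(inp, out)
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(data, data, epochs=epochs, verbose=0)
    encoder = keras.Model(inp, hid)
    return encoder, encoder.predict(data, verbose=0)

X = np.random.rand(100, 19)                 # placeholder for the scaled dataset
encoders, current = [], X
for size in (15, 10, 5, 3):                 # hypothetical 19-15-10-5-3 architecture
    enc, current = pretrain_layer(current, size)
    encoders.append(enc)

# Step 4: stack the pre-trained layers (with their weights) into one deep encoder
deep_in = keras.Input(shape=(19,))
h = deep_in
for enc in encoders:
    h = enc.layers[-1](h)                   # reuse each pre-trained Dense layer
stacked_encoder = keras.Model(deep_in, h)
```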

3.3.3 Loss function

The objective function evaluates the effectiveness of training a neural network. It returns a score that indicates how well the network performs. In the case of an autoencoder, it measures how good the reconstruction is, which is why the term loss function is more adequate. Here, the most commonly used loss functions will be considered, although the choice of functions is much richer.


3.3.3.1 Mean squared error loss function

Typically used for real-valued inputs, the function $L$ is the mean squared difference between the input vector $x$ and the reconstructed vector $x'$, as shown by equation (3.3).

$$L\big(x, g(f(x))\big) = \frac{1}{n} \sum_{i=1}^{n} (x'_i - x_i)^2, \qquad (3.3)$$

where $g(h)$ is the decoder output and $h = f(x)$ is the encoder output, or code.

When the inputs are real values in $[-\infty; +\infty]$, it is recommended to use a linear activation function in the last reconstruction layer (Larochelle, 2013).

3.3.3.2 Cross entropy loss function

Typically used for binary inputs, the loss function represents the sum of the Bernoulli cross-entropies between two distributions:

$$L\big(x; g(f(x))\big) = -\frac{1}{n} \sum_{i=1}^{n} \big(x_i \log(x'_i) + (1 - x_i)\log(1 - x'_i)\big) \qquad (3.4)$$

This function is also applicable when the inputs belong to the range $[0; 1]$ (Larochelle, 2013).

In the literature, one may be confronted with the term negative log-likelihood. Cross-entropy and negative log-likelihood have different origins, information theory and statistical modeling respectively, but mathematically they are exactly the same (Gibson & Patterson, 2017).
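A small NumPy sketch of the two reconstruction losses, equations (3.3) and (3.4), evaluated on a hypothetical input vector and an imperfect reconstruction:

```python
import numpy as np

def mse_loss(x, x_rec):
    """Mean squared error of equation (3.3), for real-valued inputs."""
    return np.mean((x_rec - x) ** 2)

def cross_entropy_loss(x, x_rec, eps=1e-12):
    """Bernoulli cross-entropy of equation (3.4), for inputs in [0, 1]."""
    x_rec = np.clip(x_rec, eps, 1.0 - eps)    # avoid log(0)
    return -np.mean(x * np.log(x_rec) + (1 - x) * np.log(1 - x_rec))

# Hypothetical input and its reconstruction
x     = np.array([0.0, 1.0, 0.5, 0.8])
x_rec = np.array([0.1, 0.9, 0.4, 0.7])
print(mse_loss(x, x_rec), cross_entropy_loss(x, x_rec))
```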

3.3.4 Optimizer

Optimizers are algorithms that try to find the optimal values of the mathematical functions used in training a neural network. Gradient descent, considered in section 2.2, is one such optimization algorithm.

There are different variants of the gradient descent algorithm with respect to the amount of data taken per update to compute the gradient, and extensions of the gradient descent based on different approaches to adapting the learning rate and selecting hyperparameters. We do not go into detail here and just summarize some of them (Table 3.1, Table 3.2).


Table 3.1 Variants of the gradient descent (GD) based on the amount of data Source: (Ruder, 2016)

Method | Amount of data per update | Update speed | Memory usage | Online learning
Stochastic GD | one sample | High | Low | Yes
Batch GD | entire dataset | Slow | High | No
Mini-batch GD | around 50-250 samples | Medium | Medium | Yes

Table 3.2 Extensions of the gradient descent Sources: (Portilla & Mosconi, 2017; Sokolov et al., 2017)

Method | Description
Gradient descent | Basic unmodified GD
Momentum | Smooth updates. Tends to move in the same direction as on previous steps
Nesterov Momentum | The update slows down before changing direction, making a small corrective jump based on the gradient
AdaGrad | Separate learning rates for each dimension. Adaptive correction using the squared gradient. Suits sparse data
RMSprop | Learning rate adapts to the latest gradient steps. EWMA² applied to the squared gradient, as in AdaGrad
Adam | Combines Momentum and individual learning rates. Uses EWMA on the 1st and 2nd moments

² Exponentially Weighted Moving Average
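As a toy illustration of the difference between the basic update and one of the extensions in Table 3.2, the sketch below minimizes the hypothetical one-parameter cost $J(w) = w^2$ (gradient $2w$) with plain gradient descent and with momentum; the step sizes are arbitrary.

```python
def sgd_update(w, grad, lr=0.1):
    """Basic gradient descent step."""
    return w - lr * grad

def momentum_update(w, grad, velocity, lr=0.1, beta=0.9):
    """Momentum keeps an exponentially decaying average of past gradients,
    so the update tends to keep moving in the same direction."""
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

w_plain, w_mom, v = 5.0, 5.0, 0.0
for _ in range(100):
    w_plain = sgd_update(w_plain, 2 * w_plain)
    w_mom, v = momentum_update(w_mom, 2 * w_mom, v)
print(w_plain, w_mom)   # both approach the minimum at w = 0
```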

3.3.5 Regularization

A strategy aimed at improving the performance of algorithms not only on training data but also on new data is known as regularization. There are various regularization methods, generally based on modifying the basic algorithm by adding some constraints. The following subsections discuss regularizers applicable to autoencoders.

3.3.5.1 Sparse autoencoder

The concept of sparsity was introduced in computational neuroscience (Olshausen & Field, 1997), and subsequently arose in different contexts: unsupervised learning of sparse features represented by the linear code (Ranzato et al., 2006); as a variant of DBNs in the context of convolutional networks (Ranzato, 2007; Lee & Ng, 2008; Mairal et al., 2009).

The overcomplete architecture of a sparse autoencoder allows a larger number of hidden units in the code, but this requires that, for a given input, most of the hidden neurons result in very little activation (Ng, 2011). For each hidden node, the average activation value should be close to zero (in the case of the sigmoid, or to -1 when the activation function is tanh). A neuron is considered active if its output is close to 1, and inactive otherwise (Ng, 2011).

What is the goal of having hidden units that are mostly zero? The idea is to have a neuron activated only for a small fraction of the training examples. Since the samples have different characteristics, the activation should not occur in the same fashion in all neurons and must be coordinated. The goal is to obtain a latent representation with many zeros and only a few non-zero elements that represent the most salient features (Figure 3.5).

Figure 3.5 Sparse Autoencoder

The training of a sparse autoencoder involves a penalty term on the code layer to reflect deviation from a desired sparsity. The penalty is added to the loss function, modifying the error:

$$L\big(x; g(f(x))\big) + \Omega(h), \qquad (3.5)$$

where $g(h)$ is the decoder output, $h = f(x)$ is the encoder output or code, and $\Omega(h)$ is a sparsity penalty, which is a log function (3.6; 3.7).

$$\Omega(h) = \sum_{j=1}^{s} KL(p \,\|\, p'_j), \qquad (3.6)$$

where $p$ is a sparsity parameter, typically a small value close to zero; $p'_j$ is the average activation of hidden unit $j$, which is forced to approximate $p$; $s$ is the number of neurons in the hidden layer; and $KL$ is the Kullback-Leibler divergence (3.7) between a Bernoulli random variable with mean $p$ and a Bernoulli random variable with mean $p'_j$ (Ng, 2011).


$$KL(p \,\|\, p'_j) = p \log \frac{p}{p'_j} + (1 - p) \log \frac{1 - p}{1 - p'_j} \qquad (3.7)$$

The effects of introducing a sparse component into the network were explored and visualized using the MNIST dataset in the paper (Makhzani & Frey, 2014). In the hidden layer, only the $k$ highest activities were kept. The results showed that with a higher level of sparsity, the network tended to capture local fragments of digits, and with decreasing values of $k$, it was forced to learn a more complete representation of each digit.

A method of creating a sparse representation by achieving true zeros in the code layer was introduced in the paper (Glorot et al., 2011), where the ReLU activation function was used for this purpose. The researchers highlighted several reasons why a sparse representation might be appealing, thereby enriching the conceptual case for sparsity; among them are information disentangling and efficient variable-size representation (Glorot et al., 2011).
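A NumPy sketch of the sparsity penalty of equations (3.6)-(3.7), computed on a hypothetical matrix of sigmoid activations (rows are samples, columns are hidden units); the weighting factor beta is an assumed extra hyper-parameter that scales the penalty before it is added to the loss.

```python
import numpy as np

def kl_sparsity_penalty(activations, p=0.05, beta=1.0):
    """Sum of KL(p || p'_j) over the hidden units (equations 3.6 and 3.7)."""
    p_hat = activations.mean(axis=0)                          # average activation per hidden unit
    kl = p * np.log(p / p_hat) + (1 - p) * np.log((1 - p) / (1 - p_hat))
    return beta * kl.sum()

# Hypothetical activations: unit 0 fires for almost every sample, units 1 and 2 rarely
h = np.array([[0.9, 0.10, 0.01],
              [0.8, 0.01, 0.10],
              [0.7, 0.10, 0.01],
              [0.9, 0.01, 0.10]])
print(kl_sparsity_penalty(h))   # dominated by unit 0, whose mean activation is far from p
```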

3.3.5.2 Denoising autoencoder

The idea of training a network by adding noise was approached in the works (Gallinari et al., 1987) and (Lecun, 1987). Later, the denoising task was applied to more complex convolutional and recurrent networks (Seung, 1998; Jain & Seung, 2008) with continuing study of this parameterization on autoencoders (Alain & Bengio, 2014).

Denoising autoencoders are trained to map a noisy input to a clean output. The noise is added to the input layer to force the autoencoder to learn the most robust features and to distinguish them from noise (Vincent & Larochelle, 2008). The input in this case is a corrupted version $\tilde{x} \in \mathbb{R}^n$ of the original input $x \in \mathbb{R}^n$ (Goodfellow et al., 2016). The denoising autoencoder does not simply copy the input to its output but cleans the data from noise and then reproduces the input from its corrupted version (Figure 3.6).


Figure 3.6 Denoising autoencoder

The loss function minimizes the error between the original input and the reconstruction obtained from the corrupted input, and takes the following form:

$$L\big(x, g(f(\tilde{x}))\big) \qquad (3.8)$$

where $g(f(\tilde{x}))$ is the decoder output and $f(\tilde{x})$ is the code obtained from the corrupted input $\tilde{x}$.

Figure 3.7 visualizes the idea of a denoising autoencoder: the noisy version has some pixels missing, but the digit's shape is still clearly visible and unequivocally interpretable.

Figure 3.7 Denoising effect: (a) original version of the digit from the MNIST dataset; (b) corrupted version of the same digit.

3.3.5.2.1 Dropout regularizer

One of the varieties of denoising is the dropout method. Section 3.3.5.2 considered denoising autoencoders where the noise is injected only into the input layer. The additional introduction of noise into the hidden layers turns out to be a powerful regularization tool for networks and is considered one of the ways to prevent a network from overfitting (Srivastava et al., 2014).

3.3.5.2.2 Noise

There are different forms of noise used to implement the corruption. They will not be discussed in detail here, but some generally applicable techniques (Vincent et al., 2010) are outlined in the following subsections.

Gaussian noise

One of the common statistical noise choices which, by its nature, relies on the fact that the values the noise can take are normally distributed. It has been described in several works applied to autoencoders (Vincent, 2011; Rifai et al., 2011) and compared with other types of noise added to neural networks (Poole, Sohl-dickstein, & Ganguli, 2014).


Masking noise

Another way to add noise is to randomly set some of the inputs to zero and leave the others as they are (Vincent et al., 2010). Consequently, the autoencoder will try to predict the artificially assigned "blanks" from the non-missing values. This type of corruption is called masking because assigning zero to inputs means that they are completely ignored (masked) in the computation of the subsequent layers (Vincent et al., 2010).

Salt-and-pepper noise

Mainly associated with images, this type of noise manifests as randomly occurring black and white pixels in the image. Mathematically, a random fraction of inputs is set to its minimum or maximum value. Applied to images, this noise replaces pixels with black or white pixels, regardless of the other pixels and their original colour (Gonzalez et al., 2007).
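For concreteness, the three corruption types above can be written in a few lines of NumPy; this is a minimal sketch under the assumption that the input is already scaled to the [0, 1] range, and the noise parameters are illustrative.

import numpy as np

def gaussian_noise(x, std=0.1):
    """Add zero-mean Gaussian noise and keep values inside the [0, 1] range."""
    return np.clip(x + np.random.normal(0.0, std, x.shape), 0.0, 1.0)

def masking_noise(x, fraction=0.25):
    """Randomly set a fraction of the inputs to zero (they are 'masked')."""
    mask = np.random.rand(*x.shape) >= fraction
    return x * mask

def salt_and_pepper_noise(x, fraction=0.25):
    """Set a random fraction of the inputs to their minimum (0) or maximum (1)."""
    corrupted = x.copy()
    flip = np.random.rand(*x.shape) < fraction
    corrupted[flip] = np.random.randint(0, 2, size=flip.sum())  # 0 = pepper, 1 = salt
    return corrupted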

3.3.5.3 Contractive Autoencoder

The contractive autoencoder was introduced in the paper (Rifai et al., 2011). This type of autoencoder introduces a regularization term that forces the network to learn representation features that are robust to small changes around the input (Rifai et al., 2011). This is achieved by adding a penalty term to the loss function (3.9) that encourages the derivatives of the encoding function to be as small as possible, whereas a denoising autoencoder does the same for the reconstruction function (Goodfellow et al., 2016). The penalty term differs from the penalty used in the sparse autoencoder and corresponds to the Frobenius norm of the Jacobian matrix (3.10) of the encoder activations with respect to the input (Rifai et al., 2011).

L(x, g(f(x))) + Ω(h),   (3.9)

where g(h) is the decoder output, h = f(x) is the encoder output, and Ω(h) is a penalty term given by the squared Frobenius norm (sum of squared elements):

Ω(h) = λ ‖∂f(x)/∂x‖²_F   (3.10)

where λ is a hyper-parameter that controls the strength of the regularization.



Since a contractive autoencoder is trained to resist disturbances to its input, it is forced to map a neighbourhood of input points to a contracted neighbourhood in the hidden layer, making the mapping not too sensitive and giving rise to the name of this regularized autoencoder. The paper (Alain & Bengio, 2014) showed that a denoising autoencoder with very small Gaussian corruption and squared error loss is a kind of contractive autoencoder.
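For a single sigmoid encoding layer h = σ(Wx + b), the Jacobian has entries ∂h_j/∂x_i = h_j(1 − h_j)W_ji, so the penalty in (3.10) can be evaluated in closed form. The sketch below is a minimal NumPy illustration of that computation under this sigmoid-encoder assumption, with arbitrary example values for the weights and the input.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def contractive_penalty(x, W, b, lam=1e-4):
    """Squared Frobenius norm of the encoder Jacobian for h = sigmoid(W @ x + b)."""
    h = sigmoid(W @ x + b)                    # hidden activations
    # For a sigmoid layer, J[j, i] = h[j] * (1 - h[j]) * W[j, i]
    jacobian = (h * (1.0 - h))[:, None] * W
    return lam * np.sum(jacobian ** 2)

x = np.random.rand(19)                        # one input sample
W = np.random.randn(3, 19) * 0.1              # encoder weights (code size 3)
b = np.zeros(3)
print(contractive_penalty(x, W, b))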

3.4 Other Autoencoders

In this section, a brief overview of other autoencoders will be provided to preserve the logic of the classification according to different criteria. The core of these autoencoders are neural networks that differ fundamentally in how information is processed between layers. These are major classes of neural networks, each of which merits a chapter of its own. It is appropriate to mention them here since they can be used within the autoencoder architecture.

3.4.1 Variational Autoencoder

This type of autoencoder belongs to a specific class of algorithms – generative models of data (Goodfellow et al., 2016). In a very simplified form, these are models that are trained to generate plausible-looking fake samples that resemble the training data. They are called autoencoders because they have the constituent parts of a traditional autoencoder – an encoder and a decoder – however, mathematically they do not have much in common with it (Doersch, 2016).

A variational autoencoder (VA) is a generative model that learns a latent variable model of its input by adding constraints on the encoded representation. It was introduced as a variational Bayesian approach that involves the optimization of an approximation (Kingma & Welling, 2013) and can be trained with gradient-based methods (Rezende et al., 2014).

The encoder part of the variational autoencoder takes a sample represented by a vector x ∈ ℝⁿ, passes it through the encoding network and produces a probability distribution in the latent space. This distribution is Gaussian (Kingma & Welling, 2013). In practice, this is reflected in an additional variational layer consisting of a mean vector µ ∈ ℝᵐ and a standard deviation vector σ ∈ ℝᵐ of the same dimension as the latent vector. For each data sample, a point in the latent space is sampled from this distribution and represented by a vector z ∈ ℝᵐ, which is fed into the decoder. As a result of the decoding process, a new sample is obtained that resembles the input sample (Figure 3.8).



Figure 3.8 Variational autoencoder

The generative model of the variational autoencoder is a probabilistic model and is not considered here in detail, but it is worth noting a key difference from the vanilla autoencoder: the variational autoencoder is a stochastic model in which a probabilistic encoder q_θ(z|x) approximates the true posterior distribution p(z|x), and a generative decoder p_φ(x|z) does not rely on any particular input x. The idea is to ensure that the decoder can decode any input, and for this, one needs to define the distribution of inputs that the decoder should expect (Doersch, 2016). In the literature, the variational autoencoder algorithm is referred to as a variational approximation to inference (Rezende et al., 2014), which literally means predicting latent representations given new data by approximating a posterior distribution over the latent units given the input data. Generative models are an active research area and are used in a wide range of interesting applications.
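A minimal sketch of the sampling step (the reparameterization z = µ + σ·ε with ε ~ N(0, I)) is shown below, assuming Keras 2.x with the TensorFlow backend; the layer sizes are illustrative and the loss term (reconstruction plus KL divergence) is omitted for brevity.

from keras import backend as K
from keras.layers import Input, Dense, Lambda
from keras.models import Model

original_dim, latent_dim = 19, 3            # illustrative dimensions

x = Input(shape=(original_dim,))
h = Dense(10, activation='relu')(x)         # encoder hidden layer
z_mean = Dense(latent_dim)(h)               # mu vector
z_log_var = Dense(latent_dim)(h)            # log(sigma^2) vector

def sampling(args):
    # z = mu + sigma * epsilon, with epsilon drawn from a standard normal
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim))
    return z_mean + K.exp(0.5 * z_log_var) * epsilon

z = Lambda(sampling)([z_mean, z_log_var])   # stochastic latent code
decoded = Dense(original_dim, activation='sigmoid')(Dense(10, activation='relu')(z))
vae = Model(x, decoded)                     # reconstruction + KL loss to be added before training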

3.4.2 Convolutional autoencoder

This is an autoencoder based on a convolutional neural network (CNN). Convolutional architectures assume that the inputs are data with a grid-structured topology (Goodfellow et al., 2016), such as images. This makes it possible to take advantage of the graphical structure of an object and encode its specific properties. CNNs were introduced by (Fukushima, 1980) under a different name and became the basis for modern convolutional networks (LeCun et al., 1989; LeCun et al., 1998).

A CNN consists of convolutional and subsampling layers optionally followed by a fully connected layer (Figure 3.9). The convolutional process creates a feature (activity) map from the input by extracting local receptive fields across the entire image. After activation in the hidden neurons, a pooling step is applied, reducing the dimensionality of the feature map by condensing the outputs of small regions of neurons into a single output. This is considered one layer of a CNN (Ng et al., 2017).

Figure 3.9 Convolution and pooling in CNN

Each subsequent layer increases the complexity of the learned feature map. When high-level features are detected, a fully connected layer is attached to the network. In supervised learning, the fully connected layer accesses the output of the previous layer and defines properties that are more related to a particular class of objects (Krizhevsky et al., 2012).

In unsupervised learning, the middle subsampling layer is connected to the decoder, which proceeds with the convolutional process again to reconstruct the input image (Figure 3.10). Since training a network on image inputs from scratch requires huge computational power, the convolutional autoencoder was viewed as a technique for initializing weights in a neural network (Erhan, Courville, & Vincent, 2010). It is also widely used for its conventional purpose of feature extraction (Masci et al., 2011; Du et al., 2017).



Figure 3.10 Convolutional autoencoder Source: adapted from (Badrinarayanan, Kendall, & Cipolla, 2017)
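Although the dataset used in this study is not image data, a convolutional autoencoder is easy to sketch; the following minimal illustration assumes Keras 2.x and 28×28 grayscale inputs (MNIST-like), with illustrative filter counts.

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

inp = Input(shape=(28, 28, 1))                                   # grayscale image
x = Conv2D(16, (3, 3), activation='relu', padding='same')(inp)   # feature maps
x = MaxPooling2D((2, 2), padding='same')(x)                      # subsampling (encoder)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)                # 7x7x8 code

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)                                      # decoder mirrors the encoder
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

conv_ae = Model(inp, decoded)
conv_ae.compile(optimizer='adam', loss='binary_crossentropy')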

3.4.3 Sequence-to-sequence autoencoder

Designed to be trained with sequential data, this autoencoder is based on another specialization of neural networks appropriate for handling sequences – the recurrent neural network (RNN). The idea behind RNNs is to use sequential information, with connections between elements forming a directed cycle (Jordan, 1986). Several variants of RNNs can be found in the literature: Jordan networks (Jordan, 1986), Elman networks (Elman, 1990), bidirectional RNNs (Schuster & Paliwal, 1997), Long Short-Term Memory networks (Hochreiter & Schmidhuber, 1997) and Gated Recurrent Units (Chung, Gulcehre, Cho, & Bengio, 2015).

The specificity of training neural networks fed with sequential data is determined by the nature of this data. A time series can be viewed as a sequence of values monitored over time. Its unstable character means that the output depends not only on the current input but also on how the data behaved in the past. In practice, this is reflected in loops that allow information to persist (Figure 3.11). A sequential input is processed by applying a recurrent formula (3.11) at every time step.

h_t = f_W(h_{t−1}, x_t)   (3.11)

y_t = W_hy h_t   (3.12)

where h_t is the new state, f_W is a function with parameters W, h_{t−1} is the old state, x_t is the input vector at the given time step, and y_t is the output at that time step.
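As a minimal illustration of equations (3.11) and (3.12), the sketch below unrolls a vanilla RNN step in NumPy with a tanh non-linearity; the dimensions and the choice of tanh are illustrative assumptions.

import numpy as np

input_dim, hidden_dim, output_dim = 4, 8, 2
W_xh = np.random.randn(hidden_dim, input_dim) * 0.1   # input-to-hidden weights
W_hh = np.random.randn(hidden_dim, hidden_dim) * 0.1  # hidden-to-hidden weights
W_hy = np.random.randn(output_dim, hidden_dim) * 0.1  # hidden-to-output weights

def rnn_step(h_prev, x_t):
    """One application of the recurrent formula: returns the new state and the output."""
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t)          # equation (3.11)
    y_t = W_hy @ h_t                                    # equation (3.12)
    return h_t, y_t

h = np.zeros(hidden_dim)
sequence = [np.random.randn(input_dim) for _ in range(5)]
for x_t in sequence:                                    # unrolling over time steps
    h, y = rnn_step(h, x_t)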



Figure 3.11 The unfolded recurrent neural network Source: adapted from (Olah, 2015)

An RNN in which the output is fed back to the input can suffer from the vanishing gradient problem (Hochreiter et al., 2001) and is then unable to capture long-term dependencies. To solve this problem, the Long Short-Term Memory network (LSTM) was proposed (Hochreiter & Schmidhuber, 1997).

RNNs are widely used in natural language processing tasks (Bengio et al., 2003; Socher et al., 2011; Mikolov et al., 2013). The sequence-to-sequence model (Seq2Seq) was originally applied to the problem of machine translation (Kalchbrenner & Blunsom, 2013). In (Cho et al., 2014), a Seq2Seq model was proposed and evaluated on the task of translating from English to French. In this model, based on statistical machine translation, the encoder is trained to obtain a semantically and syntactically meaningful representation of linguistic phrases, and the decoder learns a continuous-space representation of a phrase that preserves both its semantic and syntactic structure by predicting the next characters of the target sequence given the previous ones (Cho et al., 2014). A similar approach was considered in the paper (Sutskever et al., 2014), which demonstrated that a deep LSTM outperformed other RNNs.

3.5 Autoencoders summarized

To summarize the literature review about autoencoders, the following table is presented for ease of access:

Table 3.3 Literature review on Autoencoders covered in this thesis

Contribution | Papers
Hopfield network | (Hopfield, 1982)
Boltzmann machine | (Ackley et al., 1985)
Restricted Boltzmann machine | (Smolensky, 1986)
Auto-associative neural networks | (Bourlard & Kamp, 1988), (Baldi & Hornik, 1989)
Autoencoders, Helmholtz machine | (Hinton & Zemel, 1994)
Deep Belief Network | (Hinton & Osindero, 2006)
Stacked RBM for dimensionality reduction | (Hinton & Salakhutdinov, 2006)
Deep Boltzmann machine | (Salakhutdinov & Hinton, 2009)
Stacked autoencoders | (Bengio et al., 2007), (Bengio, 2009), (Vincent et al., 2010)
Sparse autoencoders | (Olshausen & Field, 1997), (Ranzato et al., 2006; Ranzato, 2007), (Lee & Ng, 2008), (Mairal et al., 2009), (Glorot et al., 2011), (Makhzani & Frey, 2014)
Denoising autoencoders | (Gallinari et al., 1987), (Seung, 1998), (Vincent & Larochelle, 2008), (Vincent et al., 2010), (Vincent, 2011), (Alain & Bengio, 2014)
Dropout regularizer | (Srivastava et al., 2014)
Contractive autoencoders | (Rifai et al., 2011)
Variational autoencoders | (Kingma & Welling, 2013), (Rezende et al., 2014)
Convolutional autoencoders | (Erhan et al., 2010), (Masci et al., 2011), (Du et al., 2017)
Sequence-to-sequence autoencoders | (Kalchbrenner & Blunsom, 2013), (Cho et al., 2014), (Sutskever et al., 2014)


Chapter 4 Methodology

This chapter presents the major stages of the research process and describes how the experiments were conducted. It provides information about the dataset, initial data comprehension via exploratory analysis of its variables, feature extraction and visualization of the extracted features, training procedure details and, finally, the evaluation methods.

4.1 Research methodology main steps

The present study is framed in the phases illustrated on Figure 4.1.

Figure 4.1 The core block diagram of the proposed methodology


4.2 Dataset

The dataset used for the experiments contains 5875 voice recordings from 42 patients diagnosed with early-stage Parkinson's disease – 28 men and 14 women (about 200 recordings per patient). Each observation is described by the following variables: patient number, age, gender, time interval from the baseline recruitment date, 16 vocal attributes based on biomedical voice measures, and Unified Parkinson's Disease Rating Scale (UPDRS) scores used for tracking symptom progression (Table 4.1). This scale is the most widely used clinical rating scale for Parkinson's disease (Goetz et al., 2003).

The dataset is available on the UCI Machine Learning Repository website. The main goal of the data is to predict UPDRS scores. There are two UPDRS scores – the total score, which refers to the full range of the metric, and the motor score, which refers to the motor section of the UPDRS. Their ranges are 0–176 and 0–108, respectively, from healthy status to total disability or severe motor impairment (Eskidere et al., 2012). Since patients at an early stage of the disease were involved in the trial, the maximum UPDRS values in the dataset lie in the middle of the scales – 54.992 and 39.511 for total and motor UPDRS, respectively (Appendix A).

Table 4.1 Variable description of Parkinson’s telemonitoring dataset Source: (Tsanas et al., 2010)

Variable | Description | Role | Type
subject# | Integer that uniquely identifies each patient | ID | ID
age | Patient's age | input | int
sex | Patient's gender | input | bin
test_time | Time since recruitment into the trial. The integer part is the number of days since recruitment. | input | int
Jitter(%), Jitter(Abs), Jitter:RAP, Jitter:PPQ5, Jitter:DDP | Several measures of variation in fundamental frequency | input | int
Shimmer, Shimmer(dB), Shimmer:APQ3, Shimmer:APQ5, Shimmer:APQ11, Shimmer:DDA | Several measures of variation in amplitude | input | int
NHR, HNR | Measures of ratio of noise to tonal components in the voice | input | int
RPDE | A non-linear dynamical complexity measure | input | int
DFA | Signal fractal scaling exponent | input | int
PPE | A non-linear measure of fundamental frequency variation | input | int
motor_UPDRS | Clinician's motor UPDRS score, linearly interpolated | target | int
total_UPDRS | Clinician's total UPDRS score, linearly interpolated | target | int



4.3 Technical research tools

For this study, Spyder 3.3.0 was used as the Python integrated development environment. The main packages for statistical analysis and visualisation were Pandas 0.22, Matplotlib 2.2.2 and Seaborn 0.8.1. For scientific computation and the implementation of machine learning algorithms, NumPy 1.14.2 and Scikit-learn 0.19.1 were used. Keras 2.0 with the TensorFlow 1.7 (CPU) backend was chosen as the library for building neural networks and applying deep learning. The code for running the experiments can be found at: https://github.com/veronique-ka/autoencoders_prk.

4.4 Initial data comprehension

The exploration of the data is primarily aimed at better understanding it through description, inspection of variable correlations and identification of outliers. For this purpose, the variables were analysed through univariate descriptive statistics provided in Appendix A, and then the relationships between continuous variables were examined via correlation coefficients and with the help of a visualization tool – the correlation matrix heatmap. Identifying the correlation between variables is one of the prerequisites for the application of PCA (Jolliffe & Cadima, 2016). The higher the degree of correlation between the original variables, the fewer components are required to explain the greater part of the variation (Vyas & Kumaranayake, 2006).

Figure 4.2 shows the correlation matrix heatmap, and correlation coefficients are provided in Appendix B. Both indicate the presence of strong correlations between variables in data. This can imply the existence of multicollinearity among variables which can cause difficulties when building a regression model (Alin, 2010).



Figure 4.2 Correlation matrix of Parkinson’s patient dataset presented with the heatmap
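A minimal sketch of this correlation inspection using the packages listed in Section 4.3 (Pandas, Seaborn, Matplotlib); the DataFrame contents and column names here are illustrative placeholders rather than the real dataset.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# df is assumed to hold the Parkinson's telemonitoring dataset as a DataFrame
df = pd.DataFrame(np.random.rand(5875, 19),
                  columns=['age', 'sex', 'test_time'] + ['vocal_%d' % i for i in range(16)])

corr = df.corr()                               # pairwise correlation coefficients
sns.heatmap(corr, cmap='coolwarm', center=0)   # correlation matrix heatmap
plt.show()

# top correlated pairs (upper triangle only, sorted by absolute correlation)
pairs = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1)).stack()
print(pairs.abs().sort_values(ascending=False).head(10))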

The most correlated variables were identified, and the top-10 pairs are presented in Appendix B. The detection of highly correlated variables is particularly helpful when deciding whether to include particular variables in a model, if one is assumed. The present study is not aimed at building a predictive model but at conducting a comparative analysis to evaluate how well different sets of features cope with the prediction of UPDRS scores.

4.5 Extracting features with Principal component analysis

PCA is one of the most widely used techniques for reducing the dimensionality of data and extracting features to reveal a simplified data structure (Jolliffe & Cadima, 2016). This method involves a linear transformation of the input variables resulting in the formation of uncorrelated components that preserve the most valuable part of the original variables.

In this study, the projection of the data to a lower-dimensional space has been implemented with the method of randomized truncated singular value decomposition proposed by (Halko et al., 2011), using the probabilistic PCA model of (Tipping & Bishop, 1999). Before the decomposition, the data was standardized by removing the mean value of each variable and scaling it to unit variance. The procedure was then composed of the following steps:



1. Performing the PCA transformation on matrices consisting of the vocal attributes (16 variables) and of all attributes (19 variables);
2. Component selection.

There is a plethora of methods for deciding on the optimal number of components. In this study, the decision has been primarily based on heuristic methods commonly used for measuring the quality of PCA. One of the standard measures is the proportion of the variance explained by each component. The list of variance values is presented in Appendix C. From Table Appendix C-1, one can observe that the first principal component explains 70.38% of the total variance; the second – 10.46%; the third – 7.75%, and so on. Together, the first three components explain 88.59% of the total variance. The graphical representation of the cumulative explained variance is an effective tool for substantiating the decision on the choice of components, though it often leads to the use of just a few PCs. From the curve in Figure 4.3 (a), a relative stabilization is observed after the fourth PC, so the cumulative variance explained does not increase significantly with each subsequent component.
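As a minimal illustration of this step, the sketch below standardizes the input and applies scikit-learn's PCA with the randomized SVD solver, then inspects the explained variance and the Kaiser criterion; the array of 16 vocal attributes is a placeholder assumption.

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X_vocal = np.random.rand(5875, 16)               # placeholder for the 16 vocal attributes

X_std = StandardScaler().fit_transform(X_vocal)  # zero mean, unit variance
pca = PCA(n_components=16, svd_solver='randomized', random_state=0)
scores = pca.fit_transform(X_std)

explained = pca.explained_variance_ratio_        # proportion of variance per component
cumulative = np.cumsum(explained)                # used for the cumulative "elbow" plot
eigenvalues = pca.explained_variance_            # Kaiser criterion: keep components > 1
n_kept = int(np.sum(eigenvalues > 1.0))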

Figure 4.3 Visualization of the cumulative proportion of total variance explained by PCA: a) PCA based on 16 vocal attributes; b) PCA based on all 19 attributes

Another method used to determine the number of PCs is the Kaiser criterion (Kaiser, 1961). It suggests keeping components with eigenvalues larger than one. This approach is quite old and has undergone several criticisms since the 1960s (Preacher & MacCallum, 2003). Nevertheless, it is used as a reference point for deciding on PCs. From Table Appendix C-1, the eigenvalues of the first three components are above one. Based on both methodologies, it was decided to take 3 components for further analysis when the transformation is based on the 16 vocal attributes. For the transformation of all 19 variables, the scenario changes – the first four components have eigenvalues above one, and as per the "elbow rule", the stabilization of the curve is observed after the seventh component (Figure 4.3 (b)). Each of these sets will be submitted to the analysis, although for the visualization, the data will be represented by three principal components.

After transforming the original variables, the result was depicted using a 3-dimensional scatter plot where the axes correspond to the components and the colour scale reflects the value of the target variable, the total_UPDRS score (Figure 4.4).



Figure 4.4 Scatter plot of the three-dimensional representation produced by taking the first three principal components obtained from different sets of input attributes with the color specification of total_UPDRS scores a) PCA based on 16 vocal attributes; b) PCA based on total 19 attributes

From the graph, values of the target variable are distributed throughout the sample area, revealing a more distinctive separation in Figure 4.4 (b). In view of the fact that the total_UPDRS and motor_UPDRS values have a strong linear relationship, which is reflected in Figure Appendix D-1, hereinafter the results are given only for one of these target variables. The scatter plots of 3 principal components with the target motor_UPDRS are presented in Figure Appendix D-2.

4.6 Extracting features with Autoencoder

To project from a high-dimensional space to a low-dimensional one, the number of features that best represent the sample for the subsequent supervised learning must be determined. In other words, it is necessary to define the number of hidden units in the middle layer of the autoencoder's architecture. Unlike the PCA literature in this respect, there is no clear rule for determining the size of the low-dimensional space when mapping the original data with an autoencoder. Some literature suggests rules of thumb for selecting the number of hidden units in a neural network (Blum, 1992; Swingler, 1996), and some papers propose approaches with mathematical evidence as theoretical support (Xu & Chen, 2008). In this study, the number of hidden units was specified in reliance on the conclusion about the number of dimensions needed to capture 70-90% of the variance (Boger & Guterman, 1997) and in order to maintain parity in input dimension size when evaluating the different feature extraction algorithms.

4.6.1 Training Vanilla autoencoder

Before training the autoencoder for reconstruction, data was normalized. The experiments were conducted on normalized and standardized data to compare behavior during training, as well as to apply various loss functions. That is, the use of binary cross-entropy requires that data values are scaled from 0 to 1.

The optimization of hyperparameters is another important preliminary phase in the training of neural networks. Several approaches for hyperparameter tuning exist (Bergstra et al., 2011; Bergstra & Bengio, 2012). The paper (Bengio, 2012) offers recommendations for debugging and overcoming difficulties in training multilayered neural networks by hyperparameter tuning. In this study, a manual hyperparameter search was applied: various combinations of parameters were set and, based on the results, tweaked until a satisfactory set was found.

The results of two training processes are shown on Figure 4.5 and trainings with other hyperparameters are presented in Appendix E.



Figure 4.5 Training autoencoders of different architectures: a) 16-3-16 autoencoder; b) 19-3-19 autoencoder. Both were trained with the following settings: activation function – sigmoid; optimizer – Adam; loss function – binary cross-entropy; learning rate – 0.001; batch size – 64; epochs – 200; normalization method – MinMax.

From all the graphs, including the ones in Appendix E, the loss function curve stabilizes after about 200 epochs, and the residuals do not decrease significantly thereafter. The learning rate for each training session was left at the Keras default for each optimizer.
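For reference, a 19-3-19 vanilla autoencoder with the hyperparameters listed above can be written as the minimal sketch below, assuming Keras 2.x and data scaled to [0, 1] with MinMax normalization; the data array is a placeholder, not the actual patient dataset.

import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.layers import Input, Dense
from keras.models import Model

X = np.random.rand(5875, 19)                        # placeholder for the patient data
X = MinMaxScaler().fit_transform(X)                 # scale to [0, 1]

inp = Input(shape=(19,))
code = Dense(3, activation='sigmoid')(inp)          # 3-dimensional bottleneck
out = Dense(19, activation='sigmoid')(code)

autoencoder = Model(inp, out)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')  # lr 0.001 (Keras default)
autoencoder.fit(X, X, epochs=200, batch_size=64, shuffle=True, verbose=0)

encoder = Model(inp, code)                          # used to extract the 3 features
features = encoder.predict(X)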

4.6.2 Visualizing features obtained by training Vanilla autoencoder

After training the autoencoder, the three features were visualized with a 3-dimensional scatter plot using total_UPDRS to map the points to different colours. Figure 4.6 illustrates features extracted with autoencoders with the hyperparameters shown on Figure 4.5 (b). Alternative visualizations are presented in Appendix F.





Figure 4.6 Scatter plots of three-dimensional codes produced by training autoencoders based on different sets of input attributes with the color specification of total_UPDRS scores a) Features extracted out of 16 vocal attributes; b) Features extracted out of total 19 attributes

The visualizations in Figure 4.6 illustrate how the initial set of attributes affects the representation learned through the features. The images in Appendix F show the effect of different parameterizations of the autoencoder.

4.6.3 Training Stacked autoencoder and feature visualization

The stacked autoencoder was trained in a greedy layer-wise manner, one layer at a time, to preserve information. The decrease in the number of features in each successive layer followed different heuristic architectures, and in some of them regularization parameters were introduced. The components of the deep stacked autoencoder were parametrized in the same way for each individual training procedure. The input data fed into the autoencoder was normalized to the [0, 1] range.
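A minimal sketch of greedy layer-wise pre-training for a 19-15-10-5-3 stacked autoencoder is shown below, assuming Keras 2.x; the layer sizes follow the architecture described in this section, while the activations, optimizer, data placeholder and epoch count are illustrative assumptions.

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

def train_single_layer(data, code_dim, epochs=200):
    """Train one shallow autoencoder and return its encoder and the encoded data."""
    inp = Input(shape=(data.shape[1],))
    code = Dense(code_dim, activation='sigmoid')(inp)
    out = Dense(data.shape[1], activation='sigmoid')(code)
    ae = Model(inp, out)
    ae.compile(optimizer='adam', loss='binary_crossentropy')
    ae.fit(data, data, epochs=epochs, batch_size=64, verbose=0)
    encoder = Model(inp, code)
    return encoder, encoder.predict(data)

X = np.random.rand(5875, 19)        # placeholder for the normalized [0, 1] dataset
encoders, layer_input = [], X
for dim in [15, 10, 5, 3]:          # one greedy step per hidden layer
    enc, layer_input = train_single_layer(layer_input, dim)
    encoders.append(enc)
# layer_input now holds the 3-dimensional codes; the trained encoders can be
# stacked into one deep model and fine-tuned end-to-end on the original input.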

After training the stacked autoencoder, the code consisting of three features was visualized with a target-based colour palette. Figure 4.7 illustrates the 19-15-10-5-3 fine-tuned autoencoder with no regularizers introduced.



Figure 4.7 Scatter plots of three-dimensional codes produced by training the 19-15-10-5-3 stacked autoencoder with the color specification of total_UPDRS scores a) Training data; b) Test data

The scatter plot in Figure 4.7 shows an even clearer separation of observations according to the values of the target variable than Figure 4.6 (b). Observations with the highest UPDRS values are concentrated in the upper part, although blended with others.



4.7 Evaluation method

Feature learning algorithms are usually evaluated in supervised applications by comparing the prediction results for a target variable. The choice of algorithms was based on a rule of thumb considering the specifics of the problem: to take advantage of commonly used algorithms for solving regression problems. That is, linear regression and support vector regression (SVR) were the first choices for the prediction of UPDRS scores. Since the visualization graphs from the previous sections did not reveal clear signs of linearity, non-linear supervised algorithms – single- and multi-layered neural networks – were used in addition to those mentioned above.

To measure accuracy for a continuous target variable and to understand whether a significant amount of the total variability is explained by the estimator, the relevant statistical metrics were used to assess the different models: mean squared error, mean absolute error and R-squared.

4.7.1 Linear Regression

Multiple linear regression is a statistical method of fitting a model that describes the relationship between several explanatory variables and a target variable, which can be denoted as follows:

y = β₀ + β₁x₁ + β₂x₂ + … + β_k x_k + ε,   (4.1)

where y is the predicted variable, x₁, x₂, …, x_k are the explanatory variables, β₀, β₁, β₂, …, β_k are the unknown parameters or regression coefficients, and ε is the residual term.

One approach to estimating the parameter vector β is to select the values of β that minimize the sum of squared errors (SSE). In regression analysis, the mean squared error (MSE) is sometimes used to measure the quality of an estimator. It can be denoted as shown by Equation 4.2.

MSE = (1/n) Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)²,   (4.2)

where n is the number of predictions, Y are the real values of the target variable, Ŷ are the predicted values of the target variable, and (Y − Ŷ)² are the squared errors.



Another criterion for assessing the goodness of a regression model is the coefficient of determination, R-squared (R²), which indicates the percentage of the variability in the dependent variable that the independent variables explain collectively:

R² = SSR / SST, or equivalently 1 − SSE / SST,   (4.3)

where SSR is the sum of squares explained by the regression model, SSE is the sum of squared errors (the variation left unexplained by the regression model), and SST is the total sum of squares (SST = SSR + SSE).
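A minimal sketch of fitting such a model and computing the three metrics above, assuming scikit-learn and the 70/30 split used in this study; the feature matrix X and target y are placeholders for illustration.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

X = np.random.rand(5875, 19)                  # placeholder for the input features
y = np.random.rand(5875) * 55                 # placeholder for total_UPDRS scores

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)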

4.7.2 Support vector regression

Support vector regression (SVR) is a machine learning technique which uses the same principles as the support vector machine for classification. It estimates a continuous function by mapping the input data onto a high-dimensional feature space, transforming complex relationships into linear form so that a linear separation can be performed. SVR (Cortes & Vapnik, 1995) builds a hyperplane such that each observation falls within a deviation from the actual values of no more than a specified value. Different kernel types can be used in the SVR algorithm; in this study, the radial basis function kernel has been applied (Scikit-learn, 2017). Mathematically, a linear model in the feature space is denoted by the formula:

f(x) = Σᵢ₌₁ⁿ wᵢ gᵢ(x) + b,   (4.4)

where gᵢ(x) denotes a set of non-linear transformations, n is the number of features, b is a bias term, and wᵢ are the weights or parameters to be minimized to guarantee flatness, which can be written as the optimization problem:

Minimize (1/2)‖w‖²   subject to   |yᵢ − f(xᵢ)| ≤ ε   (4.5)

One of the advantages of SVR is that it avoids the constraints of using linear functions in a high-dimensional feature space when the problem at hand is not explicitly linear.
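A minimal sketch of the SVR estimator with the RBF kernel, assuming scikit-learn; the data placeholders and default hyperparameters are illustrative assumptions rather than the exact configuration used in the experiments.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error

X = np.random.rand(5875, 19)                  # placeholder for the input features
y = np.random.rand(5875) * 55                 # placeholder for total_UPDRS scores
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

svr = SVR(kernel='rbf')                       # radial basis function kernel
svr.fit(X_train, y_train)
print(mean_absolute_error(y_test, svr.predict(X_test)))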



4.7.3 Neural Network model

In this study, different architectures of neural networks were used to compare the performance of shallow and deeper structures in predicting the level of UPDRS. Figure 4.8 schematically presents these architectures and their hyperparameters. Each network has been trained for 200 epochs. Some experiments have also been carried out with an increased number of epochs to trace the effect and significance of a drastic increase in this parameter.

Figure 4.8 Neural network architectures used in supervised learning: a) 1 layer, 5-NN; b) 1 layer, 20-NN; c) 2 layers, 20-5-NN; d) hyperparameters: activation function – sigmoid (linear in the output layer); optimizer – SGD; loss function – MSE; learning rate – 0.01; batch size – 64; epochs – 200.
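A minimal sketch of the two-layer 20-5 network with the hyperparameters from Figure 4.8 d), assuming Keras 2.x; the data arrays are placeholders, and the SGD learning rate is set explicitly to 0.01 as stated above.

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import SGD

X_train = np.random.rand(4112, 19)            # placeholder for the training features
y_train = np.random.rand(4112) * 55           # placeholder for total_UPDRS scores

inp = Input(shape=(19,))
h = Dense(20, activation='sigmoid')(inp)      # first hidden layer
h = Dense(5, activation='sigmoid')(h)         # second hidden layer
out = Dense(1, activation='linear')(h)        # linear output for regression

nn_20_5 = Model(inp, out)
nn_20_5.compile(optimizer=SGD(lr=0.01), loss='mse')
nn_20_5.fit(X_train, y_train, epochs=200, batch_size=64, verbose=0)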

4.7.4 Validation method

The experiments have been conducted 30 times (10 times for the neural network models, due to the computation time), and the reported results are aggregates of the metrics obtained in all trials. The statistical median was used for this purpose, being a more robust measure than the mean. For validation, the traditional training/test split was used in a proportion of 70% to 30%, as well as k-fold cross-validation with k equal to 10. The results were tabulated for each of the methods.

It is important to note that the data providers initially determined the purpose of the dataset as predicting UPDRS estimates based merely on voice data. The analysis has been conducted both on voice data alone and with the addition of the other attributes – age, gender and testing time. Feature extraction has likewise been performed based on the 16 voice attributes and based on all 19 attributes.


Chapter 5 Results and discussion

This chapter discusses the resulting performance of feature extraction methods compared to the raw data.

5.1 Raw data outcome

Table 5.1 presents the results obtained with the original patients’ data.

Table 5.1 Model performance and computation time using patient data represented by original descriptors

Validation method: 70/30 | 10-fold cross validation
Model | MSE | MAE | R² | Time | MSE | MAE | R² | Time
Regression, 16 vars* | 103.1281 | 8.3025 | 0.0926 | 00:00:00 | 103.7291 | 8.3432 | 0.0930 | 00:00:00
Regression, 19 vars | 95.1707 | 8.0885 | 0.1747 | 00:00:00 | 94.6022 | 8.0526 | 0.1709 | 00:00:01
SVR, 16 vars | 97.0687 | 7.5885 | 0.1453 | 00:00:01 | 95.9118 | 7.5340 | 0.1567 | 00:00:16
SVR, 19 vars | 83.9886 | 6.8761 | 0.2704 | 00:00:01 | 80.7242 | 6.7094 | 0.2939 | 00:00:16
1 layer NN-5, 16 vars | 91.2179 | 7.6610 | 0.2177 | 00:00:31 | 86.0042 | 7.5044 | 0.2292 | 00:05:21
1 layer NN-20, 16 vars | 78.5532 | 7.0806 | 0.3102 | 00:00:31 | 75.6411 | 6.9079 | 0.3334 | 00:05:34
2 layers NN-20-5, 16 vars | 77.6354 | 6.8407 | 0.3307 | 00:00:32 | 77.6823 | 6.7768 | 0.3111 | 00:05:39
1 layer NN-5, 19 vars | 62.0525 | 6.1840 | 0.4497 | 00:00:32 | 59.8088 | 6.1177 | 0.4739 | 00:06:07
1 layer NN-20, 19 vars | 45.0292 | 5.0804 | 0.6135 | 00:00:33 | 40.9976 | 4.7931 | 0.6419 | 00:06:14
2 layers NN-20-5, 19 vars | 34.8285 | 4.1521 | 0.6989 | 00:00:33 | 27.4460 | 3.6707 | 0.7542 | 00:06:45
2 layers NN-20-5, 19 vars, 1000 epochs | 23.2290 | 3.3326 | 0.8000 | 00:02:15 | 15.4043 | 2.6725 | 0.8681 | 00:54:41
* vars - input variables

The table shows that the simple regression yields poor results, with a slight improvement for SVR. As the model becomes more complex, the results improve – this is clearly seen when moving down from the first to the last row of the table. There is a marked difference in the performance of models fed with different numbers of attributes. Regarding the NN models, the deeper architecture has yielded better results only when fed with 19 input variables, whereas for 16 variables the difference between shallow and deep networks is not significant.

The best performance has been demonstrated by the neural network model with two layers and an increased number of epochs. From Figure 5.1, the loss curve stabilizes after 200 epochs, and the metric values improve only slightly beyond that point. Of particular interest is the mean absolute error of 2.67, which could not be reached with the other architectures. Nevertheless, the computation time has notably increased, and after about 500 epochs there is a slight divergence between the training and test curves, which requires closer examination for possible overfitting. For this reason, the model with an early stop after 200 epochs will be taken as the basis for comparing the performance of features in sub-section 5.5.

Figure 5.1 Training process of the 20-5-NN model fed with raw data, trained for different numbers of epochs: a) training for 200 epochs; b) training for 1000 epochs

5.2 PCA outcome

Table 5.2 shows the results obtained using the data pre-processed by principal component analysis.

Table 5.2 Model performance and computation time using patient data represented by features obtained with PCA

Validation method: 70/30 | 10-fold cross validation
Model | MSE | MAE | R² | Time | MSE | MAE | R² | Time
Regression, 3 PC, 16 vars | 113.8701 | 8.6314 | 0.0142 | 00:00:00 | 112.9515 | 8.6031 | 0.0122 | 00:00:00
Regression, 3 PC, 19 vars | 112.0777 | 8.5690 | 0.0306 | 00:00:00 | 110.7220 | 8.5265 | 0.0315 | 00:00:00
Regression, 4 PC, 19 vars | 102.4713 | 8.4496 | 0.1078 | 00:00:00 | 101.6863 | 8.4130 | 0.1110 | 00:00:00
Regression, 7 PC, 19 vars | 96.5330 | 8.0864 | 0.1506 | 00:00:00 | 98.2743 | 8.1093 | 0.1460 | 00:00:00
Regression, 3 PC + 3 vars | 102.5177 | 8.4392 | 0.1166 | 00:00:00 | 101.3507 | 8.4137 | 0.1121 | 00:00:00
SVR, 3 PC, 16 vars | 107.8608 | 8.0577 | 0.0651 | 00:00:01 | 107.1600 | 8.0571 | 0.0633 | 00:00:10
SVR, 3 PC, 19 vars | 101.9601 | 7.9311 | 0.1208 | 00:00:01 | 100.9937 | 7.8506 | 0.1164 | 00:00:10
SVR, 4 PC, 19 vars | 87.1336 | 7.2594 | 0.2186 | 00:00:01 | 87.8130 | 7.2679 | 0.2318 | 00:00:16
SVR, 7 PC, 19 vars | 75.6056 | 6.2244 | 0.3210 | 00:00:01 | 71.9407 | 6.1179 | 0.3742 | 00:00:27
SVR, 3 PC + 3 vars | 76.8913 | 6.4452 | 0.3168 | 00:00:01 | 76.0686 | 6.3076 | 0.3322 | 00:00:13
1 layer NN-5, 3 PC, 16 vars | 106.1349 | 8.3239 | 0.0854 | 00:00:19 | 106.5305 | 8.2575 | 0.0583 | 00:02:32
1 layer NN-20, 3 PC, 16 vars | 102.7236 | 8.0496 | 0.0821 | 00:00:20 | 104.8482 | 8.1417 | 0.0663 | 00:02:38
1 layer NN-20, 3 PC + 3 vars | 60.3045 | 5.7811 | 0.4770 | 00:00:20 | 65.9381 | 5.9971 | 0.4097 | 00:02:42
2 layers NN-20-5, 3 PC, 16 vars | 104.8615 | 8.2587 | 0.0676 | 00:00:17 | 104.4934 | 8.1184 | 0.0701 | 00:02:53
2 layers NN-20-5, 3 PC + 3 vars | 39.0611 | 4.6024 | 0.6489 | 00:00:18 | 35.9732 | 4.3052 | 0.6912 | 00:02:48
1 layer NN-5, 3 PC, 19 vars | 103.8630 | 8.2057 | 0.1007 | 00:00:17 | 105.0233 | 8.0878 | 0.0620 | 00:02:39
1 layer NN-20, 3 PC, 19 vars | 101.3670 | 8.0856 | 0.1292 | 00:00:18 | 99.7643 | 7.9013 | 0.1253 | 00:02:43
2 layers NN-20-5, 3 PC, 19 vars | 97.5348 | 7.8192 | 0.1391 | 00:00:18 | 97.6352 | 7.7538 | 0.1283 | 00:02:49
1 layer NN-20, 4 PC, 19 vars | 87.7884 | 7.5345 | 0.2524 | 00:00:17 | 85.0505 | 7.4943 | 0.2438 | 00:02:55
2 layers NN-20-5, 4 PC, 19 vars | 83.9455 | 7.1652 | 0.2482 | 00:00:21 | 83.6697 | 7.3066 | 0.2672 | 00:03:13
1 layer NN-20, 7 PC, 19 vars | 52.1855 | 5.4298 | 0.5417 | 00:00:29 | 47.1196 | 5.1493 | 0.5906 | 00:17:19
2 layers NN-20-5, 7 PC, 19 vars | 33.6969 | 4.1099 | 0.6977 | 00:00:33 | 32.1828 | 3.9414 | 0.7205 | 00:04:43
2 layers NN-20-5, 3 PC, 19 vars, 1000 epochs | 100.6539 | 7.7817 | 0.1222 | 00:01:59 | 98.2104 | 7.7060 | 0.1183 | 00:14:55



The table indicates that the PCA transformation produces rather mediocre results in predicting UPDRS scores when only 3 principal components are used, for both the 19 total and the 16 voice variables submitted to the transformation. No improvements are observed with increasing model complexity.

As previously stated, the number of components to be selected out of 19 transformed variables corresponds to at least 4 components or, according to an alternative selection approach, 7 principal components. A significant improvement is noted in the performance of 7 principal components and 3 principal components together with 3 original variables – age, gender and time. These two scenarios have yielded the best result for all metrics. The performance is noticeably improved with increasing the complexity of models. That is, the two-layer NN-20-5 model showed better results than one-layer NN-20 which, in turn, is better than SVR and regression.

These findings suggest that the efficiency of PCA transformation is only manifested when using a higher number of components. Three principal components are not enough to achieve satisfactory results.

The visualization in Figure 5.2 shows how different numbers of principal components affect the training process. The loss curve for 3 principal components and 3 variables (Figure 5.2 (c)) is much smoother, sloping downward gradually. The chart in Figure 5.2 (d) is evidence of overfitting – there is a divergence between the loss curves of the training and test data after 200 epochs. Once again, an increased number of training epochs requires thorough testing, although it sometimes produces good results.




Figure 5.2 Training of 20-5-NN models fed with different numbers of principal components: a) 3 principal components; b) 7 principal components; c) 3 principal components and 3 original attributes; d) 3 principal components, 1000 epochs

5.3 Autoencoder outcome

Table 5.3 presents the results obtained with data compressed by autoencoders.

Table 5.3 Model performance and computation time using patient data represented by features obtained with autoencoders of different architectures

Validation method: 70/30 | 10-fold cross validation
Model | MSE | MAE | R² | Time | MSE | MAE | R² | Time
Regression, AE 16-3-16 | 109.5664 | 8.4690 | 0.0461 | 00:00:22 | 107.5720 | 8.3983 | 0.0404 | 00:00:24
Regression, AE 19-3-19 | 109.3513 | 8.5071 | 0.0315 | 00:00:23 | 108.6341 | 8.4295 | 0.0346 | 00:00:23
Regression, AE 19-4-19 | 102.3643 | 8.2436 | 0.0980 | 00:00:25 | 103.3360 | 8.3292 | 0.0846 | 00:00:24
Regression, AE 19-7-19 | 95.7230 | 8.0360 | 0.1593 | 00:00:29 | 98.1209 | 8.0981 | 0.1214 | 00:00:36
Regression, AE 16-3-16 + 3 vars | 98.8580 | 8.2080 | 0.1361 | 00:00:19 | 98.9415 | 8.1918 | 0.1192 | 00:00:24
SVR, AE 16-3-16 | 111.6937 | 8.3531 | 0.0261 | 00:00:25 | 107.7238 | 8.4207 | 0.0356 | 00:00:23
SVR, AE 19-3-19 | 110.6844 | 8.3530 | 0.0320 | 00:00:27 | 110.7880 | 8.4620 | 0.0126 | 00:00:24
SVR, AE 19-4-19 | 105.4576 | 8.1830 | 0.0550 | 00:00:29 | 104.8148 | 8.1011 | 0.0905 | 00:00:33
SVR, AE 19-7-19 | 100.2220 | 7.9961 | 0.1102 | 00:00:35 | 99.6690 | 8.0038 | 0.1240 | 00:00:47
SVR, AE 16-3-16 + 3 vars | 100.2328 | 7.9447 | 0.1043 | 00:00:25 | 99.4853 | 8.1903 | 0.1172 | 00:00:23
1 layer NN-5, AE 16-3-16 | 106.2073 | 8.3724 | 0.0650 | 00:00:37 | 102.2896 | 8.2169 | 0.0972 | 00:03:20
1 layer NN-20, AE 16-3-16 | 107.0087 | 8.3317 | 0.0733 | 00:00:39 | 104.1707 | 8.1873 | 0.0881 | 00:03:31
1 layer NN-20, AE 16-3-16 + 3 vars | 80.6003 | 7.2183 | 0.2971 | 00:00:39 | 72.9616 | 6.6904 | 0.3620 | 00:03:45
2 layers NN-20-5, AE 16-3-16 | 106.4400 | 8.4429 | 0.0517 | 00:00:40 | 102.7107 | 8.2197 | 0.1029 | 00:03:53
2 layers NN-20-5, AE 16-3-16 + 3 vars | 67.4238 | 6.1677 | 0.4244 | 00:00:30 | 61.9327 | 6.0881 | 0.4686 | 00:03:47
1 layer NN-5, AE 19-3-19 | 109.0236 | 8.3981 | 0.0605 | 00:00:37 | 107.9416 | 8.3807 | 0.0544 | 00:04:18
1 layer NN-20, AE 19-3-19 | 105.8200 | 8.3031 | 0.0793 | 00:00:40 | 108.7877 | 8.5047 | 0.0599 | 00:04:25
2 layers NN-20-5, AE 19-3-19 | 108.7698 | 8.4841 | 0.0499 | 00:00:42 | 106.5099 | 8.3938 | 0.0567 | 00:04:31
1 layer NN-20, AE 19-4-19 | 93.2880 | 7.8086 | 0.1661 | 00:00:37 | 91.4001 | 7.6261 | 0.2175 | 00:05:37
2 layers NN-20-5, AE 19-4-19 | 92.3169 | 7.6355 | 0.2056 | 00:00:42 | 89.6171 | 7.5230 | 0.2219 | 00:05:54
1 layer NN-20, AE 19-7-19 | 71.9801 | 6.6938 | 0.3690 | 00:00:47 | 71.9425 | 6.6393 | 0.3730 | 00:06:00
2 layers NN-20-5, AE 19-7-19 | 67.9972 | 6.3231 | 0.4150 | 00:00:48 | 60.5495 | 6.0022 | 0.4747 | 00:06:05
2 layers NN-20-5, AE 19-3-19, 1000 epochs | 104.0211 | 8.2995 | 0.0966 | 00:01:34 | 104.2340 | 8.2072 | 0.0877 | 00:23:08

From the table, the three features obtained by training the autoencoder and introduced to the models have not demonstrated an outstanding performance in predicting UPDRS scores. The results are very similar to those obtained by PCA. For the three-dimensional code, the number of input variables submitted for feature extraction does not affect the results. Better performance is observed when the models are fed with the three features together with the three original variables, or with more features. That is, the 2-layer NN-20-5 model fed with 7 features encoded by the autoencoder has yielded the best results. Nevertheless, they are not as good as those obtained with 7 principal components. Looking at the metrics, PCA markedly outperformed the autoencoder: MSE – 32.1828 (PCA) against 60.5495 (AE); MAE – 3.9414 (PCA) against 6.0022 (AE); R-squared – 0.7205 (PCA) against 0.4747 (AE). In addition, the average computation time for the models fed with the product of training the autoencoder is 1.7 times higher than that needed to handle the principal components.

Figure 5.3 provides some graphical visualizations of training models fed with different inputs. The visualization in Figure 5.3 (a) supports the results from Table 5.3 – training of the model fed with the 3-dimensional input shows a lack of convergence; the loss curve remains almost flat, meaning that the weight adjustments do not lead to a loss reduction. The loss curves in graphs 5.3 (b) and 5.3 (c) indicate that the model copes with the data more efficiently, gradually reducing the loss. After about 100 iterations, the loss curve in Figure 5.3 (c) reveals more frequent spikes. This can be a sign of a learning rate that is too high from this point on and calls for adjustments in further iterations, for example with call-back functions applied at this stage of the training procedure. The batch size – the number of observations used for each weight update – also seriously affects the smoothness of the loss curve. In this study, it was heuristically found that a batch size of 64 is optimal, both in terms of results and computation time. Figure 5.3 (d) illustrates the training process with a batch size equal to 1, revealing frequent spikes and a lack of convergence. The computation time in this scenario increased by more than 4 times.




Figure 5.3 Training of the 20-5-NN model fed with different numbers of features obtained by training the autoencoder: a) 3 features; b) 7 features; c) 3 features and 3 original attributes; d) 3 features and 3 original variables, training batch size = 1

5.4 Stacked Autoencoder outcome

Table 5.4 summarizes the results from pre-trained stacked autoencoders.

Table 5.4 Model performance and computation time using patient data represented by features obtained with stacked autoencoders

Validation method: 70/30 | 10-fold cross validation
Architecture | MSE | MAE | R² | Time | MSE | MAE | R² | Time
Stacked AE 19-16-13-10-7, Regression | 100.9402 | 8.2132 | 0.1077 | 00:01:24 | 98.2915 | 8.1990 | 0.1334 | 00:20:49
Stacked AE 19-16-13-10-7, SVR | 112.6470 | 8.4659 | 0.0066 | 00:01:29 | 114.6821 | 8.5216 | 0.0100 | 00:34:25
Stacked AE 19-10-3, NN-20 | 76.2842 | 6.8817 | 0.3278 | 00:02:21 | 71.2500 | 6.5631 | 0.3794 | 00:38:59
Stacked AE 19-15-10-5-3, NN-20 | 71.6864 | 6.6122 | 0.3752 | 00:03:10 | 70.1393 | 6.5387 | 0.3955 | 00:43:50
Stacked AE 19-17-15-13-11-9-7-5-3, NN-20 | 72.7667 | 6.6586 | 0.3508 | 00:09:53 | 71.7385 | 6.5253 | 0.3823 | 02:09:56
Stacked AE 19-16-13-10-7-4, NN-5 | 71.8061 | 6.5574 | 0.3832 | 00:02:23 | 67.9075 | 6.3579 | 0.4200 | 00:51:43
Stacked AE 19-16-13-10-7, NN-20 | 63.7522 | 6.2119 | 0.4474 | 00:02:47 | 62.3679 | 6.1090 | 0.4568 | 00:47:43
Stacked AE 19-15-11-7, NN-20-5 | 62.5761 | 5.9115 | 0.4591 | 00:02:31 | 51.2863 | 5.5175 | 0.5606 | 00:45:21
Stacked AE 19-Drp-30-Drp-20-Drp-10-Drp-3, NN-20 | 73.5631 | 6.5984 | 0.3624 | 00:03:07 | 70.8404 | 6.5470 | 0.3776 | 03:18:01
Stacked AE 19-Drp-30-Drp-20-Drp-7, NN-20 | 66.9256 | 6.4302 | 0.4131 | 00:03:00 | 64.1785 | 6.2311 | 0.4380 | 00:33:54
Stacked AE 19-Drp-30-Drp-20-Drp-7, NN-20-5 | 64.1745 | 6.0761 | 0.4348 | 00:03:25 | 53.9133 | 5.6011 | 0.5251 | 00:35:42
Stacked AE 19-15-11-7, NN-20-5, 1000 epochs | 41.3529 | 4.5276 | 0.6274 | 00:05:05 | 37.9246 | 4.2279 | 0.6764 | 01:14:00
* Drp - dropout

The table shows that the input encoded by the stacked autoencoder 19-15-11-7 and supplied to the NN-20-5 model produced the best results among the models trained for 200 epochs (MSE = 51.2863, MAE = 5.5175, R-squared = 0.5606). Training one layer at a time has managed to preserve more information than a single compression from 19 to 7 dimensions. That is, from Table 5.3, the vanilla autoencoder 19-7-19 was only able to reach MSE = 60.5495 (18.06% ↓), MAE = 6.0022 (8.78% ↓) and R-squared = 0.4747 (8.59% ↓), where the symbol "↓" denotes deterioration rather than decrease – on average 11.81% worse than the stacked autoencoder. The deepest architecture, with eight hidden layers, has yielded a slightly better outcome than the architecture with the fewest hidden layers, but it took about four times as long to compute.

The most interesting are the results of the architectures with a code size of three dimensions. PCA and the vanilla autoencoder failed to produce satisfactory outcomes with a three-dimensional input – more features were required to feed the models for better results. In the meantime, the stacked autoencoder showed some progress in compressing the data into a three-dimensional representation. The NN-20-5 model fed with 3 features produced by PCA, the vanilla 19-3-19 autoencoder and the stacked autoencoder 19-15-10-5-3 yielded the following results, respectively: MSE = 97.6352 (39.30% ↓), 106.5099 (51.85% ↓), 70.1393; MAE = 7.7538 (18.58% ↓), 8.3938 (28.37% ↓), 6.5387; R-squared = 0.1283 (26.72% ↓), 0.0567 (33.88% ↓), 0.3955. On average, the stacked autoencoder outperformed PCA by 23.52% and the vanilla autoencoder by 30.04% when the model has a three-dimensional input.

The model with an increased number of training epochs and the seven-dimensional input has shown the best result in terms of metrics. At the same time, the increase in epochs requires additional assessment both in terms of computational time and possible overfitting, although the chart in Figure 5.4 (c) shows no signs of it.

The introduction of a dropout regularizer of 20% in each layer of the stacked autoencoder has not drastically changed the outcome. Nonetheless, the training process of the regularized autoencoder illustrated in Figure 5.4 (d) reveals a smoother loss curve.




Figure 5.4 Training stacked autoencoders of different architectures: a) stacked autoencoder 19-15-10-5-3; b) stacked autoencoder 19-15-11-7; c) stacked autoencoder 19-15-11-7, 1000 epochs; d) 19-30-20-10-3 stacked autoencoder with dropout regularizer

5.5 Comparative analysis of results from different inputs

In this sub-section, the results are viewed from the perspective of the best performance in each input category. Table 5.5 shows the performance of the best models fed with raw data and with data transformed by the feature extraction algorithms. It is worth noting that in all scenarios listed in Table 5.5, the NN-20-5 model was used as the estimator for predicting UPDRS scores, and the 19 original variables were processed by the feature extraction algorithms. The results are based on the 10-fold cross-validation method.

Table 5.5 Total UPDRS scores prediction results using patient data represented by 19 original descriptors and pre-processed by principal component analysis, vanilla and stacked autoencoders

Input layer of NN 20-5 model | MSE | MAE | R² | Time
19-dimensional input
Raw data, 19 attributes | 27.4460 | 3.6707 | 0.7542 | 00:06:45
Raw data, 19 attributes, model trained for 1000 epochs | 15.4043 | 2.6725 | 0.8681 | 00:54:41
7-dimensional input
PCA, 7 features | 32.1828 | 3.9414 | 0.7205 | 00:04:43
Autoencoder 19-7-19 | 60.5495 | 6.0022 | 0.4747 | 00:06:05
Stacked AE 19-15-11-7 | 51.2863 | 5.5175 | 0.5606 | 00:45:21
Stacked AE 19-15-11-7, model trained for 1000 epochs | 37.9246 | 4.2279 | 0.6764 | 01:14:00
3-dimensional input
PCA, 3 features | 97.6352 | 7.7538 | 0.1283 | 00:02:49
Autoencoder 19-3-19 | 106.5099 | 8.3938 | 0.0567 | 00:04:31
Stacked AE 19-15-10-5-3 | 70.1393 | 6.5387 | 0.3955 | 00:43:50
6-dimensional input (transformed and original variables)
PCA, 3 PC + 3 vars | 35.9732 | 4.3052 | 0.6912 | 00:02:48
AE 16-3-16 + 3 vars | 61.9327 | 6.0881 | 0.4686 | 00:03:47

The original attributes yielded better performance metrics than the transformed attributes. Nevertheless, when comparing the performance of the compression algorithms, the deep stacked autoencoder showed the best output for the three-dimensional input – the mean absolute error was reduced from 7.75 (PCA) and 8.39 (vanilla AE) to 6.5 (stacked AE), i.e. by 15.67% and 22.10%, respectively; the mean squared error from 97.63 (PCA) and 106.51 (vanilla AE) to 70.14 (stacked AE), i.e. by 28.16% and 34.15%, respectively; and the coefficient of determination improved from 12.83% (PCA) and 5.67% (vanilla autoencoder) to 39.55% (stacked autoencoder), i.e. by 26.72 and 33.88 percentage points, respectively.

In the scenario where Parkinson's disease patients were represented by seven transformed attributes, the performance of the algorithms in predicting UPDRS scores was improved, especially for PCA and the vanilla AE. That is, 7 principal components provided results 58.48% better than 3 components; the 7-dimensional representation encoded by the vanilla autoencoder was 37.81% better than the 3-dimensional one; and the stacked AE 19-15-11-7 was 19.67% better than the stacked AE 19-15-10-5-3. The model fed with the output of the stacked AE 19-15-11-7 and trained for 1000 epochs showed significantly improved results, outperforming the 3-dimensional stacked AE by 36.45%.

Slightly improved results were also achieved when models were fed with 3 original variables and 3 transformed features extracted out of vocal variables, thereby reducing the total number of input variables from 19 to 6.


Chapter 6 Conclusions and future work

In this thesis, the effectiveness of an autoencoder as a feature extraction algorithm has been examined in predicting scores according to the unified Parkinson’s disease rating scale. By forcing an autoencoder to learn a compressed representation of patients it has been trained to find patterns in vocal and other patients' data. The technique has been compared with other feature reduction methods by feeding generated features to different models in a supervised manner to test their estimating accuracy and track how different scenarios affected the performance. Some experiments on parametrization for the training optimization have been done, and outcomes of algorithms with different hyperparameters recorded.

It can be concluded that among all the compression methods, the deep stacked autoencoder trained in a greedy manner showed the best result for the three-dimensional representation. By using the stacked autoencoder instead of reducing the dimensionality in one step, richer features were obtained and more information was preserved. Even though inputs of a larger dimension have shown better performance and the 19 original attributes outperformed all feature reduction algorithms, the potential and utility of the deep stacked autoencoder would likely become even more evident when dealing with a larger number of original input attributes fed to the model. In addition, the visualization of three-dimensional data transformed by the feature learning algorithms provides an excellent picture of what happens when data is subjected to compression and of how efficient the separation of observations by the target variable is.

As a result of the experiments conducted with inputs of different dimensions, interesting insights have been drawn. Patient vocal data alone is not enough for an accurate estimation of UPDRS scores. Voice attributes together with age, gender and testing time provide a more accurate estimate of UPDRS, both for the original and for the transformed input features. When it comes to deciding on the code size of an autoencoder, the rules commonly used in PCA can also be applied to autoencoders.

For the parametrization of the neural networks, different hyperparameter combinations as well as different neural network architectures have been tested. From training the autoencoders and the NN models, it became evident that early stopping is sometimes required to avoid overfitting even while the loss function still has large derivatives. This condition makes the training process different from optimization in general, where convergence is declared when the gradient becomes very small. The introduction of a dropout regularizer in each of the autoencoder layers was aimed at increasing the generalization performance of the models. It was also noted that different batch sizes remarkably affect the training process. A batch size of 64 observations was considered optimal in terms of results and computational time. There is still room for experiments with other hyperparameters and optimization techniques in future work.

There is a need for a specialist from the medical side to assess the results and guide the research towards possible improvements in terms of model effectiveness for specific tasks and clinical application. For instance, some studies (Martínez-Martín et al., 2015) have provided a disease severity scale by determining cut-off points between mild, moderate and severe stages of Parkinson's disease based on movement disorder UPDRS scores, thereby discretizing the continuous target variable. The regression problem can thus be transformed into a classification problem, with the subsequent application of classification machine learning algorithms, to discover whether it is possible to benefit from this discretization in predicting the level of disease severity when patients are represented by features obtained with the deep autoencoder. In general, learning higher-level descriptors for solving medical problems may be beneficial for representing patients whose clinical status is atypical for a specific medical institution but who can be represented using features derived from other hospitals' data, where their conditions might be common.


Bibliography

Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). A learning algorithm for Boltzmann machines. Cognitive Science, 169(1), 147–169. https://doi.org/10.1016/S0364-0213(85)80012-4
Alain, G., & Bengio, Y. (2014). What Regularized Auto-Encoders Learn from the Data Generating Distribution. The Journal of Machine Learning Research, 15(1), 3563–3593. Retrieved from https://arxiv.org/abs/1211.4246
Alin, A. (2010). Multicollinearity. Wiley Interdisciplinary Reviews: Computational Statistics, 2(3), 370–374. https://doi.org/10.1002/wics.84
Asgari, M., & Shafran, I. (2010). Predicting severity of Parkinson’s disease from speech. In Proceedings of the 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 5201–5204). Buenos Aires, Argentina: IEEE. https://doi.org/10.1109/IEMBS.2010.5626104
Badrinarayanan, V., Kendall, A., & Cipolla, R. (2017). SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12), 2481–2495. https://doi.org/10.1109/TPAMI.2016.2644615
Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural Machine Translation by Jointly Learning to Align and Translate. Retrieved from http://arxiv.org/abs/1409.0473
Baldi, P. (2012). Autoencoders, Unsupervised Learning, and Deep Architectures. In UTLW’11 Proceedings of the 2011 International Conference on Unsupervised and Transfer Learning Workshop (Vol. 27, pp. 37–50). Retrieved from http://proceedings.mlr.press/v27/baldi12a/baldi12a.pdf
Baldi, P., & Hornik, K. (1989). Neural Networks and Principal Component Analysis: Learning from Examples Without Local Minima. Neural Networks, 2(1), 53–58. https://doi.org/10.1016/0893-6080(89)90014-2
Beam, A. L. (2017). Deep Learning 101 - Part 1: History and Background. Retrieved from http://beamandrew.github.io/deeplearning/2017/02/23/deep_learning_101_part1.html
Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1–127. https://doi.org/10.1561/2200000006
Bengio, Y. (2012). Practical recommendations for gradient-based training of deep architectures. In Neural Networks: Tricks of the Trade (2nd ed., pp. 437–478). Springer Berlin Heidelberg. Retrieved from https://arxiv.org/abs/1206.5533
Bengio, Y., Ducharme, R., Vincent, P., & Janvin, C. (2003). A Neural Probabilistic Language Model. The Journal of Machine Learning Research, 3, 1137–1155. Retrieved from http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf
Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2007). Greedy Layer-Wise Training of Deep Networks. In NIPS’06 Proceedings of the 19th International Conference on Neural Information Processing Systems (pp. 153–160). Vancouver, Canada. Retrieved from https://papers.nips.cc/paper/3048-greedy-layer-wise-training-of-deep-networks

Bibliography

Deep Networks. In NIPS’06 Proceedings of the 19th International Conference on Neural Information Processing Systems (pp. 153–160). Vancouver, Canada. Retrieved from https://papers.nips.cc/paper/3048-greedy-layer-wise-training-of-deep-networks Bergstra, J., Bardenet, R., Bengio, Y., & Kégl, B. (2011). Algorithms for Hyper-Parameter Optimization. In NIPS’11 Proceedings of the 24th International Conference on Neural Information Processing Systems (pp. 2546–2554). Granada, Spain. Retrieved from https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf Bergstra, J., & Bengio, Y. (2012). Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research, 13(1), 281–305. Retrieved from http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf Blum, A. (1992). Neural networks in C++ : An object-oriented framework for building connectionist systems (1st ed.). New York, USA: John Wiley & Sons, Inc. Retrieved from https://books.google.pt/books/about/Neural_Networks_in_C++.html?id=9H0pAQAAMA AJ&redir_esc=y Boger, Z., & Guterman, H. (1997). Knowledge extraction from artificial neural network models. In 1997 IEEE International Conference On Systems, Man, And Cybernetics (Vol. 4, pp. 3030–3035). https://doi.org/10.1109/ICSMC.1997.633051 Bourlard, H., & Kamp, Y. (1988). Auto-Association by Multilayer Perceptrons and Singular Value Decomposition. Biological Cybernetics, 59(4–5), 291–294. https://doi.org/10.1007/BF00332918 Bryant, M. S., Rintala, D. H., Hou, J.-G., & Protas, E. J. (2015). Relationship of Falls and Fear of Falling to Activity Limitations and Physical Inactivity in Parkinson’s Disease. Journal of Aging and Physical Activity, 23(2), 187–193. https://doi.org/10.1123/japa.2013-0244 Caliskan, A., Badem, H., Baştürk, A., & Yüksel, M. E. (2017). Diagnosis of the Parkinson disease by using deep neural network classifier. Istanbul University - Journal of Electrical and Electronics Engineering, 17(2), 3311–3318. Retrieved from http://dergipark.gov.tr/download/article-file/326952 Challa, K. N. R., Pagolu, V. S., Panda, G., & Majhi, B. (2016). An improved approach for prediction of Parkinson’s disease using machine learning techniques. In Proceedings of the 2016 International conference on Signal Processing, Communication, Power and Embedded System (SCOPES) (pp. 1446–1451). https://doi.org/10.1109/SCOPES.2016.7955679 Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. In EMNLP’2014 Proceedings of the 19th Conference on Empirical Methods in Natural Language Processing (pp. 1724–1734). Doha, Qatar. https://doi.org/10.3115/v1/D14-1179 Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2015). Gated Feedback Recurrent Neural Networks. In ICML’15 Proceedings of the 32nd International Conference on International Conference on Machine Learning (p. Vol 37 2067-2075). Lille, France. Retrieved from https://arxiv.org/abs/1502.02367 Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297. 60

Bibliography

https://doi.org/10.1023/A:1022627411411 Deng, L., Way, O. M., Yu, D., & Way, O. M. (2014). Deep Learning : Methods and Applications. Foundations and Trends in Signal Processing, 7(3–4), 197–387. https://doi.org/10.1561/2000000039 Doersch, C. (2016). Tutorial on Variational Autoencoders. Pittsburgh, PA, USA: Carnegie Mellon University. Retrieved from https://arxiv.org/abs/1606.05908 Du, B., Xiong, W., Wu, J., Zhang, L., Zhang, L., & Tao, D. (2017). Stacked Convolutional Denoising Auto-Encoders for Feature Representation. Proceedings of the IEEE Transactions on Cybernetics, 47(4), 1017–1027. https://doi.org/10.1109/TCYB.2016.2536638 Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179–211. https://doi.org/10.1016/0364-0213(90)90002-E Erhan, D., Courville, A., & Vincent, P. (2010). Why Does Unsupervised Pre-training Help Deep Learning? Journal of Machine Learning Research, 11(Feb), 625–660. https://doi.org/10.1145/1756006.1756025 Eskidere, Ö., Ertaş, F., & Hanilçi, C. (2012). A comparison of regression methods for remote tracking of Parkinson’s disease progression. Expert Systems with Applications, 39(5), 5523–5528. https://doi.org/10.1016/j.eswa.2011.11.067 Fukushima, K. (1980). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), 193– 202. https://doi.org/10.1007/BF00344251 Gallinari, P., Lecun, Y., Thiria, S., & Fogelman Soulie, F. (1987). Distributed associative memories: A comparison. In Proceedings of COGNITIVA 87. Paris, La Villette, France. Retrieved from http://yann.lecun.com/exdb/publis/pdf/gallinari-87.pdf Gibrat, C., Saint-Pierre, M., Bousquet, M., Lévesque, D., Rouillard, C., & Cicchetti, F. (2009). Differences between subacute and chronic MPTP mice models: Investigation of dopaminergic neuronal degeneration and α-synuclein inclusions. Journal of Neurochemistry, 109(5), 1469–1482. https://doi.org/10.1111/j.1471-4159.2009.06072.x Gibson, A., & Patterson, J. (2017). Deep Learning. A Practitioner’s Approach (1st ed.). Sebastopol,CA, USA: O’Reilly Media. Retrieved from https://books.google.pt/books/about/Deep_Learning.html?id=rLcuDwAAQBAJ&redir_es c=y Glorot, X., Bengio, Y., & Bordes, A. (2011). Deep Sparse Rectifier Neural Networks. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (Vol. 15, pp. 315–323). Retrieved from http://proceedings.mlr.press/v15/glorot11a/glorot11a.pdf Goetz, C. G., Poewe, W., Rascol, O., Sampaio, C., & Stebbins, G. T. (2003). The Unified Parkinson’s Disease Rating Scale (UPDRS): Status and recommendations. Movement Disorders, 18(7), 738–750. https://doi.org/10.1002/mds.10473 Gonzalez, R. C., Woods, R. E., Scott, A. H., & Hall, P. P. (2007). (3rd ed.). Upper Saddle River, NJ: Pearson. Retrieved from 61

Bibliography

http://web.ipac.caltech.edu/staff/fmasci/home/astro_refs/Digital_Image_Processing_3r dEd_truncated.pdf Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. Cambridge, MA, USA: The MIT Press. Retrieved from https://www.deeplearningbook.org/ Halko, N., Martinsson, P. G., & Tropp, J. A. (2011). Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions. SIAM Review, 53(2), 217–288. https://doi.org/10.1137/090771806 Haykin, S. (2008). Neural Networks and Learning Machines (3rd ed.). Hamilton, Ontario, Canada: Pearson. Retrieved from http://dai.fmph.uniba.sk/courses/NN/haykin.neural- networks.3ed.2009.pdf Heaton, J. (2015). Artificial Intelligence for Humans, Volume 3: Deep Learning and Neural Networks (Edition 1.). Scotts Valley, CA, USA: CreateSpace Independent Publishing Platform. Retrieved from https://books.google.pt/books/about/Artificial_Intelligence_for_Humans.html?id=q9mij gEACAAJ&redir_esc=y Hinton, G. E. (2002). Training Products of Experts by Minimizing Contrastive Divergence. Neural Computation, 14(8), 1771–1800. https://doi.org/10.1162/089976602760128018 Hinton, G. E. (2012a). Boltzmann machine learning. Lecture from the course Neural Networks for Machine Learning. Coursera, University of Toronto. Retrieved from https://www.coursera.org/learn/neural-networks/lecture/iitiK/boltzmann-machine- learning-12-min Hinton, G. E. (2012b). Restricted Boltzmann Machines. Lecture from the course Neural Networks for Machine Learning. Coursera, University of Toronto. Retrieved from https://www.coursera.org/learn/neural-networks/lecture/TIqjI/restricted-boltzmann- machines-11-min Hinton, G. E. (2012c). Shallow autoencoders for pre-training. Lecture from the course Neural Networks for Machine Learning. Coursera, University of Toronto. Retrieved from https://www.coursera.org/learn/neural-networks/lecture/cxXuG/shallow-autoencoders- for-pre-training-7-mins Hinton, G. E., & Osindero, S. (2006). A Fast Learning Algorithm for Deep Belief Nets. Neural Computation, 18(7), 1527–1554. https://doi.org/10.1162/neco.2006.18.7.1527 Hinton, G. E., & Salakhutdinov, R. (2006). Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786), 504–508. https://doi.org/10.1126/science.1127647 Hinton, G. E., Srivastava, N., & Swersky, K. (2012). The softmax output function. Lecture from the course Neural Networks for Machine Learning. Coursera, University of Toronto. Retrieved from https://www.coursera.org/lecture/neural-networks/another-diversion- the-softmax-output-function-7-min-68Koq Hinton, G. E., & Zemel, R. S. (1994). Autoencoders , Minimum Description Length and Helmholtz Free Energy. In NIPS’93 Proceedings of the 6th International Conference on Neural Information Processing Systems (pp. 3–10). Retrieved from https://papers.nips.cc/paper/798-autoencoders-minimum-description-length-and-

62

Bibliography

helmholtz-free-energy.pdf Hochreiter, S., Bengio, Y., Frasconi, P., & Schmidhuber, J. (2001). Gradient flow in recurrent nets: The difficulty of learning long-term dependencies. In Field Guide to Dynamical Recurrent Networks (1st ed., pp. 464–479). New York: Wiley-IEEE Press. https://doi.org/10.1109/9780470544037.ch14 Hochreiter, S., & Urgen Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735 Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences of the United States of America, 79(8), 2554–8. https://doi.org/10.1073/pnas.79.8.2554 Jain, V., & Seung, H. (2008). Natural Image Denoising with Convolutional Networks. In NIPS’08 Proceedings of the 21st International Conference on Neural Information Processing Systems (pp. 769–776). Retrieved from https://papers.nips.cc/paper/3506-natural- image-denoising-with-convolutional-networks.pdf Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis : a review and recent developments. Philosophical Transactions. Series A. Mathematical, Physical, and Engineering Sciences, 374(2065), 1–16. https://doi.org/10.1098/rsta.2015.0202 Jordan, M. I. (1986). Serial order: A parallel distributed processing approach. Technical report. San Diego, CA. Retrieved from http://cseweb.ucsd.edu/~gary/258/jordan-tr.pdf Kaiser, H. F. (1961). A note on Guttman’s lower bound for the number of common factors. British Journal of Statistical Psychology, (14), 1–2. https://doi.org/10.1111/j.2044- 8317.1961.tb00061.x Kalchbrenner, N., & Blunsom, P. (2013). Recurrent Continuous Translation Models. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (pp. 1700–1709). Seattle, WA. Retrieved from https://www.aclweb.org/anthology/D13-1176 Karpathy, A. (2015). Convolutional Neural Networks for Visual Recognition. Lecture notes CS231n. Stanford University. Stanford, CA. Retrieved from http://cs231n.github.io/neural-networks-1/ Kingma, D. P., & Welling, M. (2013). Auto-Encoding Variational Bayes. Amsterdam, The Netherlands: University of Amsterdam. Retrieved from https://arxiv.org/abs/1312.6114 Krizhevsky, A., Hinton, G. E., & Sutskever, I. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In NIPS’12 Proceedings of the 25th International Conference on Neural Information Processing Systems (Vol. 1, pp. 1097–1105). Retrieved from https://papers.nips.cc/paper/4824-imagenet-classification-with-deep- convolutional-neural-networks.pdf Larochelle, H. (2013). Neural networks class [6.2] : Autoencoder - loss function. Hugo Larochelle. Retrieved from https://www.youtube.com/watch?v=xTU79Zs4XKY Lecun, Y. (1987). Modeles connexionnistes de l’apprentissage (connectionist learning models). PhD Dissertation, Universite P. et M. Curie.

63

Bibliography

LeCun, Y., Boser, B. E., Denker, J., Henderson, D., Howard, R. E., Hubbard, W. E., & Jackel, L. (1989). Back-propagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541–551. https://doi.org/10.1162/neco.1989.1.4.541 LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. In Proceedings of the IEEE (Vol. 86, pp. 2278–2324). https://doi.org/10.1109/5.726791 Lee, H., & Ng, A. Y. (2008). Sparse deep belief net model for visual area V2. In NIPS’07 Proceedings of the 20th International Conference on Neural Information Processing Systems (pp. 873--880). Retrieved from https://papers.nips.cc/paper/3313-sparse-deep- belief-net-model-for-visual-area-v2 Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., … Sánchez, C. I. (2017). A Survey on Deep Learning in Medical Image Analysis. Medical Image Analysis, 42, 60–88. https://doi.org/10.1016/j.media.2017.07.005 Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2009). Online dictionary learning for sparse coding. In ICML ’09 Proceedings of the 26th Annual International Conference on Machine Learning (pp. 689–696). Montreal, Quebec. https://doi.org/10.1145/1553374.1553463 Makhzani, A., & Frey, B. (2014). k-Sparse Autoencoders. Toronto, Ontario, Canada: University of Toronto. Retrieved from https://arxiv.org/pdf/1312.5663.pdf Martínez-Martín, P., Rodríguez-Blázquez, C., Mario Alvarez, Arakaki, T., Arillo, V. C., Chaná, P., … Merello, M. (2015). Parkinson’s disease severity levels and MDS-Unified Parkinson’s Disease Rating Scale. Parkinsonism & Related Disorders, 21(1), 50–54. https://doi.org/10.1016/j.parkreldis.2014.10.026 Masci, J., Meier, U., Ciresan, D., & Schmidhuber, J. (2011). Stacked Convolutional Auto- Encoders for Hierarchical Feature Extraction. In ICANN 2011 Proceedings of the 21st International Conference on Artificial Neural Networks (Vol. 6791, pp. 52–59). Espoo, Finland. https://doi.org/10.1007/978-3-642-21735-7_7 McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), 115–133. https://doi.org/10.1007/BF02478259 Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. In NIPS’13 Proceedings of the 26th International Conference on Neural Information Processing Systems (pp. 3111–19). Lake Tahoe, Nevada. Retrieved from https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases- and-their-compositionality.pdf Minsky, M. L., & Papert, S. A. (1969). Perceptrons. Oxford, England: M.I.T. Press. Miotto, R., Li, L., Kidd, B. A., & Dudley, J. T. (2016). Deep Patient : An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records. Scientific Reports, 6(26094), 1–10. https://doi.org/10.1038/srep26094 Nair, V., & Hinton, G. E. (2010). Rectified Linear Units Improve Restricted Boltzmann Machines. In ICML’10 Proceedings of the 27th International Conference on International

64

Bibliography

Conference on Machine Learning (pp. 807–814). Haifa, Israel. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.165.6419&rep=rep1&type=p df Ng, A. (2011). Sparse autoencoder. Lecture notes CS294A. Stanford University. Stanford, CA. Retrieved from https://web.stanford.edu/class/cs294a/sparseAutoencoder.pdf Ng, A., Katanforoosh, K., & Mourri, Y. B. (2017a). Padding. Lecture from the course Convolutional Neural Networks. Coursera, deeplearning.ai. Retrieved from https://es.coursera.org/learn/convolutional-neural-networks/lecture/o7CWi/padding Ng, A., Katanforoosh, K., & Mourri, Y. B. (2017b). Why do you need non-linear activation functions? Lecture from the course Neural Networks and Deep Learning. Coursera, deeplearning.ai. Retrieved from https://www.coursera.org/lecture/neural-networks- deep-learning/why-do-you-need-non-linear-activation-functions-OASKH Nilashi, M., Ibrahim, O., & Ahani, A. (2016). Accuracy Improvement for Predicting Parkinson’s Disease Progression. Scientific Reports, 6(34181), 1–18. https://doi.org/10.1038/srep34181 Olah, C. (2015). Understanding LSTM Networks. Retrieved October 3, 2018, from https://colah.github.io/posts/2015-08-Understanding-LSTMs/ Olshausen, B. a, & Field, D. J. (1997). Sparse coding with an incomplete basis set: a strategy employed by V1? Vision Research, 37(23), 3311–3325. https://doi.org/10.1016/S0042- 6989(97)00169-7 Parker, D. B. (1985). Parker DB 1985 Learning logic Technical Report TR-47. Cambridge, MA. Piccinini, G. (2004). The first computational theory of mind and brain: a close look at Mcculloch and Pitts’s “Logical calculus of ideas immanent in nervous activity.” Synthese, 141(2), 175–215. https://doi.org/10.1023/B:SYNT.0000043018.52445.3e Poole, B., Sohl-dickstein, J., & Ganguli, S. (2014). Analyzing noise in autoencoders and deep networks. Stanford University. Stanford, CA. Retrieved from https://arxiv.org/abs/1406.1831 Portilla, J., & Mosconi, F. (2017). Zero to Deep LearningTM with Python and Keras. Udemy, Data Weekends. Retrieved from https://www.udemy.com/zero-to-deep- learning/learn/v4/overview Postuma, R. B., & Montplaisir, J. (2009). Predicting Parkinson’s disease – why, when, and how? Parkinsonism & Related Disorders, 15(3), S105–S109. https://doi.org/10.1016/S1353-8020(09)70793-X Preacher, K. J., & MacCallum, R. C. (2003). Repairing Tom Swift’s Electric Factor Analysis Machine. Understanding Statistics, 2(1), 13–43. https://doi.org/10.1207/S15328031US0201_02 Ranzato, M. A. (2007). Sparse Feature Learning for Deep Belief Networks. In NIPS’07 Proceedings of the 20th International Conference on Neural Information Processing Systems (pp. 1185–1192). Vancouver, British Columbia, Canada. Retrieved from https://papers.nips.cc/paper/3363-sparse-feature-learning-for-deep-belief-networks.pdf

65

Bibliography

Ranzato, M. A., Poultney, C., Chopra, S., Cun, Y. L., Ranzato, M., Poultney, C., … Cun, Y. L. (2006). Efficient Learning of Sparse Representations with an Energy-Based Model. In NIPS’06 Proceedings of the 19th International Conference on Neural Information Processing Systems (pp. 1137–1144). Cambridge, MA: MIT Press. Retrieved from https://papers.nips.cc/paper/3112-efficient-learning-of-sparse-representations-with-an- energy-based-model Raschka, S. (2015). Python Machine Learning Unlock (2nd ed.). Birmingham, UK.: Packt Publishing. Retrieved from http://books.tarsoit.com/Python Machine Learning.pdf Ravi, D., Wong, C., Deligianni, F., Berthelot, M., Andreu-Perez, J., Lo, B., & Yang, G. Z. (2017). Deep Learning for Health Informatics. IEEE Journal of Biomedical and Health Informatics, 21(1), 4–21. https://doi.org/10.1109/JBHI.2016.2636665 Rezende, D. J., Mohamed, S., & Wierstra, D. (2014). Stochastic Backpropagation and Approximate Inference in Deep Generative Models. In ICML’14 Proceedings of the 31st International Conference on International Conference on Machine Learning (pp. 1278– 1286). Beijing, China. Retrieved from http://arxiv.org/abs/1401.4082 Rifai, S., Muller, X., Vincent, P., & Bengio, Y. (2011). Contractive Auto-Encoders : Explicit Invariance During Feature Extraction. In ICML’11 Proceedings of the 28th International Conference on International Conference on Machine Learning (pp. 833–840). Bellevue, Washington, USA. Retrieved from http://www.icml-2011.org/papers/455_icmlpaper.pdf Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 19–27. https://doi.org/10.1037/h0042519 Rouzbahani, H. K., & Daliri, M. R. (2011). Diagnosis of Parkinson’s disease in human using voice signals. Basic and Clinical Neuroscience, 2(3), 12–20. Retrieved from https://pdfs.semanticscholar.org/b4f8/e3a43692b1e0a79e0909eb939b8c55b1f7da.pdf Ruder, S. (2016). An overview of gradient descent optimization algorithms. Dublin, Ireland: Insight Centre for Data Analytics, NUI Galway Aylien Ltd. Retrieved from https://arxiv.org/abs/1609.04747 Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back- propagating errors. Nature, 323, 533–536. https://doi.org/10.1038/323533a0 Salakhutdinov, R. (2015). Learning Deep Generative Models. Annual Review of Statistics and Its Application, 2(1), 361–385. https://doi.org/10.1146/annurev-statistics-010814- 020120 Salakhutdinov, R., & Hinton, G. (2009). Deep Boltzmann Machines. In AISTATS 2009 Proceedings of the Twelth International Conference on Artificial Intelligence and Statistics (Vol. 5, pp. 448–455). Retrieved from http://proceedings.mlr.press/v5/salakhutdinov09a/salakhutdinov09a.pdf Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673–2681. https://doi.org/10.1109/78.650093 Scikit-learn. (2017). Support vector regression — scikit-learn 0.19.1 documentation.

66

Bibliography

Retrieved May 30, 2018, from http://scikit- learn.org/stable/modules/generated/sklearn.svm.SVR.html Seung, H. S. (1998). Learning Continuous Attractors in Recurrent Networks. In NIPS ’97 Proceedings of the 1997 conference on Advances in neural information processing systems (Vol. 10, pp. 654–660). Denver, Colorado, USA. Retrieved from https://papers.nips.cc/paper/1369-learning-continuous-attractors-in-recurrent- networks Smolensky, P. (1986). Information processing in dynamical systems: Foundations of harmony theory. In Parallel distributed processing: explorations in the microstructure of cognition (pp. 194–281). Cambridge, MA, USA: MIT Press. Retrieved from https://www.researchgate.net/publication/239571798_Information_processing_in_dyn amical_systems_Foundations_of_harmony_theory Socher, R., Huang, E., Pennington, J., Ng, A. Y., & Manning, C. D. (2011). Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection. In NIPS’11 Proceedings of the 24th International Conference on Neural Information Processing Systems (pp. 801– 809). Granada, Spain. Retrieved from https://papers.nips.cc/paper/4204-dynamic- pooling-and-unfolding-recursive-autoencoders-for-paraphrase-detection.pdf Sokolov, E., Zimovnov, A., Panin, A., Lobacheva, E., & Kazeev, N. (2017). Gradient descent extensions. Lecture from Introduction to Deep Learning course. Coursera, National Research University Higher School of Economics. Retrieved from https://www.coursera.org/lecture/intro-to-deep-learning/gradient-descent-extensions- lYGBt Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research, 15(1), 1929–1958. Retrieved from http://jmlr.org/papers/volume15/srivastava14a.old/srivastava14a.pdf Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. In NIPS’14 Proceedings of the 27th International Conference on Neural Information Processing Systems (pp. 3104–3112). Montreal, Canada. Retrieved from https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural- networks.pdf Swingler, K. (1996). Applying neural networks : a practical guide. London, UK: Morgan Kaufmann. Tipping, M. E., & Bishop, C. M. (1999). Probabilistic principal components analysis. Journal of the Royal Statistical Society Series B (Statistical Methodology), 61(3), 611–622. https://doi.org/10.1111/1467-9868.00196 Tsanas, A., Little, M. A., McSharry, P. E., & Ramig, L. O. (2010). Accurate Telemonitoring of Parkinson’s Disease Progression by Noninvasive Speech Tests. IEEE Transactions on Biomedical Engineering, 57(4), 884–893. https://doi.org/10.1109/TBME.2009.2036000 Vincent, P. (2011). A Connection Between Score Matching and Denoising Autoencoders. Neural Computation, 23(7), 1661–1674. https://doi.org/10.1162/NECO_a_00142 Vincent, P., & Larochelle, H. (2008). Extracting and Composing Robust Features with 67

Bibliography

Denoising Autoencoders. In ICML ’08 Proceedings of the 25th international conference on Machine learning (pp. 1096--1103). Helsinki, Finland. https://doi.org/10.1145/1390156.1390294 Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., & Manzagol, P.-A. (2010). Stacked Denoising Autoencoders : Learning Useful Representations in a Deep Network with a Local Denoising Criterion. Journal of Machine Learning Research, 11, 3371–3408. Retrieved from https://dl.acm.org/citation.cfm?id=1756006.1953039 Vyas, S., & Kumaranayake, L. (2006). Constructing socio-economic status indices: how to use principal components analysis. Health Policy and Planning, 21(6), 459–468. https://doi.org/10.1093/heapol/czl029 Wang, H., & Raj, B. (2017). On the Origin of Deep Learning. Pittsburgh, PA, USA: Carnegie Mellon University. Retrieved from https://arxiv.org/pdf/1702.07800.pdf Werbos, P. (1974). Beyond regression : New tools for predicting and analysis in the behavioral sciences. PhD Dissertation, Harvard Uiversity. Retrieved from https://www.researchgate.net/publication/279233597_Beyond_Regression_New_Tools _for_Prediction_and_Analysis_in_the_Behavioral_Science_Thesis_Ph_D_Appl_Math_Ha rvard_University Widrow, B., & Hoff, M. (1960). Adaptive switching circuits. IRE WESCON Convention Record, 96–104. Retrieved from http://www.dtic.mil/dtic/tr/fulltext/u2/241531.pdf Widrow, B., & Lehr, M. A. (1990). 30 Years of Adaptive Neural Networks: Perceptron, Madaline, and Backpropagation. Proceedings of the IEEE, 78(9), 1415–1442. https://doi.org/10.1109/5.58323 Xu, S., & Chen, L. (2008). A novel approach for determining the optimal number of hidden layer neurons for FNN’s and its application in . In ICITA 2008 Proceedings of the 5th International Conference on Information Technology and Applications (pp. 683– 686). Cairns, Queensland, Australia. Retrieved from https://eprints.utas.edu.au/6995/ Zafar, M., Zhu, D., Li, X., Yang, K., & Levy, P. (2017). SAFS : A Deep Feature Selection Approach for Precision Medicine. Retrieved from https://arxiv.org/abs/1704.05960

68

Appendix A

Table Appendix A-1 Descriptive statistics of Parkinson’s patient dataset

Variable | Description | Type | Min | Max | Mean | SD
subject# | Integer that uniquely identifies each patient | ID | | | |
age | Patient’s age | int | 36 | 85 | 65 | 8.8215
sex | Patient’s gender | bin | 0 | 1 | |
test_time | Time since recruitment into the trial; the integer part is the number of days since recruitment | int | -4.2600 | 215.49 | 92.86372 | 53.4456
Jitter(%) | Several measures of variation in fundamental frequency | int | 0.000830 | 0.099990 | 0.006154 | 0.005624
Jitter(Abs) | | int | 0.000002 | 0.000446 | 0.000044 | 0.000036
Jitter:RAP | | int | 0.000330 | 0.05754 | 0.002987 | 0.003124
Jitter:PPQ5 | | int | 0.000430 | 0.069560 | 0.003277 | 0.003732
Jitter:DDP | | int | 0.000980 | 0.172630 | 0.008962 | 0.009371
Shimmer | Several measures of variation in amplitude | int | 0.003060 | 0.268630 | 0.034035 | 0.025835
Shimmer(dB) | | int | 0.026000 | 2.107000 | 0.310960 | 0.230254
Shimmer:APQ3 | | int | 0.001610 | 0.162670 | 0.017156 | 0.0132
Shimmer:APQ5 | | int | 0.001940 | 0.167020 | 0.020144 | 0.016664
Shimmer:APQ11 | | int | 0.002490 | 0.275460 | 0.027481 | 0.019986
Shimmer:DDA | | int | 0.004840 | 0.488020 | 0.051467 | 0.039711
NHR | Measures of ratio of noise to tonal components in the voice | int | 0.000286 | 0.748260 | 0.059692 | 0.032120
HNR | | int | 1.659000 | 37.875000 | 21.679495 | 4.291096
RPDE | A nonlinear dynamical complexity measure | int | 0.151020 | 0.966080 | 0.541473 | 0.100986
DFA | Signal fractal scaling exponent | int | 0.514040 | 0.865600 | 0.653240 | 0.070902
PPE | A nonlinear measure of fundamental frequency variation | int | 0.021983 | 0.731730 | 0.219589 | 0.091498
motor_UPDRS | Clinician's motor UPDRS score, linearly interpolated | int | 7.000000 | 54.992000 | 29.018942 | 10.700283
total_UPDRS | Clinician's total UPDRS score, linearly interpolated | int | 5.037700 | 39.511000 | 21.296229 | 8.129282


Appendix B

Table Appendix B-1 Correlation coefficients between variables in Parkinson’s patient dataset

Variables (row and column order): age, sex, test_time, Jitter(%), Jitter(Abs), Jitter:RAP, Jitter:PPQ5, Jitter:DDP, Shimmer, Shimmer(dB), Shimmer:APQ3, Shimmer:APQ5, Shimmer:APQ11, Shimmer:DDA, NHR, HNR, RPDE, DFA, PPE, total_UPDRS
age: 1.000000 -0.041602 0.019884 0.023071 0.035685 0.010255 0.013199 0.010258 0.101554 0.111130 0.098912 0.089983 0.135238 0.098913 0.007093 -0.104842 0.090208 -0.092870 0.120790 0.310290
sex: -0.041602 1.000000 -0.009805 0.051422 -0.154661 0.076718 0.087995 0.076703 0.058736 0.056481 0.044937 0.064819 0.023360 0.044938 0.168170 -0.000167 -0.159262 -0.165113 -0.099901 -0.096559
test_time: 0.019884 -0.009805 1.000000 -0.022837 -0.011349 -0.028888 -0.023290 -0.028876 -0.033870 -0.030962 -0.029020 -0.036504 -0.039110 -0.029017 0.026357 0.036545 -0.038887 0.019261 -0.000563 0.075263
Jitter(%): 0.023071 0.051422 -0.022837 1.000000 0.865574 0.984181 0.968214 0.984184 0.709791 0.716704 0.664149 0.694002 0.645965 0.664147 0.825294 -0.675188 0.427128 0.226550 0.721849 0.074247
Jitter(Abs): 0.035685 -0.154661 -0.011349 0.865574 1.000000 0.844622 0.790534 0.844626 0.649041 0.655866 0.623825 0.621397 0.589992 0.623823 0.699954 -0.706420 0.547097 0.352264 0.787848 0.066926
Jitter:RAP: 0.010255 0.076718 -0.028888 0.984181 0.844622 1.000000 0.947196 1.000000 0.681729 0.685551 0.650226 0.659831 0.603082 0.650225 0.792373 -0.641473 0.382891 0.214881 0.670652 0.064015
Jitter:PPQ5: 0.013199 0.087995 -0.023290 0.968214 0.790534 0.947196 1.000000 0.947203 0.732747 0.734591 0.676711 0.734021 0.668413 0.676710 0.864864 -0.662409 0.381503 0.175359 0.663491 0.063352
Jitter:DDP: 0.010258 0.076703 -0.028876 0.984184 0.844626 1.000000 0.947203 1.000000 0.681734 0.685556 0.650228 0.659833 0.603090 0.650227 0.792377 -0.641482 0.382886 0.214893 0.670660 0.064027
Shimmer: 0.101554 0.058736 -0.033870 0.709791 0.649041 0.681729 0.732747 0.681734 1.000000 0.992334 0.979828 0.984904 0.935457 0.979827 0.795158 -0.801416 0.468235 0.132540 0.615709 0.092141
Shimmer(dB): 0.111130 0.056481 -0.030962 0.716704 0.655866 0.685551 0.734591 0.685556 0.992334 1.000000 0.968015 0.976373 0.936338 0.968014 0.798077 -0.802496 0.472409 0.126111 0.635163 0.098790
Shimmer:APQ3: 0.098912 0.044937 -0.029020 0.664149 0.623825 0.650226 0.676711 0.650228 0.979828 0.968015 1.000000 0.962723 0.885695 1.000000 0.732736 -0.780697 0.436878 0.130735 0.576704 0.079363
Shimmer:APQ5: 0.089983 0.064819 -0.036504 0.694002 0.621397 0.659831 0.734021 0.659833 0.984904 0.976373 0.962723 1.000000 0.938935 0.962723 0.798173 -0.790638 0.450890 0.128038 0.593677 0.083467
Shimmer:APQ11: 0.135238 0.023360 -0.039110 0.645965 0.589992 0.603082 0.668413 0.603090 0.935457 0.936338 0.885695 0.938935 1.000000 0.885694 0.711546 -0.777974 0.480739 0.179648 0.623416 0.120838
Shimmer:DDA: 0.098913 0.044938 -0.029017 0.664147 0.623823 0.650225 0.676710 0.650227 0.979827 0.968014 1.000000 0.962723 0.885694 1.000000 0.732734 -0.780696 0.436872 0.130736 0.576702 0.079363
NHR: 0.007093 0.168170 -0.026357 0.825294 0.699954 0.792373 0.864864 0.792377 0.795158 0.798077 0.732736 0.798173 0.711546 0.732734 1.000000 -0.684412 0.416660 -0.022088 0.564654 0.060952
HNR: -0.104842 -0.000167 0.036545 -0.675188 -0.706420 -0.641473 -0.662409 -0.641482 -0.801416 -0.802496 -0.780697 -0.790638 -0.777974 -0.780696 -0.684412 1.000000 -0.659053 -0.290519 -0.758722 -0.162117
RPDE: 0.090208 -0.159262 -0.038887 0.427128 0.547097 0.382891 0.381503 0.382886 0.468235 0.472409 0.436878 0.450890 0.480739 0.436872 0.416660 -0.659053 1.000000 0.192030 0.566065 0.156897
DFA: -0.092870 -0.165113 0.019261 0.226550 0.352264 0.214881 0.175359 0.214893 0.132540 0.126111 0.130735 0.128038 0.179648 0.130736 -0.022088 -0.290519 0.192030 1.000000 0.394650 -0.113475
PPE: 0.120790 -0.099901 -0.000563 0.721849 0.787848 0.670652 0.663491 0.670660 0.615709 0.635163 0.576704 0.593677 0.623416 0.576702 0.564654 -0.758722 0.566065 0.394650 1.000000 0.156195
total_UPDRS: 0.310290 -0.096559 0.075263 0.074247 0.066926 0.064015 0.063352 0.064027 0.092141 0.098790 0.079363 0.083467 0.120838 0.079363 0.060952 -0.162117 0.156897 -0.113475 0.156195 1.000000

Table Appendix B-2 Top-10 most correlated pairs of variables

1st variable | 2nd variable | Correlation coefficient
Shimmer:APQ3 | Shimmer:DDA | 1.000000
Jitter:RAP | Jitter:DDP | 1.000000
Shimmer | Shimmer(dB) | 0.992334
Shimmer | Shimmer:APQ5 | 0.984904
Jitter(%) | Jitter:DDP | 0.984184
Jitter(%) | Jitter:RAP | 0.984181
Shimmer | Shimmer:APQ3 | 0.979828
Shimmer | Shimmer:DDA | 0.979827
Shimmer(dB) | Shimmer:APQ5 | 0.976373
Jitter(%) | Jitter:PPQ5 | 0.968214
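Tables B-1 and B-2 can be reproduced with a few lines of pandas; the sketch below is a minimal illustration that assumes the telemonitoring data is stored locally as parkinsons_updrs.csv and that the patient identifier column is named subject# (both are assumptions made for illustration).

    # Sketch: compute the correlation matrix (Table B-1) and list the most
    # correlated distinct pairs of variables (Table B-2).
    import pandas as pd

    df = pd.read_csv("parkinsons_updrs.csv")          # assumed file name
    corr = df.drop(columns=["subject#"]).corr()       # full correlation matrix

    pairs = corr.abs().unstack()
    # keep each unordered pair once and drop the diagonal
    pairs = pairs[pairs.index.get_level_values(0) < pairs.index.get_level_values(1)]
    print(pairs.sort_values(ascending=False).head(10))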

Appendix C

Table Appendix C-1 Proportion of the variance explained by principal components obtained via transformation of 16 vocal attributes

PC | Eigenvalue | Difference | Proportion | Cumulative
1 | 11.26296090 | 9.58977485 | 0.7038 | 0.7038
2 | 1.67318605 | 0.43246464 | 0.1046 | 0.8084
3 | 1.24072141 | 0.47572392 | 0.0775 | 0.8859
4 | 0.76499749 | 0.45569438 | 0.0478 | 0.9337
5 | 0.30930312 | 0.08638185 | 0.0193 | 0.9530
6 | 0.22292127 | 0.05012698 | 0.0139 | 0.9670
7 | 0.17279430 | 0.01102358 | 0.0108 | 0.9778
8 | 0.16177072 | 0.05867585 | 0.0101 | 0.9879
9 | 0.10309487 | 0.05956273 | 0.0064 | 0.9943
10 | 0.04353213 | 0.02324121 | 0.0027 | 0.9970
11 | 0.02029092 | 0.00627372 | 0.0013 | 0.9983
12 | 0.01401720 | 0.00534915 | 0.0009 | 0.9992
13 | 0.00866805 | 0.00420307 | 0.0005 | 0.9997
14 | 0.00446498 | 0.00446461 | 0.0003 | 1.0000
15 | 0.00000038 | 0.00000036 | 0.0000 | 1.0000
16 | 0.00000002 | | 0.0000 | 1.0000

Table Appendix C-2 Proportion of the variance explained by principal components obtained via transformation of 19 original attributes

PC | Eigenvalue | Difference | Proportion | Cumulative
1 | 11.27166550 | 9.54593448 | 0.5931 | 0.5931
2 | 1.72573102 | 0.25384181 | 0.0908 | 0.6840
3 | 1.47188921 | 0.45273415 | 0.0775 | 0.7614
4 | 1.01915506 | 0.02835759 | 0.0536 | 0.8150
5 | 0.99079747 | 0.19456970 | 0.0521 | 0.8672
6 | 0.79622777 | 0.08022044 | 0.0419 | 0.9091
7 | 0.71600733 | 0.41912340 | 0.0377 | 0.9468
8 | 0.29688394 | 0.09013182 | 0.0156 | 0.9624
9 | 0.20675212 | 0.03797892 | 0.0109 | 0.9733
10 | 0.16877320 | 0.01650824 | 0.0089 | 0.9821
11 | 0.15226497 | 0.05333870 | 0.0080 | 0.9902
12 | 0.09892627 | 0.05800530 | 0.0052 | 0.9954
13 | 0.04092096 | 0.02069104 | 0.0022 | 0.9975
14 | 0.02022992 | 0.00633885 | 0.0011 | 0.9986
15 | 0.01389107 | 0.00522913 | 0.0007 | 0.9993
16 | 0.00866194 | 0.00420548 | 0.0005 | 0.9998
17 | 0.00445645 | 0.00445608 | 0.0002 | 1.0000
18 | 0.00000038 | 0.00000036 | 0.0000 | 1.0000
19 | 0.00000002 | | 0.0000 | 1.0000
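A sketch of how explained-variance tables of this kind can be computed with scikit-learn is given below. The file name and the rule used to select the 16 vocal attributes are assumptions made for illustration, and the exact eigenvalues depend on the preprocessing applied before the PCA.

    # Sketch: PCA on the standardized vocal attributes and a summary of the
    # eigenvalues and explained-variance proportions.
    import pandas as pd
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    df = pd.read_csv("parkinsons_updrs.csv")          # assumed file name
    vocal_cols = [c for c in df.columns
                  if c.startswith(("Jitter", "Shimmer"))
                  or c in ("NHR", "HNR", "RPDE", "DFA", "PPE")]

    X = StandardScaler().fit_transform(df[vocal_cols])
    pca = PCA().fit(X)

    summary = pd.DataFrame({
        "Eigenvalue": pca.explained_variance_,
        "Proportion": pca.explained_variance_ratio_,
        "Cumulative": pca.explained_variance_ratio_.cumsum(),
    })
    print(summary.round(4))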

Appendix D

Figure Appendix D-1 Scatter plot of total_UPDRS vs motor_UPDRS

Figure Appendix D-2 Scatter plots of the three-dimensional representations obtained by taking the first three principal components computed from different sets of input attributes, colored by motor_UPDRS score: a) PCA based on the 16 vocal attributes; b) PCA based on all 19 attributes
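The sketch below shows how a plot of this kind can be produced with scikit-learn and matplotlib: the standardized attributes are projected onto the first three principal components and colored by the motor_UPDRS score. The file name and the exact set of dropped columns are assumptions made for illustration.

    # Sketch: 3D scatter plot of the first three principal components,
    # colored by motor_UPDRS.
    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    df = pd.read_csv("parkinsons_updrs.csv")                       # assumed file name
    features = df.drop(columns=["subject#", "motor_UPDRS", "total_UPDRS"])

    pcs = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(features))

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    sc = ax.scatter(pcs[:, 0], pcs[:, 1], pcs[:, 2], c=df["motor_UPDRS"], cmap="viridis", s=5)
    fig.colorbar(sc, label="motor_UPDRS")
    ax.set_xlabel("PC1"); ax.set_ylabel("PC2"); ax.set_zlabel("PC3")
    plt.show()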


Appendix E

AE architecture: 16-3-16 | Activation function: sigmoid | Optimizer: SGD | Loss function: MSE | Learning rate: 0.01 | Batch size: 64 | Epochs: 1000

AE architecture: 16-3-16 | Activation function: sigmoid | Optimizer: Adam | Loss function: MSE | Learning rate: 0.001 | Batch size: 64 | Epochs: 500

AE architecture: 19-3-19 | Activation function: sigmoid | Optimizer: Adam | Loss function: MSE | Learning rate: 0.001 | Batch size: 64 | Epochs: 200

AE architecture: 19-7-19 | Activation function: sigmoid | Optimizer: SGD | Loss function: Binary cross-entropy | Learning rate: 0.01 | Batch size: 100 | Epochs: 200

Figure Appendix E-1 Visualization of training autoencoders with different hyperparameters and number of epochs
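As a minimal illustration of how one of the configurations listed above can be trained, the Keras sketch below builds a 16-3-16 autoencoder with sigmoid activations, SGD, the MSE loss, learning rate 0.01, batch size 64 and 1000 epochs. The input matrix is a random stand-in for the scaled vocal attributes, so the resulting loss curve will not match the one shown in the figure.

    # Sketch: 16-3-16 autoencoder trained with the first configuration above.
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    X = np.random.rand(500, 16)                 # stand-in for the min-max-scaled vocal features

    inputs = keras.Input(shape=(16,))
    code = layers.Dense(3, activation="sigmoid", name="code")(inputs)   # bottleneck layer
    outputs = layers.Dense(16, activation="sigmoid")(code)

    autoencoder = keras.Model(inputs, outputs)
    autoencoder.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01), loss="mse")
    history = autoencoder.fit(X, X, batch_size=64, epochs=1000, verbose=0)
    print("Final reconstruction loss:", history.history["loss"][-1])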


Appendix F

AE architecture: 16-3-16 | Activation function: sigmoid | Optimizer: SGD | Loss function: MSE | Learning rate: 0.01 | Batch size: 64 | Epochs: 1000

AE architecture: 16-3-16 | Activation function: sigmoid | Optimizer: Adam | Loss function: MSE | Learning rate: 0.001 | Batch size: 64 | Epochs: 500

AE architecture: 19-3-19 | Activation function: sigmoid | Optimizer: Adam | Loss function: MSE | Learning rate: 0.001 | Batch size: 64 | Epochs: 200

AE architecture: 19-7-19 | Activation function: sigmoid | Optimizer: SGD | Loss function: Binary cross-entropy | Learning rate: 0.01 | Batch size: 100 | Epochs: 200

Figure Appendix F-1 Scatter plots of the three-dimensional codes produced by autoencoders with different architectures and trained with different hyperparameters
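The self-contained sketch below illustrates how such three-dimensional codes can be extracted and plotted: a small 16-3-16 autoencoder is trained briefly on random stand-in data, an encoder sub-model is built up to the bottleneck layer, and its output is shown in a 3D scatter plot. The data, the color variable and the training length are placeholders, not the configurations used in this work.

    # Sketch: extract the 3D bottleneck codes of a toy autoencoder and plot them.
    import numpy as np
    import matplotlib.pyplot as plt
    from tensorflow import keras
    from tensorflow.keras import layers

    X = np.random.rand(500, 16)                       # stand-in for scaled vocal features
    updrs = np.random.uniform(5, 55, size=len(X))     # stand-in for UPDRS scores

    inputs = keras.Input(shape=(16,))
    code = layers.Dense(3, activation="sigmoid", name="code")(inputs)
    outputs = layers.Dense(16, activation="sigmoid")(code)
    autoencoder = keras.Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(X, X, batch_size=64, epochs=50, verbose=0)

    encoder = keras.Model(inputs, code)               # sub-model: input layer to bottleneck
    codes = encoder.predict(X)                        # shape (n_samples, 3)

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    sc = ax.scatter(codes[:, 0], codes[:, 1], codes[:, 2], c=updrs, cmap="viridis", s=5)
    fig.colorbar(sc, label="total_UPDRS (stand-in)")
    ax.set_xlabel("code 1"); ax.set_ylabel("code 2"); ax.set_zlabel("code 3")
    plt.show()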
