Neural Network Tools: An Overview
From work on "Automatic Classification of Alzheimer's Disease from Structural MRI"
Report for Master's Thesis

Eivind Arvesen
Østfold University College
[email protected]

Abstract

Machine learning has exploded in popularity during the last decade, and neural networks in particular have seen a resurgence due to the advent of deep learning. In this paper, we examine modern neural network software libraries and describe the main differences between them, as well as their strengths and weaknesses.

1 Introduction

The field of machine learning is now more alluring than ever, helping researchers and businesses to tackle a variety of challenging problems such as the development of autonomous vehicles, visual recognition systems and computer-aided diagnosis. There is a plethora of different tools and libraries available that aim to make work on machine learning problems easier for users, and while most of the more general alternatives are quite similar in terms of functionality, they nonetheless differ in approach and design goals.

This report describes several popular machine learning tools, attempts to showcase their strengths, and characterizes situations in which they would be well suited for use. The primary focus is on modern tools; most of these have some connection to deep learning, as it is an approach that is currently producing many state-of-the-art results in fields like computer vision.

2 Tools

2.1 Pylearn2

Pylearn2 [4] is a cutting-edge machine learning library for Python developed with research in mind at the LISA-lab¹ at the University of Montreal. It has a particular focus on deep learning. The library aims for easy configurability for expert users (i.e. machine learning researchers), unlike "black-box" libraries, which provide good performance without demanding knowledge of the underlying algorithms from their users. One way to put it would be that the library values ease of experimentation over ease of use.

Though Pylearn2 assumes some technical sophistication on the part of its users, the library also has a focus on reusability, and the resulting modularity makes it possible to combine and adapt several reusable components to form working models, and to learn only about the parts of the library one wants to use. One of the library's goals is to contain reference implementations of all models and algorithms published by the LISA-lab, and it has been used to set the state of the art on several standardized datasets, including a test error of 0.45% on MNIST in 2013 [3], the best performance without data augmentation at that point, using a convolutional maxout network with dropout regularization.

Pylearn2 makes use of a YAML² interface, which allows users to set up and perform experiments rapidly by defining their models in an almost declarative style, using (pre-)defined components as building blocks and specifying hyperparameters. Alternatively, experiments may also be specified through a Python script.
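As a minimal sketch of this style (not taken from the report; the model, layer sizes and hyperparameters are arbitrary illustrations, and the MNIST data is assumed to be available on Pylearn2's local data path), an experiment can be declared as a YAML string and driven from a Python script:

    # Sketch of a small Pylearn2 experiment: an MLP on MNIST, declared
    # as a YAML string and instantiated via pylearn2's YAML parser.
    from pylearn2.config import yaml_parse

    yaml_experiment = """
    !obj:pylearn2.train.Train {
        dataset: !obj:pylearn2.datasets.mnist.MNIST { which_set: 'train' },
        model: !obj:pylearn2.models.mlp.MLP {
            nvis: 784,  # 28x28 input pixels
            layers: [
                !obj:pylearn2.models.mlp.Sigmoid {
                    layer_name: 'h0', dim: 500, irange: .05
                },
                !obj:pylearn2.models.mlp.Softmax {
                    layer_name: 'y', n_classes: 10, irange: .05
                }
            ]
        },
        algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
            learning_rate: .05,
            batch_size: 100,
            termination_criterion: !obj:pylearn2.termination_criteria.EpochCounter {
                max_epochs: 5
            }
        }
    }
    """

    train = yaml_parse.load(yaml_experiment)  # instantiates every component
    train.main_loop()                         # runs the training loop

Each !obj: tag names a reusable library component, so swapping datasets, models or training algorithms amounts to editing the corresponding block of the description.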
The library is built upon the Theano library, a numerical computation library and C/CUDA compiler for Python, which is also developed at the LISA-lab. Theano can also compute gradients automatically; users need only specify the architecture and the loss function. This makes it very easy to experiment quickly.

GPU-based models are also enabled via compilation with Theano, as it can compile both to GPU and CPU. Pylearn2 additionally provides wrappers to widely used and efficient third-party GPU implementations of convolutional nets, such as CUDA-convnet and Caffe.

As the library is under heavy development, its documentation is somewhat sparse, although the source code itself is thoroughly commented. Though its website has a few notebook³-style and short examples, users will likely have to spend some time reading source code to figure out how to implement their own models and extensions, and how to integrate them with the library. Pylearn2 can thereby be somewhat hard to understand, extend, modify and debug for beginners. There is, however, a very active community around the library, including user groups/mailing lists and the official source code repository⁴. Despite its somewhat steep learning curve, Pylearn2 is very powerful.

Pylearn2 is to a large extent created and supported by its users (students at the LISA-lab and other contributors), and since its focus is on research, features are added as they are needed. This means that users will likely have to code up some things themselves, unless they are only trying to replicate published results or run variations on old experiments/datasets.

Pylearn2 comes with ready-to-use standardized benchmark datasets (such as MNIST and CIFAR-10) and supports unsupervised learning out of the box. It appears to be growing steadily in popularity, and is apparently popular amongst contestants in Kaggle contests.

2.2 Torch

Torch [1] is an open source scientific computing framework that supports many different machine learning algorithms. For ease of use and efficiency, it uses a JIT-compiled implementation of the Lua language on top of a C/CUDA implementation. Torch7 (its current version) is currently in use at and being contributed to by DeepMind (Google), Facebook, Twitter and New York University, according to the project's website⁵.

It is somewhat similar and comparable to Theano/Pylearn2 (used in conjunction with Numpy and the rest of the packages in the "standard" Python scientific stack), complete with suitable datatypes that support mathematical operations, statistical distributions and BLAS⁶ operations, as well as support for rapid experimentation via a REPL⁷. It is also composed of reusable parts that can be combined in many variations. Torch has a large ecosystem of community-driven packages, and as with Pylearn2, other packages bring support for things like image processing, including a relatively recently released package⁸ of CUDA extensions for deep learning from Facebook AI Research.

Torch allows neural network models to be run on GPUs with the help of CUDA; users achieve this via simple typecasting.

Like Pylearn2, Torch7 includes scripts to load several popular datasets. Third-party loaders are also available for datasets other than those that are supported by default.

According to a 2011 paper [2] by some of Torch's maintainers, Torch7 was faster than Theano on most benchmarks. The authors noted, however, that Theano was "faster than any existing implementation" (including Torch5, Matlab with GPUmat and EBLearn), going as far as saying that it crushed the alternatives in benchmarks when run on a GPU. It is also important to remember, as the authors comment, that only larger network architectures will benefit from GPU acceleration.

In the realm of deployment, Torch may prove easier to use than, for instance, Pylearn2 (and MATLAB, which only supports deployment of pre-trained networks), as the LuaJIT environment is embeddable into a host of environments, including smartphone applications and video games.

Torch is very practical to use for deep learning, as the library has a focus on these approaches. Like Pylearn2, Torch has an active community, including mailing lists/user groups, a community wiki, online chat and an official source code repository on GitHub.

2.3 Matlab Neural Network Toolbox

Matlab supports neural networks via the Neural Network Toolbox, and networks built with this can in turn be parallelized and run on GPUs when used in conjunction with the Parallel Computing Toolbox.

While it is not supported "out of the box", deep learning architectures can be used in Matlab via the third-party Deep Learning Toolbox⁹. The toolbox includes support for Deep Belief Nets, Stacked Autoencoders, Convolutional Neural Nets, Convolutional Autoencoders and Feedforward Backpropagation Neural Nets.

Matlab is widely used in both industry and academia, and it is therefore easier to find helpful tips and snippets of code online, as there is a larger userbase (and indeed more use cases, as it is a language of its own). Matlab also includes an integrated development environment.

2.4 Scikit-learn

Scikit-learn [7] is another machine learning library for Python. Building upon widely used packages like numpy (N-dimensional arrays) and scipy (scientific computing), it contains a variety of classification, regression and clustering algorithms, providing machine learning techniques for supervised and unsupervised problems. The package "focuses on bringing machine learning to non-specialists" [7], and is thus more accessible than some alternatives; unlike Pylearn2, it does not require users to possess knowledge of the models' implementations. Also unlike Pylearn2, which focuses mostly on neural networks, scikit-learn has a wide variety of machine learning techniques available, including support vector machines, logistic regression, nearest neighbors and random forests, covering tasks like classification, regression, clustering, dimensionality reduction, model selection and preprocessing. The library's website includes comprehensive documentation. Scikit-learn includes limited support for neural networks, but it is not very practical to use for deep learning.
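To illustrate the library's uniform estimator interface, the following sketch (dataset, estimator and hyperparameter chosen arbitrarily for illustration) trains a support vector classifier on the small "digits" dataset bundled with scikit-learn; substituting another technique, such as logistic regression or a random forest, only requires changing the construction line:

    # Sketch of scikit-learn's fit/score workflow on the bundled
    # "digits" dataset.
    from sklearn import datasets
    from sklearn.svm import SVC

    digits = datasets.load_digits()
    # Hold out the last 300 samples for testing.
    X_train, y_train = digits.data[:-300], digits.target[:-300]
    X_test, y_test = digits.data[-300:], digits.target[-300:]

    clf = SVC(gamma=0.001)            # estimators are constructed with their hyperparameters,
    clf.fit(X_train, y_train)         # trained with fit(),
    print(clf.score(X_test, y_test))  # and evaluated with score()/predict()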
2.5 Weka

Weka [5] is machine learning software from the University of Waikato, New Zealand, written with a focus on data mining. It is written in Java, and is usable on its own through its GUI, the "Explorer".

¹ One of recent years' most prominent labs, spearheaded by one of the leading researchers on deep learning, Yoshua Bengio.
² A human-readable data serialization format.
³ A web-based (locally interactive) environment provided by the enhanced Python shell and project IPython.
⁴ Available at https://github.com/lisa-lab/pylearn2
⁵ http://torch.ch
⁶ Basic Linear Algebra Subprograms.
⁷ Read–eval–print loop, i.e. an interactive shell.
⁸ Available at https://github.com/facebook/fbcunn