Choosing a Deep Learning Library


Choosing a Deep Learning Library
There are a lot of them.
Jesse Brizzi {.com, @gmail.com, @curalate.com}

Who am I / What do I do?
● Research Engineer
  ○ Focus in Computer Vision and Machine Learning; CS background.
● Work on the Image Intelligence Team @ Curalate
  ○ E-commerce SaaS: a platform that enables brands to find image-based social media content to repurpose for e-commerce purposes.
  ○ The Image Intelligence Team owns the entire pipeline, from researching new ML applications to training, development, and getting them into production.
  ○ Intelligent Product Tagging: technology that can analyze an image and use machine learning to identify specific products depicted within that image.

What's a Neural Net?
● FCN - Fully Connected Network
  ○ Multilayer perceptron: the fundamental neural net, where each neuron is connected to all neurons in the previous layer of the network.
● CNN - Convolutional Neural Network
  ○ A neural net that uses convolutional layers; heavily used in Computer Vision applications.
● RNN - Recurrent Neural Network
  ○ A neural net that feeds its output back into itself to process the next input; heavily used in Natural Language Processing applications.
● LSTM - Long Short-Term Memory Recurrent Neural Net
  ○ A fancier RNN that adds control over what output is passed on to the next input.

Important Factors
● Academia vs Industry
  ○ Who is the target audience?
● Community support
  ○ Pretrained models? Research paper repos?
  ○ How googleable are bugs and issues?
● Development speed/barriers to entry
  ○ Abstractions of low-level concepts.
  ○ Documentation quality.
  ○ Supported programming languages.
  ○ The ability to scale.
● Codebase quality
  ○ Is the code actively maintained?
● Performance
  ○ Benchmarks (oldish): https://arxiv.org/pdf/1608.07249.pdf
  ○ Performance does not scale very well on CPUs: 16-core CPUs are only slightly better than 4- or 8-core CPUs.
  ○ GPUs perform much better than many-core CPUs.
  ○ Scalability across multiple GPUs matters.
  ○ Performance is also affected by the design of configuration files and the implementation paradigm.
● Train-to-production pipeline
  ○ Support for a fast-to-prototype language (Python, R) and deployment in your production language (Java/Scala, C++, JS, whatever).
  ○ Train locally if you have the hardware, vs training on pre-prepared, simplified cloud services.
  ○ Ability to run on different platforms, ranging from mobile phones to massive server farms.
  ○ Ability to transfer your work to other libraries.

Imperative vs Symbolic Paradigms
● Dynamic computation graphs (imperative programming)
  ○ Built at runtime, which lets you use standard language statements; the system generates the graph structure as the program runs.
  ○ Useful when the graph structure needs to change at run time.
  ○ Makes debugging easy.
● Imperative programs tend to be more flexible
  ○ It's easier to use native language features.
  ○ The graph can follow your program's logical control flow.
● Symbolic programs tend to be more efficient
  ○ Both in terms of memory and speed.
  ○ Can safely reuse memory for in-place computation.
  ○ Can also apply operation-folding optimizations.
● Static computation graphs (symbolic paradigm)
  ○ Define the computation graph once, execute the graph many times.
  ○ The graph can be optimized once at the start.
  ○ Good for fixed-size nets (feed-forward, CNN).
  ○ Easier to manage in terms of loading and resources.
(A minimal code sketch contrasting the two paradigms follows below.)
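To make the contrast concrete, here is a minimal sketch of the two paradigms, assuming PyTorch for the dynamic side and TensorFlow 1.x graph mode for the static side (the library choices are illustrative, not from the slides):

```python
import torch
import tensorflow as tf  # assumes TensorFlow 1.x for the graph-mode APIs

# Imperative / dynamic: the graph is recorded as ordinary code runs,
# so native control flow (a plain Python while-loop) just works.
x = torch.tensor(2.0, requires_grad=True)
y = x
while y.sum() < 100:          # graph structure depends on runtime values
    y = y * 2
y.backward()                  # autograd walks the graph that was recorded
print(x.grad)                 # gradient through however many loop steps ran

# Symbolic / static: define the whole graph once, then execute it many
# times; the framework can optimize the graph before any data flows.
a = tf.placeholder(tf.float32)
b = a * 2 + 1                 # nothing is computed here, just graph nodes
with tf.Session() as sess:
    print(sess.run(b, feed_dict={a: 3.0}))   # the graph executes here
```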
Libraries That People Should Know About

Caffe
UC Berkeley | Watches: 2,241 | Stars: 27,296 | Forks: 16,454
Avg issue resolution: 3 days | Open issues: 13%
Symbolic paradigm | Research citations (2014): 10,159 | Model zoo
● IMO the first mainstream production-ready lib.
  ○ High-performance and well-tested C++ codebase.
● One of the first, and largest, model zoos.
● Large community of open source research projects.
● Able to train a net from your data without writing any code.
● Good for feedforward networks, image processing, and for fine-tuning pretrained nets.
● Main advantage was being first to market.
● Can convert models to almost any other relevant lib.
● Has bad design choices that are inherited from its original use case: conventional CNN applications.
● Not good for recurrent networks.
● Does not support auto-differentiation.
● Very verbose in layer and network definitions.
  ○ The graph is treated as a collection of layers, as opposed to nodes of single tensor operations.

Keras
Watches: 1,982 | Stars: 38,796 | Forks: 14,799
Avg issue resolution: 23 days | Open issues: 24%
Symbolic paradigm | Model zoo
● A library that sits on top of other DL libs and provides a single, easy to use, high-level interface.
● Very modular, minimal, readable, object-oriented code.
● Great for beginners, with great documentation.
● Lacks in optimizations.
● Supported backends: TensorFlow, Theano, CNTK, MXNet.
● Can export your trained models into the backend's format.
● A fork is included in TensorFlow's Python library.
● Not as customizable. (A minimal Keras sketch follows below.)
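A minimal sketch of the high-level Keras interface described above: a generic small CNN, not the deck's linked example (assumes the standalone keras package with a supported backend installed):

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# A small CNN for 28x28 grayscale images (MNIST-sized inputs).
model = Sequential([
    Conv2D(32, kernel_size=3, activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=2),
    Flatten(),
    Dense(10, activation='softmax'),
])

# One call wires in the backend's loss, optimizer, and metrics.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```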
TensorFlow
Google | Watches: 8,606 | Stars: 121,864 | Forks: 72,545
Avg issue resolution: 8 days | Open issues: 16%
Symbolic/Dynamic paradigm | Research citations (2016): 6,233 | Model zoo | CNN example code (Keras R, Keras Py)
● The current most popular option.
  ○ Largest active community.
  ○ The most open source projects and models.
● Google's attempt to build a single framework for everything deep learning related.
  ○ Built with massive distributed computing in mind (powers Google's apps).
  ○ Has mobile capabilities in the form of TensorFlow Mobile and TensorFlow Lite.
● TensorBoard is amazing for debugging and training.
● TensorFlow Serving for prod deployments (Python).
● A lot of documentation (official and 3rd party).
● Deep Google Cloud integration.
● Pretty low level (Keras and Sonnet help solve this).
● Most things outside of the core C/Python library are "experimental".
  ○ All of the APIs outside of the Python API are not covered by their API stability promises.
● Biggest issue with the library is performance.
  ○ TensorFlow is just slower and more of a resource hog when compared to the other libraries.
  ○ Other libs can perform twice as fast on typical deep net tasks.
  ○ Avoid it for performant RNN or LSTM networks.
  ○ Worst at scaling efficiency.

Torch / PyTorch
Torch: DeepMind, NYU, IDIAP | Watches: 665 | Stars: 8,218 | Forks: 2,340 | Avg issue resolution: 69 days | Open issues: 34% | Symbolic paradigm | Research citations: 1,246 | Model zoo
PyTorch: Facebook | Watches: 1,197 | Stars: 25,450 | Forks: 6,044 | Avg issue resolution: 6 days | Open issues: 24% | Symbolic/Dynamic paradigm | Research citations: 879 | Model zoo | CNN example code
● Torch was one of the original academia-focused libs.
● Many maintainers went to work at Facebook and created PyTorch.
● They use the same underlying C lib, so they provide similar performance.
● They differ in:
  ○ Interface (Lua vs Python)
  ○ Auto-diff capabilities
  ○ Paradigms

PyTorch
● PyTorch was made with the goal of fixing or modernizing Torch.
● Hybrid frontend for switching between paradigms.
● PyTorch also has its own visualization dashboard, called Visdom.
● Probably should be avoided if you want to deploy into production.
  ○ Facebook maintains a separate lib targeted at developers, Caffe2.
  ○ Changes are being made to make PyTorch production ready.
  ○ Caffe2 recently merged into PyTorch.
● Researchers tend to prefer PyTorch over TensorFlow.
  ○ Makes prototyping easy.

MXNet
Apache, Amazon | Watches: 1,180 | Stars: 16,450 | Forks: 5,889
Avg issue resolution: 40 days | Open issues: 13%
Symbolic/Dynamic paradigm | Research citations: 712 | Model zoo | CNN example code (Python, Gluon)
● Newer and growing option.
● Largest officially supported API selection.
  ○ High compatibility and consistency.
● Direct competitor to TensorFlow across all applications.
  ○ It can run on everything from a web browser, to a mobile phone, to a massive distributed server farm.
  ○ Amazon has found that you can get up to 85% scaling efficiency with MXNet.
● Has its own serving framework and deep integration with AWS.
● Also has its own TensorBoard forks.

MXNet Gluon
● A collaboration between AWS and Microsoft.
● Provides a clear, concise, and simple API for deep learning.
  ○ Full set of plug-and-play neural network building blocks: predefined layers, optimizers, and initializers.
  ○ Built-in model zoo.
● Hybridization is awesome (see the sketch below).
  ○ Hybrid symbolic/dynamic graph functionality offers the benefits of both.
  ○ Can make Gluon 3x faster than PyTorch.
● Great documentation for absolute beginners.
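A minimal sketch of Gluon hybridization, assuming MXNet 1.x is installed (the shapes and layer sizes are illustrative, not from the deck):

```python
from mxnet import nd
from mxnet.gluon import nn

# HybridBlocks run imperatively by default, which makes debugging easy.
net = nn.HybridSequential()
net.add(nn.Dense(128, activation='relu'),
        nn.Dense(10))
net.initialize()

x = nd.random.uniform(shape=(4, 784))
print(net(x).shape)   # eager, define-by-run execution

# One call compiles the same model into a static symbolic graph,
# unlocking graph-level optimizations without rewriting the model.
net.hybridize()
print(net(x).shape)   # now runs through the cached, optimized graph
```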
Recommended publications
  • Intro to TensorFlow 2.0, MBL, August 2019
Intro to TensorFlow 2.0, MBL, August 2019. Josh Gordon (@random_forests)

Agenda (1 of 2)
Exercises
● Fashion MNIST with dense layers
● CIFAR-10 with convolutional layers
Concepts (as many as we can intro in this short time)
● Gradient descent, dense layers, loss, softmax, convolution
Games
● QuickDraw

Agenda (2 of 2)
Walkthroughs and new tutorials
● Deep Dream and Style Transfer
● Time series forecasting
Games
● Sketch RNN
Learning more
● Book recommendations

Deep Learning is representation learning.
Latest tutorials and guides: tensorflow.org/beta
News and updates: medium.com/tensorflow, twitter.com/tensorflow
Demo: PoseNet and BodyPix (bit.ly/pose-net, bit.ly/body-pix)
TensorFlow for JavaScript, Swift, Android, and iOS: tensorflow.org/js, tensorflow.org/swift, tensorflow.org/lite

Minimal MNIST in TF 2.0: a linear model, a neural network, and a deep neural network, then a short exercise (bit.ly/mnist-seq). The deep version with a softmax output:

    model = Sequential()
    model.add(Dense(256, activation='relu', input_shape=(784,)))
    model.add(Dense(128, activation='relu'))
    model.add(Dense(10, activation='softmax'))

After training, select all the weights connected to a single output:

    model.layers[0].get_weights()
    # Your code here
    # Select the weights for a single output
    # ...
    img = weights.reshape(28, 28)
    plt.imshow(img, cmap=plt.get_cmap('seismic'))

Exercise 1 (option #1): bit.ly/mnist-seq
Reference: tensorflow.org/beta/tutorials/keras/basic_classification
TODO: Add a validation set. Add code to plot loss vs epochs (next slide).
Exercise 1 (option #2): bit.ly/ijcav_adv. Answers: next slide.
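A runnable completion of the snippet above, assuming TF 2.0's bundled Keras; the weight-selection step is one plausible way to do the exercise, not the official answer:

```python
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.datasets import mnist

(x_train, y_train), _ = mnist.load_data()
x_train = x_train.reshape(-1, 784) / 255.0   # flatten and scale to [0, 1]

model = Sequential([
    Dense(256, activation='relu', input_shape=(784,)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1, batch_size=128)

# Visualize the first-layer weights feeding one hidden unit as a 28x28 image.
weights, biases = model.layers[0].get_weights()   # weights: (784, 256)
img = weights[:, 0].reshape(28, 28)
plt.imshow(img, cmap=plt.get_cmap('seismic'))
plt.show()
```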
  • Deep Learning Frameworks | NVIDIA Developer
Deep Learning Frameworks | NVIDIA Developer (4/10/2017)

The NVIDIA Deep Learning SDK accelerates widely-used deep learning frameworks such as Caffe, CNTK, TensorFlow, Theano and Torch, as well as many other deep learning applications. Choose a deep learning framework from the list below, download the supported version of cuDNN, and follow the instructions on the framework page to get started.

Caffe is a deep learning framework made with expression, speed, and modularity in mind. Caffe is developed by the Berkeley Vision and Learning Center (BVLC), as well as community contributors, and is popular for computer vision. Caffe supports cuDNN v5 for GPU acceleration. Supported interfaces: C, C++, Python, MATLAB, command line interface. Learning resources: deep learning course "Getting Started with the Caffe Framework"; blog "Deep Learning for Computer Vision with Caffe and cuDNN".

The Microsoft Cognitive Toolkit (previously known as CNTK) is a unified deep-learning toolkit from Microsoft Research that makes it easy to train and combine popular model types across multiple GPUs and servers. Microsoft Cognitive Toolkit implements highly efficient CNN and RNN training for speech, image and text data. Microsoft Cognitive Toolkit supports cuDNN v5.1 for GPU acceleration. Supported interfaces: Python, C++, C# and command line interface.

TensorFlow is a software library for numerical computation using data flow graphs, developed by Google's Machine Intelligence research organization. TensorFlow supports cuDNN v5.1 for GPU acceleration. Supported interfaces: C++, Python.

Theano is a math expression compiler that efficiently defines, optimizes, and evaluates mathematical expressions involving multi-dimensional arrays.
  • Ways to Use Machine Learning Approaches for Software Development
Ways to use Machine Learning approaches for software development
Nicklas Jonsson, VT 2018, Examensarbete 30 hp
Supervisor: Eddie Wadbro. External supervisor: C4 Contexture. Examiner: Henrik Björklund.
Master of Science Programme in Computing Science and Engineering, 300 hp

Abstract: With machine learning, and in particular deep learning, entering all different types of fields, including software development, it can be hard to know where to begin searching for tools when someone wants to apply machine learning to their problems. This thesis looks at some of the technologies available today for applying machine learning to one's applications. It covers some of the available cloud services, frameworks, and libraries for machine learning, and presents three different implementation structures that can be used with these technologies for the problem of image classification.

Acknowledgements: I want to thank C4 Contexture for giving me the thesis idea, support, and supplying me with a work station. I also want to thank Lantmännen for supplying me with the image data that was used for this thesis. Finally, I want to thank Eddie Wadbro for his guidance during this thesis, and of course a big thanks to my family and friends for their support during this period of my life.

Contents (excerpt): 1 Introduction (1.1 The Client; 1.1.1 C4 Contexture PIM software; 1.2 The data; 1.3 Goal; 1.4 Limitation); 2 Background (2.1 Artificial intelligence, machine learning and deep learning).
  • Keras2c: a Library for Converting Keras Neural Networks to Real-Time Compatible C
Keras2c: A library for converting Keras neural networks to real-time compatible C
Rory Conlin (Dept. of Mechanical and Aerospace Engineering, Princeton University), Keith Erickson (Princeton Plasma Physics Laboratory), Joeseph Abbate (Dept. of Astrophysical Sciences, Princeton University), Egemen Kolemen (Princeton University / PPPL)

Abstract: With the growth of machine learning models and neural networks in measurement and control systems comes the need to deploy these models in a way that is compatible with existing systems. Existing options for deploying neural networks either introduce very high latency, require expensive and time-consuming work to integrate into existing code bases, or only support a very limited subset of model types. We have therefore developed a new method called Keras2c, a simple library for converting Keras/TensorFlow neural network models into real-time compatible C code. It supports a wide range of Keras layers and model types, including multidimensional convolutions, recurrent layers, multi-input/output models, and shared layers. Keras2c re-implements the core components of Keras/TensorFlow required for predictive forward passes through neural networks in pure C, relying only on standard library functions considered safe for real-time use. The core functionality consists of ~1500 lines of code, making it lightweight and easy to integrate into existing codebases. Keras2c has been successfully tested in experiments and is currently in use on the plasma control system at the DIII-D National Fusion Facility at General Atomics in San Diego.

1. Motivation: TensorFlow [1] is one of the most popular libraries for developing and training neural networks.
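For context, a sketch of the Keras side of the workflow the abstract describes: building and saving a small model that a converter such as Keras2c would then translate to C. The conversion call itself is omitted because this excerpt does not show Keras2c's API; the model shape and file name are illustrative assumptions.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# A small recurrent model of the kind the abstract says Keras2c supports
# (recurrent layers are listed among the supported layer types above).
model = Sequential([
    LSTM(16, input_shape=(50, 4)),   # 50 timesteps, 4 hypothetical sensor channels
    Dense(1),                        # single control output
])
model.compile(optimizer='adam', loss='mse')

# Save in a standard format; the converter consumes a trained Keras model.
model.save('controller.h5')
```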
  • Automated Elastic Pipelining for Distributed Training of Transformers
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers
Chaoyang He, Shen Li, Mahdi Soltanolkotabi, Salman Avestimehr

Abstract: The size of Transformer models is growing at an unprecedented rate. It has taken less than one year to reach trillion-level parameters since the release of GPT-3 (175B). Training such models requires both substantial engineering efforts and enormous computing resources, which are luxuries most research teams cannot afford. In this paper, we propose PipeTransformer, which leverages automated elastic pipelining for efficient distributed training of Transformer models. In PipeTransformer, we design an adaptive on-the-fly freeze algorithm that can identify and freeze some layers gradually during training, and an elastic pipelining system that can dynamically allocate resources to train the remaining active layers. More specifically, PipeTransformer automatically excludes frozen layers from the pipeline, packs active layers into fewer GPUs, ...

1. Introduction (excerpt): ... state-of-the-art convolutional networks ResNet-152 (He et al., 2016) and EfficientNet (Tan & Le, 2019). To tackle the growth in model sizes, researchers have proposed various distributed training techniques, including parameter servers (Li et al., 2014; Jiang et al., 2020; Kim et al., 2019), pipeline parallel (Huang et al., 2019; Park et al., 2020; Narayanan et al., 2019), intra-layer parallel (Lepikhin et al., 2020; Shazeer et al., 2018; Shoeybi et al., 2019), and zero redundancy data parallel (Rajbhandari et al., 2019).

[Figure 1. Interpretable Freeze Training: DNNs converge bottom up (results on CIFAR10 using ResNet); panels show layer similarity scores at T0 (0% trained), T1 (35% trained), T2 (75% trained), and T3 (100% trained).]
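As a toy illustration of the freeze idea (not PipeTransformer's actual algorithm or API), one can freeze bottom layers in PyTorch so they drop out of gradient computation:

```python
import torch.nn as nn

# An illustrative stack of layers standing in for a Transformer.
model = nn.Sequential(*[nn.Linear(64, 64) for _ in range(8)])

def freeze_bottom_layers(model, num_frozen):
    """Freeze the first `num_frozen` layers; frozen layers need no
    gradients, so they could be excluded from the training pipeline."""
    for i, layer in enumerate(model):
        requires_grad = i >= num_frozen
        for p in layer.parameters():
            p.requires_grad = requires_grad

# As training progresses, gradually freeze more of the bottom layers
# (DNNs tend to converge bottom-up, per Figure 1 above).
freeze_bottom_layers(model, num_frozen=2)
trainable = [p for p in model.parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors remain")
```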
  • Lecture 6 Learned Feedforward Visual Processing Neural Networks, Deep Learning, Convnets
Lecture 6: Learned feedforward visual processing. Neural Networks, Deep Learning, ConvNets
William T. Freeman, Antonio Torralba, 2017. Some slides modified from R. Fergus.

We need translation invariance: lots of useful linear filters (Laplacian, Gaussian derivative, Gaussian, Gabor, high-order Gaussian derivatives, and many more).
We need translation and scale invariance: lots of image pyramids (Gaussian pyramid, Laplacian pyramid, and many more: QMF, steerable, ...).
We need... what is the best representation? All the previous representations are manually constructed. Could they be learnt from data?

A brief history of neural networks (enthusiasm over time):
● Perceptrons, 1958 (Rosenblatt). http://www.ecse.rpi.edu/homepages/nagy/PDF_chrono/2011_Nagy_Pace_FR.pdf (photo by George Nagy); http://www.manhattanrarebooks-science.com/rosenblatt.htm
● Minsky and Papert, Perceptrons, 1972: enthusiasm falls.
● Parallel Distributed Processing (PDP), 1986: enthusiasm returns.

The XOR problem (a worked example follows at the end of this excerpt):

    Inputs | Output
    0 0    | 0
    0 1    | 1
    1 0    | 1
    1 1    | 0

The PDP authors pointed to the backpropagation algorithm as a breakthrough, allowing multi-layer neural networks to be trained. Among the functions that a multi-layer network can represent but a single-layer network cannot: the XOR function.

● LeCun conv nets, 1998. Demos: http://yann.lecun.com/exdb/lenet/index.html
Neural networks to recognize handwritten digits? Yes. Neural networks for tougher problems? Not really. http://pub.clement.farabet.net/ecvw09.pdf
NIPS 2000: NIPS, Neural Information Processing Systems, is the premier conference on machine learning.
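The worked example promised above: a two-layer network trained with backpropagation learns XOR, which a single-layer perceptron cannot represent (plain NumPy; the layer size, learning rate, and step count are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 4 sigmoid units, then a sigmoid output unit.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: gradients of the squared-error loss.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2))  # typically approaches [[0], [1], [1], [0]]
```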
  • Comparative Study of Deep Learning Software Frameworks
Comparative Study of Deep Learning Software Frameworks
Soheil Bahrampour, Naveen Ramakrishnan, Lukas Schott, Mohak Shah
Research and Technology Center, Robert Bosch LLC
{Soheil.Bahrampour, Naveen.Ramakrishnan, fixed-term.Lukas.Schott, Mohak.Shah}@us.bosch.com

ABSTRACT: Deep learning methods have resulted in significant performance improvements in several application domains and as such several software frameworks have been developed to facilitate their implementation. This paper presents a comparative study of five deep learning frameworks, namely Caffe, Neon, TensorFlow, Theano, and Torch, on three aspects: extensibility, hardware utilization, and speed. The study is performed on several types of deep learning architectures and we evaluate the performance of the above frameworks when employed on a single machine for both (multi-threaded) CPU and GPU (Nvidia Titan X) settings. The speed performance metrics used here include the gradient computation time, which is important during the training phase of deep networks, and the forward time, which is important from the deployment perspective of trained networks.

Introduction (excerpt): ... such as dropout and weight decay [2]. As the popularity of deep learning methods has increased over the last few years, several deep learning software frameworks have appeared to enable efficient development and implementation of these methods. The list of available frameworks includes, but is not limited to, Caffe, DeepLearning4J, deepmat, Eblearn, Neon, PyLearn, TensorFlow, Theano, Torch, etc. Different frameworks try to optimize different aspects of training or deployment of a deep learning algorithm. For instance, Caffe emphasises ease of use, where standard layers can be easily configured without hard-coding, while Theano provides automatic differentiation capabilities, which facilitates flexibility to modify architectures for research and development. Several of these frameworks have received wide attention from the research community and are well-developed, allowing efficient training of deep networks with ...
  • Comparative Study of Caffe, Neon, Theano, and Torch
Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning
Workshop track - ICLR 2016
Soheil Bahrampour, Naveen Ramakrishnan, Lukas Schott, Mohak Shah
Bosch Research and Technology Center
{Soheil.Bahrampour, Naveen.Ramakrishnan, fixed-term.Lukas.Schott, Mohak.Shah}@us.bosch.com

ABSTRACT: Deep learning methods have resulted in significant performance improvements in several application domains and as such several software frameworks have been developed to facilitate their implementation. This paper presents a comparative study of four deep learning frameworks, namely Caffe, Neon, Theano, and Torch, on three aspects: extensibility, hardware utilization, and speed. The study is performed on several types of deep learning architectures and we evaluate the performance of the above frameworks when employed on a single machine for both (multi-threaded) CPU and GPU (Nvidia Titan X) settings. The speed performance metrics used here include the gradient computation time, which is important during the training phase of deep networks, and the forward time, which is important from the deployment perspective of trained networks. For convolutional networks, we also report how each of these frameworks supports various convolutional algorithms and their corresponding performance. From our experiments, we observe that Theano and Torch are the most easily extensible frameworks. We observe that Torch is best suited for any deep architecture on CPU, followed by Theano. It also achieves the best performance on the GPU for large convolutional and fully connected networks, followed closely by Neon. Theano achieves the best performance on GPU for training and deployment of LSTM networks. Finally, Caffe is the easiest for evaluating the performance of standard deep architectures.
  • Torch7: a Matlab-Like Environment for Machine Learning
Torch7: A Matlab-like Environment for Machine Learning
Ronan Collobert (Idiap Research Institute, Martigny, Switzerland), Koray Kavukcuoglu (NEC Laboratories America, Princeton, NJ, USA), Clément Farabet (Courant Institute of Mathematical Sciences, New York University, New York, NY, USA / Université Paris-Est, Équipe A3SI, ESIEE Paris, France)

Abstract: Torch7 is a versatile numeric computing framework and machine learning library that extends Lua. Its goal is to provide a flexible environment to design and train learning machines. Flexibility is obtained via Lua, an extremely lightweight scripting language. High performance is obtained via efficient OpenMP/SSE and CUDA implementations of low-level numeric routines. Torch7 can easily be interfaced to third-party software thanks to Lua's light interface.

1. Torch7 Overview: With Torch7, we aim at providing a framework with three main advantages: (1) it should ease the development of numerical algorithms, (2) it should be easily extended (including the use of other libraries), and (3) it should be fast. We found that a scripting (interpreted) language with a good C API appears as a convenient solution to "satisfy" constraint (2). First, a high-level language makes the process of developing a program simpler and more understandable than a low-level language. Second, if the programming language is interpreted, it also becomes easier to quickly try various ideas in an interactive manner. And finally, assuming a good C API, the scripting language becomes the "glue" between heterogeneous libraries: different structures of the same concept (coming from different libraries) can be hidden behind a unique structure in the scripting language, while keeping all the functionalities coming from all the different libraries.
  • Tensorflow, Theano, Keras, Torch, Caffe Vicky Kalogeiton, Stéphane Lathuilière, Pauline Luc, Thomas Lucas, Konstantin Shmelkov Introduction
TensorFlow, Theano, Keras, Torch, Caffe
Vicky Kalogeiton, Stéphane Lathuilière, Pauline Luc, Thomas Lucas, Konstantin Shmelkov

Introduction
● TensorFlow: Google Brain, 2015 (rewritten DistBelief)
● Theano: University of Montréal, 2009
● Keras: François Chollet, 2015 (now at Google)
● Torch: Facebook AI Research, Twitter, Google DeepMind
● Caffe: Berkeley Vision and Learning Center (BVLC), 2013

Outline
1. Introduction of each framework: TensorFlow, Theano, Keras, Torch, Caffe
2. Further comparison: code + models; community and documentation; performance; model deployment; extra features
3. Which framework to choose when?

TensorFlow architecture
1) Low-level core (C++/CUDA)
2) Simple Python API to define the computational graph
3) High-level API (TF-Learn, TF-Slim, soon Keras...)

TensorFlow computational graph
- auto-differentiation!
- easy multi-GPU/multi-node
- native C++ multithreading
- device-efficient implementation for most ops
- whole pipeline in the graph: data loading, preprocessing, prefetching...
- TensorBoard

TensorFlow development
+ bleeding edge (GitHub yay!)
+ division into core and contrib => very quick merging of new hotness
+ a lot of new related APIs: CRF, BayesFlow, SparseTensor, audio IO, CTC, seq2seq
+ so it can easily handle images, videos, audio, text...
+ if you really need a new native op, you can load a dynamic lib
- sometimes contrib stuff disappears or moves
- recently introduced bells and whistles are barely documented

Presentation of Theano
- Maintained by the Montréal University group.
- Pioneered the use of a computational graph.
- General machine learning tool -> use of Lasagne and Keras.
- Very popular in the research community, but not elsewhere. Falling behind.
What is it like to start using Theano? Read tutorials until you no longer can, then keep going.
  • Introduction to Deep Learning Framework 1. Introduction 1.1
Introduction to Deep Learning Framework

1. Introduction
1.1. Commonly used frameworks
The most commonly used frameworks for deep learning include PyTorch, TensorFlow, Keras, Caffe, Apache MXNet, etc.
PyTorch: an open source machine learning library; developed by Facebook AI Research Lab; based on the Torch library; supports Python and C++ interfaces.
TensorFlow: an open source software library for dataflow and differentiable programming; developed by the Google Brain team; provides stable Python and C APIs.
Keras: an open-source neural-network library written in Python; conceived to be an interface; capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, R, Theano, or PlaidML.
Caffe: open source under a BSD licence; developed at the University of California, Berkeley; written in C++ with a Python interface.
Apache MXNet: an open-source deep learning software framework; supports a flexible programming model and multiple programming languages (including C++, Python, Java, Julia, Matlab, JavaScript, Go, R, Scala, Perl, and Wolfram Language).

1.2. PyTorch
1.2.1. Data
Tensor: the major computation unit in PyTorch. A tensor can be viewed as the extension of a vector (one-dimensional) and a matrix (two-dimensional), and can be defined with any number of dimensions.
Variable: a wrapper around a tensor, which includes the creator, the value of the variable (the tensor), and the gradient. This is the core of automatic differentiation in PyTorch, as it holds both the value and the creator, which is essential for the backward pass.
Parameter: a subset of Variable.
1.2.2. Functions
NNModules: NNModules (torch.nn) are combinations of parameters and functions, and can be interpreted as layers. There are common modules such as convolution layers, linear layers, pooling layers, dropout layers, etc. (A short sketch of these concepts follows below.)
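The sketch of these concepts mentioned above (shapes and values are illustrative, not from the original document):

```python
import torch
import torch.nn as nn

# Tensors generalize vectors and matrices to any number of dimensions.
x = torch.randn(8, 3, requires_grad=True)   # tracks gradient information

# NNModules combine parameters and functions and act as layers.
layer = nn.Linear(3, 2)      # holds weight and bias Parameters
out = layer(x).relu().sum()

# Because each tensor records its creator, backward() can walk the graph.
out.backward()
print(x.grad.shape)          # torch.Size([8, 3])
print(type(layer.weight))    # <class 'torch.nn.parameter.Parameter'>
```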
  • Zero-Shot Text-To-Image Generation
Zero-Shot Text-to-Image Generation
Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever

Abstract: Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. These assumptions might involve complex architectures, auxiliary losses, or side information such as object part labels or segmentation masks supplied during training. We describe a simple approach for this task based on a transformer that autoregressively models the text and image tokens as a single stream of data. With sufficient data and scale, our approach is competitive with previous domain-specific models when evaluated in a zero-shot fashion.

[Figure 1. Comparison of original images (top) and reconstructions from the discrete VAE (bottom). The encoder downsamples the spatial resolution by a factor of 8. While details (e.g., the texture of the cat's fur, the writing on the storefront, and the thin lines in the illustration) are sometimes lost or distorted, the main features of the image are still typically recognizable. We use a large vocabulary size of 8192 to mitigate the loss of information.]

1. Introduction (excerpt): Modern machine learning approaches to text-to-image synthesis started with the work of Mansimov et al. (2015), who showed that the DRAW (Gregor et al., 2015) generative model, when extended to condition on image captions, could also generate novel visual scenes. Reed et al. (2016b) later demonstrated that using a generative adversarial network ...