A Robot Exploration Strategy Based on Q-Learning Network
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Automatic Feature Learning Using Recurrent Neural Networks
Predicting purchasing intent: Automatic Feature Learning using Recurrent Neural Networks Humphrey Sheil Omer Rana Ronan Reilly Cardiff University Cardiff University Maynooth University Cardiff, Wales Cardiff, Wales Maynooth, Ireland [email protected] [email protected] [email protected] ABSTRACT influence three out of four major variables that affect profit. Inaddi- We present a neural network for predicting purchasing intent in an tion, merchants increasingly rely on (and pay advertising to) much Ecommerce setting. Our main contribution is to address the signifi- larger third-party portals (for example eBay, Google, Bing, Taobao, cant investment in feature engineering that is usually associated Amazon) to achieve their distribution, so any direct measures the with state-of-the-art methods such as Gradient Boosted Machines. merchant group can use to increase their profit is sorely needed. We use trainable vector spaces to model varied, semi-structured input data comprising categoricals, quantities and unique instances. McKinsey A.T. Kearney Affected by Multi-layer recurrent neural networks capture both session-local shopping intent and dataset-global event dependencies and relationships for user Price management 11.1% 8.2% Yes sessions of any length. An exploration of model design decisions Variable cost 7.8% 5.1% Yes including parameter sharing and skip connections further increase Sales volume 3.3% 3.0% Yes model accuracy. Results on benchmark datasets deliver classifica- Fixed cost 2.3% 2.0% No tion accuracy within 98% of state-of-the-art on one and exceed state-of-the-art on the second without the need for any domain / Table 1: Effect of improving different variables on operating dataset-specific feature engineering on both short and long event profit, from [22]. -
Exploring the Potential of Sparse Coding for Machine Learning
Portland State University PDXScholar Dissertations and Theses Dissertations and Theses 10-19-2020 Exploring the Potential of Sparse Coding for Machine Learning Sheng Yang Lundquist Portland State University Follow this and additional works at: https://pdxscholar.library.pdx.edu/open_access_etds Part of the Applied Mathematics Commons, and the Computer Sciences Commons Let us know how access to this document benefits ou.y Recommended Citation Lundquist, Sheng Yang, "Exploring the Potential of Sparse Coding for Machine Learning" (2020). Dissertations and Theses. Paper 5612. https://doi.org/10.15760/etd.7484 This Dissertation is brought to you for free and open access. It has been accepted for inclusion in Dissertations and Theses by an authorized administrator of PDXScholar. Please contact us if we can make this document more accessible: [email protected]. Exploring the Potential of Sparse Coding for Machine Learning by Sheng Y. Lundquist A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science Dissertation Committee: Melanie Mitchell, Chair Feng Liu Bart Massey Garrett Kenyon Bruno Jedynak Portland State University 2020 © 2020 Sheng Y. Lundquist Abstract While deep learning has proven to be successful for various tasks in the field of computer vision, there are several limitations of deep-learning models when com- pared to human performance. Specifically, human vision is largely robust to noise and distortions, whereas deep learning performance tends to be brittle to modifi- cations of test images, including being susceptible to adversarial examples. Addi- tionally, deep-learning methods typically require very large collections of training examples for good performance on a task, whereas humans can learn to perform the same task with a much smaller number of training examples. -
Text-To-Speech Synthesis
Generative Model-Based Text-to-Speech Synthesis Heiga Zen (Google London) February rd, @MIT Outline Generative TTS Generative acoustic models for parametric TTS Hidden Markov models (HMMs) Neural networks Beyond parametric TTS Learned features WaveNet End-to-end Conclusion & future topics Outline Generative TTS Generative acoustic models for parametric TTS Hidden Markov models (HMMs) Neural networks Beyond parametric TTS Learned features WaveNet End-to-end Conclusion & future topics Text-to-speech as sequence-to-sequence mapping Automatic speech recognition (ASR) “Hello my name is Heiga Zen” ! Machine translation (MT) “Hello my name is Heiga Zen” “Ich heiße Heiga Zen” ! Text-to-speech synthesis (TTS) “Hello my name is Heiga Zen” ! Heiga Zen Generative Model-Based Text-to-Speech Synthesis February rd, of Speech production process text (concept) fundamental freq voiced/unvoiced char freq transfer frequency speech transfer characteristics magnitude start--end Sound source fundamental voiced: pulse frequency unvoiced: noise modulation of carrier wave by speech information air flow Heiga Zen Generative Model-Based Text-to-Speech Synthesis February rd, of Typical ow of TTS system TEXT Sentence segmentation Word segmentation Text normalization Text analysis Part-of-speech tagging Pronunciation Speech synthesis Prosody prediction discrete discrete Waveform generation ) NLP Frontend discrete continuous SYNTHESIZED ) SEECH Speech Backend Heiga Zen Generative Model-Based Text-to-Speech Synthesis February rd, of Rule-based, formant synthesis [] -
Lecture 6 Learned Feedforward Visual Processing Neural Networks, Deep Learning, Convnets
William T. Freeman, Antonio Torralba, 2017 Lecture 6 Learned feedforward visual processing Neural Networks, Deep learning, ConvNets Some slides modified from R. Fergus We need translation invariance Lots of useful linear filters… Laplacian Gaussian derivative Gaussian Gabor And many more… High order Gaussian derivatives We need translation and scale invariance Lots of image pyramids… Gaussian Pyr Laplacian Pyr And many more: QMF, steerable, … We need … What is the best representation? • All the previous representation are manually constructed. • Could they be learnt from data? A brief history of Neural Networks enthusiasm time Perceptrons, 1958 Rosenblatt http://www.ecse.rpi.edu/homepages/nagy/PDF_chrono/2011_Na gy_Pace_FR.pdf. Photo by George Nagy 9 http://www.manhattanrarebooks-science.com/rosenblatt.htm Perceptrons, 1958 10 Perceptrons, 1958 enthusiasm time Minsky and Papert, Perceptrons, 1972 12 Perceptrons, 1958 enthusiasm Minsky and Papert, 1972 time Parallel Distributed Processing (PDP), 1986 14 XOR problem Inputs Output 0 0 0 1 0 1 0 1 1 0 1 1 1 0 0 1 PDP authors pointed to the backpropagation algorithm as a breakthrough, allowing multi-layer neural networks to be trained. Among the functions that a multi-layer network can represent but a single-layer network cannot: the XOR function. 15 Perceptrons, PDP book, 1958 1986 enthusiasm Minsky and Papert, 1972 time LeCun conv nets, 1998 Demos: http://yann.lecun.com/exdb/lenet/index.html 17 18 Neural networks to recognize handwritten digits? yes Neural networks for tougher problems? not really http://pub.clement.farabet.net/ecvw09.pdf 19 NIPS 2000 • NIPS, Neural Information Processing Systems, is the premier conference on machine learning. -
Comparative Study of Deep Learning Software Frameworks
Comparative Study of Deep Learning Software Frameworks Soheil Bahrampour, Naveen Ramakrishnan, Lukas Schott, Mohak Shah Research and Technology Center, Robert Bosch LLC {Soheil.Bahrampour, Naveen.Ramakrishnan, fixed-term.Lukas.Schott, Mohak.Shah}@us.bosch.com ABSTRACT such as dropout and weight decay [2]. As the popular- Deep learning methods have resulted in significant perfor- ity of the deep learning methods have increased over the mance improvements in several application domains and as last few years, several deep learning software frameworks such several software frameworks have been developed to have appeared to enable efficient development and imple- facilitate their implementation. This paper presents a com- mentation of these methods. The list of available frame- parative study of five deep learning frameworks, namely works includes, but is not limited to, Caffe, DeepLearning4J, Caffe, Neon, TensorFlow, Theano, and Torch, on three as- deepmat, Eblearn, Neon, PyLearn, TensorFlow, Theano, pects: extensibility, hardware utilization, and speed. The Torch, etc. Different frameworks try to optimize different as- study is performed on several types of deep learning ar- pects of training or deployment of a deep learning algorithm. chitectures and we evaluate the performance of the above For instance, Caffe emphasises ease of use where standard frameworks when employed on a single machine for both layers can be easily configured without hard-coding while (multi-threaded) CPU and GPU (Nvidia Titan X) settings. Theano provides automatic differentiation capabilities which The speed performance metrics used here include the gradi- facilitates flexibility to modify architecture for research and ent computation time, which is important during the train- development. Several of these frameworks have received ing phase of deep networks, and the forward time, which wide attention from the research community and are well- is important from the deployment perspective of trained developed allowing efficient training of deep networks with networks. -
Comparative Study of Caffe, Neon, Theano, and Torch
Workshop track - ICLR 2016 COMPARATIVE STUDY OF CAFFE,NEON,THEANO, AND TORCH FOR DEEP LEARNING Soheil Bahrampour, Naveen Ramakrishnan, Lukas Schott, Mohak Shah Bosch Research and Technology Center fSoheil.Bahrampour,Naveen.Ramakrishnan, fixed-term.Lukas.Schott,[email protected] ABSTRACT Deep learning methods have resulted in significant performance improvements in several application domains and as such several software frameworks have been developed to facilitate their implementation. This paper presents a comparative study of four deep learning frameworks, namely Caffe, Neon, Theano, and Torch, on three aspects: extensibility, hardware utilization, and speed. The study is per- formed on several types of deep learning architectures and we evaluate the per- formance of the above frameworks when employed on a single machine for both (multi-threaded) CPU and GPU (Nvidia Titan X) settings. The speed performance metrics used here include the gradient computation time, which is important dur- ing the training phase of deep networks, and the forward time, which is important from the deployment perspective of trained networks. For convolutional networks, we also report how each of these frameworks support various convolutional algo- rithms and their corresponding performance. From our experiments, we observe that Theano and Torch are the most easily extensible frameworks. We observe that Torch is best suited for any deep architecture on CPU, followed by Theano. It also achieves the best performance on the GPU for large convolutional and fully connected networks, followed closely by Neon. Theano achieves the best perfor- mance on GPU for training and deployment of LSTM networks. Finally Caffe is the easiest for evaluating the performance of standard deep architectures. -
Unsupervised Speech Representation Learning Using Wavenet Autoencoders Jan Chorowski, Ron J
1 Unsupervised speech representation learning using WaveNet autoencoders Jan Chorowski, Ron J. Weiss, Samy Bengio, Aaron¨ van den Oord Abstract—We consider the task of unsupervised extraction speaker gender and identity, from phonetic content, properties of meaningful latent representations of speech by applying which are consistent with internal representations learned autoencoding neural networks to speech waveforms. The goal by speech recognizers [13], [14]. Such representations are is to learn a representation able to capture high level semantic content from the signal, e.g. phoneme identities, while being desired in several tasks, such as low resource automatic speech invariant to confounding low level details in the signal such as recognition (ASR), where only a small amount of labeled the underlying pitch contour or background noise. Since the training data is available. In such scenario, limited amounts learned representation is tuned to contain only phonetic content, of data may be sufficient to learn an acoustic model on the we resort to using a high capacity WaveNet decoder to infer representation discovered without supervision, but insufficient information discarded by the encoder from previous samples. Moreover, the behavior of autoencoder models depends on the to learn the acoustic model and a data representation in a fully kind of constraint that is applied to the latent representation. supervised manner [15], [16]. We compare three variants: a simple dimensionality reduction We focus on representations learned with autoencoders bottleneck, a Gaussian Variational Autoencoder (VAE), and a applied to raw waveforms and spectrogram features and discrete Vector Quantized VAE (VQ-VAE). We analyze the quality investigate the quality of learned representations on LibriSpeech of learned representations in terms of speaker independence, the ability to predict phonetic content, and the ability to accurately re- [17]. -
Sparse Feature Learning for Deep Belief Networks
Sparse Feature Learning for Deep Belief Networks Marc'Aurelio Ranzato1 Y-Lan Boureau2,1 Yann LeCun1 1 Courant Institute of Mathematical Sciences, New York University 2 INRIA Rocquencourt {ranzato,ylan,[email protected]} Abstract Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have cer- tain desirable properties (e.g. low dimension, sparsity, etc). Others are based on approximating density by stochastically reconstructing the input from the repre- sentation. We describe a novel and efficient algorithm to learn sparse represen- tations, and compare it theoretically and experimentally with a similar machine trained probabilistically, namely a Restricted Boltzmann Machine. We propose a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation. We demonstrate this method by extracting features from a dataset of handwritten numerals, and from a dataset of natural image patches. We show that by stacking multiple levels of such machines and by training sequentially, high-order dependencies between the input observed variables can be captured. 1 Introduction One of the main purposes of unsupervised learning is to produce good representations for data, that can be used for detection, recognition, prediction, or visualization. Good representations eliminate irrelevant variabilities of the input data, while preserving the information that is useful for the ul- timate task. One cause for the recent resurgence of interest in unsupervised learning is the ability to produce deep feature hierarchies by stacking unsupervised modules on top of each other, as pro- posed by Hinton et al. -
Tensorflow, Theano, Keras, Torch, Caffe Vicky Kalogeiton, Stéphane Lathuilière, Pauline Luc, Thomas Lucas, Konstantin Shmelkov Introduction
TensorFlow, Theano, Keras, Torch, Caffe Vicky Kalogeiton, Stéphane Lathuilière, Pauline Luc, Thomas Lucas, Konstantin Shmelkov Introduction TensorFlow Google Brain, 2015 (rewritten DistBelief) Theano University of Montréal, 2009 Keras François Chollet, 2015 (now at Google) Torch Facebook AI Research, Twitter, Google DeepMind Caffe Berkeley Vision and Learning Center (BVLC), 2013 Outline 1. Introduction of each framework a. TensorFlow b. Theano c. Keras d. Torch e. Caffe 2. Further comparison a. Code + models b. Community and documentation c. Performance d. Model deployment e. Extra features 3. Which framework to choose when ..? Introduction of each framework TensorFlow architecture 1) Low-level core (C++/CUDA) 2) Simple Python API to define the computational graph 3) High-level API (TF-Learn, TF-Slim, soon Keras…) TensorFlow computational graph - auto-differentiation! - easy multi-GPU/multi-node - native C++ multithreading - device-efficient implementation for most ops - whole pipeline in the graph: data loading, preprocessing, prefetching... TensorBoard TensorFlow development + bleeding edge (GitHub yay!) + division in core and contrib => very quick merging of new hotness + a lot of new related API: CRF, BayesFlow, SparseTensor, audio IO, CTC, seq2seq + so it can easily handle images, videos, audio, text... + if you really need a new native op, you can load a dynamic lib - sometimes contrib stuff disappears or moves - recently introduced bells and whistles are barely documented Presentation of Theano: - Maintained by Montréal University group. - Pioneered the use of a computational graph. - General machine learning tool -> Use of Lasagne and Keras. - Very popular in the research community, but not elsewhere. Falling behind. What is it like to start using Theano? - Read tutorials until you no longer can, then keep going. -
DIY Deep Learning for Vision: the Caffe Framework
DIY Deep Learning for Vision: the Caffe framework caffe.berkeleyvision.org github.com/BVLC/caffe Evan Shelhamer adapted from the Caffe tutorial with Jeff Donahue, Yangqing Jia, and Ross Girshick. Why Deep Learning? The Unreasonable Effectiveness of Deep Features Classes separate in the deep representations and transfer to many tasks. [DeCAF] [Zeiler-Fergus] Why Deep Learning? The Unreasonable Effectiveness of Deep Features Maximal activations of pool5 units [R-CNN] conv5 DeConv visualization Rich visual structure of features deep in hierarchy. [Zeiler-Fergus] Why Deep Learning? The Unreasonable Effectiveness of Deep Features 1st layer filters image patches that strongly activate 1st layer filters [Zeiler-Fergus] What is Deep Learning? Compositional Models Learned End-to-End What is Deep Learning? Compositional Models Learned End-to-End Hierarchy of Representations - vision: pixel, motif, part, object - text: character, word, clause, sentence - speech: audio, band, phone, word concrete abstract learning What is Deep Learning? Compositional Models Learned End-to-End figure credit Yann LeCun, ICML ‘13 tutorial What is Deep Learning? Compositional Models Learned End-to-End Back-propagation: take the gradient of the model layer-by-layer by the chain rule to yield the gradient of all the parameters. figure credit Yann LeCun, ICML ‘13 tutorial What is Deep Learning? Vast space of models! Caffe models are loss-driven: - supervised - unsupervised slide credit Marc’aurelio Ranzato, CVPR ‘14 tutorial. Convolutional Neural Nets (CNNs): 1989 LeNet: a layered model composed of convolution and subsampling operations followed by a holistic representation and ultimately a classifier for handwritten digits. [ LeNet ] Convolutional Nets: 2012 AlexNet: a layered model composed of convolution, + data subsampling, and further operations followed by a holistic + gpu representation and all-in-all a landmark classifier on + non-saturating nonlinearity ILSVRC12. -
Caffe in Practice
This Business of Brewing: Caffe in Practice caffe.berkeleyvision.org Evan Shelhamer github.com/BVLC/caffe from the tutorial by Evan Shelhamer, Jeff Donahue, Yangqing Jia, and Ross Girshick Deep Learning, as it is executed... What should a framework handle? Compositional Models Decompose the problem and code! End-to-End Learning Solve and check! Vast Space of Architectures and Tasks Define, experiment, and extend! Frameworks ● Torch7 ○ NYU ○ scientific computing framework in Lua ○ supported by Facebook ● Theano/Pylearn2 ○ U. Montreal ○ scientific computing framework in Python ○ symbolic computation and automatic differentiation ● Cuda-Convnet2 ○ Alex Krizhevsky ○ Very fast on state-of-the-art GPUs with Multi-GPU parallelism ○ C++ / CUDA library Framework Comparison ● More alike than different ○ All express deep models ○ All are nicely open-source ○ All include scripting for hacking and prototyping ● No strict winners – experiment and choose the framework that best fits your work ● We like to brew our deep networks with Caffe Why Caffe? In one sip… ● Expression: models + optimizations are plaintext schemas, not code. ● Speed: for state-of-the-art models and massive data. ● Modularity: to extend to new tasks and architectures. ● Openness: common code and reference models for reproducibility. ● Community: joint discussion, development, and modeling. CAFFE EXAMPLES + APPLICATIONS Share a Sip of Brewed Models demo.caffe.berkeleyvision.org demo code open-source and bundled Scene Recognition by MIT Places CNN demo B. Zhou et al. NIPS 14 Object Detection R-CNN: Regions with Convolutional Neural Networks http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/detection.ipynb Full R-CNN scripts available at https://github.com/rbgirshick/rcnn Ross Girshick et al. -
Comparison of Feature Learning Methods for Human Activity Recognition Using Wearable Sensors
sensors Article Comparison of Feature Learning Methods for Human Activity Recognition Using Wearable Sensors Frédéric Li 1,*, Kimiaki Shirahama 1 ID , Muhammad Adeel Nisar 1, Lukas Köping 1 and Marcin Grzegorzek 1,2 1 Research Group for Pattern Recognition, University of Siegen, Hölderlinstr 3, 57076 Siegen, Germany; [email protected] (K.S.); [email protected] (M.A.N.); [email protected] (L.K.); [email protected] (M.G.) 2 Department of Knowledge Engineering, University of Economics in Katowice, Bogucicka 3, 40-226 Katowice, Poland * Correspondence: [email protected]; Tel.: +49-271-740-3973 Received: 16 January 2018; Accepted: 22 February 2018; Published: 24 February 2018 Abstract: Getting a good feature representation of data is paramount for Human Activity Recognition (HAR) using wearable sensors. An increasing number of feature learning approaches—in particular deep-learning based—have been proposed to extract an effective feature representation by analyzing large amounts of data. However, getting an objective interpretation of their performances faces two problems: the lack of a baseline evaluation setup, which makes a strict comparison between them impossible, and the insufficiency of implementation details, which can hinder their use. In this paper, we attempt to address both issues: we firstly propose an evaluation framework allowing a rigorous comparison of features extracted by different methods, and use it to carry out extensive experiments with state-of-the-art feature learning approaches. We then provide all the codes and implementation details to make both the reproduction of the results reported in this paper and the re-use of our framework easier for other researchers.