The Natural Language of Actions


Guy Tennenholtz, Shie Mannor
Faculty of Electrical Engineering, Technion Institute of Technology, Israel.
Correspondence to: Guy Tennenholtz <[email protected]>, Shie Mannor <[email protected]>.
Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. Copyright 2019 by the author(s). arXiv:1902.01119v2 [cs.AI], 19 May 2019.

Abstract

We introduce Act2Vec, a general framework for learning context-based action representations for reinforcement learning. Representing actions in a vector space helps reinforcement learning algorithms achieve better performance by grouping similar actions and utilizing relations between different actions. We show how prior knowledge of an environment can be extracted from demonstrations and injected into action vector representations that encode natural compatible behavior. We then use these representations for augmenting state representations as well as improving function approximation of Q-values. We visualize and test action embeddings in three domains, including a drawing task, a high dimensional navigation task, and the large action space domain of StarCraft II.

Figure 1. A schematic visualization of action pair embeddings in a navigation domain using a distributional representation method. Each circle represents an action of two consecutive movements in the world. Actions that are close have similar contexts. Relations between actions in the vector space have interpretations in the physical world.

1. Introduction

The question "What is language?" has had implications in the fields of neuropsychology, linguistics, and philosophy. One definition tells us that language is "a purely human and non-instinctive method of communicating ideas, emotions, and desires by means of voluntarily produced symbols" (Sapir, 1921). Much like humans adopt languages to communicate, their interaction with an environment uses sophisticated languages to convey information. Inspired by this conceptual analogy, we adopt existing methods in natural language processing (NLP) to gain a deeper understanding of the "natural language of actions", with the ultimate goal of solving reinforcement learning (RL) tasks.

In recent years, many advances were made in the field of distributed representations of words (Mikolov et al., 2013b; Pennington et al., 2014; Zhao et al., 2017). Distributional methods make use of the hypothesis that words which occur in a similar context tend to have similar meaning (Firth, 1957), i.e., the meaning of a word can be inferred from the distribution of words around it. For this reason, these methods are called "distributional" methods. Similarly, the context in which an action is executed holds vital information (e.g., prior knowledge) about the environment. This information can be transferred to a learning agent through distributed action representations in a manifold of diverse environments.

In this paper, actions are represented and characterized by the company they keep (i.e., their context). We assume the context in which actions reside is induced by a demonstrator policy (or set of policies). We use the celebrated continuous skip-gram model (Mikolov et al., 2013a) to learn high-quality vector representations of actions from large amounts of demonstrated trajectories. In this approach, each action or sequence of actions is embedded in a d-dimensional vector that characterizes knowledge of acceptable behavior in the environment.
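As a concrete illustration of this approach, the following is a minimal Act2Vec-style sketch that treats each demonstrated trajectory as a "sentence" of action tokens and fits a skip-gram model with negative sampling (SGNS); the toy trajectories, hyperparameters, and use of gensim (assumed version 4.x) are illustrative assumptions rather than the paper's actual setup.

```python
# Minimal sketch: learn action embeddings with skip-gram + negative sampling
# over a toy demonstration corpus. Each trajectory is a list of action tokens.
from gensim.models import Word2Vec

trajectories = [
    ["straight", "straight", "left", "straight", "left"],
    ["straight", "left", "straight", "straight", "right"],
    ["right", "straight", "straight", "right", "backwards"],
]

act2vec = Word2Vec(
    sentences=trajectories,
    vector_size=16,   # dimension d of the action embedding
    window=2,         # context width w around each action
    sg=1,             # skip-gram
    negative=5,       # negative sampling (SGNS)
    min_count=1,
)

# Actions that occur in similar contexts receive similar vectors.
print(act2vec.wv.most_similar("straight", topn=3))
```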
As motivation, consider the problem of navigating a robot. In its basic form, the robot must select a series of primitive actions of going straight, backwards, left, or right. By reading the words "straight", "backwards", "left", and "right", the reader already has a clear understanding of their implication in the physical world. Particularly, it is presumed that moving straight and then left should have higher correlation to moving left and then straight, but for the most part contrast to moving right and then backwards. Moreover, it is well understood that a navigation solution should rarely take the actions straight and backwards successively, as it would postpone the arrival to the desired location. Figure 1 shows a 2-dimensional schematic of these action pairs in an illustrated vector space. We later show (Section 4.2) that similarities, relations, and symmetries relating to such actions are present in context-based action embeddings.

Action embeddings are used to mitigate an agent's learning process on several fronts. First, action representations can be used to improve expressiveness of state representations, sometimes even replacing them altogether (Section 4.1). In this case, the policy space is augmented to include action histories. These policies map current state and action histories to actions, i.e., π : S × A^|H| → A. Through vector action representations, these policies can be efficiently learned, improving an agent's overall performance. A conceptual diagram presenting this approach is depicted in Figure 2. Second, similarity between actions can be leveraged to decrease redundant exploration through grouping of actions. In this paper, we show how similarity between actions can improve approximation of Q-values, as well as devise a cluster-based exploration strategy for efficient exploration in large action space domains (Sections 3, 4.2).
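A minimal sketch of this state-augmentation idea (Figure 2) is shown below: the last |H| actions are embedded and concatenated with the current state before being passed to the policy or Q-network. The embedding table, padding scheme, and feature shapes are assumptions for illustration only.

```python
# Sketch: join the current state with embeddings of the last |H| actions.
import numpy as np

def augment_state(state_features, action_history, embedding_table, history_len):
    """Concatenate state features with the embeddings of the last |H| actions."""
    dim = next(iter(embedding_table.values())).shape[0]
    recent = action_history[-history_len:]
    vectors = [embedding_table.get(a, np.zeros(dim)) for a in recent]
    while len(vectors) < history_len:          # pad at the start of an episode
        vectors.insert(0, np.zeros(dim))
    return np.concatenate([state_features] + vectors)

# Toy usage: a 4-d state, 2-d action embeddings, history length |H| = 2.
table = {"straight": np.array([1.0, 0.0]), "left": np.array([0.0, 1.0])}
x = augment_state(np.zeros(4), ["straight", "left"], table, history_len=2)
print(x.shape)  # (4 + 2 * 2,) = (8,)
```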
Our main contributions in this paper are as follows. (1) We generalize the use of context-based embedding for actions. We show that novel insights can be acquired, portraying our knowledge of the world via similarity, symmetry, and relations in the action embedding space. (2) We offer uses of action representations for state representation, function approximation, and exploration. We demonstrate the advantage in performance on drawing and navigation domains.

This paper is organized as follows. Section 2 describes the general setting of action embeddings. We describe the Skip-Gram with Negative Sampling (SGNS) model used for representing actions and its relation to the pointwise mutual information (PMI) of actions with their context. In Section 3 we show how embeddings can be used for approximating Q-values. We introduce a cluster-based exploration procedure for large action spaces. Section 4 includes empirical uses of action representations for solving reinforcement learning tasks, including drawing a square and a navigation task. We visualize semantic relations of actions in the said domains as well as in the large action space domain of StarCraft II. We conclude the paper with related work (Section 5) and a short discussion on future directions (Section 6).

2. General Setting

Figure 2. A schematic use of action embeddings for state augmentation. Act2Vec is trained over sequences of |H| actions taken from trajectories in the action corpus T_p, sampled from D_p. Representations of action histories (e_{t+1}) are joined with the state (s_{t+1}). The agent π thus maps state and action histories (s_t, a_{t−1}, …, a_{t−|H|}) to an action a_t.

In this section we describe the general framework of Act2Vec. We begin by defining the context assumptions from which we learn action vectors. We then illustrate that embedding actions based on their pointwise mutual information (PMI) with their contexts contributes favorable characteristics. One beneficial property of PMI-based embeddings is that close actions issue similar outcomes. In Section 4 we visualize Act2Vec embeddings of several domains, showing that their inherent structure is aligned with our prior understanding of the tested domains.

A Markov Decision Process (MDP) is defined by a 5-tuple M_R = (S, A, P, R, γ), where S is a set of states, A is a discrete set of actions, P : S × A × S → [0, 1] is a set of transition probabilities, R : S → [0, 1] is a scalar reward, and γ ∈ (0, 1) is a discount factor. We consider the general framework of multi-task RL in which R is sampled from a task distribution D_R. In addition, we consider a corpus T_p of trajectories, (s_0, a_0, s_1, …), relating to demonstrated policies. We assume the demonstrated policies are optimal w.r.t. task MDPs sampled from D_R. More specifically, trajectories in T_p are sampled from a distribution of permissible policies D_p, defined by

P_{π∼D_p}(π) = P(π ∈ Π*_{M_R})   for all π,

where Π*_{M_R} is the set of optimal policies, which maximize the discounted return of M_R.

We consider a state-action tuple (s_t, a_t) at time t, and define its context of width w as the sequence c_w(s_t, a_t) = (s_{t−w}, a_{t−w}, …, s_{t−1}, a_{t−1}, s_{t+1}, a_{t+1}, …, s_{t+w}, a_{t+w}). We will sometimes refer to state-only contexts and action-only contexts as contexts containing only states or only actions, respectively. We denote the set of all possible contexts by C. With abuse of notation, we write (a, c) to denote the pair ⟨(s, a), c_w(s, a)⟩.

In our work we will focus on the pointwise mutual information (PMI) of (s, a) and its context c. Pointwise mutual information is an information-theoretic association measure between a pair of discrete outcomes x and y, defined as PMI(x, y) = log [ P(x, y) / (P(x) P(y)) ]. We will also consider the conditional PMI, denoted by PMI(·, · | ·).

2.1. Act2Vec

We summarize the Skip-Gram neural embedding model introduced in (Mikolov et al., 2013a) and trained using the Negative-Sampling procedure (SGNS) (Mikolov et al., 2013b).

2.2. State-only context

State-only contexts provide us with information about the environment as well as predictions of optimal trajectories. Let us consider the next-state context c = s′. More specifically, given a state-action pair (s, a) we are interested in the measure PMI(a, s′ | s), where here s′ is the random variable depicting the next state. The following lemma shows that when two actions a_1, a_2 have similar PMI with respect to their next-state context, they can be joined into a single action with a small change in value.
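To make the PMI quantities above concrete, the following is a worked sketch (not from the paper) that estimates the conditional PMI(a, s′ | s) from empirical counts over a toy set of demonstrated transitions; the states, actions, and counts are illustrative assumptions.

```python
# Sketch: estimate PMI(a, s' | s) = log [ P(a, s' | s) / (P(a | s) P(s' | s)) ]
# from empirical counts of (s, a, s') transitions in a toy demonstration corpus.
import math

transitions = [
    ("s0", "left", "s1"), ("s0", "left", "s1"), ("s0", "right", "s2"),
    ("s0", "left", "s2"), ("s0", "right", "s2"), ("s0", "straight", "s1"),
]

def conditional_pmi(state, action, next_state):
    in_s = [(a, sp) for s, a, sp in transitions if s == state]
    n = len(in_s)
    p_joint = sum(1 for a, sp in in_s if a == action and sp == next_state) / n
    p_action = sum(1 for a, _ in in_s if a == action) / n
    p_next = sum(1 for _, sp in in_s if sp == next_state) / n
    return math.log(p_joint / (p_action * p_next))

print(round(conditional_pmi("s0", "left", "s1"), 3))  # ≈ 0.288
```

Actions whose conditional PMI profiles are similar across next states can, per the lemma referenced above, be merged with only a small change in value.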
Recommended publications
  • Malware Classification with BERT
San Jose State University, SJSU ScholarWorks, Master's Projects, Master's Theses and Graduate Research, Spring 5-25-2021. Malware Classification with BERT. Joel Lawrence Alvares. Follow this and additional works at: https://scholarworks.sjsu.edu/etd_projects. Part of the Artificial Intelligence and Robotics Commons, and the Information Security Commons.

Malware Classification with Word Embeddings Generated by BERT and Word2Vec. Presented to the Department of Computer Science, San José State University, in partial fulfillment of the requirements for the degree, by Joel Alvares, May 2021. The designated project committee approves the project titled "Malware Classification with BERT" by Joel Lawrence Alvares, approved for the Department of Computer Science, San Jose State University, May 2021: Prof. Fabio Di Troia, Prof. William Andreopoulos, and Prof. Katerina Potika (Department of Computer Science).

ABSTRACT: Malware classification is used to distinguish unique types of malware from each other. This project aims to carry out malware classification using word embeddings, which are used in natural language processing (NLP) to identify and evaluate the relationship between words of a sentence. Word embeddings generated by BERT and Word2Vec for malware samples are used to carry out multi-class classification. BERT is a transformer-based pre-trained natural language processing (NLP) model which can be used for a wide range of tasks such as question answering, paraphrase generation, and next sentence prediction. However, the attention mechanism of a pre-trained BERT model can also be used in malware classification by capturing information about the relation between each opcode and every other opcode belonging to a malware family.
• arXiv:2002.06235v1
Semantic Relatedness and Taxonomic Word Embeddings. Magdalena Kacmajor (Innovation Exchange, IBM Ireland), John D. Kelleher (ADAPT Research Centre, Technological University Dublin), Filip Klubička (ADAPT Research Centre, Technological University Dublin), Alfredo Maldonado (Underwriters Laboratories).

Abstract: This paper connects a series of papers dealing with taxonomic word embeddings. It begins by noting that there are different types of semantic relatedness and that different lexical representations encode different forms of relatedness. A particularly important distinction within semantic relatedness is that of thematic versus taxonomic relatedness. Next, we present a number of experiments that analyse taxonomic embeddings that have been trained on a synthetic corpus that has been generated via a random walk over a taxonomy. These experiments demonstrate how the properties of the synthetic corpus, such as the percentage of rare words, are affected by the shape of the knowledge graph the corpus is generated from. Finally, we explore the interactions between the relative sizes of natural and synthetic corpora on the performance of embeddings when taxonomic and thematic embeddings are combined.

1 Introduction
Deep learning has revolutionised natural language processing over the last decade. A key enabler of deep learning for natural language processing has been the development of word embeddings. One reason for this is that deep learning intrinsically involves the use of neural network models and these models only work with numeric inputs. Consequently, applying deep learning to natural language processing first requires developing a numeric representation for words. Word embeddings provide a way of creating numeric representations that have proven to have a number of advantages over traditional numeric representations of language.
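A small sketch of the random-walk corpus generation described in the abstract above: sentences for training taxonomic embeddings are produced by walking over a toy taxonomy graph. The graph, walk length, and corpus size are illustrative assumptions.

```python
# Sketch: generate a synthetic "taxonomic" corpus via random walks over a taxonomy.
import random

taxonomy = {  # adjacency of a tiny is-a hierarchy (edges stored in both directions)
    "animal": ["dog", "cat", "bird"],
    "dog": ["animal", "poodle"],
    "cat": ["animal"],
    "bird": ["animal", "sparrow"],
    "poodle": ["dog"],
    "sparrow": ["bird"],
}

def random_walk(graph, start, length, rng):
    node, walk = start, [start]
    for _ in range(length - 1):
        node = rng.choice(graph[node])
        walk.append(node)
    return walk

rng = random.Random(0)
corpus = [random_walk(taxonomy, rng.choice(list(taxonomy)), 8, rng) for _ in range(100)]
print(corpus[0])  # one synthetic "sentence" of taxonomically related words
```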
  • Combining Word Embeddings with Semantic Resources
AutoExtend: Combining Word Embeddings with Semantic Resources. Sascha Rothe, LMU Munich; Hinrich Schütze, LMU Munich.

We present AutoExtend, a system that combines word embeddings with semantic resources by learning embeddings for non-word objects like synsets and entities and learning word embeddings that incorporate the semantic information from the resource. The method is based on encoding and decoding the word embeddings and is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The obtained embeddings live in the same vector space as the input word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet, GermaNet, and Freebase as semantic resources. AutoExtend achieves state-of-the-art performance on Word-in-Context Similarity and Word Sense Disambiguation tasks.

1. Introduction
Unsupervised methods for learning word embeddings are widely used in natural language processing (NLP). The only data these methods need as input are very large corpora. However, in addition to corpora, there are many other resources that are undoubtedly useful in NLP, including lexical resources like WordNet and Wiktionary and knowledge bases like Wikipedia and Freebase. We will simply refer to these as resources. In this article, we present AutoExtend, a method for enriching these valuable resources with embeddings for non-word objects they describe; for example, AutoExtend enriches WordNet with embeddings for synsets. The word embeddings and the new non-word embeddings live in the same vector space. Many NLP applications benefit if non-word objects described by resources—such as synsets in WordNet—are also available as embeddings.
  • Word-Like Character N-Gram Embedding
Word-like character n-gram embedding. Geewook Kim, Kazuki Fukui and Hidetoshi Shimodaira. Department of Systems Science, Graduate School of Informatics, Kyoto University; Mathematical Statistics Team, RIKEN Center for Advanced Intelligence Project.

Table 1: Top-10 2-grams in Sina Weibo and 4-grams in Japanese Twitter (Experiment 1). Words are indicated by boldface and space characters are marked by ␣.
Rank | FNE (Chinese) | FNE (Japanese) | WNE, proposed (Chinese) | WNE, proposed (Japanese)
1 | ][ | wwww | 自己 | フォロー
2 | 。␣ | !!!! | 。␣ | ありがと
3 | !␣ | ありがと | ][ | wwww
4 | .. | りがとう | 一个 | !!!!
5 | ]␣ | ございま | 微博 | めっちゃ
6 | 。。 | うござい | 什么 | んだけど
7 | ,我 | とうござ | 可以 | うござい
8 | !! | ざいます | 没有 | line
9 | ␣我 | がとうご | 吗? | 2018
10 | 了, | ください | 哈哈 | じゃない

Abstract: We propose a new word embedding method called word-like character n-gram embedding, which learns distributed representations of words by embedding word-like character n-grams. Our method is an extension of recently proposed segmentation-free word embedding, which directly embeds frequent character n-grams from a raw corpus. However, its n-gram vocabulary tends to contain too many non-word n-grams. We solved this problem by introducing an idea of expected word frequency. Compared to the previously proposed methods, our method can embed more words, along with the words that are not included in a given basic word dictionary. Since our method does not rely on word segmentation with rich word dictionaries, it is especially effective when the

tion tools are used to determine word boundaries in the raw corpus. However, these segmenters require rich dictionaries for accurate segmentation, which are expensive to prepare and not always available.
  • Learned in Speech Recognition: Contextual Acoustic Word Embeddings
LEARNED IN SPEECH RECOGNITION: CONTEXTUAL ACOUSTIC WORD EMBEDDINGS. Shruti Palaskar, Vikas Raunak and Florian Metze. Carnegie Mellon University, Pittsburgh, PA, U.S.A.

ABSTRACT: End-to-end acoustic-to-word speech recognition models have recently gained popularity because they are easy to train, scale well to large amounts of training data, and do not require a lexicon. In addition, word models may also be easier to integrate with downstream tasks such as spoken language understanding, because inference (search) is much simplified compared to phoneme, character or any other sort of sub-word units. In this paper, we describe methods to construct contextual acoustic word embeddings directly from a supervised sequence-to-sequence acoustic-to-word speech recognition model using the learned attention distribution. On a suite of 16 standard sentence evaluation tasks, our embeddings show competitive performance against a word2vec model trained on the

model [10, 11, 12] trained for direct Acoustic-to-Word (A2W) speech recognition [13]. Using this model, we jointly learn to automatically segment and classify input speech into individual words, hence getting rid of the problem of chunking or requiring pre-defined word boundaries. As our A2W model is trained at the utterance level, we show that we can not only learn acoustic word embeddings, but also learn them in the proper context of their containing sentence. We also evaluate our contextual acoustic word embeddings on a spoken language understanding task, demonstrating that they can be useful in non-transcription downstream tasks. Our main contributions in this paper are the following: 1. We demonstrate the usability of attention not only for aligning words to acoustic frames without any forced alignment but also for constructing Contextual Acoustic Word Embeddings (CAWE).
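One simple way to realize the attention-based construction described above is to pool the encoder states of an acoustic-to-word model with the learned attention weights for a given output word; the toy arrays below are stand-ins, not the paper's model.

```python
# Sketch: attention-weighted pooling of encoder frames into one word embedding.
import numpy as np

T, D = 50, 256                       # number of acoustic frames, encoder dimension
encoder_states = np.random.randn(T, D)
attention = np.random.rand(T)
attention /= attention.sum()         # attention distribution over frames for one word

cawe = attention @ encoder_states    # contextual acoustic word embedding, shape (D,)
print(cawe.shape)
```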
  • Survey on Reinforcement Learning for Language Processing
Survey on reinforcement learning for language processing. Víctor Uc-Cetina (Universidad Autónoma de Yucatán), Nicolás Navarro-Guerrero (Aarhus University), Anabel Martin-Gonzalez (Universidad Autónoma de Yucatán), Cornelius Weber (Universität Hamburg), Stefan Wermter (Universität Hamburg). February 2021. arXiv:2104.05565v1 [cs.CL] 12 Apr 2021.

Abstract: In recent years some researchers have explored the use of reinforcement learning (RL) algorithms as key components in the solution of various natural language processing tasks. For instance, some of these algorithms leveraging deep neural learning have found their way into conversational systems. This paper reviews the state of the art of RL methods for their possible use for different problems of natural language processing, focusing primarily on conversational systems, mainly due to their growing relevance. We provide detailed descriptions of the problems as well as discussions of why RL is well-suited to solve them. Also, we analyze the advantages and limitations of these methods. Finally, we elaborate on promising research directions in natural language processing that might benefit from reinforcement learning.

Keywords: reinforcement learning, natural language processing, conversational systems, parsing, translation, text generation

1 Introduction
Machine learning algorithms have been very successful to solve problems in the natural language processing (NLP) domain for many years, especially supervised and unsupervised methods. However, this is not the case with reinforcement learning (RL), which is somewhat surprising since in other domains, reinforcement learning methods have experienced an increased level of success with some impressive results, for instance in board games such as AlphaGo Zero [106].
  • Knowledge-Powered Deep Learning for Word Embedding
Knowledge-Powered Deep Learning for Word Embedding. Jiang Bian, Bin Gao, and Tie-Yan Liu. Microsoft Research. {jibian,bingao,tyliu}@microsoft.com

Abstract. The basis of applying deep learning to solve natural language processing tasks is to obtain high-quality distributed representations of words, i.e., word embeddings, from large amounts of text data. However, text itself usually contains incomplete and ambiguous information, which makes it necessary to leverage extra knowledge to understand it. Fortunately, text itself already contains well-defined morphological and syntactic knowledge; moreover, the large amount of texts on the Web enables the extraction of plenty of semantic knowledge. Therefore, it makes sense to design novel deep learning algorithms and systems in order to leverage the above knowledge to compute more effective word embeddings. In this paper, we conduct an empirical study on the capacity of leveraging morphological, syntactic, and semantic knowledge to achieve high-quality word embeddings. Our study explores these types of knowledge to define new bases for word representation, provide additional input information, and serve as auxiliary supervision in deep learning, respectively. Experiments on an analogical reasoning task, a word similarity task, and a word completion task have all demonstrated that knowledge-powered deep learning can enhance the effectiveness of word embedding.

1 Introduction
With rapid development of deep learning techniques in recent years, it has drawn increasing attention to train complex and deep models on large amounts of data, in order to solve a wide range of text mining and natural language processing (NLP) tasks [4, 1, 8, 13, 19, 20].
  • Lecture 26 Word Embeddings and Recurrent Nets
CS447: Natural Language Processing (http://courses.engr.illinois.edu/cs447). Julia Hockenmaier, 3324 Siebel Center.

Lecture 26: Word Embeddings and Recurrent Nets.

Where we're at: Lecture 25: Word Embeddings and neural LMs; Lecture 26: Recurrent networks; Lecture 27: Sequence labeling and Seq2Seq; Lecture 28: Review for the final exam; Lecture 29: In-class final exam.

Recap: What are neural nets? Simplest variant: single-layer feedforward net. Input layer: vector x; output unit: scalar y. For binary classification tasks there is a single output unit: return 1 if y > 0.5, return 0 otherwise. For multiclass classification tasks the output layer is a vector y with K output units, where each output unit y_i corresponds to class i: return argmax_i(y_i).

Multi-layer feedforward networks: We can generalize this to multi-layer feedforward nets, with an input layer (vector x), hidden layers (vectors h_1, …, h_n), and an output layer (vector y).

Multiclass models: softmax(y_i). Multiclass classification = predict one of K classes; return the class i with the highest score, argmax_i(y_i). In neural networks, this is typically done by using the softmax function, which maps real-valued vectors in R^N into a distribution over the N outputs. For a vector z = (z_0 … z_K): P(i) = softmax(z_i) = exp(z_i) / Σ_{k=0..K} exp(z_k). (NB: This is just logistic regression.)

Neural Language Models: LMs define a distribution over strings: P(w_1 … w_k). LMs factor P(w_1 … w_k) into the probability of each word: P(w_1 … w_k) = P(w_1) · P(w_2 | w_1) · P(w_3 | w_1 w_2) · … · P(w_k | w_1 … w_{k−1}). A neural LM needs to define a distribution over the V words in the vocabulary, conditioned on the preceding words.
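A small sketch of the softmax function described in these slides: it maps a score vector z to the distribution P(i) = exp(z_i) / Σ_k exp(z_k). Subtracting the maximum score before exponentiating is a standard numerical-stability trick added here, not part of the slide.

```python
# Sketch: softmax over a vector of class scores.
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())      # shift by max(z) for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs, probs.sum())        # argmax of probs equals argmax of scores
```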
  • Local Homology of Word Embeddings
Local Homology of Word Embeddings. Tadas Temčinas.

Abstract: Topological data analysis (TDA) has been widely used to make progress on a number of problems. However, it seems that TDA application in natural language processing (NLP) is at its infancy. In this paper we try to bridge the gap by arguing why TDA tools are a natural choice when it comes to analysing word embedding data. We describe a parallelisable unsupervised learning algorithm based on local homology of datapoints and show some experimental results on word embedding data. We see that local homology of datapoints in word embedding data contains some information that can potentially be used to solve the word sense disambiguation problem.

Intuitively, stratification is a decomposition of a topological space into manifold-like pieces. When thinking about stratification learning and word embeddings, it seems intuitive that vectors of words corresponding to the same broad topic would constitute a structure, which we might hope to be a manifold. Hence, for example, by looking at the intersections between those manifolds or singularities on the manifolds (both of which can be recovered using local homology-based algorithms (Nanda, 2017)) one might hope to find vectors of homonyms like 'bank' (which can mean either a river bank, or a financial institution). This, in turn, has potential to help solve the word sense disambiguation (WSD) problem in NLP, which is pinning down a particular meaning of a word used in a sentence when the word has multiple meanings. In this work we present a clustering algorithm based on local homology, which is more relaxed than the stratification-
  • Word Embedding, Sense Embedding and Their Application to Word Sense Induction
Word Embeddings, Sense Embeddings and their Application to Word Sense Induction. Linfeng Song. The University of Rochester, Computer Science Department, Rochester, NY 14627. Area Paper, April 2016.

Abstract: This paper investigates the cutting-edge techniques for word embedding, sense embedding, and our evaluation results on large-scale datasets. Word embedding refers to a kind of methods that learn a distributed dense vector for each word in a vocabulary. Traditional word embedding methods first obtain the co-occurrence matrix then perform dimension reduction with PCA. Recent methods use neural language models that directly learn word vectors by predicting the context words of the target word. Moving one step forward, sense embedding learns a distributed vector for each sense of a word. They either define a sense as a cluster of contexts where the target word appears or define a sense based on a sense inventory. To evaluate the performance of the state-of-the-art sense embedding methods, I first compare them on the dominant word similarity datasets, then compare them on my experimental settings. In addition, I show that sense embedding is applicable to the task of word sense induction (WSI). Actually we are the first to show that sense embedding methods are competitive on WSI by building sense-embedding-based systems that demonstrate highly competitive performances on the SemEval 2010 WSI shared task. Finally, I propose several possible future research directions on word embedding and sense embedding. The University of Rochester Computer Science Department supported this work.

Contents: 1 Introduction; 2 Word Embedding; 2.1 Skip-gram Model
• arXiv:2007.00183v2 [eess.AS] 24 Nov 2020
WHOLE-WORD SEGMENTAL SPEECH RECOGNITION WITH ACOUSTIC WORD EMBEDDINGS. Bowen Shi, Shane Settle, Karen Livescu. TTI-Chicago, USA.

ABSTRACT: Segmental models are sequence prediction models in which scores of hypotheses are based on entire variable-length segments of frames. We consider segmental models for whole-word ("acoustic-to-word") speech recognition, with the feature vectors defined using vector embeddings of segments. Such models are computationally challenging as the number of paths is proportional to the vocabulary size, which can be orders of magnitude larger than when using subword units like phones. We describe an efficient approach for end-to-end whole-word segmental models, with forward-backward and Viterbi decoding performed on a GPU and a simple segment scoring function that reduces space complexity. In addition, we investigate the use of pre-training via jointly trained acoustic word embeddings (AWEs) and acoustically grounded word embeddings (AGWEs) of written word labels. We find that word error rate can be reduced by a large margin by pre-training the acoustic segment representation with AWEs, and additional (smaller) gains can be obtained by pre-training the word prediction layer with AGWEs. Our final models improve over prior A2W models.

Index Terms— speech recognition, segmental model, acoustic-to-word, acoustic word embeddings, pre-training

Fig. 1. Whole-word segmental model for speech recognition. Note: boundary frames are not shared.

segmental models, where the sequence probability is computed based on segment scores instead of frame probabilities. Segmental models have a long history in speech recognition research, but they have been used primarily for phonetic recognition or as phone-level acoustic models [11–18].
  • A Deep Reinforcement Learning Neural Network Folding Proteins
DeepFoldit - A Deep Reinforcement Learning Neural Network Folding Proteins. Dimitra Panou (University of Athens, Department of Informatics and Telecommunications), Martin Reczko (Biomedical Sciences Research Center "Alexander Fleming").

ABSTRACT: Despite considerable progress, ab initio protein structure prediction remains suboptimal. A crowdsourcing approach is the online puzzle video game Foldit [1], that provided several useful results that matched or even outperformed algorithmically computed solutions [2]. Using Foldit, the WeFold [3] crowd had several successful participations in the Critical Assessment of Techniques for Protein Structure Prediction. Based on the recent Foldit standalone version [4], we trained a deep reinforcement neural network called DeepFoldit to improve the score assigned to an unfolded protein, using the Q-learning method [5] with experience replay. This paper is focused on model improvement through hyperparameter tuning. We examined various implementations by examining different model architectures and changing hyperparameter values to improve the accuracy of the model. The new model's hyperparameters also improved its ability to generalize. Initial results, from the latest implementation, show that given a set of small unfolded training proteins, DeepFoldit learns action sequences that improve the score both on the training set and on novel test proteins. Our approach combines the intuitive user interface of Foldit with the efficiency of deep reinforcement learning.

KEYWORDS: ab initio protein structure prediction, Reinforcement Learning, Deep Learning, Convolution Neural Networks, Q-learning

1. ALGORITHMIC BACKGROUND
Machine learning (ML) is the study of algorithms and statistical models used by computer systems to accomplish a given task without using explicit guidelines, relying on inferences derived from patterns. ML is a field of artificial intelligence.
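A minimal sketch of the Q-learning-with-experience-replay scheme mentioned in this abstract; the buffer size, discount factor, and function names are illustrative assumptions, not the DeepFoldit implementation.

```python
# Sketch: experience replay buffer and one-step Q-learning target.
import random
from collections import deque

replay_buffer = deque(maxlen=10_000)   # stores (state, action, reward, next_state, done)

def store(transition):
    replay_buffer.append(transition)

def sample_batch(batch_size):
    return random.sample(list(replay_buffer), min(batch_size, len(replay_buffer)))

def td_target(reward, next_q_values, done, gamma=0.99):
    # Q-learning target: r + gamma * max_a' Q(s', a') for non-terminal transitions.
    return reward if done else reward + gamma * max(next_q_values)
```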