Arguments For: Semantic Folding and Hierarchical Temporal Memory
Recommended publications
Approximate Pattern Matching Using Hierarchical Graph Construction and Sparse Distributed Representation
Portland State University, PDXScholar, Dissertations and Theses, 9-29-2020. Approximate Pattern Matching Using Hierarchical Graph Construction and Sparse Distributed Representation. Aakanksha Mathuria, Portland State University. Part of the Electrical and Computer Engineering Commons. Recommended Citation: Mathuria, Aakanksha, "Approximate Pattern Matching Using Hierarchical Graph Construction and Sparse Distributed Representation" (2020). Dissertations and Theses. Paper 5581. https://doi.org/10.15760/etd.7453. A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Electrical and Computer Engineering. Thesis Committee: Dan Hammerstrom (Chair), Christof Teuscher, Nirupama Bulusu. Abstract: With recent developments in deep networks, there have been significant advances in visual object detection and recognition. However, some of these networks are still easily fooled/hacked and have shown “bag of features” kinds of failures. Some of this is due to the fact that even deep networks make only marginal use of the complex structure that exists in real-world images. Primate visual systems appear to capture the structure in images, but how? In the research presented here, we are studying approaches for robust pattern matching using static, 2D Blocks World images based on graphical representations of the various components of an image.
Anomalous Behavior Detection Framework Using HTM-Based Semantic Folding Technique
Hindawi, Computational and Mathematical Methods in Medicine, Volume 2021, Article ID 5585238, 14 pages. https://doi.org/10.1155/2021/5585238. Research Article. Anomalous Behavior Detection Framework Using HTM-Based Semantic Folding Technique. Hamid Masood Khan (1), Fazal Masud Khan (1), Aurangzeb Khan (2), Muhammad Zubair Asghar (1), and Daniyal M. Alghazzawi (3). (1) Institute of Computing and Information Technology, Gomal University, D.I.Khan, Pakistan; (2) Department of Computer Science, University of Science and Technology, Bannu, Pakistan; (3) Information Systems Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia. Correspondence should be addressed to Muhammad Zubair Asghar; [email protected]. Received 23 January 2021; Revised 17 February 2021; Accepted 26 February 2021; Published 16 March 2021. Academic Editor: Waqas Haider Bangyal. Copyright © 2021 Hamid Masood Khan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Upon the working principles of the human neocortex, the Hierarchical Temporal Memory model has been developed which is a proposed theoretical framework for sequence learning. Both categorical and numerical types of data are handled by HTM. Semantic Folding Theory (SFT) is based on HTM to represent a data stream for processing in the form of sparse distributed representation (SDR). For natural language perception and production, SFT delivers a solid structural background for semantic evidence description to the fundamentals of the semantic foundation during the phase of language learning. Anomalies are the patterns from data streams that do not follow the expected behavior.
Approximate Pattern Matching Using Hierarchical Graph Construction and Sparse Distributed Representation
Portland State University, PDXScholar, Electrical and Computer Engineering Faculty Publications and Presentations, 7-23-2019. Approximate Pattern Matching using Hierarchical Graph Construction and Sparse Distributed Representation. Aakanksha Mathuria, Electrical and Computer Engineering, Portland State University, USA, [email protected]; Dan W. Hammerstrom, Electrical and Computer Engineering, Portland State University, USA, [email protected]. Part of the Engineering Commons. Citation Details: "Approximate Pattern Matching using Hierarchical Graph Construction and Sparse Distributed Representation," A. Mathuria, D.W. Hammerstrom, International Conference on Neuromorphic Systems, Knoxville, TN, July 23-5, 2019. Abstract: With recent developments in deep networks, there have been significant advances in visual object detection and recognition. However, some of these networks are still easily fooled/hacked and have shown “bag of features” failures. […] training on large numbers of images. None of these techniques actually captures the spatial relationships of the low level or high-level features, which biological networks appear to do. Efficient graph representations capture the higher order information content of the objects and provide algorithmic benefits
Biologically-Inspired Machine Intelligence Technique for Activity Classification in Smart Home Environments
Biologically-Inspired Machine Intelligence Technique for Activity Classification in Smart Home Environments. Talal Sarheed Alshammari. A thesis submitted in partial fulfillment of the requirement of Staffordshire University for the degree of Doctor of Philosophy, May 2019. List of Publications. Refereed journal publications: OpenSHS: Open Smart Home Simulator, Sensors Journal, MDPI, May 2017; SIMADL: Simulated Activities of Daily Living Dataset, Data Journal, MDPI, April 2018. Refereed conference publications: Evaluating Machine Learning Techniques for Activity Classification in Smart Home Environments, World Academy of Science, Engineering and Technology International Journal of Information and Communication Engineering, February 2018. Acknowledgements: I would like to thank my supervisors Dr. Mohamed Sedky and Chris Howard for encouraging and supporting me during my Ph.D study. They were very helpful and provided me their assistance throughout my Ph.D journey. It was very good experience to improve my research skills. Due to this experience, I learned new knowledge, found new ideas from your supervision. I appreciate your consideration, efforts and kind cooperation, you deserve all the thanks. I would like to thank to my parents and my wife for their unconditional support that has helped me for achieving this thesis. I would like to thank the Ministry of Education in Saudi Arabia and Hail university for funding and supporting this research.
Distributional Semantics
Distributional semantics is a research area that develops and studies theories and methods for quantifying and categorizing semantic similarities between linguistic items based on their distributional properties in large samples of language data. The basic idea of distributional semantics can be summed up in the so-called Distributional hypothesis: linguistic items with similar distributions have similar meanings. 1 Distributional hypothesis: The distributional hypothesis in linguistics is derived from the semantic theory of language usage, i.e. words that are used and occur in the same contexts tend to purport similar meanings.[1] The underlying idea that “a word is characterized by the company it keeps” was popularized by Firth.[2] The Distributional Hypothesis is the basis for statistical semantics. Although the Distributional Hypothesis originated in linguistics,[3] it is now re- […] by populating the vectors with information on which text regions the linguistic items occur in; paradigmatic similarities can be extracted by populating the vectors with information on which other linguistic items the items co-occur with. Note that the latter type of vectors can also be used to extract syntagmatic similarities by looking at the individual vector components. The basic idea of a correlation between distributional and semantic similarity can be operationalized in many different ways. There is a rich variety of computational models implementing distributional semantics, including latent semantic analysis (LSA),[8] Hyperspace Analogue to Language (HAL), syntax- or dependency-based models,[9] random indexing, semantic folding[10] and various variants of the topic model. Distributional semantic models differ primarily with respect to the following parameters: • Context type (text regions vs.
Embeddings in Natural Language Processing
Embeddings in Natural Language Processing: Theory and Advances in Vector Representation of Meaning. Mohammad Taher Pilehvar, Tehran Institute for Advanced Studies; Jose Camacho-Collados, Cardiff University. Synthesis Lectures on Human Language Technologies (draft), Morgan & Claypool Publishers. Abstract: Embeddings have been one of the dominating buzzwords since the early 2010s for Natural Language Processing (NLP). Encoding information into a low-dimensional vector representation, which is easily integrable in modern machine learning algorithms, has played a central role in the development in NLP. Embedding techniques initially focused on words but the attention soon started to shift to other forms: from graph structures, such as knowledge bases, to other types of textual content, such as sentences and documents. This book provides a high level synthesis of the main embedding techniques in NLP, in the broad sense. The book starts by explaining conventional word vector space models and word embeddings (e.g., Word2Vec and GloVe) and then moves to other types of embeddings, such as word sense, sentence and document, and graph embeddings. We also provide an overview on the status of the recent development in contextualized representations (e.g., ELMo, BERT) and explain their potential in NLP. Throughout the book the reader can find both essential information for understanding a certain topic from scratch, and an in-breadth overview of the most successful techniques developed in the literature. Keywords: Natural Language Processing, Embeddings, Semantics. Contents: 1 Introduction; 1.1 Semantic representation; 1.2 One-hot representation; 1.3 Vector Space Models; 1.4 The Evolution Path of representations.
Sociolinguistic Properties of Word Embeddings
Sociolinguistic Properties of Word Embeddings (Draft). Alina Arseniev-Koehler and Jacob G. Foster, UCLA Department of Sociology. Introduction: Just a single word can prime our thinking and initiate a stream of associations. Not only do words denote concepts or things-in-the-world; they can also be combined to convey complex ideas and evoke connotations and even feelings. The meanings of our words are also contextual: they vary across time, space, linguistic communities, interactions, and even specific sentences. As rich as words are to us, in their raw form they are merely a sequence of phonemes or characters. Computers typically encounter words in their written form. How can this sequence become meaningful to computers? Word embeddings are one approach to represent word meanings numerically. This representation enables computers to process language semantically as well as syntactically. Embeddings gained popularity because the “meaning” they captured corresponds to human meanings in unprecedented ways. In this chapter, we illustrate these surprising correspondences at length. Because word embeddings capture meaning so effectively, they are a key ingredient in a variety of downstream tasks involving natural language (Artetxe et al., 2017; Young et al., 2018). They are used ubiquitously in tasks like translating languages (Artetxe et al., 2017; Mikolov, Le, and Sutskever, 2013) and parsing clinical notes (Ching et al., 2018). But embeddings are more than just tools for computers to represent words. Word embeddings can represent human meanings because they learn the meanings of words from human language use, such as news, books, crawling the web, or even television and movie scripts.
Using Semantic Folding with Textrank for Automatic Summarization
Degree Project in Computer Science and Engineering, Second Cycle, 30 Credits. Stockholm, Sweden 2017. Using semantic folding with TextRank for automatic summarization. Simon Karlsson, [email protected]. KTH Royal Institute of Technology, School of Computer Science and Communication. Master in Computer Science. Date: June 27, 2017. Principal: Findwise AB. Supervisor at Findwise: Henrik Laurentz. Supervisor at KTH: Stefan Nilsson. Examiner at KTH: Olov Engwall. Swedish title: TextRank med semantisk vikning för automatisk sammanfattning. Abstract: This master thesis deals with automatic summarization of text and how semantic folding can be used as a similarity measure between sentences in the TextRank algorithm. The method was implemented and compared with two common similarity measures. These two similarity measures were cosine similarity of tf-idf vectors and the number of overlapping terms in two sentences. The three methods were implemented and the linguistic features used in the construction were stop words, part-of-speech filtering and stemming. Five different part-of-speech filters were used, with different mixtures of nouns, verbs, and adjectives. The three methods were evaluated by summarizing documents from the Document Understanding Conference and comparing them to gold-standard summarization created by human judges. Comparison between the system summaries and gold-standard summaries was made with the ROUGE-1 measure. The algorithm with semantic folding performed worst of the three methods, but only 0.0096 worse in F-score than cosine similarity of tf-idf vectors that performed best. For semantic folding, the average precision was 46.2% and recall 45.7% for the best-performing part-of-speech filter.
Semantic Structure and Interpretability of Word Embeddings
Semantic Structure and Interpretability of Word Embeddings. Lütfi Kerem Şenel, Student Member, IEEE, İhsan Utlu, Veysel Yücesoy, Student Member, IEEE, Aykut Koç, Member, IEEE, and Tolga Çukur, Senior Member, IEEE. Abstract: Dense word embeddings, which encode meanings of words to low dimensional vector spaces, have become very popular in natural language processing (NLP) research due to their state-of-the-art performances in many NLP tasks. Word embeddings are substantially successful in capturing semantic relations among words, so a meaningful semantic structure must be present in the respective vector spaces. However, in many cases, this semantic structure is broadly and heterogeneously distributed across the embedding dimensions making interpretation of dimensions a big challenge. In this study, we propose a statistical method to uncover the underlying latent semantic structure in the dense word embeddings. To perform our analysis, we introduce a new dataset (SEMCAT) that contains more than 6,500 words semantically grouped under 110 categories. We further propose a method to quantify the interpretability of the word embeddings. […] have been receiving increasing interest in NLP research. Approaches that learn embeddings include neural network based predictive methods [2], [8] and count-based matrix-factorization methods [9]. Word embeddings brought about significant performance improvements in many intrinsic NLP tasks such as analogy or semantic textual similarity tasks, as well as downstream NLP tasks such as part-of-speech (POS) tagging [10], named entity recognition [11], word sense disambiguation [12], sentiment analysis [13] and cross-lingual studies [14]. Although high levels of success have been reported in many NLP tasks using word embeddings, the individual embedding dimensions are commonly considered to be uninterpretable
A Compositional and Interpretable Semantic Space
A Compositional and Interpretable Semantic Space. Alona Fyshe (1), Leila Wehbe (1), Partha Talukdar (2), Brian Murphy (3), and Tom Mitchell (1). (1) Machine Learning Department, Carnegie Mellon University, Pittsburgh, USA; (2) Indian Institute of Science, Bangalore, India; (3) Queen’s University Belfast, Belfast, Northern Ireland. [email protected], [email protected], [email protected], [email protected], [email protected]. Abstract: Vector Space Models (VSMs) of Semantics are useful tools for exploring the semantics of single words, and the composition of words to make phrasal meaning. While many methods can estimate the meaning (i.e. vector) of a phrase, few do so in an interpretable way. We introduce a new method (CNNSE) that allows word and phrase vectors to adapt to the notion of composition. Our method learns a VSM that is both tailored to support a chosen semantic composition operation, and whose resulting features have an intuitive interpretation […] like a black box - it is unclear what VSM dimensions represent (save for broad classes of corpus statistic types) and what the application of a composition function to those dimensions entails. Neural network (NN) models are becoming increasingly popular (Socher et al., 2012; Hashimoto et al., 2014; Mikolov et al., 2013; Pennington et al., 2014), and some model introspection has been attempted: Levy and Goldberg (2014) examined connections between layers, Mikolov et al. (2013) and Pennington et al. (2014) explored how shifts in VSM space encodes semantic relationships. Still, interpreting NN VSM dimensions, or factors, remains elusive.
Biological and Machine Intelligence (BAMI)
Biological and Machine Intelligence (BAMI): A living book that documents Hierarchical Temporal Memory (HTM). March 8, 2017. ©Numenta, Inc. 2017. For more details on the use of Numenta’s software and intellectual property, including the ideas contained in this book, see http://numenta.com/business-strategy-and-ip/. Biological and Machine Intelligence (BAMI) is a living book authored by Numenta researchers and engineers. Its purpose is to document Hierarchical Temporal Memory, a theoretical framework for both biological and machine intelligence. While there’s a lot more work to be done on HTM theory, we have made good progress on several components of a comprehensive theory of the neocortex and how to apply that knowledge to practical applications. We would like to share our work as we go. We hope this book will become the standard reference for people who want to learn about HTM cortical theory and its applications for machine intelligence. Going forward we anticipate using two main forms of documentation of our research: one is published research papers and the other is additions to this book, Biological and Machine Intelligence. Just as we make all of our research and technology available in open source, we want to be transparent with this manuscript as well, even well ahead of its completion. When it is finished, this book will cover all the fundamental concepts of HTM theory. For now, we are sharing the chapters that have been completed, each of which contains a revision history, so you can see what has changed in each chapter. Over time, we will add chapters to BAMI.
A Vector Semantics Approach to the Geoparsing Disambiguation Task for Texts in Spanish
Kalpa Publications in Computing, Volume 13, 2019, Pages 47-55. Proceedings of the 1st International Conference on Geospatial Information Sciences. A Vector Semantics Approach to the Geoparsing Disambiguation Task for Texts in Spanish. Filomeno Alcántara (1), Alejandro Molina (2), and Victor Muñiz (3). (1) Centro de Investigación en Matemáticas, Monterrey, Nuevo León, México, [email protected], http://www.cimat.mx; (2) CONACYT - Centro de Investigación en Ciencias de Información Geoespacial, Mérida, Yucatán, México, [email protected], http://mid.geoint.mx; (3) Centro de Investigación en Matemáticas, Monterrey, Nuevo León, México, victor [email protected], http://www.cimat.mx. Abstract: Nowadays, online news sources generate continuous streams of information that includes references to real locations. Linking these locations to coordinates in a map usually requires two steps involving the named entity: extraction and disambiguation. In past years, efforts have been devoted mainly to the first task. Approaches to location disambiguation include knowledge-based, map-based and data-driven methods. In this paper, we present a work in progress for location disambiguation in news documents that uses a vector-semantic representation learned from information sources that include events and geographic descriptions, in order to obtain a ranking for the possible locations. We will describe the proposed method and the results obtained so far, as well as ideas for future work. 1 Introduction: Nowadays, online news sources generate continuous streams of information with global coverage that directly impact the daily activities of people in a wide variety of areas. Although social media can also generate real-time news content, newspapers remain a source of information that incorporates a refined version of relevant events.