Recent Advances in the Deep CNN Neocognitron


NOLTA, IEICE Invited Paper
Nonlinear Theory and Its Applications, IEICE, vol. 10, no. 4, pp. 304-321. (c) IEICE 2019. DOI: 10.1587/nolta.10.304

Kunihiko Fukushima 1 a)
1 Fuzzy Logic Systems Institute, 680-41 Kawazu, Iizuka, Fukuoka 820-0067, Japan
a) [email protected]

Received October 19, 2018; Published October 1, 2019

Abstract: Deep convolutional neural networks (deep CNN) show great power for robust recognition of visual patterns. The neocognitron, which was first proposed by Fukushima (1979), is a network of this category. Its architecture was suggested by neurophysiological findings on the visual systems of mammals. It acquires the ability to recognize visual patterns robustly through learning. Although the neocognitron has a long history, improvements of the network are still continuing. This paper discusses the recent neocognitron, focusing on differences from the conventional deep CNN. Some other functions of the visual system can also be realized by networks extended from the neocognitron, for example, recognition of partly occluded patterns, the mechanism of selective attention, and so on.

Key Words: deep CNN, pattern recognition, neocognitron, add-if-silent rule, interpolating-vector

1. Introduction

Recently, deep convolutional neural networks (deep CNN) have become very popular in the field of visual pattern recognition (e.g. [1]). The neocognitron, which was first proposed by Fukushima [2-4], is a network of this category. It is a hierarchical multi-layered network. Its architecture was suggested by neurophysiological findings on the visual systems of mammals. It acquires the ability to recognize visual patterns robustly through learning. Although the neocognitron has a long history, improvements of the network are still continuing.

For training intermediate layers of the neocognitron, the learning rule called AiS (Add-if-Silent) is used. Under the AiS rule, a new cell is generated and added to the network if all postsynaptic cells are silent in spite of non-silent presynaptic cells. The generated cell learns the activity of the presynaptic cells in one shot. Once a cell is generated, its input connections do not change any more. Thus the training process is very simple and does not require time-consuming repetitive calculation. (A minimal code sketch of the rule is given at the end of this section.)

In the deepest layer, a method called IntVec (Interpolating-Vector) is used for classifying input patterns based on the features extracted by intermediate layers. For recognition by the IntVec, we search, in the multi-dimensional feature space, for the nearest plane (or line) spanned by a trio (or pair) of reference vectors. Computer simulation shows that the recognition error can be made much smaller by the IntVec than by the WTA (Winner-Take-All) or even by the SVM (support vector machine).

Some other functions of the visual system can also be realized by networks extended from the neocognitron, for example, recognition of partly occluded patterns, the mechanism of selective attention, and so on.

Sections 2 to 5 discuss the recent neocognitron, focusing on differences from the conventional deep CNN. Sections 6 and 7 show some networks extended from the neocognitron.
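As promised above, the following is a minimal sketch of the AiS rule in Python. It is an illustration, not the author's implementation: the silence test uses the similarity measure of Eq. (2) in Section 3, and representing an S-cell layer as a plain list of reference vectors is an assumption made here for brevity.

```python
import numpy as np

def ais_step(x, reference_vectors, theta):
    """One step of the Add-if-Silent rule (simplified sketch).

    x                 : response vector of the presynaptic C-cells (non-negative).
    reference_vectors : list of reference vectors X of the S-cells generated so far.
    theta             : threshold of the S-cells (0 < theta < 1).
    """
    norm_x = np.linalg.norm(x)
    if norm_x == 0.0:
        return reference_vectors  # presynaptic cells are silent: nothing to learn
    # An S-cell is non-silent when its similarity s = (X, x) / (||X|| ||x||)
    # to the stimulus exceeds the threshold theta (cf. Eq. (2) in Section 3).
    all_silent = all(
        np.dot(X, x) / (np.linalg.norm(X) * norm_x) <= theta
        for X in reference_vectors
    )
    if all_silent:
        # All postsynaptic cells are silent despite non-silent presynaptic
        # cells: generate a new cell that learns the presynaptic activity in
        # one shot. Its input connections are then fixed forever.
        reference_vectors.append(x.copy())
    return reference_vectors
```

Because a cell is added only when no existing cell already responds to the stimulus, repeated presentations of similar features do not inflate the layer, and no iterative weight updates are ever needed.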
2. Network architecture of the neocognitron

As shown in Fig. 1, the neocognitron consists of a cascaded connection of a number of stages preceded by an input layer U0. Each stage is composed of a layer of S-cells followed by a layer of C-cells. After learning, S-cells come to work as feature-extracting cells. C-cells pool the responses of S-cells in a retinotopic neighborhood. Incidentally, S-cells behave like simple cells in the visual cortex, and C-cells behave like complex cells. In the figure, U_Sl and U_Cl indicate the layer of S-cells and the layer of C-cells of the l-th stage, respectively.

Fig. 1. The architecture of the neocognitron.

Each layer of the network is divided into a number of sub-layers, called cell-planes, depending on the feature to which cells respond preferentially. A cell-plane is a group of cells that are arranged retinotopically and share the same set of input connections. All cells in a cell-plane have receptive fields of an identical characteristic, but the locations of the receptive fields differ from cell to cell.

The input connections to S-cells are variable and are modified through learning. After learning, each S-cell comes to respond selectively to a particular visual feature. The feature extracted by an S-cell is determined during learning. Generally speaking, local features of the input pattern are extracted in shallow layers, and more global features are extracted in deeper layers.

The output of layer U_Sl is fed to layer U_Cl. C-cells pool the responses of S-cells. Each C-cell has fixed excitatory connections from a group of S-cells whose receptive field locations are slightly deviated. C-cells thus average the responses of S-cells spatially. We can also interpret this as the response of S-cells being spatially blurred by C-cells. Although deformation of an input pattern causes a shift of local features, C-cells can absorb the effect of the shift by spatial averaging.

In the neocognitron, the pooling by C-cells is done by spatial averaging (L2 pooling), not by MAX operation. The spatial averaging is useful, not only for robust recognition of deformed patterns, but also for smoothing additive random noise contained in the responses of S-cells. To reduce the computational cost by decreasing the density of cells, down-sampling of cells is done in the layers of C-cells. Spatial blur (i.e., a low-pass filter for spatial frequency) before down-sampling is also important for suppressing aliasing noise caused by coarse down-sampling.

The output of layer U_Cl is fed to layer U_S(l+1) of the next stage. Based on the features extracted in the intermediate stages, the final classification of the input pattern is made in the deepest stage.

3. Feature extraction by S-cells

3.1 Response of an S-cell

S-cells work as feature extractors. Let vector a be the strength of the excitatory connections to an S-cell from presynaptic C-cells, whose outputs are x (Fig. 2). The V-cell receives signals from the same group of C-cells as does the S-cell and calculates the norm of x. Namely, v = ||x||.

Fig. 2. Connections converging to an S-cell.

The output u of the S-cell is given by

    u = (1 / (1 - θ)) · φ[(a, x) - θ ||x||]            (1)

where φ[ ] is a rectified linear function, defined by φ[x] = max(x, 0). Connection a is given by a = X / ||X||, where X is the training vector (or a linear combination of training vectors) that the S-cell has learned. Then, (1) reduces to

    u = ||x|| · φ[s - θ] / (1 - θ),  where  s = (X, x) / (||X|| · ||x||)            (2)

In the multi-dimensional feature space, s shows a kind of similarity between X and x, defined by the normalized inner product of X and x. If the similarity s is larger than θ, the S-cell yields a non-zero response [4]. Thus θ determines the threshold of the S-cell.
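Equations (1) and (2) translate almost line for line into code. The following is a minimal sketch (not the paper's implementation), assuming numpy vectors and a non-zero input x; it also checks numerically that the two forms agree.

```python
import numpy as np

def s_cell_response(X, x, theta):
    """Response u of an S-cell, following Eqs. (1) and (2) (sketch).

    X     : reference vector learned by the S-cell.
    x     : response vector of the presynaptic C-cells (assumed non-zero).
    theta : threshold of the S-cell (0 < theta < 1).
    """
    phi = lambda t: max(t, 0.0)                         # rectified linear function
    a = X / np.linalg.norm(X)                           # connections: a = X / ||X||
    v = np.linalg.norm(x)                               # V-cell output: v = ||x||
    u = phi(np.dot(a, x) - theta * v) / (1.0 - theta)   # Eq. (1), subtractive inhibition
    # Equivalent form of Eq. (2): u = ||x|| * phi(s - theta) / (1 - theta),
    # with s the normalized inner product (cosine similarity) of X and x.
    s = np.dot(X, x) / (np.linalg.norm(X) * v)
    assert np.isclose(u, v * phi(s - theta) / (1.0 - theta))
    return u
```

Raising theta narrows the region s > θ in which the cell responds at all, i.e., the tolerance area discussed next.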
The area that satisfies s > θ in the multi-dimensional feature space is named the tolerance area of the S-cell (Fig. 3). We call X the reference vector of the S-cell. It represents the preferred feature of the S-cell.

Fig. 3. Tolerance area of an S-cell in the feature space [4].

Now, the distance between two vectors can be defined by the similarity s. In this paper, the nearest vector means the vector that has the largest similarity.

3.2 Robustness to background noise

Different from older versions of the neocognitron, the inhibition from V-cells works, not in a divisional manner, but in a subtractive manner. This is effective for increasing robustness to background noise [5].

Figure 4 illustrates a situation where the background noise is a faint image of another character. As shown in the lower left of the figure, each S-cell of an intermediate layer has a small receptive field and observes only a part of the input pattern.

Fig. 4. The effect of background noise on the response of S-cells of an intermediate layer: comparison between subtractive and divisional inhibition. Recent neocognitrons use subtractive inhibition, while old neocognitrons used divisional inhibition. (modified from [5])

If the inhibition works in a divisional manner, an S-cell whose receptive field (the left circle on U0) covers only a part of the faint background character responds strongly despite the weak intensity of the stimulus. As a result, S-cells in the intermediate layer respond as though the input pattern shown in the lower right of Fig. 4 were presented. Correct recognition of the foreground character is strongly interfered with by the responses of S-cells elicited by these irrelevant features.

On the contrary, if the inhibition works in a subtractive manner, the faint background noise elicits only negligibly small responses from S-cells. Thus, the responses of S-cells are little affected by the background noise. Incidentally, when the receptive field of an S-cell covers parts of both the background and foreground characters together, as shown by the upper circle on U0, the S-cell's response is almost the same with either divisional or subtractive inhibition.

Figure 5 shows how the intensity of the background noise affects the recognition error of the neocognitron.

Fig. 5. The recognition error under background noise, which is a faint image of a different digit. The recognition error remains small if the inhibition to S-cells works in a subtractive manner, but becomes very large if it works in a divisional manner [5].
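The contrast between the two schemes can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the paper's simulation: the faint background feature is modeled by scaling a learned feature vector by a small contrast factor, and the divisional variant is a simplified stand-in, not the exact formula of the old neocognitron.

```python
import numpy as np

def subtractive(X, x, theta):
    """S-cell response with subtractive inhibition, Eq. (1)."""
    a = X / np.linalg.norm(X)
    return max(np.dot(a, x) - theta * np.linalg.norm(x), 0.0) / (1.0 - theta)

def divisional(X, x, theta):
    """Simplified divisional variant (an assumption for illustration):
    the V-cell output divides the excitatory input instead of being
    subtracted from it, so stimulus contrast cancels out of the ratio."""
    v = np.linalg.norm(x)
    if v == 0.0:
        return 0.0
    a = X / np.linalg.norm(X)
    return max(np.dot(a, x) / (theta * v) - 1.0, 0.0)

rng = np.random.default_rng(0)
X = rng.random(25)        # a learned local feature
full = X                  # the feature at full contrast
faint = 0.05 * X          # the same feature at 5% contrast (background noise)
theta = 0.7

# Ratio of the response to the faint input over the response to the full one.
print(subtractive(X, faint, theta) / subtractive(X, full, theta))  # 0.05
print(divisional(X, faint, theta) / divisional(X, full, theta))    # 1.0
```

With division, the 5%-contrast background feature drives the cell exactly as hard as the full-contrast one, which is why faint background characters disrupt recognition; with subtraction, the response scales down with the contrast of the stimulus.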