Deep Learning Applications in Medical Image Analysis


Deep Learning Applications in Medical Image Analysis
Vinicius Pavanelli Vianna, FAMB - FFCLRP - USP

Outline: Introduction (motivations, types of medical images) · CNNs (Neocognitron, AlexNet, GoogLeNet) · Medical uses · References

Motivations
• The use of electronic patient records rose from 11.8% to 39.6% between 2007 and 2012.
• Radiologists are limited by speed, fatigue and experience, and may make mistakes or take too long.

Types of medical images
Figure: brain CT showing a stroke on the left side; MRI showing a tumor on the left side; a normal chest X-ray; a normal lung CT; and a histology slide showing a glioma (brain tumor).

CNNs
• CNNs are convolutional neural networks that originated in the Neocognitron (Fukushima, 1980).
• They were formalized and improved by LeCun et al. (1989), who used CNNs to recognize handwritten ZIP-code digits.
• Their use became widespread after Krizhevsky, Sutskever and Hinton (2012) won the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC) with only 16.7% error; the runner-up scored 26%.
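The feature-extraction step these networks share is the 2D convolution: sliding a small learned kernel over the image and taking dot products. A minimal NumPy sketch (not from the slides; the edge-detection kernel is an illustrative choice):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation: slide the kernel over the
    image and take a dot product at every position -- the core
    operation of a convolutional layer."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel responds where intensity changes left to right.
img = np.zeros((5, 5))
img[:, 2:] = 1.0                        # left half dark, right half bright
edge = conv2d(img, np.array([[-1.0, 1.0]]))
```

In a trained CNN the kernel weights are learned from data rather than hand-picked; stacking such layers with pooling yields the hierarchies described next.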
Neocognitron
• Proposed by Kunihiko Fukushima in 1980 at the NHK Broadcasting Science Research Laboratories.
• An improvement on his 1975 Cognitron, which required characters to be correctly positioned.
• Organized in several layers with specific properties (S-cells and C-cells).
• Contained 34,980 neurons.
Figure: source: https://www.youtube.com/watch?v=Qil4kmvm2Sw
Figure: illustration of the Neocognitron from (Fukushima, 1988).

AlexNet
Some numbers for the AlexNet (SuperVision) network:
• Trained for two weeks on two NVIDIA GPUs using gradient descent.
• 650,000 neurons; 60,000,000 parameters; 630,000,000 connections.
• 10,000,000 images in more than 10,000 categories.
• 16.7% error on ImageNet (ILSVRC 2012).
• Written in Python / C++ / CUDA.
Source: http://image-net.org/challenges/LSVRC/2012/supervision.pdf
Figure: illustration of AlexNet from (Krizhevsky, Sutskever and Hinton, 2012).
Figures: features extracted from the input image by layers CONV1, POOL1, CONV2, POOL2, CONV3, CONV4, CONV5 and POOL3.

GoogLeNet
Some numbers for the GoogLeNet network:
• 22 layers.
• 5 million parameters, 12x fewer than AlexNet.
• Requires 2x the computing power of AlexNet.
• 6.7% error on ImageNet (ILSVRC 2014).
• Has no fully connected (FC) layers.
Sources: (Szegedy et al., 2014) and http://image-net.org/challenges/LSVRC/2014/slides/GoogLeNet.pptx
Figure: GoogLeNet (Szegedy et al., 2014), 22 layers.

Medical uses
(Shin et al., 2016)
• Computer-aided detection using deep CNNs.
• Tested the transfer-learning technique successfully.
• Reached the state of the art in thoraco-abdominal lymph-node detection.
• Reported the first cross-validated classification result for interstitial lung disease.

(Liao et al., 2017)
• Won Kaggle's Data Science Bowl 2017, with a US$ 500,000.00 prize.
• Analyzed about 2,000 lung CT scans.
• To work around GPU memory limits imposed by the size of the 3D image data, used a two-stage approach: the first stage located the image regions containing nodules, and the second classified the probability of cancer.

References
Fukushima, K. (1980). "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position". In: Biological Cybernetics 36.4, pp. 193-202. ISSN: 1432-0770. DOI: 10.1007/BF00344251.
Fukushima, K. (1988). "Neocognitron: A hierarchical neural network capable of visual pattern recognition". In: Neural Networks 1.2, pp. 119-130. ISSN: 0893-6080. DOI: 10.1016/0893-6080(88)90014-7.
Krizhevsky, A., I. Sutskever and G. E. Hinton (2012). "ImageNet Classification with Deep Convolutional Neural Networks". In: Advances in Neural Information Processing Systems 25. Ed. by F. Pereira et al. Curran Associates, Inc., pp. 1097-1105. URL: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.
LeCun, Y. et al. (1989). "Backpropagation Applied to Handwritten Zip Code Recognition". In: Neural Computation 1.4, pp. 541-551. ISSN: 0899-7667. DOI: 10.1162/neco.1989.1.4.541.
Liao, F. et al. (2017). "Evaluate the Malignancy of Pulmonary Nodules Using the 3D Deep Leaky Noisy-or Network". In: CoRR abs/1711.08324. arXiv: 1711.08324. URL: http://arxiv.org/abs/1711.08324.
Shin, H. C. et al. (2016). "Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning". In: IEEE Transactions on Medical Imaging 35.5, pp. 1285-1298. ISSN: 0278-0062. DOI: 10.1109/TMI.2016.2528162.
Szegedy, C. et al. (2014). "Going Deeper with Convolutions". In: CoRR abs/1409.4842. arXiv: 1409.4842. URL: http://arxiv.org/abs/1409.4842.
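A closing note on the "leaky noisy-or" aggregation that gives the Liao et al. (2017) reference its name: it combines the second stage's per-nodule malignancy probabilities into one patient-level cancer probability. A minimal sketch of the standard leaky noisy-or rule (the leak value is an assumed illustrative parameter; this simplifies the paper's exact formulation):

```python
import numpy as np

def leaky_noisy_or(nodule_probs, leak=0.01):
    """Patient-level cancer probability from per-nodule probabilities.
    The patient is predicted cancer-free only if every nodule fails
    to 'fire' AND the background leak term fails to fire, so:
    P = 1 - (1 - leak) * prod(1 - p_i)."""
    probs = np.asarray(nodule_probs, dtype=float)
    return 1.0 - (1.0 - leak) * np.prod(1.0 - probs)

# Two independent 50% nodules and no leak give 75% overall.
p = leaky_noisy_or([0.5, 0.5], leak=0.0)
```

The leak term keeps the output above zero even when no nodule is detected, reflecting the chance of a malignancy the detector missed.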