Neocognitron: A neural network model for a mechanism of visual pattern recognition - Kunihiko Fukushima, Sei Miyake, Takayuki Ito
Vijay Veerabadran1
1Department of Cognitive Science University of California, San Diego
CSE 254 Paper Presentation, Fall 2018
Outline

1 Overview of the visual cortex
2 Hubel and Wiesel architecture of the visual cortex
  - Hubel and Wiesel experiment - Cat visual cortex
  - Conclusions from the experiment
3 The Neocognitron - Inspired by H & W
  - Motivation
  - High-level description of the Neocognitron
  - Architecture of the neocognitron
  - Feedforward pass
  - Learning algorithm
  - Response of simple and complex cells
4 Neocognitron for number recognition
  - Success case
  - Failure case
5 Computational models of visual pattern recognition
6 Acknowledgements
Overview of the visual cortex
The visual cortex, an intricate component of the animal brain, lies at the rear of the cerebral cortex. It processes visual information in a hierarchical manner and is broadly divided into the early, middle, and higher visual areas.
Hubel and Wiesel experiment - Cat visual cortex
A neurophysiological experiment published in 1959, aimed at understanding the functional architecture of the visual cortex. A cell in the V1 area of a cat's visual cortex was studied using a then-new single-cell electrophysiology technique. A video of the experiment, with sounds of the neuronal firing, is on YouTube: https://www.youtube.com/watch?v=8VdFf3egwfg
Conclusions - Simple and complex cells
Prior knowledge of synapses - Inhibitory, Excitatory
Figure: A Synapse in the nervous system
Figure: Excitatory synapse vs Inhibitory synapse
Conclusions - Simple and complex cell properties
The experiment uncovered some then-unknown facts about the architecture of the visual cortex, establishing a concrete functional architecture of the simple cells and complex cells in the early visual cortex. Shown below is an example figure from Hubel and Wiesel (1962), showing a few receptive fields of simple cells.
Motivation - Positional invariance
A key drawback of the visual recognition models proposed at the time was that they were unable to learn invariance to position. As a result, these models performed suboptimally when the test images presented were deformed. Solution: a layered neural network model called 'the neocognitron' that systematically grows a tolerance to deformations at each of its layers.
High-level description of the Neocognitron
”The neocognitron is a multilayered network consisting of a cascaded connection of many layers of cells. The information of the stimulus pattern given to the input layer is processed step by step in each stage of the multilayered network. The synapses between the cells in the network are modifiable, and the neocognitron has a function of learning.”
Architecture - Constituent layers
The neocognitron contains two main types of alternating layers: the simple cell layer (or simple cell module) and the complex cell layer.
- Simple cell layer: the main feature extraction layers of the neocognitron, analogous to convolutional layers in modern CNNs.
- Complex cell layer: aggregates information from neighboring simple cells with like tuning in the previous simple cell layer. Builds positional invariance by functioning like an OR gate: it is activated when at least one of its presynaptic inputs is active.
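To make the division of labor concrete, here is a minimal NumPy sketch of the two layer types: an S-like layer that slides feature templates over the image and thresholds the response, and a C-like layer that OR-pools (max-pools) each plane over small neighborhoods. All names, shapes, and the threshold are illustrative, not the paper's exact equations.

```python
import numpy as np

def s_layer(image, templates, threshold=0.5):
    """Toy simple-cell layer: each template is slid over the image
    (valid cross-correlation) and thresholded, yielding one cell
    plane per feature."""
    H, W = image.shape
    h, w = templates[0].shape
    planes = []
    for t in templates:
        out = np.zeros((H - h + 1, W - w + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i+h, j:j+w] * t)
        planes.append((out > threshold).astype(float))
    return np.stack(planes)

def c_layer(planes, pool=2):
    """Toy complex-cell layer: OR-like max pooling over small
    neighborhoods of each simple-cell plane, building positional
    tolerance."""
    K, H, W = planes.shape
    out = np.zeros((K, H // pool, W // pool))
    for k in range(K):
        for i in range(0, H - H % pool, pool):
            for j in range(0, W - W % pool, pool):
                out[k, i // pool, j // pool] = planes[k, i:i+pool, j:j+pool].max()
    return out
```

After pooling, a small shift of a feature within a pooling neighborhood leaves the C-plane response unchanged, which is the invariance mechanism the slides describe.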
Architecture - Multi-layered structure
Architecture - Some relevant jargon
Figure: Relevant jargon, labeled on the network diagram — the photoreceptor layer, simple cell layers and complex cell layers (indexed by l), simple cell array planes and complex cell planes (indexed by k), and a cell's receptive field.
Feedforward pass - Simple cell output
The output (or activation) of a simple cell at position n in the k-th cell plane of the l-th simple cell module is given by the following equation:

u_{Sl}(k, n) = r_l \cdot \varphi\left[ \frac{1 + \sum_{\kappa} \sum_{\nu \in A_l} a_l(\kappa, \nu, k)\, u_{Cl-1}(\kappa, n+\nu)}{1 + \frac{r_l}{1+r_l}\, b_l(k)\, v_{Cl-1}(n)} - 1 \right]

Notation:
- n → (x, y) position in the immediate input layer
- κ → index of cell planes in the previous (complex cell) module
- A_l → receptive field size
- a_l and b_l → modifiable excitatory and inhibitory synapse efficiencies
- c_{l-1}(ν) → unmodifiable synapse efficiencies
- r_l → fixed parameter controlling inhibition (one value per layer)

Nonlinearity:

\varphi(x) = \begin{cases} x & x \ge 0 \\ 0 & x < 0 \end{cases} \quad (1)
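The equation above can be evaluated directly. Below is a NumPy sketch for a single S cell; the shapes (K_prev planes by |A_l| receptive-field positions) and parameter values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def phi(x):
    # Half-wave rectification: phi(x) = x for x >= 0, else 0.
    return np.maximum(x, 0.0)

def s_cell(u_prev, a, b, c, r):
    """One S-cell response following the slide's equation.
    u_prev: inputs u_{Cl-1}(kappa, n+nu) in the receptive field,
            shape (K_prev, |A_l|)
    a:      excitatory synapses a_l(kappa, nu, k), same shape
    b:      inhibitory synapse b_l(k), scalar
    c:      unmodifiable synapse efficiencies c_{l-1}(nu), shape (|A_l|,)
    r:      inhibition-controlling parameter r_l
    """
    # Inhibitory V-cell input: weighted RMS of the receptive field.
    v = np.sqrt(np.sum(c * u_prev**2))
    excitation = 1.0 + np.sum(a * u_prev)
    inhibition = 1.0 + (r / (1.0 + r)) * b * v
    return r * phi(excitation / inhibition - 1.0)
```

The shunting (divisive) inhibition in the denominator is what makes the cell respond to the shape of the input pattern rather than its overall intensity.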
Feedforward pass - Simple cell output diagram
Figure: Pictorial representation of the output of a simple cell — excitatory inputs u_{Cl-1}(κ, n+ν) from the U_{Cl-1} planes, weighted by a_l(κ, ν, k'), and an inhibitory input v_{Cl-1}(n) (a local RMS of u_{Cl-1}, computed in the V_{Cl-1} plane), weighted by b_l(k'), producing u_{Sl}(k', n) in the U_{Sl} plane.
Feedforward pass - Complex cell output
The output of a complex cell at position n in the k-th cell plane of the l-th complex cell module is given by the following equation:

u_{Cl}(k, n) = \psi\left[ \frac{1 + \sum_{\kappa} j_l(\kappa, k) \sum_{\nu \in D_l} d_l(\nu)\, u_{Sl}(\kappa, n+\nu)}{1 + v_{Sl}(n)} - 1 \right]

Notation:
- n → (x, y) position in the immediate input layer
- κ → index of cell planes in the previous (simple cell) module
- j_l(κ, k) → binary indicator of a connection from the κ-th simple cell plane in the previous layer to the k-th complex cell plane under computation
- d_l → efficiency of the excitatory synapses from the previous simple cell module

Nonlinearity:

\psi(x) = \begin{cases} \dfrac{x}{\alpha_l + x} & x \ge 0 \\ 0 & x < 0 \end{cases} \quad (2)
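A NumPy sketch of a single C cell follows; shapes and parameter values are illustrative assumptions.

```python
import numpy as np

def psi(x, alpha):
    # Saturating rectification: psi(x) = x / (alpha + x) for x >= 0, else 0.
    return x / (alpha + x) if x >= 0 else 0.0

def c_cell(u_s, j, d, v_s, alpha):
    """One C-cell response following the slide's equation.
    u_s:   simple-cell responses u_{Sl}(kappa, n+nu), shape (K_s, |D_l|)
    j:     binary connection indicators j_l(kappa, k), shape (K_s,)
    d:     excitatory synapse efficiencies d_l(nu), shape (|D_l|,)
    v_s:   inhibitory input v_{Sl}(n), scalar
    alpha: saturation parameter alpha_l of the nonlinearity
    """
    excitation = 1.0 + np.sum(j[:, None] * d[None, :] * u_s)
    return psi(excitation / (1.0 + v_s) - 1.0, alpha)
```

Because ψ saturates below 1, the C cell's response stays bounded no matter how many of its S-cell inputs are active, consistent with the OR-gate behavior described earlier.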
Feedforward pass - Complex cell output diagram
Figure: Pictorial representation of the output of a complex cell — excitatory inputs u_{Sl}(κ, n+ν) from the U_{Sl} planes, weighted by d_l(ν) and gated by the binary connection indicator j_l(κ, k'), producing u_{Cl}(k', n) in the U_{Cl} plane.
Learning algorithm - Reinforcement of synapses
The neocognitron is trained in a supervised manner, sequentially from the distal (early) layers to the deeper layers, one cell plane at a time:
- Training begins with the cell planes of the first simple cell layer (U_{S1}). All modifiable synapses are initialized to 0.
- For each cell plane k̂ in layer l:
  - The "teacher" presents a training pattern to the input layer and simultaneously picks a "representative" S cell in that cell plane, say u_{Sl}(k̂, n̂).
  - The modifiable excitatory synapses are reinforced by \Delta a_l(\kappa, \nu, \hat{k}) = q_l \, c_{l-1}(\nu)\, u_{Cl-1}(\kappa, \hat{n} + \nu)
  - The modifiable inhibitory synapse is reinforced by \Delta b_l(\hat{k}) = q_l \, v_{Cl-1}(\hat{n})
  - All the remaining cells in the cell plane are trained with the same update, so every cell in a plane shares the same synapses.
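One reinforcement step for a representative cell can be sketched as follows; the shapes and learning rate are illustrative, and in the full scheme the same updated synapses are shared by every cell of the plane.

```python
import numpy as np

def reinforce(a, b, c, u_prev, q):
    """One reinforcement step for the representative S cell of a plane.
    a:      excitatory synapses a_l(kappa, nu, k_hat), shape (K_prev, |A_l|)
    b:      inhibitory synapse b_l(k_hat), scalar
    c:      unmodifiable synapse efficiencies c_{l-1}(nu), shape (|A_l|,)
    u_prev: inputs u_{Cl-1}(kappa, n_hat + nu) in the representative
            cell's receptive field, same shape as a
    q:      learning rate q_l
    Returns the updated (a, b)."""
    # Delta a = q_l * c_{l-1}(nu) * u_{Cl-1}(kappa, n_hat + nu)
    a = a + q * c * u_prev
    # v_{Cl-1}(n_hat) is the weighted RMS of the inputs; Delta b = q_l * v
    v = np.sqrt(np.sum(c * u_prev**2))
    b = b + q * v
    return a, b
```

Starting from zero synapses, a single step imprints the presented pattern into a (scaled by c and q) while b records the pattern's weighted norm, which is exactly the setup used in the intuition slides that follow.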
Learning algorithm - Simple cell update
In this part, we shall build intuition for the learning rule used in the neocognitron. Without loss of generality, consider the training of a simple cell u in the first simple cell layer (U_{S1}). u receives its input p(ν) from its input layer (U_0); a(ν) and b represent the modifiable excitatory and inhibitory synapses, and c(ν) represents the unmodifiable excitatory synapses from the input to the local RMS map (V).
Learning algorithm - Intuition
As assumed above, p(ν) is the response of the cells in the receptive field of u. With this notation, the output of u can be written as

u = r \cdot \varphi\left[ \frac{1 + \sum_{\nu} a(\nu)\, p(\nu)}{1 + \frac{r}{1+r}\, b\, v} - 1 \right] \quad (1)

v = \sqrt{\sum_{\nu} c(\nu)\, p^2(\nu)} \quad (2)

Let

s = \frac{\sum_{\nu} a(\nu)\, p(\nu)}{b\, v} \quad (3)

Provided a(ν) and b are large (so the 1's in (1) are negligible), u reduces to

u = r \cdot \varphi\left( \frac{1+r}{r}\, s - 1 \right) \quad (4)
Learning algorithm - Simple cell update
Initially, all modifiable synapses are set to 0. From the learning rules, the amounts of reinforcement for a and b are

\Delta a(\nu) = q\, c(\nu)\, p(\nu), \qquad \Delta b = q\, v

Consider presenting a training pattern P, with U_0 response P(ν) in the receptive field of u. The synapses are then set as follows:

a(\nu) = 0 + \Delta a(\nu) = q\, c(\nu)\, P(\nu) \quad (5)

b = 0 + \Delta b = q\, v = q \sqrt{\sum_{\nu} c(\nu)\, P^2(\nu)} \quad (6)
Learning algorithm - Simple cell update
Plugging (5) and (6) into (3),

s = \frac{\sum_{\nu} q\, c(\nu)\, P(\nu)\, p(\nu)}{q \sqrt{\sum_{\nu} c(\nu)\, P^2(\nu)} \cdot v}

We know from (2) that v = \sqrt{\sum_{\nu} c(\nu)\, p^2(\nu)}. Hence,

s = \frac{\sum_{\nu} c(\nu)\, P(\nu)\, p(\nu)}{\sqrt{\sum_{\nu} c(\nu)\, P^2(\nu)} \cdot \sqrt{\sum_{\nu} c(\nu)\, p^2(\nu)}}

Therefore, s is a weighted, normalized dot product — a cosine similarity — between P(ν) and p(ν). Since u = r \cdot \varphi\left( \frac{1+r}{r}\, s - 1 \right), the cell u fires if s > \frac{r}{1+r} for an arbitrary input pattern, i.e., when the input is sufficiently similar to the training pattern.
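The derivation can be checked numerically. In the sketch below all values (q, r, and the dimensionality) are arbitrary choices for illustration.

```python
import numpy as np

# Numerical check of the derivation above; all values are arbitrary.
rng = np.random.default_rng(0)
c = rng.uniform(0.5, 1.0, 8)   # unmodifiable synapse efficiencies c(nu)
P = rng.uniform(0.1, 1.0, 8)   # training pattern response P(nu)
p = rng.uniform(0.1, 1.0, 8)   # arbitrary test response p(nu)
q, r = 10.0, 1.0

# Synapses after one reinforcement from zero initialization, eqs. (5)-(6)
a = q * c * P
b = q * np.sqrt(np.sum(c * P**2))

v = np.sqrt(np.sum(c * p**2))          # eq. (2)
s = np.sum(a * p) / (b * v)            # eq. (3)

# s is a weighted cosine similarity between P and p, so 0 < s <= 1
cos = np.sum(c * P * p) / (np.sqrt(np.sum(c * P**2)) * v)
assert abs(s - cos) < 1e-9
assert 0.0 < s <= 1.0

# The cell fires exactly when s exceeds r / (1 + r), per eqs. (1) and (4)
u = r * max((1 + np.sum(a * p)) / (1 + r / (1 + r) * b * v) - 1, 0.0)
assert (u > 0) == (s > r / (1 + r))
```

Note that the firing condition holds exactly for any q, because the denominator comparison in (1) reduces to s versus r/(1+r); only the approximate value of u in (4) relies on q being large.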
Response of simple and complex cells - Example
Figure: Example cell responses
Neocognitron for number recognition - Success case
Figure: Training pattern Figure: Distorted test pattern
Neocognitron for number recognition - Failure case
Figure: Test cases where Neocognitron makes mistakes
Computational models of visual pattern recognition
Shown below is a comparison of three groundbreaking models of hierarchical visual pattern recognition.
Hubel and Wiesel model (1962)
Neocognitron, Fukushima (1980, 1988, 2003)
LeNet, Yann LeCun (1998)
Figure: Comparing models of visual processing
Acknowledgements
David Hubel, Torsten Wiesel, Kunihiko Fukushima, Yann LeCun
Thank you
There has been a myth that the brain cannot understand itself. It is compared to a man trying to lift himself by his own bootstraps. We feel that is nonsense. The brain can be studied just as the kidney can. – Dr. David Hubel
Thank you for being a great audience! Questions?
Vijay Veerabadran, Office: SSRB 244, Email: [email protected], Dept. of Cognitive Science, UC San Diego, La Jolla