From Neural Networks to Deep Learning
David Donoho and Vardan Papyan

McCulloch-Pitts model of a neuron (1943)

● Several excitatory binary inputs
● Single binary output
● Single inhibitory input (if it is on, the neuron cannot fire)
● If the sum of the inputs is larger than a critical value, the neuron fires
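The rules above fit in a few lines. A minimal sketch, assuming a simple threshold unit (the function name and the threshold value are illustrative, not from the original model's notation):

```python
def mp_neuron(excitatory, inhibitory, threshold):
    """McCulloch-Pitts unit: binary inputs, binary output.

    Any active inhibitory input vetoes firing; otherwise the neuron
    fires iff the sum of excitatory inputs reaches the (hard-coded)
    threshold."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0

# With two excitatory inputs, threshold 2 gives AND, threshold 1 gives OR.
print(mp_neuron([1, 1], [], 2))   # 1: fires
print(mp_neuron([1, 0], [], 2))   # 0: below threshold
print(mp_neuron([1, 1], [1], 2))  # 0: inhibited, cannot fire
```

Note that the threshold plays the role of the hand-chosen weights: changing it changes which Boolean function the unit computes.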

Functions that can be represented: logical gates such as AND, OR, and NOT

Problems:
● The weights need to be hard-coded by the user
● Cannot represent XOR

Rosenblatt's single layer perceptron (1957)

● The synaptic weights are not restricted to unity
● Weights learned from data

Problem with perceptron (1969)

Marvin Minsky, founder of the MIT AI Lab, and Seymour Papert, director of the lab at the time, claimed in their 1969 book Perceptrons:
● Multilayer perceptrons (MLPs) are needed to represent simple nonlinear functions such as XOR
● No one had found a viable way to train MLPs well enough to learn such simple functions
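Both claims can be checked in a few lines. The sketch below trains a single-layer perceptron with Rosenblatt's update rule (learning rate and epoch count are illustrative) and shows it never fits XOR, while a one-hidden-layer network with weights simply written down by hand represents XOR exactly:

```python
def step(z):
    return 1 if z >= 0 else 0

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def train_perceptron(data, epochs=100, lr=0.1):
    """Rosenblatt's rule: nudge the real-valued weights toward each
    misclassified example; the last weight acts as a bias."""
    w = [0.0] * (len(data[0][0]) + 1)
    for _ in range(epochs):
        for x, y in data:
            xb = list(x) + [1.0]
            err = y - step(sum(wi * xi for wi, xi in zip(w, xb)))
            w = [wi + lr * err * xi for wi, xi in zip(w, xb)]
    return w

w = train_perceptron(XOR)
preds = [step(sum(wi * xi for wi, xi in zip(w, list(x) + [1.0])))
         for x, _ in XOR]
print(preds)  # always wrong on at least one of the four points

def xor_mlp(x1, x2):
    """One hidden layer, weights hand-picked: h1 = OR, h2 = AND,
    output = (OR and not AND) = XOR."""
    h1 = step(x1 + x2 - 0.5)   # OR gate
    h2 = step(x1 + x2 - 1.5)   # AND gate
    return step(h1 - h2 - 0.5)

print([xor_mlp(*x) for x, _ in XOR])  # [0, 1, 1, 0]
```

The perceptron fails for a geometric reason: its output is a single linear threshold, and no line separates {(0,1),(1,0)} from {(0,0),(1,1)} in the plane.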

Backpropagation

● Derived by multiple researchers in the early 1960s
● Implemented to run on computers in 1970 by Seppo Linnainmaa
● Paul Werbos was first to propose that it could be used for neural nets, after analyzing it in depth in his 1974 PhD thesis
● Popularized in 1986 by David Rumelhart, Geoffrey Hinton, and Ronald Williams, who explicitly mention David Parker and Yann LeCun as two people who discovered it beforehand
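The idea being rediscovered in all of these works is the chain rule applied layer by layer, reusing the forward-pass intermediates. A minimal one-hidden-unit sketch, checked against finite differences (all weight and input values are illustrative):

```python
import math

def loss_and_grads(w1, w2, x, t):
    """Forward pass through y = w2 * sigmoid(w1 * x) with squared loss,
    then a backward pass propagating dL back to each weight."""
    h = 1.0 / (1.0 + math.exp(-w1 * x))   # hidden activation
    y = w2 * h
    L = 0.5 * (y - t) ** 2
    # Backward: chain rule from the loss down to each weight.
    dL_dy = y - t
    dL_dw2 = dL_dy * h                    # y = w2 * h
    dL_dh = dL_dy * w2
    dL_dw1 = dL_dh * h * (1.0 - h) * x    # sigmoid'(z) = h * (1 - h)
    return L, dL_dw1, dL_dw2

w1, w2, x, t = 0.5, -0.3, 1.2, 1.0
L, g1, g2 = loss_and_grads(w1, w2, x, t)

# Sanity check: the analytic gradient matches a central finite difference.
eps = 1e-6
num_g1 = (loss_and_grads(w1 + eps, w2, x, t)[0]
          - loss_and_grads(w1 - eps, w2, x, t)[0]) / (2 * eps)
print(abs(g1 - num_g1) < 1e-6)  # True
```

The same pattern scales to any depth: each layer multiplies the incoming gradient by its local Jacobian, which is why one forward and one backward sweep suffice.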

Backpropagation applied to handwritten ZIP code recognition (1989)

● Yann LeCun et al. combined convolutional neural networks with backpropagation
● Similar ideas appeared in the Neocognitron by Kunihiko Fukushima in 1979

ImageNet classification with deep convolutional neural networks (2012)

Theoretical Background -- Neuroscience

Our Current Understanding of Visual Cortex

Experimental Setup -- Hubel/Wiesel

Experimental Measurements -- Lateral Geniculate Nucleus

Experimental Measurements -- V1

Theoretical Picture

Grand Hypothesis -- the visual system learns how to organize itself from the images it sees

Horace Barlow -- organization exploits sparsity/compression of information

David Field -- the statistics of images are non-Gaussian and multiscale

Olshausen/Field -- statistics of images can produce 'simple cells' with orientation selectivity

Olshausen/Field -- optimization yields 'simple cells' with orientation selectivity