
The Perceptron
Foundations of Data Analysis, 04/09/2020

History of Perceptron
■ Frank Rosenblatt (1928-1971) invented the perceptron algorithm
■ Marvin Minsky (1927-2016) stated that a simple perceptron is limited to linearly separable problems

Perceptron Learning Algorithm
■ First neural network learning model, from the 1960s
■ Simple and limited (a single-layer model)
■ Its basic concepts carry over to multi-layer models

What is Perceptron?
The goal of the perceptron algorithm is to find a hyperplane (a decision boundary) that separates a set of data into two classes, Class 0 and Class 1.
■ Binary classifier
■ Supervised learning

Perceptron (with bias term)
$f(x) = 1$ if $w \cdot x + b > 0$, and $f(x) = 0$ otherwise, where $w \cdot x = \sum_{i=1}^{n} w_i x_i$ (dot product).

Perceptron Net
Inputs $x_1, \ldots, x_n$ enter with weights $w_1, \ldots, w_n$; the net activation is $w \cdot x = \sum_{i=1}^{n} w_i x_i$, and the activation function produces the output
$z = 1$ if $w \cdot x > \theta$, and $z = 0$ if $w \cdot x \le \theta$ (see the code sketch after the outline).
■ Learning means finding weights such that an objective function is minimized

Activation Function
Outputs the label given an input or a set of inputs (see the code sketch after the outline).
■ Step function: $f(x) := \mathrm{sgn}(x) = 1$ if $x \ge 0$, $-1$ if $x < 0$
■ ReLU (rectified linear unit): $f(x) = \max(0, x)$
■ Sigmoid function: $f(x) := \sigma(x) = \frac{1}{1 + e^{-ax}}$

Perceptron as a Single Layer Neuron
[Diagram of a single neuron.]

Examples
[Diagram: two small example networks with numeric inputs and weights; the first has threshold $\theta = 0.2$, the second $\theta = 0.1$. In each case, compute $z$ using $z = 1$ if $w \cdot x > \theta$, $z = 0$ otherwise.]

How to Learn Perceptron?
$f(x) = 1$ if $w \cdot x + b > 0$, and $f(x) = 0$ otherwise; the weights $w$ and the bias $b$ (threshold $\theta$) are the unknown parameters to be learned.
■ In supervised learning, the network's output is compared with known correct answers (labels)
■ Supervised learning is learning with a teacher

Perceptron Learning Rules
■ Consider linearly separable problems
■ How do we find appropriate weights?
■ Check whether the output $o$ has the desired value $d$ (given by the labels)
$w_i^{\text{new}} = w_i^{\text{old}} + \Delta w_i$, where $\Delta w_i = \eta \, (d - o) \, x_i$ (see the code sketch after the outline).
$\eta$ is called the learning rate, with $0 < \eta \le 1$.
Perceptron Convergence Theorem: the algorithm is guaranteed to find a solution in finite time if a solution exists.
■ The algorithm converges to the correct classification if the training data is linearly separable
■ When assigning a value to $\eta$ we must keep in mind two conflicting requirements:
■ averaging of past inputs to provide stable weight estimates, which requires a small $\eta$
■ fast adaptation with respect to real changes in the underlying distribution of the process generating the input vector $x$, which requires a large $\eta$

Linear Separability
[Diagram.]

Limited Functionality of Hyperplane
[Diagram.]

Multilayer Network
An input layer feeds a hidden layer, which feeds an output layer; each output unit computes
$o_1 = \mathrm{sgn}\left(\sum_{i=0}^{n} w_{1i} x_i\right)$ and $o_2 = \mathrm{sgn}\left(\sum_{i=0}^{n} w_{2i} x_i\right)$
(see the code sketch after the outline).
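The following sketches are not on the original slides; they restate the slides' formulas as runnable Python, with all function and variable names chosen for illustration. First, the thresholded decision rule from the "Perceptron Net" slide: a net activation $w \cdot x$ followed by a step against the threshold $\theta$.

```python
def net_activation(w, x):
    """Dot product: w . x = sum_i w_i * x_i."""
    return sum(wi * xi for wi, xi in zip(w, x))

def perceptron_output(w, x, theta):
    """z = 1 if w . x > theta, else 0 (thresholded step output)."""
    return 1 if net_activation(w, x) > theta else 0
```

On the "Examples" slide the diagrams pair each input with a weight. Assuming the first network pairs inputs (0.9, 0.5) with weights (0.1, -0.3) (one reading of the diagram, not recoverable from the extracted text), `perceptron_output([0.1, -0.3], [0.9, 0.5], 0.2)` gives $w \cdot x = 0.09 - 0.15 = -0.06 \le 0.2$, so $z = 0$.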
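Next, the three functions from the "Activation Function" slide, written out directly; `a` is the sigmoid's slope parameter from the slide's formula, defaulted to 1 here as an assumption.

```python
import math

def step(x):
    """Step function: sgn(x) = 1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1

def relu(x):
    """ReLU: f(x) = max(0, x)."""
    return max(0.0, x)

def sigmoid(x, a=1.0):
    """Sigmoid: f(x) = 1 / (1 + exp(-a * x))."""
    return 1.0 / (1.0 + math.exp(-a * x))
```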
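A minimal sketch of the update rule from "Perceptron Learning Rules", $w_i \leftarrow w_i + \eta (d - o) x_i$. The epoch loop, the zero-initialized weights, and folding the threshold in as a bias weight on a constant input are standard arrangements assumed here; the slides give only the rule itself.

```python
def train_perceptron(samples, labels, eta=0.1, max_epochs=100):
    """Perceptron learning: w_i <- w_i + eta * (d - o) * x_i.

    samples: input vectors; labels: desired outputs d in {0, 1}.
    The threshold is folded in as a bias weight w[0] on a constant input 1.
    """
    w = [0.0] * (len(samples[0]) + 1)        # w[0] is the bias weight
    for _ in range(max_epochs):
        errors = 0
        for x, d in zip(samples, labels):
            xb = [1.0] + list(x)             # prepend the constant bias input
            o = 1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else 0
            if o != d:                       # when o == d the update is zero anyway
                errors += 1
                w = [wi + eta * (d - o) * xi for wi, xi in zip(w, xb)]
        if errors == 0:                      # all samples classified correctly
            break
    return w
```

For example, `train_perceptron([(0,0), (0,1), (1,0), (1,1)], [0, 0, 0, 1])` learns the (linearly separable) AND function and, per the convergence theorem on the slide, reaches zero errors after finitely many epochs; run with XOR labels `[0, 1, 1, 0]` it never does, which is the limitation the "Linear Separability" and "Limited Functionality of Hyperplane" slides illustrate.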
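Finally, the output computation from the "Multilayer Network" slide: each output unit applies sgn to a weighted sum over its inputs, with the sum starting at $i = 0$ to include a bias input $x_0 = 1$ (the usual reading of that index, assumed here). Weight values would come from training; none are given on the slide.

```python
def sgn(x):
    """sgn(x) = 1 if x >= 0, else -1 (step function from the slides)."""
    return 1 if x >= 0 else -1

def layer_outputs(weights, x):
    """o_k = sgn(sum_{i=0}^{n} w_{ki} * x_i), with x_0 = 1 as the bias input.

    weights: one weight vector per output unit (each of length len(x) + 1).
    """
    xb = [1.0] + list(x)
    return [sgn(sum(wi * xi for wi, xi in zip(wk, xb))) for wk in weights]
```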