Neural Networks: Learning the Network (Part 2)
11-785, Fall 2020, Lecture 4
Recap: Universal approximators
• Neural networks are universal approximators – they can approximate any function, provided they have sufficient architecture
• We must determine the weights and biases that make the network model the target function

Recap: Approach
• Define a divergence between the actual output and the desired output of the network
– The divergence must be differentiable: we can quantify how much a minuscule change in the network's output changes it
• Make all neuronal activations differentiable
– Differentiable: we can quantify how much a minuscule change in a neuron's input changes its output
• Differentiability enables us to determine whether a small change in any parameter of the network increases or decreases the divergence
– This will let us optimize the network

Recap: The expected divergence
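As an illustration (not from the slides), the sensitivity that differentiability gives us can be checked numerically: a finite-difference probe of how a minuscule parameter change changes a squared-error divergence, compared against the chain-rule gradient. The one-parameter "network" and all names here are hypothetical.

```python
import numpy as np

def divergence(y, d):
    # Squared-error divergence between network output y and desired output d
    return 0.5 * np.sum((y - d) ** 2)

def output(w, x):
    # A toy one-parameter "network": sigmoid of a scaled input
    return 1.0 / (1.0 + np.exp(-w * x))

x, d, w = 2.0, 1.0, 0.5
eps = 1e-6

# Finite difference: how much does a minuscule change in w change the divergence?
numeric_grad = (divergence(output(w + eps, x), d)
                - divergence(output(w - eps, x), d)) / (2 * eps)

# Analytic gradient via the chain rule:
# d(div)/dw = (y - d) * y * (1 - y) * x
y = output(w, x)
analytic_grad = (y - d) * y * (1 - y) * x

print(abs(numeric_grad - analytic_grad) < 1e-6)
```

The two values agree, which is exactly what lets gradient-based optimization use local sensitivity information.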
• Minimize the expected “divergence” between the output of the net and the desired function over the input space
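In symbols (a reconstruction using the lecture's standard notation, where f is the network with parameters W and g is the desired function), the objective is:

```latex
\hat{W} = \arg\min_{W} \; E_{X}\!\left[\,\mathrm{div}\big(f(X; W),\, g(X)\big)\,\right]
```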
Recap: Empirical Risk Minimization
• Problem: Computing the expected divergence requires knowledge of the desired function at every point in the input space, which we will not have
• Solution: Approximate it by the average divergence over a large number of “training” samples drawn from the input distribution
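The empirical-risk approximation can be sketched in a few lines: average a per-sample divergence over training pairs. The squared-error divergence, the toy network, and the sample values below are illustrative assumptions, not from the slides.

```python
import numpy as np

def divergence(y, d):
    # Squared-error divergence between network output y and desired output d
    return 0.5 * np.sum((y - d) ** 2)

def empirical_risk(net, samples):
    # Average divergence over training pairs (x_i, d_i): an estimate of the
    # expected divergence over the input space
    return np.mean([divergence(net(x), d) for x, d in samples])

# Toy example: a "network" that doubles its input, with noisy targets d ~ 2x
net = lambda x: 2.0 * x
samples = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.0)]
risk = empirical_risk(net, samples)
print(risk)  # small but nonzero, reflecting the noise in the targets
```

As the number of samples grows, this average converges to the expected divergence, which is what justifies minimizing it in place of the true expectation.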