Radial Basis Function Networks

Radial Basis Function Networks

Machine Learning Srihari Radial Basis Function Networks Sargur Srihari 1 Machine Learning Srihari Topics • Basis Functions • Radial Basis Functions • Gaussian Basis Functions • Nadaraya Watson Kernel Regression Model • Decision Tree Initialization of RBF 2 Machine Learning Srihari Basis Functions • Summary of Linear Regression Models 1. Polynomial Regression with one variable 2 i y(x,w) = w0+w1x+w2x +…=Σ wix 2. Simple Linear Regression with D variables T y(x,w) = w0+w1x1+..+wD xD = w x In one-dimensional case y(x,w) = w0 + w1x which is a straight line 3. Linear Regression with Basis functions fj(x) M−1 y(x,w) = w + w (x) = wT (x) 0 ∑ jφj φ j=1 • There are now M parameters instead of D parameters • What form should the basis functions take? 3 Machine Learning Srihari Radial Basis Functions • A radial basis function depends only on the radial distance (Euclidean) from the origin f(x)=f(||x||) th • If the basis function is centered at mj fj (x) =h (||x-mj||) • We look at radial basis functions centered at the data points xn, n =1,…,N 4 Machine Learning Srihari An RBF Network 5 Machine Learning Srihari History of Radial Basis Functions • Introduced for exact function interpolation • Given set of input vectors x1,..,xN and target values t1,..,tN • Goal is to find a smooth function f (x) that fits every target value exactly so that f (xn) = tn for n=1,..,N • Achieved by expressing f(x) as a linear combination of radial basis functions, one per data point N f (x) = ∑wn h(|| x − x n ||) n=1 € Machine Learning Srihari History of Radial Basis Functions • Values of coefficients wn are found by least squares • Very large number of weights • Since there are same number of coefficients as constraints, result is a function that fits target values exactly • When target values are noisy exact interpolation is undesirable N f (x) = ∑wn h(|| x − x n ||) n=1 € Machine Learning Srihari Another Motivation for Radial Basis Functions • Interpolation when Input rather than target is noisy • If noise on input variable x is described by variable x having distribution v(x) then sum of squared error function becomes N 1 2 ν: pronounced “nu” E = ∑ ∫ {y(x n + ξ) − tn } ν(ξ)dξ for noise 2 n=1 € 8 Machine Learning Srihari Another Motivation for Radial Basis Functions • Using calculus of variations, optimize wrt y(x) to give N y(x) = ∑tn h(x − x n ) n=1 where the€ basis functions are given by ν(x − x n ) h(x − x n ) = N ν(x − x ) ∑ n n=1 • There is a basis function centered on every data point€ • Known as Nadaraya-Watson Model 9 Machine Learning Srihari Normalized Basis Functions If noise ν(x) is isotropic so that it is only a function of ||x|| Gaussian Basis Functions then the basis functions are radial Functions are normalized ν(x − x ) h(x − x ) = n n N ∑ν(x − x n ) so that n=1 Normalized Basis Functions € h(x − x ) =1 for any value of x ∑ n n Normalization is useful in regions of input space where all basis h(x-x ) is called a kernel function since we use functions are small n € it with every sample to determine value at x Factorization into basis functions is not used Machine Learning Srihari Computational Efficiency • Since there is one basis function per data point corresponding computational model will be costly to evaluate for new data points • Methods for choosing a smaller set • Use a random subset of data points • Orthogonal least squares • At each step the next data point to be chosen as basis function center is the one that gives the greatest reduction in least-squared error • Clustering algorithms which give basis function centers that no longer coincide with training data points 11 Machine Learning Srihari Nadaraya-Watson Regression Model • More general formulation than before • Proposed in 1964.Nadaraya: a Russian statistician • Begin with Parzen estimation to derive kernel function • Given a training set {xn,tn} the joint distribution of two variables is Parzen window estimate 1 N With one variable p(x) p(x,t) = ∑ f (x − x n,t − tn ) N n=1 • where f(x,t) is the component density function • there is one such component centered at each data point € • Starting with y(x)=E[t|x]=∫tp(t|x)dt we derive the kernel form • Where we need to note that component densities have zero mean MachineDerivation Learning of Nadaraya-Watson Regression Srihari • Desired Regression function • Using Parzen window estimate and change of variables • Where k is a normalized kernel 13 Machine Learning Srihari Example of Nadaraya-Watson Regression Regression function is y(x) = ∑k(x,x n )tn n Where the kernel function is € g(x − x n ) k(x,x n ) = ∑g(x − x m ) m Green: Original sine function Blue: Data points and we have defined Red: Resulting regression function ∞ Pink region: Two std deviation region for distribution p(t|x) € g(x) = ∫ f (x,t)dt Blue ellipses: one std deviation contour −∞ Different scales on x and y axes make circles elongated and f(x,t) is a zero-mean component density function centered at each data point € 14 Machine Learning Srihari Radial Basis Function Network • A neural network that uses RBFs as activation functions • In Nadaraya-Watson • Weights ai are target values • r is component density (Gaussian) • Centers ci are samples 15 Machine Learning Srihari Speeding-up RBFs • More flexible forms of Gaussian components can be used • Model p(x,t) using a Gaussian mixture model • Trained using EM • Number of components in the mixture model can be smaller than the number of training samples • Faster evaluation for test data points • But increased training time 16 Machine Learning Srihari Decision Tree Initialization 17 Machine Learning Srihari Three Initialization Methods 18 Machine Learning Srihari References • http://mccormickml.com/2013/08/15/radial- basis-function-network-rbfn-tutorial/ • http://www.informatik.uni-ulm.de/ni/staff/ CDietrich/publications/nnw2000.pdf 19 .

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    19 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us