
PERIODICA POLYTECHNICA SER. EL. ENG. VOL. 42, NO. 1, PP. 155-172 (1998)

RADIAL BASIS FUNCTION ARTIFICIAL NEURAL NETWORKS AND FUZZY LOGIC

N. C. STEELE* and J. GODJEVAC**

* Control Theory and Application Centre, Coventry University, email: [email protected]
** EPFL Microcomputing Laboratory, INF Ecublens, CH-1015 Lausanne, Switzerland

Received: May 8, 1997; Revised: June 11, 1998

Abstract

This paper examines the underlying relationship between radial basis function artificial neural networks and a type of fuzzy controller. The major advantage of this relationship is that the methodology developed for training such networks can be used to develop 'intelligent' fuzzy controllers, and an application in the field of robotics is outlined. An approach to rule extraction is also described. Much of Zadeh's original work on fuzzy logic made use of the MAX/MIN form of the compositional rule of inference. A trainable/adaptive network which is capable of learning to perform this type of inference is also developed.

Keywords: neural networks, radial basis function networks, fuzzy logic, rule extraction.

1. Introduction

In this paper we examine the Radial Basis Function (RBF) artificial neural network and its application in the approximate reasoning process. The paper opens with a brief description of this type of network and its origins, and then goes on to show one way in which it can be used to perform approximate reasoning. We then consider the relation between a modified form of RBF network and a fuzzy controller, and conclude that they can be identical. We also consider the problem of rule extraction, and we discuss ideas which were developed for obstacle avoidance by mobile robots. The final part of the paper considers a novel type of network, the Artificial Neural Inference (ANI) network, which is related to the RBF network and can perform max/min compositional inference. Making use of some new results in analysis, it is also possible to propose an adaptive form of this network, and it is on this network that current work is focused.

2. The RBF Network

The radial basis function network has its origins in approximation theory, and such an approach to multi-variable approximation has been developed as a neural network paradigm notably by BROOMHEAD and LOWE [2]. Surprisingly perhaps, techniques of multi-variable approximation have only recently been studied in any great depth; a good account of the state of the art at the start of this decade is given by POWELL in [1]. In the application to the approximation of functions of a single variable, it is assumed that values of the function to be approximated are known at a number, n, of distinct sample points, and the approximation task is to construct a continuous function through these points. If n centres are chosen to coincide with the known data points, then an approximation can be constructed in the form

s(x) = \sum_{i=1}^{n} w_i \phi_i(r_i) ,

where r_i = \|x - c_i\| is the radial distance of x from the centre c_i, i = 1, ..., n, and w_i, i = 1, ..., n are constants. Each of the n centres is associated with one of the n radial basis functions \phi_i, i = 1, ..., n. The method extends easily to higher dimensions, where r is usually taken as the Euclidean distance. With the centres chosen to coincide with the sample or data points, and with one centre corresponding to each such point, the approximation generated will be a function which reproduces the function values at the sample points exactly. In the absence of 'noise' this is desirable.
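To make this construction concrete, here is a minimal sketch of the exact-interpolation case in Python, assuming (purely for illustration) a Gaussian basis function and a fixed width; the choice of basis function is discussed further below.

```python
import numpy as np

# Exact RBF interpolation in one variable, with the n centres chosen
# to coincide with the n data points, as described above.
# Assumption for illustration: Gaussian basis phi(r) = exp(-(r/sigma)^2).

def fit_rbf_weights(x, y, sigma):
    """Solve Phi w = y, where Phi[k, i] = phi(|x_k - c_i|)."""
    c = x                                 # centres = sample points
    r = np.abs(x[:, None] - c[None, :])   # pairwise radial distances
    Phi = np.exp(-(r / sigma) ** 2)       # interpolation matrix
    return np.linalg.solve(Phi, y), c

def rbf_eval(x_new, w, c, sigma):
    r = np.abs(np.asarray(x_new)[:, None] - c[None, :])
    return np.exp(-(r / sigma) ** 2) @ w

# The resulting function reproduces the sample values exactly:
x = np.linspace(0.0, 1.0, 7)
y = np.sin(2.0 * np.pi * x)
w, c = fit_rbf_weights(x, y, sigma=0.3)
assert np.allclose(rbf_eval(x, w, c, sigma=0.3), y)
```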
When 'noise' is present, however, steps have to be taken to prevent the noise being modelled at the expense of the underlying data. In fact, in most neural network applications it is usual to work with fewer centres than data points in order to produce a network with good 'generalisation' rather than noise-modelling properties, and the problem of how to choose the centres then has to be addressed. Some users take the view that this is part of the network training process, but to take this view from the outset means that possible advantages may be lost. The 'default' method for prior centre selection is the K-means algorithm, although other methods are available. In a number of applications, notably those where the data occur in clusters, this use of a set of fixed centres is adequate for the task in hand. In other situations this approach is either not successful or not appropriate, and the network has to be given the ability to adapt these locations, either as part of the training process or in a later tuning operation.

The exact nature of the radial basis functions \phi_i, i = 1, ..., n is not usually of major importance, and good results have been achieved in a number of applications using Gaussian basis functions

\phi_i(r_i) = e^{-(r_i/\sigma_i)^2} , \qquad r_i = \|x - c_i\| ,

where \sigma_i, i = 1, ..., n are constants.

Fig. 1. The Radial Basis Function network

A set of basis functions with the property that \phi \to 0 as r \to \infty is known as a set of localised basis functions, and although this appears to be a natural choice, there is no requirement for it to be the case. There are a number of examples of non-localised basis functions in use, including the thin-plate spline function \phi(r) = r^2 \ln(r) and the multi-quadric function.

A major advantage of the RBF neural network, when the centres have been pre-selected, is the speed at which it can be trained. In Fig. 1 we show a network with two output nodes, and we note that with the centres fixed, the only parameters which have to be determined are the weights on the connections between the single hidden layer and the output layer. This means that the parameters can be found by the solution of a system of linear equations, and this can be made both efficient and robust by use of the Singular Value Decomposition (SVD) algorithm. The algorithm has further advantages when the number of centres is less than the number of data points, since a least squares approximation is then found automatically.

If centres are not pre-selected, then some procedure for selection, perhaps based on gradient descent, must be incorporated into the training algorithm. In this case either all parameters are determined by gradient descent, or a hybrid algorithm may be used, with the parameters of the centres being updated by this means and new sets of weights found (not necessarily at each iteration), again as the solution of a system of linear equations.
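As a sketch of this fixed-centre training step, the snippet below solves for the hidden-to-output weights by linear least squares; numpy's lstsq routine is SVD-based, which matches the robustness argument above. The two-output layout echoes Fig. 1, while the data, centres and widths are assumptions made purely for illustration.

```python
import numpy as np

# Least-squares training of the output weights with fixed, pre-selected
# centres. With fewer centres than data points, the SVD-based solver
# automatically returns the least squares approximation noted above.

def design_matrix(X, C, sigma):
    # Phi[k, i] = exp(-(||x_k - c_i|| / sigma_i)^2), Gaussian basis
    r = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    return np.exp(-(r / sigma) ** 2)

def train_output_weights(X, Y, C, sigma):
    Phi = design_matrix(X, C, sigma)
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)  # SVD-based solve
    return W  # shape: (n_centres, n_outputs)

# Illustrative problem: 200 samples, 2 inputs, 2 outputs, 20 fixed centres.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = np.column_stack([np.sin(X[:, 0]), np.cos(X[:, 1])])
C = X[rng.choice(200, size=20, replace=False)]  # centres picked from data
sigma = np.full(20, 0.5)
W = train_output_weights(X, Y, C, sigma)
Y_hat = design_matrix(X, C, sigma) @ W          # network outputs
```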
3. A Fuzzy Controller

Fig. 2. A fuzzy controller

In Fig. 2 we show the architecture of a fuzzy controller based on the Takagi-Sugeno design, with a crisp output. This controller is a variant of that discussed by JANG and SUN [4]. In general, this fuzzy controller has m crisp inputs x_1, x_2, ..., x_m and, in this case, one output y. There are n linguistic rules of the form:

R_i: IF x_1 is A_{i,1} AND x_2 is A_{i,2} AND ... AND x_m is A_{i,m} THEN y is w_i , \qquad i = 1, ..., n ,

where i is the index of the rule, A_{i,j} is a fuzzy set for the i-th rule and the j-th linguistic variable defined over the universe of discourse for the j-th variable, and w_i is a real number. The fuzzy sets A_{i,j}, i = 1, ..., n, j = 1, ..., m are each defined by Gaussian membership functions \mu_{i,j}(x_j), where

\mu_{i,j}(x_j) = e^{-(x_j - c_{i,j})^2 / \sigma_{i,j}^2} . \qquad (1)

Here c_{i,j} and \sigma_{i,j} are the constant Gaussian parameters. In the first layer of the structure shown in Fig. 2, fuzzification of the crisp inputs x_1 and x_2 takes place. This means that the degree of membership of the inputs x_1 and x_2 is calculated for each of the corresponding fuzzy sets A_{i,j}. These values are then supplied to the nodes in the second layer, where a t-norm is applied, and here we use the algebraic product. The output of the k-th unit in this layer is then the firing strength u_k of rule k, and in this case, with two inputs, it is given by

u_k = \mu_{k,1}(x_1) \, \mu_{k,2}(x_2) , \qquad k = 1, ..., n .

In the general case this is

u_k = \prod_{j=1}^{m} \mu_{k,j}(x_j) , \qquad k = 1, ..., n . \qquad (2)

The rule firing strength may also be taken as a measure of the degree of compliance of the current input state with a rule in the rule base, taking the value 1 when compliance is total. The overall network output at the fourth layer has to be a crisp value, and this is generated by the height method. The w_i, i = 1, ..., n are the weights between the third and fourth layers and are in some sense 'partial consequents'. The output y is finally formed as

y = \frac{\sum_{i=1}^{n} u_i w_i}{\sum_{i=1}^{n} u_i} . \qquad (3)

This can be written as

y = \frac{u_1}{\sum_{i=1}^{n} u_i} w_1 + \frac{u_2}{\sum_{i=1}^{n} u_i} w_2 + \cdots + \frac{u_n}{\sum_{i=1}^{n} u_i} w_n . \qquad (4)

Eq. (4) suggests that layer three should perform a normalisation function, and if this is the case the output of unit l in layer three is given by

\bar{u}_l = \frac{u_l}{\sum_{i=1}^{n} u_i} .
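Eqs. (1)-(4) fix this forward pass completely, so it can be sketched directly; in the snippet below the four layers of Fig. 2 appear as one line each, and the rule parameters are arbitrary values assumed for illustration.

```python
import numpy as np

# Forward pass of the Takagi-Sugeno controller of Fig. 2:
# Gaussian fuzzification (1), product t-norm (2), normalisation,
# and the height-method crisp output (3)/(4).

def fuzzy_controller(x, c, sigma, w):
    """
    x        : (m,)   crisp inputs
    c, sigma : (n, m) Gaussian parameters c_ij and sigma_ij
    w        : (n,)   partial consequents w_i
    """
    mu = np.exp(-((x - c) / sigma) ** 2)  # layer 1: memberships, Eq. (1)
    u = mu.prod(axis=1)                   # layer 2: firing strengths, Eq. (2)
    u_bar = u / u.sum()                   # layer 3: normalisation
    return u_bar @ w                      # layer 4: crisp output, Eqs. (3)/(4)

# Illustrative two-input, three-rule controller (parameter values assumed):
c = np.array([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])
sigma = np.full((3, 2), 0.8)
w = np.array([-1.0, 0.0, 1.0])
y = fuzzy_controller(np.array([0.2, 0.1]), c, sigma, w)
```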
