
Deriving Monotonic Function Envelopes from Observations*

Herbert Kay, Department of Computer Sciences, University of Texas at Austin, Austin, TX 78712, [email protected]
Lyle H. Ungar, Department of Chemical Engineering, University of Pennsylvania, Philadelphia, PA 19104, [email protected]

*This work has taken place in the Qualitative Reasoning Group at the Artificial Intelligence Laboratory, The University of Texas at Austin. Research of the Qualitative Reasoning Group is supported in part by NSF grants IRI-8905494, IRI-8904454, and IRI-9017047, by NASA contract NCC 2-760, and by the Jet Propulsion Laboratory.

Abstract

Much work in qualitative physics involves constructing models of physical systems using functional descriptions such as "flow monotonically increases with pressure." Semiquantitative methods improve model precision by adding numerical envelopes to these monotonic functions. Ad hoc methods are normally used to determine these envelopes. This paper describes a systematic method for computing a bounding envelope of a multivariate monotonic function given a stream of data. The derived envelope is computed by determining a simultaneous confidence band for a special neural network which is guaranteed to produce only monotonic functions. By composing these envelopes, more complex systems can be simulated using semiquantitative methods.

Introduction

Scientists and engineers build models of continuous systems to better understand and control them. Ideally, these models are constructed based on the underlying physical properties of the system. Unfortunately, real systems are seldom well enough understood to construct precise models based solely on a priori knowledge of physical laws. Therefore, process data is often used to estimate some portions of the model.

Techniques for estimating a functional relationship between an output variable y and the input vector x typically assume that there is some deterministic function g and some random variate ε such that y = g(x) + ε, where ε is a normally distributed, mean-zero random variable with variance σ² that represents measurement error and stochastic variations. The estimate is computed by using a parameterized fitting function f(x; θ) and then using regression analysis to determine the values of θ such that f(x; θ) ≈ g(x).

Traditional regression methods require knowledge of the form of the estimation function f. For instance, we may know that body weight is linearly related to the amount of body fat, and may then decide that a linear model f(weight; θ) = θ · weight is appropriate. Neural network methods have been developed for cases where no information about the form of f is known. This may be the case if f models a complex process whose physics is poorly understood. Such networks are known to be capable of representing any functional relationship given a large enough network. In this paper, we consider the case where some intermediate level of knowledge about f is available. In particular, we are interested in cases where we know the monotonicity of f in terms of the signs of its partial derivatives ∂f/∂x_k for each x_k ∈ x. For example, we might know that outflow from a tank monotonically increases with tank pressure. This type of knowledge is prevalent in qualitative descriptions of systems, so it makes sense to take advantage of it.

Since the estimate is based on a finite set of data, it is not possible for it to be exact. We therefore require our estimate to have an associated confidence measure which takes into account the uncertainty introduced by the finite sample size. For our semiquantitative representation, we are therefore interested in deriving an envelope that bounds all possible functions that could have generated the data stream with some probability P.

This paper describes a method for estimating and computing bounding envelopes for multivariate functions based on a set of data and knowledge of the monotonicity of the functional relationship.
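As a concrete, purely illustrative instance of this setup (ours, not taken from the paper), the snippet below generates the kind of data stream assumed here: samples of y = g(x) + ε where g is an unknown but monotonic function and ε is mean-zero Gaussian noise. The particular g, noise level, and sample size are arbitrary assumptions for the example.

```python
# Illustrative data stream: y = g(x) + eps with g unknown but known to be monotonic.
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 2.0, size=200))     # e.g., tank pressure readings
g = lambda t: np.sqrt(t)                         # stand-in for the unknown monotonic g
y = g(x) + rng.normal(0.0, 0.05, size=len(x))    # additive, mean-zero measurement noise
# Downstream, the only structural assumption allowed is dg/dx > 0, not the form of g.
```

The methods developed below estimate g from such a stream and bound it with a probability-P envelope.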
The rest of the paper is organized as follows. First, we describe our method for computing an estimate f(x; θ) ≈ g(x) based on a neural network that is constrained to produce only monotonic functions. Second, we describe our method for computing a bounding envelope, which is based on linearizing the estimation function and then using F-statistics to compute a simultaneous confidence band. Third, we present several examples of function fitting and its use in semiquantitative simulation using QSIM. Next, we discuss related work in neural networks and nonparametric analysis, and finally we summarize the results and describe future work.

Bounded monotonic functions are a key element of the semiquantitative representation used by simulators such as Q2 [Kuipers and Berleant, 1988], Q3 [Berleant and Kuipers, 1992], and Nsim [Kay and Kuipers, 1993], which predict behaviors from models that are incompletely specified. To date, the bounds for such functions have been derived in an ad hoc manner. The work described in this paper provides a systematic method for finding these functional bounds. It is particularly appropriate for semiquantitative monitoring and diagnosis systems (such as MIMIC [Dvorak and Kuipers, 1989]) because process data is readily available in such applications. By combining our function bounding method with model abduction methods such as MISQ [Richards et al., 1992], we plan to construct a self-calibrating system which derives models directly from observations of the monitored process. Such a system will have the property that as more data is acquired from the process, the model and its predictions will improve.

Computing the Estimate

Computing the estimate of g requires that we make some assumptions about the nature of the deterministic and stochastic portions of the model. We assume that the relationship between y and x is y = g(x) + ε, where ε is a normally distributed random variable with mean 0 and variance σ². Other assumptions, such as different noise probability distributions or multiplicative rather than additive noise coupling, could be made. The above model, however, is fairly general, and it permits us to use powerful regression techniques for the computation of the estimate and its envelope. For situations where the variance is not uniform, we can use variance stabilization techniques to transform the problem so that it has a constant variance.
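The paper leaves the choice of stabilization technique open; as one standard illustration (ours, not the authors'), multiplicative noise can be made additive and approximately constant-variance by working in log space:

\[
y = g(\mathbf{x})\,(1 + \epsilon) \;\Longrightarrow\; \log y = \log g(\mathbf{x}) + \log(1+\epsilon) \approx \log g(\mathbf{x}) + \epsilon \quad \text{for small } \epsilon .
\]

Because the logarithm is itself monotonic, monotonicity knowledge about g (for positive g) carries over unchanged to log g, so the constrained estimator described below can be applied to the transformed data.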
In traditional regression analysis, the modeler supplies a function f(x; θ) together with a dataset to a least-squares algorithm, which determines the optimal values for θ so that f(x; θ) ≈ g(x). The estimated value of y is then ŷ = f(x; θ). In our case, however, the only information available about f is the signs of its n partial derivatives ∂f/∂x_k for x_k ∈ x, so no explicit equation for f can be assumed. One way to work without an explicit form for f is to use a neural net as the function estimator. Figure 1 illustrates a network for determining ŷ given a set of inputs x.

[Figure 1: A neural net-based function estimator. This three-layer net computes the function ŷ = σ( Σ_{j=1}^{n_h} w_o[j,1] σ( Σ_{i=1}^{n} w_i[i,j] x_i + w_i[n+1,j] ) + w_o[n_h+1,1] ).]

The network has three layers. The input layer contains one node for each element x_k and one bias node set to a constant value of 1. (The bias terms permit the estimation function to shift the center of the sigmoid.) The hidden layer consists of a set of n_h nodes which are connected to each input variable as well as to the bias input. The output layer consists of a single node which is connected to all the hidden nodes as well as to another bias value which is fixed at 1. All nodes use sigmoidal basis functions and all connections are weighted. In our notation, w_i[i,j] represents the connection from input x_i to hidden node j, and w_o[j,1] represents the connection from hidden node j to the output layer; w_i[n+1,j] represents the connection from the input bias to hidden node j, and w_o[n_h+1,1] represents the connection from the hidden-layer bias node to the output node. This network represents the function

\[
\hat{y} = \sigma\bigl(s + w_{o[n_h+1,1]}\bigr),
\qquad
s = \sum_{j=1}^{n_h} w_{o[j,1]}\,\sigma\Bigl(\sum_{i=1}^{n} w_{i[i,j]}\,x_i + w_{i[n+1,j]}\Bigr),
\]

where σ(x) is the sigmoidal function 1/(1 + e^{-x}). We can compute the weights by solving the nonlinear least-squares problem

\[
\min_{\mathbf{w}} \sum_i \bigl(y_i - \hat{y}_i\bigr)^2 ,
\]

where ŷ_i = f(x_i; w) and w is a vector of all the weights. Cybenko [Cybenko, 1989] and others have shown that with a large enough number of hidden units, any continuous function may be approximated by a network of this form.

One drawback of using this estimation function is that it can overfit the given data. The estimate then follows the random variate ε as well as the deterministic part of the model, which means that we get a poor approximation of g (visually, the estimate will try to pass through every data point). We therefore reduce the scope of possible functions to include only monotonic functions. To do this, note that if f is monotonically increasing in x_k, then ∂f/∂x_k must be positive. By constraining the weights, we can force this derivative to be positive for all x, ensuring that the resulting function is monotonic. The derivative in question is

\[
\frac{\partial f}{\partial x_k}
= \sigma'\bigl(s + w_{o[n_h+1,1]}\bigr)\, p,
\qquad
p = \sum_{j=1}^{n_h} w_{o[j,1]}\,\sigma'\Bigl(\sum_{i=1}^{n} w_{i[i,j]}\,x_i + w_{i[n+1,j]}\Bigr)\, w_{i[k,j]} .
\]

Since the derivative of the sigmoid function is positive for all values of its domain, ∂f/∂x_k will be positive if p is positive. This will be the case if, for all 1 ≤ j ≤ n_h, w_o[j,1] · w_i[k,j] ≥ 0. If the partial derivative is negative, then the inequality is reversed.
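To make the constrained estimator concrete, here is a minimal NumPy sketch (our illustration, not the authors' implementation; the variable names, array shapes, and example weights are all assumptions). It implements the forward pass, the partial derivative ∂f/∂x_k given above, and a check of the weight-sign condition; fitting the weights by nonlinear least squares subject to that condition is not shown.

```python
# Sketch of the monotonicity-constrained three-layer sigmoid network (assumed shapes/names).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsigmoid(z):
    s = sigmoid(z)              # sigma'(z) = sigma(z)(1 - sigma(z)) > 0 for all z
    return s * (1.0 - s)

def predict(x, w_in, b_in, w_out, b_out):
    """Forward pass: y_hat = sigma( sum_j w_out[j]*sigma(sum_i w_in[i,j]*x[i] + b_in[j]) + b_out )."""
    hidden = sigmoid(x @ w_in + b_in)        # (n_h,) hidden activations
    return sigmoid(hidden @ w_out + b_out)   # scalar output

def df_dxk(x, k, w_in, b_in, w_out, b_out):
    """Partial derivative of the output with respect to input x_k (the quantity that must
    stay positive for an input the output is known to increase with)."""
    s_hidden = x @ w_in + b_in
    s_out = sigmoid(s_hidden) @ w_out + b_out
    p = np.sum(w_out * dsigmoid(s_hidden) * w_in[k, :])
    return dsigmoid(s_out) * p

def satisfies_monotonicity(w_in, w_out, signs):
    """Weight-space test of the constraint: signs[k] * w_out[j] * w_in[k, j] >= 0
    for every input k and hidden node j (signs[k] = +1 increasing, -1 decreasing)."""
    return all(np.all(signs[k] * w_out * w_in[k, :] >= 0.0) for k in range(w_in.shape[0]))

# Tiny usage example: a 2-input, 3-hidden-unit net that is increasing in both inputs
# because every w_out[j] * w_in[k, j] product is nonnegative.
rng = np.random.default_rng(0)
w_in = np.abs(rng.normal(size=(2, 3)))
w_out = np.abs(rng.normal(size=3))
b_in, b_out = rng.normal(size=3), 0.1
x = np.array([0.5, 1.2])
print(predict(x, w_in, b_in, w_out, b_out))
print(df_dxk(x, 0, w_in, b_in, w_out, b_out) > 0)                   # True by construction
print(satisfies_monotonicity(w_in, w_out, signs=np.array([1, 1])))  # True
```

One simple way to honor the sign condition during fitting (for an input the output increases with) is to optimize over nonnegative reparameterizations of the relevant weights, e.g. squares of unconstrained variables, although the paper does not commit to a particular optimization scheme here.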