Bayesian Inference for the Errors-In-Variables Model
XING FANG 1,2, BOFENG LI 3, HAMZA ALKHATIB 4, WENXIAN ZENG 1,* AND YIBIN YAO 1

1 School of Geodesy and Geomatics, Wuhan University, China ([email protected])
2 School of Earth Science, Ohio State University, USA
3 College of Surveying and Geo-Informatics, Tongji University, Shanghai, China
4 Geodetic Institute, Leibniz University Hannover, Germany
* Corresponding author

Received: December 26, 2015; Revised: June 5, 2016; Accepted: June 28, 2016
Stud. Geophys. Geod., 61 (2017), 35-52, DOI: 10.1007/s11200-015-6107-9

ABSTRACT

We discuss Bayesian inference based on the Errors-In-Variables (EIV) model. The proposed estimators are developed not only for the unknown parameters but also for the variance factor, with or without prior information. The proposed Total Least-Squares (TLS) estimators of the unknown parameters can be interpreted as quasi Least-Squares (LS) and quasi maximum a posteriori (MAP) solutions. In addition, the variance factor of the EIV model is proven to be always smaller than the variance factor of the traditional linear model. A numerical example demonstrates the performance of the proposed solutions.

Keywords: Errors-In-Variables, Total Least-Squares, Bayesian inference, quasi solution, Maximum Likelihood, noninformative prior, informative prior

1. INTRODUCTION

The method of least-squares (LS), developed by C.F. Gauss and A.M. Legendre in the nineteenth century (Stigler, 1986), has been widely applied to solve overdetermined systems. In spite of its wide use, however, the underlying hypothesis of an error-free coefficient matrix within the mathematical model does not necessarily hold in all geodetic applications. A popular class of models with an uncertain coefficient matrix is known in the literature as Errors-In-Variables (EIV) models. Total Least-Squares (TLS) was introduced in 1980 by Golub and van Loan (1980) in the field of numerical analysis, and it is nowadays the standard term for the family of estimation methods used to adjust the EIV model in science and engineering. The EIV model has gained high importance in geodetic data processing, since within this model the random errors of all measured data are treated rigorously. In typical geodetic problems, e.g., regression and transformation, the random errors of the measured data contained in the design matrix should be properly taken into consideration. Therefore, the Gauss-Markov model (GMM) is no longer appropriate for treating such cases rigorously.

Recently, the adjustment of the EIV model has also been investigated in quite a number of publications in geodesy. Usually, the adjustment of the EIV model without linearization is called TLS. The most frequent approaches include the closed-form solution in terms of the singular value decomposition (SVD) of the data matrix (e.g., Teunissen, 1988; Felus, 2004; Akyilmaz, 2007; Schaffrin and Felus, 2008; Grafarend and Awange, 2012), reformulation of the TLS problem as a constrained minimization problem (e.g., Schaffrin and Wieser, 2008; Xu et al., 2012; Snow, 2012; Li et al., 2013; Fang, 2014a), the iterative LS solution obtained by properly treating the weight matrix (e.g., Amiri-Simkooei and Jazaeri, 2012, 2013), and transformation of the TLS problem into an unconstrained optimization problem (Xu et al., 2012; Fang, 2011, 2013, 2014b, 2015).
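As a rough illustration of the SVD-based closed-form solution cited above, the following Python sketch computes an unweighted TLS estimate from the singular value decomposition of the augmented data matrix [A y], in the spirit of Golub and van Loan (1980); it is not the weighted or Bayesian estimator developed in this paper. The function name, the simulated data and the noise levels are our own illustrative choices.

```python
import numpy as np

def tls_svd(A, y):
    """Unweighted TLS estimate of xi in  y + v_y = (A + V_A) xi,
    obtained from the SVD of the augmented data matrix [A  y]."""
    m = A.shape[1]
    C = np.hstack([A, y.reshape(-1, 1)])        # augmented data matrix [A  y]
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[-1, :]                               # right singular vector of the smallest singular value
    if np.isclose(v[m], 0.0):
        raise ValueError("generic TLS solution does not exist")
    return -v[:m] / v[m]                        # xi_TLS = -v[0:m] / v[m]

# toy data (ours): both the design matrix and the observations carry noise
rng = np.random.default_rng(42)
xi_true = np.array([2.0, -1.0])
A_true = rng.normal(size=(100, 2))
A = A_true + 0.05 * rng.normal(size=A_true.shape)
y = A_true @ xi_true + 0.05 * rng.normal(size=100)
print("TLS estimate:", tls_svd(A, y))           # close to [2.0, -1.0]
```

Here the last right singular vector of [A y] spans the approximate null space of the adjusted data matrix, which is what distinguishes the TLS estimate from the ordinary LS solution.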
All these methods provide the identical TLS solution and guarantee (weighted) orthogonality when, and only when, the design matrix within the EIV model contains only linear terms of the random errors. As an alternative to these TLS methods, the iteratively linearized Gauss-Helmert model (GHM) method proposed by Pope (1972) can also solve the (weighted) TLS problem. Xu (2016) analyzed how random errors in the design matrix influence the variance components within the EIV model.

Although a significant number of the publications mentioned above address the adjustment of the EIV model, they are all based on the assumption that only the first and second moments of the errors are available. In fact, most of these methods are optimal in the case of normally distributed errors, although the distribution information is not explicitly used. When the errors are not normally distributed, however, these methods are no longer appropriate.

The earliest studies of Bayesian EIV models can be found in Lindley and El Sayyad (1968) and Zellner (1971). Later, adjustment of the EIV model from the Bayesian perspective, rather than from the frequentist point of view, was investigated in a number of publications in different disciplines: e.g., Bauwens and Lubrano (1999) in economics; Florens et al. (1974), Polasek (1995), Reilly and Patino-Leal (1981), Bolfarine and Rodrigues (2007) and Huang (2010) in statistics; and Dellaportas and Stephens (1995) in biometrics. However, the parameter estimation methods proposed in these publications were based on a simplified functional model (e.g., a regression model) and on special stochastic information, i.e., a dispersion matrix of a particular structure. Therefore, we propose a Bayesian approach to handle the EIV model with general stochastic information. As a result, prior information about the parameter vector as well as the variance factor can be fully considered in the Bayesian inference for a suitable solution of the EIV model. Furthermore, the estimated variance component will be proven to be always smaller than that of the LS estimation.

The objectives of this contribution are as follows. Firstly, the EIV model is formulated with a known probability density function (PDF); as a case study, the normal distribution is assumed. Secondly, the formulae of the maximum likelihood estimator (MLE) and the maximum a posteriori estimator (MAPE) are derived to solve the model with noninformative and informative prior density functions of the parameters, under the conditions of known and unknown variance factor, respectively. Furthermore, it is proven that the variance factor estimated by the proposed (marginal) maximum likelihood method is always smaller than the traditional estimate of the variance factor in the framework of the LS method. Next, a simulated example is presented. Finally, some concluding remarks and suggestions for further work are given.

2. THE FORMULATION OF THE ERRORS-IN-VARIABLES MODEL

It is well known that the LS estimator is the best linear unbiased estimator when the design matrix is free of noise and the expectation of the random errors in the traditional observation vector equals zero. This kind of estimation has frequently been applied to the GMM for error adjustment. In contrast, the EIV model is similar to the GMM except that the elements of the design matrix are contaminated by random errors, as the small simulation below illustrates.
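The following small simulation is our own illustration (the variable names and noise levels are not taken from the paper): a straight line through the origin is fitted by ordinary LS to data whose single design-matrix column is itself observed with noise. The LS slope is then systematically attenuated, which is precisely the effect a rigorous EIV adjustment is designed to avoid.

```python
import numpy as np

rng = np.random.default_rng(0)
slope_true, n_rep = 1.5, 2000
estimates = []
for _ in range(n_rep):
    x_true = rng.uniform(-1.0, 1.0, size=50)
    x_obs = x_true + 0.3 * rng.normal(size=50)        # errors in the design-matrix column
    y_obs = slope_true * x_true + 0.1 * rng.normal(size=50)
    A = x_obs.reshape(-1, 1)                          # the GMM treats this column as error free
    estimates.append(np.linalg.lstsq(A, y_obs, rcond=None)[0][0])

print("true slope          :", slope_true)
print("mean of LS estimates:", round(float(np.mean(estimates)), 3))  # systematically below 1.5
```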
Consequently, the observation equation of the EIV model can be expressed as

$$\mathbf{y} + \mathbf{v}_y = (\mathbf{A} + \mathbf{V}_A)\,\boldsymbol{\xi}\,, \tag{1}$$

where the full column rank n × m matrix A, which is affected by random errors, and the conventional observation vector y have the correction matrix V_A and the correction vector v_y, respectively, and ξ is the vector of unknown parameters. With all error-affected variables treated as observations, the observations can be collected in the extended vector

$$\mathbf{l} = \begin{bmatrix} \mathrm{vec}\,\mathbf{A} \\ \mathbf{y} \end{bmatrix}, \tag{2}$$

where 'vec' denotes the operator that stacks one column of a matrix underneath the previous one. The corresponding extended correction vector v and the stochastic properties of the errors can be characterized as follows:

$$\mathbf{v} = \begin{bmatrix} \mathbf{v}_A \\ \mathbf{v}_y \end{bmatrix} = \begin{bmatrix} \mathrm{vec}\,\mathbf{V}_A \\ \mathbf{v}_y \end{bmatrix}, \qquad E\{\mathbf{v}\} = \mathbf{0}, \qquad D\{\mathbf{v}\} = \boldsymbol{\Sigma}_{ll}\,, \tag{3}$$

where

$$\boldsymbol{\Sigma}_{ll} = \sigma_0^2 \begin{bmatrix} \mathbf{Q}_{AA} & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_{yy} \end{bmatrix} = \sigma_0^2 \begin{bmatrix} \mathbf{P}_{AA} & \mathbf{0} \\ \mathbf{0} & \mathbf{P}_{yy} \end{bmatrix}^{-1}$$

is the dispersion matrix of the extended observation vector; the matrices Q_AA and Q_yy are the symmetric and positive definite cofactor matrices for v_A = vec V_A and v_y, respectively, and σ_0^2 is the variance factor. P_AA and P_yy are the symmetric and positive definite weight matrices for v_A and v_y. In this paper, the vectors v_A and v_y are assumed to be independent. In order to adjust the model, the objective function in the sense of the weighted TLS criterion reads

$$\mathbf{v}_A^T \mathbf{P}_{AA}\, \mathbf{v}_A + \mathbf{v}_y^T \mathbf{P}_{yy}\, \mathbf{v}_y = \min. \tag{4}$$

3. THE MAXIMUM LIKELIHOOD ESTIMATOR FOR THE EIV MODEL

Following the common assumption, the (vectorized) model matrix A and the observation vector y are here assumed to be normally distributed. Provided that σ_0^2 is known, the distributions are given as follows:

$$\mathbf{y}\,|\,\mathbf{U}_A, \boldsymbol{\xi} \sim N\!\left(\mathbf{U}_A \boldsymbol{\xi},\; \sigma_0^2 \mathbf{Q}_{yy}\right), \qquad \mathrm{vec}\,\mathbf{A}\,|\,\mathbf{U}_A \sim N\!\left(\mathrm{vec}\,\mathbf{U}_A,\; \sigma_0^2 \mathbf{Q}_{AA}\right), \tag{5}$$

where U_A denotes the expectation of the coefficient matrix. Here, the variance factor σ_0^2 is modeled to be identical for all measured quantities (see Schaffrin and Felus, 2008; Shen et al., 2010). Modeling different variance components for the design matrix A and the conventional observation vector y leads to the issue of variance component estimation within the EIV model (Amiri-Simkooei, 2013; Xu and Liu, 2014). With the distributions in Eq. (5), the likelihood function is given as

$$L(\mathbf{l}\,|\,\boldsymbol{\xi}, \mathbf{U}_A) = \frac{\exp\!\left(-\dfrac{1}{2\sigma_0^2} \begin{bmatrix} \mathrm{vec}\,\mathbf{U}_A - \mathrm{vec}\,\mathbf{A} \\ \mathbf{U}_A\boldsymbol{\xi} - \mathbf{y} \end{bmatrix}^T \begin{bmatrix} \mathbf{P}_{AA} & \mathbf{0} \\ \mathbf{0} & \mathbf{P}_{yy} \end{bmatrix} \begin{bmatrix} \mathrm{vec}\,\mathbf{U}_A - \mathrm{vec}\,\mathbf{A} \\ \mathbf{U}_A\boldsymbol{\xi} - \mathbf{y} \end{bmatrix}\right)}{(2\pi)^{n/2}\,\left(\det \boldsymbol{\Sigma}_{ll}\right)^{1/2}}\,. \tag{6}$$

Conveniently, working with the logarithm of the likelihood function (6) instead of the exponential function yields

$$\ln L(\mathbf{l}\,|\,\boldsymbol{\xi}, \mathbf{U}_A) = -\frac{1}{2} \begin{bmatrix} \mathrm{vec}\,\mathbf{U}_A - \mathrm{vec}\,\mathbf{A} \\ \mathbf{U}_A\boldsymbol{\xi} - \mathbf{y} \end{bmatrix}^T \boldsymbol{\Sigma}_{ll}^{-1} \begin{bmatrix} \mathrm{vec}\,\mathbf{U}_A - \mathrm{vec}\,\mathbf{A} \\ \mathbf{U}_A\boldsymbol{\xi} - \mathbf{y} \end{bmatrix} - \frac{n}{2}\ln 2\pi - \frac{1}{2}\ln\det\boldsymbol{\Sigma}_{ll}\,. \tag{7}$$

In order to obtain the maximum of the log-likelihood function in Eq. (7), two partial derivatives, with respect to the unknown parameter vector ξ and to the (vectorized) model matrix U_A, have to be derived analytically. The first-order derivative of Eq. (7) with respect to the parameter vector reads

$$\frac{\partial \ln L(\mathbf{l}\,|\,\boldsymbol{\xi}, \mathbf{U}_A)}{\partial \boldsymbol{\xi}} = -\frac{1}{2}\,\frac{\partial}{\partial \boldsymbol{\xi}} \begin{bmatrix} \mathrm{vec}\,\mathbf{U}_A - \mathrm{vec}\,\mathbf{A} \\ \mathbf{U}_A\boldsymbol{\xi} - \mathbf{y} \end{bmatrix}^T \boldsymbol{\Sigma}_{ll}^{-1} \begin{bmatrix} \mathrm{vec}\,\mathbf{U}_A - \mathrm{vec}\,\mathbf{A} \\ \mathbf{U}_A\boldsymbol{\xi} - \mathbf{y} \end{bmatrix} = -\frac{1}{2\sigma_0^2}\,\frac{\partial\,(\mathbf{y} - \mathbf{U}_A\boldsymbol{\xi})^T \mathbf{P}_{yy}\,(\mathbf{y} - \mathbf{U}_A\boldsymbol{\xi})}{\partial \boldsymbol{\xi}} = \frac{1}{\sigma_0^2}\,\mathbf{U}_A^T \mathbf{P}_{yy}\,(\mathbf{y} - \mathbf{U}_A\boldsymbol{\xi})\,. \tag{8}$$

The first derivative of Eq.