An Empirical Bayes Approach to Identification of Modules in Dynamic
Total Page:16
File Type:pdf, Size:1020Kb
An empirical Bayes approach to identification of modules in dynamic networks Niklas Everitt a, Giulio Bottegal b, H˚akan Hjalmarsson a aACCESS Linneaus Center, School of Electrical Engineering, KTH Royal Institute of Technology, Sweden bDepartment of Electrical Engineering, TU Eindhoven, The Netherlands Abstract We present a new method of identifying a specific module in a dynamic network, possibly with feedback loops. Assuming known topology, we express the dynamics by an acyclic network composed of two blocks where the first block accounts for the relation between the known reference signals and the input to the target module, while the second block contains the target module. Using an empirical Bayes approach, we model the first block as a Gaussian vector with covariance matrix (kernel) given by the recently introduced stable spline kernel. The parameters of the target module are estimated by solving a marginal likelihood problem with a novel iterative scheme based on the Expectation-Maximization algorithm. Additionally, we extend the method to include additional measurements downstream of the target module. Using Markov Chain Monte Carlo techniques, it is shown that the same iterative scheme can solve also this formulation. Numerical experiments illustrate the effectiveness of the proposed methods. Key words: system identification, dynamic network, empirical Bayes, expectation-maximization. 1 Introduction tion. The first is topology detection, that is, understand- ing the interconnection between the modules. The sec- Networks of dynamical systems are everywhere, and ap- ond problem is the identification of one or more specific plications are in different branches of science, e.g., econo- modules in the network. Some recent papers deal with metrics, systems biology, social science, and power sys- both the aforementioned problems [4,5,1], whereas oth- tems. Identification of these networks, usually referred ers are mainly focused on the identification of a single to as dynamic networks, has been given increasing atten- module in the network [6,7,8,9,10]. As observed in [2], tion in the system identification community, see e.g., [1], dynamic networks with known topology can be seen as a [2], [3]. In this paper, we use the term \dynamic net- generalization of simple compositions, such as systems in work" to mean the interconnection of modules, where parallel, series or feedback connection. Therefore, iden- each module is a linear time-invariant (LTI) system. The tification techniques for dynamic networks may be de- interconnecting signals are the outputs of these modules; rived by extending methods already developed for simple we also assume that exogenous measurable signals may structures. This is the idea underlying the method pro- affect the dynamics of the network. posed in [7], which generalized the results of [11] for the identification of cascaded systems to the context of dy- namic networks. In that work, the underlying idea is that Two main problems arise in dynamic network identifica- a dynamic network can be transformed into an acyclic ? structure, where any reference signal of the network is This work was supported by the Swedish Research Coun- the input to a cascaded system consisting of two LTI cil under contracts 2015-05285 and NewLEADS 2016-06079, blocks. In this alternative system description, the first by the European Research Council under the advanced grant LEARN, contract 267381, and by the European Research block captures the relation between the reference and Council (ERC), Advanced Research Grant SYSDYNET, un- the noisy input of the target module, the second block der the European Union's Horizon 2020 research and inno- contains the target module. The two LTI blocks are iden- vation programme (grant agreement No 694504). tified simultaneously using the prediction error method Email addresses: [email protected] (Niklas Everitt), (PEM) [12]. In this setup, determining the model struc- [email protected] (Giulio Bottegal), [email protected] ture of the first block of the cascaded structure may be (H˚akan Hjalmarsson). Preprint submitted to Automatica 3 December 2017 complicated, due to the possibly large number of inter- in combination with the ECM method, builds up a novel connections in the dynamic network. Furthermore, it re- identification method for the target module. quires knowledge of the model structure of essentially all modules in the feedback loop. Therefore, in [7], the A part of this paper has previously been presented in first block is modeled by an unstructured finite impulse [20]. More specifically, the case where only the sensors response (FIR) model of high order. The major draw- directly measuring the input and the output of the tar- back of this approach is that, as is usually the case with get module are used in the identification process were estimated models of high order, the variance of the es- partly covered in [20], whereas, the general case where timated FIR model is high. The uncertainty in the es- more sensors spread in the network are used in the iden- timate of the FIR model of the first block will in turn tification of the target module is completely novel. decrease the accuracy of the estimated target module. The objective of this paper is to propose a method for 2 Problem Statement the identification of a module in dynamic networks that circumvents the high variance that is due to the high or- We consider dynamic networks that consist of L scalar der model of the first block. Our main focus is on the internal variables wj(t), j = 1;:::;L and L scalar exter- identification of a specific module, which we assume is nal reference signals rl(t), l = 1;:::;L. We do not state well described through a low-order parametric model. any specific requirement on the reference signals (i.e., we Following a recent trend in system identification [13], we do not assume any condition on persistent excitation in use regularization to control the covariance of the iden- the input [12]), requiring only rl(t) 6= 0, l = 1;:::;L, for tified sensitivity path by modeling its impulse response some t. Notice, however, that even though the method as a zero-mean stochastic process. The covariance ma- presented in this paper does not require any specifics of trix of this process is given by the recently introduced the input, the resulting estimate is of course highly de- stable spline kernel [14], whose structure is parametrized pendent on the properties of the input [12]. Some of the by two hyperparameters. reference signal may not be present, i.e., they may be identically zero. Define R as the set of indices of refer- We also consider the case where more sensors spread in ence signals that are present. In the dynamic network, the network are used in the identification of the target the internal variables are considered nodes and transfer module, motivated by the fact that adding information functions are the edges. Introducing the vector notation > > through addition of measurements used in the identifica- w(t) := [w1(t) : : : wL(t)] , r(t) := [r1(t) : : : rL(t)] , tion process has the potential to further reduce the vari- the dynamics of the network are defined by the equation ance of the estimated module [15]. To avoid too many additional parameters to estimate, we model also the im- w(t) = G(q)w(t) + r(t); (1) pulse response of the path linking the target module to any additional sensor as a Gaussian process. with 2 3 An estimate of the target module is obtained by em- 0 G (q) ··· G (q) pirical Bayes (EB) arguments, that is, by maximiza- 12 1L 6 . 7 tion of the marginal likelihood of the available measure- 6 . 7 6G21(q) 0 . 7 ments [13]. This likelihood does not admit an analyti- G(q) = 6 7 ; 6 . .. .. 7 cal expression and depends not only on the parameter 6 . G(L−1)L(q)7 of the target module, but also on the kernel hyperpa- 4 5 G (q) ··· G (q) 0 rameters and the variance of the measurement noise. To L1 L(L−1) estimate all these quantities, we design a novel itera- tive scheme based on an EM-type algorithm [16], known where Gji(q) is a proper rational transfer function for as the Expectation/Conditional-Maximization (ECM) j = 1;:::;L, i = 1;:::;L. The internal variables w(t) algorithm [17]. This algorithm alternates the so called are measured with additive white noise, that is expectation step (E-step) with a series of conditional- maximization steps (CM-steps) that, in the problem un- w~(t) = w(t) + e(t); der analysis, consist of relatively simple optimization problems, which either admit a closed form solution, or where e(t) 2 RL is a stationary zero-mean Gaussian can be efficiently solved using gradient descent strate- white-noise process with diagonal noise covariance ma- 2 2 2 gies. As for the E-step, we are required to compute trix Σe = diag σ1; : : : ; σL . We assume that the σi are an integral that, in the general case of multiple down- unknown. To ensure stability and causality of the net- stream sensor, does not admit an analytical expression. work, the following assumptions hold for all networks To overcome this issue, we use Markov Chain Monte considered in this paper. Carlo (MCMC) techniques [18] to solve the integral as- sociated with the E-step. In particular, we design an in- Assumption 2.1 The network is well posed in the sense tegration scheme based on the Gibbs sampler [19] that, that all principal minors of limq!1(I − G(q)) are non- 2 zero [2]. Furthermore, the sensitivity path S(q) = (I − G(q))−1 is stable G31 r4 r2 w w w w Assumption 2.2 The reference variables frl(t)g are + 4 G + 1 G + 2 G + 3 mutually uncorrelated and uncorrelated with the mea- 14 21 32 surement noise e(t).