Metric and Topological Entropy Bounds for Optimal Coding of Stochastic Dynamical Systems Christoph Kawan and Serdar Yuksel¨
Total Page:16
File Type:pdf, Size:1020Kb
1 Metric and topological entropy bounds for optimal coding of stochastic dynamical systems Christoph Kawan and Serdar Yuksel¨ Abstract—We consider the problem of optimal zero-delay cod- The system (1) is connected over a possibly noisy channel ing and estimation of a stochastic dynamical system over a with a finite capacity to an estimator, as shown in Fig. 1. The noisy communication channel under three estimation criteria estimator has access to the information it has received through concerned with the low-distortion regime. The criteria considered are (i) a strong and (ii) a weak form of almost sure stability the channel. A source coder maps the source symbols (i.e., of the estimation error as well as (ii) quadratic stability in state values) to corresponding channel inputs. The channel expectation. For all three objectives, we derive lower bounds on inputs are transmitted through the channel; we assume that the smallest channel capacity C0 above which the objective can be the channel is a discrete channel with input alphabet and achieved with an arbitrarily small error. We first obtain bounds output alphabet 0. M through a dynamical systems approach by constructing an M infinite-dimensional dynamical system and relating the capacity xt qt q0 with the topological and the metric entropy of this dynamical SYSTEM CODER CHANNEL t ESTIMATOR system. We also consider information-theoretic and probability- theoretic approaches to address the different criteria. Finally, xˆt we prove that a memoryless noisy channel in general constitutes qt0 no obstruction to asymptotic almost sure state estimation with arbitrarily small errors, when there is no noise in the system. The results provide new solution methods for the criteria introduced Fig. 1. Coding and state estimation over a noisy channel with feedback (e.g., standard information-theoretic bounds cannot be applied for some of the criteria) and establish further connections We refer by a coding policy Π, to a sequence of functions between dynamical systems, networked control, and information (γe) which are causal such that the channel input at time theory, and especially in the context of nonlinear stochastic t t2Z+ systems. t, qt , under Π is generated by a function of its local information,2 M i.e., e e qt = γ ( ); t It I. INTRODUCTION e 0 where = x ; q and qt , the channel input It f [0;t] [0;t−1]g 2 M In this paper, we consider nonlinear stochastic systems given alphabet given by = 1; 2;:::;M ; for 0 t T 1. by an equation of the form M f g ≤ ≤ − Here, we use the notation x[0;t−1] = xs : 0 s t 1 for t 1. f ≤ ≤ − g xt+1 = f(xt; wt): (1) ≥ 0 The channel maps qt to qt in a stochastic fashion so that Here xt is the state at time t and (wt)t2Z+ is an i.i.d. sequence 0 0 P (qt qt; q[0;t−1]; q[0;t−1]) is a conditional probability measure of random variables with common distribution wt ν, j on 0 for all . If this expression is equal to ( 0 ), modeling the noise. In general, we assume that ∼ t Z+ P qt qt theM channel is said2 to be memoryless, i.e., the past variablesj f : X W X do not affect the channel output q0 given the current channel × ! t input q . is a Borel measurable map, where ( ) is a complete t arXiv:1612.00564v3 [math.OC] 26 Jun 2018 X; d metric space and W a measurable space, so that for any The receiver, upon receiving the information from the channel, w W the map f( ; w) is a homeomorphism of X. We generates an estimate x^t at time t, also causally: An admissible 2 · d further assume that x0 is a random variable on X with an causal estimation policy is a sequence of functions (γt )t2Z+ d 0 associated probability measure π0, stochastically independent such that x^t = γt (q[0;t]) with of (wt)t2Z+ . We use the notations d :( 0)t+1 0 x γt X; t : fw(x) = f(x; w); f (w) = f(x; w): M ! ≥ x so that fw : X X and f : W X. For a given " > 0, we denote by C" the smallest channel ! ! capacity above which there exist an encoder and an estimator C. Kawan is with the Faculty of Computer Science and Mathematics, so that one of the following estimation objectives is achieved: University of Passau, 94032 Passau, Germany (e-mail: christoph.kawan@uni- passau.de). S. Yuksel¨ is with the Department of Mathematics and Statistics, Queen’s University, Kingston, Ontario, Canada, K7L 3N6 (e-mail: yuk- (E1) Eventual almost sure stability of the estimation error: [email protected]). This research was supported in part by the Natural There exists T (") 0 so that Sciences and Engineering Research Council (NSERC) of Canada. Some re- ≥ sults of this paper appeared in part at the 2017 IEEE International Symposium sup d(xt; x^t) " a.s. (2) on Information Theory. t≥T (") ≤ 2 (E2) Asymptotic almost sure stability of the estimation error: area have typically involved linear systems, and in the non- linear case the studies have only been on deterministic systems P lim sup d(xt; x^t) " = 1: (3) t!1 ≤ estimated/controlled over deterministic channels, with few exceptions. For linear systems, data-rate theorem type results (E3) Asymptotic quadratic stability of the estimation error in have been presented in [44], [39], [28], [23], [26]. expectation: The papers [18], [19], [34], [24], [25] studied state estimation 2 lim sup E[d(xt; x^t) ] ": (4) for non-linear deterministic systems and noise-free channels. t!1 ≤ In [18], [19], Liberzon and Mitra characterized the critical data rate C0 for exponential state estimation with a given exponent A. Literature Review and Contributions α 0 for a continuous-time system on a compact subset K of≥ its state space. As a measure for C , they introduced a In a recent work [16], we investigated the same problem 0 quantity called estimation entropy h (α; K), which equals for the special case involving only deterministic systems and est the topological entropy on K in case α = 0, but for α > 0 discrete noiseless channels. In this paper, we will provide is no longer a purely topological quantity. The paper [15] further connections between the ergodic theory of dynamical provided a lower bound on h (α; K) in terms of Lyapunov systems and information theory by answering the problems est exponents under the assumption that the system preserves posed in the previous section and relating the answers to the a smooth measure. In [24], [25], Matveev and Pogromsky concepts of either metric or topological entropy. Our findings studied three estimation objectives of increasing strength for complement and generalize our results in [16] since here discrete-time non-linear systems. For the weakest one, the we consider stochasticity in the system dynamics and/or the smallest bit rate was shown to be equal to the topological communication channels. entropy. For the other ones, general upper and lower bounds As we note in [16], optimal coding of stochastic processes is were obtained which can be computed directly in terms of a problem that has been studied extensively; in information the linearized right-hand side of the equation generating the theory in the context of per-symbol cost minimization, in dy- system. namical systems in the context of identifying representational A further closely related paper is due to Savkin [36], which and equivalence properties between dynamical systems, and uses topological entropy to study state estimation for a class in networked control in the context of identifying informa- of non-linear systems over noise-free digital channels. In fact, tion transmission requirements for stochastic stability or cost our results can be seen as stochastic analogues of some of the minimization. As such, for the criteria laid out in (E1)-(E3) results presented in [36], which show that for sufficiently per- above, the results in our paper are related to the efforts in the turbed deterministic systems state estimation with arbitrarily literature in the following three general areas. small error is not possible over finite-capacity channels. See Dynamical systems and ergodic theory. Historically there Remark III.11 for further discussions with regard to [36]. has been a symbiotic relation between the ergodic theory of A related problem is the control of non-linear systems over dynamical systems and information theory (see, e.g., [38], communication channels. This problem has been studied in [6] for comprehensive reviews). Information-theoretic tools few publications, and mainly for deterministic systems and/or have been foundational in the study of dynamical systems, deterministic channels. Recently, [48] studied stochastic sta- for example the metric (also known as Kolmogorov-Sinai bility properties for a more general class of stochastic non- or measure-theoretic) entropy is crucial in the celebrated linear systems building on information-theoretic bounds and Shannon-McMillan-Breiman theorem as well as two important Markov-chain-theoretic constructions. However, these bounds representation theorems: Ornstein’s (isomorphism) theorem do not distinguish between the unstable and stable components and the Krieger’s generator theorem [9], [31], [32], [14], [6]. of the tangent space associated with a dynamical non-linear The concept of sliding block encoding [8] is a stationary system, while the entropy bounds established in this paper encoding of a dynamical system defined by the shift process, make such a distinction, but only for estimation problems and leading to fundamental results on the existence of stationary in the low-distortion regime. codes which perform as good as the limit performance of a sequence of optimal block codes.