
Santa Fe Institute Working Paper 16-07-015
arXiv:1607.06526 [math-ph]

Beyond the Spectral Theorem: Spectrally Decomposing Arbitrary Functions of Nondiagonalizable Operators

Paul M. Riechers∗ and James P. Crutchfield†
Complexity Sciences Center, Department of Physics, University of California at Davis, One Shields Avenue, Davis, CA 95616
(Dated: September 24, 2017)

Nonlinearities in finite dimensions can be linearized by projecting them into infinite dimensions. Unfortunately, the familiar linear operator techniques that one would then use often fail since the operators cannot be diagonalized. This curse is well known. It also occurs for finite-dimensional linear operators. We circumvent it via two tracks. First, using the well-known holomorphic functional calculus, we develop new practical results about spectral projection operators and the relationship between left and right generalized eigenvectors. Second, we generalize the holomorphic calculus to a meromorphic functional calculus that can decompose arbitrary functions of nondiagonalizable linear operators in terms of their eigenvalues and projection operators. This simultaneously simplifies and generalizes functional calculus so that it is readily applicable to analyzing complex physical systems. Together, these results extend the spectral theorem of normal operators to a much wider class, including circumstances in which poles and zeros of the function coincide with the operator spectrum. By allowing the direct manipulation of individual eigenspaces of nonnormal and nondiagonalizable operators, the new theory avoids spurious divergences. As such, it yields novel insights and closed-form expressions across several areas of physics in which nondiagonalizable dynamics arise, including memoryful stochastic processes, open nonunitary quantum systems, and far-from-equilibrium thermodynamics. The technical contributions include the first full treatment of arbitrary powers of an operator, highlighting the special role of the zero eigenvalue. Furthermore, we show that the Drazin inverse, previously only defined axiomatically, can be derived as the negative-one power of singular operators within the meromorphic functional calculus, and we give a new general method to construct it. We provide new formulae for constructing spectral projection operators and delineate the relations among projection operators, eigenvectors, and left and right generalized eigenvectors. By way of illustrating its application, we explore several, rather distinct examples. First, we analyze stochastic transition operators in discrete and continuous time. Second, we show that nondiagonalizability can be a robust feature of a stochastic process, induced even by simple counting. As a result, we directly derive distributions of the time-dependent Poisson process and point out that nondiagonalizability is intrinsic to it and the broad class of hidden semi-Markov processes. Third, we show that the Drazin inverse arises naturally in stochastic thermodynamics and that applying the meromorphic functional calculus provides closed-form solutions for the dynamics of key thermodynamic observables. Fourth, we show that many memoryful processes have power spectra indistinguishable from white noise, despite being highly organized. Nevertheless, whenever the power spectrum is nontrivial, it is a direct signature of the spectrum and projection operators of the process' hidden linear dynamic, with nondiagonalizable subspaces yielding qualitatively distinct line profiles. Finally, we draw connections to the Ruelle–Frobenius–Perron and Koopman operators for chaotic dynamical systems and propose how to extract eigenvalues from a time-series.

PACS numbers: 02.50.-r, 05.45.Tp, 02.50.Ey, 02.50.Ga

∗ [email protected]
† [email protected]

... the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.
A. Einstein [1, p. 165]

I. INTRODUCTION

Decomposing a complicated system into its constituent parts—reductionism—is one of science's most powerful strategies for analysis and understanding. Large-scale systems with linearly coupled components give one paradigm of this success. Each can be decomposed into an equivalent system of independent elements using a similarity transformation calculated from the linear algebra of the system's eigenvalues and eigenvectors. The physics of linear wave phenomena, whether of classical light or quantum mechanical amplitudes, sets the standard of complete reduction rather high. The dynamics is captured by an "operator" whose allowed or exhibited "modes" are the elementary behaviors out of which composite behaviors are constructed by simply weighting each mode's contribution and adding them up.

However, one should not reduce a composite system more than is necessary nor, as is increasingly appreciated these days, more than one, in fact, can. Indeed, we live in a complex, nonlinear world whose constituents are strongly interacting. Often their key structures and memoryful behaviors emerge only over space and time. These are the complex systems. Yet, perhaps surprisingly, many complex systems with nonlinear dynamics correspond to linear operators in abstract high-dimensional spaces [2–4]. And so, there is a sense in which even these complex systems can be reduced to the study of independent nonlocal collective modes.

Reductionism, however, faces its own challenges even within its paradigmatic setting of linear systems: linear operators may have interdependent modes with irreducibly entwined behaviors. These irreducible components correspond to so-called nondiagonalizable subspaces. No similarity transformation can reduce them. In this view, reductionism can only ever be a guide. The actual goal is to achieve a happy medium, as Einstein reminds us, of decomposing a system only to that level at which the parts are irreducible. To proceed, though, begs the original question, What happens when reductionism fails? To answer this requires revisiting one of its more successful implementations, spectral decomposition of completely reducible operators.

A. Spectral Decomposition

Spectral decomposition—splitting a linear operator into independent modes of simple behavior—has greatly accelerated progress in the physical sciences. The impact stems from the fact that spectral decomposition is not only a powerful mathematical tool for expressing the organization of large-scale systems, but also yields predictive theories with directly observable physical consequences [5]. Quantum mechanics and statistical mechanics identify the energy eigenvalues of Hamiltonians as the basic objects in thermodynamics: transitions among the energy eigenstates yield heat and work. The eigenvalue spectrum reveals itself most directly in other kinds of spectra, such as the frequency spectra of light emitted by the gases that permeate the galactic filaments of our universe [6]. Quantized transitions, an initially mystifying feature of atomic-scale systems, correspond to distinct eigenvectors and discrete spacing between eigenvalues. The corresponding theory of spectral decomposition established the quantitative foundation of quantum mechanics.

The applications and discoveries enabled by spectral decomposition and the corresponding spectral theory fill a long list. In application, direct-bandgap semiconducting materials can be turned into light-emitting diodes (LEDs) or lasers by engineering the spatially-inhomogeneous distribution of energy eigenvalues and the occupation of their corresponding states [7]. Before their experimental discovery, anti-particles were anticipated as the nonoccupancy of negative-energy eigenstates of the Dirac Hamiltonian [8].

The spectral theory, though, extends far beyond physical science disciplines. In large measure, this arises since the evolution of any object corresponds to a linear dynamic in a sufficiently high-dimensional state space. Even nominally nonlinear dynamics over several variables, the canonical mechanism of deterministic chaos, appear as linear dynamics in appropriate infinite-dimensional shift-spaces [4]. A nondynamic version of rendering nonlinearities into linearities in a higher-dimensional feature space is exploited with much success today in machine learning by support vector machines, for example [9]. Spectral decomposition often allows a problem to be simplified by approximations that use only the dominant contributing modes. Indeed, human-face recognition can be efficiently accomplished using a small number of "eigenfaces" [10].

Certainly, there are many applications that highlight the importance of decomposition and the spectral theory of operators. However, a brief reflection on the mathematical history will give better context to its precise results, associated assumptions, and, more to the point, the generalizations we develop here in hopes of advancing the analysis and understanding of complex systems.

Following on early developments of operator theory by Hilbert and co-workers [11], the spectral theorem for normal operators reached maturity under von Neumann by the early 1930s [12, 13]. It became the mathematical backbone of much progress in physics since then, from classical partial differential equations to quantum physics. Normal operators, by definition, commute with their Hermitian conjugate: A†A = AA†. Examples include symmetric and orthogonal matrices in classical mechanics and Hermitian, skew-Hermitian, and unitary operators in quantum mechanics.

The spectral theorem itself is often identified as a collection of related results about normal operators; see, e.g., Ref. [14]. In the case of finite-dimensional vector spaces [15], the spectral theorem asserts that normal operators are diagonalizable and can always be diagonalized by a unitary transformation; that left and right eigenvectors (or eigenfunctions) are simply related by complex-conjugate transposition; that these eigenvectors form a complete basis; and that functions of a normal operator reduce to the action of the function on each eigenvalue. Most of these qualities survive with only moderate provisos in the infinite-dimensional case. In short, the spectral theorem makes physics governed by normal operators tractable.

The spectral theorem, though, appears powerless when faced with nonnormal and nondiagonalizable operators. What then are we to do when confronted by, say, complex interconnected systems with nonunitary time evolution, by open systems, by structures that emerge on space and time scales different from the equations of motion, or by other novel physics governed by nonnormal and not-necessarily-diagonalizable operators? Where is the comparably constructive framework for calculations beyond the standard spectral theorem? Fortunately, portions of the necessary generalization have been made within pure mathematics [16], some finding applications in engineering and control [17, 18]. However, what is available is incomplete. And, even that which is available is often not in a form adapted to perform calculations that lead to quantitative predictions.

B. Synopsis

Here, we build on previous work in functional analysis and operator theory to provide both a rigorous and constructive foundation for physically relevant calculations involving not-necessarily-diagonalizable operators. In effect, we extend the spectral theorem for normal operators to a broader setting, allowing generalized "modes" of nondiagonalizable systems to be identified and manipulated. The meromorphic functional calculus we develop extends Taylor-series expansion and the standard holomorphic functional calculus to analyze arbitrary functions of not-necessarily-diagonalizable operators. It readily handles singularities arising when poles (or zeros) of the function coincide with poles of the operator's resolvent—poles that appear precisely at the operator's eigenvalues. Pole–pole and pole–zero interactions substantially modify the complex-analytic residues within the functional calculus. A key result is that the negative-one power of a singular operator exists in the meromorphic functional calculus. It is the Drazin inverse, a powerful tool that is receiving increased attention in stochastic thermodynamics and elsewhere. Furthermore, we derive consequences from the more familiar holomorphic functional calculus that readily allow spectral decomposition of nondiagonalizable operators in terms of spectral projections and left and right generalized eigenvectors—decanting the abstract mathematical theory into a more tractable framework for analyzing complex physical systems.

Taken together, the functional calculus, Drazin inverse, and methods to manipulate particular eigenspaces are key to a thorough-going analysis of many complex systems, many now accessible for the first time. Indeed, the framework has already been fruitfully employed in several specific applications, including closed-form expressions for signal processing and information measures of hidden Markov processes [19–23], compressing stochastic processes over a quantum channel [24, 25], and stochastic thermodynamics [26, 27]. However, the techniques are sufficiently general that they will be much more widely useful. We envision new opportunities for similar detailed analyses, ranging from biophysics to quantum field theory, wherever restrictions to normal operators and diagonalizability have been roadblocks.

With this broad scope in mind, we develop the mathematical theory first without reference to specific applications and disciplinary terminology. We later give pedagogical (yet, we hope, interesting) examples, exploring several niche, but important applications to finite hidden Markov processes, basic stochastic process theory, nonequilibrium thermodynamics, signal processing, and nonlinear dynamical systems. At a minimum, the examples and their breadth serve to better acquaint readers with the basic methods required to employ the theory.

We introduce the meromorphic functional calculus in §III through §IV, after necessary preparation in §II. §V A further explores and gives a new formula for eigenprojectors, which we refer to here simply as projection operators. §V B makes explicit their general relationship with eigenvectors and generalized eigenvectors and clarifies the orthonormality relationship among left and right generalized eigenvectors. §V B 4 then discusses simplifications of the functional calculus for special cases, while §VI A takes up the spectral properties of transition operators. The examples are discussed at length in §VI before we close in §VII with suggestions on future applications and research directions.

II. SPECTRAL PRIMER

The following is relatively self-contained, assuming basic familiarity with linear algebra at the level of Refs. [15, 17]—including eigen-decomposition and knowledge of the Jordan canonical form, partial fraction expansion (see Ref. [28]), and series expansion—and basic knowledge of complex analysis—including the residue theorem and calculation of residues at the level of Ref. [29]. For those lacking a working facility with these concepts, a quick review of §VI's applications may motivate reviewing them.

In this section, we introduce our notation and, in doing is complex differentiable throughout the domain under so, remind the reader of certain basic concepts in linear consideration. A pole of order n at z0 C is a singular- n ∈ algebra and complex analysis that will be used exten- ity that behaves as h(z)/(z z0) as z z0, where h(z) is − → sively in the following. holomorphic within a neighborhood of z0 and h(z0) = 0. 6 To begin, we restrict attention to operators with finite We say that h(z) has a zero of order m at z1 if 1/h(z) has representations and only sometimes do we take the limit a pole of order m at z1.A meromorphic function is one of going to infinity. That is, we do not consider that is holomorphic except possibly at a set of isolated infinite- operators outright. While this runs counter poles within the domain under consideration. to previous presentations in that Defined over the continuous complex variable z C, ∈ consider only infinite-dimensional operators, the upshot A’s resolvent: is that they—as limiting operators—can be fully treated −1 with a countable point spectrum. We present examples of R(z; A) (zI A) , ≡ − this later on. Accordingly, we restrict our attention to op- erators with at most a countably infinite spectrum. Such captures all of A’s spectral information through the poles operators share many features with finite-dimensional of R(z; A)’s elements. In fact, the resolvent con- square matrices, and so we recall several elementary but tains more than just A’s spectrum: we later show that the essential facts from matrix theory used repeatedly in the order of each pole gives the index ν of the corresponding main development. eigenvalue. If A is a finite-dimensional , then its spec- The spectrum ΛA can be expressed in terms of the resolvent. Explicitly, the point spectrum (i.e., the set of trum is simply the set ΛA of its eigenvalues: eigenvalues) is the set of complex values z at which zI A  − ΛA = λ C : det(λI A) = 0 , is not a one-to-one mapping, with the implication that ∈ − the inverse of zI A does not exist: − where det( ) is the of its argument and I ·  is the . The algebraic multiplicity a of ΛA = λ C : R(λ; A) = inv(λI A) , λ ∈ 6 − eigenvalue λ is the power of the term (z λ) in the char- − where inv( ) is the inverse of its argument. Later, via acteristic det(zI A). In contrast, the geo- · − our investigation of the Drazin inverse, it should become metric multiplicity gλ is the dimension of the of the transformation A λI or, equivalently, the number clear that the resolvent operator can be self-consistently − of linearly independent eigenvectors associated with the defined at the spectrum, despite the lack of an inverse. eigenvalue. The algebraic and geometric multiplicities For infinite-rank operators, the spectrum becomes are all equal when the matrix is diagonalizable. more complicated. In that case, the right point spec- Since there can be multiple subspaces associated with trum (the point spectrum of A) need not be the same as a single eigenvalue, corresponding to different Jordan the left point spectrum (the point spectrum of A’s dual > blocks in the Jordan canonical form, it is structurally A ). Moreover, the spectrum may grow to include non- eigenvalues z for which the range of zI A is not dense important to distinguish the index of the eigenvalue as- − in the it transforms or for which zI A has sociated with the largest of these subspaces [30]. − dense range but the inverse of zI A is not bounded. − Definition 1. 
Eigenvalue λ’s index νλ is the size of the These two settings give rise to the so-called residual spec- largest Jordan block associated with λ. trum and continuous spectrum, respectively [32]. To mit- igate confusion, it should be noted that the point spec- If z / ΛA, then νz = 0. Note that the index of the trum can be continuous, yet never coincides with the con- ∈ operator A itself is sometimes discussed [31]. In such tinuous spectrum just described. Moreover, understand- contexts, the index of A is ν0. Hence, νλ corresponds to ing only countable point spectra is necessary to follow the index of A λI. the developments here. − The index of an eigenvalue gives information beyond Each of A’s eigenvalues λ has an associated projection what the algebraic and geometric multiplicities them- operator Aλ, which is the residue of the resolvent as z → selves yield. Nevertheless, for λ ΛA, it is always true λ [14]. Explicitly: ∈ that νλ 1 aλ gλ aλ 1. In the diagonalizable − ≤ − ≤ − −1  case, aλ = gλ and νλ = 1 for all λ ΛA. Aλ = Res (zI A) , z λ , ∈ − → The following employs basic features of complex analy- sis extensively in conjunction with . Let us where Res( , z λ) is the element-wise residue of its · → therefore review several elementary notions in complex first argument as z λ. The projection operators are → analysis. Recall that a holomorphic function is one that 5 orthonormal: convergence for most functions. For example, suppose f(z) has poles and choose a Maclaurin series; i.e., ξ = 0 AλAζ = δλ,ζ Aλ . (1) in Eq. (3). Then the series only converges when A’s spec- tral radius is less than the radius of the innermost pole and sum to the identity: of f(z). Addressing this and related issues leads directly X to alternative functional calculi. I = Aλ . (2)

λ∈ΛA
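To make the residue construction of the projection operators concrete, here is a minimal numerical sketch—not from the paper—assuming NumPy and a hypothetical 3×3 matrix with a defective eigenvalue. It approximates each Aλ by discretizing the contour integral of the resolvent around the corresponding eigenvalue and then checks Eqs. (1) and (2).

```python
import numpy as np

# Hypothetical 3x3 operator with spectrum {0.5, 1.0}; eigenvalue 0.5 has
# algebraic multiplicity 2 but only one eigenvector (index nu = 2).
A = np.array([[0.5, 1.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.0, 0.3, 1.0]])
spectrum = [0.5, 1.0]          # known by construction for this example

def projector(A, lam, radius=0.2, n=400):
    """Approximate A_lam = (1/2 pi i) times the contour integral of
    (zI - A)^{-1} around lam, via the trapezoid rule on a small circle
    that encloses only the eigenvalue lam."""
    dim = A.shape[0]
    total = np.zeros((dim, dim), dtype=complex)
    for theta in np.linspace(0.0, 2.0 * np.pi, n, endpoint=False):
        z = lam + radius * np.exp(1j * theta)
        dz = 1j * radius * np.exp(1j * theta) * (2.0 * np.pi / n)
        total += np.linalg.inv(z * np.eye(dim) - A) * dz
    return total / (2j * np.pi)

projs = {lam: projector(A, lam) for lam in spectrum}

# Eq. (2): the projection operators sum to the identity ...
assert np.allclose(sum(projs.values()), np.eye(3))
# ... and Eq. (1): they are mutually orthogonal idempotents.
for lam in spectrum:
    for zeta in spectrum:
        target = projs[lam] if lam == zeta else np.zeros((3, 3))
        assert np.allclose(projs[lam] @ projs[zeta], target)
```

Because the resolvent is analytic on and near each contour, the periodic trapezoid rule converges geometrically, so a few hundred nodes already reproduce the projection operators essentially to machine precision.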

The following discusses in detail and then derives several B. Holomorphic functional calculus new properties of projection operators. Holomorphic functions are well behaved, smooth func- tions that are complex differentiable. Given a function III. FUNCTIONAL CALCULI f( ) that is holomorphic within a disk enclosed by a coun- · terclockwise contour C, its Cauchy integral formula is In the following, we develop an extended functional given by: calculus that makes sense of arbitrary functions f( ) of · 1 I a linear operator A. Within any functional calculus, one f(a) = f(z)(z a)−1 dz , (4) considers how A’s eigenvalues map to the eigenvalues of 2πi C − f(A); which we call a spectral mapping. For example, it Taking this as inspiration, the holomorphic functional is known that holomorphic functions of bounded linear calculus performs a contour integration of the resolvent operators enjoy an especially simple spectral mapping to extend f( ) to operators: theorem [33]: · 1 I f(A) = f(z)(zI A)−1 dz , (5) Λf(A) = f(ΛA) . 2πi − CΛA To fully appreciate the meromorphic functional calculus, where CΛA is a closed counterclockwise contour that en- we first state and compare the main features and limita- compasses ΛA. Assuming that f(z) is holomorphic at tions of alternative functional calculi. z = λ for all λ ΛA, a nontrivial calculation [30] shows ∈ that Eq. (5) is equivalent to the holomorphic calculus defined by: A. Taylor series

νλ−1 (m) X X f (λ) m Inspired by the Taylor expansion of functions: f(A) = (A λI) Aλ . (6) m! − λ∈ΛA m=0 ∞ (n) X f (ξ) n f(a) = (a ξ) , After some necessary development, we will later derive n! − n=0 Eq. (6) as a special case of our meromorphic functional a calculus for functions of an operator A can be based on calculus, such that Eq. (6) is valid whenever f(z) is holo- morphic at z = λ for all λ ΛA. the series: ∈ The holomorphic functional calculus was first proposed ∞ X f (n)(ξ) in Ref. [30] and is now in wide use; e.g., see Ref. [17, p. f(A) = (A ξI)n , (3) n! 603]. It agrees with the Taylor-series approach whenever n=0 − the infinite series converges, but gives a functional calcu- where f (n)(ξ) is the nth derivative of f(z) evaluated at lus when the series approach fails. For example, using the z = ξ. principal branch of the complex logarithm, the holomor- This is often used, for example, to express the expo- phic functional calculus admits log(A) for any nonsingu- log(A) nential of A as: lar matrix, with the satisfying result that e = A. Whereas, the Taylor series approach fails to converge for ∞ X An the logarithm of most matrices even if the expansion for, eA = . n! say, log(1 z) is used. n=0 − The major shortcoming of the holomorphic functional This particular series-expansion is convergent for any A calculus is that it assumes f(z) is holomorphic at ΛA. z since e is entire, in the sense of complex analysis. Un- Clearly, if f(z) has a pole at some z ΛA, then Eq. (6) ∈ fortunately, even if it exists there is a limited domain of fails. An example of such a failure is the negative-one 6 power of a singular operator, which we take up later on. The major assumption of our meromorphic functional Several efforts have been made to extend the holomor- calculus is that the domain of operators must have a spec- phic functional calculus. For example, Refs. [34] and [35] trum that is at most countably infinite—e.g., A can be define a functional calculus that extends the standard any compact operator. A related limitation is that sin- holomorphic functional calculus to include a certain class gularities of f(z) that coincide with ΛA must be isolated of meromorphic functions that are nevertheless still re- singularities. Nevertheless, we expect that these restric- quired to be holomorphic on the point spectrum (i.e., on tions can be lifted with proper treatment, as discussed in the eigenvalues) of the operator. However, we are not fuller context later. aware of any previous work that introduces and develops the consequences of a functional calculus for functions that are meromorphic on the point spectrum—which we IV. MEROMORPHIC SPECTRAL take up in the next few sections. DECOMPOSITION
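To see Eq. (6) in action, the following sketch (assuming NumPy; the block-diagonal matrix and its projection operators are hypothetical and readable by inspection) evaluates exp(A) for a nondiagonalizable A from its eigenvalues, projection operators, and nilpotent factors, then cross-checks against the plainly convergent Taylor series.

```python
import numpy as np
from math import exp, factorial

# Hypothetical block-diagonal example: a 2x2 Jordan block with eigenvalue 0.5
# plus a simple eigenvalue 1.0, so the projection operators are obvious.
A = np.array([[0.5, 1.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
I3 = np.eye(3)
spectrum = {0.5: {"proj": np.diag([1.0, 1.0, 0.0]), "index": 2},
            1.0: {"proj": np.diag([0.0, 0.0, 1.0]), "index": 1}}

# Eq. (6): f(A) = sum_lam sum_{m < nu_lam} f^(m)(lam)/m! (A - lam I)^m A_lam.
# Here f = exp, so every derivative f^(m)(lam) is just exp(lam).
fA = np.zeros((3, 3))
for lam, data in spectrum.items():
    for m in range(data["index"]):
        fA += (exp(lam) / factorial(m)) \
              * np.linalg.matrix_power(A - lam * I3, m) @ data["proj"]

# Cross-check against a truncated Taylor series (exp is entire, so it converges).
taylor = sum(np.linalg.matrix_power(A, n) / factorial(n) for n in range(30))
print(np.allclose(fA, taylor))   # True
```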

The preceding gave an overview of the relationship be- C. Meromorphic functional calculus tween alternative functional calculi and their trade-offs, highlighting the advantages of the meromorphic func- Meromorphic functions are holomorphic except at a tional calculus. This section leverages these advantages set of isolated poles of the function. The resolvent of and employs a partial fraction expansion of the resolvent a finite-dimensional operator is meromorphic, since it is to give a general spectral decomposition of almost any holomorphic everywhere except for poles at the eigenval- function of any operator. Then, since it plays a key role ues of the operator. We will now also allow our function in applications, we apply the functional calculus to inves- f(z) to be meromorphic with possible poles that coincide tigate the negative-one power of singular operators—thus with the poles of the resolvent. deriving, what is otherwise an operator defined axiomat- Inspired again by the Cauchy integral formula of Eq. ically, the Drazin inverse from first principles. (4), but removing the restriction to holomorphic func- tions, our meromorphic functional calculus instead em- ploys a partitioned contour integration of the resolvent: A. Partial fraction expansion of the resolvent

X 1 I f(A) = f(z)R(z; A) dz , The elements of A’s resolvent are proper rational func- 2πi Cλ λ∈ΛA tions that contain all of A’s spectral information. (Recall that a proper rational function r(z) is a ratio of polyno- where Cλ is a small counterclockwise contour around the mials in z whose numerator has degree strictly less than eigenvalue λ. This and a spectral decomposition of the the degree of the denominator.) In particular, the re- resolvent (to be derived later) extends the holomorphic solvent’s poles coincide with A’s eigenvalues since, for calculus to a much wider domain, defining: z / ΛA: ∈ νλ−1 I −1 X X m 1 f(z) R(z; A) = (zI A) f(A) = Aλ A λI dz . − − 2πi (z λ)m+1 > λ∈Λ m=0 Cλ A − = C (7) det(zI A) −> = C , (8) The contour is integrated using knowledge of f(z) since Q aλ λ∈ΛA (z λ) meromorphic f(z) can introduce poles and zeros at ΛA − that interact with the resolvent’s poles. where aλ is the algebraic multiplicity of eigenvalue λ and The meromorphic functional calculus agrees with the is the matrix of cofactors of zI A. That is, ’s trans- C − C Taylor-series approach whenever the series converges and pose > is the adjugate of zI A: agrees with the holomorphic functional calculus when- C − > ever f(z) is holomorphic at ΛA. However, when both the = adj(zI A) , previous functional calculi fail, the meromorphic calculus C − extends the domain of f(A) to yield surprising, yet sen- whose elements will be polynomial functions of z of de- P sible answers. For example, we show that within it, the gree less than λ∈ΛA aλ. negative-one power of a singular operator is the Drazin Recall that the partial fraction expansion of a proper inverse—an operator that effectively inverts everything rational function r(z) with poles in Λ allows a unique that is invertible. decomposition into a sum of constant numerators divided 7

0 by monomials in z λ up to degree aλ, when aλ is the z = 1. Then, Eq. (13) implies: − order of the pole of r(z) at λ Λ [28]. Equation (8) thus ∈ X 1 I makes it clear that the resolvent has the unique partial I = R(z; A)dz 2πi Cλ fraction expansion: λ∈ΛA X aλ−1 = Aλ . X X 1 R(z; A) = Aλ,m , (9) λ∈ΛA (z λ)m+1 λ∈ΛA m=0 − This shows that the projection operators are, in fact, a where Aλ,m is the set of matrices with constant entries decomposition of the identity, as anticipated in Eq. (2). { } (not functions of z) uniquely determined elementwise by the partial fraction expansion. However, R(z; A)’s poles are not necessarily of the same order as the algebraic mul- tiplicity of the corresponding eigenvalues since the entries C. Dunford decomposition, decomposed of , and thus of >, may have zeros at A’s eigenvalues. C C This has the potential to render Aλ,m equal to the zero matrix 0. For f(A) = A, Eqs. (13) and (10) imply that: The Cauchy integral formula indicates that the con- X 1 I stant matrices Aλ,m of Eq. (9) can be obtained as the A = zR(z; A)dz { } 2πi Cλ residues: λ∈ΛA X  I I  1 I = λ 1 R(z; A)dz + 1 (z λ)R(z; A)dz A = (z λ)mR(z; A)dz , (10) 2πi 2πi − λ,m λ∈Λ Cλ Cλ 2πi C − A λ X = (λAλ,0 + Aλ,1) . (14) where the residues are calculated elementwise. The pro- λ∈ΛA jection operators Aλ associated with each eigenvalue λ were already referenced in §II, but can now be properly We denote the important set of nilpotent matrices Aλ,1 introduced as the Aλ,0 matrices: that project onto the generalized eigenspaces by relabel- ing them: Aλ = Aλ,0 (11) 1 I Nλ Aλ,1 (15) = R(z; A)dz . (12) ≡ 2πi 1 I Cλ = (z λ)R(z; A)dz . (16) 2πi Cλ − Since R(z; A)’s elements are rational functions, as we just showed, it is analytic except at a finite number of Equation (14) is the unique Dunford decomposi- P isolated singularities—at A’s eigenvalues. In light of the tion [16]: A = D + N, where D λ∈Λ λAλ is di- P ≡ A residue theorem, this motivates the Cauchy-integral-like agonalizable, N Nλ is nilpotent, and D and N ≡ λ∈ΛA formula that serves as the starting point for the mero- commute: [D,N] = 0. This is also known as the Jordan– morphic functional calculus: Chevalley decomposition. The special case where A is diagonalizable implies that X 1 I f(A) = f(z)R(z; A)dz . (13) N = 0. And so, Eq. (14) simplifies to: 2πi Cλ λ∈ΛA X A = λAλ .
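Equations (9)–(12) can be exercised directly with a computer algebra system. The sketch below—assuming SymPy and a hypothetical 3×3 operator—computes the resolvent symbolically and extracts the Aλ,m as element-wise residues, recovering the projection operators and the nilpotent matrix attached to the defective eigenvalue.

```python
import sympy as sp

z = sp.symbols('z')
# Hypothetical operator: eigenvalue 2 with index 2, eigenvalue 5 with index 1.
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 5]])
R = (z * sp.eye(3) - A).inv()      # the resolvent R(z; A)

def A_lam_m(lam, m):
    """Element-wise residue of (z - lam)**m R(z; A) at z = lam, per Eq. (10)."""
    return ((z - lam)**m * R).applyfunc(lambda entry: sp.residue(entry, z, lam))

A2_0 = A_lam_m(2, 0)    # projection operator A_2, Eq. (11)
A2_1 = A_lam_m(2, 1)    # nilpotent part N_2 = A_{2,1}, Eq. (15)
A5_0 = A_lam_m(5, 0)    # projection operator A_5

assert A2_0 + A5_0 == sp.eye(3)            # Eq. (2): resolution of the identity
assert A2_1 == A2_0 * (A - 2 * sp.eye(3))  # anticipates Eq. (17)
assert A_lam_m(2, 2) == sp.zeros(3, 3)     # vanishes once m >= nu_2 = 2
assert A_lam_m(5, 1) == sp.zeros(3, 3)     # index-one eigenvalue: no nilpotent part
```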

Let’s now consider several immediate consequences. λ∈ΛA

D. The resolvent, resolved

B. Decomposing the identity As shown in Ref. [14] and can be derived from Eqs. (12) and (16): Even the simplest applications of Eq. (13) yield insight. A A = δ A and Consider the identity as the operator function f(A) = λ ζ λ,ζ λ A0 = I that corresponds to the scalar function f(z) = AλNζ = δλ,ζ Nλ . 8

Due to these, our spectral decomposition of the Dunford and itself: decomposition implies that: ν −1 X Xλ 1 I f(z)  X  f(A) = Aλ,m m+1 dz . (23) Nλ = Aλ A ζAζ 2πi Cλ (z λ) − λ∈ΛA m=0 − ζ∈ΛA  = Aλ A λAλ In obtaining Eq. (23) we finally derived Eq. (7), as −  promised earlier in § III C. Effectively, by modulating the = Aλ A λI . (17) − modes associated with the resolvent’s singularities, the scalar function f( ) is mapped to the operator domain, Moreover: · where its action is expressed in each of A’s independent m Aλ,m = Aλ A λI . (18) subspaces. − m It turns out that for m > 0: Aλ,m = Nλ . (See also Ref. [14, p. 483].) This leads to a generalization of the F. Evaluating the residues projection operator orthonormality relations of Eq. (1). Most generally, the operators of Aλ,m are mutually re- Interpretation aside, how does one use this result? { } lated by: Equation (23) says that the spectral decomposition of f(A) reduces to the evaluation of several residues, where: Aλ,mAζ,n = δλ,ζ Aλ,m+n . (19) 1 I Resg(z), z λ = g(z) dz . Finally, if we recall that the index νλ is the dimension of → 2πi Cλ the largest associated subspace, we find that the index m So, to make progress with Eq. (23), we must evaluate of λ characterizes the nilpotency of Nλ: Nλ = 0 for m νλ. That is: function-dependent residues of the form: ≥ m+1  Aλ,m = 0 for m νλ . (20) Res f(z)/(z λ) , z λ . ≥ − →
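The relations in Eqs. (17)–(20), together with the Dunford decomposition of Eq. (14), are easy to verify numerically. A minimal sketch, assuming NumPy and the same hypothetical block-diagonal operator used in the exp(A) example above:

```python
import numpy as np

# Jordan block (eigenvalue 0.5, index 2) plus a simple eigenvalue 1.0;
# the projection operators are known by construction.
A  = np.array([[0.5, 1.0, 0.0],
               [0.0, 0.5, 0.0],
               [0.0, 0.0, 1.0]])
I3 = np.eye(3)
Ah = np.diag([1.0, 1.0, 0.0])    # A_{0.5}
Ao = np.diag([0.0, 0.0, 1.0])    # A_{1.0}
Nh = Ah @ (A - 0.5 * I3)         # N_{0.5} = A_{0.5} (A - 0.5 I), Eq. (17)

assert np.allclose(Nh @ Nh, np.zeros((3, 3)))   # Eq. (20): nilpotent, since nu = 2
assert np.allclose(Ah @ Nh, Nh)                 # Eq. (19): A_{lam,0} A_{lam,1} = A_{lam,1}
assert np.allclose(Ao @ Nh, np.zeros((3, 3)))   # Eq. (19): cross terms vanish

# Dunford decomposition, Eq. (14): A = D + N with D diagonalizable, N nilpotent.
D = 0.5 * Ah + 1.0 * Ao
N = Nh
assert np.allclose(A, D + N) and np.allclose(D @ N, N @ D)
```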

Returning to Eq. (9), we see that all Aλ,m with m νλ If f(z) were holomorphic at each λ, then the order of ≥ are zero-matrices and so do not contribute to the sum. the pole would simply be the power of the denomina- Thus, we can rewrite Eq. (9) as: tor. We could then use Cauchy’s differential formula for holomorphic functions: ν −1 X Xλ 1 n! I f(z) R(z; A) = m+1 Aλ,m (21) (n) (z λ) f (a) = n+1 dz , (24) λ∈ΛA m=0 − 2πi C (z a) a − or: for f(z) holomorphic at a. And, the meromorphic cal- culus would reduce to the holomorphic calculus. Often, νλ−1 X X 1 m f(z) will be holomorphic at least at some of A’s eigenval- R(z; A) = Aλ A λI , (22) (z λ)m+1 − ues. And so, Eq. (24) is still locally a useful simplification λ∈ΛA m=0 − in those special cases. for z / ΛA. In general, though, f(z) introduces poles and zeros at ∈ The following sections sometimes use Aλ,m in place of λ ΛA that change their orders. This is exactly the im- m ∈ Aλ A λI . This is helpful both for conciseness and petus for the generalized functional calculus. The residue − when applying Eq. (19). Nonetheless, the equality in of a complex-valued function g(z) around its isolated pole Eq. (18) is a useful one to keep in mind. λ of order n + 1 can be calculated from: n  1 d  n+1  Res g(z), z λ = lim n (z λ) g(z) . → n! z→λ dz −
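The residue formula above is exactly where pole–pole and pole–zero interactions enter. A short symbolic check, assuming SymPy, of the residues that drive the decomposition of A^L developed next:

```python
import sympy as sp

z, lam = sp.symbols('z lam')
L, m = 5, 2

# Away from z = 0, f(z) = z**L is holomorphic, so the residue reduces to
# Cauchy's differential formula, Eq. (24): binom(L, m) * lam**(L - m).
res = sp.residue(z**L / (z - lam)**(m + 1), z, lam)
assert sp.simplify(res - sp.binomial(L, m) * lam**(L - m)) == 0   # 10*lam**3

# At lam = 0 the zero of z**L interacts with the resolvent's pole: the residue
# survives only when the orders match exactly (the delta_{L,m} seen below).
print([sp.residue(z**L / z**(k + 1), z, 0) for k in range(8)])
# -> [0, 0, 0, 0, 0, 1, 0, 0]   (nonzero only at k = L)

# For a negative power, f(z) = 1/z, the pole orders add and the residue at the
# zero eigenvalue vanishes -- why 0 drops out of the negative-one power
# (the Drazin inverse) discussed later.
assert sp.residue(1 / (z * z**(m + 1)), z, 0) == 0
```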

E. Meromorphic functional calculus G. Decomposing AL In light of Eq. (13), Eq. (21) together with Eq. (18) allow us to express any function of an operator simply Equation (23) says that we can explicitly derive the and solely in terms of its spectrum (i.e., its eigenvalues spectral decomposition of powers of the operator A. Of for the finite dimensional case), its projection operators, course, we already did this for the special cases of A0 and 9

A1. The goal, though, is to do this in general. example, at this special value of λ and for integer L > 0, For f(A) = AL f(z) = zL, z = 0 can be either a λ = 0 induces poles that cancel with the zeros of f(z) = → zero or a pole of f(z), depending on the value of L. In zL, since zL has a zero at z = 0 of order L. For integer either case, an eigenvalue of λ = 0 will distinguish itself L < 0, an eigenvalue of λ = 0 increases the order of the in the residue calculation of AL via its unique ability to z = 0 pole of f(z) = zL. For all other eigenvalues, the change the order of the pole (or zero) at z = 0. For residues will be as expected. Hence, from Eq. (23) and inserting f(z) = zL, for any L C: ∈

$$
A^L \;=\; \sum_{\substack{\lambda \in \Lambda_A \\ \lambda \neq 0}} \sum_{m=0}^{\nu_\lambda - 1} A_\lambda (A - \lambda I)^m \, \overbrace{\frac{1}{2\pi i} \oint_{C_\lambda} \frac{z^L}{(z-\lambda)^{m+1}} \, dz}^{=\; \frac{1}{m!} \lim_{z \to \lambda} \frac{d^m}{dz^m} z^L \;=\; \frac{\lambda^{L-m}}{m!} \prod_{n=1}^{m} (L-n+1)} \;+\; [0 \in \Lambda_A] \sum_{m=0}^{\nu_0 - 1} A_0 A^m \, \underbrace{\frac{1}{2\pi i} \oint_{C_0} z^{L-m-1} \, dz}_{=\; \delta_{L,m}}
$$

$$
\phantom{A^L} \;=\; \sum_{\substack{\lambda \in \Lambda_A \\ \lambda \neq 0}} \sum_{m=0}^{\nu_\lambda - 1} \binom{L}{m} \lambda^{L-m} A_\lambda (A - \lambda I)^m \;+\; [0 \in \Lambda_A] \sum_{m=0}^{\nu_0 - 1} \delta_{L,m} A_0 A^m \,, \tag{25}
$$

where $\binom{L}{m}$ is the generalized binomial coefficient:

$$
\binom{L}{m} = \frac{1}{m!} \prod_{n=1}^{m} (L - n + 1) \,, \quad \text{with} \quad \binom{L}{0} = 1 \,, \tag{26}
$$

and $[0 \in \Lambda_A]$ is the Iverson bracket, which takes on value 1 if zero is an eigenvalue of A and 0 if not. $A_{\lambda,m}$ was replaced by $A_\lambda (A - \lambda I)^m$ to suggest the more explicit calculations involved with evaluating any $A^L$. Equation (25) applies to any linear operator with only isolated singularities in its resolvent.

The eigen-decomposition of $A^L$ implied by Eq. (25) makes the contribution of the zero eigenvalue more explicit than previous treatments and enables closed-form expressions, e.g., for correlation functions, where the zero eigenvalue makes a qualitatively distinct contribution. Consequentially, this formulation can lead to the recognition of coexistent finite and infinite range physical phenomena of different mechanistic origin.

If L is a nonnegative integer such that $L \geq \nu_\lambda - 1$ for all $\lambda \in \Lambda_A$, then:

$$
A^L = \sum_{\substack{\lambda \in \Lambda_A \\ \lambda \neq 0}} \sum_{m=0}^{\nu_\lambda - 1} \binom{L}{m} \lambda^{L-m} A_{\lambda,m} \,, \tag{27}
$$

where $\binom{L}{m}$ is now reduced to the traditional binomial coefficient $L!/(m!(L-m)!)$.

H. Drazin inverse

If L is any negative integer, then $\binom{-|L|}{m}$ can be written as a traditional binomial coefficient $(-1)^m \binom{|L|+m-1}{m}$, yielding:

$$
A^{-|L|} = \sum_{\substack{\lambda \in \Lambda_A \\ \lambda \neq 0}} \sum_{m=0}^{\nu_\lambda - 1} (-1)^m \binom{|L|+m-1}{m} \lambda^{-|L|-m} A_{\lambda,m} \,, \tag{28}
$$

for $-|L| \in \{-1, -2, -3, \ldots\}$.

Thus, negative powers of an operator can be consistently defined even for noninvertible operators. In light of Eqs. (25) and (28), it appears that the zero eigenvalue does not even contribute to the function. It is well known, in contrast, that it wreaks havoc on the naive, oft-quoted definition of a matrix's negative power:

$$
A^{-1} \overset{?}{=} \frac{\operatorname{adj}(A)}{\det(A)} = \frac{\operatorname{adj}(A)}{\prod_{\lambda \in \Lambda_A} \lambda^{a_\lambda}} \,,
$$

since this would imply dividing by zero. If we can accept large positive powers of singular matrices—for which the zero eigenvalue does not contribute—it seems fair to also accept negative powers that likewise involve no contribution from the zero eigenvalue.

Editorializing aside, we note that extending the definition of $A^{-1}$ to the domain including singular operators via Eqs. (25) and (28) implies that:

A|L|A−|`| = A−|`|A|L| |L|−|`| = A for L ` + ν0 , | | ≥ | | 10 which is a very sensible and desirable condition. More- Drazin inverse, it imposes more work than is actually −1 over, we find that AA = I A0. necessary. Using the meromorphic functional calculus, − Specifically, the negative-one power of any square ma- we can derive a new, simple construction of the Drazin trix is in general not the same as the matrix inverse since inverse that requires only the original operator and the inv(A) need not exist. However, it is consistently defined eigenvalue-0 projector. via Eq. (28) to be: First, assume that λ is an isolated singularity of R(z; A) with finite separation at least  distance from νλ−1 −1 X X m −1−m the nearest neighboring singularity. And, consider the A = ( 1) λ Aλ,m . (29)  − operator-valued function fλ defined via the RHS of: λ∈ΛA\{0} m=0 A = f (A) Drazin inverse D not λ λ This is the A of A. Note that it is I the same as the Moore–Penrose pseudo-inverse [36, 37]. 1 −1 = 2πi (ζI A) dζ , Although the Drazin inverse is usually defined ax- λ+eiφ − iomatically to satisfy certain criteria [38], it is naturally with λ+eiφ defining an -radius circular contour around derived as the negative one power of a singular oper- λ. Then we see that: ator in the meromorphic functional calculus. We can I check that it indeed satisfies the axiomatic criteria for  1 −1 fλ(z) = 2πi (ζ z) dζ the Drazin inverse, enumerated according to historical λ+eiφ − precedent:   = z C : z λ <  , (30) ∈ | − | (1ν0 ) Aν0 ADA = Aν0 where [z C : z λ < ] is the Iverson bracket that D D D ∈ | − | (2) A AA = A takes on value 1 if z is within -distance of λ and 0 if not. D Second, we use this to find that, for any c C 0 : (5) [A, A ] = 0 , ∈ \{ } νλ−1 I  −1 which gives rise to the Drazin inverse’s moniker as the X X z + cf0 (z) (A + cA )−1 = A 1 dz ν0 0 λ,m 2πi m+1 1 , 2, 5 -inverse [38]. The analytical form of Eq. (29) Cλ (z λ) { } λ∈ΛA m=0 − has been teased out previously by other means; see, e.g., ν0−1 I −1 Ref. [38] and for other settings see Refs. [39, 40]. Nev- D X m 1 (z + c) = A + A0A 2πi zm+1 ertheless, due to its utility in application, it is notewor- m=0 C0 thy and appealing that the Drazin inverse falls out or- ν0−1 D X m m m+1 ganically in the meromorphic functional calculus, as the = A + A0A ( 1) /c , (31) − negative-one power, in contrast to its otherwise rather m=0 esoteric axiomatic origin. where we asserted that the contour C exists within the While A−1 always exists, the resolvent is nonanalytic 0 finite -ball about the origin. at z = 0 for a singular matrix. Effectively, the mero- Third, we note that A + cA0 is invertible for all c = 0; morphic functional calculus removes the nonanalyticity 6 this can be proven by multiplying each side of Eq. (31) of the resolvent in evaluating A−1. As a result, as we by A + cA . Hence, (A + cA )−1 = inv(A + cA ) for all can see from Eq. (29), the Drazin inverse inverts what is 0 0 0 c = 0. invertible; the remainder is zeroed out. 6 −1 Finally, multiplying each side of Eq. (31) by I A0, Of course, whenever A is invertible, A is equal to − and recalling that A A = A , we find a useful inv(A). However, we should not confuse this coincidence 0,0 0,m 0,m expression for calculating the Drazin inverse of any linear with equivalence. Moreover, despite historic notation operator A, given only A and A . Specifically: there is no reason that the negative-one power should 0 in general be equivalent to the inverse. Especially, if an D −1 A = (I A0)(A + cA0) . 
(32) operator is not invertible! To avoid confusing A−1 with − inv(A), we use the notation AD for the Drazin inverse of which is valid for any c C 0 . Eq. (32) generalizes D ∈ \{ } A. Still, A = inv(A), whenever 0 / ΛA. the result found specifically for c = 1 in Ref. [41]. ∈ − Amusingly, this extension of previous calculi lets us For the special case of c = 1, it is worthwhile to − resolve an elementary but fundamental question: What also consider the alternative construction of the Drazin is 0−1? It is certainly not infinity. Indeed, it is just as close to negative infinity! Rather: 0−1 = 0 = inv(0). 6 Although Eq. (29) is a constructive way to build the 11 inverse implied by Eq. (31): within A’s spectrum at Ξf ΛA—then we expect to lose ⊂ both homomorphism and spectral mapping properties: ν0−1 D −1  X m A = (A A0) + A0 A . (33) • Loss of homomorphism: f1(A)f2(A) = (f1 f2)(A); − m=0 6 · • Loss of naive spectral mapping: f(ΛA Ξf ) \ ⊂ By a spectral mapping (λ 1 λ, for λ ΛT ), the Λf(A). → − ∈ Perron–Frobenius theorem and Eq. (31) yield an im- A simple example of both losses arises with the Drazin portant consequence for any stochastic matrix T . The −1 inverse, above. There, f1(z) = z . Taking this and Perron–Frobenius theorem guarantees that T ’s eigenval- f2(z) = z combined with singular operator A leads to ues along the unit circle are associated with a diagonal- the loss of homomorphism: ADA = I. As for the second 6 izable subspace. In particular, ν1 = 1. Spectral mapping property, the spectral mapping can be altered for the of this result means that T ’s eigenvalue 1 maps to the candidate spectra at Ξf via pole–pole or pole–zero inter- eigenvalue 0 of I T and T1 = (I T )0. Moreover: −1 − − actions in the complex contour integral. For f(A) = A , −1 D how does A’s eigenvalue of 0 get mapped into the new [(I T ) + T1] = (I T ) + T1 , − − spectrum of AD? A naive application of the spectral mapping theorem might seem to yield an undefined quan- since ν0 = 1. This corollary of Eq. (31) (with c = 1) tity. But, using the meromorphic functional calculus self- corresponds to a number of important and well known −1 results in the theory of Markov processes. Indeed, Z consistently maps the eigenvalue as 0 = 0. It remains −1 ≡ to be explored whether the full spectral mapping is pre- (I T + T1) is called the fundamental matrix in that − setting [42]. served for any function f(A) under the meromorphic in- terpretation of f(λ). It should now be apparent that extending functions I. Consequences and generalizations via the meromorphic functional calculus allows one to express novel mathematical properties, some likely capa- For an infinite-rank operator A with a continuous spec- ble of describing new physical phenomena. At the same trum, the meromorphic functional calculus has the nat- time, extra care is necessary. The situation is reminiscent ural generalization: of the loss of commutativity in non-Abelian operator al- gebra: not all of the old rules apply, but the gain in 1 I f(A) = f(z)(zI A)−1 dz , (34) nuance allows for mathematical description of important 2πi − CΛA phenomena. We chose to focus primarily on the finite-rank case where the contour CΛA encloses the (possibly continu- here since it is sufficient to demonstrate the utility of ous) spectrum of A without including any unbounded the general projection-operator formalism. Indeed, there contributions from f(z) outside of CΛA . 
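As a quick numerical exercise of the construction in Eq. (32)—and of the axiomatic {1^{ν0}, 2, 5} criteria listed above—the following sketch assumes NumPy and a hypothetical singular, nondiagonalizable matrix whose eigenvalue-0 projector A0 is known by construction.

```python
import numpy as np

# Singular, defective example: a 2x2 Jordan block at eigenvalue 0 (so nu_0 = 2)
# plus a simple eigenvalue 3.  A0 projects onto the generalized 0-eigenspace.
A   = np.array([[0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0],
                [0.0, 0.0, 3.0]])
A0  = np.diag([1.0, 1.0, 0.0])
I3  = np.eye(3)
nu0 = 2

def drazin(A, A0, c=1.0):
    """Eq. (32): A^D = (I - A0)(A + c A0)^{-1}, for any nonzero c."""
    n = A.shape[0]
    return (np.eye(n) - A0) @ np.linalg.inv(A + c * A0)

AD = drazin(A, A0)
assert np.allclose(AD, drazin(A, A0, c=-2.5))      # the result is c-independent

# The defining {1^{nu0}, 2, 5} properties of the Drazin inverse:
Anu = np.linalg.matrix_power(A, nu0)
assert np.allclose(Anu @ AD @ A, Anu)              # (1_{nu_0})
assert np.allclose(AD @ A @ AD, AD)                # (2)
assert np.allclose(A @ AD, AD @ A)                 # (5)

# And A A^D = I - A0: the Drazin inverse inverts what is invertible;
# the rest is zeroed out.
assert np.allclose(A @ AD, I3 - A0)
```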
The function are ample nontrivial applications in the finite-rank set- f(z) is expected to be meromorphic within CΛA . This ting that deserve attention. To appreciate these, we now again deviates from the holomorphic approach, since the turn to address the construction and properties of general holomorphic functional calculus requires that f(z) is an- eigenprojectors. alytic in a neighborhood around the spectrum; see § VII of Ref. [43]. Moreover, Eq. (34) allows an extension of the functional calculus of Refs. [34, 35, 44], since the function V. CONSTRUCTING DECOMPOSITIONS can be meromorphic at the point spectrum in addition being meromorphic on the residual and continuous spec- At this point, we see that projection operators are fun- tra. damental to functions of an operator. This prompts the In either the finite- or infinite-rank case, whenever f(z) practical question of how to actually calculate them. The is analytic in a neighborhood around the spectrum, the next several sections address this by deriving expressions meromorphic functional calculus agrees with the holo- with both theoretical and applied use. We first develop morphic. Whenever f(z) is not analytic in a neighbor- the projection operators associated with index-one eigen- hood around the spectrum, the function is undefined in values. We then explicate the relationship between eigen- the holomorphic approach. In contrast, the meromorphic vectors, generalized eigenvectors, and projection opera- approach extends the function to the operator-valued do- tors for normal, diagonalizable, and general matrices. Fi- main and does so with novel consequences. nally, we show how the general results specialize in sev- In particular, when f(z) is not analytic in a neigh- eral common cases of interest. After these, we turn to borhood around the spectrum—say f(z) is nonanalytic examples and applications. 12

A. Projection operators of index-one eigenvalues directly via residues, as in Eq. (12). An alternative procedure—one that extends a method To obtain the projection operators associated with familiar at least in quantum mechanics—is to obtain the each index-one eigenvalue λ ζ ΛA : νζ = 1 , we projection operators via eigenvectors. However, quan- ∈ { ∈ } apply the functional calculus to an appropriately chosen tum mechanics always concerns itself with a subset of function of A, finding: diagonalizable operators. What is the necessary gener- alization? For one, left and right eigenvectors are no νξ−1 Q νζ I ζ∈ΛA (z ζ) longer simply conjugate transposes of each other. More Y X X Aξ,m ζ6=λ (A ζI)νζ = − dz − 2πi (z ξ)m+1 severely, a full set of spanning eigenvectors is no longer ζ∈ΛA ξ∈ΛA m=0 Cξ ζ6=λ − guaranteed and we must resort to generalized eigenvec-

Q νζ I ζ∈ΛA (z ζ) tors. Since the relationships among eigenvectors, gener- 1 ζ6=λ − = Aλ dz alized eigenvectors, and projection operators are critical 2πi z λ Cλ − to the practical calculation of many physical observables Y νζ = Aλ (λ ζ) . of complex systems, we collect these results in the next − ζ∈ΛA section. ζ6=λ

Therefore, if νλ = 1: B. Eigenvectors, generalized eigenvectors, and  νζ Y A ζI projection operators Aλ = − . (35) λ ζ ζ∈ΛA − ζ6=λ Two common questions regarding projection operators are: Why not just use eigenvectors? And, why not use As convenience dictates in our computations, we let νζ → the Jordan canonical form? First, the eigenvectors of aζ gζ +1 or even νζ aζ in Eq. (35), since multiplying − → a do not form a complete basis with Aλ by (A ζI)/(λ ζ) has no effect for ζ ΛA λ if − − ∈ \{ } which to expand an arbitrary vector. One needs general- νλ = 1. Equation (35) generalizes a well known result that ap- ized eigenvectors for this. Second, some functions of an plies when the index of all eigenvalues is one. That is, operator require removing, or otherwise altering, the con- when the operator is diagonalizable, we have: tribution from select eigenspaces. This is most adroitly handled with the projection operator formalism where Y A ζI different eigenspaces (correlates of Jordan blocks) can ef- Aλ = − . λ ζ fectively be treated separately. Moreover, even for simple ζ∈ΛA − ζ6=λ cases where eigenvectors suffice, the projection operator formalism simply can be more calculationally or mathe- To the best of our knowledge, Eq. (35) is original. matically convenient. Since eigenvalues can have index larger than one, not That said, it is useful to understand the relationship all projection operators of a nondiagonalizable operator between projection operators and generalized eigenvec- can be found directly from Eq. (35). Even so, it serves tors. For example, it is often useful to create projection three useful purposes. First, it gives a practical reduction operators from generalized eigenvectors. This section of the eigen-analysis by finding all projection operators of clarifies their connection using the language of matrices. index-one eigenvalues. Second, if there is only one eigen- In the most general case, we show that the projection value that has index larger than one—what we call the operator formalism is usefully concise. almost diagonalizable case—then Eq. (35), together with the fact that the projection operators must sum to the identity, does give a full solution to the set of projection 1. Normal matrices operators. Third, Eq. (35) is a powerful theoretical tool that we can use directly to spectrally decompose func- tions, for example, of a stochastic matrix whose eigen- Unitary, Hermitian, skew-Hermitian, orthogonal, sym- values on the unit circle are guaranteed to be index-one metric, and skew-symmetric matrices are all special cases by the Perron–Frobenius theorem. of normal matrices. As noted, normal matrices are those Although index-one expressions have some utility, we that commute with their Hermitian adjoint (complex- † † need a more general procedure to obtain all projection conjugate transpose): AA = A A. Moreover, a matrix is operators of any linear operator. Recall that, with full normal if and only if it can be diagonalized by a unitary † generality, projection operators can also be calculated transformation: A = UΛU , where the columns of the unitary matrix U are the orthonormal right eigenvectors 13 of A corresponding to the eigenvalues ordered along the Aλ λ∈Λ with the property that: { } A Λ. For an M-by-M matrix A, the eigen- M M values in ΛA are ordered and enumerated according to X X † † Aζ Aλ = δζ,λi δλ,λj ~ui~ui ~uj~uj the possibly degenerate M-tuple (ΛA) = (λ1, . . . , λM ). 
i=1 j=1 Since an eigenvalue λ ΛA has algebraic multiplicity ∈ M M aλ 1, λ appears aλ times in the ordered tuple. X X † ≥ = δζ,λi δλ,λj ~uiδi,j~uj Assuming A is normal, each projection operator Aλ i=1 j=1 can be constructed as the sum of all ket–bra pairs of right- M eigenvectors corresponding to λ composed with their con- X † = δζ,λi δλ,λi ~ui~ui jugate transpose. We later introduce bras and kets more i=1 generally via generalized eigenvectors of the operator A = δζ,λAλ . and its dual A>. However, since the complex-conjugate transposition rule between dual spaces is only applicable Moreover: to a ket basis derived from a normal operator, we put off M using the bra-ket notation for now so as not to confuse X X † the more familiar “normal” case with the general case. Aλ = ~uj~uj To explicitly demonstrate this relationship between λ∈ΛA j=1 † projection operators, eigenvectors, and their Hermitian = UU adjoints in the case of normality, observe that: = I,

A = UΛU † and so on. All of the expected properties of projection    †  operators can be established again in this restricted set- λ1 0 0 ~u1 ··· † ting. 0 λ 0      2   ~u2  −1 † = ~u1 ~u2 ~uM  . . ··· .   .  The rows of U = U are A’s left eigenvectors. In this ···  . . .. .   .   . . . .   .  case, they are simply the conjugate transpose of the right † 0 0 λM ~u eigenvectors. Note that conjugate transposition is the fa- ··· M  †  miliar transformation rule between ket and bra spaces in ~u 1 quantum mechanics (see, e.g., Ref. [45])—a consequence  ~u†     2  of the restriction to normal operators, as we will show. = λ1~u1 λ2~u2 λM ~uM  .  ···  .   .  Importantly, a more general formulation of quantum me- † ~uM chanics would not have this same restricted correspon- M dence between the dual ket and bra spaces. X † To elaborate on this point, recall that vector spaces = λj~uj~uj j=1 admit dual spaces and dual bases. However, there is no X sense of a dual correspondence of a single ket or bra with- = λAλ . out reference to a full basis [15]. Implicitly in quantum λ∈ΛA mechanics, the basis is taken to be the basis of eigenstates Evidently, for normal matrices A: of any Hermitian operator, nominally since observables are self-adjoint. M † X † To allude to an alternative, we note that ~uj~uj is not Aλ = δλ,λj ~uj~uj . only the Hermitian form of inner product ~uj, ~uj (where j=1 h i , denotes the inner product) of the right eigenvec- h· ·i † tor ~uj with itself, but importantly also the simple dot- And, since ~ui ~uj = δi,j, we have an orthogonal set † product of the left eigenvector ~uj and the right eigen- † vector ~uj, where ~uj acts as a linear functional on ~uj. Contrary to the substantial effort devoted to the inner- product-centric theory of Hilbert spaces, this latter in- † terpretation of ~uj~uj—in terms of linear functionals and a left-eigenvector basis for linear functionals—is what gen- eralizes to a consistent and constructive framework for the spectral theory beyond normal operators, as we will see shortly. 14
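In contrast to the normal case just described—and as a preview of the general diagonalizable case taken up next—the sketch below (assuming NumPy; the 2×2 stochastic matrix is a hypothetical example) builds projection operators from ket–bra pairs of right eigenvectors with the rows of P⁻¹, which are not Hermitian conjugates, and confirms that the index-one formula of Eq. (35) reproduces them.

```python
import numpy as np

# A nonnormal but diagonalizable operator: a row-stochastic transition matrix.
T = np.array([[0.9, 0.1],
              [0.4, 0.6]])

vals, P = np.linalg.eig(T)   # columns of P: right eigenvectors
Pinv = np.linalg.inv(P)      # rows of P^{-1}: the dual left eigenvectors

# Projection operators as ket-bra pairs built from right and left eigenvectors.
projs = [np.outer(P[:, j], Pinv[j, :]) for j in range(2)]

assert np.allclose(sum(projs), np.eye(2))                       # completeness
assert np.allclose(projs[0] @ projs[1], np.zeros((2, 2)))       # orthogonality
assert np.allclose(T, vals[0] * projs[0] + vals[1] * projs[1])  # T = sum lam A_lam

# Every eigenvalue here has index one, so Eq. (35) gives the same operators.
for j, lam in enumerate(vals):
    other = vals[1 - j]
    assert np.allclose(projs[j], (T - other * np.eye(2)) / (lam - other))
```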

2. Diagonalizable matrices Then: X By definition, diagonalizable matrices can be diagonal- A = λAλ ized, but not necessarily via a unitary transformation. λ∈ΛA M All diagonalizable matrices can nevertheless be diagonal- X −1 = λj λj λj ized via the transformation: A = P ΛP , where the | i h | j=1 columns of the square matrix P are the not-necessarily-   λ1 orthogonal right eigenvectors of A corresponding to the h |  λ2  eigenvalues ordered along the diagonal matrix Λ and    h |  −1 = λ1 λ1 λ2 λ2 λM λM  .  where the rows of P are A’s left eigenvectors. Impor- | i | i · · · | i  .  tantly, the left eigenvectors need not be the Hermitian λM h | adjoint of the right eigenvectors. As a particular exam-     λ1 0 0 λ1 ple, this more general setting is required for almost any ··· h |  0 λ2 0   λ2  transition dynamic of a Markov chain. In other words,    ···   h |  = λ1 λ2 λM  . . . .   .  the transition dynamic of any interesting complex net- | i | i · · · | i  . . .. .   .  work with irreversible processes serves as an example of 0 0 λM λM ··· h | a nonnormal operator. = P ΛP −1 . Given the M-tuple of possibly-degenerate eigen- values (ΛA) = (λ1, λ2, . . . , λM ), there is a cor- So, we see that the projection operators introduced ear- responding M-tuple of linearly-independent right- lier in a coordinate-free manner have a concrete repre- eigenvectors ( λ1 , λ2 ,..., λM ) and a correspond- | i | i | i sentation in terms of left and right eigenvectors when ing M-tuple of linearly-independent left-eigenvectors the operator is diagonalizable. ( λ1 , λ2 ,..., λM ) such that: h | h | h |

A λj = λj λj | i | i and:

λj A = λj λj h | h | 3. Any matrix with the orthonormality condition that: Not all matrices can be diagonalized, but all square λi λj = δi,j . matrices can be put into Jordan canonical form via the h | i transformation: A = YJY −1 [17]. Here, the columns To avoid misinterpretation, we stress that the bras and of the square matrix Y are the linearly independent kets that appear above are the left and right eigenvectors, right eigenvectors and generalized right eigenvectors cor- respectively, and typically do not correspond to complex- responding to the Jordan blocks ordered along the diag- conjugate transposition. onal of the block-diagonal matrix J. And, the rows of With these definitions in place, the projection opera- Y −1 are the corresponding left eigenvectors and gener- tors for a can be written: alized left eigenvectors, but reverse-ordered within each block, as we will show. M X Let there be n Jordan blocks forming the n-tuple Aλ = δλ,λj λj λj . th | i h | (J1,J2,...,Jn), with 1 n M. The k Jordan j=1 ≤ ≤ block Jk has dimension mk-by-mk:

λk 1 0 0 0 0  ···   0 λk 1 0 0     0 λk 0     . . . .  Jk =  ......  mk rows    0 λk 1 0     0 0 0 λk 1   0 0 0 0 λk | ···{z } mk columns 15 such that: Most directly, the generalized right and left eigenvec-

n tors can be found as the nontrivial solutions to: X mk = M. m (m) (A λkI) λ = ~0 k=1 − | k i

Note that eigenvalue λ ΛA corresponds to gλ differ- and: ∈ ent Jordan blocks, where gλ is the geometric multiplicity (m) m λ (A λkI) = ~0 , of the eigenvalue λ. Indeed: h k | − X n = gλ . respectively.

λ∈ΛA It should be clear from Eq. (36) and Eq. (37) that:

(m) (n) (m−`) (n) Moreover, the index νλ of the eigenvalue λ is defined as ` λk (A λkI) λk = λk λk the size of the largest Jordan block corresponding to λ. h | − | i h | i = λ(m) λ(n−`) , So, we write this in the current notation as: h k | k i

n for m, n, 0, 1, . . . , mk and ` 0. At the same time, νλ = max δλ,λk mk k=1 . ∈ { } ≥ { } it is then easy to show that:

If the index of any eigenvalue is greater than one, (m) (n) (m+n) (0) λ λ = λ λ = 0, if m + n mk , then the conventional eigenvectors do not span the M- h k | k i h k | k i ≤ dimensional vector space. However, the set of M gen- where m, n 0, 1, . . . , mk . Imposing appropriate nor- eralized eigenvectors does form a basis for the vector ∈ { } malization, we find that: space [46]. Given the n-tuple of possibly-degenerate eigenvalues (m) (n) λj λk = δj,kδm+n,mk+1 . (38) (ΛA) = (λ1, λ2, . . . , λn), there is a corresponding n- h | i tuple of mk-tuples of linearly-independent generalized Hence, we see that the left eigenvectors and generalized right-eigenvectors: eigenvectors are a dual basis to the right eigenvectors and   generalized eigenvectors. Interestingly though, within ( λ(m) )m1 , ( λ(m) )m2 ,..., ( λ(m) )mn , | 1 i m=1 | 2 i m=1 | n i m=1 each Jordan subspace, the most generalized left eigenvec- tors are dual to the least generalized right eigenvectors, where: and vice versa. m   (To be clear, in this terminology “least generalized” λ(m)  k λ(1) , λ(2) ,..., λ(mk) | k i m=1 ≡ | k i | k i | k i eigenvectors are the standard eigenvectors. For exam- (1) ple, the λk satisfying the standard eigenvector re- and a corresponding n-tuple of mk-tuples of linearly- (1)h | (1) lation λ A = λk λ is the least generalized left independent generalized left-eigenvectors: h k | h k | eigenvector of subspace k. By way of comparison, the   (m) m1 (m) m2 (m) mn “most generalized”’ right eigenvector of subspace k is ( λ1 )m=1, ( λ2 )m=1,..., ( λn )m=1 , (mk) h | h | h | λk satisfying the most generalized eigenvector rela- | i (mk) (mk−1) tion (A λkI) λ = λ for subspace k. The where: − | k i | k i orthonormality relation shows that the two are dual cor- (1) (mk) (m) mk  (1) (2) (m )   k respondents: λk λk = 1, while all other eigen-bra– λk m=1 λk , λk ,..., λk h | i h | ≡ h | h | h | eigen-ket closures utilizing these objects are null.) such that: With these details worked out, we find that the pro- jection operators for a nondiagonalizable matrix can be (m+1) (m) (A λkI) λ = λ (36) written as: − | k i | k i n mk and: X X (m) (mk+1−m) Aλ = δλ,λk λk λk . (39) m=1 | i h | (m+1) (m) k=1 λk (A λkI) = λk , (37) h | − h | And, we see that a projection operator includes all of its (0) ~ (0) ~ left and right eigenvectors and all of its left and right for 0 m mk 1, where λj = 0 and λj = 0. ≤ ≤ (1) − (1) | i h | generalized eigenvectors. This implies that the identity Specifically, λ and λ are conventional right and | k i h k | operator must also have a decomposition in terms of both left eigenvectors, respectively. 16 eigenvectors and generalized eigenvectors: we are forced by Eq. (38) to recognize that:

X  (m1+1−m) m1  I = Aλ λ1 m=1 h (m +1−m)| m2 λ∈ΛA  λ 2   −1  2 m=1  n mk Y =  h . |  X X (m) (mk+1−m)  .  = λk λk .  .  m=1 | i h |  (mn+1−m) mn k=1 λn h | m=1  (m) mk Let λ denote the column vector: −1 | k i m=1 since then Y Y = I, and we recall that the inverse is guaranteed to be unique.  (1)  λ The above demonstrates an explicit construction for | k i  (m) mk  .  λ =  .  , the Jordan canonical form. One advantage we learn from | k i m=1  .  (m ) this explicit decomposition is that the complete set of left λ k | k i eigenvectors and left generalized eigenvectors (encapsu- −1 m lated in Y ) can be obtained from the inverse of the and let  λ(mk+1−m)  k denote the column vector: h k | m=1 matrix of the complete set of right eigenvectors and gen-   eralized right eigenvectors (encoded in Y ) and vice versa. λ(mk) h k | One unexpected lesson, though, is that the generalized  (mk+1−m) mk  .  λ =  .  . left eigenvectors appear in reverse order within each Jor- h k | m=1  .  (1) dan block. λk h | Using Eqs. (39) and (18) with Eq. (37), we see that Then, using the above results, and the fact that Eq. (37) the nilpotent operators Aλ,m with m > 0 further link the (m+1) (m+1) (m) various generalized eigenvectors within each subspace k. implies that λ A = λk λ + λ , we derive h k | h k | h k | the explicit generalized-eigenvector decomposition of the Said more suggestively, generalized modes of a nondiag- nondiagonalizable operator A: onalizable subspace are necessarily cooperative. It is worth noting that the left eigenvectors and gen- X  A = Aλ A eralized left eigenvectors form a basis for all linear func-

λ∈ΛA tionals of the vector space spanned by the right eigen- n mk vectors and generalized right eigenvectors. Moreover, the X X (m) (m +1−m) = λ λ k A left eigenvectors and generalized left eigenvectors are ex- | k i h k | k=1 m=1 actly the dual basis to the right eigenvectors and gener- n mk X X (m)  (mk+1−m) (mk−m)  alized right eigenvectors by their orthonormality proper- = λ λk λ + λ | k i h k | h k | ties. However, neither the left nor right eigen-basis is a k=1 m=1 > priori more fundamental to the operator. Sympatheti-  (m) m1     (m1+1−m) m1  λ1 m=1 J1 0 0 λ1 m=1 cally, the right eigenvectors and generalized eigenvectors | (m)im2 ··· h (m2+1−m)|m2  λ   0 J2 0   λ  form a (dual) basis for all linear functionals of the vector  | 2 i m=1  ···   h 2 | m=1  =  .   . . . .   .  space spanned by the left eigenvectors and generalized  .   . . .. .   .      eigenvectors.  (m) mn  (mn+1−m) mn λn 0 0 Jn λn | i m=1 ··· h | m=1 = YJY −1 , 4. Simplified calculi for special cases where, defining Y as:

> In special cases, the meromorphic functional calculus  (m) m1  λ1 reduces the general expressions above to markedly sim- | i m=1  (m) m2   λ2 m=1 pler forms. And, this can greatly expedite practical cal- Y =  | i  ,  .  culations and provide physical intuition. Here, we show  .   (m) mn which reductions can be used under which assumptions. λn | i m=1 For functions of operators with a countable spectrum, recall that the general form of the meromorphic func- tional calculus is:

ν −1 X Xλ 1 I f(z) f(A) = Aλ,m m+1 dz . (40) 2πi Cλ (z λ) λ∈ΛA m=0 − 17

Equations (18) and (39) gave the method to calculate we illustrate how commonly the Drazin inverse arises Aλ,m in terms of eigenvectors and generalized eigenvec- in nonequilibrium thermodynamics, giving a roadmap to tors. developing closed-from expressions for a number of key When the operator is diagonalizable (not necessarily observables. Fourth, we turn to signal analysis and com- normal), this reduces to: ment on power spectra of processes generated by non- diagonalizable operators. Finally, we round out the ap- I X 1 f(z) plications with a general discussion of Ruelle–Frobenius– f(A) = Aλ dz , (41) 2πi Cλ (z λ) Perron and Koopman operators for nonlinear dynamical λ∈ΛA − systems. where Aλ can now be constructed from conventional right and left eigenvectors, although λj is not necessarily the h | conjugate transpose of λj . | i A. Spectra of stochastic transition operators When the function is analytic on the spectrum of the (not necessarily diagonalizable) operator, then our func- The preceding employed the notation that A represents tional calculus reduces to the holomorphic functional cal- a general linear operator. In the following examples, we culus: reserve the symbol T for the operator of a stochastic tran-

νλ−1 sition dynamic. If the state-space is finite and has a sta- X X f (m)(λ) f(A) = Aλ,m . (42) tionary distribution, then T has a representation that is m! λ∈ΛA m=0 a nonnegative row-stochastic—all rows sum to unity— transition matrix. When the function is analytic on the spectrum of a The transition matrix’s nonnegativity guarantees that diagonalizable (not necessarily normal) operator this re- for each λ ΛT its complex conjugate λ is also in ΛT . duces yet again to: ∈ Moreover, the projection operator associated with the X complex conjugate of λ is the complex conjugate of Tλ: f(A) = f(λ)Aλ . (43) T = Tλ. λ∈ΛA λ If the dynamic induced by T has a stationary distribu- When the function is analytic on the spectrum of a tion over the state space, then the spectral radius of T is diagonalizable (not necessarily normal) operator with no unity and all of T ’s eigenvalues lie on or within the unit degeneracy this reduces even further to: circle in the complex plane. The maximal eigenvalues have unity magnitude and 1 ΛT . Moreover, an exten- X λ λ ∈ f(A) = f(λ)| i h | . (44) sion of the Perron–Frobenius theorem guarantees that λ λ eigenvalues on the unit circle have algebraic multiplicity λ∈ΛA h | i equal to their geometric multiplicity. And, so, νζ = 1 for Finally, recall that an operator is normal when it com- all ζ λ ΛT : λ = 1 . ∈ { ∈ | | } mutes with its conjugate transpose. If the function is T ’s index-one eigenvalue of λ = 1 is associated with analytic on the spectrum of a normal operator, then we stationarity of the associated Markov process. T ’s other recover the simple form enabled by the spectral theo- eigenvalues on the unit circle are roots of unity and cor- rem of normal operators familiar in physics. That is, respond to deterministic periodicities within the process. Eq. (43) is applicable, but now we have the extra sim- All of these results carry over from discrete to contin- plification that λ is simply the conjugate transpose of tG j uous time. In continuous time, where e = Tt0→t0+t, h† | λj : λj = λj . T ’s stationary eigenvalue of unity maps to G’s station- | i h | | i ary eigenvalue of zero. If the dynamic has a stationary distribution over the state space, then the rate matrix G VI. EXAMPLES AND APPLICATIONS is row-sum zero rather than row-stochastic. T ’s eigenval- ues, on or within the unit circle, map to G’s eigenvalues To illustrate the use and power of the meromorphic with nonpositive real part in the left-hand side of the functional calculus, we now adapt it to analyze a suite of complex plane. applications from quite distinct domains. First, we point To reduce ambiguity in the presence of multiple oper- to a set of example calculations for finite-dimensional op- ators, functions of operators, and spectral mapping, we erators of stochastic processes. Second, we show that occasionally denote eigenvectors with subscripted opera- the familiar Poisson process is intrinsically nondiagonal- tors on the eigenvalues within the bra or ket. For exam- izable, and hint that nondiagonalizability may be com- ple, 0G = 1T = 0G = 1T = 0T disambiguates the | i | i 6 | i | i 6 | i mon more generally in semi-Markov processes. Third, identification of 0 when we have operators G, T , , and | i G 18

τG τG with T = e , = e , and 0 ΛG, ΛG, ΛT . T T ∈ r r r r 0 1 2 . . . N

B. Randomness and memory in correlated processes FIG. 1: Explicit Markov-chain representation of the The generalized spectral theory developed here has re- continuous-time truncated Poisson dynamic, giving cently been applied to give the first closed-form expres- interstate transition rates r among the first N + 1 sions for many measures of complexity for stochastic pro- counter-states. (State self-transition rates r are not − cesses that can be generated by probabilistic finite au- depicted.) Taking the limit of N recovers the full → ∞ tomata [19–23]. Rather than belabor the Kolmogorov– Poisson counting distribution. It can either be Chaitin notion of complexity which is inherently uncom- time-homogeneous (transition-rate parameter r is time-independent) or time-inhomogeneous (parameter r putable [47], the new analytic framework here infuses is time-dependent). computational mechanics [48] with a means to compute very practical answers about an observed system’s orga- nization and to address the challenges of prediction. powerful results. For example, we can now answer the obvious questions regarding prediction: How random is a process? How much information is shared between the past and the fu- C. Poisson point processes ture? How far into the past must we look to predict what is predictable about the future? How much about the observed history must be remembered to predict what The functional calculus leads naturally to a novel per- is predictable about the future? And so on. The Sup- spective on the familiar Poisson counting process—a fa- plementary Materials of Ref. [19] exploit the generalized miliar stochastic process class used widely across physics spectral theory to answer these (and more) questions for and other quantitative sciences to describe “completely the symbolic dynamics of a chaotic map, the spacetime random” event durations that occur over a continuous domain for an elementary cellular automata, and the domain [51–54]. The calculus shows that the basic Pois- chaotic crystallographic structure of a close-packed poly- son distribution arises as the signature of a simple nondi- typic material as determined from experimental X-ray agonalizable dynamic. More to the point, we derive the diffractograms. Poisson distribution directly, without requiring the limit In the context of the current exposition, the most no- of the discrete-time binomial distribution, as convention- table feature of the analyses across these many domains ally done [29]. is that our questions, which entail tracking an observer’s Consider all possible counts, up to some arbitrarily state of knowledge about a process, necessarily induce large integer N. The dynamics among these first N + 1 a nondiagonalizable metadynamic that becomes the cen- counter states constitute what can be called the trun- tral object of analysis in each case. (This metadynamic is cated Poisson dynamic. We recover the full Poisson dis- tribution as N . A Markov chain for the truncated the so-called mixed-state presentation of Refs. [49, 50].) → ∞ This theme, and the inherent nondiagonalizability of Poisson dynamic is shown in Fig. 1. The corresponding prediction, is explored in greater depth elsewhere [22, 23]. rate matrix G, for any arbitrarily large truncation N of We also found that another nondiagonalizable dynamic the possible count, is: is induced even in the context of quantum communica-  r r  tion when determining how much memory reduction can −  r r  be achieved if we generate a classical stochastic process  −   . . 
 using quantum mechanics [24]. G =  .. ..  ,   We mention the above nondiagonalizable metadynam-  r r  − ics primarily as a pointer to concrete worked-out exam- r − ples where the generalized spectral theory has been em- ployed to analyze finitary hidden Markov processes via where Gij is the rate of transitioning to state (count) explicitly calculated, generalized eigenvectors and projec- j given that the system is in state (count) i. Elements tion operators. We now return to a more self-contained not on either the main diagonal or first superdiagonal are discussion, where we show that nondiagonalizability can zero. This can be rewritten succinctly as: be induced by the simple act of counting. Moreover, G = rI + rD1 , the theory developed is then applied to deliver quick and − 19 where I is the identity operator in N-dimensions and quence that: D1 is the upshift-by-1 matrix in N-dimensions, with n −rt zeros everywhere, except 1s along the first superdiag- (rt) e δ0 T (t) δn = onal. Let us also define the upshift-by-m matrix Dm h | | i n! with zeros everywhere except 1s along the mth super- = δm T (t) δm+n . m n h | | i diagonal, such that Dm = D1 and Dm = Dm·n, with That is, the probability that the counter is incremented D0 = I. Operationally, if δ` is the probability distribu- h | tion over counter states that is peaked solely at state `, by n in a time interval t is independent of the initial count and given by: (rt)ne−rt/n!. then δ` Dm = δ`+m . h | h | For any arbitrarily large N, G’s eigenvalues are given Let us emphasize that these steps derived the Poisson by det(G λI) = ( r λ)N+1 = 0, from which we see distribution directly, rather than as the typical limit of − − − the binomial distribution. Our derivation depended crit- that its spectrum is the singleton: ΛG = r . More- {− } ically on spectral manipulations of a highly nondiagonal- over, since it has algebraic multiplicity a−r = N + 1 and izable operator. Moreover, our result for the transition geometric multiplicity g−r = 1, the index of the r eigen- − dynamic T (t) allows a direct analysis of how distributions value is ν−r = N + 1. Since r is the only eigenvalue, − and all projection operators must sum to the identity, we over counts evolve in time, as would be necessary, say, in a Bayesian setting with unknown prior count. This type must have the eigenprojection: G−r = I. The lesson is that the Poisson point process is highly nondiagonaliz- of calculus can immediately be applied to the analysis able. of more sophisticated processes, for which we can gen- erally expect nondiagonalizability to play an important functional role.

1. Homogeneous Poisson processes

When the transition rate r between counter states is constant in time, the net counter state-to-state transition 2. Inhomogeneous Poisson processes operator from initial time 0 to later time t is given simply by: Let us now generalize to time-inhomogeneous Pois-

tG son processes, where the transition rate r between count T (t) = e . events is instantaneously uniform, but varies in time as r(t). Conveniently, the associated rate matrices at dif- The functional calculus allows us to directly evaluate ferent times commute with each other. Specifically, with etG for the Poisson nondiagonalizable transition-rate op- Ga = aI + aD1 and Gb = bI + bD1, we see that: erator G; we find: − −

[Ga,Gb] = 0 . T (t) = etG ν −1 X Xλ  I etz  Therefore, the net counter state-to-state transition oper- = G (G λI)m 1 dz λ 2πi m ator from time t0 to time tf is given by: − Cλ (z λ) λ∈ΛG m=0 − t N m R f G(t) dt t0 X m 1 d tz Tt0,tf = e = lim I(G + rI) lim m e N→∞ m! z→−r dz R tf  m=0 r(t) dt (−I+D1) | {z } = e t0 tme−rt hri(∆t)(−I+D ) ∞ m −rt = e 1 X m t e = (rD1) m! (∆t)Ghri m=0 = e , (45) ∞ X (rt)me−rt = Dm . where ∆t = tf t0 is the time elapsed and: m! − m=0 Z tf r = 1 r(t) dt Consider the orthonormality relation δi δj = δi,j be- h i ∆t h | i t0 tween counter states, where δj is represented by 0s ev- | i erywhere except for a 1 at counter-state j. It effectively is the average rate during that time. Given Eq. (45), measures the occupation probability of counter-state j. the functional calculus proceeds just as in the time- Employing the result for T (t), we find the simple conse- homogeneous case to give the analogous net transition 20 dynamic: ther equilibrium steady states or nonequilibrium steady states [26]. ∞ m −hri∆t X r ∆t e Tt ,t = Dm h i . 0 f m! m=0

The probability that the count is incremented by n dur- 1. Dynamics in independent eigenspaces ing the time interval ∆t follows directly:

n r ∆t e−hri∆t An important feature of the functional calculus is its δm Tt ,t δm+n = h i . ability to address particular eigenspaces independently h | 0 f | i n! when necessary. This feature is often taken for granted With relative ease, our calculus allowed us to derive in the case of normal operators; say, in physical dynam- an important result for stochastic process theory that is ical systems when analyzing stationary distributions or nontrivial to derive by other means. Perhaps surprisingly, dominant decay modes. Consider a singular operator L we see that the probability distribution over final counts that is not necessarily normal and not necessarily diago- induced by any rate trajectory r(t) is the same as if the nalizable and evaluate the simple yet ubiquitous integral τ transition rate were held fixed at mean r throughout R etL dt. Via the meromorphic functional calculus we h i 0 the duration. Moreover, we can directly analyze the net find: evolution of distributions over counts using the derived τ Z τ νλ−1 I R etz dt transition operator Tt0,tf . tL X X 1 0 e dt = λ,m 2πi m+1 dz Note that the nondiagonalizability of the Poisson dy- 0 L Cλ (z λ) λ∈ΛL m=0 − namic is robust in a physical sense. That is, even varying ν0−1 I −1 τz the rate parameter in time in an erratic way, the inherent  X 1 z (e 1)  = 0,m − dz 2πi zm+1 structure of counting imposes a fundamental nondiago- m=0 L C0 nalizable nature. That nondiagonalizability can be ro- νλ−1 I −1 τz X X 1 z (e 1) bust in a physical sense is significant, since one might oth- + λ,m 2πi m−+1 dz L Cλ (z λ) erwise be tempted to argue that nondiagonalizability is λ∈ΛL\0 m=0 − extremely fragile due to numerical perturbations within ν0−1  X τ m+1  D τL  any matrix representation of the operator. This is simply = 0,m + e I , (46) (m+1)! L L − not the case since such perturbations are physically for- m=0 bidden. Rather, this simple example challenges us with where D is the Drazin inverse of , discussed earlier. the fact that some processes, even those familiar and L L The pole–pole interaction (z−1 with z−m−1) at z = widely used, are intrinsically nondiagonalizable. On the 0 distinguished the 0-eigenspace in the calculations and positive side, it appears that spectral methods can now required the meromorphic functional calculus for direct be applied to analyze them. And, this will be particularly analysis. The given solution to this integral will be useful important in more complex, memoryful processes [55– in the following. 58], including the hidden semi-Markov processes [51, 59] Next, we consider the case where is the transition- that are, roughly speaking, the cross-product of hidden L rate operator among the states of a structured stochas- finite-state Markov chains and renewal processes. tic dynamical system. This leads to several novel conse- quence within stochastic thermodynamics. D. Stochastic thermodynamics

The previous simple examples started to demonstrate the spectral methods of the functional calculus. Next, we 2. Green–Kubo relations show a novel application of the meromorphic functional calculus to environmentally driven mesoscopic dynami- Let us reconsider the above integral in the case when cal systems, selected to give a new set of results within the singular operator —let us call it G—is a transition- L nonequilibrium thermodynamics. In particular, we ana- rate operator that exhibits a single stationary distribu- lyze functions of singular transition-rate operators. No- tion. By the spectral mapping ln ΛeG of the eigenvalue tably, we show that the Drazin inverse arises naturally in 1 Λ G addressed in the Perron–Frobenius theorem, G’s ∈ e the general solution of Green–Kubo relations. We men- zero eigenmode is diagonalizable. And, by assuming a tion that it also arises when analyzing moments of the single attracting stationary distribution, the zero eigen- excess heat produced in the driven transitions atop ei- value has algebraic multiplicity a0 = 1. Equation (46) 21 then simplifies to: the logarithm of the Ruelle–Frobenius–Perron operator, as described later in § VI F 1— 0G 0G still induces the Z τ tG D τG  | i h | e dt = τ 0G 0G + G e I . (47) average over the steady-state trajectories. 0 | i h | − In the special case where the transition-rate operator is diagonalizable, AGDA is simply the integrated Since G is a transition-rate operator, the above integral − h is.s. contribution from a weighted sum of decaying exponen- corresponds to integrated time evolution. The Drazin in- tials. Transport coefficients then have a solution of the verse GD concentrates on the transient contribution be- simple form: yond the persistent stationary background. In Eq. (47), the subscript within the left and right eigenvectors explic- X 1 κ = 0G AGλA 0G . (49) itly links the eigenvectors to the operator G, to reduce − λ h | | i λ∈ΛG\0 ambiguity. Specifically, the projector 0G 0G maps any | i h | distribution to the stationary distribution. Note that the minus sign keeps κ positive since Re(λ) < 0 Green–Kubo-type relations [60, 61] connect the out- for λ ΛG 0 . Also, recall that G’s eigenvalues of-steady-state transport coefficients to the time integral ∈ \{ } with nonzero imaginary part occur in complex-conjugate of steady-state autocorrelation functions. They are thus pairs and G = Gλ. Moreover, if Gi,j is the classical very useful for understanding out-of-steady-state dissi- λ transition-rate from state i to state j (to disambiguate pation due to steady-state fluctuations. (Steady state from the transposed possibility), then 0G is the station- here refers to either equilibrium or nonequilibrium steady h | ary distribution. (The latter is sometimes denoted π in state.) Specifically, the Green–Kubo relation for a trans- h | the Markov process literature.) And, 0G is a column port coefficient, κ say, is typically of the form: | i vector of all ones (sometimes denoted 1 ) which acts to | i Z ∞ integrate contributions throughout the state space. 2  κ = A(0)A(t) s.s. A s.s. dt , A relationship of the form of Eq. (48), between the 0 h i − h i Drazin inverse of a classical transition-rate operator and where A(0) and A(t) are some observable of the station- a particular Green–Kubo relation was recently found ary stochastic dynamical system at time 0 and time t, in Ref. [62] for the friction for smoothly-driven respectively, and the subscript emphasizes that the transitions atop nonequilibrium steady states. 
Subse- h·is.s. expectation value is to be taken according to the steady- quently, a truncation of the eigen-expansion of the form state distribution. of Eq. (49) was recently used in a similar context to Using: bound a universal tradeoff between power, precision, and speed [63]. Equation (48) shows that a fundamental rela- tG  A(0)A(t) = tr 0G 0G A e A tionship between a physical property and a Drazin inverse h is.s. | i h | tG = 0G A e A 0G , is to be expected more generally whenever the property h | | i can be related to integrated correlation. the transport coefficient κ can be written more explicitly Notably, if a Green–Kubo-like relation integrates a in terms of the relevant transition-rate operator G for the cross-correlation, say between A(t) and B(t) rather than stochastic dynamics: an autocorrelation, then we have only the slight modifi- cation: Z τ tG 2 κ = lim 0G A e A 0G dt τ 0G A 0G Z ∞ τ→∞ h | | i − h | | i  D 0 A(0)B(t) s.s. A s.s. B s.s. dt= AG B s.s.. Z τ 0 h i −h i h i −h i  tG  2 = lim 0G A e dt A 0G τ 0G A 0G (50) τ→∞ h | 0 | i − h | | i D τG  = lim 0G AG e I A 0G The foregoing analysis bears on both classical and τ→∞ h | − | i quantum dynamics. G may be a so-called linear super- = AGDA . (48) − h is.s. operator in the quantum regime [64]; for example, the Thus, we learn that relations of Green–Kubo form are Lindblad superoperator [65, 66] that evolves density op- direct signatures of the Drazin inverse of the transition- erators. A Liouville-space representation [67] of the su- rate operator for the stochastic dynamic. peroperator, though, exposes the superficiality of the dis- The result of Eq. (48) holds quite generally. For ex- tinction between superoperator and operator. At an ab- ample, if the steady state has some number of periodic stract level, time evolution can be discussed uniformly flows, the result of Eq. (48) remains valid. Alternatively, across subfields and reinterpretations of Eq. (50) will be in the case of nonperiodic chaotic flows—where G will be found in each associated physical theory. Reference [26] presents additional constructive results ↵N ↵1

↵N 1 ↵2

↵4 ↵3

.

1 1

2 2

N 1 N 1

N N .

r r r r 0 1 2 N .

r r r r r 0 1 2 ... N

. ✏

vA vB ✏ . 22

The windowing function N τ appearing in Eq. (52) − | | t 1 t t+1 is a direct consequence of Eq. (51). It is not imposed S S S externally, as is common practice in signal analysis. This is important to subsequent derivations. The question we address is how to calculate the cor- Xt 1 Xt Xt+1 relation function and power spectrum given a model of the signal’s generator. To this end, we briefly introduce hidden Markov models as signal generators and then use FIG. 2: Bayes network for a state-emitting hidden the meromorphic calculus to calculate their autocorre- Markov model graphically depicts the structure of lation and power spectra in closed-form. This leads to conditional independence among random variables for several lessons. First, we see that the power spectrum the latent state n n∈ at each time n and the random is a direct fingerprint of the resolvent of the genera- {S } Z variables Xn n∈ for the observation at each time n. tor’s time-evolution operator, analyzed along the unit cir- { } Z cle. Second, spectrally decomposing the not-necessarily- diagonalizable time evolution operator, we derive the most general qualitative behavior of the autocorrelation that emphasize the ubiquity of integrated correlation function and power spectra. Third, contributions from and Drazin inverses in the transitions between steady eigenvalues on the unit circle must be extracted and states [68], relevant to the fluctuations within any phys- 6 dealt with separately. Contributions from eigenvalues ical dynamic. Overall, these results support the broader on the unit circle correspond to Dirac delta functions— notion that dissipation depends on the structure of cor- the analog of Bragg peaks in diffraction. Whereas, relation. eigen-contributions from inside the unit circle correspond Frequency-dependent generalizations of integrated cor- to diffuse peaks, which become sharper for eigenvalues relation have a corresponding general solution. To be closer to the unit circle. Finally, nondiagonalizable eigen- slightly less abstract, we give novel representative formu- modes yield qualitatively different line profiles than their lae for a particular application: the general solution to diagonalizable counterparts. In short, when applied to power spectra of a process generated by any countable- signal analysis our generalized spectral decomposition state hidden Markov chain. has directly measurable consequences. This has been key to analyzing low-dimensional disordered materials, for example, when adapted to X-ray diffraction spectra [20, 21, 70]. E. Power spectra Let the 4-tuple = , , ,T  be a discrete- M S A P time state-emitting hidden Markov model (HMM) A signal’s power spectrum quantifies how its power is that generates the stationary stochastic process distributed across frequency [69]. For a discrete-domain ...X−2X−1X0X1X2 ... according to the following. S process it is: is the (finite) set of latent states of the hidden Markov chain and is the observable alphabet. t is the * N + C 2 A ⊆ S 1 X −iωn random variable for the hidden state at time t that P (ω) = lim N Xne , (51) N→∞ takes on values s . Xt is the random variable for the n=1 ∈ S observation at time t that takes on values x . Given ∈ A where ω is the angular frequency and Xn is the ran- the latent state at time t, the possible observations dom variable for the observation at time n. For a wide- are distributed according to the conditional probability  sense stationary stochastic process, the power spectrum density functions: = p(Xt = x t = s) . 
For P |S s∈S is also determined from the signal’s autocorrelation func- each s , p(Xt = x t = s) may be abbreviated as ∈ S |S tion γ(τ): p(x s) since the probability density function in each | state is assumed not to change over t. Finally, the N 1 X  −iωτ latent-state-to-state stochastic transition matrix T has P (ω) = lim N N τ γ(τ)e , (52) N→∞ − | | elements Ti,j = Pr( t+1 = sj t = si), which give the τ=−N S |S probability of transitioning from latent state si to sj where the autocorrelation function for a wide-sense sta- given that the system is in state si, where si, sj . ∈ S tionary stochastic process is defined: It is important for the subsequent derivation that we use Pr( ) to denote a probability in contrast to p( ) · density · γ(τ) = XnXn+τ n . which denotes a probability . The Bayes network 1 q Gaussian q 1 p p 1

SpikesAndSmooth

23

. diagram of Fig. 2 depicts the structure of conditional 1 q independence among the random variables.

q

1 1

k1 A FIG. 3: Simple 3-state state-emitting↵h HMM that 1. Continuous-value, discrete-state and -time processes generates a stochastic process accordingk2 to the A A state-to-state transition3↵m dynamic2↵m T and the probability A density functions (pdfs) p(x s) ↵s∈mS associated with Figure 3 gives a particular HMM with continuous ob- { | } k3 ⌦ each state. TheoremA 1 assertsA thatA its power spectrum servable alphabet = R distributed according to the m 2m 3m A will be the same (with only constant offset) as the probability density function shown within each latent ✏0 0 ✏0 0 ✏0 0 ✏1 1 ✏0 0 power spectrum generated from the alternative process state. Processes generated as the observation of a func- 3↵B 2↵B ↵B where the pdfs inm each statem are solelym concentrated at tion of a Markov chain can be of either finite or infi- k3 the Platonic average value x p (3x)B of the former pdf h i s m ⌦ nite Markov order. (They are, in fact, typically infinite associatedB withB the state. m 2m Markov order in the space of processes [71].) k2 B ↵h

Directly calculating, one finds thatk1 the autocorrelation function, for τ > 0, for any such HMM is:

k γ(τ) = XnXn+τ n 1 Z Z k2 = xx0p(X = x, X = x0) dx dx0 3↵ 2↵ ↵ 0 τ m x∈A mx0∈A m Z Z k3 X X 0 ⌦ 0 0 0 = xx p(X0 = x, Xτ = x , 0 = s, τ = s ) dx dx m 2m 3m0 S S s∈S s0∈S x∈A x ∈A Z Z ↵h X X 0 0 0 0 0 = xx Pr( 0 = s, τ = s ) p(X0 = x 0 = s) p(Xτ = x τ = s ) dx dx 0 S S |S |S s∈S s0∈S x∈A x ∈A Z Z 7 X X τ    0 0 0 0 = π δs δs T δs0 δs0 1 x p(x s) dx x p(x s ) dx h | i h | | i h | i | 0 | s∈S s0∈S x∈A x ∈A X  τ  X  = π x δs δs T x 0 δs0 δs0 1 , h | h ip(x|s) | i h | h ip(x|s ) | i h | | i s∈S s0∈S where:

0 0 0 0 0 p(X0 = x, Xτ = x , 0 = s, τ = s ) = Pr( 0 = s, τ = s )p(X0 = x, Xτ = x 0 = s, τ = s ) S S S S |S S holds by definition of conditional probability. The decomposition of:

0 0 0 0 p(X0 = x, Xτ = x 0 = s, τ = s ) = p(X0 = x 0 = s)p(Xτ = x τ = s ) |S S |S |S for τ = 0 follows from the conditional independence in the relevant Bayesian network shown in Fig. 2. Moreover, the 6 equality:

0 τ Pr( 0 = s, τ = s ) = π δs δs T δs0 δs0 1 S S h | i h | | i h | i 24 can be derived by marginalizing over all possible intervening state sequences. Note that δs is the column vector of | i all 0s except for a 1 at the index corresponding to state s and δs is simply its transpose. Recall that π = 1T is h | h | h | the stationary distribution induced by T over latent states and 1 = 1T is a column vector of all ones. Note also | i | i that π δs = Pr(s) and δs0 1 = 1. h | i h | i

Since the autocorrelation function is symmetric in τ spectrum is: and: ∞ X X 2 Pd(ω) = 2π δ(ω ωλ + 2πk) γ(0) = x p(x) − | | k=−∞ λ∈ΛT X 2 |λ|=1 = π x p(x|s) δs , −1  h | | | | i Re λ π Ω TλΩ 1 , (56) s∈S × h | | i iω we find the full autocorrelation function is given by: where ωλ is related to λ by λ = e λ . An extension of the Perron–Frobenius theorem guarantees that the eigenval- ( 2 x if τ = 0 ues of T on the unit circle have index νλ = 1. γ(τ) = | | , π Ω T |τ|−1Ω 1 if τ 1 When plotted as a function of the angular frequency ω h | | i | | ≥ around the unit circle, the power spectrum suggestively appears to emanate from the eigenvalues λ ΛT of the where Ω is the -by- matrix defined by: ∈ |S| |S| hidden linear dynamic. See Fig. 4 for the analysis of an X Ω = x δs δs T. (53) example parametrized process and the last two panels for h ip(x|s) | i h | s∈S this display mode for the power spectra. Eigenvalues of T on the unit circle yield Dirac delta The power spectrum is then calculated via Eq. (52) us- functions in the power spectrum. Eigenvalues of T within ing the meromorphic calculus. In particular, the power the unit circle yield more diffuse line profiles, increasingly spectrum decomposes naturally into a discrete part and diffuse as the magnitude of the eigenvalues retreats to- a continuous part. Full details will be given elsewhere, ward the origin. Moreover, the integrated magnitude of but the derivation is similar to that given in Ref. [20] for each contribution is determined by projecting pairwise the special case of diffraction patterns from HMMs. We observation operators onto the eigenspace emanating the note that it is important to treat individual eigenspaces contribution. Finally, we note that nondiagonalizable separately, as our generalized calculus naturally accom- eigenmodes yield qualitatively different line profiles. modates. The end result, for the continuous part of the Remarkably, the power spectrum generated by such power spectrum, is: a process is the same as the that generated by a po- tentially much simpler one—a process that is a function 2 iω −1 Pc(ω) = x + 2 Re π Ω e I T Ω 1 . (54) | | h | − | i of the same underlying Markov chain but instead emits the state-dependent expectation value of the observable All of the ω-dependence is in the resolvent. Using the within each state: spectral expansion of the resolvent given by Eq. (21) al-  lows us to better understand the qualitative possibilities Theorem 1. Let = ps(x) be any set of proba- P s∈S for the shape of the power spectrum: bility distribution functions over the domain C. Let A ⊆ =  x and let = δ(x x ) . B h ips(x) s∈S Q − h ips(x) s∈S νλ−1 Then, the power spectrum generated by any hidden 2 X X π Ω Tλ,mΩ 1 Pc(ω) = x + 2 Reh | | i . (55)  | | (eiω λ)m+1 Markov model = , , ,T differs at most by a λ∈Λ m=0 M S A P T − constant offset from the power spectrum generated by the hidden Markov model 0 = , , ,T  that has the Note that π Ω Tλ,mΩ 1 is a complex-valued scalar and M S B Q h | | i same latent Markov chain but in any state s emits, all of the frequency dependence now handily resides in ∈ S with probability one, the average value x of the the denominator. h ips(x) state-conditioned probability density function p (x) The discrete portion (delta functions) of the power s of . ∈ P M Proof. From Eqs. (54) and (56), we see that Pc(ω) + 2  Pd(ω) x depends only on T and x s∈S . 
− | | h ip(x|s)} This shows that all HMMs that share the same T and  x s∈S have the same power spectrum P (ω) = h ip(x|s)} Pc(ω) + Pd(ω) besides a constant offset determined by . 2 1 1

1 3 1 p p p

0 1 p -1 1 1

0 -1

1 1

-1 -1

1 -1 1

. 1 b 0 -1 1 b b 1

0 1 1 b 3 -1 1 1

1 1 2 -1 -1 25 1 -1 1

. 1 b 0 -1 1 1 b b 0 -1 1 1 b 3 1 1 1 1

-1 2 -1

1 1 -1 -1 1 (a) A b-parametrized HMM (b) Eigenvalue evolution for all (c) Power spectrum and (d) Power spectrum and with mean values of each λ ∈ ΛT sweeping transition eigenvalues at b = 3/4. eigenvalues at b = 1/4.

state’s pdf hxip(x|s) indicated parameter b from 1 to 0. as the number inside each state.

FIG. 4: Parametrized HMM generator of a stochastic8 process, its eigenvalue evolution, and two coronal spectrograms showing power spectra emanating from eigen-spectra.

2 differences in x . the persistent eigenvalue of λT = 1, which is guaranteed | | One immediate consequence is that any hidden Markov by the Perron–Frobenius theorem. chain with any arbitrary set of zero-mean distributions In Fig. 4c and again, at another parameter setting, attached to each state, i.e.: in Fig. 4d, we show the continuous part of the power spectrum P (ω) (plotted around the unit circle in solid  c p(x s) s∈S : x = 0 for all s , blue) and the eigen-spectrum (plotted as red dots on P ∈ { | } h ip(x|s) ∈ S and within the unit circle) of the state-to-state transition generates a flat power spectrum with the appearance of matrix for the 11-state hidden Markov chain (leftmost white noise. On the one hand, this strongly suggests panel) that generates it. There is also a δ-function con- to data analysts to look beyond power spectra when at- tribution to the power spectrum at ω = 0 (corresponding tempting to extract a process’ full architecture. On the to λT = 1). This is not shown. These coronal spectro- other, whenever a process’s power spectrum is struc- grams illustrate how the power spectrum emanates from tured, it is a direct fingerprint of the resolvent of the the HMM’s eigen-spectrum, with sharper peaks when the hidden linear dynamic. In short, the power spectrum is eigenvalues are closer to the unit circle. This observation a filtered image of the resolvent along the unit circle. is fully explained by Eq. (55). The integrated magnitude The power spectrum of a particular stochastic process of each peak depends on π Ω λ λ Ω 1 . h | | i h | | i is shown in Fig. 4 and using coronal spectrograms, intro- Interestingly, the apparent continuous spectrum com- duced in Ref. [20], it illustrates how the observed spec- ponent is the shadow of the discrete spectrum of nonuni- trum can be thought of as emanating from the spectrum tary dynamics. This suggests that resonances in various of the hidden linear dynamic, as all power spectra must. physics domains concerned with a continuous spectrum Figure 4a shows the state-emitting HMM with state-to- can be modeled as simple consequences of nonunitary dy- state transition probabilities parametrized by b; the mean namics. Indeed, hints of this appear in the literature [72– values x of each state’s pdf p(x s) are indicated as h ip(x|s) | 74]. the blue number inside each state. The process generated depends on the actual pdfs and the transition parameter b although, and this is our point, the power spectrum is 2. Continuous-time processes ignorant to the details of the pdfs. The evolution of the eigenvalues ΛT of the transition We close this exploration of conventional signal analy- dynamic among latent states is shown from thick blue to sis methods using the meromorphic calculus by comment- thin red markers in Fig. 4b, as we sweep the transition ing on continuous-time processes. Analogous formulae parameter b from 1 to 0. A subset of the eigenvalues pass can be derived with similar methods for continuous-time continuously but very quickly through the origin of the hidden Markov jump processes and continuous-time de- complex plane as b passes through 1/2. The continuity of terministic (possibly chaotic) dynamics in terms of the this is not immediately apparent numerically, but can be generator G of time evolution. For example, the continu- revealed with a finer increment of b near b 1/2. 
Notice ≈ ous part Pc(ω) of the power spectrum from a continuous- 26 time deterministic dynamic has the form: eigenvector 1, either 0G or 1T , is uniform over the h | h | space. Other modes of the operator’s action, according −1 Pc(ω) = 2 Re π Ω iωI G Ω 1 . to the eigenvalues and left and right eigenvectors and h | − | i generalized eigenvectors, capture the decay of arbitrary Appealing to the resolvent’s spectral expansion again al- 3 distributions on R . lows us to better understand the possible shapes of their The meromorphic spectral methods developed above power spectrum: give a view of the Koopman operator and Koopman modes of nominally nonlinear dynamical systems [4] that νλ−1 X X π Ω Gλ,mΩ 1 is complementary to the Ruelle–Frobenius–Perron op- Pc(ω) = 2 Reh | | i . (57) (iω λ)m+1 erator. The Koopman operator K is the adjoint—in λ∈ΛG m=0 − the sense of vector spaces, not inner product spaces—of Since all of the frequency-dependence has been isolated the Ruelle–Frobenius–Perron operator T : effectively, the > in the denominator and π Ω Gλ,mΩ 1 is a frequency- transpose K = T . Moreover, it has the same spectrum h | | i independent complex-valued constant, peaks in Pc(ω) with only right and left swapping of the eigenvectors and c can only arise via contributions of the form Re (iω−λ)n generalized eigenvectors. for c C, ω R, λ ΛG, and n Z+. This provides a The Ruelle–Frobenius–Perron operator T is usually as- ∈ ∈ ∈ ∈ rich starting point for application and further theoretical sociated with the evolution of probability density, while investigation. For example, Eq. (57) helps explain the the Koopman operator K is usually associated with shapes of power spectra of nonlinear dynamical systems, the evolution of linear functionals of probability den- as have appeared, e.g., in Ref. [75]. Furthermore, it sug- sity. The duality of perspectives is associative in na- n  gests an approach to the inverse problem of inferring the ture: f T ρ0 corresponds to the Ruelle–Frobenius– h | | i spectrum of the hidden linear dynamic via power spectra. Perron perspective with T acting on the density ρ and n > In the next section, however, we develop a more general f T ρ0 corresponds to the Koopman operator T = h | | i proposal for inferring eigenvalues from a time series. Fur- K acting on the observation function f. Allowing an ob- ther developments will appear elsewhere. servation vector f~ = [f1, f2, . . . fm] of linear functionals, and inspecting the most general form of Kn given by Eq. (25) together with the generalized eigenvector decom- F. Operators for chaotic dynamics position of the projection operators of Eq. (39), yields the most general form of the dynamics in terms of Koopman Since trajectories in state-space can be generated in- modes. Each Koopman mode is a length-m vector-valued dependently of each other, any nonlinear dynamic cor- functional of a Ruelle–Frobenius–Perron right eigenvec- responds to a linear operation on an infinite-dimensional tor or generalized eigenvector. vector-space of complex-valued distributions (in the sense Both approaches suffer when their operators are defec- of generalized functions) over the original state-space. tive. Given the meromorphic calculus’ ability to work For example, the well-known Lorenz ordinary differential around a wide class of such defects, adapting it the equations [76] are nonlinear over its three given state- Ruelle–Frobenius–Perron and Koopman operators sug- space variables—x, y, and z. 
Nevertheless, the dynamic gests that it may lift their decades-long restriction to is linear in the infinite-dimensional vector space D(R3) only analyzing highly idealized (e.g., hyperbolic) chaotic of distributions over R3. Although D(R3) is an unwieldy systems. state-space, the dynamics there might be well approxi- mated by a finite truncation of its modes.

1. Ruelle–Frobenius–Perron and Koopman operators

The preceding operator formalism applies, in princi- 2. Eigenvalues from a time series ple at least. The question, of course, is, Is it practi- cal and does it lead to constructive consequences? Let’s Let’s explore an additional benefit of this view of the see. The right eigenvector is either 0G or 1T with Ruelle–Frobenius–Perron and Koopman operators, by | i | i T = eτG as the Ruelle–Frobenius–Perron transition op- proposing a novel method to extract the eigenvalues of −1 erator [77, 78]. Equivalently, it is also π, the stationary a nominally nonlinear dynamic. Let ON (f, z) be (z distribution, with support on attracting subsets of R3 in times) the z-transform [79, pp. 257–262] of a length-N the case of the Lorenz dynamic. The corresponding left- sequence of τ-spaced type-f observations of a dynamical 27 system: tral theory of normal operators as it made accessible the phenomena of the quantum mechanics of closed systems. N X This turns on nondiagonalizability and appreciating how O (f, z) z−1 z−n f T n ρ N 0 ubiquitous it is. ≡ n=0 h | | i −1 Nondiagonalizability has consequences for settings as N→∞ f (zI T ) ρ0 → h | − | i simple as counting, as shown in § VI C. Moreover, there νλ−1 X X f Tλ,m ρ0 we found that nondiagonalizability can be robust. The = h | | i , (reiω λ)m+1 Drazin inverse, the negative-one power in the meromor- λ∈Λ m=0 T − phic functional calculus, is quite common in the nonequi- n librium thermodynamics of open systems, as we showed as N for z = r > 1. Note that f T ρ0 is simply → ∞ | | h | | i the f-observation of the system at time nτ, when the in § VI D. Finally, we showed that the spectral charac- ter of nonnormal and nondiagonalizable operators man- system started in state ρ0. We see that this z-transform of observations automatically induces the resolvent of the ifests itself physically, as illustrated by Figs. 4c and 4d hidden linear dynamic. If the process is continuous-time, of § VI E. Our new formulae for spectral projection op- τG τλG erators and the orthonormality relation among left and then T = e implies λT = e , so that the eigenvalues should shift along the unit circle if τ changes; but the right generalized eigenvectors will thus likely find use in eigenvalues should be invariant to τ in the appropriate the analytic treatment of complex physical systems. τ-dependent conformal mapping of the inside of the unit From the perspective of functional calculus, nonuni- circle of the complex plane to the left half complex plane. tary time evolution, open systems, and non-Hermitian Specifically, for any experimentally accessible choice of generators are closely related concepts since they all rely inter-measurement temporal spacing τ, the fundamental on the manipulation of nonnormal operators. More- over, each domain is gaining traction. Nonnormal opera- set of continuous-time eigenvalues ΛG can be obtained 1 tors have recently drawn attention, from the nonequilib- from λG = τ ln λT , where each λT ΛT is extrapolated iω n ∈ iω rium thermodynamics of nanoscale systems [80] to large- from c/(re λT ) curves fit to ON (f, re ) for c C, − ∈ large N, and fixed r. scale cosmological evolution [81]. In another arena en- tirely, complex directed networks [82] correspond to non- The square magnitude of ON (f, z) is related to the power spectrum generated by f-type observations of the normal and not-necessarily-diagonalizable weighted di- system. Indeed, the power spectrum generated by any graphs. 
There are even hints that nondiagonalizable net- type of observation of a nominally nonlinear system is a work structures can be optimal for implementing cer- direct fingerprint of the eigenspectrum and resolvent of tain dynamical functionalities [83]. The opportunity here the hidden linear dynamic. This suggests many opportu- should be contrasted with the well established field of nities for inferring eigenvalues and projection operators spectral [84] that typically considers con- from frequency-domain transformations of a time series. sequences of the spectral theorem for normal operators applied to the symmetric (and thus normal) adjacency matrices and Laplacian matrices. It seems that the VII. CONCLUSION meromorphic calculus and its generalized spectral the- ory will enable a spectral weighted digraph theory beyond The original, abstract spectral theory of normal op- the purview of current spectral graph theory. erators rose to central importance when, in the early Even if the underlying dynamic is diagonalizable, par- development of quantum mechanics, the eigenvalues of ticular questions or particular choices of observable often Hermitian operators were detected experimentally in the induce a nondiagonalizable hidden linear dynamic. The optical spectra of energetic transitions of excited elec- examples already showed this arising from the simple im- trons. We extended this powerful theory by introducing position of counting or assuming a Poissonian dynamic. the meromorphic functional calculus, and unraveling the In more sophisticated examples, we recently found non- consequences of both the holomorphic and meromorphic diagonalizable dynamic structures in quantum memory functional calculi in terms of spectral projection opera- reduction [24], classical complexity measures [19], and tors and their associated left and right generalized eigen- prediction [22, 23]. vectors. The result is a tractable spectral theory of non- Our goal has been to develop tractable, exact analyti- normal operators. Our straightforward examples suggest cal techniques for nondiagonalizable systems. We did not that the spectral properties of these general operators discuss numerical implementation of algorithms that nat- should also be experimentally accessible in the behav- urally accompany its practical application. Nevertheless, ior of complex—open, strongly interacting—systems. We the theory does suggest new algorithms—for the Drazin see a direct parallel with the success of the original spec- inverse, projection operators, power spectra, and more. 28

Guided by the meromorphic calculus, such algorithms infinite-dimensional systems and continuous spectra. An- can be made robust despite the common knowledge that other direction forward is to develop creation and annihi- numerics with nondiagonalizable matrices is sensitive in lation operators within nondiagonalizable dynamics. In certain ways. the study of complex stochastic information processing, The extended spectral theory we have drawn out of the for example, this would allow analytic study of infinite- holomorphic and meromorphic functional calculi com- memory processes generated by, say, stochastic push- plement efforts to address nondiagonalizability, e.g., via down and counter automata [58, 87–89]. In a physical pseudospectra [85, 86]. It also extends and simplifies pre- context, such operators may aid in the study of open viously known results, especially as developed by Dun- quantum field theories. One might finally speculate that ford [16]. Just as the spectral theorem for normal op- the Drazin inverse will help tame the divergences that erators enabled much theoretical progress in physics, we arise there. hope that our generalized and tractable analytic frame- work yields rigorous understanding for much broader classes of complex system. Importantly, the analytic framework should enable a new theory of complex sys- ACKNOWLEDGMENTS tems beyond the limited purview of numerical investiga- tions. JPC thanks the Santa Fe Institute for its hospital- While the infinite-dimensional theory is in princi- ity. The authors thank John Mahoney, Sarah Marzen, ple readily obtained from the present framework, spe- Gregory Wimsatt, and Alec Boyd for helpful discussions. cial care must be taken to guarantee a similar level of We especially thank Gregory Wimsatt for his assistance tractability and generality. Nevertheless, even the finite- with § V B 3. This material is based upon work sup- dimensional theory enables a new level of tractability for ported by, or in part by, the U. S. Army Research Labora- analyzing not-necessarily-diagonalizable systems, includ- tory and the U. S. Army Research Office under contracts ing nonnormal dynamics. Future work will take full ad- W911NF-12-1-0234, W911NF-13-1-0390, and W911NF- vantage of the operator theory, with more emphasis on 13-1-0340.

[1] A. Einstein. On the method of theoretical physics. Phi- [10] L. Sirovich and M. Kirby. Low-dimensional procedure for losophy of Science, 1(2):163–169, April 1934. The Her- the characterization of human faces. J. Opt. Soc. Am. A, bert Spencer Lecture, delivered at Oxford (10 June 1953). 4(3):519–524, Mar 1987. 2 1 [11] R. Courant and D. Hilbert. Methods of mathematical [2] B. O. Koopman. Hamiltonian systems and transfor- physics: first English edition, volume 1. Interscience Pub- mation in hilbert space. Proceedings of the National lishers, 1953. 2 Academy of Sciences, 17(5):315–318, 1931. 2 [12] J. von Neumann. Zur algebra der funktionaloperationen [3] P. Gaspard, G. Nicolis, A. Provata, and S. Tasaki. Spec- und theorie der normalen operatoren. Math. Annalen, tral signature of the pitchfork bifurcation: Liouville equa- 102:370–427, 1930. 2 tion approach. Phys. Rev. E, 51:74–94, Jan 1995. [13] J. von Neumann. Mathematical Foundations of Quantum [4] M. Budii, R. Mohr, and I. Mezic. Applied Koopmanism. Mechanics. Princeton University Press, Princeton, New Chaos, 22(4), 2012. 2, 26 Jersey, 1955. 2 [5] N. Trefethen. Favorite eigenvalue problems. SIAM News, [14] S. Hassani. Mathematical Physics. Springer, New York, 44(10), Dec 2011. 2 1999. 2, 4, 7, 8 [6] A. Sandage and G. A. Tammann. Steps toward the Hub- [15] R. R. Halmos. Finite-Dimensional Vector Spaces. D. Van ble constant. VII-distances to NGC 2403, M101, and the Nostrand Company, 1958. 2, 3, 13 Virgo cluster using 21 centimeter line widths compared [16] N. Dunford. Spectral operators. Pacific J. Math., with optical methods: The global value of H sub 0. As- 4(3):321–354, 1954. 3, 7, 28 trophys. J., 210:7–24, 1976. 2 [17] C. D. Meyer. Matrix Analysis and Applied Linear Alge- [7] A.G. Milnes. Semiconductor heterojunction topics: bra. SIAM Press, Philadephia, Pennsylvannia, 2000. 3, Introduction and overview. Solid-State Electronics, 5, 14 29(2):99 – 121, 1986. 2 [18] P. J. Antsaklis and A. N. Michel. A Linear Systems [8] P. A. M. Dirac. Theory of electrons and positrons. In Primer. Springer Science & Business Media, New York, Nobel Lecture, Physics 1922–1941. Elsevier Publishing New York, 2007. 3 Company, Amsterdam, 1965. 2 [19] J. P. Crutchfield, C. J. Ellison, and P. M. Riechers. Exact [9] C. Cortes and V. Vapnik. Support-vector networks. Ma- complexity: Spectral decomposition of intrinsic compu- chine Learning, 20(3):273–297, 1995. 2 tation. Phys. Lett. A, 380(9-10):998–1002, 2015. 3, 18, 29

[20] P. M. Riechers, D. P. Varn, and J. P. Crutchfield. Diffraction patterns of layered close-packed structures from hidden Markov models. arXiv:1410.5028.
[21] P. M. Riechers, D. P. Varn, and J. P. Crutchfield. Pairwise correlations in layered close-packed structures. Acta Cryst. A, 71:423-443, 2015.
[22] P. M. Riechers and J. P. Crutchfield. Spectral simplicity of apparent complexity, Part I: The nondiagonalizable metadynamics of prediction. arXiv:1705.08042.
[23] P. M. Riechers and J. P. Crutchfield. Spectral simplicity of apparent complexity, Part II: Exact complexities and complexity spectra. arXiv:1706.00883.
[24] P. M. Riechers, J. R. Mahoney, C. Aghamohammadi, and J. P. Crutchfield. Minimized state complexity of quantum-encoded cryptic processes. Phys. Rev. A, 93:052317, May 2016.
[25] F. C. Binder, J. Thompson, and M. Gu. A practical, unitary simulator for non-Markovian complex processes. arXiv:1709.02375.
[26] P. M. Riechers and J. P. Crutchfield. Fluctuations when driving between nonequilibrium steady states. J. Stat. Phys., 168(4):873-918, 2017.
[27] A. B. Boyd, D. Mandal, P. M. Riechers, and J. P. Crutchfield. Transient dissipation and structural costs of physical information transduction. Phys. Rev. Lett., 118:220602, 2017.
[28] B. P. Lathi. Signal Processing and Linear Systems. Oxford University Press, New York, New York, 1998.
[29] M. L. Boas. Mathematical Methods in the Physical Sciences, volume 2. Wiley and Sons, New York, New York, 1966.
[30] N. Dunford. Spectral theory I. Convergence to projections. Trans. Am. Math. Soc., 54(2):185-217, 1943.
[31] M. Atiyah, R. Bott, and V. K. Patodi. On the heat equation and the index theorem. Inventiones Mathematicae, 19(4):279-330, 1973.
[32] C. S. Kubrusly. Spectral Theory of Operators on Hilbert Spaces. Springer Science & Business Media, 2012.
[33] M. Haase. Spectral mapping theorems for holomorphic functional calculi. J. London Math. Soc., 71(3):723-739, 2005.
[34] H. A. Gindler. An operational calculus for meromorphic functions. Nagoya Math. J., 26:31-38, 1966.
[35] B. Nagy. On an operational calculus for meromorphic functions. Acta Math. Acad. Sci. Hungarica, 33(3):379-390, 1979.
[36] E. H. Moore. On the reciprocal of the general algebraic matrix. Bull. Am. Math. Soc., 26, 1920.
[37] R. Penrose. A generalized inverse for matrices. Math. Proc. Cambridge Phil. Soc., 51, 1955.
[38] A. Ben-Israel and T. N. E. Greville. Generalized Inverses: Theory and Applications. CMS Books in Mathematics. Springer, New York, New York, 2003.
[39] J. J. Koliha. A generalized Drazin inverse. Glasgow Mathematical Journal, 38(3):367-381, 1996.
[40] J. J. Koliha and T. D. Tran. The Drazin inverse for closed linear operators and the asymptotic convergence of C_0-semigroups. J. Operator Th., pages 323-336, 2001.
[41] U. G. Rothblum. A representation of the Drazin inverse and characterizations of the index. SIAM J. App. Math., 31(4):646-648, 1976.
[42] J. G. Kemeny and J. L. Snell. Finite Markov Chains, volume 356. D. Van Nostrand, New York, New York, 1960.
[43] N. Dunford and J. T. Schwartz. Linear Operators. Interscience Publishers, New York, 1967.
[44] T. Bermúdez. Meromorphic functional calculus and local spectral theory. Rocky Mountain J. Math., 29(2):437-447, 1999.
[45] J. J. Sakurai and J. J. Napolitano. Modern Quantum Mechanics. Addison-Wesley, San Francisco, California, 2011.
[46] S. J. Axler. Linear Algebra Done Right, volume 2. Springer, New York, New York, 1997.
[47] M. Li and P. M. B. Vitanyi. An Introduction to Kolmogorov Complexity and its Applications. Springer-Verlag, New York, 1993.
[48] J. P. Crutchfield. Between order and chaos. Nature Physics, 8(January):17-24, 2012.
[49] J. P. Crutchfield, C. J. Ellison, and J. R. Mahoney. Time's barbed arrow: Irreversibility, crypticity, and stored information. Phys. Rev. Lett., 103(9):094101, 2009.
[50] C. J. Ellison, J. R. Mahoney, and J. P. Crutchfield. Prediction, retrodiction, and the amount of information stored in the present. J. Stat. Phys., 136(6):1005-1034, 2009.
[51] V. S. Barbu and N. Limnios. Semi-Markov Chains and Hidden Semi-Markov Models toward Applications: Their Use in Reliability and DNA Analysis, volume 191. Springer, New York, 2008.
[52] W. L. Smith. Renewal theory and its ramifications. J. Roy. Stat. Soc. B, 20(2):243-302, 1958.
[53] W. Gerstner and W. Kistler. of spike trains. In Spiking Neuron Models. Cambridge University Press, Cambridge, United Kingdom, 2002.
[54] F. Beichelt. Stochastic Processes in Science, Engineering and Finance. Chapman and Hall, New York, 2006.
[55] S. Marzen and J. P. Crutchfield. Informational and causal architecture of discrete-time renewal processes. Entropy, 17(7):4891-4917, 2015.
[56] S. Marzen and J. P. Crutchfield. Informational and causal architecture of continuous-time renewal processes. J. Stat. Phys., 168(1):109-127, 2017.
[57] S. Marzen, M. R. DeWeese, and J. P. Crutchfield. Time resolution dependence of information measures for spiking neurons: Scaling and universality. Front. Comput. Neurosci., 9:109, 2015.
[58] S. Marzen and J. P. Crutchfield. Statistical signatures of structural organization: The case of long memory in renewal processes. Phys. Lett. A, 380(17):1517-1525, 2016.

[59] S. Marzen and J. P. Crutchfield. Structure and randomness of continuous-time discrete-event processes. J. Stat. Phys., 169(2):303-315, 2016.
[60] M. S. Green. Markoff random processes and the statistical mechanics of time-dependent phenomena. II. Irreversible processes in fluids. J. Chem. Phys., 22(3):398-413, 1954.
[61] R. Zwanzig. Time-correlation functions and transport coefficients in statistical mechanics. Ann. Rev. Phys. Chemistry, 16(1):67-102, 1965.
[62] D. Mandal and C. Jarzynski. Analysis of slow transitions between nonequilibrium steady states. arXiv:1507.06269.
[63] S. Lahiri, J. Sohl-Dickstein, and S. Ganguli. A universal tradeoff between power, precision and speed in physical communication. arXiv:1603.07758.
[64] P. Löwdin. On operators, superoperators, Hamiltonians, and Liouvillians. Intl. J. Quant. Chem., 22(S16):485-560, 1982.
[65] G. Lindblad. On the generators of quantum dynamical semigroups. Comm. Math. Physics, 48(2):119-130, 1976.
[66] S. M. Barnett and S. Stenholm. Spectral decomposition of the Lindblad operator. J. Mod. Optics, 47(14-15):2869-2882, 2000.
[67] T. Petrosky and I. Prigogine. The Liouville space extension of quantum mechanics. Adv. Chem. Phys., 99:1-120, 1997.
[68] Y. Oono and M. Paniconi. Steady state thermodynamics. Prog. Theo. Phys. Supp., 130:29-44, 1998.
[69] P. Stoica and R. L. Moses. Spectral Analysis of Signals. Pearson Prentice Hall, Upper Saddle River, New Jersey, 2005.
[70] H. Stark, W. R. Bennett, and M. Arm. Design considerations in power spectra measurements by diffraction of coherent light. Appl. Opt., 8(11):2165-2172, Nov 1969.
[71] R. G. James, J. R. Mahoney, C. J. Ellison, and J. P. Crutchfield. Many roads to synchrony: Natural time scales and their algorithms. Phys. Rev. E, 89:042135, 2014.
[72] E. Narevicius, P. Serra, and N. Moiseyev. Critical phenomena associated with self-orthogonality in non-Hermitian quantum mechanics. EPL (Europhys. Lett.), 62(6):789, 2003.
[73] A. V. Sokolov, A. A. Andrianov, and F. Cannata. Non-Hermitian quantum mechanics of non-diagonalizable Hamiltonians: Puzzles with self-orthogonal states. J. Physics A, 39(32):10207, 2006.
[74] A. Mostafazadeh. Spectral singularities of complex scattering potentials and infinite reflection and transmission coefficients at real energies. Phys. Rev. Lett., 102:220402, 2009.
[75] J. D. Farmer, J. P. Crutchfield, H. Froehling, N. H. Packard, and R. S. Shaw. Power spectra and mixing properties of strange attractors. Ann. New York Acad. Sci., 357:453, 1980.
[76] E. N. Lorenz. Deterministic nonperiodic flow. J. Atmos. Sci., 20(2):130-141, 1963.
[77] D. Ruelle and F. Takens. On the nature of turbulence. Comm. Math. Physics, 20(3):167-192, 1971.
[78] M. C. Mackey. Time's Arrow: The Origins of Thermodynamic Behavior. Springer, New York, 1992.
[79] R. Bracewell. The Fourier Transform and Its Applications. McGraw-Hill, New York, third edition, 1999.
[80] B. Gardas, S. Deffner, and A. Saxena. Non-Hermitian quantum thermodynamics. Sci. Reports, 6:23408, 2016.
[81] N. Berkovits and E. Witten. Conformal supergravity in twistor-string theory. J. High Energy Physics, 2004(08):009, 2004.
[82] M. Newman. Networks: An Introduction. Oxford University Press, Oxford, United Kingdom, 2010.
[83] T. Nishikawa and A. E. Motter. Synchronization is optimal in nondiagonalizable networks. Phys. Rev. E, 73:065106, Jun 2006.
[84] F. R. K. Chung. Spectral Graph Theory, volume 92. American Mathematical Soc., Providence, Rhode Island, 1997.
[85] L. N. Trefethen. Pseudospectra of linear operators. SIAM Review, 39(3):383-406, 1997.
[86] L. N. Trefethen and M. Embree. Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators. Princeton University Press, Princeton, New Jersey, 2005.
[87] J. P. Crutchfield and K. Young. Computation at the onset of chaos. In W. Zurek, editor, Entropy, Complexity, and the Physics of Information, volume VIII of SFI Studies in the Sciences of Complexity, pages 223-269, Reading, Massachusetts, 1990. Addison-Wesley.
[88] N. Travers and J. P. Crutchfield. Infinite excess entropy processes with countable-state generators. Entropy, 16:1396-1413, 2014.
[89] J. P. Crutchfield and S. E. Marzen. Signatures of infinity: Nonergodicity and resource scaling in prediction, complexity, and learning. Phys. Rev. E, 91:050106, 2015.