IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 66, NO. 17, SEPTEMBER 1, 2018, p. 4657

Identification of Cascade Dynamic Nonlinear Systems: A Bargaining-Game-Theory-Based Approach

Zhenlong Xiao, Member, IEEE, Shengbo Shan, and Li Cheng

Abstract—Cascade dynamic nonlinear systems can describe a wide class of engineering problems, but little effort has been devoted to the identification of such systems so far. One of the difficulties comes from their non-convex characteristic. In this paper, the identification of a general cascade dynamic nonlinear system is rearranged and transformed into a convex problem involving a double-input single-output nonlinear system. In order to limit the estimate error at the frequencies of interest and to overcome the singularity problem incurred in the least-square-based methods, the identification problem is thereafter decomposed into a multi-objective optimization problem, in which the objective functions are defined in terms of the spectra of the unbiased error function at the frequencies of interest and are expressed as a first-order polynomial of the model parameters to be identified. The coefficients of the first-order polynomial are derived in an explicit expression involving the system input and the measured noisy output. To improve the convergence performance of the multi-objective optimization problem, the bargaining game theory is used to model the interactions and the competitions among multiple objectives defined at the frequencies of interest. Using the game-theory-based approach, both the global information and the local information are taken into account in the optimization, which leads to an obvious improvement of the convergence performance. Numerical studies demonstrate that the proposed bargaining-game-theory-based algorithm is effective and efficient for the multi-objective optimization problem, and hence for the identification of the cascade dynamic nonlinear systems.

Index Terms—Cascade dynamic nonlinear systems, game theory, system identification, frequency domain.

Manuscript received January 17, 2018; revised May 21, 2018 and June 14, 2018; accepted July 2, 2018. Date of publication July 20, 2018; date of current version August 2, 2018. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Dennis Wei. This work was supported in part by the Xiamen Engineering Technology Center for Intelligent Maintenance of Infrastructures (No. TCIMI201814) and in part by the Research Grants Council of HKSAR through the project PolyU 152070/16E. (Corresponding author: Zhenlong Xiao.)

Z. Xiao is with the Fujian Key Laboratory of Sensing and Computing for Smart City, School of Information Science and Engineering and the Xiamen Engineering Technology Center for Intelligent Maintenance of Infrastructures, Xiamen University, Fujian 361005, China, and also with the Department of Mechanical Engineering, Hong Kong Polytechnic University, Hong Kong (e-mail: [email protected]).

S. Shan and L. Cheng are with the Department of Mechanical Engineering, Hong Kong Polytechnic University, Hong Kong (e-mail: shengbo.[email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSP.2018.2858212

1053-587X © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

The signal processing for nonlinear feature extraction in nonlinear systems attracts increasing attention due to its significant application value in many engineering problems. For example, nonlinear characteristics (e.g., the second-order or the third-order harmonics) of Lamb waves can be used for structural health monitoring applications [1], [2]. In such systems, in addition to the damage-induced nonlinear components, nonlinearities can also be generated from other constituents of the system (which can be regarded as subsystems) such as transducers, bonding layers, and measurement devices. Although some of the nonlinear components, e.g., power-amplifier and transducer nonlinearities, can be handled using pre-distortion techniques [3], [4], other components arising from the physical system itself are difficult to handle, for example, the bonding-layer nonlinearity and the damage-induced nonlinearity. Moreover, due to the existence of the system damping, the corresponding subsystems behave like dynamic nonlinear systems, in which the past inputs and/or the past outputs intervene in the nonlinear systems. Such a system can therefore be modeled as a general cascade dynamic nonlinear system [2], which is the model under consideration in this paper.

Pioneering works in nonlinear system identification have been largely devoted to the Wiener model (linear-nonlinear system) [5]–[10] and the Hammerstein model (nonlinear-linear model) [11], [12]. The Hammerstein-Wiener model can deal with a cascade nonlinear system involving nonlinear-linear-nonlinear combinations, in which both nonlinear subsystems are static nonlinear. In this paper, we investigate the cascade of two dynamic nonlinear systems. Existing techniques based on the Hammerstein-Wiener model, such as the methods in [13], [14], the maximum-likelihood method [15], the over-parametrization method [16], [17], the biconvex based method [18], and the frequency domain method [19], cannot be directly applied to the identification of the cascade dynamic nonlinear systems.

Despite their wide application value in representing a large class of physical problems, cascade dynamic nonlinear systems have received little attention as far as the identification is concerned. One of the fundamental problems is that the identification problem is inherently non-convex. As an example, the input backlash nonlinearity and the output static nonlinearity were first identified together and then separated via the Kozen-Landau decomposition algorithm [20], [21]. In [22], [23], an odd nonlinearity was assumed for the input system, and in [24]–[26] an iterative gradient based approach was proposed. The non-convex characteristic has been shown to lead to problems such as local minima.

Note that some of the nonlinear features need to be described in the frequency domain, exemplified by the second-order harmonics and the third-order harmonics for power amplifiers [11] and structural health monitoring problems [2]. Therefore, it is desirable that the identification of the cascade dynamic nonlinear systems be conducted in the frequency domain such that the nonlinear features can be directly captured with the estimate error being guaranteed at the frequencies of interest. In this way, the nonlinear cost function (error function) can be defined and optimized directly in the frequency domain, which is more straightforward and more accurate for studying the harmonics based features when compared with the indirect time-domain approach. The latter requires the identification in the time domain first and thereafter uses the fast Fourier transform. The whole process yields only a global estimate error; the error at the frequencies of interest, however, cannot be guaranteed. To tackle this problem, some previous attempts were made to conduct the frequency-domain based identification of cascade nonlinear systems. For example, the best linear approximation (BLA) method was proposed for the identification of the Wiener model [27], the Hammerstein model [28], [29], and the Wiener-Hammerstein model [30] in the frequency domain. BLA, however, fails in the identification of the cascade dynamic nonlinear systems because the Bussgang theorem does not hold for dynamic nonlinearities [31]. The Volterra series based frequency-domain method is a powerful tool for dynamic nonlinear systems [32]–[35] (e.g., nonlinear ordinary differential equations (NODE) and the nonlinear autoregressive with exogenous input (NARX) model) under the convergence conditions [33], [36]–[38]. The generalized frequency response function [39], [40] and the output spectrum function were studied [41]. Thereafter, the characteristic relationship between the output spectrum and the parameters of interest was investigated [42]–[45]. However, very few efforts have been devoted to the identification of cascade nonlinear systems using the Volterra series based frequency-domain method.

The objective of the present work is to investigate the identification of a general cascade dynamic nonlinear system in the frequency domain. Firstly, to tackle the non-convex characteristic, a cascade dynamic nonlinear system is rearranged, and the estimate error function of the identification problem is cast into a double-input single-output (DISO) nonlinear function. In this way, the identification problem is transformed into a convex problem. Secondly, in order to guarantee the estimate error at each frequency of interest, the identification problem is decomposed into a set of sub-problems that are defined at the same frequency. The identification problem then becomes a multi-objective optimization one. Although the convex characteristic holds in the decomposition of the multi-objective problem, an effective and efficient algorithm is not straightforward. Despite the success of classical methods such as the Newton method, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm and the Levenberg-Marquardt (LM) algorithm in dealing with single-objective optimization problems, for the multi-objective case the 'jump' phenomenon may exist between two consecutive iterations. It becomes more pronounced when the number of sub-problems is large, and therefore deteriorates the convergence performance of the identification. Although evolution algorithms, for example, the genetic algorithm (GA), are in principle applicable to this multi-objective optimization problem, potential limitations hamper their practical implementation. For example, the adjustment of the parameters (e.g., the commute rate for GA) may depend on experience, and the local search performance needs to be improved because the feedback information of the current candidate solution is not fully employed for generating the new candidate solutions. To tackle these problems, the present work makes use of the bargaining game theory to model the interactions and competitions among various sub-problems in the multi-objective optimization. The sub-problems defined at the frequencies of interest can determine their own strategies according to the preceding competition result. The bargaining game theory based algorithm (BGTA) employs both the global and the local information in the multi-objective optimization process, and would therefore greatly improve the convergence performance.

The contributions of this paper are summarized as follows:

1) To transform the non-convex problem into a convex problem, the cascade dynamic nonlinear systems are rearranged, allowing the identification problem to be modeled as a DISO Volterra system. Note that both of the two sub-systems are dynamic nonlinear.

2) To guarantee the estimate error at the frequencies of interest, the identification problem is decomposed into a multi-objective optimization problem. The unbiased error function is given, and the spectrum of the unbiased error function is demonstrated to be a first-order polynomial of the model parameters of the cascade dynamic nonlinear systems to be identified. Such a characteristic relationship greatly facilitates the multi-objective optimization in the frequency domain.

3) To develop an effective and efficient algorithm for the multi-objective optimization problem, bargaining game theory is employed to model the interactions and competitions among various sub-problems, which would greatly improve the convergence performance of the multi-objective optimization problem.

The remaining part of the paper is arranged as follows. In Section II, the model under consideration and the identification problem are formulated. In Section III, the bargaining game theory based identification of the cascade dynamic nonlinear systems is proposed. Numerical results and discussions are presented in Section IV. Finally, conclusions are presented in Section V.

II. MODEL UNDER THE CONSIDERATION AND PROBLEM FORMULATION

A. Model Under the Consideration

The Volterra model has been widely reported in the literature to characterize a dynamic nonlinear system. Here we study a cascade dynamic nonlinear system as the cascade of two Volterra

models as shown in Fig. 1(a), in which $f(\mathbf{u})$ and $h(\mathbf{x})$ are given as polynomial nonlinearities

$$x(k) = f(\mathbf{u}) = \sum_{i=1}^{I_f}\ \sum_{0\le n_1\le\cdots\le n_i\le N_f} \bar c_i(n_1,\ldots,n_i)\, u(k-n_1)\cdots u(k-n_i) \tag{1}$$

and

$$y(k) = h(\mathbf{x}) = \sum_{i=1}^{I_h}\ \sum_{0\le n_1\le\cdots\le n_i\le N_h} \tilde c_i(n_1,\ldots,n_i)\, x(k-n_1)\cdots x(k-n_i), \tag{2}$$

where $x(k)$ is an unmeasurable intermediate variable, and $u(k)$ and $y(k)$ are the system input and output, respectively. $n_1,\ldots,n_i$ are the difference orders with the maximum orders $N_f$ and $N_h$, and $i$ is the nonlinear order with the maximum orders $I_f$ and $I_h$, respectively. $\bar c_i(\cdot)$ and $\tilde c_i(\cdot)$ are the nonlinear parameters of models (1) and (2). Denote $\mathbf{x} = [x(k),\ldots,x(k-N)]$ and $\mathbf{y} = [y(k),\ldots,y(k-N)]$.

Fig. 1. Cascade dynamic nonlinear systems. (a) is a straightforward model of a cascade dynamic nonlinear system, (b) is an intermediate transformation of model (a), and (c) is the model under the consideration, which is a rearranged model of (a).

By substituting $x(k)$ in (1) into sub-system (2), it is obvious that the identification of the cascade dynamic nonlinear system shown in Fig. 1(a) turns into the optimization of a polynomial up to $(I_h+1)$-th order (i.e., $\tilde c_i(\cdot)\,\bar c_i^{\,I_h}(\cdot)$), which is a non-convex problem. Therefore, the parameters $\bar c_i(\cdot)$ and $\tilde c_i(\cdot)$ cannot be uniquely determined. To overcome the non-convex characteristic, we split sub-system (2) into two parts as

$$h(\mathbf{x}) = x(k) + \tilde h(\mathbf{x}). \tag{3}$$

Given an $\tilde h(\mathbf{x})$, we can construct a nonlinear autoregressive exogenous (NARX) model

$$\begin{aligned}
\tilde h(\mathbf{x}) + \tilde g(\mathbf{x},\mathbf{y})
&= \sum_{\tilde p=1}^{I_g}\ \sum_{0\le n_1\le\cdots\le n_{\tilde p}\le N} \tilde d_{\tilde p,0}(n_1,\ldots,n_{\tilde p}) \prod_{s=1}^{\tilde p} x(k-n_s) \\
&\quad + \sum_{\substack{i=1,\ q\ge 1\\ \tilde p+q=i}}^{I_g}\ \sum_{\substack{0\le n_1\le\cdots\le n_{\tilde p}\le N\\ 0\le n_{\tilde p+1}\le\cdots\le n_i\le N}} \tilde d_{\tilde p,q}(n_1,\ldots,n_{\tilde p+q}) \prod_{s=1}^{\tilde p} x(k-n_s) \prod_{s=\tilde p+1}^{\tilde p+q} y(k-n_s) \\
&= \sum_{\substack{i=1\\ \tilde p+q=i}}^{I_g}\ \sum_{\substack{0\le n_1\le\cdots\le n_{\tilde p}\le N\\ 0\le n_{\tilde p+1}\le\cdots\le n_i\le N}} \tilde d_{\tilde p,q}(n_1,\ldots,n_{\tilde p+q}) \prod_{s=1}^{\tilde p} x(k-n_s) \prod_{s=\tilde p+1}^{\tilde p+q} y(k-n_s)
\end{aligned} \tag{4}$$

such that

$$\left\| \tilde h(\mathbf{x}) + \tilde g(\mathbf{x},\mathbf{y}) \right\| = \left\| e(\mathbf{x},\mathbf{y}) \right\| \to 0 \tag{5}$$

holds under a certain norm $\|\cdot\|$ with appropriate nonlinear order $I_g$ and difference order $N$. $\tilde d_{\tilde p,q}(n_1,\ldots,n_{\tilde p+q})$ is the nonlinear parameter of the corresponding nonlinear term $\prod_{s=1}^{\tilde p} x(k-n_s)\prod_{s=\tilde p+1}^{\tilde p+q} y(k-n_s)$, and $\tilde p$ and $q$ are the nonlinear orders of the input $x$ and output $y$, respectively. Denote $\prod_{s=i+1}^{i}(\cdot) = 1$, and

$$\tilde d_{\tilde p,0}(n_1,\ldots,n_{\tilde p}) =
\begin{cases}
\tilde c_1(0) - 1, & \tilde p = 1,\ n_1 = 0,\\
\tilde c_i(n_1,\ldots,n_i), & 1\le \tilde p = i \le I_h,\ 0\le n_1,\ldots,n_i\le N_h,\ \text{except } \tilde p = 1,\ n_1 = 0,\\
0, & \text{others.}
\end{cases} \tag{6}$$

From (4) and (5), $\tilde h(\mathbf{x})$ can be approximated by $-\tilde g(\mathbf{x},\mathbf{y})$, which can be denoted as $\tilde h(\mathbf{x}) = -\tilde g(\mathbf{x},\mathbf{y})$. Therefore, $h(\mathbf{x}) = x(k) - \tilde g(\mathbf{x},\mathbf{y})$ holds. Fig. 1(a) can then be converted to Fig. 1(b). However, the unmeasurable intermediate variable $x$ still exists in the model. Substituting $x$ into $\tilde g(\mathbf{x},\mathbf{y})$ gives

$$\tilde g(\mathbf{u},\mathbf{y}) = \tilde g(\mathbf{x},\mathbf{y}) = \sum_{\substack{i=1,\ q\ge 1\\ p+q=i}}^{I}\ \sum_{\substack{0\le n_1\le\cdots\le n_p\le N+N_f\\ 0\le n_{p+1}\le\cdots\le n_i\le N}} \bar d_{p,q}(n_1,\ldots,n_{p+q}) \prod_{s=1}^{p} u(k-n_s) \prod_{s=p+1}^{p+q} y(k-n_s), \tag{7}$$

where

$$\bar d_{p,q}(n_1,\ldots,n_{p+q}) = \sum_{\tilde p=1}^{p} \tilde d_{\tilde p,q}(n_1,\ldots,n_{\tilde p+q}) \times \sum_{i_1+\cdots+i_{\tilde p}=p}\ \prod_{s=1}^{\tilde p} \bar c_{i_s}(n_{s1},\ldots,n_{s i_s}). \tag{8}$$

Denote the difference orders of parameter $\tilde d_{\tilde p,q}(\cdot)$ in (8) as $\tilde n_1 = n_1,\ldots,\tilde n_{\tilde p} = n_{\tilde p}$; then $n_1,\ldots,n_p$ in $\bar d_{p,q}(\cdot)$ is the ascending sorting of $\{n_{11}+\tilde n_1, n_{12}+\tilde n_1,\ldots,n_{1 i_1}+\tilde n_1,\ldots,n_{\tilde p 1}+\tilde n_{\tilde p}, n_{\tilde p 2}+\tilde n_{\tilde p},\ldots,n_{\tilde p i_{\tilde p}}+\tilde n_{\tilde p}\}$.

Substituting (1) and (7) into the model in Fig. 1(b), the following equation can be obtained:

$$\begin{aligned}
y(k) &= x(k) - \tilde g(\mathbf{x},\mathbf{y}) = x(k) - \tilde g(\mathbf{u},\mathbf{y}) \\
&= \sum_{i=1}^{I_f}\ \sum_{0\le n_1\le\cdots\le n_i\le N_f} \bar c_i(n_1,\ldots,n_i)\, u(k-n_1)\cdots u(k-n_i) \\
&\quad - \sum_{\substack{i=1,\ q\ge 1\\ p+q=i}}^{I}\ \sum_{\substack{0\le n_1\le\cdots\le n_p\le N+N_f\\ 0\le n_{p+1}\le\cdots\le n_i\le N}} \bar d_{p,q}(n_1,\ldots,n_{p+q}) \prod_{s=1}^{p} u(k-n_s) \prod_{s=p+1}^{p+q} y(k-n_s).
\end{aligned} \tag{9}$$

Moving $y(k)$ in (9) from the left-hand side to the right-hand side, the corresponding parameter is $-(\bar d_{0,1}(0)+1)$. Note that multiplying both sides of the equation by a non-zero constant makes no difference for identifying (9), which is the so-called scale deflection problem [18], [46], [47]. To avoid the scale deflection, we assume that the parameter of $y(k)$ in (9) is non-zero, i.e., $-(\bar d_{0,1}(0)+1) \ne 0$. Multiplying both sides of (9) by the constant $1/(\bar d_{0,1}(0)+1)$, the model in Fig. 1(c) can be obtained as

$$\begin{aligned}
f_s(\mathbf{x}) + g(\mathbf{u},\mathbf{y}) - y(k)
&= \sum_{i=1}^{I_f}\ \sum_{0\le n_1\le\cdots\le n_i\le N_f} c_i(n_1,\ldots,n_i)\, u(k-n_1)\cdots u(k-n_i) \\
&\quad + \sum_{\substack{i=1,\ q\ge 1\\ p+q=i}}^{I}\ \sum_{\substack{0\le n_1\le\cdots\le n_p\le N+N_f\\ 0\le n_{p+1}\le\cdots\le n_i\le N}} d_{p,q}(n_1,\ldots,n_{p+q}) \prod_{s=1}^{p} u(k-n_s) \prod_{s=p+1}^{p+q} y(k-n_s) = 0,
\end{aligned} \tag{10}$$

$$\tilde y(k) = y(k) + v(k), \tag{11}$$

where

$$c_i(n_1,\ldots,n_i) = \frac{\bar c_i(n_1,\ldots,n_i)}{\bar d_{0,1}(0)+1}, \tag{12}$$

$$d_{p,q}(n_1,\ldots,n_i) =
\begin{cases}
-\dfrac{\bar d_{p,q}(n_1,\ldots,n_i)}{\bar d_{0,1}(0)+1}, & q\ge 1,\\[2mm]
0, & q = 0,
\end{cases} \tag{13}$$

and obviously,

$$d_{0,1}(0) = -1. \tag{14}$$

$v(k)$ is Gaussian white noise with zero mean and variance $\sigma^2$. $y(k)$ and $\tilde y(k)$ are the outputs without noise and with noise, respectively. Obviously,

$$x_s(k) = f_s(\mathbf{x}) = \frac{x(k)}{\bar d_{0,1}(0)+1} \tag{15}$$

and

$$g(\mathbf{u},\mathbf{y}) = \frac{-\tilde g(\mathbf{x},\mathbf{y}) - y(k)}{\bar d_{0,1}(0)+1} + y(k). \tag{16}$$

Remark 1: Given the cascade dynamic nonlinear model shown in Fig. 1(a), there exists an equivalent model shown in Fig. 1(c) that can be described by (10)–(14), which is the model under consideration in this paper.

Remark 2: If the model in Fig. 1(a) is simply identified via a high-order Volterra model, the individual sub-systems in the cascade cannot be uniquely and accurately determined because the identification is a non-convex problem. With the transformation from Fig. 1(a) to Fig. 1(c), once the model (10)–(14) is identified, the first sub-system can be obtained as $f_s(\mathbf{u}) = \sum_{i=1}^{I_f}\sum_{0\le n_1\le\cdots\le n_i\le N_f} c_i(n_1,\ldots,n_i)\prod_{j=1}^{i} u(k-n_j)$, and the second sub-system can be modeled by (10), i.e., $y(k) = x_s(k) + g(\mathbf{u},\mathbf{y}) = x(k) - \tilde g(\mathbf{x},\mathbf{y}) = h(\mathbf{x})$.

Remark 3: If we only consider the identification of model (10)–(14) without the transformation from Fig. 1(a) to Fig. 1(c), the two nonlinear sub-systems, $f(\mathbf{u})$ and $h(\mathbf{x})$, cannot be uniquely estimated either, and only the general Volterra model (10) can be obtained.

Assumption 1: A multi-frequency excitation, i.e.,

$$u(k) = \sum_{q=1}^{Q} A_q \sin(\omega_q k + \chi_q) \tag{17}$$

is assumed, where $A_q$, $\omega_q$, $\chi_q$ are the amplitude, the frequency, and the phase of the $q$-th sinusoidal input, respectively. As assumed after (14), the noise satisfies $v(k)\sim N(0,\sigma^2)$.

B. Identification of Cascade Dynamic Nonlinear Systems

Given the model (10)–(14), the identification of the cascade dynamic nonlinear systems boils down to identifying the coefficients $c_i(\cdot)$ and $d_{p,q}(\cdot)$ from the given input $u(k)$ and the measurable output $\tilde y(k)$. The intermediate variable $x(k)$ is unmeasurable. The identification problem can be expressed in the time domain as

$$\begin{aligned}
\varepsilon(k) &= \sum_{i=1}^{I_f}\ \sum_{0\le n_1\le\cdots\le n_i\le N_f} c_i(n_1,\ldots,n_i)\, u(k-n_1)\cdots u(k-n_i) \\
&\quad + \sum_{\substack{i=1,\ q\ge 1\\ p+q=i}}^{I}\ \sum_{\substack{0\le n_1\le\cdots\le n_p\le N+N_f\\ 0\le n_{p+1}\le\cdots\le n_{p+q}\le N}} d_{p,q}(n_1,\ldots,n_{p+q}) \prod_{s=1}^{p} u(k-n_s) \prod_{s=p+1}^{p+q} \tilde y(k-n_s),
\end{aligned} \tag{18}$$

$$\arg\min_{\boldsymbol\theta} \|\boldsymbol\varepsilon\|, \tag{19}$$

where $\boldsymbol\theta = [c_1(0),\ldots,c_{I_f}(N_f,\ldots,N_f), d_{0,1}(0),\ldots,d_{0,I}(N,\ldots,N),\ldots,d_{I,0}(N+N_f,\ldots,N+N_f)]$ are the coefficients to be identified, $\boldsymbol\varepsilon = [\varepsilon(k),\ldots,\varepsilon(k-M)]$ with $M$ the number of observed errors, and $\|\cdot\|$ denotes a metric with a certain norm. $\tilde y$ is the output with noise $v$.

Assumption 2: In this paper, we assume that the maximum nonlinear orders $I$ and $I_f$, and the maximum difference orders $N$ and $N_f$, are known. The following work then focuses on the identification of the model parameters $\boldsymbol\theta$.

Remark 4: The identification of the cascade model presented in Fig. 1(a) is non-convex, and may be solved using evolution algorithms. However, the evolution algorithms may be time-consuming, and the identification cannot be guaranteed to achieve the global optimum. After rearranging the model to the one shown in Fig. 1(c) and considering the identification problem (18), (19) as a DISO system with $u$ and $\tilde y$ being the two new inputs and the error function $\varepsilon$ as the new output, the error function (18) is linear in the parameters, and therefore is a convex function. The rearrangement of the cascade dynamic nonlinear model from Fig. 1(a) to Fig. 1(c) therefore transforms the identification problem from a non-convex problem into a convex problem.

Remark 5: The identification problem (18), (19) may be solved with the least square method $\hat{\boldsymbol\theta} = (\Theta^T\Theta)^{-1}\Theta^T\boldsymbol\varepsilon$, which could potentially become singular, especially when the number of parameters to be identified is large. Even when a global optimum is obtained, if the nonlinear features are to be described in the frequency domain and the estimate error at the frequencies of interest needs to be quantified, the evolution algorithms and the least square method can only provide the total estimate error but cannot guarantee the errors at the frequencies of interest. These motivate the present work to deal with the identification problem (18), (19) in the frequency domain directly.

III. GAME THEORY BASED FREQUENCY DOMAIN IDENTIFICATION APPROACH

In this section, the identification problem (18), (19) is dealt with in the frequency domain. Firstly, the error function $\varepsilon(k)$ will be expressed as the output of a DISO nonlinear system, and its output spectrum will be given as an explicit polynomial function of the input spectra. Once the output spectrum of $\varepsilon(k)$ is obtained, the identification problem (18), (19) can be conducted in the frequency domain by optimizing the output spectra in the whole frequency range $\Omega$. In this case, we need to consider the output spectrum of the error $\varepsilon(k)$ at each frequency of interest, and the identification problem is then converted into a multi-objective optimization one. Note that the effective and efficient handling of a multi-objective optimization problem is not straightforward. To tackle this problem, the bargaining game theory is employed to model the interactions and competitions among the sub-problems defined at the frequencies of interest, leading to the proposed BGTA for the multi-objective optimization problem.

A. Output Spectrum of the Error Function ε(·)

It is well known that in a forward problem, i.e., from input $u(k)$ to intermediate variable $x(k)$ and then to the output $y(k)$ and noisy output $\tilde y(k)$, the identification is unbiased because the noise only exists in the output $\tilde y(k)$, and $E(\tilde y(k)) = y(k)$ holds. Unfortunately, in model (18), the noise is involved in the new input $\tilde y(k)$, and the identification becomes a biased estimation. To tackle the problem, the bias of model (18) is given first.

Theorem 1: The bias of model (18) is given as

$$\delta(k) = \sum_{\substack{i=2,\ q\ge 1\\ p+q=i}}^{I}\ \sum_{\substack{0\le n_1\le\cdots\le n_p\le N+N_f\\ 0\le n_{p+1}\le\cdots\le n_i\le N}} \left[ d_{p,q}(n_1,\ldots,n_{p+q}) \prod_{s=1}^{p} u(k-n_s) \times \sum_{j_0+j_1+\cdots+j_l=q} \left( \prod_{j=1}^{j_0} y(k-n_{i,j}) \times \prod_{\substack{s=1\\ j_s\ \text{even},\ j_s\ge 2}}^{l} \mu(j_s)\,\sigma^{j_s} \right) \right] \tag{20}$$

and

$$\prod_{j=1}^{j_0} y(k-n_{i,j}) = E\!\left( \prod_{j=1}^{j_0} \tilde y(k-n_{i,j}) \right) - \sum_{r_0+r_1+\cdots+r_l=j_0} \left( \prod_{r=1}^{r_0} y(k-n_{j,r}) \times \prod_{\substack{s=1\\ r_s\ \text{even},\ r_s\ge 2}}^{l} \mu(r_s)\,\sigma^{r_s} \right), \tag{21}$$

where $\sigma$ is the standard deviation of the noise $v$, and $\{\{n_{i,1},\ldots,n_{i,j_0}\}, \{n_{i,j_0+1},\ldots,n_{i,j_0+j_1}\},\ldots,\{n_{i,j_0+\cdots+j_{l-1}+1},\ldots,n_{i,j_0+\cdots+j_l}\}\}$ is a partition of the set $\{n_{p+1},\ldots,n_{p+q}\}$ where $n_{i,j_0+1} = \cdots = n_{i,j_0+j_1}$, ..., and $n_{i,j_0+\cdots+j_{l-1}+1} = \cdots = n_{i,j_0+\cdots+j_{l-1}+j_l}$. $j_1,\ldots,j_l$ are even numbers no smaller than 2, and $j_0+\cdots+j_l = q$. Denote $\{j_0,\ldots,j_l\}$ a partition of the set $\{n_1,\ldots,n_i\}$; the summation $\sum_{j_0+\cdots+j_l=q}(\cdot)$ stands for the sum of $(\cdot)$ over all partitions of the set $\{n_{p+1},\ldots,n_{p+q}\}$, $\prod_{j=1}^{0}(\cdot) = 1$, and

$$\mu(j_s) = 1\times 3\times 5\times\cdots\times(j_s-1). \tag{22}$$
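The factor $\mu(j_s)$ in (22) is the product of the odd integers below $j_s$. For zero-mean Gaussian noise with standard deviation $\sigma$, the even moments equal $\mu(s)\sigma^s$ and the odd moments vanish, which is why only even block sizes contribute to (20) and (21). The following Python sketch checks this for the special case of a single repeated delay, where the bias of a fourth-order term is $E[(y+v)^4] - y^4 = 6y^2\sigma^2 + 3\sigma^4$. The helper names `mu`, `gaussian_moment`, and `power_bias` are illustrative assumptions and do not come from the paper.

```python
from math import comb


def mu(s: int) -> int:
    """Product 1*3*5*...*(s-1) for even s >= 2, following (22)."""
    if s < 2 or s % 2:
        raise ValueError("mu is defined here for even s >= 2")
    out = 1
    for k in range(1, s, 2):
        out *= k
    return out


def gaussian_moment(s: int, sigma: float) -> float:
    """E[v^s] for v ~ N(0, sigma^2): zero for odd s, mu(s)*sigma^s for even s."""
    if s == 0:
        return 1.0
    return 0.0 if s % 2 else mu(s) * sigma ** s


def power_bias(y: float, q: int, sigma: float) -> float:
    """E[(y+v)^q] - y^q via the binomial expansion over the noise moments.

    Illustrative special case only: all q delays coincide.
    """
    return sum(comb(q, j) * y ** (q - j) * gaussian_moment(j, sigma)
               for j in range(1, q + 1))


# Hypothetical values chosen for the check.
y, sigma = 0.7, 0.2
bias4 = power_bias(y, 4, sigma)
closed_form = 6 * y**2 * sigma**2 + 3 * sigma**4
```

Here `bias4` and `closed_form` agree to rounding error, since only the $j=2$ and $j=4$ terms of the expansion survive.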

It can be observed that (20) and (21) form a recursive calculation because a high-order product $\prod_{j=1}^{j_0}(\cdot)$ in (20) can be recursively calculated using the low-order product $\prod_{r=1}^{r_0}(\cdot)$ in (21), i.e., $j_0 > r_0$. The recursion terminates with the following condition:

$$\begin{aligned}
y(k-n_{2,1})\,y(k-n_{2,2}) &= E\big(\tilde y(k-n_{2,1})\,\tilde y(k-n_{2,2})\big) - E\big(v(k-n_{2,1})\,v(k-n_{2,2})\big) \\
&= \begin{cases}
E\big(\tilde y(k-n_{2,1})\,\tilde y(k-n_{2,2})\big), & n_{2,1}\ne n_{2,2},\\
E\big(\tilde y(k-n_{2,1})\,\tilde y(k-n_{2,2})\big) - \sigma^2, & n_{2,1} = n_{2,2}.
\end{cases}
\end{aligned} \tag{23}$$

Proof: For model (18), the following equation holds,

$$\begin{aligned}
E(\varepsilon(k)) &= \sum_{i=1}^{I_f}\ \sum_{0\le n_1\le\cdots\le n_i\le N_f} c_i(n_1,\ldots,n_i)\, u(k-n_1)\cdots u(k-n_i) \\
&\quad + \sum_{\substack{i=1,\ q\ge 1\\ p+q=i}}^{I}\ \sum_{\substack{0\le n_1\le\cdots\le n_p\le N+N_f\\ 0\le n_{p+1}\le\cdots\le n_i\le N}} d_{p,q}(n_1,\ldots,n_{p+q}) \prod_{s=1}^{p} u(k-n_s)\, E\!\left( \prod_{s=p+1}^{p+q} \tilde y(k-n_s) \right),
\end{aligned} \tag{24}$$

where $E(\cdot)$ denotes the expectation of $(\cdot)$, and

$$\begin{aligned}
E\big(\tilde y(k-n_{p+1})\times\cdots\times\tilde y(k-n_{p+q})\big)
&= E\!\left( \prod_{s=p+1}^{p+q} \big[ y(k-n_s) + v(k-n_s) \big] \right) \\
&= \prod_{s=p+1}^{p+q} y(k-n_s) + E\!\left( \prod_{s=p+1}^{p+q} v(k-n_s) \right) \\
&\quad + \sum_{i_1+i_2=q} \left( \prod_{s_1=1}^{i_1} y(k-n_{i,s_1}) \times E\!\left( \prod_{s_2=1}^{i_2} v(k-n_{i,i_1+s_2}) \right) \right).
\end{aligned} \tag{25}$$

The problem then reduces to the high-order moments of the noise $v(k)$, which can be given as [48]:

$$E\big(v^s(k)\big) = \begin{cases} 0, & s \text{ is odd},\\ \mu(s)\,\sigma^s, & s \text{ is even}, \end{cases} \tag{26}$$

where $\mu(s) = 1\times 3\times\cdots\times(s-1)$. Substituting (26) into (25), and noting that (10) holds, the bias $\delta(k) = E(\varepsilon(k))$ can be obtained. Let $q = j_0$ and $p = 0$, and substitute these two values into (25); then (21) is straightforward. This completes the proof.

Example 1: Given $\tilde g = d_4(1,1,1,1)\,\tilde y(k-1)^4$. All the partitions of the set $\{n_1,n_2,n_3,n_4\}$ can be given as $\{\{n_1,n_2,n_3,n_4\}\}$, $\{\{n_1,n_2\},\{n_3,n_4\}\}$, $\{\{n_1,n_3\},\{n_2,n_4\}\}$, $\{\{n_1,n_4\},\{n_2,n_3\}\}$, $\{\{n_2,n_3\},\{n_1,n_4\}\}$, $\{\{n_2,n_4\},\{n_1,n_3\}\}$, and $\{\{n_3,n_4\},\{n_1,n_2\}\}$. Then the bias can be obtained as $\delta(k) = \mu(4)\sigma^4 + 6\times y(k-1)^2\mu(2)\sigma^2 = 3\sigma^4 + 6y(k-1)^2\sigma^2$. Note that $y(k-1)^2 = E(\tilde y(k-1)^2) - \sigma^2$ holds. Therefore, $\delta(k) = -3\sigma^4 + 6E(\tilde y(k-1)^2)\sigma^2$.

Theorem 2: The spectrum of the expectation of the unbiased error function $E(\varepsilon(k) - \delta(k))$ can be given as a first-order polynomial with respect to the parameters to be identified as

$$\begin{aligned}
\Psi(\omega) &= F\big[ E(\varepsilon(k)) - \delta(k) \big] \\
&= \sum_{i=1}^{I_f}\ \sum_{0\le n_1\le\cdots\le n_i\le N_f} c_i(n_1,\ldots,n_i)\, \phi_{n_1,\ldots,n_i}(\omega) \\
&\quad + \sum_{\substack{i=1,\ q\ge 1\\ p+q=i}}^{I}\ \sum_{\substack{0\le n_1\le\cdots\le n_p\le N+N_f\\ 0\le n_{p+1}\le\cdots\le n_i\le N}} d_{p,q}(n_1,\ldots,n_{p+q}) \times \varphi_{n_1,\ldots,n_{p+q}}(\omega),
\end{aligned} \tag{27}$$

where $c_i(\cdot)$ and $d_{p,q}(\cdot)$ are the model parameters to be identified, $F(\cdot)$ denotes the Fourier transform, and the coefficients of the first-order polynomial can be obtained as

$$\phi_{n_1,\ldots,n_i}(\omega) = \sum_{\omega_1+\cdots+\omega_i=\omega}\ \prod_{s=1}^{i} \left( e^{-j n_s\omega_s}\, U(\omega_s) \right), \tag{28}$$

$$\begin{aligned}
\varphi_{n_1,\ldots,n_i}(\omega) &= \sum_{\substack{\omega_1+\cdots+\omega_{p+q}=\omega\\ p+q=i}} \prod_{s=1}^{p} \left( e^{-j n_s\omega_s}\, U(\omega_s) \right) \times E\!\left( \prod_{s=p+1}^{p+q} e^{-j n_s\omega_s}\, \tilde Y(\omega_s) \right) \\
&\quad - \sum_{\omega_1+\cdots+\omega_p+\omega_{i,1}+\cdots+\omega_{i,j_0}=\omega} \left[ \prod_{s=1}^{p} \left( e^{-j n_s\omega_s}\, U(\omega_s) \right) \times \sum_{j_0+j_1+\cdots+j_l=q} \left( G_{j_0}(\omega_{i,1},\ldots,\omega_{i,j_0}) \prod_{\substack{s=1\\ j_s\ \text{even},\ j_s\ge 2}}^{l} \mu(j_s)\,\sigma^{j_s} \right) \right],
\end{aligned} \tag{29}$$

where

$$G_{j_0}(\omega_{i,1},\ldots,\omega_{i,j_0}) = E\!\left( \prod_{s=1}^{j_0} e^{-j n_{i,s}\omega_{i,s}}\, \tilde Y(\omega_{i,s}) \right) - \sum_{r_0+r_1+\cdots+r_l=j_0} \left( G_{r_0}(\omega_{j_0,1},\ldots,\omega_{j_0,r_0}) \times \prod_{\substack{s=1\\ r_s\ \text{even},\ r_s\ge 2}}^{l} \mu(r_s)\,\sigma^{r_s} \right), \tag{30}$$

$$G_2(\omega_{2,1},\omega_{2,2}) =
\begin{cases}
E\!\left( \prod_{s=1}^{2} e^{-j n_{2,s}\omega_{2,s}}\, \tilde Y(\omega_{2,s}) \right), & n_{2,1}\ne n_{2,2},\\[2mm]
E\!\left( \prod_{s=1}^{2} e^{-j n_{2,s}\omega_{2,s}}\, \tilde Y(\omega_{2,s}) \right) - \sigma^2, & n_{2,1} = n_{2,2},
\end{cases} \tag{31}$$

and $U(\omega)$ and $\tilde Y(\omega)$ are the spectra of $u(k)$ and the noisy output $\tilde y(k)$, respectively. $E(\cdot)$ denotes the expectation of $(\cdot)$.

Proof: Note that the unbiased error function can be given as

$$\begin{aligned}
E(\varepsilon(k)) - \delta(k) &= \sum_{i=1}^{I_f}\ \sum_{0\le n_1\le\cdots\le n_i\le N_f} c_i(n_1,\ldots,n_i)\, u(k-n_1)\cdots u(k-n_i) \\
&\quad + \sum_{\substack{i=1,\ q\ge 1\\ p+q=i}}^{I}\ \sum_{\substack{0\le n_1\le\cdots\le n_p\le N+N_f\\ 0\le n_{p+1}\le\cdots\le n_i\le N}} d_{p,q}(n_1,\ldots,n_{p+q}) \prod_{s=1}^{p} u(k-n_s)\, E\!\left( \prod_{s=p+1}^{p+q} \tilde y(k-n_s) \right) - \delta(k),
\end{aligned} \tag{32}$$

which can be considered as a double-input single-output nonlinear system because both $u(k)$ and $\tilde y(k)$ are measurable. The following equations hold [49]:

$$F(x(k))\big|_{\omega} = X(\omega) = \sum_{i=1}^{I_f}\ \sum_{0\le n_1\le\cdots\le n_i\le N_f} \bar c_i(n_1,\ldots,n_i) \sum_{\omega_1+\cdots+\omega_i=\omega} e^{-j n_1\omega_1}\, U(\omega_1)\times\cdots\times e^{-j n_i\omega_i}\, U(\omega_i)$$

and

$$F\!\left( \prod_{s=1}^{i} y(k-n_s) \right)\bigg|_{\omega} = \sum_{\omega_1+\cdots+\omega_i=\omega} e^{-j n_1\omega_1}\, Y(\omega_1)\times\cdots\times e^{-j n_i\omega_i}\, Y(\omega_i).$$

Taking the Fourier transform on both sides of (32) and substituting the above two equations into the Fourier transform leads to the results in (27)–(29). For (30) and (31), taking the Fourier transform on both sides of (21) and (23), respectively, and then following procedures similar to those for proving (27)–(29), the results can be obtained. This completes the proof.

Remark 6: It is worth noting that the spectrum of the expectation of the unbiased error function is a first-order polynomial of the model parameters to be identified (i.e., $\boldsymbol\theta$), and the coefficients of the first-order polynomial, $\phi_{n_1,\ldots,n_i}(\omega)$ and $\varphi_{n_1,\ldots,n_i}(\omega)$, are both independent of $\boldsymbol\theta$. According to [50], (27) is therefore a convex function with respect to the model parameters to be identified, $\boldsymbol\theta$, which means that the optimization of (27) at each given frequency $\omega$ is a convex optimization problem.

B. Frequency Domain Based Modeling of Multi-Objective Optimization

Given the output spectrum of the expectation of the unbiased error function at frequency $\omega$, i.e., $\Psi(\omega)$ in (27), the identification problem (18), (19) can then be investigated in the frequency domain based on the $L_\infty$-norm by minimizing the maximum error over the entire frequency range $\Omega$,

$$\arg\min_{\boldsymbol\theta}\ \max\{ |\Psi(\omega)| \mid \forall\omega\in\Omega \}, \tag{33}$$

where $|\Psi(\omega)|$ denotes the modulus of $\Psi(\omega)$. Problem (33) is a convex optimization problem with $\Psi(\omega)$ given in (27).

Proof: From Remark 6, $\Psi(\omega)$ is a convex function with respect to the model parameters to be identified, $\boldsymbol\theta$, and so is $|\Psi(\omega)|$. The pointwise maximum $\max\{|\Psi(\omega)| \mid \forall\omega\in\Omega\}$ is a convex function [50]. Problem (33) then amounts to minimizing a convex function over a convex region, and therefore is a convex optimization problem. This completes the proof.

Remark 7: The convex optimization problem (33) needs further tactical treatment in order to obtain an effective and efficient algorithm. Note that although the objective function in (33) is non-differentiable, the modulus $|\Psi(\omega)|$ at each frequency $\omega$ is differentiable except at the singularities. A naive gradient based method, e.g., one that in each step chooses the gradient of $|\Psi(\omega_s)|$ such that $|\Psi(\omega_s)| = \max\{|\Psi(\omega)| \mid \forall\omega\in\Omega\}$, may lead to the 'jump' phenomenon, i.e., the sudden increase of the cost function between two consecutive iterations, which is a key issue affecting the convergence performance of the multi-objective optimization problem.

In the following, the game theory is introduced to model the behaviors of the optimization process and to establish an effective and efficient algorithm for the multi-objective optimization problem (33).

C. Bargaining Game Theory Based Multi-Objective Optimization

Given the multi-objective optimization problem (33), classical optimization algorithms can make good use of the local information to update the coefficients, by using, for example, the gradient based strategy. Problems such as the jump phenomenon, however, may occur in such an optimization process, resulting from the lack of global information in the coefficient updating process. Evolution algorithms may offer an alternative, but most of them, such as the genetic algorithm, cannot make full use of the local information, which is detrimental to the local search. In this section, the bargaining game theory is used to overcome such problems through a simultaneous consideration of both the global and local information. Firstly, the definition of the game model for the multi-objective optimization problem (33) is given. Then the game theory based behavior modeling of the interactions and competitions among all the objectives is presented, which provides a feasible way to achieve the optimal strategy (i.e., the estimated optimal coefficients). The Nash equilibrium of the behavior modeling is demonstrated.
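As a brief aside on problem (33): because $\Psi(\omega)$ in (27) is a first-order polynomial of $\boldsymbol\theta$, the objective is a pointwise maximum of moduli of affine functions. The Python sketch below evaluates such an objective and numerically checks midpoint convexity. It is only an illustration under stated assumptions: the complex coefficients `A` and offsets `b` are random placeholders rather than values computed from (28) and (29), the constant offset is an assumed stand-in for terms with fixed parameters, and the function names are hypothetical.

```python
import random


def psi(theta, coeffs, offset):
    """Affine spectrum value at one frequency: sum_i coeffs[i]*theta[i] + offset."""
    return sum(c * t for c, t in zip(coeffs, theta)) + offset


def minimax_cost(theta, A, b):
    """Objective of (33): largest modulus |Psi(omega)| over all frequencies."""
    return max(abs(psi(theta, row, off)) for row, off in zip(A, b))


def rand_complex(rng):
    """Random complex number with real and imaginary parts in [-1, 1]."""
    return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))


rng = random.Random(0)
n_freq, n_par = 6, 4  # hypothetical numbers of frequencies and parameters

# Placeholder coefficients; in the paper they would follow from (28)-(29).
A = [[rand_complex(rng) for _ in range(n_par)] for _ in range(n_freq)]
b = [rand_complex(rng) for _ in range(n_freq)]

# Midpoint convexity: cost((t1 + t2) / 2) <= (cost(t1) + cost(t2)) / 2.
convex_ok = True
for _ in range(200):
    t1 = [rng.uniform(-2, 2) for _ in range(n_par)]
    t2 = [rng.uniform(-2, 2) for _ in range(n_par)]
    mid = [(p + q) / 2 for p, q in zip(t1, t2)]
    lhs = minimax_cost(mid, A, b)
    rhs = 0.5 * (minimax_cost(t1, A, b) + minimax_cost(t2, A, b))
    convex_ok = convex_ok and lhs <= rhs + 1e-12
```

The check passes for every sampled pair because the modulus of an affine function is convex and a pointwise maximum preserves convexity. It says nothing about how quickly a particular solver converges, which is the issue the bargaining game addresses.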

Finally, a game theory based algorithm is summarized for the multi-objective optimization problem (33).

Definition 1: The multi-player bargaining game model involves:

Players: The frequency $\omega$ in the frequency range $\Omega$, i.e., $\forall\omega \in \Omega$. Denote the number of players as $\lambda$.

Strategy set: $\theta \in \mathbb{R}^{\ell}$, where $\ell$ denotes the dimension of the coefficients $\theta$, with $\mathbb{R}$ being the set of real numbers.

Utility set: $\{\Phi(\omega) \mid \forall\omega \in \Omega\}$, where $\Phi(\omega)$ is the utility of player $\omega$ given as

$$\Phi(\omega) = 1 - |\Psi(\omega)| \tag{34}$$

and the system utility is defined as

$$\Phi = \arg\max_{\theta}\min_{\omega\in\Omega}\Phi(\omega) = \arg\min_{\theta}\max_{\omega\in\Omega}|\Psi(\omega)|. \tag{35}$$

Assumption 2: In order to guarantee an agreement in the bargaining game, it is assumed that $\Phi(\omega) = 0$, $\forall\omega \in \Omega$, holds if player $\omega$ decides not to enroll in the bargaining game.

Behavior modeling: In the $s$th round of the bargaining, the behavior can be modeled as:

Step 1: Given the strategy $\theta_s$ and the indicator $\kappa$, each player $\omega_\alpha$ ($\forall\omega_\alpha \in \Omega$) calculates its utility $\Phi_s(\omega_\alpha)$. If $\Phi_s(\omega_\alpha) < \kappa$ holds, player $\omega_\alpha$ generates its new strategy

$$\theta_{s+1}(\omega_\alpha) = \theta_s(\omega_\alpha) + r\,\epsilon_\theta \tag{36}$$

such that $\Phi_{s+1}(\omega_\alpha) > \Phi_s(\omega_\alpha)$ holds, where $r = [r_1,\ldots,r_\ell]^T$ is a random vector with $r_s$ randomly generated in $[-1, 1]$, $s = 1,\ldots,\ell$, and $\epsilon_\theta$ is a given constant step; otherwise, the new strategy for player $\omega_\alpha$ should reduce its utility, i.e., $\Phi_{s+1}(\omega_\alpha) < \Phi_s(\omega_\alpha)$.

Step 2: Each player $\omega_\alpha$ sends its new strategy $\theta_{s+1}(\omega_\alpha)$ to all other players $\forall\omega_\beta \in \Omega$, $\omega_\beta \neq \omega_\alpha$. Each player $\omega_\beta$ calculates its utility $\Phi_s(\omega_\alpha;\omega_\beta)$ based on player $\omega_\alpha$'s strategy and feeds it back to player $\omega_\alpha$.

Step 3: Each player $\omega_\alpha$ calculates the minimum utility of its own strategy, i.e.,

$$\Phi_{s+1,\min}(\omega_\alpha) = \min_{\forall\omega_\beta\in\Omega}\Phi_s(\omega_\alpha;\omega_\beta). \tag{37}$$

Step 4: All the players negotiate and choose the maximum utility, i.e.,

$$\Phi_{s+1,\max} = \max_{\omega_\alpha\in\Omega}\Phi_{s+1,\min}(\omega_\alpha). \tag{38}$$

If $\Phi_{s+1,\max} > \Phi_{s,\max}$, choose the corresponding player's strategy as the strategy of this round's bargaining, i.e.,

$$\theta_{s+1} = \theta_{s+1}(\omega_\alpha), \tag{39}$$

and calculate the indicator as

$$\kappa_{s+1} = \sum_{\omega_\beta\in\Omega}\Phi_{s+1}(\omega_\alpha;\omega_\beta)\Big/\lambda, \tag{40}$$

where $\lambda$ is the number of the players; go to step 5. Otherwise, if $\Phi_{s+1,\max} < \Phi_{s,\max}$, the new strategy of this round is rejected directly; go to step 1 to generate new strategies.

Step 5: End of the round's negotiation.

Remark 8: According to (38) in step 4, the maximum system utility $\Phi_{s+1,\max}$ is chosen among all the players' utilities. Therefore, the new strategy $\theta_{s+1}$ in (39) and the indicator $\kappa_{s+1}$ in (40) involve the global information. In step 1, each player generates its own new strategy independently via a local search, so the local information is employed. Both the global and local information are thus considered in the multi-objective optimization problem, which improves the convergence performance of the optimization.

Remark 9: The indicator $\kappa$ in (40) represents an approximation of the targeted system utility. For $\Phi_s(\omega_\alpha) > \kappa$ in step 1, if player $\omega_\alpha$ still chooses to maximize its own utility, the corresponding strategy is more likely to reduce the other players' utilities, and consequently to reduce the system utility. Other players would then be more likely to reject this strategy. In order to help achieve an agreement, player $\omega_\alpha$ should choose its new strategy such that $\Phi_{s+1}(\omega_\alpha) < \Phi_s(\omega_\alpha)$ holds.

Theorem 3: After several rounds of negotiations, the multi-player bargaining game finally converges to the Nash equilibrium, which means that no unilateral deviation in the strategy by any single player $\omega_\alpha$ is profitable for that player, i.e.,

$$\Phi(\omega_\alpha \mid \theta^*(\omega_\alpha), \theta^*(\omega_{-\alpha})) \geq \Phi(\omega_\alpha \mid \theta(\omega_\alpha), \theta^*(\omega_{-\alpha})), \quad \forall\omega_\alpha \in \Omega, \tag{41}$$

where $\theta^*(\omega_\alpha)$ is the optimal strategy set of the player $\omega_\alpha$, $\theta^*(\omega_{-\alpha})$ is the optimal strategy set of all other players except for player $\omega_\alpha$, and $\Phi(\omega_\alpha \mid \cdot)$ is the utility of player $\omega_\alpha$ with strategy set $\cdot$. The strategy $\theta^*(\omega_1) = \cdots = \theta^*(\omega_\lambda)$ is the optimal solution of the multi-objective optimization problem (33).

Proof: Suppose that there exists a strategy $\theta(\omega_\alpha)$ such that $\Phi(\omega_\alpha \mid \theta(\omega_\alpha), \theta^*(\omega_{-\alpha})) \geq \Phi(\omega_\alpha \mid \theta^*(\omega_\alpha), \theta^*(\omega_{-\alpha}))$ for all strategies $\theta^*(\omega_{-\alpha})$ but it is not a Nash equilibrium. Because the players compete with each other in the multi-objective optimization, at least one of the players' utilities will be decreased by the unilateral deviation, namely:

$$\Phi(\omega_\beta \mid \theta(\omega_\alpha), \theta^*(\omega_{-\alpha})) \leq \Phi(\omega_\beta \mid \theta^*(\omega_\alpha), \theta^*(\omega_{-\alpha})), \quad \exists\omega_\beta \in \Omega,\ \omega_\beta \neq \omega_\alpha,$$

which means that the minimum utility has been decreased. Note that the goal of the optimization is to maximize the minimum utility. Therefore, in the negotiation (step 4), the new strategy will be chosen as $\theta_{s+1} \in \theta^*(\omega_{-\alpha})$, and $\theta(\omega_\alpha)$ will be rejected. The assumption does not hold, and none of the players can benefit by unilaterally deviating from its strategy. Therefore, the strategy $\theta^*(\omega_1) = \cdots = \theta^*(\omega_\lambda)$ is the Nash equilibrium of the multi-objective bargaining game model. This completes the proof. ∎

The following algorithm is summarized to provide a step-by-step negotiation process to achieve the optimal strategy of the multi-player bargaining game model, and hence the optimal estimate of the multi-objective frequency-domain based identification problem (33).

In the initialization, the coefficients $\theta_0$ and the indicator $\kappa_0$ can be randomly given. After several rounds of bargaining,

Algorithm 1 will converge to the Nash equilibrium, and the optimal estimation can therefore be obtained. Owing to the use of the global information, the system utility in Algorithm 1 increases successively as the iterative bargaining goes on, that is, the maximum error in (33) decreases monotonically, which greatly improves the convergence performance of the multi-objective identification problem (33).

TABLE I
ALGORITHM TO ACHIEVE THE NASH EQUILIBRIUM

Remark 10: Although finding the Nash equilibrium in a game is an NP problem, Algorithm 1 together with the behavior modeling can achieve the Nash equilibrium. The reasons are: 1) the multi-objective optimization problem in (33) (or (35)) is a convex problem, so there exists only a single Nash equilibrium in the game; 2) according to Algorithm 1 and the behavior modeling, a higher utility (i.e., a smaller estimate error at the frequencies of interest) is guaranteed after each negotiation until the Nash equilibrium is reached, which means that the current strategy gets closer to the Nash equilibrium after each negotiation, and after several rounds of negotiations (a number determined by the targeted utility $\Phi$ or by the estimate error acceptable at the frequencies of interest), the strategy reaches the Nash equilibrium.

Remark 11: The bargaining game theory algorithm (BGTA) is proposed to solve the multi-objective optimization problem where the objective function given in (33) is based on the $L_\infty$-norm. There also exist other criteria to characterize the objective function, e.g., the $L_1$-norm or $L_2$-norm, which integrate the modulus or squared modulus over the entire frequency range. With these norms, the estimate error at the frequencies of interest would be more conservative (much larger than the estimate error provided by the proposed BGTA method) because it is a total estimate error that integrates the error over the whole frequency range. If the $L_1$-norm or $L_2$-norm is selected as the criterion to formulate the objective function, the proposed BGTA method cannot be directly applied to the resulting single-objective optimization problem. We will consider these criteria in our further study.

Remark 12: To guarantee a unique solution, the number of frequencies taken into account in the identification problem (33) (i.e., the number of players considered in the bargaining game) should be greater than or at least equal to the number of model parameters to be identified.

Remark 13: In the above, we demonstrate that the bargaining game theory based multi-objective optimization is a convex problem and that a unique optimal solution exists. Generally, a small step $\epsilon_\theta$ ensures the convergence of the numerical simulation to the optimal solution at the cost of a large number of bargaining rounds, whereas a large step $\epsilon_\theta$ needs only a small number of bargaining rounds. If the step $\epsilon_\theta$ is too large, the numerical simulation may not converge to the optimal solution. It is difficult to provide an analytical and explicit criterion for the step $\epsilon_\theta$ that guarantees the convergence of the numerical simulation. Instead, two or more different steps $\epsilon_\theta$ can be tried: if the solutions obtained with these different steps are the same, the numerical simulations can be considered as converging to the optimal solution. Otherwise, the step $\epsilon_\theta$ is reduced until the results obtained with different steps are the same.

IV. NUMERICAL SIMULATIONS AND DISCUSSIONS

Examples are investigated using the cascade dynamic nonlinear system described by (42)–(44). Note that model (42) is a Volterra model and model (43) a NARX model, which can be merged into the general system (10)–(14) that describes the cascade dynamic nonlinear systems.

$$x(k) = 0.6u(k) + 0.2u(k-2) + 0.6u(k)u(k-1), \tag{42}$$

$$\begin{aligned} y(k) = {} & x(k) + 0.4y(k-1)y(k-2) + 0.6y(k-1)u(k) \\ & - 0.1y(k-1)u(k-2) - 0.1y(k-1)u(k-4) \\ & + 0.6y(k-1)u(k)u(k-1) - 0.3y(k-1)u(k-2)u(k-3), \end{aligned} \tag{43}$$

$$\tilde{y}(k) = y(k) + v(k), \tag{44}$$

where

$$c_1(0) = 0.6,\ c_1(2) = 0.2,\ c_2(0,1) = 0.6,\ d_{0,1}(0) = -1,\ d_{0,2}(1,2) = 0.4,\ d_{1,1}(0,1) = 0.6,\ d_{1,1}(2,1) = -0.1,$$
$$d_{1,1}(4,1) = -0.1,\ d_{2,1}(0,1,1) = 0.6,\ d_{2,1}(2,3,1) = -0.3.$$

The input is given as

$$u(k) = 0.3\sin(2\pi T_s k) + 0.3\sin(4\pi T_s k + \pi/2), \tag{45}$$

where $T_s = 0.025$ s is the sampling time interval. In typical structural health monitoring applications, the received signal has a signal-to-noise ratio (SNR) around 60 dB [51]. Therefore, the SNR of the white Gaussian noise $v(k)$ is firstly set to be

60 dB in the simulation. The effect of signals with different SNR will be studied later.

Adopting the least square method mentioned in Remark 5, $\hat{\theta} = (\Theta^{T}\Theta)^{-1}\Theta^{T}\varepsilon$, and considering a simple case that only identifies the first sub-system with $I_f = 2$ and $N_f = 2$, the matrix $\Theta^{T}\Theta$ is singular. Therefore, the least square method is unavailable.

Fig. 2. Output spectrum $Y(\omega)$.

Fig. 2 shows the output spectra, $Y(\omega)$, with and without noise, respectively. Although only two frequencies are involved in the input (1 Hz and 2 Hz, see (45)), multiple harmonics exist in the frequency range of $Y(\omega)$ due to the nonlinearities involved in the first and second subsystems (42)–(44). Only the first 20 Hz range is shown in Fig. 2 because the sampling time interval is 0.025 s with a data length of 1 s. As the output frequency increases, the output spectrum without noise shows a downward tendency. For the noised output case, the output spectrum follows the same descending trend for the lower-order frequency components, before tending to a constant as the output frequency increases. The reason is that the noise in the numerical study is a Gaussian white noise, which has a constant power spectral density.

Fig. 3. Identification of the cascade dynamic nonlinear system based on the bargaining game modeling with different initial model parameters.

Fig. 3 shows the identification results of the cascade dynamic nonlinear system with three different initial model parameters. The expectations of the noised output spectrum, $E(\tilde{Y}(\omega))$ and $E(\tilde{Y}(\omega_1)\tilde{Y}(\omega_2))$, in the proposed BGTA method are obtained via 100 measurements. It can be seen that after several rounds of negotiations, the estimated coefficients converge to their respective true values, which also indicates that the proposed BGTA converges to the Nash equilibrium in the multi-objective optimization.

Fig. 4. Comparisons of the convergence performance of the proposed BGTA and the gradient based naive method.

Fig. 4 shows the comparison of the convergence performance between the proposed BGTA and the gradient based naive method mentioned in Remark 7. It can be seen that the gradient based naive method exhibits an obvious 'jump' phenomenon between two consecutive iterations (shown in the sub-figure), i.e., the estimate error suddenly increases to a large value in the iteration, which deteriorates the convergence performance of the optimization. This is due to the fact that only local information is considered in the optimization (the gradient is employed). This 'jump' phenomenon becomes even worse when the estimate error becomes smaller. As to the proposed BGTA with different initial coefficients, Fig. 4 shows a much-improved convergence behavior. In fact, the estimate error quickly reduces with the increasing number of iterations for all cases. Meanwhile, as pointed out in Remark 8, the proposed BGTA employs both the local information and the global information in the optimization. Therefore, no 'jump' phenomenon appears, which is also responsible for the improved convergence performance compared with the gradient based naive method. It can also be observed that when the proposed BGTA achieves the optimal strategy, the gradient based naive method is still far from convergence. The effectiveness, efficiency, and superiority of the proposed game theory algorithm are therefore demonstrated.
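The absence of the 'jump' phenomenon follows from the acceptance rule of the negotiation: a round's strategy is kept only if it raises the minimum utility over all players. The following is a minimal sketch of Steps 1 to 4 on a toy problem, not the authors' Algorithm 1. It assumes real-valued affine errors $\Psi(\omega) = a_\omega^T\theta - b_\omega$ in place of the spectra in (33), a bounded number of random draws per player in Step 1, and hypothetical names (`make_players`, `utility`, `propose`, `bargain`) introduced only for illustration.

```python
import random


def make_players(theta_true, rows):
    """Toy players: each frequency w has an affine error psi_w = a_w . theta - b_w,
    with b_w chosen so that theta_true gives zero error (an assumption of this sketch)."""
    return [(a, sum(ai * ti for ai, ti in zip(a, theta_true))) for a in rows]


def utility(player, theta):
    """Utility (34): 1 - |psi|, here with a real-valued psi."""
    a, b = player
    return 1.0 - abs(sum(ai * ti for ai, ti in zip(a, theta)) - b)


def propose(player, theta, kappa, step, rng, tries=20):
    """Step 1: random local move (36). The move must raise the player's own utility
    when it is below kappa and lower it otherwise. Returns None if no draw qualifies
    within `tries` attempts (the bounded retry is a simplification)."""
    current = utility(player, theta)
    want_increase = current < kappa
    for _ in range(tries):
        cand = [t + step * rng.uniform(-1.0, 1.0) for t in theta]
        new = utility(player, cand)
        if (new > current) if want_increase else (new < current):
            return cand
    return None


def bargain(players, theta0, kappa0, step, rounds, seed=0):
    rng = random.Random(seed)
    theta, kappa = list(theta0), kappa0
    system = min(utility(p, theta) for p in players)  # current system utility
    history = [system]
    for _ in range(rounds):
        best, best_min = None, system
        for p in players:
            cand = propose(p, theta, kappa, step, rng)
            if cand is None:
                continue
            # Steps 2-3: all players score the proposal; keep its minimum utility (37).
            cand_min = min(utility(q, cand) for q in players)
            if cand_min > best_min:  # Step 4: best proposal of this round (38)
                best, best_min = cand, cand_min
        if best is not None:  # accept (39) and refresh the indicator (40)
            theta, system = best, best_min
            kappa = sum(utility(q, theta) for q in players) / len(players)
        history.append(system)  # a rejected round leaves the system utility unchanged
    return theta, history


rows = [(1.0, 0.5), (0.3, -1.0), (0.8, 0.8), (-0.5, 1.2)]
players = make_players((0.6, 0.2), rows)
theta, history = bargain(players, theta0=(0.0, 0.0), kappa0=1.0, step=0.01, rounds=3000)
print("estimate:", theta)
print("maximum error:", 1.0 - history[-1])
```

Because a proposal is accepted only when its minimum utility exceeds the current system utility, `history` never decreases, which mirrors the monotone behavior described for the BGTA. Rerunning with a second value of `step` and comparing the estimates is one way to apply the check suggested in Remark 13.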

Fig. 5. Estimate error at some frequencies of interest.
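The harmonic content discussed for the noise-free output can be reproduced with a short simulation of the example system (42), (43), and (45). The sketch below is only illustrative: it assumes zero initial conditions and $u(k) = 0$ for $k < 0$, which the paper does not state, and the helper names (`u_at`, `simulate`, `dft_magnitude`) are hypothetical.

```python
import cmath
import math

TS = 0.025  # sampling interval in seconds, from (45)
N = 40      # 1 s of data, so DFT bin m corresponds to m Hz


def u_at(k):
    """Input (45); assumed to be zero for k < 0."""
    if k < 0:
        return 0.0
    return (0.3 * math.sin(2 * math.pi * TS * k)
            + 0.3 * math.sin(4 * math.pi * TS * k + math.pi / 2))


def simulate(n=N):
    """Noise-free x(k) from (42) and y(k) from (43) with zero initial conditions."""
    x, y = [], []
    for k in range(n):
        y1 = y[k - 1] if k >= 1 else 0.0
        y2 = y[k - 2] if k >= 2 else 0.0
        xk = 0.6 * u_at(k) + 0.2 * u_at(k - 2) + 0.6 * u_at(k) * u_at(k - 1)
        yk = (xk + 0.4 * y1 * y2 + 0.6 * y1 * u_at(k)
              - 0.1 * y1 * u_at(k - 2) - 0.1 * y1 * u_at(k - 4)
              + 0.6 * y1 * u_at(k) * u_at(k - 1)
              - 0.3 * y1 * u_at(k - 2) * u_at(k - 3))
        x.append(xk)
        y.append(yk)
    return x, y


def dft_magnitude(signal):
    """Magnitude of the DFT (scaled by 1/N) for bins 0..N/2."""
    n = len(signal)
    return [abs(sum(s * cmath.exp(-2j * math.pi * m * k / n)
                    for k, s in enumerate(signal))) / n
            for m in range(n // 2 + 1)]


x, y = simulate()
spectrum = dft_magnitude(y)
for hz in range(7):
    print(f"{hz} Hz: {spectrum[hz]:.4f}")
```

The input only contains 1 Hz and 2 Hz, so any energy printed at 3 Hz and above comes from the product terms in (42) and (43), plus a start-up transient caused by the assumed zero initial conditions.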

It can be observed that the convergence rate (number of iterations to convergence) of the proposed BGTA may depend on the given initial model parameters, e.g., case 3 converges with $2 \times 10^4$ iterations, and case 1 with $5.4 \times 10^4$ iterations. The given step $\epsilon_\theta$ in (36) and the maximum utility $\Phi_{\text{targeted}}$ in Algorithm 1 (or the maximum estimate error acceptable at the frequencies of interest) also influence the number of iterations to convergence. The running time ranges from twenty to forty minutes for the four numerical simulations in Figs. 3 and 4, which is acceptable for an offline identification algorithm. To improve the efficiency, either the $L_1$-norm or the $L_2$-norm can be considered to characterize the identification as a single-objective optimization problem, which will only provide a total estimate error over the whole frequency range instead of the tight estimate error at the frequencies of interest that can be obtained by the proposed BGTA method.

Fig. 5 shows the estimate errors at the frequencies of interest. It can be observed that the maximum estimate error is −105 dB at 11 Hz, which is consistent with that shown in Fig. 4. The estimate errors at all other frequencies are smaller.

Fig. 6. Identification of the cascade dynamic nonlinear systems at different noise levels.

Fig. 7. The convergence performances at different noise levels.

Figs. 6 and 7 show the performance of the proposed BGTA against different noise levels. When the SNR decreases from 40 dB to 30 dB, the estimated model parameters still converge well to the true values, demonstrating the robustness of the proposed algorithm. The expectation of the noised output spectrum $E(\tilde{Y}(\omega))$ in the proposed BGTA method is obtained via 100 measurements for all noise levels. When the SNR is set to 20 dB, although some of the parameters, e.g., $d_{0,2}(1,2)$, slightly deviate from the true values, most of the parameters still converge to the true values. Therefore, the result is still acceptable. For an SNR lower than 20 dB, more measurements should be used to calculate an accurate expectation of the noised output spectrum $E(\tilde{Y}(\omega))$ to make the estimate results acceptable.

It is shown in Section II that the identification of the cascade dynamic nonlinear systems can be transformed into the identification of a double-input single-output Volterra system. Although the identification of Volterra systems has been widely studied in the literature, most of the studies focused on a single-input single-output second-order or third-order Volterra model [34], [35], which cannot be directly applied to the present cascade dynamic nonlinear models. In Fig. 8, we compare the present BGTA method with a tensor based method that identifies a multiple-input multiple-output Volterra model [52]. Fig. 8(a) is the identification result of the first sub-system, and Fig. 8(b) is that of the second sub-system. For the second sub-system, the measured output spectrum $\tilde{Y}(\omega)$ and the input spectrum $U(\omega)$ are used to calculate the spectrum of the intermediate variable $|X_{\text{BGTA}}(\omega)|$. Comparing $|X_{\text{BGTA}}(\omega)|$ and $|X_{\text{Tensor}}(\omega)|$ with the true spectrum, we can investigate the effectiveness of the different methods on the identification of the first and second sub-systems. It can be observed that the results from the BGTA method almost overlap with the true spectrum, $|X(\omega)|$, outperforming the tensor based counterpart. Plausible reasons are: 1) the tensor based method is a time domain method, inevitably embracing all the noise in the identification. On the contrary, the BGTA method only takes into account the frequencies shown in Fig. 2

as the frequencies of interest. Therefore, only the noise at those particular frequencies is involved in the identification; 2) a regularization term exists in the objective function of the tensor based method, which may result in an additional error in the identification. Furthermore, the tensor based method can only provide a total estimate error that involves the error at each sampling instance but cannot quantify and guarantee a tight estimate error at the frequencies of interest. The BGTA method makes this possible, which is one of the distinct contributions of the present work.

Fig. 8. Comparison with the tensor based method [52]. $|X(\omega)|$ is the magnitude of the true spectrum of the intermediate variable $x(k)$, $|X_{\text{BGTA}}(\omega)|$ is the magnitude of the spectrum obtained by the proposed BGTA method, and $|X_{\text{Tensor}}(\omega)|$ is the magnitude of the spectrum obtained by the tensor based method. (a) First sub-system, and (b) second sub-system.

V. CONCLUSIONS

Cascade dynamic nonlinear systems can be used to model a large class of nonlinear systems in engineering practice, but very little effort has been devoted to the identification of such systems. The main difficulty comes from the non-convex nature of the identification problem. This paper investigated this problem based on a bargaining game theory model. Firstly, the cascade dynamic nonlinear systems were rearranged and converted into an equivalent DISO system, where the estimate error was considered as the new output, and the system input and measured noised output as the new inputs. In this way, the identification of the cascade dynamic nonlinear systems was transformed into a convex optimization problem. Secondly, in order to guarantee the estimate error at the frequencies of interest, the DISO identification problem was decomposed into a set of sub-problems defined in the output frequency range. The spectrum of the unbiased estimation error function was then given, which was demonstrated to be a first-order polynomial of the model parameters to be identified, with coefficients independent of those parameters. This characteristic relationship between the unbiased spectrum and the model parameters to be identified greatly facilitates the optimization in the frequency domain. At this stage, the identification problem was transformed into a multi-objective optimization problem. To ensure an effective and efficient multi-objective optimization, the bargaining game theory was employed to model the competitions and interactions among the multiple objectives in the optimization. It was shown that the consideration of both the global and local information in the BGTA greatly improves the convergence performance of the optimization. The proposed formalism allows the identification of the cascade dynamic nonlinear systems to be conducted in the frequency domain such that the nonlinear features can be directly captured with the estimate error being guaranteed at the frequencies of interest, which is conducive to numerous engineering applications requiring the extraction of nonlinear features related to higher-order harmonics.

REFERENCES

[1] M. Hong, Z. Su, Q. Wang, L. Cheng, and X. Qing, “Modeling nonlinearities of ultrasonic waves for fatigue damage characterization: Theory, simulation, and experimental validation,” Ultrasonics, vol. 54, no. 3, pp. 770–778, 2014.
[2] S. Shan, L. Cheng, and P. Li, “Adhesive nonlinearity in Lamb-wave-based structural health monitoring systems,” Smart Mater. Struct., vol. 26, no. 2, p. 025019, 2017.
[3] H. Jiang and P. Wilford, “Digital predistortion for power amplifiers using separable functions,” IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4121–4130, Aug. 2010.
[4] R. Piazza, M. R. B. Shankar, and B. Ottersten, “Data predistortion for multicarrier satellite channels based on direct learning,” IEEE Trans. Signal Process., vol. 62, no. 22, pp. 5868–5880, Nov. 2014.
[5] E. W. Bai, “Frequency domain identification of Wiener models,” Automatica, vol. 39, no. 9, pp. 1521–1530, 2003.
[6] S. Van Vaerenbergh, J. Vía, and I. Santamaría, “Blind identification of SIMO Wiener systems based on kernel canonical correlation analysis,” IEEE Trans. Signal Process., vol. 61, no. 9, pp. 2219–2230, May 2013.
[7] G. Li and C. Wen, “Identification of Wiener systems with clipped observations,” IEEE Trans. Signal Process., vol. 60, no. 7, pp. 3845–3852, Jul. 2012.
[8] L. Vanbeylen, R. Pintelon, and J. Schoukens, “Blind maximum-likelihood identification of Wiener systems,” IEEE Trans. Signal Process., vol. 57, no. 8, pp. 3017–3029, Aug. 2009.
[9] M. Pawlak, Z. Hasiewicz, and P. Wachel, “On nonparametric identification of Wiener systems,” IEEE Trans. Signal Process., vol. 55, no. 2, pp. 482–492, Feb. 2007.
[10] M. Zhang, A. Zhang, and J. Li, “Fast and accurate rank selection methods for multistage Wiener filter,” IEEE Trans. Signal Process., vol. 64, no. 4, pp. 973–984, Feb. 2016.
[11] X. Hong, S. Chen, Y. Gong, and C. J. Harris, “Nonlinear equalization of Hammerstein OFDM systems,” IEEE Trans. Signal Process., vol. 62, no. 21, pp. 5629–5639, Nov. 2014.
[12] C. Yu, C. Zhang, and L. Xie, “A new deterministic identification approach to Hammerstein systems,” IEEE Trans. Signal Process., vol. 62, no. 1, pp. 131–140, Jan. 2014.
[13] B. Ni, H. Garnier, and M. Gilson, “A refined instrumental variable method for Hammerstein-Wiener continuous-time model identification,” IFAC Proc. Vol., vol. 45, no. 16, pp. 1061–1066, 2012.
[14] B. Q. Mu and H. F. Chen, “Recursive identification of multi-input multi-output errors-in-variables Hammerstein systems,” IEEE Trans. Autom. Control, vol. 60, no. 3, pp. 843–849, Mar. 2015.
[15] A. Wills, T. B. Schön, L. Ljung, and B. Ninness, “Identification of Hammerstein–Wiener models,” Automatica, vol. 49, no. 1, pp. 70–81, 2013.
[16] M. Schoukens, E. W. Bai, and Y. Rolain, “Identification of Hammerstein-Wiener systems,” IFAC Proc. Vol., vol. 45, no. 16, pp. 274–279, 2012.

[17] E. W. Bai, “An optimal two-stage identification algorithm for Hammerstein–Wiener nonlinear systems,” Automatica, vol. 34, no. 3, pp. 333–338, 1998.
[18] G. Li, C. Wen, W. X. Zheng, and G. Zhao, “Iterative identification of block-oriented nonlinear systems based on biconvex optimization,” Syst. Control Lett., vol. 79, pp. 68–75, 2015.
[19] P. Crama and J. Schoukens, “Hammerstein–Wiener system estimator initialization,” Automatica, vol. 40, no. 9, pp. 1543–1550, 2004.
[20] A. Brouri, F. Giri, F. Ikhouane, F. Z. Chaoui, and O. Amdouri, “Identification of Hammerstein-Wiener systems with backlash input nonlinearity bordered by straight lines,” IFAC Proc. Vol., vol. 47, no. 3, pp. 475–480, 2014.
[21] F. Giri, A. Brouri, F. Ikhouane, F. Z. Chaoui, and A. Radouane, “Identification of Hammerstein-Wiener systems including backlash input nonlinearities,” IFAC Proc. Vol., vol. 46, no. 11, pp. 360–365, 2013.
[22] Y. Rochdi, F. Giri, and F. Z. Chaoui, “Frequency identification of nonparametric Hammerstein-Wiener systems with output backlash operator,” IFAC Proc. Vol., vol. 45, no. 16, pp. 19–24, 2012.
[23] Y. Rochdi, F. Giri, and F. Z. Chaoui, “Frequency identification of nonparametric Hammerstein-Wiener systems,” IFAC Proc. Vol., vol. 44, no. 1, pp. 11190–11195, 2011.
[24] A. Wills and B. Ninness, “Generalised Hammerstein–Wiener system estimation and a benchmark application,” Control Eng. Pract., vol. 20, no. 11, pp. 1097–1108, 2012.
[25] A. Wills and B. Ninness, “Estimation of generalised Hammerstein-Wiener systems,” IFAC Proc. Vol., vol. 42, no. 10, pp. 1104–1109, 2009.
[26] J. Vörös, “Identification of nonlinear dynamic systems with input saturation and output backlash using three-block cascade models,” J. Franklin Inst., vol. 351, no. 12, pp. 5455–5466, 2014.
[27] K. Tiels and J. Schoukens, “Wiener system identification with generalized orthonormal basis functions,” Automatica, vol. 50, no. 12, pp. 3147–3154, 2014.
[28] R. Castro-Garcia, K. Tiels, J. Schoukens, and J. A. K. Suykens, “Incorporating best linear approximation within LS-SVM-based Hammerstein system identification,” in Proc. IEEE 54th Conf. Decis. Control, 2015, pp. 7392–7397.
[29] R. Pintelon and J. Schoukens, System Identification: A Frequency Domain Approach. New York, NY, USA: Wiley, 2012.
[30] M. Schoukens, R. Pintelon, and Y. Rolain, “Identification of Wiener–Hammerstein systems by a nonparametric separation of the best linear approximation,” Automatica, vol. 50, no. 2, pp. 628–634, 2014.
[31] J. Bussgang, “Crosscorrelation functions of amplitude-distorted Gaussian signals,” Massachusetts Institute of Technology, Cambridge, MA, USA, Tech. Rep. No. 216, 1952.
[32] I. Sandberg, “Expansions for nonlinear systems,” Bell Syst. Tech. J., vol. 61, no. 2, pp. 159–199, 1982.
[33] S. Boyd and L. Chua, “Fading memory and the problem of approximating nonlinear operators with Volterra series,” IEEE Trans. Circuits Syst., vol. 32, no. 11, pp. 1150–1161, Nov. 1985.
[34] W. Rugh, Nonlinear System Theory. Baltimore, MD, USA: Johns Hopkins Univ. Press, 1981.
[35] M. Schetzen, The Volterra and Wiener Theories of Nonlinear Systems. Melbourne, FL, USA: Krieger, 2006.
[36] T. Helie and B. Laroche, “Computation of convergence bounds for Volterra series of linear-analytic single-input systems,” IEEE Trans. Autom. Control, vol. 56, no. 9, pp. 2062–2072, Sep. 2011.
[37] Z. Xiao, X. Jing, and L. Cheng, “Parameterized convergence bounds for Volterra series expansion of NARX models,” IEEE Trans. Signal Process., vol. 61, no. 20, pp. 5026–5038, Oct. 2013.
[38] Z. Xiao, X. Jing, and L. Cheng, “Estimation of parametric convergence bounds for Volterra series expansion of nonlinear systems,” Mech. Syst. Signal Process., vol. 45, no. 1, pp. 28–48, 2014.
[39] J. Peyton and S. Billings, “Recursive algorithm for computing the frequency response of a class of non-linear difference equation models,” Int. J. Control, vol. 50, no. 5, pp. 1925–1940, 1989.
[40] S. Billings and J. Peyton, “Mapping non-linear integro-differential equations into the frequency domain,” Int. J. Control, vol. 52, no. 4, pp. 863–879, 1990.
[41] Z. Lang and S. Billings, “Output frequency characteristics of nonlinear systems,” Int. J. Control, vol. 64, no. 6, pp. 1049–1067, 1996.
[42] Z. Lang, S. Billings, and R. Yue, “Output frequency response function of nonlinear Volterra systems,” Automatica, vol. 43, no. 5, pp. 805–816, 2007.
[43] X. Jing, Z. Lang, and S. Billings, “Mapping from parametric characteristics to generalized frequency response functions of non-linear systems,” Int. J. Control, vol. 81, no. 7, pp. 1071–1088, 2008.
[44] Z. Xiao and X. Jing, “Frequency-domain analysis and design of linear feedback of nonlinear systems and applications in vehicle suspensions,” IEEE/ASME Trans. Mechatronics, vol. 21, no. 1, pp. 506–517, Feb. 2016.
[45] Z. Xiao and X. Jing, “A novel characteristic parameter approach for analysis and design of linear components in nonlinear systems,” IEEE Trans. Signal Process., vol. 64, no. 10, pp. 2528–2540, May 2016.
[46] G. Li, C. Wen, W. Zheng, and Y. Chen, “Identification of a class of nonlinear autoregressive models with exogenous inputs based on kernel machines,” IEEE Trans. Signal Process., vol. 59, no. 5, pp. 2146–2159, May 2011.
[47] E. Bai and K. Chan, “Identification of an additive nonlinear system and its applications in generalized Hammerstein models,” Automatica, vol. 44, no. 2, pp. 430–436, 2008.
[48] V. Krishnan and K. Chandra, Probability and Random Processes. New York, NY, USA: Wiley, 2015.
[49] A. Swain and S. Billings, “Generalized frequency response function matrix for MIMO non-linear systems,” Int. J. Control, vol. 74, no. 8, pp. 829–844, 2001.
[50] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[51] Q. Wang, M. Hong, and Z. Su, “An in-situ structural health diagnosis technique and its realization via a modularized system,” IEEE Trans. Instrum. Meas., vol. 64, no. 4, pp. 873–887, Apr. 2015.
[52] K. Batselier, Z. Chen, and N. Wong, “Tensor network alternating linear scheme for MIMO Volterra system identification,” Automatica, vol. 84, pp. 26–35, 2017.

Zhenlong Xiao (M'16) received the B.S. degree from the Nanjing University of Posts and Telecommunications, Nanjing, China, in 2008, the M.S. degree from the Beijing University of Posts and Telecommunications, Beijing, China, in 2011, and the Ph.D. degree from Hong Kong Polytechnic University, Hong Kong, China, in 2015. He is currently an Assistant Professor with the Department of Communication Engineering, School of Information Science and Engineering, Xiamen University, Xiamen, China. His research interests include the analysis and design of nonlinear systems, nonlinear signal processing, and system identification.

Shengbo Shan received the B.Eng. and M.Eng. degrees from the Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2012 and 2015, respectively. He is currently a Ph.D. candidate with the Department of Mechanical Engineering, Hong Kong Polytechnic University, Hong Kong. His research interests include structural health monitoring, guided wave theory, and signal processing.

Li Cheng received the B.Sc. degree from Xi'an Jiaotong University, Xi'an, China, and the DEA and Ph.D. degrees from the Institut National des Sciences Appliquées de Lyon (INSA-Lyon), Villeurbanne, France. He is currently a Chair Professor of mechanical engineering and the Director of the Consortium for Sound and Vibration Research (CSVR), Hong Kong Polytechnic University, Hong Kong. He started his academic career at Laval University, Quebec, Canada, in 1992, rising from an Assistant Professor to an Associate/Full Professor, before coming to Hong Kong in 2000, where he was promoted to a Chair Professor in 2005 and was the Head of the Department from 2011 to 2014. He works in the field of sound and vibration, structural health monitoring, and smart structure and control. He is an elected Fellow of the Acoustical Society of America, Acoustical Society of China, IMechE, Hong Kong Institution of Engineers, and Hong Kong Institute of Acoustics. He currently serves as the Deputy Editor-in-Chief and Receiving Editor for the Journal of Sound and Vibration and an Associate Editor for the Journal of the Acoustical Society of America.