JSIAM Letters

Vol.3 (2011) pp.1-100

The Japan Society for Industrial and Applied Mathematics

Editorial Board

Chief Editor Yoshimasa Nakamura (Kyoto University)

Vice-Chief Editor Kazuo Kishimoto (Tsukuba University)

Associate Editors
Reiji Suda (University of Tokyo)
Satoshi Tsujimoto (Kyoto University)
Masashi Iwasaki (Kyoto Prefectural University)
Norikazu Saito (University of Tokyo)
Koh-ichi Nagao (Kanto Gakuin University)
Koichi Kato (Japan Institute for Pacific Studies)
Atsushi Nagai (Nihon University)
Takeshi Mandai (Osaka Electro-Communication University)
Ryuichi Ashino (Osaka Kyoiku University)
Tamotu Kinoshita (University of Tsukuba)
Yuzuru Sato (Hokkaido University)
Ken Umeno (NiCT)
Katsuhiro Nishinari (University of Tokyo)
Tetsu Yajima (Utsunomiya University)
Narimasa Sasa (Japan Atomic Energy Agency)
Fumiko Sugiyama (Kyoto University)
Hiroko Kitaoka (JSOL)
Hitoshi Imai (University of Tokushima)
Nobito Yamamoto (University of Electro-Communications)
Daisuke Furihata (Osaka University)
Takahiro Katagiri (The University of Tokyo)
Tetsuya Sakurai (University of Tsukuba)
Takayasu Matsuo (University of Tokyo)
Tomohiro Sogabe (Aichi Prefectural University)
Yoshitaka Watanabe (Kyushu University)
Katsuhisa Ozaki (Shibaura Institute of Technology)
Kenta Kobayashi (Kanazawa University)
Takaaki Nara (The University of Electro-Communications)
Takashi Suzuki (Osaka University)
Tetsuo Ichimori (Osaka Institute of Technology)
Tatsuo Oyama (National Graduate Institute for Policy Studies)
Hideyuki Azegami (Nagoya University)
Kenji Shirota (Aichi Prefectural University)
Eiji Katamine (Gifu National College of Technology)
Masami Hagiya (University of Tokyo)
Toru Fujiwara (Osaka University)
Yasuyuki Tsukada (NTT Communication Science Laboratories)
Naoyuki Ishimura (Hitotsubashi University)
Jiro Akahori (Ritsumeikan University)
Kiyomasa Narita (Kanagawa University)
Ken Nakamula (Tokyo Metropolitan University)
Miho Aoki (Shimane University)
Kazuto Matsuo (Institute of Information Security)
Keiko Imai (Chuo University)
Ichiro Kataoka (HITACHI)
Naoshi Nishimura (Kyoto University)
Hiromichi Itou (Gunma University)
Shin-Ichi Nakano (Gunma University)
Akiyoshi Shioura (Tohoku University)

Contents

Regular solution to topology optimization problems of continua ・・・ 1-4 Hideyuki Azegami, Satoshi Kaizu and Kenzen Takeuchi

A convergence improvement of the BSAIC preconditioner by deflation ・・・ 5-8 Ikuro Yamazaki, Hiroto Tadano, Tetsuya Sakurai and Keita Teranishi

Cache optimization of a non-orthogonal joint diagonalization method ・・・ 9-12 Yusuke Hirota, Yusaku Yamamoto and Shao-Liang Zhang

Quasi-minimal residual smoothing technique for the IDR(s) method ・・・ 13-16 Lei Du, Tomohiro Sogabe and Shao-Liang Zhang

A new approach to find a saddle point efficiently based on the Davidson method ・・・ 17-20 Akitaka Sawamura

On rounding off quotas to the nearest integers in the problem of apportionment ・・・ 21-24 Tetsuo Ichimori

Traveling wave solutions to the nonlinear evolution equation for the risk preference ・・・ 25-28 Naoyuki Ishimura and Sakkakom Maneenop

Approximation algorithms for a winner determination problem of single-item multi-unit auctions ・・・ 29-32 Satoshi Takahashi and Maiko Shigeno

On the new family of wavelets interpolating to the Shannon wavelet ・・・ 33-36 Naohiro Fukuda and Tamotu Kinoshita

Conservative finite difference schemes for the modified Camassa-Holm equation ・・・ 37-40 Yuto Miyatake, Takayasu Matsuo and Daisuke Furihata

A multi-symplectic integration of the Ostrovsky equation ・・・ 41-44 Yuto Miyatake, Takaharu Yaguchi and Takayasu Matsuo

Solutions of Sakaki-Kakei equations of type 1, 2, 7 and 12 ・・・ 45-48 Koichi Kondo

Analysis of credit event impact with self-exciting intensity model ・・・ 49-52 Suguru Yamanaka, Masaaki Sugihara and Hidetoshi Nakagawa

On the reduction attack against the algebraic surface public-key cryptosystem (ASC04) ・・・ 53-56 Satoshi Harada, Yuichi Wada, Shigenori Uchiyama and Hiro-o Tokunaga

Deterministic volatility models and dynamics of option returns ・・・ 57-60 Takahiro Yamamoto and Koichi Miyazaki

Stochastic estimation method of eigenvalue density for nonlinear eigenvalue problem on the complex plane ・・・ 61-64 Yasuyuki Maeda, Yasunori Futamura and Tetsuya Sakurai

Computation of multipole moments from incomplete boundary data for magnetoencephalography inverse problem ・・・ 65-68 Hiroyuki Aoshika, Takaaki Nara, Kaoru Amano and Tsunehiro Takeda

An alternative implementation of the IDRstab method saving vector updates ・・・ 69-72 Kensuke Aihara, Kuniyoshi Abe and Emiko Ishiwata

Error analysis of H1 gradient method for topology optimization problems of continua ・・・ 73-76 Daisuke Murai and Hideyuki Azegami

Evolution of bivariate copulas in discrete processes ・・・ 77-80 Yasukazu Yoshizawa and Naoyuki Ishimura

On boundedness of the condition number of the coefficient matrices appearing in Sinc-Nyström methods for Fredholm integral equations of the second kind ・・・ 81-84 Tomoaki Okayama, Takayasu Matsuo and Masaaki Sugihara

A modified Calogero-Bogoyavlenskii-Schiff equation with variable coefficients and its non-isospectral Lax pair ・・・ 85-88 Tadashi Kobayashi and Kouichi Toda

A parallel algorithm for incremental orthogonalization based on the compact WY representation ・・・ 89-92 Yusaku Yamamoto and Yusuke Hirota

Analysis of downgrade risk in credit portfolios with self-exciting intensity model ・・・ 93-96 Suguru Yamanaka, Masaaki Sugihara and Hidetoshi Nakagawa

Automatic verification of anonymity of protocols ・・・ 97-100 Hideki Sakurada

JSIAM Letters Vol.3 (2011) pp.1–4 © 2011 Japan Society for Industrial and Applied Mathematics

Regular solution to topology optimization problems of continua

Hideyuki Azegami1, Satoshi Kaizu2 and Kenzen Takeuchi3

1 Graduate School of Information Science, Nagoya University, A4-2 (780) Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
2 College of Science and Technology, Nihon University, 7-24-1 Narashinodai, Funabashi, Chiba 274-8501, Japan
3 Quint Corporation, 1-14-1 Fuchu-cho, Fuchu, Tokyo 183-0055, Japan
E-mail azegami@is.nagoya-u.ac.jp
Received September 30, 2010, Accepted November 1, 2010

Abstract
The present paper describes a numerical solution to topology optimization problems of domains in which boundary value problems of partial differential equations are defined. Density raised to a power is used instead of the characteristic function of the domain. A design variable is set by a function on a fixed domain which is converted to the density by a sigmoidal function. Evaluations of derivatives of cost functions with respect to the design variable appear as stationary conditions of the Lagrangians. A numerical solution is constructed by a gradient method in a design space for the design variable.
Keywords calculus of variations, boundary value problem, topology optimization, density method, H1 gradient method
Research Activity Group Mathematical Design
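The density parameterization described in the abstract, an unconstrained design variable mapped into (0, 1) by a sigmoidal function and then penalized by a power, can be sketched numerically as follows. This is an illustrative sketch; the function names are ours, not the paper's.

```python
import numpy as np

def density(theta):
    """Sigmoidal map: phi(theta) = arctan(theta)/pi + 1/2, with range (0, 1)."""
    return np.arctan(theta) / np.pi + 0.5

def simp_coefficient(theta, alpha=2.0):
    """SIMP coefficient phi(theta)**alpha; alpha > 1 weakens intermediate densities."""
    return density(theta) ** alpha

# theta is unconstrained, while the density stays strictly between 0 and 1
theta = np.array([-1e6, 0.0, 1e6])
phi = density(theta)          # approximately [0, 0.5, 1]
```

The point of the map is that the gradient method can update theta freely in the design space without a box constraint on the density itself.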

1. Introduction

A problem of finding the optimum layout of holes in a domain in which a boundary value problem is defined is called the topology optimization problem of continua [1]. In the present paper, the Poisson problem is considered as the boundary value problem for simplicity.

One of the most natural expressions of a topology optimization problem uses the characteristic function of the domain as a design variable. Let D be a fixed domain in R^d, d ∈ {2, 3}, let Γ_D ⊂ ∂D be a fixed subboundary, define Γ_N = ∂D \ Γ̄_D, and let f, p and u_D be fixed functions on D. Denoting the characteristic function for Ω ⊆ D by χ_Ω ∈ X = {χ ∈ L^∞(D; R) | 0 ≤ χ ≤ 1 a.e. in D}, the normal by ν, and ∂_ν = ν·∇, we can write the topology optimization problem as follows.

Problem 1 (Topology optimization problem) For each χ_Ω ∈ X, let u ∈ H^1(D; R) satisfy

  −∇·(χ_Ω ∇u) = f in D,
  χ_Ω ∂_ν u = p on Γ_N,  u = u_D on Γ_D.

Find χ_Ω such that

  min_{χ_Ω ∈ X} {J^0(χ_Ω, u) | J(χ_Ω, u) ≤ 0},

where J^0 and J = (J^1, ..., J^m)^⊤, J^l ∈ C^0(X × H^1(D; R); R), are cost functions.

However, it has been shown that Problem 1 does not always have a solution [2].

To avoid the non-existence of a solution, the idea of assuming that D consists of a micro-structure having rectangular holes was presented [3]. In this formulation, χ_Ω is substituted by a function evaluated by homogenization theory. A numerical scheme was demonstrated using the finite element method [4].

Moreover, it has been found that introducing a density ϕ : D → [0, 1] and a constant α > 1, and replacing χ_Ω by ϕ^α, obtains a similar result to that from the micro-structure model. This method is called the SIMP (solid isotropic material with penalization) method [1, 5]. The meaning of the penalization is that the intermediate density is weakened by the nonlinear function ϕ^α.

However, numerical instabilities such as checkerboard patterns or mesh-dependencies are observed if the parameters of the micro-structure or the density are constructed by a constant function in each finite element and are varied using a gradient method [6, 7]. If the design parameters are approximated by continuous functions [8], it is known that a numerical instability, such as the so-called island phenomenon, is observed [9]. In addition, although many numerical schemes have been proposed to overcome such numerical instabilities [10, 11], regularity in the sense of functional analysis has not been shown.

In the present paper, a regular solution which is free of numerical instability is presented, where the meaning of regular is as follows. First, the admissible set of the design variable is defined. Then, a solution is regular if any point obtained by the solution from a point in the admissible set also belongs to the admissible set.

2. Admissible set of design variable

To define a boundary value problem, a Lipschitz boundary is required for the domain. Accordingly, to determine a boundary from a level set of the density ϕ, ϕ has to be an element of W^{1,∞}(D; R), where D also has a Lipschitz boundary. To avoid the restriction of the range of ϕ to [0, 1], we introduce a function θ belonging to

  S = {θ ∈ H^1(D; R) | θ ∈ W^{1,∞}(D; R), ∥θ∥_{1,∞} ≤ M}

as a design variable and relate it to the density ϕ by a sigmoidal function, for which

  ϕ(θ) = (1/π) tan^{−1} θ + 1/2  (1)

is used in the present paper. Because M is initially fixed, the set S is weakly compact in H^1(D; R). If ∥θ∥_{1,∞} = ∥θ∥_{W^{1,∞}(D;R)} ≤ M becomes active, this condition should be included among the constraints. In the present paper, let M be sufficiently large for simplicity.

To avoid loss of regularity on ∂Γ_D and on a set Υ ⊂ ∂D on which u ∉ H^{k+2}(D; R) and v^l ∉ H^{3−k}(D; R), l ∈ {0, 1, ..., m}, k ∈ {0, 1}, in Problems 2 and 6 respectively, we provide a fixed neighborhood U_r = {x ∈ D | |x − y| < r, y ∈ ∂Γ_D ∪ Υ} for a small positive constant r, and set D_r = D \ U_r.

We call S the admissible set of the design variable. We call H^1(D; R) the design space with respect to S because a Hilbert space is required for the gradient method.

3. SIMP problem

Let us consider a topology optimization problem of SIMP type by using θ ∈ S. First, we define a boundary value problem as follows.

Problem 2 (Poisson problem) For some k ∈ {0, 1}, let f ∈ H^k(D; R), p ∈ H^{k+1/2}(Γ_N; R) and u_D ∈ H^{k+2}(D; R) be fixed functions, and let ϕ(θ) be as in (1). Find u ∈ H^1(D; R) such that

  −∇·(ϕ^α(θ)∇u) = f in D,
  ϕ^α(θ)∂_ν u = p on Γ_N,  u = u_D on Γ_D.

From the assumptions for Problem 2, u|_{D_r} belongs to H^{k+2}(D_r; R). Moreover, Problem 2 gives the Lagrangian

  L_BV(θ, v, w) = ∫_D ϕ^α(θ)∇v·∇w dx − ∫_D f w dx − ∫_{Γ_N} w p dγ
      − ∫_{Γ_D} (v − u_D)ϕ^α(θ)∂_ν w dγ − ∫_{Γ_D} w ϕ^α(θ)∂_ν v dγ  (2)

for all v, w ∈ H^1(D; R) [12]. If u is a stationary point such that

  L_BV(θ, u, w) = 0

for all w ∈ H^1(D; R), then u is the solution to Problem 2.

Using θ and u, we define cost functions. Let us use the following notation: (·)_θ = ∂(·)/∂θ and (·)_u = ∂(·)/∂u.

Definition 3 (Cost functions) For (θ, u) ∈ S × H^1(D; R) = Y and S × H^1(D_r; R) = Y_r, let g^l ∈ C^1(Y; L^1(D; R)) and j^l ∈ C^1(Y; L^1(∂D; R)), l ∈ {0, 1, ..., m}, be given functions such that g^l_θ ∈ C^0(Y_r; H^1(D_r; R)), g^l_u ∈ C^0(Y; H^{1−k}(D; R)) for the k ∈ {0, 1} used in Problem 2, j^l_θ ∈ C^0(Y_r; H^{3/2}(∂D_r; R)) and j^l_u ∈ C^0(Y; H^{3/2−k}(∂D; R)). We call J^0 and J = (J^1, ..., J^m)^⊤,

  J^l(θ, u) = ∫_D g^l(θ, u) dx + ∫_{∂D} j^l(θ, u) dγ + c^l,

the cost functions, where J^0 is the objective function and J are the constraint functions.

We assume that the constants c^l, l ∈ {0, 1, ..., m}, are set such that some θ ∈ S satisfies J ≤ 0. Based on the definitions above, we consider a SIMP problem as follows.

Problem 4 (SIMP problem) Let u be the solution to Problem 2 for θ ∈ S. Find θ such that

  min_{θ ∈ S} {J^0(θ, u) | J(θ, u) ≤ 0}.

4. θ derivatives of J^l

To solve Problem 4 by a gradient method, the Fréchet derivatives of J^l with respect to θ are required. Let ρ ∈ H^1(D; R) be a variation of θ and denote

  θ^ρ = θ + ρ

as an updated function of θ. Also, let u^ρ be the solution to Problem 2 for θ^ρ.

Definition 5 (θ derivative of J^l) For J^l(θ, u(θ)) : H^1(D; R) ⊃ S ∋ θ ↦ J^l ∈ R, if J^l′(θ, u)[ρ] such that

  J^l(θ^ρ, u^ρ) = J^l(θ, u) + J^l′(θ, u)[ρ] + o(∥ρ∥_{1,2})

is a bounded linear functional for all ρ ∈ H^1(D; R), we call J^l′(θ, u) ∈ H^{1′}(D; R) the θ derivative of J^l at θ and, denoting J^l′(θ, u)[ρ] = ⟨G^l(θ, u), ρ⟩ with the notation of the dual product, G^l(θ, u) ∈ H^{1′}(D; R) the θ gradient.

Let us evaluate G^l(θ, u). The Lagrangian for J^l(θ, u) subject to Problem 2 is defined by

  L^l(θ, u, v^l) = ∫_D g^l(θ, u) dx + ∫_{∂D} j^l(θ, u) dγ + c^l − L_BV(θ, u, v^l),

where v^l ∈ H^1(D; R) is used as the Lagrange multiplier for Problem 2, and L_BV(·, ·, ·) is as in (2).

If u is the solution to Problem 2, the stationary condition such that L^l_{v^l}(θ, u, v^l)[w] = L_BV(θ, u, w) = 0 for all w ∈ H^1(D; R) is satisfied. The stationary condition such that

  L^l_u(θ, u, v^l)[w] = ⟨L^l_u(θ, u, v^l), w⟩
    = ∫_D g^l_u w dx + ∫_{∂D} j^l_u w dγ − ∫_D ϕ^α(θ)∇w·∇v^l dx
      + ∫_{Γ_D} ϕ^α(θ)w ∂_ν v^l dγ + ∫_{Γ_D} ϕ^α(θ)v^l ∂_ν w dγ = 0

for all w ∈ H^1(D; R) is satisfied if v^l ∈ H^1(D; R) is the solution of the following adjoint problem.

Problem 6 (Adjoint problem for J^l) For the solution u to Problem 2 at θ ∈ S, find v^l ∈ H^1(D; R) such that

  −∇·(ϕ^α(θ)∇v^l) = g^l_u(θ, u) in D,
  ϕ^α(θ)∂_ν v^l = j^l_u(θ, u) on Γ_N,  v^l = 0 on Γ_D.

Since g^l_u ∈ H^{1−k}(D; R) and j^l_u ∈ H^{3/2−k}(∂D; R) for k ∈ {0, 1} as in Problem 2, we have v^l|_{D_r} ∈ H^{3−k}(D_r; R).

If u and v^l are the solutions of Problems 2 and 6, respectively, for θ ∈ S, the θ derivative of L^l with respect to ρ ∈ H^1(D; R) is given by

  L^l′(θ, u, v^l)[ρ] = L^l_θ(θ, u, v^l)[ρ] = ⟨G^l, ρ⟩ = ∫_D (G^l_g + G^l_a)ρ dx + ∫_{∂D} G^l_j ρ dγ  (3)

and agrees with J^l′(θ, u)[ρ], where

  G^l_g(θ, u) = g^l_θ,  G^l_j(θ, u) = j^l_θ,
  G^l_a(θ, u, v^l) = −αϕ^{α−1}ϕ_θ ∇u·∇v^l.

Therefore, we have the following result.

Theorem 7 (θ derivative of J^l) For the solutions u and v^l of Problems 2 and 6, respectively, for θ ∈ S,

  J^l′(θ, u)[ρ] = ⟨G^l, ρ⟩

holds for all ρ ∈ H^1(D; R), where G^l|_{D_r}, G^l_g|_{D_r}, G^l_a|_{D_r} and G^l_j|_{∂D_r} of (3) belong to H^{1′}(D_r; R), H^1(D_r; R), H^1(D_r; R) and H^{3/2}(∂D_r; R), respectively.

5. H^1 gradient method

Since G^l belongs to the dual space H^{1′}(D_r; R) of H^1(D_r; R), ⟨G^l, ρ⟩ is well defined in D_r. However, θ^{ϵG} = θ + ϵG^l for a small ϵ > 0 does not belong to the admissible set S. This is considered to be the cause of the numerical instabilities discussed in the Introduction.

To avoid this irregularity, we propose using an H^1 gradient method, which is an application of the traction method [13–15] to the SIMP problem, to determine a variation ρ^l_G ∈ H^1(D; R) from θ ∈ S with Ḡ^l, which is an extension of G^l|_{D_r} to H^1(D; R).

Problem 8 (H^1 gradient method) Let a : H^1(D; R) × H^1(D; R) → R be a coercive bilinear form such that there exists β > 0 that satisfies

  a(y, y) ≥ β∥y∥²_{1,2}

for all y ∈ H^1(D; R). For G^l as in (3), find ρ^l_G ∈ H^1(D; R) such that

  a(ρ^l_G, y) = −⟨Ḡ^l, y⟩

for all y ∈ H^1(D; R).

By the Lax–Milgram theorem, there exists a unique solution ρ^l_G to Problem 8. From Theorem 7, it is guaranteed that ρ^l_G|_{D_r} belongs to H^3(D_r; R) ⊂ W^{1,∞}(D_r; R) and that an extension ρ̄^l_G of ρ^l_G|_{D_r} belongs to W^{1,∞}(D; R). Moreover, since

  J^l(θ^{ϵρ̄^l_G}, u^{ϵρ̄^l_G}) − J^l(θ, u) = ⟨G^l, ϵρ̄^l_G⟩ + o(ϵ∥ρ̄^l_G∥_{1,2})
    ≤ −ϵ a(ρ̄^l_G, ρ̄^l_G) + o(ϵ∥ρ̄^l_G∥_{1,2})
    ≤ −ϵβ∥ρ̄^l_G∥²_{1,2} + o(ϵ∥ρ̄^l_G∥_{1,2}) < 0

for a sufficiently small positive number ϵ, ρ̄^l_G is a regular vector toward a descent direction of J^l.

In the present paper, we use

  a(y, z) = ∫_D (∇y·∇z + cyz) dx  (4)

as the coercive bilinear form in Problem 8, where c is a positive constant.

6. Solution to SIMP problem

Let us consider a solution to Problem 4 by using a sequential quadratic approximation problem.

Problem 9 (SQ approximation) Let G^0 and G = (G^1, ..., G^m)^⊤ be the θ derivatives of J^0 and J, respectively, for a θ ∈ S, let a(·, ·) be given as in (4), and let ϵ be a small positive constant. Find ϵρ such that

  min_{ρ ∈ B} {Q(ϵρ) | J(θ, u) + ⟨G, ϵρ⟩ ≤ 0},

where B = {ρ ∈ H^1(D; R) | ∥ρ∥_{1,2} = 1} and

  Q(ϵρ) = (1/(2ϵ)) a(ϵρ, ϵρ) + ⟨G^0, ϵρ⟩.

The Lagrangian of Problem 9 is defined as

  L_SQ(ϵρ, λ) = Q(ϵρ) + λ·(J(θ, u) + ⟨G, ϵρ⟩),

where λ = (λ^1, ..., λ^m)^⊤ ∈ R^m are the Lagrange multipliers for the constraints. The Karush–Kuhn–Tucker conditions for Problem 9 are given as

  (1/ϵ) a(ϵρ, y) + ⟨G^0 + λ·G, y⟩ = 0,  (5)
  J(θ, u) + ⟨G, ϵρ⟩ ≤ 0,  (6)
  diag(λ)(J(θ, u) + ⟨G, ϵρ⟩) = 0,  (7)
  λ ≥ 0,  (8)

for all y ∈ H^1(D; R).

Here, let ρ^0_G and ρ_G = (ρ^1_G, ..., ρ^m_G)^⊤ be the solutions to Problem 8 using a(·, ·)/ϵ instead of a(·, ·), and set

  ρ = (ρ^0_G + λ·ρ_G)/∥ρ^0_G + λ·ρ_G∥_{1,2}.  (9)

Then it is confirmed that ϵρ = ρ^0_G + λ·ρ_G satisfies (5). If all the constraints in (6) are active, we have

  ⟨G, ρ^⊤_G⟩λ = −J(θ, u) − ⟨G, ρ^0_G⟩.  (10)

If G^1, ..., G^m are linearly independent, (10) has a unique solution λ. Using this λ, if there are inactive constraints l such that J^l(θ, u) + ⟨G^l, ϵρ⟩ < 0 or λ^l < 0, we remove those constraints from (10), put λ^l = 0, and re-solve (10). Then we can obtain a λ which satisfies (6) to (8). Since Problem 9 is a convex problem, this λ is the unique solution to Problem 9.

To ensure global convergence, we use the following criteria for ϵ in Problem 9. Let L(θ, u, λ) = J^0(θ, u) + λ·J(θ, u) be the Lagrangian for Problem 4, and let λ^{ϵρ} be the λ for (θ^{ϵρ}, u^{ϵρ}) that satisfies the Karush–Kuhn–Tucker conditions. For a constant ξ ∈ (0, 1), the Armijo criterion [16] gives the upper limit of ϵ as

  L(θ^{ϵρ}, u^{ϵρ}, λ^{ϵρ}) − L(θ, u, λ) ≤ ξ⟨G^0(u, v^0) + λ·G(u, v), ϵρ⟩.  (11)

For a constant µ such that 0 < ξ < µ < 1, the Wolfe criterion [17] gives the lower limit of ϵ as

  µ⟨G^0(u, v^0) + λ·G(u, v), ϵρ⟩ ≤ ⟨G^0(u^{ϵρ}, v^{0ϵρ}) + λ^{ϵρ}·G(u^{ϵρ}, v^{ϵρ}), ϵρ⟩.  (12)

We propose a numerical solution as follows. Let J(θ^0, u^0) ≤ 0 be satisfied for θ^0 in the following.

(i) Set θ^0 ∈ S, ξ and µ such that 0 < ξ < µ < 1, ϵ > 0, ϵ_0 > 0 and k = 0.
(ii) Compute J^0, J, G^0 and G at θ^0.
(iii) Solve ρ^0_G = ρ^{0k}_G and ρ_G = ρ^k_G in Problem 8.
(iv) Solve λ in

  ⟨G, ρ^⊤_G⟩λ = −⟨G, ρ^0_G⟩.  (13)

  • If (8) is satisfied, proceed to the next step.
  • Otherwise, remove the constraints such that λ^l < 0, put λ^l = 0 and re-solve (13) until (8) is satisfied.
(v) Using ρ defined by (9), compute J^0 and J at θ^{ϵρ}.
  • Put λ^l = 0 for the inactive constraints such that J^l(θ^{ϵρ}, u^{ϵρ}) < 0.
  • If J(θ^{ϵρ}, u^{ϵρ}) ≤ 0, proceed to the next step.
  • Otherwise, set λ^0 = λ and i = 0, solve δλ in

  ⟨G, ϵρ^⊤⟩δλ = −J(θ^{ϵρ(λ^i)}, u^{ϵρ(λ^i)})  (14)

  for the active constraints such that J^l(θ^{ϵρ}, u^{ϵρ}) ≥ 0, replace λ^{i+1} = λ^i + δλ and i + 1 with i, and re-solve (14) until J(θ^{ϵρ}, u^{ϵρ}) ≤ 0 is satisfied.
(vi) Compute G^0 and G at θ^{ϵρ}.
  • If (11) and (12) hold, proceed to the next step.
  • If (11) or (12) does not hold, update ϵ with a smaller or larger value, respectively, and return to (v).
(vii) Let θ^{k+1} = θ^{ϵρ}, λ^{k+1} = λ, and judge the terminal condition by ∥θ^{k+1} − θ^k∥_{1,∞} ≤ ϵ_0.
  • If the condition holds, terminate the algorithm.
  • Otherwise, replace k + 1 with k and return to (iii).

7. Numerical example

A SIMP problem for a three-dimensional linear elastic continuum is solved by the method shown above. Let p be a traction force and u be a displacement, and set u_D = 0. A mean compliance J^0(θ, u) = ∫_{Γ_N} p·u dγ and a mass J^1(θ) = ∫_D (ϕ(θ) − 0.4) dx are used as cost functions. We have G^0_g = G^0_j = 0 and G^0_a = −αϕ^{α−1}ϕ_θ σ(u)·ε(u) for J^0, and G^1_g = ϕ_θ with the other terms zero for J^1, where σ(u) and ε(u) denote the stress and the strain. We use α = 2 and c = 1/(10L)^2 in (4) for the width L of D. The finite element model consists of 120 × 160 × 1 eight-node brick elements with three nonconforming modes and a bubble mode. Fig. 1 shows the density obtained by the present method. We did not encounter any numerical instability.

Fig. 1. Converged density (right) for the mean compliance minimization problem with mass constraint for a linear elastic cantilever problem (left).

Acknowledgments

The present study was supported by JSPS KAKENHI (20540113).

References

[1] M. P. Bendsøe, Optimization of Structural Topology, Shape, and Material, Springer-Verlag, Berlin, 1995.
[2] F. Murat, Contre-exemples pour divers problèmes où le contrôle intervient dans les coefficients, Ann. Mat. Pura ed Appl., Serie 4, 112 (1977), 49–68.
[3] M. P. Bendsøe and N. Kikuchi, Generating optimal topologies in structural design using a homogenization method, Comput. Meths. Appl. Mech. Engrg., 71 (1988), 197–224.
[4] K. Suzuki and N. Kikuchi, A homogenization method for shape and topology optimization, Comput. Meths. Appl. Mech. Engrg., 93 (1991), 291–318.
[5] G. I. N. Rozvany, M. Zhou and T. Birker, Generalized shape optimization without homogenization, Struct. Optim., 4 (1992), 250–254.
[6] A. R. Diaz and O. Sigmund, Checkerboard patterns in layout optimization, Struct. Optim., 10 (1995), 40–45.
[7] O. Sigmund and J. Petersson, Numerical instabilities in topology optimization: a survey on procedures dealing with checkerboards, mesh-dependencies and local minima, Struct. Optim., 16 (1998), 68–75.
[8] K. Matsui and K. Terada, Continuous approximation of material distribution for topology optimization, Int. J. Numer. Meth. Engng., 59 (2004), 1925–1944.
[9] S. F. Rahmatalla and C. C. Swan, A Q4/Q4 continuum structural topology optimization implementation, Struct. Multidisc. Optim., 27 (2004), 130–135.
[10] J. Petersson and O. Sigmund, Slope constrained topology optimization, Int. J. Numer. Meth. Engng., 41 (1998), 1417–1434.
[11] G.-W. Jang, J. H. Jeong, Y. Y. Kim, D. Sheen, C. Park and M.-N. Kim, Checkerboard-free topology optimization using non-conforming finite elements, Int. J. Numer. Meth. Engng., 57 (2003), 1717–1735.
[12] G. Allaire, F. Jouve and A. M. Toader, Structural optimization using sensitivity analysis and a level-set method, J. Comput. Phys., 194 (2004), 363–393.
[13] H. Azegami, Solution to domain optimization problems (in Japanese), Trans. JSME, Ser. A, 60 (1994), 1479–1486.
[14] H. Azegami and K. Takeuchi, A smoothing method for shape optimization: traction method using the Robin condition, Int. J. Comput. Methods, 3 (2006), 21–33.
[15] S. Kaizu and H. Azegami, Optimal shape problems and traction method (in Japanese), Trans. JSIAM, 16 (2006), 277–290.
[16] L. Armijo, Minimization of functions having Lipschitz continuous first partial derivatives, Pacific J. Math., 16 (1966), 1–3.
[17] P. Wolfe, Convergence conditions for ascent methods, SIAM Review, 11 (1969), 226–235.
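The active-set multiplier update used in steps (iv) and (v) of the algorithm in Section 6 can be illustrated with a finite-dimensional toy version. All names and data below are ours: the matrix GrG stands in for the inner products ⟨G^l, ρ^k_G⟩ (which in the paper are H^1 dual products), and rhs stands in for the right-hand side of (13).

```python
import numpy as np

def solve_multipliers(GrG, rhs):
    """Active-set solve of a system like (13): GrG @ lam = rhs with lam >= 0.

    Constraints whose multiplier comes out negative are removed and the
    reduced system is re-solved, mirroring step (iv) of the algorithm.
    """
    m = len(rhs)
    active = list(range(m))
    lam = np.zeros(m)
    while active:
        lam_a = np.linalg.solve(GrG[np.ix_(active, active)], rhs[active])
        if np.all(lam_a >= 0):
            lam = np.zeros(m)
            lam[active] = lam_a
            return lam
        # drop the constraints with negative multipliers and re-solve
        active = [a for a, v in zip(active, lam_a) if v >= 0]
    return np.zeros(m)

GrG = np.array([[2.0, 0.0], [0.0, 1.0]])
rhs = np.array([4.0, -3.0])        # the second constraint would get lam < 0
lam = solve_multipliers(GrG, rhs)  # -> [2.0, 0.0]
```

Because each pass either terminates or strictly shrinks the active set, the loop finishes in at most m solves, which matches the removal-and-re-solve description in the text.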

JSIAM Letters Vol.3 (2011) pp.5–8 © 2011 Japan Society for Industrial and Applied Mathematics

A convergence improvement of the BSAIC preconditioner by deflation

Ikuro Yamazaki1, Hiroto Tadano1, Tetsuya Sakurai1 and Keita Teranishi2

1 Graduate School of Systems and Information Engineering, University of Tsukuba, 1-1-1 Tennodai, Tsukuba-shi, Ibaraki 305-8573, Japan
2 Cray, Inc., 380 Jackson St. Suite 210, St Paul, MN 55101, USA
E-mail yamazaki@mma.cs.tsukuba.ac.jp
Received May 31, 2010, Accepted September 16, 2010
Abstract
We have proposed a block sparse approximate inverse with cutoff (BSAIC) preconditioner for relatively dense matrices. The BSAIC preconditioner is effective for semi-sparse matrices which have a relatively large number of nonzero elements. This method reduces the computational cost for generating the preconditioning matrix, and overcomes the performance bottlenecks of SAI using the blocked version of Frobenius norm minimization and the drop-threshold scheme (cutoff) for semi-sparse matrices. However, a larger cutoff parameter leads to a less effective preconditioning matrix with a large number of iterations. We analyze this convergence deterioration in terms of eigenvalues, and describe a deflation-type method which improves the convergence.
Keywords linear system, preconditioning, sparse approximate inverse, deflation
Research Activity Group Algorithms for Matrix / Eigenvalue Problems and their Applications
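The cutoff and the blocked Frobenius-norm minimization mentioned in the abstract can be sketched in a dense toy form. This is an illustrative sketch with our own function names and example matrix; the actual BSAIC preconditioner additionally restricts each block to a sparsity pattern and works with genuinely sparse storage.

```python
import numpy as np

def cutoff(A, theta):
    """Keep a_ij when |a_ij| > theta or i == j, zero otherwise."""
    keep = (np.abs(A) > theta) | np.eye(A.shape[0], dtype=bool)
    return np.where(keep, A, 0.0)

def block_sai(Ac, l):
    """For each block E_k of the identity, minimize ||Ac M_k - E_k||_F.

    Dense toy version of the blocked Frobenius-norm minimization; the
    blocks M_k are concatenated into the preconditioning matrix M.
    """
    n = Ac.shape[0]
    blocks = []
    for k in range(0, n, l):
        Ek = np.eye(n)[:, k:k + l]
        Mk, *_ = np.linalg.lstsq(Ac, Ek, rcond=None)
        blocks.append(Mk)
    return np.hstack(blocks)

A = np.array([[4.0, 1.0, 0.01],
              [1.0, 3.0, 0.0],
              [0.02, 0.0, 2.0]])
Ac = cutoff(A, 0.1)      # drops the small 0.01 and 0.02 entries
M = block_sai(Ac, l=2)   # A @ M is then close to the identity
```

Each block's least squares problem is independent of the others, which is where the parallelism of SAI-type preconditioners comes from.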

1. Introduction

Linear systems

  Ax = b,

where A ∈ C^{n×n} is a semi-sparse matrix which is relatively dense, appear in nano-simulations. A sparse approximate inverse (SAI) technique has been proposed as a parallel preconditioner for sparse matrices [1]. This preconditioner has good parallel performance. However, the arithmetic costs of constructing the preconditioning matrix grow cubically with the number of nonzero entries per row. We have proposed a block sparse approximate inverse with cutoff (BSAIC) preconditioner [2] for such semi-sparse linear systems.

The BSAIC preconditioner can reduce the computational cost for constructing the approximate inverse matrix, and overcomes the performance bottlenecks of SAI by using a blocked version of Frobenius norm minimization and a cutoff strategy for semi-sparse matrices. A large cutoff parameter further decreases the cost of constructing the approximate inverse matrix, so we want to use as large a cutoff parameter as possible. However, the convergence of Krylov subspace methods preconditioned with BSAIC deteriorates when the cutoff parameter is large. In this paper, this deterioration of convergence is investigated in terms of eigenvalues, and a method for improving the convergence is presented.

This paper is organized as follows. In Section 2, our method, the BSAIC preconditioner, is described. In Section 3, we describe the convergence deterioration caused by large cutoff parameters, how to improve it, and the algorithms of the method. In Section 4, the BSAIC preconditioner combined with the improving method is verified by numerical experiments, followed by the concluding remarks in Section 5.

2. Block SAI with cutoff (BSAIC)

We describe the block SAI with cutoff (BSAIC) preconditioner. In the BSAIC preconditioner, the cutoff is applied to the coefficient matrix A in order to reduce the computational cost of the least squares problems which appear in block SAI. Firstly, the approximate coefficient matrix A_c is generated by the following cutoff:

  A_c = [ã_ij],  ã_ij = a_ij if |a_ij| > θ or i = j, and ã_ij = 0 otherwise,  (1)

where θ is a nonnegative real value. After applying the cutoff, the least squares problems with the approximate matrix A_c,

  min_M ∥A_c M − I∥²_F ≈ Σ_{k=1}^{L} min_{M_k} ∥A_c M_k − E_k∥²_F,  (2)

where l is the block size, L = ⌈n/l⌉ and E_k is a submatrix of the identity matrix I such that I = [E_1, E_2, ..., E_L], are solved. The matrix M = [M_1, M_2, ..., M_L] is employed as the preconditioning matrix. The initial sparsity pattern M_0 of the preconditioning matrix is decided by

  spy(M_0) = spy(A_c),  (3)

where "spy" denotes the sparsity pattern of a matrix.

We overcome a performance bottleneck by using a blocked version of SAI with drop-threshold schemes to

reduce the computational cost for constructing the approximate inverse matrix and to improve the convergence of Krylov subspace methods. However, a larger value of θ leads to a less effective preconditioning matrix with a large number of iterations, although the value of θ is preferred to be as large as possible. In the next section, we describe this convergence deterioration and the improvement method.

3. Convergence improvement by deflation

We consider solving the preconditioned linear systems (AM)(M^{−1}x) = b by Krylov subspace methods, and investigate the eigenvalue distribution of AM. The block size l is fixed and the cutoff parameter θ is varied in BSAIC. The preconditioning matrix M approximates the inverse of the matrix A, and AM is nearly equal to the identity matrix I when M is a good approximation to A^{−1}. The eigenvalues of AM are then clustered around 1.

In the restarted GMRES (GMRES(m)) [3] method, the information concerning the eigenvalues around the origin is discarded at the restart. These small eigenvalues often slow the convergence. As GMRES iterations are performed, deflation-type schemes (e.g. GMRES-IR [4] and GMRES-DR [5]) calculate small approximate eigenvalues and the corresponding eigenvectors. These eigenvectors are added to the Krylov space in a bid to speed up convergence. The implicitly restarted GMRES (GMRES-IR) method [4] proposed by Morgan is employed in Section 4.

In the GMRES-IR(m, k) method, we compute the eigenpairs of the eigenvalue problem from an Arnoldi process of length m. We then apply the implicitly restarted Arnoldi (IRA) method [6] with the unwanted harmonic Ritz values [7] as shifts. The IRA method filters a chosen harmonic Ritz value away from the Arnoldi process. Here, small harmonic Ritz values are chosen, and the k small eigenvalues near the origin can be deflated. Therefore, the convergence will be improved by this deflation. Fig. 1 shows the algorithm of GMRES-IR. Our experiments in Section 4 indicate the validity of the GMRES-IR method preconditioned with BSAIC.

Algorithm GMRES-IR(m, k) method
1: Compute p = m − k and r_0 = b − Ax_0
2: Compute β = ∥r_0∥_2 and v_1 = r_0/β
3: Compute V_{m+1}, H̄_m with the Arnoldi method
4: Compute y, the minimizer of ∥V^⊤_{m+1} r_0 − H̄_m y∥_2, and x_m = x_0 + V_m y
5: If satisfied, stop; else proceed
6: Compute the harmonic Ritz values θ̃_1, ..., θ̃_m
7: Sort |θ̃_1| ≥ ··· ≥ |θ̃_m|
8: Set the shifts θ̃_1, ..., θ̃_p
9: Update V_{k+1} and H̄_k with the IRA method
10: Go to 3, and resume the Arnoldi method from step k + 1
Fig. 1. Algorithm of the GMRES-IR(m, k) method.

Fig. 2. The computational time of GMRES(50) with BSAIC corresponding to θ for EGF (preconditioning time and iteration time, both on log scales).

4. Numerical experiments

In this section, firstly, the performance of the Krylov subspace method preconditioned with the BSAIC preconditioner corresponding to θ is verified. Secondly, we analyze the convergence deterioration caused by a larger value of θ and apply the improvement strategy to the BSAIC preconditioner. All experiments are carried out with MATLAB 7.4 on a MacBook (CPU: Intel Core 2 Duo 2.26 GHz, Memory: 4.0 Gbytes, OS: Mac OS 10.6.3). The test problems are solved by the preconditioned GMRES(50) method. The stopping criterion for the relative residual is 10^{−10}. The initial guess x_0 is set to 0 and all elements of b are set to 1. The notation #MVs means the number of matrix-vector products, and the dagger (†) means that the stopping criterion is not satisfied within 5,000 MVs. The test matrix is derived from the computation of the molecular orbitals of an epidermal growth factor (EGF). The size of A is 4,505 and the number of nonzero elements is 5,254,215 (25.89%). In this example, the block size l of BSAIC is set to 30.

The computational time of GMRES(50) preconditioned with BSAIC corresponding to θ for EGF is reported in Fig. 2. Our BSAIC preconditioner can solve this problem faster than SAI and block SAI. However, as Fig. 2 indicates, GMRES(50) preconditioned with BSAIC does not converge when θ is larger than 10^{−5}. Fig. 2 also shows that the cutoff parameter θ is preferred to be as large as possible (e.g. θ = 10^{−3}) for the preconditioning time. We investigate the slowdown of the convergence in terms of eigenvalue distributions of AM.

Figs. 3(a)–3(e) show the eigenvalue distributions of AM corresponding to θ = 10^{−6}, 10^{−5}, ..., 10^{−2}, respectively. Fig. 3(f) shows the eigenvalue distribution of A. The red line in Fig. 3 denotes a zero eigenvalue. In Figs. 3(a)–3(e), the eigenvalues of AM cluster around 0 as θ becomes larger. The eigenvalue distribution of A in Fig. 3(f) is more expanded and more clustered around 0 than that of AM. It is predicted that the coefficient matrix A is ill-conditioned. This clustering of eigenvalues is one of the key reasons for the convergence deterioration. Therefore, we apply the BSAIC preconditioner and GMRES-IR, which deflates the smallest eigenvalues, to these linear equations, and improve the convergence of Krylov subspace methods.

Table 1 shows the results of GMRES-IR without a preconditioner and of GMRES-IR preconditioned with ILU(0) [8] and ILUT [8]. The ε in Table 1 denotes the threshold of ILUT. The GMRES-IR method does not converge ex-

– 6 – JSIAM Letters Vol. 3 (2011) pp.5–8 Ikuro Yamazaki et al.

0.25 0.25 0.25 0.2 0.2 0.2 0.15 0.15 0.15 0.1 0.1 0.1 0.05 0.05 0.05 0 0 0 −0.05 −0.05 −0.05 −0.1 −0.1 −0.1 −0.15 −0.15 −0.15 −0.2 −0.2 −0.2 −0.25 −0.25 −0.25 −0.20 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 −0.20 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 −0.20 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 (a) θ = 1.0 × 10−6. (b) θ = 1.0 × 10−5. (c) θ = 1.0 × 10−4.

0.25 0.25 0.25 0.2 0.2 0.2 0.15 0.15 0.15 0.1 0.1 0.1 0.05 0.05 0.05 0 0 0 −0.05 −0.05 −0.05 −0.1 −0.1 −0.1 −0.15 −0.15 −0.15 −0.2 −0.2 −0.2 −0.25 −0.25 −0.25 −0.20 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 −0.20 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 −0.20 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 (d) θ = 1.0 × 10−3. (e) θ = 1.0 × 10−2. (f) Eigenvalue distribution of A. Fig. 3. Eigenvalue distributions of AM and A for EGF.

Table 1. Results of preconditioned GMRES-IR(50, 25) for EGF. 10 4 10 2 Iteration time [sec] (log) Wall clock time [sec] Preconditioner #MVs Preconditioning time Precond. Iter. Total None † — — — 10 3 ILU(0) † 11.04 — — 1 ILUT(ε = 10−2) † 20.51 — — 10 −3 ILUT(ε = 10 ) 74 33.49 16.76 50.25 10 2

Table 2. Results of BiCGSTAB, GMRES(50) and GMRES-IR Iteration time 0 × −3 1 10 preconditioned with BSAIC (l = 30, θ = 1.0 10 ) for EGF. (log) [sec] time Preconditioning 10 10 −7 10 −6 10 −5 10 −4 10 −3 10 −2 Wall clock time [sec] Cutoff parameter θ Krylov #MVs Cutoff Precond. Iter. Total Fig. 4. The computational time of GMRES-IR(50, 25) with BiCGSTAB † 0.59 36.89 — — BSAIC corresponding to θ for EGF. GMRES(50) † 0.59 36.89 — — IR(50, 5) † 0.59 36.89 — — IR(50, 10) 331 0.59 36.89 15.91 53.39 × IR(50, 15) 262 0.59 36.89 12.55 50.03 Tables 2 and 3 show the results for EGF with θ = 1.0 −3 −3 IR(50, 20) 233 0.59 36.89 10.90 48.38 10 and 5.0 × 10 , respectively. When BiCGSTAB IR(50, 25) 226 0.59 36.89 10.83 48.30 [9] and GMRES(50) are used, the stopping criterion is not satisfied in both Tables 2 and 3. In Table 2, Table 3. Results of BiCGSTAB, GMRES(50) and GMRES-IR the GMRES-IR method preconditioned with BSAIC preconditioned with BSAIC (l = 30, θ = 5.0 × 10−3) for EGF. converges except GMRES-IR(50, 5). As a result, the Wall clock time [sec] GMRES-IR(50, 25) method converges faster than other Krylov #MVs Cutoff Precond. Iter. Total Krylov subspace methods. Table 3 shows that each of BiCGSTAB † 0.56 17.09 — — GMRES-IR(50, 20) and GMRES-IR(50, 25) converges, GMRES(50) † 0.56 17.09 — — and GMRES-IR(50, 25) converges faster than any other † IR(50, 5) 0.56 17.09 — — method. Fig. 4 shows that a larger value of θ can be ap- IR(50, 10) † 0.56 17.09 — — IR(50, 15) † 0.56 17.09 — — plied by using GMRES-IR. The convergence is depen- IR(50, 20) 351 0.56 17.09 15.15 32.80 dent not only on the cutoff parameter θ but also on the IR(50, 25) 329 0.56 17.09 15.85 32.50 restart value m and the number of deflated eigenvalues k. Thus, we need to set an appropriate m and k. Morgan also mentioned that the choice of m and k changes the cept ILUT(ε = 10−3). GMRES-IR preconditioned with convergence in [4]. ILUT(ε = 10−3) has good convergence. 
However, ILUT Tables 4 and 5 show the real part of harmonic Ritz does not have good parallel efficiency such as SAI. values of GMRES-IR(50, 25) and the real part of small eigenvalues of AM, respectively. In Table 4, the param-

– 7 – JSIAM Letters Vol. 3 (2011) pp.5–8 Ikuro Yamazaki et al.

Table 4. The harmonic Ritz values of GMRES-IR(50, 25) and of AM to be expanded and clustered around 0. This the eigenvalues of AM (l = 30, θ = 1.0 × 10−3). Underlines cluster of small eigenvalues makes the convergence slow, indicate the correct digits. and thus the deflation-type Krylov subspace methods Re(H. R.) Re(eig(AM)) improve the convergence. λ1 0.000001191238635 0.000001191238641 In future work, we will try to find an automatic pro- λ 0.000225851028469 0.000225851028462 2 cedure for selecting the cutoff parameter θ, the restart λ3 −0.000307394803250 −0.000307394803254 λ4 0.002204006095033 0.002204006095033 count m and the number of small eigenvalue k. We also λ5 −0.003517837137617 −0.003517837137617 apply for large scale problems.

Table 5. The harmonic Ritz values of GMRES-IR(50, 25) and Acknowledgments × −3 the eigenvalues of AM (l = 30, θ = 5.0 10 ). Underlines This research was supported in part by a Grant-in-Aid indicate the correct digits. for Scientific Research of Ministry of Education, Cul- Re(H. R.) Re(eig(AM)) ture, Sports, Science and Technology, Japan (Grant Nos. λ −0.000072402852854 −0.000072402852260 1 21246018 and 21105502). λ2 0.000197498551705 0.000197498552292 λ3 −0.001027046005229 −0.001027046005286 λ4 0.002560480805997 0.002560480809970 References λ5 0.002632881319376 0.002632881318118 [1] E. Chow and Y. Saad, Approximate inverse preconditioners via sparse-sparse iterations, SIAM J. Sci. Comput., 19 (1998), Table 6. The number of eigenvalues of AM around 0 (l = 30). 995–1023. θ #(|d| < 10−1) #(|d| < 10−2) #(|d| < 10−3) [2] I. Yamazaki, M. Okada, H. Tadano, T. Sakurai and K. Teran- 1.0 × 10−6 6 2 0 ishi, A block sparse approximate inverse with cutoff precondi- 1.0 × 10−5 10 3 0 tioner for semi-sparse linear systems derived from Molecular 1.0 × 10−4 18 6 2 Orbital calculations, JSIAM Letters, 2 (2010), 41–44. 1.0 × 10−3 26 9 3 [3] Y. Saad and M. H. Schultz, GMRES: a generalized minimal 5.0 × 10−3 39 11 2 residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), 856–869. [4] R. B. Morgan, Implicitly restarted GMRES and Arnoldi methods for nonsymmetric systems of equations, SIAM J. eters of BSAIC are set at l = 30 and θ = 1.0 × 10−3. Matrix Anal. Appl., 21 (2000), 1112–1135. In Table 5, the parameters of BSAIC are set at l = 30 [5] R. B. Morgan, GMRES with deflated restarting, SIAM J. Sci. × −3 Comput., 24 (2002), 20–37. and θ = 5.0 10 . “Re” and “H.R.” denote a real part [6] D. C. Sorensen, Implicit application of polynomial filters in and a harmonic Ritz value, respectively. The MATLAB a k-step Arnoldi method, SIAM J. Matrix Anal. Appl., 13 command eig is used to calculate the eigenvalues of AM. (1992), 357–385. 
Both Tables 4 and 5 show that the harmonic Ritz values [7] C. C. Paige, B. N. Parlett and H. A. Van der Vorst, Approxi- approximate the eigenvalues of AM well. Hence, small mate solution and eigenvalue bounds from Krylov subspaces, Numer. Lin. Alg. Appl., 2 (1995), 115–133. eigenvalues of AM are deflated, and Tables 2 and 3 also [8] Y. Saad, Iterative methods for sparse linear systems, SIAM, show that the GMRES-IR method improves convergence Philadelphia, 2003. more than any other Krylov subspace method. [9] H. A. van der Vorst, BiCGSTAB: a fast and smoothly converg- The number of eigenvalues of AM around 0 corre- ing variant of Bi-CG for the solution of nonsymmetric linear sponding to θ is reported in Table 6. The block size l is systems, SIAM J. Sci. Stat. Comput., 13 (1992), 631–644. fixed at 30. #(|d| < value) in Table 6 denotes the num- ber of absolute eigenvalues which are less than value. When θ = 1.0 × 10−6 and 1.0 × 10−5 are used, #(|d| < 10−3) is zero and the GMRES(50) method with BSAIC converges in Fig. 2. However, when θ which is larger than 10−5 is used, #(|d| < 10−3) is not zero and GMRES(50) with BSAIC does not converge in Fig. 2. Thus, a larger value of θ increases the number of eigenvalue of AM around 0 and eventually deteriorates the convergence of Krylov subspace methods.

5. Conclusions We proposed a method to improve the convergence of the BSAIC preconditioner using the deflation of small eigenvalues. Our BSAIC preconditioner reduces the con- structing cost of the approximate inverse M for semi- sparse matrices. However, a larger value of cutoff param- eter θ increases iteration counts and makes convergence difficult. We investigate this convergence deterioration with respect to eigenvalue distributions of AM. As a re- sult, a larger value of θ leads the eigenvalue distribution
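The deflation above hinges on step 6 of Fig. 1, the harmonic Ritz computation. As a concrete illustration, the following NumPy sketch (not the authors' MATLAB code; the test matrix and sizes are made up for the check) runs a plain Arnoldi process and extracts the harmonic Ritz values from the resulting Hessenberg matrix via the shifted formula of [7]:

```python
import numpy as np

def arnoldi(A, r0, m):
    """m steps of Arnoldi with modified Gram-Schmidt:
    A V[:, :m] = V[:, :m+1] Hbar (steps 2-3 of Fig. 1)."""
    n = len(r0)
    V = np.zeros((n, m + 1))
    Hbar = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            Hbar[i, j] = V[:, i] @ w
            w -= Hbar[i, j] * V[:, i]
        Hbar[j + 1, j] = np.linalg.norm(w)
        if Hbar[j + 1, j] > 1e-10:          # guard against breakdown
            V[:, j + 1] = w / Hbar[j + 1, j]
    return V, Hbar

def harmonic_ritz(Hbar):
    """Harmonic Ritz values (step 6 of Fig. 1): eigenvalues of
    H + h_{m+1,m}^2 f e_m^T with f = H^{-H} e_m (see [7])."""
    m = Hbar.shape[1]
    H = Hbar[:m, :]
    e_m = np.eye(m)[:, -1]
    f = np.linalg.solve(H.conj().T, e_m)
    return np.linalg.eigvals(H + Hbar[m, m - 1] ** 2 * np.outer(f, e_m))

# Full-length process (m = n): the harmonic Ritz values match the
# eigenvalues, mirroring the agreement reported in Tables 4 and 5.
rng = np.random.default_rng(0)
A = np.diag(np.linspace(0.1, 2.0, 8)) + 0.01 * rng.standard_normal((8, 8))
_, Hbar = arnoldi(A, rng.standard_normal(8), 8)
theta = np.sort(harmonic_ritz(Hbar).real)
print(np.allclose(theta, np.sort(np.linalg.eigvals(A).real), atol=1e-6))
```

With m smaller than the matrix dimension, the harmonic Ritz values only approximate the small eigenvalues; that is the regime reported in Tables 4 and 5.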

– 8 – JSIAM Letters Vol. 3 (2011) pp.9–12 © 2011 Japan Society for Industrial and Applied Mathematics

Cache optimization of a non-orthogonal joint diagonalization method

Yusuke Hirota1, Yusaku Yamamoto1 and Shao-Liang Zhang2

1 Graduate School of System Informatics, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan
2 Graduate School of Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan

E-mail: hirota@stu.kobe-u.ac.jp

Received September 30, 2010; Accepted October 31, 2010

Abstract. The LUJ2D algorithm is a recently proposed numerical solution method for non-orthogonal joint diagonalization problems appearing in signal processing. The original LUJ2D algorithm attains low performance on modern microprocessors, since it is dominated by cache-ineffective operations. In this study, we propose a cache-efficient implementation of the LUJ2D algorithm. The experimental results show that the proposed implementation is about 1.8 times faster than the original one, achieving 21% of the peak performance on the Opteron 1210 processor using one core.

Keywords: joint diagonalization, LUJ2D, cache optimization
Research Activity Group: Algorithms for Matrix / Eigenvalue Problems and their Applications

1. Introduction

Given nonsingular symmetric matrices C^(k) ∈ R^{N×N} (k = 1, 2, ..., K), we consider the problem of finding a nonsingular matrix W ∈ R^{N×N} such that

  W C^(k) W^T = Λ^(k)  (k = 1, 2, ..., K)  (1)

are diagonal. This is called the non-orthogonal joint diagonalization problem. This type of problem appears in signal processing [1, 2]. Of course, these problems have no solution for general {C^(k)}_k. In that case, we try to make the Λ^(k)'s as diagonal as possible by some criterion.

This problem can be solved as an orthogonal joint diagonalization problem when some preprocessing is applied. However, this approach degrades the quality of the approximate solution for real problems whose input matrices contain errors [3]. To avoid this problem, various algorithms that find the non-orthogonal matrix W directly have been proposed [4-7].

The LUJ2D algorithm, recently proposed by B. Afsari [7], is one of them. It has the desirable property that it does not require positive-definiteness of the input matrices. However, a simple implementation of the algorithm (e.g. the MATLAB code used in the experiments in [7]) attains low performance on modern cache-based processors, since the implementation is based on vector update operations. In this paper, we propose a cache-efficient implementation of the LUJ2D algorithm which is dominated by matrix products.

This paper is organized as follows. In Section 2, we describe the LUJ2D algorithm in its original form and then propose a cache-efficient implementation. In Section 3, we evaluate the performance of the proposed implementation by numerical experiments. The paper is concluded in Section 4.

2. The LUJ2D algorithm

The non-orthogonal joint diagonalization problem can be formulated as the minimization problem

  min_W J2(W),

where W is a nonsingular matrix and J2 is the nonnegative function

  J2(W) = Σ_{k=1}^{K} ∥C^(k) − W^{-1} diag(W C^(k) W^T) (W^{-1})^T∥_F^2.

Here, diag(X) is the diagonal part of X. J2 measures the non-diagonality of (1); the Λ^(k)'s are simultaneously diagonal if and only if J2(W) = 0.

The LUJ2D algorithm reduces J2 iteratively by the update

  W_{m+1} = L_m U_m W_m,

where W_m is the m-th approximate solution. Here, U_m is a product of the N(N−1)/2 matrices R_{i,j}(a_{i,j}) (1 ≤ i < j ≤ N) and L_m is that of the R_{i,j}(a_{i,j}) (1 ≤ j < i ≤ N). R_{i,j} ∈ R^{N×N} is defined as R_{i,j}(x) = I + x e_i e_j^T, where e_k is the k-th unit vector. Note that there is freedom in the order of the R_{i,j}'s in the product. The parameters a_{i,j} are determined by

  a_{i,j} = argmin_a J2(R_{i,j}(a) U'_m W_m)      (1 ≤ i < j ≤ N),
  a_{i,j} = argmin_a J2(R_{i,j}(a) L'_m U_m W_m)  (1 ≤ j < i ≤ N),  (2)

where each U'_m and L'_m is the product of the R_{i,j}(a_{i,j})'s that have already been determined. Each one-dimensional minimization problem (2) is solved by finding the zero points of a cubic equation. The coefficients of this algebraic equation are determined from the (i, j)-th and (j, j)-th elements of (U'_m W_m) C^(k) (U'_m W_m)^T (k = 1, ..., K) (or (L'_m U_m W_m) C^(k) (L'_m U_m W_m)^T (k = 1, ..., K)). The iteration is terminated when W_m C^(k) W_m^T (k = 1, ..., K) are sufficiently close to diagonal; W_m is then the numerical solution.

In this section, we describe two existing implementations of the LUJ2D algorithm and propose a new one. The implementations use only the lower and diagonal elements of the C^(k) (k = 1, ..., K), since the matrices are symmetric.

2.1 Original implementations

Vector update based implementation. A simple implementation of the LUJ2D algorithm is shown below. Here, submatrices and subvectors of C^(k) are represented in MATLAB notation. The notation A += B means A ← A + B.

 1: procedure LUJ2D (vector update based)
 2:   W ← (a nonsingular initial guess)
 3:   for m = 1, 2, ... until convergence do
 4:     U ← I
 5:     for j = 2, ..., N do
 6:       for i = 1, ..., j − 1 do
 7:         Compute a_{i,j} by (2)
 8:         Update U
 9:         for k = 1, ..., K do
10:           C^(k)_{i,1:i} += a_{i,j} C^(k)_{j,1:i}
11:           C^(k)_{i+1:j,i} += a_{i,j} (C^(k)_{j,i+1:j})^T
12:           C^(k)_{j+1:N,i} += a_{i,j} C^(k)_{j+1:N,j}
13:         end for
14:       end for
15:     end for
16:     (Lower part is constructed in a similar way)
17:     W ← LUW
18:   end for
19:   return W
20: end procedure

At each (i, j) step, a_{i,j} is determined first. Then the updates C^(k) ← R_{i,j} C^(k) R_{i,j}^T (k = 1, ..., K) are performed. These updates amount to adding C^(k)'s j-th row vector multiplied by a_{i,j} to C^(k)'s i-th row vector, and C^(k)'s j-th column vector multiplied by a_{i,j} to C^(k)'s i-th column vector, since

  R_{i,j} C^(k) R_{i,j}^T = [(I + a_{i,j} e_i e_j^T) C^(k)] (I + a_{i,j} e_i e_j^T)^T.

When the symmetry is exploited, these operations can be written as lines 10-12 of the procedure.

These vector update operations require 2KN^3 FLOPs and dominate the computational cost. The performance of this implementation is quite low since the vector updates are cache ineffective.

Rank-1/2 update based implementation. The rank-1/2 update based implementation is shown below.

 1: procedure LUJ2D (rank-1/2 update based)
 2:   W ← (a nonsingular initial guess)
 3:   for m = 1, 2, ... until convergence do
 4:     U ← I
 5:     for j = 2, ..., N do
 6:       for i = 1, ..., j − 1 do
 7:         Compute a_{i,j} by (2)
 8:         Update U
 9:       end for
10:       a_j = [a_{1,j}, a_{2,j}, ..., a_{j−1,j}]^T
11:       for k = 1, ..., K do
12:         y ← (1/2) c^(k)_{j,j} a_j + (C^(k)_{j,1:j−1})^T
13:         C^(k)_{1:j−1,1:j−1} += y a_j^T + a_j y^T (only lower elements are computed)
14:         C^(k)_{j:N,1:j−1} += C^(k)_{j:N,j} a_j^T
15:       end for
16:     end for
17:     (Lower part is constructed in a similar way)
18:     W ← LUW
19:   end for
20:   return W
21: end procedure

The determination of a_{i,j} is not influenced by the update of C^(k) by R_{l,j} if l ≠ i. Accordingly, the update operations with the same j,

  C^(k) ← R_{i,j} C^(k) R_{i,j}^T  (i = 1, 2, ..., j − 1),

can be performed by two rank-1 updates

  C^(k) ← C^(k) + [a_{1,j}, a_{2,j}, ..., a_{j−1,j}, 0, ..., 0]^T (e_j^T C^(k)),
  C^(k) ← C^(k) + (C^(k) e_j) [a_{1,j}, a_{2,j}, ..., a_{j−1,j}, 0, ..., 0]

if i < j (the updates can be performed in a similar way if i > j). Moreover, by exploiting the symmetry of the C^(k)'s, the updates are performed by rank-1 updates and symmetric rank-2 updates as shown in lines 13-14.

This implementation requires (4/3)KN^3 FLOPs for the rank-2 updates and (2/3)KN^3 FLOPs for the rank-1 updates per iteration. It attains better performance than the vector update based one, since rank-1/2 updates are more cache effective than vector updates. Nevertheless, the performance of the rank-1/2 update based implementation is still low.

2.2 A matrix product based implementation

In this subsection, we propose a matrix product based implementation. It is shown below.

 1: procedure LUJ2D (matrix product based)
 2:   W ← (a nonsingular initial guess)
 3:   for m = 1, 2, ... until convergence do
 4:     U ← I
 5:     for J = 1, 2, ..., N/M do
 6:       j′ = (J − 1)M + 1
 7:       for I = 1, 2, ..., J − 1 do
 8:         i′ = (I − 1)M + 1
 9:         for j = j′, j′ + 1, ..., JM do
10:           for i = i′, i′ + 1, ..., IM do
11:             Compute a_{i,j} by (2)
12:             Update U
13:           end for
14:           a_j = [a_{i′,j}, ..., a_{IM,j}]^T
15:           for k = 1, ..., K do
16:             y^(k)_j ← (1/2) c^(k)_{j,j} a_j + (C^(k)_{j,i′:IM})^T
17:             C^(k)_{j′:j−1,i′:IM} += (C^(k)_{j,j′:j−1})^T a_j^T
18:             C^(k)_{j:JM,i′:IM} += C^(k)_{j:JM,j} a_j^T
19:           end for
20:         end for
21:         A_{I,J} ← [a_{j′}, a_{j′+1}, ..., a_{JM}]
22:         for k = 1, ..., K do
23:           Y^(k) ← [y^(k)_{j′}, y^(k)_{j′+1}, ..., y^(k)_{JM}]
24:           C^(k)_{i′:IM,i′:IM} += A_{I,J} Y^(k)T + Y^(k) A_{I,J}^T (only lower elements are computed)
25:           C^(k)_{i′:IM,1:i′−1} += A_{I,J} C^(k)_{j′:JM,1:i′−1}
26:           C^(k)_{IM+1:j′−1,i′:IM} += (C^(k)_{j′:JM,IM+1:j′−1})^T A_{I,J}^T
27:         end for
28:       end for
29:       A_J ← [A_{1,J}^T, ..., A_{J−1,J}^T]
30:       for k = 1, ..., K do
31:         C^(k)_{JM+1:N,1:j′−1} += C^(k)_{JM+1:N,j′:JM} A_J
32:       end for
33:       for j = j′ + 1, ..., JM do
34:         for i = j′, ..., j − 1 do
35:           Compute a_{i,j} by (2)
36:           Update U
37:         end for
38:         a_j = [a_{j′,j}, ..., a_{j−1,j}]^T
39:         for k = 1, ..., K do
40:           y_j = (1/2) c^(k)_{j,j} a_j + (C^(k)_{j,j′:j−1})^T
41:           C^(k)_{j′:j−1,j′:j−1} += y_j a_j^T + a_j y_j^T (only lower elements are computed)
42:           C^(k)_{j′:j−1,1:j′−1} += a_j C^(k)_{j,1:j′−1}
43:           C^(k)_{j:N,j′:j−1} += C^(k)_{j:N,j} a_j^T
44:         end for
45:       end for
46:     end for
47:     (Lower part is constructed in a similar way)
48:     W ← LUW
49:   end for
50:   return W
51: end procedure

The sweeping order for the matrix products on U_m and L_m in this implementation differs from that of the implementations described in the previous subsection, as shown in Fig. 1. We partition {(i, j) | 1 ≤ i ≤ N, 1 ≤ j ≤ N} into subsets S_{I,J} = {(i, j) | (I−1)M+1 ≤ i ≤ IM, (J−1)M+1 ≤ j ≤ JM} (1 ≤ I ≤ N/M, 1 ≤ J ≤ N/M), where M is a divisor of N called the block size. The sets S_{I,:} and S_{:,I} are defined as ∪_{J=1}^{N/M} S_{I,J} and ∪_{J=1}^{N/M} S_{J,I}, respectively.

[Fig. 1. The sweeping order of (i, j) (the red line is for U_m, the blue one for L_m). The left panel shows the sweeping order of the original implementations; the right panel shows that of the proposed one.]

We consider determining the a_{i,j}'s ((i, j) ∈ S_{I,J}) first and then performing the M^2 updates

  C^(k) ← R_{i,j} C^(k) R_{i,j}^T  ((i, j) ∈ S_{I,J})

at once. However, to determine a_{i,j} from c^(k)_{j,i} and c^(k)_{j,j}, we must use the values of these elements partially updated with the preceding R_{i,j}'s. To solve this problem, we update only the c^(k)_{i,j} with (i, j) ∈ S_{I,J} ∪ S_{J,I} just after the determinations of the a_{i,j} ((J − 1)M + 1 ≤ i ≤ JM). These updates can be performed by two rank-1 updates

  C^(k) ← C^(k) + a (e_j^T C^(k)),
  C^(k) ← C^(k) + (C^(k) e_j) a^T,

where a is an N-dimensional column vector with (a)_i = a_{i,j} ((J − 1)M + 1 ≤ i ≤ JM) and (a)_i = 0 otherwise. By exploiting the symmetry, these updates are performed by rank-1/2 updates as shown in lines 17-18. The rest, the c^(k)_{i,j} with (i, j) ∈ (S_{I,:} \ S_{I,J}) ∪ (S_{:,I} \ S_{J,I}), are updated after all the a_{i,j} ((i, j) ∈ S_{I,J}) have been determined. The updates C^(k) ← R_{i,j} C^(k) R_{i,j}^T ((i, j) ∈ S_{I,J}) can then be performed as

  C^(k) ← C^(k) + A C^(k),
  C^(k) ← C^(k) + C^(k) A^T,  (3)

where A is an N-by-N matrix with (A)_{i,j} = a_{i,j} ((i, j) ∈ S_{I,J}) and (A)_{i,j} = 0 otherwise. Therefore, these updates can be performed as matrix products. Moreover, the updates (3) to the c^(k)_{i,j} with (i, j) ∈ ∪_{P=J+1}^{N/M} (S_{I,P} ∪ S_{P,I}) (for I < J), or (i, j) ∈ ∪_{P=1}^{J−1} (S_{I,P} ∪ S_{P,I}) (for I > J), can be combined. By this combination, the performance of the matrix products improves since their size increases. By exploiting the symmetry, these updates are performed by matrix products and rank-2M updates (which are essentially matrix products themselves) as shown in lines 24-26 and 31. If (i, j) ∈ S_{J,J}, the updates C^(k) ← R_{i,j} C^(k) R_{i,j}^T ((J − 1)M + 1 ≤ i ≤ j − 1) are performed by rank-1 updates as in the rank-1/2 update based implementation. By exploiting the symmetry, these updates can be performed by rank-1/2 updates as shown in lines 41-43.

The number of floating point operations in this implementation is shown in Table 1. The total number of operations is identical with that of the original implementations, and most of them are matrix products if M ≪ N. The performance of the matrix products improves with increasing M. However, there is a trade-off between this improvement and the increase in the number of rank-1/2 update operations, which are cache ineffective.

Table 1. The number of floating point operations per iteration in the proposed implementation.
Operation | FLOPs
Rank-2 updates | (4/3)KNM^2
Rank-1 updates (lines 42, 43) | 2KN^2 M − (4/3)KNM^2
Rank-1 updates (lines 17, 18) | 2KN^2 M − 2KNM^2
Rank-2M updates | 2KN^2 M − 2KNM^2
Matrix products | 2KN^3 − 6KN^2 M + 4KNM^2
Total | 2KN^3

3. Numerical experiments

To evaluate the performance of the implementations, we perform numerical experiments.

The test set of matrices is generated by the following procedure: (i) generate nonsingular diagonal matrices ∆^(k) ∈ R^{N×N} (k = 1, 2, ..., K) whose diagonal elements are random values in (0, 1); (ii) generate a nonsingular matrix B ∈ R^{N×N} using random numbers in (0, 1); (iii) compute C^(k) = B ∆^(k) B^T (k = 1, 2, ..., K). We set N = 240 and K = 240. The iteration is started with W = I and terminated when the condition

  [ Σ_{k=1}^{K} Σ_{i≠j} (W^(m) C^(k) (W^(m))^T)_{i,j}^2 ]^{1/2} / [ Σ_{k=1}^{K} Σ_{i=1}^{N} Σ_{j=1}^{N} (W^(m) C^(k) (W^(m))^T)_{i,j}^2 ]^{1/2} < 10^{-5}

is satisfied.

In the numerical experiments, we compare the performance of the proposed implementation with various values of M against the rank-1/2 update based one. The experiments were carried out on a machine with CPU: Opteron 1210 1.8 GHz (3.6 GFLOPS, only one core was used), OS: CentOS 5.5, Compiler: GFortran 4.4.0, compiler options: -march=native -O3 -funroll-loops. AMD Core Math Library (ACML) subroutines are used for the rank-1/2/2M updates and the matrix products.

In all the implementations, the number of iterations was 128 and W was almost identical. Fig. 2 shows the total execution time and its breakdown. The proposed implementation with the optimal block size M = 12 is the fastest, 1.77 times faster than the rank-1/2 update based one. Fig. 3 shows the performance of the rank-1/2/2M updates, the matrix products, and their average for each block size. We observe that the performance of the matrix products and rank-2M updates increases significantly with M. On the other hand, as seen in Table 1, the number of operations in the rank-1/2 updates, which provide less performance, increases. Accordingly, the performance of the implementation is maximized when M is much smaller than the best size for matrix products. Also, we observe that the proposed implementation with the optimal block size achieves 21% of the peak performance.

[Fig. 2. The execution time [sec] of the implementations (rank-1/2 update based, and matrix product based with block sizes M = 5, 6, 8, 10, 12, 15, 20, 24, 30, 40), broken down into determination of a_{i,j}, matrix products, rank-2M updates, rank-1 updates (lines 17, 18), rank-1 updates (lines 42, 43), and rank-2 updates.]

[Fig. 3. The performance [GFLOPS] of the subroutines (matrix product, rank-2M update, rank-1 and rank-2 updates, and their average) for each block size M.]

We remark that the performance of the blocked implementation strongly depends on the performance of the product of small matrices. Thus, the implementation may show better performance if a library tuned for small matrix products is available.

4. Conclusion

In this paper, we proposed a cache-efficient implementation of the LUJ2D algorithm. The numerical experiments show that the proposed implementation with the optimal block size is about 1.8 times faster than the rank-1/2 update based one. Moreover, it achieved 21% of the peak performance on the Opteron 1210 processor using one core.

References

[1] A. Belouchrani, K. Abed-Meraim, J.-F. Cardoso and E. Moulines, A blind source separation technique using second-order statistics, IEEE Trans. Signal Processing, 45 (1997), 434–444.
[2] A. Ziehe and K.-R. Müller, TDSEP: an efficient algorithm for blind separation using time structure, in: Proc. of the 8th Int. Conf. Artificial Neural Networks, pp. 675–680, 1998.
[3] J.-F. Cardoso, On the performance of orthogonal source separation algorithms, in: Proc. of European Signal Processing Conf., pp. 776–779, 1994.
[4] A. Yeredor, Non-orthogonal joint diagonalization in the least-squares sense with application in blind source separation, IEEE Trans. Signal Processing, 50 (2002), 1545–1553.
[5] A. Ziehe, P. Laskov, G. Nolte and K.-R. Müller, A fast algorithm for joint diagonalization with non-orthogonal transformations and its application to blind source separation, J. Mach. Learn. Res., 5 (2004), 777–800.
[6] R. Vollgraf and K. Obermayer, Quadratic optimization for simultaneous matrix diagonalization, IEEE Trans. Signal Processing, 54 (2006), 3270–3278.
[7] B. Afsari, Simple LU and QR based non-orthogonal matrix joint diagonalization, in: Proc. of the 6th Int. Conf. on Independent Component Analysis and Blind Source Separation, J. Rosca et al. eds., Lect. Notes in Comput. Sci., Vol. 3889, pp. 1–7, Springer-Verlag, Berlin, 2006.
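A key point of Section 2.2 is that, for an off-diagonal block (I ≠ J), applying the R_{i,j} updates one at a time is exactly equivalent to the combined updates (3): since e_j^T e_{i′} = 0 whenever the row index lies in block I and the column index in block J, the product of the R_{i,j}'s collapses to I + A. A small NumPy check of this identity (toy sizes, not the Fortran code used in the experiments):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 12, 3                                 # toy sizes: block size M divides N
I, J = 0, 2                                  # distinct block indices (I != J)
C = rng.standard_normal((N, N))
C = C + C.T                                  # symmetric test matrix
coeffs = rng.standard_normal((M, M))         # a_{i,j} for (i, j) in S_{I,J}

# (a) element-wise: apply C <- R_{i,j} C R_{i,j}^T one pair (i, j) at a time
C_seq = C.copy()
for jj in range(M):
    for ii in range(M):
        R = np.eye(N)
        R[I * M + ii, J * M + jj] = coeffs[ii, jj]
        C_seq = R @ C_seq @ R.T

# (b) combined, Eq. (3): C <- C + A C, then C <- C + C A^T,
#     with A zero outside the block S_{I,J}
A = np.zeros((N, N))
A[I * M:(I + 1) * M, J * M:(J + 1) * M] = coeffs
C_blk = C.copy()
C_blk = C_blk + A @ C_blk
C_blk = C_blk + C_blk @ A.T

print(np.allclose(C_seq, C_blk))  # the two formulations agree
```

This collapse is what allows lines 24-26 and 31 of the procedure to replace M^2 scattered rank-1 updates with a few large, cache-friendly matrix products.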

– 12 – JSIAM Letters Vol. 3 (2011) pp.13–16 © 2011 Japan Society for Industrial and Applied Mathematics

Quasi-minimal residual smoothing technique for the IDR(s) method

Lei Du1, Tomohiro Sogabe2 and Shao-Liang Zhang1

1 Department of Computational Science and Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan
2 Graduate School of Information Science and Technology, Aichi Prefectural University, Nagakute-cho, Aichi-gun, Aichi 480-1198, Japan

E-mail: lei-du@na.cse.nagoya-u.ac.jp

Received September 17, 2010; Accepted December 13, 2010

Abstract. The IDR(s) method proposed by Sonneveld and van Gijzen is an efficient method for solving large nonsymmetric linear systems. In this paper we present QMRIDR(s), a new variant of the IDR(s) method. In this method, the irregular convergence behavior of IDR(s) is remedied, and both fast and smooth convergence behavior are expected. Numerical experiments are reported to show the performance of our method.

Keywords: induced dimension reduction, the IDR(s) method, linear systems, QMRIDR(s), residual smoothing
Research Activity Group: Algorithms for Matrix / Eigenvalue Problems and their Applications

1. Introduction

Numerical iterative methods play an important role in solving large and sparse linear systems of the form

  Ax = b,  (1)

in which the coefficient matrix A is real, nonsymmetric and nonsingular of order n, and the right-hand side b is a given vector.

The IDR(s) method, a generalization of the IDR method [1] for solving problem (1), was recently proposed by Sonneveld and van Gijzen [2]. Some variants of this method have been proposed since then. A new IDR(s) variant obtained by imposing bi-orthogonalization conditions was developed in [3]. By exploiting the merit of BiCGStab(ℓ) [4] to avoid potential breakdown, especially for skew-symmetric or nearly skew-symmetric systems, IDRStab and GBi-CGStab(s, L) were proposed with higher-order stabilization polynomials in [5] and [6], respectively. A block version of IDR(s) for solving linear systems with multiple right-hand sides was developed in [7]. The relation between IDR and BiCGStab [8] was discussed in [9]. From the viewpoint of the Petrov-Galerkin method, IDR was explained by Gutknecht in [10]; moreover, Ritz-IDR was explained in [11].

The IDR(s) method has the property of fast convergence, but the convergence history of its residual norms shows a quite irregular behavior, like many other Lanczos-type product methods. The quasi-minimal residual technique [12], a variant of the BiCG method [13], is called QMR for short; QMR is known to remedy such irregular convergence behavior. We therefore apply QMR to IDR(s), which produces our method QMRIDR(s). Both fast convergence and a smooth convergence behavior are expected of QMRIDR(s).

This paper is organized as follows. In the next section, we review the IDR(s) method to show how it works. In Section 3, we present our idea, which reformulates the relations between the residuals and their auxiliary vectors in the IDR(s) method and constructs an iterative solution by minimizing the norm of a quasi-residual. Numerical results are reported to show the performance of our method in Section 4. Finally, we conclude this paper in Section 5.

2. The IDR(s) method

In this section, we review the IDR(s) method. Given an initial approximation x_0 with its corresponding residual r_0 := b − Ax_0, the k-th Krylov subspace can be defined as

  K_k(A, r_0) := span{r_0, Ar_0, ..., A^{k−1} r_0}.

Let G_0 := K_n(A, r_0) be the full Krylov subspace and let S be a subspace of C^n. Define a sequence of subspaces G_j by the recursion G_j := (I − ω_j A)(G_{j−1} ∩ S), in which the ω_j's are nonzero constants.

Under the assumption that the subspace S ∩ G_0 does not contain a nontrivial invariant subspace of A, the following result of the IDR theorem [1, 2] is obtained: G_j ⊊ G_{j−1}, i.e., G_j is a proper subset of G_{j−1}. This fact implies that the sequence of nested subspaces G_j is finite until G_j = {0}.

Based on this theorem, the IDR(s) method was proposed to construct the next s + 1 new residuals in the same subspace G_j when the former s + 1 residuals are given in G_{j−1}. Usually the subspace S is defined as S := N(P^T), the null space of the transpose of P, where P is a matrix of order n × s. It was suggested in [2] to orthogonalize a set of random vectors for P. Now, we show the process of constructing a new residual.

Assume that the residuals r_{i−s}, ..., r_i in the subspace G_{j−1} are known; then a new residual r_{i+1} is constructed as

  r_{i+1} := (I − ω_j A) v_i,  (2)

in which the auxiliary vector v_i is defined as v_i = r_i − Σ_{l=1}^{s} γ_l Δr_{i−l}, where γ_l ∈ R and Δr_k := r_{k+1} − r_k.

It is obvious that v_i ∈ G_{j−1} for all linear combinations of r_{i−s}, ..., r_i. To ensure r_{i+1} ∈ G_j, the auxiliary vector v_i should also lie in the subspace S, which implies that the unknowns γ_l are determined by the condition P^T v_i = 0. The parameter ω_j is obtained by minimizing the 2-norm of r_{i+1} and is kept the same in the next s iterations.

Let Δx_k := x_{k+1} − x_k; then x_{i+1} can be updated as

  x_{i+1} = x_i + ω_j v_i − Σ_{l=1}^{s} γ_l Δx_{i−l}.  (3)

In the following s iterations, the foremost residual is replaced by the new one and the above process is repeated to construct a new intermediate residual. Finally, s + 1 residuals in G_j are obtained. The IDR(s) algorithm [2] is summarized as follows.

Algorithm 1 IDR(s)
 1: Initialize x_0, j = 0, P ∈ R^{n×s};
 2: r_0 = b − Ax_0, and compute r_1, ..., r_s in G_0 by an existing Krylov solver;
 3: for k = s, s + 1, ... do
 4:   Determine the γ_l's by solving P^T v_k = 0;
 5:   Construct v_k = r_k − Σ_{l=1}^{s} γ_l Δr_{k−l};
 6:   j = j + 1 when k + 1 ≡ 0 (mod s + 1), ω_j = argmin_ω ∥r_{k+1}∥_2;
 7:   Compute r_{k+1} = (I − ω_j A) v_k in G_j;
 8:   Update x_{k+1} = x_k + ω_j v_k − Σ_{l=1}^{s} γ_l Δx_{k−l};
 9:   If x_{k+1} has converged then stop;
10: end for

3. QMR smoothing technique

In this section, we reconsider the relations between the residuals and their auxiliary vectors in the IDR(s) method and propose the QMRIDR(s) method by constructing a new iterative solution. First, let us define

  y_i := r_i (i < s),  y_i := v_i (i ≥ s),

and

  Y_k := [y_0 y_1 ... y_{k−1}],  W_{k+1} := [r_0 r_1 ... r_{k−1} r_k].

Then, we see that (2) in Algorithm 1 can be reformulated as

  A v_i = (1/ω_j)(v_i − r_{i+1}) = (1/ω_j)( r_i − Σ_{l=1}^{s} γ_l Δr_{i−l} − r_{i+1} ),  (4)

which can be represented in the matrix form

  A Y_k = W_{k+1} H_k,  (5)

where H_k is a (k + 1) × k upper Hessenberg matrix, banded with bandwidth s + 2.

By the definition of the v_i, we can easily prove that the column vectors of W_k and of Y_k span the same Krylov subspace, i.e.,

  K_k(A, r_0) = span{r_0, r_1, ..., r_{k−1}} = span{y_0, y_1, ..., y_{k−1}}.  (6)

Now, we construct a new iterative solution x̃_k based on the basis y_0, y_1, ..., y_{k−1}, which can be written as x̃_k = x_0 + Y_k z_k for z_k ∈ R^k. By (5), the corresponding residual vector r̃_k = b − A x̃_k satisfies r̃_k = r_0 − A Y_k z_k = W_{k+1}(e_1 − H_k z_k), where e_1 = [1, 0, ..., 0]^T ∈ R^{k+1}.

To obtain smooth convergence, the ideal way to generate x̃_k would be to determine z_k by minimizing ∥r̃_k∥_2, but this is hard to do with limited storage because of the non-orthogonality of r_0, r_1, ..., r_k. As a compromise between optimality and storability, the quasi-minimal residual technique applied to BiCG is reconsidered here, minimizing a quasi-residual norm instead.

A diagonal matrix Ω_{k+1} = diag(δ_0, ..., δ_k) with δ_i = ∥r_i∥_2 is used to scale the columns of W_{k+1} to unit norm, i.e., r̃_k = W_{k+1} Ω_{k+1}^{-1} (δ_0 e_1 − H̃_k z_k), where H̃_k = Ω_{k+1} H_k. Then the quasi-residual norm ∥δ_0 e_1 − H̃_k z∥_2 is minimized over z instead of ∥e_1 − H_k z∥_2.

Due to the special structure of H̃_k, a QR decomposition by Givens rotations can be adopted. Let

  H̃_k = Q_{k+1}^T [R_k; 0],

where Q_{k+1} is a unitary (k + 1) × (k + 1) matrix and R_k is a nonsingular upper triangular k × k matrix with bandwidth s + 2. Then we have

  min_z ∥δ_0 e_1 − H̃_k z∥_2 = min_z ∥δ_0 Q_{k+1} e_1 − [R_k; 0] z∥_2,

and z_k is determined as z_k = R_k^{-1} t_k, where

  [t_k; τ̃_{k+1}] := δ_0 Q_{k+1} e_1,  t_k := [τ_1, ..., τ_k]^T.

It is easy to see that

  min_z ∥δ_0 e_1 − H̃_k z∥_2 = ∥δ_0 e_1 − H̃_k z_k∥_2 = |τ̃_{k+1}|,

– 14 – JSIAM Letters Vol. 3 (2011) pp.13–16 Lei Du et al. and the iterative solution x˜k can be rewritten as Table 1. Test matrices. −1 Matrix n nnz∗ Application discipline x˜k = x0 + YkR tk, k ADD32 4960 23884 Electronic circuit design instead of (3) in the IDR(s) method. FIDAP037 3565 67591 Finite element modeling PDE2961 2961 14585 Partial differential equation As Rk is a triangular matrix with bandwidth of s + 2, SHERMAN4 1104 3786 Oil reservoir modeling the iterative solution x˜k can be updated in short-term recurrence analogous to the way in [12]. The difference *Number of the nonzero entries. between the previous method and the proposed method Table 2. Computation time [sec.]. is using the decomposition of a Hessenberg matrix with the bandwidth of s + 2 instead of using a tridiagonal ADD32 FIDAP037 PDE2961 SHERMAN4 IDR(1) 0.20 0.31 0.30 0.06 matrix. QMRIDR(1) 0.23 0.34 0.34 0.07 Under the framework of Algorithm 1, we can propose IDR(4) 0.23 0.29 0.34 0.06 the QMRIDR(s) algorithm which is summarized as fol- QMRIDR(4) 0.32 0.34 0.45 0.09 lows.

Algorithm 2 QMRIDR(s) −8 × ∥rk∥2/∥b∥2 ≤ 10 , with rk = b − Axk being the true 1: Initialize x , j = 1,P ∈ Rn s; 0 residual, otherwise 2n matrix-vector products would be 2: Compute r = b − Ax , r ,..., r in G , and H˜ ; 0 0 1 s 0 s performed at most. 3: Compute vs, rs+1 in G1, the new column of H˜s+1, T T Experiments were performed on a Redhat linux sys- then decompose H˜s+1 and compute [ts+1 , τ˜s+2] ; − tem (64 bit) with an AMD Phenom(tm) 9500 Quad-Core 4: [f , f ,..., f ] = Y R 1 ; 1 2 s+1 s+1 k+1 Processor using double precision arithmetic. Codes were 5: x˜ = x + [f , f ,..., f ]t ; s+1 0 1 2 s+1 s+1 written in the C++ language and compiled with GCC 6: for k = s + 1, s + 2,... do T 4.1.2. All test matrices in this section were taken from 7: Determine γl’s by solving∑ P vk = 0; s the Matrix Market collection [14]. The order, number of 8: Construct v = r − γ ∆r − ; k k l=1 l k l nonzero elements and application disciplines of the test 9: j = j + 1 when k + 1 ≡ 0 (mod s + 1), matrices are listed in Table 1. ωj = arg min ∥rk+1∥2; ω Algorithms were run without preconditioning. The − G 10: Compute rk+1 = (I ωjA)vk in j; convergence behavior is shown by the number of matrix- ˜ 11: Update the new column of Hk+1 by the latest vector products (on the horizontal axis) versus log10 of (s + 1) Givens rotations, then zero out the last the relative norm ∥rk∥2/∥b∥2 (on the vertical axis) in all element by a new Givens rotation G(ck+1, sk+1); four figures, and the computation time is listed in Table − 12: τk+1 =∑ck+1τ˜k+1, τ˜k+2 = sk+1τ˜k+1, and f k+1 = 2. − s+1 (yk i=1 Rk+1−i,k+1f k+1−i)/Rk+1,k+1, where As shown in Figs. 1–4, we have the following observa- Ri,k+1 denotes the entry of Rk+1 at the ith row tions. First, all peaks of the graphs related to the IDR(s) and (k + 1)th column; method disappear for the graphs of the QMRIDR(s) 13: x˜k+1 = x˜k + τk+1f k+1; method which converged with much smoother curves. 
14: If x˜k+1 has converged then stop; Second, both methods (the same s) need almost the 15: end for same number of matrix-vector products to stop the iter- ations, and with larger s can converged at less iteration Several criteria can be used to stop the iteration in steps. This shows the QMRIDR(s) method also keeps the fast convergence property of the IDR(s) method. our algorithm. A natural choice is to make use of ∥rk∥2, which has been calculated previously. Other conditions From Table 2, we also note that the QMRIDR(s) method required more computation time because of the √are checked for ∥r˜k∥2 or its upper bound ∥r˜k∥2 ≤ additional costs per iteration step. Although both meth- k + 1 |τ˜k+1| where residual r˜k can be easily updated at low cost per iteration step. Mixed strategies of them ods with larger s converged at less iteration steps, it can also be utilized in Algorithm 2. seems that they took more computation time for some We expect to obtain smooth convergence history of the of our test problems because of more inner products with residuals, but at the cost of more memory requirements larger s. and level one operations of BLAS per iteration step. For example, we should save more s + 1 vectors and update 5. Conclusions the vector f k for the iterative solution x˜k in Algorithm In this paper, we propose a variant of the IDR(s) 2. method: QMRIDR(s) for solving nonsymmetric linear systems. To define this method, we reformulated the re- 4. Numerical experiments lations of residuals and their auxiliary vectors in the In this section, we report some numerical results with IDR(s) method and presented them in matrix form. the IDR(s) and QMRIDR(s) methods. Parameter s was Based on this arrangement, we can adopt the quasi- equal to 1 and 4. As for the initial guess and right-hand minimal residual smoothing technique and successfully T side vectors, we always chose x0 = 0 and b = [1,..., 1] . construct an iterative solution in short-term recurrence. 
All the elements of matrix P n×s were random value dis- Numerical results show that the proposed method tributed in the interval (0,1). The stopping criterion was not only has the smooth convergence behavior but also
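For concreteness, Algorithm 1 can be condensed into a short NumPy sketch. This is an illustrative reconstruction of the published IDR(s) prototype [2], not the authors' C++ code: the startup fills ΔR and ΔX with s local minimal-residual steps, and each cycle of the main loop then produces s + 1 new residuals in G_j, overwriting the oldest difference vectors.

```python
import numpy as np

def idr_s(A, b, s=4, tol=1e-8, maxit=2000, seed=1):
    """Dense NumPy sketch of Algorithm 1 (IDR(s))."""
    n = b.size
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    r = b - A @ x
    normb = np.linalg.norm(b)
    # P spans the shadow space S = N(P^T); orthonormalized random columns [2]
    P, _ = np.linalg.qr(rng.standard_normal((n, s)))
    dR = np.zeros((n, s))
    dX = np.zeros((n, s))
    # startup: s steps of a local minimal-residual iteration fill dR, dX
    for q in range(s):
        v = A @ r
        om = (v @ r) / (v @ v)
        dX[:, q] = om * r
        dR[:, q] = -om * v
        x += dX[:, q]
        r += dR[:, q]
    nmv = s + 1          # matrix-vector products so far (incl. initial residual)
    oldest = 0
    om = 1.0
    while np.linalg.norm(r) > tol * normb and nmv < maxit:
        for q in range(s + 1):           # build s + 1 new residuals in G_j
            # P^T v = 0 fixes the combination coefficients (line 4)
            c = np.linalg.solve(P.T @ dR, P.T @ r)
            v = r - dR @ c               # line 5
            if q == 0:
                t = A @ v
                om = (t @ v) / (t @ t)   # omega minimizes ||r_{k+1}||_2 (line 6)
                dR[:, oldest] = -dR @ c - om * t
                dX[:, oldest] = -dX @ c + om * v
            else:
                dX[:, oldest] = -dX @ c + om * v
                dR[:, oldest] = -(A @ dX[:, oldest])
            r = r + dR[:, oldest]        # line 7
            x = x + dX[:, oldest]        # line 8
            nmv += 1
            oldest = (oldest + 1) % s
            if np.linalg.norm(r) <= tol * normb:   # line 9
                break
    return x, nmv
```

On a small diagonally dominant nonsymmetric test matrix this sketch reaches the relative residual tolerance in a modest number of matrix-vector products; it is meant only to make the cycling structure of Algorithm 1 explicit.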

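The quasi-minimization of ∥δ_0 e_1 − H̃_k z∥_2 by Givens rotations can also be sketched in isolation. The helper below (hypothetical name, dense storage rather than the banded short-term recurrence of Algorithm 2) triangularizes a (k + 1) × k upper Hessenberg matrix and returns z_k together with |τ̃_{k+1}|, which equals the attained minimum, as stated above.

```python
import numpy as np

def quasi_min_residual(H, delta0):
    """Minimize ||delta0*e1 - H z||_2 for a (k+1) x k upper Hessenberg H
    by successive Givens rotations; returns z_k and |tau~_{k+1}|."""
    m, k = H.shape
    R = H.astype(float).copy()
    g = np.zeros(m)
    g[0] = delta0
    for i in range(k):
        a, b = R[i, i], R[i + 1, i]
        rho = np.hypot(a, b)
        c, s = a / rho, b / rho
        # rotate rows i and i+1 of R and of the right-hand side g
        upper, lower = R[i, :].copy(), R[i + 1, :].copy()
        R[i, :] = c * upper + s * lower
        R[i + 1, :] = -s * upper + c * lower
        g[i], g[i + 1] = c * g[i] + s * g[i + 1], -s * g[i] + c * g[i + 1]
    z = np.linalg.solve(np.triu(R[:k, :k]), g[:k])
    return z, abs(g[k])
```

Because the rotations are orthogonal, the last entry of the rotated right-hand side is exactly the residual norm of the least-squares problem, which is what makes |τ̃_{k+1}| available at no extra cost during the iteration.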
[Figures 1–4 show the relative 2-norm of the residuals versus the number of matrix-vector products for IDR(1), QMRIDR(1), IDR(4) and QMRIDR(4). Fig. 1. ADD32. Fig. 2. FIDAP037. Fig. 3. PDE2961. Fig. 4. SHERMAN4.]

Acknowledgments
We sincerely thank the anonymous referee, whose comments and suggestions helped us to improve the manuscript. This research was partially supported by the China Scholarship Council and the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research (Nos. 21760058, 19560065 and 22104004).

References
[1] P. Wesseling and P. Sonneveld, Numerical experiments with a multiple grid and a preconditioned Lanczos type method, Lect. Notes Math., Vol. 771, pp. 543–562, Springer-Verlag, Berlin, Heidelberg, New York, 1980.
[2] P. Sonneveld and M. van Gijzen, IDR(s): a family of simple and fast algorithms for solving large nonsymmetric systems of linear equations, SIAM J. Sci. Comput., 31 (2008), 1035–1062.
[3] M. van Gijzen and P. Sonneveld, An elegant IDR(s) variant that efficiently exploits bi-orthogonality properties, Delft Univ. of Technology, Reports of the Department of Applied Mathematical Analysis, Report 08-21, 2008.
[4] G. L. G. Sleijpen and D. R. Fokkema, BiCGstab(ℓ) for linear equations involving unsymmetric matrices with complex spectrum, Elec. Trans. Numer. Anal., 1 (1993), 11–32.
[5] G. L. G. Sleijpen and M. B. van Gijzen, Exploiting BiCGstab(ℓ) strategies to induce dimension reduction, SIAM J. Sci. Comput., 32 (2010), 2687–2709.
[6] M. Tanio and M. Sugihara, GBi-CGSTAB(s,L): IDR(s) with higher-order stabilization polynomials, J. Comput. Appl. Math., 235 (2010), 765–784.
[7] L. Du, T. Sogabe, B. Yu, Y. Yamamoto and S.-L. Zhang, A block IDR(s) method for nonsymmetric linear systems with multiple right-hand sides, submitted to J. Comput. Appl. Math.
[8] H. A. van der Vorst, Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13 (1992), 631–644.
[9] G. L. G. Sleijpen, P. Sonneveld and M. B. van Gijzen, Bi-CGSTAB as an induced dimension reduction method, Appl. Numer. Math., 60 (2010), 1100–1114.
[10] M. H. Gutknecht, IDR explained, Elec. Trans. Numer. Anal., 36 (2010), 126–148.
[11] V. Simoncini and D. B. Szyld, Interpreting IDR as a Petrov-Galerkin method, SIAM J. Sci. Comput., 32 (2010), 1898–1912.
[12] R. W. Freund, QMR: a quasi-minimal residual method for non-Hermitian linear systems, Numer. Math., 60 (1991), 315–339.
[13] C. Lanczos, Solution of systems of linear equations by minimized iterations, J. Res. Nat. Bur. Standards, 49 (1952), 33–53.
[14] Matrix Market, http://math.nist.gov/MatrixMarket/.

– 16 – JSIAM Letters Vol.3 (2011) pp.17–19 © 2011 Japan Society for Industrial and Applied Mathematics

A new approach to find a saddle point efficiently based on the Davidson method

Akitaka Sawamura1

1 Sumitomo Electric Industries, Ltd., 1-1-3, Shimaya, Konohana-ku, Osaka 554-0024, Japan
E-mail: sawamura-akitaka sei.co.jp
Received September 30, 2010; Accepted February 24, 2011

Abstract
A new eigenvector-following approach for finding a saddle point without the Hessian matrix is described. The most important feature of the proposed approach is to rely not only on the lowest eigensolution, as in the case of a conventional approach, but also on higher, albeit less accurate, eigensolutions, which are available when the Davidson method is employed. The proposed approach is shown to be more efficient than the conventional one by application to diffusion of a Zn interstitial atom in an InP supercell.

Keywords: diffusion, reaction, transition state, potential energy surface
Research Activity Group: Algorithms for Matrix / Eigenvalue Problems and their Applications

1. Introduction

Many aspects of diffusion and chemical reaction can be reduced to questions about the potential energy surface, in particular where the saddle points on the surface are. One of the approaches frequently employed to locate saddle points is an eigenvector-following strategy [1]. This strategy resembles nonlinear optimization methods. A step is, however, taken uphill along the eigenvector with the lowest eigenvalue of a Hessian or dynamical matrix, and downhill along all the other directions.

To my knowledge, the eigenvector-following method was first described by Crippen and Scheraga [2] and refined by Cerjan and Miller [3] from the viewpoint of the Lagrangian multiplier technique. This early version requires the Hessian explicitly, which is usually costly or tedious to evaluate. To overcome this problem, Munro and Wales (MW) proposed an alternative approach which makes use of the force only [4]. MW purged the Hessian by employing the conjugate-gradient method [5] to obtain the lowest eigensolution, considering the facts that
• the conjugate-gradient method requires a matrix-vector product, not the matrix as a whole, and that
• the matrix-vector product can be calculated approximately as a difference in force.
The Davidson method [6] is another algorithm for eigenvalue problems. Since in the Davidson method a subspace is constructed explicitly, higher "otiose" eigensolutions are also calculated even when the lowest is the only targeted solution. Taking advantage of all these available solutions, the author proposes a new approach for finding the saddle point efficiently while utilizing the force only.

2. Method

To locate the saddle points, if the Hessian H is readily available, iterative application of a Newton-like formula,

    Δx⃗ = (H − λ)^{−1} f⃗,                                    (1)

can be a preferred choice [7, 8], where Δx⃗ is the step vector according to which the atoms are moved, f⃗ is the force acting on the atoms, and λ is a shift parameter. If all the eigenvalues ϵ_i (in ascending order) and corresponding eigenvectors v⃗_i of the Hessian are known, (1) is rewritten as

    Δx⃗ = Σ_i 1/(ϵ_i − λ) v⃗_i v⃗_i^T f⃗.                       (2)

From (2), clearly λ should be so chosen that ϵ_1 − λ is negative while ϵ_i − λ with i > 1 is positive, to ensure that Δx⃗ represents a direction energetically uphill along v⃗_1 and downhill along the remaining eigenvectors [3].

Even when the Hessian is not explicitly available, fortunately at least the lowest eigenvalue ϵ_1 and the corresponding eigenvector v⃗_1 can be calculated, as already mentioned. Using this solution, the force f⃗ is partitioned into parallel and perpendicular components as

    f⃗^∥ = v⃗_1 v⃗_1^T f⃗,                                      (3)

and

    f⃗^⊥ = (I − v⃗_1 v⃗_1^T) f⃗,                                (4)

respectively. A modified force

    f⃗^† = −f⃗^∥ + f⃗^⊥                                        (5)

is one of the directions uphill along v⃗_1 and downhill in the tangent subspace, as with Δx⃗ in (1). In the MW method the atoms are moved relying on −f⃗^∥ and then on f⃗^⊥ in a sequential manner. Henkelman and Jónsson (HJ) proposed, however, that the step vector Δx⃗ be set proportional to a further modified form:

    f⃗^† = −f⃗^∥ if ϵ_1 > 0,  −f⃗^∥ + f⃗^⊥ otherwise,           (6)

for faster convergence toward a nearby saddle point [9].

The approach proposed in the present Letter is somewhat in between the above two. Eq. (2) can be rewritten by dividing its summation into two parts:

    Δx⃗ = Σ_{i≤n} 1/(ϵ_i − λ) v⃗_i v⃗_i^T f⃗ + Σ_{i>n} 1/(ϵ_i − λ) v⃗_i v⃗_i^T f⃗.   (7)

Suppose that ϵ_i is independent of i and equal to ϵ′ for i > n, where n is such an integer that n eigenvectors are easily handled. On this assumption, the second term of the right-hand side of (7) can be rewritten as

    Σ_{i>n} 1/(ϵ_i − λ) v⃗_i v⃗_i^T f⃗ = Σ_{i>n} 1/(ϵ′ − λ) v⃗_i v⃗_i^T f⃗
                                    = 1/(ϵ′ − λ) (I − Σ_{i≤n} v⃗_i v⃗_i^T) f⃗.    (8)

Inserting (8) into (7) we have

    Δx⃗ = Σ_{i≤n} 1/(ϵ_i − λ) v⃗_i v⃗_i^T f⃗ + 1/(ϵ′ − λ) (I − Σ_{i≤n} v⃗_i v⃗_i^T) f⃗.   (9)

While actually an approximation to (2), for finding a saddle point when the Hessian is unavailable, (9) can lead to an efficient approach. In contrast to the conjugate-gradient method as chosen by MW and HJ, the Davidson method is an iterative algorithm which can compute multiple eigensolutions at once even if only the lowest is to be sought. Therefore, if the multiple eigensolutions supplied by the Davidson method are used with (9), faster convergence toward a saddle point can be expected than if only the lowest one is considered.

In practice, determining λ appropriately requires the Hessian again [3]. In the proposed approach, alternative forms of the parallel and perpendicular forces similar in spirit to (9) are introduced as follows:

    f⃗^∥ = (ϵ_max/|ϵ_1|) v⃗_1 v⃗_1^T f⃗,                         (10)

and

    f⃗^⊥ = Σ_{i=2}^{n} (ϵ_max/|ϵ_i|) v⃗_i v⃗_i^T f⃗ + (I − Σ_{i=1}^{n} v⃗_i v⃗_i^T) f⃗,   (11)

where

    ϵ_max = max_i |ϵ_i|.                                     (12)

The proposed approach is a doubly iterative one. The outer loop consists of the following steps. First, the force f⃗ is evaluated at the current atomic configuration x⃗. Second, the Davidson method, as the inner loop, is started with only the lowest eigensolution targeted. The Hessian-vector product is approximated by a finite-difference formula,

    H t⃗ ≈ −(1/η) ( f⃗|_{x⃗+ηt⃗} − f⃗|_{x⃗} ),                    (13)

where t⃗ is a normalized trial vector, specifically a residual associated with v⃗_1 orthonormalized against the current subspace, and η is a scaling parameter. If the stopping criterion for the Davidson method is satisfied after n inner iterations, n eigensolutions are available. Third, relying on these solutions, the modified force f⃗^† is obtained from (6), (10), and (11). Fourth, a tentative step vector Δx⃗′ proportional to f⃗^† is adjusted so that ∥Δx⃗′∥_2 equals a prescribed value. When ϵ_1 is positive, Δx⃗′ is accepted as the established step vector Δx⃗. Otherwise, fifth, f⃗^† is evaluated at x⃗ + Δx⃗′ with the eigensolutions not recalculated. Sixth, a linearized modified force

    ζ f⃗^†|_{x⃗+Δx⃗′} + (1 − ζ) f⃗^†|_{x⃗}                        (14)

is minimized in a least-squares sense with respect to ζ. In other words, a one-dimensional search is performed once. Seventh, ζΔx⃗′ is accepted as the established step vector Δx⃗. The force is evaluated n + 1 or n + 2 times per single cycle of the outer loop.

If (10) and (11) are replaced with (3) and (4) at the third step, respectively, the proposed approach reduces to a conventional one which resembles the MW and HJ methods.

3. Test calculation

While all the available eigensolutions are exploited in the proposed approach, since only the lowest is targeted, the higher ones are not likely to be very accurate. This lack of accuracy may hinder convergence. Therefore a test calculation is performed to confirm whether the proposed approach is actually more efficient than the conventional one.

The system considered here is an InP supercell of 64 atoms with an interstitial Zn atom. Initially the Zn atom is placed at a tetrahedral site surrounded by four In atoms and displaced slightly toward a nearby hexagonal site. The force f⃗ is evaluated by the plane-wave, pseudopotential formalism [10, 11] within density-functional theory [12, 13]. The saddle point is taken to be found when ϵ_1 is negative and ∥f⃗∥_∞ falls within 4 × 10^{−11} N. The inner Davidson loop is terminated either when the iteration count exceeds five or when the 2-norm of the residual vector associated with v⃗_1 is smaller than one tenth of ∥f⃗∥_2. η and ∥Δx⃗′∥_2 are set to 5 × 10^{−13} m and 1 × 10^{−12} m, respectively. The remaining technical details of the formalism are explained elsewhere [14, 15].

The runs with both the proposed and conventional approaches converged, within numerical error, to the same saddle point, where the Zn interstitial atom was located at the hexagonal site. The results are summarized in Table 1. As expected in the previous section, the proposed approach required fewer outer iterations and fewer force evaluations than the conventional one.
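For reference, the Davidson inner loop can be sketched in a few lines. The version below is not the author's implementation: it targets the lowest eigenpair of a symmetric operator known only through matrix-vector products (in the Letter these products come from the force differences of eq. (13)), omits the diagonal preconditioner of the original method, and returns all Ritz pairs of the final subspace, mirroring the higher "otiose" eigensolutions that the proposed approach reuses.

```python
import numpy as np

def davidson(Hv, n, tol=1e-8, seed=1):
    """Bare-bones Davidson iteration without a preconditioner."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    V = v[:, None]                     # orthonormal subspace basis
    W = Hv(v)[:, None]                 # W = H V, one product per new column
    while True:
        Hsub = V.T @ W                 # projected (subspace) Hessian
        eps, U = np.linalg.eigh((Hsub + Hsub.T) / 2)
        theta, u = eps[0], U[:, 0]     # lowest Ritz pair
        res = W @ u - theta * (V @ u)  # its residual
        if np.linalg.norm(res) < tol or V.shape[1] == n:
            return eps, V @ U          # all currently available eigensolutions
        t = res - V @ (V.T @ res)      # orthogonalize the expansion vector
        t /= np.linalg.norm(t)
        V = np.column_stack([V, t])
        W = np.column_stack([W, Hv(t)])
```

Without a preconditioner the expansion degenerates to a Lanczos-like subspace growth, but the point stands: once the lowest pair has converged, the subspace also carries approximate higher eigensolutions at no extra cost in force evaluations.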

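To see why the modified force (5) drives a walker toward a saddle point, consider a toy two-dimensional quadratic surface (my own illustrative example, unrelated to the InP system): V(x) = −x0²/2 + x1² has a first-order saddle at the origin, and stepping along f⃗^† = −f⃗^∥ + f⃗^⊥ with a fixed small step contracts both coordinates.

```python
import numpy as np

# Toy PES: V(x) = -x0**2/2 + x1**2, first-order saddle at the origin.
# Force f = -grad V; Hessian H = diag(-1, 2), lowest mode v1 = e0.
def force(x):
    return np.array([x[0], -2.0 * x[1]])

H = np.diag([-1.0, 2.0])
eps, V = np.linalg.eigh(H)        # eps[0] = -1, the negative curvature
v1 = V[:, 0]                      # eigenvector of the lowest eigenvalue

x = np.array([0.8, 0.6])          # start away from the saddle
for _ in range(50):
    f = force(x)
    f_par = v1 * (v1 @ f)         # eq. (3): component along v1
    f_perp = f - f_par            # eq. (4)
    f_mod = -f_par + f_perp       # eq. (5): uphill along v1, downhill elsewhere
    x = x + 0.3 * f_mod           # fixed step length in place of eq. (1)
```

Reversing the parallel component turns the unstable mode into a stable one, so the plain "steepest descent on the modified force" walk converges to the saddle instead of sliding off it.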
Table 1. The number of times of outer iteration and force evaluation required to find a saddle point with the proposed and conventional approaches.
              Number of times
Approach      Outer iteration   Force evaluation
Proposed      18                124
Conventional  24                178

[Fig. 1. Comparison of ∥f⃗∥∞ (in units of 10^{−9} N, logarithmic scale) versus the number of outer iterations. Solid and dashed lines indicate the results of the proposed and conventional approaches.]

The convergence history is shown in Fig. 1. With the proposed approach the force decreases not merely faster but also more smoothly after the third outer iteration. This demonstrates the efficiency and stability of the proposed approach.

4. Summary

A new eigenvector-following approach without the Hessian matrix has been presented. In the proposed approach, multiple eigensolutions obtained by employing the Davidson method are used together with the force to calculate the steps leading toward a saddle point. Faster convergence is expected than with the conventional approach, which relies merely on the lowest eigensolution. This may not be the case, however, because the higher eigensolutions, not targeted in the Davidson iteration, are in general less accurate. For comparison, the two approaches have been tested on a model system involving diffusion of a Zn interstitial atom in an InP supercell. The test calculation has confirmed that the proposed approach is the preferred choice because it needs fewer steps and fewer force evaluations. The next problem is to find optimal parameter settings that could be adopted by others.

References

[1] H. B. Schlegel, Exploring potential energy surfaces for chemical reactions: an overview of some practical methods, J. Comput. Chem., 24 (2003), 1514–1527.
[2] G. M. Crippen and H. A. Scheraga, Minimization of polypeptide energy XI. The method of gentlest ascent, Arch. Biochem. Biophys., 144 (1971), 462–466.
[3] C. J. Cerjan and W. H. Miller, On finding transition states, J. Chem. Phys., 75 (1981), 2800–2806.
[4] L. J. Munro and D. J. Wales, Defect migration in crystalline silicon, Phys. Rev. B, 59 (1999), 3969–3980.
[5] W. W. Bradbury and R. Fletcher, New iterative methods for solution of the eigenproblem, Numer. Math., 9 (1966), 259–267.
[6] E. R. Davidson, The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices, J. Comput. Phys., 17 (1975), 87–94.
[7] A. Heyden, A. T. Bell and F. J. Keil, Efficient methods for finding transition states in chemical reactions: Comparison of improved dimer method and partitioned rational function optimization method, J. Chem. Phys., 123 (2005), 224010.
[8] R. A. Olsen, G. J. Kroes, G. Henkelman, A. Arnaldson and H. Jónsson, Comparison of methods for finding saddle points without knowledge of the final states, J. Chem. Phys., 121 (2004), 9776–9792.
[9] G. Henkelman and H. Jónsson, A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives, J. Chem. Phys., 111 (1999), 7010–7022.
[10] J. Ihm, A. Zunger and M. L. Cohen, Momentum-space formalism for the total energy of solids, J. Phys. C: Solid State Phys., 12 (1979), 4409–4422.
[11] W. E. Pickett, Pseudopotential methods in condensed matter applications, Comput. Phys. Rep., 9 (1989), 115–197.
[12] P. Hohenberg and W. Kohn, Inhomogeneous electron gas, Phys. Rev., 136 (1964), B864–B871.
[13] W. Kohn and L. J. Sham, Self-consistent equations including exchange and correlation effects, Phys. Rev., 140 (1965), A1133–A1138.
[14] A. Sawamura, Reformulation of the Anderson method using singular value decomposition for stable convergence in self-consistent calculations, JSIAM Letters, 1 (2009), 32–35.
[15] A. Sawamura, M. Kohyama and T. Keishi, An efficient preconditioning scheme for plane-wave-based electronic structure calculations, Comput. Mater. Sci., 14 (1999), 4–7.

– 19 – JSIAM Letters Vol.3 (2011) pp.21–24 © 2011 Japan Society for Industrial and Applied Mathematics

On rounding off quotas to the nearest integers in the problem of apportionment

Tetsuo Ichimori1

1 Department of Information Systems, Osaka Institute of Technology, 1-79-1 Kitayama, Hirakata City, Osaka 573-0196, Japan
E-mail: ichimori is.oit.ac.jp
Received September 16, 2010; Accepted January 16, 2011

Abstract
Simulations are performed in order to make comparisons among five methods of U.S. Congressional apportionment. Specifically, the probability is estimated under each method of apportionment that the number of Representatives allocated to a state is equal to the number obtained by rounding off the quota of that state to the nearest integer. According to the Webster method, numerical evidence shows that the probability is 97.6 percent on average.

Keywords: apportionment, rounding, optimization
Research Activity Group: Mathematical Politics

1. Introduction

The U.S. Constitution requires that "Representatives shall be apportioned among the several States according to their respective numbers, counting the whole number of persons in each State" (see U.S. Constitution, Art. 1, Sec. 2, Amend. 14, Sec. 2). Because each state must be represented by a whole number of Representatives, it is almost impossible to carry out the requirement exactly. In fact, the U.S. Supreme Court (in the case of United States Department of Commerce v. Montana, 503 U.S. 442 (1992)) admits this fact. The issue of apportioning Representatives among the several states constitutionally has been debated for over 200 years.

Mathematically speaking, let s denote the number of states, h the total number of seats to be apportioned, or the house size, and p = (p_1, ..., p_s) the populations of the s states, where p_i is a positive integer for each i. In the theoretically perfect apportionment, the proportional share, namely the quota, of state i is q_i = h p_i / p*, where p* is the total population of the country, i.e., p* = Σ_j p_j. Let a = (a_1, ..., a_s) ≥ 0 be a vector of non-negative integers; then the vector a is called an apportionment of h if Σ_i a_i = h. Carrying out the constitutional requirement exactly means achieving the mathematical equality a = q, where q = (q_1, ..., q_s) is the vector of quotas. Undoubtedly this is virtually impossible.

One of the most natural methods of apportioning Representatives among the states might be rounding off the quotas in the usual way. Mathematically, this implies that a_i must be [q_i]_{0.5} for all i and that the equality Σ_i a_i = h must be achieved, where [z]_{0.5} is the integer obtained by rounding off z in the usual way, namely, [z]_{0.5} is the nearest whole number to z. If the fractional part of z is exactly 0.5, then [z]_{0.5} can be either of the two consecutive integers z − 0.5 and z + 0.5.

On the other hand, the probability that such a rounding-off method produces an apportionment of just h would be very low. Generally, such a rounding-off method would produce an apportionment of another house size h′ ≠ h. Conversely speaking, a method producing an apportionment of just h does not generally give an apportionment satisfying a_i = [q_i]_{0.5} for all i. Nevertheless, it can very much be expected that any reasonable method producing an apportionment of just h will give an apportionment satisfying a_i = [q_i]_{0.5} for almost all i.

The purpose of this article is to identify a method of apportionment which can produce an apportionment of h satisfying a_i = [q_i]_{0.5} for as many states i as possible, because such a method of apportionment seems to be the most natural.

2. The Hamilton method and the Alabama paradox

Because the constitutional requirement that the number of Representatives to which each state is entitled shall be proportional to the population of that state cannot be met completely, it might be reasonable to seek an apportionment a which is as close to the vector of quotas q = (q_1, ..., q_s) as practicable.

In fact, the method given by the first apportionment bill passed by Congress in 1792 minimizes the distance between these two vectors a and q, i.e., ∥a − q∥; that is, it minimizes

    Σ_{i=1}^{s} (a_i − q_i)^2   s.t.  Σ_{i=1}^{s} a_i = h and a_i ∈ N for all i,

where N denotes the set of non-negative integers and "s.t." is an abbreviation for "subject to." At that time, the values s = 15 and h = 120 were used. Although this bill was vetoed by President Washington, the apportionment is appealing because exactly the same apportionment results if the quotas are rounded off in the usual way; see Table 1.
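This least-squares apportionment can be computed by flooring each quota and handing the leftover seats to the largest fractional remainders, which is the greatest-remainders rule discussed in the next section. A minimal sketch (tested below on the hypothetical three-state populations introduced later in the text, not the 1792 data):

```python
def hamilton(populations, h):
    """Method of greatest remainders: floor each quota, then give the
    leftover seats to the states with the largest fractional parts.
    The result also minimizes sum_i (a_i - q_i)^2 s.t. sum_i a_i = h."""
    total = sum(populations)
    quotas = [h * p / total for p in populations]
    seats = [int(q) for q in quotas]           # whole number of each quota
    leftover = h - sum(seats)
    by_remainder = sorted(range(len(populations)),
                          key=lambda i: quotas[i] - seats[i], reverse=True)
    for i in by_remainder[:leftover]:          # largest remainders first
        seats[i] += 1
    return seats
```

For the three-state populations (235, 333, 432) used later, this gives (3, 3, 4) at h = 10 but (2, 4, 5) at h = 11: the first state loses a seat as the house grows, which is exactly the Alabama paradox discussed below.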


Table 1. First apportionment bill in 1792, extracted from [1].

State            Population   Quota     Apportionment
Virginia            630,560    20.926   21
Massachusetts       475,327    15.744   16
Pennsylvania        432,879    14.366   14
North Carolina      353,523    11.732   12
New York            331,589    11.004   11
Maryland            278,514     9.243    9
Connecticut         236,841     7.860    8
South Carolina      206,236     6.844    7
New Jersey          179,570     5.959    6
New Hampshire       141,822     4.707    5
Vermont              85,533     2.839    3
Georgia              70,835     2.351    2
Kentucky             68,705     2.280    2
Rhode Island         68,446     2.271    2
Delaware             55,540     1.843    2
Totals            3,615,920   120.000  120

satisfies such a property, then it will be said that the method satisfies the "rounding-off constraints." If not, it will be said that it violates the rounding-off constraints.

It is clear that rounding off quotas in the usual way does not always yield an apportionment of h. For example, let there be three states (s = 3) whose populations are p_1 = 235, p_2 = 333 and p_3 = 432. When the total number of representatives is h = 10, the quotas of the three states are q_1 = 2.35, q_2 = 3.33 and q_3 = 4.32. Rounding off the quotas in the usual way yields a vector a = (2, 3, 4), which is an apportionment of only nine representatives; that is, one representative is surplus. If the total number of representatives increases by one, i.e., h = 11, then the quotas of the three states change into q_1 = 2.585, q_2 = 3.663 and q_3 = 4.752. Rounding them off as before gives another vector a = (3, 4, 5). This time, an apportionment of as many as twelve representatives results, and one representative is short because there are no more than eleven representatives.

In order to overcome this difficulty, Alexander Hamilton invented an apportionment method, and several decades later Samuel Vinton reinvented it; today it is referred to as the "Hamilton method" or the "method of greatest remainders." In fact, the Hamilton method yields the same seat distribution to 15 states as that of the first apportionment bill passed by Congress in 1792. Although, as said above, this method produces an apportionment which minimizes Σ_{i=1}^s (a_i − q_i)^2 subject to the constraints given above, another explanation of this method is more familiar: each state i receives the number of Representatives corresponding to the whole number of the quota q_i, that is, the number obtained by ignoring the fractional remainder. The remaining Representatives are distributed to the states with the largest fractional remainders.

Unfortunately, the Hamilton method is subject to the so-called "Alabama paradox." The first numerical example gives an apportionment a = (3, 3, 4) of ten Representatives under the Hamilton method, while the second one gives an apportionment a = (2, 4, 5) of eleven Representatives. Then, the first state gets three Representatives when the house size is h = 10, while it gets only two Representatives when the house size increases by one, i.e., h = 11. This peculiar phenomenon is known as the Alabama paradox because it occurred in the State of Alabama. Although the Hamilton method had been used under the censuses of 1850 through 1900, Congress rejected it in 1911 because of this paradox.

3. Methods of apportionment
After the Hamilton method was rejected, Congress returned to the Webster method, which had been used after the 1840 census. After debating the proper method of apportionment for several decades, Congress adopted the Hill method in 1941, and it has been used ever since. The methods of Webster and Hill come under the so-called "divisor methods," which can avoid the Alabama paradox; see [1] for the details of other paradoxes. Therefore, the scope of the debate over the proper method of apportionment may be reduced mainly to the methods of Webster and Hill. In what follows, divisor methods are described briefly.

3.1 Divisor methods
Define a real-valued function d(a) on the non-negative integers a ≥ 0. The function d(a) is strictly increasing in a. It satisfies a ≤ d(a) ≤ a + 1, and moreover there is no pair of integers b ≥ 1 and c ≥ 0 with d(b) = b and d(c) = c + 1.

Let z be a positive real number and [z] denote an integer satisfying the following. (i) For d with d(0) = 0: if d(a − 1) < z < d(a) for some positive integer a ≥ 1, then [z] = a; if z = d(a − 1) for some positive integer a ≥ 1, then [z] = a − 1 or a. (ii) For d with d(0) > 0: additionally define d(−1) = 0; if d(a − 1) < z < d(a) for some non-negative integer a ≥ 0, then [z] = a; if z = d(a) for some non-negative integer a ≥ 0, then [z] = a or a + 1.

Next introduce a divisor method M and a divisor x > 0. A divisor x means that each Representative is given an approximate constituency of x persons. If the equality Σ_{i=1}^s [p_i/x] = h is achieved for some divisor x > 0, then the number of Representatives which state i receives is a_i = [p_i/x], where p_i/x is referred to as the "quotient" of state i.

Given p, h ≥ s, and d(a), a divisor method M is defined as the set of apportionments

  M = { a : a_i = [p_i/x] and Σ_{i=1}^s [p_i/x] = h for some x > 0 }.

Although there can be innumerable divisor methods, the following methods are known as the "five historical methods" and have received special treatment for a long time:
• the Adams method with d(a) = a,
• the Dean method with d(a) = a(a + 1)/(a + 0.5),
• the Hill method with d(a) = √(a(a + 1)),
• the Webster method with d(a) = a + 0.5,
• the Jefferson method with d(a) = a + 1.

3.2 Relaxedly proportional methods
In the history of U.S. apportionment, the Jefferson method was used after each of the first four censuses and was abandoned by Congress because it tends to favor large states over small states. The Adams method
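The Hamilton method and a generic divisor method of Section 3.1 can be sketched in a few lines of Python (a minimal illustration; the function names and the bisection search for the divisor x are ours, not the paper's):

```python
import math

def hamilton(pops, h):
    # Hamilton / greatest remainders: each state gets floor(q_i) seats,
    # and the leftover seats go to the largest fractional remainders.
    p = sum(pops)
    quotas = [h * pi / p for pi in pops]
    seats = [math.floor(q) for q in quotas]
    rest = h - sum(seats)
    by_remainder = sorted(range(len(pops)),
                          key=lambda i: quotas[i] - seats[i], reverse=True)
    for i in by_remainder[:rest]:
        seats[i] += 1
    return seats

def divisor_method(pops, h, d):
    # Generic divisor method: bisect for a divisor x with sum_i [p_i/x] = h,
    # where [z] rounds z to a whenever d(a-1) < z <= d(a); ties z = d(a) are
    # resolved to the smaller value, which rules (i)/(ii) permit.
    def round_d(z):
        a = 0
        while d(a) < z:          # smallest a with z <= d(a)
            a += 1
        return a
    lo, hi = 1e-9, float(sum(pops))
    for _ in range(200):
        x = (lo + hi) / 2
        s = sum(round_d(pi / x) for pi in pops)
        if s > h:
            lo = x               # too many seats: the divisor must grow
        elif s < h:
            hi = x
        else:
            return [round_d(pi / x) for pi in pops]
    return None

webster = lambda a: a + 0.5      # d(a) of the Webster method
```

On the three-state example of Section 2, `hamilton` reproduces the apportionments (3, 3, 4) for h = 10 and (2, 4, 5) for h = 11 quoted in the text.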

JSIAM Letters Vol. 3 (2011) pp.21–24    Tetsuo Ichimori

was considered by Congress but it was not adopted because it tends to favor small states over large states, in contrast to the Jefferson method. The Dean method has never been used by Congress in the history of apportionment. Recently, this author has developed a class of "relaxedly proportional" methods, see [2] for the details. He explains why these three methods, i.e., the methods of Adams, Dean and Jefferson, produce apportionments which are not proportional to the populations of the states in some sense.

Table 2. Expected numbers of states violating the rounding-off constraints according to the 2000 through 1960 censuses.

        Hill   T&S    Theil  Webster  "1/3"
2000    2.014  1.669  1.377  1.213    1.409
1990    2.049  1.700  1.389  1.212    1.419
1980    2.089  1.724  1.404  1.210    1.415
1970    2.216  1.743  1.416  1.213    1.439
1960    2.317  1.867  1.469  1.212    1.409
means   2.137  1.741  1.411  1.212    1.418

Now consider the following minimization problem:

  minimize Σ_{i=1}^s a_i^2/p_i  s.t. Σ_{i=1}^s a_i = h and a_i ∈ N for all i's.

Let W denote the set of all optimal solutions a = (a_1, ..., a_s); then it is well known that W defines the Webster method, see [1]. In other words, any apportionment of h produced by the Webster method minimizes Σ_i a_i^2/p_i subject to Σ_i a_i = h and a_i ∈ N for all i's, while any optimal solution a = (a_1, ..., a_s) of the minimization problem above is an apportionment of h under the Webster method. Here it should be noticed that the Webster method, which is a divisor method, can also be defined through a discrete optimization problem.

Next consider its continuous relaxation, minimizing

  Σ_{i=1}^s a_i^2/p_i  s.t. Σ_{i=1}^s a_i = h and a_i ∈ R_+ for all i's,

where R_+ denotes the set of positive real numbers. Then it is clear that there exists some positive λ > 0 such that (a_i^2/p_i)' = 2(a_i/p_i) = λ for all i's at optimality, which means that a_i is proportional to p_i for all i's at optimality. In other words, a_i = (λ/2)p_i = (h/p)p_i = q_i at optimality. Then, the Webster method is said to be relaxedly proportional. In general, if an apportionment method can be described in the form of a discrete optimization problem and its continuous relaxation has an optimal solution identical to the vector of quotas, i.e., a = q, then the method is relaxedly proportional.

Similarly, the Hill method is obtained by minimizing

  Σ_{i=1}^s p_i^2/a_i  s.t. Σ_{i=1}^s a_i = h and a_i ∈ N_+ for all i's,

where N_+ denotes the set of positive integers. This author proposed to use the following three relaxedly proportional methods instead of the methods of Adams, Dean and Jefferson, which were shown not to be relaxedly proportional, see [2]:

• the Theil-Schrage (T&S for short) method with d(0) = 0 and d(a) = 1/log((a + 1)/a) for all integers a ≥ 1, which is obtained by maximizing
  Σ_{i=1}^s p_i log a_i  s.t. Σ_{i=1}^s a_i = h and a_i ∈ N_+ for all i's;
• the Theil method with d(0) = 1/e ≈ 0.37 and d(a) = (1/e)(a + 1)^{a+1}/a^a for all integers a ≥ 1, which is obtained by minimizing
  Σ_{i=1}^s a_i log(a_i/p_i)  s.t. Σ_{i=1}^s a_i = h and a_i ∈ N for all i's;
• the "1/3" method with d(a) = √(a^2 + a + 1/3) for all integers a ≥ 0, which is obtained by minimizing
  Σ_{i=1}^s a_i^3/p_i^2  s.t. Σ_{i=1}^s a_i = h and a_i ∈ N for all i's.

The "new five" are defined to be the methods of Hill, T&S, Theil, Webster and "1/3". They are not only divisor methods but also relaxedly proportional. See [3, 4] for the ancestors of the methods of Theil and T&S.

4. Violating the rounding-off constraints
The purpose of this section is to study on average how many of the s states violate their rounding-off constraints, i.e., |a_i − q_i| ≤ 0.5, under apportionments of h produced according to the new five methods.

First, according to the 2000 census, fix an apportionment of the 435 Representatives among the 50 states produced by each of the new five methods. Note here that the Hill method produces the existing apportionment according to the 2000 census.

Let method M define an apportionment a = a(M) and a divisor x = x(a(M)). If each random population P_i is uniformly distributed on the interval

  d(a_i(M) − 1) x(a(M)) ≤ P_i ≤ d(a_i(M)) x(a(M)),

then the apportionment method M gives the same apportionment a(M) for the populations P_1, ..., P_s as for the actual populations according to the 2000 census.

To avoid the unrealistic assumption of very small states, assume in estimating the total number of states violating the rounding-off constraints that no state's quotient is less than 0.5. In other words, the random population of each state is assumed to be uniformly distributed on the interval

  max{0.5, d(a_i(M) − 1)} x(a(M)) ≤ P_i ≤ d(a_i(M)) x(a(M)).

One million instances are generated for each of the new five methods. Then the average number of states out of the 50 states which violate the rounding-off constraints is estimated for each method. In addition, the same simulation is run for each of the 1990 through 1960 censuses, see Table 2.
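The sampling procedure just described can be sketched as follows (a scaled-down illustration; the paper uses one million instances and the actual census apportionments, while the function name and trial count here are ours):

```python
import random

def expected_violations(a, x, d, trials=10000, seed=0):
    # Section-4 experiment in miniature: redraw each state's population P_i
    # uniformly on [max(0.5, d(a_i - 1)) * x, d(a_i) * x], the interval on
    # which the divisor method d still returns the apportionment a, then
    # count the states whose quota h * P_i / P strays from a_i by over 0.5.
    rng = random.Random(seed)
    h = sum(a)
    total = 0
    for _ in range(trials):
        P = [rng.uniform(max(0.5, d(ai - 1)) * x, d(ai) * x) for ai in a]
        p = sum(P)
        total += sum(1 for ai, Pi in zip(a, P) if abs(ai - h * Pi / p) > 0.5)
    return total / trials
```

Averaging this count over many draws gives exactly the "expected number of states violating the rounding-off constraints" reported in Tables 2 and 3.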


The results of these simulations show that the average number of states whose numbers of Representatives are a_i = [q_i]_{0.5} under the Webster method is about 48.8, while that under the Hill method is about 47.9.

Next, according to the 1950 through 1920 censuses, fix an apportionment of the 435 Representatives among the 48 states produced by each of the new five methods. Note that Alaska and Hawaii became the 49th and 50th states of the United States in 1959. The same procedure is repeated, see Table 3. For easy comparison with Table 2, each entry on the last line shows the value of the respective entry on the second-to-last line multiplied by 50/48. Numbers similar to those in Table 2 appear here.

Table 3. Expected numbers of states violating the rounding-off constraints according to the 1950 through 1920 censuses.

             Hill   T&S    Theil  Webster  "1/3"
1950         2.138  1.751  1.408  1.177    1.331
1940         2.134  1.750  1.406  1.177    1.330
1930         1.980  1.651  1.358  1.176    1.361
1920         2.010  1.666  1.316  1.178    1.349
means        2.066  1.705  1.372  1.177    1.343
50/48 times  2.151  1.776  1.429  1.226    1.399

The U.S. Constitution also requires that "each State shall have at least one Representative" (see U.S. Constitution, Art. 1, Sec. 2). Since this requirement favors extremely small states, it might be better to modify the quota q_i = h p_i / p of each state i, changing it into q~_i = max{1, θ q_i}, where θ satisfies

  Σ_{i=1}^s max{1, θ q_i} = h.

In other words, the quotas of all states are reduced proportionally but never reduced to less than one. If the quota q_i is replaced by the modified one q~_i for each state i, then the rounding-off constraint |a_i − q_i| ≤ 0.5 should be replaced by the modified rounding-off constraint, i.e., |a_i − q~_i| ≤ 0.5.

Simulations are performed according to this modification. Tables 4 and 5 present the simulation results. They show that the expected numbers of states violating the constraints under the methods of Hill, T&S and Theil decrease slightly, and those under the methods of Webster and "1/3" increase almost as much. Therefore the difference between the methods of Hill and Webster shrinks a little.

Table 4. Expected numbers of states violating the modified rounding-off constraints according to the 2000 through 1960 censuses.

        Hill   T&S    Theil  Webster  "1/3"
2000    1.837  1.585  1.374  1.255    1.483
1990    1.861  1.603  1.380  1.257    1.496
1980    1.914  1.627  1.381  1.225    1.494
1970    2.020  1.638  1.401  1.284    1.551
1960    2.133  1.755  1.428  1.229    1.461
means   1.953  1.642  1.393  1.250    1.497

Table 5. Expected numbers of states violating the modified rounding-off constraints according to the 1950 through 1920 censuses.

             Hill   T&S    Theil  Webster  "1/3"
1950         1.993  1.661  1.366  1.164    1.341
1940         1.990  1.659  1.364  1.165    1.400
1930         1.809  1.556  1.336  1.195    1.410
1920         1.883  1.587  1.302  1.196    1.395
means        1.919  1.616  1.342  1.180    1.387
50/48 times  1.999  1.683  1.398  1.229    1.444

5. Conclusions
As generally admitted, the debate over the proper method of apportionment narrows down to which is better, Webster's or Hill's method. Using the numerical results of Table 2, the probability that one state satisfies its rounding-off constraint under the method of Webster is about 97.58%, while that under the method of Hill is about 95.73%. The Webster method wins by only 1.85 points.

In this article, the rounding-off constraints are proposed to identify which method is superior to the others. Although this identification is limited to the new five methods (the methods of Hill, T&S, Theil, Webster and "1/3"), they include the leading two methods, namely Webster's and Hill's, and satisfy the most telling properties in the apportionment problem, see [2, 5]; hence the limitation seems to be reasonable.

In the end, the Webster method turned out to produce almost the same apportionment as that obtained by rounding off all the quotas of the states in the usual way. This is one of the most important properties which a proper apportionment method should have. From this standpoint we can say that the Webster method is better than any other method discussed in this article.

References
[1] M. L. Balinski and H. P. Young, Fair Representation, Yale Univ. Press, New Haven, 1982.
[2] T. Ichimori, New apportionment methods and their quota property, JSIAM Letters, 2 (2010), 33–36.
[3] H. Theil, The desired political entropy, Amer. Polit. Sci. Rev., 63 (1969), 521–525.
[4] H. Theil and L. Schrage, The apportionment problem and the European Parliament, Eur. Econ. Rev., 9 (1977), 247–263.
[5] M. L. Balinski, The problem with apportionment, J. Oper. Res. Soc. Jpn, 36 (1993), 134–148.

JSIAM Letters Vol. 3 (2011) pp.25–28    © 2011 Japan Society for Industrial and Applied Mathematics

Traveling wave solutions to the nonlinear evolution equation for the risk preference

Naoyuki Ishimura1 and Sakkakom Maneenop1

1 Graduate School of Economics, Hitotsubashi University, Kunitachi, Tokyo 186-8601, Japan
E-mail: ishimura@econ.hit-u.ac.jp, ed101005@g.hit-u.ac.jp
Received October 6, 2010; Accepted January 29, 2011

Abstract
A singular nonlinear partial differential equation (PDE) is introduced, which can be interpreted as the evolution of the risk preference in the optimal investment problem under the random risk process. The unknown quantity is related to the Arrow-Pratt coefficient of relative risk aversion with respect to the optimal value function. We show the existence of monotone traveling wave solutions and the nonexistence of non-monotone such solutions, which are suitable from the standpoint of financial economics.

Keywords: optimal economic behavior, Arrow-Pratt coefficient of relative risk aversion, risk preference, singular nonlinear partial differential equation, traveling wave solutions

Research Activity Group: Mathematical Finance

1. Introduction
In this article we propose a singular nonlinear partial differential equation (PDE) which is derived from the Hamilton-Jacobi-Bellman (HJB) equation for the value function in the optimal investment problem. We recall that optimal behavior within continuous-time economic environments has been an intensive area of research and that many models have already been introduced within the stochastic control framework. The analysis is then often reduced to the treatment of the HJB equation for the value function. However, the HJB equation is typically fully nonlinear and hard to solve; it may not be an exaggeration to say that all we can do is merely guess a shape of solution and manage to arrange the parameters. See for instance [1].

We here propose a different approach and derive a singular quasilinear PDE from the HJB equation. Although the essential difficulties are equivalent to those expressed by the HJB equation, the derived PDE is rather simple looking when viewed from the theory of nonlinear PDE. Moreover, the unknown quantity is related to the Arrow-Pratt coefficient of relative risk aversion [2] with respect to the optimal value function. In this sense our PDE may be interpreted as the characteristic equation for the risk structure of the model. We do not insist that our PDE would replace the HJB equation itself, but we at least believe that the study of this PDE is interesting, as well as important.

The equation is related to our previous work [3, 4], which is concerned with the evolution of the risk preference whose unknown quantity is related to the Arrow-Pratt coefficient of the "absolute" risk aversion. The current equation is formulated with the "relative" risk aversion, which is much more popular in financial economics. The main purpose of this article is to prove the existence of monotone traveling wave solutions to this PDE. The solutions can be interpreted positively from the viewpoint of financial economics. In addition, we show the nonexistence of non-monotone traveling wave solutions, by which we refer to those whose derivative changes sign several times. This observation is also welcome as a financial concept.

We here perform an analytical study. A numerical investigation, in particular for the monotone traveling wave solution, is attempted in [5]. See also [6–9].

The organization of the paper is as follows. In Section 2 we recall the model and introduce our PDE. Sections 3 and 4 are devoted to proving the existence of monotone traveling wave solutions and the nonexistence of non-monotone traveling wave solutions, respectively. We conclude with discussions in Section 5.

2. Model
Here we briefly review our model. Suppose that the wealth X_t at time t (≥ 0) of the company is subject to a fluctuating process, and the company wants to invest in one risky stock. We assume that the price P_t of the stock available for investment is governed by the stochastic differential equation of Black-Scholes-Merton type [10, 11], dP_t = P_t(μ dt + σ dW_t^{(1)}), where μ and σ are constants and {W_t^{(1)}}_{t≥0} is a standard Brownian motion. The fluctuating process, which directly affects the wealth of the company, is denoted by Y_t and is assumed to evolve as dY_t = α dt + β dW_t^{(2)}, where α and β (β > 0) are constants and {W_t^{(2)}}_{t≥0} is another standard Brownian motion. It is allowed that these two Brownian motions be correlated, with correlation coefficient ρ (0 ≤ |ρ| < 1).

The investment policy f = {f_t}_{0≤t≤T} of the company is a suitable admissible adapted control process.
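The two driving processes of the model can be simulated with a standard Euler scheme (an illustration of the setup only; the scheme, parameter values and function name are ours, not the paper's):

```python
import math
import random

def simulate_paths(mu, sigma, alpha, beta, rho, T=1.0, n=1000, seed=1):
    # Euler discretization of dP = P(mu dt + sigma dW1) and
    # dY = alpha dt + beta dW2, with corr(dW1, dW2) = rho, 0 <= |rho| < 1.
    rng = random.Random(seed)
    dt = T / n
    P, Y = 1.0, 0.0
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dw1 = math.sqrt(dt) * z1
        # build a Brownian increment with correlation rho to dw1
        dw2 = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho**2) * z2)
        P += P * (mu * dt + sigma * dw1)
        Y += alpha * dt + beta * dw2
    return P, Y
```

The correlated pair (dW1, dW2) is obtained by the usual Cholesky-type mixing of two independent Gaussians, which realizes the coefficient ρ of the model.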

Here T stands for the maturity date. The stochastic process of the wealth X_t^f of the company is then assumed to be expressed as

  dX_t^f / X_t^f = f_t dP_t/P_t + dY_t = (f_t μ + α) dt + f_t σ dW_t^{(1)} + β dW_t^{(2)},  X_0^f = x ∈ R.

Suppose that the company aims to maximize the utility u(x) of its terminal wealth. The utility function u(x) is customarily assumed to satisfy u' > 0 and u'' < 0. Let

  V(x, t) := sup_f E[u(X_T^f) | X_t^f = x].   (1)

Now the Hamilton-Jacobi-Bellman equation for the value function (1) becomes

  sup_f {A^f V(x, t)} = 0,  V(x, T) = u(x),   (2)

where the generator A^f is given by

  (A^f g)(x, t) := ∂g/∂t + (fμ + α) x ∂g/∂x + (1/2)(f^2 σ^2 + β^2 + 2βσρf) x^2 ∂^2g/∂x^2.

Suppose that (2) has a classical solution V with ∂V/∂x > 0 and ∂^2V/∂x^2 < 0. We then discover that the optimal policy {f_t^*}_{0≤t≤T} is

  f_t^* = −(μ/σ^2) (∂V/∂x)/(x ∂^2V/∂x^2) − βρ/σ.   (3)

Placing (3) back into (2), we obtain

  ∂V/∂t + (α − βρμ/σ) x ∂V/∂x − (μ^2/(2σ^2)) (∂V/∂x)^2/(∂^2V/∂x^2) + (1/2) β^2 (1 − ρ^2) x^2 ∂^2V/∂x^2 = 0  for 0 < t < T,
  V(T, x) = u(x).   (4)

Let τ := (1 − ρ^2)β^2(T − t)/2 and put V(x, τ) = V(x, t) by abuse of notation; we find that

  ∂V/∂τ = x^2 ∂^2V/∂x^2 − a^2 (∂V/∂x)^2/(∂^2V/∂x^2) − b x ∂V/∂x,
  V(x, 0) = u(x),   (5)

where we have set

  a^2 := μ^2/((1 − ρ^2)β^2σ^2),  b := 2(ρμβ − ασ)/((1 − ρ^2)β^2σ).

Eq. (5) is of fully nonlinear parabolic type [8]. Now we define

  r(x, τ) := −x (∂^2V/∂x^2)/(∂V/∂x) = −x (∂/∂x) log(∂V/∂x)(x, τ),   (6)

which extends the Arrow-Pratt coefficient of relative risk aversion for the utility function. Here we note that r is introduced with respect to the optimal value function. A similar transformation is considered in [12], where the transformation −V_x/V_{xx} is employed.

Following [13], we make the change of variables x = e^y (y = log x) and put r(y, τ) = r(x, τ); we infer that

  ∂r/∂τ = (∂^2/∂y^2 + ∂/∂y)(r − a^2/r) − (2r + b) ∂r/∂y  for −∞ < y < ∞, τ > 0.   (7)

In the following two sections, we prove the existence of monotone traveling wave solutions and the nonexistence of non-monotone solutions to (7).

3. Monotone traveling wave solution
For a standard risk-averse investor, the coefficient of relative risk aversion is expected to be non-increasing [14]. In addition, it is easy to see that every constant function verifies (7). We thus wish to seek a traveling wave solution r = r(y − vτ) with the property

  r'(y) < 0 for −∞ < y < ∞,  r(y) → r_- as y → −∞,  and  r(y) → r_+ as y → ∞,   (8)

where r_- > r_+ > 0 are prescribed constants and the wave speed v ∈ R is to be determined later on.

Putting r(y, τ) = r(y − vτ) into (7), we derive the ordinary differential equation

  −v r' = (r − a^2/r)'' + (r − a^2/r − r^2 − br)',   (9)

where r = r(y) and ' = d/dy. Integrating once, we obtain

  (r − a^2/r)' + r − a^2/r − r^2 − br + vr = C.   (10)

Here C denotes a constant, and from the boundary condition (8) we deduce that

  C = r_- r_+ − a^2 (r_- + r_+)/(r_- r_+),  v = r_- + r_+ − a^2/(r_- r_+) + b − 1.   (11)

Eq. (10) can be written in the separable form

  (r^2 + a^2) dr / ( r [ r^3 + (b − v − 1) r^2 + C r + a^2 ] ) = dy.   (12)

We define f(r) := r^3 − (v + 1 − b) r^2 + C r + a^2, which is the factor of the denominator in (12). The condition (11) implies that f(r_-) = f(r_+) = 0. Since f(0) = a^2 > 0, the solution r of (10) which fulfills (8) is constructed implicitly through the integration of (12).


The integration is carried out for r ∈ (r_+, r_-) and ∞ > y > −∞, provided the prescribed constants r_- > r_+ (> 0) are realized as positive real numbers.

We examine this criterion. Taking account of f'(r) = 3r^2 − 2(v + 1 − b)r + C, we learn that the requirements are

  ( i ) (v + 1 − b)^2 − 3C > 0,
  ( ii ) v + 1 − b > 0,
  (iii) f( (v + 1 − b + √((v + 1 − b)^2 − 3C))/3 ) < 0.

In view of (11), condition ( i ) is reduced to

  r_-^2 − r_- r_+ + r_+^2 + a^2 (r_- + r_+)/(r_- r_+) + a^4/(r_-^2 r_+^2) > 0,

which is true for r_- > r_+ > 0. Condition ( ii ) results in r_- + r_+ − a^2 r_-^{-1} r_+^{-1} > 0, which should be imposed beforehand. Finally, condition (iii) becomes, after a tedious calculation,

  8a^4 < 2(r_- + r_+)(r_-^2 + r_- r_+ + r_+^2) a^2 + ((r_- + r_+)^2 (r_-^2 + r_+^2)/(r_-^2 r_+^2)) a^4
         + (2(r_- + r_+)(r_- − r_+)^2/(r_-^3 r_+^3)) a^6 + ((r_- − r_+)^2/(r_-^4 r_+^4)) a^8 + r_- r_+ (r_- − r_+)^2.

By virtue of r_-^{-2} r_+^{-2} (r_-^2 + r_+^2)(r_- + r_+)^2 ≥ 8, this requirement is always satisfied.

To summarize, we have completed the proof of the next theorem.

Theorem 1  For any r_- > r_+ > 0 satisfying r_- r_+ (r_- + r_+) > a^2, there exists a traveling wave solution r = r(y − vτ) to (7) with v = r_- + r_+ − a^2 r_-^{-1} r_+^{-1} + b − 1 such that

  r'(y) < 0 for −∞ < y < ∞,

and r(y) → r_- as y → −∞ and r(y) → r_+ as y → ∞, respectively.

4. Nonexistence of non-monotone traveling wave solutions
In this section we make the elementary observation that there exist no non-monotone traveling wave solutions to (7). Here, by a non-monotone solution, we mean a traveling wave solution whose derivative changes sign several times. As examples, the solution r = r(y) to (9) with r'(y) > 0 on −∞ < y < l_0 and r'(y) < 0 on l_0 < y < ∞ for some l_0 ∈ R is referred to as a "one-pulse" solution; r = r(y) with r'(y) < 0 on −∞ < y < l_0, l_{2i−1} < y < l_{2i} (i = 1, 2, ..., m) and r'(y) > 0 on l_{2i} < y < l_{2i+1} (i = 0, 1, ..., m − 1), l_{2m} < y < ∞ for some −∞ < l_0 < l_1 < ··· < l_{2i−1} < l_{2i} < ··· < l_{2m} < ∞ is referred to as an "(m+1)-bump" solution. We remark that a similar nonexistence result holds true for solutions whose derivative changes sign an even number of times.

The proof proceeds as follows. Suppose the solution r = r(y) to (9) changes the sign of its derivative, and let r'(l_0) = 0 for some l_0 ∈ R. We know that the ODE (9) is equivalent to the first-order system

  dr/dy = r',
  dr'/dy = (1 + a^2/r^2)^{-1} [ (2r + b − v − 1 − a^2/r^2) r' + (2a^2/r^3)(r')^2 ].

Since this system is regular at (r(l_0), r'(l_0)) = (r(l_0), 0) and r(y) ≡ r(l_0) solves the system, we conclude that the solution r = r(y) must be the constant function, thanks to the uniqueness theorem for ODEs. This is a contradiction, and we obtain the next theorem.

Theorem 2  There exists no traveling wave solution r = r(y − vτ) (v ∈ R) to (7) such that r' changes sign.

5. Discussions
We have introduced a singular quasilinear parabolic equation for the risk preference. The unknown function is related to the coefficient of relative risk aversion with respect to the value function in the optimal investment problem. We established the existence of monotone traveling wave solutions and the nonexistence of non-monotone traveling wave solutions. Since the coefficient of relative risk aversion is claimed to be non-increasing, our existence theorem for monotone solutions, as well as the nonexistence theorem for non-monotone solutions, is welcome from the standpoint of financial economics.

The nonexistence theorem for non-monotone solutions corresponds well with economic theory, where the coefficient of risk aversion is clearly always nonnegative. The existence of monotone solutions, however, casts doubt on what happens in the markets, despite resulting in nonnegative wave solutions. Along the traveling wave solution, as the maturity gets closer, the solution decreases. This means that a company (or an individual) becomes less risk averse (recall that such a solution is determined as the coefficient of relative risk aversion). We can infer that the company is less risk averse in short-term investment and more risk averse in long-term investment. This is, however, counterintuitive in the general case, where it should be the opposite. In brief, long-term investors tend to be less risk averse than short-term investors, as annualized volatilities of returns on some assets are lower in the longer term.

Nevertheless, we may interpret this counterintuitive property as a special case. For example, when an economy has been stable for a long period or is in the recovering process from its trough, it seems that an individual or a company will be very cautious about its investment strategy in the longer term. The company, thus, is less risk averse for short-term investment (when it predicts that markets are stable) and more risk averse for long-term investment (when it forecasts that markets will be more volatile).

Also, as to the derived equation (7) itself, there certainly remain many open questions. For instance, a general existence theorem is an interesting problem, which is worth further research.

Acknowledgments
We are grateful to the referee for various precious comments, which helped in improving the manuscript. The first author (NI) is partially supported by Grant-in-Aid for Scientific Research (C) No. 21540117 from the Japan Society for the Promotion of Science (JSPS).

References

[1] T. Björk, Arbitrage Theory in Continuous Time, 2nd ed., Oxford Univ. Press, Oxford, 2004.
[2] J. W. Pratt, Risk aversion in the small and in the large, Econometrica, 32 (1964), 122–136.
[3] R. Abe and N. Ishimura, Existence of solutions for the nonlinear partial differential equation arising in the optimal investment problem, Proc. Jpn Acad., Ser. A, 84 (2008), 11–14.
[4] N. Ishimura and K. Murao, Nonlinear evolution equations for the risk preference in the optimal investment problem, paper presented at AsianFA/NFA 2008 Int. Conf. in Yokohama, http://fs.ics.hit-u.ac.jp/nfa-net/.
[5] N. Ishimura, M. N. Koleva and L. G. Vulkov, Numerical solution of a nonlinear evolution equation for the risk preference, Lect. Notes Comp. Sci., Vol. 6046, pp. 445–452, 2011.
[6] N. Ishimura and H. Imai, Global in space numerical computation for the nonlinear Black-Scholes equation, in: Nonlinear Models in Mathematical Finance: New Research Trends in Option Pricing, M. Ehrhardt ed., Nova Science Publishers, Inc., New York, pp. 219–242, 2008.
[7] N. Ishimura, M. N. Koleva and L. G. Vulkov, Numerical solution via transformation methods of nonlinear models in option pricing, in: AIP Conf. Proc., Vol. 1301, pp. 387–394, 2010.
[8] M. N. Koleva and L. G. Vulkov, Quasilinearization numerical scheme for fully nonlinear parabolic problems with applications in models of mathematical finance, preprint.
[9] M. N. Koleva and L. G. Vulkov, Fast two-grid algorithms for solutions of the difference equations of nonlinear Black-Scholes equations, preprint.
[10] F. Black and M. Scholes, The pricing of options and corporate liabilities, J. Polit. Econ., 81 (1973), 637–654.
[11] R. C. Merton, Theory of rational option pricing, Bell J. Econ. Manag. Sci., 4 (1973), 141–183.
[12] L. Songzhe, Existence of solutions to initial value problem for a parabolic Monge-Ampère equation and application, Nonlinear Anal., 65 (2006), 59–78.
[13] Z. Macová and D. Ševčovič, Weakly nonlinear analysis of the Hamilton-Jacobi-Bellman equation arising from pension saving management, Int. J. Numer. Anal. Model., 7 (2010), 619–638.
[14] A. Mas-Colell, M. D. Whinston and J. R. Green, Microeconomic Theory, Oxford Univ. Press, Oxford, 1995.

JSIAM Letters Vol. 3 (2011) pp.29–32    © 2011 Japan Society for Industrial and Applied Mathematics

Approximation algorithms for a winner determination problem of single-item multi-unit auctions

Satoshi Takahashi1 and Maiko Shigeno1

1 Graduate School of System and Information Systems, University of Tsukuba, Tsukuba, Ibaraki 305-8573, Japan
E-mail: stakahashi@sk.tsukuba.ac.jp
Received September 30, 2010; Accepted January 13, 2011

Abstract
This paper treats a winner determination problem of a Vickrey-Clarke-Groves mechanism based single-item multi-unit auction. For this problem, two simple 2-approximation algorithms are proposed. One is a linear time algorithm using a linear knapsack problem. The other is a greedy type algorithm. In addition, a fully polynomial time approximation algorithm based on dynamic programming is described. Computational experiments verify the availability of our algorithms by comparing computational times and approximation ratios.

Keywords: auction theory, winner determination, approximation algorithm

Research Activity Group: Discrete Systems

1. Introduction
Recent Internet auctions with huge numbers of participants require computing an optimal allocation and payments as quickly as possible. A winner determination problem in auction theory consists of an item allocation problem and a payment determination problem, which depends on the auction mechanism. One of the most desirable auction mechanisms is due to Vickrey, Clarke and Groves and is called VCG [1]. Throughout this paper, we consider only VCG based auctions. Winner determination problems of VCG based auctions are known to be NP-hard. Therefore, it is important to consider fast approximation algorithms for the winner determination problem in the Internet auction environment.

We treat single-item multi-unit auctions, in which a seller wants to sell M units of a single item and n bidders participate. Each bidder i submits a set of anchor values {d_i^k | k = 0, ..., ℓ_i} and a set of unit values {e_i^k | k = 1, ..., ℓ_i}, where the anchor values satisfy d_i^{k−1} < d_i^k for any 0 < k ≤ ℓ_i and e_i^k is the unit value over the half-open range (d_i^{k−1}, d_i^k] of item quantity. Without loss of generality, we assume that d_i^0 = 0 and d_i^{ℓ_i} ≤ M for every bidder i. Let N = {1, ..., n} be the set of bidders and ℓ = Σ_{i∈N} ℓ_i. We define a value function v_i : R_+ → R of bidder i by

  v_i(x) = e_i^k · x  (d_i^{k−1} < x ≤ d_i^k, k = 1, ..., ℓ_i),
  v_i(x) = 0          (x = d_i^0 or x > d_i^{ℓ_i}).

Our item allocation problem (AP) is to find the quantity x_i that each bidder i gets so that the total valuation is maximized. It is formulated as

  (AP)  maximize Σ_{i∈N} v_i(x_i)  subject to Σ_{i∈N} x_i ≤ M,  x_i ≥ 0 (∀i ∈ N).

Let x* denote an optimal solution for (AP). We say that a solution for (AP) satisfies the "anchor property" if there are at least n − 1 bidders whose quantities are given by their anchor values.

Lemma 1 ([2])  The problem (AP) has an optimal solution satisfying the anchor property when every bidder's unit values are monotone non-increasing in k.

To compute the payment of bidder j, we need to solve a restricted problem excluding j from the bidders. Let N^{−j} = N \ {j} and x^{−j} be an optimal solution of the item allocation problem under the set N^{−j}, that is,

  maximize Σ_{i∈N^{−j}} v_i(x_i)  subject to Σ_{i∈N^{−j}} x_i ≤ M,  x_i ≥ 0 (∀i ∈ N^{−j}).

In a VCG based auction, the payment p_j of bidder j is defined by

  p_j = Σ_{i∈N^{−j}} v_i(x_i^{−j}) − Σ_{i∈N^{−j}} v_i(x_i*).   (1)

We now briefly review approximation algorithms for the winner determination problem of single-item multi-unit auctions. With respect to constant-factor approximations for (AP), Kothari, Parkes and Suri [2] proposed a 2-approximation algorithm with O(ℓ^2) time for the so-called generalized knapsack problem, which models an item allocation problem in a reverse auction. When their greedy algorithm is applied to (AP) directly, it returns a solution whose approximation ratio may not be bounded by two. Zhou [3] stated that he improved this algorithm to run in O(ℓ log ℓ) time. Moreover, he showed a 3-approximation algorithm with O(ℓ) time and a (9/4)-approximation algorithm with O(ℓ log ℓ) time for the so-called interval multiple-choice knapsack problem, a special case of which is (AP). According to [3], it is an open problem to compute a 2-approximation of (AP) in linear time.
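The definitions of v_i, (AP) and the VCG payment (1) can be exercised on a toy instance (a brute-force illustration only, exponential in n; the function names and the instance are ours):

```python
from itertools import product

def make_v(d, e):
    # value function of one bidder: v(x) = e^k * x on the range (d^{k-1}, d^k].
    def v(x):
        for k in range(1, len(d)):
            if d[k - 1] < x <= d[k]:
                return e[k - 1] * x
        return 0
    return v

def solve_ap(vs, M):
    # brute force over integer allocations -- exponential, illustration only.
    best, best_x = -1, None
    for x in product(range(M + 1), repeat=len(vs)):
        if sum(x) <= M:
            val = sum(v(xi) for v, xi in zip(vs, x))
            if val > best:
                best, best_x = val, x
    return best, best_x

def vcg_payment(vs, M, j):
    # eq. (1): value the others obtain without j, minus their value with j.
    star = solve_ap(vs, M)[1]
    others = [v for i, v in enumerate(vs) if i != j]
    best_without_j = solve_ap(others, M)[0]
    with_j = sum(v(xi) for i, (v, xi) in enumerate(zip(vs, star)) if i != j)
    return best_without_j - with_j

# toy instance: M = 3 units, v1(x) = 3x on (0, 2], v2(x) = 2x on (0, 3]
v1, v2 = make_v([0, 2], [3]), make_v([0, 3], [2])
```

Here the optimal allocation is x* = (2, 1) with total value 8, and bidder 1 pays 6 − 2 = 4: without bidder 1, bidder 2 alone would obtain value 6, while under x* bidder 2 obtains only 2.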

With respect to fully polynomial time approximation schemes (FPTAS) for the winner determination problem, Kothari, Parkes and Suri [2] proposed the first one, which is based on dynamic programming and uses the anchor property. It finds a solution with an approximation ratio of at most (1 + ε) for (AP) in O(nℓ^2/ε) time and calculates every bidder's payment in O((nℓ^2/ε) log(n/ε)) time. In order to solve (AP), their algorithm repeatedly fixes a specified bidder j and an index 0 < k ≤ ℓ_j, and solves the problem (AP) with the added constraint d_j^{k−1} < x_j ≤ d_j^k. This FPTAS was improved by Zhou [3]. His algorithm does not repeatedly compute (AP) with an additional constraint. Thus, Zhou's FPTAS solves a winner determination problem in O((nℓ/ε) log(n/ε)) time. Moreover, by employing a vector merge technique, he states that his algorithm can run in O((nℓ/ε) log n) time. However, a solution found by his algorithm may not satisfy the anchor property.

In the next section, we propose two 2-approximation algorithms for solving (AP). One is an O(ℓ) time algorithm using a linear knapsack problem. The other is a greedy type algorithm.

2. Approximation algorithms
2.1 2-approximation algorithms
Consider the linear knapsack relaxation (LKP) of (AP), with a variable y_i^k for each anchor value:

  (LKP)  maximize Σ_{i∈N} Σ_{k=0}^{ℓ_i} (e_i^k d_i^k) y_i^k
         subject to Σ_{i∈N} Σ_{k=0}^{ℓ_i} d_i^k y_i^k ≤ M,  Σ_{k=0}^{ℓ_i} y_i^k = 1 (∀i ∈ N),  y_i^k ≥ 0.

Lemma 2  The optimal value of (LKP) is not less than the optimal value of (AP).

Proof  For a feasible solution x of (AP), define a solution y of (LKP) as follows: if x_i > 0, let k_i be the index with d_i^{k_i−1} < x_i ≤ d_i^{k_i} and set

  y_i^k = x_i/d_i^{k_i}  (x_i > 0, k = k_i),  1 − x_i/d_i^{k_i}  (k = 0),   (4)

and, otherwise, y_i^k = 1 (k = 0), 0 (k ≠ 0). The objective values of these solutions x and y satisfy

  Σ_{i∈N} Σ_{k=0}^{ℓ_i} (e_i^k d_i^k) y_i^k = Σ_{{i∈N | 0<x_i}} e_i^{k_i} x_i ≥ Σ_{i∈N} v_i(x_i).

Hence, the optimal value of (LKP) is not less than the optimal value of (AP). (QED)

With respect to a feasible solution y of (LKP), we call an index i saturated if y_i^k = 1 holds for some k. It is known that there exists an optimal solution for (LKP) with at most one unsaturated index. Let y* be such an optimal solution and i* be the unsaturated index. From y*, we construct two solutions of (AP) by setting

  x̂_i = Σ_{k=0}^{ℓ_i} d_i^k y_i^{*k}  (i ≠ i*),  0  (i = i*),   (2)

and

  x̃_i = 0  (i ≠ i*),  d_i^{k̃_i}  (i = i*),   (3)

where k̃_i is an index attaining max_{0<k≤ℓ_i} e_i^k d_i^k. Then we have

  Σ_{i∈N} v_i(x̂_i) + Σ_{i∈N} v_i(x̃_i) ≥ Σ_{i∈N} Σ_{k=0}^{ℓ_i} (e_i^k d_i^k) y_i^{*k} ≥ Σ_{i∈N} v_i(x_i*),

so the better of x̂ and x̃ is a 2-approximation solution for (AP). Since (LKP) can be solved by a linear time algorithm for the linear knapsack problem, we obtain the following.

Theorem 3  The above algorithm, denoted by AA1, finds a 2-approximation solution of (AP) in O(ℓ) time.

Our second algorithm is of greedy type and proceeds as follows.

Algorithm AA2
Step 1  Set x_i = 0 for every i ∈ N.
Step 2  Find a pair (i*, k*) such that p_{i*}^{k*}(x_{i*}) = max{ p_i^k(x_i) | i ∈ N, x_i < d_i^k }. If p_{i*}^{k*}(x_{i*}) ≤ 0, then return x; otherwise, update x_{i*} = d_{i*}^{k*}.
Step 3  If Σ_{i∈N} x_i < M, go to Step 2.
Step 4  Set x̂_i = x_i (i ≠ i*) and x̂_{i*} = 0, and set x̃_i = 0 (i ≠ i*) and x̃_{i*} = x_{i*}. If Σ_{i∈N} v_i(x̂_i) > Σ_{i∈N} v_i(x̃_i), then return x̂; otherwise, return x̃.
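Steps 1-3 of AA2 can be sketched as follows. As an assumption (the definition of p_i^k is not reproduced in this excerpt), we take p_i^k(x_i) to be the marginal value density (v_i(d_i^k) − v_i(x_i))/(d_i^k − x_i); the Step 4 repair is omitted, so the last update may overshoot M:

```python
def greedy_aa2_core(d, e, M):
    # Steps 1-3 of AA2 with an assumed gain measure p_i^k(x_i) =
    # (v_i(d_i^k) - v_i(x_i)) / (d_i^k - x_i); the paper's own definition
    # of p_i^k and the Step 4 repair are not reproduced here.
    def v(i, x):
        for k in range(1, len(d[i])):
            if d[i][k - 1] < x <= d[i][k]:
                return e[i][k - 1] * x
        return 0
    x = [0] * len(d)
    while sum(x) < M:
        best, pick = 0, None
        for i in range(len(d)):
            for k in range(1, len(d[i])):
                if x[i] < d[i][k]:
                    p = (v(i, d[i][k]) - v(i, x[i])) / (d[i][k] - x[i])
                    if p > best:
                        best, pick = p, (i, k)
        if pick is None:          # no positive gain: Step 2's exit
            break
        i, k = pick
        x[i] = d[i][k]            # raise bidder i to the anchor d_i^k
    return x                      # may overshoot M; Step 4 would repair this
```

On the toy instance with M = 3, d = [[0, 2], [0, 3]] and e = [[3], [2]], the loop raises bidder 1 to 2 and then bidder 2 to 3, after which Step 4 would choose between dropping the last bidder raised and keeping only that bidder.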

– 30 – JSIAM Letters Vol. 3 (2011) pp.29–32 Satoshi Takahashi et al. { ∗ 0 (i ≠ i ), for (APg) by computing H[t, r] recursively, together with andx ˜i = ∗ xi∗ (i = i ). G[t, r], for r = 0,..., ⌊(2n)/ϵ⌋ and t = 0, . . . , n. It is ∑ ∑ If v (ˆx ) > v (˜x ), then return xˆ, oth- obvious that we can initialize G[0, 0] = H[0, 0] = 0 and i∈N i i i∈N i i ∞ erwise, return x˜. G[0, r] = H[0, r] = for any r > 0. For convenience, we set G[t, r] = H[t, r] = ∞ for any t ∈ N and r < 0. Theorem 4 Algorithm AA2 finds a 2-approximation By recursively, we can represent solution of (AP ) in O(ℓ(log n + ℓmax)) time, where ℓmax { − − − k k } ∈ G[t, r] = min G[t 1, r], min(G[t 1, r v˜t(dt )] + dt ) . = maxi N ℓi. k Proof When AA2 stops at Step 2, we can show that Defining m[t, r] by the returned solution is an optimal for (AP ). When AA2 ′ ′ min {min{xt | V˜G[t −1, r ]+˜vt(xt) ≥ r}+ G[t −1, r ]}, stops at∑ Step 4, for a solution x at the end of AA2, let 0≤r′≤r M ′ = x . It can be shown that the solution x is ∑ i∈N i ˜ − ′ t−1 optimal for where VG[t 1, r ] is the value i=1 v˜i(xi) for a solu- ∑ tion x that establishes G[t − 1, r′], we have the following maximize ∑i∈N vi(xi) recurrence for H: ≤ ′ ≥ ∀ ∈ subject to i∈N xi M , xi 0 ( i N). H[t, r] = min{H[t−1, r], Since x⋆ is also feasible for the above problem, we have ∑ ∑ − − k k } ⋆ ≤ mink(H[t 1, r v˜t(dt )]+dt ), m[t, r] . i∈N vi(xi ) i∈N vi∑(xi). It comes from∑ the defini- tions of xˆ and x˜ that ∈ vi(xi) ≤ ∈ vi(˜xi) + ′ ∑ i N i N In m[t, r], we can rewrite min{xt |V˜G[t − 1, r ] +v ˜t(xt) ≥ i∈N vi(ˆxi) holds. Thus, we obtain the desired approx- r} by imation ratio. { | ⌊ k ⌋ ≥ − ˜ − ′ k−1 ≤ k} It is clear that the number of iteration of AA2 is at min xt net xt/ϵV r VG[t 1, r ], dt < xt dt . { k | k} ∈ (5) most ℓ. If we store max pi (xi) xi < di for all i N in a heap, Step 2 can be performed in O(log n). 
After ′ Since r−V˜G[t−1, r ] is an integer, the smallest xt satisfies { k ∗ | ∗ k } Step 2, we need to compute max pi∗ (xi ) xi < di∗ − ˜ − ′ k · the first condition in (5) is given by (r VG[t 1, r ])/et for updated x ∗ , which runs in O(ℓ ∗ ). Hence, the total i i (ϵV/n). Thus (5) is equivalent to running time is bounded by O(ℓ(log n + ℓmax)). { − ˜ − ′ k · (QED) mink (r VG[t 1, r ])/et (ϵV/n) | k−1 − ˜ − ′ k · ≤ k} 2.2 FPTAS for winner determination problems dt < (r VG[t 1, r ])/et (ϵV/n) dt . We show a modified version of Zhou’s algorithm [3] By using this formula, the values m[t, r] for all r = 1, such that it finds a solution satisfying the anchor prop- ..., ⌊(2n)/ϵ⌋ can be found simultaneously in O((nℓi/ϵ) k−1 ≥ k erty, when every bidder’s unit values satisfy ei ei log(n/ϵ)) time. After obtaining the values of m[t, r] for for all 1 ≤ k ≤ ℓi. Let ϵ > 0 be a relative error and V be all r = 1,..., ⌊(2n)/ϵ⌋, we can compute each G[t, r] and an objective value obtained by a 2-approximation algo- H[t, r] in O(ℓi) time. Therefore we obtain entire elements rithm. We define a scaled value functionv ˜i : R+ → R of of G and H in O((nℓ/ϵ) log(n/ϵ)) time. bidder i by Finally, we compute each bidder’s payment defined by (1) by employing the method of Kothari, Parkes and Suri v˜i(x) = ⌊(n · vi(x))/ϵ · V ⌋. [2]. In their method, all payments can be computed in We denote an item allocation problem over this scaled the same time complexity to obtain G and H. value function by (APg). For an optimal solution x of Theorem 5 Our algorithm finds a solution with a rel- (APg), we have ∑ ∑ ative error at most ϵ of (AP ) in O((nℓ/ϵ) log(n/ϵ)) time. ⋆ ⋆ It also finds every payment in the same time complexity. ∈ vi(xi ) < ∈ (ϵV/n)(˜vi(xi ) + 1) i N ∑i N If a vector merge technique by [3] is applied, our ≤ ∈ (ϵV/n)˜vi(xi) + ϵV ∑i N ∑ algorithm solves a winner determination problem in ≤ ⋆ i∈N vi(xi) + ϵ i∈N vi(xi ). O((nℓ/ϵ) log n) time. 
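The scaling step of the FPTAS, ṽ_i(x) = ⌊n·v_i(x)/(ϵV)⌋, can be sketched as a small Python helper (the function name `scaled_value` and the example numbers are our own; `v` stands for any bidder's value function v_i):

```python
import math

def scaled_value(v, n, eps, V):
    """Return the scaled value function  v~(x) = floor(n * v(x) / (eps * V)),
    where V is the objective value of a 2-approximation algorithm and eps
    the target relative error of the FPTAS."""
    def v_tilde(x):
        return math.floor(n * v(x) / (eps * V))
    return v_tilde
```

For instance, with v(x) = 3x, n = 4, ϵ = 0.5 and V = 24, the scaled value of x = 2 is ⌊4·6/12⌋ = 2; since ṽ only takes the integer values 0, . . . , ⌊2n/ϵ⌋, the dynamic program over r becomes finite.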
Thus, an optimal solution for (ÃP) is a solution with a relative error at most ϵ for (AP). In order to solve (ÃP) by dynamic programming, for two parameters t and r, the value min{ ∑_{i=1}^t x_i | ∑_{i=1}^t ṽ_i(x_i) ≥ r } is stored in G[t, r] and H[t, r], where in G[t, r] each x_i is restricted to an anchor value, and in H[t, r] each x_i except that of only one bidder is restricted to an anchor value. An optimal solution of (ÃP) is obtained from a solution x^{[n,r*]} establishing H[n, r*], where r* attains max_r { ∑_{i∈N} ṽ_i(x_i^{[n,r]}) | H[n, r] ≤ M }. In order to obtain r*, it is enough to search H[n, r] for r from 0 to ⌊(2n)/ϵ⌋, since the optimal value of (ÃP) is bounded by ⌊(2n)/ϵ⌋. Thus, our algorithm finds an optimal solution

3. Experimental results
This section shows computational results of the algorithms described in Section 2. All computations were conducted on a personal computer with a Core2 Duo CPU (3.06 GHz) and 4 GB of memory. Our code was written in Python 2.6.5. For given numbers of bidders n and of units M, all instances used in this experiment were generated using random numbers. The number of anchor values ℓ_i for each bidder i was selected uniformly from the integers in [1, 15]. Every unit value e_i^k and every anchor value d_i^k were selected uniformly from the integers in [1, 100] and in [1, M], respectively. Table 1 shows averages of computational times and


Table 1. Averages of computational times and of approximation ratios of 2-approximation algorithms for ten instances of each (n, M).

  (n, M)         comp. times (sec.)     app. ratios
                 AA1       AA2          AA1     AA2
  (10, 50)       0.00388   0.00101      1.299   1.251
  (10, 100)      0.00314   0.00102      1.458   1.296
  (10, 200)      0.00460   0.00119      1.873   1.093
  (50, 50)       0.01237   0.00392      1.398   1.485
  (50, 100)      0.01430   0.00390      1.356   1.371
  (50, 200)      0.01171   0.00409      1.736   1.244
  (100, 50)      0.02468   0.00755      1.831   1.206
  (100, 100)     0.02401   0.00737      1.566   1.618
  (100, 200)     0.02331   0.00768      1.505   1.195
  (200, 200)     0.04721   0.01440      1.673   1.429
  (400, 200)     0.09666   0.02869      1.492   1.399
  (800, 200)     0.19884   0.05754      1.559   1.316
  (1000, 200)    0.25427   0.06933      1.749   1.639
  (5000, 200)    1.84318   0.34113      1.550   1.328
  (10000, 200)   5.08805   0.72498      1.548   1.609

Table 2. Averages of computational times (sec.) and relative errors of our FPTAS for ten instances with n = 10 and M = 50.

  epsilon   comp. times (sec.)      relative errors
            AA1        AA2          AA1     AA2
  1.0        0.27925    0.66331     0.057   0.038
  0.9        0.32014    0.78129     0.038   0.037
  0.8        0.40650    0.98921     0.041   0.047
  0.7        0.49900    1.20912     0.042   0.034
  0.6        0.67769    1.66427     0.026   0.021
  0.5        0.92052    2.26631     0.037   0.021
  0.4        1.42961    3.50712     0.024   0.001
  0.3        2.54312    6.25542     0.014   0.001
  0.2        5.59242   13.77336     0.012   0.004
  0.1       22.06652   54.52771     0.007   0.003

the latter frequently returned solutions with better relative errors than the former, which derived from the fact that Algorithm AA2 tended to return a greater value V than Algorithm AA1. Although the theoretical complexity does not depend on V, there is a difference in the average computational times between the cases using Algorithm AA1 and Algorithm AA2. This difference comes from the computational time of each m[t, r]. When ϵ is less than 0.4, our FPTAS using Algorithm AA2 returned almost optimal solutions. However, the returned allocations were different from the optimal ones.

approximation ratios of the two 2-approximation algorithms, Algorithm AA1 and Algorithm AA2, for ten instances of each size. Algorithm AA1 was implemented so that its time complexity was O(ℓ log ℓ), since we employed a sorting algorithm instead of the linear-time median finding in Dyer's algorithm for (LKP). Consistently with the theoretical complexities, the resulting computational times depend on n but not on M. On the other hand, Algorithm AA2 is faster than Algorithm AA1 in average time, because our instances seem not to produce worst cases. The approximation ratios of both algorithms seem not to be affected by the sizes of n and M. In our results, Algorithm AA2 tended to have a better average approximation ratio than Algorithm AA1. This tendency was influenced by a few solutions with bad approximation ratios. At the end, both Algorithm AA1 and Algorithm AA2 choose a solution x̂ or x̃. Among the 150 instances of this experiment, Algorithm AA1 returned the solution x̃ in 89 instances and Algorithm AA2 returned x̃ in 23 instances. Indeed, a solution x̃, especially one returned by Algorithm AA2, did not have a good approximation ratio, because it allocated almost all units to only one bidder. This fact seems to affect the evaluations of the approximation ratios.

The second experiment evaluated the behavior of our FPTAS described in Subsection 2.2 for solving (AP). We investigated the influence of a given relative error ϵ on the computational times and on the obtained relative errors. In addition, we compared the performance of our FPTAS when the value V is given by Algorithm AA1 and by Algorithm AA2, respectively. Table 2 shows averages of computational times and of relative errors for ten instances with n = 10 and M = 50 fixed. In our results, the case using Algorithm AA1 spent less time on computing than the case using Algorithm AA2,

4. Concluding remarks
For winner determination problems of a VCG-based single-item multi-unit auction, we proposed two 2-approximation algorithms for item allocation problems. One runs in linear time, which gives a positive answer to the open problem in [3]. The other does not run in linear time, but it computes quickly in some experiments. We also discussed an FPTAS, which returns an approximate solution satisfying the anchor property. When some bidders know all bids and can compute optimal allocations and payments, they may not approve an approximate solution whose allocations and payments are entirely different from the optimal ones. To make approximate solutions acceptable in real auctions, we need some rules about allocations. For instance, Fukuta and Ito [5] discussed a rule that bidder j is allocated no more than an allocation of bidder i if v_i(d) > v_j(d) for some anchor value d. It is our future work to develop an approximation algorithm for finding an allocation satisfying this rule.

References
[1] P. Milgrom, Putting Auction Theory to Work, Cambridge Univ. Press, 2004.
[2] A. Kothari, D. C. Parkes and S. Suri, Approximately-strategyproof and tractable multi-unit auctions, Decision Support Systems, 39 (2005), 105–121.
[3] Y. Zhou, Improved multi-unit auction clearing algorithms with interval (multiple-choice) knapsack problems, in: Proc. of 17th Int. Sympo. on Algorithms and Computation, pp. 494–506, 2006.
[4] M. E. Dyer, An O(n) algorithm for the multiple-choice knapsack linear program, Math. Program., 29 (1984), 57–63.
[5] N. Fukuta and T. Ito, An analysis about approximated allocation algorithms of combinatorial auctions with large numbers of bids (in Japanese), IEICE Trans. D, J90-D (2007), 2324–2335.

– 32 – JSIAM Letters Vol.3 (2011) pp.33–36 © 2011 Japan Society for Industrial and Applied Mathematics

On the new family of wavelets interpolating to the Shannon wavelet

Naohiro Fukuda1 and Tamotu Kinoshita1

1 Institute of Mathematics, University of Tsukuba, 1-1-1 Tennodai, Tsukuba-shi, Ibaraki 305-8571, Japan
E-mail: naohiro-f@math.tsukuba.ac.jp
Received November 17, 2010, Accepted January 6, 2011
Abstract
There are various types of orthogonal wavelet families with order parameters. In this paper we introduce a new family of wavelets which converges in L^q to the Shannon wavelet as the order parameter n increases. In particular, we shall give a symmetric orthogonal scaling function whose time-bandwidth product is near 1/2.
Keywords Shannon wavelet, Haar wavelet, Battle-Lemarié wavelet
Research Activity Group Wavelet Analysis

It is well known that the limit of the B-spline family is the Gaussian function, which achieves the smallest time-bandwidth product permitted by the uncertainty principle, i.e., 1/2 (see [1, 2]). The B-spline and the Gaussian function are not orthogonal to their translates. On the other hand, there are various types of orthogonal wavelet families with order parameters, e.g., the Battle-Lemarié wavelet, the Daubechies wavelet and the Strömberg wavelet (see [3–7]). In particular, [8] showed that the Battle-Lemarié wavelet of order n converges to the Shannon wavelet as n tends to infinity. Let us denote the low pass filter, the scaling function and the wavelet of the Battle-Lemarié family by m_n^{BL}(ξ), φ_n^{BL}(x) and ψ_n^{BL}(x), respectively. This family of the Battle-Lemarié wavelet interpolates from the (non-smooth) Haar wavelet, which has the best localization in time, to the (smooth) Shannon wavelet, which has the best localization in frequency. For some applications, the order parameter n enables us to control the smoothness and the proportion between the time window and the frequency window. On the other hand, the Daubechies wavelet of order n does not converge to the Shannon wavelet as n tends to infinity. As for the Daubechies filters, the asymptotic behavior is studied in [9].

t^n(1 − t)^n dt / ∫_0^1 t^n(1 − t)^n dt. Then, by neglecting e^{−iξ/2} and replacing ξ/2 by π ν_n[3|ξ|/(2π) − 1]/2 in the argument of the cosine of m_1^H(ξ), one will get

  m_n^M(ξ) = cos[ (π/2) ν_n( 3|ξ|/(2π) − 1 ) ].

The Meyer wavelet family is constructed from m_n^M(ξ) (n ≥ 1), and its Fourier transform belongs to C^n due to the irregularity of (1) at the points ξ = 0, 1. This causes the polynomial decay of the Meyer wavelet. In our paper, by contrast, we shall replace ξ/2 by π sin²(ξ/2)/2 in the argument of the cosine of m_1^H(ξ) and define

  m_2^H(ξ) = cos( (π/2) sin²(ξ/2) ).

Here we remark that m_2^H(ξ) is 2π-periodic and satisfies m_2^H(0) = 1 and |m_2^H(ξ)|² + |m_2^H(ξ + π)|² = 1, since

  m_2^H(ξ + π) = cos( (π/2) cos²(ξ/2) ) = cos( π/2 − (π/2) sin²(ξ/2) ) = sin( (π/2) sin²(ξ/2) ).

Firstly, we shall introduce another family of wavelets interpolating from the Haar wavelet to the Shannon wavelet. To construct a new wavelet family, let us consider Θ_n(ξ) given recursively by

In this paper, the low pass filter of the Haar wavelet m_1^{BL}(ξ) is denoted also by m_1^H(ξ) and given by

  m_1^H(ξ) ≡ m_1^{BL}(ξ) = e^{−iξ/2} cos(ξ/2).

m_1^H(ξ) is 2π-periodic. We immediately see that m_1^H(0) = 1 and |m_1^H(ξ)|² + |m_1^H(ξ + π)|² = 1. Now, let us put

  ν_n(ξ) = 0 for ξ < 0,  p_n(ξ) for 0 ≤ ξ ≤ 1,  1 for ξ > 1,  (1)

where p_n(ξ) is the (2n+1)-th order polynomial satisfying p_n(ξ) + p_n(1 − ξ) ≡ 1 and p_n(0) = 0, i.e., p_n(ξ) = ∫_0^ξ t^n(1 − t)^n dt / ∫_0^1 t^n(1 − t)^n dt.

  Θ_1(ξ) = ξ/2  and  Θ_n(ξ) = (π/2) sin² Θ_{n−1}(ξ)  for n ≥ 2.

Then we also define the 2π-periodic function

  m_n^H(ξ) = cos Θ_n(ξ)  for n ≥ 2.  (2)

m_n^H(ξ) satisfies m_n^H(0) = 1. Noting that m_n^H(ξ + π) = sin Θ_n(ξ) still holds, we can obtain |m_n^H(ξ)|² + |m_n^H(ξ + π)|² = 1. Therefore, since m_n^H is differentiable, we find that ∏_{j=1}^∞ m_n^H(2^{−j}ξ) converges uniformly on bounded sets of R (see [10]). Thus we can define φ_n^H(x) and ψ_n^H(x) by


  φ̂_n^H(ξ) = ∏_{j=1}^∞ m_n^H(2^{−j}ξ)

and

  ψ̂_n^H(ξ) = e^{iξ/2} m_n^H(ξ/2 + π) φ̂_n^H(ξ/2).

Let m_∞^SH(ξ) be the low pass filter of the Shannon wavelet, i.e., the 2π-periodic function defined by

  m_∞^SH(ξ) = 1 for |ξ| ≤ π/2,  0 for π/2 < |ξ| ≤ π.

We get the following properties for (2):

Proposition 1 m_n^H(ξ) satisfies, for all n ≥ 1,

  m_n^H(ξ) ≠ 0 for |ξ| ≤ π/2,  (3)

and

  lim_{n→∞} m_n^H(ξ) = m_∞^SH(ξ) for ξ ∈ R \ (πZ + π/2).  (4)

Proof We shall show the pointwise convergence in (4). For ξ = 0, π we easily see that m_n^H(0) = m_∞^SH(0) = 1 and m_n^H(π) = m_∞^SH(π) = 0. Since m_n^H is even and 2π-periodic, it is enough to consider the two cases 0 < ξ < π/2 and π/2 < ξ < π.

and

  0 < Θ_2(ξ) = (π/2) sin² Θ_1(ξ) < Θ_1(ξ) < π/4.

Recursively, we have

  0 < Θ_n(ξ) = (π/2) sin² Θ_{n−1}(ξ) < Θ_{n−1}(ξ) < π/4.

Let us fix 0 < ξ < π/2. We remark that

  0 < Θ_n < Θ_{n−1} < ··· < Θ_1 < π/4,  (5)

since f(Θ_{n−1}) = Θ_n/Θ_{n−1} < 1. In particular, there exists a constant a > 0 such that f(Θ_1) = Θ_2/Θ_1 < a < 1. Therefore,

  Θ_n/Θ_1 = f(Θ_{n−1}) f(Θ_{n−2}) ··· f(Θ_1) < f(Θ_1)^{n−1} < a^{n−1}.

Hence, we get

  0 < Θ_n(ξ) < Θ_1 a^{n−1} < (π/4) a^{n−1}  for 0 < ξ < π/2.

Thus it follows that lim_{n→∞} Θ_n(ξ) = 0. Consequently, we have

  lim_{n→∞} |m_n^H(ξ) − m_∞^SH(ξ)| = lim_{n→∞} (1 − cos Θ_n(ξ)) = 0.

In the case π/2 < ξ < π, noting that

  m_n^H(ξ + π) = sin Θ_n(ξ)  and  0 < −ξ + π < π/2,

we obtain lim_{n→∞} Θ_n(−ξ + π) = 0 and also

  lim_{n→∞} |m_n^H(ξ) − m_∞^SH(ξ)| = lim_{n→∞} |m_n^H(ξ)| = lim_{n→∞} sin Θ_n(−ξ + π) = 0.

Since Θ_n(π/2) = π/4 and (5) holds, we can also obtain (3). (QED)

Fig. 1. Graphs of (a) m_2^H, (b) m_3^H, (c) m_5^H and (d) m_∞^SH.

From (3) it follows that m_n^H(ξ) is the low pass filter of an MRA (see [10]). This means that ψ_n^H(x) is defined by its Fourier transform

  ψ̂_n^H(ξ) = e^{iξ/2} m_n^H(ξ/2 + π) φ̂_n^H(ξ/2).

From (4) the scaling function φ_n^H and the wavelet ψ_n^H also converge to the Shannon scaling function φ_∞^SH and wavelet ψ_∞^SH as the order parameter n increases. More precisely, we can prove the following theorem:

Theorem 2 For 2 ≤ q ≤ ∞, we have

  lim_{n→∞} ||φ_n^H − φ_∞^SH||_{L^q} = 0  and  lim_{n→∞} ||ψ_n^H − ψ_∞^SH||_{L^q} = 0.

Proof It is sufficient to give a proof only for the scaling functions. At first, we shall prove, for 1 ≤ p < ∞,

  lim_{n→∞} ||φ̂_n^H − φ̂_∞^SH||_{L^p(R)} = 0.  (6)

Define the function

  f(θ) = π sin²θ/(2θ)  for 0 ≤ θ ≤ π/4.

Noting that f(π/4) = 1 and

  f′(θ) = π sin θ (2θ cos θ − sin θ)/(2θ²) > 0  for 0 ≤ θ ≤ π/4,

we find that f(θ) is strictly increasing on [0, π/4] and f(θ) < 1. In the case 0 < ξ < π/2, we have 0 < Θ_1(ξ) < π/4.

For a fixed ξ ∈ R, there exists J > 0 such that

  | φ̂_n^H(ξ) − ∏_{j=1}^J m_n^H(2^{−j}ξ) | < ε/3  and  | φ̂_∞^SH(ξ) − ∏_{j=1}^J m_∞^SH(2^{−j}ξ) | < ε/3.

Meanwhile, for a sufficiently large N = N(J) > 0, Proposition 1 gives that, for ξ ∈ R \ ∪_{j=1}^J 2^j(πZ + π/2),

  | ∏_{j=1}^J m_N^H(2^{−j}ξ) − ∏_{j=1}^J m_∞^SH(2^{−j}ξ) | < ε/3.

Thus it follows that, for ξ ∈ R \ ∪_{j=1}^J 2^j(πZ + π/2),

  | φ̂_N^H(ξ) − φ̂_∞^SH(ξ) |
  ≤ | φ̂_N^H(ξ) − ∏_{j=1}^J m_N^H(2^{−j}ξ) | + | ∏_{j=1}^J m_N^H(2^{−j}ξ) − ∏_{j=1}^J m_∞^SH(2^{−j}ξ) | + | ∏_{j=1}^J m_∞^SH(2^{−j}ξ) − φ̂_∞^SH(ξ) |
  < ε.

This implies that for almost all ξ ∈ R

  lim_{n→∞} | φ̂_n^H(ξ) − φ̂_∞^SH(ξ) | = 0.

From the results of the later Theorem 3 we will find that φ_n^H(x) is smooth and |φ̂_n^H(ξ)|^p is dominated by some integrable function for a sufficiently large n = n(p) > 0. Therefore, the dominated convergence theorem proves (6).

Let us consider (6) especially for 1 < p ≤ 2. Taking 2 ≤ q < ∞ such that 1/p + 1/q = 1, by the Hausdorff-Young inequality we get

  lim_{n→∞} ||φ_n^H − φ_∞^SH||_{L^q(R)} = 0.  (QED)

In [8] one can see the corresponding results for the Battle-Lemarié scaling function and wavelet.

Throughout this paper, we denote the sinc function by sinc(x) = sin x/x. φ_∞^SH and ψ_∞^SH have polynomial decays and belong to C^∞(R_x), since φ_∞^SH(x) = sinc(πx) and ψ_∞^SH(x) = 2 sinc(2πx) − sinc(πx). Especially in the case n = 2, φ̂_n^H is rewritten as

  φ̂_2^H(ξ) = ∏_{j=1}^∞ cos[ (π/2) sin²(ξ/2^{j+1}) ]
       = ∏_{j=1}^∞ sin[ (π/2) cos²(ξ/2^{j+1}) ]
       = ∏_{j=1}^∞ cos²(ξ/2^{j+1}) · (π/2) sinc[ (π/2) cos²(ξ/2^{j+1}) ]
       = sinc²(ξ/2) ∏_{j=1}^∞ L(ξ/2^j),

where L(ξ) = (π/2) sinc[ (π/2) cos²(ξ/2) ]. We remark that L(ξ + iη) is an entire function satisfying L(0) = 1. From the continuity, there exists C > 0 such that

  |L(ξ + iη)| ≤ 1 + C|ξ + iη|.  (7)

Since |L(ξ)| ≤ π/2 for ξ ∈ R, for all ε > 0 there exists δ_ξ > 0 such that for all η with 0 < |η| < δ_ξ we have

  |L(ξ + iη)| ≤ (π/2)^{1+ε}.  (8)

We note that L(ξ) is 2π-periodic and sup_{ξ∈R} δ_ξ = max_{0≤ξ≤2π} δ_ξ > 0. So, we can take δ = max_{0≤ξ≤2π} δ_ξ > 0, which is independent of ξ. For arbitrarily fixed J ≥ 0, taking η such that max_{1≤j≤J} |η/2^j| < δ, i.e., |η| < 2δ, by (7) and (8) we get, for 2^J ≤ |ξ + iη| ≤ 2^{J+1},

  | φ̂_2^H(ξ + iη) | = | sinc²((ξ+iη)/2) | ∏_{j=1}^J | L((ξ+iη)/2^j) | ∏_{j=J+1}^∞ | L((ξ+iη)/2^j) |
  ≤ | sin((ξ+iη)/2) / ((ξ+iη)/2) |² (π/2)^{J(1+ε)} ∏_{j=J+1}^∞ ( 1 + C|ξ + iη|/2^j )
  ≤ ( C e^{|η|} / (|ξ + iη|² + 1) ) 2^{J(1+ε) log₂(π/2)} exp( C ∑_{j=J+1}^∞ |ξ + iη|/2^j )
  ≤ C_η ( |ξ| + |η| + 1 )^{q_ε},

where q_ε = (1 + ε) log₂(π/2) − 2. For |ξ + iη| ≤ 1 we easily see that |φ̂_2^H(ξ + iη)| ≤ C. Thus, it follows that for ξ ∈ R and |η| < 2δ

  | φ̂_2^H(ξ + iη) | ≤ M_η ( |ξ| + |η| + 1 )^{q_ε}.  (9)

The exponent q_ε becomes negative for a sufficiently small ε > 0. Thus, from the Paley-Wiener theorem we conclude the following two facts (see [11]):
• φ_2^H has exponential decay, since φ̂_2^H is analytic with a positive radius of convergence.
• φ_2^H belongs to C^{α_2}(R_x) for some α_2 > 0 (the estimate (9) with η = 0).
Furthermore, we can find the decays and regularities of φ_n^H and ψ_n^H (n ≥ 2) as follows:


Theorem 3 Let n ≥ 2. The scaling function φ_n^H and the wavelet ψ_n^H have exponential decays and belong to C^{α_n}(R_x) for some α_n > 0 increasing in the parameter n.

In fact, we can derive a more refined estimate than (9) in [12], and a better α_n > 0 is given in Table 1.

Table 1. Regularities of φ_n^H and ψ_n^H.
  n     2      3      4      5      6
  α_n   0.386  1.133  2.616  5.580  11.508

Then, we also give the time-bandwidth product Δf Δf̂ of the scaling function and wavelet in Table 2, where

  Δf := { ∫_{−∞}^{∞} (x − x₀)² |f(x)|² dx / ∫_{−∞}^{∞} |f(x)|² dx }^{1/2}

with

  x₀ := ∫_{−∞}^{∞} x |f(x)|² dx / ∫_{−∞}^{∞} |f(x)|² dx.

From the uncertainty principle the lower bound is 1/2.

Table 2. Time-bandwidth products of the scaling function and wavelet.
  n                2      3      4      5      6
  Δφ_n^H Δφ̂_n^H    0.926  0.669  0.772  0.947  1.177
  Δψ_n^H Δψ̂_n^H    2.603  2.136  2.500  3.069  5.393

It seems that Δφ_3^H Δφ̂_3^H compares favorably with some other famous scaling functions in Table 3.

Table 3. Time-bandwidth products of the scaling functions of Battle-Lemarié, Meyer and Daubechies.
  n                  1      2      3      4      5
  Δφ_n^BL Δφ̂_n^BL    ∞      0.686  0.741  0.837  0.928
  Δφ_n^M Δφ̂_n^M      0.810  0.875  0.949  1.012  1.065
  Δφ_n^D Δφ̂_n^D      ∞      1.057  0.828  0.849  0.984

Fig. 2. Graphs of (a) φ_2^H, (b) φ_3^H, (c) φ_5^H and (d) φ_∞^SH.

In conclusion, we observe that φ(x) defined by

  φ̂(ξ) = ∏_{j=1}^∞ cos{ (π/2) sin²[ (π/2) sin²(ξ/2^{j+1}) ] }

is differentiable in x and also satisfies Δφ Δφ̂ = 0.669, which is near 1/2. Some generalizations of the above results will appear in the forthcoming paper [12].

Acknowledgments
The authors would like to thank the referee for valuable suggestions.

References
[1] I. J. Schoenberg, Cardinal interpolation and spline functions, J. Approx. Theory, 2 (1969), 167–206.
[2] M. Unser, A. Aldroubi and M. Eden, On the asymptotic convergence of B-spline wavelets to Gabor functions, IEEE Trans. Inform. Theory, 38 (1992), 864–872.
[3] G. Battle, A block spin construction of ondelettes. I. Lemarié functions, Comm. Math. Phys., 110 (1987), 601–615.
[4] I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Regional Conference Series in Applied Mathematics, 61, SIAM, Philadelphia, PA, 1992.
[5] N. Fukuda and T. Kinoshita, On non-symmetric orthogonal spline wavelets, to appear in Southeast Asian Bull. Math.
[6] P. G. Lemarié, Ondelettes à localisation exponentielle, J. Math. Pures Appl., 67 (1988), 227–236.
[7] J. O. Strömberg, A modified Franklin system and higher-order spline systems on R^n as unconditional bases for Hardy spaces, in: Proc. Conf. on Harmonic Analysis in honor of Antoni Zygmund, pp. 475–493, 1981.
[8] K. H. Oh, K. R. Young and K. J. Seung, On asymptotic behavior of Battle-Lemarié scaling functions and wavelets, Appl. Math. Lett., 20 (2007), 376–381.
[9] D. Kateb and P. G. Lemarié-Rieusset, Asymptotic behavior of the Daubechies filters, Appl. Comput. Harmon. Anal., 2 (1995), 398–399.
[10] E. Hernández and G. Weiss, A First Course on Wavelets, CRC Press, Boca Raton, FL, 1996.
[11] S. G. Krantz and H. R. Parks, A Primer of Real Analytic Functions, 2nd ed., Birkhäuser, Boston-Basel-Berlin, 2002.
[12] N. Fukuda and T. Kinoshita, On the construction of new families of wavelets, preprint.
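The time-bandwidth product Δf Δf̂ defined above is easy to evaluate numerically. As a sanity check of the definition (not a computation from the paper), the Gaussian attains the uncertainty bound 1/2; here we use the known Fourier transform of e^{−x²} instead of an FFT, and the grid sizes are arbitrary choices of ours:

```python
import numpy as np

def spread(x, f2):
    """Delta_f of the paper: sqrt of the variance of the density |f|^2
    about its centre x_0, computed by a Riemann sum on a uniform grid
    (the grid spacing cancels in the normalized weights)."""
    w = f2 / f2.sum()
    x0 = (x * w).sum()
    return np.sqrt((((x - x0) ** 2) * w).sum())

x = np.linspace(-20.0, 20.0, 40001)
f2 = np.exp(-2.0 * x ** 2)        # |f|^2 for f(x) = exp(-x^2)
fhat2 = np.exp(-0.5 * x ** 2)     # |f^|^2, using the known Gaussian transform
product = spread(x, f2) * spread(x, fhat2)
assert abs(product - 0.5) < 1e-3  # the uncertainty lower bound 1/2 is attained
```

The same `spread` routine applied to sampled φ_n^H and its Fourier transform would reproduce the entries of Table 2, up to discretization error.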

– 36 – JSIAM Letters Vol.3 (2011) pp.37–40 © 2011 Japan Society for Industrial and Applied Mathematics

Conservative finite difference schemes for the modified Camassa-Holm equation

Yuto Miyatake1, Takayasu Matsuo1 and Daisuke Furihata2

1 Graduate School of Information Science and Technology, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033, Japan
2 Cybermedia Center, Osaka University, Machikaneyama 1-32, Toyonaka, Osaka 560-0043, Japan
E-mail: yuto miyatake mist.i.u-tokyo.ac.jp
Received November 18, 2010, Accepted April 5, 2011
Abstract
We consider the numerical integration of the modified Camassa-Holm equation, which has been recently proposed by McLachlan and Zhang (2009) as a generalization of the prominent Camassa-Holm equation. We present nonlinear and linear finite difference schemes for the modified equation that preserve two invariants at the same time. We also show some numerical examples of the presented schemes, where it is found that certain solutions of the mCH can behave like solitons.
Keywords modified Camassa-Holm equation, conservation law, discrete variational derivative method
Research Activity Group Scientific Computation and Numerical Analysis

1. Introduction
In this paper, we consider the numerical integration of the "modified Camassa-Holm (mCH) equation" of the form

  m_t + u m_x + 2 u_x m = 0,  m = (1 − ∂_x²)^p u,  (1)

where p is a positive integer and the subscript t (or x, respectively) denotes the differentiation with respect to the time variable t (or x). This equation was derived by McLachlan and Zhang [1] as the Euler-Poincaré differential equation on the Bott-Virasoro group with respect to the H^p metric.

When p = 1, (1) reduces to the well-known Camassa-Holm (CH) equation

  u_t − u_{xxt} = u u_{xxx} + 2 u_x u_{xx} − 3 u u_x,  (2)

which describes shallow water waves. The CH has a bi-Hamiltonian structure, is completely integrable, and has infinitely many conservation laws. Furthermore, it has the interesting feature that it allows strange singular solutions called "peakons" (peaked solitons); this is in sharp contrast to the classical smooth soliton equations such as the Korteweg-de Vries equation. In order to reveal the rich dynamics of the CH, many numerical studies have been carried out, including the following geometric integrators: invariant-preserving integrators [2–6] and multisymplectic integrators [7].

In contrast to this, the case p ≥ 2 has not yet been fully understood, both theoretically and numerically. Here we briefly review some known results on the case p = 2. Global existence of smooth solutions on the unit circle S and the real line R was discussed in [1, 8]. The following two invariants have been found associated with the mCH (with p = 2):

  d/dt ∫ u dx = 0,  d/dt ∫ (u² + 2u_x² + u_{xx}²)/2 dx = 0.  (3)

It is still an open question whether or not there are other invariants, in particular whether the mCH is completely integrable. The dynamics of the mCH, for example the possibility of soliton-like solutions, are not yet understood, except that in [1] an interesting phenomenon called "weak blow-up" was numerically suggested (but not mathematically confirmed). To the best of the authors' knowledge, so far no study has been carried out that mainly focuses on the numerical treatment of the mCH.

Taking these backgrounds into account, the aim of the present paper is to show the following two points. First, we show that finite difference schemes preserving the invariants (3) simultaneously can be constructed. Next, using these geometric integrators, we numerically show that certain solutions can behave like solitons. This paper is intended to be a prompt report on these results, and the full detail will be presented in our future work [9].

This paper is organized as follows. In Section 2 the proposed conservative schemes are presented, and their properties are discussed. In Section 3 we show some numerical examples on the soliton-like solutions. Concluding remarks are given in Section 4.

We use the following notation. Noting the fact that physically u is the main variable, which means the "wave velocity," we choose u as our main variable in the numerical computation. The discrete version is denoted by U_k^{(n)} ≃ u(kΔx, nΔt), where Δx = L/N (N is the number of the spatial grid points) and Δt is the time mesh size.


We use the abbreviation U_k^{(n+1/2)} = (U_k^{(n+1)} + U_k^{(n)})/2. We also write the solution as a vector: U^{(n)} = (U_0^{(n)}, U_1^{(n)}, . . . , U_{N−1}^{(n)})^⊤. In the presentation of the schemes, we also use the discrete version of m, which is denoted by M_k^{(n)}. Throughout this paper, we limit ourselves to the unit circle case, i.e., we assume the periodic boundary condition, in accordance with the numerical simulations. Naturally we assume the discrete periodic boundary condition: U_k^{(n)} = U_{k mod N}^{(n)} (∀k ∈ Z). We often use the standard central difference operators δ_k^{⟨1⟩}, δ_k^{⟨2⟩} that approximate ∂_x, ∂_x², and the forward and backward difference operators δ_k^+, δ_k^−.

2. Conservative schemes
In this section, we present the finite difference schemes that preserve some discrete counterparts of the two invariants in (3) at the same time. The schemes can be easily found by extending the conservative schemes for the CH [5, 6] (see also [2]); below we briefly show the outline.

A key observation is that the mCH (1) can be formally written in the following Hamiltonian form:

  m_t = −(m ∂_x + ∂_x m) δH/δm,  H = (u² + 2u_x² + u_{xx}²)/2,  (4)

where δH/δm is the variational derivative of H with respect to m. Although H is defined with u, it is an easy exercise to find δH/δm = u with m = (1 − ∂_x²)² u in mind. From this, the conservation of H is obvious, if we note that the operator (m ∂_x + ∂_x m) is skew-symmetric.

Remark 1 By the "operator" (m ∂_x + ∂_x m), we promise that it applies to a function f in such a way that (m ∂_x + ∂_x m) f = m ∂_x f + ∂_x(m f), which is a standard convention in this research area. The same convention applies to the discrete versions.

Interestingly enough, the Hamiltonian form is formally the same as that of the CH (p = 1); in fact, the CH (2) can be rewritten as

  m_t = −(m ∂_x + ∂_x m) δH̃/δm,  H̃ = (u² + u_x²)/2,  (5)

with m = (1 − ∂_x²) u and δH̃/δm = u. In [5, 6], (a variant of) "the discrete variational derivative method" [10] was applied to the Hamiltonian form (5) to find schemes preserving H̃. In view of this, one would naturally think that a similar approach can be taken also for (4), where only the concrete form of H and the relation between u and m are different; the answer is yes.

Due to the restriction of space, we omit the detail of the derivation, and only show the resulting schemes and the related discrete invariants.

Scheme 2 (A nonlinear scheme) We define the initial approximate solution by U_k^{(0)} = u(0, kΔx) (k = 0, . . . , N−1). Then, for n = 0, 1, . . . ,

  (M_k^{(n+1)} − M_k^{(n)})/Δt = −( M_k^{(n+1/2)} δ_k^{⟨1⟩} + δ_k^{⟨1⟩} M_k^{(n+1/2)} ) U_k^{(n+1/2)}  (k = 0, . . . , N−1),  (6)

where M_k^{(n)} is associated with U_k^{(n)} via the relation M_k^{(n)} = (1 − δ_k^{⟨2⟩})² U_k^{(n)}, and M_k^{(n+1/2)} = (M_k^{(n+1)} + M_k^{(n)})/2.

Obviously (6) corresponds to (4). The scheme, as expressed in (6), formally coincides with the nonlinear scheme for the CH in [5, 6]. However, the relation between U_k^{(n)} and M_k^{(n)} is different (which means the overall scheme is different; note that, as mentioned earlier, the computation is carried out solely in the u (or U_k^{(n)}) space, by eliminating M_k^{(n)}), and this makes the associated discrete Hamiltonian also different. It can be shown that the following quantity

  H_k^{(n)} = [ (U_k^{(n)})² + (δ_k^+ U_k^{(n)})² + (δ_k^− U_k^{(n)})² + (δ_k^{⟨2⟩} U_k^{(n)})² ] / 2  (7)

serves as the discrete Hamiltonian for the scheme.

Theorem 3 (Conservation laws) Under the discrete periodic boundary condition, the numerical solution by Scheme 2 conserves the two invariants:

  ∑_{k=0}^{N−1} U_k^{(n)} Δx = ∑_{k=0}^{N−1} U_k^{(0)} Δx  (n = 1, 2, . . .),
  ∑_{k=0}^{N−1} H_k^{(n)} Δx = ∑_{k=0}^{N−1} H_k^{(0)} Δx  (n = 1, 2, . . .).

Proof We only show the outline of the proof. We first prove the first conservation law. Note that it is sufficient to prove

  ∑_{k=0}^{N−1} M_k^{(n)} Δx = ∑_{k=0}^{N−1} M_k^{(0)} Δx,  (8)

since under the discrete periodic boundary condition it obviously holds that

  ∑_{k=0}^{N−1} δ_k^{⟨2⟩} U_k^{(n)} Δx = ∑_{k=0}^{N−1} (δ_k^{⟨2⟩})² U_k^{(n)} Δx = 0.

(This can be confirmed by the summation-by-parts formulas found in, for example, [10].) Now we prove (8):

  (1/Δt) ∑_{k=0}^{N−1} ( M_k^{(n+1)} − M_k^{(n)} ) Δx
  = − ∑_{k=0}^{N−1} ( M_k^{(n+1/2)} δ_k^{⟨1⟩} + δ_k^{⟨1⟩} M_k^{(n+1/2)} ) U_k^{(n+1/2)} Δx
  = − ∑_{k=0}^{N−1} { M_k^{(n+1/2)} · δ_k^{⟨1⟩} U_k^{(n+1/2)} + δ_k^{⟨1⟩}( M_k^{(n+1/2)} U_k^{(n+1/2)} ) } Δx
  = − ∑_{k=0}^{N−1} (1 − δ_k^{⟨2⟩})² U_k^{(n+1/2)} · δ_k^{⟨1⟩} U_k^{(n+1/2)} Δx
  = ∑_{k=0}^{N−1} δ_k^{⟨1⟩} (1 − δ_k^{⟨2⟩})² U_k^{(n+1/2)} · U_k^{(n+1/2)} Δx

– 38 – JSIAM Letters Vol. 3 (2011) pp.37–40 Yuto Miyatake et al.

    = 0.

Here we frequently used the discrete periodic boundary condition with various summation-by-parts formulas [10]. The last equality follows from the skew-symmetry of δ_k^⟨1⟩ (1 − δ_k^⟨2⟩)².

Next we prove the second conservation law:

  (1/∆t) Σ_{k=0}^{N−1} ( H_k^{(n+1)} − H_k^{(n)} ) ∆x
    = Σ_{k=0}^{N−1} [ (U_k^{(n+1)} + U_k^{(n)})/2 ] · [ (M_k^{(n+1)} − M_k^{(n)})/∆t ] ∆x
    = − Σ_{k=0}^{N−1} { U_k^{(n+1/2)} ( M_k^{(n+1/2)} δ_k^⟨1⟩ + δ_k^⟨1⟩ M_k^{(n+1/2)} ) U_k^{(n+1/2)} } ∆x
    = 0.

The first equality can be confirmed by (7) and summation-by-parts formulas (this requires some calculation, but we omit the detail). The third is from the skew-symmetry of −( M_k^{(n+1/2)} δ_k^⟨1⟩ + δ_k^⟨1⟩ M_k^{(n+1/2)} ). (QED)

Since Scheme 2 is nonlinear, it requires an expensive nonlinear solver in each time step. As a remedy for this, we can construct a linear scheme, which is based on the linearly implicit schemes in [5, 6, 11]. We only show the results.

Scheme 4 (A linearly implicit scheme)  We define the initial approximate solution by U_k^{(0)} = u(0, k∆x) (k = 0, ..., N−1). Then for n = 1, 2, ...,

  (M_k^{(n+1)} − M_k^{(n−1)})/(2∆t) = −( M_k^{(n)} δ_k^⟨1⟩ + δ_k^⟨1⟩ M_k^{(n)} ) U_k^{(n)},

where M_k^{(n)} is associated with U_k^{(n)} via the relation M_k^{(n)} = (1 − δ_k^⟨2⟩)² U_k^{(n)}.

Note that since Scheme 4 is a multistep scheme, we need not only the initial value U^{(0)} but also the starting value U^{(1)}. If we adopt Scheme 2 for computing U^{(1)}, we get the following conservation laws.

Theorem 5 (Conservation laws)  Under the discrete periodic boundary condition, the numerical solution by Scheme 4 conserves the two invariants:

  Σ_{k=0}^{N−1} U_k^{(n)} ∆x = Σ_{k=0}^{N−1} U_k^{(0)} ∆x  (n = 1, 2, ...),
  Σ_{k=0}^{N−1} H_k^{(n+1/2)} ∆x = Σ_{k=0}^{N−1} H_k^{(1/2)} ∆x  (n = 1, 2, ...),

where H_k^{(n+1/2)} = { U_k^{(n)} U_k^{(n+1)} + (δ_k^+ U_k^{(n)})(δ_k^+ U_k^{(n+1)}) + (δ_k^− U_k^{(n)})(δ_k^− U_k^{(n+1)}) + (δ_k^⟨2⟩ U_k^{(n)})(δ_k^⟨2⟩ U_k^{(n+1)}) } / 2.

The proof is similar to the nonlinear case (omitted). Note that, in contrast to the nonlinear scheme case, the discrete Hamiltonian is now defined in a multistep way. In general, multistep schemes can be unstable; we observe the stability numerically in the next section.

3. Numerical examples with a soliton-like solution

In this section we show some numerical examples with the presented schemes, and point out that certain solutions of the mCH can behave like solitons. All the computations were done in the following environment: CPU Xeon (3.00 GHz), 16 GB memory, Linux OS. We used MATLAB (R2007b), where nonlinear equations were solved by "fsolve" with tolerances TolFun = 10^{−10} and TolX = 10^{−10}.

First we confirm the discrete conservation laws in the proposed schemes. The parameters were set to t ∈ [0, 50], x ∈ [−15, 15], ∆x = 0.1, ∆t = 0.05, and the initial value was set to u(0, x) = sech²(0.3x). Fig. 1 shows the errors in the discrete invariants in Scheme 2 (the nonlinear scheme). It confirms that the scheme conserves both discrete invariants within the accuracy of the nonlinear solver (recall that the tolerance was set to 10^{−10}). Fig. 2 shows the errors in Scheme 4 (the linear scheme), which again well confirms the discrete conservation laws.

Fig. 1. Error in the discrete invariants: Scheme 2.
Fig. 2. Error in the discrete invariants: Scheme 4.
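The two schemes above are straightforward to transcribe. The following NumPy sketch builds the periodic difference operators as circulant matrices and advances one step of each scheme; the fixed-point iteration merely stands in for the fsolve-based nonlinear solver used in the paper, and the helper names and loop structure are illustrative choices, not the authors' code.

```python
import numpy as np

def periodic_operators(N, dx):
    """Circulant matrices for the periodic central differences and
    the discrete operator A, so that M = A @ U realizes (1 - d<2>)^2."""
    I = np.eye(N)
    fwd = np.roll(I, 1, axis=1)    # (fwd @ U)_k = U_{k+1 mod N}
    bwd = np.roll(I, -1, axis=1)   # (bwd @ U)_k = U_{k-1 mod N}
    D1 = (fwd - bwd) / (2.0 * dx)          # central first difference
    D2 = (fwd - 2.0 * I + bwd) / dx**2     # central second difference
    A = (I - D2) @ (I - D2)
    return D1, A

def scheme2_step(U, dt, dx, tol=1e-13, maxit=500):
    """One step of the nonlinear Scheme 2, eq. (6), solved here by
    simple fixed-point iteration (the paper uses MATLAB's fsolve)."""
    D1, A = periodic_operators(U.size, dx)
    M = A @ U
    Unew = U.copy()
    for _ in range(maxit):
        Umid = 0.5 * (Unew + U)
        Mmid = A @ Umid
        # M^{(n+1)} = M^{(n)} - dt * (M_mid d<1> U_mid + d<1>(M_mid U_mid))
        rhs = M - dt * (Mmid * (D1 @ Umid) + D1 @ (Mmid * Umid))
        Unext = np.linalg.solve(A, rhs)
        if np.max(np.abs(Unext - Unew)) < tol:
            break
        Unew = Unext
    return Unext

def scheme4_step(Uprev, Ucur, dt, dx):
    """One step of the linearly implicit Scheme 4 (leapfrog in time);
    only a single linear solve is needed per step."""
    D1, A = periodic_operators(Ucur.size, dx)
    Mcur = A @ Ucur
    rhs = A @ Uprev - 2.0 * dt * (Mcur * (D1 @ Ucur) + D1 @ (Mcur * Ucur))
    return np.linalg.solve(A, rhs)
```

Because D1 and A are commuting circulant matrices, the quadratic form behind the first conservation law vanishes identically, so the discrete mass Σ_k U_k ∆x should be preserved to roundoff (and, for Scheme 2, to the iteration tolerance), mirroring Theorems 3 and 5.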

Next, we seek soliton-like solutions using the conservative schemes. In the CH (p = 1), the singular soliton solutions, the "peakons," can be obtained by formally setting m = cδ(x) (the Dirac delta function), where c is a generic constant. In view of the strong similarity between the CH and the mCH, a natural expectation would be that also in the mCH the delta function behaves as a soliton. By formally integrating the delta function, we obtain

  u(x, t) = (c/4)(1 + |x − ct|) e^{−|x−ct|}.  (9)

(The argument here is on the whole real line R for the sake of mathematical brevity. In the following numerical experiments, the solution is truncated so that it fits in the periodic interval.)

We tested this solution using the conservative schemes with the parameters t ∈ [0, 100], x ∈ [−15, 15], ∆x = 0.05, ∆t = 0.025, and the initial value u(0, x) = (1 + |x|)e^{−|x|}. We found that both schemes stably captured the same behavior (note that this numerically supports the stability of the schemes, in particular, of Scheme 4). Fig. 3 shows the result of Scheme 4. We can see that the solution actually behaves like a soliton.

Next, in order to see if the solutions actually interact like solitons, we considered the initial value u(0, x) = (1 + |x + 5|)e^{−|x+5|} + (1/2)(1 + |x − 5|)e^{−|x−5|}. The parameters were set as follows: t ∈ [0, 100], x ∈ [−20, 20], ∆x = 0.1, ∆t = 0.1. The result is shown in Fig. 4, which seems to support our view that the solution behaves like two solitons.

Fig. 3. The evolution of the "one soliton" solution.
Fig. 4. The evolution of the "two soliton" solutions.

4. Concluding remarks

We presented two finite difference schemes for the mCH equation that preserve the two associated invariants. We considered a soliton-like solution for the mCH, and confirmed the soliton-like behavior numerically. As far as the authors understand, this is a new observation.

The discussion on the schemes (the scheme derivation and the establishment of the discrete invariants) can be easily carried over to the general case p ≥ 3. Moreover, it is possible to discuss some theoretical aspects of the schemes, for example, the (unique) existence of the numerical solutions. These discussions are left to [9]. We also plan to include more numerical results there.

Finally, as noted in the introduction, the study of the mCH has just started, and many open problems still remain. Does the mCH admit other invariants? Or more aggressively, is the mCH completely integrable? As for the dynamical aspects, although in the present study we could find a soliton-like solution in analogy with the standard CH, it is not clear at all whether or not the entire dynamics can be understood in a similar way. The answer should be negative, at least partly, since it has been shown in [1] that the blow-up in the sense of "wave-breaking" should not occur in the mCH (p ≥ 2). Thus much more effort should be devoted to this topic, and there we believe that the presented conservative schemes serve as effective numerical tools.

References

[1] R. McLachlan and X. Zhang, Well-posedness of modified Camassa-Holm equations, J. Differential Equations, 246 (2009), 3241–3259.
[2] D. Cohen and X. Raynaud, Geometric finite difference schemes for the generalized hyperelastic-rod wave equation, J. Comput. Appl. Math., 235 (2011), 1925–1940.
[3] T. Matsuo, A Hamiltonian-conserving Galerkin scheme for the Camassa-Holm equation, J. Comput. Appl. Math., 234 (2010), 1258–1266.
[4] T. Matsuo and H. Yamaguchi, An energy-conserving Galerkin scheme for a class of nonlinear dispersive equations, J. Comput. Phys., 228 (2009), 4346–4358.
[5] K. Takeya, Conservative finite difference schemes for the Camassa-Holm equation (in Japanese), Master's Thesis, Osaka Univ., 2007.
[6] K. Takeya and D. Furihata, Conservative finite difference schemes for the Camassa-Holm equation, in preparation.
[7] D. Cohen, B. Owren and X. Raynaud, Multi-symplectic integration of the Camassa-Holm equation, J. Comput. Phys., 227 (2008), 5492–5512.
[8] P. Zhang, Global existence of solutions to the modified Camassa-Holm shallow water equation, Int. J. Nonlinear Sci., 9 (2010), 123–128.
[9] Y. Miyatake, T. Matsuo and D. Furihata, Invariants-preserving integration of the modified Camassa-Holm equation, submitted.
[10] D. Furihata, Finite difference schemes for ∂u/∂t = (∂/∂x)^α δG/δu that inherit energy conservation or dissipation property, J. Comput. Phys., 156 (1999), 181–205.
[11] T. Matsuo and D. Furihata, Dissipative or conservative finite-difference schemes for complex-valued nonlinear partial differential equations, J. Comput. Phys., 171 (2001), 425–447.
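For reference, the closed form (9) and the two-hump initial data above are easy to evaluate directly; in this NumPy sketch the function name and the grids are illustrative choices.

```python
import numpy as np

def mcch_soliton(x, t, c):
    """The soliton-like solution (9): u = (c/4)(1 + |x - c t|) e^{-|x - c t|}.
    The profile (1 + |x|)e^{-|x|}/4 is the Green's function of the
    operator (1 - d_x^2)^2 on the real line, so m = c*delta(x - c t)
    formally."""
    r = np.abs(x - c * t)
    return 0.25 * c * (1.0 + r) * np.exp(-r)

# The "two soliton" initial value used in the experiments
# (note it carries no 1/4 factor):
x2 = np.linspace(-20.0, 20.0, 401)
u_two = (1 + np.abs(x2 + 5)) * np.exp(-np.abs(x2 + 5)) \
        + 0.5 * (1 + np.abs(x2 - 5)) * np.exp(-np.abs(x2 - 5))
```

The profile has height c/4 and travels with speed c, so height and speed are coupled just as for the CH peakons.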

JSIAM Letters Vol. 3 (2011) pp.41–44 © 2011 Japan Society for Industrial and Applied Mathematics

A multi-symplectic integration of the Ostrovsky equation

Yuto Miyatake1, Takaharu Yaguchi2 and Takayasu Matsuo1

1 Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033, Japan
2 Department of Computational Science, Graduate School of System Informatics, Kobe University, Rokkodai-cho 1-1, Nada-ku, Kobe, 657-8501, Japan
E-mail yuto miyatake mist.i.u-tokyo.ac.jp
Received March 15, 2011, Accepted April 6, 2011
Abstract
We consider structure-preserving integration of the Ostrovsky equation, which models, for example, gravity waves under the influence of the Coriolis force. We find a multi-symplectic formulation, and derive a finite difference discretization based on this formulation by means of the Preissmann box scheme. We also present a numerical example, which shows the effectiveness of this scheme.
Keywords Ostrovsky equation, multi-symplecticity, Preissmann box scheme
Research Activity Group Scientific Computation and Numerical Analysis

1. Introduction

In this paper, we consider structure-preserving integration of the Ostrovsky equation [1] under the periodic boundary condition of length L:

  u_t + αuu_x − βu_xxx = γ∂_x^{−1}u,  u(t, x) = u(t, x + L),

where α, β, γ are real parameters and the subscript t (or x, respectively) denotes differentiation with respect to the time variable t (or x). The operator ∂_x^{−1} is defined by

  ∂_x^{−1}u = ∫_0^x u(t, s) ds − (1/L) ∫_0^L ∫_0^x u(t, s) ds dx  (1)

for any zero-mean and L-periodic function u [4]. This equation is often called the rotation-modified Korteweg-de Vries equation. It has many physical meanings: for example, it models gravity waves under the influence of the Coriolis force, surface and internal waves in the ocean, and capillary waves on the surface of a liquid.

The Ostrovsky equation has three first integrals [2]:

  ∫_0^L u dx = const. = 0,  (2)
  ∫_0^L [ (α/6)u³ + (β/2)u_x² + (γ/2)(∂_x^{−1}u)² ] dx = const.,  (3)
  ∫_0^L (u²/2) dx = const.  (4)

The invariant (2), which we call the total mass, is the condition for the existence of the potential ϕ = ∂_x^{−1}u. The invariants (3) and (4) correspond to the energy and the L² norm conservation laws, respectively. From the perspective of structure-preserving integration, Yaguchi et al. have proposed four conservative numerical schemes [3]: a finite difference scheme and a pseudospectral scheme that conserve the energy (3), and the same types of schemes that conserve the norm (4). For other existing schemes, see also [4–7]. In this paper, we devote our effort to multi-symplectic integration, which is a branch of structure-preserving integration. We show that this equation has a multi-symplectic formulation, and provide a multi-symplectic scheme based on this formulation by applying the Preissmann box scheme. The formulation is motivated by the multi-symplectic formulation of the KdV equation by Ascher-McLachlan [8]. Although the multi-symplectic scheme preserves neither the energy nor the norm exactly, our numerical results show that the deviations are very small compared to the existing schemes by Yaguchi et al. [3].

This paper is organized as follows. In Section 2 the schemes by Yaguchi-Matsuo-Sugihara are summarized for the readers' convenience. In Section 3 a multi-symplectic formulation and a multi-symplectic scheme based on it are proposed. In Section 4 some numerical results are provided. Concluding remarks and comments are given in Section 5.

We use the following notation. Numerical solutions are denoted by U_k^n ≃ u(n∆t, k∆x) or Φ_k^n ≃ ϕ(n∆t, k∆x), where ∆x = L/N (N is the number of spatial nodes) and ∆t is the time mesh size. We use the abbreviations U_k^{n+1/2} = (U_k^n + U_k^{n+1})/2, U_{k+1/2}^n = (U_k^n + U_{k+1}^n)/2 and U_{k+1/2}^{n+1/2} = (U_{k+1/2}^n + U_{k+1/2}^{n+1})/2. We also use the following difference operators: the standard forward, backward and central difference operators δ_x^+, δ_x^− and δ_x^⟨1⟩ for ∂_x, δ_x^⟨2⟩ = δ_x^+ δ_x^− for ∂_x², δ_x^⟨3⟩ = δ_x^⟨2⟩ δ_x^⟨1⟩ for ∂_x³, and the forward difference operator δ_t^+ for ∂_t.

2. Approach by Yaguchi-Matsuo-Sugihara

In this section the previous finite difference approach by Yaguchi-Matsuo-Sugihara is summarized. Their energy conservative scheme is based on the following

Hamiltonian structure:

  u_t = −∂_x δG/δu,  G(u) = (α/6)u³ + (β/2)u_x² + (γ/2)(∂_x^{−1}u)²,  (5)

and their norm conservative scheme is based on the following form:

  u_t + (α/3)(u∂_x u + ∂_x u²) − β∂_x³u = γ∂_x^{−1}u.  (6)

The symbol δG/δu is the variational derivative of G with respect to u. They assumed that the initial condition is given so as to satisfy

  Σ_{k=0}^{N−1} U_k^0 ∆x = 0,

which corresponds to (2), and defined the operator δ̃_x^⟨−1⟩, which is the approximation of ∂_x^{−1}, by the summation operator

  δ̃_x^⟨−1⟩ U_k^n = ∆x ( U_0^n/2 + Σ_{j=1}^{k−1} U_j^n + U_k^n/2 ) − ((∆x)²/L) Σ_{j=0}^{N−1} ( U_0^n/2 + Σ_{l=1}^{j−1} U_l^n + U_j^n/2 ).  (7)

This is a natural discretization of (1).

Firstly, the energy conservative scheme is summarized. A discrete version of the energy G = (α/6)u³ + (β/2)u_x² + (γ/2)(∂_x^{−1}u)², and the "discrete variational derivative" that approximates δG/δu = (α/2)u² − βu_xx − γ∂_x^{−2}u, are defined by

  G_k^n = (α/6)(U_k^n)³ + (β/4)[ (δ_x^+ U_k^n)² + (δ_x^− U_k^n)² ] + (γ/2)(δ̃_x^⟨−1⟩ U_k^n)²,

  δG/δ(U^{n+1}, U^n)_k = (α/6)[ (U_k^{n+1})² + U_k^{n+1} U_k^n + (U_k^n)² ] − βδ_x^⟨2⟩ U_k^{n+1/2} − γ(δ̃_x^⟨−1⟩)² U_k^{n+1/2}.

Then the scheme is defined as follows.

Scheme 1 (The Energy Conservative Finite Difference Scheme [3])

  (U_k^{n+1} − U_k^n)/∆t = δ_x^⟨1⟩ [ δG/δ(U^{n+1}, U^n) ]_k.

This scheme corresponds to the Hamiltonian structure (5). Numerical solutions by Scheme 1 conserve both the total mass and the energy.

Next, the norm conservative finite difference scheme is summarized. This scheme is defined as follows.

Scheme 2 (The Norm Conservative Finite Difference Scheme [3])

  (U_k^{n+1} − U_k^n)/∆t + (α/3)[ U_k^{n+1/2} δ_x^⟨1⟩ U_k^{n+1/2} + δ_x^⟨1⟩ (U_k^{n+1/2})² ] − βδ_x^⟨3⟩ U_k^{n+1/2} = γδ̃_x^⟨−1⟩ U_k^{n+1/2}.

This scheme corresponds to (6). Numerical solutions by Scheme 2 conserve both the total mass and the norm.

3. A multi-symplectic integrator

In this section a new multi-symplectic formulation and the associated local conservation laws are shown. A multi-symplectic discretization is also proposed by means of the Preissmann box scheme.

3.1 Multi-symplectic partial differential equations and multi-symplectic integrators

We start by briefly reviewing the concept of multi-symplecticity in a general context. A partial differential equation F(u, u_t, u_x, u_tx, ...) = 0 is said to be multi-symplectic if it can be written as a system of first order equations

  M z_t + K z_x = ∇_z S(z)  (8)

with z ∈ R^d a vector of state variables, typically consisting of the original variable u as one of its components. M and K are constant d × d skew-symmetric matrices, and S: R^d → R is a scalar-valued smooth function depending on z. A key observation for the multi-symplectic formulation (8) is that this system has a multi-symplectic conservation law

  ∂_t ω + ∂_x κ = 0,  (9)

where ω and κ are the differential two-forms ω = dz ∧ M dz, κ = dz ∧ K dz.

Another key property is the following conservation laws. The system (8) has local energy and norm conservation laws:

  ∂_t E(z) + ∂_x F(z) = 0,  ∂_t I(z) + ∂_x G(z) = 0,

where E(z), F(z), I(z) and G(z) are the density functions defined as

  E(z) = S(z) − (1/2) z_x^T K^T z,  F(z) = (1/2) z_t^T K^T z,
  G(z) = S(z) − (1/2) z_t^T M^T z,  I(z) = (1/2) z_x^T M^T z.

Thus, integrating the densities E(z) and I(z) over the spatial domain under the usual assumption of vanishing boundary terms for the functions F(z) and G(z), we obtain the global invariants

  𝓔(z) = ∫ E(z) dx,  𝓘(z) = ∫ I(z) dx.

A scheme is said to be multi-symplectic if it satisfies some discrete version of the multi-symplectic conservation law (9). As multi-symplectic schemes, the Preissmann box scheme and the Euler box scheme are widely known. We adopt the Preissmann box scheme in this report. The Preissmann box scheme, introduced by Preissmann in 1960 and since then most widely used in hydraulics, was proved to be multi-symplectic by Bridges-Reich [9]. It is also called the centered box scheme. It leads to

  M δ_t^+ Z_{k+1/2}^n + K δ_x^+ Z_k^{n+1/2} = ∇_z S( Z_{k+1/2}^{n+1/2} ).  (10)

3.2 A multi-symplectic formulation and an integrator for the Ostrovsky equation

In this subsection, a multi-symplectic formulation for the Ostrovsky equation is presented. Setting z =
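The summation operator (7) can be transcribed almost literally. In the NumPy sketch below, the O(N²) loop keeps the correspondence with (7) visible (a cumulative sum would be faster); on the zero-mean function u = sin x over [0, 2π] the operator (1) gives −cos x, which the discrete version reproduces to second order.

```python
import numpy as np

def inv_dx_tilde(U, dx, L):
    """The summation operator (7): trapezoidal cumulative sums of U,
    recentred so that the result has zero mean."""
    N = U.size
    T = np.empty(N)
    for k in range(N):
        T[k] = dx * (U[0] / 2 + U[1:k].sum() + U[k] / 2)
    return T - (dx / L) * T.sum()

# Sanity check data: for u = sin on [0, 2*pi) the operator (1) gives -cos.
N = 128
L = 2 * np.pi
dx = L / N
xg = dx * np.arange(N)
out = inv_dx_tilde(np.sin(xg), dx, L)
```

The recentring term guarantees that the output has zero mean, which is exactly the role of the double integral in (1).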


(ϕ, u, v, w)^T, we derive a multi-symplectic formulation (8) with the two skew-symmetric matrices

  M = [  0   −1/2  0  0
        1/2    0   0  0
         0     0   0  0
         0     0   0  0 ],

  K = [  0  0   0  1
         0  0  −1  0
         0  1   0  0
        −1  0   0  0 ],  (11)

and with the scalar function S(z) = uw − αu³/6 + v²/2β − γϕ²/2. ∇_z S(z) is given by ∇_z S(z) = (−γϕ, w − αu²/2, v/β, u)^T. This formulation is motivated by the multi-symplectic formulation for the KdV equation [8]; actually, when γ = 0, it reduces to the KdV case. From (11), the density functions E and I are explicitly given by

  E(z) = S(z) − (1/2) z_x^T K^T z
       = −(α/6)u³ + (β/2)u_x² − (γ/2)ϕ² + uw − (1/2)(ϕ_x w + u_x v − v_x u − w_x ϕ),

  I(z) = (1/2) z_x^T M^T z = (1/4)(ϕ_x u − u_x ϕ).

Under the periodic (or vanishing) boundary condition, we obtain the following two global invariants by using the standard integration-by-parts formula:

  𝓔(z) = −∫ ( (α/6)u³ + (β/2)u_x² + (γ/2)ϕ² ) dx,
  𝓘(z) = (1/2) ∫ u² dx.

By substituting Z_k^n = (Φ_k^n, U_k^n, V_k^n, W_k^n)^T into (10), we obtain the following multi-symplectic scheme.

Scheme 3 (A Multi-Symplectic Scheme)

  (U_{k+1/2}^{n+1} − U_{k+1/2}^n)/∆t + αU_{k+1/2}^{n+1/2} (U_{k+1}^{n+1/2} − U_k^{n+1/2})/∆x − β (U_{k+2}^{n+1/2} − 3U_{k+1}^{n+1/2} + 3U_k^{n+1/2} − U_{k−1}^{n+1/2})/(∆x)³ = γΦ_{k+1/2}^{n+1/2},

where U_{k+1/2}^{n+1/2} = (Φ_{k+1}^{n+1/2} − Φ_k^{n+1/2})/∆x.

The detail on how this scheme is in fact "multi-symplectic," i.e., how it realizes a discrete version of (9), is omitted here due to the restriction of space (see our coming complete paper [10] for the detail).

In Scheme 3, we have to give the initial approximate solution for the potential ϕ. This can be generated either by integrating u(0, x) analytically or by summing (U_0, ..., U_{N−1})^T via (7).

4. Numerical examples

In this section we compare the multi-symplectic scheme with the conservative schemes by Yaguchi et al. numerically. The aim of this section is to confirm the effectiveness of the multi-symplectic scheme. The parameters were set to α = 1, β = −0.01, γ = −1. The length of the spatial period was set to L = 2π. The initial condition was set to u(0, x) = sin(x), and accordingly the potential was set to ϕ(0, x) = −cos(x). In this setting, Hunter reported that oscillations were observed [4], and Yaguchi et al. confirmed this [3]. We set the time mesh size to ∆t = 0.1, and used a uniform grid with ∆x = L/N, N = 101. The computation environment is CPU Xeon (3.00 GHz), 16 GB memory, Linux OS. We used MATLAB (R2007b), where nonlinear equations were solved by "fsolve" with tolerances TolFun = 10^{−16} and TolX = 10^{−16}.

Fig. 1. Evolution of the energy for each scheme.
Fig. 2. Evolution of the norm for each scheme.

Figs. 1 and 2 show the evolutions of the energies and the norms. Schemes 1 and 2 (the conservative schemes) each preserve one invariant, but the deviation in the other invariant is large. On the other hand, the deviation of the numerical solutions by Scheme 3 (the multi-symplectic scheme) is very small for both.

Next, let us evaluate each scheme in view of qualitative behaviors. The numerical solutions are shown in Figs. 3–5. Fig. 6 shows the numerical solution by Scheme 1 with a sufficiently small mesh size. If we regard Fig. 6 as the exact solution, Scheme 3 can be said to be the best of the three schemes, because its numerical solution is much smoother and closer to the exact solution than those of Schemes 1 and 2.
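The ingredients of the formulation are easy to check mechanically. The sketch below encodes M, K and S(z) as read off (11), with the parameter values of Section 4, and verifies skew-symmetry together with the stated gradient ∇_z S against central finite differences; it is a consistency check, not part of the authors' implementation.

```python
import numpy as np

# Skew-symmetric matrices (11) for z = (phi, u, v, w) and the scalar
# function S(z); parameter values are those used in Section 4.
alpha, beta, gamma = 1.0, -0.01, -1.0

M = np.array([[0.0, -0.5, 0.0, 0.0],
              [0.5,  0.0, 0.0, 0.0],
              [0.0,  0.0, 0.0, 0.0],
              [0.0,  0.0, 0.0, 0.0]])
K = np.array([[ 0.0, 0.0,  0.0, 1.0],
              [ 0.0, 0.0, -1.0, 0.0],
              [ 0.0, 1.0,  0.0, 0.0],
              [-1.0, 0.0,  0.0, 0.0]])

def S(z):
    """S(z) = u w - alpha u^3/6 + v^2/(2 beta) - gamma phi^2/2."""
    phi, u, v, w = z
    return u * w - alpha * u**3 / 6 + v**2 / (2 * beta) - gamma * phi**2 / 2

def grad_S(z):
    """The stated gradient: (-gamma phi, w - alpha u^2/2, v/beta, u)."""
    phi, u, v, w = z
    return np.array([-gamma * phi, w - alpha * u**2 / 2, v / beta, u])
```

Any candidate multi-symplectic formulation should pass exactly these two checks before a box-scheme discretization (10) is attempted.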


Fig. 3. The numerical solution obtained by Scheme 1 (the energy conservative finite difference scheme by Yaguchi et al.) with N = 101 and ∆t = 0.1.
Fig. 4. The numerical solution obtained by Scheme 2 (the norm conservative finite difference scheme by Yaguchi et al.) with N = 101 and ∆t = 0.1.
Fig. 5. The numerical solution obtained by Scheme 3 (the multi-symplectic scheme) with N = 101 and ∆t = 0.1.
Fig. 6. The numerical solution obtained by Scheme 1 with N = 301 and ∆t = 0.1.

5. Concluding remarks

We proposed a multi-symplectic scheme for the Ostrovsky equation that preserves the multi-symplectic conservation law, and confirmed that this scheme gives better numerical solutions compared with the conservative schemes by Yaguchi et al.

Although we have also considered other structure-preserving schemes, including conservative Galerkin schemes, the full detail is left to [10]. More numerical results will be included there.

References

[1] L. A. Ostrovsky, Nonlinear internal waves in the rotating ocean, Okeanologia, 18 (1978), 181–191.
[2] R. Choudhury, R. I. Ivanov and Y. Liu, Hamiltonian formulation, nonintegrability and local bifurcations for the Ostrovsky equation, Chaos Soliton. Fract., 34 (2007), 544–550.
[3] T. Yaguchi, T. Matsuo and M. Sugihara, Conservative numerical schemes for the Ostrovsky equation, J. Comput. Appl. Math., 234 (2010), 1036–1048.
[4] J. K. Hunter, Numerical solutions of some nonlinear dispersive wave equations, Lectures Appl. Math., 26 (1990), 301–316.
[5] G. Y. Chen and J. P. Boyd, Analytical and numerical studies of weakly nonlocal solitary waves of the rotation-modified Korteweg-de Vries equation, Physica D, 155 (2001), 201–222.
[6] R. Grimshaw, J. M. He and L. A. Ostrovsky, Terminal damping of a solitary wave due to radiation in rotational systems, Stud. Appl. Math., 101 (1998), 197–210.
[7] Y. Liu, D. Pelinovsky and A. Sakovich, Wave breaking in the Ostrovsky-Hunter equation, SIAM J. Math. Anal., 42 (2010), 1967–1985.
[8] U. M. Ascher and R. I. McLachlan, Multisymplectic box schemes and the Korteweg-de Vries equation, Appl. Numer. Math., 48 (2004), 255–269.
[9] T. J. Bridges and S. Reich, Multi-symplectic integrators: numerical schemes for Hamiltonian PDEs that conserve symplecticity, Phys. Lett. A, 284 (2001), 184–193.
[10] Y. Miyatake, T. Yaguchi and T. Matsuo, Numerical integration of the Ostrovsky equation based on its geometric structures, in preparation.

JSIAM Letters Vol. 3 (2011) pp.45–48 © 2011 Japan Society for Industrial and Applied Mathematics

Solutions of Sakaki-Kakei equations of type 1, 2, 7 and 12

Koichi Kondo1

1 Graduate School of Engineering, Doshisha University, Tatara-Miyakodani 1-3, Kyotanabe, Kyoto 610-0394, Japan
E-mail kokondo mail.doshisha.ac.jp
Received May 6, 2011, Accepted June 16, 2011
Abstract
We consider solutions of the Sakaki-Kakei equations of type 1, 2, 7 and 12, which are irreversible two dimensional systems. We first obtain their conserved quantities, and reduce them to one dimensional nonautonomous systems. We next show that the equations of type 2 and 7 are transformed to the arithmetic-harmonic mean equation, and obtain their general solutions. We finally show that the equations of type 1 and 12 are related to the solvable chaotic system proposed by Umeno. We also show that their iteration maps have a self-semiconjugacy, and obtain particular solutions expressed in terms of the lemniscate elliptic function.
Keywords Sakaki-Kakei equation, Umeno equation, lemniscate elliptic function
Research Activity Group Applied Integrable Systems

1. Introduction

In [1], Sakaki and Kakei presented twelve types of irreversible two dimensional dynamical systems. Each type of the equations has an invariant, which is expressed in terms of the hypergeometric function. Here, the twelve types of the Sakaki-Kakei equations are denoted as SK1, SK2, SK3, ..., SK12 in order of appearance in the paper. In [2], Kondo showed that the iteration maps of SK3, SK5 and SK6 are semiconjugate to that of the arithmetic-harmonic mean equation (AHM) [3], and obtained their general solutions. The aim of this paper is to obtain solutions of SK1, SK2, SK7 and SK12.

2. Sakaki-Kakei equations

We consider the Sakaki-Kakei equations [1] of type 1, 2, 7 and 12, which are given by

  SK1: a_{n+1} = a_n − b_n,  b_{n+1} = −4a_nb_n/(a_n − b_n),  (1)
  SK2: a_{n+1} = a_n + b_n,  b_{n+1} = 4a_nb_n/(a_n + b_n),  (2)
  SK7: a_{n+1} = √a_n ( √a_n + √(a_n − b_n) )/2,  b_{n+1} = √a_n ( √a_n − √(a_n − b_n) )/2,  (3)
  SK12: a_{n+1} = (a_n + b_n)²/(a_n − b_n),  b_{n+1} = 16a_nb_n(a_n − b_n)/(a_n + b_n)²  (4)

for n = 0, 1, 2, .... Here, a_n, b_n are real variables.

In (3), we should take account of the square roots of SK7. Assume that a_n > 0, a_n − b_n > 0 for a certain n. It follows from (3) that a_{n+1} > 0 and a_{n+1} − b_{n+1} = √a_n √(a_n − b_n) > 0. By mathematical induction, we thus obtain the following lemma.

Lemma 1  Suppose that a_0 > 0, a_0 − b_0 > 0. Then, the variables a_n, b_n of SK7 satisfy a_n > 0, a_n − b_n > 0 for n = 0, 1, 2, ..., and they are well-defined as real.

3. Conserved quantities

We consider the conserved quantity of SK1. We derive

  a_{n+1}b_{n+1} = ( −4a_nb_n/(a_n − b_n) )(a_n − b_n) = −4a_nb_n  (5)

from (1). It follows from (5) that a_nb_n = −4a_{n−1}b_{n−1} = ··· = (−4)^n a_0b_0. Let c = a_0b_0. Then, we have a_nb_n = (−4)^n c. Thus, c = a_nb_n/(−4)^n is a conserved quantity of SK1, since c is a constant determined by a_0, b_0. Similarly, we can derive conserved quantities of SK2, SK7 and SK12 from (2)–(4) and Lemma 1. We obtain the following theorem.

Theorem 2  Conserved quantities of SK1, SK2, SK7 and SK12 are c = a_nb_n/(−4)^n, c = a_nb_n/4^n, c = 4^n a_nb_n and c = a_nb_n/16^n for n = 0, 1, 2, ..., respectively.

The conserved quantities in Theorem 2 are different from the invariants in [1], which are expressed in terms of the hypergeometric function. It is a future problem to find out the relationship between them.

By Theorem 2, the variables b_n of SK1, SK2, SK7 and SK12 are expressed as b_n = (−4)^n c/a_n, b_n = 4^n c/a_n, b_n = c/(4^n a_n) and b_n = 16^n c/a_n, respectively. Substituting b_n in (1)–(4) by them, we obtain one dimensional equations of a_n. We thus obtain the following theorem.

Theorem 3  The SK1, SK2, SK7 and SK12 are reduced to the one dimensional nonautonomous equations

  a_{n+1} = a_n − (−4)^n c/a_n,  c = a_0b_0,  (6)
  a_{n+1} = a_n + 4^n c/a_n,  c = a_0b_0,  (7)

  a_{n+1} = (1/2)( a_n + √(a_n² − c/4^n) ),  c = a_0b_0,  (8)
  a_{n+1} = (a_n² + 16^n c)² / ( a_n(a_n² − 16^n c) ),  c = a_0b_0  (9)

for n = 0, 1, 2, ..., respectively.

4. General solution of AHM

In order to obtain solutions of SK2 and SK7, we consider the arithmetic-harmonic mean equation [3],

  AHM: a_{n+1} = (a_n + b_n)/2,  b_{n+1} = 2a_nb_n/(a_n + b_n)  (10)

for n = 0, 1, 2, ..., and its general solution [2]. Let c = a_0b_0. A conserved quantity of AHM is given by c = a_nb_n for n = 0, 1, 2, .... Suppose that c ≠ 0. Substituting b_n in (10) by b_n = c/a_n, we obtain the one dimensional equation a_{n+1} = Φ_c(a_n) for n = 0, 1, 2, .... Here, Φ_c is defined with c by

  Φ_c(x) = (1/2)( x + c/x ).  (11)

The general solution of a_{n+1} = Φ_c(a_n) is shown in [2]. We briefly review it. Let R̄ = R ∪ {∞} and S¹ = {z ∈ C | |z| = 1}. The map Φ_c: R̄ → R̄ is conjugate (cf. [4, pp. 108–109]) to x²: R̄ → R̄ if c > 0, or to x²: S¹ → S¹ if c < 0. Namely, Φ_c is expressed as

  Φ_c = ϕ_c^{−1} ∘ x² ∘ ϕ_c  (12)

with a homeomorphic map ϕ_c. Here, the map ϕ_c and its inverse ϕ_c^{−1} are given with c by

  ϕ_c(x) = (x − √c)/(x + √c),  ϕ_c^{−1}(x) = √c (1 + x)/(1 − x).  (13)

If c < 0, then we treat √c as a single-valued function such that √c = i√(−c). Here, i is the imaginary unit. Employing (12), we obtain the solution by a_n = (ϕ_c^{−1} ∘ x^{2^n} ∘ ϕ_c)(a_0) for n = 0, 1, 2, .... Hence, the general solution of a_{n+1} = Φ_c(a_n) is obtained by

  a_n = √c ( λ_1^{2^n} + λ_2^{2^n} ) / ( λ_1^{2^n} − λ_2^{2^n} ),  n = 0, 1, 2, ...,  (14)

where λ_1 = a_0 + √c, λ_2 = a_0 − √c.

5. General solution of SK2

We consider the general solution of SK2. Let a_n = 2^n ã_n in (7). Then, (7) yields the autonomous equation

  ã_{n+1} = (1/2)( ã_n + c/ã_n ),  n = 0, 1, 2, ....  (15)

Suppose that c ≠ 0. Eq. (15) is expressed as ã_{n+1} = Φ_c(ã_n), where Φ_c is defined by (11). Thus, the solution of SK2 can be derived from (14). Recall that b_n = 4^n c/a_n. We obtain the following theorem.

Theorem 4  Let c = a_0b_0. Suppose that a_0b_0 ≠ 0 and a_0 + b_0 ≠ 0. The general solution of SK2 is

  a_n = 2^n √c ( λ_1^{2^n} + λ_2^{2^n} ) / ( λ_1^{2^n} − λ_2^{2^n} ),  b_n = 2^n √c ( λ_1^{2^n} − λ_2^{2^n} ) / ( λ_1^{2^n} + λ_2^{2^n} )

for n = 0, 1, 2, ..., where λ_1 = a_0 + √c, λ_2 = a_0 − √c.

6. General solution of SK7

We consider the general solution of SK7. Let a_n = ã_n/2^n in (8). Then, (8) yields the autonomous equation

  ã_{n+1} = ã_n + √(ã_n² − c),  n = 0, 1, 2, ....  (16)

We here consider the map Φ_c defined by (11). Since Φ_c is a two-to-one map, the inverse of Φ_c has two branches. Suppose that c ≠ 0. Let U_c = {x | |x| > |√c|}, U'_c = {x | |x| > max(0, √c)} and T_c = {x | |x| < |√c|} for c. Let us introduce Φ̂_c, Φ̃_c with c by

  Φ̂_c(x) = x + sgn(x)√(x² − c),  x ∈ U'_c,  (17)
  Φ̃_c(x) = x − sgn(x)√(x² − c),  x ∈ U'_c.  (18)

Here, sgn(x) denotes sgn(x) = x/|x| for x ≠ 0. Hence, the inverses of the maps Φ_c: U_c → U'_c, Φ_c: T_c → U'_c are Φ̂_c: U'_c → U_c, Φ̃_c: U'_c → T_c, respectively.

By Lemma 1 and a_n = ã_n/2^n, we have ã_n > 0 and sgn(ã_n) = 1 for n = 0, 1, 2, .... By use of (17), (16) is expressed as ã_{n+1} = Φ̂_c(ã_n) for n = 0, 1, 2, ....

Suppose that c > 0. The inverse of Φ_c: (√c, ∞) → (√c, ∞) is Φ̂_c: (√c, ∞) → (√c, ∞). It follows from (13) that ϕ_c: (√c, ∞) → (0, 1) is homeomorphic. The map x^{1/2}: (0, 1) → (0, 1) is a bijection. By (12), we hence obtain

  Φ̂_c = ϕ_c^{−1} ∘ x^{1/2} ∘ ϕ_c,  (19)

and

  Φ̂_c: (√c, ∞) → (0, 1) → (0, 1) → (√c, ∞)

via ϕ_c, x^{1/2} and ϕ_c^{−1}.

Suppose that c < 0. The inverse of Φ_c: (√(−c), ∞) → (0, ∞) is Φ̂_c: (0, ∞) → (√(−c), ∞). Let S_m = {e^{iθ} ∈ C | −mπ/2 < θ < 0} for m = 1, 2. Then, it follows from (13) and √c = i√(−c) that ϕ_c(x) = e^{iθ}, θ = −2 tan^{−1}(√(−c)/x) for x ∈ R, and that ϕ_c^{−1}(e^{iθ}) = −√(−c) sin θ/(1 − cos θ) for θ ∈ [0, 2π]. It turns out that ϕ_c: (0, ∞) → S_2, ϕ_c: (√(−c), ∞) → S_1 are homeomorphic. The map x^{1/2}: S_2 → S_1 is a bijection. By (12), we obtain (19) again, and

  Φ̂_c: (0, ∞) → S_2 → S_1 → (√(−c), ∞)

via ϕ_c, x^{1/2} and ϕ_c^{−1}. Thus, we obtain the following lemma.

Lemma 5  If c > 0, then Φ̂_c: (√c, ∞) → (√c, ∞) is conjugate to x^{1/2}: (0, 1) → (0, 1). If c < 0, then Φ̂_c: (0, ∞) → (√(−c), ∞) is conjugate to x^{1/2}: S_2 → S_1.

The map Φ̂_c composed with itself is expressed as

  Φ̂_c ∘ Φ̂_c = ϕ_c^{−1} ∘ x^{1/2} ∘ ϕ_c ∘ ϕ_c^{−1} ∘ x^{1/2} ∘ ϕ_c  (20)

by (19). If c > 0 or c < 0, then ϕ_c ∘ ϕ_c^{−1}: (0, 1) → (0, 1) or ϕ_c ∘ ϕ_c^{−1}: S_1 → S_1 is the identity map. The composite (20) is written as

  Φ̂_c ∘ Φ̂_c = ϕ_c^{−1} ∘ x^{1/4} ∘ ϕ_c.

Hence, we obtain ã_2 = Φ̂_c(Φ̂_c(ã_0)) = (ϕ_c^{−1} ∘ x^{1/4} ∘ ϕ_c)(ã_0). Repeatedly, we obtain

  ã_n = (ϕ_c^{−1} ∘ x^{2^{−n}} ∘ ϕ_c)(ã_0),  n = 0, 1, 2, ....  (21)

Recall that a_n = ã_n/2^n, b_n = c/(4^n a_n). By (21), we obtain the following theorem.

Theorem 6  Let c = a_0b_0. Suppose that a_0 > 0, b_0 ≠ 0

– 46 – JSIAM Letters Vol. 3 (2011) pp.45–48 Koichi Kondo and a0 − b0 > 0. The general solution of SK7 is Hence, we have √ − − √ − − 2 n 2 n 2 n 2 n 2 − 4 c λ1 + λ2 c λ1 − λ2 2 4 sl (x)(1 sl (x)) an = − − , bn = − − sl (2x) = 4 . (28) 2n λ 2 n − λ 2 n 2n λ 2 n + λ 2 n (1 + sl (x))2 1 2 √ 1 2 √ Suppose that c > 0. Let s = 1. Substituting x = for n = 0, 1, 2,... , where λ1 = a0 + c, λ2 = a0 − c. √ s c/ sl2(x) in (25) and employing (28), we obtain ( √ ) √ 7. Umeno equation s c s c Ψc = . (29) In order to obtain solutions of SK12 and SK1, we con- sl2(x) sl2(2x) sider Umeno equation [5], which is one dimensional solv- 2 R → able chaotic system. The equation is given by Here, we can see that sl : [0, 1]√ by (27). Let σ be a constant which satisfies sl2(σ) = c/|a˜ | for a given − − − √ √ 0 4un(1 un)(1 lun)(1 mun) a˜ under |a˜ | ≥ c. Leta ˜ = sgn(˜a ) c/ sl2(2nσ) for un+1 = (22) 0 0 n 0 2 3 4 n 1 + Aun + Bun + Cun n = 0, 1, 2,... . Substituting s = sgn(˜a0) and x = 2 σ in for n = 0, 1, 2,... , where A = −2(l + m + lm), B = 8lm, (29), we have Ψc(˜an) =a ˜n+1. Hence,a ˜n is a solution of C = l2 + m2 − 2lm − 2l2m − 2lm2 + l2m2, and l, m are (24). Let us introduce Fc(n, α), µc(α) with c > 0 by √ real numbers such that −∞ < m ≤ l < 1. 1 sgn(α) c −1 c 4 In [5], it was proved that the Umeno equation has Fc(n, α) = , µc(α) = sl 2 n 1 sl (2 µc(α)) |α| 2 a solution in terms of Weierstrass elliptic function ℘(x) √ with the help of its duplication formula. It is well-known under |α| ≥ c. Then, it holds thata ˜n = Fc(n, a˜0) and that ℘(x) is related to Jacobi elliptic function sn(x; k) (cf. [6, p. 505]). The solution is equally rewritten as Ψc(Fc(n, α)) = Fc(n + 1, α). (30) √ We thus obtain the following lemma. k2 sn2(2nσ; k) l − m √ un = , k = , (23) | | ≥ l − m dn2(2nσ; k) 1 − m Lemma 7 Suppose that c > 0 and a˜0 c. Then, √ the solution of a˜n+1 = Ψc(˜an) is a˜n = Fc(n, a˜0). 
where dn(x; k) = 1 − k2 sn2(x; k) and σ is a constant We here consider relationship between the map Ψc and determined by u0. the map Φc defined by (11). Computing Φc composed with Φ− , we obtain 8. Particular solutions of SK12 c n Ψc = Φc ◦ Φ−c (31) We consider solutions of SK12. Let an = 4 a˜n in (9). Then, (9) yields autonomous equation for any c ≠ 0. Replacing c in (31) with −c, we have 2 2 (˜a + c) Ψ− = Φ− ◦ Φ . (32) a˜ = n , n = 0, 1, 2,.... (24) c c c n+1 2 − 4˜an(˜an c) It follows from (32) that Φc ◦ Ψ−c = Φc ◦ Φ−c ◦ Φc. Suppose that c ≠ 0. Let us introduce Ψc with c by Employing (31), we have (x2 + c)2 ◦ ◦ Ψ (x) = . (25) Φc Ψ−c = Ψc Φc. (33) c 2 − 4x(x c) ¯ ¯ The map Φc : R → R is continuous, onto, and at most ¯ ¯ ¯ ¯ Eq. (24) is expressed asa ˜n+1 = Ψc(˜an) for√ n = 0, 1,... . two-to-one map. The maps Ψc : R → R,Ψ−c : R → R Suppose that c > 0. Leta ˜n = sgn(˜a0) c/un in (24). satisfy (33). Thus, we obtain the following lemma. Then, (24) becomes ¯ ¯ Lemma 8 The map Ψ−c : R → R is semiconjugate (cf. 2 ¯ ¯ 4un(1 − un ) [4, p. 125]) to the map Ψc : R → R with semiconjugacy un+1 = 2 2 , n = 0, 1, 2,.... (26) ̸ (1 + un ) Φc for c = 0. Eq. (26) is equal to the special case of (22) with l = Suppose that c < 0. We consider a solution of (24). 0, m = −1, so that we can derive a solution of (26) Leta ˆn be a solution of another equation from (23) with l = 0, m = −1. Thus, we obtain u = √ n aˆn+1 = Ψ−c(ˆan), n = 0, 1, 2,.... (34) k2 sn2(2nσ; k)/ dn2(2nσ; k) with k = 1/ 2. √ | | − − We can also directly obtain a solution of (24). We in- fora ˆ0 such that aˆ0 > c. Note here that c > 0. troduce lemniscate elliptic function sl(x) (cf. [6, p. 524]), By Lemma 7, we obtain a solution of (34) bya ˆn = which is defined by inverse of an integral F−c(n, aˆ0). Mapping both sides of (34) by Φc, we have ∫ x dt Φc(ˆan+1) = Φc(Ψ−c(ˆan)), n = 0, 1, 2,.... (35) sl−1(x) = √ , 4 0 1 − t By (33), (35) becomes and expressed as √ √ Φc(ˆan+1) = Ψc(Φc(ˆan)), n = 0, 1, 2,.... 
(36) sn( 2x; 2−1) sl(x) = √ √ √ . (27) Leta ˜n = Φc(ˆan) for n = 0, 1, 2,... . Then, (36) is writ- −1 2 dn( 2x; 2 ) ten asa ˜n+1 = Ψc(˜an) for n = 0, 1, 2,... , which is By virtue of (27), we can derive a duplication formula same as (24). Hence, we obtain a solution of (24)√ by | | − of sl(x) from those of sn(x; k), dn(x; k) (cf. [6, p. 496]). a˜n = Φc(F−c(n, aˆ0)), wherea ˆ0 should satisfy aˆ0 > c anda ˜0 = Φc(ˆa0) for a givena ˜0. Recall that (17).
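Eq. (11) defining Φ_c lies outside this excerpt, but (15) forces Φ_c(x) = (x^2 + c)/(2x), Newton's map for √c; assuming that form, the conjugacy (12), the closed form (14) and the identities (31), (33) can all be checked in exact rational arithmetic. This is a verification sketch, not the paper's code:

```python
from fractions import Fraction as F

def Phi(x, c):        # the map (11), as inferred from (15): Phi_c(x) = (x^2 + c)/(2x)
    return (x * x + c) / (2 * x)

def Psi(x, c):        # the map (25)
    return (x * x + c) ** 2 / (4 * x * (x * x - c))

def phi(x, sc):       # phi_c from (13), with sc = sqrt(c)
    return (x - sc) / (x + sc)

def phi_inv(y, sc):   # its inverse from (13)
    return sc * (1 + y) / (1 - y)

c, sc, a0 = F(4), F(2), F(7)   # c = 4, so sqrt(c) = 2 is exact

# conjugacy (12): Phi_c = phi_c^{-1} o x^2 o phi_c
x = F(3, 2)
assert Phi(x, c) == phi_inv(phi(x, sc) ** 2, sc)

# closed form (14) against direct iteration of a_{n+1} = Phi_c(a_n)
l1, l2 = a0 + sc, a0 - sc
a = a0
for n in range(1, 6):
    a = Phi(a, c)
    p = 2 ** n
    assert a == sc * (l1 ** p + l2 ** p) / (l1 ** p - l2 ** p)

# identities (31) and (33), for c of either sign
for cc in (F(3), F(-5)):
    for x in (F(1, 2), F(7, 3), F(-4)):
        assert Psi(x, cc) == Phi(Phi(x, -cc), cc)            # (31)
        assert Phi(Psi(x, -cc), cc) == Psi(Phi(x, cc), cc)   # (33)
print("checks passed: (12), (14), (31), (33)")
```

The `fractions` module keeps every comparison exact, so the assertions test the identities themselves rather than floating-point residuals.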

Since the inverse of Φ_c : U_c → U_c′ is Φ̂_c : U_c′ → U_c, we obtain â_0 = Φ̂_c(ã_0). Thus, the solution becomes ã_n = Φ_c(F_{−c}(n, Φ̂_c(ã_0))). The condition |â_0| > √(−c) is equivalent to ã_0 ≠ 0. We obtain the following lemma.

Lemma 9  Suppose that c < 0 and ã_0 ≠ 0. Then, the solution of ã_{n+1} = Ψ_c(ã_n) is ã_n = Φ_c(F_{−c}(n, Φ̂_c(ã_0))).

Recall that a_n = 4^n ã_n, b_n = 16^n c/a_n and c = a_0 b_0. Let us introduce f_1(x), f_2(x) by

    f_1(x) = sl^2(x),   f_2(x) = sl^2(x)/(1 − sl^4(x)).

By Lemmas 7 and 9, we obtain the following theorem.

Theorem 10  Let c = a_0 b_0 and s = sgn(a_0). If a_0 b_0 > 0 and |a_0| ≥ |b_0| > 0, then the solution of SK12 is

    a_n = s 4^n √c / f_1(2^n σ),   b_n = s 4^n √c f_1(2^n σ)

for n = 0, 1, 2, ..., where

    σ = sl^{-1}((b_0/a_0)^{1/4}).   (37)

If a_0 b_0 < 0, then the solution of SK12 is

    a_n = s 4^n √(−c) / f_2(2^n σ),   b_n = −s 4^n √(−c) f_2(2^n σ)

for n = 0, 1, 2, ..., where

    σ = sl^{-1}((√(1 − a_0/b_0) − √(−a_0/b_0))^{1/2}).   (38)

Theorem 10 gives us particular solutions of SK12 under some conditions. It is a future problem to obtain the general solution of SK12.

9. Particular solutions of SK1

We consider solutions of SK1. Let a_n = 2^n ã_n in (6). Then, (6) becomes

    ã_{n+1} = (1/2)[ã_n + (−1)^{n+1} c/ã_n],   n = 0, 1, 2, ....   (39)

Eq. (39) is expressed as ã_{n+1} = Φ_{c_n}(ã_n) with c_n = (−1)^{n+1} c, where Φ_c is defined by (11). It holds that ã_{2n+1} = Φ_{−c}(ã_{2n}) and ã_{2n+2} = Φ_c(ã_{2n+1}) for n = 0, 1, 2, .... Hence, we have

    ã_{2n+2} = Φ_c(Φ_{−c}(ã_{2n})),   n = 0, 1, 2, ....   (40)

Let ā_n = ã_{2n} for n = 0, 1, 2, .... Recall that (31). Then, (40) becomes ā_{n+1} = Ψ_c(ā_n) for n = 0, 1, 2, ..., whose solutions can be obtained by Lemmas 7 and 9 under some conditions.

We thus obtain solutions of (39) by ã_{2n} = ā_n and ã_{2n+1} = Φ_{−c}(ã_{2n}). Suppose that c > 0 and |ã_0| ≥ √c. By Lemma 7, we obtain ã_{2n} = F_c(n, ã_0) and ã_{2n+1} = Φ_{−c}(F_c(n, ã_0)). Suppose that c < 0 and ã_0 ≠ 0. By Lemma 9, we obtain ã_{2n} = Φ_c(F_{−c}(n, Φ̂_c(ã_0))) and ã_{2n+1} = Φ_{−c}(Φ_c(F_{−c}(n, Φ̂_c(ã_0)))). Recall that (32). We have ã_{2n+1} = Ψ_{−c}(F_{−c}(n, Φ̂_c(ã_0))). Since −c > 0 and |Φ̂_c(ã_0)| ≥ √(−c), we can employ (30). We thus obtain ã_{2n+1} = F_{−c}(n + 1, Φ̂_c(ã_0)).

Recall that a_n = 2^n ã_n, b_n = (−4)^n c/a_n and c = a_0 b_0. We obtain the following theorem.

Theorem 11  Let c = a_0 b_0 and s = sgn(a_0). If a_0 b_0 > 0 and |a_0| ≥ |b_0| > 0, then the solution of SK1 is

    a_{2n} = s 4^n √c / f_1(2^n σ),   b_{2n} = s 4^n √c f_1(2^n σ),
    a_{2n+1} = s 2^{2n+1} √c / f_2(2^n σ),   b_{2n+1} = −s 2^{2n+1} √c f_2(2^n σ)

for n = 0, 1, 2, ..., where σ is the same as (37). If a_0 b_0 < 0, then the solution of SK1 is

    a_{2n} = s 4^n √(−c) / f_2(2^n σ),   b_{2n} = −s 4^n √(−c) f_2(2^n σ),
    a_{2n+1} = s 2^{2n+1} √(−c) / f_1(2^{n+1} σ),   b_{2n+1} = s 2^{2n+1} √(−c) f_1(2^{n+1} σ)

for n = 0, 1, 2, ..., where σ is the same as (38).

Theorem 11 gives us particular solutions of SK1 under some conditions. It is a future problem to obtain the general solution of SK1.

10. Conclusion

The aim of this paper is to obtain solutions of SK1, SK2, SK7 and SK12. We first obtained their conserved quantities, and reduced them to one dimensional nonautonomous equations. We next showed that SK2 and SK7 are transformed to AHM, and obtained their general solutions. We finally showed that SK12 and SK1 are related to the solvable chaotic system proposed by Umeno. We also showed that the iteration maps of SK12 and SK1 have self semiconjugacy. Under some conditions, we obtained particular solutions of SK12 and SK1 which are expressed in terms of the lemniscate elliptic function.

Acknowledgments

The author would like to thank Dr. Umeno for helpful suggestions, and the reviewers for their careful readings and insightful suggestions. This research was partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Young Scientists, (B) 21740086.

References

[1] T. Sakaki and S. Kakei, Difference equations with an invariant expressed in terms of the hypergeometric function (in Japanese), Trans. JSIAM, 17 (2007), 455–462.
[2] K. Kondo, Solutions of Sakaki-Kakei equations of type 3, 5 and 6, JSIAM Letters, 2 (2010), 73–76.
[3] Y. Nakamura, Algorithms associated with arithmetic, geometric and harmonic means and integrable systems, J. Comput. Appl. Math., 131 (2001), 161–174.
[4] R. L. Devaney, A First Course in Chaotic Dynamical Systems: Theory and Experiment, Addison-Wesley, Reading, Massachusetts, 1992.
[5] K. Umeno, Method of constructing exactly solvable chaos, Phys. Rev. E, 55 (1997), 5280–5284.
[6] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, 4th ed., Cambridge Univ. Press, Cambridge, 1927.
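Theorems 10 and 11 evaluate sl at the doubled arguments 2^n σ, which rests on the duplication formula (28). Taking only the defining integral sl^{-1}(x) = ∫_0^x dt/√(1 − t^4) as given, (28) can be checked numerically; `sl_inv` and `sl` below are ad hoc helpers (midpoint quadrature plus bisection), not library functions:

```python
import math

def sl_inv(u, steps=4000):
    # sl^{-1}(u) = integral_0^u dt / sqrt(1 - t^4), midpoint rule
    h = u / steps
    return sum(h / math.sqrt(1.0 - (h * (i + 0.5)) ** 4) for i in range(steps))

def sl(x):
    # invert sl_inv on [0, 1) by bisection; valid for 0 <= x < sl_inv(1)
    lo, hi = 0.0, 1.0 - 1e-9
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sl_inv(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

s1, s2 = sl(0.3), sl(0.6)
lhs = s2 ** 2
rhs = 4 * s1 ** 2 * (1 - s1 ** 4) / (1 + s1 ** 4) ** 2   # duplication formula (28)
assert abs(lhs - rhs) < 1e-6
print("duplication formula (28) holds numerically")
```

The same two helpers also give the constants σ in (37) and (38) for concrete initial data, since both are values of sl^{-1}.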

JSIAM Letters Vol. 3 (2011) pp.49–52  © 2011 Japan Society for Industrial and Applied Mathematics

Analysis of credit event impact with self-exciting intensity model

Suguru Yamanaka1, Masaaki Sugihara1 and Hidetoshi Nakagawa2

1 Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
2 Graduate School of International Corporate Strategy, Hitotsubashi University, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8439, Japan
E-mail yamanaka mtec-institute.co.jp
Received May 4, 2011, Accepted June 16, 2011
Abstract
The aim of this article is to examine self-exciting and/or mutually exciting effects on rating migrations. First, we examine with self-exciting/mutually exciting intensity models whether such effects can be observed for rating migrations in Japanese enterprises. Second, we analyze which explanatory variable is more significant to the jump of the intensity via model selection with the Akaike information criterion (AIC).
Keywords credit rating, rating migration, self-exciting intensity
Research Activity Group Mathematical Finance

1. Introduction

In this article, we use a kind of self-exciting/mutually exciting process as a credit event intensity model to analyze credit event impact against event frequency. To be more precise, we demonstrate that our mutually exciting intensity model is to some extent consistent with the sample data, and we also attempt to explain how the jump size of the self-exciting intensity model is related to some explanatory variables.

Self-exciting intensity models have recently been used in credit risk modeling (see [1–4]). For example, in [1], a rating migration intensity model with a self-exciting/mutually exciting property is used, and it is concluded that some rating migrations can give an impact on the intensities of rating migrations of not only the same category but other categories.

Though [1] focused on self-exciting and/or mutually exciting effects among sectors of industry, we examine if there are self-exciting and/or mutually exciting effects between the down-grade and up-grade intensities. Our model is advanced in the sense that the jump effect is introduced more flexibly than in previous works. In particular, our model can explicitly relate the jump size to other variables, which is different from the case where the jump size of the intensity is assumed to be constant or independent of other variables as seen in previous works. With this model, we analyze which explanatory variable is more significant to the jump of the intensity via model selection with the Akaike information criterion (AIC).

The structure of this article is as follows. Section 2 introduces our self-exciting/mutually exciting intensity model. Section 3 presents the sample data of rating migration records of Japanese enterprises. In Section 4, we discuss the existence of self-exciting and/or mutually exciting effects on rating migrations in Japan. Then we consider what is more significant to the jump of the intensity in Section 5. Section 6 gives some concluding remarks.

2. Intensity process

In this section, we present a self-exciting/mutually exciting intensity process, which was originally introduced into the credit risk literature by [1, 5]. Consider a filtered complete probability space (Ω, F, P, {F_t}), where P is the actual probability measure. Here {F_t} is a right-continuous and complete filtration. Let ℓ ∈ {1, 2} denote the types of the credit events. Particularly, we set ℓ = 1 as down-grade and ℓ = 2 as up-grade. For each ℓ, consider marked point processes {(T_n^ℓ, ζ_n^ℓ)}_{n∈N}. Here 0 < T_1^ℓ < T_2^ℓ < ··· is an increasing sequence of totally inaccessible {F_t}-stopping times, which represents the event times of event type ℓ, and the random variable ζ_n^ℓ represents a vector consisting of attributes of the event time T_n^ℓ. We denote the counting process of event ℓ by N_t^ℓ = Σ_{n≥1} 1{T_n^ℓ ≤ t}. Furthermore, we assume different types of events do not occur at the same time.

Suppose each N_t^ℓ has an intensity process λ_t^ℓ. Namely, each λ_t^ℓ is an {F_t}-progressively measurable non-negative process, and the process N_t^ℓ − ∫_0^t λ_s^ℓ ds is an {F_t}-local martingale. We specify λ_t^ℓ with the self-exciting/mutually exciting stochastic process:

    dλ_t^ℓ = κ^ℓ(c^ℓ − λ_t^ℓ)dt + dJ_t^ℓ,   λ_0^ℓ = c^ℓ,
    J_t^ℓ = Σ_{n≥1} f(ζ_n^ℓ) 1{T_n^ℓ ≤ t} + Σ_{n≥1} g(ζ_n^{ℓ′}) 1{T_n^{ℓ′} ≤ t}.

Here, the constants κ^ℓ > 0 and c^ℓ > 0 are parameters to be estimated. Function f(·) represents the self-exciting jump size and g(·) represents the mutually exciting jump size. If

g = 0, the mutually exciting intensity model would be called a self-exciting intensity model.

In Section 4, we examine the existence of self-exciting and/or mutually exciting effects, namely, whether the functions f and g are identically zero or not. For this purpose, we consider two types of simple jump models. The first type of jump model is as follows:

    Model A :  J_t^ℓ = Σ_n δ_1^ℓ 1{T_n^ℓ ≤ t} + Σ_n δ_2^ℓ 1{T_n^{ℓ′} ≤ t}.

Here, the constants δ_1^ℓ and δ_2^ℓ are parameters to be estimated. The event ℓ intensity model with the jump type of Model A indicates that event occurrences of type ℓ cause a self-exciting jump of size δ_1^ℓ. In addition, event occurrences of type ℓ′ cause a mutually exciting jump of size δ_2^ℓ. Model A has a simple jump with constant size. Accordingly, Model A is tractable in parameter estimation. However, if either δ_1^ℓ < 0 or δ_2^ℓ < 0, the intensity with the jump of Model A could be negative. This contradicts the fact that an intensity is non-negative. For this reason, we consider not only Model A but also another type of jump model as follows:

    Model B :  J_t^ℓ = Σ_n min(δ_1^ℓ λ_{T_n^ℓ−}, γ^ℓ) 1{T_n^ℓ ≤ t} + Σ_n min(δ_2^ℓ λ_{T_n^{ℓ′}−}, γ^ℓ) 1{T_n^{ℓ′} ≤ t}.

Here, the constants δ_1^ℓ > −1, δ_2^ℓ > −1 and γ^ℓ > 0 are parameters to be estimated. The jump sizes of Model B are proportional to the intensity before the event and have the upper bound γ^ℓ. The conditions δ_1^ℓ > −1 and δ_2^ℓ > −1 on the proportionality constants keep the intensity non-negative.

In Section 5, we attempt to explain the relation between the self-exciting impact and some explanatory variables. For this purpose, we employ a new jump type, say the affine-function-of-explanatory-variables type, as follows:

    Model C :  J_t^ℓ = Σ_n (a_0 + Σ_m a_m x_m(T_n^ℓ)) 1{T_n^ℓ ≤ t}.

Here, a_0 is a constant term, {x_m} are explanatory variables and {a_m} are coefficients. As we focus on the self-exciting effect in Section 5, Model C has only a self-exciting jump.

3. Data

The data for analysis are the issuer rating migration records of Japanese corporations from April 1, 1998 to March 31, 2009. The ratings are announced by R&I. The record of each rating migration consists of the event date, the issuer's name, the industry it belongs to, the type of event, the current rating and the last rating. The data include ratings for insurance claims paying ability. Excluding rating monitors from the samples, we observe 965 down-grades and 481 up-grades during the period. In the samples, there are 25 rating categories, AAA, AA+, ..., C−, in order of credit worthiness. Hereafter, we represent the rating categories by 1, 2, ..., K for simplicity. Excluding no-business days, we transformed the calendar times April 1, 1998, April 1, 1999, ... to t = 0, 1, .... In our analyses, we slide the event times with uniform random numbers so as to make every event time different. The influence of this treatment is slight, because the number of events in one day is big enough and they are scattered at random.

Figs. 1, 2 and 3 show the distributions of "last rating", "absolute difference of last rating and current rating" and "interval of rating migrations in whole enterprises", respectively. Fig. 1 indicates that most ratings before migration are between 4 and 11. Fig. 2 indicates that most rating migrations are rating changes of one or two ranks. Fig. 3 indicates that there are some cases where more than one rating migration occurs in the same day. Also, most of the event intervals are narrower than a week (5 working days).

4. Existence of self-exciting effect and mutually exciting effect

In this section, we analyze the existence of self-exciting and/or mutually exciting effects in down-grades and up-grades. In other words, we examine whether the down-grade intensity and the up-grade intensity jump or not at occurrences of down-grades and up-grades.

To examine the existence of self-exciting and/or mutually exciting effects, we estimate the jump parameters of both Models A and B. If the estimated values of the self-exciting jump size are significantly δ_1^ℓ ≠ 0, we conclude that the self-exciting effect exists. Similarly, if the estimated values of the mutually exciting jump size are significantly δ_2^ℓ ≠ 0, we conclude that the mutually exciting effect exists. To obtain the estimated jump sizes, we executed a maximum-likelihood approach. The log-likelihood function of the intensity of event ℓ is the following:

    Σ_{n=1}^{N_H^ℓ} log λ_{T_n^ℓ−}^ℓ − ∫_0^H λ_s^ℓ ds.

Here, λ_{t−}^ℓ := lim_{s↑t} λ_s^ℓ. For likelihood maximization, we employed the statistical software R, using the intrinsic function optim. For estimation tractability, we set the search range of γ^ℓ as γ^1 = 100, 125, 150, 175, 200 for the down-grade intensity, and γ^2 = 5.0, 7.5, 10.0, 12.5, 15.0 for the up-grade intensity. If the absolute value of an estimated parameter is larger than twice its standard estimation error (meaning about the 95% significance level), we consider that the self-exciting or mutually exciting effect exists significantly.

Table 1 shows the estimation results of Models A and B. The estimated jump sizes of the down-grade intensity are significantly δ_1^1 > 0 and δ_2^1 < 0. This indicates that the down-grade intensity has a self-exciting property and a mutually exciting effect from up-grades. Namely, the possibility of a down-grade is raised by down-grades and dropped down by up-grades. For the up-grade intensity, the estimated jump sizes are significantly δ_1^2 > 0, indicating the existence of a self-exciting effect. On the other hand, the mutually exciting jump sizes of the up-grade intensity are δ_2^2 < 0 in both Models A and B; however, the jump of Model A is not significantly δ_2^2 < 0. This indicates that the up-grade intensity has a self-exciting effect and a slight mutually exciting effect. Namely, the possibility of an up-grade is raised by up-grades and would be dropped down by down-grades.
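The log-likelihood above has a closed-form compensator for this model class, since between events the intensity relaxes as λ_t = c + (λ_T − c)e^{−κ(t−T)}. The sketch below evaluates it for a single event type with g = 0 and Model A's constant jump δ; `loglik_modelA` and its arguments are illustrative names, not the authors' code:

```python
import math

def loglik_modelA(times, kappa, c, delta, horizon):
    # Log-likelihood sum_n log lambda_{T_n-} - int_0^H lambda_s ds for one
    # event type with Model A jumps (g = 0): between events the intensity
    # relaxes as lambda_t = c + (lambda_T - c) exp(-kappa (t - T)); each
    # event adds the constant jump delta.
    lam, t = c, 0.0                      # lambda_0 = c
    ll, compensator = 0.0, 0.0
    for T in times:
        dt = T - t
        lam_minus = c + (lam - c) * math.exp(-kappa * dt)
        # exact integral of lambda over (t, T):
        compensator += c * dt + (lam - c) * (1 - math.exp(-kappa * dt)) / kappa
        ll += math.log(lam_minus)
        lam = lam_minus + delta          # self-exciting jump at the event
        t = T
    dt = horizon - t                     # tail segment with no event
    compensator += c * dt + (lam - c) * (1 - math.exp(-kappa * dt)) / kappa
    return ll - compensator

print(loglik_modelA([0.2, 0.25, 0.9], kappa=2.0, c=1.0, delta=0.5, horizon=1.0))
```

Maximizing this over (κ, c, δ) with a numerical optimizer (the paper uses R's optim) corresponds to the Model A column of Table 1; Model B would only change the jump line to min(δ·λ, γ).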


Table 1. Estimation result for Models A and B. Values in parentheses are standard estimation errors. Estimated values of γ^ℓ are γ^1 = 125 and γ^2 = 10.0.

                   Model    κ^ℓ       c^ℓ      δ_1^ℓ     δ_2^ℓ
    Down-grade     A        234.43    33.19    150.36    −9.20
    (ℓ = 1)                 (42.33)   (3.09)   (22.64)   (2.10)
                   B        170.66    33.59    2.32      0.38
                            (9.89)    (2.75)   (0.39)    (0.09)
    Up-grade       A        7.24      5.92     6.37      −0.042
    (ℓ = 2)                 (2.01)    (2.91)   (1.72)    (0.095)
                   B        11.75     13.14    0.28      −0.021
                            (0.54)    (0.81)   (0.026)   (0.009)

Table 2. Estimation result on self-exciting impact. Values in parentheses are standard estimation errors.

                   a_0       a_2      a_3       a_4
    Down-grade     193.53    −11.56   34.57     −380.18
                   (36.96)   (3.37)   (14.97)   (137.64)
    Up-grade       9.43      −0.09    1.81      −7.38
                   (5.35)    (0.53)   (2.48)    (12.52)

Fig. 1. Distribution of last rating. Sample span is from April 1, 1998 to March 31, 2009.
Fig. 2. Distribution of the absolute difference of last rating and current rating. Sample span is from April 1, 1998 to March 31, 2009.
Fig. 3. Distribution of interval of rating migrations in whole enterprises. Sample span is from April 1, 1998 to March 31, 2009.

5. Explanation of self-exciting impact

5.1 Explanation of self-exciting impact

In this section, we introduce some information observed at the event time, and use it as explanatory variables of the self-exciting impact. We employ Model C in the analyses and consider the following four explanatory variables:

• x_1(T_n^ℓ) ∈ {1, 2, ..., K}: current rating on T_n^ℓ,
• x_2(T_n^ℓ) ∈ {1, 2, ..., K}: last rating before T_n^ℓ,
• x_3(T_n^ℓ) = x_1(T_n^ℓ) − x_2(T_n^ℓ): difference of current and last ratings,
• x_4(T_n^ℓ) = T_n^ℓ − T_{n−1}^ℓ: time interval from the last migration to the current migration.

Excluding x_1(T_n^ℓ), we employ x_2(T_n^ℓ), x_3(T_n^ℓ) and x_4(T_n^ℓ) as explanatory variables, because x_1(T_n^ℓ) overlaps with x_2(T_n^ℓ) and x_3(T_n^ℓ). Also, we note that x_3 is just the difference of current and last ratings, not the absolute difference.

Table 2 shows the estimation result of Model C. In Table 2, for the down-grade model, the coefficients are estimated significantly, implying that the self-exciting impact becomes larger in one of the following three cases:

• The last rating is high.
• The difference of ratings before and after migration is wide.
• The time interval of rating migration is narrow.

These implications, which are intuitively recognizable, are derived respectively from the significant estimation results a_2 < 0, a_3 > 0 and a_4 < 0. On the other hand, for the up-grade model, the estimation error of each coefficient is not small enough to make significant explanations.

5.2 Selection of explanatory variables

In this subsection, we examine which explanatory variable is more significant to explain the self-exciting impact. We compare several combinations of explanatory variables for Model C and select the combination whose AIC is smaller. In particular, we consider one model better than another when the difference of AIC exceeds one (see [6]). The explanatory variables we consider are the current


rating x_1(T_n^ℓ), the last rating x_2(T_n^ℓ), the difference between the last rating and the current rating x_3(T_n^ℓ), and the rating migration time interval x_4(T_n^ℓ).

Table 3 shows the selection of explanatory variables for the down-grade self-exciting impact. Table 3 indicates the following observations:

• "Rating migration time interval" seems less important when "current rating" and "last rating" exist.
• "Last rating" seems more important than "current rating".
• "Difference of last rating and current rating" seems less important than either "current rating" or "last rating".

Table 4 shows the selection result for the up-grade self-exciting impact explanatory variables. Contrary to the result for down-grades, Table 4 indicates that increasing the number of explanatory variables does not increase the likelihood effectively, and the model with fewer variables tends to be selected. This means that the explanatory variables are not so significant to explain our up-grade samples.

Table 3. Selection of self-exciting impact explanatory variables for down-grade intensity. Order of explanatory variable combinations is associated with AIC.

    a_0 x_1 x_2 x_3 x_4    Number of parameters    LL        AIC
    1 1 1                  5                       3865.1    −7720.2
    1 1 1 1                6                       3865.9    −7719.8
    1 1 1                  5                       3862.7    −7715.3
    1 1                    4                       3861.0    −7714.0
    1 1 1                  5                       3860.6    −7711.3
    1 1                    4                       3858.7    −7709.5
    1 1 1                  5                       3858.9    −7707.8
    1 1                    4                       3857.4    −7706.8
    1 1                    4                       3857.4    −7706.8
    1                      3                       3855.5    −7704.9

Table 4. Selection of self-exciting impact explanatory variables for up-grade intensity. Order of explanatory variable combinations is associated with AIC.

    a_0 x_1 x_2 x_3 x_4    Number of parameters    LL        AIC
    1                      3                       1503.2    −3000.5
    1 1                    4                       1503.5    −2999.0
    1 1                    4                       1503.3    −2998.5
    1 1                    4                       1503.3    −2998.5
    1 1                    4                       1503.2    −2998.5
    1 1 1                  5                       1503.5    −2997.0
    1 1 1                  5                       1503.5    −2997.0
    1 1 1                  5                       1503.3    −2996.6
    1 1 1                  5                       1503.3    −2996.5
    1 1 1 1                6                       1503.5    −2995.0

6. Concluding remarks

We examined the self-exciting effect and the mutually exciting effect on rating migrations. First, we considered mutually exciting intensity models of two jump types, and examined the existence of self-exciting and/or mutually exciting effects. The estimated jump parameters imply that both self-exciting and mutually exciting effects exist in down-grades. Also, we recognized a self-exciting effect and a slight mutually exciting effect in up-grades. Second, we attempted to explain the self-exciting impact with some explanatory variables. We used the self-exciting intensity model whose jump type is an affine function of some explanatory variables. The significance of the explanatory variables was analyzed via model selection with AIC. As a result, we obtained some implications on down-grades which are intuitively recognizable.

The explanatory variables we used are the information about issuer ratings and rating migration intervals. Considering estimation tractability, we did not consider other explanatory variables. However, we are able to consider other explanatory variables, such as the size of the corporation, in the same way. Analyses with additional explanatory variables would be future work. Finally, we would like to mention that the intensity models we considered are naturally applicable to credit risk modeling.

Acknowledgments

This work was supported in part by the Global COE Program "The research and training center for new development in mathematics", MEXT, Japan.

References

[1] H. Nakagawa, Analysis of records of credit rating transition with mutually exciting rating-change intensity model (in Japanese), Trans. JSIAM, 20 (2010), 183–202.
[2] S. Azizpour, K. Giesecke and B. Kim, Premia for correlated default risk, J. Econ. Dyn. Control, 35 (2011), 1340–1357.
[3] K. Giesecke and B. Kim, Risk analysis of collateralized debt obligations, Oper. Res., 59 (2011), 32–49.
[4] S. Yamanaka, M. Sugihara and H. Nakagawa, Modeling of contagious credit events and risk analysis of credit portfolios, Asia-Pacific Financial Markets, in press.
[5] H. Nakagawa, Modeling of contagious downgrades and its application to multi-downgrade protection, JSIAM Letters, 2 (2010), 65–68.
[6] Y. Sakamoto, M. Ishiguro and G. Kitagawa, Akaike Information Criterion Statistics, D. Reidel Pub. Co., Dordrecht, 1986.
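The ranking in Tables 3 and 4 follows the standard definition AIC = 2k − 2·LL, where k is the number of parameters; the AIC column can be reproduced directly from the LL column:

```python
def aic(loglik, n_params):
    # Akaike information criterion: AIC = 2k - 2 * log-likelihood
    return 2 * n_params - 2 * loglik

# top two rows of Table 3 (down-grade intensity)
assert round(aic(3865.1, 5), 1) == -7720.2
assert round(aic(3865.9, 6), 1) == -7719.8
print("AIC values consistent with Table 3")
```

The two-parameter gap between adjacent models explains why a likelihood gain below 1.0 per added variable, as in Table 4, leaves the smaller model selected.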
JSIAM Letters Vol. 3 (2011) pp.53–56  © 2011 Japan Society for Industrial and Applied Mathematics

On the reduction attack against the algebraic surface public-key cryptosystem (ASC04)

Satoshi Harada1, Yuichi Wada2, Shigenori Uchiyama3 and Hiro-o Tokunaga3

1 NRI SecureTechnologies, Ltd., Tokyo 105-7113, Japan
2 Waseda Junior & Senior High School, Tokyo 162-8654, Japan
3 Tokyo Metropolitan University, Tokyo 192-0397, Japan
E-mail s2-harada nri.co.jp, uchiyama-shigenori tmu.ac.jp
Received May 10, 2011, Accepted June 25, 2011
Abstract
In 2004, Akiyama and Goto proposed an algebraic surface public-key cryptosystem (ASC04) which is based on the hardness of finding sections on fibered algebraic surfaces. In 2007, Uchiyama and Tokunaga gave an efficient attack, which is called the reduction attack, against ASC04 under some condition on a public-key of the scheme. In 2008, Iwami proposed an improved attack. In this paper, we point out a flaw in Iwami's attack and propose a generalized reduction attack. The attack is based on Iwami's attack, and the flaw is fixed. We also discuss our experiments of the attack.
Keywords multivariate public-key cryptography, algebraic surface, section finding problem, Gröbner basis, elimination ideal
Research Activity Group Algorithmic Number Theory and Its Applications

1. Introduction

In 1994, Shor proved that the integer factorization problem and the discrete logarithm problem can be solved in probabilistic polynomial time by using quantum computers [1]. Thus, once a quantum computer is realized, public-key cryptosystems based on them would not be secure. For this reason, cryptographic schemes which are expected to have resistance against quantum computers have been researched actively [2]. Algebraic surface public-key cryptosystems (ASCs for short) [3, 4], proposed by Akiyama and Goto, are one of the candidates for such schemes. ASC is based on the hardness of finding sections on fibered algebraic surfaces. This problem is called the section finding problem (SFP for short). SFP is the following problem. (Let k := F_p be a finite prime field of p elements.) Let X(x, y, t) = 0 be an algebraic surface over k; the problem is to find two polynomials u_x(t), u_y(t) ∈ k[t] such that X(u_x(t), u_y(t), t) = 0.

Two of the authors, Uchiyama and Tokunaga, proposed an efficient attack, which is called the reduction attack, against the ASC04 (which is the first implementation of ASC, proposed in 2004) in 2007 [5]. They make use of some fundamental properties of Gröbner bases. The correctness of the reduction attack can be proven under a certain condition on the leading term of a public-key X(x, y, t) with respect to a monomial order in k[x, y, t]. Moreover, Ivanov and Voloch proposed a so-called trace attack in 2008 [6]. Then, Iwami proposed an improved reduction attack [7]. In this paper, we point out a flaw in Iwami's scheme, and propose a generalized reduction attack against the ASC04. The attack is based on Iwami's attack, and the flaw is fixed by our proposal. The correctness of our proposed attack is proven without any conditions. Moreover, we discuss our experiments of the proposed attack.

2. ASC04

In this section, we briefly review the ASC04. See [3] for the detail.

2.1 Secret-Key

Two different curves D_1 and D_2 parameterized with t in A^3(k):

    D_1 : (x, y, t) = (u_x(t), u_y(t), t),
    D_2 : (x, y, t) = (v_x(t), v_y(t), t).

2.2 Public-Key

• Algebraic surface X: X(x, y, t) := Σ_{(i,j)∈Λ_X} c_{ij}(t) x^i y^j = 0 (∈ k[x, y, t]), Λ_X := {(i, j) ∈ (Z_{≥0})^2 | c_{ij}(t) ≠ 0}, satisfying X(u_x(t), u_y(t), t) = X(v_x(t), v_y(t), t) = 0.
• l: an integer satisfying the following condition: deg_t X(x, y, t) < l, and l is the minimum degree of a monic irreducible polynomial f(t) ∈ k[t] given for encryption.
• d: an integer satisfying the following condition: d ≥ max{deg u_x(t), deg u_y(t), deg v_x(t), deg v_y(t)}.

2.3 Encryption

Divide a plaintext m into l blocks as m = m_0||m_1||···||m_{l−1} and embed m_i (0 ≤ m_i < p, i = 0, ..., l − 1) within the coefficients of a plaintext polynomial m(t) ∈ k[t]. Choose a monic irreducible polynomial f(t) ∈ k[t] of degree greater than or equal to l and randomly choose


Table 1. Reduction attack.
Input: Public-key X ∈ k[x, y, t], ciphertext F ∈ k[x, y, t].
Output: Plaintext m corresponding to the ciphertext F(x, y, t).
1. Find the remainder R_1 ∈ k[x, y, t] by dividing F by X.
2. Randomly choose some terms of R_1 of the form c_ij(t) x^i y^j ((i, j) ≠ (0, 0), c_ij(t) ∉ k), and let the set of their coefficients c_ij(t) be C (⊂ k[t]).
3. Factorize the elements of the set C, and let the set of irreducible factors of degree l or more be G (⊂ k[t]).
4. Choose g ∈ G, and find the remainder n by dividing R_1 by g. If n ∉ k[t], choose another g ∈ G.
5. Let n(t) = n_{k-1} t^{k-1} + ... + n_1 t + n_0 ∈ k[t], and compute m = n_0 || n_1 || ... || n_{k-1}.

Table 2. Generalized reduction attack.
Input: Public-key X ∈ k[x, y, t], ciphertext F ∈ k[x, y, t].
Output: Plaintext m corresponding to the ciphertext F(x, y, t).
1. Regard the public-key X as an element of k(t)[x, y], and compute Y := X/LC(X) (Y ∈ k(t)[x, y], LC(X) ∈ k[t]).
2. Find the remainder R_1 ∈ k(t)[x, y] by dividing F by Y.
3. Randomly choose some terms of R_1 of the form c_ij(t) x^i y^j ((i, j) ≠ (0, 0), c_ij(t) ∉ k), change their coefficients c_ij(t) to equivalent fractions with a common denominator, and let the set of the numerators be C (⊂ k[t]).
4. Factorize the elements of the set C, and let the set of irreducible factors of degree greater than or equal to l be G (⊂ k[t]).
5. Choose g ∈ G, and compute a Gröbner basis for the ideal ⟨g, X⟩ w.r.t. the lex order (x > y > t) in k[x, y, t]. Find the remainder n(t) ∈ k[t] by dividing F by the basis.
6. Let n(t) = n_{k-1} t^{k-1} + ... + n_1 t + n_0 ∈ k[t], and compute m = n_0 || n_1 || ... || n_{k-1}.

two polynomials r(x, y, t), s(x, y, t) ∈ k[x, y, t] with some conditions on their degrees. The ciphertext F(x, y, t) ∈ k[x, y, t] is defined as follows:

F(x, y, t) := m(t) + f(t) s(x, y, t) + X(x, y, t) r(x, y, t).

2.4 Decryption
Substituting the sections D_1, D_2 into F(x, y, t), we obtain:

h_1(t) := F(u_x(t), u_y(t), t) = m(t) + f(t) s(u_x(t), u_y(t), t),
h_2(t) := F(v_x(t), v_y(t), t) = m(t) + f(t) s(v_x(t), v_y(t), t).

Factorize h_1(t) − h_2(t) and choose f(t) as an irreducible polynomial with the largest degree. Then m(t) is obtained by dividing h_1(t) by f(t). Finally, we obtain the plaintext m from m(t).

3. Reduction attack
3.1 Reduction attack
In 2007, Uchiyama and Tokunaga proposed an efficient attack against ASC04, called the reduction attack [5] (see Table 1). It makes use of fundamental properties of Gröbner bases. For the proof of its correctness, the following condition is assumed:

Condition 1 For the defining equation of the algebraic surface X, the leading term LT(X) of X w.r.t. a monomial order in k[x, y, t] is of the form c x^α y^β (c ∈ k, (α, β) ≠ (0, 0)).

3.2 Iwami's reduction attack
In 2008, Iwami generalized the reduction attack [7] and claimed that Condition 1 can be dropped. We implemented the attack; however, we could not obtain the valid plaintexts, so there is a flaw in Iwami's scheme. See [7] for the details.

Proposition 2 In Iwami's attack, we have n = 0 in Step 5.

Proof For any g ∈ G in Step 4, g(t) ∈ k[t] ⊂ k(t) ⊂ k(t)[x, y]. Therefore, g(t) is a unit in k(t)[x, y], and we can write:

R_1 = (1/g(t)) R_1 g(t).

Since (1/g(t)) R_1 ∈ k(t)[x, y], we obtain n = r = 0 ∈ k in Step 5. Thus, we cannot obtain the valid plaintext m.
(QED)

4. Generalized reduction attack
4.1 Generalized reduction attack
In this section, we propose a generalized reduction attack (GRA for short). This attack is based on Iwami's attack, with the flaw fixed. See Table 2.

4.2 Analysis of the generalized reduction attack
We can prove the correctness of the generalized reduction attack without using Condition 1, based on the following two theorems.

Theorem 3 In Step 4 of our attack, ∃g ∈ G s.t. g = f(t).

Proof Let I := ⟨Y⟩ ⊂ k(t)[x, y] be the ideal generated by Y. Then {Y} is a Gröbner basis. Since I is a principal ideal, ∀a ∈ I, a = Ḡ Y (Ḡ ∈ k(t)[x, y]). Therefore, there uniquely exist G_1, R_1 ∈ k(t)[x, y] s.t. F = G_1 Y + R_1. This R_1 is clearly equal to the R_1 in Step 2. Similarly, there uniquely exist G_2, R_2 ∈ k(t)[x, y] s.t. s(x, y, t) = G_2 Y + R_2. Therefore, the ciphertext F(x, y, t) = m(t) + f(t) s(x, y, t) + X(x, y, t) r(x, y, t) can be written as follows (note that X = LC(X) Y):

F = m(t) + f(t)(G_2 Y + R_2) + LC(X) Y r
  = m(t) + f(t) R_2 + Y (f(t) G_2 + LC(X) r).

Then no term of m(t) + f(t) R_2 can be divided by LT(Y). Therefore, we obtain R_1 = m(t) + f(t) R_2 by the uniqueness of R_1.
Now, assume R_2 = R_2(t) ∈ k(t). Evaluating the cipher polynomial F at the sections D_1 and D_2, we obtain:

h_1(t) = F(u_x(t), u_y(t), t) = m(t) + f(t) s(u_x(t), u_y(t), t),
h_2(t) = F(v_x(t), v_y(t), t) = m(t) + f(t) s(v_x(t), v_y(t), t).

Since X(u_x(t), u_y(t), t) = LC(X) Y(u_x(t), u_y(t), t) = 0 and LC(X) ≠ 0, we obtain Y(u_x(t), u_y(t), t) = 0. Therefore, s(u_x(t), u_y(t), t) = G_2 Y(u_x(t), u_y(t), t) + R_2 = R_2. Similarly, we have s(v_x(t), v_y(t), t) = R_2. Thus, we obtain:

h_1(t) = m(t) + f(t) R_2 = h_2(t).

Therefore, we cannot decrypt because h_1(t) = h_2(t), and this is a contradiction.

– 54 – JSIAM Letters Vol. 3 (2011) pp.53–56 Satoshi Harada et al.

Thus, there exists a term x^i y^j t^k ((i, j) ≠ (0, 0), k ≥ 0) in the numerator of R_2 and of R_1 (= m(t) + f(t) R_2). Then we randomly choose some terms of R_1 satisfying Step 3 and change them to equivalent fractions with a common denominator. Let the set of the numerators be C. Since every element of C can be divided by f(t), we obtain f(t) ∈ G.
(QED)

Note: In what follows, we use f instead of g, since we can obtain f(t) = g (∈ G) by Theorem 3.

Theorem 4 n(t) in Step 5 is the plaintext polynomial m(t).

Proof Let the ideal I be I := ⟨X, f⟩, and let a Gröbner basis for I be GB(I) := {f_1, ..., f_s}. Moreover, the Gröbner basis for I ∩ k[t] is equal to GB(I) ∩ k[t] by the theory of elimination ideals. We gather the f_i ∈ GB(I) ∩ k[t] from GB(I) and change the indices of the f_i into ascending order of degree, obtaining GB(I) ∩ k[t] = {f_{i1}, ..., f_{il}}. Since we can regard GB(I) ∩ k[t] as the reduced Gröbner basis, we have:

GB(I) ∩ k[t] = {f_{i1}}.

Now we prove f_{i1}(t) = f(t). First, we show that f_{i1}(t) is divisible by f(t). There exist a(x, y, t), b(x, y, t) ∈ k[x, y, t] s.t. f_{i1}(t) = a(x, y, t) X(x, y, t) + b(x, y, t) f(t), since f_{i1} ∈ I ⊂ k[x, y, t]. Substituting the secret-key (x, y, t) = (u_x(t), u_y(t), t) into f_{i1}, we have (note that X(u_x(t), u_y(t), t) = 0):

f_{i1}(t) = b(u_x(t), u_y(t), t) f(t).

Setting b̃(t) := b(u_x(t), u_y(t), t), we obtain:

f_{i1}(t) = b̃(t) f(t)   (b̃(t) ∈ k[t]).

Secondly, we show that f(t) is divisible by f_{i1}(t). Since f ∈ I ∩ k[t] and {f_{i1}} is a Gröbner basis, we have:

f(t) = c(t) f_{i1}(t)   (c(t) ∈ k[t]).

Therefore, we have:

f(t) = c(t) f_{i1}(t) = c(t) b̃(t) f(t).

Since c(t) b̃(t) = 1 and b̃, c ∈ k, we obtain GB(I) ∩ k[t] = {f(t)}.
Thus, we obtain GB(I) = {f(t), f_2, ..., f_s} s.t. LT(f_i) = x^α y^β t^γ (2 ≤ i ≤ s, (α, β) ≠ (0, 0), γ ≥ 0). Since we compute the Gröbner basis for the ideal I w.r.t. the lex order (x > y > t) in k[x, y, t], we have:

LT(f) ∈ k[t],   LT(f_i) ∉ k[t]   (2 ≤ i ≤ s).

Then we consider dividing the ciphertext F = m(t) + sf + Xr by GB(I). No term of m(t) ∈ k[t] can be divided by LT(f_i) (2 ≤ i ≤ s). Furthermore, no term of m(t) ∈ k[t] can be divided by LT(f), because deg m(t) = l − 1 and deg f(t) = l. Since sf + Xr ∈ I, sf + Xr is divisible by GB(I). Thus, by the uniqueness of the remainder of division by a Gröbner basis, the remainder of dividing the ciphertext F by GB(I) is m(t).
(QED)

Table 3. Improved generalized reduction attack.
Input: Public-key X ∈ k[x, y, t], ciphertext F ∈ k[x, y, t].
Output: Plaintext m corresponding to the ciphertext F(x, y, t).
1. Regard the public-key X as an element of k(t)[x, y], and compute Y := X/LC(X) (Y ∈ k(t)[x, y], LC(X) ∈ k[t]).
2. Find the remainder R_1 ∈ k(t)[x, y] by dividing F by Y.
3. Randomly choose some terms of R_1 of the form c_ij(t) x^i y^j ((i, j) ≠ (0, 0), c_ij(t) ∉ k), change their coefficients c_ij(t) to equivalent fractions with a common denominator, and let the set of the numerators be C (⊂ k[t]).
4. Factorize the elements of the set C, and let the set of irreducible factors of degree greater than or equal to l be G (⊂ k[t]).
5. Choose g ∈ G, and compute a normal form n of F by {g, X} w.r.t. the lex order (x > y > t) in k[x, y, t]. If the remainder n is a univariate polynomial n(t) ∈ k[t], go to Step 7; otherwise go to Step 6.
6. Choose g ∈ G, and compute a Gröbner basis for the ideal ⟨g, X⟩ w.r.t. the lex order (x > y > t) in k[x, y, t]. Find the remainder n(t) ∈ k[t] by dividing F by the basis.
7. Let n(t) = n_{k-1} t^{k-1} + ... + n_1 t + n_0 ∈ k[t], and compute m = n_0 || n_1 || ... || n_{k-1}.

By Theorems 3 and 4, we can prove that this algorithm is effective for ASC04.

5. Efficiency of the generalized reduction attack
When we implement the GRA, it takes a long time to compute a Gröbner basis, so the GRA is not very efficient in many cases. From a practical point of view, we need to reduce its running time. Here we propose an improved method for the GRA, obtained by adding a step (Step 5 in Table 3) just before the Gröbner-basis computation. We call this attack the IGRA for short. See Table 3 for the details.
If n(t) ∈ k[t] in Step 5, then n(t) is the plaintext polynomial m(t). We have the following theorem.

Theorem 5 If a normal form of F by {f, X} w.r.t. the lex order is a univariate polynomial n(t) ∈ k[t] in Step 5 of Table 3, then n(t) is the plaintext polynomial m(t) for ASC04.

Proof Let the ideal I be I := ⟨X, f⟩, and let a Gröbner basis for I be GB(I) := {f_1, ..., f_s}. As shown in the proof of Theorem 4,

GB(I) ∩ k[t] = {f(t)}.

Therefore, we can assume f_1 = f(t) and LT(f_i) = x^{αi} y^{βi} t^{γi} (2 ≤ i ≤ s, (αi, βi, γi) ∈ Z^3_{≥0}, (αi, βi) ≠ (0, 0)). By Theorem 4, since we can obtain the valid plaintext polynomial m(t) by dividing the ciphertext F(x, y, t) by GB(I), we have:

F(x, y, t) = m(t) + f_1 g_1 + f_2 g_2 + ... + f_s g_s   (LT(f_i) ∤ m(t), g_i ∈ k[x, y, t], 1 ≤ i ≤ s).

Moreover, we can obtain a univariate polynomial n(t) as a normal form of F by {f, X}, and we have:

F(x, y, t) = n(t) + f h_1 + X h_2   (h_1, h_2 ∈ k[x, y, t]).

Therefore, computing the difference of the two expressions, we obtain (note that f_1 = f(t)):

n(t) − m(t) = f_1(g_1 − h_1) + f_2 g_2 + ... + f_s g_s − X h_2.


Since f_1(= f), f_2, ..., f_s, X ∈ I, we obtain n(t) − m(t) ∈ I ∩ k[t] (= ⟨f(t)⟩). Moreover, since deg m(t) < l ≤ deg f(t) and deg n(t) < deg f(t), we obtain f(t) ∤ (n(t) − m(t)). Therefore, we obtain:

n(t) − m(t) = 0 ⟺ n(t) = m(t).

(QED)
By Theorem 5, we do not need to compute a Gröbner basis if n(t) ∈ k[t] in Step 5, and we can find the plaintext m efficiently.

6. Implementation
In this section, we show some experimental results for the GRA (Table 2) and the IGRA (Table 3). We used a Solaris 10 system with a 2 GHz CPU (AMD Opteron 246), 4 GB of memory, and a 160 GB hard disk. Moreover, we used Magma [8] (Ver. 2.16-4) as the software for writing the programs.

Table 4. IGRA.
p    d    l     avg. [s]   memory [MB]
17   20   160   0.152      11.78
17   50   400   0.572      15.02

Table 5. p = 17, d = 5, l = 50.
           GRA       IGRA
time [s]   521.350   0.010

(a) IGRA. We describe the experimental results for the IGRA. For each (p, d, l), we generated 100 sets (X, f, s, r, m) randomly. See Table 4 for the results. We could efficiently find the valid plaintext m even for larger parameter sizes. In the above IGRA runs, m(t) ∈ k[t] was obtained at Step 5, so we did not need to compute a Gröbner basis at Step 6. We note, however, that there exist some cases in which Step 6 is needed.

(b) GRA vs. IGRA. We compared the GRA with the IGRA. As stated in the previous section, computing a Gröbner basis in the GRA generally takes a long time. In fact, for some parameters it takes more than several hours. See Table 5 for the results. In Table 5, the average running time over 100 randomly generated sets (X, f, s, r, m) for p = 17, d = 5, l = 60 is shown for the IGRA. For the GRA, on the other hand, only the fastest running time observed in the experiments is shown, since its running time was too long and we had to terminate the program before it finished in most cases.

7. Conclusion
We proposed a generalized reduction attack against ASC04, and the flaw in Iwami's attack was fixed by our proposal. We also showed some experimental results for the proposed attack. One of our future works is to evaluate the computational complexity of the generalized reduction attack along the lines of [9], which is an attack against ASC09, where ASC09 is another implementation of the ASC [4, 10].

Acknowledgments
The authors would like to thank the reviewers for their valuable comments. This work was supported in part by a Grant-in-Aid for Scientific Research (C) (20540125).

References
[1] P. W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM J. Comput., 26 (1997), 1484–1509.
[2] T. Okamoto, K. Tanaka and S. Uchiyama, Quantum public-key cryptosystems, in: Proc. of Crypto 2000, LNCS 1880, pp. 147–165, Springer, 2000.
[3] K. Akiyama and Y. Goto, A public-key cryptosystem using algebraic surfaces, in: Proc. of PQCrypto 2006, pp. 119–138, 2006.
[4] K. Akiyama, Y. Goto and H. Miyake, An algebraic surface cryptosystem, in: Proc. of PKC 2009, LNCS 5443, pp. 425–442, Springer, 2009.
[5] S. Uchiyama and H. Tokunaga, On the security of the algebraic surface public-key cryptosystems (in Japanese), in: Proc. of SCIS 2007, 2C1-2, 2007.
[6] P. Ivanov and J. F. Voloch, Breaking the Akiyama–Goto cryptosystem, in: Proc. of AGCT 11, Contemporary Math. 487, pp. 113–118, 2009.
[7] M. Iwami, A reduction attack on algebraic surface public-key cryptosystems, in: Proc. of ASCM 2007, LNCS 5081, pp. 323–332, Springer, 2008.
[8] Magma, http://magma.maths.usyd.edu.au/magma/.
[9] J.-C. Faugère and P.-J. Spaenlehauer, Algebraic cryptanalysis of the PKC'2009 algebraic surface cryptosystem, in: Proc. of PKC 2010, LNCS 6056, pp. 35–52, Springer, 2010.
[10] K. Akiyama and Y. Goto, An improvement of the algebraic surface public-key cryptosystem, in: Proc. of SCIS 2008, 1F1-2, 2008.

– 56 –

JSIAM Letters Vol. 3 (2011) pp. 57–60 © 2011 Japan Society for Industrial and Applied Mathematics

Deterministic volatility models and dynamics of option returns

Takahiro Yamamoto1 and Koichi Miyazaki1

1 Graduate School of Informatics and Engineering, The University of Electro-Communications, 1-5-1 Chohugaoka, Chohu-shi, Tokyo 182-8585, Japan
E-mail y1030103 edu.cc.uec.ac.jp
Received June 28, 2011, Accepted July 12, 2011

Abstract
In this research, we revamp the approach of Buraschi and Jackwerth (2001), especially in the derivation of the pricing-kernel and the data-handling technique, and then empirically analyze the consistency of the DVMs introduced in Mawaribuchi, Miyazaki and Okamoto (2009) with the dynamics of the cross-sectional option returns. The implication attained from our quantitative analyses is that, even in a trending and volatile market, we can build equity models that are rational to the dynamics of the cross-sectional option market prices within the framework of a complete model, without incorporating an additional stochastic variable such as a jump or stochastic volatility.

Keywords pricing kernel, deterministic volatility model, NIKKEI225 option

Research Activity Group Mathematical Finance

1. Introduction
One extension of the famous BS equity model (geometric Brownian motion) is the deterministic volatility model (DVM for short), whose volatility is a deterministic function of the equity price (Dupire (1994) [1] among others). Mawaribuchi, Miyazaki and Okamoto (2009) [2] calibrate their newly introduced 5-parameter DVM to cross-sectional option market prices on an evaluation date and report that the model prices of options derived from the 5-parameter DVM are quite close to their corresponding market prices on that date. The purpose of this study is to discuss whether the model prices of options derived from the DVMs are close to their corresponding market prices time-series-wise; in short, whether the DVMs can capture the dynamics of the market prices of options.
The preceding research of Buraschi and Jackwerth (2001) [3] statistically examines the consistency of the pricing-kernel induced from the DVM with the time series of returns of the S&P 500 options (ATM, OTM, ITM) by the GMM (Generalized Method of Moments) technique. We revamp their approach, especially in the derivation of the pricing-kernel and the data-handling technique, and then empirically analyze the consistency of the DVMs introduced in Mawaribuchi, Miyazaki and Okamoto (2009) [2] with the dynamics of the cross-sectional option returns.
The organization of this letter is as follows. In Section 2, we provide the statistical method to evaluate the consistency of the DVM pricing-kernel with the dynamics of option returns. In Section 3, we report the results of our empirical analyses on the NIKKEI225 options market and provide the implications of the results. In Section 4, a summary and concluding remarks are added.

2. Quantitative methods
2.1 Framework of the quantitative analyses based on the pricing-kernel
The purpose of this research is to discuss whether the DVMs are able to capture the dynamics of the cross-sectional option market prices. To this end, we attempt to examine statistically whether the option returns from the DVMs are consistent with the realized returns of the options (ATM, OTM, ITM) time-series-wise. In the derivation of the realized returns from the ITM, ATM and OTM option market prices, we regard these options as individual assets and compute the realized returns of the assets under the empirical measure. To make all the analyses proceed under the empirical measure, we introduce the pricing-kernel induced from the DVM and statistically examine, by the GMM technique, whether the market prices of the options (ATM, OTM, ITM) multiplied by the pricing-kernel are all close to 1 time-series-wise.
The pricing-kernel m_{t,t+Δt} (the suffix indicates the time interval from time t to time t + Δt) satisfies (1): the asset price S_t at time t can be evaluated by taking the expectation of the product of the asset price S_{t+Δt} and the pricing-kernel m_{t,t+Δt}, under the empirical measure E_t conditioned on S_t:

S_t = E_t[m_{t,t+Δt} S_{t+Δt}].    (1)

Transforming (1) by S_{t+Δt}/S_t = R^S_{t,t+Δt}, we attain (2) for the pricing-kernel and the gross returns:

1 = E_t[m_{t,t+Δt} R^S_{t,t+Δt}],    (2)

where R^S_{t,t+Δt} is the gross return from time t to t + Δt of the asset S. As the assets to be examined, we adopt four kinds of

– 57 – JSIAM Letters Vol. 3 (2011) pp. 57–60, Takahiro Yamamoto et al.

assets such as the NIKKEI225 index, the ATM option, the OTM option and the ITM option. Denoting the vector consisting of the gross returns of the four assets by R_{t,t+Δt} = [R^S_{t,t+Δt}, R^{ATM}_{t,t+Δt}, R^{OTM}_{t,t+Δt}, R^{ITM}_{t,t+Δt}]′ (for example, R^{ATM}_{t,t+Δt} indicates the gross return from time t to t + Δt of the ATM option), we statistically examine whether all of the components in the expectation of the gross return vector multiplied by the DVM pricing-kernel are close to 1 (convergence in (3)), using the GMM technique:

h_t = 1 − E_t[m_{t,t+Δt} R_{t,t+Δt}] → 0.    (3)

Table 1. Three kinds of the DVMs.
2P-DVM: σ(S_t, t) = a S_t^b
3P-DVM: σ(S_t, t) = a + b [1 − tanh(c (S_t − S_0)/S_0)]
5P-DVM: σ(S_t, t) = a + b [1 − tanh(c (S_t − S_0)/S_0)] + d [1 − sech(e (S_t − S_0)/S_0)]

2.2 Construction of the pricing-kernel
Assuming that the equity process follows the DVM in (4) and that the risk-free interest rate r is not equal to 0, we introduce the pricing-kernel m_{t,t+Δt} that is able to discount both the bond and the equity returns.

dS_t = μ S_t dt + σ(S_t, t) S_t dW_t,
S_t = S_0 exp[(μ − σ(S_t, t)²/2) t + σ(S_t, t) W_t],    (4)

where W_t is a Wiener process and σ(S_t, t) is the volatility. The pricing-kernel should satisfy (5):

S_t / B_t = E_t[m_{t,t+Δt} (S_{t+Δt} / B_{t+Δt})].    (5)

Eq. (5) is the extension of the pricing-kernel of the preceding research (derived assuming that the risk-free interest rate is equal to 0), which could discount only the equity return.
The stochastic process ξ_t in (6) satisfies ξ_0 = 1, ξ_T > 0 and ξ_t = E_t[ξ_{t+Δt}], and m_{t,t+Δt} = ξ_{t+Δt}/ξ_t also satisfies (5) and is found to be the pricing-kernel:

ξ_t = e^{−rt} exp[ −((μ − r)²/(2σ(S_t, t)²)) t − ((μ − r)/σ(S_t, t)) W_t ].    (6)

Replacing the small time interval with the unit time interval 1 and taking logarithms of the pricing-kernel, we get (7):

ln m_{t,t+1} = −r − (μ − r)²/(2σ(S_t, t)²) − ((μ − r)/σ(S_t, t)) (W_{t+1} − W_t).    (7)

Removing the Wiener process in (7) by way of (4), we can derive (8):

ln m_{t,t+1} = −r − (μ − r)²/(2σ(S_t, t)²) + ((μ − r)/σ(S_t, t)²)(μ − σ(S_t, t)²/2) − ((μ − r)/σ(S_t, t)²) ln(S_{t+1}/S_t).    (8)

Setting the risk-free interest rate to 0% in (8), the pricing-kernel reduces to (9):

ln m_{t,t+1} = μ(μ − σ(S_t, t)²)/(2σ(S_t, t)²) − (μ/σ(S_t, t)²) ln(S_{t+1}/S_t).    (9)

Due to the market environment in which the Japanese risk-free interest rate was around 0 for most of the period, we adopt (9) as the DVM pricing-kernel. Regarding the specific functional form of the volatility σ(S_t, t), we examine the 2-parameter DVM, the 3-parameter DVM and the 5-parameter DVM of Mawaribuchi, Miyazaki and Okamoto (2009) [2], listed in Table 1.

2.3 Setting, data-handling and statistical method
2.3.1 Setting
In this quantitative analysis, the most important thing is to decide the period of the analysis appropriately. When we test the hypothesis that the pricing-kernel composed of the equity return is able to discount the equity option returns properly, we should take the maturities of the options into consideration. The underlying equity does not have a maturity, whereas each equity option has its own maturity, and thus we should distinguish one option from another if the maturities of the options are different from each other. Investors price the equity option incorporating the forecast of the underlying equity dynamics and the risk premium up to the maturity of the option, and thus the dynamics of the option return that could be related to that of the underlying equity extends only up to the maturity of the option. When the parameters of the equity model are estimated from data that do not fall in the period from the issuing date of the option to its maturity, we cannot identify, upon rejection of the hypothesis test, whether the equity model used to derive the pricing-kernel is inappropriate or the data period is inappropriate. Therefore, each quantitative analysis should be attempted for each option in the period from the issuing date of the option to its maturity.

2.3.2 Data-handling technique
We mention the way to measure the option return. For the options data, we adopt three kinds of three-month call options: the ATM (the strike price is equal to the current equity price), the OTM500 (the strike price is 500 yen higher than the current equity price) and the ITM500 (the strike price is 500 yen lower than the current equity price). The strike prices of the listed options are fixed at 500-yen intervals, and thus the above options do not actually exist except in the case that the current equity price falls exactly on a multiple of 500 yen. Thus, we have to infer the prices of the above options from the market prices of the listed options. We adopt the approach of interpolating the implied volatility (hereafter, IV) using a spline function. We select six kinds of options close to the current equity price: three put options whose strike prices are 500 yen, 1000 yen and 1500 yen lower than the current equity price, and three call options whose strike prices are 500 yen, 1000 yen and 1500 yen higher than the current equity price. We compute the IVs of the six options by inverting the option market prices by way of the BS model, spline-interpolate the six IVs, and pick up the IVs corresponding to the strike prices of the ATM, OTM500 and ITM500 options. Then we compute the prices of the ATM, OTM500 and ITM500 options by putting the spline-estimated IVs into the BS model. Once we attain the daily prices of the ATM, OTM500 and ITM500 options, it is easy to compute the daily returns of the three options.

2.3.3 Statistical method (Generalized Method of Moments; GMM)
We statistically test the hypothesis that all four components in the expectation of the gross return vector R_{t,t+Δt} = [R^S_{t,t+Δt}, R^{ATM}_{t,t+Δt}, R^{OTM}_{t,t+Δt}, R^{ITM}_{t,t+Δt}]′ multiplied by the pricing-kernel m_{t,t+Δt} are all close to 1 in (3), using the GMM technique. As the moment conditions of the GMM, we adopt (10) and (11):

g(θ) = (1/N) Σ_{t=1}^{N} h_t,    (10)

h_t = [ 1 − m_{t,t+Δt} (S_{t+Δt}/S_t),
        1 − m_{t,t+Δt} (ATM_{t+Δt}/ATM_t),
        1 − m_{t,t+Δt} (OTM_{t+Δt}/OTM_t),
        1 − m_{t,t+Δt} (ITM_{t+Δt}/ITM_t),
        S_t (1 − m_{t,t+Δt} (S_{t+Δt}/S_t)),
        S_t (1 − m_{t,t+Δt} (ATM_{t+Δt}/ATM_t)) ].    (11)

Using the moment conditions, we construct J_N(θ) in (12) and minimize it to estimate the parameter set θ̂ of the DVM:

J_N(θ) = g(θ)′ W_N g(θ),    (12)

where N is the number of data, W_N is the variance-covariance matrix of the moment conditions, and θ is the parameter set {μ, parameters in the volatility function σ} of the pricing-kernel m_{t,t+Δt}. The maturities of the options in our analysis are three months, and they have 60 business dates from the issuing date to the maturity. We use the daily option returns up to 5 business dates before the maturity to stay away from the relatively large noise included in very-short-period option prices. Thus, for each quantitative analysis corresponding to each option contract month, the number of data N is 55. We statistically test the hypothesis by GMM using the fact that the test statistic J_N(θ̂) with the estimated parameter θ̂ follows the chi-square distribution χ²(n) with n degrees of freedom (refer to Newey and West (1987) [4] for more detail):

d_N = N[J_N(θ)] ∼ χ²(n),    (13)
H_0 : J_N(θ) = 0.    (14)

Rejection of the hypothesis test implies that the equity model used to derive the pricing-kernel is not consistent with the dynamics of the cross-sectional option market prices.

3. Quantitative analyses
3.1 Data and the equity model
The data in these analyses are the daily option prices, for remaining maturities from 60 to 5 business days, of the March, June, September and December contracts of the NIKKEI225 options (ATM, OTM500 and ITM500) in the period from the June 2003 contract to the December 2010 contract, together with the daily NIKKEI225 index data corresponding to the options data period. We set the risk-free interest rate equal to 0% due to the fact that most of the period of the analyses is under the BOJ's zero interest rate policy. We test the four equity models: the 2-parameter model, the 3-parameter model and the 5-parameter model of Mawaribuchi, Miyazaki and Okamoto (2009) [2], as listed in Table 1, in addition to the BS model.

Fig. 1. Dynamics of the NIKKEI225 index (time series, 03/06–10/06; vertical axis 0–20000).

3.2 Results and their implications
First of all, from Fig. 1, we review the dynamics of the NIKKEI225 index in the period of the analyses (from June 2003 to December 2010). There are two notable periods. One is the period from the beginning of 2005 to mid-2006, when the NIKKEI225 index surges due to the recovery of the economy (we call this period (i)), and the other is the period from the end of 2007 to the beginning of 2009, when the NIKKEI225 index dives due to the global recession originating from the U.S. subprime loan problem (we call this period (ii)). Except for these two periods, the NIKKEI225 index moves almost in a range.
The results of testing the hypothesis that the pricing-kernels induced by the DVMs are rational to the dynamics of the cross-sectional option market prices are provided in Table 2. In Table 2, ** and * indicate the 0.5% and 1% significance levels, respectively. From Table 2, we see that the BS pricing-kernel is rejected for 19 contract months (with 0.5% significance for 10 contract months and with 1% significance for 9 contract months) out of the total 31 contract months, and the 2-parameter DVM pricing-kernel is rejected for 15 contract months (with 0.5% significance for 3 contract months and with 1% significance for 12 contract months) out of the total 31 contract months. The result suggests that the extension of the BS model to the 2-parameter DVM improves the rationality of the pricing-kernel to the dynamics of the cross-sectional option market prices; however, the effect is quite limited. On the contrary, regarding the 3-parameter DVM and the 5-parameter DVM, except only for the December 2008 contract just after the collapse of Lehman Brothers, the rationality of the pricing-kernels derived from the two models to the dynamics of the cross-sectional option market prices is not rejected even with 1% significance level.

Table 2. GMM test-statistics and the results of hypothesis tests.
SQ        BS          2P          3P          5P
2003/9    17.340 *    15.338 *    5.897       4.900
2003/12   16.762 *    15.590 *    4.424       4.142
2004/3    16.859 *    12.336      6.165       5.530
2004/6    13.734      12.332      10.630      8.906
2004/9    15.108      12.789      6.899       4.575
2004/12   19.711 **   14.983 *    9.206       9.200
2005/3    19.422 **   14.933 *    7.197       6.488
2005/6    17.328 **   15.332 *    3.349       3.306
2005/9    16.152      12.337      5.516       4.998
2005/12   18.973 **   16.337 **   7.926       6.621
2006/3    18.405 **   15.007 *    9.836       8.415
2006/6    14.961      8.833       5.984       4.964
2006/9    13.452      10.453      3.083       2.919
2006/12   18.323 **   15.332 *    7.939       7.920
2007/3    14.635      10.222      3.661       3.662
2007/6    16.967 *    15.040 *    8.949       7.971
2007/9    12.231      10.181      7.237       7.231
2007/12   16.948 *    11.472      6.571       4.575
2008/3    19.081 **   16.916 **   9.947       8.984
2008/6    15.256      13.103      6.527       4.526
2008/9    16.933 *    11.237      9.936       5.365
2008/12   45.787 **   35.531 **   25.480 **   22.468 **
2009/3    17.575 *    15.402 *    8.502       7.355
2009/6    16.819 *    14.919 *    7.931       6.749
2009/9    15.001      9.004       4.238       3.481
2009/12   17.035 *    11.022      6.447       5.448
2010/3    19.871 **   14.978 *    10.274      10.189
2010/6    11.623      8.631       7.465       3.517
2010/9    18.557 **   14.943 *    9.305       8.822
2010/12   14.862      10.845      2.599       2.203
2011/3    17.017      11.434      6.938       5.019
**, * indicate the 0.5% and 1% significance levels, respectively.

Fig. 2. Dynamics of the pricing-kernel for the June 2004 contract (legend: BSM, 2PDVM, 3PDVM, 5PDVM; pricing kernel 0.997–1.003; dates 2004/03/22–2004/05/21).

Fig. 3. Dynamics of the pricing-kernel for the June 2005 contract (legend: BSM, 2PDVM, 3PDVM, 5PDVM; pricing kernel 0.997–1.003; dates 2005/03/18–2005/05/17).
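The pricing-kernel paths plotted in Figs. 2 and 3 follow from (9), with σ(S_t, t) given by one of the DVMs of Table 1. The sketch below uses hypothetical parameter values (a, b, c, d, e, μ are illustrative, not the GMM estimates), and the 2P-DVM is rescaled by S_0 purely for numerical convenience:

```python
# Sketch of the log pricing-kernel (9) under the DVM volatilities of
# Table 1. All parameter values are hypothetical illustrations.
import math

S0 = 10000.0   # reference equity level

def sigma_2p(S, a=0.2, b=-0.1):
    # 2P-DVM sigma = a * S^b; S is scaled by S0 here only to keep
    # magnitudes reasonable (the paper's form uses S_t directly).
    return a * (S / S0) ** b

def sigma_3p(S, a=0.1, b=0.1, c=2.0):
    # 3P-DVM: sigma = a + b * (1 - tanh(c * (S - S0) / S0))
    return a + b * (1.0 - math.tanh(c * (S - S0) / S0))

def sigma_5p(S, a=0.1, b=0.1, c=2.0, d=0.05, e=3.0):
    # 5P-DVM adds d * (1 - sech(e * (S - S0) / S0)); sech = 1/cosh
    return sigma_3p(S, a, b, c) + d * (1.0 - 1.0 / math.cosh(e * (S - S0) / S0))

def log_kernel(S_t, S_next, sigma, mu=0.05):
    # Eq. (9) with r = 0:
    #   ln m = mu*(mu - sigma^2)/(2*sigma^2) - (mu/sigma^2)*ln(S_{t+1}/S_t)
    s2 = sigma(S_t) ** 2
    return mu * (mu - s2) / (2.0 * s2) - (mu / s2) * math.log(S_next / S_t)

# The kernel is decreasing in the realized equity return: a rising index
# pushes ln m below its flat-market level.
flat = log_kernel(S0, S0, sigma_3p)
up = log_kernel(S0, S0 * 1.01, sigma_3p)
print(up < flat)  # True
```

Because σ depends on S_t in the 3P and 5P forms, the slope −μ/σ(S_t, t)² of ln m in the realized return changes with the index level, which is the flexibility the figures illustrate.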

More closely examining the relation between the dynamics of the NIKKEI225 index and the results of the hypothesis testing, we find that the pricing-kernels induced from the BS model and the 2-parameter DVM are not so often rejected in the periods when the NIKKEI225 index moves mostly in a range. They are, however, quite often rejected in the periods when the NIKKEI225 is trending and volatile, such as period (i) and period (ii). Contrary to this, the pricing-kernels induced from the 3-parameter DVM and the 5-parameter DVM are seldom rejected even in the periods when the NIKKEI225 index is trending upward or downward. To investigate the background reason for this result, we provide the dynamics of the pricing-kernel induced from each equity model for the period of the June 2004 contract (the range market) and the period of the June 2005 contract (the upward-trending market) in Figs. 2 and 3, respectively. Fig. 2 indicates that the dynamics of the pricing-kernels of the models are not so different from each other in the range market. On the other hand, in the upward-trending market, Fig. 3 suggests that the dynamics of the pricing-kernel differ considerably from model to model. Due to the strong restriction of the model, the dynamics of the pricing-kernels induced from the BS model and the 2-parameter DVM are not flexible enough to capture the dynamics of the cross-sectional option market prices.

4. Summary and concluding remarks
We improved the preceding approach, especially in the derivation of the pricing-kernel and the data-handling technique, and then empirically examined the consistency of the DVMs introduced in Mawaribuchi, Miyazaki and Okamoto (2009) [2] with the dynamics of the cross-sectional option returns. From the results, we found that, not to mention the BS model, even the 2-parameter DVM, whose volatility can depend on the equity price, is not consistent with the dynamics of the cross-sectional option market prices, due to the strong restriction of the functional form of the volatility. On the contrary, regarding the 3-parameter and 5-parameter DVMs, which incorporate tanh(x) in the functional form of the volatility, the consistency of the two models with the dynamics of the cross-sectional option market prices is not rejected in most of the testing periods. The implication attained from our quantitative analyses is that, even in a trending and volatile market, we can build equity models, such as the 3-parameter and 5-parameter DVMs in Mawaribuchi, Miyazaki and Okamoto (2009) [2], that are rational to the dynamics of the cross-sectional option market prices within the framework of a complete model, without incorporating an additional stochastic variable such as a jump or stochastic volatility.

Acknowledgments
This work was supported by JSPS KAKENHI (22510143). We sincerely thank the reviewer for constructive comments.

References
[1] B. Dupire, Pricing with a smile, Risk, 7 (1994), 18–20.
[2] J. Mawaribuchi, K. Miyazaki and M. Okamoto, 5-parameter local volatility models: fitting to option market prices and forecasting ability (in Japanese), IPSJ Trans. Math. Model. Appl., 49 (2009), 58–69.
[3] A. Buraschi and J. Jackwerth, The price of a smile: hedging and spanning in option markets, Rev. Financ. Stud., 14 (2001), 495–527.
[4] W. K. Newey and K. D. West, A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix, Econometrica, 55 (1987), 703–708.

– 60 –

JSIAM Letters Vol. 3 (2011) pp. 61–64 © 2011 Japan Society for Industrial and Applied Mathematics

Stochastic estimation method of eigenvalue density for nonlinear eigenvalue problem on the complex plane

Yasuyuki Maeda1, Yasunori Futamura1 and Tetsuya Sakurai1,2

1 Department of Computer Science, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8573, Japan
2 CREST, Japan Science and Technology Agency, 4-1-8 Hon-machi, Kawaguchi, Saitama 332-0012, Japan
E-mail: maeda mma.cs.tsukuba.ac.jp
Received June 5, 2011, Accepted August 31, 2011

Abstract
The performance of some nonlinear eigenvalue problem solvers can be increased by setting parameters that are based on rough estimates of the desired eigenvalues. In the present paper, we propose a stochastic method for estimating the eigenvalue density for nonlinear eigenvalue problems of analytic matrix functions. The proposed method uses unbiased estimation of the matrix traces and contour integrations. Its performance is evaluated through numerical experiments.

Keywords: nonlinear eigenvalue problem, analytic matrix function, trace estimation
Research Activity Group: Algorithms for Matrix / Eigenvalue Problems and their Applications

1. Introduction
We herein consider a nonlinear eigenvalue problem (NEP) F(λ)x = 0 of finding eigenpairs (λ, x), where the matrix F(λ) is an n × n analytic matrix function. Nonlinear eigenvalue problems appear in a variety of problems in science and engineering, such as delay differential equations [1], quantum dots [2], and accelerator design [3]. These problems require specific eigenpairs. The Sakurai-Sugiura (SS) method [4] is a solver for NEPs that can find eigenpairs locally. The SS method requires parameters such as closed curves on the complex plane, and its performance can be improved by setting appropriate parameters based on rough estimates of the desired eigenvalues. If we obtain a rough estimate of the eigenvalue density in advance, we can set the parameters more efficiently. A method of estimating the eigenvalue density using contour integrals for generalized eigenvalue problems has been proposed in [5]. In the present paper, we extend this method to NEPs.
The remainder of the present paper is organized as follows. In Section 2, we describe a derivation of the number of eigenvalues for NEPs by introducing the Smith form [6] and a contour integral. Then we propose an application of the unbiased estimation of the matrix trace in order to avoid the matrix inversion. In Section 3, we describe an estimation method of the eigenvalue density and show its simple implementation. In Section 4, we investigate the performance of the proposed method through numerical experiments using four matrices. Finally, the conclusions are presented in Section 5.

2. Stochastic method for estimating the number of eigenvalues for NEPs
In this section, we propose a stochastic method for estimating the number of eigenvalues for NEPs on the complex plane. Let F(z) be an analytic matrix function defined in a simply connected region Ω in C. The determinant of F(z) is not identically zero in the domain Ω; in other words, F(z) is regular for z ∈ Ω. We introduce the Smith form for analytic matrix functions [6].

Theorem 1 Let F(z) be an n × n regular analytic matrix function. Then F(z) admits the representation

    P(z) F(z) Q(z) = D(z),    (1)

where D(z) = diag(d_1(z), ..., d_n(z)) is a diagonal matrix of analytic functions d_j(z) for j = 1, 2, ..., n, such that d_1(z) ≠ 0 and d_j(z)/d_{j-1}(z) are analytic functions for j = 2, 3, ..., n. In addition, P(z) and Q(z) are n × n regular analytic matrix functions with constant nonzero determinants.

The eigenpairs of the NEP are formally derived from the Smith form. Let λ_1, λ_2, ..., λ_s be the zeros of d_n(z) in Ω. Since d_1(z) ≠ 0 and d_j(z)/d_{j-1}(z) are analytic functions for j = 2, 3, ..., n, d_j(z) can be represented in terms of the λ_i as

    d_j(z) = h_j(z) ∏_{i=1}^{s} (z − λ_i)^{α_{ji}},  j = 1, 2, ..., n,    (2)

where the h_j(z) are analytic functions with h_j(z) ≠ 0 for z ∈ Ω, the eigenvalues of the NEP are equal to the λ_i, α_{ji} ∈ Z₊, and ∑_{j=1}^{n} α_{ji} exhibits the multiplicity of λ_i. We propose the following theorem.

Theorem 2 Let F(z) be an n × n regular analytic matrix function, and let tr(F(z)) be the matrix trace of


F(z). In addition, let m be the number of eigenvalues, counting multiplicity, inside a closed curve Γ ⊂ Ω on the complex plane for the NEP F(λ)x = 0. Then we have

    (1/(2πi)) ∮_Γ tr( F(z)^{−1} dF(z)/dz ) dz = m,    (3)

where det(F(z)) ≠ 0.

Proof  From Theorem 1, we can derive the following equation:

    tr( F(z)^{−1} dF(z)/dz ) = tr( Q(z)^{−1} dQ(z)/dz ) + tr( D(z)^{−1} dD(z)/dz ) + tr( P(z)^{−1} dP(z)/dz ).    (4)

Since P(z) and Q(z) are regular analytic matrix functions with constant nonzero determinants and D(z) = diag(d_1(z), ..., d_n(z)), from [7, Section 8.3] we obtain tr( Q(z)^{−1} dQ(z)/dz ) = 0, tr( P(z)^{−1} dP(z)/dz ) = 0, and

    tr( D(z)^{−1} dD(z)/dz ) = ∑_{j=1}^{n} (dd_j(z)/dz) (1/d_j(z)) = ∑_{i=1}^{s} ( ∑_{j=1}^{n} α_{ji} ) / (z − λ_i) + ∑_{j=1}^{n} (dh_j(z)/dz) (1/h_j(z)).    (5)

From the residue theorem we have

    (1/(2πi)) ∮_Γ [ ∑_{i=1}^{s} ( ∑_{j=1}^{n} α_{ji} ) / (z − λ_i) + ∑_{j=1}^{n} (dh_j(z)/dz) (1/h_j(z)) ] dz = ∑_{i=1}^{t} ∑_{j=1}^{n} α_{ji} = m,    (6)

where t is the number of mutually distinct eigenvalues inside the closed curve Γ. (QED)

Eq. (3) is approximated by an N-point quadrature rule

    m ≈ m̂ = ∑_{j=0}^{N−1} w_j tr( F(z_j)^{−1} F′(z_j) ),    (7)

where F′(z_j) = dF(z)/dz |_{z=z_j}, z_j is a quadrature point and w_j is a weight. In the case of the trapezoidal rule on a circle with center γ and radius ρ, the quadrature points and weights are defined by

    w_j = (ρ/N) e^{(2πi/N)(j+1/2)},  z_j = γ + ρ e^{(2πi/N)(j+1/2)}.

In order to avoid the matrix inversion in (7), we estimate the trace with an unbiased estimation described in [8, 9], that is,

    tr( F(z_j)^{−1} F′(z_j) ) ≈ (1/L) ∑_{l=1}^{L} v_l^T F(z_j)^{−1} F′(z_j) v_l,    (8)

where the v_l are sample vectors, the entries of which take 1 or −1 with equal probability, and L is the number of sample vectors. Using (7) and (8), the number of eigenvalues m is estimated as follows:

    m ≈ m̃ = (1/L) ∑_{j=0}^{N−1} ∑_{l=1}^{L} w_j ( v_l^T F(z_j)^{−1} F′(z_j) v_l ).    (9)

Fig. 1. Closed curves on the complex plane. [Figure: circles Γ_1, Γ_2, ..., Γ_K with centers γ_1, ..., γ_K and radii ρ_1, ..., ρ_K along the segment from a to b.]

 1: Input: F(z), N, L, K, a, b
 2: Output: m̃_1, m̃_2, ..., m̃_K
 3: Set v_l of which the elements take 1 or −1 with equal probability, l = 1, 2, ..., L
 4: ρ = (b − a)/(2K)
 5: for k = 1, 2, ..., K do
 6:   γ_k = a + (2k − 1)ρ
 7:   z_jk = γ_k + ρ e^{(2πi/N)(j+1/2)}, j = 0, 1, ..., N − 1
 8:   Solve F(z_jk) x_jk^l = F′(z_jk) v_l for x_jk^l, l = 1, 2, ..., L, j = 0, 1, ..., N − 1
 9:   m̃_k = [ρ/(NL)] ∑_{j=0}^{N−1} ∑_{l=1}^{L} e^{(2πi/N)(j+1/2)} v_l^T x_jk^l
10: end for

Fig. 2. Algorithm for estimating the eigenvalue density.

3. Implementation
In this section, we describe an estimation method of the eigenvalue density using (9). We set points a and b on the complex plane, and we divide the interval [a, b] into K domains. Let Γ_k (k = 1, 2, ..., K) be closed curves which enclose each domain. We estimate the number of eigenvalues in each closed curve by (9). From the result of the estimation, we find regions where the number of eigenvalues is large and regions where the number of eigenvalues is small. A schematic illustration of the setting of closed curves on the complex plane is shown in Fig. 1. This figure indicates a case in which the shapes of the Γ_k are circles. The center and the radius of each circle are set to γ_k = a + (2k − 1)(b − a)/(2K) and ρ_k = (b − a)/(2K),
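As a rough illustration of how (7)-(9) combine, the sketch below (our own toy problem and parameter choices, not one of the paper's test matrices) counts the eigenvalues of a nearly diagonal linear function F(z) = zI − A inside a circle, using the trapezoidal rule for the contour integral and Rademacher sample vectors for the stochastic trace estimate.

```python
# Sketch of the estimator (7)-(9): N-point trapezoidal rule on a circle combined
# with Hutchinson's stochastic trace estimate.  The test matrix is a toy choice
# (nearly diagonal, eigenvalues close to 0, 1, ..., 9), not an NLEVP problem.
import numpy as np

rng = np.random.default_rng(0)
n = 10
A = np.diag(np.arange(n, dtype=float)) + 0.01 * rng.standard_normal((n, n))
F = lambda z: z * np.eye(n) - A      # linear case, so the exact count is known
dF = lambda z: np.eye(n)             # F'(z) = I

def estimate_count(gamma, rho, N=32, L=30):
    """Estimate the number of eigenvalues of F inside |z - gamma| < rho, eq. (9)."""
    V = rng.choice([-1.0, 1.0], size=(n, L))      # Rademacher sample vectors v_l
    m = 0.0 + 0.0j
    for j in range(N):
        e = np.exp(2j * np.pi * (j + 0.5) / N)    # e^{(2 pi i / N)(j + 1/2)}
        zj = gamma + rho * e                      # quadrature point z_j
        X = np.linalg.solve(F(zj), dF(zj) @ V)    # solve F(z_j) x_l = F'(z_j) v_l
        m += (rho / N) * e * np.einsum('il,il->', V, X) / L   # w_j * mean(v_l^T x_l)
    return m.real

m_est = estimate_count(gamma=0.0, rho=3.5)  # circle around eigenvalues near 0, 1, 2, 3
print(round(m_est))
```

For this toy spectrum the circle with γ = 0, ρ = 3.5 encloses four eigenvalues, and the estimate rounds to that count; the algorithm of Fig. 2 simply repeats this computation over K adjacent circles.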


Table 1. Matrix properties.

  Problem                              F(λ)                                        size   γ            ρ
  Butterfly                            A0 + λA1 + λ²A2 + λ³A3 + λ⁴A4               64     1 + 0.7i     1
  Quantum dot (QD)                     A0 + λA1 + λ²A2 + λ³A3 + λ⁴A4 + λ⁵A5        2475   1            0.06
  Delay-differential equation (DDE)    λI − A0 − A1 e^{−τλ}                        3600   −4.3 + 6.3i  0.2
  Accelerator design (SLAC)            A0 − λA1 + i√(λ − σ²) A2                    5384   360000       25000

Note: the coefficient matrices A0, A1, A2, A3, A4, and A5 of each NEP are different.
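For reference, the matrix families of Table 1 are straightforward to express as the (F, F′) pairs the estimator consumes. The sketch below uses small random stand-in coefficients and illustrative values of τ and σ (the actual matrices come from the sources cited for each problem [1-3, 10]), so it shows only the interface, not the benchmark problems themselves.

```python
# Sketch: the families of Table 1 as (F, F') callables with stand-in coefficients.
import numpy as np

rng = np.random.default_rng(1)
n = 8
A = [rng.standard_normal((n, n)) for _ in range(5)]
I = np.eye(n)
tau, sigma = 1.0, 0.5  # illustrative DDE delay and SLAC parameter (our choices)

# Polynomial type (Butterfly/QD): F(z) = sum_i z^i A_i
F_poly = lambda z: sum(z**i * A[i] for i in range(5))
dF_poly = lambda z: sum(i * z**(i - 1) * A[i] for i in range(1, 5))

# Delay-differential type: F(z) = z I - A0 - A1 e^{-tau z}
F_dde = lambda z: z * I - A[0] - A[1] * np.exp(-tau * z)
dF_dde = lambda z: I + tau * A[1] * np.exp(-tau * z)

# SLAC type: F(z) = A0 - z A1 + i sqrt(z - sigma^2) A2
F_slac = lambda z: A[0] - z * A[1] + 1j * np.sqrt(z - sigma**2 + 0j) * A[2]
dF_slac = lambda z: -A[1] + 0.5j / np.sqrt(z - sigma**2 + 0j) * A[2]
```

Only F and F′ enter the algorithm of Fig. 2, so any analytic family can be plugged in this way.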

respectively, so that each circle encloses an equally divided sub-segment of [a, b]. The algorithm shown in Fig. 2 estimates the eigenvalue density with γ_k and ρ_k (k = 1, 2, ..., K).

Table 2. Results for Example 2 (estimated number of eigenvalues m̂).

  N    Butterfly   QD        DDE       SLAC
  4    55.3413     35.5420   22.7747   10.4819
  6    87.7399     33.1033   23.2993   10.3486
  8    69.0999     32.2280   23.5541   10.9679
  16   66.8368     31.1224   23.6857    9.8864
  32   74.2438     30.8099   24.9703    9.9782
  64   69.2881     31.0783   23.7490   10.0001
  m    70.0000     31.0000   24.0000   10.0000

Table 3. Results for Example 3 (estimated number of eigenvalues m̃).

  L      Butterfly   QD        DDE       SLAC
  10     69.5271     38.8910   36.6366   11.6292
  20     69.5066     35.8612   26.9446   10.1837
  30     68.3734     35.4354   19.0834   10.3820
  40     68.1347     34.9229   17.9046    9.4075
  50     67.7309     33.6268   19.1009    8.8863
  100    67.4809     33.1391   22.0621    9.6417
  500    69.1751     32.9185   22.7606   10.7124
  1000   69.0185     33.1510   23.6547   10.4950
  m̂     69.0999     32.2280   23.5541   10.9679

Fig. 3. Trace and standard deviation. [Figure: average and standard deviation of the trace estimate tr(F(z)⁻¹F′(z)) versus the number of sample vectors L.]

4. Numerical examples
In this section, we confirm the validity of the proposed method by applying it to a number of NEPs. The algorithm is implemented in MATLAB 7.4. The MATLAB command mldivide is used to solve F(z_jk) x_jk^l = F′(z_jk) v_l for x_jk^l numerically, and the elements of the sample vectors are given by the MATLAB function rand. We use one random sequence except for Example 1.

4.1 Example 1
In this example, we see the behavior of the average and the standard deviation of (8) as the number of sample vectors L increases. The test problem is QD in Table 1. The average and the standard deviation are evaluated by using 30 different random sequences. z is set to 0.1. The results of this example are shown in Fig. 3. The horizontal axis indicates L. The vertical axes on the left and right indicate the trace and the standard deviation, respectively. The result indicates that the average gets closer to the exact value of the trace as L increases. The standard deviation decreases rapidly until around L = 30 and decreases slowly from then on.

4.2 Example 2
In this example, we investigate how the numerical integral in (7) affects the approximation for the number of eigenvalues m̂ as the number of quadrature points N increases. The test problems are given in [1-3, 10]. Their properties are shown in Table 1. Here, N is set to 4, 6, 8, 16, 32 and 64. The shape of Γ is a circle. The results listed in Table 2 indicate that the orders of magnitude of m̂ and m agree in all cases.

4.3 Example 3
In this example, we investigate how the unbiased estimation of the matrix trace in (9) affects the estimation of the number of eigenvalues m̃ as the number of sample vectors L increases. The test problems are the same as those in Example 2. Here, N is set to 8, and L is set from 10 to 1000. The results are shown in Table 3. The bottom row m̂ of Table 3 is the approximate number of eigenvalues in the case N = 8 shown in Table 2. The results indicate that the orders of magnitude of m̃ and m̂ agree in all cases. Thus, in the cases of these numerical experiments, the estimation of the number of eigenvalues m by (9) can be used for applications that only require the exponent of m. An example of such an application is the parameter setting for the eigensolver described in the introduction of the present paper.

4.4 Example 4
In this example, we give a demonstration of the algorithm shown in Fig. 2. The test problems are Butterfly, QD and SLAC in Table 1. Here, K = 30, N = 8, and L = 30. The interval [a, b] of Butterfly, QD and SLAC is set to [−3.2 + 0.5i, 3.2 + 0.5i], [0, 2] and [0.02 × 10⁶, 1.02 × 10⁶], respectively. The results of this example are shown in Figs. 4, 5 and 6. The horizontal axis indicates the real part of γ_k and the vertical axis indicates the number of eigenvalues in each circle. The results indicate that the order of magnitude of the exact number of eigenvalues and the estimated number of eigenvalues agree in all cases. Through these numerical experiments, it is experimentally confirmed that the proposed method lets us


know whether eigenvalues exist or not in the specified region.

Fig. 4. Number of eigenvalues (Butterfly). [Figure: exact and estimated counts versus the real part of γ_k.]

Fig. 5. Number of eigenvalues (QD).

Fig. 6. Number of eigenvalues (SLAC).

5. Conclusions
In the present paper, we propose a stochastic method for estimating the eigenvalue density for nonlinear eigenvalue problems of analytic matrix functions. The proposed method uses unbiased estimation of the matrix traces and contour integrations, and is considered to be an extension of the parallel stochastic estimation method for eigenvalue density proposed in [5]. By the numerical experiments, we learn that the proposed method can obtain information about whether eigenvalues exist or not in the specified region. This information can be used to perform efficient parameter settings for eigensolvers.

Acknowledgments
This research was supported in part by JST, CREST and Grant-in-Aids for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology, Japan, Grant Nos. 21246018 and 23105702.

References
[1] E. Jarlebring, K. Meerbergen and W. Michiels, An Arnoldi method with structured starting vectors for the delay eigenvalue problem, in: IFAC Workshop on Time Delay Systems, 2010.
[2] F. N. Hwang, Z. H. Wei, T. M. Huang and W. Wang, A parallel additive Schwarz preconditioned Jacobi-Davidson algorithm for polynomial eigenvalue problems in quantum dot simulation, J. Comput. Phys., 229 (2010), 2932-2947.
[3] B. S. Liao, Subspace projection methods for model order reduction and nonlinear eigenvalue computation, PhD thesis, Department of Mathematics, UC Davis, 2007.
[4] J. Asakura, T. Sakurai, H. Tadano, T. Ikegami and K. Kimura, A numerical method for nonlinear eigenvalue problems using contour integral, JSIAM Letters, 1 (2009), 52-55.
[5] Y. Futamura, H. Tadano and T. Sakurai, Parallel stochastic estimation method for eigenvalue distribution, JSIAM Letters, 2 (2010), 127-130.
[6] I. Gohberg and L. Rodman, Analytic matrix functions with prescribed local data, J. d'Analyse Mathématique, 40 (1981), 90-128.
[7] J. R. Magnus and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, Wiley, 1999.
[8] Z. Bai, M. Fahey and G. Golub, Some large scale matrix computation problems, J. Comput. Appl. Math., 74 (1996), 71-89.
[9] M. F. Hutchinson, A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines, Commun. Stat. Simulation Comput., 19 (1990), 433-450.
[10] NLEVP: A Collection of Nonlinear Eigenvalue Problems, http://www.mims.manchester.ac.uk/research/numerical-analysis/nlevp.html.

JSIAM Letters Vol.3 (2011) pp.65-68 © 2011 Japan Society for Industrial and Applied Mathematics

Computation of multipole moments from incomplete boundary data for Magnetoencephalography inverse problem

Hiroyuki Aoshika1, Takaaki Nara1, Kaoru Amano2,3 and Tsunehiro Takeda3

1 The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo 182-8585, Japan
2 Precursory Research for Embryonic Science and Technology, Japan Science and Technology Agency, Saitama 332-0012, Japan
3 The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 227-8561, Japan
E-mail: aoshika inv.mce.uec.ac.jp
Received June 6, 2011, Accepted August 8, 2011

Abstract
In this paper, we present a method for reconstructing the dipole sources inside the human brain from radial Magnetoencephalography data measured on part of the boundary enclosing the source. Combining the proposed method with the direct method provides a good initial solution for an optimization-based algorithm. The method is verified with numerical simulations, phantom experiments, and a somatosensory evoked field data analysis.

Keywords: inverse problem, Magnetoencephalography, multipole moment
Research Activity Group: Algorithms for Matrix / Eigenvalue Problems and their Applications

1. Introduction
Magnetoencephalography (MEG) is a non-invasive brain monitoring tool that records the magnetic field outside the head generated by the neural current in the brain. Here one must solve an inverse problem to reconstruct the current source from the measured magnetic field. The conventional methods for the inverse problem assume that the current source can be represented by a relatively small number of equivalent current dipoles (ECDs). Although the usual algorithm for this source model is the non-linear least-squares method that minimizes the squared error of the data and the forward solution, it has a problem that an initial parameter estimate close to the true one is required, without which the algorithm often converges to a local minimum. To address this issue, several researchers have proposed a direct method [1-4] which reconstructs the source parameters directly and algebraically from the data. From the efficiency of the algorithm, it is expected to be used for real-time monitoring of the brain activity. Also from a practical point of view, it can provide a good initial solution for the iterative algorithm.
However, the problem of the direct method is that it requires the data on the boundary which encloses the source. The practical MEG system has no sensors in front of the face and beneath the neck. The lack of data at those parts is the cause of error in computing the weighted integral of the boundary data.
The aim of this paper is to develop a method to reconstruct the source parameters from the data on part of the boundary. First, using the multipole expansion of the radial component of the magnetic field, the multipole moments are estimated from the data only on the upper hemisphere. We regularize the linear equation proposed by Taulu et al. [5] by using the singular value decomposition. Then, by combining this method with our direct method proposed in [4], the source parameters are estimated from incomplete boundary measurements, which can be used as a good initial solution for an optimization-based algorithm.
The rest of this paper is organized as follows. In Section 2, our direct method is summarized. The method using data only on the upper hemisphere is proposed in Section 3, which is verified by numerical simulations, phantom experiments, and a real data analysis in Sections 4, 5, and 6, respectively.

2. Direct method
Assume that the head can be modeled by three concentric spheres Ω = Ω1 ∪ Ω2 ∪ Ω3 representing the brain, skull, and scalp, respectively. The sensors measuring the radial component of the magnetic field are placed on the upper hemisphere Γ with radius R centered at the origin. Generally, the solution to the inverse problem in which the neural current in the brain is reconstructed from the measured MEG data is not unique. To guarantee the uniqueness, we assume that the neural current is expressed by K equivalent current dipoles (ECDs), J_p = ∑_{k=1}^{K} p_k δ(r − r_k), where r_k ∈ Ω1. The radial component of the magnetic field at r ∈ Γ is then given by the Biot-Savart law

    B_r(r) = (μ0/(4π)) ∑_{k=1}^{K} (n × p_k) · ∇_{r′} (1/|r − r′|) |_{r′ = r_k},

where μ0 is the permeability assumed to be constant in the whole space and n = r/|r| is the outward unit

normal to Γ. Our inverse problem is to reconstruct the number K, the positions r_k, and the moments p_k of the ECDs from measurements of B_r on Γ.
In contrast to the conventional approach with the non-linear least-squares method, we proposed a direct method which reconstructs the source parameters directly and algebraically from MEG data [4]. The method is based on the multipole expansion of the radial MEG given by

    B_r = μ0 ∑_{l=0}^{∞} ∑_{m=−l}^{l} [(l + 1)/(2l + 1)] M_lm Ŷ*_lm(θ, ϕ) / r^{l+2},    (1)

where the Ŷ_lm(θ, ϕ) are the normalized spherical harmonic functions. It is shown that the multipole moments M_lm with l = m are expressed in terms of the source parameters. On the other hand, they are expressed by the boundary data. As a result, we have the algebraic equations relating the source parameters to the data:

    ∑_{k=1}^{K} q_k S_k^m = α_m,    (2)

where S_k ≡ x_k + iy_k is the kth source position projected on the xy-plane, q_k ≡ r_k × p_k is the magnetic moment of the kth ECD, q_k ≡ [q_k]_x + i[q_k]_y, where [∗]_x represents the x-component of the vector ∗, and

    α_m = [(2m + 3)/((m + 1)μ0)] ∫_S B_r (x + iy)^{m+1} dS,    (3)

where S is a sphere which is centered at the origin and encloses Ω. It is also shown that (2) is reduced to a generalized eigenvalue problem [6], so that the source parameters can be reconstructed algebraically from the boundary data. Although this method is theoretically simple, a problem is that we need B_r on the whole sphere S which encloses Ω. In the practical situation the sensors cannot be placed in front of the face and in the middle of the neck. Hence, lack of data on part of S becomes a factor of errors in computing α_m.

3. Computation of M_lm from data on the upper hemisphere
Truncate (1) up to order L:

    B_r ≃ ∑_{l=0}^{L} ∑_{m=−l}^{l} X_lm Ŷ*_lm(θ, ϕ),    (4)

where X_lm = μ0 · [(l + 1)/(2l + 1)] · (M_lm / r^{l+2}). Then the linear equations relating the multipole moments to the radial MEG on Γ are obtained [5]: d = Gx, where d = (B_r1, B_r2, ..., B_rN)^T ∈ R^N is the data measured on Γ, x = (X_{1,−1}, X_{1,0}, ..., X_{L,L})^T ∈ C^{(L+1)²−1} consists of the unknown multipole moments, and

    G = [ Ŷ*_{1,−1}(θ, ϕ)_1  Ŷ*_{1,0}(θ, ϕ)_1  ···  Ŷ*_{L,L}(θ, ϕ)_1 ]
        [ Ŷ*_{1,−1}(θ, ϕ)_2  Ŷ*_{1,0}(θ, ϕ)_2  ···  Ŷ*_{L,L}(θ, ϕ)_2 ]
        [        ⋮                   ⋮                     ⋮         ]
        [ Ŷ*_{1,−1}(θ, ϕ)_N  Ŷ*_{1,0}(θ, ϕ)_N  ···  Ŷ*_{L,L}(θ, ϕ)_N ],

where (θ, ϕ)_i represents the spherical coordinates of the ith sensor. We choose L such that L is maximum under the condition that the linear system becomes overdetermined, that is, N > (L + 1)² − 1.
In order to obtain x while suppressing the effect of noise contained in d, we use the truncated singular value decomposition of G, denoted by G_T^+, where T is a truncation order. To determine T, we use the following method: first we fix T and estimate the multipole moments from the data on Γ by x = G_T^+ d. From the components with l = m in x, using the direct method in Section 2, we can identify the source positions projected on the xy-plane. Using them as the initial solution, the z-coordinates and the moments of the source are determined by the non-linear least-squares method. Then, we compute the Goodness of Fit (GoF) defined by

    GoF [%] = 100 ( 1 − ∑_{i=1}^{N} (B_data[i] − B_th[i])² / ∑_{i=1}^{N} B_data[i]² ),    (5)

where B_data[i] and B_th[i] are the data and the forward solution at the ith sensor position, respectively. We repeat these computations by changing T and choose T such that GoF becomes maximum.
Practically, the sensors called 'gradiometers' are often used, which measure the difference of B_r at a point r on Γ and at r + bn, where b is called the baseline distance, in order to cancel the noise which originates from sources far apart. In this case, we only need to change X_lm from X_lm = μ0 · [(l + 1)/(2l + 1)] · (M_lm / r^{l+2}) to X_lm = μ0 · [(l + 1)/(2l + 1)] · [1/r^{l+2} − 1/(r + b)^{l+2}] · M_lm. Thus even in this case, from x = G_T^+ d, we can obtain the M_lm with l = m that are used in the direct method.

4. Numerical simulations
First we verified our method numerically. A single dipole (K = 1) was set at r_1 = (40, 40, 40) mm with moment p_1 = (0, 0, 10) nAm. N = 183 gradiometers with b = 50 mm were uniformly distributed on the upper hemisphere with R = 120 mm using the spherical t-design [7]. For this number of sensors N, the truncation order was L = 12. Gaussian noise was added whose standard deviation was 10% of the root-mean-square of the theoretical data.
Fig. 1 shows the relative localization error (the error divided by R) and GoF when changing the truncation order T. We observed that choosing T = 118 gave the maximum GoF (98.1%) and the minimum relative localization error (0.18%). Hence from GoF we can determine the optimal truncation order T. The relative magnetic moment error was 1.8%. Fig. 2 shows the estimated |q_2/q_1| when assuming that there were K′ = 2 dipoles. We observed that |q_2| became much smaller than |q_1| for most T. In fact, when T = 118, which was the truncation order used for reconstruction, |q_2|/|q_1| = 0.043, showing that |q_2| can be neglected compared to |q_1| and hence K = 1.
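The forward model behind these simulations follows directly from the Biot-Savart expression of Section 2. The following is a rough sketch (the sensor position, helper names, and unit conventions are our own choices echoing Section 4, not code from the paper):

```python
# Sketch: radial field B_r of an ECD and a radial gradiometer reading, following
# B_r(r) = (mu0 / 4 pi) (n x p_k) . (r - r_k) / |r - r_k|^3,  n = r/|r|.
import numpy as np

MU0 = 4e-7 * np.pi  # permeability of free space [H/m]

def radial_field(r, r_k, p_k):
    """Radial magnetic field at sensor position r from a dipole (r_k, p_k)."""
    n = r / np.linalg.norm(r)
    d = r - r_k
    return MU0 / (4 * np.pi) * np.cross(n, p_k) @ d / np.linalg.norm(d) ** 3

def gradiometer(r, r_k, p_k, b=0.05):
    """Difference of B_r at r and at r + b n (baseline b = 50 mm), cf. Section 3."""
    n = r / np.linalg.norm(r)
    return radial_field(r, r_k, p_k) - radial_field(r + b * n, r_k, p_k)

# single ECD at (40, 40, 40) mm with moment (0, 0, 10) nAm, sensor at R = 120 mm
r_k = np.array([0.04, 0.04, 0.04])
p_k = np.array([0.0, 0.0, 10e-9])
r = np.array([0.12, 0.0, 0.0])
print(gradiometer(r, r_k, p_k))  # field difference in tesla
```

Note that (n × p_k) · (r − r_k) equals (r_k × p_k) · n, since n is parallel to r; this is why the magnetic moment q_k = r_k × p_k appears in the algebraic relations (2).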

Fig. 1. Relative localization error and GoF with respect to the truncation order T. When T = 118, the error and GoF become minimum and maximum, respectively.

Fig. 2. Ratio |q_2|/|q_1| when assuming K′ = 2.

Fig. 3. Radial gradiometer, 168 channels.

Fig. 4. Reconstruction result when z_1 = 16 mm. Black dots: true ECD positions, red dots: estimated ECD positions.

Fig. 5. Reconstruction result when z_1 = 42 mm.

5. Phantom experiments
Next we examined our method using a phantom head. We used 168 gradiometers as shown in Fig. 3, where the radius R = 129 mm. L was set to be 11. A single current source (K = 1) was moved on the planes z_1 = 16 mm and z_1 = 42 mm, where √(x_1² + y_1²) = 62.5 mm and ϕ_1 = tan⁻¹(y_1/x_1) = 45 × i (i = 0, 1, ..., 7) degrees. The reconstruction results are shown in Figs. 4 and 5. The mean estimation error was 3.5 mm and 2.4 mm, respectively. Fig. 6 shows an example of the relationship between the relative localization error and GoF when z_1 = 42 mm and ϕ_1 = 180°. It is observed that the truncation order T = 78 that maximizes GoF coincides with the order that minimizes the localization error. This coincidence was observed in all the source positions.

6. Real data analysis
We analyzed somatosensory evoked field (SEF) data where a right-hand index finger was electrically stimulated. 205 gradiometers were used. L was set to be 13. In the time series shown in Fig. 7, we used the peak at 101 msec for reconstruction (the peak at 0 msec is an artifact originating from the electrical stimulation). Fig. 8 shows the contour map of radial MEG. Fig. 9 shows the localization result. The maximum GoF was 74% when T = 51, with which reconstruction was conducted. In this case, two dipoles are estimated in the right and left somatosensory cortices. Figs. 10 and 11 show |q_2/q_1| and |q_3/q_2|, respectively, when assuming that there were K′ = 3 dipoles. One finds that |q_3/q_2| often becomes small when changing T, while |q_2| is comparable to |q_1| for a wide range of T. In fact, when T = 51, which was used for reconstruction, |q_2/q_1| = 0.83 and |q_3/q_2| = 0.0002, from which we can reasonably judge that K = 2.

7. Conclusion
In this paper, we developed a method for computing the multipole coefficients of the radial magnetic field created by the dipole source from radial MEG data on the


upper hemisphere, which were used in the direct inversion method for reconstructing the dipole parameters. The method was verified with numerical simulations, phantom experiments, and a somatosensory evoked field (SEF) data analysis. Although it was suggested that K could be estimated from the ratio of the source strengths assuming a larger number of dipoles than the true one, a rigorous analysis for the threshold is required. Generalization of our method to the case when the data are given not on the upper hemisphere but on an arbitrary open surface which does not enclose the source is straightforward; its verification with simulations as well as phantom/real data analyses is also required.

Fig. 6. Relative localization error (top) and GoF (bottom) when ϕ_1 = 180° and z_1 = 42 mm. The truncation order T = 78 maximizes GoF and minimizes the relative localization error.

Fig. 7. Time series data.

Fig. 8. Contour map of radial MEG. Red and blue colors show the outward and inward magnetic field.

Fig. 9. Localization result at 101 msec. Two dipoles are estimated in the left and right somatosensory cortices.

Fig. 10. |q_2/q_1| when assuming that there were K′ = 2 dipoles. For most T, |q_2| is comparable to |q_1|. When T = 51, which is the truncation order used in reconstruction, |q_2/q_1| = 0.83.

Fig. 11. |q_3/q_2| when assuming that there were K′ = 3 dipoles. For most T, |q_3| becomes much smaller than |q_2|. When T = 51, which is the truncation order used in reconstruction, |q_3/q_2| = 0.0002.

References
[1] A. El-Badia and T. Ha-Duong, An inverse source problem in potential analysis, Inverse Problems, 16 (2000), 651-663.
[2] T. Ohe and K. Ohnaka, A precise estimation method for locations in an inverse logarithmic potential problem for point mass models, Appl. Math. Modelling, 18 (1994), 446-452.
[3] K. Yamatani, T. Ohe and K. Ohnaka, An identification method of electric current dipoles in spherically symmetric conductor, J. Comp. Appl. Math., 143 (2002), 189-200.
[4] T. Nara, J. Oohama, M. Hashimoto, T. Takeda and S. Ando, Direct reconstruction algorithm of current dipoles for vector magnetoencephalography and electroencephalography, Phys. Med. Biol., 52 (2007), 3859-3879.
[5] S. Taulu, M. Kajola and J. Simola, Suppression of interference and artifacts by the signal space separation method, Brain Topography, 16 (2004), 269-275.
[6] P. Kravanja, T. Sakurai and M. V. Barel, On locating clusters of zeros of analytic functions, BIT, 39 (1999), 646-682.
[7] E. B. Saff and A. B. J. Kuijlaars, Distributing many points on a sphere, Mathematical Intelligencer, 19 (1997), 5-11.

JSIAM Letters Vol.3 (2011) pp.69-72 © 2011 Japan Society for Industrial and Applied Mathematics

An alternative implementation of the IDRstab method saving vector updates

Kensuke Aihara1, Kuniyoshi Abe2 and Emiko Ishiwata3

1 Graduate School of Science, Tokyo University of Science, 1-3 Kagurazaka, Shinjuku-ku, Tokyo 162-8601, Japan
2 Faculty of Economics and Information, Gifu Shotoku University, 1-38 Nakauzura, Gifu-shi, Gifu 500-8288, Japan
3 Department of Mathematical Information Science, Tokyo University of Science, 1-3 Kagurazaka, Shinjuku-ku, Tokyo 162-8601, Japan
E-mail: j1411701 ed.tus.ac.jp
Received August 6, 2011, Accepted October 4, 2011

Abstract
The IDRstab method is often more effective than both IDR(s) and BiCGstab(ℓ) for solving large nonsymmetric linear systems. However, the computational costs for vector updates are expensive in the original implementation of IDRstab. In this paper, we propose a variant of IDRstab that reduces this computational cost; vector updates are saved. Numerical experiments demonstrate the efficiency of our variant of IDRstab for sparse linear systems.

Keywords: linear systems, Induced Dimension Reduction, IDRstab method, vector update
Research Activity Group: Algorithms for Matrix / Eigenvalue Problems and their Applications

1. Introduction

The IDR(s) method [1], which is based on the Induced Dimension Reduction (IDR) principle, has been proposed by Sonneveld and van Gijzen for solving nonsymmetric linear systems Ax = b of order n for x, where the right-hand side vector b is an n-vector.

It has been shown that IDR(s) corresponds to the Bi-Conjugate Gradient STABilized (BiCGSTAB) method [2] which uses an s-dimensional initial shadow residual [3]. As with BiCGSTAB, IDR(s) also contains a residual minimization step using stabilizing polynomials of degree one. Therefore, as for BiCGSTAB, the residual minimization step causes numerical instabilities in the case of a strongly nonsymmetric matrix. To overcome this problem, the IDRstab method has been developed by Sleijpen and van Gijzen as an alternative to IDR(s) with stabilizing polynomials of degree ℓ [4]. Note that IDRstab with ℓ = 1 is mathematically equivalent to IDR(s), and that with s = 1 it is mathematically equivalent to the BiCGstab(ℓ) method [5]. The related method GBi-CGSTAB(s, L) [6], which incorporates stabilizing polynomials of order L into IDR(s), has been proposed by Tanio and Sugihara, but its implementation is quite different from that of IDRstab.

IDRstab with s and ℓ larger than 1 is often more effective than both IDR(s) and BiCGstab(ℓ). However, the original implementation of IDRstab presented in [4] requires many vector updates of the form ax + y with a scalar a and n-vectors x and y (AXPYs). The computation time depends significantly on the computational costs for AXPYs. It is known that a number of different implementations of IDRstab can be devised. Therefore, in this paper, we propose a variant of IDRstab which saves the computational costs for AXPYs. Numerical experiments demonstrate that our proposed variant of IDRstab is more efficient than the original one for sparse linear systems.

2. The IDRstab method

In this section, we describe the outline of the original IDRstab method.

The jth residual r_j of an IDR method based on the IDR principle, such as IDRstab, is generated in a subspace G_j. Here, the subspaces G_j, j = 0, 1, 2, ..., are related by G_0 ≡ C^n and G_{j+1} ≡ (I − ω_{j+1}A)(G_j ∩ R̃_0^⊥), where R̃_0^⊥ is the orthogonal complement of the range of a fixed n × s matrix R̃_0, and the ω_j's are nonzero scalars. In the generic case, the dimension of G_j decreases with increasing j by the IDR theorem; for the details, we refer to [1, 3, 4].

The residual r_k ∈ G_k of IDRstab is updated to the next residual r_{k+ℓ} ∈ G_{k+ℓ} without explicitly producing the residuals r_{k+i} ∈ G_{k+i} for i = 1, 2, ..., ℓ − 1, where the integer k is a multiple of ℓ. The process of this update from r_k to r_{k+ℓ} is referred to as one cycle of IDRstab. The cycle has two steps called the IDR step and the polynomial step.

2.1 The IDR step

Suppose that we have an approximation x_k and the corresponding residual r_k ∈ G_k, plus the n × s matrices U_k and AU_k with columns also in G_k. The IDR step is repeated ℓ times before the residual minimization step, i.e., the polynomial step. The ℓ repetitions of the IDR step are performed by using the projections Π_i^{(j)} for i = 0, 1, ..., j, j = 1, 2, ..., ℓ, which are defined by

Π_i^{(j)} ≡ I − A^i U_k^{(j−1)} σ_j^{−1} R̃_0^* A^{j−i},  σ_j ≡ R̃_0^* A^j U_k^{(j−1)}.
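As a toy illustration of these projections (a sketch with made-up data, not the authors' code), the j = 1, s = 1 case updates r to r − (AU)α with α = σ^{−1}R̃_0^*r, which leaves the new residual orthogonal to the shadow vector R̃_0; this orthogonality is the mechanism behind the nested spaces G_j:

```python
# Toy IDR projection (s = 1, n = 3): r_new = r - (A u) * alpha with
# alpha = (R0* r) / sigma and sigma = R0* (A u).  After the update,
# R0* r_new = 0, i.e. r_new lies in the orthogonal complement of R0.
# All numbers below are made up, chosen only so that sigma != 0.

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
r = [1.0, 2.0, 3.0]     # current residual r_k
u = [1.0, 0.0, 1.0]     # single column of U_k (s = 1)
R0 = [1.0, 1.0, 1.0]    # shadow vector R~_0 (s = 1)

Au = matvec(A, u)
sigma = dot(R0, Au)          # sigma = R~_0* A U_k (a 1x1 "matrix" here)
alpha = dot(R0, r) / sigma   # one small s x s solve
r_new = [ri - Aui * alpha for ri, Aui in zip(r, Au)]  # s AXPYs (here s = 1)

print(abs(dot(R0, r_new)))   # ~0: r_new is orthogonal to R~_0
```

For s > 1 the same step costs one s × s solve plus s AXPYs per projected vector, which is why the projection counts dominate the AXPY totals discussed below.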


Note that R̃_0^* Π_j^{(j)} = O and Π_{i+1}^{(j)} A = A Π_i^{(j)} for i = 0, 1, ..., j − 1. Here we use the superscript '(j)' for the jth repetition of the IDR step. An approximation x_k^{(j−1)} and the corresponding residual r_k^{(j−1)}, plus the vectors A^i r_k^{(j−1)} for i = 1, 2, ..., j − 1 and the n × s matrices A^i U_k^{(j−1)} for i = 0, 1, ..., j, are generated by performing j − 1 (j ≤ ℓ) repetitions, where x_k^{(0)} ≡ x_k, r_k^{(0)} ≡ r_k and U_k^{(0)} ≡ U_k. The jth repetition is performed as follows. The vectors A^i r_k^{(j)} for i = 0, 1, ..., j − 1 are obtained by the projection

A^i r_k^{(j)} ≡ Π_{i+1}^{(j)} A^i r_k^{(j−1)} = A^i r_k^{(j−1)} − A^{i+1} U_k^{(j−1)} α^{(j)}  (1)

with α^{(j)} ≡ σ_j^{−1} (R̃_0^* A^{j−1} r_k^{(j−1)}), and the associated approximation x_k^{(j)} is expressed by

x_k^{(j)} = x_k^{(j−1)} + U_k^{(j−1)} α^{(j)}.  (2)

The matrices A^i U_k^{(j)} are updated from A^i U_k^{(j−1)} for i = 0, 1, ..., j such that the columns of A^j U_k^{(j)} form a basis of the Krylov subspace K_s(Π_j^{(j)} A, Π_j^{(j)} A^j r_k^{(j)}). Specifically, the vector A^j r_k^{(j)} is obtained by multiplying A^{j−1} r_k^{(j)} by A. Then the vectors A^i U_k^{(j)} e_1 for i = 0, 1, ..., j are obtained by the projection

A^i U_k^{(j)} e_1 ≡ Π_i^{(j)} A^i r_k^{(j)} = A^i r_k^{(j)} − A^i U_k^{(j−1)} β_1^{(j)}  (3)

with β_1^{(j)} ≡ σ_j^{−1} ρ_1^{(j)} and ρ_1^{(j)} ≡ R̃_0^* A^j r_k^{(j)}. Similarly, for some q < s, the vector c_q^{(j)} ≡ A(A^j U_k^{(j)} e_q) is computed as the qth column of A^{j+1} U_k^{(j)}; after that, the vectors A^i U_k^{(j)} e_{q+1} for i = 0, 1, ..., j can be computed as

A^i U_k^{(j)} e_{q+1} = Π_i^{(j)} A^{i+1} U_k^{(j)} e_q = A^{i+1} U_k^{(j)} e_q − A^i U_k^{(j−1)} β_{q+1}^{(j)}  (4)

with β_{q+1}^{(j)} ≡ σ_j^{−1} ρ_{q+1}^{(j)} and ρ_{q+1}^{(j)} ≡ R̃_0^* c_q^{(j)}. Thus, at the jth repetition, the vectors A^i r_k^{(j)} for i = 0, 1, ..., j − 1 and the columns of A^i U_k^{(j)} for i = 1, 2, ..., j belong to G_k ∩ R̃_0^⊥.

Fig. 1. The first repetition.

Fig. 2. The second repetition.

2.2 The polynomial step

After ℓ repetitions of the IDR step, we have an approximation x_k^{(ℓ)} and the corresponding residual r_k^{(ℓ)}, plus the vectors A^i r_k^{(ℓ)} for i = 1, 2, ..., ℓ and the ℓ + 2 matrices A^i U_k^{(ℓ)} for i = 0, 1, ..., ℓ + 1. The residual minimization is performed using a polynomial of degree ℓ. Specifically, the residual r_k^{(ℓ)} and the matrix AU_k^{(ℓ)} are updated by

r_{k+ℓ} = r_k^{(ℓ)} − γ_{1,k} A r_k^{(ℓ)} − ··· − γ_{ℓ,k} A^ℓ r_k^{(ℓ)},
AU_{k+ℓ} = AU_k^{(ℓ)} − γ_{1,k} A^2 U_k^{(ℓ)} − ··· − γ_{ℓ,k} A^{ℓ+1} U_k^{(ℓ)}

with scalars γ_{1,k}, γ_{2,k}, ..., γ_{ℓ,k} which are determined by minimizing a norm of the residual r_{k+ℓ}. The approximation x_k^{(ℓ)} and the matrix U_k^{(ℓ)} are also updated to the next associated approximation x_{k+ℓ} and matrix U_{k+ℓ}.

3. An alternative implementation of the IDRstab method

In the IDR step stated in the preceding section, the computational costs for vector updates (i.e., AXPYs) to generate the matrices A^i U_k^{(j)} for i = 0, 1, ..., j, j = 1, 2, ..., ℓ, are expensive. Note that a vector update using a projection Π_i^{(j)} such as (1) contains s AXPYs. Therefore, in this section, we give a new formulation of the IDR step in which vector updates are saved, and derive an alternative implementation of IDRstab.

3.1 Saving vector updates in the IDR step

We compute the matrix A^* R̃_0 at the beginning of the iteration, and store it. Then, following the idea noted in [4], we compute the σ_j as (A^* R̃_0)^* A^{j−1} U_k^{(j−1)}. The vectors α^{(j)}, ρ_1^{(j)} and ρ_{q+1}^{(j)} (q < s) can also be computed as σ_j^{−1}((A^* R̃_0)^* A^{j−2} r_k^{(j−1)}), (A^* R̃_0)^* A^{j−1} r_k^{(j)} and (A^* R̃_0)^* A^j U_k^{(j)} e_q, respectively. These forms enable us to perform the jth repetition of the IDR step without the matrix A^j U_k^{(j−1)}.

At the first repetition, i.e., for j = 1, we compute U_k^{(0)} α^{(1)}, then obtain the approximation (2) and the residual r_k^{(1)} = r_k^{(0)} − A(U_k^{(0)} α^{(1)}), where multiplying U_k^{(0)} α^{(1)} by A gives A(U_k^{(0)} α^{(1)}). We obtain U_k^{(1)} by the projection Π_0^{(1)}, and multiplying U_k^{(1)} by A gives AU_k^{(1)}. From the second repetition, i.e., for j = 2, 3, ..., ℓ, we perform the updates (1) for i = 0, 1, ..., j − 2 and (2), and obtain A^{j−1} r_k^{(j)} by multiplying A^{j−2} r_k^{(j)} by A. Then we also perform the updates (3) and (4) for i = 0, 1, ..., j − 1. At the end of the ℓth repetition, we obtain A^ℓ r_k^{(ℓ)} by multiplying A^{ℓ−1} r_k^{(ℓ)} by A.

The scheme of the IDR step stated above is displayed in Figs. 1 and 2 for ℓ = 2. The notations of the scheme follow [4]. The explicit multiplications by A are used to obtain the boxed vectors.

In our proposed IDR step, it is not needed to generate A^{j+1} U_k^{(j)} at the jth repetition. Hence, we can save the storage of an n × s matrix and ℓ(s² + s) AXPYs. However, we need an additional matrix-vector multiplication (MV) per cycle to obtain the vector A(U_k^{(0)} α^{(1)}).

3.2 A variant of IDRstab

Our variant of the IDRstab algorithm saving vector updates is expressed as follows:

Our proposed variant of IDRstab
1. Select an initial guess x and an (n × s) matrix R̃_0
2. Compute r_0 = b − Ax, r = [r_0]
   % Generate an initial (n × s) matrix U = [U_0]
3. For q = 1, ..., s
4.   if q = 1, u_0 = r_0, else, u_0 = Au_0
5.   μ = (U_{0(:,1:q−1)})^* u_0, u_0 = u_0 − U_{0(:,1:q−1)} μ
6.   u_0 = u_0/∥u_0∥_2, U_{0(:,q)} = u_0
7. End for
8. While ∥r_0∥ > tol
9.   For j = 1, ..., ℓ
       % The IDR step
10.    σ = (A^* R̃_0)^* U_{j−1}
11.    if j = 1, α = σ^{−1}(R̃_0^* r_0), else, α = σ^{−1}((A^* R̃_0)^* r_{j−2})
12.    x = x + U_0 α
13.    if j = 1, r = r − AU_0 α, else, r = r − [U_1; ...; U_{j−1}] α
14.    if j > 1, r = [r; A r_{j−2}]
15.    For q = 1, ..., s
16.      if q = 1, u = r, else, u = [u_1; ...; u_j]
17.      β = σ^{−1}((A^* R̃_0)^* u_{j−1}), u = u − Uβ, u = [u; A u_{j−1}]
18.      μ = (V_{j(:,1:q−1)})^* u_j, u = u − V_{(:,1:q−1)} μ
19.      u = u/∥u_j∥_2, V_{(:,q)} = u
20.    End for
21.    U = V
22.  End for
23.  r = [r; A r_{ℓ−1}]
     % The polynomial step
24.  γ = [γ_1; ...; γ_ℓ] = arg min_γ ∥r_0 − [r_1, ..., r_ℓ] γ∥_2
25.  x = x + [r_0, ..., r_{ℓ−1}] γ, r_0 = r_0 − [r_1, ..., r_ℓ] γ
26.  U = [U_0 − Σ_{j=1}^{ℓ} γ_j U_j]
27. End while

The notations in this algorithm follow the MATLAB conventions: for a matrix W = [w_1, ..., w_s], the matrix [w_1, ..., w_q] and the vector w_q for q ≤ s are notated as W_{(:,1:q)} and W_{(:,q)}, respectively, and [W_0; ...; W_j] ≡ [W_0^⊤, ..., W_j^⊤]^⊤. Note that, in this algorithm, the U_i, V_i, r_i and u_i for i = 0, 1, ..., j are related to U, V, r and u by U = [U_0; ...; U_j], V = [V_0; ...; V_j], r = [r_0; ...; r_j] and u = [u_0; ...; u_j], respectively.

As in the original implementation in [4], we use the Arnoldi process to obtain the orthonormalized matrices U_0 and A^j U_k^{(j)} at lines 5-6 and 18-19, respectively. Since AU_k^{(ℓ)} does not need to be updated to AU_{k+ℓ} in the polynomial step, a further sℓ AXPYs are saved. Table 1 summarizes the computational costs of the original IDRstab and our variant for MVs and AXPYs per cycle. Here we do not include the costs for the Arnoldi process.

Table 1. The computational costs of IDRstab and our variant for MVs and AXPYs per cycle.
              MVs         AXPYs
IDRstab       ℓ(s + 1)     (1/2)ℓ(ℓ + 1)(s + s²) + 2ℓ + sℓ + ℓ(s² + 2s)
our variant   ℓ(s + 1) + 1  (1/2)ℓ(ℓ + 1)(s + s²) + 2ℓ + sℓ

From Table 1, the computational costs of our variant per cycle are less than those of the original IDRstab when nnz < ℓ(s² + 2s)n holds, where nnz is the number of nonzero entries of the coefficient matrix. Thus, we expect that our variant is more efficient than the original IDRstab for large sparse linear systems.

Note that the original IDRstab and our variant are mathematically equivalent, but they may show different convergence properties.

4. Numerical experiments

In this section, we present some numerical experiments on model problems with nonsymmetric matrices.

4.1 Computational condition

Numerical calculations were carried out in double-precision floating-point arithmetic on a PC (Intel Core i7 2.67GHz CPU) with the Intel C++ 11.1.048 compiler. The iterations were started with x_0 = 0. The stopping criterion tol was set at 10^{−12}∥b∥_2. The columns of R̃_0 were given by the orthonormalization of s real random vectors in the interval (0, 1). The combinations of the parameters (s, ℓ) were set at (2, 4), (4, 4) and (8, 2).

Figs. 3 and 4 display the convergence histories with (s, ℓ) = (2, 4) for examples 1 and 2, respectively. The plots show the number of cycles on the horizontal axis versus the log₁₀ of the relative residual 2-norm (∥r_k∥_2/∥b∥_2) on the vertical axis.

Fig. 3. Convergence histories of the original IDRstab and our variant with (s, ℓ) = (2, 4) for example 1.

Fig. 4. Convergence histories of the original IDRstab and our variant with (s, ℓ) = (2, 4) for example 2.

Tables 2 and 3 show the number of cycles and MVs, the computation times and the explicitly computed relative residual 2-norms (∥b − Ax_k∥_2/∥b∥_2) at termination, which are abbreviated as "Cycles", "MVs", "Time[sec]" and "True res.", respectively.
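The per-cycle costs of Table 1 and the break-even condition nnz < ℓ(s² + 2s)n can be checked numerically; the following sketch (formulas transcribed from Table 1; the function names and the flop-count reasoning in the comments are ours) evaluates both:

```python
# Per-cycle operation counts from Table 1 (function names are ours).
def mvs_idrstab(s, l):
    return l * (s + 1)

def mvs_variant(s, l):
    return l * (s + 1) + 1          # one extra MV per cycle

def axpys_idrstab(s, l):
    return l * (l + 1) * (s + s**2) // 2 + 2 * l + s * l + l * (s**2 + 2 * s)

def axpys_variant(s, l):
    return l * (l + 1) * (s + s**2) // 2 + 2 * l + s * l

def variant_cheaper(s, l, n, nnz):
    # One extra MV costs on the order of nnz operations, while the
    # l*(s^2 + 2s) saved AXPYs cost on the order of n operations each,
    # giving the break-even condition stated in the text.
    return nnz < l * (s**2 + 2 * s) * n

s, l = 2, 4
print(axpys_idrstab(s, l), axpys_variant(s, l))   # 108 76: 32 AXPYs saved
# SHERMAN5 (example 1): n = 3312, nnz = 20793
print(variant_cheaper(s, l, 3312, 20793))         # True
```

For (s, ℓ) = (2, 4) the variant saves ℓ(s² + 2s) = 32 AXPYs per cycle at the price of one MV, so for matrices as sparse as those in the experiments the trade is favorable.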


Table 2. Number of cycles and MVs, computation times and explicitly computed relative residual norms for example 1.
(s, ℓ)              Cycles  MVs   Time[sec]  True res.
(2, 4) IDRstab       225    2703  2.730      4.2E−09
       our variant   235    3057  2.398      7.6E−13
(4, 4) IDRstab       116    2325  2.621      8.3E−10
       our variant   123    2587  2.356      4.8E−11
(8, 2) IDRstab       132    2385  3.442      1.5E−10
       our variant   132    2516  2.886      2.7E−11

Table 3. Number of cycles and MVs, computation times and explicitly computed relative residual norms for example 2.
(s, ℓ)              Cycles  MVs   Time[sec]  True res.
(2, 4) IDRstab       677    8127  38.33      1.6E−06
       our variant   626    8140  29.58      8.5E−08
(4, 4) IDRstab       226    4525  25.20      3.1E−05
       our variant   259    5443  24.42      6.2E−08
(8, 2) IDRstab       206    3717  27.85      1.9E−07
       our variant   221    4207  24.38      3.2E−09

4.2 Example 1

As in [4], we take up the test matrix SHERMAN5 from the Matrix Market collection. The order n and nnz of this matrix are 3312 and 20793, respectively. The percentage of nonzero entries is 0.19. The right-hand side vector is given by substituting the vector x* ≡ (1, 1, ..., 1)^⊤ into the equation b = Ax*.

From Fig. 3 and Table 2, we can observe the following. The numbers of cycles required for successful convergence of our variant are about the same as those of the original IDRstab for each of the combinations of s and ℓ, and the convergence behavior of our variant is almost the same as that of the original IDRstab. The computation times for our variant are shorter, although the number of MVs increases compared with that of the original IDRstab, because the computational costs for AXPYs are substantially reduced. In particular, the computation time for our variant is about 84% of that for the original IDRstab in the case of (s, ℓ) = (8, 2).

Note that the approximate solutions obtained by our variant are more accurate than those obtained by the original IDRstab for all of the combinations of s and ℓ.

4.3 Example 2

As shown in [7], we take up a system with a sparse nonsymmetric coefficient matrix derived from the finite difference discretization of the following partial differential equation on the unit square Ω = [0, 1] × [0, 1]:

−u_xx − u_yy + D[(y − 1/2)u_x + (x − 1/3)(x − 2/3)u_y] − 43π²u = G(x, y),  u(x, y)|_∂Ω = 1 + xy.

This equation is discretized by using the 5-point central difference approximation. The mesh size h is chosen as 129^{−1} in both directions of Ω. Then the order n and nnz of the coefficient matrix are 128² and 81408, respectively. The percentage of nonzero entries is 0.03. The right-hand side vector of the discretized system is given such that the exact solution u(x, y) of the above equation is 1 + xy. The parameter Dh is set at 2^{−1}.

From Fig. 4 and Table 3, we can observe the following. The numbers of cycles required for successful convergence of our variant are about the same as those of the original IDRstab in the cases of (s, ℓ) = (2, 4) and (8, 2), and the convergence behavior of our variant is about the same as that of the original IDRstab. As before, the computation times for our variant are shorter although the number of MVs increases compared with that of the original IDRstab. In particular, the computation time for our variant is about 77% of that for the original IDRstab in the case of (s, ℓ) = (2, 4).

In the case of (s, ℓ) = (4, 4), the number of cycles required for successful convergence of our variant is slightly larger than that of the original IDRstab. Nevertheless, the computation time for our variant is shorter than that for the original IDRstab.

Note that our variant leads to more accurate approximate solutions, as in the results of example 1.

5. Concluding remarks

We proposed a variant of IDRstab saving vector updates. A feature of our variant is that the computational costs for AXPYs are substantially reduced at the price of one additional MV per cycle. Numerical experiments show that the number of cycles required for successful convergence of our variant is about the same as that of the original IDRstab. As a result, the computation time can be reduced for sparse linear systems. Moreover, we observed that our variant of IDRstab leads to more accurate approximate solutions than the original one. We will analyze in future work why the approximate solutions obtained by our variant are more accurate than those obtained by the original IDRstab.

Acknowledgments

The authors would like to thank Dr. G. L. G. Sleijpen (Utrecht University) for his helpful advice, and the reviewer for his or her constructive comments.

References

[1] P. Sonneveld and M. B. van Gijzen, IDR(s): a family of simple and fast algorithms for solving large nonsymmetric linear systems, SIAM J. Sci. Comput., 31 (2008), 1035–1062.
[2] H. A. van der Vorst, Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13 (1992), 631–644.
[3] G. L. G. Sleijpen, P. Sonneveld and M. B. van Gijzen, Bi-CGSTAB as an induced dimension reduction method, Appl. Numer. Math., 60 (2010), 1100–1114.
[4] G. L. G. Sleijpen and M. B. van Gijzen, Exploiting BiCGstab(ℓ) strategies to induce dimension reduction, SIAM J. Sci. Comput., 32 (2010), 2687–2709.
[5] G. L. G. Sleijpen and D. R. Fokkema, BiCGstab(ℓ) for linear equations involving unsymmetric matrices with complex spectrum, Elec. Trans. Numer. Anal., 1 (1993), 11–32.
[6] M. Tanio and M. Sugihara, GBi-CGSTAB(s, L): IDR(s) with higher-order stabilization polynomials, J. Comput. Appl. Math., 235 (2010), 765–784.
[7] W. Joubert, Lanczos methods for the solution of nonsymmetric systems of linear equations, SIAM J. Matrix Anal. Appl., 13 (1992), 926–943.
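The matrix sizes quoted for example 2 can be reproduced by counting 5-point stencil entries on the 128 × 128 grid of interior points (a counting sketch of ours; the matrix itself is not formed):

```python
# Count nonzeros of the 5-point central-difference matrix on an m x m
# grid of interior points (Dirichlet boundary): each point contributes
# a diagonal entry plus one entry per interior neighbour.
m = 128                  # h = 1/129 gives 128 interior points per direction
n = m * m
nnz = 0
for i in range(m):
    for j in range(m):
        nnz += 1                          # diagonal
        nnz += (i > 0) + (i < m - 1)      # left/right neighbours
        nnz += (j > 0) + (j < m - 1)      # lower/upper neighbours
print(n, nnz)  # 16384 81408, matching the values quoted in Section 4.3
```

Equivalently nnz = 5m² − 4m, so the nonzero density 81408/16384² ≈ 0.03% is far below the break-even bound of Table 1.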

JSIAM Letters Vol.3 (2011) pp.73–76 © 2011 Japan Society for Industrial and Applied Mathematics

Error analysis of H1 gradient method for topology optimization problems of continua

Daisuke Murai1 and Hideyuki Azegami1

1 Graduate School of Information Science, Nagoya University, A4-2 (780) Furo-cho, Chikusa-ku, Nagoya 464-8601 E-mail murai az.cs.is.nagoya-u.ac.jp Received July 6, 2011, Accepted October 19, 2011 Abstract The present paper describes the result of the error estimation of a numerical solution to topology optimization problems of domains in which boundary value problems are defined. In the previous paper, we formulated a problem by using density as a design variable, presented a regular solution, and called it the H1 gradient method. The main result in this paper is the proof of the first order convergence in the H1 norm of the solution in the H1 gradient method with respect to the size of the finite elements if first order elements are used for the design and state variables. Keywords calculus of variations, boundary value problem, topology optimization, H1 gra- dient method, error analysis Research Activity Group Mathematical Design

1. Introduction

The problem of finding the optimum layout of holes in a domain in which a boundary value problem is defined is called the topology optimization problem of continua [1]. One method for formulating this topology optimization problem uses density as a design variable; in this case the problem is called the SIMP problem. In the previous paper [2], we formulated the problem and presented a regular solution by using a gradient method in a function space, and called this method the H1 gradient method. The aim of the present paper is to show the error estimation of the H1 gradient method using standard finite element analyses.

2. SIMP problem

Let D ⊂ R^d, d ∈ {2, 3}, be a fixed bounded domain with boundary ∂D, let Γ_D ⊂ ∂D be a fixed subboundary with |Γ_D| > 0, and let Γ_N = ∂D \ Γ̄_D. Following [2], let ϕ ∈ C^∞(R; [0, 1]) be the density given by a sigmoidal function of the design variable θ ∈ S = {θ ∈ W^{1,∞}(D; R) | ∥θ∥_{1,∞} ≤ M} for a constant M > 0. Let u be the solution to the following problem.

Problem 1 Let f ∈ H^1(D; R), p ∈ H^{3/2}(Γ_N; R) and u_D ∈ H^3(D; R) be given functions, and let α > 1 be a constant. For a given θ ∈ S, find u ∈ H^1(D; R) such that

−∇·(ϕ^α(θ)∇u) = f in D,
ϕ^α(θ)∂_ν u = p on Γ_N,  u = u_D on Γ_D.

Here, ∂_ν = ν·∇, where ν is the unit outward normal vector along ∂D. Moreover, we provide cost functions as

J^l(θ, u) = ∫_D g^l(θ, u) dx + ∫_{∂D} j^l(θ, u) dγ + c^l  (1)

for l ∈ {0, 1, ..., m} with constants c^l and given functions g^l and j^l. By using J^l, we define the SIMP problem as follows [2].

Problem 2 Find θ such that

min_{θ∈S} {J^0(θ, u) | J^l(θ, u) ≤ 0, l ∈ {1, ..., m}}.

3. θ derivative of J^l

The Fréchet derivative of J^l with respect to θ is obtained as

J^{l′}(θ, u, v^l)[ρ] = ∫_D (g^l_θ + G^l_a)ρ dx + ∫_{∂D} j^l_θ ρ dγ = ⟨G^l, ρ⟩  (2)

for all ρ ∈ H^1(D; R) [2]. Here, ⟨·, ·⟩ is the dual product, G^l_a = −αϕ^{α−1}(θ)ϕ_θ ∇u·∇v^l, and (·)_θ denotes ∂(·)/∂θ. The function v^l is the solution of the following problem.

Problem 3 For the solution u to Problem 1 at θ ∈ S, find v^l ∈ H^1(D; R) such that

−∇·(ϕ^α(θ)∇v^l) = g^l_u(θ, u) in D,
ϕ^α(θ)∂_ν v^l = j^l_u(θ, u) on Γ_N,  v^l = 0 on Γ_D.

4. Solution to Problem 2

Following [2], we generate θ_i, i ∈ {1, 2, ..., n}, from θ_0 by the simplified steps as follows.

(i) Set a small constant ε > 0 for the step size, and i = 0.
(ii) Compute u_i = u by solving Problem 1 with θ = θ_i.
(iii) Compute v^l_i = v^l by solving Problem 3 with θ = θ_i.
(iv) Compute G^l_i = G^l by (2) using u_i, v^l_i and θ_i.
(v) Compute ρ^l_{G,i} ∈ H^1(D; R) by solving

∫_D (∇ρ^l_{G,i}·∇y + cρ^l_{G,i} y) dx = −⟨G^l_i, y⟩  (3)
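Step (v) is the heart of the H1 gradient method: instead of using the raw gradient −G^l directly, one solves an elliptic problem so that the descent direction ρ inherits H¹ regularity. A one-dimensional finite-difference sketch of this smoothing solve (our own toy discretization of −ρ″ + cρ = −G with homogeneous Neumann conditions, not the paper's finite element code):

```python
# 1-D toy of step (v): solve -rho'' + c*rho = -G on (0,1), Neumann BCs,
# by finite differences and the Thomas algorithm.  For a constant source
# G the exact solution is the constant -G/c, which the discrete solve
# reproduces.  All names and parameter values here are ours.

def solve_tridiag(sub, main, sup, rhs):
    """Thomas algorithm for a tridiagonal system (sub[0], sup[-1] unused)."""
    n = len(main)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = sup[0] / main[0], rhs[0] / main[0]
    for i in range(1, n):
        denom = main[i] - sub[i] * cp[i - 1]
        cp[i] = sup[i] / denom
        dp[i] = (rhs[i] - sub[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

N, c, G = 50, 1.0, 2.0
h = 1.0 / (N - 1)
# Rows scaled by h^2; mirror points enforce rho' = 0 at both ends.
sub = [0.0] + [-1.0] * (N - 2) + [-2.0]
main = [2.0 + c * h * h] * N
sup = [-2.0] + [-1.0] * (N - 2) + [0.0]
rhs = [-G * h * h] * N
rho = solve_tridiag(sub, main, sup, rhs)
print(max(abs(r + G / c) for r in rho))  # ~0: rho == -G/c everywhere
```

The point of the design is that ρ solves a well-posed elliptic problem, so the update θ_{i+1} = θ_i + ερ_i in step (vii) stays in the admissible function space.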


for a constant c > 0 and all y ∈ H^1(D; R).

(vi) Solve λ = (λ^l)_l in Aλ = −b, where A = (a_{jl})_{jl}, a_{jl} = ⟨G^j_i, ρ^l_{G,i}⟩ and b = (J^j + a_{j0})_j. Put λ^0_i = 1 and construct

ρ_i = ρ_{G,i}/∥ρ_{G,i}∥_{1,2},  ρ_{G,i} = Σ_{l=0}^{m} λ^l_i ρ^l_{G,i}.  (4)

(vii) Construct θ_{i+1} = θ_i + ερ_i and return to (ii) with i = i + 1.

5. Error analysis

We estimate the error of the numerical solution by the finite element method with respect to θ_n obtained in the solution in Section 4. Let D_h = ∪{K} be a finite element approximation of D with elements {K}, h = max_{K∈{K}} diam(K). For a positive integer k and an even number q ≥ d, we restrict u_i, v^l_i and ρ^l_{G,i} to W^{k+1,q}(D_h; R), and θ_i on D_h. We denote by θ_{h,i} = θ_i + δθ_i the approximation of θ_i; û_i = u_i + δû_i ∈ W^{k+1,q}(D_h; R) and v̂^l_i = v^l_i + δv̂^l_i ∈ W^{k+1,q}(D_h; R) are the analytical solutions of Problems 1 and 3 with θ_i replaced by θ_{h,i}. Let u_{h,i} = u_i + δu_i, v^l_{h,i} = v^l_i + δv^l_i, G^l_{h,i} = G^l_i + δG^l_i, ρ^l_{Gh,i} = ρ^l_{G,i} + δρ^l_{G,i}, ρ_{Gh,i} = ρ_{G,i} + δρ_{G,i}, and ρ_{h,i} = ρ_i + δρ_i be the approximate functions of u_i, v^l_i, G^l_i, ρ^l_{G,i}, ρ_{G,i}, and ρ_i, respectively. ρ̂^l_{G,i} = ρ^l_{G,i} + δρ̂^l_{G,i} ∈ W^{k+1,q}(D_h; R) represents the analytical solution of (3) with G^l_i replaced by G^l_{h,i}. Also, let λ_h = (λ^l_{h,i})_l be the solution to A_h λ_h = −b_h with A_h = (a_{h,jl})_{jl}, a_{h,jl} = ⟨G^j_{h,i}, ρ^l_{Gh,i}⟩, b_h = (J^j_h + a_{h,j0})_j, J^j_h = J^j(θ_{h,i}, u_{h,i}). We use

∥u∥_{j,q} = (Σ_{k=0}^{j} |u|^q_{k,q})^{1/q},  |u|_{j,q} = [∫_{D_h} (∇^j u)^q dx]^{1/q}

as the W^{j,q} norm ∥·∥_{j,q} and seminorm |·|_{j,q} on D_h for j ∈ {0, 1}, q ∈ {4, 6, ..., ∞}, with ∇^0 = 1. We set the following hypotheses necessary to evaluate the error.

(H1) We take α ≥ 2 in Problems 1, 3 and (2).
(H2) There exist some positive constants C1, C2, C3 independent of h such that

∥û_i − u_{h,i}∥_{j,q} ≤ C1 h^{k+1−j} |û_i|_{k+1,q},  (5)
∥v̂^l_i − v^l_{h,i}∥_{j,q} ≤ C2 h^{k+1−j} |v̂^l_i|_{k+1,q},  (6)
∥ρ̂^l_{G,i} − ρ^l_{Gh,i}∥_{j,q} ≤ C3 h^{k+1−j} |ρ̂^l_G|_{k+1,q}.  (7)

(H3) For J^l(θ, u), we restrict j^l(θ, u) to a function of u, i.e. j^l(u), j^l ∈ C²(W^{1,q}(D; R); L¹(D; R)), and g^l ∈ C²(Y; L¹(D; R)) for Y = S × W^{1,q}(D; R), such that j^l_u ∈ C¹(W^{1,q}(D; R); W^{1,∞}(D; R)), g^l_u, g^l_θ ∈ C¹(Y; L^∞(D; R)), j^l_{uu} ∈ C⁰(W^{1,q}(D; R); W^{1,∞}(D; R)), and g^l_{θθ}, g^l_{θu}, g^l_{uθ}, g^l_{uu} ∈ C⁰(Y; L^∞(D; R)), respectively.
(H4) There exists C4 > 0 such that ∥A^{−1}∥_∞ < C4, where ∥·∥_∞ is the maximum norm on R^m and the corresponding operator norm for m × m matrices.

Then we have the following main theorem.

Theorem 4 (Error of θ_n) Assume (H1) to (H4). Then there exists a constant C > 0 independent of ε and h such that ∥δθ_n∥_{1,q} ≤ Cεnh^k holds for n. Here εn = T can be considered as the total amount of variation of θ.

To prove this theorem, we introduce an induction hypothesis for θ_{h,i}:

∥δθ_i∥_{1,q} ≤ Cεih^k  (8)

for i ∈ {0, 1, ..., n − 1}, and the lemmas below.

Lemma 5 (Error of u_i) Assume (H1), (H2) and (8). Then there exists a constant C1′ > 0 independent of ε and h such that ∥δu_i∥_{1,q} ≤ C1′(εi + 1)h^k holds.

Proof  û_i and u_i satisfy

∫_{D_h} ϕ^α(θ_i) ∇δû_i·∇v′ dx = ∫_{D_h} (ϕ^α(θ_i) − ϕ^α(θ_{h,i})) ∇û_i·∇v′ dx.  (9)

By taking ∇v′ = (∇δû_i)^{q−1} and m_{θ_i} = min_{D_h} ϕ^α(θ_i) in (9), we have

m_{θ_i} |δû_i|^q_{1,q} ≤ α∥δθ_i∥_{0,∞} |û_i|_{1,q} |δû_i|^{q−1}_{1,q} × max_{t∈[0,1]} ∥ϕ^{α−1}(θ_i + tδθ_i)ϕ_θ(θ_i + tδθ_i)∥_{0,∞}.  (10)

By substituting (8) into (10) and dividing (10) by |δû_i|^{q−1}_{1,q}, noticing (H1), we obtain |δû_i|_{1,q} ≤ C1′εih^k. Using the Poincaré inequality, we get

∥δû_i∥_{1,q} ≤ C1′εih^k  (11)

by rewriting C1′ > 0. By substituting (11) and (5) into ∥δu_i∥_{1,q} ≤ ∥δû_i∥_{1,q} + ∥û_i − u_{h,i}∥_{1,q}, the proof is complete. (QED)

Lemma 6 (Error of v^l_i) Assume (H1) to (H3) and (8). Then there exists a constant C2′ > 0 independent of ε and h such that ∥δv^l_i∥_{1,q} ≤ C2′(εi + 1)h^k holds.

Proof  Noticing (H3), v^l_i and v̂^l_i satisfy

∫_{D_h} ϕ^α(θ_i) ∇δv̂^l_i·∇u′ dx = ∫_{D_h} (ϕ^α(θ_i) − ϕ^α(θ_{h,i})) ∇v̂^l_i·∇u′ dx + ∫_{D_h} (g^l_u(θ_{h,i}, u_{h,i}) − g^l_u(θ_i, u_i)) u′ dx + ∫_{∂D_h} (j^l_u(u_{h,i}) − j^l_u(u_i)) u′ dγ.  (12)

By taking ∇u′ = (∇δv̂^l_i)^{q−1} and using the Poincaré inequality, we have

∫_{D_h} (g^l_u(θ_{h,i}, u_{h,i}) − g^l_u(θ_i, u_i)) u′ dx ≤ ∥δθ_i∥_{0,q} |δv̂^l_i|^{q−1}_{1,q} max_{t∈[0,1]} ∥g^l_{uθ}(θ_i + tδθ_i, u_i)∥_{0,∞} + ∥δu_i∥_{0,q} |δv̂^l_i|^{q−1}_{1,q} max_{t∈[0,1]} ∥g^l_{uu}(θ_{h,i}, u_i + tδu_i)∥_{0,∞}  (13)

and

∫_{∂D_h} (j^l_u(u_{h,i}) − j^l_u(u_i)) u′ dγ = ∫_{D_h} ∇[(j^l_u(u_{h,i}) − j^l_u(u_i)) u′] dx ≤ |δu_i|_{1,q} |δv̂^l_i|^{q−1}_{1,q} max_{t∈[0,1]} |j^l_{uu}(u_i + tδu_i)|_{1,∞} + ∥δu_i∥_{0,q} |δv̂^l_i|^{q−1}_{1,q} max_{t∈[0,1]} ∥j^l_{uu}(u_i + tδu_i)∥_{0,∞}.  (14)

By the same argument as in the proof of Lemma 5, substituting (13) and (14) into (12), we have

m∥δv̂^l_i∥_{1,q} ≤ C1′′ α∥δθ_i∥_{0,∞} ∥v̂^l_i∥_{1,q} max_{t∈[0,1]} ∥ϕ^{α−1}(θ_i + tδθ_i)ϕ_θ(θ_i + tδθ_i)∥_{0,∞} + C1′′∥δθ_i∥_{0,q} max_{t∈[0,1]} ∥g^l_{uθ}(θ_i + tδθ_i, u_i)∥_{0,∞} + C1′′∥δu_i∥_{0,q} max_{t∈[0,1]} ∥g^l_{uu}(θ_{h,i}, u_i + tδu_i)∥_{0,∞} + C1′′|δu_i|_{1,q} max_{t∈[0,1]} |j^l_{uu}(u_i + tδu_i)|_{1,∞} + C1′′∥δu_i∥_{0,q} max_{t∈[0,1]} ∥j^l_{uu}(u_i + tδu_i)∥_{0,∞}  (15)

for some constant C1′′ > 0. From (H3), substituting (8) and (11) into (15), and substituting (15) and (6) into ∥δv^l_i∥_{1,q} ≤ ∥δv̂^l_i∥_{1,q} + ∥v̂^l_i − v^l_{h,i}∥_{1,q}, the proof is complete. (QED)

Lemma 7 (Error of G_i) Assume (H1) to (H3) and (8). Then there exists a constant C3′ > 0 independent of ε and h such that ∥δG^l_i∥_{0,q} ≤ C3′(εi + 1)h^k holds.

Proof  By (H3), G^l_i and G^l_{h,i} satisfy

δG^l_i = g^l_θ(θ_i, u_i) − g^l_θ(θ_{h,i}, u_{h,i}) + αϕ^{α−1}(θ_{h,i})ϕ_θ(θ_{h,i}) ∇u_{h,i}·∇v^l_{h,i} − αϕ^{α−1}(θ_i)ϕ_θ(θ_i) ∇u_i·∇v^l_i.  (16)

We estimate the bound on the first and second terms on the right-hand side of (16) as

∥g^l_θ(θ_i, u_i) − g^l_θ(θ_{h,i}, u_{h,i})∥_{0,q} ≤ ∥δu_i∥_{0,∞} max_{t∈[0,1]} ∥g^l_{θu}(θ_{h,i}, u_i + tδu_i)∥_{0,∞} + ∥δθ_i∥_{0,∞} max_{t∈[0,1]} ∥g^l_{θθ}(θ_i + tδθ_i, u_i)∥_{0,∞}.  (17)

By using the triangle inequality, we can estimate the remaining terms as

α∥ϕ^{α−1}(θ_{h,i}) − ϕ^{α−1}(θ_i)∥_{0,∞} ∥ϕ_θ(θ_i)∇u_i·∇v^l_i∥_{0,q} + α∥ϕ_θ(θ_{h,i}) − ϕ_θ(θ_i)∥_{0,∞} ∥ϕ^{α−1}(θ_{h,i})∇u_i·∇v^l_i∥_{0,q} + α∥ϕ^{α−1}(θ_{h,i})ϕ_θ(θ_{h,i})∇v^l_i∥_{0,∞} |u_{h,i} − u_i|_{1,q} + α∥ϕ^{α−1}(θ_{h,i})ϕ_θ(θ_{h,i})∇u_{h,i}∥_{0,∞} |v^l_{h,i} − v^l_i|_{1,q}  (18)

and

∥ϕ^{α−1}(θ_{h,i}) − ϕ^{α−1}(θ_i)∥_{0,∞} ≤ (α − 1)∥δθ_i∥_{0,∞} max_{t∈[0,1]} ∥ϕ^{α−2}(θ_i + tδθ_i)ϕ_θ(θ_i + tδθ_i)∥_{0,∞},
∥ϕ_θ(θ_{h,i}) − ϕ_θ(θ_i)∥_{0,∞} ≤ ∥δθ_i∥_{0,∞} max_{t∈[0,1]} ∥ϕ_{θθ}(θ_i + tδθ_i)∥_{0,∞}.

We can obtain the result in this lemma by substituting (17) and (18) into (16), using Lemmas 5, 6, (8), (H1) and (H3). (QED)

Lemma 8 (Error of ρ^l_{G,i}) Assume (H1) to (H3) and (8). Then there exists a constant C4′ > 0 independent of ε and h such that ∥δρ^l_{G,i}∥_{1,q} ≤ C4′(εi + 1)h^k holds.

Proof  ρ^l_{G,i} and ρ̂^l_{G,i} satisfy

∫_{D_h} (∆δρ̂^l_{G,i} − cδρ̂^l_{G,i}) y dx = ⟨δG^l_i, y⟩.

Taking y = (q − 1)δρ̂^l_{G,i}(∇δρ̂^l_{G,i})^{q−2} and considering ∇δρ̂^l_{G,i} = 0 on ∂D_h, we have

|δρ̂^l_{G,i}|^q_{1,q} + c(q − 1) ∫_{D_h} (δρ̂^l_{G,i})²(∇δρ̂^l_{G,i})^{q−2} dx ≤ (q − 1)∥δG^l_i∥_{0,q} |δρ̂^l_{G,i}|^{q−2}_{1,q} |δρ̂^l_{G,i}|_{0,q}.  (19)

Now we divide (19) by |δρ̂^l_{G,i}|^{q−2}_{1,q}. Then, since q (> d) is an even number, and by the Poincaré inequality, we get ∥δρ̂^l_{G,i}∥_{1,q} ≤ C4′′(q − 1)∥δG^l_i∥_{0,q} for some constant C4′′ > 0. By substituting (7) into ∥δρ^l_{G,i}∥_{1,q} ≤ ∥δρ̂^l_{G,i}∥_{1,q} + ∥ρ̂^l_{G,i} − ρ^l_{Gh,i}∥_{1,q}, and using Lemma 7, the proof is complete. (QED)

Lemma 9 (Error of λ^l_i) Assume (H1) to (H4) and (8). Then there exists a constant C5′ > 0 independent of ε and h such that |λ^l_i − λ^l_{h,i}| ≤ C5′(εi + 1)h^k holds.

Proof  λ and λ_h satisfy

A(λ − λ_h) = b_h − b − (A − A_h)λ_h.

By (H4) and multiplying by A^{−1}, we get

∥λ − λ_h∥_∞ ≤ ∥A^{−1}∥_∞(∥b − b_h∥_∞ + ∥A − A_h∥_∞∥λ_h∥_∞) ≤ ∥A^{−1}∥_∞(1 + m∥λ_h∥_∞) max_{j∈{1,...,m}, l∈{0,...,m}} |a_{jl} − a_{h,jl}| + ∥A^{−1}∥_∞ max_{j∈{1,...,m}} |J^j(θ_i, u_i) − J^j(θ_{h,i}, u_{h,i})|.

Here,

|a_{jl} − a_{h,jl}| ≤ |⟨δG^j_i, ρ^l_{G,i}⟩| + |⟨G^j_{h,i}, δρ^l_{G,i}⟩| ≤ ∥δG^j_i∥_{0,2}∥ρ^l_{G,i}∥_{0,2} + ∥G^j_{h,i}∥_{0,2}∥δρ^l_{G,i}∥_{0,2}


Fig. 1. Setting for Problem 1 and converged ϕ: (a) f in Problem 1, (b) ϕ(θ_{1/20,100}).

Fig. 2. Setting for linear elastic problem and converged ϕ: (a) boundary condition, (b) ϕ(θ_{1/20,800}).

Table 1. Results of −log₂∥δθ_n∥_{1,2} with T = εn = 10 for Problem 1.
n      h=1/5   h=1/10  h=1/20  h=1/40  h=1/80
50     0.9012  1.8513  2.8614  3.8983  5.0598
incr.          0.9501  1.0101  1.0369  1.1615
100    0.9201  1.8655  2.8397  3.8761  5.0407
incr.          0.9454  0.9742  1.0364  1.1646
200    1.4861  2.4064  3.4106  4.4414  5.6005
incr.          0.9203  1.0042  1.0308  1.1591
400    1.0518  2.0617  3.0481  4.0759  5.2331
incr.          1.0099  0.9864  1.0278  1.1572
800    0.7343  1.6836  2.7203  3.7521  4.9121
incr.          0.9493  1.0367  1.0318  1.1600

Table 2. Results of −log₂∥δθ_n∥_{1,2} with T = εn = 80 for a linear elastic problem.
n      h=1/5    h=1/10   h=1/20   h=1/40   h=1/80
400    −5.7374  −4.9172  −4.0073  −2.9443  −1.6998
incr.           0.8202   0.9099   1.0630   1.2445
800    −5.7580  −4.9492  −4.0596  −3.0060  −1.7628
incr.           0.8088   0.8896   1.0536   1.2432
1600   −5.7598  −4.9506  −4.0618  −3.0086  −1.7656
incr.           0.8092   0.8888   1.0532   1.2430
3200   −5.7607  −4.9514  −4.0629  −3.0099  −1.7669
incr.           0.8093   0.8885   1.0530   1.2430
6400   −5.7611  −4.9517  −4.0635  −3.0106  −1.7676
incr.           0.8094   0.8882   1.0529   1.2430
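The "incr." rows in Tables 1 and 2 are the successive differences of −log₂∥δθ_n∥_{1,2} as h is halved; a value near 1 indicates O(h) convergence. A small sketch of this estimator (our helper function, demonstrated on synthetic first-order data, not on the tabulated values):

```python
import math

def observed_order(err_coarse, err_fine):
    """Convergence order estimated from errors at mesh sizes h and h/2.
    Equals the difference of -log2(err) entries between adjacent columns."""
    return math.log2(err_coarse / err_fine)

# Synthetic first-order data err(h) = 0.8 * h on h = 1/5, ..., 1/80:
errs = [0.8 / 5, 0.8 / 10, 0.8 / 20, 0.8 / 40, 0.8 / 80]
orders = [observed_order(e1, e2) for e1, e2 in zip(errs, errs[1:])]
print(orders)  # [1.0, 1.0, 1.0, 1.0]
```

Applied to the tables, the increments cluster around 1, which is the numerical counterpart of the k = 1 rate asserted by Theorem 4.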

and

|J^j − J^j_h| ≤ ∥δθ_i∥_{0,∞} max_{t∈[0,1]} ∥g^j_θ(θ_i + tδθ_i, u_i)∥_{0,∞} + ∥δu_i∥_{0,∞} max_{t∈[0,1]} ∥g^j_u(θ_{h,i}, u_i + tδu_i)∥_{0,∞} + |δu_i|_{1,∞} max_{t∈[0,1]} |j^j_{uu}(u_i + tδu_i)|_{1,∞} + ∥δu_i∥_{0,∞} max_{t∈[0,1]} ∥j^j_{uu}(u_i + tδu_i)∥_{0,∞}.

From (8) and Lemmas 5, 6, 7 and 8, the lemma is proven. (QED)

Lemma 10 (Error of ρ_i) Assume (H1) to (H4) and (8). Then there exists a constant C6′ > 0 independent of ε and h such that ∥δρ_i∥_{1,q} ≤ C6′(εi + 1)h^k holds.

Proof  By (4), ρ_i and ρ_{h,i} satisfy ∥δρ_i∥_{1,q} ≤ 2∥δρ_{G,i}∥_{1,q}/∥ρ_{G,i}∥_{1,2} and

∥δρ_{G,i}∥_{1,q} ≤ (m + 1) max_{l∈{0,...,m}} |λ^l_{h,i}| max_{l∈{0,...,m}} ∥δρ^l_{G,i}∥_{1,q} + m max_{l∈{1,...,m}} |λ^l_i − λ^l_{h,i}| max_{l∈{1,...,m}} ∥ρ^l_{G,i}∥_{1,q}.

By using Lemmas 8 and 9, the lemma is proven. (QED)

Proof of Theorem 4  If n = 0, we have Theorem 4 by θ_0 = θ_{h,0}. If n > 0, for i ∈ {0, ..., n − 1}, we have ∥δθ_{i+1}∥_{1,q} ≤ ε∥δρ_i∥_{1,q} + ∥δθ_i∥_{1,q} with ∥δθ_0∥_{1,q} = 0. By applying Lemma 10 and (8) to the previous inequality, we have ∥δθ_{i+1}∥_{1,q} ≤ max{C6′, C}ε(i + 1)h^k + C6′ε²ih^k. Since ε is a small constant, taking C = max{C6′, C} and n = i + 1, we obtain Theorem 4. (QED)

6. Numerical examples

For Problem 1, we use the setting D = [0, 1]², Γ_D = ∂D, f = 2[x_1² + x_2² − (x_1 + x_2)], u_D = 0, ϕ(θ) = (tanh θ + 1)/2, α = 2. The cost functions are assumed as J^0(θ, u) = ∫_D fu dx and J^1(θ) = ∫_D ϕ(θ) dx − c¹, where c¹ is taken as J^1(θ_0) = 0 for θ_0 = 0. We take c = 1 in (3). D is approximated as D_h using triangular elements. We take k = 1 in (H2). Fig. 1 shows f and the converged ϕ obtained by the present method. Table 1 shows the results of −log₂∥δθ_n∥_{1,2} with T = nε = 10.

Another example is a SIMP problem for a linear elastic continuum. Let D = [0, 3] × [0, 2], p ∈ H^{3/2}(Γ_N; R²) be a traction force, u_D = 0 ∈ H²(Γ_D; R²), and u ∈ H¹(D; R²) be the displacement as the solution of the linear elastic problem for p. A mean compliance J^0(θ, u) = ∫_{Γ_N} p·u dγ and a mass J^1(θ) = ∫_D ϕ(θ) dx − c¹ are used as cost functions. We have G^0_a = −αϕ^{α−1}ϕ_θ σ(u)·ε(u) for J^0, where σ(u) and ε(u) denote the stress and the strain, respectively. The space approximation of D, c¹, α, c and k are the same as above. Fig. 2 shows the problem setting and the resulting ϕ obtained by the present method. Table 2 shows the results of −log₂∥δθ_n∥_{1,2} with T = nε = 80.

From Tables 1 and 2, we can observe that ∥δθ_n∥_{1,2} achieves first order convergence in the H¹ norm with respect to h, as expected from Theorem 4 with k = 1. Also, these tables show that ∥δθ_n∥_{1,2} is essentially independent of T = εn.

Acknowledgments

We wish to thank Prof. Norikazu Saito and the reviewer for their valuable comments on the proof. The present study was supported by JSPS KAKENHI (20540113).

References

[1] M. P. Bendsøe and O. Sigmund, Topology optimization: theory, methods and applications, Springer, 2003.
[2] H. Azegami, S. Kaizu and K. Takeuchi, Regular solution to topology optimization problems of continua, JSIAM Letters, 3 (2011), 1–4.

JSIAM Letters Vol. 3 (2011) pp. 77–80 © 2011 Japan Society for Industrial and Applied Mathematics

Evolution of bivariate copulas in discrete processes

Yasukazu Yoshizawa1 and Naoyuki Ishimura1

1 Graduate School of Economics, Hitotsubashi University, 2-1 Naka, Kunitachi, Tokyo 186-8601, Japan
E-mail ed091006@g.hit-u.ac.jp, ishimura@econ.hit-u.ac.jp
Received October 13, 2011, Accepted October 19, 2011

Abstract
A copula function makes a bridge between multivariate joint distributions and univariate marginal distributions, and provides a flexible way of describing nonlinear dependence among random circumstances. We introduce a new family of bivariate copulas which evolves according to a discrete process for the heat equation. We prove the convergence of solutions as well as of the measures of dependence. Numerical experiments are also performed, which show that our procedure works substantially well.
Keywords copula, discrete processes, risk management
Research Activity Group Mathematical Finance

1. Introduction
There has been much interest in the theory of copulas these days. A copula technique provides a flexible and convenient method of describing nonlinear dependence among multivariate random events. Copulas make a link between a multivariate joint distribution and univariate marginal distributions. The technique is employed not only in statistics but also in many areas of application, which include financial engineering, risk management, actuarial science, seismology and so on. We refer to [1–10] and the references therein.

In the case of a bivariate joint distribution, the definition of a copula and the fundamental theorem developed by A. Sklar [11] are expressed as follows.

Definition 1 A function C defined on I² := [0, 1] × [0, 1] and valued in I is called a copula if the following conditions are fulfilled.
(i) For every (u, v) ∈ I²,
C(u, 0) = C(0, v) = 0, C(u, 1) = u and C(1, v) = v. (1)
(ii) For every (u_i, v_i) ∈ I² (i = 1, 2) with u_1 ≤ u_2 and v_1 ≤ v_2,
C(u_1, v_1) − C(u_1, v_2) − C(u_2, v_1) + C(u_2, v_2) ≥ 0. (2)

The requirement (2) is referred to as the 2-increasing condition. We also note that a copula is continuous by its definition.

Theorem 2 (Sklar's theorem) Let H be a bivariate joint distribution function with marginal distribution functions F and G; that is,
lim_{x→∞} H(x, y) = G(y), lim_{y→∞} H(x, y) = F(x).
Then there exists a copula, which is uniquely determined on Ran F × Ran G, such that
H(x, y) = C(F(x), G(y)). (3)
Conversely, if C is a copula and F and G are distribution functions, then the function H defined by (3) is a bivariate joint distribution function with margins F and G.

In this article, we introduce a new family of bivariate copulas, which evolves according to a discrete process. Although there exist many one-parameter families of copulas, such as the Clayton family, the Gumbel-Hougaard family and the Frank family, little attention seems to have been paid to time-dependent copulas despite their importance. We just recall one important exception, the concept of dynamic copula due to A. J. Patton [12].

On the other hand, we have introduced the time evolution of copulas in [13–15]. To be precise, we consider a time-parameterized family of copulas {C(u, v, t)}_{t≥0}, which satisfy the heat equation:
∂C/∂t (u, v, t) = (∂²/∂u² + ∂²/∂v²) C(u, v, t). (4)
Here, by the definition of copula, we understand that C(·, ·, t) fulfills (1), (2); to be precise, we postulate that
(i) for every (u, v, t) ∈ I² × (0, ∞),
C(u, 0, t) = C(0, v, t) = 0, C(u, 1, t) = u and C(1, v, t) = v. (5)
(ii) for every (u_i, v_i, t) ∈ I² × (0, ∞) (i = 1, 2) with u_1 ≤ u_2 and v_1 ≤ v_2,
C(u_1, v_1, t) − C(u_1, v_2, t) − C(u_2, v_1, t) + C(u_2, v_2, t) ≥ 0. (6)
The stationary solution to (4), which is referred to as the harmonic copula, is uniquely determined to be


Π(u, v) := uv, in view of the boundary condition (1). We note that the copula Π represents the independent structure between the two respective random variables.

Here we discretize (4) in a sense, and define a time-dependent family of copulas in discrete processes. We hope that these discretized families are rather ready to be numerically computed and to be applied in many situations. We exhibit some examples in Section 4.

2. Discrete processes of copulas
The construction of our discretely parametrized family of copulas proceeds as follows.
Let N ≫ 1 and 0 < h ≪ 1. We put
Δu = Δv := 1/N, Δt := h, λ := Δt/(Δu)² = Δt/(Δv)² = hN²,
u_i := iΔu = i/N for i = 0, 1, . . . , N,
v_j := jΔv = j/N for j = 0, 1, . . . , N.
Our family of copulas {C^n(u, v)}_{n=0,1,2,...} is now defined as follows: First,
C^0(u, v) := C_0(u, v),
where C_0 denotes a given initial copula. At {(u_i, v_j)}_{i,j=0,1,...,N}, the value C^n_{i,j} := C^n(u_i, v_j) is governed by the system of difference equations
(C^{n+1}_{i,j} − C^n_{i,j})/Δt = (C^n_{i+1,j} − 2C^n_{i,j} + C^n_{i−1,j})/(Δu)² + (C^n_{i,j+1} − 2C^n_{i,j} + C^n_{i,j−1})/(Δv)² for i, j = 1, 2, . . . , N − 1, (7)
together with the boundary conditions
C^n_{0,j} = 0 = C^n_{i,0}, C^n_{i,N} = u_i, C^n_{N,j} = v_j for i, j = 0, 1, . . . , N. (8)
At a point (u, v) ∈ I² other than {(u_i, v_j)}_{i,j=0,1,...,N}, the value C^n(u, v) is provided by linear interpolation. That is, if for instance u_i ≤ u ≤ u_{i+1}, v_j ≤ v ≤ v_{j+1} and v − v_j ≤ u − u_i, then
C^n(u, v) := C^n_{i,j} + ((C^n_{i+1,j} − C^n_{i,j})/(u_{i+1} − u_i))(u − u_i) + ((C^n_{i+1,j+1} − C^n_{i+1,j})/(v_{j+1} − v_j))(v − v_j).
Other parts are computed similarly.
It is easy to check that the sequence of copulas {C^n(u, v)}_{n=0,1,2,...} defined above verifies the boundary conditions (1) as well as the 2-increasing condition (2) provided λ ≤ 1/4. We also note that in this range of λ, the difference scheme (7) is stable.
Next, we define D^n(u, v) := C^n(u, v) − uv; the extension to all of I² is made similarly as above. It then follows that {D^n_{i,j} := C^n_{i,j} − u_i v_j}_{n=0,1,2,...} satisfies the system of difference equations (7) with the null boundary conditions. Consequently we see that
max_{(u,v)∈I²} |D^n(u, v)| ≤ Kθ^n, (9)
for some constants K, θ with 0 < θ < 1, provided λ < 1/4. In particular, we have D^n → 0 as n → ∞ uniformly on I². To summarize, we have established the next theorem.

Theorem 3 For any initial copula C_0, there exists a sequence of copulas {C^n(u, v)}_{n=0,1,2,...}, which satisfy the system of difference equations (7) at every {(u_i, v_j)}_{i,j=0,1,...,N}. As n → ∞, it follows that C^n(u, v) → uv uniformly on I².

3. Measure of dependence
It is an important subject for research to quantitatively estimate the dependence relation between random variables. For this purpose, several measures of association have already been introduced. We recall, as widely known examples, the population versions of Kendall's tau and Spearman's rho, which will be denoted by τ and ρ, respectively.
The formulas for τ and ρ in terms of the copula function are well known. There are also formulas with respect to the empirical copulas (see Section 5.6 of [6]), which can be utilized for our discretized family of copulas. For the completeness of our exposition, we here reproduce them:
τ = (2N/(N − 1)) Σ_{i,j=2}^{N} (C^n_{i,j} C^n_{i−1,j−1} − C^n_{i,j−1} C^n_{i−1,j}),
ρ = (12/(N² − 1)) Σ_{i,j=1}^{N} (C^n_{i,j} − u_i v_j).
Thanks to these formulas, the convergence as n → ∞ is deduced directly, which reads as follows.

Theorem 4 For any initial copula C_0, the sequence of copulas {C^n(u, v)}_{n=0,1,2,...} obtained in Theorem 3 fulfills |τ|, |ρ| → 0 exponentially as n → ∞.

Sketch of Proof In view of (9) and the uniform bound max_{(u,v)∈I²} |C(u, v)| ≤ 1, we assert that
max_{i,j=0,1,...,N} |C^n_{i,j} − u_i v_j| → 0 exponentially as n → ∞.
Taking into account that u_i v_j u_{i−1} v_{j−1} − u_i v_{j−1} u_{i−1} v_j = 0, we see immediately the desired result. (QED)

4. Numerical experiments
The time evolution of copulas in discrete processes has a strong affinity to its numerical solution. Thus we can construct the copulas in accordance with (7) in Theorem 3. As examples, we calculate a Clayton copula C(u, v) = (u^{−θ} + v^{−θ} − 1)^{−1/θ} (0 < θ < ∞) and its time evolution.
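As a runnable sketch of this construction (our illustration, not the authors' code), the explicit scheme (7) with boundary conditions (8) can be run on a coarse grid starting from a Clayton copula, with the empirical Spearman's rho used to watch the dependence decay; the grid size N, the ratio λ and the parameter θ are illustrative choices:

```python
# Explicit finite-difference evolution (7)-(8) of an initial Clayton copula
# toward the independence copula Pi(u,v) = uv (Theorem 3), together with the
# decay of the empirical Spearman's rho (Theorem 4).  lam must satisfy
# lam <= 1/4 for the 2-increasing property and stability noted above.

N = 10
lam = 0.2                  # lambda = dt/(du)^2, strictly below the 1/4 bound
theta = 5.0                # Clayton parameter

def clayton(x, y, th=theta):
    if x <= 0.0 or y <= 0.0:
        return 0.0
    return (x ** -th + y ** -th - 1.0) ** (-1.0 / th)

u = [i / N for i in range(N + 1)]                      # u_i = v_i = i/N
C = [[clayton(u[i], u[j]) for j in range(N + 1)] for i in range(N + 1)]

def spearman_rho(C):
    # empirical version: rho = 12/(N^2 - 1) * sum_{i,j=1}^{N} (C_ij - u_i v_j)
    s = sum(C[i][j] - u[i] * u[j]
            for i in range(1, N + 1) for j in range(1, N + 1))
    return 12.0 * s / (N * N - 1)

rho0 = spearman_rho(C)                                 # dependence at n = 0

for n in range(800):                                   # evolve scheme (7)
    new = [row[:] for row in C]
    for i in range(1, N):
        for j in range(1, N):
            new[i][j] = C[i][j] + lam * (C[i + 1][j] - 2 * C[i][j] + C[i - 1][j]
                                         + C[i][j + 1] - 2 * C[i][j] + C[i][j - 1])
    C = new                                            # boundary values (8) untouched

dmax = max(abs(C[i][j] - u[i] * u[j])                  # distance to Pi(u,v) = uv
           for i in range(N + 1) for j in range(N + 1))
```

Starting from a strongly dependent Clayton copula (rho well above zero), the grid values converge to the independence copula uv and the empirical rho decays toward zero, as Theorems 3 and 4 state.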


[Fig. 1. Time evolution copulas: the left column shows the time evolution of the Clayton copula with θ = 5 and the right column the densities of its time evolution, at t = 0 (Clayton copula), t = 3/50 and t = ∞ (harmonic copula).]

The evolution is shown in the left side of Fig. 1. As the time evolves, the copulas are smoothed and converge to the harmonic (independence) copula Π.
We are also able to compute the densities of the above copulas by the following formula:
the density = (C^n_{i,j} − C^n_{i,j+1} − C^n_{i+1,j} + C^n_{i+1,j+1})/((Δu)(Δv)). (10)
As an example, we calculate the densities of the Clayton copula and its time evolution, which are depicted in the right side of Fig. 1. According to the time evolution, the densities of the copulas are seen to be smoothed and converge to a flat surface of density one, which is the density of the independence copula Π.

5. Discussions
As stated in Section 1, copulas are employed in quantitative risk management (QRM) for financial engineering, actuarial science, seismology and so on. There are many risk elements, and their dependencies are very critical for risk management, especially for measuring aggregated risks. Recently copulas have been recognized as a sophisticated method to express dependencies between risks quantitatively. QRM is of use for various purposes. For example, static dependencies may be enough for solvency regulation purposes, but dynamic dependencies must play important roles in QRM for catastrophic events, such as earthquakes, typhoons and financial crises. We have constructed the time evolution of copulas for the purpose of analyzing dynamic dependencies, which are concordant with Brownian motion. Unfortunately they have a smoothing nature, and our family of time evolving copulas converges to the harmonic copula, which means the independence relation.
Probably in many aspects of applications, the time evolution toward an intended dependence will be much more relevant. Thus we propose the backward type of evolution of copulas, which is obtained by transforming (7) in the opposite direction:
C^T(u, v) := C_T(u, v),
where C_T denotes a given maturity copula and the system of difference equations is
(C^{T−(n+1)h}_{i,j} − C^{T−nh}_{i,j})/Δt = (C^{T−nh}_{i+1,j} − 2C^{T−nh}_{i,j} + C^{T−nh}_{i−1,j})/(Δu)² + (C^{T−nh}_{i,j+1} − 2C^{T−nh}_{i,j} + C^{T−nh}_{i,j−1})/(Δv)² for i, j = 1, 2, . . . , N − 1, n = 0, 1, . . . , [T/h], (11)
together with the boundary condition (8). We solve (8) and (11) backwardly from the maturity to the present.

Acknowledgments
The authors are grateful to the referee for helpful comments. The second author (N. I.) is supported in part by a grant from the Japan Society for the Promotion of Science (No. 21540117), as well as the research grant (2011) from the Tokio Marine Kagami Memorial Foundation.

References
[1] E. W. Frees and E. A. Valdez, Understanding relationships using copulas, N. Amer. Actuarial J., 2 (1998), 1–25.
[2] K. Goda, Statistical modeling of joint probability distribution using copula: Application to peak and permanent displacement seismic demands, Struct. Safety, 32 (2010), 112–123.
[3] K. Goda and G. M. Atkinson, Interperiod dependence of ground-motion prediction equations: A copula perspective, Bull. Seism. Soc. Amer., 99 (2009), 922–927.
[4] R. Lebrun and A. Dutfoy, A generalization of the Nataf transformation to distributions with elliptical copula, Probab. Eng. Mech., 24 (2009), 172–178.
[5] A. J. McNeil, R. Frey and P. Embrechts, Quantitative Risk Management, Princeton Univ. Press, Princeton, 2005.
[6] R. B. Nelsen, An Introduction to Copulas, 2nd edition, Springer Series in Statistics, Springer, New York, 2006.
[7] H. Tsukahara, Copulas and their applications (in Japanese), Jpn J. Appl. Statist., 32 (2003), 77–88.
[8] Y. Yoshizawa, Modeling for the enterprise risk management (in Japanese), Sonpo-Soken Report, 90 (2009), 1–49.
[9] Y. Yoshizawa, Risk management of extremal events (in Japanese), Sonpo-Soken Report, 92 (2010), 35–90.
[10] R. W. J. van den Goorbergh, C. Genest and B. J. M. Werker, Bivariate option pricing using dynamic copula models, Insurance: Math. Econ., 37 (2005), 101–114.
[11] A. Sklar, Random variables, joint distribution functions, and copulas, Kybernetika, 9 (1973), 449–460.
[12] A. J. Patton, Modelling asymmetric exchange rate dependence, Int. Econ. Rev., 47 (2006), 527–556.
[13] N. Ishimura and Y. Yoshizawa, On time-dependent bivariate copulas, Theor. Appl. Mech. Jpn, 59 (2011), 303–307.
[14] N. Ishimura and Y. Yoshizawa, A note on the time evolution of bivariate copulas, in: Proc. of FAM2011, Sofia Univ., to appear.
[15] Y. Yoshizawa and N. Ishimura, Time evolution copulas and rank correlation (in Japanese), in: Proc. of JCOSSAR 2011.

JSIAM Letters Vol. 3 (2011) pp. 81–84 © 2011 Japan Society for Industrial and Applied Mathematics

On boundedness of the condition number of the coefficient matrices appearing in Sinc-Nyström methods for Fredholm integral equations of the second kind

Tomoaki Okayama1, Takayasu Matsuo2 and Masaaki Sugihara2

1 Graduate School of Economics, Hitotsubashi University, 2-1, Naka, Kunitachi, Tokyo 186-8601, Japan
2 Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1, Hongo, Bunkyo, Tokyo, 113-8656, Japan
E-mail tokayama@econ.hit-u.ac.jp
Received September 27, 2011, Accepted November 8, 2011

Abstract
Sinc-Nyström methods for Fredholm integral equations of the second kind have been independently proposed by Muhammad et al. and Rashidinia-Zarebnia. They also gave error analyses, but the results did not claim the convergence of their schemes in a precise sense. This is because in their error estimates there remained an unestimated term: the norm of the inverse of the coefficient matrix of the resulting linear system. In this paper, we estimate the term theoretically to complete the convergence estimate of their methods. Furthermore, we also prove the boundedness of the condition number of each coefficient matrix.
Keywords Sinc method, Fredholm integral equation, condition number, Nyström method
Research Activity Group Scientific Computation and Numerical Analysis

1. Introduction
We are concerned with Fredholm integral equations of the second kind of the form
λu(t) − ∫_a^b k(t, s)u(s) ds = g(t), a ≤ t ≤ b, (1)
where λ is a given constant, g(t) and k(t, s) are given continuous functions, and u(t) is the solution to be determined. Various numerical methods have been proposed to solve (1), and the convergence rate of most existing methods is polynomial with respect to the number of discretization points N [1].
One of the exceptions is the Sinc-Nyström method, which was first developed by Muhammad et al. [2]. According to their error analysis, the method can converge exponentially if the coefficient matrix of the resulting linear equations, say A_N, does not behave badly. To be more precise, the error of the numerical solution u_N(t) has been estimated as
max_{t∈[a,b]} |u(t) − u_N(t)| ≤ C∥A_N^{−1}∥_2 exp(−cN/log N), (2)
where C and c are positive constants independent of N. In their numerical experiments the term ∥A_N^{−1}∥_2 remained low for all N, which suggested that the method can converge exponentially. Afterwards Rashidinia-Zarebnia [3] proposed another type of Sinc-Nyström method, and claimed that the error can be estimated as
max_{t∈[a,b]} |u(t) − ũ_N(t)| ≤ C̃∥Ã_N^{−1}∥_2 exp(−c̃√N), (3)
which also suggested the exponential convergence of their method. Strictly speaking, however, the exponential convergence of those two methods has still not been established at this point, since the dependence of the terms ∥A_N^{−1}∥_2 and ∥Ã_N^{−1}∥_2 on N has not been clarified. It seems direct estimates of them are difficult, and that is the reason why they have remained open.
In this paper, we take a different approach: we give estimates in the ∞-norm as
∥A_N^{−1}∥_∞ ≤ K, ∥Ã_N^{−1}∥_∞ ≤ K̃,
for some constants K and K̃. Through ∥X∥_2 ≤ √n∥X∥_∞ for any n × n matrix X, the estimates imply the desired exponential convergence estimates. The key here is the analysis of Sinc-collocation methods previously given by the present authors [4].
The above approach has another virtue, namely that we can show a stronger result; we also show
∥A_N∥_∞ ≤ K′, ∥Ã_N∥_∞ ≤ K̃′,
from which the condition numbers of the matrices are bounded (in the sense of the ∞-norm). This result guarantees not only that the two methods converge exponentially, but also that the resulting linear equations do not become ill-conditioned as N increases.
This paper is organized as follows. In Section 2, we explain the concrete procedure of the Sinc-Nyström methods. New theoretical results are described in Section 3 with their proofs. In Section 4 a numerical example is shown. Section 5 is devoted to conclusions.
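The passage from the ∞-norm estimates to the 2-norm ones rests on the inequality ∥X∥_2 ≤ √n∥X∥_∞; a quick numerical illustration (the random matrices and the power-iteration lower estimate of the spectral norm are our choices, not the paper's):

```python
import math
import random

# Numerical check of ||X||_2 <= sqrt(n) ||X||_inf for a few random matrices.
# The spectral norm is estimated FROM BELOW by power iteration on X^T X, so a
# passing check here cannot be an artifact of underestimating the bound side.

random.seed(1)

def inf_norm(X):
    # maximum absolute row sum
    return max(sum(abs(a) for a in row) for row in X)

def spectral_norm_lb(X, iters=200):
    # lower estimate of ||X||_2: power iteration on X^T X, then ||X v||
    n = len(X)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(X[i][j] * v[j] for j in range(n)) for i in range(n)]   # X v
        z = [sum(X[i][j] * w[i] for i in range(n)) for j in range(n)]   # X^T (X v)
        nrm = math.sqrt(sum(a * a for a in z))
        v = [a / nrm for a in z]
    w = [sum(X[i][j] * v[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(a * a for a in w))

checks = []
for n in (5, 20, 41):
    X = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    checks.append(spectral_norm_lb(X) <= math.sqrt(n) * inf_norm(X))
```

Since ∥Xv∥_2 ≤ ∥X∥_2 for any unit vector v, the estimate never exceeds the true spectral norm, so the checks exercise the inequality honestly.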

2. Sinc-Nyström methods
2.1 Sinc quadrature
In the Sinc-Nyström methods, the Sinc quadrature:
∫_{−∞}^{∞} F(x) dx ≈ h Σ_{j=−N}^{N} F(jh) (4)
is employed to approximate the integral. Although the interval of the integral in (1) is finite, we can apply the Sinc quadrature by combining it with a variable transformation. Rashidinia-Zarebnia [3] utilized the Single-Exponential (SE) transformation defined by
t = ψ^{SE}(x) = ((b − a)/2) tanh(x/2) + (b + a)/2,
which enables us to apply the Sinc quadrature as follows:
∫_a^b f(t) dt = ∫_{−∞}^{∞} f(ψ^{SE}(x)) ψ^{SE′}(x) dx ≈ h Σ_{j=−N}^{N} f(ψ^{SE}(jh)) ψ^{SE′}(jh). (5)
Muhammad et al. [2] utilized another one:
t = ψ^{DE}(x) = ((b − a)/2) tanh((π/2) sinh x) + (b + a)/2,
which is called the Double-Exponential (DE) transformation. By using the DE transformation we have:
∫_a^b f(t) dt ≈ h Σ_{j=−N}^{N} f(ψ^{DE}(jh)) ψ^{DE′}(jh). (6)
In order to achieve quick convergence with the Sinc quadrature (4), it is necessary that the integrand F be analytic and bounded in the strip domain D_d = {z ∈ C : |Im z| < d} for a positive constant d. Accordingly, as for the approximations (5) and (6), it is appropriate to introduce the following function space.

Definition 1 Let D be a bounded and simply-connected domain (or Riemann surface). Then we denote by H^∞(D) the family of all functions that are analytic and bounded in D.

The domain D should be either ψ^{SE}(D_d) or ψ^{DE}(D_d), i.e., we may assume f ∈ H^∞(ψ^{SE}(D_d)) for the approximation (5), and f ∈ H^∞(ψ^{DE}(D_d)) for the approximation (6).

2.2 SE-Sinc-Nyström method
Firstly we explain the method derived by Rashidinia-Zarebnia [3]. Assume the following two conditions:
(SE1) u ∈ H^∞(ψ^{SE}(D_d)),
(SE2) k(t, ·) ∈ H^∞(ψ^{SE}(D_d)) for all t ∈ [a, b].
Then the integral K[u](t) := ∫_a^b k(t, s)u(s) ds in (1) can be approximated by
K_N^{SE}[u](t) := h Σ_{j=−N}^{N} k(t, ψ^{SE}(jh)) u(ψ^{SE}(jh)) ψ^{SE′}(jh).
The mesh size h here is chosen as h = √(2πd/N). Then, corresponding to the original equation u = (g + Ku)/λ, we consider the new equation:
u_N^{SE}(t) = (g(t) + K_N^{SE}[u_N^{SE}](t))/λ. (7)
The approximated solution u_N^{SE} is obtained by determining the unknown coefficients in K_N^{SE} u_N^{SE}, i.e.,
u_n^{SE} = [u_N^{SE}(ψ^{SE}(−Nh)), . . . , u_N^{SE}(ψ^{SE}(Nh))]^T,
where n = 2N + 1. To this end, let us discretize (7) at t = ψ^{SE}(ih) (i = −N, . . . , N), and consider the resulting system of linear equations
(λI_n − K_n^{SE}) u_n^{SE} = g_n^{SE}, (8)
where K_n^{SE} is an n × n matrix whose (i, j) element is
(K_n^{SE})_{ij} = h k(ψ^{SE}(ih), ψ^{SE}(jh)) ψ^{SE′}(jh), i, j = −N, . . . , N,
and g_n^{SE} is an n-dimensional vector defined by
g_n^{SE} = [g(ψ^{SE}(−Nh)), . . . , g(ψ^{SE}(Nh))]^T.
By solving the system (8), the desired solution u_N^{SE} is obtained. This is called the SE-Sinc-Nyström method.

2.3 DE-Sinc-Nyström method
Next we explain the method derived by Muhammad et al. [2]. Assume the following two conditions:
(DE1) u ∈ H^∞(ψ^{DE}(D_d)),
(DE2) k(t, ·) ∈ H^∞(ψ^{DE}(D_d)) for all t ∈ [a, b].
Then the integral Ku in (1) can be approximated by
K_N^{DE}[u](t) := h Σ_{j=−N}^{N} k(t, ψ^{DE}(jh)) u(ψ^{DE}(jh)) ψ^{DE′}(jh).
The mesh size h here is chosen as h = log(4dN)/N. Then, instead of the original equation u = (g + Ku)/λ, we consider the new equation:
u_N^{DE}(t) = (g(t) + K_N^{DE}[u_N^{DE}](t))/λ. (9)
To obtain the approximated solution u_N^{DE}, we have to determine the unknown coefficients in K_N^{DE} u_N^{DE}, i.e.,
u_n^{DE} = [u_N^{DE}(ψ^{DE}(−Nh)), . . . , u_N^{DE}(ψ^{DE}(Nh))]^T.
By discretizing (9) at t = ψ^{DE}(ih) (i = −N, . . . , N), we have the linear system:
(λI_n − K_n^{DE}) u_n^{DE} = g_n^{DE}, (10)
where K_n^{DE} is an n × n matrix whose (i, j) element is
(K_n^{DE})_{ij} = h k(ψ^{DE}(ih), ψ^{DE}(jh)) ψ^{DE′}(jh), i, j = −N, . . . , N,
and g_n^{DE} is an n-dimensional vector defined by
g_n^{DE} = [g(ψ^{DE}(−Nh)), . . . , g(ψ^{DE}(Nh))]^T.
By solving the system (10), the desired solution u_N^{DE} is obtained. This is called the DE-Sinc-Nyström method.

3. Boundedness of the condition numbers
3.1 Main result
The main contribution of this paper is Theorem 2 below.
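Before the main result, a minimal runnable sketch of the quadrature rules (5) (SE) and (6) (DE) on the smooth test integral ∫_0^{π/2} s³ ds = (π/2)⁴/4; the integrand, N and the values of d used here are our illustrative choices, not the paper's:

```python
import math

# Sinc quadrature combined with the SE transformation (5) and the DE
# transformation (6) on I = \int_0^{pi/2} s^3 ds = (pi/2)^4 / 4.

a, b = 0.0, math.pi / 2
N = 30
exact = (math.pi / 2) ** 4 / 4

def f(s):
    return s ** 3

# SE rule: psi(x) = (b-a)/2 tanh(x/2) + (b+a)/2, mesh h = sqrt(2 pi d / N)
d_se = 3.0
h_se = math.sqrt(2 * math.pi * d_se / N)
se = h_se * sum(
    f((b - a) / 2 * math.tanh(j * h_se / 2) + (b + a) / 2)
    * (b - a) / (4 * math.cosh(j * h_se / 2) ** 2)          # psi'(jh)
    for j in range(-N, N + 1))

# DE rule: psi(x) = (b-a)/2 tanh((pi/2) sinh x) + (b+a)/2, mesh h = log(4dN)/N
d_de = 1.5
h_de = math.log(4 * d_de * N) / N
de = h_de * sum(
    f((b - a) / 2 * math.tanh(math.pi / 2 * math.sinh(j * h_de)) + (b + a) / 2)
    * (b - a) / 2 * (math.pi / 2) * math.cosh(j * h_de)
    / math.cosh(math.pi / 2 * math.sinh(j * h_de)) ** 2     # psi'(jh)
    for j in range(-N, N + 1))
```

Already at N = 30 both rules are many digits below single precision, in line with the exponential rates these transformations are designed to deliver.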


Theorem 2 Let the function k be continuous on [a, b] × [a, b]. Furthermore, suppose that the homogeneous equation (λI − K)f = 0 has only the trivial solution f ≡ 0. Then there exists a positive integer N_0 such that for all N ≥ N_0 the matrices (λI_n − K_n^{SE}) and (λI_n − K_n^{DE}) have bounded inverses. Furthermore, there exist constants C^{SE} and C^{DE} independent of N such that for all N ≥ N_0
∥λI_n − K_n^{SE}∥_∞ ∥(λI_n − K_n^{SE})^{−1}∥_∞ ≤ C^{SE}, (11)
∥λI_n − K_n^{DE}∥_∞ ∥(λI_n − K_n^{DE})^{−1}∥_∞ ≤ C^{DE}. (12)

3.2 Sketch of the proof
In what follows we write C = C([a, b]) for short. The next result plays an important role in proving Theorem 2.

Lemma 3 (Okayama et al. [4, in the proofs of Theorems 6.3 and 8.2]) Suppose that the assumptions in Theorem 2 are fulfilled. Then there exist constants C_1 and C_2 independent of N such that for all N
∥K_N^{SE}∥_{L(C,C)} ≤ C_1, ∥K_N^{DE}∥_{L(C,C)} ≤ C_2.
Furthermore, there exists a positive integer N_0 such that for all N ≥ N_0 the operators (λI − K_N^{SE}) and (λI − K_N^{DE}) have bounded inverses, and
∥(λI − K_N^{SE})^{−1}∥_{L(C,C)} ≤ C_3, ∥(λI − K_N^{DE})^{−1}∥_{L(C,C)} ≤ C_4
hold, where C_3 and C_4 are constants independent of N.

In view of this, we see that Theorem 2 is established if the following lemma is shown.

Lemma 4 Suppose that the assumptions in Theorem 2 are fulfilled. Then we have
∥λI_n − K_n^{SE}∥_∞ ≤ ∥λI − K_N^{SE}∥_{L(C,C)}, (13)
∥λI_n − K_n^{DE}∥_∞ ≤ ∥λI − K_N^{DE}∥_{L(C,C)}. (14)
Furthermore, if the inverse operators (λI − K_N^{SE})^{−1} and (λI − K_N^{DE})^{−1} exist, then the matrices (λI_n − K_n^{SE})^{−1} and (λI_n − K_n^{DE})^{−1} also exist, and we have
∥(λI_n − K_n^{SE})^{−1}∥_∞ ≤ ∥(λI − K_N^{SE})^{−1}∥_{L(C,C)}, (15)
∥(λI_n − K_n^{DE})^{−1}∥_∞ ≤ ∥(λI − K_N^{DE})^{−1}∥_{L(C,C)}. (16)
We prove this lemma below.

3.3 Proofs
The existence of the inverse matrix (λI_n − K_n^{SE})^{−1} is shown by the following lemma.

Lemma 5 Suppose that the assumptions in Theorem 2 are fulfilled, and let g ∈ C([a, b]). Then the following two statements are equivalent:
(A) The equation (λI − K_N^{SE})v = g has a unique solution v ∈ C.
(B) The system of linear equations (λI_n − K_n^{SE})c_n = g_n^{SE} has a unique solution c_n ∈ R^n.

Proof We show (A) ⇒ (B) first. Using the unique solution v ∈ C, define the vector c_n ∈ R^n as c_n = [v(ψ^{SE}(−Nh)), . . . , v(ψ^{SE}(Nh))]^T. Clearly this c_n is a solution of the linear system in (B), which shows the existence of a solution. The uniqueness is shown as follows. Suppose that there exists another solution c̃_n = [c̃_{−N}, . . . , c̃_N]^T. Define a function ṽ ∈ C as
ṽ(t) = (1/λ){g(t) + h Σ_{j=−N}^{N} k(t, ψ^{SE}(jh)) c̃_j ψ^{SE′}(jh)}. (17)
At the points t_i = ψ^{SE}(ih) (i = −N, . . . , N), clearly
λṽ(t_i) = g(t_i) + h Σ_{j=−N}^{N} k(t_i, t_j) c̃_j ψ^{SE′}(jh) (18)
holds. On the other hand,
λc̃_i = g(t_i) + h Σ_{j=−N}^{N} k(t_i, t_j) c̃_j ψ^{SE′}(jh) (19)
holds since c̃_n is a solution of the linear system. And since the right-hand side of (18) is equal to that of (19), we conclude ṽ(t_i) = c̃_i. Therefore (18) can be rewritten as (λI − K_N^{SE})ṽ = g, which means ṽ is a solution of the equation in (A). From the uniqueness of the equation, v ≡ ṽ holds, which implies c_n = c̃_n. This shows the desired uniqueness.
Next we show (B) ⇒ (A). Let c̃_n = [c̃_{−N}, . . . , c̃_N]^T be the unique solution in (B), and define a function ṽ ∈ C by (17). Then by the same argument as above, we can conclude ṽ is a solution of the equation in (A), which shows the existence. The uniqueness is shown as follows. Suppose that there exists another solution v ∈ C. Define the vector c_n ∈ R^n as c_n = [v(ψ^{SE}(−Nh)), . . . , v(ψ^{SE}(Nh))]^T. Then clearly c_n is a solution of the linear system in (B). From the uniqueness of the linear system, we have c_n = c̃_n. Therefore v(ψ^{SE}(jh)) = c̃_j, and v can be rewritten as
v(t) = (1/λ){g(t) + h Σ_{j=−N}^{N} k(t, ψ^{SE}(jh)) c̃_j ψ^{SE′}(jh)}. (20)
In view of (17) and (20), we have v ≡ ṽ, which shows the desired uniqueness. (QED)

In the same manner we can prove the following lemma for the DE-Sinc-Nyström method. The proof is omitted.

Lemma 6 Suppose that the assumptions in Theorem 2 are fulfilled, and let g ∈ C([a, b]). Then the following two statements are equivalent:
(A) The equation (λI − K_N^{DE})v = g has a unique solution v ∈ C.
(B) The system of linear equations (λI_n − K_n^{DE})c_n = g_n^{DE} has a unique solution c_n ∈ R^n.

Thus the existence of the inverse matrix is guaranteed in both cases (SE and DE). The remaining task is to show (13)–(16). We show only (13) and (15), since (14) and (16) are shown in the same manner.

Proof of Lemma 4 We show (13) first. Let c_n = [c_{−N}, . . . , c_N]^T be an arbitrary n-dimensional vector. Pick a function γ ∈ C that satisfies γ(ψ^{SE}(ih)) = c_i (i = −N, . . . , N) and ∥γ∥_C = ∥c_n∥_∞. Using this function γ, define a function f ∈ C as f = (λI − K_N^{SE})γ.


[Fig. 1. Error of the Sinc-Nyström methods for (21): maximum error versus N (from 1 down to about 1e−16) for the SE-Sinc-Nyström and DE-Sinc-Nyström methods. Fig. 2. Condition number of the coefficient matrix appearing in the Sinc-Nyström methods for (21): condition number (up to about 140) versus N.]

Define also a vector f_n as f_n = [f(ψ^{SE}(−Nh)), . . . , f(ψ^{SE}(Nh))]^T. Then we have
∥(λI_n − K_n^{SE})c_n∥_∞ = ∥f_n∥_∞
≤ ∥f∥_C
= ∥(λI − K_N^{SE})γ∥_C
≤ ∥λI − K_N^{SE}∥_{L(C,C)} ∥γ∥_C
= ∥λI − K_N^{SE}∥_{L(C,C)} ∥c_n∥_∞,
from which (13) follows.
Next we show (15). Notice that the inverse matrix (λI_n − K_n^{SE})^{−1} exists from Lemma 5. Let c_n be an arbitrary n-dimensional vector. In the same manner as above, pick a function γ ∈ C. Define a function f ∈ C as f = (λI − K_N^{SE})^{−1}γ, and a vector f_n in the same way as above. The difference from the above is in f; (λI − K_N^{SE}) is replaced with (λI − K_N^{SE})^{−1}. Then we have
∥(λI_n − K_n^{SE})^{−1}c_n∥_∞ = ∥f_n∥_∞
≤ ∥f∥_C
= ∥(λI − K_N^{SE})^{−1}γ∥_C
≤ ∥(λI − K_N^{SE})^{−1}∥_{L(C,C)} ∥γ∥_C
= ∥(λI − K_N^{SE})^{−1}∥_{L(C,C)} ∥c_n∥_∞,
from which (15) follows. This completes the proof. (QED)

4. Numerical example
In this section we show numerical results for
u(t) − ∫_0^{π/2} (ts)^{3/2} u(s) ds = √t (1 − (π³/24)t), 0 ≤ t ≤ π/2, (21)
which has also been conducted by Muhammad et al. [2, Example 4.3]. The exact solution is u(t) = √t. Let us first check the conditions described in Sections 2.2 and 2.3. The conditions (SE1) and (SE2) are satisfied with d = π − ϵ, and (DE1) and (DE2) are satisfied with d = (π − ϵ)/2, where ϵ is an arbitrarily small positive number (we set ϵ = π − 3.14 in our computation).
Based on this information, we implemented the SE-Sinc-Nyström method and the DE-Sinc-Nyström method in C++ with double-precision floating-point arithmetic. The errors |u(t) − u_N^{SE}(t)| and |u(t) − u_N^{DE}(t)| were investigated at 1000 equally-spaced points on [0, π/2], and the maximum of them is shown in Fig. 1. We can observe the rate O(exp(−c_1√N)) in the SE-Sinc-Nyström method, and O(exp(−c_2 N/log N)) in the DE-Sinc-Nyström method. These results can be explained by combining the existing estimates (2) and (3) with the new result (Theorem 2). Furthermore, from Fig. 2 we can also confirm the boundedness of the condition numbers, i.e., the estimates (11) and (12).

5. Concluding remarks
The Sinc-Nyström methods for (1) have been known as efficient methods in the sense that exponential convergence can be attained. However, the convergence had not been guaranteed theoretically, since in the existing estimates (2) and (3) there remained unestimated terms: ∥A_N^{−1}∥_2 and ∥Ã_N^{−1}∥_2 (A_N = λI_n − K_n^{DE} and Ã_N = λI_n − K_n^{SE}). In this paper we showed theoretically that ∥A_N^{−1}∥_∞ and ∥Ã_N^{−1}∥_∞ are bounded, from which exponential convergence of the methods is guaranteed. Furthermore we showed that ∥A_N∥_∞ and ∥Ã_N∥_∞ are also bounded, and consequently the condition numbers are bounded, as stated in Theorem 2.
Muhammad et al. [2] have also developed Sinc-Nyström methods for Volterra integral equations, and a similar result to this paper can be shown for them. We are now working on this issue, and the result will be reported somewhere else soon.

Acknowledgments
This work was supported by Grants-in-Aid for Scientific Research, MEXT, Japan.

References
[1] P. K. Kythe and P. Puri, Computational Methods for Linear Integral Equations, Birkhäuser, Boston, MA, 2002.
[2] M. Muhammad, A. Nurmuhammad, M. Mori and M. Sugihara, Numerical solution of integral equations by means of the Sinc collocation method based on the double exponential transformation, J. Comput. Appl. Math., 177 (2005), 269–286.
[3] J. Rashidinia and M. Zarebnia, Convergence of approximate solution of system of Fredholm integral equations, J. Math. Anal. Appl., 333 (2007), 1216–1227.
[4] T. Okayama, T. Matsuo and M. Sugihara, Improvement of a Sinc-collocation method for Fredholm integral equations of the second kind, BIT Numer. Math., 51 (2011), 339–366.
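As a supplement, a small runnable sketch of the SE-Sinc-Nyström method of Section 2.2 applied to the test equation (21); the paper's experiments are implemented in C++, so the choice of N and the plain Gaussian elimination below are our illustrative choices:

```python
import math

# SE-Sinc-Nystrom method for (21): build (lambda I_n - K_n^{SE}) u = g with
# lambda = 1, k(t,s) = (ts)^{3/2}, g(t) = sqrt(t)(1 - pi^3 t / 24), and compare
# against the exact solution u(t) = sqrt(t) at the Sinc points.

a, b = 0.0, math.pi / 2
d = 3.14                         # d = pi - eps with eps = pi - 3.14, as in Section 4
N = 30
h = math.sqrt(2 * math.pi * d / N)
n = 2 * N + 1

def psi(x):                      # SE transformation
    return (b - a) / 2 * math.tanh(x / 2) + (b + a) / 2

def dpsi(x):                     # its derivative
    return (b - a) / (4 * math.cosh(x / 2) ** 2)

def k(t, s):                     # kernel of (21)
    return (t * s) ** 1.5

def g(t):                        # right-hand side of (21)
    return math.sqrt(t) * (1.0 - math.pi ** 3 * t / 24.0)

t = [psi((j - N) * h) for j in range(n)]
A = [[(1.0 if i == j else 0.0) - h * k(t[i], t[j]) * dpsi((j - N) * h)
      for j in range(n)] for i in range(n)]
rhs = [g(ti) for ti in t]

# Gaussian elimination with partial pivoting
for col in range(n):
    p = max(range(col, n), key=lambda r: abs(A[r][col]))
    A[col], A[p] = A[p], A[col]
    rhs[col], rhs[p] = rhs[p], rhs[col]
    for r in range(col + 1, n):
        fac = A[r][col] / A[col][col]
        for c in range(col, n):
            A[r][c] -= fac * A[col][c]
        rhs[r] -= fac * rhs[col]
u = [0.0] * n
for r in range(n - 1, -1, -1):
    u[r] = (rhs[r] - sum(A[r][c] * u[c] for c in range(r + 1, n))) / A[r][r]

err = max(abs(u[j] - math.sqrt(t[j])) for j in range(n))
```

Already for N = 30 the error at the Sinc points is far below 1e-5, consistent with the O(exp(−c_1√N)) rate observed in Fig. 1, and the system stays well conditioned as Theorem 2 asserts.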

JSIAM Letters Vol. 3 (2011) pp. 85–88 © 2011 Japan Society for Industrial and Applied Mathematics

A modified Calogero-Bogoyavlenskii-Schiff equation with variable coefficients and its non-isospectral Lax pair

Tadashi Kobayashi1 and Kouichi Toda2,3

1 Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan
2 Department of Mathematical Physics, Toyama Prefectural University, Kurokawa 5180, Imizu, Toyama 939-0398, Japan
3 Research and Education Center for Natural Sciences, Hiyoshi Campus, Keio University, 4-1-1 Hiyoshi, Kouhoku-ku, Yokohama 223-8521, Japan
E-mail tadashi@amp.i.kyoto-u.ac.jp
Received May 9, 2011, Accepted October 11, 2011

Abstract
In this paper, we present a modified version with variable coefficients of the (2 + 1) dimensional Korteweg-de Vries, or Calogero-Bogoyavlenskii-Schiff, equation, derived by applying the Painlevé test. Its Lax pair with a non-isospectral condition in (2 + 1) dimensions is also given. Moreover, a transformation which links the form with variable coefficients to the canonical one is shown.
Keywords integrable equation with variable coefficients, Painlevé property, Lax pair
Research Activity Group Applied Integrable Systems

1. Introduction
Over the last three decades, many mathematicians and physicists have studied nonlinear integrable systems from various perspectives. They have remarkable applications to many physical systems such as hydrodynamics, nonlinear optics, plasma physics, field theories and so on [1–3]. Generally, the notion of a nonlinear integrable system is characterized by several features: solitons [4–8], Lax pairs [9–11], Painlevé tests [12–18] and so on. The integrable system has "good" nature as previously described. Moreover, solitons are a major attractive issue in mechanical and engineering sciences as well as mathematical and physical ones. For instance, a real ocean is inhomogeneous, and the dynamics of nonlinear waves is strongly influenced by refraction, geometric divergence and so on.

The physical phenomena in which many nonlinear integrable equations with constant coefficients arise tend to be very highly idealized. Therefore, equations with variable coefficients may provide various models for real physical phenomena, for example, in the propagation of small-amplitude surface waves, which run on straits or large channels of slowly varying depth and width. On one hand, variable-coefficient generalizations of nonlinear integrable equations are a currently exciting subject [19–22] (and also [21, Refs. [24–45]]). Many researchers have mainly investigated (1 + 1) dimensional nonlinear integrable systems with constant coefficients in order to discover new nonlinear integrable systems. On the other hand, there are few research studies that find nonlinear integrable systems with variable coefficients, since they are essentially complicated. Analysis of higher dimensional systems is also an active issue in nonlinear integrable systems, and the study of nonlinear integrable equations in higher dimensions with variable coefficients has attracted much attention. The main aim of this paper is therefore to construct a (2 + 1) dimensional integrable version of the modified Korteweg-de Vries (KdV) equation with variable coefficients.

It is widely known that the Painlevé test in the sense of the Weiss-Tabor-Carnevale (WTC) method [13] is a powerful tool to investigate integrable equations with variable coefficients. We will discuss the following higher dimensional nonlinear evolution equation with variable coefficients for q = q(x, z, t):

  q_t + a(x,z,t) q_{xxz} + b(x,z,t) q^2 q_z + c(x,z,t) q_x ∂_x^{-1}(q^2)_z
    + d(x,z,t) q + e(x,z,t) q_x + f(x,z,t) q_z = 0,   (1)

where a(x,z,t) ≠ 0, b(x,z,t) ≠ 0, c(x,z,t) ≠ 0, b(x,z,t) + c(x,z,t) ≠ 0; subscripts with respect to independent variables denote partial derivatives, and ∂_x^{-1} is the integral operator, ∂_x^{-1} q := ∫^x q(X) dX. Here (and hereafter) a(x,z,t), b(x,z,t), ..., f(x,z,t) are coefficient functions of the two spatial variables x, z and the temporal one t. We will carry out the WTC method for (1) and present a set of coefficient functions. Equations of the form (1) include one of the integrable higher dimensional modified KdV equations:

  q_t − (1/4) q^2 q_z − (1/8) q_x ∂_x^{-1}(q^2)_z + (1/4) q_{xxz} = 0,   (2)

which is called the modified Calogero-Bogoyavlenskii-Schiff (CBS) equation [23]. Eq. (2) can be dimensionally reduced to the standard modified KdV equation for q = q(x, t):

  q_t − (3/8) q^2 q_x + (1/4) q_{xxx} = 0,   (3)

by a dimensional reduction ∂_z = ∂_x; indeed, with ∂_z = ∂_x one has ∂_x^{-1}(q^2)_z = q^2, so the second and third terms of (2) combine into −(3/8) q^2 q_x. Here (and hereafter) ∂_x = ∂/∂x and so on.

The plan of this paper is as follows. In Section 2, we review the process of the WTC method for (1) in brief. In Section 3, we construct its Lax pair. In Section 4, we prove that the equation with variable coefficients can be reduced to the canonical form by a certain transformation. Section 5 is devoted to conclusions.

2. Painlevé test of (1)
Weiss et al. claimed in [13] that a partial differential equation (PDE) has the Painlevé property if the solutions of the PDE are single-valued about the movable singularity manifold. They proposed a technique which determines whether a given PDE is integrable or not; this technique is called the WTC method. Now we apply the WTC method to (1). In order to eliminate the integral operator, we rewrite (1) as the coupled system

  q_t + a(x,z,t) q_{xxz} + b(x,z,t) q^2 q_z + c(x,z,t) q_x r
    + d(x,z,t) q + e(x,z,t) q_x + f(x,z,t) q_z = 0,   (4)
  r_x = (q^2)_z.   (5)

We now look for solutions of (4) and (5) in the Laurent series expansion with ϕ = ϕ(x,z,t):

  q = Σ_{j=0}^∞ q_j ϕ^{j−α},   r = Σ_{j=0}^∞ r_j ϕ^{j−β},   (6)

where q_j = q_j(x,z,t) and r_j = r_j(x,z,t) are analytic functions in a neighborhood of ϕ = 0. In this case, the leading orders are α = 1 and β = 2, and

  q_0 = i √(6a/(b+c)) ϕ_x,   r_0 = −(6a/(b+c)) ϕ_x ϕ_z,

are obtained, where i^2 = −1. To find the resonances, we substitute the Laurent series expansions (6) into (4) and (5). Rearranging (4) into terms of ϕ^{j−4} and higher powers of ϕ, we obtain recurrence relations for q_j and r_j:

  { (j−3) b q_0^2 q_j ϕ_z + c ϕ_x [ (j−1) r_0 q_j − q_0 r_j ]
    + (j−1)(j−2)(j−3) a q_j ϕ_x^2 ϕ_z } ϕ^{j−4} = δ_j.   (7)

Similarly, rearranging (5) into terms of ϕ^{j−3} and higher powers of ϕ, we have

  (j−2)(2 q_0 q_j ϕ_z − r_j ϕ_x) ϕ^{j−3} = σ_j.   (8)

Here δ_j and σ_j are given in terms of q_ℓ and r_ℓ (0 ≤ ℓ ≤ j−1). Then we get the following resonances:

  j = −1, 2, 3, 4.   (9)

Let us note here that the resonance j = −1 in (9) corresponds to the arbitrary singularity manifold ϕ. If the recurrence relations are consistently satisfied at the resonances, then the differential equations are said to possess the Painlevé property.

Eqs. (7) and (8) must be satisfied in the respective powers of ϕ. Requiring that every power of ϕ (ϕ^{−4+k} and ϕ^{−3+k} with positive integer k in (7) and (8), respectively) should vanish, we obtain the consistency conditions as follows.

(ϕ^{−3}, ϕ^{−2}):

  q_1 = (1/(2(b+c) q_0 ϕ_z)) { b q_0^2 q_{0,z} + c (2 q_0^2 q_{0,z} + r_0 q_{0,x} − q_0 r_{0,x})
        + 2a [ ϕ_x (q_{0,z} ϕ_x + 2 q_{0,x} ϕ_z + 2 q_0 ϕ_{xz}) + q_0 ϕ_{xx} ϕ_z ] },

  r_1 = (1/((b+c) q_0 ϕ_x)) { c r_0 q_{0,x} + b q_0 (r_{0,x} − q_0 q_{0,z})
        + 2a [ ϕ_x (q_{0,z} ϕ_x + 2 q_{0,x} ϕ_z + 2 q_0 ϕ_{xz}) + q_0 ϕ_{xx} ϕ_z ] }.

(ϕ^{−2}, ϕ^{−1}):

  q_0 ϕ_t − (f q_0 + b q_0^2 q_1 + b q_0^2 q_2 + a q_{0,xx}) ϕ_z
    + (c q_2 r_0 − c q_0 r_2 − e q_0 − 2a q_{0,xz}) ϕ_x − 2a q_{0,x} ϕ_{xz}
    − a q_{0,z} ϕ_{xx} − a q_0 ϕ_{xxz} + 2b q_0 q_1 q_{0,z} + b q_0^2 q_{1,z}
    + c r_1 q_{0,x} + c r_0 q_{1,x} = 0,   (10)

  r_{1,x} − 2(q_0 q_1)_z = 0.   (11)

It follows from (10) and (11) that one of the two variables (q_2, r_2) must be arbitrary, which requires

  4b^3 a_{xx} − b^2 [ a_x (8b_x + 11c_x) − 9c a_{xx} + 4a (b_{xx} + c_{xx}) ]
    − c { a (b_x − 2c_x)(b_x + c_x) − c^2 a_{xx} + c [ a_x (b_x − 2c_x) + a (b_{xx} + c_{xx}) ] }
    − b { −a (b_x + c_x)(8b_x + 11c_x) − 6c^2 a_{xx} + c [ a_x (7b_x + 13c_x) + 5a (b_{xx} + c_{xx}) ] } = 0,   (12)

  2b^3 a_{xz} − b^2 [ a_x (2b_z + 5c_z) + a_z (2b_x + 5c_x) + 2a (b_{xz} + c_{xz}) ]
    + b { a [ b_z (4b_x + 7c_x) + c_z (7b_x + 10c_x) ] − 6c^2 a_{xz} + c [ a_x (5b_x − c_x) + 2a (b_{xz} + c_{xz}) ] }
    + c { −a [ c_z (11b_x + 8c_x) + b_z (14b_x + 11c_x) ] − 4c^2 a_{xz}
        + c [ a_x (7b_z + 4c_z) + a_z (7b_x + 4c_x) + 4a (b_{xz} + c_{xz}) ] } = 0,   (13)

  2b^3 a_{zz} − b^2 [ a_z (4b_z + c_z) − 9c a_{zz} + 2a (b_{zz} + c_{zz}) ]
    + c { a (13b_z + 10c_z)(b_z + c_z) + 5c^2 a_{zz} − c [ a_z (13b_z + 10c_z) + 5a (b_{zz} + c_{zz}) ] }
    + b { a (b_z + c_z)(4b_z + c_z) + 12c^2 a_{zz} − c [ a_z (17b_z + 11c_z) + 7a (b_{zz} + c_{zz}) ] } = 0,   (14)

  (b + c)(2b + 5c)[ a_z (b + c) − a (b + c)_z ] = 0.   (15)

Subsequent coefficients q_j and r_j are determined from (7) and (8). However, from the consistency condition, they must include arbitrary functions at the resonances. We take into account two cases, case (i) c(x, z, t) =


−(2/5)b(x, z, t) and case (ii) a(x, z, t) = a(x, t)(b(x, z, t) + c(x, z, t)), from (15).

case (i): c(x, z, t) = −(2/5)b(x, z, t)
We obtain the constraints a = a(z, t) and b = b(z, t) from (12)–(14). However, the relation ab = 0 appears at the next calculation (ϕ^{−1}, ϕ^0). This contradicts the initial conditions a ≠ 0 and b ≠ 0; that is, the Painlevé test fails in this case.

case (ii): a(x, z, t) = a(x, t)(b(x, z, t) + c(x, z, t))
From (12)–(14), we obtain the following equations:

  4b^2 a_{xx} + c^2 a_{xx} + 3a_x b_x c + 5bc a_{xx} − 3a_x b c_x = 0,   (16)
  a_x (b c_z − b_z c) = 0.   (17)

From (17), we obtain the following two cases: case (ii-1) c(x, z, t) = c(x, t) b(x, z, t) and case (ii-2) a(x, t) = a(t).

case (ii-1): c(x, z, t) = c(x, t) b(x, z, t)
From (12)–(14) we obtain a relation for c(x, t), c(x, t) = 3/(1 − c(t)^3 a_x(x, t)) − 4. However, the relation b = 0 appears at the next calculation (ϕ^{−1}, ϕ^0). This contradicts the initial condition b ≠ 0; that is, the Painlevé test fails in this case.

case (ii-2): a(x, t) = a(t)
The compatibility condition is satisfied in this case.

(ϕ^{−1}, ϕ^0):

  c q_0 [ r_3 ϕ_x − 2(q_0 q_3 + q_1 q_2) ϕ_z − (2 q_0 q_2 − q_1^2)_z ] + F = 0,   (18)
  r_3 ϕ_x − 2(q_0 q_3 + q_1 q_2) ϕ_z − (2 q_0 q_2 − q_1^2)_z = 0,   (19)

(ϕ^0, ϕ^1):

  c q_0 [ r_4 ϕ_x − (2 q_0 q_4 + 2 q_1 q_3 + q_2^2) ϕ_z − (q_0 q_3 + q_1 q_2)_z ] + G = 0,   (20)
  r_4 ϕ_x − (2 q_0 q_4 + 2 q_1 q_3 + q_2^2) ϕ_z − 2(q_0 q_3 + q_1 q_2)_z = 0.   (21)

It follows from (18)–(21) that one of the two variables must be arbitrary in both pairs (q_3, r_3) and (q_4, r_4), and similarly F in (18) and G in (20) must vanish. Hence the following equations are obtained from F = 0:

  b − 2c = 0,   b_x − c_x = 0,   b_x c − b c_x = 0,   c_x f − c f_x = 0,
  c_x = 0,   c a' + 2a (c d + c_x e − c e_x) = 0,   (22)

where ' denotes the ordinary derivative with respect to t. Hence, from (22),

  b(x, z, t) = b(z, t),   c(x, z, t) = b(z, t)/2,
  d(x, z, t) = e_0(z, t) − (1/2)(a'(t)/a(t)),   f(x, z, t) = f(z, t),

and then, from G = 0,

  e_{xx} = 0  ⇒  e(x, z, t) = x e_0(z, t) + e_1(z, t),

are obtained, respectively. Therefore the equation of the form

  q_t + (3/2) a(t) b(z,t) q_{xxz} + b(z,t) q^2 q_z + (1/2) b(z,t) q_x ∂_x^{-1}(q^2)_z
    + ( e_0(z,t) − (1/2) a'(t)/a(t) ) q + ( x e_0(z,t) + e_1(z,t) ) q_x + f(z,t) q_z = 0,   (23)

admits the sufficient number of arbitrary functions corresponding to the resonances; namely, it passes the Painlevé test in the sense of the WTC method. This means that we have succeeded in finding the modified CBS equation with variable coefficients (23). We used MATHEMATICA [24] to handle the calculations for the existence of arbitrary functions at the above resonances.

3. Lax pair of (23)
It is well known that the Lax pair plays a key role in the theory of integrable systems. Consider two operators L and T, which are called the non-isospectral Lax pair and given by

  Lψ = λψ,   Tψ = 0,

with λ a non-isospectral parameter [23, 25, 26] that is independent only of x. Then the commutation relation

  [L, T] ≡ LT − TL = 0,   (24)

contains a nonlinear evolution equation for suitably chosen operators L and T. Eq. (24) is called the Lax equation. The Lax pair of (23) is as follows:

  L = i √(3a(t)/2) ∂_x^2 + q ∂_x + (3i/(4√(6a(t)))) ((4a(t) e_0(z,t) − a'(t))/(2 b(z,t))) ∂_z,   (25)

  T = i √(3a(t)/2) ∂_x^2 ∂_z + (1/2)(∂_x^{-1} q_z) ∂_x^2 + q ∂_x ∂_z
      + (i/(2√(6a(t) b(z,t)))) { [ x e_0(z,t) + e_1(z,t) + b(z,t) ∂_x^{-1}(q q_z)
      − 2 b(z,t) ∂_x^{-1}(q_x ∂_x^{-1} q_z) − 3i √(3a(t)/2) b(z,t) q_z ] ∂_x }
      + f(z,t) ∂_z + ∂_t,   (26)

with a constraint condition:

  (a'(t)/a(t)) ∂_x^{-1}( (4a(t) e_0(z,t) − a'(t))/b(z,t) ) − 8a(t) e_0(z,t) f(z,t)/(3 b(z,t))
    − (2/3) ∂_z^{-1}( (4 e_0(z,t) a'(t) − a''(t) + 4a(t) e_{0,t}(z,t))/b(z,t) )
    − b_t(z,t) (4a(t) e_0(z,t) − a'(t))/b(z,t)^2 + 2a'(t) f(z,t)/(3 b(z,t)) = 0.   (27)
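A mechanical consistency check is possible here (our own, with a sample coefficient a(t) = 2 + sin t chosen purely for illustration): the constraint (27) yields e_0(z,t) = a'(t)/(4a(t)) (relation (29) below), under which the coefficient of q in (23), e_0 − (1/2)a'/a, collapses to the −a'(t)/(4a(t)) appearing in (30).

```python
import math

# Sample variable coefficient a(t) (an assumption for illustration); a'(t) = cos t
a  = lambda t: 2.0 + math.sin(t)
ap = lambda t: math.cos(t)

t0 = 0.9
e0 = ap(t0) / (4.0 * a(t0))              # relation (29): e_0 = a'/(4a)
coef_23 = e0 - 0.5 * ap(t0) / a(t0)      # coefficient of q in (23)
coef_30 = -ap(t0) / (4.0 * a(t0))        # coefficient of q in (30)
diff = abs(coef_23 - coef_30)
```

The two coefficients agree to rounding error for any t, since a'/(4a) − a'/(2a) = −a'/(4a) identically.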


Notice here that λ = λ(z, t) satisfies the non-isospectral condition

  λ_t + [ f(z,t) − (b(z,t)/(2a(t))) ∂_z^{-1}( (4a(t) e_0(z,t) − a'(t))/b(z,t) )
    − 2i √(6a(t) b(z,t)) λ ] λ_z = 0.   (28)

From (27), we obtain the relation

  e_0(z, t) = a'(t)/(4a(t)).   (29)

Using the above, (23) is rewritten as

  q_t + (3/2) a(t) b(z,t) q_{xxz} + b(z,t) q^2 q_z + (1/2) b(z,t) q_x ∂_x^{-1}(q^2)_z
    − (a'(t)/(4a(t))) q + (x a'(t)/(4a(t))) q_x + e_1(z,t) q_x + f(z,t) q_z = 0.   (30)

Note that (30) possesses both the Painlevé property and the Lax pair.

4. Reducibility to the canonical form
We show that (30) can be transformed to the standard modified CBS equation (2) by suitable transformations. As an example, we set the following expressions:

  X = x a(t)^{-1/4},   Z = ∂_z^{-1}(1/b(z,t)),   T = ∂_t^{-1} a(t)^{1/2},
  Q(X, Z, T) = a(t)^{-1/4} q(x, z, t),   e_1(z, t) = 0,
  f(z, t) = −b(z,t) ∂_z^{-1}( (1/b(z,t))_t ),   (31)

for (30). Via this change of the dependent and independent variables, (30) is transformed to the modified CBS equation for Q = Q(X, Z, T):

  Q_T + (3/2) Q_{XXZ} + Q^2 Q_Z + (1/2) Q_X ∂_X^{-1}(Q^2)_Z = 0,

of which N-soliton solutions were given in [26].

5. Concluding remarks
In this paper, we have presented a modified CBS equation with variable coefficients (30), which is integrable in the sense of the Painlevé test and the existence of the Lax pair. Moreover, we can construct its hierarchy by using the Lax-pair Generating Technique [27] for the operator L (25).

Let us note here that taking ∂_z = ∂_t as another dimensional reduction can respectively reduce (2) and (30) to the standard form, and its extension with variable coefficients, of the modified version of the Ablowitz-Kaup-Newell-Segur equation in (2 + 1) dimensions.

By applying the (weak) Painlevé test, we are studying higher dimensional forms with variable coefficients of the nonlinear Schrödinger, Camassa-Holm and Degasperis-Procesi equations in (2 + 1) dimensions, and so on.

Acknowledgments
Many helpful discussions with Dr. T. Tsuchida, Dr. S. Tsujimoto, Professor X.-B. Hu and Professor Y. Nakamura are acknowledged. The authors wish to thank the anonymous referee for careful reading of this manuscript and valuable remarks.

References
[1] G. L. Lamb Jr., Elements of Soliton Theory, Wiley, New York, 1980.
[2] N. N. Akhmediev and A. Ankiewicz, Solitons: Nonlinear Pulses and Beams, Chapman & Hall, London, 1997.
[3] E. Infeld and G. Rowlands, Nonlinear Waves, Solitons and Chaos, 2nd ed., Cambridge Univ. Press, Cambridge, 2000.
[4] M. J. Ablowitz and H. Segur, Solitons and the Inverse Scattering Transform, SIAM, Philadelphia, 1981.
[5] A. Jeffrey and T. Kawahara, Asymptotic Methods of Nonlinear Wave Theory, Pitman Advanced Publ., London, 1982.
[6] P. G. Drazin and R. S. Johnson, Solitons: An Introduction, Cambridge Univ. Press, Cambridge, 1989.
[7] M. J. Ablowitz and P. A. Clarkson, Solitons, Nonlinear Evolution Equations and Inverse Scattering, Cambridge Univ. Press, Cambridge, 1991.
[8] R. Hirota, The Direct Methods in Soliton Theory, Cambridge Univ. Press, Cambridge, 2004.
[9] P. D. Lax, Integrals of nonlinear equations of evolution and solitary waves, Comm. Pure Appl. Math., 21 (1968), 467–490.
[10] F. Calogero and A. Degasperis, Spectral Transform and Solitons I, Elsevier Science, Amsterdam, 1982.
[11] M. Blaszak, Multi-Hamiltonian Theory of Dynamical Systems, Springer-Verlag, Berlin, 1998.
[12] A. Ramani, B. Dorizzi and B. Grammaticos, Painlevé conjecture revisited, Phys. Rev. Lett., 49 (1982), 1539–1541.
[13] J. Weiss, M. Tabor and G. Carnevale, The Painlevé property for partial differential equations, J. Math. Phys., 24 (1983), 522–526.
[14] J. D. Gibbon, P. Radmore, M. Tabor and D. Wood, Painlevé property and Hirota's method, Stud. Appl. Math., 72 (1985), 39–63.
[15] W. H. Steeb and N. Euler, Nonlinear Evolution Equations and Painlevé Test, World Scientific, Singapore, 1989.
[16] A. Ramani, B. Gramaticos and T. Bountis, The Painlevé property and singularity analysis of integrable and non-integrable systems, Phys. Rep., 180 (1989), 159–245.
[17] A. R. Chowdhury, Painlevé Analysis and Its Applications, Chapman & Hall, New York, 1999.
[18] R. Conte (Ed.), The Painlevé Property One Century Later, Springer-Verlag, New York, 1999.
[19] T. Brugarino, Painlevé analysis and reducibility to the canonical form for the nonlinear generalized Schrödinger equation, Nuovo Cimento, 120 (2005), 423–429.
[20] T. Brugarino and M. Sciacca, Singularity analysis and integrability for a HNLS equation governing pulse propagation in a generic fiber optics, Opt. Commun., 262 (2006), 250–256.
[21] T. Kobayashi and K. Toda, The Painlevé test and reducibility to the canonical forms for higher-dimensional soliton equations with variable-coefficients, SIGMA, 2 (2006), 63–72.
[22] T. Brugarino and M. Sciacca, Integrability of an inhomogeneous nonlinear Schrödinger equation in Bose-Einstein condensates and fiber optics, J. Math. Phys., 51 (2010), 093503.
[23] O. I. Bogoyavlenskii, Breaking solitons. III, Math. USSR-Izv., 36 (1991), 129–137.
[24] Wolfram Research, Inc., Mathematica, Version 8.0, http://www.wolfram.com/mathematica/index.en.html.
[25] P. R. Gordoa and A. Pickering, Nonisospectral scattering problems: A key to integrable hierarchies, J. Math. Phys., 40 (1999), 5749–5786.
[26] S. Yu, K. Toda, N. Sasa and T. Fukuyama, N soliton solutions to the Bogoyavlenskii-Schiff equation and a quest for the soliton solution in (3 + 1) dimensions, J. Phys. A: Math. Gen., 31 (1998), 3337–3347.
[27] T. Kobayashi and K. Toda, A generalized KdV-family with variable coefficients in (2 + 1) dimensions, IEICE Trans. Fundamentals, E88-A (2005), 2548–2553.

– 88 – JSIAM Letters Vol.3 (2011) pp.89–92  © 2011 Japan Society for Industrial and Applied Mathematics

A parallel algorithm for incremental orthogonalization based on the compact WY representation

Yusaku Yamamoto1 and Yusuke Hirota1

1 Department of Computational Science, Graduate School of System Informatics, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan
E-mail yamamoto@cs.kobe-u.ac.jp
Received October 4, 2011, Accepted November 28, 2011

Abstract
We present a parallel algorithm for incremental orthogonalization, where the vectors to be orthogonalized are given one by one at each step. It is based on the compact WY representation and always produces vectors that are orthogonal to working accuracy. Moreover, it has large granularity and can be parallelized efficiently. When applied to the GMRES method, this algorithm reduces to a known algorithm by Walker. However, our formulation makes it possible to apply the algorithm to a wider class of incremental orthogonalization problems, as well as to analyze its accuracy theoretically. Numerical experiments demonstrate the accuracy and scalability of the algorithm.
Keywords incremental orthogonalization, Householder transformation, compact WY representation, Arnoldi process, parallel processing
Research Activity Group Algorithms for Matrix / Eigenvalue Problems and their Applications

1. Introduction
Let a_1, a_2, ..., a_m ∈ R^n (m ≤ n) be a set of linearly independent vectors and q_1, q_2, ..., q_m be the vectors obtained by ortho-normalizing them. We consider the situation where (i) a_i (2 ≤ i ≤ m) is not given in advance but is computed from q_1, q_2, ..., q_{i−1}, and (ii) q_i (1 ≤ i ≤ m) is obtained by orthogonalizing a_i against q_1, q_2, ..., q_{i−1} and normalizing the result. We call this type of orthogonalization process incremental orthogonalization. Incremental orthogonalization typically arises in the Arnoldi process for eigenvalue problems [1, 2], linear simultaneous equations (the GMRES method) [3] and the matrix exponential exp(A)x [4]. It also arises in computing eigenvectors of a symmetric tridiagonal matrix by the inverse iteration [2] or the multiple relatively robust representations (MR^3) [5] algorithms when the corresponding eigenvalues are clustered.

The most popular algorithm for incremental orthogonalization is the modified Gram-Schmidt (MGS) method. However, it is inherently sequential, because orthogonalization against q_k can be done only after orthogonalization against q_{k−1} has been completed. To parallelize the MGS method, one has to parallelize the innermost loops of one orthogonalization operation, a_i' = a_i − (a_i · q_k) q_k. This causes O(m^2) interprocessor synchronizations and degrades parallel performance. The MGS method also has the drawback that the deviation from orthogonality of q_1, q_2, ..., q_m increases proportionally with κ(A), the condition number of A ≡ [a_1, a_2, ..., a_m] [6].

Another approach is to repeat the classical Gram-Schmidt (CGS) method twice to orthogonalize a_i against q_1, q_2, ..., q_{i−1}. With this approach, orthogonalization against q_1, q_2, ..., q_{i−1} can be done in parallel and the number of interprocessor synchronizations is reduced to O(m). It is also shown that the deviation from orthogonality of q_1, q_2, ..., q_m is O(ϵ), where ϵ is the machine epsilon [7]. However, this approach is applicable only when the condition O(ϵκ(A)) < 1 is satisfied.

In [8], Walker proposes to use Householder transformations for incremental orthogonalization arising in the GMRES method. With this approach, there is no restriction on the condition number κ(A), and high orthogonality of q_1, q_2, ..., q_m is always guaranteed. Furthermore, Walker also proposes a blocked variant that aggregates multiple Householder transformations and performs the computation in the form of matrix-vector products. This variant is intended for parallel processing and requires only O(m) interprocessor synchronizations. However, for this variant, a theoretical analysis of the orthogonality of q_1, q_2, ..., q_m has not been given yet. Also, the performance of this variant on parallel computers has not been evaluated.

In this paper, we reformulate Walker's blocked algorithm using the compact WY representation [9] for Householder transformations. From our formulation, the orthogonality property of the algorithm follows immediately from that of the compact WY representation [6]. It also becomes clear that the algorithm can be applied not only to the GMRES method but also to incremental orthogonalization problems in general. We evaluate the accuracy and parallel performance of the algorithm through numerical experiments.

This paper is organized as follows: in Section 2, we briefly explain an algorithm for incremental orthogonalization using Householder transformations and introduce the compact WY representation. By combining them, we formulate an algorithm for incremental orthog-
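The conditioning behavior just described is easy to reproduce. Below is our own hedged sketch (not the authors' code): MGS as written in the text, with LAPACK's Householder QR (via `numpy.linalg.qr`) standing in for a Householder-based method; the test matrix with κ(A) ≈ 10^{10} is our construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 50

# Synthetic A = U diag(sigma) V^T with condition number kappa(A) = 1e10
U, _ = np.linalg.qr(rng.standard_normal((n, m)))
V, _ = np.linalg.qr(rng.standard_normal((m, m)))
A = U @ np.diag(np.logspace(0, -10, m)) @ V.T

def mgs(A):
    """Modified Gram-Schmidt: subtract projections onto q_1..q_{i-1} one by one."""
    n, m = A.shape
    Q = np.zeros((n, m))
    for i in range(m):
        v = A[:, i].copy()
        for k in range(i):                      # inherently sequential inner loop
            v -= (Q[:, k] @ v) * Q[:, k]
        Q[:, i] = v / np.linalg.norm(v)
    return Q

Q_mgs = mgs(A)
Q_hh, _ = np.linalg.qr(A)                       # Householder QR (LAPACK)

dev_mgs = np.max(np.abs(Q_mgs.T @ Q_mgs - np.eye(m)))
dev_hh  = np.max(np.abs(Q_hh.T @ Q_hh - np.eye(m)))
```

Here `dev_mgs` grows roughly like ϵκ(A), while `dev_hh` stays at the level of machine epsilon regardless of κ(A).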

onalization based on the compact WY representation. We discuss its numerical and computational properties, as well as its relationship with Walker's blocked algorithm. Experimental results including numerical accuracy and parallel performance on a distributed-memory parallel computer will be presented in Section 3. Finally, Section 4 will give some concluding remarks.

2. A parallel algorithm for incremental orthogonalization

2.1 Incremental orthogonalization using Householder transformations
We begin with an algorithm for incremental orthogonalization using Householder transformations [2, 8]. The algorithm is shown as Algorithm 1. At the ith step of the algorithm, the vector a_i is constructed from q_1, q_2, ..., q_{i−1} and is orthogonalized against them. Here, e_i denotes the ith column of I, the identity matrix of order n, and House_i(x) is a function that computes a Householder transformation H_i = I − t_i y_i y_i^T that eliminates the (i+1)th through the nth elements of x and leaves the 1st through the (i−1)th elements intact.

[Algorithm 1: incremental orthogonalization using Householder transformations]
do i = 1, m
  Generate a_i from q_1, q_2, ..., q_{i−1}.
  a_i' = H_{i−1} ··· H_2 H_1 a_i
  H_i = House_i(a_i')
  q_i = H_1 H_2 ··· H_i e_i
end do

Algorithm 1 is the same as the usual Householder QR decomposition [2] except that a_i and q_i are generated within the loop. The fact that q_i can be computed as above is readily confirmed if we note that q_i is the ith column of H_1 H_2 ··· H_m and H_j e_i = e_i for i+1 ≤ j ≤ m. The vectors q_1, q_2, ..., q_m computed by Algorithm 1 are orthogonal to working accuracy, since they are computed as the columns of H_1 H_2 ··· H_m, which is a product of Householder transformations (see [6] for numerical properties of Householder transformations). However, Algorithm 1 is inherently sequential, because multiple Householder transformations have to be applied one by one in the computation of a_i' and q_i.

2.2 Compact WY representation
Given multiple Householder transformations H_k = I − t_k y_k y_k^T (1 ≤ k ≤ i), we can aggregate them using a technique called the compact WY representation [9]. Let Y_1 = [y_1] and T_1 = [t_1], and define an n × k matrix Y_k and a k × k lower triangular matrix T_k by the following recursion formulae:

  Y_k = [ Y_{k−1}  y_k ],   (1)

  T_k = [ T_{k−1}                        0   ]
        [ −t_k y_k^T Y_{k−1} T_{k−1}    t_k ].   (2)

Then the product H_i ··· H_2 H_1 can be represented as follows:

  H_i ··· H_2 H_1 = I − Y_i T_i Y_i^T.   (3)

This is called the compact WY representation of H_i, ..., H_2, H_1. Using the compact WY representation, application of H_i ··· H_2 H_1 or H_1 H_2 ··· H_i to a vector can be computed as matrix-vector multiplications. This greatly enhances parallelism.

It is known that the compact WY representation has the same level of numerical stability as the usual Householder transformation [6]. Below, we summarize some of the numerical properties of the compact WY representation given in [6, Section 18.4] as two theorems. Note that although the WY representation treated in [6] is of a non-compact type, it is stated that the same conclusions apply to the compact WY representation as well.

Theorem 1  Let Y_i and T_i be the matrices computed by (1) and (2) using finite precision arithmetic and let Q̄_i = I − Y_i T_i^T Y_i^T. Then

  ∥Q̄_i^T Q̄_i − I∥_2 ≤ d_1(i, n) ϵ   (4)

for some positive constant d_1(i, n) that depends only on i and n.

From this theorem, it follows that the deviation from orthogonality of the computed q_1, q_2, ..., q_i is always O(ϵ), regardless of the condition number of [a_1, a_2, ..., a_i].

The next theorem concerns application of the compact WY representation to a matrix.

Theorem 2  Let B ∈ R^{n×l} and C be a matrix obtained by applying I − Y_i T_i Y_i^T to B using finite precision arithmetic. Then there exists ∆C ∈ R^{n×l} such that

  C = U_i B + ∆C = U_i (B + U_i^T ∆C),   (5)
  ∥∆C∥_2 ≤ [ 1 + d_1(i, n) + d_2(i, n) d_3(i, n) (1 + c_1(i, n, l) + c_1(n, i, l)) ] ϵ ∥B∥_2 + O(ϵ^2),   (6)

where U_i is the product of H_i, ..., H_2, H_1 computed with exact arithmetic, d_1, d_2, d_3 are positive constants that depend only on i and n, and c_1 is a positive constant that depends only on i, n and l.

Theorem 2 implies that the compact WY representation is backward stable.

2.3 An algorithm for incremental orthogonalization based on the compact WY representation
We can rewrite Algorithm 1 using the compact WY representation. The resulting algorithm is shown as Algorithm 2.

[Algorithm 2: incremental orthogonalization based on the compact WY representation]
do i = 1, m
  Generate a_i from q_1, q_2, ..., q_{i−1}.
  a_i' = (I − Y_{i−1} T_{i−1} Y_{i−1}^T) a_i
  (t_i, y_i) = House_i(a_i')
  Y_i = [ Y_{i−1}  y_i ]
  T_i = [ T_{i−1}                        0   ]
        [ −t_i y_i^T Y_{i−1} T_{i−1}    t_i ]
  q_i = (I − Y_i T_i^T Y_i^T) e_i
end do

This is the algorithm we propose for incremental orthogonalization. In this algorithm, application of the
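Algorithm 2 fits in a few dozen lines. The following is our own NumPy illustration, not the authors' FORTRAN/MPI program; `house`, `incremental_orth` and the Arnoldi-style generator are names we introduce, and the random test data is an assumption.

```python
import numpy as np

def house(i, x):
    """Householder H = I - t*y*y^T zeroing entries i+1..n-1 of x (0-based step i)
    and leaving entries 0..i-1 intact.  Assumes x[i:] is not the zero vector."""
    y = np.zeros_like(x)
    alpha = -np.copysign(np.linalg.norm(x[i:]), x[i])   # sign choice avoids cancellation
    y[i] = x[i] - alpha
    y[i + 1:] = x[i + 1:]
    t = 2.0 / (y @ y)
    return t, y

def incremental_orth(gen, n, m):
    """Algorithm 2: incremental orthogonalization in compact WY form.
    gen(Q, i) supplies the next vector a_i from the previous q's."""
    Y = np.zeros((n, 0))
    T = np.zeros((0, 0))
    Q = np.zeros((n, 0))
    for i in range(m):
        a = gen(Q, i)
        ap = a - Y @ (T @ (Y.T @ a))          # a' = (I - Y T Y^T) a: 3 mat-vec products
        t, y = house(i, ap)
        # compact WY update, recursions (1)-(2)
        Tn = np.zeros((i + 1, i + 1))
        Tn[:i, :i] = T
        Tn[i, :i] = -t * (y @ Y) @ T
        Tn[i, i] = t
        T = Tn
        Y = np.column_stack([Y, y])
        # q_i = (I - Y T^T Y^T) e_i ; note Y^T e_i is just row i of Y
        e = np.zeros(n)
        e[i] = 1.0
        q = e - Y @ (T.T @ Y[i, :])
        Q = np.column_stack([Q, q])
    return Q, Y, T

# Arnoldi-style generation: a_1 = b, a_i = G q_{i-1}
rng = np.random.default_rng(1)
n, m = 120, 30
G = rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
gen = lambda Q, i: b if i == 0 else G @ Q[:, i - 1]

Q, Y, T = incremental_orth(gen, n, m)
dev = np.max(np.abs(Q.T @ Q - np.eye(m)))
```

Despite the Krylov vectors becoming increasingly ill-conditioned, `dev` stays at the level of machine epsilon, as Theorem 1 predicts.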


Householder transformations H_{i−1}, ..., H_2, H_1 to a_i is performed as matrix-vector multiplications, a_i' = (I − Y_{i−1} T_{i−1} Y_{i−1}^T) a_i. Since each matrix-vector multiplication requires only one inter-processor synchronization, the number of synchronizations required to compute a_i' is only three, in contrast to the O(m) required in Algorithm 1. The same is true of the computation of q_i. Thus the parallel granularity of Algorithm 2 is O(m) times larger than that of Algorithm 1.

When a_i is computed as

  a_1 = b,   (7)
  a_i = G q_{i−1}  (i = 2, 3, ...),   (8)

for some G ∈ R^{n×n} and b ∈ R^n, Algorithm 2 computes an orthonormal basis of the Krylov subspace K_m(G; b). Hence it can be used, for example, in the GMRES algorithm for solving linear equations in place of the modified Gram-Schmidt method. Actually, the combination of the GMRES algorithm with Algorithm 2 leads to Walker's blocked Householder GMRES algorithm [8]. However, from our formulation, it is evident that Algorithm 2 can be applied not only to the GMRES algorithm but also to incremental orthogonalization problems in general.

From Algorithm 2, it is clear that q_i (1 ≤ i ≤ m) is the ith column of the compact WY representation I − Y_m T_m^T Y_m^T. Thus we can conclude from Theorem 1 that the vectors q_1, q_2, ..., q_m are always orthogonal to working accuracy. On the other hand, Walker states that he has no proof of the numerical superiority of his blocked method to another parallelizable method, namely the classical Gram-Schmidt [8].

We can also discuss the backward stability of Algorithm 2 based on Theorem 2. Let R be an n × m upper triangular matrix whose (i, j)th element is the ith element of (I − t_j y_j y_j^T) a_j'. Then R is the upper triangular factor of the QR decomposition of A = [a_1, a_2, ..., a_m]. Using Theorem 2, it is easy to see that there exists ∆A ∈ R^{n×m} such that

  A + ∆A = U_m R,   ∥∆A∥_2 ≤ d_4(m, n) ϵ ∥A∥,   (9)

where U_m is defined in Theorem 2 and d_4(m, n) is a positive constant that depends only on m and n. The proof is almost the same as the proof of backward stability of the Householder QR decomposition; see [6, Lemma 18.3] for the latter. Eq. (9) shows that Algorithm 2 is backward stable, as is the non-blocked algorithm (Algorithm 1).

Next, we count the number of operations required to perform Algorithm 2. The number of operations to compute a_i', T_i and q_i is 4in, 2in and 2in, respectively, if we assume m ≪ n and retain only the highest-order terms. Note that Y_i^T e_i in the expression of q_i requires no computation; we only need to extract the ith row of Y_i. By summing up these numbers over i, we find that the operation count of Algorithm 2 is about 4m^2 n. This is the same as the operation count of the original Householder-based method, Algorithm 1.

2.4 Comparison with other methods
In Table 1, we compare the incremental orthogonalization algorithm based on the compact WY representation with other algorithms. Here, CGS2 is the method that repeats the classical Gram-Schmidt orthogonalization twice to increase the orthogonality [7]. House and cWY stand for the Householder-based method and the proposed method, respectively. The rows named Synchronizations and Granularity show, respectively, the number of inter-processor synchronizations and the parallel granularity, that is, the number of arithmetic operations that can be performed by a processor between two synchronization points. P denotes the number of processors. The rows named Orthogonality and Condition show, respectively, theoretical bounds on ∥Q^T Q − I∥, where Q = [q_1, ..., q_m], and the condition (if any) that must be satisfied for the method to be applicable. The matrix A is defined as A = [a_1, a_2, ..., a_m]. The results for MGS, CGS2 and House are taken from [10].

Table 1. Comparison of algorithms for incremental orthogonalization.

                     MGS         CGS2           House       cWY
  Work               2m^2 n      4m^2 n         4m^2 n      4m^2 n
  Synchronizations   O(m^2)      O(m)           O(m^2)      O(m)
  Granularity        O(n/P)      O(mn/P)        O(n/P)      O(mn/P)
  Orthogonality      O(ϵκ(A))    O(ϵ)           O(ϵ)        O(ϵ)
  Condition          −           O(ϵκ(A)) < 1   −           −

From the table, we can conclude that the method based on the compact WY representation is superior to CGS2 in terms of applicability and to the Householder-based method in terms of parallel granularity.

3. Experimental results
We evaluated the performance and accuracy of the incremental orthogonalization algorithm based on the compact WY representation. The computational environment is a PC cluster with Intel Xeon processors, of which we used up to 16 nodes. The program was written in FORTRAN and MPI and compiled with the PGI FORTRAN compiler. All the calculations were done with double-precision floating-point arithmetic.

3.1 Numerical accuracy
To evaluate the accuracy of Algorithm 2, we generated the vectors a_1, a_2, ..., a_m using correlated random numbers so that the condition number κ(A) of A = [a_1, a_2, ..., a_m] takes a specified value. We set n = 20,000, m = 50 and varied κ(A) from 1 to 10^16. The matrix A was scaled so that its Frobenius norm is 1. Fig. 1 shows the accuracy as a function of κ(A). Here, the orthogonality is measured by max_{i,j} |(Q^T Q − I)_{ij}|, where Q = [q_1, ..., q_m]. Also, we plot the residual max_{i,j} |(A − QR)_{ij}|, where R is the upper triangular matrix defined in Section 2.3.

It is clear from the graph that both the orthogonality and the residual are independent of κ(A) and are of the order of the machine epsilon. This is consistent with the theoretical prediction made in Section 2.3. The behavior of the orthogonality and the residual is almost the same for other values of n and m.
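The Section 3.1 measurement is easy to mimic on a small scale. A hedged sketch of our own: the test matrices are our construction, and `numpy.linalg.qr` (LAPACK's Householder QR) stands in for Algorithm 2, whose orthogonality and residual behave the same way by Theorems 1 and 2.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 400, 50

worst_orth = 0.0
worst_resid = 0.0
for p in (2, 8, 14):                              # kappa(A) = 1e2, 1e8, 1e14
    U, _ = np.linalg.qr(rng.standard_normal((n, m)))
    V, _ = np.linalg.qr(rng.standard_normal((m, m)))
    A = U @ np.diag(np.logspace(0, -p, m)) @ V.T
    A /= np.linalg.norm(A)                        # Frobenius norm 1, as in the experiment
    Q, R = np.linalg.qr(A)                        # Householder QR (LAPACK)
    worst_orth = max(worst_orth, np.max(np.abs(Q.T @ Q - np.eye(m))))
    worst_resid = max(worst_resid, np.max(np.abs(A - Q @ R)))
```

Both measures stay at the level of machine epsilon across the whole range of condition numbers, which is the qualitative content of Fig. 1.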

– 91 – JSIAM Letters Vol. 3 (2011) pp.89–92 Yusaku Yamamoto et al.

[Fig. 1. Orthogonality and residual versus κ(A): orthogonality and residual (10^{-17} to 10^{-1}) plotted against the condition number of A (1 to 10^{16}).]

[Fig. 2. Parallel performance (m = 10): performance in GFLOPS versus the number of processors (1 to 16), for n = 20000, 40000 and 80000.]
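The accuracy experiment of Section 3.1 requires test matrices with a prescribed condition number and unit Frobenius norm. The following is our own minimal sketch of one way to generate such matrices, by fixing the singular values explicitly; the authors instead use correlated random numbers, so this is an assumption-laden stand-in, not their construction.

```python
import numpy as np

def make_test_matrix(n, m, cond, rng):
    """Generate an n x m matrix with prescribed 2-norm condition number
    `cond` and unit Frobenius norm, by choosing the singular values
    geometrically spaced from 1 down to 1/cond."""
    U, _ = np.linalg.qr(rng.standard_normal((n, m)))   # orthonormal columns
    V, _ = np.linalg.qr(rng.standard_normal((m, m)))
    s = np.logspace(0.0, -np.log10(cond), m)           # sigma_1/sigma_m = cond
    A = U @ np.diag(s) @ V.T
    return A / np.linalg.norm(A, "fro")                # scale so ||A||_F = 1
```

Scaling by the Frobenius norm leaves the ratio of singular values, and hence κ(A), unchanged.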

3.2 Parallel performance
In parallelizing Algorithm 2, we used block distribution to distribute each of the vectors a_i and q_i (1 ≤ i ≤ m) among the processors. Hence, if the number of processors is P, each processor is allocated sub-vectors of length n/P. To compute a matrix-vector product like Y_{i−1}^T a_i, the processors first calculate partial matrix-vector products using the data they own, and then sum up the partial results using MPI_Allreduce to get the full result. Calculations that involve only O(m) or O(m²) operations, such as the product of T_{i−1} and Y_{i−1}^T a_i, are done redundantly on all the processors. This makes subsequent calculations easier.

[Fig. 3. Parallel performance (m = 50): performance in GFLOPS versus the number of processors (1 to 16), for n = 5000, 10000 and 20000.]

Figs. 2 and 3 show the parallel performance of our program. In Fig. 2, m = 10 and n is varied from 20,000 to 80,000, while in Fig. 3, m = 50 and n is varied from 5,000 to 20,000. The horizontal axis is the number of processors and the vertical axis is the parallel performance measured in GFLOPS, where we assumed the number of operations to be 4m²n (see Table 1). It can be seen that our program achieves reasonable speedup, especially when n is large.

4. Conclusion
In this paper, we presented an algorithm for incremental orthogonalization based on the compact WY representation. It requires the same amount of computational work as the classical Gram-Schmidt with reorthogonalization or the Householder-based algorithm, but is superior to the former in terms of applicability and to the latter in terms of parallel granularity. Numerical experiments on a PC cluster demonstrate the accuracy and scalability of the algorithm.

Acknowledgments
We are grateful to the anonymous referee and the editor, whose comments helped us to improve the quality of this paper. We would also like to thank the participants of the annual meeting of the Japan Society for Industrial and Applied Mathematics for valuable comments. This work is partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research.

References
[1] W. Arnoldi, The principle of minimized iterations in the solution of the matrix eigenvalue problem, Quart. Appl. Math., 9 (1951), 17–29.
[2] G. Golub and C. van Loan, Matrix Computations, Johns Hopkins Univ. Press, Baltimore, 1996.
[3] Y. Saad and M. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), 856–869.
[4] E. Gallopoulos and Y. Saad, Efficient solution of parabolic equations by Krylov approximation methods, SIAM J. Sci. Stat. Comput., 13 (1992), 1236–1264.
[5] I. Dhillon and B. Parlett, Multiple representations to compute orthogonal eigenvectors of symmetric tridiagonal matrices, Linear Algebra Appl., 387 (2004), 1–28.
[6] N. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, 2002.
[7] J. Daniel, W. Gragg, L. Kaufman and G. Stewart, Reorthogonalization and stable algorithms for updating the Gram-Schmidt QR factorization, Math. Comp., 30 (1976), 772–795.
[8] H. Walker, Implementation of the GMRES method using Householder transformations, SIAM J. Sci. Stat. Comput., 9 (1988), 152–163.
[9] R. Schreiber and C. van Loan, A storage-efficient WY representation for products of Householder transformations, SIAM J. Sci. Stat. Comput., 10 (1989), 53–57.
[10] J. Demmel, L. Grigori, M. Hoemmen and J. Langou, Communication-optimal parallel and sequential QR and LU factorizations, LAPACK Working Notes, No. 204, 2008.
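The block-distributed matrix-vector product described in Section 3.2 can be illustrated with a serial simulation of the partial-sum pattern; this is our own sketch of the communication structure, not the authors' MPI program, with the final summation playing the role of MPI_Allreduce.

```python
import numpy as np

def distributed_yt_a(Y, a, P):
    """Serial simulation of the distributed product w = Y^T a.
    Each of the P "processors" owns a contiguous block of rows of Y
    and of a (block distribution) and computes a partial product;
    the partial results are then summed, which is the role
    MPI_Allreduce plays in the actual parallel program."""
    n = Y.shape[0]
    row_blocks = np.array_split(np.arange(n), P)             # block distribution
    partials = [Y[idx, :].T @ a[idx] for idx in row_blocks]  # local work
    return np.sum(partials, axis=0)                          # "Allreduce" step
```

Because the reduced vector has only m entries, the communication volume per step is O(m), independent of n, which is what gives the method its coarse parallel granularity.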

– 92 –

JSIAM Letters Vol.3 (2011) pp.93–96 © 2011 Japan Society for Industrial and Applied Mathematics

Analysis of downgrade risk in credit portfolios with self-exciting intensity model

Suguru Yamanaka1, Masaaki Sugihara2 and Hidetoshi Nakagawa3

1 Mitsubishi UFJ Trust Investment Technology Institute Co., Ltd, 4-2-6 Akasaka, Minato-ku, Tokyo 107-0052, Japan
2 Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
3 Graduate School of International Corporate Strategy, Hitotsubashi University, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8439, Japan
E-mail: yamanaka@mtec-institute.co.jp
Received August 19, 2011, Accepted November 25, 2011

Abstract
We present an intensity-based credit rating migration model and perform empirical analyses on forecasting the number of downgrades in credit portfolios. The framework of the model is based on the so-called top-down approach. We first model the economy-wide rating migration intensity with a self-exciting stochastic process. Next, we characterize the downgrade intensity of an underlying sub-portfolio with a thinning model specified by the distribution of credit ratings in the sub-portfolio. The results of the empirical analyses indicate that the model is to some extent consistent with downgrade data of Japanese firms in the sample period.
Keywords credit risk, rating migration, self-exciting intensity
Research Activity Group Mathematical Finance

1. Introduction
In credit portfolio risk management, we quantify credit risks with some model of credit event occurrences, such as credit rating migrations and defaults. In this paper, we introduce an intensity-based credit rating migration model for risk analyses of credit portfolios and perform statistical tests for model validation with credit migration samples of Japanese firms.

Our modeling framework is based on the top-down approach studied in [1,2]. Namely, our model consists of two parts, a top-part and a down-part. In the top-part, we model rating migration in the whole economy with event intensities. In this paper, we use a self-exciting process for the intensity model, where the term "self-exciting" means that the intensity increases when an event occurs. Several self-exciting intensity models have recently been used in credit risk modeling to capture credit event clusters (see [2–5]). Credit event clustering is a well-known feature of credit events. For example, the historical data of the monthly downgrade number in Fig. 1 show that there are downgrade clusters, from 1998 to 2000, from 2001 to 2003 and from 2008 to 2010. Specifically, we use the self-exciting model proposed in [5]. In the down-part, we obtain intensity models of sub-portfolios with a thinning model. Our thinning model is specified by some factors which represent characteristics of sub-portfolios. We adopt the rating distribution as one of the factors, to take the size of credit portfolios into account, in common with the thinning model of [5].

[Fig. 1. The monthly number of downgrades in Japan announced by R&I: event number (0 to 50) per month, 1998 to 2010.]

To check the adequacy of the model, we perform empirical analyses on downgrade forecast. First, we specify our model by the maximum likelihood approach and perform a statistical test for in-sample fit. Second, we perform a statistical test for out-of-sample forecast with the fitted model. Specifically, with the estimated intensity model and thinning model, we derive the distribution of the downgrade number in a reference bond portfolio underlying a collateralized bond obligation, and test the validity of the distribution with the realized downgrade number.

The organization of this paper is as follows. Section 2 provides a rating migration model for credit portfo-

lios. Section 3 shows empirical analyses on downgrades. Section 4 gives some concluding remarks.

2. Model
In this section, we introduce an intensity model of economy-wide rating migrations. In addition, we specify intensities of rating migration in sub-portfolios by a thinning model.

2.1 Intensity model for economy-wide events
We model the uncertainty in the economy by a filtered complete probability space (Ω, F, P, {F_t}), where P is the actual probability measure and {F_t} is a right-continuous and complete filtration. For each type of credit event, consider an increasing sequence of totally inaccessible {F_t}-stopping times 0 < T_1 < T_2 < ···, which represents the ordered event times in the whole economy. We denote the counting process of the events by N_t = Σ_{n≥1} 1_{{T_n ≤ t}}.

Suppose N_t has an intensity process λ_t. Namely, λ_t is an {F_t}-progressively measurable non-negative process, and the process N_t − ∫_0^t λ_s ds is an {F_t}-local martingale. Let λ_t be the self-exciting stochastic process

    dλ_t = κ_t(c_t − λ_t)dt + dJ_t,
    J_t = Σ_{n≥1} min(δλ_{T_n−}, γ) 1_{{T_n ≤ t}},
    κ_t = κλ_{T_{N_t}},   c_t = cλ_{T_{N_t}},

where λ_{t−} := lim_{s↑t} λ_s and the constants κ > 0, c ∈ (0, 1), δ > 0, γ ≥ 0, λ_0 > 0 are parameters.

2.2 Thinning model
In the down-part, we decompose the economy-wide event intensity into sub-portfolio event intensities with a thinning model based on rating distributions.

Suppose each firm in the economy is associated with a credit rating. There are K credit ratings and we denote the credit ratings by 1, 2, ..., K, in order of credit quality. Let S_i^k denote the set of k-rated firms in portfolio S_i (i = 1, 2, ..., I, k = 1, 2, ..., K). At each time, each k-rated firm belongs to one of the sub-portfolios S_i^k. Let N_t^i(k) be the counting process of credit events in sub-portfolio S_i^k. The counting process is given by

    N_t^i(k) = Σ_{n≥1} 1_{{T_n ≤ t} ∩ {T_n ∈ τ(S_i^k)}},

where τ(S) denotes the set of event times in portfolio S. To obtain the intensity of the counting process N_t^i(k), we introduce the {F_t}-adapted process {Z_t^i(k)}. Z_t^i(k) represents the conditional probability that the credit event is an event in the portfolio S_i^k, given that an event occurs in the economy. Z_t^i(k) satisfies the following properties: (a) Z_t^i(k) takes values in the unit interval [0, 1]; (b) Σ_{i,k} Z_t^i(k) = 1. From [4, Proposition 2.1], we obtain the intensity associated with the counting process N_t^i(k) as follows:

    λ_t^i(k) = Z_t^i(k) λ_t.

To analyze credit risk in sub-portfolios, we introduce a thinning model characterized by the distribution of credit ratings in the sub-portfolios. In particular, we specify the thinning model for the downgrade intensity as follows:

    Z_t^i(k) = ζ^i Z̃_t^i(k),   (1)

where

    Z̃_t^i(k) = ( X_t^i(k) / Σ_{k̃=1}^{K−1} X_t^*(k̃) ) 1_{{Σ_{k̃=1}^{K−1} X_t^*(k̃) > 0}},   (2)

and the quantity Z̃_t^i(k) denotes the rating distribution of the portfolio. X_t^i(k) denotes the number of k-rated firms in the portfolio S_i at time t, and X_t^*(k) denotes the number of k-rated firms in the whole economy at time t. The denominator in the thinning model (2) represents the number of firms with downgrade possibility. The quotient in (2) is taken to be 0 when the denominator vanishes. The quantity ζ^i represents the portfolio characteristics that the rating distribution of the portfolio cannot capture. While we use the thinning model (1) with two factors, we can consider additional factors in the thinning model to obtain more specific sub-portfolio intensities.

3. Empirical analyses
In this section, we estimate the intensity model with the downgrade samples of Japanese firms. Then, we estimate the thinning model for a reference portfolio underlying the collateralized bond obligation called J-Bond Limited. In addition, we perform validation tests on in-sample fitness and out-of-sample downgrade forecast. Specifically, we first divide the downgrade samples into a first half period and a second half period. Next, we estimate the model with the first half period and perform statistical tests on fitness. Then, we derive the downgrade distribution in the second period with the model, and compare it with the realized downgrades in the second period. As our validation test is a statistical one, the results of our test indicate whether the model is rejected or not. In other words, the testing methods in this paper do not necessarily give active support to model validity.

3.1 Data
The data for parameter estimation are the sample records on rating changes of Japanese firms from April 1, 1999 to March 31, 2004. The ratings are announced by Rating and Investment Information, Inc. (R&I). During the sample period of 1243 working days, there are 509 downgrades and 55 upgrades. We focus on downgrades, because the number of upgrades is too small to estimate an upgrade intensity and to discuss the model adequacy. Excluding non-working days, we transformed the calendar times April 1, 1999, April 1, 2000, ... to t = 0, 1, .... There are many events on the same day, so we slide the event times with uniform random numbers so as to make all event times different. We employ the reference credit portfolio underlying J-Bond Limited as the target sub-portfolio (the corresponding index number is i = 1). J-Bond Limited is a collateralized bond obligation whose reference portfolio consists of 67 corporate bonds. J-Bond Limited was issued in 1999 and re-

demption dates of the tranches were from 2002 to 2003. The details of J-Bond Limited are described in [6].

For testing in-sample fit and out-of-sample forecast, we divide the samples into the first half period, from April 1, 1999 to September 27, 2001 ([0, 2.5)), and the second half period, from September 28, 2001 to March 31, 2004 ([2.5, 5.0]). The downgrade samples in the first half period are used for estimating the models and testing in-sample fit. The downgrade samples in the second half period are used for testing out-of-sample forecast.

3.2 Estimation procedure
For estimating event intensity models, we apply the maximum likelihood method performed in [3]. Suppose that we have event time samples 0 < T_1 < T_2 < ··· < T_N (≤ H). Then the log-likelihood function of the intensity is the following:

    Σ_{n=1}^{N} log λ_{T_n−} − ∫_0^H λ_s ds.   (3)

We specify the parameters that maximize (3).

To test the validity of the estimated intensity model for the data, we apply the Kolmogorov-Smirnov test that [3] performed, as follows. First, we transform the event times {T_n}_{n=1}^{N} into {A_n} by

    A_n := ∫_0^{T_n} λ_s ds.

We perform the Kolmogorov-Smirnov test using the fact that {A_n}_{n=1}^{N} will be the jump times of a standard Poisson process in the case that {T_n}_{n=1}^{N} are generated by λ_t. Thus, the null hypothesis is that {A_{n+1} − A_n}_{n=1}^{N−1} are independent and exponentially distributed (parameter 1).

We performed the maximum likelihood estimation of the parameters with the free statistical software package R. Specifically, we used the intrinsic function "optim" to maximize the objective function. We performed the maximization for 30 sets of initial values, and finally chose the estimates that maximize the objective function among the initial value sets. In addition, we performed the Kolmogorov-Smirnov test with R, using the intrinsic function "ks.test".

The log-likelihood function of the thinning models is as follows:

    log(L(ζ^i | H_t^i)) = Σ_{1^i(T_n)=1} log(Z_{T_n}^i(k)) + Σ_{1^i(T_n)=0} log(1 − Z_{T_n}^i(k)),

where H_t^i = {(T_n, 1^i(T_n))}_{n≤N_t} and

    1^i(T_n) = { 1 (T_n ∈ τ(S_i)),  0 (T_n ∉ τ(S_i)).

We also performed the maximum likelihood estimation of these parameters with R.

3.3 Testing in-sample fit
Table 1 shows the estimation result for the intensity model obtained from the downgrade samples in the first period. For estimation tractability, we restricted the value of γ to γ ∈ {100, 125, 150, 175, 200}. With the Kolmogorov-Smirnov test for the in-samples, we obtained a P-value of 0.694, indicating that the intensity model is not rejected at the standard significance level. Also, we obtained the parameter value of the thinning model for J-Bond Limited as ζ^1 = 1.27. As the value of the parameter ζ^1 exceeds 1, the downgrade frequency of the J-Bond Limited reference portfolio is higher than the rating distribution indicates.

Table 1. Maximum likelihood estimates of the downgrade intensity model (data: April 1, 1999–September 27, 2001). Values in parentheses are standard estimation errors.
    κ = 0.937 (0.115),  c = 0.200 (0.015),  δ = 2.706 (0.101),  γ = 150 (-),  λ_0 = 63.885 (60.368)

Table 2. Average number and maximum of downgrades obtained by the model and realized downgrade number, in the first span. "Percentile" is the percentile of the realized downgrade number in the model distribution. The "complement" means the complement portfolio of J-Bond Limited.
                      Economy    J-Bond Limited    Complement
    Model average     270.066    31.848            238.218
    Model max         456        68                399
    Realized number   267        31                236
    (Percentile)      (46.03%)   (46.34%)          (56.68%)
    P-value           0.865      1.000             0.919

Table 2 shows the result of in-sample fitness for the downgrade number, namely, a comparison of the distribution of downgrades obtained by the model and the realized downgrade number in the first period. In particular, we focus on downgrades in the whole Japanese bond issuer portfolio with credit ratings (Economy), downgrades in the J-Bond Limited reference portfolio (J-Bond Limited) and downgrades in the complement portfolio of J-Bond Limited (Complement). To derive the distributions of downgrades, we performed Monte Carlo simulation with 100,000 scenarios. With the downgrade distribution, we performed a two-tailed test of the realized downgrade number and obtained the P-values in Table 2. Specifically, the P-values in Table 2 are the sum of the probabilities of the downgrade numbers whose probability is less than that of the realized downgrade number. Comparison of the average and realized downgrade numbers, the percentile of the realized downgrade number in the model distribution and the P-values indicates that the model is consistent with the realized number of downgrades. Namely, the estimation for the whole model (both top-part and down-part) worked well with the first period samples.

3.4 Testing out-of-sample forecast
The result of the out-of-sample fit test, namely a comparison of the model estimated with the first-period data against the second-period data, is as follows. First, with the Kolmogorov-Smirnov test for out-of-sample fitness, we obtained a P-value of 0.416, indicating that the intensity model is not rejected at the standard significance level. Table 3 shows a comparison of the model distribution of the downgrade number and the realized downgrade number. The P-values in Table 3 indicate that the whole model is not rejected at the standard significance level. Namely, the model is consistent with the out-of-


sample.

Table 3. Average number and maximum of downgrades obtained by the model and realized downgrade number, in the second span. "Percentile" is the percentile of the realized downgrade number in the model distribution. The "complement" means the complement portfolio of J-Bond Limited.
                      Economy    J-Bond Limited    Complement
    Model average     274.344    32.350            241.994
    Model max         442        68                392
    Realized number   242        24                218
    (Percentile)      (19.33%)   (11.23%)          (23.60%)
    P-value           0.409      0.248             0.456

Table 4. Average number and maximum of downgrades in J-Bond Limited obtained by the model when the downgrade number in the complement portfolio is under the 95% percentile (298 downgrades) and over the 95% percentile.
    Downgrade number in the complement portfolio:   Under 298    Over 298
    Average                                         31.978       39.290
    Max                                             65           68

[Fig. 2. Conditional distribution of the downgrade number in J-Bond Limited over 2.5 years: probability (0.00 to 0.10) versus event number (0 to 70). The solid line is the conditional distribution given that the number of downgrades in the complement portfolio is under 298; the dashed line is the conditional distribution given that it is over 298.]

Now, we show one of the features of our model, namely, that the model can capture the risk contagion among several portfolios. As we considered an economy-wide self-exciting intensity for the top-part model, the occurrence of an event in a sub-portfolio increases the possibility of event occurrence in the whole economy. That means the model captures event risk contagion among portfolios. In the following example, we see that the model captures the risk contagion from the complement portfolio to J-Bond Limited in the second period. Fig. 2 shows the conditional distributions of the downgrade number in J-Bond Limited, conditioned on whether the downgrade number in the complement portfolio is under the 95-percentile (298 downgrades) or over the 95-percentile. Table 4 shows the averages and the maximums of both distributions in Fig. 2. Table 4 and Fig. 2 indicate that as the downgrade risk in the complement portfolio increases, the downgrade risk of J-Bond Limited increases.

4. Concluding remarks
We introduced the intensity-based rating migration model and performed goodness-of-fit tests. Our model consists of two parts, a self-exciting intensity model for economy-wide rating migrations and a thinning model based on rating distributions. For testing model adequacy, we used downgrade samples of Japanese firms from 1999 to 2004. We divided the sample period into first and second periods, then estimated the models with the first period and performed an in-sample fitness test. The result of the fitness test indicates that the model estimation worked well. Also, the result of the out-of-sample downgrade forecast indicates that the model prediction is consistent with the downgrade out-of-samples.

The opinions expressed here are those of the authors and do not necessarily reflect the views or policies of their employers.

Acknowledgments
This work was supported in part by Grant-in-Aid for Scientific Research (A) No. 21243019 from the Japan Society for the Promotion of Science (JSPS) and the Global COE Program "The research and training center for new development in mathematics", MEXT, Japan.

References
[1] K. Giesecke, L. R. Goldberg and X. Ding, A top-down approach to multi-name credit, Oper. Res., 59 (2011), 283–300.
[2] H. Nakagawa, Modeling of contagious downgrades and its application to multi-downgrade protection, JSIAM Letters, 2 (2010), 65–68.
[3] H. Nakagawa, Analysis of records of credit rating transition with mutually exciting rating-change intensity model (in Japanese), Trans. JSIAM, 20 (2010), 183–202.
[4] K. Giesecke and B. Kim, Risk analysis of collateralized debt obligations, Oper. Res., 59 (2011), 32–49.
[5] S. Yamanaka, M. Sugihara and H. Nakagawa, Modeling of contagious credit events and risk analysis of credit portfolios, Asia-Pacific Financial Markets, in press.
[6] Rating and Investment Information, Inc., News release, No.99-C-410, 1999.
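For readers who want to experiment with the top-part model of Section 2, the following is a minimal simulation sketch of the self-exciting intensity by Ogata-style thinning. It is our own illustration, not the authors' R code; the parameter values used in the test below are merely borrowed from Table 1 for illustration.

```python
import math
import random

def simulate_self_exciting(kappa, c, delta, gamma, lam0, horizon, rng):
    """Simulate event times for the intensity model
        dlam_t = kappa_t (c_t - lam_t) dt + dJ_t,  jumps min(delta*lam_{Tn-}, gamma),
        kappa_t = kappa * lam_{T_{N_t}},  c_t = c * lam_{T_{N_t}},
    by Ogata-style thinning.  Between events lam decays monotonically
    from lam_{Tn} toward c*lam_{Tn} (since 0 < c < 1), so the value at
    the last event is a valid upper bound for the thinning step."""
    t = 0.0
    lam_event = lam0               # intensity at the last event (post-jump)
    events = []
    while True:
        level = c * lam_event      # mean-reversion level c_t
        speed = kappa * lam_event  # mean-reversion speed kappa_t
        s, lam_bar = t, lam_event  # current upper bound for thinning
        while True:
            s += rng.expovariate(lam_bar)          # candidate event time
            if s > horizon:
                return events
            lam_s = level + (lam_event - level) * math.exp(-speed * (s - t))
            if rng.random() * lam_bar <= lam_s:
                break                              # accept the candidate
            lam_bar = lam_s                        # tighten the bound, retry
        events.append(s)
        lam_event = lam_s + min(delta * lam_s, gamma)  # self-exciting jump
        t = s
```

Sub-portfolio events could then be obtained by accepting each simulated event with probability Z_t^i(k) from the thinning model (1), which is how the downgrade distributions behind Tables 2 and 3 are conceptually produced.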

– 96 –

JSIAM Letters Vol.3 (2011) pp.97–100 © 2011 Japan Society for Industrial and Applied Mathematics

Automatic verification of anonymity of protocols

Hideki Sakurada1

1 NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato Wakamiya, Atsugi-shi, Kanagawa 243-0198, Japan
E-mail: sakurada.hideki@lab.ntt.co.jp
Received October 3, 2011, Accepted December 6, 2011

Abstract
Anonymity is an important security requirement for protocols such as voting schemes. It is often guaranteed by using anonymous channels such as mixnets. In this paper, we present a technique for automatically verifying the anonymity of protocols that use anonymous channels, using Proverif, a tool for the automatic verification of security protocols. We use this technique to verify the voting scheme developed by Fujioka, Okamoto, and Ohta.
Keywords anonymity, protocol, verification, security, Proverif
Research Activity Group Formal Approach to Information Security

1. Introduction
Designing a security protocol is an error-prone task. There is a large body of work and many tools for finding errors and verifying the security of protocols. Proverif [1, 2] is one such tool that can automatically verify various security properties including anonymity.

Anonymity is an important security property for protocols such as voting schemes. It is often guaranteed by using anonymous channels such as mixnets [3]. The voting scheme developed by Fujioka, Okamoto, and Ohta (FOO) is one such protocol [4].

Kremer and Ryan [5] have used Proverif to verify various security properties of FOO, except for anonymity. Delaune, Ryan, and Smyth [6] have developed a technique for the automatic verification of anonymity and applied it to protocols including FOO.

In this paper, we develop a technique for the automatic verification of anonymity by using Proverif. Our technique is similar to that of Delaune, Ryan, and Smyth. While their technique enables us only to model anonymous channels for publishing data to the environment, our technique enables us to model those for sending data to other participants. We also verify the anonymity of FOO by employing our technique.

2. Preliminaries
To describe protocols and their executions, we introduce the language used in Proverif and its semantics (see [1] for details). In this language, protocols are modeled as processes and messages exchanged between the participants are modeled as terms. Table 1 summarizes the syntax for the terms and processes.

Table 1. Syntax for terms and processes.
    M, N ::= terms
        x, y, z                      variables
        a, b, c, k, s                names
        f(M_1, ..., M_n)             constructor application
    D ::= term evaluations
        M                            term
        eval h(D_1, ..., D_n)        function evaluation
    P, Q, R ::= processes
        M⟨N⟩.P                       output
        M(x).P                       input
        0                            inactive process
        P | Q                        parallel composition
        !P                           replication
        (νc)P                        restriction
        let x = D in P else Q        term evaluation
        if M = N then P else Q       conditional

Terms are subject to an equational theory Σ. Users of Proverif may extend it, for example, with the equation dec(enc(M, k), k) = M for modeling symmetric-key encryption. We write Σ ⊢ M = N if M and N are equal in Σ; otherwise Σ ⊢ M ≠ N.

Intuitively, an execution of a process P is a sequence of either of the following rewriting steps:
• If P has both an output subprocess N⟨M⟩.Q and an input subprocess N′(x).R such that Σ ⊢ N = N′ holds, then the message M is transmitted over the channel N from the output subprocess to the input subprocess. Then these subprocesses are replaced with Q and R{M/x} respectively, where {M/x} is a substitution that replaces x with M.
• If P has a conditional subprocess if M = N then Q else R, this process is rewritten into Q if Σ ⊢ M = N holds and otherwise rewritten into R.
If more than one rewriting step is possible, one of them is non-deterministically chosen. For example, a process P = (νc)(c⟨M⟩.0 | c(x).R_1 | c(x).R_2) has two possible executions:

    P → (νc)(0 | R_1{M/x} | c(x).R_2),
    P → (νc)(0 | c(x).R_1 | R_2{M/x}).

Here 0 is the inactive process, which is often omitted.
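The equational reasoning Σ ⊢ M = N used above can be illustrated with a toy normalizer for the single equation dec(enc(M, k), k) = M. The nested-tuple term encoding below is our own sketch, not Proverif's internal representation; since this rewrite system is convergent, equality in Σ coincides with identical normal forms.

```python
def normalize(t):
    """Normalize a term under the rewrite rule dec(enc(M, k), k) -> M.
    Terms are atoms (strings) or tuples ("op", arg1, ..., argn)."""
    if isinstance(t, str):
        return t
    op, *args = t
    args = [normalize(a) for a in args]        # normalize subterms first
    if op == "dec":
        m, k = args
        if isinstance(m, tuple) and m[0] == "enc" and m[2] == k:
            return m[1]                        # dec(enc(M, k), k) = M
    return (op, *args)

def equal_in_sigma(m, n):
    """Decide Sigma |- M = N by comparing normal forms (valid because
    the single-rule theory is convergent)."""
    return normalize(m) == normalize(n)
```

For instance, dec(enc(v, k), k) is equal to v in Σ, while decryption under a wrong key is not.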


Table 2. Semantics for terms and processes.
    M ⇓ M
    eval h(D_1, ..., D_n) ⇓ σN  if h(N_1, ..., N_n) → N ∈ def_Σ(h) and σ is such that for all i, D_i ⇓ M_i and Σ ⊢ M_i = σN_i
    P | 0 ≡ P                                P ≡ P
    P | Q ≡ Q | P                            Q ≡ P ⇒ P ≡ Q
    (P | Q) | R ≡ P | (Q | R)                P ≡ Q, Q ≡ R ⇒ P ≡ R
    (νa)(νb)P ≡ (νb)(νa)P                    P ≡ Q ⇒ P | R ≡ Q | R
    (νa)(P | Q) ≡ P | (νa)Q if a ∉ fn(P)     P ≡ Q ⇒ (νa)P ≡ (νa)Q
    if M = N then P else Q ≡ let x = eq(M, N) in P else Q
    N⟨M⟩.Q | N′(x).P → Q | P{M/x} if Σ ⊢ N = N′            (Red I/O)
    let x = D in P else Q → P{M/x} if D ⇓ M                 (Red Fun 1)
    let x = D in P else Q → Q if there is no M such that D ⇓ M   (Red Fun 2)
    !P → P | !P                                             (Red Repl)
    P → Q ⇒ P | R → Q | R                                   (Red Par)
    P → Q ⇒ (νa)P → (νa)Q                                   (Red Res)
    P′ ≡ P, P → Q, Q ≡ Q′ ⇒ P′ → Q′                         (Red ≡)

We write this process P as R_1 + R_2 if the variable x occurs neither in R_1 nor in R_2. The formal semantics for terms and processes is shown in Table 2. We refer readers to [1] for details.

Example 1 We specify a simple voting scheme V as a process as follows:

    V = (νc_1)(νc_2)(c_1⟨x_1⟩ | c_2⟨x_2⟩ | MIX(c_1, c_2, c_p)),
    MIX = MIX_0 + MIX_1,
    MIX_0 = c_1(y_1).c_2(y_2).c_p⟨(y_1, y_2)⟩,
    MIX_1 = c_1(y_1).c_2(y_2).c_p⟨(y_2, y_1)⟩.

This scheme consists of two voter subprocesses c_1⟨x_1⟩ and c_2⟨x_2⟩ and a process MIX that models an anonymous channel. These processes communicate over private channels c_1 and c_2, and a public channel c_p. Intuitively, communications over the public channel c_p are visible to an attacker while those over the private channels c_1 and c_2 are not. One of the possible executions of V is shown below:

    V → (νc_1)(νc_2)(c_1⟨x_1⟩ | c_2⟨x_2⟩ | MIX_0)
      → (νc_2)(c_2⟨x_2⟩ | c_2(y_2).c_p⟨(x_1, y_2)⟩)
      → c_p⟨(x_1, x_2)⟩.

Here the subprocess MIX of V first reduces to MIX_0, and then the first and the second voters send their respective votes x_1 and x_2 to MIX_0 in order. If there is another process that runs in parallel with V, it may receive the pair (x_1, x_2) over the public channel c_p. Other executions are also possible: MIX may reduce to MIX_1, and the second voter may send a vote before the first voter.

3. Specifying anonymity
The anonymity of a protocol is specified in terms of the observational equivalence between two instances of the protocol. Intuitively, two processes are observationally equivalent if and only if no attacker can distinguish between these processes by interacting with either of them. For example, the anonymity of the scheme V in Example 1 is specified by the observational equivalence

    V{v_1/x_1}{v_2/x_2} ∼ V{v_2/x_1}{v_1/x_2}

between V{v_1/x_1}{v_2/x_2} and V{v_2/x_1}{v_1/x_2}. In V{v_1/x_1}{v_2/x_2}, the first and second voters vote for candidates v_1 and v_2 respectively. In V{v_2/x_1}{v_1/x_2}, the voters vote for v_2 and v_1 respectively. This anonymity holds because, intuitively, even if an attacker observes the pair (v_1, v_2) of votes on the public channel c_p, he does not know which process MIX_0 or MIX_1 has sent it and which voter has sent v_1. Formally, observational equivalence is defined as follows [1]:

Definition 2 An evaluation context C is a process that is built from a hole [], parallel compositions C | P and P | C with a process, and a restriction (νa)C. A process P emits on M (P ↓_M) if and only if P ≡ C[M′⟨N⟩.R] for some evaluation context C that does not bind fn(M) and Σ ⊢ M = M′.
Observational equivalence ∼ is the largest symmetric relation R on closed processes such that P R Q implies
• if P ↓_M then Q ↓_M;
• if P → P′ then Q → Q′ and P′ R Q′ for some Q′;
• C[P] R C[Q] for all evaluation contexts C.

4. Automatic verification in Proverif
Proverif verifies a sufficient condition for the observational equivalence between processes fst(P) and snd(P) for a given biprocess P. A biprocess is a process in which terms of the form diff[M_1, M_2] may occur. Processes fst(P) and snd(P) are obtained from a biprocess P by replacing each term of the form diff[M_1, M_2] with M_1 and M_2 respectively. The semantics for biprocesses is defined by replacing rules (Red I/O), (Red Fun 1), and (Red Fun 2) with those in Table 3.

Table 3. Semantics for biprocesses.
    N⟨M⟩.Q | N′(x).P → Q | P{M/x} if Σ ⊢ fst(N) = fst(N′) and Σ ⊢ snd(N) = snd(N′)   (Red I/O)
    let x = D in P else Q → P{diff[M_1, M_2]/x} if fst(D) ⇓ M_1 and snd(D) ⇓ M_2     (Red Fun 1)
    let x = D in P else Q → Q if there is no M_1 such that fst(D) ⇓ M_1 and there is no M_2 such that snd(D) ⇓ M_2   (Red Fun 2)

For example, a biprocess if M = N then P else Q reduces to P if both equalities Σ ⊢ fst(M) = fst(N) and Σ ⊢ snd(M) = snd(N) hold. It reduces to Q if neither of the equalities holds. It reduces to no biprocess if exactly one of them holds. The sufficient condition is shown in the following lemma:

Lemma 3 ([1, Theorem 1]) Let P_0 be a closed biprocess. Then fst(P_0) ∼ snd(P_0) if for any evaluation

context C that has no occurrence of diff and any reduction sequence C[P_0] →* P, fst(P) → Q_1 implies that P → Q for some biprocess Q with fst(Q) ≡ Q_1, and symmetrically for snd(P) → Q_2.

Although the sufficient condition is useful in verifying many security properties, it does not work for the anonymity of many protocols. For example, to verify the anonymity of V in Example 1, we consider the biprocess P_0 = V{M_1/x_1}{M_2/x_2} where M_1 = diff[v_1, v_2] and M_2 = diff[v_2, v_1]. Consider the following process

    A = c(x).if x = (v_1, v_2) then c⟨1⟩ else 0

and an evaluation context C[] = A | []. Similarly to the execution shown after Example 1, we have an execution

    C[P_0] →→→ A | c_p⟨(M_1, M_2)⟩
           → if (M_1, M_2) = (v_1, v_2) then c_p⟨1⟩ else 0.

Let P be the last process above. Then, since we have fst(M_1) = v_1 and fst(M_2) = v_2, we have Σ ⊢ fst((M_1, M_2)) = fst((v_1, v_2)), hence fst(P) → c_p⟨1⟩. However, since we have snd(M_1) = v_2 and snd(M_2) = v_1, we do not have Σ ⊢ snd((M_1, M_2)) = snd((v_1, v_2)). Hence P reduces to no biprocess. Thus P_0 does not satisfy the sufficient condition in Lemma 3. In fact, Proverif fails to verify the observational equivalence.

5. Our technique and its soundness
In this section, we introduce a technique to overcome the problem described in the previous section. Then we show the soundness of the technique.

To overcome the problem, we replace the anonymous channel MIX with an alternative representation MIX′ of the anonymous channel defined as follows:

    MIX′ = MIX′_0 + MIX′_1,
    MIX′_0 = c_1(y_1).c_2(y_2).c_p⟨(diff[y_1, y_2], diff[y_2, y_1])⟩,
    MIX′_1 = c_1(y_1).c_2(y_2).c_p⟨(diff[y_2, y_1], diff[y_1, y_2])⟩.

The problem is fixed with this MIX′. For example, replace MIX with MIX′ in P_0 in the previous section. The reduction sequence in the example becomes

    C[P_0] →* A | c_p⟨(M′_1, M′_2)⟩
           → if (M′_1, M′_2) = (v_1, v_2) then c_p⟨1⟩ else 0,

where M′_1 = diff[diff[v_1, v_2], diff[v_2, v_1]] and M′_2 = diff[diff[v_2, v_1], diff[v_1, v_2]]. Let P′ be the last process above. Since we have fst(M′_1) = fst(diff[v_1, v_2]) = v_1 and similarly fst(M′_2) = v_2, we have Σ ⊢ fst((M′_1, M′_2)) = (v_1, v_2), hence fst(P′) → c_p⟨1⟩. Similarly, since we have snd(M′_1) = v_1 and snd(M′_2) = v_2, we also have Σ ⊢ snd((M′_1, M′_2)) = (v_1, v_2). Thus we have P′ → c_p⟨1⟩ as a biprocess. Thus this reduction sequence satisfies the sufficient condition in Lemma 3. The other reduction sequences are similarly checked by using Proverif, and we succeed in verifying fst(P_0) ∼ snd(P_0).

Now we show the soundness of the replacement by proving the equivalence between MIX and MIX′, which means that the replacement does not change the observational equivalence.

Theorem 4 We have the observational equivalences fst(MIX′) ∼ MIX and snd(MIX′) ∼ MIX.

Proof The first equivalence trivially follows from the definition of fst. From the definition of ≡, we have

    (νc)(c(x).MIX_0 | c(x).MIX_1 | c⟨M⟩) ≡ (νc)(c(x).MIX_1 | c(x).MIX_0 | c⟨M⟩).

Thus we have MIX ≡ snd(MIX′) from the definitions of '+', snd, and MIX′. Thus the claim follows from ≡ ⊆ ∼, which is shown as follows: for any processes P and Q such that P ≡ Q,
• If P ↓_M, then we have P ≡ C[M′⟨N⟩.R] and Σ ⊢ M = M′ for some process M′⟨N⟩.R and evaluation context C that does not bind fn(M′). Then Q ≡ P ≡ C[M′⟨N⟩.R], hence Q ↓_M.
• If P → P′, we have Q → P′ from the definition of → and have P′ ≡ P′ from the definition of ≡.
• For any evaluation context C, we have C[P] ≡ C[Q]. This is shown by induction on the construction of C and using the definition of ≡.
(QED)

6. Application to anonymity of FOO
We employ our technique to verify the anonymity of the FOO protocol on Proverif. The entire script for the verification is shown in Table 4. In the implementation of Proverif, the function symbol diff is renamed choice. There are two voters in this script. The first one votes for candidates cand1 and cand2 in the first and second executions respectively. The second one votes for candidates cand2 and cand1 in the first and second executions respectively. These voters first obtain signatures on their votes from the administrator admin through public channels. They then publish the commitments on the votes and the keys to open the commitments through anonymous channels. Anonymous channels are modeled by the process mix, which is the same as MIX′, except that the non-deterministic choice '+' is expanded according to its definition. When this script is input, Proverif outputs 'Observational equivalence is true.' It means that these two executions are not distinguished by the attacker and that the anonymity of the voters holds. We stress that Proverif will fail to verify the anonymity if our technique is not employed.

7. Related work
As we have mentioned, our technique is similar to that developed by Delaune, Ryan, and Smyth. They used the following technique to verify anonymity of protocols including FOO. They consider protocols of the form

    W = let x̃ = x̃_1 in P | let x̃ = x̃_2 in P

and anonymity W{ṽ_1/x̃_1}{ṽ_2/x̃_2} ∼ W{ṽ_2/x̃_1}{ṽ_1/x̃_2}. Here x̃, x̃_1, and x̃_2 are sequences of variables, and ṽ_1 and ṽ_2 are sequences of names. The process P is a process extended with some annotations and containing neither parallel composition ('|') nor conditional (let). They also
We stress if (M1,M2) = (v1, v2) then cp 1 else 0 ′ ′ that Proverif will fail to verify the anonymity if our tech- where M1 = diff[diff[v1, v2], diff[v2, v1]] and M2 = diff[ ′ nique is not employed. diff[v2, v1], diff[v1, v2]]. Let P be the last process above. ′ Since we have fst(M1) = fst(diff[v1, v2]) = v1 and simi- 7. Related work ′ ⊢ ′ ′ larly fst(M2) = v2, we have Σ fst((M1,M2)) = (v1, ′ As we have mentioned, our technique is similar to that v2), hence fst(P ) → cp⟨1⟩. Similarly, since we have ′ ′ ⊢ developed by Delaune, Ryan, and Smyth. They used the snd(M1) = v1 and snd(M2) = v2, we also have Σ ′ ′ ′ → ⟨ ⟩ following technique to verify anonymity of protocols in- snd((M1,M2)) = (v1, v2). Thus we have P cp 1 as a biprocess. Thus this reduction sequence satisfies the cluding FOO. They consider protocols of the form sufficient condition in Lemma 3. The other reduction se- W = let x˜ =x ˜1 in P | let x˜ =x ˜2 in P quences are similarly checked by using Proverif, and we { }{ } ∼ { }{ } ′ ∼ ′ and anonymity W v˜1/x˜1 v˜2/x˜2 W v˜2/x˜1 v˜1/x˜2 . succeed in verifying fst(P0) snd(P0). Now we show the soundness of the replacement by Herex ˜,x ˜1, andx ˜2 are sequences of variables, andv ˜1 and proving the equivalence between MIX and MIX′, which v˜2 are sequences of names. The process P is a process means that the replacement does not change the obser- extended with some annotations and containing neither | vational equivalence. parallel composition (‘ ’) nor conditional (let). They also

rewrite protocols to overcome the problem described in Section 4. For example, consider the following protocol:

    V′ = let x = x1 in P | let x = x2 in P,
    P = (*swap*) c⟨x⟩

where (*swap*) is an annotation that assists rewriting. As with the protocol V, Proverif fails to verify the anonymity of this protocol. They therefore rewrite this protocol into the equivalent one:

    V″ = let x = diff[x2, x1] in P | let x = diff[x1, x2] in P.

Proverif can verify the anonymity of this protocol, given the biprocess V″{diff[v1, v2]/x1}{diff[v2, v1]/x2}.

Both their technique and ours enable automatic verification by rewriting protocols into equivalent ones. In addition to the above rewriting, they also introduce another annotation and rewriting rule for protocols in which agents are synchronized. However, they consider only protocols of the above form, in which P does not contain parallel compositions. For this reason, to verify a protocol consisting of some processes communicating with each other, such as the voting scheme V, we must write a process P that simulates these processes. For example, in their verification of the FOO voting protocol, they combined a voter, a portion of the administrator, and an anonymous channel into a single process of the above form. On the other hand, such a transformation is not necessary with our technique.

8. Conclusion

In this paper, we described a technique we have developed for the automatic verification of anonymity of protocols that use anonymous channels, and proved its soundness. We also used the technique to verify the anonymity of the FOO protocol.

References

[1] B. Blanchet, M. Abadi and C. Fournet, Automated verification of selected equivalences for security protocols, J. Logic Algebr. Progr., 75 (2008), 3–51.
[2] B. Blanchet, Automatic verification of correspondences for security protocols, J. Comput. Secur., 17 (2009), 363–434.
[3] D. L. Chaum, Untraceable electronic mail, return addresses, and digital pseudonyms, Commun. ACM, 24 (1981), 84–90.
[4] A. Fujioka, T. Okamoto and K. Ohta, A Practical Secret Voting Scheme for Large Scale Elections, in: Advances in Cryptology - AUSCRYPT'92, J. Seberry and Y. Zheng eds., Lect. Notes Comput. Sci., Vol. 718, pp. 244–251, Springer-Verlag, Berlin, 1993.
[5] S. Kremer and M. D. Ryan, Analysis of an Electronic Voting Protocol in the Applied Pi Calculus, in: Programming Languages and Systems - 14th European Symposium on Programming, ESOP 2005, M. Sagiv ed., Lect. Notes Comput. Sci., Vol. 3444, pp. 186–200, Springer-Verlag, Berlin, 2005.
[6] S. Delaune, M. Ryan and B. Smyth, Automatic Verification of Privacy Properties in the Applied pi Calculus, in: Trust Management II, Y. Karabulut, J. C. Mitchell, P. Herrmann and C. Damsgaard Jensen eds., IFIP Advances in Information and Communication Technology, Vol. 263, pp. 263–278, Springer-Verlag, Berlin, 2008.

(* Defs. of commitment and signature schemes *)
fun ok/0. fun commit/2. fun vk/1.
fun blind/3. fun bsign/2. fun sign/2.
reduc open(commit(m, k), m, k) = ok.
reduc extract_open(commit(m, k), k) = m.
reduc unblind(bsign(blind(m, r, vk(sk)), sk),
              m, r, vk(sk))
      = sign(m, sk).
reduc verify(sign(m, sk), m, vk(sk)) = ok.
reduc extract_blind(blind(m, r, k)) = k.
reduc extract_bsign(bsign(m, sk)) = (m, vk(sk)).
reduc extract_sign(sign(m, sk)) = (m, vk(sk)).

(* Definition of processes in FOO *)
free ca, cco, ccv, cand1, cand2.

let voter =
  new rc; new rb; new rb0;
  let com = commit(v, rc) in
  let b = blind(com, rb, vkA) in
  out(ca, (sign(b, sk), b));
  in(ca, bs);
  let sig = unblind(bs, com, rb, vkA) in
  if verify(sig, com, vkA) = ok then
  out(cmv, (sig, com));
  out(cmo, (v, rc)).

let mix =
  new ch_choice;
  (out(ch_choice, ()) |
   (in(ch_choice, y);
    in(cin1, m0); in(cin2, m1);
    out(cout, (choice[m0, m1], choice[m1, m0]))) |
   (in(ch_choice, y);
    in(cin1, m0); in(cin2, m1);
    out(cout, (choice[m1, m0], choice[m0, m1])))).

let admin =
  in(ca, (s, b));
  if verify(s, b, vk(sk1)) = ok then
    out(ca, bsign(b, skA));
    in(ca, (s', b'));
    (if verify(s', b', vk(sk2)) = ok then
       out(ca, bsign(b', skA)))
  else if verify(s, b, vk(sk2)) = ok then
    out(ca, bsign(b, skA));
    in(ca, (s', b'));
    (if verify(s', b', vk(sk1)) = ok then
       out(ca, bsign(b', skA))).

process
  new sk1; new sk2; new skA;
  new cmv1; new cmo1; new cmv2; new cmo2;
  ((let v = choice[cand1, cand2] in
    let sk = sk1 in let vkA = vk(skA) in
    let cmv = cmv1 in let cmo = cmo1 in voter) |
   (let v = choice[cand2, cand1] in
    let sk = sk2 in let vkA = vk(skA) in
    let cmv = cmv2 in let cmo = cmo2 in voter) |
   (let cin1 = cmv1 in let cin2 = cmv2 in
    let cout = ccv in mix) |
   (let cin1 = cmo1 in let cin2 = cmo2 in
    let cout = cco in mix) |
   admin)

Table 4. Verification of FOO in Proverif.
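The intuition behind the mix process in Table 4 can also be sketched in ordinary code: because the channel forwards the two votes in a nondeterministically chosen order, the attacker's set of possible observations is the same whether the votes are (v1, v2) or (v2, v1). The following Python snippet is not part of the paper and the function name mix_outputs is ours; it merely enumerates the observable outputs of such a channel in the two executions and checks that they coincide.

```python
def mix_outputs(x1, x2):
    # The anonymous channel forwards the pair of votes in a
    # nondeterministically chosen order, so an observer sees
    # either (x1, x2) or (x2, x1), but never learns who sent
    # which vote.
    return {(x1, x2), (x2, x1)}

# First execution: the voters cast v1 and v2, respectively.
obs_first = mix_outputs("v1", "v2")
# Second execution: the votes are swapped, as with
# diff[v1, v2] and diff[v2, v1] in the biprocess.
obs_second = mix_outputs("v2", "v1")

# The two sets of possible observations coincide, which is the
# intuition behind Theorem 4: fst(MIX') ~ MIX ~ snd(MIX').
assert obs_first == obs_second
```

This is of course only the observational core of the argument; the soundness proof in Section 5 establishes the equivalence for the full process semantics.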


JSIAM Letters Vol.3 (2011) ISBN: 978-4-9905076-2-6 ISSN: 1883-0609 ©2011 The Japan Society for Industrial and Applied Mathematics

Publisher : The Japan Society for Industrial and Applied Mathematics 4F, Nihon Gakkai Center Building 2-4-16, Yayoi, Bunkyo-ku, Tokyo, 113-0032 Japan tel. +81-3-5684-8649 / fax. +81-3-5684-8663