A DOMAIN DECOMPOSITION METHOD FOR SOLVING ELECTRICALLY LARGE ELECTROMAGNETIC PROBLEMS
DISSERTATION
Presented in Partial Fulfillment of the Requirements for
the Degree Doctor of Philosophy in the Graduate
School of The Ohio State University
By
Kezhong Zhao, M.S.
*****
The Ohio State University 2007
Dissertation Committee:
Professor Jin-Fa Lee, Adviser
Professor Fernando L. Teixeira
Professor Ronald M. Reano
Approved by
______
Adviser
Graduate Program in Electrical and Computer Engineering
© Copyright by
Kezhong Zhao
2007
ABSTRACT
This dissertation presents a domain decomposition method as an effective and
efficient preconditioner for frequency domain FEM solution of geometrically complex
and electrically large electromagnetic problems. The method reduces memory
requirements by decomposing the original problem domain into several non-overlapping and possibly repeatable sub-domains. At the heart of this research are the Robin-to-Robin map, the “cement” finite element coupling of non-conforming grids, and the concept of duality pairing. Robin’s transmission condition is employed on the interfaces between adjacent sub-domains to enforce the continuity of the electromagnetic fields and to ensure that the sub-domain problems are well-posed. Through the introduction of cement variables, the meshes at an interface may be non-conformal, which significantly relaxes the meshing procedure. By following the spirit of duality pairing, a symmetric system is obtained that better reflects the physical nature of the problem. These concepts, in conjunction with the so-called finite element tearing and interconnecting algorithm, form the basic modules of the present domain decomposition method. To enhance the convergence of the DDM solver, Krylov solvers are employed and studied in place of classical stationary solvers.
To account for the radiation condition exactly, and thus eliminate spurious reflections, a boundary element formulation is hybridized with the present DDM, again through the aforementioned concepts. One special case of this hybridization is the well-known hybrid finite element and boundary element method. It will be shown that the proposed hybrid simultaneously offers: (1) symmetry, (2) modularity, (3) non-conformity between the FEM and BEM domains, (4) freedom from internal resonances, and (5) a natural and effective preconditioning scheme that guarantees a spectral radius less than or equal to one.
Lastly this dissertation presents a DDM solution scheme for analyzing electromagnetic problems involving multiple separable scatterers. The method first decomposes the original problem into several disjoint sub-regions. In each sub-region, the domain decomposition method is further applied rendering geometrically complicated and electrically large sub-region problems tractable. The sub-regions communicate through the near-field Green’s function. To overcome the vast computational costs required in exchanging information between electrically large sub-regions, the adaptive cross approximation algorithm is adopted to expedite the process.
Dedicated to My Family and the Loving Memory of My Late Wife, Xiu Ming Lin
ACKNOWLEDGMENTS
A fruitful harvest comes only with constant hard work. At this moment of celebrating the end of my long student career and the beginning of new challenges, I wish to express my deepest appreciation to my advisor, Prof. Jin-Fa Lee, for introducing me to this fantastic subject, for sharing his passion, and for his guidance and continuous encouragement. I understand that such appreciation can never be expressed in a few simple words, and it will remain in my heart for the rest of my life.
I would also like to express my deepest gratitude to the present and former members of the Computational Science Group at ElectroScience Laboratory, especially Dr. Marinos N. Vouvakis, Dr. Seung-Mo Seo, Seung-Cheol Lee, and Vineet Rawat, not only for their friendship, but also for our stimulating discussions and their continuous encouragement throughout my graduate study. Thanks are extended to Prof. Jun Zou of Tsinghua University for his friendship and stimulating discussions during his visit to ElectroScience Laboratory.
I wish to thank my examination committee members, Prof. Fernando L. Teixeira and Prof. Ronald M. Reano, for their constructive comments, input and service throughout the Ph.D. examinations. In addition, I would like to extend my thanks to the entire ElectroScience Laboratory family, including fellow graduate students, staff, network and system administrators, researchers and professors, for their help and for providing a wonderful research environment. In particular, I would like to express my gratitude to Prof. Prabhakar H. Pathak, Prof. Denny Burnside, Dr. Teh-Hong Lee, and Dr. Robert Burkholder for their kindness, teaching and collaboration. I am especially indebted to Prof. Denny Burnside and Dr. Teh-Hong Lee for our long-time collaboration, for permission to use some of their results, and for providing some interesting real-life geometries along with their simulation results. I would also like to mention Mr. Kevin Reaver for his continuous computer support and friendship.
I would also like to express my gratitude to ANSOFT Corporation for the financial support of my last quarter’s tuition, as well as for their continuous support of and interest in this research over the years.
Finally, I would like to express my most sincere gratitude to my parents and my wife, Xi Lin, for their understanding, patience, support and unconditional love.
VITA
April 08, 1978 ...... Born in Fuzhou, Fujian, China
June 2001 ...... B.S. Electrical Engineering, The Ohio State University, Columbus, OH.
March 2003 ...... M.S. Electrical Engineering, The Ohio State University, Columbus, OH.
March 2003 – present ...... ElectroScience Laboratory, Electrical and Computer Engineering Department, The Ohio State University, Columbus, OH.
PUBLICATIONS
Research Publications
K. Zhao, and J.-F. Lee, “A Single-Level IE-QR Algorithm to Model Large Microstrip Antenna Arrays”, IEEE Transactions on Antennas and Propagation, vol. 52, no. 10, pp. 2580-2585, Oct., 2004.
M. N. Vouvakis, S.-C. Lee, K. Zhao and J.-F. Lee, “A Symmetric FEM-IE Formulation with a Single-Level IE-QR Algorithm for Solving Electromagnetic Radiation and Scattering Problems”, IEEE Transactions on Antennas and Propagation, vol. 52, no. 11, pp. 3060-3070, Nov., 2004.
S.-C. Lee, M. N. Vouvakis, K. Zhao and J.-F. Lee, “Analyzing Microwave Devices Using a Symmetric Coupling of Finite and Boundary Elements”, International Journal for Numerical Methods in Engineering, vol. 64, no. 4, pp. 528-546, Sept., 2005.
K. Zhao, M. N. Vouvakis, and J.-F. Lee, “The Adaptive Cross-Approximation Algorithm for Accelerated Method of Moments Computations of EMC Problems”, IEEE Transactions on Electromagnetic Compatibility, vol. 47, no. 4, pp. 763-773, Nov., 2005.
K. Zhao, M. N. Vouvakis, and J.-F. Lee, “Solving Electromagnetic Problems Using A Novel Symmetric FEM-BEM Approach”, IEEE Transactions on Magnetics, vol. 42, no. 4, pp. 583-587, Apr., 2006.
M. N. Vouvakis, K. Zhao, and J.-F. Lee, “FEM Analysis of Infinite Periodic Structures with Non-Matching Triangulations”, IEEE Transactions on Magnetics, vol. 42, no. 4, pp. 691-695, Apr., 2006.
M. N. Vouvakis, K. Zhao, S. M. Seo, and J.-F. Lee, “A Domain Decomposition Approach for Non-Conformal Couplings Between Finite and Boundary Elements for Unbounded Electromagnetic Problems in R3”, Journal of Computational Physics, vol. 225, no. 1, pp. 975-994, Jul. 2007.
K. Zhao, V. Rawat, S.-C. Lee, and J.-F. Lee, “A Domain Decomposition Method with Non-Conformal Meshes for Finite Periodic and Semi-Periodic Structures”, IEEE Transactions on Antennas and Propagation, Sept. 2007.
FIELDS OF STUDY
Major Field: Electrical and Computer Engineering
TABLE OF CONTENTS
Abstract
Acknowledgments
Vita
List of Tables
List of Figures
1. Introduction
1.1 Problem Statement
1.2 Background
1.3 Benefits of the DDM
1.4 Summary of Present DDM Algorithm
1.5 Notations
1.6 Outline
2. Non-Conforming Finite Element Domain Decomposition for Time-Harmonic Maxwell Equations
2.1 Boundary Value Statement
2.2 Symmetric Formulation
2.3 Invertibility of Sub-Domain Matrix
2.4 Krylov Subspace Method Solution
2.5 Compression of the Numerical Green’s Function
2.6 Numerical Implementation
2.6.1 Finding Neighboring Domains
2.6.2 Rotational Symmetry
2.6.3 Reordering of Domains
3. FEM Domain Decomposition Results and Numerical Studies
3.1 Accuracy Study
3.1.1 Rectangular Waveguide: Error Convergence
3.1.2 Coaxial Section Array: Stationary Solver vs. Krylov Solver
3.1.3 Corrugated Horn Antenna: Comparison with an Existing Solver
3.1.4 Vivaldi Antenna Arrays: Accuracy of the FETI Like Algorithm
3.1.5 Dielectric Cylinder and Turbine Inlet: Rotational Symmetry Modeling
3.1.6 Mobile Phones in the Presence of a Human Head
3.1.7 Metamaterial Applications
3.1.7.1 Plano-Concave Lens
3.1.7.2 Microwave Photonic Crystal: Geometrically Non-Conformal Modeling
3.2 Convergence Study
3.2.1 The Effect of Diagonal Scaling and Reordering of Domains
3.2.2 FETI vs. FETI+QR
3.2.3 Choice of Krylov Solvers
4. A Domain Decomposition Based Finite Element and Boundary Element Coupling
4.1 Symmetric FEM and BEM Coupling
4.1.1 Boundary Value Statement
4.1.2 Transmission Problem
4.1.3 Exterior Problem
4.1.4 Interior Problem
4.1.5 Coupled Problem
4.1.6 Matrix Form
4.2 Preconditioning Schemes
4.2.1 DDB Preconditioner
4.2.2 AMS Preconditioner
4.2.3 MMS Preconditioner
4.3 Hybrid DDM and BEM
5. Hybrid FEM-BEM Results and Numerical Studies
5.1 Air Box: Internal Resonance and Numerical Stability Study
5.2 Dielectric Sphere: Convergence Study
5.3 Coated Sphere Scattering: Accuracy
5.4 Performance of DDB, AMS and MMS Preconditioners
5.4.1 Dielectric Sphere
5.4.2 RCS from a Generic Battle Ship
5.5 Large Antenna Arrays
5.5.1 Patch Antenna Arrays
5.5.2 Ultra Wide Band Antenna Arrays
6. Metamaterial Electromagnetic Cloak: Derivation and Full-Wave Simulations
6.1 Closed Form of Material Properties for EM Cloaking
6.1.1 Derivation of Material Properties for the Cloaking of PEC Spheres
6.2 Numerical Experiments
7. Multi-Region and Multi-Technique Formulation
7.1 Domain Decomposition Based Hybrid Method
7.1.1 Boundary Value Statement
7.1.2 Variational Form of Interior Problem
7.1.3 Variational Form of Exterior Problem
7.1.4 Matrix Equation
7.1.5 Discussion
7.2 Inter-Region Computation
7.2.1 Representation Formulae
7.2.2 Matrix Form
7.2.3 The ACA Algorithm
7.3 Numerical Results
7.3.1 Validation of Multiple-Region Solver
7.3.2 Reflector Antenna System
7.3.3 A Conformal Ultra Wide Band Antenna Array with a Slot Frequency Selective Surface
8. Conclusion
Bibliography
LIST OF TABLES
Table 3.1 Comparison of FEM and DDM for the phone and head example at 700MHz.
Table 3.2 Computational statistics for MPC geometry.
Table 3.3 Performance of QR compression for the SRR example.
Table 3.4 Convergence of various Krylov solvers: Vivaldi arrays example.
Table 3.5 Convergence of various Krylov solvers: SRR lens example.
Table 5.1 Performance of three preconditioners for dielectric sphere example.
Table 5.2 Computational statistics of the DD-FE-BEM for solving bistatic RCS of a generic battleship using three different preconditioning strategies.
Table 5.3 DDM performances of patch antenna arrays with ABC truncation.
Table 5.4 DDM performances of patch antenna arrays with BEM truncation.
Table 5.5 DDM performances of UWB arrays with BEM truncation.
Table 7.1 CPU time per iteration for the reflector antenna system (hh:mm:ss).
Table 7.2 CPU time per iteration for UWB array with slot FSS (hh:mm:ss).
LIST OF FIGURES
Figure 2.1 Decomposition of Ω into two non-overlapping domains Ω1 and Ω2.
Figure 3.1 Free-space rectangular waveguide, error in S-parameters with mesh refinement (a) S11 error, (b) S12 error.
Figure 3.2 S-parameter of an array of coaxial sections; (a) magnitude, (b) phase.
Figure 3.3 Iterations of symmetric Gauss-Seidel and GCR solvers of an array of coaxial sections for a range of frequencies.
Figure 3.4 Corrugated horn waveguide, (a) 2-D cross-section and dimension, (b) far field pattern.
Figure 3.5 (a) The dimension of single Vivaldi element; (b) The far field patterns of a 100x100 Vivaldi array using direct DDM and the FETI like algorithm.
Figure 3.6 Bistatic RCS pattern of a dielectric cylinder in the xz-plane.
Figure 3.7 E-plane bistatic RCS pattern of a PEC turbine inlet. Inserts show the geometries and mesh for MoM model.
Figure 3.8 Geometry of a mobile phone and a human head.
Figure 3.9 Splitting of the geometry into (a) the surrounding region including the head and (b) the phone region.
Figure 3.10 Plot of the magnitude of the S-parameters in the frequency range of 0.7GHz ~ 2.2GHz. The insert shows the field distribution at f = 700MHz.
Figure 3.11 Flow chart of modeling phones in the effect of human head via the FETI like algorithm.
Figure 3.12 The top view of DDM modeling of a plano-concave lens. The blue cell is a unit cell of the lens. The red cell is a quarter-wavelength monopole antenna. The green cell is the air box. Along the z-axis, there are two air boxes on the top and bottom.
Figure 3.13 The geometry of the SRR lens, (a) views of a unit cell and its dimensions; (b) assembly of the lens. The permittivity of the substrate is 2.2.
Figure 3.14 Real part of z-component of total E-field along z = 0 plane; (a) the positive lens, (b) the negative lens, (c) the SRR lens.
Figure 3.15 Dimensions and domain setup of the microwave photonic crystal.
Figure 3.16 Field distributions at the center plane, (a) f = 6.6GHz, (b) f = 9.7GHz.
Figure 3.17 Convergence histories for the SRR lens, (a) case study, (b) solver study.
Figure 4.1 A generic EM radiation/scattering problem used for the derivation of FEM-BEM. The insert shows the non-conforming FEM and BEM meshes.
Figure 5.1 Condition number in the neighborhood of the “internal” resonance; (a) Costabel’s symmetric FEBI formulation, (b) present approach without diagonal scaling, (c) present approach with diagonal scaling.
Figure 5.2 Eigenvalue distribution of the preconditioned system $(\mathbf{I}-\mathbf{M}^{-1}\mathbf{A})$ for: (a) N = 1076 unknown problem, (b) N = 2708 unknown problem, (c) N = 4824 unknown problem. The frequency is kept constant at f = 300MHz.
Figure 5.3 Convergence properties of proposed DD FEM-BEM, (a) RCS error vs. discretization, (b) history of the iterative convergence for the smallest and largest discretization.
Figure 5.4 Scattering by a dielectric coated PEC sphere at the internal resonance frequency.
Figure 5.5 Scattering by dielectric coated PEC spheres; (a) small sphere, (b) large sphere.
Figure 5.6 A generic battle ship, (a) the geometry and dimensions; and, (b) computational domain for DD-FEM-BEM.
Figure 5.7 Comparisons of the bistatic RCS results of DD-FE-BEM and the MoM, (a) 30MHz, (b) 60MHz.
Figure 5.8 Field distributions, (a) 30MHz, (b) 60MHz.
Figure 5.9 Dimensions and geometry of a coaxial fed patch array.
Figure 5.10 Far field patterns of patch arrays, (a) 2x2, (b) 7x7, (c) 11x11.
Figure 5.11 Field distributions of patch arrays, (a) 2x2, (b) 7x7, (c) 11x11.
Figure 5.12 A UWB antenna array, (a) dimensions of a unit cell in unit mm, (b) a 5x5 UWB array.
Figure 5.13 Far field patterns of UWB arrays, (a) 3x3, (b) 7x7, (c) 10x10, (d) 20x20.
Figure 5.14 Directivity of a 50x50 UWB array as a function of frequency.
Figure 5.15 Electric field distributions of a 50x50 UWB array, (a) 12GHz, (b) 16GHz, (c) 20GHz.
Figure 5.16 Far field patterns of a 50x50 UWB array, (a)-(c) E-plane, (a) 12GHz, (b) 16GHz, (c) 20GHz, (d)-(f) H-plane, (d) 12GHz, (e) 16GHz, (f) 20GHz.
Figure 6.1 Cloaking of a PEC sphere of radius R1.
Figure 6.2 The scattering of PEC sphere without cloaking at f = 100MHz. Inner sphere shows the location of PEC sphere and outer sphere shows the location of cloak, for comparison purpose.
Figure 6.3 The scattering of PEC sphere with cloaking, (a) f = 100MHz, (b) f = 150MHz.
Figure 6.4 Scattered far fields with and without cloak, (a) f = 100MHz, (b) f = 150MHz.
Figure 7.1 Domain decomposition of a two-object problem.
Figure 7.2 Radiation pattern of an electric dipole in the presence of a PEC cube.
Figure 7.3 (a) Geometry of a reflector antenna system, (b) dimension of the corrugated horn.
Figure 7.4 Radiation patterns of horn antenna with main reflector and sub-reflector, (a) comparison of MoM and PO modeling of sub-reflector, (b) first iteration solution vs. last iteration solution.
Figure 7.5 Current distributions, (a) horn, (b) sub-reflector, (c) main-reflector.
Figure 7.6 Dimensions of unit cell of slot-FSS element in unit mm.
Figure 7.7 Surface field distributions at the truncation boundary, (a) top view of array, (b) side view of array, (c) bottom view of FSS, (d) top view of FSS.
Figure 7.8 Far field comparison of UWB array with and without presence of FSS, (a) φ=0 plane, (b) φ=90 plane.
Figure 7.9 Convergence history of DDM solver for UWB array and FSS.
CHAPTER 1
INTRODUCTION
1.1 Problem Statement
The accurate simulation of electrically large and geometrically complicated electromagnetic (EM) problems is of vital importance in many areas of electrical engineering, but it is also a very challenging task. The scope and application of traditional approaches such as the finite element method (FEM) [1], the boundary element method (BEM) [2] and the finite difference method (FDM) [3] are limited to problems of moderate electrical size and modest geometric complexity. These limitations stem from the vast computational resources required by these numerical methods and from the unsatisfactory convergence of iterative solvers in the absence of effective preconditioners.
The non-overlapping domain decomposition method (DDM) [4]-[17] has emerged as a powerful and attractive technique for the numerically rigorous solution of Maxwell’s equations, owing to its inherent parallelism and its effectiveness as a preconditioner. The DDM is based on a divide-and-conquer philosophy. Instead of tackling a large and complex problem directly as a whole, the original problem is partitioned into smaller, possibly repetitive, and easier-to-solve sub-domains. Suitable boundary conditions, called transmission conditions, are prescribed at the interfaces between adjacent sub-domains to enforce the continuity of the electromagnetic fields. These transmission conditions are imposed iteratively: starting from an arbitrary initial guess for each sub-domain, the sub-domains exchange information with each other until the solution reaches equilibrium.
1.2 Background
In the EM community, the non-overlapping DDM was first introduced by Després
in [9] for 2-D problems with a transmission condition (TC) of Robin’s type. For this type
of interface condition, it was proven in [10] that sub-domain solutions converge to the
solution of original problem. This algorithm has received considerable attention in the
past decade and several extensions and improvements have been presented in [11]-[14]
(and the references therein).
However, one serious drawback of the algorithm is its requirement of periodic
meshes. Meshes at the interfaces are constrained to be identical for adjacent sub-domains,
leading to considerable difficulty in mesh generation for arbitrary complex geometries.
To alleviate the constraints of periodic meshes, the non-overlapping and non-conforming
DDM was proposed in [15] through the introduction of additional surface unknowns at
the interface. By introducing these surface unknowns, the information in the entire volume of a domain is translated into information on the boundary surface, resulting in a
tremendous reduction in memory requirements. The development of [15] was then
extended in [16][17] to further reduce the computational burdens of the DDM, via the
adoption of a Finite Element Tearing and Interconnecting (FETI) like algorithm. The
FETI like algorithm, in essence, is the computation of a “transfer function” or a
“numerical Green’s function” in matrix form. Once the numerical Green’s function matrix is obtained, it can be readily combined with other sub-domains in the DDM solution process through a matrix-vector multiplication, instead of a computationally involved matrix solution.
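The idea of trading repeated sub-domain solves for a precomputed interface-to-interface matrix can be illustrated with a small sketch. This is not the dissertation's FETI like algorithm, which involves Robin data and cement variables; it is a generic toy model in which a hypothetical sub-domain matrix `K` and a hypothetical set of interface unknowns `iface` are assumed, and the "numerical Green's function" is taken to be $\mathbf{G} = \mathbf{R}\mathbf{K}^{-1}\mathbf{R}^{T}$ with $\mathbf{R}$ the restriction onto the interface.

```python
import numpy as np

def numerical_greens_function(K, iface):
    """Return G = R K^{-1} R^T, where R restricts to the interface unknowns.

    K and iface are illustrative stand-ins, not quantities from the text.
    """
    n, m = K.shape[0], len(iface)
    R_T = np.zeros((n, m), dtype=K.dtype)
    R_T[iface, np.arange(m)] = 1.0       # one unit excitation per interface unknown
    X = np.linalg.solve(K, R_T)          # one solve with many right-hand sides
    return X[iface, :]                   # sample every response on the interface

rng = np.random.default_rng(0)
n = 30
K = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
K = K + K.T + 10 * n * np.eye(n)         # complex symmetric and well conditioned

iface = np.array([0, 3, 7, 12])
G = numerical_greens_function(K, iface)

# During the iteration, an interface excitation g is mapped to the interface
# response by a small dense matrix-vector product ...
g = rng.standard_normal(len(iface)) + 0j
via_G = G @ g

# ... which reproduces what a full sub-domain solve would have returned.
rhs = np.zeros(n, dtype=complex)
rhs[iface] = g
direct = np.linalg.solve(K, rhs)[iface]
```

In this toy setting `G` has only as many rows and columns as there are interface unknowns, which is why identical sub-domains can share a single precomputed matrix.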
1.3 Benefits of the DDM
Three classes of EM problems can be effectively and efficiently addressed by
using the DDM.
1. Electrically large EM problems without repetitions or symmetries. For electrically
large EM problems, direct solution methods such as Gaussian elimination or LU
decomposition are no longer applicable due to their $O(N^2)$ memory and $O(N^3)$
central processing unit (CPU) time requirements, where $N$ is the number of unknowns.
Iterative solution methods such as the conjugate gradient (CG) method are the only
option, but their convergence is often erratic or fails altogether. Much of the work in the
DDM in this aspect is related to the selection of the transmission conditions to ensure
the convergence of the DD algorithm. When the transmission conditions are properly
devised, DDM becomes an effective preconditioner for such problems. Furthermore,
memory requirement can be greatly reduced since DDM can be easily parallelized.
Specific applications in this class may include: modeling and design of the Rotman
lens, computation of radar cross section (RCS) from realistic military targets,
electromagnetic compatibility/interference (EMC/EMI) problems, and
electromagnetic field effects on tissues and human bodies.
2. Electrically large EM problems with repetitions and local symmetries. The potential
of the DDM becomes even more pronounced for problems exhibiting a large
number of repetitions and/or local symmetries. By taking advantage of the
repetitions and symmetries and utilizing the FETI like algorithm, the computational
resources can be further reduced. The FETI like algorithm is also inherently a parallel
procedure, solving the same matrix equation with multiple right-hand-sides.
Important applications in this class include: finite antenna arrays, metamaterials,
photonic crystals, photonic band gap (PBG) and electromagnetic band gap (EBG)
structures, and conformal antenna arrays.
3. Multiple-Region Problems. DDM can serve as a coupling procedure for iterative
multi-method and multi-region solutions. Hybridization of different numerical
techniques can be conveniently addressed in the framework of the DDM. One
potential application includes hybridization of a FEM for inhomogeneous structures
and a BEM to eliminate spurious reflections from the truncation boundary. As will be
shown in the latter chapters, DD based hybrid FEM-BEM offers several distinctive
and attractive features. It is also relatively easy for DDM to model the problems
involving multiple disjoint targets. This situation is tackled based on hierarchical
DDM concept. Basically, the problem domain is first decomposed into separable sub-
regions, and then in each sub-region the most efficient method can be utilized
independently. The communications between sub-regions are done through
equivalent current sources with the aid of near-field Green’s functions. Specific
applications include: reflector antenna system analysis and design, antenna arrays
mounted on large platforms (such as aircraft and battleships), and RCS of military
targets concealed under foliage.
1.4 Summary of Present DDM Algorithm
The core of this research is the investigation and application of domain decomposition for the solution of the time-harmonic Maxwell’s equations. This dissertation, as a continuation of the earlier work on DDM [15], [16], includes the main components listed below.
1) We remark that all of the aforementioned formulations [9]-[17] make use of classical
stationary iterative solvers which have well-known limitations. This dissertation
addresses the issue of solving DDM matrices effectively through the employment of
Krylov-type solvers. In solving the DDM matrices using Krylov-type solvers, the
choice of solvers is of paramount importance. We compare a few popular Krylov
solvers on some complex examples and conclude that GCR offers the best
performance in terms of both iteration count and CPU time.
2) A symmetric DDM formulation to better reflect the physical nature of the problem is
derived in this work. When the actual physical problem involves only reciprocal
media, reciprocity relationship demands a symmetric system. Note that the
formulation of [15] and [16] results in a non-symmetric linear system.
3) The rank-revealing QR factorization is applied to compress the FETI matrices and
thereby reduce memory and CPU requirements of the DDM.
4) An ad-hoc reordering of domains to mimic the wave front propagation is proposed to
improve the convergence of the DDM solver.
5) To account for the radiation condition exactly, the presence of infinite space is modeled
through the marriage of the DDM with a BEM.
6) A domain decomposition framework is derived to hybridize various numerical
methods for the modeling of EM problems involving separable objects. In this work,
DDM will be hybridized with DDM itself to analyze antenna array structures
displaying different periodicities and symmetries.
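Item 1 above singles out GCR (generalized conjugate residual) as the preferred Krylov solver. For orientation, a minimal textbook GCR iteration with full orthogonalization is sketched below; it is not the dissertation's implementation, and the test matrix (an identity plus a small random perturbation, loosely mimicking a well-preconditioned system) as well as the names `gcr`, `tol` and `max_iter` are assumptions made for illustration.

```python
import numpy as np

def gcr(A, b, tol=1e-10, max_iter=200):
    """Textbook GCR: minimize ||b - A x|| over a growing Krylov subspace."""
    x = np.zeros_like(b, dtype=complex)
    r = b.astype(complex) - A @ x
    P, AP = [], []                       # search directions p_j and their images A p_j
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        p = r.copy()
        Ap = A @ p
        # Orthogonalize A p against all previous (normalized) A p_j.
        for pj, Apj in zip(P, AP):
            beta = np.vdot(Apj, Ap)      # vdot conjugates its first argument
            p -= beta * pj
            Ap -= beta * Apj
        nrm = np.linalg.norm(Ap)
        p /= nrm
        Ap /= nrm
        alpha = np.vdot(Ap, r)           # step length that minimizes the residual
        x += alpha * p
        r -= alpha * Ap
        P.append(p)
        AP.append(Ap)
    return x, max_iter

rng = np.random.default_rng(1)
n = 40
E = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = np.eye(n) + (0.2 / np.sqrt(2 * n)) * E   # eigenvalues clustered around 1
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

x, iters = gcr(A, b)
```

Because every new direction is orthogonalized against all earlier ones, storage grows with the iteration count; practical codes often restart or truncate, a choice this sketch deliberately leaves out.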
1.5 Notations
In this section, we explain in detail some of the notation used throughout the manuscript. We will use boldface capital letters to represent matrices and operators, except when explicitly stated otherwise; boldface lowercase letters will represent column/row vectors and vector fields. Position vectors $\mathbf{r}$ and $\mathbf{r}'$ refer to the observation and source points, respectively. Throughout the document the free space wave number will be denoted by $k_0 = \omega\sqrt{\mu_0\varepsilon_0}$, where $\omega = 2\pi f$ is the angular frequency, and $\varepsilon_0$ and $\mu_0$ are the free space permittivity and permeability, respectively. Note that $\varepsilon_r = \varepsilon/\varepsilon_0$ and $\mu_r = \mu/\mu_0$ will represent the relative permittivity and permeability of dielectric and magnetic materials, respectively. The free space intrinsic impedance will be symbolized by $\eta = \sqrt{\mu_0/\varepsilon_0}$. The fundamental solution (or Green’s function) for the scalar Helmholtz equation in free-space will be denoted by

$$g(\mathbf{r}|\mathbf{r}') = \frac{e^{-jk_0|\mathbf{r}-\mathbf{r}'|}}{4\pi|\mathbf{r}-\mathbf{r}'|}, \quad \mathbf{r}\neq\mathbf{r}'. \tag{1.1}$$
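As a quick numerical illustration of these definitions, the sketch below evaluates the free-space wave number, the intrinsic impedance and the Green's function of (1.1) using only the Python standard library. The function names `wavenumber` and `greens_function`, the constant names, and the 300 MHz test frequency are choices made here for illustration, not taken from the text.

```python
import cmath
import math

MU0 = 4e-7 * math.pi        # free-space permeability in H/m
EPS0 = 8.8541878128e-12     # free-space permittivity in F/m
ETA = math.sqrt(MU0 / EPS0) # free-space intrinsic impedance in ohms

def wavenumber(f):
    """k0 = omega * sqrt(mu0 * eps0) with omega = 2 * pi * f."""
    return 2.0 * math.pi * f * math.sqrt(MU0 * EPS0)

def greens_function(r, r_prime, k0):
    """Scalar free-space Green's function of (1.1); undefined for r == r'."""
    R = math.dist(r, r_prime)
    if R == 0.0:
        raise ValueError("g(r|r') is singular at r = r'")
    return cmath.exp(-1j * k0 * R) / (4.0 * math.pi * R)

f = 300e6                        # example frequency in Hz
k0 = wavenumber(f)
r_obs = (1.0, 0.0, 0.0)          # observation point
r_src = (0.0, 0.0, 0.0)          # source point
g = greens_function(r_obs, r_src, k0)
```

The magnitude of `g` decays as $1/(4\pi|\mathbf{r}-\mathbf{r}'|)$ while only the phase depends on $k_0$, and the function is symmetric under exchange of observation and source points.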
The surface integral of two complex-valued vector functions will be conveniently abbreviated as

$$\langle \mathbf{u},\mathbf{v}\rangle_{\Gamma} = \int_{\Gamma} (\mathbf{u}\cdot\mathbf{v})\,dx^2. \tag{1.2}$$

Similarly, the volume integral of two complex-valued functions in a domain $\Omega$ is denoted by

$$(\mathbf{u},\mathbf{v})_{\Omega} = \int_{\Omega} (\mathbf{u}\cdot\mathbf{v})\,dx^3. \tag{1.3}$$

Several spaces need to be defined beforehand. The following convention will be adopted: spaces of scalar valued functions will be denoted by $H$, which abbreviates Hilbert space, whereas for spaces of vector valued functions the boldface letter $\mathbf{H}$ will be used. One of the most important spaces in electromagnetics is that of curl-conforming functions in a domain $\Omega$:

$$\mathbf{H}(\mathrm{curl},\Omega) = \left\{ \mathbf{u} \;\middle|\; \int_{\Omega} \left( |\nabla\times\mathbf{u}|^2 + |\mathbf{u}|^2 \right) dx^3 < \infty \right\}. \tag{1.4}$$

This is the space where electric and magnetic fields reside; the physical meaning of the space $\mathbf{H}(\mathrm{curl},\Omega)$ is that in domain $\Omega$, the electric and magnetic energies are finite.
Throughout the manuscript, we will frequently encounter three trace operators on the boundary surface $\Gamma\equiv\partial\Omega$. The first one is the tangential surface trace or Dirichlet trace $\gamma_t$, given as

$$\gamma_t\mathbf{u} := \hat{n}\times(\mathbf{u}\times\hat{n}), \tag{1.5}$$

where $\hat{n}$ denotes the outwardly directed unit normal vector of the surface $\Gamma$. In other words, $\gamma_t\mathbf{u}$ contains the tangential components of the vector field $\mathbf{u}$ on the surface $\Gamma$. The second trace is the twisted tangential trace $\gamma_\times$, defined as

$$\gamma_\times\mathbf{u} := \hat{n}\times\mathbf{u}, \tag{1.6}$$

which also contains the tangential components of $\mathbf{u}$ on $\Gamma$, but twisted by 90° around $\hat{n}$. The last one is the magnetic trace or Neumann trace $\gamma_N$, given as

$$\gamma_N\mathbf{u} := \hat{n}\times(\nabla\times\mathbf{u}), \tag{1.7}$$

which will be used to define the surface electric current.
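The two algebraic traces (1.5) and (1.6) can be checked pointwise with a few lines of code. The sketch below is only an illustration on a single field sample: the helper names `cross`, `dot`, `tangential_trace` and `twisted_trace`, and the sample vectors, are assumptions introduced here, and the Neumann trace (1.7) is omitted because it requires the curl of the field.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def tangential_trace(u, n):
    """gamma_t u = n x (u x n): strips the normal component of u."""
    return cross(n, cross(u, n))

def twisted_trace(u, n):
    """gamma_x u = n x u: tangential part of u rotated by 90 degrees about n."""
    return cross(n, u)

n_hat = (0, 0, 1)            # unit normal of a surface lying in the xy-plane
u = (1, 2, 3)                # sample field value on the surface

u_t = tangential_trace(u, n_hat)   # (1, 2, 0)
u_x = twisted_trace(u, n_hat)      # (-2, 1, 0)
```

Both results lie in the tangent plane, have the same length, and are mutually orthogonal, which is the pointwise content of the statement that $\gamma_\times\mathbf{u}$ is $\gamma_t\mathbf{u}$ twisted by 90° around $\hat{n}$.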
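The following two important spaces need to be defined:

$$\mathbf{H}^{-1/2}(\mathrm{div}_\Gamma,\Gamma) = \left\{ \mathbf{u}\in\mathbf{H}^{-1/2}(\Gamma),\ \mathrm{div}_\Gamma\,\mathbf{u}\in H^{-1/2}(\Gamma) \right\} \tag{1.8}$$

and

$$\mathbf{H}_{\perp}^{-1/2}(\mathrm{curl}_\Gamma,\Gamma) = \left\{ \mathbf{u}\in\mathbf{H}_{\perp}^{-1/2}(\Gamma),\ \mathrm{curl}_\Gamma\,\mathbf{u}\in H^{-1/2}(\Gamma) \right\}, \tag{1.9}$$

where $\mathrm{div}_\Gamma$ and $\mathrm{curl}_\Gamma$ are the surface divergence and curl operators as defined in [18]. From an engineering point of view, it is sufficient to say that $\mathbf{H}^{-1/2}(\mathrm{div}_\Gamma,\Gamma)$ is the space to which some of the most famous vector basis functions in the EM community, such as the RWG (Rao-Wilton-Glisson) basis functions [19], belong; similarly, $\mathbf{H}_{\perp}^{-1/2}(\mathrm{curl}_\Gamma,\Gamma)$ contains the surface Whitney 1-forms (better known as edge elements) [20]. The following theorem, taken directly from [21], helps to establish the relationship between the trial and test function spaces.

Trace Theorem. The trace mappings $\gamma_t : \mathbf{H}(\mathrm{curl},\Omega)\to\mathbf{H}_{\perp}^{-1/2}(\mathrm{curl}_\Gamma,\Gamma)$ and $\gamma_\times : \mathbf{H}(\mathrm{curl},\Omega)\to\mathbf{H}^{-1/2}(\mathrm{div}_\Gamma,\Gamma)$ are linear and continuous.

It is very important at this point to note that $\mathbf{H}_{\perp}^{-1/2}(\mathrm{curl}_\Gamma,\Gamma)$ and $\mathbf{H}^{-1/2}(\mathrm{div}_\Gamma,\Gamma)$ are dual to each other through the surface duality pairing defined in (1.2).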
1.6 Outline
The remainder of this dissertation is organized as follows. In Chapter 2, a FEM based DDM formulation for time-harmonic vector wave propagation problems is presented. The focus is mainly on the domain decomposition method as an efficient and effective preconditioner for the frequency domain FEM formulation. Specifically, we first present a systematic derivation of a symmetric DDM formulation, guided by the principle of duality pairing. A sufficient condition that ensures the invertibility of the sub-domain matrix, and thus the uniqueness of the sub-domain solution, follows. We then derive preconditioners of Gauss-Seidel type, from which the FETI like algorithm is also deduced. Subsequently we discuss the possibility of compressing the FETI matrix via rank revealing SVD algorithms to further accelerate the solution process. Lastly in this chapter we present a few practical ideas regarding the numerical implementation of the non-overlapping DDM. The accuracy of the DDM, the rates of error convergence when using non-conformal meshes across interfaces, and the advantages and choice of a few popular Krylov solvers are investigated in Chapter 3.
In Chapter 4, a novel hybrid FEM-BEM formulation is presented. We propose the marriage of FEM with BEM based on DDM concepts, more explicitly through Robin’s transmission condition. Owing to this interface condition, enforced on the boundary shared by the FEM and BEM domains, and with the aid of the duality pairing concept, a symmetric system free of internal resonances is obtained. In addition, this formulation permits non-matching meshes in the FEM and BEM domains. Consequently, great modularity arises in terms of the basis functions and the system matrix solvers. This two-domain formulation can be easily enhanced through a multiple domain decomposition of the FEM domain, and this extension is presented at the end of Chapter 4. The aforementioned properties of the proposed hybrid FEM-BEM are verified in detail in Chapter 5 through some canonical examples as well as a few practical geometries. Chapter 5 also presents some numerical results of the hybrid DDM-BEM approach.
In Chapter 6, a novel metamaterial structure known as the EM cloak is studied. When applied, the cloak makes scattering objects “invisible” to microwaves. The cloak consists of highly anisotropic materials whose material properties are derived specifically for the cloaking of PEC spherical objects. Full-wave verification of invisibility is demonstrated through hybrid FEM-BEM modeling of a PEC sphere coated with the EM cloak at two frequencies.
Chapter 7 deals with multiple objects separated by large distances. In this situation, as the separation distance becomes electrically large, a huge number of sub-domains is required, degrading the performance of DDM solvers. To alleviate this deficiency, the problem domain can be decomposed into disjoint sub-regions which communicate with each other through the near-field Green’s function. It will be shown that in each sub-region the most efficient method can be applied, provided it produces a sufficiently accurate approximation. When a sub-region is geometrically complicated and electrically large, DDM solvers are applied. To overcome the high inefficiency of a straightforward implementation of the inter-region coupling, the adaptive cross approximation (ACA) algorithm is adopted to expedite the process. Several examples of practical interest are solved to demonstrate the performance of the present approach.
CHAPTER 2
NON-CONFORMING FINITE ELEMENT DOMAIN DECOMPOSITION FOR TIME-HARMONIC MAXWELL EQUATIONS
2.1 Boundary Value Statement
The current domain decomposition method begins by partitioning the original problem domain Ω into N non-overlapping sub-domains:
\[ \Omega = \bigcup_{i=1}^{N} \Omega_i, \qquad \Omega_i \cap \Omega_j = \emptyset, \quad 1 \le i \ne j \le N. \tag{2.1} \]
By denoting the boundary of $\Omega_i$ as $\partial\Omega_i$ and $\overline{\Omega}_i \; (\equiv \Omega_i \cup \partial\Omega_i)$ the closure of $\Omega_i$, the interface $\Gamma_{ij}$ is defined as $\Gamma_{ij} \equiv \overline{\Omega}_i \cap \overline{\Omega}_j$. Obviously $\Gamma_{ij} = \Gamma_{ji}$, but here we make an artificial distinction because the triangulations on either side of the interface may differ. We employ $\Gamma_{ij}$ when $\Omega_i$ is the “master” sub-domain and $\Gamma_{ji}$ if the converse is true. For the sake of simplicity and without loss of generality, we consider only the case N = 2, as shown in Fig. 2.1. The boundary value problem (BVP) can then be written as
Figure 2.1 Decomposition of Ω into two non-overlapping domains Ω1 and Ω2.
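\[
\begin{aligned}
\nabla\times\frac{1}{\mu_{r1}}\nabla\times\mathbf{E}_1 - k_0^2\,\varepsilon_{r1}\mathbf{E}_1 &= -jk_0\eta\,\mathbf{J}_1^{imp} && \text{in } \Omega_1, \\
\frac{1}{\mu_{r1}}\gamma_N\mathbf{E}_1 - jk_0 m\,\gamma_t\mathbf{E}_1 &= -\frac{1}{\mu_{r2}}\gamma_N\mathbf{E}_2 - jk_0 m\,\gamma_t\mathbf{E}_2 && \text{on } \Gamma_{12}, \\
\frac{1}{\mu_{r1}}\gamma_N\mathbf{E}_1 - jk_0\,\gamma_t\mathbf{E}_1 &= 0 && \text{on } \partial\Omega_1\backslash\Gamma_{12}, \\
\nabla\times\frac{1}{\mu_{r2}}\nabla\times\mathbf{E}_2 - k_0^2\,\varepsilon_{r2}\mathbf{E}_2 &= -jk_0\eta\,\mathbf{J}_2^{imp} && \text{in } \Omega_2, \\
\frac{1}{\mu_{r2}}\gamma_N\mathbf{E}_2 - jk_0 m\,\gamma_t\mathbf{E}_2 &= -\frac{1}{\mu_{r1}}\gamma_N\mathbf{E}_1 - jk_0 m\,\gamma_t\mathbf{E}_1 && \text{on } \Gamma_{21}, \\
\frac{1}{\mu_{r2}}\gamma_N\mathbf{E}_2 - jk_0\,\gamma_t\mathbf{E}_2 &= 0 && \text{on } \partial\Omega_2\backslash\Gamma_{21}.
\end{aligned}
\tag{2.2}
\]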
Here $\mathbf{E}_i$, i = 1, 2, denotes the electric field interior to $\Omega_i$; $\varepsilon_{ri}$ and $\mu_{ri}$ are the relative permittivity and relative permeability of the medium in $\Omega_i$, respectively. The near-field excitations are expressed via the impressed electric currents $\mathbf{J}_1^{imp}$ and $\mathbf{J}_2^{imp}$. The complex parameter m is chosen as
ε + εμμ+ m ==εμ,, εrr12 μ = r 1 r 2 . (2.3) rr r22 r
Note that the Robin-type transmission conditions, the second and fifth equations of (2.2), imply the correct tangential field continuities and render the unique solution of (2.2) equal to that of the original system.
2.2 Symmetric Formulation
We start the derivation of the symmetric formulation with the introduction of tangential electric fields and surface electric currents given by
\[ \mathbf{e}_i = \gamma_t\mathbf{E}_i, \qquad \mathbf{j}_i = \frac{1}{jk_0\mu_{ri}}\gamma_N\mathbf{E}_i, \qquad i = 1, 2. \tag{2.4} \]
Note that the definition of the surface electric current, $\mathbf{j}_i$, differs from those in [15] and [16] by the scaling factor $1/(jk_0)$. In a traditional manner, the variational statement of the interior problem for $\Omega_1$ reads as
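Seek $\mathbf{E}_1 \in \mathbf{H}(\mathrm{curl};\Omega_1)$ and $\mathbf{j}_1 \in \mathbf{H}^{-1/2}(\mathrm{div}_\Gamma;\Gamma_{12})$ such that
\[ b(\mathbf{v}_1,\mathbf{E}_1)_{\Omega_1} + jk_0\langle\gamma_t\mathbf{v}_1,\mathbf{j}_1\rangle_{\Gamma_{12}} = -jk_0\eta\,(\mathbf{v}_1,\mathbf{J}_1^{imp})_{\Omega_1}, \qquad \forall\,\mathbf{v}_1 \in \mathbf{H}(\mathrm{curl};\Omega_1), \tag{2.5} \]
where the bilinear form $b(\cdot,\cdot)_{\Omega}$ is defined by
\[ b(\mathbf{v},\mathbf{u})_{\Omega} = \int_{\Omega}\left[\frac{1}{\mu_r}(\nabla\times\mathbf{v})\cdot(\nabla\times\mathbf{u})\right]dx^3 - k_0^2\int_{\Omega}\mathbf{v}\cdot(\varepsilon_r\mathbf{u})\,dx^3. \tag{2.6} \]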
Following the spirit of duality pairing described in [17], testing functions are chosen such that they belong to the space dual to that of the residual. This means the Robin transmission condition (the second equation in (2.2)) must be tested twice, once by $\gamma_t\mathbf{v}_1 \in \mathbf{H}_\perp^{-1/2}(\mathrm{curl}_\Gamma;\Gamma_{12})$ and once by $\boldsymbol{\lambda}_1 \in \mathbf{H}^{-1/2}(\mathrm{div}_\Gamma;\Gamma_{12})$, since its residual induces components in both $\mathbf{H}_\perp^{-1/2}(\mathrm{curl}_\Gamma;\Gamma_{12})$ and $\mathbf{H}^{-1/2}(\mathrm{div}_\Gamma;\Gamma_{12})$. We note that even though the testing procedures are somewhat non-conventional, they offer distinct advantages that will become apparent later in the derivation. Specifically, we obtain
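\[
\begin{aligned}
jk_0\langle\gamma_t\mathbf{v}_1,\mathbf{j}_1\rangle_{\Gamma_{12}} &= jk_0\langle\gamma_t\mathbf{v}_1, m\,\mathbf{e}_1\rangle_{\Gamma_{12}} - jk_0\langle\gamma_t\mathbf{v}_1,\mathbf{j}_2\rangle_{\Gamma_{12}} - jk_0\langle\gamma_t\mathbf{v}_1, m\,\mathbf{e}_2\rangle_{\Gamma_{12}}, \\
\frac{jk_0}{2}\langle\boldsymbol{\lambda}_1,\mathbf{e}_1\rangle_{\Gamma_{12}} - \frac{jk_0}{2}\Big\langle\boldsymbol{\lambda}_1,\frac{1}{m}\mathbf{j}_1\Big\rangle_{\Gamma_{12}} &= \frac{jk_0}{2}\langle\boldsymbol{\lambda}_1,\mathbf{e}_2\rangle_{\Gamma_{12}} + \frac{jk_0}{2}\Big\langle\boldsymbol{\lambda}_1,\frac{1}{m}\mathbf{j}_2\Big\rangle_{\Gamma_{12}}.
\end{aligned}
\tag{2.7}
\]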
Symmetric coupling is accomplished by splitting the surface integral of (2.5) into two halves. One half remains intact while the other is replaced by the first equation of (2.7), leading to
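\[
\begin{aligned}
b(\mathbf{v}_1,\mathbf{E}_1)_{\Omega_1} &+ \frac{jk_0}{2}\langle\gamma_t\mathbf{v}_1,\mathbf{j}_1\rangle_{\Gamma_{12}} + \frac{jk_0}{2}\langle\gamma_t\mathbf{v}_1, m\,\mathbf{e}_1\rangle_{\Gamma_{12}} - \frac{jk_0}{2}\langle\gamma_t\mathbf{v}_1,\mathbf{j}_2\rangle_{\Gamma_{12}} \\
&- \frac{jk_0}{2}\langle\gamma_t\mathbf{v}_1, m\,\mathbf{e}_2\rangle_{\Gamma_{12}} = -jk_0\eta\,(\mathbf{v}_1,\mathbf{J}_1^{imp})_{\Omega_1}.
\end{aligned}
\tag{2.8}
\]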
In the above formulation, the introduction of two sets of tangential traces in (2.4) allows differing triangulations on $\Gamma_{12}$ and $\Gamma_{21}$. The approach presented herein therefore offers great flexibility in terms of mesh non-conformity. After Galerkin testing of the transmission conditions, the field continuities are enforced only weakly because of the non-conformal interface meshes. Note that non-conformal tessellations require attention when integrating quantities that reside on differing meshes; a union of the two interface meshes is used to perform the numerical quadrature accurately. A full analysis of the error convergence of this formulation is outside the scope of this dissertation, but the interested reader may consult [22] for a similar proof for the closely related mortar method. The accuracy of the method is demonstrated in the next chapter via numerical experiments.
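Equation (2.8) together with the second equation of (2.7) forms the sub-domain matrix for $\Omega_1$. After a similar treatment of $\Omega_2$, a symmetric system matrix is obtained:
\[ \begin{bmatrix} \mathbf{K}_1 & -\mathbf{G}_{12} \\ -\mathbf{G}_{21} & \mathbf{K}_2 \end{bmatrix} \begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix}, \tag{2.9} \]
where
\[ \mathbf{K}_i = \begin{bmatrix} \mathbf{A}_i & \mathbf{C}_i & 0 \\ \mathbf{C}_i^T & \mathbf{B}_i + \frac{jk_0}{2}\mathbf{T}_{ii}^{e} & \frac{jk_0}{2}\mathbf{D}_{ii} \\ 0 & \frac{jk_0}{2}\mathbf{D}_{ii}^T & -\frac{jk_0}{2}\mathbf{T}_{ii}^{j} \end{bmatrix}, \qquad \mathbf{u}_i = \begin{bmatrix} \mathbf{E}_i \\ \mathbf{e}_i \\ \mathbf{j}_i \end{bmatrix}, \qquad \mathbf{y}_i = \begin{bmatrix} \mathbf{b}_i \\ 0 \\ 0 \end{bmatrix}, \qquad i = 1, 2, \tag{2.10} \]
\[ \mathbf{G}_{12} = \mathbf{G}_{21}^T = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \frac{jk_0}{2}\mathbf{T}_{12}^{e} & \frac{jk_0}{2}\mathbf{D}_{12} \\ 0 & \frac{jk_0}{2}\mathbf{D}_{21}^T & \frac{jk_0}{2}\mathbf{T}_{12}^{j} \end{bmatrix}. \tag{2.11} \]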
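The explicit forms of the matrix sub-blocks are
\[ \mathbf{A}_i = b(\mathbf{v}_i,\mathbf{v}_i), \qquad \mathbf{B}_i = b(\gamma_t\mathbf{v}_i,\gamma_t\mathbf{v}_i), \qquad \mathbf{C}_i = b(\mathbf{v}_i,\gamma_t\mathbf{v}_i), \tag{2.12} \]
\[ \mathbf{T}_{ij}^{e} = \langle\gamma_t\mathbf{v}_i, m\,\gamma_t\mathbf{v}_j\rangle_{\Gamma_{ij}}, \qquad \mathbf{T}_{ij}^{j} = \Big\langle\boldsymbol{\lambda}_i,\frac{1}{m}\boldsymbol{\lambda}_j\Big\rangle_{\Gamma_{ij}}, \qquad \mathbf{D}_{ij} = \langle\gamma_t\mathbf{v}_i,\boldsymbol{\lambda}_j\rangle_{\Gamma_{ij}}, \tag{2.13} \]
and the excitation vector $\mathbf{b}_i$ is given as
\[ \mathbf{b}_i = -jk_0\eta\,(\mathbf{v}_i,\mathbf{J}_i^{imp})_{\Omega_i}. \tag{2.14} \]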
Note that (2.9) is the symmetric equivalent of (36) in [15].
2.3 Invertibility of Sub-Domain Matrix
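Before proceeding to the solution of (2.9) using DDM as a preconditioner, a sufficient condition must be derived to ensure that the sub-domain matrix $\mathbf{K}_i$ is always invertible. The invertibility of $\mathbf{K}_i$ also implies that the following source-free BVP has only the trivial solution, namely $\mathbf{E}_i \equiv 0$ in $\Omega_i$:
\[
\begin{aligned}
\nabla\times\frac{1}{\mu_{ri}}\nabla\times\mathbf{E}_i - k_0^2\,\varepsilon_{ri}\mathbf{E}_i &= 0 && \text{in } \Omega_i, \\
\mathbf{j}_i - m\,\mathbf{e}_i &= 0 && \text{on } \partial\Omega_i,
\end{aligned}
\tag{2.15}
\]
with $\mathbf{j}_i = \frac{1}{jk_0\mu_{ri}}\gamma_N\mathbf{E}_i\big|_{\partial\Omega_i}$. To reach this conclusion, we invoke the Poynting theorem in complex form [23]. Namely, $\mathbf{E}_i$ satisfies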
⎛⎞⎡⎤*2 ⎛⎞112 μ ⎜⎟EExE×∇×⋅+ddx2*ωε⎢⎥ −ri ∇×= E 30, (2.16) ∫∫ii⎜⎟ riii ⎜⎟μημri⎢⎥k0 ri ∂Ωii⎝⎠⎣⎦⎝⎠ Ω where * denotes complex conjugate, and ω is the angular frequency.
16 ⎛⎞* * ⎛⎞11⎛⎞ 2 However, ⎜⎟EExEnE×∇×⋅=−ddxjkmdx22*2i ˆ ×∇×= e. ∫∫ii⎜⎟ iii⎜⎟0 ∫ i ⎜⎟μμrri ∂Ωii⎝⎠⎝⎠ ∂Ω⎝⎠ ∂Ω i
Then (2.16) becomes
⎡⎤2 22μ 1 jk m*2eEE dx+ωε⎢⎥ *−∇×=ri dx 30. (2.17) 0 ∫∫iriii ⎢⎥k0ημri ∂Ωii Ω ⎣⎦
Denote $\varepsilon_{ri} = \varepsilon'_{ri} - j\varepsilon''_{ri}$ and $\mu_{ri} = \mu'_{ri} - j\mu''_{ri}$. Since the real and imaginary components of (2.17) must both vanish, we have
2 ⎡⎤" 22μ 1 kRe meE dx2"+ωε⎢⎥+∇×=ri E dx 3 0, (2.18) 0 ∫∫()iriii ⎢⎥koriημ ∂Ωii Ω ⎣⎦
⎡⎤' 2 22μ 1 kIm( m )eE dx2'+ωε⎢⎥−∇×=ri E dx 3 0. (2.19) 0 ∫∫iriii ⎢⎥koriημ ∂Ωii Ω ⎣⎦
The sufficient condition for uniqueness is then stated by the following theorem.
Theorem 1: Under the assumption that $\varepsilon''_{ri} \ge 0$, $\mu''_{ri} \ge 0$ and $\mathrm{Re}(m) > 0$, the solution $\mathbf{E}_i$ to (2.15) is trivial.
We prove the theorem as follows. Suppose that $\varepsilon''_{ri} > 0$ and $\mu''_{ri} \ge 0$; then the only way for (2.18) to be zero is $\mathbf{E}_i = 0$ everywhere in $\Omega_i$. For the case $\varepsilon''_{ri} = 0$ and $\mu''_{ri} \ge 0$, we conclude from the assumption $\mathrm{Re}(m) > 0$ that $\mathbf{e}_i = 0$ on $\partial\Omega_i$, and hence, by the boundary condition in (2.15), that $\mathbf{j}_i = m\,\mathbf{e}_i = 0$ there as well. The desired result $\mathbf{E}_i = 0$ in $\Omega_i$ is then a direct consequence of analytic continuation from $\partial\Omega_i$ to $\Omega_i$. The details of the analytic continuation can be found in Theorems 4.12 and 4.13 of the book by Monk [24].
2.4 Krylov Subspace Method Solution
It was shown in [16] via Fourier analysis of the transmission conditions that a stationary iteration will not always converge. In particular, evanescent modes on the interface lead to an iteration matrix with a spectral radius of one. For this reason, we use the DD method as a preconditioner for a Krylov subspace method capable of resolving such modes. Having established the invertibility of $\mathbf{K}_i$, we first describe a block Gauss-Seidel type preconditioner for the effective solution of (2.9). Explicitly, we have
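\[ \begin{bmatrix} \mathbf{K}_1 & 0 \\ -\mathbf{G}_{21} & \mathbf{K}_2 \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{K}_1 & -\mathbf{G}_{12} \\ -\mathbf{G}_{21} & \mathbf{K}_2 \end{bmatrix} \begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{K}_1 & 0 \\ -\mathbf{G}_{21} & \mathbf{K}_2 \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix}. \tag{2.20} \]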
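With the aid of the identity
\[ \begin{bmatrix} \mathbf{K}_1 & 0 \\ -\mathbf{G}_{21} & \mathbf{K}_2 \end{bmatrix}^{-1} = \begin{bmatrix} \mathbf{K}_1^{-1} & 0 \\ \mathbf{K}_2^{-1}\mathbf{G}_{21}\mathbf{K}_1^{-1} & \mathbf{K}_2^{-1} \end{bmatrix}, \tag{2.21} \]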
(2.20) is rewritten as
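\[ \begin{bmatrix} \mathbf{I} & -\mathbf{K}_1^{-1}\mathbf{G}_{12} \\ 0 & \mathbf{I} - \mathbf{K}_2^{-1}\mathbf{G}_{21}\mathbf{K}_1^{-1}\mathbf{G}_{12} \end{bmatrix} \begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix} = \begin{bmatrix} \tilde{\mathbf{y}}_1 \\ \tilde{\mathbf{y}}_2 \end{bmatrix}, \tag{2.22} \]
where
\[ \tilde{\mathbf{y}}_1 = \mathbf{K}_1^{-1}\mathbf{y}_1, \qquad \tilde{\mathbf{y}}_2 = \mathbf{K}_2^{-1}\left(\mathbf{y}_2 + \mathbf{G}_{21}\tilde{\mathbf{y}}_1\right). \tag{2.23} \]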
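The block Gauss-Seidel step can be checked numerically on a toy problem. The following is a minimal sketch, not part of the solver described here: it assumes small, hypothetical 2×2 blocks (the names `K1`, `K2`, `G12` and all helper functions are illustrative only) and uses exact rational arithmetic to confirm that a solution of (2.9) satisfies the block upper-triangular system (2.22), with the preconditioned right-hand side formed as $\tilde{\mathbf{y}}_1 = \mathbf{K}_1^{-1}\mathbf{y}_1$ and $\tilde{\mathbf{y}}_2 = \mathbf{K}_2^{-1}(\mathbf{y}_2 + \mathbf{G}_{21}\tilde{\mathbf{y}}_1)$.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aik * xk for aik, xk in zip(row, x)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inv2(a):
    """Closed-form inverse of a 2x2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]


def sub(x, y):
    return [xi - yi for xi, yi in zip(x, y)]


# Hypothetical symmetric sub-domain blocks and coupling block (G21 = G12^T).
K1 = [[F(4), F(1)], [F(1), F(3)]]
K2 = [[F(5), F(2)], [F(2), F(6)]]
G12 = [[F(1), F(2)], [F(0), F(1)]]
G21 = transpose(G12)

# Manufacture a right-hand side of (2.9) from a known solution (u1, u2).
u1, u2 = [F(1), F(-2)], [F(3), F(1)]
y1 = sub(matvec(K1, u1), matvec(G12, u2))
y2 = sub(matvec(K2, u2), matvec(G21, u1))

# Preconditioned right-hand side: forward substitution through the
# block lower-triangular preconditioner.
yt1 = matvec(inv2(K1), y1)
yt2 = matvec(inv2(K2), add(y2, matvec(G21, yt1)))

# Left-hand side of (2.22) applied to the known solution.
K1inv_G12 = matmul(inv2(K1), G12)
K2inv_G21 = matmul(inv2(K2), G21)
lhs1 = sub(u1, matvec(K1inv_G12, u2))
lhs2 = sub(u2, matvec(matmul(K2inv_G21, K1inv_G12), u2))
```

Because `Fraction` arithmetic is exact, `lhs1` and `lhs2` can be compared with `yt1` and `yt2` using plain equality rather than a floating-point tolerance.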
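However, in (2.22) only information regarding the surface unknowns is required. The interior unknowns can therefore be eliminated during the solution process by applying a restriction operator $\mathbf{R}_i$ to both sides of (2.22):
\[ \begin{bmatrix} \mathbf{I} & -\mathbf{R}_1\mathbf{K}_1^{-1}\mathbf{G}_{12}\mathbf{R}_2^T \\ 0 & \mathbf{I} - \mathbf{R}_2\mathbf{K}_2^{-1}\mathbf{G}_{21}\mathbf{K}_1^{-1}\mathbf{G}_{12}\mathbf{R}_2^T \end{bmatrix} \begin{bmatrix} \mathbf{v}_1 \\ \mathbf{v}_2 \end{bmatrix} = \begin{bmatrix} \bar{\mathbf{y}}_1 \\ \bar{\mathbf{y}}_2 \end{bmatrix}. \tag{2.24} \]
Here, the surface unknowns $\mathbf{v}_i = [\mathbf{e}_i \;\; \mathbf{j}_i]^T$ are related to the volume unknowns through the restriction operator $\mathbf{R}_i$ via
\[ \mathbf{v}_i = \mathbf{R}_i\mathbf{u}_i = \begin{bmatrix} 0 & \mathbf{I} & 0 \\ 0 & 0 & \mathbf{I} \end{bmatrix}\mathbf{u}_i, \qquad i = 1, 2. \tag{2.25} \]
Recall that $\mathbf{u}_i = [\mathbf{E}_i \;\; \mathbf{e}_i \;\; \mathbf{j}_i]^T$. Moreover, $\bar{\mathbf{y}}_i = \mathbf{R}_i\tilde{\mathbf{y}}_i$. Note that the restriction operator involves nothing but Boolean operations, and its application to a vector does not require significant computational effort.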
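Nevertheless, a direct implementation of (2.24) would require sub-domain matrix solutions (via an iterative method) at every iteration of the outer iterative solver. Because $\mathbf{G}_{ij}$ has non-zero entries only in its surface blocks, the requirement of sub-domain solutions is circumvented by using the identity
\[ \mathbf{R}_i^T\mathbf{R}_i = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \mathbf{I} & 0 \\ 0 & 0 & \mathbf{I} \end{bmatrix} \tag{2.26} \]
in (2.24), which results in
\[ \begin{bmatrix} \mathbf{I} & -\mathbf{Z}_1\mathbf{g}_{12} \\ 0 & \mathbf{I} - \mathbf{Z}_2\mathbf{g}_{21}\mathbf{Z}_1\mathbf{g}_{12} \end{bmatrix} \begin{bmatrix} \mathbf{v}_1 \\ \mathbf{v}_2 \end{bmatrix} = \begin{bmatrix} \bar{\mathbf{y}}_1 \\ \bar{\mathbf{y}}_2 \end{bmatrix}, \tag{2.27} \]
where
\[ \mathbf{Z}_i = \mathbf{R}_i\mathbf{K}_i^{-1}\mathbf{R}_i^T, \qquad \mathbf{g}_{ij} = \mathbf{R}_i\mathbf{G}_{ij}\mathbf{R}_j^T, \qquad i, j = 1, 2. \tag{2.28} \]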
Note that the solution of (2.24), with iterative sub-domain solutions at each iteration, will be referred to as “direct DDM”, while the solution of (2.27) will be referred to as the FETI like solution [16], [25]-[29]. The similarities between the present approach and the FETI algorithm of [25]-[29] have been discussed in [16]. Moreover, the iterative solutions incurred in (2.24) and (2.28) can be accelerated through the use of the p-type multiplicative Schwarz (pMUS) preconditioner reported in [30]. Since each domain is a translated or rotated copy of a building block, it is only necessary to compute $\mathbf{Z}_i$ for each building block, rather than for each domain, in a preprocessing step [16].
The above procedure is easily extended to problems with N sub-domains, which is summarized in the following algorithm.
Algorithm 2.1:
RHS Computation and Matrix-Vector Multiplication with FETI Like Algorithm
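1. Compute the preconditioned RHS
   Initialize $\bar{\mathbf{y}}_i = 0$, ∀ i = 1, …, N
   For i = 1, …, N
      $\bar{\mathbf{y}}_i = \mathbf{R}_i\mathbf{K}_i^{-1}\mathbf{y}_i + \mathbf{Z}_i\sum_{j \in \mathrm{neighbor}(i)}\mathbf{g}_{ij}\bar{\mathbf{y}}_j$
   End for
2. Matrix-vector multiplication $\mathbf{r} = \mathbf{R}(\mathbf{M}^{-1}\mathbf{K})\mathbf{R}^T\mathbf{v}$
   Initialize $\mathbf{t}_i = \mathbf{v}_i$, ∀ i = 1, …, N
   For i = 1, …, N
      $\mathbf{t}_i = \mathbf{Z}_i\sum_{j \in \mathrm{neighbor}(i)}\mathbf{g}_{ij}\mathbf{t}_j$
      $\mathbf{r}_i = \mathbf{v}_i - \mathbf{t}_i$
   End for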
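The matrix-vector multiplication of step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrices $\mathbf{Z}_i$ and $\mathbf{g}_{ij}$ are taken to be small dense lists of rows that have already been computed, and the function name `feti_like_matvec`, the data layout and the numbers are hypothetical. The key point is that `t[j]` is overwritten in place, so a neighbor with j < i contributes its updated value while a neighbor with j > i still contributes $\mathbf{v}_j$, which is exactly the Gauss-Seidel sweep.

```python
def matvec(a, x):
    """Dense matrix-vector product, with the matrix stored as a list of rows."""
    return [sum(aik * xk for aik, xk in zip(row, x)) for row in a]


def feti_like_matvec(v, Z, g, neighbors):
    """Apply r = R (M^-1 K) R^T v with a forward Gauss-Seidel sweep.

    v         -- list of per-domain surface vectors v_i
    Z         -- list of dense matrices Z_i
    g         -- dict mapping (i, j) to the coupling matrix g_ij
    neighbors -- list of neighbor index lists, one per domain
    """
    t = [list(vi) for vi in v]            # t_i = v_i
    r = []
    for i in range(len(v)):
        coupled = [0] * len(v[i])
        for j in neighbors[i]:
            # t[j] is already updated if j < i and still equals v_j otherwise.
            contribution = matvec(g[(i, j)], t[j])
            coupled = [c + x for c, x in zip(coupled, contribution)]
        t[i] = matvec(Z[i], coupled)
        r.append([vi - ti for vi, ti in zip(v[i], t[i])])
    return r


# Three domains in a chain with one surface unknown each (hypothetical data).
Z = [[[2]], [[3]], [[1]]]
g = {(0, 1): [[1]], (1, 0): [[1]], (1, 2): [[2]], (2, 1): [[2]]}
neighbors = [[1], [0, 2], [1]]
r = feti_like_matvec([[5], [7], [1]], Z, g, neighbors)
```

For two domains the same routine reproduces the two block rows of (2.27): $\mathbf{r}_1 = \mathbf{v}_1 - \mathbf{Z}_1\mathbf{g}_{12}\mathbf{v}_2$ and $\mathbf{r}_2 = \mathbf{v}_2 - \mathbf{Z}_2\mathbf{g}_{21}\mathbf{Z}_1\mathbf{g}_{12}\mathbf{v}_2$.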
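The Gauss-Seidel preconditioner is not symmetric but can be easily symmetrized by traversing back through all the sub-domains after the final sub-domain has been reached [4]. Explicitly, the symmetric Gauss-Seidel preconditioner for the two-domain problem has the form [31]
\[ \mathbf{M} = \begin{bmatrix} \mathbf{K}_1 & 0 \\ -\mathbf{G}_{21} & \mathbf{K}_2 \end{bmatrix} \begin{bmatrix} \mathbf{K}_1^{-1} & 0 \\ 0 & \mathbf{K}_2^{-1} \end{bmatrix} \begin{bmatrix} \mathbf{K}_1 & -\mathbf{G}_{12} \\ 0 & \mathbf{K}_2 \end{bmatrix}. \tag{2.29} \]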
Using this new preconditioner, both the preconditioned RHS computation and the matrix-vector multiplication required in a Krylov solver must be modified. The modifications required in both procedures are similar, and thus only the matrix-vector multiplication is rewritten in the algorithm below.
Algorithm 2.2:
Matrix-Vector Multiplication using Symmetric Gauss-Seidel Preconditioner
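Matrix-vector multiplication $\mathbf{r} = \mathbf{R}(\mathbf{M}^{-1}\mathbf{K})\mathbf{R}^T\mathbf{v}$
   Initialize $\mathbf{t}_i = \mathbf{v}_i$, ∀ i = 1, …, N
   For i = 1, …, N
      $\mathbf{t}_i = \mathbf{Z}_i\sum_{j \in \mathrm{neighbor}(i)}\mathbf{g}_{ij}\mathbf{t}_j$
      $\mathbf{r}_i = \mathbf{v}_i - \mathbf{t}_i$
   End for
   For i = N, …, 1
      $\mathbf{t}_i = \mathbf{Z}_i\sum_{j \in \mathrm{neighbor}(i)}\mathbf{g}_{ij}\mathbf{t}_j$
      $\mathbf{r}_i = \mathbf{v}_i - \mathbf{t}_i$
   End for

2.5 Compression of the Numerical Green’s Function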
The FETI like algorithm achieves speed-up in the solution process by reducing the sub-domain matrix solutions of (2.24) to the matrix-vector multiplications of (2.27) using the pre-computed iteration matrix $\mathbf{Z}_i$. In this section we further improve the performance of the FETI like algorithm by compressing $\mathbf{Z}_i$ via a rank-revealing matrix factorization algorithm.
The iteration matrix $\mathbf{Z}_i$ is, in essence, the “numerical Green’s function” of domain $\Omega_i$. This is evident from the definition of $\mathbf{Z}_i$ (cf. (2.28)), since each column of $\mathbf{Z}_i$ corresponds to the response of the surface electric fields and electric currents when excited by a unit electric current source on the boundary surface. The matrix $\mathbf{Z}_i$ is therefore dense and possesses properties very similar to those of the impedance matrix of the method of moments (MoM). Namely, the matrix consists of many numerically rank-deficient sub-blocks which can be accurately represented by a greatly reduced set of column vectors. In this sense, the FETI like algorithm also transforms the problem from a finite element method (FEM) problem into a MoM problem. Consequently, through multi-level partitioning of the boundary surface, a matrix sub-block $\mathbf{A}$ representing the interaction between two well-separated groups can be accurately approximated by $\tilde{\mathbf{A}}$ via the rank-revealing QR factorization [31]-[34] or the fully pivoted adaptive cross approximation (ACA) algorithm [35]-[38]. More explicitly,
\[ \mathbf{A}_{m\times n} \approx \tilde{\mathbf{A}}_{m\times n} = \mathbf{U}_{m\times r}\mathbf{V}_{r\times n}, \tag{2.30} \]
such that
\[ \|\mathbf{A} - \tilde{\mathbf{A}}\| \le \delta\,\|\mathbf{A}\|, \tag{2.31} \]
where m, n, and r denote the row dimension, column dimension and numerical rank, respectively, and δ is a user-defined tolerance. For further details of the theory and implementation of the rank-revealing algorithms mentioned above, interested readers are referred to [38].
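To make (2.30) and (2.31) concrete, the following is a minimal sketch of a textbook fully pivoted ACA applied to a small dense block. It is not the implementation used in this work: it assumes a real-valued block held entirely in memory, measures the error in the Frobenius norm (the norm in (2.31) is not specified here), and the function names and the 1/|x − y| test kernel are hypothetical choices made only for illustration.

```python
import math


def frobenius(a):
    """Frobenius norm of a dense matrix stored as a list of rows."""
    return math.sqrt(sum(x * x for row in a for x in row))


def aca_full_pivot(a, delta):
    """Fully pivoted ACA: returns (U, V) with a ~= U V and
    ||a - U V||_F <= delta * ||a||_F.

    U is returned as a list of r columns and V as a list of r rows.
    """
    m, n = len(a), len(a[0])
    resid = [list(row) for row in a]      # residual a - U V
    target = delta * frobenius(a)
    u_cols, v_rows = [], []
    while frobenius(resid) > target and len(u_cols) < min(m, n):
        # Full pivoting: pick the largest entry of the current residual.
        i, j = max(((p, q) for p in range(m) for q in range(n)),
                   key=lambda pq: abs(resid[pq[0]][pq[1]]))
        pivot = resid[i][j]
        u = [resid[p][j] / pivot for p in range(m)]
        v = list(resid[i])
        u_cols.append(u)
        v_rows.append(v)
        # Subtract the rank-one cross from the residual.
        for p in range(m):
            for q in range(n):
                resid[p][q] -= u[p] * v[q]
    return u_cols, v_rows


def reconstruct(u_cols, v_rows, m, n):
    """Form the dense product U V from the stored columns and rows."""
    return [[sum(u[p] * v[q] for u, v in zip(u_cols, v_rows))
             for q in range(n)] for p in range(m)]


# Hypothetical well-separated interaction: two clusters of eight points
# whose separation is large compared with the cluster size.
xs = [0.1 * p for p in range(8)]
ys = [10.0 + 0.1 * q for q in range(8)]
A = [[1.0 / abs(x - y) for y in ys] for x in xs]

U, V = aca_full_pivot(A, delta=1e-3)
A_tilde = reconstruct(U, V, 8, 8)
err = frobenius([[A[p][q] - A_tilde[p][q] for q in range(8)] for p in range(8)])
rank = len(U)
```

Storing `U` and `V` requires r(m + n) numbers instead of mn, which is where the memory saving comes from when r is much smaller than m and n. A fully pivoted scheme needs every entry of the block; partially pivoted variants avoid this but are not shown here.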
2.6 Numerical Implementation
2.6.1 Finding Neighboring Domains
One of the difficulties associated with the DDM implementation is the robust identification of neighboring domains without assumptions regarding the orientation or shape of individual domains. However, if the domains are geometrically conforming, each interface of a domain has exactly one neighboring domain. By geometrically conforming, we mean that the intersection between the closures of two domains is either empty, a vertex, a whole edge, or a whole interface. Therefore, given a domain $\Omega_i$ and one of its interfaces $\Gamma_{ij}$, the neighboring domain $\Omega_j$ as well as the corresponding intersecting interface $\Gamma_{ji}$ can be identified from the smallest rectangular bounding boxes that tightly enclose the interfaces $\Gamma_{ij}$ and $\Gamma_{ji}$.
For example, in the situation where the domains exhibit translational periodicity, a bounding box can be defined from the coordinates of the extreme points of each interface, $(x_{min}, y_{min}, z_{min})$ and $(x_{max}, y_{max}, z_{max})$, where $x_{min}, y_{min}, z_{min}$ and $x_{max}, y_{max}, z_{max}$ are the minima and maxima of the x, y, z coordinates of the nodes on the interface surface mesh, respectively. Because bounding boxes are defined in global coordinates, i.e. the coordinates of the original problem, the bounding box of interface $\Gamma_{ij}$ is identical only to that of $\Gamma_{ji}$ and differs from all others. Note that for problems exhibiting rotational symmetries, the same ideas can be applied after each domain is rotated to its global position.
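The bounding-box matching described above can be sketched as follows. This is a minimal illustration rather than the actual implementation: the data layout (a dictionary keyed by a domain index and an interface label), the snapping tolerance `tol` and both function names are assumptions introduced here. Each interface is reduced to its bounding box in global coordinates, and two interfaces from different domains are paired when their boxes coincide, even if their surface meshes differ.

```python
def bounding_box(nodes, tol=1e-6):
    """Axis-aligned bounding box of a list of (x, y, z) nodes, snapped to a
    grid of size tol so that boxes from different meshes compare equal."""
    xs, ys, zs = zip(*nodes)
    corners = (min(xs), min(ys), min(zs), max(xs), max(ys), max(zs))
    return tuple(round(c / tol) for c in corners)


def match_interfaces(interfaces, tol=1e-6):
    """Pair interfaces whose bounding boxes coincide.

    interfaces -- dict mapping (domain, interface_label) to a list of
                  (x, y, z) node coordinates in global coordinates
    Returns a dict mapping each matched interface key to its counterpart.
    """
    by_box = {}
    for key, nodes in interfaces.items():
        by_box.setdefault(bounding_box(nodes, tol), []).append(key)
    pairs = {}
    for keys in by_box.values():
        # A valid match is exactly two interfaces owned by different domains.
        if len(keys) == 2 and keys[0][0] != keys[1][0]:
            first, second = keys
            pairs[first] = second
            pairs[second] = first
    return pairs


# Two unit cubes side by side along x. The shared face x = 1 is meshed
# differently on each side (domain 1 has an extra mid-face node).
face = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
interfaces = {
    (0, "x-"): [(0.0, y, z) for _, y, z in face],
    (0, "x+"): face,
    (1, "x-"): face + [(1.0, 0.5, 0.5)],
    (1, "x+"): [(2.0, y, z) for _, y, z in face],
}
pairs = match_interfaces(interfaces)
```

Hashing the snapped box keeps the search linear in the number of interfaces. Coordinates that fall very close to a snapping boundary could land in different grid cells, so a production code would likely compare boxes with an explicit tolerance instead.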
In some situations, such as the analysis of an antenna array in the presence of frequency selective surfaces with a different periodicity, geometrically conforming DDM may not be easily applied; difficulties arise due to the nature, varying size, and relative position of the sub-structures. Furthermore, for the hybridization of the non-conforming DDM with a BEM (as will be presented in a later chapter), the BEM domain is inherently geometrically non-conformal to its neighbors if cubic cells are used for the sub-domains. For these reasons, the inclusion of geometrically non-conformal modeling is deemed necessary. The identification of neighboring domains in this case is based on the observation that when the normal of an interface is along one of the principal axes (x-, y-, or z-axis), the 3-D rectangular bounding box reduces to a 2-D rectangular plate. In this special case, the intersection of two geometrically non-conformal plates can be determined without much difficulty. In light of this, the steps needed to identify the neighboring domains when geometrically non-conformal sub-domains are involved are the following:
a) Break each interface into a collection of planar surfaces.
b) Rotate each planar surface such that its normal is along one of the principal axes.
c) Identify whether two planar surfaces intersect.
2.6.2 Rotational Symmetry
Exploitation of rotational symmetry is particularly useful for geometries such as corrugated horn antennas, aircraft radomes, and conformal antenna arrays. Here we briefly discuss one implementation based on Euler’s rotation theorem [39]. According to Euler’s rotation theorem, any rotation can be described by three parameters $(\phi, \theta, \psi)$. Explicitly, the rotation of a point r given in Cartesian form as r = (x, y, z) can be obtained through the following three steps.
1) rotate φ -angle about z-axis (clockwise from observer point of view);
2) rotate θ -angle about x-axis (clockwise);
3) rotate ψ -angle about z-axis (clockwise).
Mathematically, these three steps are equivalent to three matrix-vector multiplications given as
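\[ \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \tag{2.32} \]
where $(x', y', z')$ is the coordinate after rotation.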
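The three elementary rotations can be composed in a few lines. The sketch below is illustrative only; it assumes angles in radians and the z-x-z ordering and sign convention of the three steps listed above, and the function names are hypothetical.

```python
import math


def rot_z(angle):
    """Elementary rotation about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Elementary rotation about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_rotation(phi, theta, psi):
    """Composite matrix: rotate by phi about z, then theta about x,
    then psi about z."""
    return matmul(rot_z(psi), matmul(rot_x(theta), rot_z(phi)))


def rotate(point, phi, theta, psi):
    """Rotate a Cartesian point (x, y, z) and return (x', y', z')."""
    m = euler_rotation(phi, theta, psi)
    return tuple(sum(m[i][k] * point[k] for k in range(3)) for i in range(3))


# A quarter turn about z maps the x-axis unit vector onto -y.
p = rotate((1.0, 0.0, 0.0), math.pi / 2, 0.0, 0.0)

# Rotations preserve length for arbitrary angles.
q = rotate((1.0, 2.0, 3.0), 0.3, 1.1, -0.7)
length_q = math.sqrt(sum(c * c for c in q))
```

In a DDM setting the composite matrix would be formed once per domain and applied to every node of the building block; field vectors would need the same rotation, which is not shown here.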
2.6.3 Reordering of Domains
The convergence of the DDM algorithm can be improved simply by reordering the sub-domains such that the numbering scheme is more consistent with wave front propagation. The motivation for this can be seen by observing the Jacobi iteration. In the first iteration of the Jacobi solver, solution vectors are updated only for those sub-domains containing sources. At subsequent iterations, solutions are updated layer by layer away from the sources, similar to the propagation of wave fronts. Therefore, by numbering the sub-domains such that they mimic the mechanism of wave front propagation, each iteration of a multiplicative type solver or preconditioner (such as Gauss-Seidel) will update all sub-domain solutions.
An ad-hoc algorithm implementing the above idea is presented below.
Algorithm 2.3:
Reordering of Domains to Mimic Wave Front Propagation
1. Initialize a queue with the domains that have non-zero excitations
   For i = 1, …, N
      If $\Omega_i$ contains non-zero excitations
         Place $\Omega_i$ at the end of the queue
      End if
   End for
2. Renumber the domains using the queue
   While the queue is not empty
      Renumber the front element of the queue, $\Omega_i$
      For each j ∈ neighbor($\Omega_i$)
         If $\Omega_j$ is not yet renumbered and not already in the queue
            Place $\Omega_j$ at the end of the queue
         End if
      End for
      Remove $\Omega_i$ from the queue
   End while
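The reordering is a breadth-first traversal of the domain adjacency graph started from the excited domains. A minimal sketch follows; the adjacency-list layout, the `has_source` flags and the function name are assumptions made for illustration. A `seen` set prevents a domain from being queued twice when it neighbors several already-queued domains.

```python
from collections import deque


def reorder_domains(neighbors, has_source):
    """Return the old domain indices listed in their new order.

    neighbors  -- list of neighbor index lists, one per domain
    has_source -- list of booleans, True where a domain is excited
    """
    queue = deque(i for i in range(len(neighbors)) if has_source[i])
    seen = set(queue)                  # domains queued or already renumbered
    order = []
    while queue:
        i = queue.popleft()            # front element of the queue
        order.append(i)                # its new number is its position here
        for j in neighbors[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return order


# Five domains in a chain, with the excitation in the middle domain.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
has_source = [False, False, True, False, False]
order = reorder_domains(neighbors, has_source)
new_number = {old: new for new, old in enumerate(order)}
```

Domains that cannot be reached from any excited domain never enter the queue, so a practical code would append them afterwards in some fixed order.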
CHAPTER 3
FEM DOMAIN DECOMPOSITION RESULTS AND NUMERICAL STUDIES
This chapter demonstrates the performance of the present DDM approach and validates the numerical implementations discussed in the previous chapter through numerical examples of practical interest. We also study the performance of several Krylov-subspace solvers, namely TFQMR [40], restarted GMRES(5) [41], and GCR(5) [42] with truncation as suggested in [43]. Note that the recurrence of the latter two solvers is limited to five due to memory usage considerations. All Krylov-subspace solvers are equipped with a Gauss-Seidel preconditioner unless otherwise stated. For all examples, the discretized space is modeled using the p = 2 Nédélec tetrahedral elements of the first kind [20]. Double precision arithmetic was used throughout the programs. The numerical Green’s function matrices are compressed via QR factorization with $\delta = 10^{-3}$ unless otherwise explicitly specified. The outer-loop tolerance ε is based on the preconditioned residual error. Note that $\varepsilon = 10^{-2}$ is used for the first two sub-sections, whereas $\varepsilon = 10^{-3}$ is employed in the convergence study section. All computations were performed on a 64-bit AMD Opteron 246 workstation with a 1 MB L2 cache and 16 GB of RAM.
3.1 Accuracy Study
3.1.1 Rectangular Waveguide: Error Convergence
A simple rectangular waveguide is used to verify the accuracy of the non-conformal DDM. The cross-section of the waveguide has width λ0 and height λ0/2, where λ0 is the free-space wavelength at f = 10 GHz. The length of the waveguide is 2λ0. Simulations are performed with varying discretization sizes, h, using a conventional FEM code and the domain decomposition method with both conformal and non-conformal meshes on the interface. The domain decomposition simulations divide the waveguide into two equal segments. A TE10 mode is excited by a port at one end, and the S-parameters are calculated for both the input and output ports. The ends of the waveguide are terminated with perfectly matched layers. The S11 and S12 errors are computed relative to the analytical results. Figures 3.1(a) and 3.1(b) demonstrate that all three methods have equal rates of error convergence in S11 and S12. Specifically, each converges at a rate $h^{2p} = h^4$, where p is the order of the basis functions employed.
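A convergence rate such as $h^{2p}$ is usually read off from the slope of the error on a log-log plot. The sketch below shows that calculation for two mesh sizes. The error values are synthetic, generated from an assumed law $C h^{4}$ with a hypothetical constant, and are not the data behind Fig. 3.1.

```python
import math


def observed_order(h_coarse, err_coarse, h_fine, err_fine):
    """Slope of log(error) versus log(h) between two mesh sizes."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


# Synthetic errors that follow C * h^(2p) with p = 2 (hypothetical values).
p = 2
errors = {h: 0.5 * h ** (2 * p) for h in (0.2, 0.1, 0.05)}

order_first = observed_order(0.2, errors[0.2], 0.1, errors[0.1])
order_second = observed_order(0.1, errors[0.1], 0.05, errors[0.05])
```

With measured S-parameter errors in place of the synthetic ones, slopes close to 4 for the conventional FEM and for both DDM variants would correspond to the equal convergence rates reported above.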
3.1.2 Coaxial Section Array: Stationary Solver vs. Krylov Solver
We now analyze a linear array of coaxial sections with alternating impedances of 50 and 100 Ohms. The array consists of eleven 50 Ohm and ten 100 Ohm sections, including 50 Ohm sections at each end, as shown in the insert of Fig. 3.2(a). The 50 Ohm sections have inner radii of 1.5 mm and outer radii of 5.022 mm, whereas the inner and outer radii of the 100 Ohm sections are 1.5 mm and 16.82 mm, respectively. Each section is 50 mm in length and filled with a lossy dielectric material of permittivity $\varepsilon_r = 2.1 - j0.0042$. A mesh discretization of h = λ0/15 followed by 5 steps of h-adaptive mesh refinement (h-AMR) [44] is used for all simulations, where λ0 is the free-space wavelength at f = 1 GHz. This results in a range of $\lambda_0/1400 \le h \le \lambda_0/15$ for the final mesh. The parameters of interest include both the reflection and transmission coefficients, which are computed at increments of 10 MHz within the range of 100 MHz to 1 GHz. The magnitude and phase of the coefficients are compared with those obtained via simulation using a commercial FEM solver, HFSS (version 8.5). The results are shown in Fig. 3.2 and demonstrate very good agreement. For this example, solution is attempted via a symmetric Gauss-Seidel solver and a GCR solver equipped with the symmetric Gauss-Seidel preconditioner. Symmetric Gauss-Seidel converges well for low-frequency simulations, but convergence deteriorates beyond 700 MHz. The solver diverges for all simulations at and above 760 MHz. The convergence of the GCR solver, however, is insensitive to frequency and requires fewer than 5 iterations at all frequencies. The iteration counts of both solvers at each frequency are plotted in Fig. 3.3.

Figure 3.1 Free-space rectangular waveguide, error in S-parameters with mesh refinement: (a) S11 error, (b) S12 error.
3.1.3 Corrugated Horn Antenna: Comparison with an Existing Solver
A corrugated horn antenna is analyzed next. This antenna is a body of revolution (BOR) whose 2-D cross section is depicted in Fig. 3.4(a). The testing frequency is chosen as f = 15.4 GHz. In the DDM modeling, the horn is decomposed into 5 domains, with each domain uniformly discretized based on h = λ0/3. Because of the fine detail of this geometry, the final mesh size ranges from λ0/377 to λ0/3. This mesh discretization results in a total of 2,155,216 unknowns, which demands 415 MB of memory. The GCR solver converges in 5 iterations using 2.5 hours of CPU time. The computed E-plane far-field pattern is validated against that of a BOR MoM code, as shown in Fig. 3.4(b).
Figure 3.2 S-parameters of an array of coaxial sections: (a) magnitude, (b) phase.
Figure 3.3 Iterations of symmetric Gauss-Seidel and GCR solvers of an array of coaxial sections for a range of frequencies.
3.1.4 Vivaldi Antenna Arrays: Accuracy of the FETI Like Algorithm
The arrays are borrowed from [45] and operated at f = 5 GHz under a uniform broadside excitation. The dimensions of each antenna element are shown in Fig. 3.5(a), and the configuration of the arrays can be found in [15] or [45]. The outer truncation boundary is enforced by the first-order absorbing boundary condition (ABC) placed 0.5λ0 away from the antenna elements in the broadside direction. An initial discretization of h = λ0/3, followed by 8 h-adaptive mesh refinement steps, is used for the antenna element, which results in $\lambda_0/188 \le h \le \lambda_0/4$ for the final mesh. We validate the accuracy of the FETI like algorithm along with QR compression through the computation of the far-field pattern of a 100x100 Vivaldi array. The result is compared with that of a direct DDM simulation [15] in Fig. 3.5(b), and the two results are virtually identical. For this 100x100 array, a total of 126,334,208 unknowns, including 18,184,192 surface unknowns, would be required if the “brute-force” FEM [1] were attempted. About 3.7 GB of total memory is consumed by the DDM solver, of which 3.3 GB goes to the storage of the surface solution vectors required by the GCR solver. With 988 surface unknowns for each antenna element, the numerical Green’s function is computed using 12 minutes of CPU time and 19 MB of memory. For this problem, matrix compression using QR factorization [37], [38] is not expected to be optimal: it reduces the memory only to 17 MB and requires 6 seconds of additional CPU time.

Figure 3.4 Corrugated horn waveguide: (a) 2-D cross-section and dimensions, (b) far-field pattern.

Figure 3.5 (a) The dimensions of a single Vivaldi element; (b) the far-field patterns of a 100x100 Vivaldi array using direct DDM and the FETI like algorithm.
3.1.5 Dielectric Cylinder and Turbine Inlet: Rotational Symmetry Modeling
We analyze two problems exhibiting rotational symmetry. For the first example we consider a dielectric cylinder with εr=4.0 and μr=1.0, excited by a normally incident plane wave polarized in the x-direction. The cylinder has a radius of 0.1λ0 and is 0.2λ0 in length, where λ0 is the free space wavelength at f=300MHz. In DDM modeling, the cylinder is divided into 4 sectors, each of which is modeled by the same building block repeated in the φ-direction. The building block is uniformly discretized with h=λ0/8. The computed bistatic RCS pattern in the xz-plane is compared with that computed using a hybrid FEBE approach. The results are given in Fig. 3.6 and very good agreement is observed. For this geometry, 278,208 unknowns are required, using a total of 25MB memory. The GCR solver converges in 9 iterations and requires 2 minutes of CPU time.
Figure 3.6 Bistatic RCS pattern of a dielectric cylinder in the xz-plane. (Cylinder dimensions: r = 0.1λ0, l = 0.2λ0.)
In the next example, we analyze a PEC turbine inlet excited by a 2 GHz y-polarized plane wave incident in the head-on direction. The geometry is shown in the insert of Fig. 3.7 and its dimensions can be found in [46]. The inlet is divided into 10 slices. Each slice is further divided into 8 sectors, as illustrated in the insert of Fig. 3.7. An initial discretization of h = λ0/4, followed by 4 h-adaptive mesh refinement steps [44], is used for all sub-domains. With this treatment, a total of 3,809,920 unknowns are required and 433 MB of memory is used. The GCR solver converges in 38 iterations using 1 hour and 38 minutes of CPU time. A comparison of the E-plane bistatic RCS computed using the DDM solver and a MoM solver (IE-FFT) [47], [48] is shown in Fig. 3.7, where good agreement is observed.

Figure 3.7 E-plane bistatic RCS pattern of a PEC turbine inlet. Inserts show the geometries (DDM model) and the mesh for the MoM model.
3.1.6 Mobile Phones in the Presence of a Human Head
In many real-life applications, it is quite common to test different engineering devices within the same environment. Examples include mobile phones in the vicinity of a human head and antenna arrays mounted on a battleship. The current practice, for example in the evaluation of antenna arrays on a battleship, is to repeat the simulation, including the entire environment, for each antenna array. This process is very time consuming.
Figure 3.8 Geometry of a mobile phone and a human head.

Figure 3.9 Splitting of the geometry into (a) the surrounding region including the head and (b) the phone region.

Figure 3.10 Plot of the magnitude of the S-parameters in the frequency range of 0.7 GHz to 2.2 GHz. The insert shows the field distribution at f = 700 MHz.
This type of problem can be handled efficiently via the FETI like algorithm, even though repetitions or local symmetries might not be present. Mobile phones in the vicinity of a human head are simulated to demonstrate the utility of this algorithm. For the material properties of the head, $\varepsilon_r = 45.93$ and σ = 0.756 are used [49]. The geometry setup for a conventional FEM is shown in Fig. 3.8, where the entire geometry is enclosed by a truncation box on which the first-order ABC is enforced. A simulation of this configuration is first conducted for comparison with the present approach.
To apply the DDM, the geometry in Fig. 3.8 is decomposed into two sub-domains as shown in Fig. 3.9. Since various phone designs are tested in the presence of the same human head model, the numerical Green’s function for the domain containing the human head is first computed and compressed via the ACA algorithm. This preprocessing step requires 2 hours and 30 minutes at the frequency f = 700 MHz. Although the computational time for the numerical Green’s function is significant, once it is obtained, the CPU time to simulate the phone shown in Fig. 3.9(b) is only around 2 minutes. Shown in Fig. 3.10 is the magnitude of S11 in the frequency range of interest; its insert visualizes the resulting field distribution of this phone at f = 700 MHz. The computational statistics of the conventional FEM and the DDM are compared in Table 3.1.

Method   Unknowns   Memory (MB)   CPU Time (mm:ss)   S11
FEM      752,864    759           10:59              0.91∠−61°
DDM      154,918    231           02:09              0.91∠−65°

Table 3.1 Comparison of FEM and DDM for the phone and head example at 700MHz.
The CPU time to compute the numerical Green’s function is omitted in the table assuming it is previously computed and reused in this simulation. It is clearly seen that the required computational resources are reduced with very close agreement in S- parameter. We comment that total unknown for DDM is significantly less than that of conventional FEM approach, mainly due to the fact that in DDM modeling, phone and human head are meshed separately. On the contrary in the FEM modeling, both are meshed altogether. Even with the same targeted discretization size, the latter case typically results in over-refinement of head region due to inefficiency of current existing meshers. 40 Entire problem Partitioning geometry the geometry
Initial mesh and adaptive mesh refinement
N = 154,918 N = 83,118 00:16:43 00:04:57
700 MHzNumerical Green’s 1400 MHz function and ACA compression 02:30:14 08:22:48
Solution process Solution process
00:01:14 00:02:09 00:02:06 00:02:33
Figure 3.11 Flow chart of modeling phones in the effect of human head via the FETI like algorithm.
To demonstrate the advantage of the FETI like algorithm, a flow chart for simulating two different mobile phones in the vicinity of the same human head is illustrated in Fig. 3.11. In the process, the numerical Green’s function is computed once for each frequency and reused for each instance of the mobile phone.
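The trade-off between the one-time preprocessing cost and the per-design saving can be estimated with simple arithmetic. The sketch below uses only the 700MHz timings quoted above (02:30:14 for the numerical Green’s function, 10:59 for FEM and 02:09 for DDM); it assumes, for illustration only, that every phone design costs about the same to solve and that meshing time is ignored.

```python
import math

def to_seconds(hms: str) -> int:
    """Convert 'hh:mm:ss' or 'mm:ss' to seconds."""
    total = 0
    for part in hms.split(":"):
        total = 60 * total + int(part)
    return total

green_fn = to_seconds("02:30:14")   # one-time Green's function + ACA cost at 700 MHz
fem_solve = to_seconds("10:59")     # conventional FEM solve (Table 3.1)
ddm_solve = to_seconds("02:09")     # DDM solve with the reused Green's function (Table 3.1)

# Assumption: each additional phone design saves the same amount of solve time.
saving_per_design = fem_solve - ddm_solve
break_even = math.ceil(green_fn / saving_per_design)
print(f"saving per design: {saving_per_design} s; "
      f"preprocessing pays off after {break_even} designs")
```

Under these assumptions the preprocessing is amortized only after a fairly large number of designs at a single frequency, which is why the reuse across many phone instances matters.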
3.1.7 Metamaterial Applications
3.1.7.1 Plano-Concave Lens
In this section we study a plano-concave lens borrowed from [50]. More specifically, three types of lenses are considered. The first one is built with a positive index of refraction material (PIM) with εr=4.97 and μr=1.0. The second one is a “mathematically” negative index lens with homogeneous material properties characterized by diagonal tensors given by
\[
\varepsilon = \left(1.0,\ 1.0,\ -1.27 - j0.291\right), \qquad \mu = \left(-1.33 - j0.562,\ -1.33 - j0.562,\ 1.0\right).
\]
The final lens is a metamaterial one, fabricated from metallic wires and rings assembled in a periodic cell structure; each cell is referred to as a split-ring-resonator (SRR). The rings and wires are deposited on a Rogers substrate with εr=2.2 and μr=1.0. We will refer to these three lenses as the positive lens, negative lens, and SRR lens, respectively.
Because of the periodic nature of the lens, the present approach is capable of modeling fine details of the SRR lens without difficulty. Note that the material properties of the negative lens are derived from the SRR lens through a homogenization procedure [50]. It is the authors’ intent to study whether the negative and SRR lenses produce “equivalent” results.
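To see why the homogenized tensor describes a negative index lens, one can evaluate an effective refractive index for the excitation used here. The sketch below is an illustration under stated assumptions, not part of the original analysis: it assumes a z-polarized field propagating in the x-y plane (so that only the zz permittivity and the in-plane permeability matter), an exp(+jωt) time convention, and a passive medium, which fixes the branch of the square root.

```python
import cmath

eps_z = complex(-1.27, -0.291)   # zz entry of the permittivity tensor
mu_t = complex(-1.33, -0.562)    # xx (= yy) entry of the permeability tensor

# Assumption: n^2 = eps_z * mu_t for a z-polarized wave travelling in the x-y plane.
n = cmath.sqrt(eps_z * mu_t)     # principal branch of the square root

# Assumption: exp(+j*omega*t) convention, so a passive (lossy) medium needs Im(n) <= 0.
if n.imag > 0:
    n = -n

print(f"effective index n = {n.real:.3f} {n.imag:+.3f}j")
```

With the passive branch selected, the real part of the index comes out negative, consistent with the "negative lens" label.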
Figure 3.12 The top view of DDM modeling of a plano-concave lens. The blue cell is a unit cell of the lens. The red cell is a quarter-wavelength monopole antenna. The green cell is the air box. Along the z-axis, there are two air boxes on the top and bottom.
The lenses are operated at a frequency of 14.7GHz. Fig. 3.12 shows the top view (x-y plane) of the DDM modeling of the lens, where each cell represents a domain. In this figure, each blue cell corresponds to a unit cell of the lens, green cells are air boxes modeling free space, and the red cell contains a quarter-wavelength monopole exciting the lens. For the positive and negative lenses, the unit cell is a rectangular box with dimensions of 0.251 × 0.251 × 1.004 cm (x × y × z). The geometry of the SRR lens is depicted in Fig. 3.13; Fig. 3.13(a) shows the pictorial views and dimensions of the unit cell while Fig. 3.13(b) illustrates the assembly of the lens. Each building block, excluding the excitation, is discretized by a uniform mesh of size h=λ0/5. An initial discretization of h=λ0/3, followed by 10 h-adaptive mesh refinement steps, is used for the monopole. With these treatments, the final mesh size varies from λ0/600 to λ0/5. For the SRR lens, 26,110,910 total unknowns are required and 3.5GB of memory, including 2.4GB for GCR, is used.

[Figure 3.13 annotations: h=0.01, units in cm, εr=2.2, μr=1.0; panels (a) and (b).]
Figure 3.13 The geometry of the SRR lens, (a) views of a unit cell and its dimensions; (b) assembly of the lens. The permittivity of the substrate is 2.2.

[Figure 3.14: panels (a), (b), (c).]
Figure 3.14 Real part of z-component of total E-field along z = 0 plane; (a) the positive lens, (b) the negative lens, (c) the SRR lens.
Fig. 3.14 shows the real parts of the z component of the total E-field in the z=0 plane, for the positive, negative, and SRR lenses. For the positive lens of Fig. 3.14(a), mainly standing waves exist within the lens. In Fig. 3.14(b), the negative lens demonstrates its negative index of refraction through the re-concentration of the dipole’s outgoing radiation. However, for the SRR lens we observe the formation of a surface wave along the interface between the lens and the air, and the fields outside the lens do not closely resemble those of the negative lens. Therefore, it is our opinion that in this particular example, the “homogenized” negative lens is not an accurate approximation of the actual SRR lens.
3.1.7.2 Microwave Photonic Crystal: Geometrically Non-Conformal Modeling
In this section, we model a microwave photonic crystal (MPC) borrowed from [51] via geometrically non-conformal DDM. The geometry, setup of sub-domains, and the detailed dimensions of a unit cell are shown in Fig. 3.15. The entire geometry is placed inside a free-space parallel plate waveguide, with the other four sides being absorbing boundaries. Two frequencies, f = 6.6GHz and f = 9.7GHz, are of interest; at both, only a TEM wave is excited, originating from the red-colored domains in Fig. 3.15. Note that it is because of these source domains that the difficulty of geometrically conformal modeling of this geometry arises. The inability to decompose source regions poses a major obstacle to the user-friendly applicability of geometrically conformal DDM.

[Figure 3.15 labels: parallel plate waveguide, copper rod, free space air box.]
Figure 3.15 Dimensions and domain setup of the microwave photonic crystal.
For each building block shown in Fig. 3.15, an initial discretization of h=λ0/4, followed by 5 h-adaptive mesh refinement steps, is applied. With these treatments, the resulting electric field distributions at the center plane for both frequencies are depicted in Fig. 3.16. We note that these fields closely resemble those shown in [51]. At f = 6.6GHz, the formation of standing waves is evident inside the crystal and the transmitted wave exhibits a positive index of refraction, whereas at f = 9.7GHz, a negative index of refraction is observed. However, we point out that at both frequencies most of the incident fields are reflected, and the phenomena shown in Fig. 3.16 are visible only after exclusion of the fields from the source domains.

Figure 3.16 Field distributions at the center plane, (a) f = 6.6GHz, (b) f = 9.7GHz.

Freq.    Unknowns     Memory (MB)   Iteration   FETI Time (hh:mm:ss)   Solution Time (hh:mm:ss)
6.6GHz   7,215,354    397           94          00:06:38               01:05:04
9.7GHz   17,380,006   897           34          00:38:59               01:31:04
Table 3.2 Computational statistics for MPC geometry.
Table 3.2 details computational resources required by DDM modeling of this geometry. Because of large number of surface unknowns and little repeatability of source domains, the FETI like algorithm is applied only to non-source building blocks to minimize the overall solution time.
3.2 Convergence Study
3.2.1 The Effect of Diagonal Scaling and Reordering of Domains
Superior iterative solver convergence may be obtained in some cases merely through careful implementation. Such robust implementation details include diagonal scaling [52] and proper reordering of domains, both of which incur very little overhead.
We demonstrate the benefits of diagonal scaling and reordering through the SRR lens example, where initially the domains are ordered randomly. The convergence histories of the GCR solver are plotted in Fig. 3.17(a) for three different cases. In CASE 1, neither diagonal scaling nor reordering is employed; CASE 2 includes only diagonal scaling; and CASE 3 includes both diagonal scaling and reordering. The figure demonstrates that, through attention to the implementation details mentioned above and without any change to the main algorithm, significant improvement in convergence can be obtained.

Figure 3.17 Convergence histories for the SRR lens, (a) case study, (b) solver study.

Method           FETI       FETI+QR
FETI Memory      1,002 MB   624 MB
GCR Iteration    78         78
CPU (hh:mm:ss)   24:25:16   18:21:52
Table 3.3 Performance of QR compression for the SRR example.

Array     Unknowns      TFQMR       TFQMR CPU   GMRES(5)    GMRES(5) CPU   GCR(5)      GCR(5) CPU
Size                    Iteration   (h:m:s)     Iteration   (h:m:s)        Iteration   (h:m:s)
3x3       157,768       83          00:01:20    54          00:00:54       33          00:00:38
10x10     1,352,304     >200        -           73          00:08:33       43          00:05:00
50x50     31,129,264    >200        -           93          03:39:57       52          02:08:04
100x100   126,334,208   >200        -           88          13:46:03       49          08:07:09
Table 3.4 Convergence of various Krylov solvers: Vivaldi arrays example.

Solver           TFQMR   GMRES(5)   GCR(5)
Iteration        >200    133        78
CPU (hh:mm:ss)   -       31:34:14   18:21:52
Table 3.5 Convergence of various Krylov solvers: SRR lens example.
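The effect of diagonal scaling can be illustrated on a small model problem. The sketch below is not the DDM system: it assumes a real symmetric positive definite matrix and a plain conjugate gradient solver as stand-ins for the complex symmetric DDM matrix and the GCR solver, and it builds a badly scaled matrix A = S B S from a well-conditioned B (all names are hypothetical). Symmetric scaling by the inverse square root of the diagonal recovers the well-conditioned matrix.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cg(A, b, tol=1e-8, max_iter=1000):
    """Plain conjugate gradients; returns (x, iterations used)."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rr = dot(r, r)
    stop = tol * math.sqrt(dot(b, b))
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        rr_new = dot(r, r)
        if math.sqrt(rr_new) <= stop:
            return x, k
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter

n = 30
# Well-conditioned tridiagonal B hidden behind badly scaled unknowns: A = S B S.
B = [[1.0 if i == j else 0.1 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
s = [10.0 ** (3.0 * i / (n - 1)) for i in range(n)]   # scales from 1 to 1e3
A = [[s[i] * B[i][j] * s[j] for j in range(n)] for i in range(n)]
b = [1.0] * n

# Diagonal scaling: solve (D^-1/2 A D^-1/2) y = D^-1/2 b, then x = D^-1/2 y.
d = [1.0 / math.sqrt(A[i][i]) for i in range(n)]
A_s = [[d[i] * A[i][j] * d[j] for j in range(n)] for i in range(n)]
b_s = [d[i] * b[i] for i in range(n)]

_, iters_plain = cg(A, b)
y, iters_scaled = cg(A_s, b_s)
x = [d[i] * y[i] for i in range(n)]
print("iterations without scaling:", iters_plain, "with scaling:", iters_scaled)
```

The scaling costs one pass over the diagonal, which matches the remark that such implementation details incur very little overhead.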
3.2.2 FETI vs. FETI+QR
Having established confidence in the solution accuracy when using QR compression in the previous section, the memory usage and CPU time reductions are studied further using the SRR lens example. Table 3.3 compares the FETI memory usage with and without QR compression as well as the corresponding GCR iterations and CPU times required. As can be seen from Table 3.3, applying QR compression to the FETI matrices in this example reduces the memory and the overall CPU time without any sacrifice of accuracy. Note also that in this example CPU time does not scale linearly with the FETI memory, because some non-repeating building blocks are solved via direct DDM, instead of the FETI like algorithm, to minimize overall CPU time. Although in this example QR compression does not affect GCR’s iteration count, better convergence has been observed in other examples. QR compression regularizes iteration matrices by removing unwanted modes through elimination of linear dependencies.
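The memory saving from a truncated factorization follows from a simple count of stored entries. The sketch below is a generic illustration, not the dissertation’s implementation: the block size and rank are hypothetical, and the last line only restates the memory ratio reported in Table 3.3.

```python
def dense_entries(m: int, n: int) -> int:
    """Entries stored for a full m-by-n block."""
    return m * n

def factored_entries(m: int, n: int, rank: int) -> int:
    """Entries stored when the block is kept as Q (m x rank) times R (rank x n)."""
    return rank * (m + n)

m = n = 1000   # hypothetical interface block size
rank = 50      # hypothetical numerical rank after discarding dependent columns
ratio = factored_entries(m, n, rank) / dense_entries(m, n)

table_ratio = 624 / 1002   # FETI+QR memory over FETI memory, from Table 3.3
print(f"model block ratio: {ratio:.2f}, reported FETI ratio: {table_ratio:.2f}")
```

The factored form pays off whenever the rank is well below mn/(m+n); the overall ratio in Table 3.3 is milder than the model block because it covers all FETI storage, not a single block.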
3.2.3 Choice of Krylov Solvers
We investigate the application of a few well-known Krylov-subspace type solvers, namely TFQMR, GMRES(5), and GCR(5), for the solution of electrically large problems. The number of iterations required by each of the solvers is summarized in Table 3.4 for various Vivaldi array simulations. Among the three, the GCR algorithm is the best in terms of both CPU time and iteration count. For the SRR geometry, TFQMR diverges quickly as shown in Fig. 3.17(b), and the GCR solver still outperforms GMRES in terms of solution time and iterations, as seen in both Table 3.5 and Fig. 3.17(b).
From all the examples we have analyzed, including examples not shown in this paper due to space limitations, we conclude that the GCR solver is the preferred Krylov solver for the DDM algorithm. Note that for Helmholtz problems, the GCR solver was also advocated in [53].
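For reference, a generalized conjugate residual iteration can be sketched in a few lines. This is a minimal illustration under assumptions, not the solver used in this work: it takes the "(5)" in GCR(5) to mean a restart length of five stored directions, uses dense Python lists, and the function names are hypothetical.

```python
def inner(u, v):
    """Hermitian inner product <u, v> = sum(u_i * conj(v_i))."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def gcr(A, b, restart=5, tol=1e-10, max_iter=200):
    """Restarted GCR; returns (x, iterations, relative residual)."""
    x = [0j] * len(b)
    r = [complex(bi) for bi in b]
    b_norm = abs(inner(r, r)) ** 0.5
    basis = []                        # pairs (p_i, A p_i) with mutually orthogonal A p_i
    rel = 1.0
    for k in range(1, max_iter + 1):
        p = list(r)
        q = matvec(A, p)
        for p_i, q_i in basis:        # orthogonalize A p against the stored A p_i
            beta = inner(q, q_i) / inner(q_i, q_i)
            p = [a - beta * c for a, c in zip(p, p_i)]
            q = [a - beta * c for a, c in zip(q, q_i)]
        alpha = inner(r, q) / inner(q, q)   # minimizes the residual norm along q
        x = [a + alpha * c for a, c in zip(x, p)]
        r = [a - alpha * c for a, c in zip(r, q)]
        rel = abs(inner(r, r)) ** 0.5 / b_norm
        if rel <= tol:
            return x, k, rel
        basis.append((p, q))
        if len(basis) == restart:     # restart: discard the stored directions
            basis = []
    return x, max_iter, rel

# Small complex symmetric (non-Hermitian) test matrix, hypothetical values.
A = [[4 + 1j, 1, 0], [1, 3 + 0.5j, 1], [0, 1, 5 + 2j]]
b = [1, 2, 3]
x, iters, rel = gcr(A, b)
print(iters, rel)
```

Each iteration needs one matrix-vector product and a short orthogonalization against at most `restart` stored vectors, which keeps the memory bounded for large problems.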
CHAPTER 4
A DOMAIN DECOMPOSITION BASED FINITE ELEMENT AND BOUNDARY ELEMENT COUPLING
In the previous chapters, the radiation condition is approximated by the first order absorbing boundary condition (ABC), which produces unwanted spurious reflection from the truncation boundary. In order to minimize such unphysical reflection, the truncation boundary must be placed sufficiently far away from the object, resulting in a large number of sub-domains. In this chapter, the unbounded space exterior to the problem domain Ω is treated as an additional domain. This domain is formulated by the boundary element method (BEM), which incorporates the radiation condition exactly through its Green’s function. Note that such hybridization has been previously attempted in [13], where the computational domain is partitioned into onion-like concentric sub-domains to enhance the efficiency of the transmission conditions. However, in [13] periodic meshes are required on the interfaces. Moreover, concentric partitions cannot easily exploit the benefit of repetitions and hence fail to fully utilize the FETI like algorithm to speed up the computation.
A special case of this approach is the conventional FEM formulation of the interior domain Ω. This corresponds to the well-known hybrid finite element method-boundary element method (FEM-BEM), which is one of the most appealing approaches for analyzing unbounded electromagnetic radiation and scattering from heterogeneous structures. However, there are a number of undesired issues associated with existing approaches. A direct and widely used hybridization of FEM and BEM [1], [54]-[57] is based on a non-variational setting that leads to a non-symmetric complex system of equations, even when the actual physical problem involves only reciprocal media. Even though such formulations have been successfully applied to both scattering and radiation problems, they do not reflect the physical problem statement. Furthermore, they are typically more difficult and computationally expensive to solve with iterative solvers.
The symmetric coupling of FEM and BEM was first proposed by Costabel in [58]. Since then, a large number of papers have been published in both the engineering and applied mathematics communities [59]-[63] documenting the properties of the formulation. These approaches attempt a symmetric enforcement of the Dirichlet-to-Neumann map between FEM and BEM; consequently, without any special treatment, they suffer from the infamous internal resonance or “forbidden” frequency problem [64]. It should be emphasized that the conventional non-variational approaches of [1], [54]-[57] do not suffer from internal resonance only if the combined field integral equation (CFIE) is employed on the BEM portion. On the other hand, both aforementioned variational and non-variational formulations lack modularity; namely, FEM and BEM have to be consistent with each other in terms of mesh, basis functions and matrix solver. Modular FEM-BEM hybrid formulations have been previously proposed for two dimensional problems by Cwik in [65] and by Hoppe et al. in [66] for body-of-revolution (BOR) type problems. In both cases different FEM and BEM meshes were used, leading to memory savings due to the reduced mesh density of the boundary element part. More importantly, in [66] the internal resonance problem was alleviated without losing the symmetry of the final FEM-BEM system. Finally, to the best of our knowledge, an effective and efficient preconditioning scheme that guarantees convergence of the FEM-BEM system is yet to be found. Here it is worth mentioning the work of Liu et al. [67], where an effective preconditioner for a non-variational FEM-BEM coupling was proposed. The important ingredients in that preconditioning approach included the use of Robin boundary conditions and domain overlapping between FEM and BEM.
As a direct consequence of duality pairing and the Robin-to-Robin map, the present formulation, based on the domain decomposition approach described in Chapter 2, alleviates each of the aforementioned issues. Namely, the resulting system matrix is symmetric. Moreover, the Robin transmission condition, acting as an impedance boundary condition, leads to a CFIE formulation in the BEM domain. Consequently, the present method is free of internal resonances. In addition, the meshes on the interface are non-conforming, leading to modular treatment of the FEM and BEM domains in terms of meshing procedure, selection and order of basis functions, matrix assembly and solution process. These benefits significantly simplify the integration of existing FEM and BEM implementations.
The rest of this chapter is organized as follows. Section 4.1 provides a systematic step-by-step derivation of the symmetric FEM-BEM coupling. Three Schwarz type preconditioners are proposed in Section 4.2. Because of recent advances in BEM research, details on the acceleration of the BEM computation are omitted. We only note that this research adopts the IE-FFT algorithm [47] in the BEM domain.
Figure 4.1 A generic EM radiation/scattering problem used for the derivation of FEM- BEM. The insert shows the non-conforming FEM and BEM meshes.
4.1 Symmetric FEM and BEM Coupling
Without loss of generality and for the sake of simplicity, we first derive the symmetric formulation for a one-domain decomposition of the FEM domain as depicted in Fig. 4.1. The formulation is then extended straightforwardly to a multiple-domain decomposition of the interior domain.
4.1.1 Boundary Value Statement
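Referring to Fig. 4.1, the interior domain $\Omega \subset \mathbb{R}^3$ is bounded by the surface Γ and may contain the localized heterogeneous scatterer and/or antenna, where $\mathbb{R}^3$ denotes the 3-dimensional unbounded space. The remaining domain is thus the unbounded homogeneous free space region $\Omega' = \mathbb{R}^3 \setminus \overline{\Omega}$, where $\overline{\Omega}\ (\equiv \Omega \cup \Gamma)$ denotes the closure of the domain. Thus for domain Ω′ boundary element approaches are suitable, which take into account the Silver-Müller radiation condition exactly through the Green’s function.
Based on this domain decomposition, the boundary value problem (BVP) can be written as the following transmission problem:
\[
\begin{aligned}
\nabla\times\frac{1}{\mu_r}\nabla\times\mathbf{E} - k_0^2\varepsilon_r\mathbf{E} &= -jk_0\eta\,\mathbf{J}^{imp} && \text{in } \Omega \\
\gamma^{\times}(\mathbf{E}) &= 0 && \text{on } \Gamma_{PEC} \\
\gamma^{\times}\!\left(\frac{1}{\mu_r}\nabla\times\mathbf{E}\right) &= 0 && \text{on } \Gamma_{PMC}
\end{aligned}
\tag{4.1}
\]
\[
\begin{aligned}
\nabla\times\nabla\times\mathbf{E} - k_0^2\mathbf{E} &= 0 && \text{in } \Omega' \\
\lim_{r\to\infty}\left[\nabla\times\left(\mathbf{E}-\mathbf{E}^{inc}\right)\times\mathbf{r} - jk_0 r\left(\mathbf{E}-\mathbf{E}^{inc}\right)\right] &= 0 && \text{in } \Omega'
\end{aligned}
\tag{4.2}
\]
\[
\begin{aligned}
\gamma^{\times}(\mathbf{E})_{\Gamma^-} &= \gamma^{\times}(\mathbf{E})_{\Gamma^+} && \text{on } \Gamma \\
\gamma^{\times}\!\left(\frac{1}{\mu_r}\nabla\times\mathbf{E}\right)_{\Gamma^-} &= -\gamma^{\times}\!\left(\frac{1}{\mu_r}\nabla\times\mathbf{E}\right)_{\Gamma^+} && \text{on } \Gamma
\end{aligned}
\tag{4.3}
\]
where $\Gamma^+$ and $\Gamma^-$ are the exterior and interior sides of the bounding surface Γ, respectively, and $\Gamma_{PEC}$ and $\Gamma_{PMC}$ are the surfaces of perfect electric and magnetic conductors, respectively. The near-field excitation is accounted for through an impressed electric current $\mathbf{J}^{imp}$, while $\mathbf{E}^{inc}$ denotes the incident electric field. It should be emphasized here that $\mathbf{E} = \mathbf{E}^{inc} + \mathbf{E}^{sct}$ is the total electric field. Note that $\hat{n} = \hat{n}^- = -\hat{n}^+$ denotes an outward normal.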
4.1.2 Transmission Problem
We use the following electric currents, magnetic currents, and tangential electric fields for the interior and exterior domains:
\[
\mathbf{j}^{\pm} = \frac{1}{jk_0}\gamma^{\times}\!\left(\frac{1}{\mu_r^{\pm}}\nabla\times\mathbf{E}\right)_{\Gamma^{\pm}}, \qquad
\mathbf{m}^{\pm} = \gamma^{\times}(\mathbf{E})_{\Gamma^{\pm}}, \qquad
\mathbf{e}^{\pm} = \gamma_t(\mathbf{E})_{\Gamma^{\pm}}.
\tag{4.4}
\]
The key features of the proposed FEM-BEM coupling are the Robin-to-Robin transmission problem and the “cement” finite element coupling of non-conforming grids [15]. Namely, the continuity of the Dirichlet and Neumann traces of the electric field in (4.3) is replaced by the Robin-to-Robin map:
\[
\begin{aligned}
\mathbf{j}^- - \mathbf{e}^- &= -\mathbf{j}^+ - \mathbf{e}^+ && \text{on } \Gamma^-, \\
\mathbf{j}^+ - \mathbf{e}^+ &= -\mathbf{j}^- - \mathbf{e}^- && \text{on } \Gamma^+.
\end{aligned}
\tag{4.5}
\]
Notice that this modification not only maintains the continuity of both tangential electric and magnetic fields, but also alleviates any internal resonance problem because, as an impedance condition, (4.5) results in a CFIE-like formulation for the BEM domain.
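That the Robin-to-Robin map still enforces field continuity can be checked by adding and subtracting its two equations. The short derivation below is a worked check using only the two relations of (4.5):

```latex
% Sum of the two equations of (4.5):
(\mathbf{j}^- - \mathbf{e}^-) + (\mathbf{j}^+ - \mathbf{e}^+)
  = (-\mathbf{j}^+ - \mathbf{e}^+) + (-\mathbf{j}^- - \mathbf{e}^-)
  \;\Longrightarrow\; 2\,(\mathbf{j}^- + \mathbf{j}^+) = 0
  \;\Longrightarrow\; \mathbf{j}^- = -\mathbf{j}^+ .

% Difference of the two equations of (4.5):
(\mathbf{j}^- - \mathbf{e}^-) - (\mathbf{j}^+ - \mathbf{e}^+)
  = (-\mathbf{j}^+ - \mathbf{e}^+) - (-\mathbf{j}^- - \mathbf{e}^-)
  \;\Longrightarrow\; 2\,(\mathbf{e}^+ - \mathbf{e}^-) = 0
  \;\Longrightarrow\; \mathbf{e}^- = \mathbf{e}^+ .
```

The sign change in the currents reflects the opposite orientation of the normals on the two sides of Γ, while the tangential fields agree directly.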
4.1.3 Exterior Problem
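A set of integral equations for the exterior BEM domain can be obtained from the Stratton-Chu representation formulae [18] for the electric and magnetic fields by letting the observation point approach $\Gamma^+$:
\[
\frac{\mathbf{e}^+}{2} = \mathbf{e}^{inc} + \gamma_t\!\left(C(\mathbf{m}^+)\right) - jk_0\,\gamma_t\!\left(A(\mathbf{j}^+)\right) + \frac{1}{jk_0}\nabla_\tau\Psi(\mathbf{j}^+),
\tag{4.6}
\]
\[
\frac{jk_0\,\mathbf{j}^+}{2} = \mathbf{j}^{inc} + jk_0\,\gamma^{\times}\!\left(C(\mathbf{j}^+)\right) - k_0^2\,\gamma^{\times}\!\left(A(\mathbf{m}^+)\right) - \gamma^{\times}\!\left(\nabla\Psi(\mathbf{m}^+)\right),
\tag{4.7}
\]
where the three integral operators are given by
\[
A(\mathbf{x}) = \int_{\partial\Omega}\mathbf{x}\,g\,ds', \qquad
\Psi(\mathbf{x}) = \int_{\partial\Omega}\left(\nabla'\cdot\mathbf{x}\right)g\,ds', \qquad
C(\mathbf{x}) = \mathrm{p.v.}\!\int_{\partial\Omega}\mathbf{x}\times\nabla' g\,ds'.
\tag{4.8}
\]
Here p.v. indicates integration in the principal value sense.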
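The weak statement of the above integral equations is obtained by testing each equation with the appropriate set of basis functions. Guided by the principle of duality pairing [61], the variational forms of (4.6) and (4.7) are then obtained as

Seek $\mathbf{e}^+ \in H_\perp^{-1/2}(\mathrm{curl}_\Gamma; \Gamma^+)$ and $\mathbf{j}^+ \in H^{-1/2}(\mathrm{div}_\Gamma; \Gamma^+)$ such that
\[
\frac{jk_0}{2}\left\langle\boldsymbol{\lambda}^+, \mathbf{e}^+\right\rangle_{\Gamma^+} - jk_0\left\langle\boldsymbol{\lambda}^+, \mathbf{e}^{inc}\right\rangle_{\Gamma^+}
= k_0^2\left\langle\boldsymbol{\lambda}^+, A(\mathbf{j}^+)\right\rangle_{\Gamma^+} - \left\langle\nabla_\tau\cdot\boldsymbol{\lambda}^+, \Psi(\mathbf{j}^+)\right\rangle_{\Gamma^+} + jk_0\left\langle\boldsymbol{\lambda}^+, C(\mathbf{m}^+)\right\rangle_{\Gamma^+},
\tag{4.9}
\]
\[
\frac{jk_0}{2}\left\langle\mathbf{v}^+, \mathbf{j}^+\right\rangle_{\Gamma^+} - \left\langle\mathbf{v}^+, \mathbf{j}^{inc}\right\rangle_{\Gamma^+}
= -jk_0\left\langle\hat{n}\times\mathbf{v}^+, C(\mathbf{j}^+)\right\rangle_{\Gamma^+} + k_0^2\left\langle\hat{n}\times\mathbf{v}^+, A(\mathbf{m}^+)\right\rangle_{\Gamma^+} - \left\langle\nabla_\tau\cdot\left(\hat{n}\times\mathbf{v}^+\right), \Psi(\mathbf{m}^+)\right\rangle_{\Gamma^+},
\tag{4.10}
\]
$\forall\,\boldsymbol{\lambda}^+ \in H^{-1/2}(\mathrm{div}_\Gamma, \Gamma^+)$ and $\mathbf{v}^+ \in H_\perp^{-1/2}(\mathrm{curl}_\Gamma, \Gamma^+)$.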
4.1.4 Interior Problem
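In the interior domain, the standard variational form of (4.1) reads as:

Seek $\mathbf{E} \in H(\mathrm{curl}; \Omega)$ and $\mathbf{j}^- \in H^{-1/2}(\mathrm{div}_\Gamma; \Gamma^-)$ such that
\[
b(\mathbf{v}, \mathbf{E}) + jk_0\left\langle\gamma_t(\mathbf{v}), \mathbf{j}^-\right\rangle_{\Gamma^-} = -jk_0\eta\left\langle\mathbf{v}, \mathbf{J}^{imp}\right\rangle_{\Omega}, \qquad \forall\,\mathbf{v} \in H(\mathrm{curl}; \Omega).
\tag{4.11}
\]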
4.1.5 Coupled Problem
The coupling of the interior and exterior problems will be accomplished through the variational form of the Robin-to-Robin map described in (4.5). The variational 60 statement of the transmission problem in (4.5) now reads as:
−−1/2 − − −−1/2 +−1/2 + Seek eH∈Γ⊥Γ()curl ;, jH∈ (divΓ ;Γ ) and eH∈Γ⊥Γ()curl ;
+−1/2 + jH∈Γ ()divΓ ; such that
λ −−,,,,,e −=+λ −−j λ −+e λ −+j ΓΓ−− ΓΓ −− vj−−,,,,,=−− ve − − vj −+ ve − + ΓΓΓΓ−−−− (4.12) λ ++,,,,,e =++λ ++j λ +−j λ +−e ΓΓΓΓ++++ vj++,,,,,=−− ve + + vj +− ve + − ΓΓΓΓ++++
−−1/2 − − −−1/2 + −+1/2 ∀∈vH⊥Γ()curl ;, Γ , λ ∈ H (divΓ ;Γ ) and vH∈Γ⊥Γ()curl ; ,
+−1/2 + λ ∈ΓH ()divΓ ; .
− −−1/2 Using the fact that for all vH∈ curl;Ω , γ vvH− ≡ ∈Γcurl ; , the above ( ) t ( )Γ ⊥Γ( ) variational statement is combined with the interior problem variational statement.
Specifically, the surface integral term in (4.11) is divided into two halves. One half remains intact and the other half is replaced by the second equation of (4.12). Similarly the exterior variational problem and transmission equations are combined by substituting the right-hand side terms of the last two equations of (4.12) into (4.10). Subsequently final variational form of the coupled problem reads as:
− −−1/2 −−1/2 − Seek EH∈Ω()curl; , eH∈ ⊥Γ(curl ,Γ ) , jH∈Γ ()divΓ ; ,
+−1/2 + +−1/2 + eH∈Γ⊥Γ()curl , , and jH∈ (divΓ ;Γ ) such that
jk0000−− jk − − jk −+ jk − + b()vE,,++−− vj− ve ,−− vj , ve , − 2222Γ ΓΓ Γ (4.13) =−jk η vJ,,imp 0 Ω 61 jkλ −−,,,,0,e −−−= jkλ −−j jkλ −+e jk λ −+j (4.14) 0000ΓΓ−− ΓΓ −−
jk000+− jk +− jk ++ −−−λ ,,,j +++λ e λ j 222ΓΓΓ (4.15) ++−∇⋅Ψ=−jkλ ++,(Cm ) k2 λ ++ ,()Aj λ ++ , ()j jk λ + ,einc , 00ΓΓ++τ ΓΓ ++ 0
jk jk jk 000++ +− +−ˆ + + + ve,,,+++−−+× ve vjjk0 n vCj ,() + 222ΓΓΓ Γ (4.16) 2 ˆˆ++ + ++ + +inc −×k0 nvAm,( )+ +∇⋅×Ψτ nv ,( m ) = vj ,+ , ΓΓ() Γ+
−−1/2 − − −−1/2 +−1/2 + ∀∈vH()curl; Ω, vH∈Γ⊥Γ(curl , ), and λ ∈ H (divΓ ,Γ ) , λ ∈ΓH ()divΓ , ,
+−1/2 + and vH∈Γ⊥Γ()curl , .
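The "two halves" step for the interior equation can be written out explicitly. The lines below are a sketch that assumes the surface term of the interior variational form carries the factor $jk_0$, which is what the four $jk_0/2$ terms of the coupled interior equation suggest; the second line substitutes the transmission relation tested with $\mathbf{v}^-$.

```latex
\begin{aligned}
jk_0\left\langle \mathbf{v}^-, \mathbf{j}^- \right\rangle_{\Gamma^-}
 &= \tfrac{jk_0}{2}\left\langle \mathbf{v}^-, \mathbf{j}^- \right\rangle_{\Gamma^-}
  + \tfrac{jk_0}{2}\left\langle \mathbf{v}^-, \mathbf{j}^- \right\rangle_{\Gamma^-} \\
 &= \tfrac{jk_0}{2}\left\langle \mathbf{v}^-, \mathbf{j}^- \right\rangle_{\Gamma^-}
  + \tfrac{jk_0}{2}\Big(
      \left\langle \mathbf{v}^-, \mathbf{e}^- \right\rangle_{\Gamma^-}
    - \left\langle \mathbf{v}^-, \mathbf{j}^+ \right\rangle_{\Gamma^-}
    - \left\langle \mathbf{v}^-, \mathbf{e}^+ \right\rangle_{\Gamma^-}
    \Big).
\end{aligned}
```

Keeping one half of the original term is what produces matching off-diagonal blocks in the final system and hence its symmetry.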
4.1.6 Matrix Form
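The final matrix equation corresponding to (4.13)-(4.16) is written as
\[
\begin{bmatrix}
A_{II} & A_{I\Gamma^-} & 0 & 0 & 0 \\
A_{\Gamma^- I} & A_{\Gamma^-\Gamma^-} + T_{\Gamma^-\Gamma^-} & D_{\Gamma^-\Gamma^-} & -T_{\Gamma^-\Gamma^+} & -D_{\Gamma^-\Gamma^+} \\
0 & D^{T}_{\Gamma^-\Gamma^-} & -T_{\Gamma^-\Gamma^-} & -D^{T}_{\Gamma^-\Gamma^+} & -T_{\Gamma^-\Gamma^+} \\
0 & -T^{T}_{\Gamma^-\Gamma^+} & -D_{\Gamma^-\Gamma^+} & Q + T_{\Gamma^+\Gamma^+} & P \\
0 & -D^{T}_{\Gamma^-\Gamma^+} & -T^{T}_{\Gamma^-\Gamma^+} & P^{T} & Q - T_{\Gamma^+\Gamma^+}
\end{bmatrix}
\begin{bmatrix} \mathbf{E} \\ \mathbf{e}^- \\ \mathbf{j}^- \\ \mathbf{e}^+ \\ \mathbf{j}^+ \end{bmatrix}
=
\begin{bmatrix} \mathbf{y}_I \\ 0 \\ 0 \\ \mathbf{y}_H \\ \mathbf{y}_E \end{bmatrix}.
\tag{4.17}
\]
The explicit forms of the matrix sub-blocks $A_{II}$, $A_{I\Gamma^-}$, $A_{\Gamma^- I}$, $A_{\Gamma^-\Gamma^-}$ and the interior excitation vector $\mathbf{y}_I$ were previously defined in (2.12) and (2.14). The remaining sub-blocks are:
\[
T_{\Gamma^-\Gamma^-} = \frac{jk_0}{2}\left\langle\gamma_t\mathbf{v}^-, \gamma_t\mathbf{v}^-\right\rangle_{\Gamma^-}, \qquad
T_{\Gamma^-\Gamma^+} = \frac{jk_0}{2}\left\langle\gamma_t\mathbf{v}^-, \mathbf{v}^+\right\rangle_{\Gamma^-}, \qquad
T_{\Gamma^+\Gamma^+} = \frac{jk_0}{2}\left\langle\boldsymbol{\lambda}^+, \boldsymbol{\lambda}^+\right\rangle_{\Gamma^+},
\tag{4.18}
\]
\[
D_{\Gamma^-\Gamma^-} = \frac{jk_0}{2}\left\langle\gamma_t\mathbf{v}^-, \boldsymbol{\lambda}^-\right\rangle_{\Gamma^-}, \qquad
D_{\Gamma^-\Gamma^+} = \frac{jk_0}{2}\left\langle\gamma_t\mathbf{v}^-, \boldsymbol{\lambda}^+\right\rangle_{\Gamma^-},
\tag{4.19}
\]
\[
Q = k_0^2\left\langle\boldsymbol{\lambda}^+, A(\boldsymbol{\lambda}^+)\right\rangle_{\Gamma^+} - \left\langle\nabla_\tau\cdot\boldsymbol{\lambda}^+, \Psi(\boldsymbol{\lambda}^+)\right\rangle_{\Gamma^+}, \qquad
P = jk_0\left\langle\boldsymbol{\lambda}^+, C(\boldsymbol{\lambda}^+)\right\rangle_{\Gamma^+}.
\tag{4.20}
\]
The BEM excitation vectors $\mathbf{y}_H$ and $\mathbf{y}_E$, which are nonzero for scattering problems only, are defined as
\[
\mathbf{y}_H = \left\langle\mathbf{v}^+, \mathbf{j}^{inc}\right\rangle_{\Gamma^+}, \qquad
\mathbf{y}_E = -jk_0\left\langle\boldsymbol{\lambda}^+, \mathbf{e}^{inc}\right\rangle_{\Gamma^+}.
\tag{4.21}
\]
Note that the BEM matrix sub-blocks Q and P are dense and symmetric, where Q also corresponds to the impedance matrix of the traditional electric field integral equation (EFIE) formulation for PEC geometries, and P is the symmetric part of the impedance matrix of the conventional magnetic field integral equation (MFIE) formulation for PEC obstacles. Moreover, the system matrix for the BEM domain in the absence of the mass matrix $T_{\Gamma^+\Gamma^+}$ is very similar to that of the PMCHWT formulation [68] for dielectric objects embedded in free space. It is well known that the PMCHWT formulation is free of internal resonance [68]. Furthermore, the mass matrix $T_{\Gamma^+\Gamma^+}$ is also well known to be positive definite, thus its presence makes the BEM system matrix even better conditioned. These observations will be confirmed in the next chapter through numerical studies of some canonical examples.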
4.2 Preconditioning Schemes
To facilitate the discussion in this section, we rewrite (4.17) as
\[
\begin{bmatrix} K_{FEM} & G \\ G^{T} & K_{BEM} \end{bmatrix}
\begin{bmatrix} \mathbf{x}_{FEM} \\ \mathbf{x}_{BEM} \end{bmatrix}
=
\begin{bmatrix} \mathbf{y}_{FEM} \\ \mathbf{y}_{BEM} \end{bmatrix}.
\tag{4.22}
\]
In this section, we propose three Schwarz type preconditioners to solve (4.22) effectively. They are termed the domain diagonal block (DDB), additive-multiplicative Schwarz (AMS), and multiplicative-multiplicative Schwarz (MMS) preconditioners.
4.2.1 DDB Preconditioner
Because the proposed approach is based on the concept of DDM, which is often viewed as an effective preconditioner for Krylov subspace iterative solvers [1], a simple and natural preconditioner M is
\[
M^{-1} = \begin{bmatrix} (K_{FEM})^{-1} & \\ & (K_{BEM})^{-1} \end{bmatrix}.
\tag{4.23}
\]
In the DDM community, the form of (4.23) is also known as an additive Schwarz preconditioner [1]. When a Krylov subspace solver such as the conjugate gradient (CG) solver [69] is equipped with this preconditioner, the preconditioned residual $\tilde{\mathbf{r}} = M^{-1}\mathbf{r}$ must be computed at each iteration of the solver, where $\mathbf{r}$ and $\tilde{\mathbf{r}}$ are the residual and preconditioned residual vectors, respectively. Therefore the solution process can be considered as an inner-outer loop iteration scheme, where the inner loop involves the sub-domain solutions.
In order to accelerate the inner-loop convergence, the p-Type Multiplicative Schwarz (pMUS) preconditioner [30] and the “geo-neighboring” preconditioner [70] can be utilized for the FEM matrix and the BEM matrix, respectively.
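Applying the block-diagonal preconditioner amounts to two independent sub-domain solves per outer iteration. The sketch below is an illustration only: it uses a small dense Gaussian elimination as a stand-in for the inner FEM and BEM solvers, and the function names and matrix values are hypothetical.

```python
def solve_dense(K, rhs):
    """Solve K z = rhs by Gaussian elimination with partial pivoting (small dense blocks only)."""
    n = len(rhs)
    M = [[complex(v) for v in row] + [complex(r)] for row, r in zip(K, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[piv] = M[piv], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    z = [0j] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def apply_ddb(K_fem, K_bem, r_fem, r_bem):
    """Preconditioned residual for a block-diagonal preconditioner:
    the FEM and BEM blocks are solved independently (inner loop)."""
    return solve_dense(K_fem, r_fem), solve_dense(K_bem, r_bem)

# Hypothetical small blocks standing in for K_FEM and K_BEM.
K_fem = [[4, 1], [1, 3]]
K_bem = [[2 + 1j]]
z_fem, z_bem = apply_ddb(K_fem, K_bem, [1, 2], [1j])
print(z_fem, z_bem)
```

Because the two solves do not depend on each other, they can be carried out with entirely different solvers, which is the modularity the formulation aims for.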
4.2.2 AMS Preconditioner
Preconditioning matrices for the FEM and BEM matrices can be used directly instead of computationally intensive inverse operations, provided that they are good approximations of the inverse matrices. This leads to the AMS preconditioner of the form
\[
M^{-1} = \begin{bmatrix} (M_{FEM})^{-1} & \\ & (M_{BEM})^{-1} \end{bmatrix},
\tag{4.24}
\]
where $M_{FEM}$ is the pMUS preconditioner and $M_{BEM}$ is the “geo-neighboring” preconditioner. Note that the preconditioners are in factorized form via incomplete Choleski factorization, thus the operations involved in applying their inverses are simply forward and backward substitutions.
4.2.3 MMS Preconditioner
To compromise between effectiveness and efficiency, the pMUS preconditioner can be applied directly to (4.23), and the inverses of the diagonal blocks are in turn approximated by their preconditioners. Namely, we have a multi-level multiplicative Schwarz preconditioner
\[
M^{-1} =
\begin{bmatrix} I & -(M_{FEM})^{-1}C \\ 0 & I \end{bmatrix}
\begin{bmatrix} (M_{FEM})^{-1} & 0 \\ 0 & (M_{BEM})^{-1} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -C^{T}(M_{FEM})^{-1} & I \end{bmatrix},
\tag{4.25}
\]
where I is an identity matrix. The operations involved in (4.25) are forward and backward substitutions plus two additional sparse matrix-vector multiplications.
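The three-factor product can be applied to a residual without ever forming a matrix. The sketch below assumes a symmetric block-factorized form with the coupling block C entering both triangular factors with a minus sign (an assumption about the sign convention); the solver and multiplication callbacks are hypothetical placeholders for the factorized preconditioners and the sparse coupling matrix.

```python
def apply_mms(solve_fem, solve_bem, C_mul, Ct_mul, r_fem, r_bem):
    """Apply [I, -Mf^-1 C; 0, I] * diag(Mf^-1, Mb^-1) * [I, 0; -C^T Mf^-1, I] to (r_fem, r_bem)."""
    w_fem = solve_fem(r_fem)                                  # Mf^-1 r_fem
    t_bem = [a - b for a, b in zip(r_bem, Ct_mul(w_fem))]     # r_bem - C^T Mf^-1 r_fem
    z_bem = solve_bem(t_bem)                                  # Mb^-1 (...)
    corr = solve_fem(C_mul(z_bem))                            # Mf^-1 C z_bem
    z_fem = [a - b for a, b in zip(w_fem, corr)]
    return z_fem, z_bem

# Scalar sanity check (hypothetical numbers): A = [[2, 1], [1, 3]].
# If Mf equals the FEM block and Mb equals the Schur complement 3 - 1/2 = 2.5,
# the product above reproduces the exact inverse of A.
z_fem, z_bem = apply_mms(
    solve_fem=lambda v: [v[0] / 2.0],
    solve_bem=lambda v: [v[0] / 2.5],
    C_mul=lambda v: [1.0 * v[0]],
    Ct_mul=lambda v: [1.0 * v[0]],
    r_fem=[1.0],
    r_bem=[1.0],
)
print(z_fem, z_bem)
```

The routine uses the coupling block exactly twice (once as C and once as its transpose), in line with the two extra sparse matrix-vector multiplications mentioned above.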
4.3 Hybrid DDM and BEM
Having described the symmetric formulation as well as the preconditioners for the two-domain formulation of hybrid FEM-BEM, the generalization to a multiple-domain decomposition of the interior FEM domain is straightforward. If we denote the BEM domain as the Nth domain and assume a 2-dimensional arrangement of sub-domains, the final system matrix can then be written by careful examination of (2.9) and (4.17) as
\[
\begin{bmatrix}
K_1 & -G_{12} & 0 & \cdots & -G_{1N} \\
-G_{21} & K_2 & \ddots & & -G_{2N} \\
0 & \ddots & \ddots & \ddots & \vdots \\
\vdots & & \ddots & K_{N-1} & -G_{N-1,N} \\
-G_{N1} & -G_{N2} & \cdots & -G_{N,N-1} & K_{BEM}
\end{bmatrix}
\begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \\ \vdots \\ \mathbf{u}_{N-1} \\ \mathbf{v}_{BEM} \end{bmatrix}
=
\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \\ \vdots \\ \mathbf{y}_{N-1} \\ \mathbf{y}_{BEM} \end{bmatrix}.
\tag{4.26}
\]
All the sub-matrices and the column vectors follow as obvious extensions of (2.10), (2.11), and (4.17). Notice that the global nature of the BEM formulation is expressed through its coupling with all interior FEM domains. Lastly, we comment that the equivalent form of the MMS preconditioner for (4.26) corresponds to the symmetric Gauss-Seidel preconditioner described in Algorithm 2.2.
CHAPTER 5
HYBRID FEM-BEM RESULTS AND NUMERICAL STUDIES
This chapter verifies the accuracy, the absence of internal resonances, and the performance of the proposed preconditioners for the hybrid FEM-BEM formulation. Furthermore, we also verify the accuracy of the hybrid DDM-BEM implementation through a few antenna arrays, along with a study of the DDM convergence of this hybridization. Note that a preconditioned residual error of $10^{-2}$ is used for all examples. All computations were performed on a 64-bit AMD Opteron 246 workstation with a 1 MB L2 cache and 16 GB of RAM.
5.1 Air Box: Internal Resonance and Numerical Stability Study
We first study the internal resonance issue via a simple one-meter square box computational domain. The air box is discretized with tetrahedral elements of size approximately h=λ0/5, where λ0 is the free space wavelength. For the present geometry the internal resonance (both TE and TM modes due to degeneracy) should occur around 212MHz. To identify the presence or absence of internal resonances, an estimate of the spectral condition number $\kappa(A) = \lambda_{max}(A)/\lambda_{min}(A)$ of the system matrix A is computed in the neighborhood of the suspected resonance frequency. The condition number is estimated using the open source software SPARSE. The results are shown in Fig. 5.1(b) and 5.1(c) with solid blue lines. It is apparent that neither the diagonally scaled nor the unscaled system shows signs of a condition number increase around the resonance. On the other hand, the internal resonance problem surfaces exactly at 212MHz, as shown in Fig. 5.1(a), when Costabel’s symmetric FEBE [58], [61], [62] is employed. It is interesting that the diagonal scaling does affect the bandwidth of the resonance, but not the location of the resonance. Here it should be noted that the actual values of the condition number curves in subfigure (a) (Costabel’s symmetric FEBE) versus (b) and (c) (proposed approach) should not be compared directly because the numbers of unknowns, and thus the sizes of the two matrices, are different. Notice that the condition number for the BEM sub-matrix is also plotted as a black solid line. When the sparse matrix T is added, the condition number improves by one order of magnitude (red solid line).

[Figure 5.1 panel labels: no diagonal scaling; with diagonal scaling; panels (a), (b), (c).]
Figure 5.1 Condition number in the neighborhood of the “internal” resonance; (a) Costabel’s symmetric FEBI formulation, (b) present approach without diagonal scaling, (c) present approach with diagonal scaling.
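The quoted resonance frequency can be checked against the closed-form resonances of a rectangular cavity. The sketch below assumes the box is a 1 m × 1 m × 1 m cube with perfectly conducting walls and evaluates the lowest mode; the function name is hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def cavity_resonance(a, b, d, m, n, p):
    """Resonant frequency (Hz) of mode (m, n, p) in an a x b x d rectangular PEC cavity."""
    return 0.5 * C0 * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (p / d) ** 2)

# Lowest mode of a 1 m cube: two indices equal to one, the third zero.
f_110 = cavity_resonance(1.0, 1.0, 1.0, 1, 1, 0)
print(f"lowest resonance: {f_110 / 1e6:.2f} MHz")
```

By symmetry of the cube, the index pattern (1, 1, 0) can be permuted over the three axes, which is consistent with the degeneracy mentioned in the text.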
To gain further insight into the numerical stability of the proposed formulation, the complete eigenvalue distribution of the coupling matrix, or more precisely the matrix $M_{DDB}^{-1}\left(M_{DDB} - A\right)$, is considered for the same one-meter square box computational domain. The results are plotted in Fig. 5.2 for increasing mesh densities. It is observed that most eigenvalues are inside the unit circle (propagating modes) and only very few are on the unit circle (evanescent modes); thus the spectral radius of the coupling matrix is less than or equal to one. In many practical applications, the presence of evanescent modes is negligible, and even if they are present they can be easily taken care of by the Krylov iterations. However, in these situations, the use of the simple Gauss-Seidel iteration method may exhibit slow convergence or even divergence. Therefore, for robust matrix solution performance, we strongly recommend the employment of Krylov subspace iteration methods. It is apparent that the spectrum has three accumulation centers: propagating modes around zero, and evanescent modes close to the unit circle rim around −1 and +1. Moreover, we observe that as the mesh size decreases, the accumulation points become more clustered around −1, 0 and 1.

Figure 5.2 Eigenvalue distribution of the preconditioned system ($I - M^{-1}A$) for: (a) N = 1076 unknown problem, (b) N = 2708 unknown problem, (c) N = 4824 unknown problem. The frequency is kept constant at f=300MHz.
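Why eigenvalues on the unit circle are harmful for a stationary iteration can be seen on a two-unknown model. The sketch below is an illustration with hypothetical numbers: the preconditioned matrix is taken to be diagonal with eigenvalues 0.5 and 2, so that the iteration matrix I − M⁻¹A has eigenvalues 0.5 (inside the unit circle) and −1 (on it).

```python
def richardson(B_diag, b, iters):
    """Stationary iteration x <- x + (b - B x) for a diagonal preconditioned matrix B = M^-1 A."""
    x = [0.0] * len(b)
    for _ in range(iters):
        x = [xi + (bi - di * xi) for xi, di, bi in zip(x, B_diag, b)]
    return x

B_diag = [0.5, 2.0]      # eigenvalues of M^-1 A; I - M^-1 A then has eigenvalues 0.5 and -1
b = [1.0, 1.0]
exact = [bi / di for bi, di in zip(b, B_diag)]

x = richardson(B_diag, b, 51)
errors = [abs(xi - ei) for xi, ei in zip(x, exact)]
print(errors)   # first component converges, second keeps oscillating
```

The component tied to the eigenvalue on the unit circle never improves, whereas a residual-minimizing Krylov method applied to a diagonalizable matrix with only two distinct eigenvalues terminates in at most two steps.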
5.2 Dielectric Sphere: Convergence Study
To study the convergence properties of the proposed method, a dielectric sphere with progressively increasing mesh density is considered. With a diameter of d=(4/3)λ0 and relative permittivity of εr=2.0, the sphere is facetized with unstructured triangles of the order of h=λ0/5 at the coarsest level. The BEM (truncation) boundary is placed on the surface of the sphere. A series of progressively increasing mesh densities is constructed and simulated with the present method. The root mean square (RMS) error of the RCS of the sphere is then computed, defined by
\[
\text{RMS RCS error} = \sqrt{\frac{\displaystyle\int_0^{2\pi}\!\!\int_0^{\pi}\left|\sigma_{FEM\text{-}BEM}(\theta,\phi) - \sigma_{Mie}(\theta,\phi)\right|^2 d\theta\,d\phi}{\displaystyle\int_0^{2\pi}\!\!\int_0^{\pi}\left|\sigma_{Mie}(\theta,\phi)\right|^2 d\theta\,d\phi}},
\tag{5.1}
\]
with σFEM-BEM and σMie being the RCS of the proposed method and of the analytical Mie series solution, respectively.
A large number of angular sampling points is taken to ensure an accurate error indication. The results of this study are presented in Fig. 5.3(a), where the RCS error is plotted versus the number of total unknowns. Note that the values obtained from the simulations (blue squares) are compared against the second and first order slope lines. This is done because the present implementation uses second order FEM and first order BEM basis functions. It is believed that for coarse discretizations the discretization error of the FEM is predominant, while for higher degrees of accuracy the truncation error from the BEM boundary dominates, and it is only first order accurate. Thus, in the asymptotic limit the method is only first order accurate even though second order FEM is utilized. The preconditioned CG convergence, with the $M_{DDB}$ preconditioner, is plotted in Fig. 5.3(b) for the smallest and largest discretizations. The solid red line represents a discretization of h=λ0/5, which results in 29,236 total unknowns. The solid blue line in the same figure represents the finest discretization of approximately h=λ0/20 and 3,031,760 total unknowns. Despite the large difference in both matrix sizes and discretizations, the iterative solver behaves very well, with a total iteration number only mildly dependent on the discretization size.

[Figure 5.3 annotations: O(h2) interior FEM error; O(h) truncation reflection error; panels (a) and (b).]
Figure 5.3 Convergence properties of proposed DD FEM-BEM, (a) RCS error vs. discretization, (b) history of the iterative convergence for the smallest and largest discretization.
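A relative RMS error of this kind, and the observed order of convergence between two mesh sizes, can be evaluated from sampled data as sketched below. The sketch assumes both RCS patterns are sampled on the same uniform (θ, φ) grid, so that the quadrature weights cancel in the ratio, and that the error measure includes a square root; the function names and sample values are hypothetical.

```python
import math

def rms_rcs_error(sigma_num, sigma_ref):
    """Relative RMS error between two RCS sample lists taken on the same uniform angular grid."""
    num = sum(abs(a - b) ** 2 for a, b in zip(sigma_num, sigma_ref))
    den = sum(abs(b) ** 2 for b in sigma_ref)
    return math.sqrt(num / den)

def observed_order(err_coarse, err_fine, h_coarse, h_fine):
    """Slope of log(error) versus log(h) between two discretizations."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)

# Hypothetical samples: a numerical pattern that is uniformly 10% too large.
sigma_ref = [1.0, 2.0, 4.0, 3.0]
sigma_num = [1.1 * s for s in sigma_ref]
err = rms_rcs_error(sigma_num, sigma_ref)

# Hypothetical errors at h = 1/5 and h = 1/10 wavelengths.
order = observed_order(0.04, 0.01, 1 / 5, 1 / 10)
print(err, order)
```

Comparing the observed order between successive refinements against the slopes 2 and 1 is one way to locate the transition between the two error regimes discussed above.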
5.3 Coated Sphere Scattering: Accuracy
We verify the use of non-conformal meshes using a dielectric coated PEC sphere borrowed from [61]. The sphere consists of an inner PEC sphere of radius 0.3423λ0 coated by a dielectric of εr=4.0 and μr=1.0 with an outer radius of 0.444λ0. These dimensions are chosen such that the internal resonance occurs at the frequency of operation. The problem domain is truncated by a cubic box with an edge length of 1λ0. For the FEM domain a discretization size of h=λ0/5 is used, resulting in 122,248 unknowns. On the other hand, due to the use of lower order basis functions, the BEM domain is discretized with h=λ0/7, requiring 2,436 unknowns. Fig. 5.4 shows the bistatic scattering pattern obtained by the proposed method compared with the analytical solution. For this example, a Gauss-Seidel stationary solver in an inner-outer loop scheme is used. The solver converges in 8 iterations and no sign of internal resonance is observed.
Figure 5.4 Scattering by a dielectric coated PEC sphere at the internal resonance frequency. Insets: geometry, FEM mesh, BEM mesh.
Two additional PEC spheres coated with an εr=2.0 dielectric are considered to further verify the accuracy of the present approach. The inner and outer radii are scaled such that the computational volume increases from 9λ0³ to 30λ0³. Using a conformal mesh with a uniform discretization size of h=λ0/5, the total number of unknowns increases from 948,168 to 3,065,754. The total memory scales from 1,027MB to 4,054MB, which includes storage of both FEM and BEM matrices, the preconditioners and the coupling matrix. The bistatic patterns are compared with the Mie series in Fig. 5.5, where very good agreement is observed.
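The quoted computational volumes can be cross-checked against the radii labeled in Fig. 5.5 (a=1.8λ0, b=2.0λ0 and a=2.7λ0, b=3.0λ0). The short sketch below assumes that the computational volume is the dielectric shell between the PEC core and the outer surface; that interpretation, and the helper name, are assumptions.

```python
import math

def shell_volume(a, b):
    """Volume between concentric spheres of radii a < b (in cubic wavelengths)."""
    return 4.0 / 3.0 * math.pi * (b**3 - a**3)

# Radii in free-space wavelengths, taken from the labels of Fig. 5.5.
small = shell_volume(1.8, 2.0)
large = shell_volume(2.7, 3.0)

# Total unknown counts quoted in the text for the two spheres.
unknowns_small, unknowns_large = 948_168, 3_065_754

print(f"small shell: {small:.2f} cubic wavelengths")
print(f"large shell: {large:.2f} cubic wavelengths")
print(f"volume ratio:  {large / small:.3f}")
print(f"unknown ratio: {unknowns_large / unknowns_small:.3f}")
```

Under this assumption the shell volumes come out at roughly 9.1λ0³ and 30.6λ0³, and the growth in unknowns (about 3.2 times) stays close to the growth in volume (about 3.4 times), as expected for a fixed discretization size.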
Figure 5.5 Scattering by dielectric coated PEC spheres; (a) small sphere, a=1.8λ0, b=2.0λ0, (b) large sphere, a=2.7λ0, b=3.0λ0.
Freq.   Unknowns             DDB                AMS                MMS
(MHz)   N_FEM     N_BEM      Iter.  CPU         Iter.  CPU         Iter.  CPU
300     155,720   5,280      53     00:33:32    81     00:02:37    23     00:01:03
600     363,846   24,576     109    15:59:44    104    00:21:58    70     00:12:36
(CPU times in hh:mm:ss)

Table 5.1 Performance of three preconditioners for dielectric sphere example.
5.4 Performance of DDB, AMS and MMS Preconditioners
We continue by studying the performance of the three preconditioners proposed in Chapter 4 using two examples. The computational statistics reported are based on a TFQMR solution of the final system matrix.
5.4.1 Dielectric Sphere
The first example is a dielectric sphere of εr=2 and μr=1, with 1m radius, under monochromatic plane wave incidence. The computational domain is the sphere itself, with the truncation boundary placed at the dielectric-to-air interface. The discretization size is kept at h=λ0/5. Table 5.1 summarizes the convergence behavior for two different frequencies. In terms of solution CPU time, DDB is the most costly due to its inner loop solve. We point out that for DDB to work properly, the inner loop is required to converge below 10⁻⁵, making it unattractive for real-life examples. MMS outperforms the other two in both iteration count and solution time. In terms of matrix assembly time, the BEM matrix is assembled via the IE-FFT algorithm in 126s and 654s for 300MHz and 600MHz, respectively. The memory storage of the BEM portion, including the “geo-neighboring” preconditioner, is 26MB and 151MB, while that of the FEM portion with pMUS is 177MB and 552MB, respectively. The computed bistatic patterns also coincide with the analytical Mie series solutions.

Figure 5.6 A generic battle ship, (a) the geometry and dimensions; and, (b) computational domain for DD-FEM-BEM.
5.4.2 RCS from a Generic Battle Ship
To demonstrate the versatility of the method, the scattering by a generic battleship is analyzed. Fig. 5.6 shows the geometry, dimensions, direction of the incident plane wave, and the computational domain. Notice that the BEM surface is quite complicated and non-convex, and is therefore no longer suitable for an absorbing boundary condition. For simplicity, the ship is unrealistically assumed to float in free space. Since the battleship is perfectly electrically conducting, an efficient PEC-based EFIE MoM solution can be employed, whose results serve as references.
Figure 5.7 Comparisons of the bistatic RCS results of DD-FE-BEM and the MoM, (a) 30MHz, (b) 60MHz.

Figure 5.8 Field distributions, (a) 30MHz, (b) 60MHz.
Freq.   Memory   Unknowns              DDB                AMS                MMS
(MHz)   (MB)     N_FEM       N_BEM     Iter.  CPU         Iter.  CPU         Iter.  CPU
10      196      102,006     10,680    87     00:57:34    106    00:02:11    43     00:01:05
20      553      320,444     27,036    97     03:15:22    133    00:06:56    67     00:04:27
40      2100     1,230,158   72,594    91     17:51:27    132    00:34:08    70     00:28:04
(CPU times in hh:mm:ss)

Table 5.2 Computational statistics of the DD-FE-BEM for solving bistatic RCS of a generic battleship using three different preconditioning strategies.
The bistatic scattering patterns at 30MHz and 60MHz are plotted in Fig. 5.7. For both frequencies the agreement between DD FEM-BEM and MoM is very good. The corresponding electric field distributions at the truncation boundary are shown in Fig. 5.8.
The DD-FEM-BEM mesh is obtained from an initial discretization of h=λ0/5, followed by goal-oriented h-version adaptive mesh refinement with an estimated error of 0.05. The computational statistics of the DD-FE-BEM simulations for computing bistatic patterns of the generic battleship at 10MHz, 20MHz and 40MHz are reported in Table 5.2. The CPU times reported are for the matrix solution process, including the construction of the preconditioners, for the different preconditioning strategies. We note, however, that the time required to construct the preconditioner constitutes only a very small fraction of the total solution time. In this example, even though the frequency is increased from 10MHz to 40MHz, the numbers of iterations for DDB, AMS, and MMS change very little, particularly in the DDB case. This is a much desired feature for real-life problems.
Furthermore, for the DDB preconditioner, we observe that the overall CPU times are significantly greater than those of the other two preconditioners, despite the small number of iterations needed for the DD-FEM-BEM to converge. Consequently, in practical computations, we simply employ either the AMS or the MMS preconditioner in solving the DD-FEM-BEM matrix equations.
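To make the cost comparison concrete, the entries of Table 5.2 can be converted to seconds and compared directly. The sketch below only re-expresses the tabulated values; the dictionary layout and helper names are illustrative choices.

```python
def to_seconds(hms):
    """Convert an 'hh:mm:ss' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds

# (iterations, CPU time) per preconditioner, keyed by frequency in MHz (Table 5.2).
table_5_2 = {
    10: {"DDB": (87, "00:57:34"), "AMS": (106, "00:02:11"), "MMS": (43, "00:01:05")},
    20: {"DDB": (97, "03:15:22"), "AMS": (133, "00:06:56"), "MMS": (67, "00:04:27")},
    40: {"DDB": (91, "17:51:27"), "AMS": (132, "00:34:08"), "MMS": (70, "00:28:04")},
}

def slowdown_vs_mms(freq, name):
    """CPU time of preconditioner `name` relative to MMS at the same frequency."""
    return to_seconds(table_5_2[freq][name][1]) / to_seconds(table_5_2[freq]["MMS"][1])

def seconds_per_iteration(freq, name):
    """Average CPU seconds spent per outer iteration."""
    iterations, cpu = table_5_2[freq][name]
    return to_seconds(cpu) / iterations

for freq in sorted(table_5_2):
    for name in ("DDB", "AMS", "MMS"):
        print(f"{freq:>3} MHz  {name}: "
              f"{slowdown_vs_mms(freq, name):6.1f}x MMS time, "
              f"{seconds_per_iteration(freq, name):7.1f} s/iteration")
```

Read this way, the table shows that the DDB iteration count is comparable to the others but each DDB iteration is far more expensive, which is consistent with the cost being attributed to its inner loop solve.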
5.5 Large Antenna Arrays
We continue by analyzing a few large antenna arrays to demonstrate the performance of the hybrid DDM and BEM approach. For this formulation, a truncated GCR Krylov solver [42], equipped with a symmetric Gauss-Seidel preconditioner as described in Chapter 2, is used to solve the final system matrix. Note that for the hybrid DDM-BEM formulation, our experience indicates that the symmetric Gauss-Seidel preconditioner typically outperforms the Gauss-Seidel preconditioner in terms of solution iteration counts.

Figure 5.9 Dimensions and geometry of a coaxial fed patch array.
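The difference between the two preconditioners can be illustrated on a small model problem. The sketch below uses the textbook point-wise forms, M_GS = D + L and M_SGS = (D + L) D⁻¹ (D + U), inside a simple preconditioned Richardson iteration. All of this is an assumption made for brevity: the dissertation applies the preconditioners sub-domain-wise within a truncated GCR solver, and the test matrix here is a hypothetical tridiagonal system, not a DDM-BEM matrix.

```python
import numpy as np

def split(A):
    """Return the diagonal, strictly lower and strictly upper parts of A."""
    return np.diag(np.diag(A)), np.tril(A, k=-1), np.triu(A, k=1)

def apply_gs(A, r):
    """Gauss-Seidel preconditioner: solve (D + L) z = r."""
    D, L, _ = split(A)
    return np.linalg.solve(D + L, r)

def apply_sgs(A, r):
    """Symmetric Gauss-Seidel: solve (D + L) D^{-1} (D + U) z = r."""
    D, L, U = split(A)
    y = np.linalg.solve(D + L, r)         # forward sweep
    return np.linalg.solve(D + U, D @ y)  # backward sweep

def richardson(A, b, apply_prec, tol=1e-10, max_iter=500):
    """Preconditioned Richardson iteration; returns (solution, iterations used)."""
    x = np.zeros_like(b)
    for it in range(max_iter):
        r = b - A @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, it
        x = x + apply_prec(A, r)
    return x, max_iter

# Hypothetical diagonally dominant tridiagonal test matrix.
n = 20
A = 2.5 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

x_gs, it_gs = richardson(A, b, apply_gs)
x_sgs, it_sgs = richardson(A, b, apply_sgs)
print(f"Gauss-Seidel iterations:           {it_gs}")
print(f"symmetric Gauss-Seidel iterations: {it_sgs}")
```

On this model problem the symmetric variant needs fewer iterations, at the price of two sweeps per application. Dense solves are used instead of triangular substitution only to keep the sketch short.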
5.5.1 Patch Antenna Arrays
We first study the scattering behavior of a planar finite square patch array on a finite grounded dielectric substrate. The same geometry, but with an infinite substrate, has been considered by Pozar in [71]. Fig. 5.9 shows its geometry as well as detailed dimensions. In this study, the array is excited by a normally incident x-polarized plane wave at f=300MHz. We analyze three different array configurations, 2x2, 7x7 and 11x11, and compare their DDM performance with BEM and ABC truncation schemes. For each array, three building blocks are required, and an initial mesh of h=λ0/4 followed by 8 h-adaptive mesh refinement steps is employed for each building block. To accelerate the solution process, the FETI-like algorithm is applied to each building block. This process demands 40MB of memory and 6.5 minutes of CPU time.

Array Size   Unknowns    Memory   Iteration   Total Solution Time (hh:mm:ss)
2x2          402,544     93MB     12          00:00:31
7x7          1,085,124   147MB    12          00:01:22
11x11        1,932,004   205MB    11          00:02:14

Table 5.3 DDM performances of patch antenna arrays with ABC truncation.

Array Size   Unknowns    Memory   Iteration   Total Solution Time (hh:mm:ss)
2x2          163,598     154MB    14          00:04:33
7x7          706,607     343MB    19          00:22:55
11x11        1,446,512   520MB    33          01:30:13

Table 5.4 DDM performances of patch antenna arrays with BEM truncation.
The differences in DDM performance between ABC and BEM truncation can be seen by comparing Tables 5.3 and 5.4. Note that for ABC truncation the boundary is placed λ0/2 away from the elements, while the BEM boundary is placed about 0.1λ0 away. Because of this difference, fewer sub-domains are required for BEM truncation, which in turn results in fewer unknowns than with ABC truncation. However, as is well known, BEM computation is much more computationally intensive. Thus, in terms of solution CPU time, BEM truncation is quite unattractive. Owing to the use of the fast integral equation method, IE-FFT, the memory requirement of the hybrid FEM-BEM is still acceptable. Furthermore, as the array size increases, the iteration count remains relatively unchanged when ABC truncation is employed, whereas for BEM truncation convergence deteriorates. This undesired feature could be attributed to the fact that with ABC truncation the sub-domains are geometrically conforming, whereas the BEM domain is geometrically non-conformal to its neighbors.
The far field patterns resulting from the two truncation schemes are compared in Fig. 5.10 for all three arrays. As expected, fairly accurate results are obtained with ABC truncation in the main directions. The corresponding surface electric field distributions at the truncation boundary are displayed in Fig. 5.11.

Figure 5.10 Far field patterns of patch arrays, (a) 2x2, (b) 7x7, (c) 11x11.

Figure 5.11 Field distributions of patch arrays, (a) 2x2, (b) 7x7, (c) 11x11.
5.5.2 Ultra Wide Band Antenna Arrays
To further study the convergence behavior of the hybrid DDM-BEM, various ultra wide band (UWB) antenna arrays [72] operated at the center frequency of 12GHz are examined. Detailed dimensions of this geometry are given in Fig. 5.12(a), and an example layout of a 5x5 array is illustrated in Fig. 5.12(b). As usual, an initial mesh of h=λ0/4 followed by 8 h-adaptive mesh refinement steps and the FETI-like algorithm are applied to each building block.

Array Size   Unknowns    Memory   Iteration   Total Solution Time (hh:mm:ss)
3x3          217,850     80MB     10          00:01:11
7x7          989,760     142MB    12          00:05:22
10x10        1,944,192   214MB    17          00:16:28
20x20        7,448,726   525MB    38          01:39:16

Table 5.5 DDM performances of UWB arrays with BEM truncation.
The previous comments regarding the comparison of DDM convergence and far field accuracy between ABC and BEM truncation apply directly to this example. Namely, the DDM convergence deteriorates as the array dimension grows, and some disagreement between the far field patterns is observed at the back lobes. These are evident from Table 5.5 and Fig. 5.13. Note that in this example the ABC and BEM boundaries are placed λ0/2 and 0.07λ0 away from the elements, respectively.
Owing to its much shorter solution CPU time and reasonably good accuracy, ABC truncation is used to study the wide band behavior of a 50x50 UWB array at frequencies ranging from 2GHz to 20GHz. For each frequency, the aforementioned rule of thumb on mesh discretization is utilized. The array's directivity computed over the entire frequency spectrum is shown in Fig. 5.14, which indicates a trend of increasing directivity with frequency. The E-plane and H-plane far field radiation patterns and the corresponding surface electric fields at the truncation boundary at 12GHz, 16GHz and 20GHz are shown in Fig. 5.16 and Fig. 5.15, respectively.
Figure 5.12 A UWB antenna array, (a) dimensions of a unit cell in mm, (b) a 5x5 UWB array.

Figure 5.13 Far field patterns of UWB arrays, (a) 3x3, (b) 7x7, (c) 10x10, (d) 20x20.

Figure 5.14 Directivity of a 50x50 UWB array as a function of frequency.

Figure 5.15 Electric field distributions of a 50x50 UWB array, (a) 12GHz, (b) 16GHz, (c) 20GHz.

Figure 5.16 Far field patterns of a 50x50 UWB array; (a)-(c) E-plane at 12GHz, 16GHz and 20GHz, (d)-(f) H-plane at 12GHz, 16GHz and 20GHz.
CHAPTER 6
METAMATERIAL ELECTROMAGNETIC CLOAK: DERIVATION AND FULL-WAVE SIMULATIONS
Pendry et al. [73] have recently reported a novel approach to designing an electromagnetic cloak that makes objects “invisible” to microwaves. In particular, an explicit form of the material properties for the cloaking of spherical PEC objects was provided and later verified both through experiments [74] and through numerical simulations of a 2D cylindrical problem [75].
The main focus of this chapter is the study of such novel structures with the aid of the previously presented numerical methods. Furthermore, the material properties for cloaking PEC spherical objects are derived based on a coordinate transform, also with the help of a numerical method. Namely, the derivation follows procedures very similar to those used for perfectly matched layer (PML) structures as reported in [76]. This aspect demonstrates that numerical methods are much more than simulation tools.
The remainder of this chapter is organized as follows. A proper problem statement is first given. The material properties are then derived from Maxwell's equations in curvilinear coordinates. Their application to a PEC sphere is subsequently analyzed via the hybrid FEM-BEM approach presented in Chapter 4.
Figure 6.1 Cloaking of a PEC sphere of radius R1. Labels in the figure: radii R1 and R2, surface S, position r, cloak material (ε, μ), free space (ε0, μ0).
6.1 Closed Form of Material Properties for EM Cloaking
Referring to Fig. 6.1, given a PEC sphere of radius R1, the goal is to construct an EM cloak in the shape of a concentric spherical shell with inner and outer radii of R1 and R2, respectively. Formally, the problem statement reads:
Find (ε, μ) such that the exterior fields (both E and H) in the region r ≥ R2 are identical to the incident fields.
Note that with the boundary conditions given above, where both E and H fields are specified, a unique solution might not necessarily exist. Fortunately, by following the procedure for deriving conformal PML tensors via the coordinate stretching technique, a possible solution is reached. The main difference between the present derivation and that of the conformal PML is the use of a different coordinate transform.
6.1.1 Derivation of Material Properties for the Cloaking of PEC Spheres
The basic strategy is outlined as follows. The derivation starts from an initial configuration of sources embedded in a coordinate system where the material properties are known. This is the system to which the incident fields conform. This coordinate system is then twisted into a new coordinate system in which the cloak resides. It turns out that Maxwell's equations are form-invariant under coordinate transforms. The only components affected are the material properties (ε, μ), which become both spatially varying and anisotropic.
The main vehicle to achieve the desired material properties is Maxwell's equations restated for any orthogonal system in curvilinear coordinates ξ1, ξ2, and ξ3, given as [77]
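\[
\begin{aligned}
\frac{1}{h_2 h_3}\left[\frac{\partial}{\partial \xi_2}\left(h_3 E_3\right) - \frac{\partial}{\partial \xi_3}\left(h_2 E_2\right)\right] - j\omega\mu H_1 &= 0, \\
\frac{1}{h_3 h_1}\left[\frac{\partial}{\partial \xi_3}\left(h_1 E_1\right) - \frac{\partial}{\partial \xi_1}\left(h_3 E_3\right)\right] - j\omega\mu H_2 &= 0, \\
\frac{1}{h_1 h_2}\left[\frac{\partial}{\partial \xi_1}\left(h_2 E_2\right) - \frac{\partial}{\partial \xi_2}\left(h_1 E_1\right)\right] - j\omega\mu H_3 &= 0, \\
\frac{\partial}{\partial \xi_1}\left(h_2 h_3 \varepsilon E_1\right) + \frac{\partial}{\partial \xi_2}\left(h_3 h_1 \varepsilon E_2\right) + \frac{\partial}{\partial \xi_3}\left(h_1 h_2 \varepsilon E_3\right) &= h_1 h_2 h_3\, \rho.
\end{aligned}
\tag{6.1}
\]

\[
\begin{aligned}
\frac{1}{h_2 h_3}\left[\frac{\partial}{\partial \xi_2}\left(h_3 H_3\right) - \frac{\partial}{\partial \xi_3}\left(h_2 H_2\right)\right] - j\omega\varepsilon E_1 &= J_1, \\
\frac{1}{h_3 h_1}\left[\frac{\partial}{\partial \xi_3}\left(h_1 H_1\right) - \frac{\partial}{\partial \xi_1}\left(h_3 H_3\right)\right] - j\omega\varepsilon E_2 &= J_2, \\
\frac{1}{h_1 h_2}\left[\frac{\partial}{\partial \xi_1}\left(h_2 H_2\right) - \frac{\partial}{\partial \xi_2}\left(h_1 H_1\right)\right] - j\omega\varepsilon E_3 &= J_3, \\
\frac{\partial}{\partial \xi_1}\left(h_2 h_3 \mu H_1\right) + \frac{\partial}{\partial \xi_2}\left(h_3 h_1 \mu H_2\right) + \frac{\partial}{\partial \xi_3}\left(h_1 h_2 \mu H_3\right) &= 0.
\end{aligned}
\tag{6.2}
\]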
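Here the electric and magnetic fields are denoted respectively as $\mathbf{E} = \hat{x}E_1 + \hat{y}E_2 + \hat{z}E_3$ and $\mathbf{H} = \hat{x}H_1 + \hat{y}H_2 + \hat{z}H_3$. The impressed sources are expressed via the electric charge ρ and the current $\mathbf{J}^{imp} = \hat{x}J_1 + \hat{y}J_2 + \hat{z}J_3$. Moreover, $h_1$, $h_2$, and $h_3$ are generally known as metrical coefficients, which can be computed from the formula [77]

\[
h_i^2 = \left(\frac{\partial x}{\partial \xi_i}\right)^2 + \left(\frac{\partial y}{\partial \xi_i}\right)^2 + \left(\frac{\partial z}{\partial \xi_i}\right)^2. \tag{6.3}
\]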
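As a quick numerical illustration of Eq. (6.3), the metrical coefficients of ordinary spherical coordinates can be recovered by finite differences. In the sketch below the coordinates are ordered as (θ, φ, r) so that the third coordinate is the radial one; the function names, the evaluation point and the step size are illustrative assumptions.

```python
import numpy as np

def position(xi):
    """Cartesian position (x, y, z) for spherical coordinates xi = (theta, phi, r)."""
    theta, phi, r = xi
    return np.array([
        r * np.sin(theta) * np.cos(phi),
        r * np.sin(theta) * np.sin(phi),
        r * np.cos(theta),
    ])

def metric_coefficients(position_fn, xi, step=1e-6):
    """Evaluate h_i of Eq. (6.3) by central differences of the position vector."""
    xi = np.asarray(xi, dtype=float)
    h = np.empty(3)
    for i in range(3):
        delta = np.zeros(3)
        delta[i] = step
        # d(x, y, z)/d(xi_i); its Euclidean norm is the square root of Eq. (6.3)
        derivative = (position_fn(xi + delta) - position_fn(xi - delta)) / (2.0 * step)
        h[i] = np.linalg.norm(derivative)
    return h

theta0, phi0, r0 = 0.7, 1.3, 2.0
h = metric_coefficients(position, (theta0, phi0, r0))
print("numerical :", h)
print("analytical:", np.array([r0, r0 * np.sin(theta0), 1.0]))
```

The numerical values agree with the familiar closed forms h_θ = r, h_φ = r sin θ and h_r = 1 to within the finite-difference error.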
We adopt the Dupin coordinate system [78] with unit vectors $\hat{u}_i = (\partial \mathbf{r}/\partial \xi_i)/\left|\partial \mathbf{r}/\partial \xi_i\right|$, $i = 1, 2, 3$, such that $\hat{u}_1$ and $\hat{u}_2$ are tangent to the surface S (see Fig. 6.1) and $\hat{u}_3$ is normal to S. Furthermore, in this coordinate system the metrical coefficients are given explicitly as [76]

\[
h_1 = \frac{r_{01} + \xi_3}{r_{01}}, \qquad h_2 = \frac{r_{02} + \xi_3}{r_{02}}, \qquad h_3 = 1, \tag{6.4}
\]

where $r_{01}$ and $r_{02}$ represent the principal radii of curvature [78]. For the spherical object shown in Fig. 6.1, $r_{01} = r_{02} = R_1$, $\hat{u}_1 = \hat{\theta}$, $\hat{u}_2 = \hat{\phi}$ and $\xi_3 = r$.
Using the local coordinate system defined above, we compress a spherical region