RESEARCH STATEMENT

XUEMIN TU

1. Overview. I am working on developing efficient numerical algorithms for solving the systems obtained from the discretization of partial differential equations (PDEs). Much of my work has so far been supervised by Professor Olof Widlund.

Usually the first step of solving an elliptic PDE numerically is its discretization. Finite difference, finite element, or other discretizations reduce the original PDE to an often huge and ill-conditioned linear or nonlinear system of algebraic equations. Limited by the memory and speed of computers, traditional direct solvers often cannot handle such large linear systems. Moreover, iterative methods, such as Krylov space methods, may need thousands of iterations to obtain accurate solutions because of the large condition numbers. Domain decomposition methods provide efficient and scalable preconditioners that can be accelerated by Krylov space methods, and they have become popular in applications in computational fluid dynamics, structural mechanics, electromagnetics, constrained optimization, etc. The basic idea of domain decomposition methods is to split the original huge problem into many small problems that can be handled by direct solvers, to solve these smaller problems a number of times, and to accelerate the solution of the original problem with Krylov space methods.

There are two main classes of domain decomposition methods: overlapping Schwarz methods and iterative substructuring methods. My research focuses on the second approach. In iterative substructuring methods, the domain is decomposed into nonoverlapping subdomains. The unknowns in the interior of the subdomains are first eliminated independently, and we then work with the Schur complement with respect to the unknowns associated with the interface. Coarse problems are constructed using one or a few degrees of freedom for each subdomain.
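The elimination of interior unknowns can be illustrated with a minimal sketch. It assumes a 1D Poisson model problem with two subdomains and a single interface node, and small dense solves in place of the sparse direct solvers used in practice; the function names are illustrative and are not taken from any of the cited work.

```python
def solve(A, b):
    """Solve a small dense system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def laplacian_1d(n):
    """Tridiagonal (-1, 2, -1) matrix for n unknowns (mesh size scaled out)."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


def schur_interface_solve(A, b, interior_blocks, interface):
    """Eliminate each interior block independently, solve for the interface, back-substitute."""
    # Start from A_GG and b_G, then subtract each subdomain's contribution.
    S = [[A[g][h] for h in interface] for g in interface]
    rhs_G = [b[g] for g in interface]
    for I in interior_blocks:
        A_II = [[A[i][j] for j in I] for i in I]
        w = solve(A_II, [b[i] for i in I])            # A_II^{-1} b_I
        for r, g in enumerate(interface):
            rhs_G[r] -= sum(A[g][i] * w[k] for k, i in enumerate(I))
        for c, h in enumerate(interface):
            z = solve(A_II, [A[i][h] for i in I])     # A_II^{-1} times one column of A_IG
            for r, g in enumerate(interface):
                S[r][c] -= sum(A[g][i] * z[k] for k, i in enumerate(I))
    u = [0.0] * len(b)
    for g, val in zip(interface, solve(S, rhs_G)):    # Schur complement system
        u[g] = val
    for I in interior_blocks:                         # independent subdomain solves
        A_II = [[A[i][j] for j in I] for i in I]
        rhs_I = [b[i] - sum(A[i][g] * u[g] for g in interface) for i in I]
        for i, val in zip(I, solve(A_II, rhs_I)):
            u[i] = val
    return u


n = 7
A = laplacian_1d(n)
b = [1.0] * n
# Nodes 0-2 and 4-6 are subdomain interiors; node 3 is the interface.
u_schur = schur_interface_solve(A, b, [[0, 1, 2], [4, 5, 6]], [3])
u_direct = solve(A, b)
```

In this sketch the two interior blocks never couple directly, so their solves could run in parallel; only the small interface system ties them together.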
Among iterative substructuring algorithms, the Neumann-Neumann and finite element tearing and interconnecting (FETI) families are the best known, and they have been tested in many applications.

Recently, a new family of iterative substructuring methods, the balancing domain decomposition by constraints (BDDC) algorithms, was introduced by Clark Dohrmann in [6]. These methods have a Neumann-Neumann flavor. However, their coarse problems are given by sets of constraints enforced on the interface, which are similar to those of the dual-primal FETI (FETI-DP) methods [11]. It has been proved that the preconditioned operators for BDDC and FETI-DP have identical eigenvalues except possibly for 0 and 1; see [21, 18, 3]. The condition number of the preconditioned operators satisfies the bound

    \kappa \le C \left(1 + \log \frac{H}{h}\right)^2,    (1.1)

where H is the diameter and h is the typical mesh size of the subdomains, and C is a constant independent of H and h. Combining this estimate with the convergence analysis of Krylov space methods, we can conclude that the convergence rates of FETI-DP and BDDC are both independent of the number of subdomains and depend only slightly on the problem size of each subdomain. They are efficient and scalable algorithms.

However, a shortcoming of BDDC, FETI-DP, and all other domain decomposition methods is that the coarse problem needs to be assembled and the resulting matrix needs to be factored by a direct solver at the beginning of the computation. Usually the size of the coarse problem is proportional to the number of subdomains. Nowadays some computer systems have more than 100,000 powerful processors, which allow very large and detailed simulations. The coarse component can therefore become a bottleneck if the number of subdomains is very large. Motivated by this fact, I have developed two three-level BDDC algorithms to remove this difficulty. I have also extended the BDDC algorithms to flow in porous media and established the same condition number estimate for the preconditioned BDDC operator as in (1.1).

I also worked with Professor Maksymilian Dryja on using a domain decomposition method directly as a discretization of parabolic problems, and we have obtained what is, to the best of our knowledge, the best error estimate for this type of discretization. In the summer of 2005, I worked as a Givens Associate in the Mathematics and Computer Science Division at Argonne National Laboratory with Dr. Barry Smith on nonlinear multigrid methods for solving nonlinear PDEs. During my master's studies at Worcester Polytechnic Institute, I worked with my advisor Dr. Marcus Sarkis on enhanced singular function mortar finite element methods for nonconvex domains. I also worked with Professor Haiyang Huang on modeling the dynamics of flour beetle populations at Beijing Normal University during my graduate studies there.
In the following sections, I describe these projects in detail, together with related future work.

2. Three-level BDDC.

2.1. Current work: Three-level BDDC for scalar elliptic problems in two and three dimensions. The BDDC algorithms, previously developed for two levels [6, 20, 21], are similar to the balancing Neumann-Neumann algorithms. However, the coarse problem in BDDC is given in terms of a set of primal constraints and is generated and factored by a direct solver at the beginning of the computation. The coarse component of the preconditioner can therefore ultimately become a bottleneck if the number of subdomains is very large. We try to remove this difficulty by using one or several additional levels.

We proceed as follows. We group several subdomains together to form a subregion. We could first reduce the original coarse problem to a subregion interface problem by independently eliminating the subregion interior variables, which are the primal variables on the subdomain interface that are interior to the subregions. In one of the three-level BDDC algorithms, we do not solve the subregion interface problem exactly, but replace it by one iteration of the BDDC preconditioner; Dohrmann has also suggested this approach in [6]. This means that we only need to solve several subregion local problems and one coarse problem on the subregion level in each iteration. We assume that all these problems are small enough to be solved by direct solvers. We have shown that the condition number of the resulting three-level preconditioned BDDC operator is bounded by

    \kappa \le C \left(1 + \log \frac{\hat{H}}{H}\right)^2 \left(1 + \log \frac{H}{h}\right)^2,    (2.1)

where \hat{H}, H, and h are the typical diameters of the subregions, the subdomains, and the mesh elements of the subdomains, respectively. C is a constant independent of \hat{H}, H, h, and the coefficients of the original PDE, provided that the coefficients vary moderately in each subregion.

In order to remove the additional factor (1 + \log(\hat{H}/H))^2 in (2.1), we can use a Chebyshev iteration to accelerate the three-level BDDC algorithms. With this device, the condition number bound is

    \kappa \le C \, C(k) \left(1 + \log \frac{H}{h}\right)^2,    (2.2)

where C(k) depends on the eigenvalues of the preconditioned coarse problem, the two parameters chosen for the Chebyshev iteration, and k, the number of Chebyshev iterations. C(k) goes to 1 as k goes to ∞. H and h are the same as before.

We first obtained these results for two-dimensional problems with vertex constraints; see [29]. We then extended the algorithms to the three-dimensional case; see [33]. In three dimensions, vertex constraints alone are not enough to obtain a good polylogarithmic condition number bound of the form (1.1), because the interpolation estimate is much weaker, and constraints on the averages over edges or faces are needed. The new constraints lead to a considerably more complicated coarse problem and to the need for new technical tools in the analysis. The same condition number bounds, (2.1) and, with Chebyshev acceleration, (2.2), are obtained for the three-dimensional case in [33].

2.2. Future work. Our three-level BDDC algorithms belong to the class of inexact BDDC methods, for which several new results have recently been reported [32, 19, 16, 5]. BDDC has been extended to the incompressible Stokes equations [17], to flow in porous media [30, 31], and to mortar finite element discretizations [15]. I will continue to work on extending our three-level BDDC algorithms to these problems, building on the successful extensions of the two-level BDDC methods.
Among these, a three-level BDDC method with mortar finite element discretization is joint work with Dr. Hyea Hyun Kim.

3. BDDC algorithm for flow in porous media. We have extended the BDDC algorithms to flow in porous media with two kinds of discretizations: mixed and hybrid finite element discretizations.

3.1. A BDDC algorithm for a mixed formulation of flow in porous media. Using a mixed formulation of flow in porous media, we obtain a saddle point problem which is closely related to that arising from the incompressible Stokes equations. In a recent paper [17], BDDC algorithms have been applied to the incompressible Stokes equations. Our situation is different. First of all, our problem is not originally formulated in the benign, divergence-free subspace, and it is therefore reduced to the benign subspace, as in [10, 22, 23, 24], at the beginning of the computation. In addition, only edge/face constraints are needed to force the iterates into the benign subspace and to ensure a good bound for the condition number, since Raviart-Thomas finite elements, see [4, Chapter III], are used. These elements have no degrees of freedom associated with vertices/edges in two/three dimensions. Also, the condition number estimate for the Stokes case can be simplified since the Stokes extension is equivalent to the harmonic extension; see [2]. This is not the case here, and different technical tools are required. Iterative substructuring methods with Raviart-Thomas finite elements for vector field problems were proposed in [35, 27]. We have borrowed some technical tools from these papers and proved the condition number bound (1.1) for this BDDC algorithm; see [30].

3.2. A BDDC algorithm for flow in porous media with a hybrid finite element discretization. The hybrid finite element discretization is equivalent to a nonconforming finite element method.
We reduce the original saddle point problem to a positive definite system for the pressure by introducing Lagrange multipliers on the interface of the subdomains and by eliminating the velocity in each subdomain. Thus, we need not find a velocity that satisfies the divergence constraint at the beginning of the computation and then restrict the iterates to the divergence-free, benign subspace. This approach is similar to the work on the FETI-DP methods described in [28, Chapter 6]. We use the BDDC preconditioner to solve the interface problem for the Lagrange multipliers, which can be interpreted as an approximation to the trace of the pressure. By enforcing a suitable set of constraints, we obtain the same convergence rate as in the conforming finite element case, (1.1); see [31].

3.3. Future work. So far, we have assumed that each subdomain is a general polygon. I am planning to work on scalable algorithms for subdomains of more general shape. I am also planning to extend the three-level BDDC algorithms to flow in porous media.

4. A domain decomposition discretization of parabolic problems.

4.1. Current work. Several strategies can be used to obtain domain decomposition algorithms for parabolic problems. A first approach uses a standard discretization of the parabolic problem (e.g., backward Euler or Crank-Nicolson) and then applies domain decomposition methods to the resulting systems, as iterative methods for elliptic problems (see [26], [28], and the literature therein). In contrast, a second approach is based on a discretization of the parabolic problem which leads to a domain decomposition algorithm as a direct method (see [26], [36], and the literature therein). These strategies have been proven, theoretically and practically, to be very effective for parallel computation.

In this project, a domain decomposition method is introduced for parabolic problems based on the second approach. For a second order parabolic equation in (0, T) × Ω, where Ω is a polygonal region in two-dimensional space, we consider an approximation of an initial-boundary value problem. This problem is discretized directly by a finite difference method with respect to the time variable t and a finite element method with respect to the spatial variables x = (x1, x2), leading to a direct domain decomposition method. Such a special discretization of the original problem results in an algorithm well suited for parallel computation. The algorithm can be viewed as a domain decomposition analog of the well-known ADI methods for finite difference approximations of parabolic problems; see [7].

We have proven that the resulting discrete problem approximates the original problem, and that the algorithm is stable and convergent with an error bound O(τ + h) in an appropriate norm, where τ is the time step and h is the mesh size. This error bound is the same as for the backward Euler scheme. To the best of our knowledge, this is the best error estimate known in the literature for this type of discretization.
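For comparison, the first-order behavior in τ of the backward Euler scheme can be observed on a small model problem. The following is a minimal sketch and is not the domain decomposition discretization of [9]: it assumes the 1D heat equation u_t = u_xx on (0, 1) with homogeneous Dirichlet data, centered finite differences in space rather than finite elements, and the initial value sin(πx), for which the exact solution is known. All function names are illustrative.

```python
import math


def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def backward_euler_error(n_cells, n_steps, t_final):
    """Maximum nodal error at t_final for u_t = u_xx, u(0, x) = sin(pi x)."""
    h = 1.0 / n_cells
    tau = t_final / n_steps
    r = tau / h**2
    m = n_cells - 1                      # number of interior nodes
    x = [(j + 1) * h for j in range(m)]
    u = [math.sin(math.pi * xj) for xj in x]
    lower = [-r] * m
    diag = [1.0 + 2.0 * r] * m
    upper = [-r] * m
    for _ in range(n_steps):
        # Each step solves (I + tau * A_h) u_new = u_old.
        u = thomas(lower, diag, upper, u)
    decay = math.exp(-math.pi**2 * t_final)
    return max(abs(uj - decay * math.sin(math.pi * xj)) for uj, xj in zip(u, x))


# Halving tau on a fixed fine mesh should roughly halve the error.
err_tau = backward_euler_error(64, 10, 0.1)
err_half_tau = backward_euler_error(64, 20, 0.1)
ratio = err_tau / err_half_tau
```

On this mesh the spatial error is small compared with the temporal error, so the ratio of the two errors is expected to be close to 2.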
The domain decomposition discretization discussed above has previously been described briefly in [8]. A theorem formulated there (without proof) gives an error bound O(τ^{1/2} + h) provided that τ is proportional to h. We have improved this estimate by using refined tools. This is joint work with Professor Maksymilian Dryja, and the result is in [9].

4.2. Future work. So far, we have obtained this result for the two-dimensional case. In future research, we will consider extending the algorithm and the result to three dimensions.

5. Nonlinear Multigrid. For nonlinear (elliptic) partial differential equations, a popular method is Newton-Krylov-Schwarz/Multigrid. Newton's method is used to globally linearize the problem, and the linear system is then solved approximately by preconditioned Krylov methods. With this approach, it is very difficult to obtain good floating point performance, since the whole Jacobian matrix is stored and the floating point performance is then limited by memory bandwidth (how quickly one can get the floating point numbers from main memory to the CPU and then move the results back).

Instead of Newton-Multigrid, we can use Multigrid-Newton. We march through the grid solving small nonlinear systems associated with one or a small number of unknowns using Newton's method. Now we do a relatively large number of floating point operations and only need to store vectors; the Jacobian entries that we compute never get stored in main memory. (Note that with multigrid the smoothing takes up almost all of the time, so to get high floating point performance we just need very efficient smoothers; the rest of the code does not matter much.) This is a well-known algorithm called FAS (full approximation scheme), or sometimes just nonlinear multigrid. Properly implemented, it offers the possibility of going from 10 percent of machine floating point peak to 60 percent or more.

We have studied simple and full FAS versions with V or W cycles and tested different smoothers.
We have also studied Krylov space methods to accelerate FAS and the use of FAS as a nonlinear preconditioner for a quasi-Newton method. All these algorithms are implemented in PETSc [1] and give very good results for our test problems. This is joint work with Dr. Barry Smith. A future direction is to apply these algorithms to more complicated and realistic computational fluid dynamics and other application problems.
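The structure of an FAS V-cycle with a pointwise Newton smoother can be sketched on a small model problem. This is an illustration only, not the PETSc implementation: it assumes the 1D problem -u'' + u^3 = f on (0, 1) with zero boundary values, a grid whose number of cells is a power of two, full weighting restriction, and linear interpolation. All function names are hypothetical.

```python
import math


def apply_N(u, h):
    """Nonlinear operator N(u)_i = (2u_i - u_{i-1} - u_{i+1})/h^2 + u_i^3 at interior nodes."""
    n = len(u) - 1
    out = [0.0] * (n + 1)
    for i in range(1, n):
        out[i] = (2 * u[i] - u[i - 1] - u[i + 1]) / h**2 + u[i] ** 3
    return out


def smooth(u, f, h, sweeps):
    """Nonlinear Gauss-Seidel: one scalar Newton step per unknown, no stored Jacobian."""
    n = len(u) - 1
    for _ in range(sweeps):
        for i in range(1, n):
            res = (2 * u[i] - u[i - 1] - u[i + 1]) / h**2 + u[i] ** 3 - f[i]
            u[i] -= res / (2 / h**2 + 3 * u[i] ** 2)


def restrict(v):
    """Full weighting to the coarse grid; boundary entries stay zero."""
    nc = (len(v) - 1) // 2
    out = [0.0] * (nc + 1)
    for i in range(1, nc):
        out[i] = 0.25 * v[2 * i - 1] + 0.5 * v[2 * i] + 0.25 * v[2 * i + 1]
    return out


def prolong(vc):
    """Linear interpolation to the fine grid."""
    nc = len(vc) - 1
    out = [0.0] * (2 * nc + 1)
    for i in range(nc + 1):
        out[2 * i] = vc[i]
    for i in range(nc):
        out[2 * i + 1] = 0.5 * (vc[i] + vc[i + 1])
    return out


def fas_vcycle(u, f, h):
    """One FAS V(2,2)-cycle; u is updated in place and returned."""
    n = len(u) - 1
    if n == 2:
        smooth(u, f, h, 20)   # coarsest grid: one unknown, solved by Newton steps
        return u
    smooth(u, f, h, 2)
    Nu = apply_N(u, h)
    r = [0.0] + [f[i] - Nu[i] for i in range(1, n)] + [0.0]
    uc = restrict(u)
    Nuc = apply_N(uc, 2 * h)
    rc = restrict(r)
    # Coarse right-hand side of the full approximation scheme: N_H(R u) + R r.
    fc = [Nuc[i] + rc[i] for i in range(len(uc))]
    vc = fas_vcycle(uc[:], fc, 2 * h)
    e = prolong([vc[i] - uc[i] for i in range(len(uc))])
    for i in range(1, n):
        u[i] += e[i]
    smooth(u, f, h, 2)
    return u


def residual_norm(u, f, h):
    Nu = apply_N(u, h)
    return max(abs(f[i] - Nu[i]) for i in range(1, len(u) - 1))


n = 64
h = 1.0 / n
xs = [i * h for i in range(n + 1)]
# Right-hand side chosen so that the continuous solution is sin(pi x).
f = [math.pi**2 * math.sin(math.pi * x) + math.sin(math.pi * x) ** 3 for x in xs]
u = [0.0] * (n + 1)
r0 = residual_norm(u, f, h)
for _ in range(10):
    fas_vcycle(u, f, h)
r_final = residual_norm(u, f, h)
```

The smoother only ever touches the vectors u and f; the scalar derivative used in each Newton step is computed and discarded, which is the property the text above relies on for floating point performance.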

6. Enhanced singular function mortar finite element methods for nonconvex domains. This project concerned how to obtain second order discretizations (first order in the energy norm and second order in the L2 norm) for the Poisson problem on nonconvex domains. We also worked on how to obtain second order accurate tensor coefficients corresponding to the nonconvex corner. We have developed three families of finite element methods that combine mortar elements (Bernardi, Maday, and Patera, and Wohlmuth), singular functions, and Quimera algorithms. The results of our numerical experiments show that all three families achieve second order accuracy. We also analyzed the H1 and L2 errors for these methods using functional analysis and basic finite element theory. A complete theory was developed for one of the methods. For the other two, the theory is under development, with very promising preliminary numerical results. See [34, 25]. This is joint work with Dr. Marcus Sarkis.

7. Effect of cannibalism on flour beetle population dynamics. In this project, we constructed two kinds of population models: age-structured and size-structured population models. The asymptotic behavior of the solutions of the population models with cannibalism was studied by analyzing the bifurcation diagrams of their equilibria and by numerical simulation. The effect of cannibalism among individuals on the population dynamics was studied, and some dynamical characteristics of the size-structured population model with cannibalism, due to the nonlinear growth of an individual, were shown. A complete theory was developed for the existence of steady states of the size-structured models (hyperbolic PDEs) and for the stability of the steady states (using bifurcation and semigroup theory). We established a numerical simulation method to obtain the dynamic behavior of these models and to explain the population dynamics from the new viewpoint of "energy". This is joint work with Professor Haiyang Huang at Beijing Normal University; see [14, 13, 12].

8. Other research experience. Programming in FORTRAN and MPI for "Simulation of Unsteady Viscous Flow in Stenotic Collapsible Tubes". In this project, I parallelized a FORTRAN code that was originally written for a serial machine, using GFD (General Finite Difference) for the discretization and the SIMPLE method with direct solvers (LAPACK) for both the pressure and velocity equations. We parallelized the code using MPI on an IBM SP2, with domain decomposition solvers for the linear systems. Because the original solver (Gaussian elimination) requires a huge amount of memory, which is very restrictive for 3D problems, and is not suitable for parallelization, more efficient and simpler solvers were needed. In addition to several simple iterative solvers, I tested GMRES with the additive Schwarz preconditioner and with versions of the restricted Schwarz preconditioner with one or two cells of overlap. GMRES with a right preconditioner was a good solver for this problem, and the running time could be reduced to one eighth of the original time.

REFERENCES

[1] Satish Balay, Kris Buschelman, William D. Gropp, Dinesh Kaushik, Matt Knepley, Lois Curfman McInnes, Barry F. Smith, and Hong Zhang. PETSc home page. http://www.mcs.anl.gov/petsc, 2001.
[2] James H. Bramble and Joseph E. Pasciak. A domain decomposition technique for Stokes problems. Appl. Numer. Math., 6:251–261, 1989/90.
[3] Susanne C. Brenner and Li-yeng Sung. BDDC and FETI-DP without matrices or vectors. Preprint, 2005.
[4] Franco Brezzi and Michel Fortin. Mixed and Hybrid Finite Element Methods, volume 15 of Springer Series in Computational Mathematics. Springer Verlag, Berlin-Heidelberg-New York, 1991.
[5] Clark Dohrmann. An approximate BDDC preconditioner. Technical Report SAND2005-5424, Sandia National Laboratory, 2005.
[6] Clark R. Dohrmann. A preconditioner for substructuring based on constrained energy minimization. SIAM J. Sci. Comput., 25(1):246–258, 2003.
[7] J. Douglas. Alternating direction implicit methods for three variables. Numer. Math., 4:41–63, 1962.
[8] Maksymilian Dryja. Substructuring methods for parabolic problems. Proceedings of Fourth International Symposium on Domain Decomposition Methods for Partial Differential Equations, SIAM, Philadelphia, pages 264–271, 1991.
[9] Maksymilian Dryja and Xuemin Tu. A domain decomposition discretization of parabolic problems. Technical Report TR2005-860, Department of Computer Science, Courant Institute, May 2005.
[10] Richard E. Ewing and Junping Wang. Analysis of the Schwarz algorithm for mixed finite element methods. RAIRO Modél. Math. Anal. Numér., 26(6):739–756, 1992.
[11] Charbel Farhat, Michel Lesoinne, Patrick Le Tallec, Kendall Pierson, and Daniel Rixen. FETI-DP: A dual-primal unified FETI method – part I: A faster alternative to the two-level FETI method. Internat. J. Numer. Methods Engrg., 50:1523–1544, 2001.
[12] Haiyang Huang and Xuemin Tu. The regulation effect of cannibalism in the dynamics of the population model. Journal of Beijing Normal University (Natural Science), 36(6):1–6, 2000.
[13] Haiyang Huang and Xuemin Tu. The effect of cannibalism in the dynamics of the size-structured population model. Journal of Beijing Normal University (Natural Science), 37(5):580–585, 2001.
[14] Haiyang Huang and Xuemin Tu. Existence and stability of multiple equilibria in a size-structured cannibalism population model. Journal of Beijing Normal University (Natural Science), 38(1):1–6, 2002.
[15] Hyeahyun Kim, Maksymilian Dryja, and Olof Widlund. Three-level BDDC in two dimensions. Technical Report TR2005-871, Department of Computer Science, New York University, 2005.
[16] Axel Klawonn and Oliver Rheinbach. Inexact FETI-DP methods. Technical Report SM-E-609, University Duisburg-Essen, 2005.
[17] Jing Li and Olof B. Widlund. BDDC algorithms for incompressible Stokes equations. Technical Report TR-861, Department of Computer Science, New York University, 2005.
[18] Jing Li and Olof B. Widlund. FETI–DP, BDDC, and Block Cholesky Methods. Internat. J. Numer. Methods Engrg., 2005. To appear.
[19] Jing Li and Olof B. Widlund. On the use of inexact subdomain solvers for BDDC algorithms. Technical Report TR-871, Department of Computer Science, New York University, 2005.
[20] Jan Mandel and Clark R. Dohrmann. Convergence of a balancing domain decomposition by constraints and energy minimization. Numer. Linear Algebra Appl., 10(7):639–659, 2003.
[21] Jan Mandel, Clark R. Dohrmann, and Radek Tezaur. An algebraic theory for primal and dual substructuring methods by constraints. Appl. Numer. Math., 54(2):167–193, 2005.
[22] Tarek P. Mathew. Domain Decomposition and Iterative Refinement Methods for Mixed Finite Element Discretizations of Elliptic Problems. PhD thesis, Courant Institute of Mathematical Sciences, September 1989. TR-463, Department of Computer Science, Courant Institute.
[23] Tarek P. Mathew. Schwarz alternating and iterative refinement methods for mixed formulations of elliptic problems, part I: Algorithms and numerical results. Numer. Math., 65(4):445–468, 1993.
[24] Tarek P. Mathew. Schwarz alternating and iterative refinement methods for mixed formulations of elliptic problems, part II: Theory. Numer. Math., 65(4):469–492, 1993.
[25] Marcus Sarkis and Xuemin Tu. Singular function mortar finite element methods. Comput. Methods Appl. Math., 3(1):202–218 (electronic), 2003. Dedicated to Raytcho Lazarov.
[26] Barry F. Smith, Petter E. Bjørstad, and William Gropp. Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations. Cambridge University Press, 1996.
[27] Andrea Toselli. Domain decomposition methods for vector field problems. PhD thesis, Courant Institute of Mathematical Sciences, May 1999. TR-785, Department of Computer Science.
[28] Andrea Toselli and Olof B. Widlund. Domain Decomposition Methods - Algorithms and Theory, volume 34 of Springer Series in Computational Mathematics. Springer Verlag, Berlin-Heidelberg-New York, 2004.
[29] Xuemin Tu. Three-level BDDC in two dimensions. Technical Report TR2004-856, Department of Computer Science, Courant Institute, November 2004.
[30] Xuemin Tu. A BDDC algorithm for a mixed formulation of flows in porous media. Electron. Trans. Numer. Anal., 20:164–179, 2005.
[31] Xuemin Tu. A BDDC algorithm for flow in porous media with a hybrid finite element discretization. Technical Report TR2005-865, Department of Computer Science, Courant Institute, May 2005.
[32] Xuemin Tu. Three-level BDDC. In O. Widlund, editor, Sixteenth International Conference of Domain Decomposition Methods. DDM.org, 2005.
[33] Xuemin Tu. Three-level BDDC in three dimensions. Technical Report TR2005-862, Department of Computer Science, Courant Institute, April 2005.
[34] Xuemin Tu and Marcus Sarkis. Singular function enhanced mortar finite element. In Domain decomposition methods in science and engineering, pages 475–482 (electronic). Natl. Auton. Univ. Mex., México, 2003.
[35] Barbara I. Wohlmuth, Andrea Toselli, and Olof B. Widlund. Iterative substructuring method for Raviart-Thomas vector fields in three dimensions. SIAM J. Numer. Anal., 37(5):1657–1676, 2000.
[36] Yu Zhuang and Xianhe Sun. Stabilized explicit–implicit domain decomposition methods for the numerical solution of parabolic equations. SIAM J. Sci. Comput., 24(1):335–358, 2003.
