Computational Techniques for Solving the Sparse Matrix Eigenvalue Problem for Semiconductor Bandstructure Calculation

Samuel Smith
University of Florida
samuelsmith@ufl.edu

Abstract

The tight binding model used to efficiently model nanoscale systems is computationally bound by the symmetric matrix eigenvalue problem. We present an overview of semiconductor crystallography and computational techniques to efficiently model these crystal structures. Using a generated crystal structure, we calculate the tight binding Hamiltonian matrix that governs the system. We then explore the computational tools for calculating the eigenvalues of this matrix and benchmark several popular packages. Finally, we use these tools to compute the bandgap for quantum dots of various dimensions.

CONTENTS

I Crystal Properties and Generation
  I-A Basis and primitive vectors
  I-B Schrödinger equation and the Bloch theorem
  I-C Zincblende lattice
  I-D Wurtzite lattice
  I-E Adjacency matrix generation
  I-F Asymptotically optimal connectivity mapping algorithm
II Generation of the Tight Binding Hamiltonian
  II-A Same atom terms
  II-B Passivation of surface effects
  II-C Nearest neighbor terms
III Eigenvalue Computation
  III-A Overview of the eigenvalue problem for sparse symmetric matrices
  III-B Possible solutions and selection criteria
  III-C ARPACK
  III-D Trilinos and Anasazi
  III-E Other Python features for future work
IV System and benchmarks
  IV-A Overview of test system
  IV-B High Performance LINPACK Benchmark
  IV-C Comparison of eigenvalue solvers
V Bandgap Calculation for Quantum Dots
VI Conclusions and Future Work
References
Appendix

I. CRYSTAL PROPERTIES AND GENERATION

A. Basis and primitive vectors

All major semiconductors used in industry today are crystalline materials [1]. The key feature of crystals that differentiates them from other solid matter is that they are spatially periodic.
This allows the structure of the crystal to be completely described by a single unit cell. A crystal can be mathematically described using primitive vectors $\mathbf{a}_1$, $\mathbf{a}_2$, and $\mathbf{a}_3$. The full lattice can be determined by integral combinations of these primitive vectors, such that:

$$\mathbf{R}' = \mathbf{R} + m_1\mathbf{a}_1 + m_2\mathbf{a}_2 + m_3\mathbf{a}_3,$$

where $\mathbf{R}$ is any known lattice point and $m_1, m_2, m_3 \in \mathbb{Z}$. For more complicated crystals like the zincblende structure, described later, two simple lattices are in superposition. For structures like these, we can define basis vectors $\mathbf{b}_1$ and $\mathbf{b}_2$, which describe the relative offsets of the two lattices.

B. Schrödinger equation and the Bloch theorem

In quantum mechanics, the famous Schrödinger wave equation is used to describe the behavior of systems. For the time-independent case, we write this equation as:

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi = E\psi,$$

where $\hbar$ is the reduced Planck constant, $m$ is the mass of the particle, $V(\mathbf{r})$ is the spatially dependent potential energy, $\psi$ is the wavefunction, and $E$ is the energy eigenvalue. We define the operator acting on the wavefunction on the left-hand side as the Hamiltonian, $\hat{H}$. Using this definition, we can rewrite the Schrödinger equation as:

$$\hat{H}\psi = E\psi.$$

A useful result for the Schrödinger equation for particles in a periodic potential structure like a crystal is the Bloch theorem [1]. The Bloch theorem states that for particles in a periodic potential, the eigenfunctions of the Hamiltonian are the product of a plane wave $e^{i\mathbf{k}\cdot\mathbf{r}}$ and some function $u_{\mathbf{k}}(\mathbf{r})$ with the same periodicity as the lattice. We can write this as:

$$\psi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\,u_{\mathbf{k}}(\mathbf{r}).$$

We note that:

$$u_{\mathbf{k}}(\mathbf{r}) = u_{\mathbf{k}}(\mathbf{r} + \mathbf{R}),$$

where $\mathbf{R}$ is a lattice translation vector, that is, the periodicity of the lattice.

C. Zincblende lattice

The zincblende structure and the closely related diamond structure are perhaps the most important crystal structures in the semiconductor industry. Silicon and germanium (group IV semiconductors) have a diamond structure, and gallium arsenide (a III-V semiconductor) has a zincblende structure [2].
The only major difference between these structures is that the anion and cation species are the same for the diamond structure and different for the zincblende structure. The primitive vectors for the zincblende structure with lattice constant $a$ and orthogonal basis $[\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}]$ are:

$$\mathbf{a}_1 = \tfrac{1}{2}a\hat{\mathbf{y}} + \tfrac{1}{2}a\hat{\mathbf{z}}$$
$$\mathbf{a}_2 = \tfrac{1}{2}a\hat{\mathbf{x}} + \tfrac{1}{2}a\hat{\mathbf{z}}$$
$$\mathbf{a}_3 = \tfrac{1}{2}a\hat{\mathbf{x}} + \tfrac{1}{2}a\hat{\mathbf{y}}.$$

The basis vectors are:

$$\mathbf{b}_1 = \mathbf{0}$$
$$\mathbf{b}_2 = \tfrac{1}{4}\mathbf{a}_1 + \tfrac{1}{4}\mathbf{a}_2 + \tfrac{1}{4}\mathbf{a}_3,$$

where $\mathbf{b}_1$ is the basis for the cation sites and $\mathbf{b}_2$ is the basis for the anion sites.

A silicon quantum dot is shown in figure 1. A quantum dot is a structure that is fully confined in all three dimensions. While it is locally periodic, it has well defined boundary conditions. As a result, it displays dramatically different properties from the bulk material. The dimension shown in the figure caption refers to the number of iterations for each sublattice (anionic and cationic). A 3 × 3 × 3 crystal has 54 atoms: 27 anion sites and 27 cation sites.

Fig. 1. Silicon quantum dot (3 × 3 × 3)

D. Wurtzite lattice

In the early stages of the project, the wurtzite structure was also considered, along with the similar hexagonal diamond structure. Gallium nitride, a common wide bandgap semiconductor, has this structure. It is actually possible [3] to make silicon into this structure, but it is not commonly done. The wurtzite lattice has more complicated primitive vectors than the zincblende lattice:

$$\mathbf{a}_1 = \tfrac{1}{2}a\hat{\mathbf{x}} - \tfrac{1}{2}3^{1/2}a\hat{\mathbf{y}}$$
$$\mathbf{a}_2 = \tfrac{1}{2}a\hat{\mathbf{x}} + \tfrac{1}{2}3^{1/2}a\hat{\mathbf{y}}$$
$$\mathbf{a}_3 = c\hat{\mathbf{z}},$$

where $c/a = (8/3)^{1/2}$. The basis vectors are:

$$\mathbf{b}_1 = \tfrac{1}{3}\mathbf{a}_1 + \tfrac{2}{3}\mathbf{a}_2$$
$$\mathbf{b}_2 = \tfrac{2}{3}\mathbf{a}_1 + \tfrac{1}{3}\mathbf{a}_2 + \tfrac{1}{2}\mathbf{a}_3$$
$$\mathbf{b}_3 = \tfrac{1}{3}\mathbf{a}_1 + \tfrac{2}{3}\mathbf{a}_2 + u\,\mathbf{a}_3$$
$$\mathbf{b}_4 = \tfrac{2}{3}\mathbf{a}_1 + \tfrac{1}{3}\mathbf{a}_2 + \left(\tfrac{1}{2} + u\right)\mathbf{a}_3,$$

where $u = 3/8$. For the wurtzite lattice, $\mathbf{b}_1$ and $\mathbf{b}_2$ are basis vectors for the cation sites, and $\mathbf{b}_3$ and $\mathbf{b}_4$ are basis vectors for the anion sites. A wurtzite structure generated by the old MATLAB code used at the start of the project is shown in figure 2.

Fig. 2. Wurtzite quantum dot

E. Adjacency matrix generation

For computational modeling of a crystal system [4], we begin by iterating through integral combinations of the primitive vectors, added to the appropriate basis vectors where applicable. The number of iterations is bounded by the variables xcells, ycells, and zcells, which specify the number of iterations (the largest multiple) allowed for each primitive vector. The sites for the anions and cations are stored in an $n \times 3$ array for fast access. A matrix $A$ over GF(2) of dimension $n \times m$ is created to store the connections between the $n$ anions and the $m$ cations (usually, $m = n$). Each element $A_{ij}$ of the matrix is defined as 1 if anion $i$ is a nearest neighbor of cation $j$ and 0 otherwise. This matrix is generated by iterating over every cation site for every anion site.

F. Asymptotically optimal connectivity mapping algorithm

Generating or iterating over a connectivity matrix is an inherently inefficient operation, as we must perform an operation for every cation for every anion. This has $O(n^2)$ asymptotic complexity, where $n$ is the number of atoms in the system. As $n$ grows large, the calculations quickly become intractable. This looping structure performs especially poorly in interpreted programming languages like MATLAB and Python, where explicit loops are slow. We present a new crystal generation algorithm that avoids these problems.

We begin by finding the coordinates of all the atoms in the usual manner: we form linear combinations of the primitive vectors and apply an affine transformation to offset either the anionic or the cationic sites (this analysis was performed with a simple zincblende structure, but could easily be extended to other crystal types of arbitrary shape). We improved this part slightly by multiplying all the vectors by constants chosen to make all lattice points integers, which allows fast comparisons and eliminates round-off difficulties.
There is no harm in doing so, because any time a real distance is needed a proportionality factor can be applied to recover it. After generating a list of all the sites, we sort the list of cation sites using Timsort, an $O(n \log n)$ sort that can take advantage of any ordering already present in the list. Once we have a sorted list of cation sites, we iterate over all the anion sites. For each anion, we apply the translation for each of the four possible crystal directions and perform a very efficient $O(\log n)$ binary search on the sorted list of cations to determine whether that cation is actually present in the system. If the atom is found, it is recorded in an adjacency list.

Adjacency lists are used because low degree graphs (crystals with this sort of connectivity are essentially isomorphic to low degree non-planar graphs) are more efficiently represented as lists than as matrices. This holds even for sparse matrices, since in most programming languages it is easier to create an iterator for a list than for a sparse matrix. The list stores only connected sites and can thus be iterated over in linear time. The overall asymptotic time complexity for the adjacency list generation is $O(n \log n)$. Using this data structure, it will be possible to generate the sparse Hamiltonian (described later) much faster. Using a single processor, connectivity lists (including bonding directions for each site) for a 101306-atom system were generated in just under 40 seconds.
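The sorted-search procedure described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the timing reported above. The function names `generate_sites` and `build_adjacency`, the scaling factor of 4/a (which turns the zincblende primitive vectors into (0,2,2), (2,0,2), (2,2,0) and the anion offset into (1,1,1)), the iteration of each multiple from 0 up to `xcells`, `ycells`, `zcells`, and the numbering of the four bond directions are all choices made for this example.

```python
from bisect import bisect_left
from itertools import product

# Zincblende primitive vectors scaled by 4/a so all coordinates are integers
# (assumed scaling; real distances are recovered by multiplying by a/4).
A1, A2, A3 = (0, 2, 2), (2, 0, 2), (2, 2, 0)
ANION_OFFSET = (1, 1, 1)  # b2 = (a1 + a2 + a3)/4 in scaled units
# Displacements from an anion to its four possible nearest-neighbor cations.
BOND_DIRECTIONS = ((-1, -1, -1), (-1, 1, 1), (1, -1, 1), (1, 1, -1))


def generate_sites(xcells, ycells, zcells):
    """Return (cations, anions) as lists of integer coordinate tuples."""
    cations, anions = [], []
    for m1, m2, m3 in product(range(xcells), range(ycells), range(zcells)):
        # Integral combination m1*a1 + m2*a2 + m3*a3, component by component.
        cation = tuple(m1 * a + m2 * b + m3 * c for a, b, c in zip(A1, A2, A3))
        cations.append(cation)
        anions.append(tuple(p + o for p, o in zip(cation, ANION_OFFSET)))
    return cations, anions


def build_adjacency(cations, anions):
    """Return (sorted_cations, adjacency).

    adjacency[i] lists (cation_index, direction_index) pairs for anion i,
    where cation_index refers to sorted_cations.
    """
    # O(n log n) sort; CPython's built-in sort is Timsort-derived.
    sorted_cations = sorted(cations)
    adjacency = []
    for anion in anions:
        neighbors = []
        for direction, step in enumerate(BOND_DIRECTIONS):
            target = tuple(p + d for p, d in zip(anion, step))
            # O(log n) binary search for the candidate cation site.
            idx = bisect_left(sorted_cations, target)
            if idx < len(sorted_cations) and sorted_cations[idx] == target:
                neighbors.append((idx, direction))
        adjacency.append(neighbors)
    return sorted_cations, adjacency


cations, anions = generate_sites(3, 3, 3)
sorted_cations, adjacency = build_adjacency(cations, anions)
print(len(cations) + len(anions))  # 54 atoms for a 3 x 3 x 3 dot
```

Because the coordinates are integer tuples, the equality test after the binary search is exact, which is the round-off benefit noted above. Surface anions simply end up with fewer than four entries in their neighbor lists.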
