THE JOURNAL OF CHEMICAL PHYSICS 122, 084119 (2005)

Introducing ONETEP: Linear-scaling density functional simulations on parallel computers

Chris-Kriton Skylaris,a) Peter D. Haynes, Arash A. Mostofi, and Mike C. Payne
Theory of Condensed Matter, Cavendish Laboratory, Madingley Road, Cambridge CB3 0HE, United Kingdom

(Received 29 September 2004; accepted 4 November 2004; published online 23 February 2005)

We present ONETEP (order-N electronic total energy package), a density functional program for parallel computers whose computational cost scales linearly with the number of atoms and the number of processors. ONETEP is based on our reformulation of the plane wave pseudopotential method which exploits the electronic localization that is inherent in systems with a nonvanishing band gap. We summarize the theoretical developments that enable the direct optimization of strictly localized quantities expressed in terms of a delocalized plane wave basis. These same localized quantities lead us to a physical way of dividing the computational effort among many processors to allow calculations to be performed efficiently on parallel supercomputers. We show with examples that ONETEP achieves excellent speedups with increasing numbers of processors and confirm that the time taken by ONETEP as a function of increasing number of atoms for a given number of processors is indeed linear. What distinguishes our approach is that the localization is achieved in a controlled and mathematically consistent manner so that ONETEP obtains the same accuracy as conventional cubic-scaling plane wave approaches and offers fast and stable convergence. We expect that calculations with ONETEP have the potential to provide quantitative theoretical predictions for problems involving thousands of atoms such as those often encountered in nanoscience and biophysics. © 2005 American Institute of Physics. [DOI: 10.1063/1.1839852]

a) Author to whom correspondence should be addressed. Present address: Department of Physical and Theoretical Chemistry, South Parks Road, Oxford OX1 3QZ, UK. Electronic mail: [email protected] URL: http://www.chem.ox.ac.uk/researchguide/ckskylaris.html

I. INTRODUCTION

The equations of quantum mechanics govern the correlated motions of electrons and nuclei and are thus essential in any theoretical description of the chemical or physical properties of matter. Apart from trivial cases, these equations are impossible to solve with pen and paper and highly sophisticated computational methods for their solution have been devised.1,2 Amongst them the Kohn–Sham density functional theory (DFT) formalism3,4 for electronic structure calculations has become established as an approach that provides a very good description of electron correlation effects while keeping the size of calculations tractable.

DFT calculations have become an indispensable tool for the study of matter with myriads of applications in areas such as chemistry,5 biochemistry,6 polymers,7 and materials8,9 to name a few. However even DFT calculations suffer from an unfavorable scaling: the time taken to perform such a calculation on a computer increases asymptotically with the cube of the number of atoms. This cubic scaling is a consequence of the delocalized nature of the wave functions which are the eigensolutions of the Kohn–Sham single particle Hamiltonian,4,10 and limits the number of atoms we can treat to a few hundred at most. There are many exciting problems at the interface between the microscopic and mesoscopic worlds, particularly in the emerging fields of biophysics and nanoscience, whose theoretical investigation would be possible only with an accurate quantum mechanical description of the interactions between thousands of atoms.

In an attempt to extend the application of DFT to such problems, researchers in recent years have put substantial effort into the construction of DFT methods which are linear-scaling,11,12 i.e., with a cost which increases asymptotically only linearly with the number of atoms. These methods13,14 exploit the electronic localization that is inherent in systems with a band gap and seek to optimize quantities that (in principle) are infinite in extent, but decay exponentially, such as the single-particle density matrix10 or Wannier functions.15,16 A common point between these methods is that the onset of linear scaling occurs only after the number of atoms exceeds a critical value. An important performance characteristic then is the crossover point, the number of atoms at which a linear-scaling approach becomes faster than a conventional cubic-scaling approach. This crossover point is system dependent but often lies in the order of hundreds of atoms. As single processor workstations are capable of calculations with roughly no more than 100 atoms, it is important to use multiprocessor (parallel) computers if we are to reap any benefits from linear-scaling DFT. Conversely, we could argue that only linear-scaling methods are suited to take best advantage of parallel computers since, only with them does an eightfold increase in computational power allow calculations for eight times as many atoms instead of only twice as many atoms as in conventional approaches. It is not surprising therefore that the development of linear-scaling methods has often advanced hand in hand with the development of suitable algorithms for calculations on parallel computers.17–20
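To make the last two performance arguments concrete (the crossover point and the eightfold-power comparison), take schematic cost models t_lin = c1 N and t_cub = c3 N^3 for the linear- and cubic-scaling methods, where c1 and c3 are method- and system-dependent prefactors introduced here purely for illustration. Equating the two costs gives the crossover system size N*, and increasing the available computational work eightfold enlarges the accessible number of atoms N very differently in the two cases:

\[
  c_{1} N^{*} = c_{3} (N^{*})^{3} \;\Rightarrow\; N^{*} = \sqrt{c_{1}/c_{3}},
  \qquad
  t \to 8t \;\Rightarrow\;
  \begin{cases}
    N \to 8N & \text{(linear scaling)},\\
    N \to 8^{1/3} N = 2N & \text{(cubic scaling)}.
  \end{cases}
\]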
To be useful, a linear-scaling method should have a systematic way to reduce error to any value desired, in the same way as conventional methods. However progress towards this goal has been slow as it has been difficult to devise generally applicable schemes to truncate the exponentially decreasing "tails" of the density matrix or Wannier functions while maintaining control over the accuracy or the stability of the iterative minimization procedure.

Most linear-scaling approaches use nonorthogonal localized basis sets to express their (also localized) functions. These approaches can be classified into methods which use atomic-like basis sets such as Gaussian functions,21 Slater functions,22 spherical Bessel functions23 or numerical atomic orbitals,24 and methods which use simpler localized basis sets such as polynomials25 or real-space grids.19,26

Our linear-scaling method27 is different from all other approaches as it uses a basis set of highly localized functions which are orthogonal. This approach allows for systematic control of truncation errors and is compatible with an accurate representation of the kinetic energy operator,28 which ensures variational behavior with respect to the basis set.29 Our linear-scaling method is implemented in ONETEP (order-N electronic total energy package) which has been developed with algorithms intended for calculations on parallel supercomputers and is the subject of this paper.

We give a brief presentation of the formalism of linear-scaling methods in Sec. II. In Sec. III we focus on ONETEP, and explain its capabilities with theoretical arguments and example calculations. In Sec. IV we give an overview of the principles behind the parallel implementation of ONETEP which is based again on the real space localization. Finally in Sec. V we demonstrate how ONETEP takes advantage of parallel computers in order to perform calculations with thousands of atoms.
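The remainder of this excerpt comes from the paper's Sec. II, the brief presentation of the linear-scaling formalism promised above, and it opens by referring back to the single-particle density matrix, the paper's Eq. (2), which is not reproduced here. In the conventional closed-shell notation (Kohn–Sham orbitals ψ_n with occupation numbers f_n; this notation is assumed for the sketch below rather than quoted from the paper), that density matrix has the standard form

\[
  \rho(\mathbf{r},\mathbf{r}') = 2 \sum_{n} f_{n}\, \psi_{n}(\mathbf{r})\, \psi_{n}^{*}(\mathbf{r}') , \qquad (2)
\]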
where the factor of 2 above is included to account for electronic spin as we assume here a closed shell description. Provided there is a band gap in the system, the density matrix (2) decays exponentially30–32 as a function of the distance between r′ and r. This property can be exploited to truncate the density matrix to a sparse band-diagonal form such that the amount of information it contains increases linearly with its size. To achieve this in practice, the density matrix is expressed in the equivalent form (throughout this paper a summation will be implied over repeated Greek indices)

\[
  \rho(\mathbf{r},\mathbf{r}') = \phi_{\alpha}(\mathbf{r})\, K^{\alpha\beta}\, \phi_{\beta}^{*}(\mathbf{r}') , \qquad (4)
\]

where the {φ_α} are a set of spatially localized, nonorthogonal basis functions and the matrix K, as defined by the above equation, is called the density kernel.33 This form allows for a practical, "coarse-grained" truncation of the density matrix through truncation of the density kernel. Thus we ensure the density kernel is sparse by enforcing the condition

\[
  K^{\alpha\beta} = 0 \quad \text{when} \quad \left| \mathbf{R}_{\alpha} - \mathbf{R}_{\beta} \right| > r_{\mathrm{cut}} , \qquad (5)
\]

where R_α and R_β are the "centers" of the localization regions of the functions φ_α(r) and φ_β(r).

Often a linear combination of atomic orbitals (LCAO) approach is followed where the basis {φ_α} consists of atomic orbitals. Their radial shapes can be expanded in spherical Bessel functions,23 in Gaussians34,35 (where sparsity is commonly imposed via "thresholding"36 rather than by Eq. (5)), or in numerical atomic orbitals as in the SIESTA program37 (where, instead of the density kernel, orthogonal Wannier-like functions are truncated). All these sets of functions are taken preoptimized and remain fixed during the calculation. Based only on operations with a linear cost, such as the construction of the Hamiltonian matrix in the LCAO basis and sparse matrix algebra, a number of efficient techniques38–41 have been developed that minimize the energy while
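The geometric origin of the linear scaling described in this section, namely the kernel truncation of Eq. (5), can be illustrated with a short sketch. The code below is an illustration only (plain Python/NumPy with invented localization centres, an arbitrary r_cut and function names of our own choosing; it is not code from ONETEP or from any of the packages cited above): it builds the distance-based sparsity pattern of the density kernel and shows that, at fixed cutoff, the number of retained elements per localized function saturates, so total storage and the cost of sparse matrix algebra grow only linearly with system size.

```python
# Illustrative sketch (not ONETEP code): the density-kernel truncation of Eq. (5).
# Each localized function phi_alpha has a localization-region centre R_alpha, and a
# kernel element K^{alpha,beta} is retained only when |R_alpha - R_beta| <= r_cut.
import numpy as np


def kernel_sparsity_pattern(centres: np.ndarray, r_cut: float) -> np.ndarray:
    """Boolean mask of retained kernel elements: True where |R_a - R_b| <= r_cut."""
    diff = centres[:, None, :] - centres[None, :, :]  # pairwise displacement vectors
    return np.linalg.norm(diff, axis=-1) <= r_cut


def truncate_kernel(K: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Apply Eq. (5): zero every kernel element outside its localization cutoff."""
    return np.where(mask, K, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r_cut = 8.0  # arbitrary cutoff, in the same length units as the centres
    # Hypothetical systems of increasing size at a roughly constant density of functions.
    for n_funcs, box in [(200, 30.0), (400, 37.8), (800, 47.6)]:
        centres = rng.uniform(0.0, box, size=(n_funcs, 3))
        K = rng.standard_normal((n_funcs, n_funcs))   # stand-in for a density kernel
        mask = kernel_sparsity_pattern(centres, r_cut)
        K_trunc = truncate_kernel(K, mask)            # enforce Eq. (5)
        # At fixed r_cut the retained elements per function saturate, so the total
        # number of non-zeros (storage and sparse-algebra cost) grows ~ linearly with N.
        per_func = np.count_nonzero(K_trunc) / n_funcs
        print(f"N = {n_funcs:4d}: retained kernel elements per function = {per_func:.1f}")
```

A production code would of course hold such a kernel in genuinely sparse data structures and distribute it across processors; the dense mask here only makes the counting argument explicit.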