<<

“Tight Binding” Method: Linear Combination of Atomic Orbitals (LCAO)

W. E. Pickett (Dated: April 14, 2014) This write-up is a homemade introduction to the tight binding representation of the of crystalline solids. This information is important in the parametrization of the band structures of real solids and for the underlying character of model Hamiltonians for correlated studies, as well as for other uses. Those interested in the parametrization of band structures of real materials should consult the book by

D. A. Papaconstantopoulos, Handbook of the Band Structure of Elemental Solids, (Plenum, New York,1986). For the general TB method, one should consult the original classic work by

J. C. Slater and G. F. Koster, Phys. Rev. 94, 1498 (1954).

I. INTRODUCTION TO TIGHT BINDING non-uniqueness is irrelevant (only the form of the THEORY resulting parameters is relevant), while if the method is intended for real electronic structure calculations, Since a crystal is made up of a periodic array the non-uniqueness often is used to make numerical of , it may seem peculiar that when we think procedures as convenient as possible. of Bloch electron wavefunctions in solids it is often This potential is periodic by construction: in terms of wavy modulations that don’t pay much ~ ~ ~ attention to just where the atoms sit. Indeed, in V (~r + Ro) = Vat(~r + Ro R) − simple metals and covalent semiconductors that is a XR~ good picture: the crystal potential that enters into = Vat(~r (R~ R~ o)) the Hamiltonian is a smooth function and atomic X − − sites per se are not critical in the understanding (al- R~ though they are in the underlying description of co- = V (~r R~ ′) at − valent semiconductors). XR~ ′ There is in fact a common picture – the tight V (~r), (1.2) binding model – that is based on the “collection of ≡ atoms” viewpoint. It is most appropriate when elec- where the change of summation index R~ R~ ′ = trons move through the crystal slowly (or not at all, → R~ R~ o was made. Here R~ o is any lattice vector. as in insulators) and therefore ‘belong’ to an −The crystal Hamiltonian is, using units where for an appreciable time before they move on. The ¯h2/2m 1, are in some sense tightly bound to the atom ≡ and only hop because staying put on a simple atom H = 2 + V (~r). (1.3) costs a bit too much energy. The TB model is not −∇ readily applicable to simple (free or nearly free elec- The potential V (~r) is periodic so H is periodic. tron) metals, but it is quite good for a wide variety of Later we will generalize the situation to cover several other solids. It is interesting that covalent semicon- atoms in the unit cell. ductors can be described well from either viewpoint.

B. Periodic array of atomic orbitals A. Crystal as a collection of atoms Shouldn’t the electron wavefunction in the crys- A good approximation for the electron’s poten- tal be related to the atomic orbitals, which satisfy tial V (~r) in a crystal is the sum of atomic potentials: 2 Hatφn ( + Vat)φn = εnφn. (1.4) V (~r) = V (~r R~), (1.1) ≡ −∇ at − XR~ We might try the most simple linear combination of atomic orbitals that is periodic where the sum runs over lattice vectors. We will not worry about the considerable non-uniqueness of this Φ (~r) = φ (~r R~); (1.5) n n − decomposition. For parametrization purposes, this XR~ is this a Block function that can be put into the form D. Proceeding toward the 2

~ Φ(~r) = eik·~ru (~r)? (1.6) ~k For many solids there will be several types (s,p Indeed it is, but only for ~k = 0. We want, and need, or d) of atomic states in the valence region, and that is why we have kept the index n. In the solid these Bloch-like functions for arbitrary ~k within the first atomic states will mix with each other due to the Brillouin zone (BZ). overlap of atomic orbitals on neighboring atoms (as Problem 1: prove this statement, that this we will see). A Bloch sum of atomic orbitals itself is proposed function has Block form for, and ~ not an for the crystal. It is important only for, k=0. to allow the valence wavefunction in the solid to be some of each of the atomic functions, with the actual amounts to be determined by solving Schr¨odinger’s C. Bloch Sums equation. Thus we try expressing the electron wave-

function in the crystal as a bit bn,~k of each of the A much better choice of candidate for a crystal Bloch sums, wavefunction is to form the “Bloch sums” of atomic orbitals, given by ~ ψ~k(~r) = bn(k)Bn,~k(~r), (1.11) 1 ~ ~ Xn B (~r) = N − 2 eik·Rφ (~r R~). (1.7) n,~k n − XR~ where the coefficients b gives the amount of Bloch sum Bn in the crystal wavefunction. The Bloch where R~ suns over the N lattice cites in our normal- sums become the basis functions that we express ization volume. This type of linear combination is the wavefunction in terms of. Since the Bloch sums called a Bloch sum because it produces a function themselves are normalized to unity, we will want that satisfies the Bloch condition for wavevector ~k: b (~k) 2 = 1 for each ~k. n | n | 1 ~ ~ P Now we want to learn how to find the wavefunc- N 2 B (~r + R~ ) = eik·Rφ (~r + R~ R~) n,~k o n o tions. The condition is that they be solutions of the X − R~ Schr¨odinger equation i~k·R~ = e φn(~r (R~ R~ o)) − − Hψ~k = ε~kψ~k. (1.12) XR~ ~ ~ ′ ~ ik·(R +Ro) ′ The eigenvalues ε~ will be the energy bands of the = e φn(~r R~ ) k − crystal. But how do we solve for ε~ and the coeffi- X~ ′ k R ~ ~ ~ ~ ~ ′ cients bn(k)? = eik·Ro eik·R φ (~r R~ ′), n − XR~ ′ 1 ~ 2 ik·R~ o E. The matrix equation = N e Bn,~k(~r). (1.8) so In generally, a good try is to ~ ~ take matrix elements (integrals between basis func- B (~r + R~ ) = eik·Ro B (~r). (1.9) n,~k o n,~k tions) and reduce the problem to a matrix equa- This result then is in Bloch form. All such Bloch tion. In this case, the thing to do is to multiply the functions are periodic in ~k from BZ to BZ. Schr¨odinger equation on the left by another Bloch sum B∗ and integrate over the crystal. Note that Problem 2: prove this statement by manip- m,~k ulating it into the Block form, and discover we have not chosen a Bloch function corresponding ′ what periodic part u~ (~r) is. to another wavevector ~k = ~k. k 6 It suffices to confine ~k in Eq. 1.7 to the 1st BZ. Problem 3: calculate to see why ′ | If the ~k point were of the form ~kr + K~ , where ~kr we simplify to k = k. See below for notation. (the reduced wavevector) is in the 1st BZ and K~ is a The result is vector, note that ~ ~ ~ ~ Hm,n(k)bn(k) = ε~k Sm,n(k)bn(k), (1.13) ~ ~ ~ ~ X X eik·R = eikr ·R (1.10) n n where for any and all lattice vectors R~. (Be sure you un- derstand why.) Thus these Bloch sums, and any ~ ∗ Hm,n(k) B ~ (~r)HB ~ (~r) functions constructed from them, are periodic from ≡ Z m,k n,k BZ to BZ. = (1.14) mk| | nk and G. The Real Space TB Matrix Elements 3

S (~k) B∗ (~r)B (~r) The expression for the real space integral is m,n ≡ Z m,~k n,~k = . ∗ m.k n,k n,~k H (R~) = φ (~r)Hφ (~r R~), (1.20) | | m,n Z m n − These matrices are called the Hamiltonian matrix i.e. it indicates the amount by which the Hamil- and the overlap matrix, respectively, where m and n tonian H couples φ on the site at are the matrix indices. Writing the matrices implic- m the origin to the atomic orbital φ that is located at itly (without displaying the indices), the equation n site R~. Physically, H (R~) is the amplitude that an becomes m,n electron in orbital φn at site R~ will hop to the orbital φm at the origin under the action of the Hamiltonian. H(~k)b(~k) = ε~ S(~k)b(~k) (1.16) k One limit is easy to see: if R~ is so large that either | | or one or the other of the orbitals is vanishingly small where the other in nonzero, then the integral is neg- ligible. Thus we can confine ourselves to values of H(~k) ε~ S(~k) b(~k) = 0. (1.17) { − k } R~ that connect an atom to only a few near neigh- This is a linear algebra problem (generalized bors. Often the “nearest neighbor” approximation eigenvalue problem), but we have to know what the is excellent. matrices H and S are. All of this discussion applies as well to S, by just removing the Hamiltonian H from inside the inte- gral. Sm,n(R~) in fact is called the overlap of φm(~r) ~ F. The H and S matrices and φn(~r R). Note that, if the orbitals are normal- ized (as is− always possible, and is always assumed), ~ Well, we never said that this wouldn’t get a little the Sm,m(0)=1 for all m, i.e. the diagonal elements messy. However, the messiness does clear up soon. of S are unity. ∗ Substituting in the Bloch sum forms for Bm and Bn within the integral, we have the expression H. Several atoms in the unit cell

1 ~ − 2 2 ik·(R~ 2−R~ 1) Hm,n(~k) = (N ) e X So far, the notation has been limited to a ele- R~ 1,R~ 2 mental crystal. More generally, one encounters com- ∗ ~ ~ pounds where there are various atoms in the cell, φm(~r R1)Hφn(~r R2) × Z − − which can be labeled by their position ~τi with re- 1 spect to the origin of the cell (R~). Then the basis i~k·(R~ 2−R~ 1) ~ ~ e Hm,n(R2 R1). orbitals are ≡ N X − R~ 1,R~ 2 φ φ (~r R~ ~τ ). (1.21) (1.18) m,i,R ≡ m,i − − i Then, generalizing from the Bloch sum defined in Because the Hamiltonian is cell periodic, the matrix Eq. 1.7, one has the basis Bloch sums element on the right hand side depends only on the ~ ~ 1 ~ ~ difference between R2 and R1. We can then change B (~r) = N − 2 eik·(R+~τi)φ (~r R~ ~τ ). (1.22) ~ ~ ~ ~ m,i,~k n i the summation index R2 to R2 R1 = R, and the X − − − R~ index R~ 1 no longer appears on the right hand side. Then the sum over R~ 1 just gives the factor N, the Then, instead of Eq. 1.18 we obtain number of unit cells in our “crystal.” Then we ob- ~ tain Hm,i;n,j(k) 1 ~ ~ ~ = (N − 2 )2 eik·(R2+~τj −R1−~τi) i~k·R~ Hm,n(~k) = e Hm,n(R~). (1.19) X R~ 1,R~ 2 XR~ φ∗ (~r R~ ~τ )Hφ (~r R~ ~τ ) × Z m − 1 − i n − 2 − j There should be no confusion between Hm,n(~k) and 1 i~k·(~τj −~τi) i~k·(R~ 2−R~ 1) Hm,n(R~) in practice; in fact, they are lattice Fourier = e e N X transforms of each other. R~ 1,R~ 2 H (R~ + ~τ R~ ~τ ) given by an analogous expression: 4 × m,i;n,j 2 j − 1 − i ~ ~ ~ ~ −ik·~τi ~ ik·R ik·~τj = e Hm,i;n,j(R)e e ~ −1 i~k·(R~ 2−R~ 1) ~ ~ X Sm,n(k) = N e Sm,n(R2 R1) R~  X − R~ 1,R~ 2 ~ ~ −ik·~τi o ~ ik·~τj = e Hm,i;n,j(k)e . (1.23) i~k·R~ = e Sm,n(R~). (2.27) X Here the notation in the next-to-last line has been R~ shortened by leaving the internal coordinates im- plicit, A. Single Site Terms

Hm,i;n,j(R~) Hm,i;n,j(R~ + ~τj ~τi). (1.24) ≡ − First we will look at the R~ = ~0 terms, where The overlap matrix s behaves in a formally analo- both orbitals are on the same site (at our origin). gous way. Now it is helpful to write the crystal Hamiltonian as Eq. 1.23 can be viewed as the matrix Hˆ 0(~k) the atomic Hamiltonian for the atom at the origin, transformed by the unitary transformation plus the potential from all of the other atoms:

~ 2 ~ ~ ik·~τj H = + Vat(~r) + Vat(~r R) Um,i;n,j(k) = e δn,mδi,j, (1.25) −∇ − RX~=06 which can easily be shown to obey the unitarity con- 2 sph non = + V ( ~r ) + V (~r) + Vat(~r R~) ditions −∇ at | | at − RX~=06 ˆ † ˆ ˆ ˆ −1 ˆ † ˆ −1 sph U U = 1 = U U; U = U . (1.26) = Hat (~r) + ∆V (~r). (2.28)

However, a unitary transformation of a Hermitian The integral results primarily from the first part, matrix does not affect its eigenvalues, but merely which gives the “atomic” eigenvalues εm for a sph { } transforms the eigenvectors. So, unless there is some spherical atomic Hamiltonian Hat . Then because specific reason for doing so, the phase factors in the atomic orbitals on the same atom are orthogonal to last line of Eq. 1.23) (i.e. Uˆ) can be disregarded. each other, the on-site matrix element is Looked at the other way, the one may include any additional unitary transformation that one desires. ∗ φm(~r)Hat(~r)φn(~r) = εnδm,n, (2.29) There are occasionally good reasons for making the Z transformation – for checking various aspects of the i.e. just the atomic eigenvalue. code, for example, or more importantly for reducing the secular equation to a real equation in cases where there is a center of inversion in the crystal but the B. Crystal Field Splitting original choice of orbitals produced a complex secu- lar equation. The important result is that, for obtaining the The quantity ∆V (~r) in Eq. 2.28 is not well spec- eigenenergies (the energy bands) and even fractions ified, but we do know that it has the symmetry of the of atomic orbitals of each eigenstate, one does not atom in questions (defining the origin in this equa- need to deal with the internal atomic positions in tion). This symmetry is never spherical in a solid, but discrete, such as a mirror plane (m), a 3-fold ro- constructing H (~k) and this simplification is m,i,n,j tation in an axis containing inversion together with a usually made. mirror plane (3¯m), etc. This crystal field, that is, the non-spherical potential that arises because the atom is in a crystal, will split some eigenvalues that would II. APPLICATIONS OF THE TIGHT BINDING MODEL be degenerate in a spherical potential. The most common such situation is for the five d orbitals in a cubic crystal field, which are split into the threefold It may not have been obvious, but the mathe- representation called t2g (xy,yz,zx) and the twofold matics is finished; the problem has been solved ex- 2 2 2 representation known as eg (x y , 3z 1). Crys- cept for the numerical work. Obtaining the energy tal fields are also very important− for 4f and− 5f ions, bands ε~k (in general there will be several of them, where the designation may become more complex ~ labeled ε~k,n) and the expansion coefficients bn(k) of because intra-atomic coupling (correlation) is impor- the wavefunctions amounts to solving Eq. 1.16 or tant. Although not widely acknowledged, the oxy- 1.17, with the matrix H given by Eq. 1.19 and S gen ion in an axial site (such as in the perovskite structure) has crystal field split p levels: twofold Since the 2nd equation determines (pdπ) = 28 mRy5 (x,y) and a singlet (z). What this means is that (= 0.38 eV), the other two equations can be used instead of a single on-site (“atomic”) energy εd for separately to “determine” (pdσ). The two values a transition metal ion in a cubic site, one has two thus determined are -142 mRy and -131 mRy. Since

energies εt2g and εeg , or even more distinct ones if these do not differ greatly, one can use the 2CA with the symmetry is lower. (pdσ) = -136 mRy (the mean) with acceptable loss of accuracy. The dd parameters are the crucial ones in Nb C. Three Center vs. Two Center and in other elemental transition metals, and in most transition metal compounds. For nearest neighbors A general Hamiltonian matrix element in Eq. there are four three center parameters. Denoting (1.20) contains atomic potentials on a third atom, (ddσ), (ddπ), (ddδ) by S, P, D, respectively, the cor- besides the sites upon which the orbitals are cen- responding values and 2CA expressions are given by tered. The general form of the matrix elements then 1 4 contains three center integrals. Slater and Koster in- 22 = Hxy,xy S + 0P + D (2.30) troduced, and advocated, the use of the two-center − → 3 9 1 1 2 approximation (2CA), in which three center contri- 35 = Hxy,zx S P D butions to the parameters are neglected. By the − → 3 − 9 − 9 way, if you can follow this set of notes to this point, 2 2 31 = √3Hxy,3z2−r2 0S P + D you can read and understand the Slater-Koster that − → − 3 3 is referenced in the abstract. It is a must-read for 2 2 1 +31 = H3z2−r2,3z2−r2 S + P + D those using the tight binding method. → 9 3 3 There had already been suggestions that such The 3rd of these equations has been multiplied three center terms were negligible and, while point- through by √3 to facilitate solution by elimination ing out that they are not negligible in a serious cal- using various combinations of the equations. Sub- culation (borne out by many subsequent studies), tracting the 2nd from the 1st gives SK suggested that for the purposes of parametrizing information (from experiment or from band calcu- 1 2 lation) a 2CA is likely to be very useful. Table I +13 = 0S + P + D, (2.31) 9 3 in their paper provides the expression for s,p, and d matrix elements in terms of two center integrals, which can be combined with the 3rd and 4th equa- denoted (ssσ), (pdπ), (ddδ), etc. The 2CA is very tions (separately) to produce three “solutions,” one widely, almost universally, used. of which is S,P,D = 107, 77, 31 . Notice the As an example of the validity, or lack thereof, of strong decrease{ from}σ {−π δ. Comparison} of the the 2CA, we consider the case of Nb studied by Pick- values from different “solutions”→ → provides a guideline ett and Allen.[1] They performed a full three center for the viability of the 2CA. fit to the augmented plane wave bands of Mattheiss for Nb (also Mo; these were state of the art in ac- curacy at the time, and still perfectly good for most D. Introducing the usual terminology applications) at 55 k points in the irreducible zone, using an s + p + d basis set (9 9 matrix), and × Now we consider an intersite term. However, including parameters out to third neighbors. Sev- we do not intend here to do a serious calculation; eral of the 31 parameters are found to be negligibly rather, we want to identify the important quantities small, suggesting the full three-center treatment was (often called parameters) and choose likely values not essential. As a test of the 2CA in Nb, we can and learn some things about the general behavior of look at the pd parameters for nearest neighbors in the electronic system. In fact, we’ll just define a TB the bcc lattice, which lie along (111) directions. The parameter values of the independent parameters (in mRy), to- gether with the expression in the 2CA, are t (R~) H (R~) (2.32) m,n ≡ m,n 1 1 42 = Hx,xy (pdσ) + (pdπ) because t is the usual notation for these integrals. − → 3 3√3 This integral (see above) indicates how easy it is for 1 an electron – whose behavior after all is determined +16 = Hx,x2−y2 0(pdσ) + (pdπ) → 3√3 by the Hamiltonian – to “hop” from orbital n on 1 2 site R~ to orbital m at the origin. The t constants 51 = Hx,yz (pdσ) (pdπ). { } − → 3 − 3√3 are called hopping parameters or hopping amplitudes. The on-site (R~ = 0) term was given in Eq. 2.29, The solution to Eq. 1.16 is immediate: 6 which we repeat here to jog the memory: εs + 2t1cos(ka) ~ εk = (3.37) tm,n(0) = εnδm,n, (2.33) 1 + 2s1cos(ka) which is the atomic corresponding to It is very simple to add the effects of interaction with orbital φn. 2nd neighbors, which lie at 2a from the reference For the overlap matrix we have similarly the no- atom (“at the origin”). The± sum over the complex tation exponential factors leads to the result 2cos(2ka). Denoting the corresponding integrals by t and s , s (R~) = S (R~); s (0) = δ . (2.34) 2 2 m,n m,n m,n m,n the energy band dispersion relation becomes The latter result expresses the orthonormality of εs + 2t1cos(ka) + 2t2cos(2ka) atomic orbitals. εk = (3.38) Note that the local orbitals have never been con- 1 + 2s1cos(ka) + 2s2cos(2ka) structed. They enter implicitly through the Hamil- tonian and overlap parameters; the method is a C. General Features parametrization scheme. It is possible to do ac- tual calculations using atomic-like orbitals, which is known as linear combination of atomic orbitals It is time to reflect. Eq. 3.35 is the simple 1D (LCAO). tight binding band that arises commonly in modeling the behavior of 1D materials. Actually, it is almost always even simpler, because the s overlap parame- III. EXAMPLES ters are usually neglected for further simplicity. This is discussed in the next subsection. However, at the A. s functions on a simple lattice cost of a little messiness in deriving the expression, we have obtained a very simple form of ε vs. k dis- persion relation ε . In the simplest case it has only The simplest case to consider is when there is k a single parameter, t t, and by construction – only one, s-like function on each atom at sites in a 1 because the wavefunctions≡ were built to satisfy the Bravais lattice (i.e. one atom per cell). Then the TB Bloch condition – it has precisely the correct peri- matrix equation is a 1 1 equation, which gives a odicity. These are two general features of the TB direct expression for ε .× Since an s atomic orbital is ~k representation: (i) simplicity, and (ii) expressions a spherically symmetric function, there is no aggra- involving trig functions that automatically have the vating angular dependence to worry about in Eq. ?? correct periodicity. for the hopping parameters, and in fact the value of Another physical requirement is that the param- the hopping parameter is the same for all equivalent eter S (R~) in Eq. 2.27 cannot be greater than neighbors – all 1st neighbors have the same value, n,m unity in absolute value, however there can be a prob- all 2nd neighbors have the same (different, usually lem with the denominator vanishing in Eq. 3.37 un- smaller) value, etc. Let’s look at some cases. less s1 is less than 0.5. In practice, to be reasonable the various overlap parameters sm,n(R~) should be B. 1D linear chain of atoms small in magnitude compared with unity (they may be of either sign). The “self overlap” term is unity by normalization, and putting one of the orbitals on Taking some atom as our origin, we have two 1st a different site must reduce the magnitude of the neighbors, at a, and both have the same hopping overlap. Hence the denominator in Eq. 3.37 should amplitude t .± Then the sum in Eq. 1.19 contains 1 always take the form of a correction, an adjustment, the on-site term and those from neighbors: and not give very large alteration of the denomina- H (k) = ε + t eikR tor, or else the description loses its realism. s,s s 1 If the overlap matrix is approximated by the unit X~ R matrix, so that Eq. 3.37 becomes ikR −ikR = εs + t1(e + e )

= εs + 2t1cos(ka). (3.35) εk = εs + 2t1cos(ka), (3.39) Likewise, the interpretation of the parameters occurring in the dispersion relation is simple. The band minimum S (k) = 1 + s eikR =1+2s cos(ka).(3.36) and maximum are ε 2t and ε + 2t respec- s,s 1 1 s − | 1| s | 1| XR~ tively. This means that the center of the band lies 7 at εs and the bandwidth W is given by A. L¨owdin Orthogonalization

W = 4 t . (3.40) | 1| Let us return to Eq 1.17, This generalizes to square/cubic lattices in 2D and Hbˆ = εSb,ˆ (4.44) 3D, where the bandwidth is given by 2z t1 , where z is the usual notation for the number| of| nearest where the notation has been simplified even further neighbors. This can be changed somewhat by 2nd by dropping the ~k argument on each quantity, but neighbor terms, but the rule-of-thumb is this simple adding a “hat” on the matrices. We can make a relationship between bandwidth, coordination num- transformation that effectively eliminates S. This is done by introducing the square root, de- ber, and nearest neighbor hopping amplitude: 1 note it Sˆ 2 , of the overlap matrix S: W = 2z t1 . (3.41) 1 1 1 | | Sˆ = Sˆ 2 Sˆ 2 = (Sˆ 2 )2. (4.45) We will also need the inverse of the square root of D. Character of the Overlap Correction Sˆ (equal to the square root of the inverse):

1 1 1 1 We can obtain insight into the effect of includ- Sˆ− 2 Sˆ 2 = 1ˆ = Sˆ 2 Sˆ− 2 . (4.46) ing the overlap matrix S by expanding Eq. 3.38, Aha! I hear you say. Square roots and inverses of assuming s1,s2 are small. We obtain, shortening cos(ka) C (k), cos(2ka) C (k), matrices do not always exist. That is true; however, → 1 → 2 Sˆ is a positive matrix from its physical and mathe- matical definition. This means that all of its eigen- εk (εs + 2t1C1(k) + 2t2C2(k)) ≈ × values are positive, in which case its square root and (1 2s C (k) 2s C (k)) − 1 1 − 2 2 its inverse do exist. These matrices can be obtained = εs + 2(t1 s1)C1(k) + 2(t2 s2)C2(k) by (1) performing the similarity transformation that − 2 2− 4t1s1C1(k) 4t2s2C2(k) transforms Sˆ to a diagonal matrix, (2) taking the − − 4(t1s2 + t2s1)C1(k)C2(k). (3.42) square root[2] or the inverse of all diagonal elements − as desired, and (3) performing the inverse similarity Several effects can be noted. In the simple C1(k) and transformation. It is readily shown that the corre- C2(k) terms, the overlaps are reduced from t1,t2 to sponding matrices obey the properties given above t s and t s . Of course this can be an in- for the square root or the inverse. We don’t say 1 − 1 2 − 2 crease if t1 and s1 differ in sign, etc. The second more about this because for now it is only necessary effect is to add more wiggles (Fourier components), to know that it is possible. via the products of cj (k) terms. It does so in a some- With these matrices we can now carry out the what different way than simply adding more neigh- following steps: bors (more t’s) into the expansion. Probably it is ˆ ˆ more efficient, when using the TB parametrization Hb = εSb − 1 1 1 1 to fit given complicated band structures, to use an Hˆ Sˆ 2 Sˆ 2 b = εSˆ 2 Sˆ 2 b 1 1 1 1 overlap matrix rather than simply add more hopping (Sˆ− 2 Hˆ Sˆ− 2 )(Sˆ 2 b) = ε(S 2 b) parameters, but this question may not have been ˆb = εb, (4.47) studied systematically. H where − 1 − 1 IV. THE OVERLAP MATRIX ˆ = Sˆ 2 Hˆ Sˆ 2 . (4.48) H The generalized eigenvalue problem Hbˆ = εSbˆ has The overlap matrix deserves comment before we been transformed into a conventional eigenvalue continue. In a large majority of cases where the TB problem ˆb = εb with the same eigenvalues. The method is used for simplification or for pedagogical eigenvectors,H which contain the information about reasons, the overlap matrix is “set to unity”: how much each atomic orbital contributes to the eigenfunction, have changed, and the Hamiltonian S (R~) δ δ . (4.43) m,n → m,n R~,~0 matrix that needs to be diagonalized has been trans- formed. This observation, together with the lack of It is important to understand why, first, this is in any overlap matrix in the new equation, indicates principle a reasonable thing to do, and second, it that the transformation leads not only to simplification but to additional ap- 1 proximation. b = S 2 b (4.49) v+w v w 8 reflects a transformation of the underlying atomic amplitude t2, we take advantage of e = e e to orbitals into a set of mutually orthogonal orbitals, get the additional term, and the band looks like even if they were originally residing on different ε = ε + 2t (cos(k a) + cos(k a)) atoms. This orthogonalization procedure is called ~k s 1 x y L¨owdin orthogonalization, after the eminent quan- + 4cos(kxa)cos(kya). (5.54) tum chemist Per-Olov L¨owdin (pronounced love- Once you do a few of these, the pattern becomes dean). simple. When you need to deal with atomic p or What this orthogonalization procedure does is d orbitals, the procedure becomes more intricate: to add in (either sign is possible) to any given or- the t(R~) factor depends on R~ and cannot simply be bital φ (~r) a fraction of every other orbital that it j pulled out from under the summation. But that is overlaps. The amount it must mix in is related to another course. the corresponding element of Sm,n(R~). The effect is to make each of the orbitals more spread out, and thereby to increase the number of matrix ele- B. 2D square lattice of p orbitals ments that are needed in ˆ compared to what were ˆ H needed in H. This fact is bothersome and tends For p ,p orbitals on a square lattice, one can to get “forgotten,” because the objective of the TB x y take advantage of the observation that px and py formalism is simplicity, and therefore few parame- orbitals on neighboring sites are orthogonal, whether ters. The conventional applications of TB theory for oriented along thex ˆ ory ˆ directions. Keeping the (semi)empirical studies therefore usually presumes larger pσ hopping and also the smaller pπ hopping, (“we assume, for simplicity”) that the original local the 2 2 matrix (for the two orbitals) is diagonal, ~ orbitals φm(~r R) have been orthogonalized. with × If the{ off-diagonal− } elements of S (i.e. the over- laps) are small, an approximate square-root matrix px = 2tσcoskxa + 2tπcoskya, (5.55) can be obtained by expansion. Writing with the other diagonal elements just interchanging x y indices, giving rise to obvious square symme- δSˆ Sˆ 1, (4.50) ↔ ≡ − try. we can write VI. COMMENTS OF FITTING OF 1 1 1 Sˆ 2 = (1 + δSˆ)1/2 1 + δSˆ + (δSˆ)2 + ... (4.51) PARAMETERS ≈ 2 8 This expansion is readily evaluated without diago- The tight binding hopping parameters (and nalizing the overlap matrix. overlap parameters, if used) are commonly fit to an independently calculated band structure. Denote the parameters by a vector ~t and the set of eigenval- ~ V. RETURN TO APPLICATIONS ues to be fit by εK , where K will include both k and band indices.{ The} mathematical problem is to minimize the “residual” A. 2D Square Lattice of s Orbitals R δ = δ g [E (~t) ε ]2 = 0. (6.56) R K K − K Henceforward we neglect the Sˆ matrix, since XK most of the rest of the world does so. For the square Here gK is a weight to be chosen as desired (for lattice, the sum over nearest neighbors runs over the example, to fit eigenvalues around the Fermi en- sites (a, 0), (0, a), ( a, 0), (0, a). The sum becomes − − ergy more closely than elsewhere), and EK are the eigenvalues resulting from the TB secular{ equation,} eikxaj + eikxap → which depend on the parameters. XR~ jX=±1 pX=±1 Carrying out the minimization with respect to ~t = 2(cos(kxa) + cos(kya)), (5.52) gives g (E (~t) ε ) E (~t) = 0. (6.57) and the dispersion relation is K K − K ∇t K XK

ε~ = εs + 2t1(cos(kxa) + cos(kya)). (5.53) Linearizing around ~t ~t : k ≈ o E (~t) E (~t ), Now, suppose we want to account for “hopping” to ∇t K ≈ ∇t K o the second neighbors at the points ( a, a) with E (~t) E (~t ) + (~t ~t ) (E (~t (6.58)), ± ± K ≈ K o − o · ∇t K o leads to Since the Hamiltonian matrix elements depend lin-9 early on the components of ~t, the derivative in this gk[EK (~to) εK + (~t ~to) t(EK (~to)] last equation is trivial and can be evaluated exactly − − · ∇ XK by finite difference (numerically; of course, if the ma- t(EK (~to) = 0, (6.59) trix is in analytic form it can be obtained trivially). ∇ A note to programmers: for Nb, where a 9 9 matrix × which can be written in a matrix form was used, with 31 parameters and 55 ~k points, there are ~b + (~t ~t ) A = 0. (6.60) − o · This equation can be transposed to 9 10 ~t = ~t ~b A−1, (6.61) × 55 31 = 76, 725 (6.65) o − · 2 × × where b and the matrix A are defined by

~b = g E (~t ) ε ) E (~t ), such derivatives! k K o − K ∇t K o XK Eq. 6.61 provides an iterative procedure for fit- A = g [ E (~t )] [ E (~t )] .(6.62) ting the parameters. Like all such schemes, it will p,q k ∇t K o p ∇t K o q converge if the starting point is sufficiently close to XK a good minimum. There will in general be many lo- The Hellman-Feynman theorem gives a simple cal minima, and the criterion for a best fit may be method of calculating the derivatives. Given the ma- somewhat subjective as well. However, some study trix eigenvalue equation of these problems (unpublished) seems to indicate that non-uniqueness of the fit is not a big problem. ˆ ~ ~ ~ ~ H(k) kα, t >= E~k,α kα, t > (6.63) Although a converged result is certainly not unique, | | restarting the iteration procedure in practice leads where α is the band index, the derivative is to sets of parameters whose differences have little physical significance, i.e. the large parameters are ~ ~ ˆ ~ ~ ~ tE~kα(t) =< kα, t tH(k) kα, t > . (6.64) fairly well determined. ∇ |∇ |

[1] W. E. Pickett and P. B. Allen, Solid State Commun. elements. Surely the physical choice is to take the (1973). positive square root of each diagonal element, though [2] The square root of a matrix is ambiguous. After di- it is unclear how much this choice has been tested. agonalization, one can take ± any of the diagonal