Conductance of Single Electron Devices from Imaginary–Time Path Integrals

Dissertation

submitted for the doctoral degree of the

Faculty of Mathematics and Physics of the Albert–Ludwigs–Universität, Freiburg im Breisgau

presented by Christoph Theis from Bernkastel-Kues

Freiburg, April 2004
Dean: Prof. Dr. R. Schneider
Supervisor: Prof. Dr. H. Grabert
Referee: Prof. Dr. H. Grabert
Co-referee:

Date of the oral examination: 26 May 2004

Contents

1 Introduction and Overview

2 Concepts of Transport in Nanoscopic Structures
  2.1 Resonant Tunneling through Discrete Levels
  2.2 Coulomb Blockade of Transport
  2.3 Kondo Effect in Quantum Dots

I Transport Properties from Imaginary-Time Path Integrals

3 Path Integrals for Fermions
  3.1 Introduction: The Feynman Path Integral
  3.2 Second Quantization
  3.3 Grassmann Algebra
    3.3.1 Motivation and Definition of the Grassmann Algebra
    3.3.2 Calculus for Grassmann Variables
    3.3.3 Important Integration Formulas
  3.4 Fermion Coherent States
    3.4.1 Definition of Fermion Coherent States
    3.4.2 Properties of Fermion Coherent States
  3.5 Coherent State Path Integral
  3.6 Example: Non–Interacting Fermions
    3.6.1 The Partition Function
    3.6.2 The Thermal Green's Function

4 Path Integral Monte Carlo
  4.1 Basics of Monte Carlo Integration
  4.2 Importance Sampling and Markov Processes
    4.2.1 Reduction of Statistical Errors by Importance Sampling
    4.2.2 Markov Processes and the Metropolis Algorithm
  4.3 Statistical Analysis of Monte Carlo Data
    4.3.1 Estimates for Uncorrelated Measurements
    4.3.2 Correlated Measurements and Autocorrelation Time
    4.3.3 Binning Analysis of the Monte Carlo Error
  4.4 Systematic Errors and Trotter Extrapolation
    4.4.1 Approximations for the Short–Time Propagator
    4.4.2 Trotter Error of Expectation Values
    4.4.3 Trotter Extrapolation
  4.5 Non-Positive Actions and the Sign Problem

5 Correlation Functions and Inverse Problems
  5.1 Time Correlation Functions and Linear Response
    5.1.1 Real–Time Correlation Functions
    5.1.2 Linear Response Theory and Fluctuation–Dissipation Theorem
    5.1.3 The Kubo Formula for the Conductance
    5.1.4 Imaginary–Time Correlation Functions
  5.2 Linear Inverse Problems
    5.2.1 Definition and Examples of Inverse Problems
    5.2.2 Ill–Posedness and Regularization
  5.3 The Singular Value Decomposition (SVD)
    5.3.1 Formal Solution for Linear Inverse Problems
    5.3.2 Regularization of the Solution
    5.3.3 Additional Constraints
  5.4 The Maximum Entropy Method (MEM)
    5.4.1 Bayesian Inference
    5.4.2 The Maximum Entropy Functional
    5.4.3 Determination of the Regularization Parameters
  5.5 Test of the SVD Method
    5.5.1 An Exactly Solvable Model
    5.5.2 Application of the SVD Method
    5.5.3 Comparison of SVD and MEM Results

II Applications

6 The Metallic Single Electron Transistor
  6.1 Single Electron Tunneling through a Metallic Island
    6.1.1 Experimental Realizations and Model Parameters
    6.1.2 Charging Model
  6.2 Path Integral Formulation
    6.2.1 Path Integral Ansatz
    6.2.2 The Coulomb Action
    6.2.3 Coherent State Path Integral and Source Terms
  6.3 Effective Action of the Single Electron Transistor
    6.3.1 Exact Integration of Quasi-Particle Baths
    6.3.2 The Tunnel Action
    6.3.3 The Current Autocorrelation Function
  6.4 Monte Carlo Calculation of the Correlation Function
    6.4.1 Discretization of the Path Integral
    6.4.2 Details of the Monte Carlo Simulation
    6.4.3 Results for the Cosine Correlation Function
  6.5 Results for the Conductance
    6.5.1 Inverse Problem for the Conductance
    6.5.2 Coulomb Oscillations of the Conductance
    6.5.3 Temperature Dependence of the Conductance
    6.5.4 Dependence on the Tunneling Strength

7 Semiconductor Dots
  7.1 Band Diagram of Semiconductor Heterostructures
    7.1.1 Band Structure of GaAs and AlGaAs
    7.1.2 Band Profile of a Heterostructure
  7.2 Electrostatics of Gated Quantum Dots
    7.2.1 The Constant Interaction Model and its Limitations
    7.2.2 Electrostatic Energy and Work of the Power Sources
    7.2.3 Green's Function for a Vertical Quantum Dot
  7.3 Theoretical Model
    7.3.1 Model Hamiltonian
    7.3.2 Action and Source Terms
    7.3.3 Decoupling of the Interaction
  7.4 Effective Action
    7.4.1 Integration over the Lead Fermions
    7.4.2 Integration over the Quantum Dot Fermions
  7.5 Discussion of the Results
    7.5.1 General Discussion
    7.5.2 Outlook: Stationary Phase Approximation

8 Summary and Conclusions

III Appendices

A Properties of Correlation Functions

B Linear System of de Villiers' SVD Method

C The Damped Harmonic Oscillator
  C.1 Influence Functional for a Linearly Coupled Harmonic Bath
  C.2 Classical Dynamical Friction Kernel
  C.3 Correlation Function for the Tagged Oscillator

D Representation of Operators
  D.1 The Charge Shift Operator
  D.2 The Current Operator

E Electrostatics of Quantum Dots
  E.1 Formal Solution of the Dirichlet Problem
  E.2 Green's Function for a Cylindrical Dot

Bibliography

Chapter 1

Introduction and Overview

The continuing progress in the miniaturization of electronic circuits has reduced the length of a single transistor down to the nanometer scale. Not only does this imply that the size of the fundamental building blocks approaches that of the chemical units of the material, but also that quantum mechanical effects play a very important role in their operation. On the one hand this poses new problems, as we are reaching a fundamental limit of miniaturization where noise and quantum mechanical interference effects reduce the reliability of "classical" transistors as logic units. On the other hand new possibilities open up that can be summarized under the keywords "molecular electronics" and "quantum computing". In this thesis we will examine two model systems that are important for the understanding of the relevant concepts of molecular electronics and that are currently under investigation for applications in quantum computing.

The metallic single electron transistor (SET) [1] shown in the reflection electron microscope (REM) picture in fig. 1.1 consists of a small Al island (with linear dimension $L \approx 500$ nm and capacitance $C$) coupled to Al leads via tunnel barriers formed by an oxide layer. The Al island is also coupled electrostatically to gate electrodes via a gate capacitance $C_g$. The SET is an important model system for the study of the Coulomb blockade effect, which is responsible for a suppression of the source drain current for voltages $V \le V_\mathrm{th}$ with $V_\mathrm{th} \in [0, e/C]$ depending on the gate voltage $U_g$. For the linear response conductance it leads to oscillations with period $e/C_g$ as a function of the gate voltage.

[Figure 1.1 (image not reproduced). Electrode labels: gate, source, island, drain, gate.]

Figure 1.1: REM picture of a four junction SET. In the Coulomb blockade measurements both gates as well as the two source and the two drain electrodes are connected in parallel (from [2]).


Among other possible applications [3] it can be used as an ultra–sensitive electrometer [4], and it represents a building block in the so–called "quantronium" circuit [5], which is a promising candidate for a qubit, i.e. the basic unit of information in quantum computing. Single–atom transistors [6], single–molecule transistors [7] or carbon nanotube single electron transistors [8], in which gold electrodes are used and the central island is replaced by a molecule or carbon nanotube, are applications of the concept of the single electron transistor for molecular electronics research.

Semiconductor quantum dots, which are sometimes also referred to as artificial atoms [9, 10], are based on the realization of a two–dimensional electron gas (2DEG) formed in a semiconductor heterostructure. Using electrostatic gates or lithographic techniques (or a combination of both) to create a confinement potential in the plane of the 2DEG, one forms two–dimensional "atoms" containing between one and several hundred electrons. The quantum dot can be contacted either laterally or from above and below by n–doped GaAs layers. Examples of both geometries are shown in fig. 1.2.

Figure 1.2: Subfigure a) shows the schematic layout of a vertical quantum dot consisting of an InGaAs layer sandwiched between AlGaAs tunnel barriers and contacted from above and below (from [10]). Subfigure b) displays an electron micrograph showing the electrostatic gates defining a lateral quantum dot. The 2DEG is situated 190 nm below the surface of the sample. In the left part of b) one can see another realization of a SET that is used as an electrometer to measure the charge on the dot (adapted from [11]).

Semiconductor quantum dots are an interesting model system since they provide the possibility to study an electron gas with well–defined contacts that shows atom–like properties which can be easily tuned by electrostatic gates and magnetic fields. Due to the confinement on a scale of $\approx 100$ nm in the plane of the 2DEG and $\lesssim 10$ nm in the perpendicular direction, the electrons inside the quantum dot have a discrete spectrum, which is responsible for an important aspect of electronic transport known as resonant tunneling. The existence of a singly occupied (spin–)degenerate level in the quantum dot can also give rise to many–body effects between the electron gas in the leads and the localized states of the dot that are analogous to the Kondo effect in a metal containing dilute magnetic impurities. The "tunable" Kondo effect [12] in quantum dots has been studied extensively in recent years and constitutes another important concept of electronic transport through a confined electron system.

Single quantum dots or quantum dot arrays as artificial atoms or molecules represent a step in the development towards molecular electronics. They are (usually) produced by traditional top–down approaches and lack the mechanical degrees of freedom and the possibility to undergo conformational changes, but they already share many of the mechanisms that will be important for transport through real molecules. With respect to applications in quantum computing, double dot systems in particular are studied extensively and have been used to realize charge [13] and spin qubits [14, 15]. Like the quantronium circuit, (double) quantum dots are fabricated from materials that are already well established in information processing and thus can be more easily incorporated in integrated circuits than other realizations of qubits.
The aim of this thesis is to examine theoretical approaches that allow a quantitative calculation of charge transport in these important model systems over the range of experimentally accessible parameters. For the description of the models we use path integral methods that have been applied successfully in the non–perturbative treatment of tunneling. To avoid the so–called dynamic sign problem in the (direct) numerical calculation of real–time quantities we employ imaginary–time path integrals. Imaginary–time methods rely on linear response theory and schemes for the "analytical continuation" of numerical data that will be critically examined for the exactly solvable model of a harmonic oscillator embedded in a harmonic environment.

The path integral formulation for the metallic single electron transistor [16] has proven to be a promising candidate for a quantitative description of Coulomb blockade effects, although a rigorous comparison with experiment was hampered by the fact that the relevant parameters of the theory were not (all) accessible to measurement. In a recent experiment Wallisser et al. employed an improved layout that allows a complete characterization of the sample and enables us to perform a comparison between theory and experiment without any adjustable parameters.

A quantitative theory for the conductance of semiconductor quantum dots that gives a unified description of the Kondo effect and Coulomb blockade does not yet exist, though first approaches in that direction have been made [17]. Therefore we will derive a realistic model of a (vertical) semiconductor quantum dot to which we apply the imaginary–time formalism to assess whether this approach can be generalized to this system.

The first part of this work is devoted to the development of the methods that are used for our study. As a starting point we take the path integral description of quantum mechanics and its generalization to many–body systems that will be discussed in detail in chapter 3.
For a non–perturbative treatment of the SET in the regime of strong tunneling and for a realistic description of the electrostatics of a semiconductor quantum dot, the use of numerical methods for the calculation of the conductance is required. Chapter 4 describes how Monte Carlo methods can be used to evaluate numerically the high–dimensional integrals that result from the path integral description. Numerical approaches to the calculation of real–time correlation functions for quantum mechanical many–body systems are plagued by the so–called "sign problem" that leads to an exponential decrease of the signal–to–noise ratio with increasing time $t$. Therefore we have used an alternative approach based on the calculation of imaginary–time correlation functions that can be determined with high accuracy by Monte Carlo methods.

In chapter 5 we use linear response theory to work out the relations between the conductance and these correlation functions. For the imaginary–time formalism these relations have the form of an inverse problem and require special numerical methods for their solution. Since this represents a crucial step of the calculations we will discuss in detail the tests of their implementation for an exactly solvable model system.

The second part of the thesis describes the application of the imaginary–time path integral formalism to the metallic single electron transistor and semiconductor quantum dots. In chapter 6 the single electron transistor is modeled as a macroscopic charge degree of freedom coupled to quasiparticle baths in the leads and on the island. We derive the effective action of the SET by exact integration over the quasiparticle degrees of freedom and use Monte Carlo methods to evaluate the current–current correlation function in imaginary time, from which the conductance of the SET can be calculated. The results are compared in detail with experimental findings of Wallisser et al. [2] and Joyez et al. [18].

In chapter 7 we outline the extension of the imaginary–time path integral approach to the description of semiconductor quantum dots. Since the screening length in this system is larger than in the metallic SET of chapter 6, a realistic model of a semiconductor quantum dot has to take into account the screened electron–electron interaction on the quantum dot and the effects of the gate electrode on the confinement in more detail. We show how the geometry of this electrostatic problem can be incorporated in the action of the imaginary–time path integral. As in the case of the SET the quasiparticles can be integrated out exactly and an effective action for the description of semiconductor quantum dots can be derived. We compare the resulting theory with the path integral approach for the SET and point out directions for further research in relation to recently published theoretical results by Bednarek et al. [19].

A part of this thesis has been published in [2].

Chapter 2

Concepts of Transport in Nanoscopic Structures

In this chapter we give a more specific introduction to three important concepts of electronic transport through nanoscopic and mesoscopic systems. These are resonant tunneling through discrete levels of the confined nanostructure, Coulomb blockade of tunneling as a result of charge quantization, and the Kondo effect as a many–body phenomenon due to the correlation of the conduction electrons in the leads with a degenerate level in the confined structure.

2.1 Resonant Tunneling through Discrete Levels

When two macroscopic resistors are connected in series their resistances add according to Ohm's law. As demonstrated in the pioneering work of Chang, Esaki and Tsu [20], the situation is very different if we study a system consisting of two tunnel barriers in series separated by a conducting layer with a thickness of a few nanometers. A realization of such a double barrier system using a semiconductor heterostructure is shown schematically in fig. 2.1 a). Subfigure b) displays the IV–characteristic of such a system, which is characterized by an initial increase of the current with increasing source–drain voltage followed by a region of negative differential resistance, i.e. a drop of the current.

Figure 2.1: Schematic setup and IV–characteristic of a resonant tunneling device (from [21]).


To get a qualitative understanding of this negative differential resistance it is sufficient to consider coherent transport through the double barrier structure. Hence our starting point is the Schrödinger equation (with effective mass $m^*$) for the quasiparticles [21] that separates into the equation

$$\left[ -\frac{\hbar^2}{2m^*} \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \right) + U_T(x,y) \right] \phi_m(x,y) = \epsilon_m \, \phi_m(x,y) \tag{2.1}$$

for the energies $\epsilon_m$ of the transverse modes and the one–dimensional scattering problem

$$\left[ -\frac{\hbar^2}{2m^*} \frac{d^2}{dz^2} + U_L(z) \right] \psi_m(z) = E_L \, \psi_m(z) \quad \text{with} \quad E_L = E - \epsilon_m \tag{2.2}$$

under the assumption that the potential can be split into the double–barrier potential $U_L(z)$ and a $z$–independent transversal confinement $U_T(x,y)$. For coherent transport the probability that a quasiparticle with (longitudinal) energy $E_L$ is transmitted through the two barriers can be given as [21]

$$T(E_L) = \frac{T_1 T_2}{1 - 2\sqrt{R_1 R_2} \cos(\theta(E_L)) + R_1 R_2} \approx \frac{T_1 T_2}{\left[ \frac{T_1 + T_2}{2} \right]^2 + 2 \left( 1 - \cos(\theta(E_L)) \right)} \tag{2.3}$$

where $\theta(E_L)$ is the phase shift acquired by a plane wave with energy $E_L$ on a round–trip between the barriers and we have used the assumption $T_i = 1 - R_i \ll 1$, $i = 1, 2$ of weak transmission through the single barriers. The maxima of the transmission $T(E_L)$ correspond to a phase shift $\theta(E_L) = 2\pi n$, $n \in \mathbb{N}$, i.e. to energies $E_L$ that coincide with the energies $E_r$ of resonant or quasi–bound states in the confined region. Expanding the factor $1 - \cos(\theta(E_L))$ around the energy $E_r$ of the resonance as

$$1 - \cos(\theta(E_L)) \approx \frac{1}{2} \left( \theta(E_L) - 2\pi n \right)^2 \approx \frac{1}{2} \left( \frac{d\theta}{dE_L} \right)^2 (E_L - E_r)^2 \tag{2.4}$$

we can rewrite the transmission probability as

$$T(E_L) \approx \frac{\Gamma_1 \Gamma_2}{\Gamma_1 + \Gamma_2} \, \frac{\Gamma_1 + \Gamma_2}{(E_L - E_r)^2 + \left[ \frac{\Gamma_1 + \Gamma_2}{2} \right]^2} \quad \text{with} \quad \Gamma_i \equiv \frac{dE_L}{d\theta} T_i, \tag{2.5}$$

i.e. as a Lorentzian with width $\Gamma_1 + \Gamma_2$ centered at energy $E_r$. Since the phase shift is $\theta \approx 2kw$, where $k$ is the (longitudinal) wavevector of the quasiparticle and $w$ the effective width of the well (including phase shifts due to reflection at the barriers), $\Gamma_i/\hbar$ is given by

$$\frac{\Gamma_i}{\hbar} = \frac{1}{\hbar} \frac{dE_L}{d\theta} T_i = \frac{1}{2w} \frac{1}{\hbar} \frac{dE_L}{dk} T_i = \frac{v}{2w} T_i. \tag{2.6}$$

The quotient of the velocity $v$ and the length $2w$ of a round–trip is just the rate at which the quasiparticles impinge on the barrier $i$, and hence $\Gamma_i/\hbar$ can be interpreted as the rate at which electrons in the confined region leak out through barrier $i$.

Eq. (2.5) tells us that the double barrier structure acts as a filter which can only be passed by electrons with a longitudinal energy $E_L \approx E_r$. Thus the negative differential resistance can be explained by the simple picture outlined in fig. 2.2. The transport voltage shifts the position of the resonant level with respect to the conduction band of the source, where we have assumed for simplicity that the applied voltage drops linearly across the device, which is sufficient for a qualitative understanding. For finite bias the resonant level lies within the transport window,

i.e. electrons from the source can tunnel through the resonant level into unoccupied states in the drain electrode. Since the energy difference between $E_r$ and $\mu_1$ is available for the energy $\epsilon_m$ of the transverse modes, more and more modes can contribute to the current for increasing bias $V$. This leads to the observed increase of the current until the resonant level aligns with the (lower) conduction band edge of the source at the threshold voltage $V = V_\mathrm{th}$. For higher transport voltages all incident electrons have energies above $E_r$ and transport is only possible if an electron tunneling into the well simultaneously emits a photon. Since this process has a much smaller probability the current exhibits a sharp drop for $V > V_\mathrm{th}$.

[Figure 2.2 (image not reproduced). Panel labels: $E_L$, $\mu_1$, $\mu_2$, $E_r$, $V$, $V_\mathrm{th}$.]

Figure 2.2: Schematic conduction band diagrams of a resonant tunneling diode with a single resonant level for different regimes of the source drain voltage in correspondence to the IV–characteristic of the resonant tunneling diode.

This qualitative picture remains correct also in the presence of scattering processes, though they have to be included to get a quantitative description [21]. Scattering is important especially for the calculation of the valley current for $V > V_\mathrm{th}$ when coherent resonant tunneling is forbidden. This can also be seen from the rather large discrepancy for $V > V_\mathrm{th}$ between the experimental results and theoretical calculations [22] based on the coherent tunneling picture as displayed in fig. 2.2.

For devices like the vertical quantum dot shown in fig. 1.2 a), where the potential in the quantum well and hence the position of the resonant levels can be shifted by application of a voltage $U_g$ to a side gate, resonant tunneling can also be observed in the linear response conductance $G = \lim_{V \to 0} I/V$. Here the action of the quantum well as an energy filter emerges even more clearly, since we get an appreciable current only for those gate voltages at which a resonant level aligns with the (small) transport window at the Fermi energy of the leads.

From the above discussion it is clear that a negative differential resistance due to resonant tunneling can only be observed if the level spacing $\Delta$ in the well, i.e. the separation of subsequent resonant levels $E_r$, is large compared to the thermal energy $k_B T$. In addition the broadening of the resonant levels due to tunneling, which is characterized by the rate $(\Gamma_1 + \Gamma_2)/\hbar$, has to be small. The single particle picture we have employed also neglects the Coulomb interaction of electrons in the quantum well. As we will see in the next section this assumption has to be modified if the capacitance of the quantum well becomes very small.
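The quality of the Lorentzian approximation in eq. (2.5) can be checked numerically against the first form of eq. (2.3). The following Python lines are only a minimal sketch under stated assumptions: the function names, the barrier transmissions, the resonance energy and the constant slope $d\theta/dE_L$ are hypothetical illustration values in arbitrary energy units and are not taken from the thesis.

```python
import math


def transmission_exact(theta, t1, t2):
    """Coherent double-barrier transmission, first form of eq. (2.3)."""
    r1, r2 = 1.0 - t1, 1.0 - t2
    root = math.sqrt(r1 * r2)
    return t1 * t2 / (1.0 - 2.0 * root * math.cos(theta) + r1 * r2)


def transmission_lorentzian(energy, e_r, gamma1, gamma2):
    """Lorentzian approximation of eq. (2.5) near the resonance e_r."""
    total = gamma1 + gamma2
    return gamma1 * gamma2 / ((energy - e_r) ** 2 + (total / 2.0) ** 2)


# Hypothetical parameters in arbitrary energy units (assumptions, not thesis data).
T1, T2 = 0.01, 0.02        # single-barrier transmissions, T_i << 1
DTHETA_DE = 50.0           # assumed constant slope d(theta)/dE_L near E_r
E_R = 1.0                  # assumed resonance energy
GAMMA1 = T1 / DTHETA_DE    # Gamma_i = (dE_L/dtheta) * T_i, cf. eq. (2.5)
GAMMA2 = T2 / DTHETA_DE


def phase(energy):
    """Linearized round-trip phase with theta = 2*pi at the resonance."""
    return 2.0 * math.pi + DTHETA_DE * (energy - E_R)


if __name__ == "__main__":
    # Compare both expressions at a few distances from the resonance,
    # measured in units of the total width Gamma_1 + Gamma_2.
    for offset in (0.0, 0.5, 1.0, 2.0):
        energy = E_R + offset * (GAMMA1 + GAMMA2)
        exact = transmission_exact(phase(energy), T1, T2)
        approx = transmission_lorentzian(energy, E_R, GAMMA1, GAMMA2)
        print(f"offset {offset:3.1f}: exact {exact:.4f}, Lorentzian {approx:.4f}")
```

For weakly transmitting barriers the two expressions should agree closely near the resonance; for equal barriers the exact expression gives perfect transmission on resonance, which is the well–known signature of coherent resonant tunneling.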

2.2 Coulomb Blockade of Transport

The electrostatic energy needed to charge a structure with capacitance $C$ by one elementary charge $e$ is given by the so–called charging energy $E_C = e^2/(2C)$. This energy scale is negligible for the charging of a macroscopic body, but in the case of the metallic island of a single electron transistor with capacitance in the range of $C \approx 10^{-15}$ F [18] or for semiconductor quantum dots with a diameter of $d \approx 100$ nm and a capacitance of $C \approx 2 \times 10^{-16}$ F [10] the charging energy is comparable to thermal energies at temperatures of $T \lesssim 1$ K. For this introduction we study the Coulomb blockade in a metallic single electron transistor as shown in fig. 1.1. In the conductance measurements a transport voltage $V$ is applied between source and drain and the island potential can be tuned by a gate voltage $U_g$ as outlined in the circuit diagram shown in fig. 2.3.

[Figure 2.3 (image not reproduced). Circuit labels: $R_s, C_s$; $R_d, C_d$; $n$; $C_g$; $U_g$; $V$.]

Figure 2.3: Circuit diagram of the SET as operated during conductance measurements. The region encircled by the dashed line is the central island containing n (excess) electrons.
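As a quick check of the energy scales quoted at the beginning of this section, the charging energy $E_C = e^2/(2C)$ can be converted into an equivalent temperature $E_C/k_B$. The short Python sketch below does this for the two capacitances mentioned in the text; the function names are hypothetical and chosen only for this illustration.

```python
E_CHARGE = 1.602176634e-19   # elementary charge in C (exact SI value)
K_B = 1.380649e-23           # Boltzmann constant in J/K (exact SI value)


def charging_energy_joule(capacitance):
    """Charging energy E_C = e^2 / (2 C) in joule for a capacitance in farad."""
    return E_CHARGE ** 2 / (2.0 * capacitance)


def charging_temperature_kelvin(capacitance):
    """Equivalent temperature E_C / k_B in kelvin."""
    return charging_energy_joule(capacitance) / K_B


if __name__ == "__main__":
    examples = (
        ("metallic SET island", 1e-15),        # C of order 10^-15 F
        ("semiconductor quantum dot", 2e-16),  # C of order 2 x 10^-16 F
    )
    for label, cap in examples:
        temp = charging_temperature_kelvin(cap)
        print(f"{label}: C = {cap:.0e} F, E_C/k_B = {temp:.2f} K")
```

Both values come out at the order of one to a few kelvin, which is why charging effects only become visible in low temperature experiments.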

Experimentally the Coulomb blockade results in a suppression of the source drain current for transport voltages $|V| \le V_\mathrm{th}$ with a gate voltage dependent threshold $V_\mathrm{th} \in [0, e/C]$, where $C = C_s + C_d + C_g$ is the total capacitance of the island. Fig. 2.4 a) shows how the resulting IV–characteristic develops from an Ohmic behavior at high temperature into a Coulomb blockade of the current for $E_C \gg k_B T$. As shown in fig. 2.4 b) the Coulomb blockade can also be observed in the linear response conductance of the SET, where it leads to marked oscillations with period $e/C_g$ as a function of the gate voltage when the charging energy $E_C$ is large compared with the thermal energy. The experimental results clearly show that $E_C \gg k_B T$ is a necessary condition for the observability of Coulomb blockade and that charging effects are washed out by thermal energy fluctuations at higher temperatures.

As a second prerequisite the charging energy $E_C$ has to be larger than the quantum mechanical energy uncertainty corresponding to the finite dwell time of the electrons on the island. The finite dwell time due to tunneling processes from the island to the leads can be expressed by the parallel conductance $G = G_s + G_d = (R_s + R_d)/(R_s R_d)$ of the tunnel junctions as $\tau = C/G$, and the comparison with the charging energy leads to the condition $G \ll G_K = e^2/h$. As the experimental Coulomb oscillations shown in fig. 2.4 b), which were taken at $G = 4.75\,G_K$, prove, the latter condition can be taken only as a rough estimate, and charging effects can be observed also at parallel conductances that exceed the conductance quantum $G_K$.

[Figure 2.4 (image not reproduced). Panel a): normalized current versus transport voltage between $-V_\mathrm{th}$ and $V_\mathrm{th}$ for $k_B T = 10\,E_C$, $1.0\,E_C$ and $0.1\,E_C$. Panel b): normalized conductance versus $C_g U_g/e$ for $k_B T = E_C$, $0.2\,E_C$ and $0.02\,E_C$.]

Figure 2.4: Traces of the Coulomb blockade in transport measurements on a metallic single electron transistor for different temperatures. Subfigure a) shows the suppression of the source drain current for voltages $V \le V_\mathrm{th}$ while subfigure b) displays the periodic Coulomb oscillations of the linear response conductance as a function of the gate voltage $U_g$ (from [23]).

To get a qualitative description for these observations we consider the electrostatic energy of the island containing $n$ excess charges given by [24]

$$U(n) = \frac{e^2}{2C} (n - n_0)^2 \quad \text{where} \quad n_0 = \frac{C_g U_g}{e} + \frac{C_d V}{e} \tag{2.7}$$

is the (continuous) charge induced by the applied voltages. The chemical potential $\mu(n)$ of the island is the minimal energy needed to add the $n$–th electron, i.e.

$$\mu(n) = U(n) - U(n-1) = \frac{e^2}{C} \left[ \left( n - \frac{1}{2} \right) - \frac{C_g U_g}{e} - \frac{C_d V}{e} \right] \tag{2.8}$$

where we have used the Fermi level of the source as the reference point for the energy. From this result we see that the continuum of energy levels in the metallic island is split by the Coulomb interaction into a series of equidistant levels with energy spacing $e^2/C$. The same model can be used for a qualitative understanding of charging effects in semiconductor quantum dots by adding the electrostatic potential calculated above to the electrochemical potential of the $n$–electron quantum dot, which is given by the energy $E_n$ of the $n$–th single particle level (measured relative to the Fermi level in the source), i.e. $\mu_\mathrm{dot}(n) = E_n + U(n) - U(n-1)$.

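[Figure 2.5 (image not reproduced). Panel labels: $\mu_s$, $\mu_d$, $\mu_n$, $\mu_{n+1}$, $E_C$, $eV$; the central inset repeats the IV–characteristic between $-V_\mathrm{th}$ and $V_\mathrm{th}$ for $k_B T = 10\,E_C$, $1.0\,E_C$ and $0.1\,E_C$.]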

Figure 2.5: Chemical potentials of the leads and the island of a metallic SET for different regimes of the source drain voltage in correspondence to the observed IV –characteristic.

We can discuss the consequences of eq. (2.8) for the IV–characteristics and the linear response conductance in a manner analogous to the discrete spectrum in the case of simple resonant tunneling in the preceding section. Fig. 2.5 shows how the chemical potentials of the leads and the island are shifted relative to each other by the applied transport voltage. At zero voltage $V = 0$ the electron transport is blocked electrostatically and the chemical potentials for $n+1$ and $n$ electrons on the island lie $E_C = e^2/(2C)$ above and below the Fermi level of the leads. The Coulomb blockade of transport persists also at finite voltage until the threshold voltage $V = V_\mathrm{th}$ is reached, at which either $\mu(n+1)$ aligns with $\mu_s$ or $\mu(n)$ with $\mu_d$ (depending on the capacitances $C_s$ and $C_d$ and the offset in the chemical potential created by the gate voltage $U_g$). As long as only the chemical potential $\mu(n+1)$ (or $\mu(n)$) lies within the transport window, the number of charges can fluctuate only between two adjacent values and the electrons have to pass the island of the single electron transistor one by one. Simultaneous tunneling of several electrons becomes possible only at even larger transport voltages. Due to this control of electron transport on the level of single electrons this device has been termed the single electron transistor.

The Coulomb oscillations in the linear response conductance shown in fig. 2.4 are easily understood from eq. (2.8), since the gate voltage simply shifts the island levels, leading to a conductance peak whenever the chemical potential of the island aligns with the Fermi levels in the leads. At the position of the conductance peaks two adjacent charge states (e.g. $n$ and $n+1$ electrons on the island) are degenerate and the charge on the island can fluctuate, resulting in transport through the SET, while the valleys in between peaks correspond to a specific number of (excess) electrons on the island. The period of these oscillations is given by $e/C_g$.
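The charging model of eqs. (2.7) and (2.8) is simple enough to be evaluated directly. The following Python sketch is an illustration under stated assumptions: energies are measured in units of $E_C$ (so that $e^2/C = 2E_C$), the induced charge $n_0$ is treated as a free input, and all function names are hypothetical. It shows that at $n_0 = 0$ the levels $\mu(1)$ and $\mu(0)$ lie $E_C$ above and below the reference level, and that two charge states become degenerate at half–integer $n_0$, i.e. with period $e/C_g$ in the gate voltage at $V = 0$.

```python
def electrostatic_energy(n, n0, e_c=1.0):
    """U(n) = (e^2 / 2C) (n - n0)^2 = E_C (n - n0)^2, cf. eq. (2.7)."""
    return e_c * (n - n0) ** 2


def chemical_potential(n, n0, e_c=1.0):
    """mu(n) = U(n) - U(n - 1) = 2 E_C (n - 1/2 - n0), cf. eq. (2.8)."""
    return 2.0 * e_c * (n - 0.5 - n0)


def ground_state_charge(n0, candidates=range(-10, 11)):
    """Integer n minimizing U(n) for a given induced charge n0.

    The finite candidate range is an assumption of this sketch.
    """
    return min(candidates, key=lambda n: electrostatic_energy(n, n0))


if __name__ == "__main__":
    # Levels at n0 = 0: mu(1) = +E_C and mu(0) = -E_C.
    print("mu(1), mu(0) at n0 = 0:", chemical_potential(1, 0.0), chemical_potential(0, 0.0))
    # Sweep the induced charge n0 = C_g U_g / e at V = 0.
    for n0 in (0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
        n = ground_state_charge(n0)
        mu_next = chemical_potential(n + 1, n0)
        print(f"n0 = {n0:4.2f}: ground state n = {n}, mu(n+1) = {mu_next:+.2f} E_C")
```

In this picture a conductance peak is expected wherever $\mu(n+1)$ crosses zero, which happens at $n_0 = n + 1/2$.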

2.3 Kondo Effect in Quantum Dots

A further modification of resonant tunneling due to many–body correlations was predicted theoretically already in 1988 when Glazman and Raĭkh [25] simultaneously with Ng and Lee [26] noted the similarity between transport through a quantum dot connected to leads and the Kondo problem, i.e. scattering of conduction electrons at (dilute) magnetic impurities in a metal.

[Figure 2.6 (image not reproduced). Panels: initial state, virtual state, final state; level labels: $\epsilon_F$, $\epsilon_0$.]

Figure 2.6: Schematic representation of the process in which the spin of the impurity state (symbolized by the single level at $\epsilon_0$) is flipped via an excited virtual state in which the impurity electron is lifted to the conduction band (represented by the shaded continuum of states filled up to the Fermi level $\epsilon_F$).

The impurity states in such metal systems lie at an energy $\epsilon_0$ well below the Fermi energy $\epsilon_F$ of the conduction band electrons. Nevertheless the spin of the impurity can interact with the spins of the conduction band electrons by exchange processes in which the impurity spin is flipped and a spin excitation at the Fermi energy is created. Such a spin flip can be accomplished by virtual processes depicted in fig. 2.6 in which the impurity electron is lifted to the conduction band and replaced by a conduction electron with opposite spin. At low temperatures this exchange interaction generates a so–called Kondo resonance in the density of states of the impurity at the same energy as the Fermi level, which explains the observed increase of the resistance since the resonance is an effective scatterer for the conduction electrons. The Kondo effect can be studied theoretically using the Anderson model [25]

$$H = \sum_{k\sigma} \epsilon_{k\sigma} c^\dagger_{k\sigma} c_{k\sigma} + \epsilon_0 d^\dagger_\sigma d_\sigma + U d^\dagger_\sigma d_\sigma d^\dagger_{-\sigma} d_{-\sigma} + \sum_{k\sigma} \left( T c^\dagger_{k\sigma} d_\sigma + T^* d^\dagger_\sigma c_{k\sigma} \right) \tag{2.9}$$

where $c^\dagger_{k\sigma}$ and $c_{k\sigma}$ create/destroy an electron in the conduction band and $d^\dagger_\sigma$ and $d_\sigma$ are the corresponding operators for the impurity site, i.e. the impurity is described by a single spin-degenerate level at energy $\epsilon_0$ with on–site Coulomb energy $U$ and finite width $\Gamma \propto |T|^2$ coupling to a band of conduction electrons. Detailed calculations predict a logarithmic increase of the resistance when the temperature is lowered and a universal scaling of the ratio $R(T)/R(0) = f(T/T_K)$ for systems characterized by different values of $\epsilon_0$, $U$ and $\Gamma$ with the Kondo temperature [27]

$$T_K = \frac{\sqrt{\Gamma U}}{2} \exp\left( \pi \frac{\epsilon_0 (\epsilon_0 + U)}{\Gamma U} \right) \tag{2.10}$$

as the single relevant parameter.

A quantum dot with an odd number of electrons has a non–vanishing total spin and thus can be viewed as an artificial magnetic impurity where the role of the level $\epsilon_0$ in the Anderson model is played by the highest occupied level of the dot. By the mechanism outlined above the spin of the quantum dot interacts with the spin of the conduction electrons in the leads, leading to the formation of a Kondo resonance in the density of states of the dot at the position(s) of the Fermi level(s) in the leads. In contrast to the situation of magnetic impurities that represent an obstacle for the conduction electrons, transport through the quantum dot can only occur via "scattering" of electrons across the quantum dot by processes as shown in fig. 2.7. Hence the Kondo effect in quantum dots leads to a logarithmic increase of the (linear response) conductance at low temperatures.
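To get a feeling for the exponential sensitivity of the Kondo temperature in eq. (2.10), the formula can be evaluated for a few parameter sets. The Python sketch below assumes that all energies are measured relative to the Fermi level (so that $\epsilon_0 < 0 < \epsilon_0 + U$) and that $T_K$ is expressed in the same energy units with $k_B = 1$; the function name and the parameter values are hypothetical and serve only as an illustration.

```python
import math


def kondo_temperature(eps0, u, gamma):
    """Kondo temperature of eq. (2.10) in energy units (k_B = 1 assumed).

    eps0  : level energy relative to the Fermi level (negative if occupied)
    u     : on-site Coulomb energy
    gamma : level width due to the coupling to the conduction band
    """
    exponent = math.pi * eps0 * (eps0 + u) / (gamma * u)
    return 0.5 * math.sqrt(gamma * u) * math.exp(exponent)


if __name__ == "__main__":
    u = 1.0  # energies in units of U (hypothetical choice)
    for eps0 in (-0.5, -0.3, -0.1):
        for gamma in (0.1, 0.2):
            t_k = kondo_temperature(eps0, u, gamma)
            print(f"eps0 = {eps0:+.1f} U, Gamma = {gamma:.1f} U: T_K = {t_k:.3e} U")
```

Within this formula $T_K$ is smallest in the middle of the Coulomb blockade valley, $\epsilon_0 = -U/2$, and grows rapidly with increasing level width $\Gamma$ or when the level approaches the Fermi energy.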

[Figure 2.7: panels labeled initial state, virtual state and final state; energy labels $\epsilon_F$ and $\epsilon_0$.]

Figure 2.7: Schematic representation of a cotunneling process (involving a spin flip) by which an electron is transferred across the quantum dot.

For the Coulomb oscillations of the linear response conductance this theory predicts that the Coulomb blockade should be lifted by the Kondo effect for the valleys corresponding to an odd number of electrons on the quantum dot as shown in fig. 2.8. In several experiments on lateral quantum dots [12, 28] the theoretical prediction could be confirmed by the observation of an even–odd pattern like the one shown in fig. 2.8.

Figure 2.8: Theoretical prediction for the even–odd effect due to the formation of a Kondo resonance in the Coulomb oscillations of a quantum dot (from [29]).

Further experimental studies [30, 31] revealed that a Kondo effect can also occur for a quantum dot containing an even number of electrons. Instead of the degeneracy between the two spin states of the highest occupied level in the spin-1/2 Kondo effect described above, the Kondo effects observed in integer spin quantum dots can e.g. be based on the degeneracy between the spin triplet and spin singlet state of the electrons occupying the two highest levels [30]. An example of the Kondo effect in a Coulomb blockade valley corresponding to N = 6 electrons is shown in fig. 2.9 together with a demonstration of the logarithmic increase of conductance at low temperature.

[Figure 2.9: panel a) conductance ($e^2/h$) versus gate voltage (V); panel b) conductance ($e^2/h$) versus temperature (mK).]

Figure 2.9: Kondo effect in an integer spin quantum dot (adapted from [30]). Subfigure a) shows the temperature dependence of the linear response conductance for a Coulomb blockade valley corresponding to a state with N = 6 electrons on the dot. The arrow points in the direction of increasing temperature. Subfigure b) displays the temperature dependence of the maximum of the Kondo peak with the straight line indicating the logarithmic increase.

Part I

Transport Properties from Imaginary-Time Path Integrals


Chapter 3

Path Integrals for Fermions

The aim of this chapter is to present the mathematical background and derivation of a path integral formulation for fermionic many-body systems. A detailed introduction to this method is given in [32]. Other interesting sources are [33] and [34]. We will start this chapter with Feynman's path integral formulation of the partition function. Though it is conceptually quite similar to the functional integrals that we will consider in section 3.5, it is mathematically much simpler and thus will serve us as an introduction. In section 3.2 we very briefly summarize the Fock space representation of many-particle systems and fix the notation. Section 3.3 deals with the mathematical background of the Grassmann algebra, which is necessary for the definition of Fermion coherent states in section 3.4. These two sections also contain a number of formulas involving Grassmann variables that will be important throughout our path integral calculations. In section 3.5 the coherent state path integral formulation of many–Fermion systems will be derived. We close this chapter with the application of the formalism to the simple example of non–interacting Fermions.

3.1 Introduction: The Feynman Path Integral

In this section we will derive Feynman’s path integral formulation for the partition function of a single particle in an external potential V (x) as an introduction to the basic ideas of path integrals. The ideas and results of this section will also be important for the numerical evaluation of measurable quantities by the path integral Monte Carlo method that will be described in chapter 4. Our starting point is the Hamiltonian

$$
H = \frac{p^2}{2m} + V(x) \equiv K + V. \tag{3.1}
$$

The canonical partition function can be expressed as

$$
Z(\beta) = \mathrm{tr}\left\{ e^{-\beta H} \right\} = \int dx\, \langle x| e^{-\beta H} |x\rangle \tag{3.2}
$$

where we have used the common notation $\beta = 1/(k_B T)$ and calculated the trace using the position basis $|x\rangle$ of the one–particle Hilbert space.


Trotter Breakup

The matrix element $\langle x|\exp(-\beta H)|x\rangle$ can be viewed as the probability that a system starting at $x$ for (complex) time $z = 0$ comes back to $x$ at the time $z = -i\beta$. (Here and in the following we will set $\hbar = 1$.) To evaluate the matrix element it is useful to split up the evolution from $z = 0$ to $z = -i\beta$ into $P$ small segments $\Delta_j$ along a contour $\mathcal{C}$ in the complex time plane, i.e.

$$
\sum_{j=1}^{P} \Delta_j = -i\beta \quad \text{such that} \quad e^{-\beta H} = \prod_{j=1}^{P} e^{-i\Delta_j H}. \tag{3.3}
$$

In the simplest case one just splits the interval $[0, -i\beta]$ into $P$ equal slices of length $\epsilon = \beta/P$. The more general approach outlined above becomes necessary when one considers real– or general complex–time correlation functions [35]. After breaking up the evolution into $P$ so–called Trotter slices we can use the completeness relation for the position eigenstates to get

$$
Z(\beta) = \int dx_0 \ldots \int dx_P\, \delta(x_0 - x_P)\, \langle x_P| e^{-i\Delta_P H} |x_{P-1}\rangle \ldots \langle x_1| e^{-i\Delta_1 H} |x_0\rangle. \tag{3.4}
$$

Evaluation of the Short–Time Propagator

Though eq. (3.4) seems rather complicated at first sight, it has the advantage that we can use the identity

$$
e^{-i\Delta_j H} = e^{-i\Delta_j K} e^{-i\Delta_j V} + O(\Delta_j^2) \tag{3.5}
$$

to evaluate the remaining matrix elements up to a Trotter error of at most $O(\Delta_j^2)$. In the analytical derivation of the path integral we will consider the limit

$$
\lim_{P \to \infty} \max_{j=1}^{P} |\Delta_j| = 0 \tag{3.6}
$$

in which the terms $O(\Delta_j^2)$ vanish and our calculations become exact. Eqs. (3.4) and (3.5) are also the basis for numerical evaluations by the path integral Monte Carlo method where the necessarily finite Trotter number $P$ leads to systematic errors that will be discussed in detail in chapter 4. Using eq. (3.5) we get up to order $O(\Delta_j^2)$

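$$
\begin{aligned}
\langle x_j| e^{-i\Delta_j H} |x_{j-1}\rangle &\doteq \langle x_j| e^{-i\Delta_j \frac{p^2}{2m}} |x_{j-1}\rangle\, e^{-i\Delta_j V(x_{j-1})} \\
&= e^{-i\Delta_j V(x_{j-1})} \int dp\, \langle x_j|p\rangle\, e^{-i\Delta_j \frac{p^2}{2m}}\, \langle p|x_{j-1}\rangle \\
&= e^{-i\Delta_j V(x_{j-1})} \int \frac{dp}{2\pi} \exp\left\{ -i\Delta_j \frac{p^2}{2m} + ip(x_j - x_{j-1}) \right\}.
\end{aligned} \tag{3.7}
$$

Evaluating the Gaussian integral over the momentum $p$ one gets

$$
\langle x_j| e^{-i\Delta_j H} |x_{j-1}\rangle \doteq \left[ \frac{m}{2\pi i \Delta_j} \right]^{1/2} \exp\left\{ i\Delta_j \left[ \frac{m}{2} \left( \frac{x_j - x_{j-1}}{\Delta_j} \right)^2 - V(x_{j-1}) \right] \right\}. \tag{3.8}
$$

Inserting this result for the matrix elements into eq. (3.4) the partition function takes the form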

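$$
Z(\beta) = \left[ \frac{m^P}{(2\pi i)^P \prod_j \Delta_j} \right]^{1/2} \int dx_0 \ldots \int dx_P\, \delta(x_0 - x_P) \exp\left\{ i \sum_{j=1}^{P} \Delta_j \left[ \frac{m}{2} \left( \frac{x_j - x_{j-1}}{\Delta_j} \right)^2 - V(x_j) \right] \right\}. \tag{3.9}
$$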
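The quadratic scaling of the Trotter error in eq. (3.5) can be checked numerically for imaginary–time slices $\Delta_j = -i\tau$. The sketch below is an illustration under stated assumptions and not part of the derivation: it uses a toy model (harmonic potential, $m = 1$, kinetic energy as a finite–difference matrix on a small grid), and the names `exp_hermitian` and `trotter_error` are hypothetical.

```python
import numpy as np

def exp_hermitian(h, tau):
    """Return exp(-tau * h) for a real symmetric matrix h via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-tau * w)) @ v.T

def trotter_error(tau, n=40, box=8.0):
    """Spectral-norm difference between exp(-tau*(K+V)) and exp(-tau*K) exp(-tau*V)."""
    x = np.linspace(-box / 2, box / 2, n)
    dx = x[1] - x[0]
    # Assumed toy model: finite-difference kinetic energy with m = 1 ...
    k = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / (2 * dx**2)
    # ... and a harmonic potential V(x) = x^2 / 2 on the grid.
    v = np.diag(0.5 * x**2)
    exact = exp_hermitian(k + v, tau)
    split = exp_hermitian(k, tau) @ exp_hermitian(v, tau)
    return np.linalg.norm(exact - split, 2)

# Halving the slice length should reduce the error by roughly a factor of four.
ratio = trotter_error(1e-3) / trotter_error(5e-4)
print(ratio)
```

The leading error term is proportional to the commutator $[K, V]$, so the observed ratio only approaches four once $\tau$ is small compared to the inverse of the largest energy on the grid.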

Continuum Limit of the Path Integral

In the limit $P \to \infty$ the points $x_0, \ldots, x_P$ form a continuous path $x(z)$, the Riemann sum in the exponent becomes an integral along the contour $\mathcal{C}$ and the differential quotient $(x_j - x_{j-1})/\Delta_j$ can be written as a time derivative such that our result can be simplified to

$$
Z(\beta) = \int_{x(0) = x(-i\beta)} \mathcal{D}x\; e^{iS[x]} \quad \text{with} \quad S[x] = \int_{\mathcal{C}} dz \left[ \frac{m}{2} \dot{x}^2(z) - V(x(z)) \right] \tag{3.10}
$$

where $\dot{x}(z)$ denotes the derivative with respect to complex time $z$ and the prefactor has been included in the path integral measure $\mathcal{D}x$.

Imaginary–Time Path Integral

For the calculation of the partition function, equilibrium expectation values or even imaginary–time correlation functions, the simplest choice for the integration contour $\mathcal{C}$ is the straight line connecting $z = 0$ and $z = -i\beta$. It is usually parameterized by the imaginary time $\tau = iz$ with values in $[0, \beta]$. The expression for the partition function in terms of the imaginary time is

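$$
Z(\beta) = \int_{x(0) = x(\beta)} \mathcal{D}x(\tau)\; e^{-S^E[x(\tau)]} \tag{3.11}
$$

with the so–called Euclidean action

$$
S^E[x(\tau)] = \int_0^\beta d\tau \left[ \frac{m}{2} \dot{x}^2(\tau) + V(x(\tau)) \right] \tag{3.12}
$$

where $\dot{x}$ now denotes the derivative of the path with respect to $\tau$.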
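As a sanity check of the discretized expression (3.4) with the short–time propagator (3.8) on the imaginary–time contour, one can evaluate the $P$–fold integral for a harmonic oscillator by replacing each $\int dx_j$ with a sum over a position grid. This is only a sketch under assumptions that are not in the text: the potential $V(x) = m\omega^2 x^2/2$, equal slices $\Delta_j = -i\beta/P$, the grid parameters, and the function name `partition_function` are all chosen for illustration.

```python
import numpy as np

def partition_function(beta, p, omega=1.0, m=1.0, box=12.0, n=241):
    """Z(beta) from P imaginary-time slices, with x-integrals done as grid sums."""
    x = np.linspace(-box / 2, box / 2, n)
    dx = x[1] - x[0]
    eps = beta / p                                   # slice length, Delta_j = -i * eps
    diff = x[:, None] - x[None, :]                   # x_j - x_{j-1}
    potential = 0.5 * m * omega**2 * x[None, :]**2   # V(x_{j-1}) as in eq. (3.8)
    # Short-time propagator (3.8) evaluated at Delta_j = -i * eps.
    kernel = np.sqrt(m / (2 * np.pi * eps)) * np.exp(-m * diff**2 / (2 * eps) - eps * potential)
    transfer = kernel * dx                           # include the integration weight
    # The delta function in (3.4) closes the chain of propagators into a trace.
    return np.trace(np.linalg.matrix_power(transfer, p))

beta = 2.0
z_exact = 1.0 / (2.0 * np.sinh(beta / 2.0))          # known result for omega = 1
print(partition_function(beta, p=8), partition_function(beta, p=32), z_exact)
```

Increasing the Trotter number $P$ moves the numerical value towards the exact partition function, which illustrates the systematic error of a finite discretization.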

3.2 Second Quantization

This section just fixes the notation and reminds the reader of the definitions of the Fermion Fock space and the annihilation and creation operators that are needed in the following. For a more detailed description of the method of second quantization we refer to [32] or standard textbooks like [36]. In single particle quantum mechanics the state of the system is represented as a vector in a Hilbert space $\mathcal{H}$. Usually the state vector is given as a linear combination of basis vectors $\mathcal{B} = \{ |\alpha\rangle \,|\, \alpha \in \mathcal{I} \}$ where $\alpha$ is a (multi)index from a complete set of quantum numbers $\mathcal{I}$.

The Fermion Fock Space

We can build up the Hilbert space $\mathcal{H}_N$ for $N$ particles as the direct product of $N$ single particle Hilbert spaces $\mathcal{H}$, i.e.

$$
\mathcal{H}_N = \bigotimes_{i=1}^{N} \mathcal{H}. \tag{3.13}
$$

If $\mathcal{B}$ is an orthonormal basis of $\mathcal{H}$ then the states $|\alpha_1 \ldots \alpha_N) = |\alpha_1\rangle \otimes \ldots \otimes |\alpha_N\rangle$ with $\alpha_i \in \mathcal{I}$ form an orthonormal basis $\mathcal{B}_N$ of $\mathcal{H}_N$.

The Pauli principle for Fermions asserts that all physical states are totally antisymmetric under exchange of particles. An orthonormal basis for the state space $\mathcal{F}_N$ of $N$ Fermions is thus given by the states

$$
|\alpha_1 \ldots \alpha_N\rangle \equiv \mathcal{P}_F |\alpha_1 \ldots \alpha_N) \equiv \frac{1}{\sqrt{N!}} \sum_{P \in S_N} (-1)^P |\alpha_{P1} \ldots \alpha_{PN}) \tag{3.14}
$$

where $\mathcal{P}_F$ is the projection onto antisymmetric states and the sum runs over the set $S_N$ of all possible permutations $P$ of $N$ particles. The factor $(-1)^P$ denotes the sign of the permutation $P$. The Fock space $\mathcal{F}$ for a many–Fermion system can now be defined as the sum of all finite particle state spaces $\mathcal{F}_N$ and the zero particle space $\mathcal{F}_0$ containing just the vacuum state $|0\rangle$:

$$
\mathcal{F} = \bigoplus_{N=0}^{\infty} \mathcal{F}_N. \tag{3.15}
$$

Creation and Annihilation Operators

In the method of second quantization the quantum mechanical description of a physical system is formulated by operators acting on the Fock space. Of fundamental importance are the creation and annihilation operators $a^\dagger_\alpha$ and $a_\alpha$, respectively. While the operator $a^\dagger_\alpha$ creates an additional electron in the one-particle state $|\alpha\rangle$, i.e.

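$$
a^\dagger_\alpha |\alpha_1 \ldots \alpha_N\rangle = \begin{cases} |\alpha\, \alpha_1 \ldots \alpha_N\rangle & \alpha \notin \{\alpha_1 \ldots \alpha_N\} \\ 0 & \alpha \in \{\alpha_1 \ldots \alpha_N\} \end{cases}, \tag{3.16}
$$

its adjoint $a_\alpha$ destroys a particle in this state, i.e.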

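$$
a_\alpha |\alpha_1 \ldots \alpha_N\rangle = \begin{cases} 0 & \alpha \notin \{\alpha_1 \ldots \alpha_N\} \\ \sum_{i=1}^{N} (-1)^{i-1} \delta_{\alpha\alpha_i} |\alpha_1 \ldots \hat{\alpha}_i \ldots \alpha_N\rangle & \alpha \in \{\alpha_1 \ldots \alpha_N\} \end{cases} \tag{3.17}
$$

where $\hat{\alpha}_i$ indicates that $\alpha_i$ has been removed from the many-particle state. The sign factor is fixed by eq. (3.16): since $a^\dagger_\alpha$ adds the new state in the first position, removing the first state ($i = 1$) carries no sign. An important property of the annihilation and creation operators for fermions are the anticommutation relations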

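$$
[a_\alpha, a^\dagger_\beta]_+ \equiv a_\alpha a^\dagger_\beta + a^\dagger_\beta a_\alpha = \delta_{\alpha\beta} \tag{3.18}
$$

$$
[a_\alpha, a_\beta]_+ = [a^\dagger_\alpha, a^\dagger_\beta]_+ = 0 \tag{3.19}
$$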
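The anticommutation relations (3.18) and (3.19) can be verified with explicit matrices for a small number of one–particle states. The sketch below uses the Jordan–Wigner construction, which is a standard representation but is an assumption of this illustration and is not used in the text; the function names are hypothetical.

```python
import numpy as np
from functools import reduce

def annihilation_operators(n_modes):
    """Jordan-Wigner matrices a_1 ... a_n acting on the 2^n-dimensional Fock space."""
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # maps occupied |1> to empty |0>
    sign = np.diag([1.0, -1.0])                  # (-1)^n string that enforces antisymmetry
    one = np.eye(2)
    ops = []
    for j in range(n_modes):
        factors = [sign] * j + [lower] + [one] * (n_modes - j - 1)
        ops.append(reduce(np.kron, factors))
    return ops

def anticommutator(a, b):
    return a @ b + b @ a

ops = annihilation_operators(3)
dim = ops[0].shape[0]
for i, a_i in enumerate(ops):
    for j, a_j in enumerate(ops):
        # eq. (3.18): [a_i, a_j^dagger]_+ = delta_ij
        assert np.allclose(anticommutator(a_i, a_j.T), (i == j) * np.eye(dim))
        # eq. (3.19): [a_i, a_j]_+ = 0
        assert np.allclose(anticommutator(a_i, a_j), 0.0)
```

The matrices are real, so the adjoint reduces to the transpose in this sketch.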

3.3 Grassmann Algebra

This section summarizes the necessary mathematical background about the Grassmann algebra. We will not only state the definition and important results but also present some derivations in more detail to demonstrate how actual calculations with Grassmann variables are performed. Further information about the Grassmann algebra can be found in [37].

3.3.1 Motivation and Definition of the Grassmann Algebra

Motivation

Before we embark upon the more mathematical and formal aspects of Grassmann algebra it is surely helpful to give a little motivation why we need to take recourse to Grassmann variables in order to describe Fermionic systems. In the context of Bosonic systems there exists a particularly important basis of Fock space built up of so–called coherent states which can be defined as eigenstates of the annihilation operators, i.e. they fulfill

$$
a_\alpha |\zeta\rangle = \zeta_\alpha |\zeta\rangle, \quad \zeta_\alpha \in \mathbb{C},\ \alpha \in \mathcal{I} \tag{3.20}
$$

with complex eigenvalues $\zeta_\alpha$. We will study the properties of (Fermion) coherent states in more detail in the next section. At this point we merely want to note how the attempt to generalize the concept of coherent states to Fermionic systems leads automatically to the generalized Fock space with coefficients from a Grassmann algebra. If $|\zeta\rangle$ denotes a Fermion coherent state we can deduce from the eigenvalue condition (3.20) and the anticommutation relation (3.19)

$$
\zeta_\alpha \zeta_\beta |\zeta\rangle = a_\alpha a_\beta |\zeta\rangle = -a_\beta a_\alpha |\zeta\rangle = -\zeta_\beta \zeta_\alpha |\zeta\rangle \quad \Rightarrow \quad \zeta_\alpha \zeta_\beta = -\zeta_\beta \zeta_\alpha. \tag{3.21}
$$

Since real or complex numbers commute, this argument shows that we have to extend the usual complex Fock space to allow for anticommuting numbers as coefficients. As we will see this can be done by choosing the eigenvalues $\zeta_\alpha$ as elements of a Grassmann algebra.

Definition of the Grassmann Algebra

To construct a Grassmann algebra one usually starts with a complex vector space $E$ spanned by the so–called generators $\{ \zeta_\alpha \,|\, \alpha = 1, \ldots, n \}$. One defines a formal multiplication of elements of $E$ and postulates the anticommutation rule

$$
\zeta_\alpha \zeta_\beta + \zeta_\beta \zeta_\alpha = 0 \quad \text{for all } \alpha, \beta \tag{3.22}
$$

for the generators. In particular one demands $\zeta_\alpha^2 = 0$ for all $\alpha$. Up to a sign factor there are $2^n$ distinct products of generators $\{ 1, \zeta_\alpha, \zeta_{\alpha_1}\zeta_{\alpha_2}, \ldots, \zeta_{\alpha_1} \ldots \zeta_{\alpha_n} \}$ where by convention $\alpha_1 < \alpha_2 < \ldots < \alpha_n$. The Grassmann algebra $\mathcal{G}(E)$ corresponding to $E$ is the $2^n$-dimensional vector space of complex linear combinations of these products.
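The counting of basis products and the sign rule (3.22) can be made concrete with a few lines of code. In this minimal sketch a product of generators is represented by the tuple of its indices in increasing order; this representation and the function names are assumptions made for illustration only.

```python
from itertools import combinations

def basis_monomials(n):
    """All ordered products of distinct generators zeta_1 ... zeta_n as index tuples."""
    return [c for r in range(n + 1) for c in combinations(range(1, n + 1), r)]

def multiply(a, b):
    """Product of two ordered monomials: returns (sign, ordered monomial), sign 0 if it vanishes."""
    if set(a) & set(b):
        return 0, ()                       # a repeated generator gives zeta^2 = 0
    merged = a + b
    # Each pair that is out of order costs one anticommutation, i.e. one factor -1.
    inversions = sum(merged[p] > merged[q]
                     for p in range(len(merged)) for q in range(p + 1, len(merged)))
    return (-1) ** inversions, tuple(sorted(merged))

print(len(basis_monomials(3)))             # number of basis elements for n = 3
print(multiply((1,), (2,)), multiply((2,), (1,)), multiply((1,), (1,)))
```

The empty tuple stands for the unit element, so the list returned by `basis_monomials(n)` has one entry for every subset of the $n$ generators.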

Conjugation of Grassmann Variables

For an even number $n = 2p$ of generators one can define a conjugation operation on the Grassmann algebra. One selects a subset of $p$ generators $\zeta_\alpha$ and associates with each of them one of the remaining $p$ generators, which are usually denoted by $\zeta^*_\alpha$. The conjugation is defined for the generators by $(\zeta_\alpha)^* \equiv \zeta^*_\alpha$ and $(\zeta^*_\alpha)^* \equiv \zeta_\alpha$ and can be extended to all elements of the Grassmann algebra by the requirements

$$
(\zeta_{\alpha_1} \ldots \zeta_{\alpha_j})^* = \zeta^*_{\alpha_j} \ldots \zeta^*_{\alpha_1} \tag{3.23}
$$

$$
(\lambda x + \mu y)^* = \lambda^* x^* + \mu^* y^* \quad \forall\, \lambda, \mu \in \mathbb{C},\ x, y \in \mathcal{G}(E). \tag{3.24}
$$

3.3.2 Calculus for Grassmann Variables

From the formal definition of the Grassmann algebra we want to proceed to the rules of calculus that will be applied in the path integral calculations for fermions.

Grassmann Functions

Since $x^2 = 0$ for any element $x$ of the vector space $E$ spanned by the generators, the Taylor series of a Grassmann function$^1$ $f : E \to \mathcal{G}(E)$ consists only of two terms, i.e.

$$
f(x) = f_0 + x f_1 \quad \text{with } f_0, f_1 \in \mathcal{G}(E). \tag{3.25}
$$

A particularly important example is $e^x = 1 + x$ for $x \in E$. For functions of two variables $A : E \times E \to \mathcal{G}(E)$ one gets analogously

$$
A(x, y) = a_0 + x a_1 + y \bar{a}_1 + xy\, a_{12} \quad \text{with } a_0, a_1, \bar{a}_1, a_{12} \in \mathcal{G}(E). \tag{3.26}
$$

Generalizations to more variables are straightforward but are physically relevant only if many–body interactions have to be considered.

Differentiation of Grassmann Functions

Grassmann numbers are in general not invertible and it is not possible to define a quotient of Grassmann variables. In particular the derivative of a Grassmann function cannot be defined as a differential quotient. Due to the simple polynomial form of Grassmann functions one can still define the derivative as a linear operation in analogy to complex differentiation by setting

$$
\frac{\partial}{\partial x} 1 = 0 \quad \text{and} \quad \frac{\partial}{\partial x} x = 1. \tag{3.27}
$$

In the evaluation of derivatives one has to anticommute x through until it is adjacent to ∂/∂x. As an illustration we give the derivative of a general function of one and two variables

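$$
\frac{\partial}{\partial x} f(x) = \frac{\partial}{\partial x} (f_0 + x f_1) = f_1 \tag{3.28}
$$

$$
\frac{\partial}{\partial x} A(x, y) = \frac{\partial}{\partial x} (a_0 + x a_1 + y \bar{a}_1 + xy\, a_{12}) = a_1 + y\, a_{12} \tag{3.29}
$$

$$
\frac{\partial}{\partial y} A(x, y) = \frac{\partial}{\partial y} (a_0 + x a_1 + y \bar{a}_1 - yx\, a_{12}) = \bar{a}_1 - x\, a_{12} \tag{3.30}
$$

$$
\frac{\partial}{\partial y} \frac{\partial}{\partial x} A(x, y) = a_{12} = -\frac{\partial}{\partial x} \frac{\partial}{\partial y} A(x, y). \tag{3.31}
$$

In particular one should note that also the derivatives with respect to Grassmann variables anticommute.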

$^1$ Though this property is often stated for "functions on the Grassmann algebra", it is not fulfilled for every function $f : \mathcal{G}(E) \to \mathcal{G}(E)$ since $x^2 = 0$ does not hold for every $x \in \mathcal{G}(E)$. Quite generally the term Grassmann number as used in the physical literature often refers just to the generators or the elements of $E$.

Integration of Grassmann Functions

Even more important for the practical calculations than the Grassmann derivative is the integral of a Grassmann function. Unfortunately, on a Grassmann algebra there exists no order relation, i.e. one cannot speak of large or small Grassmann numbers. In particular it is not possible to define the integral of a Grassmann function as the limit of a Riemann sum. As in the case of the derivative we define the integral as a linear operation by the settings

$$
\int dx\, 1 = 0 \quad \text{and} \quad \int dx\, x = 1 \tag{3.32}
$$

and the requirement that the integration variable $x$ has to be anticommuted through until it is adjacent to $dx$. At first sight it might seem strange that Grassmann derivative and Grassmann integration are the same, but the definitions are motivated by the fact that with these conventions many results for Fermion coherent states and coherent state path integrals look very similar to the corresponding results for Bosons. For the integrals of a general function of one and two variables we can copy the results of eqs. (3.28)–(3.31)

$$
\int dx\, f(x) = f_1 \tag{3.33}
$$

$$
\int dx\, A(x, y) = a_1 + y\, a_{12} \quad \text{and} \quad \int dy\, A(x, y) = \bar{a}_1 - x\, a_{12} \tag{3.34}
$$

$$
\int dy \int dx\, A(x, y) = a_{12} = -\int dx \int dy\, A(x, y). \tag{3.35}
$$

3.3.3 Important Integration Formulas

Apart from the defining equations it will be useful for the practical calculations to derive some general formulas for Grassmann integration. In this subsection we will give the representation of the Grassmann δ–function which allows us to write some results in a more elegant way. We also derive the transformation law for a Grassmann integral under a change of variables and apply this law to the evaluation of Gaussian integrals that appear in the context of Hubbard–Stratonovich transformations and in the exact integration of path integrals for non–interacting particles.

The Grassmann δ–Function

The Grassmann δ–function can be represented as

$$
\delta(x, x') \equiv \int d\eta\, e^{-\eta (x - x')} = \int d\eta \left[ 1 - \eta (x - x') \right] = -(x - x'). \tag{3.36}
$$

We can easily verify that $\delta(x, x')$ has the desired properties by evaluating the integral with an arbitrary function $f : E \to \mathcal{G}(E)$

$$
\int dx'\, \delta(x, x') f(x') = -\int dx' \left[ x f_0 - x' f_0 - x' x f_1 \right] = f_0 + x f_1 = f(x). \tag{3.37}
$$

Change of Variables in Grassmann Integrals

Another important result about Grassmann integrals is the transformation law under a change of variables that shall be illustrated for a function of two variables. We consider the linear transform $x = M\eta$ described by the complex $2 \times 2$ matrix $M$. Since only terms bilinear in $\eta_1$ and $\eta_2$ survive the double integration a direct calculation gives

$$
\begin{aligned}
\int d\eta_1 \int d\eta_2\, A(M_{11}\eta_1 + M_{12}\eta_2,\, M_{21}\eta_1 + M_{22}\eta_2) &= -M_{11} M_{22}\, a_{12} + M_{12} M_{21}\, a_{12} \\
&= -(\det M)\, a_{12}.
\end{aligned} \tag{3.38}
$$

After a change of variables to $x$ we get the integral

$$
\int dx_1 \int dx_2\, J A(x_1, x_2) = -J a_{12} \tag{3.39}
$$

and conclude that

$$
J = \det M = \left| \frac{\partial(x_1, x_2)}{\partial(\eta_1, \eta_2)} \right| \tag{3.40}
$$

which is just the inverse of the usual Jacobian in the transformation law for complex integration. In summary we get the transformation law

$$
\int d\eta\, A(M\eta) = \det(M) \int dx\, A(x) \tag{3.41}
$$

for the change of variables $x = M\eta$ which holds also in the general case of more than two variables.

Gaussian Integrals

We conclude this section with the application of the transformation law to the calculation of Gaussian integrals. First we will consider the Hubbard–Stratonovich transform

$$
e^{\frac{1}{2} J^T A J} = [\det A]^{-\frac{1}{2}} \int \frac{d^n x}{(2\pi)^{n/2}}\, e^{-\frac{1}{2} x^T A^{-1} x + x^T J} \tag{3.42}
$$

where $x$ is a real vector and $A$ is a real symmetric positive definite matrix. There is no restriction on the vector $J$ which might contain complex numbers or even elements of a Grassmann algebra. In applications the Hubbard–Stratonovich transform is used to decouple the direct interaction in the quadratic term on the l.h.s. of eq. (3.42) by the introduction of a real valued auxiliary field $x$ that couples only linearly to $J$.

To prove this equality one diagonalizes $A^{-1}$ by an orthogonal transform $O$, i.e. $A^{-1} = O^T \mathrm{diag}(a_k^{-1})\, O$ with the eigenvalues $a_k$ of $A$, and changes the integration variables to $y = Ox$. For the integral on the r.h.s. of eq. (3.42) one gets an $n$–fold simple Gaussian integral that can be integrated to yield

$$
\begin{aligned}
\int \frac{d^n x}{(2\pi)^{n/2}}\, e^{-\frac{1}{2} x^T A^{-1} x + x^T J} &= \prod_k \left( \int \frac{dy_k}{\sqrt{2\pi}}\, e^{-\frac{y_k^2}{2a_k} + y_k (OJ)_k} \right) \\
&= \prod_k \left( \int \frac{dy_k}{\sqrt{2\pi}}\, e^{-\frac{1}{2a_k} \left[ y_k^2 - 2 y_k a_k (OJ)_k + (J^T O^T)_k a_k^2 (OJ)_k \right]} \right) e^{\frac{1}{2} J^T A J} \\
&= \left( \prod_k \sqrt{a_k} \right) e^{\frac{1}{2} J^T A J} = [\det A]^{\frac{1}{2}}\, e^{\frac{1}{2} J^T A J}
\end{aligned} \tag{3.43}
$$

which proves the proposition.

The counterpart of the Hubbard–Stratonovich transform for real (or complex) numbers is the following identity for Grassmann vectors $x, y, \eta, \zeta \in E^n$

$$
\int d^n x^*\, d^n y\; e^{-x^\dagger H y + x^\dagger \eta + \zeta^\dagger y} = \det(H)\, e^{\zeta^\dagger H^{-1} \eta} \tag{3.44}
$$

where $H$ is a hermitian but not necessarily positive definite matrix. To prove this formula we first calculate the simple Gaussian Grassmann integral for $x^*, y \in E$, $a \in \mathbb{C}$

$$
\int dx^* \int dy\; e^{-x^* a y} = \int dx^* \int dy\, [1 - x^* a y] = a. \tag{3.45}
$$

Now we calculate the integral on the l.h.s. of eq. (3.44) taking into account that $H$ can be diagonalized by a unitary transform $U$, i.e. $H = U^\dagger \mathrm{diag}(h_k)\, U$ with $h_k \in \mathbb{R}$. Defining $X^\dagger = x^\dagger U^\dagger$ and $Y = Uy$ one gets

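$$
\begin{aligned}
&\prod_k \int dX_k^*\, dY_k\; e^{-X_k^* h_k Y_k + X_k^* (U\eta)_k + (\zeta^\dagger U^\dagger)_k Y_k} \\
&\quad = \prod_k \int dX_k^*\, dY_k\; e^{-X_k^* h_k Y_k + X_k^* (U\eta)_k + (\zeta^\dagger U^\dagger)_k Y_k - (\zeta^\dagger U^\dagger)_k h_k^{-1} (U\eta)_k + (\zeta^\dagger U^\dagger)_k h_k^{-1} (U\eta)_k} \\
&\quad = \left( \prod_k \int dX_k^*\, dY_k\; e^{-\left( X_k^* - (\zeta^\dagger U^\dagger)_k h_k^{-1} \right) h_k \left( Y_k - h_k^{-1} (U\eta)_k \right)} \right) e^{\zeta^\dagger H^{-1} \eta} \\
&\quad = \left( \prod_k h_k \right) e^{\zeta^\dagger H^{-1} \eta} = [\det H]\, e^{\zeta^\dagger H^{-1} \eta}.
\end{aligned} \tag{3.46}
$$

In the application of formulas (3.42) and (3.44) in the later chapters we omit the transposition of vectors and write $*$ instead of $\dagger$ since it usually is evident from the context whether the vector has to be interpreted as a column or a row.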
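The rules of section 3.3.2 and the Gaussian integral (3.44) without source terms ($\eta = \zeta = 0$) can be checked with a small symbolic implementation. The following is a sketch under assumptions that are not part of the text: an element of the algebra is stored as a dictionary from ordered index tuples to coefficients, the generators $x^*_k$ and $y_k$ are given the indices $2k$ and $2k+1$, and the measure is taken as $\prod_k dx^*_k\, dy_k$ with the innermost integral performed first. The class and function names are hypothetical.

```python
import itertools
import numpy as np

class Grassmann:
    """Element of a Grassmann algebra: {ordered tuple of generator indices: coefficient}."""
    def __init__(self, terms=None):
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}

    @staticmethod
    def generator(i):
        return Grassmann({(i,): 1.0})

    def __add__(self, other):
        out = dict(self.terms)
        for k, v in other.terms.items():
            out[k] = out.get(k, 0) + v
        return Grassmann(out)

    def __mul__(self, other):
        if not isinstance(other, Grassmann):          # multiplication by a number
            return Grassmann({k: v * other for k, v in self.terms.items()})
        out = {}
        for (ka, va), (kb, vb) in itertools.product(self.terms.items(), other.terms.items()):
            if set(ka) & set(kb):
                continue                              # repeated generator: zeta^2 = 0
            merged = ka + kb
            inversions = sum(merged[p] > merged[q]
                             for p in range(len(merged)) for q in range(p + 1, len(merged)))
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inversions * va * vb
        return Grassmann(out)

    def integrate(self, i):
        """Integral over generator i: anticommute it to the front, then remove it, eq. (3.32)."""
        out = {}
        for k, v in self.terms.items():
            if i in k:
                p = k.index(i)
                key = k[:p] + k[p + 1:]
                out[key] = out.get(key, 0) + (-1) ** p * v
        return Grassmann(out)

def grassmann_exp(x, max_order):
    """Exponential series of an even element; it terminates after max_order terms."""
    result = Grassmann({(): 1.0})
    term = Grassmann({(): 1.0})
    for n in range(1, max_order + 1):
        term = (term * x) * (1.0 / n)
        result = result + term
    return result

def gaussian_integral(h):
    """Integral of exp(-x^dagger H y) over all x*_k and y_k, cf. eq. (3.44) with eta = zeta = 0."""
    n = h.shape[0]
    xs = [Grassmann.generator(2 * k) for k in range(n)]
    ys = [Grassmann.generator(2 * k + 1) for k in range(n)]
    exponent = Grassmann()
    for i in range(n):
        for j in range(n):
            exponent = exponent + (xs[i] * ys[j]) * (-h[i, j])
    f = grassmann_exp(exponent, n)
    for k in reversed(range(n)):
        f = f.integrate(2 * k + 1).integrate(2 * k)   # inner integral over y_k, then x*_k
    return f.terms.get((), 0.0)

h = np.array([[2.0, 1.0], [1.0, 3.0]])
print(gaussian_integral(h), np.linalg.det(h))
```

For a $1 \times 1$ matrix the function reproduces eq. (3.45), and for larger matrices it can be compared with a numerical determinant as in the last line.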

3.4 Fermion Coherent States

In second quantization the quantum mechanics of a Fermionic system is described by creation operators $a^\dagger_\alpha$ and annihilation operators $a_\alpha$ acting on the Fermion Fock space $\mathcal{F}$ where the one–particle states denoted by the quantum numbers $\alpha$ form a basis of the Hilbert space $\mathcal{H}$ for a single particle.

3.4.1 Definition of Fermion Coherent States

The Generalized Fermion Fock Space

Since the eigenvalues of the annihilation operators have to fulfill the anticommutation relation (3.21) it is necessary to enlarge the Fock space. Associating a generator $\zeta_\alpha$ with each annihilation operator $a_\alpha$, and a generator $\zeta^*_\alpha$ with each creation operator $a^\dagger_\alpha$ one first defines the Grassmann algebra $\mathcal{G}$. The generalized Fock space $\mathcal{F}^*$ is then built up by all linear combinations of states in $\mathcal{F}$ with coefficients in the Grassmann algebra $\mathcal{G}$. One further stipulates that the generators $\zeta_\alpha$, $\zeta^*_\alpha$ anticommute with all creation and annihilation operators. For the adjoint one requires $(\tilde{\zeta} \tilde{a})^\dagger \equiv \tilde{a}^\dagger \tilde{\zeta}^*$ where $\tilde{a}$ is a creation or annihilation operator and $\tilde{\zeta}$ any generator.

Definition of Fermion Coherent States

The Fermion coherent states can be defined by

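$$
|\zeta\rangle \equiv e^{-\zeta a^\dagger} |0\rangle = \prod_\alpha \left( 1 - \zeta_\alpha a^\dagger_\alpha \right) |0\rangle. \tag{3.47}
$$

One should note that the product in the last expression is well defined and represents all terms in the exponential series since the factors $(1 - \zeta_\alpha a^\dagger_\alpha)$ commute with each other. To prove that $|\zeta\rangle$ is indeed an eigenstate of all annihilation operators we first note that

$$
a_\alpha \left( 1 - \zeta_\alpha a^\dagger_\alpha \right) |0\rangle = \zeta_\alpha |0\rangle = \zeta_\alpha \left( 1 - \zeta_\alpha a^\dagger_\alpha \right) |0\rangle. \tag{3.48}
$$

Taking into account that $a_\alpha$ as well as $\zeta_\alpha$ commutes with all factors $(1 - \zeta_\beta a^\dagger_\beta)$ for $\alpha \neq \beta$ we get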

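$$
a_\alpha |\zeta\rangle = \prod_{\beta \neq \alpha} \left( 1 - \zeta_\beta a^\dagger_\beta \right) a_\alpha \left( 1 - \zeta_\alpha a^\dagger_\alpha \right) |0\rangle = \prod_{\beta \neq \alpha} \left( 1 - \zeta_\beta a^\dagger_\beta \right) \zeta_\alpha \left( 1 - \zeta_\alpha a^\dagger_\alpha \right) |0\rangle = \zeta_\alpha |\zeta\rangle. \tag{3.49}
$$

The adjoint of a coherent state $|\zeta\rangle$ is denoted by the bra vector $\langle\zeta|$ and can be given as

$$
\langle\zeta| = \langle 0|\, e^{-a \zeta^*} = \langle 0|\, e^{\zeta^* a} = \langle 0| \prod_\alpha \left( 1 + \zeta^*_\alpha a_\alpha \right). \tag{3.50}
$$

It is straightforward to prove that $\langle\zeta|$ is a left eigenvector of the creation operators $a^\dagger_\alpha$ with eigenvalue $\zeta^*_\alpha$, i.e. $\langle\zeta| a^\dagger_\alpha = \langle\zeta| \zeta^*_\alpha$.

3.4.2 Properties of Fermion Coherent States

Overlap of Coherent States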

In the rest of this section we will use the basic properties of the creation and annihilation operators as well as the rules of Grassmann algebra to derive a number of useful formulas for the Fermion coherent states starting with the simple calculation of the overlap of two coherent states. Since $(1 + \zeta^*_\alpha a_\alpha)$ and $(1 - \zeta'_\beta a^\dagger_\beta)$ commute for $\alpha \neq \beta$ we get

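$$
\langle\zeta|\zeta'\rangle = \langle 0| \prod_\alpha \left( 1 + \zeta^*_\alpha a_\alpha \right) \left( 1 - \zeta'_\alpha a^\dagger_\alpha \right) |0\rangle = \prod_\alpha \left( 1 + \zeta^*_\alpha \zeta'_\alpha \right) = e^{\zeta^* \zeta'}. \tag{3.51}
$$

Closure Relation for Coherent States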

A very important property of the coherent states is that every vector in the Fock space can be expressed as a linear combination of coherent states and the unity operator in $\mathcal{F}$ can be represented by the closure relation

$$
1 = \int d\zeta^* d\zeta\; e^{-\zeta^* \zeta}\, |\zeta\rangle \langle\zeta|. \tag{3.52}
$$

To prove this relation we calculate the matrix elements of the operator on the r.h.s. of eq. (3.52) for two arbitrary basis vectors $|\alpha_1 \ldots \alpha_n\rangle$ and $|\beta_1 \ldots \beta_m\rangle$ of the Fock space.

Using $\langle \alpha_1 \ldots \alpha_n|\zeta\rangle = \langle 0| a_{\alpha_n} \ldots a_{\alpha_1} |\zeta\rangle = \zeta_{\alpha_n} \ldots \zeta_{\alpha_1}$ and the corresponding adjoint relation it follows

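$$
\langle \alpha_1 \ldots \alpha_n| \int d\zeta^* d\zeta\; e^{-\zeta^* \zeta}\, |\zeta\rangle \langle\zeta| \beta_1 \ldots \beta_m\rangle = \int d\zeta^* d\zeta \prod_\alpha \left( 1 - \zeta^*_\alpha \zeta_\alpha \right) \zeta_{\alpha_n} \ldots \zeta_{\alpha_1}\, \zeta^*_{\beta_1} \ldots \zeta^*_{\beta_m}. \tag{3.53}
$$

In the integrations only terms bilinear in $\zeta^*_\alpha$ and $\zeta_\alpha$ for any $\alpha$ have to be considered since all others vanish due to the rules of Grassmann algebra. Thus the matrix element (3.53) is zero unless $n = m$ and there exists a permutation $P \in S_n$ with $\beta_i = P\alpha_i$, $i = 1, \ldots, n$. In that case we can reorder $\zeta^*_{\beta_1} \ldots \zeta^*_{\beta_m}$ to $\zeta^*_{\alpha_1} \ldots \zeta^*_{\alpha_n}$ which gives an additional factor $(-1)^P$. Noting that it takes an even number of anticommutations to bring $\zeta_{\alpha_i} \zeta^*_{\alpha_i}$ adjacent to $d\zeta^*_{\alpha_i} d\zeta_{\alpha_i}$ we finally get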

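$$
\begin{aligned}
\langle \alpha_1 \ldots \alpha_n| \int d\zeta^* d\zeta\; e^{-\zeta^* \zeta}\, |\zeta\rangle \langle\zeta| \beta_1 \ldots \beta_m\rangle &= \left( \prod_i \delta_{\beta_i, P\alpha_i} \right) (-1)^P \int d\zeta^* d\zeta \prod_\alpha \left( 1 - \zeta^*_\alpha \zeta_\alpha \right) \zeta_{\alpha_n} \ldots \zeta_{\alpha_1}\, \zeta^*_{\alpha_1} \ldots \zeta^*_{\alpha_n} \\
&= \begin{cases} (-1)^P & n = m \text{ and } \beta_i = P\alpha_i,\ P \in S_n \\ 0 & \text{otherwise} \end{cases}
\end{aligned} \tag{3.54}
$$

which is just the result one gets for the unity operator in the Fock space. Since this statement holds for arbitrary basis vectors we have proven the closure relation (3.52).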

Representation of the Trace by Coherent States

To calculate the partition function or the generating functional of some expectation value of the system we need a representation for the trace of an operator in terms of the Fermion coherent states. We will prove the following relation

$$
\mathrm{tr}\{A\} = \int d\zeta^* d\zeta\; e^{-\zeta^* \zeta}\, \langle -\zeta| A |\zeta\rangle = \int d\zeta\, d\zeta^*\; e^{\zeta^* \zeta}\, \langle\zeta| A |\zeta\rangle \tag{3.55}
$$

where the second equality follows from the transformation $-\zeta^*_\alpha \to \zeta^*_\alpha$. This convention destroys the conjugation relation of the Grassmann variables (that will not be used in the following) but it is a rather suitable way to incorporate the antisymmetry requirement into the derivation of the coherent state path integral. Using the definitions (3.47) and (3.50) of the coherent states we calculate explicitly the first Grassmann integral in eq. (3.55)

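$$
\begin{aligned}
&\int d\zeta^* d\zeta \prod_\alpha \left( 1 - \zeta^*_\alpha \zeta_\alpha \right) \langle 0| \prod_\beta \left( 1 - \zeta^*_\beta a_\beta \right) A \prod_\gamma \left( 1 - \zeta_\gamma a^\dagger_\gamma \right) |0\rangle \\
&\quad = \int d\zeta^* d\zeta\; \langle 0| \prod_\alpha \left( 1 - \zeta^*_\alpha \zeta_\alpha \right) \left( 1 - \zeta^*_\alpha a_\alpha \right) A \prod_\beta \left( 1 - \zeta_\beta a^\dagger_\beta \right) |0\rangle \\
&\quad = \langle 0| \prod_\alpha \int d\zeta^*_\alpha d\zeta_\alpha \left( 1 - \zeta^*_\alpha \zeta_\alpha - \zeta^*_\alpha a_\alpha \right) A \prod_\beta \left( 1 - \zeta_\beta a^\dagger_\beta \right) |0\rangle \\
&\quad = \langle 0| \prod_\alpha \int d\zeta_\alpha \left( \zeta_\alpha + a_\alpha \right) A \left[ 1 - \sum_\beta \zeta_\beta a^\dagger_\beta + \sum_{\beta_1, \beta_2} \zeta_{\beta_1} a^\dagger_{\beta_1} \zeta_{\beta_2} a^\dagger_{\beta_2} - \ldots \right] |0\rangle \\
&\quad = \langle 0| \prod_\alpha \int d\zeta_\alpha \left( \zeta_\alpha + a_\alpha \right) A |0\rangle - \sum_\beta \langle 0| \prod_\alpha \int d\zeta_\alpha \left( \zeta_\alpha + a_\alpha \right) A\, \zeta_\beta a^\dagger_\beta |0\rangle \\
&\qquad + \sum_{\beta_1, \beta_2} \langle 0| \prod_\alpha \int d\zeta_\alpha \left( \zeta_\alpha + a_\alpha \right) A\, \zeta_{\beta_1} a^\dagger_{\beta_1} \zeta_{\beta_2} a^\dagger_{\beta_2} |0\rangle - \ldots \\
&\quad = \langle 0| A |0\rangle + \sum_\alpha \langle \alpha| A |\alpha\rangle + \sum_{\alpha_1, \alpha_2} \langle \alpha_1 \alpha_2| A |\alpha_1 \alpha_2\rangle + \ldots \\
&\quad = \mathrm{tr}\{A\}.
\end{aligned} \tag{3.56}
$$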

In the evaluation of the integrals over $d\zeta_\alpha$ we have used the fact that only those terms give a non–vanishing contribution which contain $\zeta_\alpha$ exactly once for each value of $\alpha$. Calculating the integrals we get the trace of the operator $A$ as the sum of traces over the many–particle Hilbert spaces $\mathcal{F}_0, \mathcal{F}_1, \mathcal{F}_2, \ldots$.

3.5 Coherent State Path Integral

In this section we apply the formalism of Grassmann algebra and Fermion coherent states to give the general expression for the partition function or generating functional of a system in terms of a coherent state path integral. We start from the second quantized Hamiltonian $H$ in normal ordered form (i.e. all creation operators to the left of any annihilation operators). For the derivation we will consider the partition function $Z = \mathrm{tr}\{e^{-\beta H}\}$ and merely note that source terms for a generating functional may easily be included. Using the representation of the trace by coherent states (3.55) we can express the partition function as

$$
Z = \mathrm{tr}\left\{ e^{-\beta H} \right\} = \int d\mu(\zeta)\; e^{-\zeta^* \zeta}\, \langle -\zeta| e^{-\beta H} |\zeta\rangle \tag{3.57}
$$

where we have introduced the short hand notation $d\mu(\zeta) = \prod_\alpha d\zeta^*_\alpha d\zeta_\alpha$. As in the case of Feynman's path integral we split the propagation $\exp(-\beta H)$ along the complex time contour $\mathcal{C}$ from $z = 0$ to $z = -i\beta$ into small time steps $\Delta_j$ ($j = 1, \ldots, P$) with

$$
\sum_{j=1}^{P} \Delta_j = -i\beta \quad \text{such that} \quad e^{-\beta H} = \prod_{j=1}^{P} e^{-i\Delta_j H}. \tag{3.58}
$$

The partition function can now be calculated using the coherent state representation of the trace (3.55) and the closure relation (3.52)

$$
\begin{aligned}
Z = \mathrm{tr}\left\{ e^{-\beta H} \right\} &= \int d\zeta_0\, d\zeta^*_P\; e^{\zeta^*_P \zeta_0}\, \langle \zeta_P| e^{-\beta H} |\zeta_0\rangle \\
&= \int d\zeta_0\, d\mu(\zeta_P)\, \delta(\zeta_0 + \zeta_P)\; e^{-\zeta^*_P \zeta_P}\, \langle \zeta_P| \prod_{j=1}^{P} e^{-i\Delta_j H} |\zeta_0\rangle \\
&= \int d\zeta_0\, d\mu(\zeta_P) \ldots d\mu(\zeta_1)\, \delta(\zeta_0 + \zeta_P) \prod_{j=1}^{P} e^{-\zeta^*_j \zeta_j}\, \langle \zeta_j| e^{-i\Delta_j H} |\zeta_{j-1}\rangle \\
&= \int \mathcal{D}\mu(\zeta) \prod_{j=1}^{P} e^{-\zeta^*_j \zeta_j}\, \langle \zeta_j| e^{-i\Delta_j H} |\zeta_{j-1}\rangle
\end{aligned} \tag{3.59}
$$

where the multiple integral over the Grassmann variables at the different time steps along the contour and the δ-functions that enforce the antisymmetry relation $\zeta_0 = -\zeta_P$ have been combined to the path integral measure.

The next step in our derivation is the evaluation of the coherent state matrix elements. Since the Hamiltonian is already in normal ordered form we can simply expand the exponential function and use the eigenvalue relations for the annihilation and creation operators to replace $H = H[a^\dagger, a]$ by $H[\zeta^*_j, \zeta_{j-1}]$. For the matrix elements we thus get

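$$
\langle \zeta_j| e^{-i\Delta_j H} |\zeta_{j-1}\rangle = \langle \zeta_j| 1 - i\Delta_j H |\zeta_{j-1}\rangle + \mathcal{O}(\Delta_j^2) \tag{3.60}
$$

$$
\begin{aligned}
&= \langle \zeta_j|\zeta_{j-1}\rangle \left( 1 - i\Delta_j H[\zeta^*_j, \zeta_{j-1}] \right) + \mathcal{O}(\Delta_j^2) \\
&= e^{\zeta^*_j \zeta_{j-1}}\, e^{-i\Delta_j H[\zeta^*_j, \zeta_{j-1}]} + \mathcal{O}(\Delta_j^2).
\end{aligned} \tag{3.61}
$$

With this result the partition function can be rewritten as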

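$$
\begin{aligned}
Z &= \int \mathcal{D}\mu(\zeta) \prod_{j=1}^{P} e^{\zeta^*_j \zeta_{j-1} - \zeta^*_j \zeta_j}\, e^{-i\Delta_j H[\zeta^*_j, \zeta_{j-1}]} \\
&= \int \mathcal{D}\mu(\zeta)\, \exp\left\{ i \sum_{j=1}^{P} \Delta_j \left( \zeta^*_j\, i\, \frac{\zeta_j - \zeta_{j-1}}{\Delta_j} - H[\zeta^*_j, \zeta_{j-1}] \right) \right\}.
\end{aligned} \tag{3.62}
$$

When we consider the limit $P \to \infty$ and $\Delta_j \to 0$ the well known expressions for the Riemann sum and the differential quotient suggest to use the following notation to further simplify our result for the partition function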

$$
\sum_{j=1}^{P} \Delta_j [\ldots] \to \int_{\mathcal{C}} dz\, [\ldots] \tag{3.63}
$$

$$
\frac{\zeta_j - \zeta_{j-1}}{\Delta_j} \to \partial_z \zeta(z) \tag{3.64}
$$

$$
H[\zeta^*_j, \zeta_{j-1}] \to H[\zeta^*(z), \zeta(z)]. \tag{3.65}
$$

This completes the derivation of the coherent state path integral and allows us to express the partition function (or the generating functional) in the following elegant form

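$$
Z = \int \mathcal{D}\mu(\zeta)\, \exp\left( iS[\zeta^*, \zeta] \right) \tag{3.66}
$$

with the action $S$ given by

$$
S = \int_{\mathcal{C}} dz \left( \zeta^*\, i\partial_z \zeta - H[\zeta^*, \zeta] \right). \tag{3.67}
$$

3.6 Example: Non–Interacting Fermions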

3.6.1 The Partition Function

As a basic example for the application of coherent state path integrals we will consider the calculation of the partition function for non–interacting Fermionic quasiparticles. The Hamiltonian of this system is

$$
H = \sum_\alpha \epsilon_\alpha a^\dagger_\alpha a_\alpha \equiv a^\dagger \epsilon\, a \tag{3.68}
$$

where $\alpha$ might denote the wave vector $\vec{k}$ and the spin $\sigma$ of the quasiparticle. In the second equality we have again used vector notation and defined the diagonal matrix $\epsilon_{\alpha\alpha'} \equiv \epsilon_\alpha \delta_{\alpha\alpha'}$. The partition function (3.66) written out explicitly as in (3.62) is given by

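$$
\begin{aligned}
Z &= \int d\zeta_0\, d\mu(\zeta_P) \ldots d\mu(\zeta_1)\, \delta(\zeta_0 + \zeta_P)\; e^{-\sum_{j=1}^{P} \left[ \zeta^*_j \zeta_j - \zeta^*_j (1 - i\Delta_j \epsilon) \zeta_{j-1} \right]} \\
&= \int d\mu(\zeta)\; e^{-\zeta^* S \zeta}
\end{aligned} \tag{3.69}
$$

where we have eliminated the δ-functions integrating over $\zeta_0$. Further we have also included the time index $j$ in the matrix notation defining the action matrix $S_{j\alpha, j'\alpha'} \equiv S^\alpha_{j,j'} \delta_{\alpha,\alpha'}$ that is diagonal in $\alpha$ with blocks given by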

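$$
S^\alpha = \begin{pmatrix}
1 & 0 & \cdots & 0 & c_1^\alpha \\
-c_2^\alpha & \ddots & \ddots & & 0 \\
0 & \ddots & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & -c_P^\alpha & 1
\end{pmatrix} \quad \text{with} \quad c_j^\alpha = 1 - i\Delta_j \epsilon_\alpha. \tag{3.70}
$$

According to the result (3.44) for the Gaussian integral, the partition function is given by the determinant of $S$ which is just the product of determinants of the blocks $S^\alpha$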

$$
Z = \det S = \prod_\alpha \det[S^\alpha] \tag{3.71}
$$

each of which can be calculated via an expansion by minors along the first row leading to the well–known textbook result [32]

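$$
\begin{aligned}
Z &= \prod_\alpha \left[ 1 \cdot 1 + (-1)^{P+1} c_1^\alpha \prod_{j=2}^{P} (-c_j^\alpha) \right] \\
&= \prod_\alpha \left[ 1 + \prod_{j=1}^{P} (1 - i\Delta_j \epsilon_\alpha) \right] = \prod_\alpha \left( 1 + e^{-i\epsilon_\alpha \sum_{j=1}^{P} \Delta_j} \right) \\
&= \prod_\alpha \left( 1 + e^{-\beta \epsilon_\alpha} \right).
\end{aligned} \tag{3.72}
$$

3.6.2 The Thermal Green's Function

The thermal Green's function can be defined as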

$$
\mathcal{G}_{\alpha\alpha'}(z, z') = \left\langle \mathcal{T} \left[ a_\alpha(z)\, a^\dagger_{\alpha'}(z') \right] \right\rangle, \tag{3.73}
$$

where $\mathcal{T}$ denotes the contour–ordered product in which operators are ordered according to their time argument along the contour $\mathcal{C}$. For equal times $\mathcal{T}$ coincides with the normal–ordered product. For the imaginary–time contour $\mathcal{T}$ is just the ordering of operators with decreasing imaginary time. Let $z$ and $z'$ be two times of a given discretization, i.e.

$$ z = \sum_{j=1}^{k} \Delta_j \quad\text{and}\quad z' = \sum_{j=1}^{k'} \Delta_j . \tag{3.74} $$

Then the discrete path integral corresponding to eq. (3.73) is given by

$$ \mathcal{G}_{\alpha\alpha'}(z,z') = \frac{1}{Z} \int d\mu(\zeta)\; \zeta_{k,\alpha}\, \zeta^*_{k',\alpha'}\; e^{-\zeta^* S \zeta} . \tag{3.75} $$

Since the action is block–diagonal in $\alpha$, the Green's function vanishes for $\alpha \neq \alpha'$. Using eq. (3.71) and writing the factors $\zeta_{k,\alpha}$ and $\zeta^*_{k',\alpha}$ as derivatives of source terms added to the action we can express the Green's function as

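$$ \mathcal{G}_{\alpha\alpha'}(z,z') = \frac{\delta_{\alpha\alpha'}}{\det(S)} \left. \frac{\partial^2}{\partial J^*_{k,\alpha}\, \partial J_{k',\alpha}} \int d\mu(\zeta)\; e^{-\zeta^* S \zeta + J^* \zeta + \zeta^* J} \right|_{J = J^* = 0} . \tag{3.76} $$

According to eq. (3.44) we can perform the integral in the numerator to get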

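$$ \mathcal{G}_{\alpha\alpha'}(z,z') = \delta_{\alpha\alpha'} \left. \frac{\partial^2}{\partial J^*_{k,\alpha}\, \partial J_{k',\alpha}}\; e^{J^* S^{-1} J} \right|_{J = J^* = 0} = \delta_{\alpha\alpha'}\, \big( (S^\alpha)^{-1} \big)_{kk'} , \tag{3.77} $$

i.e. we just have to determine the appropriate matrix element of the inverse matrix $(S^\alpha)^{-1}$ corresponding to $S^\alpha$ as given by eq. (3.70). The inverse matrix can be computed by the rule [38]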

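$$ (S^\alpha)^{-1} = \frac{1}{\det(S^\alpha)}\, (\tilde{S}^\alpha)^T \quad\text{with}\quad (\tilde{S}^\alpha)_{i,j} = (-1)^{i+j} \begin{vmatrix} s_{1,1} & \cdots & s_{1,j-1} & s_{1,j+1} & \cdots & s_{1,P} \\ \vdots & & \vdots & \vdots & & \vdots \\ s_{i-1,1} & \cdots & s_{i-1,j-1} & s_{i-1,j+1} & \cdots & s_{i-1,P} \\ s_{i+1,1} & \cdots & s_{i+1,j-1} & s_{i+1,j+1} & \cdots & s_{i+1,P} \\ \vdots & & \vdots & \vdots & & \vdots \\ s_{P,1} & \cdots & s_{P,j-1} & s_{P,j+1} & \cdots & s_{P,P} \end{vmatrix} , \tag{3.78} $$

where $s_{i,j}$ are the elements of $S^\alpha$, i.e. $(\tilde{S}^\alpha)_{i,j}$ is the cofactor obtained by deleting row $i$ and column $j$. As a result for the matrix $(\tilde{S}^\alpha)^T$ one gets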

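$$ (\tilde{S}^\alpha)^T = \begin{pmatrix} 1 & -c_1^\alpha \prod\limits_{j=3}^{P} c_j^\alpha & -c_1^\alpha \prod\limits_{j=4}^{P} c_j^\alpha & \cdots & -c_1^\alpha c_{P-1}^\alpha c_P^\alpha & -c_1^\alpha c_P^\alpha & -c_1^\alpha \\ c_2^\alpha & 1 & -c_1^\alpha c_2^\alpha \prod\limits_{j=4}^{P} c_j^\alpha & \cdots & -c_1^\alpha c_2^\alpha c_{P-1}^\alpha c_P^\alpha & -c_1^\alpha c_2^\alpha c_P^\alpha & -c_1^\alpha c_2^\alpha \\ c_2^\alpha c_3^\alpha & c_3^\alpha & 1 & \cdots & -\prod\limits_{j=1}^{3} c_j^\alpha\, c_{P-1}^\alpha c_P^\alpha & -c_1^\alpha c_2^\alpha c_3^\alpha c_P^\alpha & -c_1^\alpha c_2^\alpha c_3^\alpha \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ \prod\limits_{j=2}^{P-2} c_j^\alpha & \prod\limits_{j=3}^{P-2} c_j^\alpha & \prod\limits_{j=4}^{P-2} c_j^\alpha & \cdots & 1 & -\prod\limits_{j=1}^{P-2} c_j^\alpha\, c_P^\alpha & -\prod\limits_{j=1}^{P-2} c_j^\alpha \\ \prod\limits_{j=2}^{P-1} c_j^\alpha & \prod\limits_{j=3}^{P-1} c_j^\alpha & \prod\limits_{j=4}^{P-1} c_j^\alpha & \cdots & c_{P-1}^\alpha & 1 & -\prod\limits_{j=1}^{P-1} c_j^\alpha \\ \prod\limits_{j=2}^{P} c_j^\alpha & \prod\limits_{j=3}^{P} c_j^\alpha & \prod\limits_{j=4}^{P} c_j^\alpha & \cdots & c_{P-1}^\alpha c_P^\alpha & c_P^\alpha & 1 \end{pmatrix} . \tag{3.79} $$

To determine the thermal Green's function we have to distinguish between the cases $k > k'$ and $k < k'$. To get the usual expression for $\mathcal{G}_{\alpha\alpha'}(z,z')$ we will also consider the continuum limit. In the case $k > k'$, i.e. if the contour $\mathcal{C}$ first runs through $z'$ and then through $z$, we get

$$ \mathcal{G}_{\alpha\alpha'}(z,z') = \delta_{\alpha\alpha'} \frac{\prod_{j=k'+1}^{k} c_j^\alpha}{1 + \prod_{j=1}^{P} c_j^\alpha} \doteq \delta_{\alpha\alpha'} \frac{e^{-i \sum_{j=k'+1}^{k} \Delta_j \epsilon_\alpha}}{1 + e^{-\beta\epsilon_\alpha}} \longrightarrow \delta_{\alpha\alpha'} \frac{e^{-i(z-z')\epsilon_\alpha}}{1 + e^{-\beta\epsilon_\alpha}} . \tag{3.80} $$

For the case $k < k'$ we get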

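$$ \mathcal{G}_{\alpha\alpha'}(z,z') = -\delta_{\alpha\alpha'} \frac{\prod_{j=1}^{P} c_j^\alpha \Big( \prod_{j=k+1}^{k'} c_j^\alpha \Big)^{-1}}{1 + \prod_{j=1}^{P} c_j^\alpha} \doteq -\delta_{\alpha\alpha'}\, e^{-\beta\epsilon_\alpha} \frac{e^{i \sum_{j=k+1}^{k'} \Delta_j \epsilon_\alpha}}{1 + e^{-\beta\epsilon_\alpha}} \longrightarrow -\delta_{\alpha\alpha'}\, e^{-\beta\epsilon_\alpha} \frac{e^{-i(z-z')\epsilon_\alpha}}{1 + e^{-\beta\epsilon_\alpha}} . \tag{3.81} $$

We can summarize the two results (3.80) and (3.81) and write them explicitly for the case of the imaginary–time contour parameterized by $\tau = iz$ to get the textbook result [32]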

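$$ \mathcal{G}_{\alpha\alpha'}(\tau,\tau') = \delta_{\alpha\alpha'} \begin{cases} \dfrac{e^{-\epsilon_\alpha(\tau-\tau')}}{1 + e^{-\beta\epsilon_\alpha}} & \tau > \tau' \\[2ex] -\dfrac{e^{-\epsilon_\alpha(\tau-\tau')}}{1 + e^{\beta\epsilon_\alpha}} & \tau < \tau' \end{cases} . \tag{3.82} $$

Chapter 4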

Path Integral Monte Carlo

In this chapter we present the Monte Carlo method and its application to the calculation of (Feynman) path integrals. We start with a general introduction to the idea of Monte Carlo integration and the sampling of a probability distribution by a Markov process in sections 4.1 and 4.2. In sections 4.3 and 4.4 we summarize the formulas for the determination of estimates for expectation values and their statistical and systematic errors that will be applied in our Monte Carlo calculations for the single electron transistor in chapter 6. In section 4.5 we close with a discussion of the problems associated with path integral calculations for non–positive actions, which also provides the motivation to consider imaginary–time correlation functions and their analytical continuation in chapter 5.

4.1 Basics of Monte Carlo Integration

In chapter 3 we have expressed the generating function as a Feynman path integral or as a coherent state path integral. This formulation is often used as the starting point for analytical methods such as perturbation theory or stationary–phase approximation [32]. In situations where these approaches are no longer feasible or have to rely on uncontrolled approximations one might be tempted to use numerical methods to solve these high–dimensional integrals. Conventional methods appropriate for integrals in few dimensions use the values of the integrand on a regular grid to approximate the integral. Since the number of grid points necessary to achieve a given accuracy grows exponentially with the dimension, such methods are useless for the evaluation of path integrals. Fortunately, Monte Carlo integration provides a method that is especially suited to the evaluation of high–dimensional integrals. The starting point for Monte Carlo integration is the formulation of the given integral as

$$ I = \langle f \rangle_\rho = \int d^dx\, f(x)\, \rho(x) \quad\text{with}\quad \rho(x) \ge 0 \quad\text{and}\quad \int d^dx\, \rho(x) = 1 . \tag{4.1} $$

This formulation is natural for expectation values in classical or quantum statistical physics when $\rho$ is viewed as the phase space density or density operator of the system. For real and positive action $S$ also the imaginary–time path integral is of this form with $\rho = Z^{-1} \exp(-S)$. Quite generally, any integral (over a finite integration volume $V$) can be brought into this form by choosing the constant probability density $\rho = V^{-1}$, though "importance sampling" will lead to improved convergence as will be discussed below.


An estimate for the expectation value $I = \langle f \rangle_\rho$ is given by the mean of $N$ measurements

$$ \bar{f} = \frac{1}{N} \sum_{\substack{i=1 \\ x_i \in \rho(x)}}^{N} f(x_i) \tag{4.2} $$

where $x_i \in \rho(x)$ shall denote that the points $x_i$ are chosen randomly (and independently) according to the probability density $\rho(x)$. The basis of this method is the central limit theorem for the sum of independent random variables, which states that for large $N$ the distribution of $\bar{f}$ converges to a Gaussian distribution with mean

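$$ I = \langle f \rangle_\rho \equiv \int d^dx\, f(x)\, \rho(x) \tag{4.3} $$

and variance

$$ \sigma^2_{\bar{f}} = \frac{1}{N} \big\langle (f - \langle f \rangle_\rho)^2 \big\rangle_\rho = \frac{1}{N} \big( \langle f^2 \rangle_\rho - \langle f \rangle_\rho^2 \big) . \tag{4.4} $$

In particular the statistical error of the estimate scales as $N^{-1/2}$ with the number of data points, independently of the dimension of the integral.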
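The scaling of the statistical error with $N^{-1/2}$ can be illustrated by a minimal sketch. It assumes a uniform density $\rho = 1$ on the unit cube in $d = 6$ dimensions and a product integrand whose exact integral is 1; the function names and the integrand are illustrative choices and not part of the calculations in this thesis.

```python
import math
import random

def mc_mean_and_error(f, d, n, rng):
    """Estimate I = <f>_rho for the uniform density rho = 1 on [0, 1]^d.

    Returns the sample mean, eq. (4.2), and the estimated standard deviation
    of that mean, i.e. the square root of eq. (4.4) with the variance of f
    replaced by the sample variance.
    """
    values = []
    for _ in range(n):
        x = [rng.random() for _ in range(d)]   # independent point x_i drawn from rho
        values.append(f(x))
    mean = sum(values) / n
    var_f = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var_f / n)

def product_integrand(x):
    """Product of (pi/2) sin(pi x_k); its integral over the unit cube is exactly 1."""
    return math.prod(0.5 * math.pi * math.sin(math.pi * xk) for xk in x)

rng = random.Random(2004)
for n in (2500, 10000, 40000):
    mean, err = mc_mean_and_error(product_integrand, 6, n, rng)
    print(f"N = {n:6d}   estimate = {mean:.4f} +/- {err:.4f}")
```

Increasing $N$ by a factor of four should roughly halve the printed error bar, whereas a regular grid with only ten points per axis would already require $10^6$ function evaluations in six dimensions.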

4.2 Importance Sampling and Markov Processes

4.2.1 Reduction of Statistical Errors by Importance Sampling

In section 4.1 we stated that the statistical error of a Monte Carlo estimate scales as $1/\sqrt{N}$ independent of the way one decomposes the integral as

$$ I = \int d^dx\, g(x) = \int d^dx\, f(x)\, \rho(x) \tag{4.5} $$

into a measured function $f$ and a sampling density $\rho$. Nonetheless this decomposition influences the convergence of the Monte Carlo method as it determines the value of the proportionality constant in eq. (4.4), which is given by the variance of $f$. Thus importance sampling, i.e. sampling according to a properly chosen non-uniform distribution $\rho(x)$, leads to a considerable improvement of the accuracy of the result for given $N$. In the case of a positive function $g(x) \ge 0$ one might use the factorization

$$ I = \int dx\, g(x) = \int dx\, I\, \frac{g(x)}{I} \tag{4.6} $$

with $f(x) = I$ and $\rho(x) = g(x)/I$, i.e. the normalized integrand itself is chosen as the probability density in order to sample preferably those regions that give the largest contribution to the integral. Although the variance of $f$ and hence the statistical error would be reduced to zero, this example is only of academic interest since we would have to know the value of $I$ in advance. In practical applications of importance sampling one chooses a probability density that is easy to sample and as similar to the integrand as possible to achieve a fast and accurate calculation of Monte Carlo estimates. In the calculation of expectation values by path integral Monte Carlo one has to solve integrals of the form

$$ \langle A \rangle = \frac{1}{Z} \int d^dx\, A(x)\, \exp(-S(x)) . \tag{4.7} $$

The natural choice for the sampling density in this case is $\rho(x) = Z^{-1} \exp(-S(x))$ since the exponential factor usually gives the dominant contribution to the functional dependence of the integrand. A prerequisite for this choice is of course that the action $S(x)$ is always positive. The problems arising in the calculation of real–time correlation functions and for Fermionic systems due to a non–positive action will be discussed in section 4.5.
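The effect of the decomposition (4.5) on the error bar can be made concrete with a small sketch. It assumes the one–dimensional test integral $I = \int_0^{L} dx\, (1+x)\, e^{-x} = 2 - (2+L)\, e^{-L}$ with $L = 10$, which is an illustrative choice and not taken from the applications in this thesis, and compares uniform sampling with sampling from the exponential factor of the integrand.

```python
import math
import random

L = 10.0                                    # integration volume V = [0, L]
EXACT = 2.0 - (2.0 + L) * math.exp(-L)      # integral of (1 + x) exp(-x) over [0, L]

def g(x):
    """Integrand of the test integral."""
    return (1.0 + x) * math.exp(-x)

def estimate(samples):
    """Sample mean and estimated standard deviation of the mean."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, math.sqrt(var / n)

def uniform_sampling(n, rng):
    """Constant density rho = 1/L, measured function f = L g(x)."""
    return estimate([L * g(L * rng.random()) for _ in range(n)])

def importance_sampling(n, rng):
    """rho(x) = exp(-x) / c on [0, L] with c = 1 - exp(-L), so f = g / rho = c (1 + x)."""
    c = 1.0 - math.exp(-L)
    samples = []
    for _ in range(n):
        x = -math.log(1.0 - c * rng.random())   # inverse-transform sampling of rho
        samples.append(c * (1.0 + x))
    return estimate(samples)

rng = random.Random(42)
print("exact      :", round(EXACT, 5))
print("uniform    : %.4f +/- %.4f" % uniform_sampling(20000, rng))
print("importance : %.4f +/- %.4f" % importance_sampling(20000, rng))
```

Both estimates use the same number of points, but the measured function $f = c\,(1+x)$ varies much less over the sampled region than $L\, g(x)$ does, which is why the second error bar is expected to be smaller.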

4.2.2 Markov Processes and the Metropolis Algorithm

As we have seen in section 4.1 the basic idea of the Monte Carlo method is the calculation of high–dimensional integrals by stochastic sampling. The choice of a probability density $\rho(x)$ for the integration points can be made with the help of the guidelines of importance sampling. So far the method is not yet complete since we still have to specify an algorithm that allows us to sample points $x_i$ from the given probability density $\rho$.

Random Number Generation and Direct Sampling

The fundamental building block for all sampling methods is a pseudo random number generator (PRNG) for variables that are uniformly distributed on the interval $[0, 1]$. We have used MT19937 of Makoto Matsumoto and Takuji Nishimura [39] for all calculations in this thesis since it allows a very fast computation of (pseudo) random numbers and has a very long period of $2^{19937} - 1$. Other popular and well tested PRNGs are G05FAF of the NAG software package [40], RAN3 from the Numerical Recipes library [41], R250/521 [42], RANMAR [43] or RANLUX [44]. In our discussion of importance sampling we have seen that it is preferable to use a non–uniform probability density adapted to the functional form of the integrand. Simple probability distributions, e.g. Gaussian or Lorentzian, can be sampled directly by applying a transform to the created sequence of uniformly distributed numbers [41]. In the case of path integrals where one uses the probability density $\rho(x) = Z^{-1} \exp(-S(x))$, whose normalization is usually not even known a priori, other sampling methods have to be applied.
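Direct sampling by a transform of uniform random numbers can be sketched as follows. The example relies on Python's standard `random` module, whose core generator is the Mersenne Twister MT19937; the two helper functions and their parameter names are illustrative and are not the routines used for the calculations in this thesis.

```python
import math
import random

def sample_lorentzian(rng, x0=0.0, gamma=1.0):
    """Inverse-transform sampling of rho(x) = (gamma / pi) / ((x - x0)^2 + gamma^2)."""
    u = rng.random()                           # uniform on [0, 1)
    return x0 + gamma * math.tan(math.pi * (u - 0.5))

def sample_gaussian_pair(rng, mean=0.0, sigma=1.0):
    """Box-Muller transform: two uniform numbers give two independent Gaussian numbers."""
    u1 = 1.0 - rng.random()                    # uniform on (0, 1], avoids log(0)
    u2 = rng.random()
    r = sigma * math.sqrt(-2.0 * math.log(u1))
    return (mean + r * math.cos(2.0 * math.pi * u2),
            mean + r * math.sin(2.0 * math.pi * u2))

rng = random.Random(19937)                     # seeded Mersenne Twister
print([round(sample_lorentzian(rng), 3) for _ in range(3)])
print([round(v, 3) for v in sample_gaussian_pair(rng)])
```

Both transforms require the normalized cumulative distribution in closed form, which is exactly what is missing for $\rho(x) = Z^{-1} \exp(-S(x))$ with unknown $Z$.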

Sampling by a Markov Process

The most common sampling method is the generation of an ensemble of configurations $x_i$ in the form of a Markov chain $\{x_1, x_2, x_3, \ldots\}$ in which the value of the $(n+1)$-th element is determined according to a transition probability $T(x_n \to x_{n+1})$ that depends only on the value of $x_n$. Thus the sampling of integration points $x_i$ from a given probability density $\rho(x)$ is replaced by a stochastic dynamic process (Markov process) in configuration space that generates the desired sequence. In the following we will give two conditions which are sufficient to ensure that the elements of the Markov chain are distributed according to the given probability density.

• Ergodicity: Assuming that for each state $x$ the transition probability $T(x \to x)$ is non–vanishing, ergodicity can be formulated as the requirement that from each point $x$ in configuration space any other point $y$ can be reached by a finite number of transitions, i.e. for any points $x, y$ there exists an integer $n(x,y) \in \mathbb{N}$ such that

$$ T^n(x \to y) > 0 \quad\text{for}\quad n \ge n(x,y) \tag{4.8} $$

where $T^n$ is the probability for a transition in $n$ steps. For the necessarily finite configuration space of a Monte Carlo simulation this also implies the existence of

$$ n_{\max} = \max_{x,y}\, n(x,y) \quad\text{with}\quad T^{n_{\max}}(x \to y) > 0 \quad\text{for all } x, y. \tag{4.9} $$

• Detailed Balance: The relation of detailed balance or microscopic reversibility can be formulated as

$$ \rho(x)\, T(x \to y) = \rho(y)\, T(y \to x) . \tag{4.10} $$

From these conditions (and the normalization of the transition probability to 1) it is straightforward to prove that $y$ is distributed according to the probability density $\rho$ if $x$ was chosen according to this distribution since

$$ \int dx\, \rho(x)\, T(x \to y) = \rho(y) \int dx\, T(y \to x) = \rho(y) , \tag{4.11} $$

i.e. $\rho$ is stationary under the Markov process. It is also easy to prove that the distribution for the elements of the chain will converge to this stationary solution. If $x$ is distributed according to $\rho_{\rm old}(x)$ we get for the distance between $\rho_{\rm new}(y)$ and $\rho(y)$ after $n_{\max}$ steps

$$ \begin{aligned} D_{\rm new} &= \int dy\, \big| \rho_{\rm new}(y) - \rho(y) \big| \\ &= \int dy\, \Big| \int dx\, \big( \rho_{\rm old}(x) - \rho(x) \big)\, T^{n_{\max}}(x \to y) \Big| \\ &\le \int dx\, \big| \rho_{\rm old}(x) - \rho(x) \big| \int dy\, T^{n_{\max}}(x \to y) = D_{\rm old} \end{aligned} \tag{4.12} $$

where we have used the normalization of the transition probability $\int dy\, T^{n_{\max}}(x \to y) = 1$. Since $T^{n_{\max}}$ is non–zero for all $x$ and $y$, equality occurs only if $\rho_{\rm old} \equiv \rho$, i.e. the sampled distribution converges monotonically towards the desired probability density $\rho(x)$. The above consideration also has an important implication for the practical implementation of Markov chain sampling. At the beginning of a Monte Carlo simulation one usually does not have a very good idea which configurations are physically most relevant and thus often has to resort to a random guess for the initial point $x_1$ of the Markov chain that is inconsistent with the sampling probability $\rho(x)$. As a consequence one has to wait until the simulation has "equilibrated", i.e. until the sampling probability has converged, before taking measurements. A common way to check equilibration is to monitor the change of expectation values until they reach their stationary (equilibrium) value.
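The monotonic decrease of the distance $D$ can be checked explicitly on a small discrete configuration space. The following sketch assumes five states with illustrative target probabilities and builds a transition matrix from a uniform proposal combined with the acceptance rule $\min(1, \rho(y)/\rho(x))$, which is one convenient way to satisfy detailed balance (4.10); all names are hypothetical.

```python
def balanced_transition_matrix(rho):
    """Transition matrix T[x][y] = T(x -> y) on a finite state space.

    A move to y is proposed uniformly among all states (a symmetric rule)
    and accepted with probability min(1, rho[y] / rho[x]).
    """
    n = len(rho)
    T = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if y != x:
                T[x][y] = min(1.0, rho[y] / rho[x]) / n
        T[x][x] = 1.0 - sum(T[x])              # rejected moves keep the state x
    return T

def step(dist, T):
    """One Markov step: rho_new(y) = sum_x rho_old(x) T(x -> y)."""
    n = len(dist)
    return [sum(dist[x] * T[x][y] for x in range(n)) for y in range(n)]

def distance(dist, rho):
    """Discrete analogue of D = int dy |rho_new(y) - rho(y)|."""
    return sum(abs(a - b) for a, b in zip(dist, rho))

rho = [0.40, 0.30, 0.15, 0.10, 0.05]           # target probabilities (illustrative)
T = balanced_transition_matrix(rho)
dist = [1.0, 0.0, 0.0, 0.0, 0.0]               # start far from equilibrium
history = [distance(dist, rho)]
for _ in range(40):
    dist = step(dist, T)
    history.append(distance(dist, rho))
print(["%.2e" % d for d in history[::8]])
```

Since all elements of this matrix are positive, $n_{\max} = 1$ here and the distance shrinks in every single step; the number of steps needed to fall below the statistical error gives a rough idea of the equilibration time.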

The Metropolis Algorithm

In the famous algorithm due to Metropolis, Rosenbluth, Rosenbluth, Teller and Teller [45] the Markov chain is constructed as follows. Starting from a given configuration $x$ one proposes a move to the new configuration $y$ by a symmetric probability rule $F$, i.e. $F(x \to y) = F(y \to x)$. A simple choice would be to choose $y$ randomly in a finite hypercube centered around $x$. Depending on the system under consideration it might be favorable to invest more time into the choice of appropriate moves in order to improve the sampling efficiency (see e.g. [46, 47, 48]). The proposed point $y$ is accepted as the new configuration with the acceptance probability

$$ A(x \to y) = \begin{cases} 1 & \text{if } \rho(y) \ge \rho(x) \\[1ex] \dfrac{\rho(y)}{\rho(x)} & \text{if } \rho(y) < \rho(x) \end{cases} . \tag{4.13} $$

If the new point $y$ is rejected one keeps the configuration $x$ for the accumulation (4.2) of the averages. The transition probability $T(x \to y) = F(x \to y)\, A(x \to y)$ fulfills the detailed balance relation as can be proved easily by considering the two cases $\rho(y) < \rho(x)$ and $\rho(y) > \rho(x)$ separately. For $\rho(y) < \rho(x)$ one gets

$$ \rho(x)\, T(x \to y) = \rho(x)\, F(x \to y)\, \frac{\rho(y)}{\rho(x)} = \rho(y)\, F(y \to x) = \rho(y)\, T(y \to x) \tag{4.14} $$

and analogously for $\rho(y) > \rho(x)$

$$ \rho(x)\, T(x \to y) = \rho(x)\, F(x \to y) = \rho(y)\, F(y \to x)\, \frac{\rho(x)}{\rho(y)} = \rho(y)\, T(y \to x) . \tag{4.15} $$

The form (4.13) of the acceptance criterion is very well suited for the application to path integral calculations. In this case the probability density to be sampled is $\rho(x) = Z^{-1} \exp(-S(x))$ with the unknown normalization $Z$ given by the partition function. Since the acceptance probability depends only on the ratio

$$ \frac{\rho(y)}{\rho(x)} = \frac{e^{-S(y)}}{e^{-S(x)}} = e^{-\Delta S} \tag{4.16} $$

the normalization cancels out and $A(x \to y)$ depends only on the change $\Delta S = S(y) - S(x)$ of the action, which usually is rather easy to calculate.
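The construction of the Markov chain from eqs. (4.13) and (4.16) can be sketched in a few lines. The example assumes a single variable with the illustrative action $S(x) = x^2/2$, so that the exact results $\langle x \rangle = 0$ and $\langle x^2 \rangle = 1$ are available for comparison; the step width and the number of equilibration steps are arbitrary choices.

```python
import math
import random

def action(x):
    """Illustrative action S(x) = x^2 / 2, so that rho(x) is a unit Gaussian."""
    return 0.5 * x * x

def metropolis(n_steps, step_width, rng, x=0.0, n_equil=1000):
    """Generate a Markov chain distributed according to exp(-S(x)) / Z."""
    chain = []
    accepted = 0
    for i in range(n_equil + n_steps):
        # Symmetric proposal F: y is chosen uniformly in an interval around x.
        y = x + step_width * (2.0 * rng.random() - 1.0)
        delta_s = action(y) - action(x)
        # Acceptance probability (4.13): 1 if Delta S <= 0, else exp(-Delta S).
        if delta_s <= 0.0 or rng.random() < math.exp(-delta_s):
            x = y
            accepted += 1
        if i >= n_equil:
            chain.append(x)                    # a rejected move counts the old x again
    return chain, accepted / (n_equil + n_steps)

chain, acceptance = metropolis(100000, 2.0, random.Random(1953))
mean_x2 = sum(v * v for v in chain) / len(chain)
print(f"acceptance rate = {acceptance:.2f}, <x^2> = {mean_x2:.3f} (exact: 1)")
```

Only the difference $\Delta S$ enters the algorithm, so the normalization $Z$ is never needed. Note that successive elements of the chain are correlated, which has to be taken into account in the error analysis.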

4.3 Statistical Analysis of Monte Carlo Data

In this section we will summarize the main results about the analysis of Monte Carlo data, i.e. the calculation of estimates for expectation values and statistical errors of observables. Further details about the estimation of autocorrelation times and alternatives to the binning method presented in 4.3.3 can be found in [49].

4.3.1 Estimates for Uncorrelated Measurements

The sampling of a probability distribution by a Markov process as presented in the preceding section, and especially the Metropolis algorithm, is an efficient method to produce a sequence of integration points for the Monte Carlo integration of path integrals. It is a general method since it can be applied to sample any given probability density, e.g. for any physical system that has positive definite action. As already given in eq. (4.2) the expectation value $\langle f \rangle$ of an observable $f$ can be estimated easily from the Monte Carlo data by

$$ \langle f \rangle \approx \bar{f} = \frac{1}{N} \sum_{\substack{i=1 \\ x_i \in \rho(x)}}^{N} f(x_i) = \frac{1}{N} \sum_{i=1}^{N} f_i . \tag{4.17} $$

In the introduction to the Monte Carlo method we also mentioned that for uncorrelated measurements the variance of this estimate is given by

$$ \sigma^2_{\bar{f}} = \frac{\sigma^2_f}{N} \tag{4.18} $$

in the limit of large $N$ due to the central limit theorem. An (unbiased) estimate of this variance can be obtained from the Monte Carlo data by

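$$ \sigma^2_{\bar{f}} \approx \frac{1}{N(N-1)} \sum_{i=1}^{N} (f_i - \bar{f})^2 = \frac{1}{N-1} \Big( \overline{f^2} - \bar{f}^2 \Big) \quad\text{with}\quad \overline{f^2} = \frac{1}{N} \sum_{i=1}^{N} f_i^2 . \tag{4.19} $$

4.3.2 Correlated Measurements and Autocorrelation Time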

In the preceding analysis we assumed that the points xi and hence the measurements fi = f(xi) were independent, i.e. uncorrelated. Indeed the result (4.19) has to be modified if the measurements are correlated as we will see below. This modification is crucial for the correct estimation of the (statistical) Monte Carlo error bars when we use Markov chain sampling. While the sequential production of points xi by a Markov process provides an easy and general way to ensure importance sampling, it inevitably creates correlations in the measurements fi that have to be taken into account properly when analyzing the statistical error. The influence of correlations in the measurements fi becomes obvious if we write down the expression for the variance of the estimator f¯

$$ \begin{aligned} \sigma^2_{\bar{f}} = \langle \bar{f}^2 \rangle - \langle \bar{f} \rangle^2 &= \frac{1}{N^2} \sum_{i,j=1}^{N} \langle f_i f_j \rangle - \frac{1}{N^2} \sum_{i,j=1}^{N} \langle f_i \rangle \langle f_j \rangle \\ &= \frac{1}{N^2} \sum_{i=1}^{N} \big( \langle f_i^2 \rangle - \langle f_i \rangle^2 \big) + \frac{1}{N^2} \sum_{i \neq j}^{N} \big( \langle f_i f_j \rangle - \langle f_i \rangle \langle f_j \rangle \big) . \end{aligned} \tag{4.20} $$

The first term is again the result that we got for uncorrelated measurements (except for the factor $N/(N-1)$ that removes the bias) while the second term is the contribution of the measurement correlations to the variance. Since the Markov process will create positive correlations in the measurements, the statistical error of the Monte Carlo data will be underestimated if we use the "naive" estimator (4.19). The effect of correlations is most easily expressed in terms of the autocorrelation time of the measurements. Starting from eq. (4.20) one uses the symmetry $i \leftrightarrow j$ and the translation invariance of the correlator with respect to the measurement index, i.e. $\langle f_i f_{i+k} \rangle = \langle f_1 f_{1+k} \rangle$, to reorder the summation in the second term in the following way

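$$ \begin{aligned} \sigma^2_{\bar{f}} &= \frac{\sigma^2_f}{N} + \frac{2}{N} \sum_{k=1}^{N} \big( \langle f_1 f_{1+k} \rangle - \langle f_1 \rangle \langle f_{1+k} \rangle \big) \Big( 1 - \frac{k}{N} \Big) \\ &= \frac{\sigma^2_f}{N}\, \tau_f \end{aligned} \tag{4.21} $$

where the factor $N - k$ in the numerator of $(1 - k/N)$ counts the number of times we find two measurements separated by $k$ steps in the sequence of $N$ measurements. In the second line we have defined the integrated autocorrelation time $\tau_f$ for the measurement of the observable $f$ by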

$$ \tau_f \equiv 1 + 2 \sum_{k=1}^{N} A(k) \Big( 1 - \frac{k}{N} \Big) \tag{4.22} $$

with the normalized autocorrelation function ($A(0) = 1$)

$$ A(k) = \frac{\langle f_1 f_{1+k} \rangle - \langle f_1 \rangle \langle f_{1+k} \rangle}{\langle f_1^2 \rangle - \langle f_1 \rangle \langle f_1 \rangle} . \tag{4.23} $$

The autocorrelation time defined above can be interpreted as the number of steps in the Monte Carlo simulation between two uncorrelated measurements, since the formula (4.20) can still be written as $\sigma^2_{\bar{f}} = \sigma^2_f / N_{\rm eff}$ like in the uncorrelated case but with an effective number of measurements $N_{\rm eff} = N / \tau_f$.
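A direct evaluation of eqs. (4.22) and (4.23) can be sketched as follows. As test data the example assumes a first–order autoregressive sequence with exactly known $A(k) = a^k$ and $\tau_f = (1+a)/(1-a)$; this process, the function names and the truncation of the sum at $k_{\max} \ll N$ are assumptions of the sketch, the latter being made because the estimates of $A(k)$ at large $k$ consist mostly of noise.

```python
import math
import random

def ar1_series(n, a, rng):
    """Correlated test data f_{i+1} = a f_i + sqrt(1 - a^2) xi_i with Gaussian noise xi_i.

    The exact autocorrelation function is A(k) = a^k, hence
    tau_f = (1 + a) / (1 - a) for N much larger than the correlation length.
    """
    f = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        f.append(a * f[-1] + math.sqrt(1.0 - a * a) * rng.gauss(0.0, 1.0))
    return f

def autocorrelation(f, k):
    """Estimate the normalized autocorrelation function A(k) of eq. (4.23)."""
    n = len(f)
    mean = sum(f) / n
    var = sum((v - mean) ** 2 for v in f) / n
    cov = sum((f[i] - mean) * (f[i + k] - mean) for i in range(n - k)) / (n - k)
    return cov / var

def integrated_autocorrelation_time(f, k_max):
    """Eq. (4.22) with the sum truncated at k_max << N."""
    n = len(f)
    return 1.0 + 2.0 * sum(autocorrelation(f, k) * (1.0 - k / n)
                           for k in range(1, k_max + 1))

data = ar1_series(20000, 0.5, random.Random(7))
tau = integrated_autocorrelation_time(data, 20)
print(f"tau_f = {tau:.2f} (exact: 3), N_eff = {len(data) / tau:.0f}")
```

For $a = 0.5$ the naive error estimate (4.19) would be too small by a factor of about $\sqrt{3}$.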

4.3.3 Binning Analysis of the Monte Carlo Error

The determination of the autocorrelation time $\tau_f$ by eq. (4.21) is rather cumbersome since it requires the accumulation of the autocorrelation function $A(k)$ during the Monte Carlo simulation. Therefore one usually uses a binning analysis to determine accurate error estimates or estimates for the autocorrelation time. The original sequence consisting of $N$ correlated measurements $f_i$ is split up into $N_B$ non-overlapping blocks of length $k$ such that $N = N_B k$. The block length $k$ is thereby chosen large enough, i.e. $k \gg \tau_f$, to ensure that the block averages

$$ f_{B,n} \equiv \frac{1}{k} \sum_{i=1}^{k} f_{(n-1)k+i} , \qquad n = 1, \ldots, N_B \tag{4.24} $$

are uncorrelated. Therefore we can estimate the average $\langle f \rangle$ by

$$ \langle f \rangle \approx \bar{f} = \frac{1}{N} \sum_{i=1}^{N} f_i = \frac{1}{N_B} \sum_{n=1}^{N_B} f_{B,n} \tag{4.25} $$

and the variance of this estimate is given by

$$ \sigma^2_{\bar{f}} = \frac{\sigma^2_B}{N_B} = \frac{1}{N_B (N_B - 1)} \sum_{n=1}^{N_B} \big( f_{B,n} - \bar{f}_B \big)^2 \tag{4.26} $$

where $\sigma^2_B$ is the variance and $\bar{f}_B$ the mean of the block averages $f_{B,n}$. The autocorrelation time can be deduced from this binning analysis as

$$ \tau_f = \frac{k\, \sigma^2_B}{\sigma^2_f} . \tag{4.27} $$
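The binning procedure of eqs. (4.24) to (4.27) translates almost literally into code. The sketch below again assumes an autoregressive test sequence with known $\tau_f = 3$ as a stand-in for Monte Carlo data; the function names and the block length are illustrative.

```python
import math
import random

def ar1_series(n, a, rng):
    """Correlated test data with exact autocorrelation time (1 + a) / (1 - a)."""
    f = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        f.append(a * f[-1] + math.sqrt(1.0 - a * a) * rng.gauss(0.0, 1.0))
    return f

def binning_analysis(f, k):
    """Return (mean, error of the mean, tau_f) from blocks of length k."""
    n_blocks = len(f) // k
    f = f[:n_blocks * k]                       # drop an incomplete last block
    n = len(f)
    blocks = [sum(f[j * k:(j + 1) * k]) / k for j in range(n_blocks)]   # eq. (4.24)
    mean = sum(blocks) / n_blocks                                       # eq. (4.25)
    var_b = sum((b - mean) ** 2 for b in blocks) / (n_blocks - 1)       # sigma_B^2
    var_f = sum((v - mean) ** 2 for v in f) / (n - 1)                   # sigma_f^2
    error = math.sqrt(var_b / n_blocks)                                 # eq. (4.26)
    tau = k * var_b / var_f                                             # eq. (4.27)
    return mean, error, tau

data = ar1_series(2 ** 15, 0.5, random.Random(26))
for k in (1, 4, 16, 64):
    mean, error, tau = binning_analysis(data, k)
    print(f"k = {k:3d}   mean = {mean:+.4f}   error = {error:.4f}   tau_f = {tau:.2f}")
```

In practice one increases the block length until the error estimate reaches a plateau; for $k = 1$ the analysis reduces to the naive estimate (4.19).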

4.4 Systematic Errors and Trotter Extrapolation

In the preceding section we analyzed the statistical error that is inherent to the Monte Carlo evaluation of integrals due to the random sampling of integration points. In this section we will consider in detail the systematic error caused by the finite discretization of the path that is necessary for the numerical treatment of the (infinite dimensional) path integral. Our analysis follows closely that of Fye [50]. Depending on the system under consideration other sources of systematic error, e.g. finite–size effects [51], have to be taken into account. For the single electron transistor that will be examined by Monte Carlo methods in chapter 6 the finite discretization represents the only source of systematic error and we will therefore restrict our discussion to this aspect here.

4.4.1 Approximations for the Short–Time Propagator

In section 3.1 we derived analytically a path integral expression for the partition function of a system described by the Hamiltonian $H = K + V$ composed of a kinetic energy $K$ and a potential $V$. One of the key steps was the approximation of the short–time propagator using the Trotter breakup

$$ e^{-\Delta\tau H} = f^{(1)}(\Delta\tau) + O(\Delta\tau^2) \quad\text{with}\quad f^{(1)}(\Delta\tau) = e^{-\Delta\tau K}\, e^{-\Delta\tau V} . \tag{4.28} $$

The evaluation is only exact up to a Trotter error of order $O(\Delta\tau^2)$, which is sufficient since we could perform the limit $\Delta\tau \to 0$ (corresponding to $P \to \infty$) analytically. In numerical calculations of path integrals the number of time slices $P$ and hence the time step $\Delta\tau$ has to be kept finite. The systematic errors introduced by the finite discretization can be reduced by approximations to the short–time propagator that are more accurate than the standard Trotter breakup (4.28), i.e. by considering a generalized Trotter formula

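$$ f^{(n)}(\Delta\tau) = e^{-\Delta\tau H} + \Delta\tau^{n+1}\, \mathcal{C}_1 + \Delta\tau^{n+2}\, C_2 + O(\Delta\tau^{n+3}) \tag{4.29} $$

which is exact up to order $\Delta\tau^n$ with first corrections given by the operators $\mathcal{C}_1$ and $C_2$. Explicit examples for $H = K + V$ and $n = 2$ are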

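$$ f^{(2)}(\Delta\tau) = e^{-\Delta\tau K}\, e^{-\Delta\tau V}\, e^{-\frac{\Delta\tau^2}{2} [K,V]} \quad\text{or} \tag{4.30} $$

$$ \tilde{f}^{(2)}(\Delta\tau) = e^{-\frac{\Delta\tau}{2} V}\, e^{-\Delta\tau K}\, e^{-\frac{\Delta\tau}{2} V} \tag{4.31} $$

as given by Suzuki [52] or De Raedt and De Raedt [53].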
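The different orders of the breakups (4.28) and (4.31) can be verified numerically. The following sketch assumes a two–level toy model with $K = \sigma_x$ and $V = v\, \sigma_z$ (Pauli matrices), for which all exponentials are known in closed form; the model and the helper functions are illustrative and unrelated to the single electron transistor.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_pauli(t, ax, az):
    """Closed form of exp(-t (ax sigma_x + az sigma_z)) for real ax, az."""
    r = math.hypot(ax, az)
    c, s = math.cosh(t * r), math.sinh(t * r)
    return [[c - s * az / r, -s * ax / r],
            [-s * ax / r, c + s * az / r]]

def max_abs_diff(a, b):
    """Largest absolute difference between the elements of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def trotter_errors(dt, v=0.7):
    """Errors of the breakups (4.28) and (4.31) for K = sigma_x, V = v sigma_z."""
    exact = exp_pauli(dt, 1.0, v)                   # e^{-dt H} with H = K + V
    exp_k = exp_pauli(dt, 1.0, 0.0)                 # e^{-dt K}
    f1 = matmul(exp_k, exp_pauli(dt, 0.0, v))       # standard breakup, eq. (4.28)
    half_v = exp_pauli(0.5 * dt, 0.0, v)            # e^{-dt V / 2}
    f2 = matmul(matmul(half_v, exp_k), half_v)      # symmetric breakup, eq. (4.31)
    return max_abs_diff(f1, exact), max_abs_diff(f2, exact)

for dt in (0.04, 0.02, 0.01):
    e1, e2 = trotter_errors(dt)
    print(f"dt = {dt:5.2f}   error (4.28) = {e1:.3e}   error (4.31) = {e2:.3e}")
```

Halving the time step should reduce the error of the standard breakup by a factor close to 4 and that of the symmetric breakup by a factor close to 8, in agreement with $n = 1$ and $n = 2$ in eq. (4.29).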

4.4.2 Trotter Error of Expectation Values

Given the generalized Trotter formula (4.29) for the approximation of the short–time propagator one still has to analyze how the errors add up in the calculation of the partition function or of expectation values of the system. To this end we will calculate the absolute Trotter errors

$$ \Delta Z = Z - Z_{\rm approx} = {\rm tr}\big\{ e^{-\beta H} \big\} - {\rm tr}\Big\{ \prod_{i=1}^{P} f^{(n)}(\Delta\tau) \Big\} \tag{4.32} $$

of the partition function and

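$$ \Delta \langle \mathcal{O} \rangle = \langle \mathcal{O} \rangle - \langle \mathcal{O} \rangle_{\rm approx} = Z^{-1}\, {\rm tr}\big\{ e^{-\beta H} \mathcal{O} \big\} - Z^{-1}_{\rm approx}\, {\rm tr}\Big\{ \prod_{i=1}^{P} f^{(n)}(\Delta\tau)\, \mathcal{O} \Big\} \tag{4.33} $$

of an observable.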

Derivation of the Error Formulas

To add up the errors of $P$ subsequent applications of the short–time propagator it is useful to include the first correction terms in the exponent of the exponential function. By an expansion of the exponential function in the following expression one can easily check that (4.29) is equivalent to

$$ f^{(n)}(\Delta\tau) = e^{-\Delta\tau [H - \Delta\tau^n \mathcal{C}_1 - \Delta\tau^{n+1} \mathcal{C}_2]} + O(\Delta\tau^{n+3}) \quad\text{with}\quad \mathcal{C}_2 = C_2 + \frac{1}{2} [\mathcal{C}_1, H]_+ . \tag{4.34} $$

Starting from this equation we can use time–dependent perturbation theory to get

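$$ \begin{aligned} \prod_{i=1}^{P} f^{(n)}(\Delta\tau) &= e^{-\beta [H - \Delta\tau^n \mathcal{C}_1 - \Delta\tau^{n+1} \mathcal{C}_2]} + O(\Delta\tau^{n+2}) \\ &= e^{-\beta H} + \Delta\tau^n \int_0^\beta d\tau\, e^{-\beta H}\, \mathcal{C}_1(\tau) + \Delta\tau^{n+1} \int_0^\beta d\tau\, e^{-\beta H}\, \mathcal{C}_2(\tau) \\ &\quad + \Delta\tau^{2n} \int_0^\beta d\tau \int_0^\tau d\tau'\, e^{-\beta H}\, \mathcal{C}_1(\tau)\, \mathcal{C}_1(\tau') + O(\Delta\tau^{n+2}) \end{aligned} \tag{4.35} $$

with the usual notation $\mathcal{C}_j(\tau) \equiv e^{\tau H}\, \mathcal{C}_j\, e^{-\tau H}$ for the imaginary–time dependent operators. The term of order $\Delta\tau^{2n}$ has been kept since in the important special case $n = 1$ it is of the same order as $\Delta\tau^{n+1}$. Using the notation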

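$$ \mathcal{D}_1 = \int_0^\beta d\tau\, \mathcal{C}_1(\tau) \tag{4.36} $$

and

$$ \mathcal{D}_2 = \begin{cases} \int_0^\beta d\tau\, \mathcal{C}_2(\tau) + \int_0^\beta d\tau \int_0^\tau d\tau'\, \mathcal{C}_1(\tau)\, \mathcal{C}_1(\tau') & \text{for } n = 1 \\[1ex] \int_0^\beta d\tau\, \mathcal{C}_2(\tau) & \text{for } n > 1 \end{cases} \tag{4.37} $$

we can summarize this basic result as

$$ \prod_{i=1}^{P} f^{(n)}(\Delta\tau) = e^{-\beta H} \big( 1 + \Delta\tau^n\, \mathcal{D}_1 + \Delta\tau^{n+1}\, \mathcal{D}_2 + O(\Delta\tau^{n+2}) \big) . \tag{4.38} $$

Inserting eq. (4.38) into eq. (4.32) we can easily evaluate the Trotter error of the partition function

$$ \Delta Z = \Delta\tau^n \langle \mathcal{D}_1 \rangle + \Delta\tau^{n+1} \langle \mathcal{D}_2 \rangle + O(\Delta\tau^{n+2}) . \tag{4.39} $$

Using eq. (3.3) together with eq. (4.33) we get after another simple calculation the following result for the error of the expectation value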

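$$ \Delta \langle \mathcal{O} \rangle = \Delta\tau^n\, {\rm Cov}(\mathcal{D}_1, \mathcal{O}) + \Delta\tau^{n+1}\, {\rm Cov}(\mathcal{D}_2, \mathcal{O}) + O(\Delta\tau^{n+2}) \tag{4.40} $$

with the covariance ${\rm Cov}(A, B) \equiv \langle AB \rangle - \langle A \rangle \langle B \rangle$.

Discussion of the Error Formulas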

The reason why we have considered not only the first but also the second correction term is that in a number of cases including the standard Trotter formula (4.28) the first error term can be proven to vanish. The first correction to the standard Trotter formula is given by

$$ \mathcal{C}_1 = \frac{1}{2} [K, V] \tag{4.41} $$

which is an anti–Hermitian operator, i.e. $\mathcal{C}_1^\dagger = -\mathcal{C}_1$. This property is inherited by the operator product $e^{-\beta H} \mathcal{D}_1$ since

$$ \begin{aligned} \big[ e^{-\beta H} \mathcal{D}_1 \big]^\dagger &= \int_0^\beta d\tau\, e^{-\tau H}\, \mathcal{C}_1^\dagger\, e^{-(\beta - \tau) H} = -\int_0^\beta d\tau'\, e^{-(\beta - \tau') H}\, \mathcal{C}_1\, e^{-\tau' H} \\ &= -e^{-\beta H} \mathcal{D}_1 . \end{aligned} \tag{4.42} $$

If $\mathcal{C}_1$ and consequently also $\mathcal{D}_1$ can be represented as a real operator the trace

$$ \langle \mathcal{D}_1 \rangle = {\rm tr}\big\{ e^{-\beta H} \mathcal{D}_1 \big\} = 0 \tag{4.43} $$

vanishes, since a real anti–Hermitian matrix is antisymmetric and therefore traceless. The condition of real representability is surely fulfilled for (4.41) and usually will be favored in any Monte Carlo algorithm. The argument above shows that the first correction term for the partition function (4.32) vanishes whenever $\mathcal{C}_1$ is anti–Hermitian and real–representable. To show that the same holds also for the error of an observable (4.33) one has to note that in addition also the trace

$$ \langle \mathcal{D}_1 \mathcal{O} \rangle = {\rm tr}\big\{ e^{-\beta H} \mathcal{D}_1 \mathcal{O} \big\} = 0 \tag{4.44} $$

vanishes due to

$$ \big( e^{-\beta H} \mathcal{D}_1 \mathcal{O} \big)^\dagger = -\mathcal{O}\, e^{-\beta H} \mathcal{D}_1 . \tag{4.45} $$

The above error formulas (4.39) and (4.40) make definite statements about the systematic error due to the discretization of the path integral that allow a comparison between different short–time approximations for the propagator. They can also be used to check the "Trotter convergence" or to eliminate systematic errors as outlined in the next subsection.

4.4.3 Trotter Extrapolation

Usually the Trotter number is chosen large enough in order to reduce the systematic error level below the statistical fluctuations of the Monte Carlo result. To check for the "Trotter convergence" of the Monte Carlo simulation the calculations are repeated for larger values of the Trotter number (i.e. smaller time step) until no deviations due to systematic errors are observable. A more systematic approach based on the error formulas eqs. (4.39) and (4.40) is the so–called Trotter extrapolation. Also here the calculations are repeated for several choices of $\Delta\tau$. Each calculation is carried out with high accuracy, i.e. the statistical fluctuations are minimized, in order to achieve a set of measurements $\langle \mathcal{O} \rangle(\Delta\tau)$ that are biased mainly just by the systematic Trotter error for the given step size $\Delta\tau$. Since eq. (4.33) tells us how the systematic error scales for small $\Delta\tau$, the plot of the measurements against the appropriate power of $\Delta\tau$ allows an extrapolation to step size $\Delta\tau = 0$ and thus effectively eliminates the Trotter error of the Monte Carlo results. In the case of the standard Trotter breakup the measurements $\langle \mathcal{O} \rangle(\Delta\tau)$ have to be plotted against $\Delta\tau^2$ in order to determine $\langle \mathcal{O} \rangle(\Delta\tau = 0)$ from a straight line fit. This analysis of the systematic error is very useful in the accurate calculation of expectation values for (a small number of) observables though the application for the measurement of a correlation function discretized to a set of 200 or more values is rather cumbersome. Therefore we rather performed simulations for very small values of $\Delta\tau$ and used Trotter extrapolation only to check the convergence for certain values of the physical parameters.
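The straight line fit against $\Delta\tau^2$ can be sketched for a case in which the exact answer is known. The example assumes the two–level toy model $K = \sigma_x$, $V = v\, \sigma_z$ with $\beta = 1$ and evaluates the discretized partition function exactly, i.e. without statistical noise, so that only the systematic Trotter error remains; model, Trotter numbers and function names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_pauli(t, ax, az):
    """Closed form of exp(-t (ax sigma_x + az sigma_z)) for real ax, az."""
    r = math.hypot(ax, az)
    c, s = math.cosh(t * r), math.sinh(t * r)
    return [[c - s * az / r, -s * ax / r],
            [-s * ax / r, c + s * az / r]]

def z_trotter(beta, p, v=0.7):
    """tr[(e^{-dt K} e^{-dt V})^P] for K = sigma_x, V = v sigma_z and dt = beta / P."""
    dt = beta / p
    time_slice = matmul(exp_pauli(dt, 1.0, 0.0), exp_pauli(dt, 0.0, v))
    prod = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(p):
        prod = matmul(prod, time_slice)
    return prod[0][0] + prod[1][1]

def linear_fit(xs, ys):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(xs)
    xm, ym = sum(xs) / n, sum(ys) / n
    slope = (sum((x - xm) * (y - ym) for x, y in zip(xs, ys))
             / sum((x - xm) ** 2 for x in xs))
    return ym - slope * xm, slope

beta, v = 1.0, 0.7
z_exact = 2.0 * math.cosh(beta * math.hypot(1.0, v))    # eigenvalues of H are +-|(1, v)|
trotter_numbers = [4, 8, 16]
dt2 = [(beta / p) ** 2 for p in trotter_numbers]
z_values = [z_trotter(beta, p, v) for p in trotter_numbers]
z_extrapolated, slope = linear_fit(dt2, z_values)

for p, z in zip(trotter_numbers, z_values):
    print(f"P = {p:2d}   Z = {z:.6f}   error = {z - z_exact:+.2e}")
print(f"extrapolated Z = {z_extrapolated:.6f}   error = {z_extrapolated - z_exact:+.2e}")
```

With Monte Carlo data each point would carry a statistical error bar and the fit would have to be weighted accordingly, which is the reason why the individual runs need high statistical accuracy.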

4.5 Non-Positive Actions and the Sign Problem

A proper definition of (real–time) correlation functions and a short discussion of their properties will be given in chapter 5 and appendix A. Here we just want to point out the problems that arise in their calculation by path integral Monte Carlo methods. In generalization of eq. (3.10) the correlation function of the path–coordinate $x(t)$ can be expressed as the path integral

$$ \langle x(t)\, x(0) \rangle = \frac{1}{Z} \int\limits_{x(0) = x(-i\beta)} \mathcal{D}x\; x(t)\, x(0)\; e^{iS[x]} \tag{4.46} $$

where the time contour $\mathcal{C}$ connecting the origin and $z = -i\beta$ in the complex plane contains the point $z = t$. The usual choice for this contour is given by a straight line along the real time axis from $z = 0$ to $z = t$ and back again, followed by another straight line from $z = 0$ to $z = -i\beta$. For this case the action can be written as

$$ iS[x, \bar{x}, \tilde{x}] = iS[x] - iS[\bar{x}] - S^E[\tilde{x}] \tag{4.47} $$

where we have explicitly split the path into the segments $x(t)$, $\bar{x}(t)$ and $\tilde{x}(\tau)$ corresponding to the forward, backward and imaginary–time part of the contour (which are connected by appropriate boundary conditions). The real–time action $S[x]$ (and correspondingly for $\bar{x}$) and the Euclidean action $S^E[\tilde{x}]$ are given by

$$ S[x] = \int_0^t dt' \Big[ \frac{m}{2}\, \dot{x}^2(t') - V(x(t')) \Big] \quad\text{and}\quad S^E[\tilde{x}] = \int_0^\beta d\tau \Big[ \frac{m}{2}\, \dot{\tilde{x}}^2(\tau) + V(\tilde{x}(\tau)) \Big] . \tag{4.48} $$

From eq. (4.47) it is clear that the weight factors $e^{iS[x]}$ and $e^{-iS[\bar{x}]}$ corresponding to the real–time part of the path are usually neither real nor positive and hence cannot be interpreted as a probability density for the Monte Carlo integration. In this case one splits the complex weight factor into its modulus $\rho[x]$ and phase $\phi[x]$ and uses the former as a probability density while the latter is included in the observable. Writing the correlation function again as a single path integral over the complete path $x(z)$ one thus uses

$$ \langle x(t)\, x(0) \rangle = \frac{\int \mathcal{D}x\; x(t)\, x(0)\; e^{i\phi[x]}\, \rho[x]}{\int \mathcal{D}x\; e^{i\phi[x]}\, \rho[x]} \quad\text{with}\quad e^{iS[x]} \equiv \rho[x]\, e^{i\phi[x]} . \tag{4.49} $$

As a consequence of this separation of the weight factor into modulus and phase, we now have to compute two path integrals to determine also the proper normalization. A similar situation arises in the computation of observables for many–Fermion systems since the anti–symmetrization with respect to the particle coordinates that has to be performed in this case leads to an action that can take negative as well as positive (real) values and the sign of the action has to be incorporated again in the observable and accumulated for the normalization. In both cases one usually speaks of the phase factor as the "sign" and the normalization as the "average sign". The "sign problem" arising from eq. (4.49) is given by the simple fact that for quantum systems that are not just characterized by small fluctuations around the classical path, the integrals in the numerator and denominator are similar to the integral over an oscillating function, i.e. they sum up contributions with similar modulus but very different phase. This leads to a numerically unfavorable cancellation of leading digits, and the problem is even worsened by the fact that the average sign in the denominator goes to zero in the deep quantum regime. For the calculation of real–time correlation functions these problems lead to an exponential decay of the signal–to–noise ratio for increasing time $t$, which severely hampers the direct calculation of dynamical properties of quantum systems using real–time path integral methods. On the other hand, this is a strong motivation to consider methods that try to extract dynamical information from imaginary–time path integrals, as will be discussed in detail in the following chapter.

Chapter 5

Correlation Functions and Inverse Problems

In this chapter we develop the framework that will be used to deduce transport properties like the conductance from correlation functions. We introduce two–point correlation functions in real and imaginary time and relate them to the (generalized) dynamic susceptibility using Linear Response Theory. The resulting set of equations for the calculation of transport coefficients from imaginary–time correlation functions involves the solution of an integral equation that can be classified as an ill–posed inverse problem. As a preparation for the applications in chapters 6 and 7 we discuss the Singular Value Decomposition (SVD) and Maximum Entropy Method (MEM) for the solution of linear inverse problems and compare them for analytically solvable test cases.

5.1 Time Correlation Functions and Linear Response

We start with a brief summary of the definition and properties of real–time (two–point) correlation functions. For systems close to thermodynamic equilibrium one can establish a relation between transport and correlations using Linear Response Theory. In particular we derive the Kubo formula for the electrical conductance that will be used in the applications in chapters 6 and 7. To bypass the dynamic sign problem connected with the Monte Carlo calculation of the real–time correlation function we introduce correlation functions in imaginary time. The relation between these correlation functions and transport coefficients involves the solution of an ill–posed inverse problem in the form of a Fredholm integral equation.

5.1.1 Real–Time Correlation Functions

Due to the large number of degrees of freedom, the dynamics of a many–particle system can hardly be expressed by the solution of the microscopic equations of motion. A more suitable approach is the calculation of temporal correlations of (macroscopic) observables. The simplest conceivable quantity is the two–point correlation function of observables A and B (for real times) defined by

\[ S_{AB}(t) = \langle A(t)B\rangle = \frac{1}{Z}\,\mathrm{tr}\left\{ e^{-\beta H} A(t)\, B \right\} \tag{5.1} \]

with

\[ Z = \mathrm{tr}\left\{ e^{-\beta H} \right\} \quad\text{and}\quad A(t) = e^{iHt} A\, e^{-iHt}. \tag{5.2} \]


It tells us how a measurement of the observable B at time t = 0 is correlated with the measurement of A at time t for a canonical ensemble specified by the density operator $\rho = Z^{-1}\exp(-\beta H)$. Besides the time–translation invariance $\langle A(t)B(t')\rangle = \langle A(t-t')B(0)\rangle$ in thermodynamic equilibrium, which was already used in the definition of the correlation function, we will use the property

\[ S_{AB}(-t) = \frac{1}{Z}\,\mathrm{tr}\left\{ e^{-\beta H} e^{-iHt} A\, e^{-\beta H} e^{\beta H} e^{iHt} B \right\} = S_{BA}(t - i\beta) \tag{5.3} \]

that results from the cyclic invariance of the trace operation. Since the correlation function is evaluated for a complex argument in eq. (5.3), one should note at this point that the definition eq. (5.1) can be extended to the strip $\{ z\in\mathbb{C} \mid -\beta < \mathrm{Im}(z) \le 0 \}$ of the complex time plane. We will discuss this aspect in more detail in section 5.1.4 dealing with imaginary–time correlation functions.

The Fourier transform of the correlation function

\[ S_{AB}(\omega) = \int_{-\infty}^{\infty} dt\; S_{AB}(t)\, e^{i\omega t} \tag{5.4} \]

is related to the spectral functions that can be observed in suitable scattering experiments and is thus often referred to as the scattering function. As we will see below, it can also be related to the (dissipative) response of a system to an external field using the fluctuation–dissipation theorem. For the scattering function eq. (5.3) leads to the detailed balance property

\[ S_{AB}(-\omega) = \int_{-\infty}^{\infty} dt\; S_{AB}(-t)\, e^{i\omega t} = e^{-\beta\omega}\, S_{BA}(\omega). \tag{5.5} \]

5.1.2 Linear Response Theory and Fluctuation–Dissipation Theorem

In this subsection we briefly summarize the derivation of the linear response of a system to a weak external field, which will lead us to the definition of the response function, the dynamic susceptibility and transport coefficients. We will also introduce the dissipative response, which gives the connection between scattering and response measurements in the form of the fluctuation–dissipation theorem.

Linear Response Theory and Response Function

We consider a system in thermodynamic equilibrium given by the Hamiltonian $H_0$. A weak time–dependent field F(t) coupling to the variable B of the system gives rise to the perturbation $V(t) = -F(t)B$. The disturbed system

\[ H(t) = H_0 + V(t) = H_0 - F(t)B \tag{5.6} \]

shall be described by time–dependent perturbation theory up to first order in the weak external field F. Initially the system is described by the equilibrium density matrix $\rho_0 = Z_0^{-1}\exp(-\beta H_0)$ corresponding to $H_0$. At $t = t_0$ the perturbation is switched on and the density matrix evolves according to

\[ \rho_I(t) = U_I(t,t_0)\, \rho_0\, U_I^\dagger(t,t_0). \tag{5.7} \]

Here the subscript I refers to the interaction picture, i.e. the picture defined with respect to $H_0$, and the evolution operator $U_I(t,t_0)$ can be expressed by the Dyson equation

\[ U_I(t,t_0) = \mathcal{T} \exp\left( -i\int_{t_0}^{t} dt'\, V_I(t') \right) = 1 - i\int_{t_0}^{t} dt'\, V_I(t') - \int_{t_0}^{t} dt' \int_{t_0}^{t'} dt''\, V_I(t')\, V_I(t'') + \dots \tag{5.8} \]

with the time ordering operator $\mathcal{T}$ that orders operators with larger times to the left of operators with smaller times. Under the assumption of weak external fields it is sufficient to consider only terms up to order $\mathcal{O}(V_I)$, and we get the following expression for the expectation value of the observable A

\[ \begin{aligned} \langle A(t)\rangle &= \mathrm{tr}\left\{ \rho_I(t)\, A_I(t) \right\} = \mathrm{tr}\left\{ U_I(t,t_0)\, \rho_0\, U_I^\dagger(t,t_0)\, A_I(t) \right\} \\ &\doteq \mathrm{tr}\left\{ \rho_0 A_I(t) \right\} - i\, \mathrm{tr}\left\{ \int_{t_0}^{t} dt'\, V_I(t')\, \rho_0\, A_I(t) \right\} + i\, \mathrm{tr}\left\{ \rho_0 \int_{t_0}^{t} dt'\, V_I(t')\, A_I(t) \right\} \\ &= \langle A\rangle_0 - i \int_{t_0}^{t} dt'\, \left\langle [A_I(t), V_I(t')] \right\rangle_0 \end{aligned} \tag{5.9} \]

with $\langle\cdot\rangle_0$ denoting the expectation value with respect to $H_0$, i.e. in the undisturbed system, and $[\cdot,\cdot]$ denoting the commutator of observables. Inserting $V_I(t) = -F(t)B_I(t)$ one gets in the limit $t_0 \to -\infty$ the main result of linear response theory

\[ \langle A(t)\rangle = \langle A\rangle_0 + \int_{-\infty}^{\infty} dt'\, \chi_{AB}(t-t')\, F(t') \quad\text{with}\quad \chi_{AB}(t) = i\theta(t)\, \langle [A(t), B(0)] \rangle_0. \tag{5.10} \]

Sometimes the Heaviside function θ(t) is not included in the definition of the response function $\chi_{AB}(t)$ [54, 55, 56], but we will restrict ourselves to the "retarded response" discussed above. Usually the experimental measurement of the response function is not performed in the time domain but by applying an oscillating external field. Therefore eq. (5.10) is translated to the (complex) frequency domain using the two–sided Laplace transform

\[ \chi_{AB}(z) = \int_{-\infty}^{\infty} dt\; \chi_{AB}(t)\, e^{izt} \quad\text{for}\quad \mathrm{Im}(z) > 0 \tag{5.11} \]

which results in

\[ \langle \Delta A(z)\rangle = \chi_{AB}(z)\, F(z) \tag{5.12} \]

where $\Delta A(t) = A(t) - \langle A\rangle_0$. The experimentally relevant quantities are given by the (generalized) dynamic susceptibility $\chi_{AB}(\omega + i0^+)$ and the transport coefficient $\chi_{AB} = \lim_{\omega\to 0} \chi_{AB}(\omega + i0^+)$. The notation $\langle\cdot\rangle_0$ was used in this paragraph to denote expectation values with respect to the unperturbed Hamiltonian $H_0$. Since this distinction was only necessary in the derivation of the linear response theory, we will denote the average with respect to the Hamiltonian of the unperturbed equilibrium system just by $\langle\cdot\rangle$ in the following.

Dissipative Response and Fluctuation–Dissipation Theorem

A very useful quantity is also the dissipative response defined by

\[ \chi''_{AB}(t) = \frac{1}{2}\, \langle [A(t), B] \rangle. \tag{5.13} \]

Its Fourier transform determines the dynamic susceptibility for all frequencies in the upper half of the complex plane since

\[ \int_{-\infty}^{\infty} \frac{d\omega}{\pi}\, \frac{\chi''_{AB}(\omega)}{\omega - z} = \int_{-\infty}^{\infty} dt\; i\,\langle [A(t), B] \rangle \int_{-\infty}^{\infty} \frac{d\omega}{2\pi i}\, \frac{e^{i\omega t}}{\omega - z} = \chi_{AB}(z) \tag{5.14} \]

where the second equality follows from Cauchy's integral formula after splitting the t–integration into the parts t > 0 and t < 0 and closing the contour of the ω–integration at infinity in the upper and lower half plane, respectively. The dissipative response is also related to the scattering function $S_{AB}(\omega)$ by the fluctuation–dissipation theorem

\[ \chi''_{AB}(\omega) = \frac{1}{2} \left( S_{AB}(\omega) - S_{BA}(-\omega) \right) = \frac{1}{2} \left( 1 - e^{-\beta\omega} \right) S_{AB}(\omega). \tag{5.15} \]

5.1.3 The Kubo Formula for the Conductance

Before we turn to the discussion of inverse problems we want to illustrate the linear response theory for transport coefficients by the derivation of the Kubo formula for the (linear) conductance, which will also be important for the applications in chapters 6 and 7.

The Current Operator

For the conductance measurement a source–drain voltage V(t) is applied across the device. In the steady state the current through the conductor can be defined as a weighted average

\[ I = w_d I_d + w_s I_s = -e \left[ w_d \frac{\partial n_d}{\partial t} - w_s \frac{\partial n_s}{\partial t} \right] \quad\text{with}\quad w_s + w_d = 1 \tag{5.16} \]

of the current $I_d$ leaving the conductor to the drain and the current $I_s$ entering the conductor from the source. These currents can be expressed by the change of the number of charges $n_s$ and $n_d$ in the source and drain, respectively. Except for the normalization of their sum to one, the weights $w_s$ and $w_d$ could be chosen arbitrarily. For the conductance of an interacting region coupled to leads by tunnel junctions we will find later (cf. section 6.3.2) that the theory can be formulated most elegantly if the weights are chosen in correspondence to the tunnel resistances of the junctions. For the moment we just use the general definition (5.16). For the linear response formalism we also need the charge operator

\[ Q = -e\left( w'_d n_d - w'_s n_s \right) \quad\text{with}\quad w'_s + w'_d = 1 \tag{5.17} \]

as the variable of the system which is coupled to the external voltage V(t). In principle the weights $w'_s$ and $w'_d$ can be chosen independently of the weights used in the current operator, since only the voltage difference between source and drain is relevant and not its distribution into the potential shifts of the source and drain electrodes. In order that the current I defined in eq. (5.16) is given by the time derivative of the charge (5.17) we choose $w'_s = w_s$ and $w'_d = w_d$.

Linear Response of the Current

In a current measurement a small voltage V is applied across the conductor resulting in the following perturbation of the system

\[ H = H_0 - Q\,V(t) \tag{5.18} \]

with Q given by eq. (5.17). We are interested in the influence of the perturbation on the current I. For a weak external voltage V(t) we can apply linear response theory to calculate the expectation value of the current operator. Since there is no current for vanishing voltage we get from eq. (5.10)

\[ \langle I(t)\rangle = \int_{-\infty}^{\infty} dt'\, \chi_{IQ}(t-t')\, V(t') \tag{5.19} \]

or equivalently

\[ \langle I(\omega)\rangle = \chi_{IQ}(\omega)\, V(\omega). \tag{5.20} \]

The DC conductance G has to be calculated as the limit $\omega\to 0$ of the dynamic susceptibility $\chi_{IQ}(\omega)$.

The Kubo formula

Using eqs. (5.14) and (5.15) we can express the dynamic susceptibility χIQ(ω) as

\[ \chi_{IQ}(\omega + i0^+) = \int_{-\infty}^{\infty} \frac{d\omega'}{2\pi}\, \frac{\left(1 - e^{-\beta\omega'}\right) S_{IQ}(\omega')}{\omega' - \omega - i0^+}. \tag{5.21} \]

Since the current I is the time derivative of the charge Q we can relate $S_{IQ}(\omega')$ to the scattering function $S_{II}(\omega')$ corresponding to the current autocorrelations

\[ S_{IQ}(\omega') = \int_{-\infty}^{\infty} dt\, \langle I(0)Q(-t)\rangle\, e^{i\omega' t} = \int_{-\infty}^{\infty} dt\, \langle I(0)I(-t)\rangle\, \frac{e^{i\omega' t}}{i\omega'} = \frac{S_{II}(\omega')}{i\omega'} \tag{5.22} \]

where we have integrated by parts and used that the correlation function vanishes for $t \to \pm\infty$. Inserting this result into eq. (5.21) and taking into account the formula of Sokhotski–Plemelj [57]

\[ \frac{1}{\omega - i0^+} = \mathcal{P}\left( \frac{1}{\omega} \right) + i\pi\, \delta(\omega) \tag{5.23} \]

we get for the conductance

\[ G(\omega) = \chi_{IQ}(\omega) = \frac{\left(1 - e^{-\beta\omega}\right)}{2\omega}\, S_{II}(\omega) - i\, \mathcal{P}\!\int_{-\infty}^{\infty} \frac{d\omega'}{2\pi}\, \frac{\left(1 - e^{-\beta\omega'}\right) S_{II}(\omega')}{(\omega' - \omega)\,\omega'}. \tag{5.24} \]

In the limit $\omega\to 0$ the integrand $-i\,\omega'^{-2}\left( S_{II}(\omega') - S_{II}(-\omega') \right)$ of the principal value integral is an odd function, so that the integral vanishes, and we get for the DC conductance the Kubo formula

\[ G = \frac{\beta}{2}\, S_{II}(\omega = 0) = \frac{\beta}{2} \int_{-\infty}^{\infty} dt\, \langle I(t)I\rangle. \tag{5.25} \]

5.1.4 Imaginary–Time Correlation Functions

In sections 5.1.1 and 5.1.2 we have seen how experimental quantities like the scattering function $S_{AB}(\omega)$, the dynamic susceptibility $\chi_{AB}(\omega)$, or the transport coefficient $\chi_{AB}$ can be determined from either the real–time correlation function $S_{AB}(t)$, the response function $\chi_{AB}(t)$, or the dissipative response $\chi''_{AB}(t)$. Unfortunately these real–time quantities are rather difficult to obtain by Monte Carlo evaluation of the corresponding path integral expressions due to the dynamical sign problem discussed in section 4.5.

In contrast, the computation of the imaginary–time correlation function defined by

\[ C_{AB}(\tau) = \langle A(-i\tau)B\rangle = \frac{1}{Z}\, \mathrm{tr}\left\{ e^{-(\beta-\tau)H} A\, e^{-\tau H} B \right\} \tag{5.26} \]

is not more difficult than that of time–independent expectation values. To relate $C_{AB}(\tau)$ to the experimentally observable quantities discussed above we use the fact that $C_{AB}(\tau)$ and $S_{AB}(t)$ are limiting cases of the same analytic function $S_{AB}(z) = \langle A(z)B\rangle$ defined on the strip $\{ z\in\mathbb{C} \mid -\beta < \mathrm{Im}(z) < 0 \}$ of the complex time plane. Thus we can relate $C_{AB}(\tau)$ to the scattering function by the simple substitution $t = -i\tau$ in the inversion of eq. (5.4), i.e.

\[ C_{AB}(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega\; e^{-\omega\tau}\, S_{AB}(\omega). \tag{5.27} \]

The relation to $\chi''_{AB}(\omega)$ then follows from the fluctuation–dissipation theorem (5.15) and is given by [54, 55]

\[ C_{AB}(\tau) = \frac{1}{\pi} \int_{-\infty}^{\infty} d\omega\; \frac{e^{-\omega\tau}}{1 - e^{-\beta\omega}}\, \chi''_{AB}(\omega). \tag{5.28} \]

Once the scattering function or the dissipative response has been determined from the inversion of eq. (5.27) or (5.28), the dynamic susceptibility and the transport coefficient can be determined according to eqs. (5.14) and (5.15) by

\[ \chi_{AB}(\omega + i0^+) = \int_{-\infty}^{\infty} \frac{d\omega'}{\pi}\, \frac{\chi''_{AB}(\omega')}{\omega' - \omega - i0^+} = \int_{-\infty}^{\infty} \frac{d\omega'}{2\pi}\, \frac{\left(1 - e^{-\beta\omega'}\right) S_{AB}(\omega')}{\omega' - \omega - i0^+}. \tag{5.29} \]

The connection between the imaginary–time correlation function and the dynamic susceptibility thus obtained represents a possibility to circumvent the dynamical sign problem in the calculation of experimentally observable quantities. The main problem in this scheme is the inversion of the Fredholm integral equations of the first kind (5.27) or (5.28), which will be discussed in the general framework of inverse problems in the following sections.
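The relation (5.27) can be checked numerically for a finite–dimensional toy system. The following Python sketch is only an illustration under stated assumptions: a small random Hermitian matrix stands in for H, the observables A and B are random Hermitian matrices, and the function names `imag_time_corr_trace` and `imag_time_corr_spectral` are hypothetical. For such a system $S_{AB}(\omega)$ is a sum of delta peaks at $\omega = E_n - E_m$ with weights $2\pi\, p_m A_{mn} B_{nm}$, where $p_m = e^{-\beta E_m}/Z$, so the integral in eq. (5.27) reduces to a finite sum.

```python
import numpy as np

def imag_time_corr_trace(H, A, B, beta, tau):
    """C_AB(tau) from the trace definition, eq. (5.26)."""
    E, V = np.linalg.eigh(H)

    def exp_minus(s):
        # e^{-sH} built from the eigen-decomposition of H
        return (V * np.exp(-s * E)) @ V.conj().T

    Z = np.sum(np.exp(-beta * E))
    return np.trace(exp_minus(beta - tau) @ A @ exp_minus(tau) @ B) / Z

def imag_time_corr_spectral(H, A, B, beta, tau):
    """C_AB(tau) from eq. (5.27), with S_AB(omega) a sum of delta peaks."""
    E, V = np.linalg.eigh(H)
    A_e = V.conj().T @ A @ V                 # A in the eigenbasis of H
    B_e = V.conj().T @ B @ V
    p = np.exp(-beta * E)
    p /= p.sum()                             # Boltzmann weights p_m
    omega = E[None, :] - E[:, None]          # omega[m, n] = E_n - E_m
    weight = p[:, None] * A_e * B_e.T        # p_m A_mn B_nm
    return np.sum(weight * np.exp(-omega * tau))

def random_hermitian(rng, n):
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

# Toy data (assumed, not taken from the thesis)
rng = np.random.default_rng(0)
H, A, B = (random_hermitian(rng, 4) for _ in range(3))
beta, tau = 2.0, 0.7

c_trace = imag_time_corr_trace(H, A, B, beta, tau)
c_spectral = imag_time_corr_spectral(H, A, B, beta, tau)
# Cyclic invariance of the trace also implies C_AB(beta - tau) = C_BA(tau)
c_reflected = imag_time_corr_trace(H, A, B, beta, beta - tau)
c_swapped = imag_time_corr_trace(H, B, A, beta, tau)
```

Both routes should agree to rounding accuracy. The direct problem, i.e. going from the spectral weights to $C_{AB}(\tau)$, is harmless; the difficulty discussed in the following sections lies entirely in the opposite direction.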

5.2 Linear Inverse Problems

Integral equations like (5.27) or (5.28) are examples of linear inverse problems that are encountered in a broad range of physical and technical applications. In this section we introduce the operator formulation that is used in the theory of inverse problems [58, 59, 60]. In section 5.2.2 we discuss the important aspect of ill–posedness that characterizes the difficulties encountered in the solution of such problems and forces us to consider (approximate) regularized solutions.

5.2.1 Definition and Examples of Inverse Problems

Operator Formulation of Inverse Problems

A typical inverse problem is the identification of a property by an indirect measurement. Instead of the values of the quantity x, which is not directly accessible, we can only measure the values y that are related to the property of interest by a (physical) model that can be quantified by the relation

\[ y = Kx, \qquad x\in X,\quad y\in Y,\quad K: X\to Y \tag{5.30} \]

where X is the space of possible values for x and Y is the space of possible measurements of y. In this thesis we will only consider linear inverse problems for which K is a (compact) linear operator. The sets X and Y will be chosen as the Hilbert space $\mathbb{R}^n$ of n–dimensional real vectors in the discrete case or the Hilbert space $L^2$ of square integrable functions with the usual scalar products. The operator equation (5.30) represents an inverse problem when the left hand side y is given and our aim is to invert the relation to calculate the argument x. In contrast, the calculation of y for a given x is often called the direct problem.

Examples of Inverse Problems

Before we turn to a consideration of the problems connected with the inversion of eq. (5.30) we would like to mention just a few examples to illustrate the broad range of applications that lead to inverse problems. We start with the operator formulation of eqs. (5.27) and (5.28). Both relations represent a Fredholm integral equation of the first kind for the function $x(\omega) = S_{AB}(\omega)$ or $x(\omega) = \chi''_{AB}(\omega)$ and can be written as

\[ y(\tau) = (Kx)(\tau) \equiv \int_{-\infty}^{\infty} d\omega\; k(\tau,\omega)\, x(\omega), \qquad y\in L^2(0,\beta),\quad x\in L^2(-\infty,\infty), \tag{5.31} \]

with the kernel function k given by

\[ k(\tau,\omega) = \frac{e^{-\tau\omega}}{2\pi} \quad\text{or}\quad k(\tau,\omega) = \frac{1}{\pi}\, \frac{e^{-\tau\omega}}{1 - e^{-\beta\omega}}. \tag{5.32} \]

The inverse problem that we will be concerned with in the applications in chapter 6 has already been given in eq. (5.27), which connects the imaginary–time correlation function $C_{AB}(\tau)$ with the experimentally relevant scattering function $S_{AB}(\omega)$. As the real–time correlation function $S_{AB}(t)$ can be obtained from $S_{AB}(\omega)$ by a straightforward Fourier transform, the inverse problem (5.27) is sometimes also referred to as the "numerical analytical continuation" of the imaginary–time data to real times. Due to the sign problem in quantum Monte Carlo calculations of real–time correlation functions, eq. (5.27) represents an important example of inverse problems with applications to the numerical simulation of quantum systems.

Another application of inverse problems, which is even better known to the general public, is computed tomography. Measuring the attenuation of the intensity of X–rays one can deduce the density distribution inside a living organism by solving an inverse problem. Details about the mathematical formulation of this inverse problem and the practical application can be found in [61].

As the final example we would like to mention another physical application which is known as the inverse scattering problem. From the measurement of the amplitude $u_s(x)$ of electromagnetic (or acoustic) waves outside the scattering region one tries to identify an inhomogeneous refraction index n(x). From the Helmholtz equation $(\Delta + k^2 n^2(x))\, u(x) = 0$ one can deduce in the first Born approximation the integral equation [58]

\[ u_s(x) = -\frac{k^2}{4\pi} \int_{\Omega} dy\; \frac{e^{ik|x-y|}}{|x-y|}\, f(y)\, u_i(y), \qquad |x| = R \tag{5.33} \]

that relates the amplitude $u_s(x)$ of the scattered wave to that of the incident wave $u_i(x)$. Here Ω is a sphere of radius R surrounding the scattering region. The integral equation can be viewed as a linear inverse problem for the function $f(x) = 1 - n^2(x)$.

5.2.2 Ill–Posedness and Regularization

Ill–Posed Inverse Problems

To understand the difficulties that may arise in the solution of an inverse problem we will examine the conditions that are necessary for a stable inversion of eq. (5.30). First of all it is important to note that the result of our measurement (or numerical simulation) will not be the exact value y but

\[ y^\epsilon = Kx + \epsilon \tag{5.34} \]

containing the measurement error $\epsilon$. To obtain a solution $x^\epsilon$ that approximates x with sufficient accuracy we need the following properties of the mapping K that constitute a well–posed (or correct) inverse problem according to the definition of Hadamard [58].

• Existence: To guarantee the existence of a solution for any measurement it is necessary that y = Kx has a solution for any value $y\in Y$, i.e. the range $R(K) = \{ Kx \mid x\in X \}$ of the operator K has to cover the space Y of all possible measurements. Though the exact value y always corresponds to the exact solution x, we might encounter the problem that there exists no solution for the actual measurement $y^\epsilon$ if the above condition is violated. To circumvent this problem one usually relaxes the condition y = Kx and defines as a generalized solution any element of X that minimizes the defect $\|Kx - y\|_Y^2$.

• Uniqueness: In order to find a unique solution for each measurement the linear operator K has to fulfill the requirement $N(K) = \{0\}$ where $N(K) = \{ x\in X \mid Kx = 0 \}$. Under this assumption one can define the inverse operator $K^{-1}$ that maps each measurement y to the corresponding solution x of the inverse problem. If Kx = y (or the minimization of $\|Kx - y\|_Y^2$) has more than one (generalized) solution, i.e. if the measurement y contains too little information to identify the solution x uniquely, it is necessary to select the solution of the inverse problem by further assumptions. Since irregular solutions are often characterized by strong oscillations that lead to a large value of $\|x\|_X^2$, one may choose among the possible solutions the one that additionally minimizes the norm. As we will see in section 5.3 this choice corresponds to the solution of the problem using the generalized inverse or Moore–Penrose inverse $K^+$ of K. This additional constraint is also closely related to the Tikhonov regularization that shall be discussed below. In the Maximum Entropy Method one instead chooses the solution that additionally maximizes an entropy functional S[x], as will be discussed in section 5.4. In the following we use the term "generalized inverse" for the mapping of a measurement to its uniquely determined solution with maximum entropy or minimum norm. If we want to discriminate between the two cases we will use the notion of "Moore–Penrose inverse" or "Maximum Entropy inverse". Other formulations of a generalized inverse are possible but will not be considered here. Among the possible choices the Moore–Penrose inverse and the Maximum Entropy inverse can be distinguished by certain general axioms about the selection of a solution [62].

• Stability: Even if the above requirements are fulfilled, or the corresponding problems have been circumvented by the introduction of generalized solutions and a generalized inverse, we have to ensure that the identified solution $x^\epsilon$ is a good approximation for x. The necessary condition to guarantee the stability of the solution with respect to the data error $\epsilon$ is the continuity of the (generalized) inverse of the operator K. The violation of the stability requirement represents the main difficulty in the solution of inverse problems and can only be addressed by regularization methods as outlined below.

An inverse problem that fulfills these properties, i.e. that has a stable and unique solution for any measurement, can easily be solved and is therefore denoted as well–posed (or correct). Otherwise (5.30) is referred to as an ill–posed (inverse) problem. In practice the notion of an inverse problem implicitly implies ill–posedness, since when the inversion of (5.30) is straightforward one rather considers the direct problem $x = K^{-1}y$. A linear operator K that acts between spaces X and Y of finite dimension, i.e. that can be represented as a matrix, is always bounded and hence continuous. Since the same applies to its (generalized) inverse, the above stability condition for the inverse problem y = Kx is always fulfilled. Nonetheless, for the practical application it makes little difference whether $K^+$ is not bounded or whether $K^+$ has a very large bound and thus leads to a finite but still very large amplification of the measurement errors. Thus we will make no distinction between infinite–dimensional ill–posed inverse problems and finite–dimensional, so–called ill–conditioned linear systems of equations. The numerical methods described below apply to both cases since, for example, the discretization of an ill–posed integral equation leads to an ill–conditioned set of linear equations.
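The last statement can be made concrete with a small numerical experiment. The following Python sketch is only an illustration: it discretizes the first kernel of eq. (5.32) on a finite frequency window with a midpoint rule, where the grid sizes, the cutoff `omega_max`, the choice β = 1 and the function name `discretized_kernel` are all assumptions made for this example and are not taken from the applications in the later chapters.

```python
import numpy as np

def discretized_kernel(beta=1.0, n_tau=40, omega_max=10.0, n_omega=80):
    """Midpoint-rule discretization of k(tau, omega) = exp(-tau*omega) / (2*pi).

    The frequency integral is truncated to [-omega_max, omega_max]
    (an assumption of this sketch).
    """
    tau = np.linspace(0.0, beta, n_tau)
    d_omega = 2.0 * omega_max / n_omega
    omega = -omega_max + d_omega * (np.arange(n_omega) + 0.5)
    K = np.exp(-np.outer(tau, omega)) * d_omega / (2.0 * np.pi)
    return tau, omega, K

tau, omega, K = discretized_kernel()
sigma = np.linalg.svd(K, compute_uv=False)   # singular values in descending order
condition_number = sigma[0] / sigma[-1]
```

For such a smooth exponential kernel the singular values fall off very quickly, so that the ratio of the largest to the smallest singular value reaches the limit of double precision. Data errors in the directions belonging to the small singular values are amplified accordingly when the linear system is inverted naively.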

Regularization of Inverse Problems

While the existence and uniqueness of the solution can easily be ensured by regarding generalized solutions and a generalized inverse, the fundamental problem of the instability of the solution requires the introduction of approximations in the form of a regularization of the solution. Mathematically, a regularization for the inverse problem y = Kx is defined as a family of continuous mappings $T_\gamma: Y\to X$ that converges (pointwise) to the (generalized) inverse of K in the limit $\gamma\to 0$, i.e. the original problem is replaced by the well–posed problem $y = T_\gamma^{-1}x$ (with solution $T_\gamma y$). The parameter γ that determines how well $T_\gamma$ approximates the generalized inverse is called the regularization parameter. An example of a regularization of eq. (5.30) is the Tikhonov regularization, in which the approximate solution for regularization parameter γ > 0 is defined by

\[ x_\gamma^\epsilon = T_\gamma y^\epsilon \equiv \underset{x\in X}{\mathrm{argmin}} \left( \|Kx - y^\epsilon\|_Y^2 + \gamma^2 \|x\|_X^2 \right) \tag{5.35} \]

where $\mathrm{argmin}_{x\in X}(f(x))$ or $\mathrm{argmin}\{ f(x) \mid x\in X \}$ denotes the argument $x\in X$ that minimizes the function f(x). Correspondingly the Maximum Entropy regularization is defined as

\[ x_\gamma^\epsilon = T_\gamma y^\epsilon \equiv \underset{x\in X}{\mathrm{argmin}} \left( \|Kx - y^\epsilon\|_Y^2 - 2\gamma\, S[x] \right) \tag{5.36} \]

with the entropy functional S[x] that will be given explicitly in section 5.4. In both cases the minimization of the defect assures that $x_\gamma^\epsilon$ approximates the solution of the inverse problem, while the minimization of the norm or the maximization of the entropy S[x] ensures the uniqueness and stability of the solution. The relative weight of the two contributions is determined by the regularization parameter γ. The absolute error of the regularized solution can be split into two parts

\[ x_\gamma^\epsilon - x = T_\gamma \left( y^\epsilon - y \right) + \left( T_\gamma - K^+ \right) y. \tag{5.37} \]

The first term on the right hand side results from the error $\epsilon$ in the data $y^\epsilon$. It usually decreases with increasing γ as $T_\gamma$ becomes smoother and more stable than the generalized inverse $K^+$. The second term is the regularization error that results from the approximation of $K^+$ by $T_\gamma$. It goes to zero for $\gamma\to 0$ by construction and usually increases with increasing γ as $T_\gamma$ approximates $K^+$ less well. To get an optimal solution for the inverse problem one has to find a compromise between the effect of the data error and the regularization error by the appropriate choice of the regularization parameter γ.

5.3 The Singular Value Decomposition (SVD)

In this section we use results of the spectral theory of compact linear operators to give an explicit representation of the Moore–Penrose inverse $K^+$. This allows us to write down the formal solution for the ill–posed problem y = Kx. We identify the source of the instability of the solution and discuss schemes for the regularization of the problem. Finally we show how additional information about the solution can be used to improve the quality of the result.

5.3.1 Formal Solution for Linear Inverse Problems

The Singular System

To give an explicit formula for the Moore–Penrose inverse $K^+$ of the compact linear operator $K: X\to Y$ we consider the spectrum of the (compact) hermitian operator $T = K^\dagger K$. According to standard theorems of functional analysis [58, 63] the spectrum of T is given by the sequence $\{\lambda_n\}_{n\in\mathbb{N}}$ of non–negative eigenvalues, which shall be ordered as $\lambda_1 \ge \lambda_2 \ge \dots$. The corresponding orthonormal eigenvectors shall be denoted by $v_n$. With the definitions (for $\sigma_n > 0$)

\[ \sigma_n = \sqrt{\lambda_n} \quad\text{and}\quad u_n \equiv \sigma_n^{-1} K v_n \tag{5.38} \]

one can easily check that the $u_n$ are the eigenvectors of $KK^\dagger$ with the same eigenvalues $\lambda_n$ and that $\{\sigma_n, v_n, u_n\}$ fulfills the conditions

\[ K v_n = \sigma_n u_n \quad\text{and}\quad K^\dagger u_n = \sigma_n v_n. \tag{5.39} \]

The system $\{\sigma_n, v_n, u_n\}$ is called the singular system of the operator K. The non–negative values $\{\sigma_n\}$ are denoted as singular values and the vectors (or functions) $\{v_n\}$ and $\{u_n\}$ as right and left singular vectors of K, respectively.

Expansions in the Singular Vectors

The singular vectors are particularly useful since $\{v_n\}$ is an orthonormal basis of the (orthogonal) complement $N(K)^\perp \subseteq X$ while $\{u_n\}$ is an orthonormal basis of the range R(K) of K. Consequently any vector $x\in X$ can be written as

\[ x = x_0 + \sum_{\sigma_n > 0} \langle x, v_n\rangle\, v_n \quad\text{with}\quad Kx_0 = 0 \quad\text{and}\quad \langle x_0, v_n\rangle = 0, \tag{5.40} \]

with norm

\[ \|x\|_X^2 = \|x_0\|_X^2 + \sum_{\sigma_n > 0} |\langle x, v_n\rangle|^2 \tag{5.41} \]

as a consequence of the orthogonality of the singular vectors and the theorem of Pythagoras. The expansion (5.40) and the first property in eq. (5.39) lead to the following representation for the operator K

\[ Kx = \sum_{\sigma_n > 0} \sigma_n\, \langle x, v_n\rangle\, u_n \tag{5.42} \]

which is called the singular value decomposition of K. In correspondence to eq. (5.40) we can use the left singular vectors $\{u_n\}$ to expand any measurement $y\in Y$ in the form

\[ y = y_0 + \sum_{\sigma_n > 0} \langle y, u_n\rangle\, u_n \quad\text{with}\quad \langle y_0, u_n\rangle = 0, \tag{5.43} \]

which will help us to derive an explicit representation of the Moore–Penrose inverse. The reader should note that the contributions $x_0$ and $y_0$ have been included to take into account that the existence and uniqueness conditions might be violated, i.e. that elements $y_0 \in R(K)^\perp$ and $x_0 \in N(K)$ exist. They can be set to zero if the range of K covers all possible measurements and if any two values of x lead to different measurements y.

Representation of the Moore–Penrose Inverse

By definition, a generalized solution minimizes the defect, which can be expressed by eqs. (5.42) and (5.43) as

\[ \|Kx - y\|_Y^2 = \|y_0\|_Y^2 + \sum_{\sigma_n > 0} \left| \sigma_n \langle x, v_n\rangle - \langle y, u_n\rangle \right|^2. \tag{5.44} \]

The minimum defect is obtained if the expansion coefficients $\langle x, v_n\rangle$ for the solution satisfy

\[ \langle x, v_n\rangle = \frac{\langle y, u_n\rangle}{\sigma_n} \quad\text{for all}\quad \sigma_n > 0. \tag{5.45} \]

According to eq. (5.40) the generalized solutions are determined by the coefficients $\langle x, v_n\rangle$ up to an element $x_0$ with $Kx_0 = 0$, and due to eq. (5.41) the unique solution of minimum norm is given by the choice $x_0 = 0$. Hence we have derived the following explicit expression for the Moore–Penrose inverse $K^+$ that maps the measurement $y\in Y$ onto the minimum norm solution of the inverse problem:

\[ x = K^+ y = \sum_{\sigma_n > 0} \frac{\langle y, u_n\rangle}{\sigma_n}\, v_n. \tag{5.46} \]

5.3.2 Regularization of the Solution

Instability of the Solution

With eq. (5.46) we have found an analytic expression for the generalized (minimum norm) solution of the linear inverse problem y = Kx. We have properly taken into account the possible violation of the existence and uniqueness property of the solution by the consideration of generalized solutions and the Moore–Penrose inverse $K^+$. Nevertheless the solution is as yet only a formal one because it still contains the instability of the inverse problem with respect to data errors, i.e. $K^+$ is still not a continuous linear operator. In eq. (5.46) this finds its expression in the fact that the sequence $\{\sigma_n\}$ of singular values decays rapidly to zero for increasing n. One can even use the characterization of the decay of $\sigma_n$ to define a degree of ill–posedness [58]. Here we merely want to point out the mechanism by which the data error $\epsilon$ gets amplified in the solution. The coefficients of the exact solution

\[ \frac{\langle y, u_n\rangle}{\sigma_n} = \frac{\langle Kx, u_n\rangle}{\sigma_n} = \frac{\sigma_n \langle x, v_n\rangle}{\sigma_n} = \langle x, v_n\rangle \tag{5.47} \]

are not affected by the rapid decay of $\sigma_n$ since the singular values cancel out. The coefficients for the actual measurement $y^\epsilon = y + \epsilon$, on the other hand, are given by

\[ \frac{\langle y^\epsilon, u_n\rangle}{\sigma_n} = \langle x, v_n\rangle + \frac{\langle \epsilon, u_n\rangle}{\sigma_n}. \tag{5.48} \]

For large n, i.e. small σn, the second term on the right hand side will dominate and totally corrupt the result for the solution.

Regularization by a Filter Function

Having identified the source of the instability in the formal solution (5.46) we can construct a regularization for the problem, i.e. a family of continuous linear operators $T_\gamma$ that converge to the generalized inverse $K^+$ in the limit $\gamma\to 0$. We get a bounded (and hence continuous) linear operator $T_\gamma$ by the definition

\[ T_\gamma y = \sum_{\sigma_n > 0} \frac{\langle y, u_n\rangle}{\sigma_n}\, F_\gamma(\sigma_n)\, v_n \tag{5.49} \]

with a filter function $F_\gamma(\sigma_n)$ that falls off strongly enough that the expansion coefficients in (5.49) remain finite, i.e.

\[ \sup_n \left| \sigma_n^{-1} F_\gamma(\sigma_n) \right| \equiv C_\gamma < \infty. \tag{5.50} \]

The boundedness of the operator $T_\gamma$ follows from

\[ \|T_\gamma y\|^2 = \sum_{\sigma_n > 0} \left| \sigma_n^{-1} F_\gamma(\sigma_n) \right|^2 |\langle y, u_n\rangle|^2 \le C_\gamma^2\, \|y\|^2. \tag{5.51} \]

In order that the family $T_\gamma$ converges to $K^+$ we further need the following property of the filter function

\[ \lim_{\gamma\to 0} F_\gamma(\sigma_n) = 1 \quad\text{for all}\quad \sigma_n \tag{5.52} \]

to recover eq. (5.46) in the limit $\gamma\to 0$. If the summation in eq. (5.49) runs over an infinite set of singular values one also needs the condition $|F_\gamma(\sigma_n)| < \mathrm{const}$ for all γ and $\sigma_n$ to ensure that the limit $\gamma\to 0$ and the summation can be interchanged [58].

For the appropriate choice of the regularization parameter γ the filter $F_\gamma(\sigma_n)$ should modify only slightly the first terms in the expansion (5.49), which contain large singular values and are thus little affected by the data error. The terms corresponding to small singular values, on the other hand, should be damped such that the instability of the ill–posed problem is removed.

Examples of Regularizations

The general construction outlined above shall be illustrated by two important examples. We would like to mention first the Tikhonov regularization that was already given in the introduction to inverse problems as eq. (5.35). Noting the equivalence

\[ \min_{x\in X} \left( \|Kx - y\|^2 + \gamma^2 \|x\|^2 \right) = \min_{x\in X} \|\tilde{K}x - \tilde{y}\|^2 \quad\text{with}\quad \tilde{K} = \begin{pmatrix} K \\ \gamma \mathbb{1} \end{pmatrix} \quad\text{and}\quad \tilde{y} = \begin{pmatrix} y \\ 0 \end{pmatrix} \tag{5.53} \]

we can deduce from the singular value decomposition of the operator $\tilde{K}$ that the Tikhonov regularization can also be formulated as

\[ T_\gamma y = \sum_{\sigma_n > 0} \frac{\sigma_n}{\sigma_n^2 + \gamma^2}\, \langle y, u_n\rangle\, v_n \tag{5.54} \]

where $\{\sigma_n, v_n, u_n\}$ is again the singular system of K. Hence the Tikhonov regularization corresponds to the filter function

\[ F_\gamma(\sigma_n) = \frac{\sigma_n^2}{\sigma_n^2 + \gamma^2} \tag{5.55} \]

that clearly fulfills the properties (5.50) and (5.52).

We will not discuss the choice of the regularization parameter γ for the Tikhonov regularization here (see e.g. [58, 60]) and rather concentrate on another method which is known as truncated singular value decomposition (TSVD). This method is characterized by the simple (low pass) filter

\[ F_\gamma(\sigma_n) = \begin{cases} 1 & \text{for } \sigma_n > \gamma \\ 0 & \text{for } \sigma_n < \gamma \end{cases}, \tag{5.56} \]

i.e. the expansion (5.46) is truncated to

\[ T_\gamma y = \sum_{\sigma_n > \gamma} \frac{\langle y, u_n\rangle}{\sigma_n}\, v_n. \tag{5.57} \]

While the terms containing large singular values (σn > γ) remain unaltered by the regularization, the corrupted coefficients of the singular vectors vn of higher order are set to zero.
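The two filters can be compared directly in a small numerical sketch. The function names and the use of NumPy are assumptions made here for illustration only; the thesis itself relies on NAG routines.

```python
import numpy as np

def tikhonov_filter(sigma, gamma):
    # Filter function of eq. (5.55)
    return sigma**2 / (sigma**2 + gamma**2)

def tsvd_filter(sigma, gamma):
    # Low pass filter of eq. (5.56)
    return (sigma > gamma).astype(float)

def filtered_solution(K, y, gamma, filt):
    # Regularized solution sum_n F(sigma_n) / sigma_n <y, u_n> v_n
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    coeff = np.zeros_like(s)
    nz = s > 0                                  # skip vanishing singular values
    coeff[nz] = filt(s[nz], gamma) / s[nz] * (U.T @ y)[nz]
    return Vt.T @ coeff

# Hypothetical toy operator with one very small singular value
K = np.diag([2.0, 1.0, 0.01])
y = K @ np.ones(3)
x_tsvd = filtered_solution(K, y, 0.1, tsvd_filter)       # third component is cut off
x_tikh = filtered_solution(K, y, 0.1, tikhonov_filter)   # third component is damped
```

In this toy case the TSVD discards the component belonging to the small singular value entirely, while the Tikhonov filter damps it smoothly.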

Choice of the Regularization Parameter

Depending on the regularization method one uses, a wide range of (empirical) criteria for the choice of the regularization parameter exists [60]. An intuitive criterion that is very easy to implement for the TSVD and can be shown to have certain mathematical optimality properties [58] is the discrepancy principle of Morozov.

For the noisy data $y^\epsilon$ with $\|y^\epsilon - y\| = \epsilon$ it makes no sense to look for a solution $x^\epsilon_\gamma$ whose image $K x^\epsilon_\gamma$ approximates $y^\epsilon$ better than the measurement error $\epsilon$, since otherwise one runs the risk of fitting the solution not only to the data but also to the measurement errors, which leads to unstable results as discussed in section 5.2. Hence the discrepancy principle determines the regularization parameter $\gamma$ from the relation

$$ \epsilon^2 = \|y^\epsilon - K x^\epsilon_\gamma\|^2 = \|y^\epsilon\|^2 - \sum_{\sigma_n>\gamma} |\langle y^\epsilon,u_n\rangle|^2, \tag{5.58} $$

i.e. the cutoff $\gamma$ is chosen in such a way that the expansion of the data $y^\epsilon$ in terms of the singular vectors $u_n$ is just accurate within the given measurement error.
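For a discretized problem the relation (5.58) can in general only be satisfied approximately. A minimal sketch, assuming that one keeps the smallest number of terms for which the residual has dropped to the noise level (the function name and this stopping rule are illustrative choices, not taken from the thesis):

```python
import numpy as np

def tsvd_discrepancy(K, y_noisy, eps):
    """TSVD solution with the cutoff chosen by the discrepancy principle."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    beta = U.T @ y_noisy                        # coefficients <y^eps, u_n>
    total = float(y_noisy @ y_noisy)            # ||y^eps||^2
    # squared residual after keeping the first k terms, k = 0 .. len(s)
    resid2 = total - np.concatenate(([0.0], np.cumsum(beta**2)))
    hits = np.nonzero(resid2 <= eps**2)[0]
    k = int(hits[0]) if hits.size else len(s)   # keep everything if never reached
    x = Vt[:k].T @ (beta[:k] / s[:k])           # assumes the kept s[:k] are nonzero
    return x, k

# Hypothetical example: the noise only hits the weakly visible component
K = np.diag([2.0, 1.0, 0.01])
y_noisy = K @ np.ones(3) + np.array([0.0, 0.0, 0.02])
x_rec, n_terms = tsvd_discrepancy(K, y_noisy, eps=0.05)
```

The returned number of terms plays the role of the cutoff $\gamma$: all singular values beyond it are treated as lying below the noise level.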

5.3.3 Additional Constraints

As proven in appendix A the scattering function $S_{AA}(\omega)$ for the autocorrelation function of the (hermitian) observable $A$ is always positive. This information represents an important additional constraint on the solution of the inverse problem (5.27) for $A = B$. In this section we want to introduce a method [64] which uses exactly this additional information to enhance the resolution of the solution. We will also comment on two other methods [16, 65] that aim at ensuring positivity of the solution. One might also think of using sum rules for the scattering function as additional constraints. The normalization of $S_{AA}(\omega)$, i.e. the zeroth moment, is given by the data $C_{AA}(\tau = 0)$ and thus is well recovered in the SVD solution without further constraints. If higher moments of $S_{AA}(\omega)$ are known, they could be used in a similar way as described for the positivity constraint below.

Positive Solutions for Linear Inverse Problems

The additional constraint that a (physical) solution of the inverse problem has to be positive represents valuable information that should be useful for its reconstruction from incomplete or corrupted data. The TSVD approach (5.57) by itself does not ensure a positive solution even if the exact physical result has this property since the contribution of the ”invisible” components that are corrupted by the measurement errors has been truncated. The idea of the approach of de Villiers et al. [64] is to extend the TSVD solution to

$$ x = \sum_{\sigma_n>\gamma} \frac{\langle y,u_n\rangle}{\sigma_n}\, v_n + \sum_{\gamma>\sigma_n>0} c_n v_n \tag{5.59} $$

and determine the coefficients $c_n$ from the positivity property of the solution. In order to get a unique solution and to retain the information contained in the first terms of eq. (5.59) one requires additionally that the norm of the solution, and hence that of the added components, should be minimal, i.e. one solves the following constrained minimization problem

$$ x^\epsilon_\gamma = T_\gamma y^\epsilon = \operatorname{argmin}\left\{ \|x\|_X^2 \;\middle|\; x \in X,\ A_> x = b,\ x \ge 0 \right\} \tag{5.60} $$

where

$$ b_n = \frac{\langle y^\epsilon,u_n\rangle}{\sigma_n} \quad \text{for } \sigma_n > \gamma \tag{5.61} $$

are the coefficients obtained by the TSVD approach and the operator $A_>$ defined by

$$ (A_> x)_n = \langle x,v_n\rangle \quad \text{for } \sigma_n > \gamma \tag{5.62} $$

maps $x$ onto its first expansion coefficients in terms of the singular vectors, i.e. the constraint $A_> x = b$ just fixes the coefficients for $\sigma_n > \gamma$ to their TSVD values. It is also useful to define analogously the operator $A_<$ with

$$ (A_< x)_n = \langle x,v_n\rangle \quad \text{for } 0 < \sigma_n < \gamma \tag{5.63} $$

that maps $x$ to the vector of "invisible" expansion coefficients with $\gamma > \sigma_n > 0$. The adjoint operators $A_<^\dagger$ and $A_>^\dagger$ are simply given by

$$ A_>^\dagger b = \sum_{\sigma_n>\gamma} b_n v_n \quad\text{and}\quad A_<^\dagger c = \sum_{\gamma>\sigma_n>0} c_n v_n. \tag{5.64} $$

Using these operators the minimization problem (5.60) can also be formulated in terms of the additional coefficients $c_n$ of the SVD expansion as

$$ c = \operatorname{argmin}\left\{ \|c\|^2 \;\middle|\; c_n \in \mathbb{R},\ A_>^\dagger b + A_<^\dagger c \ge 0 \right\}. \tag{5.65} $$

Though this problem can be approached by standard methods for quadratic programming (such as e04nfc from the NAG C library, mark 7 [40]) it is complicated by the fact that for ill–posed problems usually the number of known coefficients, i.e. the dimension of b, is rather small while the vector c has a large dimension. Therefore it is favorable to consider the dual problem [64]

$$ \lambda = \operatorname{argmin}\left\{ D(\lambda) \equiv -\langle b,\lambda\rangle + \frac{1}{2} \left\| \left(A_>^\dagger \lambda\right)_+ \right\|^2 \;\middle|\; \lambda_n \in \mathbb{R} \right\} \tag{5.66} $$

where the notation $(x)_+$ denotes the positive part of $x$. The problem (5.66) is equivalent to (5.65) by Fenchel's duality theorem for convex programming [66, 67]. The dual problem has the advantage that $\lambda$ has the same dimension as $b$, i.e. consists just of a small number of values, and thus can be determined much more easily than the vector $c$ in the primal problem (5.65). In addition eq. (5.66) is an unconstrained minimization problem and does not even require the use of quadratic programming methods but can be approached simply by the application of Newton's method to the gradient $\nabla D(\lambda)$. As shown in appendix B this leads to the following linear system for $\lambda$ that has to be solved selfconsistently

$$ H(\lambda)\, \lambda = b \quad\text{with}\quad (H(\lambda))_{n,n'} = \left\langle v_n, \frac{1 + \operatorname{sgn}(A_>^\dagger \lambda)}{2}\, v_{n'} \right\rangle. \tag{5.67} $$

Having obtained the optimal value for the dual problem the solution of the inverse problem can be obtained from [64, 66]

$$ x = \Big( \sum_{\sigma_n>\gamma} \lambda_n v_n \Big)_+ \tag{5.68} $$

which is obviously positive (even for noisy data). Still the noise can cause problems if it corrupts the coefficients $b_n$ determined from the TSVD. In this case the primal as well as the dual problem might fail if no choice for the additional coefficients $c_n$ leads to a positive solution. As a remedy one can allow a small deviation of the coefficients $b_n$ from their TSVD values.
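The selfconsistent solution of eq. (5.67) can be sketched as a simple fixed-point iteration. The following is only an illustration under stated assumptions: the retained singular vectors are stored as columns of a matrix that is orthonormal with respect to the plain dot product (any grid weight is assumed to be absorbed), the iteration starts from the TSVD coefficients, and convergence is not guaranteed in general. All names are hypothetical.

```python
import numpy as np

def positive_extension(V_keep, b, max_iter=50):
    """
    V_keep : (n_grid, k) array, columns are the retained singular vectors v_n.
    b      : (k,) array of TSVD coefficients, eq. (5.61).
    Returns the positive solution of eq. (5.68) and the dual variables lambda.
    """
    lam = np.array(b, dtype=float)               # start from the TSVD coefficients
    for _ in range(max_iter):
        active = (V_keep @ lam) > 0              # (1 + sgn(A_>^dagger lam)) / 2 on the grid
        H = V_keep.T @ (V_keep * active[:, None])
        lam_new = np.linalg.lstsq(H, b, rcond=None)[0]   # solve H(lam) lam = b
        converged = np.allclose(lam_new, lam, rtol=1e-12, atol=1e-14)
        lam = lam_new
        if converged:
            break
    x = np.maximum(V_keep @ lam, 0.0)            # positive part, eq. (5.68)
    return x, lam

# Hypothetical three-point example with one retained singular vector
v = np.array([[1.0], [1.0], [-1.0]]) / np.sqrt(3.0)
b = np.array([1.0])
x_pos, lam = positive_extension(v, b)
```

In this example the plain TSVD result $b\,v$ has a negative third component, whereas the extended solution is positive and still reproduces the prescribed coefficient $\langle x, v\rangle = b$.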

Other Methods

We briefly mention two other methods [16, 65, 68] that are supposed to deliver positive SVD solutions. Both methods are easier to implement than the approach described above but suffer from serious drawbacks. The first method ensures positivity of the solution by choosing the filter function

$$ F_\gamma(\sigma_n) = \begin{cases} \dfrac{\sigma_n-\gamma}{\sigma_1-\gamma} & \sigma_n > \gamma \\ 0 & \gamma > \sigma_n > 0 \end{cases}. \tag{5.69} $$

Bertero et al. [65] showed that this filter leads to a positive solution in the absence of noise but they observed also that it reduces the resolution of the result in comparison with the TSVD approach. Though it is very simple to implement, this method is unsatisfactory since it neither guarantees a positive solution for noisy data nor makes use of the additional information to enhance the resolution.

The second method [16, 68] we would like to discuss is based on the same idea as the approach by de Villiers et al. and tries to extend the TSVD solution using additional information to determine further expansion coefficients and thus enhance the resolution. Though Hüpper claims that his method leads to (almost) positive solutions if the exact result has this property, the additional information he actually uses is that certain parts of the solution should be (almost) zero. The solution of this so–called super resolution SVD method can be given as

$$ x^\epsilon_\gamma = T_\gamma y^\epsilon = \operatorname{argmin}\left\{ \|P_{\mathcal{D}}\, x\|^2 \;\middle|\; x \in X,\ A_> x = b \right\} \tag{5.70} $$

where $P_{\mathcal{D}}$ is the projector on the range $\mathcal{D}$ for which the solution is supposed to vanish and the constraint $A_> x = b$ again fixes the first coefficients to their TSVD values. In contrast to de Villiers' method, the problem (5.70) introduces additional parameters to specify the range $\mathcal{D}$. Further one has to determine optimal values for the regularization parameter of the TSVD and the number of additional coefficients $c_n$ that shall be considered.

In our tests of the method for an analytically solvable model we found that the optimal determination of this large number of free parameters is rather difficult even if the exact solution is known. Certain choices of the range $\mathcal{D}$ in which $x$ is minimized also led to serious artifacts though they were in accord with the exact properties of the solution. For these reasons we have employed de Villiers' extension of the SVD method to ensure positivity of the solution instead of Hüpper's super resolution SVD.

5.4 The Maximum Entropy Method (MEM)

In this section we describe the Maximum Entropy Method that approaches the inverse problem $y = Kx$ from a statistical point of view. It is restricted to inverse problems that have a positive solution $x > 0$ and thus is frequently applied for the reconstruction of the spectrum $S_{AA}(\omega)$ of an autocorrelation function. In subsection 5.4.1 we review Bayes' theorem and its application to the solution of inverse problems. Specifying the likelihood functional and the (entropic) prior probability in the general statistical expression we get the maximum entropy functional (5.36) for the regularized solution of the inverse problem. Depending on the choice of the overall sign this functional either has to be minimized as in eq. (5.36) or maximized as the name of the method suggests. In subsection 5.4.3 we describe how statistical reasoning can be used to derive a criterion for the choice of the regularization parameter. We will just summarize the key aspects of the method and refer the reader to the review of Jarrell and Gubernatis [69] and the article of Bryan [70] for details.

5.4.1 Bayesian Inference

Since the data $y$ of our inverse problem generally are the result of a measurement they can be interpreted as a realization of the random variable $Y$ having a certain distribution. As an idealization one usually assumes that $Y$ has a Gaussian distribution with the exact value as the mean and a width that reflects the statistical error of the measurement. The inverse problem can then be viewed as the relation $Y = KX$ between the random variables $Y$ and $X$ and its solution represents the inference of the distribution of the variable $X$ from that of $Y$. Having established this probabilistic interpretation we can use statistical reasoning to find an expression for the distribution of $X$ for a given measurement $Y = y$, i.e. for the conditional probabilities $P[x|y]$ of observing $X = x$ when $Y = y$ has already occurred. As the solution of the inverse problem we could then define the mean

$$ x_{\text{mean}} = \int dx\; x\, P[x|y]. \tag{5.71} $$

Since it is computationally easier and leads to a formulation analogous to the numerical approach (5.35) to inverse problems, one usually defines as the solution the most probable value of $x$, i.e.

$$ x_{\max} = \max_x P[x|y]. \tag{5.72} $$

To calculate the conditional probability $P[x|y]$ we use the theorem of Bayes that gives two equivalent expressions for the joint probability $P[x,y]$ of observing simultaneously $Y = y$ and $X = x$

$$ P[x|y]\,P[y] = P[x,y] = P[y|x]\,P[x]. \tag{5.73} $$

From this relation one directly gets

$$ P[x|y] = \frac{P[y|x]\,P[x]}{P[y]} \tag{5.74} $$

that expresses the conditional probability $P[x|y]$ in terms of the likelihood $P[y|x]$ of the observation of $Y = y$ for given $X = x$ and the prior probability $P[x]$ we can assign to the event $X = x$ not taking into account the measurement $y$. The probability $P[y]$ which is also called evidence is a normalization factor that follows from the likelihood and the prior probability due to

$$ \int dx\, P[x|y]\,P[y] = \int dx\, P[x,y] = P[y]. \tag{5.75} $$

Eqs. (5.72) and (5.74) are the statistical basis for the maximum entropy method that will be made concrete by the specification of the likelihood and the entropic prior probability in the following.

5.4.2 The Maximum Entropy Functional

Least Squares Deviation

To give an expression for the likelihood $P[y|x]$ of the measurement $Y = y$ if $X$ assumes the value $x$ we have to analyze the error of our measurement process. For the applications we can restrict ourselves to the case that $y$ consists of a finite set of data points $y = (y_1, y_2, \ldots, y_N)^T$. We will make the assumption of an ideal measurement which has a Gaussian distribution around the mean value $\bar y = Kx$, i.e.

$$ P[y|x] \propto e^{-\chi^2/2} \quad\text{with}\quad \chi^2 = \sum_{i,j=1}^{N} (y_i - \bar y_i)\, C^{-1}_{ij}\, (y_j - \bar y_j) = \|y - Kx\|_C^2 \tag{5.76} $$

which is well known from the parameter determination by the method of least squares. The metric tensor $C^{-1}$ is given by the inverse covariance matrix of the measurements. If the data points $y_i$ can be measured independently, $C = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$ is diagonal and $\chi^2$ is given by

$$ \chi^2 = \sum_{i=1}^{N} \frac{|y_i - (Kx)_i|^2}{\sigma_i^2} \tag{5.77} $$

where $\sigma_i^2$ is the variance for the measurement of $y_i$. In the general case the data points $y_i$ are not independent and the covariance matrix has to be estimated from $M$ repeated measurements by

$$ C_{ij} = \frac{1}{M(M-1)} \sum_{k=1}^{M} \left(y_i^{(k)} - \hat y_i\right)\left(y_j^{(k)} - \hat y_j\right) \tag{5.78} $$

where $y_i^{(k)}$ denotes the value of $y_i$ in the $k$-th measurement and $\hat y_i$ is the $i$-th component of the mean $\hat y$.
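As an illustration, the estimate (5.78) and the quadratic form (5.76) can be written in a few lines. The function names are chosen here for illustration; the sketch assumes that the covariance matrix is invertible, which requires more measurements than data points.

```python
import numpy as np

def covariance_of_mean(samples):
    """Eq. (5.78); samples is an (M, N) array whose k-th row is the measurement y^(k)."""
    M = samples.shape[0]
    dev = samples - samples.mean(axis=0)
    return dev.T @ dev / (M * (M - 1))

def chi_squared(y, K, x, C):
    """Eq. (5.76): (y - Kx)^T C^{-1} (y - Kx), evaluated without forming C^{-1}."""
    r = y - K @ x
    return float(r @ np.linalg.solve(C, r))

# Hypothetical usage with two repeated measurements of a single data point
C_est = covariance_of_mean(np.array([[1.0], [3.0]]))
chi2 = chi_squared(np.array([2.0]), np.array([[1.0]]), np.array([0.0]), C_est)
```

For a diagonal covariance matrix the second function reduces to the sum (5.77).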

Entropic Regularization

As we have discussed in sections 5.2 and 5.3 the maximization of the likelihood (or equivalently the minimization of $\chi^2$) alone will not lead to useful results due to the instability of the inverse problem that amplifies the error in the data. Hence the prior probability $P[x]$ in eq. (5.74) cannot be neglected but plays an important role for the regularization of the problem. Since the solution $x$ itself has to be positive it can be normalized and interpreted as a probability distribution. In the Maximum Entropy Method one uses as the prior probability $P[x] = \exp(-I[x,x^*])$ where $I[x,x^*] \ge 0$ denotes the information that is contained in the distribution $x$ relative to a default solution $x^*$ which is also termed default map. The default map should be constructed from any prior knowledge of the solution besides the measurement $y$ and is set to a positive constant value if no such information is available. A measure for the information $I[x,x^*]$ contained in $x$ relative to $x^*$ can be derived from certain general axioms that shall be stated here for the case that $x$ (and $x^*$) consists of a finite set $x = (x_1, \ldots, x_n)$ of values that can be interpreted as the probabilities of $n$ events due to the positivity of $x$ [59].

• Similar solutions should contain a similar amount of information in order to obtain a method that gives stable results, i.e. $I[x,x^*]$ is continuous.

• The information of a solution relative to itself should be zero, i.e. $I[x,x] = 0$ for all $x$.

• The information of a distribution $x$ shall be independent of the labeling of the events $\{1, \ldots, n\}$, i.e. $I(\pi(x), \pi(x^*)) = I(x,x^*)$ for all permutations $\pi$.

• The information associated with the exclusion of $n - m$ events from a uniform distribution shall be the higher the more events are excluded, i.e. $I\left(\frac{1}{m}, \ldots, \frac{1}{m}, 0, \ldots, 0;\ \frac{1}{n}, \ldots, \frac{1}{n}\right)$ grows monotonically with $n - m$.

• If the set of events $\{1, \ldots, n\}$ is split into parts $\{1, \ldots, r\}$ and $\{r+1, \ldots, n\}$ the information of the whole set shall be the weighted sum of the information of the subsets according to

$$ I(x; x^*) = I(q_1, q_2; q_1^*, q_2^*) + q_1\, I\!\left(\frac{x_1}{q_1}, \ldots, \frac{x_r}{q_1};\ \frac{x_1^*}{q_1^*}, \ldots, \frac{x_r^*}{q_1^*}\right) + q_2\, I\!\left(\frac{x_{r+1}}{q_2}, \ldots, \frac{x_n}{q_2};\ \frac{x_{r+1}^*}{q_2^*}, \ldots, \frac{x_n^*}{q_2^*}\right) $$

with $q_1 = \sum_{i=1}^{r} x_i$ and $q_2 = \sum_{i=r+1}^{n} x_i$ and analogous definitions for $q_1^*$ and $q_2^*$.

One can show that for normalized $x$ and $x^*$ the most general form consistent with these axioms is given by [59]

$$ I[x,x^*] = \gamma \sum_{i=1}^{n} x_i \ln\frac{x_i}{x_i^*} \quad\text{with } \gamma > 0. \tag{5.79} $$

It can be checked easily that for normalized $x$ and $x^*$ the information $I[x,x^*]$ attains its (only) minimum value of zero for $x = x^*$ such that $P[x] = \exp(-I[x,x^*])$ is a well defined probability. If $x$ and $x^*$ are not normalized eq. (5.79) is generalized to

$$ I[x,x^*] = \gamma \sum_{i=1}^{n} \left[ x_i \ln\frac{x_i}{x_i^*} - (x_i - x_i^*) \right] = -\gamma S[x] \quad\text{with } \gamma > 0, \tag{5.80} $$

where we have written the (regularization) parameter $\gamma$ separately and defined the entropy functional $S[x]$ suppressing the explicit reference to the default map. The generalization for continuous solutions $x(\omega)$ is given by

$$ S[x] = \int d\omega \left[ x(\omega) - x^*(\omega) - x(\omega) \ln\frac{x(\omega)}{x^*(\omega)} \right]. \tag{5.81} $$

Inserting our result for the prior probability $P[x]$ and the likelihood $P[y|x]$ into eq. (5.74) and using the definition (5.72) for the maximum entropy solution of the inverse problem with (corrupted) data $y^\epsilon$ and regularization parameter $\gamma$ we find

$$ x^\epsilon_\gamma = \max_x\, e^{\gamma S[x] - \frac{1}{2}\chi^2[x]} = \min_x \left( \tfrac{1}{2}\chi^2[x] - \gamma S[x] \right) = \min_x \left( \|y^\epsilon - Kx\|_C^2 - 2\gamma S[x] \right) \tag{5.82} $$

as already given in eq. (5.36). Thus we have constructed a regularization for the inverse problem analogous to the Tikhonov regularization (5.35) but with a regularization or penalty term $-2\gamma S[x]$ that was motivated by statistical arguments based on the theorem of Bayes and the entropy as a measure for the (relative) information of a probability distribution.
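For a discretized solution the entropy of eq. (5.80) and the functional minimized in eq. (5.82) can be evaluated as in the following sketch. The function names and the explicit argument `C_inv` for the inverse covariance matrix are assumptions for illustration; the minimization itself (over positive $x$) is not shown.

```python
import numpy as np

def entropy(x, default):
    """S[x] = sum_i [x_i - x*_i - x_i ln(x_i / x*_i)] for strictly positive x and x*."""
    x = np.asarray(x, dtype=float)
    default = np.asarray(default, dtype=float)
    return float(np.sum(x - default - x * np.log(x / default)))

def mem_objective(x, y, K, C_inv, gamma, default):
    """(1/2) chi^2[x] - gamma S[x], the middle expression of eq. (5.82)."""
    r = y - K @ x
    return 0.5 * float(r @ C_inv @ r) - gamma * entropy(x, default)

# The entropy vanishes at the default map and is negative elsewhere
s_at_default = entropy([1.0, 2.0], [1.0, 2.0])
s_elsewhere = entropy([2.0, 2.0], [1.0, 2.0])
```

Since $S[x] \le 0$ with equality only for $x = x^*$, the term $-\gamma S[x]$ penalizes deviations from the default map.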

5.4.3 Determination of the Regularization Parameters

To close the discussion of the maximum entropy method we want to describe how statistical arguments can also be used to derive criteria for the choice of the regularization parameter $\gamma$.

Historic Maximum Entropy

In the so-called historic maximum entropy the choice of the regularization parameter $\gamma$ is performed such that $\chi^2[x]$ for the solution equals the number of data points of the (discretized) solution, i.e. for $x = (x_1, \ldots, x_n)$

$$ \chi^2[x] = n. \tag{5.83} $$

As in the parameter determination by a least squares fit, this choice is motivated by the fact that $n$ is the expectation value of the random variable $\chi^2[x]$ as defined in eq. (5.77) if the data $Y$ has a Gaussian distribution. Choosing values for $\chi^2$ that are smaller than the average $n$ can lead to an overly close fit to the (corrupted) data and thus to unsatisfactory results in the form of a non–regular solution, while too large values of $\chi^2$ result in too strong a regularization and thus a smoothing of the solution towards the default map.

Classic Maximum Entropy

In the classical maximum entropy approach also the value of the regularization parameter $\gamma$ is treated as the realization of a random variable and determined by Bayesian inference. One uses the assumption that for a given data set $y$ there is one appropriate value $\hat\gamma$ of the regularization parameter and that the distribution $P[\gamma|y]$ is sharply peaked around this value. Then the probability of a solution $x$ for the given measurement $y$ can be approximated as

$$ P[x|y] = \int d\gamma\, P[x,\gamma|y] = \int d\gamma\, P[x|\gamma,y]\, P[\gamma|y] \approx P[x|\hat\gamma,y] \propto e^{\hat\gamma S[x] - \frac{1}{2}\chi^2[x]}. \tag{5.84} $$

The inverse problem is solved by the maximization of this probability which is the same as eq. (5.82) with $\gamma = \hat\gamma$.

The optimal value $\hat\gamma$ of the regularization parameter on the other hand is determined by the maximization of $P[\gamma|y]$. To find an expression for this probability we first generalize eq. (5.74) to

$$ P[x,\gamma|y] = \frac{P[y|x,\gamma]\, P[x,\gamma]}{P[y]} = \frac{P[y|x]\, P[x|\gamma]\, P[\gamma]}{P[y]} \propto P[\gamma]\, e^{\gamma S[x] - \frac{1}{2}\chi^2[x,y]} \tag{5.85} $$

where we have used that the likelihood $P[y|x,\gamma] \propto \exp(-\chi^2[x]/2)$ of the data is independent of the regularization parameter and again estimated the prior probability by the entropic prior $P[x|\gamma] = \exp(\gamma S[x])$. The prior probability $P[\gamma]$ for the regularization parameter is either chosen as constant or as $P[\gamma] \propto \gamma^{-1}$ over a certain interval. We can now give the probability $P[\gamma|y]$ that has to be maximized for the determination of $\gamma$ as

$$ P[\gamma|y] = \int dx\, P[x,\gamma|y] = P[\gamma] \int dx\, e^{\gamma S[x] - \frac{1}{2}\chi^2[x,y]}. \tag{5.86} $$

For the practical calculations the integral over $x$ is approximated by a Gaussian integral and $\hat\gamma$ is determined by a selfconsistent maximization of the probabilities (5.84) and (5.86). For further details and another popular alternative (Bryan's method) we refer the reader to [70].

5.5 Test of the SVD Method

In this section we close the chapter on correlation functions and inverse problems with the application of the singular value decomposition to an exactly solvable but nontrivial model system. We briefly introduce the model of a damped harmonic oscillator and present exact results for the dipole absorption cross section $\sigma(\omega)$ and the imaginary–time displacement correlation function $R^2(\tau)$ that are related by an inverse problem similar to (5.27). The SVD is then applied to reconstruct the cross section $\sigma(\omega)$ from the correlation function $R^2(\tau)$ for different levels of added noise. To evaluate the accuracy of the method we compare the reconstructed spectra to results obtained for the same system by Gallicchio et al. [71] and Krilov et al. [72] using the Maximum Entropy Method.

5.5.1 An Exactly Solvable Model

Hamiltonian and Action of the Model

An exactly solvable model that has been studied extensively in the context of dissipative quantum systems [73, 74] is a harmonic oscillator linearly coupled to an environment. In the Caldeira–Leggett model [75] the environment is also described as a bath of harmonic oscillators. The corresponding Hamiltonian consists of three parts $H = H_S + H_R + H_I$ that are given by

$$ H_S = \frac{p^2}{2m} + \frac{m\omega_0^2}{2}\, x^2, \qquad H_R = \sum_\alpha \left[ \frac{p_\alpha^2}{2m_\alpha} + \frac{m_\alpha \omega_\alpha^2}{2}\, x_\alpha^2 \right], \qquad H_I = -x \sum_\alpha g_\alpha x_\alpha. \tag{5.87} $$

Here $H_S$ is the Hamiltonian of the isolated oscillator while $H_R$ describes the environment and $H_I$ specifies the coupling between the oscillator and the harmonic bath. To avoid the renormalization of the frequency $\omega_0$ of the system oscillator by the coupling to the bath, in the literature [73, 74] often an additional counter term is included in $H_I$, i.e. one uses

$$ H_I = \sum_\alpha \left[ \frac{g_\alpha^2}{2 m_\alpha \omega_\alpha^2}\, x^2 - x\, g_\alpha x_\alpha \right]. \tag{5.88} $$

For our purpose it is not necessary to cancel the frequency shift by a counter term and we will consider the Hamiltonian (5.87) to facilitate the comparison with the results obtained in [71, 72]. We are interested in the (imaginary–time) dynamics of the oscillator that can be described by path integrals over the path $x(\tau)$ with the Euclidean action

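$$ S[x] = \int_0^\beta d\tau \left[ \frac{m}{2}\, \dot x^2(\tau) + \frac{m\omega_0^2}{2}\, x^2(\tau) \right] - \frac{1}{2} \int_0^\beta d\tau \int_0^\beta d\tau'\; x(\tau)\, \Gamma(\tau - \tau')\, x(\tau'). \tag{5.89} $$

Here the first part is the action of the isolated oscillator. The second term follows from the explicit path integration over the bath degrees of freedom and describes the influence of the environment on the oscillator. As shown in appendix C.1 the integral kernel $\Gamma(\tau)$ can be expressed by the spectral density $J_R(\omega)$ of bath modes as

$$ \Gamma(\tau) = \int_0^\infty \frac{d\omega}{\pi}\, J_R(\omega)\, \frac{\cosh\!\left(\omega\left(\tfrac{1}{2}\beta - \tau\right)\right)}{\sinh\!\left(\tfrac{1}{2}\omega\beta\right)} \quad\text{where}\quad J_R(\omega) = \pi \sum_\alpha \frac{c_\alpha^2}{2 m_\alpha \omega_\alpha}\, \delta(\omega - \omega_\alpha). \tag{5.90} $$

Here $c_\alpha$ denotes the coupling constants that are written as $g_\alpha$ in eq. (5.87).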

The spectral density JR(ω) can also be related to the dynamical friction kernel ζ(t) that appears in the classical equation of motion for the damped harmonic oscillator

$$ m\ddot x(t) + \int_{-\infty}^{t} dt'\, \zeta(t - t')\, \dot x(t') + \left( m\omega_0^2 - \zeta(0) \right) x(t) = F(t). \tag{5.91} $$

As shown in appendix C.2 the relation between $J_R(\omega)$ and $\zeta(t)$ is given by [73, 74]

$$ \zeta(t) = \frac{2}{\pi} \int_0^\infty d\omega\, \frac{J_R(\omega)}{\omega} \cos(\omega t) \quad\text{or}\quad J_R(\omega) = \omega \int_0^\infty dt\, \zeta(t) \cos(\omega t). \tag{5.92} $$

In the following we will examine the model that is specified by the dynamic friction kernel

$$ \zeta(t) = \zeta_0 \left\{ e^{-\alpha_1 (ft)^2} \left[ 1 + a_1 (ft)^4 \right] + a_2 (ft)^4\, e^{-\alpha_2 (ft)^2} \right\} \tag{5.93} $$

with parameters $\zeta_0 = 225$, $a_1 = 1.486 \times 10^5$, $a_2 = 285$, $\alpha_1 = 903$, $\alpha_2 = 75$ (in atomic units, i.e. $a_0 = e = m_e = \hbar = 1$ and consequently $c = \alpha^{-1} \approx 137.036$) that resembles a model of an oscillator in a fluid of Lennard–Jones particles [71, 76]. As values of the parameter $f$ we will consider a "high damping" case $f = 1$ and a "low damping" case $f = 0.2$. The mass of the oscillator is chosen as $m = 1$ and the inverse temperature will be set to $\beta = 1$. This choice of parameters allows us to compare our results directly to the calculations of Gallicchio et al. [71] and Krilov et al. [72].
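The friction kernel (5.93) with the quoted parameters is easily evaluated numerically. The sketch below (function names are illustrative) also evaluates the coefficient $m\omega_0^2 - \zeta(0)$ of the restoring force in eq. (5.91), which vanishes for $\omega_0 = 15$ and $m = 1$.

```python
import numpy as np

# Parameters of eq. (5.93) in atomic units, as quoted in the text
ZETA0, A1, A2, ALPHA1, ALPHA2 = 225.0, 1.486e5, 285.0, 903.0, 75.0

def zeta(t, f=1.0):
    """Dynamic friction kernel of eq. (5.93)."""
    s = (f * np.asarray(t, dtype=float))**2          # (f t)^2
    return ZETA0 * (np.exp(-ALPHA1 * s) * (1.0 + A1 * s**2)
                    + A2 * s**2 * np.exp(-ALPHA2 * s))

def restoring_coefficient(omega0, m=1.0):
    """Coefficient m*omega0^2 - zeta(0) of x(t) in eq. (5.91)."""
    return m * omega0**2 - float(zeta(0.0))

k15 = restoring_coefficient(15.0)   # vanishes: no net restoring force
k20 = restoring_coefficient(20.0)
```

The parameter $f$ only rescales the time axis, so that the kernel for $f = 0.2$ is the kernel for $f = 1$ stretched by a factor of five.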

[Plot: ζ(t) (a.u.) versus t (a.u.); curves for f = 0.2 and f = 1.0]

Figure 5.1: Dynamic friction kernel ζ(t) (in atomic units (a.u.)) for a particle in a fluid of Lennard–Jones particles modeled by a bath of harmonic oscillators with parameters as given in the text.

The Dipole Absorption Cross Section

The dipole absorption cross section $\sigma(\omega)$ describes the coupling of the oscillator dipole $\mu(x) = q_0 x$ to external electromagnetic radiation. Here $q_0$ denotes the charge of the oscillator, which will be set to unity in the following, and $x$ denotes the displacement of the oscillator. The absorption cross section is related to the spectrum of the position correlation function $S_x(t) = \langle x(t)x(0) \rangle$ and via the fluctuation–dissipation theorem to the corresponding dissipative response $\chi_x''(\omega)$ by [71]

$$ \sigma(\omega) = \frac{4\pi}{c}\, \omega \left( 1 - e^{-\beta\omega} \right) S_x(\omega) = \frac{8\pi}{c}\, \omega\, \chi_x''(\omega). \tag{5.94} $$

Below we show how this experimentally relevant quantity can be determined from the action (5.89) as the solution of an inverse problem (by the SVD method). Exact results for $\sigma(\omega)$ to check the accuracy of the SVD calculations can be obtained exploiting the fact that for our model system the quantum mechanical dynamical susceptibility $\chi_x(\omega) = \chi_x'(\omega) + i\chi_x''(\omega)$ coincides with the classical result [71, 74]. Thus the dipole absorption cross section $\sigma(\omega)$ is most easily calculated from the dynamical friction kernel $\zeta(t)$ by a Fourier transform of the classical equation of motion (5.91) that leads to

$$ \chi_x(\omega) = \frac{1}{m}\, \frac{1}{\tilde\omega^2 - \omega^2 - i\omega\gamma(\omega)} \tag{5.95} $$

where $\tilde\omega^2 = \omega_0^2 - \zeta(0)/m$ and $\gamma(\omega)$ denotes the frequency dependent damping coefficient that is related to the dynamic friction kernel by

$$ \gamma(\omega) = \gamma'(\omega) + i\gamma''(\omega) = \frac{1}{m} \int_{-\infty}^{\infty} dt\; \Theta(t)\, \zeta(t)\, e^{i\omega t}. \tag{5.96} $$

[Plot, panel a): γ′(ω) (a.u.) versus ω (a.u.); curves for f = 0.2 and f = 1.0]

[Plot, panel b): γ′′(ω) (a.u.) versus ω (a.u.); curves for f = 0.2 and f = 1.0]

Figure 5.2: Frequency dependent damping constant γ(ω) (in atomic units) corresponding to the dynamic friction kernel ζ(t) of a particle in a Lennard–Jones liquid. Subfigure a) shows the real part γ′(ω) of the damping constant while b) displays the imaginary part γ′′(ω).

As an explicit expression for the dipole absorption cross section $\sigma(\omega)$ we obtain from the imaginary part of $\chi_x(\omega)$

$$ \sigma(\omega) = \frac{8\pi}{mc}\, \frac{\omega^2 \gamma'(\omega)}{\left[ \tilde\omega^2 - \omega^2 + \omega\gamma''(\omega) \right]^2 + \left[ \omega\gamma'(\omega) \right]^2}. \tag{5.97} $$

The form of $\sigma(\omega)$ is shown in fig. 5.3 for the high damping environment ($f = 1.0$) and two values of the free oscillator frequency, $\omega_0 = 20.0$ and $\omega_0 = 15.0$. The latter case $\omega_0 = 15.0$ corresponds to $\tilde\omega = 0$, i.e. a particle that diffuses under the influence of the environment with a diffusion constant that is given by [72]

$$ D = \frac{c}{8\pi\beta}\, \sigma(0) = \frac{1}{m\beta\gamma'(0)}. \tag{5.98} $$
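Eqs. (5.96) and (5.97) can be evaluated by straightforward numerical quadrature. The following sketch is only one possible realization: the trapezoidal rule, the upper integration limit $ft = 1.5$ (beyond which the kernel is negligible) and the number of grid points are choices made here and not taken from the thesis.

```python
import numpy as np

C_LIGHT = 137.036   # speed of light in atomic units, as quoted in the text

def zeta(t, f):
    """Dynamic friction kernel of eq. (5.93)."""
    s = (f * t)**2
    return 225.0 * (np.exp(-903.0 * s) * (1.0 + 1.486e5 * s**2)
                    + 285.0 * s**2 * np.exp(-75.0 * s))

def damping(omega, f, m=1.0, n_t=20001):
    """gamma(omega) of eq. (5.96) for a scalar omega, trapezoidal rule on [0, 1.5/f]."""
    t = np.linspace(0.0, 1.5 / f, n_t)
    y = zeta(t, f) * np.exp(1j * omega * t)
    dt = t[1] - t[0]
    return dt * (y.sum() - 0.5 * (y[0] + y[-1])) / m

def cross_section(omega, omega0, f, m=1.0):
    """Dipole absorption cross section of eq. (5.97) for a scalar omega > 0."""
    g = damping(omega, f, m)
    omega_tilde2 = omega0**2 - zeta(0.0, f) / m
    denom = (omega_tilde2 - omega**2 + omega * g.imag)**2 + (omega * g.real)**2
    return 8.0 * np.pi / (m * C_LIGHT) * omega**2 * g.real / denom

gamma0 = damping(0.0, 1.0)                   # static damping constant for f = 1
sigma_small = cross_section(1e-3, 15.0, 1.0)
```

For $\omega_0 = 15$ the value of $c\,\sigma(\omega)/(8\pi\beta)$ at small $\omega$ can be compared with $1/(m\beta\gamma'(0))$ as a numerical check of eq. (5.98).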

[Plot: σ(ω) (a.u.) versus ω (a.u.); curves for ω0 = 20.0 and ω0 = 15.0]

Figure 5.3: Dipole absorption cross section for an oscillator in a Lennard–Jones liquid that exerts strong damping ($f = 1.0$) for $\omega_0 = 20.0$ (non–diffusive) and $\omega_0 = 15.0$ (diffusive, i.e. $\tilde\omega = 0$).

The corresponding results for the case of a low damping (f = 0.2) environment are shown in fig. 5.4.

[Plot: σ(ω) (a.u.) versus ω (a.u.); curves for ω0 = 20.0 and ω0 = 15.0]

Figure 5.4: Dipole absorption cross section for an oscillator in a Lennard–Jones liquid that exerts weak damping ($f = 0.2$) for $\omega_0 = 20.0$ (non–diffusive) and $\omega_0 = 15.0$ (diffusive, i.e. $\tilde\omega = 0$).

The Displacement Correlation Function

Turning back to the path integral description in terms of the Euclidean action (5.89) we can derive an analytic expression for the imaginary–time displacement correlation function $R^2(\tau)$ which is defined by

$$ R^2(\tau) = \langle |x(\tau) - x(0)|^2 \rangle = 2\langle x^2 \rangle - \langle x(\tau)x(0) \rangle - \langle x(0)x(\tau) \rangle = 2\left( C_x(0) - C_x(\tau) \right), \tag{5.99} $$

where the last equality follows from the fact that the imaginary–time correlation function $C_x(\tau) = \langle x(\tau)x(0) \rangle$ is (by definition) $\beta$–periodic and symmetric, i.e. fulfills $C_x(-\tau) = C_x(\beta - \tau) = C_x(\tau)$. To give an explicit expression for $R^2(\tau)$ it is useful to express the $\beta$–periodic path $x(\tau)$ and $\Gamma(\tau)$ as a Fourier series in the Matsubara frequencies $\nu_n = 2\pi n/\beta$

$$ x(\tau) = \frac{1}{\beta} \sum_{n=-\infty}^{\infty} x_n\, e^{i\nu_n\tau}, \qquad \Gamma(\tau) = \frac{1}{\beta} \sum_{n=-\infty}^{\infty} \Gamma_n\, e^{i\nu_n\tau}. \tag{5.100} $$

$$ R^2(\tau) = \frac{1}{\beta^2} \sum_{n=1}^{\infty} 4\, \langle |x_n|^2 \rangle \left( 1 - \cos(\nu_n\tau) \right) \quad\text{with}\quad \langle |x_n|^2 \rangle = \frac{\beta}{m\nu_n^2 + m\omega_0^2 - \Gamma_n}. \tag{5.101} $$

This representation of the displacement correlation function can be taken as the starting point for the calculation of dynamical properties of the system like the dipole absorption cross section. As shown explicitly in appendix C.3 the coefficients $\Gamma_n$ can be determined from the dynamic friction kernel $\zeta(t)$ by the relation

$$ \Gamma_n = \zeta(0) - |\nu_n|\, \hat\zeta(|\nu_n|) \quad\text{with}\quad \hat\zeta(z) = \int_0^\infty dt\, \zeta(t)\, e^{-zt}. \tag{5.102} $$

Results for the exact displacement correlation function are shown in figs. 5.5 and 5.6 for the examined values of the damping parameter $f$ and the free oscillator frequency $\omega_0$.
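A truncated version of the series (5.101) with the coefficients (5.102) can be sketched as follows. The truncation order `n_max`, the quadrature for the Laplace transform and all function names are assumptions for illustration; since the terms decay only like $1/\nu_n^2$, a small `n_max` gives a rough approximation at best.

```python
import numpy as np

def zeta(t, f):
    """Dynamic friction kernel of eq. (5.93)."""
    s = (f * t)**2
    return 225.0 * (np.exp(-903.0 * s) * (1.0 + 1.486e5 * s**2)
                    + 285.0 * s**2 * np.exp(-75.0 * s))

def laplace_zeta(z, f, n_t=20001):
    """Laplace transform of zeta(t) in eq. (5.102), trapezoidal rule on [0, 1.5/f]."""
    t = np.linspace(0.0, 1.5 / f, n_t)
    y = zeta(t, f) * np.exp(-z * t)
    dt = t[1] - t[0]
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))

def displacement_correlation(tau, omega0, f, beta=1.0, m=1.0, n_max=500):
    """Series (5.101) truncated after n_max Matsubara frequencies."""
    tau = np.asarray(tau, dtype=float)
    total = np.zeros_like(tau)
    zeta0 = zeta(0.0, f)
    for n in range(1, n_max + 1):
        nu = 2.0 * np.pi * n / beta
        gamma_n = zeta0 - nu * laplace_zeta(nu, f)            # eq. (5.102)
        xn2 = beta / (m * nu**2 + m * omega0**2 - gamma_n)    # <|x_n|^2>
        total += 4.0 * xn2 * (1.0 - np.cos(nu * tau))
    return total / beta**2

r2 = displacement_correlation(np.array([0.0, 0.3, 0.5, 0.7]), 20.0, 1.0, n_max=50)
```

By construction the result vanishes at $\tau = 0$ and is symmetric under $\tau \to \beta - \tau$.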

Inverse Problem for the Dipole Absorption Cross Section

To derive the inverse problem that relates the dipole absorption cross section $\sigma(\omega)$ to the displacement correlation function $R^2(\tau)$, we just have to combine the relation (5.99) between $R^2(\tau)$ and $C_x(\tau)$, the definition (5.94) of $\sigma(\omega)$ in terms of the spectrum $S_x(\omega)$, and the inverse problem (5.27). This leads to

$$ R^2(\tau) = 2\left( C_x(0) - C_x(\tau) \right) = \frac{1}{\pi} \int_{-\infty}^{\infty} d\omega \left[ 1 - e^{-\omega\tau} \right] S_x(\omega) $$
$$ = \frac{c}{4\pi^2} \int_0^\infty d\omega\, \frac{(1 - e^{-\omega\tau}) + e^{-\beta\omega}(1 - e^{\omega\tau})}{\omega\, (1 - e^{-\beta\omega})}\, \sigma(\omega) $$
$$ = \frac{c}{4\pi^2} \int_0^\infty d\omega\, \frac{\cosh(\tfrac{1}{2}\beta\omega) - \cosh(\tfrac{1}{2}\beta\omega - \omega\tau)}{\omega \sinh(\tfrac{1}{2}\beta\omega)}\, \sigma(\omega), \tag{5.103} $$

where we have used the detailed balance relation (5.5) to restrict the $\omega$-integration to positive frequencies.

[Plot: R²(τ) (a.u.) versus τ (a.u.); curves for ω0 = 20.0 and ω0 = 15.0]

Figure 5.5: Exact imaginary–time displacement correlation function $R^2(\tau)$ for an oscillator in a Lennard–Jones liquid for $\omega_0 = 20.0$ (non–diffusive) and $\omega_0 = 15.0$ (diffusive, i.e. $\tilde\omega = 0$) and high damping environment ($f = 1.0$).

[Plot: R²(τ) (a.u.) versus τ (a.u.); curves for ω0 = 20.0 and ω0 = 15.0]

Figure 5.6: Exact imaginary–time displacement correlation function $R^2(\tau)$ for an oscillator in a Lennard–Jones liquid for $\omega_0 = 20.0$ (non–diffusive) and $\omega_0 = 15.0$ (diffusive, i.e. $\tilde\omega = 0$) and low damping environment ($f = 0.2$).

5.5.2 Application of the SVD Method

Discretization of the Inverse Problem

For the numerical test of the SVD method for the solution of the inverse problem (5.103) the integral equation has to be discretized. Due to the symmetry $\tau \to \beta - \tau$ we can restrict the relevant $\tau$ interval to $[0, \beta/2]$ which has been divided into $N_\tau = 128$ time steps $\Delta\tau = \tau_i - \tau_{i-1}$. The cutoff for the frequency range was chosen as $\omega_{\max} = 100$ for the high damping case ($f = 1.0$) and $\omega_{\max} = 50.0$ for the low damping case ($f = 0.2$). In both cases the frequency interval $[0, \omega_{\max}]$ was discretized to a grid containing $N_\omega = 250$ equidistant points $\omega_j$. After the discretization the inverse problem (5.103) results in the (ill–conditioned) linear equation

$$ R^2(\tau_i) = \sum_{j=1}^{N_\omega} K_{ij}\, \sigma(\omega_j), \quad\text{for } i = 1, \ldots, N_\tau, \tag{5.104} $$

$$ K_{ij} = \frac{c}{4\pi^2}\, \frac{\cosh(\tfrac{1}{2}\beta\omega_j) - \cosh(\tfrac{1}{2}\beta\omega_j - \tau_i\omega_j)}{\omega_j \sinh(\tfrac{1}{2}\beta\omega_j)}\, \Delta\omega. \tag{5.105} $$
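The discretized kernel (5.105) and its singular values can be obtained with a few lines of NumPy. This is an illustrative sketch rather than the implementation used for the results below: the placement of the frequency grid at interval midpoints (to avoid the removable singularity at $\omega = 0$) and the choice $\tau_i = i\,\Delta\tau$ are assumptions made here.

```python
import numpy as np

C_LIGHT = 137.036   # speed of light in atomic units, as quoted in the text

def kernel_matrix(beta=1.0, n_tau=128, n_omega=250, omega_max=100.0):
    """Matrix K_ij of eq. (5.105) on assumed grids tau_i and omega_j."""
    d_tau = 0.5 * beta / n_tau
    tau = d_tau * np.arange(1, n_tau + 1)               # tau_i in (0, beta/2]
    d_omega = omega_max / n_omega
    omega = d_omega * (np.arange(n_omega) + 0.5)        # midpoints, omega_j > 0
    half = 0.5 * beta * omega
    num = np.cosh(half)[None, :] - np.cosh(half[None, :] - np.outer(tau, omega))
    K = C_LIGHT / (4.0 * np.pi**2) * num / (omega * np.sinh(half))[None, :] * d_omega
    return tau, omega, K

tau, omega, K = kernel_matrix()
U, s, Vt = np.linalg.svd(K, full_matrices=False)   # singular system of K
decay = s / s[0]                                    # relative size of the singular values
```

Inspecting `decay` shows how quickly the singular values fall towards machine precision, which is the numerical signature of the ill–conditioning mentioned above.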

To test the SVD method we added between $1 \times 10^{-5}\,\%$ and $1\,\%$ Gaussian noise to the exact values $R^2(\tau_i)$ for the displacement correlation function. Since the amount of added noise is rather small, it is important to start from very accurate data for the exact correlation function. We found that the summation of the series (5.101) converges rather slowly and that for the purpose of testing the SVD method it is more accurate to calculate $R^2(\tau_i)$ from the exact results of the dipole absorption cross section by eq. (5.103).

The Singular System

The singular system of the matrix K has been calculated using the routine f02wec from the NAG library in C (mark 7) [77]. Fig. 5.7 shows the results for the singular values σn (up to n = 12) while the left singular vectors un(τ) and the right singular vectors vn(ω) (up to n = 6) are displayed in figs. 5.8 and 5.9.

The singular values $\sigma_n$ show an exponential decrease with increasing order $n$, i.e. the inverse problem (5.103) can be classified as severely ill–posed, in contrast to inverse problems encountered in image reconstruction or computed tomography in which the singular values show a power–law decay. The left and right singular vectors $u_n(\tau)$ and $v_n(\omega)$ are the basis functions into which the data $R^2(\tau)$ and the solution $\sigma(\omega)$ are expanded.

The order $n$ of the singular vector also gives the number of zeros of $u_n(\tau)$ in $]0, \beta/2[$ or $v_n(\omega)$ in $[0, \omega_{\max}]$, respectively. Thus the expansion in the singular vectors has properties that are comparable to the expansion into a Fourier series. In particular, the singular vectors $u_n(\tau_i)$ of higher order have large amplitude close to $\tau = 0$ (and $\tau = \beta$) while the higher order vectors $v_n(\omega_j)$ are important for larger frequencies.

Figure 5.7: Exponential decrease of the singular values of the integral operator $K$ with increasing order $n$. (Plot of $\sigma_n$ on a logarithmic axis versus $n$.)

Figure 5.8: Left singular functions corresponding to the integral operator $K$ up to order $n = 6$. The left singular functions represent the basis for the expansion of the displacement correlation function. (Plot of $u_n(\tau)$ versus $\tau$ for $n = 1, \dots, 6$.)

Figure 5.9: Right singular functions corresponding to the integral operator $K$ up to order $n = 6$. The right singular functions represent the basis for the expansion of the dipole absorption cross section. (Plot of $v_n(\omega)$ versus $\omega$ for $n = 1, \dots, 6$.)

A Remark on the Shape of the Correlation Function

Before we turn to the SVD solution of the inverse problem, we use our knowledge of the singular system to make a short remark about the form of the imaginary–time correlation functions $R^2(\tau)$ that are shown in figs. 5.5 and 5.6 for the examined values of the parameters $\omega_0$ and $f$. Due to the strong decay of the singular values $\sigma_n$, the general form of $R^2(\tau) = \sum_n \sigma_n \langle \sigma, v_n \rangle_\omega\, u_n(\tau)$ is given by the first singular vector $u_1(\tau)$ in the generic case, i.e. unless $\sigma(\omega)$ is orthogonal to $v_1(\omega)$. Hence, it is hard to draw conclusions directly from the form of $R^2(\tau)$. Still, we can make one general observation from the inspection of the correlation functions in figs. 5.5 and 5.6 and the corresponding absorption cross sections in figs. 5.3 and 5.4. The cross section corresponding to $\{f = 1.0, \omega_0 = 15\}$ has a very simple form consisting of a single peak at $\omega = 0$; $\sigma(\omega)$ for $\{f = 1.0, \omega_0 = 20.0\}$ and $\{f = 0.2, \omega_0 = 15.0\}$ is more "complex", showing a sharper peak around $\omega \approx 15$, while the cross section for $\{f = 0.2, \omega_0 = 20.0\}$ is characterized by a rather sharp peak at $\omega \approx 20$ and a small background at lower frequencies. Comparing the imaginary–time correlation functions in figs. 5.5 and 5.6 we can observe that the "simple" $\sigma(\omega)$ corresponds to a "round" correlation function that closely resembles $u_1(\tau)$, while with increasing "complexity" of $\sigma(\omega)$, $R^2(\tau)$ becomes more "square" as a consequence of the contributions of higher order singular vectors $u_n(\tau)$.

Truncated Singular Value Decomposition

To get more detailed information about σ(ω) we have to perform the SVD analysis (or solve the inverse problem with an alternative method such as MEM). The first step in the determination of the TSVD solution is the calculation of the expansion coefficients

\[
a_n = \langle R^2, u_n \rangle_\tau = \sum_{i=1}^{N_\tau} R^2(\tau_i)\, u_n(\tau_i)\, \Delta\tau \tag{5.106}
\]

of the data in terms of the singular vectors $u_n$. Using the discrepancy principle (5.58) we can then determine the regularization parameter $\gamma$, i.e. the cutoff $n_{co}$ for the TSVD solution (5.57)

\[
\sigma(\omega_j) = \sum_{n=1}^{n_{co}} b_n\, v_n(\omega_j) \quad \text{with} \quad b_n = \frac{a_n}{\sigma_n} = \frac{\langle R^2, u_n \rangle_\tau}{\sigma_n}. \tag{5.107}
\]

Determination of Additional Coefficients

We used de Villier's method to extend the TSVD solution and determine additional coefficients that lead to a positive solution with enhanced resolution. Starting from the TSVD values $\lambda_n^{(0)} = b_n$ for the expansion coefficients, the target function

\[
D(\lambda) = \frac{1}{2} \sum_{j=1}^{N_\omega} \left| \left( \sum_{n=1}^{n_{co}} \lambda_n v_n(\omega_j) \right)_{+} \right|^2 - \sum_{n=1}^{n_{co}} \lambda_n b_n \tag{5.108}
\]

of the dual problem (5.66) was minimized using the simplex algorithm (e04ccc) from the NAG C Library [77]. The positive SVD solution was then obtained from the optimal coefficients $\lambda_n^{opt}$ as

\[
\sigma(\omega_j) = \left( \sum_{n=1}^{n_{co}} \lambda_n^{opt}\, v_n(\omega_j) \right)_{+} \quad \text{for } j = 1, \dots, N_\omega. \tag{5.109}
\]

5.5.3 Comparison of SVD and MEM Results

Results of the Truncated Singular Value Decomposition

Figs. 5.10–5.13 show the results that were obtained by the TSVD using Morozov's discrepancy principle to determine the cutoff. The best agreement of the TSVD calculations with the exact result is found for high damping ($f = 1.0$) and diffusive motion ($\omega_0 = 15.0$). The simple form of the cross section with a broadened peak at $\omega = 0$ can be recovered well even for rather noisy data. For the lowest noise level $\epsilon_{rel} = 10^{-5}\,\%$ the agreement is excellent over the whole range of frequencies, and even for an error level of 1% the general shape of the cross section is reproduced correctly. The error for $\sigma(0)$ (as the local error in general) behaves non–monotonically with the error level.

For the non–diffusive ($\omega_0 = 20$), strongly damped ($f = 1.0$) case shown in fig. 5.11 the maximum of $\sigma(\omega)$ is around $\omega = 12$ and the TSVD results show a much stronger sensitivity to the level of added noise. For $\epsilon_{rel} = 10^{-5}\,\%$ the cross section is still represented quite well, though the maximum amplitude already shows a deviation of $\approx 30\%$ from the exact result. For $\epsilon_{rel} = 10^{-2}\,\%$ the TSVD result still shows a peak at finite frequency, while for higher noise level the cross section is represented rather poorly. For $\epsilon_{rel} = 10^{-5}\,\%$ and $\epsilon_{rel} = 10^{-2}\,\%$ the TSVD produces negative side lobes as in the case of a band limited Fourier representation.

In the case of weak damping ($f = 0.2$) displayed in figs. 5.12 and 5.13, where the cross section is peaked more strongly, the agreement of the TSVD with the exact result is clearly worse than for high damping. In the diffusive case one still finds good agreement for the diffusion constant at $\epsilon_{rel} = 10^{-5}\,\%$ and $\epsilon_{rel} = 10^{-2}\,\%$, though the peak amplitude shows a deviation of at least 50% and the range of higher frequencies is represented poorly even at the lowest noise level.

The exact result for $\omega_0 = 20.0$ and $f = 0.2$ is characterized by a very sharp peak at $\omega \approx 20$. The TSVD reconstruction of the cross section is here generally poor. While at $\epsilon_{rel} = 1\%$ the peak is missed almost completely, the results for lower noise level show a strongly broadened peak with very much reduced amplitude and oscillations around the exact solution on both sides of the maximum.
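The TSVD reconstruction with a discrepancy-based cutoff, whose results are discussed above, can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used for the figures: `beta = 1`, `c = 1`, the Gaussian test spectrum, the noise model (relative to each data value), the safety factor 1.5 and the function name `tsvd_solve` are all hypothetical, and the cutoff rule is a common form of the discrepancy principle (smallest cutoff whose residual norm does not exceed the noise norm) rather than a quotation of eq. (5.58). Euclidean inner products replace the $\Delta\tau$-weighted ones of eq. (5.106).

```python
import numpy as np

# Discretized kernel of eq. (5.105); beta and c are placeholder values (assumptions).
beta, c, n_tau, n_omega, omega_max = 1.0, 1.0, 128, 250, 50.0
d_omega = omega_max / n_omega
omega = d_omega * np.arange(1, n_omega + 1)
tau = 0.5 * beta / n_tau * np.arange(1, n_tau + 1)
half = 0.5 * beta * omega
K = (c / (4.0 * np.pi**2) * d_omega
     * (np.cosh(half) - np.cosh(half - np.outer(tau, omega)))
     / (omega * np.sinh(half)))

def tsvd_solve(K, data, noise_norm):
    """Truncated SVD solution; assumes K has no more rows than columns."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    a = U.T @ data                                   # expansion coefficients of the data
    # tail[n] = residual norm if only the first n singular components are kept
    tail = np.sqrt(np.append(np.cumsum((a**2)[::-1])[::-1], 0.0))
    n_co = max(1, int(np.argmax(tail <= noise_norm)))  # first cutoff meeting the criterion
    b = a[:n_co] / s[:n_co]                          # coefficients b_n = a_n / sigma_n
    return Vt[:n_co].T @ b, n_co, tail[n_co]

# Hypothetical relaxational test spectrum with a broad peak at omega = 0.
sigma_true = np.exp(-0.5 * (omega / 8.0) ** 2)
exact = K @ sigma_true

rng = np.random.default_rng(0)
eps_rel = 1e-4                                       # corresponds to 10^-2 % noise
data = exact * (1.0 + eps_rel * rng.standard_normal(exact.size))
delta = 1.5 * eps_rel * np.linalg.norm(exact)        # noise norm estimate with safety factor

sigma_rec, n_co, residual = tsvd_solve(K, data, delta)
print("cutoff n_co =", n_co)
```

Raising `eps_rel` can only lower (never raise) the cutoff returned by this rule, which is the mechanism behind the loss of resolution at higher noise levels.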

Figure 5.10: Results for oscillator frequency $\omega_0 = 15$ (diffusive) and high damping ($f = 1.0$) obtained by the truncated singular value decomposition. Shown is the exact result and results for different levels of noise added to the correlation function $R^2(\tau)$. (Plot of $\sigma(\omega)$ in a.u. versus $\omega$ in a.u.; curves: exact, $10^{-5}\,\%$ noise, $10^{-2}\,\%$ noise, 1% noise.)

Figure 5.11: Results for oscillator frequency $\omega_0 = 20$ (non–diffusive) and high damping ($f = 1.0$) obtained by the truncated singular value decomposition. Shown is the exact result and results for different levels of noise added to the correlation function $R^2(\tau)$. (Plot of $\sigma(\omega)$ in a.u. versus $\omega$ in a.u.; curves: exact, $10^{-5}\,\%$ noise, $10^{-2}\,\%$ noise, 1% noise.)

Figure 5.12: Results for oscillator frequency $\omega_0 = 15$ (diffusive) and low damping ($f = 0.2$) obtained by the truncated singular value decomposition. Shown is the exact result and results for different levels of noise added to the correlation function $R^2(\tau)$. (Plot of $\sigma(\omega)$ in a.u. versus $\omega$ in a.u.; curves: exact, $10^{-5}\,\%$ noise, $10^{-2}\,\%$ noise, 1% noise.)

Figure 5.13: Results for oscillator frequency $\omega_0 = 20$ (non–diffusive) and low damping ($f = 0.2$) obtained by the truncated singular value decomposition. Shown is the exact result and results for different levels of noise added to the correlation function $R^2(\tau)$. (Plot of $\sigma(\omega)$ in a.u. versus $\omega$ in a.u.; curves: exact, $10^{-5}\,\%$ noise, $10^{-2}\,\%$ noise, 1% noise.)

Results of de Villier’s SVD Method

Starting from the TSVD solutions presented above we used de Villier's method to obtain the positive solutions that are shown in figs. 5.14–5.17. Since the TSVD results for $f = 1.0$ and $\omega_0 = 15.0$ were already positive, they are preserved by de Villier's method, which adds just as many additional contributions from higher order singular functions as are needed to get a positive result. Since the positivity provides no additional information in this case, the resolution of the result is unchanged.

The same is true for the strongly damped case with $\omega_0 = 20.0$ at the high noise level $\epsilon_{rel} = 1.0\%$, where we get no improvement of the TSVD result. For lower noise level the negative side lobes are cut off, and the peak is sharpened and increased in amplitude, especially for $\epsilon_{rel} = 10^{-2}\,\%$. As a remnant of the oscillations of the TSVD solution at $\omega \lesssim 5$ and $\omega \gtrsim 30$, the result for $\epsilon_{rel} = 10^{-5}\,\%$ still shows small artificial peaks at $\omega = 0$ and $\omega \approx 42$.

In the results for the diffusive case at low damping shown in fig. 5.16 we observe an improvement of the result especially at large frequencies $\omega \gtrsim 30$, where the artificial oscillations in the TSVD result have been removed. The positive solution also shows a better resolution of the main peak of the cross section. The almost flat part of $\sigma(\omega)$ at low frequencies and especially the diffusion constant, on the other hand, are affected adversely by the determination of additional coefficients.

In the results of de Villier's method for the non–diffusive case $\omega_0 = 20.0$, the width of the peak is reduced by a factor of 2 for the lower noise levels $\epsilon_{rel} = 10^{-5}\,\%$ and $\epsilon_{rel} = 10^{-2}\,\%$, and the amplitude shows an improvement compared with the TSVD result. On the other hand, the shape of the peak is not correctly represented and at low frequency the solutions bear little resemblance to the flat shape of the exact result.
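The minimization behind these positive solutions can be illustrated with a short sketch of the dual functional (5.108). Note the substitutions: the text minimizes $D(\lambda)$ with a simplex routine, whereas the sketch below uses plain gradient descent, which is admissible here because $D$ is convex and, for orthonormal $v_n$, its gradient $\langle (\sum_n \lambda_n v_n)_+, v_m \rangle - b_m$ has Lipschitz constant at most 1. The kernel parameters, the test spectrum, the noise-free coefficients `b` and the names `dual`, `dual_grad` and `positive_solution` are assumptions made for illustration.

```python
import numpy as np

def dual(V, b, lam):
    """Dual target function D(lambda) of eq. (5.108) with Euclidean sums."""
    return 0.5 * np.sum(np.clip(V @ lam, 0.0, None) ** 2) - lam @ b

def dual_grad(V, b, lam):
    """Gradient of D: coefficients of the clipped solution minus the TSVD coefficients."""
    return V.T @ np.clip(V @ lam, 0.0, None) - b

def positive_solution(V, b, n_iter=5000):
    """Gradient descent on D starting from the TSVD coefficients (step size 1)."""
    lam = np.array(b, dtype=float)
    for _ in range(n_iter):
        lam -= dual_grad(V, b, lam)
    return np.clip(V @ lam, 0.0, None), lam

# Right singular vectors of a kernel of the form (5.105); beta = 1 is an assumption
# and constant prefactors are dropped because they do not change the vectors.
beta, n_tau, n_omega, omega_max = 1.0, 128, 250, 50.0
omega = omega_max / n_omega * np.arange(1, n_omega + 1)
tau = 0.5 * beta / n_tau * np.arange(1, n_tau + 1)
half = 0.5 * beta * omega
K = (np.cosh(half) - np.cosh(half - np.outer(tau, omega))) / (omega * np.sinh(half))
_, _, Vt = np.linalg.svd(K, full_matrices=False)

n_co = 6
V = Vt[:n_co].T                                       # columns are v_1 ... v_6
sigma_true = np.exp(-0.5 * ((omega - 20.0) / 2.0) ** 2)  # hypothetical peaked spectrum
b = V.T @ sigma_true                                  # stand-in for the coefficients b_n

sigma_tsvd = V @ b                                    # truncated expansion, may be negative
sigma_pos, lam_opt = positive_solution(V, b)
print("min of TSVD expansion:", sigma_tsvd.min())
print("min of positive solution:", sigma_pos.min())
```

At a stationary point of $D$ the clipped solution has the same first $n_{co}$ expansion coefficients as the TSVD solution, so the clipping only introduces components along higher order singular vectors.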

Figure 5.14: Results for high damping ($f = 1.0$) in the diffusive case ($\omega_0 = 15$) obtained by de Villier's extension of the TSVD. Shown are the exact result and data for different levels of added noise. (Plot of $\sigma(\omega)$ in a.u. versus $\omega$ in a.u.; curves: exact, $10^{-5}\,\%$ noise, $10^{-2}\,\%$ noise, 1% noise.)

Figure 5.15: Results for high damping ($f = 1.0$) in the non–diffusive case ($\omega_0 = 20$) obtained by de Villier's extension of the TSVD. Shown are the exact result and data for different levels of added noise. (Plot of $\sigma(\omega)$ in a.u. versus $\omega$ in a.u.; curves: exact, $10^{-5}\,\%$ noise, $10^{-2}\,\%$ noise, 1% noise.)

Figure 5.16: Results for low damping ($f = 0.2$) obtained by de Villier's extension of the TSVD. Shown are the exact result and data for different levels of added noise. (Plot of $\sigma(\omega)$ in a.u. versus $\omega$ in a.u.; curves: exact, $10^{-5}\,\%$ noise, $10^{-2}\,\%$ noise, 1% noise.)

Figure 5.17: Results for low damping ($f = 0.2$) obtained by de Villier's extension of the TSVD. Shown are the exact result and data for different levels of added noise. (Plot of $\sigma(\omega)$ in a.u. versus $\omega$ in a.u.; curves: exact, $10^{-5}\,\%$ noise, $10^{-2}\,\%$ noise, 1% noise.)

Results of the Maximum Entropy Method

Besides the comparison with the exact cross section it is also interesting to compare the SVD with other methods for the solution of inverse problems. Therefore we would like to discuss briefly the data calculated by Krilov et al. [72] using the Maximum Entropy method. As the default map for the entropy functional they used a constant function whose amplitude is determined from the sum rule for the absorption cross section. The plots shown in figs. 5.18–5.21 display the results obtained from exact data for the imaginary–time correlation function $R^2(\tau)$ that was corrupted by $\epsilon_{rel} = 1\%$ Gaussian noise in the diffusive case ($\omega_0 = 15$) and $\epsilon_{rel} = 10^{-2}\,\%$ Gaussian noise for the non–diffusive case ($\omega_0 = 20$). For $\omega_0 = 20.0$ Krilov et al. show in addition results obtained using short–time data (up to $t = 0.3\beta$) for the Kubo transform of the position correlation function

\[
\psi(t) = \frac{1}{\beta} \int_0^\beta d\lambda\, \langle x(t + i\lambda)\, x(0) \rangle \tag{5.110}
\]

which is related to the dipole absorption cross section by the inverse problem

\[
\psi(t) = \frac{c}{4\pi^2} \int_0^\infty d\omega\, \frac{\cos(\omega t)}{\beta\omega^2}\, \sigma(\omega). \tag{5.111}
\]

The information of both correlation functions can also be used in combination, using as the linear operator of the inverse problem the tensor product of the operators corresponding to the two parts. In the diffusive case one has to use the Kubo transform $\psi_v(t)$ of the velocity autocorrelation function instead of $\psi(t)$. The combination of real– and imaginary–time data is useful because the real–time correlation function can still be calculated at short times for general systems and is supposed to contain information that is complementary to that of the imaginary–time correlations [72, 78]. We will restrict the comparison to the results obtained from the imaginary–time data only, since the use of additional real–time correlation functions just represents an enlargement of the input but does not alter the method itself. Real–time data can be included equally well in the SVD method. We have not presented the corresponding data here since its comparison with the MEM data provides no further insight.

The MEM data obtained from the imaginary–time correlation function (with $\epsilon_{rel} = 1\%$ added noise) shown in fig. 5.18 for the case of strong damping and diffusive motion agree very well with the exact results except for an artificial dip at $\omega = 0$ that leads to an error of $\approx 10\%$ for the diffusion constant.

Figure 5.18: Maximum Entropy results of Krilov et al. [72] for the absorption cross section for high damping ($f = 1.0$) in the diffusive case ($\omega_0 = 15.0$). Shown are the exact result, data obtained from the imaginary–time correlation function, the Kubo velocity correlation function and from a combination of both types of data. (Plot of $\sigma(\omega)$ versus $\omega$ in a.u.; curves: Imaginary time, Imag. time + velocity Kubo, velocity Kubo, Exact.)

A comparison of fig. 5.19 with the corresponding positive SVD solutions for $\omega_0 = 20.0$ at a noise level of $\epsilon_{rel} = 10^{-2}\,\%$ shows that the amplitude and shape of the peak are much better reproduced by the maximum entropy approach. In addition the MEM results show fewer artificial oscillations at low frequencies and for $\omega \gtrsim 25$.

In the case of low damping the MEM results from imaginary–time data (with $\epsilon_{rel} = 1\%$) are comparable to SVD data for $\epsilon_{rel} = 10^{-5}\,\%$ with respect to the representation of the main peak. The MEM results also show some deviations from the exact solution at small frequencies, but the artificial oscillations are not as pronounced as in the case of the positive SVD solution. Similar conclusions can be drawn from the comparison of the results for $\omega_0 = 20$ and $f = 0.2$, where the MEM also has difficulties producing a good reconstruction of the cross section at the given noise level.

Figure 5.19: Maximum Entropy results of Krilov et al. [72] for the absorption cross section for high damping ($f = 1.0$) in the non–diffusive case ($\omega_0 = 20.0$). Shown are the exact result, data obtained from the imaginary–time correlation function, the Kubo correlation function and from a combination of both types of data. (Plot of $\sigma(\omega)$ versus $\omega$ in a.u.; curves: Imaginary time, Imag. time + Kubo, Kubo, Exact.)

Figure 5.20: Maximum Entropy results of Krilov et al. [72] for the absorption cross section for low damping ($f = 0.2$) in the diffusive case ($\omega_0 = 15.0$). Shown are the exact result, data obtained from the imaginary–time correlation function, the Kubo velocity correlation function and from a combination of both types of data. (Plot of $\sigma(\omega)$ versus $\omega$ in a.u.; curves: Imaginary time, Imag. time + velocity Kubo, velocity Kubo, Exact.)

Figure 5.21: Maximum Entropy results of Krilov et al. [72] for the absorption cross section for low damping ($f = 0.2$) in the non–diffusive case ($\omega_0 = 20.0$). Shown are the exact result, data obtained from the imaginary–time correlation function, the Kubo correlation function and from a combination of both types of data. (Plot of $\sigma(\omega)$ versus $\omega$ in a.u.; curves: Imaginary time, Imag. time + Kubo, Kubo, Exact.)

Concluding Comparison of the Methods

The exactly solvable model system of a harmonic oscillator linearly coupled to a harmonic environment allowed us to test methods for the solution of ill–posed problems for the reconstruction of experimentally accessible quantities from imaginary–time correlation functions. Varying the frequency $\omega_0$ of free oscillations and the damping parameter $f$, we could check the performance of the methods for the calculation of dipole absorption cross sections of different shape, ranging from a simple relaxational form with a broad peak at $\omega = 0$ to a peak at intermediate frequencies around $\omega = 12$ and finally to a sharp peak at $\omega = 20$ with some small amplitude low frequency background.

From the results of both methods it is obvious that, for a given level of noise in the data, imaginary–time methods are more accurate for the description of relaxational dynamics than in the case of (weakly damped) oscillatory dynamics. For the SVD this can be understood easily from the shape of the singular functions in which the solution is expanded. While a peak at $\omega = 0$ can already be described (at least qualitatively) by the first singular function $v_1(\omega)$, the representation of peaks at higher frequency requires the accurate determination of expansion coefficients for singular functions of higher order. This becomes exceedingly difficult with increasing order due to the decay of the singular values inherent in ill–posed problems. Though the origin of the shortcomings in the description of oscillatory dynamics is less apparent in the maximum entropy formalism, it suffers from the same problems, as the comparison of the results in figs. 5.18–5.21 clearly reveals.

The comparison between corresponding results of the two examined methods allows the conclusion that the MEM gives in general much better results for a given level of noise in the data. In particular for oscillatory dynamics the MEM reconstruction is much closer to the exact result and evidently suffers much less from artificial oscillations. To a large part the superiority of the MEM should be due to the entropy functional that was derived from general (information theoretical) axioms by statistical reasoning. Although it ultimately also relies on certain postulates, it is clearly much better justified than the (squared) norm which is used as a regularizing functional in the SVD method. To a certain degree the choice of the regularization parameter may also contribute to the success of the MEM.

The discrepancy between the two methods is smallest in the case of relaxational dynamics. Although the SVD requires a lower noise level to achieve the same accuracy as the MEM, it can be considered as an alternative method in this case. While the numerical effort to reduce the (statistical) noise of the imaginary–time data is bigger, the SVD method is quite robust and requires less effort for the numerical solution of the inverse problem. This is also the reason why we will use the SVD method for the applications in chapter 6. There we will have to solve several hundred inverse problems for each choice of the system parameters (i.e. temperature and parallel conductance). Since this is only possible if the method for the solution of the inverse problem can be automated, we rather invested some more (computer) time in the calculation of the imaginary–time data and used the SVD method. This works well for our application since the system we will examine displays relaxational dynamics.

Part II

Applications


Chapter 6

The Metallic Single Electron Transistor

In this chapter we apply the methods developed in chapters 3 to 5 to the calculation of the conductance of a metallic single electron transistor (SET). In the first section we describe the setup used in a recent experimental study by Wallisser et al. [2]. As the improved layout of these experiments allowed (for the first time) a complete characterization of the relevant parameters of the device, it gave the motivation for a rigorous test of the theory that will be presented in this chapter. In section 6.1 we also introduce the system and the relevant parameters of the theoretical model that is used as the basis of our path integral calculations. Section 6.3 deals with the exact integration over the quasi–particle baths, which leads to the formulation of the generating functional for the current autocorrelation function in terms of a path integral involving an effective action for the single electron transistor. In the final section of this chapter we present results for the conductance obtained by the Monte Carlo integration of the path integral and subsequent analytical continuation of the imaginary–time correlation function. We compare our theoretical results to the experimental findings of Wallisser et al. [2] and Joyez et al. [18] as well as results from perturbation theory [79, 80] and semiclassical calculations [81].

6.1 Single Electron Tunneling through a Metallic Island

6.1.1 Experimental Realizations and Model Parameters

Single electron tunneling and Coulomb blockade were first observed in the electronic transport through small metal particles embedded in a tunnel barrier between two electrodes [82, 83]. Since a large number of metal islands are involved, the interaction effects in this system can only be observed in an averaged way, in the form of a zero–bias anomaly of the tunneling resistance. A more detailed study of the IV–characteristic of an ultra–small metallic island coupled to leads by tunnel barriers began with the experiment of Fulton and Dolan [84], who contacted a small aluminum island with aluminum electrodes using oxide layers as tunnel barriers. The setup used in a recent experiment by Wallisser et al. [2] is shown in fig. 6.1. The layout consists of two source and two drain electrodes that contact the small aluminum island in the middle. Two further gate electrodes are coupled capacitively to the island. The metal electrodes are deposited on an oxidized Si substrate by two–angle shadow evaporation using a mask produced by e–beam lithography. In between the two deposition steps tunnel barriers are formed by oxidation.

Figure 6.1: SEM image of a four junction SET. The layout contains two symmetrically arranged gate fingers (connected in parallel) to avoid asymmetrical proximity effects in the lithography process. The circuit diagram displays how the SET is operated during conductance measurements. The region encircled by the dashed line is the central island containing $n$ (excess) electrons. (Circuit labels: source junctions $R_{s1}, C_{s1}$ and $R_{s2}, C_{s2}$; drain junctions $R_{d1}, C_{d1}$ and $R_{d2}, C_{d2}$; gate capacitance $C_g$; gate voltage $U_g$; bias voltages $V/2$ and $-V/2$.)

During the actual measurements of the conductance of the SET the two source electrodes (and likewise the two drain electrodes) are operated in parallel as shown in fig. 6.1 b). The 4–terminal layout is necessary for the experimental determination of the tunnel resistances of the single junctions, which is not possible with the 2–terminal device that was used in previous experiments. With the 4–terminal setup one can measure the resistance across the island for all pairs of electrodes and calculate the resistance of each single junction. In particular, it is possible to determine the (dimensionless) parallel conductance $g = (G_s + G_d)/G_K$ of all tunnel junctions, which is the relevant quantity for the theoretical description. Here $G_K = e^2/h$ is the conductance quantum, and $G_s = G_{s1} + G_{s2}$ and $G_d = G_{d1} + G_{d2}$ denote the conductance of the source and drain, respectively. Quite generally we denote by the subscript $s$ or $d$ properties of the two source or drain junctions operated in parallel. The second important model parameter is the total capacitance $C_\Sigma = C_s + C_d + C_g$ of the metal island, which determines the charging energy $E_C = e^2/(2C_\Sigma)$, i.e. the electrostatic energy associated with the charging of the island by a single electron. Experimentally the charging energy is determined from the offset in the IV–curve or (for higher accuracy) from a fit of high–temperature data to semiclassical predictions [2]. The size of the metallic island is still large enough that energy quantization can be neglected, i.e. the level spacing $\Delta E$ of the states on the island is much smaller than the charging energy $E_C$. The parallel conductance $g$ and the charging energy $E_C$ are determined by the geometry and thus are fixed for a given experimental setup. Further external parameters that can be varied during the measurement are the (inverse) temperature $\beta = (k_B T)^{-1}$, the source drain voltage $V$, and the number of charges $n_g = (C_g U_g)/e$ that are induced on the island by the gate voltage $U_g$. In our theoretical treatment of the single electron transistor we concentrate on the linear response, i.e. vanishing source drain voltage $V$.
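The model parameters defined above follow from measured junction resistances and capacitances by simple arithmetic. The following sketch only restates the formulas for $g$, $E_C$ and $n_g$; the function names and all numerical device values (junction resistances of 100 kΩ, capacitances of a fraction of a femtofarad, a gate voltage of 1 mV) are hypothetical and are not the parameters of the experiments cited in the text.

```python
E_CHARGE = 1.602176634e-19   # elementary charge in coulomb
PLANCK_H = 6.62607015e-34    # Planck constant in joule second
K_B = 1.380649e-23           # Boltzmann constant in joule per kelvin

G_K = E_CHARGE**2 / PLANCK_H  # conductance quantum e^2/h as defined in the text

def parallel_conductance(resistances_ohm):
    """Dimensionless g = (G_s + G_d) / G_K from the resistances of all junctions."""
    return sum(1.0 / r for r in resistances_ohm) / G_K

def charging_energy(c_s, c_d, c_g):
    """E_C = e^2 / (2 C_Sigma) in joule, with C_Sigma = C_s + C_d + C_g in farad."""
    return E_CHARGE**2 / (2.0 * (c_s + c_d + c_g))

def induced_charge(c_g, u_g):
    """Number of charges n_g = C_g U_g / e induced by the gate voltage U_g."""
    return c_g * u_g / E_CHARGE

# Hypothetical device: four junctions of 100 kOhm each, C_Sigma = 1 fF.
g = parallel_conductance([100e3, 100e3, 100e3, 100e3])
e_c = charging_energy(0.4e-15, 0.4e-15, 0.2e-15)
print("g =", g)
print("E_C / k_B in kelvin =", e_c / K_B)
print("n_g at U_g = 1 mV =", induced_charge(0.2e-15, 1e-3))
```

Expressing $E_C$ in kelvin via $E_C/k_B$ makes it easy to compare with the temperature range of a measurement, since the text works with the product $\beta E_C$.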

6.1.2 Charging Model

For the theoretical description we have to combine the tunneling of electrons between reservoirs, i.e. the leads and the dot, and the electrostatic charging of the metallic island.

The conduction electrons of the leads and the island are described by free quasi–particle reservoirs

\[
H_B = \sum_{j=0,1} \sum_{k,\sigma} E^{j}_{k\sigma}\, c^{(j)\dagger}_{k\sigma} c^{(j)}_{k\sigma} + \sum_{q,\sigma} \epsilon_{q\sigma}\, d^{\dagger}_{q\sigma} d_{q\sigma} \tag{6.1}
\]

where $E^{j}_{k\sigma}$ is the energy of an electron with longitudinal wave vector $k$ in channel $\sigma$ of lead $j$ and $\epsilon_{q\sigma}$ the energy corresponding to an electron with longitudinal wave vector $q$ in channel $\sigma$ on the island. The channel index $\sigma$ includes the electron spin and the transversal quantum numbers that are conserved during tunneling transitions. The electrostatic energy of the island is described by the charging Hamiltonian

\[
H_C = E_C (n - n_g)^2 \quad \text{with} \quad E_C = \frac{e^2}{2C_\Sigma} \tag{6.2}
\]

where $n = Q/e$ is the number of conduction electrons on the island and $n_g = (C_g U_g)/e$ the number of charges induced by the gate voltage. The tunneling of a quasi–particle from one of the leads onto the island or vice versa is accompanied by the charging or discharging of the island by exactly one elementary charge. These tunneling events can be described by the tunnel Hamiltonian

\[
H_T = \sum_{j,k,q,\sigma} \left[ c^{(j)\dagger}_{k\sigma}\, \Lambda^{j}_{kq\sigma}\, d_{q\sigma} + d^{\dagger}_{q\sigma}\, \Lambda^{j*}_{kq\sigma}\, c^{(j)}_{k\sigma} \right] \quad \text{with} \quad \Lambda^{j}_{kq\sigma} = t^{j}_{kq\sigma}\, e^{i\varphi} \tag{6.3}
\]

where $t^{j}_{kq\sigma}$ is the tunnel amplitude for an electron in state $|q\sigma\rangle$ on the island to tunnel onto the lead $j$ with final state $|k\sigma\rangle$. The operator $\varphi$ is the conjugate of the number operator $n$ of the charges on the island, i.e. $[n, \varphi] = i$. Hence, the charge shift operator $\exp(-i\varphi)$ adds one charge to the island (see Appendix D.1).
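That $\exp(-i\varphi)$ raises the island charge by one follows formally from the commutator alone. The following short derivation is a standard series argument added for illustration; it is not quoted from Appendix D.1, it treats $\varphi$ as an ordinary operator and ignores the subtleties of a periodic phase variable, and $|n'\rangle$ denotes a charge eigenstate with eigenvalue $n'$.

```latex
\begin{align*}
[n, \varphi] = i
  \;&\Rightarrow\; [n, \varphi^m] = i\, m\, \varphi^{m-1}
  \;\Rightarrow\; [n, e^{-i\varphi}] = i \cdot (-i)\, e^{-i\varphi} = e^{-i\varphi}, \\
n\, e^{-i\varphi}\, |n'\rangle
  &= \left( e^{-i\varphi}\, n + e^{-i\varphi} \right) |n'\rangle
   = (n' + 1)\, e^{-i\varphi}\, |n'\rangle .
\end{align*}
```

The state $e^{-i\varphi}|n'\rangle$ is therefore a charge eigenstate with eigenvalue $n' + 1$, which matches the factor $e^{i\varphi}$ accompanying the removal of an island electron by $d_{q\sigma}$ in (6.3).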

6.2 Path Integral Formulation

In this section we derive the general path integral expression for the generating functional of the current autocorrelation function of the SET. While the trace over the charge degrees of freedom can be written as a Feynman path integral, the trace over the quasi–particle reservoirs has to be formulated as a coherent state path integral. We derive the Coulomb action, i.e. the action of the Feynman path integral that describes the charging of the island. Using the general results of chapter 3, the action of the quasi–particles can also be given. Finally, we include the source terms for the generating functional of the current autocorrelation function.

6.2.1 Path Integral Ansatz

The Hilbert space of the Hamiltonian (6.1)–(6.3) is the product of the space spanned by the charge states $|n\rangle$, or equivalently the phase states $|\varphi\rangle$, and the Fock space of the quasi–particles. To calculate the partition function of the SET we have to trace over these degrees of freedom, i.e.

\[
Z = \mathrm{tr}_{qp}\, \mathrm{tr}_{\varphi}\, e^{-\beta H}
  = \int d\mu(\phi) \int d\mu(\psi)\, e^{-\phi^*\phi - \psi^*\psi} \int d\varphi\, \langle \varphi, -\phi, -\psi |\, e^{-\beta H}\, | \varphi, \phi, \psi \rangle \tag{6.4}
\]

where the Grassmann numbers $\phi^{(j)}_{k\sigma}$ and $\psi_{q\sigma}$ denote the eigenvalues of the coherent states corresponding to the annihilators $c^{(j)}_{k\sigma}$ and $d_{q\sigma}$, respectively. To simplify the notation we have combined them into the vectors $\phi$ and $\psi$ and written sums as scalar products, e.g.

\[
\phi^*\phi = \sum_{j,k,\sigma} \phi^{(j)*}_{k\sigma}\, \phi^{(j)}_{k\sigma}. \tag{6.5}
\]

Correspondingly the integrals have to be read as multiple integrals, e.g.

\[
\int d\mu(\psi) \mathrel{\hat{=}} \int \prod_{q,\sigma} d\psi^*_{q\sigma}\, d\psi_{q\sigma}. \tag{6.6}
\]

The derivation of the path integral expression can be done as usual by multiple insertion of the closure relation

\[
1 = \int d\mu(\phi) \int d\mu(\psi)\, e^{-\phi^*\phi - \psi^*\psi} \int d\varphi\, | \varphi, \phi, \psi \rangle \langle \varphi, \phi, \psi | \tag{6.7}
\]

in the product space. Though the derivations of the Feynman path integral and the coherent state path integral have to be performed in parallel, we will present them separately to make the calculations more transparent. In fact only the action of the Feynman path integral has to be considered in detail, while the action of the quasi–particles follows directly from the general discussion in chapter 3.

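6.2.2 The Coulomb Action

To evaluate the matrix elements of the short–time propagator for the Coulomb action we consider the SET Hamiltonian as

\[
H = H_C + H_{TB}(\varphi) \quad \text{with} \quad H_C = E_C (n - n_g)^2 \tag{6.8}
\]

where $H_{TB}(\varphi) = H_T(\varphi) + H_B$ contains the tunnel Hamiltonian and the Hamiltonian of the quasi–particle reservoirs. In contrast to the determination of the "standard" Feynman path integral in section 3.1, the Hamiltonian is $2\pi$–periodic in the phase coordinate $\varphi$. Correspondingly the conjugate variable $n$ takes only discrete values $n \in \mathbb{Z}$ such that the closure relations are given by

\[
1 = \int_0^{2\pi} d\varphi\, |\varphi\rangle\langle\varphi| = \sum_{n \in \mathbb{Z}} |n\rangle\langle n|. \tag{6.9}
\]

Taking into account these modifications we can calculate the short–time propagator in analogy to eq. (3.7)

\begin{align}
\langle \varphi_j, \phi_j, \psi_j |\, e^{-i\Delta_j H}\, | \varphi_{j-1}, \phi_{j-1}, \psi_{j-1} \rangle
 &\doteq \langle \varphi_j |\, e^{-i\Delta_j H_C}\, | \varphi_{j-1} \rangle\, \langle \phi_j, \psi_j |\, e^{-i\Delta_j H_{TB}(\varphi_{j-1})}\, | \phi_{j-1}, \psi_{j-1} \rangle \notag \\
 &= \langle \phi_j, \psi_j |\, e^{-i\Delta_j H_{TB}(\varphi_{j-1})}\, | \phi_{j-1}, \psi_{j-1} \rangle \sum_{n \in \mathbb{Z}} \langle \varphi_j | n \rangle\, e^{-i\Delta_j E_C (n - n_g)^2}\, \langle n | \varphi_{j-1} \rangle \notag \\
 &= \langle \phi_j, \psi_j |\, e^{-i\Delta_j H_{TB}(\varphi_{j-1})}\, | \phi_{j-1}, \psi_{j-1} \rangle\, \frac{1}{2\pi} \sum_{n \in \mathbb{Z}} e^{-i\Delta_j E_C (n - n_g)^2}\, e^{-in(\varphi_j - \varphi_{j-1})}. \tag{6.10}
\end{align}

This expression can be further simplified using the Poisson resummation formula

\[
\sum_{n \in \mathbb{Z}} f(n) = \sum_{k \in \mathbb{Z}} \int dn\, e^{-2\pi i k n} f(n) \tag{6.11}
\]

to get

\begin{align}
\langle \varphi_j |\, e^{-i\Delta_j H_C}\, | \varphi_{j-1} \rangle
 &\doteq \frac{1}{2\pi} \sum_{k_j \in \mathbb{Z}} \int dn\, e^{-i\Delta_j E_C (n - n_g)^2}\, e^{-in(\varphi_j - \varphi_{j-1} + 2\pi k_j)} \tag{6.12} \\
 &= \frac{1}{2\pi} \sum_{k_j \in \mathbb{Z}} e^{-i\Delta_j n_g \left[ \frac{\varphi_j - \varphi_{j-1} + 2\pi k_j}{\Delta_j} \right]} \int d\tilde{n}\, e^{-i\Delta_j E_C \left[ \tilde{n}^2 + \frac{\tilde{n}}{E_C} \frac{\varphi_j - \varphi_{j-1} + 2\pi k_j}{\Delta_j} \right]} \notag \\
 &= \mathcal{N} \sum_{k_j \in \mathbb{Z}} \exp\left\{ i\Delta_j \left[ \frac{1}{4E_C} \left( \frac{\varphi_j - \varphi_{j-1} + 2\pi k_j}{\Delta_j} \right)^2 - n_g \left( \frac{\varphi_j - \varphi_{j-1} + 2\pi k_j}{\Delta_j} \right) \right] \right\}. \notag
\end{align}

Here and in the following $\mathcal{N}$ is a normalization constant that is usually incorporated into the path integral measure $\mathcal{D}\varphi$.

The parametric dependence of the operator $H_{TB}(\varphi)$ on the phase results in a partition function $Z_{TB}[\varphi]$ of the quasi–particles that depends on the path $\varphi(z)$ (or the corresponding discrete values $\varphi_j$). The explicit expression for $Z_{TB}[\varphi]$ will be given in 6.2.3. To complete the derivation of the Coulomb action we just note that, due to the $2\pi$–periodicity of $H_{TB}(\varphi)$ with respect to $\varphi$, the partition function $Z_{TB}[\varphi]$ is $2\pi$–periodic in $\varphi_j$ for all $j$. Inserting the short–time propagator into the Trotter breakup for the partition function we get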

2π 2π Z = dϕ ... dϕ δ(ϕ ϕ ) ϕ e−i∆P HC ϕ ... ϕ e−i∆1HC ϕ Z [ϕ]. (6.13) P 0 P − 0 h P | | P −1i h 1| | 0i × TB Z0 Z0 To simplify further we use the freedom to relabel the summations over k and to transform the integrals over ϕ. Instead of summations over k1, . . . , kP we sum over n ′ ′ ′ kn = kj for n = 1,...,P ,i.e. k1 = k1, k2 = k1 + k2, ... (6.14) Xj=1 with the consequence that

ϕ ϕ + 2πk = ϕ + 2πk′ ϕ + 2πk′ . (6.15) j − j−1 j j j − j−1 j−1 Using ¡ ¢ ¡ ¢ ′ ′ ϕj = ϕj + 2πkj (6.16) as a convenient integration variable (and dropping the primes) we get

2π(kP +1) 2π(k1+1) 2π Z = dϕP ... dϕ1 dϕ0 δ(ϕP ϕ0 2πkP ) N 2π+kP 2π+k1 0 − − k1X,...,kP Z Z Z P 2 1 ϕj ϕj−1 ϕj ϕj−1 exp i ∆j − ng − ZTB[ϕ]. ×  "4EC ∆j − ∆j # ×  Xj=1 µ ¶ µ ¶ 

With the exception of kP the sums over kj can be incorporated in the integrals over ϕj (leading to an unrestricted integration) such that we can simplify further and get the following result in the continuum limit

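$$
Z = \sum_{k\in\mathbb{Z}} \int_{\varphi(0)}^{\varphi(0)+2\pi k} \mathcal{D}\varphi\; e^{iS_C[\varphi]} \times Z_{TB}[\varphi] \quad \text{with} \quad S_C[\varphi] = \int_{\mathcal{C}} dt \left[ \frac{\dot\varphi^2}{4E_C} - n_g \dot\varphi \right]. \tag{6.17}
$$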

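For the imaginary–time contour $\mathcal{C}$ connecting $z = 0$ and $z = -i\beta$ by a straight line one gets the integral over the Euclidean Coulomb action $S_C^E[\varphi]$

$$
Z = \sum_{k\in\mathbb{Z}} \int_{\varphi(0)}^{\varphi(0)+2\pi k} \mathcal{D}\varphi\; e^{-S_C^E[\varphi]} \times Z_{TB}[\varphi] \quad \text{with} \quad S_C^E[\varphi] = \int_0^\beta d\tau \left[ \frac{\dot\varphi^2}{4E_C} + i n_g \dot\varphi \right]. \tag{6.18}
$$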
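
The Poisson resummation (6.11) that trades the sum over charges $n$ for a sum over winding numbers $k$ can be checked numerically. The following is a minimal sketch, assuming an imaginary–time (Gaussian) weight $f(n) = e^{-a(n-n_g)^2}$ with $a > 0$ in place of the oscillatory real–time weight of eq. (6.10); the function names and the parameter `a` are illustrative and do not appear in the derivation.

```python
import math

def charge_sum(a, ng, nmax=200):
    """Left-hand side of (6.11): direct sum over integer charges n."""
    return sum(math.exp(-a * (n - ng) ** 2) for n in range(-nmax, nmax + 1))

def winding_sum(a, ng, kmax=50):
    """Right-hand side of (6.11): sum over winding numbers k.

    For the Gaussian weight the Fourier integral is known in closed form:
    int dn exp(-2 pi i k n) exp(-a (n - ng)^2)
        = sqrt(pi / a) * exp(-2 pi i k ng) * exp(-pi^2 k^2 / a).
    The imaginary parts cancel between k and -k, leaving a cosine.
    """
    prefactor = math.sqrt(math.pi / a)
    return prefactor * sum(
        math.cos(2.0 * math.pi * k * ng) * math.exp(-(math.pi * k) ** 2 / a)
        for k in range(-kmax, kmax + 1)
    )
```

For small `a` the charge sum converges slowly while the winding sum is dominated by a few values of $k$, which is why the phase representation with winding numbers is the convenient one in that regime.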

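6.2.3 Coherent State Path Integral and Source Terms

Following the standard derivation of chapter 3, the remaining coherent state path integral can be written out easily. For convenience we directly give the imaginary–time expression

$$
Z_{TB}[\varphi] = \int \mathcal{D}\mu(\phi) \int \mathcal{D}\mu(\psi)\; e^{-S_{TB}[\varphi,\phi,\psi]} \quad \text{with} \quad S_{TB} = S_L + S_D + S_T , \tag{6.19}
$$

$$
S_L[\phi] = \int_0^\beta d\tau \sum_{j,k,\sigma} \phi^{(j)*}_{k\sigma}(\tau) \left[ \partial_\tau - \mu + E^j_{k\sigma} \right] \phi^{(j)}_{k\sigma}(\tau) \equiv \phi^* g^{-1} \phi \tag{6.20}
$$

$$
S_D[\psi] = \int_0^\beta d\tau \sum_{q,\sigma} \psi^*_{q\sigma}(\tau) \left[ \partial_\tau - \mu + \epsilon_{q\sigma} \right] \psi_{q\sigma}(\tau) \equiv \psi^* \mathcal{G}^{-1} \psi \tag{6.21}
$$

$$
S_T[\varphi,\phi,\psi] = \int_0^\beta d\tau \sum_{j,k,q,\sigma} \left[ \phi^{(j)*}_{k\sigma}(\tau) \Lambda^j_{kq\sigma}(\tau) \psi_{q\sigma}(\tau) + \psi^*_{q\sigma}(\tau) \Lambda^{j*}_{kq\sigma}(\tau) \phi^{(j)}_{k\sigma}(\tau) \right] \equiv \phi^* \Lambda \psi + \psi^* \Lambda^* \phi \tag{6.22}
$$

where in the definitions ('$\equiv$') we have also included the integration over imaginary time $\tau$ in the vector multiplication. The matrices $g^{-1}$ and $\mathcal{G}^{-1}$ denote the inverse of the Green's function of the lead and the island electrons which are defined by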

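$$
\begin{aligned}
g^{j,j'}_{k\sigma,k'\sigma'}(\tau,\tau') &= g^j_{k\sigma}(\tau,\tau')\, \delta_{jj'} \delta_{kk'} \delta_{\sigma\sigma'} \\
\left( \partial_\tau - \mu + E^j_{k\sigma} \right) g^j_{k\sigma}(\tau,\tau') &= \delta(\tau - \tau') \quad \text{and} \quad g^j_{k\sigma}(\beta,\tau') = -g^j_{k\sigma}(0,\tau')
\end{aligned}
\tag{6.23}
$$

for the leads, and correspondingly for the island as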

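$$
\begin{aligned}
\mathcal{G}_{q\sigma,q'\sigma'}(\tau,\tau') &= \mathcal{G}_{q\sigma}(\tau,\tau')\, \delta_{qq'} \delta_{\sigma\sigma'} \\
\left( \partial_\tau - \mu + \epsilon_{q\sigma} \right) \mathcal{G}_{q\sigma}(\tau,\tau') &= \delta(\tau - \tau') \quad \text{and} \quad \mathcal{G}_{q\sigma}(\beta,\tau') = -\mathcal{G}_{q\sigma}(0,\tau')
\end{aligned}
\tag{6.24}
$$

where $\mu$ denotes the common chemical potential of leads and island.

The imaginary–time autocorrelation function of the SET can be expressed as the derivative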

$$
C_I(\tau) = \langle I(\tau) I(0) \rangle = \frac{1}{Z_I[0]} \left. \frac{\delta^2 Z_I[\chi]}{\delta\chi(\tau)\, \delta\chi(0)} \right|_{\chi\equiv 0} \tag{6.25}
$$

of a generating functional $Z_I[\chi]$ which can be constructed from the partition function by addition of the source term

$$
S_I[\chi] = -\int_0^\beta d\tau\, I(\tau) \chi(\tau) \tag{6.26}
$$

to the action. Here $I(\tau)$ denotes the coherent state matrix element of the current operator for the SET.

To derive an expression for the current operator $I$ one can analyze the time–dependence of the number of quasi–particles $n_j = \sum_{k\sigma} c^{(j)\dagger}_{k\sigma} c^{(j)}_{k\sigma}$ on the leads as presented in Appendix D.2. As a result one gets for the current operator

$$
I = \sum_{j,k,q,\sigma} \left[ \phi^{(j)\dagger}_{k\sigma} \left( (-1)^j i e\, w_j\, t^j_{kq\sigma} \right) \psi_{q\sigma} + \psi^\dagger_{q\sigma} \left( (-1)^j i e\, w_j\, t^j_{kq\sigma} \right)^* \phi^{(j)}_{k\sigma} \right] \tag{6.27}
$$

and correspondingly for the source term

$$
S_I[\chi] = -\int_0^\beta d\tau \sum_{j,k,q,\sigma} \left[ \phi^{(j)*}_{k\sigma}(\tau) \left( (-1)^j i e\, w_j\, \chi(\tau)\, t^j_{kq\sigma}(\tau) \right) \psi_{q\sigma}(\tau) + \text{h.c.} \right] \tag{6.28}
$$

which is quite similar to the tunnel action. Hence it can be incorporated easily in the preceding discussion of the partition function by the simple redefinition

$$
\Lambda^j_{kq\sigma}(\tau) \equiv \left[ 1 - (-1)^j i e\, w_j\, \chi(\tau) \right] t^j_{kq\sigma}\, e^{i\varphi(\tau)} \tag{6.29}
$$

which includes the source terms in the tunnel action.

We close this section with a summary of the path integral formulation for the SET. For the generating functional $Z_I[\chi]$ of the current autocorrelation function we found the representation

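$$
Z_I[\chi] = \sum_{k\in\mathbb{Z}} \int \mathcal{D}\varphi \int \mathcal{D}\mu(\phi) \int \mathcal{D}\mu(\psi)\; e^{-S[\chi;\varphi,\phi,\psi]} \quad \text{with} \quad S = S_C + S_L + S_D + S_{TI}. \tag{6.30}
$$

The action $S$ is composed of the Coulomb action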

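$$
S_C[\varphi] = \int_0^\beta d\tau \left[ \frac{\dot\varphi^2(\tau)}{4E_C} + i n_g \dot\varphi(\tau) \right], \tag{6.31}
$$

the actions

$$
S_L[\phi] = \phi^* g^{-1} \phi \quad \text{and} \quad S_D[\psi] = \psi^* \mathcal{G}^{-1} \psi \tag{6.32}
$$

of the quasi–particle reservoirs, and the tunnel action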

$$
S_{TI}[\chi;\varphi,\phi,\psi] = \phi^* \Lambda[\chi;\varphi] \psi + \psi^* \Lambda^*[\chi;\varphi] \phi \tag{6.33}
$$

that also contains the source terms for the generating functional if $\Lambda$ is given by eq. (6.29).

6.3 Effective Action of the Single Electron Transistor

In this section we will integrate over the quasi–particle reservoirs and thus derive an effective tunnel action that allows us to express the generating functional and the current correlation function as a simple Feynman path integral over the phase ϕ.

6.3.1 Exact Integration of Quasi-Particle Baths

The action (6.32)–(6.33) is quadratic in the fermionic fields $\phi$ and $\psi$ and hence the corresponding path integrals are Gaussian and can be performed analytically. The integration over the quasi–particle reservoirs will be done in two steps. First we integrate over the fermions in the leads to get an effective action for the electrons on the island only. Using the general formula (3.44) we get

$$
\int \mathcal{D}\mu(\phi)\; e^{-\phi^* g^{-1} \phi - \phi^* \Lambda \psi - \psi^* \Lambda^* \phi} = Z_L\, e^{\psi^* \Lambda^* g \Lambda \psi} \tag{6.34}
$$

where we have used that $\det(g^{-1}) = Z_L$ is just the partition function of the (isolated) leads. The result after this first integration step can be summarized as

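$$
Z = Z_L \sum_{k\in\mathbb{Z}} \int \mathcal{D}\varphi \int \mathcal{D}\mu(\psi)\; e^{-S[\varphi,\psi]} \quad \text{with} \quad S[\varphi,\psi] = S_C[\varphi] + \tilde{S}_D[\varphi,\psi], \tag{6.35}
$$

$$
S_C[\varphi] = \int_0^\beta d\tau \left[ \frac{\dot\varphi^2(\tau)}{4E_C} + i n_g \dot\varphi(\tau) \right] \tag{6.36}
$$

$$
\tilde{S}_D[\varphi,\psi] = \psi^* \left[ \mathcal{G}^{-1} - \Lambda^* g \Lambda \right] \psi \equiv \psi^* \tilde{\mathcal{G}}^{-1} \psi \tag{6.37}
$$

with the effective Green's function $\tilde{\mathcal{G}}$ of the electrons on the island.

In a second step we can integrate over the fields $\psi$. The simple Gaussian integral results in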

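$$
\int \mathcal{D}\mu(\psi)\; e^{-\psi^* \tilde{\mathcal{G}}^{-1} \psi} = \det\left( \tilde{\mathcal{G}}^{-1} \right) = Z_D \det\left( 1 - \mathcal{G} \Lambda^* g \Lambda \right) = Z_D\, e^{\mathrm{tr}\{ \ln[ 1 - \mathcal{G} \Lambda^* g \Lambda ] \}} \tag{6.38}
$$

where $Z_D = \det(\mathcal{G}^{-1})$ is the partition function of the isolated island.

The remaining determinant has been rewritten as an exponential function to identify the effective tunnel action. The partition function has now been reduced to a single (Feynman) path integral over the phase $\varphi$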

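$$
Z_I[\chi] = Z_L Z_D \sum_{k\in\mathbb{Z}} \int \mathcal{D}\varphi\; e^{-S_{\mathrm{eff}}[\chi;\varphi]} \quad \text{with} \quad S_{\mathrm{eff}} = S_C + S_T , \tag{6.39}
$$

$$
S_C[\varphi] = \int_0^\beta d\tau \left[ \frac{\dot\varphi^2(\tau)}{4E_C} + i n_g \dot\varphi(\tau) \right] \tag{6.40}
$$

$$
S_T[\chi;\varphi] = -\mathrm{tr}\left\{ \ln\left( 1 - \mathcal{G} \Lambda^*[\chi;\varphi]\, g\, \Lambda[\chi;\varphi] \right) \right\}. \tag{6.41}
$$

6.3.2 The Tunnel Action

Following the ideas of Ambegaokar et al. [85] and Grabert [86] we evaluate explicitly the effective tunnel action (6.41). As a first step the logarithm is expanded as

$$
S_{TI}[\chi;\varphi] = -\mathrm{tr}\left\{ \ln\left( 1 - \mathcal{G} \Lambda^* g \Lambda \right) \right\} = \sum_{n=1}^{\infty} \frac{1}{n}\, \mathrm{tr}\left\{ \left( \mathcal{G} \Lambda^* g \Lambda \right)^n \right\}. \tag{6.42}
$$

Due to the large number $N = \sum_\sigma 1$ of available channels in the metallic tunnel junctions, it is sufficient to keep only the first term in this expansion. A more detailed justification for this approximation will be given after we have evaluated the lowest order contribution.

In the approximation of large channel number $N$ the tunnel action is given by

$$
\begin{aligned}
S_{TI}[\chi;\varphi] &\approx \mathrm{tr}\left\{ \mathcal{G} \Lambda^* g \Lambda \right\} \\
&= \sum_{j,k,q,\sigma} \int d\tau \int d\tau'\; \mathcal{G}_{q\sigma}(\tau,\tau')\, \lambda_j^*(\tau')\, t^{j*}_{kq\sigma}\, g^j_{k\sigma}(\tau',\tau)\, \lambda_j(\tau)\, t^j_{kq\sigma}
\end{aligned}
\tag{6.43}
$$

where we have used the short hand notation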

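$$
\lambda_j(\tau) = \left[ 1 - (-1)^j i e\, w_j\, \chi(\tau) \right] e^{i\varphi(\tau)} \quad \text{by which} \quad \Lambda^j_{kq\sigma}(\tau) = \lambda_j(\tau)\, t^j_{kq\sigma}, \tag{6.44}
$$

and the (weighted) electron–hole–pair Green's functions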

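$$
\alpha_j(\tau,\tau') = \sum_{k,q,\sigma} \left| t^j_{kq\sigma} \right|^2 \mathcal{G}_{q\sigma}(\tau,\tau')\, g^j_{k\sigma}(\tau',\tau). \tag{6.45}
$$

To evaluate $\alpha_j$ explicitly, we have to insert the Green's function $\mathcal{G}_{q\sigma}(\tau,\tau')$ of free quasi–particles on the island (cf. (3.82))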

$$
\mathcal{G}_{q\sigma}(\tau,\tau') =
\begin{cases}
\dfrac{e^{-\epsilon_{q\sigma}(\tau-\tau')}}{1 + e^{-\beta\epsilon_{q\sigma}}} & \tau > \tau' \\[2ex]
-\dfrac{e^{-\epsilon_{q\sigma}(\tau-\tau')}}{1 + e^{\beta\epsilon_{q\sigma}}} & \tau < \tau'
\end{cases}
\tag{6.46}
$$

and the analogous expression for $g^j_{k\sigma}(\tau',\tau)$. Here all energies are measured from the common chemical potential $\mu$ of the leads and the island.

At low temperatures tunneling will occur only for energies close to the Fermi energy and we can assume that in the relevant energy range the tunnel matrix elements will be constant for each tunnel junction, i.e.

$$
t^j_{kq\sigma} \approx t_j . \tag{6.47}
$$

By the same reasoning we can rewrite the summation over wave vectors as an integration over energies using a constant density of states per channel, i.e. we make the replacements

$$
\sum_{k\sigma} f_{k\sigma} \approx \sum_\sigma \int_{-\infty}^{\infty} d\epsilon\, \rho_\sigma f(\epsilon) \quad \text{with} \quad \rho_\sigma = \rho_\sigma(0). \tag{6.48}
$$

Since the metallic bandwidth is much larger than all other relevant energy scales of the problem the integration has been extended to infinity.

Inserting eqs. (6.46)–(6.48) into the electron–hole–pair Green's function and taking into account that the Green's functions and hence the $\alpha_j$ depend only on the time difference $\tau - \tau'$ we get

$$
\alpha_j(\tau) = -|t_j|^2 \sum_\sigma \int_{-\infty}^{\infty} d\epsilon \int_{-\infty}^{\infty} d\epsilon'\; \rho_\sigma \rho'_\sigma\, \frac{e^{-\epsilon\tau}}{1 + e^{\mp\beta\epsilon}}\, \frac{e^{\epsilon'\tau}}{1 + e^{\pm\beta\epsilon'}} \tag{6.49}
$$

where the upper signs correspond to $\tau > 0$ and the lower ones to $\tau < 0$.

The integrations over the energies can be performed [87] and replacing $\rho_\sigma$ and $\rho'_\sigma$ by the average densities of states per channel $\rho$ and $\rho'$ of the island and the leads one gets

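$$
\begin{aligned}
\alpha_j(\tau) &= -|t_j|^2 N \rho\rho' \int_{-\infty}^{\infty} d\epsilon \int_{-\infty}^{\infty} d\epsilon'\; \frac{e^{-\epsilon\tau}}{1 + e^{\mp\beta\epsilon}}\, \frac{e^{\epsilon'\tau}}{1 + e^{\pm\beta\epsilon'}} \\
&= -|t_j|^2 N \rho\rho'\, \frac{\pi^2}{\beta^2 \sin^2\left( \frac{\pi}{\beta}\tau \right)} .
\end{aligned}
\tag{6.50}
$$

The prefactor can be related to the dimensionless tunnel conductance of the junction $j$ [86]

$$
g_j = \frac{2\pi}{e^2} G_j = 4\pi^2 |t_j|^2 N \rho\rho' . \qquad (\hbar = 1) \tag{6.51}
$$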
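
The closed form in (6.50) rests on the single energy integral $\int d\epsilon\, e^{-\epsilon\tau}/(1+e^{-\beta\epsilon}) = \pi/[\beta\sin(\pi\tau/\beta)]$ for $0 < \tau < \beta$, which enters once for the island and once for the leads. The sketch below checks this numerically with a plain trapezoidal rule; the step size `h`, the cutoff and the function name are illustrative choices, not part of the derivation.

```python
import math

def fermi_weighted_integral(tau, beta, h=0.01, cutoff=40.0):
    """Trapezoidal estimate of int d(eps) exp(-eps*tau) / (1 + exp(-beta*eps)).

    Valid for 0 < tau < beta. The integrand is written with decaying
    exponentials on both half axes to avoid floating point overflow.
    """
    half_width = cutoff / min(tau, beta - tau)  # integrand ~ e^{-cutoff} at the edge
    n = int(half_width / h)
    total = 0.0
    for i in range(-n, n + 1):
        eps = i * h
        if eps >= 0.0:
            value = math.exp(-eps * tau) / (1.0 + math.exp(-beta * eps))
        else:
            value = math.exp((beta - tau) * eps) / (1.0 + math.exp(beta * eps))
        weight = 0.5 if abs(i) == n else 1.0  # trapezoidal end-point weights
        total += weight * value
    return total * h

def closed_form(tau, beta):
    """pi / (beta * sin(pi * tau / beta)), the analytic value of the integral."""
    return math.pi / (beta * math.sin(math.pi * tau / beta))
```

Squaring the integral and dividing by $4\pi^2$ reproduces $1/[4\beta^2\sin^2(\pi\tau/\beta)]$, i.e. the magnitude of $\alpha_j(\tau)/g_j$ implied by (6.50) and (6.51).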

For the tunnel action of the SET we thus get in the approximation of large channel number the expression

$$
S_{TI}[\chi;\varphi] = -\int_0^\beta d\tau \int_0^\beta d\tau'\; \frac{1}{4\beta^2 \sin^2\left( \frac{\pi}{\beta}(\tau - \tau') \right)} \sum_j g_j \lambda_j(\tau) \lambda_j^*(\tau'). \tag{6.52}
$$

At this point one can easily understand why the higher order terms in the expansion of the logarithm may be neglected in the limit $N \gg 1$. The first order term that we have derived is proportional to $N(|t_0|^2 + |t_1|^2)$ which in turn is proportional to the parallel conductance $g = g_0 + g_1$ of the tunnel junctions. The second term in the expansion would be proportional to $N(|t_0|^4 + |t_1|^4)$ and hence to $g^2/N$, which is negligible for large $N$. A detailed study of Coulomb charging for finite channel number has been given in [88] with the conclusion that the large $N$ action suffices already for $N \gtrsim 10$.

To conclude the evaluation of the tunnel action we have to write out the sum $\sum_j g_j \lambda_j(\tau) \lambda_j^*(\tau')$. We get

$$
\begin{aligned}
\sum_j g_j \lambda_j(\tau) \lambda_j^*(\tau') &= \sum_j g_j \left[ 1 - (-1)^j i e\, w_j \chi(\tau) \right] \left[ 1 + (-1)^j i e\, w_j \chi(\tau') \right] e^{i(\varphi(\tau) - \varphi(\tau'))} \\
&= \sum_j g_j \left[ 1 + e^2 w_j^2 \chi(\tau)\chi(\tau') - (-1)^j i e\, w_j \left( \chi(\tau) - \chi(\tau') \right) \right] e^{i(\varphi(\tau) - \varphi(\tau'))} \\
&= \left[ g + 2\pi G_{cl}\, \chi(\tau)\chi(\tau') \right] e^{i(\varphi(\tau) - \varphi(\tau'))}
\end{aligned}
\tag{6.53}
$$

where for the last equality the weights $w_j$ for the current operator were chosen according to

$$
w_0 = \frac{g_1}{g} \quad \text{and} \quad w_1 = \frac{g_0}{g} \quad \text{with} \quad g = g_0 + g_1 \tag{6.54}
$$

to cancel the asymmetry of the junctions in the summations

$$
g_0 + g_1 = g, \qquad g_0 w_0^2 + g_1 w_1^2 = \frac{g_0 g_1}{g_0 + g_1} \equiv \frac{2\pi}{e^2} G_{cl}, \qquad \text{and} \qquad g_0 w_0 - g_1 w_1 = 0. \tag{6.55}
$$

Here $G_{cl}$ is just the classical series conductance of the two tunnel junctions.

Since $\alpha$ is an even function of $\tau - \tau'$ the imaginary part of $S_{TI}$ averages to zero in the double integral (6.52) such that we can summarize our findings for the effective tunnel action as

$$
S_{TI}[\chi;\varphi] = -\int_0^\beta d\tau \int_0^\beta d\tau'\; \alpha(\tau - \tau') \left[ g + 2\pi G_{cl}\, \chi(\tau)\chi(\tau') \right] \cos\left( \varphi(\tau) - \varphi(\tau') \right) \tag{6.56}
$$

$$
\alpha(\tau - \tau') = \frac{1}{4\beta^2 \sin^2\left( \frac{\pi}{\beta}(\tau - \tau') \right)} . \tag{6.57}
$$

The dependence on the sources $\chi$ is only relevant for the derivation of the formula for the current autocorrelation function that will be presented in the next subsection. After taking the functional derivative with respect to $\chi(\tau)$ and $\chi(0)$ the sources $\chi$ are set to zero and the resulting expression for the tunnel action becomes

$$
\begin{aligned}
S_T[\varphi] &= -g \int_0^\beta d\tau \int_0^\beta d\tau'\; \alpha(\tau - \tau') \cos\left( \varphi(\tau) - \varphi(\tau') \right) \\
&= 2g \int_0^\beta d\tau \int_0^\beta d\tau'\; \alpha(\tau - \tau') \sin^2\left( \frac{\varphi(\tau) - \varphi(\tau')}{2} \right) - \text{const.}
\end{aligned}
\tag{6.58}
$$

Especially in the second form that results from a simple trigonometrical identity, this result has been used frequently in the literature [2, 81, 85, 86].
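
The role of the weights (6.54) can be made concrete with a few lines of arithmetic. The sketch below evaluates the three sums of eq. (6.55) for given junction conductances; the function names and the example values of $g_0$ and $g_1$ are illustrative assumptions, not parameters quoted in this chapter.

```python
def junction_weights(g0, g1):
    """Weights of eq. (6.54): w0 = g1 / g and w1 = g0 / g with g = g0 + g1."""
    g = g0 + g1
    return g1 / g, g0 / g

def weighted_sums(g0, g1):
    """Return the three sums of eq. (6.55).

    parallel  : g0 + g1
    series    : g0*w0^2 + g1*w1^2, equal to g0*g1 / (g0 + g1)
    asymmetry : g0*w0 - g1*w1, which vanishes for the weights (6.54)
    """
    w0, w1 = junction_weights(g0, g1)
    parallel = g0 + g1
    series = g0 * w0 ** 2 + g1 * w1 ** 2
    asymmetry = g0 * w0 - g1 * w1
    return parallel, series, asymmetry
```

With this choice the term linear in $\chi$ drops out of (6.53) for any ratio $g_0/g_1$, so the effective action depends on the junctions only through the parallel conductance $g$ and the series conductance $G_{cl}$.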

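6.3.3 The Current Autocorrelation Function

From the path integral formula (6.39)–(6.41) for the generating functional and the explicit expression (6.56)–(6.57) for the tunnel action we can derive the path integral for the current autocorrelation function in imaginary time

$$
C_I(\tau) = \frac{1}{Z_I[0]} \left. \frac{\delta^2 Z_I[\chi]}{\delta\chi(\tau)\, \delta\chi(0)} \right|_{\chi\equiv 0} . \tag{6.59}
$$

Since the generating functional depends on the sources $\chi$ only through the tunnel action we first evaluate the functional derivatives of $S_{TI}$

$$
\frac{\delta S_{TI}[\chi]}{\delta\chi(\tau)} = -\int_0^\beta d\tau'\; \alpha(\tau - \tau')\, 4\pi G_{cl}\, \chi(\tau') \cos\left( \varphi(\tau) - \varphi(\tau') \right) \tag{6.60}
$$

$$
\frac{\delta^2 S_{TI}[\chi]}{\delta\chi(\tau)\, \delta\chi(\tau')} = -4\pi G_{cl}\, \alpha(\tau - \tau') \cos\left( \varphi(\tau) - \varphi(\tau') \right). \tag{6.61}
$$

Using these results one gets for the current autocorrelation function

$$
\begin{aligned}
C_I(\tau) &= \frac{1}{Z} \sum_{k\in\mathbb{Z}} \int \mathcal{D}\varphi \left. \left[ \frac{\delta S_{TI}[\chi]}{\delta\chi(\tau)} \frac{\delta S_{TI}[\chi]}{\delta\chi(0)} - \frac{\delta^2 S_{TI}[\chi]}{\delta\chi(\tau)\, \delta\chi(0)} \right] e^{-S[\chi;\varphi]} \right|_{\chi\equiv 0} \\
&= 4\pi\, G_{cl}\, \alpha(\tau)\, A(\tau)
\end{aligned}
\tag{6.62}
$$

with the cosine correlation function

$$
A(\tau) = \frac{1}{Z} \sum_{k\in\mathbb{Z}} \int_{\varphi(0)=0}^{\varphi(\beta)=2\pi k} \mathcal{D}\varphi\; \cos\left( \varphi(\tau) - \varphi(0) \right) e^{-S[\varphi]}, \tag{6.63}
$$

and the action $S = S_C + S_T$ given by eqs. (6.40) and (6.58).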

6.4 Monte Carlo Calculation of the Correlation Function

In this section we discretize the path integral and put it in a form that is suitable for the numerical evaluation of the imaginary–time correlation function by the path integral Monte Carlo method. We give the details of the Monte Carlo calculations and present results for the cosine correlation function.

6.4.1 Discretization of the Path Integral

Dimensionless Energies

For the numerical evaluation we measure all energies in units of the charging energy $E_C$ and write out the path integral in discrete form. In units of $E_C$ the action of the SET reads

$$
S_C[\varphi] = \int_0^{\beta E_C} d\tau \left[ \frac{\dot\varphi^2(\tau)}{4} + i n_g \dot\varphi(\tau) \right] \tag{6.64}
$$

$$
S_T[\varphi] = -g \int_0^{\beta E_C} d\tau \int_0^{\beta E_C} d\tau'\; \alpha(\tau - \tau') \cos\left( \varphi(\tau) - \varphi(\tau') \right) \tag{6.65}
$$

$$
\text{with} \quad \alpha(\tau) = \frac{1}{4(\beta E_C)^2 \sin^2\left( \frac{\pi}{\beta E_C}\tau \right)} . \tag{6.66}
$$

Winding Numbers

Eq. (6.63) expresses the cosine correlation function as the sum over path integrals with different boundary conditions. Instead of evaluating each path integral separately up to a certain cutoff $k_{co}$ and adding them up, it is more convenient to make the transformation

$$
\varphi(\tau) = \zeta(\tau) + \nu_k \tau \quad \text{with} \quad \nu_k = \frac{2\pi k}{\beta E_C} \tag{6.67}
$$

and use Monte Carlo sampling for the path variable $\zeta(\tau)$ as well as for the winding number $k$. The Coulomb action transforms to

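$$
\begin{aligned}
S_C[\zeta,k] &= \int_0^{\beta E_C} d\tau \left[ \frac{1}{4} \left( \dot\zeta^2(\tau) + 2\nu_k \dot\zeta(\tau) + \nu_k^2 \right) + i n_g \left( \dot\zeta(\tau) + \nu_k \right) \right] \\
&= \int_0^{\beta E_C} d\tau \left[ \frac{1}{4} \left( \dot\zeta^2(\tau) + \nu_k^2 \right) + i n_g \nu_k \right]
\end{aligned}
\tag{6.68}
$$

where we have used that $\int d\tau\, \dot\zeta(\tau) = \zeta(\beta E_C) - \zeta(0) = 0$ due to the boundary conditions for the paths. The tunnel action changes to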

$$
S_T[\zeta] = -g \int_0^{\beta E_C} d\tau \int_0^{\beta E_C} d\tau'\; \alpha(\tau - \tau') \cos\left( \zeta(\tau) - \zeta(\tau') + \nu_k (\tau - \tau') \right). \tag{6.69}
$$

Metropolis Action

As pointed out in section 4.5 we need a positive definite action for the application of the Metropolis algorithm. Therefore we have to include the imaginary part of the Coulomb action in the observable to be measured. For the cosine correlation function this amounts to

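$$
A(\tau) = \frac{\left\langle e^{-2\pi i n_g k} \cos\left( \zeta(\tau) + \nu_k \tau \right) \right\rangle_0}{\left\langle e^{-2\pi i n_g k} \right\rangle_0} \tag{6.70}
$$

where $\langle \cdot \rangle_0$ denotes the average

$$
\langle X \rangle_0 \equiv \sum_{k\in\mathbb{Z}} \int_{\zeta(0)=0}^{\zeta(\beta E_C)=0} \mathcal{D}\zeta\; X\, e^{-S_0[\zeta,k]} \tag{6.71}
$$

with the positive action

$$
S_0[\zeta,k] = \int_0^{\beta E_C} d\tau\, \frac{1}{4} \left( \dot\zeta^2(\tau) + \nu_k^2 \right) - g \int_0^{\beta E_C} d\tau \int_0^{\beta E_C} d\tau'\; \alpha(\tau - \tau') \cos\left( \zeta(\tau) - \zeta(\tau') + \nu_k (\tau - \tau') \right). \tag{6.72}
$$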
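
The ratio (6.70) is a standard reweighting estimator: configurations are sampled with the real weight $e^{-S_0}$, and the phase $e^{-2\pi i n_g k}$ is carried along as part of the observable. A minimal sketch of how such an estimator could be accumulated from stored samples follows; the function name and the list-based interface are assumptions made for illustration and do not describe the simulation code used for this work.

```python
import cmath
import math

def reweighted_average(observable, windings, ng):
    """Ratio estimator in the spirit of eq. (6.70).

    observable : values of cos(zeta(tau) + nu_k * tau), one per sampled configuration
    windings   : winding number k of each sampled configuration
    ng         : dimensionless gate voltage
    Returns (estimate, average_phase); the second entry is the sample mean
    of exp(-2 pi i ng k), i.e. the normalized denominator of (6.70).
    """
    numerator = 0j
    denominator = 0j
    for value, k in zip(observable, windings):
        phase = cmath.exp(-2j * math.pi * ng * k)
        numerator += phase * value
        denominator += phase
    return numerator / denominator, denominator / len(windings)
```

When the average phase in the denominator becomes small, the statistical error of the ratio grows, which is the sign problem discussed later in this section.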

Discretization

The path integral for the cosine correlation function (6.70)–(6.72) is in a suitable form for the application of the Monte Carlo method and we can proceed to discretize the problem for the numerics. We split the interval $[0, \beta E_C]$ into $P$ pieces of length $\Delta\tau$ and use the notation $\zeta_j = \zeta(\tau_j)$ with $\tau_j = j\Delta\tau$. Approximating the derivatives by difference quotients and the integrals by Riemann sums the action $S_0$ takes the form

$$
\begin{aligned}
S_0[\zeta,k] &\approx \sum_{j=1}^{P} \frac{\Delta\tau}{4} \left( \frac{\zeta_j - \zeta_{j-1}}{\Delta\tau} \right)^2 + \frac{\beta E_C}{4} \nu_k^2 - \sum_{j,j'=1}^{P} \Delta\tau^2\, \frac{g \cos\left( \zeta_j - \zeta_{j'} + \nu_k \Delta\tau (j - j') \right)}{4(\beta E_C)^2 \sin^2\left( \frac{\pi\Delta\tau}{\beta E_C}(j - j') \right)} \\
&= \sum_{j=1}^{P} \frac{P\, (\zeta_j - \zeta_{j-1})^2}{4\beta E_C} + \frac{\pi^2 k^2}{\beta E_C} - g \sum_{j,j'=1}^{P} \alpha_{j-j'} \cos\left( \zeta_j - \zeta_{j'} + \frac{2\pi k}{P}(j - j') \right)
\end{aligned}
\tag{6.73}
$$

with

$$
\alpha_j = \frac{1}{4P^2 \sin^2\left( \frac{\pi}{P} j \right)} . \tag{6.74}
$$

A special consideration is necessary for the terms with $j = j'$ in the tunnel action, which have a (constant) divergent contribution that can be neglected, and a finite contribution that leads to a renormalization of the parameters of the Coulomb action. To identify the two parts one uses the simple trigonometrical identity $\cos(x) = 1 - 2\sin^2(x/2)$. The finite contribution stemming from the $\sin^2$–term shall be calculated explicitly. Using the abbreviation

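$$
\xi(\tau) = \frac{1}{2} \left( \zeta(\tau) - \zeta(\tau_j) + \frac{2\pi k}{\beta E_C} (\tau - \tau_j) \right) \quad \text{with} \quad \lim_{\tau\to\tau_j} \xi(\tau) = 0 \tag{6.75}
$$

the nontrivial $j = j'$ terms may be written as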

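$$
\begin{aligned}
2\alpha_0 \sin^2\left[ \frac{\zeta_j - \zeta_j + \frac{2\pi k}{P} \cdot 0}{2} \right] &= \lim_{\tau\to\tau_j} \frac{1}{2P^2} \frac{\sin^2[\xi(\tau)]}{\sin^2\left[ \frac{\pi}{\beta E_C}(\tau - \tau_j) \right]} = \frac{(\beta E_C)^2}{2\pi^2 P^2}\, \dot\xi^2(\tau_j) \\
&= \frac{1}{8\pi^2} \left[ \zeta_j - \zeta_{j-1} \right]^2 + \frac{k}{2\pi P} \left[ \zeta_j - \zeta_{j-1} \right] + \frac{1}{2P^2} k^2 .
\end{aligned}
\tag{6.76}
$$

When we insert this expression into the action (6.73), the second term in the last line of (6.76) sums to zero while the first and last renormalize the coefficients of the Coulomb action. This leads to the following formula for the action that can be used for the Monte Carlo simulation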

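$$
S_0[\zeta,k] = c_\zeta \sum_{j=1}^{P} \left[ \zeta_j - \zeta_{j-1} \right]^2 + c_k k^2 - 2g \sum_{j<j'} \alpha_{j-j'} \cos\left( \zeta_j - \zeta_{j'} + \frac{2\pi k}{P}(j - j') \right) \tag{6.77}
$$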
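
The discretized action (6.77) translates directly into code. The following sketch evaluates $S_0[\zeta,k]$ for a path stored as a list `zeta` of length $P+1$ with `zeta[0]` and `zeta[P]` at the fixed end points. The helper `coefficients` is an assumption: it combines the kinetic prefactors of (6.73) with the $j = j'$ contributions of (6.76), which is our reading of the renormalization described above and not a formula quoted from the text. All names are illustrative.

```python
import math

def alpha(m, P):
    """Discrete kernel alpha_m of eq. (6.74); m must not be a multiple of P."""
    return 1.0 / (4.0 * P ** 2 * math.sin(math.pi * m / P) ** 2)

def coefficients(P, beta_EC, g):
    """ASSUMPTION: renormalized coefficients inferred from (6.73) and (6.76)."""
    c_zeta = P / (4.0 * beta_EC) + g / (8.0 * math.pi ** 2)
    c_k = math.pi ** 2 / beta_EC + g / (2.0 * P)
    return c_zeta, c_k

def action_S0(zeta, k, g, c_zeta, c_k):
    """Discretized real action of eq. (6.77).

    zeta : list of P + 1 path values, zeta[0] and zeta[P] are the end points
    k    : integer winding number
    """
    P = len(zeta) - 1
    kinetic = sum((zeta[j] - zeta[j - 1]) ** 2 for j in range(1, P + 1))
    tunnel = 0.0
    for j in range(1, P + 1):
        for jp in range(j + 1, P + 1):  # pairs with j < j'
            phase = zeta[j] - zeta[jp] + 2.0 * math.pi * k * (j - jp) / P
            tunnel += alpha(j - jp, P) * math.cos(phase)
    return c_zeta * kinetic + c_k * k ** 2 - 2.0 * g * tunnel
```

The double loop costs $O(P^2)$ operations per evaluation. A production code would presumably update only the terms affected by a single move, but the full evaluation is convenient for checking such an update.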

6.4.2 Details of the Monte Carlo Simulation

Before we present the results for the cosine correlation function obtained from the Monte Carlo evaluation of the path integral we give the relevant details about the simulation.

Range of the Parameters

The range of experimental parameters covered by our calculations includes inverse temperatures $\beta E_C \in \{0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 4.5, 5.0, 10.0, 21.0, 30.0\}$ for a (strongly conducting) SET with parallel conductance $g = 4.75$. This particular choice for $g$ was studied in detail since we could compare our theoretical results for this parameter with recent experimental results obtained by Wallisser et al. [2]. For the dimensionless gate voltage $n_g$ we considered 200 equidistantly spaced values in the relevant interval $n_g \in [0, 0.5]$. In addition we studied a range of parallel conductances $g \in \{2, 3, 4, 4.75, 6, 7, 10, 15, 20\}$ for a fixed inverse temperature $\beta E_C = 20$ in the Coulomb blockade regime.

Initialization and Equilibration

To determine suitable initial values for the path variables $\zeta$ and the winding number $k$ we equilibrated the system at the highest temperature $\beta E_C = 0.5$ starting from $\zeta \equiv 0$, $k = 0$ and performing 14 million Monte Carlo sweeps. As initial values for the lower temperatures we used configurations from simulation runs at higher temperature that were again equilibrated by another 2 to 12 million sweeps. Fig. 6.2 shows the behavior of the correlation function $A_j$ during the equilibration for $g = 4.75$, $\beta E_C = 0.5$ and $n_g = 0$. During equilibration the correlation function does not change monotonically with the number of steps, i.e. deviations from the final result can be either positive or negative. Still one clearly sees that artificial structures that are present around $\tau = 0.5\beta E_C$ vanish and the amplitude approaches the exact limit after $2.5 \times 10^6$ Monte Carlo sweeps. We still continued equilibration for another $12.5 \times 10^6$ sweeps before starting our measurement runs.

[Figure 6.2: plot of $A(\tau)$ versus $\tau/(\beta E_C)$; curves for $N_{MC} = 0.5, 1.0, 1.5, 2.0, 2.5 \times 10^6$ and the final result.]

Figure 6.2: Measurements of the correlation function during equilibration of the Monte Carlo simulation at parameters $g = 4.75$, $\beta E_C = 0.5$ and $n_g = 0$. Shown are the results after $\{0.5, 1.0, 1.5, 2.0, 2.5\} \times 10^6$ Monte Carlo steps (averaged over $0.5 \times 10^6$ steps) and the final result of the Monte Carlo measurements for the correlation function.

Measurement Details

For each value of $g$ and $\beta E_C$ we performed between 5 and 22.5 million measurements of the correlation functions that were grouped into 10 to 45 bins with a length of 0.5 million measurements for the evaluation of the statistical errors. To reduce correlations in the accumulation of the result, we waited 10 Monte Carlo sweeps in between two subsequent measurements. The number of Monte Carlo steps for each choice of parameters was adjusted to ensure that the statistical error of the imaginary–time correlation function is smaller than 3% over the whole range $\tau \in [0, \beta E_C]$. This error margin corresponds to the worst case, i.e. low temperature and/or small values of $g$, where the convergence of the Monte Carlo is slow for $\tau \approx 0.5\beta E_C$. As shown in the next subsection the Monte Carlo error is much smaller in general.
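
The grouping of measurements into bins follows the binning analysis of section 4.3.3. As a minimal sketch, assuming the bins are long compared to the autocorrelation time so that the bin averages can be treated as independent, the mean and its statistical error could be obtained as follows (function name and interface are illustrative):

```python
import math

def binned_error(samples, n_bins):
    """Mean and standard error from bin averages.

    The samples are split into n_bins consecutive bins of equal length
    (a remainder at the end is discarded). The error estimate is the
    standard deviation of the bin means divided by sqrt(n_bins).
    """
    bin_len = len(samples) // n_bins
    means = [
        sum(samples[b * bin_len:(b + 1) * bin_len]) / bin_len
        for b in range(n_bins)
    ]
    mean = sum(means) / n_bins
    variance = sum((m - mean) ** 2 for m in means) / (n_bins - 1)
    return mean, math.sqrt(variance / n_bins)
```

In the setting described above one would call this once per time slice $\tau_j$, with the stored measurements of $A_j$ as `samples` and between 10 and 45 bins.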

Trotter Convergence

Since the Trotter number $P$ also determines the discretization for the correlation function $A_j$ that is used for the analytical continuation, we have chosen a fixed value of $P = 200$ for all simulations. Over the whole range of temperatures this exceeds the previously used value $P \approx 5\beta E_C$ by a factor 2. Göppert et al. [16, 89] and Herrero et al. [90] reported a Trotter convergence of the Monte Carlo, i.e. negligible systematic errors, for this choice of the Trotter number. We could confirm this observation using Trotter extrapolation for $g = 4.75$ and temperature $\beta E_C = 5.0$. Fig. 6.3 shows the linear scaling of $A(\tau = 0)$ and $A(\tau = 0.5\beta E_C)$ as a function of $\Delta\tau^2$ as predicted by eq. (4.40). The systematic error of our data as determined from the difference between the value at Trotter number $P = 200$ and the extrapolation to $\Delta\tau = 0$ is less than 0.1% and hence can be neglected in comparison with the statistical fluctuations.

[Figure 6.3, panels a) and b): $A(\tau = 0.5)$ and $A(\tau = 2.5)$ versus $\Delta\tau^2$ for $n_g = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5$.]

Figure 6.3: Trotter extrapolation for the correlation function $A(\tau)$ at parallel conductance $g = 4.75$ and temperature $\beta E_C = 5.0$ for different values of the dimensionless gate voltage $n_g \in \{0.1, 0.2, 0.3, 0.4, 0.5\}$. Subfigure a) shows the Monte Carlo results for $\tau = 0.1\beta E_C$ and the linear fit. Subfigure b) displays the corresponding results for $\tau = 0.5\beta E_C$.
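
The Trotter extrapolation amounts to a linear least–squares fit of the measured values against $\Delta\tau^2$ and reading off the intercept at $\Delta\tau = 0$. A minimal sketch, assuming unweighted data points (a weighted fit using the statistical errors would be a natural refinement); the function name is illustrative:

```python
def trotter_extrapolate(dtaus, values):
    """Fit values = intercept + slope * dtau^2 and return (intercept, slope).

    dtaus  : imaginary-time steps used in the simulations
    values : measured correlation function at a fixed tau for each step
    The intercept is the estimate for the limit dtau -> 0.
    """
    x = [d * d for d in dtaus]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(values) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope
```

The systematic error quoted in the text corresponds to the difference between the value measured at $P = 200$ and the fitted intercept.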

Monte Carlo Moves and Acceptance Rates

For a single Monte Carlo step we chose randomly a time slice $j \in \{1, \ldots, P-1\}$ and attempted to move the path coordinate $\zeta_j$ by a random value in $[-\Delta\zeta, \Delta\zeta]$. The (maximal) stepsize $\Delta\zeta$ was adjusted to get an acceptance rate of around 30%. Every 10 such moves we also attempted a change of the winding number $k$ by $\pm 1$. The acceptance rate for a change of $k$ differed between almost 50% for $g = 4.75$, $\beta E_C = 20.0$ and less than 0.4% for the highest temperatures where already $k = 0$ gives the dominant contribution.
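
The two kinds of moves described above can be sketched as follows. This is an illustration under stated assumptions, not the simulation code used for this work: a "sweep" is taken to be $P-1$ attempted local moves, the action is passed in as a callable `action(zeta, k)` that is re-evaluated in full after each proposal, and `rng` is a `random.Random` instance; all of these names are hypothetical.

```python
import math
import random

def metropolis_sweep(zeta, k, action, dzeta, rng, k_every=10):
    """Perform P - 1 local Metropolis moves plus occasional winding moves.

    zeta    : list of P + 1 path values; zeta[0] and zeta[P] stay fixed
    k       : current winding number
    action  : callable action(zeta, k) returning the real action S_0
    dzeta   : maximal step size for a local move
    Returns (new_k, local_acceptance_rate); zeta is updated in place.
    """
    P = len(zeta) - 1
    s_old = action(zeta, k)
    accepted = 0
    for step in range(1, P):
        j = rng.randrange(1, P)                    # time slice in {1, ..., P-1}
        old_value = zeta[j]
        zeta[j] = old_value + rng.uniform(-dzeta, dzeta)
        s_new = action(zeta, k)
        if rng.random() < math.exp(min(0.0, s_old - s_new)):
            s_old = s_new                          # accept the local move
            accepted += 1
        else:
            zeta[j] = old_value                    # reject and restore
        if step % k_every == 0:                    # winding move k -> k +/- 1
            k_new = k + rng.choice((-1, 1))
            s_new = action(zeta, k_new)
            if rng.random() < math.exp(min(0.0, s_old - s_new)):
                k, s_old = k_new, s_new
    return k, accepted / (P - 1)
```

In practice `dzeta` would be tuned until the returned acceptance rate is close to the 30% mentioned above.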

Average Sign

At high temperatures and for large values of $g$ the sign problem is suppressed since the main contribution to the correlation function in these regimes comes from configurations with winding number $k = 0$ for which the sign factor is given by $\langle \cos(2\pi k n_g) \rangle = 1$. These findings are summarized in fig. 6.4 that displays the average sign as a function of dimensionless gate voltage $n_g$ for different values of the inverse temperature $\beta E_C$ and the parallel conductance $g$. The sign problem is worst for $n_g = 0.5$ (corresponding to the maximum of the Coulomb oscillations) and does not affect at all the calculations at $n_g = 0$ (corresponding to the Coulomb oscillation minimum). The sign problem represents a limitation of the examined Monte Carlo method based on the phase representation at low temperatures and small parallel conductances.

[Figure 6.4, panels a) and b): $\langle \cos(2\pi k n_g) \rangle$ versus $n_g$; panel a) curves for $\beta E_C = 2.5, 3.0, 5.0, 10.0, 21.0$; panel b) curves for $g = 2.0, 3.0, 4.0, 6.0, 7.0, 10.0, 15.0$.]

Figure 6.4: Average sign $\langle \cos(2\pi k n_g) \rangle$ as a function of dimensionless gate voltage $n_g$. Subfigure a) shows results for selected values of the inverse temperature $\beta E_C$ at $g = 4.75$ while subfigure b) displays results for different values of the dimensionless parallel conductance $g \in \{2.0, \ldots, 15.0\}$ for $\beta E_C = 20.0$.

6.4.3 Results for the Cosine Correlation Function

We close this section about the Monte Carlo simulation for the SET with the presentation of the results for the cosine correlation function $A(\tau)$. Though the imaginary–time correlation function can hardly be interpreted directly, it is interesting to show how the values and their statistical error depend on the physical control parameters. Furthermore we can identify those parameter regimes for which the Monte Carlo is most accurate and those for which the statistical errors are more pronounced.

Cosine Correlation Function at High Conductance (g = 4.75)

Fig. 6.5 shows the imaginary–time correlation function at a parallel conductance of $g = 4.75$ and dimensionless gate voltage $n_g = 0$ for the whole range of inverse temperatures from the "high" temperature regime $\beta E_C = 0.5$ down to $\beta E_C = 21$ where the thermal energy is small compared to the charging energy $E_C$. We can observe that the amplitude of $A(\tau)$ depends strongly on the temperature. The statistical error of the data on the other hand is not changing appreciably for the gate voltage $n_g = 0$.

[Figure 6.5: plot of $A(\tau)$ versus $\tau/(\beta E_C)$; curves for $\beta E_C = 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 10.0, 21.0$.]

Figure 6.5: Cosine correlation function $A(\tau)$ for a highly conducting SET with $g = 4.75$ for the whole range of examined temperatures and dimensionless gate voltage $n_g = 0$.

Cosine Correlation Function at Low Temperature (βEC = 20.0)

Since the high temperature regime in which the system behaves more classically will be less crucial for the MC simulation, we will take a closer look at the result for the cosine correlation function at lower temperatures in the Coulomb blockade regime. Fig. 6.6 shows the dependence of $A(\tau)$ on the gate voltage $n_g$ (for $g = 4.75$) and on the tunnel strength parameter $g$ (for $n_g = 0.2$). The statistical error of the correlation function shows a clear dependence on the gate voltage $n_g$. While it is generally small at $n_g = 0.0$ it becomes more pronounced with increasing gate voltage until it reaches its maximum at $n_g = 0.5$. This behavior is a consequence of the sign problem due to the factor $\langle \cos(2\pi k n_g) \rangle$ in the denominator of eq. (6.79). The dependence of the correlation function on the parallel conductance $g$ displayed in fig. 6.6 b) shows some similarity to the temperature dependence in as far as the amplitude (at the minimum) of $A(\tau)$ increases monotonically with $g$ as well as with temperature. Though the direct interpretation of the imaginary–time data is not very reliable we would like to mention that this behavior can be expected since a stronger coupling of the island to the leads, like an increase of temperature, results in a smearing of the Coulomb blockade effect. The statistical error shows no strong dependence on the dimensionless conductance and hence we can conclude that the limitations of the Monte Carlo are mainly due to the sign problem for $n_g \neq 0$.

[Figure 6.6, panels a) and b): $A(\tau)$ versus $\tau/(\beta E_C)$; panel a) curves for $n_g = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5$; panel b) curves for $g = 2.0, 3.0, 4.0, 6.0, 7.0, 10.0, 15.0$.]

Figure 6.6: Quantum Monte Carlo results for the cosine correlation function at low temperature $\beta E_C = 20.0$ in the Coulomb blockade regime. Subfigure a) shows results for different values of the gate voltage $n_g \in \{0.0, \ldots, 0.5\}$ at $g = 4.75$ while subfigure b) displays results for different values of the dimensionless parallel conductance $g \in \{2.0, \ldots, 15.0\}$ for $n_g = 0.2$.

6.5 Results for the Conductance

Having obtained results for the imaginary–time (cosine) correlation function A(τ) we still have to solve the inverse problem of determining the conductance G from A(τ). In this section we start from the general linear response relation (5.25) between the conductance and the spectrum of the current autocorrelation function and formulate the inverse problem connecting the quantity G to the result A(τ) of the Monte Carlo simulation. After giving some details about the implementation of the SVD method for the solution of the inverse problem, we show results for the conductance as a function of gate voltage and temperature in the regime of strong tunneling in comparison with the experimental findings of Wallisser et al. [2]. We then discuss in more detail the temperature dependence of the minimal and maximal value of the conductance and show data for the dependence of the minimal and maximal conductance on the tunneling strength combining results from different experiments [2, 18, 91] and theoretical approaches.

6.5.1 Inverse Problem for the Conductance

Formulation of the Inverse Problem

Using linear response theory we expressed in eq. (5.25) the conductance $G$ in terms of the spectrum of the current autocorrelation function $C_I(t)$ which is given by (the real–time analogue of) eq. (6.62) as the product of the electron–hole pair correlation function $\alpha(t)$ and the cosine correlation function $A(t)$. Since the Fourier transform of a product is a convolution we get

$$
\begin{aligned}
G &= \lim_{\omega\to 0} \frac{\beta E_C}{2}\, C_I(\omega) = \lim_{\omega\to 0} \frac{\beta E_C}{2}\, 4\pi G_{cl} \int_{-\infty}^{\infty} dt\; \alpha(t)\, A(t)\, e^{i\omega t} \\
&= \lim_{\omega\to 0} \beta E_C\, G_{cl} \int_{-\infty}^{\infty} d\omega'\; \alpha(\omega - \omega')\, A(\omega') = \beta E_C\, G_{cl} \int_{-\infty}^{\infty} d\omega\; \alpha(-\omega)\, A(\omega).
\end{aligned}
\tag{6.80}
$$

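As we could calculate $\alpha(\tau)$ explicitly it can be analytically continued to $\alpha(t)$ by the simple substitution $\tau = it$. The spectral function can be determined from a Fourier transform

$$
\begin{aligned}
\alpha(\omega) &= -\int_{-\infty}^{\infty} dt\; \frac{e^{i\omega t}}{4(\beta E_C)^2 \sinh^2\left( \frac{\pi t}{\beta E_C} \right)} = \frac{2\pi i}{-4(\beta E_C)^2} \sum_{n=0}^{\infty} \mathrm{Res}\left[ \frac{e^{i\omega z}}{\sinh^2\left( \frac{\pi z}{\beta E_C} \right)},\; z_n = i\beta E_C n \right] \\
&= \frac{\omega}{2\pi}\, \frac{1}{1 - e^{-\beta E_C \omega}} .
\end{aligned}
\tag{6.81}
$$

The spectrum $A(\omega)$ of the cosine correlation function $A(\tau)$ on the other hand has to be determined numerically as the solution of the inverse problem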

$$
A(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega\; e^{-\tau\omega} A(\omega) = \frac{1}{2\pi} \int_0^{\infty} d\omega \left[ e^{-\tau\omega} + e^{-(\beta E_C - \tau)\omega} \right] A(\omega), \tag{6.82}
$$

where we have used for the second equality the detailed balance relation (5.5) for the spectral function (with energies in units of $E_C$). For the numerical evaluation of eqs. (6.80) and (6.82) it is more suitable to use the symmetric spectral function

$$
A^s(\omega) \equiv \frac{1 - e^{-\beta E_C \omega}}{\omega}\, A(\omega) \quad \text{for which} \quad A^s(-\omega) = A^s(\omega). \tag{6.83}
$$

In terms of $A^s(\omega)$ we can summarize the set of equations for the calculation of the conductance from the cosine correlation function as

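$$
G = \frac{\beta E_C\, G_{cl}}{2\pi} \int_0^{\infty} d\omega\; \frac{\omega^2}{\cosh(\beta E_C \omega) - 1}\, A^s(\omega), \tag{6.84}
$$

$$
A(\tau) = (K A^s)(\tau) = \int_0^{\infty} d\omega\; \frac{\omega \cosh\left( \left[ \frac{\beta E_C}{2} - \tau \right] \omega \right)}{2\pi \sinh\left( \frac{\beta E_C}{2} \omega \right)}\, A^s(\omega). \tag{6.85}
$$

Discretization of the Integral Equation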

To solve the inverse problem (6.85) for the symmetric spectral function we employed the SVD algorithm that was described in detail in section 5.3. The integral equation was discretized resulting in a linear equation

$$
A(\tau_i) = \sum_{j=1}^{N_\omega} K_{ij}\, A^s(\omega_j), \quad \text{for } i = 1, \ldots, N_\tau \tag{6.86}
$$

$$
K_{ij} = \frac{\omega_j \cosh\left( \left[ \frac{1}{2}\beta E_C - \tau_i \right] \omega_j \right)}{2\pi \sinh\left( \frac{1}{2}\beta E_C\, \omega_j \right)}\, \Delta\omega \tag{6.87}
$$

of dimension $N_\tau \times N_\omega$ where the number $N_\tau$ of time points is given by the Trotter number of the Monte Carlo simulation and the number of frequency points was chosen as $N_\omega = 250$. The integration range in eq. (6.85) was determined by a systematic variation of the upper cutoff frequency $\omega_{max}$.
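
The discretized kernel (6.87) and the conductance integral (6.84) are simple to set up numerically. The sketch below is an illustration under stated assumptions: it uses a midpoint frequency grid on $[0, \omega_{max}]$ and a time grid $\tau_i = i\,\beta E_C / N_\tau$ (the grids actually used are not specified here), rewrites the hyperbolic functions in terms of decaying exponentials to avoid overflow, and measures energies in units of $E_C$ so that `beta` stands for $\beta E_C$. All function names are hypothetical.

```python
import math

def kernel(tau, omega, beta):
    """Kernel of eq. (6.85): omega*cosh((beta/2 - tau)*omega) / (2*pi*sinh(beta*omega/2)).

    Rewritten as omega*(e^{-tau*omega} + e^{-(beta-tau)*omega}) / (2*pi*(1 - e^{-beta*omega})).
    """
    if omega == 0.0:
        return 1.0 / (math.pi * beta)          # limit omega -> 0
    numerator = math.exp(-tau * omega) + math.exp(-(beta - tau) * omega)
    denominator = -math.expm1(-beta * omega)    # 1 - exp(-beta*omega), accurate for small omega
    return omega * numerator / (2.0 * math.pi * denominator)

def kernel_matrix(beta, n_tau, n_omega, omega_max):
    """Return (taus, omegas, K) with K[i][j] as in eq. (6.87); midpoint grid is an assumption."""
    d_omega = omega_max / n_omega
    taus = [beta * i / n_tau for i in range(1, n_tau + 1)]
    omegas = [(j - 0.5) * d_omega for j in range(1, n_omega + 1)]
    K = [[kernel(t, w, beta) * d_omega for w in omegas] for t in taus]
    return taus, omegas, K

def conductance(spectrum, omegas, d_omega, beta, g_cl=1.0):
    """Quadrature of eq. (6.84), using cosh(x) - 1 = 2*sinh(x/2)^2."""
    total = 0.0
    for a_s, w in zip(spectrum, omegas):
        total += w ** 2 / (2.0 * math.sinh(0.5 * beta * w) ** 2) * a_s * d_omega
    return beta * g_cl / (2.0 * math.pi) * total
```

As a consistency check, the kernel is symmetric under $\tau \to \beta E_C - \tau$, and for a constant spectrum $A^s \equiv 1$ the integral (6.84) evaluates to $\pi G_{cl} / [3 (\beta E_C)^2]$.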

The Singular System

The singular system of the discretized integral operator for a frequency cutoff $\omega_{max} = 8.5$ is shown in figs. 6.7–6.9. The singular values decrease rapidly with increasing order showing that also eq. (6.85) belongs to the class of seriously ill–posed problems. Figs. 6.8 and 6.9 show the left and right singular vectors, respectively. The properties of these functions, like the structure of their zeros, are similar to those of the singular functions of the inverse problem discussed for the damped harmonic oscillator in section 5.5.

[Figure 6.7: plot of $\sigma_n$ (logarithmic scale) versus $n$.]

Figure 6.7: The singular values $\sigma_n$ of the integral operator $K$ up to order $n = 9$ which show a (more than) exponential decrease with increasing order.

[Figure 6.8: plot of $u_n(\tau)$ versus $\tau$ for $n = 1, \ldots, 6$.]

Figure 6.8: Left singular functions $u_n(\tau)$ corresponding to the first six singular values. The functions $u_n(\tau)$ are used for the expansion of the cosine correlation function in the SVD.

[Figure 6.9: plot of $v_n(\omega)$ versus $\omega$ for $n = 1, \ldots, 6$.]

Figure 6.9: Right singular functions $v_n(\omega)$ corresponding to the first six singular values. The SVD gives a solution for the symmetrical spectral function in the form of an expansion in the functions $v_n(\omega)$.

Examples for the Symmetric Spectral Function

The symmetric spectral function $A^s(\omega)$ is an intermediate quantity in the calculation of the conductance and is not relevant by itself, so it has not been studied in detail. As an example, fig. 6.10 shows the SVD results for the spectra $A^s(\omega)$ for g = 4.75 and ng = 0.

Figure 6.10: Symmetric spectral function $A^s(\omega)$ for parallel conductance g = 4.75 and gate voltage $n_g = 0$ for inverse temperatures $\beta E_C \in \{0.5, \dots, 21.0\}$.

At low temperature the frequency cutoff was determined as ωmax = 4.6 and up to 8 coefficients of the SVD expansion were determined. At the highest temperatures the cutoff was chosen as ωmax = 5.8 and 3 expansion coefficients were determined. Additional coefficients were obtained from the requirement of positivity of the solution. Except for the lowest temperatures, the spectrum has a simple form with a maximum at ω = 0, for which the SVD produced accurate results in our tests for the damped harmonic oscillator.
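The idea of keeping only a few expansion coefficients can be illustrated with a generic truncated SVD solve. This is a sketch under stated assumptions and not the procedure of section 5.3: the function name `truncated_svd_solve`, the toy matrix and the rule of simply keeping the first `n_keep` coefficients are illustrative, and the positivity requirement used to obtain additional coefficients is not implemented.

```python
import numpy as np

def truncated_svd_solve(K, a, n_keep):
    """Least-squares solution of K x = a restricted to the first n_keep singular triplets."""
    U, sigma, Vt = np.linalg.svd(K, full_matrices=False)
    # Expansion coefficients <u_n, a> / sigma_n for the retained orders.
    coeffs = (U[:, :n_keep].T @ a) / sigma[:n_keep]
    # x = sum_n coeffs[n] * v_n
    return Vt[:n_keep].T @ coeffs

# Toy example (hypothetical data): a diagonal "kernel" with known singular values.
K = np.diag([3.0, 2.0, 1.0])
x_true = np.array([1.0, 2.0, 3.0])
a = K @ x_true
x_full = truncated_svd_solve(K, a, n_keep=3)   # all components recovered
x_trunc = truncated_svd_solve(K, a, n_keep=2)  # smallest singular value dropped
print(x_full, x_trunc)
```

Dropping the smallest singular value removes the corresponding component of the solution entirely. For an ill–posed kernel this loss of information is the price for not amplifying the statistical noise in the data by the factor 1/σn.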

6.5.2 Coulomb Oscillations of the Conductance

From the solution As(ω) of the inverse problem (6.85) the DC conductance G of the SET is determined using eq.(6.84). With g and EC all relevant parameters of the theory can be measured independently in the experiment such that we can present a comparison between the experimental data of Wallisser et al. [2] and our theoretical calculations without fit parameters.

Experimental Parameters

The only parameters of the theory are the parallel conductance g and the charging energy EC. For the experimental setup of Wallisser et al. [2] shown in fig. 6.1 one can measure the resistance for each pair of electrodes connecting the island and is thus able to determine the resistance of the single tunnel junctions separately. From these resistances one can calculate in particular the conductance g of the four junctions connected in parallel.

Figure 6.11: Inverse conductance reduction $G/(G - G_{cl})$ as a function of temperature. Shown are the experimental data for the minimum and maximum conductance (•) and the linear fit using the high–temperature expansion eq. (6.88) with the value of g determined from direct measurements (see text).

For the comparison of the Coulomb oscillations, i.e. the gate voltage dependence of the conductance G of the SET, we concentrate on a highly conducting sample with g = 4.75, corresponding to resistances of the tunnel junctions given by [2] Rs = 20.3 kΩ, Rs′ = 16.4 kΩ, Rd = 31.7 kΩ, and Rd′ = 23.8 kΩ. To fix the charging energy EC, the only other free parameter, the high temperature data were compared with semiclassical calculations [92]. The fit with the full semiclassical results as well as the linear fit according to the high–temperature expansion [92]

$$ \frac{G}{G - G_{cl}} = \frac{3 k_B T}{E_C} + \frac{27\, g\, \zeta(3)}{2\pi^4} - \frac{2}{5} \quad \text{for } k_B T \gg \frac{g E_C}{2\pi^4} \tag{6.88} $$

shown in fig. 6.11 allows us to fix the charging energy at EC = 1.87 kB K with an error of less than 1%.
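How the slope of the high–temperature line fixes the charging energy can be sketched as follows. This is an illustration only: the data points are synthetic values generated from the right-hand side of eq. (6.88) and are not the measurements of [2], the names `intercept` and `fit_charging_energy` are hypothetical, and a one-parameter least-squares fit with the intercept fixed by g is assumed as the fitting procedure.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)

def intercept(g):
    """Temperature-independent part of the right-hand side of eq. (6.88)."""
    return 27.0 * g * ZETA3 / (2.0 * math.pi ** 4) - 2.0 / 5.0

def fit_charging_energy(temperatures, values, g):
    """One-parameter least-squares fit of the slope 3/E_C, intercept fixed by g.

    Temperatures in kelvin; the returned value is E_C / k_B in kelvin.
    """
    b = intercept(g)
    slope = (sum(t * (y - b) for t, y in zip(temperatures, values))
             / sum(t * t for t in temperatures))
    return 3.0 / slope

# Synthetic values generated from eq. (6.88) itself, NOT experimental data.
temperatures = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
values = [3.0 * t / 1.87 + intercept(4.75) for t in temperatures]
e_c_fit = fit_charging_energy(temperatures, values, g=4.75)
print(f"E_C / k_B = {e_c_fit:.3f} K")
```

Because g enters only through the intercept, an independent measurement of g leaves the slope as the single fit parameter.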

Comparison of Theory and Experiment

The experimental results were scaled by the classical conductance $G_{cl} = (23\,\mathrm{k\Omega})^{-1}$ and the gate voltage was expressed in dimensionless units using Cg = 19.0 aF as determined from the period of the Coulomb oscillations. The comparison of experiment and theory for the Coulomb oscillations is shown in fig. 6.12 for the range of inverse temperatures of the simulation. The experimental data (◦) are very well described by the Monte Carlo calculation (solid line).

Figure 6.12: Coulomb oscillations of the conductance G/Gcl as a function of ng for g = 4.75 and inverse temperatures βEC = 0.5, ..., 21.0, as measured by Wallisser et al. [2] (◦) in comparison with quantum Monte Carlo data (solid line).

Minor discrepancies at some temperatures can be attributed to the fact that the temperatures used in the simulation do not match exactly those of the experiment. The theoretical data show excellent agreement with the measurements from the high temperature regime down to the Coulomb blockade regime where well developed oscillations of the conductance can be observed. For lower temperatures than those shown in fig. 6.12 it was not possible to get converged Monte Carlo results for the whole range of gate voltages due to the sign problem for $n_g \neq 0$.
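The numerical parameters quoted above can be cross-checked with elementary arithmetic. The short sketch below is an illustration under assumptions: it takes the junction resistances quoted in the text, assumes that the classical conductance corresponds to the two source junctions in parallel connected in series with the two drain junctions in parallel, and uses standard values for h/e² and e; the variable names are hypothetical.

```python
R_K = 25812.807               # von Klitzing constant h/e^2 in ohm
E_CHARGE = 1.602176634e-19    # elementary charge in coulomb

# Junction resistances quoted in the text for the g = 4.75 sample, in ohm.
r_s, r_s2, r_d, r_d2 = 20.3e3, 16.4e3, 31.7e3, 23.8e3

# Dimensionless parallel conductance: all four junctions in parallel,
# measured in units of the conductance quantum G_K = e^2/h.
g = R_K * (1.0 / r_s + 1.0 / r_s2 + 1.0 / r_d + 1.0 / r_d2)

# Assumption: source pair in parallel, in series with drain pair in parallel.
r_classical = 1.0 / (1.0 / r_s + 1.0 / r_s2) + 1.0 / (1.0 / r_d + 1.0 / r_d2)

# Gate voltage period of the Coulomb oscillations for C_g = 19.0 aF.
gate_period = E_CHARGE / 19.0e-18

print(f"g = {g:.2f}, R_cl = {r_classical / 1e3:.1f} kOhm, "
      f"period = {gate_period * 1e3:.2f} mV")
```

Under these assumptions the resistances reproduce the quoted values of g and Gcl to within rounding.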

6.5.3 Temperature Dependence of the Conductance

To study the temperature dependence of the conductance in more detail we plot in fig. 6.13 the minimum and maximum conductance as a function of temperature. Also in this representation we can observe an excellent agreement of the numerical calculations with the experimental data. At the lowest temperature (βEC = 30) we only show theoretical data for the minimum conductance Gmin. The data point for Gmax has been omitted since the convergence of the Monte Carlo at ng = 0.5 was too slow to get accurate results as a consequence of the sign problem. As can be seen on the logarithmic plot in the inset of fig. 6.13 the data for Gmin at βEC = 30 is still in very good agreement with experiment since the sign problem does not affect the value of the conductance for ng = 0. The error of the theoretical result is smaller than the symbol size except for the lowest temperatures. It was estimated from the minimum and maximum value obtained from the SVD solution of the inverse problem for different realizations of the statistical Monte Carlo error.

Figure 6.13: Maximum and minimum linear response conductance Gmax and Gmin for g = 4.75 as a function of 1/(βEC), normalized to the high temperature conductance Gcl. The experimental data (•) are compared to the results of our Monte Carlo simulation (□), perturbation theory in g up to 2nd order [79, 80] (solid line) and semiclassical calculations [81] (dashed line). The inset shows the experimental and Monte Carlo data on a logarithmic scale for better comparison of Gmin. Error bars are only shown if they exceed the symbol size.

The comparison with the results of perturbation theory up to second order in g [79, 80] shows clearly that the Monte Carlo method is able to give an accurate description for parallel conductance g = 4.75 well outside the perturbative regime. Fig. 6.13 also demonstrates that the Monte Carlo method extends the regime in which semiclassical calculations [81] are applicable down to low temperatures where Coulomb blockade oscillations of the conductance are observed.

6.5.4 Dependence on the Tunneling Strength

We close our discussion of the metallic single electron transistor with the presentation of the dependence of the minimum and maximum conductance on the parallel conductance g in fig. 6.14. We compare our Monte Carlo data for inverse temperature βEC = 20 (corresponding to the Coulomb blockade regime) with experimental data of Wallisser et al. [2] and Joyez et al. [18] and with results of perturbation theory in second [79, 80] and third order [93] in the parallel conductance g. Also included are further Monte Carlo data obtained by Göppert et al. [16, 89]. Since the parallel conductance g is fixed by the thickness of the oxide film that forms the tunnel barriers, each experimental point corresponds to a different setup. We also include the data of Joyez et al. who used a layout with only one source and drain electrode which does not allow a direct measurement of the parallel conductance. Instead they estimated the tunneling strength parameter g using the assumption of a symmetric setup, i.e. by $g = 4 G_{ser}/G_K$ as four times the series conductance divided by the conductance quantum $G_K = e^2/h$. Also the data of Wallisser et al. for g = 0.8, g = 1.39, g = 5.38, and g = 5.98 have been obtained with the two terminal setup using the same assumption. At least for the lower values of g the given numerical values could be corroborated by a fit with results from perturbation theory [2]. In view of these uncertainties of the experimental data (in particular for g = 10.0) we find again a very good agreement with the Monte Carlo results. The comparison with the data from perturbation theory in second and third order shows that the range of validity of these approaches is limited to $g \lesssim 3$ for inverse temperature βEC = 20.

Figure 6.14: Maximum and minimum conductance G/Gcl for βEC = 20 as a function of the dimensionless parallel conductance g. Shown in black are the experimental data of Wallisser et al. [2] (•) and Joyez et al. [18] (■). The Monte Carlo data (□) also include the data of Göppert et al. [16, 89] (◦) for g = 2.5 and g = 10.0. For g ≤ 5 we show results of perturbation theory in second order [79, 80] (♦) and third order [93] (△).

Chapter 7

Semiconductor Quantum Dots

Motivated by the excellent agreement between experiments and theoretical results for the conductance of the metallic single electron transistor we examine in this chapter to what extent the methods used for the SET can be generalized to electron transport through semiconductor quantum dots. For the detailed theoretical description we focus on vertical quantum dots as shown in fig. 7.1.

Figure 7.1: Schematic layout of a vertical quantum dot consisting of an InGaAs layer sandwiched between AlGaAs tunnel barriers and contacted from above and below (from [10]).

In section 7.1 we give a short overview of the band structure of the semiconductor materials used for the fabrication of quantum dots and in particular discuss the band structure in semiconductor heterostructures. This provides the basis for a microscopic model of electron transport. In section 7.2 we examine in detail the electrostatics of semiconductor quantum dots in the presence of metallic gates to get a realistic description of charging effects and the screening of the electron–electron interaction on the dot. The results are then summarized in a microscopic model of a quantum dot coupled to leads by tunnel barriers in section 7.3. Using the imaginary–time path integral formalism we integrate out the Fermionic fields to obtain an expression for the generating functional of the current operator in terms of an effective action for the electrostatic potential which will be presented in section 7.4. We close this chapter by a discussion of this theoretical result in comparison with the results obtained for the SET and point out directions for future work in relation to recently published theoretical work on semiconductor quantum dots [19].

7.1 Band Diagram of Semiconductor Heterostructures

The semiconductor quantum dots that have been studied experimentally are almost exclusively built up from GaAs/AlGaAs heterostructures that define a two–dimensional electron gas which is then confined further by electrostatic gates (lateral dots), lithographic techniques or a combination of both (vertical dots). To derive a model for the electronic structure and electron transport in quantum dots we start from the well–known band structure of bulk GaAs [94] or more generally $\mathrm{Al}_x\mathrm{Ga}_{1-x}\mathrm{As}$ [95] that will be summarized in subsection 7.1.1. In subsection 7.1.2 we then discuss the formation of the band profile in semiconductor heterostructures and in particular present the conduction band profile of a typical vertical quantum dot.

7.1.1 Band Structure of GaAs and AlGaAs

GaAs is a III–V semiconductor that crystallizes in the zincblende structure, i.e. as an fcc lattice with a basis consisting of a Ga atom and an As atom as shown in fig. 7.2 a). The lattice constant at room temperature is a = 5.6533 Å. The first Brillouin zone of the corresponding bcc lattice in reciprocal space and the symmetry points Γ, X and L are displayed in fig. 7.2 b).

Figure 7.2: Unit cell of the zincblende crystal structure of GaAs in real space (a) and reciprocal space (b) (from [96]).

As shown in fig. 7.3 a) the band diagram of GaAs exhibits a band gap of Eg = 1.42 eV at room temperature. The minimum of the conduction band as well as the valence band maxima are at the Γ point corresponding to $\vec{k} = 0$, i.e. GaAs is a so–called direct band gap semiconductor. The conduction electrons are characterized by an (isotropic) effective mass of $m_e = 0.063\, m_0$ where $m_0$ denotes the bare electron mass. The atomic p–orbitals give rise to three valence bands two of which are (nearly) degenerate at $\vec{k} = 0$ but have different effective masses, i.e. light holes with $m_{lh} = 0.082\, m_0$ and heavy holes with $m_{hh} = 0.51\, m_0$. The third valence band is split off by $\Delta_{so} \approx 340$ meV due to the spin–orbit interaction and characterized by an effective mass of $m_{so} = 0.15\, m_0$.


Figure 7.3: Subfigure a) shows the relevant part of the electronic band structure of GaAs (adapted from [97]). In subfigure b) the band gaps at the Γ, L and X point are plotted as a function of the fraction x of Ga atoms that have been replaced by Al atoms (from [98]).

Pure AlAs has the same crystal structure as GaAs with a lattice constant of a = 5.6611 Å which also matches very closely that of GaAs. Hence the material system $\mathrm{Al}_x\mathrm{Ga}_{1-x}\mathrm{As}$ (where x denotes the fraction of Ga atoms that have been replaced by Al) is ideally suited for the epitaxial growth of mechanically stable and clean semiconductor heterostructures. On the other hand the replacement of Ga by Al atoms has a marked effect on the electronic band structure as displayed in fig. 7.3 b) where the gap width at the symmetry points Γ, L and X is plotted as a function of the composition parameter x. With increasing fraction x of Al atoms the band gap $E_{g,\Gamma}$ grows, which can be used to create potential barriers in the conduction band of semiconductor heterostructures as will be discussed in subsection 7.1.2. For x ≥ 0.45 the conduction band minimum is no longer at the Γ point but located at the X point(s), which results in a crossover from a direct band gap to an indirect band gap semiconductor. This fact is relevant e.g. for the construction of LEDs but shall not concern us here since the Al fraction in the tunnel barriers of vertical semiconductor quantum dots is about x ≈ 0.2–0.3 [10].

7.1.2 Band Profile of a Heterostructure

The structure of the energy bands of AlxGa1−xAs discussed in 7.1.1 applies to the bulk of the semiconductor. At a surface or more generally an interface between the semiconductor and other materials (such as metals, other semiconductors or vacuum) the structure of electronic energy bands is modified. We will not concern ourselves here with details, such as interface roughness or reconstruction of the crystal structure at the surface or interface, which make quantitative calculations of interface properties very complicated. We rather summarize the general aspects of Fermi level pinning and band bending [99] which allow a qualitative understanding of the band profile of a semiconductor heterostructure from the band diagram of the bulk materials.

Figure 7.4: a) Conduction band, valence band and surface (gap) states of an (undoped) semiconductor. The surface states are filled up to the charge neutrality level µCN. b) Heterojunction between an n–doped wide–gap and a p–doped narrow–gap semiconductor after formation of induced gap states but before thermodynamic equilibrium is achieved by charge transfer across the interface.

Fermi Level Pinning and Band Bending

The usual Bloch wave functions corresponding to the band energies of the bulk semiconductor are modulated plane waves that extend across the whole semiconductor. In the presence of a surface or interface there (generally) exist in addition so–called surface or interface states which are highly localized, i.e. extend only a few atomic layers into the interior of the semiconductor(s). At a semiconductor surface, dangling bonds give rise to (2D) bands of surface states with energies within the band gap of the semiconductor as shown in fig. 7.4 a). At a metal–semiconductor interface so–called metal induced gap states (MIGS) are created by conduction band states of the metal corresponding to energies within the band gap of the semiconductor. These states tail off exponentially into the semiconductor where they are represented by evanescent modes. Analogously, for a semiconductor heterojunction, induced gap states (IGS) are created when the energy band(s) of one semiconductor overlap with the band gap of the other as shown in fig. 7.4 b). In the following we will concentrate on the latter case of a semiconductor heterostructure, for definiteness, though all three situations mentioned above can be treated in an analogous manner [99].

More specifically we will discuss the example of a heterojunction between an n–doped semiconductor 1 with a (relatively) wide band gap (e.g. Al(Ga)As) and a p–doped semiconductor 2 with a narrower band gap (e.g. pure GaAs or InGaAs). As the total number of states in semiconductor 1 is fixed the interface induced gap states are derived from valence and conduction band states. Interface states with valence band character can be regarded as donor–like, i.e. they are neutral when occupied and positively charged when empty, while interface states with predominantly conduction band character act like acceptors, i.e. are neutral when empty and negatively charged when occupied. Hence there exists a so–called charge neutrality level µCN up to which the band of interface states is filled for a neutral interface. Due to the strong localization of the interface states, already small deviations of the chemical potential from µCN at the surface lead to strong charging effects (and a corresponding shift of the band structure) such that the Fermi level is effectively pinned to the charge neutrality level of the interface states at the surface.

Figure 7.5: Gedanken experiment for the construction of the band profile at the heterojunction between an n–doped semiconductor with a wide band gap and a p–doped semiconductor with a smaller band gap that leads to a triangular quantum well in the conduction band. a) Formation of interface induced gap states inside the band gap of semiconductor 1. b) Alignment of the charge neutrality levels at the interface by charge transfer from semiconductor 2 into the IGS. c) Band bending and alignment of the bulk Fermi energies of the two semiconductors by charge transfer across the interface and the formation of depletion layers on both sides of the interface.

When the semiconductors 1 and 2 are brought together to form a heterojunction in a gedanken experiment as shown in fig. 7.5 the first step after the formation of induced gap states is the alignment of the charge neutrality levels of the two semiconductors due to the interface dipole that is created by charge transfer to (or from) the IGS. In the case of doped semiconductors, as considered here, the bulk chemical potentials are aligned in a second step by recombination of electrons from donor impurities in semiconductor 1 with acceptor impurities in semiconductor 2 which results in the formation of depletion layers and a bending of the energy bands as displayed in fig. 7.5 that can be determined by a solution of the Poisson equation. In a simple model, known as the Schottky model, one assumes that within the depletion length $z_{dep}$ all donor (acceptor) impurities are ionized. This allows a description of the band bending in terms of the band offset $\Delta E = E(z = 0) - E(z = \infty)$ (determined from the alignment of the chemical potentials) and the density of dopants $N_D$ [99], i.e.

$$ \Delta E(z) = -\frac{e N_D}{2\epsilon_0 \epsilon}\,(z - z_{dep})^2 \quad \text{with} \quad z_{dep} = \sqrt{\frac{2\epsilon_0 \epsilon\, \Delta E}{e N_D}}. \tag{7.1} $$
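To get a feeling for the length scale set by eq. (7.1), the depletion length can be evaluated for typical numbers. The sketch below is an illustration only: the band offset of 0.3 V, the dopant density of 10^18 cm^-3 and the relative dielectric constant 12.9 are hypothetical example values that are not taken from the text, the band offset is expressed as a potential difference (energy per elementary charge), and only the magnitude of the band bending is computed.

```python
import math

EPS0 = 8.8541878128e-12       # vacuum permittivity in F/m
E_CHARGE = 1.602176634e-19    # elementary charge in C

def depletion_length(offset_volts, dopant_density, eps_r):
    """Depletion length z_dep of eq. (7.1) in metres (dopant density in m^-3)."""
    return math.sqrt(2.0 * EPS0 * eps_r * offset_volts / (E_CHARGE * dopant_density))

def band_bending_magnitude(z, offset_volts, dopant_density, eps_r):
    """Magnitude of the parabolic band bending in volts; zero beyond z_dep."""
    z_dep = depletion_length(offset_volts, dopant_density, eps_r)
    if z >= z_dep:
        return 0.0
    return E_CHARGE * dopant_density / (2.0 * EPS0 * eps_r) * (z - z_dep) ** 2

# Hypothetical example values (not from the text).
z_dep = depletion_length(offset_volts=0.3, dopant_density=1e24, eps_r=12.9)
print(f"z_dep = {z_dep * 1e9:.1f} nm")
```

For these example values the depletion layer extends over roughly 20 nm, and the parabola reaches the full band offset at the interface z = 0 by construction.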

The conduction band profile shown in fig. 7.5 for the heterojunction between n–doped AlGaAs and p–doped GaAs develops a triangular well in the GaAs right at the interface in which (for careful choice of the Al–fraction and the densities of dopants) a two–dimensional electron gas (2DEG) is formed. Thus, such a semiconductor heterojunction serves as the basis for the fabrication of lateral quantum dots in which the 2DEG is further confined in the xy–plane by electrostatic gates on top of the semiconductor heterostructure as shown in fig. 1.2.

Band Profile of a Vertical Quantum Dot

For lateral quantum dots the band profile of an AlGaAs/GaAs heterostructure is used only to form a 2DEG while the confinement of the quantum dot electrons and in particular the tunnel barriers that separate the quantum dot from source and drain contacts are defined by electrostatic gate electrodes. The possibility to control the tunnel barriers by an external gate voltage is especially useful for experiments on the Kondo effect in quantum dots. On the other hand the tunnel barriers and the confinement potential can hardly be tuned independently for such small structures. To have a better control of the confinement potential (at the cost of fixed tunnel barriers) it is more suitable to use vertical quantum dots in which the conduction band electrons are confined in z–direction by a quantum well potential as shown in fig. 7.6 created by a sequence of semiconductor heterojunctions. The leads are formed by GaAs while tunnel barriers are created by layers of $\mathrm{Al}_{0.22}\mathrm{Ga}_{0.78}\mathrm{As}$ which has a wider band gap. The potential in between the barriers is lowered with respect to the leads by the addition of a small fraction (x = 0.05) of In to the dot layer, exploiting that $\mathrm{In}_{0.05}\mathrm{Ga}_{0.95}\mathrm{As}$ has an even smaller band gap than pure GaAs. In this way a vertical confinement of the quantum dot electrons is created that is modified only slightly by the application of voltages to further electrostatic gate electrodes. The lateral confinement in the xy–plane is created by etching of a mesa with a diameter of a few hundred nanometers as shown in fig. 7.1. To achieve further control of the lateral confinement potential one forms a metallic side gate around the mesa to which a gate voltage can be applied.

As indicated in fig. 7.6 the Fermi level lies slightly above the conduction band edge. On the one hand this implies that we need only consider the conduction electrons and the corresponding potential profile in the theoretical description and can ignore the valence bands which lie far below the Fermi level. It also justifies the use of an effective mass description for the electrons which is quite accurate for energies close to the conduction band minimum. On the other hand it is already clear from this picture that the number of conduction channels will be small in contrast to the situation encountered for the metallic single electron system.

Figure 7.6: Self–consistent calculation of the conduction band profile of the GaAs/AlGaAs heterostructure used by Kouwenhoven et al. (from [10]).

7.2 Electrostatics of Gated Quantum Dots

In the preceding section we have studied the energy dispersion of free quasiparticles in a semiconductor heterostructure. To develop a model for the theoretical description we have to consider further the confinement of the electrons in the plane of the dot and their mutual interaction.

7.2.1 The Constant Interaction Model and its Limitations

Theoretical calculations for (few–electron) quantum dots usually assume a harmonic confinement and describe the electron–electron interaction by a constant charging energy $E_C = e^2/(2C)$ per electron on the dot. The influence of gate electrodes is modeled as a homogeneous shift of the quantum dot potential that merely controls the number of electrons occupying the dot. These assumptions constitute the Constant Interaction (CI) model [10, 100, 101] that can be summarized as

$$ H = \sum_\alpha \epsilon_\alpha\, c_\alpha^\dagger c_\alpha + E_C\, (n - n_g)^2 \tag{7.2} $$

where $n = \sum_\alpha c_\alpha^\dagger c_\alpha$ is the total number of electrons on the dot and $\epsilon_\alpha$ are the single particle energies for the harmonic confinement. The gate voltage $U_g$ induces $n_g = (C_g U_g)/e$ charges on the dot. The CI model has been widely used to give a qualitative description of transport through semiconductor quantum dots. It has been rather successful since it allows an intuitive interpretation in terms of the single particle states. Nonetheless the model has its limitations as pointed out e.g. in [10] and a quantitative comparison between theory and experiment requires a more realistic description. Therefore we rather follow the work of Kumar et al. [102] and Hallam et al. [103] that takes into account the full electrostatics of a (geometrically idealized) quantum dot. The approach outlined in the following addresses these limitations of the CI model:

1. Deviations from harmonic confinement (for larger dots)

2. Dependence of the confinement potential on the gate voltages

3. Realistic electron-electron interaction beyond the constant charging energy

4. Screening of the Coulomb interaction by leads and metallic gates
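To make the CI model of eq. (7.2) concrete before turning to its refinements, the following sketch evaluates its ground state for a given set of single particle levels. It is an illustration under assumptions: the function names are hypothetical, the level values (two spin–degenerate levels in units of EC) are chosen arbitrarily for the example, and the lowest levels are simply filled one electron at a time.

```python
def ci_energy(n, levels, e_c, n_g):
    """Ground-state energy of eq. (7.2) for n electrons filling the lowest levels."""
    return sum(sorted(levels)[:n]) + e_c * (n - n_g) ** 2

def ci_ground_state_number(levels, e_c, n_g):
    """Electron number that minimizes the CI energy for the gate charge n_g."""
    return min(range(len(levels) + 1), key=lambda n: ci_energy(n, levels, e_c, n_g))

def addition_energy(n, levels, e_c):
    """Second difference E(n+1) - 2 E(n) + E(n-1); it does not depend on n_g."""
    return (ci_energy(n + 1, levels, e_c, 0.0)
            - 2.0 * ci_energy(n, levels, e_c, 0.0)
            + ci_energy(n - 1, levels, e_c, 0.0))

# Hypothetical spectrum: two spin-degenerate levels, energies in units of E_C.
levels = [0.0, 0.0, 1.0, 1.0]
print(ci_ground_state_number(levels, e_c=1.0, n_g=2.6))
print(addition_energy(1, levels, e_c=1.0), addition_energy(2, levels, e_c=1.0))
```

Within a degenerate level the addition energy is the bare charging contribution 2EC, whereas filling a new level costs the level spacing in addition. This is the intuitive single particle picture referred to above.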

7.2.2 Electrostatic Energy and Work of the Power Sources

We consider a quantum dot containing N electrons with coordinates $\vec{r}_i$ that give rise to a charge density $\rho_e(\vec{r}) = -e \sum_{i=1}^{N} \delta(\vec{r} - \vec{r}_i)$. In addition to these mobile electrons there exist fixed charges created by ionized impurities or defects with a density $\rho_{ion}(\vec{r})$. Besides these sources of the electrostatic field we also have to consider boundary conditions given by the voltages applied to the source and drain electrodes as well as other electrostatic gates. For our general calculation we will consider $N_s$ such electrodes whose surfaces shall be denoted by $S_n$, $n = 1,\dots,N_s$. The corresponding boundary values for the electrostatic potential are given by the voltages $U_n$. The surfaces $S_n$ are part of a surface S that completely surrounds the quantum dot.

The Electrostatic Potential

The electrostatic potential φ(~r) in the quantum dot is determined by the Poisson equation

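$$ \nabla \cdot \left[\epsilon_0 \epsilon(\vec{r})\, \nabla \phi(\vec{r})\right] = -\left[\rho_e(\vec{r}) + \rho_{ion}(\vec{r})\right] \tag{7.3} $$

and the boundary conditions

$$ \phi(\vec{r}) = U_n \quad \text{for } \vec{r} \in S_n,\; n = 1,\dots,N_s \tag{7.4} $$

$$ \phi(\vec{r}) = 0 \quad \text{for } |\vec{r}| \to \infty. \tag{7.5} $$

As summarized in Appendix E.1 the formal solution of this Dirichlet boundary value problem can be specified with the help of the Green's function for the given geometry that is defined by

$$ \nabla \cdot \left[\epsilon_0 \epsilon(\vec{r})\, \nabla G(\vec{r}, \vec{r}\,')\right] = -\delta(\vec{r} - \vec{r}\,') \tag{7.6} $$

$$ G(\vec{r}, \vec{r}\,') = 0 \quad \text{for } \vec{r} \in S. \tag{7.7} $$

Once the Green's function is known, the general solution of eqs. (7.3) and (7.4) can be given as

$$ \phi(\vec{r}) = \int_V d\vec{r}\,'\, G(\vec{r}\,', \vec{r}) \left[\rho_e(\vec{r}\,') + \rho_{ion}(\vec{r}\,')\right] - \sum_{n=1}^{N_s} \alpha_n(\vec{r})\, U_n \tag{7.8} $$

where

$$ \alpha_n(\vec{r}) = \int_{S_n} d\vec{S}\,' \cdot \epsilon_0 \epsilon(\vec{r}\,')\, \nabla' G(\vec{r}\,', \vec{r}) \tag{7.9} $$

is the surface charge on the n–th electrode induced by a unit charge at $\vec{r}$.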

Electrostatic Energy of the Charge Distribution

Using this expression for the electrostatic potential we can calculate the electrostatic energy of the charge distribution

$$ W_e = \frac{1}{2} \int d\vec{r}\, \phi(\vec{r}) \left[\rho_e(\vec{r}) + \rho_{ion}(\vec{r})\right] + \frac{1}{2} \sum_{n=1}^{N_s} (Q_n + q_n)\, U_n \tag{7.10} $$

where $q_n$ is the induced charge on electrode n whereas $Q_n$ is the charge on the n–th electrode for vanishing electronic and ionic charge densities that can be expressed by the voltages $U_n$ and the capacitances $C_{n,n'}$ between the different electrodes as $Q_n = \sum_{n'} C_{n,n'} U_{n'}$. Inserting eqs. (7.8) and (7.9) into this expression for the electrostatic energy one gets

$$ \begin{aligned} W_e ={}& \frac{1}{2} \int_V d\vec{r} \int_V d\vec{r}\,'\, \rho_e(\vec{r})\, G(\vec{r}, \vec{r}\,')\, \rho_e(\vec{r}\,') + \int_V d\vec{r} \int_V d\vec{r}\,'\, \rho_e(\vec{r})\, G(\vec{r}, \vec{r}\,')\, \rho_{ion}(\vec{r}\,') \\ &+ \frac{1}{2} \int_V d\vec{r} \int_V d\vec{r}\,'\, \rho_{ion}(\vec{r})\, G(\vec{r}, \vec{r}\,')\, \rho_{ion}(\vec{r}\,') + \frac{1}{2} \sum_{n=1}^{N_s} \sum_{n'=1}^{N_s} U_n\, C_{nn'}\, U_{n'} \\ &- \frac{1}{2} \sum_{n=1}^{N_s} \left\{ \int_V d\vec{r}\, \alpha_n(\vec{r}) \left[\rho_e(\vec{r}) + \rho_{ion}(\vec{r})\right] \right\} U_n + \frac{1}{2} \sum_{n=1}^{N_s} q_n U_n. \end{aligned} \tag{7.11} $$

The two terms in the first line of (7.11) are the (screened) electron–electron interaction and the interaction between electrons and ionized defects. The second line gives the constant contributions of the fixed ions and the electrodes to the electrostatic energy that will be omitted in the following since they are irrelevant for the dynamics of the electrons. The third line exactly cancels out since $\{\dots\}$ is equal to the total induced charge $q_n$ on the n–th electrode. This is no surprise since the electrostatic energy of the image charges $q_n$ is already contained in the screened Coulomb interactions in the first line.
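For point–like electrons the two terms in the first line of eq. (7.11) can be evaluated explicitly. The following short calculation is a sketch added for orientation; it assumes only the electronic charge density $\rho_e(\vec{r}) = -e \sum_i \delta(\vec{r} - \vec{r}_i)$ introduced above and the symmetry $G(\vec{r}, \vec{r}\,') = G(\vec{r}\,', \vec{r})$ of the Dirichlet Green's function:

```latex
\begin{align}
\frac{1}{2}\int_V d\vec{r} \int_V d\vec{r}\,'\,
  \rho_e(\vec{r})\, G(\vec{r},\vec{r}\,')\, \rho_e(\vec{r}\,')
  &= \frac{e^2}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} G(\vec{r}_i, \vec{r}_j), \\
\int_V d\vec{r} \int_V d\vec{r}\,'\,
  \rho_e(\vec{r})\, G(\vec{r},\vec{r}\,')\, \rho_{ion}(\vec{r}\,')
  &= -e \sum_{i=1}^{N} \int_V d\vec{r}\,'\, G(\vec{r}\,', \vec{r}_i)\, \rho_{ion}(\vec{r}\,') .
\end{align}
```

The terms with i ≠ j in the first expression have the form of a two–body interaction with the screened potential $e^2 G$, while the second expression acts as a one–body potential for each electron.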

Work of the Power Sources

The movement of the electrons in the quantum dot also leads to a rearrangement of the image charges between the electrodes. The required energy has to be supplied by the power sources. The work of the power source related to the movement of the i–th electron by d~ri can be calculated as

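$$ dW_{ps} = \sum_{n=1}^{N_s} U_n\, dq_n = \sum_{n=1}^{N_s} U_n \left\{ \int_V d\vec{r}\, \alpha_n(\vec{r})\, \nabla_{\vec{r}_i} \rho_e(\vec{r}) \right\} \cdot d\vec{r}_i \tag{7.12} $$

$$ \phantom{dW_{ps}} = -\nabla_{\vec{r}_i} \left\{ -\sum_{n=1}^{N_s} U_n \int_V d\vec{r}\, \alpha_n(\vec{r})\, \rho_e(\vec{r}) \right\} \cdot d\vec{r}_i \equiv \vec{F}_{ps} \cdot d\vec{r}_i. \tag{7.13} $$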

Obviously the force F~ps exerted by the power sources can be derived from the potential

$$ W_{ps} = -\sum_{n=1}^{N_s} U_n \int_V d\vec{r}\, \alpha_n(\vec{r})\, \rho_e(\vec{r}). \tag{7.14} $$

7.2.3 Green's Function for a Vertical Quantum Dot

The discussion of the electrostatics of a quantum dot in the preceding subsection was completely general since all geometric details, like the shape and position of the electrodes, enter only via the electrostatic Green's function. To close this section we will calculate the Green's function for a cylindrical vertical quantum dot with radius a and height L that is shown schematically in fig. 7.7. The top and the bottom are formed by the lead electrodes that are grounded since we will calculate the conductance in linear response, i.e. for vanishing source–drain voltage. The shell of the cylinder forms the gate electrode to which a finite voltage $U_g$ is applied. We will further assume that the dielectric constant within the dot is homogeneous, i.e. $\epsilon(\vec{r}) \equiv \epsilon$.

Figure 7.7: Schematic picture of the idealized vertical quantum dot (a cylinder of height L between source and drain, surrounded by the side gate at voltage Ug). As sources for the potential $\phi(\vec{r})$ inside the structure we consider only the electrons confined between the tunnel barriers, i.e. the influence of ionized impurities and conduction electrons outside the quantum dot will be neglected.

To calculate the Green's function for this geometry we have to solve the boundary value problem

$$ \epsilon_0 \epsilon\, \Delta G(\rho\varphi z, \rho'\varphi' z') = -\frac{1}{\rho}\, \delta(\rho - \rho')\, \delta(\varphi - \varphi')\, \delta(z - z') \tag{7.15} $$

$$ G(\rho\varphi z, \rho'\varphi' z') = 0 \quad \text{for } z = 0,\; z = L \text{ or } \rho = a $$

for a point charge inside a grounded cylinder. As already indicated in eq. (7.15) one uses cylindrical coordinates and expands the Green's function and the δ–functions in Fourier series with respect to ϕ and z before one can finally solve the remaining radial differential equation by the modified Bessel functions $I_m$ and $K_m$. The complete derivation is presented in Appendix E.2. Here we just want to state the result for the Green's function given by

$$ \begin{aligned} G(\rho\varphi z; \rho'\varphi' z') ={}& \frac{1}{\pi \epsilon_0 \epsilon L} \sum_{n=1}^{\infty} \sum_{m=-\infty}^{\infty} e^{im(\varphi - \varphi')} \sin(k_n z) \sin(k_n z') \\ &\times \frac{I_m(k_n \rho_<)}{I_m(k_n a)} \left[ I_m(k_n a)\, K_m(k_n \rho_>) - K_m(k_n a)\, I_m(k_n \rho_>) \right] \end{aligned} \tag{7.16} $$

with $k_n = \pi n / L$, $\rho_< = \min(\rho, \rho')$ and $\rho_> = \max(\rho, \rho')$. The Green's function determines the screened Coulomb interaction of the electrons inside the dot. To calculate the confinement of the electrons by the surrounding gate we also need the partial derivative of $G(\rho\varphi z, \rho'\varphi' z')$ with respect to ρ′ evaluated at ρ′ = a that is given by

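$$\left. \partial_{\rho'} G(\rho\varphi z; \rho'\varphi' z') \right|_{\rho'=a} = -\frac{1}{\epsilon_0 \epsilon \pi a L} \sum_{n=1}^{\infty} \sum_{m=-\infty}^{\infty} e^{im(\varphi-\varphi')} \sin(k_n z) \sin(k_n z')\, \frac{I_m(k_n \rho)}{I_m(k_n a)}. \tag{7.17}$$

Inserting eq. (7.17) into eq. (7.9) one gets the function $\alpha_g(\vec r)$ that describes the confinement of the electrons by the lateral gate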

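$$\alpha_g(\rho\varphi z) = -\frac{4}{\pi} \sum_{n=0}^{\infty} \frac{\sin\left( \frac{(2n+1)\pi}{L} z \right)}{2n+1}\, \frac{I_0\left( \frac{(2n+1)\pi}{L} \rho \right)}{I_0\left( \frac{(2n+1)\pi}{L} a \right)} \tag{7.18}$$

The details of the calculation can once again be found in Appendix E.2.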
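As an illustration of how eq. (7.18) might be evaluated numerically, the following minimal sketch sums the series for $\alpha_g$ in plain Python. The function names `bessel_i0` and `alpha_gate`, the truncation `n_terms` and the sample geometry are assumptions made for this example only; $a$ is taken to be the value of $\rho$ at the gate, as in the boundary condition of eq. (7.15).

```python
import math


def bessel_i0(x, tol=1e-16):
    """Modified Bessel function I_0(x) from its power series.

    All terms are positive, so there is no cancellation; the plain series is
    adequate for the moderate arguments used in this sketch.
    """
    term = 1.0
    total = 1.0
    k = 0
    quarter_x2 = 0.25 * x * x
    while term > tol * total:
        k += 1
        term *= quarter_x2 / (k * k)
        total += term
    return total


def alpha_gate(rho, z, a, L, n_terms=20):
    """Partial sum of eq. (7.18) with n_terms odd harmonics (hypothetical cutoff)."""
    total = 0.0
    for n in range(n_terms):
        k = (2 * n + 1) * math.pi / L
        total += (math.sin(k * z) / (2 * n + 1)) * bessel_i0(k * rho) / bessel_i0(k * a)
    return -4.0 / math.pi * total


if __name__ == "__main__":
    # Example geometry in arbitrary units (assumed values, not from the thesis).
    for rho in (0.0, 0.5, 0.9):
        print(rho, alpha_gate(rho, 0.5, a=1.0, L=1.0))
```

Inside the dot the terms decay roughly like $\exp[-(2n+1)\pi(a-\rho)/L]$, so a few terms suffice there; close to the gate the series converges only slowly, which is why the cutoff is left as a parameter.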

7.3 Theoretical Model

In this section we want to summarize the results of sections 7.1 and 7.2 to formulate the Hamil- tonian for the quantum dot. We will also include tunneling between the dot and the leads in our description and write down the corresponding action of the coherent state path integral. Finally we will add source terms that define the generating functional for the current autocorrelation function.

7.3.1 Model Hamiltonian

From the results of sections 7.1 and 7.2 we can write down the Hamiltonian of a cylindrical quantum dot containing $N$ electrons with momenta $\vec p_j$ and positions $\vec r_j$

$$H_D = \sum_{j=1}^{N} \left[ \frac{\vec p_j^{\,2}}{2m^*} + V_{ext}(\vec r_j) \right] + \frac{e^2}{2} \sum_{j=1}^{N} \sum_{j' \neq j} G(\vec r_j, \vec r_{j'}) \tag{7.19}$$

The confining potential is given by

$$V_{ext}(\vec r) = V_{db}(z) + e\, \alpha_g(\vec r)\, U_g - e \int_V d\vec r\,'\, G(\vec r\,', \vec r)\, \rho_{ion}(\vec r\,'), \tag{7.20}$$

i.e. the superposition of the conduction band profile of the heterostructure, the confinement by the gate electrode and the interaction with the fixed charged impurities. The Hamiltonian $H_D$ describes the isolated dot with a fixed number of electrons. In the experiment the dot is separated from source and drain by barriers that still provide a (weak) coupling and allow electrons to leave or enter the dot by tunneling processes. Therefore we will rewrite the Hamiltonian of the dot in the language of second quantization and add the Hamiltonians $H_L$ and $H_T$ that describe the (free) electrons of the leads and the tunnel coupling between the leads and the quantum dot

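$$H = H_D + H_L + H_T \tag{7.21}$$

with

$$H_D = \sum_{\alpha,\alpha'} \psi^\dagger_\alpha\, \epsilon_{\alpha\alpha'}\, \psi_{\alpha'} + \frac{1}{2} \sum_{\alpha\alpha',\beta\beta'} \psi^\dagger_\alpha \psi_{\alpha'}\, V_{\alpha\alpha'\beta\beta'}\, \psi^\dagger_\beta \psi_{\beta'} \tag{7.22}$$

$$H_L = \sum_{j=0,1} \sum_{\gamma,\gamma'} \varphi^\dagger_{j\gamma}\, E^j_{\gamma\gamma'}\, \varphi_{j\gamma'} \tag{7.23}$$

$$H_T = \sum_{j=0,1} \sum_{\alpha,\gamma} \left[ \varphi^\dagger_{j\gamma}\, t^j_{\gamma\alpha}\, \psi_\alpha + \psi^\dagger_\alpha\, t^{j*}_{\gamma\alpha}\, \varphi_{j\gamma} \right]. \tag{7.24}$$

Here $\psi^\dagger_\alpha$ and $\psi_\alpha$ are creation and annihilation operators for an electron in the one–particle state $|\alpha\rangle$ of the dot. Correspondingly $\varphi^\dagger_{j\gamma}$ and $\varphi_{j\gamma}$ create and destroy an electron in the state $|\gamma\rangle$ in lead $j$. Further quantities that enter the description are the energy matrix elements $\epsilon_{\alpha\alpha'}$ and $E^j_{\gamma\gamma'}$ that have to be calculated from the one–particle part of the Hamiltonian $H_D$, i.e. the kinetic energy and the confinement potential of the dot, and the Hamiltonian of free particles in the leads, respectively. The matrix $V_{\alpha\alpha'\beta\beta'}$ describes the two–body interaction between the electrons which has to be determined from the screened Coulomb interaction $G(\vec r, \vec r\,')$. The tunnel matrix elements $t^j_{\gamma\alpha}$ give the amplitude for an electron that occupies the dot state $|\alpha\rangle$ to enter the lead $j$ in the state $|\gamma\rangle$.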

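7.3.2 Action and Source Terms

The (Euclidean) action of the coherent state path integral for the partition function corresponding to the Hamiltonian (7.21) can be written as

$$S = S_D + S_L + S_T \tag{7.25}$$

with

$$S_D = \int_0^\beta d\tau \left\{ \sum_{\alpha\alpha'} \psi^*_\alpha(\tau) \left[ (\partial_\tau - \mu)\delta_{\alpha\alpha'} + \epsilon_{\alpha\alpha'} \right] \psi_{\alpha'}(\tau) + \frac{1}{2} \sum_{\alpha\alpha',\beta\beta'} \psi^*_\alpha(\tau) \psi_{\alpha'}(\tau)\, V_{\alpha\alpha'\beta\beta'}\, \psi^*_\beta(\tau) \psi_{\beta'}(\tau) \right\} \tag{7.26}$$

$$S_L = \int_0^\beta d\tau \sum_{j,\gamma\gamma'} \varphi^*_{j\gamma}(\tau) \left[ (\partial_\tau - \mu)\delta_{\gamma\gamma'} + E^j_{\gamma\gamma'} \right] \varphi_{j\gamma'}(\tau) \tag{7.27}$$

$$S_T = \int_0^\beta d\tau \sum_{\alpha,j,\gamma} \left\{ \varphi^*_{j\gamma}(\tau)\, t^j_{\gamma\alpha}\, \psi_\alpha(\tau) + \psi^*_\alpha(\tau)\, t^{j*}_{\gamma\alpha}\, \varphi_{j\gamma}(\tau) \right\}. \tag{7.28}$$

For simplicity we have denoted the coherent states of the operators $\psi_\alpha$ and $\varphi_{j\gamma}$ again by the same symbols. The partition function of the system is given by the multiple path integral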

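$$Z = \int \mathcal{D}\mu(\psi) \int \mathcal{D}\mu(\varphi)\, e^{-S[\psi,\varphi]} \tag{7.29}$$

with

$$\int \mathcal{D}\mu(\psi) \equiv \int \prod_\alpha d\psi^*_\alpha(\tau)\, d\psi_\alpha(\tau) \quad \text{and} \quad \int \mathcal{D}\mu(\varphi) \equiv \int \prod_{j,\gamma} d\varphi^*_{j\gamma}(\tau)\, d\varphi_{j\gamma}(\tau). \tag{7.30}$$

Since our aim is to calculate the current autocorrelation function we have to add further source terms to the action that allow us to express expectation values and correlation functions of the current operator as derivatives of a generating functional. As derived in Appendix D.2 the current operator $I$ can be expressed by the annihilation and creation operators of the dot and lead Fermions in the following way

$$I = \sum_{\alpha,j,\gamma} \left\{ \varphi^\dagger_{j\gamma} \left( (-1)^j i e\, w_j\, t^j_{\gamma\alpha} \right) \psi_\alpha + \psi^\dagger_\alpha \left( (-1)^j i e\, w_j\, t^j_{\gamma\alpha} \right)^* \varphi_{j\gamma} \right\} \tag{7.31}$$

that is quite analogous to the tunneling part of the Hamiltonian. Defining

$$\Lambda^j_{\gamma\alpha}(\tau) = \left[ 1 - (-1)^j i e\, w_j\, \chi(\tau) \right] t^j_{\gamma\alpha} \tag{7.32}$$

the source term

$$S_I[\chi] = -\int_0^\beta d\tau\, I(\tau)\, \chi(\tau) \tag{7.33}$$

that is added to the action of the path integral can be combined elegantly with the tunnel action $S_T$ for the definition of the generating functional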

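$$Z_I[\chi] = \int \mathcal{D}\mu(\psi) \int \mathcal{D}\mu(\varphi)\, e^{-S[\chi,\psi,\varphi]} \quad \text{with} \quad S = S_D + S_L + S_{TJ}, \tag{7.34}$$

$$S_{TJ} = \int_0^\beta d\tau \sum_{\alpha,j,\gamma} \left\{ \varphi^*_{j\gamma}(\tau)\, \Lambda^j_{\gamma\alpha}(\tau)\, \psi_\alpha(\tau) + \psi^*_\alpha(\tau)\, \Lambda^{j*}_{\gamma\alpha}(\tau)\, \varphi_{j\gamma}(\tau) \right\}, \tag{7.35}$$

where the actions $S_D$ of the quantum dot and $S_L$ of the leads remain unchanged and are given by eqs. (7.26) and (7.27).

The current autocorrelation function in imaginary time can be calculated from this generating functional simply by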

$$C_I(\tau) = \langle I(\tau) I(0) \rangle = \frac{1}{Z_I[0]} \left. \frac{\delta^2 Z_I[\chi]}{\delta\chi(\tau)\, \delta\chi(0)} \right|_{\chi \equiv 0}. \tag{7.36}$$

7.3.3 Decoupling of the Interaction

While the coherent states $\varphi_{j\gamma}(\tau)$ occur only in linear and quadratic order and thus can be integrated out exactly, the interaction term of the quantum dot action is quartic in the fields $\psi_\alpha(\tau)$ describing the dot Fermions. Since we want to integrate over the dot Fermions as well as the lead Fermions in the next section, we will rewrite the interaction term using a Hubbard–Stratonovich transformation. The quartic term of the action can be expressed as the result of a Gaussian integral over the auxiliary field $\phi$

$$\exp\left\{ -\frac{1}{2} \int_0^\beta d\tau \sum_{\alpha\alpha',\beta\beta'} \rho_{\alpha\alpha'}(\tau)\, V_{\alpha\alpha'\beta\beta'}\, \rho_{\beta\beta'}(\tau) \right\} = \int \mathcal{D}\phi(\tau)\, \exp\left\{ -\int_0^\beta d\tau \left[ \frac{1}{2} \sum_{\alpha\alpha',\beta\beta'} \phi_{\alpha\alpha'}(\tau)\, V^{-1}_{\alpha\alpha'\beta\beta'}\, \phi_{\beta\beta'}(\tau) + \sum_{\alpha\alpha'} \rho_{\alpha\alpha'}(\tau)\, \phi_{\alpha\alpha'}(\tau) \right] \right\} \tag{7.37}$$

where we have used the abbreviation $\rho_{\alpha\alpha'}(\tau) = \psi^*_\alpha(\tau) \psi_{\alpha'}(\tau)$.

The physical interpretation of the new degree of freedom introduced by the auxiliary field is straightforward. By the Hubbard–Stratonovich transformation we have replaced the direct electron–electron interaction by the linear coupling of the charges to the electrostatic potential $\phi_{\alpha\alpha'}$. In addition we get a quadratic term in $\phi_{\alpha\alpha'}$ describing the electrostatic energy contained in the field. Thus we have decoupled the electron–electron interaction on the dot at the cost of introducing one more path integral over the electrostatic potential in addition to the path integrals over coherent states.

We will conclude this section and summarize the results obtained so far by giving the expression for the generating functional of the current autocorrelation function

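$$Z_I[\chi] = \int \mathcal{D}\phi \int \mathcal{D}\mu(\psi) \int \mathcal{D}\mu(\varphi)\, e^{-S[\chi,\phi,\psi,\varphi]} \quad \text{with} \quad S = S_F + S_D + S_L + S_{TJ}, \tag{7.38}$$

$$S_F = \int_0^\beta d\tau\, \frac{1}{2} \sum_{\alpha\alpha',\beta\beta'} \phi_{\alpha\alpha'}(\tau)\, V^{-1}_{\alpha\alpha'\beta\beta'}\, \phi_{\beta\beta'}(\tau), \tag{7.39}$$

$$S_D = \int_0^\beta d\tau \sum_{\alpha\alpha'} \psi^*_\alpha(\tau) \left[ (\partial_\tau - \mu)\delta_{\alpha\alpha'} + \epsilon_{\alpha\alpha'}(\tau) \right] \psi_{\alpha'}(\tau) \quad \text{with} \quad \epsilon_{\alpha\alpha'}(\tau) = \epsilon_{\alpha\alpha'} + \phi_{\alpha\alpha'}(\tau), \tag{7.40}$$

$$S_L = \int_0^\beta d\tau \sum_{j,\gamma\gamma'} \varphi^*_{j\gamma}(\tau) \left[ (\partial_\tau - \mu)\delta_{\gamma\gamma'} + E^j_{\gamma\gamma'} \right] \varphi_{j\gamma'}(\tau), \tag{7.41}$$

$$S_{TJ} = \int_0^\beta d\tau \sum_{\alpha,j,\gamma} \left[ \varphi^*_{j\gamma}(\tau)\, \Lambda^j_{\gamma\alpha}(\tau)\, \psi_\alpha(\tau) + \psi^*_\alpha(\tau)\, \Lambda^{j*}_{\gamma\alpha}(\tau)\, \varphi_{j\gamma}(\tau) \right] \tag{7.42}$$

with $\Lambda^j_{\gamma\alpha}(\tau) = \left[ 1 - (-1)^j i e\, w_j\, \chi(\tau) \right] t^j_{\gamma\alpha}$.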
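The Gaussian identity behind the Hubbard–Stratonovich transformation of eq. (7.37) can be checked numerically in its simplest form, a single real variable $\phi$ coupled to a single number $\rho$. The sketch below is only an illustration under stated assumptions: it writes the linear coupling with an explicit factor $i$ and includes the normalization $1/\sqrt{2\pi V}$, which is the convention under which the one-dimensional integral converges for $V > 0$; eq. (7.37) is written in compact form and does not display these details. The function name `hs_integral` and the quadrature parameters are hypothetical.

```python
import cmath
import math


def hs_integral(rho, V, n_points=4001, width=12.0):
    """Trapezoidal estimate of (2 pi V)^(-1/2) * int dphi exp(-phi^2/(2V) - i*rho*phi).

    Assumed scalar convention: the coupling to rho carries a factor i and V > 0.
    """
    sigma = math.sqrt(V)
    lo = -width * sigma
    hi = width * sigma
    h = (hi - lo) / (n_points - 1)
    total = 0j
    for k in range(n_points):
        phi = lo + k * h
        weight = 0.5 if k in (0, n_points - 1) else 1.0
        total += weight * cmath.exp(-phi * phi / (2.0 * V) - 1j * rho * phi)
    return total * h / math.sqrt(2.0 * math.pi * V)


if __name__ == "__main__":
    rho, V = 1.3, 0.7  # arbitrary test values
    lhs = math.exp(-0.5 * V * rho * rho)  # decoupled "interaction" factor exp(-rho V rho / 2)
    rhs = hs_integral(rho, V)
    print(lhs, rhs.real, rhs.imag)
```

In the field-theoretic case the same completion of the square is carried out for every imaginary-time slice and every pair of indices $\alpha\alpha'$, with the matrix $V^{-1}_{\alpha\alpha'\beta\beta'}$ replacing $1/V$.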

7.4 Effective Action

In this section we will derive an effective action for the interacting quantum dot coupled to two leads. We will use the fact that the action of the system is (at most) quadratic in the Fermion fields to integrate out these degrees of freedom exactly. The final result will be an expression for the generating functional of the dot as a path integral over the electrostatic potential only.

7.4.1 Integration over the Lead Fermions

Since the Fermion fields on the quantum dot and on the leads are coupled by the bilinear terms of the tunneling action we will proceed in two steps. First we will integrate over the lead Fermions to obtain an intermediate description in terms of the Fermions and the electrostatic potential on the dot only. In a second step we can then also integrate over the electrons on the quantum dot.

Definition of the Electron Green’s Functions

The Green’s function of an electron in one of the leads can be defined by

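$$\sum_{\gamma'} \left[ (\partial_\tau - \mu)\delta_{\gamma\gamma'} + E^j_{\gamma\gamma'} \right] g^j_{\gamma'\gamma''}(\tau, \tau') = \delta(\tau - \tau')\, \delta_{\gamma\gamma''}. \tag{7.43}$$

In addition we define the Green’s function of the (interacting) electrons on the quantum dot by

$$\sum_{\alpha'} \left[ (\partial_\tau - \mu)\delta_{\alpha\alpha'} + \epsilon_{\alpha\alpha'}(\tau) \right] \mathcal{G}_{\alpha'\alpha''}(\tau, \tau') = \delta(\tau - \tau')\, \delta_{\alpha\alpha''}. \tag{7.44}$$

Simplification of the Notation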

To keep the formulas as simple and clear as possible it is useful to express the summations over quantum numbers α and γ as well as the integration over imaginary time as multiplication of matrices and vectors. We will use the following notations

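$$\varphi \,\hat{=}\, \varphi_{j\gamma}(\tau), \qquad \psi \,\hat{=}\, \psi_\alpha(\tau), \qquad \phi \,\hat{=}\, \phi_{\alpha\alpha'}(\tau) \tag{7.45}$$

$$V \,\hat{=}\, V_{\alpha\alpha'\beta\beta'}, \qquad \Lambda \,\hat{=}\, \Lambda^j_{\gamma\alpha}(\tau) \tag{7.46}$$

$$g \,\hat{=}\, g^j_{\gamma\gamma'}(\tau, \tau')\, \delta_{jj'}, \qquad \mathcal{G} \,\hat{=}\, \mathcal{G}_{\alpha\alpha'}(\tau, \tau') \tag{7.47}$$

and multiplications of these matrices and vectors imply a summation or integration over the common indices, e.g.

$$\varphi^* g^{-1} \varphi \,\hat{=}\, \int_0^\beta d\tau \int_0^\beta d\tau' \sum_{j\gamma,\, j'\gamma'} \varphi^*_{j\gamma}(\tau)\, (g^j_{\gamma\gamma'})^{-1}(\tau, \tau')\, \delta_{jj'}\, \varphi_{j'\gamma'}(\tau') \tag{7.48}$$

$$\varphi^* \Lambda \psi \,\hat{=}\, \int_0^\beta d\tau \sum_{\alpha,j,\gamma} \varphi^*_{j\gamma}(\tau)\, \Lambda^j_{\gamma\alpha}(\tau)\, \psi_\alpha(\tau). \tag{7.49}$$

With these shorthand notations the action of the quantum dot can be condensed to

$$S = \frac{1}{2} \phi V^{-1} \phi + \psi^* \mathcal{G}^{-1} \psi + \varphi^* g^{-1} \varphi + \psi^* \Lambda^* \varphi + \varphi^* \Lambda \psi. \tag{7.50}$$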

Gaussian Integral over the Lead Fermions

The fields ϕ occur only in the action SL of the leads and in the tunneling action STJ . Thus we need only consider the following integral that is easily solved by application of eq. (3.44)

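$$\int \mathcal{D}\mu(\varphi)\, e^{-\varphi^* g^{-1} \varphi - \psi^* \Lambda^* \varphi - \varphi^* \Lambda \psi} = \det\left( g^{-1} \right) e^{\psi^* \Lambda^* g \Lambda \psi} \tag{7.51}$$

The determinant on the r.h.s. is the partition function $Z_L$ of the leads that may be omitted since it is just a constant contribution to the action. The exponential term represents the influence of the lead electrons on the quantum dot and shall be incorporated into the action $S_D$ of the dot electrons. In summary we get the following equations after integrating out the lead Fermions

$$Z_I[\chi] = Z_L \int \mathcal{D}\phi \int \mathcal{D}\mu(\psi)\, e^{-S[\chi,\phi,\psi]} \quad \text{with} \quad S = S_F + \tilde{S}_D, \tag{7.52}$$

$$S_F = \frac{1}{2} \phi V^{-1} \phi, \tag{7.53}$$

$$\tilde{S}_D = \psi^* \left[ \mathcal{G}^{-1} - \Lambda^* g \Lambda \right] \psi \equiv \psi^* \tilde{\mathcal{G}}^{-1} \psi. \tag{7.54}$$

7.4.2 Integration over the Quantum Dot Fermions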

The action S˜D of the quantum dot Fermions is again quadratic in the fields ψ and we can use the formula for the Gaussian integral of Grassmann fields to evaluate the path integral

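$$\int \mathcal{D}\mu(\psi)\, e^{-\psi^* \tilde{\mathcal{G}}^{-1} \psi} = \det\left( \tilde{\mathcal{G}}^{-1} \right). \tag{7.55}$$

The Green’s function $\tilde{\mathcal{G}}$ of the electrons in the electrostatic potential $\phi$ can be factorized using the relation

$$\tilde{\mathcal{G}}^{-1}[\phi, \chi] = \mathcal{G}^{-1}[\phi] - \Lambda^* g \Lambda = \mathcal{G}_0^{-1} + \phi - \Lambda^* g \Lambda = \mathcal{G}_0^{-1} \left[ 1 + \mathcal{G}_0 \phi - \mathcal{G}_0 \Lambda^* g \Lambda \right] \tag{7.56}$$

where $\mathcal{G}_0$ is the Green’s function of non–interacting electrons. Using $Z_{D0} = \det(\mathcal{G}_0^{-1})$ as the notation for the corresponding partition function we can rewrite the result of eq. (7.55) as

$$\det\left( \tilde{\mathcal{G}}^{-1} \right) = Z_{D0}\, \det\left( 1 + \mathcal{G}_0 \phi - \mathcal{G}_0 \Lambda^* g \Lambda \right) = Z_{D0}\, e^{\mathrm{tr}\{ \ln[ 1 + \mathcal{G}_0 \phi - \mathcal{G}_0 \Lambda^* g \Lambda ] \}}. \tag{7.57}$$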

While the partition function ZD0 once again is just an uninteresting constant contribution to the remaining path integral, we have reexponentiated the determinant since it is a functional of the electrostatic potential φ and thus represents a contribution to the effective action that we have derived. We can summarize our results as

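$$Z_I[\chi] = Z_L Z_{D0} \int \mathcal{D}\phi\, e^{-S_{eff}[\chi,\phi]} \tag{7.58}$$

$$S_{eff}[\chi, \phi] = \frac{1}{2} \phi V^{-1} \phi - \mathrm{tr}\left\{ \ln\left[ 1 + \mathcal{G}_0 \phi - \mathcal{G}_0 \Lambda^* g \Lambda \right] \right\}. \tag{7.59}$$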
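The reexponentiation step of eq. (7.57) relies on the matrix identity $\det(1 + A) = \exp(\mathrm{tr}\ln(1 + A))$. As a small sanity check, the sketch below verifies it for a 2×2 matrix $A$ that merely stands in for the operator $\mathcal{G}_0\phi - \mathcal{G}_0\Lambda^* g\Lambda$; the matrix entries, the helper names and the use of the power series of the logarithm (valid here because $A$ is small) are assumptions of this example.

```python
import math


def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def trace_log_one_plus(A, n_terms=80):
    """tr ln(1 + A) from the series sum_k (-1)^(k+1) tr(A^k) / k.

    The series converges only if the eigenvalues of A lie inside the unit circle.
    """
    power = [[1.0, 0.0], [0.0, 1.0]]
    total = 0.0
    for k in range(1, n_terms + 1):
        power = matmul2(power, A)
        total += (-1.0) ** (k + 1) * (power[0][0] + power[1][1]) / k
    return total


def det_one_plus(A):
    """Determinant of 1 + A for a 2x2 matrix."""
    return (1.0 + A[0][0]) * (1.0 + A[1][1]) - A[0][1] * A[1][0]


if __name__ == "__main__":
    A = [[0.2, 0.1], [-0.05, 0.3]]  # arbitrary small test matrix
    print(det_one_plus(A), math.exp(trace_log_one_plus(A)))
```

The first term of the same series, $\mathrm{tr}\,A$, is what a first-order expansion of the effective action (7.59) in $\phi$ and the tunneling term would retain.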

7.5 Discussion of the Results

In this section we will discuss our theoretical results in comparison with the results obtained for the SET and point out directions for future research in relation to recently published work on semiconductor quantum dots [19].

7.5.1 General Discussion

With eqs. (7.58) and (7.59) we have obtained an elegant representation of the generating functional for the current correlations of a semiconductor quantum dot as an imaginary–time path integral over the dot potential $\phi(\vec r, \tau)$. It was derived from the Hamiltonian (7.21)–(7.24) that takes into account explicitly the (screened) Coulomb interaction of the electrons on the dot and the dependence of the confinement potential on the gate voltage and the number of electrons via the electrostatic Green’s function. Though it was recognized that these aspects are relevant for a quantitative description of electron transport in semiconductor quantum dots in the Coulomb blockade regime (see e.g. [10, 19]) as well as in the Kondo regime (see e.g. [31]), the standard theoretical model takes into account electron–electron interaction only in terms of a constant charging energy and is restricted to a fixed (2D) harmonic confinement. In analogy to the calculation of the effective action for the SET we could integrate out explicitly the quasiparticle degrees of freedom, reducing the problem to a single path integral over the (macroscopic) electrostatic potential that was introduced by the decoupling of the electron–electron interaction via a Hubbard–Stratonovich transformation.

In contrast to the formulation for the SET, we have expressed the generating functional for the current autocorrelation function of the quantum dot as a path integral over the potential instead of the phase variable (which is conjugate to the charge on the dot). This formulation emerges naturally from the decoupling of the electron–electron interaction and is appealing because the potential is a quantity that can be pictured more easily than the phase. On the other hand it does not allow us to make a direct comparison with the results obtained for the SET.

Nonetheless one could try to exploit the similarity of the result (7.58)–(7.59) with eqs. (6.39)–(6.41) for the SET as a guide for the further investigation. In the case of the SET one could proceed by evaluating the tunnel action in the limit of a large number $N$ of conduction channels. Although Göppert et al. [88] have shown that this approach is justified already for $N = 10$ channels, its application to semiconductor quantum dots is questionable. Also from another point of view a further study in analogy to that for the SET poses considerable problems. Due to the assumption of constant interaction of the electrons on the island the Coulomb action could be expressed as a path integral over a single phase $\varphi(\tau)$. The path integral in eq. (7.58) runs over the potential $\phi(\vec r, \tau)$ (in the position basis), i.e. it requires a discretization and involves a much larger number of variables. In view of the computational effort this fact rules out the possibility of performing a Monte Carlo simulation as in the case of the SET unless one invokes the approximation that the field $\phi_{\alpha\alpha'}(\tau)$ can be given by just a few matrix elements for a certain choice of the basis $|\alpha\rangle$ in which eqs. (7.58)–(7.59) are expressed.

In view of these difficulties eqs. (7.58) and (7.59) have to be considered just as a first step towards the calculation of the conductance of semiconductor quantum dots for a realistic model.

7.5.2 Outlook: Stationary Phase Approximation

In this subsection we present the stationary phase approximation as a direction for the further examination of the path integral for the generating functional $Z_I[\chi]$ given in eqs. (7.58) and (7.59). We will first determine the stationary solution for the electrostatic potential, i.e. the field $\phi$ that minimizes the effective action. We will then expand the action up to second order around the stationary point. We will conclude this outlook with the relation of our results to recent work by Bednarek et al. [19].

Poisson Equation for the Stationary Field

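To determine the stationary field $\phi^{sp}$ we have to find the root(s) of the functional derivative of the effective action with respect to the field $\phi$. Before we determine this functional derivative we introduce a very useful notation for the δ–functions that appear throughout these calculations. We set

$$\delta^\tau_{\alpha\alpha'} \,\hat{=}\, \frac{\delta\phi_{\beta\beta'}(\tau')}{\delta\phi_{\alpha\alpha'}(\tau)} = \delta_{\alpha\beta}\, \delta_{\alpha'\beta'}\, \delta(\tau - \tau'). \tag{7.60}$$

It is important to note that the indices $\alpha$, $\alpha'$ and $\tau$ are fixed and no summation will be implied, whereas $\beta$, $\beta'$ and $\tau'$ are normal matrix indices. The meaning of the symbol $\delta^\tau_{\alpha\alpha'}$ shall be illustrated by the following examples that we will also need later. We start with the simple expression

$$\mathrm{tr}\left\{ \tilde{\mathcal{G}}\, \delta^\tau_{\alpha\alpha'} \right\} = \int_0^\beta d\tau' \sum_{\beta\beta'} \tilde{\mathcal{G}}_{\beta\beta'}(\tau', \tau')\, \frac{\delta\phi_{\beta'\beta}(\tau')}{\delta\phi_{\alpha\alpha'}(\tau)} = \tilde{\mathcal{G}}_{\alpha'\alpha}(\tau, \tau) = \tilde{\mathcal{G}}_{\alpha'\alpha}(0, 0) \tag{7.61}$$

where we have used in the last step that the Green’s function depends only on the difference of the time arguments to state that the resulting expression is indeed independent of $\tau$. In the calculation of the fluctuations of the electrostatic potential that will be presented in the next subsection we encounter the following expression

$$\mathrm{tr}\left\{ \tilde{\mathcal{G}}\, \delta^{\tau'}_{\alpha''\alpha'''}\, \tilde{\mathcal{G}}\, \delta^\tau_{\alpha\alpha'} \right\} = \int_0^\beta d\tau'' \int_0^\beta d\tau''' \sum_{\beta\beta'\beta''\beta'''} \tilde{\mathcal{G}}_{\beta\beta'}(\tau'', \tau''')\, \frac{\delta\phi_{\beta'\beta''}(\tau''')}{\delta\phi_{\alpha''\alpha'''}(\tau')}\, \tilde{\mathcal{G}}_{\beta''\beta'''}(\tau''', \tau'')\, \frac{\delta\phi_{\beta'''\beta}(\tau'')}{\delta\phi_{\alpha\alpha'}(\tau)} = \tilde{\mathcal{G}}_{\alpha'\alpha''}(\tau, \tau')\, \tilde{\mathcal{G}}_{\alpha'''\alpha}(\tau', \tau) \tag{7.62}$$

Using the δ symbol the equations determining the stationary field can be easily derived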

$$\begin{aligned} 0 &= \left. \frac{\delta S_{eff}}{\delta\phi_{\alpha\alpha'}(\tau)} \right|_{\phi=\phi^{sp}} \\ &= \frac{1}{2} \left[ \phi^{sp} V^{-1} \delta^\tau_{\alpha\alpha'} + \delta^\tau_{\alpha\alpha'} V^{-1} \phi^{sp} \right] - \mathrm{tr}\left\{ \left[ 1 + \mathcal{G}_0 \phi^{sp} - \mathcal{G}_0 \Lambda^* g \Lambda \right]^{-1} \mathcal{G}_0\, \delta^\tau_{\alpha\alpha'} \right\} \\ &= \delta^\tau_{\alpha\alpha'} V^{-1} \phi^{sp} - \mathrm{tr}\left\{ \tilde{\mathcal{G}}[\phi^{sp}]\, \delta^\tau_{\alpha\alpha'} \right\} \end{aligned} \tag{7.63}$$

where we have used the symmetry of the interaction matrix $V$ and the definition of $\tilde{\mathcal{G}}$.

The stationarity condition (7.63) and the definition of the Green’s function $\tilde{\mathcal{G}}$ form a closed set of equations for the potential $\phi^{sp}$ that shall be discussed in the following. Multiplying eq. (7.63) by $V$ and writing out the trace and all matrix multiplications we get the following equations for the stationary field

$$\phi^{sp}_{\alpha\alpha'} = \sum_{\beta\beta'} V_{\alpha\alpha',\beta\beta'}\, \tilde{\mathcal{G}}_{\beta'\beta}(0, 0) \tag{7.64}$$

which has to be supplemented by the equation for $\tilde{\mathcal{G}}$ that we will write in the form of a Dyson equation

$$\tilde{\mathcal{G}} = \mathcal{G}_0 - \mathcal{G}_0 \left( \phi^{sp} - \Lambda^* g \Lambda \right) \tilde{\mathcal{G}} \tag{7.65}$$

where $\mathcal{G}_0$ is the Green’s function of non–interacting Fermions on the quantum dot.

The interpretation of eq. (7.64) is physically quite intuitive and becomes obvious if we choose the position basis for our one–particle Hilbert space for which it takes the form

$$\phi^{sp}(\vec r) = \int_V d\vec r\,'\, G(\vec r, \vec r\,')\, \tilde{\mathcal{G}}(\vec r\,' 0, \vec r\,' 0). \tag{7.66}$$

The potential $\phi^{sp}$ is the solution of the electrostatic boundary problem, i.e. the Poisson equation, with sources given by the electron density $\rho_e(\vec r) = \tilde{\mathcal{G}}(\vec r\, 0, \vec r\, 0)$ of the quantum dot. The electron density itself has to be calculated from the Green’s function $\tilde{\mathcal{G}}$ that describes the quantum mechanics of an electron in the presence of the Hartree potential $\phi^{sp}$ created by the other electrons as well as the external potentials and the tunnel coupling to the leads. Since each of the eqs. (7.64) and (7.65) requires the solution of the other one, this system of equations has to be solved self–consistently.
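The self–consistency cycle described above (potential from density, density from potential) can be illustrated by a deliberately simplified toy problem. The sketch below is not the model of this chapter: it assumes a single dot level of energy `eps` with a scalar interaction `V`, neglects the tunneling correction $\Lambda^* g \Lambda$, and replaces the equal–time Green’s function by the Fermi occupation of the shifted level, in the spirit of the identification of $\tilde{\mathcal{G}}(\vec r\,0, \vec r\,0)$ with the electron density. All names and parameter values are hypothetical.

```python
import math


def fermi(x, beta):
    """Fermi function 1 / (exp(beta * x) + 1)."""
    return 1.0 / (math.exp(beta * x) + 1.0)


def solve_hartree(eps, V, mu, beta, mixing=0.3, tol=1e-12, max_iter=10000):
    """Damped fixed-point iteration for phi = V * n(phi), n = f(eps + phi - mu).

    Toy analogue of eqs. (7.64) and (7.65): the mean-field potential is generated
    by the occupation, and the occupation is evaluated in that potential.
    """
    phi = 0.0
    for iteration in range(max_iter):
        n = fermi(eps + phi - mu, beta)      # "density" in the current potential
        phi_new = V * n                      # "Poisson" step: potential from density
        if abs(phi_new - phi) < tol:
            return phi_new, n, iteration
        phi = (1.0 - mixing) * phi + mixing * phi_new  # linear mixing for stability
    raise RuntimeError("self-consistency loop did not converge")


if __name__ == "__main__":
    phi_sp, occupation, n_iter = solve_hartree(eps=-0.2, V=1.0, mu=0.0, beta=5.0)
    print(phi_sp, occupation, n_iter)
```

Linear mixing is used because a plain iteration can oscillate when the product of interaction strength and inverse temperature is large; in a realistic calculation the scalar update would be replaced by the solution of the Poisson equation and of the Dyson equation on a spatial grid.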

Fluctuations of the Electrostatic Potential

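The fluctuations of the action around the stationary solution $\phi^{sp}$ are described by the second order functional derivative of the effective action with respect to the fields $\phi$. Starting from the results of eq. (7.63) we get

$$\begin{aligned} \left. \frac{\delta^2 S_{eff}}{\delta\phi_{\alpha''\alpha'''}(\tau')\, \delta\phi_{\alpha\alpha'}(\tau)} \right|_{\phi=\phi^{sp}} &= \left. \frac{\delta}{\delta\phi_{\alpha''\alpha'''}(\tau')} \left[ \delta^\tau_{\alpha\alpha'} V^{-1} \phi - \mathrm{tr}\left\{ \tilde{\mathcal{G}}[\phi]\, \delta^\tau_{\alpha\alpha'} \right\} \right] \right|_{\phi=\phi^{sp}} \\ &= \delta^\tau_{\alpha\alpha'} V^{-1} \delta^{\tau'}_{\alpha''\alpha'''} - \mathrm{tr}\left\{ \left. \frac{\delta\tilde{\mathcal{G}}[\phi]}{\delta\phi_{\alpha''\alpha'''}(\tau')} \right|_{\phi=\phi^{sp}} \delta^\tau_{\alpha\alpha'} \right\} \\ &= \delta^\tau_{\alpha\alpha'} V^{-1} \delta^{\tau'}_{\alpha''\alpha'''} - \mathrm{tr}\left\{ \tilde{\mathcal{G}}[\phi^{sp}] \left. \frac{\delta\tilde{\mathcal{G}}^{-1}[\phi]}{\delta\phi_{\alpha''\alpha'''}(\tau')} \right|_{\phi=\phi^{sp}} \tilde{\mathcal{G}}[\phi^{sp}]\, \delta^\tau_{\alpha\alpha'} \right\} \\ &= \delta^\tau_{\alpha\alpha'} V^{-1} \delta^{\tau'}_{\alpha''\alpha'''} - \mathrm{tr}\left\{ \tilde{\mathcal{G}}[\phi^{sp}]\, \delta^{\tau'}_{\alpha''\alpha'''}\, \tilde{\mathcal{G}}[\phi^{sp}]\, \delta^\tau_{\alpha\alpha'} \right\}. \end{aligned} \tag{7.67}$$

Using eq. (7.62) and writing out in a similar way also the products in the first term we get the following simple result

$$\frac{\delta^2 S_{eff}}{\delta\phi_{\alpha''\alpha'''}(\tau')\, \delta\phi_{\alpha\alpha'}(\tau)} = V^{-1}_{\alpha\alpha',\alpha''\alpha'''}\, \delta(\tau - \tau') - \tilde{\mathcal{G}}_{\alpha'\alpha''}(\tau, \tau')\, \tilde{\mathcal{G}}_{\alpha'''\alpha}(\tau', \tau) \tag{7.68}$$

where the second term is usually denoted as the polarization function of the quantum dot

$$P_{\alpha\alpha',\beta\beta'}(\tau, \tau') \equiv -\tilde{\mathcal{G}}_{\alpha'\beta}(\tau, \tau')\, \tilde{\mathcal{G}}_{\beta'\alpha}(\tau', \tau) \tag{7.69}$$

From the expansion of the effective action around the stationary point we get the following result for the generating functional in the so–called one–loop approximation that takes into account Gaussian fluctuations around the stationary field.

$$\begin{aligned} Z_I[\chi] &= Z_L Z_{D0}\, \det\left( \left. \frac{\delta^2 S_{eff}}{\delta\phi\, \delta\phi} \right|_{\phi=\phi^{sp}} \right) e^{-S_{eff}[\chi, \phi^{sp}]} \\ &= Z_L\, \det\left( V + P[\phi^{sp}] \right) e^{-\phi^{sp} V^{-1} \phi^{sp} + \mathrm{tr}\{ \tilde{\mathcal{G}}[\phi^{sp}, \chi] \}} \end{aligned} \tag{7.70}$$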

Discussion of the Result

Eq. (7.70) together with the definition (7.69) and eqs. (7.64) and (7.65) represents a concrete starting point for future research. Though we have made the approximation that only Gaussian fluctuations around the stationary field are taken into account, this approach has the advantage that the effective tunnel action could be treated generally without invoking the limit of a large number of conduction channels. The determination of the stationary field $\phi^{sp}$ and the dot Green’s function $\tilde{\mathcal{G}}$ from eqs. (7.64) and (7.65) amounts to the self–consistent solution of the Poisson equation for $\phi^{sp}$ (with sources given by the dot electrons and boundary conditions given by the gate voltage(s)) and the quantum mechanical problem for $\tilde{\mathcal{G}}$ taking into account the mean field potential $\phi^{sp}$ and the correction due to tunneling. This part of the problem is closely related to recent work by Bednarek et al. [19] who performed a self–consistent solution of the Poisson equation for the dot potential and the Hartree–Fock equations for the energy and the electron density of the quantum dot electrons. Using a realistic model for the vertical quantum dot of Kouwenhoven et al. [10] that was also considered in this thesis, Bednarek et al. found excellent agreement for the position of the conductance peaks as shown in fig. 7.8.

Figure 7.8: Theoretical results for the electrochemical potential of the dot for n=1 to n=12 electrons as a function of gate voltage and experimental conduction peaks. The position of peaks shows good agreement with the zeros of the chemical potential (from [10]).

Taking into account a finite bias voltage across the dot they could even determine the Coulomb blockade diamonds for n = 1 up to n = 12 electrons on the dot in very good agreement with experiment as shown in fig. 7.9. In view of this work of Bednarek et al. it would be interesting to use the results of the stationary phase approximation that in addition take into account Gaussian fluctuations of the potential and corrections due to tunneling and offer the possibility to determine not only the position but also the shape of the conductance peaks.

Figure 7.9: Experimental stability diagram exhibiting diamond shaped regions of Coulomb blockade (white regions) and the theoretically calculated boundaries (solid curves) of single electron tunneling (from [19]).

Chapter 8

Summary and Conclusions

In this thesis we have examined imaginary–time path integral methods for the calculation of the conductance of single electron devices. Our work was focussed in particular on three aspects of this topic. The first aim was to study to what extent imaginary–time methods for the determination of (linear response) transport coefficients represent an alternative to real–time calculations which are hampered by the dynamical sign problem. Secondly, we applied the imaginary–time formalism to the calculation of the conductance of the metallic single electron transistor to make a detailed comparison with new experimental results obtained by Wallisser et al. [2]. Finally we tried to extend the ideas used in the calculations for the metallic single electron transistor to a more general model suitable for the description of semiconductor quantum dots. In the following we will give a summary of our work and present the conclusions that we obtained with respect to these three aspects.

Imaginary–Time Path Integral Methods

The connection between imaginary–time correlation functions and transport coefficients provided by linear response theory is given in the form of ill–posed inverse problems. We have presented the singular value decomposition and the maximum entropy method as two approaches for the determination of a regularized solution of such problems. As a test case for the comparison of the methods we have chosen the exactly solvable model of a harmonic oscillator coupled linearly to a harmonic environment. The dipole absorption cross section σ(ω) of the oscillator can be related by an inverse problem to the imaginary–time displacement correlation function R2(τ). For our tests we implemented a recently proposed improvement of the SVD algorithm that exploits the positivity of the solution. The cross section was reconstructed from the imaginary–time correlation function for different levels of added noise. In addition we examined different values for the system parameters, i.e. the frequency ω0 of free oscillations and the damping parameter f, ranging from a regime in which the oscillator exhibits purely relaxational dynamics to the regime of weakly damped oscillatory behavior. We performed a comparison of our SVD solutions with the exact results and MEM calculations of Krilov et al. [72]. The comparison of the two methods with the exact solution allows us to conclude that imaginary–time methods represent a viable alternative to real–time methods in particular for systems with a purely relaxational dynamics. To obtain accurate solutions in the case of a weakly damped oscillatory dynamics, on the other hand, requires imaginary–time data with a very high precision. From the comparison between the two methods we can further conclude that the MEM solutions show better agreement with the exact results than the SVD solutions.


The tests for the oscillator model clearly demonstrate the superiority of the entropy functional (derived by statistical reasoning) over the ad–hoc (historic) choice of the norm as a regularizing functional in the SVD. We have shown that this conclusion remains valid also if positivity is incorporated in the SVD approach to enhance the resolution. Nonetheless the application of the SVD can be justified for systems with relaxational dynamics, for which the discrepancy between results of the two methods is smallest, since it is more robust and easier to implement than the MEM approach. The combined use of imaginary–time correlation functions and real–time data (for short times) that was proposed recently and has been examined using the MEM approach [35, 72, 78] can be included equally well into the SVD scheme. Since the incorporation of additional real–time data just represents an enlargement of the input used for the solution of the inverse problem and does not alter the method itself, we have not included corresponding results obtained from the SVD in this thesis. That real–time data (at short times) comprises information that is complementary to that contained in the imaginary–time correlation function has already been proven for the MEM approach and it is not surprising that it leads to analogous improvements in the SVD method.

Conductance of the Metallic Single Electron Transistor

We have derived a path integral expression for the imaginary–time current autocorrelation function of a metallic single electron transistor from which the conductance can be calculated using linear response theory and methods for the solution of inverse problems. Our aim was to make a detailed comparison of this theory with recent experimental results of Wallisser et al. [2]. Due to an improved 4–junction SET layout these experiments allowed for the first time a complete characterization of the sample by a direct determination of the parallel conductance $g$ and an accurate measurement of the charging energy $E_C$ from high temperature data. This experimental progress provided strong motivation to test the theory without invoking any assumptions about the experimental setup or using adjustable parameters. We were in particular interested in the regime of large parallel conductance $g$ that is not accessible to perturbation theory. To get accurate results for temperatures corresponding to the Coulomb blockade regime $k_B T \ll E_C$, it was necessary to go beyond semiclassical calculations and to use Monte Carlo methods for the evaluation of the path integral expression.

We employed importance sampling by the Metropolis algorithm to obtain data for the imaginary–time correlation function which are sufficiently accurate for the calculation of the conductance by the solution of an inverse problem. Since the employed regularization method requires not only precise data for the correlation function but also a good estimate of the error of this data, we made considerable effort in the equilibration of the system, the control of systematic errors by Trotter extrapolation and the determination of the statistical Monte Carlo error. As a limiting factor of the Monte Carlo approach based on the phase representation of the effective action we identified the sign problem that occurs for gate voltages $U_g \neq 0$ and leads to a slow convergence of the method for (very) small temperatures and small parallel conductances.

We made a detailed comparison of the theoretical results for the conductance with the experimental measurements of Wallisser et al. [2] for $g = 4.75$ over the whole range of gate voltages and a broad range of temperatures ranging from the high–temperature regime $k_B T > E_C$ down to temperatures corresponding to the Coulomb blockade $k_B T \ll E_C$. The comparison revealed an excellent agreement between experiment and theory over the whole range of examined parameters. The results of perturbation theory in second order in the parallel conductance [79, 80] describe the experimental data very poorly in the Coulomb blockade regime, clearly demonstrating the fact that $g = 4.75$ lies well outside the range of approaches that take into account tunneling only in a perturbative way. Also the semiclassical approach [81] can give an accurate description of the experimental data only in the high–temperature regime. Thus we have shown that the Monte Carlo approach is able to extend the parameter range covered by these alternative theories to large parallel conductance and/or low temperatures. In addition we have calculated the conductance as a function of the tunneling strength $g$ in the Coulomb blockade regime. We combined the available measurements from different experiments and results of perturbation theory in second and third order for a comparison. Also in this study we found good agreement of the Monte Carlo calculations with the experimental data over the whole range of parameters, thereby extending the results of perturbation theory which is restricted to small parallel conductances.
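The two numerical ingredients named in this summary, importance sampling by the Metropolis algorithm and a binning estimate of the statistical error, can be sketched on a toy problem. The example below does not use the effective action of the SET; it assumes the simple one–variable action $S(x) = x^2/2$ and measures $x^2$, whose exact expectation value is 1. Function names, step size, seed and bin size are arbitrary choices made for this illustration.

```python
import math
import random


def metropolis_chain(n_steps, step_size=1.0, seed=1):
    """Sample x with weight exp(-S(x)), S(x) = x^2 / 2, and record x^2 at each step."""
    rng = random.Random(seed)
    x = 0.0
    measurements = []
    for _ in range(n_steps):
        trial = x + rng.uniform(-step_size, step_size)
        delta_s = 0.5 * (trial * trial - x * x)
        # Metropolis rule: accept with probability min(1, exp(-delta_s)).
        if delta_s <= 0.0 or rng.random() < math.exp(-delta_s):
            x = trial
        measurements.append(x * x)
    return measurements


def binned_error(data, bin_size):
    """Mean and standard error estimated from averages over bins of length bin_size."""
    n_bins = len(data) // bin_size
    means = [sum(data[i * bin_size:(i + 1) * bin_size]) / bin_size for i in range(n_bins)]
    mean = sum(means) / n_bins
    variance = sum((m - mean) ** 2 for m in means) / (n_bins - 1)
    return mean, math.sqrt(variance / n_bins)


if __name__ == "__main__":
    data = metropolis_chain(200000)
    print("naive :", binned_error(data, 1))     # ignores autocorrelations
    print("binned:", binned_error(data, 1000))  # bins much longer than the correlation time
```

Because successive Metropolis configurations are correlated, the naive error from single measurements underestimates the true statistical error; increasing the bin size until the estimate saturates gives a more reliable value.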

Imaginary–Time Approach for Semiconductor Quantum Dots

Using as a basis the band structure of the semiconductor heterostructure, we have set up a microscopic model of electron transport in semiconductor quantum dots. The electrostatics of the system, consisting of the dot and the nearby metallic electrodes, was studied to get a realistic description of the screened Coulomb interaction of the electrons on the quantum dot and the confinement potential induced by the voltages applied to the metallic gates. Thus our model addresses the shortcomings of the constant interaction model that takes into ac- count electron–electron interaction only via a constant charging energy EC and uses a fixed confinement potential which is usually assumed to be harmonic. We introduced the electrostatic potential φ(~r) as an auxiliary field of the Hubbard–Stratonovich transformation that decouples the direct electron–electron interaction on the quantum dot. By an exact integration of the coherent state path integrals over the Fermions in the leads and on the quantum dot, the gen- erating functional for the current autocorrelation function could be expressed as a single path integral over an effective action in terms of the potential φ. Though these theoretical results are quite similar to the equations obtained for the metallic single electron transistor an analysis of the conductance of semiconductor quantum dots by a Monte Carlo simulation is inhibited by two circumstances. One reason is, that the assumption of a large number of conduction channels, which was used for the evaluation of the effective tunnel action of the SET, can not be applied in the case of semiconductor quantum dots where only few conduction channels are available. Further, the path integral for the generating functional of the current autocorrelation function of the quantum dot runs over the potential field φαα′ (τ) and thus involves a much larger number of degrees of freedom than the simple path integral over the phase in the case of the SET. 
Unless one can restrict the problem to a small number of relevant matrix elements for the potential, the numerical effort of a Monte Carlo simulation will be too large. Finally, we pointed out the stationary phase approximation as a possible direction for future work and gave the expression for the generating functional in the one–loop approximation. The stationary field has to be determined from the self–consistent solution of a set of two equations. The Poisson equation determines the mean field potential $\phi_{sp}$ from the sources given by the quantum dot electrons and the boundary conditions given by the voltages applied to the metallic gates. This mean field potential and corrections due to tunneling between dot and leads are used in the calculation of the Green's function of the dot electrons and in particular the electron density on the dot. As shown by Bednarek et al. [19], the solution of this problem for a realistic model of the quantum dot can give a very accurate description of the position of the conductance peaks for few–electron quantum dots and convincingly explains the observed regions of Coulomb blockade in the stability diagram for n = 1 up to n = 12 electrons on the quantum dot. In addition to the approach of Bednarek et al., the one–loop approximation presented in this thesis takes into account Gaussian fluctuations around the mean field potential and corrections due to tunneling between dot and leads. Since it should further be able to describe not only the position but also the shape of the conductance peaks, it would be an interesting subject for further research.

Part III

Appendices


Appendix A

Properties of Correlation Functions

Correlation functions, and especially autocorrelation functions, have some very useful properties, which are summarized in this appendix.

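• Time translation invariance: As was already used implicitly in the definition of the two-point correlation function, its value is invariant under translations of (complex) time. This property follows directly from the cyclic invariance of the trace:

\[
\begin{aligned}
\langle A(z+z')B(z')\rangle &= Z^{-1}\,\mathrm{tr}\left\{ e^{iH(z+z')} A\, e^{-iH(z+z')}\, e^{iHz'} B\, e^{-iHz'}\, e^{-\beta H} \right\} \\
&= Z^{-1}\,\mathrm{tr}\left\{ e^{iHz} A\, e^{-iHz} B\, e^{-\beta H} \right\} = \langle A(z)B(0)\rangle \quad \text{for all } z, z' \in \mathbb{C}.
\end{aligned}
\tag{A.1}
\]

As a special case we get

\[
\langle A(z)B(0)\rangle = \langle A(0)B(-z)\rangle \quad \text{for } z \in \mathbb{C}. \tag{A.2}
\]

• Commutation of operators: Using the completeness relation and once again the cyclic invariance of the trace, we prove the following relation between $C_{AB}(z)$ and $C_{BA}(z)$:

\[
\begin{aligned}
\langle A(z)B(0)\rangle &= Z^{-1}\,\mathrm{tr}\left\{ e^{iHz} A\, e^{-iHz} B\, e^{-\beta H} \right\} \\
&= Z^{-1}\,\mathrm{tr}\left\{ B\, e^{-\beta H} e^{izH} A\, e^{-izH} e^{\beta H} e^{-\beta H} \right\} = \langle B(0)A(z+i\beta)\rangle \\
&= \langle B(-i\beta - z)A(0)\rangle \quad \text{for } z \in \mathbb{C}.
\end{aligned}
\tag{A.3}
\]

As a special case for autocorrelation functions we get

\[
\langle A(z)A(0)\rangle = \langle A(-i\beta - z)A(0)\rangle \quad \text{for } z \in \mathbb{C}. \tag{A.4}
\]

• Complex conjugation: Since quantum mechanical observables are hermitian operators, we can state for real times $t$ the following relation for the complex conjugate of a correlation function:

\[
\begin{aligned}
\langle A(t)B(0)\rangle^{*} &= \langle B^{\dagger}(0)A^{\dagger}(t)\rangle \\
&= \langle B(0)A(t)\rangle \\
&= \langle B(-t)A(0)\rangle = \langle A(t+i\beta)B(0)\rangle \quad \text{for } t \in \mathbb{R}.
\end{aligned}
\tag{A.5}
\]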


• Cauchy–Schwarz inequality: Another property that holds for real times $t$ is the Cauchy–Schwarz inequality

\[
|\langle A(t)B(0)\rangle| \le \sqrt{\langle A^2\rangle \langle B^2\rangle}, \tag{A.6}
\]

which follows from the fact that the correlation function represents a scalar product on the Hilbert space. For imaginary or general complex time this is not necessarily true, since the complex conjugation property (A.5) is required for the scalar product.
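The properties listed so far can be checked numerically on a small toy system. The following is only a sketch under stated assumptions: the three energy levels, the inverse temperature and the two hermitian matrices are arbitrary illustrative values, and the helper names `heisenberg`, `corr` and `check_properties` are hypothetical. Everything is written in the eigenbasis of $H$, where $A(z) = e^{iHz} A e^{-iHz}$ simply multiplies each matrix element by a phase factor.

```python
import cmath
import math

# Toy three-level system written in the eigenbasis of H (all numbers are assumed).
E = [0.0, 0.7, 1.9]      # eigenvalues of H
BETA = 1.3               # inverse temperature

# Two hermitian operators given by their matrix elements in the eigenbasis of H.
A = [[0.5, 1.0 + 0.2j, 0.3],
     [1.0 - 0.2j, -0.4, 0.8j],
     [0.3, -0.8j, 0.1]]
B = [[-0.2, 0.4j, 1.1],
     [-0.4j, 0.9, 0.6 - 0.5j],
     [1.1, 0.6 + 0.5j, 0.3]]

Z = sum(math.exp(-BETA * e) for e in E)   # partition function


def heisenberg(op, z):
    """Matrix elements of op(z) = exp(iHz) op exp(-iHz) for complex time z."""
    n = len(E)
    return [[cmath.exp(1j * (E[a] - E[b]) * z) * op[a][b] for b in range(n)]
            for a in range(n)]


def corr(op1, z1, op2, z2):
    """Thermal average <op1(z1) op2(z2)> = tr{op1(z1) op2(z2) exp(-beta H)} / Z."""
    x, y = heisenberg(op1, z1), heisenberg(op2, z2)
    n = len(E)
    return sum(math.exp(-BETA * E[a]) * x[a][b] * y[b][a]
               for a in range(n) for b in range(n)) / Z


def check_properties(z=0.4 - 0.3j, zp=-0.9 + 0.2j, t=0.8):
    """Residuals of (A.1), (A.3), the first equalities of (A.5), and the slack of (A.6)."""
    res_a1 = abs(corr(A, z + zp, B, zp) - corr(A, z, B, 0))
    res_a3 = abs(corr(A, z, B, 0) - corr(B, -1j * BETA - z, A, 0))
    res_a5 = abs(corr(A, t, B, 0).conjugate() - corr(B, -t, A, 0))
    bound = math.sqrt((corr(A, 0, A, 0) * corr(B, 0, B, 0)).real)
    slack_a6 = bound - abs(corr(A, t, B, 0))
    return res_a1, res_a3, res_a5, slack_a6
```

Calling `check_properties()` returns three residuals that should vanish up to rounding errors and a non-negative slack for the Cauchy–Schwarz bound. The check of (A.5) deliberately covers only the equalities up to $\langle B(-t)A(0)\rangle$.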

From these relations for the correlation function one can deduce corresponding properties for the spectral function. We give just two attributes of the spectrum of autocorrelation functions that are frequently used in this thesis:

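• Positivity of the spectral function: The spectral function for the autocorrelation function of a hermitian operator is non-negative, i.e. $S_{AA}(\omega) \ge 0$. To prove this statement we define the auxiliary quantity

\[
A_T(\omega) = \frac{1}{\sqrt{2T}} \int_{-T}^{T} dt\, A(t)\, e^{i\omega t}. \tag{A.7}
\]

The proposition follows from

\[
\begin{aligned}
0 \le \langle |A_T(\omega)|^2 \rangle &= \frac{1}{2T} \int_{-T}^{T} dt \int_{-T}^{T} dt'\, \langle A(t)A(t')\rangle\, e^{i\omega(t-t')} \\
&= \frac{1}{2T} \int_{-T}^{T} dt \int_{t-T}^{t+T} dt'\, \langle A(t-t')A(0)\rangle\, e^{i\omega(t-t')} \;\to\; S_{AA}(\omega) \quad \text{for } T \to \infty.
\end{aligned}
\tag{A.8}
\]

• Detailed balance relation: The detailed balance relation for the spectral function $S_{AA}(\omega)$ is a consequence of the property (A.4) of the autocorrelation function for real times $t$ and can be proven as follows:

\[
\begin{aligned}
S_{AA}(-\omega) &= \int_{-\infty}^{\infty} dt\, \langle A(t)A(0)\rangle\, e^{i\omega t} \\
&= \int_{-\infty}^{\infty} dt\, \langle A(i\beta - t)A(0)\rangle\, e^{i\omega t} \\
&= \int_{-\infty+i\beta}^{\infty+i\beta} dt\, \langle A(t)A(0)\rangle\, e^{i\omega t} \\
&= e^{-\beta\omega}\, S_{AA}(\omega),
\end{aligned}
\tag{A.9}
\]

where we have used for the last equality that the Fourier integral along each line parallel to the real axis is equal to the usual Fourier integral.

Appendix B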

Linear System of de Villiers’ SVD Method

Here we find the value of λ that minimizes the function

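\[
D(\lambda) = -\sum_{\sigma_n > \gamma} b_n \lambda_n + \frac{1}{2} \left\| \left( A_{>}^{\dagger} \lambda \right)_{+} \right\|^2. \tag{B.1}
\]

Using the following result for the derivative of the positive part

\[
\frac{\partial}{\partial\lambda} (f)_{+} = \frac{\partial}{\partial\lambda}\, \frac{1}{2} \left( f + |f| \right) = \frac{1 + \mathrm{sgn}(f)}{2}\, \frac{\partial f}{\partial\lambda} \tag{B.2}
\]

we get for the partial derivative of $D(\lambda)$ with respect to $\lambda_n$

\[
\begin{aligned}
\frac{\partial}{\partial\lambda_n} D(\lambda) &= -b_n + \left\langle \frac{1 + \mathrm{sgn}(A_{>}^{\dagger}\lambda)}{2}\, v_n ,\; \Big( \sum_{\sigma_{n'} > \gamma} \lambda_{n'} v_{n'} \Big)_{+} \right\rangle \\
&= -b_n + \sum_{\sigma_{n'} > \gamma} \left\langle \frac{1 + \mathrm{sgn}(A_{>}^{\dagger}\lambda)}{2}\, v_n ,\; v_{n'} \right\rangle \lambda_{n'},
\end{aligned}
\tag{B.3}
\]

where we have used the fact that it is sufficient to take the positive part in one of the vectors of the scalar product. Hence the optimal value of $\lambda$ for the minimization has to be determined from the self-consistent solution of the linear system

\[
H(\lambda)\, \lambda = b \quad \text{with} \quad (H(\lambda))_{n,n'} = \left\langle \frac{1 + \mathrm{sgn}(A_{>}^{\dagger}\lambda)}{2}\, v_n ,\; v_{n'} \right\rangle. \tag{B.4}
\]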
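The gradient formula (B.3) and the matrix $H(\lambda)$ of (B.4) can be checked against a finite-difference derivative of $D(\lambda)$. The sketch below rests on assumptions that are not part of the thesis: the functions $v_n$ are replaced by three orthonormal sine modes on $[0,1]$, the coefficients $b_n$ are arbitrary, the scalar product is a midpoint-rule integral, and $A_{>}^{\dagger}\lambda$ is taken to be $\sum_n \lambda_n v_n$ as implied by (B.3).

```python
import math

M = 2000                      # number of midpoint-rule grid points (assumed)
W = 1.0 / M                   # quadrature weight on the interval [0, 1]
GRID = [(i + 0.5) * W for i in range(M)]
MODES = [1, 2, 3]             # labels n of the retained functions v_n (assumed)
B_COEF = [0.3, -0.2, 0.1]     # coefficients b_n (assumed)


def v(n, omega):
    """Hypothetical orthonormal stand-in for the function v_n."""
    return math.sqrt(2.0) * math.sin(n * math.pi * omega)


def f(lam, omega):
    """Value of sum_n lambda_n v_n at the point omega."""
    return sum(l * v(n, omega) for l, n in zip(lam, MODES))


def D(lam):
    """Discretized version of the function D(lambda) in (B.1)."""
    quad = sum(W * max(f(lam, om), 0.0) ** 2 for om in GRID)
    return -sum(bn * ln for bn, ln in zip(B_COEF, lam)) + 0.5 * quad


def H(lam):
    """Matrix (B.4): overlap of v_n and v_n' restricted to the region where f >= 0."""
    pos = [om for om in GRID if f(lam, om) >= 0.0]
    return [[sum(W * v(n, om) * v(k, om) for om in pos) for k in MODES]
            for n in MODES]


def gradient(lam):
    """Gradient of D according to (B.3): -b + H(lambda) lambda."""
    hm = H(lam)
    size = len(lam)
    return [-B_COEF[i] + sum(hm[i][j] * lam[j] for j in range(size))
            for i in range(size)]


def fd_gradient(lam, h=1e-6):
    """Central finite-difference gradient of D, used as an independent check."""
    grad = []
    for i in range(len(lam)):
        up, dn = list(lam), list(lam)
        up[i] += h
        dn[i] -= h
        grad.append((D(up) - D(dn)) / (2.0 * h))
    return grad
```

For a test vector such as `[0.4, -0.7, 0.25]`, `gradient` and `fd_gradient` should agree to within the finite-difference accuracy. A stationary point of $D$ is reached where `gradient` vanishes, which is the condition $H(\lambda)\lambda = b$.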

For the Fredholm integral equation (5.31) the matrix H can be given as

\[
(H(\lambda))_{n,n'} = \int\limits_{(A_{>}^{\dagger}\lambda)(\omega) \ge 0} d\omega\; v_n(\omega)\, v_{n'}(\omega). \tag{B.5}
\]

Appendix C

The Damped Harmonic Oscillator

In this appendix we give a short summary of the derivation of the results for a (tagged) oscillator coupled to an environment that is described by the Caldeira–Leggett model [75] as a bath of harmonic oscillators. We restrict ourselves to the results that are quoted in chapter 5 and refer the reader to the literature [73, 74] for further details about this model.

C.1 Influence Functional for a Linearly Coupled Harmonic Bath

To derive the influence functional that describes the effect of a harmonic bath on a system, we first study a single harmonic oscillator that couples linearly to the system variable $x$, i.e. the Hamiltonian

\[
H = \frac{p^2}{2m} + \frac{m\omega^2}{2} q^2 - g\,x\,q. \tag{C.1}
\]

In the path integral description the effect of the oscillator on the system can be described by an influence functional $F[x]$ that is defined by a path integral over all closed paths

\[
F[x] = Z_B^{-1} \int \mathcal{D}q\; e^{-S[\dot q, q; x]} \tag{C.2}
\]

where $Z_B$ is the partition function of the isolated oscillator and $S[\dot q, q; x]$ the (Euclidean) action, which is given by

\[
S[\dot q, q; x] = \int_0^\beta d\tau\, \mathcal{L}[\dot q, q; x] = \int_0^\beta d\tau \left[ \frac{m}{2} \dot q^2(\tau) + \frac{m\omega^2}{2} q^2(\tau) - g\,x(\tau)\,q(\tau) \right]. \tag{C.3}
\]

Very useful for explicit calculations is the Matsubara representation

\[
S[\{q_n\}; \{x_n\}] = \sum_{n=-\infty}^{\infty} \left[ \frac{m}{2\beta} \left( \nu_n^2 + \omega^2 \right) |q_n|^2 - \frac{g}{\beta}\, x_n q_n^{*} \right] \tag{C.4}
\]

where we have introduced the coordinates $x_n$ and $q_n$ by expanding the $\beta$-periodic paths $x(\tau)$ and $q(\tau)$ in a Fourier series

\[
x(\tau) = \frac{1}{\beta} \sum_{n=-\infty}^{\infty} x_n\, e^{i\nu_n\tau}, \qquad q(\tau) = \frac{1}{\beta} \sum_{n=-\infty}^{\infty} q_n\, e^{i\nu_n\tau} \qquad \text{with} \quad \nu_n = \frac{2\pi n}{\beta}. \tag{C.5}
\]


To calculate the path integral (C.2) for the influence functional we first split $q_n$ into its classical part $q_n^{cl}$ that minimizes the action (C.4) and the fluctuations $\xi_n$

\[
q_n = q_n^{cl} + \xi_n \qquad \text{with} \qquad q_n^{cl} = \frac{g}{m(\nu_n^2 + \omega^2)}\, x_n. \tag{C.6}
\]

Inserting eq. (C.6) into (C.4) the action simplifies to

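\[
S[\{\xi_n\}; \{x_n\}] = \frac{m}{2\beta} \sum_{n=-\infty}^{\infty} \left( \nu_n^2 + \omega^2 \right) |\xi_n|^2 - \frac{1}{2\beta} \sum_{n=-\infty}^{\infty} \frac{g^2}{m(\nu_n^2 + \omega^2)}\, |x_n|^2 \tag{C.7}
\]

where the first term describes an isolated harmonic oscillator while the second term is the effective action for the system variable $x$ that describes the influence of the oscillator on the system. In the calculation of (C.2) the path integration over the fluctuations $\xi_n$ just cancels the prefactor $Z_B^{-1}$ and we get

\[
F[x] = \exp\left\{ \frac{1}{2} \int_0^\beta d\tau \int_0^\beta d\tau'\, x(\tau)\, \Gamma(\tau - \tau')\, x(\tau') \right\} \tag{C.8}
\]

with the kernel $\Gamma(\tau)$ that is given by

\[
\Gamma(\tau) = \frac{1}{\beta} \sum_{n=-\infty}^{\infty} \frac{g^2}{m(\nu_n^2 + \omega^2)}\, e^{i\nu_n\tau} = \frac{g^2}{2m\omega}\, \frac{\cosh\!\left(\omega\left(\tfrac{1}{2}\beta - \tau\right)\right)}{\sinh\!\left(\tfrac{1}{2}\omega\beta\right)}. \tag{C.9}
\]

From this result we can easily generalize to the Caldeira–Leggett model that describes the environment as a bath of harmonic oscillators with masses $m_\alpha$ and frequencies $\omega_\alpha$ that couple linearly to a system variable $x$ by coupling constants $g_\alpha$, i.e. the Hamiltonian

\[
H_R = \sum_\alpha \left( \frac{p_\alpha^2}{2m_\alpha} + \frac{m_\alpha\omega_\alpha^2}{2} q_\alpha^2 - g_\alpha\, x\, q_\alpha \right). \tag{C.10}
\]

Since the bath oscillators are independent we just have to multiply the corresponding influence functionals to get

\[
F_R[x] = \exp\left\{ \frac{1}{2} \int_0^\beta d\tau \int_0^\beta d\tau'\, x(\tau)\, \Gamma_R(\tau - \tau')\, x(\tau') \right\} \tag{C.11}
\]

where the kernel $\Gamma_R(\tau)$ just adds up the contributions of the bath oscillators

\[
\Gamma_R(\tau) = \sum_\alpha \frac{g_\alpha^2}{2m_\alpha\omega_\alpha}\, \frac{\cosh\!\left(\omega_\alpha\left(\tfrac{1}{2}\beta - \tau\right)\right)}{\sinh\!\left(\tfrac{1}{2}\omega_\alpha\beta\right)} = \int_0^\infty \frac{d\omega}{\pi}\, J_R(\omega)\, \frac{\cosh\!\left(\omega\left(\tfrac{1}{2}\beta - \tau\right)\right)}{\sinh\!\left(\tfrac{1}{2}\omega\beta\right)}. \tag{C.12}
\]

For the second equality we have introduced the spectral density of the bath modes

\[
J_R(\omega) = \pi \sum_\alpha \frac{g_\alpha^2}{2m_\alpha\omega_\alpha}\, \delta(\omega - \omega_\alpha). \tag{C.13}
\]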
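The two representations of the single-oscillator kernel in (C.9), the Matsubara sum and the closed hyperbolic form, can be compared numerically. The following sketch uses assumed parameter values and hypothetical function names; the sum is truncated at a finite `n_max`, so the agreement is limited by a truncation error of order $\beta/(2\pi^2 n_{max})$.

```python
import math


def gamma_matsubara(tau, beta, omega, g=1.0, m=1.0, n_max=200000):
    """Truncated Matsubara sum of (C.9); terms n and -n are combined into a cosine."""
    total = g * g / (m * omega * omega)          # n = 0 term
    for n in range(1, n_max + 1):
        nu = 2.0 * math.pi * n / beta            # Matsubara frequency nu_n
        total += 2.0 * g * g / (m * (nu * nu + omega * omega)) * math.cos(nu * tau)
    return total / beta


def gamma_closed(tau, beta, omega, g=1.0, m=1.0):
    """Closed form of (C.9), valid for 0 <= tau <= beta."""
    return (g * g / (2.0 * m * omega)
            * math.cosh(omega * (0.5 * beta - tau))
            / math.sinh(0.5 * omega * beta))
```

For example, with `beta = 2.0` and `omega = 1.5` the two functions should agree to roughly six digits for any `tau` in the interval $[0, \beta]$.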

C.2 Classical Dynamical Friction Kernel

In this appendix we examine the classical equations of motion of the Caldeira–Leggett model to relate the classical dynamical friction kernel ζ(t) to the spectral density JR(ω) of bath modes and thus to the kernel ΓR(τ) of the quantum mechanical influence functional. The coupled equations of motion for the tagged harmonic oscillator and the harmonic bath are given by

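\[
m\ddot x(t) + m\omega_0^2\, x(t) = \sum_\alpha g_\alpha\, q_\alpha(t) \tag{C.14}
\]

\[
m_\alpha \ddot q_\alpha(t) + m_\alpha\omega_\alpha^2\, q_\alpha(t) = g_\alpha\, x(t). \tag{C.15}
\]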

One way to eliminate the bath degrees of freedom in eq.(C.14) is to use a Fourier transform by which the differential equations become algebraic ones. Thus eq.(C.15) leads to the relation

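\[
q_\alpha(\omega) = \frac{g_\alpha\, x(\omega)}{m_\alpha(\omega_\alpha^2 - \omega^2)} \tag{C.16}
\]

which can be used to eliminate $q_\alpha(\omega)$ in the Fourier transform of eq. (C.14) to get

\[
0 = -m\omega^2 x(\omega) + m\omega_0^2\, x(\omega) - \sum_\alpha \frac{g_\alpha^2}{m_\alpha(\omega_\alpha^2 - \omega^2)}\, x(\omega). \tag{C.17}
\]

To go back to the time domain we need to calculate (by the residue theorem)

\[
\int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{e^{-i\omega t}}{\omega_\alpha^2 - \omega^2} = -\lim_{\epsilon \to 0^+} \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{e^{-i\omega t}}{(\omega - \omega_\alpha + i\epsilon)(\omega + \omega_\alpha + i\epsilon)} = \theta(t)\, \frac{\sin(\omega_\alpha t)}{\omega_\alpha} \tag{C.18}
\]

where both poles of the integrand have to be shifted to the lower half of the complex plane in order to get a "causal" result including the factor $\theta(t)$. Thus the Fourier (back) transform of eq. (C.17) is given by

\[
\begin{aligned}
0 &= m\ddot x(t) - \int_{-\infty}^{t} dt' \sum_\alpha \frac{g_\alpha^2}{m_\alpha\omega_\alpha}\, \sin(\omega_\alpha(t - t'))\, x(t') + m\omega_0^2\, x(t) \\
&= m\ddot x(t) + \int_{-\infty}^{t} dt'\, \zeta(t - t')\, \dot x(t') + \left( m\omega_0^2 - \zeta(0) \right) x(t)
\end{aligned}
\tag{C.19}
\]

where we have integrated by parts to recover the well–known result of the damped harmonic oscillator [73, 74] with the dynamical friction kernel $\zeta(t)$ given by

\[
\zeta(t) = \sum_\alpha \frac{g_\alpha^2}{m_\alpha\omega_\alpha^2}\, \cos(\omega_\alpha t) = \frac{2}{\pi} \int_0^\infty d\omega\, \frac{J_R(\omega)}{\omega}\, \cos(\omega t). \tag{C.20}
\]

The inverse relation of this Fourier cosine transform is given by [38]

\[
J_R(\omega) = \omega \int_0^\infty dt\, \zeta(t)\, \cos(\omega t). \tag{C.21}
\]

These are the results quoted in the main text that represent (together with eq. (C.12)) the link between the classical dynamical friction kernel $\zeta(t)$ and the kernel $\Gamma(\tau)$ of the influence functional in the quantum mechanical treatment.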
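As an illustration of the relation (C.21), one can evaluate the cosine transform numerically for a concrete friction kernel. The exponentially decaying (Drude-type) kernel $\zeta(t) = \zeta_0\,\omega_D\, e^{-\omega_D t}$ used below is an assumption chosen for this sketch and is not taken from the text; for it, (C.21) gives $J_R(\omega) = \zeta_0\,\omega\,\omega_D^2 / (\omega_D^2 + \omega^2)$, and inserting this result into (C.20) returns the same kernel. The function names and parameter values are hypothetical.

```python
import math


def zeta(t, zeta0=1.0, omega_d=2.0):
    """Assumed Drude-type friction kernel zeta(t) = zeta0 * omega_d * exp(-omega_d * t)."""
    return zeta0 * omega_d * math.exp(-omega_d * t)


def spectral_density_numeric(omega, zeta0=1.0, omega_d=2.0, t_max=30.0, steps=60000):
    """Evaluate (C.21) with the midpoint rule on the interval [0, t_max]."""
    dt = t_max / steps
    acc = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        acc += zeta(t, zeta0, omega_d) * math.cos(omega * t)
    return omega * acc * dt


def spectral_density_exact(omega, zeta0=1.0, omega_d=2.0):
    """Analytic cosine transform of the assumed kernel."""
    return zeta0 * omega * omega_d ** 2 / (omega_d ** 2 + omega ** 2)
```

The cutoff `t_max` only has to be large compared to $1/\omega_D$, since the integrand decays exponentially.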

C.3 Correlation Function for the Tagged Oscillator

In this appendix we use the influence functional (C.11) to derive an analytic expression for the displacement correlation function $R^2(\tau)$ of a tagged harmonic oscillator linearly coupled to an environment that is described by the Caldeira–Leggett model. The action of this system is given by

\[
S[\dot x, x] = \int_0^\beta d\tau \left[ \frac{m}{2} \dot x^2(\tau) + \frac{m\omega^2}{2} x^2(\tau) \right] - \frac{1}{2} \int_0^\beta d\tau \int_0^\beta d\tau'\, x(\tau)\, \Gamma_R(\tau - \tau')\, x(\tau'). \tag{C.22}
\]

More convenient for our calculations is the Matsubara representation

\[
S[\{x_n\}] = \frac{1}{2\beta} \sum_{n=-\infty}^{\infty} \left( m\nu_n^2 + m\omega^2 - \Gamma_n \right) |x_n|^2 \tag{C.23}
\]

where $\Gamma_n$ are the Matsubara coefficients of $\Gamma_R$ that can be read off from the generalization of eq. (C.9) for a bath of oscillators

\[
\Gamma_n = \sum_\alpha \frac{g_\alpha^2}{m_\alpha(\nu_n^2 + \omega_\alpha^2)} = \frac{2}{\pi} \int_0^\infty d\omega\, \frac{\omega}{\nu_n^2 + \omega^2}\, J_R(\omega). \tag{C.24}
\]

From the action (C.23) one can easily compute the expectation values

\[
\langle x_n x_{n'}^{*} \rangle = \frac{\beta}{m\nu_n^2 + m\omega^2 - \Gamma_n}\, \delta_{n,n'} \tag{C.25}
\]

from which the following explicit expression for the displacement correlation function results

\[
\begin{aligned}
R^2(\tau) = \langle |x(-i\tau) - x|^2 \rangle &= \frac{4}{\beta^2} \sum_{n=1}^{\infty} \langle |x_n|^2 \rangle \left( 1 - \cos(\nu_n\tau) \right) \\
&= \sum_{n=1}^{\infty} \frac{4 \left( 1 - \cos(\nu_n\tau) \right)}{\beta \left( m\nu_n^2 + m\omega^2 - \Gamma_n \right)}.
\end{aligned}
\tag{C.26}
\]

In the main text the model is specified in terms of the classical dynamical friction kernel $\zeta(t)$ rather than in terms of the kernel $\Gamma(\tau)$ of the influence functional. Therefore we give here the explicit representation of the Matsubara coefficients $\Gamma_n$ in terms of (the Laplace transform of) $\zeta(t)$. Starting from the expression (C.24) of $\Gamma_n$ in terms of the spectral function $J_R(\omega)$ we use the relation (C.21) to get

\[
\begin{aligned}
\Gamma_n &= \int_0^\infty dt\, \zeta(t)\, \frac{2}{\pi} \int_0^\infty d\omega\, \frac{\omega^2}{\nu_n^2 + \omega^2}\, \cos(\omega t) = \int_0^\infty dt\, \zeta(t) \left( \delta(t) - |\nu_n|\, e^{-|\nu_n| t} \right) \\
&= \zeta(0) - |\nu_n|\, \hat\zeta(|\nu_n|)
\end{aligned}
\tag{C.27}
\]

where $\hat\zeta(z)$ denotes the (one–sided) Laplace transform of the friction kernel $\zeta(t)$

\[
\hat\zeta(z) = \int_0^\infty dt\, \zeta(t)\, e^{-zt}. \tag{C.28}
\]

Appendix D

Representation of Operators

D.1 The Charge Shift Operator

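Starting from the canonical commutation relation $[n, \varphi] = i$ we prove in this appendix that the operator $e^{-iN\varphi}$, $N \in \mathbb{Z}$, applied to an eigenstate of the number operator $n$ changes the eigenvalue by $N$. The first step is to prove by induction that

\[
n\varphi^k = \varphi^k n + k i \varphi^{k-1}. \tag{D.1}
\]

For $k = 1$ this is just the commutation relation and hence correct. Once it has been proven up to a given power $k$ it must also be true for $k + 1$ since

\[
n\varphi^{k+1} = \varphi^k n \varphi + k i \varphi^k = \varphi^{k+1} n + i\varphi^k + k i \varphi^k = \varphi^{k+1} n + (k+1) i \varphi^k. \tag{D.2}
\]

By induction eq. (D.1) has been proven for all powers $k$. Expanding the charge shift operator one gets

\[
\begin{aligned}
e^{iN\varphi}\, n\, e^{-iN\varphi} &= e^{iN\varphi}\, n \sum_{k=0}^{\infty} \frac{(-iN\varphi)^k}{k!} = e^{iN\varphi} \sum_{k=0}^{\infty} \frac{(-iN)^k}{k!} \left[ \varphi^k n + i k \varphi^{k-1} \right] \\
&= n + e^{iN\varphi}\, N \sum_{k=1}^{\infty} \frac{(-iN)^{k-1}}{(k-1)!}\, \varphi^{k-1} = n + N.
\end{aligned}
\tag{D.3}
\]

D.2 The Current Operator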

The first step in the calculation of the conductance of a metallic SET or a quantum dot from the Kubo formula is to derive a formal expression for the current operator. Since the derivation in both cases is analogous, we will use the more general notation appropriate for the description of semiconductor quantum dots. The Hamiltonian of a quantum dot coupled to source (j = 0) and drain (j = 1) is given by

\[
H = H_D + H_L + H_T \tag{D.4}
\]

with

\[
H_D = \sum_{\alpha\alpha'} \psi_\alpha^{\dagger}\, \epsilon_{\alpha\alpha'}\, \psi_{\alpha'} + \frac{1}{2} \sum_{\alpha\alpha', \beta\beta'} \psi_\alpha^{\dagger} \psi_{\alpha'}\, V_{\alpha\alpha'\beta\beta'}\, \psi_\beta^{\dagger} \psi_{\beta'} \tag{D.5}
\]


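\[
H_L = \sum_{j=0,1} \sum_{\gamma\gamma'} \varphi_{j\gamma}^{\dagger}\, E^{j}_{\gamma\gamma'}\, \varphi_{j\gamma'} \tag{D.6}
\]

\[
H_T = \sum_{j=0,1} \sum_{\alpha,\gamma} \left[ \varphi_{j\gamma}^{\dagger}\, t^{j}_{\gamma\alpha}\, \psi_\alpha + \psi_\alpha^{\dagger}\, t^{j*}_{\gamma\alpha}\, \varphi_{j\gamma} \right] \tag{D.7}
\]

where $\psi_\alpha^{\dagger}$ and $\psi_\alpha$ denote the creation and annihilation operators for an electron in state $|\alpha\rangle$ on the dot/island, while $\varphi_{j\gamma}^{\dagger}$ and $\varphi_{j\gamma}$ are the corresponding operators for lead $j$. In the case of the metallic SET the interaction term in $H_D$ can be omitted and the representation of the energy matrix elements is diagonal in the basis given by the longitudinal wave vector and the channel index. In the steady state the DC current through the dot must be the same across both tunnel junctions. The number of electrons in the source is given by

\[
n_0 = \sum_\gamma \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma}. \tag{D.8}
\]

The current across the tunnel junction between the source and the dot can be calculated from the change of the number of electrons in the source

\[
\frac{dn_0}{dt} = -i\, [n_0, H] = -i\, [n_0, H_T] \qquad (\hbar = 1) \tag{D.9}
\]

where we have used the fact that $n_0$ commutes with the Hamiltonian $H_D$ of the dot as well as with that of the leads, $H_L$. Before we continue to calculate the remaining commutator of $n_0$ with $H_T$ we need the following two results

\[
\begin{aligned}
\left[ \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma},\, \varphi_{j\gamma'} \right] &= \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma} \varphi_{j\gamma'} - \varphi_{j\gamma'} \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma} \\
&= \varphi_{0\gamma}^{\dagger} \left[ \varphi_{0\gamma},\, \varphi_{j\gamma'} \right]_{+} - \delta_{j0}\, \delta_{\gamma\gamma'}\, \varphi_{0\gamma} \\
&= -\delta_{j0}\, \delta_{\gamma\gamma'}\, \varphi_{0\gamma}
\end{aligned}
\tag{D.10}
\]

\[
\begin{aligned}
\left[ \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma},\, \varphi_{j\gamma'}^{\dagger} \right] &= \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma} \varphi_{j\gamma'}^{\dagger} - \varphi_{j\gamma'}^{\dagger} \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma} \\
&= -\left[ \varphi_{0\gamma}^{\dagger},\, \varphi_{j\gamma'}^{\dagger} \right]_{+} \varphi_{0\gamma} + \delta_{j0}\, \delta_{\gamma\gamma'}\, \varphi_{0\gamma}^{\dagger} \\
&= \delta_{j0}\, \delta_{\gamma\gamma'}\, \varphi_{0\gamma}^{\dagger}.
\end{aligned}
\tag{D.11}
\]

We can now continue our calculation of the change of the number of charges on the source electrode

\[
\begin{aligned}
\frac{dn_0}{dt} &= -i\, [n_0, H_T] \\
&= -i \sum_\gamma \Big[ \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma},\, \sum_{\alpha j \gamma'} \left( \varphi_{j\gamma'}^{\dagger}\, t^{j}_{\gamma'\alpha}\, \psi_\alpha + \psi_\alpha^{\dagger}\, t^{j*}_{\gamma'\alpha}\, \varphi_{j\gamma'} \right) \Big] \\
&= -i \sum_{\alpha j \gamma\gamma'} \left\{ \left[ \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma},\, \varphi_{j\gamma'}^{\dagger} \right] t^{j}_{\gamma'\alpha}\, \psi_\alpha + \psi_\alpha^{\dagger}\, t^{j*}_{\gamma'\alpha} \left[ \varphi_{0\gamma}^{\dagger} \varphi_{0\gamma},\, \varphi_{j\gamma'} \right] \right\} \\
&= -i \sum_{\alpha\gamma} \left\{ \varphi_{0\gamma}^{\dagger}\, t^{0}_{\gamma\alpha}\, \psi_\alpha - \psi_\alpha^{\dagger}\, t^{0*}_{\gamma\alpha}\, \varphi_{0\gamma} \right\}.
\end{aligned}
\tag{D.12}
\]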
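The commutators (D.10) and (D.11) and the result (D.12) can be verified on the smallest non-trivial example: one source state and one dot state. The sketch below is an illustration under assumptions that are not part of the text: the fermion operators are represented by $4 \times 4$ matrices through a Jordan–Wigner construction, the tunnel amplitude `t` is an arbitrary complex number, and all helper names are hypothetical.

```python
def matmul(x, y):
    """Product of two matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def kron(x, y):
    """Kronecker (tensor) product of two matrices."""
    return [[x[i][j] * y[k][l] for j in range(len(x[0])) for l in range(len(y[0]))]
            for i in range(len(x)) for k in range(len(y))]


def dagger(x):
    """Hermitian conjugate of a matrix."""
    return [[complex(x[j][i]).conjugate() for j in range(len(x))]
            for i in range(len(x[0]))]


def lin(a, x, b, y):
    """Linear combination a*x + b*y of two matrices."""
    return [[a * x[i][j] + b * y[i][j] for j in range(len(x[0]))]
            for i in range(len(x))]


def comm(x, y):
    """Commutator [x, y]."""
    return lin(1, matmul(x, y), -1, matmul(y, x))


def max_abs(x):
    """Largest absolute value of the matrix elements."""
    return max(abs(e) for row in x for e in row)


LOWER = [[0, 1], [0, 0]]       # single-mode annihilator in the basis (|0>, |1>)
SIGMA_Z = [[1, 0], [0, -1]]
ID2 = [[1, 0], [0, 1]]

phi = kron(LOWER, ID2)         # annihilator of the single source state
psi = kron(SIGMA_Z, LOWER)     # annihilator of the single dot state (with sign string)

t = 0.3 + 0.4j                 # assumed tunnel amplitude
n0 = matmul(dagger(phi), phi)
hop_in = matmul(dagger(phi), psi)      # phi^dagger psi
hop_out = matmul(dagger(psi), phi)     # psi^dagger phi
H_T = lin(t, hop_in, t.conjugate(), hop_out)
expected = lin(t, hop_in, -t.conjugate(), hop_out)   # braces of (D.12)

res_d10 = max_abs(lin(1, comm(n0, phi), 1, phi))                    # [n0, phi] + phi
res_d11 = max_abs(lin(1, comm(n0, dagger(phi)), -1, dagger(phi)))   # [n0, phi^+] - phi^+
res_d12 = max_abs(lin(1, comm(n0, H_T), -1, expected))              # [n0, H_T] - expected
anti = max_abs(lin(1, matmul(phi, psi), 1, matmul(psi, phi)))       # {phi, psi}
```

All four residuals should vanish, which confirms that $[n_0, H_T]$ equals the expression in braces in the last line of (D.12) for this two-state example.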

Analogously one finds for the change of the number of charges on the drain electrode

\[
\frac{dn_1}{dt} = -i \sum_{\alpha\gamma} \left\{ \varphi_{1\gamma}^{\dagger}\, t^{1}_{\gamma\alpha}\, \psi_\alpha - \psi_\alpha^{\dagger}\, t^{1*}_{\gamma\alpha}\, \varphi_{1\gamma} \right\}. \tag{D.13}
\]

If a DC voltage is applied to the SET or quantum dot, the current across both junctions must be equal. Hence we have the freedom to define the current operator as a linear combination

\[
\begin{aligned}
I &= -e \left[ w_0 \frac{dn_0}{dt} - w_1 \frac{dn_1}{dt} \right] \\
&= \sum_{\alpha j \gamma} \left[ \varphi_{j\gamma}^{\dagger} \left( (-1)^j\, i e\, w_j\, t^{j}_{\gamma\alpha} \right) \psi_\alpha + \psi_\alpha^{\dagger} \left( (-1)^j\, i e\, w_j\, t^{j}_{\gamma\alpha} \right)^{*} \varphi_{j\gamma} \right]
\end{aligned}
\tag{D.14}
\]

with $w_0 + w_1 = 1$. From an aesthetic point of view one might like to choose $w_0 = w_1 = 1/2$ to get a symmetric expression for the current operator. The practical calculations (see 6.3.3) show that the weights $w_0$ and $w_1$ should be chosen as

\[
w_0 = \frac{g_1}{g} \quad \text{and} \quad w_1 = \frac{g_0}{g} \quad \text{with} \quad g = g_0 + g_1 \tag{D.15}
\]

where $g_0$ and $g_1$ are the (dimensionless) tunnel conductances of the junctions, i.e. the weighting for the current operator should reflect the symmetry (or asymmetry) of the device.

Appendix E

Electrostatics of Quantum Dots

E.1 Formal Solution of the Dirichlet Problem

In this appendix we will consider the boundary value problem

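\[
\nabla \cdot \left[ \epsilon_0 \epsilon(\vec r)\, \nabla \varphi(\vec r) \right] = -\rho(\vec r) \tag{E.1}
\]

\[
\varphi(\vec r) = \Phi_S(\vec r) \quad \text{for } \vec r \in S = \partial V \tag{E.2}
\]

that determines the electrostatic potential $\varphi(\vec r)$ inside the volume $V$ from the charge density $\rho(\vec r)$ and its value $\Phi_S(\vec r)$ on the surrounding surface $S = \partial V$. We show that the general solution of this problem can be expressed by the electrostatic Green's function $G(\vec r, \vec r\,')$ for the given geometry. The Green's function is defined by the properties

\[
\nabla \cdot \left[ \epsilon_0 \epsilon(\vec r)\, \nabla G(\vec r, \vec r\,') \right] = -\delta(\vec r - \vec r\,') \tag{E.3}
\]

\[
G(\vec r, \vec r\,') = 0 \quad \text{for } \vec r \in S = \partial V. \tag{E.4}
\]

Physically it can be interpreted as the potential created by a unit point charge located at $\vec r\,'$ when the surrounding surface $S$ is grounded. We start from a generalization of Green's theorem for a dielectrically inhomogeneous medium [104]

\[
\begin{aligned}
&\int_V d\vec r\,' \left\{ \psi(\vec r\,')\, \nabla' \cdot \left[ \epsilon_0 \epsilon(\vec r\,')\, \nabla' \varphi(\vec r\,') \right] - \varphi(\vec r\,')\, \nabla' \cdot \left[ \epsilon_0 \epsilon(\vec r\,')\, \nabla' \psi(\vec r\,') \right] \right\} \\
&\qquad = \int_{\partial V} d\vec S\,' \cdot \epsilon_0 \epsilon(\vec r\,') \left\{ \psi(\vec r\,')\, \nabla' \varphi(\vec r\,') - \varphi(\vec r\,')\, \nabla' \psi(\vec r\,') \right\}.
\end{aligned}
\tag{E.5}
\]

If one chooses $\psi(\vec r\,') = G(\vec r\,', \vec r)$ as the solution of the boundary value problem (E.3) and inserts the defining properties (E.1) of the potential $\varphi(\vec r\,')$, one directly gets the following representation

\[
\varphi(\vec r) = \int_V d\vec r\,'\, G(\vec r\,', \vec r)\, \rho(\vec r\,') - \int_S d\vec S\,' \cdot \epsilon_0 \epsilon(\vec r\,')\, \Phi_S(\vec r\,')\, \nabla' G(\vec r\,', \vec r). \tag{E.6}
\]

For a volume surrounded by $N_s$ metallic electrodes such that the surface $S$ is partitioned into the surfaces $S_n$, $n = 1, \ldots, N_s$ of the electrodes, the function $\Phi_S(\vec r)$ determining the value of the potential at the boundaries is piecewise constant and can be given as

\[
\Phi_S(\vec r) = \sum_{n=1}^{N_s} \theta_n(\vec r)\, V_n \quad \text{for } \vec r \in S \tag{E.7}
\]

where $V_n$ is the voltage on the n–th electrode and $\theta_n(\vec r)$ is the characteristic function of the n–th electrode, i.e.

\[
\theta_n(\vec r) = \begin{cases} 1 & \text{for } \vec r \in S_n \\ 0 & \text{for } \vec r \notin S_n. \end{cases} \tag{E.8}
\]

In this case the result for the electrostatic potential can be simplified to

\[
\varphi(\vec r) = \int_V d\vec r\,'\, G(\vec r\,', \vec r)\, \rho(\vec r\,') - \sum_{n=1}^{N_s} V_n\, \alpha_n(\vec r) \tag{E.9}
\]

where

\[
\alpha_n(\vec r) = \int_{S_n} d\vec S\,' \cdot \epsilon_0 \epsilon(\vec r\,')\, \nabla' G(\vec r\,', \vec r) \tag{E.10}
\]

is the surface charge on the n–th electrode that is induced by a unit point charge located at $\vec r$.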
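The structure of (E.9) and (E.10) can be illustrated with a one-dimensional analogue, which is an assumption made for this sketch only: an interval $[0, L]$ with a constant dielectric factor, two point-like "electrodes" at $x = 0$ and $x = L$, and a uniform charge density. In this case the Dirichlet Green's function is $G(x', x) = x_{<}(L - x_{>})/(\epsilon_0\epsilon L)$ and the exact potential is known in closed form. All names and numbers below are hypothetical.

```python
L = 1.0                 # length of the interval (assumed)
EPS = 2.0               # constant value of eps0 * eps (assumed)
RHO0 = 3.0              # uniform charge density (assumed)
V0, VL = 0.5, -1.2      # voltages on the electrodes at x = 0 and x = L (assumed)


def green(xp, x):
    """Dirichlet Green's function of EPS * G'' = -delta on [0, L]."""
    lo, hi = min(x, xp), max(x, xp)
    return lo * (L - hi) / (EPS * L)


def alpha_left(x, h=1e-6):
    """1D version of (E.10) at x' = 0, where the outward normal points in the -x' direction."""
    return -EPS * (green(h, x) - green(0.0, x)) / h


def alpha_right(x, h=1e-6):
    """1D version of (E.10) at x' = L, where the outward normal points in the +x' direction."""
    return EPS * (green(L, x) - green(L - h, x)) / h


def potential(x, n=4000):
    """1D version of (E.9): volume term by the midpoint rule minus the electrode terms."""
    dx = L / n
    volume = sum(green((i + 0.5) * dx, x) * RHO0 for i in range(n)) * dx
    return volume - V0 * alpha_left(x) - VL * alpha_right(x)


def potential_exact(x):
    """Closed-form solution of EPS * phi'' = -RHO0 with phi(0) = V0 and phi(L) = VL."""
    return RHO0 * x * (L - x) / (2.0 * EPS) + V0 * (1.0 - x / L) + VL * x / L
```

In this example the two induced charges are negative and add up to $-1$, as expected for grounded electrodes that completely enclose a unit point charge, and `potential` reproduces `potential_exact` up to the quadrature error.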

E.2 Green’s Function for a Cylindrical Dot

The general solution of the electrostatic boundary value problem can be given by use of the Green's function for the given geometry. The Green's function gives the potential at $\vec r$ that is created by a point charge at $\vec r\,'$ surrounded by a conducting cylinder with radius $a$ and height $L$. Thus it is a solution of the Poisson equation

\[
\Delta G(\vec r, \vec r\,') = -\frac{1}{\varepsilon_0 \varepsilon}\, \delta(\vec r - \vec r\,'). \tag{E.11}
\]

In view of the symmetry of the problem we formulate it in cylindrical coordinates

\[
\left[ \partial_\rho^2 + \frac{1}{\rho} \partial_\rho + \frac{1}{\rho^2} \partial_\varphi^2 + \partial_z^2 \right] G(\rho\varphi z; \rho'\varphi' z') = -\frac{1}{\varepsilon_0 \varepsilon \rho}\, \delta(\rho - \rho')\, \delta(\varphi - \varphi')\, \delta(z - z'). \tag{E.12}
\]

The point charge at $\vec r\,'$ is surrounded by a conducting cylinder. Consequently the Green's function vanishes on the shell of the cylinder defined by $\rho = a$ and on the base at $z = 0$ and top at $z = L$, i.e.

\[
G(\rho\varphi z; \rho'\varphi' z') = 0 \quad \text{for } \rho = a \,\vee\, z = 0 \,\vee\, z = L. \tag{E.13}
\]

It is further symmetric with respect to the arguments $\vec r$ and $\vec r\,'$, i.e.

\[
G(\rho\varphi z; \rho'\varphi' z') = G(\rho'\varphi' z'; \rho\varphi z). \tag{E.14}
\]

We use the $2\pi$ periodicity of the Green's function in $\varphi$ to expand $G(\vec r, \vec r\,')$ in a Fourier series in $\varphi$. Further, we can use the boundary conditions at $z = 0$ and $z = L$ to continue the Green's function as an odd function on $[-L, L]$ and write it as a Fourier sine series. Thus we get

\[
G(\rho\varphi z; \rho'\varphi' z') = \sum_{n=1}^{\infty} \frac{2}{L} \sin\!\left( \frac{n\pi}{L} z \right) \sum_{m=-\infty}^{\infty} \frac{1}{2\pi}\, e^{im\varphi}\, A_{nm}(\rho; \rho'\varphi' z'). \tag{E.15}
\]

The $\delta$-functions on the right hand side of eq. (E.12) can be expressed as

\[
\delta(\varphi - \varphi') = \frac{1}{2\pi} \sum_{m=-\infty}^{\infty} e^{im(\varphi - \varphi')} \tag{E.16}
\]

\[
\delta(z - z') = \frac{2}{L} \sum_{n=1}^{\infty} \sin\!\left( \frac{n\pi}{L} z \right) \sin\!\left( \frac{n\pi}{L} z' \right). \tag{E.17}
\]

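Inserting eqs. (E.15), (E.16) and (E.17) into the differential equation (E.12), we get with the abbreviation $k_n = \frac{n\pi}{L}$

\[
\begin{aligned}
&\frac{1}{\pi L} \sum_{n=1}^{\infty} \sum_{m=-\infty}^{\infty} \sin(k_n z)\, e^{im\varphi} \left[ \partial_\rho^2 + \frac{1}{\rho} \partial_\rho - \frac{m^2}{\rho^2} - k_n^2 \right] A_{nm}(\rho; \rho'\varphi' z') \\
&\qquad = \frac{1}{\pi L} \sum_{n=1}^{\infty} \sum_{m=-\infty}^{\infty} \sin(k_n z)\, e^{im\varphi} \left[ -\frac{1}{\varepsilon_0 \varepsilon \rho}\, \delta(\rho - \rho')\, \sin(k_n z')\, e^{-im\varphi'} \right].
\end{aligned}
\tag{E.18}
\]

Inserting the ansatz

\[
A_{nm}(\rho; \rho'\varphi' z') = \frac{1}{\varepsilon_0 \varepsilon}\, \sin(k_n z')\, e^{-im\varphi'}\, g_{nm}(\rho; \rho') \tag{E.19}
\]

into eq. (E.18) we get from a comparison of coefficients the following differential equation for the radial Green's function $g_{nm}(\rho; \rho')$

\[
\left[ \partial_\rho^2 + \frac{1}{\rho} \partial_\rho - \left( \frac{m^2}{\rho^2} + k_n^2 \right) \right] g_{nm}(\rho; \rho') = -\frac{1}{\rho}\, \delta(\rho - \rho'). \tag{E.20}
\]

To simplify the notation we introduce $x = k_n \rho$ and $R = k_n a$ and use $\partial_\rho = k_n \partial_x$ and $\delta(\rho - \rho') = k_n\, \delta(x - x')$ to express eq. (E.20) for the radial Green's function as

\[
\left[ k_n^2 \partial_x^2 + \frac{k_n}{x} k_n \partial_x - \left( \frac{k_n^2 m^2}{x^2} + k_n^2 \right) \right] g_{nm}(x; x') = -\frac{k_n}{x}\, k_n\, \delta(x - x') \tag{E.21}
\]

\[
\Rightarrow \quad \left[ \partial_x^2 + \frac{1}{x} \partial_x - \left( 1 + \frac{m^2}{x^2} \right) \right] g_{nm}(x; x') = -\frac{1}{x}\, \delta(x - x') \tag{E.22}
\]

\[
\Rightarrow \quad \partial_x \left( x\, \partial_x g_{nm}(x; x') \right) - \left( x + \frac{m^2}{x} \right) g_{nm}(x; x') = -\delta(x - x'). \tag{E.23}
\]

A fundamental system of solutions of the corresponding homogeneous differential equation is given by the modified Bessel functions $I_m(x)$ and $K_m(x)$. Thus we can make the following ansatz for the solution of the inhomogeneous equation.

\[
g_{nm}(x; x') = \begin{cases} a_1(x')\, I_m(x) + b_1(x')\, K_m(x) & x < x' \\ a_2(x')\, I_m(x) + b_2(x')\, K_m(x) & x > x' \end{cases} \tag{E.24}
\]

Since $g_{nm}$ has to be regular at $x = 0$, one gets $b_1(x') \equiv 0$. From the boundary condition $g_{nm}(x'; R) = 0$ one further concludes the following conditions for the coefficients $a_2(x')$ and $b_2(x')$

\[
a_2(x')\, I_m(R) + b_2(x')\, K_m(R) = 0 \quad \forall\, x' < R \tag{E.25}
\]

\[
\Rightarrow \quad a_2(x') = -\frac{K_m(R)}{I_m(R)}\, b_2(x') \quad \forall\, x' < R. \tag{E.26}
\]

Inserting these results into our ansatz for the radial Green's function we get

\[
g_{nm}(x; x') = \begin{cases} a_1(x')\, I_m(x) & x < x' \\ b_2(x') \left( K_m(x) - \dfrac{K_m(R)}{I_m(R)}\, I_m(x) \right) & x > x'. \end{cases} \tag{E.27}
\]

We further use the symmetry relation $g_{nm}(x'; x) = g_{nm}(x; x')$ to conclude

\[
a_1(x')\, I_m(x) = b_2(x) \left( K_m(x') - \frac{K_m(R)}{I_m(R)}\, I_m(x') \right) \tag{E.28}
\]

\[
\Rightarrow \quad \frac{I_m(x)}{b_2(x)} = \frac{1}{a_1(x')} \left( K_m(x') - \frac{K_m(R)}{I_m(R)}\, I_m(x') \right) \equiv \frac{1}{A}. \tag{E.29}
\]

Thus we have determined the coefficients up to a constant factor $A$

\[
a_1(x') = A \left( K_m(x') - \frac{K_m(R)}{I_m(R)}\, I_m(x') \right) \tag{E.30}
\]

\[
b_2(x') = A\, I_m(x'). \tag{E.31}
\]

The Green's function now has been simplified to

\[
g_{nm}(x; x') = A\, \frac{I_m(x_{<})}{I_m(R)} \left[ I_m(R)\, K_m(x_{>}) - K_m(R)\, I_m(x_{>}) \right] \tag{E.32}
\]

where $x_{<}$ is the smaller and $x_{>}$ the larger of the values $x$ and $x'$. For the normalization $A$ one has to determine the jump of the derivative of $g_{nm}(x; x')$ at $x = x'$ from the differential equation (E.23). We integrate eq. (E.23) from $x = x' - \varepsilon$ to $x = x' + \varepsilon$ and consider the limit $\varepsilon \to 0$. First we get

\[
(x' + \varepsilon)\, \partial_x g_{nm}(x' + \varepsilon; x') - (x' - \varepsilon)\, \partial_x g_{nm}(x' - \varepsilon; x') - \int_{x' - \varepsilon}^{x' + \varepsilon} dx \left( x + \frac{m^2}{x} \right) g_{nm}(x; x') = -1. \tag{E.33}
\]

We sort the first two expressions and use integration by parts in the third term to get

\[
\begin{aligned}
&x' \left[ \partial_x g_{nm}(x' + \varepsilon; x') - \partial_x g_{nm}(x' - \varepsilon; x') \right] + \varepsilon \left[ \partial_x g_{nm}(x' + \varepsilon; x') + \partial_x g_{nm}(x' - \varepsilon; x') \right] \\
&\qquad - \left( x + \frac{m^2}{x} \right) G_{nm}(x; x') \Big|_{x = x' - \varepsilon}^{x = x' + \varepsilon} + \int_{x' - \varepsilon}^{x' + \varepsilon} dx \left( 1 - \frac{m^2}{x^2} \right) G_{nm}(x; x') = -1.
\end{aligned}
\tag{E.34}
\]

Since the integral $G_{nm}$ of $g_{nm}$ is continuous at $x = x'$, in the limit $\varepsilon \to 0$ only the first term remains on the left hand side and we get the following condition for the discontinuity

\[
\partial_x g_{nm}(x'^{+}; x') - \partial_x g_{nm}(x'^{-}; x') = -\frac{1}{x'}. \tag{E.35}
\]

Inserting eq. (E.32) into eq. (E.35) we get

\[
A\, \frac{I_m(x')}{I_m(R)} \left[ I_m(R)\, K_m'(x') - K_m(R)\, I_m'(x') \right] - A\, \frac{I_m'(x')}{I_m(R)} \left[ I_m(R)\, K_m(x') - K_m(R)\, I_m(x') \right] = -\frac{1}{x'}. \tag{E.36}
\]

This expression can be further simplified, and using the relation $I_m(x') K_m'(x') - I_m'(x') K_m(x') = -1/x'$ for the Wronski determinant of the fundamental system, we finally conclude

\[
A \left[ I_m(x')\, K_m'(x') - I_m'(x')\, K_m(x') \right] = -\frac{1}{x'} \quad \Rightarrow \quad A = 1. \tag{E.37}
\]

Successively inserting the results into eqs. (E.32), (E.19) and (E.15) we get the radial Green's function, the coefficients of the Green's function and finally the Green's function of the Dirichlet boundary value problem inside the cylinder

\[
\begin{aligned}
G(\rho\varphi z; \rho'\varphi' z') = \frac{1}{\pi \varepsilon_0 \varepsilon L} \sum_{n=1}^{\infty} \sum_{m=-\infty}^{\infty} & e^{im(\varphi - \varphi')}\, \sin(k_n z)\, \sin(k_n z') \\
& \times \frac{I_m(k_n \rho_{<})}{I_m(k_n a)} \left[ I_m(k_n a)\, K_m(k_n \rho_{>}) - K_m(k_n a)\, I_m(k_n \rho_{>}) \right].
\end{aligned}
\tag{E.38}
\]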

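To specify the solution of the boundary value problem when a finite voltage is applied to the shell of the cylinder, we need not only the Green's function but also its partial derivative with respect to the radius $\rho$ at the surface. Using again the relation for the Wronski determinant one gets

\[
\begin{aligned}
\partial_\rho G(\rho\varphi z; \rho'\varphi' z') \Big|_{\rho = a} &= \frac{1}{\varepsilon_0 \varepsilon \pi L} \sum_{n=1}^{\infty} \sum_{m=-\infty}^{\infty} e^{im(\varphi - \varphi')}\, \sin(k_n z)\, \sin(k_n z')\, \frac{I_m(k_n \rho')}{I_m(k_n a)} \\
&\qquad \times k_n \left[ I_m(k_n a)\, K_m'(k_n a) - K_m(k_n a)\, I_m'(k_n a) \right] \\
&= -\frac{1}{\varepsilon_0 \varepsilon \pi a L} \sum_{n=1}^{\infty} \sum_{m=-\infty}^{\infty} e^{im(\varphi - \varphi')}\, \sin(k_n z)\, \sin(k_n z')\, \frac{I_m(k_n \rho')}{I_m(k_n a)}.
\end{aligned}
\tag{E.39}
\]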

The derivative determines the lateral confinement via the function αg(~r) that was defined in eq. (7.9). The integral over the surface Sg of the gate electrode can be carried out explicitly

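\[
\begin{aligned}
\alpha_g(\rho\varphi z) &= \int_{S_g} d\vec S\,' \cdot \epsilon_0 \epsilon\, \nabla' G(\vec r\,', \vec r) \\
&= \int_0^{2\pi} d\varphi' \int_0^L dz'\; a\, \epsilon_0 \epsilon\, \partial_{\rho'} G(\rho'\varphi' z', \rho\varphi z) \Big|_{\rho' = a} \\
&= -\frac{1}{\pi L} \sum_{n=1}^{\infty} \sum_{m=-\infty}^{\infty} \sin(k_n z)\, e^{-im\varphi} \int_0^{2\pi} d\varphi'\, e^{im\varphi'} \int_0^L dz'\, \sin(k_n z')\, \frac{I_m(k_n \rho)}{I_m(k_n a)} \\
&= -\frac{2}{L} \sum_{n=1}^{\infty} \sin\!\left( \frac{n\pi}{L} z \right) \frac{L}{n\pi} \left( 1 - \cos(n\pi) \right) \frac{I_0\!\left( \frac{n\pi}{L} \rho \right)}{I_0\!\left( \frac{n\pi}{L} a \right)} \\
&= -\frac{4}{\pi} \sum_{n=0}^{\infty} \frac{\sin\!\left( \frac{(2n+1)\pi}{L} z \right)}{2n+1}\, \frac{I_0\!\left( \frac{(2n+1)\pi}{L} \rho \right)}{I_0\!\left( \frac{(2n+1)\pi}{L} a \right)}.
\end{aligned}
\tag{E.40}
\]

Bibliography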

[1] D. V. Averin and K. K. Likharev, in Mesoscopic Phenomena in Solids, edited by B. L. Altshuler, P. A. Lee, and R. A. Webb (Elsevier, Amsterdam, 1991).

[2] C. Wallisser, B. Limbach, P. vom Stein, R. Schäfer, C. Theis, G. Göppert, and H. Grabert, Phys. Rev. B 66, 125314 (2002).

[3] J. Weis, in Fundamentals of Nanoelectronics, edited by S. Blügel, M. Luysberg, K. Urban, and R. Waser, 34th Spring School of the Department of Solid State Research (Forschungszentrum Jülich, Jülich, 2003), vol. 14 of Matter and Materials, pp. D6.1–D6.33.

[4] R. J. Schoelkopf, P. Wahlgren, A. A. Kozhevnikova, P. Delsing, and D. E. Prober, Science 280, 1238 (1998).

[5] D. Vion, A. Assime, A. Cottet, P. Joyez, H. Pothier, C. Urbina, D. Esteve, and M. H. Devoret, Science 296, 886 (2002).

[6] J. Park, A. N. Pasupathy, J. I. Goldsmith, C. Chang, Y. Yaish, J. R. Petta, M. Rinkoski, J. P. Sethna, H. D. Abruna, P. L. McEuen, et al., Nature 417, 722 (2002).

[7] W. Liang, M. P. Shores, M. Bockrath, J. R. Long, and H. Park, Nature 417, 725 (2002).

[8] J. Nygård, D. H. Cobden, and P. E. Lindelof, Nature 408, 342 (2000).

[9] R. C. Ashoori, Nature 379, 413 (1996).

[10] L. P. Kouwenhoven, D. G. Austing, and S. Tarucha, Rep. Prog. Phys. 64, 701 (2001).

[11] W. Lu, Z. Ji, L. Pfeiffer, K. W. West, and A. J. Rimberg, Nature 423, 422 (2003).

[12] S. M. Cronenwett, T. H. Oosterkamp, and L. P. Kouwenhoven, Science 281, 540 (1998).

[13] T. Hayashi, T. Fujisawa, H. D. Cheong, Y. H. Jeong, and Y. Hirayama, Phys. Rev. Lett. 91, 226804 (2003).

[14] J. M. Elzerman, R. Hanson, J. S. Greidanus, L. H. W. van Beveren, S. D. Franceschi, L. M. K. Vandersypen, S. Tarucha, and L. P. Kouwenhoven, Phys. Rev. B. 67, 161308(R) (2003).

[15] L. M. K. Vandersypen, R. Hanson, L. H. W. van Beveren, J. M. Elzerman, J. S. Greidanus, S. D. Franceschi, and L. P. Kouwenhoven, in Quantum Computing and Quantum Bits in Mesoscopic Systems, edited by A. J. Leggett, B. Ruggiero, and P. Silvestini (Kluwer Academic/Plenum Publishers, 2004), chap. 22.

[16] G. Göppert, B. Hüpper, and H. Grabert, Phys. Rev. B 62, 9955 (2000).

[17] S. Florens, P. S. José, F. Guinea, and A. Georges, Phys. Rev. B 68, 245311 (2003).

[18] P. Joyez, H. Bouchiat, D. Esteve, C. Urbina, and M. H. Devoret, Phys. Rev. Lett. 79, 1349 (1997).

[19] S. Bednarek, B. Szafran, and J. Adamowski, Phys. Rev. B 64, 195303 (2001).

[20] L. L. Chang, L. Esaki, and R. Tsu, Appl. Phys. Lett. 24, 593 (1974).

[21] S. Datta, Electronic transport in mesoscopic systems, vol. 3 of Cambridge studies in semiconductor physics and microelectronic engineering (Cambridge University Press, Cambridge, 1995).

[22] M. Cahay, M. McLennan, S. Datta, and M. S. Lundstrom, Appl. Phys. Lett. 50, 612 (1987).

[23] C. Wallisser, private communication (2003).

[24] L. P. Kouwenhoven, N. C. van der Vaart, A. T. Johnson, W. Kool, C. J. P. M. Harmans, J. G. Williamson, A. A. M. Staring, and C. T. Foxton, Z. Phys. B - Cond. Matter. 85, 367 (1991).

[25] L. I. Glazman and M. E. Raĭkh, JETP Lett. 47, 452 (1988).

[26] T. K. Ng and P. A. Lee, Phys. Rev. Lett. 61, 1768 (1988).

[27] L. P. Kouwenhoven and L. I. Glazman, Physics World pp. 33–38 (2001).

[28] D. Goldhaber-Gordon, H. Shtrikman, D. Mahalu, D. Abusch-Magder, U. Meirav, and M. A. Kastner, Nature 391, 156 (1998).

[29] S. Tarucha, D. G. Austing, S. Sasaki, T. Fujisawa, Y. Tokura, J. Elzerman, W. van der Wiel, S. de Franceschi, and L. P. Kouwenhoven, Mater. Sci. Eng. B 84, 10 (2001).

[30] S. Sasaki, S. de Franceschi, J. M. Elzerman, W. van der Wiel, M. Eto, S. Tarucha, and L. P. Kouwenhoven, Nature 405, 764 (2000).

[31] J. Schmid, J. Weis, K. Eberl, and K. von Klitzing, Phys. Rev. Lett. 84, 5824 (2000).

[32] J. W. Negele and H. Orland, Quantum Many–Particle Systems, vol. 68 of Frontiers in Physics (Addison–Wesley, 1987).

[33] R. Shankar, Rev. Mod. Phys. 66, 129 (1994).

[34] J. Rollbühler, Dissertation, Albert–Ludwigs–Universität, Freiburg (2002).

[35] G. Krilov, E. Sim, and B. J. Berne, J. Chem. Phys. 114, 1075 (2001).

[36] W. Nolting, Grundkurs Theoretische Physik, vol. 7 (Springer, Berlin, Heidelberg, New York, 2002), 5th ed.

[37] F. A. Berezin, The Method of Second Quantization (Academic Press, New York, London, 1965), chap. I.3, pp. 49–86.

[38] I. N. Bronštejn and K. A. Semedjajew, Taschenbuch der Mathematik (Verlag Harri Deutsch, Thun, Frankfurt/Main, 1987), chap. 2, p. 154, 23rd ed.

[39] T. Nishimura and M. Matsumoto, The mersenne twister mt19937, code and information available at http://www.math.keio.ac.jp/matumoto/emt.html (2002).

[40] NAG, Fortran library manual, mark 20, available as pdf under http://www.nag.co.uk (2002).

[41] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C - The Art of Scientific Computing (Cambridge University Press, Cambridge, 1992), 2nd ed.

[42] A. Heuer, B. Dünweg, and A. M. Ferrenberg, Comp. Phys. Comm. 103, 1 (1997).

[43] G. Marsaglia, A. Zaman, and W. W. Tsang, Stat. & Prob. Lett. 9, 35 (1990).

[44] F. James, Comp. Phys. Comm. 79, 111 (1994).

[45] N. C. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953).

[46] D. M. Ceperley, Rev. Mod. Phys. 67, 279 (1995).

[47] R. H. Swendsen, in The Monte Carlo Method in the Physical Sciences, edited by J. E. Gubernatis (AIP, Melville, New York, 2003), vol. 690 of AIP Conference Proceedings, pp. 45–51.

[48] D. Frenkel, in The Monte Carlo Method in the Physical Sciences, edited by J. E. Gubernatis (AIP, Melville, New York, 2003), vol. 690 of AIP Conference Proceedings, pp. 99–109.

[49] W. Janke, in Quantum Simulations of Complex Many–Body Systems: From Theory to Algorithms, edited by J. Grotendorst, D. Marx, and A. Muramatsu (Institute for Computing, Jülich, 2002), vol. 10 of NIC Series, pp. 423–445.

[50] R. M. Fye, Phys. Rev. B 33, 6271 (1986).

[51] K. Binder, in The Monte Carlo Method in the Physical Sciences, edited by J. E. Gubernatis (AIP, Melville, New York, 2003), vol. 690 of AIP Conference Proceedings, pp. 74–84.

[52] M. Suzuki, Commun. Math. Phys. 51, 183 (1976).

[53] H. De Raedt and B. De Raedt, Phys. Rev. A 28, 3575 (1983).

[54] H.-B. Schüttler and D. J. Scalapino, Phys. Rev. B 34, 4744 (1986).

[55] J. E. Gubernatis, M. Jarrell, R. N. Silver, and D. S. Sivia, Phys. Rev. B 44, 6011 (1991).

[56] D. Forster, Hydrodynamic fluctuations, broken symmetry and correlation functions (Benjamin, Reading, 1975).

[57] P. Blanchard and E. Bruning, Distributionen und Hilbertraumoperatoren (Springer, Wien, New York, 1993).

[58] A. K. Louis, Inverse und schlecht gestellte Probleme, Teubner Studienbücher: Mathematik (B. G. Teubner, Stuttgart, 1989).

[59] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, vol. 375 of Mathematics and its applications (Kluwer Academic, Dordrecht, 1996).

[60] B. Hofmann, Mathematik inverser Probleme, Mathematik für Ingenieure und Naturwissenschaftler (B. G. Teubner, Stuttgart, Leipzig, 1999).

[61] F. Natterer, The mathematics of computerized tomography (Wiley and Teubner, Stuttgart, 1986).

[62] I. Csiszár, The Annals of Statistics 19, 2032 (1991).

[63] H. W. Alt, Lineare Funktionalanalysis (Springer, Berlin, Heidelberg, New York, 1992), chap. 9, pp. 286–293, Springer-Lehrbuch, 2nd ed.

[64] G. D. de Villiers, B. McNally, and E. R. Pike, Inverse Problems 15, 615 (1999).

[65] M. Bertero, P. Brianzi, E. R. Pike, and L. Rebolia, Proc. R. Soc. London A 415, 257 (1988).

[66] J. M. Borwein and A. S. Lewis, Math. Program. 57, 49 (1992).

[67] R. T. Rockafellar, Convex Analysis (Princeton University Press, Princeton, NJ, 1970).

[68] B. Hüpper (1999), unpublished.

[69] M. Jarrell and J. E. Gubernatis, Physics Reports 269, 133 (1996).

[70] R. K. Bryan, Eur. Biophys. J. 18, 165 (1990).

[71] E. Gallicchio, S. A. Egorov, and B. J. Berne, J. Chem. Phys. 109, 7754 (1998).

[72] G. Krilov and B. J. Berne, J. Chem. Phys. 111, 9147 (1999).

[73] U. Weiss, Quantum Dissipative Systems, vol. 2 of Modern Condensed Matter Physics (World Scientific, Singapore, 1993).

[74] H. Grabert, P. Schramm, and G.-L. Ingold, Phys. Rep. 168, 115 (1988).

[75] A. O. Caldeira and A. J. Leggett, Phys. Rev. Lett. 46, 211 (1981).

[76] J. E. Straub, M. Borkovec, and B. J. Berne, J. Chem. Phys. 89, 4833 (1988).

[77] NAG, C Library Manual, Mark 7, available as PDF at http://www.nag.co.uk (2002).

[78] D. Kim, J. D. Doll, and D. L. Freeman, J. Chem. Phys. 108, 3871 (1998).

[79] J. König, H. Schoeller, and G. Schön, Phys. Rev. Lett. 78, 4482 (1997).

[80] J. König, H. Schoeller, and G. Schön, Phys. Rev. B 58, 7882 (1998).

[81] G. Göppert and H. Grabert, Eur. Phys. J. B 16, 687 (2000).

[82] I. Giaever and H. R. Zeller, Phys. Rev. Lett. 20, 1504 (1968).

[83] H. R. Zeller and I. Giaever, Phys. Rev. 181, 789 (1969).

[84] T. A. Fulton and G. J. Dolan, Phys. Rev. Lett. 59, 109 (1987).

[85] V. Ambegaokar, U. Eckern, and G. Schön, Phys. Rev. Lett. 48, 1745 (1982).

[86] H. Grabert, Phys. Rev. B 50, 17364 (1994).

[87] I. S. Gradshtein and I. M. Ryzhik, Table of Integrals, Series, and Products (Academic Press Inc., Boston, 1994), chap. 3, p. 352, 5th ed.

[88] G. Göppert, H. Grabert, and C. Beck, Europhys. Lett. 45, 249 (1999).

[89] G. Göppert, Dissertation, Albert–Ludwigs–Universität, Freiburg i. Br. (2000).

[90] C. P. Herrero, G. Schön, and A. D. Zaikin, Phys. Rev. B 59, 5728 (1999).

[91] D. Chouvaev, L. S. Kuzmin, D. S. Golubev, and A. D. Zaikin, Phys. Rev. B 59, 10599 (1999).

[92] G. Göppert and H. Grabert, Phys. Rev. B 58, R10155 (1998).

[93] H. Schoeller, J. König, F. Kuczera, and G. Schön, J. Low Temp. Phys. 118, 409 (2000).

[94] M. E. Levinshtein, S. L. Rumyantsev, and M. Shur, eds., Handbook Series on Semiconductor Parameters, vol. 1 (World Scientific, London, 1996).

[95] M. E. Levinshtein, S. L. Rumyantsev, and M. Shur, eds., Handbook Series on Semiconductor Parameters, vol. 2 (World Scientific, London, 1999).

[96] T. Bronger, Diplomarbeit, RWTH, Aachen (2001).

[97] J. R. Chelikowsky and M. L. Cohen, Phys. Rev. B 14, 556 (1976).

[98] E. F. Schubert, www.lightemittingdiodes.org (Rensselaer Polytechnic Institute, Troy, NY), webpage (2004).

[99] T. Heinzel, Mesoscopic Electronics in Solid State Nanostructures (Wiley-VCH, Weinheim, 2003), 1st ed.

[100] D. V. Averin, A. N. Korotkov, and K. K. Likharev, Phys. Rev. B 44, 6199 (1991).

[101] H. van Houten, C. W. J. Beenakker, and A. A. M. Staring, in Single Charge Tunneling: Coulomb Blockade Phenomena in Nanostructures, edited by H. Grabert and M. H. Devoret (Plenum Press, New York, 1992), vol. 294 of NATO ASI series B: Physics, pp. 167–216.

[102] A. Kumar, S. E. Laux, and F. Stern, Phys. Rev. B 42, 5166 (1990).

[103] L. D. Hallam, J. Weis, and P. A. Maksym, Phys. Rev. B 53, 1452 (1996).

[104] W. R. Smythe, Static and Dynamic Electricity (McGraw–Hill, New York, 1968).

Curriculum Vitae

Personal data:

Christoph Theis
Sundgauallee 39
79114 Freiburg
Born: 27 May 1972
German, single

School education:

1978–1982: Grundschule Monzelfeld
1982–1990: Nikolaus-von-Kues Gymnasium Bernkastel-Kues

Civilian service:

1990–1992: Cusanus-Krankenhaus Bernkastel-Kues

University education:

April 1992: Began studies in physics and mathematics at the Johannes Gutenberg-Universität Mainz.
May 1997: Diploma in physics at the Johannes Gutenberg-Universität Mainz. Thesis: “Modenkopplungsgleichungen für Molekulare Flüssigkeiten”
April 1999: Diploma in mathematics at the Johannes Gutenberg-Universität Mainz. Thesis: “Quasinormalteiler und total vertauschbare Produkte”
July 1999: Began doctoral research at the Albert–Ludwigs–Universität Freiburg in the group of Professor Dr. Hermann Grabert. Thesis: “Conductance of Single Electron Devices from Imaginary–Time Path Integrals”

Acknowledgement

I’m grateful to Prof. Dr. Hermann Grabert for his continuous support and for the opportunity to participate in, and gain insight into, the current development of theoretical methods in the field of nanoelectronics. I appreciate the offer to work in his group at the Albert–Ludwigs–Universität in Freiburg and the further stimulation for my research that I received at national and international conferences.

I thank Dr. Georg Göppert for many helpful and motivating discussions that contributed appreciably to this thesis. Furthermore, I would like to express my gratitude to Dr. Christoph Wallisser and the group of Dr. Roland Schäfer at the Forschungszentrum Karlsruhe for a fruitful collaboration and the opportunity to learn more about the experimental aspects of the single electron transistor. I’m also indebted to them for supplying the experimental data and the picture of the experimental layout.

I also want to thank all my colleagues who worked with me over the past years. In particular, I would like to mention Dr. Boris Reusch, Dr. Jörg Rollbühler and Dr. Markus Saltzer, with whom I have spent many pleasant hours here in Freiburg.

I gratefully acknowledge the help of Dr.–Ing. Bernhard Gimber in proofreading this manuscript and appreciate his suggestions for improving the clarity of the presentation. I also appreciate the comments of Wolfgang Körner on the introductory chapters of this thesis.

Last but not least I would like to express my gratitude to my parents and my girlfriend Isabel for their support, love and patience throughout the last years.