Heat kernels on Riemannian manifolds
Lecture course at the Humboldt-University SS 2019/2020

Batu Güneysu

1. Introduction

Fourier in 1822 was the first to derive the heat equation, in the following context: assume $M \subset \mathbb{R}^3$ is a sufficiently homogeneous body. Then the temperature function $u : (0, \infty) \times M \to [0, \infty)$ of $M$, that is, $u(t, x)$ is the temperature at the time $t$ in $x \in M$, satisfies the heat equation

\[ \partial_t u(t, x) = k \Delta_x u(t, x) \]

if $M$ has no sources or sinks of heat. Above, $k > 0$ is a material constant (heat conductivity constant), and $\Delta = \sum_{j=1}^{3} \partial_j^2$ is the Laplace operator.

The heat equation was also the basis for modern probability theory: in 1827 the botanist Brown was watching small test particles (pollen, ...) suspended in a fluid medium (water, ...) in a body $M \subset \mathbb{R}^3$ and was shocked by the fact that the pollen was moving. Having started with pollen, his first conclusion was that pollen is alive, until he repeated the experiment with other test particles that were not alive. His observations were that the trajectory $X$ of each test particle was random and independent of the trajectory of any other test particle (so without loss of generality we can consider one test particle). This leads to the idea that $X$ should be what we call today a stochastic process, that is, a map

\[ X : [0, \infty) \times (\Omega, \mathscr{F}, P) \to M, \]

where $(\Omega, \mathscr{F}, P)$ is a probability space. Here, the set $\Omega$ contains the random parameters and for each fixed $\omega \in \Omega$, the map $X(\omega) : [0, \infty) \to M$ is called a (random) path of the process. Brown then observed that the expected displacement of the test particle was a decreasing function of its size and of the viscosity of the medium, and increasing with the temperature of the medium. Let

\[ u(\cdot, \cdot, y) : (0, \infty) \times M \to [0, \infty), \quad (t, x) \mapsto u(t, x, y) \]

denote the probability density of $X$, assuming that $X$ starts in some $y \in M$. In other words, the probability of finding $X$ in $A \subset M$ at the time $t$ is given by

\[ P\{X_t \in A\} = \int_A u(t, x, y)\,dx. \]

It was then Einstein who derived in 1905 that

\[ \partial_t u(t, x, y) = D \Delta_x u(t, x, y), \]

where the diffusion constant $D > 0$ of the system is given by

\[ D = \frac{kT}{6\pi\nu R}, \]

where $k$ is the Boltzmann constant, $T$ the temperature of the medium, $\nu$ its viscosity and $R$ the radius of the test particle. Assuming that $u(t, x, y)$ behaves like the Gauss kernel

\[ p : (0, \infty) \times \mathbb{R}^3 \times \mathbb{R}^3 \to [0, \infty), \quad p(t, x, y) := (4\pi t)^{-3/2} e^{-\frac{|x-y|^2}{4Dt}}, \]

which is a solution of the heat equation in $(t, x)$, one easily derives the fundamental relation

\[ \int_\Omega |X_t^j - y^j|^2 \, dP \approx Dt \tag{1} \]

for the average square displacement, which justifies all observations of Brown. The stochastic process underlying the random trajectory of a test particle as above is nowadays called a Brownian motion. Einstein's conclusion was that the medium consists of very small particles (molecules), subject to some random kinematics, which bombard the larger test particles and lead to their random movement. The above fundamental relation (1) was confirmed in an experiment by Perrin in 1908, for which he later received the Nobel Prize. Note that all of this is roughly 20 years before quantum mechanics, and so these results can be thought of as a first confirmation of the atomic structure of matter.

Let us now take a closer look at the properties of the $m$-dimensional Gauss kernel

\[ p : (0, \infty) \times \mathbb{R}^m \times \mathbb{R}^m \to [0, \infty), \quad p(t, x, y) := (4\pi t)^{-m/2} e^{-\frac{|x-y|^2}{4t}}. \]

In the sequel, we are going to consider $\mathbb{R}^m$ as a Riemannian manifold with its Euclidean metric $g_{ij}(x) = \delta_{ij}$. Then $\Delta$ is the underlying Laplace-Beltrami operator, the Lebesgue measure $dx$ becomes the Riemannian volume measure and the Euclidean distance $|x - y|$ the Riemannian distance. One can then prove the following facts for the Riemannian manifold $\mathbb{R}^m$:

i) $(t, x, y) \mapsto p(t, x, y)$ is jointly smooth and $(t, x) \mapsto p(t, x, y)$ satisfies

\[ \partial_t u(t, x) = \Delta_x u(t, x), \quad \lim_{t \to 0+} u(t, \cdot) = \delta_y \quad \text{for all } y \in \mathbb{R}^m, \tag{2} \]

ii) one has

\[ \int p(t, x, y)\,dy = 1 \quad \text{for all } t > 0,\ x \in \mathbb{R}^m, \]

iii) one has $p(t, x, y) = p(t, y, x)$ for all $t > 0$, $x, y \in \mathbb{R}^m$,

iv) one has

\[ p(t + s, x, y) = \int p(t, x, z)\, p(s, y, z)\,dz \quad \text{for all } t, s > 0,\ x, y \in \mathbb{R}^m, \]

v) one has $p > 0$, and $p$ is the unique nonnegative function satisfying (2),

vi) there is exactly one self-adjoint realization $H \geq 0$ of $-\Delta$ in the Hilbert space $L^2(\mathbb{R}^m)$ and one has

\[ e^{-tH} f(x) = \int p(t, x, y) f(y)\,dy \quad \text{for all } f \in L^2(\mathbb{R}^m), \]

where the heat semigroup on the left hand side is defined via the spectral calculus,

vii) if $m \leq 2$ we have

\[ G(x, y) := \int_0^\infty p(t, x, y)\,dt = \infty \quad \text{for all } x, y \in \mathbb{R}^m, \]

while for $m \geq 3$ we have $G(x, y) < \infty$ for all $x, y \in \mathbb{R}^m$ with $x \neq y$.

In this course we will attack the following problem: to what extent do the above results hold on Riemannian manifolds? It is easy to convince oneself that some subtleties must appear: for example, even if we replace $\mathbb{R}^m$ above with an arbitrary bounded open subset $U$ of $\mathbb{R}^m$, then there exist at least two nonnegative solutions to (2) in $U$ and $-\Delta$ has at least two self-adjoint realizations in $L^2(U)$: the Dirichlet realization and the Neumann realization, and the integral kernels of the corresponding heat semigroups both solve (2) in $U$.
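Properties ii) and iv) of the Gauss kernel, together with the linear growth in $t$ of the mean square displacement behind relation (1), can be checked numerically in dimension $m = 1$. The following is only a sketch: the helper `integrate` (a trapezoidal rule on a large interval) and the sample values of `t`, `s`, `x`, `y` are assumptions made for illustration and do not appear in the notes.

```python
import math

def gauss_kernel(t, x, y):
    """One-dimensional Gauss kernel (4*pi*t)^(-1/2) * exp(-(x - y)^2 / (4t))."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def integrate(f, a=-40.0, b=40.0, n=8000):
    """Composite trapezoidal rule on [a, b]; hypothetical helper for this sketch."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Sample parameters, chosen arbitrarily for illustration.
t, s, x, y = 0.7, 1.3, 0.4, -1.1

# Property ii): the kernel integrates to 1 in the second space variable.
total_mass = integrate(lambda z: gauss_kernel(t, x, z))

# Property iv): p(t + s, x, y) equals the integral of p(t, x, z) p(s, y, z) dz.
semigroup_lhs = gauss_kernel(t + s, x, y)
semigroup_rhs = integrate(lambda z: gauss_kernel(t, x, z) * gauss_kernel(s, y, z))

# Second moment of the displacement; for this normalization it equals 2t.
mean_square = integrate(lambda z: (z - y) ** 2 * gauss_kernel(t, z, y))

print(total_mass, semigroup_lhs, semigroup_rhs, mean_square)
```

With the normalization $e^{-|x-y|^2/(4t)}$ used here, the second moment per coordinate is exactly $2t$, so the displacement grows linearly in time as in relation (1).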

Assume now $(M, g)$ is an arbitrary Riemannian manifold, let $\Delta_g$ denote the induced Laplace-Beltrami operator, let $\mu_g$ denote the Riemannian volume measure, and $d_g(x, y)$ the Riemannian distance. The first question is: what is the analogue of the Gauss kernel and of $H$ in this case? Firstly, we are going to show that $-\Delta_g$ has a canonically given self-adjoint realization $H_g \geq 0$ in $L^2(M, g)$, its Friedrichs realization. Then one can define $p_g(t, x, y)$ as the integral kernel of the heat semigroup of $H_g$. These constructions rely on several deep results from functional analysis (which is why we are actually going to start with abstract functional analysis). It will turn out that without any further assumptions on the geometry,

• the map

\[ p_g : (0, \infty) \times M \times M \to [0, \infty) \]

is jointly smooth, and $(t, x) \mapsto p_g(t, x, y)$ satisfies

\[ \partial_t u(t, x) = \Delta_{g,x} u(t, x), \quad \lim_{t \to 0+} u(t, \cdot) = \delta_y \quad \text{for all } y \in M, \tag{3} \]

which is why one calls $p_g$ the heat kernel of $(M, g)$,

• one has

\[ \int p_g(t, x, y)\,d\mu_g(y) \leq 1 \quad \text{for all } t > 0,\ x \in M, \]

• one has the natural analogues of iii) and iv),

• and the analogue of vi) holds by definition.

Moreover, we are going to address some of the following facts:

• In general, $H_g$ need not be the unique self-adjoint realization of $-\Delta_g$, but we are going to show that this is the case if $(M, d_g)$ is complete.

• The property $\int p_g(t, x, y)\,d\mu_g(y) = 1$ turns out to be highly subtle for noncompact Riemannian manifolds and typically depends on the growth of the volume of metric balls. Riemannian manifolds having this property are called stochastically complete.

• While one always has $p_g \geq 0$, the strict positivity $p_g > 0$ turns out to be related to the connectedness of $M$.

• The property

\[ G(x, y) = \int_0^\infty p_g(t, x, y)\,dt < \infty \quad \text{for } x \neq y \]

turns out to be subtle again: it implies the noncompactness of $M$, and there are noncompact Riemannian manifolds of dimension $\geq 3$ which do not satisfy the above finiteness, which is called nonparabolicity.

• Both stochastic completeness and nonparabolicity are linked with probability theory: on every Riemannian manifold one can define Brownian motion, and stochastic completeness means that this process cannot explode in a finite time, while nonparabolicity means that the process eventually leaves every relatively compact subset.

• The Gauss type behaviour of the heat kernel

\[ C_1 t^{-\dim(M)/2} e^{-\frac{d_g(x,y)^2}{C_2 t}} \leq p_g(t, x, y) \leq C_3 t^{-\dim(M)/2} e^{-\frac{d_g(x,y)^2}{C_4 t}} \]

depends sensitively on the geometry of $(M, g)$: in fact, it turns out to be more natural in the above estimate to replace the factor $t^{-\dim(M)/2}$ by the reciprocal of the volume of a Riemannian ball centered in $x$ with radius $\sqrt{t}$. These are the celebrated Li-Yau heat kernel estimates.

2. Linear operators in Banach and Hilbert spaces

2.1. Motivation. For the convenience of the reader, we collect some facts about linear operators. For a detailed discussion of the (standard) results below, we refer the reader to [35, 28, 22]. This section is motivated by the following observations from linear algebra: assume a linear operator $T$ in a (say) complex finite dimensional Hilbert space $\mathscr{H} \cong \mathbb{C}^l$ is given. Then for every $\Psi_0 \in \mathscr{H}$ there is a unique solution $\Psi : [0, \infty) \to \mathscr{H}$ of the 'heat equation'

\[ (d/dt)\Psi(t) = -T\Psi(t), \quad \Psi(0) = \Psi_0. \]

In fact, we can simply set $\Psi(t) = e^{-tT}\Psi_0$, with

\[ e^{-tT} = \sum_{j=0}^{\infty} (-tT)^j / j! \]

the matrix exponential series. Now if $\mathscr{H}$ is infinite dimensional (in our case this will be the Hilbert space of square integrable functions on a Riemannian manifold), then for the operators $T$ one is

interested in (in our case: the Laplace-Beltrami operator), the exponential series will never converge. The way out of this is provided by the following observation: assume in the above finite dimensional situation that $T$ is self-adjoint. Then, as $T$ is diagonalizable, one has

\[ T = \int \lambda \, dP_T(\lambda) := \sum_{\{\lambda \in \mathbb{R} :\, \lambda \text{ is an eigenvalue of } T\}} \lambda P_T(\lambda) \]

with $P_T(\lambda)$ the projection onto the eigenspace of $\lambda$. Given a function $f : \mathbb{R} \to \mathbb{C}$ (like $f(r) = e^{-tr}$!), the above formula suggests to define a linear operator $f(T)$ in $\mathscr{H}$ by setting

\[ f(T) := \int f(\lambda) \, dP_T(\lambda) := \sum_{\{\lambda \in \mathbb{R} :\, \lambda \text{ is an eigenvalue of } T\}} f(\lambda) P_T(\lambda). \]

For $f(r) = e^{-tr}$ this definition is equivalent to using the matrix exponential. By John von Neumann's spectral theorem, it turns out that given any self-adjoint operator $T$ in a possibly infinite dimensional Hilbert space there exists a unique projection-valued measure $P_T$ such that one has

\[ T = \int \lambda \, dP_T(\lambda), \]

and using the above observations this fact leads to a satisfactory solution theory of the abstract heat equation induced by $T$. The purpose of this section is to explain these facts in detail. Before that, let us list some issues that are supposed to motivate some of the following definitions:

• A self-adjoint operator $T : \mathscr{H} \to \mathscr{H}$ in a Hilbert space is automatically continuous. However, the operators we will be interested in (like the Laplace-Beltrami operator) turn out to be never continuous. The way out of this is to consider linear operators $T : \mathrm{Dom}(T) \to \mathscr{H}$ that are defined on a (typically dense) subspace $\mathrm{Dom}(T) \subset \mathscr{H}$, called the domain of definition of $T$. Thus: any self-adjoint operator $T$ in $\mathscr{H}$ with $\mathrm{Dom}(T) = \mathscr{H}$ is automatically continuous, while the self-adjoint operators of interest are not continuous (and so cannot be defined everywhere). Although self-adjoint operators are in general not continuous, they turn out to satisfy a weaker useful property, namely they are closed.

• In infinite dimensions, it is often easier to define a self-adjoint operator via symmetric sesquilinear forms. Note that in finite dimensions, given any symmetric sesquilinear form

\[ Q : \mathscr{H} \times \mathscr{H} \to \mathbb{C} \]

there exists a unique self-adjoint operator $T_Q : \mathscr{H} \to \mathscr{H}$ such that

\[ Q(\Psi_1, \Psi_2) = \langle T_Q \Psi_1, \Psi_2 \rangle. \]

In the infinite dimensional case, again domain of definition questions arise and, in particular, one needs the sesquilinear form to be bounded from below in a certain sense in order that it induces a self-adjoint operator (which is then also bounded from below).

• In infinite dimensions, the actual definition of the adjoint of an operator (and thus of self-adjointness) is a bit subtle, which is due to the above mentioned domain of definition problems. In particular, we will distinguish symmetric operators from self-adjoint ones, noting that an everywhere defined operator is self-adjoint if and only if it is symmetric.
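The finite dimensional observation above, namely that the spectral definition of $f(T)$ agrees with the matrix exponential series for $f(r) = e^{-tr}$, can be illustrated on a real symmetric $2 \times 2$ matrix. The following is a sketch only: the function names, the sample matrix `T` and the time `t` are assumptions chosen for illustration, and the eigenprojections are computed from the closed formula for the two-eigenvalue case.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def spectral_projections(T):
    """Eigenvalues and eigenprojections of a real symmetric 2x2 matrix T."""
    (a, b), (_, d) = T
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    if radius == 0.0:
        # T is a multiple of the identity: one eigenvalue, projection = identity.
        return [(mean, IDENTITY)]
    lam_minus, lam_plus = mean - radius, mean + radius
    gap = lam_plus - lam_minus
    # P_plus = (T - lam_minus) / gap and P_minus = (lam_plus - T) / gap.
    P_plus = [[(T[i][j] - lam_minus * (i == j)) / gap for j in range(2)] for i in range(2)]
    P_minus = [[(lam_plus * (i == j) - T[i][j]) / gap for j in range(2)] for i in range(2)]
    return [(lam_minus, P_minus), (lam_plus, P_plus)]

def apply_function(f, T):
    """f(T) defined as the sum of f(lambda) * P_T(lambda) over the eigenvalues."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for lam, P in spectral_projections(T):
        for i in range(2):
            for j in range(2):
                result[i][j] += f(lam) * P[i][j]
    return result

def exp_series(T, t, terms=60):
    """Partial sum of the exponential series sum_j (-tT)^j / j!."""
    minus_tT = [[-t * T[i][j] for j in range(2)] for i in range(2)]
    result = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for j in range(1, terms):
        term = [[entry / j for entry in row] for row in mat_mul(term, minus_tT)]
        result = [[result[i][k] + term[i][k] for k in range(2)] for i in range(2)]
    return result

T = [[2.0, 1.0], [1.0, 3.0]]  # sample self-adjoint matrix
t = 0.5
via_spectrum = apply_function(lambda lam: math.exp(-t * lam), T)
via_series = exp_series(T, t)
max_diff = max(abs(via_spectrum[i][j] - via_series[i][j]) for i in range(2) for j in range(2))
rebuilt_T = apply_function(lambda lam: lam, T)
print(max_diff)
```

Both routes give the same matrix up to rounding. Only the spectral route has an analogue for unbounded operators, where the series is no longer available.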

2.2. Facts about linear operators in Banach and Hilbert spaces. We understand all our normed spaces to be over $\mathbb{C}$. As we have explained above, it is essential to require a linear operator $T$ between Banach spaces $\mathscr{B}_1$ and $\mathscr{B}_2$ to be only defined on a subspace $\mathrm{Dom}(T) \subset \mathscr{B}_1$, called its domain of definition, so that $T$ is actually a linear map $T : \mathrm{Dom}(T) \to \mathscr{B}_2$. Its image or range $\mathrm{Ran}(T) \subset \mathscr{B}_2$ is defined to be the linear space of all $f_2 \in \mathscr{B}_2$ for which there exists $f_1 \in \mathrm{Dom}(T)$ with $T f_1 = f_2$. Its kernel $\mathrm{Ker}(T)$ is given by all $f \in \mathrm{Dom}(T)$ with $T f = 0$. Such a linear operator $T$ is called bounded, if there exists a constant $C \geq 0$ such that $\|T f\| \leq C \|f\|$ for all $f \in \mathrm{Dom}(T)$, and the smallest such $C$ is called the operator norm of $T$ and denoted by $\|T\|$. Boundedness of $T$ is equivalent to its continuity as a map between normed spaces (considered as metric and thus topological spaces in the usual way). If $T$ is bounded and $\mathrm{Dom}(T)$ is dense, then $T$ can be uniquely extended to a bounded linear map $\mathscr{B}_1 \to \mathscr{B}_2$, which will be denoted with the same symbol again. The linear space of bounded linear operators is denoted by $\mathscr{L}(\mathscr{B}_1, \mathscr{B}_2)$ and becomes a Banach space itself with the above operator norm. One sets $\mathscr{L}(\mathscr{B}_1) := \mathscr{L}(\mathscr{B}_1, \mathscr{B}_1)$.

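Theorem 2.1 (Closed graph theorem). A linear operator $T$ from $\mathscr{B}_1$ to $\mathscr{B}_2$ with $\mathrm{Dom}(T) = \mathscr{B}_1$ is bounded, if and only if its graph

\[ \{(f_1, f_2) \in \mathrm{Dom}(T) \times \mathscr{B}_2 : T f_1 = f_2\} \subset \mathscr{B}_1 \times \mathscr{B}_2 \]

is closed.

We also record: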

Theorem 2.2 (Uniform boundedness principle). For a subset $A \subset \mathscr{L}(\mathscr{B}_1, \mathscr{B}_2)$ the following conditions are equivalent:

• for all $f \in \mathscr{B}_1$ there exists a constant $C_f \geq 0$ with $\|T f\| \leq C_f$ for all $T \in A$.

• there exists a constant $C \geq 0$ with $\|T\| \leq C$ for all $T \in A$.

Let $\mathscr{H}$ be a separable Hilbert space. The underlying scalar product, which is assumed to be antilinear in its first slot, will be simply denoted by $\langle \bullet, \bullet \rangle$, and the induced norm (as well as the induced operator norm) is denoted by $\|\bullet\|$.

Theorem 2.3 (Riesz-Fischer's duality theorem). Assume $T \in \mathscr{L}(\mathscr{H}, \mathbb{C})$, that is, $T$ is a linear continuous functional on $\mathscr{H}$. Then there exists a unique $f_T \in \mathscr{H}$ such that for all $h \in \mathscr{H}$ one has

\[ T(h) = \langle f_T, h \rangle. \]

The map $T \mapsto f_T$ induces an antilinear isometric isomorphism between $\mathscr{L}(\mathscr{H}, \mathbb{C})$ and $\mathscr{H}$.

If $\tilde{\mathscr{H}}$ is another separable complex Hilbert space and $R$ is a densely defined linear operator from $\mathscr{H}$ to $\tilde{\mathscr{H}}$, then the adjoint $R^*$ of $R$ is a linear operator from $\tilde{\mathscr{H}}$ to $\mathscr{H}$ which is defined as follows: $\mathrm{Dom}(R^*)$ is given by all $f \in \tilde{\mathscr{H}}$ for which there exists $f^* \in \mathscr{H}$ such that

\[ \langle f^*, h \rangle = \langle f, Rh \rangle \quad \text{for all } h \in \mathrm{Dom}(R), \]

and then $R^* f := f^*$. In the sequel, let $S$ and $T$ be arbitrary linear operators in $\mathscr{H}$. Firstly, $T$ is called an extension of $S$ (symbolically $S \subset T$), if $\mathrm{Dom}(S) \subset \mathrm{Dom}(T)$ and $Sf = Tf$ for all $f \in \mathrm{Dom}(S)$. If $S$ is densely defined, then $S$ is called symmetric, if $S \subset S^*$, and self-adjoint, if $S = S^*$. Clearly, self-adjoint operators are symmetric. Note that for the symmetry of $S$ one only needs to check that it is densely defined and that

\[ \langle S f_1, f_2 \rangle = \langle f_1, S f_2 \rangle \]

for all $f_1, f_2 \in \mathrm{Dom}(S)$. Checking self-adjointness is a tricky business for unbounded operators, while checking symmetry is very easy:

Example 2.4. Assume $U \subset \mathbb{R}^m$ is open and the operator $S := -\Delta = -\sum_{j=1}^m \partial_j^2$ in the complex Hilbert space $L^2(U)$ is given the domain of definition $\mathrm{Dom}(S) := C_c^\infty(U)$. Then $S$ is symmetric: for all $f_1, f_2 \in \mathrm{Dom}(S) = C_c^\infty(U)$, by Stokes' Theorem one has

\[ \begin{aligned} \langle S f_1, f_2 \rangle &= \int_U \overline{(-\Delta) f_1}\, f_2 \, dx \\ &= \int_U (\nabla f_1, \nabla f_2)\, dx + \text{a boundary term that vanishes because each } f_j \text{ is compactly supported in } U \\ &= \int_U \overline{f_1}\, (-\Delta) f_2 \, dx = \langle f_1, S f_2 \rangle. \end{aligned} \]

This operator is not closed and so surely not self-adjoint (in general it has many self-adjoint extensions; in case $U = \mathbb{R}^m$ it has precisely one self-adjoint extension).

The operator $S$ is called semibounded (from below), if there exists a constant $C \geq 0$ such that for all $f \in \mathrm{Dom}(S)$ one has

\[ \langle S f, f \rangle \geq -C \|f\|^2, \tag{4} \]

or in short: $S \geq -C$. Since $\mathscr{H}$ is assumed to be complex, semibounded operators are automatically symmetric (by complex polarization).

$S$ is called closed, if whenever $(f_n) \subset \mathrm{Dom}(S)$ is a sequence such that $f_n \to f$ for some $f \in \mathscr{H}$ and $S f_n \to h$ for some $h \in \mathscr{H}$, then one has $f \in \mathrm{Dom}(S)$ and $S f = h$. $S$ is called closable, if it has a closed extension. In this case, $S$ has a smallest closed extension $\overline{S}$, which is called the closure of $S$. The closure $\overline{S}$ is determined as follows: $\mathrm{Dom}(\overline{S})$ is given by all $f \in \mathscr{H}$ for which there exists a sequence $(f_n) \subset \mathrm{Dom}(S)$ such that $f_n \to f$ and such that $(S f_n)$ converges, and then $\overline{S} f := \lim_n S f_n$.

Adjoints of densely defined operators are closed, so that symmetric operators are closable; self-adjoint operators are closed. Bounded operators are always closed by the closed graph theorem. If $S$ is densely defined and closable, then $S^*$ is densely defined and $S^{**} = \overline{S}$. If $T$ is symmetric, then $T$ is called essentially self-adjoint, if $\overline{T}$ is self-adjoint. Then $\overline{T}$ is the unique self-adjoint extension of $T$. We record:

Theorem 2.5. Assume that S is semibounded (in particular symmetric) with S ≥ −C for some constant C ≥ 0. Then S is essentially self-adjoint, if and only if there exists z ∈ C \ [−C, ∞) such that Ker((S − z)∗) = {0}.

The resolvent set $\rho(S)$ is defined to be the set of all $z \in \mathbb{C}$ such that $S - z$ is invertible as a linear map $\mathrm{Dom}(S) \to \mathscr{H}$ and its inverse $(S - z)^{-1}$ is in addition bounded as a linear operator from $\mathscr{H}$ to $\mathscr{H}$. If $S$ is closed and $S - z$ is invertible, then $(S - z)^{-1}$ is automatically bounded by the closed graph theorem. The spectrum $\sigma(S)$ of $S$ is defined as the complement $\sigma(S) := \mathbb{C} \setminus \rho(S)$. Resolvent sets of closed operators are open, therefore spectra of closed operators are always closed. A number $z \in \mathbb{C}$ is called an eigenvalue of $S$, if $\mathrm{Ker}(S - z) \neq \{0\}$. In this case, $\dim \mathrm{Ker}(S - z)$ is called the multiplicity of $z$, and each $f \in \mathrm{Ker}(S - z) \setminus \{0\}$ is called an eigenvector of $S$ corresponding to $z$. Of course each eigenvalue is in the spectrum. The eigenvalues of a symmetric operator are real, and the eigenvectors corresponding to different eigenvalues of a symmetric operator are orthogonal. A simple result that reflects the subtlety of the notion of a "self-adjoint operator" when compared to that of a "symmetric operator" is the following: a symmetric operator in $\mathscr{H}$ is self-adjoint, if and only if its spectrum is real. If $S$ is self-adjoint, then $S \geq -C$ for a constant $C \geq 0$ is equivalent to $\sigma(S) \subset [-C, \infty)$ (cf. Satz 8.26 in [36]).

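The essential spectrum $\sigma_{\mathrm{ess}}(S) \subset \sigma(S)$ of $S$ is defined to be the set of all $\lambda \in \sigma(S)$ such that either $\lambda$ is an eigenvalue of infinite multiplicity, or $\lambda$ is an accumulation point of $\sigma(S)$. Then the discrete spectrum $\sigma_{\mathrm{dis}}(S) \subset \sigma(S)$ is defined as the complement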

σdis(S) := σ(S) \ σess(S).

As every isolated point in the spectrum of a self-adjoint operator is an eigenvalue (cf. Folgerung 3, p. 191 in [35]), it follows that in case of S being self-adjoint, the set σdis(S) is precisely the set of all isolated eigenvalues of S that have a finite multiplicity. Let H˜ be another complex separable Hilbert space. We recall that given q ∈ [1, ∞), some K ∈ L (H , H˜) is called

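• compact, if for every orthonormal sequence $(e_n)$ in $\mathscr{H}$ and every orthonormal sequence $(f_n)$ in $\tilde{\mathscr{H}}$ one has

\[ \langle K e_n, f_n \rangle \to 0 \quad \text{as } n \to \infty, \]

• $q$-summable (or an element of the $q$-th Schatten class of operators $\mathscr{H} \to \tilde{\mathscr{H}}$), if for every $(e_n)$, $(f_n)$ as above one has

\[ \sum_n |\langle K e_n, f_n \rangle|^q < \infty. \]

Let us denote the class of compact operators with $\mathscr{J}^\infty(\mathscr{H}, \tilde{\mathscr{H}})$ and the $q$-th Schatten class with $\mathscr{J}^q(\mathscr{H}, \tilde{\mathscr{H}})$, with the convention $\mathscr{J}^\bullet(\mathscr{H}) := \mathscr{J}^\bullet(\mathscr{H}, \mathscr{H})$. These are linear spaces with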

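\[ \mathscr{J}^{q_1}(\mathscr{H}, \tilde{\mathscr{H}}) \subset \mathscr{J}^{q_2}(\mathscr{H}, \tilde{\mathscr{H}}) \quad \text{for all } q_1, q_2 \in [1, \infty] \text{ with } q_1 \leq q_2, \]

and one has inclusions of the type

\[ \mathscr{J}^q \circ \mathscr{L} \subset \mathscr{J}^q, \quad \mathscr{L} \circ \mathscr{J}^q \subset \mathscr{J}^q \quad \text{for all } q \in [1, \infty], \]

and

\[ \mathscr{J}^{q_1} \circ \mathscr{J}^{q_2} \subset \mathscr{J}^{q_3} \quad \text{if } 1/q_1 + 1/q_2 = 1/q_3 \text{ with } q_j \in [1, \infty). \]

For obvious reasons, $\mathscr{J}^1$ is called the trace class, and moreover $\mathscr{J}^2$ is called the Hilbert-Schmidt class.

Example 2.6. A bounded operator $K$ in an $L^2(X, \mu)$-space is Hilbert-Schmidt, if (and only if) it is an integral operator with a square integrable integral kernel, that is, if

\[ K f(x) = \int k(x, y) f(y) \, d\mu(y) \]

for some $k \in L^2(X \times X, \mu \otimes \mu)$. This follows from evaluating

\[ \sum_n |\langle K e_n, f_n \rangle|^2 \]

explicitly using Parseval's identity.

Let us now turn towards the formulation of the spectral theorem:

Definition 2.7. A spectral resolution $P$ on $\mathscr{H}$ is a map $P : \mathbb{R} \to \mathscr{L}(\mathscr{H})$ such that

• for every $\lambda \in \mathbb{R}$ one has $P(\lambda) = P(\lambda)^*$, $P(\lambda)^2 = P(\lambda)$ (that is, each $P(\lambda)$ is an orthogonal projection onto its image),

• $P$ is monotone in the sense that $\lambda_1 \leq \lambda_2$ implies $\mathrm{Ran}(P(\lambda_1)) \subset \mathrm{Ran}(P(\lambda_2))$,

• $P$ is right-continuous in the strong topology¹ of $\mathscr{L}(\mathscr{H})$,

• $\lim_{\lambda \to -\infty} P(\lambda) = 0$ and $\lim_{\lambda \to \infty} P(\lambda) = \mathrm{id}_{\mathscr{H}}$, both in the strong sense.

It follows that for every $f \in \mathscr{H}$, the function

\[ \lambda \mapsto \langle P(\lambda) f, f \rangle = \|P(\lambda) f\|^2 \]

is right-continuous and increasing. Thus by the usual Stieltjes construction² it induces a Borel measure on $\mathbb{R}$, which will be denoted by $\langle P(d\lambda) f, f \rangle$. This measure has the total mass

\[ \langle P(\mathbb{R}) f, f \rangle = \|f\|^2. \]

¹ $T_n \to T$ strongly in $\mathscr{L}(\mathscr{H})$ means that $T_n f \to T f$ for all $f \in \mathscr{H}$; this is weaker than convergence in the operator norm topology, which means that $\|T_n - T\| \to 0$.

² Given a right-continuous and increasing function $F : \mathbb{R} \to \mathbb{R}$ there exists precisely one measure $\mu_F$ on $\mathbb{R}$ such that for all $b > a$ one has $\mu_F((a, b]) = F(b) - F(a)$.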

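Given such $P$ and a Borel function $\varphi : \mathbb{R} \to \mathbb{C}$, the set

\[ D_{P,\varphi} := \Big\{ f \in \mathscr{H} : \int_{\mathbb{R}} |\varphi(\lambda)|^2 \langle P(d\lambda) f, f \rangle < \infty \Big\} \]

is a dense linear subspace of $\mathscr{H}$ (cf. Satz 8.8 in [36]), and accordingly one can define a linear operator $\varphi(P)$ with $\mathrm{Dom}(\varphi(P)) := D_{P,\varphi}$ in $\mathscr{H}$ by mimicking the complex polarization identity,

\[ \begin{aligned} \langle \varphi(P) f_1, f_2 \rangle := {} & (1/4) \int_{\mathbb{R}} \varphi(\lambda) \langle P(d\lambda)(f_1 + f_2), f_1 + f_2 \rangle \\ & - (1/4) \int_{\mathbb{R}} \varphi(\lambda) \langle P(d\lambda)(f_1 - f_2), f_1 - f_2 \rangle \\ & + (\sqrt{-1}/4) \int_{\mathbb{R}} \varphi(\lambda) \langle P(d\lambda)(f_1 - \sqrt{-1} f_2), f_1 - \sqrt{-1} f_2 \rangle \\ & - (\sqrt{-1}/4) \int_{\mathbb{R}} \varphi(\lambda) \langle P(d\lambda)(f_1 + \sqrt{-1} f_2), f_1 + \sqrt{-1} f_2 \rangle, \end{aligned} \]

where $f_1, f_2 \in \mathrm{Dom}(\varphi(P))$. Every spectral resolution induces the following "calculus":

Theorem 2.8. Let $P$ be a spectral resolution on $\mathscr{H}$, and let $\varphi : \mathbb{R} \to \mathbb{C}$ be a Borel function. Then:

(i) One has $\varphi(P)^* = \overline{\varphi}(P)$; in particular, $\varphi(P)$ is self-adjoint, if and only if $\varphi$ is real-valued.

(ii) One has $\|\varphi(P)\| \leq \sup_{\mathbb{R}} |\varphi| \in [0, \infty]$.

(iii) If $\varphi \geq -C$ for some constant $C \geq 0$, then one has $\varphi(P) \geq -C$.

(iv) If $\varphi' : \mathbb{R} \to \mathbb{C}$ is another Borel function, then

\[ \varphi(P) + \varphi'(P) \subset (\varphi + \varphi')(P), \quad \mathrm{Dom}(\varphi(P) + \varphi'(P)) = \mathrm{Dom}((|\varphi| + |\varphi'|)(P)) \]

and

\[ \varphi(P)\varphi'(P) \subset (\varphi\varphi')(P), \quad \mathrm{Dom}(\varphi(P)\varphi'(P)) = \mathrm{Dom}((\varphi\varphi')(P)) \cap \mathrm{Dom}(\varphi'(P)); \]

in particular, if $\varphi'$ is bounded, then

\[ \varphi(P) + \varphi'(P) = (\varphi + \varphi')(P), \quad \varphi(P)\varphi'(P) = (\varphi\varphi')(P). \]

(v) For every $f \in \mathrm{Dom}(\varphi(P))$ one has

\[ \|\varphi(P) f\|^2 = \int_{\mathbb{R}} |\varphi(\lambda)|^2 \langle P(d\lambda) f, f \rangle. \]

One variant of the spectral theorem is:

Theorem 2.9. For every self-adjoint operator $S$ in $\mathscr{H}$ there exists precisely one spectral resolution $P_S$ on $\mathscr{H}$ such that $S = \mathrm{id}_{\mathbb{R}}(P_S)$. The map $P_S$ is called the spectral resolution of $S$, and it has the following additional properties: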

• $P_S$ is concentrated on the spectrum of $S$ in the sense that for every Borel function $\varphi : \mathbb{R} \to \mathbb{C}$ one has $\varphi(P_S) = (1_{\sigma(S)} \cdot \varphi)(P_S)$,

• if $\varphi : \mathbb{R} \to \mathbb{R}$ is continuous, then $\sigma(\varphi(P_S)) = \overline{\varphi(\sigma(S))}$,

• if $\varphi, \varphi' : \mathbb{R} \to \mathbb{R}$ are Borel functions, then one has the transformation rule $(\varphi \circ \varphi')(P_S) = \varphi(P_{\varphi'(P_S)})$.

In view of these results, given a self-adjoint operator $S$ in $\mathscr{H}$, the calculus of Theorem 2.8 applied to $P = P_S$ is usually referred to as the spectral calculus of $S$. Likewise, given a Borel function $\varphi : \mathbb{R} \to \mathbb{C}$ one sets

\[ \varphi(S) := \varphi(P_S). \]

Remark 2.10. Let $S$ be a self-adjoint operator in $\mathscr{H}$.

1. The spectral calculus of $S$ is compatible with all functions of $S$ that can be defined "by hand". For example, for every $z \in \mathbb{C} \setminus \sigma(S)$ one has $\varphi(S) = (S - z)^{-1}$ with $\varphi(\lambda) := 1/(\lambda - z)$, or $S^n = \varphi(S)$ with $\varphi(\lambda) := \lambda^n$.

2. If $S$ is a semibounded operator and $z \in \mathbb{C}$ is such that $\mathrm{Re}\, z < \min \sigma(S)$, then for every $b > 0$ one has the following formula for $f_1, f_2 \in \mathscr{H}$:

\[ \langle (S - z)^{-b} f_1, f_2 \rangle = \frac{1}{\Gamma(b)} \int_0^\infty s^{b-1} e^{zs} \langle e^{-sS} f_1, f_2 \rangle \, ds. \tag{5} \]

3. If $S \geq -C$ for some constant $C \geq 0$, then the collection $(e^{-tS})_{t \geq 0}$ forms a strongly continuous self-adjoint semigroup of bounded operators (contractive, if one can pick $C = 0$), and one has the abstract smoothing effect

\[ \mathrm{Ran}(e^{-tS}) \subset \bigcap_{n \in \mathbb{N}_{\geq 1}} \mathrm{Dom}(S^n) \quad \text{for all } t > 0. \]

Moreover, for every $\psi \in \mathscr{H}$ the path

\[ [0, \infty) \ni t \mapsto \psi(t) := e^{-tS} \psi \in \mathscr{H} \]

is the uniquely determined continuous path with $\psi(0) = \psi$ which is differentiable in $(0, \infty)$ and satisfies there the abstract heat equation

\[ (d/dt)\psi(t) = -S\psi(t). \]

4. If $S \geq -C$ for some constant $C \geq 0$ and if $e^{-tS} \in \mathscr{J}^2(\mathscr{H})$ for some $t > 0$, then $S$ has a purely discrete spectrum (so the spectrum consists of countably many eigenvalues having a finite multiplicity), and if one enumerates the eigenvalues $(\lambda_n)$ in an increasing way and counting multiplicity, then one has $-C \leq \lambda_0 \leq \lambda_1 \leq \dots$ with $\lambda_n \nearrow \infty$, if $\mathscr{H}$ is infinite dimensional.

Example 2.11. Assume on a sigma-finite measure space $(X, \mu)$ we are given a measurable function $\psi : X \to \mathbb{C}$. Then the associated maximally defined multiplication operator $M_\psi$ in $L^2(X, \mu)$ is given by

\[ \mathrm{Dom}(M_\psi) := \{ f \in L^2(X, \mu) : \psi f \in L^2(X, \mu) \}, \quad M_\psi f(x) := \psi(x) f(x). \]

$M_\psi$ is bounded from below, if and only if $\psi \geq C$ $\mu$-a.e. for some $C \in \mathbb{R}$, and bounded, if and only if $|\psi| \leq c$ $\mu$-a.e. for some $c \geq 0$. Moreover, $M_\psi$ is always closed, and a point $z \in \mathbb{C}$ lies in the spectrum if and only if there exists no $\epsilon > 0$ such that $|\psi - z| \geq \epsilon$ $\mu$-a.e., and $z$ is an eigenvalue if and only if $\mu\{\psi = z\} > 0$.

The operator $M_\psi$ is self-adjoint if and only if $\psi(x) \in \mathbb{R}$ for $\mu$-a.e. $x \in X$. In the latter case, concerning the spectral calculus, one has $\varphi(M_\psi) = M_{\varphi \circ \psi}$.

Using the spectral theorem one can show that every self-adjoint operator is unitarily equivalent to a self-adjoint multiplication operator on some finite measure space. Here, a linear operator $V$ between two Hilbert spaces is called unitary, if it is bijective with $V^{-1} = V^*$, and two linear operators $A$ and $B$ are called unitarily equivalent, if there exists a unitary $V$ with $B = V^* A V$.

We now collect some basic facts about possibly unbounded sesquilinear forms on Hilbert spaces. Unless otherwise stated, all statements below can be found in section VI of T. Kato's book [22]. Let again $\mathscr{H}$ be a complex separable Hilbert space. A sesquilinear form $Q$ on $\mathscr{H}$ is understood to be a map

\[ Q : \mathrm{Dom}(Q) \times \mathrm{Dom}(Q) \to \mathbb{C}, \]

where $\mathrm{Dom}(Q) \subset \mathscr{H}$ is a linear subspace called the domain of definition of $Q$, such that $Q$ is antilinear³ in its first slot, and linear in its second slot. The quadratic form induced by $Q$ is simply the map

\[ \mathrm{Dom}(Q) \to \mathbb{C}, \quad f \mapsto Q(f, f). \]

In this section, let $Q$ and $Q'$ be sesquilinear forms on $\mathscr{H}$. $Q'$ is called an extension of $Q$, symbolically $Q \subset Q'$, if $\mathrm{Dom}(Q) \subset \mathrm{Dom}(Q')$ and if both forms coincide on $\mathrm{Dom}(Q)$. $Q$ is called symmetric, if $Q(f_1, f_2) = Q(f_2, f_1)^*$, and semibounded (from below), if its quadratic form is real-valued and there exists a constant $C \geq 0$ such that

\[ Q(f, f) \geq -C \|f\|^2 \quad \text{for all } f \in \mathrm{Dom}(Q), \tag{6} \]

symbolically $Q \geq -C$. Again by complex polarization, every semibounded form is automatically symmetric (as the quadratic form is real-valued).

Following Kato, given a sequence $(f_n) \subset \mathrm{Dom}(Q)$ and $f \in \mathscr{H}$ we write $f_n \xrightarrow[Q]{} f$ as $n \to \infty$, if one has $f_n \to f$ in $\mathscr{H}$ and in addition

Q(fn − fm, fn − fm) → 0 as n, m → ∞.

³ We warn the reader, however, that in [22] the forms are assumed to be antilinear in their second slot; thus, if $Q(f_1, f_2)$ is a form in our sense, the theory from [22] has to be applied to the complex conjugate form $Q(f_1, f_2)^*$.

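Then $Q$ is called closed, if $f_n \xrightarrow[Q]{} f$ implies that $f \in \mathrm{Dom}(Q)$. A semibounded $Q$ is closed, if and only if for some/every $C \geq 0$ with $Q \geq -C$ the scalar product on $\mathrm{Dom}(Q)$ given by

\[ \langle f_1, f_2 \rangle_{Q,C} = (1 + C) \langle f_1, f_2 \rangle + Q(f_1, f_2) \tag{7} \]

turns $\mathrm{Dom}(Q)$ into a Hilbert space. Furthermore, for a semibounded $Q \geq -C$ its closedness is equivalent to the lower-semicontinuity of the function

\[ \mathscr{H} \to [-C, \infty], \quad f \mapsto \begin{cases} Q(f, f), & \text{if } f \in \mathrm{Dom}(Q), \\ \infty & \text{else.} \end{cases} \]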

The form $Q$ is called closable, if it has a closed extension. If $Q$ is semibounded and closable, then it has a smallest semibounded and closed extension $\overline{Q}$, which is (well-)defined as follows: $\mathrm{Dom}(\overline{Q})$ is given by all $f \in \mathscr{H}$ that admit a sequence $(f_n) \subset \mathrm{Dom}(Q)$ with $f_n \xrightarrow[Q]{} f$; then one has

\[ \overline{Q}(f, h) = \lim_n Q(f_n, h_n), \quad \text{where } f_n \xrightarrow[Q]{} f,\ h_n \xrightarrow[Q]{} h. \]

If $Q$ is closed, then a linear subspace $D \subset \mathrm{Dom}(Q)$ is called a core of $Q$, if $\overline{Q|_D} = Q$. Using the spectral calculus one defines:

Definition 2.12. Given a self-adjoint operator $S$ in $\mathscr{H}$, the (densely defined and symmetric) sesquilinear form $Q_S$ in $\mathscr{H}$ given by $\mathrm{Dom}(Q_S) := \mathrm{Dom}(\sqrt{|S|})$ and

\[ Q_S(f_1, f_2) := \big\langle \sqrt{|S|} f_1, \sqrt{|S|} f_2 \big\rangle \]

is called the form associated with S. The following fundamental result links the world of densely defined, semibounded, closed forms with that of semibounded self-adjoint operators (cf. Theorem VIII.15 in [28] for this exact formulation):

Theorem 2.13. For every self-adjoint semibounded operator $S$ in $\mathscr{H}$, the form $Q_S$ is densely defined, semibounded and closed. Conversely, for every densely defined, closed and semibounded sesquilinear form $Q$ in $\mathscr{H}$, there exists precisely one self-adjoint semibounded operator $S_Q$ in $\mathscr{H}$ such that $Q = Q_{S_Q}$. The operator $S_Q$ will be called the operator associated with $Q$.

The correspondence $S \mapsto Q_S$ has the following additional properties:

Theorem 2.14. Let $Q$ be densely defined, closed and semibounded. Then:

• $S_Q$ is the uniquely determined self-adjoint and semibounded operator in $\mathscr{H}$ such that $\mathrm{Dom}(S_Q) \subset \mathrm{Dom}(Q)$ and

\[ \langle S_Q f_1, f_2 \rangle = Q(f_1, f_2) \quad \text{for all } f_1 \in \mathrm{Dom}(S_Q),\ f_2 \in \mathrm{Dom}(Q). \]

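• $\mathrm{Dom}(S_Q)$ is a core of $Q$; some $f_1 \in \mathrm{Dom}(Q)$ is in $\mathrm{Dom}(S_Q)$, if and only if there exists $f_2 \in \mathscr{H}$ and a core $D$ of $Q$ with

\[ Q(f_1, f_3) = \langle f_2, f_3 \rangle \quad \text{for all } f_3 \in D, \]

and then $S_Q f_1 = f_2$.

• One has

\[ \mathrm{Dom}(Q) = \Big\{ h \in \mathscr{H} : \lim_{t \to 0+} \Big\langle \frac{h - e^{-t S_Q} h}{t}, h \Big\rangle < \infty \Big\}, \quad Q(h, h) = \lim_{t \to 0+} \Big\langle \frac{h - e^{-t S_Q} h}{t}, h \Big\rangle. \]

• One has the variational principle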

\[ \begin{aligned} \min \sigma(S_Q) &= \inf\{ Q(f, f) : f \in \mathrm{Dom}(Q),\ \|f\| = 1 \} && (8) \\ &= \inf\{ \langle S_Q f, f \rangle : f \in \mathrm{Dom}(S_Q),\ \|f\| = 1 \}. && (9) \end{aligned} \]

Notation 2.15. If $Q, Q'$ are symmetric, we write $Q \geq Q'$, if and only if $\mathrm{Dom}(Q) \subset \mathrm{Dom}(Q')$ and $Q(f, f) \geq Q'(f, f)$ for all $f \in \mathrm{Dom}(Q)$.

The Friedrichs extension of a semibounded operator can be defined as follows:

Example 2.16. Let $S \geq -C$ be a symmetric (in particular, a densely defined) and semibounded operator in $\mathscr{H}$. Then the form $(f_1, f_2) \mapsto \langle S f_1, f_2 \rangle$ with domain of definition $\mathrm{Dom}(S)$ is closable, and of course the closure $\tilde{Q}_S$ of that form is densely defined and semibounded. The operator $S_F$ associated with $\tilde{Q}_S$ is called the Friedrichs realization of $S$. The operator $S_F$ can also be characterized as follows: $S_F$ is the uniquely determined self-adjoint semibounded extension of $S$ with domain of definition $\subset \mathrm{Dom}(\tilde{Q}_S)$. Let $M_C(S)$ denote the class of all self-adjoint extensions of $S$ which are $\geq -C$. Thus we have $S_F \in M_C(S)$, and in addition the following maximality property holds:

\[ T \in M_C(S) \ \Rightarrow\ Q_T \leq \tilde{Q}_S. \]

In particular, by the variational principle (8), $S_F$ has the largest bottom of spectrum $\min \sigma(S_F)$ among all operators in $M_C(S)$. This is Krein's famous result on the characterization of semibounded extensions [1] [23].

Let us see how the Friedrichs construction can be used to define a self-adjoint realization of the Laplace-Beltrami operator $-\Delta$ in $L^2(U)$, where $U$ is an arbitrary open subset of $\mathbb{R}^m$: consider $-\Delta$ as a linear operator in $L^2(U)$, defined initially on $C_c^\infty(U)$. We have seen above that $-\Delta$ is symmetric; more precisely, for all $f_1, f_2 \in C_c^\infty(U)$ one has

\[ \langle (-\Delta) f_1, f_2 \rangle = \int_U (\nabla f_1, \nabla f_2), \]

so

\[ \langle (-\Delta) f_1, f_1 \rangle = \int_U |\nabla f_1|^2 \geq 0, \]

and $-\Delta \geq 0$ in $L^2(U)$. It follows from the previous example that $-\Delta$ canonically induces a self-adjoint operator $H_U \geq 0$ in $L^2(U)$, called the Dirichlet-Laplacian in $U$. In terms of the Euclidean Sobolev spaces $W^{k,p}(U)$ and $W_0^{k,p}(U)$, one has

\[ \mathrm{Dom}(H_U) = \{ f \in W_0^{1,2}(U) : \Delta f \in L^2(U) \}, \quad H_U f = -\Delta f, \]

where $\Delta f$ is understood in the sense of distributions. We will come to a detailed explanation of these facts in the more general context of Riemannian manifolds later on.
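A finite difference analogue may help to see the symmetry, the nonnegativity and the variational principle (8) at work for $U = (0, 1)$. The following is a sketch under assumptions that are not part of the notes: $-\Delta$ is replaced by the standard three-point stencil on `n` interior grid points with zero boundary values, and the $L^2$ scalar product by a Riemann sum. All function names are hypothetical.

```python
import math
import random

def dirichlet_laplacian(f, h):
    """Three-point stencil for -f'' with zero boundary values (grid spacing h)."""
    padded = [0.0] + list(f) + [0.0]
    return [(2.0 * padded[i] - padded[i - 1] - padded[i + 1]) / h ** 2
            for i in range(1, len(f) + 1)]

def inner(f, g, h):
    """Riemann sum approximation of the real L^2(0, 1) scalar product."""
    return h * sum(a * b for a, b in zip(f, g))

def dirichlet_energy(f, h):
    """Discrete analogue of the integral of |f'|^2, including both boundary cells."""
    padded = [0.0] + list(f) + [0.0]
    return h * sum(((padded[i + 1] - padded[i]) / h) ** 2 for i in range(len(padded) - 1))

n = 50
h = 1.0 / (n + 1)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
f = [rng.uniform(-1.0, 1.0) for _ in range(n)]
g = [rng.uniform(-1.0, 1.0) for _ in range(n)]

# Symmetry: <Sf, g> = <f, Sg>.
symmetry_gap = abs(inner(dirichlet_laplacian(f, h), g, h) - inner(f, dirichlet_laplacian(g, h), h))

# Discrete integration by parts: <Sf, f> equals the Dirichlet energy, hence is >= 0.
energy_gap = abs(inner(dirichlet_laplacian(f, h), f, h) - dirichlet_energy(f, h))

# Lowest eigenvalue of the stencil and its eigenvector sin(pi * x).
lowest = 4.0 / h ** 2 * math.sin(0.5 * math.pi * h) ** 2
ground_state = [math.sin(math.pi * (i + 1) * h) for i in range(n)]
residual = max(abs(a - lowest * b)
               for a, b in zip(dirichlet_laplacian(ground_state, h), ground_state))

# Variational principle: every Rayleigh quotient lies above the lowest eigenvalue.
rayleigh = inner(dirichlet_laplacian(f, h), f, h) / inner(f, f, h)
print(lowest, rayleigh)
```

The lowest eigenvalue of the stencil is close to $\pi^2$, the bottom of the spectrum of the Dirichlet-Laplacian on $(0, 1)$, and the Rayleigh quotient of the random vector lies above it, in line with (8).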

3. Basic facts on differential operators on Riemannian manifolds

Let $M$ be a manifold⁴ of dimension $m$ and let $E \to M$, $F \to M$ be vector bundles over $M$ with rank $\ell_0$ and rank $\ell_1$, respectively. We understand all vector bundles to be over $\mathbb{C}$ (if not we can complexify; for example, a priori, the tangent bundle $E = TM \to M$ is of course naturally given over $\mathbb{R}$). We denote with $\Gamma_{C^\infty}(M, E)$ the smooth sections of $E \to M$, that is, the linear space (in fact $C^\infty$-left module) of all smooth maps $\psi : M \to E$ with $\psi(x) \in E_x$ for all $x \in M$. Likewise, smooth compactly supported sections will be denoted with $\Gamma_{C_c^\infty}(M, E)$.

In case $E = M \times \mathbb{C}^l \to M$ is a trivial vector bundle, then each fiber $E_x$ is given by $\{x\} \times \mathbb{C}^l$ and we can identify $\Gamma_{C^\infty}(M, E)$ with $C^\infty(M, \mathbb{C}^l)$, where $C^\infty(M) := C^\infty(M, \mathbb{C})$ for $l = 1$.

A map $P : \Gamma_{C^\infty}(M, E) \to \Gamma_{C^\infty}(M, F)$ is called restrictable, if for all open $U \subset M$ there exists a linear map

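\[ P|_U : \Gamma_{C^\infty}(U, E) \to \Gamma_{C^\infty}(U, F) \]

with $P|_U \psi|_U = (P\psi)|_U$ for all $\psi \in \Gamma_{C^\infty}(M, E)$.

Definition 3.1. A restrictable linear map

\[ P : \Gamma_{C^\infty}(M, E) \to \Gamma_{C^\infty}(M, F) \]

is called a (smooth, linear) partial differential operator of order $\leq k \in \mathbb{N}_{\geq 0}$, if for any chart $((x^1, \dots, x^m), U)$ of $M$ which admits frames⁵ $e_1, \dots, e_{\ell_0} \in \Gamma_{C^\infty}(U, E)$, $f_1, \dots, f_{\ell_1} \in \Gamma_{C^\infty}(U, F)$, and any multi-index⁶ $\alpha \in \mathbb{N}^m_k$, there are (necessarily uniquely determined) smooth functions

\[ P_\alpha : U \to \mathrm{Mat}(\mathbb{C}; \ell_0 \times \ell_1) \]

such that for all $(\varphi^{(1)}, \dots, \varphi^{(\ell_0)}) \in C^\infty(U, \mathbb{C}^{\ell_0})$ one has

\[ P|_U \sum_{i=1}^{\ell_0} \varphi^{(i)} e_i = \sum_{j=1}^{\ell_1} \sum_{i=1}^{\ell_0} \sum_{\alpha \in \mathbb{N}^m_k} P_{\alpha ij} \frac{\partial^{|\alpha|} \varphi^{(i)}}{\partial x^\alpha} f_j \quad \text{in } U. \]

Any differential operator $P$ satisfies $\mathrm{supp}(P\psi) \subset \mathrm{supp}(\psi)$, that is, $P$ is local.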

$^4$ We understand all our manifolds to be smooth and without boundary.
$^5$ That is, 'frame' means that $e_1(x),\dots,e_{\ell_0}(x)$ is a basis of $E_x$ for all $x \in U$.
$^6$ $\mathbb{N}^m_k$ denotes the set of multi-indices $\alpha = (\alpha_1,\dots,\alpha_m) \in (\mathbb{N}_{\ge 0})^m$ such that $\alpha_1 + \dots + \alpha_m \le k$.

Definition 3.2. Let k ∈ N≥0 and let

P :ΓC∞ (M,E) −→ ΓC∞ (M,F ) be a differential operator of order ≤ k. a) The (linear k-th order principal) symbol of P is the unique morphism

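\[ \mathrm{symb}^k_P : (T^*M)^{\odot k} \otimes E \longrightarrow F \]
of vector bundles, where $\odot$ stands for the symmetric tensor product, such that for all $((x^1,\dots,x^m),U)$, $e_1,\dots,e_{\ell_0}$, $f_1,\dots,f_{\ell_1}$ as in Definition 3.1, and all real-valued $\zeta^{(i)}_\alpha \in C^\infty(U)$ (where $i$ runs through $i = 1,\dots,\ell_0$ and $\alpha$ runs through all $\alpha \in \mathbb{N}^m$ with $\alpha_1 + \dots + \alpha_m = k$), one has
\[ \mathrm{symb}^k_P\Big( \sum_{\alpha \in \mathbb{N}^m : \alpha_1+\dots+\alpha_m = k}\ \sum_{i=1}^{\ell_0} \zeta^{(i)}_\alpha\, dx^{\odot\alpha} \otimes e_i \Big) = \sum_{\alpha \in \mathbb{N}^m : \alpha_1+\dots+\alpha_m = k}\ \sum_{i=1}^{\ell_0} \sum_{j=1}^{\ell_1} P_{\alpha ij}\, \zeta^{(i)}_\alpha\, f_j \quad \text{in } U. \]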

b) $P$ is called elliptic, if for all $x \in M$, $v \in T^*_xM \setminus \{0\}$, the linear map $\mathrm{symb}^k_{P,x}(v^{\otimes k}) : E_x \to F_x$ is invertible.
Remark 3.3. 1. Keep in mind that (at least locally) an operator $P$ of order $\le k$ can also be considered as having order $\le l$ where $l > k$ (set the higher order coefficients $= 0$), and then $P$ can be elliptic in the $k$-sense but not in the $l$-sense. Thus we always have to specify the order of $P$ when we talk about ellipticity.
2. Ellipticity is a local question: it needs to be checked in some chart around $x$ only.
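As a quick illustration of Definition 3.2 and Remark 3.3, the following sketch computes principal symbols for constant-coefficient scalar operators on $\mathbb{R}^2$ and $\mathbb{R}$ (trivial line bundles with frame $e_1 = f_1 = 1$). The operators $P_1$, $P_2$, $P_3$ are hypothetical examples chosen for illustration; they do not appear in the text above.

```latex
% Illustrative operators (assumptions, not taken from the notes):
%   P_1 = \partial_1^2 + \partial_2^2,  P_2 = \partial_1^2 - \partial_2^2  on \mathbb{R}^2,
%   P_3 = d/dx  on \mathbb{R}.
% For v = v_1\,dx^1 + v_2\,dx^2 only the coefficients P_\alpha with
% \alpha_1 + \alpha_2 = 2 enter the second order symbol.
\begin{align*}
\mathrm{symb}^2_{P_1,x}(v^{\otimes 2}) &= v_1^2 + v_2^2 \neq 0 \quad \text{for } v \neq 0
  && \text{(elliptic of order } \le 2\text{)},\\
\mathrm{symb}^2_{P_2,x}(v^{\otimes 2}) &= v_1^2 - v_2^2 = 0 \quad \text{for } v = dx^1 + dx^2
  && \text{(not elliptic)},\\
\mathrm{symb}^1_{P_3,x}(v) &= v_1 \neq 0 \quad \text{for } v = v_1\,dx \neq 0
  && \text{(elliptic of order } \le 1\text{)},\\
\mathrm{symb}^2_{P_3,x}(v^{\otimes 2}) &= 0
  && \text{(not elliptic of order } \le 2\text{)}.
\end{align*}
```

The last two lines show the phenomenon described in Remark 3.3: the same operator $d/dx$ is elliptic as an operator of order $\le 1$, but not as an operator of order $\le 2$.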

A (smooth) metric $h_E$ on $E \to M$ is by definition a section $h_E \in \Gamma_{C^\infty}(M, E^* \otimes E^*)$ such that $h_E$ is fiberwise a scalar product. Then the datum $(E,h_E) \to M$ is referred to as a metric vector bundle. In other words, for every $x \in M$ we have a scalar product $h_E(x) : E_x \times E_x \to \mathbb{C}$, and $h_E(x)$ depends smoothly on $x$. The trivial vector bundle $M \times \mathbb{C}^l \to M$ is equipped with its canonical smooth metric, which is induced by $(z,z') \mapsto \sum_{j=1}^l \overline{z_j}\, z'_j$, where $z, z' \in \mathbb{C}^l$.
Definition 3.4. A Riemannian metric on $M$ is by definition a metric $g$ on $TM \to M$, and then the pair $(M,g)$ is called a (smooth) Riemannian manifold.
Proposition and definition 3.5. For any Riemannian metric $g$ on $M$ there exists precisely one Borel measure $\mu_g$ on $M$ such that for every chart $((x^1,\dots,x^m),U)$ for $M$ and any Borel set $N \subset U$, one has
\[ \mu_g(N) = \int_N \sqrt{\det(g(x))}\, dx, \]

where det(g(x)) is the determinant of the matrix gij(x) := g(∂i, ∂j)(x) and where dx = dx1 ··· dxm stands for the Lebesgue integration.

Proof : Exercise. □

The above measure $\mu_g$ is called the Riemannian volume measure on $(M,g)$. It is a Radon measure with full topological support, in the sense that $\mu_g(U) > 0$ for all open nonempty $U \subset M$.
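To see the defining formula of $\mu_g$ at work, here is a minimal worked example in polar coordinates on the slit plane. The chart, the set $N$ and the radius $R$ are illustrative assumptions and not part of the text above.

```latex
% Assumed setting: M = \mathbb{R}^2 with its Euclidean metric, chart
% (r,\theta) on U = \mathbb{R}^2 \setminus \{(x,0) : x \le 0\} with
% r \in (0,\infty), \theta \in (-\pi,\pi).
\begin{align*}
g &= dr^2 + r^2\, d\theta^2, \qquad
(g_{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, \qquad
\sqrt{\det(g)} = r,\\
\mu_g(N) &= \int_{-\pi}^{\pi} \int_0^R r \, dr \, d\theta = \pi R^2
\quad \text{for } N = \{ 0 < r < R,\ -\pi < \theta < \pi \}.
\end{align*}
```

This is the Lebesgue area of the disc of radius $R$ (the removed half-line is a null set), as one expects from the chart independence of $\mu_g$.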

Remark 3.6. That two Borel sections are equal $\mu_g$-a.e. does not depend on a particular choice of $g$. Thus, given $k \in \mathbb{N}_{\ge 0}$, $q \in [1,\infty]$ we can define the local Sobolev space $\Gamma_{W^{k,q}_{loc}}(M,E)$ to be the space of equivalence classes of Borel sections $\psi$ of $E \to M$ such that in every chart $U \subset M$ in which $E \to M$ admits a local frame $(e_j)$ one has $\psi^{(j)} \in W^{k,q}_{loc}(U)$ for all $j$, if $\psi = \sum_j \psi^{(j)} e_j$ in $U$. In particular, we get the local $L^q$-spaces
\[ \Gamma_{L^q_{loc}}(M,E) := \Gamma_{W^{0,q}_{loc}}(M,E). \]
The fundamental lemma of distribution theory takes the following form:

Lemma 3.7. For all $f_1, f_2 \in \Gamma_{L^1_{loc}}(M,E)$ one has $f_1 = f_2$ a.e., if and only if for some pair (equivalently, for all pairs) of metrics $(g,h_E)$ one has
\[ \int_M h_E(f_1,\psi)\, d\mu_g = \int_M h_E(f_2,\psi)\, d\mu_g \quad \text{for all } \psi \in \Gamma_{C^\infty_c}(M,E). \]
Proof : ⇒: Clear.
⇐: Let $U \subset M$ be a chart which admits an orthonormal frame $e_1,\dots,e_l$ for $(E,h_E) \to M$ (of course $M$ can be covered with such $U$'s) and let $\psi$ be an arbitrary smooth section with compact support in $U$. Then writing $f_j = \sum_i f_j^i e_i$, $j = 1,2$, and $\psi = \sum_i \psi^i e_i$ we have
\[ \int_U \sum_i \sqrt{\det(g)}\, \overline{f_1^i}\, \psi^i\, dx = \int_M h_E(f_1,\psi)\, d\mu_g = \int_M h_E(f_2,\psi)\, d\mu_g = \int_U \sum_i \sqrt{\det(g)}\, \overline{f_2^i}\, \psi^i\, dx, \]
so that by the Euclidean fundamental lemma of distribution theory we have
\[ \sqrt{\det(g)}\, f_1^i = \sqrt{\det(g)}\, f_2^i \]
in $U$, for all $i$, so $f_1 = f_2$ as $\sqrt{\det(g)} > 0$. □
Now we can prove:
Proposition and definition 3.8. Assume that $g$ is a Riemannian metric on $M$ and that $(E,h_E) \to M$ and $(F,h_F) \to M$ are metric vector bundles. Then for any differential operator $P : \Gamma_{C^\infty}(M,E) \to \Gamma_{C^\infty}(M,F)$ of order $\le k$ there is a uniquely determined differential operator

\[ P^{g,h_E,h_F} : \Gamma_{C^\infty}(M,F) \longrightarrow \Gamma_{C^\infty}(M,E) \]
of order $\le k$ which satisfies
\[ \int_M h_E\big(P^{g,h_E,h_F}\psi, \phi\big)\, d\mu_g = \int_M h_F(\psi, P\phi)\, d\mu_g \]

for all $\psi \in \Gamma_{C^\infty}(M,F)$, $\phi \in \Gamma_{C^\infty}(M,E)$ with either $\phi$ or $\psi$ compactly supported. The operator $P^{g,h_E,h_F}$ is called the formal adjoint of $P$ with respect to $(g,h_E,h_F)$. An explicit local formula for $P^{g,h_E,h_F}$ can be found in the proof.
Proof : Uniqueness follows from the fundamental lemma of distribution theory. As differential operators are local, it is sufficient to prove the local existence. To this end, in the situation of Definition 3.1, we assume that $e_i$ and $f_j$ are orthonormal with respect to $h_E$ and $h_F$, respectively. Then an integration by parts shows that
\[ (10) \qquad P^{g,h_E,h_F} \sum_{j=1}^{\ell_1} \psi^{(j)} f_j := \sum_{i=1}^{\ell_0} \sum_{j=1}^{\ell_1} \sum_{\alpha \in \mathbb{N}^m_k} \frac{(-1)^{|\alpha|}}{\sqrt{\det(g)}}\, \frac{\partial^{|\alpha|} \big( \overline{P_{\alpha ij}}\, \sqrt{\det(g)}\, \psi^{(j)} \big)}{\partial x^\alpha}\, e_i \quad \text{in } U \]
does the job. □
There is a way to define the action of differential operators on locally integrable functions:

Proposition and definition 3.9. Given $P$ as above, $f \in \Gamma_{L^1_{loc}}(M,E)$ and a subspace $A \subset \Gamma_{L^1_{loc}}(M,F)$ we write $Pf \in A$, if there exists $h \in A$ such that for all triples of metrics $(g,h_E,h_F)$ it holds that
\[ (11) \qquad \int_M h_E\big(P^{g,h_E,h_F}\psi, f\big)\, d\mu_g = \int_M h_F(\psi, h)\, d\mu_g \quad \text{for all } \psi \in \Gamma_{C^\infty_c}(M,F). \]
Then $h$ is uniquely determined and we set $Pf := h$. This property is equivalent to (11) being true for some triple $(g,h_E,h_F)$ of this kind (and is thus independent of the metrics).
Proof : Clearly $h$ is uniquely determined by the fundamental lemma of distribution theory. It remains to show that if (11) holds for some triple $(g,h_E,h_F)$ then it also holds for any other such triple. This is left as an exercise. □
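The local formula (10) and the weak definition (11) can be checked by hand in one dimension. The following sketch is only an illustration under stated assumptions: $M = \mathbb{R}$ with the metric $g = a(x)^2\,dx^2$ for a hypothetical smooth function $a > 0$, trivial line bundles with their canonical metrics, and $P = d/dx$.

```latex
% Assumptions (not from the notes): M = \mathbb{R}, g = a(x)^2 dx^2 with a > 0 smooth,
% E = F trivial line bundles, P = d/dx, so \sqrt{\det(g)} = a and d\mu_g = a\,dx.
\begin{align*}
% formula (10) with k = 1, P_{(1)} = 1, P_{(0)} = 0:
P^{g}\psi &= -\frac{1}{a}\,\frac{d}{dx}\big( a\,\psi \big),\\
% check of the defining identity, for compactly supported \psi or \phi:
\int_{\mathbb{R}} \overline{P^{g}\psi}\;\phi\; a\,dx
  &= -\int_{\mathbb{R}} \overline{(a\psi)'}\;\phi\,dx
   = \int_{\mathbb{R}} \overline{\psi}\;\phi'\; a\,dx .\\
% weak action (11) for a \equiv 1 and f(x) = |x|, with \psi \in C_c^\infty(\mathbb{R}):
\int_{\mathbb{R}} \overline{(-\psi')}\;|x|\,dx
  &= \int_{\mathbb{R}} \overline{\psi}\;\mathrm{sgn}(x)\,dx ,
  \qquad \text{so } Pf = \mathrm{sgn} \in L^1_{loc}(\mathbb{R}).
\end{align*}
```

For $a \equiv 1$ the first line reduces to the familiar $P^{\dagger} = -d/dx$; the last line is the usual weak derivative of $|x|$, obtained by integrating by parts separately on $(-\infty,0)$ and $(0,\infty)$.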

Remark 3.10. Given $f_n, f \in \Gamma_{L^1_{loc}}(M,E)$, one says that $f_n \to f$ in the sense of distributions, if for all $\psi \in \Gamma_{C^\infty_c}(M,E)$ and some pair of metrics $(g,h_E)$ one has
\[ \int_M h_E(f_n - f, \psi)\, d\mu_g \to 0 \]
as $n \to \infty$. Using that $\psi$ is compactly supported one easily checks that this automatically holds for all pairs of metrics $(g,h_E)$. Moreover, distributional limits are uniquely determined. Given $P$ as above, it is clear that $f_n \to f$ in the sense of distributions implies $Pf_n \to Pf$ in the sense of distributions, if $Pf \in \Gamma_{L^1_{loc}}(M,F)$ (as the action of $P$ is defined by duality).
Lemma 3.11 (Local elliptic regularity). Assume

P :ΓC∞ (M,E) −→ ΓC∞ (M,F )

is elliptic of order $\le k$ and let $q \in [1,\infty)$. Then for all $f \in \Gamma_{L^q_{loc}}(M,E)$ with $Pf \in \Gamma_{L^q_{loc}}(M,F)$ one has $f \in \Gamma_{W^{k,q}_{loc}}(M,E)$ if $q > 1$ and $f \in \Gamma_{W^{k-1,1}_{loc}}(M,E)$ if $q = 1$.

Proof : The case $q > 1$ is a classical fact by Nirenberg [26] and can be found in many textbooks such as [27]. The case $q = 1$ is nonstandard and uses Besov spaces. Together with Guidetti and Pallara I have given a proof in [9]. □
Recall in this context that the local Sobolev embedding implies
\[ (12) \qquad \bigcap_{l \in \mathbb{N}} \Gamma_{W^{l,p}_{loc}}(M,E) \subset \Gamma_{C^\infty}(M,E) \quad \text{for all } p \in (1,\infty). \]
Remark 3.12. To give an idea of how ellipticity comes into play in such a result: Assume $M = \mathbb{R}^m$ and $E = F$ are the trivial line bundles $\mathbb{R}^m \times \mathbb{C} \to \mathbb{R}^m$ (so that $P$ acts on functions and has scalar coefficients). Assume further that $P = \sum_{|\alpha| \le k} P_\alpha \partial^\alpha$ has constant coefficients. The global Sobolev spaces $W^{k,2}(\mathbb{R}^m)$, $k \in \mathbb{N}$, can be equivalently defined via the Fourier transform
\[ \mathcal{F} : \mathcal{S}'(\mathbb{R}^m) \to \mathcal{S}'(\mathbb{R}^m) \]
mapping between Schwartz distributions:
\[ W^{k,2}(\mathbb{R}^m) = \Big\{ f \in L^2(\mathbb{R}^m) : \int |\mathcal{F}f(\zeta)|^2 (1 + |\zeta|^2)^k\, d\zeta < \infty \Big\}. \]
Then $P$ defines a continuous map
\[ P : W^{k,2}(\mathbb{R}^m) \to L^2(\mathbb{R}^m), \]
which, using that $\mathcal{F}^{-1} P \mathcal{F}$ is nothing but multiplication by $\zeta \mapsto \sum_{|\alpha| \le k} P_\alpha \zeta^\alpha$, can be shown to be bijective, if $\sum_{|\alpha| = k} P_\alpha \zeta^\alpha \ne 0$ for all $\zeta \in \mathbb{R}^m \setminus \{0\}$ (this requires some work). So $Pf = g \in L^2(\mathbb{R}^m)$ implies $f = P^{-1}g \in W^{k,2}(\mathbb{R}^m)$.
The local case of nonconstant coefficients can be deduced from this result by 'approximating' the coefficients and using some cut-off function machinery, and the local manifold case follows from this by a partition of unity argument.
From now on we fix once and for all a connected Riemannian manifold $M = (M,g)$ with dimension $m$. We are going to omit the dependence on $g$ in the notation whenever there is no danger of confusion. For example, the Riemannian volume measure is denoted by $\mu$. In addition, a metric vector bundle is simply denoted by $E \to M$, that is, the dependence on the fiber metrics will be omitted in the notation and the metric on $E \to M$ is simply denoted by $(\cdot,\cdot)$.
For all $q \in [1,\infty]$ we get the Banach space $\Gamma_{L^q}(M,E)$ given by all equivalence classes of Borel sections $f$ of $E \to M$ such that

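\[ \|f\|_q < \infty, \]
where
\[ \|f\|_q := \begin{cases} \big( \int_M |f|^q\, d\mu \big)^{1/q}, & \text{if } q < \infty,\\ \inf\{ C \ge 0 : |f| \le C \ \mu\text{-a.e.} \}, & \text{if } q = \infty, \end{cases} \]
and $|f| := \sqrt{(f,f)}$ is the fiberwise norm. The space $\Gamma_{L^2}(M,E)$ becomes a Hilbert space via
\[ \langle f_1, f_2 \rangle := \int_M (f_1,f_2)\, d\mu. \]
With this convention, it makes sense to denote the formal adjoint of a differential operator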

\[ P : \Gamma_{C^\infty}(M,E) \longrightarrow \Gamma_{C^\infty}(M,F) \]
acting between metric vector bundles simply by
\[ P^\dagger : \Gamma_{C^\infty}(M,F) \longrightarrow \Gamma_{C^\infty}(M,E). \]
We record:

Lemma 3.13. The space $\Gamma_{C^\infty_c}(M,E)$ is dense in $\Gamma_{L^q}(M,E)$ for all $q \in [1,\infty)$. In particular, $C^\infty_c(M)$ is dense in $L^q(M)$.

Proof : Step 1: $A := \Gamma_{L^q_c}(M,E)$ is dense in $\Gamma_{L^q}(M,E)$.
Proof of step 1: Pick an exhaustion $(K_n)$ of $M$ with compact sets. Given $f \in \Gamma_{L^q}(M,E)$ set $f_n := 1_{K_n} f \in A$. Then we have
\[ \lim_n \int |f_n - f|^q\, d\mu = \lim_n \int |1_{K_n} - 1|^q\, |f|^q\, d\mu = 0 \]
by dominated convergence.

Step 2: $\Gamma_{C^\infty_c}(M,E)$ is dense in $A$.
Proof of step 2: Given $f \in A$ cover its support by finitely many charts $(U_n)$ for $M$ which admit an orthonormal frame. Pick a partition of unity $(\phi_n) \subset C^\infty_c(M)$ subordinate to $(U_n)$. Then $f_n := \phi_n f$ is compactly supported in $U_n$ and $L^q$ thereon. Given arbitrary $\epsilon > 0$, using Friedrichs mollifiers, for each $n$ we can pick $f_{n,\epsilon} \in \Gamma_{C^\infty_c}(U_n,E)$ with
\[ \|f_{n,\epsilon} - f_n\|_q < \epsilon/2^{n+1}. \]
Then $f_\epsilon := \sum_n f_{n,\epsilon} \in \Gamma_{C^\infty_c}(M,E)$ and
\[ \|f_\epsilon - f\|_q = \Big\| \sum_n f_{n,\epsilon} - \sum_n f_n \Big\|_q \le \sum_n \|f_{n,\epsilon} - f_n\|_q < \epsilon, \]
completing the proof. □

4. The Friedrichs realization of the Laplace-Beltrami operator

Since we have fixed $g$, the tangent bundle $TM \to M$ is by definition a metric bundle. Using the isomorphism of vector bundles $\sharp : T^*M \to TM$ induced by the fiberwise nondegeneracy of $g$, we get a metric $g^*$ on $T^*M \to M$ by setting $(\alpha,\beta) := (\sharp\alpha, \sharp\beta)$. Let
\[ d : C^\infty(M) \longrightarrow \Omega^1_{C^\infty}(M) := \Gamma_{C^\infty}(M, T^*M) \]

denote the exterior differential. It is a first order differential operator (which does not depend on $g$) given locally by $df = \sum_i \partial_i f\, dx^i$.
Definition 4.1. The Laplace-Beltrami operator is the second order differential operator given by
\[ \Delta := -d^\dagger d : C^\infty(M) \longrightarrow C^\infty(M). \]
Locally one has
\[ d^\dagger \alpha = -\frac{1}{\sqrt{\det(g)}} \sum_k \partial_k \Big( \sqrt{\det(g)} \sum_j g^{kj} \alpha_j \Big) \]
if $\alpha = \sum_j \alpha_j\, dx^j$ and $g^{kj} := (dx^k, dx^j)$. This formula shows
\[ \Delta = \frac{1}{\sqrt{\det(g)}} \sum_i \partial_i \Big( \sqrt{\det(g)} \sum_j g^{ij} \partial_j \Big), \]
which can be worked out to give
\[ \Delta = \sum_{ij} g^{ij} \partial_i \partial_j + \text{lower order terms}; \]
in particular, in each chart $U$, the symbol of $\Delta$ (as an operator of order $\le 2$...!) is given by $\sum_{ij} g^{ij}(x) \zeta_i \zeta_j$, $x \in U$, $\zeta \in T^*_xM$. This implies that $\Delta$ is elliptic (as $g$ is nondegenerate).
Remark 4.2. Local elliptic regularity shows: $f \in L^2_{loc}(M)$, $\Delta f \in L^2_{loc}(M)$ implies $f \in W^{2,2}_{loc}(M)$; in particular, locally all weak partial derivatives of order $\le 2$ of $f$ are in $L^2_{loc}$ (say in each chart). It is a more delicate matter to investigate the following GLOBAL question: Does $f \in L^2(M)$, $\Delta f \in L^2(M)$ imply, say, $df \in \Omega^1_{L^2}(M)$? We will come back to this later (geodesic completeness!).
Lemma 4.3. a) One has

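\[ (13) \qquad d(f_1 f_2) = f_1\, df_2 + f_2\, df_1, \]
\[ (14) \qquad d^\dagger(f\alpha) = f\, d^\dagger\alpha - (df, \alpha), \]
\[ (15) \qquad \Delta(f_1 f_2) = f_1 \Delta f_2 + f_2 \Delta f_1 + 2\,\Re(df_1, df_2), \]
\[ (16) \qquad \Delta(u \circ f) = (u'' \circ f) \cdot |df|^2 + (u' \circ f) \cdot \Delta f. \]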
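As a sanity check of the local formula for $\Delta$ from Definition 4.1 and of the chain rule (16), one can work in polar coordinates on the punctured plane. The choice of chart and of the test functions $u(s) = s^2$, $f = r$ is an illustrative assumption, not part of the text.

```latex
% Assumed setting: Euclidean \mathbb{R}^2 \setminus \{0\} in polar coordinates (r,\theta),
% so g^{rr} = 1, g^{\theta\theta} = r^{-2}, g^{r\theta} = 0, \sqrt{\det(g)} = r.
\begin{align*}
\Delta &= \frac{1}{r}\Big( \partial_r\big( r\,\partial_r \big)
          + \partial_\theta\big( r \cdot r^{-2}\,\partial_\theta \big) \Big)
        = \partial_r^2 + \frac{1}{r}\,\partial_r + \frac{1}{r^2}\,\partial_\theta^2 ,\\
% check of (16) with u(s) = s^2 and f = r, using |dr|^2 = g^{rr} = 1 and \Delta r = 1/r:
\Delta(r^2) &= 2 + \frac{1}{r}\cdot 2r = 4 ,\qquad
(u''\circ f)\,|df|^2 + (u'\circ f)\,\Delta f = 2\cdot 1 + 2r\cdot\frac{1}{r} = 4 .
\end{align*}
```

Both sides of (16) agree, and the first line is the classical polar form of the Laplacian, with leading part $\sum_{ij} g^{ij}\partial_i\partial_j$ and first order term $r^{-1}\partial_r$.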

Proof : Exercise. For example, one can use the above local formulae. □
Consider now the densely defined, nonnegative, symmetric sesquilinear form $Q'$ in $L^2(M)$ given by
\[ \mathrm{Dom}(Q') = C^\infty_c(M), \quad Q'(f_1,f_2) = \int (df_1, df_2)\, d\mu. \]

It is induced by the symmetric nonnegative operator $-\Delta$ (with $\mathrm{Dom}(-\Delta) = C^\infty_c(M)$), as we have
\[ Q'(f_1,f_2) = \int \overline{(-\Delta f_1)}\, f_2\, d\mu = \langle -\Delta f_1, f_2 \rangle. \]

By Friedrichs' theorem (cf. Example 2.16), it follows that $Q'$ is closable. Let us describe its closure. To this end, define the global Sobolev space
\[ W^{1,2}(M) := \{ f \in L^2(M) : df \in \Omega^1_{L^2}(M) := \Gamma_{L^2}(M, T^*M) \}, \]
which is a Hilbert space with scalar product
\[ \langle f_1, f_2 \rangle_{W^{1,2}} := \langle f_1, f_2 \rangle + \langle df_1, df_2 \rangle = \int \overline{f_1}\, f_2\, d\mu + \int (df_1, df_2)\, d\mu. \]
Then we define
\[ W^{1,2}_0(M) := \text{closure of } C^\infty_c(M) \text{ with respect to } \|\cdot\|_{W^{1,2}}. \]
Remark 4.4. If $M = \mathbb{R}^m$ (with its Euclidean metric) then one has $W^{1,2}_0(\mathbb{R}^m) = W^{1,2}(\mathbb{R}^m)$, while if $M$ is a bounded open subset $U$ of $\mathbb{R}^m$ then one has $W^{1,2}_0(U) \ne W^{1,2}(U)$. We will come to problems of this kind later on.
Now it follows that the closure $Q$ of $Q'$ is the closed, densely defined, nonnegative, symmetric sesquilinear form given by
\[ \mathrm{Dom}(Q) = W^{1,2}_0(M), \quad Q(f_1,f_2) = \int (df_1, df_2)\, d\mu. \]

By Kato's theory (cf. Theorem 2.14) there exists a uniquely determined self-adjoint nonnegative operator $H$ in $L^2(M)$ such that $\mathrm{Dom}(H) \subset \mathrm{Dom}(Q)$ and
\[ \langle Hf_1, f_2 \rangle = Q(f_1,f_2) \quad \text{for all } f_1 \in \mathrm{Dom}(H),\ f_2 \in \mathrm{Dom}(Q). \]
Moreover, some $f_1 \in \mathrm{Dom}(Q)$ is in $\mathrm{Dom}(H)$, if and only if there exists $f_2 \in L^2(M)$ with
\[ Q(f_1,f_3) = \langle f_2, f_3 \rangle \quad \text{for all } f_3 \in C^\infty_c(M), \]
and then $Hf_1 = f_2$. It follows now easily that
\[ \mathrm{Dom}(H) = \{ f \in W^{1,2}_0(M) : \Delta f \in L^2(M) \}, \quad Hf = -\Delta f. \]
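The role of $W^{1,2}_0(M)$ in $\mathrm{Dom}(H)$ can be seen in the simplest incomplete example. The following sketch assumes $M = (0,\pi) \subset \mathbb{R}$ with its Euclidean metric; the functions $f_n$ and the constant function $1$ are illustrative choices, not taken from the text.

```latex
% Assumed setting: M = (0,\pi) with the Euclidean metric, so \Delta = d^2/dx^2.
\begin{align*}
f_n(x) &:= \sin(nx) \in W^{1,2}_0(0,\pi), \qquad -\Delta f_n = n^2 f_n \in L^2(0,\pi)
  \ \Longrightarrow\ f_n \in \mathrm{Dom}(H),\ Hf_n = n^2 f_n ,\\
Q(f_n,f_n) &= \int_0^\pi n^2 \cos^2(nx)\,dx = \frac{n^2\pi}{2}
  = \langle Hf_n, f_n\rangle .\\
% The constant function 1 satisfies \Delta 1 = 0 \in L^2, but the Poincare inequality
% \|f\|_2 \le \|f'\|_2 on W^{1,2}_0(0,\pi) fails for f = 1:
1 &\in W^{1,2}(0,\pi) \setminus W^{1,2}_0(0,\pi)
  \ \Longrightarrow\ 1 \notin \mathrm{Dom}(H) .
\end{align*}
```

So the condition $f \in W^{1,2}_0(M)$ encodes a Dirichlet boundary condition here, in line with Remark 4.4; without it, $-\Delta$ would acquire the additional eigenvalue $0$.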

5. Geodesic completeness and the essential self-adjointness of −∆

This section deals with the following question: under which condition on the geometry of $M$, that is, on $g$, is $H$ the unique self-adjoint realization of $-\Delta$?
To this end, for all $x, y \in M$ we define $\varrho(x,y)$ to be the infimum of all $\int_a^b |\dot\gamma(s)|\, ds$ such that $[a,b] \subset \mathbb{R}$ is a closed interval and $\gamma : [a,b] \to M$ is a piecewise smooth curve with $\gamma(a) = x$, $\gamma(b) = y$. Note that $\dot\gamma(s) \in T_{\gamma(s)}M$ and
\[ \ell(\gamma) := \int_a^b |\dot\gamma(s)|\, ds \]
can be interpreted as the Riemannian length of the curve $\gamma$ (this notion is, as usual, justified by summing up the lengths of polygons approximating the curve and taking the limit).

Remark 5.1. The main reason why we assume throughout that $M$ is connected is that otherwise the set whose infimum defines $\varrho(x,y)$ could be empty, leading to $\varrho(x,y) = \infty$.
The main properties of
\[ \varrho : M \times M \longrightarrow [0,\infty), \quad (x,y) \longmapsto \varrho(x,y) \]
are collected in the following theorem:
Theorem 5.2. a) $\varrho$ is a distance on $M$ (the corresponding open balls will simply be denoted by $B(x,r) := \{ y : \varrho(x,y) < r \} \subset M$ in the sequel) and one has
\[ (17) \qquad \overline{B(x,r)} = \{ y : \varrho(x,y) \le r \}. \]
b) $\varrho$ induces the original topology on $M$.
c) The following statements are equivalent:
i) $M$ is complete.
ii) All closed bounded subsets of $M$ are compact.
ii') All bounded subsets of $M$ are relatively compact.
iii) $M$ admits a sequence $(\chi_n) \subset C^\infty_c(M)$ of first order cut-off functions, that is, $(\chi_n)$ has the following properties:

(C1) $0 \le \chi_n(x) \le 1$ for all $n \in \mathbb{N}_{\ge 1}$, $x \in M$,
(C2) for all compact $K \subset M$, there is an $n_0(K) \in \mathbb{N}$ such that for all $n \ge n_0(K)$ one has $\chi_n|_K = 1$,
(C3) $\|d\chi_n\|_\infty \to 0$ as $n \to \infty$.
Proof : a) Clearly $\varrho$ is nonnegative and $\varrho(x,x) = 0$. To show the triangle inequality, fix $x, y, z \in M$ and pick a piecewise smooth path $\gamma_1$ from $x$ to $z$ and a piecewise smooth path $\gamma_2$ from $z$ to $y$. Let $\gamma$ be the path from $x$ to $y$ obtained as $\gamma = \gamma_2\gamma_1$ in the obvious sense. Then one has
\[ \varrho(x,y) \le \ell(\gamma) = \ell(\gamma_2) + \ell(\gamma_1), \]
so $\varrho(x,y) \le \varrho(x,z) + \varrho(z,y)$ follows from minimizing in $\gamma_1$ and $\gamma_2$. To see that $\varrho$ is nondegenerate, we first prove:
Claim: for all $p \in M$ there exists a chart $p \in U \subset M$ and a constant $C$ such that
\[ C^{-1}|x - y| \le \varrho(x,y) \le C|x - y| \quad \text{for all } x, y \in U. \]
Proof of the claim: pick a chart $p \in W$ with coordinates $x^1,\dots,x^m$ and pick a Euclidean ball $V \subset W$ of radius $r > 0$ around $p$ whose closure is included in $W$. For all $x \in V$, $\zeta \in T_xM$ one has
\[ |\zeta|^2 := |\zeta|_g^2 = \sum_{ij} g_{ij}(x) \zeta^i \zeta^j, \qquad |\zeta|_e^2 = \sum_j (\zeta^j)^2. \]

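Since $(g_{ij}(x))_{ij}$ is positive definite and depends continuously on $x$, and since the closure of $V$ is compact, we find $C > 1$ such that for all $x \in V$, $\zeta \in T_xM$ one has
\[ C^{-2} \sum_j (\zeta^j)^2 \le \sum_{ij} g_{ij}(x) \zeta^i \zeta^j \le C^2 \sum_j (\zeta^j)^2, \]
so
\[ C^{-1}|\zeta|_e \le |\zeta| \le C|\zeta|_e. \]
For any piecewise smooth path $\gamma$ which remains in $V$ we get
\[ C^{-1}\ell_e(\gamma) \le \ell(\gamma) \le C\ell_e(\gamma). \]
If $x, y \in V$, then we get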

%(x, y) ≤ `(γx,y) ≤ C|x − y|,

where $\gamma_{x,y}$ is the straight line from $x$ to $y$. We are going to show that on $U$, defined as the Euclidean ball in $W$ around $p$ of radius $r/3$, one has the reverse inequality, so that $U$ does the job. Let $x, y \in U$ and let $\gamma$ be an arbitrary piecewise smooth curve in $M$ from $x$ to $y$. If $\gamma$ stays in $V$ then$^7$
\[ \ell_e(\gamma) \ge |x - y| \]
and so
\[ (18) \qquad \ell(\gamma) \ge C^{-1}|x - y|. \]
If $\gamma$ intersects $\partial V$, pick the first point $z \in \partial V$ which is hit by $\gamma$ and let $\tilde\gamma$ denote the part of $\gamma$ which connects in $V$ the point $x$ with $z$. Thus
\[ \ell(\gamma) \ge \ell(\tilde\gamma) \ge C^{-1}|x - z| \ge C^{-1}(2r/3) \ge C^{-1}|x - y|. \]

Thus taking $\inf_\gamma$ we get
\[ \varrho(x,y) \ge C^{-1}|x - y|, \]
proving the claim.
In order to show that $\varrho$ is nondegenerate, fix distinct $p, x \in M$. Pick a chart $U$ around $p$ and $C > 1$ as in the above claim. If $x \in U$ then clearly $\varrho(x,p) > 0$. If $x \in M \setminus U$ pick $r > 0$ small with $B_e(p,r) \subset U$ (Euclidean ball). Then any curve $\gamma$ from $x$ to $p$ must hit $\partial B_e(p,r)$, and so $\ell(\gamma) \ge C^{-1}r$, and by taking $\inf_\gamma$ we arrive at $\varrho(x,p) \ge C^{-1}r > 0$. This completes the proof that $\varrho$ is a distance. The proof of (17) is left as an exercise.
b) It is enough to show that for all $p \in M$ there exists a chart $U$ around $p$ and $R > 0$, $C > 1$ such that for all $r \in (0,R]$ one has
\[ B_e(p, C^{-1}r) \subset B(p,r) \subset B_e(p, Cr) \subset U. \]

$^7$ Here we use that the length distance of $x, y \in \mathbb{R}^m$ induced by the Euclidean Riemannian metric is precisely $|x - y|$.

To this end pick $U$, $C$ as in the claim and $\epsilon > 0$ small with $B_e(p,\epsilon) \subset U$. Set $R := \epsilon/(2C)$ and let $0 < r \le R$. If $x \in B_e(p, C^{-1}r)$ we have $x \in U$ and so $x \in B(p,r)$. If $x \notin U$ then any curve $\gamma$ from $x$ to $p$ hits a point $y \in U$ with $|y - p| = \epsilon/2$. Thus we obtain
\[ \ell(\gamma) \ge \varrho(y,p) \ge C^{-1}|y - p| = \epsilon/(2C) \ge r, \]
and taking $\inf_\gamma$ this shows $\varrho(x,p) \ge r$ and so $x \notin B(p,r)$. This completes the proof.
c) i) ⇔ ii): Exercise (a proof which does not use exponential coordinates).
ii) ⇔ ii'): this is trivial.
i) ⇒ iii): I sketch a proof: if $M = (M,g)$ is complete, then by a small generalization of Nash's embedding theorem we can pick a smooth embedding $\iota : M \to \mathbb{R}^l$ such that $g$ is the pull-back of the Euclidean metric on $\mathbb{R}^l$ (thus an isometric embedding), where $l \ge m$ is large enough, and such that $\iota(M)$ is a closed subset of $\mathbb{R}^l$. Note here that the original Nash embedding does not produce a closed image; to correct this, one constructs a new metric $\tilde g$ on $M$, embeds $(M,\tilde g)$ into some $\mathbb{R}^{l'}$ isometrically via some map $\Psi : M \to \mathbb{R}^{l'}$ and constructs, using that closed balls are compact on $(M,g)$, a map $\phi : M \to \mathbb{R}$ such that $\iota := (\Psi,\phi) : M \to \mathbb{R}^l$ is an isometric embedding of $(M,g)$, where $l := l' + 1$. A detailed explanation of the above construction of $\iota$ has been given by O. Mueller in [25]. From here the proof is straightforward: $\iota$ is proper, and therefore the composition

\[ f : M \longrightarrow \mathbb{R}, \quad f(x) := \log(1 + |\iota(x)|^2) \]
is a smooth proper function with $|df| \le 1$, since
\[ \tilde f : \mathbb{R}^l \longrightarrow \mathbb{R}, \quad \tilde f(v) := \log(1 + |v|^2) \]
is a smooth proper function whose gradient is bounded in norm by $1$. Pick now a sequence $(\varphi_n) \subset C^\infty_c(\mathbb{R})$ of first order cut-off functions on the Euclidean space $\mathbb{R}$. (For example, let $\varphi : \mathbb{R} \to [0,1]$ be smooth and compactly supported with $\varphi = 1$ near $0$, and set $\varphi_n(r) := \varphi(r/n)$, $r \in \mathbb{R}$.) Then $\chi_n(x) := \varphi_n(f(x))$ obviously has the desired properties, in view of the chain rule $d\chi_n(x) = \varphi_n'(f(x))\, df(x)$.
iii) ⇒ ii'): Suppose that $M$ admits a sequence $(\chi_n) \subset C^\infty_c(M)$ of first order cut-off functions. Then given $O \in M$, $r > 0$, we are going to show that there is a compact set $A_{O,r} \subset M$ such that

\[ \varrho(x,O) > r \quad \text{for all } x \in M \setminus A_{O,r}, \]
which implies that any open geodesic ball is relatively compact. To see this, we define
\[ A_O := \{O\}, \]
and pick a number $n_{O,r} \in \mathbb{N}$ large enough such that $\chi_{n_{O,r}} = 1$ on $A_O$ and
\[ (19) \qquad \sup_{x \in M} |d\chi_{n_{O,r}}(x)| \le 1/(r+1). \]

Now let $A_{O,r} := \mathrm{supp}(\chi_{n_{O,r}})$, let $x \in M \setminus A_{O,r}$, and let
\[ \gamma : [a,b] \longrightarrow M \]
be a piecewise smooth curve with $\gamma(a) = x$, $\gamma(b) = O$. Then we have
\[ 1 = \chi_{n_{O,r}}(O) - \chi_{n_{O,r}}(x) = \chi_{n_{O,r}}(\gamma(b)) - \chi_{n_{O,r}}(\gamma(a)) = \int_a^b \big( d\chi_{n_{O,r}}(\gamma(s)), \dot\gamma(s) \big)\, ds, \]
where we have used the chain rule. By (19) the right-hand side is at most $\ell(\gamma)/(r+1)$, so taking $\inf_\gamma$ we arrive at
\[ \varrho(x,O) \ge r + 1 \quad \text{for all } x \in M \setminus A_{O,r}, \]
as claimed. □
Now we can prove the following result, which was first shown by Gaffney in 1954 (from my point of view: much ahead of his time!). We follow a proof given by Strichartz in 1983:
Theorem 5.3. Assume $M$ is complete. Then the symmetric nonnegative operator $-\Delta$ (defined on $C^\infty_c(M)$) is essentially self-adjoint in $L^2(M)$. As a consequence, it has a unique self-adjoint extension, which necessarily coincides with $H \ge 0$.
Proof : By the abstract functional analytic fact Theorem 2.5, it suffices to show that $\mathrm{Ker}((-\Delta + 1)^*) = \{0\}$. Let $f \in \mathrm{Ker}((-\Delta + 1)^*)$. Unpacking definitions one finds that this is equivalent to $f \in L^2(M)$ and $-\Delta f = -f$; in particular, $f$ is smooth by local elliptic regularity. We pick a sequence $(\chi_n)$ of first order cut-off functions. Then by the product rule for $d$ from Lemma 4.3 we have

\[ (d(\chi_n f), d(\chi_n f)) = (df, \chi_n f\, d\chi_n) + (df, \chi_n^2\, df) + |f\, d\chi_n|^2 + (f\, d\chi_n, \chi_n\, df), \]
which, using
\[ (df, d(\chi_n^2 f)) = (df, \chi_n^2\, df) + 2(df, f\chi_n\, d\chi_n), \]
implies
\[ |d(\chi_n f)|^2 = (d(\chi_n f), d(\chi_n f)) = (df, d(\chi_n^2 f)) + |f\, d\chi_n|^2 - (df, f\chi_n\, d\chi_n) + (f\, d\chi_n, \chi_n\, df). \]
This in turn implies (after adding the complex conjugate of the formula to itself)
\[ 2|d(\chi_n f)|^2 = 2\,\Re(df, d(\chi_n^2 f)) + 2|f\, d\chi_n|^2. \]
Integrating and then integrating by parts in the last equality, we get
\[ \int |d(\chi_n f)|^2\, d\mu = \Re \int (\chi_n\, d^\dagger df, \chi_n f)\, d\mu + \int |f\, d\chi_n|^2\, d\mu. \]
Using $d^\dagger df = -\Delta f = -f$ and
\[ \int |d(\chi_n f)|^2\, d\mu \ge 0 \]
we see
\[ \int |\chi_n|^2 |f|^2\, d\mu \le \int |f\, d\chi_n|^2\, d\mu, \]
which implies $\int |f|^2\, d\mu = 0$ and thus $f = 0$ by dominated convergence, using the properties of $(\chi_n)$. □
Some remarks are in order:
Remark 5.4. 1. There are some interesting (though not many) incomplete Riemannian manifolds such that $-\Delta$ is essentially self-adjoint.
2. We are going to prove in the exercises that even the Schrödinger operator $-\Delta + V$ in $L^2(M)$ is essentially self-adjoint, if $M$ is complete and $V : M \to \mathbb{R}$ is smooth and bounded from below. Note that $V$ has to be real-valued to get a symmetric operator.
3. The ultimate essential self-adjointness result on Riemannian manifolds is the following one: assume $M$ is complete and $V \in L^2_{loc}(M)$ has a little more local regularity ('local Kato class' of $M$) such that $-\Delta + V$ is bounded from below. Then $-\Delta + V$ is essentially self-adjoint. This result can be applied to get that the Hamilton operator corresponding to a molecule is essentially self-adjoint (so there is no ambiguity concerning the quantum mechanics of matter).
4. Similar essential self-adjointness results hold for operators of the form $\nabla^\dagger\nabla + V$ on metric vector bundles $E \to M$, where $\nabla$ is a metric connection on $E \to M$ and $V$ is a pointwise self-adjoint $L^2_{loc}$-section of $\mathrm{End}(E) \to M$ (Güneysu/Post; Braverman/Milatovic/Shubin; Lesch). These results are needed at least to deal with molecules in magnetic fields.
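For orientation, here is a sketch of first order cut-off functions in the simplest complete case, together with the estimate that drives the last step of the proof of Theorem 5.3. The setting $M = \mathbb{R}^m$ and the profile function $\varphi$ are illustrative assumptions.

```latex
% Assumptions: M = \mathbb{R}^m (Euclidean), \varphi \in C_c^\infty(\mathbb{R}) with
% 0 \le \varphi \le 1, \varphi = 1 on [-1,1], \mathrm{supp}(\varphi) \subset [-2,2].
\begin{align*}
\chi_n(x) &:= \varphi(|x|/n), \qquad
\chi_n = 1 \text{ on } \{|x| \le n\}, \qquad
\|d\chi_n\|_\infty \le \frac{\|\varphi'\|_\infty}{n} \longrightarrow 0 ,\\
% final step of the proof of Theorem 5.3, for f \in L^2(M) with -\Delta f = -f:
\int |\chi_n|^2 |f|^2 \, d\mu
  &\le \int |f\, d\chi_n|^2 \, d\mu
   \le \|d\chi_n\|_\infty^2 \int |f|^2 \, d\mu \longrightarrow 0 ,\\
\int |\chi_n|^2 |f|^2 \, d\mu &\longrightarrow \int |f|^2 \, d\mu
  \quad \text{by (C1), (C2) and dominated convergence.}
\end{align*}
```

Here $\chi_n$ is smooth because $\varphi$ is constant near $0$, and (C1) to (C3) hold by construction. On an incomplete manifold no such sequence exists by Theorem 5.2 c), which is exactly where this argument breaks down.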

6. Some regularity results

Lemma 6.1. Assume $f_1 \in W^{1,2}_0(M)$, $f_2 \in W^{1,2}(M)$, $\Delta f_2 \in L^2(M)$. Then one has the following integration by parts formula,
\[ \int \overline{f_1}\, \Delta f_2\, d\mu = -\int (df_1, df_2)\, d\mu. \]

Proof : If f1 is smooth and compactly supported, then the identity follows immediately from the definition of weak (= distributional) derivatives. It carries over to general f1’s by a trivial density argument. 

Note that every $f_2 \in \mathrm{Dom}(H)$ satisfies the above assumption. Often, this is used in the form $f_2 = e^{-tH}h$ for some $h \in L^2(M)$, $t > 0$, as we know that for all $t > 0$,
\[ \mathrm{Ran}(e^{-tH}) \subset \bigcap_{n \in \mathbb{N}} \mathrm{Dom}(H^n) \]
by the spectral calculus.
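The inclusion $\mathrm{Ran}(e^{-tH}) \subset \bigcap_n \mathrm{Dom}(H^n)$ reduces to an elementary calculus bound. The sketch below writes $E_H$ for the spectral measure of $H$; this symbol is an assumption of the example, since the spectral theorem of Section 2 is not restated here.

```latex
% Assumption: E_H denotes the projection-valued spectral measure of H \ge 0.
\begin{align*}
\sup_{\lambda \ge 0} \lambda^n e^{-t\lambda}
  &= \Big( \frac{n}{e\,t} \Big)^{n}
  \qquad (n \in \mathbb{N}_{\ge 1},\ t > 0;\ \text{maximum at } \lambda = n/t),\\
\int_{[0,\infty)} \lambda^{2n} e^{-2t\lambda} \, d\langle E_H(\lambda) h, h \rangle
  &\le \Big( \frac{n}{e\,t} \Big)^{2n} \|h\|_2^2 < \infty
  \qquad \text{for all } h \in L^2(M).
\end{align*}
```

Finiteness of the integral is precisely the statement $e^{-tH}h \in \mathrm{Dom}(H^n)$, and it also gives the operator norm bound $\|H^n e^{-tH}\|_{2,2} \le (n/(et))^n$.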

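Lemma 6.2. Assume we are given a sequence of smooth functions $\psi_k : \mathbb{R} \to \mathbb{R}$, $k \in \mathbb{N}$, with
\[ \psi_k(0) = 0, \qquad \sup_{k \in \mathbb{N}} \sup_{t \in \mathbb{R}} |\psi_k'(t)| < \infty, \]
and a pair of functions $\psi : \mathbb{R} \to \mathbb{R}$, $\varphi : \mathbb{R} \to \mathbb{R}$ with
\[ \psi_k \to \psi, \qquad \psi_k' \to \varphi \]
pointwise as $k \to \infty$. Then the following statements hold:
a) For every real-valued $f \in W^{1,2}_0(M)$ one has $\psi \circ f \in W^{1,2}_0(M)$ and $d(\psi \circ f) = (\varphi \circ f)\, df$.
b) For every real-valued $f \in W^{1,2}(M)$ one has $\psi \circ f \in W^{1,2}(M)$ and $d(\psi \circ f) = (\varphi \circ f)\, df$. If in addition $\varphi$ is continuous away from an at most countable set, then $f_n, f \in W^{1,2}(M)$, $f_n \to f$ in $W^{1,2}(M)$ implies $\psi \circ f_n \to \psi \circ f$ in $W^{1,2}(M)$, as $n \to \infty$.
c) For every real-valued $f \in W^{1,2}_{loc}(M)$ one has $\psi \circ f \in W^{1,2}_{loc}(M)$ and $d(\psi \circ f) = (\varphi \circ f)\, df$.
Proof : a) Lemma 5.2 in [7].
b) Theorem 5.7 in [7].
c) This follows from applying b) with $M$ replaced by a relatively compact chart of $M$. □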

Denote with a+ := max(0, a) ∈ [0, ∞) the positive part of a ∈ R and with a− := a+ − a ∈ [0, ∞) its negative part.

Example 6.3. Given $c \ge 0$ set $\psi(t) := (t - c)_+$,
\[ \varphi(t) := \begin{cases} 0, & t \le c,\\ 1, & t > c. \end{cases} \]
Then picking $\psi_1 : \mathbb{R} \to \mathbb{R}$ smooth with
\[ \psi_1(t) = \begin{cases} 0, & t - 1 \le c,\\ 1, & t > c + 2, \end{cases} \]
the sequence $\psi_k(t) := k^{-1}\psi_1(kt)$ satisfies the assumptions of the previous lemma, yielding that for all real-valued $f \in W^{1,2}_0(M)$ (resp. $f \in W^{1,2}(M)$) one has $(f - c)_+ \in W^{1,2}_0(M)$ (resp. $(f - c)_+ \in W^{1,2}(M)$) and the formula
\[ d(f - c)_+ = \begin{cases} df, & \text{if } f > c,\\ 0, & \text{else.} \end{cases} \]

Moreover, $f_n \to f$ in $W^{1,2}(M)$ implies $(f_n - c)_+ \to (f - c)_+$ in $W^{1,2}(M)$.
Let $N \subset M$ be an arbitrary subset. Then a function $f : N \to \mathbb{R}$ is called Lipschitz, if there exists a constant $C$ such that for all $x, y \in N$ one has
\[ (20) \qquad |f(x) - f(y)| \le C\varrho(x,y). \]
Lipschitz functions are continuous, restrictions of Lipschitz functions are again Lipschitz, and for a fixed $x_0 \in M$, the function

\[ M \ni x \longmapsto \varrho(x,x_0) \]

is Lipschitz. Note also that if U ⊂ M is open, then with an obvious notation one has

\[ \varrho_U(x,y) \ge \varrho(x,y) \quad \text{for all } x, y \in U, \]
so a Lipschitz function $f : U \to \mathbb{R}$ in the above sense is also a Lipschitz function with respect to the Riemannian manifold $(U, g_U)$.
Remark 6.4. The following assertions can be deduced in an elementary way and hold on every metric space: If $f$, $g$ are Lipschitz, then so are $f + g$, $\min(f,g)$, $\max(f,g)$; the product $fg$ is Lipschitz, if in addition $f$ is bounded on the support of $g$. A function $f : M \to \mathbb{R}$ is called locally Lipschitz, if for each compact $K \subset M$ there exists a $C = C_K$ with (20) for all $x, y \in K$. The composition of a Lipschitz function with a Lipschitz function on $\mathbb{R}$ is Lipschitz; the composition of a locally Lipschitz function with a locally Lipschitz function on $\mathbb{R}$ is locally Lipschitz.
Lemma 6.5. a) If $f : M \to \mathbb{R}$ is a Lipschitz function, then $df$ exists as an element of $\Omega^1_{L^\infty}(M)$ and one has $\|df\|_\infty \le C'$, where $C'$ is the smallest $C$ with (20). If $f : M \to \mathbb{R}$ is locally Lipschitz, then $df$ exists as an element of $\Omega^1_{L^\infty_{loc}}(M)$.
b) A $C^1$-function $f : M \to \mathbb{R}$ with $\|df\|_\infty < \infty$ is Lipschitz. In particular, $C^1$-functions are locally Lipschitz.
Proof : a) This follows from applying the corresponding Euclidean result (Rademacher's theorem) in 'nice charts' like those appearing in the proof of Theorem 5.2, namely, by scaling the charts if necessary, one can find for each $p \in M$ a chart $U$ with $p \in U$ and

\[ (1/2)\delta_{ij} \le g_{ij}(x) \le 2\delta_{ij} \quad \text{for all } x \in U, \]
as bilinear forms (the point is that the constant in this quasi-isometry, $C = 2$, is uniform in each chart). Rademacher's theorem can either be deduced with methods of Analysis 1, by reducing to the $m = 1$ case with a covering argument ('Vitali covering'), using that functions on an interval having bounded variation are almost everywhere differentiable by Lebesgue's theorem, or by using a Sobolev embedding theorem (cf. Theorem 3.1, resp. section 4.2 in [13]).
The local statement can be deduced as follows from the above: Assume $N \subset M$ is open and relatively compact and let $f : M \to \mathbb{R}$ be locally Lipschitz. Pick $\phi \in C^\infty_c(M)$ with $\phi = 1$ on $N$. Then $\phi f$ is globally Lipschitz and so $d(\phi f) \in \Omega^1_{L^\infty}(M)$. Since $f = \phi f$ on $N$ we thus get $df \in \Omega^1_{L^\infty}(N)$.
b) This follows from applying the mean value theorem in nice charts. □
Lemma 6.6. One has $W^{1,2}_c(M) \subset W^{1,2}_0(M)$. In particular, for every compactly supported Lipschitz function $f : M \to \mathbb{R}$ one has $f \in W^{1,2}_0(M)$.
Proof : Let $h \in W^{1,2}_c(M)$. Covering the support of $h$ with finitely many nice charts, we can assume that $M$ is an open subset of the Euclidean $\mathbb{R}^m$. In this case the assertion follows from Friedrichs mollifiers.
For the second statement, note that $f$ is $L^2$ (continuous and compactly supported) and $df$ is $L^2$ (bounded by the previous lemma and compactly supported), so $f \in W^{1,2}_c(M)$. □

Lemma 6.7. One has the product rule $d(f_1 f_2) = f_1\, df_2 + f_2\, df_1$ if $f_1, f_2 : M \to \mathbb{R}$ are locally Lipschitz. If $\psi : \mathbb{R} \to \mathbb{R}$ is $C^1$ and $f : M \to \mathbb{R}$ is locally Lipschitz, then $\psi \circ f$ is locally Lipschitz with the chain rule $d(\psi \circ f) = (\psi' \circ f)\, df$.
Proof : In view of the cut-off function argument from the proof of Lemma 6.5 b) we can assume that $f_1$ and $f_2$ are compactly supported, so that $f_1, f_2 \in W^{1,2}_0(M)$ by the previous lemma. Pick a sequence $(\phi_n) \subset C^\infty_c(M)$ with $\phi_n \to f_1$ in $W^{1,2}(M)$. Since $f_1$ is bounded, we can assume for the proof that $(\phi_n)$ is uniformly bounded in $W^{1,2}(M)$ (and so $(\phi_n)$ is uniformly bounded in $L^2(M)$ and $(d\phi_n)$ is uniformly bounded in $\Omega^1_{L^2}(M)$): indeed, let $C := \|f_1\|_\infty$ and pick $\psi : \mathbb{R} \to \mathbb{R}$ with $\psi(t) = t$ for all $t$ with $|t| \le C$ and with $\psi'$ bounded. Then by Lemma 6.2 we have $\psi \circ \phi_n \to f_1$, and $(\psi \circ \phi_n)$ is uniformly bounded in $W^{1,2}(M)$.
Thus we can pick sequences $\phi_n$, $\theta_n$ in $C^\infty_c(M)$ which are both uniformly bounded in $W^{1,2}(M)$ and satisfy $\phi_n \to f_1$ and $\theta_n \to f_2$ in $W^{1,2}(M)$. Then one easily checks (by a standard $\epsilon/2$ type argument which uses that the $f_j$ are bounded) that $\phi_n\theta_n \to f_1 f_2$ in $L^2(M)$, which using Cauchy-Schwarz implies $\phi_n\theta_n \to f_1 f_2$ in the sense of distributions, and so $d(\phi_n\theta_n) \to d(f_1 f_2)$ in the sense of distributions (cf. Remark 3.10).
Similarly, as $df_1, df_2 \in L^\infty(M)$ by Lemma 6.5 a) (since the $f_j$ are compactly supported), one can check that $\phi_n\, d\theta_n \to f_1\, df_2$ and $\theta_n\, d\phi_n \to f_2\, df_1$ in $\Omega^1_{L^2}(M)$, and so
\[ d(\phi_n\theta_n) = \theta_n\, (d\phi_n) + \phi_n\, d\theta_n \to f_2\, df_1 + f_1\, df_2 \]
in $\Omega^1_{L^2}(M)$, which using Cauchy-Schwarz implies $d(\phi_n\theta_n) \to f_2\, df_1 + f_1\, df_2$ in the sense of distributions. This completes the proof of the product rule.
It remains to prove the asserted chain rule: since $C^1$-functions are locally Lipschitz, it suffices to prove the formula in each open relatively compact subset of $M$. In particular, we can assume that $f$ is compactly supported. Furthermore, we can assume that $\psi(0) = 0$ (if not: consider $\tilde\psi := \psi - \psi(0)$), and as $f$ is bounded also that $\psi$ is compactly supported (in particular, $\psi'$ is bounded). Under these assumptions, the chain rule follows trivially from Lemma 6.2. □
Lemma 6.8. Assume $f_1 : M \to \mathbb{R}$ is bounded and Lipschitz and $f_2 \in W^{1,2}_0(M)$. Then $f_1 f_2 \in W^{1,2}_0(M)$ and one has the product rule $d(f_1 f_2) = f_1\, df_2 + f_2\, df_1$.
Proof : Assume first $f_2 \in C^\infty_c(M)$. Then $f_1 f_2$ is compactly supported and Lipschitz, thus in $W^{1,2}_0(M)$, and the product rule holds by the previous lemma.
If $f_2 \in W^{1,2}_0(M)$, then $f_1 f_2 \in L^2(M)$ and $f_1\, df_2 + f_2\, df_1 \in \Omega^1_{L^2}(M)$, as $f_1$ and $df_1$ are bounded. This implies $f_1 f_2 \in W^{1,2}(M)$. Pick a sequence $\phi_n$ in $C^\infty_c(M)$ such that $\phi_n \to f_2$ in $W^{1,2}(M)$. Then we have $f_1\phi_n \in W^{1,2}_0(M)$ and $f_1\phi_n \to f_1 f_2$ in $L^2(M)$, as $f_1$ is bounded. Applying the product rule to $f_1\phi_n$ and using that $f_1$, $df_1$ are bounded, one easily finds that also
\[ d(f_1\phi_n) \to f_1\, df_2 + f_2\, df_1 \]
in $\Omega^1_{L^2}(M)$. It follows from these two convergences that $f_1\phi_n \to f_1 f_2$ in $W^{1,2}(M)$ and so $f_1 f_2 \in W^{1,2}_0(M)$, as the latter is a closed subspace of $W^{1,2}(M)$. Finally, $d(f_1\phi_n) \to f_1\, df_2 + f_2\, df_1$ in $\Omega^1_{L^2}(M)$ implies the corresponding convergence in the sense of distributions (by Cauchy-Schwarz), $f_1\phi_n \to f_1 f_2$ in $L^2(M)$ implies the corresponding convergence in

the sense of distributions, and so by Remark 3.10 also $d(f_1\phi_n) \to d(f_1 f_2)$, which also establishes the product formula for $f_1 f_2$. □
Lemma 6.9. Assume $f_1 \in W^{1,2}_{loc}(M)$ and that $f_2 : M \to \mathbb{R}$ is compactly supported and Lipschitz. Then one has $f_1 f_2 \in W^{1,2}_0(M)$ and the product rule applies.

Proof : Multiplying $f_1$ with a smooth compactly supported function which is $= 1$ on the support of $f_2$ we can assume that $f_1 \in W^{1,2}_0(M)$ (cf. Lemma 6.6), in which case the statement follows from the previous lemma. □

7. Basic properties of the heat kernel

The "heat semigroup"
\[ (e^{-tH})_{t \ge 0} \subset \mathscr{L}(L^2(M)) \]
is defined by the spectral calculus. It is a strongly continuous and self-adjoint semigroup with
\[ \|e^{-tH}\|_{2,2} \le 1, \]
where $\|\cdot\|_{q_1,q_2}$ denotes the operator norm for linear operators from $L^{q_1}(M)$ to $L^{q_2}(M)$. Moreover, for every $f \in L^2(M)$ the path
\[ [0,\infty) \ni t \longmapsto e^{-tH}f \in L^2(M) \]
is the uniquely determined continuous path $[0,\infty) \to L^2(M)$ which is $C^1$ in $(0,\infty)$ (in the norm topology) with values in $\mathrm{Dom}(H)$ thereon, and which satisfies the abstract "heat equation"
\[ (d/dt)\, e^{-tH}f = -He^{-tH}f, \quad t > 0, \]
subject to the initial condition $e^{-tH}f|_{t=0} = f$. All of the above facts follow from abstract functional analytic results and only rely on the fact that $H$ is self-adjoint and nonnegative.
The aim of this section is to show that $e^{-tH}$ is given by an integral kernel
\[ e^{-tH}f(x) = \int p(t,x,y) f(y)\, d\mu(y), \]

such that for fixed x,(t, y) 7→ p(t, x, y) solves the heat equation

∂tu(t, y) = ∆yu(t, y)

with initial condition u(0, x) = δx. Theorem 7.1. a) There is a unique smooth map (0, ∞) × M × M 3 (t, x, y) 7−→ p(t, x, y) ∈ [0, ∞), the heat kernel of H, such that for all t > 0, f ∈ L2(M), and µ-a.e. x ∈ M one has Z (21) e−tH f(x) = p(t, x, y)f(y)dµ(y). 32 B. GUNEYSU¨

b) For all $s, t > 0$, $x, y \in M$ one has
\[ \int p(t,x,y)^2\,d\mu(y) < \infty, \tag{22} \]
\[ p(t,y,x) = p(t,x,y), \tag{23} \]
\[ p(t+s,x,y) = \int p(t,x,z)\,p(s,z,y)\,d\mu(z), \tag{24} \]
\[ \int p(t,x,z)\,d\mu(z) \le 1. \tag{25} \]

c) For any $f \in L^2(M)$, the function
\[ (0,\infty) \times M \ni (t,x) \longmapsto P_tf(x) := \int p(t,x,y)f(y)\,d\mu(y) \in \mathbb{C} \]
is smooth and one has
\[ \frac{\partial}{\partial t} P_tf(x) = \Delta_x P_tf(x) \quad \text{for all } (t,x) \in (0,\infty) \times M. \]
d) For all fixed $x \in M$, the function $(t,y) \mapsto p(t,x,y)$ solves the heat equation
\[ \partial_t u(t,y) = \Delta_y u(t,y) \]
in $(0,\infty) \times M$, with initial condition $u(0,\cdot) = \delta_x$, in the sense that
\[ \lim_{t\to 0+} \int p(t,x,y)\varphi(y)\,d\mu(y) = \varphi(x) \quad \text{for all } \varphi \in C_c^\infty(M). \]

Proof : Before we come to the proof of the actual statements of Theorem 7.1, let us first establish some auxiliary results.

Step 1: For fixed $t > 0$ and $f \in L^2(M)$, there exists a smooth version of $x \mapsto e^{-tH}f(x)$ (which from now on will always be taken).
Proof: To see this, note that for any $n \in \mathbb{N}_{\ge 1}$ one has
\[ \mathrm{Dom}(H^n) \subset W^{2n,2}_{loc}(M), \]
by local elliptic regularity. By the spectral calculus and the local Sobolev embedding (12), this implies
\[ \mathrm{Ran}(e^{-tH}) \subset \bigcap_{n \in \mathbb{N}_{\ge 1}} \mathrm{Dom}(H^n) \subset C^\infty(M) \quad \text{for any } t > 0. \]

Step 2: For any $t > 0$, $U \subset M$ open and relatively compact, the map
\[ e^{-tH} : L^2(M) \longrightarrow C_b(U) \tag{26} \]
is a bounded linear operator between Banach spaces, where the space of bounded continuous functions $C_b(U)$ is equipped with its usual uniform norm.
Proof: A priori, this map is algebraically well-defined by Step 1. The asserted boundedness follows from the closed graph theorem, noting that the $L^2(M)$-convergence of a sequence

implies the existence of a subsequence which converges $\mu$-a.e.

Step 3: For fixed $s > 0$, the map
\[ L^2(M) \times M \ni (f,x) \longmapsto e^{-sH}f(x) \in \mathbb{C} \]
is jointly continuous.
Proof: Let $U \subset M$ be an arbitrary open and relatively compact subset. Given a sequence $((f_n,x_n))_{n \in \mathbb{N}_{\ge 0}} \subset L^2(M) \times U$ which converges to $(f,x) \in L^2(M) \times U$, we have
\[ |e^{-sH}f_n(x_n) - e^{-sH}f(x)| \le |e^{-sH}[f_n - f](x_n)| + |e^{-sH}f(x) - e^{-sH}f(x_n)| \]
\[ \le \|e^{-sH}\|_{L^2(M),C_b(U)}\,\|f_n - f\|_2 + |e^{-sH}f(x) - e^{-sH}f(x_n)| \to 0, \quad \text{as } n \to \infty, \]
by Step 2 and Step 1.

Step 4: For fixed $\varepsilon > 0$ and $f \in L^2(M)$, the map
\[ \{\Re > \varepsilon\} \times M \ni (z,x) \longmapsto e^{-zH}f(x) \]
is jointly continuous.
Proof: Indeed, this map is equal to the composition of the maps
\[ \{\Re > \varepsilon\} \times M \xrightarrow{\ (z,x) \mapsto (e^{-(z-\varepsilon)H}f,\,x)\ } L^2(M) \times M \xrightarrow{\ (f,x) \mapsto e^{-\varepsilon H}f(x)\ } \mathbb{C}, \]
where the second map is continuous by Step 3. The first map is continuous, since the map
\[ \{\Re > 0\} \ni z \longmapsto e^{-zH}f \in L^2(M) \tag{27} \]
is holomorphic. Note that, a priori, (27) is a weakly holomorphic semigroup by the spectral calculus, which is then indeed (norm-)holomorphic by the weak-to-strong differentiability theorem.

Step 5: For any $f \in L^2(M)$, there exists a jointly smooth version $(t,x) \mapsto P_tf(x)$ of $(t,x) \mapsto e^{-tH}f(x)$, which satisfies
\[ \frac{\partial}{\partial t} P_tf(x) = \Delta_x P_tf(x). \tag{28} \]
Proof: By Step 4, for arbitrary $f \in L^2(M)$, the map
\[ \{\Re > 0\} \times M \ni (z,x) \longmapsto e^{-zH}f(x) \in \mathbb{C} \]
is jointly continuous. It then follows from the holomorphy of (27) that for any open ball $B$ in the open right complex plane which has a nonempty intersection with $(0,\infty)$, for any

$t \in B \cap (0,\infty)$, and for any $x \in M$, we have Cauchy’s integral formula
\[ e^{-tH}f(x) = \frac{1}{2\pi\sqrt{-1}} \oint_{\partial B} \frac{e^{-zH}f(x)}{z - t}\,dz, \]
noting that the holomorphy of (27) a priori only implies Cauchy’s integral formula for almost every $x$, and that both sides are continuous in $x$. Now the claim follows from differentiating under the line integral, observing that for fixed $z \in \{\Re > 0\}$, the map
\[ M \ni x \longmapsto e^{-zH}f(x) = e^{-\Re(z)H}\big[e^{-\sqrt{-1}\,\Im(z)H}f\big](x) \in \mathbb{C} \]
is smooth by Step 1. Finally, the asserted formula (28) follows from the by now proved existence of a smooth version of $(t,x) \mapsto e^{-tH}f(x)$ and the fact that
\[ (d/dt)\,e^{-tH}f = -He^{-tH}f, \quad t > 0, \]

in the sense of norm differentiable maps $(0,\infty) \to L^2(M)$.

Let us now come to the actual proof of Theorem 7.1.
a) First of all, it is clear that any such heat kernel is uniquely determined (by the fundamental lemma of distribution theory). To see its existence, we start by remarking that for every $x \in M$, $t > 0$, the complex linear functional given by
\[ L^2(M) \ni f \longmapsto P_tf(x) \in \mathbb{C} \]
is bounded by Step 2. Thus by the Riesz representation theorem, there exists a unique function $p_{t,x} \in L^2(M)$ such that for all $f \in L^2(M)$ one has
\[ P_tf(x) = \langle p_{t,x}, f \rangle. \tag{29} \]
Clearly $p_{t,x}$ is real-valued, for if not, then $e^{-tH}$ would not preserve reality (but it does, as $-\Delta$ is an operator with real-valued coefficients, so $H$ preserves reality and so does its heat semigroup). Moreover, it follows immediately from Step 5 that $(t,x) \mapsto p_{t,x}$ is weakly smooth. Then, this map is in fact norm smooth as a map $(0,\infty) \times M \to L^2(M)$ by the weak-to-strong differentiability theorem. We claim that the integral kernel which is well-defined by the “regularization”
\[ p(t,x,y) := \langle p_{t/2,x}, p_{t/2,y} \rangle \tag{30} \]
has the desired properties. Firstly, the smoothness of $(t,x,y) \mapsto p(t,x,y)$ follows immediately from the norm smoothness of $(t,x) \mapsto p_{t,x}$ and the smoothness of the Hilbertian pairing $(f,g) \mapsto \langle f,g \rangle$.

Claim 1: One has
\[ P_{t+s}f(x) = \int \langle p_{t,z}, p_{s,x} \rangle f(z)\,d\mu(z). \]

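Proof of Claim 1:
\[ P_{t+s}f(x) = P_sP_tf(x) = \langle p_{s,x}, P_tf \rangle = \langle P_tp_{s,x}, f \rangle = \int P_tp_{s,x}(z)f(z)\,d\mu(z) = \int \langle p_{t,z}, p_{s,x} \rangle f(z)\,d\mu(z). \]

Claim 2: For all $t > 0$ and $x, y \in M$, the scalar product $\langle p_{s',x}, p_{t-s',y} \rangle$ does not depend on $s' \in (0,t)$.
Proof of Claim 2: Let $r \in (0,s')$. Then using Claim 1 with $f = p_{r,x}$,
\[ \langle p_{s',x}, p_{t-s',y} \rangle = P_{s'}p_{t-s',y}(x) = P_rP_{s'-r}p_{t-s',y}(x) = \int p_{r,x}(z)\,\langle p_{s'-r,z}, p_{t-s',y} \rangle\,d\mu(z) \]
\[ = P_{t-r}p_{r,x}(y) = \langle p_{t-r,y}, p_{r,x} \rangle = \langle p_{r,x}, p_{t-r,y} \rangle, \]
and the right-hand side no longer involves $s'$.

Now it follows from Claim 1 that
\[ P_tf(x) = \int \langle p_{t/2,x}, p_{t/2,y} \rangle f(y)\,d\mu(y) = \int p(t,x,y)f(y)\,d\mu(y). \]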

It remains to show $p(t,x,y) \ge 0$: It will be shown as an exercise (which relies on Lemma 6.2 and Example 6.3!) that $f \le 1$ implies $P_tf \le 1$. Thus if $c > 0$ and $f \le c$ we have $P_tf \le c$. If $f \ge 0$ we have $-f \le c$ for all $c > 0$, so that we get $P_t(-f) \le c$ and taking $c \to 0+$ we have shown that $f \ge 0$ implies $P_tf \ge 0$. Thus writing
\[ p(t,x,y) = p(t,x,y)_+ - p(t,x,y)_- \]
we get
\[ 0 \le P_t\big(p(t,x,\cdot)_-\big)(x) = \langle p(t,x,\cdot), p(t,x,\cdot)_- \rangle = \langle p(t,x,\cdot)_+, p(t,x,\cdot)_- \rangle - \langle p(t,x,\cdot)_-, p(t,x,\cdot)_- \rangle = -\langle p(t,x,\cdot)_-, p(t,x,\cdot)_- \rangle, \]
so $\|p(t,x,\cdot)_-\|_2 = 0$ and the claim follows from continuity.

b) As by the fundamental lemma of distribution theory we have $p_{t,x} = p(t,x,\cdot)$ $\mu$-a.e., and $p_{t,x} \in L^2(M)$, it is clear that
\[ \int p(t,x,y)^2\,d\mu(y) < \infty. \]
The symmetry $p(t,y,x) = p(t,x,y)$ follows immediately from
\[ p(t,x,y) = \langle p_{t/2,x}, p_{t/2,y} \rangle. \]
Next, for all $0 < s' < t'$ one has
\[ p(t',x,y) = \langle p_{s',x}, p_{t'-s',y} \rangle, \]
as the formula holds for $s' = t'/2$ and as the RHS does not depend on $s'$ by Claim 2. So
\[ \int p(t,x,z)\,p(s,z,y)\,d\mu(z) = \langle p(t,x,\cdot), p(s,y,\cdot) \rangle = \langle p_{t,x}, p_{s,y} \rangle = p(t+s,x,y). \]
It remains to show
\[ \int p(t,x,y)\,d\mu(y) \le 1. \]
This follows by monotone convergence from $P_tf \le 1$ for all $f \le 1$, by letting $f$ run through $f = 1_{K_n}$ for $K_n$ some compact exhaustion of $M$.

c) This is Step 5 together with the proof of part a).

d) For fixed $s > 0$ we set
\[ v(t,y) := p(t+s,x,y) = p(t+s,y,x) = \int p(t,y,z)\,p(s,z,x)\,d\mu(z) = P_t\,p(s,\cdot,x)(y), \]
which by Step 5 solves the heat equation in $(t,y)$. It follows that $(t,y) \mapsto v(t-s,y) = p(t,x,y)$ solves the heat equation, too. □
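The properties (22) to (25) can be sanity-checked numerically in the simplest model case. The following sketch assumes $M = \mathbb{R}$ with the Lebesgue measure, where (with the convention $\partial_t u = \Delta u$ used here) the heat kernel is the Gauss kernel $p(t,x,y) = (4\pi t)^{-1/2}e^{-(x-y)^2/(4t)}$. The helper names `gauss_kernel` and `integrate`, the quadrature window and the sample values of $t, s, x, y$ are illustrative choices and not part of the notes.

```python
import math

def gauss_kernel(t, x, y):
    """Heat kernel on the real line for the convention du/dt = u''."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def integrate(g, lo=-30.0, hi=30.0, n=6000):
    """Composite trapezoidal rule for g on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h

t, s, x, y = 0.7, 0.4, 0.3, -1.1

# (23): symmetry in the space variables
symmetry_gap = abs(gauss_kernel(t, x, y) - gauss_kernel(t, y, x))

# (24): p(t+s, x, y) equals the integral of p(t, x, z) p(s, z, y) over z
convolved = integrate(lambda z: gauss_kernel(t, x, z) * gauss_kernel(s, z, y))
semigroup_gap = abs(convolved - gauss_kernel(t + s, x, y))

# (25): total mass is at most 1 (on the real line it equals 1)
mass = integrate(lambda z: gauss_kernel(t, x, z))

# (22): by (23) and (24) the squared L^2-norm of p(t, x, .) is p(2t, x, x)
l2_norm_sq = integrate(lambda z: gauss_kernel(t, x, z) ** 2)
diagonal_gap = abs(l2_norm_sq - gauss_kernel(2.0 * t, x, x))

print(symmetry_gap, semigroup_gap, mass, diagonal_gap)
```

On a general $M$ no closed formula is available, which is why the proof above goes through the spectral calculus instead.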

8. Strong parabolic maximum principle and its applications

From here on we will closely follow the presentation from Grigor’yan’s book [7]. The following result (and all its consequences) relies heavily on our standing assumption that $M$ is connected:

Theorem 8.1. i) The strong parabolic minimum principle holds: assume $I \subset \mathbb{R}$ is an open interval and $0 \le u \in C^2(I \times M)$ solves
\[ \partial_t u \ge \Delta u. \]
If there exists $(t_0,x_0) \in I \times M$ with $u(t_0,x_0) = 0$, then one has $u(t,x) = 0$ for all $x \in M$ and all $t \le t_0$.
ii) The strong parabolic maximum principle holds: assume $I \subset \mathbb{R}$ is an open interval and $0 \ge u \in C^2(I \times M)$ solves
\[ \partial_t u \le \Delta u. \]
If there exists $(t_0,x_0) \in I \times M$ with $u(t_0,x_0) = 0$, then one has $u(t,x) = 0$ for all $x \in M$ and all $t \le t_0$.

Proof : i) Step 1): Let $\Omega \subset \mathbb{R} \times M$ be nonempty, open, and relatively compact and assume $u \in C^2(\overline{\Omega})$ (this means that $u$ is the restriction to $\overline{\Omega}$ of a $C^2$-function on $\mathbb{R} \times M$) is such that
\[ \partial_t u \ge \Delta u \quad \text{in } \Omega. \]
Then one has
\[ \inf_{\Omega} u = \inf_{\partial_p\Omega} u, \]
where $\partial_p\Omega$ denotes the parabolic boundary of $\Omega$, which is defined as the complement
\[ \partial\Omega \setminus \partial_{top}\Omega, \]
where $\partial_{top}\Omega$ denotes the set of all $(t,x) \in \partial\Omega$ which admit an open neighborhood $U \subset M$ of $x$ and $\varepsilon > 0$ such that $(t-\varepsilon,t) \times U \subset \Omega$. This is called the parabolic minimum principle.
Proof of step 1:

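WLOG we can assume the strict inequality $\partial_t u > \Delta u$ in $\Omega$ (if this is not satisfied, one can replace $u$ by $u_\varepsilon := u + \varepsilon t$ and take $\varepsilon \to 0+$). Let $(t_0,x_0) \in \overline{\Omega}$ be a point at which $u$ attains its minimum,
\[ u(t_0,x_0) = \min_{(s,y) \in \overline{\Omega}} u(s,y). \]
It suffices to show $(t_0,x_0) \in \partial_p\Omega$. Assume the contrary. Then either $(t_0,x_0) \in \Omega$ or $(t_0,x_0) \in \partial_{top}\Omega$. In both cases there exists a chart $U$ around $x_0$ and $\varepsilon > 0$ such that
\[ \Gamma := [t_0-\varepsilon,t_0] \times U \subset \overline{\Omega}. \]
By the definition of $(t_0,x_0)$, one has
\[ u(t_0,x_0) = \min_{s \in [t_0-\varepsilon,t_0]} u(s,x_0), \]
and so $\partial_t u(t_0,x_0) \le 0$. By diagonalizing $g_{ij}(x_0)$ and making the induced coordinate transformation on $U$, we can assume that the coordinates $(x^1,\dots,x^m)$ on $U$ satisfy, for some constants $b_1,\dots,b_m$,
\[ \Delta f(x_0) = \sum_{i=1}^m \frac{\partial^2}{(\partial x^i)^2} f(x_0) + \sum_{i=1}^m b_i \frac{\partial}{\partial x^i} f(x_0) \]
for all $f \in C^2(U)$. Since
\[ u(t_0,x_0) = \min_{y \in U} u(t_0,y), \]
we have
\[ \frac{\partial}{\partial x^i} u(t_0,x_0) = 0, \qquad \frac{\partial^2}{(\partial x^i)^2} u(t_0,x_0) \ge 0, \]
and so
\[ \Delta u(t_0,x_0) \ge 0. \]
This estimate together with $\partial_t u(t_0,x_0) \le 0$ contradicts $\partial_t u > \Delta u$ and proves the parabolic minimum principle.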
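The parabolic minimum principle has an exact discrete analogue, which may help to see why the top of the cylinder is excluded from $\partial_p\Omega$. The sketch below is an illustration only, under assumptions that are not in the notes: $M$ is replaced by the interval $[0,1]$, the inequality $\partial_t u \ge \Delta u$ is modelled by an explicit finite difference scheme for $u_t = u_{xx} + g$ with $g \ge 0$, and the initial data, boundary data and source are arbitrary made-up choices. For $r = dt/dx^2 \le 1/2$ each update is a convex combination of old values plus a nonnegative term, so the minimum over the whole grid is attained on the bottom row or on the lateral sides.

```python
import math

def solve_heat_supersolution(nx=41, nt=400, r=0.4):
    """Explicit scheme for u_t = u_xx + g, g >= 0, on [0, 1]; returns all time rows."""
    dx = 1.0 / (nx - 1)
    dt = r * dx * dx
    xs = [j * dx for j in range(nx)]
    # hypothetical initial data (bottom of the cylinder)
    row = [math.cos(3.0 * math.pi * x) + 0.5 * x for x in xs]
    rows = [row]
    for n in range(1, nt + 1):
        t = n * dt
        new = [0.0] * nx
        new[0] = 1.0 + math.sin(5.0 * t)   # hypothetical left lateral data
        new[-1] = -0.5 + t                 # hypothetical right lateral data
        for j in range(1, nx - 1):
            source = 0.3 * xs[j]           # g >= 0 gives a supersolution
            new[j] = row[j] + r * (row[j + 1] - 2.0 * row[j] + row[j - 1]) + dt * source
        rows.append(new)
        row = new
    return rows

rows = solve_heat_supersolution()
overall_min = min(min(row) for row in rows)
# discrete parabolic boundary: bottom row and the two lateral columns, no top row
boundary_min = min(min(rows[0]),
                   min(row[0] for row in rows),
                   min(row[-1] for row in rows))
print(overall_min, boundary_min)
```

The two printed numbers agree up to rounding, in line with $\inf_\Omega u = \inf_{\partial_p\Omega} u$.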

Step 2): Let $V$ be a chart in $M$, let $x_0, x_1 \in V$ be such that the line connecting $x_0$ and $x_1$ lies in $V$ and assume $I \subset \mathbb{R}$ is an open interval and $0 \le u \in C^2(I \times M)$ solves
\[ \partial_t u \ge \Delta u. \]
Then for all $t_0, t_1 \in I$ with $t_1 > t_0$ and $u(t_0,x_0) > 0$ one has $u(t_1,x_1) > 0$.

Proof of step 2: Assume WLOG $t_0 = 0$ and that
• $V$ is relatively compact and its closure is contained in a chart,
• $r > 0$ is so small that the $2r$-neighborhood of the line connecting $x_0$ and $x_1$ lies in $V$, and that for $U := B_e(x_0,r)$ (the Euclidean ball in the chart) one has
\[ \inf_{x \in U} u(0,x) > 0. \]
Set
\[ \zeta := \frac{1}{t_1}(x_1 - x_0). \]
Then for all $t \in [0,t_1]$ one has $U + t\zeta \subset V$. Consider the open tilted (oblique) cylinder
\[ \Gamma := \{(t,x) : t \in (0,t_1),\ x \in U + t\zeta\}. \]
We are going to show that $u > 0$ in $\overline{\Gamma}$ away from the lateral surface of $\Gamma$ (that is, the boundary of $\Gamma$ without its top and bottom). To this end, pick a function $v \in C^2(\overline{\Gamma})$ such that
\[ \partial_t v \le \Delta v \quad \text{in } \Gamma, \tag{31} \]
\[ v = 0 \text{ on the lateral surface of } \Gamma \text{ and } v > 0 \text{ elsewhere on } \overline{\Gamma}. \tag{32} \]
Such a function $v$ will be constructed in the exercises. Pick $\varepsilon > 0$ such that
\[ \inf_{x \in U} u(0,x) \ge \varepsilon \sup_{x \in U} v(0,x), \]
in particular, $u \ge \varepsilon v$ at the bottom $U$ of $\Gamma$. As $u \ge 0 = \varepsilon v$ on the lateral surface, we get $u \ge \varepsilon v$ on $\partial_p\Gamma$. Since the function $u - \varepsilon v$ satisfies the assumptions of the parabolic minimum principle, one has $u \ge \varepsilon v$ in $\Gamma$. In light of (32), this implies $u > 0$ on $\overline{\Gamma}$ away from the lateral surface of $\Gamma$, completing the proof of step 2.

Step 3): The strong parabolic minimum principle holds:
Proof of step 3: Given $u(t',x') = 0$ for some $(t',x') \in I \times M$, it suffices to prove $u(t,x) = 0$ for all $(t,x) \in I \times M$ with $t < t'$. Pick a finite sequence of points $x_0,\dots,x_k$ such that $x_0 = x$, $x_k = x'$ and such that $x_i$ and $x_{i+1}$ lie in the same chart together with the line connecting these two points, for all $i = 0,\dots,k-1$ (finally, here we use that $M$ is connected!). Picking a finite sequence of times $t = t_0 < \dots < t_k = t'$ we can use step 2 $k$ times to deduce that if $u(t_0,x_0) = u(t,x) > 0$ then also $u(t_1,x_1) > 0$, and so $u(t_2,x_2) > 0$ and so on, yielding finally that $u(t_k,x_k) = u(t',x') > 0$, a contradiction. This completes the proof of the strong parabolic minimum principle.
ii) This follows from applying i) to $-u$. □

Corollary 8.2. One has $p > 0$.

Proof : Assume there exist $t_0, x_0, y_0$ with $p(t_0,x_0,y_0) = 0$. Then as $(t,y) \mapsto p(t,x_0,y)$ solves the heat equation one has $p(t,x_0,y) = 0$ for all $y \in M$ and all $t \le t_0$. Pick $\varphi$ smooth compactly supported with $\varphi(x_0) = 1$. Then we have
\[ \int p(t,x_0,y)\varphi(y)\,d\mu(y) \to 0 \]

as $t \to 0+$ by $p(t,x_0,y) = 0$ for all $y \in M$ and all $t \le t_0$, while
\[ \int p(t,x_0,y)\varphi(y)\,d\mu(y) \to 1 \]
as $t \to 0+$ by Theorem 7.1 d) and $\varphi(x_0) = 1$. □

Definition 8.3. Given $\alpha \in \mathbb{R}$, a real-valued function $u \in C^2(M)$ is called
• $\alpha$-superharmonic, if $(-\Delta + \alpha)u \ge 0$,
• $\alpha$-subharmonic, if $(-\Delta + \alpha)u \le 0$,
• $\alpha$-harmonic, if $(-\Delta + \alpha)u = 0$.
In the $\alpha$-harmonic case we can assume that $u$ is smooth by local elliptic regularity. If $\alpha = 0$, one simply says superharmonic (subharmonic) [harmonic], instead of 0-superharmonic (0-subharmonic) [0-harmonic].

Theorem 8.4 (Strong elliptic minimum/maximum principle). i) Assume $\alpha \in \mathbb{R}$ and that $u \ge 0$ is $\alpha$-superharmonic. If there exists $x_0$ with $u(x_0) = 0$, then one has $u \equiv 0$.
ii) Assume $\alpha \in \mathbb{R}$ and that $u \le 0$ is $\alpha$-subharmonic. If there exists $x_0$ with $u(x_0) = 0$, then one has $u \equiv 0$.

Proof : i) Apply the strong parabolic minimum principle to $v(t,x) := e^{\alpha t}u(x)$. ii) Apply i) to $-u$. □
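For the reader's convenience, the reduction used in the proof of Theorem 8.4 i) can be spelled out as follows (nothing beyond Definition 8.3 and Theorem 8.1 i) is used): with $v(t,x) := e^{\alpha t}u(x)$ on $\mathbb{R} \times M$ one computes

```latex
\begin{align*}
(\partial_t - \Delta)\,v(t,x)
  &= \alpha e^{\alpha t}u(x) - e^{\alpha t}\Delta u(x) \\
  &= e^{\alpha t}\,(-\Delta + \alpha)u(x) \;\ge\; 0 .
\end{align*}
```

Moreover $v \ge 0$ and $v(t,x_0) = 0$ for every $t$, so Theorem 8.1 i) gives $v(t,x) = 0$ for all $x \in M$ and all $t$, that is, $u \equiv 0$.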

Corollary 8.5. i) If $u$ is superharmonic and if there exists $x_0$ with $u(x_0) = \inf u$, then $u \equiv \inf u$.
ii) If $u$ is subharmonic and if there exists $x_0$ with $u(x_0) = \sup u$, then $u \equiv \sup u$.

Proof : i) Apply the strong elliptic minimum principle to $\tilde u := u - \inf u$. ii) Apply i) to $-u$. □

Example 8.6. Let $N$ be a compact connected manifold (smooth without boundary). By picking a Riemannian metric on $N$, using the Hodge theorem and that continuous real-valued functions on a compact space attain their minimum and maximum, we get from the above Corollary
\[ H^0(N) = \{f : \Delta f = 0\} = \{\text{constant real-valued functions on } N\} = \mathbb{R} \]
for the zeroth (de Rham) cohomology group of $N$.

Theorem 8.7 (Elliptic minimum/maximum principle). Let $V \subset M$ be open, relatively compact with $\partial V$ nonempty.
i) Assume $u \in C^2(V) \cap C(\overline{V})$ is superharmonic, then one has
\[ \inf_V u = \inf_{\partial V} u. \]
ii) Assume $u \in C^2(V) \cap C(\overline{V})$ is subharmonic, then one has
\[ \sup_V u = \sup_{\partial V} u. \]

Proof : i) Set $r := \inf_{\overline{V}} u$ and $S := \{x \in \overline{V} : u(x) = r\}$. It suffices to show that $S$ intersects $\partial V$. Assume not. Then one has $S \subset V$. We are going to show that the closed set $S$ is open, so $S = M$ (as $M$ is connected and $S$ is nonempty), a contradiction to $S \subset V \subset M \setminus \partial V$. Let $x \in S \subset V$ and let $N \subset V$ be a connected open neighborhood of $x$. Then $u|_N$ attains its minimum in $x$, and so $u \equiv r$ on $N$ by the above Corollary. Thus we have shown $N \subset S$, showing that $S$ is open. ii) Apply i) to $-u$. □

9. Some

In general, both parts of the spectrum (discrete spectrum and essential spectrum) of $H$ can be nonempty and the only thing we know for sure is $\sigma(H) \subset [0,\infty)$, as $H \ge 0$. The following simple result indicates that the essential spectrum can only be nonempty on noncompact $M$’s:

Theorem 9.1. Assume that for some $t > 0$ one has
\[ \sup_{x \in M} p(t,x,x) < \infty, \]
and that $\mu(M) < \infty$. Then $H$ has a purely discrete spectrum (so the spectrum consists of eigenvalues having finite multiplicity), and if $(\lambda_n)$ denotes the increasing enumeration of the eigenvalues with each eigenvalue counted according to its multiplicity, then one has
\[ 0 \le \lambda_n \nearrow \infty. \]

Proof : By abstract functional analysis it suffices to show that $P_{t/2} = e^{-(t/2)H}$ is Hilbert-Schmidt. But the latter is an integral operator, so it suffices to show
\[ \iint p(t/2,x,y)^2\,d\mu(x)\,d\mu(y) < \infty. \]
Since, by the symmetry (23) and the semigroup identity (24),
\[ \iint p(t/2,x,y)^2\,d\mu(y)\,d\mu(x) = \int p(t,x,x)\,d\mu(x) \le \mu(M)\,\sup_{x \in M} p(t,x,x), \]
the claim follows from the assumptions. □

The latter result clearly applies to compact $M$’s (so compact $M$’s have a purely discrete spectrum), but also to some noncompact $M$’s! For example, as we shall see later on (cf. Corollary 9.8 below), the result applies to open relatively compact subsets of an arbitrary Riemannian manifold (so those have a purely discrete spectrum, too). To prove the latter statement, we are going to show
\[ p^U(t,x,y) \le p(t,x,y), \]
where $U \subset M$ is an arbitrary open relatively compact subset and $p^U$ its heat kernel, that is, the heat kernel of the Riemannian manifold $(U, g|_U)$. To this end, we record:

Lemma 9.2. For all $0 \le f \in W^{1,2}_0(M)$ there exists a sequence $0 \le f_k \in C_c^\infty(M)$ with $f_k \to f$ as $k \to \infty$ in $W^{1,2}(M)$.

Proof : Pick a sequence $h_k \in C_c^\infty(M)$ with $h_k \to f$ in $W^{1,2}(M)$, and pick $\psi : \mathbb{R} \to [0,\infty)$ smooth with $\psi(0) = 0$ and $\sup_t |\psi'(t)| < \infty$. Then $0 \le \psi \circ h_k \in C_c^\infty(M)$ and $\psi \circ h_k \to \psi \circ f$ in $W^{1,2}(M)$ by Lemma 6.2 (applied with a constant sequence). Thus it suffices to show that there exists a sequence $\psi_k : \mathbb{R} \to [0,\infty)$ of smooth functions with $\psi_k(0) = 0$ and $\sup_{k,t} |\psi_k'(t)| < \infty$ such that $\psi_k \circ f \to f$ in $W^{1,2}(M)$. To this end, let
\[ \varphi(t) := 1_{(0,\infty)}(t), \quad \psi(t) := t_+, \quad t \in \mathbb{R}, \]
and pick $\psi_k$ as in Example 6.3. Then for some $C > 0$ we have $|\psi_k(t) - \psi(t)| \le C|t|$ for all $t, k$, so that $\psi_k \circ f \to \psi \circ f = f$ in $L^2(M)$ by dominated convergence. Moreover, we have $d(\psi_k \circ f) = (\psi_k' \circ f)\,df$ by Lemma 6.2, which shows that $d(\psi_k \circ f) \to (\varphi \circ f)\,df = df$, again by dominated convergence. This completes the proof. □

Let $U \subset M$ be open and denote by $\tilde\alpha$ the trivial extension to $M$ by zero away from $U$ of a function or a 1-form $\alpha$ on $U$. Then we consider $L^2(U)$ as a closed subspace of $L^2(M)$ via the embedding $f \mapsto \tilde f$, and likewise we have $\Omega^1_{L^2}(U) \subset \Omega^1_{L^2}(M)$.

Lemma 9.3. Let $U \subset M$ be open. Then for all $f \in W^{1,2}_0(U)$ one has $\tilde f \in W^{1,2}_0(M)$ and $d\tilde f = \widetilde{df}$. In particular, $W^{1,2}_0(U) \subset W^{1,2}_0(M)$ is a closed subspace.

Proof : If $f \in C_c^\infty(U)$ then clearly $\tilde f \in C_c^\infty(M)$ with $d\tilde f = \widetilde{df}$ by locality of differential operators.
Now let $f \in W^{1,2}_0(U)$ and pick a sequence $f_n \in C_c^\infty(U)$ with $f_n \to f$ in $W^{1,2}(U)$. Then clearly $\tilde f_n \to \tilde f$ in $L^2(M)$. Moreover, $\tilde f_n$ is Cauchy in $W^{1,2}(M)$, and its limit must be $\tilde f$, because $\tilde f_n \to \tilde f$ in $L^2(M)$. In particular, $d\tilde f_n \to d\tilde f$ in $\Omega^1_{L^2}(M)$. On the other hand, we have $df_n \to df$ in $\Omega^1_{L^2}(U)$ and so $\widetilde{df_n} \to \widetilde{df}$ in $\Omega^1_{L^2}(M)$. In view of $\widetilde{df_n} = d\tilde f_n$, we have ultimately shown $d\tilde f = \widetilde{df}$. □

The analogous result with $W^{1,2}_0$ replaced by $W^{1,2}$ is wrong: in $\mathbb{R}^m$, one has $1_{B(x,1)}|_{B(x,1)} = 1 \in W^{1,2}(B(x,1))$ (this is trivial), but $1_{B(x,1)} \notin W^{1,2}(\mathbb{R}^m)$ (this requires some work).

Lemma 9.4. Let $h \in W^{1,2}(M)$. Then there exists $v \in W^{1,2}_0(M)$ with $h \le v$, if and only if one has $h_+ \in W^{1,2}_0(M)$.

Proof : Exercise. □

Lemma 9.5. Assume $0 \le u \in W^{1,2}(M)$, $f \in L^2(M)$ is real-valued, $\lambda > 0$, and that one has $(-\Delta + \lambda)u \ge f$ weakly, meaning that
\[ \int u\,(-\Delta + \lambda)\varphi\,d\mu \ge \int f\varphi\,d\mu \tag{33} \]

for all $0 \le \varphi \in C_c^\infty(M)$. Then one has $u \ge (H + \lambda)^{-1}f$.

Proof : Write $f = (H + \lambda)(H + \lambda)^{-1}f$ and set $v := (H + \lambda)^{-1}f \in \mathrm{Dom}(H) \subset W^{1,2}_0(M)$. Then one has $(-\Delta + \lambda)(v - u) \le 0$ weakly, and so $(-\Delta + \lambda)h \le 0$ weakly, if we set $h := v - u \in W^{1,2}(M)$. Thus, integrating by parts,
\[ \int (dh, d\varphi)\,d\mu + \lambda \int h\varphi\,d\mu \le 0 \]
for all $0 \le \varphi \in C_c^\infty(M)$. Since both sides are continuous in the $W^{1,2}$-norm, Lemma 9.2 shows that the latter inequality holds for all $0 \le \varphi \in W^{1,2}_0(M)$. By Lemma 9.4 (note that $h \le v$, as $u \ge 0$) we have $0 \le h_+ \in W^{1,2}_0(M)$, and so
\[ \int (dh, dh_+)\,d\mu + \lambda \int h\,h_+\,d\mu = \int (dh, dh_+)\,d\mu + \lambda \int h_+^2\,d\mu \le 0. \]
By Example 6.3 one has
\[ \int (dh, dh_+)\,d\mu = \int |dh_+|^2\,d\mu, \]
and so
\[ \int |dh_+|^2\,d\mu + \lambda \int h_+^2\,d\mu \le 0, \]
and so $h_+ = 0$, as $\lambda > 0$. This shows $v - u \le 0$, and so $v = (H + \lambda)^{-1}f \le u$. □

Lemma 9.6. For all $\lambda > 0$ one has $(H + \lambda)^{-1}f_1 \le (H + \lambda)^{-1}f_2$, whenever $f_1, f_2 \in L^2(M)$ are such that $f_1 \le f_2$.

Proof : By linearity we can assume $f_1 = 0$. Using the formula (Laplace transform)
\[ (r + \lambda)^{-1} = \int_0^\infty e^{-\lambda s}e^{-sr}\,ds \]
with $r = H$ (spectral calculus) implies
\[ (H + \lambda)^{-1}f_2 = \int_0^\infty e^{-\lambda s}P_sf_2\,ds \ge 0, \]
where the integral converges in the $L^2$-sense, so the claim follows from $P_sf_2 \ge 0$ (which is a consequence of $p(s,x,y) \ge 0$). Note here that on any measure space $0 \le h_n \to h$ in $L^p$ for some $p \in [1,\infty)$ implies $h \ge 0$. □

Given $U \subset M$ open, we denote with $H^U$, $P^U$, $p^U$ the objects $H$, $P$, $p$ which are defined on the Riemannian manifold $(U, g|_U)$. Based on the above auxiliary results we can prove:

Theorem 9.7. For all open $U \subset M$, $t > 0$, $x, y \in U$ one has
\[ p^U(t,x,y) \le p(t,x,y). \]

Proof : It suffices to prove that for all $0 \le f \in L^2(U)$, $x \in U$, one has
\[ \Big( \int_U p^U(t,x,y)f(y)\,d\mu(y) = \Big)\ P^U_tf(x) \le P_t\tilde f(x)\ \Big( = \int_U p(t,x,y)f(y)\,d\mu(y) \Big). \tag{34} \]
Step 1: For all $\lambda > 0$ one has $(H^U + \lambda)^{-1}f \le (H + \lambda)^{-1}\tilde f$.
Proof of step 1: We have $u := (H + \lambda)^{-1}\tilde f \in W^{1,2}_0(M) \subset W^{1,2}(M)$, and this function is $\ge 0$ by Lemma 9.6. Clearly, $u|_U \in W^{1,2}(U)$ and $(-\Delta + \lambda)u|_U = \tilde f|_U = f$. Thus, we have
\[ (H + \lambda)^{-1}\tilde f = u \ge (H^U + \lambda)^{-1}f \quad \text{on } U \]
from Lemma 9.5 (applied on $U$).
Step 2: For all $\lambda > 0$, $k \in \mathbb{N}$, one has $(H^U + \lambda)^{-k}f \le (H + \lambda)^{-k}\tilde f$.
Proof of step 2: This follows by induction on $k$ from applying step 1 and Lemma 9.6 (using the latter for $M$ and for $U$).
Step 3: One has (34).
Proof of step 3: By applying the formula
\[ e^{-tr} = \lim_{k\to\infty} \Big(\frac{k}{t}\Big)^k (r + k/t)^{-k} \]
for $r = H, H^U$ (spectral calculus), we get the $L^2$-convergences
\[ P^U_t = e^{-tH^U} = \lim_k \Big(\frac{k}{t}\Big)^k (H^U + k/t)^{-k}, \qquad P_t = e^{-tH} = \lim_k \Big(\frac{k}{t}\Big)^k (H + k/t)^{-k}, \]
so that the claim follows from Step 2. □

Applying the last result with $M$ replaced by $V$, where $V \subset M$ is open with $U \subset V$, implies that for all $0 \le f \in L^2(M)$ one has $P^U_t f|_U \le P^V_t f|_V$, in particular if $(U_n)_{n \in \mathbb{N}}$ is an exhaustion of $M$ with open subsets the limit $\lim_n P^{U_n}_t f|_{U_n}$ exists pointwise. As one might guess, one has that for all $t > 0$ (exercise)
\[ P^{U_n}_t f|_{U_n} \nearrow P_tf \quad \mu\text{-a.e.} \tag{35} \]
With some more effort, one can prove that the above relation actually holds pointwise (and even in $C^\infty$), but we will not need this stronger statement.

Corollary 9.8. For all open relatively compact $U \subset M$ the operator $H^U$ has a purely discrete spectrum.

Proof : Combine Theorem 9.7 with Theorem 9.1. □
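The domain monotonicity of Theorem 9.7 and the eigenvalue comparison $\lambda_{\min}(U) \ge \lambda_{\min}(M)$ can be observed in a model case with closed formulas. The sketch below assumes $M = \mathbb{R}$ and lets $U$ be an interval $(a,b)$, where $H^U$ is the Dirichlet realization of $-d^2/dx^2$ with eigenvalues $(n\pi/(b-a))^2$ and eigenfunctions $\sqrt{2/(b-a)}\sin(n\pi(x-a)/(b-a))$. The function names, the two intervals, the time $t$, the sample points and the truncation of the eigenfunction expansion after 200 terms are illustrative choices, not taken from the notes.

```python
import math

def dirichlet_kernel(t, x, y, a, b, terms=200):
    """Dirichlet heat kernel on (a, b), truncated eigenfunction expansion."""
    length = b - a
    total = 0.0
    for n in range(1, terms + 1):
        k = n * math.pi / length
        total += math.exp(-k * k * t) * math.sin(k * (x - a)) * math.sin(k * (y - a))
    return 2.0 * total / length

def gauss_kernel(t, x, y):
    """Heat kernel on the whole real line."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

t = 0.05
small, large = (0.0, 1.0), (-0.5, 2.0)        # small interval inside the large one
points = [0.1 * i for i in range(1, 10)]      # sample points inside (0, 1)

monotone = True
for x in points:
    for y in points:
        p_small = dirichlet_kernel(t, x, y, *small)
        p_large = dirichlet_kernel(t, x, y, *large)
        p_line = gauss_kernel(t, x, y)
        # expected ordering: 0 <= p^small <= p^large <= p (up to rounding)
        if not (-1e-12 <= p_small <= p_large + 1e-12 and p_large <= p_line + 1e-12):
            monotone = False

# bottom of the Dirichlet spectrum shrinks when the domain grows
lam_small = (math.pi / (small[1] - small[0])) ** 2
lam_large = (math.pi / (large[1] - large[0])) ** 2
print(monotone, lam_small >= lam_large)
```

Both checks are expected to succeed; the discreteness of the spectrum of $H^U$ asserted in Corollary 9.8 is visible here in the explicit eigenvalue list.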

10. Integrated maximum principle

Theorem 10.1 (Integrated Maximum Principle). Let $I \subset [0,\infty)$ be an interval and let $\zeta : I \times M \to \mathbb{R}$ be continuous such that
i) for all $t \in I$, the function $\zeta(t,\cdot)$ is locally Lipschitz,
ii) $\partial_t\zeta$ exists and is continuous on $I \times M$,
iii) one has
\[ \partial_t\zeta + \frac{1}{2}|d\zeta|^2 \le 0. \]
Then with $\lambda_{\min}(M) := \inf \sigma(H)$, for all $f \in L^2(M)$ the function
\[ I \ni t \longmapsto J(t) := \int |P_tf|^2(x)\,e^{\zeta(t,x)}\,d\mu(x) \in [0,\infty] \]
satisfies
\[ J(t) \le J(t_0)\,e^{-2\lambda_{\min}(M)(t - t_0)} \quad \text{for all } t, t_0 \in I \text{ with } t > t_0, \]
in particular, $J$ is nonincreasing.

Remark 10.2. We make no statement here on the finiteness of $J(t)$!

Proof : Because of
\[ |P_tf|^2 = |P_{t-t_0}P_{t_0}f|^2 \le (P_{t-t_0}|P_{t_0}f|)^2 \]
WLOG (check this!) we can and we will assume $f \ge 0$. Then, using (35) in combination with monotone convergence, and that by the variational principle (8)
\[ \lambda_{\min}(U) = \inf_{0 \ne \Psi \in W^{1,2}_0(U)} \Big( \int_U |d\Psi|^2\,d\mu \Big) \Big/ \Big( \int_U |\Psi|^2\,d\mu \Big) \]
in combination with $W^{1,2}_0(U) \subset W^{1,2}_0(M)$ one has
\[ \lambda_{\min}(U) \ge \lambda_{\min}(M), \]
it suffices to show that for all open relatively compact $U \subset M$ one has
\[ J^U(t) := \int_U (P^U_tf)^2(x)\,e^{\zeta(t,x)}\,d\mu(x) \le J^U(t_0)\,e^{-2\lambda_{\min}(U)(t - t_0)}, \tag{36} \]
where here and in the sequel $P^U_tf$ is to be understood in the obvious sense $P^U_tf := P^U_t(f|_U)$. Note that $J^U$ is finite and continuous on $I$, for we have
\[ J^U = \langle P^Uf, e^\zeta P^Uf \rangle_U. \]
Thus in order to show (36), it suffices to show that $J^U$ is differentiable in $I \setminus \{0\}$ with
\[ (d/dt)J^U(t) \le -2\lambda_{\min}(U)\,J^U(t) \tag{37} \]
for all $t \in I \setminus \{0\}$. The functions $\zeta(t,\cdot)$ and $\partial_t\zeta(t,\cdot)$ are in $C_b(U)$, so that (exercise) $\partial_t\zeta$ is equal to the strong derivative $(d/dt)\zeta$. The same remark applies to $e^\zeta$, and so
\[ (d/dt)e^\zeta = \partial_t e^\zeta = e^\zeta\,\partial_t\zeta. \tag{38} \]
On the other hand $P^Uf$ is strongly differentiable as an $L^2(U)$-valued map by the spectral calculus, and we record
\[ (d/dt)P^Uf = \Delta P^Uf. \tag{39} \]

Using that $e^\zeta$ is strongly differentiable as a $C_b(U)$-valued (and thus as an $L^\infty(U)$-valued) map it follows that $P^Uf\,e^\zeta$ is a strongly differentiable $L^2(U)$-valued map and one has
\[ (d/dt)(P^Uf\,e^\zeta) = [(d/dt)P^Uf]\,e^\zeta + [(d/dt)e^\zeta]\,P^Uf. \tag{40} \]
The product rule applied above is as follows: if $s \mapsto h_1(s)$ is a strongly differentiable map from an open subset $A \subset \mathbb{R}$ to $L^\infty$, and if $s \mapsto h_2(s)$ is a strongly differentiable map from $A$ to $L^2$, then $s \mapsto h_1(s)h_2(s)$ is a strongly differentiable map from $A$ to $L^2$, and the product rule applies. Thus
\[ J^U = \langle P^Uf, e^\zeta P^Uf \rangle_U \]
is differentiable with
\[ (d/dt)J^U = \langle (d/dt)P^Uf, e^\zeta P^Uf \rangle_U + \langle P^Uf, (d/dt)[P^Uf\,e^\zeta] \rangle_U \tag{41} \]
\[ = 2\langle (d/dt)P^Uf, e^\zeta P^Uf \rangle_U + \langle (P^Uf)^2, (d/dt)e^\zeta \rangle_U \tag{42} \]
\[ = 2\langle \Delta P^Uf, e^\zeta P^Uf \rangle_U + \langle (P^Uf)^2, [\partial_t\zeta]\,e^\zeta \rangle_U. \tag{43} \]
By the chain rule from Lemma 6.7 we have that $e^{\zeta(t,\cdot)}$ is locally Lipschitz on $M$, so this function is Lipschitz (and bounded) on $U$. As by the spectral calculus we have $P^U_tf \in W^{1,2}_0(U)$ it follows that $e^\zeta P^U_tf \in W^{1,2}_0(U)$ by Lemma 6.8 and we may integrate by parts:
\[ 2\langle \Delta P^Uf, e^\zeta P^Uf \rangle_U = 2\int_U \Delta P^Uf \cdot e^\zeta P^Uf\,d\mu = -2\int_U (dP^Uf, d(e^\zeta P^Uf))\,d\mu. \]
In addition, $P^U_tf$ and $e^{\zeta(t,\cdot)}$ are locally Lipschitz in $U$ so that using the product rule from Lemma 6.7 and the chain rule therein we get
\[ d(e^\zeta P^Uf) = P^Uf \cdot de^\zeta + e^\zeta \cdot dP^Uf = P^Uf \cdot e^\zeta\,d\zeta + e^\zeta \cdot dP^Uf \]
and so
\[ 2\langle \Delta P^Uf, e^\zeta P^Uf \rangle_U = -2\int_U (dP^Uf, P^Uf \cdot e^\zeta\,d\zeta)\,d\mu - 2\int_U (dP^Uf, e^\zeta \cdot dP^Uf)\,d\mu. \]
Plugging this into (43) and using assumption iii) from the theorem we obtain
\[ (d/dt)J^U = 2\langle \Delta P^Uf, e^\zeta P^Uf \rangle_U + \langle (P^Uf)^2, (\partial_t\zeta)e^\zeta \rangle_U \]
\[ = -2\int_U e^\zeta P^Uf\,(dP^Uf, d\zeta)\,d\mu - 2\int_U e^\zeta (dP^Uf, dP^Uf)\,d\mu + \int_U (P^Uf)^2 \cdot (\partial_t\zeta) \cdot e^\zeta\,d\mu \]
\[ \le -2\int_U \Big( e^\zeta P^Uf\,(dP^Uf, d\zeta) + e^\zeta (dP^Uf, dP^Uf) + \frac{1}{4}e^\zeta (P^Uf)^2 |d\zeta|^2 \Big)\,d\mu \]
\[ = -2\int_U \Big| dP^Uf + \frac{1}{2}P^Uf\,d\zeta \Big|^2 e^\zeta\,d\mu \]
\[ = -2\int_U \big| d(e^{\zeta/2}P^Uf) \big|^2\,d\mu \le -2\lambda_{\min}(U) \int_U |e^{\zeta/2}P^Uf|^2\,d\mu, \]
where the last equality follows from
\[ \Big( dP^Uf + \frac{1}{2}P^Uf\,d\zeta \Big) e^{\zeta/2} = d(e^{\zeta/2}P^Uf) \]
and the last inequality from the variational principle, since $e^{\zeta/2}P^Uf \in W^{1,2}_0(U)$. Noting that
\[ \int_U |e^{\zeta/2}P^Uf|^2\,d\mu = J^U, \]
this completes the proof. □

11. $L^2$-mean-value inequality (MVI)

Definition 11.1. Let $I \subset \mathbb{R}$ be an interval and $V \subset M$ be open. Then a $C^2$-function $u : I \times V \to \mathbb{R}$ is called a subsolution of the heat equation, if $\partial_t u \le \Delta u$.

Theorem 11.2 ($L^2$-MVI). Assume $B(x,R) \subset M$ is relatively compact and that for some $a, n > 0$ one has the following Faber-Krahn inequality:
\[ \lambda_{\min}(U) = \min \sigma(H^U) \ge a\,\mu(U)^{-2/n} \quad \text{for all open } U \subset B(x,R). \]
Then there exists a constant $C_n$, which only depends on $n$, such that for all $T > 0$ and all subsolutions $u$ of the heat equation in the cylinder $\mathcal{C} := (0,T] \times B(x,R)$ one has
\[ u_+^2(T,x) \le \frac{C_n\,a^{-n/2}}{\min(\sqrt{T},R)^{n+2}} \int_{\mathcal{C}} u_+^2\,d\nu, \]
where $d\nu := dt\,d\mu$ is the product of the Lebesgue measure on $\mathbb{R}$ and the Riemannian volume measure on $M$.

The proof of the $L^2$-MVI requires two auxiliary results:

Lemma 11.3. Let $V \subset M$ be open, $0 \le T_0 < T$ and let $\eta$ be a Lipschitz function on $\mathcal{C} := [T_0,T] \times V$ (considered as a Riemannian manifold) such that for some compact $K \subset V$ one has $\mathrm{supp}(\eta(t,\cdot)) \subset K$ for all $t \in [T_0,T]$. Let $u$ be a subsolution of the heat equation in $\mathcal{C}$ and set $v := (u - \theta)_+$ for some $\theta \ge 0$. Then one has
\[ \frac{1}{2}\Big[ \int_V v^2(T,\cdot)\eta^2(T,\cdot)\,d\mu - \int_V v^2(T_0,\cdot)\eta^2(T_0,\cdot)\,d\mu \Big] + \int_{\mathcal{C}} |d(v\eta)|^2\,d\nu \le \int_{\mathcal{C}} v^2\big(|d\eta|^2 + |\eta\,\partial_t\eta|\big)\,d\nu. \tag{44} \]
In particular, if $\eta(T_0,\cdot) = 0$, then the following two additional inequalities hold:
\[ \int_V v^2(t,\cdot)\eta^2(t,\cdot)\,d\mu \le 2\int_{\mathcal{C}} v^2\big(|d\eta|^2 + |\eta\,\partial_t\eta|\big)\,d\nu \quad \text{for all } t \in [T_0,T], \tag{45} \]
and
\[ \int_{\mathcal{C}} |d(v\eta)|^2\,d\nu \le \int_{\mathcal{C}} v^2\big(|d\eta|^2 + |\eta\,\partial_t\eta|\big)\,d\nu. \tag{46} \]

Proof : Note first that the second inequality follows by applying the first with $T$ replaced by $t$, and the last one follows trivially from the first one (by dropping one nonnegative summand on the LHS).
To prove the first inequality, since $u(t,\cdot) \in C^2(V) \subset W^{1,2}_{loc}(V)$ for all $t$, by Lemma 6.2 one has $v(t,\cdot) \in W^{1,2}_{loc}(V)$ with
\[ dv = 1_{\{u > \theta\}}\,du = 1_{\{v \ne 0\}}\,du, \tag{47} \]
so
\[ (dv, du) = |dv|^2, \qquad v\,du = v\,dv. \tag{48} \]
Since $\eta(t,\cdot)$ is compactly supported and Lipschitz in $V$ (and so its square, too) one has $v(t,\cdot)\eta^2(t,\cdot) \in W^{1,2}_0(V)$ by Lemma 6.9, with
\[ d(v\eta^2) = v\,d\eta^2 + \eta^2\,dv = 2v\eta\,d\eta + \eta^2\,dv, \tag{49} \]
thus
\[ (du, d(v\eta^2)) = 2v\eta\,(dv, d\eta) + \eta^2|dv|^2. \tag{50} \]
If we multiply $\partial_t u \le \Delta u$ with $v\eta^2$ and perform $\int_{\mathcal{C}} \cdots d\nu$, we get
\[ \int_{\mathcal{C}} (\partial_t u)\,v\eta^2\,d\nu \le \int_{T_0}^T \int_V (\Delta u)\,v\eta^2\,d\mu\,dt = -\int_{T_0}^T \int_V (du, d(v\eta^2))\,d\mu\,dt \]
\[ = -\int_{T_0}^T \int_V \big( 2v\eta\,(dv, d\eta) + \eta^2|dv|^2 \big)\,d\mu\,dt = -\int_{T_0}^T \int_V \big( |d(v\eta)|^2 - v^2|d\eta|^2 \big)\,d\mu\,dt, \tag{51} \]
where we have integrated by parts (Lemma 6.1; note that $v(t,\cdot)\eta^2(t,\cdot) \in W^{1,2}_0(V')$ on some open relatively compact neighbourhood $V' \subset V$ of $K$ and that $u(t,\cdot)$, $du(t,\cdot)$, $\Delta u(t,\cdot)$ are square integrable on $V'$, as $u(t,\cdot)$ is $C^2$ on $V$), and where we have used (50). Let us also record that an application of Lemma 6.2 to the $t$-variable gives as above
\[ v\,\partial_t u = v\,\partial_t v, \tag{52} \]
which shows the first identity in
\[ \int_{T_0}^T (\partial_t u)\,v\eta^2\,dt = \frac{1}{2}\int_{T_0}^T (\partial_t v^2)\,\eta^2\,dt \]
\[ = \frac{1}{2}\big[ v^2(T,\cdot)\eta^2(T,\cdot) - v^2(T_0,\cdot)\eta^2(T_0,\cdot) \big] - \frac{1}{2}\int_{T_0}^T v^2\,\partial_t\eta^2\,dt \]
\[ = \frac{1}{2}\big[ v^2(T,\cdot)\eta^2(T,\cdot) - v^2(T_0,\cdot)\eta^2(T_0,\cdot) \big] - \int_{T_0}^T v^2\,\eta\,\partial_t\eta\,dt, \]
where we have integrated by parts (noting that this can be done for Lipschitz functions, say by using Friedrichs mollifiers). If we perform $\int_V \cdots d\mu$ in the last identity and use (51), the proof is complete. □

Lemma 11.4. In the situation of the $L^2$-MVI consider

\[ \mathcal{C}_i := [T_i,T] \times B(x,R_i), \quad i = 0,1, \]
where $R_i$, $T_i$ are chosen with $0 < R_1 < R_0 \le R$, $0 \le T_0 < T_1 < T$. Choose $\theta_1 > \theta_0 \ge 0$ and set
\[ J_i := \int_{\mathcal{C}_i} (u - \theta_i)_+^2\,d\nu, \quad i = 0,1. \]
Then for some constant $c_n$, which only depends on $n$, one has
\[ J_1 \le \frac{c_n\,J_0^{1+2/n}}{a\,\delta^{1+2/n}\,(\theta_1 - \theta_0)^{4/n}}, \]
where
\[ \delta := \min\big( T_1 - T_0,\ (R_0 - R_1)^2 \big). \]

Proof : WLOG $\theta_0 = 0$, $\theta := \theta_1$. Define $\eta$ by $\eta(t,y) := \varphi(t)\psi(y)$, where
\[ \varphi(t) := \min\Big( \frac{t - T_0}{T_1 - T_0},\ 1 \Big), \qquad \psi(y) := \min\Big( \frac{(R_{1/4} - \varrho(x,y))_+}{R_{1/4} - R_{1/2}},\ 1 \Big), \]
where $R_\lambda := \lambda R_1 + (1 - \lambda)R_0$, $\lambda \in [0,1]$. Note that $R_\lambda \le R_{\lambda'}$ iff $\lambda' \le \lambda$. Note also that by construction $\eta(t,\cdot)$ is supported in the compact ball $K := \overline{B}(x,R_{1/4})$. Applying the second inequality from the previous lemma to $\mathcal{C}_0$, $v := u_+$, with $t \in [T_1,T] \subset [T_0,T]$ gives the bound
\[ \int_{B(x,R_{1/2})} u_+^2(t,\cdot)\,d\mu \le \int_{B(x,R_0)} u_+^2(t,\cdot)\eta^2(t,\cdot)\,d\mu \le 2\int_{\mathcal{C}_0} u_+^2\big(|d\eta|^2 + |\eta\,\partial_t\eta|\big)\,d\nu, \]
where we have used that $\eta = 1$ in $[T_1,T] \times B(x,R_{1/2})$. Clearly we have
\[ \eta^2 \le 1, \quad |d\eta|^2 \le (R_{1/4} - R_{1/2})^{-2} = 16(R_0 - R_1)^{-2} \le 16/\delta, \quad |\partial_t\eta| \le (T_1 - T_0)^{-1} \le 1/\delta, \]
and so
\[ \int_{B(x,R_{1/2})} u_+^2(t,\cdot)\,d\mu \le 34\,\delta^{-1}J_0. \]
Fix $t$ as above and set
\[ U_t := \{ y \in B(x,R_{3/4}) : u(t,y) > \theta \}, \]
so that by the latter inequality we get
\[ \mu(U_t) \le \frac{1}{\theta^2}\int_{B(x,R_{3/4})} u_+^2(t,\cdot)\,d\mu \le \frac{1}{\theta^2}\int_{B(x,R_{1/2})} u_+^2(t,\cdot)\,d\mu \le \frac{34J_0}{\theta^2\delta}. \tag{53} \]
Define now
\[ \psi'(y) := \min\Big( \frac{(R_{3/4} - \varrho(x,y))_+}{R_{3/4} - R_1},\ 1 \Big) \]
and $\eta'(t,y) := \varphi(t)\psi'(y)$. Applying the third inequality of the previous lemma to $v' = (u - \theta)_+$ in $\mathcal{C}_0$ gives with a similar reasoning as above the inequality
\[ \int_{\mathcal{C}_0} |d(v'\eta')|^2\,d\nu \le \int_{\mathcal{C}_0} v'^2\big(|d\eta'|^2 + |\eta'\,\partial_t\eta'|\big)\,d\nu \le \frac{17}{\delta}\int_{\mathcal{C}_0} v'^2\,d\nu \le \frac{17J_0}{\delta}. \tag{54} \]
The function $\eta'(t,\cdot)v'(t,\cdot)$ is supported in the compact set $\overline{U_t}$, thus an element of $W^{1,2}_0(V)$ for all open $V \subset M$ with $\overline{U_t} \subset V$. Choose such a $V$ such that in addition $V \subset B(x,R_0)$ and (see footnote 9)
\[ \mu(V) \le 2\mu(U_t) \le \frac{68J_0}{\theta^2\delta}, \]
where the second inequality follows from (53). Since $\eta'(t,\cdot)v'(t,\cdot) \in W^{1,2}_0(V)$ we can use the variational principle for $V$ to conclude (using the support properties of $\eta'(t,\cdot)v'(t,\cdot)$)
\[ \int_{B(x,R_0)} |d(\eta'(t,\cdot)v'(t,\cdot))|^2\,d\mu = \int_V |d(\eta'(t,\cdot)v'(t,\cdot))|^2\,d\mu \ge \lambda_{\min}(V)\int_V (\eta'(t,\cdot)v'(t,\cdot))^2\,d\mu \]
\[ = \lambda_{\min}(V)\int_{B(x,R_0)} (\eta'(t,\cdot)v'(t,\cdot))^2\,d\mu \ge a\,\mu(V)^{-2/n}\int_{B(x,R_0)} (\eta'(t,\cdot)v'(t,\cdot))^2\,d\mu \]
\[ \ge a\Big(\frac{\theta^2\delta}{68}\Big)^{2/n} J_0^{-2/n}\int_{B(x,R_0)} (\eta'(t,\cdot)v'(t,\cdot))^2\,d\mu, \]
where we have used the Faber-Krahn inequality (an assumption). The lemma now immediately follows from integrating the latter inequality with respect to $\int_{T_1}^T \cdots dt$, using that $\eta' = 1$ in $[T_1,T] \times B(x,R_1)$, and using (54). □

Now we can give the

2 Proof of the L -MVI: Assume for the moment that θ ≥ 0 is arbitrary and define δk = δk(θ) > 0 by !n/(n+2) 162/nC 161−k/nJ 2/n C0 16−k/(n+2)J 2/(n+2) δ := n 0 = n 0 k aθ4/n an/(n+1)θ4/(n+2)

0 2/n where Cn is as in the previous lemma, where Cn := 16 Cn, and where Z 2 J0 := (u − θ/2)+dν. C Set k−1 k−1 X X p Tk := δi,Rk := R − δi. i=0 i=0

9 In fact µ(V ) ≤ 2µ(Ut) is a little technical to prove, but replacing 2 with some A = An follows straightforwardly from covering B(x, R0) with Euclidean balls and using that locally g is equivalent to the Euclidean metric. 50 B. GUNEYSU¨

00 Then there exists Cn > 0 such that (geometric series) if a−n/2J θ2 := √ 0 , 00 n+2 Cn min( T,R) then ∞ ∞ X X p Tk ≤ δi ≤ T/2,Rk ≥ R − δi ≤ R/2. i=0 i=0 Define a sequence of cylinders by

Ck := [Tk,T ] × B(x, Rk) and further Z −(k+1) 2 θk := (1 − 2 )θ, Jk := (u − θk)+dν. Ck Then we have

C0 = C , [T/2,T ] × B(x, R/2) ⊂ Ck+1 ⊂ Ck.

We claim that Jk → 0 as k → ∞, which would complete the proof, for one has Z T Z 2 (u − θ)+dν ≤ Jk → 0, T/2 B(x,R/2)

so $u_+(T, x) \le \theta$, which implies the assertion of the $L^2$-MVI. In order to show $J_k \to 0$, we apply the previous lemma to the pair $(\mathcal{C}_k, \mathcal{C}_{k+1})$, which using
\[ \delta_k = (R_k - R_{k+1})^2 = (T_{k+1} - T_k) = \min\big((R_k - R_{k+1})^2, (T_{k+1} - T_k)\big) \]
yields the inequality
\[ J_{k+1} \le \frac{C_n J_k^{1+2/n}}{a \, \delta_k^{1+2/n} (\theta_{k+1} - \theta_k)^{4/n}}. \]

Using this inequality and the defining formula for $\delta_k$ a simple induction on $k$ shows
\[ J_k \le 16^{-k} J_0, \]
which completes the proof. □
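The induction step can be spelled out as follows. This is only a sketch: it assumes that $\delta_k$ is normalized so that $\delta_k^{1+2/n} = 16^{1+2/n} C_n 16^{-k/n} J_0^{2/n} / (a\theta^{4/n})$, and it uses $\theta_{k+1} - \theta_k = 2^{-(k+2)}\theta$, which follows from the definition of $\theta_k$. The case $k = 0$ is trivial; if $J_k \le 16^{-k} J_0$, then

```latex
\begin{align*}
J_{k+1}
 &\le \frac{C_n J_k^{1+2/n}}{a\,\delta_k^{1+2/n}\,(\theta_{k+1}-\theta_k)^{4/n}}
  = \frac{C_n J_k^{1+2/n}\,16^{(k+2)/n}}{a\,\delta_k^{1+2/n}\,\theta^{4/n}} \\
 &= \frac{J_k^{1+2/n}\,16^{(k+2)/n}}{16^{1+2/n}\,16^{-k/n}\,J_0^{2/n}}
  = \frac{J_k}{16}\left(\frac{16^{k} J_k}{J_0}\right)^{2/n}
  \le \frac{J_k}{16}
  \le 16^{-(k+1)} J_0 .
\end{align*}
```

The induction hypothesis enters only through the factor $(16^k J_k / J_0)^{2/n} \le 1$.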

12. Equivalent characterizations of Li-Yau upper heat kernel bounds

Definition 12.1. Given $t, D > 0$, $x \in M$ define the weighted $L^2$-norm of the heat kernel by
\[ E_D(t, x) := \int_M p(t, x, y)^2 \, e^{\frac{\varrho(x,y)^2}{Dt}} \, d\mu(y) \in [0, \infty]. \]
Note that in $\mathbb{R}^m$ one has $E_D(t, x) = \infty$ for all $t, x$ if $D \le 2$.
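As a sanity check of the Euclidean case, one can compare a closed form of the weighted norm with a direct numerical integration. The sketch below assumes the Gauss kernel $p(t,x,y) = (4\pi t)^{-m/2} e^{-|x-y|^2/(4t)}$ on $\mathbb{R}^m$, for which a Gaussian integral gives $E_D(t,x) = (4\pi t)^{-m/2} \big(D/(2(D-2))\big)^{m/2}$ if $D > 2$ and $E_D(t,x) = \infty$ otherwise. The function names are illustrative only.

```python
import math


def weighted_norm_closed_form(t, D, m=1):
    """E_D(t, x) for the Gauss kernel on R^m (it does not depend on x)."""
    if D <= 2:
        return math.inf  # the weight exp(r^2/(D t)) beats the decay of p^2
    return (4.0 * math.pi * t) ** (-m / 2.0) * (D / (2.0 * (D - 2.0))) ** (m / 2.0)


def weighted_norm_numeric_1d(t, D, half_width=60.0, steps=20000):
    """Midpoint-rule approximation of E_D(t, x) on R^1, meaningful for D > 2."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        r = -half_width + (i + 0.5) * h
        # p(t, x, y)^2 * exp(r^2 / (D t)), with the exponents merged
        total += math.exp(-r * r / (2.0 * t) + r * r / (D * t)) / (4.0 * math.pi * t)
    return total * h
```

For instance, `weighted_norm_numeric_1d(1.0, 3.0)` and `weighted_norm_closed_form(1.0, 3.0)` should agree up to quadrature error, while the closed form degenerates as $D \downarrow 2$.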

Remark 12.2. Let $s > 0$ be arbitrary. As we have
\[ E_D(t, x) = \int_M (P_{t-s} f)^2 \, e^{\zeta(t,\cdot)} \, d\mu \]
for all $t > s$, where
\[ f := p(s, x, \cdot), \qquad \zeta(t, y) := \varrho(x, y)^2/(Dt), \]

it follows from the integrated maximum principle that $t \mapsto E_D(t, x)$ is nonincreasing on $(0, \infty)$, and

\[ E_D(t, x) \le E_D(t_0, x) \, e^{-2\lambda_{\min}(t - t_0)} \tag{55} \]

so long as $0 < t_0 \le t$.

Theorem 12.3. Let $B(x, r)$ be a relatively compact ball and assume the following Faber-Krahn inequality: there exist $a, n > 0$ such that for all open $U \subset B(x, r)$ one has
\[ \lambda_{\min}(U) \ge a\mu(U)^{-2/n}. \tag{56} \]
Then for all $t > 0$, $D > 2$ one has
\[ E_D(t, x) \le \frac{C_n (a\delta)^{-n/2}}{\min(t, r^2)^{n/2}}, \quad \text{where } \delta := \min(D - 2, 1). \]

Remark 12.4. On $\mathbb{R}^m$ one has the Faber-Krahn inequality uniformly on any ball (a classical fact¹⁰), in the sense that for some $a = a_m > 0$, and all $U \subset \mathbb{R}^m$ open one has
\[ \lambda_{\min}(U) \ge a\mu(U)^{-2/m}. \]
Furthermore, one can show that there exists a (Lipschitz continuous) function $\tilde{r} : M \to (0, \infty)$, so that one has
\[ \lambda_{\min}(U) = \min \sigma(H^U) \ge a\mu(U)^{-2/m} \quad \text{for all open } U \subset B(x, \tilde{r}(x)), \]
where the constant $a > 0$ only depends on $m = \dim(M)$. Thus for all $x \in M$ there exists $r > 0$ such that one has (56), and so the above theorem shows that for all $t > 0$, $D > 2$, $x \in M$ one has $E_D(t, x) < \infty$, regardless of how bad the geometry of $M$ is! This is a highly nontrivial fact.

We prepare the proof of Theorem 12.3 with

Lemma 12.5. Let $B(x, r)$ be a relatively compact ball and assume the following Faber-Krahn inequality: there exist $a, n > 0$ such that for all open $U \subset B(x, r)$ one has
\[ \lambda_{\min}(U) \ge a\mu(U)^{-2/n}. \tag{57} \]

Set $\varrho(y) := (\varrho(x, y) - r)_+$, $y \in M$. Then for all $t > 0$ one has

\[ \int_M p(t, x, y)^2 \, e^{\frac{\varrho(y)^2}{2t}} \, d\mu(y) \le \frac{C_n a^{-n/2}}{\min(t, r^2)^{n/2}}. \]

10 Sketch of proof: One first shows that $\lambda_{\min}(U) \ge \lambda_{\min}(U^*)$, where $U^*$ is any ball with the same volume as $U$ (this estimate uses the co-area formula). Then one shows $\lambda_{\min}(U^*) = c_m/r^2$, and uses $\mu(U) = \mu(U^*) = c'_m r^m$.

Proof: With the same argument as in Remark 12.2 one finds that the left hand side of the inequality is nonincreasing in $t$, so WLOG we can assume $t \le r^2$. For the moment let $0 \le f \in L^2(M)$ be arbitrary and set $u(s, \cdot) := P_s f$. The $L^2$-mean value inequality applied to the cylinder $[t/2, t] \times B(x, r)$ shows
\[ u(t, x)^2 \le \frac{C_n a^{-n/2}}{t^{1+n/2}} \int_0^t \int_{B(x,r)} u(s, y)^2 \, d\mu(y) \, ds. \]
Set $\zeta(s, y) := \varrho(y)^2 (2(s - t))^{-1}$ and note $\mathrm{supp}(\zeta(s, \cdot)) \subset M \setminus B(x, r)$, so that the latter inequality gives
\[ u(t, x)^2 \le \frac{C_n a^{-n/2}}{t^{1+n/2}} \int_0^t \int_{B(x,r)} u(s, y)^2 \, e^{\zeta(s,y)} \, d\mu(y) \, ds. \]
Again the integrated maximum principle implies that
\[ J(s) := \int_M u(s, y)^2 \, e^{\zeta(s,y)} \, d\mu(y) \]
is nonincreasing in $s \in [0, t)$ and so the latter inequality implies
\begin{align*}
u(t, x)^2 &\le \frac{C_n a^{-n/2}}{t^{1+n/2}} \int_0^{t} \int_M u(s, y)^2 \, e^{\zeta(s,y)} \, d\mu(y) \, ds \\
&= \frac{C_n a^{-n/2}}{t^{1+n/2}} \int_0^{t} J(s) \, ds \\
&\le \frac{C'_n a^{-n/2}}{t^{n/2}} J(0) \\
&= C'_n (at)^{-n/2} \int_M f^2 \, e^{-\varrho^2/(2t)} \, d\mu.
\end{align*}
Pick $\varphi$ smooth and compactly supported with $0 \le \varphi \le 1$, and apply the latter inequality with $f := p(t, x, \cdot) e^{\varrho^2/(2t)} \varphi$ to get
\begin{align*}
\left( \int_M p(t, x, \cdot)^2 \, e^{\varrho^2/(2t)} \varphi \, d\mu \right)^2
&\le C_n (at)^{-n/2} \int_M p(t, x, \cdot)^2 \, e^{\varrho^2/(2t)} \varphi^2 \, d\mu \\
&\le C_n (at)^{-n/2} \int_M p(t, x, \cdot)^2 \, e^{\varrho^2/(2t)} \varphi \, d\mu,
\end{align*}
which is equivalent to the assertion of the lemma (letting $\varphi \to 1$). □

Proof of Theorem 12.3: It suffices to prove the inequality for $D \le 3$ and $t \le r^2$ (as $E_D(t, x)$ is decreasing in $t$ and $D$). We have $\sqrt{\delta t} \le r$. Thus Faber-Krahn holds in $B(x, \sqrt{\delta t})$ and applying the previous lemma on that ball we get
\[ \int_M p(t, x, y)^2 \, e^{\frac{(\varrho(x,y) - \sqrt{t\delta})_+^2}{2t}} \, d\mu(y) \le \frac{C_n a^{-n/2}}{(\delta t)^{n/2}}. \]

Using
\[ a^2/t_1 + b^2/t_2 \ge (a + b)^2/(t_1 + t_2), \]
valid for all $a, b \in \mathbb{R}$, $t_1, t_2 > 0$, we get
\[ \frac{(\varrho(x, y) - \sqrt{t\delta})_+^2}{2t} + \frac{(\sqrt{\delta t})^2}{\delta t} \ge \frac{\varrho(x, y)^2}{(2 + \delta)t}, \]
thus
\[ \frac{(\varrho(x, y) - \sqrt{t\delta})_+^2}{2t} \ge \frac{\varrho(x, y)^2}{Dt} - 1, \]
which completes the proof. □

The following lemma connects the weighted $L^2$-norm with Gaussian heat kernel upper bounds:

Lemma 12.6. For all $D > 0$, $t \ge t_0 > 0$, $x, y \in M$ one has
\[ p(t, x, y) \le \sqrt{E_D(t_0/2, x) E_D(t_0/2, y)} \, e^{-\varrho(x,y)^2/(2Dt) - \lambda_{\min}(M)(t - t_0)}, \tag{58} \]
in particular,
\[ p(t, x, y) \le \sqrt{E_D(t/2, x) E_D(t/2, y)} \, e^{-\varrho(x,y)^2/(2Dt)}. \tag{59} \]
Proof: In view of (55) it suffices to prove (59). Set $a := \varrho(y, z)$, $b := \varrho(x, z)$, $c := \varrho(x, y)$. Then $\exp((a^2 + b^2 - c^2/2)/(Dt)) \ge 1$ by the triangle inequality, and so
\begin{align*}
p(t, x, y) &= \int_M p(t/2, x, z) \, p(t/2, y, z) \, d\mu(z) \\
&\le \exp(-\varrho(x, y)^2/(2Dt)) \int_M p(t/2, x, z) \exp(\varrho(x, z)^2/(Dt)) \, p(t/2, y, z) \exp(\varrho(y, z)^2/(Dt)) \, d\mu(z),
\end{align*}
from which the claim follows using Cauchy-Schwarz. □
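To see what (59) says in the simplest case, one can test it on $\mathbb{R}^1$. The sketch below assumes the Gauss kernel $p(t,x,y) = (4\pi t)^{-1/2} e^{-|x-y|^2/(4t)}$ and the resulting closed form $E_D(t,x) = (4\pi t)^{-1/2}\sqrt{D/(2(D-2))}$ for $D > 2$; both are assumptions specific to the Euclidean line, and the function names are illustrative.

```python
import math


def gauss_kernel(t, r):
    """Heat kernel of R^1 at distance r (for the equation u_t = u_xx)."""
    return math.exp(-r * r / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def weighted_norm(t, D):
    """Closed form of E_D(t, x) on R^1, valid for D > 2 only."""
    return math.sqrt(D / (2.0 * (D - 2.0))) / math.sqrt(4.0 * math.pi * t)


def bound_59(t, r, D):
    """Right-hand side of (59); E_D(t/2, .) does not depend on the point here."""
    e_half = weighted_norm(t / 2.0, D)
    return math.sqrt(e_half * e_half) * math.exp(-r * r / (2.0 * D * t))


def check_59(times, distances, Ds):
    """True if p(t, x, y) <= bound (59) on the whole parameter grid."""
    return all(gauss_kernel(t, r) <= bound_59(t, r, D)
               for t in times for r in distances for D in Ds)
```

On the diagonal ($r = 0$) the ratio of the two sides is $\sqrt{D/(D-2)} > 1$, so in this example the bound is never sharp, and it deteriorates as $D \downarrow 2$.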

Theorem 12.7. Assume $(B(x_i, r_i))_{i \in I}$ is a family of relatively compact balls such that there exist a constant $n > 0$ and for each $i$ a constant $a_i > 0$ with the following property: for each $i$ and each open $U \subset B(x_i, r_i)$ one has the Faber-Krahn inequality
\[ \lambda_{\min}(U) \ge a_i \mu(U)^{-2/n}. \]

Then for all $i, j$, all $x \in B(x_i, r_i/2)$, $y \in B(x_j, r_j/2)$, all $t \ge t_0 > 0$ one has

\[ p(t, x, y) \le \frac{C_n (1 + \varrho(x, y)^2/t)^{n/2} \, e^{-\varrho(x,y)^2/(4t) - \lambda_{\min}(t - t_0)}}{\big(a_i a_j \min(t_0, r_i^2) \min(t_0, r_j^2)\big)^{n/4}}. \]

Proof: We have $B(x, r_i/2) \subset B(x_i, r_i)$ so the Faber-Krahn inequality holds on $B(x, r_i/2)$ and Theorem 12.3 gives with
\[ D := 2 + (1 + \varrho(x, y)^2/t)^{-1} \]
the bound
\[ E_D(t, x) \le \frac{C_n (a_i \delta)^{-n/2}}{\min(t, r_i^2)^{n/2}}, \quad \text{where } \delta = \min(D - 2, 1) = (1 + \varrho(x, y)^2/t)^{-1}. \]
Likewise we have
\[ E_D(t, y) \le \frac{C_n (a_j \delta)^{-n/2}}{\min(t, r_j^2)^{n/2}}, \]
and so the previous lemma gives
\[ p(t, x, y) \le \frac{C_n (1 + \varrho(x, y)^2/t)^{n/2} \, e^{-\varrho(x,y)^2/(2Dt) - \lambda_{\min}(t - t_0)}}{\big(a_i a_j \min(t_0, r_i^2) \min(t_0, r_j^2)\big)^{n/4}}. \]
In view of $\varrho(x, y)^2/t = 1/\delta - 1$ and $D = \delta + 2$ we get
\[ \frac{\varrho(x, y)^2}{4t} - \frac{\varrho(x, y)^2}{2Dt} = \frac{\delta \varrho(x, y)^2}{4Dt} = \frac{1 - \delta}{4(\delta + 2)} < \frac{1}{4(\delta + 2)} < 1, \]
so that
\[ e^{-\varrho(x,y)^2/(2Dt)} = e^{\frac{\varrho(x,y)^2}{4t} - \frac{\varrho(x,y)^2}{2Dt}} \, e^{-\frac{\varrho(x,y)^2}{4t}} \le e \, e^{-\frac{\varrho(x,y)^2}{4t}}, \]
which completes the proof. □

As a first application of the above estimate in combination with Remark 12.4 (apply the above estimate with the family $B(x, \tilde{r}(x))_{x \in M}$), we obtain for all $x, y \in M$ the estimate
\[ t \log p(t, x, y) \le t \log(C(x, y) t^{-n/2}) + t \log(1 + \varrho(x, y)^2/t)^{n/2} - \varrho(x, y)^2/4, \]
and so (since the logarithm grows slower than any polynomial)
\[ \limsup_{t \to 0+} 4t \log p(t, x, y) \le -\varrho(x, y)^2, \]
which is one half of Varadhan's famous asymptotic formula [34]
\[ \lim_{t \to 0+} 4t \log p(t, x, y) = -\varrho(x, y)^2. \]
The latter formula states that the heat kernel captures a Euclidean behaviour for small times, regardless of the geometry. A very noneuclidean behaviour may occur for large times, though.

Here comes the main result of this lecture course:

Theorem 12.8 (Grigor'yan 1994). Assume $M$ is geodesically complete and noncompact. Then the following statements are equivalent:

a) $M$ satisfies a relative Faber-Krahn inequality, that is, there exist constants $b, n' > 0$ such that for all $x \in M$, $r > 0$ and all open relatively compact $U \subset B(x, r)$ one has
\[ \lambda_{\min}(U) \ge \frac{b}{r^2} \left( \frac{\mu(x, r)}{\mu(U)} \right)^{2/n'}. \tag{60} \]

b) $M$ is doubling, that is, there exists a constant $C > 0$ such that for all $x \in M$, $r > 0$ one has
\[ \mu(x, 2r) := \mu(B(x, 2r)) \le C\mu(x, r), \tag{61} \]
and $M$ satisfies the following Li-Yau upper heat kernel bound: there exist constants $n, C' > 0$ such that for all $t > 0$, $x, y \in M$ one has

\[ p(t, x, y) \le C' \, \frac{(1 + \varrho(x, y)^2/t)^{n/2} \, e^{-\varrho(x,y)^2/(4t)}}{\sqrt{\mu(x, \sqrt{t}) \, \mu(y, \sqrt{t})}}. \tag{62} \]

c) $M$ is doubling and satisfies the following on-diagonal upper heat kernel bound: there exists a constant $C'' > 0$ such that for all $t > 0$, $x \in M$ one has
\[ p(t, x, x) \le \frac{C''}{\mu(x, \sqrt{t})}. \]

Remark 12.9. 1. The noncompactness is only used in c) $\Rightarrow$ a).
2. The proof shows that the constants are related as follows: a) implies b) with $n = n'$ and $C', C$ depending only on $n'$ and $b$. b) implies c) with $C'' = C'$ (trivial). c) implies a) with $b$ depending only on the doubling constant $C$ and $C''$, and $n' = \log_2(C)$.
3. Using that for all $r, \epsilon > 0$ we can find a constant $C_{r,\epsilon} > 0$ such that for all $\zeta \ge 0$ one has

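\[ (1 + \zeta)^r e^{-\zeta/4} \le C_{r,\epsilon} \, e^{-\zeta/(4+\epsilon)}, \tag{63} \]
one immediately gets that b) is equivalent to the following condition: $M$ is volume doubling and there exists a constant $n > 0$ and for all $\epsilon > 0$ a constant $C_{\epsilon,n} > 0$ such that for all $t > 0$, $x, y \in M$ one has

\[ p(t, x, y) \le \frac{C_{\epsilon,n} \, e^{-\varrho(x,y)^2/((4+\epsilon)t)}}{\sqrt{\mu(x, \sqrt{t}) \, \mu(y, \sqrt{t})}}. \]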

4. As arguments from the proof entail, b) is also equivalent to the following condition: $M$ is volume doubling and there exists a constant $n > 0$ and for all $\epsilon > 0$ a constant $A_{\epsilon,n,C} > 0$, where $C$ is the doubling constant, such that for all $t > 0$, $x, y \in M$ one has
\[ p(t, x, y) \le \frac{A_{\epsilon,n,C} \, e^{-\varrho(x,y)^2/((4+\epsilon)t)}}{\mu(x, \sqrt{t})}. \]
To see this, apply estimate (64) below to estimate
\[ \mu(x, \sqrt{t})/\mu(y, \sqrt{t}) \le \mu(y, \sqrt{t} + \varrho(x, y))/\mu(y, \sqrt{t}) \le A_C (1 + \varrho(x, y)^2/t)^{n''/2}, \]

where $n'' = \log_2(C)$, and use again (63) with $r$ depending on $n$ and $n''$.

5. Note that the Theorem entails that certain heat kernel upper bounds (+ doubling) are stable under quasi-isometry¹¹. This is very surprising, as the heat kernel depends on $\Delta$, which carries derivatives of the metric, and these can differ as dramatically as we wish for quasi-isometric metrics.
6. One can also prove that if $M$ is geodesically complete and doubling with doubling constant $A > 0$, and if there exists a constant $B > 0$ such that for all $t > 0$,
\[ p(t, x, x) \le \frac{B}{\mu(x, \sqrt{t})}, \]
then there exists a constant $C > 0$ (which only depends on $A$ and $B$) such that for all $t > 0$ one has the lower bound
\[ p(t, x, x) \ge \frac{C}{\mu(x, \sqrt{t})}. \]
This is Theorem 16.6 in Grigor'yan's book.

Proof of Theorem 12.8: a) $\Rightarrow$ b): Let $x \in M$, $r > 0$ be arbitrary. Then for all open $U \subset B(x, r)$ we have
\[ \lambda_{\min}(U) \ge a(x, r) \mu(U)^{-2/n'}, \quad \text{where } a(x, r) := b r^{-2} \mu(x, r)^{2/n'}. \]
Applying Theorem 12.7 to the family $B(x, \sqrt{t})$, $x \in M$, immediately proves the heat kernel estimate. That relative Faber-Krahn implies doubling will be an exercise.

b) $\Rightarrow$ c): trivial.

c) $\Rightarrow$ a): Let $C$ be the doubling constant. Iterating the doubling inequality one gets that for all $0 < r \le R$, $x \in M$ one has
\[ \mu(x, R)/\mu(x, r) \le C (R/r)^{n'}, \tag{64} \]

where $n' := \log_2(C)$. Indeed, $R \le 2^N r$, where $N$ is the smallest natural number $\ge \log_2(R/r)$, and so $N \le \log_2(R/r) + 1$ and
\[ \mu(x, R)/\mu(x, r) \le \mu(x, 2^N r)/\mu(x, r) \le C^N \le C^{1 + \log_2(R/r)} = C (R/r)^{n'}. \]
Fix $x \in M$, $r > 0$, $U \subset B(x, r)$ open. Then for all $t > 0$ one has

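\[ e^{-\lambda_{\min}(U)t} \le \mathrm{tr}(e^{-tH^U}) = \int_U p^U(t, y, y) \, d\mu(y) \le C'' \int_U \mu(y, \sqrt{t})^{-1} \, d\mu(y). \tag{65} \]
If $y \in U$, $t \le r^2$, then by (64)
\[ \mu(x, r)/\mu(y, \sqrt{t}) \le \mu(y, 2r)/\mu(y, \sqrt{t}) \le C (r/\sqrt{t})^{n'}, \tag{66} \]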

11 Two Riemannian metrics $g, h$ are called quasi-isometric, if $C_1 h \le g \le C_2 h$, which easily shows that the volume measures and the quadratic forms $\int |f|^2 d\mu$ are equivalent.

and so
\[ e^{-\lambda_{\min}(U)t} \le C'' \int_U \mu(y, \sqrt{t})^{-1} \, d\mu(y) \le \frac{C'' C \mu(U)}{\mu(x, r)} \left( \frac{r}{\sqrt{t}} \right)^{n'} = \frac{C''' \mu(U)}{\mu(x, r)} \left( \frac{r}{\sqrt{t}} \right)^{n'}, \tag{67} \]
yielding

\[ \lambda_{\min}(U) \ge -t^{-1} \log \left( \frac{C''' \mu(U)}{\mu(x, r)} \left( \frac{r}{\sqrt{t}} \right)^{n'} \right). \tag{68} \]
Case $\mu(U) \le (C''' e)^{-1} \mu(x, r)$: Define $t$ by

\[ (r/\sqrt{t})^{n'} = \frac{\mu(x, r)}{C''' e \mu(U)}, \]
so
\[ 1/t = \frac{1}{r^2} \left( \frac{\mu(x, r)}{C''' e \mu(U)} \right)^{2/n'}. \]
Then we have $t \le r^2$ and so by (68),

\[ \lambda_{\min}(U) \ge -t^{-1} \log e^{-1} = t^{-1} = \frac{1}{r^2} \left( \frac{\mu(x, r)}{C''' e \mu(U)} \right)^{2/n'}, \]
proving relative Faber-Krahn in this case.

Case $\mu(U) > (C''' e)^{-1} \mu(x, r)$: using that balls are relatively compact, $M$ is connected and noncompact one proves (exercise) that doubling implies reverse doubling: there exist $n''', c$ (which only depend on the doubling constant $C$) such that for all $y \in M$, $0 < s \le S$ one has
\[ \frac{\mu(y, S)}{\mu(y, s)} \ge c (S/s)^{n'''}. \]
This reverse doubling implies that we can pick a constant $A > 1$, which only depends on $n''', c$, such that
\[ \frac{\mu(x, Ar)}{\mu(x, r)} \ge C''' e. \]
Then we have $U \subset B(x, Ar)$ and $\mu(U) \le (C''' e)^{-1} \mu(x, Ar)$ and the previous case applied to $Ar$ implies the first inequality in

\[ \lambda_{\min}(U) \ge \frac{1}{(Ar)^2} \left( \frac{\mu(x, Ar)}{C''' e \mu(U)} \right)^{2/n'} \ge \frac{1}{(Ar)^2} \left( \frac{\mu(x, r)}{C''' e \mu(U)} \right)^{2/n'}, \]
completing the proof. □

Recall that the Levi-Civita connection on $M$ is the uniquely determined smooth metric covariant derivative $\nabla$ on $TM$ which is torsion free, in the sense that

\[ \nabla_A B - \nabla_B A = [A, B] \]
for all smooth vector fields $A, B$ on $M$.

The Riemannian curvature tensor $\mathrm{Riem}$ is defined to be the curvature of $\nabla$,
\[ \mathrm{Riem} := R_\nabla \in \Gamma_{C^\infty}(M, (\wedge^2 T^*M) \otimes \mathrm{End}(TM)). \]
We recall that for smooth vector fields $A, B, C$ on $M$, this tensor is explicitly given by

\[ \mathrm{Riem}(A, B)C := \nabla_A \nabla_B C - \nabla_B \nabla_A C - \nabla_{[A,B]} C \in \mathfrak{X}_{C^\infty}(M). \]
Then the Ricci curvature
\[ \mathrm{Ric} \in \Gamma_{C^\infty}(M, T^*M \odot T^*M) \]
is the field of symmetric bilinear forms on $TM$ given by the fiberwise ($g$-)trace
\[ \mathrm{Ric}(A, B)|_U = \sum_{j=1}^{m} (\mathrm{Riem}(e_j, B)A, e_j), \]
where $e_1, \dots, e_m \in \mathfrak{X}_{C^\infty}(U)$ is a local orthonormal frame, and $A, B$ are as above. The condition $\mathrm{Ric} \ge \kappa$ for some constant $\kappa \in \mathbb{R}$ means that for all $A$ as above one has $\mathrm{Ric}(A, A) \ge \kappa |A|^2$ on $M$. Clearly the Ricci curvature of the Euclidean $\mathbb{R}^m$ is zero, and compact $M$'s have $\mathrm{Ric} \ge \kappa$ for some $\kappa \in \mathbb{R}$. On the other hand, if $M$ is complete and has $\mathrm{Ric} \ge \kappa > 0$, then $M$ is compact (by the Bonnet-Myers theorem). The hyperbolic space $\mathbb{H}^m$ has constant Ricci curvature $-1$, in the sense that $\mathrm{Ric}(A, B) = -(A, B)$ for all $A, B$ as above.

Theorem 12.10 (Grigor'yan 1986). If $M$ is geodesically complete with $\mathrm{Ric} \ge 0$, then $M$ satisfies the relative Faber-Krahn inequality. In particular, Theorem 12.8 applies.

The ultimate reason for the above Theorem is that geodesic completeness and nonnegative Ricci curvature imply the Laplacian comparison theorem, which states that for all fixed $O \in M$, one has
\[ \Delta \varrho(\cdot, O) \le (m - 1)/\varrho(\cdot, O), \]
wherever $\varrho(\cdot, O)$ is smooth (away from the union of $\{O\}$ and the cut-locus of $O$).
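In the Euclidean $\mathbb{R}^m$ the Laplacian comparison holds with equality, $\Delta \varrho(\cdot, O) = (m-1)/\varrho(\cdot, O)$ away from $O$. The following sketch checks this numerically with central finite differences; it assumes $\Delta = \sum_j \partial_j^2$ and the Euclidean distance, and the function names and step size are illustrative choices.

```python
import math


def distance(point, origin):
    """Euclidean distance between two points of R^m."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point, origin)))


def laplacian_of_distance(point, origin, h=1e-3):
    """Central finite-difference approximation of the Laplacian of rho(., origin)."""
    base = distance(point, origin)
    total = 0.0
    for j in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        # second difference quotient in the j-th coordinate direction
        total += (distance(forward, origin) - 2.0 * base + distance(backward, origin)) / (h * h)
    return total
```

For the point $(1, 2, 2) \in \mathbb{R}^3$ and $O = 0$ one has $\varrho = 3$, so the approximation should be close to $(m-1)/\varrho = 2/3$.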

13. Wiener measure and Brownian motion on Riemannian manifolds

Roughly speaking, one would like to construct Brownian motion $X(x_0)$ on $M$, starting from $x_0 \in M$, as follows: It should be an $M$-valued process¹² with continuous paths

\[ X(x_0) : [0, \infty) \times \Omega \longrightarrow M, \tag{69} \]
which is defined on some probability space $(\Omega, \mathbb{P}, \mathscr{F})$, and which has the transition probability densities given by $p(t, x, y)$. In other words, given $n \in \mathbb{N}$, a finite sequence of times $0 < t_1 < \cdots < t_n$ and Borel sets $A_1, \dots, A_n \subset M$, setting $\delta_j := t_{j+1} - t_j$ with $t_0 := 0$,

12 We recall that given two measurable spaces $\Omega_1$ and $\Omega_2$, a map
\[ X : [0, \infty) \times \Omega_1 \longrightarrow \Omega_2, \quad (t, \omega) \longmapsto X_t(\omega) \]
is called an $\Omega_2$-valued process, if for all $t \ge 0$ the induced map $X_t : \Omega_1 \to \Omega_2$ is measurable. The maps $t \mapsto X_t(\omega)$, with fixed $\omega \in \Omega_1$, are referred to as the paths of $X$.

we would like the probability of finding the Brownian particle simultaneously in $A_1$ at the time $t_1$, in $A_2$ at the time $t_2$, and so on, to be given by the quantity

\begin{align*}
\mathbb{P}\{X_{t_1}(x_0) \in A_1, \dots, X_{t_n}(x_0) \in A_n\}
= \int \cdots \int & 1_{A_1}(x_1) p(\delta_0, x_0, x_1) \cdots \\
& \times 1_{A_n}(x_n) p(\delta_{n-1}, x_{n-1}, x_n) \, d\mu(x_1) \cdots d\mu(x_n), \tag{70}
\end{align*}

whenever the particle starts from $x_0$. Equivalently, one could say that a Brownian motion on $M$ with starting point $x_0$ is a process with continuous paths (69), such that the finite-dimensional distributions of its law are given by the right-hand side of (70)¹³. In fact, such a path space measure is uniquely determined by its finite-dimensional distributions (cf. Remark 13.7 below). In particular, all Brownian motions should have the same law, which we will call the Wiener measure later on.

Ultimately, the above prescriptions indeed turn out to work perfectly well in terms of giving Brownian motion for the Euclidean $\mathbb{R}^m$ or for compact Riemannian manifolds. On the other hand, we see from (70) that, in particular, it is required that for all $t > 0$,
\[ \mathbb{P}\{X_t(x_0) \in M\} = \int_M p(t, x_0, y) \, d\mu(y), \]
and already if $M$ is any open bounded subset of $\mathbb{R}^m$, it automatically happens that
\[ \int_M p(t, x_0, y) \, d\mu(y) < 1 \quad \text{for some } (t, x_0) \in (0, \infty) \times M. \tag{71} \]
This leads to the conceptual difficulty that the process can leave its space of states with a strictly positive probability and leads to:

Definition 13.1. $M$ is called stochastically complete, if for all $t > 0$, $x_0 \in M$ one has
\[ \int_M p(t, x_0, y) \, d\mu(y) = 1. \]

Remark 13.2. Stochastic completeness is unrelated to geodesic completeness. For example, $\mathbb{R}^m \setminus \{0\}$ is stochastically complete but geodesically incomplete, and there exist geodesically complete but stochastically incomplete $M$'s. On the other hand, a celebrated result by Yau states that if $M$ is geodesically complete with a Ricci curvature bounded from below by a constant, then $M$ is stochastically complete. In particular, the Euclidean $\mathbb{R}^m$ is stochastically complete (of course this follows also simply from a calculation), and compact $M$'s are stochastically complete.
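The loss of mass (71) can be made explicit on the interval $M = (0, \pi) \subset \mathbb{R}$. The sketch below assumes that $p$ is the Dirichlet heat kernel of the interval, $p(t,x,y) = \frac{2}{\pi} \sum_{k \ge 1} e^{-k^2 t} \sin(kx) \sin(ky)$, so that integrating in $y$ term by term gives $\int_0^\pi p(t,x,y)\,dy = \frac{4}{\pi} \sum_{k \text{ odd}} e^{-k^2 t} \sin(kx)/k$. The function name and the truncation of the series are illustrative choices.

```python
import math


def total_mass(t, x, terms=2000):
    """Integral over (0, pi) of the Dirichlet heat kernel p(t, x, .), for t > 0."""
    total = 0.0
    for k in range(1, terms + 1, 2):  # even k integrate to zero
        total += math.exp(-k * k * t) * math.sin(k * x) / k
    return 4.0 / math.pi * total
```

For very small times the mass starting from the midpoint is numerically indistinguishable from 1, whereas at $t = 1$ more than half of it has already been lost, in line with (71).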

13 The law of $X(x_0)$ is by definition the probability measure on the space of continuous paths on $M$, which is defined as the pushforward of $\mathbb{P}$ under the induced map
\[ \Omega \longrightarrow C([0, \infty), M), \quad \omega \longmapsto X_\bullet(x_0)(\omega). \]

Since we aim to work on arbitrary Riemannian manifolds, we need to solve the above conceptual problem of stochastic incompleteness. This is done by using the Alexandrov compactification of M. Since it does not cause much extra work, we start by explaining the corresponding constructions in the setting of an arbitrary Polish space, recalling that a topological space is called Polish, if it is separable and if it admits a complete metric which induces the original topology. Notation 13.3. Given a locally compact Polish space N, we set

\[ \widetilde{N} := \begin{cases} N, & \text{if $N$ is compact,} \\ \text{Alexandrov compactification } N \cup \{\infty_N\}, & \text{if $N$ is noncompact.} \end{cases} \]

We recall here that $\infty_N$ is any point $\notin N$, and that the topology on $N \cup \{\infty_N\}$ is defined as follows: $U \subset N \cup \{\infty_N\}$ is declared to be open, if and only if either $U$ is an open subset of $N$ or if there exists a compact set $K \subset N$ such that $U = (N \setminus K) \cup \{\infty_N\}$. This construction depends trivially on the choice of $\infty_N$, in the sense that for any other choice $\infty'_N \notin N$, the canonical bijection $N \cup \{\infty_N\} \to N \cup \{\infty'_N\}$ is a homeomorphism. We consider the path space $\Omega_N := C([0, \infty), \widetilde{N})$, and thereon we denote (with a slight abuse of notation) the canonically given coordinate process by

\[ \mathbb{X} : [0, \infty) \times \Omega_N \longrightarrow \widetilde{N}, \quad \mathbb{X}_t(\gamma) := \gamma(t). \]

We consider $\Omega_N$ a topological space with respect to the topology of uniform convergence on compact subsets, and we equip it with its Borel sigma-algebra $\mathscr{F}^N$. We fix such a locally compact Polish space $N$ (e.g., a manifold) for the moment. It is well-known that $\Omega_N$ as defined above is Polish again. In fact, $\widetilde{N}$ is Polish, and if we pick a bounded metric $\varrho_{\widetilde{N}} : \widetilde{N} \times \widetilde{N} \to [0, 1]$ which induces the original topology on $\widetilde{N}$, then
\[ \varrho_{\Omega_N}(\gamma_1, \gamma_2) := \sum_{j=1}^{\infty} 2^{-j} \max_{0 \le t \le j} \varrho_{\widetilde{N}}(\gamma_1(t), \gamma_2(t)) \]

is a complete separable metric¹⁴ on $\Omega_N$ which induces the original topology (of local uniform convergence). Furthermore, since evaluation maps of the form

X1 × C(X1,X2) −→ X2, (x, f) 7−→ f(x) are always jointly continuous, if X1 is locally compact and Hausdorff and if C(X1,X2) is equipped with its topology of local uniform convergence, it follows that X is in fact jointly continuous. In particular, X is jointly (Borel) measurable.

14 In fact, it is easy to see that this is a complete metric which induces the original topology. On the other hand, the proof that this topology is separable is a little tricky. Although it is not so easy to find a precise reference, we believe that these results can be traced back to Kolmogorov.

Notation 13.4. Given a set $\Omega$ and a collection $\mathscr{C}$ of subsets of $\Omega$ or of maps with domain $\Omega$, the symbol $\langle \mathscr{C} \rangle$ stands for the smallest sigma-algebra on $\Omega$ which contains $\mathscr{C}$. Furthermore, whenever there is no danger of confusion, we will use notations such as

{f ∈ A} := {y ∈ Ω: f(y) ∈ A} ⊂ Ω,

where f :Ω → Ω0 and A ⊂ Ω0.

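Definition 13.5. 1. A subset $C \subset \Omega_N$ is called a Borel cylinder, if there exist $n \in \mathbb{N}$, $0 < t_1 < \cdots < t_n$ and Borel sets $A_1, \dots, A_n \subset \widetilde{N}$, such that

\[ C = \{\mathbb{X}_{t_1} \in A_1, \dots, \mathbb{X}_{t_n} \in A_n\} = \bigcap_{j=1}^{n} \mathbb{X}_{t_j}^{-1}(A_j). \]

The collection of all Borel cylinders in $\Omega_N$ will be denoted by $\mathscr{C}^N$.
2. Likewise, given $t \ge 0$, the collection $\mathscr{C}^N_t$ of Borel cylinders in $\Omega_N$ up to the time $t$ is defined to be the collection of subsets $C \subset \Omega_N$ of the form

\[ C = \{\mathbb{X}_{t_1} \in A_1, \dots, \mathbb{X}_{t_n} \in A_n\} = \bigcap_{j=1}^{n} \mathbb{X}_{t_j}^{-1}(A_j), \]

where $n \in \mathbb{N}$, $0 < t_1 < \cdots < t_n < t$, and where $A_1, \dots, A_n \subset \widetilde{N}$ are Borel sets.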

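It is easily checked inductively that both $\mathscr{C}^N$ and $\mathscr{C}^N_t$ are $\pi$-systems in $\Omega_N$, that is, both collections are (nonempty and) stable under taking finitely many intersections. The following fact makes $\mathscr{F}^N$ handy in applications:

Lemma 13.6. One has

\[ \mathscr{F}^N = \langle \mathscr{C}^N \rangle = \big\langle (\mathbb{X}_s : \Omega_N \longrightarrow \widetilde{N})_{s \ge 0} \big\rangle. \tag{72} \]

Proof: Since for every fixed $s \ge 0$ the map
\[ \mathbb{X}_s : \Omega_N \longrightarrow \widetilde{N}, \quad \gamma \longmapsto \gamma(s) \]
is $\mathscr{F}^N$-measurable, it is clear that $\mathscr{C}^N \subset \mathscr{F}^N$, and therefore

\[ \langle \mathscr{C}^N \rangle \subset \mathscr{F}^N. \]

In order to see
\[ \mathscr{F}^N \subset \langle \mathscr{C}^N \rangle, \]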

pick a topology-defining metric % on N and denote the corresponding closed balls by Ne e B (x, r). Then, since the elements of Ω are continuous, for all γ ∈ Ω , n ∈ ,  > 0 Ne N 0 N N 62 B. GUNEYSU¨ one has   γ : max % (γ(t), γ0(t)) ≤  0≤t≤n Ne \ = γ : γ(t) ∈ B (γ (t), ) , Ne 0 0≤t≤n, t is rational \ = γ : γ(t) ∈ B (γ (t), ) . Ne 0 0 0(73) 0≤t≤n Ne N

N are C -measurable. Since the collection of sets of the form (73) generates the topology of local uniform convergence15, it is clear that the induced Borel sigma-algebra F N satisfies N N F ⊂ C . The inclusion N D E C ⊂ (Xs :ΩN −→ Ne)s≥0

N −1 is clear, since each set in C is a finite intersection of sets of the form Xs (A), s > 0, A ⊂ Ne Borel. To see D E N (Xs :ΩN −→ Ne)s≥0 ⊂ C , note that for every metric % that generates the topology on N, one has Ne e D E D E ( :Ω −→ N) =  −1B (x, r) : x ∈ N, r > 0, s ≥ 0 , Xs N e s≥0 Xs Ne e with the corresponding closed balls B (... ), so that it only remains to prove Ne −1B (x, r) ∈ N X0 Ne C for all x ∈ Ne, r > 0. This, however, follows from −1   0 B% (x, r) = γ : lim % (γ(1/n), x) ≤ r , X Ne n→∞ Ne since clearly γ 7→ % (γ(1/n), x) is a N -measurable function on Ω (the pre-image of Ne C N an interval of the form [0,R] under this map is the cylinder set −1 B (x, R)). This X1/n Ne completes the proof.  Remark 13.7. By the above lemma, C N is a π-system that generates F N . It then follows from an abstract measure theoretic result that every finite measure on F N is uniquely determined by its values on C N .

15To be precise, this collection forms a basis of neighbourhoods of this topology. Heat kernels on Riemannian manifolds 63

Definition 13.8. Setting

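\[ \mathscr{F}^N_t := \big\langle (\mathbb{X}_s : \Omega_N \longrightarrow \widetilde{N})_{0 \le s \le t} \big\rangle \]
for every $t \ge 0$, it follows from Lemma 13.6 that
\[ \mathscr{F}^N_* := (\mathscr{F}^N_t)_{t \ge 0} \]
becomes a filtration of $\mathscr{F}^N$. It is called the filtration generated by the coordinate process on $\Omega_N$. Precisely as for the second equality in (72), one proves
\[ \mathscr{F}^N_t = \langle \mathscr{C}^N_t \rangle \quad \text{for all } t \ge 0. \tag{74} \]
Particularly important $\mathscr{F}^N_t$-measurable sets are provided by exit times:

Definition 13.9. Given an arbitrary subset $U \subset \widetilde{N}$, we define

\[ \zeta_U : \Omega_N \longrightarrow [0, \infty], \quad \zeta_U := \inf\{t \ge 0 : \mathbb{X}_t \in \widetilde{N} \setminus U\}, \tag{75} \]
and call this map the first exit time of $\mathbb{X}$ from $U$, with $\inf\{\dots\} := \infty$ in case the set is empty.

There is the following result, which in a probabilistic language means that first exit times from open sets are $\mathscr{F}^N_*$-optional times¹⁶:

Lemma 13.10. Assume that $U \subset \widetilde{N}$ is open with $U \ne \widetilde{N}$. Then one has
\[ \{t < \zeta_U\} \in \mathscr{F}^N_t \quad \text{for all } t \ge 0. \]

Proof: The proof actually only uses that $\mathbb{X}$ has continuous paths and that $\widetilde{N}$ is metrizable: Pick a metric $\varrho_{\widetilde{N}}$ on $\widetilde{N}$ which induces the original topology. Then, since $\widetilde{N} \setminus U$ is closed and $\mathbb{X}$ has continuous paths, we have
\[ \{t < \zeta_U\} = \bigcup_{n \in \mathbb{N}} \; \bigcap_{0 \le s \le t, \; s \text{ is rational}} \{\varrho_{\widetilde{N}}(\mathbb{X}_s, \widetilde{N} \setminus U) \ge 1/n\}. \]
The set on the right-hand side clearly is $\in \mathscr{F}^N_t$, since the distance function to a nonempty set is continuous and thus Borel. □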
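For intuition, the definition (75) has an obvious analogue for a path that is only known at finitely many sample times. The following is a minimal sketch of that discrete analogue, not of the continuous-time object itself; the function name and the representation of $U$ by a membership predicate are hypothetical choices.

```python
import math


def first_exit_time(times, path, in_U):
    """Discrete analogue of zeta_U = inf{t >= 0 : X_t not in U}.

    times -- increasing sample times
    path  -- positions of the path at those times
    in_U  -- predicate deciding whether a position lies in U
    """
    for t, position in zip(times, path):
        if not in_U(position):
            return t
    return math.inf  # convention: the infimum of the empty set is infinity
```

With $U = (-1, 1) \subset \mathbb{R}$ and a sampled path that first leaves $U$ at the third sample time, the function returns that time; for a path that stays in $U$ it returns `math.inf`, matching the convention $\inf \emptyset := \infty$.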

14. The Wiener measure on Riemannian manifolds We return to our Riemannian setting. In order to apply the above abstract machinery in this case, we have to extend some Riemannian data to the compactification of M (in the noncompact case):

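16 Let $(\Omega, \mathscr{F})$ be a measurable space, and let $\mathscr{F}_* = (\mathscr{F}_t)_{t \ge 0}$ be a filtration of $\mathscr{F}$. Then a map $\tau : \Omega \to [0, \infty]$ is called an $\mathscr{F}_*$-optional time, if for all $t \ge 0$ one has $\{t < \tau\} \in \mathscr{F}_t$, and it is called an $\mathscr{F}_*$-stopping time, if for all $t \ge 0$ one has $\{t \le \tau\} \in \mathscr{F}_t$.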

Notation 14.1. Let $\widetilde{\mu}$ denote the Borel measure on $\widetilde{M}$ given by $\mu$ if $M$ is compact, and which is extended to $\infty_M$ by setting $\widetilde{\mu}(\infty_M) = 1$ in the noncompact case. Then we define a Borel function
\[ \widetilde{p} : (0, \infty) \times \widetilde{M} \times \widetilde{M} \longrightarrow [0, \infty) \]
as follows: $\widetilde{p} := p$ if $M$ is compact, and in case $M$ is noncompact, then for $t > 0$, $x, y \in M$ we set

\[ \widetilde{p}(t, x, y) := p(t, x, y), \quad \widetilde{p}(t, x, \infty_M) := 0, \quad \widetilde{p}(t, \infty_M, \infty_M) := 1, \quad \widetilde{p}(t, \infty_M, y) := 1 - \int_M p(t, y, z) \, d\mu(z). \]
It is straightforward to check that the pair $(\widetilde{p}, \widetilde{\mu})$ satisfies the Chapman-Kolmogorov equations, that is, for all $s, t > 0$, $x, y \in \widetilde{M}$ one has
\[ \int_{\widetilde{M}} \widetilde{p}(t, x, z) \, \widetilde{p}(s, y, z) \, d\widetilde{\mu}(z) = \widetilde{p}(s + t, x, y). \tag{76} \]
Furthermore, one has
\[ \int_{\widetilde{M}} \widetilde{p}(t, x, y) \, d\widetilde{\mu}(y) = 1 \quad \text{for all } x \in \widetilde{M}, \tag{77} \]
in contrast to the possibility of $\int_M p(t, x, y) \, d\mu(y) < 1$ in case $M$ is stochastically incomplete. It is precisely the conservation of probability (77) which motivates the above Alexandrov machinery. Since there is no danger of confusion, the following abuse of notation will be very convenient in the sequel:

Notation 14.2. We write $\zeta := \zeta_M$ for the first exit time of the coordinate process $\mathbb{X}$ on $\Omega_M$ from $M \subset \widetilde{M}$. For obvious reasons, $\zeta$ is also called the explosion time of $\mathbb{X}$. Note also that one has $\zeta > 0$, and that by our previous conventions we have $\zeta \equiv \infty$ if $M$ is compact. The last fact is consistent with the fact that compact Riemannian manifolds are stochastically complete. The following existence result will be central in the sequel:

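Proposition and definition 14.3. The Wiener measure $\mathbb{P}^{x_0}$ with initial point $x_0 \in M$ is defined to be the unique probability measure on $(\Omega_M, \mathscr{F}^M)$ which satisfies

\begin{align*}
\mathbb{P}^{x_0}\{\mathbb{X}_{t_1} \in A_1, \dots, \mathbb{X}_{t_n} \in A_n\}
= \int \cdots \int & 1_{A_1}(x_1) \widetilde{p}(\delta_0, x_0, x_1) \cdots \\
& \times 1_{A_n}(x_n) \widetilde{p}(\delta_{n-1}, x_{n-1}, x_n) \, d\widetilde{\mu}(x_1) \cdots d\widetilde{\mu}(x_n)
\end{align*}
for all $n \in \mathbb{N}$, all finite sequences of times $0 < t_1 < \cdots < t_n$ and all Borel sets $A_1, \dots, A_n \subset \widetilde{M}$, where $\delta_j := t_{j+1} - t_j$ with $t_0 := 0$. It has the additional property that
\[ \mathbb{P}^{x_0}\Big( \{\zeta = \infty\} \cup \big\{ \zeta < \infty \text{ and } \mathbb{X}_t = \infty_M \text{ for all } t \in [\zeta, \infty) \big\} \Big) = 1, \tag{78} \]

in other words, the point at infinity $\infty_M$ is a "trap"¹⁷ for $\mathbb{P}^{x_0}$-a.e. path.

Proof: Some remarks during class. □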

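An obvious but nevertheless very important consequence of (78) is that for all $x_0 \in M$ one has

\[ \mathbb{P}^{x_0}\{1_{\{t < \zeta\}} = 1_{\{\mathbb{X}_t \in M\}}\} = 1. \tag{79} \]
In the sequel, integration with respect to the Wiener measure will often be written as an expectation value,
\[ \mathbb{E}^{x_0}[\Psi] := \int \Psi \, d\mathbb{P}^{x_0} := \int \Psi(\gamma) \, d\mathbb{P}^{x_0}(\gamma), \]

where $\Psi : \Omega_M \to \mathbb{C}$ is any appropriate (say, nonnegative or integrable) Borel function. We remark that using monotone convergence, the defining relation of the Wiener measure implies that for all $n \in \mathbb{N}$, all finite sequences of times $0 < t_1 < \cdots < t_n$ and all Borel functions $f_1, \dots, f_n : \widetilde{M} \longrightarrow [0, \infty)$, one has

\begin{align*}
\mathbb{E}^{x_0}[f_1(\mathbb{X}_{t_1}) \cdots f_n(\mathbb{X}_{t_n})]
= \int \cdots \int & f_1(x_1) \widetilde{p}(\delta_0, x_0, x_1) \cdots \tag{80} \\
& \times f_n(x_n) \widetilde{p}(\delta_{n-1}, x_{n-1}, x_n) \, d\widetilde{\mu}(x_1) \cdots d\widetilde{\mu}(x_n), \tag{81}
\end{align*}
where $\delta_j := t_{j+1} - t_j$ with $t_0 := 0$. In particular, by the very construction of $\widetilde{M}$ and $\widetilde{\mu}$, the above formula in combination with (79) implies

\begin{align*}
\mathbb{E}^{x_0}\big[1_{\{t_1 < \zeta\}} f_1(\mathbb{X}_{t_1}) \cdots 1_{\{t_n < \zeta\}} f_n(\mathbb{X}_{t_n})\big]
&= \mathbb{E}^{x_0}\big[1_{\{\mathbb{X}_{t_1} \in M\}} f_1(\mathbb{X}_{t_1}) \cdots 1_{\{\mathbb{X}_{t_n} \in M\}} f_n(\mathbb{X}_{t_n})\big] \tag{82} \\
&= \int \cdots \int f_1(x_1) p(\delta_0, x_0, x_1) \cdots \\
&\qquad \times f_n(x_n) p(\delta_{n-1}, x_{n-1}, x_n) \, d\mu(x_1) \cdots d\mu(x_n), \tag{83}
\end{align*}
therefore quantities that are given by averaging over paths that remain on $M$ until any fixed time can be calculated by genuine Riemannian data on $M$, as it should be. In the sequel, we will also freely use the following facts: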

Remark 14.4. 1. Each of the measures Px0 is concentrated on the set of paths that start in x0, meaning that x0 P {X0 = x0} = 1 for all x0 ∈ M, as it should be. To see this, pick a metric %e on Mf which induces the topology on Mf, and set ˜ fe := %e(•, x0) − %e(∞M , x0) ∈ C(M). 17 It is a trap in the sense that once a path touches ∞M , it remains there for all times. 66 B. GUNEYSU¨

As x0 ∈ M, the very definition of (p,e µe) implies that for all t > 0 one has Z Z pe(t, x0, y)%e(y, x0)dµe(y) = p(t, x0, y)fe|M (y)dµ(y) + %e(∞M , x0), Mf M which, since fe|M is a continuous bounded function on M, implies through (80) and the fact that for all f ∈ Cb(M) one has [7]

Ptf → f locally uniformly as t → 0+, the L1-convergence Z x0 E [%e(Xt, x0)] = pe(t, x0, y)%e(y, x0)dµe(y) → 0 as t → 0+. Mf Thus we can pick a sequence of strictly positive times a with a → 0 such that %( , x) → n n e Xan 0 Px0 -a.e., and the claim follows from %( , x) ≤ %( , ) + %( , x) for all n ∈ e X0 e X0 Xan e Xan N and the continuity of the paths of X. 2. For every Borel set N ⊂ M with µ(N) = 0 and every x ∈ M, one has Z ∞ Z Z ∞ Z x (84) 1{(s0,γ0): γ0(s0)∈N}(s, γ)dP (γ)ds = p(s, x, y)dµ(y)ds = 0. 0 ΩM 0 N This fact follows immediately from the defining relation of the Wiener measure. For the first identity in (84), one also needs Fubini’s Theorem, which can be used due to X being jointly measurable. 3. For each fixed A ∈ F M , the map x M −→ [0, 1], x 7−→ P (A)(85) is Borel measurable. In fact, this is obvious for A ∈ C M by the defining relation of the Wiener measure, and it holds in general by the monotone class theorem, since C M is a π-system which generates F M , and since the collection of sets {A : A ∈ F M , (85) is Borel measurable} forms a monotone Dynkin-system. The following result is crucial: Lemma 14.5. The family of Wiener measures satisfies the following Markov property: M For all x0 ∈ M, all times t ≥ 0, all Ft -measurable functions φ :ΩM → [0, ∞), and all M F -measurable functions Ψ:ΩM → [0, ∞), one has Z Z Z x γ(t) x (86) φ(γ)Ψ(γ(t + •))dP 0 (γ) = φ(γ) Ψ(ω)dP (ω)dP 0 (γ) ∈ [0, ∞]. Heat kernels on Riemannian manifolds 67

Proof : By monotone convergence, it is sufficient to consider the case φ = 1A, Ψ = 1B with M M A ∈ Ft , B ∈ FM . Furthermore, for fixed A ∈ Ft , using a monotone class argument as in Remark 14.4.3, it follows that it is sufficient to prove the formula for B ∈ C M . Using yet another monotone class argument, it follows that ultimately we have to check the formula M only for φ = 1A, Ψ = 1B with A ∈ Ct , B ∈ CM . So we pick k, l ∈ N, finite sequences of times 0 < r1 < ··· < rk < t, 0 < s1 < ··· < sl, Borel sets

A1,...,Ak,B1,...,Bl ⊂ Mf with k l \ \ A = −1(A ),B = −1(B ), Xri i Xsi i i=1 i=1 and s0 := 0, r0 := 0. Then by the defining relation of the Wiener measure we have Z x0 1A(γ) · 1B(γ(t + •))dP (γ) Z = 1 ··· 1 1 ··· 1 d x0 {Xr1 ∈A1} {Xrk ∈Ak} {Xs1+t∈B1} {Xsl+t∈Bl} P Z Z = ··· 1 (x )p(r − r , x , x ) ··· 1 (x )p(r − r , x , x ) A1 1 e 1 0 0 1 Ak k e k k−1 k−1 k × 1 (x )p(s + t − r , x , x ) ··· B1 k+1 e 1 k k k+1 × 1 (x )p(s − s , x , x )dµ(x ) ··· dµ(x ). Bl k+l e l l−1 k+l−1 k+l e 1 e k+l

On the other hand, if for every y0 ∈ Mf we set Z Z Ψ(y ) := ··· 1 (y )p(s − s , y , y ) ··· 0 B1 1 e 1 0 0 1 × 1 (y )p(s − s , y , y )dµ(y ) ··· dµ(y ), Bl l e l l−1 l−1 l e 1 e l then by using the defining relation of the Wiener measure for the dPγ(t)(ω) integration and then using (80), we get Z Z γ(t) x0 1A(γ) 1B(ω)dP (ω)dP (γ) Z = 1 (γ) ··· 1 (γ)Ψ(γ(t))d x0 (γ) {Xr1 ∈A1} {Xrk ∈Ak} P Z Z = ··· 1 (z )p(r − r , x , z ) ··· 1 (z )p(r − r , z , z ) A1 1 e 1 0 0 1 Ak k e k−1 k k−1 k × p(t − r , z , z)1 (y )p(s − s , z, y ) ··· 1 (y )p(s − s , y , y ) e k k B1 1 e 1 0 1 Bl l e l l−1 l−1 l × dµe(y1) ··· dµe(yl)dµe(z1) ··· dµe(zk)dµe(z), which is equal to the above expression for Z x0 1A(γ) · 1B(γ(t + •))dP (γ), 68 B. GUNEYSU¨

since by the Chapman-Kolomogorov equation and recalling s0 = 0, we have ZZ p(t − r , z , z)1 (y )p(s − s , z, y )dµ(z)dµ(y ) e k k B1 1 e 1 0 1 e e 1 Z = p(t − r + s , z , y )1 (y )dµ(y ). e k 1 k 1 B1 1 e 1 This completes the proof.  Now we are in the position to define Brownian motion on an arbitrary Riemannian mani- fold:

Definition 14.6. 1. Let (Ω, F , P) be a probability space, x0 ∈ M, and let

X(x0) : [0, ∞) × Ω −→ M,f (t, ω) 7−→ Xt(x0)(ω)

be a continuous process. Then the tuple (Ω, F , P,X(x0)) is called a Brownian motion on M with starting point x0, if the law of X(x0) with respect to P is equal to the Wiener measure Px0 . Recall that this means the following: The pushforward of P with respect to the F /F M measurable18 map  (87) Ω −→ ΩM , ω 7−→ t 7−→ Xt(x0)(ω) is Px0 . 2. Assume that (Ω, F , P,X(x0)) is a Brownian motion on M with starting point x0, and that F∗ := (Ft)t≥0 is a filtration of F . Then the tuple (Ω, F , F∗, P,X(x0)) is called an adapted Brownian motion on M with starting point x0, if X(x0) is adapted to F∗ := (Ft)t≥0 (that is, Xt(x0):Ω → Mf is Ft-measurable for all t ≥ 0) and if in addition the following Markov property holds: For all times t ≥ 0, all Ft measurable functions φ :Ω → [0, ∞), and all Borel functions Ψ : ΩM → [0, ∞), one has Z Z Z Xt(x0)(ω) φ(ω)Ψ(Xt+•(x0)(ω))dP(ω) = φ(ω) Ψ(γ)dP (γ)dP(ω). It follows from the above results that a canonical adapted Brownian motion with starting point x0 is given in terms of the Wiener measure by the datum

\[
(\Omega, \mathcal{F}, \mathcal{F}_*, \mathbb{P}, X(x_0)) := (\Omega_M, \mathcal{F}^M, \mathcal{F}^M_*, \mathbb{P}^{x_0}, \mathbb{X}). \tag{88}
\]
Having recorded the existence of Brownian motion, we can immediately record the following characterization of the stochastic completeness property that was previously defined by the "parabolic condition"
\[
\int_M p(t, x_0, y)\, d\mu(y) = 1 \quad \text{for all } (t, x_0) \in (0, \infty) \times M:
\]

Namely, M is stochastically complete, if and only if for every x0 ∈ M and every Brownian motion (Ω, F , P,X(x0)) on M with starting point x0, one has P{Xt(x0) ∈ M} = 1 for all t ≥ 0,

$^{18}$ Note that by assumption $X_t(x_0)$ is $\mathcal{F}_t^M$-measurable for all $t \geq 0$, so that indeed (87) is automatically $\mathcal{F}/\mathcal{F}^M$ measurable.

that is, if all Brownian motions remain on M for all times. This observation follows immediately from the defining relation of the Wiener measure. The second part of Definition 14.6 is motivated by the fact that every Brownian motion has the required Markov property with respect to its own filtration:

Lemma 14.7. Every Brownian motion $(\Omega, \mathcal{F}, \mathbb{P}, X(x_0))$ on $M$ with starting point $x_0$ is automatically an $(\mathcal{F}_t^{X(x_0)})_{t \geq 0}$-Brownian motion, where

\[
\mathcal{F}_t^{X(x_0)} := \big\langle (X_s(x_0))_{0 \leq s \leq t} \big\rangle, \quad t \geq 0
\]

denotes the filtration of F which is generated by X(x0).

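Proof: We have to show that given $t \geq 0$, an $\mathcal{F}_t^{X(x_0)}$-measurable function $\varphi : \Omega \to [0, \infty)$, and a Borel function $\Psi : \Omega_M \to [0, \infty)$, one has
\[
\int \varphi(\omega)\, \Psi\big(X_{t+\bullet}(x_0)(\omega)\big)\, d\mathbb{P}(\omega) = \int \varphi(\omega) \int \Psi(\gamma)\, d\mathbb{P}^{X_t(x_0)(\omega)}(\gamma)\, d\mathbb{P}(\omega).
\]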

Assume for the moment that we can pick an $\mathcal{F}_t^M$-measurable function $f : \Omega_M \to [0, \infty)$ such that $f(X'(x_0)) = \varphi$, where
\[
X'(x_0) : \Omega \longrightarrow \Omega_M
\]

denotes the induced $\mathcal{F}/\mathcal{F}^M$ measurable map (87). Then, since the law of $X(x_0)$ is $\mathbb{P}^{x_0}$, we can use the Markov property from Lemma 14.5 to calculate
\begin{align*}
\int \varphi(\omega)\, \Psi\big(X_{t+\bullet}(x_0)(\omega)\big)\, d\mathbb{P}(\omega)
&= \int f(\omega')\, \Psi\big(\omega'(t + \bullet)\big)\, d\mathbb{P}^{x_0}(\omega') \\
&= \int f(\omega') \int \Psi(\gamma)\, d\mathbb{P}^{\omega'(t)}(\gamma)\, d\mathbb{P}^{x_0}(\omega') \\
&= \int f\big(X'(x_0)(\omega)\big) \int \Psi(\gamma)\, d\mathbb{P}^{X_t(x_0)(\omega)}(\gamma)\, d\mathbb{P}(\omega) \\
&= \int \varphi(\omega) \int \Psi(\gamma)\, d\mathbb{P}^{X_t(x_0)(\omega)}(\gamma)\, d\mathbb{P}(\omega),
\end{align*}
proving the claim in this case. It remains to prove that one can always "factor" $\varphi$ in the above form. Somewhat simpler variants of such a statement are usually called the Doob-Dynkin lemma in the literature. An important point here is that the factoring procedure can be chosen to be positivity preserving. We give a quick proof: set $X := X(x_0)$, $X' := X'(x_0)$, and assume first that $\varphi$ is a simple function, that is, $\varphi$ is a finite sum $\varphi = \sum_j c_j 1_{A_j}$ with constants $c_j \geq 0$ and disjoint sets $A_j \in \mathcal{F}_t^X$. Then by the definition of this sigma-algebra, there exist times $0 \leq s_j \leq t$ and Borel sets $B_j \subset \widetilde{M}$ with $A_j = X_{s_j}^{-1}(B_j)$, such that with $C_j := \mathbb{X}_{s_j}^{-1}(B_j) \in \mathcal{F}_t^M$, the function $f := \sum_j c_j 1_{C_j}$ on $\Omega_M$ is nonnegative, $\mathcal{F}_t^M$-measurable, and satisfies $f(X') = \varphi$. In the general case, there exists an increasing sequence of nonnegative $\mathcal{F}_t^X$-measurable simple functions $\varphi_n$ on $\Omega$ such that $\lim_n \varphi_n = \varphi$.

By the above, we can pick for each $n$ an $\mathcal{F}_t^M$-measurable nonnegative function $f_n$ on $\Omega_M$ with $f_n(X') = \varphi_n$. The set
\[
\Omega' := \{f_n \text{ converges pointwise}\} \subset \Omega_M
\]
clearly contains the image of $X'$, and it is straightforwardly seen to be $\mathcal{F}_t^M$-measurable. Then $f := \lim_n (f_n 1_{\Omega'})$ has the desired properties. Note that the above proof is entirely measure theoretic and does not use any particular (say, topological) properties of the involved quantities. $\square$

Without entering the details, we remark here that the importance of adapted Brownian motions stems from the fact that they are continuous $M$-valued semimartingales with respect to the given filtration (in the sense that their compositions with arbitrary smooth functions are real-valued semimartingales) [21, 10]. Being a continuous semimartingale, the paths of an adapted Brownian motion can be almost surely horizontally lifted (in a natural sense that relies on Stratonovich stochastic integrals) to smooth principal bundles that come equipped with a smooth connection [10]. This is a very remarkable fact, since Brownian paths are almost surely nowhere differentiable [10]. Such lifts are the main ingredient of probabilistic formulae for the heat semigroups associated with operators of the form $\nabla^\dagger \nabla$ acting on $L^2$-sections of a metric vector bundle over $M$ [21].

Definition 14.8. $M$ is called nonparabolic, if and only if for all $x, y \in M$ with $x \neq y$ one has the finiteness of the Coulomb potential
\[
G(x, y) := \int_0^\infty p(t, x, y)\, dt.
\]
One can easily show that compact $M$'s are always parabolic, and that the Euclidean $\mathbb{R}^m$ is nonparabolic if and only if $m \geq 3$. Probabilistically, this property means [8]:

Theorem 14.9. M is nonparabolic, if and only if every Brownian motion (Ω, F , P,X(x0)) on M with starting point x0 is transient, in the sense that for every precompact set U ⊂ M one has

\[
\mathbb{P}\{\text{there exists } s > 0 \text{ such that for all } t > s \text{ one has } X_t(x_0) \notin U\} = 1,
\]
that is, if and only if all Brownian motions on $M$ eventually leave each precompact set almost surely.

One can show that if $M$ is geodesically complete with $\mathrm{Ric} \geq 0$, then $M$ is nonparabolic, if and only if
\[
\int_1^\infty \frac{t}{\mu(B(x, t))}\, dt < \infty \quad \text{for all } x \in M.
\]

Finally, we are going to sketch a proof of the Feynman-Kac formula. The aim here is to derive a path integral formula for the semigroup $P_t^w := e^{-tH_w} \in \mathscr{L}(L^2(M))$ associated with a Schrödinger operator of the form $H_w := -\Delta + w$, where $w : M \to \mathbb{R}$ is a potential. In case $w = 0$ we simply have
\[
P_t f(x) = \mathbb{E}^x\big[1_{\{t < \zeta\}} f(\mathbb{X}_t)\big],
\]

which is a path integral formula, as
\[
\mathbb{E}^x\big[1_{\{t < \zeta\}} f(\mathbb{X}_t)\big] = \int_{\{t < \zeta\}} f(\gamma(t))\, d\mathbb{P}^x(\gamma).
\]
In the general case, according to Richard Feynman's thesis, we expect a formula of the form

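\[
P_t^w f(x) = \int_{\{t < \zeta\}} e^{-\int_0^t w(\gamma(s))\, ds} f(\gamma(t))\, d\mathbb{P}^x(\gamma) = \mathbb{E}^x\Big[1_{\{t < \zeta\}}\, e^{-\int_0^t w(\mathbb{X}_s)\, ds} f(\mathbb{X}_t)\Big]. \tag{89}
\]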

Actually, in quantum physics, one is rather interested in the unitary group $e^{-itH_w} \in \mathscr{L}(L^2(M))$, which with $\Psi(t) := P_{it}^w \Psi$, $\Psi \in L^2(M)$, solves the Schrödinger equation
\[
(d/dt)\Psi(t) = -iH_w \Psi(t), \quad \Psi(0) = \Psi.
\]
Feynman then 'showed' (without any mathematical rigour) that
\[
e^{-itH_w} f(x) = \int_{\{t < \zeta\}} e^{-i\int_0^t w(\gamma(s))\, ds} f(\gamma(t))\, e^{-i\int_0^t |\dot{\gamma}(s)|^2\, ds}\, D^x(\gamma),
\]
where $D^x$ is some sort of Riemannian volume measure on the space of paths on $M$ starting at $x$, and $\int_0^t |\dot{\gamma}(s)|^2\, ds$ is the energy of such a path $\gamma$. Now one can prove that $D^x$ does not exist, and of course many paths do not have finite energy. On the other hand, switching from $it$ to $t$, although each factor is problematic, the product

\[
e^{-\int_0^t |\dot{\gamma}(s)|^2\, ds} \cdot D^x(\gamma)
\]
is well-defined and in fact one has
\[
e^{-\int_0^t |\dot{\gamma}(s)|^2\, ds}\, D^x(\gamma) = d\mathbb{P}^x(\gamma)
\]
in a sense that can be made precise. The point is that $e^{-\int_0^t |\dot{\gamma}(s)|^2\, ds}$ is damping and can absorb some of the infinities of $D^x(\gamma)$, while $e^{-i\int_0^t |\dot{\gamma}(s)|^2\, ds}$ was oscillating and could not do that.

The first issue that has to be addressed is which $w$'s can be dealt with in such a formula. In quantum physics, one has to deal with nonsmooth and unbounded $w$'s such as the Coulomb potential $w(x) = -1/|x|$ for $M = \mathbb{R}^3$. Ultimately, the following class has turned out to be useful:

Definition 14.10. A Borel function $w : M \to \mathbb{R}$ is said to be in the Kato class $K(M)$ of $M$, if
\[
\lim_{t \to 0+} \sup_{x \in M} \int_0^t \mathbb{E}^x\big[1_{\{s < \zeta\}} |w(\mathbb{X}_s)|\big]\, ds = 0. \tag{90}
\]
Obviously, $K(M)$ is a linear space. We are going to show in an exercise that $K(M) \subset L^1_{\mathrm{loc}}(M)$ and that if $M = \mathbb{R}^3$ with its Euclidean metric then $w(x) := 1/|x|$ is in $K(\mathbb{R}^3)$.
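The Kato condition (90) can be made concrete for $w(x) = 1/|x|$ on $\mathbb{R}^3$. The following is a minimal numerical sketch, not part of the original notes. It assumes the Euclidean heat kernel $p(s, x, y) = (4\pi s)^{-3/2} e^{-|x-y|^2/(4s)}$ (the kernel of $e^{s\Delta}$), and it only evaluates the inner expectation at $x = 0$, where one expects the supremum over $x$ to be attained. In polar coordinates one finds $\int p(s, 0, y)/|y|\, dy = 1/\sqrt{\pi s}$, so the time integral in (90) equals $2\sqrt{t/\pi}$, which tends to $0$ as $t \to 0+$. All function names are hypothetical.

```python
import math


def gauss_kernel_r3(s, r):
    """Euclidean heat kernel p(s, 0, y) on R^3 for |y| = r (assumed normalization)."""
    return (4.0 * math.pi * s) ** (-1.5) * math.exp(-r * r / (4.0 * s))


def coulomb_average(s, n=20000, cutoff=12.0):
    """Midpoint-rule value of the integral of p(s, 0, y) / |y| over R^3."""
    r_max = cutoff * math.sqrt(s)  # the Gaussian tail beyond r_max is negligible
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        # volume element 4*pi*r^2 dr times the potential 1/r leaves 4*pi*r dr
        total += 4.0 * math.pi * r * gauss_kernel_r3(s, r) * h
    return total


def kato_integral(t):
    """Closed form of the integral of (pi*s)^(-1/2) over s in (0, t)."""
    return 2.0 * math.sqrt(t / math.pi)


for s in (0.01, 0.3, 2.0):
    print(s, coulomb_average(s), 1.0 / math.sqrt(math.pi * s))
for t in (1.0, 1e-2, 1e-4):
    print(t, kato_integral(t))
```

The first loop compares the quadrature with the closed form $1/\sqrt{\pi s}$; the second shows the quantity in (90) shrinking like $\sqrt{t}$.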

One can show with some effort that for $w \in K(M)$ the symmetric densely defined quadratic form
\[
C_c^\infty(M) \times C_c^\infty(M) \ni (f_1, f_2) \longmapsto \int (df_1, df_2)\, d\mu + \int w f_1 f_2\, d\mu
\]
in $L^2(M)$ is closable and semibounded from below. The associated semibounded self-adjoint operator is denoted by $H_w$. Note that $H_w|_{w=0} = H$. Thus we are interested in a formula for
\[
P_t^w := e^{-tH_w} \in \mathscr{L}(L^2(M)).
\]
That (89) can hold at all relies on:

Lemma 14.11. Let $w \in K(M)$. Then:
a) One has
\[
\sup_{x \in M} \int_M \int_0^T p(s, x, y) |w(y)|\, ds\, d\mu(y) < \infty \quad \text{for all } T > 0.
\]
b) For all $x \in M$ one has
\[
\mathbb{P}^x\big\{w(\mathbb{X}_\bullet) \in L^1_{\mathrm{loc}}[0, \zeta)\big\} = 1.
\]
c) There are $c_j = c_j(w) > 0$, $j = 1, 2$, such that for all $t \geq 0$,
\[
\sup_{x \in M} \mathbb{E}^x\Big[e^{\int_0^t |w(\mathbb{X}_s)|\, ds}\, 1_{\{t < \zeta\}}\Big] \leq c_1 e^{t c_2} < \infty. \tag{91}
\]

Proof: a) Take a $t > 0$ with
\[
\sup_{x \in M} \int_M \int_0^t p(s, x, y) |w(y)|\, ds\, d\mu(y) < \infty
\]
and pick $l \in \mathbb{N}$ with $T < lt$. Then we can estimate
\begin{align*}
\sup_{x \in M} \int_M \int_0^T p(s, x, y) |w(y)|\, ds\, d\mu(y)
&\leq \sup_{x \in M} \int_M \int_0^{lt} p(s, x, y) |w(y)|\, ds\, d\mu(y) \\
&\leq \sup_{x \in M} \sum_{k=1}^l \int_M \int_0^t p((k-1)t + s, x, y) |w(y)|\, ds\, d\mu(y) \\
&= \sup_{x \in M} \sum_{k=1}^l \int_0^t \int_M p((k-1)t, x, z) \int_M p(s, z, y) |w(y)|\, d\mu(y)\, d\mu(z)\, ds \\
&\leq \Big(\sup_{x \in M} \sum_{k=1}^l \int_M p((k-1)t, x, z)\, d\mu(z)\Big) \times \sup_{z \in M} \int_0^t \int_M p(s, z, y) |w(y)|\, d\mu(y)\, ds \\
&\leq l \sup_{z \in M} \int_0^t \int_M p(s, z, y) |w(y)|\, d\mu(y)\, ds < \infty,
\end{align*}
where we have used the Chapman-Kolmogorov identity and
\[
\int p(s', x', y')\, d\mu(y') \leq 1.
\]
b) Pick a continuous function $\rho : M \to (0, \infty)$ such that for all $c \in (0, \infty)$ the level sets

{ρ ∈ [c, ∞)} are compact. Then the collection of subsets (Un)n∈N of M given by

\[
U_n := \text{interior of } \{\rho \in [1/n, \infty)\}
\]
forms an exhaustion of $M$ with open relatively compact subsets. For every $n \in \mathbb{N}$, define the first exit times
\[
\zeta_n^{(1)} := \zeta_{U_n} : \Omega_M \longrightarrow [0, \infty].
\]

Then the sequence $\zeta_n^{(1)}$ announces $\zeta$ with respect to $\mathbb{P}^x$ for every $x \in M$ in the following sense: there exists a set $\Omega_x \subset \Omega_M$ with $\mathbb{P}^x(\Omega_x) = 1$, such that for all paths $\gamma \in \Omega_x$ one has the following two properties:
• $\zeta_n^{(1)}(\gamma) \nearrow \zeta(\gamma)$ as $n \to \infty$,
• the implication $\zeta(\gamma) < \infty \Rightarrow \zeta_n^{(1)}(\gamma) < \zeta(\gamma)$ holds true for all $n$.
To see that $\zeta$ is indeed announced by $\zeta_n^{(1)}$ in the asserted form, one can simply set

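\[
\Omega_x := \{\gamma \in \Omega_M : \gamma(0) = x\}.
\]
Then $\mathbb{P}^x(\Omega_x) = 1$ and the asserted properties follow easily from continuity arguments, since $\Omega_x$ is a set of continuous paths that start in $x$. It follows immediately that $\zeta_n^{(2)} := \min(\zeta_n^{(1)}, n)$ also announces $\zeta$. As a consequence, we have
\[
\mathbb{P}^x\big\{w(\mathbb{X}_\bullet) \in L^1_{\mathrm{loc}}[0, \zeta)\big\} = \mathbb{P}^x \bigcap_{n \in \mathbb{N}} \Big\{\int_0^{\zeta_n^{(2)}} |w(\mathbb{X}_s)|\, ds < \infty\Big\}.
\]
Now we have
\begin{align*}
\mathbb{E}^x\Big[\int_0^{\zeta_n^{(2)}} |w(\mathbb{X}_s)|\, ds\Big]
&\leq \mathbb{E}^x\Big[\int_0^{\min(\zeta, n)} |w(\mathbb{X}_s)|\, ds\Big] \\
&= \mathbb{E}^x\Big[\int_0^n |w(\mathbb{X}_s)|\, 1_{\{s < \zeta\}}\, ds\Big] = \int_0^n \int_M p(s, x, y) |w(y)|\, d\mu(y)\, ds,
\end{align*}
and this number is finite for all $n$ by a).

c) With $\widetilde{M} = M \cup \{\infty_M\}$ the Alexandrov compactification of $M$, we can canonically extend $w$ to a Borel function $\widetilde{w} : \widetilde{M} \to \mathbb{R}$ by setting $\widetilde{w}(\infty_M) = 0$. Then one trivially has
\[
\mathbb{E}^x\Big[e^{\int_0^t |w(\mathbb{X}_s)|\, ds}\, 1_{\{t < \zeta\}}\Big] \leq \mathbb{E}^x\Big[e^{\int_0^t |\widetilde{w}(\mathbb{X}_s)|\, ds}\Big]. \tag{92}
\]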

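2. (Khas'minskii's lemma) For any $s \geq 0$, let
\[
J(w, s) := \sup_{x \in M} \mathbb{E}^x\Big[e^{\int_0^s |\widetilde{w}(\mathbb{X}_r)|\, dr}\Big] \in [0, \infty].
\]
Then for every $s > 0$ with
\[
D(w, s) := \sup_{x \in M} \mathbb{E}^x\Big[\int_0^s |w(\mathbb{X}_r)|\, 1_{\{r < \zeta\}}\, dr\Big] < 1
\]
it holds that
\[
J(w, s) \leq \frac{1}{1 - D(w, s)}. \tag{93}
\]
Proof: One has
\[
D(w, s) = \sup_{x \in M} \mathbb{E}^x\Big[\int_0^s |\widetilde{w}(\mathbb{X}_r)|\, dr\Big].
\]
For any $n \in \mathbb{N}$, let
\[
s\sigma_n := \big\{q = (q_1, \ldots, q_n) : 0 < q_1 < \cdots < q_n < s\big\} \subset \mathbb{R}^n
\]
denote the open scaled simplex. In the chain of equalities
\begin{align*}
\mathbb{E}^x\Big[e^{\int_0^s |\widetilde{w}(\mathbb{X}_r)|\, dr}\Big]
&= 1 + \sum_{n=1}^\infty \frac{1}{n!} \int_{[0,s]^n} \mathbb{E}^x\big[|\widetilde{w}(\mathbb{X}_{q_1})| \cdots |\widetilde{w}(\mathbb{X}_{q_n})|\big]\, d^n q \\
&= 1 + \sum_{n=1}^\infty \int_{s\sigma_n} \mathbb{E}^x\big[|\widetilde{w}(\mathbb{X}_{q_1})| \cdots |\widetilde{w}(\mathbb{X}_{q_n})|\big]\, d^n q \\
&= 1 + \sum_{n=1}^\infty \int_0^s \int_{q_1}^s \cdots \int_{q_{n-1}}^s \mathbb{E}^x\big[|\widetilde{w}(\mathbb{X}_{q_1})| \cdots |\widetilde{w}(\mathbb{X}_{q_n})|\big]\, d^n q,
\end{align*}
the first one follows from Fubini's theorem, and the second one from combining the fact that the integrand is symmetric in the variables $q_j$ with the fact that the number of orderings of a real-valued tuple of length $n$ is $n!$. In particular, by comparison with a geometric series, it is sufficient to prove that for all natural $n \geq 2$, one has
\[
J_n(w, s) := \sup_{x \in M} \int_0^s \int_{q_1}^s \cdots \int_{q_{n-1}}^s \mathbb{E}^x\big[|\widetilde{w}(\mathbb{X}_{q_1})| \cdots |\widetilde{w}(\mathbb{X}_{q_n})|\big]\, d^n q \leq D(w, s)\, J_{n-1}(w, s). \tag{94}
\]
But the Markov property of the family of Wiener measures implies
\begin{align*}
J_n(w, s) &= \sup_{x \in M} \int_0^s \int_{q_1}^s \cdots \int_{q_{n-2}}^s \int_{\Omega_M} |\widetilde{w}(\gamma(q_1))| \cdots |\widetilde{w}(\gamma(q_{n-1}))| \\
&\qquad \times \int_{\Omega_M} \int_0^{s - q_{n-1}} |\widetilde{w}(\omega(u))|\, du\, d\mathbb{P}^{\gamma(q_{n-1})}(\omega)\, d\mathbb{P}^x(\gamma)\, d^{n-1} q \\
&\leq D(w, s)\, J_{n-1}(w, s), \tag{95}
\end{align*}
which proves Khas'minskii's lemma.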
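As a sanity check of (93) and (94), consider the simplest possible case, which is an illustration added here and not taken from the notes: a constant potential $w \equiv c \geq 0$ on a stochastically complete manifold. Under this assumption $\zeta = \infty$ almost surely, so $D(w, s) = cs$, the simplex integrals are $J_n(w, s) = (cs)^n / n!$, and $J(w, s) = e^{cs}$. The sketch below (hypothetical function names) compares both sides.

```python
import math


def khasminskii_constant_potential(c, s):
    """Return (J, D, bound) for the constant potential w = c (assumed setting)."""
    d = c * s  # D(w, s) = c * s, because P^x{r < zeta} = 1 by assumption
    if not 0.0 <= d < 1.0:
        raise ValueError("Khas'minskii's lemma needs D(w, s) < 1")
    j = math.exp(d)  # J(w, s) = E^x[exp(c * s)] = exp(c * s)
    return j, d, 1.0 / (1.0 - d)


def simplex_integrals(c, s, n_max):
    """J_n(w, s) = (c*s)^n / n! for n = 1, ..., n_max (volume of the simplex times c^n)."""
    d = c * s
    return [d ** n / math.factorial(n) for n in range(1, n_max + 1)]


for d in (0.1, 0.5, 0.9):
    j, _, bound = khasminskii_constant_potential(d, 1.0)
    print(f"D = {d}: J = {j:.4f} <= {bound:.4f}")
```

In this toy case the recursion (94) reads $(cs)^n/n! \leq cs \cdot (cs)^{n-1}/(n-1)!$, which is visibly far from sharp; the lemma trades sharpness for generality.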

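3. Pick $s > 0$ with $D(w, s) < 1$. Then for any $t > 0$ one has
\[
J(w, t) \leq \frac{1}{1 - D(w, s)}\, e^{\frac{t}{s} \log\left(\frac{1}{1 - D(w, s)}\right)}.
\]
Proof: Pick the integer $n \geq 0$ with $ns \leq t < (n+1)s$. Then the Markov property of the family of Wiener measures and Khas'minskii's lemma imply
\begin{align*}
J(w, t) &\leq J(w, (n+1)s) \\
&= \sup_{x \in M} \int_{\Omega_M} \int_{\Omega_M} e^{\int_0^{ns} |\widetilde{w}(\gamma(r))|\, dr}\, e^{\int_0^s |\widetilde{w}(\omega(r))|\, dr}\, d\mathbb{P}^{\gamma(ns)}(\omega)\, d\mathbb{P}^x(\gamma) \\
&\leq \frac{1}{1 - D(w, s)}\, J(w, ns) \\
&= \frac{1}{1 - D(w, s)} \times \sup_{x \in M} \int_{\Omega_M} \int_{\Omega_M} e^{\int_0^{(n-1)s} |\widetilde{w}(\gamma(r))|\, dr}\, e^{\int_0^s |\widetilde{w}(\omega(r))|\, dr}\, d\mathbb{P}^{\gamma((n-1)s)}(\omega)\, d\mathbb{P}^x(\gamma) \\
&\leq \ldots \ (n \text{ times}) \\
&\leq \frac{1}{1 - D(w, s)} \Big(\frac{1}{1 - D(w, s)}\Big)^n \\
&\leq \frac{1}{1 - D(w, s)}\, e^{\frac{t}{s} \log\left(\frac{1}{1 - D(w, s)}\right)},
\end{align*}
which proves (91) in view of (92). $\square$

Definition 14.12. An ordered pair $(\Xi, \widetilde{\Xi})$ of functions
\[
\Xi : M \longrightarrow (0, \infty], \quad \widetilde{\Xi} : (0, \infty) \longrightarrow (0, \infty)
\]
is called a heat kernel control pair for the Riemannian manifold $M$, if the following assumptions are satisfied:
• $\Xi$ is continuous with $\inf \Xi > 0$, $\widetilde{\Xi}$ is Borel
• for all $x \in M$, $t > 0$ one has
\[
\sup_{y \in M} p(t, x, y) \leq \Xi(x)\, \widetilde{\Xi}(t)
\]
• for all $q' \geq 1$ in the case of $m = 1$, and for all $q' > m/2$ in the case of $m \geq 2$, one has
\[
\int_0^\infty \widetilde{\Xi}^{1/q'}(t)\, e^{-At}\, dt < \infty \quad \text{for some } A > 0.
\]

Remark 14.13. 1. Every Riemannian manifold admits a canonically given heat kernel control pair. Indeed, using an $L^1$-variant of the parabolic mean value inequality one can show that
\[
\Xi(x) = \frac{C}{\min(r_{\mathrm{Eucl}}(x), 1)^m}, \quad \widetilde{\Xi}(t) = t^{-m/2} + 1
\]
defines such a pair, where $r_{\mathrm{Eucl}}(x)$ is defined to be the supremum of all $r > 0$ such that $B(x, r)$ is relatively compact and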

\[
(1/2)\delta_{ij} \leq g_{ij} \leq 2\delta_{ij}
\]
thereon.
2. Assume that one has the ultracontractiveness
\[
\sup_{x, y \in M} p(t, x, y) \leq C t^{-m/2} \quad \text{for all } 0 < t < T.
\]
Then $(\Xi(x), \widetilde{\Xi}(t)) := (C, \min(t, T)^{-m/2})$ is a heat kernel control pair, which is constant in its first slot.
3. Assume that $M$ is geodesically complete with $\mathrm{Ric} \geq -K$ for some constant $K \geq 0$. Then the (local in time) Li-Yau heat kernel bound shows that
\[
\Xi(x) := C \mu(B(x, 1))^{-1}, \quad \widetilde{\Xi}(t) := t^{-m/2} + 1
\]
is a heat kernel control pair.

Proposition 14.14. Let $w = w_1 + w_2 : M \to \mathbb{R}$ be a function which can be decomposed into Borel functions $w_j : M \to \mathbb{R}$ satisfying the following two properties:
• $w_2 \in L^\infty(M)$
• there exists a real number $q' < \infty$ such that $q' \geq 1$ if $m = 1$, and $q' > m/2$ if $m \geq 2$, and a heat kernel control pair $(\Xi, \widetilde{\Xi})$, such that $w_1 \in L^{q'}_\Xi(M)$.$^{19}$
Then for all $u > 0$ and all $x \in M$, one has the bound
\[
\int_M p(u, x, y) |w(y)|\, d\mu(y) \leq \widetilde{\Xi}(u)^{1/q'} \|w_1\|_{q'; \Xi} + \|w_2\|_\infty. \tag{96}
\]
In particular, for any choice of $q'$ and $(\Xi, \widetilde{\Xi})$ as above one has

\[
L^{q'}_\Xi(M) + L^\infty(M) \subset K(M),
\]
where $L^q_\Xi(M) := L^q(M, \Xi\, d\mu)$.

Proof: Once we have proved
\[
\int_M p(u, x, y) |w(y)|\, d\mu(y) \leq \widetilde{\Xi}(u)^{1/q'} \|w_1\|_{q'; \Xi} + \|w_2\|_\infty, \tag{97}
\]

$^{19}$ Note that one automatically has $L^{q'}_\Xi(M) \subset L^{q'}(M)$, which is implied by $\inf \Xi > 0$.

the inclusion w ∈ K(M) clearly follows from

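\begin{align*}
&\lim_{t \to 0+} \sup_{x \in M} \int_0^t \int_M p(u, x, y) |w(y)|\, d\mu(y)\, du \\
&\quad\leq C(w_1) \lim_{t \to 0+} \int_0^t \widetilde{\Xi}(u)^{1/q'}\, du + C(w_2) \lim_{t \to 0+} t = 0.
\end{align*}
In order to derive (97), note first that the inequality
\[
\int_M p(u, x, y)\, d\mu(y) \leq 1 \tag{98}
\]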

shows that we can assume $w_2 = 0$. Furthermore, the case $q' = 1$ (which is only allowed for $m = 1$) is obvious, so let us assume $m \geq 2$ and $q' > m/2$. The essential trick to bound $\int_M p(u, x, y) |w_1(y)|\, d\mu(y)$ is to factor the heat kernel appropriately: indeed, with $1/q' + 1/q := 1$, Hölder's inequality and using (98) once more gives us the following estimate:
\begin{align*}
\int_M p(u, x, y) |w_1(y)|\, d\mu(y)
&= \int_M p(u, x, y)^{\frac{1}{q}}\, p(u, x, y)^{1 - \frac{1}{q}} |w_1(y)|\, d\mu(y) \\
&\leq \Big(\int_M p(u, x, y)\, d\mu(y)\Big)^{\frac{1}{q}} \Big(\int_M |w_1(y)|^{q'} p(u, x, y)\, d\mu(y)\Big)^{\frac{1}{q'}} \\
&\leq \Big(\int_M |w_1(y)|^{q'}\, \widetilde{\Xi}(u)\, \Xi(y)\, d\mu(y)\Big)^{\frac{1}{q'}} \\
&\leq \widetilde{\Xi}(u)^{1/q'} \|w_1\|_{q'; \Xi}.
\end{align*}

This completes the proof. 
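The bound (96) can be checked by hand in the Euclidean case. The following sketch is an added illustration under stated assumptions: $M = \mathbb{R}^3$ with heat kernel $p(u, x, y) = (4\pi u)^{-3/2} e^{-|x-y|^2/(4u)}$, the control pair $\Xi \equiv (4\pi)^{-3/2}$, $\widetilde{\Xi}(u) = u^{-3/2}$, the exponent $q' = 2$, and $w_1 = 1_{B(0,1)}/|\cdot|$, so that $\|w_1\|_{2;\Xi}^2 = (4\pi)^{-3/2} \cdot 4\pi$. Only the point $x = 0$ is evaluated, where the left-hand side has the closed form $(1 - e^{-1/(4u)})/\sqrt{\pi u}$. All names are hypothetical.

```python
import math

XI = (4.0 * math.pi) ** (-1.5)  # assumed constant first slot of the control pair


def xi_tilde(u):
    """Assumed second slot of the control pair on R^3."""
    return u ** (-1.5)


def lhs_closed_form(u):
    """Integral of p(u, 0, y) / |y| over the unit ball, evaluated in polar coordinates."""
    return (1.0 - math.exp(-1.0 / (4.0 * u))) / math.sqrt(math.pi * u)


def lhs_midpoint(u, n=20000):
    """The same integral by the midpoint rule, as an independent check."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        kernel = (4.0 * math.pi * u) ** (-1.5) * math.exp(-r * r / (4.0 * u))
        total += 4.0 * math.pi * r * kernel * h  # 4*pi*r^2 dr times 1/r
    return total


def rhs(u):
    """Right-hand side of (96) with w_2 = 0 and q' = 2."""
    norm_w1 = math.sqrt(XI * 4.0 * math.pi)  # weighted L^2 norm of w_1
    return math.sqrt(xi_tilde(u)) * norm_w1


for u in (0.01, 0.1, 1.0, 10.0):
    print(u, lhs_closed_form(u), rhs(u))
```

For small $u$ the left-hand side grows like $u^{-1/2}$ while the bound grows like $u^{-3/4}$; both are integrable near $0$, which is what the inclusion into $K(M)$ uses.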

Example 14.15. We have $w := 1/|\cdot| \in K(\mathbb{R}^3)$. Indeed, this $w = 1_{B(0,1)} w + 1_{B(0,1)^c} w$ is in $L^2(\mathbb{R}^3) + L^\infty(\mathbb{R}^3)$.

Theorem 14.16. Let $w \in K(M)$. Then for all $t \geq 0$, $f \in L^2(M)$, $\mu$-a.e. $x \in M$, one has

\[
e^{-tH_w} f(x) = \mathbb{E}^x\Big[1_{\{t < \zeta\}}\, e^{-\int_0^t w(\mathbb{X}_s)\, ds} f(\mathbb{X}_t)\Big]. \tag{99}
\]

Proof: Step 1: (99) holds in case $w : M \to \mathbb{R}$ is continuous and bounded.
Proof: Decomposing
\[
f = f_1 - f_2 + \sqrt{-1}\, f_3 - \sqrt{-1}\, f_4, \quad f_j \geq 0,
\]
if necessary, we can and we will assume $f \geq 0$ for the proof. Since $w$ is bounded, it simply acts as a bounded multiplication operator (that will be denoted by the same symbol again).

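By (82), for every $t > 0$, $n \in \mathbb{N}$ and $\mu$-a.e. $x_0 \in M$, we have
\begin{align*}
\big(e^{-(t/n)H} e^{-(t/n)w}\big)^n f(x_0)
&= \int \cdots \int \exp\Big(-(t/n) \sum_{i=1}^n w(x_i)\Big)\, p(t/n, x_0, x_1) \cdots p(t/n, x_{n-1}, x_n)\, f(x_n)\, d\mu(x_1) \cdots d\mu(x_n) \\
&= \mathbb{E}^{x_0}\Big[1_{\{t < \zeta\}} \exp\Big(-(t/n) \sum_{i=1}^n w(\mathbb{X}_{it/n})\Big) f(\mathbb{X}_t)\Big].
\end{align*}
Since $w$ is continuous, for each fixed continuous path which remains on $M$ until $t$, the argument of the exponential is a Riemann sum for $-\int_0^t w(\mathbb{X}_s)\, ds$. Furthermore, we have
\[
1_{\{t < \zeta\}} \exp\Big(-(t/n) \sum_{i=1}^n w(\mathbb{X}_{it/n})\Big) f(\mathbb{X}_t) \leq \exp(t \|w\|_\infty)\, f(\mathbb{X}_t),
\]
and clearly
\[
\mathbb{E}^{x_0}\big[1_{\{t < \zeta\}} f(\mathbb{X}_t)\big] = \int_M p(t, x_0, y) f(y)\, d\mu(y) < \infty,
\]
therefore dominated convergence shows that for $\mu$-a.e. $x_0 \in M$,
\[
\lim_{n \to \infty} \big(e^{-(t/n)H} e^{-(t/n)w}\big)^n f(x_0) = \mathbb{E}^{x_0}\Big[1_{\{t < \zeta\}}\, e^{-\int_0^t w(\mathbb{X}_s)\, ds} f(\mathbb{X}_t)\Big].
\]
On the other hand, Trotter's product formula
\[
A \text{ semibounded},\ B \text{ bounded} \ \Rightarrow\ e^{-t(A+B)} = \lim_{n \to \infty} \big(e^{-(t/n)A} e^{-(t/n)B}\big)^n \text{ strongly},
\]
gives (after picking a subsequence, if necessary, to turn the $L^2$-convergence into a $\mu$-a.e. convergence)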

\[
\lim_{n \to \infty} \big(e^{-(t/n)H} e^{-(t/n)w}\big)^n f(x_0) = e^{-tH_w} f(x_0) \quad \text{for } \mu\text{-a.e. } x_0,
\]
which proves the Feynman-Kac formula in this case.

Step 2: (99) holds in case $w$ is a bounded potential.
Proof: We will use Friedrichs mollifiers to reduce everything to the continuous (in fact: smooth) bounded case from step 1. To this end, we pick an atlas $(U_l)_{l \in \mathbb{N}}$ for $M$ such that each $U_l$ is relatively compact. We also take a subordinate partition of unity $\varphi_l \in C_c(U_l)$. Then
\[
w^{(l)} := \varphi_l w : U_l \longrightarrow \mathbb{R}
\]
defines a bounded compactly supported function, and by Remark ??.ii) we can pick a sequence $(w_n^{(l)})_n \subset C_c(U_l)$ such that $\mu$-a.e. in $U_l$ we have
\[
|w_n^{(l)}| \leq \|w\|_\infty < \infty, \quad w_n^{(l)} \to w^{(l)} \text{ as } n \to \infty.
\]

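Defining a sequence of smooth potentials
\[
w_n := \sum_l \varphi_l w_n^{(l)},
\]
one has
\[
|w_n| \leq \|w\|_\infty, \quad w_n \to w \ \ \mu\text{-a.e.} \tag{100}
\]
It is clear from (100) and dominated convergence that
\[
\lim_{n \to \infty} H_{w_n} \psi = H_w \psi \quad \text{in } L^2(M)
\]
for all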

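\[
\psi \in \mathrm{Dom}(H_w) = \mathrm{Dom}(H_{w_n}) = \mathrm{Dom}(H).
\]
Thus by an abstract convergence result for semigroups, which deduces the strong convergence of the semigroups,
\[
A_n f \to A f \text{ for all } f \in \mathrm{Dom}(A) = \mathrm{Dom}(A_n), \text{ and } A, A_n \text{ semibounded} \ \Rightarrow\ e^{-tA_n} \to e^{-tA} \text{ strongly},
\]
we have
\[
\lim_{n \to \infty} e^{-tH_{w_n}} f = e^{-tH_w} f \quad \text{in } L^2(M).
\]
In particular, passing to a subsequence if necessary, we can and we will assume
\[
\lim_{n \to \infty} e^{-tH_{w_n}} f(x) = e^{-tH_w} f(x) \quad \text{for } \mu\text{-a.e. } x, \tag{101}
\]
so we find
\[
e^{-tH_w} f(x) = \lim_{n \to \infty} \mathbb{E}^x\Big[1_{\{t < \zeta\}}\, e^{-\int_0^t w_n(\mathbb{X}_s)\, ds} f(\mathbb{X}_t)\Big] \quad \text{for } \mu\text{-a.e. } x \tag{102}
\]
by the already established validity of the Feynman-Kac formula for $e^{-tH_{w_n}} f$. It remains to show that the right-hand side of (102) is equal to
\[
\mathbb{E}^x\Big[1_{\{t < \zeta\}}\, e^{-\int_0^t w(\mathbb{X}_s)\, ds} f(\mathbb{X}_t)\Big].
\]
To this end, applying (100) together with the elementary inequality
\[
\big|e^a - e^b\big| \leq 2 |a - b|\, e^{\max(a, b)}, \quad a, b \in \mathbb{R}, \tag{103}
\]
shows that one has
\[
1_{\{t < \zeta\}} \Big|e^{-\int_0^t w(\mathbb{X}_s)\, ds} - e^{-\int_0^t w_n(\mathbb{X}_s)\, ds}\Big| \leq 2 \cdot 1_{\{t < \zeta\}}\, e^{\|w\|_\infty t} \int_0^t |w(\mathbb{X}_s) - w_n(\mathbb{X}_s)|\, ds \quad \mathbb{P}^x\text{-a.s.},
\]
so using (100) once more with dominated convergence, we find
\[
\lim_{n \to \infty} 1_{\{t < \zeta\}} \Big|e^{-\int_0^t w_n(\mathbb{X}_s)\, ds} - e^{-\int_0^t w(\mathbb{X}_s)\, ds}\Big| = 0 \quad \mathbb{P}^x\text{-a.s.} \tag{104}
\]
Finally, we may use (104) and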

\[
1_{\{t < \zeta\}} \Big|e^{-\int_0^t w_n(\mathbb{X}_s)\, ds}\Big| \leq e^{\|w\|_\infty t} \quad \mathbb{P}^x\text{-a.s.}
\]

to deduce (99) from (102) and dominated convergence.

Step 3: (99) holds in case $w$ is bounded from below.
Proof: Set $w_n := \min(n, w)$, use step 2 and convergence results for semigroups for the left-hand side and convergence results for integrals for the right-hand side.

Step 4: (99) holds for $w$ Kato.
Proof: Set $w_n := \max(-n, w)$, use step 3 and convergence results for semigroups for the left-hand side and convergence results for integrals for the right-hand side. $\square$
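Step 1 of the proof rests on Trotter's product formula. As a finite-dimensional illustration (an added sketch, not part of the notes), one can replace the operators by two non-commuting symmetric $2 \times 2$ matrices and watch the error of $\big(e^{-(t/n)A} e^{-(t/n)B}\big)^n$ against $e^{-t(A+B)}$ shrink as $n$ grows. The matrices `A` and `B` and all helper names below are hypothetical choices; the matrix exponential is computed by a plain Taylor series, which is adequate only for small matrices of moderate norm.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_scale(a, c):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def mat_exp(a, terms=40):
    """Matrix exponential via its Taylor series (sufficient for this toy case)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = mat_scale(mat_mul(term, a), 1.0 / k)
        result = mat_add(result, term)
    return result


A = [[1.0, 0.0], [0.0, 3.0]]  # stands in for the semibounded operator
B = [[0.0, 1.0], [1.0, 0.0]]  # stands in for the bounded potential


def trotter_error(t, n):
    """Largest entrywise gap between the n-step Trotter product and exp(-t(A+B))."""
    step = mat_mul(mat_exp(mat_scale(A, -t / n)), mat_exp(mat_scale(B, -t / n)))
    approx = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        approx = mat_mul(approx, step)
    exact = mat_exp(mat_scale(mat_add(A, B), -t))
    return max(abs(approx[i][j] - exact[i][j])
               for i in range(2) for j in range(2))


for n in (1, 10, 100):
    print(n, trotter_error(1.0, n))
```

In the proof, the role of `A` is played by $H$ and the role of `B` by multiplication with $w$; the matrices here do not commute, so the single-step product genuinely differs from the exponential of the sum.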

References

[1] Alonso, A. & Simon, B.: The Birman-Krein-Vishik theory of selfadjoint extensions of semibounded operators. J. Operator Theory 4 (1980), no. 2, 251–270.
[2] Azencott, R.: Behavior of diffusion semi-groups at infinity. Bull. Soc. Math. France 102 (1974), 193–240.
[3] Bei, F. & Güneysu, B.: q-parabolicity of stratified pseudomanifolds and other singular spaces. Ann. Global Anal. Geom. 51 (2017), no. 3, 267–286.
[4] Bianchi, D. & Setti, A.: Laplacian cut-offs, fast diffusions on manifolds and other applications. Preprint (2016). arXiv:1607.06008v1.
[5] Burago, D. & Burago, Y. & Ivanov, S.: A course in metric geometry, AMS (book).
[6] Davies, E.B.: Heat kernels and spectral theory. Cambridge Tracts in Mathematics, 92. Cambridge University Press, Cambridge, 1990.
[7] Grigor'yan, A.: Heat kernel and analysis on manifolds. AMS/IP Studies in Advanced Mathematics, 47. American Mathematical Society, Providence, RI; International Press, Boston, MA, 2009.
[8] Grigor'yan, A.: Analytic and geometric background of recurrence and non-explosion of the Brownian motion on Riemannian manifolds. Bulletin of Amer. Math. Soc. 36 (1999) 135–249.
[9] Güneysu, B. & Guidetti, D. & Pallara, D.: L1-elliptic regularity and H=W on the whole Lp-scale on arbitrary manifolds. Annales Academiae Scientiarum Fennicae, Mathematica (2017) Volumen 42, 497–521.
[10] Hackenbroch, W. & Thalmaier, A.: Stochastische Analysis. B.G. Teubner, Stuttgart, 1994.
[11] Hebey, E.: Sobolev spaces on Riemannian manifolds. Lecture Notes in Mathematics, 1635. Springer-Verlag, Berlin, 1996.
[12] Heinonen, J. & Koskela, P. & Shanmugalingam, N. & Tyson, J.T.: Sobolev spaces on metric measure spaces. An approach based on upper gradients. New Mathematical Monographs, 27. Cambridge University Press, Cambridge, 2015.
[13] Heinonen, J.: Lectures on Lipschitz analysis. http://www.math.jyu.fi/research/reports/rep100.pdf
[14] Hess, H. & Schrader, R. & Uhlenbrock, D.A.: Kato's inequality and the spectral distribution of Laplacians on compact Riemannian manifolds. J. Differential Geom. 15 (1980), no. 1, 27–37 (1981).
[15] Hess, H. & Schrader, R. & Uhlenbrock, D.A.: Domination of semigroups and generalization of Kato's inequality. Duke Math. J. 44 (1977), no. 4, 893–904.
[16] Hiai, F.: Log-majorizations and norm inequalities for exponential operators. Banach Cent. Publ. 38(1), 119–181 (1997).
[17] Hirsch, M.W.: Differential topology. Corrected reprint of the 1976 original. Graduate Texts in Mathematics, 33. Springer-Verlag, New York, 1994.
[18] Hinz, A.M. & Stolz, G.: Polynomial boundedness of eigensolutions and the spectrum of Schrödinger operators. Math. Ann. 294 (1992), no. 2, 195–211.
[19] Hörmander, L.: The analysis of linear partial differential operators I. Second edition. Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, Berlin, 1990.
[20] Hunsicker, E. & Mazzeo, R.: Harmonic forms on manifolds with edges, Int. Math. Res. Not. 2005 (52) (2005) 3229–3272.
[21] Hsu, E.P.: Stochastic analysis on manifolds. Graduate Studies in Mathematics, 38. American Mathematical Society, Providence, RI, 2002.
[22] Kato, T.: Perturbation theory for linear operators. Reprint of the 1980 edition. Classics in Mathematics. Springer-Verlag, Berlin, 1995.
[23] Krein, M.: The theory of self-adjoint extensions of semi-bounded Hermitian transformations and its applications. I. Rec. Math. [Mat. Sbornik] N.S. 20(62), (1947). 431–495.
[24] Lee, J.M.: Introduction to smooth manifolds. Graduate Texts in Mathematics, 218. Springer, New York, 2013.
[25] Müller, O.: A note on closed isometric embeddings. J. Math. Anal. 349 (2009), 297–298.
[26] Nirenberg, L.: Remarks on strongly elliptic partial differential equations. Comm. Pure Appl. Math. 8 (1955), 649–675.
[27] Nicolaescu, L.I.: Lectures on the geometry of manifolds. Second edition. World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2007.
[28] Reed, M. & Simon, B.: Methods of modern mathematical physics. I. Functional analysis. Academic Press, New York-London, 1972.
[29] Reed, M. & Simon, B.: Methods of modern mathematical physics. IV. Analysis of operators. Academic Press, Inc., 1978.
[30] Saloff-Coste, L.: Aspects of Sobolev-type inequalities. London Mathematical Society Lecture Note Series, 289. Cambridge University Press, Cambridge, 2002.
[31] Schoen, R. & Yau, S.-T.: Lectures on differential geometry. Conference Proceedings and Lecture Notes in Geometry and Topology, I. International Press, Cambridge, MA, 1994.
[32] Spiegel, Daniel: The Hopf-Rinow theorem. Notes available online.
[33] Sturm, K.-T.: Heat kernel bounds on manifolds. Math. Ann. 292 (1992), no. 1, 149–162.
[34] Varadhan, S.: On the behavior of the fundamental solution of the heat equation with variable coefficients, Comm. Pure Appl. Math., 20 (1967), 431–455.
[35] Weidmann, J.: Lineare Operatoren in Hilbertraeumen. Mathematische Leitfaeden. B.G. Teubner, Stuttgart, 1976.
[36] Weidmann, J.: Lineare Operatoren in Hilbertraeumen. Teil 1. Grundlagen. Mathematische Leitfaeden. B.G. Teubner, Stuttgart, 2000.
[37] Yau, S.T.: On the heat kernel of a complete Riemannian manifold. J. Math. Pure Appl. (9) 57 (1978), no. 2, 191–201.