<<

AST 6416: Physical Cosmology. Instructor: Gonzalez. Fall 2009

This document contains the lecture notes for the graduate level cosmology course at the University of Florida, AST 6416. The course is 15 weeks long, with three class periods per week (one on Tuesday and two on Friday). These notes are based upon a collection of sources. The most notable of these are lecture notes from George Blumenthal and Henry Kandrup, and the textbooks by Coles & Lucchin (2002), Peacock (1999), and Peebles (1993).

1 Introduction, Early Cosmology

Week 1 Reading Assignment: Chapter 1

1.1 Course Overview

Cosmology, defined as man's attempt to understand the origin of the universe, is as old as mankind. Cosmology, as a field of scientific inquiry, is one of the newest of topics. The first theoretical underpinnings of the field date to the dawn of the 20th century; a significant fraction of the landmark cosmological discoveries have occurred in the past two decades – and the field certainly holds a plethora of fundamental unanswered questions. It is only during this past century that we have gained the ability to start answering questions about the origin of the universe, and I hope to share with you some of the excitement of this field. The title of this course is Physical Cosmology, and the central aim of this semester will be for you to understand the underlying physics that defines the formation and evolution of the universe. We will together explore the development of cosmology, investigate the successes (and failures) of the current paradigm, and discuss topics of current relevance. By the end of the course, you will have a basic understanding of the foundations upon which our current picture of the universe is based and (hopefully) a sense of the direction in which this field is headed. What you will not have is comprehensive knowledge of the entire discipline of cosmology. The field has grown dramatically in recent decades, and a semester is sufficient to cover only a fraction of the material. This semester will be primarily taught from a theoretical perspective, with limited discussion of the details of the observations that have helped define our current picture of the universe. General Relativity is the foundation that underpins all of modern cosmology, as it defines the structure of spacetime and thereby provides the physical framework for describing the Universe. I realize that not all of you have taken a course in GR, and a detailed discussion of GR is beyond the scope of this class. Consequently, I will tread lightly in this area, and GR will not be considered a prerequisite for this class. For many of the key results, we will use pseudo-Newtonian derivations to facilitate intuition (which is useful even if you do know GR). In practice, this detracts very little from the scope of the course. Once we have

established a few fundamental equations, the bulk of the semester will be quite independent of one's knowledge of General Relativity. For those of you who wish to learn more about General Relativity, I refer you to PHZ 6607. Finally, please review your copy of the syllabus for the semester. You will see that the textbook is Coles & Lucchin. The advantages of this text are that it is generally readable and should serve as a good reference source for you both for this class and in the future. Be aware, however, that the organization of this course does not directly parallel the organization of the book – we will be jumping around, and sometimes covering material in a different fashion than the text. The first half of the semester will be dedicated to what I would call "classical" cosmology, which broadly refers to the fundamental description of the universe that was developed from 1916-1970 – the global structure of the universe, the expansion of the universe, the development of the Big Bang model, Big Bang nucleosynthesis, etc. The second half of the semester will focus upon more recent topics in the field – things such as dark matter, dark energy, inflation, modern cosmological tests, and gravitational lensing. I emphasize that the division between the two halves of the semester is only a preliminary plan, and the schedule may shift depending on the pace of the course. Homework will be assigned every two weeks, starting on Friday, and will comprise 50% of your grade. I strongly encourage you to work together on these assignments. Astronomy and cosmology are collaborative fields and you are best served by helping each other to learn the material. Make sure though that you clearly understand what you write down – otherwise you will be poorly served for the exam and the future. The final will be comprehensive for the semester and account for the other 50%.

1.2 The Big Questions

Before we begin, it is worth taking a few moments to consider the scope of the field of cosmology by considering, in broad terms, the aim of the subject. More so than most other fields, cosmology is all-encompassing and aims for a detailed understanding of the universe and our place therein. Fundamental questions that the field aims to answer include:

• What is the history of the Universe? How did it begin? How did the structures that we see today – matter, galaxies, and everything else – come to be?

• What is the future of the Universe? What happens next? How does the Universe end, or does it end?

• How does the Universe, and the matter/energy it contains, change with time?

• What are the matter/energy constituents of the Universe and how were they made?

• What is the geometry of the Universe?

• Why are the physical laws in the Universe as they are?

• What, if anything, exists outside our own Universe?

Clearly an ambitious set of questions. We by no means have complete answers to all of the above, but it is remarkable the progress – and rate of progress – towards answers that has transpired in recent decades. In this course we will touch upon all these topics, but primarily focus upon the first five.

1.3 Olbers' Paradox

And so with that introduction, let us begin. Let us for a moment step back 200 years to 1807. Newtonian physics and calculus were well-established, but spectroscopy was still over 50 years in the future, and it would be a similar interval before Monsieur Messier would begin to map his nebulae (and hence the concepts of Galaxy and Universe were essentially equivalent). Copernicus had successfully displaced us from the center of the solar system, but our position in the larger Universe was essentially unknown. At the time, as would remain the case for another 100 years, cosmology was the realm of the philosopher – but even in this realm one can ask physically meaningful questions to attempt to understand the Universe. Consider Olbers' Paradox, which was actually first posited much earlier before being rediscovered by several people in the 18th and 19th centuries. When Olbers posed the paradox in 1826, the general belief was that the Universe was infinite, uniform, and unchanging ("as fixed as the stars in the firmament"). The question that Olbers asked was: Why is the night sky dark? Let us make the following assumptions:

1. Stars (or in a more modern version galaxies) are uniformly distributed throughout the universe with mean number density n and luminosity L. This is a corollary of the Cosmological Principle, which we will discuss in a moment.

2. The universe is infinitely old and static, so \dot{n} = \dot{L} = 0.

3. The geometry of space is Euclidean. And in 1800, what else would one even consider?

4. There is no large-scale systematic motion of stars (galaxies) in the Universe. Specifically, the Universe is not expanding or contracting.

5. The known laws of physics, derived locally, are valid throughout the Universe.

For a Euclidean geometry, the flux from an object is defined simply as

f = \frac{L}{4\pi r^2},    (1)

where L is the luminosity and r is the distance to the object. In this case, the total incident flux arriving at the Earth is

f_{\rm tot} = \int_0^\infty 4\pi r^2\,dr\; n\,\frac{L}{4\pi r^2} = \infty.    (2)

The incident flux that we observe should therefore be infinite, as should the energy density, \equiv f/c. Clearly the night sky is not that bright!

Can we get around this by including some sort of absorption of the light? Adding absorbing dust between us doesn't help much. For a static, infinitely old universe (assumption 2), the dust must eventually come into thermodynamic equilibrium with the stars and itself radiate. This would predict a night sky as bright as the surface of a typical star. We get the same result if we include absorption by the stars themselves (through their geometric cross section).

Specifically, consider the paradox in terms of surface brightness. For a Euclidean geometry, surface brightness (flux per unit solid angle) is independent of distance since

I \equiv f/d\Omega = \frac{L}{4\pi r^2} \Big/ \left(\frac{\pi d^2}{r^2}\right) = \frac{L}{4\pi^2 d^2},    (3)

where d is the physical size of the object. If the surface brightness is constant, and there is a star in every direction that we look (which is the logical result of the above assumptions), then every point in space should have the same surface brightness as the surface of a star – and hence T_{\rm sky} \approx 5000 K. That the sky looks dark to us tells us that T_{\rm sky} < 1000 K, and from modern observations of the cosmic microwave background radiation we know that T_{\rm sky} = 2.726 K.

Which assumption is wrong?! Assumption 1 is required by a Copernican view of the Universe. We now know that the stars themselves are not uniformly distributed, but the galaxy density is essentially constant on large scales. We are also loathe to abandon assumption 5, without which we cannot hope to proceed. Assumption 3, Euclidean geometry, turns out to be unnecessary. For a non-Euclidean geometry, the surface area and volume elements within a solid angle d\Omega are defined as:

dA = r^2 f(r, \Omega)\,d\Omega    (4)

and

dV = d^3r = r^2 f(r, \Omega)\,dr\,d\Omega.    (5)

Therefore, from a given solid angle d\Omega,

u_\Omega = \int r^2\,dr\,f(r, \Omega)\,\frac{nL}{c\,r^2 f(r, \Omega)} = \int dr\,\frac{nL}{c},    (6)

independent of f(r, \Omega). Relaxing assumption 2 (infinite and static) does avoid the paradox. If the Universe is young, then:

• Absorption can work because the dust may not be hot yet.

• Stars may not have shone long enough for the light to reach us from all directions.

If we define the current age of the universe as t_0, then we can only see sources out to R = ct_0, so

u = \int_0^R dr\,\frac{nL}{c} = \frac{nLR}{c} = nLt_0,    (7)

which is finite and can yield a dark sky for sufficiently small t_0. Relaxing assumption 4 can also avoid the paradox. Radial motion gives a Doppler shift

\nu_{\rm observed} = \nu_{\rm emitted}\cdot\gamma\cdot(1 - v_r/c).    (8)

Since luminosity is energy per unit time, it behaves like frequency squared, i.e.

L_{\rm observed} = L_{\rm emitted}\cdot\gamma^2\cdot(1 - v_r/c)^2 \le L_{\rm emitted}.    (9)

One avoids the paradox if v_r \sim c at large distances. This can be achieved if the Universe is expanding. Olbers' paradox therefore tells us that the universe must be either young or expanding – or both. In practice, it would be another century before such conclusions would be drawn, and before there would be additional observational evidence.
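To put numbers on this resolution, here is a minimal numerical sketch of equation (7). It compares the radiation energy density nLt_0 accumulated in a young universe against the energy density of a sky as bright as a stellar surface; the number density, luminosity, and age used are rough fiducial values of my own choosing, not values from the notes.

```python
import numpy as np

# Order-of-magnitude sketch of eq. (7): in a young universe the accumulated
# radiation energy density from sources within the horizon is u = n*L*t0.
Mpc = 3.086e22                    # meters per megaparsec
n = 1e-2 / Mpc**3                 # assumed galaxy number density, m^-3
L = 1e10 * 3.828e26               # assumed galaxy luminosity (~1e10 L_sun), W
t0 = 13.8e9 * 3.156e7             # age of the universe, s

u_young = n * L * t0              # J m^-3 -- finite because t0 is finite

# For comparison: the energy density if every line of sight ended on a
# ~5800 K stellar surface (blackbody: u = 4 sigma T^4 / c).
sigma, c = 5.670e-8, 2.998e8
u_star = 4.0 * sigma * 5800.0**4 / c

print(f"young universe:   u ~ {u_young:.1e} J/m^3")
print(f"star-covered sky: u ~ {u_star:.1e} J/m^3")  # ~15 orders larger
```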

2 Definitions and Guiding Principles (Assumptions)

Olbers' paradox has begun to introduce us to some of the fundamental concepts underlying modern cosmology. It is now time to step forward 100 years to the start of the 20th century, explicitly lay out these concepts, and establish working definitions for terms that we will use throughout the course.

2.1 Definitions

Let us begin by introducing the concepts of a co-moving observer, homogeneity, and isotropy.

• Co-moving Observer: Imagine a hypothetical set of observers at every point in the universe (the cosmological equivalent of test particles). A co-moving observer is defined as an observer who is at rest and unaccelerated with respect to nearby material. More specifically, any observer can measure the flow velocity, v(r), of nearby material at any time. If the observer finds v(0) = 0 and \dot{v}(0) = 0, then the observer is co-moving. Co-moving observers are expected to be inertial observers (who feel no forces) in a homogeneous universe. Note, however, that all inertial observers are not necessarily co-moving – an inertial observer must have \dot{v}(0) = 0, but can have v(0) \ne 0.

• Homogeneity: A universe is homogeneous if all co-moving observers would observe identical properties for the universe. In other words, all spatial positions are equivalent (translational invariance). A simple example of a homogeneous geometry would be the 2-D surface of a sphere. Equivalently, an example of an inhomogeneous universe would be the interior of a 3-D sphere, since some points are closer to the surface than others.

• Isotropy: A universe is isotropic if, for every co-moving observer, there is no preferred direction. In other words, the properties of the universe must look the same in all directions. This is equivalent to saying that an isotropic Universe is rotationally invariant at all points. Going back to the same examples from before, the two-dimensional surface of a sphere is isotropic – any direction along the surface of the sphere looks the same. On the other hand, the interior of a 3-D sphere is not isotropic. It is rotationally invariant at the center, but for any other point the distance to the surface is shorter for some directions than others.

So are the conditions of homogeneity and isotropy equivalent? Not quite. One can prove that an isotropic universe is always homogeneous, but the converse is not true. Here are the proofs. Assume that the first statement is false, such that there exists a universe that is isotropic everywhere, but not homogeneous. For an inhomogeneous universe, there must exist some observable quantity φ(r) that is position dependent. The quantity φ must be a scalar, because if it were a vector it would have a direction and thus violate the assumption of isotropy. Consider the vector D, defined by

\mathbf{D} = \nabla\phi(\mathbf{r}).    (10)

Since φ is not a constant, D must be non-zero somewhere. Since D is a vector, it picks out a direction at some point, and therefore the universe cannot appear isotropic to an observer at that point. This contradicts our assumption of an isotropic but inhomogeneous universe and therefore proves that an isotropic universe is always homogeneous. Now, what about the converse statement? How can we have a universe that is homogeneous but not isotropic? One example would be the 2-D surface of an infinite cylinder (Figure 1). The surface is clearly homogeneous (translationally invariant). However, at any point on the surface the direction parallel to the axis of the cylinder is clearly different from the direction perpendicular to the axis, since a path perpendicular to the axis will return to the starting point. A few examples of homogeneous, inhomogeneous, isotropic, and anisotropic universes are shown in Figure 2. The fact that a geometry is dynamic need not affect its isotropy or homogeneity. A dynamic universe can be both homogeneous and isotropic. Consider the surface of a sphere whose radius is increasing as some function of time. The surface of a static sphere is isotropic and homogeneous. The mere fact that the size of the sphere is increasing in no way picks out a special position or direction along the surface. The same considerations also apply to a uniform, infinite sheet that is being uniformly stretched in all directions.

2.2 The Cosmological Principle

In the early days of cosmology at the start of the 20th century, theoretical development was very much unconstrained by empirical data (aside from the night sky being dark). Consequently, initial progress relied upon making some fundamental assumptions about the

6 Figure 1 An example of a homogeneous, but anisotropic universe. On the 2-D surface of an infinite cylinder there is no preferred location; however, not all directions are equivalent. The surface is translationally, but not rotationally invariant.

Figure 2 Slices through four possible universes. The upper left panel shows a homogeneous and isotropic example. The upper right shows a non-homogeneous and non-isotropic universe. The lower panels illustrate universes that are homogeneous (on large scales), but not isotropic. In one case the galaxies are clustered in a preferred direction; in the other the expansion of the universe occurs in only one direction.

nature of the Universe. As we have seen above, the geometry, dynamics, and matter distribution of a universe can be arbitrarily complex. In the absence of any knowledge of these quantities, where should we begin? The most logical approach is the spherical cow approach – start with the simplest physical system, adding complexity only when required. Towards this end, Einstein introduced what is known as the Cosmological Principle. The Cosmological Principle states that the Universe is homogeneous and isotropic. It is immediately obvious that this principle is incorrect on small scales – this classroom for instance is clearly not homogeneous and isotropic. Similarly, there are obvious inhomogeneities on galaxy, cluster, and even supercluster scales. However, if you average over larger scales, then the distribution of matter is indeed approximately uniform. The Cosmological Principle should therefore be thought of as a reasonable approximation of the Universe on large scales – specifically scales much greater than the size of gravitationally collapsed structures. Both the global homogeneity and isotropy (at least from our perspective) have been remarkably confirmed by observations such as cosmic microwave background experiments (COBE, WMAP) and large galaxy surveys. The success of the Cosmological Principle is remarkable given that it was proposed at a time when the existence of external galaxies was still a subject of debate.

2.2.1 Spatial Invariance of Physical Laws

If we ponder the implications of the Cosmological Principle, we see that it has important physical consequences. Perhaps the most fundamental implication of accepting the Cosmological Principle is that the known laws of physics, derived locally, must remain valid everywhere else in the Universe. Otherwise the assumption of homogeneity would be violated. Reassuringly, modern observations appear to validate this assumption, at least within the observable universe, with the properties of distant astrophysical objects being consistent with those observed locally. Within our own Galaxy, period changes for binary pulsars are consistent with the slowdown predicted by General Relativity as a result of gravitational radiation. On a much more distant scale, the light curves of type Ia supernovae are similar in all directions out to z \approx 1 (d \sim 8 billion light years), and have the same functional form as those at z = 0. Indeed, terrestrial physics has been remarkably successful in explaining astrophysical phenomena, and the absence of failures is a powerful argument for spatial invariance. As an aside, it is worth noting that dark matter and dark energy, which we will discuss later, are two instances in which standard physics cannot yet adequately describe the universe. Neither of these phenomena violate spatial invariance though – they're a problem everywhere.

2.2.2 The Copernican Principle

Additionally, the Cosmological Principle has a philosophical implication for the place of mankind in the Universe. The assumption of isotropy explicitly requires that we are not in a preferred location in the Universe, unlike the center of the 3-D sphere discussed above.

The Cosmological Principle therefore extends Copernicus' displacement of the Earth from the center of the Solar System. The statement that we are not in a preferred location is sometimes called the Copernican Principle.

2.2.3 The Perfect Cosmological Principle

It is worth noting that there exists a stronger version of the Cosmological Principle called the "Perfect Cosmological Principle". The Perfect Cosmological Principle requires that the Universe also be the same at all times, and gave rise to the "steady-state" cosmology (Hoyle 1948), in which continuous creation of matter and stars maintained the density and luminosity of the expanding Universe. We now know that the Universe is not infinitely old (and could have guessed as much from Olbers' paradox!), yet this principle can still be considered relevant in larger contexts such as eternal inflation, where our Universe is one of an infinite ensemble. In this case we may have a preferred time in our own Universe, but the Universe itself is not at a preferred "time".

2.2.4 Olbers' Paradox Revisited

Finally, it is worth taking one last look at Olbers' Paradox in light of the Cosmological Principle. Of the five assumptions listed before, the first and fifth are simply implications of the Cosmological Principle. Since we showed that the third (Euclidean geometry) was unnecessary, we return to the conclusion that either assumption 2 or assumption 4 must be false.

2.3 Expansion and the Cosmological Principle

One of the most influential observations of the 20th century was the discovery by Edwin Hubble of the expansion of the Universe (Hubble 1929). Hubble's law states that the recessional velocity of external galaxies is linearly related to their distance. Specifically, v = H_0 d, where v is velocity, d is the distance of a galaxy from us, and H_0 is the "Hubble constant". [It turns out that this "constant" actually isn't, and the relation is only linear on small scales, but we'll get to this later.]

It is straightforward to derive Hubble's law as a natural consequence of the Cosmological Principle. Consider a triangle, sufficiently small that both Euclidean geometry is a valid approximation (even in a universe with curved geometry) and v \ll c. Homogeneity requires that the triangle evolve self-similarly, so the separation between any two comoving points must scale with a universal function of time, the scale factor a(t):

x(t) = \frac{a(t)}{a_0}x_0.    (11)

Taking the derivative,

\dot{x} = \frac{\dot{a}}{a_0}x_0 = \frac{\dot{a}}{a}x,    (12)

or

v = Hx,    (13)

where the Hubble parameter is defined as H \equiv \dot{a}/a. The Hubble constant, H_0, is defined as the value of the Hubble parameter at t_0, i.e. H_0 = \dot{a}_0/a_0. Note that the Cosmological Principle does not require H > 0 – it is perfectly acceptable to have a static or contracting universe.
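The linearity of equation (13) is exactly what homogeneity demands: a pure Hubble flow looks identical about every comoving observer. A short sketch makes this explicit (the value of H and the random positions are arbitrary choices for illustration):

```python
import numpy as np

# A uniform expansion v = H x, as seen from the origin.
rng = np.random.default_rng(0)
H = 70.0                                         # arbitrary units
x = rng.uniform(-100.0, 100.0, size=(1000, 3))   # galaxy positions
v = H * x                                        # Hubble flow about the origin

# Re-express positions and velocities relative to a different observer:
x_rel = x - x[42]
v_rel = v - v[42]

# The shifted field obeys exactly the same law, v_rel = H x_rel:
print(np.allclose(v_rel, H * x_rel))   # True -- no observer is special
```

A nonlinear velocity-distance law, by contrast, would single out the origin as a special point and violate homogeneity.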

3 Dynamics of the Universe - Conservation Laws, Friedmann Equations

To solve for the dynamics of the universe, it is necessary to use the Cosmological Principle (or another symmetry principle) along with General Relativity (or another theory of gravity). In this lecture we shall use a Newtonian approximation to derive the evolution of the universe. The meaning of these solutions within the framework of GR will then be discussed to illustrate the effect of spatial curvature and the behavior of light as it propagates. It turns out that the trajectory of light cannot be treated self-consistently within the framework of Newtonian theory – essentially because of the need for Lorentz transformations rather than Galilean transformations for relativistic velocities. As a reminder, for a Galilean transformation,

x' = x - vt    (14)

t' = t,    (15)

while for a Lorentz transformation

x' = \frac{x - vt}{\sqrt{1 - (v/c)^2}}    (16)

t' = \frac{t - vx/c^2}{\sqrt{1 - (v/c)^2}}.    (17)

3.1 Conservation Laws in the Universe

Let us approximate a region of the universe as a uniform density sphere of non-relativistic matter. We will now use the Eulerian equations for conservation of mass and momentum to derive the dynamical evolution of the universe.

3.1.1 Conservation of Mass

If we assume that mass is conserved, then the mass density ρ satisfies the continuity equation

\frac{\partial\rho}{\partial t} + \nabla\cdot(\mathbf{v}\rho) = 0.    (18)

The Cosmological Principle demands that the density ρ be independent of position. Using the fact that \nabla\cdot\mathbf{v} = 3H(t), the continuity equation becomes

\frac{d\rho}{dt} + 3H(t)\rho = 0,    (19)

or

\frac{d\rho}{\rho} = -3H(t)\,dt,    (20)

which integrates to

\ln\left(\frac{\rho}{\rho_0}\right) = -3\int_{t_0}^{t} dt\,H(t) = -3\int_{a_0}^{a}\frac{da}{a} = -3\ln\left(\frac{a}{a_0}\right).    (21)

This can be rewritten as

\rho(t) = \rho_0\left(\frac{a_0}{a}\right)^3,    (22)

so the time dependence is determined solely by the evolution of the scale factor, and for a matter-dominated universe \rho \propto a^{-3}. This intuitively makes sense, as it is equivalent to saying that the matter density is inversely proportional to the volume.

3.1.2 Conservation of Momentum

We would like to apply conservation of momentum to the Universe using Newton's theory of gravity. This approach would seem, at first glance, to be inconsistent with the Cosmological Principle. Euler's equation for momentum conservation is

\frac{\partial(\rho\mathbf{v})}{\partial t} + \nabla\cdot(\rho\mathbf{v})\,\mathbf{v} + \nabla p = \mathbf{F}\rho,    (23)

where v is the local fluid velocity with respect to a co-moving observer, p is the pressure, and F is the force (in this case gravitational) per unit mass. An immediate problem is that it is difficult to define the gravitational potential in a uniform unbounded medium. We could apply Newton's laws to a universe which is the interior of a large sphere. This violates the Cosmological Principle since we sacrifice isotropy; however, it doesn't violate it too badly if we consider only regions with size x \ll R, where R is the radius of the sphere.

The above version of Euler's equation makes the physical meaning of each term apparent, but let us now switch to the more commonly used form,

\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v} = \mathbf{F} - \frac{\nabla p}{\rho}.    (24)

The Cosmological Principle requires that the pressure gradient must be zero, and using the fact that (\mathbf{x}\cdot\nabla)\,\mathbf{x} = \mathbf{x}, the equation becomes

\left[\dot{H} + H^2\right]\mathbf{x} = \mathbf{F}.    (25)

Poisson's equation for the gravitational force is

\nabla\cdot\mathbf{F} = -4\pi G\rho.    (26)

Taking the divergence of both sides above, and using \nabla\cdot\mathbf{x} = 3, we get

\frac{dH}{dt} + H^2 = -\frac{4\pi G\rho}{3}.    (27)

Using

H(t) = \frac{\dot{a}}{a},    (28)

along with mass conservation, this can be converted into an equation for the scale factor.

\frac{\ddot{a}}{a} - \left(\frac{\dot{a}}{a}\right)^2 + \left(\frac{\dot{a}}{a}\right)^2 = -\frac{4\pi G\rho}{3},    (29)

which simplifies to

\ddot{a} = -\frac{4\pi G\rho}{3}a,    (30)

or, using our result for the evolution of the matter density,

a^2\ddot{a} = -\frac{4\pi G\rho_0 a_0^3}{3}.    (31)

This is the basic differential equation for the time evolution of the scale factor. It is also the equation for the radius of a spherical self-gravitating ball. Looking at the equation, it is clear that the only case in which \ddot{a} = 0 is when \rho_0 = 0 – an empty universe. [We will revisit this with the more general form from GR, but this basic result is OK.] To obtain a static universe, Einstein modified GR to give it the most general form possible. His modification was to add a constant (for which there is no justification in Newtonian gravity), corresponding to a modification of Poisson's Law,

\nabla\cdot\mathbf{F} = -4\pi G\rho + \Lambda,    (32)

where Λ is referred to as the cosmological constant. The cosmological constant Λ must have units of t^{-2} to match the units of \nabla\cdot\mathbf{F}. If |\Lambda| \sim H_0^2, it would have virtually no effect on gravity in the solar system, but would affect the large-scale universe. If we include Λ, our previous derivation is modified such that

3(\dot{H} + H^2) = -4\pi G\rho + \Lambda,    (33)

\frac{\ddot{a}}{a} = -\frac{4\pi G\rho}{3} + \frac{\Lambda}{3},    (34)

or equivalently

\ddot{a} = \left(-\frac{4\pi G\rho}{3} + \frac{\Lambda}{3}\right)a = -\frac{4\pi G\rho_0 a_0^3}{3}a^{-2} + \frac{\Lambda}{3}a.    (35)

Note that a positive Λ corresponds to a repulsive force that can counteract gravity. We now multiply both sides by \dot{a} and integrate with respect to t:

\frac{1}{2}\dot{a}^2 = \frac{4\pi G\rho_0 a_0^3}{3}\frac{1}{a} + \frac{\Lambda}{3}\frac{a^2}{2} + K,    (36)

or

\frac{\dot{a}^2}{a^2} = \frac{8\pi G\rho_0}{3}\left(\frac{a_0}{a}\right)^3 + \frac{\Lambda}{3} + Ka^{-2},    (37)

where K is an arbitrary constant of integration. For the case of a self-gravitating sphere with Λ = 0, K/2 is just the total energy per unit mass (kinetic plus potential) at the surface of the sphere. In GR, we shall see that K is associated with the spatial curvature. The above equation describes what are called the Friedmann solutions for the scale factor of the universe. It implicitly assumes that the universe is filled with zero-pressure, non-relativistic material (also known as the dust-filled model).

The above equations give some intuition for the evolution of the scale factor of the universe. The equation shows that for an expanding universe, where a(0) = 0, the gravitational term should dominate at early times when a is small. As the universe expands though, first the curvature term and later the cosmological constant term are expected to dominate the right hand side of the equation.

Let us now introduce one additional non-Newtonian tweak to the equations. The above equations correspond to a limiting case of the fully correct equations from GR in which the pressure is zero and the energy density is dominated by the rest mass of the particles. To be fully general, the matter density term should be replaced by an "effective density"

\rho_{\rm eff} = \rho + \frac{3p}{c^2},    (38)

where ρ should now be understood to be the total energy density (kinetic + rest mass). With this modification, Equation 34 becomes

\frac{\ddot{a}}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda}{3}.    (39)

It is worth emphasizing at this point that the energy density ρ will include contributions from both matter and radiation, which as we shall see have different dependences upon the scale factor. Finally, we can now re-obtain the equation above for the first derivative if we take into account that the expansion of the universe will be adiabatic, i.e.

dE = -p\,dV \;\rightarrow\; d(\rho c^2 a^3) = -p\,da^3.    (40)

This equation can be rewritten

a^3 d(\rho c^2) + (\rho c^2)\,da^3 = -p\,da^3,    (41)

(\rho c^2 + p)\,da^3 + a^3 d(\rho c^2 + p) = a^3 dp,    (42)

d\left[a^3(\rho c^2 + p)\right] = a^3 dp,    (43)

\frac{d}{dt}\left[a^3(\rho c^2 + p)\right] = \dot{p}a^3,    (44)

which yields

\dot{\rho} + 3\left(\rho + \frac{p}{c^2}\right)\frac{\dot{a}}{a} = 0.    (45)

If we now return to deriving the equation for the first derivative,

\frac{\ddot{a}}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda}{3},    (46)

\frac{1}{2}\frac{d\dot{a}^2}{dt} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right)a\dot{a} + \frac{\Lambda}{3}a\dot{a}.    (47)

The expression for adiabatic expansion can be rewritten,

\frac{3p}{c^2}\frac{\dot{a}}{a} = -\dot{\rho} - 3\rho\frac{\dot{a}}{a},    (48)

which can be inserted to yield

\frac{1}{2}\frac{d\dot{a}^2}{dt} = -\frac{4\pi G}{3}\left(\rho a\dot{a} - \dot{\rho}a^2 - 3\rho a\dot{a}\right) + \frac{\Lambda}{3}a\dot{a},    (49)

\frac{1}{2}\frac{d\dot{a}^2}{dt} = \frac{4\pi G}{3}\left(2\rho a\dot{a} + \dot{\rho}a^2\right) + \frac{\Lambda}{3}a\dot{a},    (50)

\frac{1}{2}\frac{d\dot{a}^2}{dt} = \frac{4\pi G}{3}\frac{d(\rho a^2)}{dt} + \frac{\Lambda}{6}\frac{d}{dt}a^2,    (51)

and hence

\dot{a}^2 = \frac{8\pi G\rho a^2}{3} + \frac{\Lambda a^2}{3} - k.    (52)

In the context of GR, we will come to associate the constant k with the spatial curvature of the universe. GR is fundamentally a geometric theory in which gravity is described as a curved spacetime rather than a force. In this Newtonian analogy the quantity -k/2 would be interpreted as the energy per unit mass for a particle at the point a(t) in the expanding system.

3.2 Conclusions

The two Friedmann equations,

\frac{\ddot{a}}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda}{3},    (53)

\dot{a}^2 = \frac{8\pi G\rho a^2}{3} + \frac{\Lambda a^2}{3} - k,    (54)

together fully describe the time evolution of the scale factor of the universe and will be used extensively during the next few weeks.
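As a quick illustration of how these two equations determine a(t), here is a minimal sketch that integrates equation (54) numerically. I work in units where a_0 = 1 and H_0 = 1, and the matter and Λ fractions are assumed values for illustration (the Ω notation is defined formally in the next section).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Integrate eq. (54): adot^2 = 8*pi*G*rho*a^2/3 + Lambda*a^2/3 - k,
# with rho = rho0*(a0/a)^3, in units where a0 = 1 and H0 = 1.
Om, OL = 0.3, 0.7           # assumed matter and Lambda fractions of critical
Lam = 3.0 * OL              # Lambda in these units
k = Om + OL - 1.0           # fixed by the present-day condition adot(1) = 1

def adot(t, a):
    return np.sqrt(Om / a[0] + (Lam / 3.0) * a[0]**2 - k)

sol = solve_ivp(adot, (1e-8, 2.0), [1e-5], rtol=1e-8, dense_output=True)
print(f"a(t = 1/H0) = {sol.sol(1.0)[0]:.3f}")   # ~1.04 for these parameters
```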

3.3 An Example Solution and Definitions of Observable Quantities

Let us now work through one possible solution to the Friedmann equations. For a simple case, we will start with Λ = 0. At the present time,

\left(\frac{da}{dt}\right)_{t=t_0} = \dot{a}_0 = a_0 H_0.    (55)

We can now evaluate the constant k in terms of observable present-day quantities.

\dot{a}^2 = \frac{8\pi G\rho a^2}{3} - k,    (56)

\dot{a}_0^2 \equiv H_0^2 a_0^2 = \frac{8\pi G\rho_0 a_0^2}{3} - k,    (57)

k = -a_0^2\left(H_0^2 - \frac{8\pi G\rho_0}{3}\right) = \frac{8\pi G}{3}a_0^2\left(\rho_0 - \frac{3H_0^2}{8\pi G}\right).    (58)

Clearly, k = 0 only if ρ0 is equal to what we will define as the critical density,

\rho_{\rm crit} = \frac{3H_0^2}{8\pi G}.    (59)

With this definition,

k = \frac{8\pi G}{3}a_0^2\rho_0\left(1 - \frac{\rho_{\rm crit}}{\rho_0}\right) = \frac{8\pi G}{3}a_0^2\rho_{\rm crit}\left(\Omega_0 - 1\right),    (60)

where we have further defined,

\Omega_0 \equiv \frac{\rho_0}{\rho_{\rm crit}} = \frac{8\pi G\rho_0}{3H_0^2}.    (61)

Note that this has the corollary definition

H_0^2 = \frac{8\pi G\rho_0}{3\Omega_0}.    (62)

Inserting the definition for the curvature back into the Friedmann equation, we see that

\dot{a}^2 = \frac{8\pi G\rho_0 a_0^3}{3a} + \frac{8\pi G a_0^2\rho_{\rm crit}}{3}(1 - \Omega_0),    (63)

or

\left(\frac{\dot{a}}{a_0}\right)^2 = \Omega_0 H_0^2\frac{a_0}{a} + \frac{8\pi G\rho_{\rm crit}}{3}(1 - \Omega_0),    (64)

\left(\frac{\dot{a}}{a_0}\right)^2 = \Omega_0 H_0^2\frac{a_0}{a} + H_0^2(1 - \Omega_0).    (65)

We now consider big bang solutions, i.e. a(0) = 0. At very early times (a \sim 0), the first term on the right-hand side – the gravitational term – will dominate the second term. Thus, at early times the form of the solution should be independent of the density. However, at later times the nature of the solution depends critically upon whether the second (energy) term is positive, negative, or zero. Equivalently, it depends on whether Ω0 is less than, equal to, or greater than 1. If Ω0 < 1 and the energy term is positive, the solution for a(t) is analogous to the trajectory of a rocket launched with a velocity greater than the escape velocity. Consider now the case Ω0 = 1, which is called the Einstein-de Sitter universe. This case must always be a good approximation at early times. Then

\frac{da}{dt} = \frac{H_0 a_0^{3/2}}{a^{1/2}},    (66)

a^{1/2}\,da = H_0 a_0^{3/2}\,dt,    (67)

or, assuming a(0) = 0,

\frac{a}{a_0} = \left(\frac{3H_0 t}{2}\right)^{2/3}.    (68)

Thus, a(t) is a very simple function for the Einstein-de Sitter case. We can also very easily solve for the age of the universe,

t_0 = \frac{2}{3}H_0^{-1}.    (69)
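A quick numerical cross-check under the same assumptions (Λ = 0, a_0 = 1, time in units of 1/H_0): writing the age as t_0 = \int_0^1 da/\dot{a} with \dot{a} taken from equation (65) recovers the Einstein-de Sitter value of 2/3 and shows how the age depends on Ω_0.

```python
import numpy as np
from scipy.integrate import quad

# Present age t0 = int_0^1 da/adot, with adot = sqrt(Omega0/a + 1 - Omega0)
# from eq. (65) (Lambda = 0, a0 = 1, time measured in units of 1/H0).
def age(Omega0):
    return quad(lambda a: 1.0 / np.sqrt(Omega0 / a + 1.0 - Omega0), 0.0, 1.0)[0]

for Om in (0.1, 0.5, 1.0, 2.0):
    print(f"Omega0 = {Om:3.1f}:  H0*t0 = {age(Om):.4f}")
# Omega0 = 1.0 gives exactly 2/3 (eq. 69); every case gives H0*t0 < 1.
```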

Indeed, H_0^{-1} overestimates the age of the universe for all Friedmann models with Λ = 0. Now consider the case of Ω0 > 1. The maximum scale factor a_{\rm max} occurs when \dot{a} = 0 in equation 65,

\frac{a_{\rm max}}{a_0} = \frac{\Omega_0}{\Omega_0 - 1}.    (70)

We can obtain a parametric solution by letting

a(t) = a_{\rm max}\sin^2\theta = \frac{\Omega_0 a_0}{\Omega_0 - 1}\sin^2\theta.    (71)

Substituting this into equation 65 gives

\left(\frac{\Omega_0}{\Omega_0 - 1}\right)^2 4\sin^2\theta\cos^2\theta\,\dot{\theta}^2 = H_0^2(\Omega_0 - 1)\frac{\cos^2\theta}{\sin^2\theta},    (72)

H_0 t = \frac{2\Omega_0}{(\Omega_0 - 1)^{3/2}}\int_0^\theta dx\,\sin^2 x = \frac{\Omega_0}{(\Omega_0 - 1)^{3/2}}\left(\theta - \frac{\sin 2\theta}{2}\right).    (73)

The above equation represents a parametric solution for the scale factor when Ω0 > 1. Since the lifetime of the universe extends from θ = 0 to θ = π, the total lifetime of the universe is

t_{\rm lifetime} = \frac{\pi\Omega_0}{H_0(\Omega_0 - 1)^{3/2}}.    (74)

A similar parametric solution for H_0 t can be derived for Ω0 < 1 by replacing \sin\theta with \sinh\theta in the expression for a(t). In this case, a(t) \propto t for large t.

3.4 The Friedmann Equations from General Relativity

Before moving on to a discussion of spacetime metrics, it is worth at least briefly mentioning the origin of the Friedmann equations in the context of General Relativity. They are derived directly from Einstein's field equations,

G_{ij} \equiv R_{ij} - \frac{1}{2}Rg_{ij} = \frac{8\pi G}{c^4}T_{ij},    (75)

or, including the cosmological constant,

R_{ij} - \frac{1}{2}Rg_{ij} - \Lambda g_{ij} = \frac{8\pi G}{c^4}T_{ij}.    (76)

The g_{ij} comprise the metric tensor, describing the metric of spacetime. T_{ij} is the energy-momentum tensor, and encapsulates all the information about the energy and momentum conservation laws that we discussed in the Newtonian context. The conservation law in this context is simply T^j_{i;j} = 0, i.e. the covariant divergence of the energy-momentum tensor vanishes. The Ricci tensor (R_{ij}) and Ricci scalar (R) together make up the Einstein tensor. In cosmology, the energy-momentum tensor of greatest relevance is that of a perfect fluid,

T_{ij} = (\rho c^2 + p)U_i U_j - pg_{ij},    (77)

where U_k is the fluid four-velocity. Remember that we assumed a perfect fluid in the Newtonian analog. The covariant divergence of this tensor provides the analog to the Euler equations. Substituting this expression for the stress tensor yields, after some math, the Friedmann equations.

17 4 Spacetime Metrics

It is important to interpret the solutions for the scale factor obtained from Newtonian theory in the last section within the framework of GR. While Newtonian theory treats gravity as a force, in GR the presence of a mass is treated as curving or warping spacetime so that it is no longer Euclidean. Particles moving under the influence of gravity travel along geodesics, the shortest distance between two points in curved spacetime. It is therefore necessary to be able to describe spatial curvature in a well-defined way.

4.1 Example Metrics

Curvature is most easily visualized by considering the analogy with 2D creatures living on the surface of a sphere (balloon). Such creatures, who live in a closed universe, could easily detect curvature by noticing that the sum of the angles of any triangle is greater than 180°. However, this space is locally flat (Euclidean) in the sense that in a small enough region of space the geometry is well-approximated by a Euclidean geometry. This space has the interesting property that the space expands if the sphere (balloon) is inflated, and such an expansion in no way changes the nature of the geometry. It is also possible to define a metric along the surface. A metric, or distance measure, describes the distance, ds, between two points in space or spacetime. The general form for a metric is

ds^2 = g_{ij}\,dx_i\,dx_j,    (78)

where the gij are the metric coefficients that we saw in the Einstein field equations. The distance ds along the surface of a unit sphere is given by

ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2 = d\theta^2\left[1 + \sin^2\theta\left(\frac{d\phi}{d\theta}\right)^2\right].    (79)

The metric given by the above equation relates the difference between the coordinates θ and φ of two points to the physically measurable distance between those points. Since the metric provides the physical distance between two nearby points, its value should not change if different coordinates are used. A change of coordinates from (θ, φ) to two other coordinates must leave the value of the metric unchanged even though its functional form may be very different. The minimum distance between two points on the surface of the sphere is obtained by minimizing the distance given by equation 79.

4.2 Geodesics

In general, for any metric the shortest distance between two points comes from minimizing the quantity

I = \int_{P_1}^{P_2} ds = \int_{P_1}^{P_2}\frac{ds}{dt}\,dt = \int_{P_1}^{P_2} L\,dt,    (80)

where the two points P1 and P2 are held fixed, t is a dummy variable that varies continuously along a trajectory, and the Lagrangian L = ds/dt. Minimization of the Lagrangian yields the equation of motion in the geometry. If P1 and P2 are held fixed then the integral is minimized when Lagrange's equations are satisfied (same as in classical mechanics),

\frac{\partial L}{\partial x_i} = \frac{d}{dt}\frac{\partial L}{\partial\dot{x}_i}, \quad i = 1 \ldots N.    (81)

Consider the example of the shortest distance (geodesic) between two points on the surface of a unit sphere. Let the independent variable be θ instead of t. Then the Lagrangian is

L \equiv \frac{ds}{d\theta} = \sqrt{1 + \sin^2\theta\left(\frac{d\phi}{d\theta}\right)^2},    (82)

and Lagrange's equation is

\frac{\partial L}{\partial\phi} = \frac{d}{d\theta}\frac{\partial L}{\partial\dot{\phi}},    (83)

\frac{d}{d\theta}\left[\frac{\sin^2\theta\,\dot{\phi}}{\sqrt{1 + \sin^2\theta\,\dot{\phi}^2}}\right] = 0,    (84)

where \dot{\phi} = d\phi/d\theta. Integrating and squaring this equation gives

\sin^4\theta\left[\frac{d}{d\theta}(\phi - C_2)\right]^2 = C_1\left(1 + \sin^2\theta\left[\frac{d}{d\theta}(\phi - C_2)\right]^2\right).    (85)

Let y = \cos(\phi - C_2) and x = \cot\theta. Then \frac{dx}{d\theta} = \frac{-1}{\sin^2\theta} and the differential equation becomes

\left(\frac{dy}{dx}\right)^2 = C_1\left[(1 - y^2) + (1 + x^2)\left(\frac{dy}{dx}\right)^2\right],    (86)

with the solution

y = \left(\frac{C_1}{1 - C_1}\right)^{1/2}x = C_1' x,    (87)

or,

\cos(\phi - C_2) = C_1'\cot\theta.    (88)

The above equation gives the geodesics along the surface of a sphere. But this is just the expression for a great circle! To see this, consider that a plane through the origin,

x + Ay + Bz =0 (89)

produces the following locus of intersection with a unit sphere:

\sin\theta\cos\phi + A\sin\theta\sin\phi + B\cos\theta = 0,    (90)

B + \tan\theta(A\sin\phi + \cos\phi) = 0,    (91)

\cot\theta = C\cos(\phi - D).    (92)

Therefore we have demonstrated that geodesics on the surface of a sphere are great circles. Of course, this can be proven much more easily, but the above derivation illustrates the general method for determining geodesics for an arbitrary metric.
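The same result can be checked numerically: discretize a path between two points on the unit sphere using the metric (79), minimize its length, and compare against the great-circle arc length arccos of the dot product of the endpoint unit vectors. The endpoints and the resolution of the discretization below are arbitrary choices for illustration.

```python
import numpy as np
from scipy.optimize import minimize

p1 = np.array([0.6, 0.3])    # (theta, phi) of the first endpoint
p2 = np.array([1.9, 2.1])    # (theta, phi) of the second endpoint

def path_length(interior):
    # Length of a discretized path using ds^2 = dtheta^2 + sin^2(theta) dphi^2
    pts = np.vstack([p1, interior.reshape(-1, 2), p2])
    dtheta = np.diff(pts[:, 0])
    dphi = np.diff(pts[:, 1])
    theta_mid = 0.5 * (pts[:-1, 0] + pts[1:, 0])
    return np.sum(np.sqrt(dtheta**2 + np.sin(theta_mid)**2 * dphi**2))

# Initial guess: a straight line in coordinate space, 20 interior points
t = np.linspace(0.0, 1.0, 22)[1:-1]
guess = np.outer(1 - t, p1) + np.outer(t, p2)
best = minimize(path_length, guess.ravel(), method="L-BFGS-B")

def n_hat(theta, phi):
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

print(f"minimized path length:        {best.fun:.4f}")
print(f"great-circle arc acos(n1.n2): {np.arccos(n_hat(*p1) @ n_hat(*p2)):.4f}")
```

The two numbers agree to the accuracy of the discretization, as expected for a geodesic.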

4.3 Special Relativity and Curvature

Week 3 Reading Assignment: Chapter 2

For special relativity, in a Lorentz frame we can define a distance in spacetime as

ds^2 = c^2 dt^2 - dx^2 = c^2 dt^2\left(1 - \frac{v^2}{c^2}\right).    (93)

This metric also relates physically measurable distances to differences in coordinates. For example, the time measured by a moving clock (the proper time) is given by ds/c. Thus, proper time intervals are proportional to, but not equal to, dt. Let's look at the above metric for a moment. For light, the metric clearly yields ds² = 0. Light is therefore said to follow a null geodesic, which simply means that the physical distance travelled is equal to ct. Everything that we see in the universe by definition lies along null geodesics, as the light has just had enough time to reach us. Consider Figure 3. The null geodesics divide the spacetime plane into two types of world lines. World lines with ds² > 0 are said to be timelike because the time component is larger. Physically, this means that we observed (received the light from) events with timelike world lines some time in the past. World lines with ds² < 0 are said to be spacelike. Spacetime points that lie along spacelike world lines are sufficiently far that light has not yet had time to reach us. Now, consider the equation of motion for a particle in special relativity. For a free particle, the equation of motion follows from minimizing the distance between two fixed points in spacetime, analogous to the case with the surface of the sphere,

\delta\int_1^2 ds = \delta\int_{t_1}^{t_2} L\,dt = 0.    (94)

Since the Lagrangian is

L = c\left[1 - \left(\frac{v}{c}\right)^2\right]^{1/2} = c\left(1 - \frac{v^2}{2c^2} + \ldots\right),    (95)

and since the first term is constant, for nonrelativistic free particles (v \ll c) the special relativistic Lagrangian reduces to the usual nonrelativistic Lagrangian without interactions.

Figure 3 Light cones for a flat geometry. Light travels along the null geodesics, while particles travel along timelike geodesics. Points with ds² < 0 are not observable at the present time.

Note that in the case of external forces, the situation is not quite so simple. Recall that the classical Lagrangian is given by

L = \frac{1}{2}mv^2 - U.    (96)

The analog in special relativity is

L = -mc^2\sqrt{1 - v^2/c^2} - U.    (97)

If one wishes to calculate the motion of a particle undergoing electromagnetic interactions, then one must include the electrostatic potential Φ and the vector potential A in the Lagrangian as

U = e\Phi - \frac{e}{c}\mathbf{v}\cdot\mathbf{A}.    (98)

In general relativity, gravity is treated as an entity that modifies the geometry of spacetime. Particles travel along geodesics in that geometry with the equation of motion

\delta\int_1^2 ds = 0.    (99)

Thus, gravitational forces, as such, do not exist. The presence of massive bodies simply affects the geometry of spacetime. When spacetime is curved due to the presence of gravitational mass, particles no longer travel on straight lines in that geometry. If one wishes to

Figure 4 Geometries with the three different curvatures.

include, say, electromagnetic forces in addition to gravity, then the Lagrangian would have to be modified as in special relativity.

What distinguishes a curved from a flat geometry? At any point in a metric, one can define an invariant quantity called the curvature, which characterizes the local deviation of the geometry from flatness. Since it is an invariant quantity, the curvature does not depend on the choice of coordinate system. For the surface of a unit sphere, the value of the curvature is +1. The curvature of flat space is zero, and the curvature of an open hyperboloid is -1. It is useful to picture the three types of curvature geometrically (Figure 4). The properties of the three cases are:

• k = 0: Flat, Euclidean geometry. The sum of angles in a triangle is 180°.

• k = 1: Closed, spherical geometry. The sum of angles in a triangle is greater than 180°.

• k = -1: Open, hyperbolic geometry. The sum of angles in a triangle is less than 180°. The standard analogy for visualization is a saddle, where all directions extend to infinity.

Since the value of the curvature is invariant, there can be no global coordinate transformation that converts a curved metric, such as the surface of a sphere, into the metric of flat spacetime. In other words, there is no mapping x = x(θ, φ), y = y(θ, φ), z = z(θ, φ) that converts the metric for a unit sphere to

ds^2 = dx^2 + dy^2 + dz^2.    (100)

This is why, for example, flat maps of the world always have some intrinsic distortion in them.

Similarly, there is no coordinate transformation that converts the metric of special relativity (called the Minkowski metric)

ds^2 = c^2 dt^2 - dx^2    (101)

into a curved geometry.

4.4 The Robertson-Walker Metric

We have looked at examples of metrics for a unit sphere and for special relativity. Let us now turn our attention to the question of whether we can construct a metric that is valid in a cosmological context. Assume that (1) the cosmological principle is true, and (2) each point in spacetime has one and only one co-moving, timelike geodesic passing through it. Assumption (2) is equivalent to assuming the existence of worldwide simultaneity, or universal time. Then for a co-moving observer, there is a metric for the universe called the Robertson-Walker metric, or sometimes the Friedmann-Lemaître-Robertson-Walker metric (named after the people who originally derived it). The Robertson-Walker metric is

ds^2 = (c\,dt)^2 - a(t)^2\left[\frac{d\tilde{r}^2}{1 - k\tilde{r}^2} + \tilde{r}^2 d\eta^2\right],    (102)

where k is the sign of the curvature (k = -1, 0, 1), a(t) is the scale factor, and \tilde{r} is the co-moving distance. The d\eta term is short-hand for the solid angle,

d\eta^2 = \sin^2\theta\,d\phi^2 + d\theta^2.    (103)

For a given curvature, this metric completely specifies the geometry of the universe to within one undetermined factor, a(t), which is determined from the Friedmann equations. Together, the Friedmann equations and Robertson-Walker metric completely describe the geometry. The above form of the metric is the one given in the text; however, there are in fact three commonly used forms for the metric,

ds^2 = (c\,dt)^2 - a(t)^2\left[d\bar{r}^2 + \left(\frac{\sin k\bar{r}}{k}\right)^2 d\eta^2\right],    (104)

ds^2 = (c\,dt)^2 - \frac{a(t)^2}{(1 + \frac{1}{4}kr^2)^2}\left[dr^2 + r^2 d\eta^2\right],    (105)

ds^2 = (c\,dt)^2 - a(t)^2\left[\frac{d\tilde{r}^2}{1 - k\tilde{r}^2} + \tilde{r}^2 d\eta^2\right].    (106)

All three forms are equivalent, yielding the same value for the distance between two points. Transformation between the forms is possible given the appropriate variable substitutions. These transformations are left as a homework exercise.

In the above equations, k is the same curvature that we discussed in the context of special relativity. The phrases "open" and "closed" now take on added significance in the sense that, for Λ = 0, a "closed" universe will recollapse while an "open" universe will expand forever. In contrast, the recent discovery that Λ \ne 0 has given rise to the phrase: "Geometry is not destiny". In the presence of a cosmological constant, the strict relation above does not hold.

4.4.1 Proper and Co-moving Distance

Given the above metric, we will be able to measure distances. Looking at the equation, let us start with two distance definitions:

• Proper Distance: Proper distance is defined as the actual spatial distance between two co-moving observers. This distance is what you would actually measure, and is a function of time as the universe expands.

• Co-moving (or coordinate) distance: The co-moving distance is defined such that the distance between two co-moving observers is independent of time. The standard practice is to define the co-moving distance at the present time t0.

As an illustration, consider two co-moving observers currently separated by a proper distance r0. At any lookback time t, the proper separation will be

D_P = (a/a_0)\,r_0,    (107)

while the co-moving distance will be

D_C = r_0.    (108)

Note that it is a common practice to set a0 = 1.

4.4.2 Derivation of the Robertson-Walker Metric

We shall now derive the Robertson-Walker metric. While the metric can be derived by several methods, we will go with a geometric approach for clarity. Consider an arbitrary event (t, r) in spacetime. This event must lie within a spacelike 3D hypersurface within which the universe everywhere appears identical to its appearance at the point in question (homogeneity). The set of co-moving timelike geodesics (world lines of co-moving observers) through each point on this hypersurface defines the universal time axis. The metric can then be expressed in the form

ds^2 = c^2 dt^2 - d\chi^2,    (109)

where dχ is the distance measured within the spacelike hypersurface. There are no cross terms dχ dt because the time axis must be perpendicular to the hypersurface. Otherwise the cross-term would yield a preferred spacelike direction, thus violating isotropy. If we choose a polar coordinate system, then dχ² can be written in the form

d\chi^2 = Q(r, t)\left[dr^2 + r^2 d\eta^2\right],    (110)

where Q(r, t) includes both the time and spatial dependence. Again by isotropy, all cross terms like dr dη must vanish. The second term inside the brackets can have a different coefficient than the first term, but we have the freedom to define r so that the coefficients are the same. The proper distance δx between two radial points r and r + δr is

\delta x = Q^{1/2}\,\delta r.    (111)

Locally, geometry is Euclidean, and local Galilean invariance implies that Hubble's law is valid:

H(t) = \frac{1}{\delta x}\frac{\partial}{\partial t}\delta x = \frac{1}{2Q}\frac{\partial Q}{\partial t}.    (112)

Hubble's law must be independent of position, r, because of the Cosmological Principle. Therefore Q(r, t) must be separable,

Q(r, t) = a^2(t)\,G(r),    (113)

so the metric is

ds^2 = c^2 dt^2 - a^2(t)\,G(r)\left[dr^2 + r^2 d\eta^2\right].    (114)

Let us now transform the radial coordinates to

d\chi^2 = d\tilde{r}^2 + F^2(\tilde{r})\,d\eta^2    (115)

using the change of variables

F(\tilde{r}) = \sqrt{G(r)}\,r,    (116)

d\tilde{r} = \sqrt{G(r)}\,dr.    (117)

For a Euclidean geometry,

d\chi^2 = d\tilde{r}^2 + \tilde{r}^2 d\eta^2,    (118)

so F(\tilde{r}) = \tilde{r} in the Euclidean case. Since spacetime locally appears Euclidean, we therefore require in the limit \tilde{r} \to 0 that F(0) = 0 and F'(0) = 1. Now consider the triangles in Figure 5. If the angles α, β, γ are small, and if x, y, z are proper distances, we get three identities:

F(\tilde{r})\,\alpha = F(\epsilon + \tau)\,\gamma,    (119)

F(\tilde{r} + \epsilon + \tau)\,\alpha = F(\epsilon + \tau)\,\beta,    (120)

F(\tilde{r} + \epsilon)\,\alpha = F(\epsilon)\,\beta + F(\tau)\,\gamma.    (121)

Eliminating β and γ from the three equations, we get

F(\tilde{r} + \epsilon) = F(\epsilon)\frac{F(\tilde{r} + \epsilon + \tau)}{F(\epsilon + \tau)} + F(\tau)\frac{F(\tilde{r})}{F(\epsilon + \tau)},    (122)

F(\epsilon + \tau)\,F(\tilde{r} + \epsilon) = F(\epsilon)\,F(\tilde{r} + \epsilon + \tau) + F(\tau)\,F(\tilde{r}).    (123)

Figure 5 Geometric Derivation of the Robertson-Walker Metric

Take the limit \epsilon \to 0 and expand to first order in \epsilon.

\left[F(\tau) + \epsilon F'(\tau)\right]\left[F(\tilde{r}) + \epsilon F'(\tilde{r})\right] = \epsilon F(\tilde{r} + \tau) + F(\tau)F(\tilde{r}),    (124)

F(\tau)F(\tilde{r}) + \epsilon F(\tau)F'(\tilde{r}) + \epsilon F'(\tau)F(\tilde{r}) + \epsilon^2 F'(\tau)F'(\tilde{r}) = \epsilon F(\tilde{r} + \tau) + F(\tau)F(\tilde{r}),    (125)

F(\tilde{r})F'(\tau) + F(\tau)F'(\tilde{r}) = F(\tilde{r} + \tau).    (126)

Expand to second order in τ:

F(\tilde{r})\left[1 + \tau F''(0) + \frac{1}{2}\tau^2 F'''(0)\right] + F'(\tilde{r})\left[F(0) + \tau F'(0) + \frac{1}{2}\tau^2 F''(0)\right] = F(\tilde{r}) + \tau F'(\tilde{r}) + \frac{1}{2}\tau^2 F''(\tilde{r}),    (127)

or, using the limits for F(0) and F'(0), the first order terms give

F''(0) = 0,    (128)

and the second order terms give

F''(\tilde{r}) = F'''(0)\,F(\tilde{r}).    (129)

Define k \equiv (-F'''(0))^{1/2}. Then

F''(\tilde{r}) = -k^2 F(\tilde{r}),    (130)

and this has the general solution

F(\tilde{r}) = A\sin(k\tilde{r} + B).    (131)

From the boundary conditions, F(0) = 0 implies B = 0, and F'(0) = 1 implies kA = 1. Therefore, the solution is

F(\tilde{r}) = \frac{\sin k\tilde{r}}{k}.    (132)

Verify the third derivative:

F'''(0) = -k^2\cos 0 = -k^2.    (133)

Correct. The sign of k determines the nature of the solution:

• k = 1 \rightarrow F(\tilde{r}) = \sin\tilde{r}

• k = 0 \rightarrow F(\tilde{r}) = \tilde{r}

• k = -1 \rightarrow F(\tilde{r}) = \sinh\tilde{r}.

Thus, we have the Robertson-Walker metric,

ds^2 = (c\,dt)^2 - a(t)^2\left[d\bar{r}^2 + \left(\frac{\sin k\bar{r}}{k}\right)^2 d\eta^2\right],    (134)

which can be converted to the other standard forms.
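As a sanity check on the derivation, a few lines of sympy confirm that F(r̃) = sin(kr̃)/k satisfies equation (130) with the required boundary conditions, and reduces to the flat-space result F = r̃ as k → 0:

```python
import sympy as sp

r, k = sp.symbols('r k', positive=True)
F = sp.sin(k * r) / k

# ODE (130): F'' = -k^2 F, with boundary conditions F(0) = 0 and F'(0) = 1
assert sp.simplify(sp.diff(F, r, 2) + k**2 * F) == 0
assert F.subs(r, 0) == 0
assert sp.diff(F, r).subs(r, 0) == 1

# The flat (k -> 0) case is recovered as a limit:
print(sp.limit(F, k, 0))   # prints: r
```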

5 Redshift

OK. Stepping back for a second, we now have a means of describing the evolution of the size of the universe (Friedmann equation) and of measuring distances within the universe (Robertson-Walker metric). It's time to recast these items in terms of observable quantities and use this machinery to develop a more concise description of our Universe. We don't directly observe the scale factor, a(t), but we can observe the cosmological redshift of objects due to the expansion of the universe. As you may recall, the Doppler shift of light (redshift or blueshift) is defined as

z = \frac{\lambda_o - \lambda_e}{\lambda_e} = \frac{\nu_e - \nu_o}{\nu_o},    (135)

where λo and λe are the observed and emitted wavelengths, and νo and νe are the corresponding frequencies. This can be recast in terms of frequency as

1 + z = \frac{\nu_e}{\nu_o}.    (136)

We know that light travels along null geodesics (ds = 0). Therefore, for light travelling to us (i.e. along the radial direction) the RW metric implies

c^2 dt^2 = a^2\frac{dr^2}{1 - kr^2},    (137)

\frac{c\,dt}{a} = \frac{dr}{\sqrt{1 - kr^2}} = f(r).    (138)

Consider two photons emitted at distance R, at times t_e and t_e + \delta t_e, that are observed at times t_o and t_o + \delta t_o. Since both are emitted at distance R, f(r) is the same for both and

\int_{t_e}^{t_o}\frac{c\,dt}{a} = \int_{t_e + \delta t_e}^{t_o + \delta t_o}\frac{c\,dt}{a}.    (139)

If δt_e is small, then the above equation becomes

\frac{\delta t_o}{a_o} = \frac{\delta t_e}{a_e},    (140)

\nu_o a_o = \nu_e a_e,    (141)

\frac{\nu_e}{\nu_o} = \frac{a_o}{a_e} = 1 + z,    (142)

where the last relation comes from the definition of redshift. Taking a_o to be now (t_0), and defining a_0 \equiv 1, we therefore have the final relation

a = \frac{1}{1 + z}.    (143)

Note that there is a one-to-one correspondence between redshift and scale factor – and hence also time. The variables z, a, and t are therefore interchangeable. From this point on, we will work in terms of redshift since this is an observable quantity. We do, however, need to be aware that the cosmological expansion is not the only source of redshift. The other sources are:

• Gravitational redshift: Light emitted from deep within a gravitational potential well will be redshifted as it escapes. This effect can be the dominant source of redshift in some cases, such as light emitted from near the event horizon of a black hole.

• Peculiar velocities: Any motion relative to the uniform expansion will also yield a Doppler shift. Galaxies (and stars for that matter) do not move uniformly with the expansion, but rather have peculiar velocities relative to the Hubble flow of several hundred km s⁻¹ – or even > 1000 km s⁻¹ for galaxies in clusters. In fact, some of the nearest galaxies to us are blueshifted rather than redshifted. This motion, which is a natural consequence of gravitational attraction, dominates the observed redshift for nearby galaxies.

The total observed redshift for all three sources is

(1 + z) = (1 + z_{\rm cosmological})(1 + z_{\rm grav})(1 + z_{\rm pec}).    (144)

Also, between two points at z1 and z2 (z1 being larger), the relative redshift is

1 + z_{12} = \frac{1 + z_1}{1 + z_2} = \frac{a_2}{a_1}.    (145)
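Equation (144) is straightforward to encode. As a toy example with made-up numbers, a nearby galaxy whose infall velocity exceeds its Hubble recession shows a net blueshift, as noted above:

```python
def total_redshift(z_cosmo, z_grav=0.0, z_pec=0.0):
    # Eq. (144): the (1 + z) factors from each source multiply.
    return (1 + z_cosmo) * (1 + z_grav) * (1 + z_pec) - 1

# Hypothetical nearby galaxy: z_cosmo = 0.001 (~300 km/s recession),
# falling toward us at 600 km/s (nonrelativistic Doppler: z_pec = -v/c).
z_pec = -600.0 / 2.998e5
print(total_redshift(0.001, z_pec=z_pec))   # ~ -0.001: a net blueshift
```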

6 The Friedmann Equations 1: Observable Quantities

Recall again the Friedmann equation,

\dot{a}^2 + kc^2 = \frac{8\pi G}{3}\rho a^2 + \frac{\Lambda c^2}{3}a^2.    (146)

We will now recast this in a simpler form corresponding to observable quantities. First, let us list and define these quantities.

6.1 The Hubble parameter (H)

We have previously defined the Hubble parameter as

H = \frac{\dot{a}}{a},    (147)

and the Hubble constant as

H_0 = \frac{\dot{a}_0}{a_0}.    (148)

7 The density parameter (Ω0)

We have previously defined the density parameter as the ratio of the actual density to the critical density at the current time (t0). The critical density ρc is the density required to just halt the expansion of the universe for models with Λ = 0, and is given by

\rho_c = \frac{3H_0^2}{8\pi G}.    (149)

The matter density parameter at the current time is thus,

\Omega_0 = \frac{\rho_0}{\rho_c} = \frac{8\pi G\rho_0}{3H_0^2}.    (150)

8 The cosmological constant density parameter (Ω_Λ)

Consider an empty universe (Ω0 = 0). The "critical" value of the cosmological constant is defined as the value required for a flat universe in this model (k = 0). Specifically, for time t0 the Friedmann equation above becomes

\frac{\dot{a}_0^2}{a_0^2} - \frac{\Lambda_c c^2}{3} = 0,    (151)

\Lambda_c = \frac{3H_0^2}{c^2}.    (152)

The parameter ΩΛ is defined as

\Omega_\Lambda = \frac{\Lambda}{\Lambda_c} = \frac{\Lambda c^2}{3H_0^2}.    (153)

This is basically a statement describing the contribution of the energy density in the cosmological constant as a fraction of the total required to close the universe.

9 The Observable Friedmann Equation

Using the above equations, let's now proceed to recast the Friedmann equation.

\dot{a}^2 + kc^2 = \frac{8\pi G}{3}\rho a^2 + \frac{\Lambda c^2}{3}a^2    (154)

= a^2\left[\frac{8\pi G\rho_0}{3H_0^2}\frac{\rho}{\rho_0}H_0^2 + \frac{\Lambda c^2}{3H_0^2}H_0^2\right]    (155)

\dot{a}^2 = a^2 H_0^2\left[\Omega_0\frac{\rho}{\rho_0} + \Omega_\Lambda - \frac{kc^2}{H_0^2 a^2}\right]    (156)

H^2 = H_0^2\left[\Omega_0\frac{\rho}{\rho_0} + \Omega_\Lambda - \frac{kc^2}{H_0^2 a^2}\right]    (157)

Now, at time t_0,

H_0^2 = H_0^2\left[\Omega_0\frac{\rho_0}{\rho_0} + \Omega_\Lambda - \frac{kc^2}{H_0^2}\right],    (158)

\Omega_0 + \Omega_\Lambda - \frac{kc^2}{H_0^2} = 1,    (159)

\Omega_0 + \Omega_\Lambda + \Omega_k = 1,    (160)

where we have now defined the curvature term in terms of the other quantities,

\Omega_k = 1 - \Omega_0 - \Omega_\Lambda.    (162)

This tells us that the general description of the evolution of the scale factor, in terms of redshift, is

H^2 = H_0^2\left[\Omega_0\frac{\rho}{\rho_0} + \Omega_\Lambda + \Omega_k(1 + z)^2\right],    (163)

or

H^2 = H_0^2\left[\Omega_0\frac{\rho}{\rho_0} + \Omega_\Lambda + (1 - \Omega_0 - \Omega_\Lambda)(1 + z)^2\right].    (164)

This definition is commonly written as H = H_0 E(z), where

E(z) = \left[\Omega_0\frac{\rho}{\rho_0} + \Omega_\Lambda + (1 - \Omega_0 - \Omega_\Lambda)(1 + z)^2\right]^{1/2}.    (165)

10 The Equation of State

OK – looks like we're making progress. Now, what is ρ/ρ0? Well, we worked this out earlier for pressureless, non-relativistic matter, assuming adiabatic expansion of the universe: ρ ∝ (1 + z)³. However, ρ is an expression for the total energy density. We need to correctly model the evolution of the density for each component, which requires us to use the appropriate equation of state for each component. Recall that for the matter case, we started with the adiabatic assumption

p\,dV = -dE    (166)

p\,da^3 = -d(\rho a^3)    (167)

and set p = 0. Let us now assume a more general equation of state,

p = (1 - \gamma)\rho c^2 = w\rho c^2.    (168)

In general w is defined as the ratio of the pressure to the energy density. One can (and people do) invent more complicated equations of state, such as p = (1 - \gamma)\rho c^2 + p_0, where w is no longer defined by the simple relation above, but the above equation is the standard generalization that encompasses most models. For this generalization,

\rho w\,da^3 = -\rho\,da^3 - a^3\,d\rho,    (169)

\rho\,da^3(1 + w) = -a^3\,d\rho,    (170)

\frac{d\rho}{\rho} = -(1 + w)\frac{da^3}{a^3},    (171)

\rho = \rho_0\left(\frac{a}{a_0}\right)^{-3(1+w)} = \rho_0(1 + z)^{3(1+w)}.    (172)

For the "dust-filled" universe case that we discussed before, which corresponds to non-relativistic, pressureless material, we had w = 0. In this case, the above equation reduces to ρ = ρ0(1 + z)³. More generally, a non-relativistic fluid or gas can be described by a somewhat more complicated equation of state that includes the pressure. For an ideal gas with thermal energy much smaller than the rest mass (k_B T \ll m_p c^2),

p = nk_B T = \frac{\rho_m}{m_p}k_B T = \frac{\rho c^2}{m_p c^2}\frac{k_B T}{1 + (k_B T/((\gamma - 1)m_p c^2))} = w(T)\,\rho c^2.    (173)

In most instances, w(T) \ll 1 and the gas is well-approximated by the dust case.

———————————-
Aside on adiabatic processes

As a reminder, an adiabatic process is defined by

PV γ = constant, (174)

where γ is called the adiabatic index. For an ideal gas, we know from basic thermodynamics that

pV = nk_B T;    (175)

E = \frac{3}{2}nk_B T.    (176)

The equation of state for an ideal gas can be obtained in the following fashion. Integrating

dE = -p\,dV = -CV^{-\gamma}\,dV,    (177)

one gets

E = \frac{C}{\gamma - 1}V^{1-\gamma} = \frac{PV}{\gamma - 1} = \frac{k_B T}{\gamma - 1}.    (178)

It is now simple to see that the total energy density is

\rho = \rho_m + \rho_{k_B T} = \rho_m\left(1 + \frac{k_B T/(\gamma - 1)}{m_p c^2}\right).    (179)

———————————-

At the other extreme, for photons and ultra-relativistic particles where the rest mass makes a negligible contribution to the energy density, w = 1/3. In this case, ρ ∝ (1 + z)⁴. Thus, radiation and matter have different dependences on redshift. For radiation, the added 1 + z term can be understood physically as corresponding to the redshifting of the light. Since E ∝ ν ∝ 1/(1 + z), the energy of the received photons is a factor of 1 + z less than that of the emitted photons.

What about other equations of state described by other values of w? As we noted earlier, the special case of w = -1 is indistinguishable from a cosmological constant. More generally, let us consider constraints on arbitrary values of w. If we consider that the adiabatic sound speed for a fluid is

v_s = \left(\frac{\partial p}{\partial\rho}\right)^{1/2} = (wc^2)^{1/2},    (180)

[where the equation is for the condition of constant entropy] then we see that for w > 1 the sound speed is greater than the speed of light, which is unphysical. Thus, we require that w < 1. All values less than one are physically possible. The range 0 ≤ w ≤ 1 is called the Zel'dovich interval. This interval contains the full range of matter- to radiation-dominated equations of state (0 ≤ w ≤ 1/3) as well as any other equations of state where the pressure increases with the energy density. Exploring equations of state with w < -1 is currently a hot topic in cosmology as a means of distinguishing exotic dark energy models from a cosmological constant. Additionally, in the above discussion we have generally made the approximation that w is independent of time. For the ideal gas case, which depends upon temperature, this is not the case, since the temperature will change with the expansion. More generally, for the negative w cases there is also a great deal of effort being put into models where w varies with time. We will likely talk more about these topics later in the semester.
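Before returning to the Friedmann equation, a couple of lines of code recap equation (172) for the standard constant-w cases:

```python
def rho_ratio(z, w):
    # Eq. (172): rho/rho0 = (1 + z)^{3(1+w)} for a constant equation of state w
    return (1.0 + z)**(3.0 * (1.0 + w))

for w, name in [(0.0, "matter (dust)"), (1/3, "radiation"), (-1.0, "Lambda")]:
    print(f"{name:15s} rho(z=3)/rho0 = {rho_ratio(3.0, w):6.1f}")
# matter: 64, radiation: 256, cosmological constant: 1 (constant density)
```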

11 Back to the Friedmann Equation

For now, let us return to the present topic, which is the Friedmann equation in terms of observable quantities. What is the appropriate expression for ρ/ρ0 that we should insert into the equation? Well, we know that in general the universe can include multiple constituents with different densities and equations of state, so the E(z) expression in the Friedmann equation should really be expressed as a summation of all these components,

E(z) = [Σ_i Ω_{0i} (1 + z)^{3(1+w_i)} + (1 − Σ_i Ω_{0i})(1 + z)^2]^{1/2}.  (181)

To be more concrete, if we consider the main components to be matter (Ω_{0M}), radiation (Ω_{0r}), neutrinos (Ω_{0ν}), a cosmological constant (Ω_{0Λ}), and any unknown exotic component (Ω_{0X}), then the equation becomes

E(z) = [Ω_{0M}(1 + z)^3 + Ω_{0r}(1 + z)^4 + Ω_{0ν}(1 + z)^4 + Ω_{0Λ} + Ω_{0X}(1 + z)^{3(1+w_X)} + Ω_k(1 + z)^2]^{1/2},  (182)

where

Ω_k = 1 − Ω_{0M} − Ω_{0r} − Ω_{0ν} − Ω_{0Λ} − Ω_{0X}.  (183)

When people talk about dark energy, they're basically suggesting replacing the Ω_{0Λ} term with the Ω_{0X} term with −1 < w_X < 0. The radiation and neutrino densities are currently orders of magnitude lower than the matter density, so in most textbooks you will see the simpler expression

E(z) = [Ω_{0M}(1 + z)^3 + Ω_{0Λ} + (1 − Ω_{0M} − Ω_{0Λ})(1 + z)^2]^{1/2}.  (184)

The expression for E(z) can be considered the fundamental component of the Friedmann equation upon which our measures of distance and the evolution of other quantities will be based. So, given the above expression for E(z) (whichever you prefer), what does this tell us about all the other possible observable quantities? We have already seen that

H = H_0 E(z);  (185)
ρ = ρ_0 (1 + z)^{3(1+w)}.  (186)
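Everything that follows reduces to integrals over E(z), so it is worth having a numerical version in hand. A minimal sketch (Python/numpy; the function name is mine, and the default parameters are the concordance values quoted later):

    import numpy as np

    def E(z, Om0=0.27, OL0=0.73, Or0=0.0):
        """Dimensionless Hubble parameter E(z) = H(z)/H0, following eq. (182),
        with the curvature term fixed by Omega_k = 1 - Om0 - Or0 - OL0."""
        Ok0 = 1.0 - Om0 - Or0 - OL0
        z = np.asarray(z, dtype=float)
        return np.sqrt(Om0 * (1 + z)**3 + Or0 * (1 + z)**4 + OL0 + Ok0 * (1 + z)**2)

    print(E([0.0, 1.0, 1100.0]))   # E(0) = 1 by construction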

12 Distances, Volumes, and Times

Cosmography is the measurement of the Universe. We’re now ready to take a look at how we can measure various distances and times.

12.1 Hubble Time and Hubble Distance

The simplest time that we can define is the Hubble time,

t_H = 1/H_0,  (188)

which is roughly (actually slightly greater than) the age of the universe. The simplest distance that we can define is the Hubble distance, the distance that light travels in a Hubble time,

D_H = c t_H = c/H_0.  (189)

12.2 Radial Comoving Distance

Now, if we want to know the radial (line-of-sight) comoving distance between ourselves and an object at redshift z,

D_C ≡ ∫_0^r̃ dr̃/√(1 − kr̃^2) = ∫_0^t c dt/a,  (190)

D_C = c ∫_0^a da/(a da/dt) = c ∫_0^a da/(a ȧ) = c ∫_0^a da/(a^2 H(z)),  (191)

and using a = (1 + z)^{−1} and da = −dz/(1 + z)^2,

D_C = ∫_0^z c dz/H(z) = (c/H_0) ∫_0^z dz/E(z),  (192)

D_C = D_H ∫_0^z dz/E(z).  (193)

This can also be derived directly from Hubble's law, v = Hd. Recalling that v = cz, for a small distance change ∆d,

∆v = c∆z = H∆d,  (194)

D_C = ∫ dd = ∫_0^z c dz/H = D_H ∫_0^z dz/E(z).  (195)

We shall see below that all other distances can be expressed in terms of the radial comoving distance. Finally, note that at the start of this section we used

D_C = ∫_0^r̃ dr̃/√(1 − kr̃^2),  (196)

which relates D_C to r̃. We could just as easily have used

D_C = ∫_0^r dr/(1 + (1/4)kr^2),  (197)

or

D_C = ∫_0^r̄ dr̄ = r̄.  (198)

The important thing is to be consistent in your definition of r when relating to other quantities!
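Eq. (193) is a one-dimensional integral and easy to evaluate numerically. A sketch under illustrative assumptions (flat concordance parameters; H0 = 70 km/s/Mpc is my choice, not the notes'):

    import numpy as np
    from scipy.integrate import quad

    C_KM_S = 2.99792458e5   # speed of light [km/s]

    def D_C(z, H0=70.0, Om0=0.27, OL0=0.73):
        """Radial comoving distance, eq. (193): D_C = D_H * Integral_0^z dz'/E(z'), in Mpc."""
        E = lambda zp: np.sqrt(Om0 * (1 + zp)**3 + OL0 + (1 - Om0 - OL0) * (1 + zp)**2)
        DH = C_KM_S / H0   # Hubble distance [Mpc]
        integral, _ = quad(lambda zp: 1.0 / E(zp), 0.0, z)
        return DH * integral

    print(f"D_C(z=1) = {D_C(1.0):.0f} Mpc")   # ~3300 Mpc for these parameters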

12.3 Transverse Comoving Distance

Now consider two events at the same redshift that are separated by some angle δθ. The comoving distance between these two objects, known as the transverse comoving distance or the proper motion distance, is defined by the coefficient of the angular term in the RW metric. Using the r̄ version of the metric,

D_M = sin(√k r̄)/√k = sin(√k D_C)/√k,  (199)

which for the three cases of curvature corresponds to

D_M = sinh D_C;  k = −1, Ω_k > 0  (200)
D_M = D_C;  k = 0, Ω_k = 0  (201)

D_M = sin D_C;  k = +1, Ω_k < 0  (202)

Note that David Hogg posted a nice set of notes about distance measures in cosmology, which are widely used, on astro-ph (astro-ph/9905116). In these notes, he instead recasts the equations in terms of Ω_k and D_H, giving the transverse comoving distance as:

D_M = (D_H/√Ω_k) sinh(√Ω_k D_C/D_H);  k = −1, Ω_k > 0;  (203)
D_M = D_C;  k = 0, Ω_k = 0;  (204)
D_M = (D_H/√|Ω_k|) sin(√|Ω_k| D_C/D_H);  k = +1, Ω_k < 0;  (205)

which is equivalent to our formulation above.
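In code, Hogg's form of eqs. (203)-(205) is a three-way branch. A sketch (the arguments are quantities computed earlier, e.g. with the D_C helper above):

    import numpy as np

    def D_M(DC, DH, Ok0):
        """Transverse comoving distance from the radial one, eqs. (203)-(205)."""
        if Ok0 > 0:                      # k = -1, open
            s = np.sqrt(Ok0)
            return (DH / s) * np.sinh(s * DC / DH)
        if Ok0 < 0:                      # k = +1, closed
            s = np.sqrt(-Ok0)
            return (DH / s) * np.sin(s * DC / DH)
        return DC                        # k = 0, flat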

12.4 Angular Diameter Distance

The angular diameter distance relates an object's physical transverse size to its angular size. It is defined such that for a rod with proper length l,

l = a [sin(√k r̄)/√k] dθ ≡ D_A dθ,  (207)

or

D_A = a sin(√k r̄)/√k = D_M/(1 + z).  (208)

Note that we are using proper distance, because physically we typically care about the actual size of the observed source (say the size of a star-forming region or galaxy) rather than some comoving scale. It is of interest to note that the angular diameter distance does not increase indefinitely. At large redshift the (1 + z)^{−1} term dominates and the angular diameter distance decreases (so the angular size of a fixed rod increases). In practice, the maximum of D_A occurs at z ∼ 1 for the observed cosmological parameters.

12.5 Comoving Area and Volume

It is also often of interest to measure volumes so that one can determine the density of the objects being observed (e.g. galaxies or clusters). In this instance, what one typically cares about is the comoving volume, since you want to know how the population is evolving (and hence how the comoving density is changing) rather than how the scale factor is changing the proper density. The differential comoving volume is simply the product of the differential comoving area and the comoving radial extent of the volume element,

dV_C = dA_C dD_C.  (209)

The comoving area is simply defined from the solid angle term of the RW metric,

dA_C = [sin(√k r̄)/√k]^2 sin θ dθ dφ,  (210)
dA_C = [sin(√k r̄)/√k]^2 dΩ,  (211)
dA_C = D_M^2 dΩ.  (212)

Using the above information in the volume relation, we get

dV_C = D_M^2 dΩ [D_H dz/E(z)] = D_A^2 (1 + z)^2 D_H dΩ dz/E(z),  (214)

or

dV_C/(dΩ dz) = D_H (1 + z)^2 D_A^2/E(z) = D_H D_M^2/E(z).  (215)

The integral over the full sky, out to redshift z, gives the total comoving volume within that redshift. It will likely be a homework assignment for you to derive an analytic solution for this volume and plot it for different values of Ω_0 and Ω_Λ.
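Eq. (215) then takes one line of code. A sketch, written in terms of quantities computed with the helpers above (all distances in Mpc):

    def comoving_volume_element(DM, Ez, DH):
        """dV_C/(dOmega dz) = D_H D_M^2 / E(z), eq. (215).
        Units: Mpc^3 per steradian per unit redshift."""
        return DH * DM**2 / Ez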

12.6 Luminosity Distance

OK – so at this point we have a means of measuring comoving distances and volumes and figuring out how large something is. What about figuring out the luminosity of a source? The luminosity distance to an object is defined such that the observed flux, f, is

f = L/(4πD_L^2),  (216)

just as in the Euclidean case, where L is the bolometric luminosity of the source. Now, looking at this from a physical perspective, the flux is going to be the observed luminosity divided by the area of a spherical surface passing through the observer. This sphere should have area 4π(a_0 r̄)^2 = 4πr̄^2. Additionally, the observed luminosity differs from the intrinsic luminosity of the source. During their flight the photons are redshifted by a factor of (1 + z) — so the energy is decreased by this factor — and time dilation also dilutes the incident flux by a factor of (1 + z), since δt_0 = (1 + z)δt. The net effect then is that

f = L_obs/(4πr̄^2) = L(1 + z)^{−2}/(4πr̄^2),  (217)

or

D_L = r̄(1 + z) = D_M(1 + z) = D_A(1 + z)^2.  (218)

Note the very different redshift dependences of the angular diameter and luminosity distances. While the angular diameter distance eventually decreases, the luminosity distance is monotonic. This is good, as otherwise the flux could diverge at large redshift!
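The remaining distances are simple rescalings of D_M, so a sketch of eqs. (208) and (218) is short; the turnover of D_A claimed above is easy to verify by evaluating it on a grid of redshifts with the earlier helpers:

    def D_A(z, DM):
        """Angular diameter distance, eq. (208)."""
        return DM / (1.0 + z)

    def D_L(z, DM):
        """Luminosity distance, eq. (218)."""
        return DM * (1.0 + z)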

12.7 Flux from a Fixed Passband: k-corrections

On a related practical note, the luminosity distance above is defined for a bolometric luminosity. In astronomy, one always observes the flux within some fixed passband. For any spectrum the differential flux f_ν, which is the flux at frequency ν within a passband of width δν, is related to the differential luminosity L_ν by

f_ν = (∆ν′/∆ν) (L_{ν′}/L_ν) [L_ν/(4πD_L^2)],  (219)

where ν′ is the emitted frequency and is related to ν by ν′ = (1 + z)ν. Similarly, L_{ν′} is the emitted luminosity at frequency ν′. The first term in the expression accounts for the change in the width of the passband due to the redshift. Consider two emitted frequencies ν′_1 and ν′_2. These are related to the observed frequencies by

ν′_1 = (1 + z)ν_1;  (220)
ν′_2 = (1 + z)ν_2;  (221)
ν′_1 − ν′_2 = (1 + z)(ν_1 − ν_2);  (222)
∆ν′/∆ν = (1 + z).  (223)

The second term accounts for the fact that you are looking at a different part of the spectrum than you would be in the rest frame. This quantity will be one for a source with a flat spectrum. Thus, the expression for the observed flux is

f_ν = (1 + z) [L_{ν(1+z)}/L_ν] [L_ν/(4πD_L^2)].  (224)

It is worth noting that it is common practice in astronomy to look at the quantity νf_ν, because this eliminates the (1 + z) redshifting of the passband. Since ν = ν_e/(1 + z),

ν f_ν = ν_e L_{ν_e}/(4πD_L^2),  (225)

where νe = ν(1 + z) is the emitted frequency.
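As a concrete illustration (my example, not the notes'): for a power-law spectrum L_ν ∝ ν^{−α}, we have L_{ν(1+z)}/L_ν = (1 + z)^{−α}, so eq. (224) becomes f_ν = (1 + z)^{1−α} L_ν/(4πD_L^2). For α = 1 the passband stretching and the spectral shift cancel exactly, and no k-correction is required.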

12.8 Lookback time and the age of the Universe

Equivalently to asking how far away an object lies, one can ask how long ago the observed photons left that object. This quantity is called the lookback time. The definition of the lookback time is straightforward,

t_L = ∫_t^{t_0} dt = ∫_a^{a_0} da/ȧ = ∫_a^{a_0} da/(aH(z)),  (226)

t_L = ∫_0^z dz/[(1 + z) H_0 E(z)] = t_H ∫_0^z dz/[(1 + z)E(z)].  (227)

The complement of the lookback time is the age of the universe at redshift z, which is simply the integral from z to infinity of the same quantity,

t_U = t_H ∫_z^∞ dz/[(1 + z)E(z)].  (228)

12.9 Surface Brightness Dimming

While we are venturing into the realm of observable quantities, another quantity that is of particular relevance to observers is the surface brightness of an object, which is the flux per unit solid angle. In the previous sections we have just seen that for a source of a given luminosity f ∝ D_L^{−2}, and for a source of a given size,

dΩ = dθ dφ ∝ D_A^{−2}.  (229)

We also know that D_L = D_A(1 + z)^2. From this information one can quickly show that

Σ = f/dΩ ∝ (D_A/D_L)^2 ∝ (1 + z)^{−4}.  (230)

The above equation, which quantifies the effect of cosmological surface brightness dimming, is an important result. It says that the observed surface brightness of objects must decrease very rapidly as one moves to high redshift purely due to cosmology, and that this effect is completely independent of the cosmological parameters.
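To attach a number to this (my example): at z = 1, eq. (230) corresponds to a factor of 2^4 = 16 drop in surface brightness, or 2.5 log_10 16 ≈ 3 magnitudes per unit solid angle — before any evolution of the sources themselves.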

12.10 The Deceleration Parameter

There is one additional quantity that should be mentioned in this section, which is primarily of historical significance, but also somewhat useful for physical intuition. There was a period during the mid-20th century when observational cosmology was considered essentially a quest for two parameters: the Hubble constant (H_0) and the deceleration parameter (q_0). The idea was that measurement of the instantaneous velocity and deceleration at the present time would completely specify the time evolution. The deceleration parameter is defined by

q_0 = −ä_0 a_0/ȧ_0^2 = −ä_0/(H_0^2 a_0),  (231)

a = a_0 [1 + H_0(t − t_0) − (1/2) q_0 H_0^2 (t − t_0)^2 + ...].  (232)

For a general Friedmann model, the deceleration parameter is given by

q_0 = Ω_0/2 − Ω_Λ.  (233)

13 The Steady-State Universe

Although we won’t go into this topic during the current semester, it is worth pointing out that there have been proposed alternatives to the standard cosmological model that we have presented thus far. One that is of particular historical interest is the “steady-state” universe. The steady-state universe follows from the perfect cosmological principle, which states that the universe is isotropic and homogeneous in time as well as space. This means that all observable quantities must be constant in time, and that all observers must observe the same properties for the universe no matter when or where they live. It does not mean that the universe is motionless – a flowing river or glacier has motion but does not change with time (global warming aside). The expansion of the universe implies that the scale factor (which is not itself a directly observable quantity) must increase with time. The metric must again be the RW metric because the cosmological principle is contained within the perfect cosmological principle. For the steady-state universe, the curvature must be k = 0. Otherwise, the three dimensional spatial curvature (ka−2), which is an observable quantity, varies with time as a changes. Similarly, the Hubble parameter must be a true constant, which implies that a = exp [H(t t0)] , (234) a0 −

and the metric must be

ds^2 = c^2 dt^2 − e^{2Ht} [dr^2 + r^2 (dθ^2 + sin^2 θ dφ^2)].  (235)

Note that for the steady-state universe,

q = q_0 = −ä_0 a_0/ȧ^2 = −1.  (236)

The mean density of the universe is also observable, which requires that ρ is constant, even though the universe is expanding. This requires the continuous creation of matter at a uniform rate per unit volume that just counterbalances the effect of the expansion,

a^{−3} d(ρa^3)/dt = 3ρH ∼ 3 × 10^{−47} g cm^{−3} s^{−1}.  (237)

In this model galaxies are constantly forming and evolving in such a way that the mean observed properties do not change. Usually creation of protons and electrons was assumed, but in principle the created matter could have been anything. Continuous creation of neutrons, the so-called hot steady-state model, was ruled out because it predicted too large an X-ray background via n → p + e^− + ν̄_e + γ. Hoyle considered a modified version of GR that no longer conserves mass, and he found a way to obtain the steady-state universe with ρ = ρ_crit. This model is of course only of historical interest – it was originally proposed by Bondi and Gold in 1948, when H_0 was thought to be an order of magnitude larger than the currently accepted value. The larger value, and hence younger age of the Universe, resulted in the classic age problem, in which the Universe was younger than some of the stars it contains. The discovery of the black-body microwave background proved to be the fatal blow for this model.

14 Horizons

The discussion of lookback time naturally leads to the issue of horizons – how far we can see. There are two kinds of horizons of interest in cosmology. One represents a horizon of events, while the other represents a horizon of world lines. The event horizon is the boundary of the set of events from which light can never reach us. You are all probably familiar with the term event horizon in the context of black holes. In the cosmological context, event horizons arise because of the expansion of the universe. In a sense, the universe is expanding so fast that light will never get here. The other type of horizon, the particle horizon, is the boundary of the set of events from which light has not yet had time to reach us. Consider first event horizons. Imagine a photon emitted towards us at (t_1, r_1). This photon travels on a null geodesic,

∫_{t_1}^t c dt/a = ∫_r^{r_1} dr/(1 + kr^2/4).  (238)

As t increases, the distance r will decrease as the photon gets closer. The photon lies outside the event horizon if r > 0 at t = ∞ – i.e. if the light never reaches us. Put differently, if

∫_{t_1}^∞ c dt/a = ∞,  (239)

then light can reach everywhere, so there is no event horizon, and so an event horizon exists if and only if

∫_{t_1}^∞ c dt/a < ∞.  (240)

Note that for a closed universe, which recollapses, the upper limit is usually set to t_crunch, the time when the universe has recollapsed. The hypersurface corresponding to the event horizon is

∫_{t_1}^∞ c dt/a = ∫_0^{r_1} dr/(1 + kr^2/4).  (241)

For the Einstein-de Sitter universe, where k = 0 and Λ = 0, there is no event horizon. Intuitively, this should make sense because the Einstein-de Sitter universe expands forever, but with an expansion rate asymptotically approaching zero. On the other hand, the steady-state universe does have one, with

r_1 = (c/H) e^{−Ht_1}.  (242)

This can be seen by noting that

∫_{t_1}^∞ c dt/a = c ∫_{t_1}^∞ e^{−Ht} dt = (c/H) e^{−Ht_1}.  (243)

∫_0^t c dt/a < ∞.  (244)

For the steady-state universe, the lower limit of the time integral should be −∞, and it is clear that this universe does not, in fact, have a particle horizon. The Einstein-de Sitter universe, for which a ∝ t^{2/3} (which can be derived going back to the section on the lookback time), does have a particle horizon at

r_ph ∝ ∫_0^t c dt/t^{2/3} ∝ 3ct^{1/3}.  (245)

Hence one measure of the physical size of the particle horizon at any time t is the proper distance a r_ph = 3ct. All non-empty isotropic general relativistic cosmologies have a particle horizon. Horizons have a host of interesting properties, some of which are listed below:

1. If there is no event horizon, any event can be observed at any other event.

2. Every galaxy within the event horizon must eventually pass out of the event horizon. This must be true because equation 241 is a monotonically decreasing function.

3. In big bang models, particles crossing the particle horizon are seen initially with infinite redshift since the emission occurred at a(t_emission) = 0.

4. If both an event horizon and a particle horizon exist, they must eventually cross each other. Specifically, at some time t, the size of the event horizon corresponding to those events occurring at time t will equal the size of the particle horizon. This can be seen as a natural consequence of the previous statements, as the event horizon shrinks with time while the particle horizon grows.

15 Exploring the Friedmann Models

Having derived a general description for the evolution of the universe, let us now explore how that time evolution depends upon the properties of the universe. Specifically, let us explore the dependence upon the density of the different components and the presence (or absence) of a cosmological constant. In all cases below we will consider only single-component models. Before we begin though, let us return for a moment to a brief discussion from an earlier lecture. We discussed that for Λ = 0 the curvature alone determines the fate of the universe. For a universe with positive curvature, gravity eventually reverses the expansion and the universe recollapses. For a universe with zero curvature, gravity is sufficient to asymptotically halt the expansion, but the universe never recollapses. Meanwhile, for a universe with negative curvature the expansion slows but never stops (analogous to a rocket with velocity greater than escape velocity). In the case of a cosmological constant, the above is no longer true. Geometry alone does not determine the destiny of the universe. Instead, since the cosmological constant dominates at late times, the sign of the cosmological constant determines the late-time evolution (with the exception of cases where the matter density is >> the critical density and the universe recollapses before the cosmological constant has any effect). A positive cosmological constant ensures eternal expansion; a negative cosmological constant leads to eventual recollapse. This can be seen in a figure that I will show (showed) in class.

15.1 Empty Universe

A completely empty universe with Λ = 0 has the following properties:

H = H0(1 + z) (246)

q = q_0 = 0  (247)
t_0 = H_0^{−1}  (248)

Such a universe is said to be "coasting" because there is no gravitational attraction to decelerate the expansion. In contrast, an empty universe with Ω_Λ = 1 has:

H = H_0  (250)
q_0 = −1  (251)

This universe, which has an accelerating expansion, can be considered the limiting case at late times for a universe dominated by a cosmological constant.

15.2 Einstein - de Sitter (EdS) Universe

The Einstein - de Sitter universe, which we have discussed previously, is a flat model in which Ω_0 = 1. By definition, this universe has a Euclidean geometry and the following properties:

H = H_0 (1 + z)^{3(1+w)/2}  (253)

q = q_0 = (1 + 3w)/2  (254)
t_0 = 2/[3(1 + w)H_0]  (255)
ρ = ρ_{0c} (t/t_0)^{−2} = 1/[6π(1 + w)^2 G t^2].  (256)

The importance of the EdS model is that at early times all Friedmann models with w > −1/3 are well-approximated as an EdS model. This can be seen in a straightforward fashion by looking at E(z),

E(z) ≡ [Ω_0 (1 + z)^{3(1+w)} + Ω_Λ + (1 − Ω_0 − Ω_Λ)(1 + z)^2]^{1/2}.  (258)

As z → ∞, the cosmological constant and curvature terms become unimportant as long as w > −1/3.

15.3 Concordance Model

Current observations indicate that the actual Universe is well-described by a spatially flat, dust-filled model with a non-zero cosmological constant. Specifically, the data indicate that Ω_0 ≈ 0.27 and Ω_Λ ≈ 0.73. This particular model has

q_0 = −0.6  (259)
t_0 ≈ t_H.  (260)

(The value of q_0 follows directly from eq. 233: 0.27/2 − 0.73 ≈ −0.6.) It is interesting that this model yields an age very close to the Hubble time (consistent to within the observational uncertainties), as this is not a generic property of spatially flat

Table 1. Comparison of Different Cosmological Models

Name                   Ω_0    Ω_Λ    t_0         q_0

Einstein-de Sitter     1      0      (2/3) t_H   1/2
Empty, no Λ            0      0      t_H         0
Example Open           0.3    0      0.82 t_H    0.15
Example Closed         2      0      0.58 t_H    1
Example Flat, with Λ   0.5    0.5    0.84 t_H    −0.25
Concordance            0.27   0.73   ≈1.00 t_H   −0.6
Steady State           1      0      ∞           −1

Note. — The ages presume a matter-dominated (dust) model.

models with a cosmological constant (see Table 1). I have not seen any discussion of this "coincidence" in the literature. It is also worth pointing out that the values cited assume the presence of a cosmological constant (i.e. w = −1) rather than some other form of dark energy.
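The t_0 column of Table 1 is just eq. (228) evaluated at z = 0, so it is easy to reproduce numerically. A sketch for the dust + Λ models (my implementation, to be compared against the table):

    import numpy as np
    from scipy.integrate import quad

    def age_in_hubble_times(Om0, OL0):
        """t_0/t_H = Integral_0^inf dz / [(1+z) E(z)] for a dust + Lambda model."""
        Ok0 = 1.0 - Om0 - OL0
        E = lambda z: np.sqrt(Om0 * (1 + z)**3 + OL0 + Ok0 * (1 + z)**2)
        t0, _ = quad(lambda z: 1.0 / ((1 + z) * E(z)), 0.0, np.inf)
        return t0

    for name, Om0, OL0 in [("Einstein-de Sitter", 1.0, 0.0), ("Open", 0.3, 0.0),
                           ("Flat, with Lambda", 0.5, 0.5), ("Concordance", 0.27, 0.73)]:
        print(f"{name:20s} t0 = {age_in_hubble_times(Om0, OL0):.2f} tH")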

15.4 General Behavior of Different Classes of Models

We have talked about the time evolution of the Λ = 0 models, and have also talked about the accelerated late-time expansion in Λ > 0 models. Figure ?? shows the range of expansion histories that can occur once one includes a cosmological constant. Of particular note are the so-called "loitering" models. In these models, the energy density is sufficient to nearly halt the expansion, but right at the point where the cosmological constant becomes dominant. Essentially, the expansion rate temporarily drops to near zero, followed by a period of accelerated expansion that at late times looks like the standard Λ-dominated universe. What this means is that a large amount of time corresponds to a narrow redshift interval, so observationally there exists a preferred redshift range during which a great deal of evolution (stellar/galaxy) occurs. Having a loitering period in the past requires that Ω_{0Λ} > 1, and therefore is not consistent with the current data. Finally, to gain a physical intuition for the different types of models, there is a nice javascript application at http://www.jb.man.ac.uk/~jpl/cosmo/friedman.html.

16 Classical Cosmological Tests

Week 4 Reading Assignment: §4.7

All right. To wrap up this section of the class, it's time to talk for a little while about something fun – classical cosmological tests, which are basically the application of the above theory to the real Universe. There are three fundamental classical tests that have been used with varying degrees of success: number counts, the angular size - redshift relation, and the magnitude - redshift relation.

16.1 Number Counts

The basic idea here is that the volume is a function of the cosmological parameters, and therefore for a given class of objects the redshift distribution, N(z), will depend upon Ω_0 and Ω_Λ. To see this, let us try a simple example. Assume that we have a uniformly distributed population of objects with mean density n_0. Within a given redshift interval dz, the differential number of these objects (dN) is given by

dN = n × dV_P,  (261)

where the proper density n and the proper volume element dVP are given by

n = n_0 (1 + z)^3  (262)
dV_P = A dD_P = [D_P^2 dΩ/(1 + z)^2] [c dz/((1 + z)H)] = D_P^2 dΩ c dz/[(1 + z)^3 H_0 E(z)],  (263)

where D_P is the proper motion (transverse comoving) distance D_M defined earlier. Inserting this into the above definition,

dN/dz = n (dV_P/dz) = dΩ n_0 c D_P^2/(H_0 E(z)),  (264)

or

dN/dz = n_0 c dΩ (1 + z)^2 D_A^2/(H_0 E(z)) = n_0 c dΩ D_C^2/(H_0 E(z)),  (265)

where the last equality holds for a flat universe. The challenge with this test, as with all the others, is finding a suitable set of sources that either do not evolve with redshift, or evolve in a way that is physically well understood.
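A numerical sketch of eq. (265) for a flat universe (the value of n_0 is arbitrary and just sets the normalization):

    import numpy as np
    from scipy.integrate import quad

    C_KM_S = 2.99792458e5

    def dN_dz(z, n0=1e-3, H0=70.0, Om0=0.27, OL0=0.73, dOmega=4 * np.pi):
        """dN/dz = n0 c dOmega D_C^2 / (H0 E(z)), eq. (265), flat universe.
        n0 in Mpc^-3; returns objects per unit redshift over solid angle dOmega."""
        E = lambda zp: np.sqrt(Om0 * (1 + zp)**3 + OL0)
        DH = C_KM_S / H0
        DC = DH * quad(lambda zp: 1.0 / E(zp), 0.0, z)[0]
        return n0 * dOmega * DH * DC**2 / E(z)

    print(f"dN/dz at z=1: {dN_dz(1.0):.3e}")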

16.2 Angular Size - Redshift Relation

If one has a measuring stick with fixed proper length, then measuring the angular diameter versus redshift is an obvious test of geometry and expansion. We have already seen (and I will show again in class) that the angular diameter distance has an interesting redshift dependence, and is a function of the combination of Ω_0 and Ω_Λ. A comparison with observations should in principle directly constrain these two parameters.

In practice, there are several issues that crop up which make this test difficult. First, there is the issue of defining your standard ruler, as most objects (like galaxies) evolve significantly over cosmologically interesting distance scales. I will leave the discussion of possible sources and systematic issues to the observational cosmology class, but will note that there is indeed one additional fundamental concern. Specifically, the relation that we derived is valid for a homogeneous universe. In practice, we know that the matter distribution is clumpy. We will not go into detail on this issue, but it is worth pointing out that gravitational focusing can flatten out the angular size - redshift relation.

16.3 Alcock-Paczynski Test

The Alcock-Paczynski (Alcock & Paczynski 1979) test perhaps should not be included in the "classical" section since it is relatively modern, but I include it here because it is another geometric test in the same spirit as the others. The basic idea here is as follows. Assume that at some redshift you have a spherical source (the original proposal was a galaxy cluster). In this case, the proper distance measured along the line of sight should be equal to the proper distance measured in the plane of the sky. If one inputs incorrect values for Ω_0 and Ω_Λ, then the sphere will appear distorted in one of the two directions. Mathematically, the sizes are

LOS size = dD_C/(1 + z) = c dz/[(1 + z)H_0 E(z)],  (266)
Angular size = D_A dθ = D_M dθ/(1 + z),  (267)

so

c dz/dθ = H_0 E(z) D_M,  (268)

or in the more standard form

(1/z)(dz/dθ) = H_0 E(z) D_M/(cz).  (269)

In practice, the idea is to average over a number of sources that you expect to be spherical such that the relation holds in a statistical sense. This test remains of interest in a modern context, primarily in the application of measuring the mean separations between an ensemble of uniformly distributed objects (say galaxies). In this case, the mean separation in redshift and mean angular separation should again satisfy the above relation.

16.4 Magnitude - Redshift Relation

The magnitude - redshift relation utilizes the luminosity distance to constrain the combination Ω_0 − Ω_Λ. We have seen that the incident bolometric flux from a source is described by

f = L/(4πD_L^2),  (270)

Figure 6: Pretend that there is a figure here showing the SN mag-z relation.

which expressed in astronomical magnitudes (m ∝ −2.5 log f) becomes

m = M + 2.5 log(4π) + 5 log D_L(z, Ω_0, Ω_Λ),  (271)

where the redshift dependence and its sensitivity to the density parameters is fully encapsulated in D_L. This test requires that you have a class of sources with the same intrinsic luminosity at all redshifts – so-called "standard candles". Furthermore, real observations are not bolometric, which means that you must include passband effects and k-corrections. The basic principle is however the same. The greatest challenge associated with this test lies in the identification of well-understood standard candles, and the history of attempted application of this method is both long and interesting. Early attempts included a number of different sources, with perhaps the most famous being brightest cluster galaxies. Application of the magnitude-redshift relation to type Ia supernovae provided the first evidence for an accelerated expansion, and remains a cosmological test of key relevance. It is hoped that refinement of the supernovae measurements, coupled with other modern cosmological tests, will also provide a precision constraint upon w. Achieving this goal will require addressing a number of systematics, including some fundamental issues like bias induced by gravitational focusing of supernovae. These issues are left for the observational cosmology course. It is interesting to note though that this test took the better part of a century to yield meaningful observational constraints!
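In practice one plots the distance modulus m − M = 5 log10(D_L/10 pc) against redshift. A sketch for flat models (this uses the standard distance-modulus constant rather than the form of eq. 271):

    import numpy as np
    from scipy.integrate import quad

    C_KM_S = 2.99792458e5

    def distance_modulus(z, H0=70.0, Om0=0.27, OL0=0.73):
        """m - M = 5 log10(D_L / 10 pc), with D_L = (1+z) D_C for a flat model."""
        E = lambda zp: np.sqrt(Om0 * (1 + zp)**3 + OL0)
        DH = C_KM_S / H0                                     # Mpc
        DC = DH * quad(lambda zp: 1.0 / E(zp), 0.0, z)[0]    # Mpc
        DL = (1.0 + z) * DC                                  # Mpc
        return 5.0 * np.log10(DL * 1.0e6 / 10.0)             # Mpc -> pc, then /10 pc

    # A Lambda-dominated model predicts fainter (larger m-M) SNe than EdS at z ~ 0.5:
    print(distance_modulus(0.5), distance_modulus(0.5, Om0=1.0, OL0=0.0))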

17 The Hot Big Bang Universe: An overview

Up to this point we have been concerned with the geometry of the universe and measuring distances within the universe. For the next large section of the course we are going to turn our attention to the evolution of matter in the universe. Before delving into details though, let's begin with a brief overview of the time evolution of the constituent particles and fundamental forces in the Universe. If we look around at the present time, the radiation from the objects in our local part of the universe – stars, galaxies, galaxy clusters – reveals a complex network of structure. Observations of the microwave background also show that the radiation energy density, and temperature, are low. Three fundamental questions in cosmology are:

• Can we explain the observed structures in the universe in a self-consistent cosmological model?

• Can we explain the observed cosmic background radiation?

• Can we explain the abundances of light elements within the same model?

Table 2. Timeline of the Evolution of the Universe

Event                       t_U                      T_U                 Notes
Planck Time                 10^−43 s                 10^19 GeV           GR breaks down
Strong Force                10^−36 s                 10^14 GeV           GUT
Inflation                   10^−36 – 10^−32 s        –                   –
Weak Force                  10^−12 s                 –                   –
Quark-Hadron Transition     10^−5 s                  300 MeV             Hadrons form
Lepton Era                  10^−5 – 10^−2 s          130 MeV – 500 keV   e^+e^− annihilation
Nucleosynthesis             10^−2 – 10^2 s           ∼1 MeV              Light elements form
Radiation-Matter Equality   50,000 yrs (z = 3454)    9400 K              –
Recombination               372,000 yrs (z = 1088)   2970 K (0.3 eV)     CMB
Galaxy Formation            ∼10^8 yrs (z = 6–20)     50 K                –
Reionization                til now                  50 – 2.7 K          –
Present Day                 13.7 Gyrs                2.7 K (∼10^−4 eV)   –

The answer to these three questions is largely yes for a hot big bang cosmological model – coupled with inflation to address a few residual details for the second of these questions. For now, we will begin with a broad overview and then explore different critical epochs in greater detail. The basic picture for the time evolution of the universe is that of an adiabatically expanding, monotonically cooling fluid undergoing a series of transitions, with the global structure defined by the RW metric and Friedmann equations. A standard timeline denoting major events in the history of the universe typically looks something like Table 2. Note that our direct observations are limited to t > 372 kyrs, while nucleosynthesis constraints probe to t ∼ 1 s. The table, however, indicates that much of the action in establishing what we see in the present-day universe occurs at even earlier times, t < 1 s. From terrestrial experiments and the standard model of particle physics, we believe that we have a reasonable description up to ∼10^{−32} s, although the details get sketchier as we move to progressively earlier times. At t ∼ 10^{−32} s there is a postulated period of superluminal expansion (the motivations for which we will discuss later), potentially driven by a change in the equation of state. At earlier times, we expect that sufficiently high energies are reached that the strong force is unified with the weak and electromagnetic forces, and eventually a sufficiently high temperature (density) is reached that a quantum theory of gravity (which currently does not exist in a coherent form) is required. The story at early times though remains very much a speculative tale; as we shall see there are ways to avoid ever reaching the Planck density. Now, looking back at the above table, in some sense it contains several categories of

events. One category corresponds to the unification scales of the four fundamental forces. A second corresponds to the evolution of particle species with changing temperatures. This topic is commonly described as the thermal history of the universe. A third category describes key events related to the Friedmann equations (radiation-matter equality, inflation). Finally, the last few events in this table correspond to the formation and evolution of the large scale structures that we see in the universe today. Clearly each of these subjects can fill a semester (or more) by itself, and all of these "categories" are quite interdependent. We will aim to focus on specific parts of the picture that illuminate the overall evolutionary history. For now, let us begin at the "beginning".

18 The Planck Time

Definition

The Planck time (∼10^{−43} s) corresponds to the limit at which Einstein's equations are no longer valid and must be replaced with a more complete theory of quantum gravity if we wish to probe to earlier times. An often used summary of GR is that "space tells matter how to move; matter tells space how to curve". The Planck time and length essentially correspond to the point at which the two cannot be considered as independent entities. There are several ways to define the Planck time and Planck length. We will go through two. The first method starts with the Heisenberg uncertainty principle, and defines the Planck time as the point at which the uncertainty of the wavefunction is equal to the particle horizon of the universe,

∆x ∆p = l_P m_P c = ħ,  (272)

where l_P = c t_P. Now, m_P is the mass within the particle horizon, and

m_P = ρ_P l_P^3.  (273)

At early times we know that ρ ∼ ρ_c and t_U ∼ (1/2)H^{−1}, so the density can be approximated as

ρ_P ∼ ρ_c = 3H^2/(8πG) ∼ 1/(G t_P^2) ∼ c^2/(G l_P^2),  (274)

so

l_P ≃ (Għ/c^3)^{1/2} ≃ 2 × 10^{−33} cm,  (275)
t_P = l_P/c ≃ (Għ/c^5)^{1/2} ≃ 10^{−43} s,  (276)

ρ_P ≃ 1/(G t_P^2) ≃ 4 × 10^{93} g cm^{−3},  (277)

m_P ≃ ρ_P l_P^3 ≃ (ħc/G)^{1/2} ≃ 3 × 10^{−5} g,  (278)
E_P = m_P c^2 ≃ (ħc^5/G)^{1/2} ≃ 10^{19} GeV.  (279)

Additionally, we can also define a Planck temperature,

T_P ≃ E_P/k ≃ 10^{32} K.  (280)

Just for perspective, it is interesting to make a couple of comparisons. The Large Hadron Collider (LHC), which will be the most advanced terrestrial accelerator when it becomes fully operational, is capable of reaching E ∼ 7 × 10^3 GeV, or roughly 10^{−15} E_P. Meanwhile, the density of a neutron star is ρ_N ∼ 10^{14} g cm^{−3}, or roughly 10^{−79} ρ_P. The second way to think about the Planck length is in terms of the Compton wavelength and the Schwarzschild radius. To see this, consider a particle of mass m. The Compton length of the particle's wavefunction is

λ_C ≡ ħ/∆p = ħ/(mc).  (281)

The Compton wavelength in essence defines the scale over which the wavefunction is localized. Now consider the Schwarzschild radius of a body of mass m,

r_s = 2Gm/c^2.  (282)

By definition, any particle within the Schwarzschild radius lies beyond the event horizon and can never escape. The Planck length can be defined as the scale at which the above two lengths are equal. Equating the two relations, we find that the Planck mass is

m_P = (ħc/2G)^{1/2} ≃ (ħc/G)^{1/2},  (283)

and the Planck length is

l_P = (2G/c^2) m_P ≃ (Għ/c^3)^{1/2},  (284)

from which the other definitions follow. Note that if the Schwarzschild radius were less than the Compton wavelength, then this would indicate that information (and mass) can escape from within. This would be equivalent to having a naked singularity.
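The Planck scales in eqs. (275)-(280) follow from just three constants; a sketch in cgs units (the small differences from the rounded values quoted above are order-unity factors and rounding):

    hbar = 1.055e-27    # erg s
    G    = 6.674e-8     # cm^3 g^-1 s^-2
    c    = 2.998e10     # cm s^-1
    kB   = 1.381e-16    # erg K^-1

    l_P   = (G * hbar / c**3) ** 0.5    # ~1.6e-33 cm
    t_P   = l_P / c                     # ~5.4e-44 s
    m_P   = (hbar * c / G) ** 0.5       # ~2.2e-5 g
    E_P   = m_P * c**2                  # ~2.0e16 erg ~ 1.2e19 GeV
    T_P   = E_P / kB                    # ~1.4e32 K
    rho_P = m_P / l_P**3                # ~5e93 g cm^-3
    print(f"l_P = {l_P:.2e} cm, t_P = {t_P:.2e} s, T_P = {T_P:.2e} K")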

Physical Interpretation

The notion of a Big Bang singularity at t = 0 is an idea that is somewhat engrained in the common picture of the Big Bang model. In truth, we cannot presently say anything about the early universe at times smaller than the Planck time, and it is not at all clear that a complete theory of quantum gravity would lead to an initial singularity. In this light, the notion of t = 0 is indeed more a matter of philosophy than of physics. The Planck time should therefore be thought of as the age of the universe when it has the Planck density IF one uniformly extrapolates the expansion to earlier times. Moreover, it is also physically plausible that the real Universe never reaches the Planck density. Consider again the equation of state of the universe. As discussed in Chapter 2 of your book, there is no initial singularity if w < −1/3. To see this, we return to the Friedmann equations. Recall that

ä = −(4/3)πGρ [1 + 3p/(ρc^2)] a.  (285)

It is clear that

ä < 0 if p/(ρc^2) > −1/3,  (286)
ä > 0 if p/(ρc^2) < −1/3.  (287)

In the latter case, the expansion is accelerating with time, so conversely as you look back to earlier times you fail to approach an initial singularity. Fluids with w < −1/3 are considered to violate the strong energy condition. Physically, how might one violate this condition? The simplest option is to relax our implicit assumption that matter can be described as an ideal fluid. Instead, consider a generalized imperfect fluid – one which can have thermal conductivity (χ), shear viscosity (η), and bulk viscosity (ζ). We cannot introduce thermal conductivity or shear viscosity without violating the cosmological principle; however, it is possible for the fluid to have a bulk viscosity. In the Euler equation this would look like

ρ [dv/dt + (v·∇)v] = −∇p + ζ∇(∇·v).  (289)

The net effect upon the Friedmann equations (which we will not derive right now) is to replace p with an effective pressure p*,

p → p* = p − 3ζH.  (290)

With this redefinition it is possible to get homogeneous and isotropic solutions that never reach the Planck density if ζ > 0. In fact, there are actually physical motivations for having an early period of exponential growth in the scale factor, which we will discuss in the context of inflation later in the term.

Given our lack of knowledge of the equation of state close to the Planck time, the above scenario remains plausible. Having briefly looked at the earliest time, we now shift focus and will spend a while talking about the evolution from the lepton era through recombination.

19 Temperature Evolution, Recombination and Decoupling

Before exploring the thermal history of the universe in the big bang model, we first need to know how the temperature scales with redshift. This will give us our first glimpse of the cosmic microwave background. Below we are concerned with matter and radiation temperatures when the two are thermally decoupled and evolving independently.

19.1 The Adiabatic and LTE Assumptions

Throughout this course we have been making the assumption that the matter and radiation distributions are well-approximated as an adiabatically expanding ideal fluid. For much of the early history of the universe, we will also be assuming that this fluid is approximately in local thermodynamic equilibrium (LTE). It is worth digressing for a few moments to discuss why these are both reasonable assumptions.

Adiabatic Expansion
In classical thermodynamics, an expansion is considered to be adiabatic if it is "fast" in the sense that the gas is unable to transfer heat to/from an external reservoir on a timescale less than the expansion. The converse would be an isothermal expansion, in which the pressure/volume are changed sufficiently slowly that the gas can transfer heat to/from an external reservoir and maintain a constant temperature. Mathematically, the above condition for adiabatic expansion corresponds to PV^γ = constant and dE = −P dV. In the case of the universe, the assumption of adiabatic expansion is a basic consequence of having no external reservoir with which to exchange heat. Having a non-adiabatic expansion would require a means of transferring heat between our universe and some external system (an adjacent universe?). Moreover, this heat transfer would need to occur on a timescale t << t_H for the adiabatic assumption to fail. One can always postulate scenarios in which there is such a transfer (e.g. the steady-state model); however, there is no physical motivation for doing so at present. Indeed, the success of Big Bang nucleosynthesis can be considered a good argument back to t ∼ 1 s for the sufficiency of the adiabatic assumption.

Local Thermodynamic Equilibrium (LTE)
The condition of LTE implies that the processes acting to thermalize the fluid must occur rapidly enough to maintain equilibrium. In an expanding universe, this is roughly equivalent to saying that the collision timescale is less than a Hubble time, τ ≤ t_H. Equivalently, one can also say that the interaction rate, Γ ≡ nσ|v| ≥ H. Here n is the number density of particles, σ is the interaction cross-section, and |v| is the amplitude of the velocity.

Physically, the above comes from noting that T ∝ a^{−1} (which we will derive shortly) and hence Ṫ/T = −H, which says that the rate of change in the temperature is just set by the expansion rate. Once the interaction rate drops below H, the average particle has a mean free path larger than the Hubble distance, and hence that species of particle evolves independently from the radiation field henceforth. It is worth noting that one cannot assume a departure from thermal equilibrium just because a species is no longer interacting – it is possible for the temperature evolution to be the same as that of the radiation field if no additional processes are acting on either the species or the radiation field.

19.2 Non-relativistic matter

In this section we will look at the temperature evolution of non-relativistic matter in the case where the matter is decoupled from the radiation field. If we assume that the matter can be described as an adiabatically expanding ideal gas, then we know

dE = −P dV;  (291)
E = U + KE = [ρ_m c^2 + (3/2) n k_B T_m] V = [ρ_m c^2 + (3/2)(ρ_m k_B T_m/m_p)] a^3;  (292)
P = n k_B T_m = ρ_m k_B T_m/m_p,  (293)

and putting these equations together we can quickly see

d{[ρ_m c^2 + (3/2)(ρ_m k_B T_m/m_p)] a^3} = −(ρ_m k_B T_m/m_p) da^3.  (294)

Mass conservation also requires that ρ_m a^3 is constant, so

(3/2)(ρ_m k_B/m_p) a^3 dT_m = −(ρ_m k_B T_m/m_p) da^3,  (295)
dT_m/T_m = −(2/3) da^3/a^3,  (296)
T_m = T_{0m} (a_0/a)^2 = T_{0m}(1 + z)^2.  (297)

So we see that the temperature of the matter distribution goes as (1 + z)^2.

19.3 Radiation and Relativistic Matter

What about radiation? For a gas of photons, it is straightforward to derive the redshift dependence. The relation between the energy density and temperature for a blackbody is simply

ε_r ≡ ρ_r c^2 = σ_r T_r^4,  (298)

and the pressure is

p = (1/3)ρc^2 = σ_r T_r^4/3.  (299)

Note that you may have seen this before in the context of the luminosity of a star with temperature T. The quantity σ_r is the radiation density constant, which in most places you'll see written as a instead. The value of the radiation constant is

σ_r = π^2 k_B^4/(15ħ^3 c^3) = 7.6 × 10^{−15} erg cm^{−3} K^{−4}.  (300)

We know from a previous class that

ρ_r ∝ (1 + z)^4,  (301)

which tells us that

T_r = T_{0r}(1 + z).  (302)

This is true for any relativistic species, and more generally the temperature of any particle species that is coupled to the radiation field will have this dependence. One could also derive the same expression using the adiabatic expression

d(σ_r T^4 a^3) = −(σ_r T^4/3) da^3,  (303)
4T^3 dT a^3 + T^4 da^3 = −(T^4/3) da^3,  (304)
dT/T = −(1/3) da^3/a^3,  (305)
T ∝ a^{−1} ∝ (1 + z).  (306)

19.4 Temperature Evolution Prior to Decoupling

In the above two sections we have looked at the temperature evolution of non-relativistic matter and radiation when they are evolving as independent, decoupled fluids:

T_m = T_{0m}(1 + z)^2  (307)
T_r = T_{0r}(1 + z)  (308)

What about before decoupling? In this case, the adiabatic assumption becomes

d{[ρ_m c^2 + (3/2)(ρ_m k_B T_m/m_p) + σ_r T^4] a^3} = −[ρ_m k_B T_m/m_p + σ_r T^4/3] da^3.  (310)

Mass conservation (valid after freeze-out) requires that ρ_m a^3 = constant, as before. We are now going to introduce a quantity σ_rad, which will be important in subsequent discussions. We define

σ_rad = 4 m_p σ_r T^3/(3 k_B ρ_m).  (311)

Using this expression and mass conservation, the previous equation can be rewritten as

d{[(3/2)(ρ_m k_B T/m_p) + σ_r T^4] a^3} = −[ρ_m k_B T/m_p + σ_r T^4/3] da^3,  (312)
[(3ρ_m k_B/(2m_p)) + 4σ_r T^3] a^3 dT = −[ρ_m k_B T/m_p + (4/3)σ_r T^4] da^3,  (313)
(dT/T) [(3ρ_m k_B/(2m_p) + 4σ_r T^3)/(ρ_m k_B/m_p + (4/3)σ_r T^3)] = −da^3/a^3 = −3 da/a,  (314)
dT/T = −[(1 + σ_rad)/(1/2 + σ_rad)] da/a.  (315)

Now, the above is non-trivial to integrate because σ_rad is in general a function of T^3/ρ_m. Recall though, that after decoupling the radiation temperature T ∝ (1 + z), while the matter density ρ_m ∝ (1 + z)^3. In this case, we have that σ_rad is constant after decoupling. Note that I have not justified here why it is OK to use the radiation temperature, but bear with me. To zeroth order you can consider σ_rad as roughly the ratio of the radiation energy density to the matter thermal energy density (to within a constant of order unity), which would have in it T_r^4/T_m, so the temperature dependence is mostly from the radiation. Anyway, if σ_rad is constant after decoupling, then we can compute the present value and take this as also valid at decoupling. Taking T_{0r} = 2.73 K,

σ_rad(t_decoupling) ≃ σ_rad(t = t_0) ≃ 1.35 × 10^8 (Ω_b h^2)^{−1}.  (316)

This value is >> 1, which implies that to first order,

dT/T ≃ −da/a  ⟹  T = T_{0r}(1 + z).  (317)

This shows that even at decoupling, where we have non-negligible contributions from both the matter and the radiation, the temperature evolution is very well approximated as T ∝ (1 + z). At higher temperatures the matter becomes relativistic, and the temperature should therefore evolve with the same redshift dependence.
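Eq. (316) can be checked directly from eq. (311); a sketch with cgs constants (writing ρ_b as Ω_b h² times the h = 1 critical density is my parameterization):

    m_p, k_B = 1.673e-24, 1.381e-16          # g, erg/K
    sigma_r  = 7.6e-15                       # erg cm^-3 K^-4, eq. (300)
    T0r      = 2.73                          # K
    rho_c_h2 = 1.88e-29                      # g cm^-3: critical density for h = 1

    sigma_rad = 4 * m_p * sigma_r * T0r**3 / (3 * k_B * rho_c_h2)
    print(f"sigma_rad ~ {sigma_rad:.2e} / (Omega_b h^2)")   # ~1.3e8, cf. eq. (316)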

20 A Thermodynamic Digression

Before proceeding further, it is worth stepping back for a moment and reviewing some basic thermodynamics and statistical mechanics. You should be all too familiar at this point with adiabatic expansion, but we haven't yet discussed entropy, chemical potentials, or the equilibrium energy/momentum distributions of particles.

20.1 Entropy

Entropy is a fundamental quantity in thermodynamics that essentially describes the disorder of a system. The classical thermodynamic definition of entropy is given by

dS = dQ/T,  (318)

where S is the entropy and Q is the heat of the system. In a more relevant astrophysical context, the radiation entropy density for instance would be

s_r = (ρc^2 + p)/T.  (319)

The statistical mechanics definition is based instead upon the number of internal "micro-states" in a system (essentially internal degrees of freedom) for a given macro-state. As an example, consider the case of 10 coins. There is one macro-state corresponding to all heads, and also only 1 micro-state (configuration of the individual coins) that yields this macro-state. On the other hand, for the macro-state with 5 heads and 5 tails, there are C(10,5) = 252 combinations of individual coins – micro-states – that yield this single macro-state. By this Boltzmann definition, the entropy is S = k_B ln ω, where ω is the number of internal micro-states.

21 Chemical Potential

I’ve never liked the name chemical potential, as the ‘chemical’ part is mainly a historical artifact. What we’re really referring here to is the potential for electromagnetic and weak (and strong at high T) reactions between particles. In this particular instance, I’ll quote the definition of chemical potential from Kittel & Kroemer (page 118). Consider two systems that can exchange particles and energy. The two systems are in equilibrium with respect to particle exchange when the net particle flow is zero. In this case: “The chemical potential governs the flow of particles between the systems, just as the temperature govers the flow of energy. If two systems with a single chemical species are at the same temperature and have the same value of the chemical potential, there will be no net particle flow and no net energy flow between them. If the chemical potentials of the two systems are different, particles will flow from the system at higher chemical potential to the system at lower chemical potential.” In the cosmological context that we are considering, instead of looking at physically moving particles between two systems, what we are instead talking about is converting one species of particle to another. In this interpretation, what the above statement says is that species of particles are in equilibrium (chemical equilibrium, but again I loathe the terminology), then the chemical potential of a given species is related to the potentials of the other species with which it interacts. For instance, consider four species a,b,c,d that interact as a + b c + d. (320) ←→ 56 For this reaction, µa + µb = µc + µd whenever chemical equilibrium holds. For photons, µγ = 0, and indeed in for all species in the early universe it is reasonable to approximate µi = 0 (i.e. µi << kT ). This can be seen for example by considering the reaction γ + γ ↽⇀ e+ + e−. (321) If this reaction is in thermal equilibrium (i.e. prior to pair annihilation), then the chemical potential must be µ+ + µ− =0. (322) In addition, since −9 n − = n + + 10 , (323) e e ∼ we expect µ = µ + 10−9. (324) + − ∼ Hence, µ µ 0. (325) + ≃ − ≃ 21.1 Distribution Functions Long ago, in a statistical mechanics class far, far away, I imagine that most of you discussed distribution functions for particles. For species of indistinguishable particles in kinetic equilibrium the distribution of filled occupation states is given either by the Fermi-Dirac distribution (for ) or the Bose-Einstein distribution (for bosons), which are given by 1 f(p)= , (326) e(E−µ)/kT 1 ± where p in this equation is the particle momentum and E2 = p 2c2 +(mc2)2). The “+” | | corresponds to Fermions and the “-” corresponds to bosons. Physically the that the equation is different for the two types of particles is due to their intrinsic properties. Fermions, which have half integer spin, obey the Pauli exclusion principle, which means that no two identical fermions can occupy the same quantum state. Bosons, on the other hand, do not obey the Pauli exclusion principle and hence multiple bosons can occupy the same quantum state. This is a bit of a digression at this point, so I refer the reader to Kittel & Kroemer for a more detailed explanation. Note that the above distribution functions hold for indistinguishable particles. By defini- tion, particles are indistinguishable if their wavefunctions overlap. Conversely, are considered distinguishable if their physical separation is large compared to their De Broglie wavelength. 
In the classical limit of distinguishable particles, the appropriate distribution function is the Boltzmann distribution function,

f(p) = e^{−(E−µ)/kT},  (327)

which can be seen to be the limiting case of the other distributions when E − µ >> kT. Given the distribution function of a species, the number density is

n = (g/h^3) ∫ f(p) d^3p.  (328)

In the above equation, the quantity g/h^3 is the density of states available for occupation (as can be derived from a particle-in-a-box quantum mechanical argument). The quantity g specifically refers to the number of internal degrees of freedom. We will return to this in a moment. Similar to the number density, the energy density can be written as

ε = ρc^2 = (g/h^3) ∫ E(p) f(p) d^3p.  (329)

From the distribution functions and the definition of energy, these equations can be rewritten as

n = [g/(2π^2 ħ^3 c^3)] ∫_{mc^2}^∞ (E^2 − (mc^2)^2)^{1/2} E dE/(exp[(E − µ)/kT] ± 1),  (330)
ρc^2 = [g/(2π^2 ħ^3 c^3)] ∫_{mc^2}^∞ (E^2 − (mc^2)^2)^{1/2} E^2 dE/(exp[(E − µ)/kT] ± 1).  (331)

In the relativistic limit, the above equations become

n = [g/(2π^2 ħ^3 c^3)] ∫_0^∞ E^2 dE/(exp[(E − µ)/kT] ± 1),  (332)
ρc^2 = [g/(2π^2 ħ^3 c^3)] ∫_0^∞ E^3 dE/(exp[(E − µ)/kT] ± 1).  (333)

Note that Kolb & Turner §3.3–3.4 is a good reference for this material. Now, let us consider a specific example. Photons obey Bose-Einstein statistics, so the number density is

n_γ = [g/(2π^2 ħ^3 c^3)] ∫_0^∞ E^2 dE/(exp[(E − µ)/kT] − 1).  (334)

As we will discuss later, for photons the chemical potential is µ_γ = 0, so making the substitution x = E/kT the above equation becomes

n_γ = (g/2π^2)(kT/ħc)^3 ∫_0^∞ x^2 dx/(e^x − 1).  (335)

It turns out that this integral corresponds to the Riemann zeta function, which is defined such that

ζ(n)Γ(n) = ∫_0^∞ x^{n−1} dx/(e^x − 1),  (336)

1 kT 3 2ζ(3) kT 3 nγ = ζ(3)Γ(3) = . (337) π2 hc¯ ! π2 hc¯ !

−3 For the current Tr = 2.73 K, and given that ζ(3) 1.202, we have that n0γ = 420 cm . 3 3 ≃ Note that since nγ scales with T , nγ (1 + z) for redshift intervals where no species are freezing out. ∝ More generally, we have noted above that the chemical potential for all particle species in the early universe is zero. Consequently, in a more general derivation it can be shown that for any relativistic particle species

3 ∞ 2 3 kT x dx giζ(3) kBT ni = gi = α , (338) hc¯ ! 0 ex 1 π2 hc¯ ! Z ± where α = 3/4 for fermions and α = 1 for bosons. Similarly, the energy density of a given species is given by 4 ∞ 3 (kT ) x dx gi 4 ρi = gi = β σrT . (339) (¯hc)3 0 ex 1 2 Z ± where β =7/8 for fermions and β = 1 for bosons. For a multi-species fluid, the total energy density will therefore be

7 σ T 4 σ T 4 ρc2 = g + g r = g∗ r . (340)  i 8 i 2 2 bosonsX fermionsX   21.2 What is g? A missing link at this point is this mysterious g, which I said is the number of internal degrees of freedom. In practice, what this means for both bosons and fermions is that

g =2 spin + 1. (341) × For example, for a spin 1/2 or , g = 2. Two exceptions to this rule are photons (g = 2) and neutrinos (g = 1), which each have one less degree of freedom than you might expect from the above relation. The underlying are unimportant for the current discussion, but basically the photon is down by one because longitudinal E&M don’t propagate, and neutrinos are down one because one helicity state does not exist. In the above section we showed how to combine the g of the different particle species to obtain an effective factor g∗ for computing the energy density for a multispecies fluid. Just to give one concrete (but not physical) example, consider having a fluid comprised of only + νe and µ . In this case, 7 21 g∗ =0+ (1+2) = (342) 8 8

59 22 Photon- Ratio

OK – time to return to cosmology from thermodynamics, although at this point we’re still laying a bit of groundwork. One important quantity is the ratio between the present mean number density of baryons (n0b) and photons (n0γ ). It’s actually defined in the book con- versely as η0 = n0b/n0γ . The present density of baryons is

ρ0b −5 2 −3 n0b = 1.12 10 Ω0bh cm . (343) mp ≃ × Meanwhile, we have now calculated the photon density in a previous section,

3 2ζ(3) kBT0r −3 n0γ = 420cm (344) π2 hc¯ ! ≃ The photon-baryon ratio therefore is

−1 n0b 7 2 −1 η0 = 3.75 10 (Ω0bh ) . (345) n0γ ≃ × The importance of this quantity should become clearer, but for now the key thing to note is that there are far more photons than baryons.

23 Radiation Entropy per Baryon

Related to the above ratio, we can also ask what is ratio between the radiation entropy density and the baryon density? From our definition of entropy earlier, we have ρ c2 + p + r 4 ρ c2 4 s = = r = r = σ T 3. (346) r T T 3 T 3 r We also know that the number density of baryons is

nb = ρbmp (347)

3 Recalling that σrad =4mpσrT0r/(3kbρ0b), we can rewrite the equation for the entropy as

sr = σradkbnb, (348)

or sr srnγ σrad = = , (349) kBnb kBη −1 which tells us that σrad, sr, and η are all proportional. Your book actually takes the above −1 equation and fills in to get σrad =3.6η to show that the constant is of order unity. Finally, and perhaps more interestingly, σrad is also related to the primordial baryon- antibaryon asymmetry. Subsequent to the initial establishment of the asymmmetry, (n b − 60 3 n¯b)a must be a conserved quantity (conservation of baryon number). Moreover, in the observed universen ¯ 0, so n a3 is conserved. At early times when the baryon species are b → 0b in equilibrium, we have 3 3 n n¯ n T (1 + z) . (350) b ≃ b ≃ γ ∝ ∝ At this stage the is expected to be

nb n¯b nb n¯b n0b −1 − − 1.8σrad. (351) nb + n¯b ≃ 2nγ ≃ 2n0γ ≃

From a physical perspective, what this says is that the reason that σrad is so large, and that there are so many more photons than baryons, is that the baryon-antibaryon asymmetry is small.

24 Lepton Era

The Lepton era corresponds the time period when the universe is dominated by , which as you will recall are particles that do not interact via the strong force. The three ± ± ± families of leptons are the electron (e , νe, ν¯e), muon (µ , νµ, νµ¯ ), and (τ , ντ , ν¯τ ) families. The lepton era begins when (a type of hadron with a short lifetime) freeze-out at T 130 MeV, annihilating and/or decaying into photons. At the start of the lepton era, the∼ only species that are in equilibrium are the γ,e±,µ±, a small number of baryons, and neutrinos (all 3 types). It is of interest to calculate g∗ at both the beginning and end of the lepton era (both for practice and physical insight). At the start of the lepton era (right after annihilation), 7 g∗ =2+ (2 2+2 2+3 2) = 14.25 (352) start 8 × × × × where the terms correspond to the photons, , , and neutrinos, respectively.[We’ll ignore the baryons.] At the end of the lepton era, right after the electrons annihilate, we are left with only the photons, so ∗ gend =2. (353) We’ll work out an example with neutrinos in a moment to show why this . The quick physical answer though is that when species annihilate g decreases and the radiation energy density increases (since particles are being converted into radiation). The value of g is used to quantify this jump in energy density and temperature. To see this, consider that in this entire analysis we are treating the universe as an adia- batically expanding fluid. We discussed previously that this is equivalent to requiring that there is no heat transfer to an external system, or dQ dS =0. (354) ≡ dT

61 In other words, entropy is conserved as the universe expands. We also discussed previously that the entropy density is given by ρc2 + p s = , (355) r T which in well-approximated by the radiative components, giving 4 ρc2 2 s = = g∗σ T 3. (356) r 3 T 3 r Now consider pair annihilation of a particle species at temperature T . From conservation of entropy, we require

sbefore = safter, (357) 2 2 g∗ σ T 3 = g∗ σ T 3 , (358) 3 before r before 3 after r after ∗ 3 ∗ 3 gbeforeTbefore = gafterTafter, (359) (360) or ∗ 1/3 gbefore Tafter = Tbefore ∗ (361) gafter ! Now, g∗ is a decreasing function as particles leave equilibrium, so the above equation states that the radiation temperature is always higher after annihilation of a species. There are two relevant types of interactions during the lepton era that act to keep particles in equilibrium – electromagnetic and weak interactions. Examples of the electromagnetic interactions are

+ − + − + − p +¯p ↽⇀ n +¯n ↽⇀ π + π ↽⇀ µ + µ ↽⇀ e + e ↽⇀ π0 ↽⇀ 2γ, (362)

and examples of weak interactions are

− + e + µ ↽⇀ νe + νµ, (363) − + e + e ↽⇀ νe +¯νe, (364) − e + p ↽⇀ νe + n, (365) − + e + νe ↽⇀ e + νe. (366) (367)

The relevant cross-sections for electromagnetic interactions is the Thomson cross-section 2 (σT ), while the cross-section σwk T is given in your book. Note that neutrinos feel the weak force, but not∝ the electromagnetic force. This property, coupled with the temperature dependence of the weak interaction cross-section, is the reason that neutrinos are so difficult to detect.

62 24.1 Electrons The electron-positron pairs remain in equilibrium during the entire lepton era since the creation timescale for pairs is much less than the expansion timescale. Indeed the electrons remain in equilibrium until recombination, which we will discuss shortly. In practice, the end of the lepton era is defined by the annihilation of the electrons and positrons at T 0.5 MeV. ∼ What is the density of electron-positron pairs? For T > 1010K, electromagnetic‘ interac- tions such as γ + γ ↽⇀ e+ + e− are in thermal equilibrium. Using the phase space distributions, we have

3 3 3 ζ(3) kBTe 3ζ(3) kBTe ne± = ne− + ne+ = (2 ge) = , (368) 4 π2 × hc¯ ! π2 hc¯ ! 4 2 2 2 7 σrTe 7 4 ρ ± c = ρ + c + ρ − c = (2+2) = σ T . (369) e e e 8 2 4 r e

Since the electrons are in equilibrium, Te = Tr.

24.2 Muons The muon pairs also remain in equilibrium until T 1012K, at which point they annihilate. It is straightforward to work out that before annihilation∼ the muons should have the same number and energy densities as the electrons.

24.3 Neutrinos Electron neutrinos decouple from the rest of the universe when the timescale for weak inter- + − action processes such as νe +¯νe ↽⇀ e + e equals the expansion timescale. Other species are coupled to the νe via neutral current interactions, so they decouple no later than this. To be specific, the condition for is

1/2 3 −1 tH 2 < tcollision (nlσwkc) , (370) ≃ 32πGρ! ≃

where nl is the number density of a generic lepton. At this time period τ particles are no longer expected to be in equilibrium, and we have noted above that ne = nµ, so the relevant density is given by the above equation for ne± as nl = (1/2)ne± . Similarly,

2 7 4 ρ = ρ ± c /2= σ T . (371) l e 8 r The condition for equilibrium, with insertion of constants, becomes

t T 3 H < 1 (372) t ≃ 3 1010K coll  ×  63 24.3.1 Temperature of Relic Species: Neutrinos as an Example It is interesting to consider the neutrino temperature in order to illustrate the general char- acteristics of temperature evolution. Given that we know the current temperature of the radiation field, we can derive the neutrino temperature by expressing it in terms of the radiation temperature. Neutrinos decouple from the radiation field at a time when they remain a relativistic species. Consequently, we expect their subsequent time evolution to follow the relation

−1 T0ν = Tν,decoupling(1 + z) (373)

We know that at decoupling Tν = Tr. If no particle species annihilate after the neutrinos decouple, the Tr = T0r(1 + z) and we have T0ν = T0r. However, we know that neutrinos decouple before electron-positron annihilation. We must therefore calculate how much the temperature increased due to this annihilation We saw before that when a particle species annihilates ∗ 1/3 gbefore Tafter = Tbefore ∗ , (374) gafter ! so what we need to do is calculate the g∗ before and after annihilation. Before annihilation, the equilibrium particle species are e± and photons, so 7 g∗ =2+ (2 2)=2+7/2=11/2. (375) 8 × After annihilation, only the photons remain in equilibrium, so g∗ = 2. Consequently,

11 1/3 Tafter = Tbefore . (376)  4  which says that at the neutrino decoupling

4 1/3 Tr,decoupling = T0r (1 + z) (377)  11 and hence 1/3 −1 4 T0ν = Tν,decoupling(1 + z) = T0r 1.9K. (378) 11 ≃ 24.3.2 Densities of Relic Species: Neutrinos as an Example Once the neutrino temperature is known, the number density can be calculated in the stan- dard fashion:

3 3 ζ(3) kBTν −3 nν = (3 2 1) 324cm . (379) 4 π2 × × hc¯ ! ≃

64 where we have assumed 3 neutrino species. If neutrinos are massless, we can also compute the energy density as 7 σ T 4 7 ρ c2 = (3 2 1) r ν = σ T 4 3 10−34g cm−3. (380) ν 8 × × 2 4 r e ≃ × In the last few years, terrestrial experiments have however demonstrated that there must be at least one massive neutrino species (a consequence of neutrinos changing flavor, which cannot happen if they are all massless). The number density calculation above is unaffected if neutrinos are massive, but looking at the number one sees that this is roughly the same as the photon number density (or of order 109 times the baryon number density). Consequently, even if neutrinos were to have a very small rest mass, it is possible for them to contribute a non-negligible amount to the total energy density. To be specific, the mass density of neutrinos is < m > ρ =< m > n N 1.92 ν 10−30g cm−3, (381) 0ν ν 0ν ≃ ν × × 10eV × or in terms of critical density, < m > Ω 0.1 N ν h−2. (382) 0ν ≃ × ν 10eV × Astrophysical constraints based upon the CMB and large scale structure indicate that the combined mass of all neutrino species is m 1eV (383) ν ≤ X (assuming General Relativity is correct), or < m > 1/3 eV for N = 3, which implies that ν ≤ ν Ω 0.005. (384) 0ν ≤ One can ask whether the relic neutrinos remain relativistic at the current time. Roughly speaking, the neutrinos will cease to be relativistic when ρ kinetic 1. (385) ρrestmass ≤ We calculated that at the current time, for a given type of neutrino, the density is −34 −3 ρkinetic = 10 g cm . (386) and the rest energy is < m > ρ = 1.92 ν 10−30g cm−3, (387) 0ν ≃ × 10eV × so we see that the neutrinos are no longer relativistic if 10−34 µ 10 eV, (388) ν ≥ 1.92 10−30 × × µ 5 10−4 eV. (389) ν ≥ × 65 We know that at least one of the species of neutrinos is massive (see section 8.5 of your book for more details), but given this limit cannot say whether some of the neutrino species are non-relativistic at this time. Finally, it is worth reiterating that we found that the neutrinos, which were relativistic at decoupling, have a temperature dependence after decoupling of T (1 + z). This will ν ∝ remain true even after the neutrinos become non-relativistic. Recall that the temperature is defined in terms of the distribution function – this distribution remains valid when the particles become non-relativistic. On the other hand, a species that is non-relativistic when it decouples has T (1+ z)2. Thus, the redshift dependence of the temperature for a given particle subsequent∝ to decoupling is determined simply by whether the particle is relativistic at decoupling.

24.4 Neutrino Oscillations This is an aside for interested readers. Why do we believe that at least one species of neutrinos has a non-zero mass? The basic evidence comes from observations of solar neutrinos and the story of the search for neutrino mass starts with the “solar neutrino problem”. In the standard model of , the p p chain produces neutrinos via reactions such as − p + p D + e+ + ν ; E =0.26MeV (390) → e ν Be7 + e− Li7 + ν ; E =0.80MeV (391) → e ν B8 Be7 + e+ + ν ; E =7.2MeV. (392) → e ν The physics is well-understood, so if we understand stellar structure then we can make a precise prediction for the solar neutrino flux at the earth. In practice, terrestrial experiments to detect solar neutrinos find a factor of a few less νe than are expected based upon the . Initially, there were two proposed solutions to the solar neutrino problem. One proposed resolution was that perhaps the central temperature of the was slightly lower, which would decrease the expected neutrino flux. Helioseismology has now yielded sound speeds over the entire volume of the sun to 0.1% and the resulting constraints on the central tem- perature eliminate this possibility. The second possible solution is “neutrino oscillations”. The idea with neutrino oscillations is that neutrinos can self-interact and transform to dif- ferent flavors. Since terrestrial experiments are only capable of detecting νe, we should only observe 1/3 to 1/2 of the expected flux if the electron neutrinos produces in the sun convert to other types. Now, what does this all have to do with neutrino mass? The answer comes from the physics behind how one can get neutrinos to change flavors. The basic idea is to postulate that perhaps the observed neutrino types are not fundamental in themselves, but instead correspond to linear combinations of neutrino eigenstates. As an example, consider oscil- lations between only electron and muon neutrinos and two eigenstates ν1 and ν2. Given a

66 mixing angle θ, one could construct νe and νµ states as

νe = cos θν1 + sinθν2, (393) ν = sin θν sin θν . (394) µ 1 − 2 (imaging the above in bra-ket notation...not sure how to do this properly in latex) Essentially, the particle precesses between the νe and νµ states. If the energies corre- sponding to the eigenstates are E1 and E2, then the state will evolve as ν = cos θ exp( iE t/h¯)ν + sin θ exp( iE t/h¯)ν , (395) e − 1 1 − 2 2 and the probability of finding a pure electron state will be

2 2 2 1 Pνe (t)= νe(t) =1 sin (2θ) sin (E1 E2)t/h¯ . (396) | | − 2 −  If both states have the same momenta (and how could they have different momenta?), then the energy difference is simple ∆m2c4 ∆E = . (397) E1 + E2 The important thing to notice in the above equation is that the oscillations do not occur if the two neutrinos have equal mass. There must therefore be at least one flavor of neutrino that has a non-zero mass.

25 Matter versus Radiation Dominated

One key benchmark during this period is the transition from a radiation to matter-dominated universe. We have seen in earlier sections that the redshift evolution of the energy density for non-relativistic matter is different than for relativistic particles and photons. Specifically, ρ = ρ (1 + z)3(w+1), where w 0 for non-relativistic matter and w = 1/3 for relativistic 0 ≈ particles and photons. While non-relativistic matter is the dominant component at z = 0, the practical consequence of this density definition is that ρ ρ r = 0r (1 + z), (398) ρm ρ0m which means that above some redshift the energy density of relativistic particles dominates and the radiative term in the Friedmann equations becomes critical. The current radiation energy density is g σ T 4 σ T 4 ρ = r = r =4.67 10−34g cm−3, (399) r 2 c2 c2 × while the current matter density is

ρ = ρ Ω h2 = (1.9 10−29)0.27h2g cm−3 2.5 10−30h2 g cm−3. (400) 0m oc 0m × ≃ × 70 67 Considering just these two components, we would expect the matter and radiation den- sities to be equal at ρ0m 1+ zeq = 5353. (401) ρ0r ∼ In practice, the more relevant timescale is the point when the energy density of non- relativistic matter (w = 0) is equal to the energy density in all relativistic species(w =1/3). To correctly calculate this timescale, we also need to fold in the contribution of neutrinos. Using the relations that we saw earlier, the energy density of the photons + neutrinos should be 4 7 gν 4 −2 ρrel = ρr + ρν = σrTr + Nν σrTν c . (402)  8 2  In our discussion of the lepton era and evolution of the neutrinos we worked out (or will 1/3 work out) that Tν = (4/11) Tr. We also know that gν = 1, and the best current evidence is that there are 3 neutrino species (N = 3) plus their (so 3 2), which means ν × that the above equation can be written as

4 4/3 σrTr 7 4 4 ρrel = 2 1+(2 Nν ) 12 =1.68σrT =1.681ρr. (403) c " × 8 11 # In your book, the coefficient (1.68) used to include the total relativistic contribution from photons and neutrinos is denoted at K0, and at higher densities the factor is called Kc to account for contributions from other relativistic species. In practice, K K , so we’ll stick c ∼ 0 with K0. If we now use this energy density in calculating the epoch of matter-radiation equality, we find that ρ0m 1+ zeq = 5353/1.68 3190. (404) ρ0rel ∼ ≃ The most precise current determination from WMAP gives z = 3454 (t 50 kyrs). eq u ∼ This is the redshift at which the universe transitions from being dominated by radiation and relativistic particles to being dominated by non-relativistic matter. At earlier times the universe is well-approximated by a simple radiation-dominated EdS model. During this era,

E(z) (1 + z) K Ω (405) ≃ 0 0r q It is worth noting that the most distant direct observations that we currently have come from the cosmic microwave background at z = 1088, so all existing observations are in the matter dominated era.

26 Big Bang Nucleosynthesis

Cosmological nucleosynthesis occurs just after electron-positron pairs have annihilated at the end of the lepton era. From an anthropic perspective this is perhaps one of the most

68 important events in the history of the early universe – by the end of BBN at t 3 minutes the primordial elemental abundances are fixed. ≃ Let us begin with the basic and definitions. There are basically two ways to synthesize elements heavier that hydrogen. The first method is the familiar process of stellar nucleosynthesis, as worked out by Burbidge, Burbidge, Fowler, and Hoyle. This method is good for producing heavy elements (C,N,O, etc), but cannot explain the observed high fraction of helium in the universe, m Y He 0.25. (406) ≡ mtot ≃ The second method is cosmological nucleosynthesis. The idea of elemental synthesis in the early universe was put forward in the 1940’s by Gamov, Alpher, and Hermann, with the basic idea being that at early times the temperature should be high enough to drive . These authors found that, unlike stellar nucleosynthesis, cosmological (or Big Bang) nucleosynthesis could produce a high helium fraction. As we shall see though, BBN does not produce significant quantities of heavier elements. The current standard picture is that BBN establishes the primordial abundances of the light elements, while the enrichment of heavier elements is subsequently driven by stellar nucleosynthesis. With that introduction, let’s dive in. Your book listed the basic underlying assumptions that are implicit to this discussion. Some of these are aspects of the Cosmological Principle (and apply to our discussion of other, earlier epochs as well); some are subtle issues that are somewhat beyond the scope of our discussion. I reproduce the full list here for completeness:

1. The Universe has passed through a hot phase with T 1012 K, during which its components were in thermal equilibrium. ≥

2. The known laws of physics apply at this time.

3. The Universe is homogeneous and isotropic at this time.

4. The number of neutrino times is not high (N 3). ν ≃ 5. The neutrinos have a negligible degeneracy parameter.

6. The Universe does not contain some regions of matter and others of (sub- point of 2., and part of the CP).

7. There is no appreciable magnetic field at this epoch.

8. The photon density is greater than that of any exotic particles at this time.

69 26.1 Neutron- Ratio As a starting point, we need to know the relative abundances of and at the start of nucleosynthesis. In kinetic equilibrium, the number density of a non-relativistic particle species obeys the Boltzmann distribution, so

3/2 2 mikBT µi mic ni = gi 2 exp − . (407) 2πh¯ ! kBT !

Neutrons and protons both have g = 2, and the µi can be ignored, which means that the ratio of the two number densities is

n m 3/2 (m m )c2 n n exp n − p e−Q/kBT , (408) np ≃ mp ! − kBT ! ≃

2 where Q = (mn mp)c 1.3 MeV. Equivalently, this expression says that while the two species are in thermal− equilibrium,≃

n 1.5 1010K n exp × . (409) np ≃ − T !

Equilibrium between protons and neutrons is maintained by weak interactions such as − n + νe ↽⇀ p + e , which remain efficient until the neutrinos decouple. The ratio is therefore set by the temperature at which the neutrinos decouple. Neutrinos decouple at T 1010K, in which case ≃ n 1 X =0.18. (410) n ≡ n + p ≃ 1 + exp(1.5) After this point, free neutrons can still transform to protons via β decay, which has a − half- of τn = 900s. This, the subsequent relative abundances of free protons and neutrons is given by X (t) X (t )e−(t−teq )/τn . (411) n ≡ n eq In practice, as we shall see nucleosynthesis lasts for << 900s, so X X (t ) for the entire n ≃ n eq time period of interest.

26.2 Nuclear Reactions Before proceeding, let us define the nuclear reactions that we may expect to occur. These include:

p + n ↽⇀ d(i.e.H2)+ γ (412) (413) d + n ↽⇀ H3 + γ (414) d + d ↽⇀ H3 + p (415)

70 d + p ↽⇀ He3 + γ (416) d + d ↽⇀ He3 + n (417) (418) H3 + d ↽⇀ He4 + n (419) He3 + n ↽⇀ He4 + γ (420) He3 + d ↽⇀ He4 + p. (421) (422)

The net effect of all after the first of these equations is essentially

d + d ↽⇀ He4 + γ. (423)

What about nuclei with higher atomic weights? The problem is that there are no stable nuclei with atomic weights of either 5 or 8, which means that you can only produce heavier elements by “triple-reactions”, such as 3He4 C12 + γ, in which a third nuclei hits the → unstable nuclei before it has time to decay. The density during cosmological nucleosynthesis is far lower than the density in stellar interiors, and the total time for reactions is only 3 minutes rather than billions of years. Consequently this process is far less efficient during∼ cosmological nucleosynthesis that stellar nucleosynthesis.

26.3 Let us now consider the expected abundance of helium produced. From the above equations, the key first step is production of deuterium via

p + n ↽⇀ d + γ. (424)

. The fact that nucleosynthesis cannot proceed further until there is sufficient deuterium is known as the deuterium bottleneck. From the , we saw that for all species 3/2 2 mikBT µi mic ni = gi 2 exp − , (425) 2πh¯ ! kBT ! where for protons and neutrons gn = gp = 2 (i.e. two spin states), and for deuterium gd =3 (i.e. the spins can be up-up, up-down, or down-down). For the chemical potentials, we take the relation µn + µp = µd (426) for equilibrium. Taking the total number density at n n + n , we thus find that tot ≃ n p 3/2 2 nd 3 mdkBT µd mdc Xd 2 exp − . (427) ≡ ntot ≃ ntot 2πh¯ ! " kBT #

71 The exponential term is equivalent to

µ + µ (m + m )c2 + B exp n p − n p d (428) " kBT # where B =(m + m m )c2 2.225 MeV. Using the expressions for the number density d n p − d ≃ of neutrons and protons, and defining the new quantity Xp =1 Xn, the above expression can be written in the form: −

3/2 −3/2 md 3 kBT Bd/kB T Xd ntot 2 XnXpe . (429) ≃ mnmp ! 4 2πh¯ ! After some algebraic manipulation and insertion of relevant quantities, the above equation has the observable form 25.82 X X X exp 29.33 + 1.5 ln T + ln Ω h2 , (430) d ≃ n p − T − 9 0b  9   9 where the dependence upon Ω0bh comes from ntot, and T9 = T/10 K. Looking at the last equation, the amount of deuterium starts becoming significant for T9 < 1. At this point the deuterium bottleneck is alleviated and additional reactions can proceed. To be more specific, for a value of Ω h2 consistent with WMAP, we find that X X X at T 8 108 K, or 0b d ≃ n p ≃ × t 200 s. We will call this time t∗ for consistency with the book. ≃ 26.4 Helium and Trace Metals Now, what about Helium? Once the temperature is low enough that the deuterium bottle- neck is alleviated, essentially all neutrons are rapidly captured and incorporated into He4 by reactions such as d + d He3 + nd + He3 He4 + p because of the large cross-sections for these reactions. We can→ consequently assume→ that almost all the neutrons end up in He4, in which case the helium number density fraction is

1 nn 1 XHe = Xn, (431) ∼ 2 ntot 2 and the fraction of helium by mass, Y , is

mHe nHe Y =4 2Xn. (432) ≡ mtot ntot ≃ If we account for the neutron β-decay, so that at the end of the deuterium bottleneck

∗ t teq Xn Xn(teq) exp − , (433) ≃ − τn  then we get that Y 0.26. This value is in good agreement with the observed helium ≃ abundance in the universe. Note that the helium abundance is fairly insensitive to Ω0b. This

72 is primarily because the driving factor in determining the abundance is the temperature at which the n/p ratio is established rather than the density of nuclei. In contrast, the abundances of other nuclei are much more strongly dependent upon 2 3 Ω0bh . For nuclei with lower atomic weight than Helium (i.e. deuterium and He ) the abundance decreases with increasing density. The reason is that for higher density these particles are more efficiently converted into He4 and hence have lower relic abundances. For nuclei with higher atomic weight the situation is somewhat more complicated. On the one hand, higher density means that there is a greater likelihood of the collisions required to make these species. Thus, one would expect for example that the C12 abundance (if had time to form) should monotonically increase with density. On the other hand, for nuclei that require reactions involving deuterium, H3, or He3, (such as Li7) lower density has the advantage of increasing the abundances of these intermediate stage nuclei. These competing effects are the reason that you see the characteristic dip in the abundance in plots of the relative elemental abundances as a function of baryon density.

27 The Era

The “plasma era” is essentially defined as the period during which the baryons, electrons, and photons can be considered a thermal plasma. This period follows the end of the lepton era, starting when the electrons and positrons annihilate at T 5 109 K and ending at recombination when the universe becomes optically thin. Furthermore,≃ × this era is sometimes subdivided into the radiative and matter era, which are the intervals within the plasma era during which the universe is radiation and matter dominated, respectively. In this section we will discuss the properties of a plasma, and trace the evolution of the universe up to recombination. We will also briefly discuss the concept of reionization at lower redshift.

27.1 Plasma Properties What exactly do we mean when we say that the universe was comprised of a plasma of protons, helium, electrons, and photons? Physically, we mean that the thermal energy of the particles is much greater than the energy of Coulomb interactions between the particles. If we define λ as the mean particle separation, then this criteria can be expressed mathematically as λD >> λ, (434) where λD is the Debye length. The Debye length (λD) is a fundamental length scale in plasma physics, and is defined as

1/2 kBT λD = 2 , (435) 4πnee !

where ne is the number density of charged particles and e is the charge in electrostatic units. It is essentially the separation at which the thermal and Coulomb terms balance.

73 Another way of looking at this is that for a charged particle in a plasma the effective electrostatic potential is e Φ= e−r/λD , (436) 4πǫ0r

where ǫ0 is the permittivity of free space. From this equation Debye length can be though of as the scale beyond which the charge is effective shielded by the surrounding sea of other charged particles. In this sense, the charge has a sphere of influence with a radius of roughly the Debye length. Now, given the above definitions, we can look at the Debye radius in a cosmological context. If we define the particle density to be

3 ρ0cΩ0b T ne , (437) ≃ mp T0r  then we see that 3 1/2 kBT0rmp −1 λD 2 2 T . (438) ≃ 4πe T ρ0cΩ0b ! ∝ Similarly, if we define 1/3 −1/3 ρ0cΩ0b T λ ne , (439) ≃ ≃ mp ! T0r then the temperature cancels and the ratio of the Debye length to the mean particle sepa- ration is λ D 102(Ω h2)−1/6. (440) λ ≃ 0b We therefore find that the ratio of these two quantities is independent of redshift, which means that ionized material in the universe can be appropriately treated as a plasma fluid at all redshifts up to recombination.

27.2 Timescales Now, what are the practical effects of having a plasma? There are several relevant charac- teristic timescales:

1. τe - This is the time that it takes for an electron to move a Debye length.

2. τeγ - This is the characteristic time for an electron to lose its momentum by electron- photon .

3. τγe – This is the characteristic timescale for a photon to scatter off an electron.

4. τep – This is the relaxation time to reach thermal equilibrium between photons and electrons.

74 Before diving in, why do we care about these timescales? We want to use the timescales assess the relative significance of different physical mechanisms. For instance, we must verify that τep is shorter than a Hubble time or else the assumption of thermal equilibrium is invalid. Let’s start with τep. We won’t go through the calculations for this one, but

τ 106(Ω h2)−1T −3/2s. (441) ep ≃ 0b 2 9 Assuming Ω0bh 0.02, then this implies that at the start of the radiative era(T 10 K, t 10s) τ 10≃−6s. Similarly, at the end of the radiative era (T 4000K, t ≃300, 000 U ≃ ep ≈ ≃ U ≃ yrs) τ 200s. Clearly, the approximation of thermal equilibrium remains valid in this era. ep ≈ Next, consider τe. The Coulomb interaction of an electron with a proton or helium nuclei is only felt when the two are within a Debye length of one another. On average, the time for an electron to cross a Debye sphere is

2 −1 me 8 −3/2 τe = we = 2 2 10 T s. (442) 4πnee  ≃ × so any net change to the electron momentum or energy must occur in this timescale. Let’s now compare this with the time that it takes for electron-photon scattering to actually occur. This timescale is,

′ 1 3me 21 −4 τeγ = = =4.4 10 T s. (443) nγ σT c 4σT ρrc ×

2 Note that this equation contains a factor of 4/3 in 4/3ρrc because of the contribution of the pressure for a radiative fluid (p =1/3ρc2). Combining the two, we see that

τe −14 5/2 ′ 5 10 T , (444) τeγ ≃ × so τ << τ ′ when z << 2 107(Ω h2)1/5 107, (445) e eγ × 0b ≃ which is true for much of the period the plasma era (after the universe is about 1 hour old). What this means is that for z << 2 107 there is only a very small probability for an e− to scatter off a γ during the timescale of× an e− p+ interaction during most of the plasma era. Consequently, the electrons and protons are− strongly coupled – essentially stuck together. On the other hand, at z >> 2 107 the electrons and γs have a high probability of scattering – they are basically stuck× together. In this case, the effective mass of the electron is ∗ 2 4 ρr me = me +(ρr + pr/c )/ne >> me, (446) ≃ 3 ne when calculating the timescale for an e− + p+ collision. We simply note this for now, but may utilize this last bit of information later.

75 For now, the main point is that z =1 107 is essentially a transition point before which the electrons and photons are stuck together,× and after which the electrons and protons are stuck together. We also note that the effective timescale for electron-γ scattering is:

3 me + mp 3 mp 25 −4 τeγ = 10 T s. (447) 4 σT ρrc ≃ 4 σT ρrc ≃ The final timescale that we will calculate here is that for the typical photon to scatter off an electron (not the same as the converse since there are more photons than electrons). This is roughly,

1 mp 1 4 ρr 20 2 −1 −3 τγe = = = τeγ 10 (Ω0bh ) T s. (448) neσT c ρb σT c 3 ρb ≃ To put this all together... First, we showed that we are in equilibrium. This means that the protons and electrons have the same temperature, and also means that the photons should obey a Planck distribution (Bose-Einstein statistics). If everything is at the same temperature, Compton interactions should dominate. From the calculation of the Debye length we found that up until recombination the universe is well-approximated as a thermal plasma. During this period, we have an initial era where the electron-photon interactions dominate and these two particle species are strongly stuck together. This is true up to z 107, after which the proton-electron interactions dominate and these two species are stuck≃ to each other. As we will discuss, the relevance of this change is that any energy injection prior to z = 107 (say due to evaporation of primordial black holes, among other things) will be rapidly thermalized and leave no signature in the radiation field. In contrast, energy injection at lower temperatures can leave some signature. We’ll get back to this in a minute. As we discussed previously, the transition from radiation to matter-dominated eras occurs at ρ0cΩ0 1+ zeq = 3450, (449) K0ρ0r ≃ or when the universe is roughly 50 kyrs old. This time is squarely in the middle of the plasma era. At the end of the radiative era, the temperature is T 104 K, at which point everything remains ionized, although some of the helium (perhaps half)≃ may be in the form of He+ rather than He++ at this point. In general though, recombination occurs during the matter-dominated era.

27.3 Recombination Up through the end of the plasma era, all the particles (p, e−,γ, H and helium) remain coupled. Assuming that thermodynamic equilibrium holds, then we can compute the - ization fraction for hydrogen and helium in the same way that we have computed relative

76 abundances of different particle species (we’re getting a lot of mileage out of a few basic for- mulas based upon Fermi-Dirac, Bose-Einstein statistics, and Boltzmann statistics). In the present case, we are considering particles at T 104 K, at which point p, e−, and H are all non-relativistic. We therefore can use Boltzmann≃ rather than Bose-Einstein or Fermi-Dirac statistics, so 3/2 2 mikBT µi mic ni gi 2 exp − . (450) ≃ 2πh¯ ! kBT ! Considering now only hydrogen (i.e. ignoring helium for simplicity), the ionization fraction should be n n x = e e , (451) np + nH ≃ ntot and the chemical potentials for e− + p H + γ must obey the relation →

µe− + µp = µH . (452)

− − Also, the statistical weights of the particles are gp = ge = 2, as always, and gH = gp +ge = 4. Finally, the ,

2 B =(m + m − m )c = 13.6eV. (453) H p e − H

Using this information and assuming ne = np (charge equality), we will derive what is called the Saha equation for the ionization fraction. Let us start by computing the ration of charge to neutral particles. We know from the above equations that

3/2 2 mekBT µe mec ne =2 2 exp − (454) 2πh¯ ! kBT ! 3/2 2 mpkBT µp mpc np =2 2 exp − (455) 2πh¯ ! kBT ! 3/2 2 mH kBT µH mH c nH =4 2 exp − (456) 2πh¯ ! kBT ! (457)

Therefore,

3/2 3/2 2 nenp mekBT mp µe + µp µH (me + mp mH )c = 2 exp − − − (458) nH 2πh¯ !  mH  kBT ! 3/2 B mekBT − H kBT 2 e (459) ≃ 2πh¯ ! Now, we can also see that n n n2 n 2 1 x2 e p = e = n e = n , (460) n n n tot n 1 n /n tot 1 x H tot − e  tot  − e tot − 77 which means that 2 3/2 B x 1 mekBT − H kB T = 2 e . (461) 1 x ntot 2πh¯ ! − This last expression is the Saha equation, giving the ionization fraction as a function of temperature and density. Your book gives a table of values corresponding to ionization fraction as a function of redshift and baryon density. The basic result is that the ionization fraction falls to about 50% by z 1400. ≃ Now, it is worth pointing out that we have assumed the are in thermal equilibrium. This was in fact a bit of a fudge. Formally, this is only true when the recombination timescale is shorter than the Hubble time, which is valid for z > 2000. During the actual interesting period – recombination itself – non-equilibrium processes can alter the ionization history. Nevertheless, the above scenario conveys the basic physics and is a reasonable approximation. More detailed calculations with careful treatment of physical processes get closer to the WMAP value of z = 1088. In general, the residual ionization fraction well after recombination ends up being x 10−4. ≃ 27.4 Cosmic Microwave Background: A First Look At recombination, the mean free path of a photon rapidly goes from being very short to es- sentially infinite as the probability for scattering off an electron suddenly becomes negligible. This is why you will often hear the cosmic microwave background called the “surface of last scattering”. We are essentially seeing a snapshot of the universe at z = 1088 when the CMB photons last interacted with the matter field. The CMB provides a wealth of cosmological information, and we will return to it in much greater detail in a few weeks. Right now though, there are a few key things to point out. First, note that Compton scattering between thermal electrons and photons maintains the photons in a Bose-Einstein distribution since the photons are conserved in Compton scattering (and consistent with our previous discussion). To get an (observed) Planck distribution of photons requires photon production via free-free emission, double Compton scattering, or some other physical process. These processes do occur rapidly enough in the early universe to yield a Planck distribution. Specifically,

hν −1 Nν = exp 1 , (462) " kBT ! − #

and the corresponding energy density per unit frequency should be

8πhν3 u = ρ c2 = N , (463) ν r,ν ν c which corresponds to an intensity 4πhν¯ 3 I = N (464) ν ν c

78 Integrating over the energy density gives the standard

2 4 ρrc = σrT (465) (see section 21.1 for this derivation). One can see that as the universe expands,

4 uνdν =(1+ z) uν0 dν0, (466) This is the standard (1 + z)4 for radiation that we have previously derived. Since dν = (1 + z)dν0, we have, 3 3 uν(z)=(1+ z) uν0 = (1+ z) uν/(1+z). (467) Plugging this into the above equation for the energy density, one finds that

3 −1 3 8πh ν hν/(1 + z) uν(z)=(1+ z) exp 1 (468) c 1+ z  " kBT0 − !# 8πh ν3 = . (469) c exp hν 1 kBT −   Thus, we see that an initial black body spectrum retains it’s shape as the tem- perature cools. This may seem like a trivial statement, but it tells us that when we look at the CMB we are seeing the same spectrum as was emitted at the surface of last scatter, just redshifted to lower temperature. Moreover, any distortions in the shape (or temperature) of the spectrum must therefore be telling us something physical. One does potentially expect distortions in the far tails of the Planck distribution due to the details of the recombination process (particularly two-photon decay), but these are largely unobservable due to galactic dust. Another means of distorting the black body spectrum is to inject energy during the plasma era. The book discusses this in some detail, but for the current discussion we will simply state that there are a range of possible energy sources (black hole evaporation, decay of unstable particles, damping of density fluctuations by photon diffusion), but the upper limits indicate that the level of this injection is fairly low. Nonetheless, if there is any injection, we saw before that it must occur in the redshift range 4 < log z < 7. Finally, intervening material between us and the CMB can also distort the spectrum. These “foregrounds” are the bane of the CMB field, but also quite useful in their own right as we shall discuss later. One particular example is inverse Compton scattering by ionized gas at lower redshift (think intracluster medium). If the ionized gas is hot, then the CMB photons can gain energy. This process, called the Sunyaev-Zeldovich effect, for low column densities essentially increases the effective temperature of the CMB in the Rayleigh-Jeans part of the spectrum, while distorting the tail of the distribution.

27.5 Matter-Radiation Coupling and the Growth of Structure One thing that we have perhaps not emphasized up to this point is that matter is tied in space, as well as temperature, to the radiation in the universe. In other words, the radiation

79 exerts a very strong force on the matter. We have already talked about electrons and photons being effectively glued together early in the plasma era. More generally for any charged particles interacting with a Planck distribution of photons, to first order the force on the particles is ∆v mev 4 4 v F me = ′ = σT σrT . (470) ≃ ∆t − τeγ −3 c or the same equation scaled by the ratio mp/me for a proton. The important thing to note here is that the force is linear in the velocity, which means that ionized matter experiences a very strong drag force if it tries to move with respect to the background radiation. The net result is that any density variations in the matter distribution remain locked at their original values until the matter and radiation fields decouple. Put more simply, gravity can’t start forming stars,galaxies,etc until the matter and radiation fields decouple. Conversely, the structure that we see at the surface of last scatter was formed at a far earlier epoch.

27.6 Decoupling As we have discussed before, the matter temperature falls at T (1 + z)2 once the matter decouples from the radiation field. To zeroth order this happ∝ens at recombination. In practice though, the matter temperature remains coupled to the radiation temperature until slightly later because of residual ionization. As before, one can calculate the timescale for collisions of free electrons with photons, 1 τ = 1025T −4 s. (471) eγ x × You can then go the normal approach of comparing this with the Hubble time to estimate when decoupling actually occurs (z 300). ≃ 27.7 Reionization A general question for the class. I just told you before that after recombination the ionized fraction drops to 10−4 – essentially all matter in the universe is neutral. Why then at z =0 do we not see neutral hydrogen tracing out the structure of the universe? This is a bit beyond where we’ll get in the class, but the idea is that the universe is reionized, perhaps in several stages, at z =6 20 by the onset of radiation from AGN (with possibly some contribution from ).− This reionization is partial – much of the gas at this point has fallen into galaxies and is either in stars or self-shielding to this radiation. Related to this last comment, it is good to be familiar with the concept of optical depth, which is commonly denoted at τ (note: beware to avoid confusion with the timescales above). The optical depth is related to the probability that a photon has a scattering interaction with an electron while travelling over a given distance. Specifically, combining

dt xρb dt dP = = neσT cdt = σT c dz (472) τγe mp dz

80 dN dI dP = γ = , (473) − Nγ − I

where Nγ is the photon flux and I is the intensity of the background radiation. The first equation is directly from the definition of P. The second line states that the fraction of photons (energy) that reaches in observer is defined by the fraction of photons that have not suffered a scattering event. If we now define τ as,

dP = dτ, (474) − we see that

N = N exp( τ) (475) γ,obs γ − I = I exp( τ), (476) γ,obs − where obs stands for the observed number and intensity. The book has a slightly different definition in that it expresses the relation in terms of redshift such that I(t0, z) and Nγ (t0, z) are defined and the intensity and number of photons observed for a initial intensity and number I(t), Nγ(t). One can see from the earlier equations that in terms of redshift, ρ Ω σ c z dt τ(z)= 0c 0b T dz, (477) mp Z0 dz which for w = 0 becomes

z ρ0cΩ0bσT c 1+ z τ(z)= 1/2 dz. (478) mpH0 Z0 (1+Ω0z)

For Ω0z >> 1, this yields, τ(z) 10−2(Ω h2)1/2z3/2. (479) ≃ 0b Finally, the probability that a photon arriving at z = 0 suffered it’s last scattering between z and z dz is − 1 dI d = [(1 exp( τ)] dz = exp( τ(z))dτ = g(z)dz. (480) I dz −dz − − − The quantity g(z) is called the differential visibility and defines the effective width of the surface of last scattering. In other words, recombination does not occur instantaneously, but rather occurs over a range of redshifts. The quantity g(z) measures the effective width of this transition. To compute it, one would plug in a prescription for the ionization fraction and integrate to get τ(z). Doing so for the approximations in the book, one finds that g(z) can be approximated at a Gaussian centered at the surface of last scattering with a width ∆z 400. WMAP observations indicate that the actual width is about 200. ≃

81 28 Successes and Failures of Basic Big Bang Picture

[Reading: 7.3-7.13] In recent weeks we have been working our way forward in time, until at last we have reached the surface of last scattering at t = 300, 000 years. We will soon explore this epoch in greater detail, but first we will spend a bit of time exploring the successes and failures of the basic Big Bang model that we have presented thus far and look in detail at inflation as a means of alleviating some (but not all) of these failures. Everything that we have discussed in class up until this point is predicated upon a few fundamental assumptions:

1. The Cosmological Principle is valid, and therefore on large scales the universe is ho- mogeneous and isotropic.

2. General relativity is a valid description of gravity everywhere (at least outside event horizons) and at all times back to the Planck time. More generally, the known laws of physics, derived locally, are valid everywhere. This latter part is a consequence of the Cosmological Principle.

3. At some early time the contents of the Universe were in thermal equilibrium with T > 1012 K.

Why do we believe the Big Bang model? There are four basic compelling reasons. Given the above assumptions, the Big Bang model:

1. provides a natural explanation for the observed expansion of the universe (Hubble 1929). Indeed, it requires that the universe be either expanding or contracting.

2. explains the observed abundance of helium via cosmological production of light ele- ments (Alpher, Hermann, Gamow; late 1940s). Indeed, the high helium abundance cannot be explained via stellar nucleosynthesis, but is explained remarkably well if one assumes that it was produced at early times when the universe was hot enough for fusion.

3. explains the cosmic microwave background. The CMB is a natural consequence of the cooling expansion.

4. provides a framework for understanding . Initial fluctuations (from whatever origin) remain small until recombination, after which they grow via gravity to produce stars, galaxies, and other observed structure. Numerical simulations show that this works remarkably well given (a) a prescription for the power spectrum of the initial fluctuations, and (b) inclusion of non-.

Clearly an impressive list of accomplishment, particularly given that all the observational and theoretical work progress described above was made in a mere 80 years or so. Not bad

82 progress for a field that started from scratch at the beginning of the 20th century. Still, the basic model has some gaping holes. These can be divided into two categories. The first category consists of holes arising from our limited current understanding of gravity and particle physics. These aren’t so much “problems” with the Big Bang as much as gaps that need to be filled in. These gaps include: 1. A description for the Universe prior to the Planck time. Also, the current physics is somewhat sketchy as one gets near the Planck time.

2. The matter-antimatter asymmetry. Why is there an excess of matter?

3. The nature of dark matter. What is it?

4. The cosmological constant (dark energy) problem. There is no good explanation for the size of the cosmological constant. All four of the above questions are rather fundamental. The solution to the first item on the list will require a theory of quantum gravity, or alternatively an explanation of how one avoids reaching the Planck density. Meanwhile, the last two give us the sobering reminder that at present we haven’t identified the two components that contain 99% of the total energy density in the universe. Clearly a bit of work to be done! The second category of holes are more in the thread of actual problems for the basic model. These include: 1. The [Why is everything so uniform?]

2. The flatness problem [Why is the universe (nearly) flat?]

3. The problem [Why don’t we see any?]

4. The origin of the initial fluctuations. [Where’d they come from?] We will discuss each of these, as well as the cosmological constant problem in greater detail below, and see how inflation can (or cannot) help.

28.1 The Horizon Problem The most striking feature of the cosmic microwave background is it’s uniformity. Across the entire sky (after removing the dipole term due to our own motion) the temperature of the CMB is constant to one part in 104. The problem is that in the standard Big Bang model points in opposite directions in the sky have never been in causal contact, so how can they possibly “know” to be at the same temperature? Let us pose this question in a more concrete form. Recall from earlier in the term (and chapter 2 in your book) that we discussed the definitions of a “particle horizon” and the “”. The “particle horizon” is defined as including all points with which we have ever been in causal contact. It is an actual horizon – we have no way of knowing

83 anything about what is currently beyond the particle horizon. As we discussed previously, the particle horizon is defined as t cdt RH = a . (481) Z0 a If the expansion of the universe at early times goes at a tβ, with β > 0, then ∝ t β −β RH = t ct dt = (1 β)ct. (482) Z0 − and a particle horizon exists if β < 1. Using the same expansion form in the Friedmann equation, 4 p a¨ = πG ρ +3 2 a, (483) −3  c  yields a¨ = β(β 1)tβ−2, (484) − and 4 p 2 β(β 1) = πG ρ +3 2 t . (485) − −3  c  The existence of an initial singularity requiresa ¨ < 0 and hence 0 <β< 1. Combining this with the result above, we see that there must be a particle horizon. How does the size of the particle horizon compare to the size of the surface of last scattering? At zCMB = 1100, the CMB surface that we observe had a radius

ctlookback ct0 rCMB = , (486) 1+ zCMB ≃ 1+ zCMB so opposite directions in the sky show the CMB at sites that were 2rCMB apart at recom- bination. If the above is a bit unclear, the way to think about it is as follows. Consider us at the center of a sphere with the observed CMB emitted at a comoving distance from us of ctlookback. This is the radius above, with the (1 + z) included to convert to proper distance. The size of the particle horizon at a given redshift for w = 0 is given by r R 3ct 3ct (1 + z )−3/2 3r (1 + z )−1/2 CMB . (487) H ≃ ≃ 0 CMB ≃ CMB CMB ≃ 10 The implication is that the CMB is homogeneous and isotropic on scales a factor of ten larger than the particle horizon.

28.2 Inflation: Basic Idea The most popular way of getting around the above problem is called inflation. The basic idea is to simply postulate that the universe underwent a period of accelerated expansion (¨a> 0) at early times. As we will see later, there are many variants of inflation, but they all boil down to the same result – a finite period of accelerated expansion very early. So how does

84 inflation help with the horizon problem? If there was a period of accelerated expansion, then one can envision a scenario in which the entire observable universe was actually in causal contact at some point prior to this accelerated expansion. In this case the uniformity of the CMB is no longer so mysterious. Let’s see how this would work.

28.2.1 Cosmological Horizon When we discussed particle horizons early in the term, we also discussed the “cosmological horizon”. This term is actually somewhat of a misnomer, but is commonly used. It’s not a true horizon, but rather is simply defined as being equivalent to the Hubble proper distance at a given redshift, a c R = c = , (488) c a˙ H(z) or a comoving distance a c(1 + z) r = c 0 = , (489) c a˙ H(z) which reduces to the familiar c Rc = rc DH = (490) ≡ H0 at z = 0. The relevance of the cosmological horizon, or Hubble distance, is that physical processes can keep things fairly homogeneous within a scale of roughly the cosmological horizon. Recall during the thermodynamic discussion that reactions were in thermodynamic equilibrium if the collision time was less than the Hubble time; similarly, physically processes can act within regions less than the Hubble distance (cosmological horizon). There are a couple important things to remember about the cosmological horizon. First, objects can be outside the cosmological horizon, but inside the particle horizon. Second, objects can move in and out of the cosmological horizon (unlike the particle horizon, where an object within the particle horizon will forever remain within the particle horizon). [Those reading notes should now refer to figure 7.4 in book, as I will be drawing something similar on board.]

28.2.2 Inflationary solution The horizon problem, as discussed above, is that points separated by proper distance l (comoving distance l0 = l(1 + z)) are only causally connected when l < RH , where RH is the size of the particle horizon. If we consider a tβ at early times (with β < 1 as above), ∝ then the size of the horizon grows with time. Put simply, as the universe gets older light can reach farther so the particle horizon is larger. Now, imagine that a region l0 that is originally within the cosmological horizon (l0 < rc(ti)) is larger than the horizon at some later time (l0 > rc(tf )). This can only happen if the comoving cosmological horizon decreases with time. In other words, d ca ca a¨ 0 = − 0 < 0, (491) dt a˙ a˙ 2 85 ora> ¨ 0. The inflationary solution thus posits that the universe passes through a period of accel- erated expansion, which after some time turns off and returns to a decelerated expansion. If observers (like us) are unaware of this period of accelerated expansion, then we perceive the paradox of the horizon problem. The problem is non-existent though in the inflationary model because everything that we see was at some point in causal contact. OK. That’s the basic picture. Now let’s work through the details. During the inflationary era, we require that the Friedmann equation be dominated by a component with w< 1/3 − in order to have an accelerated expansion. If we define the start and finish of the inflationary period as ti and tf , respectively, then from the Friedmann equation we find,

2 1+3w a˙ 2 ai = Hi Ωi + (1 Ωi) (492) ai  "  a  − # 1+3w 2 ai Hi (493) ≃  a  a˙ a −3(1+w)/2 = Hi (494) a ai  where we have assumed Ωi 1 since we are considering early times. Now we have several cases to consider. The first is≃ w = 1. In this case, we have − da = H dt, (495) a i

and integrating from ti to t we have,

Hi(t−ti) a = aie . (496)

This case is called “exponential inflation” for obvious reasons. Now, consider the cases where w = 1. Starting from above and integrating again we now have, 6 d 3(1 + w) a −3(1+w)/2 d = (Hit) (497) da "− 2 ai  # dt 3(1 + w) a 3(1+w)/2 1 = Hi(t ti (498) − 2 " − ai  # − a 3(1+w)/2 2 = Hi(t ti)+1 (499) ai  3(1 + w) − q Hi(t ti) 2 a ai 1+ − ; where q = . (500) ≃ " q # 3(1 + w) For 1 1, this equation reduces to simply − − i a tq; where q > 1. (501) ∝ 86 This case is called “standard inflation” or “power-law inflation”. Finally, for w < 1, we have − −|q| Hi −|q| a 1 (t ti) (C t) for t

a¨ = Ha˙ +aH ˙ = a(H2 + H˙ ) (503)

“Standard inflation” corresponds to H˙ < 0, “exponential inflation” corresponds to H˙ = 0, and “super-inflation” corresponds to H˙ > 0. It is straightforward to show that H˙ = 0 for “exponential inflation” yields a eHit, (504) ∝ and the previous solutions for the other cases can be recovered as well.

28.2.3 Solving the Horizon Problem Now, there are several requirements for inflation to solve the horizon problem. Let us divide the evolution of the universe into three epochs:

Epoch 1: Inflationary era from t to t , where w< 1/3 • i f − Epoch 2: Radiation-dominated era from t to t q, where w =1/3. • f e Epoch 3: Matter-dominated era from t q to t ,where w = 0. • e 0 Let the subscripts i and j stand for the starting and ending points of any of these intervals. For a flat model, where Ωij 1 in any interval, we find (see the equation for the Hubble parameter, eq 2.1.13 in your≃ book, for the starting point):

2 1+3w 2 2 ai aj Hi Hj Ωj (505) ≃ aj ! "  ai  # H a a −(1+3w)/2 i i i . (506) Hjaj ≃ aj !

To solve the horizon problem we require that the comoving horizon scale now is much smaller than at the start of inflation,

c ca0 rc(t0) << rc(ti)= , (507) ≡ H0 a˙ i which implies that H0a0 >> a˙ i = Hiai. (508)

87 Consequently, this means that H a H a H a H a i i << 0 0 = 0 0 eq eq , (509) Hf af Hf af Heqaeq Hf af which gives

a −(3w+1)/2 a −1/2 a −1 i << 0 eq (510) af ! aeq ! af ! a −(3w+1) a a 2 f >> 0 eq (511)  ai  aeq ! af ! −1 If one substitutes in a0/aeq = (1+ zeq) , and a 1+ z T eq = eq = f , (512) af 1+ zf Teq taking T 10−30T , where T is the Planck temperature, this yields eq ≃ P P −(1+3w) 2 af 60 − Tf >> 10 (1 + zeq) 1 . (513)  ai  TP  Consequently, for an exponential expansion this implies that the number of e-foldings is

ln 10 + ln(Tf /TP )/30 ln(1 + zeq)/60 N ln(af /ai) >> 60 − . (514) ≡ " 1+3w # | | −5 For most proposed models, 10 < Tf /TP < 1, which means that we require N >> 60. Think about this for a moment. This says that the universe had to grow by a factor of at least e60 during inflation, or a factor of 1026. As we shall see in a moment, this likely would have to happen during a time interval of order 10−32s. As an aside, note that while the expansion rate is >> c, this does not violate standard physics since it is spacetime rather than matter/radiation undergoing superluminal motion. Any particle initially at rest in the spacetime remains at rest and just sees everything redshift away.

28.3 Inflation and the Monopole Problem Let’s now see how inflation helps (or fails to help) some of the other problems. One problem with the basic Big Bang model is that most grand unification (GUTs) in particle physics predict that magnetic defects are formed when the strong and electroweak forces decouple. In the most simple case these defects are magnetic monopoles – analogous to electrons and positrons, but with a magnetic rather than . From these theories one finds that magnetic monopoles should have a charge that is a multiple of the Dirac charge, gD, such that

gM = ngD = n68.5e, (515)

88 with n =1 or n = 2, and a mass m 1016GeV. (516) M ≃ [Note that this g is charge rather than degrees of freedom!] One amusing comparison that I came across is that this is roughly the mass of an amoeba (http://www.orionsarm.com/tech/monopoles.html). [Note that equation 7.6.4 in the book is wrong.] The mass of a magnetic monopoles is close to the energy/temperature of the universe at the symmetry breaking scale (1014 1015 GeV). − In some GUT theories instead of (or in addition to) magnetic monopoles, this symmetry breaking produces higher dimensional defects (structures) such as strings, domain walls, and textures (see figure 7.3 in your book). Magnetic monopoles can also be produced at later times with m 105 1012 GeV via later phase transitions in some models. So what’s the∼ problem?− Well, first of all we don’t actually see any magnetic monopoles or the other more exotic defects. More worrisome though, one can calculate how common they should be. We are going to skip the details (which are in your book), but the main point is that the calculation gives

n > 10−10n n , (517) M γ ≃ 0b so there should be as many magnetic monopoles as baryons. [Question for the class: Could this be “dark matter”? Why (not)?] Working out the corresponding density parameter, we see that

mM 16 ΩM > Ωb 10 . (518) mp ≃ A definite problem. How does inflation help? Well, consider what inflation does – if you expand the universe by a factor of 1060, then the density of any particles that exist prior to inflation goes to Ω 0. This is analogous to our picture of the present universe in which the current accelerated→ expansion should eventually make the matter density go to zero. In this case, the density of magnetic monopoles should go to zero as long as inflation occurs after GUT symmetry breaking (t> 10−36s). At this point you may be asking how we have a current matter/energy density larger than zero if inflation devastated the density of pre-existing particles. We will return to this issue a bit later in our discussion of phase transitions. For now, I will just say that the universe is expected to gain energy from the expansion in the standard particle physics interpretations, and all particles/energy that we see today arise at the end of the inflationary era.

28.4 The OK – next on the list is the flatness problem. Specifically, why is the universe flat (or at least very, very close to it)? You might ask why not (and it certainly does simplify the math), but in truth there’s no a priori reason to expect it to be flat rather than have some other curvature. Indeed, if you look at this from a theoretical perspective, the only characteristic scale in the evolution of the universe is the Planck scale. One might therefore expect that for a closed

universe the lifetime might be t_u \simeq t_P. Similarly, for an open universe one would expect the curvature to dominate after roughly a Planck time. Clearly not the case in our universe! Let's start by quantifying how flat the universe is. We'll do this for a model without a cosmological constant, but the same type of derivation is possible (with a bit more algebra) for a more general model. We can rearrange the Friedmann equation,

\dot{a}^2 - \frac{8\pi G}{3}\rho a^2 = -Kc^2, \qquad (519)

to find

\left(\frac{\dot a}{a}\right)^2 - \frac{8\pi G}{3}\rho = -\frac{Kc^2}{a^2} \qquad (520)

H^2(1 - \rho/\rho_c) = -\frac{Kc^2}{a^2} \qquad (521)

H^2(1-\Omega)a^2 = -Kc^2 \qquad (522)

H^2\,\frac{\rho}{\rho_c}\left(\frac{1-\Omega}{\Omega}\right)a^2 = -Kc^2 \qquad (523)

(\Omega^{-1}-1)\rho a^2 = -\frac{3Kc^2}{8\pi G} = {\rm constant}, \qquad (524)

so

(\Omega^{-1}-1)\rho a^2 = (\Omega_0^{-1}-1)\rho_0 a_0^2. \qquad (525)

We can put this in terms of more observable parameters. Since we know that \rho \propto a^{-4} for the radiation-dominated era and \rho \propto a^{-3} for the matter-dominated era, we can use the above to solve for the density parameter at early times. Specifically,

(\Omega^{-1}-1)\rho a^2 = (\Omega_{eq}^{-1}-1)\rho_{eq} a_{eq}^2; \qquad (526)

and

(\Omega_{eq}^{-1}-1)\rho_{eq} a_{eq}^2 = (\Omega_0^{-1}-1)\rho_0 a_0^2. \qquad (527)

So

(\Omega^{-1}-1) = (\Omega_{eq}^{-1}-1)\left(\frac{a_{eq}}{a}\right)^{-2} \qquad (528)

(\Omega_{eq}^{-1}-1) = (\Omega_0^{-1}-1)\left(\frac{a_0}{a_{eq}}\right)^{-1} \qquad (529)

and therefore

\frac{\Omega^{-1}-1}{\Omega_0^{-1}-1} = \left(\frac{a}{a_{eq}}\right)^2\left(\frac{a_{eq}}{a_0}\right) = (1+z_{eq})^{-1}\left(\frac{T_{eq}}{T_P}\right)^2 \simeq 10^{-60} \qquad (531)

at the Planck time. Consequently, even for an open model with \Lambda = 0, \Omega_0 = 0.3, the above result requires that \Omega_P^{-1} - 1 \leq 10^{-60} at the Planck time. Indeed, right now we know that \Omega_0 + \Omega_\Lambda = 1.02 \pm 0.02 (WMAP, Spergel et al. 2003), which further tightens the flatness constraint in the Planck era.
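To see numerically how extreme this tuning is, here is a rough order-of-magnitude sketch of equation (531); T_eq ≈ 0.7 eV and T_P ≈ 1.2×10^{28} eV are assumed fiducial values, and only the order of magnitude matters.

import numpy as np

# Rough order-of-magnitude check of Eq. (531); T_eq and T_P are assumed
# fiducial values.
def flatness_ratio(z_eq=3.0e3, T_eq_eV=0.7, T_P_eV=1.2e28):
    """(Omega^-1 - 1)/(Omega_0^-1 - 1) evaluated at the Planck time."""
    return (T_eq_eV / T_P_eV) ** 2 / (1.0 + z_eq)

r = flatness_ratio()
Omega_0 = 0.3
print(f"ratio ~ {r:.0e}")                                    # ~1e-60
print(f"|Omega_P^-1 - 1| <~ {r * abs(1.0 / Omega_0 - 1.0):.0e}")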

28.4.1 Enter inflation

How does inflation help with this one? Well, the basic idea is that inflation drives the universe back towards critical density, which means that it didn't necessarily have to be so close to critical density at the Planck time. To see this, divide the history of the universe into three epochs, as we did before. Going along the same argument as above, we have

(\Omega_i^{-1}-1)\rho_i a_i^2 = (\Omega_f^{-1}-1)\rho_f a_f^2 = (\Omega_{eq}^{-1}-1)\rho_{eq} a_{eq}^2 = (\Omega_0^{-1}-1)\rho_0 a_0^2. \qquad (532)

Rearranging, this gives

\frac{\Omega_i^{-1}-1}{\Omega_0^{-1}-1} = \frac{\rho_0 a_0^2}{\rho_i a_i^2} = \frac{\rho_0 a_0^2}{\rho_{eq}a_{eq}^2}\,\frac{\rho_{eq}a_{eq}^2}{\rho_f a_f^2}\,\frac{\rho_f a_f^2}{\rho_i a_i^2}, \qquad (533)

\frac{\Omega_i^{-1}-1}{\Omega_0^{-1}-1} = \left(\frac{a_0}{a_{eq}}\right)^{-1}\left(\frac{a_{eq}}{a_f}\right)^{-2}\left(\frac{a_f}{a_i}\right)^{-(1+3w)}, \qquad (534)

or

\left(\frac{a_f}{a_i}\right)^{-(1+3w)} = \frac{\Omega_i^{-1}-1}{\Omega_0^{-1}-1}\left(\frac{a_0}{a_{eq}}\right)\left(\frac{a_{eq}}{a_f}\right)^{2} \qquad (535)

\simeq \frac{1-\Omega_i^{-1}}{1-\Omega_0^{-1}}\,(1+z_{eq})^{-1}\,10^{60}\left(\frac{T_f}{T_P}\right)^2. \qquad (536)

Recall from the horizon section that the horizon problem was solved if

\left(\frac{a_f}{a_i}\right)^{-(1+3w)} \gg (1+z_{eq})^{-1}\,10^{60}\left(\frac{T_f}{T_P}\right)^2. \qquad (537)

The flatness problem is now also resolved as long as the universe is no flatter now than it was prior to inflation, i.e.

\frac{1-\Omega_i^{-1}}{1-\Omega_0^{-1}} \geq 1. \qquad (538)

To rephrase: the problem before was that in the non-inflationary model the universe had to be 10^{60} times closer to critical density at the Planck time than it is now. With inflation, it's possible to construct cases in which the universe was further from critical density than it is now. Indeed, inflation can flatten out rather large initial departures from critical density.

28.5 Origin of CMB Fluctuations

Converse to the flatness problem, let us now ask why we see any bumps and wiggles in the CMB at all. If these regions were not causally connected, and we know that random fluctuations can't grow prior to recombination, where did these things come from? Quantum mechanics, which operates on ridiculously smaller scales, is the one way in which you can generate such random fluctuations. Inflation provides an elegant way out of this dilemma. If the

universe expanded exponentially, any quantum fluctuations in the energy density during the expansion are magnified to macroscopic scales. Since quantum wavefunctions are Gaussian, inflation makes the testable prediction that the CMB fluctuations should be Gaussian as well. This is the one currently testable prediction of inflation, and it appears that the fluctuations are indeed Gaussian. We'll talk more about these later.

28.6 The Cosmological Constant Problem: First Pass

Why is there a non-zero cosmological constant that has a value anywhere near the critical density? This is the basic question. We will return to this in greater detail after our discussion of phase transitions, but the basic problem is that current particle physics predicts a cosmological constant (if non-zero) that's off by about 110 orders of magnitude. Inflation does not help with this one at all.

28.7 Constraints on the epoch of inflation

Most theories predict that inflation should have occurred at t = 10^{-36}–10^{-32} s. Inflation must occur no earlier than 10^{-36} s, which is the GUT time, or else we should see magnetic monopoles or related topological defects. The constraint on the end of inflation is more nebulous – it simply must have lasted at least until 10^{-32} s for a 10^{60} increase in the scale factor.

29 The Physics of Inflation

In the previous section we motivated why a period of accelerated expansion in the early universe would be a nice thing to have. Now, how would one physically achieve such a state? This question is in fact even more relevant than it was 10 years ago, as we now know that the universe is currently in the early stage of another accelerated (inflationary!) expansion. We will not go through all the details, but will qualitatively describe the fundamental (and partially speculative) physics.

29.1 Phase Transitions

Most of you are familiar with the concept of a phase transition in other contexts. Phase transitions are defined by abrupt changes in one or more of the physical properties of a system when some variable (e.g. temperature) is changed slightly. Examples of well-known phase transitions include:

• Freezing and boiling (transformation from the liquid to solid and gas phases).

• Magnetism – materials are ferromagnetic below the Curie temperature, but lose their ferromagnetism above this temperature.

• Superconductivity – for some materials there is a critical temperature below which the material becomes superconducting.

• Bose-Einstein condensation.

What these processes have in common is that as the temperature is lowered slightly beyond some critical point, the material changes from a disordered to a more ordered state (i.e. the entropy decreases). For example, an amorphous fluid can produce a crystal with an ordered lattice structure when it solidifies (sodium chloride, quartz, etc.). Let the parameter Φ describe the amount of order in the system. What we are essentially saying is that Φ increases during the phase transition from the warmer to cooler state. Depending on the type of system, this order parameter can be defined in assorted ways (like the magnetization for a ferromagnet), but its basic meaning is unchanged. What we are going to see is that symmetry breaking in cosmology (i.e. the points at which the forces break apart) can be considered a set of phase transitions. Think of the universe instantaneously crystallizing to a more ordered state – for example spontaneously congealing to form hadrons, particles suddenly gaining mass, etc. These are profound transitions, and are accompanied by a change in the free energy of the system as the universe settles down to a new minimum-energy state. A related phrase that I will (without much description) introduce now is the vacuum energy. The corollary way of thinking about this is that prior to the phase transition the vacuum has some intrinsic energy density. During the transition this energy is freed and the universe settles down to a new, lower energy state. A bit nebulous at this point, but we'll see whether we can fill in a few details. Returning to thermodynamics (from which we never quite escape), the free energy of a system is F = U − TS, where U is the internal energy, T is the temperature, and S is the entropy. By definition, an equilibrium state corresponds to a minimum in F (i.e. minimizing the free energy of the system). Consider a case in which, for temperatures above the phase transition, the free energy has a minimum at Φ = 0. During a phase transition, you are effectively creating new minima at higher values of Φ. [See figure]

To have true minima, the dependence must be on Φ² rather than Φ. Why? Consider the case of a magnet, where the "order parameter" is the magnetization, M. The free energy doesn't depend on the direction of the magnetic field – only the magnitude of the magnetism matters (say that ten times fast). Consequently, the free energy must depend on M² rather than M. Put more succinctly, the system needs to be invariant under such transformations, so Φ and −Φ need to be treated equally. If we expand F as a power series function of Φ², then we

can write

F(\Phi) \approx F_0 + A\Phi^2 + B\Phi^4. \qquad (539)

If A > 0 and B > 0, then we have a simple curve with the minimum at Φ = 0 – i.e. the minimum is in the most disordered state. On the other hand, if A < 0 and B > 0, then you can create new minima at more ordered states. If you think of the free energy plot as a potential well, you can see that the phase transition changes the potential, and the universe should roll down to the new minima. How would you change from one curve to another? Consider the case in which A = K(T − T_c). In this case, the sign of A changes when you drop below the critical temperature, and as you go to lower temperatures the free energy minima for the ordered states get lower and lower. This type of transition is a second-order phase transition. As you can see from a time sequence, the transition is smooth between T > T_c and T < T_c, and the process is gradual as the system slowly rolls towards the new minima. As an alternative, there are also first-order phase transitions. In first-order transitions the order parameter appears rapidly, and the difference in free energy above and below the critical temperature is finite rather than infinitesimal. In other words, there is a sharp change in the minimum free energy right at the critical temperature. The finite change in the free energy at this transition, ∆F, is the latent heat of the transition (sound familiar from thermodynamics?). Let's look at an example of such a transition. Consider the figure below, in which there are initially two local minima for T > T_c. As the system cools, these become global minima at T = T_c, but the system has no way of reaching them. At some later time, after the system has cooled further, it becomes possible for the system to transition to the more ordered state (either by waiting until the barrier is gone, or via quantum tunneling, depending on the type of system). In this case, the system rapidly transitions to the new minima and releases the latent heat associated with the change in free energy. This process is known as supercooling. From a mathematical perspective, one example (as shown in the figure) can be achieved by making the dependence

F = F_0 + A\Phi^2 + C|\Phi|^3 + B\Phi^4, \qquad (540)

with A > 0, B > 0, and C < 0.
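As a concrete illustration, the sketch below evaluates the two free-energy forms, equations (539) and (540); the coefficient values are arbitrary choices made only to produce the second-order and supercooled (first-order) shapes.

import numpy as np

# Landau-style free energies from Eqs. (539)-(540); coefficients are
# arbitrary illustrative choices.
phi = np.linspace(-1.0, 1.0, 2001)

def F_second(phi, T, Tc=1.0, K=1.0, B=1.0):
    """Eq. (539) with A = K(T - Tc): ordered minima appear below Tc."""
    return K * (T - Tc) * phi**2 + B * phi**4

def F_first(phi, A=0.1, B=1.0, C=-1.0):
    """Eq. (540): the C|phi|^3 term puts a barrier between the minima."""
    return A * phi**2 + C * np.abs(phi) ** 3 + B * phi**4

print("second order, T < Tc: min at phi =", phi[np.argmin(F_second(phi, T=0.5))])
print("first order: global min at phi =", phi[np.argmin(F_first(phi))])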

29.2 Cosmological Phase Transitions

So how does freezing water relate to cosmology? The basic idea is that the Universe undergoes cosmological phase transitions. You may recall that we have used the term "spontaneous symmetry breaking" in describing the periods during which the fundamental forces separate. From the particle physics perspective, these events correspond to phase transitions in which the universe moves from a disordered to a more ordered state (for instance particles acquiring mass, and differentiation of matter into particles with unique properties like quarks and leptons). The free energy in this interpretation corresponds to a vacuum energy contained


Figure 7: The x-axis is the order parameter, while the y-axis is the free energy. The curves show how the free energy function changes as temperature decreases. The yellow curve is at the critical temperature; the other two are slightly above and below the critical temperature.


Figure 8: Similar to the previous figure, except that this one shows an example of supercooling (first-order phase transition). In this case, there is a barrier that prevents the system from moving to the new minimum until T drops well below T_c.

in some scalar field (referred to as the inflaton field in the case of inflation) – which is equivalent to the order parameter that we have been discussing (again for reference, temperature is an example of a scalar field, while gravity is an example of a vector field). During the phase transition this vacuum energy decreases. Note that there are also other phase transitions in the early universe not associated directly with spontaneous symmetry breaking (think of the quarks congealing into hadrons, for instance). As we shall discuss soon, this vacuum energy can potentially drive a period of inflationary expansion if it dominates the total energy density. Meanwhile, the latent heat released during the phase transition is also key. So when during the early universe are cosmological phase transitions possible? Well, basically most of the time, as the constituents of the universe are rapidly evolving and new, more ordered structures are forming. The most dramatic phase transitions correspond to the spontaneous symmetry breaking scales when the forces separate, but it is possible to have other phase transitions along the way. Your book attempts to divide the time up until the quark-hadron transition into intervals characterized by the types of phase transitions occurring at each epoch. This is a reasonable approach, and we shall review these periods here.

• The Planck Time (∼10^{19} GeV) – This is the point at which we require a quantum theory of gravity, and for which any super grand unification theory must unify gravity with the other forces above this temperature.

• GUT (∼10^{15} GeV) – This is the temperature at which the strong and electroweak forces break apart. The GUT scale is when magnetic monopoles are expected to form, so we require a period of inflation during or after this epoch. We want inflation to occur very near this epoch though, because only at and above this temperature do most current models permit creation of a baryon-antibaryon asymmetry. This is not a hard constraint though, as it is possible to construct scenarios in which baryon conservation is violated at somewhat lower temperatures.

• Between the GUT and Electroweak scales – The main point in the book is that the timescale between GUT and Electroweak runs from 10^{-37} to 10^{-11} s, which logarithmically leaves a lot of time for other phase transitions to occur. These phase transitions would not be associated with symmetry breaking.

• Electroweak scale to quark-hadron transition – The universe undergoes a phase transition when the weak and electromagnetic forces split. It's at this point that leptons acquire mass, incidentally. Also in this category is the (much lower temperature) quark-hadron transition, at which free quarks are captured into hadrons.

Any and all of the above transitions can yield a change in the vacuum energy. Not all of the above, however, can cause inflation. Keep in mind as we go along that for inflation to occur, the vacuum energy must dominate the total energy density.

29.3 Return to the Cosmological Constant Problem

As promised, we're now going to finish up talking about the cosmological constant problem – specifically the problem of how it can be non-zero and small. Recall that the density corresponding to the cosmological constant (WMAP value) is given by

|\rho_\Lambda| = 0.7\,\rho_c = \frac{\Lambda c^2}{8\pi G} = 1.4\times 10^{-29}\ {\rm g\,cm^{-3}} \simeq 10^{-48}\ {\rm GeV^4}. \qquad (541)

Equivalently, one can compute the value of Λ, finding Λ = 10^{-55} cm^{-2}. Small numbers, but is this a problem? The cosmological constant is often interpreted as corresponding to the vacuum energy of some scalar field. This is analogous to the discussion of free energy that we saw in previous sections. Modern gauge theories in particle physics predict that this vacuum energy corresponds to an effective potential,

\rho_v \approx V(\Phi, T), \qquad (542)

and that the drop in the vacuum energy at a phase transition should be of the order

\Delta\rho_v \approx \frac{m^4}{(\hbar c)^3}, \qquad (543)

where m is the mass scale of the relevant phase transition. This density change corresponds to 10^{60} GeV⁴ for the GUT scale, and values of 10^{-4}–10^{12} GeV⁴ for other transitions (like the electroweak). Now, if we take all the phase transitions together, this says that

\rho_v(t_{\rm Planck}) = \rho_v(t_0) + \sum_i \Delta\rho_v(m_i) \approx (10^{-48} + 10^{60})\ {\rm GeV^4} = \sum_i \Delta\rho_v(m_i)\,(1 + 10^{-108}). \qquad (544)

In other words, starting with the vacuum density before the GUT transition, the current vacuum density is a factor of 10^{108} smaller – and this value is very close to the critical density. Your book regards this as perhaps the most serious problem in all of cosmology. A greater mystery, I would argue, is why we find ourselves at a point in the history of the universe during which we are just entering a new inflationary phase.

29.4 Inflation: Putting the Pieces Together

We've now defined inflation in terms of its impact upon the growth of the scale factor (\ddot{a} > 0), explored how it can resolve some key problems with the basic big bang, and done a bit of background regarding phase transitions. Seems like a good idea, so time to start assembling a coherent picture of how one might incorporate inflation into the Big Bang model. At a very basic level, all inflationary models have the following properties:

• There must be an epoch in the early universe in which the vacuum energy density, \rho \propto V(\Phi), dominates the total energy density.

• During this epoch the expansion is accelerated, which drives the radiation and matter density to zero.

• Vacuum energy is converted into matter and radiation as Φ oscillates about the new minimum. This reheats the universe back to a temperature near the value prior to inflation, with all previous structure having been washed out.

• This must occur during or after the GUT phase to avoid topological defects. (Note: some versions don't address this directly.)

We will now consider the physics of general inflationary models and then discuss a few of the zoo of different flavors of inflation. For a scalar field Φ, the Lagrangian of the field is

L_\Phi = \frac{1}{2}\dot\Phi^2 - V(\Phi, T), \qquad (545)

analogous to classical mechanics. Note that the scalar field Φ is the same as the order parameter that we have been discussing, and the potential V is analogous to the free energy. The density associated with this scalar field is

\rho_\Phi = \frac{1}{2}\dot\Phi^2 + V(\Phi, T). \qquad (546)

Consider the case of a first-order phase transition (supercooling). In this case, the phase transition does not occur until some temperature T_b < T_c, at which point Φ assumes the new minimum value. If this transition is assumed to occur via either quantum tunneling or thermal fluctuations (rather than elimination of the barrier), then the transition will occur in a spatially haphazard fashion. In other words, the new phase will appear as nucleating bubbles in the false vacuum, which will grow until the entire Universe has settled to the new vacuum. On the other hand, if the transition is second order, the process is more uniform, as all regions of space descend to the new minimum simultaneously. Note however that not all locations will descend to the same minimum, so you will end up with "bubbles" or domains. The idea is that one such bubble should eventually encompass our portion of the Universe. Now, how does this evolution occur? We'll phrase this in terms of the equation of motion for the scalar field,

\frac{d}{dt}\left(a^3 \frac{\partial L_\Phi}{\partial\dot\Phi}\right) - a^3\frac{\partial L_\Phi}{\partial\Phi} = 0, \qquad (548)

which, using the Lagrangian above, gives

\ddot\Phi + 3\frac{\dot a}{a}\dot\Phi + \frac{\partial V(\Phi)}{\partial\Phi} = 0. \qquad (549)

Let's look at this equation. If we ignore the \dot a term, then this is equivalent to a ball oscillating back and forth in the bottom of a potential well. In this analogy, the 3\dot a/a

term corresponds to friction damping the kinetic energy of the ball. It is standard in inflation to speak of the vacuum as "rolling down" to the new minimum. More specifically, at the start of inflation one normally considers what is called the "slow roll" phase, in which the kinetic energy is << the potential energy. This corresponds to the case in which the motion is friction dominated, so the ball slowly moves down towards the new minimum. Remember that inflation causes ρ_r and ρ_m to trend to zero, so the Friedmann equation during the phase transition is approximately

\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G\rho_\Phi}{3} = \frac{8\pi G}{3}\left[\frac{1}{2}\dot\Phi^2 + V(\Phi, T)\right]. \qquad (550)

In the slow roll phase, this reduces to

\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}V(\Phi, T), \qquad (551)

so

a \propto \exp(t/\tau), \qquad (552)

where

\tau \simeq \left(\frac{3}{8\pi G V}\right)^{1/2}. \qquad (553)

For most models the above timescale works out to roughly \tau = 10^{-34} s (see the numerical sketch after the summary below). Since we need a minimum of 60 e-foldings to solve the horizon problem, this means that the inflationary period should last for at least 10^{-32} s. Note that this assumes inflation starts right at the phase transition. It's possible to have ongoing inflation for a while before this, but you still want to have a large number of e-foldings to get rid of monopoles and other relics produced at the GUT temperature. As the roll down to the new minimum proceeds, the field eventually leaves the slow roll phase, rapidly drops down towards the new minimum, and oscillates about it. These oscillations are damped by the creation of new particles (i.e. conversion of the vacuum energy into matter and radiation). Mathematically, this corresponds to the addition of a damping term in the equation of motion:

\ddot\Phi + 3\frac{\dot a}{a}\dot\Phi + \Gamma\dot\Phi + \frac{\partial V(\Phi)}{\partial\Phi} = 0. \qquad (554)

Physically, this has the effect of reheating the universe back up to some temperature T < T_{crit}, after which we proceed with a normal Big Bang evolution. Note that this new temperature has to be sufficiently high for baryosynthesis. So to summarize, the best way to look at things is like this:

1. Vacuum energy starts to dominate, initiating an inflationary expansion.

2. Inflation cools us through a phase transition, which initiates a slow roll down to the new minima. Inflation continues during this epoch.

3. Slow roll phase ends and the vacuum drops to the new minima. Inflation ends.

4. Scalar field oscillates around this new minima, releasing energy via particle production until it settles into the new minima. This released energy reheats the universe.

5. Back to the way things were before the inflationary period.
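Before moving on, here is the promised order-of-magnitude sketch of the e-folding time from equations (552)–(553); the inflation energy scale E_infl is an assumed value, chosen only to land near the τ ∼ 10^{-34} s quoted above.

import numpy as np

# Order-of-magnitude sketch of Eqs. (552)-(553). E_infl is an assumed
# vacuum-energy scale, not a measured quantity.
G = 6.674e-11                # m^3 kg^-1 s^-2
c = 2.998e8                  # m/s
hbar = 1.055e-34             # J s
GeV = 1.602e-10              # J

E_infl = 2.0e14 * GeV                  # assumed inflation energy scale
V = E_infl**4 / (hbar * c) ** 3        # vacuum energy density, J/m^3
rho = V / c**2                         # equivalent mass density
tau = np.sqrt(3.0 / (8.0 * np.pi * G * rho))
print(f"tau ~ {tau:.1e} s;  60 e-foldings take ~ {60 * tau:.1e} s")

This returns τ of order 10^{-34} s, so 60 e-foldings indeed take of order 10^{-32} s.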

29.5 Types of Inflation

OK – so what are the types of inflation? Inflation as a topic could fill the better part of a semester, with much of the time devoted to the various flavors. Here I am aiming to provide just an overview of inflation, and will continue that theme with a sparse sampling of types of inflation.

29.5.1 Old Inflation

The original inflationary model (Guth, 1981) suggested that inflation is associated with a first-order phase transition. As we discussed, a first-order phase transition implies a spatially haphazard transition. It turns out that the bubbles produced in this way are too small for our Universe and never coalesce into a larger bubble, so this model was quickly abandoned.

29.5.2 New Inflation

Shortly after the work by Guth, Andrei Linde (1982) proposed a new version with a second-order rather than first-order phase transition. It turns out that a second-order transition leaves larger spatial domains, and enables the entire universe to be in a single bubble with the same value of Φ. New inflation has several problems though that inspired other versions (see your book for details).

29.5.3 Chaotic Inflation

Chaotic inflation (Linde 1983) was an interesting revision in that it does not require any phase transitions. Instead, the idea is that near the Planck time Φ (whatever it is) varies spatially. Consider an arbitrary potential V(Φ) with the one condition that the minimum is at Φ = 0. Now, take a patch of the universe with a large, non-zero value of Φ. Clearly, within this region Φ will evolve just as it would right after a second-order phase transition – starting with a slow roll and eventually reheating and settling into the minimum. The mathematics is the same as before – the main difference now is that we've removed the connection between inflation and normal particle physics. It's completely independent of GUT or any other phase transitions.

29.5.4 Stochastic Inflation

Stochastic, or eternal, inflation is an extension of chaotic inflation. Starting with an inhomogeneous universe, the stochastic model incorporates quantum fluctuations as Φ evolves. The basic idea then is that there are always portions of the universe entering the inflationary phase, so you have many independent patches of universe that inflate at different times. What's kind of interesting about this approach is that it brings us full circle to the Steady State model, in the sense that there is no overall beginning or end – just an infinite number of Hubble patches evolving separately infinitely into the future.

30 Cosmic Microwave Background

[Chapter 17] It is now time to return for a more detailed look at the cosmic microwave background – although not as detailed a look as one would like due to time constraints on this class. We are now in what should be considered the “fun” part of the term – modern cosmology and issues that remain relevant/open at the present time. Let us start with a qualitative look at the CMB and how the encoded information can be represented. We will then discuss the underlying physics in greater detail and play with some animations and graphics on Wayne Hu’s web page.

30.1 Extracting information from the CMB

The structure observed in the CMB, as we will see, is a veritable treasure trove of information. It provides a picture of the matter distribution at the epoch of recombination, constrains a host of cosmological parameters, and provides information on assorted physics that has occurred subsequent to recombination (such as the epoch of reionization). Before we get to the physics though, a first question that we will discuss is how one might go about extracting the essential information from a 2-d map of the CMB sky. The standard approach is to parameterize the sky map in terms of spherical harmonics, such that

\frac{\Delta T}{T} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} a_{lm} Y_{lm}(\theta,\phi), \qquad (555)

where the Y_{lm} are the standard spherical harmonics familiar from quantum mechanics or (helio)seismology,

Y_{lm}(\theta,\phi) = \left[\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}\right]^{1/2} P_l^m(\cos\theta)\,e^{im\phi}, \qquad (556)

with the P_l^m being associated Legendre polynomials,

P_l^m(\cos\theta) = \frac{(-1)^m}{2^l\, l!}\left(1-\cos^2\theta\right)^{m/2}\frac{d^{\,l+m}}{d\cos^{l+m}\theta}\left(\cos^2\theta - 1\right)^l. \qquad (557)

Now, for a given map the coefficients a_{lm} are not guaranteed to be real – in general they will be complex numbers. Rather, the more physical quantity to consider is the power in each mode, which is defined as

C_l \equiv \langle |a_{lm}|^2 \rangle. \qquad (558)

As we will see in a moment, the angular power spectrum, measured in terms of C_l, is the fundamental observable for CMB studies. Specifically, when we see a typical angular power spectrum for the CMB, the y-axis is given by [l(l+1)C_l/(2\pi)]^{1/2}. The units are \mu K, and this can physically be thought of as the amplitude of temperature fluctuations \Delta T/T for a given angular scale, appropriately normalized.

First though, let's consider the physical interpretation of different l modes. The l = 0 mode corresponds to a uniform offset in temperature, and thus can be ignored. The l = 1 mode is the dipole mode. This term, which for the CMB is several orders of magnitude larger than any other term, is interpreted as being due to our motion relative to the CMB. How does this affect the temperature? Assume that our motion is non-relativistic (which is the case). In this case, the observed frequency of the CMB is shifted by a factor \nu' = \nu(1 + \beta\cos\theta), where \beta = v/c and \theta = 0 is defined as the direction of our motion relative to the CMB. For a black-body spectrum it can be shown that this corresponds to a temperature distribution

T(\theta) = T_0(1 + \beta\cos\theta). \qquad (559)

Thus, the lowest order anisotropy in the CMB tells us our velocity (both speed and direction) relative to the microwave background, and hence essentially relative to the cosmic rest frame. Not a bad start. Moving beyond the dipole mode, the l \geq 2 modes are due primarily to intrinsic anisotropy produced either at recombination or by subsequent physics. These are the modes that we care most about. The book provides a rough guide that the angular scale of fluctuations for large values of l is \theta \simeq 60^\circ/l – more useful and correct numbers to keep in mind are that l = 10 corresponds to about 10° and l = 100 to about 1°.
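Here is a quick numerical sketch of the dipole pattern in equation (559); the speed ~370 km/s is the measured solar-system motion relative to the CMB, while the sampled angles are arbitrary.

import numpy as np

# Sketch of the dipole pattern, Eq. (559). v ~ 370 km/s is the measured
# solar-system speed relative to the CMB; sampled angles are arbitrary.
T0 = 2.725                   # K, mean CMB temperature
beta = 370e3 / 2.998e8       # v/c

print(f"dipole amplitude T0*beta ~ {T0 * beta * 1e3:.2f} mK")   # ~3.4 mK
for theta in np.linspace(0.0, np.pi, 5):
    T = T0 * (1.0 + beta * np.cos(theta))
    print(f"theta = {np.degrees(theta):5.1f} deg: T = {T:.6f} K")

The ~3.4 mK amplitude is the familiar measured dipole, vastly larger than the ~100 µK intrinsic anisotropy at l ≥ 2.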

30.2 Physics

See http://background.uchicago.edu/˜whu/intermediate/intermediate.html

The quick summary is that the peaks in the CMB angular power spectrum are due to acoustic oscillations in the plasma at recombination. The first peak corresponds to a fundamental mode with size equal to the sound horizon at recombination, while the higher order peaks are harmonics of this fundamental mode. The location of the first peak depends upon the angular diameter distance to the CMB, and is consequently determined primarily by the spatial curvature (with some dependence upon Λ). The relative amplitude of the second peak constrains the baryon density, while the third peak can be used to measure the total matter density. Meanwhile, the damping tail provides a cross-check on the above measurements. Finally, if it can be measured, the polarization provides a means of separating the effects of the reionization epoch and gravitational waves. Note that the currently measured power spectrum of temperature fluctuations is commonly referred to as the scalar power spectrum (since temperature is a scalar field). Polarization on the other hand also probes the tensor and vector power spectra.

30.3 CMB Polarization and Inflation

[See section 13.6 in your book.] One of the predictions of inflation is the presence of gravitational waves, which alter the B-mode of the CMB tensor power spectrum. If one can measure this polarization, then one can constrain the nature of the inflation potential. Consider the

equation of motion for a scalar field φ,

\ddot\phi + 3H\dot\phi + V'(\phi) = 0.

Let us define two quantities, which we will refer to as "slow roll parameters," that together define the shape of the inflation potential:

\epsilon = \frac{m_P^2}{16\pi}\left(\frac{V'}{V}\right)^2, \qquad \eta = \frac{m_P^2}{8\pi}\left(\frac{V''}{V}\right),

where m_P is the Planck mass, V = V(φ), and all derivatives are with respect to φ. In the slow roll regime, the equation of motion is dominated by the damping term, so

\dot\phi = -\frac{V'}{3H}.

Additionally, the slow roll parameters must both be much less than 1. The requirement \epsilon \ll 1 corresponds to V \gg \dot\phi^2 – which is the condition necessary for inflation to occur. The requirement that |\eta| \ll 1 can be derived from the other two conditions, so it is considered a consistency requirement for the previous two requirements. We will (if time permits) later see that the primordial power spectrum for structure formation is normally taken to have the form P_k \propto k^n, where k is the wavenumber. The case n = 1 is scale invariant and called the Harrison-Zel'dovich power spectrum. The scalar and tensor power spectra,

P_k \propto k^n, \qquad P_k^T \propto k^{n_T},

are related to the inflation potential via their indices,

n = 1 - 6\epsilon + 2\eta, \qquad n_T = -2\epsilon,

where here \epsilon and \eta correspond to the values when the perturbation scale k leaves the horizon. We now know that n ≈ 1, as expected in the slow-roll limit. A measurement of the tensor power spectrum provides the information necessary to separately determine \epsilon and \eta, and hence recover the derivatives of the inflaton potential. Now, how much power is in the tensor spectrum compared to the scalar power spectrum? The ratio is

r = \frac{T}{S} = \frac{C_l^T}{C_l^S} = 12.4\,\epsilon.

Upcoming CMB experiments are typically aiming for r \sim 0.1, or pushing to a factor of 10 smaller amplitudes than were needed for measuring the scalar field. Now, the real challenge lies in separating the tensor signal from gravitational waves from the other tensor signals, like gravitational lensing. As can be seen in the figures presented in class, gravitational lensing is the dominant signal, and it is only at small l (large angular scales) that one can reasonably hope to detect the B-mode signal from gravitational waves associated with inflation.
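As a sketch of how these relations get used, the snippet below evaluates ε, η, n, n_T, and r for an assumed quadratic potential V = m²φ²/2, with φ in units of the Planck mass; the e-fold relation N = 2πφ²/m_P² holds for this potential only, and the index formulas are the ones quoted above.

import numpy as np

# Slow-roll sketch for an assumed V = m^2 phi^2 / 2, with phi in Planck
# masses. For this potential V'/V = 2/phi and V''/V = 2/phi^2, and the
# e-fold count is N = 2 pi phi^2 / m_P^2.
def slow_roll(phi):
    eps = (1.0 / (16.0 * np.pi)) * (2.0 / phi) ** 2
    eta = (1.0 / (8.0 * np.pi)) * (2.0 / phi**2)
    return eps, eta

phi_60 = np.sqrt(60.0 / (2.0 * np.pi))   # field value ~60 e-folds before the end
eps, eta = slow_roll(phi_60)
print(f"epsilon = {eps:.4f}, eta = {eta:.4f}")
print(f"n = {1.0 - 6.0 * eps + 2.0 * eta:.3f}")         # scalar spectral index
print(f"n_T = {-2.0 * eps:.4f}, r = {12.4 * eps:.2f}")  # tensor index, T/S ratio

For this assumed potential one gets n ≈ 0.97 and r ≈ 0.1 – right at the sensitivity level the upcoming experiments are targeting.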

30.4 Free Lunch in the CMB: Sunyaev-Zeldovich

Obviously, in the discussion above we have focused solely on the physics of the CMB and ignored the ugly observational details associated with foreground sources that contaminate the signal. While we will largely skip this messy subject, it is worthwhile to note that one person's trash is another's treasure. In particular, perhaps the most interesting foregrounds are galaxy clusters, which are visible via what is known as the Sunyaev-Zeldovich effect. Physically, the Sunyaev-Zeldovich effect is inverse Compton scattering. The CMB photons gain energy by scattering off the ionized intracluster medium (temperature of order a few million degrees K). If one looks at the Rayleigh-Jeans (long-wavelength) tail of the CMB spectrum, one consequently sees a decrement – the sky looks cooler at the location of the cluster than elsewhere. At shorter wavelengths, one can instead see an enhancement of photons, so the sky looks hotter. This is a rather distinctive observational signature, and really the only way that I know of to generate a negative feature on the CMB. Now, there are actually two components to the SZ effect – the thermal and kinematic SZ. Essentially, the exact frequency dependence of the modified spectrum is a function of the motion of the scattering electrons. The part of the effect due to the random thermal motions of the scattering electrons is called the thermal SZ effect; the part due to bulk motion of the cluster relative to the CMB is called the kinematic SZ effect. The thermal component is the part upon which people generally focus at this point in time. For a radiation field passing through an electron distribution, there is a quantity called the Comptonization factor, y, which is a dimensionless measure of the time spent by the radiation in the electron distribution. Along a given line of sight,

y = \int dl\, n_e \sigma_T \frac{k_B T_e}{m_e c^2}, \qquad (560)

where σ_T is the Thomson cross-section. For the thermal SZ, along a given line of sight n_e = n_e(r) and T_e = T_e(r), where r is the cluster-centric distance. Essentially, y gives a measure of the signal strength ("flux"). If the cluster is modelled as a homogeneous, isothermal sphere of radius R_c, one finds that the maximum temperature decrement in the cluster center is given by

\frac{\Delta T}{T} = -\frac{4 R_c n_e k_B T_e \sigma_T}{m_e c^2} \propto R_c T_e, \qquad (561)

where n_e and T_e are again the electron density and temperature in the cluster. Both quantities scale with the cluster mass. Now, there is something very important to note about both of the previous two equations. Both of them depend upon the properties of the cluster (n_e, T_e), but are independent of the distance to the cluster. What this means is that SZ surveys are in principle able to detect uniform, roughly mass-limited samples of galaxy clusters at all redshifts. The relevance to cosmology is that the redshift evolution of the cluster mass function is a very strong function of cosmological parameters (particularly \Omega_M and w), so measuring the number of clusters above a given mass as a function of redshift provides important information. The key point is

that this is an extremely sensitive test. The big stumbling block with cluster mass functions is systematic rather than statistical – relating observed quantities to mass. A nice aspect of the SZ approach is that the samples should be roughly mass-limited, although you still want to have other data (X-ray, optical) to verify this. Observationally, the SZ folk have been "almost" ready to conduct blind cluster searches for about a decade (even when I was starting grad school), but it is only in the past year that clusters have begun to be discovered in this way. Another application of the SZ effect, which is perhaps less compelling these days, is direct measurement of the Hubble parameter. This is done by using the \Delta T/T relation to get R_c and then measuring the angular size of the cluster. When done for an ensemble of clusters to minimize the statistical errors, this can be used to obtain H_0 (or more generally \Omega_M and \Omega_\Lambda if one spans a large redshift baseline). In practice, large systematic uncertainties have limited the usefulness of this test. The above is a very quick discussion. If you are particularly interested in the SZ effect, I recommend Birkinshaw, astro-ph/9808050.
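To get a rough feel for the magnitude of the thermal SZ signal, here is a sketch evaluating equation (560) for a uniform, isothermal cluster; the density, temperature, and path length are assumed fiducial values, and ΔT/T ≈ −2y is the standard Rayleigh-Jeans limit.

# Sketch of Eq. (560) for a uniform, isothermal cluster; n_e, kT_e, and
# the path length L are assumed fiducial values.
sigma_T = 6.652e-25          # Thomson cross-section, cm^2
m_e_c2 = 511.0               # electron rest energy, keV

n_e = 1.0e-3                 # cm^-3, assumed central electron density
kT_e = 8.0                   # keV, assumed ICM temperature
L = 3.086e24                 # cm, assumed ~1 Mpc path through the cluster

y = sigma_T * n_e * (kT_e / m_e_c2) * L
print(f"y ~ {y:.1e}")                             # ~3e-5
print(f"Rayleigh-Jeans decrement: dT/T ~ {-2 * y:.1e}")

Note that nothing in this estimate depends on the cluster's redshift – which is the whole point of SZ surveys.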

31 Dark Matter

Time to turn our attention to the dark side of the universe, starting with dark matter. The general definition of dark matter is any matter from which we cannot observe electromagnetic radiation. By this definition, we include such mundane objects as cool white dwarfs as well as more exotic material. As we shall see though, there is now strong evidence for a component of exotic, non-baryonic dark matter that dominates the total matter density.

31.1 Classic Observational Evidence

Galaxy Clusters – The first evidence for dark matter was the observation by Zwicky (1933) that the velocity dispersion of the Coma cluster is much greater than can be explained by the visible matter. This is a simple application of standard dynamics, where

\frac{GM}{r^2} = \frac{v^2}{r} = \frac{2\sigma^2}{r} \qquad (562)

\frac{GM}{r} = 2\sigma^2 \qquad (563)

\frac{GL}{r}\left(\frac{M}{L}\right) = 2\sigma^2 \qquad (564)

\left(\frac{M}{L}\right) = \frac{2\sigma^2 r}{GL}, \qquad (565)

where L is the total cluster luminosity and M/L is the mass-to-light ratio. Typical stellar mass-to-light ratios are of order a few (M_\odot/L_\odot = 1; integrated stellar populations have M/L <

10). If you plug in appropriate numbers for galaxy clusters, you get M/L \sim 200 [100–500] – a factor of 10–50 higher than the stellar value (see the numerical sketch below). This was the first direct evidence that the bulk of matter on cluster scales is in a form other than stars. In recent years other observations have confirmed that clusters indeed have such large masses (gravitational lensing, X-ray temperatures), and M/L has been shown to be a function of the halo mass – i.e. lower mass-to-light ratios for smaller systems (see figure in class). Still, this observation was considered little more than a curiosity until complementary observations of galaxy rotation curves in the 1970's.

Rotation Curves – In the early 1970's Rubin and Ford compiled the first large sample of galaxy rotation curves, finding that the rotation curves were flat at large radii. In other words, at large radii the rotation curves do not fall off in the Keplerian fashion expected if the visible disk were the whole story, which argues that the observed disk is embedded in a more massive halo component. These observations were the ones that elevated the idea of dark matter from an idle curiosity to a central feature of galaxies that required explanation. Subsequent work also showed that the presence of a massive halo is actually required in galactic dynamics to maintain disk stability, and the above data played a key role in influencing the later development of the dark matter model of structure formation (Blumenthal et al. 1984).
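Here is the promised Zwicky-style numerical sketch of equation (565); the velocity dispersion, radius, and luminosity are assumed Coma-like values, so only the order of magnitude of M/L is meaningful.

# Zwicky-style estimate from Eq. (565); sigma, r, and L are assumed
# Coma-like values, so only the order of magnitude is meaningful.
G = 6.674e-11                # m^3 kg^-1 s^-2
Msun = 1.989e30              # kg
Mpc = 3.086e22               # m

sigma = 1.0e6                # m/s (~1000 km/s line-of-sight dispersion)
r = 3.0 * Mpc                # assumed cluster radius
L = 5.0e12                   # assumed total luminosity, in L_sun

M = 2 * sigma**2 * r / G
print(f"M ~ {M / Msun:.1e} Msun;  M/L ~ {M / (Msun * L):.0f}")

This returns M ~ 10^{15} M_sun and M/L of a few hundred – far above any plausible stellar value.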

31.2 Alternatives

Is there any way to avoid the consequence of dark matter? The most popular alternative is to modify gravity at large distances. One of the more well-known of these theories is called Modified Newtonian Dynamics (MOND; Milgrom 1983). The idea here is to change the Newtonian force law at small accelerations from F = ma to F = \mu ma, where \mu = 1 if a > a_0 and \mu = a/a_0 if a < a_0. In our normal everyday experience we have a > a_0, so the modification to the acceleration would only matter for very small accelerations. Now, if we consider the gravitational attraction of two objects,

F = \frac{GMm}{r^2} = \mu ma. \qquad (566)

If we assume that at large distances a < a_0, so that \mu = a/a_0, then

\frac{GM}{r^2} = \frac{a^2}{a_0} \qquad (567)

a = \frac{\sqrt{GMa_0}}{r}. \qquad (568)

For a circular orbit,

a = \frac{v^2}{r} = \frac{\sqrt{GMa_0}}{r}, \qquad (569)

so

v = (GMa_0)^{1/4}. \qquad (570)

As you can see, this yields a circular velocity that is constant with radius – a flat rotation curve. One can calculate the required constant for the galaxy, finding a_0 \simeq 10^{-10}\ {\rm m\,s^{-2}}. Similar arguments can be made for explaining the cluster velocity dispersions. A limitation of MOND is that, like Newtonian gravity, it is not Lorentz covariant. Consequently, just as GR is required as a foundation for cosmology, one would need a Lorentz covariant version of the theory to test it in a cosmological context. There is now one such Lorentz covariant version, TeVeS (Tensor-Vector-Scalar theory; Bekenstein 2004), from which one can construct cosmological world models. However, in order to provide a viable alternative to dark matter, TeVeS – or any other modified gravity theory – must be as successful as dark matter in explaining a large range of modern cosmological observations, including our entire picture of structure formation from initial density fluctuations.
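As a quick check of equation (570), the sketch below evaluates the asymptotic MOND rotation speed for an assumed galaxy mass, using the a_0 quoted above.

# Quick check of Eq. (570); M is an assumed galaxy mass, and a0 is the
# value quoted in the text.
G = 6.674e-11                # m^3 kg^-1 s^-2
Msun = 1.989e30              # kg
a0 = 1.0e-10                 # m/s^2

M = 1.0e11 * Msun            # assumed (baryonic) mass of a bright spiral
v = (G * M * a0) ** 0.25     # flat rotation speed, independent of radius
print(f"v ~ {v / 1e3:.0f} km/s")     # ~190 km/s

A plausible ~200 km/s comes out for a bright spiral, which is why MOND fits galaxy rotation curves so well.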

31.3 Modern Evidence for Dark Matter

So why do we believe that dark matter exists? While modified gravity is an interesting means of attempting to avoid the presence of dark matter, at this point I would argue that we have a preponderance of evidence against this hypothesis. One relatively clean example is the Bullet Cluster. For this system, we (Clowe et al. 2004, 2006) used weak lensing to demonstrate that the mass and the intracluster gas (which contains the bulk of the baryons) are offset from one another due to viscous drag on the gas. Hence the baryons cannot be responsible for the lensing, and there must be some other component causing the lensing. The TeVeS community has attempted to reconcile this observation with modified gravity, but is unable to do so using baryons alone. They are able to manage rough qualitative (and I would argue poor) agreement if they assume that 80% of the total matter density is in 2 eV neutrinos. [It is worth noting that 2 eV is the maximum mass that a neutrino can have if one relies on constraints that are independent of GR, but new experiments should be running in the next few years that will significantly lower this mass limit.] Thus, even with modified gravity one still requires 80% of the total matter density to be 'dark'. Aside from this direct evidence, a compelling argument can be made based upon the remarkable success of the model in explaining the growth and evolution of structure in the Universe. Dark matter provides a means for seed density fluctuations to grow prior to the surface of last scattering, and CDM reproduces the observed growth of structure in the Universe from the CMB to z = 0. It is not obvious a priori that this should be the case. As we have seen, the cosmic microwave background provides us with a measurement of the ratio between total and baryonic matter, arguing that there is roughly a factor of 7 more matter than the baryon density, and yields a measurement of the total matter density (assuming GR is valid). These results from the CMB, with the baryon density confirmed by Li abundance measurements, yield densities that, when used as inputs to CDM, produce the observed structures at the present day. The fact that the bulk of the total matter is dark matter seems unavoidable.

31.4 Baryonic Dark Matter

So what is dark matter? From the CMB observations we now have convincing evidence that much of the dark matter is non-baryonic. Baryonic dark matter is worth a few words though, as it actually dominates the baryon contribution. In fact, only about 10% of baryons are in the form of stars, and even including HI and molecular gas, the majority of baryonic matter is not observed. Where is this matter? The predominant form of baryonic dark matter is ionized gas in the intergalactic medium. This is basically all of the gas that hadn't fallen into galaxies prior to reionization. In addition, there is some contribution from MACHOs (Massive Compact Halo Objects) – basically old, cold white dwarfs, neutron stars, and stellar black holes that we can't see.

31.5 Non-Baryonic Dark Matter

Non-baryonic matter is more interesting – it dominates the matter distribution (\Omega_{\rm non-baryonic} \sim 0.23) and points the way towards a better understanding of fundamental physics if we can figure out what it is. There are a vast number of dark matter candidates with varying degrees of plausibility. These can largely be subdivided based upon a few underlying properties. Most dark matter candidates, with the exceptions of primordial black holes and cosmological defects (both relatively implausible), are considered to be relic particles that decoupled at some point in the early universe. These particles can be classified by the following two criteria:

• Are the particles in thermal equilibrium prior to decoupling?

• Are the particles relativistic when they decouple?

We will discuss each case below.

31.5.1 Thermal and Non-Thermal Relics

Let's start with the question of thermal equilibrium. Thermal relics are particle species that are held in thermal equilibrium until they decouple. An example would be neutrinos. If relics are thermal, then we can use the same type of formalism as in the case of neutrinos to derive their temperature and density evolution. On the other hand, non-thermal relics are species that are not in equilibrium when they decouple, and hence their expected properties are less well constrained. We will start our discussion with thermal relics. First, let us write down the equation for the time evolution of a particle species. If no particles are being created or destroyed, we know that for a particle X the number density evolves as n_X \propto a^{-3}, i.e.

\frac{dn}{dt} = -3\frac{\dot a}{a} n_X. \qquad (571)

If we then let particles be created at a rate \psi and destroyed by collisional annihilation,

\frac{dn}{dt} = -3\frac{\dot a}{a} n_X + \psi - \langle\sigma_A v\rangle n_X^2. \qquad (572)

If the creation and annihilation processes have an equilibrium level such that \psi = \langle\sigma_A v\rangle n_{X,eq}^2, then the above becomes

\frac{dn}{dt} = -3\frac{\dot a}{a} n_X + \langle\sigma_A v\rangle\left(n_{X,eq}^2 - n_X^2\right), \qquad (573)

or, converting this to a comoving density via n_c = n(a/a_0)^3 (with a few intermediate steps),

\frac{a}{n_{c,eq}}\frac{dn_c}{da} = -\frac{\langle\sigma_A v\rangle n_{eq}}{\dot a/a}\left[\left(\frac{n_c}{n_{c,eq}}\right)^2 - 1\right]. \qquad (574)

Note that

\frac{\langle\sigma_A v\rangle n_{eq}}{\dot a/a} = \frac{\tau_H}{\tau_{coll}}, \qquad (575)

so we are left with a differential equation describing the particle evolution with scale factor as a function of the relevant timescales. In the limiting cases,

n_c \simeq n_{c,eq} \quad {\rm if}\ \tau_{coll} \ll \tau_H, \qquad (576)

n_c \simeq n_{c,decoupling} \quad {\rm if}\ \tau_{coll} \gg \tau_H. \qquad (577)

Not surprisingly, we arrive back at a familiar conclusion. The species has an equilibrium density before it decouples, and then "freezes out" at the density corresponding to equilibrium at decoupling. How the temperature and density evolve before decoupling depends upon whether the species is relativistic ("hot") or non-relativistic ("cold") when it decouples.
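For intuition, here is a toy integration in the spirit of equation (574), written in the standard dimensionless variables x = m/T and comoving abundance Y; the coupling lam stands in for ⟨σ_A v⟩, and its value is an arbitrary illustration.

import numpy as np
from scipy.integrate import solve_ivp

# Toy freeze-out integration in the spirit of Eq. (574); lam stands in
# for <sigma_A v> and its value is an arbitrary illustration.
lam = 1.0e5

def Y_eq(x):
    return x**1.5 * np.exp(-x)        # non-relativistic equilibrium shape

def dYdx(x, Y):
    return -(lam / x**2) * (Y**2 - Y_eq(x) ** 2)

sol = solve_ivp(dYdx, (1.0, 100.0), [Y_eq(1.0)],
                method="Radau", rtol=1e-8, atol=1e-12)
print(f"frozen-out abundance Y ~ {sol.y[0, -1]:.1e}")
print(f"equilibrium at x = 100 would be {Y_eq(100.0):.1e}")   # vastly smaller

The abundance tracks equilibrium until collisions become slower than the expansion, then freezes out at a value enormously above the would-be equilibrium abundance – exactly the behavior described above.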

31.5.2 Hot Thermal Relics

For the discussion of hot thermal relics we return to the discussion of internal degrees of freedom from sections 22–24, correcting a bit of sloppiness that I introduced in that discussion. We have previously shown that for a multi-species fluid the total energy density will be

\rho c^2 = \left[\sum_{\rm bosons} g_i + \frac{7}{8}\sum_{\rm fermions} g_i\right]\frac{\sigma T_r^4}{2} = g^*\,\frac{\sigma T_r^4}{2}. \qquad (578)

The first bit of sloppiness is that previously I assumed that all components were in thermal equilibrium, which meant that in the energy density expression I took the temperature out of the g^* expression and defined it as

g^* = \sum_{\rm bosons} g_i + \frac{7}{8}\sum_{\rm fermions} g_i. \qquad (579)

To be fully correct, the expression should be

g^* = \sum_{\rm bosons} g_i\left(\frac{T_i}{T}\right)^4 + \frac{7}{8}\sum_{\rm fermions} g_i\left(\frac{T_i}{T}\right)^4. \qquad (580)

We also learned that the entropy for the relativistic components is

s_r = \frac{2}{3}\, g^*_S\,\sigma T_r^3. \qquad (581)

The second bit of sloppiness is that in the previous discussion I treated g^* and g^*_S interchangeably, which is valid for most of the history of the universe, but not at late times (like the present). The definition of g^*_S is

g^*_S = \sum_{\rm bosons} g_i\left(\frac{T_i}{T}\right)^3 + \frac{7}{8}\sum_{\rm fermions} g_i\left(\frac{T_i}{T}\right)^3. \qquad (582)

Now, for a species that is relativistic when it decouples (3kT \gg mc^2), entropy conservation requires that

g^*_{S,X}\, T_{0X}^3 = g^*_{S0}\, T_{0\gamma}^3, \qquad (583)

where

g^*_{S0} = 2 + \frac{7}{8}\times 2\times N_\nu\times\left(\frac{T_{0\nu}}{T_{0\gamma}}\right)^3 \simeq 3.9 \qquad (584)

for Nν = 3. Anyway, you can also calculate the number density in the same way as before,

n_X = \alpha\,\frac{g_X\,\zeta(3)}{\pi^2}\left(\frac{k_B T_X}{\hbar c}\right)^3 \qquad (585)

\frac{n_{0X}}{n_{0\gamma}} = \alpha\,\frac{g_X}{2}\left(\frac{T_{0X}}{T_{0r}}\right)^3 \qquad (586)

n_{0X} = n_{0\gamma}\,\alpha\,\frac{g_X}{2}\,\frac{g^*_{S,0}}{g^*_{S,X}}, \qquad (587)

where α = 3/4 or α = 1 depending on whether the particle is a fermion or boson. The density parameter in this case is

\Omega_X = \frac{m_X\, n_{0X}}{\rho_{0c}} \simeq 2\alpha g_X\left(\frac{g^*_{S,0}}{g^*_{S,X}}\right)\left(\frac{m_X}{10^2\ {\rm eV}}\right) h^{-2}. \qquad (588)
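Plugging the standard neutrino numbers into equation (588) recovers the classic Ω_ν h² ≈ m_ν/93 eV scaling; α = 3/4, g_X = 2, and g*_S ≈ 10.75 at neutrino decoupling are the usual values.

# Numerical check of Eq. (588) for a single neutrino species: alpha = 3/4
# (fermion), g_X = 2, decoupling at g*_S = 10.75.
alpha, g_X = 0.75, 2.0
gS_X, gS_0 = 10.75, 3.9

def omega_h2(m_eV):
    return 2 * alpha * g_X * (gS_0 / gS_X) * (m_eV / 100.0)

for m in (0.1, 1.0, 10.0):
    print(f"m = {m:5.1f} eV: Omega h^2 ~ {omega_h2(m):.4f}")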

31.5.3 Cold Thermal Relics

The situation is not as straightforward for non-relativistic ("cold") thermal relics. In this case, at decoupling the number density is described by the Boltzmann distribution,

n_{decoupling,X} = \frac{g_X}{\hbar^3}\left(\frac{m_X k_B T}{2\pi}\right)^{3/2}\exp\left(-\frac{m_X c^2}{k_B T}\right), \qquad (589)

and hence the present day density is lower by a factor of a^3,

n_{0X} = n_{decoupling,X}\,\frac{g^*_{S,0}}{g^*_X}\left(\frac{T_{0r}}{T_{decoupling}}\right)^3. \qquad (590)

The catch is figuring out what the decoupling temperature is. As usual, you set \tau_H = \tau_{coll}. We previously saw that

\tau_H \simeq \left(\frac{3}{32\pi G\rho}\right)^{1/2} \simeq \frac{0.3\,\hbar\, T_P}{\sqrt{g^*_S}\,k_B T^2} \qquad (591)

(as in equation 7.1.9 in your book), and that

\tau_{coll} = (n\sigma v)^{-1}. \qquad (592)

The definition of the σv part is a bit more complex, since the cross-section can be velocity dependent. If we parameterize σv as

\langle\sigma v\rangle = \sigma_0\left(\frac{k_B T}{m_X c^2}\right)^q, \qquad (593)

with q normally having a value of 0 or 1 (i.e. \sigma \propto v^{-1} or \sigma \propto v), then working through the algebra one would find that

\rho_{0X} \simeq 10\,\sqrt{g^*_X}\,\frac{(k_B T_{0r})^3}{(\hbar c)^3\,\sigma_0\, m_P}\left(\frac{m_X c^2}{k_B T_{decoupling}}\right)^{q+1}. \qquad (594)

31.5.4 Significance of Hot versus Cold Relics

Physically, there is a much more significant difference between hot and cold relics than how to calculate the density and temperature. The details of the calculations we will have to leave for another course (they depend upon Jeans mass calculations, which are covered in chapter 10). The basic concept though is that after relativistic particles decouple from the radiation field, they are able to "free-stream" away from the locations of the initial density perturbations that exist prior to recombination. In essence, the velocity of the particles is greater than the escape velocity for the density fluctuations that eventually lead to galaxy and galaxy cluster formation. The net effect is to damp the amplitude of these density fluctuations, which leads to significantly less substructure than is observed on small scales. In contrast, cold relics (cold dark matter) only damp out structure on scales much smaller than galaxies, so the fluctuations grow uninterrupted. The difference in the two scenarios is rather dramatic, and we can easily exclude hot dark matter as a dominant constituent. Finally, our observations of local structures also tell us that the dark matter must currently be non-relativistic, or else it would not remain bound to galaxies.

31.5.5 Non-Thermal Relics

We have shown how one would calculate the density of particles for relics that were in equilibrium when they decoupled. There does however exist the possibility that the dark matter consists of particles that were not in thermal equilibrium. If this is the case, then we are left in a bit of a predicament, as there is no a priori way to calculate the density analogous to the previous sections. As we shall see, one of the leading candidates for dark matter is a non-thermal relic.

31.6 Dark Matter Candidates

At this point we have argued that the dark matter must be non-baryonic and "cold", but not necessarily thermal. While the preferred idea is that dark matter is a particle relic, there are non-particle candidates as well. Right now we will briefly review a few of the leading particle candidates, which are motivated by both cosmology and particle physics.

31.6.1 Thermal Relics: WIMPs

Weakly interacting massive particles (WIMPs) are a favorite class of dark matter candidates. These particles are cold thermal relics. We worked out above a detailed expression for \rho_{0X}, but to first order we can make the approximation that

\Omega_{WIMP} \simeq \frac{10^{-26}\ {\rm cm^3\,s^{-1}}}{\langle\sigma v\rangle}. \qquad (595)
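Evaluating equation (595) for a weak-scale annihilation cross section (the usual benchmark ⟨σv⟩ ≈ 3×10^{-26} cm³ s^{-1}) gives the famous "WIMP miracle" number:

# Quick evaluation of Eq. (595); <sigma v> ~ 3e-26 cm^3/s is the usual
# electroweak-scale benchmark value.
sigma_v = 3.0e-26                         # cm^3 s^-1
print(f"Omega_WIMP ~ {1.0e-26 / sigma_v:.2f}")   # ~0.3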

For \Omega_{DM} \sim 1 (0.3 being close enough), the annihilation cross section \langle\sigma v\rangle turns out to be about what would be predicted for particles with electroweak-scale interactions – hence the "weakly interacting" in the name. From a theoretical perspective, this scale of the annihilation cross-section is potentially a very exciting clue to both the nature of dark matter and new fundamental physics – specifically the idea of supersymmetry. Stepping back for a moment, the notion of antiparticles (perhaps rather mundane these days) comes from Dirac (1930), who predicted the existence of positrons based upon theoretical calculations that indicated electrons should have a symmetric partner. It is now a fundamental element of particle physics that all particles have associated, oppositely charged antiparticles. What is relatively new is the idea of "supersymmetry". Supersymmetry (SUSY) is a generalization of quantum field theory in which bosons can transform into fermions and vice versa. In a nutshell, the idea of supersymmetry is that every particle (and antiparticle) has a supersymmetric partner with opposite spin statistics (spin different by 1/2). In other words, every boson has a supersymmetric partner that is a fermion, and every fermion has a supersymmetric partner that is a boson. The partners for quarks and leptons are called squarks and sleptons; partners for photons are photinos, and those for neutral particles (Higgs, etc.) are called neutralinos.

Now why would one want to double the number of particles? First, SUSY provides a framework for potential unification of particle physics and gravity. Of the numerous attempts to make general relativity consistent with quantum field theory (unifying gravity with the strong and electroweak forces), all of the most successful attempts have required a new symmetry. In fact, it has been shown (the Coleman-Mandula theorem) that there is no way to unify gravity with the standard gauge theories that describe the strong and electroweak interactions without incorporating some such new symmetry. There are also several other problems that SUSY addresses – the mass hierarchy problem, coupling constant unification, and the anomalous muon magnetic moment. We won't go into these here, other than to point out that they exist, and to briefly explain the coupling constant problem. Essentially, the strength of the strong, weak, and electromagnetic forces is set by the coupling constants (like \alpha_{wk}, which the book calls g_{wk}). These coupling "constants" (similar to the Hubble constant) are actually not constant, but depend upon the energy of the interactions. It was realized several decades ago that the coupling constants for the three forces should approach the same value at \sim 10^{15} GeV, allowing "grand unification" of the three forces. In recent years though, improved observations of the coupling constants have demonstrated that in the Standard Model the three coupling constants in fact never approach the same value. Supersymmetry provides a solution to this problem – if supersymmetric particles exist and have appropriate masses, they can modify the above picture and force the coupling constants to unify. The way in which this ties back into dark matter is that the neutral-charge supersymmetric particles (broadly grouped under the name neutralinos) become candidate dark matter particles. Due to a broken symmetry in supersymmetry, the supersymmetric partner particles do not have the same masses as normal particles, and so can potentially be the dark matter. There are many flavors of supersymmetry, but one popular (and relatively simple) version called the Minimal Supersymmetric Standard Model (MSSM) illustrates the basic idea. In MSSM, one takes the standard model and adds the corresponding supersymmetric partners (plus an extra Higgs doublet). The lightest supersymmetric particle (LSP) is stable (i.e. it doesn't decay – an obvious key property for dark matter), and is typically presumed to be the main constituent of dark matter in this picture. The combined requirements that the SUSY model both unify the forces and reproduce the dark matter abundance give interesting constraints on the regime of parameter space in which one wants to search. A somewhat old, but illustrative, example is de Boer et al. (1996). These authors find that there are two regions of parameter space where the constraints can be simultaneously satisfied. In the first, the mass of the Higgs particle is relatively light (m_H < 110 GeV) and the LSP abundance is \Omega_{LSP}h^2 = 0.42 with m_{LSP} = 80 GeV. In the other region, m_H = 110 GeV and the abundance is \Omega_{LSP}h^2 = 0.19. These values clearly bracket the current best observational data. Incidentally, note that all of these particles are very non-relativistic at the GUT scale (10^{15} GeV), and so quite cold.

31.6.2 Axions

Axions are the favorite among the non-thermal relic candidates, and like WIMPs are popular for reasons pertaining to particle physics as much as cosmology. The axion was originally proposed as part of a solution to explain the lack of CP (charge-parity) violation in strong nuclear interactions – e.g. among quarks and gluons, which are the fundamental constituents of protons and neutrons (see http://www.phys.washington.edu/groups/admx/the axion.html and http://www.llnl.gov/str/JanFeb04/Rosenberg.html for a bit of background). CP is violated for electroweak interactions, and in the standard model it is difficult to explain why the strong interaction should be fine-tuned so as not to violate CP in a similar fashion. Somewhat analogous to the case of supersymmetry, a new symmetry (Peccei-Quinn symmetry) has been proposed to explain this lack of CP violation. A nice aspect of this solution is that it explains why neutrons don't have an electric dipole moment (although we won't discuss this). An important prediction of this solution is the existence of a particle called the axion. Axions have no electric charge or spin and interact only weakly with normal matter – exactly the properties one requires for a dark matter candidate. There are two very interesting differences between axions and WIMPs though. First, axions are very light. Astrophysical and cosmological constraints require that 10^{-6} < m_{axion} < 10^{-3} eV – comparable to the plausible mass range for neutrinos. Specifically, the requirement m < 10^{-3} eV is based upon SN 1987A – if the axion mass were larger than this value, then the core should have cooled by both axion and neutrino emission (remember, they're weakly interacting, but can interact) and the observed neutrino burst should have been much shorter than observed. The lower bound, somewhat contrary to intuition, comes from the requirement that the total axion density not exceed the observed dark matter density. Axions lighter than 10^{-6} eV would have been overproduced in the Big Bang, yielding \Omega_{axion} \gg \Omega_M. At a glance, one might think that the low mass of the axion would be a strong argument against axions being dark matter. After all, shouldn't axions be relativistic if they are so light? The answer would be yes – if they were thermal relics. Axions are never coupled to the radiation field though, and the mechanism that produces them gives them very little initial momentum, so axions are in fact expected to be quite cold relics.

31.7 Other Candidates

The above two sections describe what are believed to be the most probable dark matter candidates. It should be pointed out though that (1) WIMPS are a broad class and there are many options within this category, and (2) there are numerous other suggestions for dark matter. These other suggestions include such exotic things as primordial black holes formed at very early times/high density, and cosmic strings. While I would suspect that these are rather unlikely, they cannot be ruled out. Similarly, it remains possible that all of the above explanations are wrong. Fortunately, there are a number of experiments now underway that should either detect or eliminate some of these candidates. To go out on a limb, my personal guess is that things will turn out to be somewhat more complicated

than expected. Specifically, it seems plausible that both axions and WIMPS exist and each contribute at some level to the total matter density.

31.8 Detection Experiments

So how might one go about detecting dark matter? Given the wide range of masses and interaction cross-sections for the various proposed candidates, the first step is basically this – pick what you believe is the most plausible candidate and hope that you are correct. If you are, and can be the first to find it, then a Nobel prize awaits. Conversely, if you pick the wrong candidate you could very well spend much of your professional career chasing a ghost. Assuming that you are going to search though, let's take a look at how people are attempting to detect the different particles. Something to keep in mind throughout this discussion is that there are essentially two classes of dark matter searches – terrestrial direct detection experiments and astrophysical indirect detection observations.

32 An Aside on Scalar Fields

In the discussion of inflation we talked about inflation being driven by the vacuum energy in a scalar field. Since there is some confusion on the concept of scalar fields, let us revisit this matter briefly. Mathematically, a scalar field is simply a field that at each point in space can be represented by a scalar value. Everyday examples include things like temperature or density. Turning more directly to physics, consider gravitational and electric fields. In Newtonian gravity, the gravitational potential is a scalar field $\Phi$ defined by Poisson's equation,
$$\nabla^2\Phi = 4\pi G\rho,$$
where
$$\vec{F} = -\nabla\Phi, \qquad V(\Phi) = \int\rho(\vec{x})\,\Phi(\vec{x})\,d^3x.$$
Similarly, for an electric potential $\phi$,

$$\nabla^2\phi = -4\pi\rho_q, \qquad \vec{E} = -\nabla\phi, \qquad V(\phi) = \int\rho_q\,\phi\,d^3x,$$
where $\rho_q$ is the charge density. In particle physics (quantum field theory), scalar fields are associated with particles. For instance the Higgs field is associated with the predicted Higgs particle. The Higgs field is expected to have a non-zero value everywhere and be responsible for giving all particles mass. In the context of inflation we are simply introducing a new field that follows the same mathematical formalism. The term vacuum energy density simply means that a region of vacuum that is devoid of matter and radiation (i.e. no gravitational or electromagnetic fields) has a non-zero energy density due to energy contained in a field such as the inflaton field (so named because in the particle physics context it should be associated with an inflaton particle). During inflation this energy is liberated from the inflaton field. Note, though, that dark energy does not have to be vacuum energy.
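As a concrete (and purely illustrative) rendering of the formalism above, the sketch below solves Poisson's equation for the gravitational potential on a periodic grid by FFT and evaluates the potential energy integral $V(\Phi) = \int\rho\Phi\,d^3x$; the grid size and the Gaussian test density are arbitrary choices, and units are chosen with G = 1.

    # Minimal sketch: solve nabla^2 Phi = 4 pi G rho on a periodic grid
    # via FFT, then evaluate V(Phi) = \int rho Phi d^3x. Units with G = 1;
    # the Gaussian density blob is an arbitrary test case.
    import numpy as np

    N, L = 64, 1.0
    dx = L / N
    x = (np.arange(N) - N / 2) * dx
    X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
    rho = np.exp(-(X**2 + Y**2 + Z**2) / (2 * 0.05**2))  # test density

    k = 2 * np.pi * np.fft.fftfreq(N, d=dx)
    KX, KY, KZ = np.meshgrid(k, k, k, indexing="ij")
    k2 = KX**2 + KY**2 + KZ**2
    k2[0, 0, 0] = 1.0                 # avoid 0/0 in the mean (k = 0) mode

    # In Fourier space nabla^2 -> -k^2, so Phi_k = -4 pi rho_k / k^2.
    # On a torus the solution exists for the fluctuation about the mean
    # density, so the k = 0 mode of Phi is set to zero.
    Phi_k = -4 * np.pi * np.fft.fftn(rho) / k2
    Phi_k[0, 0, 0] = 0.0
    Phi = np.real(np.fft.ifftn(Phi_k))

    V = np.sum(rho * Phi) * dx**3     # V(Phi) = \int rho Phi d^3x
    print(f"V = {V:.4e}")             # negative for a self-gravitating blob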

33 Dark Energy

33.1 Generic Properties

As we have seen during the semester, the observable Friedmann equation is $H = H_0 E(z)$, where
$$E(z) = \left[\sum_i \Omega_{0i}(1+z)^{3(1+w_i)} + \left(1 - \sum_i\Omega_{0i}\right)(1+z)^2\right]^{1/2}, \qquad (596)$$
and the energy density of any component goes as

$$\rho = \rho_0(1+z)^{3(1+w)}, \qquad (597)$$

where w is the equation of state. Recall that w = 0 for dust-like matter, w = 1/3 for radiation, and w = −1 for a cosmological constant. While we have previously discussed the possibility of dark energy corresponding to a cosmological constant, this is not the only possibility. Indeed, the most generic definition is that any substance or field with an equation of state w < −1/3 – i.e. a pressure negative enough to drive accelerated expansion – is dark energy. Perhaps the single most popular question in cosmology at present is the nature of dark energy, and the best means of probing this question is by attempting to measure w(z).
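As a quick sanity check of Eq. (596), here is a minimal sketch evaluating E(z) for a universe with matter, radiation, and a constant-w dark energy component; the density parameters are illustrative concordance-like values, not numbers from these notes.

    # Evaluate E(z) from Eq. (596) for constant-w components.
    # Density parameters are illustrative concordance-like values.
    import numpy as np

    components = {                    # name: (Omega_0i, w_i)
        "matter":      (0.3,  0.0),
        "radiation":   (8e-5, 1/3),
        "dark energy": (0.7, -1.0),
    }

    def E(z):
        total = sum(O * (1 + z)**(3 * (1 + w)) for O, w in components.values())
        curvature = 1 - sum(O for O, _ in components.values())
        return np.sqrt(total + curvature * (1 + z)**2)

    for z in (0, 0.5, 1, 3, 1000):
        print(f"z = {z:6g}:  E(z) = {E(z):12.3f}")   # H(z) = H0 E(z)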

33.2 Fine Tuning Problem

Let us start by framing this question. As you may have noticed, a recurring theme in cosmology is the presence of what are called “fine-tuning” problems. These tend to be the most severe problems and the ones that point the way to new physics (like inflation). In the current context, there is a very significant fine-tuning problem associated with either a cosmological constant or a dark energy component with a constant equation of state. For the specific case of the cosmological constant, the current concordance model values imply that the universe only started accelerating at $z \simeq 0.7$ and that the cosmological constant only began to dominate at $z \simeq 0.4$. The question is why we should be so close in time to the era when the dark energy begins to dominate – a point where we can see evidence for the acceleration, but haven't yet had structures significantly accelerated away from one another. Put another way, to get the current ratio $\rho_\Lambda/\rho_m \approx 2$, we require that at the Planck time $\rho_\Lambda/\rho_r \approx 10^{-120}$. This issue is intricately related to the phrasing of the cosmological constant problem that we discussed earlier this semester, albeit in a somewhat more general form. What, then, are the possibilities for dark energy, and can these possibilities also alleviate this fine-tuning problem?
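Before moving on, note that the two redshifts quoted above follow directly from the Friedmann equations: the deceleration changes sign where $\rho_m(z) = 2\rho_\Lambda$ (since $\rho + 3p/c^2 = \rho_m - 2\rho_\Lambda$ for matter plus $\Lambda$), and $\Lambda$ dominates once $\rho_m(z) = \rho_\Lambda$. A two-line check, assuming illustrative values $\Omega_m = 0.3$ and $\Omega_\Lambda = 0.7$:

    # Redshift where acceleration begins (rho_m = 2 rho_Lambda) and where
    # Lambda starts to dominate (rho_m = rho_Lambda), for illustrative
    # concordance values Omega_m = 0.3, Omega_Lambda = 0.7.
    Om, OL = 0.3, 0.7
    z_acc = (2 * OL / Om)**(1 / 3) - 1    # from Om (1+z)^3 = 2 OL
    z_dom = (OL / Om)**(1 / 3) - 1        # from Om (1+z)^3 = OL
    print(f"acceleration begins at z = {z_acc:.2f}")   # ~0.67
    print(f"Lambda dominates for z < {z_dom:.2f}")     # ~0.33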

33.3 Cosmological Constant

The cosmological constant remains the leading candidate for dark energy, as recent observations strongly argue that at z = 0 we have w = −1 to within 10%. If it is truly a cosmological constant, then avoiding the fine tuning problem will require either new physics or a novel solution (like inflation as a solution to other fine-tuning problems).

33.4 Quintessence and Time Variation of the Equation of State

Quintessence is the general name given to models with w ≥ −1. Quintessence models were introduced as alternatives to the cosmological constant for two reasons: (1) because you can – if we don't know why there should be a cosmological constant, why not propose something else – and (2) because if the equation of state is made to be time-dependent, one can potentially avoid the fine-tuning problem described above. There are many types of quintessence, but one feature that most have in common is that, like a cosmological constant, they are interpreted as being associated with the energy density

of scalar fields. These are generally taken to have
$$\rho = \frac{1}{2}\dot\phi^2 + V(\phi) \qquad (598)$$
$$p = \frac{1}{2}\dot\phi^2 - V(\phi) \qquad (599)$$
Note that in order to generate an accelerated expansion, the above relations require that the potential term dominate the kinetic term, $\dot\phi^2 < V(\phi)$. Furthermore, for potentials satisfying the tracker condition
$$\Gamma \equiv \frac{V''V}{(V')^2} \geq 1, \qquad (602)$$
the scalar field rolling down the potential approaches a common evolutionary path such that the dark energy tracks the radiation energy density as desired.
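To see what Eqs. (598)-(599) imply for the equation of state, here is a small sketch computing w = p/ρ for a range of kinetic-to-potential ratios (the ratios are arbitrary illustrative values): w approaches −1 in the potential-dominated limit, and the expansion accelerates only while $\dot\phi^2 < V$.

    # Equation of state of a quintessence field, from Eqs. (598)-(599):
    #   rho = phidot^2/2 + V,  p = phidot^2/2 - V,  w = p/rho.
    # Acceleration requires rho + 3p < 0, i.e. phidot^2 < V.
    # The kinetic/potential ratios below are arbitrary illustrative values.
    V = 1.0
    for ratio in (0.0, 0.1, 0.5, 1.0, 2.0):       # phidot^2 / V
        kinetic = 0.5 * ratio * V
        rho, p = kinetic + V, kinetic - V
        note = "accelerating" if ratio < 1 else "not accelerating"
        print(f"phidot^2/V = {ratio:3.1f}:  w = {p/rho:+.3f}  ({note})")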

33.5 Time Variation of the Equation of State

Now, as this will be part of the discussion as we proceed, there is one important distinction to note if we have a time-variable equation of state. The standard equation

$$H^2 = H_0^2\left[\Omega_w(1+z)^{3(1+w)}\right] \qquad (603)$$
only holds for a constant value of w. If w is also a function of redshift, then it must also be integrated appropriately, and what you end up with is

$$H^2 = H_0^2\left[\Omega_w\exp\left(3\int_0^{\ln(1+z)}(1+w(x))\,d\ln(1+x)\right)\right]. \qquad (604)$$
The origin of this expression can be seen by returning to the derivation of ρ(z) in §10, where we derived that for a constant w

$$\rho = \rho_0(1+z)^{3(1+w)}, \qquad (605)$$
given an adiabatic expansion. If we start with the intermediate equation from that derivation,
$$\frac{d\rho}{\rho} = -(1+w)\,\frac{da^3}{a^3}, \qquad (606)$$

we see that for a variable w
$$\frac{d\rho}{\rho} = -(1+w(a))\,d\ln a^3 \qquad (607)$$
$$\frac{d\rho}{\rho} = (1+w(z))\,d\ln(1+z)^3 \qquad (608)$$
$$\ln\left(\frac{\rho}{\rho_0}\right) = 3\int_0^{\ln(1+z)}(1+w(z))\,d\ln(1+z) \qquad (609)$$
$$\frac{\rho}{\rho_0} = \exp\left(3\int_0^{\ln(1+z)}(1+w(z))\,d\ln(1+z)\right) \qquad (610)$$

In principle, you can insert any function w(z) that you prefer. At the moment though, the data isn't good enough to constrain a general function, so people typically use a first order parameterization along the lines of
$$w = w_0 + w_1\frac{z}{1+z}. \qquad (612)$$
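For the parameterization of Eq. (612), the integral in Eq. (610) can actually be done in closed form, giving $\rho/\rho_0 = (1+z)^{3(1+w_0+w_1)}\exp[-3w_1 z/(1+z)]$. The sketch below (with arbitrary $w_0$, $w_1$) checks a direct numerical integration of Eq. (610) against this result:

    # Check Eq. (610) for w(z) = w0 + w1 z/(1+z) (Eq. 612): numerical
    # integration vs. the closed form
    #   rho/rho0 = (1+z)^(3(1+w0+w1)) exp(-3 w1 z/(1+z)).
    # w0 and w1 are arbitrary illustrative values.
    import numpy as np
    from scipy.integrate import quad

    w0, w1 = -0.9, 0.3
    w_of_z = lambda z: w0 + w1 * z / (1 + z)

    def rho_ratio_numeric(z):
        # integrate (1 + w) d ln(1+z'), substituting x = ln(1+z')
        I, _ = quad(lambda x: 1 + w_of_z(np.exp(x) - 1), 0, np.log(1 + z))
        return np.exp(3 * I)

    def rho_ratio_analytic(z):
        return (1 + z)**(3 * (1 + w0 + w1)) * np.exp(-3 * w1 * z / (1 + z))

    for z in (0.5, 1.0, 2.0):
        print(f"z = {z}: numeric = {rho_ratio_numeric(z):.6f}, "
              f"analytic = {rho_ratio_analytic(z):.6f}")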

33.6 Phantom Energy

The recent observations have given rise to serious consideration of one of the more bizarre possibilities – w < −1. Knop et al. (2003) actually showed that if you removed the priors on $\Omega_m$ for the data existing at the time, then the dark energy equation of state yielded a 99% probability of having w < −1. Current data have improved, with uncertainties somewhat more symmetric about w = −1, but this possibility persists. Models with w < −1 violate what is known as the weak energy condition, which simply means that for these models $\rho c^2 + p < 0$ – the sum of the energy density and pressure is negative. If the weak energy condition is violated and the equation of state is constant, this leads to some rather untenable conclusions, such as:
(1) The scale factor becomes infinite in a finite time after the phantom energy begins to dominate. Specifically, if w < −1,

$$a \simeq a_{eq}\left[(1+w)\frac{t}{t_{eq}} - w\right]^{2/3(1+w)}, \qquad (613)$$
where the subscript eq denotes the time when the matter and phantom energy densities are equal. Note that the exponent in this equation is negative, which means that the solution is singular ($a \to \infty$) at a finite point in the future when
$$t = \frac{w}{1+w}\,t_{eq}. \qquad (614)$$

For example, for w = −1.1, this says that the scale factor diverges when $t = 11\,t_{eq}$ (so we're over a tenth of the way there!). If we look back at the standard equation for the Hubble

parameter, we see that it also diverges (which is consistent), as does the phantom density, which increases as
$$\rho \propto \left[(1+w)\frac{t}{t_{eq}} - w\right]^{-2}. \qquad (615)$$
The above divergences have been termed the “Big Rip”.
(2) The sound speed in the medium, $v = (|dp/d\rho|)^{1/2}$, can exceed the speed of light.
It is important to keep in mind that the above issues only transpire if the value of w is constant. You can get away with temporarily having w < −1.

33.7 Chaplygin Gas

The Chaplygin gas is yet another way to get a dark energy equation of state. Assume that there is some fluid which exerts a negative pressure of the form
$$p = -\frac{A}{\rho}. \qquad (616)$$

For an adiabatic expansion, where $dE = -p\,dV$, or $d(\rho a^3) = -p\,da^3$, this yields
$$\rho = (A + Ba^{-6})^{1/2} = (A + B(1+z)^6)^{1/2} \qquad (617)$$
(see the derivation below).

If you look at the limits of this equation, you see that as $z \to \infty$,
$$\rho \to B^{1/2}(1+z)^3, \qquad (618)$$
which is the standard density equation for pressureless dust models, while at late times,

$$\rho \to A^{1/2} = {\rm constant}, \qquad (619)$$
similar to a cosmological constant. The nice aspect of this solution is that you have a simple transition between the matter and dark energy dominated regimes. In practice, there are certain problems with the Chaplygin gas models though (such as with structure formation). The more recent revision to this proposal has been what is called a “generalized Chaplygin gas”, where

$$p \propto -\rho^{-\alpha}, \qquad (620)$$
which gives an equation of state

$$w(z) = -\frac{|w_0|}{|w_0| + (1-|w_0|)(1+z)^{3(1+\alpha)}}, \qquad (621)$$
where $w_0$ is the current value of the equation of state. Note that we are now seeing another example of a time-dependent equation of state.

Derivation of the equation for the density – starting from the adiabatic expression,
$$\rho\,da^3 + a^3\,d\rho = \frac{A}{\rho}\,da^3 \qquad (622)$$
$$a^3\rho\,d\rho = -(\rho^2 - A)\,da^3 \qquad (623)$$
$$\frac{\rho\,d\rho}{\rho^2 - A} = -\frac{da^3}{a^3} \qquad (624)$$
$$\frac{1}{2}\ln(\rho^2 - A) = \frac{1}{2}\ln B + \ln a^{-3} \qquad (625)$$
$$\rho^2 - A = B a^{-6} \qquad (626)$$
$$\rho = (A + Ba^{-6})^{1/2} \qquad (627)$$
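A quick numerical illustration of Eq. (617), with A and B arbitrary constants chosen so that the A term dominates today: the effective equation of state $w = p/\rho = -A/\rho^2$ (units with c = 1) interpolates between dust at high redshift and a cosmological constant at late times.

    # Chaplygin gas, Eq. (617): rho(z) = (A + B (1+z)^6)^(1/2), with
    # effective equation of state w = p/rho = -A/rho^2 (c = 1).
    # A and B are arbitrary illustrative constants, with A >> B today.
    import numpy as np

    A, B = 1.0, 1e-3
    rho = lambda z: np.sqrt(A + B * (1 + z)**6)
    w_eff = lambda z: -A / rho(z)**2

    for z in (0, 1, 3, 10, 100):
        print(f"z = {z:4d}:  rho = {rho(z):12.4g},  w_eff = {w_eff(z):+.4f}")
    # w_eff -> 0 at high z (dust-like, rho ~ (1+z)^3);
    # w_eff -> -1 at late times (cosmological-constant-like).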

33.8 Cardassian Model

Yet another approach to the entire problem is to modify the Friedmann equation, as with the Randall-Sundrum model at early times, replacing
$$H^2 = \frac{8\pi G}{3}\rho \qquad (628)$$
with a general form $H^2 = g(\rho)$, where g is some arbitrary function of only the matter and radiation density. The key aspect of Cardassian models is that they don't include a vacuum component or curvature – the “dark energy” is entirely contained in this modification of the Friedmann equation. A simple version of these models is
$$H^2 = \frac{8\pi G}{3}\rho + B\rho^n, \qquad n < 2/3. \qquad (629)$$
In Cardassian models the additional term is negligible at early times and only begins to dominate recently. Once this term dominates, then $a \propto t^{2/(3n)}$. The key point though is that for these models the universe can be flat, matter-dominated, and accelerating with a sub-critical matter density. Moving beyond the above simple example, the “generalized Cardassian model” has

$$H^2 = \frac{8\pi G}{3}\rho\left[1 + \left(\frac{\rho}{\rho_{card}}\right)^{q(n-1)}\right]^{1/q}, \qquad (630)$$
where n < 2/3, q > 0, and $\rho_{card}$ is a critical density such that the modifications only matter when $\rho < \rho_{card}$. Note that in many ways this is eerily reminiscent of MOND. This scenario cannot be ruled out though given the current observations, and there is somewhat better motivation than in the case of MOND. In particular, modified Friedmann equations arise generically in theories with extra dimensions (Chung & Freese 1999), such as braneworld scenarios.
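To see the simple Cardassian model of Eq. (629) in action, one can write both terms as fractions of $H_0^2$ today: $E^2(z) = (1-F)(1+z)^3 + F(1+z)^{3n}$, where F is the present-day fraction contributed by the $B\rho^n$ term. The extra term then behaves like dark energy with effective equation of state w = n − 1. A minimal sketch, with F and n as arbitrary illustrative choices rather than fits:

    # Simple Cardassian model, Eq. (629): H^2 = (8 pi G/3) rho + B rho^n,
    # with rho = rho_m0 (1+z)^3 and no vacuum or curvature term. In terms
    # of the present-day fraction F of H0^2 carried by the B rho^n term:
    #   E^2(z) = (1 - F)(1+z)^3 + F (1+z)^(3n).
    # F and n below are arbitrary illustrative choices (n < 2/3 required).
    import numpy as np

    n, F = 0.2, 0.7
    E_card = lambda z: np.sqrt((1 - F) * (1 + z)**3 + F * (1 + z)**(3 * n))
    E_lcdm = lambda z, Om=0.3: np.sqrt(Om * (1 + z)**3 + 1 - Om)

    for z in (0, 0.5, 1, 2, 5):
        print(f"z = {z:3g}:  E_card = {E_card(z):7.3f},  E_LCDM = {E_lcdm(z):7.3f}")
    # The Cardassian term scales as (1+z)^(3n), i.e. an effective fluid
    # with w = n - 1 (= -0.8 here); a ~ t^(2/3n) once it dominates.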

33.9 Other Alternatives

In the brief amount of time that we have in the semester I can only scratch the surface (partially because there are a huge number of theories that are only modestly constrained by the data). For completeness, I will simply list some of the other proposed explanations of dark energy, so that you know the names if you wish to learn more. These include k-essence, scalar-tensor models, Quasi-Steady State Cosmology, and brane world models.

34 Gravitational Lensing

Reading: Chapter 19, Coles & Lucchin

Like many other sections of this course, the topic of gravitational lensing could cover an entire semester. Here we will aim for a shallow, broad overview. I also note that this section of the notes is currently more sparse than the other sections thus far, and most of the lecture was not taken directly from these notes. For more in-depth reading, I refer you to the following excellent text on the subject: Gravitational Lensing: Strong, Weak & Micro, Saas-Fee Advanced Course 33, Meylan et al. (2005).

34.1 Einstein GR vs. Newtonian

A pseudo-Newtonian derivation for the deflection of light yields
$$\hat\alpha = \frac{2GM}{rc^2}. \qquad (631)$$
In general relativity, however, there is an extra factor of 2, such that the deflection is
$$\hat\alpha = \frac{4GM}{rc^2}. \qquad (632)$$
This can be derived directly from the GR spacetime metric in the weak field limit around a mass M,
$$ds^2 = \left(1 + \frac{2GM}{rc^2}\right)c^2dt^2 - \left(1 - \frac{2GM}{rc^2}\right)dl^2, \qquad (633)$$
as you will do in your homework. (Refer to the book's discussion of the deflection of light by the Sun.)
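Plugging standard solar values into Eqs. (631) and (632) recovers the famous 1.75 arcsecond GR deflection at the solar limb, twice the pseudo-Newtonian value (the solar mass and radius below are standard numbers, not from the notes):

    # Deflection of light grazing the solar limb: Eq. (632), with the
    # pseudo-Newtonian value of Eq. (631) shown for comparison.
    # Standard solar values (illustrative input, not from the notes).
    G, c = 6.674e-11, 2.998e8         # SI units
    M_sun, R_sun = 1.989e30, 6.957e8  # kg, m

    rad_to_arcsec = 180 / 3.141592653589793 * 3600
    alpha_GR = 4 * G * M_sun / (R_sun * c**2) * rad_to_arcsec
    print(f"GR deflection:        {alpha_GR:.2f} arcsec")      # ~1.75
    print(f"Newtonian deflection: {alpha_GR / 2:.2f} arcsec")  # ~0.87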

34.2 Gravitational Optics

I refer the reader here to Figure 19.1 in Coles & Lucchin or Figure 12 in the Saas-Fee text. If one considers a beam of light passing through a gravitational field, the amount by which the beam is deflected is determined by the gradient of the potential perpendicular to the direction of the beam. Physically, a gradient parallel to the path clearly can have no effect, and the stronger the gradient the more the light is deflected. The deflection angle is defined as
$$\hat\alpha = \frac{2}{c^2}\int\nabla_\perp\Phi\,dl, \qquad (634)$$
where l is the direction of the beam. The above is formally only valid in the limit that the deflection angle is small (i.e. weak field), which for a point source lens is equivalent to saying that the impact parameter ξ is much larger than the Schwarzschild radius ($r_s \equiv 2GM/c^2$).
Definitions: The lens plane is considered to be the plane that lies at the distance of the lens; the source plane is equivalently the plane that lies at the distance of the source. It is

common to talk about the locations of objects in the source plane and the image in the lens plane.
A Point Source Lens – For a point source the gravitational potential is
$$\Phi(\xi, x) = -\frac{GM}{(\xi^2 + x^2)^{1/2}}, \qquad (635)$$
where x is the distance from the lens parallel to the direction of the light ray. Taking the derivative and integrating along dx, one finds
$$\hat\alpha = \frac{2}{c^2}\int\nabla_\perp\Phi\,dx = \frac{4GM}{c^2\xi}, \qquad (636)$$
which is the GR deflection angle that we saw before.
Extended Lenses – Now, let us consider the more general case of a mass distribution rather than a point source. We will make what is called the thin lens approximation – that all the matter lies in a thin sheet. In this case, the surface mass density is

$$\Sigma(\vec\xi) = \int\rho(\vec\xi, x)\,dx, \qquad (637)$$
the mass within a radius ξ is

$$M(\xi) = 2\pi\int_0^\xi \Sigma(\xi')\,\xi'\,d\xi', \qquad (638)$$
and the deflection angle is

$$\hat{\vec\alpha} = \frac{4G}{c^2}\int\frac{(\vec\xi - \vec\xi')\,\Sigma(\vec\xi')}{|\vec\xi - \vec\xi'|^2}\,d^2\xi' = \frac{4GM}{c^2\xi}. \qquad (639)$$
Another way to think of this is as the continuum limit of the sum of the deflection angles for a distribution of N point masses. Note that in the above equation $\hat{\vec\alpha}$ is now a two-dimensional vector.
The Lens Equation – Now, look at the figure referenced above. In this figure α, called the reduced deflection angle, is the angle in the observer's frame between the observed source and where the unlensed source would be. The angle β is the angle between the lens and the location of the unlensed source. It is immediately apparent that θ, the angle between the lens and the observed source, is related to these two quantities by the lens equation,

$$\beta = \theta - \alpha(\theta). \qquad (640)$$

Now, if one assumes that the distances ($D_s$, $D_{ds}$, $D_d$) are large, as will always be the case, then one can immediately show via Euclidean geometry that
$$\alpha = \frac{D_{ds}}{D_s}\hat\alpha. \qquad (641)$$

[Note that equation 19.2.10 in Coles & Lucchin is incorrect – the minus sign should be a plus sign.] Note that if there is more than one solution to the lens equation then a source at β will produce several images at different locations.
If we take the definition of $\hat\alpha$ and rewrite the expression in angular rather than spatial coordinates ($\vec\xi = D_d\vec\theta$), then
$$\alpha(\vec\theta) = \frac{D_{ds}}{D_s}\hat\alpha = \frac{1}{\pi}\int d^2\theta'\,\kappa(\vec\theta')\,\frac{\vec\theta - \vec\theta'}{|\vec\theta - \vec\theta'|^2}, \qquad (642)$$
where
$$\kappa(\vec\theta) = \frac{\Sigma(D_d\vec\theta)}{\Sigma_{cr}} \qquad (643)$$
and
$$\Sigma_{cr} = \frac{c^2}{4\pi G}\,\frac{D_s}{D_d D_{ds}}. \qquad (644)$$
In the above equations κ is the convergence, and is also sometimes called the dimensionless surface mass density. It is the ratio of the surface mass density to the critical surface mass density $\Sigma_{cr}$. The significance of $\Sigma_{cr}$ is that for $\Sigma \geq \Sigma_{cr}$ the lens is capable of producing multiple images of sources (assuming that the sources are in the correct locations). This is the definition of strong lensing, so $\Sigma_{cr}$ is the dividing line between strong and weak lensing. Your book also provides another way of interpreting the critical density, which is that for the critical density one can obtain β = 0 for any angle θ – i.e. for a source directly behind the lens all light rays are focused at a well-defined focal length (which of course will differ depending on the angle θ).
Axisymmetric Lenses – Now, let's consider the case of a circularly symmetric lens. In this case,
$$\alpha(\theta) = \frac{D_{ds}}{D_s}\hat\alpha = \frac{D_{ds}}{D_d D_s}\,\frac{4GM(\theta)}{c^2\theta} = \frac{4GM(\theta)}{Dc^2\theta}, \qquad (645)$$
where
$$D \equiv \frac{D_d D_s}{D_{ds}}, \qquad (646)$$
and
$$\beta = \theta - \frac{4GM(\theta)}{Dc^2\theta}. \qquad (647)$$
The case β = 0 corresponds to

$$\theta_E = \left(\frac{4GM(\theta_E)}{Dc^2}\right)^{1/2}, \qquad (648)$$

where $\theta_E$ is called the Einstein radius. A source at β = 0 is lensed into a ring of radius $\theta_E$. Note that this angle is again simply set by the GR deflection angle. One can rewrite the lensing equation in this case for a circularly symmetric lens as
$$\beta = \theta - \theta_E^2/\theta, \qquad (649)$$
or
$$\theta_\pm = \frac{1}{2}\left(\beta \pm \left(\beta^2 + 4\theta_E^2\right)^{1/2}\right). \qquad (650)$$
These solutions correspond to two images – one on each side of the source. One of these is always at $\theta < \theta_E$, while the other is at $\theta > \theta_E$. In the case of β = 0, the two solutions are obviously both at the Einstein radius.
General Case – Consider the more general case of a lens that lacks any special symmetry. Let us define what is called the deflection potential
$$\psi(\vec\theta) = \frac{2}{Dc^2}\int\Phi(D_d\vec\theta, x)\,dx. \qquad (651)$$
The gradient of this deflection potential with respect to $\vec\theta$ is

$$\nabla_\theta\psi = D_d\,\nabla_\xi\psi = \frac{D_{ds}}{D_s}\hat\alpha = \alpha, \qquad (652)$$
and the Laplacian is
$$\nabla^2_\theta\psi = 2\kappa(\vec\theta) = 2\Sigma/\Sigma_{cr}. \qquad (653)$$
The significance of this is that we can express the potential and deflection angle in terms of the convergence,
$$\psi(\vec\theta) = \frac{1}{\pi}\int\kappa(\vec\theta')\,\ln|\vec\theta - \vec\theta'|\,d^2\theta' \qquad (654)$$
$$\alpha(\vec\theta) = \frac{1}{\pi}\int\kappa(\vec\theta')\,\frac{\vec\theta - \vec\theta'}{|\vec\theta - \vec\theta'|^2}\,d^2\theta' \qquad (655)$$
(skip the last two equations in class). We will momentarily see why $\psi(\vec\theta)$ is a useful quantity. Let us return to the lensing equation
$$\beta = \theta - \alpha(\theta), \qquad (656)$$
where all quantities can be considered vectors in the lens plane with components in both the x and y directions (which we will call, for example, $\theta_1$ and $\theta_2$). Let us define a matrix based upon the derivative of β with respect to θ:

$$A_{ij} = \frac{\partial\beta_i}{\partial\theta_j} \qquad (657)$$

$$= \delta_{ij} - \frac{\partial\alpha_i(\vec\theta)}{\partial\theta_j} \qquad (658)$$
$$= \delta_{ij} - \frac{\partial^2\psi}{\partial\theta_i\,\partial\theta_j} \qquad (659)$$
This is the matrix that maps the source plane to the lens (image) plane. Now, let us see why ψ is particularly useful. Equation 653 can be rewritten as
$$\kappa = \frac{1}{2}(\psi_{11} + \psi_{22}), \qquad (660)$$
and we can also use the deflection potential to construct a shear tensor,
$$\gamma_1 = \frac{1}{2}(\psi_{11} - \psi_{22}) \qquad (661)$$
$$\gamma_2 = \psi_{12}. \qquad (662)$$

Recall that convergence corresponds to a global size change of the image, while shear corresponds to stretching of the image in a given direction. Using these definitions for shear and convergence, we can rewrite A as

$$A(\vec\theta) = \begin{pmatrix} 1-\kappa-\gamma_1 & -\gamma_2 \\ -\gamma_2 & 1-\kappa+\gamma_1 \end{pmatrix},$$
or
$$A(\vec\theta) = (1-\kappa)\begin{pmatrix} 1-g_1 & -g_2 \\ -g_2 & 1+g_1 \end{pmatrix},$$
where $g \equiv \gamma/(1-\kappa)$ is called the reduced shear tensor. When you look at a lensed image on the sky it is this reduced shear tensor that is actually observable. What you really want to measure though is κ, since this quantity is linearly proportional to the mass surface density (at least in the context of general relativity). I will skip the details, but given the mapping A in terms of κ and g, one can derive an expression for the convergence of

$$\nabla\ln(1-\kappa) = \frac{1}{1-g_1^2-g_2^2}\begin{pmatrix} 1-g_1 & -g_2 \\ -g_2 & 1+g_1 \end{pmatrix}\begin{pmatrix} g_{1,1}+g_{2,2} \\ g_{2,1}-g_{1,2} \end{pmatrix}.$$
From this one can then recover the mass distribution. The one caveat here is what is known as the mass sheet degeneracy, which simply put states that the solution is only determined to within an arbitrary constant. To see this, consider a completely uniform sheet of mass. What is the deflection angle? Zero. Thus, you can always modify your mass distribution by an arbitrary constant. For determinations of masses of systems like galaxy clusters, the assumption is that far enough away the mass density (at least that associated with the cluster) goes to zero.

34.3 Magnification

It is worth pointing out that the magnification of a source is given by the ratio of the observed solid angle to the unlensed solid angle. This is described by a magnification tensor $M(\vec\theta) = A^{-1}$, such that the magnification is

$$\mu = \left|\frac{\partial^2\theta}{\partial\beta^2}\right| = \det M = \frac{1}{\det A} = \frac{1}{(1-\kappa)^2 - |\gamma|^2}. \qquad (663)$$
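The point-lens results of the previous section are easy to exercise numerically: working in units of $\theta_E$, Eq. (650) gives the two image positions, and for a point lens Eq. (663) reduces to $\mu = 1/|1 - (\theta_E/\theta)^4|$ per image. A minimal sketch (the source offsets β are arbitrary illustrative values):

    # Image positions (Eq. 650) and magnifications for a point-mass lens,
    # in units of the Einstein radius (theta_E = 1). For a point lens,
    # Eq. (663) reduces to mu = 1/|1 - (theta_E/theta)^4| per image.
    # The source offsets beta are arbitrary illustrative values.
    import numpy as np

    def images(beta):
        """Two image positions theta_+/- for source offset beta."""
        root = np.sqrt(beta**2 + 4.0)
        return 0.5 * (beta + root), 0.5 * (beta - root)

    def mu(theta):
        return 1.0 / abs(1.0 - 1.0 / theta**4)

    for beta in (0.1, 0.5, 1.0):
        tp, tm = images(beta)
        print(f"beta = {beta}: theta+ = {tp:+.3f} (mu = {mu(tp):5.2f}), "
              f"theta- = {tm:+.3f} (mu = {mu(tm):5.2f}), "
              f"total mu = {mu(tp) + mu(tm):5.2f}")
    # As beta -> 0 both images approach the Einstein ring and the total
    # magnification diverges; beta = 1 gives the classic total of 1.34.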

34.4 Critical Curves and Caustics

Definitions: Critical curves are locations in the lens plane where the Jacobian vanishes (det A(θ) = 0). These are smooth, closed curves and formally correspond to infinite magnification, though the limits of the geometrical optics approximation break down before this point. Lensed images that lie near critical curves are highly magnified though, and for high redshift galaxies behind galaxy clusters these magnifications can reach factors of 25-50.
Definitions: Caustics correspond to the mapping of the critical curves into the source plane – i.e. they are the locations at which sources must lie in the source plane for an image to appear on the critical curve.
