
PHY104 - Introduction to

S. P. Littlefair

June 4, 2013

Chapter 1

Properties of Light

1.1 Introduction

The only information we have about our Universe comes from the light emitted by objects within it. A good understanding of light is essential in all of astrophysics. We must learn what light is and how it behaves. We must understand its properties, and learn how to use those properties to discover the information we seek. Finally, we must understand how light interacts with the matter around it. Much of the rest of the astrophysics course at Sheffield deals with what we know. By covering the basic physics of light, this course aims to explain how we acquire that knowledge. Our starting point is a question which seems quite simple: “what is light?”

1.2 The wave nature of light

It is easy to demonstrate that light behaves like a wave. The famous Young’s slit experiment is an elegant demonstration, and is shown in the left-hand side of figure 1.1. In this experiment, a thin plate with two parallel slits is illuminated by a single light source, and the light passing through the slits strikes a screen behind them. When we look at the screen, we see a diffraction pattern, made up of a series of bright and dark fringes.

The diffraction pattern is easily understood if we think of light as a wave propagating through some medium, like water waves on a lake. In our experiment, each slit acts as a source of light waves; a wavefront of light spreads out from each slit. The two wavefronts hit the screen, and the brightness of the light at any point on the screen depends on how the wavefronts interfere there. If the light waves are in phase (if the peaks of both waves line up), then we get a bright region. If the two waves are out of phase (the peak of one wave corresponds to a trough in the other), the light waves cancel out, and we see a dark region (see the right-hand side of figure 1.1). We will return to the wave nature of light in a moment, but first let us look at the startling property of light that emerges from quantum mechanics.

Figure 1.1: Left: A schematic of the Young’s double slit experiment. A light source behind S1 illuminates the two slits at S2. These slits act as secondary sources of light, and light waves spread out from the slits like water waves on a lake. Right: the principle of superposition. Light in phase adds to give brighter light, but light which is out of phase cancels out to produce dark regions.
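The superposition argument in section 1.2 can be checked with a few lines of Python. This is only a sketch under two assumptions that are not spelled out in the notes: the two waves have equal amplitude, and the brightness is taken to be proportional to the square of the combined amplitude. The function names are our own.

```python
import math


def combined_amplitude(phase_difference, amplitude=1.0):
    """Amplitude of the sum of two equal-amplitude waves.

    The waves differ in phase by `phase_difference` radians; adding them
    gives a wave of amplitude 2 * amplitude * |cos(phase_difference / 2)|.
    """
    return 2.0 * amplitude * abs(math.cos(phase_difference / 2.0))


def relative_brightness(phase_difference):
    """Brightness relative to one slit alone (assumes brightness ~ amplitude^2)."""
    return combined_amplitude(phase_difference) ** 2


# Peaks aligned, a quarter of a cycle apart, and peak on trough.
for label, phase in [("in phase", 0.0),
                     ("quarter cycle", math.pi / 2),
                     ("out of phase", math.pi)]:
    print(f"{label:>13}: {relative_brightness(phase):.3f}")
```

With the peaks aligned the waves reinforce each other; with a peak on a trough the sum vanishes, which is the dark fringe.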

1.3 The particle nature of light

Whilst it is easy to demonstrate that light behaves like a wave, it is also possible (though nowhere near as easy!) to demonstrate that light behaves like a particle. When a metal plate is illuminated with blue light, electrons absorb the energy from the light, and can escape from the metal. This phenomenon is known as the photoelectric effect. If light is a wave, we might think that the energy of the escaping electrons would increase as the intensity of light increased, but that the frequency of that light wouldn’t matter. In fact, the energy of the released electrons increases linearly with the frequency of the light, and below a certain frequency, no electrons are emitted from the metal at all (see figure 1.2). It is extremely difficult to explain this result by thinking about light as a wave.

Figure 1.2: The photoelectric effect. Blue light is shone onto a metal plate. Electrons in the metal absorb the energy from the light and escape from the metal (left hand side). When the energy of the emerging electrons is measured, it turns out to rise linearly with the frequency of the light (right hand side). Furthermore, below a certain frequency, no electrons are released from the metal plate. This effect led Einstein to propose the particle nature of light in 1905. [Diagram labels: left, light of frequency $\nu$ releasing an electron e-; right, electron energy against frequency, following $E = h\nu - h\nu_0$ above the threshold frequency $\nu_0$.]

The photoelectric effect’s dependence upon frequency was predicted by Einstein, in 1905, based upon a model of light as particles, called photons. He suggested that each photon has an energy which is proportional to its frequency, $E = h\nu$. An electron requires a certain amount of energy to free it from the metal. If we call this amount of energy $W$ (known as the work function) then the energy of the freed electron should be given by $E = h\nu - W$. Thus, the energy of the emerging electron increases linearly with the frequency of the incident light, just as we see in the photoelectric effect. What happens if the frequency of the light is reduced, so that the energy carried by the photons is less than $W$? In this case, no single photon has enough energy to liberate an electron, and so no electrons can escape the metal. The threshold frequency $\nu_0$ in figure 1.2 is therefore given by $h\nu_0 = W$. These were the predictions made by Einstein for the behaviour of the photoelectric effect.

Einstein’s predictions of the frequency dependence of the photoelectric effect, based upon the photon model, were confirmed in painstaking experiments by Millikan in 1913-1914. Although Millikan didn’t believe in the particle nature of light at the time, his experiments earned him the Nobel prize, and gave tremendous support to the picture of light as discrete particles.

The photoelectric effect tells us that each photon has an energy given by $E = h\nu$. Photons also have a momentum, given by $p = E/c = h\nu/c$, although to understand why requires us to study Einstein’s theory of special relativity, which is beyond this course.
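As a rough numerical illustration of Einstein’s prediction, the sketch below evaluates $E = h\nu - W$ and the photon momentum $p = h\nu/c$. The work function of 2.3 eV and the two example frequencies are hypothetical values chosen only for illustration; they do not come from these notes.

```python
H = 6.62607015e-34    # Planck constant (J s)
C = 2.99792458e8      # speed of light (m/s)
EV = 1.602176634e-19  # one electronvolt in joules


def photon_energy(frequency_hz):
    """Photon energy E = h * nu, in joules."""
    return H * frequency_hz


def photon_momentum(frequency_hz):
    """Photon momentum p = h * nu / c, in kg m/s."""
    return H * frequency_hz / C


def electron_energy(frequency_hz, work_function_j):
    """Energy of the freed electron, h * nu - W, or None if no electron escapes."""
    excess = photon_energy(frequency_hz) - work_function_j
    return excess if excess >= 0.0 else None


W_EXAMPLE = 2.3 * EV  # hypothetical work function, for illustration only

# Roughly blue light and roughly red light (assumed example frequencies).
for label, nu in [("blue", 6.7e14), ("red", 4.3e14)]:
    energy = electron_energy(nu, W_EXAMPLE)
    if energy is None:
        print(f"{label}: no electrons released")
    else:
        print(f"{label}: electron energy {energy / EV:.2f} eV")

print(f"threshold frequency: {W_EXAMPLE / H:.3e} Hz")
```

However bright the red light is made, `electron_energy` still returns `None`, because the intensity never enters the calculation; only the frequency of each photon matters.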

1.4 The wave-particle nature of light

So we have some experiments which show that light behaves as a wave, and others which show quite clearly that light behaves as a particle. In fact, there are even experiments that show that light can behave as both a particle and a wave at the same time! We are going to go back to the Young’s slit experiment described earlier, but perform an experiment in which we reduce the intensity of the light so much that only one photon illuminates the plate with the slits at any one time. Now, let us put a special camera in place of the screen, that can detect each photon as it arrives at the screen.

What we find is astonishing. The light is clearly behaving like photons, because we see each one hit the screen individually. The location of each photon as it hits the screen is seemingly random; one photon arrives at a given location and a moment later another photon arrives somewhere else. Over time, however, as we watch the photons arrive one-by-one, we find more photons are hitting the screen where the bright regions of the original diffraction pattern were! How can this be? The diffraction pattern was made by the interference of waves of light. The photons are going through the slits one-by-one and so cannot be interfering with each other. What is going on is that the photons are interfering with themselves; behaving as a particle and a wave at the same time.

Thus, whilst light sometimes behaves as a wave, and sometimes as a particle, in reality it is neither. The true nature of light is much stranger, and is given by quantum mechanics, which you will study later in the Physics course. In the remainder of this module, we will choose to describe light as either a particle or a wave, depending on which suits us most!

1.5 The electro-magnetic spectrum

If light is a wave, what is it a wave in? The answer is that light is an electromagnetic wave. Modern theory describes light as electric and magnetic fields which oscillate in phase, perpendicular to the direction of propagation, but in planes oriented at 90 degrees to each other. Confused? Have a look at figure 1.3, or take a look at the Java applet at http://www.phys.hawaii.edu/~teb/java/ntnujava/emWave/emWave.html.

Figure 1.3: An electromagnetic wave.

Like any wave, its speed of propagation is given by the wave equation, $c = \nu\lambda$, where $c$ is the speed of light, $\nu$ is the frequency, and $\lambda$ is the wavelength of the light. Light of different frequencies is referred to by different names; visible light is only a tiny portion of the electromagnetic spectrum, which is shown in figure 1.4. In astronomy one of the most useful techniques available to us is to exploit the full electromagnetic spectrum. Because light is described both by the wave equation ($c = \nu\lambda$) and by $E = h\nu$, we can describe a position on the electromagnetic spectrum by wavelength, frequency, or energy. Visible light (frequencies around $10^{15}$ Hz) tells us much about the thermal emission from stars and galaxies, but infrared and microwave radiation (wavelengths from 1 micron to 1 cm) can show us the location of very cool stars and dust, whilst high-energy radiation (γ-rays, X-rays and UV emission) tells us about the most energetic processes in the Universe.

Only a small portion of the electromagnetic spectrum is observable from the surface of the Earth, however. The optical, infrared and radio portions of the spectrum are visible through atmospheric windows, but the Earth’s atmosphere is opaque to other wavelengths of light. The transparency of the Earth’s atmosphere is shown in figure 1.5. Regions of the electromagnetic spectrum not visible from the Earth must be observed from satellites in space; there now exist satellites which cover most of the electromagnetic spectrum.
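Since $c = \nu\lambda$ and $E = h\nu$ together tie wavelength, frequency and photon energy to one another, converting between them is a one-line calculation. The sketch below uses example wavelengths taken from the ranges quoted above; the function names are our own.

```python
H = 6.62607015e-34    # Planck constant (J s)
C = 2.99792458e8      # speed of light (m/s)
EV = 1.602176634e-19  # one electronvolt in joules


def frequency_from_wavelength(wavelength_m):
    """Frequency in Hz from the wave equation c = nu * lambda."""
    return C / wavelength_m


def photon_energy_from_wavelength(wavelength_m):
    """Photon energy in joules, E = h * nu = h * c / lambda."""
    return H * C / wavelength_m


examples = [("visible, 550 nm", 550e-9),
            ("infrared, 1 micron", 1e-6),
            ("microwave, 1 cm", 1e-2)]

for name, wavelength_m in examples:
    nu = frequency_from_wavelength(wavelength_m)
    energy_ev = photon_energy_from_wavelength(wavelength_m) / EV
    print(f"{name:>20}: {nu:.3e} Hz, {energy_ev:.3e} eV")
```

Any one of the three numbers pins down the other two, which is why different branches of astronomy can label the same spectrum by wavelength, frequency or energy.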

Figure 1.4: The electromagnetic spectrum, showing the position of visible light.

Figure 1.5: Transparency of the Earth’s atmosphere.

1.6 Measuring Light - Brightness

The most basic thing we can measure about the light from an astrophysical object is its quantity. How much light is emitted from a star? To answer this usefully we need to define some quantities. The first is the monochromatic flux. This is the energy falling on a unit area, per unit time, at a given frequency. Actually, we need to be careful how we state that last part, “at a given frequency”. In the same way that no light falls on a single point, because a point has no area, no light is emitted at a single frequency. Instead, we should properly ask how much light is emitted in an infinitesimally small range of frequency, $d\nu$.

Suppose we have a perfect telescope, which detects all the light that falls on it. The collecting area of this telescope is $\Delta A$. Suppose we tune that telescope so it is only sensitive to light over a frequency range $\Delta\nu$, and collect light for a time interval of $\Delta t$. We detect an amount of energy given by $\Delta E$. The amount of energy detected per unit time, area and frequency is then given by $\Delta E / (\Delta t\,\Delta A\,\Delta\nu)$. To find the monochromatic flux, we must let all these intervals become infinitesimally small. In other words, the monochromatic flux is given by

$$F_\nu = \lim_{\substack{\Delta A \to 0 \\ \Delta\nu \to 0 \\ \Delta t \to 0}} \frac{\Delta E}{\Delta t\,\Delta A\,\Delta\nu}. \tag{1.1}$$

As well as the monochromatic flux, which measures the amount of light as a function of wavelength, we may want to know the amount of light across all wavelengths. This quantity is known as the bolometric flux and is given by

$$F^{\rm bol} = \int_0^\infty F_\nu\,d\nu. \tag{1.2}$$

1.6.1 Monochromatic flux in more detail

Above we talk about the monochromatic flux, and we define it as the amount of energy detected per unit time, area and per unit frequency. It is given the symbol $F_\nu$. We can also define a monochromatic flux as the amount of energy detected per unit time, area and per unit wavelength. We give this the symbol $F_\lambda$. How are the two related? Over a small frequency range $d\nu$, what is the total energy received per unit area and time? It is an amount equal to $F_\nu\,d\nu$. Clearly, this amount of energy doesn’t change, whether we measure the monochromatic flux in wavelength or frequency units. Using this fact, we can write

$$F_\lambda\,d\lambda = F_\nu\,d\nu, \tag{1.3}$$

where $d\lambda$ is the small wavelength range that corresponds to the frequency range $d\nu$. We can find $d\lambda$ from

$$d\lambda = \left|\frac{d\lambda}{d\nu}\right| d\nu. \tag{1.4}$$

We take the modulus of $d\lambda/d\nu$ because we only care about the size of the wavelength range $d\lambda$. Let’s substitute equation (1.4) into equation (1.3) to get

$$F_\lambda \left|\frac{d\lambda}{d\nu}\right| d\nu = F_\nu\,d\nu, \quad\text{or}\quad F_\lambda \left|\frac{d\lambda}{d\nu}\right| = F_\nu.$$

This is our answer; an equation linking $F_\lambda$ and $F_\nu$. You might be wondering how we can calculate $|d\lambda/d\nu|$, but in fact it is rather simple, because $\lambda$ and $\nu$ are related via the wave equation, $c = \nu\lambda$. Re-arranging, and differentiating this equation, we find

$$\lambda = \frac{c}{\nu}, \qquad \frac{d\lambda}{d\nu} = -\frac{c}{\nu^2}, \qquad \left|\frac{d\lambda}{d\nu}\right| = \frac{c}{\nu^2} = \frac{\lambda^2}{c},$$

which gives

$$F_\nu = F_\lambda \frac{\lambda^2}{c}. \tag{1.5}$$
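Equation (1.5) is easy to get wrong by a factor of units, so a short numerical sketch may help. It converts the flux of $3.44 \times 10^{-8}$ W m$^{-2}$ µm$^{-1}$ quoted later in these notes into frequency units. The wavelength of 0.55 µm used here is our own assumption, since the notes do not say at which wavelength that flux applies.

```python
C = 2.99792458e8  # speed of light (m/s)


def f_nu_from_f_lambda(f_lambda, wavelength_m):
    """Equation (1.5): F_nu = F_lambda * lambda^2 / c.

    f_lambda is in W m^-2 per metre of wavelength; the result is in
    W m^-2 Hz^-1.
    """
    return f_lambda * wavelength_m ** 2 / C


def f_lambda_from_f_nu(f_nu, wavelength_m):
    """Inverse of equation (1.5): F_lambda = F_nu * c / lambda^2."""
    return f_nu * C / wavelength_m ** 2


# 3.44e-8 W m^-2 per micron is 3.44e-2 W m^-2 per metre of wavelength.
f_lambda_si = 3.44e-8 / 1e-6
wavelength_m = 0.55e-6  # assumed wavelength, not given in the notes

f_nu = f_nu_from_f_lambda(f_lambda_si, wavelength_m)
print(f"F_nu = {f_nu:.3e} W m^-2 Hz^-1")
```

The two functions are exact inverses of each other, so converting there and back is a useful sanity check on the units.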

1.6.2 Luminosity

Whilst the flux is generally what we can measure from the Earth (or just above it), what we would often like to be able to measure is the total energy emitted by the source (in all directions). This is known as the luminosity and is a fundamental property of the source. The monochromatic luminosity is defined as the energy emitted by the source per unit time, per unit frequency, i.e.

$$L_\nu = \lim_{\substack{\Delta\nu \to 0 \\ \Delta t \to 0}} \frac{\Delta E}{\Delta t\,\Delta\nu}. \tag{1.6}$$

Just as flux has a monochromatic and bolometric definition, we can also define the bolometric luminosity, which is the total energy emitted by the source per unit time, at all wavelengths,

$$L^{\rm bol} = \int_0^\infty L_\nu\,d\nu. \tag{1.7}$$

1.6.3 Inverse Square Law


Figure 1.6: Geometry for proving the inverse square law.

How are the luminosity and flux related? Consider a sphere at a distance $d$ from the source (shown in figure 1.6). Assuming the source is isotropic, the energy from the source is evenly spread over the surface of a sphere of radius $d$; the energy is spread over an area $4\pi d^2$. The total amount of energy emitted per unit time is the bolometric luminosity, $L^{\rm bol}$. Therefore, the total energy arriving at the sphere, per unit time and per unit area (the bolometric flux) is given by the inverse square law

$$F^{\rm bol} = \frac{L^{\rm bol}}{4\pi d^2}. \tag{1.8}$$

A similar equation can be written relating the monochromatic flux and luminosity, $F_\nu$ and $L_\nu$.
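As a quick check of equation (1.8), the sketch below computes the bolometric flux of the Sun at the Earth. The solar luminosity and the length of the astronomical unit are standard reference values that we have supplied; they are not quoted in these notes.

```python
import math

L_SUN_W = 3.828e26     # nominal solar luminosity (W), supplied value
AU_M = 1.495978707e11  # one astronomical unit (m), supplied value


def flux_from_luminosity(luminosity_w, distance_m):
    """Inverse square law, equation (1.8): F = L / (4 * pi * d^2)."""
    return luminosity_w / (4.0 * math.pi * distance_m ** 2)


flux_at_earth = flux_from_luminosity(L_SUN_W, AU_M)
print(f"Solar flux at 1 AU: {flux_at_earth:.0f} W m^-2")

# Doubling the distance spreads the same energy over four times the area.
ratio = flux_from_luminosity(L_SUN_W, 2 * AU_M) / flux_at_earth
print(f"Flux at 2 AU relative to 1 AU: {ratio:.2f}")
```

The second print shows the defining feature of the law: the flux falls as the square of the distance, whatever the luminosity.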

Chapter 2

Magnitudes

2.1 An astronomy quirk - magnitudes

Generally speaking the flux is a perfectly useful measure of how much light we receive on Earth. There are some practical difficulties in measuring the flux of course; our detectors are not 100% efficient, and we have to correct for the amount of light absorbed by the atmosphere. Nevertheless, once these corrections have been made, there should be no problem reporting the amount of light received as fluxes, right? However, astronomy is a science with a tendency towards unconventional units of measure, and no exception is made here. In astronomy, we often use the apparent magnitude instead of the flux.

The scale upon which magnitude is now measured has its origin in the ancient Greek practice of dividing those stars visible to the naked eye into six magnitudes. The brightest stars were said to be of first magnitude (m = 1), while the faintest were of sixth magnitude (m = 6), the limit of human visual perception (without the aid of a telescope). Each grade of magnitude was considered to be twice the brightness of the following grade (a logarithmic scale). Nowadays, magnitude is still a logarithmic scale, but has been put on a formal footing. The apparent magnitude is related to the monochromatic flux by

$$m_\nu = -2.5\log_{10} F_\nu + c. \tag{2.1}$$

However, the ancient Greek convention has been retained; brighter stars have smaller magnitudes than fainter ones - courtesy of that minus sign in equation (2.1). So a star with magnitude -3 is much brighter than a star of magnitude 14. What is the value of the constant, $c$, in equation (2.1)? In fact, we are free to choose any value we like, as long as we keep the same value between measurements.

Although magnitudes can be a bit confusing on first acquaintance, they have one big advantage; the numbers involved tend to be easy to deal with. A magnitude of 0 is a lot quicker to grasp and remember than a flux of $3.44 \times 10^{-8}$ W m$^{-2}$ µm$^{-1}$!
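Equation (2.1) implies that a difference in magnitude corresponds to a ratio of fluxes, with the constant $c$ cancelling out. The sketch below makes this concrete; the function names are our own, and the fluxes passed to `magnitude` are in arbitrary units.

```python
import math


def magnitude(flux, zero_point=0.0):
    """Equation (2.1): m = -2.5 * log10(F) + c, with c passed as zero_point."""
    return -2.5 * math.log10(flux) + zero_point


def flux_ratio(m1, m2):
    """Flux ratio F1 / F2 of two sources with magnitudes m1 and m2."""
    return 10.0 ** (-0.4 * (m1 - m2))


# First magnitude versus sixth magnitude on the modern scale.
print(f"m = 1 is {flux_ratio(1.0, 6.0):.0f} times brighter than m = 6")

# The example from the text: magnitude -3 versus magnitude 14.
print(f"m = -3 is {flux_ratio(-3.0, 14.0):.2e} times brighter than m = 14")
```

On the modern scale a step of five magnitudes is a factor of 100 in flux, so each single magnitude is a factor of about 2.5, close to the factor of two assumed in the ancient scheme.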

2.2 Measuring magnitude:

Figure 2.1: Left: an idealised astronomical filter, plotted as transmission against wavelength (nm), with central wavelength λc and width Δλ marked. Right: the Johnson-Cousins filter set.

The technique of measuring accurate fluxes and magnitudes of astronomical sources is called photometry. Photometric systems are defined by the sets of filters which are used to isolate individual wavelength ranges in order to measure a monochromatic flux. An example filter set is shown in figure 2.1. This is the Johnson-Cousins filter set, which is widely used throughout astronomy.

In principle, astronomical photometry is simple. Consider the idealised filter shown in figure 2.1. This filter has a central (average) wavelength of $\lambda_c$ and a width of $\Delta\lambda$. This filter is placed on a camera on a telescope with collecting area $\Delta A$. The camera is exposed to light for a time $\Delta t$. Not every photon which falls on the telescope will be recorded. Suppose our camera/telescope combination detects a fraction, $\eta$, of the photons which fall on the telescope (we say it has an efficiency of $\eta$). Therefore, if we detect $N$ photons, the total number of photons which fell on our telescope was $N/\eta$. What is the average energy of each of these photons? They have an average energy of $E = h\nu = hc/\lambda_c$. So the total energy which arrived is $\Delta E = Nhc/(\eta\lambda_c)$. The flux in our filter follows from $F_\lambda = \Delta E/(\Delta A\,\Delta t\,\Delta\lambda)$.
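The chain of steps above, from detected photons to a flux, can be written out as a short function. Every number in the example call is hypothetical (photon count, efficiency, telescope area, exposure time and filter width), chosen only to show the arithmetic.

```python
H = 6.62607015e-34  # Planck constant (J s)
C = 2.99792458e8    # speed of light (m/s)


def flux_from_counts(n_detected, efficiency, central_wavelength_m,
                     area_m2, exposure_s, filter_width_m):
    """Monochromatic flux F_lambda (W m^-2 per metre of wavelength)."""
    n_arrived = n_detected / efficiency                # photons reaching the telescope
    energy_j = n_arrived * H * C / central_wavelength_m  # total energy, N h c / (eta lambda_c)
    return energy_j / (area_m2 * exposure_s * filter_width_m)


# Hypothetical observation: one million detected photons at 50% efficiency,
# through a filter centred on 550 nm and 90 nm wide, with a 1 m^2 telescope
# and a 10 s exposure.
f_lambda = flux_from_counts(1.0e6, 0.5, 550e-9, 1.0, 10.0, 90e-9)
print(f"F_lambda = {f_lambda:.3e} W m^-2 m^-1")
```

Real photometry also has to correct for the atmosphere and the detailed shape of the filter, which this idealised sketch ignores.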

Fluxes in astronomical filters are normally referred to using the name of the filter as a subscript. As an example, the flux as measured in the Johnson V-band would be called $F_V$. The V-band magnitude can be calculated in the usual way

$$m_V = -2.5\log_{10} F_V + c_V.$$

Notice that I have called the constant $c_V$, to indicate that this constant is specific to the V-band. In the Johnson system, the constant $c$ is chosen separately for each filter. The value of $c$ is chosen so that the bright star, Vega, has a magnitude of 0 in every band. Magnitudes can be measured for a star in each photometric band and colour indices determined by subtracting the magnitudes as measured in different filters, e.g.

$$m_B - m_V = B - V = -2.5\log_{10} F_B + c_B + 2.5\log_{10} F_V - c_V,$$

$$B - V = +2.5\log_{10}\left(\frac{F_V}{F_B}\right) + \text{const}. \tag{2.2}$$

Colour indices provide crude information about the spectra of astronomical sources (they are often just called the “colour”), and can be used to estimate the temperatures of stars. A couple of things are worth noting. Because magnitudes are a logarithmic scale, subtracting two magnitudes is equivalent to dividing two fluxes. Also, since the magnitude scale is fixed so that Vega has a magnitude of 0 in each band, Vega also has colour indices of zero, which fixes the constant in the equation above.

2.3 Absolute Magnitudes

We noted in the last chapter that the luminosity is a fundamental property of the source. The flux, by contrast, depends both on the source and on its distance from us. The inverse square law tells us that a source can be faint (low in flux) because it is either intrinsically dim (low in luminosity), or very far away. Since the apparent magnitude is related to the flux, it too depends on the distance to the source. We would like a measure of intrinsic brightness for the magnitude scale; the magnitude equivalent of luminosity. This measure, which we will call the absolute magnitude, is the magnitude an object would have if it were 10 parsecs away¹. How are the absolute and apparent magnitude related? $F_\nu$ is the flux of our object, and we’ll call the flux it would have at 10 parsecs $F_\nu^{10}$. Let us measure the distance to our object, $d$, in parsecs. The inverse square law tells us that

$$\frac{F_\nu}{F_\nu^{10}} = \frac{L/4\pi d^2}{L/4\pi 10^2} = \left(\frac{10}{d}\right)^2. \tag{2.3}$$

Using equation (2.1) we can then write

$$M = -2.5\log_{10} F_\nu^{10} + c,$$

and

$$m - M = -2.5\log_{10} F_\nu + 2.5\log_{10} F_\nu^{10},$$

or

$$m - M = -2.5\log_{10}\left(\frac{F_\nu}{F_\nu^{10}}\right),$$

where $M$ is the absolute magnitude. Combining this last equation with equation (2.3), we find

$$m - M = 5\log_{10}\left(\frac{d}{10}\right), \tag{2.4}$$

remembering all the while that $d$ is measured in parsecs! Equation (2.4) tells us that the apparent and absolute magnitudes are related by the distance to the source; in fact, the quantity $m - M$ is called the distance modulus.

We can derive a general form of equation (2.4) too, which tells us how the apparent magnitude of an object changes with distance. We simply replace the 10 parsec distance we chose earlier by a general distance $d_2$, and ask what the magnitude of our object, $m_2$, is at that distance:

$$m - m_2 = 5\log_{10}\left(\frac{d}{d_2}\right), \quad\text{or}\quad m_1 - m_2 = 5\log_{10}\left(\frac{d_1}{d_2}\right),$$

where I have replaced $m$ and $d$ with $m_1$ and $d_1$.

If we can measure the apparent magnitude for an object, and somehow know its absolute magnitude (perhaps all sources of a given type have the same absolute magnitude), then we can calculate the distance modulus, and from that, the distance. Generally speaking, however, we do not know the absolute magnitude of an object, and can only measure apparent magnitudes. How then, do we calculate the distance to astronomical objects? We will cover distance measurement in the next chapter.

¹ Parsecs are a standard unit of distance in astrophysics. We’ll see why in the next chapter.
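Equation (2.4) can be used in both directions: to find the distance modulus from a distance, or the distance from a measured apparent magnitude and a known absolute magnitude. A minimal sketch, with function names of our own choosing and distances in parsecs throughout:

```python
import math


def distance_modulus(distance_pc):
    """Equation (2.4): m - M = 5 * log10(d / 10), with d in parsecs."""
    return 5.0 * math.log10(distance_pc / 10.0)


def distance_from_magnitudes(apparent_m, absolute_m):
    """Invert equation (2.4): d = 10^((m - M) / 5 + 1) parsecs."""
    return 10.0 ** ((apparent_m - absolute_m) / 5.0 + 1.0)


for d_pc in (10.0, 100.0, 1000.0):
    print(f"d = {d_pc:6.0f} pc  ->  m - M = {distance_modulus(d_pc):.1f}")

# A source five magnitudes fainter than its absolute magnitude.
print(f"m - M = 5  ->  d = {distance_from_magnitudes(10.0, 5.0):.0f} pc")
```

At 10 parsecs the distance modulus is zero by construction, and every factor of ten in distance adds five magnitudes.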

Chapter 3

Distance Measurement

3.1 Parallax

Figure 3.1: Astronomical Parallax. [Diagram labels: parallax angle p, distance d, the Earth’s positions in Summer and Winter, and a baseline of 1 AU.]

Parallax is the name given to the phenomenon where an object appears to move, relative to the background, when viewed along two different lines of sight. Parallax is easily experienced by holding your thumb up at arm’s length and then closing one eye, and then the other. Your thumb appears to move relative to the background. The effects of parallax mean that each eye sees a subtly different scene; having two eyes with overlapping fields of view is what allows your brain to measure the distances to everyday objects and gives you depth perception. Therefore, you are already an expert in applying parallax to measure distance, and what follows will surely be simple revision.

In astrophysics, parallax is the only method we have for obtaining direct measures of distance for objects outside our solar system. The way it works is illustrated in figure 3.1. As the Earth moves round the Sun, we observe different lines of sight towards a nearby star. Because nearby objects experience larger parallax than further objects (it should be obvious why, from figure 3.1), the nearby star appears to move, relative to the distant, background stars. From the right hand side of figure 3.1, we can see that

tan p = (Earth-Sun distance)/d.

Since p is such a small angle, we can use the small angle approximation, tan p ≈ p, where p is measured in radians, to get

p (radians) = (Earth-Sun distance)/d.

We can make life much easier for ourselves by some careful choice of units. For a start, parallax angles are so small that a radian is an awkward unit. We would be better off using arcseconds¹. Also, the Earth-Sun distance and typical stellar distances are too large to be easily measured in metres. What we will do is define a new unit, the parsec, so that an object at a distance of 1 parsec (1 pc) will have a parallax of 1 arcsecond. If we do this, the parallax equation becomes

$$p\,(\text{arcseconds}) = \frac{1}{d\,(\text{pc})}. \tag{3.1}$$

How large is a parsec? Basic trigonometry reveals that a parsec is 206,265 AU, $3.086 \times 10^{16}$ m, or 3.26 light years. It is a very convenient unit for distances in astronomy.
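The “basic trigonometry” behind the size of the parsec is just the small angle formula with $p$ set to one arcsecond. The sketch below carries it out; the lengths of the astronomical unit and the light year are standard values that we have supplied, not numbers taken from these notes.

```python
import math

AU_M = 1.495978707e11             # one astronomical unit (m), supplied value
LIGHT_YEAR_M = 9.4607304725808e15  # one light year (m), supplied value

# One arcsecond in radians: 1 degree = 3600 arcseconds, 180 degrees = pi radians.
ARCSEC_RAD = math.pi / (180.0 * 3600.0)

# Small angle approximation: p (radians) = (1 AU) / d, so d = 1 AU / p.
parsec_au = 1.0 / ARCSEC_RAD
parsec_m = parsec_au * AU_M
parsec_ly = parsec_m / LIGHT_YEAR_M

print(f"1 pc = {parsec_au:,.0f} AU")
print(f"1 pc = {parsec_m:.4e} m")
print(f"1 pc = {parsec_ly:.2f} light years")
```

The number 206,265 is therefore nothing more than the number of arcseconds in a radian.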

3.2 Typical parallaxes

How large is the parallax for our nearest star? Proxima Centauri has a parallax of p = 0.75 arcseconds, which corresponds to a distance of d = 1.33 pc. This is a very small amount of parallax; roughly the same angle as that subtended by a 2 pence piece at a distance of 5 km! It’s comparable to the resolution of ground-based telescopes (which you’ll remember are limited in resolution by the seeing). In other words, even for the nearest stars, we’re looking for a shift in position which is comparable to the size of the stellar image itself. Parallax is a very small effect, which requires extremely careful data taking and analysis to use.

As an historical aside, the difficulty of observing parallax was one of the main reasons the Heliocentric model of the Solar System took so long to be accepted. In general, people assumed that the stars were not much further away than the planets of our own Solar System. Since they didn’t show large parallaxes, the logical conclusion was that the Earth itself did not move. The first successful measurement of stellar parallax did not come until 1838, when Friedrich Bessel measured the parallax of 61 Cygni to be ∼ 0.3 arcseconds.

Modern measurements of parallax have achieved an astonishing level of accuracy. By moving to space, we can overcome the blurring effects of the atmosphere and measure very small shifts in stellar positions. The current state of the art is the Hipparcos satellite, which achieved accuracies of ±0.001 arcseconds for the brightest stars.

It should be clear that parallax is only useful for measuring the distance to nearby stars. Even at the 0.001 arcsecond accuracy provided by Hipparcos, we can only detect the parallax for stars closer than 1000 pc, or 1 kpc. In order to measure the distance of the furthest stars, and to have any hope of measuring distances to galaxies, we need to apply new techniques.

¹ Just like an hour is divided into minutes and seconds, so a degree is divided into arcminutes and arcseconds. 1° = 60 arcminutes (60′), 1′ = 60 arcseconds (60″), so 1° = 60 × 60 = 3600″.
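The numbers quoted in this section all follow from equation (3.1). The sketch below reproduces them, using the parallaxes given in the text; treating the measurement accuracy as the smallest detectable parallax is a simplification of our own.

```python
def distance_pc(parallax_arcsec):
    """Equation (3.1): d (pc) = 1 / p (arcseconds)."""
    if parallax_arcsec <= 0.0:
        raise ValueError("parallax must be positive")
    return 1.0 / parallax_arcsec


def max_distance_pc(accuracy_arcsec):
    """Distance at which the parallax shrinks to the measurement accuracy."""
    return distance_pc(accuracy_arcsec)


print(f"Proxima Centauri (p = 0.75 arcsec): {distance_pc(0.75):.2f} pc")
print(f"61 Cygni (p ~ 0.3 arcsec):          {distance_pc(0.3):.1f} pc")
print(f"Limit at 0.001 arcsec accuracy:     {max_distance_pc(0.001):.0f} pc")
```

In practice a distance is only useful when the parallax is several times larger than its uncertainty, so the real reach is somewhat shorter than this simple limit.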

3.3 “Standard Candles” and the Distance Ladder

A standard candle is the name given to an object of accurately known luminosity. Before going on to describe some types of standard candle, we should ask why they are so useful. The reason is that we can easily measure the flux of these objects from Earth and, if the luminosity is known, we can apply the inverse square law to find the distance

$$d = \left(\frac{L}{4\pi F}\right)^{1/2}.$$

In order to establish that an object is a standard candle, we need some way of knowing its luminosity. That means that, for at least some objects of that type, we must already know the distance. Ultimately, therefore, all distances from standard candles are based upon distances obtained from parallax. For example, we might discover a type of standard candle in our own Galaxy and measure its distance using parallax. Once the luminosity of this standard candle is known, we can use others of the same type to measure the distances to nearby galaxies. In these galaxies, we may find another type of standard candle. Perhaps it is much brighter, allowing us to measure the distance to even further galaxies, and so on. In this manner, we build up a “distance ladder” which enables us to measure distances to all objects, right out to the edge of the Universe. Ultimately though, they are all based on distances measured in our own Galaxy, through parallax.
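The standard candle method is the inverse square law turned around. The sketch below applies it to a made-up object; the luminosity and measured flux are hypothetical, and the value of the parsec in metres is the one derived in section 3.1.

```python
import math

PARSEC_M = 3.0857e16  # one parsec in metres


def distance_from_flux(luminosity_w, flux_w_m2):
    """Inverse square law solved for distance: d = sqrt(L / (4 * pi * F))."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_m2))


# Hypothetical standard candle: known luminosity, measured flux.
luminosity_w = 1.0e30
measured_flux = 1.0e-12

d_m = distance_from_flux(luminosity_w, measured_flux)
print(f"d = {d_m:.3e} m = {d_m / PARSEC_M:.0f} pc")
```

Any error in the assumed luminosity feeds straight into the distance, which is why the calibration of each rung of the ladder matters so much.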

3.3.1 A standard candle example: Cepheid Variables

Let’s look at standard candles in more detail by considering a specific, and important, example. Cepheid variables are pulsating stars, whose brightness varies periodically. They are very bright, and can be seen at great distances.

Figure 3.2: Calibration of the Cepheid period-luminosity relationship from Laney & Stobie (1994; bibcode 1994MNRAS.266..441L). Here, the logarithm of the period is shown against the absolute magnitude. The distance to any Cepheid can be found by measuring its period and using this relationship to infer its absolute magnitude. This is compared to the measured apparent magnitude to yield a distance modulus, and hence a distance.

In 1893, Henrietta Leavitt began work measuring the brightness of stars in the Magellanic Clouds². In 1908 she published her results, noting a large number of periodic variable stars which showed a pattern; the brighter stars seemed to have longer periods. In 1912 she published another study showing that these variables - Cepheid Variables - showed a close and predictable relationship between their luminosity and their period. In 1913, Hertzsprung used parallax to measure the distance to several Cepheids in our Galaxy. In this way, the relationship between luminosity and period was calibrated. As a result, if the period of a Cepheid Variable can be measured, then its luminosity is known, and the distance to the Cepheid variable can be found. Today, the relationship is very well established by studying Cepheid variables with known parallaxes from Hipparcos (see figure 3.2).

At the time of Leavitt’s discovery, it still wasn’t clear that what we now call galaxies were actually outside of our own Milky Way. Soon, however, Cepheids started to be discovered in other galaxies. In 1923 Edwin Hubble used Cepheids in the Andromeda Galaxy to show that it was located far outside our Milky Way. In this way Cepheid variables revolutionised our understanding of the Universe.

² At that time, women did much work in astrophysics, but they were usually carrying out the laborious lab work and number crunching for male professors. These “computers”, as they were called, rarely received direct credit for their work.

3.3.2 Type Ia Supernovae

Cepheid Variables are bright; whilst the best parallaxes can yield distances to 1 kpc, Cepheids can be observed up to 10-20 Mpc (1 Mpc = 1 million pc). There are other types of variable stars which can be used as standard candles in a similar way to Cepheids (notably the RR Lyrae stars and W Virginis stars), but these are fainter than Cepheids. If we wish to measure the distance to the furthest galaxies, we will need to find another rung on the distance ladder. This rung comes from Type Ia supernovae.

Type Ia supernovae are the explosions of white dwarf stars in binary systems. White dwarfs have a maximum mass, known as the Chandrasekhar limit. This is somewhere around 1.4 solar masses. If a white dwarf in a binary system accretes mass from its companion, it can be pushed over the Chandrasekhar limit. The white dwarf must then collapse, but the collapse causes the centre of the white dwarf to become extremely dense and hot, which triggers explosive nuclear burning in the core. The resulting supernova explosion is extremely bright; it is about 14 magnitudes brighter than the brightest Cepheid variable.

Using the general form of equation (2.4), we can see that type Ia supernovae can be seen to much larger distances than Cepheids. Suppose we have an object on the very limit of visibility, which we then make 14 magnitudes brighter. Now, we move it away until it is once again at the very limit of visibility. How far has it moved?

$$\begin{align}
m_1 - m_2 &= 5\log_{10}\left(\frac{d_1}{d_2}\right) \tag{3.2}\\
\Delta m &= 5\log_{10}\left(\frac{d_1}{d_2}\right) \tag{3.3}\\
14 &= 5\log_{10}\left(\frac{d_1}{d_2}\right) \tag{3.4}\\
10^{14/5} &= \frac{d_1}{d_2} \tag{3.5}\\
\frac{d_1}{d_2} &\approx 630 \tag{3.6}
\end{align}$$

Type Ia supernovae can be seen 630 times further away than a Cepheid variable! Since Cepheids can be seen to a distance of 20 Mpc, we can see type Ia supernovae to over 10,000 Mpc; that is a significant fraction of the entire visible Universe!

But are type Ia supernovae standard candles? In fact, there is a good reason to believe they are. Since no white dwarf can exceed the Chandrasekhar mass, all type Ia supernovae result from the explosion of a white dwarf at exactly the Chandrasekhar mass. Therefore, we can reasonably expect them all to share the same luminosity! By combining parallax, Cepheid variables and type Ia supernovae, we now have a distance ladder which covers almost the entire Universe.
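The arithmetic in equations (3.2) to (3.6) takes only a few lines to reproduce. This sketch uses the 14 magnitude difference and the 20 Mpc Cepheid limit quoted in the text; the function name is our own.

```python
def distance_ratio(delta_m):
    """Invert delta_m = 5 * log10(d1 / d2) to get d1 / d2 = 10^(delta_m / 5)."""
    return 10.0 ** (delta_m / 5.0)


CEPHEID_LIMIT_MPC = 20.0  # furthest distance quoted for Cepheids in the text

ratio = distance_ratio(14.0)
print(f"d1 / d2 = {ratio:.0f}")
print(f"Supernova reach: about {ratio * CEPHEID_LIMIT_MPC:.0f} Mpc")
```

Because the distance ratio depends exponentially on the magnitude difference, even a few extra magnitudes of brightness extend the reach of a standard candle enormously.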

Chapter 4

Motion of celestial objects

4.1 Proper Motion

The stars are in constant motion. Much of it is apparent motion. Each night, the stars appear to move because of the rotation of the Earth (usually called diurnal motion). Over the course of a year the Earth moves round the Sun, causing the stars which are visible from night to night to change. And finally, as we have discussed, the Earth’s motion around the Sun causes the position of nearby stars to change, due to the phenomenon of parallax.

Once all these apparent sources of motion have been accounted for, however, the stars still move! A star’s motion through space is unsurprisingly called its space velocity. Viewed from Earth, this velocity can be broken down into two components. The component along our line of sight is called the radial velocity, which we will discuss in more detail shortly. The second component lies in the plane of the night sky, and is called the proper motion - see figure 4.1. The proper motion is visible as a gradual shift in the position of a star, relative to the stars around it. It is convenient to measure it as the rate of angular displacement of the star, in arcseconds/yr.

Proper motions of stars are generally quite small. The star with the largest proper motion, Barnard’s star, moves around 10.3 arcseconds every year. Nevertheless over significant periods of time, the proper motions of stars can build up to quite significant changes. Because a star’s proper motion is measured in arcseconds/yr, a star which is very far away can show a small proper motion, even if it is moving rather fast across our line of sight. Therefore, proper motions are extremely useful for making a sample of stars which are nearby, but become increasingly hard to measure for objects which are far away.
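To see why distant stars show small proper motions, we can use the definition of the parsec: a displacement of 1 AU seen from 1 pc subtends 1 arcsecond. The sketch below turns a tangential velocity into a proper motion on that basis. The relation is a standard one but is not derived in these notes, and the velocity and distances in the example are hypothetical.

```python
AU_KM = 1.495978707e8                # one astronomical unit (km)
YEAR_S = 365.25 * 24 * 3600          # one year in seconds
KM_S_PER_AU_YR = AU_KM / YEAR_S      # 1 AU per year, about 4.74 km/s


def proper_motion(tangential_velocity_km_s, distance_pc):
    """Proper motion in arcseconds/yr for a given tangential velocity and distance."""
    au_per_year = tangential_velocity_km_s / KM_S_PER_AU_YR
    # 1 AU seen from 1 pc subtends 1 arcsecond, so divide by the distance in pc.
    return au_per_year / distance_pc


# The same hypothetical tangential velocity at three different distances.
for d_pc in (2.0, 200.0, 20000.0):
    mu = proper_motion(100.0, d_pc)
    print(f"d = {d_pc:8.0f} pc  ->  proper motion = {mu:.5f} arcsec/yr")
```

The proper motion falls in direct proportion to the distance, so a large proper motion is a strong hint that a star is nearby.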

Figure 4.1: Stellar motion. (Labels in the figure: space velocity, radial velocity, tangential velocity, proper motion µ.)

4.2 Radial velocity and Doppler shift

We return now to the component of the space velocity which is directed along our line of sight; the radial velocity. How do we measure this? A star which has no proper motion but high radial velocity does not appear to move in the night sky... Fortunately, we can measure the radial velocity of celestial objects using the Doppler shift.

Figure 4.2: Doppler shift. (Labels in the figure: rest wavelength $\lambda_0$, observed wavelength $\lambda'$, radial velocity $v_r$.)

The cause of the Doppler shift is shown in figure 4.2. Imagine a star which is moving away from us as it emits light. We can think of the light as being stretched out because of the movement of the star. We can work out how much the light is stretched out using some simple maths. What is the time taken by the star to emit one wavelength of light? Let us call the wavelength emitted by an object at rest $\lambda_0$. Since the wave travels at the speed of light, the time between peaks is given by

$$\Delta t = \frac{\lambda_0}{c}.$$

Now consider the star as it moves away from us, with a radial velocity of $v_r$. It still takes the same time to emit one wave, $\Delta t$. However, in that time, the star itself has moved a distance

$$v_r \Delta t = \frac{v_r \lambda_0}{c}.$$

The distance between successive peaks of the light is given by the old wavelength $\lambda_0$, plus the extra distance moved by the star. So the new wavelength, $\lambda'$, is given by

$$\lambda' = \lambda_0 + \frac{v_r \lambda_0}{c},$$

or, rearranging,

$$\frac{\lambda' - \lambda_0}{\lambda_0} \equiv \frac{\Delta\lambda}{\lambda_0} = \frac{v_r}{c}. \tag{4.1}$$

The quantity $\Delta\lambda$ is called the Doppler shift. An object moving away from us has a positive value of $v_r$. That means the Doppler shift is positive, the wavelength increases, and so the light we observe becomes redder. We say the light has been redshifted. Conversely, if an object moves towards us the observed wavelength becomes shorter, and the light is blueshifted.

4.2.1 Measuring Doppler shifts

Figure 4.3: Measuring Doppler shift. (Axes: normalised flux $F_\lambda$ against wavelength in nm, from 300 to 700 nm; the rest wavelength $\lambda_0$ and the shift $\Delta\lambda$ are marked.)

How do we measure Doppler shifts? We can do it by taking a spectrum of the source, as illustrated in figure 4.3. Astronomical spectra often show sharp features at an easily measured wavelength (e.g. absorption or emission lines). In many cases, the rest wavelengths of these lines are known from laboratory experiments performed on Earth, so by comparing the rest wavelength with the observed wavelength from the spectrum, we can easily calculate the Doppler shift.
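As a rough illustration of that comparison, the sketch below turns a rest wavelength and an observed wavelength into a radial velocity using equation (4.1). The function name and the example measurement (a line with rest wavelength 656.28 nm observed at 656.50 nm) are made up for illustration; the formula only holds when the velocity is much smaller than the speed of light.

```python
C_KM_S = 299_792.458   # speed of light in km/s


def radial_velocity_km_s(lambda_observed, lambda_rest):
    """Radial velocity from equation (4.1):
    v_r = c * (lambda' - lambda_0) / lambda_0.
    Both wavelengths must be in the same units.
    Positive values mean the source is receding (redshift)."""
    return C_KM_S * (lambda_observed - lambda_rest) / lambda_rest


if __name__ == "__main__":
    # Hypothetical measurement of a single spectral line, in nm.
    v = radial_velocity_km_s(656.50, 656.28)
    print(f"v_r = {v:+.1f} km/s")
```

In practice one would measure several lines and average the resulting velocities, but a single line is enough to show the idea.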

4.2.2 The expansion of the Universe

The combination of measuring Doppler shifts together with the work done on the distance ladder led directly to one of the most important discoveries made in Astrophysics. Following on from his work in which he used Cepheid variables to measure the distance of the Andromeda galaxy, Edwin Hubble began work combining the distances to galaxies (as measured by Cepheids), with the velocity at which the galaxies were moving away from us (measured by Doppler shifts). Hubble used a measure called the redshift,

$$z = \frac{\Delta\lambda}{\lambda_0} = \frac{v_r}{c},$$

which is proportional to the speed at which the galaxy is moving away from us. Using the 2.5m telescope at the Mount Wilson observatory, Hubble collected redshifts and distances for 46 galaxies. He discovered a proportionality between the redshift and the distance. Since the redshift is proportional to the velocity, this means that the radial velocities of galaxies are proportional to their distances from us, a result now known as Hubble’s law

$$v = H_0 D, \tag{4.2}$$

where $H_0$ is a constant of proportionality known as Hubble’s constant. Hubble made several mistakes in his work, including getting all his distances wrong, so that his estimate of Hubble’s constant was out by nearly a factor of 10! Nevertheless, the observations changed our view of the Universe forever.

At first glance, the result seems staggering. Every galaxy is moving away from us, and at a speed proportional to its distance from us. It is hard to explain without assuming that the Earth is located at the centre of the Universe. Astrophysicists don’t like to do that, because it violates an assumption we make that the Earth is not located anywhere particularly special (the Copernican principle). The solution to our predicament is that the entire Universe is expanding; that is, every point in space is moving away from every other point!
As bizarre as it sounds, this behaviour had already been predicted, using Einstein’s General theory of Relativity, when Hubble published his results. Hubble’s work provided convincing evidence that we live in an expanding Universe.

Measurements of redshift also show us that equation (4.1) is wrong! Or at least, needs some correction. That’s because the redshifts of very distant galaxies are very large. The galaxies in Hubble’s sample were relatively nearby and had small redshifts, but galaxies have since been found with redshifts of 4 and more. Since $z = \Delta\lambda/\lambda_0 = v_r/c$ according to equation (4.1), a redshift of 4 would suggest the galaxy was moving away from us at 4 times the speed of light! Since nothing can go faster than light, clearly something is wrong with our formula for Doppler shift. In fact, the problem is that we need to make corrections to take account of special relativity. Since that is beyond the level of this course, I will just give the answer here, in the relativistic Doppler equation

$$\frac{\Delta\lambda}{\lambda_0} = \left(\frac{c + v_r}{c - v_r}\right)^{1/2} - 1. \tag{4.3}$$

This equation should be used in place of equation (4.1) whenever the radial velocity is a significant fraction of the speed of light.

Note that I’ve been spelling Doppler shift with a capital D. That’s because it is named after its discoverer, Christian Doppler. Doppler originally proposed it (in 1842) as an explanation of why different stars had different colours. With our modern understanding of Doppler shift, we know that the colour of stars is not caused by Doppler shift. The maximum velocity of stars in the Galaxy is around 300 km/s. At the centre of the optical waveband (500nm) this velocity would cause a Doppler shift of only 0.5nm; nowhere near enough to significantly change the colour of a star. We need to find a different explanation for the colours of stars.
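To get a feel for when the correction matters, the following sketch compares equation (4.1) with equation (4.3) at a few speeds, and inverts equation (4.3) to find the speed implied by a given redshift. The function names are my own, and the inversion is a rearrangement of equation (4.3) that is not spelled out in the notes.

```python
import math


def doppler_classical(beta):
    """Equation (4.1): delta_lambda / lambda_0 = v_r / c, with beta = v_r / c."""
    return beta


def doppler_relativistic(beta):
    """Equation (4.3): delta_lambda / lambda_0 = sqrt((c + v_r)/(c - v_r)) - 1."""
    return math.sqrt((1 + beta) / (1 - beta)) - 1


def beta_from_redshift(z):
    """Rearrange equation (4.3) to give beta = v_r / c for a redshift z."""
    s = (1 + z) ** 2
    return (s - 1) / (s + 1)


if __name__ == "__main__":
    for beta in (0.001, 0.1, 0.5, 0.9):
        print(f"v_r/c = {beta:5.3f}: "
              f"classical z = {doppler_classical(beta):.4f}, "
              f"relativistic z = {doppler_relativistic(beta):.4f}")
    print(f"z = 4 corresponds to v_r = {beta_from_redshift(4):.3f} c")
```

At stellar speeds (300 km/s is about 0.001c) the two formulae agree to better than one part in a thousand, while equation (4.3) keeps the speed below c however large the redshift becomes.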

4.3 The colours of stars

4.3.1 Thermal continuum radiation

In everyday life, most objects are visible because of the light which they reflect. You and I are visible because of the light we reflect, not because of the light we emit ourselves. We do, however, emit light. All material objects give off some radiation: the hotter they are, the more radiation they emit. Can we be more specific about the properties of light emitted by warm objects? We can if we make one very important approximation; that the radiation emitted by an object is in thermal equilibrium with its surroundings.

What do we mean by that? Suppose we heat an object up; it will radiate light. This radiation carries away energy and so the object cools. Now, we’ll put our object into a special box, which absorbs and re-emits all of the radiation which falls upon it. As our object cools, the box fills with photons. Some of these will be absorbed by our object, and provide a small heating effect. Eventually, a balance will be set up, where for every photon absorbed another is emitted by the object. At this point the box, the radiation and the object are all in thermal equilibrium. This is a stable situation; the spectrum of the radiation does not change with time. The radiation within the box is called thermal equilibrium radiation. It is often called black-body radiation, because the best way to get the box walls to absorb and emit all the radiation is to make them black.

Suppose we now make a tiny, tiny hole in the box. Just large enough to allow some light to escape, but not so large as to break the thermal equilibrium within the box. What does the spectrum of the light look like? Black-body spectra for objects at three temperatures are shown in figure 4.4.

Figure 4.4: Black body light curves. Monochromatic intensity (monochromatic flux per unit solid angle) is plotted against wavelength for black bodies at three temperatures. The wavelength range of visible light is shown.

The curves show a steep rise to a well-defined peak, and then a tail of emission towards longer wavelengths. As the temperature increases, the peak moves to shorter wavelengths, and the area under the curve increases. Deriving the formula which describes the black-body spectrum is not easy; it was the first quantum mechanical formula ever found. The German physicist, Max Planck, first determined the formula empirically, by fitting the observed curve with a function that gave an extremely good fit. Classical physics was completely unable to explain why this formula worked so well, but quantum mechanics provided the answer. The formula, now known as the Planck function, is

$$F_\nu(T) = \frac{2h\pi\nu^3}{c^2}\,\frac{1}{\exp(h\nu/kT) - 1}\ \mathrm{W\,m^{-2}\,Hz^{-1}}. \tag{4.4}$$

Fν(T ) is the monochromatic surface flux. It is still the amount of light per unit time, per unit frequency and per unit area, but now that unit area refers to a small unit of area on the surface of the emitting object. The Planck function is extremely important in astrophysics, and we shall examine some of the implications of it in the next lecture. But before we do; what objects actually emit as black bodies?

4.3.2 Black bodies in Astrophysics

The pedantic answer to the question above is that nothing emits as a black body. Perfect thermal equilibrium is impossible to achieve. However, there are important classes of object that are very nearly in thermal equilibrium, and so emit roughly as black bodies. What properties does an object need to be close to thermal equilibrium? It must be a near perfect absorber of light, or else it will reflect and the spectrum will differ from a black body. Also, since it absorbs perfectly, it must also emit perfectly, or else it will steadily absorb energy and heat up. Obviously, the object needs to have a stable temperature (it needs to be in thermal equilibrium).

A good example is dust grains. Dust grains are bathed in radiation from the surrounding stars. Moreover, they are good absorbers and emitters. Therefore, they quickly reach a temperature where the radiation emitted from the grains balances that absorbed from the background starlight. Another good example is the Cosmic Microwave Background. This background radiation, emitted at a time when the Universe was small, hot and opaque, is almost a perfect black body. It has cooled as the Universe has expanded, and has a black-body spectrum corresponding to a temperature of 2.725K (see figure 4.5).

Figure 4.5: Cosmic Microwave Background spectrum as measured by COBE. The theoretical curve is the Planck function for a 2.725K black body.

However, the most important examples of black bodies in astrophysics are stars themselves. Since the energy lost through radiation is balanced by heat from nuclear fusion in their cores, stars have a stable temperature. And they are so dense in their interiors that nearly every photon is absorbed (they are very good absorbers and re-emitters). However, stars are not perfect black bodies. This is because the outer layers of the star are not a perfect absorber/emitter of radiation. Instead, near the surface of the star, the absorption of light is a strong function of wavelength. Starlight is therefore only approximated by the Planck curve. Figure 4.6 shows the Solar spectrum. After correction for the absorption by the atmosphere, the Sun’s spectrum is pretty closely predicted by the Planck function, but the agreement is not perfect.

The fact that we can, to a rough approximation, treat stars as black bodies allows us to make many important deductions about their properties. This will be the topic of the next section.

Figure 4.6: Monochromatic flux from the Sun, as observed at ground level (red) and after correction for the absorption by the atmosphere (yellow). The spectrum is reasonably well fit by a black body curve for a temperature of 5250 K.

Chapter 5

Thermal Continuum Radiation

Last lecture we looked at the radiation emitted by hot objects. The most important thing we learnt is that if radiative thermal equilibrium is approximately obeyed, the surface flux from the object obeys the Planck curve, equation (4.4):

$$F_\nu(T) = \frac{2h\pi\nu^3}{c^2}\,\frac{1}{\exp(h\nu/kT) - 1}\ \mathrm{W\,m^{-2}\,Hz^{-1}}.$$

The emission from such objects is often called black-body radiation. Some very important astrophysical objects, including stars, are close to thermal equilibrium, and so their spectra are approximated by the Planck curve. In this lecture, we are going to look at some important consequences of that fact, by looking at the properties of the Planck curve in detail.

5.1 Blackbody radiation in wavelength units

Above is the formula for the spectrum of blackbody radiation in frequency units. It is a monochromatic surface flux, so it gives the energy emitted per unit surface area, per unit time and per unit frequency. What is the corresponding curve in wavelength units? You might remember we solved this question by realising that the amount of energy emitted in a small frequency range must equal the amount of energy contained within the corresponding wavelength range, $F_\nu d\nu = F_\lambda d\lambda$. That led to equation (1.5),

$$F_\nu = F_\lambda \frac{\lambda^2}{c}.$$

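We can now use that equation to write the Planck curve in wavelength units

$$F_\lambda = F_\nu \frac{c}{\lambda^2} = \frac{2h\pi\nu^3}{c\lambda^2}\,\frac{1}{\exp(h\nu/kT) - 1}.$$

We now substitute $\nu = c/\lambda$ into the equation above to obtain

$$F_\lambda(T) = \frac{2h\pi c^2}{\lambda^5}\,\frac{1}{\exp(hc/\lambda kT) - 1}\ \mathrm{W\,m^{-2}\,nm^{-1}}. \tag{5.1}$$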
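Both forms of the Planck function are straightforward to evaluate numerically. The sketch below codes up equations (4.4) and (5.1) and checks that they are related by $F_\lambda = F_\nu c/\lambda^2$. The function names, and the choice of 5800 K and 500 nm for the example, are my own; note that with the wavelength in metres the second function returns flux per metre of wavelength, so it must be multiplied by $10^{-9}$ to get flux per nm.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
K = 1.380649e-23      # Boltzmann constant, J/K


def planck_nu(nu, temperature):
    """Equation (4.4): surface flux per unit frequency, W m^-2 Hz^-1 (nu in Hz)."""
    return (2 * math.pi * H * nu**3 / C**2) / math.expm1(H * nu / (K * temperature))


def planck_lambda(wavelength, temperature):
    """Equation (5.1) with the wavelength in metres. The result is in
    W m^-2 per metre of wavelength; multiply by 1e-9 for W m^-2 nm^-1."""
    x = H * C / (wavelength * K * temperature)
    return (2 * math.pi * H * C**2 / wavelength**5) / math.expm1(x)


if __name__ == "__main__":
    T = 5800.0       # an assumed temperature, roughly that of the Sun's surface
    lam = 500e-9     # 500 nm
    nu = C / lam
    print(planck_lambda(lam, T) * 1e-9, "W m^-2 nm^-1")
    # Consistency check: F_lambda should equal F_nu * c / lambda^2.
    print(planck_nu(nu, T) * C / lam**2 * 1e-9, "W m^-2 nm^-1")
```

The two printed numbers should agree, which is a useful sanity check on the change of variable from frequency to wavelength.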

5.2 Wien’s Law

At what wavelength does a hot body emit the most light? Assuming the hot object is roughly a black body, this question boils down to finding out at what wavelength the Planck curve peaks. The calculation is simple in principle, and a bit complicated in practice. In principle, we find the maximum (or minimum) of a function by finding where the derivative is zero. In other words, the peak wavelength, $\lambda_{\rm peak}$, is the one which satisfies

$$\frac{dF_\lambda(T)}{d\lambda} = 0.$$

The solution to this equation is not so straightforward. I’m going to include (most of) it here, because it serves as a useful example of how to solve a complex differentiation problem. The derivation is non-examinable, however, and so the truly math-phobic might want to skip to the answer.

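5.2.1 The derivation

We want to solve

$$\frac{dF_\lambda(T)}{d\lambda} = \frac{d}{d\lambda}\left[\frac{2h\pi c^2}{\lambda^5}\,\frac{1}{\exp(hc/\lambda kT) - 1}\right] = 0.$$

We make a start by writing $F_\lambda(T) = uv$, where

$$u = \frac{2h\pi c^2}{\lambda^5}, \quad\text{and}\quad v = \frac{1}{\exp(hc/\lambda kT) - 1}.$$

From the product rule, $\frac{dF}{d\lambda} = u\frac{dv}{d\lambda} + v\frac{du}{d\lambda}$. First, we calculate $v\frac{du}{d\lambda}$ easily

$$v\frac{du}{d\lambda} = 2\pi hc^2\left(\frac{-5}{\lambda^6}\right)\left[\frac{1}{\exp(hc/\lambda kT) - 1}\right].$$

Next, we calculate $u\frac{dv}{d\lambda}$. This is a little harder.

$$u\frac{dv}{d\lambda} = \frac{2h\pi c^2}{\lambda^5}\,\frac{d}{d\lambda}\left[\frac{1}{\exp(hc/\lambda kT) - 1}\right]$$

We can’t easily do the differential in the square brackets, so we try to apply a few standard tricks to make it easier. In this case, we make a substitution $x = \frac{hc}{\lambda kT}$,

$$u\frac{dv}{d\lambda} = \frac{2h\pi c^2}{\lambda^5}\,\frac{d}{d\lambda}\left[\frac{1}{\exp(x) - 1}\right],$$

and use the chain rule

$$\frac{d}{d\lambda}\left[\frac{1}{\exp(x) - 1}\right] = \frac{dx}{d\lambda}\,\frac{d}{dx}\left[\frac{1}{\exp(x) - 1}\right].$$

Since $x = \frac{hc}{\lambda kT}$,

$$\frac{dx}{d\lambda} = \frac{-hc}{\lambda^2 kT}.$$

Substituting this in to our earlier equation, we find

$$u\frac{dv}{d\lambda} = \frac{-2h^2\pi c^3}{\lambda^7 kT}\,\frac{d}{dx}\left[\frac{1}{\exp(x) - 1}\right].$$

But we still can’t simply write the answer for the differential in the formula above. However, we are now very close, because we can apply the quotient rule,

$$\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{g\,\frac{df(x)}{dx} - f\,\frac{dg(x)}{dx}}{g(x)^2},$$

and set $f(x) = 1$, and $g(x) = e^x - 1$. It follows that $df/dx = 0$ and $dg/dx = e^x$. If we use this, we find

$$\frac{d}{dx}\left[\frac{1}{\exp(x) - 1}\right] = \frac{-e^x}{(e^x - 1)^2}.$$

So now we have,

$$u\frac{dv}{d\lambda} = \frac{-2h^2\pi c^3}{\lambda^7 kT}\,\frac{d}{dx}\left[\frac{1}{\exp(x) - 1}\right] = \frac{2h^2\pi c^3}{\lambda^7 kT}\,\frac{e^x}{(e^x - 1)^2} = \frac{2h^2\pi c^3}{\lambda^7 kT}\,\frac{\exp(hc/\lambda kT)}{(\exp(hc/\lambda kT) - 1)^2}.$$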

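Phew! So now going back to $\frac{dF}{d\lambda} = u\frac{dv}{d\lambda} + v\frac{du}{d\lambda} = 0$, we find

$$\begin{aligned}
\frac{dF}{d\lambda} &= 2\pi hc^2\left(\frac{-5}{\lambda^6}\right)\left[\frac{1}{\exp(hc/\lambda kT) - 1}\right] + \frac{2h^2\pi c^3}{\lambda^7 kT}\,\frac{\exp(hc/\lambda kT)}{(\exp(hc/\lambda kT) - 1)^2} \\
&= \frac{2\pi hc^2}{\lambda^6(\exp(hc/\lambda kT) - 1)}\left[-5 + \frac{hc}{\lambda kT}\,\frac{\exp(hc/\lambda kT)}{\exp(hc/\lambda kT) - 1}\right] \\
&= 0.
\end{aligned}$$

The term outside the square brackets in that equation is always positive, and never zero. That means the term inside the square brackets should be zero and so we arrive at...

$$-5 + \frac{hc}{\lambda kT}\,\frac{\exp(hc/\lambda kT)}{\exp(hc/\lambda kT) - 1} = 0$$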

Unfortunately the answer we’ve reached is not very useful. We can’t use it to find the value of the wavelength at the peak of the black body function yet. Worse, it is fiendishly difficult (though possible) to solve this equation analytically. Instead, we can use a computer to solve it numerically, in which case we arrive at...

5.2.2 The answer

$$\frac{hc}{\lambda kT} = 4.9651$$

We can make this answer even more memorable by re-arranging, and substituting in S.I. values for h, c and k, to get

$$\lambda_{\rm peak} T = 2.898 \times 10^{-3}\ [\mathrm{m\,K}], \tag{5.2}$$

where the wavelength is measured in metres, and the temperature in Kelvin. This formula is known as Wien’s Law, after Wilhelm Wien, who derived it (following a different line of argument) in 1893. It shows that the peak wavelength of the Planck curve is inversely proportional to temperature. We have derived the property discussed last lecture that as the temperature of a body increases, the peak wavelength gets shorter (see figure 4.4).

At a room temperature of ∼ 290 K, $\lambda_{\rm peak}$ is about 10 microns. This is well into the infrared spectrum of light, and our eyes are not sensitive to these wavelengths. This explains why we see objects only from the sunlight they reflect and scatter. If our eyes were sensitive to infrared light, we would see the thermal radiation from everyday objects, like in figure 5.1.

Figure 5.1: An image taken at a wavelength of 12 microns of Prof Ned Wright (UCLA). Prof Wright is the lead scientist on NASA’s WISE mission, which aims to survey the night sky at infrared wavelengths between 3 and 12 microns. These wavelengths are not accessible from Earth, because the water vapour in the atmosphere makes it opaque.

Our Sun has a temperature around 5800 K. Using Wien’s law, the Sun’s spectrum peaks at about 0.5 microns. It is by no means a coincidence that this is close to the middle of the range of light to which our eyes are sensitive! The middle curve in figure 4.4 is a decent approximation to the Sun’s spectrum. You’ll notice that the Sun emits light more or less evenly across the whole visual range. Sunlight is thus a pretty even mix of colours, and we perceive it as white. However, the sunlight which reaches the Earth’s surface has passed through the atmosphere. Since the atmosphere preferentially scatters blue light, much of the blue part of the Sun’s spectrum is scattered out of the direct beam. That is why the Sun looks yellow, and the sky looks blue.
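The numerical step in the derivation of Wien’s Law can be reproduced in a few lines. The sketch below finds the root of the bracketed term from the derivation by bisection (one simple root-finding method among many; I am not claiming it is the method used to obtain the value quoted above), and then evaluates the peak wavelength at the two temperatures discussed. The function names and the bracketing interval are my own choices.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
K = 1.380649e-23      # Boltzmann constant, J/K


def wien_condition(x):
    """The bracketed term from the derivation, with x = hc / (lambda k T):
    -5 + x e^x / (e^x - 1). Its root marks the peak of the Planck curve."""
    return -5 + x * math.exp(x) / math.expm1(x)


def solve_wien(lo=1.0, hi=10.0, tol=1e-12):
    """Find the root of wien_condition by bisection on [lo, hi]."""
    assert wien_condition(lo) < 0 < wien_condition(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if wien_condition(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def peak_wavelength(temperature):
    """Peak wavelength in metres for a black body at the given temperature (K)."""
    return H * C / (K * temperature * solve_wien())


if __name__ == "__main__":
    x = solve_wien()
    print(f"hc / (lambda k T) = {x:.4f}")
    print(f"lambda_peak * T   = {H * C / (K * x):.4e} m K")
    for T in (290, 5800):
        print(f"T = {T:5d} K: lambda_peak = {peak_wavelength(T) * 1e6:.2f} microns")
```

Bisection is slow compared with cleverer methods, but it cannot fail once the root is bracketed, which makes it a safe choice for a one-off calculation like this.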

5.3 Colours and colour temperature

In principle, Wien’s Law gives us a way to obtain a measurement of stellar temperature. In practice we often don’t have sufficient data to measure the peak wavelength accurately. However, we can easily measure the colours of stars, and since Wien’s law tells us that objects get bluer as they get hotter, it shouldn’t be a surprise that the colour of a star can provide a measure of the temperature. The colour index, which we defined in equation (2.2), was the difference in magnitude, as measured in two filters, e.g.

$$B - V = +2.5 \log_{10}\frac{F_V}{F_B} + \mathrm{const}.$$

Using the Planck curve, we can see that the colour index is a direct measure of temperature. Figure 5.2 shows two blackbodies, one at 4000 K and another at 12,000 K. The 4000 K blackbody emits less light at B than at V, and so its B − V colour index is positive. By contrast, the 12,000 K blackbody emits more light at B than at V, and so its B − V colour index is negative. Thus, we see that B − V gets smaller as the star gets hotter and bluer.

We can get a temperature estimate directly from a star’s B − V colour. We simply ask what black body curve would show the same B − V colour as we observe. Temperatures derived in this manner are called colour temperatures. Of course, since the star is not a perfect black body, the colour temperature is only an approximation to the star’s true temperature.
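Here is a rough sketch of how a colour temperature might be computed. It makes some loud simplifying assumptions that are mine, not the notes’: each filter is represented by a single wavelength (440 nm for B and 550 nm for V), and the constant in the colour index is set to zero, so the numbers should not be compared with catalogue B − V values. The function names are hypothetical.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
K = 1.380649e-23      # Boltzmann constant, J/K

# Assumed representative wavelengths for the B and V filters (metres).
LAMBDA_B = 440e-9
LAMBDA_V = 550e-9


def planck_lambda(wavelength, temperature):
    """Planck function in wavelength units, equation (5.1)."""
    x = H * C / (wavelength * K * temperature)
    return (2 * math.pi * H * C**2 / wavelength**5) / math.expm1(x)


def colour_index(temperature):
    """B - V = 2.5 log10(F_V / F_B), with the additive constant set to zero."""
    ratio = planck_lambda(LAMBDA_V, temperature) / planck_lambda(LAMBDA_B, temperature)
    return 2.5 * math.log10(ratio)


def colour_temperature(b_minus_v, t_lo=1000.0, t_hi=50000.0):
    """Black-body temperature whose colour_index matches b_minus_v, found
    by bisection. Relies on colour_index falling steadily as T rises."""
    for _ in range(100):
        t_mid = 0.5 * (t_lo + t_hi)
        if colour_index(t_mid) > b_minus_v:
            t_lo = t_mid     # too red, so the black body must be hotter
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)


if __name__ == "__main__":
    for T in (4000.0, 12000.0):
        print(f"T = {T:7.0f} K: B - V = {colour_index(T):+.2f}")
    print(f"B - V = 0.0 gives T = {colour_temperature(0.0):.0f} K")
```

Even with these crude assumptions, the cool black body comes out with a positive colour index and the hot one with a negative colour index, as described above.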

Figure 5.2: Colour indices and temperature. The top panel shows a black body at a temperature of 4000 K, and the bottom panel shows a black body at a temperature of 10,000 K. Clearly, the ratio of B-band flux to V-band flux increases with increasing temperature. On our magnitude scale, that corresponds to a B − V colour index which decreases with increasing temperature.

5.4 The Stefan-Boltzmann Law

Let’s use the Planck curve to work out another property of black bodies. What is the total flux emitted if we sum over all frequencies/wavelengths? In other words, what is the bolometric surface flux from a black body? To find this, we need to integrate the Planck curve over all frequencies (I’m going to use frequency here, rather than wavelength, because it makes the integration a bit easier). So, the bolometric surface flux is given by:

$$F(T) = \int_0^\infty F_\nu(T)\,d\nu = \int_0^\infty \frac{2h\pi\nu^3}{c^2}\,\frac{1}{\exp(h\nu/kT) - 1}\,d\nu$$

This is quite a hard integral, although it’s a lot easier than the derivation of Wien’s Law above. We’ll make it look a bit simpler by substituting $x = \frac{h\nu}{kT}$, so $\nu = \frac{xkT}{h}$ and $d\nu = \frac{kT}{h}dx$:

$$F(T) = \int_0^\infty \frac{2\pi h}{c^2}\left(\frac{xkT}{h}\right)^3 \frac{1}{e^x - 1}\,\frac{kT}{h}\,dx$$

If we tidy a few terms up, and take all the constant terms outside the integral, we get

$$F(T) = \frac{2\pi k^4 T^4}{h^3 c^2}\int_0^\infty \frac{x^3}{e^x - 1}\,dx.$$

The integral $\int_0^\infty \frac{x^3}{e^x - 1}\,dx$ is not trivial, but fortunately it can be looked up in a table of standard integrals. Its value is simply a constant;

$$\int_0^\infty \frac{x^3}{e^x - 1}\,dx = \frac{\pi^4}{15},$$

which we can substitute to get

$$F(T) = \frac{2k^4\pi^5}{15h^3c^2}\,T^4\ \mathrm{W\,m^{-2}}.$$

This is the Stefan-Boltzmann equation. It shows that the bolometric surface flux from a black body is proportional to the temperature to the fourth power. The constant of proportionality in the equation above is called Stefan’s constant, and given the symbol $\sigma$. In this form, the Stefan-Boltzmann equation looks like

$$F(T) = \sigma T^4\ \mathrm{W\,m^{-2}}, \tag{5.3}$$

where $\sigma = 5.67 \times 10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}$.
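If you would rather not take the standard integral on trust, it can be checked numerically, along with the value of Stefan’s constant. The sketch below uses a simple midpoint rule, cutting the integral off at x = 50 where the integrand is utterly negligible; the function names, the cut-off and the number of steps are my own choices.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
K = 1.380649e-23      # Boltzmann constant, J/K


def planck_integral(x_max=50.0, steps=20000):
    """Midpoint-rule estimate of the integral of x^3 / (e^x - 1) from 0 to
    x_max. The integrand is negligible beyond x ~ 50, so x_max stands in
    for infinity."""
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x**3 / math.expm1(x)
    return total * dx


def stefan_constant():
    """sigma = 2 pi^5 k^4 / (15 h^3 c^2), in W m^-2 K^-4."""
    return 2 * math.pi**5 * K**4 / (15 * H**3 * C**2)


if __name__ == "__main__":
    print(f"numerical integral = {planck_integral():.6f}")
    print(f"pi^4 / 15          = {math.pi**4 / 15:.6f}")
    print(f"sigma              = {stefan_constant():.4e} W m^-2 K^-4")
```

The numerical integral and $\pi^4/15$ should agree to many decimal places, and the computed constant should match the value of $\sigma$ quoted above.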

5.4.1 Flux and luminosity from black bodies

The Stefan-Boltzmann law gives the bolometric surface flux from the black body. In other words, it is the total energy emitted at all frequencies, per second and per unit area of the black body’s surface. To find the total energy emitted per second from the black body, we need to multiply by the surface area of the black body. For a spherical body of radius $R_*$, such as a star, this gives

$$L^{\rm bol} = 4\pi R_*^2 \sigma T^4\ \mathrm{W}. \tag{5.4}$$

Finally, we can use the inverse square law, $F = \frac{L}{4\pi d^2}$, to calculate the bolometric flux at a distance $d$ from the black body

$$F^{\rm bol} = \left(\frac{R_*}{d}\right)^2 \sigma T^4\ \mathrm{W\,m^{-2}}. \tag{5.5}$$

5.4.2 Effective Temperatures

Equation (5.5) can be used to estimate the surface temperature of a star. We measure the bolometric flux from Earth. Usually this involves measuring the monochromatic flux at as many wavelengths as possible, and integrating over them to obtain the bolometric flux. Provided we know the distance to the star, and its radius, we can estimate the temperature. Temperatures obtained this way are known as effective temperatures. They are the surface temperature the star would have if it radiated as a perfect black body. Again - since stars are not perfect black bodies, the effective temperature is only an approximation to the actual surface temperature of the star.
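As a worked example, the sketch below rearranges equation (5.5) for the temperature and applies it to the Sun. The input numbers (a bolometric flux of 1361 W m⁻² above the Earth’s atmosphere, an Earth-Sun distance of 1.496 × 10¹¹ m and a Solar radius of 6.957 × 10⁸ m) are standard round values that I have assumed; they do not appear in these notes, and the function name is my own.

```python
SIGMA = 5.670374e-8   # Stefan's constant, W m^-2 K^-4


def effective_temperature(bolometric_flux, distance, radius):
    """Rearrange equation (5.5): T = [F_bol * (d / R)^2 / sigma]^(1/4).
    bolometric_flux is in W m^-2; distance and radius must share units."""
    return (bolometric_flux * (distance / radius) ** 2 / SIGMA) ** 0.25


if __name__ == "__main__":
    # Assumed round values for the Sun as seen from Earth (not from the notes).
    solar_flux = 1361.0       # W m^-2, above the atmosphere
    earth_sun = 1.496e11      # m
    solar_radius = 6.957e8    # m
    t_eff = effective_temperature(solar_flux, earth_sun, solar_radius)
    print(f"T_eff = {t_eff:.0f} K")
```

Because the temperature depends only on the fourth root of the flux, even a 10 per cent error in the measured flux changes the effective temperature by only about 2.5 per cent.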

5.5 Other continuum emission mechanisms

The thermal radiation emitted by hot objects is a continuum emission mechanism. By that we mean that the light emitted is spread out over a broad range of wavelengths, and has no sharp features in its spectrum. It is not the only continuum emission mechanism encountered in astrophysics, and for completeness I’ll briefly mention some of the other mechanisms here.

5.5.1 Synchrotron radiation

Synchrotron radiation is emitted by electrons which move at close to the speed of light, in the presence of strong magnetic fields. The electrons feel a force from the magnetic fields, which results in them spiralling around the field lines. Put another way, the electrons feel an acceleration from the magnetic field, and since all accelerated charges emit radiation, light is emitted from the electrons. Synchrotron radiation is emitted from regions where there are both fast moving electrons, and strong magnetic fields.

Figure 5.3: Synchrotron radiation is produced by electrons spiralling around the magnetic field lines

Synchrotron radiation has a characteristic power-law spectrum, given by

$$F_\nu \propto \nu^{-n},$$

where n ∼ 0.5–1.5. This means that most of the light emitted as synchrotron radiation is emitted at low frequencies (long wavelengths). Synchrotron radiation is a source of radio waves, with wavelengths longer than a cm or so.

5.5.2 Bremsstrahlung

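Figure 5.4: A sketch of the contribution of various continuum sources at different frequencies. (Axes: $\log F_\nu$ against $\log \nu$, with the frequency axis marked radio, optical and x-rays. Labels in the figure: synchrotron, $F_\nu \propto \nu^{-n}$; black-body; bremsstrahlung, $F_\nu = \mathrm{const}$; and a region marked $F_\nu \propto \nu^2$.)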

Bremsstrahlung or braking radiation is emitted from ionised gases. As the electrons move past the protons, they feel an electric force, which decelerates them. Once again, accelerated charged particles emit radiation, so the electrons emit light as they are braked. For the gas to be ionised it must be very hot, so that there is enough energy to free the electrons from atoms. It is not surprising that bremsstrahlung is mostly emitted at high frequencies. In fact, the spectrum from bremsstrahlung is flat; the amount of flux emitted is independent of frequency. Below some cutoff frequency, however, the flux is proportional to the square of the frequency, so very little light is emitted at low frequencies (the spectrum is sketched in figure 5.4). The cutoff frequency is high; bremsstrahlung is a good source of X-rays.

Figure 5.4 also illustrates the point that the different continuum emission mechanisms produce light at very different wavelengths. As a result, the appearance of an astrophysical object can change drastically with wavelength. In the optical, we see thermal radiation from hot objects, like stars. The infrared is also dominated by thermal emission, but from cooler objects, like dust. At radio wavelengths, if an object has a strong magnetic field and can produce very fast moving electrons, we will see synchrotron radiation from those electrons. In the X-rays, we might see bremsstrahlung from hot gas, at least so long as there is sufficient ionised gas!

Chapter 6

Thermal Properties of Matter

Up until now we’ve been focussing on the properties of radiation. We’ve seen that for an object where the radiation is in thermal equilibrium with the matter (a so-called black body), the spectrum of the radiation is given by the Planck curve, and we’ve looked at a few important properties of the Planck curve that allow us to obtain estimates of the temperature of stars. Now I’d like to take a brief detour, and look at some of the properties of matter. In particular, we’re going to continue to study our idealised object in which thermal equilibrium between the matter and radiation is maintained. This has two very important consequences:

1. the radiation field and the matter have a well defined temperature and, since they are in thermal equilibrium with each other, the same temperature describes the radiation field and the matter

2. since the matter is in thermal equilibrium, its properties are described by the laws of statistical physics

Statistical physics is a way of looking at large systems. In a small room, there might be something like $10^{26}$ air molecules. These molecules will be spread over a large range of speeds and energies. It is not possible in practice to calculate the behaviour of a single molecule in this room. Nevertheless, the gas in the room does have a few well defined properties, such as the pressure, temperature and density. Calculating these properties is the domain of statistical physics.

The idea behind statistical physics is that although we cannot predict a single particle’s energy, or speed, what we can do is calculate the probability that it will have a given energy or speed. One of the most important results in statistical physics is that the probability of a particle having an energy, $E$, depends upon the energy and temperature, like so

$$P(E) \propto e^{-E/kT}. \tag{6.1}$$

This is known as the Boltzmann distribution. The behaviour it predicts makes intuitive sense. The probability that a particle has energy, E depends on the energy and the temperature. At a given temperature, a particle is not likely to have energies much higher than kT . Also, as the temperature increases, it becomes more likely that particles will have higher energies. Deriving this equation is beyond us at the moment, but everything we will discuss today follows directly on from this result.

6.1 Maxwell-Boltzmann distribution of particle speeds

Let’s consider our object in which the radiation and matter are in thermal equilibrium. Not every particle is going to have the same speed, so there will be a distribution of speeds. What is it? Amazingly, we can derive it using nothing more than the Boltzmann distribution. The details of the derivation involve some difficult maths, so I’m just going to cover the important steps below.

Since the gas is in thermal equilibrium, the Boltzmann distribution states that the probability that a gas particle has energy $E$ is proportional to $e^{-E/kT}$. But for a gas particle with speed $v$ and mass $m$, the energy is just the kinetic energy, $E = \frac{mv^2}{2}$. So the probability that a particle has a speed $v$ is given by

$$P(v) \propto e^{-mv^2/2kT}.$$

What we want to know is how many gas particles there are with speeds between $v$ and $v + dv$. Since $dv$ is a very small change in speed, the probability that a particle has a speed in this range will simply be $P(v)$, to a good approximation. The fraction of particles between $v$ and $v + dv$ is proportional to the probability a particle has this speed, multiplied by the number of possible speeds between $v$ and $v + dv$. The number of possible speeds turns out to be proportional to $v^2 dv$ (the volume of a thin spherical shell of radius $v$ in velocity space). Finally, the total number of particles with speeds in the range of interest is the fraction of particles with these speeds multiplied by the total number of particles, $n$, and so we can write that the number of gas particles with speeds between $v$ and $v + dv$ is

$$n(v)\,dv \propto n v^2 e^{-mv^2/2kT}\,dv. \tag{6.2}$$

We can get rid of the proportionality by realising that if we integrate equation (6.2) over all speeds, the answer must equal the total number of particles, $n$. Doing this gives the Maxwell-Boltzmann distribution of speeds in a gas:

$$n(v)\,dv = 4\pi n \left(\frac{m}{2\pi kT}\right)^{3/2} v^2 e^{-mv^2/2kT}\,dv. \tag{6.3}$$

This result gives the distribution of speeds in a gas which is in thermal equilibrium. The speeds of individual gas particles will change constantly as particles collide with each other, but the distribution of speeds within the gas as a whole does not change.
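The normalisation of the Maxwell-Boltzmann distribution can be checked numerically: integrating n(v)/n over all speeds should give exactly 1. The sketch below does this with a simple midpoint rule for hydrogen atoms at 6000 K. The function names, the approximate hydrogen atom mass and the upper limit of 10⁵ m/s (standing in for infinity) are my own assumptions.

```python
import math

K = 1.380649e-23          # Boltzmann constant, J/K
M_HYDROGEN = 1.6735e-27   # approximate mass of a hydrogen atom, kg


def speed_distribution(v, temperature, mass):
    """n(v)/n from equation (6.3): fraction of particles per unit speed."""
    a = mass / (2 * K * temperature)
    return 4 * math.pi * (a / math.pi) ** 1.5 * v**2 * math.exp(-a * v**2)


def integrate(func, lo, hi, steps=20000):
    """Simple midpoint-rule integral of func between lo and hi."""
    dx = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * dx) for i in range(steps)) * dx


if __name__ == "__main__":
    T = 6000.0
    total = integrate(lambda v: speed_distribution(v, T, M_HYDROGEN), 0.0, 1.0e5)
    print(f"integral of n(v)/n over all speeds = {total:.6f}")
```

Changing the temperature or the particle mass reshapes the curve, but the area underneath it stays fixed at 1.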

6.2 Properties of the Maxwell-Boltzmann distribution

Figure 6.1: Maxwell-Boltzmann distribution, n(v)/n, for hydrogen atoms at a temperature of 6000 K. The most probable speed is labelled.

What does the Maxwell-Boltzmann distribution look like? Figure 6.1 shows the distribution of speeds of hydrogen atoms at 6000K. At low speeds the $v^2$ term in equation (6.3) dominates. Note that because of this there are no particles with zero speed, regardless of temperature! At high speeds, the exponential term dominates, which makes very high speeds (where $mv^2 \gg 2kT$) unlikely. In between there is a maximum of the Maxwell-Boltzmann distribution, which defines the most probable speed.

The most probable speed can be found by differentiating the Maxwell-Boltzmann distribution, and finding the point at which the slope is zero. Once again, the maths is awkward and adds nothing to our physical understanding, so I will just give the result, that the most probable speed is

$$v_p = \left(\frac{2kT}{m}\right)^{1/2}. \tag{6.4}$$

Using $E = \frac{1}{2}mv^2$, we can convert this to the kinetic energy of a particle moving at the most probable speed, and find

$$E_p = kT. \tag{6.5}$$
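Since the derivation of the most probable speed is skipped, it is reassuring to check the result numerically. The sketch below evaluates equation (6.4) for hydrogen atoms at 6000 K and compares it with the peak found by a brute-force search over a grid of speeds. The function names, the approximate hydrogen atom mass and the grid spacing (1 m/s) are my own choices.

```python
import math

K = 1.380649e-23          # Boltzmann constant, J/K
M_HYDROGEN = 1.6735e-27   # approximate mass of a hydrogen atom, kg


def most_probable_speed(temperature, mass):
    """Equation (6.4): v_p = sqrt(2 k T / m), in m/s."""
    return math.sqrt(2 * K * temperature / mass)


def speed_distribution(v, temperature, mass):
    """n(v)/n from equation (6.3): fraction of particles per unit speed."""
    a = mass / (2 * K * temperature)
    return 4 * math.pi * (a / math.pi) ** 1.5 * v**2 * math.exp(-a * v**2)


def numerical_peak(temperature, mass, v_max=5.0e4, steps=50000):
    """Locate the maximum of n(v)/n by brute-force search on a grid."""
    dv = v_max / steps
    speeds = (i * dv for i in range(1, steps + 1))
    return max(speeds, key=lambda v: speed_distribution(v, temperature, mass))


if __name__ == "__main__":
    T = 6000.0
    v_p = most_probable_speed(T, M_HYDROGEN)
    print(f"v_p from equation (6.4) = {v_p:.0f} m/s")
    print(f"v_p from grid search    = {numerical_peak(T, M_HYDROGEN):.0f} m/s")
    print(f"kinetic energy at v_p   = {0.5 * M_HYDROGEN * v_p**2 / (K * T):.3f} kT")
```

The two estimates of the peak should agree to within the grid spacing, and the kinetic energy at that speed comes out as kT, as in equation (6.5).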

6.2.1 Mean energy

If we look again at figure 6.1, we see that it is not symmetric around the most probable speed. Instead, there is a tail extending towards higher velocities. What this means is that the mean of the distribution is not the same as the most probable speed (known as the mode of the distribution). How do we calculate the mean speed or, more interestingly, the mean energy of the Maxwell-Boltzmann distribution? The mean energy is defined by

$$\bar{E} = \frac{\int_0^\infty n(E)\,E\,dE}{n}. \tag{6.6}$$

This equation should make sense to you; to calculate a mean you add up all the particle energies ($\int n(E)\,E\,dE$) and divide by the total number of particles, $n$. We solve the integral in equation (6.6) by using $E = \frac{1}{2}mv^2$ to convert the Maxwell-Boltzmann distribution into the distribution of energies, $n(E)$. In doing so, we find the mean particle energy is

$$\bar{E} = \frac{3}{2}kT. \tag{6.7}$$
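The mean energy can also be checked numerically. Rather than converting to a distribution of energies, the sketch below averages $\frac{1}{2}mv^2$ directly over the distribution of speeds, which is an equivalent way of doing the same sum. The function names and the choice to stop the integral at ten times the most probable speed are my own.

```python
import math

K = 1.380649e-23          # Boltzmann constant, J/K
M_HYDROGEN = 1.6735e-27   # approximate mass of a hydrogen atom, kg


def speed_distribution(v, temperature, mass):
    """n(v)/n from equation (6.3): fraction of particles per unit speed."""
    a = mass / (2 * K * temperature)
    return 4 * math.pi * (a / math.pi) ** 1.5 * v**2 * math.exp(-a * v**2)


def mean_kinetic_energy(temperature, mass, steps=20000):
    """Average of (1/2) m v^2 weighted by n(v)/n, using a midpoint rule.
    The integral is cut off at ten times the most probable speed."""
    v_max = 10 * math.sqrt(2 * K * temperature / mass)
    dv = v_max / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv
        total += 0.5 * mass * v**2 * speed_distribution(v, temperature, mass) * dv
    return total


if __name__ == "__main__":
    T = 6000.0
    ratio = mean_kinetic_energy(T, M_HYDROGEN) / (K * T)
    print(f"mean energy = {ratio:.4f} kT")
```

The answer does not depend on the mass of the particles: heavier particles move more slowly on average, but carry the same mean energy at a given temperature.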

6.2.2 Equipartition

Before we move on to discuss the pressure of our gas, let’s take a quick look at our result for the mean energy. Our particles have a mean (kinetic) energy of $\frac{3}{2}kT$. Of course, our particles are free to move in three dimensions, so their speed $v$ has components directed along each axis ($v_x$, $v_y$, $v_z$). Each of these components has a corresponding kinetic energy ($\frac{1}{2}mv_x^2$, $\frac{1}{2}mv_y^2$, $\frac{1}{2}mv_z^2$), and since there is no reason to think that any one component will be larger than any other, we must conclude that all of these components share, on average, an equal part of the total energy!

This means that each direction of motion (we call them degrees of freedom) has, on average, an energy $\frac{1}{2}kT$ associated with it. This is known as the equipartition theorem. Although we have derived it in quite a specific setting, it applies much more generally to systems of many particles in thermal equilibrium, and is tremendously useful throughout astrophysics.

6.3 Pressure

6.3.1 Gas Pressure

Figure 6.2: A cubic volume of gas in thermal equilibrium, with axes x, y and z.

Having looked at the Maxwell-Boltzmann distribution, and the theory of equipartition, we are now in a position to derive the pressure caused by a gas in thermal equilibrium. We are going to make one simplifying assumption; we are going to assume our gas consists of randomly moving, non-interacting particles. Such a gas is called an ideal gas. Because our ideal gas is in thermal equilibrium, the speeds of the particles follow the Maxwell-Boltzmann distribution and all the results we calculated for average energy etc. apply.

We’re going to think about a cubic box of gas, shown in figure 6.2. The force on the walls of our box is caused by the collisions of gas particles with the walls. During a collision the particle’s momentum is changed. The force felt by the wall is equal in size to the rate of change of momentum of the particles. Let’s look at just one wall of the box. Symmetry tells us the pressure must be the same on all walls of the box, so we can pick any wall we choose. Let’s look at the right-hand wall. All of our gas particles are moving in three dimensions, but it is only the x-component of their velocity, $v_x$, which causes them to hit this wall. When the particle hits the wall we assume the collision is elastic. This means that no kinetic energy is lost. Therefore, before the collision, the x-component of the particle’s velocity was $v_x$; after the collision it is $-v_x$. The size of the change in momentum of the particle is

$$\Delta p = 2mv_x.$$

On average, the time taken between collisions with the right-hand wall will be the time it takes a particle with an x-velocity $v_x$ to travel to the opposite wall, rebound and collide with the right-hand wall again. If our box has sides of length $L$, this time is $\Delta t = 2L/v_x$. The rate of change of momentum is thus

$$\frac{\Delta p}{\Delta t} = 2mv_x \cdot \frac{v_x}{2L} = \frac{mv_x^2}{L}.$$

Since force is equal to the rate of change of momentum, the force on the wall of the box is $mv_x^2/L$. The pressure on the box wall is the force per unit area. The area of the box wall is $L^2$ and so the pressure due to a single particle (labelled $i$) is

$$P_i = \frac{mv_{x,i}^2}{L^3} = \frac{mv_{x,i}^2}{V},$$

where $V$ is the volume of the box. If we have $N$ particles in the box, then we find the total pressure by adding up the contributions from all particles:

$$P = \sum_{i=1}^{N} P_i = \frac{m}{V}\sum_{i=1}^{N} v_{x,i}^2 = \frac{m}{V}\left(v_{x,1}^2 + v_{x,2}^2 + \dots + v_{x,N}^2\right).$$

Since $(v_{x,1}^2 + v_{x,2}^2 + \dots + v_{x,N}^2) = N\overline{v_x^2}$, we can write

$$P = \frac{Nm\overline{v_x^2}}{V}. \qquad (6.8)$$

Our equation gives the pressure in terms of the x-component of the particle velocities. It would be more interesting to re-write it in terms of the particle speed. We know that the speed, $v$, obeys $v^2 = v_x^2 + v_y^2 + v_z^2$. Also, since there is no preferred direction of motion for our particles, the mean square velocities in the x, y, and z directions should all be equal, so $\overline{v_x^2} = \overline{v_y^2} = \overline{v_z^2}$. Therefore we can write

$$\overline{v_x^2} = \frac{1}{3}\overline{v^2}.$$

Substituting this result into equation (6.8) we can replace $v_x$ with $v$ and obtain

$$P = \frac{1}{3}\frac{Nm}{V}\overline{v^2}. \qquad (6.9)$$

The reason we have gone to all this trouble to write equation (6.9) in terms of the particle speed is that $m\overline{v^2}$ is just twice the mean energy of the gas particles. But since the gas is in thermal equilibrium, the speed distribution is the Maxwell-Boltzmann distribution, and we worked out in section 6.2.1 that the mean energy of the gas particles was $\frac{3}{2}kT$. So we substitute in $m\overline{v^2} = 3kT$ to obtain

$$P = \frac{N}{V}kT = nkT, \qquad (6.10)$$

where $n = N/V$ is the number of particles per unit volume. This is the ideal gas law, which is hopefully familiar to you! The ideal gas law is an equation of state for an ideal gas. An equation of state relates pressure, density and temperature.

We made a starting assumption in our derivation: that the gas particles do not interact with each other. This is not true of real atoms and molecules, so we can expect the ideal gas law to be an approximation to the behaviour of real gases. In fact, it is a good approximation up to very high densities and the ideal gas law is used widely in astrophysics. It is used, for example, to describe the state of matter in the interiors of stars. Only when the gas density becomes extremely high can we no longer safely ignore the interactions between gas particles. For very compact objects (e.g. white dwarfs and neutron stars) the ideal gas law breaks down and we need to derive new equations of state.
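The chain of reasoning from the sum over particles to the ideal gas law can be tested with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not part of the notes: it assumes that each velocity component of a gas in thermal equilibrium is normally distributed with variance kT/m (equivalent to the Maxwell-Boltzmann distribution of speeds, but not derived here), and the particle number, box volume and hydrogen mass are arbitrary illustrative choices.

```python
import math
import random

K_B = 1.380649e-23   # Boltzmann constant, J/K
M_H = 1.6735e-27     # mass of a hydrogen atom, kg (illustrative choice)


def simulate_pressure(n_particles, T, volume, m=M_H, seed=42):
    """Add up the single-particle pressures m v_x^2 / V for random particles."""
    rng = random.Random(seed)
    sigma = math.sqrt(K_B * T / m)   # assumed spread of each velocity component
    sum_vx2 = 0.0
    sum_v2 = 0.0
    for _ in range(n_particles):
        vx = rng.gauss(0.0, sigma)
        vy = rng.gauss(0.0, sigma)
        vz = rng.gauss(0.0, sigma)
        sum_vx2 += vx * vx
        sum_v2 += vx * vx + vy * vy + vz * vz
    pressure = m * sum_vx2 / volume   # the sum leading to equation (6.8)
    return pressure, sum_vx2 / n_particles, sum_v2 / n_particles


N, T, V = 100_000, 6000.0, 1.0e-3   # particles, kelvin, cubic metres
p_sim, mean_vx2, mean_v2 = simulate_pressure(N, T, V)
p_ideal = N * K_B * T / V           # equation (6.10)

print(f"simulated pressure  = {p_sim:.4e} Pa")
print(f"ideal gas law       = {p_ideal:.4e} Pa")
print(f"mean vx^2 / mean v^2 = {mean_vx2 / mean_v2:.4f}")
```

With this many particles the simulated pressure should agree with nkT to within about a percent, and the ratio of the mean square x-velocity to the mean square speed should be close to 1/3. The small residual differences are statistical scatter from the finite number of particles.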

Figure 6.3: NanoSail-D, a solar sail spacecraft designed by NASA to test methods of deploying solar sails for space travel. Unfortunately the spacecraft was lost in a launch failure of the rocket it was onboard.

6.3.2 Radiation Pressure

We saw in lecture 2 that photons also carry momentum. Remember, the momentum of a photon is $p = E/c$, where $E$ is the photon energy. Since photons carry momentum they too can exert a pressure. This pressure is, unsurprisingly, known as radiation pressure. We could go through the derivation of the ideal gas law again, and replace our gas particles (with momentum $mv$) with photons (with momentum $E/c$). Rather than go through all the steps again, I shall just state the result

$$P_{\rm rad} = \frac{1}{3}aT^4, \qquad (6.11)$$

where $a = 4\sigma/c$ ($\sigma$ is the Stefan-Boltzmann constant, $5.67 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$). Radiation pressure is generally a very small effect. Note however that it is strongly temperature dependent; radiation pressure is a significant effect in the highest mass stars, which are also the hottest stars. Radiation pressure is also the reason why solar sails (figure 6.3) could be used for long distance space travel. The radiation pressure on a solar sail is tiny, but over time it could accelerate a spacecraft to a reasonable speed. The great advantage is that no fuel need be carried, so the range of a solar sail spacecraft is unlimited!
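To get a feel for the size of the radiation pressure and its steep temperature dependence, equation (6.11) can be evaluated directly. This is a minimal sketch; the function name and the two temperatures are illustrative choices rather than values taken from the notes.

```python
SIGMA = 5.670374e-8    # Stefan-Boltzmann constant, W m^-2 K^-4
C = 2.99792458e8       # speed of light, m/s

A_RAD = 4.0 * SIGMA / C   # the constant a in equation (6.11), J m^-3 K^-4


def radiation_pressure(T):
    """Equation (6.11): P_rad = a T^4 / 3, in pascals."""
    return A_RAD * T ** 4 / 3.0


p_cool = radiation_pressure(6.0e3)   # a temperature like that in figure 6.1
p_hot = radiation_pressure(6.0e4)    # ten times hotter (illustrative)

print(f"a            = {A_RAD:.4e} J m^-3 K^-4")
print(f"P_rad(6e3 K) = {p_cool:.4f} Pa")
print(f"P_rad(6e4 K) = {p_hot:.4e} Pa")
print(f"ratio        = {p_hot / p_cool:.1f}")
```

At 6000 K the radiation pressure is only a fraction of a pascal, but a tenfold rise in temperature increases it by a factor of ten thousand, which is why it matters most in the hottest stars.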

Chapter 7

A brief history of astronomical spectroscopy

Spectroscopy is the task of measuring the amount of light as a function of wavelength. When taking spectra of astronomical objects, we are measuring the monochromatic flux as a function of wavelength. The history of spectroscopy in general, and the history of spectroscopy in astronomy and astrophysics, are closely linked. A continuing theme throughout the history of spectroscopy has been that new technologies have allowed scientific breakthroughs. So the best place to start is with Isaac Newton, and the technology that allowed spectroscopy in the first place: the prism.

7.1 Isaac Newton and the nature of light

When Isaac Newton started studying light, it was already well known that a beam of light, shone through a prism, would split into many different colours. At the time, it was believed that white light had no colour, and the prism itself caused the light to have colour. Newton showed that this wasn’t true with a very clever experimental setup (see figure 7.1). In 1672, Newton shone window light through a prism and showed that the white light split into many colours. Then, he isolated the red light from this beam and passed this through a second prism. The red light remained unchanged. This proved that white light was made up of coloured light, and the prism merely split the white light up into its constituent colours. The prism was to remain the spectroscopic tool of choice for nearly 200 years.

Figure 7.1: Newton’s sketch of his experiment on the nature of sunlight, which he called his crucial experiment.

7.2 Wollaston and the dark bands in sunlight (1802)

Newton’s experiments were some of the first spectroscopic observations of a star (the Sun). It wasn’t until 1802, however, that William Wollaston showed that sunlight was not simply a continuous spectrum. Wollaston was a chemist principally, but he was also very interested in optics. He invented the first camera lens, and also the Wollaston prism, used for measuring the polarisation of light. His prisms were of much higher quality than Newton’s, and they allowed him to see for the first time that sunlight was not an unbroken, continuous spectrum. Instead, there were several prominent dark bands which he observed. Wollaston believed that these dark bands were “gaps” in the colours of the Sun.

7.3 Joseph von Fraunhofer

Fraunhofer was a Bavarian, orphaned at the age of 11, when he went to work as an apprentice to a glassmaker. At the age of 14, Fraunhofer was involved in an accident when the workshop he was in collapsed. In a bizarre twist of fate, he was rescued by the Bavarian prince, Maximilian IV Joseph, who took an interest in Fraunhofer’s life, providing him with access to books, time to do research and glassmaking materials.

Fraunhofer became probably the best maker of optics in the world. He discovered the dark bands in the Sun’s spectrum, independently of Wollaston, in 1814. In total, he found 574 dark lines, which are named after him: the Fraunhofer lines (see figure 7.2).

Figure 7.2: A graphical representation of the Fraunhofer lines in the Sun’s spectrum. Fraunhofer labelled the strongest lines A through K, whilst the weaker lines were labelled with lower case letters. Astronomers sometimes still use these names today.

Fraunhofer’s optics allowed him to see that the Fraunhofer lines were not “gaps” in the solar spectrum, as Wollaston thought, but were instead absorption lines; discrete wavelengths at which the Sun was fainter, but at which it still emitted light. Fraunhofer was not really interested in these lines from a scientific point of view. Instead, he was looking for a way to calibrate his spectroscope, the precursor to the modern spectrograph. An example of a spectroscope is shown in figure 7.3; light from the object of study comes down a scope and is focussed on a prism, which disperses the light. Another scope then views the light which emerges from the prism; by moving this scope’s position you detect light of different wavelengths. Fraunhofer was using the Sun’s dark lines as fixed wavelength references, so he could calibrate the relationship between the spectroscope’s position and wavelength. Neither he nor anyone else at the time had any understanding of where the dark lines came from.

7.4 Kirchhoff & Bunsen

A major breakthrough in the understanding of astronomical spectra came with the work of Kirchhoff and Bunsen, between 1859 and 1861. Using a spectroscope of a type designed by Fraunhofer (figure 7.3), they examined the spectra of the flames produced by burning various elements. What they observed was that each element produced numerous bright emission lines, and that the wavelengths of these lines were characteristic of each individual element, and could be used as a “fingerprint” for that element.

Figure 7.3: Kirchhoff & Bunsen’s spectroscope, which they used to look at the spectra produced by burning various elements. Light from the flame is focussed on prism F by scope B. The prism disperses this light and, by moving scope C, the amount of light at a given wavelength can be measured.

Kirchhoff and Bunsen also produced, by experiment, a series of rules for the type of spectrum observed from various types of light sources (figure 7.4). They found that hot, dense gases (or solids) produced a continuous spectrum (described by the Planck function we discussed earlier). Hot gases which were of low density produced a series of emission lines, characteristic of the composition of the gas. Also, they found that when a hot solid or dense gas was observed through a cooler, less dense gas the spectrum observed was a continuous spectrum, but with a series of absorption lines superimposed. The absorption lines were characteristic of the cool gas, and thus presumably produced by it.

Kirchhoff and Bunsen’s work allowed an understanding of the origin of the Fraunhofer lines in the Sun’s spectrum. They showed that the Fraunhofer lines could be identified with lines emitted by known elements (hydrogen, calcium, sodium, etc). The absorption line nature of the solar spectrum implied that we were observing the light from the hot, inner regions of the Sun, after it passed through the cooler surface layers.

Figure 7.4: Kirchhoff & Bunsen’s empirical rules for the type of spectrum observed from different types of light sources. A hot, high density gas or hot solid gives a continuous spectrum; a hot, low density gas gives emission lines; a hot, dense gas or hot solid seen through a cool, low density gas gives a continuous spectrum with absorption lines.

7.5 Huggins (1863-1864)

With the work of Kirchhoff and Bunsen, finally there was a framework for understanding astronomical spectroscopy. The effect on astronomers was astounding. The notable British astronomer William Huggins summed up the mood by saying

“Astronomy, the oldest of the sciences, has more than renewed her youth. At no time in the past has she been so bright with unbounded aspirations and hopes” - Huggins (1891).

Huggins in particular took Kirchhoff and Bunsen’s results and put them to great use. Combining large telescopes with early precursors of the modern astronomical spectrograph he obtained many spectra of nearby stars and used the absorption lines in their spectra to work out their composition. In a result that startled many at the time, it turned out that stars were made out of the same material as the Sun, and by extension, from the common elements found on the Earth. In his paper with Miller in 1864, Huggins wrote:

“It is remarkable that the elements most widely diffused through the host of stars are some of those most closely connected with the living organisms of our globe, including hydrogen, sodium, magnesium and iron...” - Huggins & Miller (1864)

Working with his wife, Margaret, Huggins also took some steps towards the understanding of nebulae. These faint, diffuse smudges of light were hotly debated at the time. Some claimed they were clouds of brightly glowing gas whilst others argued that they were distant collections of many stars. By taking spectra of the brightest nebulae, Huggins showed that they possessed emission line spectra. Following the work of Kirchhoff and Bunsen, this proved that the brightest nebulae were clouds of hot, sparse gas. Later on, many of the fainter nebulae were shown to have absorption line spectra. It was the work of Leavitt and Hubble, culminating in the 1920s, which showed that these nebulae, now called galaxies, were very distant collections of stars, like our own Milky Way.

7.6 Spectral classification of stars: Secchi (1866)

Figure 7.5: Secchi’s spectral classification scheme (1866)

Many people were by now acquiring large collections of stellar spectra. It wasn’t long before they noticed patterns emerge, and started to place stars into groups defined by their spectral appearance. This act of spectral classification was not motivated by an understanding of why stars shared similar properties, but instead was a purely empirical exercise, somewhat akin to the classifying of animals into species within biology. Nevertheless, the eventual spectral sequence of stars would prove to be hugely important within astrophysics, and understanding the spectral sequence will form the next part of our course.

The first spectral classification scheme was devised by a Jesuit priest, Angelo Secchi, in 1866. He had acquired of the order of a thousand stellar spectra, and divided them into 4 main classes, based upon the types of absorption lines which appeared in their spectra. Secchi’s sequence is shown in figure 7.5. The basic idea of Secchi’s sequence, classifying stars by their absorption lines, remains with us today. However, the development of the modern stellar spectral classification scheme required the analysis of a very large number of stellar spectra. In turn, this relied on two developments in astrophysics: one technological, and one social.

7.7 The Harvard spectral classification sequence (1886-1922)

Figure 7.6: The Harvard group

The Harvard astronomer Edward Pickering was amongst the first to use objective prism spectroscopy to collect large numbers of stellar spectra. Objective prism spectroscopy uses a large prism to disperse the light from all the stars in the field of view of a telescope. Using the large photographic plates which were newly available, all of these spectra could be recorded in a single image. However, such a large amount of data was being produced that Pickering and his colleagues could not keep up with its analysis.

A perhaps apocryphal tale suggests that Pickering, exasperated with the rate of progress provided by his postdoctoral assistants, claimed that even his housemaid could be more productive. Whether or not this is true, Pickering certainly hired his housemaid, Williamina Fleming, and a sizeable staff of other women astronomers. Pickering’s staff was hardworking, dedicated, talented and (most importantly) cheap to employ. Because they were female, they were employed at very low wages, allowing Pickering to process a large amount of data for very little money.

The large number of women on Pickering’s staff created a stir at the time, and the Harvard group acquired several nicknames, from the slightly insulting “Harvard Computers”, to the downright patronising “Pickering’s Harem”. Despite this, the contribution of the Harvard group to astronomy was considerable. As well as the modern spectral classification scheme, we have already discussed the contribution of Henrietta Leavitt, one of the Harvard group, to the measuring of distance in astrophysics, and our understanding of the scale of the Universe. The Harvard group was also the beginning of large-scale contribution to astronomy by women.

By processing many thousands of stellar spectra between 1886 and 1922, the Harvard group assembled a classification scheme which persists to this day.
The scheme is based upon the strengths of absorption lines in the stellar spectrum, and the requirement is that the absorption line strengths must vary smoothly and continuously along the spectral sequence. The Harvard scheme divides stars into seven spectral classes, each denoted by an upper case letter. The letters are OBAFGKM, and they can be remembered, in order, using the (terrible) mnemonic “Oh, Be A Fine Girl, Kiss Me”.

The details of how line strengths vary across the Harvard spectral classification sequence are shown in figure 7.7. By measuring the relative line strengths in the spectrum of any star it is a simple matter to assign it to a spectral class in the Harvard sequence.

[Figure 7.7 plot: line strength against spectral type (O5, B0, A0, F0, G0, K0, M0), with curves labelled H, He I, He II, ionised metals, neutral metals and molecules.]

Figure 7.7: The variation of line strength along the Harvard spectral classification sequence. The top graph shows the detailed behaviour of individual element species along the sequence. The species labelling denotes element and ionisation state. For example HI represents neutral, atomic hydrogen, and HeII represents singly ionised helium atoms. TiO represents molecules of titanium oxide. The bottom graph shows a schematic representation of the Harvard sequence, at the level you are expected to learn it for this course.

It is worth remembering that, at the time, this work was taxonomical. Although stars could be placed on a sequence on the basis of their line strengths, the physical meaning of this, and what properties of the star changed along the Harvard sequence, were unknown. This is because the classical physics of the time was completely unable to explain why elements produced characteristic absorption and emission lines. If we wish to understand this, and by doing so understand what the Harvard spectral sequence actually represents, we must turn to the beginnings of quantum theory, which was being developed at the turn of the 20th century.

Chapter 8

The Bohr model of the atom

The spectroscopic work carried out up to the beginning of the 20th century has left us with three facts which need explaining:

1. the presence of discrete lines in the emission spectra of elements;

2. the Kirchhoff-Bunsen rules, which dictate what kind of spectrum will be observed from different sources;

3. the nature of the Harvard Spectral Sequence.

In this section, I will try to tackle the first two questions. The question we are really seeking to answer is how an atom of, say, hydrogen interacts with light. Why does it emit and absorb at discrete wavelengths? To answer this question, we need to examine the structure of an atom.

8.1 The atom

In the final years of the 19th century, J.J. Thomson discovered the electron. Since matter (and hence atoms) is neutral, this discovery meant that the atom consists of negatively charged electrons, and some distribution of positive charge. In 1911, Ernest Rutherford showed that the positive charge was confined to a tiny, massive nucleus. He did so by firing energetic alpha particles at thin metal foils. Astoundingly, some of the alpha particles bounced off the foil. Rutherford wrote:

“It was quite the most incredible event that has ever happened to me in my life. It was almost as incredible as if you fired a 15-inch shell at a piece of tissue paper and it came back and hit you.”

Rutherford’s work led to a picture of the atom in which negatively charged electrons orbited around a tiny, positively charged and massive nucleus. This atomic picture had two major flaws. Firstly, it was known from Maxwell’s theories of electromagnetism that accelerating charges emit radiation. An electron in a circular orbit is constantly accelerating¹ and so it should emit radiation, lose energy and spiral into the nucleus. This should all happen in less than $10^{-8}$ s. Obviously matter is stable on very long timescales, so this is a major flaw in the model! Secondly, the atom as described couldn’t explain the work of Kirchhoff and Bunsen, who showed that atoms absorb and emit light at discrete wavelengths.

8.2 Niels Bohr and the ‘semi-classical’ atom

At the same time as Rutherford was beginning to understand the nature of the atom, theoretical physicists were beginning to grasp the quantised nature of light. Max Planck had derived the Planck curve for black body emission by assuming that energy is quantised, and Einstein’s work on the photoelectric effect showed that light itself exists as photons with quantised energy. The Danish physicist Niels Bohr made a great step towards solving the structure of the atom by making a massive leap of intuition. Bohr noted that the dimensions of Planck’s constant [J s] are equivalent to [kg m$^2$ s$^{-1}$], the dimensions of angular momentum. What if the angular momentum of the electron was quantised? Just as an electromagnetic wave, made out of $n$ photons of frequency $\nu$, could only have an energy of $E = nh\nu$, Bohr wondered about the consequences of an atom in which the electrons could only have quantised angular momenta given by $L = nh/2\pi$.

As we will see, Bohr’s model atom allows us to understand why atoms only absorb and emit light at certain wavelengths. He was also able to explain why the atom was stable; an electron in an orbit with an “allowed” angular momentum could not spiral into the nucleus, since that would involve passing through “forbidden” values of angular momentum. Bohr was not able to explain why the electron’s angular momentum was quantised. This would require full quantum mechanical theory, which described electrons in the atom not as particles orbiting a nucleus, but as probability waves, which describe the likely position, energy and momentum of an electron. Nevertheless, Bohr’s success in overcoming the problems faced by the classical model of the atom was a sign he was on the right track. To show this, let’s look at why Bohr’s model explains the line emission from atoms.

¹ Its speed may be constant, but its direction is constantly changing. This is because it feels the attractive force of the nucleus; it is this force that provides the acceleration.

8.3 Energy levels of electrons in the Bohr atom

Bohr’s model meant that electrons could only occupy certain orbits; those with allowed values of angular momentum L = nh/2π. What happens when an electron moves from one orbit to another? The electron’s energy must change, and that change results in the absorption or emission of a photon of the same energy. To calculate the energy involved, we need to work out the energy of electron orbits in the Bohr model. We’ll look at a hydrogen atom, as that is the simplest case we can consider.

Figure 8.1: The Bohr model of the hydrogen atom.

In a hydrogen atom we have an electron of mass $m$ and charge $-e$ in orbit around a proton of charge $+e$. The electron is in an allowed orbit, a distance $r$ from the proton, and orbits with a speed $v$ (see figure 8.1). We need to work out the energy of the orbit. We start by noting that the electrostatic attraction between the electron and proton must provide the centripetal force, which gives

$$\frac{Ze^2}{4\pi\epsilon_0 r^2} = \frac{mv^2}{r}, \qquad (8.1)$$

where $Z$ is the atomic number (the number of protons in the nucleus, $Z = 1$ for hydrogen). Re-arranging equation (8.1) gives

$$\frac{1}{2}mv^2 = \frac{Ze^2}{8\pi\epsilon_0 r}, \qquad (8.2)$$

but $\frac{1}{2}mv^2$ is the kinetic energy of the electron, so

$$\mathrm{K.E} = \frac{Ze^2}{8\pi\epsilon_0 r}. \qquad (8.3)$$

We’re trying to work out the total energy of the orbit, so we also need to know the potential energy due to the electrostatic attraction. We can calculate the potential energy by looking at the work done assembling the atom. Start with the electron at an infinite distance from the proton, and move it a small distance $dr$ towards the proton. The work done in moving that small distance $dr$ is $F\,dr$, where $F$ is the electric force on the electron. To find the potential energy of the atom, we have to add up all the work done in moving the electron from infinity to $r$. The potential energy is then given by

$$\begin{aligned} \mathrm{P.E} &= \int_\infty^r F\,dr = -\int_r^\infty F\,dr \\ &= -\int_r^\infty \frac{Ze^2}{4\pi\epsilon_0 r^2}\,dr \\ &= -\frac{Ze^2}{4\pi\epsilon_0 r}. \end{aligned} \qquad (8.4)$$

The total energy is the sum of the potential and kinetic energy:

$$\begin{aligned} E &= \mathrm{P.E} + \mathrm{K.E} \\ &= \frac{Ze^2}{8\pi\epsilon_0 r} - \frac{Ze^2}{4\pi\epsilon_0 r} \\ &= -\frac{Ze^2}{8\pi\epsilon_0 r}. \end{aligned} \qquad (8.5)$$

The total energy is negative because the electron is bound to the proton; if we wish to free the electron from the proton, we must add energy. So far, our derivation has been entirely classical. We make a semi-classical model by adding Bohr’s hypothesis that the angular momentum is quantised, so

$$L = mvr = \frac{nh}{2\pi}, \qquad (8.6)$$

where $n = 1, 2, 3, \dots, \infty$. We can use this to solve for the radius of the electron’s orbit, $r$. We take equation (8.1), which we obtained from balancing the Coulomb force and centripetal acceleration, and we re-write it like so

$$\frac{4\pi\epsilon_0}{mr}(mvr)^2 = Ze^2. \qquad (8.7)$$

We then substitute equation (8.6) for $mvr$ in equation (8.7) to find

$$\frac{4\pi\epsilon_0}{mr}\left(\frac{nh}{2\pi}\right)^2 = Ze^2, \qquad (8.8)$$

which can be re-arranged to give the radius of the $n^{\rm th}$ orbit, $r_n$, as

$$r_n = \frac{4\pi\epsilon_0}{mZe^2}\left(\frac{nh}{2\pi}\right)^2. \qquad (8.9)$$

Now we know the radius of the orbit, we can substitute this back into the equation for the total energy of the $n^{\rm th}$ orbit, equation (8.5), to find

$$\begin{aligned} E_n = -\frac{Ze^2}{8\pi\epsilon_0 r_n} &= -\frac{Ze^2}{8\pi\epsilon_0}\,\frac{mZe^2}{4\pi\epsilon_0}\left(\frac{2\pi}{nh}\right)^2 \\ &= -\frac{Z^2e^4m}{8\epsilon_0^2h^2}\,\frac{1}{n^2} \\ &= -\frac{W}{n^2}, \end{aligned} \qquad (8.10)$$

where $W$ is a constant for any given atom. Therefore, the integer $n$, known as the principal quantum number, completely determines the radius and energy of each orbit of the Bohr atom. When the electron is in the lowest energy level ($n = 1$, the ground state), its energy is simply $E = -W$. Since it would take an amount of energy equal to $W$ to remove the electron from the atom, $W$ is the ionisation energy of the atom.
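It is instructive to put numbers into equations (8.9) and (8.10). The following is a minimal sketch rather than part of the original notes: the function names are hypothetical, and the physical constants are standard reference values that the notes themselves do not quote.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837e-31          # electron mass, kg
EPS_0 = 8.8541878e-12        # permittivity of free space, F/m
H = 6.62607015e-34           # Planck constant, J s


def orbit_radius(n, Z=1):
    """Equation (8.9): radius of the n-th Bohr orbit, in metres."""
    return (4.0 * math.pi * EPS_0 / (M_E * Z * E_CHARGE ** 2)) * (
        n * H / (2.0 * math.pi)
    ) ** 2


def ionisation_energy(Z=1):
    """The constant W in equation (8.10), in joules."""
    return Z ** 2 * E_CHARGE ** 4 * M_E / (8.0 * EPS_0 ** 2 * H ** 2)


def orbit_energy(n, Z=1):
    """Equation (8.10): total energy of the n-th orbit, in joules."""
    return -ionisation_energy(Z) / n ** 2


W_eV = ionisation_energy() / E_CHARGE   # convert joules to electron volts
r_1 = orbit_radius(1)

print(f"W for hydrogen      = {W_eV:.2f} eV")
print(f"radius of n=1 orbit = {r_1:.3e} m")
for n in range(1, 5):
    print(f"E_{n} = {orbit_energy(n) / E_CHARGE:8.3f} eV")
```

For hydrogen this gives an ionisation energy of about 13.6 eV and a ground-state orbit radius of about 0.53 × 10⁻¹⁰ m. Because W scales as Z², calling `ionisation_energy(Z=2)` returns four times the hydrogen value.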

8.4 Atomic lines of hydrogen

We are now in a position to understand the lines emitted in the spectrum of hydrogen. When an electron moves between one orbit and another, it emits or absorbs a photon. The energy of the photon is given by the difference in energy between the two orbits, $\Delta E = E_{n_1} - E_{n_2}$. Equation (8.10) leads to an equation for the energy of the emitted or absorbed photon

$$E_{\rm photon} = E_{n_1} - E_{n_2} = W\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right). \qquad (8.11)$$

Since, for a photon, $E = h\nu$, the corresponding frequency of the photon is

$$\nu = \frac{W}{h}\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right). \qquad (8.12)$$

And, finally, we can use the wave equation $\nu\lambda = c$ to obtain the wavelength of the photon as

$$\lambda = \frac{ch}{W}\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right)^{-1} = \frac{1}{R}\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right)^{-1}, \qquad (8.13)$$

where $R = 1.097 \times 10^7$ m$^{-1}$ is the Rydberg constant. These equations give the wavelengths/frequencies of the radiation that would be emitted/absorbed when an electron jumps from one level to another. When an electron jumps from the $n = 3$ orbit to the $n = 2$ orbit, a photon of wavelength $\lambda = \frac{1}{R}\left(\frac{1}{4} - \frac{1}{9}\right)^{-1} = 656$ nm is emitted. The reverse process can also occur; an electron in the $n = 2$ orbit can absorb a photon of wavelength 656 nm and jump to the $n = 3$ orbit.
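The worked example above is easy to reproduce and to extend to other transitions. Here is a minimal sketch of equation (8.13), using the value of the Rydberg constant quoted in the text; the function name and the choice of transitions are illustrative assumptions.

```python
R = 1.097e7   # Rydberg constant, m^-1 (value quoted in the text)


def transition_wavelength_nm(n_upper, n_lower):
    """Equation (8.13): wavelength of the photon emitted or absorbed
    in a jump between orbits n_upper and n_lower, in nanometres."""
    if n_upper <= n_lower:
        raise ValueError("n_upper must be larger than n_lower")
    inverse_wavelength = R * (1.0 / n_lower ** 2 - 1.0 / n_upper ** 2)
    return 1.0e9 / inverse_wavelength


# The n = 3 to n = 2 jump discussed in the text.
h_alpha = transition_wavelength_nm(3, 2)

# A second example: the n = 4 to n = 2 jump.
h_beta = transition_wavelength_nm(4, 2)

print(f"n = 3 -> 2 : {h_alpha:.1f} nm")
print(f"n = 4 -> 2 : {h_beta:.1f} nm")
```

The first value reproduces the 656 nm quoted above. The same wavelength applies whether the photon is emitted (3 → 2) or absorbed (2 → 3), since only the difference in energy between the two orbits matters.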

8.4.1 Hydrogen line series

Considering the energy levels in more detail, it is clear that the line spectrum of hydrogen will exhibit a number of “series”, associated with transitions to and from a given orbit. For example, transitions from the $n = 3, 4, \dots, \infty$ energy levels to the $n = 2$ level cause a series of emission lines known as the Balmer series, often denoted with the letter H (see figure 8.2). The line we considered above, when the electron jumps from $n = 3$ to $n = 2$, is part of the Balmer series. In fact, it is the first line of the Balmer series, H$\alpha$. Its wavelength of 656 nm is in the middle of the optical part of the spectrum. The Balmer series is exactly the same series of lines observed by Kirchhoff and Bunsen. There are also series of lines in the ultraviolet, corresponding to transitions to and from the $n = 1$ ground state (the Lyman series), and an infrared series of lines, corresponding to transitions to and from the $n = 3$ orbit (the Paschen series).

Look at the spacing of the energy levels in figure 8.2, and compare it to equation (8.10). As $n$ increases, the energies of the orbits become more closely spaced. Therefore the difference in energy, or frequency, or wavelength between successive lines in a series gets smaller as $n$ increases. Eventually, the spacing of the energy levels approaches zero, and the emission or absorption lines get very closely spaced indeed. We say the series has reached its limit.
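The way the lines of a series crowd together towards the series limit can be seen by tabulating a few wavelengths from equation (8.13). The sketch below is illustrative only: the function names are hypothetical, and the series limit is obtained by letting the upper level go to infinity in equation (8.13), which gives a limiting wavelength of $n_2^2/R$.

```python
R = 1.097e7   # Rydberg constant, m^-1 (value quoted in the text)


def wavelength_nm(n_upper, n_lower):
    """Equation (8.13), in nanometres."""
    return 1.0e9 / (R * (1.0 / n_lower ** 2 - 1.0 / n_upper ** 2))


def series_limit_nm(n_lower):
    """Limit of equation (8.13) as the upper level goes to infinity."""
    return 1.0e9 * n_lower ** 2 / R


# First eight lines of the Balmer series (jumps down to n = 2).
balmer = [wavelength_nm(n, 2) for n in range(3, 11)]

# Gap in wavelength between each line and the next one in the series.
gaps = [balmer[i] - balmer[i + 1] for i in range(len(balmer) - 1)]

for n, lam in zip(range(3, 11), balmer):
    print(f"n = {n:2d} -> 2 : {lam:6.1f} nm")
print("gaps between successive lines (nm):", [round(g, 1) for g in gaps])

for name, n_lower in (("Lyman", 1), ("Balmer", 2), ("Paschen", 3)):
    print(f"{name:8s} series limit = {series_limit_nm(n_lower):6.1f} nm")
```

The gaps between successive Balmer lines shrink steadily, and the series limits fall in the ultraviolet for the Lyman series and in the infrared for the Paschen series, consistent with the description above.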

Figure 8.2: Energy level diagram for a hydrogen atom showing the Lyman, Balmer and Paschen lines (downward arrows indicate emission lines; upward arrows indicate absorption lines).

8.4.2 Hydrogen-like atoms

When we derived the energy levels for hydrogen, we set the atomic number, $Z = 1$. However, you would get a similar series of energy levels for any atom consisting of a single electron orbiting a nucleus containing $Z > 1$ protons. For example, singly ionised helium is such an atom, with $Z = 2$. Although the energy level diagram for any hydrogen-like atom has the same form as for hydrogen, the exact spacing of the levels depends upon the atomic number, $Z$. In the case of singly ionised helium, for example, equation (8.10) tells us that the spacing between the energy levels is four times that between the energy levels in a hydrogen atom.

8.4.3 More complex atoms

Figure 8.3: Some of the energy levels of a helium atom (2 protons, 2 electrons). A small number of possible transitions is also indicated.

Bohr’s model is very successful in describing the line spectrum of hydrogen-like atoms. However, it is important to note that Bohr’s model atom is not correct. Although the angular momentum is quantised, it is not quantised in the way suggested by Niels Bohr. To some extent, it is a matter of good luck that we obtained the correct energy level diagram for hydrogen! To calculate the energy levels of more complex atoms, it is necessary to use the complete quantum theory, and to account for the interactions between electrons, as well as between the electron and the nucleus. This calculation rapidly becomes extremely complex, and the number of possible energy levels grows rapidly as the number of electrons in the atom rises. Figure 8.3 shows a simplified energy level diagram for atomic helium. Even with a single extra electron, the energy level diagram is already much more complex, and there are many more possible transitions with corresponding emission and absorption lines.

8.5 The Kirchhoff-Bunsen laws

We are now in a position to understand the Kirchhoff-Bunsen laws, which describe what kind of spectrum will be seen from a given source.

• A hot dense gas or hot solid produces a continuum spectrum with no spectral lines². If the body is in thermal equilibrium, this spectrum is described by the Planck curve.

• A hot, diffuse gas produces emission lines. Because the gas is hot, electrons exist in excited states. When the electrons decay to lower energy orbits the energy lost is carried away by a single photon. This photon can only have certain, discrete energies, corresponding to the differences in energy between allowed orbits.

• A cool, diffuse gas in front of a continuous spectrum source (a hot solid or dense gas) produces absorption lines in the continuous spectrum. Absorption lines are produced when an electron makes a transition from a lower energy orbit to a higher energy orbit. If a photon in the continuous spectrum has exactly the right amount of energy, equal to the energy difference between two orbits, that photon can be absorbed and the electron makes the transition to a higher orbit. The cool diffuse gas thus absorbs light from the continuous spectrum, but only at discrete wavelengths. This produces an absorption line spectrum.

² Since a solid is made of atoms, why don’t solids also emit a line spectrum? The answer is that interactions between the closely spaced atoms change the energy levels available to the electrons, so that electrons can have a large range of energies. As a result, the object can emit or absorb light across a large range of wavelengths.

8.6 The use of spectral lines

It turns out that the presence of absorption and emission lines in the spectra of astrophysical objects is one of the most powerful tools available to astronomers. We can use the lines to measure velocities, using the Doppler effect. The wavelengths of lines act as fingerprints for the material in an object. In the next section we will see that the relative strengths of these lines depend upon the physical properties of the gas that produces them. These properties (temperature, density and pressure) can be determined by a careful examination of spectral lines. We will see that the strengths of the spectral lines (and thus the explanation for the Harvard spectral sequence) are strongly dependent on the temperature, giving us yet more ways of measuring the temperature of astrophysical objects!

Chapter 9

Line strength

In the last section we dealt with two of the puzzles arising from early astronomical spectroscopy. Now, we turn to the last puzzle - the Harvard spectral classification sequence.

9.1 The Harvard sequence in more detail

[Figure 9.1 sketch: line strength (vertical axis) against spectral type (horizontal axis: O5, B0, A0, F0, G0, K0, M0), with curves labelled He II, He I, H, ionised metals, ionised calcium, neutral metals and molecules.]

Figure 9.1: A crude sketch of the variation of line strength along the Harvard spectral classification sequence, at the level you are expected to remember it for this course.

The Harvard sequence classifies stars according to the strength of their absorption lines. There are various ways of presenting this information. In table 9.1 I present a summary of the trends in text form¹. Figure 9.1 shows the important trends in line strength in a graphical sketch.

Table 9.1: Harvard spectral classification

Spectral type | Characteristics
O | Blue-white stars with few lines. Strong He II lines. He I absorption lines becoming stronger.
B | Blue-white stars. He I lines peak at B2. H I (Balmer) lines increasing in strength.
A | White stars. Balmer lines strongest at A0, becoming weaker. Ca II lines becoming stronger.
F | Yellow-white. Ca II lines strengthen as Balmer lines weaken. Neutral metal lines (Fe I, Cr I) appear.
G | Yellow (solar type). Ca II lines continue to strengthen. Neutral metal lines getting stronger.
K | Cool orange. Ca II (Fraunhofer H & K) lines peak at K0. Spectra dominated by neutral metal lines.
M | Cool red. Spectra dominated by molecular absorption bands (especially TiO and VO). Strong neutral metal lines.

What is the physical basis for the Harvard sequence? Since it is based on absorption line strengths, we must try to understand what controls the absorption line strength in stars. Why does one star have strong hydrogen lines, and another have weak hydrogen lines? Our first guess might be that it is related to the abundance of hydrogen in the star’s atmosphere. Some easy observations show that this is not the case; figure 9.2 shows the Orion Nebula. Located halfway down Orion’s sword, this cloud of dust and gas is a stellar nursery. The stars we see here are only a million years old. Crucially, they have all formed from the same cloud of gas, and so we expect them to have the same composition. Nevertheless, the familiar Harvard sequence can still be seen in the young stars of Orion. So the abundance of elements in stars is not the primary cause of their line strength variations.

9.2 Line strengths

To see what factors control line strengths consider the first line of the Balmer series, Hα. An Hα absorption line is caused by an electron absorbing a photon and moving from the n = 2 energy level to the n = 3 level. Of course, for this to occur means that some electrons had to be in the n = 2 energy level in the first place. Since an excited electron will tend to decay into the ground state (n = 1), how do the electrons get into the n = 2 level in the first place? Electrons can be excited into higher energy levels by two mechanisms. As we have seen, they can absorb photons of the correct energy. This process is particularly important in stellar atmospheres. Collisions between atoms can also excite electrons by passing energy from one atom to another. Both of these processes depend upon the temperature; the higher the temperature, the higher the mean energy of atoms and photons, and more photons or atoms are capable of exciting electrons. Therefore, we might expect the number of electrons in the n = 2 energy level of hydrogen to increase with increasing temperature.

¹ Note that in the table the term metal is used to denote any element heavier than helium. This is a standard convention in astronomy. It arises because hydrogen and helium are by far the most abundant elements in the Universe.

Figure 9.2: The Orion Nebula, as seen from the . More than three thousand stars appear in this image, with spectral types ranging from mid-O to early-M. This is despite all these stars being formed from the same cloud of gas and dust.

9.2.1 The Boltzmann equation

To understand this process in a quantitative way, we need to return to statistical physics, which we discussed earlier. Remember that, in thermal equilibrium, the probability of a particle having energy $E$ was given by

$$P(E) \propto e^{-E/kT}. \tag{9.1}$$

If we are comparing two energy levels (labelled 1 and 2), then the ratio of the probability $P_2$ that an electron is in level 2 to the probability $P_1$ that an electron is in level 1 is given by

$$\frac{P_2}{P_1} = \frac{e^{-E_2/kT}}{e^{-E_1/kT}} = e^{-(E_2 - E_1)/kT}. \tag{9.2}$$

Suppose $E_2$ is greater than $E_1$. Then, as the temperature tends towards zero, the quantity $-(E_2 - E_1)/kT$ tends towards $-\infty$, and $P_2/P_1$ tends towards zero. In this case, all the electrons would be in level 1. However, as the temperature increases, the proportion of electrons in energy level 2 also increases.

In many atoms, there may exist many quantum states available to an electron which have the same energy. These quantum states are said to be degenerate. We define $g_n$ to be the number of states with energy $E_n$. Then, the ratio of the probability that an electron will be found in any of the $g_2$ states with energy $E_2$, to the probability that it will be found in any of the $g_1$ states with energy $E_1$, is given by

$$\frac{P(E_2)}{P(E_1)} = \frac{g_2}{g_1} e^{-(E_2 - E_1)/kT}. \tag{9.3}$$

Since astronomical objects contain very large numbers of atoms, the number of atoms $N(E_2)$ with energy $E_2$ is, to an excellent approximation, proportional to the probability that an atom has energy $E_2$. Thus, the ratio of the numbers of atoms in one energy level to another is given by the Boltzmann equation

$$\frac{N(E_2)}{N(E_1)} = \frac{g_2}{g_1} e^{-(E_2 - E_1)/kT}. \tag{9.4}$$

Let’s look at a concrete example of the Boltzmann equation, and work out the relative populations of the n = 2 and n = 1 energy levels in hydrogen. Recall from last week that the energy of an electron orbit with quantum number $n$ was given by

$$E_n = -\frac{W}{n^2},$$

where $W$ is the ionisation energy of the atom (13.6 eV for hydrogen). We also need to know the degeneracy of the n = 2 and n = 1 levels. For this we need the full quantum mechanical theory, but I’ll simply state there are 2 quantum states with energy $E_1$ and 8 quantum states with energy $E_2$. Therefore the number of electrons in state n = 2, divided by the number of electrons in state n = 1, is given by

$$\frac{N(E_2)}{N(E_1)} = \frac{8}{2}\, e^{-[(-13.6\,\mathrm{eV}/2^2) - (-13.6\,\mathrm{eV}/1^2)]/kT},$$

or

$$\frac{N(E_2)}{N(E_1)} = 4\, e^{-10.2\,\mathrm{eV}/kT}.$$
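The formula above is easy to evaluate numerically. The following is a minimal sketch, assuming a rounded value of the Boltzmann constant in eV per kelvin; the function names are illustrative only.

```python
import math

K_B_EV = 8.617e-5   # Boltzmann constant, eV per kelvin (rounded)
W_EV = 13.6         # ionisation energy of hydrogen, eV


def boltzmann_ratio(temperature, g_upper=8, g_lower=2, n_upper=2, n_lower=1):
    """N(E_upper) / N(E_lower) for hydrogen from the Boltzmann equation."""
    e_upper = -W_EV / n_upper**2
    e_lower = -W_EV / n_lower**2
    exponent = -(e_upper - e_lower) / (K_B_EV * temperature)
    return (g_upper / g_lower) * math.exp(exponent)


def fraction_in_n2(temperature):
    """N2 / (N1 + N2), considering only the n = 1 and n = 2 levels."""
    ratio = boltzmann_ratio(temperature)
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    for t in (5000, 10000, 20000, 40000):
        print(f"T = {t:6d} K   N2/(N1+N2) = {fraction_in_n2(t):.3e}")
```

With these numbers the ratio is only a few parts in 100 000 near 10 000 K, and the two populations do not become equal until roughly 85 000 K.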

Figure 9.3: The number of electrons in n = 2 ($N_2$) divided by the total number of electrons $N_1 + N_2$ for hydrogen gas, as determined by the Boltzmann equation.

Figure 9.3 shows the number of electrons in energy level n = 2, divided by the total number of electrons, derived using the formula above. We can see that the number of electrons in n = 2 is a rapidly rising function of temperature. This provides us with something of a puzzle. Recall that the Balmer lines are produced by electrons in the n = 2 level absorbing photons. The Balmer lines reach their peak strength at spectral type A0, corresponding to temperatures of ∼ 9500 K. Clearly, according to the Boltzmann equation, at temperatures higher than 9500 K an even greater number of electrons will be excited to the n = 2 level. If this is the case, why do the Balmer lines decrease in strength towards the hotter O and B stars?

9.2.2 The Saha equation

The answer lies in considering the number of atoms in different states of ionisation. Consider the ionisation of a species $I$,

$$A^{I} + h\nu \rightleftharpoons A^{I+1} + e^{-},$$

where $h\nu > E_I$, the ionisation energy of the species $I$. Clearly, as the temperature increases, the number of photons with $h\nu > E_I$ will increase and the degree of ionisation of an element will increase correspondingly. This is why the Balmer lines decrease in strength above T ∼ 9500 K; it is due to the rapid ionisation of hydrogen above about 10 000 K. This process is illustrated in figure 9.4.

Figure 9.4: The electron’s position in the hydrogen atom at different temperatures. In (a), the electron is in the ground state. Balmer absorption lines can only be produced when the electron is excited to the n = 2 level, as shown in (b). In (c) the atom has been ionised, and no longer produces absorption lines.

Just as we did for electron excitation above, we can apply statistical physics to the process of ionisation to derive the Saha equation

$$\frac{N^{I+1}}{N^{I}} = \frac{2 Z^{I+1}}{N_e Z^{I}} \left( \frac{2\pi m_e kT}{h^2} \right)^{3/2} e^{-E_I/kT}. \tag{9.5}$$

Although the derivation of this equation is beyond the scope of this course, we can at least examine it to see if it makes intuitive sense. The ionisation ratio is proportional to $e^{-E_I/kT}$; we should now expect this from our familiarity with statistical physics. The electron density $N_e$ also enters the equation. This is not too surprising. Ionisation involves the creation of a free electron. The more free electrons that are present, the more likely an ionised atom is to capture an electron. Therefore the amount of ionisation should decrease as the electron density increases. This is what we see in the Saha equation. The term $Z^{I}$ also appears in the Saha equation. This is a quantity known as the partition function. It represents a weighted sum of the number of ways a species can arrange its electrons with the same energy, with more energetic (and hence less likely) configurations receiving less weight.

One very important point to keep in mind: all of these results in statistical physics assume thermal equilibrium. The Saha equation, like the Boltzmann equation, is only strictly valid for systems in thermal equilibrium.

If we combine the Saha and Boltzmann equations, we can calculate the number of electrons in the n = 2 level of hydrogen as a function of temperature. The results are shown in figure 9.5. The number of electrons in the n = 2 level peaks around 9900 K. This is in reasonable agreement with the temperature of A0 stars (around 9500 K), where the Balmer line strength peaks.
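The combination of the two equations can be sketched in a few lines. Several inputs here are assumptions that the notes do not state: a constant electron pressure of 20 N m⁻² (so that N_e = P_e/kT), partition functions of 2 for neutral hydrogen and 1 for ionised hydrogen, and the neglect of all levels above n = 2. Different choices would shift the peak.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J s
M_E = 9.1093837e-31    # electron mass, kg
EV = 1.602176634e-19   # joules per electronvolt

CHI_H = 13.6 * EV        # ionisation energy of hydrogen
DELTA_E_12 = 10.2 * EV   # E2 - E1 for hydrogen
P_E = 20.0               # ASSUMED electron pressure, N/m^2 (not from the notes)


def saha_ratio(temperature, z_ion=1.0, z_neutral=2.0):
    """N_II / N_I for hydrogen from equation (9.5), with N_e = P_e / kT."""
    kt = K_B * temperature
    n_e = P_E / kt
    thermal = (2.0 * math.pi * M_E * kt / H**2) ** 1.5
    return (2.0 * z_ion / (n_e * z_neutral)) * thermal * math.exp(-CHI_H / kt)


def n2_fraction(temperature):
    """Atoms with the electron in n = 2, as a fraction of all hydrogen."""
    kt = K_B * temperature
    boltz = 4.0 * math.exp(-DELTA_E_12 / kt)          # N2 / N1
    excited = boltz / (1.0 + boltz)                   # N2 / (N1 + N2)
    neutral = 1.0 / (1.0 + saha_ratio(temperature))   # N_I / (N_I + N_II)
    return excited * neutral


def peak_temperature(t_min=5000, t_max=25000, step=10):
    """Temperature (K) on a coarse grid where n2_fraction is largest."""
    return max(range(t_min, t_max + 1, step), key=n2_fraction)


if __name__ == "__main__":
    print("Peak of the n = 2 population near", peak_temperature(), "K")
```

Under these assumptions the peak falls close to 10 000 K. The excitation factor keeps rising with temperature, but the neutral fraction collapses once ionisation sets in, and the product of the two turns over.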

9.3 A physical interpretation of the Harvard sequence

Finally, we are in a position to understand the Harvard spectral sequence as a sequence in temperature. The line strengths of species vary along the sequence due to the interplay of electron excitation and ionisation.

• Balmer lines of hydrogen - at low temperatures (spectral types K to A) the excitation effect dominates and line strength rises as the population of the n = 2 energy level rises. However, at high temperatures (spectral types A to O) the ionisation of hydrogen increases and the Balmer line strength drops.

Figure 9.5: The number of electrons in the n = 2 level of hydrogen, divided by the total number of hydrogen atoms. This calculation takes account of electron excitation (the Boltzmann equation) and ionisation (the Saha equation). The peak occurs at approximately 9900 K, in good agreement with the temperature of early-A stars, where the Balmer line strength peaks.

• Metal lines - at low temperatures (spectral types M to G), lines from neutral metals dominate, but neutral metal line strengths (e.g. Ca I, Fe I) decrease from K to G as the gas becomes ionised. From spectral types G to A the lines from singly ionised metals (e.g. Mg II, Si II) become more prominent as the temperature rises and ionisation increases, but eventually the gas becomes even more highly ionised and these lines also decrease in strength between spectral types A and B. Some metals (e.g. Fe and Ca) have quite low ionisation energies, and Ca II lines are strong between G and M-type stars.

• Molecular bands - the M stars are dominated by molecular bands, especially those from titanium oxide (TiO) and vanadium oxide (VO). These bands generally become weaker as the temperature increases because the molecules are dissociated to form atoms of Ti, V and O. Like excitation and ionisation, dissociation can occur due to collisions or the absorption of photons.

9.3.1 Stellar temperatures re-visited

The Saha and Boltzmann equations give us two more ways of measuring the temperature of the stellar photosphere. Remember, the photospheric temperature can be derived from the peak of the continuum spectrum (the Wien temperature), or from measurements of the flux at two wavelengths (the colour temperature), or from the bolometric luminosity and distance (the effective temperature). Measurements of line strengths add two further estimates. The excitation temperature is measured from the Boltzmann equation, after using the relative line strengths to measure the population of electrons in different excited states. The ionisation temperature is measured from the relative populations in different ionisation stages (for example He I and He II), using the Saha equation. Table 9.2 shows the temperature of the Sun’s photosphere, measured using some of these techniques. The temperature measurements do not agree with each other, which by now should come as no surprise to you. All of these temperature estimates are only approximate, because they all assume the Sun’s photosphere is in perfect thermal equilibrium, which is not true!
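Inverting the Boltzmann equation (9.4) shows how an excitation temperature follows from a population ratio. This is only a sketch: it assumes the populations $N(E_1)$ and $N(E_2)$ have already been inferred from the measured line strengths, which is the hard part in practice.

```latex
\frac{N(E_2)}{N(E_1)} = \frac{g_2}{g_1}\, e^{-(E_2 - E_1)/kT}
\quad\Longrightarrow\quad
T_{\mathrm{exc}} = \frac{E_2 - E_1}{k \,\ln\!\left[ \dfrac{g_2\, N(E_1)}{g_1\, N(E_2)} \right]} .
```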

Table 9.2: Different temperature estimates of the Sun’s photosphere

Method | Temperature
Colour temperature | 5640 K
Wien temperature | 6200 K
Effective temperature | 5778 K
Excitation temperature | 5600 K
Ionisation temperature | 6200 K

Table 9.3: Solar abundances by mass

Element | Abundance
H | 73.4%
He | 24.9%
C | 0.29%
N | 0.10%
O | 0.77%
Fe | 0.16%

9.4 Abundances from line strengths

Once the line strength variations due to temperature have been accounted for, it turns out we can see line strength variations caused by differences in the abundances of elements in the stellar photosphere. It was found that differences in the abundances of main sequence stars of the same population were very small. By far the most abundant element is hydrogen; the abundances of elements in the Sun are shown in table 9.3. However, there are abundance variations between stars. Stars in the haloes of galaxies (Population II stars) have lower metal abundances than stars in the disk (Population I stars). This observation allowed astronomers to realise that the Population II stars are an older generation than the Population I stars.

Chapter 10

Gravitational Astrophysics

Throughout this course, I hope you have been struck by how astronomy is a science of remote sensing; using our understanding of physics we can interpret the observations we make of the night sky, and deduce from them facts about the objects of our study. In no area of astrophysics is this more apparent than when we use Newton’s law of gravity to understand the motions of objects in gravitationally bound systems (for example, binary stars, galaxies, planetary systems). The application of gravity leads to some of the most subtle and elegant measurement techniques in astrophysics. We will spend the remainder of the course studying these techniques, so we had better have a firm grasp of gravity itself.

10.1 History

At the end of the 16th Century, the Danish nobleman Tycho Brahe was busy developing a huge collection of observations of the Solar system. As the official astronomer to the Holy Roman Empire, Brahe had the best observatory in the world, and was credited with taking the most accurate astronomical observations of the time. As well as being famous for the painstaking accuracy of his work, Brahe is also famous for his nose. Having lost part of his nose in a duel, Brahe replaced it with a prosthetic nose made of copper.

Brahe’s observations were put to good use by Johannes Kepler, a German mathematician and astronomer who was, for a time, Brahe’s assistant. Kepler wanted to deduce the rules which governed the Solar system; rules which he believed were created by God. He used Brahe’s observations to tease out 3 laws which all bodies in the Solar system obey. Kepler’s laws were not a physical theory; there was no framework in place to understand them. Instead they were a tour-de-force of empirical deduction. Kepler’s laws are still used by astronomers today, and played a crucial role in the development of a theory of gravity.

10.2 Kepler’s Laws

10.2.1 The 1st Law

“The orbit of every planet is an ellipse with the Sun at a focus”

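[Figure 10.1 labels: semi-major axis a, semi-minor axis b, focus at distance ae from the centre, radius r and angle θ measured from the focus.]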

Figure 10.1: Planetary orbits: an ellipse with the Sun at a focus.

Derived from Brahe’s observations of the orbit of Mars, this observation is a very useful result in orbital theory. Although we won’t use them much in this course, where we will mostly consider circular orbits, a few properties of ellipses are summarised here. An ellipse has semi-major axis $a$, semi-minor axis $b$ and eccentricity $e$. An ellipse has two foci; each focus is a distance $ae$ from the centre. A circle is a special case of an ellipse with $e = 0$; the foci of a circle coincide at its centre. The semi-minor axis $b$, the semi-major axis $a$ and the eccentricity $e$ are related by

$$e^2 = 1 - \left( \frac{b}{a} \right)^2.$$

Although the equation of an ellipse can be written in Cartesian co-ordinates (x, y), it is more useful to use polar co-ordinates with a focus at the origin (as in figure 10.1). In this case, the equation of an ellipse is given by

$$r = \frac{a(1 - e^2)}{1 \pm e \cos\theta}.$$
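These two relations are simple to evaluate. The sketch below picks the + sign in the denominator (an arbitrary choice made here, so that θ = 0 is the point of closest approach to the focus); the function names are illustrative.

```python
import math


def eccentricity(a, b):
    """Eccentricity from semi-major axis a and semi-minor axis b."""
    return math.sqrt(1.0 - (b / a) ** 2)


def radius(a, e, theta):
    """Distance from the focus at angle theta (radians).

    Uses the '+' sign, so theta = 0 is closest approach.
    """
    return a * (1.0 - e**2) / (1.0 + e * math.cos(theta))


if __name__ == "__main__":
    a, b = 5.0, 3.0
    e = eccentricity(a, b)
    print("e =", e)
    print("closest approach  =", radius(a, e, 0.0))       # a (1 - e)
    print("furthest distance =", radius(a, e, math.pi))   # a (1 + e)
```

The closest and furthest distances come out as a(1 − e) and a(1 + e), and for e = 0 the radius equals a at every angle, recovering a circle.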

10.2.2 The 2nd Law

“For any planet the radius vector sweeps out equal areas in equal times”

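[Figure 10.2 labels: orbit positions P1, P2, P3 and P4, with swept areas A1 (between P1 and P2) and A2 (between P3 and P4).]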

Figure 10.2: Kepler’s 2nd Law.

Kepler’s 2nd law is illustrated in figure 10.2. A1 is the area swept out as the planet moves from P1 to P2. A2 is the area swept out as the planet moves from P3 to P4. If the time taken to go from P1 to P2 equals the time taken to go from P3 to P4, then A1 equals A2. Just from looking at figure 10.2, you should be able to see that this means a planet will move faster when it is closer to the Sun.
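To make this quantitative, compare the points of closest and furthest approach, where the velocity is perpendicular to the radius vector. In a short time $\Delta t$ the area swept out is approximately a thin triangle of area $\tfrac{1}{2} r v \Delta t$. As a sketch, writing $r_{\min} = a(1-e)$ and $r_{\max} = a(1+e)$ for the two distances, equal areas in equal times gives

```latex
\tfrac{1}{2}\, r_{\min}\, v_{\max}\, \Delta t = \tfrac{1}{2}\, r_{\max}\, v_{\min}\, \Delta t
\quad\Longrightarrow\quad
\frac{v_{\max}}{v_{\min}} = \frac{r_{\max}}{r_{\min}} = \frac{1 + e}{1 - e} .
```

For a circular orbit ($e = 0$) the ratio is 1 and the speed is constant.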

10.2.3 The 3rd Law

“The cubes of the semi-major axes of the planetary orbits are proportional to the squares of the planetary periods”

Kepler’s 3rd law is a bit of a mouthful, but is more succinctly expressed in equation form,

$$a^3 \propto P^2. \tag{10.1}$$

It turns out Kepler’s third law is incredibly useful, and we will use it again and again in this section of the course. Because of that, we really need to work out the constant of proportionality in equation (10.1) above. To do so, we need a full theory of gravity.

10.3 Newton’s law of gravity

On the 5th July 1687, Isaac Newton published his “Philosophiæ Naturalis Principia Mathematica”. In it he set himself the incredible task of writing down the laws which governed the behaviour of everything in the Universe, from the smallest mote of dust to the planets themselves. It took him just three sentences:

• A body continues in a state of rest or uniform motion in a straight line unless compelled by some external force to act otherwise;

• The net force on an object is equal to the mass of the object multiplied by its acceleration;

• When a first body exerts a force on a second body, the second body exerts a force on the first body which is equal in magnitude and opposite in direction.

In the Principia, Newton also produced a derivation of Kepler’s laws from first principles. Since the planets are not moving in a straight line, some force must act upon them. Newton was able to show that his force of gravity reproduced Kepler’s laws in full. Newton’s gravitational force was, of course,

$$F = \frac{G m_1 m_2}{r^2}, \tag{10.2}$$

where $m_1$ and $m_2$ are the masses of the two attracting bodies, and $r$ is the distance between them. $G = 6.673 \times 10^{-11}$ m³ kg⁻¹ s⁻² is the gravitational constant.

It’s worth mentioning here that gravity is a tremendously weak force. The electrostatic repulsion between two protons is $e^2/4\pi\epsilon_0 r^2$, whilst the gravitational attraction between them is $G m_p^2/r^2$. The ratio of these two quantities is $e^2/4\pi\epsilon_0 G m_p^2$. This expression is independent of radius, so the relative strength of the two forces is the same throughout all space. The value of this expression is about $10^{36}$! The electrostatic force is $10^{36}$ times stronger everywhere than the force of gravity, and yet it is gravity, not electromagnetism, that controls the motions of the stars and planets. This is because matter is mostly neutral. Large amounts of matter have negligible net charge, but very large masses, allowing gravity to become the dominant force.

An understanding of gravity is an incredibly versatile tool for an astrophysicist because it can be used to measure a basic property that, so far, we have no way of measuring: mass. By manipulating the laws of gravity we can make observations that allow us to measure the mass of astronomical objects as small as tiny satellites of Jupiter and as large as giant clusters of galaxies. To do so, however, we need to spend a little bit of time developing some tools to use later in the course.
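The size of this ratio can be checked directly. The sketch below uses standard values for the proton charge, proton mass and permittivity of free space (assumed here, since the notes quote only G); the function name is illustrative.

```python
import math

E_CHARGE = 1.602176634e-19     # proton charge, C
EPSILON_0 = 8.8541878128e-12   # permittivity of free space, F/m
G = 6.673e-11                  # gravitational constant, as quoted in the text
M_PROTON = 1.67262192e-27      # proton mass, kg


def force_ratio():
    """Electrostatic repulsion / gravitational attraction for two protons.

    The separation r cancels, so the result holds at any distance.
    """
    return E_CHARGE**2 / (4.0 * math.pi * EPSILON_0 * G * M_PROTON**2)


if __name__ == "__main__":
    print(f"F_electric / F_gravity = {force_ratio():.2e}")
```

The result is a little over 10³⁶, in line with the figure quoted above.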

10.4 Some results on gravity

We’re going to derive some very useful results which follow from Newton’s law of gravity. We will need these results later on in the course. But before we do, let’s return to Kepler’s 3rd law and show that it can be derived from Newton’s law of gravity (finding the constant of proportionality in the process).

10.4.1 Kepler’s 3rd law revisited

As in the rest of this course, we will consider circular orbits only. A full treatment of elliptical orbits is possible, but only serves to complicate the mathematics; circular orbits capture all of the essential physics. In the Solar system, Kepler stated that the planets orbit around the Sun. This is a good approximation because the Sun is much more massive than the planets. In the general case of two bodies orbiting under their mutual gravitational attraction, both bodies perform circular orbits around the centre of mass. This situation is shown in figure 10.3.

We can derive Kepler’s third law by equating force and mass × acceleration. The acceleration of an object in a circular orbit is $v^2/r$, so for star b (using the notation from figure 10.3),

$$\frac{G M_a M_b}{a^2} = \frac{M_b v_b^2}{r_b}.$$

Similarly, for star a,

$$\frac{G M_a M_b}{a^2} = \frac{M_a v_a^2}{r_a}.$$

Dividing the first equation by $M_b$ and the second by $M_a$, then adding the two, we find

$$\frac{G (M_a + M_b)}{a^2} = \frac{v_a^2}{r_a} + \frac{v_b^2}{r_b}.$$

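[Figure 10.3 labels: stars of mass Ma and Mb on either side of the centre of mass (marked X), at distances ra and rb from it, moving with speeds va and vb; a = ra + rb.]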

Figure 10.3: Circular orbits around the centre of mass (COM). The centre of mass is marked with a cross, whilst the circular orbit of star b around the COM is shown as a dotted line.

Now, we use one of the neat mathematical tricks that crop up throughout gravitational astrophysics. The distance round a circular orbit of radius $r$ is $2\pi r$, and the time taken to go round the orbit is $P$. Therefore we can write

$$v = \frac{2\pi r}{P},$$

and substitute this into our equation above to get

$$\frac{G (M_a + M_b)}{a^2} = \frac{4\pi^2 r_a}{P^2} + \frac{4\pi^2 r_b}{P^2} = \frac{4\pi^2}{P^2} (r_a + r_b) = \frac{4\pi^2}{P^2}\, a.$$

Re-arranging, we find

$$\frac{G P^2 (M_a + M_b)}{4\pi^2} = a^3, \tag{10.3}$$

which you can see is Kepler’s third law, $P^2 \propto a^3$. The form of Kepler’s third law given in equation (10.3) crops up again and again. It is worth spending some time memorising it.
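Equation (10.3) can be tried out on the Earth-Sun system. The sketch below assumes standard approximate values for the solar mass, the Earth's mass and the Earth-Sun distance (none of which are quoted in the notes); the function names are illustrative.

```python
import math

G = 6.673e-11        # m^3 kg^-1 s^-2, as quoted earlier in the notes
M_SUN = 1.989e30     # kg (assumed value)
M_EARTH = 5.972e24   # kg (assumed value)
AU = 1.496e11        # m, Earth-Sun separation (assumed value)


def orbital_period(a, m_a, m_b):
    """Period (s) from equation (10.3): G P^2 (Ma + Mb) / (4 pi^2) = a^3."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (m_a + m_b)))


def total_mass(a, period):
    """Equation (10.3) rearranged for the total mass Ma + Mb (kg)."""
    return 4.0 * math.pi**2 * a**3 / (G * period**2)


if __name__ == "__main__":
    p = orbital_period(AU, M_SUN, M_EARTH)
    print(f"Period = {p / 86400.0:.1f} days")
    print(f"Total mass recovered = {total_mass(AU, p):.3e} kg")
```

The period comes out at about one year, as it should. The second function is the form used most often in practice: measure a and P, and the total mass of the system follows.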

10.4.2 Gravitational potential energy

It is useful to define the gravitational potential energy: the work done by gravity as two bodies are moved from an initial separation $r$ to an infinite distance. We start by asking how much work is done when they are moved apart by a small distance $dr$. Since work = force × distance,

$$\mathrm{Work} = -\frac{G m_1 m_2}{r^2}\, dr.$$

Why the minus sign? Because the force and $dr$ point in opposite directions. If we choose our radius axis so that $dr$ is positive, then the gravitational force is $-G m_1 m_2/r^2$. The total work in moving the bodies to infinity is the sum of all the little steps along the way,

$$\mathrm{Grav.\ P.E.} = -\int_r^\infty \frac{G m_1 m_2}{r^2}\, dr = \left[ \frac{G m_1 m_2}{r} \right]_r^\infty = -\frac{G m_1 m_2}{r}. \tag{10.4}$$

This quantity is negative, as expected, because we have to put energy in to separate the two bodies. The gravitational potential energy leads to the concept of escape velocity. One object will escape the gravitational field of another if its kinetic energy is larger than the depth of the gravitational potential well,

$$\frac{1}{2} m_2 v^2 > \frac{G m_1 m_2}{r},$$

$$v_{\mathrm{esc}} > \left( \frac{2 G m_1}{r} \right)^{1/2}. \tag{10.5}$$
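As an illustration of equations (10.4) and (10.5), the sketch below evaluates the escape velocity from the surface of the Earth. The Earth's mass and radius are assumed standard values, and the function names are illustrative.

```python
import math

G = 6.673e-11        # m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # kg (assumed value)
R_EARTH = 6.371e6    # m (assumed value)


def potential_energy(m1, m2, r):
    """Gravitational potential energy (J), equation (10.4)."""
    return -G * m1 * m2 / r


def escape_velocity(m1, r):
    """Minimum speed (m/s) to escape from distance r of mass m1, eq. (10.5)."""
    return math.sqrt(2.0 * G * m1 / r)


if __name__ == "__main__":
    v = escape_velocity(M_EARTH, R_EARTH)
    print(f"Escape velocity from Earth's surface: {v / 1000.0:.1f} km/s")
```

The answer is a little over 11 km/s. At exactly this speed the kinetic energy of a body equals the magnitude of its potential energy, so its total energy is zero.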

10.4.3 Gravitational theorem #1

“A body inside a spherical shell of matter experiences no net gravitational force from that shell”

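[Figure 10.4 labels: point P inside the shell; two cones of solid angle Ω from P meet the shell in areas A1 and A2, at distances R1 and R2 from P.]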

Figure 10.4: A point P inside a hollow, thin shell experiences no net gravitational force.

This turns out to be straightforward to derive. We imagine the shell to be thin, with a surface density of $\rho$ kg per unit area. We pick a point $P$ inside the shell and draw two cones of the same solid angle radiating out from $P$ in opposite directions, so that they include two small areas of the shell on opposite sides: these two areas will exert gravitational attraction on a mass at $P$ in opposite directions. We will show that these forces exactly cancel out. The situation is shown in figure 10.4.

Since the cones have the same solid angle $\Omega$, and the area of the base of a cone of solid angle $\Omega$ is $A = \Omega r^2$, we see that the ratio of the areas $A_1$ and $A_2$ at distances $r_1$ and $r_2$ (labelled $R_1$ and $R_2$ in figure 10.4) is given by $A_1/A_2 = r_1^2/r_2^2$. Since the masses of the bits of the shell are proportional to the areas, the ratio of the masses of the shell sections is also $r_1^2/r_2^2$. It follows that the ratio of the gravitational forces from the two bits of shell is

$$\frac{F_1}{F_2} = \frac{M_1}{M_2} \frac{r_2^2}{r_1^2} = \frac{r_1^2}{r_2^2} \frac{r_2^2}{r_1^2} = 1. \tag{10.6}$$

So the forces on a particle at $P$ due to these sections of shell are the same size and in opposite directions; they cancel exactly. In fact, the gravitational pull from every small part of the shell is balanced by a part on the opposite side; you just have to construct a lot of cones going through $P$ to see this. So the net force on a particle inside the shell is zero.

What if the shell is not thin? A particle inside a spherical cavity in a dust cloud is such a situation. We can consider it as an infinite number of thin shells nested inside each other. The force from each shell is zero, so the net force inside a cavity like this is also zero. This result will be tremendously important when we look at measuring mass in galaxies.

10.4.4 Gravitational theorem #2

“The gravitational force on a body that lies outside a closed spherical shell of matter is the same as it would be if all the shell’s matter were concentrated into a point at its centre.”

We’ve already assumed that this theorem is true, when we derived Kepler’s 3rd law; we implicitly assumed we could treat the stars as point masses, even though they are spheres of finite size. There is a beautiful and elegant proof of this theorem, which can be written in about three lines, using a different way of writing the law of gravity known as Gauss’s theorem. Unfortunately for you, the maths used is (I believe) more advanced than you have learned to date. By contrast, Newton’s derivation took him several pages to write and years to figure out! Therefore we will take this theorem to be true without proof; the curious will find complete derivations using Newton’s method at http://galileo.phys.virginia.edu/classes/152.mf1i.spring02/GravField.htm or http://en.wikipedia.org/wiki/Shell_theorem. Gauss’s theorem is really quite beautiful mathematics, and allows you to easily work out the gravity from complex objects - you can find a good basic introduction at http://www.pgccphy.net/1030/gravity.pdf
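Although we do not prove theorem #2 here, both shell theorems can be checked numerically. The sketch below is not Newton's or Gauss's derivation: it simply slices a thin shell into rings and adds up the pull of each ring on a test mass. The function name, the default unit values and the number of rings are arbitrary choices made for illustration.

```python
import math

G = 6.673e-11   # m^3 kg^-1 s^-2


def shell_force(d, shell_radius=1.0, shell_mass=1.0, test_mass=1.0,
                n_rings=20000):
    """Net force on test_mass at distance d from the centre of a thin shell.

    The shell is cut into n_rings thin rings about the axis through the
    test mass; by symmetry only the axial component survives. A negative
    result points towards the centre of the shell.
    """
    d_theta = math.pi / n_rings
    total = 0.0
    for i in range(n_rings):
        theta = (i + 0.5) * d_theta                     # midpoint rule
        ring_mass = 0.5 * shell_mass * math.sin(theta) * d_theta
        axial = shell_radius * math.cos(theta) - d      # axial offset of ring
        dist = math.sqrt(shell_radius**2 + d**2
                         - 2.0 * shell_radius * d * math.cos(theta))
        total += G * test_mass * ring_mass * axial / dist**3
    return total


if __name__ == "__main__":
    print("inside  (d = 0.5 R):", shell_force(0.5))
    print("outside (d = 2 R):  ", shell_force(2.0))
    print("point mass at 2 R:  ", -G * 1.0 * 1.0 / 2.0**2)
```

Inside the shell the sum is zero to within numerical error, and outside it matches the force from a point mass at the centre.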

Chapter 11

Measuring mass

Having dealt with the theory of gravity, let’s start to use it to measure mass. Mass is probably the most fundamental and important measurement we can make for an object (for example, a star’s brightness, lifetime and evolution are mainly determined by its mass). In almost all cases, measuring the mass of an object involves measuring the effect of its gravity on nearby objects. We will start with an example which is close to home: the masses of planets in our solar system.

11.1 Planets in the solar system

If a planet has a moon orbiting it, we can use Kepler’s 3rd law to calculate the mass of the planet, relative to the mass of the Sun. Figure 11.1 shows the geometry. A planet of mass $m$ orbits the Sun (mass $M$) with a semi-major axis of $a$. The planet also has a moon, with a mass $m_1$, which orbits the planet with a semi-major axis $a_1$. Using Kepler’s 3rd law as written in equation (10.3), we find the following equation for the period of the planet’s orbit around the Sun

$$P = \frac{2\pi}{\sqrt{G}} \left( \frac{a^3}{m + M} \right)^{1/2},$$

and a corresponding equation for the period of the moon’s orbit around the planet

$$P_1 = \frac{2\pi}{\sqrt{G}} \left( \frac{a_1^3}{m + m_1} \right)^{1/2}.$$

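[Figure 11.1 labels: moon of mass m1 at distance a1 from the planet of mass m, which is at distance a from the Sun of mass M.]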

Figure 11.1: Geometry of a planet with a moon, orbiting the Sun (not to scale!)

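We divide the two equations to get

$$\left( \frac{P}{P_1} \right)^2 = \left( \frac{a}{a_1} \right)^3 \frac{m + m_1}{m + M} = \left( \frac{a}{a_1} \right)^3 \left( \frac{m}{M} \right) \frac{1 + m_1/m}{1 + m/M}.$$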

But here we can use a (very good) approximation. Since the mass of the moon is much less than the mass of the planet (the Moon is around 1% the mass of the Earth), we have $1 + m_1/m \approx 1$, and using the same argument for the planet and Sun, $1 + m/M \approx 1$. Re-arranging, we find

$$\frac{m}{M} = \left( \frac{P}{P_1} \right)^2 \left( \frac{a_1}{a} \right)^3.$$

Thus the relative mass of any planet with a moon can be found once the periods and semi-major axes of the moon and planet are known. The periods are easy to measure by tracking the motion of planets in the night sky, and the same data can yield distances using the parallax method.
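The Earth and Moon make a convenient test of this result. The sketch below uses approximate periods and semi-major axes (assumed standard values, not taken from the notes); the function name is illustrative.

```python
def planet_to_sun_mass_ratio(p_planet, a_planet, p_moon, a_moon):
    """m / M for a planet with a moon, neglecting m1/m and m/M.

    Both periods must share one unit, and so must both semi-major axes.
    """
    return (p_planet / p_moon) ** 2 * (a_moon / a_planet) ** 3


# Approximate values for the Earth and Moon (assumed, not from the notes).
EARTH_PERIOD_DAYS = 365.26
MOON_PERIOD_DAYS = 27.32
EARTH_ORBIT_M = 1.496e11
MOON_ORBIT_M = 3.844e8

ratio = planet_to_sun_mass_ratio(
    EARTH_PERIOD_DAYS, EARTH_ORBIT_M, MOON_PERIOD_DAYS, MOON_ORBIT_M
)

if __name__ == "__main__":
    print(f"m_Earth / M_Sun = {ratio:.3e}")
    print(f"M_Sun / m_Earth = {1.0 / ratio:.0f}")
```

The ratio comes out at about 3 × 10⁻⁶, so the Sun is roughly 330 000 times as massive as the Earth. The result is about one per cent too high, which is the size of the neglected term m1/m.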

11.1.1 Absolute planetary masses

The equation above yields the masses of the planets in the Solar system, relative to the mass of the Sun, $M$. If we could measure the absolute mass $m_p$ of a single planet, we could find the mass of the Sun from $M = m_p / (m/M)$, where $m/M$ is that planet’s relative mass from the equation above. In a rare case of astronomy progressing by experiment, the mass of the Earth was measured in a beautiful experiment by Henry Cavendish in 1798 (more than 100 years after the Principia was published).

Figure 11.2: A sketch of Cavendish’s experiment to measure the mass of the Earth

Cavendish’s experiment is shown (in sketch form) in figure 11.2. He attached a bar holding small masses to a torsion wire and placed two much larger masses close by. The large masses were equidistant from the smaller, test masses - a distance r. By considering the forces on the small masses due to the large masses, and the Earth, Cavendish could calculate the mass of the Earth. The force on the test masses due to the large masses is

$$F_p = \frac{G m M_p}{r^2}, \tag{11.1}$$

and the force on the test masses due to the Earth is

$$F_E = \frac{G m M_E}{r_E^2}. \tag{11.2}$$

Dividing equation (11.2) by equation (11.1) gives

$$\frac{F_E}{F_p} = \frac{r^2}{r_E^2} \frac{M_E}{M_p}$$

or

$$M_E = \frac{F_E}{F_p} \left(\frac{r_E}{r}\right)^2 M_p. \tag{11.3}$$

All of the quantities on the right hand side of equation (11.3) can be measured. $F_p$ is measured from the twist of the torsion wire. $F_E = ma$ can be found by measuring the gravitational acceleration of the test masses when dropped¹. The radius of the Earth is easily measured using geometrical techniques. In this way, Cavendish determined the mass of the Earth, and hence the absolute mass of all planets.

Cavendish’s equipment was remarkably sensitive. The force involved in twisting the torsion balance was very small, roughly equivalent to the weight of a large grain of sand. To prevent air currents and temperature changes from interfering with the measurements, Cavendish placed the entire apparatus in a wooden box about 0.6 m thick, 3 m tall, and 3 m wide, all in a closed shed on his estate. Through two holes in the walls of the shed, Cavendish used telescopes to observe the movement of the torsion balance’s horizontal rod. The motion of the rod was only about 4 mm, and Cavendish had to account for the constant swaying of the rod, which was never still. Cavendish’s experiment was repeated many times, but his accuracy wasn’t bettered for nearly 100 years.
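To see how equation (11.3) turns laboratory measurements into a mass for the Earth, here is a minimal sketch. All of the input numbers are illustrative assumptions chosen to be of the right order of magnitude for a Cavendish-style apparatus; they are not Cavendish's actual data, and the function name is hypothetical.

```python
def earth_mass(F_E, F_p, r_E, r, M_p):
    """Equation (11.3): M_E = (F_E / F_p) * (r_E / r)**2 * M_p, in SI units."""
    return (F_E / F_p) * (r_E / r) ** 2 * M_p

# Illustrative (assumed) values, not historical measurements
m_test = 0.73      # kg, small test mass
g = 9.81           # m/s^2, measured acceleration of a dropped mass
F_E = m_test * g   # N, force on the test mass due to the Earth
F_p = 1.5e-7       # N, force from the large mass, inferred from the wire's twist
r = 0.225          # m, separation between the large mass and the test mass
r_E = 6.371e6      # m, radius of the Earth
M_p = 158.0        # kg, large mass

M_E = earth_mass(F_E, F_p, r_E, r, M_p)
print(f"Estimated mass of the Earth: {M_E:.2e} kg")
```

With these inputs the estimate comes out near 6 × 10²⁴ kg. The tiny value assumed for F_p shows why the torsion balance had to be so sensitive.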

11.2 Stellar masses

It is phenomenal to consider that we can measure the mass of stars so impossibly distant that the light we see from them is hundreds of years old. All of the direct measurements of stellar masses we have come from stars in multiple systems; a collection of stars bound together by their own gravity. The most important of these are the binary stars - two stars which orbit each other around a common centre of mass. Binary stars are surprisingly

¹ The distance $s$ travelled in a time $t$, under constant acceleration $a$, is $s = at^2/2$.

common: over half of the stars visible to the naked eye are actually in binary systems. Binary systems are classified according to their observational characteristics. Different types of binary systems provide different ways of measuring mass, and some types of binary system allow a rich set of data to be collected. In the sections that follow, we will consider some types of binary star in turn.

11.3 Visual Binaries


Figure 11.3: Circular orbits around the centre of mass (COM). The centre of mass is marked with a cross, whilst the circular orbit of star b around the COM is shown as a dotted line.

Remember that the resolving power of a telescope is not infinite; in practice an image taken from a ground-based telescope has an image quality dictated by the atmosphere. This is called seeing, and means that the typical size of a stellar disc in a ground-based image is around one arcsecond². Therefore, if the two stars in a binary are very close together, so that their separation on the sky is less than an arcsecond, the light from the stars will be blurred together. We will not see the stars as a binary system at all; such a binary system is unresolved.

² Space-based telescopes can do rather better, being above the atmosphere and hence limited by their optics.

Binaries in which we can clearly see both components are called visual binaries. For a star to be a visual binary the components must be widely separated, and both components must be bright enough to detect. Visual binaries are very useful for measuring masses simply, with a minimum of observations. In some visual binaries, we are fortunate that the orbital period is short enough that we can actually watch the stars move around their orbits. By patiently watching visual binaries, we can measure the size of the orbits.

Figure 11.3 shows the geometry of a binary star system with circular orbits. From the definition of the centre of mass, we know that the sizes of the stars’ orbits, $r_a$ and $r_b$, are related through

$$m_a r_a = m_b r_b. \tag{11.4}$$

Therefore the mass ratio ($m_a/m_b$) is given by

$$\frac{m_a}{m_b} = \frac{r_b}{r_a}.$$

From our images, we can measure the angular sizes of the orbits $\alpha_a$, $\alpha_b$. Since these angles are small, they are related to the sizes of the orbits by $r_a = d\alpha_a$ and $r_b = d\alpha_b$, where $d$ is the distance to the binary. As a result, the mass ratio is given by

$$\frac{m_a}{m_b} = \frac{\alpha_b}{\alpha_a},$$

and can be found by a simple measurement of the angular sizes of the orbits, without knowing the distance to the stars. If the distance is known (e.g. from a measured parallax), we can calculate the physical sizes of the orbits, $r_a$ and $r_b$. It follows that we can also measure the binary separation, $a = r_a + r_b$, and use this in Kepler’s third law to find the total mass of the binary, since

$$m_a + m_b = \frac{4\pi^2 a^3}{G P^2},$$

and $P$ is also directly measured from the stars’ orbits. Once we know the mass ratio, and the total mass, we can find the individual masses through some simple algebra. You should be able to convince yourself that

$$m_b = \frac{m_a + m_b}{1 + m_a/m_b}.$$

Since all the terms on the right hand side of this equation can be measured, it follows we can measure individual masses of stars within a visual binary system, armed with nothing more than images of the stars as they orbit the centre of mass, and a distance to the binary!
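The chain of steps above (angular sizes, distance, Kepler's third law, then the mass ratio) can be sketched in a few lines of code. The sketch assumes circular, face-on orbits, and uses the convenient form of Kepler's third law in solar units, m_a + m_b = a³/P² with a in AU, P in years and masses in solar masses. It also uses the fact that an angle of 1 arcsecond at 1 parsec spans 1 AU. The input values describe a hypothetical binary, not a real system.

```python
def visual_binary_masses(alpha_a, alpha_b, d_pc, period_yr):
    """Individual masses (in solar masses) of a face-on visual binary.

    alpha_a, alpha_b : angular orbit sizes in arcseconds
    d_pc             : distance in parsecs
    period_yr        : orbital period in years
    """
    r_a = alpha_a * d_pc               # orbit size of star a, in AU
    r_b = alpha_b * d_pc               # orbit size of star b, in AU
    a = r_a + r_b                      # binary separation, in AU
    total = a ** 3 / period_yr ** 2    # Kepler's third law in solar units
    ratio = alpha_b / alpha_a          # m_a / m_b, independent of distance
    m_b = total / (1.0 + ratio)
    m_a = total - m_b
    return m_a, m_b

# Hypothetical binary: 1" and 2" orbits, 10 pc away, 100 year period
m_a, m_b = visual_binary_masses(1.0, 2.0, 10.0, 100.0)
print(f"m_a = {m_a:.2f} M_sun, m_b = {m_b:.2f} M_sun")
```

Note that the star with the smaller orbit (star a here) is the more massive one, as equation (11.4) requires.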

11.3.1 Orbital Inclination

Our discussion of using visual binaries to measure mass above presents quite a simplified picture. In reality, the analysis of the data is more complex because we do not know the inclination of the binary orbit to our line of sight. The true situation is shown in figure 11.4. When we track the motion


Figure 11.4: An orbit inclined with respect to our line of sight. The angle i between the orbit and the plane of the sky is called the orbital inclination. We do not see the true orbit, instead we see the orbit projected on the plane of the sky.

of stars in a visual binary, we do not see the true orbit, but the projection of the orbit onto the plane of the sky. Instead of measuring the true sizes of the orbits $r_a$ and $r_b$, we measure the projected sizes, for example $r'_a = r_a \cos i$. We can still measure the mass ratio without knowing the inclination because

$$\frac{m_a}{m_b} = \frac{\alpha_b}{\alpha_a} = \frac{\alpha_b \cos i}{\alpha_a \cos i} = \frac{\alpha'_b}{\alpha'_a}.$$

We do, however, need to know the orbital inclination to measure the total mass, as

$$m_a + m_b = \frac{4\pi^2 a^3}{G P^2} = \frac{4\pi^2}{G P^2} \left(\frac{a'}{\cos i}\right)^3 = \frac{4\pi^2}{G P^2} \left(\frac{\alpha' d}{\cos i}\right)^3,$$

where $\alpha' = \alpha'_a + \alpha'_b$ is the projected angular separation of the binary.

Therefore, whilst a visual binary can yield the mass ratio without knowing the orbital inclination or distance, a full solution for the individual masses needs knowledge of both the orbital inclination and distance to the binary. We might measure the distance using a parallax measurement, but how to measure the orbital inclination? It turns out that very detailed observations of the binary star orbits can tell us the orbital inclination as well. Figure 11.5 shows the basic idea. A circular orbit (shown in red) is inclined

Figure 11.5: The projection of an inclined, circular orbit onto the plane of the sky. A circular orbit (red) is inclined to our line of sight by 60 degrees. Its projection on the sky (blue) is an ellipse. The centre of the circular orbit projects to the centre of the ellipse (both marked by dots).

to our line of sight at 60 degrees. Its projection onto the plane of the sky

is an ellipse. Therefore, a star on this orbit appears to follow an elliptical orbit. However, if the orbit is measured accurately, it is clear that something is not right. The star moves on an elliptical orbit, but the centre of the orbit is not at one of the foci of the ellipse. Instead, the centre of the orbit is in the centre of the ellipse. Thus, the star appears to violate Kepler’s first law! The inclination of the true orbit can be determined by comparing the observed stellar positions with mathematical projections of various orbits onto the plane of the sky.

Detailed measurements of the orbits of visual binaries can therefore be used to measure the mass ratio, and the inclination of the binary orbit. Combined with a distance measurement, the total mass of the binary (and hence the individual stellar masses) can also be determined.
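The projection argument can be checked numerically. The sketch below projects a circular orbit onto the plane of the sky, using the same convention as figure 11.4 (one axis is foreshortened by cos i), and then recovers the inclination from the axis ratio of the resulting ellipse. It is a simplified illustration for circular orbits only; real analyses fit projected elliptical orbits to the observed positions. The function name is hypothetical.

```python
import math

def project_circular_orbit(radius, incl_deg, n_points=360):
    """Sky-plane (x, y) positions of a circular orbit inclined by incl_deg.

    The x axis lies along the line of nodes, so it is unchanged;
    the y axis is foreshortened by cos(i)."""
    cos_i = math.cos(math.radians(incl_deg))
    points = []
    for k in range(n_points):
        theta = 2.0 * math.pi * k / n_points
        points.append((radius * math.cos(theta), radius * math.sin(theta) * cos_i))
    return points

points = project_circular_orbit(1.0, 60.0)

# Semi-axes of the projected ellipse, measured from its centre
semi_major = max(abs(x) for x, _ in points)
semi_minor = max(abs(y) for _, y in points)

# For a circular orbit the axis ratio is cos(i)
recovered_incl = math.degrees(math.acos(semi_minor / semi_major))

# The centre of the circle projects onto the centre of the ellipse
centre_x = sum(x for x, _ in points) / len(points)
centre_y = sum(y for _, y in points) / len(points)

print(f"axis ratio = {semi_minor / semi_major:.3f}, inclination = {recovered_incl:.1f} deg")
```

The centre of mass sits at the centre of the projected ellipse rather than at a focus, which is exactly the signature of inclination described above.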

11.3.2 A visual binary case study: Sirius

Figure 11.6: A bright Geminid meteor, and Sirius (the bright star in the bottom left).

We need not look far for an example of a visual binary. Sirius is the brightest star in the night sky, and in many ways is the archetypical visual binary. Sirius consists of two stars, separated by around 7.5 arcseconds. The brighter star, Sirius A, has a luminosity of 25.4 $L_\odot$ and an effective temperature of 9,940 K. In many ways it is a typical star of spectral type

A2. Sirius B is much hotter, at 25,200 K, but much fainter, with a luminosity of 0.026 $L_\odot$. Since, for a black body, $L = 4\pi R^2 \sigma T_{\rm eff}^4$, we can immediately tell that Sirius B is much smaller than Sirius A; in fact it is over 200 times smaller. Sirius is extremely bright in part because it is close to us. It’s therefore not surprising that it shows a large proper motion. The motion of Sirius A and B in the sky is shown in figure 11.7. As well as a large proper motion relative to the background stars, the paths of the stars also reveal the motions of the two stars around the centre of mass.

Figure 11.7: The paths of Sirius A (solid line) and Sirius B (dashed line) in the sky. Background stars are marked with dots and numbers. On top of the binary motion, there is a very large proper motion.

Figure 11.8 shows the motions of Sirius A and Sirius B again, but this time the proper motion, and the motion of Sirius A have been subtracted, so Sirius A appears stationary. This makes the orbit of Sirius B more obvious. The orbit is elliptical, but Sirius A does not sit at one of the foci of the ellipse. This tells us the orbit is inclined at an angle to our line of sight. Detailed analysis of the orbital shape reveals that the orbital inclination of Sirius AB is roughly 44 degrees. The size of the orbit is α = 7.56 arcseconds. Parallax

Figure 11.8: The ’apparent’ orbit of Sirius B. This is the orbital motion of Sirius B after subtracting the proper motion of the binary, and the motion of Sirius A.

measurements reveal the distance to the binary is d = 2.6 parsecs. The size of the orbit is thus given by a = αd = 19.7 AU. We can apply Kepler’s third law to find the total mass of the binary - about 3 Solar masses. Look again at the orbits of Sirius A and B in figure 11.7. It is clear that the orbit of Sirius A is about half the size of Sirius B’s orbit. Since

$$\frac{m_a}{m_b} = \frac{r_b}{r_a} = \frac{\alpha_b}{\alpha_a}$$

we can immediately see that Sirius A is roughly twice as massive as Sirius B. Since the total mass of Sirius AB is 3 Solar masses it follows that Sirius A is roughly 2 Solar masses and Sirius B is roughly the same mass as the Sun.

Whilst Sirius A is essentially a normal star of spectral type A2, with a typical mass and luminosity, Sirius B is very odd indeed. It has roughly the same mass as the Sun, is nearly four times hotter than the Sun and yet its radius is 200 times smaller than a typical A2 star. That means the radius of Sirius B is around the same as that of the Earth! Sirius B is one of the earliest known White Dwarfs, extremely dense stars which are the ultimate fate of stars like our Sun. They are formed from the hot dense core of the star as it reaches the end of its life.
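The Sirius numbers quoted in this case study can be checked with a short calculation. The sketch below uses the angular size, distance, luminosities and temperatures given in the text. The orbital period of about 50.1 years is an assumption added here (it is not stated in these notes), as is the use of Kepler's third law in solar units (masses in solar masses, a in AU, P in years).

```python
import math

# Values quoted in the text
ALPHA_ARCSEC = 7.56          # angular size of the orbit
D_PC = 2.6                   # distance from parallax
L_A, T_A = 25.4, 9940.0      # Sirius A: luminosity (L_sun) and temperature (K)
L_B, T_B = 0.026, 25200.0    # Sirius B: luminosity (L_sun) and temperature (K)

# Assumed value, not given in the notes
PERIOD_YR = 50.1             # orbital period of Sirius AB

# Orbit size: 1 arcsecond at 1 parsec corresponds to 1 AU
a_au = ALPHA_ARCSEC * D_PC

# Kepler's third law in solar units
total_mass = a_au ** 3 / PERIOD_YR ** 2

# Sirius A's orbit is about half the size of Sirius B's, so m_A / m_B is about 2
mass_ratio = 2.0
m_B = total_mass / (1.0 + mass_ratio)
m_A = total_mass - m_B

# Black body: L = 4 pi R^2 sigma T^4, so R_A / R_B = sqrt(L_A / L_B) * (T_B / T_A)^2
radius_ratio = math.sqrt(L_A / L_B) * (T_B / T_A) ** 2

print(f"a = {a_au:.1f} AU, total mass = {total_mass:.1f} M_sun")
print(f"m_A = {m_A:.1f} M_sun, m_B = {m_B:.1f} M_sun")
print(f"Sirius A is about {radius_ratio:.0f} times larger than Sirius B")
```

Under these assumptions the total mass comes out at about 3 solar masses and the radius ratio at a little over 200, in line with the figures quoted above.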

The extremely high density of white dwarfs can only be supported against gravitational collapse by electron degeneracy pressure; a curious quirk of quantum mechanics. The Heisenberg uncertainty principle states that you cannot simultaneously define the position and momentum of an electron to arbitrary precision. If the position is known more accurately, the momentum becomes more uncertain, and vice versa. In a white dwarf, the electrons are confined within a very small radius. Their positions are thus well known, so their momenta must be very uncertain! This means that some electrons will have high momenta, and be moving at high speeds. Just like a gas in a box, high electron speeds cause a pressure, which supports the white dwarf against gravity.

11.4 Spectroscopic Binaries

What if we cannot resolve each of the stars individually? In that case, we cannot measure the orbit of the binary directly, but there is still a wealth of information that can be extracted from the spectra of binary stars. If the orbital motion has a component along the line of sight, a periodic radial velocity shift will be observable³ - as shown in figure 11.9.

Figure 11.9: The orbital paths and radial velocities of two stars in circular orbits. In this example, $M_1 = 1\,M_\odot$, $M_2 = 2\,M_\odot$ and the orbital period is P = 30 d. The whole binary is moving away from us with a radial velocity of $v_{cm} = 42$ km s$^{-1}$. $v_1$ and $v_2$ are the velocities of star 1 and star 2 respectively. (a) The plane of the circular orbits lies along the line of sight of the observer. (b) The observed radial velocity curves.

As with the visual binaries before, the angle of inclination between the line of sight and the orbit affects the observed radial velocities. Figure 11.10 shows that if the star has a velocity $v$ around its orbit, what we actually observe is $v' = v \sin i$, where $i$ is the inclination angle of the orbit. To obtain the actual velocities of the stars it is thus necessary to determine the orbital inclination somehow.

Figure 11.10: A star follows a circular orbit in a binary with velocity v. The orbit is inclined at an angle i. This figure shows that the component of velocity along our line of sight - the radial velocity - is given by v sin i.

³ Remember, we can determine a star’s radial velocity because the Doppler shift will change the wavelength of absorption or emission lines from the star.

11.4.1 Double-lined binaries

If both stars are comparably bright, we will be able to see absorption lines from both stars in the binary. Such a binary is called a double-lined spectroscopic binary. Double-lined binaries are very useful for mass determinations, as we will see below. We will assume the orbits are circular, in which case the speeds of the stars around the orbits are constant and given by $v_1 = 2\pi a_1/P$ and $v_2 = 2\pi a_2/P$. Since

$$\frac{m_1}{m_2} = \frac{a_2}{a_1},$$

we can use the formulae for the speed above to replace $a_1$ and $a_2$ with $v_1$ and $v_2$ to get

$$\frac{m_1}{m_2} = \frac{v_2}{v_1} = \frac{v_2 \sin i}{v_1 \sin i} = \frac{v'_2}{v'_1}.$$

Hence, as for visual binaries we can determine the mass ratio without knowing the orbital inclination, in this case using the observed radial velocities, $v'_1$ and $v'_2$. However, as is also the case with visual binaries, finding the total mass of the binary does require knowledge of the orbital inclination. The total size of the orbit, $a$, can be written as

$$a = a_1 + a_2 = \frac{P}{2\pi} (v_1 + v_2).$$

We can use this to replace $a$ in Kepler’s third law

$$P^2 = \frac{4\pi^2}{G} \left( \frac{a^3}{m_1 + m_2} \right),$$

and solve for the total mass,

$$m_1 + m_2 = \frac{P}{2\pi G} (v_1 + v_2)^3.$$

Re-writing this in terms of the observed radial velocities, we find

$$m_1 + m_2 = \frac{P}{2\pi G} \frac{(v'_1 + v'_2)^3}{\sin^3 i}.$$

Hence, provided we know some way of measuring the orbital inclination, double-lined spectroscopic binaries can yield individual stellar masses via radial velocity measurements.
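As a consistency check on these formulae, the sketch below builds synthetic radial velocity amplitudes for the binary of figure 11.9 (1 and 2 solar masses, a 30 day period), assumed here to be seen edge-on, and then recovers the mass ratio and total mass from them. The constants are approximate SI values and the function names are illustrative only.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30    # solar mass, kg
DAY = 86400.0       # seconds in a day

def mass_ratio(v1_obs, v2_obs):
    """m1/m2 from the observed radial velocity amplitudes (sin i cancels)."""
    return v2_obs / v1_obs

def total_mass(period_s, v1_obs, v2_obs, incl_deg):
    """m1 + m2 in kg for circular orbits: P (v1' + v2')^3 / (2 pi G sin^3 i)."""
    sin_i = math.sin(math.radians(incl_deg))
    return period_s * (v1_obs + v2_obs) ** 3 / (2.0 * math.pi * G * sin_i ** 3)

# Synthetic "observations" for the binary of figure 11.9, seen edge-on (i = 90 deg)
m1, m2, period = 1.0 * M_SUN, 2.0 * M_SUN, 30.0 * DAY
a = (G * (m1 + m2) * period ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)
v_sum = 2.0 * math.pi * a / period
v1 = v_sum * m2 / (m1 + m2)   # the lighter star moves faster
v2 = v_sum * m1 / (m1 + m2)

recovered_ratio = mass_ratio(v1, v2)
recovered_total = total_mass(period, v1, v2, 90.0) / M_SUN

print(f"v1 = {v1 / 1e3:.1f} km/s, v2 = {v2 / 1e3:.1f} km/s")
print(f"m1/m2 = {recovered_ratio:.2f}, m1 + m2 = {recovered_total:.2f} M_sun")
```

The recovered values match the masses used to generate the velocities, which confirms that the two expressions are mutually consistent for circular orbits.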

11.4.2 Double-lined, eclipsing binaries

Eclipsing binaries are the heavyweight champions of precise stellar measurements. In large part, this is because an eclipsing system tells us the orbit is very close to edge on. Put another way, if we see eclipses, we know the orbital inclination is close to 90°. Even if it were assumed that i = 90°, while the actual value was close to i = 75°, the resulting error in $\sin^3 i$ would only be around 10%, with a corresponding error in the total mass. Thus, the observation of eclipses in a binary star’s lightcurve (figure 11.11) immediately allows us to roughly guess the orbital inclination, and get a decent estimate of the stellar masses.

We can get a better estimate of the inclination from the eclipse shape. Figure 11.11 shows the lightcurve of a binary with i = 90°. When the smaller star is eclipsed by the larger one a nearly constant minimum occurs in the brightness of the binary as a whole. Similarly, even though the larger star is not completely eclipsed by the smaller one, a constant amount of area is obscured, and so again a nearly constant drop in brightness is observed. The eclipse is described as ’flat-bottomed’ and is a clear indicator of very high inclinations. When the inclination is a little lower, one star is not completely eclipsed by its companion. In this case, the minima of the lightcurve are no longer constant, implying that i < 90°.

However, double-lined eclipsing binaries allow much more than just the stellar masses to be measured. Detailed analysis of the eclipse also allows direct measurements of the stellar radii, and the ratio of the effective temperatures to be measured. It is for these reasons that eclipsing binaries are so useful in stellar physics.
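The claim that assuming i = 90° costs only about 10% when the true inclination is 75° is easy to verify. Since the total mass scales as 1/sin³ i, the following sketch simply evaluates sin³ i for a few inclinations; the helper name is illustrative.

```python
import math

def mass_error_if_edge_on_assumed(true_incl_deg):
    """Fractional underestimate of the total mass when i = 90 deg is assumed.

    The true mass is P (v1' + v2')^3 / (2 pi G sin^3 i); assuming sin i = 1
    gives a mass that is too small by the factor sin^3(i)."""
    sin_i = math.sin(math.radians(true_incl_deg))
    return 1.0 - sin_i ** 3

for incl in (90.0, 85.0, 80.0, 75.0):
    err = mass_error_if_edge_on_assumed(incl)
    print(f"true i = {incl:4.1f} deg -> mass underestimated by {100.0 * err:4.1f}%")

error_at_75 = mass_error_if_edge_on_assumed(75.0)
```

The error grows slowly near 90° because sin i is flat there, which is why simply detecting eclipses already constrains the masses well.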

Figure 11.11: The light curves of two eclipsing binaries with different inclinations. In the top panel is a binary for which i = 90°. The bottom curve shows a partially eclipsing binary, of lower inclination. The times indicated on the light curves correspond to the positions of the smaller star relative to its larger companion. It is assumed that the smaller star is hotter than its companion, so that the luminosities of the two stars are similar ($L = 4\pi\sigma R^2 T^4$).

Stellar radii

We refer again to figure 11.11, looking at the binary with i = 90°. Let us label the large star as star 1, and the smaller star as star 2. The relative velocity of the two stars is $v = v_1 + v_2$, so the time taken for the small star to move from a to b is $t_b - t_a = 2r_2/v$. Since $t_b - t_a$ can be measured from the lightcurve, we can immediately find the radius of the small star

$$r_2 = \frac{v}{2} (t_b - t_a).$$

Similarly, by considering the time taken for the small star to move between b and c, the size of the larger star can be determined

$$r_1 = \frac{v}{2} (t_c - t_a) = r_2 + \frac{v}{2} (t_c - t_b).$$
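A minimal sketch of the radius calculation is given below. It assumes an edge-on binary with circular orbits, as in the text, and takes the contact times $t_a$, $t_b$, $t_c$ and the relative velocity $v = v_1 + v_2$ as inputs. The numerical values are hypothetical, chosen only to give radii of a plausible stellar size.

```python
def stellar_radii(v, t_a, t_b, t_c):
    """Radii (r1, r2) of the large and small star from eclipse contact times.

    v is the relative orbital velocity v1 + v2; the radii come out in the
    length unit implied by v multiplied by the time unit."""
    r2 = 0.5 * v * (t_b - t_a)   # small star: time to cross its own diameter
    r1 = 0.5 * v * (t_c - t_a)   # large star: from first contact to point c
    return r1, r2

# Hypothetical measurements
v_kms = 200.0                 # relative velocity, km/s
t_a = 0.0                     # s
t_b = 2.0 * 3600.0            # s
t_c = 6.0 * 3600.0            # s

r1_km, r2_km = stellar_radii(v_kms, t_a, t_b, t_c)
print(f"r1 = {r1_km:.3g} km, r2 = {r2_km:.3g} km")
```

Because the velocities come from the radial velocity curves and the times from the lightcurve, no distance measurement enters the calculation.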

Effective temperatures

By assuming the stars emit as black bodies we can find the ratio of the stars’ effective temperatures. Recall that, for a black body, the surface flux (energy emitted per second per unit surface area of the star) is given by $F = \sigma T_{\rm eff}^4$. The total light from the binary when both stars are visible is

$$L_T = \pi r_1^2 F_1 + \pi r_2^2 F_2.$$

When the small star (star 2) is fully eclipsed the light from the binary is

$$L_2 = \pi r_1^2 F_1.$$

When the larger star (star 1) is eclipsed most of the stellar disc is still visible, but an area equal to $\pi r_2^2$ is obscured by the smaller star. The light from the binary is therefore

$$L_1 = \pi r_2^2 F_2 + \pi (r_1^2 - r_2^2) F_1.$$

The depth of the eclipse when star 1 is eclipsed is $L_T - L_1$, and likewise $L_T - L_2$ when star 2 is eclipsed. A remarkable thing happens when we look at the relative depths of the two eclipses

$$\frac{L_T - L_2}{L_T - L_1} = \frac{\pi r_1^2 F_1 + \pi r_2^2 F_2 - \pi r_1^2 F_1}{\pi r_1^2 F_1 + \pi r_2^2 F_2 - \pi r_2^2 F_2 - \pi (r_1^2 - r_2^2) F_1},$$

which simplifies to

$$\frac{L_T - L_2}{L_T - L_1} = \frac{F_2}{F_1} = \left(\frac{T_2}{T_1}\right)^4.$$
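The simplification can be checked numerically. The sketch below builds the three light levels $L_T$, $L_1$ and $L_2$ for a hypothetical binary with known radii and temperatures, then recovers the temperature ratio from the eclipse depths alone. The Stefan-Boltzmann constant is set to 1 because it cancels in the ratio; all input values are illustrative.

```python
import math

def temperature_ratio(L_total, L_star1_eclipsed, L_star2_eclipsed):
    """T2/T1 from the relative eclipse depths, assuming black-body stars."""
    depth_star2 = L_total - L_star2_eclipsed   # depth when star 2 is hidden
    depth_star1 = L_total - L_star1_eclipsed   # depth when star 1 is partly hidden
    return (depth_star2 / depth_star1) ** 0.25

# Hypothetical binary: star 1 is larger and cooler, star 2 is smaller and hotter
r1, r2 = 2.0, 1.0            # radii in arbitrary units
T1, T2 = 5000.0, 8000.0      # effective temperatures in K
F1, F2 = T1 ** 4, T2 ** 4    # surface fluxes with sigma set to 1

L_T = math.pi * r1 ** 2 * F1 + math.pi * r2 ** 2 * F2             # both stars visible
L_2 = math.pi * r1 ** 2 * F1                                      # star 2 fully eclipsed
L_1 = math.pi * r2 ** 2 * F2 + math.pi * (r1 ** 2 - r2 ** 2) * F1  # star 1 partly covered

ratio = temperature_ratio(L_T, L_1, L_2)
print(f"Recovered T2/T1 = {ratio:.3f} (input value {T2 / T1:.3f})")
```

The radii drop out entirely, so the depth ratio depends only on the two effective temperatures.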

This is why double-lined eclipsing binaries are such precious objects for stellar astronomers. They are the only way to directly measure both the mass and radius of a star, and they also allow effective temperatures to be measured. Furthermore, notice that we did not need to know the distance to the star. All that is needed is observations of the radial velocity curves and the light curve. Unlike visual binaries, double-lined eclipsing binaries can yield measurements of the properties of stars which are too distant for a parallax measurement.

11.4.3 Single-lined binaries

Obviously, the mass ratio and total mass can only be measured if the radial velocities of both stars are measurable. This requires that absorption or emission lines from both stars are visible in the spectrum of the binary. If one star is much brighter than the other the spectrum of the fainter star will be overwhelmed. Such a binary is called a single-lined binary. Recall that, for a double-lined binary,

$$m_1 + m_2 = \frac{P}{2\pi G} \frac{(v'_1 + v'_2)^3}{\sin^3 i},$$

and

$$\frac{m_1}{m_2} = \frac{v'_2}{v'_1}.$$

Suppose that only star 1 is visible, so we can only measure $v'_1$. We can use the latter equation to replace $v'_2$ in the first equation with $v'_2 = v'_1 m_1/m_2$ to give

$$m_1 + m_2 = \frac{P}{2\pi G} \frac{v'^3_1}{\sin^3 i} \left(1 + \frac{m_1}{m_2}\right)^3.$$

Re-arranging terms gives

$$\frac{m_2^3}{(m_1 + m_2)^2} \sin^3 i = \frac{P}{2\pi G} v'^3_1. \tag{11.5}$$

The right hand side of this equation is known as the mass function. It only depends on observable quantities of a single-lined binary, the period and radial velocity of the visible component. The left-hand side of equation (11.5) is always less than $m_2$, since $m_1 + m_2 > m_2$ and $\sin i \le 1$. Therefore, the mass function provides a lower limit for the mass of the unseen component, $m_2$. As we will see in the following case study - this can still be very useful.
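To illustrate how the mass function of equation (11.5) is used, the sketch below evaluates its right-hand side for a hypothetical single-lined binary and also evaluates the left-hand side for an assumed set of masses, showing that it never exceeds $m_2$. The period and velocity are invented for illustration and do not refer to a specific system; the constants are approximate SI values.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30    # solar mass, kg
DAY = 86400.0       # seconds in a day

def mass_function(period_s, v1_obs):
    """Right-hand side of equation (11.5), P v1'^3 / (2 pi G), in kg."""
    return period_s * v1_obs ** 3 / (2.0 * math.pi * G)

def mass_function_lhs(m1, m2, incl_deg):
    """Left-hand side of equation (11.5), m2^3 sin^3(i) / (m1 + m2)^2."""
    sin_i = math.sin(math.radians(incl_deg))
    return m2 ** 3 * sin_i ** 3 / (m1 + m2) ** 2

# Hypothetical observations of the visible star
period = 5.6 * DAY      # orbital period, s
v1_obs = 75.0e3         # observed radial velocity amplitude, m/s

f_msun = mass_function(period, v1_obs) / M_SUN
print(f"Mass function = {f_msun:.3f} M_sun (lower limit on the unseen mass m2)")

# For any choice of masses and inclination the left-hand side is below m2
lhs_example = mass_function_lhs(10.0, 5.0, 60.0)   # masses in solar units
print(f"m1 = 10, m2 = 5, i = 60 deg gives {lhs_example:.3f} M_sun, below m2 = 5")
```

The limit is weak when the visible star dominates the mass, but it requires nothing beyond the period and one velocity amplitude.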

11.5 Exoplanets

The planets of our solar system have been recognised since the time of the Babylonians, nearly 2000 years BC. In the four millennia that followed they were the only planets known to exist. Then, on October 6th 1995, Michel Mayor and Didier Queloz announced the discovery of an exoplanet orbiting the main-sequence star 51 Peg. This discovery started a new era of planet discovery; as of April 2010 there are 452 known planets outside our solar system.

Exoplanets were first discovered and characterised by measuring the radial velocity of the star as it orbits the centre of mass of the star-planet system. This radial velocity of the host star is often called a Doppler wobble. To date, the vast majority of exoplanets have been discovered by looking for stars which show a detectable Doppler wobble. Another way of searching for exoplanets is to look for the tiny dip in light caused by the planet passing in front of the host star - an exoplanetary transit. Transit searches have become a popular alternative to the Doppler wobble technique for exoplanet hunting. Transit searches have a number of advantages. Because many stars can fit on a CCD image, many more stars can be studied in a given time. Also, because the starlight is not being divided into many wavelengths for study, quite small telescopes can be used - in contrast to the Doppler wobble technique which uses the biggest telescopes available.

11.5.1 Exoplanets as single-lined binaries

The planets themselves are extremely difficult to see directly; exoplanetary systems are therefore a type of single-lined spectroscopic binary, and can be analysed and treated as such. We measure the Doppler wobble of the planetary host, and can construct the mass function, as given by equation (11.5),

$$\frac{m_p^3}{(m_p + m_s)^2} \sin^3 i = \frac{P}{2\pi G} v'^3_s,$$

where the subscript p denotes the exoplanet and the subscript s denotes the host star. However, the mass of the star is much greater than that of the planet (Jupiter, for example, is around 1000 times less massive than the Sun); we can use this to re-write the mass function in a much more useful form. The term $m_p + m_s$ can be written as

$$m_p + m_s = m_s \left(1 + \frac{m_p}{m_s}\right) \approx m_s,$$

because $m_p/m_s \ll 1$. Substituting $m_p + m_s \approx m_s$ into the mass function, we get

$$\frac{m_p^3}{m_s^2} \sin^3 i = \frac{(m_p \sin i)^3}{m_s^2} \approx \frac{P}{2\pi G} v'^3_s. \tag{11.6}$$

It is normally possible to get a good estimate of the stellar mass, ms. This is because, on the main-sequence, there are well known relationships between a star’s mass and a number of observable quantities, such as its luminosity, effective temperature, or spectral type. So by measuring (for example), the

spectral type of the host star, we can estimate the stellar mass to an accuracy of a few percent. Armed with an estimate of the stellar mass and Doppler wobble measurements of the host star (which reveal both the observed radial velocity $v'_s$ and the period $P$), we can calculate the quantity $m_p \sin i$, which is a lower limit to the mass of the planet.

11.5.2 Doppler wobble measurement

The majority of the known exoplanets have been found by searching for Doppler wobbles in nearby stars, and as outlined above, the Doppler wobble gives a lower limit to the planetary mass. But how large is it? Consider the Doppler wobble of the Sun, as caused by Jupiter. The period of Jupiter is 11.86 years, and its mass is $1.9 \times 10^{27}$ kg, compared to the Sun’s $2 \times 10^{30}$ kg. Putting these quantities into equation (11.6), and assuming that i = 90°, so all of the stellar motion is along our line of sight, we find a Doppler wobble of around 12 m s$^{-1}$. To put this in some sort of context, that’s just less than 30 mph. To detect exoplanets thus needs us to be able to measure radial velocities of objects which are many parsecs away, and moving away from us at a similar speed to city centre traffic!

Such observations are extremely challenging, and this explains why the discovery of exoplanets was so recent. The problem lies in calibrating a spectrograph. If you measure the spectrum of a planet-hosting star you need to measure the wavelength of the absorption lines to measure the Doppler shift. What you actually measure is the position of an absorption line on your detector, and there are lots of flaws in the instrument that can cause this to change, even if the star itself shows no motion. For example, the spectrograph can flex as the telescope moves, or as the instrument cools during the night. To get round this, astronomers calibrate their spectrographs, by observing lamps which emit lines of known wavelength. In conventional spectrographs this is done every hour or so. The calibration usually limits accuracies to a few km/s: much worse than we need to detect exoplanets! To get round this, spectrographs have been built where the starlight shines through an Iodine cell before the spectrum is measured. The Iodine superimposes absorption lines on the star’s spectrum. The position of the star’s absorption lines can then be measured relative to the Iodine lines. This greatly improves precision, and the best spectrographs can reach radial velocity accuracies of 1 m/s; a slow walking pace!
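The 12 m/s figure quoted above for the Sun-Jupiter system can be reproduced by solving equation (11.6) for the stellar velocity, $v'_s \approx (2\pi G/P)^{1/3}\, m_p \sin i / m_s^{2/3}$. The sketch below uses the masses and period given in the text; the value of G and the length of the year are standard approximate values, and the function name is illustrative.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
YEAR = 3.156e7      # seconds in a year (approximate)

def doppler_wobble(m_planet, m_star, period_s, incl_deg=90.0):
    """Observed stellar radial velocity amplitude from equation (11.6), in m/s.

    Valid when the planet is much less massive than the star."""
    sin_i = math.sin(math.radians(incl_deg))
    return (2.0 * math.pi * G / period_s) ** (1.0 / 3.0) * m_planet * sin_i / m_star ** (2.0 / 3.0)

# Values quoted in the text for Jupiter and the Sun
M_JUPITER = 1.9e27          # kg
M_SUN = 2.0e30              # kg
P_JUPITER = 11.86 * YEAR    # s

v_sun = doppler_wobble(M_JUPITER, M_SUN, P_JUPITER)
print(f"Wobble of the Sun due to Jupiter: {v_sun:.1f} m/s ({v_sun / 0.44704:.0f} mph)")
```

The same function shows directly why the technique favours massive planets on short-period orbits: the wobble scales linearly with planet mass but only as $P^{-1/3}$.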

11.5.3 Planets found from Doppler Wobble: observational bias

Figure 11.12 shows the properties of the known exoplanets as of April 2010. The known exoplanets are totally unlike the planets in our own solar system. Most of the known exoplanets are of a similar mass to Jupiter, and yet have orbits similar to that of Earth. Some exoplanets (so-called hot Jupiters) are similar in mass to Jupiter, but have orbits smaller than Mercury’s! This raises the immediate question of whether these planets are typical or not.

There is good reason to suspect they are not, because planet-hunting using Doppler wobble is very strongly biased. Look in detail at equation (11.6). In order for the Doppler wobble $v'_s$ to be large we need the mass of the planet $m_p$ to be large, and the period $P$ to be small! A small period implies a small orbit⁴, so it is hardly surprising that the Doppler wobble technique is finding lots of heavy planets orbiting close to their host stars. Note also that there are almost no planets found using the Doppler wobble technique with orbital periods longer than 10 years. This is because we need to see the radial velocity curve repeat itself in order to measure the period and be sure we are seeing an exoplanet orbiting the star. Since we have been monitoring stars for only 15 years or so, it is natural that exoplanets with long periods are rare! Exoplanet searches are steadily becoming more accurate and as time goes on it is hoped we will start to find exoplanetary systems more like our own Solar system.

11.5.4 Transiting exoplanets

Recall the mass function for planetary systems,

$$\frac{(m_p \sin i)^3}{m_s^2} \approx \frac{P}{2\pi G} v'^3_s.$$

Obviously, to get more than a minimum mass for the exoplanet we need to know the orbital inclination. By analogy with spectroscopic binaries, the obvious way to do this is to look for systems in which the exoplanet passes in front of the host star. In stellar binaries these are called eclipses; in exoplanet systems, they are known as transits.

As with spectroscopic binaries, the presence of transits allows us to measure the inclination, and more besides. Again, in a similar way to eclipsing binaries, the presence of transits tells us that the inclination is close to 90°, and this might be enough for our needs. If not, the detailed transit

⁴ Kepler’s third law says that $P^2 \propto a^3$.

Figure 11.12: Properties of known exoplanets as of April 2010. The y-axis shows planetary mass (we assume mass is equal to the minimum mass given by the mass function). On the x-axis we show either the size of the orbit (bottom axis) or the period of the orbit (top axis). The dots are colour coded according to their discovery method. Blue dots are planets detected by Doppler wobble. Green dots represent planets discovered from their transits.

Figure 11.13: Transit of exoplanet Wasp-4b in front of its host star.

shape tells us the precise inclination, as shown in figure 11.14. The transit depth can also tell us the radius of the exoplanet. The luminosity of the star/exoplanet system outside of transit is just given by the luminosity of the star. Assuming the star radiates as a black body this is given by

$$L_{\rm out} = \pi r_s^2 \sigma T^4.$$

During transit, the exoplanet blocks some of the surface of the star. The visible surface area of the star is now $\pi(r_s^2 - r_p^2)$, so the luminosity during transit is

$$L_{in} = \pi(r_s^2 - r_p^2)\,\sigma T^4.$$

Now, we calculate the transit depth, as a fraction of the total out-of-transit light

$$\frac{L_{out} - L_{in}}{L_{out}} = \left(\frac{r_p}{r_s}\right)^2. \tag{11.7}$$

Using equation (11.7) we can measure the planetary radius from the transit lightcurve. We need to know the radius of the host star but this, like the mass of the host star earlier, can be estimated from the spectral type and main-sequence mass-radius relationships. Note that equation (11.7) predicts that transit depths are very small - normally only 1 or 2% of the total light from the star. Measuring planetary transits thus requires very accurate photometry, and transit searches are inevitably biased towards larger planets, which cause the deepest transits. The transit depth does not depend on the distance of the planet from the star; unlike the Doppler wobble searches, transit searches are sensitive to planets in large orbits around their host stars. They are still slightly biased against these planets though; planets in large orbits are less likely to show transits in the first place.

The dependence of transit depth on the planetary radius means that detecting Earth-like planets needs better accuracy than can be obtained from the ground. To get round this, several satellites aimed at transit searches have been launched. Examples include NASA’s Kepler satellite and the CNES-led CoRoT satellite. These satellites offer a good chance of detecting a truly Earth-like planet within the next few years.

Figure 11.14: How the inclination affects the transit shape. The solid line shows the transit shape for i = 90°, whilst the top shaded circles show the position of the planet at the beginning and end of the ingress and egress from transit. The dashed line shows the transit shape for a lower inclination, and the lower row of circles shows the planetary positions at ingress and egress. Transits at lower inclinations last for shorter times, and have slower transitions into and out of transit.

[Text embedded in the figure: FIG. 1 defines the transit light-curve observables (transit depth ∆F, total transit duration t_T, flat-part duration t_F, and impact parameter b corresponding to orbital inclination i); FIG. 2 is a transit schematic for non-central transits, with i = 90° corresponding to edge-on. Figure from R. Santana.]
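As a rough numerical illustration of equation (11.7), the sketch below evaluates the transit depth for a Jupiter-sized and an Earth-sized planet crossing a Sun-like star, and inverts the relation to recover a planetary radius from a measured depth. The function names and the choice of example planets are illustrative assumptions, not part of the notes; the radii are standard reference values.

```python
import math

# Reference radii in metres (standard values, used here only as examples)
R_SUN = 6.957e8
R_JUPITER = 7.1492e7
R_EARTH = 6.371e6


def transit_depth(r_planet, r_star):
    """Fractional drop in light during transit, equation (11.7)."""
    return (r_planet / r_star) ** 2


def planet_radius(depth, r_star):
    """Invert equation (11.7): planetary radius from a measured depth."""
    return r_star * math.sqrt(depth)


depth_jupiter = transit_depth(R_JUPITER, R_SUN)
depth_earth = transit_depth(R_EARTH, R_SUN)

print(f"Jupiter-sized planet: depth = {100 * depth_jupiter:.2f}%")
print(f"Earth-sized planet:   depth = {100 * depth_earth:.4f}%")
print(f"Radius recovered from Jupiter depth: {planet_radius(depth_jupiter, R_SUN):.3e} m")
```

The Earth-sized case comes out roughly a hundred times shallower than the Jupiter-sized one, which is why space-based photometry is needed for Earth-like planets.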

11.6 Weighing Galaxies

So far we have measured the masses of planets in our own Solar system, planets around other stars and stars in various types of binary systems. Now we take a step up in scale, and ask how we can measure the mass of galaxies. Galaxies have been known since the 10th century, but their nature was only recently understood. The idea that galaxies were disks of stars, like our Milky Way, has been around since the mid-18th century. However, it was also thought possible that nebulae, as galaxies were then called, were bright clouds of gas and dust within the Milky Way itself. As we discussed earlier, the matter was finally resolved in the 1920s, when astronomers used Cepheid variables to measure the distances to nearby galaxies. Of course, galaxy masses are measured using the effects of gravity but, unlike stars and planets, we can measure the mass of a single galaxy even if it is isolated in space. This is because we can use the rotation of the galaxy to measure the mass.

11.6.1 Galaxy rotation

Figure 11.15: Galaxy rotation

Spiral galaxies, like the Milky Way, rotate. Due to the Doppler shift, light from one side of the galaxy will appear blueshifted, whilst light from the other side of the galaxy appears redshifted (see figure 11.15). By measuring how the rotation speed changes with distance from the galaxy centre, we can measure the mass of the galaxy. Before we show how that works, we should quickly discuss how the rotational velocity is measured.

The visible light from galaxies is dominated by stars. Therefore, the optical spectrum of a galaxy looks like the spectra of lots of stars added together, and contains many absorption lines that can be used to measure the Doppler shift. However, there is more to the galaxy than just starlight; galaxies also contain significant amounts of gas and dust, which can extend well beyond the regions of the galaxy that contain stars. Since hydrogen is the most abundant constituent of this gas, we can track the motion of this gas using hydrogen lines, but not the hydrogen lines which appear in the ultraviolet, optical and infrared (the Lyman, Balmer and Paschen series).

Remember that absorption and emission lines are created when electrons hop between allowed energy levels in an atom. These energy levels are labelled with a principal quantum number, n. In earlier lectures we stated that all energy levels with the same value of n had the same energy. This is not quite true. It turns out there is a tiny, tiny difference in energy between states in which the electron and proton spin in the same direction, and states in which the electron and proton spin in opposite directions. This phenomenon is known as hyperfine splitting. The difference in energy is $9.5 \times 10^{-25}$ J, and electrons changing from one level to the other give rise to light with a wavelength of 21 cm. This is in the radio region of the electromagnetic spectrum. Thus, optical spectroscopy tells us about the rotation speeds of those parts of the galaxy which contain stars, and radio spectroscopy tells us about the rotation of the parts of the galaxy where stars are absent.
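The link between the hyperfine energy difference and the 21 cm wavelength can be checked with the photon energy relation $E = hc/\lambda$. The short sketch below does this arithmetic; the variable names are illustrative, and the constants are rounded standard values.

```python
# Rounded physical constants (SI units)
H_PLANCK = 6.626e-34   # Planck constant, J s
C_LIGHT = 2.998e8      # speed of light, m/s

# Energy difference of the hydrogen hyperfine levels, as quoted in the text
delta_e = 9.5e-25      # J

# Photon wavelength and frequency for a transition between the two levels
wavelength = H_PLANCK * C_LIGHT / delta_e   # metres
frequency = delta_e / H_PLANCK              # hertz

print(f"wavelength = {100 * wavelength:.1f} cm")
print(f"frequency  = {frequency / 1e9:.2f} GHz")
```

With the rounded energy quoted above, the wavelength comes out close to 21 cm, at a frequency of roughly 1.4 GHz, which is firmly in the radio.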

11.6.2 Galaxy rotation curves

Before we look at the observed rotation of galaxies, let us work out what we expect to see. We look at the rotation under gravity of material a distance r from the centre of the galaxy. We assume that the mass inside r and outside r is distributed spherically. Our simple model is shown in figure 11.16. To calculate the expected rotation, we need to remember two theorems we proved in section 10.4.3. One was that the gravity due to the mass inside r is the same as if all that mass was concentrated in a point at the centre of the galaxy. The other was that the mass outside r exerts no net gravitational force. We can find the rotation of material at r by equating the gravitational force to the centripetal force needed to keep the material on a circular orbit. This gives

$$\frac{G M(r)\, m}{r^2} = \frac{m v^2(r)}{r},$$

where M(r) is the mass inside r, m is the mass of a small test mass at r, and v(r) is the rotational speed at r. Re-arranging gives

$$v(r) = \sqrt{\frac{G M(r)}{r}}. \tag{11.8}$$

Figure 11.16: A simple model of galaxy rotation. (Labels in the figure: v(r), r, M(r).)

Let’s assume the galaxy has a constant density $\rho$. Inside the galaxy

$$M(r) = \frac{4}{3}\pi r^3 \rho.$$

Substituting this back into equation (11.8) gives $v(r) = r\sqrt{4\pi G\rho/3}$, which tells us that the rotational velocity should increase linearly with radius, $v(r) \propto r$. Outside the galaxy, M(r) is constant (no more mass is enclosed by spheres of larger radii). This suggests that outside the galaxy, the rotation speed should slowly drop off as $v(r) \propto r^{-1/2}$. A graph of rotation speed v(r) versus radius r is known as a galaxy rotation curve, and our calculations predict it should look like the red line in figure 11.17.
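The predicted rotation curve can be tabulated directly from equation (11.8) and the constant-density model. The sketch below is only an illustration: the galaxy radius (10 kpc) and density are hypothetical values chosen to give speeds of a plausible order of magnitude, and the function names are not from the notes.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
KPC = 3.086e19    # one kiloparsec in metres


def enclosed_mass(r, radius, density):
    """Mass inside r for a uniform sphere; constant once r exceeds the edge."""
    r_inside = min(r, radius)
    return (4.0 / 3.0) * math.pi * r_inside ** 3 * density


def rotation_speed(r, radius, density):
    """Circular speed from equation (11.8): v = sqrt(G M(r) / r)."""
    return math.sqrt(G * enclosed_mass(r, radius, density) / r)


# Hypothetical model galaxy (assumed values, for illustration only)
galaxy_radius = 10 * KPC    # edge of the uniform sphere
density = 1e-21             # kg per cubic metre

for r_kpc in (2, 5, 10, 20, 40):
    v = rotation_speed(r_kpc * KPC, galaxy_radius, density)
    print(f"r = {r_kpc:2d} kpc   v = {v / 1e3:6.1f} km/s")
```

Inside the edge the printed speeds double when r doubles; outside, quadrupling r halves the speed, as expected for $v \propto r^{-1/2}$.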

Figure 11.17: A sketch of actual versus predicted rotation curves for a spiral galaxy

Actual galaxy rotation curves

The first indication that real life doesn’t follow our simple calculation came from the Swiss astronomer Fritz Zwicky in the 1930s. Zwicky was an amazing character; an outspoken man who considered “humbleness a lie”, he was not liked by his colleagues. Indeed, he referred to his colleagues as “spherical bastards”, because they were “bastards, whichever way you looked at them”. Perhaps because of his unpopularity, Zwicky is often not credited for discoveries to which he can rightly claim priority. A prime example is that, in the 1930s, Zwicky showed that the motion of galaxies seemed to be at odds with what one would expect from theory.

Figure 11.17 shows the predicted, and actual, rotation curve of a spiral galaxy. As seen above, theory predicts that beyond the edge of the galaxy the rotation speed should drop with increasing radius. Instead, the rotation speed is roughly constant, even out to distances well beyond the radius which contains all the visible starlight. Re-arranging equation (11.8) gives $M(r) = v^2(r)\,r/G$, so a flat rotation curve (constant v) implies that $M(r) \propto r$. Since M(r) increases with r, even at these large radii, there must still be material there. In other words, even at radii well outside the visible extent of the galaxy, there are still large amounts of matter. Unsurprisingly, Zwicky labelled this material “Dark Matter”, since it was not visible, except through its gravitational influence. This was a major discovery which might have revolutionised astronomy at the time. Naturally, Zwicky’s colleagues ignored it, until the effect was once again discovered forty years later.

How much of galaxies is dark matter?

When we measure the rotation of galaxies, we actually observe $v'(r) = v(r) \sin i$, where i is the inclination of the galaxy. If we know the inclination of the galaxy, we can work out the true rotation speed, and hence the true mass of the galaxy. We can estimate the total amount of mass contained within the visible components of the galaxy (e.g. stars) by measuring the luminosity, and working out what mass of stars is necessary to create that luminosity. This only accounts for 10-20% of the total mass of the galaxy, as measured from the rotation curves. Therefore, something like 80-90% of the mass of galaxies is contained in dark matter.
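To make the step from an observed rotation speed to an enclosed mass concrete, the sketch below corrects an observed speed for inclination using $v = v'/\sin i$ and then applies equation (11.8) re-arranged as $M(r) = v^2 r/G$. The observed speed, the inclination and the radii are hypothetical numbers chosen for illustration, and the function names are assumptions.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
KPC = 3.086e19     # one kiloparsec in metres
M_SUN = 1.989e30   # solar mass in kg


def true_speed(v_observed, inclination_deg):
    """Undo the projection v' = v sin(i) to recover the true rotation speed."""
    return v_observed / math.sin(math.radians(inclination_deg))


def enclosed_mass(v_true, r):
    """Equation (11.8) re-arranged for the mass inside radius r."""
    return v_true ** 2 * r / G


# Hypothetical measurements of a flat rotation curve (assumed values)
v_obs = 1.7e5          # observed line-of-sight speed, m/s
inclination = 60.0     # inclination in degrees

v = true_speed(v_obs, inclination)
for r_kpc in (10, 20, 40):
    mass = enclosed_mass(v, r_kpc * KPC)
    print(f"r = {r_kpc:2d} kpc   M(r) = {mass / M_SUN:.2e} solar masses")
```

Because the speed is the same at every radius in this example, the enclosed mass doubles each time the radius doubles, which is the $M(r) \propto r$ behaviour described above.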

11.6.3 Dark Matter

What is dark matter? We know that it has mass, and that it does not emit light. This does not rule out the possibility that dark matter is made out of the same stuff as ordinary matter. Black holes, for example, have mass and emit no light by definition. Very low mass stars and brown dwarfs do emit some light, but they are so faint as to be invisible at the distances of nearby galaxies. Perhaps dark matter is made of large numbers of brown dwarfs, in the outer regions of galaxies? We can test this idea by carrying out extremely deep searches for brown dwarfs in the outer regions of our own galaxy. Such studies reveal that brown dwarfs only constitute around 6% of the dark matter in our own galaxy.

Black holes can be searched for by gravitational lensing. Einstein predicted that gravity bends light. If a black hole passes between us and a background star, its gravity will bend the light from the background star in a distinctive way. Studies using gravitational lensing have shown that black holes are not a big component of dark matter.

We are left with the possibility that dark matter is made of material entirely unlike normal matter. Dark matter must be made of particles which create and feel gravity, but interact only very weakly with light and normal matter. These particles are called WIMPs - “weakly interacting massive particles”. As yet, no-one has actually managed to detect a WIMP directly, so we know very little about their properties. Nevertheless, the search is currently ongoing, so next year the story may be different.

11.6.4 Galaxy Clusters

Galaxies are often not alone in space. Instead they tend to form large clusters of galaxies. How much do these clusters weigh? A while back, we discussed the thermal properties of matter, and derived the equipartition theorem, which states that matter in thermal equilibrium has $\frac{1}{2}kT$ of thermal energy for every degree of freedom. I also said that this could be used to solve difficult problems in astrophysics. The equipartition theorem is closely related to another theorem for systems in equilibrium, known as the virial theorem. The virial theorem states that the kinetic energy of a large, self-gravitating system is minus one half of its gravitational potential energy, or

$$-2\,\mathrm{K.E} = \mathrm{P.E}. \tag{11.9}$$

A cluster of galaxies is a large, self-gravitating system, and we can apply the virial theorem to measure the mass of the cluster.

We assume for simplicity that the cluster is a sphere of radius R, containing N galaxies, all with the same mass m. The kinetic energy of galaxy i is $m v_i^2/2$, so the total kinetic energy of the cluster is

$$\mathrm{K.E} = \frac{1}{2}\sum_{i=1}^{N} m v_i^2 = \frac{1}{2} N m \langle v^2 \rangle = \frac{1}{2} M_c \langle v^2 \rangle,$$

where $M_c = Nm$ is the total mass of the cluster. Previously, we derived equation (10.4) for the gravitational potential energy of two masses,

$$\mathrm{Grav.\ P.E} = -\frac{G m_1 m_2}{r}.$$

For a spherical collection of galaxies, the gravitational potential energy is hard to work out, but is approximately given by

$$\mathrm{P.E} \approx -\frac{3}{5}\frac{G M_c^2}{R}.$$

Applying the virial theorem, we find that

$$M_c \langle v^2 \rangle = \frac{3}{5}\frac{G M_c^2}{R}.$$

Of course, we cannot measure the true speeds of the galaxies. Instead, we use the Doppler shift to measure the radial velocity of each galaxy, $v_r$. On average, a galaxy is as likely to be moving along our line of sight as in the other two directions $(\theta, \phi)$, and so $\langle v_r^2 \rangle = \langle v_\theta^2 \rangle = \langle v_\phi^2 \rangle$. Hence we can write $3\langle v_r^2 \rangle = \langle v^2 \rangle$, which gives

$$3 M_c \langle v_r^2 \rangle = \frac{3}{5}\frac{G M_c^2}{R}.$$

Re-arranging for the mass of the galaxy cluster gives

$$M_c = \frac{5 R \langle v_r^2 \rangle}{G}. \tag{11.10}$$

Masses measured in this way are known as virial masses. An appropriate value for the size of the cluster must be adopted. This comes from combining the angular size of the cluster with a distance estimate (from Hubble’s law and the redshift of the cluster). When we calculate the virial masses of clusters, we find once again that most of the mass of the galaxy cluster is not explained by the amount of visible material. As well as being a major component of the galaxies themselves, it seems the space between galaxies is also full of dark matter!
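The steps leading to a virial mass can be sketched numerically: a distance from Hubble’s law, a physical radius from the angular size, and then equation (11.10). Everything specific in this sketch is an assumption made for illustration: the Hubble constant of 70 km/s/Mpc, the simple low-redshift form $d = cz/H_0$, the cluster redshift and angular radius, and the list of galaxy velocities are all hypothetical. The sketch also assumes that the radial velocities in equation (11.10) are measured relative to the cluster mean, so the mean is subtracted before squaring.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C_LIGHT = 2.998e8  # speed of light, m/s
MPC = 3.086e22     # one megaparsec in metres
M_SUN = 1.989e30   # solar mass in kg
H0 = 70.0          # assumed Hubble constant, km/s/Mpc


def hubble_distance(redshift, h0=H0):
    """Distance in metres from d = c z / H0 (only valid for small redshift)."""
    return (C_LIGHT / 1e3) * redshift / h0 * MPC


def cluster_radius(angular_radius_deg, distance):
    """Physical radius from the angular radius (small-angle approximation)."""
    return math.radians(angular_radius_deg) * distance


def virial_mass(radius, radial_velocities):
    """Equation (11.10), using velocities relative to the cluster mean."""
    mean_v = sum(radial_velocities) / len(radial_velocities)
    mean_square = sum((v - mean_v) ** 2 for v in radial_velocities) / len(radial_velocities)
    return 5.0 * radius * mean_square / G


# Hypothetical cluster: redshift, angular radius and galaxy velocities (m/s)
z_cluster = 0.023
angular_radius = 1.0   # degrees
velocities = [6.1e6, 7.9e6, 6.8e6, 7.4e6, 5.9e6, 8.1e6, 7.0e6, 6.6e6]

distance = hubble_distance(z_cluster)
radius = cluster_radius(angular_radius, distance)
mass = virial_mass(radius, velocities)

print(f"distance = {distance / MPC:.1f} Mpc")
print(f"radius   = {radius / MPC:.2f} Mpc")
print(f"virial mass = {mass / M_SUN:.2e} solar masses")
```

The result depends linearly on the adopted radius and on the mean square velocity, so uncertainties in the distance estimate feed directly into the virial mass.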
