Astronomy 700: Radiation.

1 Basic Radiation Properties

1.1 Basic definitions

Fundamental importance to Astronomy: Almost exclusive carrier of information

Radiation: Energy transport by electromagnetic fields

Other forms of energy transport:

cosmic rays • stochastic transport (micro: conduction, macro: convection) • gravitational waves • bulk transport (organized flows) • plasma waves • ... •

Transport time variability (see section of E&M) →

1.1.1 The spectrum

The most natural description of electromagnetic radiation is through Fourier decomposition into waves:

f(~r, t) f(~k,ν) (1.1) ↔ where E is some variable describing the radiation field.

Question: Why is this so natural?

As we will shortly see, electromagnetic radiation naturally decomposes into waves with wave- length λ and frequency ν

1 Often, it is convenient to write the wave vector ~k =2πk/λˆ and angular frequency ω =2πν. In vacuum, group and phase velocity of those waves are equal:

10 1 λν = ∂ω/∂k c 2.99792... 10 cms− (1.2) ≡ ≡ ×

Fourier decomposition allows us to describe the local spectrum of the radiation at a fixed point in space as the Fourier transform

∞ f f(ν)= dtei2πνtf(t) (1.3) F ≡ Z−∞ and the inverse Fourier transform

1 ∞ i2πνt − f f(t)= dνe− f(ν) (1.4) F ≡ Z−∞ Without going into any details on Lebesque integration, it is worth pointing out the following identity: The inverse Fourier transform of a delta function in frequency is

∞ 1 i2πνt i2πν0t − δ(ν ν )= dνe− δ(ν ν )= e− (1.5) F − 0 − 0 Z−∞

i2πν0 t Thus, the Fourier transform of e− is

∞ i2πν0t i2π(ν ν0)t e− = dte − = δ(ν ν ) (1.6) F − 0 Z−∞ as one would expect for a decomposition into a spectrum of different exponentials.

N.B.: Fourier transform conventions often differ in normalization and definition of frequency (ω vs. ν), so take care to make sure not to miss factors of 2π along the way.

Broad classification into spectral regions: Astronomy has fallen historically into different bands, enforced by fundamentally different observ- ing techniques. The following figure shows the rough classification of spectral regions (largely anecdotal):

2

radio IR UV X-rays γ-rays visible

9 12 15 18

ν [Hz]

Figure 1.1: Common classification into different spectral bands, dictated by observational tech- nique

1.1.2 Wave-particle duality and the radiative limit

Since we measure radiation only through the interaction with matter, wave propagation in vacuum is not all that interesting. When radiation interacts with matter, quantum mechanics can become important. As Planck found (and as we will see later in the semester), phase space is granular, with a fundamental quantized phase space volume of h3.

Photon energy: Eν = hν

28 where h is the h 6.626075... 10− ergss ≡ × are relativistic, thus their momentum is directly proportional to their energy:

Photon momentum: Pν = hν/c

N.B.: The classical description of electromagnetic waves is valid only in the statistical limit of many photons.

The optical limit of radiation (“ray treatment”) is further only valid in the limit of large distances r: r λ ≫ This is the limit that we will concentrate on for the next sections. Finally, the wave field we detect from astronomical sources is virtually always in the far field limit:

3 r λc/u, where u is the characteristic velocity of the radiating particles. ≫ The far-field limit guarantees that the optical limit is safely satisfied for the travel of light between the source and us, though quantum mechanical effects come into play when the radiation inter- acts with the detector, where characteristic distances can be smaller than the wavelength of the radiation.

1.1.3 Observables and basic definitions of radiation quantities

Astronomers want to measure radiation. Suppose you wanted to build a light capturing device, what quantities could you measure, and what properties of the device would be important?

Question: What are the properties describing a “ray” of light?

Observable: total energy of light captured dE Detector properties:

Angular resolution/pixel scale: solid angle dΩ • Bandwidth: frequency/wavelength range dν • Exposure: time interval of measurement dt • Effective area: total sensitive area of detector dA • Independent variables:

position ~r • ˆ direction ~k • frequency ν • time t • (polarization angle ψ) •

A) Specific intensity or (surface) brightness: Iν ˆ Define a quantity that describes completely how the measured energy dE depends on ~r, ~k, ν, t, and describes the radiation field as completely as possible (neglecting polarization) given the information from the detector:

dE I (~r, t, k,νˆ ) (1.7) ν ≡ dν dtdΩ dA 4 Figure 1.2: Sketch of wave direction, differential surface element, and solid angle. Note: for Iν, dA is taken as perpendicular to kˆ, while flux Fν is measured with respect to some fixed dA and integrates over all solid angles (i.e., kˆ does not have to be perpendicular to dA).

where the surface dA is taken perpendicular to the direction of the ray kˆ, as shown in the figure.

From Iν we can construct a number of important integrated quantities, or “moments” with respect to different observables.

B) Mean intensity: Jν

The zeroth moment of Iν with respect to the polar angle cos θ is the mean intensity:

1 J dΩI (1.8) ν ≡ 4π ν Z4π

C) Specific flux: Fν Energy flux at frequency ν across a surface dA, integrated over all photon directions (coming from both sides of the surface).

F dΩcosθI (1.9) ν ≡ ν Z4π where θ is measured relative to the normal of dA.

Fν is the first moment of Iν with respect to cos θ.

For an unresolved object, we can typically not determine the solid angle spanned by the object.

5 The flux is therefore the most general quantity we can derive directly from any measurements for the object.

N.B.: Fν =0 for isotropic radiation.

Sometimes it is useful to define a photon number flux. Since eν = hν, this is simply

Φν = Fν/hν (1.10)

D) Total flux: F Total energy flux across dA:

∞ F dνF (1.11) ≡ ν Z0

E) Radiation pressure: prad Total momentum flux across dA:

∞ I p = dν dΩcos2 θ ν (1.12) rad c Z0 Z4π

2 Question: prad is the second moment of Iν w.r.t. cosθ. Where does the cos θ come from?

F) Radiative energy density: uν Amount of radiative energy contained per unit volume. Differential volume element: dV = dAds, where ds is measured along any given ray. Since light always travels at c, we have dt = ds/c and can integrate over all rays and write duν = dE/(dAdsdν)= dE/(cdtdAdν):

I dE u = du = dΩ ν = (1.13) ν ν c dAdsdν Z Z4π

which can easily be verified to have the proper dimensions, and

∞ u = dνuν (1.14) Z0

6 N.B.: For isotropic radiation, u =3p, which is an important analog to relativistic particles (verify this as a quick exercise in integration in spherical polar coordinates).

G) Photon density: nν

Number of photons per unit frequency interval per unit volume. Since eν = hν, we have

u n = ν (1.15) ν hν

H) Luminosity: Lν Specific (spectral) luminosity

Lν = dAFν (1.16) I and total, “bolometric” luminosity:

L = dAF (1.17) I measure the total radiative output of an astronomical object. This is mostly what is of astrophysical interest and it has to be extrapolated from a set of measurements.

33 1 Example: The sun has a luminosity of L =3.83 10 ergss− . ⊙ ×

Question: At a distance of r =1.5 1013 cm, what is the solar flux measured on earth (i.e., the solar constant)? ×

I) Spectral index: α

It is often convenient to plot spectra on a log-log plot in frequency, log Fν vs. log ν. Many emission processes produce powerlaw-type spectra over some frequency range,

α F ν− (1.18) ν ∝ which appear as straight lines in a log-log plot. It is then useful to define a local spectral index

∂log F α ν (1.19) ≡− ∂log ν

7 which is the local slope of the spectrum in a log-log plot and is well defined even for non-powerlaw spectra, with the implicit understanding that α depends on frequency. Sometimes it is also useful to define a photon index Γ

∂logΦ Γ ν = α +1 (1.20) ≡− ∂log ν

N.B.: Sometimes the sign convention for the powerlaw index is switched, so make sure you understand how α is defined in a specific application

J) ν λ conversions: − Given λν = c, We can express any specific quantity fν = df/dν with respect to wavelength instead:

df df dλ λ2 c f = = f or f f (1.21) ν dν dλ dν ≡ λ c λ ≡ ν=c/λ λ2

Note the sign conversion implicit here such that both fν and fλ are always defined as positive.

1.1.4 Inverse square law

From eq. (1.16) and energy conservation, it follows that the flux from an isotropic radiation source must obey

L F = ν (1.22) ν 4πr2

For large distances (r Robj), the small angle approximation implies that radiation appears quasi-isotropic and the inverse≫ square law holds even for intrinsically non-isotropic sources. This approximation holds as long as dln I /dθ r/R. ν ≪

1.1.5 Conservation of surface brightness

Consider an object of surface area dS and an observer along a ray ~k at distance r. In the small 2 angle approximation, the solid angle subtended by the source is dΩS = dS/r , while the area dA 2 over which a given bundle of rays is spread is proportional to dA = dΩAr . Thus

dE dE r2 ν ν (1.23) Iν = = 2 2 2 = const dtdν dΩS dA dtdν (dS/r )(dΩAr ) ∝ r

Ergo: Intensity or surface brightness is independent of distance to the object!

8 dS k dA

r

Figure 1.3: Surface brightness along ray ~k. Area spanned by ray bundle dA is proportional to r2, 2 solid angle of source seen by observer is proportional to r− .

1.1.6 Measured quantities:

What detectors often measure is some integrated quantity

∆E = dν dA dt dΩIν (1.24) Z∆ν Z∆A Z∆t Z∆Ω where ∆A is the “effective area” of the detector (which factors in all inefficiencies of the tele- scope system). For sufficiently narrow ranges in these variables we can then approximate Iν ∆E/(∆ν∆A∆t∆Ω). ≈

Figure 1.4: The classic Johnson UBVRI filter transmission curves as a function of wavelength.

The bandwidth (frequency range) ∆ν can be set by a spectrograph or filter. Typical optical fil- ters used in Astronomical observations (which you will come across in many talks) adhere to the

9 “Johnson” UBVRI system. The filter responses for these bands are plotted in Fig. 1.4.

Magnitudes: Astronomers have long measured optical fluxes in logarithmic units (magnitudes). This was par- ticularly convenient before the age of calculators but has stuck around somehow like many archaic scientific habits (e.g., Fortran). Definition:

m = 2.5log F C (1.25) λ − λ − λ m = 2.5log F C (1.26) ν − ν − ν

where the constants Cν,λ depend on the specific filter used and the assumed spectrum of the object. For the classic UBVRI filter system, these constants are (assuming the star Vega as a template, Bessel et al. 1998):

Filter: U B V R I J H K

λeff [A]˚ 3660 4380 5450 6410 7980 12200 16300 21900 Cν 49.4 48.5 48.6 48.8 49.0 49.5 50.0 50.5 Cλ 20.9 20.5 21.2 21.7 22.4 23.8 24.9 25.9

Table 1.1: Zero points and central wavelengths for standard photometric bands (see Fig. 1.4 for 2 1 1 2 1 1 UBVRI bands). Fluxes are measured in ergs cm− s− Hz− and ergs cm− s− A˚ − respectively.

1.2 Blackbody radiation

Thermal equilibrium: all reactions are balanced, no net transfer of energy from one state to another (see discussion later in the semesters). Matter can be in thermal equilibrium (we will discuss this statement at length later in the semester), in which case it will emit thermal radiation. If this radiation is in complete thermal equilibrium with the matter (the requirement for radiation and matter to equilibrate fully, such that the photons achieve a thermal distribution is that the object be completely optically thick in addition to being in thermal equilibrium), the radition field is called blackbody radiation.

N.B.: Blackbody radiation is always thermal radiation, but thermal radiation is not necessarily blackbody radiation.

10 Figure 1.5: Blackbody Gedankenexperiment (Rybicki & Lightman,1979).

Gedankenexperiment: blackbody (isolated enclosure), at some T. After sufficient time, all species, including radiation, will be in equilibrium with each other inside the blackbody. Thus, Iν will achieve some asymptotic equilibrium form Bν.

Question: Can Bν depend on the microscopic or macroscopic details of the container?

Bring two containers (each in thermal equilibrium and both of equal temperature) in contact (allow them to exchange photons of specific frequency range dν and solid angle dΩ across some small opening of area dA). The total energy flux across contact point is:

dE dE 1 = 2 =(I I )dAdν dΩ (1.27) dt − dt ν,1 − ν,2 where each of the two containers emits radiation in thermal equilibrium with the container. However, the two containers are in thermal equilibrium with each other (at the same tempera- ture). The second law of thermodynamics implies that no energy can flow between two systems in equilibrium and so:

dE dE 1 = 2 =0 andthus I = I (1.28) dt dt ν,1 ν,2

The intensity of a blackbody cannot depend on any details of the container and must be a universal function Bν only of temperature.

Bν is called the Planck function and is given by

2hν3/c2 B = (1.29) ν ehν/kT 1 − 11 N.B.: Bν has no dependence on angle - it is isotropic.

Bν is an expression of the statistical mechanics for massless bosons (integer spin particles). We will derive the exact functional form and normalization later in the semester.

Question: What are some Astronomical examples of blackbody spectra?

1.2.1 Properties of Bν

Figure 1.6: Plot of the Planck curve for different as a function of frequency. Note: 2 Conversion to Bλ requires multiplication by c/λ . Rybicki & Lightman 1979.

The exponential in the denominator becomes large for hν kT , at which point B must fall ≫ ν exponentially. Thus, visual inspection indicates that Bν peaks around hν = kT .

A) Rayleigh-Jeans’ law: At low frequencies, hν kT , we can approximate the exponential by its first order Taylor expan- sion: ≪

3 2 2 RJ 2hν /c 2ν kT Bν kT/h = (1.30) ≪ ≈ (1 + hν/kT ) 1 c2 −

12 which is a powerlaw with index α = 2. −

N.B.: Planck’s constant is absent from the Rayleigh-Jeans law - it can be derived entirely from classical .

RJ Ultraviolet catastrophe: The integral of dνBν diverges, which is, however, of no concern, since we know B falls exponentially above kT . ν R B) Wien law: At high frequencies, ehν/kT 1 and we can approximate the spectrum as ≫

3 hν/kT W 2hν e− Bν kt/h (1.31) ≫ ≈ c2 which drops rapidly with ν (see figure), about 30 orders of magnitude in two decades in frequency.

C) Monotonicity:

Both the Wien law and the Rayleigh-Jeans law are monotonic in temperature, with Bν rising at every frequency with rising T . This suggests that Bν is monotonic as well. Simple derivation:

∂B (T ) 2hν3 ehν/kT hν ν (1.32) = 2 2 2 > 0 ∂T c "−(ehν/kT 1) # −kT −  

D) Wien-displacement:

The suggested peak in Bν can be found by setting ∂Bν/∂ν =0 to be

hνpeak =2.82kT (1.33)

The peak in the blackbody spectrum is directly proportional to the temperature. Finding the peak of a blackbody spectrum allows one to determine the temperature of a blackbody (see color tem- perature)

E) Flux and luminosity: The flux emitted at the surface of a blackbody thermal emitter (i.e., at distance r =0) is

2h (kT )4 x3 2πk4 π4 F = dΩcos θ dνB = π dx = T 4 σT 4 (1.34) S ν c2 h4 ex 1 h3c2 15 ≡ Z2π Z Z − 13 where σ is the Stefan-. The luminosity of a convex (not self-shadowing) blackbody of surface area A is then given by

4 L = AFS = AσT (1.35)

Knowing this allows us to estimate surface areas of blackbody emitters at known distances and with measured temperatures and fluxes (assuming isotropic emission).

1.2.2 Characteristic temperatures

Based on the simple temperature dependence of Bν, it is often convenient to define temperature measures of a radiation field.

A) Brightness temperature:

The temperature a blackbody spectrum would have at the measured intensity Iν. This works be- cause Bν is monotonic. Then,

hν (1.36) kTb = 3 2 ln[1+2hν /(Iνc )] which, in the Rayleigh-Jeans limit hν kT , reduces to the simple formula ≪ c2I kT = ν (1.37) b 2ν2

This expression is most often used in radio astronomy, where we are almost always in the Rayleigh- Jeans limit.

B) Color temperature: Determined by fitting a black body curve to a measured spectrum of unknown brightness (if the object is unresolved). This is often done in X-ray Astronomy, for example.

hν kT peak (1.38) c ≡ 2.82

C) Effective temperature: Determined by equating the measured total flux to that from a black body:

4 F = σTeff (1.39)

14 2 Radiative Transfer

2.1 Basic radiative transfer of absorbing media

Radiative transfer is the study of the interaction of radiation with matter. Kinds of interaction one might encounter:

• absorption • scattering • gravitational lensing • ... •

The theory of radiative transfer deals with the first four.

2.1.1 Absorption

For now, we will concentrate on absorption and leave scattering for later. There are many possible absorption processes, which we will learn about later on in the semester.

dV = dA ds

Iν − dI ν

σν

dA dA ds area dA as seen by I ν

Figure 2.1: Sketch of absorption of light along a ray by differential volume element dV , contain- ing dN = ndV particles.

Examples:

15 Resonance line absorption • Free-free absorption • Synchrotron-self absorption • ... • Consider a system of particles of density n that can absorb and emit light. Each particle has an effective cross section that describes its efficiency of absorbing light.

cross section σν: area of particle blocked by particle for light at given frequency, solid angle

N.B.: The following description makes the assumption that the extent of the radius of curvature of the wave front is much larger than the particle size scale, i.e., geometric optics.

Derivation of the absorption equation: Consider a light ray moving through a volume dV = Ads, with A perpendicular to the ray (ds measures the path length along the ray). dV contains dN = nAds particles. The total area blocked by the particles is

dS = dNσν = nAdsσν (2.1)

and the fraction of A blocked by the absorbing particles is

f dS/A = nσds (2.2) block ≡ The fraction of light absorbed in dV is

dI /I = dS/A = nσ ds (2.3) ν ν − − ν which we can rewrite to derive the absorption part of the radiative transfer equation:

dI ν = nσ I α I (2.4) ds − ν ν ≡− ν ν

with the absorption coefficient α nσ . ν ≡ ν

Question: What is the solution to this equation?

It is often convenient to introduce the opacity

κ σ/m = α /ρ where m is the average mass of per particle. ν ≡ ν

16 2.1.2 Spontaneous emission

Like absorption, there are many possible reasons for a particle to emit light spontaneously:

Line emission • Free-free emission (bremsstrahlung) • Synchrotron emission • ... •

We define the emission coefficient per unit volume jν as

dE dI j ν = ν (2.5) ν ≡ dV dtdνdΩ ds

With the same definition of the path length ds along the ray, spontaneous emission will add inten- sity to the ray as

dV dI = j = j ds (2.6) ν ν dA ν

which has the solution

s Iν(s)= Iν(s0)+ ds′jν(s′) (2.7) Z0

2.1.3 Stimulated emission

Was first introduced by Einstein in order to correctly derive the Planck spectrum. Quantum view: Bosons like each other, i.e., it is easier to add photons to a state that is already populated. For Fermions, the equivalent process would be “suppressed emission”.

Stimulated emission is proportional to Iν and exactly coherent with the incoming radiation field.

Interestingly, it is most easily treated as “negative absorption” and taken into an effective αν

α α α (2.8) ν ≡ ν,abs − ν,stim which leaves the form of the equations unchanged. As it turns out, knowing αν,abs implies that we also know αν,stim and vice versa (see Einstein relations below), so this mathematical step is straight forward and well motivated (much more so than trying to combine stimulated and spontaneous emission).

17 2.1.4 The equation of radiative transfer

Combining absorption and emission, we have

dI ν = α I + j (2.9) ds − ν ν ν

Given knowledge of αν and jν, we can readily solve for Iν(s). First, we divide by αν:

dIν jν = Iν + (2.10) ανds − αν then we define new variables: The optical depth τν such that

dτ α ds (2.11) ν ≡ ν

and the source function Sν

jν Sν (2.12) ≡ αν

Then we have the equation of radiative transfer:

dIν = Iν + Sν (2.13) dτν −

N.B.: Iν increases/decreases if Sν is larger/smaller than Iν

To solve formally, we multiply by the integrating factor eτν :

τν dIν τν τν e + e Iν = e Sν (2.14) dτν which we can re-write as

τν d (e Iν) τν = e Sν (2.15) dτν which we can integrate over τν to give

τν τν τ e I (τ ) I (0) = dτ ′ e ν′ S (2.16) ν ν − ν ν ν Z0 18 or

τ τν τν′ τν Iν(τν)= e− Iν(0) + dτν′ e − Sν (2.17) Z0

Optical depth τν: variable that measures how much a radiation field has been affected by matter

τν s s τν(s)= dτν′ = ds′ αν = ds′ nσν (2.18) Z0 Z0 Z0

The surface where τν =1 is called the “photosphere” of an object. Integrating by parts, the average optical depth a photon travels to is

τν dττ e− τ =1 (2.19) ν dτe τ h i≡ R − R so we define the

Mean free path: mean path length s a photon can travel before being absorbed, at the local value of αν (local quantity)

τν 1 τν λmfp s h i = since ds = (2.20) ≡h i≡ αν nσν αν

2.1.5 Reddening

One of the great advantages of expressing radiative quantities as a log is that multiplicative factors like extinction can be written as additive instead.

Say the optical depth towards an object is τν. The apparent magnitude towards the object is then

m = 2.5log F C +2.5log(e) τ 2.5log F C + A(ν) (2.21) ν − ν|unabs. − ν · ν ≡− ν|unabs. − ν where F is the unabsorbed flux we would observe for τ =0 and where we assume that the ν|unabs. ν filter is sufficiently narrow that τν does not change appreciably across it. A(ν) is the extinction.

If τν changes between filters, the result is that some filters will be more absorbed than others. Typi- cally, absorption is more efficient at shorter wavelengths (Rayleigh scattering has a ν4 dependence, for example). Thus, bluer bands will be more strongly affected than redder bands. The result is a “reddened” spectrum. Interstellar dust is the most common cause of reddening.

19 This can be conveniently expressed in terms of colors. Say we have two bands B and V. The B-V color is then given by

F B V = m m = 2.5log B (C C )+2.5log e (τ τ ) (2.22) − B − V − F − B − V · B − V  V  unabs.

and the color “excess” over the intrinsic colors is given by

E(B V) (B V) (B V) =2.5log e (τ τ )= A A (2.23) − ≡ − obs. − − unabs. · B − V B − V

If the intrinsic colors of an object (B V ) unabs. are known (e.g., for a star of a given spectral type), we can determine τ τ and,− with| some knowledge of the effective absorption cross B − V sections σB and σV, we can estimate the column depth

N dsn (2.24) dust ≡ dust Z of dust towards the object, since E(B V)=2.5 log e N (σ σ )=1.086 N (σ σ ). − · dust B − V dust B − V A convenient, dimensionless way to characterize the extinction law is through the parameters

A σ ν = ν (2.25) AV σV and

AV 1 RV = (2.26) ≡ E(B V) σB 1 − σV − For some “typical” Milkyway dust composition and gas-to-dust ratio,

R 3.1 (2.27) V ≈ and N A H (2.28) V ≈ 1.8 1021 cm 2 × −

where NH is the hydrogen column density NH = dsnH towards the absorbed object. R 2.2 Kirchhoff’s law and the Einstein Coefficients

Thermal emission: by material in thermal equilibrium (as opposed to blackbody radiation, in which case the radiation itself is in thermal equilibrium).

A thermal emitter at temperature Tem brought into a blackbody radiation field of the same temper- ature Trad = Tem must be in thermal equilibrium with the radiation field.

20 Figure 2.2: Different extinction laws, including the typical Milkyway extinction with RV =3.1.

2.2.1 Kirchhoff’s law

Gedankenexperiment: Place a thermal emitter into a blackbody enclosure of the same tempera- ture. The new system is again a blackbody. = the emitter cannot change I inside the container. ⇒ ν = dIν =0 inside the emitter. ⇒ dτν Since Iν = Bν, the equation of radiative transfer (eq. (2.13) gives:

dIν = = Bν + Sν =0 inside the emitter, or ⇒ dτν −

Sν = Bν(T ) (2.29)

With the definition of Sν = jν/αν, we have Kirchhoff’s law:

jν = ανBν(T ) (2.30)

which is valid for all thermal emitters and absorbers.

N.B.: Kirchhoff’s law is independent of the specific radiation mechanism (e.g., line or continuum emission) and must be an expression of an underlying symmetry.

Question: What symmetry might be at play here, given what “thermal equilibrium” means?

Looking again at the solution to the radiative transfer equation (for simplicity assuming a uniform

21 medium inside the thermal emitter/absorber), eq. (2.17),

τν τν τ τν τν I (τ )= e− I (0) + dτ ′ e ν′ − S = B (T )+ e− (I (0) B (T )) (2.31) ν ν ν ν ν ν ν − ν Z0 we find

lim Iν = Bν (2.32) τν →∞ which explains the requirement to have a “black body” to achieve blackbody radiation.

N.B.: Iν does not have to equal Bν, since Sν is only the source function. Only in the limit τ does thermal radiation approach the blackbody spectrum. ν → ∞

Examples of non-blackbody thermal emission:

Thermal bremsstrahlung • Thermal plasma emission (optically thin collisionally ionized plasma) • Thermal cyclotron emission • Thermal comptonization • ... •

2.2.2 The Einstein coefficients

Note: We will make use of some concepts here that we will define later in the semester: emis- sion/absorption lines and Boltzmann statistics. Einstein analyzed the underlying nature of Kirchhoff’s law. He considered a two-level atom (levels 1 and 2) with energies E1 and E2 >E1 and a radiatively allowed transition between the two states with exchange energy hν = E E . 0 2 − 1 Each state has a statistical weight, g1 and g2, respectively (degeneracy factors in populating that state with electrons).

Assume the transition has some finite width in frequency (energies of hν0 ∆ν/2, at the minimum, given by the uncertainty principle), following some functional form called± the (normalized) line profile function φν (see Fig. 2.3).

In what follows, we will assume that Iν does not vary appreciably across ∆ν. Then all integrals over ν can be approximated by a delta function.

σ φ = ν δ(ν ν ) and dνφ(ν)=1 (2.33) ν σ ≈ − 0 Z 22 2

ν

I

h 0 ν, ν A21 B12 Iν B21 Iν Iν

φ slowly varying

1 ν0

Figure 2.3: Left: Sketch of Einstein’s two level atom. Right: Line profile function for 1-2 transition (narrow compared to dλ/d ln Iν).

Three types of radiative transitions are possible:

A) Spontaneous emission: Spontaneous, isotropic emission of photons as an atom transitions from 2 to 1, emitting a photon with energy hν0 distributed according to line profile function φ(ν).

Probability of atom in state 2 to transition to 1 by spontaneous emission per unit time: A21.

Transition rate: proportional to n2 and rate coefficient A21

∂n ∂ ln n 2 = n A or A 2 (2.34) ∂t − 2 21 21 ≡ − ∂t spont. spont

B) Absorption:

The transition rate into state 2 should be proportional to incident mean intensity Jν, averaged over the line profile function, and the target density of atoms in the ground state, n1.

We define B12 such that the transition rate per atom from state 1 to state 2 by absorption of a photon

of energy hν0 is B12Jν0 . The transition rate per volume is

∂n 2 n B J (ν ) (2.35) ∂t ≡ 1 12 ν 0 abs.

where we used dν φ J J (ν ). ν ν ≈ ν 0 R 23 C) Stimulated emission: Isotropic probability of an atom in state 2 to emit a photon with energy hν through stimulated emission, per unit time: B21.

The transition rate out of state 2 should proportional to incident mean intensity Jν and the target density of atoms in the excited state, n2. We define B21 such that the transition rate per atom from state 2 to state 1 by stimulated emission of a photon of energy hν0 id B21Jν0 . The transition rate per volume is

∂n 2 n J (ν )B (2.36) ∂t ≡− 2 ν 0 21 stim.

2.2.3 The Einstein relations

In thermal equilibrium:

∂n 2 = n J (ν )B n J (ν )B n A =0 (2.37) ∂t 1 ν 0 12 − 2 ν 0 21 − 2 21

Solving for Jν:

n A A /B J (ν )= 2 21 = 21 21 (2.38) ν 0 n B n B (n /n )(B /B ) 1 1 12 − 2 21 1 2 12 21 − In thermal equilibrium, densities follow the Boltzmann relation:

E1/kT (E E ) (hν ) n1 g1 e− g1 2− 1 g1 0 = = e kT = e kT (2.39) E2/kT n2 g2 e− g2 g2 which we will discuss later in the semester.

N.B.: For hν kT , we have n /n g /g , while for hν kT , we have n /n 1. 0 ≪ 2 1 ≈ 2 1 0 ≫ 2 1 ≪

This gives

A /B J (ν )= 21 21 (2.40) ν 0 hν0 (g /g )(B /B )e kT 1 1 2 12 21 − For τ and thermal equilibrium, we have ν → ∞

3 2 2hν0 /c Jν Bν = (2.41) → ehν0/kT 1 − 24 We can thus equate coefficients:

A 2hν3 g B 21 0 1 12 (2.42) = 2 and =1 B21 c g2B21

These are the Einstein relations. They link all three radiative transition probabilities to each other, i.e., they are only one parameter. If we know, say, B12, we can calculate A21 and B21 from it. One of the these will have to be determined from quantum mechanical considerations, which will then suffice to completely describe the transitions between 1 and 2 via radiative processes.

N.B.: kT does not appear in these relations - they are valid regardless of whether the material is in thermal equilibrium or not.

The Einstein relations are a generalization of Kirchhoff’s law for bound-bound transitions. This is an example of what is know as “detailed balance relations”.

Detailed balance: The principle of microscopic time reversibility (time-symmetry). Each process has a time-inverse and the cumulative probability of all channels going one way must equal that of all channels going the other way in thermal equilibrium. It is a stronger statement than just equilibrium, which implies only that there is no net change. Detailed balance implies that, in thermal equilibrium, every reaction has zero flux.

2.2.4 Absorption and emission coefficients in terms of Einstein coefficients

We can re-express αν and jν in terms of the Einstein coefficients and the line profile function φν. Spontaneous emission:

A21 is the rate coefficient of how many hν0 photons (on average) get emitted per atom in state 2. The total rate per volume is then A21n2 and the total power emitted per volume is

dE = hνA n (2.43) dV dt 21 2

1 The emission is distributed over a frequency range following φν (note: φν has units of ν− ):

dE ν = hνA nφ (2.44) dV dtdν 21 ν and is isotropic (emitted equally into all solid angles), so

dE hνA nφ j = ν = 21 ν (2.45) ν dV dtdνdΩ 4π 25 Absorption and stimulated emission:

We once again lump absorption and stimulated emission into one coefficient, αν. Recall that B12Jν is the average transition rate per atom.

The transition rate per volume per unit frequency interval is B12Jν(ν0)φν. Absorption is isotropic, thus the rate of energy removed per unit volume per unit solid− angle per unit frequency is

dE hν dn hν = 1 = n B φ I = α I (2.46) dV dtdνdΩ 4π dtdν −4π 1 12 ν ν − ν|abs ν

Thus:

hν α = n B φ (2.47) ν|abs. 4π 1 12 ν

Similarly, we have

hν α = n B φ (2.48) ν|stim. −4π 2 21 ν

The total absorption coefficient is then

hν hν g1 n2 α = (n B n B ) φ = n B 1 φ (2.49) ν 4π 1 12 − 2 21 ν 4π 1 12 − g2 n1 ν  

Corollary: αν is positive as long as the term inside the brackets is positive. This is always the case for thermal populations, where

n2 g1 hν0 = e− kT < 1 (2.50) n1 g2

However, when a “population inversion” occurs (g1n2 > g2n1), we can have a negative absorption coefficient. In this case, intensity increases along a ray in a coherent fashion due to stimulated emission. Such a situation can arise when radiative selection rules and a pumping mechanism keep an upper level populated. This is the “MASER/” condition. For sufficiently large negative optical depths, the exponential intensification can be very large.

N.B.: LASER stands for Light Amplified by Stimulated Emission of Radiation. It requires a negative absorption coefficient by means of a population inversion. By definition, LASER emission is non-thermal.

26 2.3 Radiative transfer with scattering

We now return to the radiative transfer equation. In writing the formal solution (eq. 2.17) we implicitly assumed that the source function Sν is known locally. This is, of course, rarely the case. In particular, the source function can be complicated for media that are out of thermal equilibrium - it can be anisotropic, it can vary spectrally, it can be modi- fied by scattering, etc. And even for thermal media, the local temperature can depend on the radiation field. In these cases, one must solve a network of equations self consistently.

2.3.1 Scattering

Scattering can redistribute photons both in angle and in energy. Example: Compton scattering will change a photon’s energy and angle. This means that the scattered emissivity is a convolution of the incoming intensity with a nor- ˆ′ ˆ malized angular/energy redistribution function R(ν′,~k ,ν,~k) that encapsulates the scattering cross section,

sc ~ˆ ∞ ~ˆ′ ~ˆ jν (k)=˜σνρ dν′ dΩ′R(ν′, k ,ν, k)Iν′ (2.51) Z0 Z4π

where σ˜νρ is the total scattering coefficient (similar to the absorption coefficient αabs).

N.B.: It is customary to define the scattering opacity as σν, the same symbol used for the scat- tering cross section. The difference is that the scattering opacity is the scattering cross section per particle mass. Since we defined it as σ˜ , we have σ˜ = σ (n/ρ). Yet others ν ν ν · (including Rybicki & Lightman) use σν for the scattering coefficient σ˜νρ, just to confuse everybody completely.

Because jν depends on the complete information contained in Iν, it is immediately clear that even the simplest possible examples of scattering atmospheres can be difficult to solve.

2.3.2 Isotropic scattering

Occurs when atoms are strongly perturbed by collisions such that the scattered photon is out of phase with the incoming photon and loses angular coherence completely. Then, scattering is isotropic and energy is redistributed according to the line profile function (e.g., resonant scattering, which is technically absorption and re-emission of a line photon)

φν′ φν R(ν′,ν)= (2.52) 4π 27 and

sc ˆ σρ˜ ∞ ˆ′ ∞ j (~k)= φ dν′φ dΩ′I (~k )=˜σρ φ dν′φ J (2.53) ν 4π · ν ν′ ν′ · ν ν′ ν′ Z0 Z4π Z0

where the emission profile φν′ is identical to the absorption profile φν.

2.3.3 Coherent scattering (ν′ = ν)

Consider scattering with no energy shift, but angle redistribution may occur (e.g., Thompson scat- tering).

ˆ′ ˆ ˆ′ ˆ R(ν′,~k ,ν,~k)= R(~k ,~k)δ(ν ν′) (2.54) − and

sc ˆ′ ˆ jν =σ ˜νρ dΩ′R(~k ,~k)Iν (2.55) Z4π

2.3.4 Coherent isotropic scattering

The simplest case: no energy change and isotropy (e.g., Thompson scattering for an unpolarized beam of light). This idealized case is appropriate for example in complete local thermodynamic equilibrium (see 2.4). §

ˆ σ˜νρ j sc(~k)= dΩI =σ ˜ ρJ (2.56) ν 4π ν ν ν Z4π

In this case, the radiative transfer equation takes on the simple form

dI ν = (κ +˜σ ) ρI + jem + jsc ds − ν ν ν ν ν = (κ +˜σ )ρI + jem +˜σ ρJ (2.57) − ν ν ν ν ν ν

We now define the extinction coefficient αν as

αν =(κν +˜σν)ρ (2.58)

28 and the usual

dτν = ανds (2.59)

Dividing eq. 2.57 by αν, we get

em dIν jν σ˜νρJν = Iν + + (2.60) dτν − αν αν

We can now write the source function as

S (jem +˜σ ρJ )/α (2.61) ν ≡ ν ν ν ν With this, we can reduce the radiative transfer equation to the same form we derived for pure absorption:

dI ν = I + S (2.62) dτ − ν ν

with the understanding that the scattering term of the emissivity depends on the local mean inten- sity. In the following we will assume that scattering is isotropic and coherent and use the radiative transfer equation written in eq. 2.62

2.4 Local Thermodynamic Equilibrium (LTE)

We have loosely used the terminology of thermal equilibrium as defined such that the net reaction rates between different states of a system are, on average, zero. This implies that the probability distributions of states accessible to particles in the system under consideration do not change with time. We will derive the conditions for LTE when we discuss statistical mechanics but it is worth defining it here more specifically. We say a system is in local thermodynamic equilibrium if

1. Free particles follow a Maxwellian distribution

m 3/2 mv2 f(v)= e− 2kT (2.63) 2πkT 2. The distribution functions of electrons across energy states in atoms follow the Boltzmann distribution E E nj gj j − i = e− kT (2.64) ni gi

29 3. The distribution functions of different ionization states (e.g., singly ionized vs. doubly ion- ized) follow the so-called Saha equation (ionization is specified by charge Z), which is a generalization of the Boltzmann distribution to multiple particles:

3/2 E E nZ+1ne gZ+1ge 2πmekT Z+1− Z = e− kT (2.65) n g h2 Z Z   However, as noted before, this does not in itself imply that the radiation field itself is also in equilibrium with the matter, i.e., the intensity does not have to equal the Planck function. Matter in LTE must emit thermal radiation. Again assuming the simple case of isotropic and coherent scattering, we can use Kirchhoff’s law to rewrite the source function in LTE as

em jν +˜σνρJν κνρBν +˜σνρJν κνBν +˜σνJν Sν = = = (2.66) (κν +˜σν) ρ (κν +˜σν) ρ κν +˜σν

2.5 Solving the radiative transfer equation in a plane parallel atmosphere

The plane parallel atmosphere is an appropriate approximation in many instances. In particular, it can be used to describe spherically symmetric radiative transfer problems if the radius of curvature of the atmosphere is much larger than the mean free path, Rαν 1 (more generally, it is a good approximation if second derivatives of physical variables are small≫ over an optical depth). Consider the radiative transfer through a slab of material. Let the z-coordinate measure the direc- tion perpendicular to the slab. Let the polar angle measured with respect to the z-axis be θ and the azimuthal angle be φ, as usual. Let the slab have a thickness zmax. Since z is the only preferred direction, it is clear that the problem cannot depend on φ in any way. We are free to average over φ and reduce the problem to one in two dimensions, z and θ. It is convenient to define new variables,

µ = cos θ with dµ = sin θdθ (2.67) − and the optical depth along the vertical direction (implicitly understood to be a function of ν)

dτ = dz α (2.68) z − ν which we distinguish from the optical depth along a given ray

dτ = dsα = dτ /µ (2.69) s − ν z In terms of these variables, the radiative transfer equation can be re-written as

dIν µdIν = = Iν Sν (2.70) dτs dτz −

30 The plane parallel semi-infinite atmosphere Approximately plane parallel atmosphere

fre e spa c e z

Iν(μ<0)=0 τ =0 surfa c e line of sight, s z atmomsphere (t op of a t m osphe re )

θ outgoing

τ s τ 0<µ<1 z photosphere R *

τ =1 z µ=0

ingoing atmosphere −1<µ<0

τz=τm a x (bot t om of a t m osphe re )

Figure 2.4: The nomenclature of radiative transfer in the case of a plane parallel atmosphere (left) and an illustration of why this is a good approximation in many cases of stellar and planetary atmospheres.

τz/µ Multiplying this equation by e− /µ and re-arranging the terms, we can integrate the equation over τz to get the formal solution

τ τz,max τ z,max dτz′ z′ I (τ =0)= I (τ )e− µ + S (τ ′ )e− µ (2.71) ν z ν z,max µ ν z Z0 where τz,max is the optical depth at the back end of the atmosphere.

N.B.: In the typical notation of the plane parallel atmosphere, the optical depth is measured into the medium, rather than towards the observer.

In the case that τz,max 1, it is often appropriate to take the limit τz,max . This case is called the semi-infinite plane≫ parallel atmosphere. In this case, the solution simplifies→ ∞ to

τz,max ∞ dτ ′ τz′ µ z µ lim Iν(τz =0) = lim Iν(τz,max)e− + Sν(τz′ )e− τz,max τz,max µ →∞ →∞ Z0 h τ i ∞ dτz′ z′ = S (τ ′ )e− µ (2.72) µ ν z Z0

31 2.5.1 Examples

1. Accretion disks: A flat accretion disk is a good example of a plane parallel atmosphere. For a radiatively efficient accretion disk, the disk thickness (basically the density or pressure scale height of the disk) is small compared to the orbital radius, H R, so vertical gradients are much larger than radial ones. ≪ To lowest order, the radial gradients can be ignored and the disk can be regarded as plane parallel. Since typical accretion “disks” (as opposed to some other types of accretion flows) are very optically thick, they can be regarded as semi-infinite slabs. Note that accretion disks can have coronae that can violate the H R condition, and some accretion flows of low radiative efficiency also have large scale heights.≪ In those cases, a planar parallel treatment is not validated.

2. Stars: Stars are quasi-spherical, but the thickness of their atmospheres is typically much smaller than the stellar radius R , i.e, Rα 1. ⋆ ν ≫ Furthermore, stars are spherically symmetric to lowest order, making all gradients radial, i.e., perpendicular to the atmosphere. We can therefore locally approximate the atmosphere as plane parallel. For an observer at distance r R , a bundle of parallel rays will strike the stellar surface at ≫ ⋆ different angles µ = cos θ (see Fig. 2.4). Since the atmosphere is a very thin layer, this angle between the line of sight and the local vertical stays constant along the line of sight (it changes only if R changes significantly along the line of sight). In calculating the intensity at a given point of the stellar surface for a distant observer, one simply has to make a translation to viewing a plane parallel atmosphere at different incidence angles µ. It is straight forward to show that the flux F received by an observer at distance r R is ν,⋆ ≫ ⋆ R2 F = ⋆ F (τ =0+ ǫ) (2.73) ν,⋆ r2 ν,slab z

where Fν,slab is the local flux emitted at a fixed point at the outer edge of the plane parallel atmosphere (the τ =0 surface), where the ingoing intensity vanishes.

2.5.2 The Eddington-Barbier relation

In highly optically thick situations, as encountered in stars, for example, the radiative mean free path is often short compared to the local scale height.

32 Limb darkening Solar disk in visible light

θ=90°, µ=0 τ=1

T 2 < T 1 T 4 < T 3 θ T 3 < line of sight T 2 T 4 photosphere ( τ=1 surface)

θ=0°, µ=1

I4(µ 4=1) = Sν(τ=1)

I3(µ 3) < I4(µ 4)

I2(µ 2) < I3(µ 3)

I1(µ 1) < I2(µ 2)

Figure 2.5: Sketch of the effect of the temperature structure on the brightness distribution across the star (left) and observed limb darkening of the solar disk (right).

In these circumstances, a linear approximation of the source function optical depth τz is appropri- ate:

dSν Sν(τz) Sν(τz = 0)+ (τz = 0) τz (2.74) ≈ dτz · S + S τ (2.75) ≡ ν,0 ν,1 z

The solution to the radiative transfer equation at the surface of the atmosphere (using the semi- infinite approximation and integration by parts) is then

∞ dτz τz I (0,µ) = (S + S τ ) e− µ ν µ ν,0 ν,1 z Z0 ∞ dτz τz = S + S τ e− µ ν,0 ν,1 µ z Z0 ∞ dτz τz τz = S + S µ e− µ ν,0 ν,1 µ µ Z0 = S + S µ ν,0 ν,1 · = Sν(τz = µ)

= Sν(τs = τz/µ = 1) (2.76)

i.e., the intensity at the surface is given by the source function at optical depth τs =1.

33 Solar continuum opacity (Vögler, Bruls, & Schüssler 2004) Sketch of solar atmospheric temperature profile

8 Lyman edge (912 Å) Corona

6 sketch of line profile

]

-1 4

g

2

[cm continuum ν 2 λ

κ Photosphere

log log

0 line core Temperature (T) Temperature

line wings −2 bound-free opacity free-free opacity Chromosphere 1000 Å3000 Å 10000 Å 30000 Å R λ (on a log scale) τ

Figure 2.6: Model of the solar opacity (left) and sketch of the temperature profile of the solar atmosphere (right), showing temperature inversion towards the corona and a sketch of an absorp- tion line profile formed in this temperature structure (inset).

The flux at the surface is outgoing only, so

1 1 F = F =2π dµ µI =2π dµ µ (S + S µ) (2.77) ν +,ν · ν · ν,0 ν,1 Z0 Z0 S µ2 S µ3 1 2 = 2π ν,0 + ν,1 = π S + S (2.78) 2 3 ν,0 ν,1 3  0   = πSν (τz =2/3) (2.79)

N.B.: This is the Eddington-Barbier relation. It states that the intensity at the surface of a linear plane parallel atmosphere is equal to the source function at optical depth τs = 1, and that the flux is set by the source function (and thus temperature) at optical depth τz =2/3.

The immediate implication of this relation is that, even under LTE conditions, the radiation field is non-local, i.e., determined by conditions roughly an optical depth away.

N.B.: The region around the τν =1 surface is called the photosphere.

N.B.: From the Eddington-Barbier relation, we see that the effective temperature of a photo- sphere corresponds to the coditions at optical depth τz =2/3.

34 2.5.3 Limb darkening and line formation

Stars have temperature structure, with the temperature in the stellar interior being hottest. The temperature in the atmosphere is generally very much lower than in the core (where it needs to be high to sustain nuclear burning). This has important consequences for the appearance. Consider the simplest case of a monotonically decreasing temperature in the star. The outer layers of the star’s atmosphere will be colder than the inner layers (see Fig. 2.5 for an illustration). A distant observer will see light from across the stellar disk. Following Eddington-Barbier, at the pole (the center of the disk), the intensity is equal to that at an atmospheric depth of τz = µ =1.

At points away from the pole the light will have emanated from a depth τz = µ < 1. The closer to the equator, the shallower the emission region dominating the emitted intensity towards the observer. Since the temperature in higher layers of the atmosphere is lower, this implies that the emission towards the limb of the star (in the outer stellar disk) less bright than that from the center of the stellar disk. This phenomenon is called limb darkening and is observed, for example, in the sun (see Fig. 2.5). It is possible, however, that the temperature can vary with radius in a non-monotonic fashion, i.e., increasing towards the τz =0 surface of the star. In this case, the opposite effect must be true, the star is limb brightened. Since the optical depth depends on the frequency through the opacity, limb brightening is weaker at frequencies/wavelengths where the opacity is larger (because the photosphere is thinner, spanning a smaller range in T ).

Figure 2.6 shows a solar opacity model, illustrating the large variations in κν both due to line opacity and continuum opacity. The result is that the solar disk is more strongly limb-darkened at wavelengths near the minima in κν, around λ = 15000A˚ and λ = 3000A˚ . Since the optical depth is highest in absorption lines, the core of a line is emitted very close to the surface of the star (i.e., very high in the atmosphere), at a much shallower depth than the continuum or even the wings of the line. Typically, the temperature of a star is colder at larger radii, making the line appear strongly absorbed. However, if a star’s atmosphere has an inverted temperature structure, as is the case in the sun’s outermost layers and as sketched in Fig. 2.6 (right hand side), it is possible that the photons in the come from a region where the source function is larger than in the region where the continuum is emitted. In that case the line appears in emission. Depending on the location of the photosphere for the line, it is possible that the core of the line appears in emission while the wings appear in absorption. The outermost layers of stars (coronae) are generally not in LTE. As a result, Kirchhoff’s law no longer holds, and one cannot assume the source function to be Planckian anymore. Thus, even though the temperature might be very high in the corona, it is possible that the source function in

35 the core of a very optically thick line is well below the Planck function for that temperature. This can give rise to complex line profiles (e.g., doubly inverted line profiles, with a bump and another dip within the bump in the line center).

2.6 Moment equations

As we have seen, the local intensity depends on the source function in a non-local way (at about an optical depth away). Where exactly the τ = 1 surface at any given point is depends on the extinction coefficient along the way, which can itself depend on the intensity. This is what makes radiative transfer complex. Going back to our definitions from 1.1.3, recall the first three moments of the intensity, rewriting them slightly by integrating over µ (using§ sin θdθ = dµ) and averaging: − 1 1 1 dΩ dµ (2.80) 4π 4π −→ 2 1 Z Z− which gives the three Eddington moments:

1 1 c 1. Zeroth moment: Jν = 2 1 dµIν = 4π uν −

1R 1 Fν 2. First moment: Hν = 2 1 µdµIν = 4π − 1 R 1 2 c 3. Second moment: Kν = 2 1 µ dµIν = 4π pν − R N.B.: These are average quantities rather than integral quantities (thus the difference in 4π be- tween the moments we defined earlier and the Eddington moments defined here).

We are now in a position to take moments of the plane parallel radiative transfer equation by integrating it over µ.

2.6.1 Zeroth moment equation and radiative equilibrium

Using the fact that Sν is isotropic, we can write down the zeroth moment equation:

1 1 dIν dµ µ = Iν Sν (2.81) 2 1 dτz − −   Z 1 1 1 dIν 1 = dµµ = Jν dµSν (2.82) ⇒ 2 1 dτz − 2 1 Z− Z− d 1 1 = dµµIν = Jν Sν (2.83) ⇒ dτz 2 1 − Z− 36 dHν = = Jν Sν (2.84) ⇒ dτz −

Recall that in LTE, the source function is Sν = (κνBν +˜σνJν)/(κν +˜σν). We can then re-write the zeroth moment equation as:

dHν dH κνBν +˜σνJν κν (Jν Bν) = = Jν = − (2.85) dτz −(κν +˜σν) ρdz − κν +˜σν κν +˜σν which we re-arrange to give

dH = κ ρ (B J ) (2.86) dz ν ν − ν

Corollary: For a source in radiative equilibrium, the radiation field cannot add or deposit any net energy to the gas locally. This is equivalent to the statement that the divergence of the radiative flux must vanish, ~ F~ =0, which in our 1D approximation implies ∇ dH ∞ dH ν = dν ν =0 (2.87) dz dz Z0 ∞ = dν κ ρ (B J )=0 (2.88) ⇒ ν ν − ν Z0 ∞ ∞ = dνκ J = dνκ B (2.89) ⇒ ν ν ν ν Z0 Z0 i.e., the amount of energy absorbed must equal the amount of energy re-emitted.

N.B.: Isotropic, coherent scattering does not enter into the divergence of the flux — σ˜ν cancels out in this equation. Thus, scattering does not change the energy balance.

2.6.2 The first moment equation and closure

We can similarly take the first moment of the radiative transfer equation:

1 1 dIν µdµ µ = Iν Sν (2.90) 2 1 dτz − −   Z 1 1 1 1 d 1 2 1 1 1 = µ dµIν = µdµIν µdµSν = µdµIν (2.91) ⇒ dτz 2 1 2 1 − 2 1 2 1  Z−  Z− Z− Z− dKν = = Hν (2.92) ⇒ dτz

37 Writing this in terms of physical variables again:

dKν 1 dKν = (κν +˜σν) ρHν or Hν = (2.93) dz − −(κν +˜σν) ρ dz

Note that scattering does not cancel out here. We now have 2 moment equations:

dHν = Jν Sν (2.94) dτz − dKν = Hν (2.95) dτz and three unknowns: Jν, Hν, and Kν. We could take higher order moments of the radiative transfer equation, but we would simply add one more unknown (a higher order moment of Iν for every new equation added to the set). To make progress (reach closure), we will have to make some simplifying assumptions that close the system of equations.

2.7 The Eddington approximation

Recall that for an isotropic radiation field, the radiation energy density is 3 times the radiation pressure and that the flux is zero. In this case, we can eliminate one of the three unknowns (Hν) and we can close the system, since Jν =3Kν.

However, such a radiation field is boring: from eq. (2.95), we see that Kν = const. and eq. (2.94) implies that an isotropic radiation field, Jν = Sν. Since any objects of astrophysical interest will generally have some net flux, this solution is irrelevant. Consider the next most simplest case, an atmosphere that is isotropic in each radiation hemisphere, but with incoming and outgoing intensity not necessarily equal,

I for µ 0 I = + (2.96) ν I for µ<≥ 0  − This is sometimes called a split monopole configuration. In this case, we can integrate the moments directly:

1 0 1 1 1 Jν = dµI + dµI+ = (I+ + I ) (2.97) 2 1 − 2 0 2 − Z− Z 1 0 1 1 1 Hν = µdµI + µdµI+ = (I+ I ) (2.98) 2 1 − 2 0 4 − − − Z 0 Z 1 1 2 1 2 1 1 Kν = µ dµI + µ dµI+ = (I+ + I )= Jν (2.99) 2 1 − 2 0 6 − 3 Z− Z 38 Note that (a) the flux does not vanish, and (b) the same relation between Kν and Jν derived for the isotropic case still holds.

N.B.: Taking Kν = Jν/3 is the so-called Eddington approximation

There is a family of simplified radiation fields that satisfy the Eddington approximation. Closely related to the split monopole field is the so-called two-stream approximation, in which the intensity is described by two delta functions in µ,

I = I [δ(µ µ )+ δ(ν + µ )] (2.100) ν ν,0 − 0 0

It can easily be shown (see Rybicki & Lightman) that this reproduces the Eddington approximation only for µ0 =1/√3. In many cases, the radiation field will be close to isotropic (e.g., inside a star). Then, it is often appropriate to linearize the intensity in µ, such that

Iν = aν + bνµ (2.101)

Since Iν is the sum of an odd and an even power of µ, we can immediately that one of the two must always vanish when taking a moment. We find

1 1 1 1 1 2 Jν = dµ (aν + µbν)= aνµ + bνµ = aν (2.102) 2 1 2 2 1 Z−  − 1 1 1 1 1 2 1 3 bν Hν = µdµ (aν + µbν)= aνµ + bνµ = (2.103) 2 1 2 2 3 1 3 Z−  − 1 1 1 2 1 1 3 1 4 aν Jν Kν = µ dµ (aν + µbν)= aνµ + bνµ = = (2.104) 2 1 2 3 4 1 3 3 Z−  − thus, the Eddington approximation holds once again. This is the most realistic Ansatz to arrive at the Eddington approximation.

Substituting Kν = Jν/3 into the first moment equation, we get

1 dJν = Hν (2.105) 3 dτz which we can substitute into the zeroth moment equation (using eq. 2.85) to get

1 d2J ρκ ν ν (2.106) 2 = Jν Sν = (Jν Bν) 3 dτ − ρ (κν +˜σν) −

39 which is a second order equation that can be solved if the boundary conditions are properly spec- ified (e.g., H given at the top of the atmosphere and J B in the limit τ ) and the ν ν → ν z → ∞ temperature structure T (τz) and the opacity structure κν(τz) of the medium are known. This equa- tion is often called the radiative diffusion equation.

The problem we are now left with is the complicated frequency dependence of κν (see Fig. 2.6), which is implicitl in τz(ν) (the optical depth is frequency dependent). Because of this, different frequencies can affect each other in complicated ways. In addition, the temperature structure of the medium typically depends on the radiation field, which in turn determines κν. The way to proceed analytically is to make some radically simplifying assumptions about κν. The goal is to integrate the equations over ν and replace κν with an appropriately weighted mean.

2.8 Weighted opacities

With the Eddington approximation in hand, motivated by a set of different simplifying assump- tions, we can define some mean opacities to solve the radiative transfer equation appropriately. We will proceed using the split monopole approximation from above to make things simple, but things can be easily adjusted for the other cases.

2.8.1 Flux weighted opacities in an equilibrium atmosphere

Let us consider the atmosphere of an object in radiative equilibrium. Recall that in this case, the divergence of the flux must vanish:

d ∞ ∞ dν H =0 and dνH H = const (2.107) dz ν ν ≡ Z0 Z0 Using the Eddington approximation, let’s integrate the first moment equation over frequency:

∞ 1 dJ ∞ dν ν = dν ρ (κ +˜σ ) H (2.108) 3 dz − ν ν ν Z0 Z0 Defining the flux weighted opacity

∞ 0 dν (κν +˜σν) Hν kF (2.109) ≡ ∞ dνH R 0 ν R and dropping subscripts to denote frequency integrated quantities, the integrated moment equation becomes

1 dJ = ρk H (2.110) 3 dz − F 40 Question: What is the major problem with this approach?

We can re-define the optical depth

dτ ρk dz (2.111) F ≡ F to re-write the equation as

dJ = 3 kF H = const (2.112) dτF −

which we can integrate to give

J =3 HτF + const. (2.113)

Recall from the split monopole radiation field that J and H are

1 J = (I+ + I ) (2.114) 2 − 1 H = (I+ I ) (2.115) 4 − − (2.116)

Clearly, at the surface of the atmosphere, i.e., at τF = 0, the incoming intensity must vanish, I (τF =0)=0. There, we have −

1 J = I =2H (2.117) 2 +

which is the flux emitted by the atmosphere. This is the boundary condition we need in order to determine the integration constant in eq. (2.110):

2 J =3 Hτ +2 H =3 H τ + (2.118) F F 3  

So, in the equilibrium atmosphere, the intensity increases linearly in τ. The problem with this approach is that we averaged the opacity over the flux without actually knowing the flux. This requires an iterative solution.

41 2.8.2 The grey approximation in an equilibrium atmosphere

As a functional lowest order approximation, let’s assume that κν is constant across the relevant spectral range. This is, in fact, a good approximation for the scattering part of the opacity (not so much for the absorption opacity). Such an opacity is called grey opacity, since it is frequency independent. Given that we can usually assume that σ˜ν is grey, this translates into

κν = κG = const (2.119)

The grey opacity is naturally flux weighted so eq. (2.118) still holds. Let us once again consider an equilibrium atmosphere, so we can use the radiative equilibrium condition:

4 ∞ ∞ σ T dν κ J = κ J = dν κ B = κ SB (2.120) ν ν G ν ν G π Z0 Z0 or

σ T 4 J = B = SB (2.121) π

Recall the relation between H and flux:

F =4πH (2.122) and that the flux from a black body is

4 F =4πH = σSBTeff (2.123)

Note that a grey opacity is by default flux weighted, so using eq. (2.118) we have

σ T 4 σ T 4 2 J = SB =3 SB eff τ + (2.124) π 4π G 3   We have thus solved for the temperature structure of the atmosphere (still as a function of optical depth):

1 3 2 4 T (τ )= T τ + (2.125) G eff 4 G 3   

If the density of the atmosphere is known, we can solve for T (z).

42 For stellar and planetary atmospheres it is often appropriate to assume hydrostatic equilibrium in an externally imposed gravitational field. In this case, determining the density and therefore T (z) is straight forward. This is one of the key tasks of stellar atmospheres calculations and is important for the calculation of stellar interior conditions as well.

2.8.3 Rosseland Mean Opacity and the diffusion approximation

Recall that Jν Bν for τ 1 for thermal emission. Then recall eq. (2.106), which we can re-write as → ≫

d2J κ ν ν (2.126) 2 = (Jν Bν) 3dτz (κν +˜σν) −

Let’s assume that the opacity is large, such that Bν can be considered linear in τz, such that 2 2 d Bν/dτz 0 (i.e., we can truncate the Taylor expansion after the linear term). Using the last statement in≈ the left hand side, we can re-write eq. (2.126) above as

d2J d2 (J B ) κ ν ν ν ν (2.127) 2 = −2 = (Jν Bν) 3dτz 3dτz κν +˜σν − which has the solution

τ 3κν κ +sigma˜ Jν = Bν +(Jν,0 Bν,0)e− q ν ν − τ/τ = B +(J B )e− th (2.128) ν ν,0 − ν,0 where we define the thermalization depth

κ +˜σ τ ν ν (2.129) th 3κ ≡ r ν The thermalization depth is the optical depth at which the mean intensity approaches the local Planck function. Thus, deep within the atmosphere (say, in the interior of a star), where τ τ , ≫ th we can safely assume that Jν = Bν. Then, we can rewrite the first moment equation

1 1 dBν Hν = −3 (κν +˜σν) ρ dz 1 1 dB dT = ν (2.130) −3 (κν +˜σν) ρ dT dz

43 where we used the chain rule and made the assumption that the temperature of the atmosphere is monotonic in τ. Knowing the flux, we define the properly flux-weighted Rosseland Mean Opacity:

1 ∞ ∞ − κR = dν (κν +˜σν) Hν dν Hν 0 0 Z Z  1 ∞ 1 dB dT ∞ 1 1 dB dT − = dν ν dν − 0 3ρ dT dz − 0 3 (κν +˜σν) ρ dT dz  Z  Z 1  dB ∞ 1 dB − = dν ν (2.131) dT κ +˜σ dT  Z0 ν ν 

The Rosseland opacity is inversely weighted, i.e., dominated by regions of the spectrum where κν is small.

Note: The Rosseland mean opacity is usually written as κR, but it includes scattering, so is not purely an absorption opacity. Returning to the expression for the flux, we can now integrate over ν to get

1 1 dB dT 16σ T 3 dT F =4πH = = SB (2.132) −3 κRρ dT dz − 3κRρ dz

which is often also called the radiative diffusion equation because it applies in the deep interiors of stars where radiation transport is diffusive. Note that this radiative diffusion equation is different from that in eq. 2.106

2 For stars in equilibrium, we can directly relate this to the star’s luminosity, which is L⋆ =4πR⋆F to give

dT 3κ ρ L R ⋆ (2.133) = 3 2 dR −16σSBT 4πR⋆

which is one of the fundamental equations of stellar structure. Among other things, this relation is responsible for steep temperature gradients in stars where a lot of energy has to be transported outward, which leads to the formation of convection zones (in the envelope for low mass stars and in the core for high mass stars).

44 3 Electrodynamics

3.1 Basic Electrodynamics

We define electric and magnetic fields through Maxwell’s equations and the definition of charge and current density.

3.1.1 Maxwell’s equations, fields, and potentials

Maxwell’s equations in media with µ = ǫ =1 (no dielectric or permeable media):

B~ = 0 (MagneticGauss/solenoidal condition) (3.1) ∇ 1 ∂B E~ + = 0 (Faraday) (3.2) ∇× c ∂t E~ = 4πρ (Gauss) (3.3) ∇ 1 ∂E~ 4π B~ = ~j (Ampere/induction equation) (3.4) ∇× − c ∂t c

N.B.: The beauty of CGS units is that electromagnetic field units are expressed in terms of centimeters, grams, and seconds. This makes the vacuum dielectric constant ǫ0 and per- meability µ0 equal to unity.

For a system of particles, we can express the charge and current density ρ and ~j at position ~r as

ρ = q δ(~r ~r ) and ~j = q ~v δ(~r ~r ) (3.5) i − i i i − i i i X X which, in many-particle systems, can often be smoothed sufficiently to regard ρ and~j as continuous variables.

N.B.: Equation (3.1) is a stronger statement than Gauss’ law (which just states that there is a source term for electric fields: electric charge). It is an empirical law that states that there are no magnetic charges, i.e., no magnetic monopoles. Its validity is still being tested.

The condition B~ =0 implies that we can represent the field as the curl of another vector field A~ ∇

B~ = A~ (3.6) ∇× where A~ is the vector potential defining B~ . We call such a divergence free field solenoidal.

45 We recall that it is always possible to write the electric field as

∂A~ E~ = φ (3.7) −∇ − c∂t

where φ is the electrostatic potential.

N.B.: it is sufficient to specify four components of the EM field - φ and A~. Thus, the six compo- nents of E~ and B~ are not all independent from each other.

3.1.2 Conservation laws

Take the divergence of Ampere’s law:

4π 1 ∂ B~ =0= ~j + E~ (3.8) ∇· ∇× c ∇· c ∂t∇  

and ∂t of Gauss’ law:

∂ ∂ρ E~ =4π (3.9) ∂t∇ ∂t

Putting things together, we have:

∂ρ + ~j =0 (3.10) ∂t ∇

Interpretation: the current density ~j is the charge flux, so this is equation is simply an expression of charge conservation.

N.B.: All conservation laws for some quantity n can be written as ∂ ρ + f~ = S where ρ is t n ∇ n n n the volume density of n, f~n is the flux of n, and Sn are any sources and sinks

Next, let’s add Coulomb and Lorentz force density together and dot it with the velocity, which gives the power density (work done by/on the EM field per unit voume):

1 ~vf~ = ~v q δ(~r ~r ) E~ + ~j B~ i i − i c × i   X 1 = ~jE~ + ~j ~j B~ = ~jE~ (3.11) c ×   46 Dotting Ampere’s law with E~ and using A~ B~ = B~ A~ A~ B~ gives: ∇× ∇× −∇ ×       c 1 ∂E~ E~~j = E~ B~ E~ 4π ∇× − 4π ∂t c   c 1 ∂E2 = B~ E~ E~ B~ (3.12) 4π ∇× − 4π ∇ × − 8π ∂t     Dotting Faraday’s law with B~ gives

1 ∂B~ 1 ∂B2 B~ E~ = B~ = (3.13) ∇× − c ∂t −2c ∂t   Putting both together, we have the following conservation law:

1 ∂ c ~jE~ + E2 + B2 = E~ B~ S~ (3.14) 8π ∂t −∇ 4π × ≡ −∇    We are thus lead to introduce the electromagnetic energy density

E2 B2 u = + (3.15) EM 8π 8π

and the Pointing vector, which is the EM energy flux vector,

c S~ = E~ B~ (3.16) 4π ×

Which leads to the more compact form:

∂u EM + S~ = ~jE~ (3.17) ∂t ∇ −

i.e., the total change in energy density must equal the work done by/on the EM field. Work done by/on the EM field is considered a sink, since the energy goes to or comes from a different source.

47 3.2 Radiation

3.2.1 EM fields in vacuum

In vacuum, the source terms vanish, i.e., ~j = 0 and ρ = 0, which simplifies Maxwell’s equations (written in terms of the potentials) a lot:

φ ∂2 φ 2 = (3.18) ∇ A~ c2∂t2 A~    

The most general solution to this homogeneous equation is any function

f = f ~k~x ωt (3.19) ±   with ω/k = c.

N.B.: This solution implies that f is stationary on curves described by t = ~k~x/ω. These curves are the so-called “characteristics” of the equation, along which solutions± propagate.

To derive the B-field, we take the curl of A~ = A~(~k~x ωt): ±

B~ = A~ = ~k A~′(~k~x ωt) (3.20) ∇× × ±

which implies that B~ is perpendicular to the propagation direction ~k and again only a function of ~k~x ωt. ± It is straight forward to show that the electric field in this case is simply

E~ = kˆ B~ (3.21) ∓ ×

Thus, the most general solution to the vacuum equations are electro-magnetic signals, traveling at the , along straight lines ~k, with E~ B~ and B = E. Any arbitrary solution can be decomposed into a set of such basic solutions. ⊥

N.B.: There is equal amounts of energy in the electric and the magnetic part of an electromagnetic wave.

A particularly useful solution is the plane wave solution:

i(~k~x ωt) E~ = E~ωe ± (3.22)

48 which is a plane, monochromatic, sinusoidal wave. Fourier decomposition allows us to write any signal E~ (~k~x ωt) as a linear combination of an infinite set of such plane waves. ±

N.B.: The condition E~ ~k leaves two degrees of freedom for E~ , i.e., the components perpen- dicular to ~k. These⊥ are the two polarization states.

Say ~k zˆ, then we can have E and E , independent from each other. The same holds for B~ . k x y

3.2.2 The Lienard–Wiechert potentials

Next, consider the more complicated case of a point charge (i.e., a charged particle) with charge q at position ~x0(t) and with velocity ~v(t) and some acceleration ~a, seen by an observer at location ~x. ˆ This situation is shown in Fig. 3.1. We define ~r ~x ~x and ~k ~r/r. ≡ − 0 ≡ For an unaccelerated charge at rest (i.e., v =0, a =0), the observer measures the potentials

q q φ = = and A~ =0 (3.23) ~x ~x r | − 0| However, if we set the charge in motion, we must now Lorentz transform the four-potential into the observer’s frame (including the coordinates where and when the potential is measured). Information about the fields travels at the speed of light, so at any given point in time, the local potential is given by the position of the charge one “light travel time” ago. The resulting Lorentz-transformed four-potential is then given by:

q q~v φ = and A~ = (3.24) κr ret cκr ret h i  

where the notation [...]ret indicates that a quantity inside the brackets is to be evaluated at the retarded time defined such that t tret = r(tret)/c. We have used the usual definition of β~ ~v/c. We also define − ≡

ˆ ~rˆ ~v ~k ~v v κ 1 ret · ret =1 · ret =1 ret cos ϑ (3.25) ret ≡ − c − c − c ret where ϑ is the angle between the direction of motion of the particle and the line of sight.

N.B.: A~ and φ are the Lienard-Wiechert potentials of a moving charge.

As expected, the meaning of the retardation is that the potential at a particular point is set by where

49 1 the particle was a light crossing time rret/c ago . That is: the fields transport information about the particle’s position at exactly the speed of light. Two things stand out: ˆ A) The term ~k~v/c in κ is negligible for v c. It is only important for relativistic charges and, together with the time retardation factor γ,≪ accounts for relativistic Doppler boosting. It implies that field lines from an approaching charge get bunched together, making a stronger field.

B) The potentials are evaluated at the retarded time tret = t r/c. This implies that the potential at the current position is given by the point where the particle− crosses the observer’s “light cone”. Information about the potential travels exactly with the speed of light. The retarded time for a sample particle trajectory is shown in the sketch in Fig. 3.1.

t time-like

observer future light cone x=0

observer future light cone 6 a sin θ

θ forbidden

t5 t=0 x,y,z

t4 k space-like τ5 5 a t3 τ4 4 t2 past light cone τ3 3 t1 past light cone particle v τ2 2 retarded times retarded charged τ1 1 particle

Figure 3.1: Light cone and retarded time τ for a charged particle, as seen by the observer (sta- tionary at x =0, observed at time t). In the figure, as is customary in relativity, c =1.

3.2.3 The electromagnetic field of accelerated charges

Let’s assume that the charge moves non-relativistically, with v c. Then we can set κ 1. In this case eq. (3.24) indicates that the four-potential is constant≪ on spheres of radius R, centered∼ on where the charge was a light-travel time R/c ago. That is, the potential is constant on a set of nested spheres which are expanding at speed c and are off-set with respect to each other.

1 Because tret depends on rret, which itself depends on tret, both rret and tret have to be solved for self-consistently.

50 The relative offset of the center of two spheres with radius R and R + δR is δx = δR v/c, which is negligible to lowest order in v/c. ·

Because the spheres of constant A~ move out at speed c, we can relate the retarded time t on a sphere to its radius r: t(r)=(R r)/c (recall: forward in retarded time means inward in radius, towards the charge, thus the minus− sign) where we arbitrarily picked t =0 for r = R. Next, recall the vector identities

ψA~ = ψ A~ + ψ A~ (3.26) ∇× ∇ × ∇×   ~r ˆ r = = ~k (3.27) ∇ r ˆ 1 ~k = (3.28) ∇r −r2

and, since the velocity ~v depends on r only implicitly through t(r)=(R r)/c: − ˆ ˆ d~v d~v ~k d~v ~k ~v(t(r)) = t(r) =[ (R r) /c] = − = ~a (3.29) ∇× ∇ × dt ∇ − × dt c × dt × c

Then we can evaluate the curl of A~:

ˆ q~v 1 q~v q ~k q~v q ˆ B~ = = + ~v = + ~a ~k (3.30) ∇× cr ∇ r × c cr∇× −r2 × c c2r ×  

In a similar way, we can derive the electric field (to lowest order in v/c):

q ˆ q ˆ ˆ E~ = ~k + ~k ~k ~a (3.31) r2 c2r × ×   2 1 The first terms depend on the distance to the second power, r− , while the second terms go as r− . 2 Since the electromagnetic energy flux goes like E~ B~ the terms proportional to r− drop off very × 1 rapidly and we are left with the terms proportion to r− . This is the “radiation” part of the fields:

q ˆ ˆ q E~ = ~k ~k ~a and E = E~ = a sin θ (3.32) c2r × × c2r  

and

q ˆ q B~ = ~a ~k and B = B~ = a sin θ (3.33) c2r × c2r

51 ˆ N.B.: The instantaneous B-field is perpendicular to ~a and ~k: the radiation is 100% polarized. The ˆ electric field vector is perpendicular to ~k and B~ , but not necessarily perpendicular to ~a (it is perpendicular to ~a only for the direction ~k ~a). ⊥

3.2.4 The Larmor Formula

We can now calculate the radiated power: The Poynting flux (power per unit area) is

c c ˆ q2a2 sin2(θ)ˆ S~ = E~ B~ = B2~k = ~k (3.34) 4π rad × rad 4π 4πc3r2

Using dA/dΩ= r2, the power per unit solid angle is

dW dW dA q2a2 sin2(θ) = = (3.35) dtdΩ dt dA dΩ 4πc3

where W is the radiated energy, following Rybicki & Lightman’s notation. The integrated power is

dW q2a2 sin2(θ) P = dΩ = dφ sin(θ)dθ (3.36) dΩ∂t 4πc3 Z Z 2πq2a2 1 2 (3.37) = 3 d cos(θ)sin (θ) 4πc 1 =4/3 Z−  2q2a2 = (3.38) 3c3

N.B.: P = (2q2a2) / (3c3) is called the Larmor formula, which is easily memorized and used frequently throughout these notes. Remember: the 2 in the numerator goes with the squares of q and a, and the 3 in the denominator goes with the cube of c.

Note also that no radiation is emitted in the direction of acceleration and the emitted power is forward-backward symmetric (only valid in the non-relativistic case).

3.2.5 Dipole radiation

So far we have considered a single point charge. In this case, the retardation factor is constant for the entire source and in the previous derivation the only effect it had in the Larmor formiula was

52 that the power arrived at the observer a light travel time later. For a system of charges, there is a finite light travel time difference across the emitter and in principle this could complicate life. Let us consider the simplest possible scenario: dipole radiation.

ˆ → → → • ∆tret ~ (r 1 − r 2) k / c

→ →r ,k 1 → R1

2 R

Figure 3.2: Dipole approximation: source size R λ r and v c. ≪ ≪ ≪

For a system of accelerated charges, the collective B-field is

qi~ai,ret ˆ B (~r, t)= k~ (3.39) rad c2r × i i i,ret X This is not at all a trivial expression since the retarded times for different particles are different, and the radiation from each particle has to be added coherently (i.e., with phase information). However, under certain conditions, the light-travel time difference across the charge distribution is negligible compared to the acceleration time (the time over which the charge distribution varies considerably). Then, the phase off-set between different parts of the object is negligible, and we are free to first take the sum and then evaluate it at the retarded time of the entire charge distribution. This is the case when, for any two charges qi and qj contributing to the charge distribution, the 1 retarded time difference is small compared to a characteristic period τ = ν− ,

ri rj 1 λ | − | ν− = (3.40) c ≪ c

For a characteristic size of the charge distribution of R, this gives the condition the condition that

R λ (3.41) ≪

53 Under such conditions, it is appropriate to expand eq. (3.39) in powers of the small parameter (r r )/λ. For now, we will concentrate on the zero-order term. i − j

We define the Dipole moment: d~ q ~r ≡ i i i P ~˙ ~ q It is worth noting that d = P m for particles with equal mass-to-charge ratio. Using eqn. (3.35) and the notation from Fig. 3.2 we can now write

¨ ~ˆ qi~ri k ¨~ ~ˆ qi~ai ˆ i d k B~ = ~k = × = × (3.42) rad c2r ×  c2r c2r i P X ˆ 1 ˆ ˆ ¨ E~ = B~ ~k = ~k ~k d~ (3.43) rad rad × c2r × ×   We can thus write the Larmor formula for dipole radiation:

dW d¨2 sin2 (θ) dW 2d¨2 = = (3.44) dΩdt 4πc3 dt 3c3

3.2.6 Fourier transforming the Larmor dipole formula:

First, let’s Fourier transform the fields, E(t) = d¨(t)sin(θ)/c2r and B(t) = E(t). We can use the fact that the Fourier transform is analytic and write the second derivative of the dipole moment as

2 + + d ∞ i2πνt ∞ 2 2 i2πνt d¨(t)= dν d(ν)e− = dν d(ν) 4π ν e− (3.45) dt2 − Z−∞ Z−∞  so the Fourier transform of d¨is

d¨ = 4π2ν2 [d]= (2πν)2d(ν) (3.46) F − F − h i which, together with E(t)= d¨sin(θ)/c2, gives the Fourier identity

1 (2πν)2 E(ν)= [E]= d¨sin(θ) = d(ν)sin(θ) (3.47) F F c2r − c2r   We know that E and B are real, so we can write

+ ∞ i2πνt E( ν)= dtE(t)e− = E∗(ν) (3.48) − Z−∞ 54 Recall Parseval’s theorem, stating that the norm of a function is invariant under Fourier transform:

∞ ∞ dtE(t)E∗(t)= dν E(ν)E∗(ν) (3.49) Z−∞ Z−∞ which allows us to write the radiated energy (using the Larmor formula and E = B) as

+ + dW c ∞ 2 c ∞ = dtE (t)= dνE(ν)E∗(ν) (3.50) dA 4π 4π −∞ −∞ Z + Z + c ∞ 2c ∞ = dνE(ν)E( ν)= dνE(ν)E∗(ν) (3.51) 4π − 4π 0 Z−∞ Z or

dW c ν (<ν)= dν′ E(ν′)E∗(ν′) (3.52) dA 2π Z0 and thus

dW cE(ν)E (ν) = ∗ (3.53) dAdν 2π and, together with dA = r2dΩ, we have

2 4 dW cE(ν)E∗(ν)r (2πν) 2 = = d(ν)d∗(ν)sin θ (3.54) dνdΩ 2π 2πc3 and finally the expression for the power emitted per unit frequency (i.e., the spectrum) for dipole radiation

dW 4(2πν)4 dW dW = d(ν)d∗(ν) note : =2π (3.55) dν 3c3 dν dω

3.2.7 Multi-pole expansion (for reference)

Electric dipole radiation is only the zeroth order term in a multi-pole expansion of t = t r/c ret − in the small parameter (~r ~r) ~k r/λ. This expansion converges well for v c and R λ. i − · ≪ ≪ ≪ Higher order moments are sometimes required to describe the radiation field (e.g., because electric dipole radiation is quantum-mechanically forbidden, such as in nebular lines like [OIII]).

55 The first order term is the only other term we will need in this class:

1 ¨ ˆ 1 ˆ ··· ˆ ¨ ˆ ˆ B~ = d~ ~k + ~k Q~ ~k + M~ ~k ~k (3.56) c2r × 6c × × ×       with

Quadrupole moment (tensor): Q~ q (3~r ~r r2I) ≡ i i ⊗ − P and

qi qi Magnetic dipole moment (vector): M~ (~ri ~v)= L~ i ≡ i 2c × i 2mic P P The magnetic dipole moment is proportional to the angular momentum M~ = Lq/m~ for particles with equal mass-to-charge ratio.

N.B.: Magnetic dipole and electric quadrupole powers are smaller than electric dipole power by a factor (v/c)2. Thus, multi-pole radiation becomes important only when dipole radiation is forbidden.∼

Dipole radiation implies a change in the total momentum of a system of particles of equal mass- to-charge ratio. Thus, it is forbidden for electron-electron collisions. Magnetic dipole radiation implies a change in the total angular momentum of a system of particles with equal mass-to-charge ratio. Again, it is forbidden for electron-electron collisions.

3.3 The classical atom

We can now apply the dipole formula to discuss radiation from a simple charge distribution: an electron bound in an atom under the influence of an external electro-magnetic wave. Ansatz: We model the atom as a classical damped, driven harmonic oscillator (in 1D); see Fig. 3.3.

3.3.1 The damped, driven harmonic oscillator and the radiation reaction

The incoming radiation forces the electron to oscillate. The accelerated electron must then emit radiation. This is the classical interpretation of scattering. We can derive the scattering cross section from the ratio of radiated power to incident power. Restoring force on an electron (q = e, m = m ) in central potential, expanded to second order − e (quadratic), with some resonance frequency ω0:

x¨ = ω2x (3.57) |osc − 0 56

spring, ω0 radiation

x, ω

E, ω

Figure 3.3: Cartoon of the classical atom, approximated as a damped, driven harmonic oscillator.

with harmonic oscillator solution ω = ω0:

x(t)= x0 cos(ω0t) (3.58)

Damping force = “radiation reaction”: Radiating charges lose energy, thus work must be done. Damping −→

2 Fdamp Frest x¨ = Γ˙x ω0x = + (3.59) − − me me or

F = Γ˙xm (3.60) damp − e

ωt˜ This equation has solutions of the form x = x0e . Inserting this solution into the equation, we arrive at an equation for ω˜ that we can solve:

ω˜2 = Γ˜ω ω2 (3.61) − − 0 or

2 2 Γ Γ 2 Γ Γ ω˜ = ω0 = iω0 1 2 (3.62) − 2 ± r 4 − ± s − 4ω0 − 2

57 Ansatz: Γ ω0, which means that the atom will go through many oscillations before energy is radiated away≪ (check the validity of this later against the initial assumptions), then

Γ ω˜ iω (3.63) ≈± 0 − 2

and so

ωt˜ iω0t Γt/2 Γt/2 iω0t x = x e x e± − = x e− e± (3.64) 0 ≈ 0 0

The radiated power (averaged over the real part of one oscillation) from the Larmor formula is

2 2 2 4 2 4 dW 2e x¨ 2e ω0 2 Γt 2 e ω0 2 Γt = h i x e− cos (ω t) = x e− (3.65) h dt i 3c3 ≈ 3c3 0 h 0 i 3c3 0

which must be balanced by the work done,

2 2 Γt 2 2 2 Γt 2 ω0Γx0e− me xF˙ =Γm x˙ ( iω ) Γm x e− sin (ω t)= (3.66) h dampi eh i≈ ± 0 e 0 h 0 − 2

Setting both equal, we can derive the damping constant:

e2ω4x2e Γt ω2x2eΓtm 2e2ω2 2 e2 ω 2 2π Γ= 0 0 − / 0 0 e = 0 = 0 ω = r ω (3.67) 3c3 2 3c3m 3 m c2 c 0 3 e λ 0     e e where the classic electron radius is

e2 (3.68) re = 2 mec

2 2 (get this by setting the electro-static self-energy e /re equal to the electron rest mass energy mec ). Since the emitter must, by the far-field assumption, satisfy R λ, we can finally show that the initial Ansatz was justified: ≪

r R λ = Γ ω Q.E.D. (3.69) e ≤ ≪ ⇒ ≪ 0

This is the radiation reaction (classical interpretation: finite size of electron = finite retarded time difference across electron).

58 3.3.2 Line profile and scattering cross section

Now let’s include radiative driving by the passing plane-parallel wave. The electric field at the location of the electron (with λ L) is: ≫

iωt E = E0e (3.70)

and so the equation of motion becomes

2 eE0 iωt x¨ +Γ˙x + ω0x = e (3.71) − me

which is the equation of motion of a damped, driven harmonic oscillator. A stationary solution can be found from the well known ansatz:

iωt x = x0e (3.72)

which gives

2 2 eE0 x0 ω + iΓω + ω0 = (3.73) − − me  or

eE 1 x = 0 (3.74) 0 m ω2 ω2 iΓω e − 0 − which is the stationary solution to eq. (3.71). Interpretation of the complex denominator: Phase delay (damping). The total amplitude of the wave is

2 2 e E0 1 x0x∗ = (3.75) 0 2 2 2 2 2 2 me (ω ω ) +Γ ω − 0 and the power re-radiated by the driven electron is:

dW 2e2x¨2 2e2ω4 e2E2 cos2 (ωt) r2 cE2ω4 = = 0 h i = e 0 (3.76) 3 3 2 2 2 2 2 2 2 2 2 2 2 dt h 3c i 3c me (ω ω ) +Γ ω 3 (ω ω ) +Γ ω − 0 − 0 which is a Lorentzian profile.

59 Cross section: The total cross section is the ratio of scattered power to incoming flux (it must have units of area)

1 P r2 cE2ω4 cE2 − 8πr2 ω4 σ = h outi = e 0 0 = e (3.77) 2 2 2 2 Fin 3 (ω2 ω ) +Γ2ω2 8π 3 (ω2 ω ) +Γ2ω2 h i  − 0   − 0 where we used F = S = cE2/8π. h ini h i 0 It is now convenient to define the Thomson scattering cross section

2 8πre 25 2 σ = =6.65 10− cm (3.78) T 3 ×

3.3.3 The low frequency limit = Rayleigh Scattering: ω ω ≪ 0 ω 4 σ σ (3.79) → T ω  0 

Rayleigh scattering has a strong frequency dependence (ν4), well known from blue sky and red sunset.

3.3.4 Thomson scattering: ω ω ≫ 0 In this limit, the electron is essentially free, since incoming photons at high enough energies can unbind the electron. Note: This case reduces entirely to the free-electron case simply by setting ω0 =0 (i.e., no restoring force).

σ σ (3.80) → T

3.3.5 Resonance scattering: ω ω ∼ 0 In this case, the strongest dependence on ω is through the (ω ω ) term (in all other instances of − 0 ω we can set ω = ω0). Using

3Γc3m 4π 2 e (3.81) ω0 = 2 = cΓ 2e σT we have

σ ω4 σ ω4 σ = T T 0 (3.82) (ω ω )2 (ω + ω )2 +Γ2ω2 ≈ (2ω )2 (ω ω )2 +Γ2ω2 − 0 0 0 − 0 0 60 100 Γ ν =0.1 0 σ −1 10 σ Lorentz c) e

m −2

Γ 10 / 2 e π 10−3 /(4 σ 10−4

10−5 0.1 1.0 10.0 ν/ν 0

Figure 3.4: Scattering cross section for classical atom and Lorentzian profile for classical resonant scattering cross section.

σ ω2 σ ω2 = T 0 = T 0 (3.83) 4(ω ω )2 +Γ2 4 (ω ω )2 + (Γ/2)2 − 0 − 0 8πe4 3c3m Γ πe2 Γ e (3.84) = 2 4 2 2 2 = 2 2 4 3m c 2e (ω ω ) + (Γ/2) mec (ω ω ) + (Γ/2) · e − 0 − 0 2π2e2 Γ/2π 2π2e2 = 2 2 = ω (3.85) mec (ω ω ) + (Γ/2) mec L − 0 which is the classic Lorentzian line profile Lω Γ/2π ω = 2 (3.86) L (ω ω )2 + Γ − 0 2  which is normalized ( dω ω = 1, show this as a homework exercrise using contour integration) and has a FWHM of ∆ω L =Γ. R FWHM Since Γ ω , the FWHM is small, i.e., this is a sharp resonance, a “spectral line”. ≪ 0 Note that we derived Γ for a dependence on ω. Reformulating for ν, it is clear from the Lorentzian that we have to substitute

Γ Γ = (3.87) ν 2π

61 to get

2π2e2 (2πΓ )/2π πe2 σ = ν = (3.88) ν 2 2 2 ν mec (2π) (ν ν ) + 2πΓν mecL − 0 2  3.3.6

This is the classical expression for the strength of a line for one classically oscillating electron. Recalling our earlier discussion of Einstein coefficients and cross sections in terms of the line profile function φν, we have

πe2 σν = φν (3.89) mec or

φ = (3.90) ν,classical Lν

In quantum mechanics, we will calculate the cross section for a line transition between states 1 and 2 and express results in terms of the cross section of one classical, oscillating electron:

πe2 σν = φνf12 (3.91) mec which is identical except for the quantum correction f12.

f12 is called the ”oscillator strength”, i.e., the effective number of classical oscillators needed to give the proper line strength.

3.3.7 Polarization and angle dependence in Thomson scattering

We now revisit Thomson scattering (for simplicity, we set ω0 = 0 to describe free electrons). Since the incoming light is generally composed of two orthogonal polarizations (both orthogo- ˆ nal to ~kin), we can pick one incoming polarization direction ~z the be orthogonal to the direction of the scattered line of sight, ~kout. From the differential form of the Larmor formula, recall that dW/(dtdΩ) = 3sin2 (Θ)/8π dW/dt. Then the differential cross section for the fraction of in- · coming light polarized in the ~zˆ direction (for which Θ=90◦) is

dσ dW = out / F = r2 (3.92) dΩ h dΩdt i h ini e

62 Figure 3.5: Scattering intensity and polarization for Thomson scattering.

ˆ The other polarization direction ~y must lie in the plane spanned by ~kout and the incoming ~kin and the E-vector will be at an angle Θ = π/2 θ to the outgoing line of sight ~k , where θ is the − out scattering angle between ~kout and ~kin. The differential cross section is

dσ dPout 2 2 2 2 = = re sin (Θ) = re cos (θ) (3.93) dΩ dFindΩ

This implies that scattering generally induces net polarization from unpolarized incoming light. The scattered polarization fraction is the fraction of polarized light to total outgoing light:

r2 r2 cos2 (θ) e e (3.94) Πout = 2 − 2 2 re + re cos (θ)

the remaining fraction 1 Π of light is unpolarized (like the incoming radiation). − out

3.3.8 Compton scattering

So far we have neglected the momentum carried by the photon. Since the outgoing photon gener- ally travels in a different direction, some of the initial momentum must be imparted on the electron (Newton’s third law). This momentum loss for the photon must imply an energy loss as well.

63 Figure 3.6: Sketch of Compton scattering geometry.

Without a primer in , we can still derive the relevant kinematics (though not as elegantly). Because we are now dealing with relativistic particles (photons) we will have to borrow a few things from special relativity. Namely, the total energy of a particle, which is given in terms of the Lorentz factor

1 γ 1 (3.95) 2 ≡ 1 v ≥ − c q  with which the total energy of a particle becomes

e = γmc2 mc2 (3.96) ≥

which includes a constant term that is non-zero even if the particle is at rest, the rest-mass energy mc2. It is sometimes convenient (though not strictly accurate) to think of the relativistic correction as an increase in the particle’s inertia, modifying the particle mass from m to γm. This suggests that we can write the particle momentum as

P~ = γ m~v (3.97) which is the correct relativistic expression. Setup/assumptions:

initial particle energy negligible, i.e., particle at rest • photon energy small compared to electron rest mass, hν mc2, i.e., can use Thomson cross • section. ≪

64 Initial photon along xˆ. • WLOG, final photon scattered by angle θ within the x-y plane • Initial photon energy: e , momentum: e x/cˆ • i i Final photon energy: e , momentum: e [cos(θ)ˆx + sin(θ)ˆy] /c • f f Initial particle energy: E = m c2, momentum: P =0 • i e i Final particle energy: E = γm c2, momentum: P = γm ~v • f e f e f

Isolating Ef and Pf , we have the following set of equations:

γm c2 = e e + m c2 (3.98) e i − f e γm v = e /c e cos(θ)/c (3.99) e x i − f γm v = e sin(θ)/c (3.100) e y − f Now, build the following new equation: (3.98)2 c2 [(3.99)2 + (3.100)2], which gives − 2 γ2 m2(c2 v2 v2)c2 = m c2 + e e [e e cos(θ)]2 e2 sin2 (θ) (3.101) e − x − y e i − f − i − f − f  which reduces to

m2c4 = m2c4 + e2 + e2 +2m c2 (e e ) 2e e e e i f e i − f − i f e2 +2e e cos θ e2 cos2 (θ) e2 sin2 (θ) (3.102) − i i f − f − f After canceling out most terms, we are left with

m c2e e e = e i = i (3.103) f 2 ei ei [1 cos(θ)] + mec 1+ 2 [1 cos(θ)] − mec − In terms of wavelength, this takes the simple form

c hc hc e λ = = = 1+ i [1 cos(θ)] = λ + λ [1 cos(θ)] (3.104) f ν e e m c2 − i C − f f i  e  with the Compton wavelength

e hc h i (3.105) λC = λi 2 = λi 2 = mec λimec mec

Notice that in this expression the photon energy always decreases, for any scattering angle. How- ever, we have so far neglected any motion of the electron relative to the observer. If the electron has appreciable motion, the photon receives a double Doppler shift (once into the frame of the electron, and then back into the observer’s frame after scattering). In this case, the photon can actually pick up a significant amount of energy from the elctron. This process is called inverse Compton scattering and a detailed discussion is beyond the scope of this class (see Physics 772).

65 4 Continuum Radiation

4.1 Bremsstrahlung

Bremsstrahlung is the German word for “breaking radiation”. It is also called “free-free” emission. It is produced by a free electron Coulomb interacting with an ion. The Coulomb acceleration of the electron causes emission of continuum radiation (see Fig. 4.1). Consider an ion with charge Ze and electrons with charge e and velocity v moving by the ion with impact parameter b. Without loss of generality, take the− electron to move along x-axis.

4.1.1 The free-free spectrum of one particle

We will consider particles at large impact parameters, such that the particle trajectory is only slightly curved and we can neglect the parallel acceleration (along the velocity) and only consider the perpendicular acceleration. This is called the small angle or Born approximation. The position of the particle is given by x vt, such that at t =0 the particle is at closest approach (with v const). ≈ ≈ The perpendicular acceleration of the particle is given by

b Ze2 Ze2 b (4.1) a = 2 = 3/2 ⊥ r mer me (b2 + v2t2)

The second derivative of the dipole moment is given by the perpendicular acceleration:

Ze2 b ¨ (4.2) d = ea = e 3/2 ⊥ me (b2 + v2t2)

In order to determine the spectrum, we need to calculate the Fourier transform of d. Recall from eq. 3.46 that

[d¨] d(ν)= F (4.3) −(2πν)2

The Fourier transform of d¨is given by

3 2πiνt Ze b ∞ e ¨ (4.4) [d] = dt 3/2 F me (b2 + v2t2) Z−∞ 3 Ze b ∞ cos(2πνt)+ i sin(2πνt) (4.5) = dt 3/2 me (b2 + v2t2) Z−∞ 66 y Dt ~ 2b/v

-e v << v F - D F elec b F tron, “b db reaking”

+ x +Ze 2πbdb ion, at rest

Figure 4.1: Cartoon of electron-ion Coulomb interaction, producing Bremsstrahlung (free-free emission).

3 Ze b ∞ cos(2πνt) (4.6) = dt 3/2 me (b2 + v2t2) Z−∞ Ze3b 4πν 2πbν = K (4.7) m bv2 1 v e  

where K1(x) is the first-order modified Bessel function of the second kind (the sine-part of the integral vanishes because it is the integral over the product of an odd and an even function).

x For x 1, the asymptotic behavior is K1(x) (π/2x)e− : it decreases exponentially. That is, the spectrum≫ emitted by a free-free interacting≈ particle has a sharp cutoff at ν = v/2πb. p max For x 1, the asymptotic behavior of K is K (x) 1/x, so ≪ 1 ≈ v 0 for ν 2πb K1(2πbν/v) ≫ (4.8) ≈ v for ν v  2πbν ≪ 2πb which gives the low frequency limit

Ze3 2 1 d(ν)= 2 (4.9) − me bv (2πν)

Finally, we have the spectrum for one electron (for frequencies below the cutoff at v/2πb):

dW 4(2πν)4 16Z2e6 ∗ (4.10) = 3 d(ν)d (ν)= 3 2 2 2 dν 3c 3c meb v

67 N.B.: The free-free spectrum emitted by one electron is flat at low frequencies and has a sharp cutoff at ν = v/2πb.

We can now calculate the power emitted by the entire ensemble of electrons at different impact parameters. For electrons with density ne, the rate at which electrons move by the ion is nev. The rate at wich energy is emitted is then given by the rate at which electrons pass by times the energy per electron per frequency:

dW bmax 16Z2e6 32πn Z2e6 b = 2πbdbn v = e ln( max ) (4.11) dtdν e 3c3m2b2v2 3c3m2v b Zbmin e e min where there are some minimum and maximum impact parameters bmin and bmax set by physics not yet considered. The factor ln(bmax/bmin) is called the Coulomb logarithm.

The lower limit bmin can take different values. E.g., the uncertainty principle sets a lower limit of b ~/mv. Similarly, the small angle approximation requires Ze2/b mv2/2 or b 2Ze2/mv2. ≫ ≪ ≫ Typically, the upper limit bmax is set by the fact that the plasma self-shields ions (free-free emission requires ionized particles) on the so-called Debye length

kT λ (4.12) D 4πn e2 ≈ r e which is the size of the sphere inside which an excess charge binds electrons against their thermal motions, thus neutralizing the charge on scales larger than λD. Any electron flying by with b λD will not see the ion charge and thus not radiate. ≫ The effect of the Coulomb logarithm is small and we will relegate it to the Gaunt factor to be discussed in the next section.

4.1.2 Thermal Bremsstrahlung:

We have so far only discussed the emission from mono-energetic electrons. Consider gas in thermal equilibrium. Electrons in a thermal distribition follow a Maxwellian distribution:

3/2 2 2 me mev ∞ f(v)=4πv e− 2kT with dvf(v)=1 (4.13) 2πkT 0   Z This distribution is normalized. Before inegrating, it is important to consider one more constraint 2 on the emission of a signle particle: The energy available for radiation is Ekin = mev /2, which must exceed the energy of the emitted photon, so

m v2 e min = hν (4.14) 2 68 Then, the total power emitted by a Maxwellian distribution is

3/2 2 6 2 dW ∞ 2 me 32πZ e ne mev − kT (4.15) 4πv dv 3 2 e dνdt ∼ v (ν) 2πkT 3vc me Z min  

where vmin is the minimum velocity required to produce at least one photon of frequency ν. Re- ordering terms, we have

2 6 2 dW 64Z e ne π ∞ mev dvve− 2kT (4.16) dνdt ∼ 3c3kT 2m kT r e Zvmin  The integral is easily solved:

2 2 ∞ mev ∞ kT d mev kT hν dvve− 2kT = dv e− 2kT = e− kT (4.17) − 2hν me dv me vmin m Z Zq e  

Finally, the average isotropic emission for a single ion is

2 6 dW 64Z e ne π hν e− kT (4.18) dνdt ∼ 3 m c3 2m kT e r e To derive the total emissivity, which is the power per volume and per solid angle, we multiply by the ion density and divide by 4π to arrive at

dW power ions j = = (4.19) ν dV dνdtdΩ ion frequency ∗ volume solidangle 2 6 ∗ ∗ 8Z e 2 1/2 hν = n n T − e− kT (4.20) 3m c3 πkm i e e r e

A proper quantum calculation, corrections due to the change in parallel velocity (which we have neglected), and corrections for large angle scattering are taken into account by way of a “Gaunt” factor (π/3)¯g 1 (see R&L), which gives ff ∼

2 6 8Z e 2π 1/2 hν j = T − n n e− kT g¯ (4.21) ν 3m c3 3km i e ff e r e 39 1 1 1 3 2 1/2 hν = 5.4 10− ergss− Hz− Sr− cm− Z nineT − e− kT g¯ff (4.22) × CGS h i The free-free emissivity is porportional to (a) the square of the density, and (b) to 1/T . p

69 The total, bolometric power emitted per volume is given by the frequency integrated emissivity:

2 6 1/2 dW ∞ 8Z e 2π kT ǫ = =4π dνj = n n Z2g¯ ff dV dt ν 3m c3 3km h i e ff 0 e r e 27 Z 3 1 2 /2 = 1.4 10− ergs cm− s− g¯ n n Z T (4.23) × ff e i CGS   The bolometric emissivity goes like n2 and T 1/2, since it is approximately the product of the the 1/2 specific emissivity (proportional to T − ) with the bandpass hν kT . max ≈ Note again: As with all two-particle radiation processes, a square of the density appears in this expression. The total emission is therefore directly proportional to the volume emission measure (for some emitting volume ∆V ):

EM dV n n (4.24) V ≡ e i Z∆V Similarly, one can define a line emission measure (along the line of sight, over some length ∆z):

EM dz n n (4.25) L ≡ e i Z∆z

Sometimes it is convenient to replace ni with ne or vice versa (if the ratio between the two is known), which allows to solve for n2 if the emission measure can be determined observationally.

4.1.3 Thermal free-free absorption

For thermal free-free, we can use Kirchhoff’s law:

2 6 3 2 1 jν 8Z e 2π 1/2 hν 2hν /c − − − kT (4.26) αν = = 3 T ninee g¯ff hν Bν 3mec 3kme e kT 1 r  −  8 1 1/2 2 3 hν = 3.7 10 cm− T − Z neniν− 1 e− kT g¯ff (4.27) × − CGS h  i which, in the Rayleigh-Jeans limit, reduces to

1 3/2 2 − − − (4.28) αν =0.018cm T neniν CGS g¯ff

  2 Given that the absorption coefficient αν ν− , free-free spectra must become optically thick at sufficiently low frequencies. Under those∝ conditions, we can expect the spectrum to assume Rayleigh-Jeans form such that

I ν2 (4.29) ν ∝ Thus, the total spectrum rises like ν2 at low frequencies, becomes roughly flat at higher frequencies, and drops exponentially at frequencies above hν kT . ∼ 70 4.2 Synchrotron radiation

Consider a charged particle with mass m and charge q gyrating in a magnetic field of strength B with pitch angle α between ~v and B~ . General considerations: Recall that the effective inertia of a particle with mass m assumes the value m γm in relativity. → Assume v c, such that γ =1/ 1 v2/c2 1. Taylor expansion gives: ≈ − ≫ p v 1 1 β = = 1 1 + ... (4.30) c − γ2 ≈ − 2γ2 r Gyro orbit: Balance Lorentz force and centrifugal force

q q ~v B~ + γm~rω2 = ω B~r + γmω2 ~r =0 (4.31) −c × B −c B B

which gives helical orbits with gyro frequency and Larmor radius:

1 qB ωL v γmc ωB = = and RL = ⊥ = v sin α (4.32) γ mc γ ωB qB

The gyro acceleration is perpendicular to the direction of motion,

qB v a = R ω2 = sin α (4.33) L B γm c

4.2.1 Lorentz transformed accelerations

Consider a relativistic particle undergoing acceleration. We can decompose the acceleration into two components: Along the direction of motion, a , and perpendicular to it, a . k ⊥ From undergraduate physics, recall Lorentz contraction and time dilation imposed by the Lorentz transform. Lengths along the direction of motion are Lorentz contracted in the observer’s frame by a factor of γ, while the time measured in the observer’s frame is longer by a factor of γ. First consider the perpendicular acceleration, a . The transverse dimensions are not Lorentz con- ⊥ tracted, however, the observer sees a time dilation, dtobs = γdtpart. Suppose the observer measures an acceleration a , then in the particle frame the transverse accel- eration would be ⊥

d dl ,part 2 d dl ,obs 2 a ,part = ⊥ = γ ⊥ = γ a ,obs (4.34) ⊥ dtpart dtpart dtobs dtobs ⊥

71 B 1.0 −1 0.8 v/c)] θ RL 0.6 v 0.4 (1−cos [Γ v 0.2 0.0 −4 −2 0 2 4 θΓ

Figure 4.2: Left: sketch of cyclotron/synchrotron-gyro orbit. Right: Plot of the beaming function 1 [1 cos(θ)v/c]− for γ 1, normalized to a peak value of 1. − ≫

Now consider the longitudinal acceleration, a . We have the same time dilation factor of γ2, but k additionally, we have to take account of the Lorentz contraction, which makes dl ,part = γdl ,obs. Thus, k k

d dl ,part d dl ,obs k 3 k 3 2 a ,part = = γ = γ a ,obs (4.35) k dtpart dtpart dtobs dtobs k

We now have the acceleration in the particle frame in terms of the acceleration measured in the observer’s frame (which is what we have calculated above for Larmor gyration).

4.2.2 Lorentz invariance of the Larmor formula

The Larmor formula provides the energy emitted per unit time interval by a single particle,

dW 2q2a2 = (4.36) dt 3c3

Since particle numbers are Lorentz invariant, the only two things that we have to consider under Lorentz transform in this formula are energy and time. We already know that Lorentz transforms induce a time dilation,

dt dt = obs (4.37) part γ

We now have to consider the total energy of the light emitted. Consider the emission forward- backward symmetric in the particle frame (i.e., no net momentum is transported in any direction by the radiation in the particle’s frame). The total energy emitted is dWpart.

72 Now consider a spherical shell traveling with the particle that absorbs the radiation completely and converts all the radiated energy to rest mass (which is allowed by special relativity). In the frame of the particle, the (rest mass) energy of the sphere is now larger by dWpart. In the observer’s frame, the (mass) energy of the sphere is larger by an amount

2 dWobs = γdMc = γdWpart (4.38)

Thus, the Larmor formula Lorentz transforms simply as

dW γdW dW 2q2a2 obs part part part (4.39) = = = 2 dtobs γdtpart dtpart 3c

where apart is measured in the particle frame.

2 4 2 6 Given that apart = γ a ,obs + γ a ,obs, we finally have the Larmor formula for an accelerated particle, expressed in terms⊥ of quantitiesk entirely measured in the observer’s frame:

dW 2q2 obs 4 2 6 (4.40) = 3 γ a ,obs + γ a ,obs dtobs 3c ⊥ k  Inserting the expression for Larmor gyration from above and noting that the acceleration is per- pendicular to ~v, we have the power emitted per particle:

dW 2q2 qBβ sin α 2 2q4B2β2 sin2 αγ2 = γ4 = (4.41) dtdN 3c3 γm 3c3m2  

4.2.3 Doppler boosting

Equation (4.40) gives the total power emitted by the particle measured in the observers frame, integrated over frequency and solid angle. However, the emission is not uniform in either ν or Ω. This is apparent from the relativistic expression for the Poynting flux:

ˆ ˙ 2 2 2 β~ ~k β~ ~ q ~ˆ ~ ~ˆ ~˙ − × Srad = k β k β 6 (4.42) c2κ6 − × ret ∝ h(1 cos θβ) i h  i − (β cos θ)2 + sin2 θ sin2 φ − (4.43) ∝ (1 cos θ β)6 − ˆ ˆ where θ is the angle between the line of sight ~k and the velocity β~ and φ is the angle between ~k ˙ and β~ β~. × 73 The term 1 cos(θ)v/c in the denominator is exceedingly small when cos(θ)v/c 1. This can only be the− case when β 1 (i.e., γ 1) and cos θ 1 (i.e., θ 1). The part of∼ the gyro-orbit where this is the case will≈ dominate the≫ total emission.≈ ≪ To see how the emission is peaked around θ =0, we expand the denominator in 1/γ 1: ≪

v θ2 1 θ2 1 1 cos θ 1 1 1 + (4.44) − c ≈ − − 2 − 2γ2 ≈ 2 2γ2    which is very small (see Fig. 4.2) when both of the following conditions are satisfied:

1 γ 1 and θ . (4.45) ≫ γ

N.B.: Emission from relativistic particles is concentrated into a “beaming cone” around their direction of motion with a half-opening angle of 1/γ. ∼

Because only particles whose beaming cone sweeps across the observer contribute significantly, only particles at a pitch angle

α = θ 1/γ (4.46) B,LOS ± ˆ contribute, where θB,LOS is the angle between the line of sight ~k and the direction of the magnetic field. Thus, α can be regarded as the angle between the line of sight and the B-field in the following.

4.2.4 The characteristic synchrotron emission frequency

The “beaming cone” is rotating with the particle orbit (see Fig. 4.3) and an observer will not stay inside the beaming cone for very long (since the cone itself has a very narrow opening angle). The angular velocity with which the particle beaming cone sweeps across the observer is

dθ 2π sin α = = sin α ωB (4.47) dt Torb

Since the full width of the opening angle of the beaming cone is θ 2/γ, the time the observer is inside the beaming cone for a particle whose beaming cone does sweep∼ over the observer is of order

θ 2 ∆τ = (4.48) ≈ dθ/dt γ sin α ωB

74 → B

α v→

2π sin α

2π B emission cones v 2/ γ

2/g 1/g v

Figure 4.3: Left: Beaming cone rotating around gyro orbit: Only observers within 1/γ from direction of motion see pulse of radiation. Right: Solid angle illuminated by particle: 2/γ 2π sin α. ·

The distance the particle travels in that time is ∆τv. The distance the photon travels in that time is ∆τc. The distance the photon emitted at beginning of pulse is ahead of the particle at the end of the pulse is ∆τ(c v). − The arrival time difference between beginning and end of the pulse (see Fig. 4.4) is the character- istic time scale during which the observer receives the bursts of radiation (one per particle orbit):

c v 2 v 2 1 1 (4.49) ∆ tpulse =∆tobs =∆τ − = 1 2 = 3 c γ sin α ωB − c ≈ γ sin α ωB 2γ sin αγ ωB   The fundamental frequency of the pulse is:

3 2 2 1 sin αγ ωB γ q sin αB 3γ qB sin α νc = = R&L : (4.50) ≈ 2∆τpulse 2 2mc 2πmc

The spectrum emitted by each particle is broad, but we will simplify by assuming each particle only emits at exactly νc. This relates the γ of a particle to the energy it emits:

2πmcν 1/2 dγ πmc 1 γ = and = (4.51) 3 qB sin α dν 3 qB sin α γ   75 Dtobs=Dt(c-v)/c

2/g Dt=2RL/g

Figure 4.4: Duration of pulse seen by observer: Combination of beaming angle and light-travel time.

4.2.5 Synchrotron spectra:

For a distribution of particles (and in particular the case of a powerlaw distribution)

dN s = f(γ)= N γ− (4.52) dγ 0 the power emitted per frequency is

4 2 2 2 2 dW dW dN dγ 2q B β sin αγ s πmc 1 = N γ− γ− (4.53) dtdν dtdN dγ dν ≈ 3c3m2 0 3 qB sin α s 1 −2 2πB sin α 1 s 2πB sin α 2πmcν − N γ − N (4.54) ≈ 9c2 m 0 ≈ 9c2 m 0 3 qB sin α   1+α α (sin αB) N ν− (4.55) ∝ 0 which is the classic synchrotron powerlaw with spectral index α =(s 1)/2. For typical powerlaw particle spectra, s 2, so α 0.5. − ∼ ∼ Use Rybicki & Lightman for quantitative calculations:

s 1 − dW √3q3N B sin α s 19 s 1 2πmcν − 2 = 0 Γ + Γ (4.56) dtdν mc2 (s + 1) 4 12 4 − 12 3qB sin α     

ˆ Note: α is the angle between ~k and B~ . Thus: No synchrotron emission along the field direction. Equation (4.56) has to be averaged over all particl pitch angles. For an isotropic pitch angle distri- bution the weighting introduces another factor of sin α.

76 For s =2 and randomly tangled magnetic field we have roughly:

3 1 2 2 38 ergs p B ν − ǫ 1.2 10− (4.57) ν ≈ × cm3 s Hz 10 12 ergs cm 3 10 5 G 5 109 Hz  − −  −   × 

4.2.6 Polarization

Synchrotron radiation is typically polarized, with a preferred direction given by magnetic field orientation. Without proof (see R&L homework problem) the maximum degree of polarization is

s +1 9 Π= = 69% (4.58) s +7/3 s=2 13 ≈

4.2.7 Synchrotron-self absorption

We discussed non-thermal synchrotron radiation (which is by far the most common). cannot use Kirchhoff’s law to calculate α → ν The Einstein relations still hold, though, so we can in principle determine B12 from ǫν. However, the derivation is beyond the scope of this course, so we will just state the result for a powerlaw distribution of electrons as defined above:

3 2 s/2 √3q c 3q (s+2)/2 3s +2 3s + 22 (s+4)/2 α = N (B sin α) Γ Γ ν− (4.59) ν 8π 2πmc 0 12 12      

N.B.: The normalization C used in Rybicki&Lightman is very misleading. It differs by a fac- tor mc2 from the normalization C used in their eq. (6.36). The above normalization is consistent with our previous discussion.

Source function:

jν ǫν 5/2 1/2 Sν = = ν B− (4.60) αν 4παν ∝

Two things to note: A) The powerlaw index for optically thick synchrotron emission is independent of s and different from the powerlaw index of low frequency thermal emission (the thermal case is 2, not 2.5). B) The density cancels out, so we can constrain the magnetic field strength if we can measure the intensity of the source.

77 5 Atomic structure: Single Electron Atoms

5.1 The need for a QM treatment

When energy and size scales of a problem approach the characteristic scales of atoms we can no longer treat emitters and absorbers classically. The best indication of this is the fact that there are emission and absorption lines: In our treatment of the classical atom there was no preferred energy/frequency scale of the oscillation (i.e., no inherent “spring”). The observation of sharp resonances in the form of lines thus necessitated quantum mechanics.

N.B.: Whenever there is a characteristic, universal energy scale in a radiative process (e.g., a characteristic edge in the spectrum), it is likely that there is an underlying quantum me- chanical phenomenon at work (either atomic, or nuclear).

We will discuss a more proper treatment of multi-electron atoms and molecules on the basis of non-relativistic QM in later chapters. We will first discuss the hydrogen atom and some basic energy scalings to give you a feel for the different kinds of effects one can expect.

5.1.1 The hydrogen spectrum

Observed hydrogen atomic line series: Lyman (UV), Balmer (Optical), Paschen (IR), Bracket (IR), Pfund (far IR) The Hydrogen line series are:

n=1 n=2 n=3 n=4 n=5 Lyman Balmer Paschen Brackett Pfund

n’=2 1216A˚ (Ly α)— — — — n’=3 1026A˚ (Ly β) 6563A˚ (H α)— — — n’=4 972A˚ (Ly γ) 4861A˚ (H β) 18,751A˚ (P α) — — n’=5 950A˚ (Ly δ) 4340A˚ (H γ) 12,818A˚ (P β) 40,520A˚ (Br α) — n’=6 938A˚ (Ly ǫ) 4102A˚ (H δ) 10,938A˚ (P γ) 26,258A˚ (Br β) 74,760A˚ (Pf α) n’=7 . . . 21,660A˚ (Br γ) 46,640A˚ (Pf β) ...... n’ 912A˚ (Ly C) 3647A˚ 8206A˚ 14,588A˚ 22,800A˚ → ∞

Table 5.1: Hydrogen spectral series (without fine and hyperfine structure corrections)

78 5.2 Order of magnitude scalings and the Bohr atom

Niels Bohr postulated a rather ad-hoc quantum theory that quantized the angular momentum of orbiting particles:

nh l = n~ = with n [1, 2, 3,...] (5.1) 2π ∈

5.2.1 Electronic transitions in the Bohr atom

Take an atom of nuclear charge Ze with one electron. The orbital radii (Homework problem) are then:

n2~2 (5.2) Rn = 2 Ze me and the energy levels the electron can occupy are

Z2e4m E = (5.3) n − 2n2~2

This turns out to be the correct expression and can explain the basic properties of the Hydrogen spectrum, since line energies follow directly from the energy difference between levels:

hc Z2e4m 1 1 1 1 ∆E = hν = = e = 13.6 eV Z2 (5.4) n,n′ λ 2~2 n2 − n 2 n2 − n 2  ′   ′  where we conveniently define the Rydberg as

e4m Ry = e (5.5) 2~2 and the Bohr radius as

~2 9 ˚ − (5.6) a0 = R1 = 2 =0.52A=5.2 10 cm e me ×

Note the orbital velocity

Ze2 v = Zαc (5.7) 0 ~ ≡

79 where we define the fine structure constant

e2 ~ 1 v α = = = = 0 (5.8) ~c mea0c 137 c

With this, we have

~ a0 = (5.9) me αc

N.B.: The fact that α = v0/c 1 for the hydrogen atom tells us that relativistic corrections are small and contribute only≪ on the fine structure level.

5.2.2 Fine Structure

The previous discussion considered only Coulomb interaction between the nucleus and the elec- tron. A complication that must be considered is the fact that the electron has spin. Since it is a charged particle, we must expect it to also have a magnetic moment ~µ. We can write down the order of magnitude of the electron magnetic moment. Recall the classical electron radius

e2 (5.10) re = 2 mec

Following Bohr, the electron spin should be quantized

~ 12 1 l merev = ~ = v 4 10 cms− (5.11) ∼ ⇒ ≈ mere ≈ ×

N.B.: This velocity is clearly unrealistic. It proves that electron spin is a relativistic phenomenon.

Since v γv, we can proceed cautiously, however, noting that this is only a scaling argument. → The electric current due to the spinning electron is roughly:

e e ~ I γv = (5.12) ∼ 2πre 2πre mere

which gives the magnetic moment by Ampere’s law:

2 ~ ~ 2 I πre e e ′′ (5.13) µ πre 2 gs = gs “Bohr Magneton ∼ c ∼ 2πmere · 2mec · ≡

80 A proper quantum-electrodynamic treatment (a quantum mechanical treatment of electrodynamic fields) gives the correction factor gs =2.00 for the electron magnetic moment. The QCD correction for free protons is gs,p =5.6. From the induction equation, the B-field induced at the electron position due to its orbit around the nucleus is roughly

4 4πI Ze v0 Z e B 2 = 2 α (5.14) ∼ cR0 ∼ R0 c a0

Finally, the exchange energy between a magnetic moment and a background magnetic field (and thus the energy scale of the fine structure correction) is:

~ 4 ~ 2 ~ e Z e 4 e 4 2 ∆E = ~µB 2 α gs = Z α gs = Z Ry α gs (5.15) ∼ 2mec a0 · · mea0c 2a0 · · · · ·

5.2.3 Hyper-fine structure

Since the nucleus also has spin, there is a further correction due to its interaction with the electron magnetic moment (which generates a dipole field in which the electron is placed).

Ze~ µnuc gs,p (5.16) ∼ 2mpc ·

Recall that dipole fields falls off as B µ/r3, thus the nuclear field at the position of the elctron ∼ (at radius R0) is

4 4 µnuc Z e~ Z eα ~ Bnuc 3 = 3 gs,p = 2 gs,p (5.17) ∼ R0 2mpa0c · a0 αa0 cmp · ~ 1 me me = Borb gs,p = Borb gs,p (5.18) me αc a0 mp · mp ·

Thus, the exchange energy between the two magnetic moments of proton and electron give the hyper-fine energy correction:

∆EFS me ∆EHF µeBnuc gs,p (5.19) ∼ ∼ 2 mp ·

By far the most important hyperfine structure transition is the 21 centimeter line of neutral atomic hydrogen in the ground state. Our estimate for ∆HF from above is too small by a factor of 3, which is not bad, given the gross, back-of-the envelope level of approximation. Note: Because the 21 cm transition only involves a spin flip but no change in orbital angular momentum it proceeds via magnetic dipole radiation and has a very small Einstein A coefficient. But because hydrogen is so abundant, this line is still an important observable diagnostic.

81 5.3 Basics of quantum mechanics

Quantum mechanics is a probability theory. Wave functions describe the likelihood distribution of particles/systems, and solutions are not fully deterministic. One of the most important aspects of quantum mechanics is that it is a linear theory. Wave functions and operators can be added and superposed without changing their behavior. This is in contrast to classical mechanics, which contains non-linear elements. States are often described by wave functions ψ which are typically complex and are elements of special vector spaces, called Hilbert spaces. As a generalization from classical mechanics, we can build canonical pairs of variables, a coordi- nate and its associated canonical momentum. Examples are ~r and P~ and t and E. In quantum mechanics, observables take the form of operators Oˆ acting on states/wave functions ψ:

ψ′ = Oψˆ linearity : α(ψ′ + φ′)= α(Oˆ + Qˆ)(ψ + φ) (5.20)

Many operators are straight forward adaptations from classical mechanics, e.g., the momentum operator and the position operator:

P~ i~~ and ~r ~r (5.21) →− ∇ → which suggests that the angular momentum operator should be (using the short-hand ∂x for ∂/∂x):

y∂ z∂ sin φ∂ cos θ cos φ ∂ z − y θ − sin θ φ L~ i~ ~r ~ = i~ z∂ x∂ = i~ cos φ∂ cos θ sin φ ∂ (5.22) →− × ∇ −  x − z  −  θ sin θ φ  x∂ y∂ −∂ y − x φ     and the kinetic energy operator:

P 2 ~2 2 ~2 1 1 1 ∇ = ∂ r2∂ + ∂ (sin θ∂ )+ ∂2 (5.23) 2m →− 2m −2m r2 r r r2 sin θ θ θ r2 sin2 θ φ    P 2 L2  = r + (5.24) 2m 2mr2

5.3.1 Dirac notation

It is most convenient to use the Dirac bra-ket notation to write states and their duals as

T ψ ψ and its dual ψ ∗ ψ (5.25) →| i →h | 82 We require the Hilbert space to have an inner product between a bra and a ket:

φ ψ = ψ φ ∗ = c with c C (5.26) h | i h | i ∈ which also defines the norm on the space. For spatial wave functions, this reduces to

3 φ ψ = d xφ∗(~x)ψ(~x) (5.27) h | i Z Typically, we want wave functions to be normalized, such that

ψ ψ =1 (5.28) h | i Two kets (states) are orthogonal if

φ ψ =0 (5.29) h | i An operator can act on kets, giving another ket:

Oˆ ψ = ψ′ (5.30) | i | i and so typically we have

ψ Oˆ ψ = ψ ψ′ = o with o C (5.31) h | | i h | i ∈ which is the expectation value for the observable Oˆ.

The adjoint of an operator Oˆ† is defined such that

ψ Oˆ† Oˆ ψ (5.32) h | ←→ | i are corresponding duals such that

∗ ψ Oˆ† φ = φ Oˆ ψ (5.33) h | | i h | | i   h  i A special class of operators is called Hermitian: these operators satisfy

φ Oˆ ψ = ψ Oˆ φ ∗ (5.34) h | | i h | | i A ket α is called an “Eigenket” (eigenfunction) of Oˆ if | i Oˆ α = a α with a C (5.35) | i | i ∈ 83 For Hermitian operators, we can always find a set of orthogonal eigenkets α with real eigenvalues | ii αi that span the entire Hilbert space. We can then write any other ket as a decomposition:

ψ = α α ψ = c α with c C (5.36) | i | iih i| i i| ii i ∈ i i X X Thus, we can write the identity operator as

ˆI= α α (5.37) | iih i| i X Finally, we define the commutator between two operators:

O,ˆ Qˆ = OˆQˆ QˆOˆ (5.38) − h i which is, in general, not zero. If it is, the two operators commutate and they are called “compati- ble”. This means that eigenkets for one operator are also eigenkets for the other (though not with the same eigenvalues, otherwise the operators would be identical). This will become important soon.

5.3.2 Uncertainty principle

Measurement of observable O:

O = ψ Oˆ ψ (5.39) h | | i

In general, two observables cannot be determined independently. Consider two measurements:

ψ OˆQˆ ψ ψ QˆOˆ ψ = ψ [O,ˆ Qˆ] ψ =0 (5.40) h | | i−h | | i h | | i 6 which implies that for operators that do not commutate, we cannot determine one observable and then the other and get a unique result. E.g., position and momentum:

ψ [~x, Pˆ] ψ = i~ (5.41) h | | i

which is a measure of the uncertainty with which both position and momentum can be determined.

84 5.4 The hydrogen atom

5.4.1 The time independent Schrodinger¨ equation for central potentials

The time dependent Schrodinger¨ equation, which can be derived from generalizing Hamiltonian mechanics to the probabilistic description afforded to us by the Heisenberg and Schrodinger¨ inter- pretations of quantum mechanics, is the following:

∂ Hˆ ψ = i~ ψ (5.42) | i ∂t| i

where the Hamiltonian is generally an expression of the energy of the system. For a single particle, the Hamiltonian is

Pˆ2 ~2 Hˆ = Tˆ + Vˆ = + V (~r)= 2 + V (~r) (5.43) 2m −2m∇ so Hˆ is the sum of kinetic and potential energy operators.

Under some circumstances, the Hamiltonian is time independent, i.e., Hˆ = Hˆ ~r, P,~ t does not 6 \ depend on time. Then,  

Hˆ ψ = const. ψ = E ψ (5.44) | i ·| i | i

where E is the energy. This is an eigenvalue problem for ψ with eigenvalue E. Recall from de Broglie wave mechanics for particles that we can decompose the solution into a space and time dependent part, which has a harmonic solution of the well known form

iEt/~ ψ = ψ(~r)e− (5.45)

with E = hν = ~ω, so

∂ψ iE i~ = i~ ψ = Eψ (5.46) ∂t − ~

Now, let’s consider a particle in a central potential. The most natural coordinate system to use are, of course, spherical-polar coordinates. Then:

Pˆ2 ~2 ~2 (~r )(1+ ~r ) (~r )(~r ) Pˆ2 Lˆ2 = 2 = ∇ ∇ + ×∇ ×∇ = r + (5.47) 2m −2m∇ −2m r2 r2 2m 2mr2  

85 with the radial part of the momentum operator being

1 ∂ Pˆ = i~ r (5.48) r − r ∂r   Adding the central potential (which depends on r only) the Schrodinger¨ equation becomes

Pˆ2 Lˆ2 r (5.49) + 2 + V (r) ψ = Eψ "2m 2mr #

We separate the radial and angular terms to get

Pˆ2 2mr2 r + V (r) E ψ = Lˆ2ψ (5.50) "2m − # −

The fact that we can separate the equation in this way suggests that we can separate ψ into an angle and a radial part:

(r) ψ = R Y(θ,φ) (5.51) r

The angle equation becomes

1 ∂ ∂Y 1 ∂2Y sin θ + = const Y= l(l + 1)Y (5.52) sin θ ∂θ ∂θ sin2 θ ∂φ2 · l,m  

with the well known spherical harmonics Yl,m as solutions (Tab. 5.2 and Fig. 5.1). Yl,m form a complete orthonormal base:

dΩYl,m∗ Yl′,m′ = δl,l′ δm,m′ (5.53) Z

Note: Yl,m are also eigenfunctions of Lˆz:

∂ Lˆ Y = i~ Y = m~Y with m l (5.54) z l,m − ∂φ l,m l,m | |≤

but not of Lˆx or Ly.

86

Figure 5.1: Surface renderigns for the first four ourders of the spherical hamonics.

0 1 Y0 = 4π q 0 3 1 3 iφ Y = cos θ Y ± = sin θe± 1 4π 1 ∓ 8π q q 0 5 2 1 15 iφ Y = (3cos θ 1) Y ± = sin θ cos θe± 2 16π − 2 ∓ 8π 2 q 15 2 2iφ q Y2± = 32π sin θe± q 0 7 3 1 21 2 iφ Y = (5cos θ 3cos θ) Y ± = sin θ (5cos θ 1) e± 3 16π − 3 ∓ 64π − 2 q 105 2 2iφ 3 q 35 3 3iφ Y ± = sin cos θe± Y ± = sin θe± 3 32π 3 ∓ 64π q q Table 5.2: The first three orders of spherical harmonics

87

1s 1s

P2 R

5 r/a0 15 5 r/a0 15

2p 2p 2 P 2s 2s

R

5 r/a0 15

5 r/a0 15 3d 3d 3p 3p 3s 3s

2 R P

5 r/a0 15 5 r/a0 15

Figure 5.2: Plot of radial wave fuctions R and radial probability density r2R2.

3/2 1,0 Z Zr/a0 R = R = 2e− (5.55) 1,0 r a  0  3/2 Z Zr Zr/2a0 R = 2 1 e− (5.56) 2,0 2a − 2a  0   0  3/2 Z 2 Zr Zr/2a0 R2.1 = e− (5.57) 2a0 √3 2a0     3/2 2 Z 2Zr 2 Zr Zr/3a0 R = 1 + e− (5.58) 3,0 3a − 3a 3 3a  0  " 0  0  # 3/2 √ Z 4 2 Zr 1 Zr Zr/3a0 R = 1 e− (5.59) 3,1 3a 3 3a − 2 3a  0   0  0  3/2 √ 2 Z 2 2 Zr Zr/3a0 R3,2 = e− (5.60) 3a0 3√5 3a0    

∞ 2 Table 5.3: First three orders of normalized radial hydrogen wave functions ( 0 r drR∗R =1). R

88 Conclusion: Angular eigenstates dictate that the electron has quantized angular momentum, with

ψ Lˆ2 ψ = l(l + 1)~2 (5.61) h | | i or l(l + 1), and z-component p ψ Lˆ ψ = m~ =( l, l +1,...,l 1,l)~

5.4.2 Coulomb potential

Specializing to the Coulomb potential

Ze2 V (r)= (5.63) − r

gives the radial equation

~2 d2 l(l + 1)~2 Ze2 R + = E (5.64) −2m dr2 2mr2 − r R R     which has the radial wave function solutions

̺/2 l 2l+1 = C e− ̺ L (̺) with n N, n l +1 (5.65) Rn,l n,l n+l ∈ ≥

with the “Associated Laguerre Polynomials” Ln,l and the dimensionless radial variable

2Ze2m r 2Zr e (5.66) ̺ = 2 = n~ na0

From this expression, it is clear that the Bohr radius is the fundamental scale of the problem. Properties of the Hydrogen atom:

Asymptotics: • n 1 ̺/2 m ̺ − e− for ̺ 1 ψn,l,m(r,θ,φ)= RY ≫ (5.67) r l ∝ ̺l for ̺ 1  ≪ # of Nodes = (n l 1) • − − Symmetry=( 1)l under parity change [~r ~r] • − →− 89 Lower l orbitals have a higher probability of being closer to nucleus. • Energy levels: E = E = 13.6 eV Z2 • nlm n − n2 Energy levels are degenerate in l and m •

This explains the gross energy spacing of the hydrogen line series (as does the Bohr atom). We label states according to their n and l values, as shown in Tab. 5.4.

l=0 1 2 3 n= 1 1s — — — 2 2s 2p — — 3 3s 3p 3d — 4 4s 4p 4d 4f ......

Table 5.4: Nomenclature for single electron states

5.5 Fine Structure

5.5.1 Spin

The discovery of fine structure, the Zeeman effect, and the Stern-Gerlach experiments all required the introduction of the electron spin. While a proper treatment of spin in quantum mechanics requires a relativistic treatment. However, accepting that spin is necessary and using the results from a relativistic calculation for the mag- netic moments, we can follow our treatment of orbital angular momentum and define spin angular 2 momentum operators Sˆ and Sz such that spin eingen states exist with

Sˆ2 ψ = s(s + 1)~2 ψ (5.68) | i | i

and

Sˆ ψ = m ~ ψ (5.69) z| i s | i

Again, ψ are not eigen states of Sˆx and Sˆy.

90 Electrons, protons, and neutrons are Fermions with

1 1 s = and m = (5.70) 2 s ±2

and the two eigenstates

1 0 s 1 α1/2 = and s 1 α 1/2 = (5.71) | 2 i≡ 0 | − 2 i≡ − 1     To build a particle wave function, weCombine spin and space wave functions:

ψ s ψ(r,θ,φ)α 1/2 (5.72) | i| i → ±

Angular momentum operators can generally couple (see Fig. 5.3) to give a new angular momentum operator:

Jˆ = Lˆ Lˆ (5.73) 1 ⊗ 2 with possible values for j of j = l l , l l +1, ...,l + l so | 1 − 2| | 1 − 2| 1 2 1 ˆj = ˆl Sˆ with j = l (5.74) ⊗ ± 2

From the spin operator, we can derive the magnetic moment operator

e µˆ = g Sˆ (5.75) − s 2mc with g =2 for electrons, g =5.6 for protons, g = 3.8 for neutrons ( substructure, i.e., Quarks). − →

N.B.: Combining all three angular momenta, we define a new notation for an electron state: 2s+1 2 2 2 LJ , e.g. P3/2 vs. P1/2 vs. S1/2.

5.5.2 Spin-orbit = Russel-Saunders = LS coupling, fine structure

For sufficiently small nuclear charges (e.g., the Hydrogen atom), the individual angular momentum eigenstates are still good eigenstates after coupling. That is, we can build an electron state from eigenstates to orbital and spin angular momentum and couple them together to form a total angular momentum.

91 Lz for l=2 angular momentum addition: J=1+2

m=2

m=1

m=0

m=-1

m=-2

Figure 5.3: Left: Schematic showing Lz values for l =2; right: angular momentum addition for l1 =2 and l2 =1.

Because of spin-orbit coupling (the interaction of the spin magnetic moment with the B-field in- duced by the electron orbit), the Hamiltonian an contains extra term:

g Z e 2 Hˆ = Hˆ + s SˆLˆ = Hˆ + Hˆ (5.76) 0 2r3 m c 0 1  e  Because this is such a small effect (see discussion above), this is typically treated as a perturbation. In order to treat this term, we use the H0 eigenstates to evaluate the expectation value of Hˆ1 (that is, the change in energy due to the presence of the spin-orbit coupling term):

∆E = ψ H ψ (5.77) n,l,m h nlm| 1| nlmi

This technique for calculating the corrections in energy levels works particularly well for operators that commute with the zero-order Hamiltonian, since they have the same set of eigen-kets and just alter the eigenvalues. Because orbital relativistic effects come in at the same order of magnitude as spin-orbit coupling (fine structure), all must be combined to give

Z2α2 (Zα)2 1 3 E = mc2 1 1+ + O(Zα)6 (5.78) n − 2n2 n j +1/2 − 4n     

Thus: a Dirac (=relativistic) treatment gives energy levels degenerate in j, not l (but the “Lamb shift” lifts that degeneracy — a QED effect).

92 Continuum

2 2 3 D 3 P ⁵⁄₂ 2 ³⁄₂ 32D 3 S½ 2 ³⁄₂ 3 P½ Hα 22P 22S ³⁄₂ ½ 2 Fine structure 2 P½ Lamb shift Lβ

triplet 12S Hyperfine structure ½ singlet

Figure 5.4: Schematic showing fine structure level split.

The result diagram is schematically shown in Fig. 5.4. Note: states with larger j-value have higher energy. Ergo: when spin and angular momentum align for an atom with only a single valence electron, the spin-orbit energy is larger. This makes sense: Two magnetic dipoles are in their minimum energy state when they are mis-aligned (such that the fields cancel each other).

N.B.: We will discuss selection rules later, but note that not all states connect to all other states via radiative transitions. Shown in Fig. 5.4 are only dipole-allowed transitions.

5.5.3 Hyper-fine structure

The nucleus also has spin and thus a magnetic field. For hydrogen, this is simply due to the magnetic moment of the proton. Protons are spin-1/2 particles. Combined with the electron spin, this gives us four possible combi-

93 nations: the triplet states with total spin 1:

↑↑ ψ = 1 [ + ] (5.79) trip  √2 ↑↓ ↓↑  ↓↓ and the anti-symmetric singlet state with total spin 0:

1 ψsing = [ ] (5.80) √2 ↑↓ − ↓↑ giving a hyper-fine energy split of

~µnuc~µe 6 ∆HF 3 =5.9 10− eV λ = 21cm (5.81) ≈ a0 × ↔

which we discussed above. The important addition to our previous discussion is the multiplicity: We have three states with total spin 1 and only one state with spin zero. Thus, the statistical weights g are

gs=1 =3 (5.82) and

gs=0 =1 (5.83)

5.6 Radiative transitions, Selection rules, oscillator strengths, and Einstein coefficients

On of our main motivations to study QM in Astronomy 700 is to derive Quantum corrections (e.g., Gaunt factors, oscillator strengths), selection rules and rates for radiative transitions. We will concentrate on a semi-classical treatment with appropriate corrections. Read Rybicki & Light- man and Shu for more detailed discussions about radiative transitions. The guiding “correspondence principle” behind our approach is this: Generalize classical treatment by replacing variables with operators.

5.6.1 The dipole operator

We will specialize to dipole radiation. Recall the dipole assumptions: λ R and v c. Then, we can instantaneously approximate the electric field across the atom as uniform≫ and disregard≪ all effects of retarded time.

94 From the notes, we know that the power emitted per particle in the classical limit is just

dW 2 d¨ 2 = | | (5.84) dtdN 3c3

Clearly, we will have to replace d with an appropriate expectation value for the dipole operator:

xˆ ˆ d~ d~ = q~rˆ = q yˆ (5.85)   → zˆ   Then,

¨ d2 ˆ d~ ψ d~ ψ (5.86) → dt2 h d| | ui for any two states ψd and ψu that are eigenstates of the unperturbed Hamiltonian (in this case, the hydrogen states).| i | i We are allowed to take the time derivatives of the expectation value because we can neglect effects of retarded time since we are considering dipole radiation. Since we express the eigenfunctions in spherical polar coordinates, we need to express the dipole operator in those coordinates as well:

ˆ 1 iφ iφ 2π 1 1 dx = qr sin θ cos φ = qr sin θ e + e− = qr Y1 + Y1− (5.87) 2 r 3   ˆ i iφ iφ 2π 1 1 dy = qr sin θ sin φ = qr sin θ e− e = qri Y1− Y1 (5.88) 2 − r 3 −   ˆ 4π 0 dz = qr cos θ = qr Y1 (5.89) r 3 which will come in handy soon.

5.6.2 Radiative transitions and selection rules

Next, realize that we can always write the time dependent part of the wave-function for our zero order system (not restricted to the H-atom) as

ψ = ψ (~r)eiEt/~ = ψ (~r)ei2πνt (5.90) | i | i | i For two states ψ and ψ , we then have | ui | di 2 2 d ˆ d i2π(νu ν ) ˆ 2 ψ d~ ψ = e − d ψ (~r) d~ ψ (~r) = (2πν ) d~ (5.91) dt2 h d| | ui dt2 h d | | u i − ud ud   95 where we defined the dipole matrix element

ˆ d~ ψ d~ ψ (5.92) ud ≡h d| | ui and the frequency of the transition between state i and f,

E E ν = ν ν = u − d (5.93) ud u − d h

which is positive for Eu >Ed. Finally,

2 d ˆ 4 d~ 2 = (2πν ) d2 (5.94) |dt2 | ud ud

where we define

d2 = ψ dˆ ψ ∗ ψ dˆ ψ (5.95) ud h d| j| ui h d| j| ui j=x,y,z X  

Note: we need to sum over both photon spin states in this approach, which introduces a factor of 2 to the Larmor formula (this is actually a subtle quantum electrodynamics correction which we will not have time to go into here). The emitted power per particle is then:

dW 2 2 d¨ 2 4(2πν )4 = | | = ud d2 (5.96) dt 3c3 3c3 ud i=1 X

Note: If we were to switch states i and f, such that νud < 0, we would have to consider time reversal to properly interpret this. This would lead directly to the principle of detailed balance, which we will use below to calculate the different Einstein coefficients.

5.6.3 Selection rules

Equation (5.96) contains the dipole matrix elements dud which connect different energy states in an atom. Since we can express the components of the dipole operator as combinations of the first or- der spherical harmonics, these matrix elements are related to the so-called 3-J symbols and the

96 Clebsch-Gordan coefficients, which are simply tabulated integrals over combinations of spheri- cal harmonics to make calculation of matrix elements easier. Generally, because spherical harmonics are all orthogonal to each other, it is clear that some of the dipole matrix elements are going to vanish. For example, consider two l =0 S-states of hydrogen, nS and mS . | i | i The dipole matrix element of this transition would be

∞ 0 0 nS dˆ mS = dr ∗ dΩY dˆ Y (5.97) h | j| i Rn,0Rm,0 0 j 0 Z0 Z We can express one of the two angular wave functions as

0 1 Y0 = (5.98) r4π

ˆ 1,0,1 and, since the dj are all linear combinations of Y1− , we can substitute:

∞ 1 2π 0 1 1 nS dˆ mS = q dr ∗ r dΩ Y Y + Y − =0 (5.99) h | x| i Rn,0 Rm,0 4π 3 0 1 1 Z0 r r Z  with equal null-results for dˆy and dˆz. Thus, the s to s transitions in hydrogen are dipole-forbidden:

d2 msns =0 (5.100)

In the case of the one-electron atom one can show generally that the following dipole transition rules hold strictly:

∆l = 1 (5.101) ∆m =0± , 1 ±

Note: The photon angular momentum is ~, which explains the ∆l = 1 and the ∆m 1 rule (a photon cannot carry more than one unit of ~ of angular momentum in± any given direction).≤ In cases where electric dipole radiation is forbidden, we can follow a similar argument to arrive at the conditions for magnetic dipole and electric quadrupole radiation. In the case of magnetic dipole radiation, we have to include the magnetic dipole moment of the electron spin, which is why it can induce spin flip hyperfine transitions.

97 5.6.4 Einstein coefficients and oscillator strengths

Note: If there is multiplicity (i.e., one or both states are degenerate), one has to average over all possible upper states and sum over all final states to get the total emission at the desired energy. This introduces the statistical weights gu and gd.

The total power per volume emitted by particles in the i-state (with number density nu), transition- ing into the f-state, is then

dW 64π4ν4n d2 = u ud ud (5.102) dV dt 3c3 g P u Recall the definition of the Einstein A coefficient from above:

dW = hνA n (5.103) dV dt ud u

which immediately yields the desired expression:

64π4ν3 d2 A = ud ud (5.104) ud 3hc3 g P u The Einstein relations then give the other two coefficients:

32π4 d2 B = ud ud (5.105) ud 3h2c g P u and the Einstein coefficient for absorption is:

32π4 d2 B = ud ud (5.106) du 3h2c g P d

From section 1, recall that the relation between absorption coefficient and Bdu is

hν α = σ n = n B φ (5.107) ν|abs ν d 4π d du ν and the absorption cross section is σν = αν/nd:

hν 8π3ν d2 σ = B φ = ud ud φ (5.108) ν 4π ud ν 3hc g ν P d

98 which we compare to the classical expression

πe2 σν = φνfud (5.109) mec

to identify the oscillator strength

8π3ν d2 m c 8π2m ν d2 f = ud ud e = e ud ud (5.110) ud 3hc g πe2 3he2 g P d P d which measures the

5.6.5 Natural line broadening

The Einstein coefficients imply a time constant for the radiative decay from the upper state (a life time to radiative de-excitation)

γu = Aud (5.111) Xd and similarly, a life time for the lower state due to its own radiative de-excitation

γd = Add′ (5.112) Xd′ The uncertainty relation links the transition energy and the transition time scale by

∆E∆t ~ (5.113) ∼ so we would expect

∆E ~ γ + γ ∆ν = u d (5.114) h ∼ h∆t ∼ 2π

while comparison with the classical atom suggests the broadened line profile function

γ/4π2 φ = (5.115) ν (ν ν )2 +(γ/4π)2 − 0 with a FWHM of

γ γ + γ ∆ν = = d u (5.116) FWHM 2π 2π

99 6 Multi-Electron Atoms

The Hamiltonian for a multi-electron atom must be a combination of the Hamiltonian for each in- dividual electron in the central potential and the interaction terms between different electrons. The latter includes the Coulomb interaction of electrons with each other, and the spin-orbit coupling of the electron distribution.

Pˆ2 Ze2 e2 σˆ ˆl dV Hˆ = i + i i + ... (6.1) 2m − r r − m2r c2 dr i e i i i,j ij i e i i X X X X The Ansatz behind discussing multi-electron atoms is based on the principle of building up multi- electron configurations by placing electrons in single electron orbitals (i.e., hydrogen-electron or- bitals), which is essentially an exercise in perturbation theory. Important: Complete l-shells are spherically symmetric and just add to the central Coulomb po- tential. Thus, we only need to consider partially filled shells. Electrons in partially filled shells are called valence electrons.

6.1 Notation & Nomenclature

l-sub-shells: n-shells: l =0123 and n =123 4 spdf K LM N

Ionization: element symbol + roman numeral of (missing electrons + 1) Example: Neutral Hydrogen=HI, ionized hydrogen=HII, neutral iron=FeI, fully ionized iron=FeXXVII. Alternative nomenclature: superscript ionization state Example: H0=HI, H+=HII, O+3=OIV

An iso-sequence is a series of different elements with identical number of electrons, increasing in ionization with increasing nuclear charge Z. Example: The Hydrogen isosequence: HI, HeII, LiIII,... ,FeXXVI Example: The Helium isosequence: HeI, LiII,... ,FeXXV Why is this useful?

The level structures for ions of an iso-sequence are mostly idential • Wavelengths, energies, and other properties vary as smooth functions of nuclear charge Z • Cosmically abundant elements: H, He, C, N, O, Ne, Mg, Si, S, Ca, Fe

100 Φ(r) Q(r) a0/Z a0 r Outer electrons 2 En ~ Ry /n Charge cloud

Φscreened

Inner electrons 2 2 r En ~ Z Ry /n

Figure 6.1: Charge cloud screening outer electrons from full nuclear charge, leading to a reduced effective potential and making s-electrons generally more strongly bound.

The inner electrons see full nuclear charge, outer electron see screened potential (see Fig. 6.1).

N.B.: S-electrons are closer to nucleus on average, thus have higher binding energy.

6.2 Generalized Pauli Principle

We construct wave functions from products of individual orbitals:

Ψ = ψ ψ ψ (6.2) | i | ai1| bi2 ···| kiN But quantum particles are indisinguishable, so we must use all possible permutations of particles into different orbitals by superposition. The resulting total state is a combination of space and spin wave functions:

Ψ = Ψ Ψ (6.3) | i | spacei| spini The Pauli exlusion principle states: No two Fermions can occupy the exact same state. total wave function product must be anti-symmetric under particle exchange, Pˆ Ψ= Ψ → exch − where Pˆexch is the parity under exchange operator.

Symmetric spatial wave function anti-symmetric spin Anti-symmetric spatial wave function↔ symmmetric spin ↔

This guarantees that no two fermions occupy the same state (they would if the wave function were symmetric, i.e., the same under particle exchange)

101 Formally, this anti-symmetric superposition is achieved using the so-called Slater determinant:

ψa(1) ψa(2) ψa(N) ψ (1) ψ (2) ··· ψ (N) b b ··· b 1 Ψ= · (6.4) √ N! ·

· ψ (1) ψ (2) ψ (N) k k ··· k

Note: For simple electron configurations (e.g., 1,2,4,5 valence electrons), the symmetry of the space part of the wave function is given by the total orbital angular momentum quantum number L:

Pˆ Ψ =( 1)LΨ (6.5) exch space − space

and, for the spin part of the wave function:

Pˆ Ψ = Ψ (6.6) exch spin ± spin

with + for a symmetric and for an anti-symmetric spin state. The symmetry of the spin state has to be worked out in a straight− forward but tedious way of combining all possible spin m combinations. Example: Coupling the spin of two electrons gives the two well know triplet and singlet states: σ σ | i1 ⊗| i2

+ 1 + 2 | 1 i | i Ψ (S =1)= ( + 1 2 + 1 + 2) (symmetric) (6.7) | spini  √2 | i |−i |−i | i  |−i1|−i2  and

1 Ψspin (S =0)= ( + 1 2 1 + 2) (anti symmetric) (6.8) | i √2 | i |−i − |−i | i −

The generalized Pauli principle is responsible for the absence for certain “missing states” in multi-electron atoms. A critically important consequence of this is that electrons with like (=aligned) spin must, on average, be further away from each other than electrons with opposite (mis-aligned) spin since the electrons cannot be too close to each other. This means that their Coulomb repulsion force on each other is lower, reducing their Coulomb energy. This “exchange force” therefore reduces the energy of electrons with aligned spin. We will discuss this more in 6.4. § 102 6.3 L-S coupling

Consinder only the valence electrons and couple their angular momentum and spin seperately (this is appropriate for multi-electron atoms with low to moderate nuclear charge Z):

S~ = ~σi and L~ = ~li (6.9) i i X X

Then couple the individually coupled orbital and spin angular momenta, L~ and S~, to obtain the total coupled angular momentum

J~ = L~ S~ (6.10) ⊗ which once again behaves like a quantum mechanical angular momentum with eigenvalues J(J + 1) to Jˆ2 and m J to Jˆ . | J |≤ z The possible values for J are, as usual, J = L S , L S +1,...,L + S. | − | | − | ˆ ˆ S~ and L~ commute, so operationally:

ˆ ˆ ˆ ˆ ˆ ˆ Jˆ2 = L~ + S~ L~ + S~ = Lˆ2 + Sˆ2 +2L~ S~ (6.11) · ˆ ˆ 1    L~ S~ = Jˆ2 Lˆ2 Sˆ2 (6.12) ⇒ · 2 − −   which means that we can express the energy shift associated with L-S coupling as

σˆ ˆl dV i i (6.13) ∆ELS = Ψ Σi 2 2 Ψ h | meric dri | i 1 dV ˆ ˆ = Ψ Σ L~ S~ Ψ (6.14) h | i m2c2r dr · | i  e i i  1 dV 1 = Ψ Σ [J(J + 1) L(L + 1) S(S + 1)] Ψ (6.15) h | i m2r c2 dr 2 − − | i  e i i    = C [J(J + 1) L(L + 1) S(S + 1)] (6.16) nili − −

where Cni is the radial integral. Thus, for a given L and a given S, combinations with higher J value have higher energy. This is called the Lande´ interval rule.

6.4 Energy split (Hundt) rules

There is a hierarchy in how the energy of an electron state depends on the quantum numbers, due to the fact that the Coulomb terms are so much stronger than the spin-orbit terms (which themselves

103 ³⁄₂ Li I 6708 Å 1s² 2p ²P ¹⁄₂ C IV 1549 Å average wavelength N V 1240 Å OVI 1035 Å 1s² 2s ²S ¹⁄₂ Fe XXIV 192 Å/235 Å fine structure splitting

configuration term level

Figure 6.2: ground and first excited states in the Li isosequence

are much stronger than spin-spin terms, which we have neglected). From largest to smallest energy correction, these are: A) The exchange force due to Pauli Exclusion principle is strongest: Electrons with aligned spin are, on average, further apart than electrons with opposite spin = exchange Coulomb energy is lower for larger total spin. ⇒ Note: This is opposite of what one might naively expect for two magnetic dipoles (the spin-spin energy is lower when two electron spins are misaligned). However, electron Coulomb repulsion completely dominates the spin-spin coupling. B) Next in line is the residual Coulomb repulsion between electrons after taking the exchange force into account. For electrons with aligned/misaligned spin, the terms with larger L value are lower in energy, because their electrons are on average further away from the nucleus and thus, again, further away from each other, lowering their Coulomb repulsion energy. C) Finally, for given L and S values, the L-S-coupling fine-structure split gives higher energies for higher J values by the Lande´ interval rule from 6.3. § 6.5 Nomenclature, spectroscopic jargon, selection rules

We will use CIV as an example. CIV is a three electron ion. Two electrons fill the 1s shell, so CIV has one valence electron. The ground and first excited state are 1s22s: Ground state 1s22p: Excited state We distinguish the following nomenclature:

1. “Transition array”: electron configurations (how many electrons are in which shells?), i.e., (1s22s 1s22p) − 2. The “multiplet” of the valence electrons, called “terms” (what are the S and L values for the

104 combined valence electrons?), i.e., (2S 2P). This determines the gross energy splits, based on electron repulsion. − 3. The “level” (what is the J value for the combined valence elctrons?), i.e., (J = 1/2; J = 3/2). This is important for spin-orbit coupling. 2S+1 4. Putting the last two together, we label terms by LJ , which now makes more sense than in the Hydrogen atom, where we only have one electron spin to consider.

Recall that spin 1 combinations are triplets, spin 0 combinations are singlets, and spin 1/2 elec- trons are, naturally, doublets. Permitted (dipole allowed) resonance/sub-ordinate lines: Electric dipole transitions are called resonance and sub-ordinate lines. We have the following selection rules for dipole transitions:

∆S =0 (no spin flip) • ∆L =0, 1 (angular momentum) • ± ∆l = 1 (only one electron “jumps”, for all others, ∆l =0) • ± ∆J =0, 1 (but 0 0) • ± 6→ ∆M =0, 1 (but 0 0 if ∆J =0) • J ± 6→

Resonance lines: Resonance lines are dipole transitions into the ground state (e.g., Hydrogen Lyman series). In the ISM, typically only the ground state is populated appreciably resonant absorption lines are strong! ⇒ Resonance lines typically lie in UV/soft X-ray. As a consequence, one must use UV/X-ray spec- trometers to study low density gas in absorption. Subordinate lines are transitions between two excited states (e.g., Balmer series).

Forbidden lines: Electric dipole “forbidden” lines can sometimes occur via magnetic dipole radiation, which can induce spin flip transitions: ∆S = 1, or via higher order multipoles (e.g., electric quadrupole). ± Electric quadrupole emission can cause ∆J =0, 1, 2. ± ± These “forbidden” lines appear in emission usually after the upper state is excited by an electron or atomic collision

5 3 They require low density environments (n < 10 cm− ) to avoid collisional de-excitation before spontaneous emission of a photon (see 8.2). § 105 ¹S ¹S₀ S=0 ¹P ¹P₁ missing (singlet) ¹D ¹D₂

2p²

³S ₂ ³S₁ missing S=1 ₁ ³P ₀ ³P₀₁₂ ₂₃ (triplet) ³D ₁ ³D₁₂₃ missing

unperturbed Spin-spin correlation Residual Coulomb Spin-orbit coupling configuration energy (like-spin electron energy (higher L electron are further apart from configurations further each other) apart from each other)

Figure 6.3: Term diagram for 2p2 configuration.

Notation: [OIII]

Intercombination lines: Intercombination lines are lines for which the multiplicity of the atom changes (i.e., ∆S = 0). These are strictly forbidden by E1 (electric dipole radiation) under LS coupling. However,6 for heavy nuclei, LS coupling itself is broken, thus allowing the electric dipole operator to couple two states with different S, though with very small matrix elements. Such transitions are stronger than strictly forbidden ones. Semi-forbidden lines are intercombination lines that have a relatively higher A value than strictly forbidden lines. They typically have large energies (high ν), giving them larger A values (since A ν3) and go by magnetic dipole transitions (M1, ∆S = 1, ∆L = 1), and they radiate at a rate∝intermediate between permitted and forbidden lines. ± ± Notation: CIII]

Single-photon transitions: For any transition occuring by single photon emission, the selection rule J = 0 0 is strictly observed. Such transitions can only occur by two-photon decay, which is rare and does6 ⇒ not give an emission line, producing continuum emission instead (with the sum of the photon energies equal to the transition energy).

106 [OIII] (Carbon isosequence) ¹S₀ Forbidden “Nebulium” lines 1s²2s²2p² by M1 or E2.

4363Å

Å Note: ¹S₀ ³P₀ (J=0 0) ¹D₂ 331 2 2321Å

Å 5007Å 4931 Å

4959 ³P₂ 51.7 µm IR fine structure lines ³P₁ 88 µm ³P₀ (seen in nebulae/atomsphere)

Figure 6.4: Forbidden and fine structure lines for OIII.

Rough distinction in A-values:

1 Type of transition : example λc typical A21(s− ) Resonance (dipole allowed) : H0 1216A˚ 108 1013 − (6.17) Semi forbidden CIII] 1909A˚ 10 104 − −5 Forbidden [OIII] 5007A˚ 10− 1 − 6.6 Term diagrams

Term diagrams are plots of the energy levels of different electron configurations of a multi-electron atom (e.g., recall the term diagram for the hydrogen atom from Fig. 5.4). They allow easy, qualitative analysis of line multiplicity, energy spacing/ordering, and whether lines are allowed or forbidden. Using the Hundt rules, constructing term diagrams is easy for 1 (H), 2 (He), 3 (C IV), and 4 (C III) valence electrons, hard for 5 (better to look up), and easy again for 6 and 7 valence electrons. For one valence electron, the term diagram simply follows the hydrogen energy levels (see Fig. 5.4). For two valence electrons, we have to couple the electrons in such a way that they obey the gener- alized Pauli principle. Two s-electrons: Coupling two s-electrons is easy: If they are in the same n shell, their spins have to be anti-aligned, thus only a singlet state is allowed.

107 [OII] (Nitrogen isosequence)

²P ¹⁄₂ 1s²2s²2p³ ²P ³⁄₂

0Å 8Å Again: J-order of multiplet inverted

9Å 1Å

733 731 (higher J have lower energy) 731 733

²D ³⁄₂ ²D ⁵⁄₂

726Å 729Å 3 3 very important pair of lines (ne diagnostic in nebulae)

⁴S ³⁄₂

Figure 6.5: Forbidden and fine structure lines for OII.

For a configuration with three to 7 electrons in the valence shell (e.g., 2s22p) we can ignore the two s-electrons since they form a spherically symmetric charge cloud and concentrate only on the p- or higher l electron.

One p-electron: For one p-electron (e.g., 2p), we again follow the p orbital levels for the hydrogen atom.

Two p-electrons: Example: Ions in the Carbon isosequence (e.g., 2p2) For two p-electrons (see Fig. 6.3), we must couple the spins and orbital angular momenta: Possible total spin: S =0 or 1, so (2S +1)=1 or 3. Possible total orbital angular momentum: L =0, 1, or 2. Possible total angular momentum: J =0, 1, 2, or 3.

1 1 1 3 3 3 This gives the possible states: S0, P1, D2 and S1, P0,1,2, D1,2,3. For identical electrons (e.g., 2p2, but not 2p3p), we must consider only the Pauli-allowed combi- nations among those states: For S = 0, the spin wave function is anti-symmetric, so we must have a symmetric spatial wave function. From the Generalized Pauli Principle, it turns out that even total L configurations are symmetric and odd total L configurations are anti-symmetric.

108 [OI] (Oxygen isosequence) ¹S₀ Note similarity to Carbon isosequence 1s²2s²2p⁴ except J-order of multiplet inverted

5577Å Note: ¹S₀ ³P₀ (J=0 0) ¹D₂

2972Å 2958Å

Å 6391Å 6300 Å

6364 ³P₀ 147 µm IR fine structure lines ³P₁ 63µm ³P₂

Figure 6.6: Forbidden and fine structure lines for OI.

This leaves L = 0 and L = 2 as Pauli-allowed states, and since S = 0, we have J = L and the 1 1 terms are S0 and D2. For S = 1, the spin wave function is symmetric, so we must have an anti-symmetric spatial wave function, which leaves L =1. The possible J values are now J = L S,L,L + S or J =0, 1, 2. − The J-sub levels are split according to LS-coupling, so that higher J-values have higher energies.

Three p-electrons: Example: Nitrogen isosequence (e.g., 2p3) This is done by coupling (p p) p. This exercise is too tedious to derive in class. We just quote 4 2 ⊗ ⊗ 2 the allowed terms: S 3 , P 1 3 , and D 3 5 . The term diagram for such a configuration is sketched 2 2 , 2 2 , 2 in Fig. 6.5.

More than three p-electrons: Example: Oxygen isosequence (2p4) and Fluor isosequence (2p5) Consider the shell full and then take away two missing electrons (as if the two missing electrons were there but had a positron attached to them that would cancel their charge). The complete l-shell would be spherically symmetric, so it would be sufficient to consider the interaction between the two canceling charges. The only thing that changes are the LS-coupling terms, since the positive canceling charges have an opposite spin magnetic moment. This leads to an inverted level population for the J-states (the last step in making the term diagram).

109 7 Molecular spectra

Molecular structure is the topic of chemistry and can be very complex. However, many astrophys- ical molecules are simple and we can gain significant insight from simple considerations. Molecular bonds come in three different types:

+ Ionic bond (e.g., Na Cl− ... very polar) • Covalent (e.g., CO ... shared valence electrons) • van der Waals (hydrogen bonds ... weak bonding) • Molecules are composed of multiple distinct nuclei and a cloud of shared electrons. The additional terms to the Hamiltonian due to electron-electron and electron-nuclear interactions are the main cause for the complexity of molecular physics (among other things, the nuclear poten- tial is no longer spherically symmetric, thus the angular momentum eigenfunctions are no longer spherical harmonics). Simplification: “Born-Oppenheimer” approximation — assume that electrons move much faster than nuclei and calculate electron wave functions considering the nuclei fixed. Then treat the slow motion of the nuclei as perturbations. For simplicity, and because this is sufficient for many astrophysical scenarios, we will consider only di-atomic molecules. In this case, the molecule still has a symmetry axis, which simplifies matters significantly: The nuclei can then rotate around two axes perpendicular to the inter-nuclear axis, and vibrate along the inter-nuclear axis. This makes for three additional degrees of freedom (on top of the three translational degrees of freedom). This is important for the thermodynamics of gas because the ratio of specific heats depends on the number of degrees of freedom available. For a di-atomic molecule, the maximum number of degrees of freedom is six, corresponding to an adiabatic index of γ =1+2/n =4/3.

7.1 Molecular energy level hierarchy

The electron-nuclear split suggests an energy hierarchy:

A) Electronic states: The energy Eel involved in electronic transitions is comparable to that in atoms, of order a few eV,

e4m e4m2 ~2 ~2 E Ry = e = e = few eV (7.1) el ~2 ~4 2 ∼ 2 2me 2mea0 ≈ ×

The electrons bind the two nuclei by screening the Coulomb repulsion between them. The higher the electron probability density between the nuclei, the stronger the bond.

110 Figure 7.1: Non-electronic modes of molecular structure: rotation (left) and vibration (right) in a di-atomic molecule.

Therefore, in a bound molecule, the electrons have to spend a good fraction of their time between the nuclei, and thus the typical size of the electron cloud is of the order of the nuclear separation. Since the molecule has to be in a lower energy state than the two individual atoms it associated from, the separation of the nuclei should be of the order of, but smaller than the original size of atoms, so of the order of a0. Since the electrons move so rapidly compared to the nuclei, we can think of the electron cloud as a smooth charge distribution. To lowest order, it is instructive to consider a simple spherical cloud with radius R. Then, the potential due to the electrons is quadratic in radius until the edge of the cloud, outside of which it is proportional to 1/r. The potential seen by the nuclei is then the superposition of the Coulomb repulsion of the two 1 nuclei (which goes like D− ) and the Coulomb attraction between the nuclei and the electron cloud. For a bonding orbital, there must be a minimum to this curve, with negative energy. This is the equilibrium inter-nuclear separation D0. The potential is then approximately:

~2 2 φ(D) 4 (D D0) const. (7.2) ∼ meD0 − −

This is shown in Fig. 7.2. For different electronic states, the electron distribution will be different (with excited states having larger electron clouds), thus the equilibrium nuclear separation will be larger for excited electron states.

111 0.5

Nuclear repulsion

φ 0.0 Effective potential

Electron potential −0.5

0 1 2 3

Nuclear separation D [D eq ]

Figure 7.2: Sketch of the effective potential along the inter-nuclear axis of a diatomic molecule.

B) Vibrations: Around D0, φ can be approximated as parabolic to lowest order, making the molecule a quantum mechanical harmonic oscillator, with a spring constant of

~2 k = 4 (7.3) meD0

The oscillators are the two nuclei. The oscillation frequency is

k ω0 = (7.4) sµ

where µ is the reduced mass of the molecule,

mamb µ = 2mnuc (7.5) ma + mb ∼

Recall from basic quantum mechanics that the energy Ev of a quantum mechanical harmonic oscillator is

2 ~ me Evib = ~ω0(v +1/2) Ee 0.03Ee 0.1 0.5 eV (7.6) ∼ √m m D2 ∼ m ∼ ∼ − e nuc r nuc with eigenvalues v =0, 1, 2,... and with equal energy spacings. Thus, vibrational energies are about a factor of m /m 40 below electronic energy levels. e p ∼ p 112 C) Rotations: We must consider quantized angular momentum for the molecular rotation, with quantum number K (often also called N):

2 2 Lˆ Ψm = ~ K(K + 1) (7.7)

Now recall that the kinetic energy of a rigidly rotating barbell (rotating around the center of mass) with angular velocity Ω is

1 1 1 E = m v2 + m v2 = IΩ2 = L2 rot 2 1 1 2 2 2 2I 2 2 L ~ K(K+ 1) me (7.8) 2 = 2 Ee ∼ 2µD 2µD ∼ mnuc

So the rotational states are lower in energy than vibrational states by a factor of about 40, and lower than the electronic states by a factor of about 2000. Their energy spacing is quadratic in K.

The energy hierarchy is thus

m m E e E e E (7.9) rot ∼ m vib ∼ m el r nuc nuc 7.2 Electronic states

We can construct electron orbitals from superpositions of single electron orbitals, following the Pauli exclusion principle. For the simplest case, H+, two ground state orbitals are possible, bond- ing (ψ =(ψ + ψ )/√2) and anti-bonding (ψ =(ψ ψ )/√2). 1 2 1 − 2 Electron angular momentum: The nuclear potential is no longer spherically symmetric, so l is no longer a good quantum number. For diatomic molecules, there is still a symmetry axis. Thus, while we can no longer separate radial and angular part of the electron wave function, we can use cylindrical coordinates and separate out the angular part. This suggests that the z-component of the angular momentum still has eigenvalues of m~, so m is still a good quantum number. The z-component (along the inter-nuclear axis) of l of molecular electron orbits is called λ. The same holds for the electron spin: s is no longer a good quantum number, but s σ is. z ≡ Note: molecular quantum numbers are often expressed as the Greek letter of the atomic equivalent.

113 Covalent bond

electron cloud

+ + Internuclear axis

el ρ bonding Internuclear axis

anti-bonding

Figure 7.3: Sketch of the electron charge density (top) and two possible super-positions of single atom orbitals for a bonding and an anti-bonding orbital.

We can label the individual electron states by their λ value, similar to the atomic scheme (but using greek letters instead):

λ = 0 1 2 3 4 term σ π δ φ γ

Each electron contributes to the total angular momentum. Depending on the mass of the nuclei, we can have the equivalent of LS coupling: For light nuclei, the individual λ couple to make Λ, and the individual σ couple to Σ. Each of these two angular momentum vectors precesses around the inter-nuclear axis. Then, the total z- component of the electron angular momentum is

Ω=Λ+Σ (7.10)

In this case the nomenclature is similar to the atomic case:

(2Σ+1) 2 2 + 3 ΛΩ;π e.g. Σ0,1;g (H2), Π0,1,2;g (O ) Σ 1 3 (O2) (7.11) 2 2 , 2 ;g

114 Ω and the rotational J angular momentum K couple to the total angular momentum J and precess around this axis.

K

spin and orbital angular momenta couple and precess around inter-nuclear axis and couple to Λ and Σ, Λ Σ then to Ω Ω

Figure 7.4: Spin-orbit coupling scheme for molecules: orbital angular momenta of electrons couple together and form L which precesses around the molecular axis, and the spins couple to S, again precessing around the axis. Only the components along the axis are constant and add up to Ω=Λ+Σ, which couples with the molecular rotation K to form the total angular momentum J (K and Ω are perpendicular). Both K and Ω precess around J.

For heavy molecules, the λ and σ couple first to an electron Jel, which precesses around the molec- ular axis, such that Ω= Jel,z Finally, for homo-nuclear molecules, an important consideration is whether the electron orbital is even or odd under parity inversion. For hetero-nuclear molecules, there is no center of symmetry, so this is not important. Even states are labeled with a sub-script g (for “gerade”, German for even) and odd states are labeled with a u (for “ungerade”, German for odd).

7.3 Herzberg notation and corrections

The molecular energy hierarchy suggests that we separate the individual contributions and write the energy of a molecular state as the sum of the electronic state energy, the contribution from a quantum harmonic oscillator, and the quantum mechanical energy of the molecular rotation.

~2K(K + 1) E(v,K)= V + ~ω (v +1/2) + (7.12) 0 0 2I

115 Mol. term we xe ye Be De αe R0 (A)˚ 1 H2 Σg 4401.21 121.33 0.812 60.853 0.0471 3.062 0.74144 1 6 CO Σg 2169.81 13.288 0.0105 1.9313 6.12 10− 0.0175 1.12832 3 × 6 O2 Σg 1580.19 11.98 0.0474 1.43767 4.84 10− 0.0159 1.21752 2 × 6 NO Π 1 3 1904.20 14.075 0.011 1.67195 0.5 10− 0.0171 1.15077 2 , 2 1 × 6 N2 Σg 2358.57 14.324 -0.0023 1.99824 5.76 10− 0.0173 1.09768 2 × 4 OH Π 1 , 3 3737.76 84.881 0.540 18.910 19.4 10− 0.7242 0.96966 2 2 ×

Table 7.1: Herzberg correction parameters for vibrational and rotational energies (Huber & Herzberg 1979).

where the subscript 0 means that the energy is evaluated at the equilibrium distance between the nuclei (potential minimum). At higher excitations, this model breaks down:

The potential that we described as quadratic is not really quadratic, i.e., it becomes “anhar- • monic”, thus the simple solution of the quantum mechanical oscillator goes bad. When the molecule rotates rapidly enough, the centrifugal force stretches out the molecule • and changes the moment of inertia and the spring constant of the oscillator.

We can Taylor-expand the energy to higher orders and write

E(v,K) = V + w (v +1/2) x (v +1/2)2 + y (v +1/2)3 + ... 0 e − e e + B K(K + 1) D K2(K + 1)2 + ... (7.13)  v − v    with

B = B α (v +1/2) + ... (7.14) v e − e Dv = De + ... (7.15)

1 These constants are typically expressed in wave numbers (λ− ), since E = hc/λ. Table 7.1 lists these constants for a few astrophysically important molecules.

7.4 Pure rotational spectra

Transitions involving only a change in rotational quantum number K are low energy transitions, typically in the far IR to radio. Selection rules for electric dipole radiation are straight forward:

d~ =0: molecule must have finite dipole moment • 6 116 ∆K =∆J = 1: emission • − ∆K =∆J =1: absorption •

The first rule implies the homonuclear molecules like H2 cannot radiate dipole allowed transitions. They can, however, radiate quadrupole lines, with ∆J =0, 2. ± Line energies: The energies are E(K) = BvK(K + 1), so the line energies are simply the difference:

∆E = E(K) E(K 1) = B 2K (7.16) − − v and for quadrupole lines:

∆E = E(K) E(K 2) = B 2(2K 1) (7.17) − − v − 7.5 Pure vibrational spectra

Are not really relevant, as K (and thus J) will change too in at least part of the cases, but they are instructive to consider. Selection rules:

d(d~)/dR =0 • 6 For a true quantum harmonic oscillator, a strict rule of ∆v = 1 applies. Anharmonicities • in the potential break this rule, however. ±

Line energies: The energy levels are E(v)= ~ω0(v +1/2) for low v, so

∆E = E(v′) E(v)= ~ω (v′ v)= ~ω (7.18) − 0 − 0

for ∆v = 1, which is degenerate. This degeneracy is broken by the anharmonicity, however. − 7.6 Rotational-vibrational spectra

Both v and K change within the same electronic state.

Generally, since ∆Ev ∆EK , the molecule will be in any number of rotational states for each vibrational transition, so≫ the rotational spectrum, which is evenly spaced in energy, will be super- imposed on the vibrational spectrum.

117

O branch P branch R branch S branch K’

}

} } } 4

v’ 3

2 1 0

K’’ 4

v’’ 3

2 1 0

P(5) P(4)P(3) P(2) P(1) R(0) R(1)R(2) R(3) R(4)

ω O(4) O(3) O(2) S(0) S(1) S(2)

Figure 7.5: Rotational-vibrational transition for a Λ=0 state.

Electric dipole selection rules:

d~ =0 • 6 d(d~)/dR =0: The dipole moment must change between states • 6 ∆v = 1: for pure harmonic oscillator • ± ∆K = 1 for Λ=0 • ± ∆K =0, 1 for Λ =0 • ± 6

Emission line notation: A given band is written as

O(K′′) for∆K = K′′ K′ = 2 (quadrupole only) − P (K′′) for∆K = K′′ K′ =1  − (v′ v′′) Q(K′′) for∆K = K′′ K′ =0 (7.19) −  −  R(K′′) for∆K = K′′ K′ = 1  − − S(K′′) for∆K = K′′ K′ = 2 (quadrupole only) − −    118 Figure 7.6: Fortrat diagram of the rotational band structure for an electronic transition.

where v′ is the upper and v′′ the lower vibrational quantum number and K′ and K′′ the upper and lower rotational quantum number, respectively. Example: (1 0)R(1) means (v =1,K = 2) (v =0,K = 1). − → Dipole spectra: Rotational-vibrational dipole transitions have three branches: P , Q, and R, except for Λ=0 (which is true in most ground states). Thus, most ground state molecules only have two branches of dipole-allowed rotational-vibrational transitions, P and R.

Quadrupole spectra: It turns out that quadrupole transitions are very important for homonuclear molecules, as they have no dipole moment. The selection rules for quadrupole transitions are

∆K = K′′ K′ =0, 2 for Λ =0 • − ± 6 ∆K = 2 for Λ=0 • ± Branches: For hetero-nuclear molecules, dipole radiation dominates, so the P, Q, and R branches dominate (Q only exists for Λ = 0), but for homonuclear molecules, only quadrupole radiation is allowed, so no P and Q branches6 exist.

7.7 Electronic transitions

At the highest transition energies (optical/UV), quantum numbers n, v, Λ, and J can all change. Recall that Λ is the projection of the electrons’ orbital angular momentum on the molecule’s sym- metry axis and J is the molecule’s total angular momentu, This implies that, for a given electron state with quantum number Λ, we know that J Λ . ≥| | 119 The molecule now has changing angular momentum components from the nuclei and the electrons, as sketched in Fig. 7.4 and acts as an axially “symmetric top” (precession, nutation, etc.) The quantum energy levels for the symmetric top are

~2 1 1 E(J, Λ) = J(J +1)+Λ2~2 (7.20) 2I 2I − 2I n  e n  where Ie is the moment of inertia for the electrons and In Ie is the nuclear moment of inertia. Including the Λ2 term, the lowest order energy levels are: ≫

~ ~ ~2 2 E(n,v,J, Λ) = Vn + ωn(v +1/2) + αn J(J +1)+ γn Λn (7.21)

Note: For ground states, Λ is usually 0, in which case the rotational energy depends only on J.

Selection rules:

∆Λ=0, 1 (electron angular momentum must change by 0, 1) • ± ± ∆J =0, 1 (but J =0 0 and ∆J =0 for Λ=0 0) • ± 6→ 6 → ∆v =anything (no selection rule for vibration) • Notation: As for rotational-vibrational transitions we have

O(J ′′) for∆J = +2 P (J ) for∆J = +1  ′′ [A B](v′ v′′) Q(J ′′) for∆J =0 (7.22) − −   R(J ′′) for∆J = 1  − S(J ′′) for∆J = 2 −   A band is a progression of J-fine structure superposed on a (n′,v′) (n′′,v′′) transition. Since the moment of inertia and spring constants change with the transition,→ the “band energy” is

∆E = E + ~(v′ω v′′ω )+ H(J ′)+ ... (7.23) nvJ n′,n′′ n′ − n′′ The Q-band is missing for J =0. This is the “center” of the band, so the energy of the Q-band is called the “band origin”:

ω ω +(v′ω v′′ω ) (7.24) e ≡ n′,n′′ n′ − n′′ and is used to identify the origin of the J-series in a spectrum. Notice the fact that the quadratic J–term in H no longer cancels, because the two moment of inertia are no longer the same in the two states. This means that the rotational energy spacings are no longer equidistant and the Q-branch is no longer at a single energy. This is shown in the “Fortrat” diagram in Fig. 7.6

120 7.8 Nuclear spin

Nuclear spin I is only important for homonuclear molecules. In that case, the two nuclei are indistinguishable. Consequence: We have to obey the generalized Pauli Principle The total nuclear wave function is composed of the spatial (rotation) and spin part:

ψ = ψ ψ (7.25) n spin · space If the nuclei are Fermions (non-integer spin), the total nuclear wave function has to be anti- • symmetric. If the nuclei are Bosons (integer spin), the total nuclear wave function must be symmetric. • Thus, the spatial symmetry must conform to the spin symmetry. The space symmetry is given by

Pψˆ =( 1)K ψ (7.26) space − space But: Radiative transitions do not change nuclear spin! Thus: The spatial symmetry ( 1)K must be preserved under radiative transitions: − = ∆K =0, 2 (7.27) ⇒ ± Thus: for homonuclear molecules, only the O,Q, and S branches are accessible.

Examples:

1. Fermions with I =1/2 such as H2:

We can couple the nuclear spins in H2 and other similar molecules to a singlet and a triplet state. The total wave function must be anti-symmetric.

Singlet: I I S =0. This state is the so-called “Para” state. • ⊗ → n The spin wave function is anti-symmetric, so space wave function must be symmetric, so K must be even.

The statistical weight for the nuclear spin is gn = (2Sn +1)=1, so the total statistical weight is g = (2Sn + 1)(2K +1)=(2K + 1) Triplet: I I S =1. This is the so-called “Ortho” state. • ⊗ → n The spin wave function is symmetric, so the space wave function must be anti-symmetric, so K must be odd.

The statistical weight for the nuclear spin is gn = (2Sn +1)=3, so the total statistical weight is g = (2Sn + 1)(2K +1)=3(K + 1).

121 2. Bosons with I =0 such as O2: We can only couple the nuclear spin to I I S =0. ⊗ → n Since the total wave function must be symmetric, we can only use even K to construct solution.

Not: J = K + Ω. For O2, Ω=1 in the ground state.

3. Bosons with I =1 such as N2: We can couple the nuclear spin to I I S =0, 1, 2, where S =0, 2 are the symmetric ⊗ → n n states and Sn =1 is anti-symmetric.

As above, the even K states go with Sn =0, 2 and the odd K states go with Sn =1.

The statistical weights are once again g = (2Sn + 1)(2K + 1).

Note: Since nuclear spin only changes due to collisions, the molecules remain in their respective spin states for a long time and can act like separate species (“Ortho” and “Para”).

Rotational temperatures: If the plasma is in LTE, the different rotational levels are related to each other by Boltzmann statistics:

n(J ′) g(J ′) [E(J ) E(J )]/kT = e ′′ − ′ (7.28) n(J ′′) g(J ′′)

Measuring the column density in different rotational states can thus be used to determine the tem- perature of the plasma, given the statistical weight for the rotational state. Deviations from Boltz- mann statistics can be used to investigate non-LTE processes, like UV pumping of levels. Since even and odd states only interact via collisions, the temperature derived from the density ratio of the ground states of even and add (J ′ = 0 and J ′′ = 1) is a good measure of the electron kinetic temperature.

122 8 Plasma diagnostics and equilibria

In the previous section, we studied atomic lines. In this section, we will apply some of this material to astrophysical objects and develop some tools to probe the physical state of plasmas. In many cases, we will make the reasonable assumption that the material is in equilibrium.

8.1 Non-quantum line broadening mechanisms

Above, we discussed natural line broadening. Other processes contributing to line broadening:

8.1.1 Doppler broadening

Atoms are moving with some velocity distribution. Doppler shift due to the different velocities will broaden absorption and emission lines.

For an atom moving with non relativistic velocity vz towards us (vz is negative if moving away), the Doppler formula gives:

ν ν ∆ν v − 0 = = z (8.1) ν0 ν0 c

For a velocity distribution f(vz), we have

dN dN dν f(vz)dvz = dvz = dvz = φνdν (8.2) dvz dν dvz

with

c(ν ν0) c vz = − and dvz = dν (8.3) ν0 ν0

For a Maxwellian, we have

2 ma mavz f(v )= e− 2kT (8.4) z 2πkT r and thus

2 (ν ν )2 ma(ν ν ) 0 dvz c ma − 0 1 − 2 φ = f(ν(v )) = e− 2kT = e− (∆νD) (8.5) ν dν z ν 2πkT ∆ν √π 0 r d

123 1 with the Doppler width ∆νD = ν0c− 2kT/ma and full-width-half-max of p

ν0 2ln(2) kT ∆νFWHM =2√ln2∆νD =2 · (8.6) c s ma

At line center, the line profile function is

1 φν(ν0)= (8.7) ∆νD√π

8.1.2 Collisional broadening

Collisions between atoms and with electrons can affect an ongoing radiative transition. For a collision frequency νcoll between particles, it can be shown (see R&L problem 10.7) that this leads to an added term in the natural line profile and an effective damping constant Γeff :

Γeff = γnatural +2νcoll (8.8) and a Lorentzian with

Γ /4π2 φ = eff (8.9) ν (ν ν )2 + (Γ /4π)2 − 0 eff

8.1.3 The Voigt profile

Combining the Lorentzian and the Doppler Gaussian profiles gives the so-called Voigt profile. vzν0 Keeping in mind that ∆ν = c , convolve the two to get

∞ 1 φ = dν′φ (ν′)φ (ν ν′)= H(a,u) (8.10) ν ν,Dopp ν,Lorentzian − ∆ν √π Z0 D with the

y2 a ∞ e H(a,u)= dy − (8.11) π a2 +(u y)2 Z0 − where a is the ratio of natural and collisional width to Doppler width (a measure of the relative importance) and u is the frequency shift ∆ν in units of the Doppler width ∆νD:

Γ ν ν a and u − 0 (8.12) ≡ 4π∆νD ≡ ∆νD

124 100 1.0

10−1 0.8

10−2 0.6

ν −3 τ(ν) − φ 10 e 0.4 10−4

0.2 10−5

10−6 0.0 −40 −20 0 20 40 −200 −100 0 100 ν−ν ν−ν 0 0

Figure 8.1: Left: Voigt profiles for a = 0.1 (narrow), a = 1 (middle), and a = 10 (wide). For a =0.1, thin lines show the Gaussian and Lorentzian that convolve to make the Voigt profile (in this case, the core is Gaussian). right: Voigt profile in absorption, optical depth increasing by a factor of 10 between each curve. Inner 2 curves; liner. Middle 2 curves: saturated. Outer 2 curves: Damped.

Note: The exponential always falls faster than any power-law, so the exponential term must be small at large ν ν . Thus, schematically, − 0

u2 a H(a,u) e− and H(a,u) (8.13) core ≈ wings ≈ √πu2 where the first term is the line core (Gaussian) and the second term are the so-called damping wings. So: ν ν ∆ν : core and ν ν ∆ν : wings | − 0| ≪ D| | − 0| ≫ D|

N.B.: For a> 1, wings completely dominate core. Typically: a< 1 unless ∆νD is very small.

So: For most astrophysical sources, Doppler broadening dominates the core but there are always wings where the natural and collisional broadening will become important. Fig. 8.1 shows the Voigt profile in emission and absorption for different parameters.

8.1.4 The equivalent width

There are different ways to measure the strength of a line. One way would be to measure the intensity at line center and compare it to the intensity of the continuum (since a line is always added to/subtracted from a continuum). But this doesn’t specify the total absorption of total line flux, since that depends on the integral over the line.

125 The integrated line intensity (in excess of the continuum intensity I0) is

D D ∞ ∞ hνA φ N A ∞ I = dν dz j = dν dz n 21 ν = 2 21 dνhνφ (8.14) emiss ν 2 4π 4π ν Z0 Z0 Z0 Z0 Z0

with the usual definition of the column density N2 = dzn2 and the total spectral intensity of I = I + I . ν ν,0 ν,emiss R For an absorption line, we integrate over the flux removed from the continuum intensity Iν,0.

2 ∞ ∞ πe τν mc Nfφν I = dνI (1 e− )= dνI 1 e−  (8.15) abs ν,0 − ν − Z0 Z0   where f is the oscillator strength of the transition. As a measure of the relative line strength, compared to the continuum, we define the equivalent width Wν. For emission:

∞ I I ∞ N A hνφ W dν ν − ν,0 = dν 2 21 ν (8.16) ν ≡ I 4πI Z0 ν,0 Z0 ν,0 and for absorption

2 ∞ I I ∞ ∞ πe ν,0 ν τν mc N2fφν W dν − = dν 1 e− = dν 1 e−  (8.17) ν ≡ I − − Z0 ν,0 Z0 Z0     We can also define the equivalent width for wavelengths:

2 ∞ τ λ W dλ 1 e− λ = W (8.18) λ ≡ − c ν Z0  

N.B.: The equivalent width is the width (in ν or λ) of a rectangle with unit height and the same area as the area under the emission/absorption line. See Fig. 8.2

So, the equivalent width measures how strong the line is relative to the continuum. For some very strong emission lines, the equivalent width can approach the line frequency.

8.1.5 The curve of growth: The equivalent width for absorption lines with Voigt profile

The general line profile is given by the Voigt profile, a convolution of a Gaussian Doppler profile (due to thermal motions) and a Lorentzian profile that contains contributions from natural broad- ening and collisional broadening.

126 2.0

1.5 ν,0

/I 1.0 ν I

0.5 =

Wν 0.0 −40 −20 0 20 40 ν

Figure 8.2: Equivalent widths for an emission and an absorption line. The areas under the grey rectangles are equal to the areas under the lines (hatched).

For emission lines, there is nothing more to consider, since there is no upper limit to the flux. However, for absorption lines, Recall that resonance lines can typically be seen in absorption, since there is always some material in the ground state (unless the atom is ionized, which we will discuss later). Consider: Some continuum background source and a gaseous foreground absorber.

Observe: Equivalent width Wν or Wλ. Want to know: The column density of gas N. Define: The curve of growth is the relation between column density N and absorption line equiv- alent width W . General integral for W is over the Voigt function H(a,u) and is not solvable in closed form.

2 ∞ ∞ πe H(a,u) τν mc N1f ∆ν √π W = dν 1 e− = dν 1 e−  D (8.19) ν − − Z0 Z0     Luckily, we can approximate the integral in the relevant regimes to understand the qualitative behavior of the curve of growth.

1. Linear Regime (τ 1): ν ≪ 2 2 ∞ ∞ ∞ τν πe πe W = dν 1 e− dντ = N f dν φ = N f(8.20) ν − ≈ ν mc 1 ν mc 1 Z0 Z0   Z0     since φν is normalized.

127 So: The equivalent width is directly proportional to the column density of gas, N1. Thus, in the limit τ 1, we are in the linear regime of the curve of growth. ν ≪ For non-saturated lines (optically thin), the Doppler profile is generally a good description. The condition for the linear approximation is then

πe2 Nf πe2 Nfλ τ = = 0 1 (8.21) ν0 mc √π∆ν mc √πb ≪   D  

where we introduce the Doppler parameter through b/c =∆νD/ν0:

∆ν 2kT 1/2 b c D = λ ∆ν = (8.22) ≡ ν 0 D m 0   2. Doppler (“Flat”) regime (1 τ τ ): ≪ 0 ≪ Damp Optically thick in line core, but still optically thin before wings dominate. The equivalent width is then approximately

∞ x2 ( τ0e ) W ∆ν dx 1 e − − (8.23) ν ≈ D − Z−∞ h i where we extended the lower bound to minus infinity and neglected the Lorentzian wings. The integrand of this function is very boxy, as shown in Fig. g. 8.3. Approximate the integrand as a top-hat. The FWHM of this function is found from setting

x2 ( τ0e− FWHM ) 1 1 e − = (8.24) − 2 h i or

τ x = ln 0 (8.25) FWHM ln2 r  

so the equivalent width approximately 2νFWHM or

τ W 2∆ν ln 0 ln(N) (8.26) ν ≈ D ln2 ∝ r   p which is only very weakly dependent on N.

3. Damped (“SQRT”) regime (τ τ ): 0 ≫ Damp In the Damped part, the line core is completely blackened, and the integral is dominated by the wings (which are much wider than the core). We can thus neglect the Gaussian profile.

128 10−2 1.0 b=6 km/s b=9 km/s b=12 km/s 10−3 0.8 b=15 km/s

) b=18 km/s λ / 0.6 λ 0 10−4 I/I

0.4 log (W 10−5 Linear Doppler Damped 0.2

10−6 0.0 12 14 16 18 −2 −1 0 1 2 10 10 10 10 log NFλ [cm−2] x/xc

Figure 8.3: Left: The line profile for optically thick Doppler profile (very boxy) and top-hat approximation with∆ν =FWHM. Right: Curve of growth for five different thermal widths for 8 1 Γ=2 10 s− . ×

Because the constant in the denominator of the Lorentzian is only important near the peak, we can neglect it as well. Then:

2 Γeff /4π φν (8.27) ≈ (ν ν )2 − 0 so

2 πe2 Γ /4π ∞ Nf eff − mc (ν ν )2  W dν 1 e   − 0 (8.28) ν ≈ − Z0 ( ) We substitute

πe2 mc NfΓeff k y = = (8.29) 4π2 (ν ν )2 (ν ν )2 − 0 − 0 to get

2 y 2 1 πe NfΓ ∞ [1 e ] πe NfΓ eff − 1/4 eff (8.30) Wν 2 dy − 3 2π 2 ≈ 2s mc 4π y 2 ≈ s mc 4π   Z−∞  

Γeff ∆νD 1/2 √τ0 N (8.31) ≈ s √π ∝

So, in the Damped regime, where the damping wings dominate the profile, the equivalent width is proportional to the square root of the column density N, and once again independent of temperature.

129 -3.0 -3.0

HD 152200 HD 152200

-3.5 H2 -3.5 Fe II ) λ / λ

-4.0 -4.0 log (W -1 -1 -4.5 b = 9 km s J = 3 -4.5 b = 17 km s Fe II b = 7 km s-1 J = 4 b = 15 km s-1 b = 5 km s-1 J = 5 b = 13 km s-1 -3.0-5.0 -3.0-5.0

HD 152270 HD 152270 -1 -1 -3.5 H2 (-8 km s ) -3.5 Fe II (-8 km s ) ) λ / λ

-4.0 -4.0 log (W -1 -1 -4.5 b = 10 km s J = 3 -4.5 b = 18 km s Fe II b = 8 km s-1 J = 4 b = 16 km s-1 b = 6 km s-1 J = 5 b = 14 km s-1 -5.0 -5.0 15 16 17 18 19 20 15 16 17 18 19 20 log (Nfλ) log (Nfλ)

Figure 8.4: Example of the curve of growth for a number of different lines towards a background star (Marggraf, Blum, & Boehr, 2004, A&A 416,251).

The transition to the Damping regime occurs when

Dopp Damp τ0 4√π∆νD Wν = Wν or = (8.32) ln τ0 Γeff

Damp 3 4 1 Typically, τ 10 − for b 1 10kms− . 0 ∼ ∼ − Example: Lyman alpha.

8 1 b 10 b − (8.33) A21 =Γ=6.3 10 s and ∆νD = =8.22 10 Hz 1 × λ0 × 10kms− gives (homework for the interested reader)

Damp τ0 = 8500 (8.34)

Fitting the COG to a number of absorption lines from a single sight line, it is possible to determine the column density of different species (thus, get a total estimate of the gas mass and abundance estimates for different elements) and an estimate of b, the Doppler broadening. Since turbulence can also contribute to the Doppler broadening, the temperature determined from b is typically an upper limit.

130 σ€ (v) 2 Coll. strength (Born) approximation

hν0 A21 C12 ne C21 ne

σ∝ v-2 1 v

Figure 8.5: Left:Sketch of the 2 level collisionally excited atom. Right: Sketch of the collision de-excitation cross section as a function of velocity.

8.2 Collisional excitation and de-excitation

Inelastic or super-elastic collisions can excite and de-excite levels in atoms, competing with ab- sorption (radiative excitation) and stimulated emission. Super-elastic: electron carries away energy from collision (de-excitation); inelastic: electron comes out with less energy (excitation) Electron collisions dominate this process (Question to the astute reader: why?)

8.2.1 The collision strength formalism

In analogy to the Einstein B coefficients for absorption and stimulated emission, define the colli- sion rate coefficient

C = σ (v)v = Maxwellian average (8.35) ij h ij i where the understanding is that the incoming collisional particles are Maxwellian (valid for thermal distributions). Then, the collisional transition rate (per atom in state i) is

Cijne (8.36)

We can easily write down how the excitation cross section σ(v) for collisions should scale with the particle energy: Known: Coulomb collisions with easy scaling

131 Need: interaction energy ∆E in excess of E12. Use: “Born approximation” (particle on straight line with small perturbation) The same analysis as in free-free emission: For particle with impact parameter b we have Interaction time: 2b ∆t = (8.37) v Coulomb acceleration perpendicular to ~v:

F e2 a Coulomb = (8.38) ⊥ ≈ m mb2 Total velocity change due to interaction:

2b e2 e2 2 ∆v ∆t a = (8.39) ⊥ ≈ · ⊥ ≈ v mb2 b mv and the exchange energy (since ~a ~v) is ⊥ m∆v2 e42 ∆E = ⊥ E (8.40) 2 ≈ mv2b2 ≥ 12 This is larger than the line energy for

e42 1 e2 2 e2 1 2E 2E b2

So: inside a cylinder with impact parameter smaller than b12, there is sufficient energy to excite the line. However, the energy transferred has to match the line energy (within some slop because the atom can pick up a little kinetic energy and because of uncertainty considerations). Clearly, the cross section for large enough velocities must then be proportional to the cross section of that cylinder,

2 2 2 σ πb πa v− (8.42) 12 ∼ 12 ∝ 0 Like with the oscillator strength in the case of the classical absorption cross section, the quantum corrections and the pre-factors from the proper Coulomb treatment get subsumed into a collision strength parameter Ω12:

E Ω π~2 Ω σ (v)= πa2 H 12 = 21 (8.43) 12 0 E g m2v2 g  e  1   e  1   132 Similarly,

E Ω π~2 Ω σ (v)= πa2 H 21 = 12 (8.44) 21 0 E g m2v2 g  e  2   e  2  

where the collisions strength Ω21 = Ω12 is a dimensionless number (typically of order unity). Note: The collision strength is often quoted for a multiplet. If one of the two states is a singlet state (SLJ) and the other a multiplet (S’L’J’), the multiplet collision strength Ω(SL,S′L′) can be expressed in terms of the line collision strength Ω(SLJ,S′L′J ′) following

(2J ′ + 1) Ω(SLJ,S′L′J ′)= Ω(SL,S′L′) (8.45) (2S′ + 1)(2L′ + 1) which simply reflects the ratio of the different statistical weights. Note that collisional excitation is typically done by the electrons in the high energy tail of the Maxwellian, which gives the rule of thumb that collisional excitation is important for

E kT 12 (8.46) ≈ 3

8.2.2 The collisional de-excitation rate

Because there is no velocity floor to de-excitation, it is usually more convenient to evaluate C21 first. Do the average over the Maxwellian:

∞ C = σ v = 4πv2 dvf(v) σ v (8.47) 21 h 21 i 21 Z0 where f(v) is the velocity distribution — a Maxwellian for thermal equilibrium:

3/2 2 me mev f(v)= e− 2kT (8.48) 2πkT   so

2 1/2 6 3 1 ~ Ω 2π 8.616 10− cm s− Ω C (T )= 21 × 21 (8.49) 21 3/2 g kT ≈ T 1/2 g me  2   2

Note: At low temperatures, the assumption that Ω21 is constant breaks down.

133 8.2.3 Detailed balance

As in the case of the Einstein relations, we can use the principle of detailed balance (microscopic time reversal) to determine C12 from C21. Consider a two-level atom and balance only collisional excitation and de-excitation (that’s the “detailed” part). Then, the net rate must be zero, or:

3 3 n1neσ12(v1)v1f(v1)d v1 = n2neσ21(v2)v2f(v2)d v2 (8.50) where energy conservation gives

m v2 m v2 e 2 + E = e 1 (8.51) 2 12 2

So

3 2 v1dv1 = v2dv2 and d v1 =4πv1dv1 (8.52)

Now, in thermodynamic equilibrium, the velocity distribution is Maxwellian and the level popula- tions follow the Boltzmann distribution, so

n2 g2 E12 = e− kT (8.53) n1 g1

Putting it all together:

2 2 mev1 mev2 n2 2kT 2 2kT 2 v1σ12(v1)e− 4πv1dv1 = v2σ21(v2)e− 4πv2dv2 (8.54) n1 2 mev2 g2 E21 2kT kT 2 = v2σ21(v2)e− e− 4πv2dv2 (8.55) g1 or, using eq. (8.52) and eq. (8.51) and replacing v1 with v:

g (v2 v2 ) 2 12 2 2 (8.56) σ12(v)= −2 σ21( v v12) g1 v − q where

2E12 2 2 v12 = = v1 v2 (8.57) me − r q 134 is the minimum energy necessary for excitation.

Finally, we can evaluate C12: The cross section σ12 must be zero below v12, so we evaluate the integral only above v12:

∞ 2 C12(T ) = 4πv (vdv) σ12(v) f(v) Zv12 ∞ g = 4π v2 v2 (vdv) 2 σ [(v2 v2 )1/2] f(v) − 12 g 21 − 12 Zv12 1 g ∞  = 2 4πv2 (vdv) σ (v) f[(v2 + v2 )1/2] g 21 12 1 Z0 mv2 g2 ∞ 2 12 = 4πv (vdv) σ (v) f(v) e− 2kT g 21 1 Z0 g2 E12 = C21e− kT g1 6 3 1 8.616 10− cm s− Ω21 E12 − kT (8.58) = × 1/2 e T g1

8.2.4 The 2-level collisional atom

Consider a two level atom in which collisional excitation is balanced by collisional de-excitation and spontaneous emission. The rate equation is then

dn 2 = n n C n [A + n C ]=0 (8.59) dt 1 e 12 − 2 21 e 21

Using the detailed balance relation for C12 and C21 we have

E E E g2 12 g2 12 g2 12 ne C21e− kT e− kT e− kT n n C g1 g1 g1 2 = e 12 = = = (8.60) A21 ncr n1 neC21 + A21 neC21 + A21 1+  1+ C21ne ne

where we define the critical density:

A21 ncr (8.61) ≡ C21

Clearly, the critical density is a threshold above which collisional de-excitation dominates radiative de-excitations, and below which spontaneous emission dominates.

Quenching: For ne > ncr, the line is said to be quenched (i.e., emission is suppressed by colli- sional de-excitation). Typical critical densities:

135 3 ²D³⁄ (g=4) 2 ²D⁵⁄ (g=6)

Å

9

2

7

3726Å

3 1 ⁴S ³⁄ (g=4)

Figure 8.6: Left:Sketch of the forbidden (quadrupole) density diagnostic lines of [OII] at 3729A/3726˚ A.˚ Right: Plot of the line ratio for [OII] and [SII] as a function of electron density (Osterbrock).

3 Fine structure lines (IR): n 1 100cm− • cr ∼ − 7 3 Forbidden lines (optical): n 100 10 cm− • cr ∼ − 9 11 3 Semi-forbidden lines (UV): n 10 10 cm− • cr ∼ − 15 19 3 Resonance lines (UV): n 10 10 cm− • cr ∼ −

Line luminosity / cooling: Collisionally excited line emission is a major cooling mechanism in the interstellar medium. Thermal energy from collisions is transferred to radiative energy which can leave the gas, thus transporting energy away. The cooling rate (line emissivity) in the gas must be

ǫ21 = n2A21E12 (8.62)

So, in the low density limit with n n , we have e ≪ cr

E g2 12 C21(T )e− kT g1 g2 E12 ǫ = n n A E n n C E (T )e− kT n n Λ(T ) (8.63) 21 e 1 n C (T )+ A 21 21 ≈ e 1 g 21 21 ≡ e H e 21 21  1  where we defined the cooling function Λ(T ). You see the n2 dependence that we already encoun- tered in free-free emission - collisional excitation is a two-particle process.

In the high density limit with ne ncr, the level populations approach the Boltzmann distribution (thermodynamic equilibrium), and≫ we have ǫ n. The electron density is no longer relevant because the level populations are fixed. ∝

136 Table 8.0: Collision strength values for OIII-like ions (Osterbrock).

8.2.5 Density diagnostics

The critical density provides a threshold where the emission changes. We can use that to map the density of the gas. Consider a three-level atom (e.g., [OII]), with levels 1, 2, and 3. For [OII], consider the forbidden transitions:

A) Critical Densities: Consider the [OII] transitions

2D 4 S : 2D 4 S : 3/2 → 3/2 5/2 → 3/2 λ = 3726 A˚ λ = 3729 A˚ • • g = (2J +1)=4, g =4 g =6, g =4 • 3 1 • 2 1 4 1 4 1 A =1.8 10− s− A =3.6 10− s− • 31 × • 21 × Ω31 =0.536 Ω21 =0.804 • 1/2 • 1/2 8 3 1 104K 8 3 1 104K C =1.16 10− cm s− C =1.16 10− cm s− • 31 × T • 21 × T 4 3 T 1/2 3 3 T 1/2 n =1.55 10 cm− n =3.1 10 cm− • cr,31 × 104 K • cr,21 × 104 K   State 2 has a lower critical density and will be quenched at lower ne, so the intensity ratio of the two lines, λ3729/λ3726 will drop with increasing ne. We can use the line ratio as a density diagnostic

137 in ther regime where one line is quenched and the other is not yet quenched.

B) Two level intensity model

I n A E R λ3729 = 2 21 21 (8.64) ≡ Iλ3726 n3A31E31

For this analytic argument, we neglect the coupling between states 2 and 3, so we have the equi- librium conditions

n n C n n C 2 = e 12 and 3 = e 13 (8.65) n1 neC21 + A21 n1 neC31 + A31 so

C n C n + n A E R = 12 e 31 e cr,31 21 21 (8.66) C n C n + n A E 13  e 21  e cr,21  31 31 E E A21 E21 g2 31− 21 [ne + ncr,31] = e− kT (8.67) A31 E31 g3 [ne + ncr,21] 4 3 4 1/2 [OII] 29K ne +1.55 10 cm− (T/10 K) = 0.3 e T × (8.68) 3 3 4 1/2 " ne +3.1 10 cm− (T/10 K) # × The same analysis can be made for other, similar transitions (e.g., [SII] and [CIII]). The result- ing curves for [OII] and [SII] are shown in Fig. 8.6. Note: These curves take into account all atomic levels and are thus more accurate than the analytic formula derived above.

C) Limiting cases:

For small n we have • e A21 E21 g2 ncr,3 29K E21 g2 C21 29K R e T = e T 1.50 (8.69) → A31 E31 g3 ncr,2 E31 g3 C31 ≈

3 For n 3000 cm− • e ∼ A21 E21 ncr g2 29K 1 R e T (8.70) ∼ A31 E31 ne g3 ∝ ne 3 4 3 which changes with ne, allowing a range of 10cm− < ne < 10 cm− to be probed by [OII] (the range depends somewhat on temperature). The [CIII] lines allow higher densi- ties to be probed. For large n we have • e A21 E21 g2 29K R e T 0.30 (8.71) → A31 E31 g3 ≈

138 Table 8.1: Collision strength values for OII-like ions (Osterbrock).

8.2.6 Temperature diagnostics from emission lines

The exponential dependence in the line ratio of eq. (8.68) indicates that we can use similar line ratios to measure temperature. This exponent comes from the ratio of the upward collision rates (and enforces the Boltzmann distribution in the limit of large densities where collisions dominate). Pick lines with energy difference that is large enough to sample the desired temperature range and whose energies fall into the optical (so they are easily measurable). Commonly used temperature diagnostic ions: Carbon iso-sequence (OIII). This is one reason the powerful nebular forbidden [OIII] lines are so important.

λ4959 + λ5007 R = (8.72) λ4363

Also: NII

λ6548 + λ6584 R = (8.73) λ5755

For the temperature range this line ratio is sensitive to ( 10, 000K), it is often appropriate to evaluate the line ratio in the low density limit, in which cas∼e the T 1/2 dependence of the collisional excitation rates cancel out and we are left with the exponential from the Boltzmann distribution.

139 Table 8.2: Collision strength values for OV and OVI-like ions (Osterbrock).

Consider the three-level [OIII] system: Because the lines are not quenched every collisional excitation results in a line photon being emitted. Collisions excite the 1D2 and the 1S0 level from the 3P state, but excitations from 1D to 1S do not occur (low density so the two upper states decay radiatively before they suffer from another collision). The 1S level can decay to 1D and 3P with 1 3 relative frequency A1S1D and A1S3P , and the D level decays to P with A1D3P , where we lump all transitions into the three 3P states into one average Einstein coefficient and average over the frequency appropriately emission weighted. Denoting the 1S, 1D, and 3P states 3,2, and 1, respectively, the levels are populated according to

neC13 neC12 + A32 n n C n A32+A31 3 = e 13 and 2 = (8.74) n1 A32 + A31 n1 A21 

For both OIII and NII, the collision excitation coefficient C13 is much smaller than C12 both because the collisions strength is much larger and because of the exponential factor. Thus, we can neglect the population of the 1D state by radiative transitions from the 1S state. Then

n n C 2 e 12 (8.75) n1 ≈ A21

Then, the approximate ratio of the λ4363 and the λ4959 + λ5007 lines is

I + I hν n A hν n C A /(A ) R λ4959 λ5007 = 21 2 21 21 e 12 21 21 (8.76) ≡ Iλ4363 hν32n3A32 ≈ hν32neC13A32/(A32 + A31)

140

3 ¹S

4363Å 2321Å

λ λ

2 ¹D

4959Å 5007Å

λ λ 1 ³P

Figure 8.7: Left: OIII ground state and strongest forbidden lines (optical: solid lines, UV: dashed line). Right: OIII temperature diagnostic. (Osterbrock).

hν21Ω21 A32 + A31 hν32 32,900 K = e kT 7.73 e T (8.77) hν Ω A ≈ × 32 31  32 

8.3 Ionization and recombination

2 Recall: Ionization energy (ionization potential) for Hydrogen is I = e = 13.6 eV. H 2a0 The first ionization potentials for heavier neutrals vary and are sometimes bigger (He), sometimes smaller than this energy (Fe). Effects: shielding of outer electrons; outer electrons are farther away from nucleus; closed shells are more tightly bound. Note: Ionization potential increases more drastically for closed shells. Write nuclear charge as Z and ion charge as z. Example: Oxygen ionization potentials (nuclear charge Z=8, see Table 8.3). Consquences:

He-like ion stages are important over a large temperature range (in hot plasmas) because • they are harder to destroy/ionize. Li-like ions have UV-resonance lines within the ground state that are at low energies com- • pared to other resonance lines (but, unlike Ly α they are optically thin), which makes them very important coolants.

141 z 0123 4 5 6 7

hνi[eV] 13.6 35.1 54.9 77.4 113.9 138.1 739.1 871.1 isosequence O N C B Be Li He H electr. str. 2p4 2p3 2p2 2p 2s2 2s 1s2 1s

Table 8.3: Ionization potentials of Oxygen (an important temperature diagnostic). Notice the large jump from z=5 to z=6 due to the He-like structure of OVII.

E.g.: CIV (1549A);˚ NV (1238A);˚ OVI1035A˚

8.3.1 Possible physical ionization and recombination processes 1. Collisional ionization:

[Z + e− (Z +1)+ e− + e−] (8.78) −→ 1 2 2 primary electron requires energy in excess of ionization energy I, mv1/2 I. Recall: Ionization energy of Hydrogen: 13.6 eV, Helium: 24.6 eV and 54.4 eV, ... ≥ 2. Photo-ionization:

[z + γ (z +1)+ e−] (8.79) −→ with hν I. γ ≥ 3. Radiative recombination:

[(z +1)+ e− z + γ] (8.80) −→ The electron is typically captured into an excited state, which decays radiatively (emitting a series of lines) into the ground state. This process is the inverse of photo-ionization. 4. Dielectronic recombination:

[(z +1)+ e− z∗ z + γ] (8.81) −→ −→ An electron is captured into an excited state and the excess energy goes into exciting a second electron from the ground state, creating a doubly excited state. For successful recombination, the second excited electron makes a radiative transition into the ground state, producing a “satellite line” of the normal resonance line (n 1), because the ground state is modified by the second excited electron. →

142 Table 8.4: Radiative decay rates (Einstein A values) for different ion transitions from the Carbon iso-sequence (Osterbrock).

8.3.2 Ionization equilibrium

The four processes considered above contribute to populating and de-populating ionization stages. The kinetic equation for the ionization state z is

d n(z)= n(z 1) λ(z 1,z) n(z)[λ(z,z +1)+ λ(z,z 1)] + n(z + 1) λ(z +1,z) dt − · − − − · where λ(z,z +1) is the ionization rate of stage z and λ(z,z 1) is the recombination rate of stage z and so on. − This assumes that only adjacent stages are coupled (typically not a bad assumption). Equilibrium: d/dt =0, so start with z =0:

n(1) λ(0, 1) = (8.82) n(0) λ(1, 0) and

n(0)λ(0, 1) + n(2)λ(2, 1) n(1)λ(1, 0) + n(2)λ(2, 1) n(1) = = (8.83) λ(1, 2) + λ(1, 0) λ(1, 2) + λ(1, 0)

143 Table 8.5: Radiative decay rates (Einstein A values) for different ion transitions from the Nitrogen iso-sequence (Osterbrock).

or

n(2) λ(1, 2) = (8.84) n(1) λ(2, 1) and, generally (proof by induction):

n(z + 1) λ(z,z + 1) = (8.85) n(z) λ(z +1,z)

So: the ionization network can be solved pair-wise.

8.3.3 Collisional ionization

Following the discussion on collisional excitation we write the collisional ionization cross section as

I σ πa2 Ω (8.86) c,z ≈ 0 E z  e  However, the calculation and measurement of the cross section and thus the collisional ionization rate are subject of ongoing research. It is often best to look up the approximate fitting formula

144 Table 8.6: Radiative decay rates (Einstein A values) for different ion transitions from the Oxygen iso-sequence (Osterbrock). for the collisional ionization rate in the literature (e.g., Shull & van Steenberg 1982, Arnaud & Rothenflug 1985, Sutherland & Dopita 1993). For the collisional ionization rate, we have to integrate over the thermal electron distribution. The collisional ionization rate coefficient is then

Iz ∞ e kT C (T )= 4πv2dvvσ (v)f(v) A T 1/2 − (8.87) z c,z ≈ z 1+0.1(kT/I ) Zvth z

8.3.4 Photo-ionization

To derive the photo ionization cross section in detail would go beyond the scope of the class. We can, however, gain some insight from our discussion on free-free absorption and the cross section we derived using Kirchhoff’s law. Recall that for free-free absorption, the thermal absorption coefficient is

2 6 8Z e 2π 3 hν − − kT (8.88) αν = 3 nineν 1 e 3mec 3mekT − r h i

145 

e si ² v ₀a

½ ² p

9 . n 0.0

hn n³ hnth

n n n

Figure 8.8: Left: Sketch of photo-ionization. Right: Zero order frequency dependence of photo ionization cross section.

This expression is integrated over a Maxwellian velocity distribution, but it is clear that the free- 3 free cross section for radiation must have a ν− dependence in the large frequency limit. This dependence comes from the semi-classical treatment of radiative transitions. It must carry over to the large frequency limit of bound-free transitions, because to a high energy photon a bound electron with binding energy I hν looks like a free electron. ≪ It is also clear that the cross section has to have a sharp turnoff below the ionization energy hν = I1 for the n = 1 (K) shell. Finally, we have to estimate the order of magnitude of the cross section, 2 which should be related to the size scale of an atomic system, πa0, or

3 3 3 2 I 64α 2 I 2 I σph = ξiπa g1(ν)= πa g1(ν)=0.09πa g1(ν) (8.89) 0 hν 3√3 0 hν 0 hν       and the general expression for ionization of the n shell,

3 64α 2 In σph,n = n πa gn(ν) (8.90) 3√3 0 hν   where gn is the usual quantum Gaunt factor you have to look up in the literature. Note that this holds for ions of arbitrary charge z and is not limited to neutrals.

3 Remember where the ν− dependence comes from: free-free emission has a flat spectrum (not just for thermal particles), and the inverse process is free-free absorption. Its cross section must be related to the free-free Einstein A coefficient through the Einstein relation. The third power of frequency in A comes directly from the Planck formula, which also carries a ν3 dependence. As

146 Figure 8.9: XMM-Newton X-ray spectra of two AGNs strongly affected by photoelectric absorp- tion (Mateos et al. 2005, A&A 444, 79)

we will learn, this comes from the phase space volume accessible to photons, which is higher at higher energies (proportional to ν2), and the average energy per photon (proportional to ν).

Suppose a gas of density n is subject to an ionizing flux Fν. The photo-ionization rate per ion of charge z is then

∞ F ∞ F Γ = dν ν σ = dν ν σ (8.91) z hν ph,z hν ph,z Z0 Zνth where Fν/hν is the photon flux and the cross section vanishes below νth = Iz/h.

Note again that only photons above hνth contribute to photo-ionization and that, because the cross section drops off quickly with ν, it is predominantly photons near the absorption edge that con- tribute, so, for the Lyman edge at 13.6 eV, it is photons just above 13.6 eV that are most important. 3 This is (a) because of the ν− dependence in the cross section, (b) because most UV spectra drop 1 with frequency, and (c) because the photon flux is Fν/hν, picking up another power of ν− .

The total ionization rate per unit volume is simply n˙ z = nzΓz.

17 2 Note: The order of magnitude of σ at the edge is 10− cm . ∼ Osterbrock provides a fitting formula for the photoionization cross section with parameters given in Table 8.7:

s s 1 ν − ν − − α = a β + (1 β) for ν>ν (8.92) ν T ν − ν T "  T   T  #

147 8.3.5 Photo-electric absorption

The spectral importance of absorption: Starting with hydrogen, each element has a series of char- acteristic absorption edges at increasing energies (one for each ionization stage at the respective ionization potential). Hydrogen being the most abundant, the Lyman edge is the strongest edge, but only near 13.6 eV. Helium is the next most important, but only up to the second ionization potential at 54.4eV. Above that, metals are the only relevant absorbers, so the soft X-ray absorption cross section is entirely dominated by metals. Because of the many ionization edges, different edges overlap and form a broad cross section that causes strong absorption in the energy range between 13.6 eV and about 2 keV. Above that energy, iron is the only element left with sufficient abundance to cause measurable photo-electric absorption, with the iron K-edge at around 7 keV (depending on the ionization state of the iron). Fig. 8.9 shows an X-ray spectrum subject to photo-electric absorption. Fitting a superposition of absorption edges from different elements, one can derive a column density in metals from the total optical depth, NZ = τν/σν.

This can then be translated into a hydrogen column density NH making some assumption about the metal abundance of the absorbing gas. This is typically what is referred to when X-ray observers 20 24 2 determine hydrogen column densities (with typical values of order 10 10 cm− ). −

8.3.6 Radiative recombination

The inverse process to photo-ionization is radiative recombination, the capture of a free electron into the n-shell, resulting in the release of a photon of energy

1 hν = I + m v2 (8.93) z,n 2 e

Since we know the cross section for photo-ionization, we can once again use detailed balance to determine the radiative recombination cross section. Set the photo-ionization rate equal to the radiative recombination rate, assuming full thermody- namic equilibrium (Maxwellian and Planck distributions of equal temperature). We once again have to balance photons with electrons that satisfy the energy condition eq. (8.93). Since we are interested in the “spontaneous” recombination rate, we have to subtract a term for the “stimulated” recombination rate from the photo-ionization rate. This gives

3 4πBν(T ) hν n(z + 1) n σ (v)v f(v)d v = n(z) σ 1 e− kT dν (8.94) · e · r · hν ν −   148 Table 8.7: Parameters for photoionization cross section fitting formula (eq. 8.92) in the notes, adopted from Osterbrock.

149 The differentials are related by

3 2 mevdv = hdν and d v =4πv dv (8.95)

and we use a Maxwellian

3/2 2 me mev f(v)= e− 2kT (8.96) 2πkT   Finally, we have to borrow a result from a few lectures into the future: The ionization level popu- lations in thermal equilibrium (and also in LTE) obey the so-called Saha equation

3/2 n(z + 1)ne 2gz+1 2πmekT Iz = e− kT (8.97) n(z) g h2 z   which is the generalization of the Boltzmann relation for ionization states. Note: the statistical weights for the ionization sates are (2S+1)(2L+1) for the respective electronic state of the ion before and after recombination/ionization. Solving for the recombination cross section:

3 hν 3/2 n(z) 4πhν 1 e− kT 2πkT hν I dν kT− (8.98) σr,z = −hν e σph,z(ν) 2 n(z + 1)ne hν e kT 1 ! me 4πv v dv −   · g 2hν hν g E E σ z z ν ν ph,z (8.99) = 2 2 σph,z = 2 2 2gz+1 mev mec 2gz+1 Ee mec c 64α I 3 2hν hν z (8.100) = n πa0 2 2 gph,z 3√3 hν mev mec    

2 In terms of order of magnitude, this is smaller than σν by roughly a factor hν/mec . While photo-ionization most often captures electrons from ions in the ground state, the recombin- ing electron can be captured into any n shell and state. The excited electron then has to make a series of transitions to the ground state. This is called the recombination cascade. For any given n, the radiative recombination rate into level n is

α (T )= σ v (8.101) z,n h r,z i But to determine the total level population, we have to sum over all upper states that ultimately transition into state n, so

1/2 ∞ 8kT I α(n)= αz,k = Ar φn(T ) (8.102) πme kT Xk=n     150 Table 8.8: Recombination coefficients for different target orbitals and different temperatures for Hydrogen (Osterbrock).

The functions φn(T ) are tabulated and best looked up. When conditions are optically thin to all photons, we will be interested mainly in the ground level, so α(1). This is called case A recombination. Because hydrogen is so abundant and because each radiative decay into the ground state ends with a Lyman alpha photon, these photons get re-absorbed because the gas is optically thick to photons in the Lyman series. In this case, recombinations occur predominantly through the Balmer series into n =2 and we have to consider case B recombination, α(2).

151 Approximate fitting formulae for hydrogen are (case A)

0.726 (1) 13 3 1 T − α =4.09 10− cm s− (8.103) H × 104 K   and (case B)

0.845 (2) 13 3 1 T − α =2.59 10− cm s− (8.104) H × 104 K   so at 104 K, 37% of all recombinations go directly into the n =1 state (since the difference between the two is the rate of transitions that go directly into the ground state). Note: Transitions do not have to be radiative - even forbidden transitions can be made via colli- sional de-excitation and contribute to α. Line ratios for radiative recombination cascades are tabulated and can be used to estimate rates and ionizing fluxes. Tables 8.9 and 8.10 list the tabulated literature values of line intensity ratios for case A and case B recombination, as well as the effective recombination coefficients for Hβ (for historical reasons). These are the rates for all recombinations that result in the emission of an Hβ photon (correcting the total case A and B recombination coefficients for all recombinations that do not result in the emission of an Hβ photon.

N.B.: The relation between photo-ionization and recombination cross sections is called the Milne relations.

We can re-write it as

σ m c2 m v2/2 g g bf = e e e z+1 (8.105) σfb hν hν gz which is easy to memorize.

8.3.7 Di-electronic recombination

The dielectronic recombination rate calculated from

∞ n α n σ v = n d3vσ (v) vf(v) (8.106) e di ≡ eh di i e di Z0 which looks identical to radiative recombination in its dependence on ne. Even though it ulti- mately results in the same outcome as radiative recombination (unless autoionization reverses the

152 Table 8.9: Hydrogen line ratios for case A recombination (Osterbrock).

153 Table 8.10: Hydrogen line ratios for case B recombination (Osterbrock).

154 Table 8.11: Recombination coefficients for different n for Helium (Osterbrock).

di-electronic capture) this process is important because (a) it alters the energy of the emitted res- onance line (n 1) because of the excited electron state from which the line is emitted and (b) because it increases→ the effective cross section for recombination at certain energies, producing a “resonance”. The di-electronic recombination is approximately given by the fitting function:

T T 3/2 B I α A T − e− T 1+ B e− T (8.107) di ≈ di di h i where X 0.6 0.9. α is best looked up in the relevant literature. rad ≈ − di

155 Table 8.12: Total recombination coefficients for different emperatures. Case B is most often used (Osterbrock).

Since this process can occur in parallel to radiative recombination, the effective recombination coefficient includes both effects. In terms of approximate fitting formulae, this gives

αz = αrad(T )+ αdi (8.108) TB TI Xrad 3/2 = AradT − + AdiT − e− T 1+ Bdie− T (8.109) h i

8.3.8 Summary

To summarize the results from above: The rates (per ion) for the three main processes that lead to net changes in the ionization state of an ion are:

Collisional ionization: • ∞ n C (T ) n σ v = n d3vσ (v) vf(v) (8.110) e z ≡ eh i i e i Zvth 1 3 where Cz is the collisional ionization coefficient (in units of s− cm ). Photoionization: • ∞ F Γ = dν ν σ (ν) (8.111) z hν ph Zνth 156 where Fν is the flux at the location of the ion and Γ is the photoionization rate (in units of 1 s− ). Radiative recombination: • ∞ n α (T ) n σ v = n d3vσ (v) vf(v) (8.112) e rad ≡ eh r i e r Z0 1 3 where αrad is the recombination coefficient (in units of s− cm ). Di-electronic recombination: • ∞ n α n σ v = n d3vσ (v) vf(v) (8.113) e di ≡ eh di i e di Z0 Dielectronic and radiative recombination combine to give the effective recombination coef- • ficient αz = αrad + αdi.

8.4 Equilibria

We can use the rate coefficients introduced above to discuss different equilibria that are astrophys- ically important.

8.4.1 Nebular (photoionization) equilibrium

At low densities or temperatures and/or in intense radiation fields, photoionization dominates. This is the case when

Γ n C (T ) (8.114) z ≫ e z

Conditions under which this approximation applies can be found in

Interplanetary space • Ionization nebulae (HII regions) • Quasars/AGNs • Planetary nebulae • Comets • The upper atmosphere •

The rate equation then reads

n(z)Γph(z)= nen(z + 1)αz(T ) (8.115)

157 Figure 8.10: Fractional ionization abundance for Oxygen in photoionization equilibrium (AGN spectrum) as a function of ionization parameter U = nγ/nH. Shull & Van Steenberg (1982).

and the ionization fraction is

n(z + 1) Γ (z) = ph (8.116) z(z) neαz(T ) which depends on the incident radiative flux Fν, the temperature T , and the density ne. Because the energetics are dominated by the external radiation field, the temperature is set by the balance above, so we can solve for the temperature if ne and Fν are known.

Inversely, if ne and T are known, we can solve for the incident radiative flux. Note that, for a point source of radiation at distance R (say, a star or quasar) the flux incident upon the plasma is

L F = ν (8.117) ν 4πR2

It is often useful to characterize the importance of the incident radiation via the “photoionization parameter”, which has different manifestations:

ξ Lν flux-to-density ratio • ≡ nR2 U nγ ratio of ionizing photon density to the hydrogen density • ≡ nH

158 Figure 8.11: Fractional ionization abundance for Carbon in coronal equilibrium.

prad L Ξ = 2 ratio of radiation to gas pressure. • ≡ pgas 4πR c(2.3nHkT )

The factor 2.3 assumes the cosmic abundance of He, such that ntot = nH +nHe +ne =2.3nH for nHe =0.1nH.

The temperature is dominated by this balance because every absorbed photon deposits energy into the electron gas, which can only cool by recombination and line emission. The heating rate per neutral hydrogen atom is

∞ F = dν ν σ (ν)(hν hν ) (8.118) H hν ph − th Zνth  

Thermal balance requires that this heating rate be balances by the cooling rate, nenHΛ(T ):

n = n n Λ(T ) (8.119) HH e H

The spectra from such a plasma are best calculated using software such as the photo-ionization code “Cloudy”2. The fractional ionization is a direct diagnostic of the ionization parameter (Fig. 8.10).

8.4.2 Coronal (collisional ionization) equilibrium

In the coronal model, thermal collisional ionization dominates, so we can neglect photoionization.

2The code can be downloaded at http://www.nublado.org

159 Figure 8.12: Fractional ionization abundance for Neon in coronal equilibrium.

This requires sufficient electron densities at sufficiently large temperatures (sufficient to support a population of ionizing electrons). This requires Γ n C (T ). Conditions where this approximation holds can be found in ≪ e z Stellar and solar coronae • Hot plasma (accretion disk coronae, galaxy cluster atmospheres, cooling flows) • Stellar flares • Supernova remnants • Stellar winds • Consider two adjacent ionization stages, z and z +1 (remember that we can solve the equilibrium network in a pair-wise fashion). The equilibrium kinetic equation is

neCz(T )n(z)= neαz(T )n(z + 1) (8.120)

The ionization fraction is

n(z + 1) C (T ) = z (8.121) n(z) αz(T ) which is independent of ne, but does depend on the temperature T . This implies that we can use the ionization state of a gas as a temperature diagnostic. The spectrum emitted by plasmas that satisfy this condition is a combination of lines and bremsstrahlung, called

160 Figure 8.13: The cooling function Λ(T ) for gas in coronal equilibrium (solar abundances). Left: dominant cooling processes; right: dominant species. Sutherland & Dopita (1993). thermal plasma emission. Calculating such spectra is complicated and best left to dedicated software packages3. Fig. 8.13 shows the cooling curve calculated for plasma in coronal equilibrium for solar abun- dances. Again, thermal balance requires

n = n n Λ(T ) (8.122) eH e H but is now given by some other physical process, like reconnection or viscous dissipation. H One of the key diagnostics for coronal plasmas are the ionization fractions of different ions. Figures 8.11 and 8.12 shows the fractional ionization of Carbon and Neon as examples. Thus, measuring the fractional abundances of different ionization stages is a sensitive temperature diagnostic. Also note the big temperature range spanned by the He-like ions, CV and NeIX.

8.4.3 Saha equilibrium (LTE)

We will consider one more example of ionization equilibrium: local thermodynamic equilibrium between radiation and matter, valid at very high densities and optical depths. This is the limit

3Common spectral models of plasma emission are the Raymond-Smith model and MEKAL. A publically available code to calculate Raymond-Smith spectra is the APEC package, available for download as part of the ATOMDB from http://cxc.harvard.edu/atomdb/index.html

161 which we have used to derive detailed balance relations. As we will show in the next section, in this limit, the relation between the ground electronic states two adjacent ionization stages can be written as

3/2 n(z + 1)ne 2gz+1 2πmekT Iz,z+1 = e− kT (8.123) n(z) g h2 z   To derive this relation, as well as the Planck spectrum, which we have used earlier, we need to do some statistical mechanics.

8.4.4 Appendix: A short summary of some plasma diagnostic tools

For ionized plasmas, consisting of primarily H and He with trace elements of heavy elements, one one can derive good estimates of the following quantities:

1. Density (ne): Use line ratios from states with different A values (e.g., [OII], [SII], CIII]), or emission measure from object with known volume.

2. Temperature (Te): Use line ratios from states of different excitation temperature (e.g., [OIII], [NII], [NeIII]). Ion abundances from the same element can also determine T in ionization equilibrium. 3. Total mass: The luminosity of a two-body emission process like recombination or collision- ally excited lines or bremsstrahlung is proportional to the emission measure

3 EMV = d xnenH+ (8.124) Z With a size measurement and some geometric assumption about the depth of the object (slab, sphere), the measured luminosity and the emission measure together yield the produce nenH+ . If ne is known (either because the gas is completely ionized such that ne = nH+ = nH or from the density diagnostic), we can determine the total mass in nH+ . 4. Abundances: Abundances are derived from line ratios of ions relative to H α or H β. Know- ing Te, we can model the plasma to derive the abundances. 5. Extinction: Measure Hα/Hβ or some other well calibrated recombination line ratio and com- pare it to the theoretical value (technically, this requires knowledge of T and ne). This yields τ1+τ2 τ(λ1)+τ(λ2) the optical depth to exctinction from I1/I2 = e− = e− . Alternatively, we can measure the line ratio from a common upper state with known branch- ing ratio (i.e., ratio of A values) if collisional de-excitation is not important (the this tech- nique is insensitive to T and ne).

7/4 α 6. Synchrotron emission: Emission typically proportional to Vp ν− with α 0.5 1.5. Assume equipartition to derive an estimate of the relativistic pressure and magnetic∼ − field 1/2 strength. For optically thick synchrotron emission, S B− . ν ∝ 162 9 An excursion into statistical mechanics

In order to understand the distribution of ions and neutrals in local thermodynamic equilibrium (LTE) and the distribution of photons in LTE (blackbody), it is necessary to discuss a few basics of thermodynamics and statistical mechanics. We will be talking about “systems” (a large number N of particles with a given energy E in a given volume V ). We can look at the system on a microscopic, but statistical level (statistical mechanics) or a global, macroscopic level (thermodynamics)

9.1 Thermodynamics

Thermodynamics: Macroscopic description of system Quantities measuring system:

Temperature T — signifies whether two systems are in thermal equilibrium (T = T ) • 1 2 Chemical potential µ — how much energy does it “cost” to add a particle. Signifies whether • two system are in chemical equilibrium (µ1 = µ2). Pressure P — signifies whether two systems are in mechanical equilibrium (P = P ). • 1 2 Entropy S • Volume V • Total number of particles N • The 3+1 laws of thermodynamics:

Energy conservation: dU = dQ PdV + µdN • − Entropy always increases: dS 0 — i.e., system is probabilistic • ≥ Absolute zero is at 0 Kelvin • Two systems in thermodynamic equilibrium with each other are at equal temperature • For an equilibrium system, two of the intrinsic variables T , S, P , and V suffice to describe the macroscopic state of a system with a given energy. For an equilibrium system, these variables are related through an “equation of state” (e.g., relation between volume and temperature for fixed entropy is an “adiabatic equation of state”).

9.2 Elements of statistical mechanics

We will not go into any details here, but will only sketch results that are important. Consider a system with a given “macrostate”, i.e., volume V , total particle number N, and total energy E. There are a large number of possible ways to distribute these N particles among the

163 different states that will yield the same total energy. Each of those manifestations is called a “microstate”. Define the total number of possible microstates that give the same “macrostate” Ω(N,V,E).

There is no reason why one of these manifestations should be better than the others, so the fun- damental assumption of statistical mechanics is that the relative probabilities of the different microstates are equal. The relative probability of a system being in a given macrostate must then be proportional to the number of microstates accessible to the macrostate,

P Ω(N,V,E) (9.1) (N,V,E) ∝ 9.3 Connection between statistical mechanics and thermodynamics

This suggests a direct connection between statistical mechanics and thermodynamics: The likeli- hood of a system being in a given state is given by the entropy of the system

S = k ln(P )= k ln(Ω) (9.2)

16 1 where k = 1.38 10− ergsK− is the Boltzmann constant, which links statistical mechanics (counting microstates)× to thermodynamics (measuring macroscopic variables). Note: Since the minimum number of microstates a system can be in is 1, there is a minimum of 0 to S. This corresponds to the thermodynamic ground state and yields the 3rd law of thermodynamics. The reason the entropy is defined as the logarithm of the probability is that probabilities are mul- tiplicative (the probability of two systems being in a given total state is Ω1Ω2), while we want the entropy to be additive, similar to volume and particle number. The logarithm takes care of that. Based on this, dQ = TdS, and the first law of thermodynamics

dQ = TdS = Tkd[ln(Ω)] = dU + PdV µdN dE + PdV µdN (9.3) − ≡ − we can derive all other thermodynamic variables from statistical mechanics:

∂S ∂ ln Ω T = T k =1 (9.4) ∂E ∂E or

1 ∂ ln Ω − 1 kT = = (9.5) ∂E β   where β will be used again soon.

164 Note: For two systems to be in thermodynamic equilibrium, they have to have the same tempera- ture, which can easily be shown using the above relation by maximizing the probability (entropy) under the constraint that E1 + E2 = E, which gives

dE = dE + dE =0 or dE = dE (9.6) 1 2 1 − 2 so maximizing the entropy takes on the form

∂S ∂S ∂S ∂S ∂S 1 1 = 1 + 2 = 1 2 = =0 (9.7) ∂E1 ∂E1 ∂E1 ∂E1 − ∂E2 T1 − T2 or

T1 = T2 (9.8)

We can derive the pressure as

∂S ∂ ln Ω P = T = kT (9.9) ∂V ∂V and the chemical potential as

∂S ∂ ln Ω µ = T = kT (9.10) − ∂N − ∂N Just as is the case with internal energy (temperature) exchange between two systems, for two sys- tems to be in equilibrium under particle exchange, they have to have the same chemical potential (i.e., it costs you the same amount of energy to add one particle to each of the systems, so exchang- ing two particles between the two systems is energy neutral), which can be shown along the same lines as equation 9.7. Similarly, two equilibrium systems in a fixed volume must be at the same pressure.

9.4 The statistical mechanics of ideal quantum mechanical gases

We will now consider the case of an ideal, quantum mechanical gas of N non-interacting, indistin- guishable particles4.

Assume there are a total of N particles distributed over the energy states Ei of the system, with ni being the number of particles with energy Ei, such that

ni = N and Eini = E (9.11) i i X X 4The statistical mechanics aficionados will recognize that this derivation makes use of the “micro-canonical ensem- ble”. This will require us to assume a fixed grouping of energy eigenstates, which does not lend itself to the continuous energy levels accessible to free particles and is thus not fully self-consistent. However, the results are identical to those derived using “canonical” or “grand canonical” ensemble considerations.

165 An ideal gas of fermions

E Macrostate: E=17, N=6 microstate 1 microstate 2 microstate 3 microstate 4

E=6, g=6

E=5, g=2 E=4, g=1

E=3, g=3

E=2, g=3

E=1, g=3

An ideal gas of bosons

E Macrostate: E=17, N=6 microstate 1 microstate 2 microstate 3 microstate 4

E=6, g=6

E=5, g=2 E=4, g=1

E=3, g=3

E=2, g=3

E=1, g=3

Figure 9.1: Examples of different complexions (microstates) for a given macrostate of a fictitious fermion and boson gas with six energy levels and six particles (there are many other possible microstates not shown, of course).

where E is the total energy of the system.

Each energy eigenstate has an associated statistical weight gi.

The key question is: What is the number Ωi of microstates associated with Ei? I.e., how many ways are there to distribute ni particles over gi degenerate energy eigenstates? We have to distinguish between Fermions, Bosons, and classical particles, since they have different packaging requirements.

1. Bosons: The number of distinct ways to distribute ni indistinguishable particles over gi spaces (combinations, i.e., order does not matter), allowing that one space can hold more than one particle (Bosons) is a classic combinatorial problem (the number of combinations with repetitions): (n + g 1)! Ω = i i − (9.12) i n !(g 1)! i i − 2. Fermions: The number of distinct ways to distribute ni indistinguishable particles over gi

166 spaces, but without allowing a state to hold more than one particle (Fermions, thus requiring g n ) is given by i ≥ i g ! Ω = i (9.13) i n !(g n )! i i − i 3. Classical particles: For distinguishable particles, the number of possible complexions is ni gi . Since the particles are indistinguishable, we must divide by the number of permutations 5 of ni particles in each level and arrive at

ni gi Ωi = (9.14) ni!

The total number of microstates for a given distribution of ni over all the eigenstates is then

Ω n = ΠiΩi (9.15) { i}

We want to find the most likely (i.e., equilibrium) distribution of ni. Since the probability of a given distribution set ni is directly Ω n , we could maximize Ω n , but for reasons of convergence, we { } { i} { i} chose to maximize its log instead: ln Ω ni . Since the maximum of both functions must be equal, we are free to do so. { } During the maximization, we have to obey the constraints

ni = N and Eini = E (9.16) i i X X Maximization under constraints can be achieved using the method of Lagrange multipliers, which gives an equation for each ni:

∂ i ln Ω ni α i ni β i Eini { } − − =0 (9.17) ∂n P Pi P  Taking the limit n 1, we use the Stirling approximation for the factorial yields i ≫ ln n! n ln n n (9.18) ≈ − Once again, we distinguish different particle types, but we will do the calculation only for Bosons (the other types are easily reproduced as a homework exercise):

Bosons:

ln Ω (n + g 1)[ln(n + g 1) 1] n [ln(n ) 1] i ≈ i i − i i − − − i i − 5This is a fudge, since we started out with distinguishable particles and made the correction for them being indistin- guishable only after counting all the initial states. This inconsistency is remedied by using one of the above quantum statistics, but in the limit of large N and large E, the two approaches agree.

167 (g 1)[ln(g 1) 1] − i − i − − g n n ln i +1 + g ln 1+ i (9.19) ≈ i n i g  i   i  and

∂ ln Ωi [ln(ni + gi 1) 1]+1 [ln ni 1]+1 (9.20) ∂ni ≈ − − − − g 1 g = ln 1+ i ln 1+ i (9.21) n − n ≈ n  i i   i 

The general expressions for all three types of particles are

g g n ln Ω = n ln i a i ln 1 a i (9.22) i i n − − a − g  i   i  and

∂ ln Ω g i ln i a (9.23) ∂n ≈ n − i  i  with a = 1 for bosons, a =1 for fermions and a =0 for classical particles. − With this expression, the maximization is simple and yields

g ln i a α βE =0 (9.24) n − − − i  i,eq  or

gi ni,eq = (9.25) eα+βEi + a

with a =1, 0, 1 for fermions, classical particles, and bosons, respectively. −

Ω ni ,eq is the most probable set of complexions for a given macrostate. The total number of microstates{ } for all possible sets of n is { i}

Ω= Ω n (9.26) { i} n X{ i}

It turns out, however, that within the accuracy of the Sterling approximation in the limit of N −→ , this is identical to Ω n ,eq. The reason is (a) that all the individual Ωi are factorials, and thus ∞ { i} 168 even a small change to the argument can be big, and (b) changing the system away from the optimal state, while satisfying the constraints on N and E involves changing a large number of particles6. Then, with a little algebra,

S ln Ω n ,eq = ln (ΠiΩi) k ≈ { i} g g n = n ln i a i ln 1 a i,eq i,eq n − − a − g i i,eq i X      α+βEi gi a = ni,eq ln e + a a ln 1 (9.27) − − a − eα+βEi + a i X     α+βEi α+βEi gi e = ni,eq ln e + a a ln (9.28) − − a eα+βEi + a i    X  α+βEi gi 1 = ni,eq ln e + a a ln (9.29) − − a 1+ ae α βEi i   − −  X  gi α βE = n (α + βE )+ ln 1+ ae− − i i,eq i a i X h i 1 α βE = αN + βE + g ln 1+ ae− − i (9.30) a i i X  From this, we can finally deduce the meaning of the two “Lagrange multipliers” — they are the chemical potential and the inverse of kT :

∂S ∂ ln Ω µ α = = = (9.31) k∂N ∂N −kT

(from eq. 9.10) and

∂ ln Ω 1 β = = (9.32) ∂E kT

from eq. (9.9). Finally, using E = TS PV + µN, we have −

S µN E 1 µ Ei PV + = g ln 1+ ae kT − kT = (9.33) k kT − kT a i kT i X   6For example, moving one electron in a hydrogen gas from the ground state (n=0) to the first excited state (n=1) requires 12 electrons in other hydrogen atoms to be moved from the third excited state (n=3) to the first excited state (n=1) as well to maintain the same total energy. This is already a significant change to the system equilibrium.

169 so the pressure of the system can be derived from

kT µ Ei PV = g ln 1+ ae kT − kT (9.34) a i i X h  i In the Maxwell-Boltzmann case, where a 0, this reduces to the well known result of P = nkT (homework exercise). →

9.5 Extension to the continuum

Since kinetic energy states of free particles are continuous, we have to go from the sum over states to a phase space integral:

n d3p d3xξ n(p) (9.35) i −→ · i X Z Z In order to make the sum continuous, we have to divide phase space up into cells of an appropriate phase space volume and integrate over the density within those cells7. To arrive at the correct phase space density normalization ξ, we realize that particles and photons have an associated De Broglie wavelength

λ = c/ν = hc/E (9.36)

Concentrating on photons, we know that the photon momentum is

p = E/c (9.37) and so a photon should occupy a phase space volume of

∆p3∆x3 = λ3p3 = h3 (9.38) which gives a phase space density normalization of

3 ξ = h− (9.39) photons per phase space volume element. Incidentally, this holds for all types of particles. This is equivalent to replacing the statistical weight g of a particle with a new, continuous definition

d3pd3x g = g (9.40) qm h3

Where gqm is the quantum mechanical intrinsic statistical weight. 7In essence, the sum on the left is just the integral over a phase space density that is given by delta functions, appropriate for stationary particles, e.g., in a crystal lattice.

170 9.6 Saha equilibrium

Using this phase space normalization, we are now in a position to evaluate three interesting cases of particles. First, we will derive the occupation numbers for ionization equilibrium when a classical gas is in local thermodynamic equilibrium. This situation is called Saha equilibrium and was used in deriving the Milne relation between the photoionization and recombination cross sections. We consider an ionization reaction of the form

(z)+ hν (z +1)+ e− (9.41) ←→ where E is the ionization energy. For now, we will assume that both ions are in their ground state.

First, we note that in thermodynamic equilibrium, the chemical potential must be equal for • both sides of the equation. Second, we note that photons have zero rest mass, so the chemical potential is zero. •

Then, thermal equilibrium (which implies maximum entropy, dS =0) combined with the first law applied at fixed volume (dV =0) gives

TdS = dE + PdV µ dN µ dN µ dN − z z − z+1 z+1 − e e = dE µ dN µ dN µ dN =0 (9.42) − z z − z+1 z+1 − e e

The energy differential dE is the ionization energy, given by dE = Iz,z+1dNz where Iz,z+1 is the ionization potential and the continuity equation is

dN = dN = dN (9.43) z − z+1 − e

Combingin these three conditions gives

µ + µ = µ I (9.44) z+1 e z − z,z+1

Next, we need to evaluate µ for a constituent particle of an ideal gas from our expression above. For each of the free particles, we have to worry about the kinetic energy due to their free motion, while the excess energy due to the ionization is taken care of by the term dE in eq. (9.44).

171 We use the definition of the statistical weight gj for species j above and get

Nj = nEi i X 3 3 2 µ p2 µ d pd x 4πp dp j− Vg 3/2 j n = V g e 2mkT = (2πm kT ) e kT (9.45) → h3 E h3 j h3 j Z Z Z and thus the chemical potential for species j is

N h2 3/2 µ = kT ln j (9.46) j Vg 2πm kT " j  j  #

We are interested in the equilibrium densities of the different species, which appear in the chemical potentials as N/V . Thus, the Saha equilibrium condition on the chemical potentials translates to Saha equation for the densities:

N h2 3/2 N h2 3/2 kT ln z+1 + kT ln e Vg 2πm kR Vg 2πm kT " z+1  z+1  # " e  e  # N h2 3/2 = kT ln z I (9.47) Vg 2πm kR − z,z+1 " z  z  #

Dividing by kT , replacing N/V with n, and taking the exponential of the equation gives

2 3/2 2 3/2 2 3/2 nz+1 h ne h nz h Iz,z+z = e− kT (9.48) g 2πm kT g 2πm kT g 2πm kT z+1  z+1  e  e  z  z  Solving for the ratio of the densities, we finally arrive at the Saha equation:

3/2 3/2 (E ) (E ) nz+1ne gz+1ge mz+1 2πmekT z,0 − z+1,0 = e kT (9.49) n g m h2 z z  z    3/2 gz+1ge 2πmekT Iz,z+1 e− kT (9.50) ≈ g h2 z  

This equation is strictly true for the electronic ground state of the ions, respectively. Realizing that the occupation numbers in thermal equilibrium are simply Boltzmann-related,

Ei Ei g e kT g e kT n = N i − = N i − (9.51) i Ei kT Z(T ) i gie− P 172 one can write down a Saha equation for each pair of excited states of the two ions as well. For an ionization network, one such equation must be written for each neighboring pair of ioniza- tion states to solve for the equilibrium ionization structure. This quickly becomes fairly complex and, in most cases, has to be solved numerically. Example: Hydrogen in Saha equilibrium

The statistical weights for hydrogen are gz = gH0 = 2, ge = 2, gH+ = 1 and mH0 mH+ . Note that the spin of the nucleus cancels out in the statistical weights, so only the statistical≈ weight of the electron configuration counts. Thus, the statistical weight for the proton is gH+ =1.

Write the ionization fraction as X = nH+ /nH, thus nH0 = nH (1 X). The Saha equation for hydrogen then becomes −

2 3/2 X 1 2πmekT IH = e− kT (9.52) 1 X n h2 − H  

N.B.: Saha equilibrium (LTE) is only achieved at high particle and photon densities, such that detailed balance is valid among all channels for all reactions and their inverses.

9.7 The Planck spectrum

Having derived the statistical mechanics of quantum particles, we can now derive the Planck spec- trum (photons in thermal equilibrium). Several things are worth pointing out:

Photons are Bose-Einstein particles with spin 1, so they have a = 1 in the notation above. • − Each state carries a degeneracy factor of 2 for the photon spin. • Photons are mass-less, so their chemical potential is zero (it takes zero energy to add a ground • state photon). This is an expression of the fact that photon number is not conserved. We must make the transition from discrete energy levels to continuous energy states and • 3 3 3 positions, using g = gphd pd x/h .

Because the radiation field is isotropic, we can write (using p = hν/c)

h3ν2 d3p =4π dν (9.53) c3 From Bose-Einstein statistics above we know the occupation number of photons for a given energy hν (taking g =2 into account):

2 nE = hν (9.54) e kT 1 − 173 1.2 1.2

1.0 1.0

0.8 0.8 /g /g

E 0.6 E 0.6 n n

0.4 0.4

0.2 0.2

0.0 0.0 10−2 kT/µ 1 10 102 µ/kT 1 2 3 4 5 E/µ E/kT

Figure 9.2: Left: Fermi-Dirac distribution for a degenerate gas with kT =0.1µ. The hatched area shows the particles that fall under the exponential Maxwell-Boltzmann tail. Right: Fermi-Dirac distribution for a non-degenerate gas with kT = 10µ and the Maxwell-Boltzmann distribution for the same parameters (dotted curve).

The photon energy in the interval [0,ν] is then

d3pd3x 2hν ν 4πν˜2dν˜ 2hν˜ ν 8πhν˜3 1 (9.55) Eν = 3 hν = V 3 hν˜ = V dν˜ 3 hν˜ h e kT 1 0 c e kT 1 0 c e kT 1 Z − Z − Z − and the spectral energy density is

1 ∂E 8πhν3 u = ν = (9.56) ν hν V ∂ν c3 e kT 1 −   and, since the radiation is isotropic, the intensity is given by

cu 2hν3 B = ν = (9.57) ν hν 4π c2 e kT 1 −   which is the well known Planck spectrum from chapter 1. Note that the stimulated emission terms is naturally included in the Bose-Einstein statistic (a = 1). −

174 9.8 Degeneracy of a Fermi-Dirac gas

For a =1 for fermions, the occupation number is

g ne = E µ (9.58) e kT− +1

The function is plotted in Fig 9.2. For small E µ, the occupation number approaches g. This upper limit is due to the Pauli exclusion principle.≪ These electrons are called “degenerate”. For E µ, n drops exponentially, approaching the Maxwell-Boltzmann expression. ≫ E The detailed behavior is determined by the ratio µ/kT : If µ kT , the fermionic nature of the electrons is not very important, most electrons have energies above≪µ and fall in the exponential tail. The gas behaves like a classical Maxwell-Boltzmann gas and we need not discuss it in any more detail. If µ kT , the fermionic nature becomes critically limiting. Most electrons have energies below µ and≫ fall in the degenerate portion of the distribution. The gas is called degenerate. This case is worth discussing. In this case, we can approximate the function as a top-hat

g =2 forE<µ n = e (9.59) E 0 forE µ  ≥

The momentum associated with the so-called Fermi energy Ef = µ is

pf = 2meEf = 2meµ (9.60) p p The total number of particles is then given by

d3pd3x pf d3x 4πp3 V N = n 2 4πp2 dp =2 f (9.61) h3 E ≈ × h3 3 h3 Z Z Z0 Z   or

3nh3 1/3 p = (9.62) f 8π   This gives a Fermi energy of

p2 3n 2/3 h2 µ = E = f = (9.63) f 2 m 8π 2m e   e 175 Thus, the condition µ kT for the gas to be degenerate (electrons forced to be at energies higher than kT ) translates to ≫

8π 2m kT 3/2 n e (9.64) ≫ 3 h2   The pressure (momentum flux) for a degenerate gas is derived from

d3xd3p pf p2dp π PV = nEp v =2πV sin θdθ [cos θp cos θv] nE h3 ⊥ ⊥ h3 · Z Z Z0 Z0 4π p5 1 8π = V 2 f = p5V (9.65) · · 3 · m 5 · h3 15m h3 f   e e or

8π p5 8π 1 3nh3 5/3 P = f = n5/3 (9.66) 15 m h3 15 m h3 8π ∝ e e   which is independent of temperature, but follows the familiar 5/3 adiabatic equation of state for ideal, monatomic gases. That is, regardless of how cold the gas is, there is a minimum pressure that is given by its density. So, even though the temperature of the gas is low, the particles have more energy than what would be their Maxwell–Boltzman fair share. The reason is the Pauli exclusion principle: At low temperatures or high densities, the fermions are 3 very close together in phase space. However, there is a maximum phase space density of n gf h− imposed by the Pauli principle. So, Fermions will provide pressure even at cold temperatures≤ and will refuse to give up their energy in this case. Most particles will have energies near Ef . This “degeneracy pressure” is at the heart of white dwarf and neutron star structures: Even though these stars no longer have fuel to burn to keep them hot and thus pressurized, the degeneracy pressure of electrons/neutrons keeps them from collapsing under their own gravity.

176