
AST 418/518, Fall 2016 Homework 3 Solutions

1. Malmquist Bias Simulation. We want to understand how detection limits affect the use of supernova observations in the measurement of the Hubble constant, H_0. To do so, we will use Monte Carlo computations to generate a set of simulated data. For this simulation, we assume that we are only using a class of supernovae that all have similar intrinsic brightness, so that their apparent flux can be used to measure how far away they are. If their Doppler shift is also measured, the measurements can be used to calculate the expansion rate of the universe, or Hubble's constant.

A subtle effect in this measurement, called Malmquist bias, can affect the result. The effect is caused by the range of apparent brightness of the supernovae. Supernovae in our simulation have an absolute magnitude of M = -19. Assume the supernovae have a scatter about their absolute magnitude of approximately 1 magnitude. Survey telescopes will detect objects as faint as m = 20. The limiting magnitude corresponds to a distance modulus m - M = 5 log10(d / 10 pc) of 39, i.e. a distance of about 630 Mpc for a typical event; with the one-magnitude scatter, the brightest supernovae can be seen as far away as roughly 1000 Mpc.

Assume supernovae form uniformly throughout a sphere of radius r = 2000 Mpc. In the data generation part of the simulation, assume that each supernova is receding at a rate v = H_0 * d, where H_0 is 72 km/s/Mpc and d is the distance in Mpc.

a. Create a Monte Carlo program to generate a user-selectable number of randomly placed supernovae within this volume. Have the program generate the true distances, d, to the supernovae. Using the program, calculate the mean distance for a set of supernovae. You will need to use the transformation method to generate properly distributed distances. You can also calculate the mean distance analytically. Make sure the result agrees with your Monte Carlo answer.

The probability of finding a supernova increases with distance, since there is more volume of space farther away. We expect this probability to increase as r^2, since the volume of a thin shell at distance r grows at this rate. This allows us to write the unnormalized probability distribution as

P(r) dr = A r^2 dr

where A is a normalization constant. If we expect to find all supernovae between 0 and 2 Gpc, then A = 3/8 (for r expressed in units of Gpc). This means that the mean distance of supernovae in this sample will be

<r> = ∫_0^2 P(r) r dr = A r^4 / 4 |_0^2 = 1.5 Gpc

In order to generate a random number which follows this probability distribution, you should transform a uniform deviate x between 0 and 1 to the above probability distribution. Balancing the probability distributions, we get

P(r) dr = dx, so ∫_0^r a r^2 dr = a' r^3 = x + c

where a, a', and c are normalization and integration constants. When x = 0, r = 0, so c = 0. When x = 1, r = 2, so a' = 1/8. This means that the code which generates the random distances should use the equation:

r = (8*x)^(1/3), where x is the original uniform deviate.
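As a minimal sketch of part a (in MATLAB/Octave, matching the language of the program excerpt in problem 2), with an arbitrary sample size N:

% part a sketch: distances via the transformation method (units of Gpc)
N = 1e5;                  % user-selectable number of supernovae
x = rand(1, N);           % uniform deviates on [0,1]
r = (8*x).^(1/3);         % transformed so that P(r) ~ r^2 on [0,2] Gpc
fprintf('Monte Carlo mean distance: %.3f Gpc (analytic: 1.5 Gpc)\n', mean(r));

The printed mean should agree with the analytic value of 1.5 Gpc to within the Monte Carlo noise.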

You could also just generate three uniform random deviates in Cartesian coordinates, rejecting points that fall outside the sphere, but note that this is less computationally efficient, which may be important for large simulations.

b. Now assume each of the supernovae has a brightness governed by

M = -19 +G(1)

where G(1) is a random number with a Gaussian distribution and a standard deviation of one magnitude. Calculate the apparent magnitude, m, of each supernova, using the distance generated in part a. If m > 20, assume the object is too faint to detect, and reject it from the sample.

Use the remaining points to simulate the observations by generating a velocity for each one using Hubble's law. Generate an observed distance (d') from its apparent magnitude and the distance modulus, with the assumption that the supernova has an absolute magnitude (M) of -19. Create a simulated observation sample of 100 supernovae. Plot these with observed distance on the x-axis and velocity on the y-axis and compare it to the original sample. Explain the effect of the observing limit on the resulting sample.

The observed distance, d', is generated by inverting the distance modulus, m - M = 5 log10(d'/10 pc), which gives d' = 10^((m - M + 5)/5) pc = 10^((m - M - 25)/5) Mpc with M = -19.

Since the simulation will only include those points with m < 20, it tends to reject the intrinsically fainter objects near the distance limit of the sample, which biases the sample (see the combined code sketch for parts b and c below).

c. Calculate H_0 by averaging v/d' for all your detected points. Discuss the level of the bias. How could you account for it in the observations?

The calculated value of H_0 will be significantly biased high unless this is accounted for: near the detection limit only the over-luminous supernovae survive the cut, so assuming M = -19 underestimates their distances and inflates v/d'. If one knows the luminosity distribution of the parent sample, it is possible to limit the observations to a volume-limited sample. Or the effect could be modeled in just the way this question asks, to estimate the bias and remove it from the final value. The difficulty is that one may not know the luminosity distribution very well in the first place, making the level of this bias difficult to quantify.
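A minimal end-to-end sketch of parts b and c, again in MATLAB/Octave; the sample size N and the choice to plot all detections rather than exactly 100 are illustrative:

% parts b and c sketch: detection cut, observed distances, and the biased H_0
H0 = 72;                               % km/s/Mpc, as in the data generation step
N = 1e5;                               % keep the first 100 detections if desired
d = 2000 * rand(1, N).^(1/3);          % true distances, P(d) ~ d^2 out to 2000 Mpc
M = -19 + randn(1, N);                 % absolute magnitudes with 1-mag Gaussian scatter
m = M + 5*log10(d*1e6) - 5;            % apparent magnitudes (d converted to pc)
keep = (m <= 20);                      % reject objects fainter than the survey limit
v = H0 * d(keep);                      % recession velocities from Hubble's law
dprime = 10.^((m(keep) - 6)/5);        % observed distance in Mpc, assuming M = -19
plot(dprime, v, '.', d(keep), v, 'o'); % observed sample vs. true sample
xlabel('distance (Mpc)'); ylabel('velocity (km/s)');
H0est = mean(v ./ dprime);             % part c: naive estimate
fprintf('recovered H_0 = %.1f km/s/Mpc (input: %.1f)\n', H0est, H0);

The recovered H_0 should come out noticeably above the input value of 72, consistent with the discussion above.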

2. Markov Chain Monte Carlo (AST 518 only). The Bayesian analysis of the linear fit program from HW 7 was hampered by the need to do a 3-dimensional integral for each value of m and b, essentially a 5-d grid calculation. While this was conceptually more straightforward, there is a more efficient method which uses Markov Chain Monte Carlo calculations. Here, we will modify the previous problem (HW 7, problem 3) to carry this out. It will have the additional advantage of giving us an estimate for the parameters of the bad data points, and the confidence intervals for the fit parameters. You will want to modify your program along the following lines:

i. Define a function to calculate the likelihood, ML, given a vector of the 5 parameters (m, b, Pb, Vb, Yb). Make sure ML = 0 for nonphysical values of the parameters such as Pb < 0. Define a set of starting values for your parameters and calculate ML.

ii. Define a function that changes the vector of parameters to new values in some defined way. You can cycle through each parameter, or randomly choose one to change. You want this to be a random variation with an appropriate scale for the problem. It is this function where the art of MCMC is found.

iii. If the new set of parameters has MLnew > MLold, accept it. If MLnew < MLold, accept it with probability MLnew/MLold (e.g., by comparing that ratio to a uniform deviate), and otherwise keep the old parameters. In either case, record the current parameter vector in the chain, and repeat.

A program similar to the section below should have been developed:

% starting values
m = 2.4; b = 30; Pb = 0.2; Vb = 1000; Yb = 10;
% create a vector of parameters
PV = [m b Pb Vb Yb];

% total points
total = 1e4;
chain = zeros(5, total);
% iterate, counting the points kept
% (the loop body is reconstructed here, since the original excerpt is truncated;
%  calcML is the likelihood function from step i, and sigma holds the proposal
%  step sizes discussed in part a)
sigma = [0.1 5 0.05 100 2];          % example step sizes; tune for the problem
MLold = calcML(PV);
points = 1;
while (points <= total)
    PVnew = PV;
    k = randi(5);                    % randomly choose one parameter to change
    PVnew(k) = PVnew(k) + sigma(k)*randn;
    MLnew = calcML(PVnew);
    if (MLnew > MLold) || (rand < MLnew/MLold)
        PV = PVnew;                  % accept the step
        MLold = MLnew;
    end
    chain(:, points) = PV;           % record the current parameters in the chain
    points = points + 1;
end

Once you have a working program, you should be able to answer the following:

a. Evaluate the Markov chain to determine how many points are needed to converge to a solution. Eliminate these points from further analysis. Try different starting points and ways of generating new steps to optimize this. What worked or didn't work?

It is important to choose a starting point close to the best fit found by the grid search; otherwise the chain may take far longer to converge, or fail to do so within the allotted iterations.

After several hundred iterations, the chain for this problem is typically "burned in," provided the step sizes (sigma values) are large enough. Step sizes that are too small cause slow convergence, while step sizes that are too large result in poor sampling efficiency, since few proposed steps are accepted.

b. Create histograms of the 5 fit parameters showing their range of variations in the MCMC program.

You should get results similar to the following (histograms of the five fit parameters; figures not reproduced here).
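A short sketch of this step, reusing the chain array from the program above; the burn-in length nburn is an example value to be set from the convergence analysis in part a:

% remove burn-in samples, then histogram each parameter
nburn = 500;                         % example burn-in; set from part a
post = chain(:, nburn+1:end);        % post-burn-in samples
names = {'m', 'b', 'Pb', 'Vb', 'Yb'};
for k = 1:5
    subplot(2, 3, k);
    hist(post(k, :), 50);            % histogram of parameter k
    xlabel(names{k});
end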

c. Estimate the 95% confidence intervals for the slope and intercept parameters using the Markov chain data directly. That is, calculate what range about the mean encompasses 95% of the data points. Why is this a better description of the parameter uncertainty than quoting a 2 sigma value for the parameters? Do the two values differ for this situation?

You can calculate these directly from the above histograms. The results for the MCMC simulation should be a delta-m of 0.2 and a delta-b of 35. While these are similar to the formal uncertainty estimates, this is a direct statement that 95% of the sampled values fall within this range. The formal uncertainties assume Gaussian statistics, which may not apply, when predicting how often the results will fall within the calculated 2-sigma error bars.
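The interval can also be read directly from the sorted chain samples rather than the histograms; a sketch, again using the post array defined in the part b sketch:

% 95% interval for the slope directly from the sorted samples
s = sort(post(1, :));                % slope samples, burn-in removed
n = numel(s);
lo = s(round(0.025*n));              % 2.5th percentile
hi = s(round(0.975*n));              % 97.5th percentile
fprintf('slope m: 95%% of samples lie in [%.2f, %.2f]\n', lo, hi);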