arXiv:1401.1508v1 [astro-ph.GA] 7 Jan 2014
Draft version January 9, 2014
Preprint typeset using LaTeX style emulateapj v. 5/2/11

MEASURING DISTANCES AND REDDENINGS FOR A BILLION STARS: TOWARDS A 3D DUST MAP FROM PAN-STARRS 1

Gregory Maurice Green, Edward F. Schlafly, Douglas P. Finkbeiner, Mario Jurić, Hans-Walter Rix, Will Burgett, Kenneth C. Chambers, Peter W. Draper, Heather Flewelling, Rolf Peter Kudritzki, Eugene Magnier, Nicolas Martin, Nigel Metcalfe, John Tonry, Richard Wainscoat, Christopher Waters

1 Harvard-Smithsonian Center for Astrophysics, 60 Garden St., Cambridge, MA 02138; [email protected]
2 Max-Planck-Institut für Astronomie, Königstuhl 17, D-69117 Heidelberg
3 LSST Corporation, 933 N. Cherry Avenue, Tucson, AZ 85721
4 Observatoire Astronomique de Strasbourg, 11 rue de l'Université, 67000 Strasbourg, France
5 Institute for Astronomy, University of Hawaii at Manoa, Honolulu, HI 96822, USA
6 Department of Physics, Durham University, South Road, Durham DH1 3LE, England

ABSTRACT

We present a method to infer reddenings and distances to stars, based only on their broad-band photometry, and show how this method can be used to produce a three-dimensional dust map of the Galaxy. Our method samples from the full probability density function of distance, reddening and stellar type for individual stars, as well as the full uncertainty in reddening as a function of distance in the 3D dust map. We incorporate prior knowledge of the distribution of stars in the Galaxy and the detection limits of the survey. For stars in the Pan-STARRS 1 (PS1) 3π survey, we demonstrate that our reddening estimates are unbiased, and accurate to ∼0.13 mag in E(B−V) for the typical star. Based on comparisons with mock catalogs, we expect distances for main-sequence stars to be constrained to within ∼20–60%, although this range can vary, depending on the reddening of the star, the precise stellar type and its position on the sky. A further paper will present a 3D map of dust over the three quarters of the sky surveyed by PS1. Both the individual stellar inferences and the 3D dust map will enable a wealth of Galactic science in the plane. The method we present is not limited to the passbands of the PS1 survey, but may be extended to incorporate photometry from other surveys, such as 2MASS, SDSS (where available), and in the future, LSST and Gaia.

Subject headings: dust; ISM: structure; stars: distances; Galaxy: structure; methods: statistical

1. INTRODUCTION

A long-standing goal of astronomy has been to understand the structure and formation of galaxies. Studies of external galaxies have begun to paint a detailed, global picture of the forces at work in shaping galaxies, but lack the sensitivity and resolution to probe individual stars. In the Milky Way, meanwhile, measurements of the positions and types of millions of stars have been assembled, though these stars probe only a fraction of our Galaxy's volume. Moreover, the positions of these stars are especially uncertain in the Galactic disk, where the bulk of the stars reside, owing to the presence of dust, which obscures and reddens the light from these stars.

Accordingly, wide-field surveys like the Sloan Digital Sky Survey (SDSS; York et al. 2000), which observed millions of stars, have focused on the structure of the Galaxy at high Galactic latitudes, mostly outside the Galactic disk (e.g., Jurić et al. 2008; Ivezić et al. 2008). These studies have revealed an abundance of substructure in the Galactic halo (Belokurov et al. 2006), and have led to new constraints on the structure of the Galaxy's halo (e.g., Law & Majewski 2010; Bovy et al. 2012a) and the identification of challenges to the standard picture of Galaxy formation (e.g., the Missing Satellites Problem, Simon & Geha 2007). Still, the bulk of the Galaxy's stars reside in the disk, and it is unclear how much the Galaxy's halo can inform the processes at work there. Bovy et al. (2012a,c,b) derive smooth models for the Galactic disk, using sub-populations observed in the Sloan Extension for Galactic Understanding and Exploration (SEGUE; Yanny et al. 2009). Better photometric distance estimates for large, magnitude-limited samples of heavily dust-obscured stars will aid investigation into the structure of the smooth disk, as well as possible disk substructure.

This paper presents a technique to simultaneously infer distances and reddenings to stars embedded in the dust, to enable study of the properties and structure of our Galaxy's disk from optical surveys of resolved stars. We exploit prior knowledge of the types and distribution of stars in the Galaxy in a Bayesian framework to deliver the full probability density function of distance, reddening and stellar type for each star. We derive a principled Bayesian technique to infer reddening as a function of distance, using stars as tracers of the dust column.

We are not the first in this area. Marshall et al. (2006) produced a 3D map of the Galactic plane by comparing 2MASS $J - K_s$ stellar colors to those of simulated catalogs based on the Besançon model of the Galaxy (Robin et al. 2003). Our work is most similar to that of Sale (2012), who presents related techniques and applies them to simulated IPHAS data (Drew et al. 2005). The method presented here for obtaining stellar reddenings and distances is also related to that of Berry et al. (2011), who highlight the large amount of 3D structure in the Galaxy's dust using data from SDSS. The work of Bailer-Jones (2011), likewise, presents a similar technique using broad-band photometry and Hipparcos parallaxes (Perryman et al. 1997). Vergely et al. (2001) and Lallement et al. (2003) map out the 3D distribution of clouds in the Local Bubble by measuring absorption lines imprinted by the interstellar medium on the spectra of stars with Hipparcos parallaxes. Lallement et al. (2013) use ∼23,000 stellar parallaxes and reddening estimates from a number of sources to infer the 3D distribution of dust opacity out to 800–1000 pc in the plane of the Galaxy, and ∼300 pc out of the plane. Our work differentiates itself from these primarily in that it is adapted to studying the approximately one billion stars with high-quality Pan-STARRS 1 (PS1) photometry, which cover three quarters of the sky and two thirds of the Galactic plane, representing an unprecedented resource for studies of the Galaxy's disk.

The paper is organized as follows. Section 2 describes the Pan-STARRS 1 survey. Section 3 develops the Bayesian formalism required to produce a three-dimensional reddening map. Sections 4.1 and 4.2 describe the stellar and Galactic models we employ. Section 5 describes the practical implementation of our model to produce a 3D map. Then, in Section 6, we conduct tests of our method with mock photometry, and in Section 7, we validate our model with real photometry.

2. PAN-STARRS 1 SURVEY

In this work we derive the distances and reddenings to stars observed by Pan-STARRS 1. Pan-STARRS 1 is a 1.8-meter optical and near-infrared telescope located on Mount Haleakala, Hawaii (Kaiser et al. 2010; Hodapp et al. 2004). The telescope is equipped with the GigaPixel Camera 1 (GPC1), consisting of an array of 60 CCD detectors, each 4800 pixels on a side (Tonry & Onaka 2009; Onaka et al. 2008). The majority of the observing time is dedicated to a multi-epoch 3π steradian survey of the sky north of δ = −30° (Chambers in prep.). The 3π survey observes in five passbands $g_{P1}$, $r_{P1}$, $i_{P1}$, $z_{P1}$, and $y_{P1}$, which together span 400–1000 nm (Stubbs et al. 2010). The images are processed by the Pan-STARRS 1 Image Processing Pipeline (IPP) (Magnier 2006), which performs automatic astrometry (Magnier et al. 2008) and photometry (Magnier 2007). The data are photometrically calibrated to better than 1% accuracy (Schlafly et al. 2012; Tonry et al. 2012). The resulting homogeneous optical and near-infrared coverage of three quarters of the sky makes the Pan-STARRS 1 data ideal for studies of the distribution of the Galaxy's dust.

3. LINE-OF-SIGHT REDDENING PROFILE

Here, we describe the basic assumptions that we make in order to produce a three-dimensional dust map. We show that the problem can be decomposed into two steps. In the first step, we determine the probability density function of distance and reddening for each star (See §4). In the second step, we use information from stars on the same small patch of sky to infer reddening as a function of distance in the given direction.

We make the assumption that stars which are close to one another in 3D space are behind the same column of dust. By grouping together stars which are close-by on the sky, we are able to use the stars as tracers of the total dust column at different distances in a particular direction on the sky. So long as the dust column does not vary significantly over small angular scales, this is a valid assumption.

Let us denote the reddening profile along a particular line of sight by

$$ E(\mu \,; \vec{\alpha}) \,, \tag{1} $$

where $E$, the color excess in some pair of passbands (e.g., $E(B-V)$), is a function of distance modulus $\mu$, and $\vec{\alpha}$ denotes any fitting parameters defining the reddening profile. These parameters could be, for example, the dust density in each distance bin, and could in principle include the value of $R_V$, which parameterizes the wavelength dependence of the extinction law (Cardelli et al. 1989; Fitzpatrick 1999). The extinction $\vec{A}$ in any set of passbands is then assumed to be a function of the reddening $E$:

$$ \vec{A} = \vec{A}(E, R_V) \,. \tag{2} $$

Along one line of sight, denote the photometry of star $i$ by $\vec{m}_i$, and the set of all observed stellar magnitudes as $\{\vec{m}\}$. We wish to determine how the model parameters $\vec{\alpha}$ for the line-of-sight reddening profile depend on the stellar photometry $\{\vec{m}\}$. That is, we wish to determine

$$ p(\vec{\alpha} \,|\, \{\vec{m}\}) \,. \tag{3} $$

Using Bayes' rule,

$$ p(\vec{\alpha} \,|\, \{\vec{m}\}) = \frac{p(\{\vec{m}\} \,|\, \vec{\alpha}) \, p(\vec{\alpha})}{p(\{\vec{m}\})} \,. \tag{4} $$

The likelihood $p(\{\vec{m}\} \,|\, \vec{\alpha})$ is the probability density of obtaining the set of observed magnitudes $\{\vec{m}\}$, given the reddening profile defined by $\vec{\alpha}$. The likelihood of the entire set of stellar observations is just the product of the individual likelihoods, i.e.

$$ p(\{\vec{m}\} \,|\, \vec{\alpha}) = \prod_i p(\vec{m}_i \,|\, \vec{\alpha}) \,. \tag{5} $$

This is just a statement that the photometry of one star does not influence the photometry of any other star. Plugging this into Eq. (4), and dropping the normalizing factor $p(\{\vec{m}\})$,

$$ p(\vec{\alpha} \,|\, \{\vec{m}\}) \propto p(\vec{\alpha}) \prod_i p(\vec{m}_i \,|\, \vec{\alpha}) \,. \tag{6} $$

We now introduce nuisance parameters describing the distance to and intrinsic type of each star, and then marginalize over these parameters to obtain the likelihood. We denote the distance modulus to star $i$ by $\mu_i$, and the parameters describing the stellar type by $\vec{\Theta}_i$. For an individual star,

$$ p(\vec{m} \,|\, \vec{\alpha}) = \int d\mu \, d\vec{\Theta} \; p(\vec{m}, \mu, \vec{\Theta} \,|\, \vec{\alpha}) \tag{7} $$

$$ = \int d\mu \, d\vec{\Theta} \; p(\vec{m} \,|\, \mu, \vec{\Theta}, \vec{\alpha}) \, p(\mu, \vec{\Theta} \,|\, \vec{\alpha}) \tag{8} $$

$$ = \int d\mu \, d\vec{\Theta} \; p(\vec{m} \,|\, \mu, \vec{\Theta}, E(\mu; \vec{\alpha})) \, p(\mu, \vec{\Theta}) \,. \tag{9} $$

In the last step, we have assumed that the joint prior on the distance and intrinsic stellar type is independent of the reddening profile. Up to a normalizing constant, the above integrand is equivalent to the posterior density

$$ p(\mu, E, \vec{\Theta} \,|\, \vec{m}) \tag{10} $$

for an individual star, where the prior on $E$ is flat. Intuitively, this is because the prior on reddening is on the fitting parameters $\vec{\alpha}$, rather than the reddening of an individual star. After marginalizing over the stellar type $\vec{\Theta}$,

$$ p(\vec{m} \,|\, \vec{\alpha}) \propto \int d\mu \; p(\mu, E(\mu; \vec{\alpha}) \,|\, \vec{m}) \,. \tag{11} $$

Plugging the above into Eq. (6), we find that the full posterior density for $\vec{\alpha}$ is given by

$$ p(\vec{\alpha} \,|\, \{\vec{m}\}) \propto p(\vec{\alpha}) \prod_i \int d\mu_i \; p(\mu_i, E(\mu_i; \vec{\alpha}) \,|\, \vec{m}_i) \,, \tag{12} $$

where we have defined the function

$$ p(\mu_i, E_i \,|\, \vec{m}_i) \equiv \frac{1}{Z_i} \int d\vec{\Theta}_i \; p(\vec{m}_i \,|\, \mu_i, \vec{\Theta}_i, E_i) \, p(\mu_i, \vec{\Theta}_i) \,, \tag{13} $$

which is equivalent to the posterior probability density of finding a single star at distance $\mu_i$ and reddening $E_i$, where the prior on reddening is flat. Here, $Z_i$ is a normalizing constant. Effectively, it is the Bayesian evidence for star $i$, a measure of how likely the star is to be drawn from the model. A higher evidence indicates that the data are more consistent with the model, while a low value for $Z_i$ indicates that the model does not describe object $i$ well.

Our strategy for sampling from a posterior of the form given by Eq. (12) is to pre-compute Eq. (13) for each star by Markov-Chain Monte Carlo (MCMC) sampling, and then to sample from $p(\vec{\alpha} \,|\, \{\vec{m}\})$. This approach has two advantages. First, it factorizes the full problem into a series of smaller problems of lower dimension, potentially speeding up the computation. The second advantage of this two-step approach is that it allows outlier rejection on the basis of the Bayesian evidence for each star before proceeding to the second step. Point sources which do not fit the chosen stellar model (e.g., blue stragglers, white dwarfs and galaxies mistakenly classified as stars) can be filtered out by imposing a cut on the evidence $Z_i$ (See §5.2). More principled approaches to reducing the influence of outliers exist (e.g., Hogg et al. 2010), though in the context of our problem they are significantly more computationally expensive to implement.

A more general limitation to the approach taken here is that it does not allow the simultaneous fitting of parameters describing the stars and the spatial variation in dust properties. One could imagine simultaneously constraining stellar types and distances, as well as the dust density and $R_V$ parameter throughout space. By fitting the dust properties in many voxels simultaneously, one would be able to place priors on the density power spectrum of the dust, and to infer $R_V$ as a function of position in the Galaxy. By fitting dust properties throughout the entire volume of the Galaxy simultaneously, one could even attempt to constrain global parameters, such as the dust scale height and scale length. This would represent a hierarchical approach to the problem of creating a 3D dust map (see, for example, Kruschke 2010, for a discussion of hierarchical Bayesian models). At the highest level in the hierarchy, one has global parameters, which describe the overall dust distribution and density power spectrum. One level below in the hierarchy, one would have parameters describing the dust properties in each voxel in the Galaxy. At the lowest level, one could have the type and distance for each star. Such a hierarchical approach is appealing because it takes into account spatial correlations in dust properties, and because it directly fits the global structure of the Galaxy's dust component. However, this hierarchical approach potentially requires much greater computational power than the approach we take in this paper, as it does not allow one to process each star individually and treat each line of sight separately, greatly increasing the dimensionality of parameter space. This paper therefore confines itself to fitting each star independently, and then combining the information from each star along any given line of sight to determine the reddening profile as a function of distance.

4. INDIVIDUAL STARS

Now that we have factorized the problem of determining the line-of-sight reddening profile into one of determining $p(\mu_i, E_i, \vec{\Theta}_i \,|\, \vec{m}_i)$ for each star, we need to determine the individual stellar likelihoods and priors. In the following, we will drop the subscript $i$, as it is assumed that we are dealing with one star.

Using Bayes' Rule,

$$ p(\mu, E, \vec{\Theta} \,|\, \vec{m}) \propto p(\vec{m} \,|\, \mu, E, \vec{\Theta}) \, p(\mu, E, \vec{\Theta}) \,. \tag{14} $$

The likelihood, $p(\vec{m} \,|\, \mu, E, \vec{\Theta})$, is the probability density of a star having apparent magnitudes $\vec{m}$, given a distance, reddening and stellar type. The likelihood is thus dependent on our model of intrinsic stellar colors, which we discuss below (in §4.1). The priors, $p(\mu, E, \vec{\Theta})$, are dependent on our model of the distribution of stars of different types throughout the Galaxy. We discuss our Galactic model in §4.2.

4.1. Stellar Model

In our model, each star is described by two intrinsic parameters, its absolute magnitude $M_r$ in the PS1 r band and its metallicity [Fe/H]. In terms of our previous notation, $\vec{\Theta} = (M_r, [{\rm Fe/H}])$. Given a set of stellar templates, $\vec{M}(M_r, [{\rm Fe/H}])$, which map intrinsic stellar type to a set of absolute magnitudes, one obtains theoretical apparent magnitudes

$$ \vec{m}_{\rm mod} = \vec{M}(M_r, [{\rm Fe/H}]) + \vec{A}(E, R_V) + \mu \,. \tag{15} $$
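As a rough illustration of Eq. (15), the sketch below assembles model apparent magnitudes from a template, a reddening and a distance modulus. It assumes, purely for illustration, that the extinction vector is linear in the reddening, $\vec{A} = E \, \vec{R}$. The per-band coefficients in `R_BANDS` and the template magnitudes are hypothetical placeholders, not the values used in this work, and the function names are ours.

```python
import numpy as np

# Hypothetical extinction coefficients A_band / E for (g, r, i, z, y).
# Placeholder values for illustration only; not the coefficients adopted here.
R_BANDS = np.array([3.2, 2.3, 1.7, 1.3, 1.1])

def model_apparent_mags(abs_mags, E, mu, r_bands=R_BANDS):
    """Eq. (15): m_mod = M + A(E) + mu, assuming A is linear in E."""
    abs_mags = np.asarray(abs_mags, dtype=float)
    return abs_mags + E * r_bands + mu

def distance_pc(mu):
    """Convert a distance modulus mu to a distance in parsecs."""
    return 10.0 ** (mu / 5.0 + 1.0)

# Hypothetical template absolute magnitudes in (g, r, i, z, y).
template = [5.5, 5.0, 4.8, 4.7, 4.65]
m_mod = model_apparent_mags(template, E=0.5, mu=10.0)
```

Here the distance modulus shifts all bands equally, while the reddening shifts each band by a different amount; that difference is what lets colors constrain $E$ separately from $\mu$.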

The extinction vector $\vec{A}(E, R_V)$ used in this work follows a Fitzpatrick (1999) reddening law with $R_V = 3.1$, adapted to the PS1 $grizy_{P1}$ filter set by Schlafly & Finkbeiner (2011). The likelihood of observing apparent magnitudes $\vec{m}$ with Gaussian uncertainties $\vec{\sigma}$ is then given by

$$ p(\vec{m} \,|\, \mu, M_r, [{\rm Fe/H}], E) = \mathcal{N}(\vec{m} \,|\, \vec{m}_{\rm mod}, \vec{\sigma}) \,. \tag{16} $$

In this paper, we use the notation $\mathcal{N}(\vec{x} \,|\, \vec{\mu}, \vec{\sigma})$ to denote the probability density of a multivariate normal with mean $\vec{\mu}$ and standard deviations $\vec{\sigma}$, evaluated at $\vec{x}$.

We adopt a set of empirical stellar templates based on stellar observations in PS1, with photometric parallaxes and metallicities derived from the work of Ivezić et al. (2008). That work determines the absolute magnitude of a star as a function of its intrinsic color and metallicity using observations of globular clusters in SDSS. These globular clusters are uniformly old, and as a result our stellar templates are appropriate only for old populations and do not include age as a parameter. This means that young blue stars are not included, and that the morphology of the subgiant and giant branches is only approximate. Accordingly, our giant branch distances are less reliable than our main-sequence distances. Nevertheless, we include the giant branches in our models because any given star may indeed be a distant giant rather than a nearby dwarf.

In detail, we fit a spline to the colors of stars (in 4-color space) near the north Galactic pole to derive the shape of the stellar locus. All main-sequence stars are required to have intrinsic colors lying along this one-dimensional curve. We then associate each position along the main sequence with an absolute magnitude and a metallicity using the relations of Ivezić et al. (2008), which give absolute magnitude as a function of color and metallicity. Models for the giants are obtained from linear fits of absolute magnitude to color and metallicity by Ivezić (private communication), based on observations of globular clusters. These giant branch fits are joined to the main sequence via a cubic interpolating polynomial for $4 > M_r > 2.35$. We are able to use relations derived from SDSS because of the close similarity between the PS1 and SDSS filter sets; we transform from the PS1 to SDSS colors using the color transformations of Finkbeiner et al. (in prep.), which have residuals of less than about 1% across the full range of stellar types considered in this work.

The resulting stellar templates are shown in Figure 1, which gives the templates' colors as a function of their absolute magnitude and metallicity. Our empirical approach produces a close match to the observed colors of stars, and comparison with globular and open clusters indicates that the absolute magnitudes are accurate along the main sequence (See §7.2). An alternative approach would have been to adopt template colors from a library of synthetic spectra, such as the Padova & Trieste Stellar Evolution Code (PARSEC; Bressan et al. 2012), giving us access to age as an additional stellar parameter. However, the synthetic libraries have difficulty reproducing the colors of M-dwarfs in detail. We choose therefore to adopt a set of empirical models that match the colors of most stars well, though in future work a hybrid approach may be best suited to the problem.

Fig. 1. Model stellar colors as a function of absolute r-magnitude and metallicity in Pan-STARRS 1 passbands. The stellar templates are based on PS1 color-color relations, and color is related to absolute magnitude and metallicity by SDSS observations of globular clusters (Ivezić et al. 2008). Our empirical templates therefore assume an old stellar population. While the main sequence below the turnoff is nearly invariant with age, the giant branch and the location of the turnoff do, in reality, vary considerably with age. For this reason, we expect our inferences for main-sequence stars to be more accurate than those for giants. The narrowness of the kink at $M_r \simeq 2.4$ is an artifact of our models (See §4.1).

4.2. Galactic Model

We now present the priors that we place on the intrinsic and extrinsic parameters describing each star. We factorize the priors as follows:

$$ p(\mu, M_r, [{\rm Fe/H}]) = p(\mu) \, p([{\rm Fe/H}] \,|\, \mu) \, p(M_r) \,. \tag{17} $$

We describe the distance prior in §4.2.1, the metallicity prior in §4.2.2 and the prior on absolute magnitude in §4.2.3.

4.2.1. Distance

For a given line of sight, the prior probability of finding a star in a small range $d\mu$ in distance modulus is proportional to the number of stars per unit distance modulus per unit solid angle in the direction of the pixel:

$$ p(\mu) \propto \frac{dN}{d\mu \, d\Omega} = \frac{dN}{dr \, d\Omega} \frac{dr}{d\mu} = n(\mu) \, r^2 \frac{dr}{d\mu} \propto 10^{3\mu/5} \, n(\mu) \,. \tag{18} $$

Here, $r$ denotes physical distance from the Sun, and $n(\mu)$ is the stellar number density at distance modulus $\mu$ along the chosen line of sight. The distance prior is thus controlled by the line-of-sight number density of stars, as well as a volume factor, $10^{3\mu/5}$, which grows with distance. This latter term takes into account that the volume represented by a beam of constant, small width in distance modulus grows with distance. For a typical line of sight, the prior is driven upwards by the volume factor at small distances, while farther out it is suppressed by the decline in density in the outer reaches of the Galaxy. For constant density, $n(\mu)$, Eq. (18) simply reduces to the Euclidean counts equation.

Fig. 2. The distance prior for $(\ell, b) = (90^\circ, 10^\circ)$. The contributions of the disk and halo are shown individually in green and purple, respectively, while the total prior is given by the gray contour. The break in the contribution from the halo is due to the use of a broken power law for the number density of stars in this component.

The distance prior thus requires us to calculate the stellar number density at arbitrary locations in the Galaxy. We employ the three-component Galactic model developed in Jurić et al. (2008), which comprises a thin disk, a thick disk and an oblate halo. In cylindrical coordinates centered on the Galactic center, and with the Galactic plane defining $Z = 0$, each disk component has number density of the form

$$ n_i(R, Z) = n_{0,i} \, e^{-\left( R/R_{0,i} + |Z|/Z_{0,i} \right)} \,, \tag{19} $$
where $R_{0,i}$ and $Z_{0,i}$ are the scale radius and height, respectively, of each disk component, and $n_{0,i}$ is the stellar number density of each component at the Galactic center. In the Solar neighborhood,

$$ n_i(R_\odot, Z_\odot) = n_{0,i} \, e^{-\left( R_\odot/R_{0,i} + |Z_\odot|/Z_{0,i} \right)} \,. \tag{20} $$

We can thus write the number density of each disk component in terms of locally defined quantities, which are more readily measurable than quantities defined at the Galactic center:

$$ n_i(R, Z) = n_i(R_\odot, Z_\odot) \, e^{-\left[ (R - R_\odot)/R_{0,i} + (|Z| - |Z_\odot|)/Z_{0,i} \right]} \,, \tag{21} $$

where $R_\odot$ and $Z_\odot$ are the Galactocentric Solar radius and height in cylindrical coordinates, respectively. From this point onwards, we will denote $n_{\rm thin}(R_\odot, Z_\odot)$ as $n_\odot$, and write

$$ n_{\rm thick}(R_\odot, Z_\odot) \equiv f_{\rm thick} \, n_\odot \,. \tag{22} $$

The halo is assumed to have stellar number density

$$ n_{\rm halo}(R, Z) = n_\odot \, f_h \left( \frac{R_{\rm eff}}{R_\odot} \right)^{-\eta} \,, \tag{23} $$

with

$$ R_{\rm eff} \equiv \sqrt{R^2 + (Z/q_h)^2 + R_\epsilon^2} \,. \tag{24} $$

Here, $q_h$ controls the oblateness of the halo and $\eta$ controls the steepness of the power law. Following Sesar et al. (2011), the power law breaks at $R_{\rm eff} = R_{\rm br}$, becoming steeper. We therefore define $\eta_{\rm inner}$ and $\eta_{\rm outer}$, corresponding to the halo steepness inside and outside of the break. We introduce the distance scale $R_\epsilon$ over which the inner region of the halo is smoothed, in order to remove the singularity at the Galactic Center. We choose $R_\epsilon$ to be 500 pc; at this scale, it has a negligible effect on the halo density in the regions which Jurić et al. (2008) studied, but it prevents the halo from dominating over the disk at the Galactic Center and is the same scale adopted by Robin et al. (2003).

We employ the parameters given in Table 10 of Jurić et al. (2008) and Sesar et al. (2011). These are listed in Table 1. The adopted value of $f_h$, on the low end of the possible range inferred in Jurić et al. (2008), is chosen to better match observed PS1 color-magnitude diagrams at high Galactic latitudes. Throughout, we use $R_\odot = 8$ kpc and $Z_\odot = 25$ pc. The shape of the distance prior for a line of sight centered on $\ell = 90^\circ$ and $b = 10^\circ$ is shown in Fig. 2.

TABLE 1
Stellar Number Density Parameters

Thin Disk            Thick Disk            Halo
R_thin   2150 pc     R_thick   3261 pc     R_br†      27.8 kpc
Z_thin   245 pc      Z_thick   743 pc      q_h†       0.70
                     f_thick   0.13        f_h        0.003
                                           η_inner†   2.62
                                           η_outer†   3.80

† Values from Sesar et al. (2011). All other adopted values are from Jurić et al. (2008).

4.2.2. Metallicity

We adopt the model of Galactic metallicity developed in Ivezić et al. (2008) and Bond et al. (2010), assigning separate metallicity distributions to the disk and halo. The metallicity distribution of the disk varies with height above the Galactic plane, and is thus dependent on line-of-sight distance. The metallicity prior takes the form

$$ p([{\rm Fe/H}] \,|\, \mu) = p([{\rm Fe/H}] \,|\, \mu, {\rm disk}) \, p({\rm disk} \,|\, \mu) + p([{\rm Fe/H}] \,|\, {\rm halo}) \, p({\rm halo} \,|\, \mu) \,. \tag{25} $$

The membership probabilities are simply

$$ p({\rm disk} \,|\, \mu) = \frac{n_{\rm thin}(\mu) + n_{\rm thick}(\mu)}{n_{\rm thin}(\mu) + n_{\rm thick}(\mu) + n_{\rm halo}(\mu)} \,, \tag{26} $$

$$ p({\rm halo} \,|\, \mu) = 1 - p({\rm disk} \,|\, \mu) \,, \tag{27} $$

which can be calculated on the basis of the preceding discussion (§4.2.1).

For the disk, stellar metallicity is distributed as a sum of Gaussians. The mean of each Gaussian varies with height above the Galactic plane, so that the prior is best

written in terms of cylindrical coordinates:

$$ p([{\rm Fe/H}] \,|\, Z, {\rm disk}) = c \, \mathcal{N}([{\rm Fe/H}] \,|\, a(Z), \sigma_D) + (1 - c) \, \mathcal{N}([{\rm Fe/H}] \,|\, a(Z) + \Delta_a, \sigma_D) \,, \tag{28} $$

where

$$ a(Z) = a_D + \Delta_\mu \, e^{-|Z|/H_\mu} \,. \tag{29} $$

The parameters $a_D$, controlling the central disk metallicity, and $\Delta_\mu$ and $H_\mu$, describing the vertical metallicity gradient in the disk, are defined in Table 2. We assume the halo metallicity to be spatially invariant, and distributed as

$$ p([{\rm Fe/H}] \,|\, {\rm halo}) = \mathcal{N}([{\rm Fe/H}] \,|\, a_H, \sigma_H) \,. \tag{30} $$

The parameters, $a_H$ and $\sigma_H$, describing the halo metallicity, are given in Table 2. The metallicity is plotted in Fig. 3 as a function of height above the Galactic midplane in the Solar neighborhood.

TABLE 2
Metallicity Parameters

Disk               Halo
a_D    −0.89       a_H    −1.46
σ_D    0.20        σ_H    0.30
c      0.63
Δ_a    0.14
Δ_µ    0.55
H_µ    0.5 kpc

Fig. 3. The metallicity prior, $p([{\rm Fe/H}] \,|\, Z)$, in the Solar neighborhood ($R = 8$ kpc). High above the plane of the Galaxy, where the halo dominates, the metallicity distribution has a constant mean and variance. In the plane, where the disk dominates, the mean decreases with scale height. Adapted from Fig. 9 of Ivezić et al. (2008).

4.2.3. Absolute Magnitude

We use the r-band absolute magnitude to parameterize the luminosity of each star. The joint prior on luminosity and distance is

$$ p(\mu, M_r) \propto \frac{dN(\mu, M_r)}{d\mu \, dM_r} \,. \tag{31} $$

The luminosity function is assumed to be the same in the halo and both disk components, and furthermore independent of position. The priors on distance and luminosity are then separable, so that

$$ p(\mu, M_r) = p(\mu) \, p(M_r) \,, \tag{32} $$

with $p(M_r) = {\rm LF}(M_r) \propto dN/dM_r$.

We adapt the PS1 luminosity functions provided by the Padova & Trieste Stellar Evolution Code (Bressan et al. 2012, PARSEC), assuming a Chabrier (2001) log-normal initial mass function. We average over luminosity functions for populations with ages of $\tau = 7 \pm 2$ Gyr and metallicities $[{\rm Fe/H}] = -0.5 \pm 0.5$ dex. Denote the luminosity function for a population of age $\tau$ and metallicity [Fe/H] as ${\rm LF}(M_r \,|\, \tau, [{\rm Fe/H}])$. The luminosity function we adopt is then

$$ {\rm LF}(M_r) \propto \int d\tau \int d[{\rm Fe/H}] \; {\rm LF}(M_r \,|\, \tau, [{\rm Fe/H}]) \times \exp\left[ -\frac{(\tau - \tau_0)^2}{2\sigma_\tau^2} - \frac{([{\rm Fe/H}] - [{\rm Fe/H}]_0)^2}{2\sigma_{[{\rm Fe/H}]}^2} \right] \,, \tag{33} $$

with $\tau_0 = 7$ Gyr, $\sigma_\tau = 2$ Gyr, $[{\rm Fe/H}]_0 = -0.5$ dex and $\sigma_{[{\rm Fe/H}]} = 0.5$ dex. In principle, it is possible to make the luminosity function depend on metallicity, by not averaging over [Fe/H] in Eq. (33). For simplicity, we assume here that the luminosity function is universal.

4.2.4. Reddening

As indicated in §3, the manner in which we have factorized the line-of-sight reddening problem requires us to place a flat prior on the color excess, $E$, for each star. The priors on the reddening profile are imposed on the parameters which control the line-of-sight reddening, rather than on individual stellar reddenings. For example, if one divides each line of sight into $N$ distance bins and assigns a different dust density $\rho_i$ to each bin, then the reddening prior would take the form $p(\rho_1, \rho_2, \ldots, \rho_N)$.

4.2.5. Survey Selection Function

The distance prior developed above only asks how many stars are in a thin shell at each distance. However, for a magnitude-limited survey, we would like instead to know the number of observable stars at a given distance. We should assign zero prior probability to the possibility of a star being observed which our instrument cannot detect. The fact that a star has been observed by a given instrument therefore tells us something about its stellar type, distance and extinction. Using the notation from Sale (2012), we define the vector $\vec{S}$ for each star, where $S_i$ is true if a star has been observed in passband $i$, and false if the star is not detected in that passband. The PS1 dataset used in this paper is not based on forced photometry, so there is a separate probability of a source being detected in each passband. If forced photometry were conducted, there would be one single probability $p(S)$, equal to the probability of detecting the source in at least one of the passbands. Including this information in the single-star posterior, Eq. (13), we get

$$ p(\mu, E, \vec{\Theta} \,|\, \vec{m}_{\rm obs}, \vec{S}) \propto p(\vec{m}_{\rm obs} \,|\, \mu, E, \vec{\Theta}, \vec{S}) \times p(\mu, E, \vec{\Theta} \,|\, \vec{S}) \,. \tag{34} $$

But the prior is now just

$$ p(\mu, E, \vec{\Theta} \,|\, \vec{S}) \propto p(\vec{S} \,|\, \mu, E, \vec{\Theta}) \, p(\mu, E, \vec{\Theta}) \,, \tag{35} $$

so in full,

$$ p(\mu, E, \vec{\Theta} \,|\, \vec{m}_{\rm obs}, \vec{S}) \propto p(\vec{m}_{\rm obs} \,|\, \mu, E, \vec{\Theta}, \vec{S}) \, p(\mu, E, \vec{\Theta}) \times p(\vec{S} \,|\, \mu, E, \vec{\Theta}) \,. \tag{36} $$

The first term is simply the likelihood we found earlier, since the knowledge that the star has been detected has no effect on the apparent magnitudes the model predicts, assuming the stellar type, distance and reddening are known. That is to say,

$$ \vec{m}_{\rm mod} = \vec{M}_{\rm mod}(\vec{\Theta}) + \vec{A}(E) + \mu \,, \tag{37} $$

and

$$ p(\vec{m}_{\rm obs} \,|\, \mu, E, \vec{\Theta}, \vec{S}) = \mathcal{N}(\vec{m}_{\rm obs} \,|\, \vec{m}_{\rm mod}, \vec{\sigma}) \,. \tag{38} $$

The only element of the calculation which changes when we take into account Malmquist bias is therefore the prior, which picks up an extra factor of

$$ p(\vec{S} \,|\, \mu, E, \vec{\Theta}) = p(\vec{S} \,|\, \vec{m}_{\rm mod}) = \prod_i p(S_i \,|\, m_{{\rm mod},i}) \,. \tag{39} $$

This is the survey selection function.

If forced photometry were used instead, then we would have a single detection parameter $S$, denoting that the source was detected in at least one passband, and the survey selection function would be

$$ p(S = {\rm true} \,|\, \mu, E, \vec{\Theta}) = p(S \,|\, \vec{m}_{\rm mod}) = 1 - \prod_i p(S_i = {\rm false} \,|\, m_{{\rm mod},i}) \tag{40} $$

in place of the expression in Eq. (39).

We therefore require an estimate of the completeness of the survey in each band, as a function of apparent magnitude. We determine completeness by comparison with point-source detections in the 275 deg² SDSS Stripe 82 survey (York et al. 2000; Annis et al. 2011). The co-added Stripe 82 images go more than a magnitude deeper than the individual PS1 3π images, allowing us to use Stripe 82 detections as a complete catalog of point sources past the detection limits of the PS1 3π survey. For each Stripe 82 source, we determine whether there is a Pan-STARRS 1 detection within 1″. The completeness fraction of the PS1 3π survey is the percentage of Stripe 82 detections with a PS1 match.

We determine the completeness fraction of the PS1 3π survey as a function of

$$ \Delta m \equiv m - m_{\rm lim} \,, \tag{41} $$

where $m$ is the PS1 magnitude, and $m_{\rm lim}$ is an estimate of the local PS1 5σ magnitude limit, based on the point-spread function of nearby PS1 detections and local sky and read noise. We divide the SDSS Stripe 82 footprint into HEALPix nside = 128 pixels (with ∼27′ scale). In each pixel, we select all Stripe 82 detections classified as stars, and transform their ugriz magnitudes to $grizy_{P1}$, using color transformations derived by Finkbeiner (in prep.), based on standard-star catalogs. In each pixel, we determine the PS1 limiting $grizy_{P1}$ magnitudes from the median limiting magnitudes estimated for individual PS1 detections. For each Stripe 82 detection in the pixel, we obtain $\Delta m$ in each band by subtracting the local limiting magnitude from the transformed detection magnitude. In each passband, we bin Stripe 82 detections by $\Delta m$, obtaining an empirical estimate of the completeness in each bin from the number of PS1 matches.

We find that the completeness fraction is reasonably well fit by

$$ p(S_i = {\rm true} \,|\, m_{{\rm mod},i}) = \left[ 1 + \exp\left( \frac{m_{{\rm mod},i} - m_{{\rm lim},i} - \Delta m_1}{\Delta m_2} \right) \right]^{-1} \,, \tag{42} $$

where $m_{{\rm lim},i}$ is a limiting magnitude calculated for each point-source detection in the PS1 3π survey, equal to the magnitude of a source that would be detected at 5σ in one exposure, given the sky and read noise. $\Delta m_1 = 0.16$ mag and $\Delta m_2 = 0.2$ mag are fitting parameters. The positive value of $\Delta m_1$ indicates that the PS1 pipeline goes somewhat deeper than our naive estimate $m_{\rm lim}$. The same fitting parameter values reproduce the completeness curve in all five PS1 passbands reasonably well, reflecting the consistency of the PS1 optics and pipeline across the entire filter set. The empirically measured completeness fraction and our fit are plotted for each passband in Fig. 4.

5. SAMPLING METHOD

5.1. Individual Stars

We use Markov Chain Monte Carlo sampling to explore the parameter space for individual stars. The sampling must be performed with great care, owing to two features of the distributions $p(\mu, E)$. First, the distributions are invariably highly elongated, and second, they are often multimodal. The elongation stems from the close alignment between the reddening vector and the stellar locus in the PS1 bands, as shown in Fig. 5. The multimodality has two causes. First, the reddening vector in general intersects the gri stellar locus in two locations, creating a degeneracy between blue and red main-sequence stars. Second, the PS1 bands do not distinguish dwarfs from giants, leading to the possibility that a star can be either a faraway giant or a nearby dwarf.

These degeneracies are easier to visualize if we consider only three passbands, as shown in Figure 5. In this reduced space, stellar photometry is fully described by a single overall observed magnitude and two colors. If we observe a star at a given location in color-color space,

we can then move backwards along the reddening vector until we intersect the stellar locus. One can then compare the observed apparent magnitude with the absolute magnitude of the stellar locus at the point of intersection. One thus obtains both a reddening and distance for the star. If there are multiple intersections, then the observed star could be of different intrinsic types, and thus have different reddening and distance combinations. The relative probability of each mode is in practice given by the space of stellar types lying close to the gray line, as well as the priors applied to the problem.

Fig. 4.— Completeness of the PS1 3π survey, as a function of magnitudes past a locally estimated 5σ magnitude limit. The completeness is estimated by comparison with SDSS Stripe 82 (York et al. 2000; Annis et al. 2011). The shaded curve shows the median completeness, with 1σ-range of completeness in each bin, based on estimates in 27′ pixels. The solid black line shows our fit to the completeness curve. The dashed black line shows the effect of adding a small floor to our fit, which takes into account an assumed small rate of false coincidences between PS1 and Stripe 82 detections.

Fig. 5.— Sketch of how photometric parallax works, for illustrative purposes, adapted from Berry et al. (2011). A star is observed at location 1 in color-color space. Its de-reddened colors may lie along any point on the gray line, parallel to the reddening vector. The intersections of this line with the model stellar locus, labeled 2 and 3, represent the most likely intrinsic stellar types. The posterior density for the star will thus have two modes: one at larger distance and lesser reddening (2), and one at smaller distance and greater reddening (3). For simplicity, we assume Solar metallicity in this example. This is how one would make a distance and reddening determination by eye. Our more rigorous Bayesian method takes into account photometric uncertainties, as well as priors on stellar type and Galactic structure.

We perform a Markov Chain Monte Carlo sampling of these surfaces using a custom C++ implementation of the affine-invariant sampler introduced in Goodman & Weare (2009) and recently given in a python implementation by Foreman-Mackey et al. (2012). We employ both short-range “stretch” and long-range “replacement” moves (Goodman & Weare 2009). The long-range moves allow mixing between widely separated modes in parameter space, but are more computationally expensive than the short-range “stretch” steps. The replacement moves are closely related to the Normal Kernel Coupler of Warnes (2001). We have found that with the addition of long-range “replacement” steps, the affine-invariant sampler is capable of handling the multimodality of the problem, and that it is well suited to the strong degeneracies in parameter space. For each star, we sample from the stellar posterior density in four independent runs and check convergence with the Gelman-Rubin diagnostic (Gelman & Rubin 1992). The Gelman-Rubin diagnostic essentially verifies that the variance between the means of separate chains is small compared with the variance within the chains. If independent MCMC runs produce significantly different estimates of the parameter means, one or more of the runs must not have converged. Each run employs 20 samplers, with a mix of 80% stretch steps and 20% replacement steps, and 2000 steps per sampler. The first 1000 steps from each sampler are discarded as burn-in. Thus, excluding the burn-in phase, a total of 80000 samples are drawn for each star across the four chains, with the number of independent samples being lower. On four cores of a 2.67 GHz Intel Xeon X5650 with 12 MB of L3 cache, our run time per star is typically 0.15 seconds per run per core, or 0.8 CPU seconds for four independent runs.

5.2. Bayesian Evidence & Outlier Rejection

When an observed object does not match our stellar model, the inferences we draw on its distance, reddening and stellar type are unreliable. One means of quantifying the reliability of our inferences for an individual star is to compute the evidence

$$ Z \equiv \int d\mu \, dE \, d\vec{\Theta} \; p(\vec{m}, \vec{S} \,|\, \mu, E, \vec{\Theta}) \; p(\mu, E, \vec{\Theta}) , \tag{43} $$

which is the probability density of drawing the observed magnitudes $\vec{m}$ from the stellar model and observing the star in the PS1 survey. A low evidence indicates that the observed point source is a member of a stellar population not included in our model (e.g., a young blue giant or an unresolved binary system with colors that do not match any stellar template), is not a star (e.g., a quasar or galaxy), that the errors in the photometry have been underestimated, or that the reddening vector is inaccurate. Here, our approach is similar to Berry et al. (2011), which identifies objects which do not fit the stellar model by a threshold χ² statistic. There is no direct analogue for the χ² statistic in a Bayesian framework, but evidence may serve a similar purpose in model comparison.

Note that as we do not include priors on the extinction to individual stars, but rather on the line-of-sight reddening profile, we do not strictly calculate the evidence. Instead, we calculate the evidence of a model with a prior on E with wide support (i.e., a prior which allows E to take on a wide range of values), such that the prior is nearly constant across all relevant reddenings:

$$ p(E) \approx \begin{cases} a & 0 \le E \lesssim E_0 \\ a f(E) & E \gtrsim E_0 \end{cases} , \tag{44} $$

where f(E) is some integrable function whose precise behavior is unimportant, a is a normalizing constant, and $E_0$ is some large extinction. We thus effectively calculate the evidence Z for a model of this form, up to a constant factor a, which is the same for every star. This allows for outlier rejection, based on comparisons between the evidence for different stars.

We employ a modified harmonic mean estimate, which is obtained directly from the Markov chain produced in sampling the posterior density, and thus requires little additional computation (Gelfand & Dey 1994; Robert & Wraith 2009). This method is presented in more detail in Appendix A.

5.3. Line-of-Sight Fitting

We choose the HEALPix pixelization scheme (Gorski et al. 2005) as our method of dividing the sky into individual lines of sight. Once we have determined $p(\mu, E \,|\, \vec{m})$ for each star in a given HEALPix pixel and rejected stars which fall below the evidence cut, we can apply Eq. (12) to determine the posterior probability of the parameters $\vec{\alpha}$ describing the reddening profile. We parameterize the reddening profile as a piecewise-linear function in distance modulus, with $\alpha_i = \Delta E^{(i)}$ describing the rise in r-band reddening in distance segment i. We split up each line of sight into 30 distance segments of equal width in µ, with the closest distance being at µ = 4, corresponding to 63 pc, and the furthest distance being at µ = 19, corresponding to 63 kpc. It must be cautioned that, in general, our method does not tightly constrain reddening at this latter distance, where PS1 observes very few stars. In addition to requiring that reddening increase monotonically with distance, we apply a wide log-normal prior on the differential reddening in each distance bin, $\Delta E^{(i)}$.

We use the affine-invariant sampler to draw a representative sample of possible reddening profiles. We sample from the posterior density given by Eq. (6). We can thus produce a three-dimensional reddening map which includes the uncertainty in reddening as a function of distance.

6. TESTS WITH MOCK PHOTOMETRY

The first and most straightforward test of our method is to generate mock photometry for stars of varying stellar type, distance and extinction, and to see how well we can recover those parameters. This is less of a test of the particular stellar model used than a demonstration that photometry alone is capable of sufficiently constraining stellar parameters. We find that our method is capable of accurately recovering both single-star parameters and the line-of-sight reddening profile.

6.1. Generating Mock Catalogs

In order to generate a mock photometric catalog for a particular region on the sky, we begin by drawing intrinsic stellar types (metallicities and absolute $r_{\rm P1}$ magnitudes) and distances from our priors. We assign a reddening to each star, either according to an assumed distance-reddening relationship, or from a reddening distribution we define, depending on the purpose of the mock catalog. For each star in the catalog, we generate model magnitudes, as described in §4.1.

We determine which passbands each star is detected in, according to our probabilistic PS1 completeness model, Eq. (42). In the remainder of the paper, we reject simulated stars which do not have 5-band detections.

We then apply magnitude-dependent Gaussian photometric errors to each simulated star to obtain observed magnitudes. The error we apply to each star in each passband is a function of the model apparent magnitude:

$$ \sigma^2(m) = \sigma_{\rm floor}^2 + \sigma_0^2 \exp\!\left[ \frac{2 \, (m - m_{\rm lim})}{\Delta m_3} \right] . \tag{45} $$

As before, $m_{\rm lim}$ is the 5σ limiting magnitude in the given passband. We set the error floor to $\sigma_{\rm floor} = 0.02$. For PS1 passbands, we find that $\sigma_0 = 0.16$ and $\Delta m_3 = 0.8$ give a reasonable fit to the photometric uncertainties.

Our final catalog thus contains noisy observed magnitudes of each star, along with the photometric uncertainty in each passband. As a final step, we plug the observed magnitudes back into Eq. (45) to obtain a new estimate of the photometric uncertainties for each star. The final catalog that we pass to our pipeline thus reports inexact photometric uncertainties, much as a realistic catalog would. The usefulness of these mock catalogs is that they allow us to generate a large amount of photometry for stars whose “true” distances and reddenings are known.

6.2. Single-Star Tests

We illustrate the typical appearance of single-star posterior distributions in distance and reddening in Fig. 6.
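The mock-observation step of §6.1 can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline's own code: the function names, the use of Python's `random` module, and the choice to test detection in every band before adding noise are assumptions made here. Only the functional forms and the constants are taken from Eqs. (42) and (45).

```python
import math
import random

# Fit parameters quoted in the text (all in magnitudes).
DM1, DM2 = 0.16, 0.20                          # Eq. (42)
SIGMA_FLOOR, SIGMA_0, DM3 = 0.02, 0.16, 0.8    # Eq. (45)


def detection_probability(m_mod, m_lim):
    """Eq. (42): probability that a source of model magnitude m_mod is detected."""
    return 1.0 / (1.0 + math.exp((m_mod - m_lim - DM1) / DM2))


def photometric_sigma(m, m_lim):
    """Eq. (45): magnitude-dependent Gaussian photometric error."""
    variance = SIGMA_FLOOR**2 + SIGMA_0**2 * math.exp(2.0 * (m - m_lim) / DM3)
    return math.sqrt(variance)


def observe_star(m_mod, m_lim, rng):
    """Hypothetical helper: simulate one star in all bands.

    m_mod, m_lim: per-band model and limiting magnitudes.
    Returns (m_obs, sigma_reported), or None if any band is undetected.
    """
    # Keep only stars detected in every band (the 5-band requirement).
    for m, lim in zip(m_mod, m_lim):
        if rng.random() >= detection_probability(m, lim):
            return None
    # Scatter each band by the error evaluated at the model magnitude ...
    m_obs = [rng.gauss(m, photometric_sigma(m, lim))
             for m, lim in zip(m_mod, m_lim)]
    # ... then report the error re-evaluated at the noisy observed magnitude.
    sigma_reported = [photometric_sigma(m, lim)
                      for m, lim in zip(m_obs, m_lim)]
    return m_obs, sigma_reported
```

For example, `observe_star([15.0] * 5, [22.0] * 5, random.Random(0))` returns noisy magnitudes whose reported errors sit essentially at the 0.02 mag floor, while a star several magnitudes fainter than the limit is almost always returned as `None`.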

TABLE 3
Uncertainty in Inferred Distances and Reddenings

                          Low Latitude                                High Latitude
                 ∆d/d*               ∆E(B−V)                 ∆d/d                ∆E(B−V)
−1 …             …^{+41%}            …^{+0.07}               …^{+48%}            …^{+0.09}
4 …              …^{+55%}            …^{+0.12}               …^{+79%}            …^{+0.10}
6 …              …^{+97%}            …^{+0.28}               …^{+92%}            …^{+0.22}
8 …              …^{+261%}           …^{+0.62}               …^{+31%}            …^{+0.33}
10 …             …^{+14%}            …^{+0.13}               …^{+15%}            …^{+0.10}
Dwarfs†          9^{+63%}_{−33%}     0.01^{+0.16}_{−0.20}    4^{+50%}_{−18%}     0.01^{+0.17}_{−0.12}
All Stars        −3^{+55%}_{−35%}    −0.01^{+0.12}_{−0.16}   2^{+45%}_{−23%}     0.01^{+0.16}_{−0.12}

* ∆d/d is given in percent. See Eq. (48).
† Dwarfs are defined here as all stars in the range 4 < M_r ≤ 12.

Fig. 6.— Distance and reddening estimates for four simulated stars. The joint posterior in distance and reddening is shown as a heat map. As this is mock photometry, we know the “true” distances and reddenings for the stars, which are shown as green dots. The true stellar parameters lie in regions of high inferred probability, as expected. The shape of the probability density functions traces that of the stellar locus. The probability density at closer distances corresponds to the main sequence, with increasing reddening compensating for the bluer intrinsic colors as one travels up the stellar locus. The peak in reddening corresponds to the main-sequence turnoff. Distances beyond the turnoff correspond to the giant branch.

For each of the four simulated stars in Fig. 6, the “true” distance and reddening are indicated by a dot, while the background heat map shows the probability density inferred by our pipeline.

In order to determine how far off our estimates are on average, we define the centered probability density

$$ \tilde{p}(\Delta\mu, \Delta E) \equiv \frac{1}{N} \sum_{i=1}^{N} p(\mu_i^* + \Delta\mu, \, E_i^* + \Delta E) , \tag{46} $$

where $\mu_i^*$ and $E_i^*$ are the true distance modulus and reddening, respectively, for star i. For a simulated line of sight, this function gives the average probability density of our inference being offset from the true stellar parameters by (∆µ, ∆E). We plot $\tilde{p}$ for a typical line of sight in Fig. 7. The typical spread in ∆µ and ∆E varies across different lines of sight, but the centered probability density generally peaks at the origin, as should be expected.

In the bottom two panels of Fig. 7, we show the effect of applying flat priors to the stellar parameters, in place of the priors developed in §4. The effect is to widen and bias the inferred probability density functions. Due to the near-alignment of the reddening vector with the PS1 stellar locus for much of the main sequence, this bias remains even for stars with low observational uncertainties. The stellar priors are thus important in correctly inferring stellar parameters from Pan-STARRS 1 photometry.

In the right two panels of Fig. 7, we only use inferences for stars with low signal-to-noise detections. The low signal-to-noise population is generated using an inflated error model that applies 3× the normal observational uncertainty to the mock photometry. As expected, the inferred parameters for such stars are less constrained, but they are nonetheless unbiased.

Fig. 7.— Centered and stacked probability densities on a linear scale for 5000 stars along a simulated line of sight pointed at ℓ = 90°, b = 20°. In the bottom panels, we fit the stars using flat priors, so that only the likelihood function comes into play. In the right panels, we show inferences for low signal-to-noise detections, generated using 3× the normal observational uncertainties. Removing the priors biases the inferred distances and, to a lesser extent, reddenings. Inferred distances and reddenings for stars with high signal-to-noise detections have smaller uncertainties. In each panel, each stellar probability density function has first been centered on the true distance and reddening, before the probability densities for the stars have been summed, as described in the text. The “X” shape of the stacked probability densities in the top-left panel reflects the existence of separate giant and dwarf modes. The feature stretching from the bottom left to the top right corresponds to the main sequence, while the perpendicular feature corresponds to the giant mode. The histograms bordering each panel show the distribution of ∆µ and ∆E(B−V), with the 15.87% to 84.13% region shaded.

In Table 3, we present typical uncertainties in the inferred distance and reddening of individual stars. To do this, we generate mock catalogs along two different lines of sight. For our high-Galactic-latitude target, we choose the North Galactic Pole, where the stellar population is dominated by the halo. Here, we apply reddenings of E(B−V) ≲ 0.1 to the simulated stars. For the low-Galactic-latitude target, we choose ℓ = 45°, b = 0°, and draw reddening uniformly from the range 0 ≤ E(B−V) ≤ 2. We run the two mock catalogs through our pipeline, and compare the inferred distances and reddenings, drawn from the posterior probability density p(µ, E), to the true values. For this test, we allow our inferred reddenings to go slightly negative (E(B−V) > −0.25), to avoid introducing a bias into the inferred values. We give the median, and 15.87th and 84.13th percentiles of ∆d/d and ∆E(B−V), equivalent to the one-standard-deviation range for a Gaussian distribution.

Uncertainties in distance modulus can be transformed to uncertainties in distance by making use of the relation

$$ d = (10\,{\rm pc}) \, 10^{\mu/5} . \tag{47} $$

Let $\mu_{\rm inferred} = \mu_{\rm true} + \Delta\mu$. Then,

$$ \frac{\Delta d}{d} \equiv \frac{d_{\rm inferred} - d_{\rm true}}{d_{\rm true}} = 10^{\Delta\mu/5} - 1 . \tag{48} $$

Similarly, we define ∆E(B−V) as $E(B-V)_{\rm inferred} - E(B-V)_{\rm true}$.

Our distance and reddening estimates are unbiased. However, if one selects a subsample of stars of a certain known type, a bias in distance is introduced. Thus, distance estimates for mock dwarf stars are biased high, as the model assigns some probability to the possibility of them being giants, which are intrinsically more luminous and would therefore lie farther away. Conversely, distances to giants are biased low. A star drawn at random, however, has an unbiased distance estimate. Distance and reddening constraints depend upon the quality of the photometry and direction on the sky, and therefore vary significantly on a star-by-star basis. It is, in general, more informative to look at the detailed shape of the posterior distribution for a given star in distance and reddening space (see Fig. 6).

Next, we test that the true stellar parameters are drawn from the probability density functions we calculate. For each simulated star, we derive the posterior density p(µ, E), based on the simulated photometry. Since we know the true distance modulus µ∗ and reddening E∗ of the star, a natural question is whether µ∗ and E∗ are drawn from p(µ, E). This cannot be answered for a single star, but we can test this hypothesis for a large number of stars. Assign a percentile to a given star as follows:

P (p

6.3. Mock Line of Sight

Finally, we demonstrate that we are able to recover line-of-sight reddening profiles for simulated photometry. We first invent an arbitrary relationship E(B−V)(µ) between distance and reddening. We add in low-level scatter to the distance-reddening relationship, as the reddening relation across one HEALPix pixel may vary. We then generate mock photometry for 150 stars along the line of sight. Following the procedure outlined in §3, we use the simulated photometry to determine a posterior density in distance and reddening space for each star, and then combine the information from all of the stars to find the range of allowable reddening profiles. Our final product is thus a set of reddening profiles, drawn from the probability density over reddening profiles (Eq. (12)). We parameterize the reddening as a piecewise-linear function in distance, and apply a weak log-normal prior to the differential reddening in each distance segment, as described in §5.3. The results for one simulated line of sight, shown in Fig. 9, demonstrate that we are able to correctly infer the reddening profile for mock data. The method produces the best results at distances where there are many stars. Nearby and at large distances, where there are comparatively few stars to constrain the reddening profile, the uncertainties in reddening can become very large. However, at intermediate distances (µ ∼ 10 to ∼ 15 for typical lines of sight, corresponding to 1 to 10 kpc), the fit produces uncertainties on the order of ∆E(B−V) ∼ 0.05 mag, consistent with the intrinsic scatter in reddening which we introduce into the simulated photometry.

7. COMPARISON WITH DATA

7.1. Colors

We compare our model colors to PS1 stellar photometry from low-extinction regions at high Galactic latitudes. It is important to choose low-extinction regions, so that assumptions about the reddening law and what percentage of the total dust column is in front of each star play only a minor role. This allows us to compare the intrinsic colors in our model with those of real stars. We de-redden the stellar colors assuming that they are behind the full dust column predicted by Schlegel et al. (1998, SFD).

The results for the North Galactic Pole are shown in Fig. 10. We compute an evidence for each star, as described in §5.2. As expected, objects which lie far from the stellar locus in color-color space tend to have lower evidence. This helps us to reject objects that are either not stars, are not included in our stellar model (e.g., young and BHB stars), or that have particularly bad photometry. In the window shown here, 15% of the detected objects would fail an evidence cut of ln Z > ln Z_max − 20. These objects tend to have problematic photometry for which the PS1 pipeline may have produced inaccurate results, though some are variables, quasars, and unrecognized galaxies, which our technique is not designed to handle. Our line-of-sight reddening inferences are not strongly dependent on the choice of the evidence threshold.

7.2. Distances

Correct distance determination requires not only correct model colors, but correct absolute magnitudes. We therefore compare our stellar models to globular and open clusters. In Fig. 11, we compare our model magnitudes to photometry from four globular and open clusters.

We find that the model absolute magnitudes match the main sequence. The model magnitudes are unreliable past the main-sequence turnoff, particularly for younger clusters. Our model magnitudes trace the giant branch of intermediate-age clusters (typical of the age of most stars in the Galaxy) somewhat better. Nevertheless, we expect that most of the information in our dust maps will come from the main sequence, where distance and reddening estimates are better constrained. Massive young blue stars and blue horizontal branch stars, which are not included in the stellar templates, are found to have low evidence, allowing them to be identified and excluded from the line-of-sight dust inference. Inclusion of age-dependent stellar models, and therefore more reliable distance and reddening determinations for the most massive stars, could potentially increase the distance to which our dust maps are reliable, and is an important direction for future work.

7.3. Reddenings

In order to test the accuracy of our reddening inferences for individual stars, we compare our photometric reddenings to independently measured reddenings for a sample of stars. The Sloan Extension for Galactic Understanding and Exploration (SEGUE; Yanny et al. 2009), part of SDSS-II, provides a convenient set of stars for which one can independently determine reddening. The SEGUE survey obtained moderate-resolution spectroscopy for 240,000 stars with SDSS photometry. Whereas most of the SDSS footprint is at low reddening, some of the SEGUE targets are at moderate reddening, up to ∼1 mag in E(B−V). The SEGUE Stellar Parameter Pipeline (SSPP) fits an atmospheric model to each star to derive the temperature, metallicity, and gravity of the star, as well as other parameters (Lee et al. 2008a,b; Allende Prieto et al. 2008). These stellar parameters were used by Schlafly & Finkbeiner (2011) to predict the intrinsic colors of stars, and to study the effect of reddening by attributing the differences between the observed and intrinsic colors to dust. We use the reddening estimates of Schlafly & Finkbeiner (2011) in the four independent SDSS colors, in concert with their recommended R_V = 3.1 reddening vector (Fitzpatrick 1999), to estimate E(B−V) to each star. The details of deriving the reddening based on SEGUE-determined intrinsic colors and SDSS photometric colors are described in Appendix B.

We use the same sample of SEGUE targets as Schlafly & Finkbeiner (2011). This sample excludes objects targeted as white dwarfs, and also removes M dwarfs, for which the stellar parameters are less reliable. We also require that each SEGUE target have a PS1 counterpart. Distances and reddenings to each star are then inferred as described in §4.1, and compared to the SEGUE-determined reddenings. For this comparison, we allow our photometric reddening estimates to be negative, for consistency with the SEGUE-derived reddenings. We save 100 reddening samples from the Markov chain for

each star, as well as the maximum-posterior density reddening. In the left panel of Fig. 12, we compare the maximum-posterior density Bayesian reddening estimate with the SEGUE-derived reddening for 200,000 stars. We bin the stars by the reddening expected from the dust maps of Schlegel et al. (1998, SFD), and plot a histogram of the difference in the two reddening measures in each bin. We place the SFD reddening on the x-axis because it is a good proxy for reddening and is independent of both of the reddening estimates we wish to compare, while placing either the SEGUE-derived reddening or the Bayesian reddening along the x-axis can introduce spurious trends in the resulting comparison. We find that the mode of our reddening estimates is unbiased over a range of E(B−V) = 0 to 1, above which there are too few stars in the SEGUE sample to extend the comparison. The scatter in the difference between the two reddening estimates is approximately 0.12 mag in E(B−V), with the overall estimate being unbiased to within 0.03 mag.

Fig. 9.— Recovery of the line-of-sight reddening profile from simulated photometry for 150 stars. The large panel shows the inferred posterior densities of the stars, stacked on top of one another. The contrast is stretched at each distance for purposes of visualization. This gives a picture of the information which is fed into the second stage of our analysis, in which we recover the reddening as a function of distance from the individual stellar probability densities (see Eq. (12)). The stacked image, however, plays no direct role in this inference. The curves show possible reddening profiles, conditioned on the mock photometry. The green curve traces the most probable reddening profile. The remaining curves are colored according to the logarithm of their probability density, with blue denoting high probability and red denoting low probability. We recover a reddening profile similar to the original, which had a single cloud of depth E(B−V) = 0.5 at distance modulus µ = 8.5 (shown as a dashed black line on the plot). The slight, gradual increase in inferred reddening beyond the cloud is due to the constraint that differential reddening in each bin be non-negative, and to the log-normal prior on differential reddening. A priori, having no reddening away from the cloud is unlikely, and this is reflected in our inference. The upper four panels show individual stellar posterior density functions over the same domain, with the same reddening profiles overplotted. Each star is consistent with the range of possible reddening profiles.

Our Bayesian reddening and distance estimates assume that stars are drawn at random from the observable stars on each line of sight. The SEGUE survey, however, does not target stars at random, but instead targets only subsets of stars of particular interest. Moreover, the subset of SEGUE-observed stars for which we have reliable reddening estimates does not extend to the M-dwarfs, meaning that our sample of SEGUE-derived reddening estimates uses only intrinsically blue stars. Our analysis therefore considers intrinsically redder (and hence less reddened) stars than are actually present in our sample of SEGUE-observed stars. In order to simulate the effect of excluding M-dwarfs, we modify the luminosity function prior to assign zero probability for M_r > 6. When we account for this effect, the distribution of the difference between our reddening estimates and the SEGUE reddening estimates is unbiased. To illustrate this, instead of presenting a single reddening difference for each star, we show 100 samples of the distribution of reddening differences between our Bayesian reddening estimate and the SEGUE-derived reddening estimate. The resulting residuals are shown as a function of SFD reddening in the right panel of Fig. 12. The residuals are unbiased at the 0.01 mag level, with a scatter of E(B−V) = 0.13 mag. This result is indicative of the accuracy we achieve for high-signal-to-noise detections, as most SEGUE targets are well above

Fig. 10.— Comparison of PS1 stellar colors in the vicinity of the North Galactic Pole with our model colors. Each object is colored according to the evidence Z we compute. Objects represented by red dots have a low probability of being drawn from our stellar model, and are rejected for the line-of-sight reddening determination. The solid black line traces our model stellar colors. Our main-sequence model colors do not depend on metallicity, while the model colors for the giant branch have a slight metallicity dependence.
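The evidence-based rejection illustrated in Fig. 10 amounts to a threshold on ln Z. The sketch below is illustrative only: the function name is hypothetical, and it assumes that ln Z_max is the largest log-evidence among the objects being compared (for example, those in one pixel), which the text does not spell out. The default threshold of 20 is the value quoted in §7.1.

```python
def passes_evidence_cut(ln_z, threshold=20.0):
    """Flag objects with ln Z > ln Z_max - threshold.

    ln_z: list of log-evidences, one per object (assumed finite).
    Z_max is taken to be the maximum over the list, which is an assumption.
    """
    ln_z_max = max(ln_z)
    return [value > ln_z_max - threshold for value in ln_z]


# Toy log-evidences: the third object sits far below the rest and is rejected.
flags = passes_evidence_cut([-3.0, -5.5, -40.0, -12.0])
```

Because Z is only computed up to the constant factor a of Eq. (44), only differences in ln Z between stars carry meaning, which is why a cut defined relative to the maximum is a natural choice.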

the detection limit in PS1.

Fig. 11.— PS1 color-magnitude diagrams of three globular and one open cluster. For each cluster, the model isochrone with the catalog metallicity of the cluster is overplotted. The stellar photometry has been de-reddened and shifted by the catalog distance modulus to produce absolute magnitudes. The reddening vector is plotted in the top left corner of each panel in red for reference. Each star is colored by its evidence, with red stars unlikely to be drawn from our stellar model. In particular, stars which are blueward of the main-sequence turnoff, which are bluer than any star in our template library, have low evidence.

Fig. 12.— Histogram of the difference between the mode of the Bayesian reddening inference and the SEGUE-derived reddening, as a function of SFD reddening (see §7.3). The blue envelopes mark the 15.87 and 84.13 percentiles of the residuals, equivalent to one standard deviation for a normal distribution, while the central blue curve marks the median of the residuals. The left panel compares the mode of the Bayesian posterior density functions with the mean of the SEGUE-derived reddenings. The right panel compares random samples drawn from the Bayesian posteriors with random samples drawn from the SEGUE-derived posteriors, which are Gaussian.

8. CONCLUSION

We have presented a general method for deriving a three-dimensional map of Galactic reddening from stellar photometry. Our technique is based on grouping stars into pixels, determining the joint posterior of distance and reddening for each star, and then determining the most probable reddening-distance relation in each pixel. We have shown that this method correctly recovers the distance-reddening relationship for simulated lines of sight. We have additionally shown by comparison with SEGUE-derived reddenings that for high-SNR detections, our Bayesian reddening estimates are unbiased at the 0.01 mag level, with a scatter of ∼0.13 mag in E(B−V). Based on comparisons with mock catalogs, in highly-reddened regions of the Galaxy, our distance inferences have typical uncertainties of +47% on the high end, and −21% on the low end. In high-Galactic-latitude regions with low reddening, our distances have typical uncertainties of +52% on the high end, and −38% on the low end. These uncertainties may be reduced by feeding back information on reddening as a function of distance, derived from all the stars along the line of sight. A subsequent paper will present the results of applying the techniques developed here to construct a 3D reddening map covering the δ > −30° sky.

In addition to determining the dust density in the nearby Galaxy, our method can be used to determine the distribution of stars in the Galactic plane. Earlier optical studies of the distribution of stars in the Galaxy traditionally consider only high-latitude stars, where the correction for dust extinction is straightforward (e.g., Jurić et al. 2008). Infrared surveys of the plane are less sensitive to dust extinction, but their wavelength coverage also makes them less sensitive to intrinsic stellar type, rendering photometric distances uncertain. Our technique provides distances to stars throughout the Galactic plane, enabling future studies of the distribution of stars in the disk.

The technique described in this paper is not limited to PS1 photometry. Inclusion of information from the 2MASS J, H and K_s bands, WISE bands, as well as SDSS u-band photometry will improve our distance and reddening estimates. In addition, kinematic information, such as proper motions, may be incorporated into our framework in order to allow a more precise determination of stellar distances.

Upcoming surveys will also dramatically enhance our ability to measure the distances and reddenings to stars in the Galaxy. The LSST (Ivezić et al. 2008) will provide deeper photometry spanning a similar set of filters as those used in SDSS and Pan-STARRS 1, providing photometry for the sky south of δ = +34.5°. In the nearer future, the Dark Energy Survey (DES) will survey a complementary 5000 deg² of sky to Pan-STARRS 1, in a similar filter set (The Dark Energy Survey Collaboration 2005). The Gaia mission (Lindegren et al. 1994), meanwhile, will provide multiband photometry and low-resolution spectroscopy alongside parallax distance measurements and proper motions for one billion stars. Gaia’s parallax distances, in particular, will break many of the degeneracies in our model for r_P1 ≲ 20 stars, while its proper motions will aid in inferring the population each star belongs to. These new datasets will increase the power of our method to determine Galactic reddening and structure.

The Pan-STARRS1 Surveys (PS1) have been made possible through contributions of the Institute for Astronomy, the University of Hawaii, the Pan-STARRS Project Office, the Max-Planck Society and its participating institutes, the Max Planck Institute for Astronomy, Heidelberg and the Max Planck Institute for Extraterrestrial Physics, Garching, The Johns Hopkins University, Durham University, the University of Edinburgh, Queens University Belfast, the Harvard-Smithsonian Center for Astrophysics, the Las Cumbres Observatory Global Telescope Network Incorporated, the National Central University of Taiwan, the Space Telescope Science Institute, the National Aeronautics and Space Administration under Grant No. NNX08AR22G issued through the Planetary Science Division of the NASA Science Mission Directorate, the National Science Foundation under Grant No. AST-1238877, the University of Maryland, and Eotvos Lorand University (ELTE). Gregory M. Green and Douglas P. Finkbeiner are partially supported by NSF grant AST-1312891. The computations in this paper were run on the Odyssey cluster supported by the FAS Science Division Research Computing Group at Harvard University.

APPENDIX

HARMONIC MEAN ESTIMATE OF THE BAYESIAN EVIDENCE

The harmonic mean approximation, developed in Gelfand & Dey (1994), allows one to compute the Bayesian evidence using samples returned from a Markov Chain Monte Carlo simulation. For a model with parameters $\theta$ and data $D$, Bayes' rule tells us that

$$ p(\theta \,|\, D) = \frac{p(D \,|\, \theta) \, p(\theta)}{p(D)} . \tag{A1} $$

We wish to compute the evidence $p(D)$, often denoted by $Z$. Multiplying each side of the above by an arbitrary function $\phi(\theta)$, rearranging terms and taking the integral over all $\theta$,

$$ \frac{1}{p(D)} \int \phi(\theta) \, \mathrm{d}\theta = \int \mathrm{d}\theta \, p(\theta \,|\, D) \, \frac{\phi(\theta)}{p(D \,|\, \theta) \, p(\theta)} . \tag{A2} $$

The r.h.s. is simply the expectation value of

$$ \frac{\phi(\theta)}{p(D \,|\, \theta) \, p(\theta)} \tag{A3} $$

for samples drawn from the posterior density $p(\theta \,|\, D)$. This is convenient, since MCMC methods draw a set of samples from the distribution $p(\theta \,|\, D)$. If the integral of $\phi(\theta)$ is normalized to unity, then

$$ \frac{1}{p(D)} \equiv \frac{1}{Z} \approx \left\langle \frac{\phi(\theta)}{p(D \,|\, \theta) \, p(\theta)} \right\rangle_{\rm chain} . \tag{A4} $$

This estimate has finite variance as long as $\phi$ has steeper wings than $p(\theta \,|\, D)$ (Robert & Wraith 2009). We choose $\phi$ to be constant within an ellipse centered on a point of high density within the chain, and zero outside (Robert & Wraith 2009). The ellipse is aligned with the principal axes of the covariance matrix, in order to ensure that it contains only well-sampled regions of parameter space.

We find a point of high density by first centering an ellipse on a random point in the chain. Because the points in the chain are sampled proportionately to the posterior probability density, this point is already likely to lie in a well-sampled region of parameter space. We then find the mean of the points in the chain falling in the ellipse, and move the center of the ellipse to that position in parameter space. One can iterate this procedure several times to settle into a densely sampled region of parameter space. The size of the ellipse we use to define $\phi(\theta)$ is chosen such that a preset fraction of samples in the chain are enclosed. For our calculations, we iterate five times to find a dense region of parameter space, and scale the ellipse such that it contains 5% of samples.
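The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the paper: the function name `log_evidence_harmonic_mean`, its arguments, and the choice to re-scale the ellipse to the enclosed fraction at every re-centering step are all hypothetical. It takes posterior samples together with the unnormalized log posterior at each sample, builds a uniform $\phi$ on an ellipse aligned with the chain covariance, and evaluates Eq. (A4) in log space.

```python
import numpy as np
from math import lgamma, log, pi


def log_evidence_harmonic_mean(chain, log_post_unnorm, frac=0.05, n_iter=5, seed=0):
    """Estimate ln Z from posterior samples using Eq. (A4) with an elliptical phi.

    chain           : (n_samples, n_dim) array of posterior samples
    log_post_unnorm : (n_samples,) array of ln[p(D|theta) p(theta)] at each sample
    frac, n_iter    : enclosed sample fraction and number of re-centering steps
    """
    chain = np.asarray(chain, dtype=float)
    log_post_unnorm = np.asarray(log_post_unnorm, dtype=float)
    n, d = chain.shape
    rng = np.random.default_rng(seed)

    # Ellipse shape: aligned with the principal axes of the chain covariance.
    cov = np.cov(chain, rowvar=False).reshape(d, d)
    inv_cov = np.linalg.inv(cov)

    def dist2(center):
        # Squared Mahalanobis distance of every sample from `center`.
        delta = chain - center
        return np.einsum("ni,ij,nj->n", delta, inv_cov, delta)

    # Start at a random sample, then move the centre to the mean of the
    # samples inside the ellipse that encloses a fraction `frac` of the chain.
    center = chain[rng.integers(n)]
    for _ in range(n_iter):
        d2 = dist2(center)
        center = chain[d2 <= np.quantile(d2, frac)].mean(axis=0)

    d2 = dist2(center)
    r2 = np.quantile(d2, frac)
    inside = d2 <= r2

    # ln(volume) of the d-dimensional ellipsoid; phi = 1/volume inside, 0 outside.
    _, logdet = np.linalg.slogdet(cov)
    log_vol = (0.5 * d * log(pi) - lgamma(0.5 * d + 1.0)
               + 0.5 * d * log(r2) + 0.5 * logdet)

    # ln(1/Z) = ln[(1/n) * sum over samples of phi / (p(D|theta) p(theta))].
    log_inv_z = np.logaddexp.reduce(-log_vol - log_post_unnorm[inside]) - log(n)
    return -log_inv_z
```

Working in log space avoids overflow when the unnormalized posterior is very small. Because $\phi$ vanishes outside the ellipse, only the enclosed samples contribute to the sum, although the average is still taken over the full chain length.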

SEGUE-DERIVED REDDENINGS

Here, we review a method for calculating stellar reddenings on the basis of SDSS photometry and SEGUE-predicted intrinsic stellar colors. As explained in Schlafly & Finkbeiner (2011), since the extinction in an individual band $X$ is given by

$$ A_X = R_X \, E(B-V) , \tag{B1} $$

colors transform as

$$ E(X-Y) = A_X - A_Y = (R_X - R_Y) \, E(B-V) . \tag{B2} $$

If we have only one color, $X - Y$, we can therefore estimate reddening as

$$ E(B-V) = \frac{E(X-Y)}{R_X - R_Y} . \tag{B3} $$

Our goal is to extend this formula to allow the use of multiple colors, possibly with strong covariance. Since Schlafly & Finkbeiner (2011) only predict the colors of stars, and not their overall magnitudes, we work in color space. Denote the intrinsic stellar colors as $\vec{c}_i$, and the reddened colors as $\vec{c}_r$. In a multidimensional color space, Eq. (B2) becomes

$$ \vec{c}_r - \vec{c}_i = \vec{R} \, E(B-V) , \tag{B4} $$

where $\vec{R}$ has one component per color $X - Y$, given by $R_{XY} \equiv R_X - R_Y$. The estimated intrinsic colors and observed reddened colors are Gaussian random variables, with covariances $\Sigma_i$ and $\Sigma_r$, respectively. Denote the estimated intrinsic colors as $\vec{c}_i'$, and the observed reddened colors as $\vec{c}_r'$. The likelihood of these two quantities taking on a particular set of values is given by

$$ p(\vec{c}_i', \vec{c}_r' \,|\, E(B-V), \vec{c}_r) = \mathcal{N}(\vec{c}_r' \,|\, \vec{c}_r, \Sigma_r) \, \mathcal{N}(\vec{c}_i' \,|\, \vec{c}_i, \Sigma_i) \tag{B5} $$

$$ = \mathcal{N}(\vec{c}_r' \,|\, \vec{c}_r, \Sigma_r) \, \mathcal{N}\!\left(\vec{c}_i' \,|\, \vec{c}_r - \vec{R} \, E(B-V), \Sigma_i\right) . \tag{B6} $$

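In the second step, we have replaced $\vec{c}_i$ using Eq. (B4). Using the symmetry of the Gaussian distribution,

$$ p(\vec{c}_i', \vec{c}_r' \,|\, E(B-V), \vec{c}_r) = \mathcal{N}(\vec{c}_r' \,|\, \vec{c}_r, \Sigma_r) \, \mathcal{N}\!\left(\vec{c}_r - \vec{R} \, E(B-V) \,|\, \vec{c}_i', \Sigma_i\right) \tag{B7} $$

$$ = \mathcal{N}(\vec{c}_r' \,|\, \vec{c}_r, \Sigma_r) \, \mathcal{N}\!\left(\vec{R} \, E(B-V) - \vec{c}_r \,|\, -\vec{c}_i', \Sigma_i\right) . \tag{B8} $$

If we assume a flat prior on $E(B-V)$ and $\vec{c}_r$, then the above is proportional to the posterior probability density $p(E(B-V), \vec{c}_r \,|\, \vec{c}_i', \vec{c}_r')$. We could, in practice, frame the question in terms of the intrinsic colors $\vec{c}_i$, and put priors on them based on our Galactic and stellar model, but we wish to avoid tying our SEGUE-derived reddenings in any way to our Bayesian photometric reddening estimates. Now, integrating over $\vec{c}_r$, we obtain a convolution of two Gaussians, which is itself a Gaussian distribution:

$$ p(E(B-V) \,|\, \vec{c}_i', \vec{c}_r') \propto \int \mathrm{d}\vec{c}_r \, \mathcal{N}(\vec{c}_r' \,|\, \vec{c}_r, \Sigma_r) \, \mathcal{N}\!\left(\vec{R} \, E(B-V) - \vec{c}_r \,|\, -\vec{c}_i', \Sigma_i\right) \tag{B9} $$

$$ = \mathcal{N}\!\left(\vec{R} \, E(B-V) \,|\, \vec{c}_r' - \vec{c}_i', \Sigma_r + \Sigma_i\right) . \tag{B10} $$

The probability density function of $E(B-V)$ is thus a ray taken through a multivariate Gaussian. It can be shown that the above is also Gaussian, with mean and standard deviation given by

$$ \langle E(B-V) \rangle = \frac{(\vec{c}_r' - \vec{c}_i')^T \, (\Sigma_r + \Sigma_i)^{-1} \, \vec{R}}{\vec{R}^T \, (\Sigma_r + \Sigma_i)^{-1} \, \vec{R}} , \tag{B11} $$

$$ \sigma_{E(B-V)} = \left[ \vec{R}^T \, (\Sigma_r + \Sigma_i)^{-1} \, \vec{R} \right]^{-1/2} . \tag{B12} $$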
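As a concrete illustration of Eqs. (B11) and (B12), the following Python sketch computes the mean and standard deviation of $E(B-V)$ from a set of observed and estimated intrinsic colors. The function name `reddening_from_colors` and its argument names are hypothetical, chosen only for this example, and the reddening vector is treated as a user-supplied input rather than taken from any particular extinction law.

```python
import numpy as np


def reddening_from_colors(c_obs, c_int, cov_obs, cov_int, R):
    """Gaussian mean and standard deviation of E(B-V), Eqs. (B11)-(B12).

    c_obs, c_int     : observed reddened and estimated intrinsic colors (length n)
    cov_obs, cov_int : their covariance matrices Sigma_r and Sigma_i (n x n)
    R                : reddening vector, one component per color (length n)
    """
    c_obs = np.atleast_1d(np.asarray(c_obs, dtype=float))
    c_int = np.atleast_1d(np.asarray(c_int, dtype=float))
    R = np.atleast_1d(np.asarray(R, dtype=float))
    cov = (np.atleast_2d(np.asarray(cov_obs, dtype=float))
           + np.atleast_2d(np.asarray(cov_int, dtype=float)))

    # w = (Sigma_r + Sigma_i)^{-1} R, solved without forming the inverse.
    w = np.linalg.solve(cov, R)
    denom = R @ w                       # R^T (Sigma_r + Sigma_i)^{-1} R

    mean = (c_obs - c_int) @ w / denom  # Eq. (B11)
    sigma = denom ** -0.5               # Eq. (B12)
    return mean, sigma
```

With a single color the expressions collapse to the one-color estimate of Eq. (B3), with an uncertainty equal to the combined color error divided by the magnitude of the corresponding component of the reddening vector. With several colors, each color is weighted by the inverse of the combined covariance.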

We plug intrinsic stellar colors appropriate for the SSPP stellar parameters into $\vec{c}_i'$, and use the observed SDSS colors for $\vec{c}_r'$. One final note is that Schlafly & Finkbeiner (2011) used the SSPP stellar types to derive estimates of the mean and covariance of the magnitudes, rather than the colors. We will now show how to obtain the covariance of the colors from the covariance matrix of the magnitudes. Denote the covariance matrix of the magnitudes as $\Sigma_{ij}$, where $i$ and $j$ label passbands. Label the color $m_i - m_j$ as $c_{ij}$. We then call the covariance of the colors $c_{ij}$ and $c_{k\ell}$ $\Sigma'_{ij,k\ell}$. By expanding out

$$ \Sigma'_{ij,k\ell} = \langle c_{ij} \, c_{k\ell} \rangle - \langle c_{ij} \rangle \langle c_{k\ell} \rangle \tag{B13} $$

in terms of magnitudes, one obtains

$$ \Sigma'_{ij,k\ell} = \Sigma_{ik} - \Sigma_{i\ell} - \Sigma_{jk} + \Sigma_{j\ell} . \tag{B14} $$

The common choice of colors is to set the $i$th color to $m_i - m_{i+1}$. Plugging $j = i + 1$ and $\ell = k + 1$ into the above, we find that the covariance of the $i$th color with the $k$th color is given by

$$ \Sigma'_{i,k} = \Sigma_{i,k} - \Sigma_{i,k+1} - \Sigma_{i+1,k} + \Sigma_{i+1,k+1} . \tag{B15} $$
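Eq. (B15) is equivalent to a linear transformation of the magnitude covariance matrix by a differencing matrix, which is convenient to express in code. The Python sketch below assumes adjacent-band colors $m_i - m_{i+1}$. The function name `color_covariance` is hypothetical and used only for illustration.

```python
import numpy as np


def color_covariance(mag_cov):
    """Covariance of adjacent-band colors m_i - m_{i+1}, following Eq. (B15).

    mag_cov : (n, n) covariance matrix of the magnitudes in n passbands
    returns : (n-1, n-1) covariance matrix of the n-1 adjacent-band colors
    """
    mag_cov = np.asarray(mag_cov, dtype=float)
    n = mag_cov.shape[0]

    # Differencing matrix D: row i has +1 in column i and -1 in column i+1,
    # so that colors = D @ magnitudes.
    D = np.eye(n - 1, n) - np.eye(n - 1, n, k=1)

    # Element (i, k) of D Sigma D^T reproduces the four terms of Eq. (B15).
    return D @ mag_cov @ D.T
```

Writing the transformation as a matrix product keeps the result symmetric by construction. It also extends to any other set of colors by changing the rows of the differencing matrix, in line with the general expression of Eq. (B14).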

REFERENCES

Allende Prieto, C., Sivarani, T., Beers, T. C., et al. 2008, AJ, 136, 2070
Annis, J., Soares-Santos, M., Strauss, M. A., et al. 2011, arXiv:1111.6619v2
Bailer-Jones, C. A. L. 2011, MNRAS, 411, 435
Belokurov, V., Zucker, D. B., Evans, N. W., et al. 2006, ApJ, 642, L137
Berry, M., Ivezić, Ž., Sesar, B., et al. 2011, arXiv:1111.4985
Bond, N. A., Ivezić, Ž., Sesar, B., et al. 2010, ApJ, 716, 1
Bovy, J., Rix, H.-W., & Hogg, D. W. 2012a, ApJ, 751, 131
Bovy, J., Rix, H.-W., Hogg, D. W., et al. 2012b, ApJ, 755, 115
Bovy, J., Rix, H.-W., Liu, C., et al. 2012c, ApJ, 753, 148
Bressan, A., Marigo, P., Girardi, L., et al. 2012, 20, 1
Cardelli, J. A., Clayton, G. C., & Mathis, J. S. 1989, ApJ, 345, 245
Chabrier, G. 2001, ApJ, 554, 1274
Drew, J. E., Greimel, R., Irwin, M. J., et al. 2005, MNRAS, 362, 753
Fitzpatrick, E. L. 1999, PASP, 111, 63
Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2012, arXiv:1202.3665
Gelfand, A. E., & Dey, D. K. 1994, Journal of the Royal Statistical Society, 56, 501
Gelman, A., & Rubin, D. B. 1992, Statistical Science, 7, 457
Goodman, J., & Weare, J. 2009, Communications in Applied Mathematics and Computational Science, 5
Gorski, K. M., Hivon, E., Banday, A. J., et al. 2005, ApJ, 622, 759
Hodapp, K. W., Kaiser, N., Aussel, H., et al. 2004, Astronomische Nachrichten, 325, 636
Hogg, D. W., Bovy, J., & Lang, D. 2010, arXiv:1008.4686
Ivezić, Ž., Sesar, B., Jurić, M., et al. 2008, ApJ, 684, 287
Ivezic, Z., Tyson, J. A., Acosta, E., et al. 2008, 34
Jurić, M., Ivezić, Ž., Brooks, A., et al. 2008, ApJ, 673, 864
Kaiser, N., Burgett, W., Chambers, K., et al. 2010, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 7733, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series
Kruschke, J. 2010, Doing Bayesian Data Analysis: A Tutorial Introduction with R (Academic Press)
Lallement, R., Vergely, J.-L., Valette, B., et al. 2013, arXiv:1309.6100
Lallement, R., Welsh, B. Y., Vergely, J. L., Crifo, F., & Sfeir, D. 2003, A&A, 411, 447
Law, D. R., & Majewski, S. R. 2010, ApJ, 714, 229
Lee, Y. S., Beers, T. C., Sivarani, T., et al. 2008a, AJ, 136, 2022
Lee, Y. S., Beers, T. C., Sivarani, T., et al. 2008b, AJ, 136, 2050
Lindegren, L., Perryman, M. A., Bastian, U., et al. 1994, in , 599–608
Magnier, E. 2006, in The Advanced Maui Optical and Space Surveillance Technologies Conference
Magnier, E. 2007, in Astronomical Society of the Pacific Conference Series, Vol. 364, The Future of Photometric, Spectrophotometric and Polarimetric Standardization, ed. C. Sterken, 153
Magnier, E. A., Liu, M., Monet, D. G., & Chambers, K. C. 2008, in IAU Symposium, Vol. 248, IAU Symposium, ed. W. J. Jin, I. Platais, & M. A. C. Perryman, 553–559
Marshall, D., Robin, A., Reylé, C., Schultheis, M., & Picaud, S. 2006, arXiv:astro-ph/0604427
Onaka, P., Tonry, J. L., Isani, S., et al. 2008, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 7014, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series
Perryman, M. A. C., Lindegren, L., Kovalevsky, J., Hoeg, E., et al. 1997, A&A, 323, 49
Robert, C. P., & Wraith, D. 2009, I Can, 12
Robin, A. C., Reylé, C., Derrière, S., & Picaud, S. 2003, A&A, 409, 523
Sale, S. E. 2012, 000, 13
Schlafly, E. F., & Finkbeiner, D. P. 2011, ApJ, 737, 103
Schlafly, E. F., Finkbeiner, D. P., Jurić, M., et al. 2012, ApJ, 756, 158
Schlegel, D. J., Finkbeiner, D. P., & Davis, M. 1998, ApJ, 500, 525
Sesar, B., Jurić, M., & Ivezić, Ž. 2011, ApJ, 731, 4
Simon, J. D., & Geha, M. 2007, ApJ, 670, 313
Stubbs, C. W., Doherty, P., Cramer, C., et al. 2010, ApJS, 191, 376
The Dark Energy Survey Collaboration. 2005, arXiv:astro-ph/0510346
Tonry, J., & Onaka, P. 2009, in Advanced Maui Optical and Space Surveillance Technologies Conference
Tonry, J. L., Stubbs, C. W., Lykke, K. R., et al. 2012, ApJ, 750, 99
Vergely, J.-L., Freire Ferrero, R., Siebert, A., & Valette, B. 2001, A&A, 366, 1016
Warnes, G. 2001, The Normal Kernel Coupler: An adaptive Markov Chain Monte Carlo method for efficiently sampling from multi-modal distributions, Tech. Rep. 395, Department of Statistics, University of Washington
Yanny, B., Rockosi, C., Newberg, H. J., et al. 2009, AJ, 137, 4377
York, D. G., Adelman, J., Anderson, Jr., J. E., et al. 2000, AJ, 120, 1579