arXiv:1502.02828v7 [astro-ph.CO] 19 Aug 2015
Mon. Not. R. Astron. Soc. 000, ??–?? (2015)   Printed 4 July 2018   (MN LaTeX style file v2.2)
© 2015 RAS

Detecting hidden uncertainties in standard candles and clocks via improved hypothesis test

Jiaxin Wang1,2⋆ and Xinhe Meng1,2
1 Department of Physics, Nankai University, Tianjin 300071, China
2 State Key Laboratory of Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Science, Beijing 100190, China

⋆ E-mail:[email protected]

ABSTRACT
Tiny systematic uncertainty caused by cosmological hypotheses is hard to detect, not only because the hypothesis-induced uncertainty is relatively small, but also because present observational errors are relatively large, so that it appears indistinguishable from errors of other sources. We introduce an efficient and sensitive method for detecting the secondary, tiny systematic cosmological-hypothesis-induced errors which hide behind the residuals of chi-square analysis. In this paper, we apply the analysis to the JLA compilation of SN Ia samples and find a slight but noticeable evolutional feature with respect to the standard cosmological model under present observations, while the chi-square analysis of the cosmic chronometer data-set presents no noticeable similar feature. Meanwhile, a similar feature may be covered up by various observational uncertainties, which are hard to control when combining independent observations. Our method can be useful when various large independent observational samples with high observational precision are available, since the cosmological hypothesis-induced error appears unbiasedly in all related data-sets.

Key words: hypothesis test – cosmological parameter – systematic uncertainty

1 INTRODUCTION

With deeper and more precise astrophysical observations coming around the corner, cosmologists are confident in discovering the nature of the dark sections of the universe (i.e., dark matter and dark energy). More specifically speaking, future dark energy observations will enable us to entertain various non-standard hypotheses and to carry out more careful review of theoretical estimation and better parameter estimation.

It is commonly accepted that the standard cosmological model ΛCDM, although perfect enough to fit most astrophysical observations well, contains extremely unknown components in dark energy. The key question from the theoretical perspective is what the imperfectness of the dark sections of standard cosmology is, and how to find deviation features with respect to the various "standard" predictions in observations.

Researchers are always concerned about removing all possible sources of significant cosmological systematic parameter errors before estimation, which is critical for providing more robust and convincing chi-square results. We also notice that chi-square analysis is naturally insensitive to some secondary, almost un-noticeable systematic errors, which are usually neglected with respect to the dominant observational uncertainties in systematic-error removing.

Since chi-square analysis relies on a weighted summation of squares of the residuals, a possible evolutional feature in the residuals, which represents tiny systematic uncertainties hidden in the data, may be ignored in the analysis. This may result in less significant but systematic errors which, although far from blinding us, affect the accuracy of estimation and keep us from knowing the truth more precisely. Since such errors are seldom concerned or detected in previous researches, we name them hidden uncertainties. Sources of hidden errors may come from observation, data selection, theoretical analysis or unconcerned hypotheses.

It is hard to find clear evidence for a non-standard cosmological hypothesis, i.e., possibly dark energy deviating from a constant term, in the observed data or the expansion history. This obstacle originates from the fact that the systematic uncertainty caused by an imperfect cosmological hypothesis is much smaller than the dominant systematic errors in observed data samples. Previous researches suggest that new physics may lie in the hidden uncertainties (Benitez-Herrera et al. 2013; Zhao et al. 2012; Zhang & Ma 2013). This means there may exist hypothesis-induced indications in the uncertainties. With the present quality and quantity of observations, it becomes possible to search for them. Then the following task is about how to find the signal of hidden uncertainties against observational noises.

Precisely speaking, the signal which we are looking for performs as a specific form of systematic uncertainty while we estimate ΛCDM model parameters with observed samples. We separate the task into two steps: the first step is about introducing a diagnostic method that suits various observations and theoretical predictions; and the second step concerns finding a similar feature shared by various observations in the diagnostic results. The following section serves as a brief review of the analysing schemes used in previous hypothesis tests. Then we move on to try detecting hidden systematic uncertainties with a diagnostic function.

2 HYPOTHESIS TEST

The basis of Bayes theorem lies in its explanation of probability, which reads

    P(A_i|B) = P(B|A_i) P(A_i) / Σ_i P(B|A_i) P(A_i),    (1)

where A can be regarded as a set of hypotheses with each specific model A_i, while B represents the observed data. P(A_i|B) means the probability of acquiring model A_i given data-set B, which is usually called the posterior probability; while P(B|A_i), the likelihood function concerned in χ² analysis,

    P(B|A_i) = N · exp{−χ²(B|A_i)/2},    (2)

represents the possibility of acquiring data-set B with given model A_i. The priori, P(A_i), does not affect the estimation of model parameters too much, and thus is often given by experience.

Decision theory based on the Bayesian theorem can not perform at full power in comparing different cosmological models, not only because a convincing form of the priori can not be given precisely, but also due to the fact that the loss function, which is another critical part in making a decision, is often neglected or given by experience. But without explicit forms of both priori and loss function (actually we just neglect their effects), we can still try to compare different hypotheses with the Bayesian factor

    K = P(B|A_i) / P(B|A_j),    (3)

which in practice shares the same idea with the Neyman-Pearson lemma.

The Neyman-Pearson lemma defines that the likelihood-ratio test rejects hypothesis A_1 in favour of A_2 when

    Λ(B) = P(B|A_1, θ) / P(B|A_2, θ) ≤ η,    (4)

where P is the likelihood function we talked about in Eq. (2), θ is a common parameter of both hypotheses, and η is a kind of threshold factor for discussing significance.

Principally, we can choose the best theory only among all existing ones with the Bayesian factor. But the consequent problem is that we are not able to probe tiny systematic errors hidden behind observational errors, since chi-square analysis is not highly sensitive to tiny evolutional structure in residuals. This would make a theoretical breakthrough hard to find in phenomenological researches.
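As a concrete illustration of Eqs. (2)–(4), the likelihood ratio of two hypotheses follows directly from their chi-square values, since the normalization N cancels. The sketch below uses made-up data points and two toy models; none of the numbers are from the paper.

```python
import math

# Toy data: observed values with Gaussian errors (hypothetical numbers,
# not from the paper).
z_obs = [0.1, 0.5, 1.0]
y_obs = [1.05, 1.62, 2.85]
sigma = [0.10, 0.15, 0.20]

def chi2(model, params):
    """Weighted sum of squared residuals, as used in Eq. (2)."""
    return sum(((y - model(z, params)) / s) ** 2
               for z, y, s in zip(z_obs, y_obs, sigma))

# Two competing toy hypotheses A1, A2 (illustrative forms only).
model_1 = lambda z, p: p * (1.0 + z)          # linear growth
model_2 = lambda z, p: p * (1.0 + z) ** 1.5   # steeper growth

# Likelihood ratio of Eq. (4) with N cancelling:
# Lambda = exp(-(chi2_1 - chi2_2)/2); A1 is rejected in favour of A2
# when Lambda <= eta.
lam = math.exp(-(chi2(model_1, 1.0) - chi2(model_2, 1.0)) / 2.0)
print("likelihood ratio:", lam)
```

Here Λ much smaller than 1 would favour A_2 over A_1 once a threshold η is chosen; with real data the χ² would of course be built from the full covariance matrix.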
Hypothesis testing that follows Ronald Fisher's principle seems to be a more direct approach, although Fisher's approach has been accommodated in the Neyman-Pearson test scheme (James 2006). But we would like to conduct the null hypothesis test as a classical significance test in Fisher's scheme.

In a classical significance test, we do not talk about the possibility that the null hypothesis can be proved, even if it is a truth (James 2006). The probability that a null hypothesis is not true can be expressed by the significance level of the test. The classical significance test does not have to be accompanied by alternative hypotheses, which is the most different part between Fisher's approach and the Neyman-Pearson lemma.

Imagine a function X only related to observation, whose value lies in W, where only part of W can be predicted from a null hypothesis H_n. We define the value of X predicted from the null hypothesis to lie in w; thus the probability that the null hypothesis H_n is not true reads

    P(X ∈ W − w | H_n) = α,    (5)

where α is the level of significance for the falsification of the null hypothesis H_n.

We can benefit from the null hypothesis test as it offers a standard diagnostic criterion for conveniently assessing various models, since every model is presumed to be rejected equally at first.

A classical hypothesis test, or more specifically speaking in this paper, the null hypothesis test, often requires precise values of the parameters of the hypothesis in order to perform at full power. So the problem is, the standard cosmological model does not predict precise values of most of the parameters, such as the dimensionless Hubble constant h and the energy density of dust matter Ω_m h². Considering that, previous researchers focused on null hypotheses like "The constant parameters are not dynamical.", in order to use the hypothesis test method in testing standard cosmology. Those null hypotheses are all logically equivalent to the standard cosmological model, only different in their expressions of diagnostic functions. The diagnostic function is actually acting as X in Eq. (5). Focusing our discussion on flat space and low redshift observations, the most popular diagnostic function reads

    D = [H(z_i)² − H(z_j)²] / [(1 + z_i)³ − (1 + z_j)³],    (6)

where H(z) represents the Hubble parameter at observed redshifts z_i and z_j. This function is also named the two-point diagnostic function (Shafieloo et al. 2012), since it associates any two points in an observational data-set at different redshifts. The probability that the standard cosmological model is not true then reads

    P(D ≠ const. | ΛCDM) = 1,    (7)
    P(D = const. | ΛCDM) = 0,    (8)

where D represents a diagnostic function. This is an ideal but not practical rule (Sahni, Shafieloo & Starobinsky 2008), which says the standard cosmological model is falsified if the values of the diagnostic function deviate from one constant at any redshift.

According to the fact that realistic observations come with observational errors, a modified rule (Seikel & Clarkson 2013) has been introduced, which says the probability of falsification reads

    P(D ≠ const. | ΛCDM) = [Len(B; D ≠ const.) / Len(B)] · P(D),    (9)

where Len(B) means the whole redshift range of the observational data-set B, while Len(B; D ≠ const.) means the redshift range where the diagnostic function deviates from a constant. Since the diagnostic function is reconstructed from data with observational errors, its value at each specific redshift is expressed by a random variable with upper and lower limits. The limits are characterized by a level of confidence; for example, when P(D) = 0.68, the limits are chosen as the 1σ deviation limits from the mean value. Thus the modified rule can be more applicable and reasonable.
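As a quick numerical check of Eq. (6), the sketch below evaluates the two-point diagnostic on noiseless flat-ΛCDM input (hypothetical fiducial values H_0 = 70, Ω_m = 0.3, not fitted ones): every pair of redshifts must return the same constant H_0²Ω_m, so any spread in the printed values would signal a departure from the null hypothesis.

```python
from itertools import combinations

H0, Om = 70.0, 0.3  # hypothetical fiducial values, not fitted ones

def H_lcdm(z):
    """Hubble parameter of flat ΛCDM: H² = H0² [Ωm(1+z)³ + 1 − Ωm]."""
    return H0 * (Om * (1 + z) ** 3 + (1 - Om)) ** 0.5

def two_point_D(zi, zj, H):
    """Two-point diagnostic of Eq. (6)."""
    return (H(zi) ** 2 - H(zj) ** 2) / ((1 + zi) ** 3 - (1 + zj) ** 3)

redshifts = [0.1, 0.4, 0.9, 1.3]
values = [two_point_D(zi, zj, H_lcdm)
          for zi, zj in combinations(redshifts, 2)]
# Each pair gives H0²·Ωm = 4900·0.3 = 1470 (up to float rounding) for
# noiseless ΛCDM input.
print(values)
```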
3 AUTOCORRELATION ANALYSIS

We find that the modified rule is still vulnerable to uncertainties occurring in observations; besides, the signal of deviation between hypothesis and observation, even if it can be observed with the above rules, can not be clearly explained or quantified. Imagine we have observed some level of deviation according to rule Eq. (9); we still can not conclude that the deviation is due to our tested theory having some level of deviation from the truth, since the deviation can be an illusion caused by systematic and/or random uncertainties in observation and data analysis.

There exists a more critical question for the null test in practice: Are we expecting to find strong evidence for ruling out some model with a null test while the same level of evidence has not been spotted via chi-square analysis? We do not attempt to address this question in this article, but we keep in mind that the null hypothesis significance test may be assigned to a slightly different task, discussed below.

In order to improve the null hypothesis test, we introduce autocorrelated analysis. The basic idea of our analysis is simple: Deviation between a theoretical hypothesis and the truth can be manifested by observational inference in the evolutional behavior of a constant parameter, where the observational deviation against the theoretical prediction acts like a systematic error which unbiasedly appears in every related observational sample.

For revealing that particular systematic error, we define an autocorrelation function C(θ_z) of the diagnostic function as

    C(θ_z) = ⟨I(θ_z)⟩_B,    (10)

where θ_z = (z_k − z_l), and ⟨ ⟩_B represents the average process in data-set B with respect to θ_z. The function I is defined as

    I(θ_z) = D(z_k) − D(z_l),    (11)

where D(z_k) represents the available values of the diagnostic which are related to the observed quantity-set q at redshift z_k. The value of D_z can be expressed as

    D_z(q_z) = D_proj + δ_r + δ_s,    (12)

where D_proj means the true value of D that would be projected by a precise observation without any error. Because the observation suffers from random and systematic errors, their effects are added to the projected value as δ_r and δ_s.

In practice we assume the probability distribution of the observed parameter q_z is gaussian; for example, the possibility of finding the ith component q_{z,i} of q_z at redshift z equal to p reads

    P(q_{z,i} = p) = P_0 exp{−(p − q̄_{i,z})² / σ²_{i,z}},    (13)

where {q̄_{i,z}, σ_{i,z}} is the set of observational outputs of quantity q_i at redshift z.

Although we are not able to express the probability distribution function of D_z in a specific form, we can estimate its statistical expectation

    E[D_z] = D_proj + E[δ_r] + E[δ_s],    (14)

where we presume D_proj should depend on redshift when the null hypothesis is not true. The random error in a real observation also depends on redshift, i.e., the random observational error is expected to grow when the redshift goes larger, but we can average observational results at roughly the same redshift to minimize the influence of random error.

We can also express E[D_z] with the probability density function g(D) of D_z as

    E[D_z] = ∫ D g(D) dD = ∫ D (f_proj + f_r + f_s) dD,    (15)

where f_r and f_s are related to z, while f_proj should be a delta function independent of redshift only under the null hypothesis. Eq. (10) can be written as the conditional expectation

    C(θ_z) = E_Y[D_{z_k} − D_{z_l}],    (16)

where Y means (z_k − z_l) = θ_z, as the condition of averaging. We can regroup the terms in C(θ_z) according to their dependency on θ_z like

    C(θ_z) = A + B(θ_z),    (17)

which means A contains the redshift independent terms. It is easy to calculate that the random error term reads

    ∫_Y D (f_{r,k} − f_{r,l}) dD,    (18)

where the offset character of the random error should result in a zero value of this term, since theoretically we have ∫_Y D f_{r,k} dD = 0. A also contains

    D_{p,k} − D_{p,l},    (19)

under the null hypothesis (D_{p,k} represents the value of D_proj of E[D_{z_k}]), but it is a θ_z dependent variable if the null hypothesis is violated, and thus acts like a (possibly secondary) systematic error as we mentioned. This hypothesis-induced systematic error is difficult to separate from systematic uncertainties induced by other sources unless those have been carefully evaluated and corrected.

But we can tell the difference between the cosmological hypothesis-induced systematic uncertainty and other systematic uncertainties when applying the same diagnostic quantity to different observations, since only the cosmological hypothesis-induced systematic error is shared by every data-set.

The offset between random error terms requires enough samples under the Y condition, while in practice we change that critical condition into (z_k − z_l) = θ_z ± δθ, which means the average operations are taken in bins. Notice that for a sample-set with finite redshift range, a change in δθ shall also change θ_z accordingly; this can be observed in Fig. 2.

Different from previous treatments, we do not apply this scheme directly for detecting whether ΛCDM is inconsistent with current observations. In the next section, we assign it a more tangible mission: detecting secondary systematic error after chi-square analysis.

4 TINY BIAS AFTER FITTING

Here we prove that the autocorrelation analysis can serve as an efficient diagnostic method for detecting tiny bias after fitting cosmological parameters with chi-square analysis. We take wCDM as a general cosmological model, which at low redshift reads

    h(z) = h [Ω_m(1+z)³ + (1 − Ω_m)(1+z)^{3δ}]^{1/2},    (20)

where h(z) = H(z)/(100 km·s⁻¹·Mpc⁻¹) and δ = 1 + w; w represents the equation-of-state parameter of dark energy. The standard ΛCDM model is recovered when w = −1. We presume the true Universe follows the wCDM model with parameters {Ω_m^∗, h^∗, δ^∗}. Notice that this assumption is a general setting for convenience which will not affect our proof.
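Before turning to the wCDM proof, the binned estimate of C(θ_z) in Eqs. (10), (11) and (16) can be sketched numerically. The diagnostic below is a toy function whose 0.02 z drift stands in for a hypothesis-induced bias; all numbers are illustrative and not from the paper.

```python
import random

# Binned autocorrelation C(theta_z) of Eqs. (10)-(11) for a toy diagnostic.
# The 0.02*z drift plays the role of a hypothesis-induced bias; the Gaussian
# scatter plays the role of random observational error.
random.seed(1)

zs = [random.uniform(0.0, 1.3) for _ in range(500)]
Ds = [1.0 + 0.02 * z + random.gauss(0.0, 0.01) for z in zs]

def C_binned(theta, half_width=0.025):
    """Average of I = D(z_k) - D(z_l) over pairs with (z_k - z_l) = theta +/- half_width."""
    diffs = [Dk - Dl
             for zk, Dk in zip(zs, Ds)
             for zl, Dl in zip(zs, Ds)
             if abs((zk - zl) - theta) < half_width]
    return sum(diffs) / len(diffs)

for theta in (0.2, 0.6, 1.0):
    print(round(theta, 1), round(C_binned(theta), 4))
```

A flat spectrum would indicate consistency with the null hypothesis; the rising trend here recovers the injected drift even though the drift is comparable to the per-point scatter.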
In practice we build a diagnostic function which focuses on the Hubble constant. In the simplest case, where the Hubble parameters at various redshifts can be measured, the corresponding diagnostic function reads

    D(z) = −5 log₁₀ { h_obs(z) / [Ω_m(1+z)³ + (1 − Ω_m)]^{1/2} },    (21)

which contains observed quantities along with the fiducial parameter Ω_m. This is a little different from what we confronted in section 3, where the diagnostic function only contains observed quantities. In practice it is hard to avoid introducing extra fiducial parameters, especially when using more precise observations, i.e., the SN Ia observation. So generally we re-express the diagnostic function as D(q, γ), where γ means the extra parameters. In Eq. (21), we see γ = Ω_m. The correlation spectrum can be expressed as

    C(θ_z) = E_Y[ 5 log₁₀(h_obs(z_k)/h_obs(z_l)) − 5 log₁₀(E(z_k; γ)/E(z_l; γ)) ],    (22)

where E(z; γ) = [Ω_m(1+z)³ + 1 − Ω_m]^{1/2}. For an unbiased data fitting with the ΛCDM model, the true Universe h^∗(z) must be characterized by {Ω_m^∗ = Ω_m^f, h^∗ = h^f, δ^∗ = 0}, where the superscript f marks the best-fit values. We set γ with the best-fit values {Ω_m^f, h^f} from the chi-square analysis; then the autocorrelation function C(θ_z) must yield zero values at each θ_z, on the condition that every source of bias has been cleanly corrected before and during the chi-square analysis and we had presumed a right cosmological model.

Deviation from C(θ_z) = 0 may result from bias before and during the chi-square analysis, and may also come from the cosmological model we assumed if it deviates from the truth. Random uncertainties may also affect the diagnostic result, but with sufficient samples in each averaged bin we can eliminate that influence, and the robustness of the result against random errors can be evaluated by using different widths of the average bin.

The essence of this analysis is finding evolutional structure in the residuals, with respect to the best-fitted prediction of the chi-square analysis, after the dominant systematic uncertainties have been corrected.

5 CONFRONTING OBSERVATION

In this section, we apply the correlation analysis to type Ia supernova (SN Ia) and cosmic chronometer (CC) observations, which are useful and carefully analysed low redshift observations for testing the standard (ΛCDM) cosmological model.

5.1 diagnostic function for SN Ia

In order to use SN Ia data, we start with the relation between the (dimensionless) luminosity distance and redshift,

    d_L(z) = (1 + z) ∫₀^z dz′/E(z′),    (23)

where E(z′), defined by dividing the Hubble parameter H(z) by the Hubble constant H_0, is the dimensionless Hubble parameter. Practically, the observed information is represented by the distance module µ, which reads

    µ(z) = 5 log₁₀(d_L/h) + 5 log₁₀(3 · 10⁸),    (24)

where h is the dimensionless Hubble constant, defined as h = H_0/(100 km·s⁻¹·Mpc⁻¹). For simplicity and accuracy, we define µ_0 = 5 log₁₀(3 · 10⁸).

In terms of the dimensionless distance, we have ignored the radiation and curvature terms of the standard cosmological model, since their contributions at such low redshift are negligible. So we have

    E(z) = [Ω_m(1+z)³ + (1 − Ω_m)]^{1/2}.    (25)

From the observational perspective, the distance module µ can be obtained through the fitted light-curve parameters of the SALT-II light-curve model (Mosher et al. 2014) of the SN Ia observation, which reads

    µ_obs = m_B − (M_B^I + ∆M_B^I) + α X₁ − β C,    (26)

where m_B represents the apparent peak magnitude of luminosity in the B-band, with M_B^I as its corresponding intrinsic magnitude of luminosity; ∆M_B^I means the stellar-mass related correction, which is zero when M_stellar < 10¹⁰ M_⊙; X₁ and C are the stretch parameter and color parameter respectively, which together with m_B describe the properties of the light-curve. α and β are their constant coefficients.

Considering both the theoretical and observational expressions of the distance module, the diagnostic function is designed as

    D = (m_B − ∆M_B^I + α X₁ − β C) − 5 log₁₀ d_{L,fit}.    (27)

The diagnostic quantity of this function is M_B^I − 5 log₁₀ h, where the intrinsic magnitude of the supernova has also been assumed to be a constant. The distance term d_{L,fit} represents the luminosity distance described by Eq. (23) obtained with the best-fitted Ω_m value.
5.2 test with JLA compilation

The JLA (Joint Light-curve Analysis, jointly conducted by the SDSS and SNLS collaborations) compilation (Betoule et al. 2014) is adopted as our SN Ia sample. The systematic uncertainties from the SALT-II light-curve analysis have been thoroughly evaluated with calculation and simulation in previous researches (Guy et al. 2010; Mosher et al. 2014), and careful systematic uncertainty analysis had been adopted in the original report (Betoule et al. 2014). The best-fit values of cosmological parameters are constrained after bias corrections in the light-curve parameters included in the released data, which is adopted in our analysis.

For evaluating the values of the diagnostic function at various redshifts, we adopt the JLA's original report, where α = 0.141, β = 3.101, ∆M_B^I = −0.07, which are best-fit parameter values with systematic uncertainties and bias corrections included in the chi-square analysis. The joint analysis also conducts parameter constraints from the combination of Planck2013 (Planck Collab. 2013) and JLA, which yields a slightly higher matter density Ω_m = 0.305 ± 0.010 than from JLA alone, which gives Ω_m = 0.295 ± 0.034, while other parameters suffer smaller differences. To avoid bringing possibly extra unknown tiny uncertainty from combining two independent data-sets (Planck Collab. 2015), we adopt the best-fitted Ω_m = 0.295 according to the JLA sample only.

We also generated a simple set of mock observational data, which also contains 740 samples, for comparing their results. The mock data-set is produced by the procedure described below (Karpenka 2015):

1. The redshift is drawn independently from z_i ∼ U(0, 1.3), where U(a, b) denotes a uniform distribution in the range [a, b].
2. The predicted distance module at redshift z_i, µ_i(z_i, Ω_m = 0.3, h = 0.7), is calculated using Eq. (24).
3. The hidden variables M_i (intrinsic magnitude), X_{1,i} (stretch parameter) and C_i (color parameter) are drawn from the respective distributions M_i ∼ N(−19.03, 0.02), X_{1,i} ∼ U(−2.862, 2.337) and C_i ∼ U(−0.250, 0.260), where N(µ, σ²) denotes a normal (Gaussian) distribution with mean µ and variance σ².
4. The value of m_{B,i} is calculated using the Phillips relation m_{B,i} = µ(z_i, 0.3, 0.7) + M_i − αX_{1,i} + βC_i.
5. The simulated observational data are drawn independently from the distributions ẑ_i ∼ N(z_i, σ_z²), m̂_{B,i} ∼ N(m_{B,i}, σ²_{m,i}), X̂_{1,i} ∼ N(X_{1,i}, σ²_{X,i}) and Ĉ_i ∼ N(C_i, σ²_{C,i}), where {σ_{m,i}, σ_{X,i}, σ_{C,i}} are drawn randomly from the uniform distributions σ_{m,i} ∼ U(0.09, 0.175), σ_{X,i} ∼ U(0.018, 1.641) and σ_{C,i} ∼ U(0.012, 0.107). All upper and lower limits of the uniform distributions mentioned in each step are obtained from the real JLA data-set.

We do not re-constrain the fiducial cosmology, which has {Ω_m = 0.3, h = 0.7, α = 0.12, β = 3.2} according to the mock data, but reproduce the diagnostic function C(θ_z) with the fiducial parameter-set {Ω_m, α, β}, since we do not doubt whether the chi-square fitting scheme in the real SN Ia analysis may introduce systematic bias.

The simulation process adopted here is far from realistic, but it is sufficient for comparing the results of the correlation analysis between simulated and real observed data-sets. The point is that we are trying to estimate the influence of random error and finite sample number in blurring signals in our analysis. If the result from real data lies within the region of the simulated results, then we conclude there exist no noticeable hidden systematic uncertainties. Thus our focus here is about evaluating the blurring range (as shown by the black points with error bars in Fig. 1) from random error rather than conducting extremely realistic simulations of observation.

Notice that in step 5 we are concerned with uncertainties in the light-curve parameters, which may lead to un-expected systematic-like uncertainties in the final result since the observed sample number is finite. In order to include various possibilities, we generate a general mock data-set (displayed in Fig. 1) in which the mean values of the light-curve parameters are set by the true values, and extend the errors to the maximum 2σ level as σ_{m,i} = 0.35, σ_{C,i} = 0.214 and σ_{X,i} = 3.282. Systematic-like fluctuations in each correlation spectrum from a single mock data-set can be well included by the 2σ evaluation uncertainties of this general data-set, as shown in Fig. 1.

The means and standard deviations of the values of C(θ_z) are estimated by the following process: First, for each SN Ia sample at redshift z_k, we estimate the distribution of the value of the diagnostic function and characterize it by its mean and standard deviation. Then, for each pair of SN Ia samples, the distribution of the value of Eq. (11) can be estimated. We also characterize each I with its mean and standard deviation. In order to realize the average condition Y as we previously mentioned, the mean value and standard deviation of C in each θ_z-bin are estimated through unconstrained averaging with the standard weighted least-squares method, which reads

    C ± δC = Σ_i w_i Ī_i / Σ_i w_i ± (Σ_i w_i)^{−1/2},    (28)

where the weight w_i is

    w_i = 1/(δI_i)².    (29)

C and δC stand for the mean value and standard deviation of the value of C(θ_z) in each bin, while Ī_i represents each mean value in the same bin, with its weight calculated from the standard deviation. Notice that we estimate D(z) in bins directly before calculating C(θ_z), through the same average method, which is displayed as the bottom panel in Fig. 4.

The autocorrelation spectrums from the JLA and mock data-sets are displayed in Fig. 1, where we conduct the expectation operation E_Y[ ] in bins, which means Y represents |z_k − z_l| = θ_z ± 0.025. The robustness of taking different binning widths is shown in Fig. 2; we also tried randomly dropping one third of the samples in the full data-set, but the result is negligibly affected.

Since the mock data-set is free from systematic uncertainties, its spectrum remains relatively horizontal, as we expected. We also checked the final error in the reconstructed value of D; the mock data provide an average final standard deviation of 0.178, which is exactly the same as that from the real data.

The real observational result shows a noticeable deviation from the horizontal line; we think that pattern results from either unconcerned tiny systematic uncertainties in the data or the cosmological hypothesis.

We also find that replacing the standard cosmology by a wCDM model with the EoS parameter of dark energy lower
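Steps 1–5 above can be sketched as follows. The distribution limits are those quoted in the text; σ_z is not given in this excerpt, so a small illustrative value (0.001) is assumed, and the helper names are ours.

```python
import math
import random

random.seed(42)
alpha, beta = 0.12, 3.2      # fiducial light-curve coefficients (step 4)
mu0 = 5 * math.log10(3e8)    # constant of Eq. (24)

def mu(z, Om=0.3, h=0.7, steps=500):
    """Distance module of Eq. (24) with Eq. (23)'s dimensionless d_L."""
    dz = z / steps
    integral = sum(dz / math.sqrt(Om * (1 + (i + 0.5) * dz) ** 3 + (1 - Om))
                   for i in range(steps))
    return 5 * math.log10((1 + z) * integral / h) + mu0

mock = []
for _ in range(740):
    z = random.uniform(0.0, 1.3)                      # step 1
    M = random.gauss(-19.03, math.sqrt(0.02))         # step 3 (0.02 is a variance)
    X1 = random.uniform(-2.862, 2.337)
    C = random.uniform(-0.250, 0.260)
    mB = mu(z) + M - alpha * X1 + beta * C            # steps 2 and 4
    sm = random.uniform(0.09, 0.175)                  # step 5 scatters
    sX = random.uniform(0.018, 1.641)
    sC = random.uniform(0.012, 0.107)
    mock.append((random.gauss(z, 0.001),              # assumed sigma_z = 0.001
                 random.gauss(mB, sm),
                 random.gauss(X1, sX),
                 random.gauss(C, sC)))

print(len(mock), "mock supernovae")
```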
[Figure: only plot residue survives the extraction at this point — axis tick values and a "sample number" axis label; the figure content itself is not recoverable.]