Meta-Analysis

Meta-Analysis

Previous’ lecture I Single variant association I Use genome-wide SNPs to account for confounding (population substructure) I Estimation of effect size and winner’s curse Meta-Analysis Today’s outline I P-value based methods. I Fixed effect model. I Meta vs. pooled analysis. I Random effects model. I Meta vs. pooled analysis I New random effects analysis Meta-Analysis I Single study is under-powered because the effect of common variant is very modest (OR ≤ 1.4) I Meta-analysis is an effective way to combine data from multiple independent studies Meta-Analysis Methods I P-value based I Regression coefficient based I Fixed effects model I Random effects model P-value Based I Conduct meta-analysis using p-values I Simple and widely used I Fisher and Z -score based methods P-value Based I K studies K X 2 TFisher = −2 log(pk ) ∼ χ2K k=1 I Simple and works well I Direction of effect is not considered P-value Based I Stouffer’s Z -score (one-sided right-tailed) Zk = Φ(1 − pk ) I Φ is the standard cumulative normal distribution function PK Z Z = kp=1 k K I Z follows the standard normal distribution. Fisher versus Z -score I These two are not perfectly linear, but they follow a highly linear relationship over the range of Z values most observed (Z ≥ 1). P-value Based I Z-score can incorporate direction of effects and weighting scheme I β : K effectp for study k I wk = nk Zk = Φ(1 − pk =2) · sign(βk ) PK k=1 wk Zk Z = q ∼ N(0, 1) PK 2 k=1 wk P-value Based I Easy to use I Z-score is perhaps more popular as it works well to find a consistent signal from studies I Cannot estimate the effect size Regression Coefficient Based I For each study k = 1, ... , K gfE(Yik )g = Xik αk + Gik βk I Estimate βk and its standard error, (βbk , σbk ) 2 βbk ∼ N(βk , σk ) Fixed Effect Model I Assume β1 = ··· = βK I Effect size estimation K P w βb β = k=1 k k b PK k=1 wk ! p n P w 2σ2 (β − β) ! , k k k n b N 0 P 2 ( k wk ) 1 I wk = gives the minimal variance of βb var(βbk ) Fixed Effect Model I Assumes all studies in the analysis have the same effect I Each study can be considered as a random sample drawn from the population with true parameter value β. I There is between-study heterogeneity I Different definitions of phenotypes. I Effect size may be higher (or lower) in certain subgroups (e.g., age, sex). Assess Heterogeneity I Cochran’s Q test K X 2 Q = wk (βbk − βb) k=1 I Q should be large if there is heterogeneity. I Under the null hypothesis of no heterogeneity 2 Q ∼ χK −1 I Under powered when there are fewer studies. Assess Heterogeneity I Measure the % of total variance explained by the between-study heterogeneity (Higgins and Thompson 2002) Q − (K − 1) I2 = × 100% Q I > 50% indicates large heterogeneity I Intuitive interpretation, simple to calculate, and can be accompanied by an uncertainty interval An Example Meta- vs. Pooled-Analysis I Meta analysis: Combining summary statistics (e.g., βbk ) of studies I Pooled analysis: Combining original or individual-level of all studies Relative Efficiency I For k = 1, ··· , K studies T gfE(Yki )g = αk + β Xki I β is common to all K studies; αk is specific to the kth study I The maximum likelihood estimator (MLE) βbk for the kth study by maximizing n Yk L = sup f (Y , X ; β, α ) k αk ki ki k i=1 Relative Efficiency I βe is MLE of β by maximizing K n Y Yk L = sup f (Y , X ; β, α ) fα1,··· ,αK g ki ki k k=1 i=1 K n Y Yk = sup f (Y , X ; β, α ) αk ki ki k k=1 i=1 K Y = Lk k=1 Relative Efficiency I Information K K @2 X @2 X I(β) = log L = log L = I (β) @β2 @β2 k k k=1 k=1 I Recall 1 1 var(β) = = b PK 1 PK Ik (β) k=1 var(βk ) k=1 1 I var(βe) = I(β) I Using summary statistics has the same asymptotic efficiency as using original data, if β is the only common parameter across studies. Relative Efficiency I Common nuisance parameters, say γ. IK ββ IK βγ Iββ Iβγ Ik = I = IK @β IK γγ Iγβ Iγγ ( K )−1 X −1 var(βb) = (Ikββ − IkβγIkγγIkγβ) k=1 K X −1 −1 = (Iββ − IkβγIkγγIkγβ) k=1 −1 −1 ≥ (Iββ − IβγIγγ Iγβ) = var(βe) −1 I Equality holds if and only of var(βk ) cov(βbk , γbk ) are the same among the K studies. Random Effects Model I Under the fixed effect model, β1 = ··· = βK = β I The random effects model βk = β + ξk (k = 1, ··· , K ) 2 I ξk ∼ N(0, τ ) Random Effects Model 2 I Estimation of τ I DerSimonian & Laird (1986) method-of-moments estimator Q − (K − 1) τ 2 = b P −1 P −2 P −1 VK − VK = VK I Vk = var(βbk jβk ) 2 I var(βbk ) = Vk + τ I Estimate β with inverse variance estimator wbk = varc (βbk ) K P w βb β = k=1 bk k b PK k=1 wbk 1 SE(βb) = pP k wbk Random Effects Model I Fixed effect model is more powerful, but ignores the heterogeneity between studies. I Random effects model is probably more robust, but is under powered. The confidence intervals have poor coverage for small and moderate K . Meta- vs Pooled-Analysis I The maximum likelihood estimator βe from pooled data can be obtained by maximizing K Z 2 ! Y 1 ξk fk (Yki jXki , β + ξk )p exp − dξk 2 2τ 2 k=1 2πτ I Challenging to establish the theoretical properties of βb and βe under the random effects model because nk K . Asymptotic Distributions (Zeng and Lin 2015) I Assumptions: I For k = 1, ··· , K , nk = πk n for some constant πk within a compact interval 2 (0, 1). 2 1 2 2 I τ = n σ , σ is a constant. The between-study variability is comparable to the within-study variability. I Asymptotic distribution of MLE βe p K ! K 1=2 X πk X πk n(βe − β0) !d Zk vk + πk A vk + πk A k=1 k=1 2 I nτe !d Ae; nK Vbk ! vk 2 I Zk are independently distributed ∼ N(0, vk + πk σ0). I A is a complicated form that involves fZk g. I It is a mixture of normal random variables with the mixing probabilities both being random and correlated with the normal random variables. Asymptotic Distributions I Asymptotic distribution of βb p K K 1=2 X πk −1 X πk Zk n(βb − β0) −!d ( ) k=1 vk + πk Ab k=1 vk + πk Ab 2 I nτb !d Ab −1=2 I As n, K ! 1, Kn ! 0, var(βe) ≥ var(βb), i.e., the weighted average estimator βb is at least as efficient as the MLE I When K = 100, 200, 400, the empirical relative efficiency is 1.037, 1.083, and 1.171. I Perform statistical inference for βb and βe based on asymptotic distributions using resampling techniques I Or simpler, use the profile likelihood to construct 95% confidence intervals (Hardy and Thompson, 1996) 95% Coverage Probabilities I Solid: Zeng & Lin; Dashed: DerSimonian-Laird; dotted Jackson-Bowden resampling method; dot-dash: Hardy-Thompson profile method. Left: K=10 studies; Right: K=20 Testing with random effects model I Random effects (RE) model gives less significant p values than the fixed effects model when the variants show varying effect sizes between studies I Ironic because RE is designed specifically for the case in which there is heterogeneity I All associations identified RE are usually identified by the fixed effect model I Causal variants showing high between-study heterogeneity might not be discovered by either method Revisit the Traditional RE I First step: estimate the effect size and CI by taking heterogeneity into account. I Second step: normalize β=b SE(βb) and translate it into p-value. I Effectively, RE assumes heterogeneity under the null hypothesis, i.e., 2 2 βbk ∼ N(0, σ + τ ). I There should not be heterogeneity under the null because β1 = ··· = βK = 0. New RE 2 I New null hypothesis H0 : β = 0, τ = 0. 2 βbk ∼ N(0, σ ) I Model βb = (βb1, ··· , βbK ). I βb follows a multivariate normal distribution βb ∼ MVN(β1 , Σ) 2 Σ = V + τ I , V = diag(V1, ··· , VK ). I The likelihood ratio test statistic. L1 Snew = −2 log L0 K 2 ! Y 1 βbk L0 = p exp − 2πV 2Vk k=1 k K 2 ! Y 1 (βbk − β) L = exp − 1 p 2 2 2(Vk + τ ) k=1 2π(Vk + τ ) New RE I Basically we test both fixed and random effects together Snew = Sβ + Sτ 2 I Sβ is equal to the Z-statistic based on the fixed effect 2 model, and Sτ 2 is the test statistic for testing τ = 0 2 I β is unconstrained, but τ ≥ 0. Hence, 1 1 S ∼ χ2 + χ2. new 2 1 2 0 I However, asymptotics cannot be applied because of small K . The asymptotic p-value is conservative because of the tail of asymptotic distribution is thicker than that of the true distribution at the tail I Resampling approach should be used to calculate p-values Power Comparison Interpretation and Prioritization I In the usual meta-analysis where one collects similar studies and expects the common effect, the results found by the fixed effects model should be the top priority.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    38 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us