Empirical Bayes Ranking with R-Values

Methods for Ranking and Selection in Large-Scale Inference

Nicholas Henderson
Department of Statistics, University of Wisconsin-Madison

[Figure: a scale from -6 to 6 marking an effect size $\theta$, its estimate $\hat\theta$, and the measurement interval $\pm 2\,\mathrm{SE}$.]

In this talk, we will look at effect sizes $\theta$ and estimates $\hat\theta$.

Multiple effect size estimates

[Figure: parameters $\theta_1, \dots, \theta_{10}$ on a scale from -6 to 6.]

Rank Ordering Effects

[Figure: units ordered by increasing $\theta$ (what we want).]

Rank Ordering Effects

[Figure: units ordered by increasing estimate (what we get).]

Large Scale

- regression effect
- variance effect

Large Scale

[Figure: units ordered by increasing estimate/SE.]

Type 2 Diabetes (T2D) GWAS (Morris et al., 2012)

- case/control: 22,669/58,119
- many T2D-associated loci, but of small effect (3371 SNPs shown)
- How to rank order?

Gene-Set Enrichment (Hao et al., 2013)

- list of 984 human genes linked to influenza-virus replication
- overlap of this list with annotated gene sets from the Gene Ontology (5719 gene sets)
- How to rank order?

[Figure: proportion of set detected by RNAi versus set size $N$ (log scales).]

Connection to Large-Scale Inference

- Test statistics $T_1, \dots, T_n$ with $T_i \mid \theta_i \sim f(t \mid \theta_i)$
- Unobserved effect sizes: $\theta_1, \dots, \theta_n$
- Large-scale multiple testing: consider $H_0 : \theta_i = 0$, for $i = 1, \dots, n$.

Our focus:

- Non-sparse cases: a substantial fraction of the $\theta_i$ are not zero.
- Ranking/prioritizing the non-null cases.
- Differing levels of information across units.

Overview

Objective: we have data from a large number of measurement or inference units, and we would like to identify the most important units by some measure.

Typical setup:

- $n$ units
- data: $D_1, \dots, D_n$
- signals or effect sizes: $(\theta_1, \dots, \theta_n)$
- parametric model: $p(D_i \mid \theta_i)$

Type 2 Diabetes (T2D) GWAS

- unit = SNP
- $n = 3{,}371$
- $\theta_i = \log \dfrac{\text{odds}(\text{T2D} \mid A_i)}{\text{odds}(\text{T2D} \mid A_i^c)}$
- $D_i = (\hat\theta_i, \hat\sigma_i)$

Basketball: Free-Throw Percentages

- 461 NBA players (2013-2014)
- free throw percentages
- lots of variation in number of attempts

[Figure: makes/attempts versus number of free throw attempts (log scale).]

- unit = basketball player
- $D_i = (m_i, X_i)$ = (# attempts, # makes)
- $X_i \mid \theta_i, m_i \sim \text{Binomial}(m_i, \theta_i)$

Motivating model

Data: $(X_1, \sigma_1^2), \dots, (X_n, \sigma_n^2)$. Signals: $\theta_1, \dots, \theta_n$.

- $X_i = \hat\theta_i$ is the estimate of $\theta_i$, and $\sigma_i^2$ describes the precision of $X_i$.
- The $\{\sigma_i^2\}$ often differ substantially across units.

Normal/normal model:

$$X_i \mid \theta_i, \sigma_i^2 \sim N(\theta_i, \sigma_i^2), \qquad \theta_i \sim N(0, 1), \qquad \sigma_i^2 \sim g(\cdot)$$

How to rank?

$$X_i \mid \theta_i, \sigma_i^2 \sim N(\theta_i, \sigma_i^2), \qquad \theta_i \sim N(0, 1), \qquad \sigma_i^2 \sim g(\cdot)$$

- MLE: sort by $\hat\theta_i = X_i$
- p-value: sort by $p_i = 1 - \Phi(X_i/\sigma_i)$, or equivalently, by $X_i/\sigma_i$
- posterior mean: sort by $\hat\theta_i^{PM} = X_i/(\sigma_i^2 + 1)$
- posterior quantile: sort by $\hat\theta_i^{Q} = \hat\theta_i^{PM} - 1.96 \sqrt{\sigma_i^2/(\sigma_i^2 + 1)}$

All of the above methods produce the same ranking whenever $\sigma_1 = \sigma_2 = \dots = \sigma_n$.

Variances given selection: sorting by p-value

[Figure: density $p(\sigma_i \mid \text{pval}_i \le p_{.05})$ against the marginal density $p(\sigma_i)$, where $\{i : \text{pval}_i \le p_{.05}\}$ is the top 5% by p-value; x-axis $\sigma$.]

Sort by MLE

[Figure: density $p(\sigma_i \mid X_i \ge x_{.05})$ against the marginal density $p(\sigma_i)$, where $\{i : X_i \ge x_{.05}\}$ is the top 5% by MLE; x-axis $\sigma$.]

Sort by Posterior Mean

[Figure: density $p(\sigma_i \mid PM_i \ge pm_{.05})$ against the marginal density $p(\sigma_i)$, where $\{i : PM_i \ge pm_{.05}\}$ is the top 5% by posterior mean; x-axis $\sigma$.]

Bayesian approaches

- It has often been suggested to use Bayes estimates for ranking.
- These shrink units with high variance.
- Common choices:
  - posterior mean: good for point estimation
  - posterior expected rank: treats all ranks equally
  - posterior tests: good for hypothesis testing
- Can we do any better?

Threshold functions

$t_\alpha(\sigma^2)$: threshold function

- Place unit $i$ in the top $\alpha$-fraction if $X_i \ge t_\alpha(\sigma_i^2)$.
- A unit is selected if its effect size estimate $X_i$ is sufficiently large relative to the associated precision.
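The ranking criteria listed under "How to rank?" can be compared in a short simulation. The sketch below is only an illustration, not the speaker's code: the function names are hypothetical, and the choice of $g$ ($\sigma_i$ uniform on $[0.2, 3]$) is an assumption, since the slides leave $g$ unspecified. It checks that the four criteria order units identically when all variances are equal and differently when they are not.

```python
import math
import random

def ranking_scores(x, sigma2):
    """Ranking variables for one unit under the normal/normal model."""
    post_mean = x / (sigma2 + 1.0)                # posterior mean of theta
    post_sd = math.sqrt(sigma2 / (sigma2 + 1.0))  # posterior standard deviation
    return {
        "mle": x,
        "pvalue": x / math.sqrt(sigma2),          # same ordering as the p-value
        "post_mean": post_mean,
        "post_quantile": post_mean - 1.96 * post_sd,
    }

def rank_order(xs, sigma2s, criterion):
    """Indices of units sorted from largest to smallest ranking variable."""
    scores = [ranking_scores(x, s2)[criterion] for x, s2 in zip(xs, sigma2s)]
    return sorted(range(len(xs)), key=lambda i: scores[i], reverse=True)

def simulate(n, sigma2_sampler, rng):
    """Draw theta_i ~ N(0, 1) and X_i | theta_i ~ N(theta_i, sigma_i^2)."""
    sigma2s = [sigma2_sampler(rng) for _ in range(n)]
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n)]
    xs = [rng.gauss(t, math.sqrt(s2)) for t, s2 in zip(thetas, sigma2s)]
    return xs, sigma2s

CRITERIA = ["mle", "pvalue", "post_mean", "post_quantile"]
rng = random.Random(0)

# Equal variances: every criterion is an increasing function of X_i alone.
xs, s2 = simulate(200, lambda r: 1.0, rng)
equal_orders = [rank_order(xs, s2, c) for c in CRITERIA]

# Unequal variances (assumed g: sigma uniform on [0.2, 3]).
xs_h, s2_h = simulate(200, lambda r: r.uniform(0.2, 3.0) ** 2, rng)
hetero_orders = [rank_order(xs_h, s2_h, c) for c in CRITERIA]

print("same ordering, equal variances:  ",
      all(o == equal_orders[0] for o in equal_orders))
print("same ordering, unequal variances:",
      all(o == hetero_orders[0] for o in hetero_orders))
```

With unequal variances the MLE ordering tends to favor high-variance units and the posterior-mean ordering tends to favor low-variance units, which is the pattern the "variances given selection" figures display.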
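Size constraint

On average, the proportion of units selected through $X_i \ge t_\alpha(\sigma_i^2)$ must equal $\alpha$:

$$P\{X_i \ge t_\alpha(\sigma_i^2)\} = \alpha, \quad \text{for each } \alpha \in (0, 1).$$

Threshold functions: visualizing tradeoffs

- We can equate most methods for sorting units with a family $\mathcal{T}$ of threshold functions, $\mathcal{T} = \{t_\alpha(\cdot) : \alpha \in (0, 1)\}$.
- For example, for the posterior mean $\hat\theta_i^{PM} = X_i/(\sigma_i^2 + 1)$:
  $$\frac{X_i}{\sigma_i^2 + 1} \ge u_\alpha \implies t_\alpha(\sigma^2) = u_\alpha(\sigma^2 + 1)$$
- Plots of $t_\alpha(\cdot)$ show how each method trades off observed signal versus estimation precision.

Threshold functions: T2D

[Figure: threshold functions for the T2D data.]

Threshold functions: visualizing tradeoffs

Threshold functions associated with various ranking criteria, normal/normal model:

| Criterion | Ranking variable | Threshold function $t_\alpha(\sigma^2)$ |
|---|---|---|
| MLE | $X_i$ | $u_\alpha$ |
| PV, $H_0 : \theta_i = 0$ | $X_i/\sigma_i$ | $u_\alpha \sigma$ |
| PV, $H_0 : \theta_i = c$ | $(X_i - c)/\sigma_i$ | $c + u_\alpha \sigma$ |
| PM | $X_i/(\sigma_i^2 + 1)$ | $u_\alpha(\sigma^2 + 1)$ |
| PER | $P(\theta_i \le \theta \mid X_i, \sigma_i^2)$ | $u_\alpha \sqrt{(\sigma^2 + 1)(2\sigma^2 + 1)}$ |
| BF | $1(X_i > 0)\, \dfrac{P(X_i \mid \sigma_i^2, \theta_i \ne 0)}{P(X_i \mid \sigma_i^2, \theta_i = 0)}$ | $\sqrt{\sigma^2(\sigma^2 + 1)\left\{u_\alpha + \log\dfrac{\sigma^2 + 1}{\sigma^2}\right\}}$ |

From thresholds to ranks

Assign ranks $r_i$ by sweeping through the family $\mathcal{T}$:

$$r_i = \inf\{\alpha \in (0, 1) : X_i \ge t_\alpha(\sigma_i^2)\}.$$

Agreement

The overlap between the "true" top $\alpha$-fraction $\{i : \theta_i \ge \theta_\alpha\}$ and the reported top $\alpha$-fraction $\{i : X_i \ge t_\alpha(\sigma_i^2)\}$ is

$$\frac{1}{n} \sum_{i=1}^{n} 1\{\theta_i \ge \theta_\alpha\}\, 1\{X_i \ge t_\alpha(\sigma_i^2)\},$$

where $P\{\theta_i \ge \theta_\alpha\} = \alpha$. The expected overlap is the agreement (the "limiting overlap"):

$$\text{Agreement}(\alpha) = P\{\theta_i \ge \theta_\alpha,\ X_i \ge t_\alpha(\sigma_i^2)\}.$$

Because $P\{X_i \ge t_\alpha(\sigma_i^2)\} = \alpha$, this is a "fair" way to compare procedures.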
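The overlap formula can be estimated by Monte Carlo. The following is a minimal sketch under stated assumptions, not a definitive implementation: $g$ is taken to be $\sigma_i$ uniform on $[0.2, 3]$ (the slides do not specify it), the helper names are hypothetical, and both the "true" and the reported top $\alpha$-fractions are formed from empirical quantiles, so the size constraint holds by construction.

```python
import math
import random

def top_fraction(scores, alpha):
    """Indices of the top alpha-fraction of units by score."""
    k = int(alpha * len(scores))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])

def empirical_agreement(thetas, scores, alpha):
    """(1/n) * sum_i 1{theta_i in true top} * 1{unit i reported in top}."""
    true_top = top_fraction(thetas, alpha)
    reported_top = top_fraction(scores, alpha)
    return len(true_top & reported_top) / len(thetas)

rng = random.Random(1)
n, alpha = 20000, 0.10

# Assumed g: sigma_i uniform on [0.2, 3].
sigma2 = [rng.uniform(0.2, 3.0) ** 2 for _ in range(n)]
theta = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [rng.gauss(t, math.sqrt(s2)) for t, s2 in zip(theta, sigma2)]

scores = {
    "MLE": x,
    "p-value": [xi / math.sqrt(s2) for xi, s2 in zip(x, sigma2)],
    "posterior mean": [xi / (s2 + 1.0) for xi, s2 in zip(x, sigma2)],
}
agreement = {name: empirical_agreement(theta, s, alpha)
             for name, s in scores.items()}
for name, value in agreement.items():
    print(f"{name}: {value:.4f}")
```

Every procedure reports the same number of units, so the agreement values are directly comparable, and none can exceed $\alpha$.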
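Maximizing agreement

Assumptions:

- positive joint density for $(X_i, \sigma_i^2, \theta_i)$
- $\theta_i = E(X_i \mid \theta_i, \sigma_i^2)$
- $\sigma_i^2 = \text{var}(X_i \mid \theta_i, \sigma_i^2)$

The family $\mathcal{T}^* = \{t_\alpha^*\}$ is optimal if for any $\alpha \in (0, 1)$:

$$P\{\theta_i \ge \theta_\alpha,\ X_i \ge t_\alpha^*(\sigma_i^2)\} \ge P\{\theta_i \ge \theta_\alpha,\ X_i \ge t_\alpha(\sigma_i^2)\} \qquad (1)$$

Theorem 1 (Necessary condition). A necessary condition for $t_\alpha^*$ to be optimal as in (1) is that it satisfies

$$P\{\theta_i \ge \theta_\alpha \mid X_i = t_\alpha^*(\sigma^2),\ \sigma_i^2 = \sigma^2\} = c_\alpha.$$

T2D data: maximal agreement

[Figure: maximal-agreement threshold functions for the T2D data.]

Normal/Normal

Threshold functions associated with various ranking criteria, normal/normal model:

| Criterion | Ranking variable | Threshold function $t_\alpha(\sigma^2)$ |
|---|---|---|
| MLE | $X_i$ | $u_\alpha$ |
| PV, $H_0 : \theta_i = 0$ | $X_i/\sigma_i$ | $u_\alpha \sigma$ |
| PV, $H_0 : \theta_i = c$ | $(X_i - c)/\sigma_i$ | $c + u_\alpha \sigma$ |
| PM | $X_i/(\sigma_i^2 + 1)$ | $u_\alpha(\sigma^2 + 1)$ |
| PER | $P(\theta_i \le \theta \mid X_i, \sigma_i^2)$ | $u_\alpha \sqrt{(\sigma^2 + 1)(2\sigma^2 + 1)}$ |
| BF | $1(X_i > 0)\, \dfrac{P(X_i \mid \sigma_i^2, \theta_i \ne 0)}{P(X_i \mid \sigma_i^2, \theta_i = 0)}$ | $\sqrt{\sigma^2(\sigma^2 + 1)\left\{u_\alpha + \log\dfrac{\sigma^2 + 1}{\sigma^2}\right\}}$ |
| max agreement | r-value | $\theta_\alpha(\sigma^2 + 1) - u_\alpha \sqrt{\sigma^2(\sigma^2 + 1)}$ |

Local Tail Probabilities

$$V_\alpha(X_i, \sigma_i^2) = P\{\theta_i \ge \theta_\alpha \mid X_i, \sigma_i^2\} \quad \text{and} \quad P\{\theta_i \ge \theta_\alpha\} = \alpha$$

Theorem 2. If $V_\alpha(x, \sigma^2)$ is right-continuous and non-decreasing in $x$ for every $(\alpha, \sigma^2)$, and $\lambda_\alpha$ is chosen so that

$$P\{V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\} = \alpha,$$

then the optimal family is given by

$$t_\alpha^*(\sigma^2) = \inf\{x : V_\alpha(x, \sigma^2) \ge \lambda_\alpha\}.$$

Crossing?

Do functions from the optimal family cross?

Theorem 3 (Crossing). No functions in the family $\{t_\alpha^*(\sigma^2) : \alpha \in (0, 1)\}$ "cross" as long as

$$\frac{\partial t_\alpha^*(\sigma^2)}{\partial \alpha} < 0.$$

This property holds in the normal/normal model for any distribution of $\sigma^2$.

Optimal ranking variable

For $\mathcal{T} = \{t_\alpha\}$, assign percentile ranks by sweeping through the family:

$$r(X_i, \sigma_i^2) = \inf\{\alpha : X_i \ge t_\alpha(\sigma_i^2)\}.$$

For the family $\{t_\alpha^*\}$ maximizing agreement, this is the "r-value":

$$r(X_i, \sigma_i^2) = \inf\{\alpha \in (0, 1) : V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\}, \qquad V_\alpha(X_i, \sigma_i^2) = P(\theta_i \ge \theta_\alpha \mid X_i, \sigma_i^2).$$

If the family $\mathcal{T}^* = \{t_\alpha^*\}$ has no crossings, then

$$r(X_i, \sigma_i^2) \le \alpha \quad \text{if and only if} \quad V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha.$$

Recall that $P\{V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\} = \alpha$. Hence

$$P\{r(X_i, \sigma_i^2) \le \alpha\} = P\{V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\} = \alpha,$$

so the r-values are uniformly distributed.

r-values ∼ Unif(0,1)

[Figure: NBA data, histogram (density) of r-values on (0, 1).]

Interpretation

- "r-value": $r(X_i, \sigma_i^2) = \inf\{\alpha : V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\}$
- $\{i : V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\}$ is the top $\alpha$-fraction of units when ranking by $V_\alpha(X_i, \sigma_i^2)$.

The r-value of a unit may be interpreted as the smallest $\alpha$ at which the unit is placed in the top $\alpha$-fraction of units when ranking by $V_\alpha(X_i, \sigma_i^2)$.

Another View

Constrained loss for classifying units as in the top $\alpha$-fraction or not:

$$L_\alpha(a, \theta) = \sum_{i=1}^{n} 1\{a_i \le \alpha,\ \theta_i \ge \theta_\alpha\}, \quad \text{s.t. } \sum_{i=1}^{n} \cdots$$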
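The r-value definition from the "Optimal ranking variable" and "Interpretation" slides can be approximated numerically. The sketch below is only an illustration under explicit assumptions, not the speaker's implementation: the prior $\theta_i \sim N(0, 1)$ is treated as known, $\alpha$ is restricted to a grid, $\lambda_\alpha$ is replaced by an empirical quantile of $V_\alpha$ over the simulated units, and $g$ is taken to be $\sigma_i$ uniform on $[0.2, 3]$. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed known prior: theta ~ N(0, 1)

def tail_prob(x, sigma2, theta_alpha):
    """V_alpha(x, sigma^2) = P(theta >= theta_alpha | x, sigma^2)."""
    post_mean = x / (sigma2 + 1.0)
    post_sd = math.sqrt(sigma2 / (sigma2 + 1.0))
    return 1.0 - STD_NORMAL.cdf((theta_alpha - post_mean) / post_sd)

def r_values(xs, sigma2s, grid_size=100):
    """Approximate r-values on the grid alpha = 1/grid_size, ..., 1."""
    n = len(xs)
    r = [1.0] * n  # every unit belongs to the top 100%
    # Sweep alpha downward, so each unit ends at the smallest grid
    # alpha for which V_alpha >= lambda_alpha.
    for j in range(grid_size - 1, 0, -1):
        alpha = j / grid_size
        theta_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)  # P(theta >= theta_alpha) = alpha
        v = [tail_prob(x, s2, theta_alpha) for x, s2 in zip(xs, sigma2s)]
        k = max(1, round(alpha * n))
        lam = sorted(v, reverse=True)[k - 1]  # empirical lambda_alpha
        for i, vi in enumerate(v):
            if vi >= lam:
                r[i] = alpha
    return r

rng = random.Random(2)
n = 2000
sigma2 = [rng.uniform(0.2, 3.0) ** 2 for _ in range(n)]  # assumed g
theta = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [rng.gauss(t, math.sqrt(s2)) for t, s2 in zip(theta, sigma2)]

rvals = r_values(x, sigma2)
frac_top10 = sum(rv <= 0.10 for rv in rvals) / n
mean_r = sum(rvals) / n
print(f"fraction with r-value <= 0.10: {frac_top10:.3f}")
print(f"mean r-value: {mean_r:.3f}")
```

If the r-values are roughly uniform on (0, 1), about 10% of units should have an r-value of at most 0.10 and the mean should sit near 0.5, up to grid and sampling error.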
