
Methods for Ranking and Selection in Large-Scale Inference

Nicholas Henderson
Department of Statistics, University of Wisconsin-Madison

[Figure: effect size $\theta$ and measurement $\hat\theta \pm 2\,\mathrm{SE}$ on a scale from −6 to 6.]
In this talk, we will look at effect sizes $\theta$ and estimates $\hat\theta$.

Multiple effect size estimates
[Figure: parameters $\theta_1, \ldots, \theta_{10}$ on a scale from −6 to 6.]

Rank Ordering Effects
[Figure: units ordered by increasing $\theta$. What we want.]

Rank Ordering Effects
[Figure: units ordered by increasing estimate. What we get.]

Large Scale
- regression effect
- variance effect
[Figure: many units on a scale from −6 to 6.]

Large Scale
[Figure: units ordered by increasing estimate/SE.]

Type 2 Diabetes (T2D) GWAS (Morris et al., 2012)
- cases/controls: 22,669 / 58,119
- many T2D-associated loci, but of small effect (3,371 SNPs shown)
- How to rank order?

Gene-Set Enrichment (Hao et al., 2013)
- list of 984 human genes linked to influenza-virus replication
- overlap of this list with annotated gene sets from the Gene Ontology (5,719 gene sets)
- How to rank order?
[Figure: proportion of set detected by RNAi (0.05 to 0.50) against set size $N$ (10 to 1000).]

Connection to Large-Scale Inference
- Test statistics $T_1, \ldots, T_n$ with $T_i \mid \theta_i \sim f(t \mid \theta_i)$
- Unobserved effect sizes: $\theta_1, \ldots, \theta_n$
- Large-scale multiple testing: consider $H_0 : \theta_i = 0$, for $i = 1, \ldots, n$.
Our Focus
- Non-sparse cases: a substantial fraction of the $\theta_i$ are not zero.
- Ranking/prioritizing the non-null cases.
- Differing levels of information across units.

Overview
Objective: We have data from a large number of measurement or inference units, and we would like to identify the most important units by some measure.
Typical setup:
- $n$ units
- data: $D_1, \ldots, D_n$
- signals or effect sizes: $(\theta_1, \ldots, \theta_n)$
- parametric model: $p(D_i \mid \theta_i)$

Type 2 Diabetes (T2D) GWAS
- unit = SNP
- $n = 3{,}371$
- $\theta_i = \log \dfrac{\mathrm{odds}(\mathrm{T2D} \mid A_i)}{\mathrm{odds}(\mathrm{T2D} \mid A_i^c)}$
- $D_i = (\hat\theta_i, \hat\sigma_i)$

Basketball: Free-Throw Percentages
- 461 NBA players (2013-2014)
- free throw percentages
- lots of variation in number of attempts
[Figure: makes/attempts (0.0 to 1.0) against number of free throw attempts (1 to 1000).]
unit = basketball player
$D_i = (m_i, X_i)$ = (# attempts, # makes)
$X_i \mid \theta_i, m_i \sim \mathrm{Binomial}(\theta_i, m_i)$

Motivating model
Data: $(X_1, \sigma_1^2), \ldots, (X_n, \sigma_n^2)$
Signals: $\theta_1, \ldots, \theta_n$
$X_i = \hat\theta_i$ is the estimate of $\theta_i$, and $\sigma_i^2$ is the sampling variance of $X_i$, which measures its precision.
The $\{\sigma_i^2\}$ often differ substantially across units.
normal/normal model:
$X_i \mid \theta_i, \sigma_i^2 \sim N(\theta_i, \sigma_i^2)$
$\theta_i \sim N(0, 1)$
$\sigma_i^2 \sim g(\cdot)$

How to rank?
$X_i \mid \theta_i, \sigma_i^2 \sim N(\theta_i, \sigma_i^2)$, $\theta_i \sim N(0, 1)$, $\sigma_i^2 \sim g(\cdot)$
- MLE: sort by $\hat\theta_i = X_i$
- p-value: sort by the one-sided p-value $p_i = 1 - \Phi(X_i/\sigma_i)$, or equivalently, by $X_i/\sigma_i$
- post. mean: sort by $\hat\theta_i^{PM} = X_i/(\sigma_i^2 + 1)$
- post. quantile: sort by $\hat\theta_i^{Q} = \hat\theta_i^{PM} - 1.96 \times \sqrt{\sigma_i^2/(\sigma_i^2 + 1)}$
Each of the above methods produces the same rankings whenever $\sigma_1 = \sigma_2 = \cdots = \sigma_n$.

Variances given selection: sorting by p-value
[Figure: density of $p(\sigma_i \mid \mathrm{pval}_i \le p_{.05})$ compared with $p(\sigma_i)$, for $\sigma$ from 0 to 6; $\{i : \mathrm{pval}_i \le p_{.05}\}$ is the top 5% by p-value.]

Sort by MLE
[Figure: density of $p(\sigma_i \mid X_i \ge x_{.05})$ compared with $p(\sigma_i)$, for $\sigma$ from 0 to 8; $\{i : X_i \ge x_{.05}\}$ is the top 5% by MLE.]

Sort by Posterior Mean
[Figure: density of $p(\sigma_i \mid PM_i \ge pm_{.05})$ compared with $p(\sigma_i)$, for $\sigma$ from 0 to 8; $\{i : PM_i \ge pm_{.05}\}$ is the top 5% by posterior mean.]

Bayesian approaches
- It has often been suggested to use Bayes estimates for ranking.
- These shrink units with high variance.
- posterior mean: good for point estimation
  posterior expected rank: treats all ranks equally
  posterior tests: good for hypothesis testing
- Can we do any better?

Threshold functions
$t_\alpha(\sigma^2)$: threshold function
- Place unit $i$ in the top $\alpha$-fraction if $X_i \ge t_\alpha(\sigma_i^2)$.
- A unit is selected if its effect size estimate $X_i$ is sufficiently large relative to the associated precision.
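The ranking rules from the "How to rank?" slide, and the way each rule favors units with particular variances, can be sketched in a short simulation. This is only an illustration under stated assumptions, not the speaker's code: the Gamma(2, 1) choice for $g$, the sample size, and the helper names `simulate`, `ranking_scores`, and `top_fraction` are all hypothetical.

```python
import numpy as np

def simulate(n, rng):
    """Draw (theta, sigma2, x) from the normal/normal model.

    Assumption: g is taken to be Gamma(shape=2, scale=1) purely for illustration.
    """
    theta = rng.normal(0.0, 1.0, size=n)
    sigma2 = rng.gamma(shape=2.0, scale=1.0, size=n)
    x = rng.normal(theta, np.sqrt(sigma2))
    return theta, sigma2, x

def ranking_scores(x, sigma2):
    """Ranking variables from the slide; a larger score means a higher rank."""
    post_mean = x / (sigma2 + 1.0)
    post_sd = np.sqrt(sigma2 / (sigma2 + 1.0))
    return {
        "mle": x,
        "pvalue": x / np.sqrt(sigma2),  # same ordering as the one-sided p-value
        "post_mean": post_mean,
        "post_quantile": post_mean - 1.96 * post_sd,
    }

def top_fraction(score, alpha):
    """Boolean mask marking the top alpha-fraction of units by score."""
    k = int(np.ceil(alpha * score.size))
    mask = np.zeros(score.size, dtype=bool)
    mask[np.argsort(-score)[:k]] = True
    return mask

rng = np.random.default_rng(0)
theta, sigma2, x = simulate(50_000, rng)
sigma = np.sqrt(sigma2)

# Median sigma overall, and among the top 5% selected by each rule.
median_sigma_all = float(np.median(sigma))
median_sigma_selected = {
    name: float(np.median(sigma[top_fraction(score, 0.05)]))
    for name, score in ranking_scores(x, sigma2).items()
}
print(median_sigma_all, median_sigma_selected)
```

Under these assumptions, sorting by the MLE tends to select units with large $\sigma_i$, while sorting by p-value or posterior mean tends to select units with small $\sigma_i$, which is the pattern the "Variances given selection" figures describe.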
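Size constraint
On average, the proportion of units selected through $X_i \ge t_\alpha(\sigma_i^2)$ must equal $\alpha$:
$P\{X_i \ge t_\alpha(\sigma_i^2)\} = \alpha$, for each $\alpha \in (0, 1)$.

Threshold functions: visualizing tradeoffs
- We can equate most methods for sorting units with a family $\mathcal{T}$ of threshold functions, $\mathcal{T} = \{t_\alpha(\cdot) : \alpha \in (0, 1)\}$.
- e.g. posterior mean, $\hat\theta_i^{PM} = X_i/(\sigma_i^2 + 1)$:
  $\dfrac{X_i}{\sigma_i^2 + 1} \ge u_\alpha \implies t_\alpha(\sigma^2) = u_\alpha(\sigma^2 + 1)$
- Plots of $t_\alpha(\cdot)$ show how each method trades off observed signal versus estimation precision.

Threshold functions: T2D
[Figure: threshold functions for the T2D data.]

Threshold functions: visualizing tradeoffs
Threshold functions associated with various ranking criteria, normal/normal model

criterion | ranking variable | threshold function $t_\alpha(\sigma^2)$
MLE | $X_i$ | $u_\alpha$
PV, $H_0 : \theta_i = 0$ | $X_i/\sigma_i$ | $u_\alpha \sigma$
PV, $H_0 : \theta_i = c$ | $(X_i - c)/\sigma_i$ | $c + u_\alpha \sigma$
PM | $X_i/(\sigma_i^2 + 1)$ | $u_\alpha(\sigma^2 + 1)$
PER | $P(\theta_i \le \theta \mid X_i, \sigma_i^2)$ | $u_\alpha \sqrt{(\sigma^2 + 1)(2\sigma^2 + 1)}$
BF | $1(X_i > 0)\,\dfrac{P(X_i \mid \sigma_i^2, \theta_i \ne 0)}{P(X_i \mid \sigma_i^2, \theta_i = 0)}$ | $\sqrt{\sigma^2(\sigma^2 + 1)\left\{u_\alpha + \log\dfrac{\sigma^2 + 1}{\sigma^2}\right\}}$

From thresholds to ranks
Assign ranks $r_i$ by sweeping through the family $\mathcal{T}$:
$r_i = \inf\{\alpha \in (0, 1) : X_i \ge t_\alpha(\sigma_i^2)\}$.

Agreement
The overlap between the "true" top $\alpha$-fraction $\{i : \theta_i \ge \theta_\alpha\}$ and the reported top $\alpha$-fraction $\{i : X_i \ge t_\alpha(\sigma_i^2)\}$ is
$\dfrac{1}{n} \sum_{i=1}^{n} 1\{\theta_i \ge \theta_\alpha\}\, 1\{X_i \ge t_\alpha(\sigma_i^2)\}$,
where $P\{\theta_i \ge \theta_\alpha\} = \alpha$.
The expected overlap is the agreement (the "limiting overlap"):
$\mathrm{Agreement}(\alpha) = P\{\theta_i \ge \theta_\alpha,\ X_i \ge t_\alpha(\sigma_i^2)\}$.
Because $P\{X_i \ge t_\alpha(\sigma_i^2)\} = \alpha$, this is a "fair" way to compare procedures.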
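The size constraint and the agreement criterion above can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption made for the example: $g$ is taken to be Gamma(2, 1), $u_\alpha$ is calibrated by an empirical quantile of each ranking variable so that roughly a fraction $\alpha$ of units is selected, and the names `simulate`, `CRITERIA`, and `evaluate` are hypothetical. Only three rows of the threshold table (MLE, PV with $H_0 : \theta_i = 0$, and PM) are included.

```python
from statistics import NormalDist

import numpy as np

def simulate(n, rng):
    """Draw (theta, sigma2, x) from the normal/normal model (g assumed Gamma(2, 1))."""
    theta = rng.normal(0.0, 1.0, size=n)
    sigma2 = rng.gamma(shape=2.0, scale=1.0, size=n)
    x = rng.normal(theta, np.sqrt(sigma2))
    return theta, sigma2, x

# For each criterion: (ranking variable, threshold function t_alpha(sigma2) given u_alpha).
CRITERIA = {
    "MLE": (lambda x, s2: x, lambda u, s2: np.full_like(s2, u)),
    "PV": (lambda x, s2: x / np.sqrt(s2), lambda u, s2: u * np.sqrt(s2)),
    "PM": (lambda x, s2: x / (s2 + 1.0), lambda u, s2: u * (s2 + 1.0)),
}

def evaluate(theta, sigma2, x, alpha):
    """Empirical selected fraction and agreement for each criterion."""
    # theta_alpha solves P(theta_i >= theta_alpha) = alpha under theta_i ~ N(0, 1).
    theta_alpha = NormalDist().inv_cdf(1.0 - alpha)
    true_top = theta >= theta_alpha
    out = {}
    for name, (rank_var, threshold) in CRITERIA.items():
        # Calibrate u_alpha so that about an alpha-fraction is selected (size constraint).
        u_alpha = np.quantile(rank_var(x, sigma2), 1.0 - alpha)
        selected = x >= threshold(u_alpha, sigma2)
        out[name] = {
            "selected_fraction": float(selected.mean()),
            "agreement": float((true_top & selected).mean()),
        }
    return out

rng = np.random.default_rng(0)
theta, sigma2, x = simulate(100_000, rng)
results = evaluate(theta, sigma2, x, alpha=0.05)
for name, res in results.items():
    print(name, res)
```

Because every criterion selects about the same fraction $\alpha$ of units, the agreement values are directly comparable, and none can exceed $\alpha$.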
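Maximizing agreement
- Assume: positive joint density for $(X_i, \sigma_i^2, \theta_i)$
- $\theta_i = E(X_i \mid \theta_i, \sigma_i^2)$
- $\sigma_i^2 = \mathrm{var}(X_i \mid \theta_i, \sigma_i^2)$
The family $\mathcal{T}^* = \{t^*_\alpha\}$ is optimal if, for any $\alpha \in (0, 1)$ and any family $\{t_\alpha\}$ satisfying the size constraint,
$P\{\theta_i \ge \theta_\alpha,\ X_i \ge t^*_\alpha(\sigma_i^2)\} \ge P\{\theta_i \ge \theta_\alpha,\ X_i \ge t_\alpha(\sigma_i^2)\}$  (1)
Theorem 1 (Necessary condition)
A necessary condition for $t^*_\alpha$ to be optimal as in (1) is that it satisfies
$P\{\theta_i \ge \theta_\alpha \mid X_i = t^*_\alpha(\sigma^2),\ \sigma_i^2 = \sigma^2\} = c_\alpha$
for a constant $c_\alpha$ that does not depend on $\sigma^2$.

T2D data: maximal agreement
[Figure: maximal-agreement threshold functions for the T2D data.]

Normal/Normal
Threshold functions associated with various ranking criteria, normal/normal model

criterion | ranking variable | threshold function $t_\alpha(\sigma^2)$
MLE | $X_i$ | $u_\alpha$
PV, $H_0 : \theta_i = 0$ | $X_i/\sigma_i$ | $u_\alpha \sigma$
PV, $H_0 : \theta_i = c$ | $(X_i - c)/\sigma_i$ | $c + u_\alpha \sigma$
PM | $X_i/(\sigma_i^2 + 1)$ | $u_\alpha(\sigma^2 + 1)$
PER | $P(\theta_i \le \theta \mid X_i, \sigma_i^2)$ | $u_\alpha \sqrt{(\sigma^2 + 1)(2\sigma^2 + 1)}$
BF | $1(X_i > 0)\,\dfrac{P(X_i \mid \sigma_i^2, \theta_i \ne 0)}{P(X_i \mid \sigma_i^2, \theta_i = 0)}$ | $\sqrt{\sigma^2(\sigma^2 + 1)\left\{u_\alpha + \log\dfrac{\sigma^2 + 1}{\sigma^2}\right\}}$
max agreement | r-value | $\theta_\alpha(\sigma^2 + 1) - u_\alpha \sqrt{\sigma^2(\sigma^2 + 1)}$

Local Tail Probabilities
$V_\alpha(X_i, \sigma_i^2) = P\{\theta_i \ge \theta_\alpha \mid X_i, \sigma_i^2\}$ and $P\{\theta_i \ge \theta_\alpha\} = \alpha$
Theorem 2
If $V_\alpha(x, \sigma^2)$ is right-continuous and non-decreasing in $x$ for every $(\alpha, \sigma^2)$, and $\lambda_\alpha$ is chosen so that
$P\{V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\} = \alpha$,
then the optimal family is given by
$t^*_\alpha(\sigma^2) = \inf\{x : V_\alpha(x, \sigma^2) \ge \lambda_\alpha\}$.

Crossing?
Do functions from the optimal family cross?
Theorem 3 (Crossing)
No functions in the family $\{t^*_\alpha(\sigma^2) : \alpha \in (0, 1)\}$ "cross" as long as
$\dfrac{\partial t^*_\alpha(\sigma^2)}{\partial \alpha} < 0$.
This property holds in the normal/normal model for any distribution of $\sigma^2$.

Optimal ranking variable
For $\mathcal{T} = \{t_\alpha\}$, assign percentile ranks by sweeping through the family:
$r(X_i, \sigma_i^2) = \inf\{\alpha : X_i \ge t_\alpha(\sigma_i^2)\}$.
For the family $\{t^*_\alpha\}$ maximizing agreement, this is the "r-value"
$r(X_i, \sigma_i^2) = \inf\{\alpha \in (0, 1) : V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\}$, where $V_\alpha(X_i, \sigma_i^2) = P(\theta_i \ge \theta_\alpha \mid X_i, \sigma_i^2)$.
If the family $\mathcal{T}^* = \{t^*_\alpha\}$ has no crossings, then
$r(X_i, \sigma_i^2) \le \alpha$ if and only if $V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha$.
Recall: $P\{V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\} = \alpha$.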
Hence the r-values are uniformly distributed:
$P\{r(X_i, \sigma_i^2) \le \alpha\} = P\{V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\} = \alpha$.

r-values ∼ Unif(0,1)
[Figure: NBA data, histogram of r-values on (0, 1); y-axis: density.]

Interpretation
"r-value": $r(X_i, \sigma_i^2) = \inf\{\alpha : V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\}$
$\{i : V_\alpha(X_i, \sigma_i^2) \ge \lambda_\alpha\}$ is the top $\alpha$-fraction of units when ranking by $V_\alpha(X_i, \sigma_i^2)$.
The r-value of a unit may be interpreted as the smallest $\alpha$ at which the unit should be placed in the top $\alpha$-fraction of units when ranking by $V_\alpha(X_i, \sigma_i^2)$.

Another View
Constrained loss for classifying units as in the top $\alpha$-fraction or not:
$L_\alpha(a, \theta) = \sum_{i=1}^{n} 1\{a_i \le \alpha,\ \theta_i \ge \theta_\alpha\}$, s.t.