Why and how to use random forest variable importance measures
(and how you shouldn't)

Carolin Strobl (LMU München) and Achim Zeileis (WU Wien)
[email protected]

useR! 2008, Dortmund

Random forests

- have become increasingly popular in, e.g., genetics and the neurosciences
- can deal with "small n, large p" problems, high-order interactions, and correlated predictor variables
- are used not only for prediction, but also to assess variable importance

(Small) random forest

[Figure: a grid of example classification trees from a small random forest; each tree splits on the predictors Start, Age, and Number at p < 0.001, with different cutpoints and class proportions.]

Construction of a random forest

- draw ntree bootstrap samples from the original sample
- fit a classification tree to each bootstrap sample ⇒ ntree trees
- this creates a diverse set of trees, because:
  - trees are unstable w.r.t. changes in the learning data ⇒ ntree different-looking trees (bagging)
  - mtry splitting variables are randomly preselected in each split ⇒ ntree even more different-looking trees (random forest)
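The steps above can be sketched in a few lines of R. This is only an illustration of the construction idea, not the actual randomForest or cforest implementation; the kyphosis data (shipped with rpart, matching the predictors Age, Number, Start in the trees above) and mtry = 2 are assumptions, and for simplicity the mtry variables are preselected once per tree rather than anew at every split:

```r
library(rpart)  # ships with R; provides rpart() and the kyphosis data

set.seed(1)
ntree <- 25
mtry  <- 2
vars  <- c("Age", "Number", "Start")

forest <- lapply(seq_len(ntree), function(t) {
  # step 1: draw a bootstrap sample from the original sample
  boot <- kyphosis[sample(nrow(kyphosis), replace = TRUE), ]
  # step 3 (simplified): preselect mtry candidate splitting variables
  use  <- sample(vars, mtry)
  # step 2: fit a classification tree to the bootstrap sample
  rpart(reformulate(use, response = "Kyphosis"), data = boot,
        method = "class")
})
length(forest)  # ntree trees
```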

Random forests in R

- randomForest (pkg: randomForest)
  - the reference implementation, based on CART trees (Breiman, 2001; Liaw and Wiener, 2008)
  − for variables of different types: biased in favor of continuous variables and variables with many categories (Strobl, Boulesteix, Zeileis, and Hothorn, 2007)

- cforest (pkg: party)
  - based on unbiased conditional inference trees (Hothorn, Hornik, and Zeileis, 2006)
  + for variables of different types: unbiased when subsampling, instead of the bootstrap, is used (Strobl, Boulesteix, Zeileis, and Hothorn, 2007)
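A minimal sketch of the two interfaces (assumes both packages are installed; the kyphosis data from rpart is used purely as a stand-in example, since the slide does not name a data set):

```r
library(randomForest)
library(party)

set.seed(1)
d <- rpart::kyphosis

# CART-based reference implementation (bootstrap sampling by default)
rf <- randomForest(Kyphosis ~ Age + Number + Start, data = d,
                   importance = TRUE)

# conditional inference trees; cforest_unbiased() uses subsampling
# instead of the bootstrap, giving unbiased variable selection
cf <- cforest(Kyphosis ~ Age + Number + Start, data = d,
              controls = cforest_unbiased(ntree = 500, mtry = 2))
```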

Measuring variable importance

- Gini importance:
  mean Gini gain produced by X_j over all trees
  - obj <- randomForest(..., importance=TRUE)
    obj$importance, column MeanDecreaseGini
    importance(obj, type=2)
  − for variables of different types: biased in favor of continuous variables and variables with many categories

- permutation importance:
  mean decrease in classification accuracy after permuting X_j, over all trees
  - obj <- randomForest(..., importance=TRUE)
    obj$importance, column MeanDecreaseAccuracy
    importance(obj, type=1)
  - obj <- cforest(...)
    varimp(obj)
  + for variables of different types: unbiased only when subsampling is used, as in cforest(..., controls = cforest_unbiased())

The permutation importance

within each tree t:

VI^(t)(x_j) = Σ_{i ∈ B̄^(t)} I(y_i = ŷ_i^(t)) / |B̄^(t)|  −  Σ_{i ∈ B̄^(t)} I(y_i = ŷ_{i,π_j}^(t)) / |B̄^(t)|

where B̄^(t) is the out-of-bag sample of tree t,

ŷ_i^(t) = f^(t)(x_i) = predicted class before permuting X_j,

ŷ_{i,π_j}^(t) = f^(t)(x_{i,π_j}) = predicted class after permuting X_j, with

x_{i,π_j} = (x_{i,1}, …, x_{i,j−1}, x_{π_j(i),j}, x_{i,j+1}, …, x_{i,p})

Note: VI^(t)(x_j) = 0 by definition if X_j is not in tree t.
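The per-tree formula is a one-liner in base R. A toy sketch with hand-made predictions (the vectors below are invented for illustration, not the output of a real tree):

```r
# accuracy on the out-of-bag sample before minus after permuting X_j
vi_tree <- function(y, yhat, yhat_perm) {
  mean(y == yhat) - mean(y == yhat_perm)
}

y         <- c(1, 0, 1, 1, 0)   # true classes of the OOB observations
yhat      <- c(1, 0, 1, 0, 0)   # predictions before permuting X_j: 4/5 correct
yhat_perm <- c(0, 0, 1, 0, 1)   # predictions after permuting X_j:  2/5 correct
vi_tree(y, yhat, yhat_perm)     # 0.8 - 0.4 = 0.4
```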

over all trees:

1. raw importance

VI(x_j) = Σ_{t=1}^{ntree} VI^(t)(x_j) / ntree

obj <- randomForest(..., importance=TRUE)
importance(obj, type=1, scale=FALSE)

2. scaled importance (z-score)

z_j = VI(x_j) / (σ̂ / √ntree)

obj <- randomForest(..., importance=TRUE)
importance(obj, type=1, scale=TRUE)  (the default)
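Aggregation and scaling over the trees, again as a base-R sketch with invented per-tree importances:

```r
vi_t  <- c(0.02, 0.00, 0.05, 0.01, 0.03)  # toy per-tree values VI^(t)(x_j)
ntree <- length(vi_t)

raw <- sum(vi_t) / ntree                  # 1. raw importance (scale = FALSE)
z   <- raw / (sd(vi_t) / sqrt(ntree))     # 2. scaled importance (scale = TRUE)
raw                                       # 0.022
```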

Tests for variable importance
for variable selection purposes

- Breiman and Cutler (2008): simple significance test based on normality of the z-score
  randomForest, scale=TRUE + α-quantile of N(0,1)

- Diaz-Uriarte and Alvarez de Andrés (2006): backward elimination (throw out the least important variables until the out-of-bag prediction accuracy drops)
  varSelRF (pkg: varSelRF), depends on randomForest

- Diaz-Uriarte (2007) and Rodenburg et al. (2008): plots and significance test (randomly permute the response values to mimic the overall null hypothesis that none of the predictor variables is relevant = baseline)

Tests for variable importance

problems of these approaches:

- (at least) Breiman and Cutler (2008): strange statistical properties (Strobl and Zeileis, 2008)

- all: preference of correlated predictor variables (see also Nicodemus and Shugart, 2007; Archer and Kimes, 2008)

Breiman and Cutler's test

under the null hypothesis of zero importance:

z_j is asymptotically N(0, 1)

if z_j exceeds the α-quantile of N(0,1) ⇒ reject the null hypothesis of zero importance for variable X_j

Raw importance

[Figure: mean raw importance against relevance (0.0–0.4), panels for ntree = 100, 200, 500 and sample sizes 100, 200, 500.]

z-score and power

[Figure: z-score and power against relevance (0.0–0.4), panels for ntree = 100, 200, 500 and sample sizes 100, 200, 500.]

Findings

z-score and power

- increase with ntree
- decrease with sample size

⇒ rather use the raw, unscaled permutation importance!

importance(obj, type=1, scale=FALSE)
varimp(obj)

What null hypothesis were we testing in the first place?

obs   Y      X_j              Z
1     y_1    x_{π_j(1),j}     z_1
⋮     ⋮      ⋮                ⋮
i     y_i    x_{π_j(i),j}     z_i
⋮     ⋮      ⋮                ⋮
n     y_n    x_{π_j(n),j}     z_n

the current null hypothesis reflects independence of X_j from both Y and the remaining predictor variables Z:

H_0: X_j ⊥ Y, Z   or   X_j ⊥ Y ∧ X_j ⊥ Z

under H_0: P(Y, X_j, Z) = P(Y, Z) · P(X_j)

⇒ a high variable importance can result from violation of either one!

Suggestion: Conditional permutation scheme

obs   Y       X_j                   Z
1     y_1     x_{π_j|Z=a(1),j}      z_1 = a
3     y_3     x_{π_j|Z=a(3),j}      z_3 = a
27    y_27    x_{π_j|Z=a(27),j}     z_27 = a
6     y_6     x_{π_j|Z=b(6),j}      z_6 = b
14    y_14    x_{π_j|Z=b(14),j}     z_14 = b
33    y_33    x_{π_j|Z=b(33),j}     z_33 = b
⋮     ⋮       ⋮                     ⋮

i.e., X_j is permuted only within groups of observations with the same value of Z

H_0: X_j ⊥ Y | Z

under H_0: P(Y, X_j | Z) = P(Y | Z) · P(X_j | Z), or P(Y | X_j, Z) = P(Y | Z)

Technically

- use any partition of the feature space for conditioning
- here: use the binary partition already learned by the tree (use the cutpoints as bisectors of the feature space)
- condition on all correlated variables, or select some

Strobl et al. (2008); available in cforest from version 0.9-994:
varimp(obj, conditional = TRUE)

Simulation study

- dgp: y_i = β_1 · x_{i,1} + ⋯ + β_12 · x_{i,12} + ε_i,  with ε_i i.i.d. N(0, 0.5)
- X_1, …, X_12 ~ N(0, Σ) with

      ⎛ 1    0.9  0.9  0.9  0  ⋯  0 ⎞
      ⎜ 0.9  1    0.9  0.9  0  ⋯  0 ⎟
      ⎜ 0.9  0.9  1    0.9  0  ⋯  0 ⎟
  Σ = ⎜ 0.9  0.9  0.9  1    0  ⋯  0 ⎟
      ⎜ 0    0    0    0    1  ⋯  0 ⎟
      ⎜ ⋮                      ⋱  ⋮ ⎟
      ⎝ 0    0    0    0    0  ⋯  1 ⎠

  i.e., X_1, …, X_4 are block-correlated, X_5, …, X_12 independent

X_j:  X_1  X_2  X_3  X_4  X_5  X_6  X_7  X_8  ⋯  X_12
β_j:   5    5    2    0   −5   −5   −2    0   ⋯   0

Results
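The data-generating process above in base R (a sketch; n = 100 is an assumption, since the slide does not state the sample size):

```r
set.seed(1)
n <- 100   # assumed sample size (not stated on the slide)
p <- 12

# block structure: X1..X4 pairwise correlated with 0.9, X5..X12 independent
Sigma <- diag(p)
Sigma[1:4, 1:4] <- 0.9
diag(Sigma) <- 1

X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)   # rows ~ N(0, Sigma)

beta <- c(5, 5, 2, 0, -5, -5, -2, rep(0, 5))
y    <- drop(X %*% beta) + rnorm(n, sd = sqrt(0.5))  # eps ~ N(0, 0.5)
```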

[Figure: variable importance for variables 1–12, one panel per mtry = 1, 3, 8; importance scales reach roughly 25, 50, and 80, respectively.]

Peptide-binding data

[Figure: unconditional vs. conditional permutation importance (0 to 0.005) for the peptide-binding data, with the variables h2y8, flex8, and pol3 marked.]

Summary

if your predictor variables are of different types:
use cforest (pkg: party) with the default option controls = cforest_unbiased()
and the permutation importance varimp(obj)

otherwise: feel free to use
cforest (pkg: party) with the permutation importance varimp(obj), or
randomForest (pkg: randomForest) with the permutation importance importance(obj, type=1) or the Gini importance importance(obj, type=2)
– but don't fall for the z-score! (i.e., set scale=FALSE)

if your predictor variables are highly correlated:
use the conditional importance in cforest (pkg: party)

References

Archer, K. J. and R. V. Kimes (2008). Empirical characterization of random forest variable importance measures. Computational Statistics & Data Analysis 52(4), 2249–2260.

Breiman, L. (2001). Random forests. Machine Learning 45(1), 5–32.

Breiman, L. and A. Cutler (2008). Random forests – Classification manual. Website accessed in 1/2008; http://www.math.usu.edu/~adele/forests.

Breiman, L., A. Cutler, A. Liaw, and M. Wiener (2006). Breiman and Cutler's Random Forests for Classification and Regression. R package version 4.5-16.

Diaz-Uriarte, R. (2007). GeneSrF and varSelRF: A web-based tool and R package for gene selection and classification using random forest. BMC Bioinformatics 8:328.

Hothorn, T., K. Hornik, and A. Zeileis (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics 15(3), 651–674.

Strobl, C., A.-L. Boulesteix, A. Zeileis, and T. Hothorn (2007). Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics 8:25.

Strobl, C. and A. Zeileis (2008). Danger: High power! – Exploring the statistical properties of a test for random forest variable importance. In Proceedings of the 18th International Conference on Computational Statistics, Porto, Portugal.

Strobl, C., A.-L. Boulesteix, T. Kneib, T. Augustin, and A. Zeileis (2008). Conditional variable importance for random forests. BMC Bioinformatics 9:307.