Lecture 22 Ensemble

Lecture 22 Ensemble

Class Website CX4242: Ensemble Learning Mahdi Roozbahani Lecturer, Computational Science and Engineering, Georgia Tech Which Classifier/Model to Choose? • Possible strategies: • Go from simplest model to more complex model until you obtain desired accuracy • Discover a new model if the existing ones do not work for you • Combine all (simple) models 2 Common Strategy: Bagging (Bootstrap Aggregating) Originally designed for combining multiple models, to improve classification “stability” [Leo Breiman, 94] Uses random training datasets (sampled from one dataset) http://statistics.about.com/od/Applications/a/What-Is-Bootstrapping.htm Common Strategy: Bagging (Bootstrap Aggregating) Consider the data set 푆 = 푋푖, 푌푖 푖=1,…,푛 • Pick a sample S* with replacement of size n (S* called a “bootstrap sample”) • Train on S* to get a classifier f * • Repeat above steps B times to get f1, f2,...,fB • Final classifier f(x) = majority{fb(x)}j=1,...,B http://statistics.about.com/od/Applications/a/What-Is-Bootstrapping.htm Common Strategy: Bagging Why would bagging work? • Combining multiple classifiers reduces the variance of the final classifier When would this be useful? • We have a classifier with high variance Bagging decision trees Consider the data set S • Pick a sample S* with replacement of size n • Grow a decision tree Tb • Repeat B times to get T1,...,TB • The final classifier will be Random Forests Almost identical to bagging decision trees, except we introduce some randomness: • Randomly pick m of the d available attributes, at every split when growing the tree (i.e., d - m attributes ignored) Bagged random decision trees = Random forests What are our Hyper-Parameters in Random Forest 푚 = 푁푢푚푏푒푟 표푓 푟푎푛푑표푚푙푦 푐ℎ표푠푒푛 푎푡푡푟푏푢푡푒푠 Usual values for 푚 = 푑, 1,10 푑 is number of dimensions or features or attributes How to optimize 푚? Cross-Validation 퐵 = 푁푢푚푏푒푟 표푓 푚표푑푒푙푠 표푟 푑푒푐푠표푛 푡푟푒푒푠 푛 푅푎푛푑표푚 퐹표푟푒푠푡 Keep adding trees until training error stabilizes (reaches to a plateau).

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    8 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us