CS480 Introduction to Machine Learning
Edith Law

Ensemble Learning
Models that combine the opinions of multiple classifiers.
[Figure: classifiers as jurors, each casting a vote of "not guilty" or "guilty"; the majority verdict wins.]
Advantages:
• You can use much simpler learners and still achieve great performance.
• Efficiency at training and test time, because the individual learners can be trained and evaluated in parallel.

Multiple Voting Classifiers
• All the learning algorithms we have seen so far are deterministic: if you train a decision tree multiple times on the same dataset, you will get the same tree back.
• For voting over multiple classifiers to have any effect, the classifiers need to differ.
• There are different ways to get variability:
  - change the learning algorithm
  - change the dataset

Approach #1: Combine different types of classifiers
• Instead of learning a single classifier (e.g., a decision tree) on the dataset, you train a set of different classifiers $h_1, \ldots, h_K$ (e.g., a decision tree, a perceptron, KNN, multiple neural networks with different architectures).
• For a test point $x'$, you make a decision by voting:

$$\hat{y}_1 = h_1(x'), \; \ldots, \; \hat{y}_K = h_K(x')$$
• Classification: predict +1 if there are more +1 votes than −1 votes (see the sketch below).
• Regression: take the mean or median of the predictions from the different classifiers.
• Caveat: while it is unlikely that all classifiers will make exactly the same mistake, the inductive biases of different learning algorithms can be highly correlated, i.e., they are prone to similar types of errors.
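To make the voting scheme concrete, here is a minimal sketch in Python. It is an illustration only: the synthetic dataset, the particular scikit-learn classifiers, and all parameter settings are assumptions chosen for the example, not prescribed by the lecture.

```python
# Minimal sketch: a heterogeneous voting ensemble (illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import Perceptron
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data (placeholder for a real dataset).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train K different types of classifiers on the same dataset.
classifiers = [
    DecisionTreeClassifier(random_state=0),
    Perceptron(random_state=0),
    KNeighborsClassifier(n_neighbors=5),
]
for h in classifiers:
    h.fit(X_train, y_train)

# Each classifier h_k casts a vote y_hat_k = h_k(x') on every test point.
votes = np.array([h.predict(X_test) for h in classifiers])  # shape (K, n_test)

# Classification: majority vote. With labels in {0, 1}, a mean above 0.5
# means more than half of the classifiers voted for class 1.
y_pred = (votes.mean(axis=0) > 0.5).astype(int)

print("ensemble accuracy:", (y_pred == y_test).mean())
for h, v in zip(classifiers, votes):
    print(type(h).__name__, "accuracy:", (v == y_test).mean())
```

For regression, the majority vote would be replaced by the mean (votes.mean(axis=0)) or the median (np.median(votes, axis=0)) of the individual predictions. scikit-learn packages this pattern as sklearn.ensemble.VotingClassifier and VotingRegressor.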
Overview
• Bagging
• Random Forests
• Boosting