Decision Trees, Bagging and Boosting

CSE 446: Machine Learning, slides created by Emily Fox

Predicting potential loan defaults

What makes a loan risky?

I want to buy a new house!

Loan Application
- Credit History ★★★★
- Income ★★★
- Term ★★★★★
- Personal Info ★★★

Credit history explained
- Did I pay previous loans on time?
- Example: excellent, good, or fair

Income
- What's my income?
- Example: $80K per year

Loan terms
- How soon do I need to pay the loan?
- Example: 3 years, 5 years, …

Personal information
- Age, reason for the loan, marital status, …
- Example: home loan for a married couple

Intelligent application

Loan applications → intelligent loan application review system → Safe ✓ or Risky ✘

Classifier review

Input: xi (loan application) → Classifier MODEL → Output: ŷi (predicted class)
- ŷi = +1 (Safe)
- ŷi = -1 (Risky)

This module: decision trees

Start
- Credit? excellent → Safe
- Credit? fair → Term?
  - 3 years → Risky
  - 5 years → Safe
- Credit? poor → Income?
  - high → Term?
    - 3 years → Risky
    - 5 years → Safe
  - low → Risky

Scoring a loan application

xi = (Credit = poor, Income = high, Term = 5 years)

Traverse the tree above: Credit = poor → Income = high → Term = 5 years → leaf "Safe"

ŷi = Safe

Decision tree learning task

Learning problem

Training data: N observations (xi, yi)

Credit     Term   Income  y
excellent  3 yrs  high    safe
fair       5 yrs  low     risky
fair       3 yrs  high    safe
poor       5 yrs  high    risky
excellent  3 yrs  low     risky
fair       5 yrs  low     safe
poor       3 yrs  high    risky
poor       5 yrs  low     safe
fair       3 yrs  high    safe

Goal: optimize a quality metric on the training data to learn a tree T(X).

Quality metric: Classification error

• Error measures the fraction of mistakes

  Error = (# incorrect predictions) / (# examples)

- Best possible value: 0.0
- Worst possible value: 1.0
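As a quick illustration, a minimal Python sketch of this metric (my own code, not from the course materials; the function name classification_error is my choice):

```python
# Classification error: the fraction of examples where the predicted label
# differs from the true label.  0.0 is best, 1.0 is worst.
import numpy as np

def classification_error(y_true, y_pred):
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return np.mean(y_true != y_pred)

# Example: 2 mistakes out of 5 examples -> error = 0.4
print(classification_error([+1, +1, -1, -1, +1], [+1, -1, -1, +1, +1]))
```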

How do we find the best tree?

The exponentially large number of possible trees (T1(X), T2(X), T3(X), T4(X), T5(X), T6(X), …) makes decision tree learning hard! Learning the smallest decision tree is an NP-hard problem [Hyafil & Rivest '76].

Greedy decision tree learning

Our training data table

Assume N = 40 and 3 features (the table shows a sample of rows):

Credit     Term   Income  y
excellent  3 yrs  high    safe
fair       5 yrs  low     risky
fair       3 yrs  high    safe
poor       5 yrs  high    risky
excellent  3 yrs  low     risky
fair       5 yrs  low     safe
poor       3 yrs  high    risky
poor       5 yrs  low     safe
fair       3 yrs  high    safe

Start with all the data

(all data) Loan status counts: 22 Safe, 18 Risky, N = 40 examples

Compact visual notation: Root node

Root node: 22 Safe, 18 Risky (N = 40 examples)

Decision stump: Single level tree

Split on Credit:
Root (22 Safe, 18 Risky) → Credit?
- excellent: 9 Safe, 0 Risky (subset of data with Credit = excellent)
- fair: 9 Safe, 4 Risky (subset of data with Credit = fair)
- poor: 4 Safe, 14 Risky (subset of data with Credit = poor)

Visual notation: Intermediate nodes

The three children of the root (excellent, fair, poor) are intermediate nodes.

Proceed to build a tree on each of the children

Root (22 Safe, 18 Risky) → Credit?
- excellent: 9 Safe, 0 Risky → Safe
- fair: 9 Safe, 4 Risky
- poor: 4 Safe, 14 Risky

Selecting best feature to split on

How do we pick our first split?

Root (22 Safe, 18 Risky): find the "best" feature to split on!

How do we select the best feature?

Choice 1: Split on Credit
Root (22 Safe, 18 Risky) → Credit?
- excellent: 9 Safe, 0 Risky
- fair: 9 Safe, 4 Risky
- poor: 4 Safe, 14 Risky

OR

Choice 2: Split on Term
Root (22 Safe, 18 Risky) → Term?
- 3 years: 16 Safe, 4 Risky
- 5 years: 6 Safe, 14 Risky

How do we measure effectiveness of a split?

Idea: calculate the classification error of this decision stump.

  Error = (# mistakes) / (# data points)

Calculating classification error

• Step 1: ŷ = class of the majority of the data in the node
• Step 2: Calculate the classification error of predicting ŷ for this data

At the root (22 Safe, 18 Risky), the majority class is Safe: 22 correct, 18 mistakes.

Tree      Classification error
(root)    0.45

Choice 1: Split on Credit history?

Does a split on Credit reduce classification error below 0.45?

Split on Credit: Classification error

Choice 1: Split on Credit
- excellent: 9 Safe, 0 Risky → predict Safe (0 mistakes)
- fair: 9 Safe, 4 Risky → predict Safe (4 mistakes)
- poor: 4 Safe, 14 Risky → predict Risky (4 mistakes)

  Error = (4 + 4) / (22 + 18) = 0.2

Tree             Classification error
(root)           0.45
Split on credit  0.2

Choice 2: Split on Term?

Choice 2: Split on Term
- 3 years: 16 Safe, 4 Risky → predict Safe (4 mistakes)
- 5 years: 6 Safe, 14 Risky → predict Risky (6 mistakes)

Evaluating the split on Term

  Error = (4 + 6) / 40 = 0.25

Tree             Classification error
(root)           0.45
Split on credit  0.2
Split on term    0.25

Choice 1 vs Choice 2: Comparing split on Credit vs Term

Choice 1: Split on Credit (error 0.2) ← WINNER
Choice 2: Split on Term (error 0.25)

Feature split selection algorithm

• Given a subset of data M (a node in a tree)
• For each feature ϕi(x):
  1. Split the data of M according to feature ϕi(x)
  2. Compute the classification error of the split
• Choose the feature ϕ*(x) with the lowest classification error (a sketch of this procedure follows below)
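A minimal sketch of this selection step, assuming categorical features stored as dictionaries (my own illustration; the helper names split_error and best_split_feature are not from the course):

```python
# Pick the feature whose decision stump has the lowest classification error.
# Data is a list of (features_dict, label) pairs.
from collections import Counter

def split_error(data, feature):
    """Classification error of a decision stump that splits on `feature`."""
    groups = {}
    for x, y in data:
        groups.setdefault(x[feature], []).append(y)
    mistakes = 0
    for labels in groups.values():
        majority_count = Counter(labels).most_common(1)[0][1]
        mistakes += len(labels) - majority_count   # minority labels are mistakes
    return mistakes / len(data)

def best_split_feature(data, features):
    """Return the feature with the lowest stump classification error."""
    return min(features, key=lambda f: split_error(data, f))

# Toy usage with loan-style features:
data = [({"Credit": "excellent", "Term": "3 yrs"}, "safe"),
        ({"Credit": "fair", "Term": "5 yrs"}, "risky"),
        ({"Credit": "poor", "Term": "5 yrs"}, "risky"),
        ({"Credit": "poor", "Term": "3 yrs"}, "risky")]
print(best_split_feature(data, ["Credit", "Term"]))   # -> "Credit"
```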

Recursion & Stopping conditions

We've learned a decision stump, what next?

Root (22 Safe, 18 Risky) → Credit?
- excellent: 9 Safe, 0 Risky → all data points are Safe ⇒ predict Safe, nothing else to do with this subset of data (leaf node)
- fair: 9 Safe, 4 Risky
- poor: 4 Safe, 14 Risky

Tree learning = Recursive stump learning

Root (22 Safe, 18 Risky) → Credit?
- excellent: 9 Safe, 0 Risky → Safe
- fair: 9 Safe, 4 Risky → build a decision stump with the subset of data where Credit = fair
- poor: 4 Safe, 14 Risky → build a decision stump with the subset of data where Credit = poor

Second level

Root (22 Safe, 18 Risky) → Credit?
- excellent: 9 Safe, 0 Risky → Safe
- fair: 9 Safe, 4 Risky → Term?
  - 3 years: 0 Safe, 4 Risky → Risky
  - 5 years: 9 Safe, 0 Risky → Safe
- poor: 4 Safe, 14 Risky → Income?
  - high: 4 Safe, 5 Risky → build another stump on these data points
  - low: 0 Safe, 9 Risky → Risky

Final decision tree

Root (22 Safe, 18 Risky) → Credit?
- excellent: 9 Safe, 0 Risky → Safe
- fair: 9 Safe, 4 Risky → Term?
  - 3 years: 0 Safe, 4 Risky → Risky
  - 5 years: 9 Safe, 0 Risky → Safe
- poor: 4 Safe, 14 Risky → Income?
  - high: 4 Safe, 5 Risky → Term?
    - 3 years: 0 Safe, 2 Risky → Risky
    - 5 years: 4 Safe, 3 Risky → Safe
  - low: 0 Safe, 9 Risky → Risky

Simple greedy decision tree learning

Pick best feature to split on

Learn decision stump with this split

For each leaf of decision stump, recurse

When do we stop???

Stopping condition 1: All data agrees on y

All data points in a node have the same y value ⇒ nothing to do (e.g., the leaves of the final tree above).

Stopping condition 2: Already split on all features

Already split on all possible features ⇒ nothing to do.

Greedy decision tree learning

• Step 1: Start with an empty tree
• Step 2: Select a feature to split the data (pick the feature split leading to the lowest classification error)
• For each split of the tree:
  - Step 3: If nothing more to do (stopping conditions 1 & 2), make predictions
  - Step 4: Otherwise, go to Step 2 and continue (recurse) on this split (a code sketch of this recursion follows below)
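A minimal sketch of this greedy recursion under the same assumptions as the earlier split-selection sketch (my own illustration, not course code; categorical features as dictionaries):

```python
# Greedy recursive decision tree learning with classification-error splits.
# Data is a list of (features_dict, label) pairs.
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def split_error(data, feature):
    groups = {}
    for x, y in data:
        groups.setdefault(x[feature], []).append(y)
    mistakes = sum(len(ls) - Counter(ls).most_common(1)[0][1] for ls in groups.values())
    return mistakes / len(data)

def build_tree(data, features):
    labels = [y for _, y in data]
    # Stopping condition 1: all data agrees on y.
    # Stopping condition 2: no more features to split on.
    if len(set(labels)) == 1 or not features:
        return {"leaf": majority(labels)}
    best = min(features, key=lambda f: split_error(data, f))   # greedy choice
    remaining = [f for f in features if f != best]
    children = {}
    for value in {x[best] for x, _ in data}:
        subset = [(x, y) for x, y in data if x[best] == value]
        children[value] = build_tree(subset, remaining)        # recurse on this split
    return {"split_on": best, "children": children, "default": majority(labels)}

def predict(tree, x):
    while "leaf" not in tree:
        child = tree["children"].get(x[tree["split_on"]])
        if child is None:                  # unseen feature value: fall back to majority
            return tree["default"]
        tree = child
    return tree["leaf"]
```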

Is this a good idea?

Proposed stopping condition 3: Stop if no split reduces the classification error

Stopping condition 3: Don't stop if error doesn't decrease???

y = x[1] xor x[2]

x[1]   x[2]   y
False  False  False
False  True   True
True   False  True
True   True   False

Root: y values are 2 True, 2 False.  Error = 2/4 = 0.5

Tree     Classification error
(root)   0.5

Consider split on x[1]

Split on x[1]: each branch (True, False) contains 1 True and 1 False.  Error = 0.5

Tree            Classification error
(root)          0.5
Split on x[1]   0.5

Consider split on x[2]

Split on x[2]: each branch (True, False) contains 1 True and 1 False.  Error = 0.5

Neither feature improves training error… Stop now???

Tree            Classification error
(root)          0.5
Split on x[1]   0.5
Split on x[2]   0.5

Final tree with stopping condition 3

With stopping condition 3, we never split: the tree is just the root, which predicts True for every input.

Tree                           Classification error
with stopping condition 3      0.5

Without stopping condition 3

Condition 3 (stopping when training error doesn't improve) is not always recommended!

Split on x[1], then split each branch on x[2]:
- x[1] = True,  x[2] = True  → predict False
- x[1] = True,  x[2] = False → predict True
- x[1] = False, x[2] = True  → predict True
- x[1] = False, x[2] = False → predict False

Tree                           Classification error
with stopping condition 3      0.5
without stopping condition 3   0

Decision tree learning: Real-valued features

How do we use real-valued inputs?

Income   Credit     Term   y
$105K    excellent  3 yrs  Safe
$112K    good       5 yrs  Risky
$73K     fair       3 yrs  Safe
$69K     excellent  5 yrs  Safe
$217K    excellent  3 yrs  Risky
$120K    good       5 yrs  Safe
$64K     fair       3 yrs  Risky
$340K    excellent  5 yrs  Safe
$60K     good       3 yrs  Risky

Threshold split

Split on the feature Income:
Root (22 Safe, 18 Risky) → Income?
- Income < $60K: 8 Safe, 13 Risky
- Income >= $60K: 14 Safe, 5 Risky (subset of data with Income >= $60K)

Finding the best threshold split

There are infinitely many possible values of the threshold t*: split into Income < t* and Income >= t*.
[Figure: Safe and Risky points plotted along the Income axis from $10K to $120K]

Consider a threshold between points

Same classification error for any threshold split between consecutive values vA and vB.

Only need to consider mid-points

Finite number of splits to consider.

Threshold split selection algorithm

• Step 1: Sort the values of a feature hj(x):
  let {v1, v2, v3, …, vN} denote the sorted values
• Step 2: For i = 1 … N-1
  - Consider the split ti = (vi + vi+1) / 2
  - Compute the classification error for the threshold split hj(x) >= ti
• Choose the t* with the lowest classification error (a sketch follows below)
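A minimal sketch of this procedure (my own illustration, not course code; labels assumed to be +1/-1 and the helper name best_threshold is my choice):

```python
# Try mid-points between consecutive sorted feature values and keep the one
# with the lowest classification error.
import numpy as np

def best_threshold(values, labels):
    """values: 1-D array of a real feature; labels: array of +1/-1."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    order = np.argsort(values)
    v, y = values[order], labels[order]

    def stump_error(t):
        # Predict the majority class on each side of the threshold.
        mistakes = 0
        for side in (y[v < t], y[v >= t]):
            if len(side):
                counts = np.unique(side, return_counts=True)[1]
                mistakes += len(side) - counts.max()
        return mistakes / len(y)

    candidates = [(v[i] + v[i + 1]) / 2 for i in range(len(v) - 1)]
    return min(candidates, key=stump_error)

# Toy usage: incomes in $K, +1 = safe, -1 = risky -> best split at 70.0
print(best_threshold([30, 60, 80, 95, 110, 130], [-1, -1, +1, +1, +1, +1]))
```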

Visualizing the threshold split

[Figure: data plotted in the (Age, Income) plane; the threshold split is the vertical line Age = 38]

Split on Age >= 38

[Figure: one side of the Age = 38 line is predicted Risky, the other Safe]

Depth 2: Split on Income >= $60K

[Figure: a second threshold split along the line Income = $60K]

Each split partitions the 2-D space

[Figure: the plane is split into the regions Age < 38; Age >= 38 and Income >= $60K; Age >= 38 and Income < $60K]

Overfitting in decision trees

Decision tree recap

For each leaf node of the final tree, set ŷ = majority value of the data in that leaf.

What happens when we increase depth?

Training error reduces with depth:

Tree depth      depth = 1  depth = 2  depth = 3  depth = 5  depth = 10
Training error  0.22       0.13       0.10       0.03       0.00

[Figure: the decision boundary at each depth]

How do we regularize? In this context, regularization means preferring simpler trees.

[Figure: a complex tree vs. a simpler tree]

Two approaches to picking simpler trees

1. Early Stopping: Stop the learning algorithm before tree becomes too complex

2. Pruning: Simplify the tree after the learning algorithm terminates

Technique 1: Early stopping

• Stopping conditions (recap):
  1. All examples have the same target value
  2. No more features to split on

• Early stopping conditions (a sketch of these checks follows below):
  1. Limit tree depth (choose max_depth using a validation set)
  2. Do not consider splits that do not cause a sufficient decrease in classification error
  3. Do not split an intermediate node which contains too few data points
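A minimal sketch of these checks as a single predicate (my own illustration; the threshold names max_depth, min_error_reduction, and min_node_size are my own conventions, not from the course):

```python
# Decide whether to stop growing the tree at a node before splitting it.
def should_stop(node_labels, depth, best_split_error, current_error,
                max_depth=10, min_error_reduction=0.0, min_node_size=10):
    if len(set(node_labels)) <= 1:                                 # stopping condition 1
        return True
    if depth >= max_depth:                                         # early stopping 1
        return True
    if current_error - best_split_error <= min_error_reduction:    # early stopping 2
        return True
    if len(node_labels) < min_node_size:                           # early stopping 3
        return True
    return False
```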

Challenge with early stopping condition 1

Hard to know exactly when to stop. Also, we might want some branches of the tree to go deeper while others remain shallow.

[Figure: classification error vs. tree depth, from simple trees to complex trees; training error keeps decreasing while true error is U-shaped, and the chosen max_depth is marked]

Early stopping condition 2: Pros and Cons

• Pros:
  - A reasonable heuristic for early stopping to avoid useless splits
• Cons:
  - Too short-sighted: "good" splits may occur right after "useless" splits
  - We saw this with the "xor" example

Two approaches to picking simpler trees

1. Early Stopping: Stop the learning algorithm before tree becomes too complex

2. Pruning: Simplify the tree after the learning algorithm terminates

Complements early stopping

Pruning: Intuition

Train a complex tree, simplify later.

[Figure: a complex tree simplified into a smaller tree]

Pruning motivation

[Figure: classification error vs. tree depth; training error keeps decreasing while true error rises for complex trees. Don't stop too early: simplify after the tree is built.]

Scoring trees: Desired total quality format

Want to balance:
i. How well the tree fits the data
ii. Complexity of the tree

Total cost = measure of fit + measure of complexity
- Measure of fit (classification error): large # = bad fit to training data
- Measure of complexity: large # = likely to overfit

Simple measure of complexity of tree

L(T) = # of leaf nodes

[Figure: a tree splitting on Credit (excellent, good, fair, bad, poor) with leaves Safe, Safe, Safe, Risky, Risky, so L(T) = 5]

Balance simplicity & predictive power

Too complex, risk of overfitting: the deep tree from earlier (Credit?, then Term? and Income? splits).
Too simple, high classification error: a single root node that predicts Risky for everything.

Balancing fit and complexity

Total cost C(T) = Error(T) + λ L(T), where λ is a tuning parameter.

- If λ = 0: build out the full tree.
- If λ = ∞: won't ever split.
- If λ is in between: trades off training fit against tree size (a small numerical sketch follows below).
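A tiny illustration of how this cost can drive pruning decisions. The numbers are my own example, and the rule "replace a subtree by a leaf whenever that does not increase total cost" is the standard cost-complexity idea rather than something spelled out on the slide:

```python
# Total cost C(T) = Error(T) + lambda * L(T).
def total_cost(error, num_leaves, lam):
    return error + lam * num_leaves

# Example: a subtree with 4 leaves and training error 0.25 vs. collapsing it
# to a single leaf with error 0.30, at lambda = 0.02.
lam = 0.02
keep_subtree = total_cost(0.25, 4, lam)    # 0.33
prune_to_leaf = total_cost(0.30, 1, lam)   # 0.32
print("prune" if prune_to_leaf <= keep_subtree else "keep")   # -> prune
```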

Having said all that…

Vanilla decision trees just don't really work very well.

Bagging

Suppose we want to reduce variance

Recall the definition of variance.

If we had lots of independent data sets, we could average their results and get the variance to approach 0.

Given m independent data sets (D1, …, Dm), each of size n, learn m classifiers (f1, …, fm), where fi is trained on Di.

By the law of large numbers, the average (1/m) Σi fi converges to its expectation as m grows, so the variance of the averaged predictor approaches 0.

But we don’t have lots of data sets

Bootstrap Aggregation (aka bagging)

• A technique for addressing high variance (e.g., in decision trees)
  - Given a sample D, draw m samples (D1, …, Dm) of size n, with replacement.
  - Train a classifier on each.
  - Use the average (for regression) or majority vote (for classification).

Advantages of bagging

• Easy to implement
• Reduces variance
• Gives information about the uncertainty of the prediction
• Provides an estimate of test error, the "out of bag" (OOB) error, without using a validation set: for each training point, aggregate the predictions of only those classifiers whose bootstrap sample did not contain that point, and compute the error of these aggregated predictions (a sketch follows below).
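A minimal sketch of bagging with an out-of-bag error estimate (my own illustration, not course code), assuming scikit-learn decision trees as base classifiers, NumPy arrays X and y, and labels in {+1, -1}:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagged_trees_with_oob(X, y, m=50, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    trees, in_bag = [], []
    for _ in range(m):
        idx = rng.integers(0, n, size=n)          # bootstrap sample, with replacement
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
        in_bag.append(set(idx.tolist()))

    # Bagged prediction: majority vote over all trees.
    def predict(X_new):
        votes = np.sum([t.predict(X_new) for t in trees], axis=0)
        return np.sign(votes)

    # OOB error: for each point, vote only among trees that did not see it.
    # (In this simple sketch, ties and never-out-of-bag points count as mistakes.)
    oob_votes = np.zeros(n)
    for t, bag in zip(trees, in_bag):
        out = np.array([i for i in range(n) if i not in bag])
        if len(out):
            oob_votes[out] += t.predict(X[out])
    oob_error = np.mean(np.sign(oob_votes) != y)
    return predict, oob_error
```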

Bagging produces an ensemble classifier

Ensemble classifier in general
• Goal: predict output y (either +1 or -1) from input x
• Learn ensemble model:
  - Classifiers: f1(x), f2(x), …, fT(x)
  - Coefficients: ŵ1, ŵ2, …, ŵT
• Prediction: ŷ = sign( Σ_{t=1}^{T} ŵt ft(x) )

Random Forests

One of the most useful bagged algorithms

• Given data set D, draw m samples (D1, …, Dm) of size n from D, with replacement.
• For each Dj, train a full decision tree hj with one crucial modification:
  - Before each split, randomly subsample k features (without replacement) and only consider these for your split.
• Final classifier obtained by averaging.

One of the best, most popular, and easiest to use algorithms

• Only two hyperparameters, m and k.
• Extremely insensitive to both.
  - Classification: a good choice for k is √d (where d is the number of features)
  - Regression: a good choice for k is d/3
  - Set m as large as you can afford.
• Works really well!
• Features can be of different scale, magnitude, etc.
  - Very convenient with heterogeneous features (see the usage sketch below).
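A minimal usage sketch with scikit-learn's RandomForestClassifier, following the rule-of-thumb settings above (my own example; the dataset is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=500,      # m: as large as you can afford
    max_features="sqrt",   # k = sqrt(d) features considered at each split
    oob_score=True,        # out-of-bag estimate of generalization accuracy
    random_state=0,
).fit(X_tr, y_tr)

print("OOB accuracy:", forest.oob_score_)
print("Test accuracy:", forest.score(X_te, y_te))
```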

Boosting (Emily Fox, University of Washington, January 30, 2017)

Simple (weak) classifiers are good!

Example of a weak classifier, a decision stump: Income > $100K? Yes → Safe, No → Risky.
Other examples: classifiers with simple features, shallow decision trees, decision stumps.

Low variance. Learning is fast!

But high bias…

Finding a classifier that's just right

[Figure: classification error vs. model complexity; train error decreases while true error is U-shaped. A weak learner sits on the too-simple side; we need a stronger learner.]

Option 1: add more features or depth
Option 2: ?????

Boosting question

“Can a set of weak learners be combined to create a stronger learner?” Kearns and Valiant (1988)

Yes! Schapire (1990)

Boosting

Amazing impact:
• simple approach
• widely used in industry
• wins most Kaggle competitions

Ensemble classifier

A single classifier

Input: x → Classifier (e.g., the decision stump Income > $100K? Yes → Safe, No → Risky) → Output: ŷ = f(x), either +1 or -1

Ensemble methods: Each classifier "votes" on prediction

xi = (Income=$120K, Credit=Bad, Savings=$50K, Market=Good)

Four weak classifiers and their votes on xi:
- f1: Income > $100K? (Yes → Safe, No → Risky)       f1(xi) = +1
- f2: Credit history? (Bad → Risky, Good → Safe)     f2(xi) = -1
- f3: Savings > $100K? (Yes → Safe, No → Risky)      f3(xi) = -1
- f4: Market conditions? (Bad → Risky, Good → Safe)  f4(xi) = +1

Combine? Ensemble model: learn coefficients and predict

F(xi) = sign(w1 f1(xi) + w2 f2(xi) + w3 f3(xi) + w4 f4(xi))

Prediction with ensemble

With learned coefficients w1 = 2, w2 = 1.5, w3 = 1.5, w4 = 0.5 (worked out below):

F(xi) = sign(w1 f1(xi) + w2 f2(xi) + w3 f3(xi) + w4 f4(xi))
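Plugging the slide's votes and coefficients into this formula: F(xi) = sign(2·(+1) + 1.5·(-1) + 1.5·(-1) + 0.5·(+1)) = sign(-0.5) = -1, so the ensemble predicts Risky even though two of the four classifiers voted Safe.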

Ensemble classifier in general
• Goal: predict output y (either +1 or -1) from input x
• Learn ensemble model:
  - Classifiers: f1(x), f2(x), …, fT(x)
  - Coefficients: ŵ1, ŵ2, …, ŵT
• Prediction: ŷ = sign( Σ_{t=1}^{T} ŵt ft(x) )

Boosting

Training a classifier

Training data → Learn classifier f(x) → Predict ŷ = sign(f(x))

Learning a decision stump

Credit  Income  y
A       $130K   Safe
B       $80K    Risky ✗
C       $110K   Risky ✗
A       $110K   Safe
A       $90K    Safe
B       $120K   Safe
C       $30K    Risky ✗
C       $60K    Risky ✗
B       $95K    Safe
A       $60K    Safe
A       $98K    Safe

Decision stump: Income > $100K?
- Income > $100K: 3 Safe, 1 Risky → ŷ = Safe
- Income ≤ $100K: 4 Safe, 3 Risky → ŷ = Safe
(✗ marks the points this stump misclassifies)

Boosting = Focus learning on "hard" points

Training data → Learn classifier f(x) → Predict ŷ = sign(f(x)) → Evaluate: learn where f(x) makes mistakes.
Boosting: focus the next classifier on the places where f(x) does less well.

Learning on weighted data: More weight on "hard" or more important points

• Weighted dataset:
  - Each (xi, yi) weighted by αi
• More important point = higher weight αi
• Learning:
  - Data point i counts as αi data points
  - E.g., αi = 2 ⇒ count the point twice

Learning a decision stump on weighted data

Increase the weight α of harder / misclassified points:

Credit  Income  y      Weight α
A       $130K   Safe   0.5
B       $80K    Risky  1.5 ✗
C       $110K   Risky  1.2 ✗
A       $110K   Safe   0.8
A       $90K    Safe   0.6
B       $120K   Safe   0.7
C       $30K    Risky  3   ✗
C       $60K    Risky  2   ✗
B       $95K    Safe   0.8
A       $60K    Safe   0.7
A       $98K    Safe   0.9

Decision stump: Income > $100K? (now using total weight instead of counts)
- Income > $100K: weight 2 Safe vs. 1.2 Risky → ŷ = Safe
- Income ≤ $100K: weight 3 Safe vs. 6.5 Risky → ŷ = Risky

Learning from weighted data in general

• Learning from weighted data is generally no harder.
  - Data point i counts as αi data points
• E.g., gradient descent for logistic regression becomes a weighted sum over data points, with each point weighed by αi (a one-step sketch follows below):

  w_j^(t+1) ← w_j^(t) + η Σ_{i=1}^{N} α_i h_j(x_i) ( 1[y_i = +1] − P(y = +1 | x_i, w^(t)) )
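A minimal sketch of one such weighted gradient step (my own illustration, not course code), matching the update above:

```python
import numpy as np

def weighted_logistic_step(w, H, y, alpha, eta=0.1):
    """
    One gradient step for logistic regression on weighted data.
    w:     current coefficients, shape (D,)
    H:     feature matrix h_j(x_i), shape (N, D)
    y:     labels in {+1, -1}, shape (N,)
    alpha: per-point weights alpha_i, shape (N,)
    """
    p = 1.0 / (1.0 + np.exp(-H @ w))           # P(y = +1 | x_i, w)
    indicator = (y == +1).astype(float)         # 1[y_i = +1]
    gradient = H.T @ (alpha * (indicator - p))  # weighted sum over data points
    return w + eta * gradient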

Boosting = Greedy learning of ensembles from data

Training data → learn classifier f1(x) → predict ŷ = sign(f1(x))
→ assign higher weight to points where f1(x) is wrong → weighted data
→ learn classifier & coefficient ŵ2, f2(x) → predict ŷ = sign(ŵ1 f1(x) + ŵ2 f2(x))

AdaBoost

AdaBoost: learning ensemble [Freund & Schapire 1999]

• Start with same weight for all points: αi = 1/N

• For t = 1,…,T

- Learn ft(x) with data weights αi

- Compute coefficient ŵt

- Recompute weights αi

• Final model predicts by: ŷ = sign( Σ_{t=1}^{T} ŵt ft(x) )

Computing coefficient ŵt

AdaBoost: Computing coefficient ŵt of classifier ft(x)

Is ft(x) good? Yes → ŵt large; No → ŵt small.

• ft(x) is good ⇒ ft has low training error
• Measuring error on weighted data? Just the weighted # of misclassified points.

Weighted classification error

Example with a learned sentiment classifier: a correct prediction on the data point "Sushi was great" (α = 1.2) adds 1.2 to the weight of correct predictions, while a mistake on "Food was OK" (α = 0.5) adds 0.5 to the weight of mistakes.

• Total weight of mistakes: Σ_{i: ft(xi) ≠ yi} αi
• Total weight of all points: Σ_{i=1}^{N} αi
• Weighted error measures the fraction of the weight that is on mistakes (a small sketch follows below):

  weighted_error = (total weight of all mistakes) / (total weight of all data)

- Best possible value is 0.0
- Worst possible value is 1.0 (in practice the worst is 0.5)
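A minimal sketch of this quantity (my own illustration; the function name weighted_error is my choice):

```python
import numpy as np

def weighted_error(y_true, y_pred, alpha):
    """Weight of the mistakes divided by the total weight."""
    y_true, y_pred, alpha = map(np.asarray, (y_true, y_pred, alpha))
    return np.sum(alpha[y_true != y_pred]) / np.sum(alpha)

# Example: one mistake (weight 0.5) out of total weight 2.0 -> 0.25
print(weighted_error([+1, -1, +1], [+1, +1, +1], [1.2, 0.5, 0.3]))
```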

AdaBoost: Formula for computing coefficient ŵt of classifier ft(x)

  ŵt = (1/2) ln( (1 − weighted_error(ft)) / weighted_error(ft) )

Is ft(x) good?

weighted_error(ft) on training data   (1 − weighted_error(ft)) / weighted_error(ft)   ŵt
0.01 (good)                           (1 − 0.01) / 0.01 = 99                          0.5 ln(99) = 2.3
0.5                                   (1 − 0.5) / 0.5 = 1                             0.5 ln(1) = 0
0.99 (bad)                            (1 − 0.99) / 0.99 ≈ 0.01                        0.5 ln(0.01) = -2.3

The last one is a terrible classifier, but -ft(x) is great!

AdaBoost: learning ensemble

• Start with the same weight for all points: αi = 1/N
• For t = 1, …, T
  - Learn ft(x) with data weights αi
  - Compute coefficient ŵt = (1/2) ln( (1 − weighted_error(ft)) / weighted_error(ft) )
  - Recompute weights αi
• Final model predicts by: ŷ = sign( Σ_{t=1}^{T} ŵt ft(x) )

Recompute weights αi

AdaBoost: Updating weights αi based on where classifier ft(x) makes mistakes

Did ft get xi right? Yes → decrease αi; No → increase αi.

AdaBoost: Formula for updating weights αi

  αi ← αi e^(-ŵt),  if ft(xi) = yi
  αi ← αi e^(ŵt),   if ft(xi) ≠ yi

ft(xi) = yi ?   ŵt    Multiply αi by     Implication
correct         2.3   e^(-2.3) = 0.1     Decrease importance of (xi, yi)
correct         0     e^(0) = 1          Keep importance the same
mistake         2.3   e^(2.3) = 9.98     Increase importance
mistake         0     e^(0) = 1          Keep importance the same

AdaBoost: learning ensemble

• Start with the same weight for all points: αi = 1/N
• For t = 1, …, T
  - Learn ft(x) with data weights αi
  - Compute coefficient ŵt = (1/2) ln( (1 − weighted_error(ft)) / weighted_error(ft) )
  - Recompute weights: αi ← αi e^(-ŵt) if ft(xi) = yi, αi ← αi e^(ŵt) if ft(xi) ≠ yi
• Final model predicts by: ŷ = sign( Σ_{t=1}^{T} ŵt ft(x) )

AdaBoost: Normalizing weights αi

If xi is often a mistake, its weight αi gets very large; if xi is often correct, its weight αi gets very small. This can cause numerical instability after many iterations.

Normalize the weights to add up to 1 after every iteration:

  αi ← αi / Σ_{j=1}^{N} αj

AdaBoost: learning ensemble

• Start with the same weight for all points: αi = 1/N
• For t = 1, …, T
  - Learn ft(x) with data weights αi
  - Compute coefficient ŵt = (1/2) ln( (1 − weighted_error(ft)) / weighted_error(ft) )
  - Recompute weights: αi ← αi e^(-ŵt) if ft(xi) = yi, αi ← αi e^(ŵt) if ft(xi) ≠ yi
  - Normalize weights: αi ← αi / Σ_{j=1}^{N} αj
• Final model predicts by: ŷ = sign( Σ_{t=1}^{T} ŵt ft(x) ) (a code sketch of this loop follows below)
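A minimal sketch of the full loop above (my own illustration, not course code), assuming scikit-learn depth-1 trees as the weak learners, NumPy arrays X and y, and labels in {+1, -1}:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, T=30):
    n = len(y)
    alpha = np.full(n, 1.0 / n)                          # start with equal weights
    stumps, coeffs = [], []
    for _ in range(T):
        # Learn f_t(x) with data weights alpha_i.
        f = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=alpha)
        pred = f.predict(X)
        err = np.sum(alpha[pred != y]) / np.sum(alpha)   # weighted error
        err = np.clip(err, 1e-10, 1 - 1e-10)             # avoid log(0)
        w_hat = 0.5 * np.log((1 - err) / err)            # coefficient
        # Recompute weights: shrink correct points, grow mistakes, then normalize.
        alpha *= np.exp(-w_hat * y * pred)               # e^{-w} if right, e^{+w} if wrong
        alpha /= alpha.sum()
        stumps.append(f)
        coeffs.append(w_hat)

    def predict(X_new):
        scores = sum(w * f.predict(X_new) for w, f in zip(coeffs, stumps))
        return np.sign(scores)
    return predict
```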

AdaBoost example

t=1: Just learn a classifier on the original data

[Figure: the original data and the learned decision stump f1(x)]

Updating weights αi

Increase the weight αi of misclassified points.
[Figure: the boundary of f1(x) and the new data weights αi]

t=2: Learn classifier on weighted data

[Figure: the weighted data, using the αi chosen in the previous iteration, and the decision stump f2(x) learned on the weighted data]

Ensemble becomes weighted sum of learned classifiers

ŷ = sign( ŵ1 f1(x) + ŵ2 f2(x) ) with ŵ1 = 0.61 and ŵ2 = 0.53

Decision boundary of ensemble classifier after 30 iterations

training_error = 0

Boosting convergence & overfitting

Boosting question revisited

"Can a set of weak learners be combined to create a stronger learner?" Kearns and Valiant (1988)

Yes! Schapire (1990)

Boosting

After some iterations, training error of boosting goes to zero!!!

[Figure: training error of boosted decision stumps on a toy dataset vs. iterations of boosting. Training error of 1 decision stump = 22.5%; training error of an ensemble of 30 decision stumps = 0%.]

AdaBoost Theorem

Under some technical conditions, the training error of the boosted classifier → 0 as T → ∞. It may oscillate a bit, but it will generally decrease and eventually become 0!

Condition of AdaBoost Theorem

The condition: at every t, we can find a weak learner with weighted_error(ft) < 0.5. This is not always possible; extreme example: no classifier can separate a +1 sitting on top of a -1. Nonetheless, boosting often yields great training error.

Overfitting in boosting

Decision trees on loan data: 8% training error, 39% test error (overfitting).

Boosted decision stumps on loan data: 28.5% training error, 32% test error (better fit & lower test error).

Boosting tends to be robust to overfitting

Test set performance stays about constant for many iterations ⇒ less sensitive to the choice of T.

But boosting will eventually overfit, so must choose max number of components T

Best test error around 31%

Test error eventually increases to 33% (overfits)

How do we decide when to stop boosting?

Choosing T? Like selecting parameters in other ML approaches (e.g., λ in regularization):
- Not on training data: not useful, since training error improves as T increases
- Validation set: if the dataset is large
- Cross-validation: for smaller data
- Never ever ever ever on the test data

Summary of boosting

Variants of boosting and related algorithms

There are hundreds of variants of boosting; the most important:

Gradient boosting
• Like AdaBoost, but useful beyond basic classification
• XGBoost (Tianqi Chen, UW)

Many other approaches to learn ensembles; the most important:

Random forests
• Bagging: pick random subsets of the data
  - Learn a tree in each subset
  - Average predictions
• Simpler than boosting & easier to parallelize
• Typically higher error than boosting for the same # of trees (# iterations T)

Impact of boosting (spoiler alert... HUGE IMPACT)

Amongst the most useful ML methods ever created

Extremely useful in computer vision
• Standard approach for face detection, for example

Used by most winners of ML competitions (Kaggle, KDD Cup, …)
• Malware classification, credit fraud detection, ads click-through rate estimation, sales forecasting, ranking webpages for search, Higgs boson detection, …

Most deployed ML systems use model ensembles
• Coefficients chosen manually, with boosting, with bagging, or others

What you should know about today's material for the final

• What a decision tree is, the basic greedy approach to constructing one, and what the decision boundary will look like if data is real-valued (axis-aligned partitions)
• Mitigating overfitting and high variance by limiting depth, pruning the tree, bagging, and random forests
• Understand the idea of an ensemble classifier
• Outline the boosting framework: sequentially learn classifiers on weighted data. Key idea: put more weight on data points where we make errors.
• Understand the AdaBoost algorithm at a high level
  - Learn each classifier on weighted data
  - Compute the coefficient of the classifier
  - Recompute the data weights
  - Normalize the weights