Machine-Learning-Course Documentation Release 1.0

Machine-Learning-Course Documentation Release 1.0 Amirsina Torfi May 12, 2019 Foreword 1 Introduction 1 1.1 Machine Learning Overview.......................................1 1.1.1 How was the advent and evolution of machine learning?...................1 1.1.2 Why is machine learning important?..............................1 1.1.3 Who is using ML and why (government, healthcare system, etc.)?..............2 1.1.4 Further Reading.........................................2 2 Cross-Validation 3 2.1 Motivation................................................3 2.2 Holdout Method.............................................3 2.3 K-Fold Cross Validation.........................................4 2.4 Leave-P-Out / Leave-One-Out Cross Validation.............................4 2.5 Conclusion................................................6 2.6 Motivation................................................6 2.7 Code Examples..............................................6 2.8 References................................................7 3 Linear Regression 9 3.1 Motivation................................................9 3.2 Overview................................................. 10 3.3 When to Use............................................... 12 3.4 Cost Function............................................... 12 3.5 Methods................................................. 16 3.5.1 Ordinary Least Squares..................................... 16 3.5.2 Gradient Descent........................................ 16 3.6 Code................................................... 16 3.7 Conclusion................................................ 16 3.8 References................................................ 17 4 Overfitting and Underfitting 19 4.1 Overview................................................. 19 4.2 Overfitting................................................ 19 4.3 Underfitting................................................ 20 4.4 Motivation................................................ 20 4.5 Code................................................... 21 4.6 Conclusion................................................ 21 4.7 References................................................ 21 i 5 Regularization 23 5.1 Motivation................................................ 23 5.2 Overview................................................. 23 5.3 Methods................................................. 24 5.3.1 Ridge Regression........................................ 27 5.3.2 Lasso Regression........................................ 28 5.4 Summary................................................. 29 5.5 References................................................ 29 6 Logistic Regression 31 6.1 Introduction............................................... 31 6.2 When to Use............................................... 32 6.3 How does it work?............................................ 32 6.4 Multinomial Logistic Regression.................................... 33 6.5 Code................................................... 33 6.6 Motivation................................................ 33 6.7 Conclusion................................................ 33 6.8 References................................................ 33 7 Naive Bayes Classification 35 7.1 Motivation................................................ 35 7.2 What is it?................................................ 36 7.3 Bayes’ Theorem............................................. 36 7.4 Naive Bayes............................................... 36 7.5 Algorithms................................................ 37 7.5.1 Gaussian Model (Continuous)................................. 37 7.5.2 Multinomial Model (Discrete)................................. 37 7.5.3 Bernoulli Model (Discrete)................................... 39 7.6 Conclusion................................................ 39 7.7 References................................................ 39 8 Decision Trees 41 8.1 Introduction............................................... 41 8.2 Motivation................................................ 43 8.3 Classification and Regression Trees................................... 43 8.4 Splitting (Induction)........................................... 45 8.5 Cost of Splitting............................................. 45 8.6 Pruning.................................................. 46 8.7 Conclusion................................................ 48 8.8 Code Example.............................................. 48 8.9 References................................................ 49 9 k-Nearest Neighbors 51 9.1 Introduction............................................... 51 9.2 How does it work?............................................ 52 9.3 Brute Force Method........................................... 52 9.4 K-D Tree Method............................................ 52 9.5 Choosing k................................................ 52 9.6 Conclusion................................................ 53 9.7 Motivation................................................ 53 9.8 Code Example.............................................. 54 9.9 References................................................ 55 10 Linear Support Vector Machines 57 10.1 Introduction............................................... 57 ii 10.2 Hyperplane................................................ 58 10.3 How do we find the best hyperplane/line?................................ 58 10.4 How to maximize the margin?...................................... 59 10.5 Ignore Outliers.............................................. 59 10.6 Kernel SVM............................................... 59 10.7 Conclusion................................................ 61 10.8 Motivation................................................ 61 10.9 Code Example.............................................. 62 10.10 References................................................ 62 11 Clustering 65 11.1 Overview................................................. 65 11.2 Clustering................................................ 65 11.3 Motivation................................................ 66 11.4 Methods................................................. 66 11.4.1 K-Means............................................ 67 11.4.2 Hierarchical........................................... 69 11.5 Summary................................................. 71 11.6 References................................................ 71 12 Principal Component Analysis 73 12.1 Introduction............................................... 73 12.2 Motivation................................................ 73 12.3 Dimensionality Reduction........................................ 75 12.4 PCA Example.............................................. 75 12.5 Number of Components......................................... 76 12.6 Conclusion................................................ 77 12.7 Code Example.............................................. 78 12.8 References................................................ 78 13 Multi-layer Perceptron 81 13.1 Overview................................................. 81 13.2 Motivation................................................ 82 13.3 What is a node?............................................. 82 13.4 What defines a multilayer perceptron?.................................. 82 13.5 What is backpropagation?........................................ 82 13.6 Summary................................................. 83 13.7 Further Resources............................................ 83 13.8 References................................................ 85 14 Convolutional Neural Networks 87 14.1 Overview................................................. 87 14.2 Motivation................................................ 88 14.3 Architecture............................................... 88 14.3.1 Convolutional Layers...................................... 89 14.3.2 Pooling Layers......................................... 92 14.3.3 Fully Connected Layers..................................... 93 14.4 Training................................................. 93 14.5 Summary................................................. 94 14.6 References................................................ 94 15 Autoencoders 95 15.1 Autoencoders and their implementations in TensorFlow........................ 95 15.2 Introduction............................................... 95 15.3 Create an Undercomplete Autoencoder................................. 96 iii 16 Contributing 99 16.1 Pull Request Process........................................... 99 16.2 Final Note................................................ 99 17 Contributor Covenant Code of Conduct 101 17.1 Our Pledge................................................ 101 17.2 Our Standards.............................................. 101 17.3 Our Responsibilities........................................... 102 17.4 Scope................................................... 102 17.5 Enforcement............................................... 102 17.6 Attribution................................................ 102 18 LICENSE 103 iv CHAPTER 1 Introduction The purpose of this project is to provide a comperehensive and yet simple course in Machine Learning using Python. 1.1 Machine Learning Overview 1.1.1 How was the advent and evolution of machine learning? You can argue that the start of modern machine learning comes from Alan Turing’s “Turing Test” of 1950. The Turing Test aimed to find out if a computer is brilliant (or at least smart enough to fool

Machine-Learning-Course Documentation Release 1.0

Black Box Explanation by Learning Image Exemplars in the Latent Feature Space

On the Boosting Ability of Top-Down Decision Tree Learning Algorithms

Boosting with Multi-Way Branching in Decision Trees

Inductive Bias in Decision Tree Learning • Issues in Decision Tree Learning • Summary

A Prediction for Student's Performance Using Decision Tree ID3 Method

Galaxy Classification with Deep Convolutional Neural Networks

X-Trepan: a Multi Class Regression and Adapted Extraction of Comprehensible Decision Tree in Artificial Neural Networks

Evaluation and Comparison of Word Embedding Models, for Efficient Text Classification

Chapter 2. Machine Learning Overview

A Commodity Classification Framework Based on Machine

COMP 307 — Lecture 07 Machine Learning 4 DT Learning And

1|Decision Trees